SSIS Aggregate Transformation

Like the aggregate functions in SQL SELECT statement, the Aggregate transformation is used to apply the aggregate functions to the input columns and put the result in the output columns. The GROUP By clause is provided to specify groups to aggregate across.

The operations supported in the transformation are listed below.

OperationDescriptionTransact-SQL Keywords
Group byDivides datasets into groupsGROUP BY
CountThe number of rows in a groupCOUNT
Count distinctThe number of unique non-null values in a groupN/A
SumThe total of the values in a columnSUM
AverageThe average of the values in a columnAVG
MinimumThe minimum value in a groupMIN
MaximumThe maximum value in a groupMAX

In the following example, I'm going to use the Aggregate transformation to get the number of books and its total cost by the year of the bought date. We'll use the the same data source as that in the previous section.

  1. Copy & past to a copy of the package Multicast.dtsx and rename the copy to Aggregate.dtsx.

  2. Open the package and click Data Flow tab, then remove the output from the Multicast to the Row Count transform.

  3. Drag and drop Derived Column and Aggregate transformation to the editor area and link them together as shown below.
  4. [Add Aggregate Transforms]

  5. Right-click Derived Column transform and choose "Edit..." to open its editor. Add a new column Year and configure its expression like below then click OK button.
  6. [New Derived Column]

    We'll group by the new column "Year" in the next step.

  7. Right-click the Aggregate transform and choose "Edit..." to open the Aggregate Transformation Editor. Then choose Year, (*) and Price and make their settings.
  8. [Aggregate Transformation Editor]

    We chose "Group by" operation for Year, "Count all" for all the records with its output alias changed to "Number of Books", and "Sum" for Price with Output Alias changed to "Total Cost". CountDistinctScale and CountDistinctKeys are properties for Distinct count operation and their settings will improve running performance.

  9. Click OK button. Then add Data Viewer before and after the Aggregate transform.
  10. [Add Data Viewers]

  11. Run the package.
  12. [Data In Data Viewers]

    The left side is the Input data of the Aggregate transform and the right side is the output data. As you can see the Aggregate transformation generates aggregation data based on its input data.