Home > SSIS > Multiple data aggregation using single dataset

Multiple data aggregation using single dataset


Aggregation is a process of calculating data such as sum, average, min, max, etc. It as another name in reporting environment called summary table. Pre-aggregated data is stored in a temporary or permanent table to serve for the summary reports. It help in improving performance of the report by avoiding real-time SQL query processing. It can be seen in the data warehouse environment widely. Of course, there are various methods to improve the query performance mainly for the summary report such as report snap shot, cache, etc.

In this article I will be talking about Aggregate transformation.

Aggregate Transformation

Aggregate transformation summarize input dataset. Aggregate Transformation applies two set of aggregation function on the dataset.

1. Functions for string aggregation.

2. Functions for number aggregation.

Some functions are common between two data types such as count, group by and count distinct.

The most important feature in this task is that we can create multiple aggregation output from single dataset.

Scenario

I have a requirement to get a summarized data from “AdventureWorks.Sales.SalesOrderDetail”.

But, I have to use only one aggregate task to calculate sum(unitprice) of each product and the average(order quantity) of each product. Special requirement is to store these two aggregation ouput into two destination.

Implementation

Step 1: Create a package and add a data flow task.

Step 2: Go to data flow task and add ADO.Net data source with adventureworks.sales.salesorderdetail table.

Step 3: Add aggregate task and connect Ado.net data source with it.

Step 4: Open aggregate transformation editor and click advance button.

Step 5: Select Productid and Unitprice. Set the aggregate operation to groupby for productid and sum to unitprice. This aggregation output pipeline is named as aggregate output 1. We can always change this name.

Step 6: Add another aggregate output and select orderqty and product id. Now, select the aggregate function for productid as groupby and orderqty as average.

Step 7: Add two union all task and connect the aggregate pipeline Aggregate Output 1 and Aggregate Output 2 respectively.

Step 8: Add data viewer to see the result.

Step 9: Execute package.

For more information read MSDN article.

Thanks for reading.

 

Advertisements
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: