Determining SSAS Aggregation Usage


Aggregations improve query performance by limiting the amount of partition data that needs to be scanned to answer a query. Instead of querying the fact level data of the partition, SSAS will query the aggregation and because the amount of data in the aggregation is smaller, the query can be answered more succinctly. However these benefits come with a trade-off, they take time to build which increases processing time and space. They can also have an adverse impact on performance where there too many designs. This post shows how to determine if an aggregation is being used by using trace events.

A quick overview: When SSAS receives an MDX request, the statement is passed to the formula engine (FE). The FE requests data from the cache. If the data is in the cache (readily available) it can be returned to the FE directly, if not, it the request is passed to the storage engine (SE) which retrieves the data. Note that, if the data is available in cache, there is no request to the SE.

This raises two side points. Firstly, there is no SE requirement for data that is in cache. Cache warming may improve query performance (without the need for the aggregation). However, to warm the cache, data must still be loaded (which may be improved with an aggregation). Secondly, the aggregation will only be effective when the query is executed against cold cache (requesting non-cached data).

In these examples, we are using the reseller sales partition of adventure works. There is a simple aggregation design that included [Month Name] which is only applied to the 2003 partition. Additionally, the relationships for the [date] dimension are shown below. Note that Calendar and Fiscal hierarchies can be defined from the [Month Name] attribute.

Trace Events

The trace is run with the following events;

Event Class Event
Progress Report Progress Report End
Query Events Query Event Begin
Query Event End
Query Processing Get Data From Aggregation
Get Data From Cache

Clearing the Cache

Unless otherwise stated, each query is executed after clearing the cache. Cache can be cleared from a database, cube or partition. For simplicity, cache is cleared from the database using the command.

<ClearCache xmlns=http://schemas.microsoft.com/analysisservices/2003/engine>
<Object>
<DatabaseID>Adventure Works</DatabaseID>
</Object>
</ClearCache>

A more comprehensive list of commands which cover how to specify other objects can be found at ssas-info;

http://www.ssas-info.com/analysis-services-faq/27-mdx/133-mdx-how-do-i-clear-analysis-services-ssas-database-cache

Query 1 – Data from the Partition

A query that requires data from 2002 will read the partition (recall there is no aggregation on this parition). We can see that the text data for the progress report end specifies the partition that is used (Reseller_Sales_2002).

select [Measures].[Reseller Sales Amount] on 0,
[Date].[Calendar].[Calendar Year].&[2002] on 1
from [Adventure Works]

When the same query is run without clearing the cache, the progress report is replaced with the event get data from cache.

Query 2 – Data from the Aggregation
When the query is executed for 2003, the trace event includes the get data from aggregation event. Additionally the text data for the progress report shows that the Aggregation is used.

select [Measures].[Reseller Sales Amount] on 0,
[Date].[Calendar].[Calendar Year].&[2002] on 1
from [Adventure Works]

Although not shown in the trace, the retrieval of data from the aggregation is faster than from the partition (duration of 7ms verse 0ms).

A Quick Note about the Benefits of Relationships

The only attribute defined in the aggregation was [Month Name], yet the queries, have used [Calendar Year] and hit the aggregation where possible. This is because there are relationships defined between [Month Name] and [Calendar Year] and, the aggregation can be used to answer queries for attributes from higher levels in the tree.

We can see the path by using the aggregation manager in BIDS helper (@ http://bidshelper.codeplex.com/)
Because the relationships for [Canendar] and [Fiscal] hierarchies are derived by the attribute [Month Name], both hierarchies can utilise the aggregation.
A query against the [Fiscal] hierarchy will also use the aggreagtion design.

Conclusion

This post has shown how to determine if aggregations are used to answer queries. When applied, they reduce the overhead associated with data retrieval. However, this reduction is only to the extent that it is requested from the SE. When data is in cache, there is no requirement on the SE (and hence neither the aggregation nor partition data will be used).

Advertisements

5 thoughts on “Determining SSAS Aggregation Usage

  1. Aw, this was a really nice post. In idea I would like to put in writing like this additionally – taking time and actual effort to make a very good article… but what can I say… I procrastinate alot and by no means seem to get something done.

  2. Good day very nice website!! Man .. Excellent .. Wonderful .
    . I’ll bookmark your site and take the feeds also? I am glad to search out so many helpful info here in the submit, we want work out more strategies on this regard, thanks for sharing. . . . . .

  3. Thanks a lot for sharing this with all folks you actually realize what
    you’re talking about! Bookmarked. Kindly also consult with my web site =).

    We could have a hyperlink change agreement between us!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s