Biml XIII – The Biml Script for the Entire SSAS Database


This final post on biml SSAS processing and provides the full script to produce two packages.  It really ties in the first 2 posts to provide a complete solution for our processing scenario.  Remember, our goal is to point our biml at a SSAS database and have a processing package built that only processes the last 2 fact partitions of each measure group based on their last processed date.

The First Method

Firstly, we look at a very conventional arrangement that would process dimensions, partitions and indexes.  This looks like the image below.  In order to get the most performance (lowest processing time), we need to specify that each processing task uses parallelisation.

image

The code is similar to prior posts so we wont dwell on it too much.  Basically, we use AMO (note the reference to the AMO dll in the first line) to connect to our cube, discover dimensions and partitions and these in an iterative manner to each processing task.  The two methods in the C# code of importance are DimensionKeys (which returns all dimensions in the database) and Partitions (which returns the last two processed partitions for our intended measure group).

<#@ assembly name="C:\Program Files (x86)\Microsoft SQL Server\110\SDK\Assemblies\Microsoft.AnalysisServices.dll" #> 

<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="Microsoft.AnalysisServices" #> 

<#@ template language="C#" #> 

<#
    String _ssas_server = @".\SQL2008R2";
    String _ssas_database = "Adventure Works DW 2008R2";
    int _max_partitions = 2;
#> 

<#+ 
Dictionary<string, string> DimensionKeys(string Server, string Database)
{
     Dictionary<string, string> _dimensions = new Dictionary<string, string>();
     Server _ssas_server = new Server();
     _ssas_server.Connect(Server); 

     Database _db = _ssas_server.Databases[Database];
     foreach (Dimension _dim in _db.Dimensions)
         _dimensions.Add(_dim.ID, _dim.Name); 

     _ssas_server.Disconnect();
     _ssas_server.Dispose(); 

     return _dimensions;
} 

public static List<PartitionStruct> Partitions(string ServerName, string DatabaseName, string CubeId, string MeasureGroupId)
{
        var _ret = new List<PartitionStruct>(); 

        Server _ssas_server = new Server();
        _ssas_server.Connect(ServerName);
        Database _db = _ssas_server.Databases[DatabaseName];
        Cube _c = _db.Cubes[CubeId];
        MeasureGroup _mg = _c.MeasureGroups[MeasureGroupId]; 

        foreach (Partition _p in _mg.Partitions)
        {
                _ret.Add(new PartitionStruct(_p.Name.ToString(), _p.ID.ToString(), _p.LastProcessed));
        }
        _ssas_server.Disconnect();
        _ssas_server.Dispose(); 

        _ret.Sort((y, x) => x.PROCESS_DATE.CompareTo(y.PROCESS_DATE)); 

        return _ret;
} 

public struct PartitionStruct
{
        public readonly string PARTITION_NAME;
        public readonly string PARTITION_ID;
        public readonly DateTime PROCESS_DATE;
        public PartitionStruct(string partition_NAME, string partition_ID, DateTime process_DATE)
        {
                PARTITION_NAME = partition_NAME;
                PARTITION_ID = partition_ID;
                PROCESS_DATE = process_DATE;
        } 

} 

#>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <AnalysisServicesConnection Name="olap" ConnectionString="Data Source=<#= _ssas_server #>;PROVIDER=MSOLAP;Impersonation Level=Impersonate;" Server="<#= _ssas_server #>" />
  </Connections>
  <Packages>
    <Package Name="SSAS_PROCESSING_ssas_process_all" ConstraintMode="Linear" ProtectionLevel="EncryptAllWithUserKey">
         <Tasks>
            <AnalysisServicesProcessing Name="Process_Dimensions_Update" ConnectionName="olap" ProcessingOrder="Parallel" >
              <ProcessingConfigurations>
                <# 
                Dictionary<string, string> _dimensions = DimensionKeys("Data Source=" + _ssas_server , _ssas_database);
                foreach (string _dim in _dimensions.Keys){ #>            
                        <DimensionProcessingConfiguration DatabaseId="<#= _ssas_database #>" ProcessingOption="ProcessUpdate"  DimensionId="<#= _dim #>" />
                <# } #>
                  </ProcessingConfigurations>
            </AnalysisServicesProcessing>
            <AnalysisServicesProcessing Name="Process_Partitions_Data" ConnectionName="olap" ProcessingOrder="Parallel">
                <ProcessingConfigurations>
                <#    
                Server _ssas_server_ = new Server();
                _ssas_server_.Connect(_ssas_server); 

                Database _db = _ssas_server_.Databases[_ssas_database];
                foreach (Cube _c in _db.Cubes) 
                { 
                    foreach (MeasureGroup _mg in _c.MeasureGroups)
                    {  
                        List<PartitionStruct> _listPartitions = Partitions(_ssas_server, _ssas_database, _c.ID, _mg.ID);
                        for (int _i = 0; _i < _max_partitions & _i < _listPartitions.Count; _i++ ) { #>
                        <PartitionProcessingConfiguration DatabaseId="<#= _db.ID #>" CubeId="<#= _c.ID #>"  MeasureGroupId="<#= _mg.ID #>" PartitionId="<#= _listPartitions[_i].PARTITION_ID #>" ProcessingOption="ProcessData" />
                     <# }
                    }
                    } 
                    #>
                </ProcessingConfigurations>
            </AnalysisServicesProcessing>
            <AnalysisServicesProcessing Name="Process_Indexes" ConnectionName="olap" ProcessingOrder="Parallel">
                <ProcessingConfigurations>
                    <# 
                    _db = _ssas_server_.Databases[_ssas_database];
                    foreach (Cube _c in _db.Cubes) { #>
                    <CubeProcessingConfiguration ProcessingOption="ProcessIndexes"  DatabaseID="<#= _db.ID #>" CubeId="<#= _c.ID #>"/>
                    <# } 
                       _ssas_server_.Disconnect();
                    #>
                </ProcessingConfigurations>
            </AnalysisServicesProcessing>
        </Tasks>
    </Package>
  </Packages>
</Biml>

A slight word of caution … If you look into the biml code, you will see that we explicitly set the ProcessingOrder as parallel.  This specifies that the Processing Order (mode) for the task should be in parallel and that the batch XMLA sent to your SSAS Server should include that option.  Unfortunately, this may not materialise on all versions of BIDS & BIML and the default (sequential) may be used.  I would check the outcome before implementation.

Method 2

Our second approach is essentially the same as the first, however, in this approach, we encapsulate each each measure group processing command in its own container with sequential constraints between each partition process.  Why?  Because we want to ensure that our partitions are processed according to last processed date.  That is, the last 2 processed partitions should be processed and we should process them in the order that we originally processed them.  If we add a new partition, we want to ensure that the one with the oldest data drops off the list.  You’ll also notice that we use SSIS containers to gain parallelisation for each Cube and Measure Group.

This code relies on the same methods (DimensionKeys and Partitions) used in the prior to build its output.

image

<#@ assembly name="C:\Program Files (x86)\Microsoft SQL Server\110\SDK\Assemblies\Microsoft.AnalysisServices.dll" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="Microsoft.AnalysisServices" #> 
<#@ template language="C#" #> 

<#
    String _ssas_server_name = @".\SQL2008R2";
    String _ssas_database_name = "Adventure Works DW 2008R2";
    int _max_partitions = 2;
#>
<#+ 

Dictionary<string, string> DimensionKeys(string Server, string Database)
{
     Dictionary<string, string> _dimensions = new Dictionary<string, string>();
     Server _ssas_server = new Server();
     _ssas_server.Connect(Server); 

     Database _db = _ssas_server.Databases[Database];
     foreach (Dimension _dim in _db.Dimensions)
         _dimensions.Add(_dim.ID, _dim.Name); 

     _ssas_server.Disconnect();
     _ssas_server.Dispose(); 

     return _dimensions;
} 

public static List<PartitionStruct> Partitions(string ServerName, string DatabaseName, string CubeId, string MeasureGroupId, int PartitionNum)
{
        /* returns a number of partitions (PartitionNum)
         * based on their processed_date -> ie last n processed partitions
         * sorted in asc order
         */ 
        List<PartitionStruct> _ret = new List<PartitionStruct>(); 

        Server _ssas_server = new Server();
        _ssas_server.Connect(ServerName);
        Database _db = _ssas_server.Databases[DatabaseName];
        Cube _c = _db.Cubes[CubeId];
        MeasureGroup _mg = _c.MeasureGroups[MeasureGroupId]; 

        foreach (Partition _p in _mg.Partitions)
        {
                _ret.Add(new PartitionStruct(_p.Name.ToString(), _p.ID.ToString(), _p.LastProcessed));
        }
        _ssas_server.Disconnect();
        _ssas_server.Dispose(); 

        _ret.Sort((y, x) => x.PROCESS_DATE.CompareTo(y.PROCESS_DATE)); 

        /* get first PartitionNum */
        while (_ret.Count > PartitionNum)
                _ret.RemoveAt(PartitionNum); 

        // ret asc
        _ret.Sort((x, y) => x.PROCESS_DATE.CompareTo(y.PROCESS_DATE)); 
        return _ret;
} 

public struct PartitionStruct
{
        public readonly string PARTITION_NAME;
        public readonly string PARTITION_ID;
        public readonly DateTime PROCESS_DATE;
        public PartitionStruct(string partition_NAME, string partition_ID, DateTime process_DATE)
        {
                PARTITION_NAME = partition_NAME;
                PARTITION_ID = partition_ID;
                PROCESS_DATE = process_DATE;
        }
}
#> 

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <AnalysisServicesConnection Name="olap" ConnectionString="Data Source=<#= _ssas_server_name #>;PROVIDER=MSOLAP;Impersonation Level=Impersonate;" Server="<#= _ssas_server_name #>" />
    </Connections>
    <Packages>
        <Package Name="SSAS_PROCESSING <#= _ssas_database_name #>" ConstraintMode="Linear">
            <Tasks> 

                <AnalysisServicesProcessing Name="Process_Dimensions_Update" ConnectionName="olap" ProcessingOrder="Parallel" >
                      <ProcessingConfigurations>
                        <# 
                            Dictionary<string, string> _dimensions = DimensionKeys("Data Source=" + _ssas_server_name , _ssas_database_name);
                                foreach (string _dim in _dimensions.Keys){ #>            
                                    <DimensionProcessingConfiguration DatabaseId="<#= _ssas_database_name #>" ProcessingOption="ProcessUpdate"  DimensionId="<#= _dim #>" />
                        <# } #>
                      </ProcessingConfigurations>
                </AnalysisServicesProcessing> 

                <Container Name="FACT_PROCESSING" ConstraintMode="Parallel">
                    <Tasks>
                    <#
                    Server _ssas_server_ = new Server();
                    _ssas_server_.Connect(_ssas_server_name);
                    Database _db = _ssas_server_.Databases[_ssas_database_name];
                    foreach (Cube _c in _db.Cubes) { #>
                        <Container Name="CUBE <#= _c.Name #>" ConstraintMode="Parallel">
                            <Tasks>
                                <# foreach (MeasureGroup _mg in _c.MeasureGroups){#>
                                <Container Name="MEASURE GROUP <#= _mg.Name #>" ConstraintMode="Linear">
                                    <Tasks>
                                        <# List<PartitionStruct> _listPartitions = Partitions(_ssas_server_name, _ssas_database_name, _c.ID, _mg.ID, _max_partitions);
                                           foreach (PartitionStruct _partition in _listPartitions) { #>
                                              <AnalysisServicesProcessing Name="PARTITION <#= _partition.PARTITION_NAME #>" ConnectionName="olap">
                                                  <ProcessingConfigurations>
                                                      <PartitionProcessingConfiguration CubeId="<#= _c.ID #>" DatabaseId="<#= _db.ID #>" MeasureGroupId="<#= _mg.ID #>" PartitionId="<#= _partition.PARTITION_ID #>" ProcessingOption="ProcessData" />
                                                  </ProcessingConfigurations>
                                              </AnalysisServicesProcessing> 
                                        <# }#>
                                    </Tasks>
                                </Container>
                                <# }  #>
                            </Tasks>
                        </Container>
                    <#} #>
                    </Tasks>
                </Container> 

            <AnalysisServicesProcessing Name="Process_Indexes" ConnectionName="olap" ProcessingOrder="Parallel">
                <ProcessingConfigurations>
                    <# 
                    _db = _ssas_server_.Databases[_ssas_database_name];
                    foreach (Cube _c in _db.Cubes) { #>
                    <CubeProcessingConfiguration ProcessingOption="ProcessIndexes"  DatabaseID="<#= _db.ID #>" CubeId="<#= _c.ID #>"/>
                    <# } 
                       _ssas_server_.Disconnect();
                    #>
                </ProcessingConfigurations>
            </AnalysisServicesProcessing> 

            </Tasks>
        </Package>
    </Packages>
</Biml>

Conclusion

You might suggest that this process is a very long way to build a processing schedule – and you may be right.  If you had to build it from scratch once to create a package that does some processing, you might have achieved it faster by hand.  However, here’s the thing about automation … you don’t have to code it, its there for you can you can use it to build out your packages and now package creation takes 10 sec! 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s