BIML II – Extracting Database Tables to Text Files


In this post we look at the Biml required to generate a SSIS package that copies data from a table to a flat file (that is, a text file output). It’s a pretty simple example but one that can build on our prior knowledge. We have already gone over the basics of Biml in our first post so we won’t worry about an overview of Biml, how to install it or generate packages. What’s worth remembering though, is that Biml is an XML specification and packages and connections are child nodes of the XML root. We’ll also deviate little bit in this post by using an OLEDB connection to SQL Server and specifying a file format.

Overview

If we think about the way our package would be built in SSIS, we would probably follow a simple set of steps;

  1. Create a connection to SQL Server
  2. Add a dataflow task
  3. Specify a source (let’s just use an OLEDB source) and provide the query.
  4. Add a flat file as a destination output then join the source to destination and then configure the output by creating a new flat file connection manager.

What’s somewhat hidden in this process is that the output file structure is defined automatically when we create a new connection – that is, in the destination wizard (it is handled by the wizard when you create a new connection manager). Keep in mind that the Biml has no idea about this and so we need to specify the file structure somewhere in the Biml code.

So if we are going to achieve this in Biml, we would need to;

  1. Add the connections
  2. Specify the file format
  3. Create a package that has a data flow task
    1. Uses the source connection (with a query) to produce a dataset … and
    2. Places that dataset into an output connection.

So we would expect our structure to be something like this (I’m using a little prior knowledge and know that the FileFormats tag is a root element);

<Biml xmlns="http://schemas.varigence.com/biml.xsd">

  <Connections>
    <!-- Placeholder for connection -->
  </Connections>

  <FileFormats>
    <!-- Placeholder for structures-->
  </FileFormats>

  <Packages>
    <Package Name="CopyAccount" ConstraintMode="Linear">
    <!-- Place Holder for a Package -->
    </Package>
  </Packages>

</Biml>

Connections

If we follow our previous example, we can simply specify our connections under the connection node as in the following code (again Intellisense can work wonders here). The OLEDB connection should be pretty straight forward, but what is interesting is the simplicity of the FlatFile connection. As you would expect, its go a Name and a Path (or destination), but what’s interesting is the FileFormat specification. This is a required property for a flat file and specifies the structure of the stream.

<Connections>
  <OleDbConnection Name="Source" ConnectionString="Provider=SQLOLEDB;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=AdventureWorksDW2012" />  
  <FlatFileConnection Name="Target" FilePath="F:\\Output.txt" FileFormat="AccountTable" />
</Connections>

FileFormat

We have briefly discussed the creation of a format. For our purposes we could summarise it as the definition of a dataflow (somewhat akin to the definition of a table). As you would expect (and is logical for an XML table definition), the structure defines a header (that is the table container) and the columns within that table.

<FileFormats>
    <FlatFileFormat Name="AccountTable" ColumnNamesInFirstDataRow="true" >
      <Columns>
        <Column Name="AccountKey" Delimiter="Comma" />
        <Column Name="AccountCodeAlternateKey" Delimiter="Comma" />
        <Column Name="AccountDescription" Delimiter="Comma" />
        <Column Name="AccountType" Delimiter="Comma" />
        <Column Name="Operator" Delimiter="LF" />
      </Columns>
    </FlatFileFormat>
</FileFormats>

Naturally each node has to have a Name, and columns also require a Delimiter. You’ll notice that the last columns delimiter is a Line Feed (LF or new line). This just ensures that each row appears on its on line.

We could also specify data types for columns and other properties however, these are not mandatory for our example. What is required is that the field names for the FileFormat match those from our table. When the SSIS package is generated, the mappings from the source (OLEDB data source) to the destination (the FlatFile) will be based on name. The order is not important, only the Name.

The Package

If we consider that we have the framework for moving data, we would expect that our package require the following tasks with a precedence constraint between them;

  1. A Data Flow Task
    1. OLEDB Source (which specifies the connection and the SQL Command)
    2. A flat file destination (specifying the file and format).

This is exactly what we need to specify in our package. The Biml for this is;

    <Package Name="Copy_Dim_Account" ConstraintMode="Linear" >
      <Tasks>
        <Dataflow Name="Copy_Dim_Account">
          <Transformations>
            <OleDbSource Name="Src_Dim_Account" ConnectionName="Source" >
              <DirectInput>Select AccountKey, AccountCodeAlternateKey, AccountDescription, AccountType, Operator from DimAccount;</DirectInput>
            </OleDbSource>
            <FlatFileDestination Name="Dest_Dim_Account" ConnectionName="Target" Overwrite="true">
            </FlatFileDestination>
          </Transformations>
        </Dataflow>
      </Tasks>
    </Package>

We can easily see how these tasks relate to both the source and destination of the dataflow and how they reference the objects that we previously defined (ie the connection and FileFormat).

Previously, we also examined the use of the packages ConstraintMode as a property which dictates how tasks on the Control Flow relate to each other. That is, a linear mode creates constraints between tasks where as a Parallel mode does not.
This property is not materialised in the DataFlow. For example, if the package has a Parrallel Constraint mode, the tasks (or transformations) within the data flow will automatically have the precedence constraint created (in the order that they appear in code).

Also note that the source uses a direct input tag as a child of the OLEDB Source which is the same as the syntax for the ExecuteSql tag.

The Complete Solution

For completeness, the full code is ;

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OleDbConnection Name="Source" ConnectionString="Provider=SQLOLEDB;Data Source=localhost;Integrated Security=SSPI;Initial Catalog=AdventureWorksDW2012" />  
    <FlatFileConnection Name="Target" FilePath="F:\\Output.txt" FileFormat="AccountTable" />
  </Connections>
  <FileFormats>
    <FlatFileFormat Name="AccountTable" ColumnNamesInFirstDataRow="true" RowDelimiter="LF" >
      <Columns>
        <Column Name="AccountKey" Delimiter="Comma" />
        <Column Name="AccountCodeAlternateKey" Delimiter="Comma" />   
        <Column Name="AccountDescription" Delimiter="Comma" />
        <Column Name="AccountType" Delimiter="Comma"  />
        <Column Name="Operator" Delimiter="LF" />
      </Columns>
    </FlatFileFormat>
  </FileFormats>

  <Packages>
    <Package Name="Copy_Dim_Account" ConstraintMode="Linear" >
      <Tasks>
        <Dataflow Name="Copy_Dim_Account">
          <Transformations>
            <OleDbSource Name="Src_Dim_Account" ConnectionName="Source" >
              <DirectInput>Select AccountKey, AccountCodeAlternateKey, AccountDescription, AccountType, Operator from DimAccount;</DirectInput>
            </OleDbSource>
            <FlatFileDestination Name="Dest_Dim_Account" ConnectionName="Target" Overwrite="true">
            </FlatFileDestination>
          </Transformations>
        </Dataflow>
      </Tasks>
    </Package>
  </Packages>
  </Biml>

Conclusion

In this post, we have looked at the generation of packages that move data from a table to a text file. This is a simple scenario that demonstrates the minimum requirements for tag specification and utilises automatic mapping of source fields to target.

Advertisements

7 thoughts on “BIML II – Extracting Database Tables to Text Files

  1. Pingback: BIML III – Extracting Database Tables to Raw Files | Paul te Braak

  2. Pingback: BIML VII – Extracting Data Using ODBC Destinations | Paul te Braak

  3. Pingback: BIML – Importing Text Files | Paul te Braak

  4. HI Paul,

    It is wonderful way to learn BIML. Thanks for sharing your knowledge. If I want same solution in dynamic way for e.g. I have metadata tables which provide information about servername, database name, table name and column name. If possible please provide BIML script to generate each ssis package for table in metadata. Thanks in advance

    Thansk,
    Kamlesh

    • Hi Kamlesh,

      Yes – of course you can, you are just talking about iterations within iterations (eg
      foreach (connection _c in connections)
      {
      foreach (table _t in _c.Tables)
      {
      /* Do some work */
      }
      }

      You may have to build the constructs but i think you can see where i am going with it?

  5. Hi Paul,
    Just wanted to say how much your article has helped me today! I’m new to BIML and I find the way you write very easy to understand.
    Many thanks,
    Hollie

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s