BIML VIII – Managing Errors in Data Destinations (ADO Destination)


In a previous post we looked at the biml required to send data to an ODBC destination. Relational destinations are the bread and butter of SSIS (well the work I do anyway) and one of the most common ways to manage insert errors is to redirect the dataflow two or more forgiving destination. This post will look at how to achieve that.

Environment

The example that were going to use is pretty contrived – we will send some sample data through to a table with a primary key. Initially the table is empty and we will expect all the data load. A second data flow will then try to add all records and what we expect to find is that the number of records originally inserted will be rejected into an errors table.

It’s pretty simple example where I am relying on the primary key (and duplicates) to raise the error on the second load. Since my tables are initially empty, I know the number of errors that I am expecting (ie the number of records loaded in the first load). The SQL to create the two tables is;

create table dbo.customer 
(
	CustomerAlternateKey nvarchar(25) not null primary key 
	, FirstName nvarchar(50)
	, LastName nvarchar(50)

);
create table dbo.customer_errors 
(
	CustomerAlternateKey nvarchar(25)   
	, FirstName nvarchar(50)
	, LastName nvarchar(50)

);

The Biml

We’ll start off with the biml we left off in the previous post. There are a few small changes. You’ll notice that we’ve relying on auto-mapping of column names, are using an ado.net destination, selecting 100 customer records (source) and have changed a few names.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OdbcConnection Name="src_adventure_works" ConnectionString="Driver={SQL Server Native Client 11.0};Server=.;Database=AdventureWorksDW2012;Trusted_Connection=yes;" />
    <AdoNetConnection Name="dest_stage_ado" ConnectionString="Data Source=.;Initial Catalog=stage;Integrated Security=true;" Provider="System.Data.SqlClient" />
  </Connections>
    <Packages>
      <Package Name="copy_customer_with_error" ConstraintMode="Linear">
        <Tasks>
          <Dataflow Name="dft_customer_01">
            <Transformations>
              <OdbcSource Name="src_adventure_works" Connection="src_adventure_works">
                <DirectInput>
                  select top 100 CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
                </DirectInput>
              </OdbcSource>
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" >
                <ExternalTableOutput Table="[dbo].[customer]" />
              </AdoNetDestination>
            </Transformations>
          </Dataflow>
        </Tasks>
      </Package>
    </Packages>
</Biml>

Next, we’ll add a second dataflow (called dft_customer_02). The purpose of this flow is to reinsert (all) records into the customer table. At the moment, this dataflow is exactly the same as the first except that our source doesn’t restrict the number of records.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OdbcConnection Name="src_adventure_works" ConnectionString="Driver={SQL Server Native Client 11.0};Server=.;Database=AdventureWorksDW2012;Trusted_Connection=yes;" />
    <AdoNetConnection Name="dest_stage_ado" ConnectionString="Data Source=.;Initial Catalog=stage;Integrated Security=true;" Provider="System.Data.SqlClient" />
  </Connections>
    <Packages>
      <Package Name="copy_customer_with_error" ConstraintMode="Linear">
        <Tasks>
          
          <Dataflow Name="dft_customer_01">
            <Transformations>
              <OdbcSource Name="src_adventure_works" Connection="src_adventure_works">
                <DirectInput>
                  select top 100 CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
                </DirectInput>
              </OdbcSource>
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" >
                <ExternalTableOutput Table="[dbo].[customer]" />
              </AdoNetDestination>
            </Transformations>
          </Dataflow>

          <Dataflow Name="dft_customer_02">
            <Transformations>
              <OdbcSource Name="src_adventure_works" Connection="src_adventure_works">
                <DirectInput>
                  select CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
                </DirectInput>
              </OdbcSource>
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" >
                <ExternalTableOutput Table="[dbo].[customer]" />
              </AdoNetDestination>
            </Transformations>
          </Dataflow>          
          
        </Tasks>
      </Package>
    </Packages>
</Biml>

Now, we’ll add the error configuration. If we think about what we are trying to achieve, we need to alter our dft_customer_02 dataflow to add another destination (capture the errors) and configure the first destination dest_customer to redirect errors to it. The snippet for dft_customer_02 becomes;

<Dataflow Name="dft_customer_02">
<Transformations>
  <OdbcSource Name="src_adventure_works" Connection="src_adventure_works" >
	<DirectInput>
	  select CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
	</DirectInput>
  </OdbcSource>
  
  <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" >
	<ExternalTableOutput Table="[dbo].[customer]" />
	<ErrorHandling ErrorRowDisposition="RedirectRow" TruncationRowDisposition ="RedirectRow" />
  </AdoNetDestination>
  
  <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer_errors"  >
	<InputPath OutputPathName="dest_customer.Error" />
	<ExternalTableOutput Table="[dbo].[customer_errors]" />
  </AdoNetDestination>
</Transformations>

</Dataflow> 

The key components to include for error configuration is the definition of error handling (in dest_customer) and the InputPath for the errors destination. For ErrorHandeling, both attributes ErrorRowDisposition and TruncationRowDisposition are required. Also not the input path for the error table is the output path of the prior component (component name suffixed with “.Error”).

The Result Package

When we execute this biml and run the package, our output resembles the following. Note the 100 rows are correctly inserted in dft_customer_01 as we expect. Unfortunately, the second data flow (dft_customer_02) pushes all records to the error output. Since we only have 100 records in our customer table, we would expect 100 rows to be sent to the error because these are the records that violate the primary key (constraint).

This reason for this is the default configuration for our dest_customer destination (in dft_customer_02) – by default is tries a bulk load.

When a data destination is used that supports bulk inserts, and there is a fail on any insert, the entire buffer is rejected. We can see this setting in SSIS configuration for bulk inserts in the screen shot below. The same applies to fast loads on ODBC destinations which can use a Batch processing (Bulk) mode or a Row by Row insert. Naturally, the row by row methods will only reject the rows that fail.

This is a property of the destination, and our completed code (below) includes the fast load configuration (its only applicable to the dest_customer destination in dft_customer_02. When we run or package, the second data flow then rejects the 100 records that were previously inserted (remember to initialise the environment by delete records from the tables in our staging database).

Complete Code

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
  <Connections>
    <OdbcConnection Name="src_adventure_works" ConnectionString="Driver={SQL Server Native Client 11.0};Server=.;Database=AdventureWorksDW2012;Trusted_Connection=yes;" />
    <AdoNetConnection Name="dest_stage_ado" ConnectionString="Data Source.;Initial Catalog=stage;Integrated Security=true;" Provider="System.Data.SqlClient" />
  </Connections>
    <Packages>
      <Package Name="copy_customer_with_error" ConstraintMode="Linear">
        <Tasks>
          
          <Dataflow Name="dft_customer_01">
            <Transformations>
              <OdbcSource Name="src_adventure_works" Connection="src_adventure_works">
                <DirectInput>
                  select top 100 CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
                </DirectInput>
              </OdbcSource>
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" >
                <ExternalTableOutput Table="[dbo].[customer]" />
              </AdoNetDestination>
            </Transformations>
          </Dataflow>

          <Dataflow Name="dft_customer_02">
            <Transformations>
              <OdbcSource Name="src_adventure_works" Connection="src_adventure_works" >
                <DirectInput>
                  select CustomerAlternateKey, FirstName, LastName from dbo.DimCustomer;
                </DirectInput>
              </OdbcSource>
              
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer" UseFastLoadIfAvailable="false">
                <ExternalTableOutput Table="[dbo].[customer]" />
                <ErrorHandling ErrorRowDisposition="RedirectRow" TruncationRowDisposition ="RedirectRow" />
              </AdoNetDestination>
              
              <AdoNetDestination ConnectionName ="dest_stage_ado" Name="dest_customer_errors"  >
                <InputPath OutputPathName="dest_customer.Error" />
                <ExternalTableOutput Table="[dbo].[customer_errors]" />
              </AdoNetDestination>
            </Transformations>
            
          </Dataflow>          
          
        </Tasks>
      </Package>
    </Packages>
</Biml>

 

Advertisements

One thought on “BIML VIII – Managing Errors in Data Destinations (ADO Destination)

  1. Pingback: Biml IX – The Lookup | Paul te Braak

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s