Search This Blog

Loading...

Sunday, November 24, 2013

Querying SSIS package pipeline output from T-SQL

SQL Server 2012

This post is about the SSIS Data Streaming Destination. I must admit I wasn’t aware that it existed until I saw it in a presentation some weeks ago. Reason enough to write a short blog post about it. You can find a more detailed explanation here: http://office.microsoft.com/en-us/office365-sharepoint-online-enterprise-help/publish-ssis-packages-as-odata-feed-sources-HA104079177.aspx

The Data Streaming Destination comes with the Microsoft SQL Server 2012 Integration Services Data Feed Publishing Components, which can be downloaded using this link. To demonstrate the functionality, I created a simple SSIS package that reads data from the DimCustomer table of the AdventureWorks database. The output is then routed to the Data Streaming Destination.

image

The project containing the package is then deployed to the Integration Services Catalog on a SQL Server 2012 (project deployment model). Here is the path and package name that I used for this example:

image

Next we need to create a linked server to SSIS. In order to do so, two prerequisites have to be checked:

  • SSISOleDb Provider has to be installed
  • The “Allow inprocess” option of the SSISOleDb provider has to be enabled

 

You can check this in Management Studio using the object explorer by expanding the tree Server Objects/Linked Servers/Provider:

image

Double click the SSISOLEDB entry brings up the configuration dialog:

image

Make sure the option “Allow inprocess” is enabled. You may use this dialog or execute the following script:

USE [master]
GO
EXEC master.dbo.sp_MSset_oledb_prop N'SSISOLEDB', N'AllowInProcess', 1
GO

 

We now create a linked server to SSIS:

USE [master]
GO
EXEC sp_addlinkedserver
@server = N'SSISFeedServer',
@srvproduct = N'Microsoft',
@provider = N'SSISOLEDB',
@datasrc = N'.'
GO

image

And finally we can query the package result using the following T-SQL command:

SELECT * FROM OPENQUERY(SSISFeedServer,N'Folder=test;Project=SSISDataFeed;Package=DSD_Demo.dtsx')

image

Within the 2nd parameter of the OPENQUERY command you may also specify

  • Parameters
    Syntax is …;Parameters="<par_name1>=<par_val1>;<par_name2>=<par_val2>";…
  • Timeout
    Syntax is …;Timeout=<Number of seconds>;…
  • Usage of 32bit Runtime
    Syntax is …;Use32BitRuntime=<true | false>;…

 

Ok, the customer table is quite small, so let’s try with a bigger table, for example the FactProductInventory table:

  direct SQL Streaming
count(*) 0 sec 8 sec
sum(UnitsIn) 0 sec 8 sec
join to dimension and filter by dimension property, count(*)1 0 sec 8 sec
join to other fact table by date/product, group by date2 1 sec 103 sec
(output started at 17 sec)

1) Query was: SELECT count(*) From OPENQUERY(SSISFeedServer,N'Folder=test;Project=SSISDataFeed;Package=DSD_Demo2.dtsx')  PIN
inner join DimProduct P on PIN.ProductKey = P.ProductKey
where P.StandardCost>10.0

2) Query was: SELECT PIN.DateKey, count(*) From OPENQUERY(SSISFeedServer,N'Folder=test;Project=SSISDataFeed;Package=DSD_Demo2.dtsx') PIN
inner join FactInternetSales FIS on PIN.DateKey=FIS.OrderDateKey and PIN.ProductKey=FIS.ProductKey
group by PIN.DateKey

 

Some final remarks:

  • Instead of waiting for the package to end, rows are transferred to the SQL query as they are returned from the SSIS package pipeline.
  • Of course, this approach has some latency as the SSIS package needs to be validated and prepared
  • Not all column types are supported, for example varbinary(max) is not supported
  • Make sure to have only one streaming destination in your package. I ran into problems even if I had only two destinations with one being disabled.

No comments:

Post a Comment