In this example I’m using SSIS to write a .NET-like executable application that grabs data and “processes” it. Using SSIS is a nice alternative if your application is not “business-logic” heavy, and that way you can avoid a code-heavy program that may be more flexible but less robust in the process.
Background: Energy Suppliers must leverage pipeline transportation systems to transmit Natural Gas to local Utilies to be distributed to customers. Each day, the pipeline accepts nominations for transportation the following day. The Supplier must identify all counter-parties and indicate the date and amount to be delivered. Additionally, they must properly calculate the line-loss from delivery to destination, which is typically 5%.
Here are the steps at a very high level:
- Query for an un-submitted “open” set of gas nominations that I need to submit to the pipeline.
- Use the Nomination Set to load the nominations themselves.
- One-by-one, submit each nomination to the gas pipeline’s web service
- Log the output of each submission
- Close the Nomination Set
Setup
We’ll start with two tables:
Nomination – The table to house nominations to be submitted
Nomination Set – The “father table” to hold specifications for groups of Nominations
Nomination Tables
Sample Data from these tables
here.
At any time, I can create a new Nomination Set specifying parameters around the nominations that I want to encapsulate with this set. This Nomination Set will have its
Closed flag set to FALSE until I process the Nomination Set.
Create SSIS Package and Setup Variables
New Package
Variables in SSIS can serve several purposes. I’ve most often seen them used as configuration variables, specifying connection strings and URL’s, but you can also use them as local variables for your current working set of data.
Variables
- NominationSetID – The ID of the current Nomination Set that we’re processing.
- ScheduleDate – Loaded from the Nomination Set. Used as a filter when querying Nominations to submit.
- TransportationServiceURL – Loaded from the Nomination Set. Will be used by our web service task as the URL to call.
- TraderCode - Loaded from the Nomination Set. Used as a filter when querying Nominations to submit.
- NominationSector - Loaded from the Nomination Set. Used as a filter when querying Nominations to submit.
- NominationData - Will be used as an in-memory table for iterating through our nominations (more on this later).
- Loop_CurrentNomination – Used within our ForEach Loop. Represents the current Nomination Data to submit.
- Loop_NominationOutput – Used within our ForEach Loop. Represents the output from submitting our nomination to the web service.
Load an Open Nomination Set
I’ll use an Execute SQL task tied to an OLE DB connection to query for an open Nomination Set.
Load Open Nomination Task
This is a simple SQL Statement:
SELECT TOP 1
NominationSetID
,[NominationSector]
,[PipelineEndpointURL]
,[TraderCode]
,NominationSetDate = CONVERT(DATETIME,[NominationSetDate])
,[Closed]
FROM [dbo].[NominationSet]
WHERE Closed = 0
Load the result-set into our package variables for ongoing use. In order to do this we must set the
Result Set to
Single Row.
Set Result Set To Single Row
Then you must map your resulting output columns back to your pre-defined variables.
Map Result Set To Variables
Debugging can help verify that your mappings assign correectly.
Right-click on your Execute SQL Task and select
Edit Breakpoints
Edit Breakpoints
Using the
Watch Window, we can verify the values of our variables
Using the Watch Window to verify our variable assignments
Load Nominations
Now we run a query to load the Nomination data table rows that we will iterate through.
SELECT [NominationID]
,NominationData =
[Shipper_ContractNumber]
+ ' | ' + [DeliveringParty_ContractNumber]
+ ' | ' + CONVERT(VARCHAR(50),[StartDate],112)
+ ' | ' + CONVERT(VARCHAR(50),[StopDate],112)
+ ' | ' + [ReceiptMeterNumber]
+ ' | ' + CONVERT(VARCHAR(50),[ReceiptAmount])
+ ' | ' + [DeliveryMeterNumber]
+ ' | ' + CONVERT(VARCHAR(50),[DeliveryAmount])
+ ' | ' + [ReceivingPartyContractNumber]
,[ScheduleDate]
,[Sector]
,[TraderCode]
FROM [dbo].[Nomination]
WHERE
ScheduleDate = ?
AND Sector = ?
AND TraderCode = ?
We start by creating another
Execute SQL Task.
Because we want the full table, we set the Result Set output to
Full record set and assign it to the variable
NominationData.
Set Result Set to Full Record Set
Set Output to our NominationData variable
Loop through Dataset and Process
Now we can use a
Foreach Loop Container to iterate through our Nomination Data Set. For each iteration of the loop, we need to capture the current Nomination Data value by mapping it to our variable
Loop_CurrentNomination
Add a Foreach Container
Point enumeration to the NominationData collection from previous step
Map the current row to our variable Loop_CurrentNomination
Submit the Nomination
I’m using a simple .NET WCF Web Service that takes in the NominationData as a string and returns a confirmation message. I won't go into the details of a
Web Service Task or
ASP.NET WCF in this post, because I just want an example of "processing data".
Configure the SSIS Web service task to point to the URL and WSDL of the exposed service.
Set Method Call and pass in our variable Loop_CurrentNomination
To capture the result, I can retrieve the return value of my output into another SSIS variable,
Loop_NominationOutput.
Set output to our variable: Loop_NominationOutput
Because I'm hosting my web service, I can debug through the method call and verify the Nomination Data input:
Debugging our Web Service Call
Logging the Output
At this point, I have successfully submitted my nomination and want to log the output.
I can use a
Data Flow task with a destination Output File as my log.
Add Data Flow Task to our Foreach Loop
Our Data Flow task consists of 3 steps.
Our Data Flow Task to Log our output
In this case, we're only using our first step of the Data Flow task as a place-holder data-source. We'll be getting our data to output in the next step:
Data Flow Task Step 1
In the second step of our
Data Flow Task, we grab the variables we want to output to the log.
The expression for our
NominationOutput column is complex, but just understand we're exracting the output result string from our web service call.
Data Flow Task Step 2
For Step 3, I start by configuration a
Flat-File Connection Manager on my local drive. Then I can add my Flat-File output step and map my columns appropriately.
Data Flow Task Step 3 - Map Columns to Flat File
At this point, if I run my SSIS package I can see the output results of my web service submission call to my
LogOutput.txt file
Log file with our Nomination submission results
Closing our Nomination Set
Finally, to finish the process we'll run another
Execute SQL Task to close our Nomination Set. Our query is a simple
UPDATE statement. Notice the '?' marking the parameter input. We get our input from our variable
NominationSetID
Query to Close Nomination Set
Configure our input to our query
And now, one full cycle will query to find an un-closed Nomination Set, submit all Nominations through the exposed Web Service, log the output to our log file, and update the Nomination Set record.
Full source code for both the SSIS package and WCF Web Service can be found
here.
Conclusion
I realize that my rollercoaster is missing a few “struts” for complete stability (i.e. Error logging, transactional integrity). Error logging can be achieved through alternate error-routing in SSIS, which is another simple drag-and-drop task. Transactional Integrity is also a capability, but for this example it’s not necessarily achievable because I am calling a web-service, which would require a distributed roll-back on the service-end. But for everything else, I stand-by my assertion that SSIS has proven to be an incredibly powerful tool for automating rudimentary tasks (i.e. I/O, data retrieval and output). If I’m writing complex domain-driven code, I prefer and obviously recommend .NET, where you can modularize your code, wrap Unit Tests, and do much more. But for something this simple I was able to achieve what I need without a single line of code (scripting and querying is NOT code :) …).