Analytics in Action

How to build a SQL Server Integration Services package.

One of the biggest challenges for analysts is to cleanse, transform &  join multiple datasets so that the data can be used for  reporting, analysis or modelling.  This process is known as a data integration or  in more technical terms  as an  Extract, Transform, and Load (ETL) process. Without the right software ETL  is time consuming and painful, particularly when the data is required on a scheduled basis such as for  monthly reporting.

SQL Server Integration Services (SSIS)  is one of the more common & user friendly ETL tools. Its a common tool as SQL server databases are probably one of the most common databases used in business. As SSIS is a free component of the Standard & Enterprise editions many businesses have the tool.  Therefore its a  logical ETL tool for many businesses. I am, however,  continually surprised how many businesses have data  integrating challenges, but don’t realize the already own a tool that can help solve their pain.

Many analysts perceive Microsoft SQL Server  to be a specialist tool, that only developers and users comfortable writing  coding can use. This may  have had some  truth with the very early  releases (7 & 2000) , but now its a very user friendly graphical user interface (GUI).  Much of the functionality in SSIS  is no more difficult than similar tasks in Excel. The advantage of a specialist  ETL tool such as SSIS over a tool such as excel is that the process becomes much more flexible, scalable, auditable & repeatable.

The video associated with this post demonstrates how to build a basic SSIS ETL process using the “drag & drop” graphical user interface. The video shows how to

  • Open SQL Server Integration Services (SSIS).
  • How to navigate the Integration Development Environment (IDE) where you can create a simple ETL (Extract Transform Load) process.
  • Build “Connection Managers” that connect Integration services to data sources. This is the Extract part.
  • Filter data with a Conditional Split transformation. This is the Transform Part
  • Send data to a destination location. This is the Load  part.

Subsequent posts explore additional “data transformation” tasks such as filtering, aggregating, and deriving data.

admin