by Ellen
29. June 2010 01:27
I've been working with SSIS since just before the RTM of SQL Server 2005. Over time, I've been able to steal tips and tricks from a number of sources (sessions at PASS Summits, notably those presented by Rushabh Mehta and Erik Veerman; Ralph Kimball, Joy Mundy and Warren Thornethwaite of the Kimball Group; SQLIS; Brian Knight and others that I apologize ahead of time for not mentioning). From all these sources, distilled by the practical work we've done with our own clients, I've evolved a standard set of patterns and practices that allow quick, efficient development for moderately sized data marts.
I call this "Cookie Cutter SSIS."
The most extreme example of the use of these concepts is what I refer to as "Data Mart in a Day." Back in 2007, we had a prospective client who was interested in the Cognos (now IBM Cognos) business intelligence toolset, but wasn't certain how the tools would work for their organization. As a result, Perkins Consulting engaged to do a limited proof-of-concept project for this client, that started with the creation of a modest data mart (four dimensions, three base facts and an aggregate fact all with a relatively small number of rows) against which to deploy the Cognos reporting tools. Because the focus of the project was on the reports, we needed to get the data mart built and populated (along with ongoing maintenance) as quickly as possible.
Okay, okay. So the complete data mart was not literally finished in a day (we did some prep work prior to the onsite development day, added one of the facts and the aggregate after the first reporting pass and completed the data validation phase afterwards.) Nevertheless, this was PDQ - and the POC data mart remained in production with minimal downtime at this client site for at least six months.
I'd like to share some of these shortcuts that I continue to use to quickly deploy data mart objects. I'll say up front that the scope of the series is strictly mechanics - how to use the "cookie cutter" method of standard objects and templates to speed data mart ETL.
We're assuming that the data discovery and design phases have already occurred. We've got our data model; we know our data source options. Now we're ready to create our databases, build our tables and use SSIS to populate our data mart.
This methodology is predicated on several things:
- A standard set of supporting databases, in addition to the data mart database itself
- Uniform handling of data mart object metadata using SQL Server extended properties
- Use of configuration files to enable data-driven dynamic Connection Managers
- SSIS package templates, pre-configured with standard variables, containers and objects
- Utility tables, functions and stored procedures
- Custom SSIS components from community resources that extend SSIS functionality
- A standard date dimension design and data source (Excel spreadsheet) that can be customized for specific client needs
In the next post, we'll talk about the standard databases that are the first step in the process.