How Old Is That in Techno-Years?

by Ellen 4. November 2009 16:56

"You have to be very, very careful with this, because it's thousands and thousands of years old…from when Mommy was little."

--Hana, age 5

 

After my daughter uttered the fateful words above, my husband took entirely too much pleasure in calling me at work and telling me about it. After all, he's not that much younger than I am. Considering that Hana is now a junior at the University of Oregon, I suspect that another three or four millennia have been added to my age from her perspective.

Nothing makes you feel older than comments from your children.

Or so I thought.

This week, 71.43% of PCon is attending the annual PASS (Professional Association for SQL Server) conference in Seattle. Yesterday was the first day of the conference, and over dinner, Toni, Kathy, Eric, Serena and I discussed some of the sessions we'd attended. No life-changing sessions yet (something I always hope for at PASS), but we'd each seen sessions on features upcoming in the next SQL Server release. Yay! New stuff! We love it!

However, Eric also related a conversation he had with a young man in the elevator at his hotel. They discovered they were both in town to attend a tech conference. The other fellow asked if Eric was attending the iPhone conference on the second floor. Eric replied that he was here for PASS. The response:

"Oh, you're an old school geek."

It was very lowering.

According to this whippersnapper (I failed to ask Eric if the youngster was still wearing his Cub Scout uniform, although it wouldn't surprise me), PASS is the technological equivalent of AARP.

Well.

I refuse to be dismayed by this. I'm looking forward on my session today on SSIS design patterns, a Microsoft chalk talk on metadata-driven ETL and the session tomorrow on using agile development techniques in SSIS.

However, when I get back to Portland next week, I'm going to have to discuss my job title with Bill. In another month or so, "Senior Consultant" might be construed as a politically incorrect term in my case. I'll draw the line at "Consultant Emeritus" though. Even at my advanced technological age, I still look forward to learning cool new stuff – even if I have to look through my bifocals to do it.

Tags:

So Not the Drama (or not)

by Ellen 29. October 2009 18:00

 

In the fall of 1983, I was entering my last year of a three-year M.F.A. program in Theater Administration at the Yale School of Drama. Yeah, I know what you're thinking…what does that have to do with data warehouse consulting? Well, consulting as we know it at Pcon involves solving problems. Here are some of the problems presented to students in the Admin program at the time:

  • How do you convince an actor that the fluorescent lights in the rehearsal hall are not sapping his vitamin D?
  • When an actress complains that the water coming out of the faucet in her theater-provided apartment is too hot, how do you suggest she should add some more cold water to the mix without tacking "you idiot" to the end of your sentence?
  • How do you convince the recipient of a MacArthur "genius" grant that taking a cab from Boston to New Haven is perhaps not the best use of funds?

I was the only student with a finance emphasis at the time, so (thankfully) my problems did not involve actors. I was working for the Business Manager of YSD/Yale Repertory Theater. The problems we faced were slightly different:adding_machine_tape

  • How do we reconcile our (manually maintained) list of department purchase orders with the mainframe printouts from the University when the recent University accounting system conversion truncated the final character of the PO number?
  • How do we avoid suffocation under the mountain of adding machine tape necessary to foot and cross-foot our 11 x 24  green ledger pages?
  • What's the best way to hide the enormous calluses on our fingers (caused by prolonged contact with mechanical pencils)?

Enter the miracle.

My boss brought in a Compaq "portable" computer (which would probably be rejected by most modern airlines as exceeding their carry-on size limits). If I recall correctly, it had a single 5-1/4 inch floppy drive. The character-based screen was about the size of the display on one of my dad's oscilloscopes.

I adored it. Compared to mechanical pencils and green ledger sheets? Oh, yeah. It was heaven.

Using version 1.0 of Lotus 1-2-3 and RTFM, I used key-stroke macros to create a set of consolidated financial reports for the school and theater - the first that did not involve an IBM Selectric typewriter and a bucket of Liquid Paper.

This was the first time I used technology to improve a business process. It was a life-changing event. In a way, this is always in the back of my mind whenever I approach a new project. The client's data access/reporting difficulties play the part of the green ledger sheets and mechanical pencils. Our design and implementation of the data mart and business intelligence reporting solutions play the part of the Compaq and key-stroke macros.

Play the part….right.  Maybe data warehouse consulting isn't that far removed from drama school after all.

Tags:

Data Warehousing | Business Intelligence

Using the Trash Destination Adapter for SSIS Housekeeping

by Ellen 23. October 2009 19:23

In an earlier post about the custom Row Number transformation, I mentioned the resources available from Konesans (www.konesans.com) via SQLIS (www.sqlis.com). The Trash Destination adapter is another one of my can't-live-without custom components, although I may use it a little bit differently than is described on its SQLIS page (http://www.sqlis.com/post/Trash-Destination-Adapter.aspx).

The description of the adapter on that page presents it as something you would use only in development and testing, as an option for breaking out the data flow to evaluate performance bottlenecks, for instance, or for creating a path to associate with a Data Viewer. The description even states:

"It is also obvious that this is for development or diagnostic purposes, and is clearly not a part of the functional design of the package."

Actually, I use this component in the majority of my SSIS packages.  In the ETL logic flow, it’s frequently necessary to evaluate data  and determine whether a particular row should continue in pipeline or be discarded.  The Trash Destination is a low-maintenance way to ensure that every row of data in the pipeline has a definite endpoint – even if that endpoint is the garbage.  It’s easily configurable.  Well, no configuration really required.  Can’t get much easier than that.

As an example, here’s a picture of a section of a data flow task that processes a type 1 slowly changing dimension: 

type_1_split_trash_destination

 

The Conditional Split evaluates whether the type 1 attributes have changed based on an upstream Checksum transformation.  If the data has sustained a change, the row is diverted to the type 1 change staging table.  If the row is unchanged, it’s discarded (diverted to the Trash Destination).  This appeals to my sense of closure – all those rows of data in the pipeline have a place to go.  But by inserting the Row Count Plus transformation (another freebie from Konesans/SQLIS), I can count the unchanged rows and store that information in the audit data for the ETL run, so I can squeeze a bit more use out of the Trash Destination branch.  It’s all good.

The Cheshire Data Mart

by Ellen 14. October 2009 23:14

'Well! I've often seen a cat without a grin,' thought Alice; 'but a grin without a cat! It's the most curious thing I ever saw in all my life!'

--Lewis Carroll, Alice's Adventures in Wonderland

 

Several years ago, I sat in a local Cognos user's group meeting, listening to one of our clients give a presentation about the way she used Cognos Impromptu for her extensive reporting requirements. She gave a good presentation. She discussed limitations of her company's AS400 source system reporting capabilities and how Impromptu expanded her options.

She never once mentioned the data mart I had built - from two separate AS400 ERP systems, Excel and other supplementary data - on which all of the Impromptu reporting was based.

She didn't know it was there.

If she did, she didn't consciously consider the data mart as an entity separate from the presentation tool . Impromptu was her only experience with the data - and the Impromptu catalog insulates the end user from the complexities of its source data. That's its purpose - and the business analyst in charge of maintaining the catalog did an excellent job getting Impromptu to live up to that purpose.

cheshire_cateResult? The Cheshire Data Mart - a data mart that was so successful, it disappeared from view entirely.

We had a good chuckle over this at our own expense. However, over time, we've found the same situation arising with other data mart projects. Since our clients tend to be mid-market companies, many of whom have small IT departments, the consumers of the data mart business intelligence data can be several layers removed from the actual implementation of the data mart.

If the only experience a user has with the data mart is as a consumer of output through some application or third-party reporting tool, that user is less likely to think of the data as something discrete. If they can see data in the UI, then the data must actually be in that UI right there. A part of it. Not something that has to be separately considered and maintained.

Unless you're a confirmed data junkie like me, the concept of data in the abstract is just one more yawn-inducing geekoid topic that causes your eyes to glaze over and the fight-or-flight reflex to kick in. (I know from the number of times I've tried - unsuccessfully - to explain my job to my children and friends.) The average business intelligence consumer may not prepared to think about the data decoupled from its presentation, and perhaps that's okay. But someone within the organization needs to remain cognizant of Cheshire Data Mart lurking out there so that future decisions about how the data mart should grow and evolve - and how business users can take advantage of the richness of data available in the mart for their use - can be made appropriately.

Tags:

Business Intelligence | Data Warehousing

Oracle OLE DB Provider and 64-bit SQL Server 2005

by Ellen 6. October 2009 22:48

Back in fall of 2006, Perkins Consulting embarked on a data warehousing project with a client whose source data resided in Oracle and whose SQL Server 2005 environment was established on a 64-bit server. This was our first experience with this particular combination of variables, and we ran into some entertaining issues.

I was attempting to set up a Connection Manager for the Oracle source database, using the Native OLE DB\Oracle Provider for OLE DB. When testing the connection, I received the following error:

oracle_provider_error

Entertaining?  Oh, yeah.  A laugh riot.

After extensive Google searches, I finally found a post regarding this BIDS error in reference to Analysis Services.  Apparently the issue was on the Oracle side.  The Oracle provider had an issue with special characters in path names.  The default installation folder for SQL Server 2005 32-bit executables (including BIDS) on a 64-bit server includes parentheses (i.e., c:\Program Files (x86)\…).  Oracle was not pleased with this situation in the slightest.

The post I found recommended installing the 32-bit executables in a folder without special characters, but that wasn’t an option at our client site.  The post provided a work-around using a batch file to launch BIDS and to explicitly set the executable path without using the special characters, placing the Oracle client first:

Set path= c:\oracle\product\10.2.0\client_1\bin;%path% "C:\Progra~2\Microsoft Visual Studio 8\Common7\IDE\devenv.exe"

The error only arises when calling the Oracle provider using BIDS (a 32-bit application).  When executing the completed package, SQL Agent invokes the 64-bit dtexec executable, which is located in a path that TNS does not find objectionable. 

It’s entirely possible that this issue has been resolved in later versions of the Oracle client or the native Oracle OLE DB provider, but because of the configuration at this particular client site, I still launch BIDS there using this batch file.

Using the Row Number Transform to Assign Surrogate Keys in the Pipeline

by Ellen 30. September 2009 01:28

I suspect that most of us who have been using SQL Server 2005/2008 SSIS for any length of time know what a great resource SQLIS is (if you don't know, check it out - www.sqlis.com). This site provides downloads of several custom SSIS components that are now integral parts of any Pcon SSIS project.

One of these can't-live-without components is the Row Number transform. This component, like my other must-haves, was developed by Konesans (www.konesans.com).

When I first started using SSIS, I used a Script transform, as described in Donald Farmer's The Rational Guide to Extending SSIS 2005 with Script, to assign surrogate keys to data mart dimensions and facts in the data pipeline. Although I was grateful to have this model at the time, I don't really like to use script components if I can avoid them. I've run into too many compile issues, especially with 64-bit environments. The Row Number transform delivers a much simpler way of assigning surrogate keys (simpler because I don't have to write any code…always a plus for me!).

The Row Number transform…numbers rows. Yeah, sounds obvious, right? The transform UI allows you to assign a seed value, an increment value, a target pipeline column to store the sequential numbers and presents an option to store the final number in a target variable:

row_number_ui 

 

The Number Column can be a pipeline column that already exists (I always create the surrogate_key column in a Derived Column transform), or you can choose to let the Row Number transform create a new column for you. 

In my standard table processing methodology, I retrieve the next surrogate key value for the table prior to launching the data flow task and store it in a variable.  I then use SSIS Expressions to assign this seed value to the Seed property of the Row Number transform.

As with all data flow transforms, you set properties for the Row Number transform at the data flow task grain.  Just click anywhere on the Data Flow design surface to be able to access the properties for the transform:

row_number_properties

Since I haven’t found a way yet to insert an expression without popping open the Property Expression Editor window (by clicking the ellipsis next to Expressions – but you knew that already…), I open the editor and locate the Seed property for the Row Number transform (I’ve named it rnAssignSurrogateKeys in the screen shots):

row_number_expression_zero

There’s an issue with the way the custom transform interacts with the Property Expression Editor that sometimes results in an error when you try to directly assign the Seed property to a variable – regardless of the data type of your variable, the editor chokes and thinks you’re trying to assign a decimal value to a property that only accepts integers.  On the SQLIS page for the Row Number transform, the Troubleshooting section describes this as a known issue, and shows how to manually edit the .dtsx code to fix the issue.  Since I don’t want to mess with the code directly (no coding for me – see above), I prefer to fool the editor.  By assigning a zero to the Seed property as shown, you can successfully exit the editor.  Then, in the Properties pane, you can change the expression value to your variable as shown in the Properties screen shot above.

This only appears to be an issue the first time you try to assign a value to the property.  Once you’ve changed the value to a variable in the Properties pane, you can open and close the editor with no further errors – it will present the variable as the expression value:

row_number_expression_w_variable

Downstream of the Row Number transform, you can then assign the output column to the surrogate key column for your table.  Piece of cake.

Moving from XP Virtual PC 2007 to Windows 7 Virtual PC, Part 2

by Ellen 24. September 2009 03:02

Part 2 – Conversion

 

In part one of this post, I discussed preparing your Virtual PC 2007 VHD files for transfer to a Windows 7 box, and setting up the Windows 7 machine to use Windows Virtual PC, Microsoft’s Windows 7 virtualization engine.  In this post, I’ll step through the process I went through to convert a legacy VHD file.

 

This post refers to a Virtual PC 2007 VHD with Windows XP SP3 as its guest OS.

  • Launch Virtual PC. This new version presents as a Windows Explorer window rather than the VPC console as in VPC 2004 and 2007. You'll see an option to Create Virtual Machine:

vpc_create 

  • Choose the Create virtual machine option. The steps are very similar to the old VPC wizard - first you'll create the configuration file (which now has a .vmcx extension rather than .vmc), specify memory and networking options, then identify the VHD.

vpc_vmcx

vpc_memory

vpc_vhd 

Since you're converting an existing VHD, choose the Use an existing virtual hard disk option and navigate to its location.

  • Once you've completed this step, the .vmcx file will appear in the Virtual PC explorer window:

vpc_vmcx_list

  • To open the VPC, double-click on the .vmcx file. The first time you open the VPC, it will appear to be detecting new hardware. It may or may not require your interaction to complete the hardware detection, but if it does, it probably will not be able to find the driver (since it can't identify the hardware device in the first place). This does not appear to adversely affect anything.

 

  • When you're opening the VPC, it will tell you that it's restricting the cursor to the inside of the VPC window and give you the key sequence to release it to the host machine (<Ctrl><Alt><Left Arrow>). Unfortunately, this combination of keys may not work for your hardware.  On my laptop, for instance, this key combination rotates the host screen. So for the moment at least, you're stuck with the restricted cursor.

 

  • After the VPC has done its hardware detection dance, you'll want to restart it. If your VHD login was not previously password protected, you'll want to create a password for the login account now. After doing so, restart the VPC.

 

  • While the VPC is booting up (i.e., before the cursor restriction kicks in), from the Tools menu, select Install Integration Components. This is the Windows 7 version of Virtual Machine Additions.

vpc_bootup

You'll get the following message:

vpc_install_int_comp

Click Continue.

  • Once the VPC is open, it will present the Integration Components setup.exe in the CD drive:

vpc_cd_drive

Run setup.exe. It will appear to detect more new hardware, but this time you shouldn't be required to intervene.

  • Once the Integration Components have installed, the last screen will prompt you to download the updates required for running XP in this new mode. I usually let it navigate to the download URL as it wants to do, and run the hotfix update at this point. After the hotfix is installed, let the VPC restart.  This is the update URL, in case you miss this step:

http://www.microsoft.com/downloads/details.aspx?FamilyID=e5433d88-685f-4036-b435-570ff53598cd&displaylang=en

  • Even though you've installed the Integration Components, you still have to enable them. This appears to be user specific, although I haven't run enough tests to be absolutely sure of it. Unless you enable the features, you won't be able to copy and paste between the VPC and the host machine, nor share drives.

vpc_enable_int_feat

  • You may be prompted for a login when you choose to Enable Integration Features. I haven't been able to determine consistently when this shows up - definitely when the machine is attached to a domain, possibly when there's more than one non-system user account.

 

  • I’ve run into some oddities with the way a couple of VPN client tools interact with the Integration Features auto-enable (specifically the Cisco VPN client and the Cisco Any Connect web client).  When Integration Features are set to auto-enable at startup, the VPN client chokes with a user authentication failure before the user login screen is even presented.  To work around this issue, I set the Integration Features to enable manually, by clearing the Enable at startup check box as shown below.  When I do this, both of the Cisco VPN clients can connect successfully.  For the Windows client, I was able to enable the Integration Features after startup and still connect successfully.  For the Any Connect web client, enabling the Integration Features blocked the connection (the application delivered a “not supported” error message.)  I mention this as a heads-up:  you may need to play with this particular setting a bit if you experience odd connectivity behavior from within your VPC.

 vpc_manual_enable

Tags:

Moving from XP Virtual PC 2007 to Windows 7 Virtual PC, Part 1

by Ellen 22. September 2009 02:11

Part 1 - Preparation

 

Our extended firm is in the process of moving from the Vista operating system to Windows 7. As the first one in the consulting group to take the plunge, I had the privilege of testing the conversion process for some of our core applications.

At Perkins Consulting, we rely heavily on virtualization to manage the connection environments for our various clients. Consequently, one of the primary areas of concern for us in the conversion process was to make sure our virtual machines would still function under the new operating system.

Will they? Well, yes. More or less.

For those of you familiar with the Microsoft Virtual PC application, you'll remember that there wasn't a whole lot of difference between Virtual PC 2004 and Virtual PC 2007, at least cosmetically. The same is not true for the differences between Virtual PC 2007 and Windows Virtual PC (the Windows 7 virtualization engine). Virtual PC 2007 will not run on a Windows 7 machine. If you want to run a virtual PC on a Windows 7 box, you have to use Windows Virtual PC for Windows 7. And if you want to use your 2007 VHD files, you have to prepare and convert them for use in Windows Virtual PC.

As the official Conversion Sacrifice for PCon, I got to figure out how to make it work for us. In this two-part post, I'll describe the lessons I learned from the process. In the first part (the one you're reading now), I'll discuss preparing your legacy VHD files for conversion and setting up the Virtual PC application on the Windows 7 box. In the second part, I'll step through the conversion process for a 2007 VHD running Windows XP SP 3 as the guest OS.

 

Prepare Your 2007 VHD Files for Transfer

On your machine running VPC 2007, for each Windows XP SP3 VHD you wish to transfer:

  • Remove the Virtual Machine Additions from the VPC. If the additions are up to date, when you select the Install or Update Virtual Machine Additions from the Action menu, the only option presented to you will be Remove. If you get an Install or Update option, you'll need to apply the updates and restart the VPC before you can get the Remove option.

If you fail to complete this step, you will most likely get a drive read error when you try to open the VPC with the Windows 7 version of Virtual PC.

  • Windows 7 Virtual PC will only allow the following guest operating systems:
    • Windows 7
    • Vista
    • Windows XP SP3

If the VHD you wish to convert does not have a supported OS, you may not be able to successfully convert it for use with Windows 7.  I’ve seen some blog posts that others have been able to convert a VHD with an OS outside this list – Server 2003, for instance – but these are the only officially supported guest operating systems.

 

Prepare Your Windows 7 Box for Windows Virtual PC

  • Windows Virtual PC will only run on a machine with hardware virtualization capabilities.  Before installing Windows Virtual PC on your Windows 7 box, enable hardware virtualization in the BIOS.  This is a critical step, according to the download notes.  If you fail to make the changes to the BIOS, untold death and destruction may ensue.  (Or at the very least, the application won’t run.)

 

 

  • Install the Windows Virtual PC on your Windows 7 machine.

 

  • In addition to the Windows 7 Virtual PC bits, the above site also has a download for Windows XP Mode.  You may download and install this if you wish, but its use is outside the scope of this post.

Tags:

64-bit SQL Server 2005 SSIS and UNC paths, Part 2

by Ellen 11. September 2009 22:58

"You can't get there from here."

Part 2

In Part 1 of my post on this topic, I described how to configure a package containing UNC paths on flat file destinations to execute successfully in BIDS debug mode.

At the risk of being repetitive, here's the system background:

  • The SQL Server data mart resides on a remote server running 64-bit SQL Server 2005 Enterprise Edition.
  • Access to the file system on the server is not allowed.
  • SSIS development occurs on a virtual server with only the SQL Server client tools installed.
  • The destination for the output files is on yet a third machine, only accessible to the server and the development machine via UNC paths.

Now that I had successfully executed the package in debug mode by forcing BIDS to use the 32-bit runtime to recognize the UNC paths, I was ready to try to schedule the package via a SQL Agent job.

I'm used to using the CmdExec step type in SQL Agent to force execution of a package in 32-bit mode (I have to do this any time the package includes Excel file sources, for instance). However, when I used the usual method for this particular package, I got yet another error message:

Error: 2009-08-31 16:01:55.99 Code: 0xC001401E Source: stage_reports Connection manager "Extract File Cards" Description: The file name "\\target1\Dest-Data\Cards.csv" specified in the connection was not valid.

Well, rats.

Since I wasn't able to access the file system or view the context of the server in the network, I had to find a way to work around this - i.e., schedule the package in 32-bit mode without using SQL Agent.

To do this, I wrote a batch file to execute the package using 32-bit dtexec from the context of the development machine (I used dtexecui to get the command line for the package):

"C:\Program Files\Microsoft SQL Server\90\DTS\Binn\dtexec.exe" /FILE "\\vm-dev1\ReportStaging\SSIS\report_staging\stage_reports.dtsx"

/MAXCONCURRENT " -1 " /CHECKPOINTING OFF /REPORTING EW

By scheduling the batch file as a Windows task and bypassing SQL Agent entirely, I was finally able to get the UNC paths to work on the calling machine, the server and the destination machine.

Whew!

Tags: , , , ,

Data Warehousing

64-bit SQL Server 2005 SSIS and UNC paths, Part 1

by Ellen 11. September 2009 22:57

"You can't get there from here."

Part 1

As part of a recent client project, I had to use SSIS to create six pipe-delimited flat files for consumption by a third-party report writer. I created a template container with all the tasks necessary for the conversion. The template included variables to hold the path and file name for the destination files, and I used an expression in the connection string for each destination Connection Manager to reference the variables.

System background:

  • The SQL Server data mart resides on a remote server.
  • Access to the file system on the server is not allowed.
  • I do SSIS development on a virtual server with only the client tools installed.

During testing, I used a local drive path on the development machine (E:\data) and the package executed in debug mode with no problem. The final destination for the files was on a third machine, so I had to use UNC paths for access. After I changed the variables to UNC paths, I got the following error when executing in debug mode:

Error: 2009-08-31 15:20:49.96 Code: 0xC020200E Source: <task name> <destination adapter> [346] Description: Cannot open the datafile

I could open the Connection Manager for the files defined by UNC pathnames inside of the BIDS designer, but the execution of the package failed because it couldn't open the file.

Hmmmm. Very suspicious.

I checked the SQL Server version on the remote server. Sure enough, it was the 64-bit version. I set the Run64BitRuntime property for the project to False and the package executed successfully from the BIDS designer.

ssis_32_bit_runtime

Looks like UNC paths on flat files are another point of conflict between 64-bit and 32-bit SQL Server. Since the BIDS designer is a 32-bit application, I could open the files from inside the designer. Executing the package, however, invokes the 64-bit executable, unless BIDS is forced to use the 32-bit executable with the above setting.

Of course, then I had to try to schedule the thing….(to be continued)

Tags: , , , ,

Data Warehousing

Powered by BlogEngine.NET 1.5.0.7
Theme by Perkins Consulting Content Copyright 2009 Perkins Consulting, LLC All rights reserved.