Loading spatial data with SSIS

Spatial data support was added to SQL Server starting with SQL Server 2008. [reference]

Geography and geography are .Net CLR data types, and so there’s not much in the way of out of the box support. That’s not to say you’re blocked in creating SSIS ETL processes where spatial data is present. There are plenty of ways around this issue. The following is one.

The Data Source

For this demo I am using a dataset of Michigan schools downloaded from the State of Michigan’s Open Data portal. The source is a CSV file containing a list of Michigan intermediate school districts, individual districts and schools. Along with street addresses, latitude and longitude is recorded.

The source data is flattened. So first I will create some tables that will store the data in a more normalized fashion.

School dimensions entity relationship diagram.

The Data Destination

The last table, School Dimension, will have a School Location column with a data type of geography.

After loading the first two tables, it’s time to load the School Dimension table.

Load School Dimension Data Flow

In your SSIS package, create an ADO.Net Connection Manager to the SQL Server database.

Specify Connection Manager

As mentioned, the source data as latitude and longitude column. I can’t transform these two columns into one geography data type and pass it along the pipeline. So to account for that, create a Script Component and select Destination as the component type. Specify the Connection Manager and columns, then edit the script.

Select Script Component Type

Add the necessary code the insert rows to a SQL Server database.

Finally here’s where the latitude and longitude gets transformed. The dirty little secret, if you will, is I am not using SSIS to transform the data, but passing it to the SQL engine. The transformation happens in the SQL statement.

INSERT INTO dbo.SchoolDimension(SchoolDistrictDimensionId, School, SchoolLocation)
VALUES(@SchoolDistrictDimensionId, @School, geography::Point(@Latitude, @Longitude, 4326));

Summary

In this blog post I demonstrated how to transform data stored in a CSV file and load it to a SQL Server table using SSIS. This is admittedly a narrow use case. And in reality it was the SQL engine that did the transformation. But hey, the SS in SSIS stands for SQL Server.

Posted in Data Integration, SQL, Uncategorized | Tagged , , | Comments Off on Loading spatial data with SSIS

Fits and Starts

I have a few shade tree, or hobby projects I have been thinking about. But that’s mostly been what I have been doing, thinking about it.

So to put some structure and possible progress around these ideas, I have been creating some GitHub reposotories.

These projects are one part learning exercise, one part sharing, and maybe even some community building.

Here is a run down on a few project and ideas:

Vessel Traffic

Every summer our family spends a long weekend in Mackinaw City, MI with some of my cousins, and other hangers-on. One of those hangers-on is a former crew member of a Great Lakes freighter. A favorite pastime is watching and indentifying these freighters (trainspotting style)  pass through the Straits of Mackinac

My Vessel Traffic repository is dedicated to applying my interests in GIS and Busines Intelligence to this pastime.

Data Dot Gov

Similar to Vessel Traffic, the DataDotGov project is a catchall of Business Intelligence data derived from the Data.Gov web site.

See Kendra Little’s blog post Free, Open License Dataset at Data.gov for more information.

CensusDw

CensusDw is like the DataDotGov but dedicated to Business Intelligence insight gained with US Censis data.

Happy coding.

Posted in Business Intelligence, Data Integration, Database BI, Open Source | Comments Off on Fits and Starts

SQL Server naming conventions

Starting a new project in 2015. I am putting some personal/hobby projects on GitHub. My aim is to share projects, documents, and other resources. I am hoping it be useful, and possibly find some co-conspirators.

The first item is a collection software development style guides. Here I’ll gather some resources and (wait for it) best practices. I thought of just collecting them in my own private storage, but I am in a sharing mood at this time.

First document out of box is a document outlining some SQL Server naming conventions.

Happy coding.

Posted in Open Source, SQL, Uncategorized | Tagged , , | Comments Off on SQL Server naming conventions