Loading spatial data with SSIS

Spatial data support was added to SQL Server starting with SQL Server 2008. [reference]

Geography and geography are .Net CLR data types, and so there’s not much in the way of out of the box support. That’s not to say you’re blocked in creating SSIS ETL processes where spatial data is present. There are plenty of ways around this issue. The following is one.

The Data Source

For this demo I am using a dataset of Michigan schools downloaded from the State of Michigan’s Open Data portal. The source is a CSV file containing a list of Michigan intermediate school districts, individual districts and schools. Along with street addresses, latitude and longitude is recorded.

The source data is flattened. So first I will create some tables that will store the data in a more normalized fashion.

School dimensions entity relationship diagram.

The Data Destination

The last table, School Dimension, will have a School Location column with a data type of geography.

After loading the first two tables, it’s time to load the School Dimension table.

Load School Dimension Data Flow

In your SSIS package, create an ADO.Net Connection Manager to the SQL Server database.

Specify Connection Manager

As mentioned, the source data as latitude and longitude column. I can’t transform these two columns into one geography data type and pass it along the pipeline. So to account for that, create a Script Component and select Destination as the component type. Specify the Connection Manager and columns, then edit the script.

Select Script Component Type

Add the necessary code the insert rows to a SQL Server database.

Finally here’s where the latitude and longitude gets transformed. The dirty little secret, if you will, is I am not using SSIS to transform the data, but passing it to the SQL engine. The transformation happens in the SQL statement.

INSERT INTO dbo.SchoolDimension(SchoolDistrictDimensionId, School, SchoolLocation)
VALUES(@SchoolDistrictDimensionId, @School, geography::Point(@Latitude, @Longitude, 4326));

Summary

In this blog post I demonstrated how to transform data stored in a CSV fileĀ and load it to a SQL Server table using SSIS. This is admittedly a narrow use case. And in reality it was the SQL engine that did the transformation. But hey, the SS in SSIS stands for SQL Server.

This entry was posted in Data Integration, SQL, Uncategorized and tagged , , . Bookmark the permalink.

Comments are closed.