
In my last blog, I discussed how to do an Upsert in Snowflake using Matillion. Using the same workflow, I will discuss how we got the key value using the MD5 hash functionality available in Snowflake and why we need to do it. The why is easy, so let’s start there. The MD5 gives us a key value that we can use to reference back from the Fact tables in our data warehouse to the Dimension tables.
If you’re wondering why we need a surrogate key (SK) at all, you’re probably not alone. We use a surrogate key for join performance. Using a surrogate key is faster when joining two tables than using multiple columns in your join statement. This makes writing your queries a little faster while improving the performance of your query, and who doesn’t want that?
Here is a quick overview of what the workflow will look like when we finish:
Our first step is to make sure we understand which column, or combination of columns, gives us a unique value. In this case, it is the Airport Code column.
Now we simply need to bring in a Calculator tool onto the canvas:
Finally, we write the code below to create our key. A quick note on the COALESCE function in the code: We use this so that, in the event of a null value in the unique field, you will still get a unique key value.
Just as a check, you will want to do a row count on your key value to make sure that you truly have a unique key. After performing this check, you can now refer to my last blog to create your Upsert. Once all of that is completed, you simply need to create the table and run the Upsert. Before creating your table, simply pull out a Rename tool in your Transformation canvas where you have created your key and rename your columns. Then, select the Metadata tab and check the Text Mode box.
This will allow you to copy the Metadata of the table.
You can now paste this information into your Create Table tool in the Orchestration. While this may not help much with a small table like this, it can be a real time-saver with larger tables.
The post Creating MD5 Hash Keys in Snowflake via Matillion appeared first on InterWorks.