Quantcast
Viewing all articles
Browse latest Browse all 2697

Zero to Snowflake: Importing Data with Code via SnowSQL

Image may be NSFW.
Clik here to view.

Our previous post in this series explained how to import data via the Snowflake user interface. You can also stage your data and load it into tables with code via SnowSQL, which is what this blog post will demonstrate.

Setting the Stage for SnowSQL

There are many ways to import data into Snowflake, and utilising code allows users to automate the process. To import data into a Snowflake stage using SnowSQL, the following requirements must be met:

  1. SnowSQL must be installed on the user’s machine
  2. The data must be in a suitable file format
  3. A warehouse in Snowflake must exist to perform the import
  4. A destination database must exist in Snowflake
  5. A destination stage must exist within the Snowflake database

To load data from a Snowflake stage into a table, the following additional requirements must be met:

  1. A destination table must exist within the Snowflake database
  2. A suitable file format must exist within the Snowflake database

Installing SnowSQL

SnowSQL is the command line interface tool that allows users to execute Snowflake commands from within a standard terminal. This means users do not need to be within the Snowflake UI to prepare and execute queries. It is also noteworthy that certain commands cannot be executed from within the Snowflake Worksheets environment but can be executed via SnowSQL, with the most noteworthy being the command to import data into a Snowflake stage.

Before we can start using SnowSQL, we must install it. From within the Snowflake UI, the Help button allows users to download supporting tools and drivers:

Image may be NSFW.
Clik here to view.

As our goal is to install the SnowSQL Command Line Interface (CLI) Client, we select this from the list and choose the appropriate operating system:

Image may be NSFW.
Clik here to view.

Once the installer downloads, execute it and run through the installer to install SnowSQL. You may need to enter admin credentials to install the software.

Verifying the Installation of SnowSQL

Once SnowSQL is installed, it can be called from within any standard terminal. For this example, we will be using Windows Powershell; however, we could equivalently use any other terminal depending on the operating system:

Image may be NSFW.
Clik here to view.

Begin by executing the command SnowSQL to verify the installation. This will list the various options which can be used within SnowSQL:

Image may be NSFW.
Clik here to view.

Preparing Data for Import

Before we can import any data into Snowflake, it must first be stored in a supported format. At time of writing, the full list of supported is contained in the table below. An up-to-date list of supported file formats can be found in Snowflake’s documentation:

Image may be NSFW.
Clik here to view.

*Note: The XML preview feature link can be accessed here

For this example, we will load the following structured data which is delimited with vertical bars:

Image may be NSFW.
Clik here to view.

Using SnowSQL to Execute Commands

The first step toward utilising SnowSQL is to establish a connection with your Snowflake environment. Three pieces of information are required to establish a connection:

  1. The Snowflake account to connect to, including the region
  2. An existing user within that account
  3. The corresponding password for that user

As Snowflake is a cloud service, each account is accessed via a URL. For example, InterWorks’ account in the EU is accessed via https://interworks.eu-central-1.snowflakecomputing.com.

This URL is formed by https://<Account>.<Account_Region>.snowflakecomputing.com.

Therefore, we can see that our account is interworks.eu-central-1. Note that we are still using Windows Powershell; however, this can be achieved in any equivalent terminal. We enter account using the –accountname option in SnowSQL, along with my user chris.hastie with the –username option:

Image may be NSFW.
Clik here to view.

We can also use the shorter names for these options, respectively -a and -u:

Image may be NSFW.
Clik here to view.

In both situations, a password prompt will appear. Enter the password for the desired user. Do not be alarmed if your terminal does not display the password—it will still be reading the input.

With the connection established, SnowSQL will output the current user along with the current warehouse, database and schema. We have not yet selected any, which is why they are all left as none selected:

Image may be NSFW.
Clik here to view.

Preparing a Warehouse

We encourage Snowflake users to test their data loads with various warehouse sizes to find their optimal balance between credit cost to performance. As we are loading a very small file in a bespoke instance, we will use an extra-small warehouse called LOAD_WH.

If the warehouse does not exist, it can be created with the CREATE WAREHOUSE command. Note that we have determined the size as extra small, along with setting the AUTO_SUSPEND to 300 and the AUTO_RESUME to TRUE. This means that our warehouse will automatically suspend itself if it has not been used for five minutes, eliminating costs when the warehouse is unused:

Image may be NSFW.
Clik here to view.

Upon successful execution, SnowSQL will output that the warehouse has been created:

Image may be NSFW.
Clik here to view.

Once the warehouse is created, it can be used with the USE WAREHOUSE command:

Image may be NSFW.
Clik here to view.

We can now see our current warehouse in our connection string directly after the username:

Image may be NSFW.
Clik here to view.

Preparing a Database

In a similar fashion to warehouses, users can create databases via the CREATE DATABASE command. For this example, we will create a databased called DEMO_DB:

Image may be NSFW.
Clik here to view.

With this database created, our connection will automatically update to use this database. DEMO_DB is now provided in the connection string, along with the default PUBLIC schema:

Image may be NSFW.
Clik here to view.

For the sake of uniformity, we will also use the CREATE SCHEMA command to create a new schema called DEMO_SCHEMA:

Image may be NSFW.
Clik here to view.

Our connection string has now updated to display that we are connected to the DEMO_SCHEMA schema within our DEMO_DB database:

Image may be NSFW.
Clik here to view.

Preparing a Stage

Stages are Snowflake’s version of storage buckets. Each stage is a container for storing files and can be either an internal stage within Snowflake or an external stage within Amazon S3 or Azure Blob storage.

For this example, we will create a simple internal stage within Snowflake using the CREATE STAGE command. Our stage will be called DEMO_STAGE:

Image may be NSFW.
Clik here to view.

If our data was stored in Amazon S3 or Azure Blob storage, we would require authentication credentials to the storage location along with the URL. These can be entered as part of the CREATE STAGE command, as in the following example:

Image may be NSFW.
Clik here to view.

Finally, we can import our data into our stage using the PUT command. More information for this command and other SnowSQL commands can be found in Snowflake’s documentation. A key piece of information is that the PUT command can only be executed from outside the main Snowflake interface, which is why we are using SnowSQL instead.

It is important to get the file location string correct as this may vary depending on the terminal you are using. Note that the stage is determined using an @ sign:

Image may be NSFW.
Clik here to view.

After performing the import, SnowSQL will output some key information. This includes the name of the file, its size and the status of the import:

Image may be NSFW.
Clik here to view.

At this stage, our file has now been successfully imported into a secure and encrypted stage in Snowflake.

Preparing the Destination Table

Now that our data is secure in a Snowflake stage, we can think about how we want to view it. In order to load data from a stage into a table, we first must create the table and a file format to match our data.

The following steps could still be performed from within the SnowSQL command line interface; however, we will instead perform the remaining steps in Snowflake itself via the Worksheets functionality. We can do this because the remaining commands that we wish to execute can all be performed from within Snowflake, in contrast to the previous PUT command which could not.

To create our table, we first ensure that we are using the correct warehouse, database and schema. We can then use the CREATE TABLE command to create our table as we would in any SQL-compliant data warehouse:

Image may be NSFW.
Clik here to view.

Preparing the File Format

With our destination table created, the final piece of the puzzle before we load our data is to create a file format. As mentioned earlier, our data is stored as a vertical bar delimited CSV file:

Image may be NSFW.
Clik here to view.

Using the CREATE FILE FORMAT command, we can create a file format called CSV_VERTICAL_BAR. We determine that the file type is CSV, which instructs Snowflake to look for a structured text file with delimited fields. Our field delimiter is a vertical bar and skips the first row of data. We do this as the first line of data in our file contains the column names, which we do not want to import into our table in Snowflake:

Image may be NSFW.
Clik here to view.

Loading the Data from the Stage into the Table

Now that we have all of our pieces lined up, we can use the COPY INTO command to load our data from the stage into the destination table. As with all commands, full information can be found within Snowflake’s documentation:

Image may be NSFW.
Clik here to view.

The output informs us that the file loaded successfully, parsing and loading all 12 rows. We can verify this with a standard SELECT command from our table:

Image may be NSFW.
Clik here to view.

The post Zero to Snowflake: Importing Data with Code via SnowSQL appeared first on InterWorks.


Viewing all articles
Browse latest Browse all 2697

Trending Articles