Data Loading with Snowflake's COPY INTO <table> Command
Snowflake's COPY INTO command is a powerful tool for data professionals, streamlining the process of loading data from staged files into existing tables. Here's a quick overview of how it works and where you can stage your files.
Staging Locations:
Named Internal Stage (or table/user stage): files are staged using the PUT command.
Named External Stage: Reference an external location such as Amazon S3, Google Cloud Storage, or Microsoft Azure.
External Location: Directly from Amazon S3, Google Cloud Storage, or Microsoft Azure.
Important Note: You cannot access data held in archival cloud storage classes that require restoration before retrieval. This includes Amazon S3 Glacier Flexible Retrieval, Glacier Deep Archive, and Microsoft Azure Archive Storage.
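As a minimal sketch of the three forms (the table, stage, bucket, and credential values below are placeholders):

-- 1. From a named internal stage (files staged with PUT)
COPY INTO my_table FROM @my_internal_stage;

-- 2. From a named external stage created over S3, GCS, or Azure
COPY INTO my_table FROM @my_external_stage;

-- 3. Directly from an external location
COPY INTO my_table
FROM 's3://my-bucket/data/'
CREDENTIALS = (AWS_KEY_ID = '***' AWS_SECRET_KEY = '***')
FILE_FORMAT = (TYPE = 'CSV');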
Snowflake's COPY INTO command ensures efficient and organized data loading, making it an essential feature for managing your data ecosystem.
Here’s a detailed explanation of each parameter in the COPY INTO command:
ON_ERROR = {CONTINUE | SKIP_FILE | ABORT_STATEMENT}: controls what happens when a load error is hit: keep loading the remaining rows, skip the file that contains the error, or abort the whole statement (the default).
FORCE = TRUE: loads the files even if they were already loaded and have not changed, which can duplicate rows.
VALIDATION_MODE = {RETURN_ALL_ERRORS | RETURN_ERRORS}: validates the staged files and returns the errors they would raise, without actually loading anything.
RETURN_FAILED_ONLY = TRUE: limits the statement output to only the files that failed to load.
PURGE = TRUE: removes the staged files automatically after a successful load.
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE: maps source columns to target columns by name, ignoring case.
INCLUDE_METADATA = (column_name = METADATA$field): writes file metadata (for example, METADATA$FILENAME) into the named target column during the load.
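As a hedged sketch of how several of these options combine in one statement (my_table, my_stage, and my_csv_format are placeholder names; VALIDATION_MODE is used on its own, since it validates instead of loading):

COPY INTO my_table
FROM @my_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
ON_ERROR = SKIP_FILE   -- skip any file that contains errors
FORCE = TRUE           -- reload files even if they were loaded before
PURGE = TRUE;          -- delete the staged files after a successful load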
Log in to the Snowflake account via SnowSQL and select the warehouse, database, and schema.
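For example (the warehouse, database, and schema names are placeholders):

USE WAREHOUSE compute_wh;
USE DATABASE demo_db;
USE SCHEMA public;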
Create the file format and stage.
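A minimal sketch, assuming a comma-delimited CSV source with a header row (my_csv_format is a placeholder name; DATA_LOAD_STAGE_1 is the stage used later in this walkthrough):

CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1;

CREATE OR REPLACE STAGE DATA_LOAD_STAGE_1
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');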
Create the table
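For instance (the table layout here is hypothetical, standing in for the original screenshot):

CREATE OR REPLACE TABLE emp_data (
  emp_id   INTEGER,
  emp_name STRING,
  salary   NUMBER(10, 2)
);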
Put the file into the stage.
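From SnowSQL (the local file path is a placeholder):

PUT file:///tmp/emp_data.csv @DATA_LOAD_STAGE_1 AUTO_COMPRESS = TRUE;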
Check for the bad records present on the source side.
Skip the bad records and load the rest into the table; the load output shows one bad record skipped.
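A sketch of both steps, reusing the placeholder names above: first surface the bad records with VALIDATION_MODE, then load with ON_ERROR = CONTINUE so the good rows still land in the table:

-- Step 1: report every error the load would raise, without loading anything
COPY INTO emp_data
FROM @DATA_LOAD_STAGE_1
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
VALIDATION_MODE = RETURN_ALL_ERRORS;

-- Step 2: load the good rows and skip the bad records
COPY INTO emp_data
FROM @DATA_LOAD_STAGE_1
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
ON_ERROR = CONTINUE;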
The FORCE=TRUE parameter forces the load to run even if the files have already been loaded previously. It is useful when you want to reload data from files that have not changed, or when you need to reload files that previously encountered errors.
When FORCE=TRUE is specified, the same data is reloaded, so expect duplicate rows unless you deduplicate afterwards.
Create the cricket_stats table.
Load the file into the stage DATA_LOAD_STAGE_1.
Load the data from the stage into the table.
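Putting the steps together (the cricket_stats columns and file name are assumptions; the stage name comes from the walkthrough):

CREATE OR REPLACE TABLE cricket_stats (
  player  STRING,
  matches INTEGER,
  runs    INTEGER
);

PUT file:///tmp/cricket_stats.csv @DATA_LOAD_STAGE_1;

-- Running this twice without FORCE would skip the already-loaded file;
-- FORCE = TRUE reloads it and duplicates the rows
COPY INTO cricket_stats
FROM @DATA_LOAD_STAGE_1
FILES = ('cricket_stats.csv.gz')
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
FORCE = TRUE;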
Benefits of Using PURGE=TRUE:
Storage Management: Automatically deleting the files helps in managing and optimizing your storage space, preventing unnecessary clutter and potential costs associated with storing large amounts of unused data.
Operational Efficiency: It simplifies the data loading process by eliminating the need for manual cleanup, which can be particularly useful when dealing with large datasets and frequent data loads.
Data Security: Removing files from the stage after loading reduces the risk of unauthorized access to staged files that are no longer needed, thus enhancing data security.
After the load, the file that was under the stage (t20.csv.gz) has been deleted.
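A sketch with PURGE = TRUE (assuming t20.csv.gz was loaded into the same cricket_stats table):

COPY INTO cricket_stats
FROM @DATA_LOAD_STAGE_1
FILES = ('t20.csv.gz')
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
PURGE = TRUE;  -- t20.csv.gz is deleted from the stage after a successful load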
How do I add the file name and load time to each row? Use INCLUDE_METADATA = (FILENAME = METADATA$FILENAME, ELT_LOAD_TIME = METADATA$START_SCAN_TIME).
The order_info table is created.
The file format and stage are created.
Data is uploaded to the stage from the local machine via SnowSQL.
Create a temp table to stage the data.
Dump the data into the final order_info table.
SELECT * FROM order_info
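A sketch of this flow (the order_info columns and the stage name are assumptions; the metadata columns are the ones Snowflake exposes for staged files):

-- Temp table mirrors the target, plus columns for the file metadata
CREATE OR REPLACE TEMPORARY TABLE order_info_stg (
  order_id      INTEGER,
  order_amount  NUMBER(10, 2),
  filename      STRING,
  elt_load_time TIMESTAMP_LTZ
);

-- Capture the metadata with a transform while copying from the stage
COPY INTO order_info_stg
FROM (
  SELECT $1, $2, METADATA$FILENAME, METADATA$START_SCAN_TIME
  FROM @OD01_STAGE
)
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');

-- Dump into the final table
INSERT INTO order_info SELECT * FROM order_info_stg;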
Copy from a Table to a Stage
REMOVE @OD01_STAGE;  -- remove all the files under this stage
LIST @OD01_STAGE;    -- check whether any files are there
COPY INTO @OD01_STAGE FROM customer_info;
🔹 All Parameters
🚀 Best Practices for COPY INTO @STAGE
✅ 1. Use SINGLE=FALSE for Large Exports
Allows Snowflake to split files efficiently using parallel processing
✅ 2. Optimize File Size for Performance
Use MAX_FILE_SIZE to balance read performance and storage costs.
For AWS S3, use ~128MB file size for best retrieval performance.
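For instance, combining practices 1 and 2 (the export path is a placeholder):

COPY INTO @OD01_STAGE/export/
FROM customer_info
FILE_FORMAT = (TYPE = 'CSV')
SINGLE = FALSE
MAX_FILE_SIZE = 134217728;  -- ~128 MB per file, in bytes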
✅ 3. Use INCLUDE_QUERY_ID=TRUE to Avoid File Overwrites
Ensures exported filenames are unique (prevents accidental overwriting).
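For example (the path is a placeholder):

COPY INTO @OD01_STAGE/daily/
FROM customer_info
INCLUDE_QUERY_ID = TRUE;  -- embeds the query UUID in each output filename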
✅ 4. Choose the Right File Format (PARQUET for Analytics)
PARQUET → best for big data analytics (compact, columnar storage).
CSV → best for external systems that don't support Parquet.
✅ 5. Encrypt Data for Security
Use the ENCRYPTION copy option (for example, ENCRYPTION = (TYPE = 'AWS_SSE_S3') when unloading to S3) for built-in encryption; files on internal stages are encrypted by Snowflake automatically.
✅ 6. Validate Data Before Copying Using VALIDATION_MODE
Avoid wasting time by checking errors before exporting data.
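For example, previewing the rows that would be unloaded (no files are written):

COPY INTO @OD01_STAGE/preview/
FROM customer_info
VALIDATION_MODE = RETURN_ROWS;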
✅ 7. Overwrite Files When Needed
Ensure fresh data by overwriting old files
COPY INTO @OD01_STAGE/CUSTOMERSDATA
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
FILE_FORMAT = (TYPE = 'PARQUET')
OVERWRITE = TRUE
SINGLE = TRUE
HEADER = TRUE;

LIST @OD01_STAGE;
Reference: https://docs.snowflake.com/en/sql-reference/sql/copy-into-table