How to address query performance challenges in Snowflake
In today’s world of cloud-based databases, long-running queries often meet a straightforward solution: allocate additional compute resources. Scaling warehouses and using brute force isn’t inherently wrong. But as data volumes grow, the cost of this approach becomes apparent: the same query gets steadily more expensive to run as compute is added and the underlying tables keep growing.
While tuning every query may be unrealistic, focusing on frequently executed or long-running queries can yield substantial, cumulative savings. Optimizing these queries also benefits the broader system. Like large trucks congesting a highway, heavy queries consume bandwidth, slowing down other workloads and sometimes requiring extra clusters for load balancing.
In this post, we cover the foundation for efficient SQL queries and concrete steps you can take to optimize your queries for costs and performance.
Query time breaks down into three components, each with its own typical bottlenecks:

| Compilation time | Queued overload time | Execution time |
| --- | --- | --- |
| Compilation takes a long time for complex queries | Concurrency is impacted as new queries cannot run without additional clusters to scale | Data spillage, join explosion and large table scans slow query processing, as do complex expressions, multi-table joins, window operations and GROUP BY aggregations |
Data modeling and query compilation
A robust data model is the foundation of efficient query execution, yet it is often overlooked. Proper design ensures joins are efficient, and clustering keys help minimize data scans by reducing the number of partitions queried. Much like a library catalog system, clustering keys make locating relevant data faster and more efficient.
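For example, a large table that is usually filtered by date can declare that column as its clustering key. A minimal sketch, assuming a hypothetical sales table filtered mostly on sales_date:

-- Organize micro-partitions around the column most queries filter on
ALTER TABLE sales CLUSTER BY (sales_date);

Clustering has a background maintenance cost, so it pays off mainly on large, frequently queried tables with a stable filter column.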
Every query in Snowflake begins with the Snowflake optimizer creating an explain plan based on metadata. The time spent generating this plan is known as query compilation time. Compilation time increases with query size, the number of joins, and the use of secured views and masking policies.
For example, consider a query that joins six tables:
SELECT c.customer_id, c.customer_name, s.sales_date, s.amount
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
JOIN products p ON s.product_id = p.product_id
JOIN regions r ON c.region_id = r.region_id
JOIN employees e ON s.sales_rep = e.employee_id
JOIN stores st ON s.store_id = st.store_id
WHERE s.sales_date BETWEEN '2023-01-01' AND '2023-06-30';
The Snowflake optimizer must evaluate multiple join paths, check metadata for each table and create the lowest-cost plan. As table counts grow, so does compilation time. If secured views or masking policies wrap some of these tables, the optimizer must also validate security rules, further extending compilation.
Snowflake’s optimizer does an excellent job of generating candidate plans quickly, but query authors can help by simplifying SQL where possible and avoiding unnecessary joins. The payoff is better optimizer choices, better database performance and more maintainable SQL.
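To make this concrete, if downstream consumers only need the customer and sales columns, and referential integrity guarantees every sale has a matching product, region, employee and store, the query above could be trimmed to the joins that actually contribute to the result (a sketch under those assumptions, not a drop-in replacement):

SELECT c.customer_id, c.customer_name, s.sales_date, s.amount
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
WHERE s.sales_date BETWEEN '2023-01-01' AND '2023-06-30';

Fewer tables means fewer join paths to evaluate, less metadata to read and a shorter compilation phase.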
Concurrency and warehouse sizing
Concurrency is another critical factor in query performance. In our highway analogy, it can be compared to traffic flow—more cars (queries) require more lanes (clusters). Larger warehouses speed up individual queries, but when workloads spike, additional clusters often provide the most effective relief.
Consider two warehouse configurations:
- Small (1,2) → Two clusters, better concurrency
- Medium (1,1) → One cluster, faster execution for larger, less frequent queries
Both cost four credits per hour, but their performance profiles differ. If your workload consists mainly of short BI dashboard queries, the Small (1,2) is more cost-effective. For a mix of heavy ETL jobs and reporting, a Medium (1,1) may be more appropriate.
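In Snowflake DDL, those two profiles look roughly like the sketch below (warehouse names and suspend settings are illustrative; multi-cluster warehouses require Enterprise Edition or above):

-- Small warehouse that scales out to two clusters when concurrency spikes
CREATE WAREHOUSE IF NOT EXISTS wh_bi_small
  WAREHOUSE_SIZE = 'SMALL'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- Medium warehouse with a single cluster for heavier individual queries
CREATE WAREHOUSE IF NOT EXISTS wh_etl_medium
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 1
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;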
To illustrate, here are two scenarios:
Dashboard workload (short queries):
SELECT MONTHNAME(order_date) AS sales_month,
       SUM(sales_amount) AS MonthlySales
FROM Sales
GROUP BY 1
ORDER BY 2;
ETL workload (heavier scans and transformations):
INSERT INTO top_weekly_customers (Acct_id, reporting_week, total_account_spend)
WITH week_usage AS (
    SELECT Acct_id, WEEK(transaction_date) AS reporting_week, SUM(account_balance) AS total_account_spend
    FROM ACCOUNTS GROUP BY Acct_id, reporting_week),
top_spend AS (
    SELECT Acct_id, reporting_week, total_account_spend,
           RANK() OVER (PARTITION BY reporting_week ORDER BY total_account_spend DESC) AS Rank_balance
    FROM week_usage)
SELECT Acct_id, reporting_week, total_account_spend FROM top_spend
WHERE Rank_balance <= 5;
As you scale, manually tuning hundreds of warehouses becomes impractical. Automated tools like Capital One Slingshot continuously analyze workloads and recommend optimal warehouse configurations, ensuring the right balance between cost and performance. You can even set Slingshot to automatically apply recommendations on your behalf for automation at scale.
Join explosion
One frequent and costly mistake is join explosion (or row explosion), which occurs when duplicate values or missing join conditions multiply rows unintentionally.
For instance, joining two tables without proper keys:
SELECT *
FROM customers c
JOIN orders o ON c.region_id = o.region_id; -- Wrong join condition
If the customers table has 10 rows and the orders table has five rows and they all fall in the same region, the result balloons to 50 rows, effectively a Cartesian product. In the above example, the join was performed on the wrong condition and, as a result, the query ran slower, cost more and produced incorrect results.
The correct join should be on customer ID:
SELECT *
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id;
Even a small oversight in join logic can generate massive and expensive query explosions.
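One way to catch this early is to profile the join keys before writing the join. A quick sketch against the example tables, assuming customers.customer_id is meant to be unique:

-- Any rows returned mean the "unique" side of the join will fan out results
SELECT customer_id, COUNT(*) AS row_count
FROM customers
GROUP BY customer_id
HAVING COUNT(*) > 1
ORDER BY row_count DESC;

If this returns duplicates, either deduplicate the table or add the missing join condition before running the expensive query.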
Large table scans
Large scans are among the most common causes of slow queries. While Snowflake can brute-force them, moving terabytes of data from storage to compute takes time and money.
Example of a large scan:
SELECT * FROM sales;
Improved version with a filter predicate:
SELECT *
FROM sales
WHERE sales_date BETWEEN '2025-01-01' AND '2025-09-30';
The query above allows Snowflake to prune partitions more effectively, reducing scan time and cost.
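To verify that a filter is actually pruning, you can check how well the table is clustered on that column; this sketch assumes the sales table and sales_date column from the example above:

-- Returns clustering statistics for sales_date; poor clustering depth means
-- even a tight date filter will still scan most micro-partitions
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(sales_date)');

The query profile’s "Partitions scanned" versus "Partitions total" figures tell the same story for an individual query.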
Divide and conquer strategy
Another way to increase the efficiency and cost-effectiveness of your Snowflake queries is to split large workloads into smaller, more digestible queries.
For example, instead of running one massive query:
SELECT c.cust_nm FROM orders c
WHERE c.acct_id IN (
SELECT s.acct_id FROM orders s
WHERE s.order_dt >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY s.acct_id HAVING AVG(s.balance) >
(SELECT AVG(balance) FROM orders WHERE order_dt >= CURRENT_DATE - INTERVAL '30 days'));
Break it down into incremental steps:
WITH my_orders AS (
    SELECT acct_id, SUM(balance) AS total_orders, AVG(balance) AS Avg_purchase
    FROM orders
    WHERE order_dt >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY acct_id
),
ov_avg AS (
    SELECT AVG(balance) AS gbl_avg
    FROM orders
    WHERE order_dt >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT c.cust_nm
FROM orders c
INNER JOIN my_orders rs ON c.acct_id = rs.acct_id
WHERE rs.Avg_purchase > (SELECT gbl_avg FROM ov_avg);
This approach reduces memory pressure and avoids excessive scans.
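If the CTEs are still too heavy, each step can be materialized into a temporary table so it is computed exactly once. A sketch using the same hypothetical orders table:

-- Step 1: per-account aggregates for the last 30 days
CREATE OR REPLACE TEMPORARY TABLE tmp_my_orders AS
SELECT acct_id, SUM(balance) AS total_orders, AVG(balance) AS Avg_purchase
FROM orders
WHERE order_dt >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY acct_id;

-- Step 2: the global average, computed once
CREATE OR REPLACE TEMPORARY TABLE tmp_ov_avg AS
SELECT AVG(balance) AS gbl_avg
FROM orders
WHERE order_dt >= CURRENT_DATE - INTERVAL '30 days';

-- Step 3: a small final join against the pre-computed results
SELECT c.cust_nm
FROM orders c
INNER JOIN tmp_my_orders rs ON c.acct_id = rs.acct_id
WHERE rs.Avg_purchase > (SELECT gbl_avg FROM tmp_ov_avg);

Temporary tables disappear at the end of the session, so they add no long-term storage cost.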
Data retention and archiving
Without data retention policies, tables can grow indefinitely, leading to ballooning storage and query costs. Many organizations find that users tend to query recent data (e.g., last six months) more than historical data. Yet queries often scan years of unused history.
For example:
-- Scans 10 years of data unnecessarily
SELECT *
FROM sales
WHERE sales_date >= '2015-01-01';
This query can run faster and cheaper if we design incremental reporting:
SELECT *
FROM sales
WHERE sales_date >= CURRENT_DATE - INTERVAL '180 DAYS';
Pair this with a clustering key on sales_date so Snowflake’s micro-partition pruning can skip even more data.
Archiving older data that is rarely queried to cheaper storage tiers, or separating it into historical tables, can dramatically reduce both cost and performance overhead.
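A minimal sketch of that pattern, assuming a hypothetical sales table with a two-year retention window for the "hot" copy:

-- One-time setup: a history table with the same structure
CREATE TABLE IF NOT EXISTS sales_history LIKE sales;

-- Move rows older than two years, then remove them from the hot table
INSERT INTO sales_history
SELECT * FROM sales
WHERE sales_date < DATEADD(year, -2, CURRENT_DATE);

DELETE FROM sales
WHERE sales_date < DATEADD(year, -2, CURRENT_DATE);

In practice you would wrap the move in a transaction or a scheduled task so the two statements stay in sync.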
Disk spillage
When queries exceed available memory, intermediate results spill to local disk and, in the worst case, to remote storage, increasing I/O and slowing performance.
For example, a large aggregation:
SELECT A.CUSTOMER_FEEDBACK,
       B.ISSUE_RESOLUTION,
       SUM(A.ORDER_COST) OVER (PARTITION BY A.CATALOG ORDER BY A.ORDER_TIMESTAMP) AS RUNNING_ORDER_COST,
       ARRAY_AGG(B.CUSTOMER_ID) WITHIN GROUP (ORDER BY B.CUSTOMER_ID) AS AGGREGATED_IDS
FROM CUSTOMER AS A
JOIN ORDERS AS B ON A.CUSTOMER_ID = B.CUSTOMER_ID
WHERE DATE(A.ORDER_TIMESTAMP) > '2020-01-01'
GROUP BY A.CUSTOMER_FEEDBACK, B.ISSUE_RESOLUTION, A.CATALOG, A.ORDER_COST, A.ORDER_TIMESTAMP;
If the orders table is massive, intermediate results may spill to disk.
Some possible solutions include:
- Upsizing your warehouse
- Adding pre-aggregated summary tables
- Breaking the query into smaller steps
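Before choosing a fix, it helps to know which queries actually spill. This sketch uses Snowflake’s ACCOUNT_USAGE.QUERY_HISTORY view, which exposes spill metrics (with some reporting latency); the seven-day window is arbitrary:

-- Find the heaviest spillers over the last week
SELECT query_id,
       warehouse_name,
       bytes_spilled_to_local_storage,
       bytes_spilled_to_remote_storage,
       total_elapsed_time / 1000 AS elapsed_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP)
  AND (bytes_spilled_to_local_storage > 0 OR bytes_spilled_to_remote_storage > 0)
ORDER BY bytes_spilled_to_remote_storage DESC, bytes_spilled_to_local_storage DESC
LIMIT 20;

Remote spillage is the more expensive of the two, so queries at the top of this list are usually the best candidates for a larger warehouse or a rewrite.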
Complex queries
Queries involving multiple joins, window functions or large GROUP BY aggregations are particularly resource-intensive.
Example of a complex query:
SELECT c.customer_id,
SUM(o.amount) AS total_spent,
RANK() OVER (PARTITION BY c.region_id ORDER BY SUM(o.amount) DESC) AS regional_rank
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN regions r ON c.region_id = r.region_id
GROUP BY c.customer_id, c.region_id;
Queries like the above can strain compute resources. Options to optimize include materializing intermediate results into staging tables or using summary tables for common aggregations.
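As one illustration of the summary-table option, the per-customer aggregation can be materialized first so the ranking runs against a much smaller input (table names are illustrative):

-- Step 1: pre-aggregate spend per customer
CREATE OR REPLACE TABLE customer_spend_summary AS
SELECT c.customer_id, c.region_id, SUM(o.amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.region_id;

-- Step 2: rank within each region using the small summary table
SELECT customer_id,
       total_spent,
       RANK() OVER (PARTITION BY region_id ORDER BY total_spent DESC) AS regional_rank
FROM customer_spend_summary;

The summary table can also be refreshed on a schedule and reused by other reports that need the same aggregation.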
Continuous tuning
Query performance tuning is not a one-time exercise. As data naturally grows and changes over time, queries that once ran efficiently can degrade. Monitoring thousands of queries manually is infeasible, making automation essential.
We built Slingshot to help us with challenges just like this as we scaled on Snowflake to thousands of users running millions of queries a day.
Slingshot helps you:
- Identify high-cost queries
- Highlight scan-heavy tables
- Dynamically right-size your warehouses
- Validate optimization savings
The inherent feedback loop not only optimizes performance but also provides visibility into the value delivered.
Conclusion
Scaling compute is an easy short-term solution, but it masks long-term inefficiencies. By focusing on better data modeling, warehouse sizing, pruning scans and retention policies—and by leveraging automated tools—organizations can ensure cost-efficient and performant workloads.
Query tuning is less about chasing perfect efficiency and more about building sustainable, scalable systems that can grow with your data.