40 new features to optimize Snowflake spend with Slingshot
Over the past few months, the Capital One Slingshot team has been hard at work and we’re excited to share what we’ve built! We’ve been working with customers on new features and capabilities. Today, we are thrilled to announce 40 brand new features spanning optimization, visualization, governance and notifications to help users manage their Snowflake infrastructure.
We hope these new capabilities help educate and accelerate Snowflake use across your organization. Many of these insights are based on lessons we’ve learned at Capital One, running Snowflake at scale across thousands of global users running millions of queries a month on petabytes of data.
Without further ado, let’s dive in!
All of the new features fall within these overarching categories:
Optimization
-
23 new optimization insights for classic and serverless compute, storage, workloads and data governance.
-
Timely warehouse notifications for faster visibility into performance anomalies.
Visualization
-
13 new reports to help you understand how the organization uses Snowflake, and identify inefficiencies and cost drivers at a glance.
-
Data pipeline cost lineage to understand your full cost context for your Snowflake workloads.
Governance
-
Fine-grained access controls to limit or remove users’ visibility into other lines of business, putting the Principle of Least Privilege into action.
-
Slack notifications for Snowflake cost and performance anomalies.
If you’re curious to learn more, we’ve got you. Let’s roll up our sleeves and dive into each section.
Snowflake Optimization
New features and insights to help you maximize your investment in Snowflake.
Optimization hub
The Slingshot Optimization Hub is where you can find inefficiencies driving up your Snowflake spend and how to remediate them. The insights on this page are based on best practices and lessons learned by Snowflake experts at Capital One, from years of optimizing massive data sets at scale.
The insights we have today focus on identifying inefficiencies, eliminating waste and pinpointing objects for optimization across the full Snowflake stack. Check out the list of 23 new insights for optimizing Snowflake and stay tuned for more quick insights in the coming weeks and months!
Storage
Maximize your investment in Snowflake storage, with insights into:
-
Time travel storage savings: Optimize time travel settings to reduce backup storage costs.
-
Unused tables: Find tables with no usage or changes.
-
Forgotten large tables: Identify tables holding more than 100 GB of data that are written to but never read.
-
Improper use of temporary tables: Detect inefficient use of temporary or transient tables.
-
Tables with low pruning efficiency: When frequently scanned tables have more partitions than necessary, both cost and performance suffer. Use this insight to reduce costs by ensuring the right number of partitions is in place.
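To make the pruning idea concrete, here is a minimal sketch of how a scan ratio could be computed per table. It is not Slingshot's implementation: the `TableScanStats` type, the function names and the 0.8 cutoff are all assumptions for illustration. The inputs mirror the kind of partitions-scanned and partitions-total counts that Snowflake reports for queries.

```python
from dataclasses import dataclass


@dataclass
class TableScanStats:
    """Hypothetical per-table totals aggregated from query history."""
    table: str
    partitions_scanned: int
    partitions_total: int


def scan_ratio(stats: TableScanStats) -> float:
    """Fraction of the available partitions that queries actually scanned."""
    if stats.partitions_total == 0:
        return 0.0  # nothing to scan, so nothing to prune
    return stats.partitions_scanned / stats.partitions_total


def low_pruning_tables(all_stats, max_ratio=0.8):
    """Names of tables whose scan ratio exceeds max_ratio (an assumed cutoff)."""
    return [s.table for s in all_stats if scan_ratio(s) > max_ratio]
```

A ratio close to 1.0 means most queries read nearly the whole table, which is usually the signal to revisit clustering keys or query filters.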
Cost anomalies
Stay in the know with cost anomaly insights for short and long-term optimization, with insights into:
-
Sudden changes in credit consumption: ML-powered insights into sudden deviations in credit consumption help you spot issues before they snowball and become costly. This optimization compares expected costs vs. actual costs on an hourly basis to detect anomalies and alert you to them.
-
Significant cost patterns: Identify sustained changes in usage patterns over the past week vs. historical data. Use this insight to identify large drivers of costs that might not otherwise be accounted for.
-
Warehouse cost anomaly detection: ML-powered insights into warehouse spend data, allowing users to compare current spend vs. forecasted spend to identify unexpected cost spikes.
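The expected-vs-actual comparison described above can be illustrated with a deliberately simple statistical stand-in. This sketch is not the ML model Slingshot uses: it treats the trailing average as the "expected" hourly cost, and the window size, the `k` multiplier and the 10% floor are all assumed values.

```python
from statistics import mean, stdev


def hourly_anomalies(credits, window=24, k=3.0):
    """Return (hour_index, actual, expected) for hours far from the trailing baseline.

    credits: hourly credit consumption, oldest first.
    window:  number of prior hours used as the baseline (assumed: 24).
    k:       how many standard deviations count as anomalous (assumed: 3).
    """
    flagged = []
    for i in range(window, len(credits)):
        history = credits[i - window:i]
        expected = mean(history)
        # Floor the tolerance at 10% of expected so a perfectly flat
        # history (standard deviation 0) does not flag tiny wobbles.
        tolerance = max(k * stdev(history), 0.1 * expected)
        if abs(credits[i] - expected) > tolerance:
            flagged.append((i, credits[i], expected))
    return flagged
```

A production detector would also need to account for daily and weekly seasonality, which a plain trailing average ignores.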
Queries
Optimize queries for speed and efficiency, with insights into:
-
Inflated costs and query time due to high remote spillage: Detect jobs that are running slower than anticipated due to remote spillage. These jobs can run faster and at a lower cost on appropriately sized infrastructure.
-
Wasted spend on tasks that continuously fail: Find continuously failing tasks to save on wasted compute costs.
-
Poor query pruning: Efficiently pruning unnecessary partitions can decrease cost and improve execution time. This insight pinpoints queries that prune poorly so you can take action to reduce data scans, execution time and costs.
-
Recurring tasks which don't use a stream: Detect recurring queries that might prevent warehouses from autosuspending and drive costs up.
-
Recurring long-running failed queries: Identify recurring queries that run for more than 30 minutes and then fail, so you can reduce the credits they waste.
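As a rough illustration of the last insight, the sketch below totals the credits spent on long-running failures per recurring query. The `QueryRun` record, the `query_hash` grouping key and the two-failure minimum are assumptions; only the 30-minute cutoff comes from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryRun:
    query_hash: str          # hypothetical key identifying "the same" recurring query
    status: str              # "SUCCESS" or "FAILED"
    duration_minutes: float
    credits: float


def wasted_credits_by_query(runs, min_minutes=30, min_failures=2):
    """Credits spent on runs that exceeded min_minutes and then failed.

    Only queries with at least min_failures such runs are reported,
    which is one simple way to define "recurring".
    """
    wasted = defaultdict(float)
    failures = defaultdict(int)
    for run in runs:
        if run.status == "FAILED" and run.duration_minutes > min_minutes:
            wasted[run.query_hash] += run.credits
            failures[run.query_hash] += 1
    return {q: wasted[q] for q in wasted if failures[q] >= min_failures}
```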
Compute
Optimize Snowflake compute, with insights into:
-
Warehouses with no autosuspend: Save on compute by autosuspending warehouses when idle.
-
Inefficient usage of multi-cluster warehouse autoscaling: Detect when the min and max cluster count is the same value for multi-cluster warehouses, preventing them from scaling up or down to respond to demand. If you can’t scale down, you can’t cut costs.
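The multi-cluster check is easy to picture in code. This minimal sketch assumes you have already pulled each warehouse's minimum and maximum cluster counts into a simple record; the `WarehouseConfig` type and function name are hypothetical and do not reflect how Slingshot detects this.

```python
from dataclasses import dataclass


@dataclass
class WarehouseConfig:
    name: str
    min_cluster_count: int
    max_cluster_count: int


def fixed_size_multicluster(warehouses):
    """Names of multi-cluster warehouses pinned to a single cluster count.

    A warehouse with max_cluster_count == 1 is not multi-cluster,
    so it is excluded even though its min and max are equal.
    """
    return [
        w.name
        for w in warehouses
        if w.max_cluster_count > 1 and w.min_cluster_count == w.max_cluster_count
    ]
```

Lowering the minimum below the maximum on a flagged warehouse lets it shed clusters when demand drops.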
Serverless costs
Don’t leave money on the table. Configure Snowflake auto functions correctly with insights into:
-
Rarely used tables with automatic clustering: Automatic clustering incurs costs every time new data is loaded into a table, which can outweigh the potential savings during data reads when tables are used fewer than 100 times a week.
-
Rarely used materialized views: Find underutilized materialized views incurring storage and compute costs without providing much performance benefit.
Data loading
Cut compute costs by loading data into the system more efficiently, with insights into:
-
Inefficient COPY commands: Break up large files (over 128 MB) when using COPY commands to reduce compute usage.
-
Repeated single-row inserts: Increase efficiency and cut costs by batching single-row inserts into bulk inserts.
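Both data loading insights come down to sizing work sensibly before it reaches Snowflake. The sketch below shows one way to plan file splits against the 128 MB mark mentioned above and to group rows into batches. The function names and the 1,000-row batch size are assumptions, and the code does not talk to Snowflake at all.

```python
import math

MAX_FILE_BYTES = 128 * 1024 * 1024  # the 128 MB mark from the insight above


def suggested_splits(file_sizes):
    """Map each oversized file to the number of chunks needed to stay under the limit.

    file_sizes: dict of file name -> size in bytes.
    """
    return {
        name: math.ceil(size / MAX_FILE_BYTES)
        for name, size in file_sizes.items()
        if size > MAX_FILE_BYTES
    }


def batches(rows, batch_size=1000):
    """Yield rows in groups so they can be inserted together instead of one at a time."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]
```

Each batch from `batches` could then be sent as one multi-row insert or staged as a file for a single COPY.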
User management
Manage users and access with insights into:
-
Users with elevated permissions: Quickly identify users with elevated permissions, like warehouse admins, and easily revoke permissions when they change functions or leave the organization.
-
Users assigned to many roles: Identify users that might have been assigned to roles to gain access to a single object or subset of resources accessible to users in the role. In these cases, users most likely have overprovisioned permissions and could potentially become internal threat actors.
-
Users who haven’t logged in for a while: Inactive users have either left the organization or no longer need access to the platform. This insight surfaces users whose access you might want to revoke.
Third-party spend
See how much you're spending on third parties integrated with your Snowflake account. Slingshot automatically displays your spend on hundreds of third-party apps in the prebuilt third-party service dashboard. If you use third-party tools like dbt, Fivetran, Airflow, Hightouch or Spark, connect Slingshot and it will automatically display costs broken down by third-party service.
Timely warehouse notifications
Learn about performance anomalies sooner with Slingshot’s timely warehouse notifications. Easily set custom performance thresholds and decide whom to alert when performance anomalies are detected. Set maximum query queuing times, query execution times and/or warehouse spillage percentages, and Slingshot will notify you within an hour of any anomalies or spikes.
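Threshold-based alerting of this kind can be pictured as a comparison between measured metrics and configured limits. The following is only a sketch under assumed names: the `Thresholds` fields and metric keys are hypothetical and are not Slingshot's configuration schema.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Hypothetical per-warehouse limits chosen by an admin."""
    max_queue_seconds: float
    max_execution_seconds: float
    max_spillage_pct: float


def breaches(metrics, limits):
    """Return the names of metrics that exceed their configured limit.

    metrics: dict with optional keys "queue_seconds",
             "execution_seconds" and "spillage_pct".
    """
    checks = {
        "queue_seconds": limits.max_queue_seconds,
        "execution_seconds": limits.max_execution_seconds,
        "spillage_pct": limits.max_spillage_pct,
    }
    # Missing metrics default to 0, so they never count as a breach.
    return [name for name, limit in checks.items() if metrics.get(name, 0) > limit]
```

A non-empty result is what would trigger a notification to the people assigned to that warehouse.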
Visualizations
New dashboards and reports to help you understand spend, allocate costs and get to root causes faster.
Granular reports
Seeing is believing when it comes to understanding your company’s Snowflake usage and how you can optimize it. We’ve built 13 new reports to help everyone from DBAs to data analysts to Snowflake admins to Finance understand how the organization uses and spends on Snowflake.
These new visualizations are meant to help you slice and dice your way into cost and performance issues, with insights into:
-
AI/ML/Cortex costs
-
Highest-cost data objects
-
Top users and queries per table
-
Query costs by role
-
Costs by table accessed
-
Cost by table
-
Cost per Task
-
Cost per Stored Procedure
-
Cost per Dynamic Table
-
Allocate costs by table/schema
-
Allocate costs by third-party tools
-
Costs by Snowpipe workload
-
Cost of Snowpark Container Services
Governance
New ways to control user access by role and org:
Federated access
Enterprise users can now assign roles and control object-level access to different lines of business. Admins can use this feature to implement the zero trust principle of least privilege, and limit user visibility and access to only the necessary objects.
Slack notifications
Slingshot now integrates with Slack. Stay up to date on the latest cost spikes and performance anomalies across warehouses and lines of business by enabling and configuring a Slack channel for Slingshot notifications. Stay tuned for additional integrations with messaging platforms, like Teams, that we'll be pushing out soon.
Conclusion
We’re excited to share these updates with you. These 40 new features were created through collaboration with our valued customers to address the challenges they are facing with Snowflake optimization.
We aim to cover the full Snowflake stack, including data loading, storage, compute, queries and governance. There is still a lot to build, so stay tuned as we launch new insights and optimizations in the coming weeks and months.
Our goal is to take the burden of optimization off your team, so they can focus on building pipelines, driving business value and doing differentiating work. If this sounds interesting to you, book time with our experts to learn how Slingshot can help you save time and money on Snowflake optimization.