Optimizing warehouses for cost and performance
September 25, 2023
A highly effective way businesses can reduce their cloud spending while improving performance is by optimizing their data warehouses. When compute capacity is provisioned efficiently, businesses can minimize waste while improving the overall performance of their systems. Let’s look at how a combination of best practices and tooling can enable effective warehouse optimization.
Best practices for managing data warehouses
One of the major challenges Capital One faced in moving to the cloud was proactively right sizing warehouses without pulling our data teams away from driving business value to manage warehouses. We came away with the following best practices to best manage our warehouses in the cloud.
One warehouse size does not fit all. By scheduling your warehouse to change sizes dynamically depending on the workload size, time of day and day of the week, you pay based on your company’s usage patterns.
Set the right auto-suspend time. Reducing the auto-suspend time for a warehouse from 15 minutes to 2 minutes, for example, can greatly decrease idle times and save on costs. Paying attention to warehouse trends will allow you to set an auto-suspend time that helps improve usage and cost efficiency.
Configure multi-cluster correctly. At Capital One we adopted Snowflake as our data cloud because of the near-infinite scalability. When configuring a multi-cluster warehouse you need to consider how fast a response time is necessary. Significant cost savings are possible if you can afford to wait a few minutes of high load before spinning up a new cluster.
Keep different sizes and types of workloads separate. Each team or group has its own requirements and the data warehouse should be rightsized for that team’s needs. Keeping workloads that are different sizes and types separated leads to greater efficiency and cost savings.
How to right-size your warehouses
We covered how the warehouse size, scaling policy and auto-suspend time can affect the cost of running workloads. But how do we determine the rightsize for a warehouse? There are four key metrics to consider:
Query size: Understanding the distribution of typical query sizes that run in a warehouse will help determine the best size. You may find almost all queries are small and you do not actually need that XL warehouse you created by default.
Data spillage: When a data warehouse runs out of memory while executing a query, the data spillage that occurs significantly slows down query execution and degrades performance. Viewing the spillage patterns can help you identify queries that tend to lead to spillage and how much disk storage is necessary to complete the queries.
Query load: This metric helps organizations understand the time required for a given query to run in a data warehouse. If you decide to scale up the size to accommodate, there will be a tradeoff with query times decreasing while costs go up.
Queued queries: Reviewing the total number of queries that entered a queued state, usually due to the load on the warehouse, will help with understanding the minimum and maximum cluster settings that are ideal to scale.
Balancing warehouse cost and performance with Slingshot
There are many factors to consider when right-sizing your warehouses and managing these decisions effectively takes time. We built Capital One Slingshot to help businesses optimize their Snowflake investments by finding the right balance between warehouse cost and performance.
Slingshot’s warehouse scheduling tool allows users to set dynamic schedules using templates to automatically provision warehouses. The ability to tune warehouse sizes and scaling policies, apply schedules and manage inactivity controls to rightsize warehouses based on need means you only pay for the compute you need, when you need it.
Slingshot’s warehouse recommendations identify when a warehouse is under or overutilized and provides optimized size and cluster settings on a schedule. The recommendation also includes an estimated cost savings where applicable to apply tangible business value to the change. For example, Slingshot may recommend a schedule for a warehouse that estimates a 20.5% per month savings with just a 15 second increase in query execution time.
Putting it all together
The near-infinite scalability of data in the cloud brings incredible opportunities for new data insights and use cases, but managing all that data can present challenges without proper oversight. Following best practices and leveraging tools to help proactively manage and rightsize warehouses will help you optimize costs, improve efficiency and scale faster.