Jobs Classic vs Jobs Serverless vs DBSQL: Who wins on TPC-DS?
A data-driven analysis challenging conventional wisdom about serverless compute
TL;DR: We ran the full TPC-DS benchmark suite across three compute planes (Databricks serverless SQL warehouses, Databricks serverless Jobs Compute and Databricks classic Jobs Compute), using Apache JMeter for standardized load generation. The goal: quantify latency, throughput, scalability and cost-per-query under controlled, realistic workloads. After running more than 4,750 TPC-DS queries, the data shows that Jobs Classic beats both Jobs Serverless variants on cost, performance and consistency. The real TPC-DS winner? Serverless SQL warehouses. Read on for all the details.
Introduction: Questioning the narrative
Databricks Serverless is billed as the next evolution in data processing. It is a managed compute service that automatically provisions clusters and comes with built-in autoscaling and zero cluster management. It offers a compelling solution for many, especially those new to Databricks: why deal with cluster configurations when serverless can handle everything for you?
We ran thousands of analytical queries daily to simulate real-life production workloads and assess where Jobs Serverless stands on cost and performance compared to Jobs Classic. This article presents our findings from a comprehensive TPC-DS benchmark across two scale factors (SF1 and SF1000), which simulated standard analytics queries at production scale.
The benchmark: What we tested
Here’s a rundown of what we tested and the metrics we benchmarked against:
The basics
Dataset: TPC-DS Scale Factor 1000 (1TB of data) and 1 (1GB of data)
Queries: 95 TPC-DS queries representing realistic analytical workloads complete with:
- Complex multi-table joins
- Window functions and aggregations
- Varying selectivity and complexity
Total executions: More than 4,750 queries across multiple runs for statistical validity.
Compute planes tested
- Serverless SQL warehouses: Small, medium, large, X-large
- Jobs Classic: Standard compute cluster (rd-fleet.xlarge) with Photon both enabled and disabled, 8 worker nodes, autoscaling disabled, on-demand and Spot (with fallback to on-demand), release 16.4.12. This classic cluster with both Photon and Spot enabled is the baseline for comparisons throughout this report. We also tested variations with 2, 4 or 6 worker nodes, but defaulted to 8 workers in the benchmark below.
- Jobs Serverless: We tested both Jobs Serverless - Standard (cost optimized) and Jobs Serverless - Performance (performance optimized).
Key metrics
- Actual DBU consumption from system.billing.usage
- Real costs based on list pricing
- AWS cluster costs for Spot and on-demand instances
- Query latencies (P50, P90, P95, P99): P50 is the median, meaning 50% of queries complete within this time. P95 approaches the worst case: 1 in 20 queries exceeds this value. P99 is the worst case that clusters still hit regularly, with 1 in 100 queries exceeding it; P99 is often the metric that determines an SLA guarantee.
- Performance consistency (coefficient of variation)
All this is to say: our analysis and findings are based on real data from production-scale testing with actual billing information.
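As a quick illustration of how the percentile metrics above are defined, here is a minimal nearest-rank percentile sketch. The latency values are invented for illustration; they are not benchmark data.

```python
# Sketch: nearest-rank percentile, the usual definition behind P50/P90/P95/P99.
# The sample latencies below are made up for illustration only.

def percentile(samples, p):
    """Smallest value with at least p% of samples at or below it (nearest rank)."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceil(n * p / 100)
    return ordered[int(rank) - 1]

latencies = [1.2, 1.9, 2.4, 3.1, 4.8, 5.0, 6.7, 9.3, 14.2, 82.0]

# P50: half of all queries finish at or below this value.
# P95: roughly 1 in 20 queries is slower than this.
# P99: 1 in 100 queries is slower; often the number an SLA is written against.
for p in (50, 90, 95, 99):
    print(f"P{p} = {percentile(latencies, p)}s")
```

Note how a single straggler (the 82.0s sample here) dominates the tail percentiles while leaving the median untouched; that asymmetry is exactly why this report leans on P99 rather than P50.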
You can download the notebooks we ran and try them for yourself below.
Performance breakdown: Jobs Classic vs Jobs Serverless
We ran the entire benchmark analysis across the spectrum of Databricks’ Jobs Compute. We analyzed both the Jobs Serverless offering as well as Jobs Classic, including variations with Photon and Spot instances disabled.
For this experiment we used an r6idn.xlarge instance, priced at the time of writing at $0.19 per hour for Spot and $0.391 per hour on-demand. The job takes just under an hour to run. The table below summarizes the costs, including the AWS cluster costs. The data indicates a consistent pattern:
| Configuration | P50 (sec) | P90 (sec) | P99 (sec) | DBUs | Cost/DBU | AWS cost | Total cost |
|---|---|---|---|---|---|---|---|
| Jobs Classic (baseline) | 5.41 | 26.86 | 82.22 | 60 | $0.15 | $1.29 | $10.22 |
| Jobs Classic: Photon disabled, Spot enabled | 5.73 | 29.36 | 163.1 | 42 | $0.15 | $1.52 | $7.81 |
| Jobs Classic: Photon disabled, Spot disabled | 12.26 | 77.1 | 285.64 | 41 | $0.15 | $3.22 | $9.33 |
| Jobs Serverless - Standard | 5.41 | 32.11 | 231.45 | 37 | $0.35 | Built-in | $12.95 |
| Jobs Serverless - Performance | 5.37 | 26.71 | 159.08 | 52 | $0.35 | Built-in | $18.20 |
At first glance, it seems like the serverless variants can match Jobs Classic’s P50 performance (5.41s) while using fewer DBUs (37 vs 60). But this is where Databricks’ pricing model changes the story.
The 133% DBU rate premium for Serverless ($0.35/DBU vs $0.15/DBU) negates the efficiency gains. Let’s break it down:
- Jobs Classic: (60 DBUs x $0.15) + $1.29 AWS charge = $10.22
- Jobs Serverless - Standard: 37 DBUs x $0.35 = $12.95
- Jobs Serverless - Performance: 52 DBUs x $0.35 = $18.20
As you can see from the calculation above, you end up paying 27% more for Jobs Serverless - Standard and 78% more for Jobs Serverless - Performance.
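The cost arithmetic above can be sketched in a few lines, using the figures from the table. (The small gap between 60 x $0.15 + $1.29 and the reported $10.22 comes from the DBU count being rounded, so the premiums reuse the reported totals.)

```python
# Sketch of the cost model: total cost = DBUs consumed x DBU rate + cloud charge.
# DBU counts and totals are taken from the benchmark table above.

def total_cost(dbus, rate_per_dbu, cloud_cost=0.0):
    return dbus * rate_per_dbu + cloud_cost

print(total_cost(60, 0.15, cloud_cost=1.29))  # Jobs Classic: ~$10.29 here vs $10.22 reported (rounded DBUs)
print(total_cost(37, 0.35))                   # Jobs Serverless - Standard: $12.95, cloud cost built in
print(total_cost(52, 0.35))                   # Jobs Serverless - Performance: $18.20

classic, standard, performance = 10.22, 12.95, 18.20  # reported totals
print(f"Standard premium over Classic:    {standard / classic - 1:.0%}")     # 27%
print(f"Performance premium over Classic: {performance / classic - 1:.0%}")  # 78%
```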
But that’s just one part of it. We haven’t even touched on the latency degradation. More about that next.
P99: When serverless hits a wall
If you care about SLAs, the P90, P95 and P99 metrics are the ones keeping you up at night. Half your queries might run just fine, but if 10% of them run 30% slower, that's a production incident. While the median performance of Jobs Classic and Jobs Serverless appears somewhat competitive, the tail latencies expose some instability in the serverless offerings:
- Jobs Classic P99: 82.22 seconds; 99% of queries complete in under 82 seconds
- Jobs Serverless - Standard P99: 231.45 seconds; 99% of queries complete in under 231 seconds (2.8x slower than Classic)
- Jobs Serverless - Performance P99: 159.08 seconds; 99% of queries complete in under 159 seconds (1.9x slower than Classic)
For time-sensitive pipelines like revenue reports, fraud detection and real-time analytics, that kind of variability can lead to consistency issues.
The impact of worker counts
Our analysis indicates that avoiding AWS cluster costs is not enough to offset the higher DBU rate of serverless jobs at the scale of the TPC-DS benchmark dataset. At the same time, the data shows that the Jobs Classic configuration outperformed both Jobs Serverless offerings on P99.
So we decided to test the impact that different worker counts have on the cost and performance of Jobs Classic vs Jobs Serverless. Here are our findings:
| Configuration | P50 (sec) | P90 (sec) | P99 (sec) | DBUs | $/DBU | DBU cost | AWS cost | Total cost |
|---|---|---|---|---|---|---|---|---|
| Jobs Classic with 8 workers: Photon and Spot enabled | 5.41 | 26.86 | 82.22 | 60 | $0.15 | $8.93 | $1.29 | $10.22 |
| Jobs Classic with 6 workers: Photon and Spot enabled | 5.59 | 26.05 | 98.43 | 39 | $0.15 | $5.83 | $0.97 | $6.79 |
| Jobs Classic with 4 workers: Photon and Spot enabled | 6.84 | 35.95 | 470.74 | 94 | $0.15 | $14.14 | $0.64 | $14.78 |
| Jobs Classic with 2 workers: Photon and Spot enabled | 10.85 | 63.41 | 240.4 | 96 | $0.15 | $14.45 | $0.32 | $14.77 |
| Jobs Classic with 8 workers: Photon disabled, Spot enabled | 5.73 | 29.36 | 163.1 | 42 | $0.15 | $6.29 | $1.52 | $7.81 |
| Jobs Classic with 8 workers: Photon disabled, Spot disabled | 12.26 | 77.1 | 285.64 | 41 | $0.15 | $6.11 | $3.22 | $9.33 |
| Jobs Serverless - Standard | 5.41 | 32.11 | 231.45 | 37 | $0.35 | $12.92 | Built-in | $12.95 |
| Jobs Serverless - Performance | 5.37 | 26.71 | 159.08 | 52 | $0.35 | $18.21 | Built-in | $18.20 |
The data shows that using 6 workers instead of 8 actually reduced DBU consumption and therefore total cost, without significantly affecting the P99 time. However, reducing the worker count to 4 or 2 degrades the workload: DBU consumption spikes to 94 or 96 DBUs, respectively. At these worker counts we also see an even greater hit to Jobs Classic performance, all at a cost roughly 45% higher than the baseline.
What does this mean for you? Experimentation with cluster configuration is still very much a requirement in order to optimize Classic Jobs. Sometimes scaling up actually lowers costs and scaling down increases costs, as seen in the results above. Thankfully, there are tools that can help you rightsize your clusters, from Databricks’ own autoscaler to external solutions like Capital One Slingshot.
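As a toy illustration of that experimentation, here is a brute-force pass over the worker-count results above that simply picks the measured configuration with the lowest total cost. Real rightsizing tools model far more than four data points; this just shows why the "more workers is cheaper" intuition fails here.

```python
# Sketch: pick the cheapest worker count from the measured results above.
# Keys are worker counts; values come straight from the benchmark table.
runs = {
    8: {"p99": 82.22,  "total_cost": 10.22},
    6: {"p99": 98.43,  "total_cost": 6.79},
    4: {"p99": 470.74, "total_cost": 14.78},
    2: {"p99": 240.40, "total_cost": 14.77},
}

best = min(runs, key=lambda w: runs[w]["total_cost"])
print(f"cheapest: {best} workers at ${runs[best]['total_cost']}")  # cheapest: 6 workers at $6.79
```

Note that cost is not monotonic in worker count: 6 workers beats both 8 and 2, which is exactly why configuration sweeps remain necessary for Classic Jobs.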
DBU efficiency: The optimization paradox
Serverless is supposed to eliminate waste through better resource management. While that might be true, the data shows that the efficiency gains do not outweigh the additional costs of serverless.
When we take into consideration the different DBU rates for Jobs Classic vs Jobs Serverless ($0.15/DBU vs $0.35/DBU), apparent savings transform into actual losses even when we add in the cloud cluster costs for the classic clusters.
| Configuration | DBUs | Relative efficiency | $/DBU | AWS costs | Total costs | Cost difference |
|---|---|---|---|---|---|---|
| Jobs Classic | 60 | Baseline | $0.15 | $1.29 | $10.22 | Baseline |
| Jobs Classic: Photon disabled, Spot enabled | 42 | 30% more "efficient" | $0.15 | $1.52 | $7.81 | Costs 24% less |
| Jobs Classic: Photon disabled, Spot disabled | 41 | 32% more "efficient" | $0.15 | $3.22 | $9.33 | Costs 9% less |
| Jobs Serverless - Standard | 37 | 38% more "efficient" | $0.35 | Built-in | $12.95 | Costs 27% more |
| Jobs Serverless - Performance | 52 | 13% more "efficient" | $0.35 | Built-in | $18.2 | Costs 78% more |
As you can see above, lower DBU consumption does not necessarily mean lower costs.
Beyond just pure costs, the data also suggests that Serverless Jobs have higher performance variance:
| Configuration | P50/P99 ratio | Variance assessment |
|---|---|---|
| Jobs Classic | 15.2x | Moderate, predictable |
| Jobs Serverless - Standard | 42.8x | Extreme variance |
| Jobs Serverless - Performance | 29.6x | High variance |
The P50-to-P99 ratio measures how far your extreme values sit from the median: the higher the multiplier, the more variable the workload. The massive multipliers for the Jobs Serverless variants may indicate performance instability that could make it difficult to guarantee SLAs.
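The tail ratio is easy to reproduce from the percentile figures reported earlier:

```python
# Sketch: the P50-to-P99 tail ratio. Inputs are the reported percentiles above.

def tail_ratio(p50, p99):
    """How many times worse the 1-in-100 query is than the median query."""
    return p99 / p50

print(f"Jobs Classic:                  {tail_ratio(5.41, 82.22):.1f}x")   # 15.2x
print(f"Jobs Serverless - Standard:    {tail_ratio(5.41, 231.45):.1f}x")  # 42.8x
print(f"Jobs Serverless - Performance: {tail_ratio(5.37, 159.08):.1f}x")  # 29.6x
```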
Cost-per-performance: The final metric
When we calculate dollars per query weighted by performance, we can clearly see that Jobs Compute on classic clusters is the most cost efficient compute:
| Configuration | $/Query (DBUs) | P99 (sec) | Cost x P99 | AWS cost | Relative efficiency |
|---|---|---|---|---|---|
| Jobs Classic | $0.094 | 82.22 | 7.73 | $1.29 | Baseline |
| Jobs Classic: Photon disabled, Spot enabled | $0.067 | 163.10 | 10.93 | $1.52 | 41% less efficient |
| Jobs Classic: Photon disabled, Spot disabled | $0.064 | 285.64 | 18.28 | $3.22 | 136% less efficient |
| Jobs Serverless - Standard | $0.138 | 231.45 | 31.94 | Built-in | 313% less efficient |
| Jobs Serverless - Performance | $0.200 | 159.08 | 31.82 | Built-in | 312% less efficient |
As you can see in the table above, the data indicates that Jobs Serverless comes out much lower on cost-efficiency compared to Jobs Classic.
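The cost-weighted metric above is simply dollars per query multiplied by P99 latency, so lower is better. A quick sketch using the table's figures:

```python
# Sketch: cost-weighted performance = cost per query x P99 latency (lower is better).
# Inputs come from the cost-per-performance table above.

def cost_x_p99(cost_per_query, p99_sec):
    return cost_per_query * p99_sec

baseline = cost_x_p99(0.094, 82.22)         # Jobs Classic: ~7.73
serverless_std = cost_x_p99(0.138, 231.45)  # Jobs Serverless - Standard: ~31.94

# Relative efficiency vs the baseline, matching the table's last column:
print(f"Jobs Serverless - Standard is {serverless_std / baseline - 1:.0%} less efficient")  # 313%
```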
The data also indicates that P99s skyrocket for Classic Jobs when both Spot and Photon are disabled.
The Photon tax
Following the data, we decided to take a closer look at the different configurations tested. Throughout this article we use Jobs Classic cluster with 8 workers, Photon enabled and Spot with fallback to on-demand as the baseline configuration. We systematically disabled these features to reveal the true cost structure for Jobs Compute Classic.
See the impact that Photon and Spot instances have on the cost and performance of Classic Jobs in the table below.
As a reminder, Photon is Databricks’ native C++ vectorized query acceleration engine and Spot instances are highly discounted unused cloud compute.
| Configuration | DBUs | P99 (sec) | Cost/query | Total cost | Insight |
|---|---|---|---|---|---|
| Photon enabled + Spot enabled | 60 | 82.22 | $0.094 | $10.22 | Baseline |
| Photon disabled + Spot enabled | 42 | 163.10 | $0.067 | $7.81 | Costs 23% less and 2x slower |
| Photon disabled + Spot disabled | 41 | 285.64 | $0.064 | $9.33 | Costs 9% less and 3.5x slower |
Key takeaways:
- Enabling Photon increases the DBU/hr consumption rate. As a result, we found that Photon adds a 43% DBU premium (60 vs 42 DBUs) for 2x performance.
- Disabling Photon lowers total cost by 23% but doubles the P99 latency.
- Spot instances are critical for performance; without them, P99 balloons to 285 seconds.
Interestingly, while we also found that the use of Spot impacts performance, it does not significantly impact unit economics. DBU consumption, cost per query and total costs for Jobs Classic with and without Spot are comparable.
The verdict: Jobs Classic wins every metric
Across nearly 5,000 TPC-DS queries at both scale factors, the pattern favors Jobs Classic over Jobs Serverless.
To recap our key findings:
- Jobs Classic costs 27-78% less than the Serverless variants
- Jobs Classic delivers 1.9-2.8x better P99 performance
- Jobs Classic is far more predictable in terms of performance
- Disabling Photon saves 23% if you can tolerate double the P99 latency
The data shows that Jobs Classic comes out on top compared to the Serverless offering of the same compute type, but what happens when we pit it against serverless SQL warehouses?
Serverless DBSQL warehouses rewrite the economics
Now that we have established that Jobs Classic is more cost-effective than its Serverless counterparts, we need to address the elephant in the room. The TPC-DS SF1000 workload is all SQL, so how does it perform on Databricks compute specifically optimized to run SQL?
| Configuration | P50 (sec) | P90 (sec) | P99 (sec) | DBUs | $/DBU | Cost/query | DBU/query | AWS costs | Total cost |
|---|---|---|---|---|---|---|---|---|---|
| XL warehouse | 0.91 | 1.26 | 2.02 | 8.99 | $0.7 | $0.0662 | 0.0947 | Built-in | $6.30 |
| M warehouse | 0.91 | 1.2 | 2.05 | 3.86 | $0.7 | $0.0284 | 0.0395 | Built-in | $2.70 |
| L warehouse | 0.94 | 1.28 | 2.1 | 4.53 | $0.7 | $0.0334 | 0.0474 | Built-in | $3.20 |
| S warehouse | 1.33 | 6.7 | 33.56 | 2.57 | $0.7 | $0.0189 | 0.0263 | Built-in | $1.80 |
| Jobs Classic | 5.41 | 26.86 | 82.22 | 60 | $0.15 | $0.094 | 0.632 | $1.29 | $10.22 |
As you can see in the table above, every single serverless DBSQL warehouse configuration came out costing less than Jobs Classic, despite a 4.67x higher DBU rate ($0.70/DBU vs $0.15/DBU):
- Small SQL warehouse: Cost 82% less than Jobs Classic
- Medium SQL warehouse: Cost 74% less than Jobs Classic
- Large SQL warehouse: Cost 69% less than Jobs Classic
- XL SQL warehouse: Cost 38% less than Jobs Classic
On top of costing less for the same workloads, DBSQL serverless warehouses are also dramatically faster. When we compare a medium sized SQL warehouse, for example, to Jobs Classic we see the following performance differential:
- P50: 0.91s vs 5.41s (5.9x faster)
- P90: 1.20s vs 26.86s (22.4x faster)
- P99: 2.05s vs 82.22s (40.1x faster)
With a 2-second P99 latency, serverless SQL warehouses are very capable of real-time analytics that a Jobs Classic cluster could not support. For instance, you could run up to 40 sequential queries on a serverless SQL warehouse in the time Jobs Classic compute takes to complete just one at P99.
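The speedup math above can be sketched directly from the percentile figures in the warehouse table:

```python
# Sketch: medium serverless SQL warehouse vs Jobs Classic, per percentile.
# All figures come from the warehouse comparison table above.
warehouse_m = {"p50": 0.91, "p90": 1.20, "p99": 2.05}
jobs_classic = {"p50": 5.41, "p90": 26.86, "p99": 82.22}

for pct in ("p50", "p90", "p99"):
    print(f"{pct}: {jobs_classic[pct] / warehouse_m[pct]:.1f}x faster")

# Sequential warehouse queries that fit inside one Classic P99 window:
print(int(jobs_classic["p99"] / warehouse_m["p99"]))  # 40
```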
The efficiency of serverless DBSQL
| Configuration | DBUs consumed | Total cost | P99 (sec) | DBU consumption efficiency |
|---|---|---|---|---|
| S warehouse | 2.57 | $1.80 | 33.56 | 23x more efficient |
| M warehouse | 3.86 | $2.70 | 2.05 | 15.5x more efficient |
| L warehouse | 4.53 | $3.17 | 2.1 | 13x more efficient |
| XL warehouse | 8.99 | $6.29 | 2.02 | 6.67x more efficient |
| Jobs Classic | 60 | $10.22 | 82.22 | Baseline |
While the cost per DBU for serverless SQL warehouses is higher than for Jobs Classic, the data shows that serverless DBSQL is more cost-effective overall. The key to this seemingly counterintuitive pricing structure is computational efficiency: a medium warehouse consumed just under 4 DBUs compared to the 60 consumed by Jobs Classic, a 15x improvement that more than offsets the higher rate. In fact, if we look at cost per P99 second as an efficiency metric, we see:
- Classic Jobs: $10.22 / 82.22 seconds = $0.124 per P99 second
- Medium serverless DBSQL warehouse: $2.70 / 2.05 seconds = $1.32 per P99 second
On the surface it looks like you're paying $1.32/second vs around $0.12/second, but once you factor in the P99 of 2 seconds vs 82 seconds the comparison flips: the workload takes 40x less time to run. As a result, a medium serverless warehouse costs $2.70 vs $10.22 to run the same queries on Jobs Classic.
The data shows that the same workload on a medium-sized serverless SQL warehouse runs 40x faster and costs 74% less than on classic Jobs clusters.
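The per-second framing can be reproduced from the table's figures; note how the warehouse's higher per-second rate is swamped by its much shorter runtime:

```python
# Sketch: cost per P99 second vs total cost, using the reported figures above.
classic_cost, classic_p99 = 10.22, 82.22      # Jobs Classic
warehouse_cost, warehouse_p99 = 2.70, 2.05    # medium serverless SQL warehouse

print(f"Classic:   ${classic_cost / classic_p99:.3f} per P99 second")      # ~$0.124
print(f"Warehouse: ${warehouse_cost / warehouse_p99:.2f} per P99 second")  # ~$1.32

# What actually matters is the total bill for the same set of queries:
print(f"warehouse savings: {1 - warehouse_cost / classic_cost:.0%}")  # 74%
```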
DBSQL warehouses achieve this efficiency through a myriad of optimizations, but a few features make them particularly competitive:
- Native Photon acceleration included in the base price (not togglable as with Jobs)
- Persistent compilation caches between queries
- Columnar caching optimized for analytical patterns
Serverless DBSQL is also more predictable than Jobs Classic
On top of the efficiency gains, the coefficient of variation (CV), which measures performance predictability, is significantly lower for serverless DBSQL warehouses. The CV is calculated as the standard deviation divided by the mean, which makes it better than the standard deviation alone for comparing performance consistency across compute types.
For example, a medium-sized serverless SQL warehouse has a mean latency of 1.15 seconds and a standard deviation of 0.29 seconds, so the standard deviation is only 25.3% of the mean, indicating very tight, consistent performance. In contrast, Jobs Serverless - Standard had a mean latency of 42.5 seconds with a standard deviation of 86 seconds. The standard deviation here is more than 2x the mean, indicating wild swings where slow queries can take 3-4x longer than average.
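The CV calculation is just the standard deviation divided by the mean; using the two measurements quoted in the text:

```python
# Sketch: coefficient of variation (CV) = stddev / mean.
# The (mean, stddev) pairs are the measurements quoted in the text above.

def coefficient_of_variation(mean, stddev):
    return stddev / mean

warehouse_m_cv = coefficient_of_variation(1.15, 0.29)      # ~0.25: tight, predictable
serverless_std_cv = coefficient_of_variation(42.5, 86.0)   # ~2.02: wild swings

print(f"M warehouse CV:              {warehouse_m_cv:.3f}")
print(f"Jobs Serverless Standard CV: {serverless_std_cv:.3f}")
```

Because CV is dimensionless, it lets us compare a compute plane with 1-second latencies against one with 40-second latencies on an equal footing, which raw standard deviation cannot do.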
| Configuration | CV | P50-P99 spread | Real-world impact |
|---|---|---|---|
| XL warehouse | 0.249 | 2.2x | Guaranteed 2-second SLAs |
| M warehouse | 0.253 | 2.3x | Perfect for user-facing analytics |
| L warehouse | 0.274 | 2.2x | Predictable dashboard refreshes |
| Jobs Classic | 1.236 | 15.2x | Acceptable for batch ETL |
| Jobs Classic: Photon enabled, Spot disabled | 1.527 | 23.3x | Moderately unpredictable |
| Jobs Classic: Photon disabled, Spot enabled | 1.765 | 28.5x | Wide performance swings |
| Jobs Serverless - Performance | 1.799 | 29.6x | Premium price for unstable performance |
| S warehouse | 1.926 | 25.2x | Undersized, resource contention |
| Jobs Serverless - Standard | 2.023 | 42.8x | Wildly unpredictable |
When to use Serverless DBSQL vs Jobs Compute
This Databricks benchmark report clearly shows that serverless SQL warehouses aren't just for analytics; they can be the optimal choice for batch processing too. But it's not a hard-and-fast rule. Here's a breakdown of the workloads that are better suited for DBSQL:
- Pure SQL transformations
- dbt models and SQL-based ETL
- Data quality checks and testing
- Reporting and BI workloads
- COPY INTO and data ingestion
- Large-scale aggregations
Jobs Compute, on the other hand, is still very much necessary for patterns like these:
- R/Python/Scala UDFs and complex logic
- Machine learning workloads
- Streaming applications
- Custom libraries and dependencies
- File format conversions
Conclusion
After running close to 5,000 TPC-DS queries and conducting rigorous statistical analysis, this Databricks benchmark report reveals somewhat surprising patterns that challenge the conventional narrative.
First, we found that Jobs Serverless is less efficient than Jobs Classic despite using fewer resources. In fact, we found that the serverless variants consume 13-38% fewer DBUs than classic Jobs, yet deliver slower performance. This suggests there might be an architectural inefficiency that prevents serverless from matching the performance of classic job clusters even when the comparison should favor it. Combined with the 2.33x higher DBU rate ($0.35 vs $0.15), you’re paying 27-78% more for slower runtimes.
Secondly, we found that serverless SQL warehouses are the real bargain. Despite a higher price point ($0.70 per DBU versus Jobs' $0.15, a 4.67x premium), they end up costing less because they consume far fewer DBUs.
Overall, our TPC-DS benchmark report reveals an uncomfortable truth: serverless compute isn't always an upgrade. For the vast majority of production data workloads, such as predictable ETL pipelines, scheduled analytics and SLA-driven processing, Classic Jobs deliver better performance, lower costs and more reliable execution than the serverless options.
The real winner? Serverless SQL warehouses. They're not just for BI anymore. For analytical SQL workloads, they're 7x faster than any Jobs compute type at equal or lower cost.
