Top 5 Databricks features announced at DAIS 2025

Databricks introduced a series of new features and platform updates at the recent Data + AI Summit, most of which aim to unify data and AI workloads. The announcements centered on three primary themes:

  1. Integrating disparate data systems

  2. Operationalizing AI development

  3. Boosting developer velocity

The sheer number of features announced by data clouds at summits is staggering, making it challenging to discern what to pay attention to and what still needs time to evolve. We already published a recap of the features announced at Snowflake Summit; now we tackle Databricks.

In this post, we provide an overview of the five most impactful features announced at Databricks Data + AI Summit 2025 for data engineers and platform leaders. Let’s dive right in.

1. Lakebase: a low-latency, high-throughput database for AI applications

What it is: Lakebase is Databricks’ fully managed Postgres service, now in Public Preview. It functions as an integrated Online Transaction Processing (OLTP) database engine that runs on Databricks. Built on the serverless Postgres technology Databricks gained through its recent acquisition of Neon, Lakebase decouples database compute from storage so each can scale independently within the lakehouse platform.

Designed for low-latency (<10 ms), high-throughput (>10,000 QPS) transactional workloads, Lakebase is built for real-time AI use cases and rapid iteration. Its serverless databases are ephemeral and launch in under a second, so users pay only for the compute they need and fall back on low-cost data lake storage the rest of the time.

Technical implications: The primary function of Lakebase is to eliminate the requirement for separate ETL/ELT processes for operational data, i.e., the real-time transactional data that drives daily business activities. With Lakebase, applications can write directly to a Postgres endpoint within Databricks, and that data is immediately available for analytical queries in the lakehouse. This zero-ETL integration reduces data latency and architectural complexity. That said, there are possible implications for query performance if the raw data needs to be cleaned before each run.
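To make the zero-ETL path concrete, here is a minimal sketch of an application writing to a Lakebase Postgres endpoint with a standard Postgres driver. Because Lakebase is Postgres-compatible, any Postgres client should work this way; the hostname, database, credentials and table below are placeholders for illustration, not actual Lakebase values.

```python
# Minimal sketch: an application writes operational data to a Lakebase
# Postgres endpoint using a standard Postgres driver (psycopg2).
# Host, database, credentials and table names are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="<lakebase-instance-host>",   # hypothetical instance hostname
    dbname="app_db",                   # hypothetical database
    user="app_user",
    password="<token-or-password>",
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # Transactional write from the application (the OLTP path); the same
    # rows are then queryable from the lakehouse without a separate ETL job.
    cur.execute(
        "INSERT INTO orders (order_id, customer_id, amount) VALUES (%s, %s, %s)",
        ("o-1001", "c-42", 129.99),
    )

conn.close()
```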

Lakebase also supports copy-on-write branching, allowing engineers to create zero-cost clones of a production database for CI/CD testing, schema migration validation, or isolated AI agent state stores.

Lastly, integrating Lakebase with Unity Catalog ensures all transactional data inherits the existing governance, security and auditing policies. This is designed to support real-time AI applications that require both low-latency read/write access to operational data and access to large-scale historical data for training and analysis.

Databricks health check dashboard

A free, comprehensive dashboard to assess the health of your Databricks workspace.

2. Unity Catalog with full Iceberg support for flexibility with governance

What it is: Databricks added full, native support for Apache Iceberg to Unity Catalog (UC). This update positions UC as an open Iceberg catalog: it supports both read and write access to Iceberg tables from external compute engines like Snowflake, Flink, Trino and DuckDB in a single governed and federated environment. It also enables teams to define access controls, data masking and lineage once in UC and have them enforced across every connected engine.

Technical implications: This feature resolves the Delta Lake vs. Iceberg format conflict from a governance standpoint. Engineers can now use either format based on preference or workload requirements, while managing workloads under a single catalog, enabling complex, multi-engine pipelines. 

For example, a Flink job can write to a UC-managed Iceberg table, which is then used for training by a Databricks ML workload, all without duplicating data sets. Managed Iceberg tables in UC also leverage Databricks’ Predictive Optimization capability, which automates maintenance operations like file clustering, unused data deletion and data file size optimization to improve query performance.
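As an illustration of this multi-engine pattern, here is a minimal sketch of an external client reading a UC-managed Iceberg table through an Iceberg REST catalog endpoint using PyIceberg. The endpoint path, token, catalog name and table identifier are assumptions for illustration; check the Unity Catalog documentation for the exact values in your workspace.

```python
# Minimal sketch: an external engine (here PyIceberg) reads a Unity
# Catalog-managed Iceberg table through an Iceberg REST catalog endpoint.
# The URI, token, warehouse and table identifier are assumed placeholders.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "unity",
    **{
        "type": "rest",
        "uri": "https://<workspace-host>/api/2.1/unity-catalog/iceberg",  # assumed endpoint
        "token": "<databricks-personal-access-token>",
        "warehouse": "<uc-catalog-name>",
    },
)

# Read a governed table without copying data out of the lakehouse.
table = catalog.load_table("sales.orders")   # hypothetical schema.table
df = table.scan(limit=100).to_pandas()
print(df.head())
```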

Overall, Unity Catalog with full Iceberg support creates a more flexible and future-proof architecture. It enables organizations to adopt new compute tools without migrating data or re-implementing governance and federation policies.

3. Lakeflow and Lakeflow Designer democratize ETL/ELT pipelines

What it is: Lakeflow is a data engineering solution covering data ingestion, transformation (ETL/ELT) and orchestration. Several Lakeflow updates were announced at Databricks Data + AI Summit 2025; the most notable is Lakeflow Designer.

Lakeflow Designer bridges the gap between data engineers and business users with an AI-powered, no-code data pipeline builder in a governed environment. Non-developers can use a visual editor to build data pipelines, using structured or unstructured data (like screenshots or docs) as data sources. They can even push these pipelines to production and orchestrate external tasks, such as triggering a dbt Cloud job or a Power BI refresh, all without writing a single line of code. 

Technical implications: A notable design choice is that both the no-code Lakeflow Designer and the code-first Lakeflow Declarative Pipelines approach (formerly Delta Live Tables) produce the same result: an auditable data pipeline governed by Unity Catalog. In fact, a pipeline prototyped in the UI can be checked into source control and managed by engineers using standard CI/CD practices. This approach democratizes a core data engineering function without sacrificing governance or maintainability.
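For reference, here is a minimal sketch of what the code-first path can look like, assuming the standard dlt Python API used by Lakeflow Declarative Pipelines; the source path, table names and data-quality expectation are hypothetical.

```python
# Minimal sketch of the code-first path (Lakeflow Declarative Pipelines,
# formerly Delta Live Tables). Source path, table names and the expectation
# are hypothetical; the equivalent pipeline could also come from the
# no-code Lakeflow Designer and land under the same Unity Catalog governance.
import dlt
from pyspark.sql import functions as F

# `spark` is provided by the pipeline runtime.

@dlt.table(comment="Raw orders ingested from cloud storage")
def orders_raw():
    # Incremental ingestion with Auto Loader (cloudFiles).
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/landing/orders")   # hypothetical source path
    )

@dlt.table(comment="Cleaned orders ready for analytics")
@dlt.expect_or_drop("valid_amount", "amount > 0")
def orders_clean():
    return (
        dlt.read_stream("orders_raw")
        .withColumn("ingested_at", F.current_timestamp())
    )
```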

4. Agent Bricks: Automatically optimized business-aware AI agents

What it is: Agent Bricks is a framework for building and deploying production-grade AI agents. It provides a no-code interface to define an agent's task, automatically generate evaluations and optimize performance against cost. 

The framework comes with four pre-built agents that enable the following tasks:

  1. Information extraction from unstructured data into structured tables

  2. Model specialization to perform custom text generation (summarization, classification, text transformation)

  3. Knowledge assistant that transforms internal docs and assets into chatbots 

  4. Multi-agent supervisor to manage workflows and bring Genie spaces and agents together 

Technical implications: Agent Bricks is designed to de-risk investments in generative AI by addressing the primary blockers to production: quality assurance and cost control. It automatically evaluates and optimizes agents for domain-specific tasks and tests multiple models and configurations for cost and performance, so that users can choose between cost-optimized and performance-optimized models. In addition, it includes a human-in-the-loop feedback mechanism where corrections from subject matter experts are used to further refine the models and improve agent output.

Overall, Agent Bricks reduces the time to value for deploying AI agents by automating labor-intensive processes and including four pre-built agents. It enables business users to bring AI agents to production by declaring the task and the expected outcome in natural language and connecting data sources (structured or unstructured).


5. Next-Gen DBSQL offers automatic performance improvement

What it is: Databricks announced a major upgrade to the Databricks SQL (DBSQL) Serverless warehouse. It provides up to a 25% performance improvement for BI and ETL workloads at no additional cost and with no configuration changes.

Technical implications: The performance improvements, including runtime reductions for SQL-based ETL jobs, are driven by Predictive Query Execution and Photon Vectorized Shuffle.

Predictive Query Execution (PQE) monitors running tasks in real time and is an evolution of Adaptive Query Execution, which could only re-plan after a query stage had completed. PQE can detect issues like data skew or memory spills as they occur and re-plan the query stage immediately.

The Photon Vectorized Shuffle is a new addition to the Photon vectorized query engine designed to accelerate Spark workloads. Photon Vectorized Shuffle improves throughput for CPU-bound workloads, such as large joins and aggregations, by optimizing data movement and processing within the CPU cache.

This update aims to reduce the costs associated with analytics workloads, as faster queries consume less compute. The increased query stability from PQE also improves reliability, reducing operational overhead.
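Because the gains require no configuration changes, existing clients benefit as-is. The sketch below shows a query submitted to a serverless DBSQL warehouse with the databricks-sql-connector; the hostname, warehouse HTTP path, token and table are placeholders.

```python
# Minimal sketch: running a query against a serverless DBSQL warehouse with
# the databricks-sql-connector. No query or client changes are needed to
# pick up the next-gen engine improvements; connection values are placeholders.
from databricks import sql

with sql.connect(
    server_hostname="<workspace-host>",
    http_path="/sql/1.0/warehouses/<warehouse-id>",   # placeholder warehouse
    access_token="<databricks-personal-access-token>",
) as conn:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT region, SUM(amount) AS revenue FROM sales.orders GROUP BY region"
        )
        for row in cursor.fetchall():
            print(row)
```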

Bonus: Databricks free edition for prototyping and learning

What it is: A no-cost, serverless-only version of Databricks with usage quotas. It is designed for individual learning and experimentation. It provides access to core features like Databricks SQL, Lakeflow and the AI/BI Genie, but does not include enterprise-grade features such as SSO, custom compute configurations, Lakebase or Agent Bricks.

Technical implications: The free edition provides a sandboxed environment for engineers to prototype projects, learn new platform features and even create small-scale proofs of concept without going through a procurement process. The environment does not expire and is paired with free access to self-paced Databricks Academy courses, enabling ongoing learning and development. 

Top 5 DAIS 2025 features announced for engineers

Feature name | Core functionality | Strategic value
Lakebase | Postgres-compatible OLTP engine integrated into the lakehouse, eliminating the need for separate OLTP ETL and enabling real-time AI-driven applications. | Consolidates transactional and analytical systems, simplifying architecture and reducing vendor overhead.
Unity Catalog & Iceberg support | Provides an Iceberg REST Catalog API endpoint, enabling read/write access from external engines like Flink, Trino and Snowflake. | Establishes a more flexible, future-proof architecture that enables easy cross-platform workflows.
Lakeflow & Lakeflow Designer | Unified ETL/ELT development environment; the no-code Lakeflow Designer enables business users to build UC-governed pipelines. | Bridges the skill gap between analysts and engineers, increasing development velocity under a unified governance model.
Agent Bricks | No-code framework for building and optimizing AI agents using automated evaluation and cost-quality analysis. | Provides developers and non-developers with a structured, governed path to move AI agents from prototype to production, mitigating risk and cost.
Next-Gen DBSQL Warehouse | Automatic query performance increase of up to 25% via Predictive Query Execution and vectorized shuffle. | Lowers costs associated with BI and SQL workloads through improved performance without requiring configuration changes.
Bonus: Databricks Free Edition | Serverless-only environment with usage quotas for learning and prototyping on the core platform. | Lowers the barrier for individual skill development and team-based evaluation of platform capabilities.

 

Conclusion

These five new features and capabilities announced at Databricks Data + AI Summit 2025 indicate a clear strategic direction toward a unified platform that abstracts away underlying complexities. 

For data engineers and platform leaders, these updates present opportunities to consolidate disparate systems (Lakebase, Unity Catalog for Iceberg), standardize development workflows across skill levels (Lakeflow) and operationalize AI with governed frameworks (Agent Bricks). 

The improvements to core offerings, such as Spark 4.0 and next-gen DBSQL, along with the introduction of a free tier, offer direct avenues for improving developer productivity and reducing costs.


Noa Shavit, Senior Product Marketing Manager, Capital One Software

Noa is a full-stack marketer specializing in infrastructure products and developer tools. She drives adoption and growth for technical products through strategic marketing. Her expertise lies in bridging the gap between innovative software and its users, ensuring that innovation translates into tangible value. Prior to Capital One, Noa led marketing and shaped GTM motions for Sync Computing, Builder.io, and Layer0.