Databricks Lakebase: Unifying OLTP and OLAP in the Lakehouse

How serverless PostgreSQL breaks down the transactional-analytical divide

For decades, data teams have faced a fundamental architectural choice: optimize for fast transactions or deep analytics, but rarely both. OLTP databases like PostgreSQL excel at millisecond writes and ACID guarantees, while data warehouses and lakehouses power complex queries across terabytes. The result? Fragmented architectures with nightly ETL jobs, data duplication, governance gaps and the operational overhead of managing multiple systems.

|                  | OLTP (transactional)                           | OLAP (analytical)                               |
|------------------|------------------------------------------------|-------------------------------------------------|
| Goal             | Run the business                               | Analyze the business                            |
| Focus            | Recent operational data                        | Historical data                                 |
| Data pattern     | Frequent, millisecond-level writes and updates | Bulk loads with large-scale, read-heavy queries |
| Response times   | Milliseconds                                   | Seconds to hours                                |
| Data volumes     | MBs to GBs                                     | GBs to TBs and beyond                           |
| Query complexity | Simple queries                                 | Complex queries                                 |

The growing pressure points

Modern enterprise applications face demands that expose this split:

  • Real-time operational decisioning: Credit approvals, fraud detection and inventory allocation need both transactional consistency and instant access to historical patterns.
  • Concurrent user workflows: When hundreds of analysts and business users interact with the same datasets, write contention and lock escalation become critical.
  • Governance at scale: Managing access policies, audit trails and data lineage across separate OLTP and OLAP systems multiplies complexity and risk.
  • Multi-row, multi-table ACID operations: Business logic that requires BEGIN ... COMMIT across multiple tables (ledgers, inventory, audit logs) is first-class in Postgres but is orchestrated piecemeal in most lakehouses.
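
To make the last point concrete, here is a minimal sketch of the kind of multi-table transaction Postgres handles natively. The ledger and audit tables are hypothetical, used for illustration only:

-- Hypothetical funds transfer: debit, credit and audit entry commit together or not at all
BEGIN;

UPDATE ledger SET balance = balance - 250.00 WHERE account_id = 1;
UPDATE ledger SET balance = balance + 250.00 WHERE account_id = 2;

INSERT INTO audit_log (account_id, action, amount)
VALUES (1, 'transfer_out', 250.00),
       (2, 'transfer_in',  250.00);

COMMIT;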

Enter Databricks Lakebase

Lakebase fundamentally changes the equation. It is a fully managed, serverless PostgreSQL service that runs inside the Databricks platform. Now generally available, it brings genuine OLTP capabilities into the lakehouse while maintaining the analytical power users rely on.

Designed for low-latency (<10 ms) and high-throughput (>10,000 QPS) transactional workloads, Lakebase is built for real-time AI use cases and rapid iteration.

What makes Lakebase different

Native platform integration

  • No VPC peering, PrivateLink endpoints or firewall rules to configure
  • Databricks users and groups automatically map to Postgres roles
  • OAuth-backed identity flows seamlessly between OLTP and OLAP workloads
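
Because Databricks identities surface as Postgres roles, access control can be expressed with ordinary Postgres grants. A minimal sketch, assuming a Databricks group named data-analysts is mirrored as a role of the same name (the group name is hypothetical):

-- Hypothetical: grant read-only access to the role mirroring a Databricks group
GRANT USAGE ON SCHEMA public TO "data-analysts";
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "data-analysts";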

Serverless economics

  • Auto-scales under load, scales to zero during idle periods
  • Eliminates idle database costs and over-provisioned capacity
  • Pay only for actual compute consumption

Zero-maintenance operations

  • Automated minor releases, security patches and failovers
  • Rolling updates with no downtime windows
  • No replica management or backup orchestration

True PostgreSQL semantics

  • Row-level locks, BEGIN ... COMMIT transactions
  • Support for extensions like pgAudit
  • Logical decoding for CDC workflows
  • Full SQL compatibility
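
A minimal sketch of the pgAudit and logical-decoding points, using standard Postgres mechanics (this assumes the instance allows installing pgAudit and creating replication slots; the slot and plugin names are illustrative):

-- pgAudit for audit logging (availability and required privileges depend on the instance)
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Logical decoding for CDC: create a slot with the built-in test_decoding plugin
SELECT pg_create_logical_replication_slot('cdc_demo_slot', 'test_decoding');

-- Peek at captured changes without consuming them
SELECT * FROM pg_logical_slot_peek_changes('cdc_demo_slot', NULL, NULL);

-- Drop the slot when finished so WAL is not retained indefinitely
SELECT pg_drop_replication_slot('cdc_demo_slot');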

The unified architecture

Here is how Databricks Lakebase bridges the gap between high-speed enterprise applications and the broader BI and analytics platform:

Architecture layout of Lakebase.

Lakebase Postgres captures live data and makes it instantly available to the Unity Catalog Metastore through a secure internal channel. Transactional records are registered automatically, eliminating the need for complex, manual ETL to move them into the lakehouse. The same data is then immediately ready for BI analytics, downstream pipelines and ML training, all under a single governance framework.

Real-world AI use cases

The traditional separation of transactional and analytical systems creates fundamental barriers. This is especially true when building AI systems, as seen in the examples below. 

Example 1: Credit card fraud detection

ML models require both current transactional data and historical aggregates. Lakebase enables this pattern natively. Let’s take a look at how the split-path architecture can work in a credit card fraud detection system. 

When building a fraud prevention system, you want to score every credit card transaction for fraud risk in real time by combining current transaction details with the cardholder’s historical spending patterns. This way, you can approve legitimate purchases within 50ms while blocking suspicious transactions before they complete, minimizing both fraud losses and false declines. To do this, you need to query the live operational database and the historical analytical tables in a single request.

Transactional features (OLTP: Lakebase): 

  • Current transaction amount, merchant, location 
  • Time since last transaction 
  • Current account balance
  • Active card status (locked/unlocked) 

Analytical features (OLAP: Unity Catalog):

  • Average transaction amount over the last 30/90 days 
  • Merchant category frequency patterns 
  • Geolocation velocity (distance between transactions) 
  • Historical fraud rate for merchant type

Example implementation

-- Single query joins live OLTP with historical OLAP
SELECT
  t.transaction_id,
  t.amount,
  t.merchant_id,
  t.card_status,                    -- OLTP: current state
  h.avg_30d_amount,                 -- OLAP: 30-day average
  h.merchant_category_frequency,    -- OLAP: behavioral pattern
  h.historical_fraud_rate           -- OLAP: risk score
FROM lakebase_cat.public.transactions t
JOIN analytics_cat.features.customer_history h
  ON t.customer_id = h.customer_id
WHERE t.transaction_id = ?

Result: Sub-50ms feature retrieval for real-time inference, with no separate feature store infrastructure to maintain.
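
Retrieval latency at that level also depends on physical design on the OLTP side. A minimal sketch of supporting indexes for the hypothetical tables above (assuming transaction_id is not already the primary key):

-- Hypothetical indexes supporting point lookups in the fraud-scoring path
CREATE INDEX IF NOT EXISTS idx_transactions_txn_id
  ON lakebase_cat.public.transactions (transaction_id);
CREATE INDEX IF NOT EXISTS idx_transactions_customer_id
  ON lakebase_cat.public.transactions (customer_id);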

Example 2: Autonomous inventory optimization agent

Agentic AI systems need to both read operational state and write actions back atomically. Lakebase provides ACID transactions for agent actions while enabling analytical context. Here's an example of an AI agent that monitors inventory levels, predicts demand and automatically creates purchase orders:

-- Agent reads current inventory + demand forecast
BEGIN;

-- Step 1: Check current state (OLTP)
SELECT product_id, current_stock, reorder_point
FROM lakebase_cat.public.inventory
WHERE current_stock < reorder_point
FOR UPDATE;  -- Lock rows to prevent race conditions

-- Step 2: Query demand forecast (OLAP)
SELECT product_id, predicted_demand_7d
FROM analytics_cat.forecasts.demand_predictions
WHERE forecast_date = CURRENT_DATE;

-- Step 3: Agent decides and writes purchase order (OLTP)
INSERT INTO lakebase_cat.public.purchase_orders 
  (product_id, quantity, supplier_id, created_by, status)
VALUES 
  (12345, 500, 'SUP-001', 'ai-agent-v2', 'pending');

-- Step 4: Update inventory reservation (OLTP)
UPDATE lakebase_cat.public.inventory
SET reserved_stock = reserved_stock + 500
WHERE product_id = 12345;

-- Step 5: Log agent action for monitoring (OLTP)
INSERT INTO lakebase_cat.public.agent_audit_log
  (agent_id, action_type, decision_confidence, metadata)
VALUES 
  ('ai-agent-v2', 'create_purchase_order', 0.94, 
   '{"product": 12345, "reasoning": "demand_spike_detected"}');

COMMIT;

Why ACID matters: The entire sequence (read inventory → check forecast → create order → update reservation → log) must succeed or fail atomically. Lakebase ensures no partial updates, no orphaned orders and no lost audit trails.

Example 3: Customer support AI assistant

RAG systems need to combine vector similarity search with structured data lookups. Lakebase enables a hybrid pattern, like this one:

-- Step 1: Store customer inquiry in OLTP
INSERT INTO lakebase_cat.public.support_tickets
  (ticket_id, customer_id, inquiry_text, status, created_at)
VALUES (?, ?, 'My credit card was declined', 'open', CURRENT_TIMESTAMP);

-- Step 2: Retrieve customer context (OLTP + OLAP)
WITH customer_context AS (
  SELECT 
    c.customer_id,
    c.account_status,           -- OLTP: current state
    c.current_balance,          -- OLTP: live balance
    h.recent_transactions,      -- OLAP: transaction history
    h.past_support_issues       -- OLAP: issue patterns
  FROM lakebase_cat.public.customers c
  JOIN analytics_cat.features.customer_360 h
    ON c.customer_id = h.customer_id
  WHERE c.customer_id = ?
)
SELECT * FROM customer_context;  -- hand this context to the retrieval and LLM steps below

-- Step 3: Vector search finds similar resolved tickets (OLAP)
-- Step 4: LLM generates response with full context
-- Step 5: Store AI-generated response in OLTP
UPDATE lakebase_cat.public.support_tickets
SET 
  ai_response = ?,
  response_confidence = ?,
  status = 'ai_resolved'
WHERE ticket_id = ?;

Result: A single transaction ensures inquiry, context retrieval, AI response and status update are atomic. Unity Catalog governance applies to both customer data and AI-generated responses.

Business impact

By bringing OLTP capabilities into the lakehouse, Lakebase enables teams to realize the following benefits:

Unified governance

  • Single OAuth flow authenticates access to both OLTP and OLAP
  • Unity Catalog policies (table, column, row filters, masks) apply uniformly
  • One audit trail instead of reconciling multiple systems
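
As a sketch of the second point, Unity Catalog row filters and column masks are declared once and enforced wherever the table is read. The example below uses the customer_360 table from earlier with hypothetical region and balance columns; the syntax is standard Unity Catalog SQL, run from a SQL warehouse or notebook:

-- Hypothetical row filter: non-admins only see rows for the US region
CREATE OR REPLACE FUNCTION analytics_cat.features.us_only(region STRING)
  RETURN is_account_group_member('admins') OR region = 'US';

ALTER TABLE analytics_cat.features.customer_360
  SET ROW FILTER analytics_cat.features.us_only ON (region);

-- Hypothetical column mask: only the finance group sees raw balances
CREATE OR REPLACE FUNCTION analytics_cat.features.mask_balance(balance DECIMAL(15,2))
  RETURN CASE WHEN is_account_group_member('finance') THEN balance ELSE NULL END;

ALTER TABLE analytics_cat.features.customer_360
  ALTER COLUMN balance SET MASK analytics_cat.features.mask_balance;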

Operational efficiency

  • Cut down on nightly ETL jobs and data duplication
  • Reduce infrastructure footprint and associated operational overhead
  • Auto-scaling eliminates capacity planning guesswork

Cost optimization

  • Pay only for active compute during business hours
  • Scale-to-zero during off-peak periods
  • Retire dedicated OLTP infrastructure and associated licensing costs

Developer velocity

  • Application teams get standard PostgreSQL APIs
  • Analytics teams access live data through familiar interfaces
  • No impedance mismatch between transactional and analytical development

How to get started

Follow these three steps to get up and running on Databricks Lakebase Postgres: 

1. Create a Lakebase instance

Compute → OLTP Database → Create Database Instance
- Name: customer_db
- Region: (match your workspace)
- Click Create (~10 seconds)

2. Register in Unity Catalog

Instance card → Catalogs → Create managed database catalog
- Catalog name: customer_cat
- Click Create

3. Immediate use

CREATE TABLE customer_cat.public.accounts (
  account_id BIGINT PRIMARY KEY,
  customer_id BIGINT NOT NULL,
  balance NUMERIC(15,2),
  last_modified TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO customer_cat.public.accounts VALUES (1, 100, 5000.00, NOW());

-- Same table accessible to SQL Warehouses, notebooks, ML pipelines
SELECT * FROM customer_cat.public.accounts;

When to consider alternatives

Lakebase may not be the best fit if you:

  • Require horizontally scaled, distributed writes beyond vertical PostgreSQL scaling
  • Have compliance requirements preventing managed database services
  • Need sub-second replication to geographically distributed regions

For these scenarios, consider pairing Lakebase with Delta Lake's native ACID capabilities or exploring distributed SQL alternatives.

Conclusion

The traditional OLTP/OLAP split is a source of operational complexity, governance gaps and delayed insights. Databricks Lakebase breaks down this divide by bringing serverless Postgres directly into the lakehouse. You get millisecond transactional performance with full ACID guarantees alongside elastic analytics and ML, all governed by a single identity and access framework and with zero added infrastructure to manage.

For production AI in particular, the separation of OLTP and OLAP systems creates concrete barriers:

  • Stale features due to ETL latency 
  • Complex pipelines to synchronize batch and streaming 
  • Train/serve skew from using disparate data sources 
  • Missing transactional integrity for actions taken by AI agents 
  • Fragmented governance across operational, analytics, and ML systems

Lakebase and, more broadly, Unity Catalog eliminate these barriers by creating a unified platform where: 

  • Real-time operational state and historical analytics coexist
  • Features are computed on-demand with sub-100ms latency 
  • Training and inference use identical data sources 
  • AI agents execute multi-step workflows with ACID guarantees 
  • Unity Catalog enforces consistent governance across the entire AI lifecycle 

For enterprise data teams juggling real-time operational systems and deep analytical workloads, Lakebase represents what the lakehouse has always promised: one platform, one governance model and zero trade-offs between transactions and analytics. 

Sachin Seth, Technical Writer

Sachin Seth is a data platform architect and analytics product builder known for his deep work benchmarking Databricks compute and delivering high-performance data applications at scale. He develops full-stack analytics solutions—ranging from billion-point time-series engines to portfolio optimization apps and real-time financial dashboards—blending Databricks, Rust, Arrow and modern web technologies. He writes to bring clarity, measurement and engineering rigor to the rapidly evolving world of Databricks and modern data platforms.