Lakehouse Compute Engine: Query / ETL / Real-time Ingest

Query, ETL, Real-time Ingest for Iceberg, Delta, Hudi, and Hive

Run anywhere, Hybrid-ready • Agentic lakehouse compute

Cloud, On Premise, Hybrid. AI-native lakehouse compute engine.

Faster Queries, ETL and Ingest—without migrating from Databricks, Snowflake, BigQuery, and Fabric. 60% Cost Savings. Runs Anywhere.

Save 60% without migrating from Databricks, Snowflake, Redshift, or Fabric. 10x faster. Across Cloud, On-Prem, and Hybrid.

e6data’s pioneering Atomic Architecture™ never forces you to trade off cost efficiency, performance, and freedom from lock-in. It works without migration across all major platforms, architectures, catalogs, governance tools, and deployment scenarios (cloud, on-prem, hybrid, air-gapped, sovereign).

Works On All Data Platforms And Architectures

10x faster on production workloads

$1M-$10M in 3-year cost savings per use case

No migration from your existing data platform

DESIGN PRINCIPLES

Bringing proven software architecture principles into compute engine design

Existing engines trace their architectures to when they were first created (the early 2010s). The cloud was in its infancy, and object-store-native architectures that separated storage from compute were disrupting the Hadoop and early-Redshift-style engines of the time. This trend is still playing out, with all major engines now embracing object storage.

e6data architecture

e6data’s pioneering Atomic Architecture™ brings contemporary software engineering and distributed systems principles into how compute engines are designed:

Decentralized, disaggregated, and composed of granular services that communicate via contracts
Each service can be controlled, sized, and scaled independently of the others
Sizing and scaling are both atomic (i.e., in +/- 1 vCPU increments), in contrast to the large step jumps (e.g., L -> L x 2) that T-shirt sizing models involve
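The difference between atomic and step-jump sizing can be illustrated with a small sketch. It is not e6data's scheduler: the function names and the 8 vCPU starting size are hypothetical, and the only details taken from the text above are the 1 vCPU increment and the doubling ladder (L -> L x 2).

```python
def step_jump_vcpus(demand: int, base: int = 8) -> int:
    """Smallest size on a doubling ladder (base, 2*base, 4*base, ...) that covers demand.

    The base of 8 vCPUs is an assumption made only for this illustration.
    """
    size = base
    while size < demand:
        size *= 2
    return size


def atomic_vcpus(demand: int) -> int:
    """With 1 vCPU increments, provisioned capacity can track demand directly."""
    return max(demand, 1)


# Hypothetical demand levels, in vCPUs, as load grows over a day.
for demand in [6, 9, 17, 33, 40]:
    stepped = step_jump_vcpus(demand)
    atomic = atomic_vcpus(demand)
    print(
        f"demand={demand:>3}  step-jump={stepped:>3}  atomic={atomic:>3}  "
        f"idle under step-jump={stepped - demand}"
    )
```

In this toy model, going from 8 to 9 vCPUs of demand doubles the provisioned size under the ladder, while the atomic model adds a single vCPU.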

w/o e6data

Architecture diagram showing a standard lakehouse query engine workflow with multiple applications, table formats, governance solutions, data catalogs, deployed across clouds, regions, and on-premises.

w/ e6data

Architecture diagram showing e6data’s lakehouse query engine workflow with multiple applications, table formats, governance solutions, data catalogs, deployed across clouds, regions, and on-premises, with zero migration from existing setup.

Trusted by Data Teams at

“We achieved 1,000 QPS concurrencies with p95 SLAs of < 2s on near real-time data & complex queries. Other industry leaders couldn’t meet this even at a far higher TCO.”

Chief Operating Officer

“We’ve been impressed with e6data’s performance, concurrency, and granular scalability on our resource-intensive workloads.” 

Head of Platform Engineering

Technology

Why is e6data 10x faster at 60% lower cost?

You size the cluster or virtual warehouse (base size) based on query volume, complexity, and concurrency, as well as your target response time (e.g., p95 latency). Choose the size that achieves optimal cluster utilization for the given load and SLA.
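As a rough illustration of the sizing reasoning above, the sketch below estimates a base size from query rate and per-query CPU cost. It is a back-of-the-envelope model, not e6data's sizing method: the function name, the 70% utilization target, and the example numbers are all assumptions, and it ignores queueing effects and tail latency, which matter for a p95 target.

```python
import math


def required_vcpus(
    qps: float,
    cpu_seconds_per_query: float,
    target_utilization: float = 0.7,  # hypothetical headroom target
) -> int:
    """Estimate vCPUs needed so that average utilization stays at the target."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Average number of vCPUs kept busy by the offered load.
    busy_vcpus = qps * cpu_seconds_per_query
    return math.ceil(busy_vcpus / target_utilization)


# Example with made-up numbers: 8 queries/s, 2.5 CPU-seconds per query.
print(required_vcpus(qps=8, cpu_seconds_per_query=2.5))
```

A lower utilization target buys headroom for bursts at the price of idle capacity, which is the trade-off the base size is meant to balance.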

w/o e6data

Legacy Centralized, VM-centric architectures

Depend on a single coordinator node — creating bottlenecks, single points of failure, and expensive step-jump scaling. Even slight increases in workloads trigger large cost spikes and SLA misses.

w/ e6data

e6data's Decentralized, Kubernetes-native architecture

Scales stateless services granularly, in increments as small as 1 vCPU. Result: 10x faster queries, consistently met SLAs, and a predictable 60% lower TCO at petabyte scale.

Comparison

Atomic vs Step-Jump Scaling: Cost & QPS Under Production Load

Line graph comparing legacy step-jump scaling with e6data’s atomic scaling across fluctuating query loads; cost labels show steep jumps for legacy ($25 → $100) versus granular increments for e6data ($15 → $74).

Benchmarks

3.09x faster vs. legacy lakehouse engine (TPC-DS, Delta, 8 QPS)

11.02x faster vs. legacy query engine (TPC-DS, Fabric, 30 cores)

1.58x faster, query type: comparison (TPC-DS, Delta, AWS, XS)

67.64% lower cost vs. legacy lakehouse engine (TPC-DS, Delta, 8 QPS)

7.04x faster vs. legacy query engine (TPC-DS, Iceberg, XS)

1.80x faster, query type: logical (TPC-DS, Delta, AWS, XS)

3.08x lower p99 latency vs. legacy lakehouse engine (TPC-DS, Delta, 8 QPS)

60.05% lower cost, e6data + Fabric (TPC-DS, Fabric, 30 cores)

1.20x faster, high concurrency (TPC-DS, Delta, AWS, XS)

Learn more in docs

Use Cases

Run your most resource-intensive SQL and AI workloads

Get predictable SLAs, instant query responses, and radically lower compute costs—all with no query rewrites or app changes.

Packaged Analytics

Deliver embedded, multi-tenant analytics seamlessly within your SaaS applications. Gain 10x faster performance at scale while reducing operational complexity and cutting infrastructure costs by up to 60%.

Interactive Analytics

Enable real-time dashboards and dynamic data exploration at massive scale. Deliver sub-2-second response times at 1,000+ QPS with consistent SLAs and a consistent user experience.

Ad-hoc Analytics

Run complex ad-hoc queries 10x faster across diverse data sources (object storage, OLAP, data streams, and more) from a unified engine. Achieve zero SLA failures caused by poorly optimized queries or resource constraints.

Scheduled Analytics

Run frequent, high-volume scheduled workflows with 99.99% reliability, without downtime, data delays, or compute cost overruns, even with rapid refresh cycles.

Real-Time Ingest

Stream data into your lakehouse with sub-second latency. Skip Flink, ETL, and pipeline overhead. Query fresh events instantly using SQL or Python—no shuffle, no joins, no delay between ingestion and analysis.

Vector Search

Run semantic search on unstructured data using built-in cosine similarity. No vector DBs, no retrieval pipelines. Query text like structured rows with SQL—fast, scalable, and lakehouse-native for instant, AI-powered insights.
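For readers unfamiliar with the metric, here is a minimal sketch of ranking by cosine similarity in plain Python. It shows the math only and is not e6data's SQL function or API. The three-dimensional "embeddings" and document titles are made up for illustration; real embeddings come from a model and have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy document embeddings (hypothetical values).
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.8, 0.3],
    "return an item": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the search text

# Rank documents from most to least similar to the query.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]), reverse=True)
for name in ranked:
    print(f"{name}: {cosine_similarity(query, docs[name]):.3f}")
```

Because cosine similarity depends only on direction, not vector length, documents of different sizes can be compared on the same scale.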