33% faster performance on e6data + Fabric compared to Fabric native compute engine
Today, we’re excited to announce e6data's integration with Microsoft Fabric, which improves MS Fabric performance with a new way to execute your queries on OneLake data. This integration is for you if you’ve been looking for faster, high-concurrency query execution on Fabric or easier access to OneLake data for AI workloads. In this post, we’ll introduce what this integration means, how it works (with a peek at the architecture), performance insights, and the cool use cases it enables.
If you are interested in data lakehouse and compute engines, and want to gain early access to some of the features mentioned below, or learn more about our upcoming work, reach out to us here, and we’ll be in touch soon!
e6data is a lakehouse compute engine built to run SQL analytics and AI workloads 10x faster at 60% lower cost. Interoperability is one of our core product philosophies, which is what pushed us to experiment with Fabric in the first place.
After a couple of optimizations, we were able to query data directly in OneLake through e6data’s engine, 33% faster than the other available options, including the native Fabric SQL engines.
Of course, the big question for any new engine is: how fast is it?
We benchmarked the Fabric + e6data combination to make sure the integration translates into real performance gains. We used the industry-standard TPC-DS benchmark to stress test query performance on OneLake. And the results are lit!
Complex analytic reports that took over an hour now finish in minutes with e6data + Fabric. This comes without any compromise in data freshness or architectural simplicity: the data stays in OneLake, and we simply query it faster. If you need faster SQL on Fabric, e6data delivers it without increasing costs.
It’s surprisingly straightforward. e6data acts as another compute layer on top of OneLake, alongside Fabric’s own engines. The architecture is open and modular: you can still ingest and manage data via Fabric as usual (using Spark or pipelines), and then simply point e6data at the same OneLake data for fast querying.
There’s no data silo or fork – it’s the same single copy of data in OneLake, now accessible through e6data’s compute. In fact, you could have Fabric’s own SQL engine and e6data querying the same OneLake data simultaneously for different workloads.
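To make the "single copy" idea concrete: OneLake exposes its data through ADLS Gen2-style paths, so pointing a second engine at the same table comes down to sharing one storage URI. The sketch below only builds that URI. The workspace, lakehouse, and table names are hypothetical, and how the path is registered in e6data is not shown, since that depends on your catalog setup.

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """Build the ABFS URI of a lakehouse table stored in OneLake."""
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


# Hypothetical names, for illustration only.
uri = onelake_table_uri("analytics-ws", "sales_lh", "store_sales")
print(uri)
```

Any engine that reads from this location sees the same files, which is why no copy or export step is needed.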
If you're running ad-hoc SQL queries or powering dashboards directly from OneLake data, speed matters. e6data reduces query latency, keeping things responsive even as your data scales and queries become more complex.
Here’s what makes us stand out:
We're already 33% faster than other Fabric compute engines, but we're not done. We’re just getting started with commonly used engine optimization techniques, and there’s still plenty of room for more.
For the TPC-DS experiment above, our cost came to ~$0.085 USD. Our pricing model is based on compute usage, measured in vCPU-seconds. We also deploy guardrails and autoscaling to keep costs predictable during peak workloads.
Here’s how Fabric alone compares to Fabric + e6data in real-world enterprise analytics.
(For reference, approximate pricing: Fabric F2 costs about $263/month pay-as-you-go, an F32 is around $4,200/month. e6data usage is ~$0.10 per vCPU-hour in these examples.)
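As a rough illustration of usage-based billing, the sketch below converts vCPU-seconds into dollars at the approximate ~$0.10 per vCPU-hour rate quoted above. The workload (16 vCPUs busy for 30 minutes) is a made-up example, not a measured result, and the function name is ours.

```python
RATE_PER_VCPU_HOUR = 0.10  # USD, approximate rate from the reference pricing


def usage_cost(vcpus: int, seconds: float,
               rate_per_vcpu_hour: float = RATE_PER_VCPU_HOUR) -> float:
    """Estimate cost in USD for `vcpus` cores busy for `seconds`."""
    vcpu_seconds = vcpus * seconds
    return vcpu_seconds / 3600 * rate_per_vcpu_hour


# Hypothetical workload: 16 vCPUs busy for 30 minutes.
cost = usage_cost(vcpus=16, seconds=30 * 60)
print(f"${cost:.2f}")
```

Because the meter runs on vCPU-seconds, the estimate scales linearly with both cluster size and run time, and idle time costs nothing in this model.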
Microsoft Fabric already includes governance and security measures, but enterprise teams demand tighter controls. e6data fills critical security gaps:
Here’s where things get even more interesting: e6data isn’t just a SQL engine, it also supports vector search natively. This opens up Fabric’s OneLake to a variety of AI and machine learning workloads that go beyond traditional SQL analytics.
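For readers new to the idea, vector search means ranking rows by how close their embeddings are to a query embedding. The snippet below is a brute-force conceptual sketch in plain Python with toy, hypothetical embeddings. It is not e6data's API or implementation, and production engines typically use indexes rather than scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the ids of the k rows most similar to the query."""
    scored = [(cosine_similarity(query, emb), doc_id)
              for doc_id, emb in rows.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy embeddings, hypothetical values for illustration only.
rows = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
nearest = top_k([1.0, 0.2, 0.0], rows, k=2)
print(nearest)
```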
If you’re a data engineer working with Fabric, give e6data a shot on your OneLake data. It’s easy to set up, and you’ll immediately feel the difference in query performance. If you want to gain early access to some of the features mentioned above, or learn more about our upcoming work, reach out to us here, and we’ll be in touch soon!
We are universally interoperable and open-source friendly. We can integrate across any object store, table format, data catalog, governance tools, BI tools, and other data applications.
We use a usage-based pricing model based on vCPU consumption. Your billing is determined by the number of vCPUs used, ensuring you only pay for the compute power you actually consume.
We support all types of file formats, like Parquet, ORC, JSON, CSV, AVRO, and others.
e6data promises 5 to 10 times faster querying at any concurrency, at over 50% lower total cost of ownership across workloads, compared to any compute engine in the market.
We support serverless and in-VPC deployment models.
We can integrate with your existing governance tool, and also have an in-house offering for data governance, access control, and security.