Lakehouse Days: March 2025

About the Event

Join us for an exclusive in-person event on “Apache Iceberg: Basics, optimizations, features, streaming data, query execution” hosted by e6data in Hyderabad!

Lakehouse Days - Powered by AWS is designed specifically for data engineers, data architects, and senior software engineers who constantly seek to optimize their data architecture to make it more price-performant while delivering the best user experience. In this edition, we will dive deep into the internal architecture of open table formats like Apache Iceberg, how Apache Kafka works, and how to build a modern data platform that simultaneously queries streaming and analytical data on Iceberg. We will also cover how Amazon S3 Tables delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3, and how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. We aim to raise awareness of these open table formats and help attendees gain a deeper understanding of them.

Lakehouse Days - Powered by AWS is designed to enable fellow data geeks to meet, network, and have insightful discussions on the entropic world of data.


Meet the Speakers

Diptiman Raichaudhuri, Staff Developer Advocate at Confluent

Topic: Streaming Data into a Lakehouse - Kafka Greets Iceberg

Summary: Join this session to learn how operational and analytical data estates are merging! Apache Kafka, the de facto standard for real-time streaming data, can now materialize events in a Lakehouse (Iceberg/Delta Lake), and analytical queries can run on materialized Kafka topics. This session will start from the ground up: what Iceberg is, how Kafka works, and the community efforts bringing two of the most important frameworks, Apache Kafka and Apache Iceberg, closer together. The audience will learn how to build a modern data platform in which streaming and analytical data are queried simultaneously on Iceberg.

Time: 10:00 - 10:45 AM IST


David John Chakram, Principal Architect at AWS

Topic: Amazon S3 Tables: Scaling Apache Iceberg for High-Performance Analytics

Summary: Traditional data lakes provide immense scalability but often face performance, consistency, and interoperability challenges. In this session, David explains how Open Table Formats (OTFs) like Apache Iceberg are changing the way organizations store and process tabular data at scale. He'll dive into Iceberg's key features, its advantages over traditional approaches, and how Amazon S3 Tables, AWS's latest innovation, delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3. The audience will learn how S3 Tables enhances query performance, reduces operational overhead, and empowers businesses with seamless, high-performance analytics at scale.

Time: 11:00 - 11:45 AM IST

Karthic Rao, Principal Engineer at e6data

Topic: Fast Distributed Iceberg Writes and Queries with Apache Arrow IPC

Summary: In modern distributed analytical systems, efficient data movement and processing are critical for performance. Apache Arrow's Inter-Process Communication (IPC) framework provides a high-performance, language-agnostic columnar format that eliminates serialization overhead and optimizes in-memory analytics. This talk explores how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. Karthic will highlight Arrow IPC's zero-copy data sharing and high-speed transport via Arrow Flight, which streamlines data movement, and its vectorized computation capabilities, which align seamlessly with Iceberg's columnar storage. Key applications include batching streaming data to mitigate the small files problem during ingestion and optimizing data shuffling and result delivery during queries. Through practical examples, he will demonstrate how Arrow IPC unifies fast writes and queries, delivering efficiency and scalability to Iceberg data platforms.

Time: 12:00 - 12:45 PM IST
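
To give a feel for the "batching streaming data to mitigate the small files problem" idea mentioned in the summary above, here is a minimal sketch in plain Python. It is not the speaker's implementation and does not use Arrow or Iceberg APIs; the MicroBatcher class, the sink callback, and the max_records threshold are hypothetical names chosen for illustration. In a real pipeline the sink would presumably write one columnar data file per flush and commit it to the table.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class MicroBatcher:
    """Buffers streamed records and hands them to a sink in larger batches.

    Hypothetical helper for illustration only: `sink` stands in for
    "write one data file", so fewer sink calls means fewer small files.
    """
    sink: Callable[[List[Record]], None]
    max_records: int = 1000
    _buffer: List[Record] = field(default_factory=list, init=False)

    def add(self, record: Record) -> None:
        # Accumulate records instead of writing each one as its own file.
        self._buffer.append(record)
        if len(self._buffer) >= self.max_records:
            self.flush_pending()

    def flush_pending(self) -> None:
        # Emit whatever is buffered as a single batch, then start a new buffer.
        if self._buffer:
            self.sink(self._buffer)
            self._buffer = []


# Usage: 10 streamed events become 3 "files" instead of 10.
files: List[List[Record]] = []
batcher = MicroBatcher(sink=files.append, max_records=4)
for i in range(10):
    batcher.add({"event_id": i})
batcher.flush_pending()  # flush the final partial batch
print([len(f) for f in files])  # [4, 4, 2]
```

The threshold here counts records; a production system would more likely flush on byte size or elapsed time as well, which is a design choice this sketch leaves out.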

Register Now!

This is an exclusive and invite-only event. Please RSVP to reserve your spot through this link: https://lu.ma/ahuq2jqz?utm_source=website

Venue - Amazon Development Centre (HYD11), Nanakramguda

Date and time - Mar 8, 2025, from 10:00 AM to 2:00 PM IST

Frequently asked questions (FAQs)

How do I integrate e6data with my existing data infrastructure?

We are universally interoperable and open-source friendly. We integrate with any object store, table format, data catalog, governance tool, BI tool, and other data applications.

How does billing work?

We use a usage-based pricing model based on vCPU consumption. Your billing is determined by the number of vCPUs used, ensuring you only pay for the compute power you actually consume.

What kind of file formats does e6data support?

We support all types of file formats, including Parquet, ORC, JSON, CSV, Avro, and others.

What kind of performance improvements can I expect with e6data?

e6data promises 5 to 10 times faster queries at any concurrency level, with over 50% lower total cost of ownership across workloads, compared with any compute engine on the market.

What kinds of deployment models are available at e6data?

We support serverless and in-VPC deployment models. 

How does e6data handle data governance rules?

We integrate with your existing governance tool and also offer an in-house solution for data governance, access control, and security.