Join us for an exclusive in-person event on “Apache Iceberg: Basics, optimizations, features, streaming data, query execution” hosted by e6data in Hyderabad!
Lakehouse Days - Powered by AWS is designed specifically for data engineers, data architects, and senior software engineers who constantly seek to make their data architecture more price-performant while delivering the best user experience. In this edition, we will dive deep into the internal architecture of open table formats like Apache Iceberg; how Apache Kafka works; building a modern data platform that queries streaming and analytical data simultaneously on Iceberg; how Amazon S3 Tables delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3; and how Arrow IPC accelerates streaming ingestion and query execution in Iceberg-based data lakes. Our aim is to raise awareness of these open table formats and build a deeper understanding of how they work.
Lakehouse Days - Powered by AWS is designed to enable fellow data geeks to meet, network, and have insightful discussions on the entropic world of data.
Topic: Streaming Data into a Lakehouse - Kafka Greets Iceberg
Summary: Join this session to learn how operational and analytical data estates are converging! Apache Kafka, the de facto standard for real-time streaming data, can now materialize events in a lakehouse (Iceberg/Delta Lake), and analytical queries can run on materialized Kafka topics. This session will start from the ground up on what Iceberg is, how Kafka works, and the community efforts bringing two of the most important frameworks, Apache Kafka and Apache Iceberg, closer together. The audience will learn how to build a modern data platform in which streaming and analytical data are simultaneously queried on Iceberg.
Time: 10:00 - 10:45 AM IST
Topic: Amazon S3 Tables: Scaling Apache Iceberg for High-Performance Analytics
Summary: Traditional data lakes provide immense scalability but often face performance, consistency, and interoperability challenges. In this session, David guides you through how Open Table Formats (OTFs) like Apache Iceberg are revolutionizing the way organizations store and process tabular data at scale. He’ll dive into Iceberg’s key features, its advantages over traditional approaches, and how Amazon S3 Tables, AWS’s latest innovation, delivers a fully managed Apache Iceberg experience to simplify large-scale analytics on Amazon S3. The audience will learn how S3 Tables enhances query performance, reduces operational overhead, and empowers businesses with seamless, high-performance analytics at scale.
Time: 11:00 - 11:45 AM IST
Topic: Fast Distributed Iceberg Writes and Queries with Apache Arrow IPC
Summary: In modern distributed analytical systems, efficient data movement and processing are critical for performance. Apache Arrow’s Inter-Process Communication (IPC) framework provides a high-performance, language-agnostic columnar format that eliminates serialization overhead and optimizes in-memory analytics. This talk explores how Arrow IPC enhances Apache Iceberg-based data lakes by accelerating streaming ingestion and query execution. Karthic will highlight Arrow IPC’s zero-copy data sharing and high-speed transport via Arrow Flight, which streamline data movement, and its vectorized computation capabilities, which align seamlessly with Iceberg’s columnar storage. Key applications include batching streaming data to mitigate the small-files problem during ingestion and optimizing data shuffling and result delivery during queries. Through practical examples, he will demonstrate how Arrow IPC unifies fast writes and queries, delivering efficiency and scalability to Iceberg data platforms.
Time: 12:00 - 12:45 PM IST
This is an exclusive and invite-only event. Please RSVP to reserve your spot through this link: https://lu.ma/ahuq2jqz?utm_source=website
Venue - Amazon Development Centre (HYD11), Nanakramguda
Date and time - Mar 8, 2025, from 10:00 AM to 2:00 PM
We are universally interoperable and open-source friendly. We can integrate with any object store, table format, data catalog, governance tool, BI tool, or other data application.
We use a usage-based pricing model based on vCPU consumption. Your billing is determined by the number of vCPUs used, ensuring you only pay for the compute power you actually consume.
We support all major file formats, including Parquet, ORC, Avro, JSON, and CSV.
e6data promises 5 to 10 times faster query speeds at any concurrency, with over 50% lower total cost of ownership across workloads compared to any compute engine in the market.
We support serverless and in-VPC deployment models.
We can integrate with your existing governance tools, and we also offer an in-house solution for data governance, access control, and security.