Community Impact
Building a global community of data engineers and developers through content and open-source contributions.
YouTube Subscribers
LinkedIn Followers
Medium Followers
GitHub Followers
Videos Published
Technical Blogs
Monthly Views
Monthly Impressions
Tech Stack & Expertise
Core technologies and professional skills I bring to every project
Core Expertise
Primary technologies I specialize in
Data Lakehouse
3 technologies
AWS Services
8 technologies
Query Engines
7 technologies
Languages
4 technologies
Professional Skills
Leadership, communication, and problem-solving capabilities
Featured Video
Deep dive into real-world data engineering challenges and solutions
Explore my latest content on data engineering, Apache Hudi, AWS, and modern data lakehouse architecture.
Event Glimpses
Capturing moments from tech conferences, hackathons, and community meetups.
Latest Articles
Technical deep dives and tutorials on data engineering
A "+" after a count marks a minimum: many more posts exist beyond this list, and the site index is updated gradually.

How Zeta Global scales multi-tenant data ingestion with Amazon S3 Tables
Learn how Zeta Global leverages Amazon S3 Tables to scale multi-tenant data ingestion at enterprise scale, enabling efficient data processing and management.

Batch Framework (An Internal Data Ingestion Framework That Processes 1 TB of Data per Month and Runs 200+ Jobs)
Batch Framework is an internal framework designed to run 1000+ jobs and scale horizontally. Each job can specify the compute it needs, including the number of cores and the amount of RAM.

LakeBoost: Maximizing Efficiency in Data Lake (Hudi) Glue ETL Jobs with a Templated Approach and Serverless Architecture with Source Code
The project LakeBoost aims to maximize the efficiency of a data lake by using Apache Hudi and AWS Glue ETL. The project is designed to improve the speed and accuracy of data ingestion, processing, and retrieval.
What People Say
Recommendations and testimonials from colleagues, industry leaders, and community members.
Bharat Goyal
Executive Vice President, Head of Engineering at Zeta Global
"I've had the pleasure of working with Soumil over the past year, and his impact in that relatively short time has been significant. He played a key role in advancing Zeta's lakehouse initiative, helping modernize our multi-tenant data ingestion and scalable data architecture. His work directly improved our ability to scale the data platform efficiently and accelerated several critical analytics use cases. In addition to his technical contributions, he has been instrumental in raising the bar for knowledge sharing - both within the organization and with the broader engineering community. He consistently demonstrates strong ownership, a proactive mindset, and a willingness to step in wherever needed to drive results. He brings high energy, reliability, and a rare balance of deep technical skill with an execution-focused approach."

Manoj Agarwal
Chief Architect at Zeta Global
"I managed Soumil on Zeta's data platform team, where he built our multi-tenant Iceberg ingestion platform that AWS later published as a customer case study. He's a serious engineer who also happens to be a great teacher, which is how he ended up speaking at re:Invent and building a large following in the data community. He's one of the best data engineers I've worked with, and any team would be lucky to hire him."
Vinoth Chandar
Creator of Apache Hudi, Founder at Onehouse
"Soumil has been an incredible advocate for Apache Hudi and the open-source data engineering community. His content has helped thousands of developers understand complex data lakehouse concepts. His dedication to creating high-quality educational content is truly remarkable."

Sekhar Sahu
Principal Software Engineer | Data Engineer
"I've worked closely with Soumil, and he shows the judgment, speed, and ownership you'd expect from a strong lead. He moves fast, jumps into ambiguity without overthinking it, and isn't afraid to try things out quickly to unblock progress. When things got messy — unclear ownership, shifting requirements, tight deadlines — he stayed persistent, pulled the right people in, and kept the work moving. He brings a genuine growth mindset. He's gone deep into Spark internals, join strategies, and performance tuning through hands-on experiments and by learning from others on the team. His ownership shows up everywhere, not just on what's assigned. During the lakehouse ingestion effort, he led the project end-to-end, flagged risks early, and filled gaps whenever needed. He also picked up adjacent workflow optimizations that weren't technically in his scope because they were bottlenecks and fixing them made the overall system stronger. He shares what he learns through blogs, brownbags, and day-to-day conversations, and that lifts the team around him. Soumil is reliable, proactive, and committed to making the system better. I'd work with him again on any critical path project, no question."
Varadaraj Ramachandraiah
Enterprise Solutions Architect at Amazon Web Services
"Soumil is an exceptional engineer whose versatility and reliability truly stand out. During our collaboration on the Lakehouse project, I witnessed firsthand his impressive range of skills across analytics, automation, and DevOps. What sets Soumil apart is not just his technical capabilities, but his remarkable ability to be an active listener and quickly grasp complex concepts. His approach to problem-solving is methodical and thorough, diving deep into new technologies and frameworks with enthusiasm and dedication. Whether facing challenging deadlines or technical hurdles, Soumil consistently delivers high-quality solutions while maintaining a collaborative spirit. What I particularly value about working with Soumil is his ability to translate complex technical concepts into actionable insights, making him an invaluable bridge between technical and business requirements. I highly recommend Soumil for his technical expertise, quick learning ability, and excellent collaborative skills."

Howard Cho
Data Engineering Leader
"Soumil not only possesses incredible technical skills, but his dedication and willingness to share knowledge is truly inspiring. He goes above and beyond to ensure that everyone he works with understands the concepts and intricacies of Apache Hudi along with the myriad of AWS technologies (most notably AWS Glue). On top of this, he approaches every interaction with an extremely positive and humble attitude; he makes everyone around him want to continue to grow. Every engineering org would benefit from having someone like Soumil on their team."
Get in Touch
Have a question or want to work together? Feel free to reach out!
Response Time
I typically respond within 24-48 hours. For urgent matters, please mention it in your message subject.
