Don't miss us live on Twitch.tv on Monday, November 25th to learn how you can Build Next-Gen Data Platforms using Apache Iceberg.
Note: This episode aired on November 25th. You can watch the recording on demand by clicking here or on the image below.
Leave your questions and comments below for us to address live on AWS re:Post Live, scheduled for Monday, November 25th at 11 am PST / 2 pm EST on twitch.tv/aws! On this episode, Principal Solutions Architect Anup Sivadas is joined by Sr. Customer Solutions Manager Rick Lobrecht, Specialist Sr. SA Gagan Brahmi, and Principal Solutions Architect Sahil Thapar to discuss Apache Iceberg and best practices for building your next-gen data platforms.
During the show, we will dive deep into this article authored by show guest Gagan Brahmi and discuss how to Generate production-grade synthetic data at petabyte-scale using Apache Spark and Faker on Amazon EMR.
If you have any questions, please add them in the comments section at the bottom of this article, and we will answer them during the live show on Twitch. If your question is selected, you will be awarded 5 re:Post points!
Apache Iceberg is a distributed, community-driven, Apache 2.0-licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it is fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers easy integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.
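
To make the idea of "keeping records of how datasets change over time" a little more concrete, here is a minimal conceptual sketch in plain Python. It is not the Apache Iceberg API and does not reflect Iceberg's actual file layout; the `ToyTable` and `Snapshot` names are hypothetical and exist only for illustration. The sketch assumes just the general idea that each write produces a new immutable snapshot, so earlier versions of a table remain readable.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One immutable version of the table (hypothetical toy type)."""
    snapshot_id: int
    rows: tuple


@dataclass
class ToyTable:
    """Append-only log of snapshots; a toy model, not the Apache Iceberg API."""
    snapshots: list = field(default_factory=list)

    def commit(self, rows):
        # Every write adds a new snapshot instead of overwriting the old one.
        snap = Snapshot(snapshot_id=len(self.snapshots) + 1, rows=tuple(rows))
        self.snapshots.append(snap)
        return snap.snapshot_id

    def current(self):
        # Rows in the newest snapshot, or an empty list before any commit.
        return list(self.snapshots[-1].rows) if self.snapshots else []

    def as_of(self, snapshot_id):
        # "Time travel": read the table as it was at an earlier snapshot.
        for snap in self.snapshots:
            if snap.snapshot_id == snapshot_id:
                return list(snap.rows)
        raise KeyError(f"unknown snapshot {snapshot_id}")


table = ToyTable()
first = table.commit([("order-1", 10)])
second = table.commit([("order-1", 10), ("order-2", 25)])

print(table.current())     # rows from the latest snapshot
print(table.as_of(first))  # rows as they were after the first commit
```

Because old snapshots are never modified in this toy model, reading an earlier version is just a lookup in the log. A real table format would also have to handle metadata files, concurrent writers, and snapshot expiration, none of which is modeled here.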