Thursday, December 3 • 10:45 - 11:35
History and Evolution of Data Lake Architecture - Post Lambda Architecture - Takuya Fukuhisa & Masaru Dobashi, NTT DATA
Around 2006, Apache Hadoop made an open source "Data Lake" architecture a reality for enterprises that want to use large amounts of data, or "Big Data". However, expectations are also growing for "real-time analysis", which immediately processes a large amount of "stream data" and delivers the analyzed results to end users within seconds to minutes. In this talk, we present the history of open source software related to the Data Lake, an overview of current software, and the potential tradeoffs. We also talk about how recent storage technologies, such as Apache Iceberg, Apache Hudi, and Delta Lake, try to provide features that leverage both historical and stream data on a Data Lake in a different way from the Lambda Architecture. Finally, we summarize these products based on a comparison of their internal architectures. Attendees will learn about current storage software and the similarities and differences among its architectures. This will help you design a system architecture built on Data Lake technologies that supports both batch and real-time analysis. This talk reflects some software upgrades released since our previous domestic presentation.

Takuya Fukuhisa

Deputy Manager, Senior IT Architect, NTT DATA
Takuya Fukuhisa is a system infrastructure architect and an expert in distributed computing and stream data processing. He has developed mission-critical open systems in the public and financial sectors since 2011. Currently, he is responsible for developing a system and addressing the...
Masaru Dobashi

Executive IT Specialist, Manager, NTT DATA
Masaru Dobashi is a system infrastructure architect and an expert in distributed computing, machine learning platforms, and stream data processing. He leads the open source professional service team at NTT DATA Corporation and is responsible for introducing open source-based data...

Thursday December 3, 2020 10:45 - 11:35 JST
