Delta Lake vs Iceberg: The Open-source Debate Has Instigated a Public Spat

Arti

Published: 9th Jun, 2022 at 5:30 PM

Delta Lake is an open-source storage layer that brings reliability to data lakes.

Debates are fun to watch until they get out of hand. The open-source debate over Delta Lake and Iceberg has recently heated up, with the contenders trading sarcasm and emojis instead of arguing the merits. The brawl started earlier this year when James Malone, senior manager of product management at Snowflake, introduced Snowflake's support for Iceberg, an open-source table format, and emphasized its genuinely open, open-source status.

"Many data architectures can benefit from a table format, and in my view, #ApacheIceberg is the one to choose – it's (actually) open, has a vibrant and growing ecosystem, and is designed for interoperability," he wrote in a January LinkedIn post.

Many Delta Lake supporters took offense, even though Malone did not mention Delta Lake by name. Delta Lake is another table format, originally created by Snowflake competitor Databricks, that has drawn less interest and engagement from the open-source developer community than Iceberg.

Would Databricks stay quiet after seeing that post? Obviously not. John Lynch, field CTO at Databricks, poked at Malone, replying in the same LinkedIn thread that Snowflake's own software is itself proprietary. He posted a link to Delta Lake's source code on GitHub, the go-to home for open-source software collaboration. A smiley face emoji punctuated the burn.

"It's not open source. It's open code," responded Malone about Delta Lake. "We don't need to get into semantics James," shot back Spencer Cook, financial services solutions architect at Databricks.

What is Delta Lake?

Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.

What is Iceberg?

Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, and Hive using a high-performance table format that works just like a SQL table.

This is not the first debate of its kind, and it surely won't be the last: two similar offerings will always invite comparison.


