Debates are fun to watch until they get out of hand. The open-source debate over Delta Lake and Iceberg has recently drawn heat because the contenders, instead of arguing the merits, are using sarcasm and emojis to make their points. The brawl started earlier this year when James Malone, senior manager of Product Management at Snowflake, introduced Snowflake's support for Iceberg, an open-source table format, and emphasized its genuinely open, open-source status.
"Many data architectures can benefit from a table format, and in my view, #ApacheIceberg is the one to choose – it's (actually) open, has a vibrant and growing ecosystem, and is designed for interoperability," he wrote in a January LinkedIn post.
Many Delta Lake supporters took offense, even though Malone didn't mention Delta Lake by name. Delta Lake is another table format, originally created by Snowflake competitor Databricks, that has not drawn as much interest and engagement from the open-source developer community as Iceberg.
Databricks was never likely to keep quiet after seeing that post. John Lynch, field CTO at Databricks, poked at Malone in the same LinkedIn thread, replying that Snowflake's own software is itself proprietary. He posted a link to Delta Lake's source code on GitHub, the go-to home for open-source software collaboration. A smiley face emoji punctuated the burn.
"It's not open source. It's open code," responded Malone about Delta Lake. "We don't need to get into semantics James," shot back Spencer Cook, financial services solutions architect at Databricks.
Delta Lake is an open-source storage layer that brings reliability to data lakes. It provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
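To give a feel for what "ACID transactions" on top of plain files can mean, here is a toy sketch in Python. It assumes a simplified model in which every commit is a numbered JSON file in a log directory and the table's state is whatever you get by replaying those files in order. The function names (`commit`, `snapshot`, `demo`), the `add`/`remove` action shape, and the file names are all hypothetical, chosen for illustration; this is not Delta Lake's API or its actual implementation.

```python
import json
import os
import tempfile


def commit(log_dir, version, actions):
    """Try to write commit `version`; return False if it is already taken."""
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_EXCL makes creation fail if the file already exists, so only one
        # writer can claim a given version number.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False
    # Toy simplification: a real system must also keep readers from seeing
    # a commit file that has been created but not fully written yet.
    with os.fdopen(fd, "w") as f:
        json.dump(actions, f)
    return True


def snapshot(log_dir):
    """Replay the log in version order to find the live data files."""
    live = set()
    for name in sorted(os.listdir(log_dir)):
        with open(os.path.join(log_dir, name)) as f:
            for action in json.load(f):
                if action["op"] == "add":
                    live.add(action["file"])
                elif action["op"] == "remove":
                    live.discard(action["file"])
    return live


def demo():
    """Two successful commits, then a conflicting writer that loses."""
    with tempfile.TemporaryDirectory() as log_dir:
        first = commit(log_dir, 0, [{"op": "add", "file": "part-0.parquet"}])
        second = commit(log_dir, 1, [
            {"op": "remove", "file": "part-0.parquet"},
            {"op": "add", "file": "part-1.parquet"},
        ])
        # Another writer also tries to claim version 1 and is rejected.
        conflict = commit(log_dir, 1, [{"op": "add", "file": "part-2.parquet"}])
        return first, second, conflict, snapshot(log_dir)
```

In this sketch, the losing writer's change never becomes visible: a reader replaying the log sees only the commits that were actually recorded, which is the all-or-nothing behavior the term "transaction" refers to.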
Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink, and Hive using a high-performance table format that works just like a SQL table.
This is not the first debate of its kind, and it surely won't be the last: vendors offering similar products will always be compared with each other.