> The Cloud made the crap data problem infinitely worse.
This article mainly focuses on unused data in websites and enterprise databases; only toward the end does it barely touch on "the elephant in the room": data in the cloud.
Data centers are now being built all over the world at breakneck speed to cater for AI data modeling, training and serving. Most AI-related data is kept in data lakes as raw data that will probably never see the light of day, i.e. never be processed.
Bill Inmon warned us about these potential data swamps in data centers, driven by the growing popularity of the data lake [1].
Hopefully open table formats like Apache Iceberg can remedy this epidemic of unused raw data, but time will tell [2].
[1] Lakehouses Prevent Data Swamps, Bill Inmon Says
https://www.datanami.com/2021/06/01/lakehouses-prevent-data-...
[2] What Are Apache Iceberg Tables and How Are They Useful?
https://www.snowflake.com/guides/what-are-apache-iceberg-tab...