Data lakes are increasingly becoming a popular approach to getting started with big data. Simply put, a data lake is a central location where all applications that generate or consume data go to get raw data in its native form. This enables faster application development, both transactional and analytical, as the application developer has a standard location and interface to write data that the application will generate and a standard location and interface to read data that it needs for the application.
However, left unchecked, data lakes can quickly become toxic, becoming a cost to maintain whereas the value delivered from them shrinks or simply does not materialize. Here are some common mistakes that can make your data lake toxic.
Your big data strategy ends at the data lake.
A common mistake is to choose a data lake as the implementation of the big data strategy. This…
View original post 945 more words