With the rise of Hadoop, organizations are adopting the new platform to extend their analytical capabilities. However, despite the availability of the open source framework that supports the processing and storage of large datasets, data analysts still struggle to find and harness all types of data across the enterprise.
Enterprises remain dominated by antiquated hand-coding approaches, and code generation fails to automate labor-intensive manual processes or discover relationships between datasets. This can ultimately leave data inconsistent, incomplete, and stale for analytical use.
As a result, organizations are now adopting data lake management innovations to quickly and flexibly ingest, cleanse, master, prepare, govern, secure, and deliver all types of data on-premises or in the cloud.
DBTA recently held a webinar on the latest innovations in data lake management, featuring Philip Russom, senior director at TDWI Research; Murthy Mathiprakasam, director of product marketing at Informatica; and Tavo De Leon, global leader of big data strategy and solutions at Cognizant.
Though the data lake presents a sound strategy to deal with data, organizations can still face challenges gaining value from data lakes, Mathiprakasam explained. These issues include insufficient understanding, unacceptable wait times, unknown trust, and inconsistent delivery.
Fragmented approaches only create more complexity, he noted. In addition, he said, data challenges can have real consequences, limiting the ability to make decisions in an effective and timely manner.
The first step to using data lake management tools effectively is understanding what a data lake can do, Russom said. Data lakes can handle large volumes of diverse datasets, ingest data quickly, persist data in its original raw, detailed state, flexibly support data management best practices, and support multiple use cases in a variety of data architectures.
De Leon outlined some real-world examples that include several tips for providing better customer experiences with analyzed data from data lakes.
An archived on-demand replay of this webinar will be available here.