A data lake is a repository that stores data in its natural, raw format, typically as files or object blobs. It is usually a single store that holds raw copies of source-system data, sensor data, and social media data alongside transformed data derived from them. It supports tasks such as reporting, advanced analytics, and machine learning.
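A data lake can hold structured data (rows and columns from relational databases), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs), and binary data (images, audio, video). It can be established on-premises or in the cloud.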
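The idea of keeping heterogeneous data in raw form in one store can be illustrated with a minimal sketch. It assumes a local directory stands in for the lake. The `raw/<source>/<date>/<name>` layout and the helper names `land_raw` and `list_objects` are hypothetical choices for this example, not a standard or any product's API.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def land_raw(lake_root: Path, source: str, name: str, payload: bytes, day: date) -> Path:
    """Write payload unchanged under <lake_root>/raw/<source>/<YYYY-MM-DD>/<name>.

    The directory layout is an assumption made for this sketch.
    """
    target_dir = lake_root / "raw" / source / day.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)  # stored as-is: no parsing, no schema applied
    return target


def list_objects(lake_root: Path) -> list[str]:
    """Return every stored file as a sorted path relative to the lake root."""
    return sorted(
        p.relative_to(lake_root).as_posix()
        for p in lake_root.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        today = date(2024, 1, 2)
        # Structured export, semi-structured event, and binary blob side by side.
        land_raw(lake, "crm", "customers.csv", b"id,name\n1,Ada\n", today)
        land_raw(lake, "sensors", "reading.json", json.dumps({"temp_c": 21.5}).encode(), today)
        land_raw(lake, "camera", "frame.bin", bytes([0, 255, 16]), today)
        for obj in list_objects(lake):
            print(obj)
```

Because nothing is parsed on write, any interpretation of the files is deferred to whichever reporting or analytics job reads them later. In a cloud deployment an object store would typically take the place of the local directory.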