Storage Options for Transformed Data
So far, you’ve learned that after locating and processing data, you need to store it. Luckily,
there are several storage options to choose from, including storing data locally or in the cloud.
In this reading, you’ll learn more about where transformed data is stored, and explore examples
of tools that manage this task.
Cloud storage: Online storage for all kinds of data, typically serving as
part of a data lake
There are plenty of storage options for organizations to consider, including hybrid options like a
data lakehouse. These hybrid storage options have characteristics of more than one storage
solution, giving organizations more flexibility for their data.
Databases
Databases are like digital filing cabinets; they’re structured to hold data in an organized way,
making it faster to find the exact piece of information you’re looking for. Databases are
typically classified into two main categories according to their function: transactional and
analytical.
Transactional databases, sometimes called online transaction processing (OLTP) databases,
are designed to record and update large numbers of individual transactions, such as orders or
payments, quickly and reliably.
Analytical databases are designed to store and process huge amounts of data in seconds or
minutes. They help businesses answer questions about vast amounts of data to make business
decisions. Analytical databases are sometimes called online analytical processing (OLAP)
databases. Common types of databases include relational databases and NoSQL databases.
Relational databases contain a series of tables that can be connected to form relationships,
and usually support SQL. NoSQL databases store massive amounts of structured and
unstructured data that can be retrieved at high speeds.
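To make the idea of connected tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the rows are all made up for illustration; a production relational database would differ in scale but follows the same pattern.

```python
import sqlite3

# In-memory SQLite database; the tables and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# The shared customer_id column is the relationship that connects the tables.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)
conn.close()
```

The JOIN clause is where the relationship is used: each order is matched to its customer through the shared customer_id value.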
Data warehouses
A data warehouse is a specially designed database that consolidates data from multiple source
systems for data consistency, accuracy, and efficient access. Services like BigQuery are
modern data warehouses where you can store data and run powerful queries.
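The consolidation step described above can be sketched on a small scale. In this example, SQLite is only a local stand-in for a warehouse and does not reflect how BigQuery is accessed. The two source systems, their field names, and the sales table are hypothetical.

```python
import sqlite3

# Hypothetical extracts from two source systems that use different field names.
web_orders = [{"id": "W1", "total_usd": 30.0}, {"id": "W2", "total_usd": 12.5}]
store_sales = [{"receipt": "S1", "amount": 20.0}]

# SQLite stands in for the warehouse here; it is not how BigQuery is accessed.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (sale_id TEXT PRIMARY KEY, channel TEXT, amount_usd REAL)"
)

# Map each source onto the single, consistent warehouse schema.
warehouse.executemany(
    "INSERT INTO sales VALUES (?, 'web', ?)",
    [(o["id"], o["total_usd"]) for o in web_orders],
)
warehouse.executemany(
    "INSERT INTO sales VALUES (?, 'store', ?)",
    [(s["receipt"], s["amount"]) for s in store_sales],
)

# One query now answers a question that spans both source systems.
totals = warehouse.execute(
    "SELECT channel, SUM(amount_usd) FROM sales GROUP BY channel ORDER BY channel"
).fetchall()
print(totals)
```

Because both sources land in one table with consistent column names, a single query can compare channels without knowing how each source system originally labeled its data.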
Data lakes
A data lake is a storage system that stores large amounts of raw data in its original format until
it’s needed. Because the data is stored in its raw form, data lakes can cost less than data
warehouses, and can offer more flexibility in how the data is used. Data lakes offer a variety of
features like scalability, security, and performance. They can be used to store and process
large amounts of data from a variety of sources, including structured, semi-structured, and
unstructured data. Google Cloud Storage, Amazon S3, and the Apache Hadoop Distributed File
System (HDFS) are common examples of data lake storage.
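The idea of keeping raw data in its original format until it's needed can be sketched with local files. In this example, a temporary directory stands in for a data lake bucket, and the file names and contents are invented. A real data lake would use a service such as those named above.

```python
import csv
import json
import tempfile
from pathlib import Path

# A temporary local directory stands in for a data lake bucket (an assumption).
lake = Path(tempfile.mkdtemp())

# Raw data is written as-is, each file keeping its original format.
(lake / "clicks.json").write_text(json.dumps([{"user": "a", "page": "/home"}]))
(lake / "sales.csv").write_text("sale_id,amount\nS1,20.0\nS2,5.5\n")
(lake / "notes.txt").write_text("Free-form support note, unstructured text.")

# Structure is applied only when the data is needed for a specific question.
with open(lake / "sales.csv", newline="") as f:
    total_sales = sum(float(row["amount"]) for row in csv.DictReader(f))

stored_formats = sorted(p.suffix for p in lake.iterdir())
print(stored_formats, total_sales)
```

Nothing is converted or validated when the files are written. The CSV is parsed only at the moment a question about sales is asked, which is what gives a data lake its flexibility.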
Cloud storage
Cloud storage solutions enable users to save data online so it can be accessed from any
device, anytime, anywhere, which gives users flexibility in how they reach their data.
Common examples of cloud storage include Google Drive, Dropbox, iCloud Drive, Google
Cloud Storage, and Amazon S3.
Cloud data storage is a solution that enables organizations to keep, access, and maintain
digital data on off-site, cloud-based servers. Data warehouses can also be deployed in the
cloud, where they’re hosted on remote servers by a cloud service provider.
Key takeaways
Data storage is crucial in cloud data analytics, serving as a space for processed data to live and
be used for analysis. From data warehouses like BigQuery, to data lakes like Google Cloud
Storage, there are a variety of tailored options for different organizational needs. Cloud
solutions like Google Drive or Dropbox help empower users to access their data anytime, from
any device, emphasizing the growing importance of flexibility in data storage and retrieval.