A data warehouse
(DW) is a
digital storage system that connects and
harmonizes large amounts of data from
many different sources. Its purpose is to
feed business intelligence (BI), reporting,
and analytics, and support regulatory
requirements – so companies can turn
their data into insight and make smart,
data-driven decisions. Data warehouses
store current and historical data in one
place and act as the single source of truth
for an organization.
Data flows into a data warehouse from operational systems (like ERP and CRM),
databases, and external sources such as partner systems, Internet of Things (IoT)
devices, weather apps, and social media – usually on a regular cadence. The emergence
of cloud computing has shifted the landscape. In recent years, data storage
has moved away from purely on-premise infrastructure to multiple locations,
including on-premise, private cloud, and public cloud.
Modern data warehouses are designed to handle both structured and unstructured data,
like videos, image files, and sensor data. Some leverage integrated analytics and
in-memory database technology (which holds the data set in computer memory rather than in
disk storage) to provide real-time access to trusted data and drive confident
decision-making. Without data warehousing, it’s very difficult to combine data from heterogeneous
sources, ensure it’s in the right format for analytics, and get both a current and long-range
view of data over time.
What is a data warehouse?
Diagram of data warehouse architecture. A typical data warehouse includes the three
separate layers described below. Today, modern data warehouses combine OLTP and OLAP in a
single system.
Data layer: Data is extracted from your sources and then transformed and loaded
into the bottom tier using ETL tools. The bottom tier consists of your database
server, data marts, and data lakes. Metadata is created in this tier – and data
integration tools, like data virtualization, are used to seamlessly combine and
aggregate data.
Semantics layer: In the middle tier, online analytical processing (OLAP) and online
transaction processing (OLTP) servers restructure the data for fast, complex
queries and analytics.
Analytics layer: The top tier is the front-end client layer. It holds the data warehouse
access tools that let users interact with data, create dashboards and reports, monitor
KPIs, mine and analyze data, build apps, and more. This tier often includes a
workbench or sandbox area for data exploration and new data model development.
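The three layers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the source records, the fact_orders table, and every function name are hypothetical, and an in-memory SQLite database stands in for the bottom-tier database server.

```python
import sqlite3

# Hypothetical source records, standing in for an export from an ERP system.
ERP_ORDERS = [
    {"order_id": 1, "customer": "ACME ", "amount": "120.50", "day": "2024-01-03"},
    {"order_id": 2, "customer": "globex", "amount": "80.00", "day": "2024-01-03"},
    {"order_id": 3, "customer": "acme", "amount": "42.25", "day": "2024-01-04"},
]


def extract():
    """Extract: read raw rows from the (hypothetical) source."""
    return list(ERP_ORDERS)


def transform(rows):
    """Transform: harmonize customer names and convert amounts to integer cents."""
    cleaned = []
    for r in rows:
        customer = r["customer"].strip().lower()
        cents = round(float(r["amount"]) * 100)
        cleaned.append((r["order_id"], customer, cents, r["day"]))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the bottom-tier table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER, day TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def revenue_by_customer(conn):
    """Middle tier: an aggregate query of the kind an OLAP server would answer."""
    cur = conn.execute(
        "SELECT customer, SUM(amount_cents) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()


def report(rows):
    """Top tier: format the aggregated result for a user-facing report."""
    return [f"{customer}: {cents / 100:.2f}" for customer, cents in rows]


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
lines = report(revenue_by_customer(conn))
print("\n".join(lines))
```

The transform step is where harmonization happens: "ACME " and "acme" become the same customer before loading, so the middle-tier query can total them correctly.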
Data warehouses have been designed to support decision making and have been primarily
built and maintained by IT teams, but over the past few years they have evolved to
empower business users – reducing their reliance on IT to get access to the data and derive
actionable insights. A few key data warehousing capabilities that have empowered business
users are:
1. A semantic or business layer provides natural language phrases and lets
everyone instantly understand data, define relationships between elements in the
data model, and enrich data fields with new business information.
2. Virtual workspaces allow teams to bring data models and connections into one
secured and governed place, supporting better collaboration with colleagues through
one common space and one common data set.
3. Cloud has further improved decision making by globally empowering employees with
a rich set of tools and features to easily perform data analysis tasks. They can
connect new apps and data sources without much IT support.
Top seven benefits of a cloud data warehouse
Cloud-based data warehouses are rising in popularity – for good reason. These modern
warehouses offer several advantages over traditional, on-premise versions. Here are the
top seven benefits of a cloud data warehouse:
1. Quick to deploy: With cloud data warehousing, you can purchase nearly unlimited
computing power and data storage in just a few clicks – and you can build your own
data warehouse, data marts, and sandboxes from anywhere, in minutes.
2. Low total cost of ownership (TCO): Data warehouse-as-a-service (DWaaS) pricing
models are set up so you only pay for the resources you need, when you need them.
You don’t have to forecast your long-term needs or pay for more compute throughout
the year than necessary. You can also avoid upfront costs like expensive hardware,
server rooms, and maintenance staff. Separating the storage pricing from the
computing pricing also gives you a way to drive down the costs.
3. Elasticity: With a cloud data warehouse, you can dynamically scale up or down as
needed. The cloud provides a virtualized, highly distributed environment that can
manage huge volumes of data and scale with demand.
4. Security and disaster recovery: In many cases, cloud data warehouses
provide stronger data security and encryption than on-premise DWs. Data is also
automatically duplicated and backed up, so you can minimize the risk of lost data.
5. Real-time technologies: Cloud data warehouses built on in-memory database
technology can provide extremely fast data processing speeds to deliver real-time
data for instantaneous situational awareness.
6. New technologies: Cloud data warehouses allow you to easily integrate new
technologies such as machine learning, which can provide a guided experience for
business users and decision support, for example in the form of recommended
questions to ask.
7. Empower business users: Cloud data warehouses empower employees equally
and globally with a single view of data from numerous sources and a rich set of tools
and features to easily perform data analysis tasks. They can connect new apps and
data sources without IT.
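The pay-for-what-you-use point in benefit 2 comes down to simple arithmetic once storage and compute are billed separately. The sketch below is illustrative only: the function name and both rates are hypothetical values chosen to keep the numbers easy to follow, not real prices from any provider.

```python
def monthly_cost(storage_tb, compute_hours, storage_rate, compute_rate):
    """Storage and compute are billed separately and simply added together."""
    return storage_tb * storage_rate + compute_hours * compute_rate


# Hypothetical rates (assumptions, not vendor prices).
STORAGE_RATE = 20.0  # currency units per TB per month
COMPUTE_RATE = 2.0   # currency units per compute hour

# Same 10 TB of data in both cases; only the compute usage differs.
always_on = monthly_cost(10, 24 * 30, STORAGE_RATE, COMPUTE_RATE)  # runs all month
on_demand = monthly_cost(10, 4 * 22, STORAGE_RATE, COMPUTE_RATE)   # 4 h on 22 workdays

print(always_on, on_demand)
```

Under these assumed rates the storage charge is identical in both cases, so the whole saving comes from not paying for compute while nobody is running queries.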
IT Best Practices
The data lifecycle includes data collection from identified sources, data integrity
management and reconciliation, data storage, data transfer, and continuous
improvement of data relative to organizational maturity, analytics, and decision needs.
The data warehouse architecture must support these activities and other aspects of
data lifecycle management.
In each data warehouse architecture listed, there is always room for additional
optimization, such as using clusters to decentralize how data is managed and
processed. This could be useful for data governance challenges, locally or
internationally. Data warehouse architectures could include bus, hub-and-spoke, and
federated models to solve specific needs.
The following diagram shows a three-tier data warehouse architecture. The data
warehouse structure can be modified at each level to accommodate more components of
the same kind, such as an increase in the number of data marts to support additional
functional units in the organization.
Data Warehouse Infrastructure Diagram
The main components of a data warehouse architecture are:
Data warehouse architectures should focus on analytical processing. Transactional
processing should be done separately using a different database. A transactional
processing database should be a data source for the more extensive data warehouse.
The data warehouse should easily support tools and applications such as reporting,
data mining, and application development tools.
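One way to picture a transactional database acting as a source for the warehouse is a scheduled job that copies only the rows the warehouse has not seen yet. The following is a minimal sketch under stated assumptions: two in-memory SQLite databases stand in for the separate systems, the orders and fact_orders tables are hypothetical, and tracking the highest loaded id is just one simple way to load incrementally, not a method prescribed by the text.

```python
import sqlite3


def make_oltp():
    """Hypothetical operational (transactional) database with a few orders."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
    )
    db.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 100), ("south", 250)],
    )
    db.commit()
    return db


def make_warehouse():
    """Separate analytical database that receives copies of the orders."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE fact_orders (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER)"
    )
    db.commit()
    return db


def sync(oltp, dw):
    """Copy rows the warehouse has not seen yet; return how many were copied."""
    (last_id,) = dw.execute("SELECT COALESCE(MAX(id), 0) FROM fact_orders").fetchone()
    new_rows = oltp.execute(
        "SELECT id, region, amount FROM orders WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    dw.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", new_rows)
    dw.commit()
    return len(new_rows)


oltp, dw = make_oltp(), make_warehouse()
first = sync(oltp, dw)  # initial load copies both existing orders

# A new transaction arrives in the operational system.
oltp.execute("INSERT INTO orders (region, amount) VALUES ('north', 75)")
oltp.commit()
second = sync(oltp, dw)  # the next run picks up only the new row

# Analytical queries run against the warehouse, not the operational database.
total_north = dw.execute(
    "SELECT SUM(amount) FROM fact_orders WHERE region = 'north'"
).fetchone()[0]
print(first, second, total_north)
```

Keeping the two databases apart means reporting queries never compete with day-to-day transactions for the same resources.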
OLAP solutions can be leveraged for either architectural solution. OLAP allows
multidimensional analysis of data warehouse data, information, and knowledge to
support complex modeling and trend analysis of the data warehouse solution.
Business intelligence (BI) and decision making across all functional areas in the
organization that use data warehouses can leverage OLAP for fast, effective,
and responsive analytics.
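Multidimensional analysis can be illustrated with two basic OLAP operations, roll-up and slice, over a small fact table. This is a plain-Python sketch for intuition only; the fact rows, the dimension names, and the roll_up and slice_facts helpers are hypothetical, and a real OLAP server would precompute and index such aggregates rather than scan a list.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, revenue).
FACTS = [
    ("north", "widgets", "Q1", 100),
    ("north", "widgets", "Q2", 120),
    ("north", "gadgets", "Q1", 60),
    ("south", "widgets", "Q1", 90),
    ("south", "gadgets", "Q2", 40),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, by):
    """Sum revenue grouped by the chosen dimensions (an OLAP roll-up)."""
    positions = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)


def slice_facts(facts, dimension, value):
    """Keep only facts where one dimension has a fixed value (an OLAP slice)."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


by_region = roll_up(FACTS, by=("region",))
by_region_quarter = roll_up(FACTS, by=("region", "quarter"))
q1_by_product = roll_up(slice_facts(FACTS, "quarter", "Q1"), by=("product",))
print(by_region, by_region_quarter, q1_by_product)
```

The same facts answer different questions depending on which dimensions are kept, which is what makes this kind of analysis useful for trend and comparison work.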