LESSON 1_Data Management

FUNDAMENTALS OF DATA MANAGEMENT

DATA MANAGEMENT
- Is the practice of ingesting, processing, securing and storing an organization's data, where it is then utilized for strategic decision-making to improve business outcomes.
- It refers to the development and execution of architecture, policies, practices and procedures in order to manage the information lifecycle of an enterprise in an effective manner.

DATA LIFE CYCLE MANAGEMENT
- Process that helps organizations manage the flow of data throughout its lifecycle – from initial creation through to destruction.

5 STAGES OF DATA LIFE CYCLE

DATA CREATION – the first phase of the data lifecycle.
- Data can be in many forms, e.g., PDF, image, Word document, SQL database data.
- Data is typically created by an organization in one of 3 ways:
  - Data Acquisition – acquiring already existing data which has been produced outside the organization.
  - Data Entry – manual entry of new data by personnel within the organization.
  - Data Capture – capture of data generated by devices used in various processes in the organization.

STORAGE – once the data has been created within the organization, it needs to be stored and protected, with the appropriate level of security applied. A robust backup and recovery process should also be implemented to ensure retention of data during the lifecycle.
- Security
- Backup and Recovery

USAGE – during the usage phase of the data lifecycle, data is used to support activities in the organization. Data can be viewed, processed, modified and saved. An audit trail should be maintained for all critical data to ensure that all modifications of data are fully traceable. Data may also be made available to share with others outside the organization.
- Data viewing, processing, modification and saving
- Available for use

ARCHIVAL – is the copying of data to an environment where it is stored in case it is needed again in an active production environment, and the removal of this data from all active production environments.
- Is simply a place where data is stored, but where no maintenance or general usage occurs. If necessary, the data can be restored to an environment where it can be used.
- Data archived and protected
- Available for use

DESTRUCTION – the volume of archived data inevitably grows, and while you may want to save all your data forever, that's not feasible. Storage costs and compliance issues exert pressure to destroy data you no longer need.
- The removal of every copy of a data item from an organization. It is typically done from an archive storage location.
- The challenge of this phase of the lifecycle is to ensure that the data has been properly destroyed. Before destroying data, it is important to ensure that the data items have exceeded their required regulatory retention period.
- Purging

TYPES OF DATA MANAGEMENT

DATA INTEGRATION
- Combines data from different systems to create a unified data set.
- Makes data more freely available and easier for systems and users to consume and process.
- The goal of integration is to pull data fragments together and offer a SINGLE CUSTOMER VIEW (SCV).
- When you integrate data, its quality improves because you can compare data for accuracy and relevance.
- Integration allows you to track users throughout the entire customer journey.

DATA MODELING
- The process of analyzing and defining all the different data your business collects and produces, as well as the relationships between those bits of data.
- Makes it easier for teams to see how data flows through your systems and business processes.

ER (Entity-Relationship) Model
- This model is based on the notion of real-world entities and relationships among them.
- It creates an entity set, relationship set, general attributes, and constraints.
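
As a rough illustration of the ER model terms in this lesson (entity set, relationship set, attributes, constraints), the sketch below models two entities and one relationship in Python. The Customer and Order entities, the "places" relationship, and every field name are hypothetical examples chosen for illustration; they do not come from the lesson.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Customer:
    """Entity: a real-world customer. customer_id is the key attribute."""
    customer_id: int
    name: str


@dataclass(frozen=True)
class Order:
    """Entity: a real-world order. order_id is the key attribute."""
    order_id: int
    total: float


@dataclass
class Model:
    """Holds the entity sets and one relationship set (hypothetical example)."""
    customers: dict = field(default_factory=dict)  # entity set: customer_id -> Customer
    orders: dict = field(default_factory=dict)     # entity set: order_id -> Order
    places: dict = field(default_factory=dict)     # relationship set: order_id -> customer_id

    def add_customer(self, customer):
        # Constraint: key attributes must be unique within an entity set.
        if customer.customer_id in self.customers:
            raise ValueError("duplicate customer_id")
        self.customers[customer.customer_id] = customer

    def place_order(self, customer_id, order):
        # Constraint: an order must be placed by exactly one existing customer.
        if customer_id not in self.customers:
            raise ValueError("unknown customer")
        if order.order_id in self.orders:
            raise ValueError("duplicate order_id")
        self.orders[order.order_id] = order
        self.places[order.order_id] = customer_id

    def orders_of(self, customer_id):
        # Follow the relationship from a customer to that customer's orders.
        return [self.orders[oid] for oid, cid in self.places.items()
                if cid == customer_id]
```

Storing the relationship as a mapping from order to customer is one simple way to enforce "each order has exactly one customer"; a many-to-many relationship would need a set of pairs instead.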
Computer (Data Management)
LESSON 1: FUNDAMENTALS OF DATA MANAGEMENT
1st Semester | S.Y. 2023-2024
Professor: Mr. Shernan Mabborang
Transcribed by: Athesha Sarmiento | BSMLS 1F

DATA STORAGE
- Is the practice of recording and preserving data for the future.
- Electronic storage is more common than paper document storage because of the increased volume of data.

Companies might use magnetic tape, optical discs, or mechanical media to store data. Other options include:
- Physical file storage
- Block storage in storage area networks (SANs)
- Object storage, which stores objects like videos from Facebook or files from Dropbox.

DATA CATALOGS
- Database record maintenance

DATA SECURITY
- Sets guardrails in place to protect digital information from unauthorized access, corruption, or theft.

Data Security Includes:
- Hardware
- Software
- Storage
- Backups
- User devices
- Access
- Admin Control
- Data governance

Ex. CAPTCHAs are popular ways to deter hackers from entering malicious code into web forms.

DATA ARCHITECTURE
- Is a discipline that documents an organization's data assets, maps how data flows through its systems and provides a blueprint for managing data.
- The goal is to ensure that the data is managed properly and meets business needs for information.

DATA GOVERNANCE
- Is a set of standards and business processes which ensure that data assets are leveraged effectively within an organization.
- Effective data governance creates consistent and trustworthy data. It also helps keep data secure.

This generally includes:
- Data quality
- Data access
- Usability
- Data security

They will be responsible for things such as:

Ex. Hadoop Distributed File System (HDFS) to store and process data

DATA LAKES
- Allows you to store relational data, like operational databases and data from line-of-business applications, and non-relational data, like data from mobile apps and social media.
- Also gives you the ability to understand what data is in the lake through crawling, cataloging, and indexing of data.
- Ingests raw data from those same functions, removing dependencies and eliminating single owners of a given dataset.

A data lake can include:
- Structured data from relational databases (rows and columns)
- Semi-structured data (CSV, logs, XML, JSON)
- Unstructured data (emails, documents, PDFs)
- Binary data (images, audio, video)

IMPROVED COMPLIANCE AND SECURITY
- Governance councils assist in placing guardrails to protect businesses from fines and negative publicity that can occur due to noncompliance with government regulations and policies. Missteps here can be costly from both a brand and financial perspective.
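
One concrete compliance guardrail mentioned in this lesson is the rule from the destruction stage: data should only be destroyed once it has exceeded its required retention period. The sketch below shows one way such a check might look in Python. The record types and retention lengths in RETENTION_YEARS are made-up placeholders, not real regulatory values, and the function names are hypothetical.

```python
from datetime import date

# Hypothetical retention periods in years; real values depend on the
# regulations and policies that apply to the organization.
RETENTION_YEARS = {"invoice": 7, "support_ticket": 2}


def retention_end(created, years):
    """Return the last day of the retention period for a record."""
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # A record created on Feb 29 ends on Feb 28 in a non-leap year.
        return created.replace(year=created.year + years, day=28)


def can_destroy(record_type, created, today):
    """True only if the record has exceeded its retention period."""
    years = RETENTION_YEARS.get(record_type)
    if years is None:
        # Unknown record type: keep the data until a policy is defined.
        return False
    return today > retention_end(created, years)
```

Refusing to destroy records of an unknown type is a deliberately cautious choice here: keeping data too long is usually easier to correct than destroying it too early.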