Chapter 3

Uploaded by

Hayelom
Chapter 3 - Database Systems and Big Data

1. Data Fundamentals

Definition:
Data represents raw facts and figures that are processed into meaningful information.

Hierarchy of Data:

1. Bit: Smallest unit of data (0 or 1).
2. Byte: Group of 8 bits, typically enough to represent one character.
3. Field: Single piece of data (e.g., name, age).
4. Record: A collection of related fields.
5. Table: A set of related records.
6. Database: A collection of related tables.
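
The hierarchy above can be sketched with ordinary Python values. This is only an illustration: the sample names and the use of dictionaries and lists are assumptions, not how a real database stores data.

```python
# Hypothetical sample values; only the hierarchy itself comes from the notes.
bits = "01000001"                      # eight bits
byte = int(bits, 2)                    # one byte, numeric value 65
char = chr(byte)                       # the same byte read as the character "A"

field = ("name", "Sara")               # field: a single named piece of data
record = {"name": "Sara", "age": 21}   # record: a collection of related fields
table = [record, {"name": "Abel", "age": 23}]   # table: a set of related records
database = {"students": table, "courses": []}   # database: a collection of related tables

print(char, len(table), list(database))
```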

Entities, Attributes, and Keys:

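• Entity: A thing or object in the database (e.g., student, product).
• Attribute: A property or characteristic of an entity (e.g., name, price).
• Key: An attribute (or set of attributes) used to identify records or link tables. A primary key uniquely identifies each record in its table; a foreign key refers to the primary key of another table.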
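
A minimal sketch of how keys protect data, using the SQLite engine bundled with Python as a stand-in for any relational database. The `student` and `enrollment` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE enrollment (
        enrollment_id INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO student (student_id, name) VALUES (1, 'Sara')")
conn.execute("INSERT INTO enrollment (student_id, course) VALUES (1, 'Databases')")

def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# A second student with the same primary key is rejected.
duplicate_key_ok = try_insert("INSERT INTO student (student_id, name) VALUES (1, 'Abel')")
# An enrollment that points at a non-existent student is rejected.
orphan_row_ok = try_insert("INSERT INTO enrollment (student_id, course) VALUES (99, 'Networks')")
print(duplicate_key_ok, orphan_row_ok)
```

Both inserts fail, which is one way the relational model keeps records unique and references valid.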

2. The Database Approach

Traditional File System:

• Separate files for each application.
• Problems: Redundancy, inconsistency, and lack of integration.

Database Approach:

• Centralized data storage, shared by multiple applications.
• Benefits: Reduced redundancy, improved consistency, and better integration.
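
The contrast between the two approaches can be illustrated with a small sketch. The billing and shipping applications, the customer record, and the city names are all hypothetical; plain dictionaries stand in for files and for the shared database.

```python
# File approach: each application keeps its own copy of the customer data.
billing_file = {"C1": {"name": "Sara", "city": "Oldtown"}}
shipping_file = {"C1": {"name": "Sara", "city": "Oldtown"}}

billing_file["C1"]["city"] = "Newtown"   # only billing learns about the move
inconsistent = billing_file["C1"]["city"] != shipping_file["C1"]["city"]

# Database approach: one shared copy that both applications read.
customers = {"C1": {"name": "Sara", "city": "Oldtown"}}

def billing_city(customer_id):
    return customers[customer_id]["city"]

def shipping_city(customer_id):
    return customers[customer_id]["city"]

customers["C1"]["city"] = "Newtown"      # a single update reaches both applications
consistent = billing_city("C1") == shipping_city("C1")

print(inconsistent, consistent)
```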

3. Data Modeling and Database Characteristics

Data Modeling:

• Creating a visual representation of data structures.
• Tools: Entity-Relationship (ER) diagrams.
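
An ER diagram is a drawing, but the model it describes can also be written down in code. The sketch below is one possible textual rendering, assuming two hypothetical entities (`Student`, `Course`) and one relationship (`Enrollment`).

```python
from dataclasses import dataclass

@dataclass
class Student:          # entity
    student_id: int     # key attribute
    name: str           # attribute

@dataclass
class Course:           # entity
    course_id: int      # key attribute
    title: str          # attribute

@dataclass
class Enrollment:       # relationship between Student and Course
    student: Student
    course: Course

enrollment = Enrollment(Student(1, "Sara"), Course(10, "Databases"))
print(enrollment.student.name, "is enrolled in", enrollment.course.title)
```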

Database Characteristics:

• Integrity: Ensures accuracy and consistency of data.
• Scalability: Ability to grow with increased data.
• Security: Controlled access to data.

4. Relational Database Model

Definition:
A database model that organizes data into tables (relations).

Key Features:

1. Tables: Rows (records) and columns (fields).
2. Relationships: Connections between tables using keys.
3. SQL (Structured Query Language): Used to interact with the database.
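
The three features above can be seen together in a short sketch. It uses SQLite through Python's standard `sqlite3` module rather than any of the systems named in these notes, and the `product` and `sale` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(product_id),
        quantity INTEGER
    );
    INSERT INTO product VALUES (1, 'Pen', 2.0), (2, 'Notebook', 5.0);
    INSERT INTO sale VALUES (1, 1, 10), (2, 2, 3), (3, 1, 5);
""")

# The relationship is followed by joining on the shared key column.
rows = conn.execute("""
    SELECT p.name, s.quantity
    FROM sale AS s
    JOIN product AS p ON p.product_id = s.product_id
    ORDER BY s.sale_id
""").fetchall()
print(rows)
```

Each sale row stores only a `product_id`; the product name is looked up through the key instead of being repeated in every sale.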

Relational Database Management Systems (RDBMS):

• Examples: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.

5. Data Cleansing

Definition:
The process of detecting and correcting errors in data to improve quality.

Common Steps:

1. Remove duplicates.
2. Standardize formats.
3. Correct invalid entries.
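
The steps above can be sketched in plain Python. The sample rows, the choice of email as the duplicate key, and the 0 to 120 age range are assumptions made for illustration. Note that the sketch standardizes before removing duplicates, because two rows may only look identical once their formats match.

```python
raw = [
    {"name": " sara ", "email": "SARA@EXAMPLE.COM", "age": "21"},
    {"name": "Sara", "email": "sara@example.com", "age": "21"},   # duplicate once standardized
    {"name": "abel", "email": "abel@example.com", "age": "-4"},   # invalid age
]

def standardize(row):
    """Trim whitespace, normalize letter case, and convert age to an integer."""
    return {
        "name": row["name"].strip().title(),
        "email": row["email"].strip().lower(),
        "age": int(row["age"]),
    }

def correct(row):
    """Replace an impossible age with None so it can be reviewed later."""
    if not 0 <= row["age"] <= 120:
        row = {**row, "age": None}
    return row

cleaned, seen = [], set()
for row in map(standardize, raw):
    row = correct(row)
    if row["email"] not in seen:      # remove duplicates, keyed on email
        seen.add(row["email"])
        cleaned.append(row)

print(cleaned)
```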

6. Database Activities and Administration

Activities:

1. Data Entry: Adding new data to the database.
2. Querying: Retrieving specific data using SQL.
3. Reporting: Generating summaries and insights.
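
A minimal sketch of the three activities against a hypothetical `orders` table, again using SQLite from Python's standard library as an assumed stand-in for a production RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")

# Data entry: add new rows, passing values as parameters rather than building SQL strings.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("Sara", 120.0), ("Abel", 40.0), ("Sara", 60.0)],
)

# Querying: retrieve specific rows.
large_orders = conn.execute(
    "SELECT customer, amount FROM orders WHERE amount > ? ORDER BY order_id", (50,)
).fetchall()

# Reporting: summarize orders per customer.
report = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(large_orders)
print(report)
```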

Database Administration:

• Tasks: Backup, recovery, performance tuning, security management.
• Role: Database administrators (DBAs) manage databases to ensure efficiency and security.
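
Backup and recovery, the first two tasks listed, can be sketched with the `Connection.backup` method of Python's `sqlite3` module. The `account` table is hypothetical, and both databases live in memory here only to keep the example self-contained; a real backup would target a file or another machine.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance REAL)")
source.execute("INSERT INTO account VALUES (1, 100.0)")
source.commit()

backup = sqlite3.connect(":memory:")   # assumed backup target
source.backup(backup)                  # copy the whole database into the target

source.execute("DELETE FROM account")  # simulate accidental data loss
source.commit()

# Recovery: the backup copy still holds the row that was lost.
remaining = source.execute("SELECT COUNT(*) FROM account").fetchone()[0]
restored = backup.execute("SELECT account_id, balance FROM account").fetchall()
print(remaining, restored)
```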

7. Using Databases with Other Software

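• Integration with applications such as enterprise resource planning (ERP) systems, customer relationship management (CRM) systems, and web platforms.
• APIs (application programming interfaces) let other software exchange data with the database without accessing its tables directly.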
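
As a rough sketch of the API idea, the function below plays the role of an API handler: the caller asks for a customer by ID and receives JSON, never touching SQL. The function name `get_customer`, the `customer` table, and the JSON shape are all hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row   # rows can then be converted to dictionaries
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Sara')")

def get_customer(customer_id):
    """Hypothetical API handler: return one customer as a JSON string."""
    row = conn.execute(
        "SELECT customer_id, name FROM customer WHERE customer_id = ?", (customer_id,)
    ).fetchone()
    return json.dumps(dict(row) if row else {"error": "not found"})

print(get_customer(1))
print(get_customer(2))
```

In a real integration this function would sit behind a web framework or service layer, which is outside the scope of these notes.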

8. Big Data

Characteristics of Big Data:

1. Volume: Massive amounts of data.
2. Velocity: Speed at which data is generated and processed.
3. Variety: Different types of data (structured, unstructured, semi-structured).
4. Veracity: Accuracy and reliability of the data.
5. Value: Usefulness of the insights that can be extracted from the data.

Sources of Big Data:

• Social media, IoT devices, sensors, business transactions.

Uses of Big Data:

1. Predictive analytics.
2. Customer behavior analysis.
3. Fraud detection.
4. Supply chain optimization.

Challenges:

• Data storage, processing speed, privacy concerns, skill gaps.

9. Technologies for Big Data Processing

Data Warehouses, Data Marts, and Data Lakes:

• Data Warehouse: Centralized repository for structured data.
• Data Mart: A subset of a data warehouse for specific departments.
• Data Lake: Stores raw, unprocessed data for flexibility in analysis.
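
The relationship between the three stores can be sketched with in-memory lists. The event format, the department names, and the two-column warehouse schema are assumptions for illustration; real systems work at a far larger scale and with dedicated tooling.

```python
import json

# Data lake: raw events kept exactly as they arrived (hypothetical JSON strings).
data_lake = [
    '{"dept": "sales", "amount": "120.50", "note": "walk-in"}',
    '{"dept": "hr", "amount": "30", "extra": {"source": "form"}}',
    '{"dept": "sales", "amount": "79.50"}',
]

# Data warehouse: the same events parsed into one fixed, structured schema.
data_warehouse = [
    {"dept": event["dept"], "amount": float(event["amount"])}
    for event in map(json.loads, data_lake)
]

# Data mart: the subset of the warehouse that one department needs.
sales_mart = [row for row in data_warehouse if row["dept"] == "sales"]

sales_total = sum(row["amount"] for row in sales_mart)
print(len(data_lake), len(data_warehouse), len(sales_mart), sales_total)
```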

NoSQL Databases:

• Handle unstructured and semi-structured data.
• Examples: MongoDB, Cassandra, DynamoDB.
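
To show what semi-structured data looks like, the sketch below keeps documents with different fields in one collection. It is a toy in-memory model, not the API of MongoDB, Cassandra, or DynamoDB; the `find` helper and the product documents are hypothetical.

```python
# A hypothetical in-memory "collection" of documents. Real document stores add
# persistence, indexing, and distribution on top of this idea.
products = [
    {"_id": 1, "name": "Pen", "price": 2.0},
    {"_id": 2, "name": "Laptop", "price": 900.0, "specs": {"ram_gb": 16}},
    {"_id": 3, "name": "E-book", "formats": ["pdf", "epub"]},   # no price field
]

def find(collection, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

pens = find(products, name="Pen")
with_specs = [doc["name"] for doc in products if "specs" in doc]
print(pens, with_specs)
```

Unlike rows in a relational table, the three documents do not share one fixed set of columns.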

Hadoop:

• Open-source framework for distributed storage and processing of big data.
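
Hadoop's original processing model is MapReduce, which these notes do not describe in detail. The sketch below imitates its map, shuffle, and reduce phases on a single machine with a word count. It does not use any Hadoop API, and the input blocks are hypothetical.

```python
from collections import defaultdict

# Hypothetical input, already split into blocks as a distributed file system might store it.
blocks = ["big data big", "data lake", "big lake"]

def map_phase(block):
    """Emit a (word, 1) pair for every word in one block."""
    return [(word, 1) for word in block.split()]

def shuffle(pairs):
    """Group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

mapped = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

In a real cluster each block would be mapped on the node that stores it, so the work moves to the data rather than the data moving to the work.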

In-Memory Databases:

• Store data in RAM for faster processing.
• Examples: SAP HANA, Redis.
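
A small sketch of the idea using SQLite's in-memory mode, which is an assumed stand-in and not one of the products named above. The `session` table is hypothetical. The example also shows the trade-off of keeping data only in RAM: once the connection closes, the data is gone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")      # the database lives entirely in RAM
conn.execute("CREATE TABLE session (token TEXT PRIMARY KEY, user TEXT)")
conn.execute("INSERT INTO session VALUES ('abc123', 'sara')")
user = conn.execute("SELECT user FROM session WHERE token = 'abc123'").fetchone()[0]
conn.close()                            # closing discards the in-memory data

fresh = sqlite3.connect(":memory:")     # a new, empty in-memory database
tables = fresh.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(user, tables)
```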

10. Summary Table

Topic                     Key Points
Hierarchy of Data         Bit → Byte → Field → Record → Table → Database
Database Models           Relational (tables, keys, SQL)
Big Data Characteristics  Volume, Velocity, Variety, Veracity, Value
Technologies              Data Lakes, NoSQL, Hadoop, In-Memory Databases

Discussion Questions

1. How does the relational model ensure data integrity?
2. Discuss the differences between data warehouses and data lakes.
3. What are the key challenges in managing big data effectively?
