Met A Data
Topic 4.7, page 62; Chapter 9, pages 139-148. Data Warehousing in the Real World, Sam Anahory and Dennis Murray.
META-DATA
Meta: from Greek, meaning "after", "beyond", "with", "adjacent", "self".
Data: a piece of information; an assumption or premise from which inferences may be drawn.
DEFINITIONS
Structured data about data. Increasingly, this term refers to any data used to aid the identification, description and location of networked electronic resources. swdb.berkeley.edu/glossary.html (Statewide Database, University of California)
Data about the content, quality, condition, and other characteristics of data. www.fgdc.gov/metadata/csdgm/glossary.html (Federal Geographic Data Committee)
A set of data that describes and gives information about other data. (Wikipedia)
Metadata is generally defined as 'descriptive information about information' and refers to any data used to support the identification, description and location of an information object, such as a document. Simply put, metadata is the collection of labels that describe a piece of information. www.namahn.com/resources/documents/note-metadata.pdf (Human-centered Design Consultancy in Belgium)
Metadata is data about data. (Anahory)
Purpose of Meta-Data
Metadata is classified by the purpose it serves:
Semantic analysis
Data explaining the content of the information object: title, subject (or subject categories, taxonomies, ontologies), keywords, intended audience, content rating, and so forth.
Administration
Data used for managing the information object: author(s) of the resource, reviewer(s), version number, date to be reviewed, property rights, and so forth.
Properties of Metadata
Metadata can be generated automatically or created by humans.
It can be queried by a user, or used by software agents in service of a user.
It can be associated with a resource in the following ways:
  Embedded directly in the information object, e.g. HTML meta tags in a web page.
  As a separate entity linked to or from the object it describes.
  Stored in a remote database. The record in the database may have been created directly within the database or extracted from another source, such as a web page.
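To illustrate the embedded case, here is a minimal Python sketch that extracts metadata from HTML meta tags so it could be copied into a separate store. The sample page, the class name MetaTagParser and the function extract_metadata are hypothetical and not part of the original material.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip tags such as <meta charset="..."> that carry no name/content pair.
        if name and content is not None:
            self.metadata[name.lower()] = content


def extract_metadata(html_text):
    """Return the embedded metadata of a page as a plain dictionary."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.metadata


# Made-up page used only for illustration.
SAMPLE_PAGE = """<html><head>
<title>Quarterly sales</title>
<meta name="author" content="J. Doe">
<meta name="keywords" content="sales, warehouse">
<meta charset="utf-8">
</head><body>...</body></html>"""

page_metadata = extract_metadata(SAMPLE_PAGE)
print(page_metadata)
```

The resulting dictionary is the kind of record that could then be stored in a remote database, which corresponds to the third way of associating metadata with a resource.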
Meta-Data in Warehousing
A data warehouse is designed to store and manage data, whereas Business Intelligence (BI) focuses on using that data for reporting and analysis. The purpose of a data warehouse is to house standardized, structured, consistent, integrated, correct, cleansed and timely data, extracted from the various operational systems in an organization.
Ralph Kimball* describes metadata as the DNA of the data warehouse, because metadata defines the elements of the data warehouse and how they work together.
* Ralph Kimball received his Ph.D. in electrical engineering (specializing in man-machine systems) from Stanford University in 1972.
Categories of meta-data
TECHNICAL
Technical metadata defines the data model and the way it is displayed to users, together with the reports, schedules, distribution lists and user security rights.
(Tables, fields, data types, indexes and partitions in the relational engine; databases, dimensions, measures and data mining models.)
BUSINESS
Business metadata tells you what data you have, where it comes from, what it means and how it relates to other data in the data warehouse.
PROCESS
Process metadata describes the results of various operations in the data warehouse.
(Includes start time, end time, CPU seconds used, disk reads, disk writes and rows processed.)
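As a rough illustration of process metadata, the following Python sketch records start time, end time, CPU seconds and rows processed for one operation. The names ProcessRecord and record_process and the "load_sales" operation are assumptions made for this example; disk reads and writes are left out because the standard library does not expose them portably.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class ProcessRecord:
    """One row of process metadata (hypothetical structure)."""
    operation: str
    start_time: float = 0.0
    end_time: float = 0.0
    cpu_seconds: float = 0.0
    rows_processed: int = 0


@contextmanager
def record_process(operation, log):
    """Time the enclosed block and append its record to `log`."""
    record = ProcessRecord(operation=operation, start_time=time.time())
    cpu_start = time.process_time()
    try:
        yield record
    finally:
        record.end_time = time.time()
        record.cpu_seconds = time.process_time() - cpu_start
        log.append(record)


process_log = []
with record_process("load_sales", process_log) as rec:
    rows = [(i, i * 2) for i in range(1000)]  # stand-in for a real load step
    rec.rows_processed = len(rows)

print(process_log[0])
```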
Destination
  Unique identifier, name, type, table name
Transformations
  Name, language, module, syntax
II. Data Management
Indexes
  Columns: name, type
Constraints
  Name, type, table, columns
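Most relational engines already keep this kind of data-management metadata in their catalog. The sketch below reads it back from an in-memory SQLite database using the PRAGMA table_info, index_list and index_info statements; the sales table and its index are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    " sale_id INTEGER PRIMARY KEY,"
    " region TEXT NOT NULL,"
    " amount REAL)"
)
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")

# table_info rows: (cid, name, type, notnull, dflt_value, pk)
table_info = conn.execute("PRAGMA table_info(sales)").fetchall()
columns = [(row[1], row[2]) for row in table_info]
not_null_columns = [row[1] for row in table_info if row[3]]

# index_list rows start with (seq, name, unique, ...);
# index_info rows are (seqno, cid, name).
indexes = {}
for index_row in conn.execute("PRAGMA index_list(sales)").fetchall():
    index_name = index_row[1]
    indexed = conn.execute(f"PRAGMA index_info({index_name})").fetchall()
    indexes[index_name] = [info_row[2] for info_row in indexed]

print(columns)
print(indexes)
print(not_null_columns)
```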
III. Query Generation
Query
  Tables accessed
  Columns accessed: name, reference identifier
  Aggregate functions used: column name, aggregate function
  Sort criteria: column name, sort direction
  Syntax
  Resources: disk read / write, CPU, memory
  User
Query-generation metadata is used to direct a query to the most appropriate data source and to record information about each query executed.
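One way to picture "directing a query to the most appropriate data source" is a lookup over table metadata that picks the smallest table able to answer the request. This is only a sketch: the catalog entries, row counts and the choose_source function are hypothetical, and it assumes the summary tables hold pre-aggregated amounts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceInfo:
    """Metadata about one candidate table (hypothetical structure)."""
    name: str
    columns: frozenset
    row_count: int


# Made-up catalog: one detail table and two summary tables.
CATALOG = [
    SourceInfo("sales_detail",
               frozenset({"sale_id", "day", "region", "product", "amount"}),
               5_000_000),
    SourceInfo("sales_by_region_month",
               frozenset({"month", "region", "amount"}),
               1_200),
    SourceInfo("sales_by_product_day",
               frozenset({"day", "product", "amount"}),
               90_000),
]


def choose_source(columns_needed, catalog=CATALOG):
    """Pick the smallest table that contains every requested column."""
    needed = frozenset(columns_needed)
    candidates = [source for source in catalog if needed <= source.columns]
    if not candidates:
        raise LookupError(f"no table provides columns {sorted(needed)}")
    return min(candidates, key=lambda source: source.row_count)


print(choose_source({"region", "amount"}).name)
print(choose_source({"sale_id", "amount"}).name)
```

A real query manager would also consider restrictions, join criteria and the grain of each summary table; row count alone is a deliberately simple cost measure.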
Live Example
NTFS Architecture
MFT Records
Small Files (<900B) are contained completely in the MFT entry.
Folders contain index data. Small folders reside entirely within the MFT record. Larger folders have an index structure that points to other data blocks; they use a B-tree structure.
Offset  Size (bytes)
0x08    8
0x10    8
0x18    8
0x20    4
0x24    4
0x28    4
0x2C    4
0x30    4
The attribute is 0x00000060 (96) bytes long. The attribute is resident (flag 0x00). Its contents are 0x00000048 (72) bytes long and start at offset 0x0018 (24).
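The values above can be read out of an attribute header with a few lines of Python. The sketch assumes the commonly documented little-endian layout of a resident NTFS attribute header (type, length, non-resident flag, name length, name offset, flags, attribute id, content length, content offset); that layout is not spelled out in this presentation, and the sample bytes, including the type value 0x10, are constructed for illustration rather than taken from a real volume.

```python
import struct

# Assumed layout of a resident attribute header (little-endian):
# 0x00 type (4), 0x04 total length (4), 0x08 non-resident flag (1),
# 0x09 name length (1), 0x0A name offset (2), 0x0C flags (2),
# 0x0E attribute id (2), 0x10 content length (4), 0x14 content offset (2)
RESIDENT_HEADER = struct.Struct("<IIBBHHHIH")


def parse_resident_attribute(buf, offset=0):
    """Decode the header fields of one attribute starting at `offset`."""
    (attr_type, total_length, non_resident, name_length, name_offset,
     flags, attr_id, content_length, content_offset) = \
        RESIDENT_HEADER.unpack_from(buf, offset)
    return {
        "type": attr_type,
        "total_length": total_length,
        "resident": non_resident == 0x00,
        "content_length": content_length,
        "content_offset": content_offset,
    }


# Synthetic attribute matching the numbers on the slide.
sample = RESIDENT_HEADER.pack(0x10, 0x60, 0x00, 0, 0x18, 0, 0, 0x48, 0x18)
sample += b"\x00\x00"   # pad the header out to offset 0x18
sample += bytes(0x48)   # placeholder content, all zeros

header = parse_resident_attribute(sample)
print(header)
```

The header (0x18 bytes) plus the contents (0x48 bytes) add up to the stated attribute length of 0x60 bytes.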
THANK YOU