DDB UNIT - 5

Uploaded by

praneet trimukhe

Distributed Object Database Management

Fundamental Object Concepts and Object Models


An Object DBMS (Database Management System) uses "objects" as the primary means for
modeling and accessing data. One standardization effort is the Object Data Management Group
(ODMG) model, which includes an Object Model, an Object Definition Language (ODL), and an
Object Query Language (OQL). Alternatively, there have been proposals to extend the relational
model, such as SQL3. Substantial efforts have also focused on the theoretical foundations of
object models.
1. Object:
Core Concept of Object DBMS:
Object DBMS is centered around the concept of an object, which represents a real-world entity in the
modeled system.
Object Representation:
Objects are represented as a triple:
• OID (Object Identifier): A unique, invariant identifier for each object.
• State: Represents the current condition or data of the object.
• Interface: Defines the behavior or actions of the object.
Object Identifier (OID):
• OIDs uniquely and permanently distinguish objects both logically and physically, irrespective
of their state.
• OIDs facilitate referential object sharing, enabling composite and complex structures.
• Models may vary:
o Some use OID equality as the only comparison.
o Others differentiate between objects being identical (same OID) and equal (same
state).
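The difference between identical and equal objects can be sketched in a few lines of Python. This is an illustration only: the `DbObject` class, the `identical` and `equal` helpers, and the counter used to hand out OIDs are assumptions made for the example, not part of any object model described here.

```python
from dataclasses import dataclass, field
from itertools import count

_next_oid = count(1)  # hypothetical OID generator, for illustration only


@dataclass
class DbObject:
    state: dict
    oid: int = field(default_factory=lambda: next(_next_oid))


def identical(a, b):
    """Identity: both references denote the same object (same OID)."""
    return a.oid == b.oid


def equal(a, b):
    """Equality: the states match, even if the objects are distinct."""
    return a.state == b.state


c1 = DbObject({"model": "A4", "year": 2020})
c2 = DbObject({"model": "A4", "year": 2020})
```

Here `c1` and `c2` are equal but not identical, because each received its own OID.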
Object State:
• Defined as either:
o Atomic value (e.g., a single element like an integer).
o Constructed value (e.g., a tuple or set).
• Domains:
o D: System-defined (e.g., integers) or user-defined abstract data types (e.g.,
companies).
o I: Identifiers to name objects.
o A: Attribute names.
Value Definitions:
• Atomic Value: An element of D.
• Tuple Value: [a1: v1; ...; an: vn] where attributes ai are in A and values vi are in D or I.
• Set Value: {v1; ...; vn} where vi is in D or I.
Constructors in Object Models:
• Tuple Constructor: [ ] defines tuple structures.
• Set Constructor: { } defines sets.
• Additional constructors like lists or arrays can enhance modeling capabilities.
Object Identifiers as Values:
• OIDs are treated as values, similar to pointers in programming languages.
Essential Data Constructors:
• Sets and tuples are crucial for database applications, but other types (e.g., lists, arrays) can be
added for extended functionality.
Types and Classes:
• The terms "type" and "class" can cause confusion due to inconsistent usage.
• In this context:
• "Class" refers to the specific object model construct.
• "Type" refers to a domain of objects (e.g., integer, string).
• A class serves as a template for a group of objects, defining a common type for those objects.
• No distinction is made between:
• Primitive system objects (e.g., values).
• Structural objects (e.g., tuples or sets).
• User-defined objects.
• A class specifies a data type by:
• Providing a domain of data with a uniform structure.
• Defining methods applicable to that domain.
• Classes enable abstraction (encapsulation) by:
• Hiding the implementation details of methods.
• Allowing methods to be implemented in general-purpose programming languages.
• A subset of the class structure and methods forms the publicly visible interface of its objects.

Composition (Aggregation):
• Instance Variables:
• Some variables hold simple values (e.g., model, year).
• Others are object-based, like the make attribute, which links to objects of type Manufacturer.
• Composite Objects:
• A type like Car that links to other objects (e.g., Manufacturer) is called a composite type.
• Composite objects allow sharing of linked objects, known as referential sharing.
• Example: If John and Mary both own the same car (c1), it’s due to referential sharing.
• Complex Objects:
• Composite objects allow sharing, but complex objects do not.
• Example: A car (c1) and another car (c2) wouldn’t share the same tires because it’s
unrealistic for tires to belong to multiple vehicles at the same time.
• Representation:
• Relationships between composite objects can be shown using a composition graph (or
hierarchy for complex objects).
• In such graphs, an edge connects a variable of one type to another type when the variable’s
domain is the other type.
• The distinction between composite and complex objects is important, even if it’s not always
emphasized.
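Referential sharing can be illustrated with a small sketch in which objects refer to one another by OID. The `store` dictionary, the OID strings, and the manufacturer name are hypothetical values chosen for the example.

```python
# OID -> object state; a hypothetical in-memory object store
store = {
    "m1": {"name": "Acme Motors"},
    "c1": {"model": "Roadster", "year": 2021, "make": "m1"},
    "p1": {"name": "John", "owns": "c1"},
    "p2": {"name": "Mary", "owns": "c1"},
}


def deref(oid):
    """Follow an OID to the object it identifies."""
    return store[oid]


# Update the car through John's reference.
deref(deref("p1")["owns"])["year"] = 2022

# Mary sees the change because both owners reference the same object c1.
year_seen_by_mary = deref(deref("p2")["owns"])["year"]
```

Because John and Mary hold the same OID rather than copies of the car, there is only one car state to update.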

Subclassing and Inheritance:


• Extensibility in Object Systems:
• User-defined classes can be created using:
o Type constructors.
o Subclassing (creating new classes based on existing ones).
• Subclassing and Specialization:
• A subclass (e.g., A) is a specialization of a superclass (e.g., B) if its interface includes all
features of B plus additional ones.
• Subclassing establishes an "is-a" relationship, allowing substitutability (instances of A can
replace instances of B).
• Single vs. Multiple Subclassing:
• Single Subclassing: A class can only have one superclass (e.g., Smalltalk).
• Multiple Subclassing: A class can have multiple superclasses (e.g., C++).
• Class Structures:
• With single subclassing, the class system forms a tree.
• With multiple subclassing, it forms a graph or semilattice, possibly with multiple roots or a
single root (least specified class).
• Some systems define a most specified type as the bottom of the lattice.
• Inheritance:
• A subclass inherits properties from its superclass and can add its own.
• Inheritance enables code reuse, inheriting:
o Behavior (interface).
o Implementation.
o Or both.
• Can be single inheritance (one superclass) or multiple inheritance (multiple superclasses).
• Class Structures in Databases:
• Define the database schema.
• Help model commonalities and differences among types effectively.
____________________________________________________________________________
OBJECT DISTRIBUTION DESIGN
• Complexity of Distribution Design:
• In the object world, an object is defined by its state and methods.
• Distributing objects involves splitting (fragmenting) these components, which creates
challenges.
• Fragmentation Types:
• State Fragmentation: Raises questions like whether methods are duplicated for each
fragment or also split.
• Class Fragmentation: When attributes (variables) link to other classes, splitting one class
may affect others.
• Method Fragmentation: Simple methods (that don’t call others) and complex methods (that
do) may need different handling.
• Types of Fragmentation:
• Horizontal Fragmentation: Splitting the instances (objects) of a class into groups, analogous
to splitting rows in relational databases.
• Vertical Fragmentation: Splitting the attributes (and methods) of a class, analogous to
splitting columns.
• Hybrid Fragmentation: Combines horizontal and vertical splitting.
• Advanced Partitioning Methods:
• Derived Horizontal Partitioning: Similar to relational databases, splits based on
relationships between objects.
• Associated Horizontal Partitioning: Like derived partitioning but without strict conditions
(no predicate clause).
• Path Partitioning: Another type of partitioning that considers object relationships, discussed
later.
• Simplified Assumptions:
• For simplicity, the explanation assumes a class-based model that doesn’t distinguish between
types and classes.

Horizontal Class Partitioning:


• Analogies to Relational Databases:
• Horizontal fragmentation in object databases is similar to relational databases.
• Primary horizontal fragmentation works the same way in both, but derived fragmentation
differs.
• Derived Horizontal Fragmentation in Object Databases:
• Subclass Partitioning:
o Fragmenting a specialized subclass impacts its general superclass.
o Conflicts may arise when different subclasses impose conflicting fragmentation rules.
o Start fragmenting the most specialized class and propagate effects upward in the class
hierarchy.
• Complex Attribute Partitioning:
o Fragmenting a complex attribute may affect the containing class.
• Method Invocation-Based Partitioning:
o Fragmentation based on how methods are invoked between classes may influence the
design.
• Primary Horizontal Partitioning:
• For classes with simple attributes and methods, partitioning is based on predicates applied to
the class’s attributes.
• Result:
o Subclasses (e.g., C1, C2, ..., Cn) contain objects satisfying specific predicates.
o If predicates are mutually exclusive, the subclasses are disjoint.
o The original class (C) can be converted into an abstract class without instances.
• Handling Overlapping Predicates:
• If predicates are not mutually exclusive, complications arise:
o Some object models allow objects to belong to multiple classes.
o Alternatively, define “overlap classes” for objects that satisfy multiple predicates.
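Primary horizontal partitioning by predicates can be sketched as follows. The `partition` function, the sample `cars`, and the two predicates are assumptions made for illustration; objects that satisfy more than one predicate are collected separately, mirroring the idea of an overlap class.

```python
def partition(objects, predicates):
    """Place each object in every fragment whose predicate it satisfies."""
    fragments = {name: [] for name in predicates}
    overlap = []  # objects satisfying more than one predicate
    for obj in objects:
        hits = [name for name, pred in predicates.items() if pred(obj)]
        for name in hits:
            fragments[name].append(obj)
        if len(hits) > 1:
            overlap.append(obj)
    return fragments, overlap


cars = [{"id": 1, "year": 1995}, {"id": 2, "year": 2010}, {"id": 3, "year": 2022}]
predicates = {
    "C1": lambda c: c["year"] < 2000,
    "C2": lambda c: c["year"] >= 2000,
}
fragments, overlap = partition(cars, predicates)
```

With these mutually exclusive predicates the fragments are disjoint and `overlap` stays empty.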
Vertical Class Partitioning:
• Vertical Fragmentation Overview:
• Involves splitting a class (C) into fragments (C1, C2, ..., Cm), each containing a subset of
attributes and methods.
• The fragments are less defined than the original class.
• Key Challenges:
• Handling the subtyping relationship between the original class, its fragments, and related
superclasses/subclasses.
• Defining relationships among the fragment classes.
• Deciding the location of methods, especially for complex methods.
• Method Partitioning:
• If all methods are simple, partitioning is straightforward.
• For complex methods, their location becomes a significant issue.
• Relational Approaches in Object Databases:
• Affinity-based techniques from relational databases have been adapted for object databases.
• Encapsulation Concerns:
• Vertical fragmentation disrupts encapsulation, raising doubts about its effectiveness and
suitability in object database management systems (DBMSs).
Path Partitioning
• Clusters all objects forming a composite object into a partition for easier access.
• Represents a composite object as a structural index (hierarchy of nodes) containing references
(OIDs) to component objects.
• The structural index eliminates the need to traverse the class composition hierarchy.
2. Class Partitioning Algorithms
• Aim to improve query/application performance by reducing irrelevant data access.
• Class partitioning restructures the object database schema based on application needs.
• Challenges: Class partitioning is complex and NP-complete.
a. Affinity-Based Approach
• Uses affinity among instance variables, attributes, and methods for partitioning.
• Partitions can be horizontal or vertical, depending on data/method requirements.
• Simple and complex variables/methods are treated differently.
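One common way to quantify affinity, borrowed from relational vertical fragmentation, is to sum the access frequencies of the applications that use two attributes or methods together. The sketch below assumes that definition; the `workload` data and the function name are hypothetical.

```python
from itertools import combinations

# Hypothetical workload: each application lists the attributes/methods it
# touches together with how often it runs.
workload = [
    ({"model", "year"}, 30),
    ({"model", "make"}, 10),
    ({"year", "price"}, 5),
]


def affinity_matrix(workload):
    """Map each (sorted) pair of members to the total frequency of joint use."""
    aff = {}
    for members, freq in workload:
        for a, b in combinations(sorted(members), 2):
            aff[(a, b)] = aff.get((a, b), 0) + freq
    return aff


aff = affinity_matrix(workload)
```

Pairs with high affinity, such as `model` and `year` here, are candidates for placement in the same fragment.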
b. Cost-Driven Approach
• Models disk access costs to refine partitioning.
• Combines affinity-based and heuristic approaches for better performance (e.g., hill-climbing
technique).
• Develops structural join index hierarchies for efficient complex object retrieval.
3. Allocation
• Focuses on allocating methods and classes to optimize database performance.
• Four cases of allocation based on method and object locations:
1. Local behavior, local object: No special handling needed.
2. Local behavior, remote object: Either move the object to the method's site or vice
versa.
3. Remote behavior, local object: Reverse of case 2.
4. Remote behavior, remote object: Reverse of case 1.
• Iterative solutions address method and class interdependencies.
4. Replication
• Involves duplicating objects, classes, or collections of objects.
• Decisions depend on the object model and replication requirements (e.g., whether type
specifications should be replicated across sites).
_______________________________________________________________________________
ARCHITECTURAL ISSUES:

• Since data and procedures are encapsulated as objects, the unit of communication between the
clients and the server is an issue. The unit can be a page, an object, or a group of objects.
• Closely related to the above issue is the design decision regarding the functions provided by
the clients and the server. This is especially important since objects are not simply passive
data, and it is necessary to consider the sites where object methods are executed.
• In relational client/server systems, queries are sent from the client to the server, which
processes them and returns result tables—a process called function shipping. In object-
oriented client/server DBMSs, this approach may not be ideal due to the need for data
shipping, where data is moved to clients for navigating complex object structures.
• This creates challenges in managing client cache buffers for data consistency, which is
closely tied to concurrency control since cached data may be shared across multiple clients.
Most commercial object DBMSs use locking mechanisms for concurrency control, raising
architectural questions about where locks should be placed and whether they should also be
cached on clients.
• Additionally, due to the composite nature of objects, prefetching component objects when
an object is requested may be beneficial. Unlike relational systems, which rarely prefetch
data, object DBMSs might leverage prefetching to improve performance.

These considerations are addressed in three parts:

1. Architectural design (including architectural alternatives, buffer management, and cache
consistency).
2. Object management (covering object identifier management, pointer swizzling, and object
migration).
3. Storage management (focusing on object clustering and garbage collection).

Alternative Client-Server Architectures:

1. Object Servers
2. Page Servers

The distinction is based partly on the granularity of the data shipped between clients and
servers, and partly on the functionality provided by each.

• Object Server Architecture:

• Clients request objects from the server, which retrieves and sends them to the client.
• Server handles most DBMS services; clients provide an execution environment and some
object management functionality.
• Object management layer is duplicated on both client and server for executing methods at
both locations.
• Object manager responsibilities include:
o Providing a context for method execution.
o Handling object identifiers (logical, physical, or virtual) and object deletion (explicit
or garbage collection).
o Supporting object clustering and access methods at the server.
o Implementing object caches at both client and server to improve performance by
reducing server accesses.
• Group Object Transfers:
• Servers can send groups of objects instead of individual ones.
• Groups can be contiguous or span multiple pages, based on prefetching hints from the client.
• Group sizes are dynamically adjusted based on group hit rates.
• Handling Updates:
• Updated objects from clients must be installed onto their home pages in the server.
• If the home page is not in the server buffer, an installation read is performed to reload the
page.

Page Server Architecture:


• Data transfer occurs at the page or segment level, not at the object level.
• Servers act as "value-added" storage managers without dealing directly with objects.
• Object processing services are split between clients and servers.

• Performance Comparisons:
• Early studies favored page server architectures, especially when data clustering matches
access patterns.
• Object server architectures perform better when access patterns do not align with clustering.
• More research is needed for a conclusive judgment, particularly for multi-client/multi-server
environments.
• Advantages of Page Servers:
• Simplify DBMS code by maintaining a uniform object representation from disk to the user
interface (across server, client, and disk).
• Updates occur in client caches and are reflected on disk when pages are flushed to the server.
• Fully utilize client workstation power, reducing server bottlenecks.
• Servers handle fewer functions, allowing them to serve more clients efficiently.
• Work distribution between server and clients can be optimized by the query optimizer.
• Exploit operating systems and hardware functionalities, such as pointer swizzling.
• Advantages of Object-aware Servers:
• Apply locking and logging at the object level, enabling multiple clients to access the same
page for small objects.
• Reduce data transmission by filtering objects at the server instead of sending entire pages.
• Challenges in Object DBMSs:

• Mixed workloads involve both query access and object navigation, making server-side
navigation inefficient due to frequent remote procedure calls (RPCs).
• Navigation-heavy workloads favor page servers.

• Code Shipping as a Solution:

• User applications can be shipped to servers for execution, reducing the need for frequent data
transfers (data shipping).
• Requires safe execution environments (e.g., safe languages like in the Thor system) to
maintain DBMS safety and reliability.
• Divided execution increases concerns over cache consistency between client and server.

• Function Shipping:

• Combines client and server resources for query and application execution, accommodating
mixed workloads.
• Supports a move toward peer-to-peer architectures with distributed execution.

• Dynamic Architecture Switching:


• Some systems can switch between architectures. For example, O2 operates as a page server
but switches to object shipping when page conflicts increase.

• Open Research and Challenges:

• Performance studies have not conclusively defined trade-offs between page and object
servers.
• Complexity increases for objects like multimedia documents that span multiple pages.

Client Buffer Management:

Clients in distributed systems can manage three types of buffers: page buffers, object buffers, and
dual page/object buffers, each with distinct characteristics:

1. Page Buffers:
o Manage entire pages of data.
o Suitable for bulk data transfer but may waste buffer space if application access
patterns don't align with disk data clustering.
2. Object Buffers:
o Handle individual objects, allowing fine-grained access and better concurrency.
o Risk of buffer fragmentation due to unused buffer space when objects don't fit evenly.
3. Dual Page/Object Buffers:
o Combine the benefits of both page and object buffers.
o Pages are loaded into the page buffer, and useful objects are copied to the object
buffer when a page is flushed.
o This approach balances efficient access and space utilization.

Server Buffer Management:

Server buffer management in distributed object DBMS is similar to relational systems but with
object-specific optimizations:

• Page Buffer Management: Servers manage a page buffer, sending relevant pages to clients
based on their requests.
• Modified Object Buffer (MOB): Stores updated objects returned by clients. These objects
are installed back into their corresponding data pages, reducing disk I/O by batching
read/write operations.
• Buffer Replacement Policy: Servers often use the "LRU with hate hints" policy, marking
pages cached by clients as "hated" and evicting them first to reduce data duplication.
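The "LRU with hate hints" replacement policy can be sketched as an LRU buffer that evicts hated pages first. This is a minimal illustration; the `HateHintBuffer` class and its method names are assumptions, not the interface of any particular system.

```python
from collections import OrderedDict


class HateHintBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> hated flag, oldest first

    def access(self, page_id):
        """Touch a page at the server, bringing it in if it is absent."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            self.pages[page_id] = False

    def mark_hated(self, page_id):
        """The page is cached at a client, so the server copy is redundant."""
        if page_id in self.pages:
            self.pages[page_id] = True

    def _evict(self):
        # Prefer the least recently used hated page; otherwise plain LRU.
        victim = next((p for p, hated in self.pages.items() if hated), None)
        if victim is None:
            victim = next(iter(self.pages))
        del self.pages[victim]
        return victim


buf = HateHintBuffer(capacity=3)
for p in ("P1", "P2", "P3"):
    buf.access(p)
buf.mark_hated("P2")
buf.access("P4")  # evicts P2 (hated) even though P1 is older
```

Evicting hated pages first keeps the server buffer for pages that no client holds, which is the duplication-reducing effect described above.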

Cache Consistency Management:

Cache consistency ensures that client-cached data remains synchronized with the server. Two main
strategies are used:
1. Avoidance-based Algorithms: Prevent stale data by restricting updates when data is
accessed by other clients.
2. Detection-based Algorithms: Allow stale data access but validate data consistency at
commit time.

Consistency Algorithm Classifications

Algorithms are further classified based on when clients notify the server of updates:

1. Synchronous: Clients block until the server responds.
2. Asynchronous: Clients proceed without waiting for a response.
3. Deferred: Clients batch updates and send them at commit time.

Specific Algorithms and Their Characteristics

1. Avoidance-based Synchronous:
o Callback-Read Locking (CBL): Clients retain read locks but relinquish write locks
after transactions. It minimizes abort rates and outperforms most alternatives.
2. Avoidance-based Asynchronous:
o Asynchronous Avoidance-Based Cache Consistency (AACC): Clients send lock
requests but proceed without blocking. This reduces deadlock rates.
3. Avoidance-based Deferred:
o Optimistic Two-Phase Locking (O2PL): Clients batch lock requests and send them
at commit time, though it risks high deadlock rates with increased contention.
4. Detection-based Synchronous:
o Caching Two-Phase Locking (C2PL): Clients check data freshness on every access.
Its performance is generally lower due to frequent server checks.
5. Detection-based Asynchronous:
o No-Wait Locking (NWL): Clients assume lock success and proceed, with the server
later notifying them of updates. It has lower performance than CBL.
6. Detection-based Deferred:
o Adaptive Optimistic Concurrency Control (AOCC): Clients defer lock
notifications until commit time. It performs well in data-intensive environments with
reduced messaging overhead, despite higher abort rates.
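The detection-based, deferred style of validation can be sketched with version numbers: clients work on cached copies and the server checks at commit time whether those copies were stale. This is a deliberately simplified illustration and not the AOCC algorithm itself; the `Server` class and its methods are hypothetical.

```python
class Server:
    def __init__(self):
        self.versions = {}  # oid -> committed version number
        self.values = {}    # oid -> committed value

    def fetch(self, oid):
        """Return the value and version a client would place in its cache."""
        return self.values.get(oid), self.versions.get(oid, 0)

    def commit(self, read_set, write_set):
        """read_set: oid -> version the client saw; write_set: oid -> new value."""
        for oid, seen in read_set.items():
            if self.versions.get(oid, 0) != seen:
                return False  # stale cached copy detected, abort
        for oid, value in write_set.items():
            self.values[oid] = value
            self.versions[oid] = self.versions.get(oid, 0) + 1
        return True


server = Server()
server.commit({}, {"x": 1})      # initial value of x
_, v_a = server.fetch("x")       # client A caches x
_, v_b = server.fetch("x")       # client B caches x
ok_a = server.commit({"x": v_a}, {"x": 2})
ok_b = server.commit({"x": v_b}, {"x": 3})  # B validated against a newer version
```

Client A commits, while client B aborts because its cached copy of `x` became stale. This shows the trade-off noted above: fewer messages during execution, at the cost of aborts at commit time.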
Object Management:
• Object identifiers (OIDs) are system-generated and uniquely identify every object
(transient or persistent, system-created or user-created) in the system.
• Persistent object identity is commonly implemented in one of two ways, based on either
physical or logical identifiers.
Physical Identifier:
• The physical identifier (POID) approach equates the OID with the physical address of the
corresponding object. The address can be a disk page address and an offset from the base
address in the page.
• The advantage is that the object can be obtained directly from the OID.
• The drawback is that all parent objects and indexes must be updated whenever an object is
moved to a different page.
Logical Identifier:
• The logical identifier (LOID) approach consists of allocating a system-wide unique OID (i.e.,
a surrogate) per object.
• Since OIDs are invariant, there is no overhead due to object movement.
• This is achieved by an OID table that associates each OID with the physical object address,
at the expense of one table look-up per object access.
• Object DBMSs tend to prefer logical identifiers because they better support dynamic
environments.
• For transient objects, the OID can also be physical or logical.
• The physical identifier approach is the most efficient, but requires that objects do not move.
• The logical identifier approach, promoted by object-oriented programming, treats objects
uniformly through an indirection table local to the program execution.
• This table associates a logical identifier, called an object-oriented pointer (OOP), with the
physical identifier of the object.

• LOID Generation:

LOIDs (Logical Object Identifiers) must be unique across an entire distributed domain.
Centralized LOID generation ensures uniqueness but is undesirable due to network latency and
high load on the central site.
In multi-server environments, each server generates LOIDs for objects stored locally.
LOID uniqueness is ensured by combining a server identifier and a sequence number.
The server identifier distinguishes LOIDs generated by different servers.
The sequence number is a logical representation of the object's disk location and is unique within a server.
Sequence numbers are not reused to avoid issues such as:

• Deleted object references inadvertently pointing to newly created objects using the same
sequence number.

This approach prevents anomalies and ensures reliable object identification.
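The scheme above (server identifier plus a never-reused sequence number) can be sketched as follows. The `LOID` and `LoidGenerator` names are assumptions made for the example; a real server would also persist the counter so that it survives restarts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LOID:
    server_id: int
    seq: int


class LoidGenerator:
    """Generates LOIDs locally at one server, without a central site."""

    def __init__(self, server_id):
        self.server_id = server_id
        self._next_seq = 0

    def new(self):
        # The counter only grows, so a sequence number is never handed out twice,
        # even after the object that used it has been deleted.
        self._next_seq += 1
        return LOID(self.server_id, self._next_seq)


s1, s2 = LoidGenerator(1), LoidGenerator(2)
a, b, c = s1.new(), s1.new(), s2.new()
```

Two servers may issue the same sequence number, but the server identifier keeps the resulting LOIDs distinct.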

LOID Mapping Location and Data structures:


The location of LOID-to-POID mapping is crucial for system efficiency.
Pure LOIDs require mapping information to be present at the client, especially if clients
can connect to multiple servers.
Pseudo-LOIDs require mapping information to be stored only at the server.
Storing mapping information at the client is not scalable, as updates must be propagated to
all clients accessing the object.
Mapping information is typically stored using hash tables or B+-trees.
Hash tables provide fast access but are not scalable as the database grows.
B+-trees are scalable but have:

• Logarithmic access time.
• Complexity in concurrency control and recovery.

B+-trees support range queries, making them suitable for accessing collections of objects.
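The two access paths can be contrasted in a small sketch. Here a Python dictionary plays the role of the hash table, and a sorted key list searched with `bisect` stands in for the ordered access a B+-tree would provide. It is not a B+-tree implementation, and the `LoidMap` class and sample addresses are hypothetical.

```python
from bisect import bisect_left, bisect_right


class LoidMap:
    """Hypothetical LOID -> POID map with point and range lookups."""

    def __init__(self):
        self._table = {}  # hashed point lookups
        self._keys = []   # sorted LOIDs for range scans

    def put(self, loid, poid):
        if loid not in self._table:
            self._keys.insert(bisect_left(self._keys, loid), loid)
        self._table[loid] = poid

    def lookup(self, loid):
        return self._table[loid]

    def range(self, lo, hi):
        """Return all (LOID, POID) pairs with lo <= LOID <= hi, in order."""
        i, j = bisect_left(self._keys, lo), bisect_right(self._keys, hi)
        return [(k, self._table[k]) for k in self._keys[i:j]]


m = LoidMap()
for loid, poid in [(10, ("page7", 0)), (12, ("page7", 64)), (30, ("page9", 0))]:
    m.put(loid, poid)
m.put(12, ("page8", 0))  # the object moved: only the map entry changes
```

When the object with LOID 12 moves, no referencing object has to be updated, which is the benefit of logical identifiers.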

2. Pointer Swizzling:
• Path expressions enable navigation between objects in object systems, resembling pointers (e.g.,
c.engine.manufacturer.name).
• On disk, object identifiers represent pointers, while in memory, in-memory pointers are
preferred for efficiency.
• Pointer-swizzling is the process of converting disk-based pointers to in-memory pointers.
• Two types of pointer-swizzling mechanisms exist:

• Hardware-based schemes:
o Use the operating system's page-fault mechanism.
o Swizzle all pointers in a page upon loading into memory.
o Provide better performance for repeated traversals of object hierarchies (no indirection
for object access).
o Have disadvantages:
▪ High overhead during page faults in poorly clustered data.
▪ Potential access to deleted objects.
▪ Risk of exhausting virtual memory address space.
▪ Limited support for object-level concurrency, buffering, data transfer, and
recovery.
• Software-based schemes:
o Use an object table for swizzling pointers.
o LOIDs are swizzled to point to the table location.
o Variants:
▪ Eager swizzling: Converts pointers immediately.
▪ Lazy swizzling: Converts pointers when needed.
o Introduces a level of indirection for object access.

• Software-based schemes are better suited for object-level operations and allow more granular
control.
• The choice between schemes depends on factors like clustering quality, data access patterns, and
concurrency requirements.
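A software-based, lazy scheme can be sketched as follows: a reference starts out holding a LOID and is swizzled to an object table entry the first time it is followed. The `Ref`, `Entry`, and `ObjectTable` classes and the sample data are assumptions made for illustration.

```python
class Ref:
    """A reference that holds a LOID until it is swizzled."""

    def __init__(self, loid):
        self.loid = loid
        self.entry = None  # object table entry once swizzled


class Entry:
    """Object table entry: the level of indirection to the in-memory object."""

    def __init__(self, obj):
        self.obj = obj


class ObjectTable:
    def __init__(self, disk):
        self.disk = disk   # loid -> object, stands in for persistent storage
        self.entries = {}  # loid -> Entry for resident objects
        self.loads = 0

    def deref(self, ref):
        if ref.entry is None:  # lazy: swizzle only when the reference is used
            if ref.loid not in self.entries:
                self.entries[ref.loid] = Entry(self.disk[ref.loid])
                self.loads += 1
            ref.entry = self.entries[ref.loid]
        return ref.entry.obj


disk = {
    "m1": {"name": "Acme"},
    "e1": {"manufacturer": Ref("m1")},
    "c1": {"engine": Ref("e1")},
}
table = ObjectTable(disk)
car = table.deref(Ref("c1"))
name = table.deref(table.deref(car["engine"])["manufacturer"])["name"]
again = table.deref(table.deref(car["engine"])["manufacturer"])["name"]
```

The second traversal of `c.engine.manufacturer.name` follows already swizzled references, so it triggers no further loads. An eager variant would swizzle every reference in an object as soon as the object is loaded.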
Object Migration:

• In distributed systems, objects may move between sites, raising several challenges.

• Unit of migration:

• Object state can migrate without methods, requiring remote procedure calls for method
application.
• Relocated objects might be separated from their type specifications.
• Solutions for migrating classes (types):
1. Move and recompile source code at the destination.
2. Migrate compiled class versions like any object.
3. Move source code only, with a lazy migration of compiled operations.

• Tracking object movements:

• Surrogates or proxy objects are used to point to an object's new location.


• The system transparently redirects access via proxies.

• Object states: Objects exist in four states:

• Ready: Not invoked but ready to receive messages.
• Active: Currently responding to an invocation or message.
• Waiting: Awaiting a response after sending a message.
• Suspended: Temporarily unavailable for invocation.
• Objects in the active or waiting state cannot migrate.

• Object migration process:

1. Shipping the object from the source to the destination.


2. Creating a proxy at the source to replace the original object.

• System directory updates:

• Directory updates can be lazy (on redirection via proxies) or eager (at the time of movement).

• Proxy chain management:

• Frequent object movements may create long proxy chains.


• Systems should compact these chains periodically, updating directories accordingly.
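Proxy chain compaction can be sketched as follows. The `proxies` dictionary (former site to next site) and the `locate` function are hypothetical. This sketch compacts the chain while resolving a reference, which is one possible strategy alongside the periodic compaction mentioned above.

```python
def locate(proxies, start):
    """Follow forwarding proxies from `start` to the object's current site,
    then point every visited proxy directly at that site."""
    path, site = [], start
    while site in proxies:
        path.append(site)
        site = proxies[site]
    for visited in path:
        proxies[visited] = site  # compact the chain
    return site, len(path)


# The object migrated A -> B -> C -> D, leaving a proxy at each former site.
proxies = {"A": "B", "B": "C", "C": "D"}
home, hops = locate(proxies, "A")
_, hops_after = locate(proxies, "A")
```

The first lookup from site A takes three hops. After compaction the same lookup takes one.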

• Composite object migration:

• Moving a composite object may require shipping its referenced objects.


• Object assembly is an alternative strategy to handle composite object migration.
__________________________________________________________________________
Distributed Object Storage:

• Object Model:
• Object models are conceptual and aim to improve programmer productivity.
• Translating conceptual models to physical storage is a common database challenge.
• Key Relationships in Object DBMSs:
• Subtyping: Defines relationships between parent and child types.
• Composition: Groups objects based on shared attributes or as sub-objects of the same parent.
• Object Clustering:
• Groups related objects in physical storage for faster access.
• It is complex due to:
o Object Identifier Types:
▪ LOID (Logical OID): Allows class partitioning but adds overhead.
▪ POID (Physical OID): Enables direct access, but each object must then contain all of its inherited attributes.
o Shared Objects: Objects with multiple parents increase complexity.
• Storage Models for Object Clustering:
• Decomposition Storage Model (DSM):
o Breaks objects into pairs (OID, attribute).
o Simple but relies on LOID.
• Normalized Storage Model (NSM):
o Stores each class separately.
o Works with both LOID and POID; LOID allows inheritance-based partitioning.
• Direct Storage Model:
o Clusters multi-class objects by composition.
o Best for known access patterns but struggles when parent objects are deleted.
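The difference between the decomposition and normalized storage models can be sketched on two sample objects. The object data and the `to_dsm` and `to_nsm` functions are assumptions made for illustration.

```python
objects = {
    "o1": {"class": "Car", "model": "Roadster", "year": 2021},
    "o2": {"class": "Car", "model": "Estate", "year": 2019},
}


def to_dsm(objects):
    """Decomposition: one binary relation per attribute, holding (OID, value)."""
    relations = {}
    for oid, state in objects.items():
        for attr, value in state.items():
            if attr != "class":
                relations.setdefault(attr, []).append((oid, value))
    return relations


def to_nsm(objects):
    """Normalized: one relation per class, holding (OID, attribute values)."""
    relations = {}
    for oid, state in objects.items():
        row = {k: v for k, v in state.items() if k != "class"}
        relations.setdefault(state["class"], []).append((oid, row))
    return relations


dsm = to_dsm(objects)
nsm = to_nsm(objects)
```

In the decomposed form, reading one attribute of many objects touches a single small relation, but reassembling a whole object requires one lookup per attribute.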
• Distributed System Implementation:
• DSM and NSM: Use horizontal partitioning in distributed systems.
• Goblin: Implements DSM with large memory and caching for flexibility.
• Eos:
o Uses direct storage with system-wide POID.
o Adapts dynamically for load balancing and object movement, avoiding overhead.
• Remaining Challenges:
• Managing object updates or deletions in complex relationships.
• Ensuring performance in dynamic environments like distributed systems.

Distributed Garbage Collection:

1. Object-Based Systems and Garbage Collection:

• Object-based systems use object identifiers to refer to objects.


• Unreachable objects (no references) become "garbage" and need to be deallocated.
• Relational DBMSs rely on manual garbage collection (e.g., cascading updates) instead of
automatic garbage collection.
• Distributed object-based systems require automatic garbage collection due to their
complexity.

2. Garbage Collection Algorithms:

• Reference Counting:
o Each object tracks the number of references to it.
o Memory is freed when the reference count drops to zero.
o Problem: Cannot handle mutually-referential objects (cyclic garbage).
• Tracing-Based Algorithms:
o Mark and Sweep:
▪ Mark phase: Marks all reachable objects.
▪ Sweep phase: Deletes unmarked (unreachable) objects.
o Copy-Based Collectors:
▪ Divides memory into two areas (from-space and to-space).
▪ Copies reachable objects to the empty area and compacts memory.
o Both methods are stop-the-world (suspend programs during collection).
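
The difference between reference counting and tracing can be seen in a small sketch. It assumes a toy heap of hypothetical Node objects: a and b are reachable from the root, while c and d only reference each other. Their reference counts never reach zero, yet mark and sweep reclaims them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; all names here are hypothetical.
class GcSketch {
    // A heap object that records only its outgoing references.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Reference count of a node: how many references from heap objects point at it.
    static int refCount(List<Node> heap, Node target) {
        int count = 0;
        for (Node n : heap) {
            for (Node r : n.refs) {
                if (r == target) count++;
            }
        }
        return count;
    }

    // Mark phase: collect every object reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (marked.add(n)) {
                for (Node r : n.refs) stack.push(r);
            }
        }
        return marked;
    }

    // Sweep phase: remove unmarked objects from the heap and report their names.
    static List<String> sweep(List<Node> heap, Set<Node> marked) {
        List<String> reclaimed = new ArrayList<>();
        heap.removeIf(n -> {
            if (marked.contains(n)) return false;
            reclaimed.add(n.name);
            return true;
        });
        return reclaimed;
    }

    // Heap where a -> b is reachable and c <-> d is an unreachable cycle.
    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        // c still has a count of 1, so pure reference counting would never free it.
        System.out.println("refCount(c) = " + refCount(heap, c));
        return sweep(heap, mark(Arrays.asList(a)));
    }

    public static void main(String[] args) {
        System.out.println("Reclaimed by mark and sweep: " + demo());
    }
}
```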

3. Challenges with Garbage Collection:

• Incremental Garbage Collection:


o Preserves application response time but introduces concurrency issues.
o Risk: Program changes during collection may cause errors (e.g., reclaiming reachable
objects).
• Object DBMS Complexity:
o Challenges include:
▪ Handling system failures and transaction rollbacks.
▪ Managing client-server optimizations (e.g., caching, large data analysis).
• Fault-tolerant garbage collection has been researched for transactional systems.

4. Distributed Garbage Collection:

• Difficulties:
o Objects may have references across multiple sites, complicating collection.
o Distributed systems face issues like message loss, duplication, or delays.
• Methods:
o Distributed Reference Counting:
▪ Tracks references across sites but fails with message order issues or cycles.
▪ Variants like "reference listing" address some failures but not cyclic garbage.
o Distributed Tracing:
▪ Combines local collectors with a global inter-site collector.
▪ Relies on inconsistent information due to communication delays.
• Research Solutions:
o Algorithms like those by Ferreira and Shapiro (1994) reclaim garbage cycles across
distributed spaces.
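
The message-order problem of distributed reference counting can be sketched with a toy simulation. The scenario is an assumption made for illustration: the owner site holds a count of 1, a second site acquires a reference (INCREMENT), and the first site drops its reference (DECREMENT). If the network delivers the decrement first, the count touches zero and the object is reclaimed although it is still referenced.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the message names are hypothetical.
class DistributedRefCountSketch {
    enum Msg { INCREMENT, DECREMENT }

    // Applies messages in arrival order at the owner site.
    // Returns true if the count ever reaches zero, i.e. the owner would reclaim the object.
    static boolean hitsZero(int initialCount, List<Msg> arrivalOrder) {
        int count = initialCount;
        for (Msg m : arrivalOrder) {
            count += (m == Msg.INCREMENT) ? 1 : -1;
            if (count == 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Messages arrive in the order they were sent: the count goes 1 -> 2 -> 1.
        System.out.println(hitsZero(1, Arrays.asList(Msg.INCREMENT, Msg.DECREMENT)));
        // Messages are reordered in transit: the count goes 1 -> 0 before the increment arrives.
        System.out.println(hitsZero(1, Arrays.asList(Msg.DECREMENT, Msg.INCREMENT)));
    }
}
```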

• Automatic garbage collection is essential in object-based and distributed systems.


• Challenges arise due to object relationships, distributed environments, and concurrency.
• Ongoing research focuses on improving fault tolerance and efficiency for large-scale
systems.

________________________________________________________________
Object Query Processor Architectures:
• Query Optimization as an Optimization Problem:
• Query optimization selects the "optimum" state (algebraic query) in a search space of
equivalent queries, using a cost function.
• Query processors vary in how they model the search space, cost functions, and
transformation rules.
• Architectures of Object DBMS Optimizers:
• Many optimizers are integrated into the object manager or implemented as client modules
in client/server architectures.
• Most are "hardwired," limiting their extensibility across various components.
• Need for Extensibility:
• Extensible optimizers can support diverse search strategies, algebra specifications,
transformation rules, and cost functions.
• Rule-based optimizers allow adding new transformation rules but lack extensibility in
other dimensions.
• Extensibility Through Modularization:
• Modularization can achieve extensibility by separating components like:
o User query language parsing from operator graphs.
o Algebraic operator manipulation from execution algorithms.
• This separation enables changes to one module (e.g., query language or optimization
method) without affecting others.
• Search Space Extensibility:
• Search space can be divided into regions, each representing a family of equivalent query
expressions.
• Regions may differ in transformation rules, optimization objectives (e.g., cost
minimization or form transformation), and search strategies.
• Object-Oriented Approach for Extensibility:
• Using object-oriented design, components like queries, classes, operators, and cost
functions are treated as first-class objects.
• This approach simplifies the addition of new operators, transformation rules, and operator
implementations, enhancing flexibility and extensibility.

Query Processing Issues:

• ODBMSs use a query processing method similar to that of relational databases, but the
details differ because of the object model and the query facilities built on it.
Here we focus on:
i) Algebraic Optimization.
ii) Path Expressions.
iii) Query Execution.

1. Algebraic Optimization

1. Search Space and Transformation Rules:


o Transformation rules in object algebra depend on the specific algebraic operations and
differ from relational systems because objects have subclass and composition
relationships.
o Unique rules include:
▪ C1 ∩ C2 = ∅ if C1 ≠ C2: no object can belong to two different classes.
▪ C1 ∪ C2 = C2 if C1 is a subclass of C2: the extent of a class already includes
all objects of its subclasses.
▪ Parameterized select: (P σF⟨QSet⟩) ∩ R can be rewritten as (P σF⟨QSet⟩) ∩ (R σF′⟨QSet⟩),
subject to type consistency and the rule's conditions.
2. Search Algorithm:
o Enumerative Algorithms:
▪ Relies on dynamic programming, useful for limited joins.
▪ Inefficiency arises when queries exceed 10 joins (as noted by Ioannidis and
Wong, 1987).
o Randomized Search Algorithms:
▪ Proposed as alternatives to reduce search space.
▪ These lack extensive research and distributed algorithm implementations.
3. Cost Function:
o Relational systems define cost functions using system catalogs (e.g., cardinality,
indexing).
o In object DBMSs:
▪ Cost functions require additional information due to encapsulation (e.g.,
internal object structure may be hidden).
▪ Recursive cost calculations are based on algebraic trees.
▪ Objects can "reveal" costs through interfaces, enabling optimizations.
o Abstract definitions of costs and optimization rules remain areas for further study.

2. Path Expressions

1. Definition:
o Path expressions represent object reference chains, such as
c.engine.manufacturer.name, used to retrieve deeply nested values.
o Includes attributes, methods, and complex predicates.
2. Optimization of Path Expressions:
o Recognized during query parsing or algebraic optimization phases.
o Rewritten into logical algebra expressions for optimization, enabling efficient
execution (e.g., using indexed scans or joins).
3. Rewriting and Algebraic Optimization:
o Example: The path expression c.engine.manufacturer.name involves:
▪ Links to retrieve Engine and Manufacturer objects from disk.
▪ Optimizations focus on these retrieval steps.
o Techniques:
▪ Type-based rewriting decomposes complex expressions into simpler joins
and indexed scans.
▪ Materialize (Mat) Operator:
▪ Defines "scope" of path expressions for predicate evaluation or
algebraic transformations.
▪ Enables optimized grouping or individual handling of materialized
objects.
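
In a programming language the same path expression is a chain of reference traversals. The sketch below uses hypothetical Car, Engine, and Manufacturer classes; each step marks a point where an object DBMS would have to fetch (materialize) the referenced object before continuing.

```java
// Illustrative sketch only; the classes mirror the c.engine.manufacturer.name example.
class PathExpressionSketch {
    static class Manufacturer {
        final String name;
        Manufacturer(String name) { this.name = name; }
    }

    static class Engine {
        final Manufacturer manufacturer;
        Engine(Manufacturer manufacturer) { this.manufacturer = manufacturer; }
    }

    static class Car {
        final Engine engine;
        Car(Engine engine) { this.engine = engine; }
    }

    // Evaluates c.engine.manufacturer.name one link at a time.
    static String manufacturerName(Car c) {
        Engine e = c.engine;             // first link: Car -> Engine
        Manufacturer m = e.manufacturer; // second link: Engine -> Manufacturer
        return m.name;                   // final attribute access
    }

    public static void main(String[] args) {
        Car c = new Car(new Engine(new Manufacturer("Acme")));
        System.out.println(manufacturerName(c));
    }
}
```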

3. Query Execution

Path Indexes:

1. Indexing Techniques:
o Path indexes improve path expression execution.
o Examples:
▪ Join indexes: Use relationships between objects for optimized joins.
▪ Access support relations: Store frequently traversed paths for faster lookups,
improving query performance by up to two orders of magnitude.
o Maintenance costs of these structures arise during updates to underlying data.

2. Set Matching

1. Join Algorithms:
o Hybrid-Hash Join:
▪ Uses hashing to partition operand collections into smaller buckets that fit in
memory.
▪ Efficiently joins bucket pairs in memory.
o Pointer-Based Hash Join:
▪ Utilizes object pointers for direct referencing.
▪ Steps:
1. Partition operand R by OID (Object Identifier) values.
2. Build a hash table for R based on its pointers to S.
3. Match objects in R with those in S via pointer-based hash table
entries.
o Both algorithms are centralized, with no distributed counterparts.
2. Assembly Operator:
o Generalizes pointer-based joins for multi-way joins.
o Efficiently assembles complex objects (e.g., cars with components like engines and
manufacturers).
o Example:
▪ Assemble two Car objects with references to Engine and Bumper objects.
▪ Uses a window of size W to manage unresolved references.
▪ Steps:
▪ Start with unresolved references (e.g., C1, C2).
▪ Resolve C1 and add its references to the list (e.g., E1, B1).
▪ Continue resolving and assembling components until all objects in the
window are processed.
▪ Window-based processing optimizes disk access and memory usage.
▪ Scheduling strategies include depth-first, breadth-first, and elevator order.
3. Distributed Assembly:
o Proposed strategies:
▪ Centralized processing: All data is shipped to a central site.
▪ Partially distributed processing: Local assembly at remote sites, with final assembly
at the central site.
▪ Fully distributed processing: Complex operations like joins are executed at remote
sites, with results sent to the central site.
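
The window-based assembly steps described above can be sketched for the centralized case as follows. This is a simplified model under stated assumptions: the "disk" is a map from an OID to the OIDs it references, the window is the number of root objects assembled at once, and references are resolved breadth-first. Shared components would be fetched more than once here; a real assembly operator avoids that.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; OIDs such as C1, E1, B1 follow the example in the text.
class AssemblySketch {
    // Returns the order in which objects are fetched while assembling the roots.
    static List<String> assemble(Map<String, List<String>> disk, List<String> roots, int window) {
        if (window < 1) throw new IllegalArgumentException("window must be at least 1");
        List<String> fetchOrder = new ArrayList<>();
        Deque<String> unresolved = new ArrayDeque<>();
        int next = 0;
        while (next < roots.size() || !unresolved.isEmpty()) {
            if (unresolved.isEmpty()) {
                // Open a new window with up to `window` root references.
                for (int i = 0; i < window && next < roots.size(); i++) {
                    unresolved.addLast(roots.get(next++));
                }
            }
            // Resolve the oldest unresolved reference and queue what it points to.
            String oid = unresolved.pollFirst();
            fetchOrder.add(oid);
            unresolved.addAll(disk.getOrDefault(oid, Collections.emptyList()));
        }
        return fetchOrder;
    }

    public static void main(String[] args) {
        Map<String, List<String>> disk = new HashMap<>();
        disk.put("C1", Arrays.asList("E1", "B1")); // Car 1 -> Engine 1, Bumper 1
        disk.put("C2", Arrays.asList("E2", "B2")); // Car 2 -> Engine 2, Bumper 2
        List<String> roots = Arrays.asList("C1", "C2");
        System.out.println(assemble(disk, roots, 2)); // both cars assembled together
        System.out.println(assemble(disk, roots, 1)); // one car at a time
    }
}
```

A depth-first or elevator schedule would only change which unresolved reference is picked next; the window logic stays the same.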
Examples of Object Oriented Data Model:

Web-based applications or mobile apps developed in Java or J2ME that connect to a DBMS
and retrieve image or video files from storage can follow the Object-Oriented Data
Model (OODM) if they manage that data as objects.

Why This Follows OODM:

1. Data as Objects:
o In Java, almost all data is modeled as objects (primitive types such as int are the exception).
o Example: An Image or Video can be a class with attributes like fileName, fileSize,
fileType, and methods like play() or show().
2. Relationships and Persistence:
o Objects can have relationships, such as User having a list of Videos.
o These objects can be persisted in databases using JDBC (Java Database
Connectivity) or frameworks like Hibernate.
3. Multimedia Support:
o Image/Video Retrieval: Files can be stored in a database (as BLOBs) or on a file
server.
o Example: A media app retrieving user-uploaded videos from cloud storage follows
OODM when objects like User, Video, and Playlist interact.

Real-World Example:

Mobile App: Video Streaming App (Java + DBMS)

1. Classes/Objects:

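import java.util.ArrayList;
import java.util.List;

// A media item modeled as an object with state and behavior.
class Video {
    String title;
    String filePath;
    long duration;

    void play() {
        System.out.println("Playing " + title);
    }
}

// A user owns a collection of Video objects (an object relationship).
class User {
    String username;
    List<Video> uploadedVideos = new ArrayList<>();

    void uploadVideo(Video video) {
        uploadedVideos.add(video);
    }
}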
2. DBMS Integration (RDBMS or NoSQL):
o Use JDBC to connect to a MySQL/PostgreSQL database.
o Use Hibernate ORM for object persistence.
3. Storage and Retrieval:
o Use AWS S3, Google Cloud Storage, or database BLOB storage for large media
files.
o Example Query:

SELECT video_file FROM videos WHERE user_id = 101;

When It Doesn't Follow OODM:

• If the app only uses SQL tables and manages multimedia files separately without treating
them as objects, it follows an RDBMS model rather than OODM.

Conclusion:

• If the application uses Java’s object-oriented structure, connects to a DBMS, and treats
multimedia as objects, it follows OODM.
• If it treats multimedia as records in a 2D table and only uses SQL for queries, it follows
RDBMS.

Retrieving an Image from Local Drive Using JDBC in Java

To retrieve an image stored locally using JDBC, we can design a database-driven application that
stores the image path in a database. The application will retrieve the path using JDBC and display
the image details (file name, file size, and path).

Step 1: Create the Database Table

1. Database Setup:
o Use a MySQL database.
o Create a table images to store image file paths.

CREATE TABLE images (
    id INT PRIMARY KEY AUTO_INCREMENT,
    image_name VARCHAR(100),
    image_path VARCHAR(255)
);
Step 2: Insert Sample Data into the Table
INSERT INTO images (image_name, image_path)
VALUES ('example.jpg', 'C:\\Users\\YourUsername\\Pictures\\example.jpg');


Step 3: Define the Java Class

ImageFile.java (Object Class - OODM Design)

import java.io.File;

public class ImageFile {
    private String imageName;
    private String filePath;

    // Constructor
    public ImageFile(String imageName, String filePath) {
        this.imageName = imageName;
        this.filePath = filePath;
    }

    // Check if the file exists
    public boolean fileExists() {
        File file = new File(filePath);
        return file.exists() && file.isFile();
    }

    // Print file details
    public void printFileDetails() {
        File file = new File(filePath);
        if (fileExists()) {
            System.out.println("Image Name: " + imageName);
            System.out.println("File Path: " + file.getAbsolutePath());
            System.out.println("File Size: " + file.length() + " bytes");
        } else {
            System.out.println("File not found!");
        }
    }
}

Step 4: JDBC Database Connection

DatabaseConnection.java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class DatabaseConnection {
    private static final String URL = "jdbc:mysql://localhost:3306/yourdatabasename";
    private static final String USER = "root";
    private static final String PASSWORD = "yourpassword";

    // Returns an open connection, or null if the connection attempt fails.
    public static Connection getConnection() {
        try {
            Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
            System.out.println("Database connected successfully!");
            return conn;
        } catch (SQLException e) {
            e.printStackTrace();
            return null;
        }
    }
}

Step 5: Retrieve the Image Path from the Database

Main.java

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class Main {

    public static void main(String[] args) {
        String query = "SELECT image_name, image_path FROM images WHERE id = ?";

        // getConnection() may return null; try-with-resources skips a null resource.
        try (Connection conn = DatabaseConnection.getConnection()) {
            if (conn == null) {
                return;
            }
            try (PreparedStatement stmt = conn.prepareStatement(query)) {
                // Retrieve the image with ID 1
                stmt.setInt(1, 1);

                try (ResultSet rs = stmt.executeQuery()) {
                    if (rs.next()) {
                        String imageName = rs.getString("image_name");
                        String imagePath = rs.getString("image_path");

                        // Create an ImageFile object
                        ImageFile image = new ImageFile(imageName, imagePath);
                        image.printFileDetails();
                    } else {
                        System.out.println("No image found with the specified ID.");
                    }
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

How It Follows OODM

1. Object-Oriented Design (OODM Principles):


o The ImageFile class represents the image entity with properties (imageName,
filePath) and methods (fileExists(), printFileDetails()).
2. Encapsulation:
o Data attributes are private and accessed only through methods.
3. Data Abstraction:
o File-checking and file details retrieval logic are encapsulated within the ImageFile
class.
4. Persistence and Retrieval:
o The application persists the image file path in the database and retrieves it using
JDBC.

Expected Output (if the file exists):


Database connected successfully!
Image Name: example.jpg
File Path: C:\Users\YourUsername\Pictures\example.jpg
File Size: 524288 bytes
Persistence Object:

In the context of the Object-Oriented Data Model (OODM), a Persistence Object is an object that
is designed to store and retrieve data from a persistent storage system, such as a database or file
system. These objects represent entities that can be saved beyond the runtime of an application and
retrieved later, allowing the data to persist across multiple sessions or executions.

Key Features of Persistence Objects:

1. Persistence:
The primary characteristic of a persistence object is its ability to store its state (data) in a
persistent storage medium, such as a relational database, NoSQL database, or file system, so
that it can be restored and accessed later.
2. Mapping between Object and Database:
These objects map directly to a database table or a file structure. Each object in an
application typically corresponds to a row in a database table or a document in a NoSQL
database.
3. Object-Relational Mapping (ORM):
In some cases, persistence objects are used with ORM tools (like Hibernate in Java, or Entity
Framework in C#) that automatically map between object-oriented models and relational
databases.
4. State Management:
Persistence objects manage the state of the data. When their state changes, the changes can be
saved to the database, and when they are retrieved, they restore their state from the database.

Example in Java (Using JDBC for Persistence):

Let’s consider a simple User object that needs to be saved and retrieved from a database.

User Class (Persistence Object Example)

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class User {
    private int id;
    private String name;
    private String email;

    // Constructor
    public User(int id, String name, String email) {
        this.id = id;
        this.name = name;
        this.email = email;
    }

    // Getters and Setters
    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getEmail() {
        return email;
    }

    public void setEmail(String email) {
        this.email = email;
    }

    // Method to save the User object into the database (Persistence)
    public void saveToDatabase(Connection conn) throws SQLException {
        String query = "INSERT INTO users (name, email) VALUES (?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(query)) {
            stmt.setString(1, this.name);
            stmt.setString(2, this.email);
            stmt.executeUpdate();
        }
    }

    // Method to retrieve a User object from the database (Persistence)
    public static User getFromDatabase(Connection conn, int userId) throws SQLException {
        String query = "SELECT id, name, email FROM users WHERE id = ?";
        try (PreparedStatement stmt = conn.prepareStatement(query)) {
            stmt.setInt(1, userId);
            // Close the ResultSet as well as the statement.
            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    return new User(rs.getInt("id"), rs.getString("name"),
                            rs.getString("email"));
                }
                return null;
            }
        }
    }
}

Explanation of the Example:

1. User Class:
The User class is a persistence object because it represents a real-world entity that can be
stored and retrieved from a database. The class has attributes like id, name, and email that
represent a user.
2. Database Interaction:
The saveToDatabase() method saves the user object to a database, while the
getFromDatabase() method retrieves the user object based on an ID.
3. Persistence of State:
The state of the User object (name and email) is saved in the database, making it possible to
persist the data and retrieve it later.

Why Persistence Objects are Important:

1. Data Durability:
Without persistence objects, the application would only work with data that exists in memory
during runtime. Persistence objects allow the data to persist even after the application is
closed.
2. Data Retrieval and Manipulation:
Persistence objects can be used to manipulate and retrieve data from a storage system,
making the application more dynamic and interactive.
3. Separation of Concerns:
Persistence objects help separate the application’s logic from the data storage logic. This is
important for maintaining clean code and better management of resources.

Persistence in OODM vs. RDBMS:

In OODM, persistence objects typically map to real-world objects (like a User, Product, etc.) that
are persisted into a database. In a traditional RDBMS, the focus is more on tables, columns, and
rows of data. OODM focuses on objects, which can be easily transformed into database records
through techniques like Object-Relational Mapping (ORM).
Persistence Programming Languages:

A persistent programming language, in the context of object-oriented data models, is a
programming language in which data (objects) can persist beyond the lifetime of the program.
The data remains stored after the program terminates, usually through integration with a
database system, and can be retrieved when the program runs again. In essence, the language
provides built-in mechanisms to manage the storage and retrieval of object data.
Key points about persistent programming languages:
• Object-oriented persistence:
The primary focus is on storing and retrieving objects as complete entities, preserving their
relationships and data structure, unlike traditional database systems that might require mapping
objects to relational tables.
• Transparent persistence:
Ideally, the language should handle the persistence mechanisms seamlessly, so developers don't need
to explicitly write code for every data storage operation.
• Database integration:
These languages often have built-in features to interact with object-oriented databases (OODBMS)
which are designed to handle complex object structures effectively.
Example features of a persistent programming language:
• Annotations:
Special keywords or attributes in the code to mark which objects or data should be persisted.
• Object serialization:
The ability to convert objects into a format that can be stored in a database.
• Querying capabilities:
Allowing retrieval of data from the database using object-oriented query syntax.
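
As one concrete illustration of object serialization, the sketch below uses Java's built-in java.io.Serializable mechanism to turn an object into bytes and back. The Video class and its fields are hypothetical, and the byte array stands in for whatever store (file, BLOB column, object database) would actually hold the data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch only; not tied to any particular persistent language or OODBMS.
class SerializationSketch {
    static class Video implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title;
        final long durationSeconds;

        Video(String title, long durationSeconds) {
            this.title = title;
            this.durationSeconds = durationSeconds;
        }
    }

    // Converts the object into a storable byte sequence.
    static byte[] toBytes(Video v) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(v);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an equivalent object from the stored bytes.
    static Video fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Video) in.readObject();
        }
    }

    // Convenience wrapper that hides the checked exceptions.
    static Video roundTrip(Video v) {
        try {
            return fromBytes(toBytes(v));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Video restored = roundTrip(new Video("Intro", 90));
        System.out.println(restored.title + " (" + restored.durationSeconds + " s)");
    }
}
```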
Important considerations:
• Performance:
Managing object persistence can introduce overhead, so optimizing database interactions is crucial.
• Impedance mismatch:
Sometimes there can be a disconnect between how an object-oriented programming language
represents data and how a traditional relational database stores it, which can require additional
mapping logic.
Examples of languages often associated with object persistence:
• Smalltalk:
Considered one of the earliest languages with strong support for object persistence.
• C++ with Object-Oriented Databases:
When used with a dedicated OODBMS, C++ can achieve persistent object storage.
• Java with ObjectDB:
Java developers can leverage ObjectDB, a NoSQL database specifically designed for object
persistence.
Comparison between OODBMS and ORDBMS

OODBMS | ORDBMS
Object-Oriented Database Management System (OODBMS), designed to handle complex data. | Object-Relational Database Management System (ORDBMS), designed to handle complex data.
A database system that supports the creation and management of objects, similar to object-oriented programming. | A hybrid system combining features of both relational databases and object-oriented databases.
Uses objects, classes, and inheritance from object-oriented programming. | Uses tables with additional support for complex data types such as objects and collections.
Stores data as objects. Each object has attributes and methods. | Stores data in tables but supports objects as data types through extensions.
Uses object-oriented query languages like OQL (Object Query Language). | Uses SQL with object extensions (e.g., SQL3).
Fully supports inheritance, polymorphism, and encapsulation. | Provides limited inheritance features through table hierarchies.
Best for applications requiring complex data models, like CAD, multimedia, and scientific applications. | Suitable for traditional business applications requiring relational structure with some object-oriented features.
No universally accepted standard exists. | Well-standardized with SQL and related extensions.
Easily integrates with object-oriented languages like Java, C++, and Python. | Requires mapping between objects and relational tables. May need additional ORM (Object-Relational Mapping) tools.
More complex due to its object-oriented features. | Less complex because it extends a familiar relational model.
OODBMS is ideal for applications requiring complex data modeling and direct object manipulation. | ORDBMS is a practical middle ground, offering object-oriented capabilities within a familiar relational database framework.
Object-oriented programming (OOP) is a programming paradigm based
upon objects (having both data and methods) that aims to incorporate the
advantages of modularity and reusability. Objects, which are usually
instances of classes, interact with one another to form applications
and computer programs.

Features are as follows:

▫ Bottom-up approach in program design

▫ Programs are organized around objects, which are grouped into classes

▫ Focus on data, with methods to operate upon an object's data

▫ Interaction between objects through functions

▫ Reusability of design through creation of new classes by adding
features to existing classes

What is an Object?

• An object represents a real-world entity such as a pen, chair, or table.

• Any entity that has state and behavior is known as an object. It can be
physical or logical. An object is an instance of a class.
What is a Class?

• A class is a blueprint from which objects are created. It is a logical entity.

• A class is a user-defined data type which has data members and
member functions.

• Data members are the data variables, and member functions are the
functions used to manipulate these variables. Together, these data
members and member functions define the properties and behavior
of the objects in a class.

What is abstraction?

• Its main goal is to handle complexity by hiding unnecessary details
from the user.

• Abstraction is the process of hiding the implementation details and
showing only functionality to the user.

• It shows only the important things to the user and hides the internal
details. For example, while sending an SMS, you just type the text and send the
message without knowing how it is delivered.

What is Encapsulation?

• Encapsulation is the principle of wrapping data (variables) and code
together as a single unit.

• This concept is also often used to hide the internal representation, or state,
of an object from the outside. This is called information hiding.
What is Inheritance?

Inheritance is a mechanism in which one class acquires the properties of
another class.

For example, a child inherits the traits of his/her parents.

Inheritance facilitates Reusability.

Sub Class: The class that inherits properties from another class is called Sub
class or Derived Class.

Super Class: The class whose properties are inherited by sub class is called
Base Class or Super class.

Without inheritance, the same functions must be written in each of three similar classes,
so the same code is duplicated three times. This increases the chance of errors and
data redundancy.

Using inheritance, we have to write the functions only one time instead of three
times, because the three classes inherit them from the base class (Vehicle).
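
A minimal Java sketch of this idea follows. Only the base class name Vehicle comes from the text; the three subclasses (Car, Bus, Truck) and the applyBrakes() method are hypothetical names chosen for the example.

```java
// Illustrative sketch only; subclass and method names are assumptions.
class InheritanceSketch {
    // Base class: the shared behavior is written once here.
    static class Vehicle {
        protected final String name;

        Vehicle(String name) { this.name = name; }

        String applyBrakes() {
            return name + ": brakes applied";
        }
    }

    // Each subclass inherits applyBrakes() without repeating its code.
    static class Car extends Vehicle {
        Car() { super("Car"); }
    }

    static class Bus extends Vehicle {
        Bus() { super("Bus"); }
    }

    static class Truck extends Vehicle {
        Truck() { super("Truck"); }
    }

    public static void main(String[] args) {
        Vehicle[] vehicles = { new Car(), new Bus(), new Truck() };
        for (Vehicle v : vehicles) {
            System.out.println(v.applyBrakes());
        }
    }
}
```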

What is polymorphism?

Polymorphism is an OOP concept where one name can have many forms.

"Poly" means many and "morphs" means forms, hence "many forms".

For example, you have a smartphone for communication. The communication
mode you choose could be anything. It can be a call, a text message, a picture
message, mail, etc.

Ex:

void sum (int a , int b);

void sum (int a , int b, int c);

void sum (float a, double b);
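
The three declarations above are an instance of method overloading (compile-time polymorphism). A runnable Java version might look like the sketch below; the return types and the enclosing class name are assumptions added so the results can be printed.

```java
// Illustrative sketch only: one name, three parameter lists.
class OverloadSketch {
    static int sum(int a, int b) {
        return a + b;
    }

    static int sum(int a, int b, int c) {
        return a + b + c;
    }

    static double sum(float a, double b) {
        return a + b;
    }

    public static void main(String[] args) {
        // The compiler picks the overload from the argument types.
        System.out.println(sum(1, 2));       // sum(int, int)
        System.out.println(sum(1, 2, 3));    // sum(int, int, int)
        System.out.println(sum(1.5f, 2.5));  // sum(float, double)
    }
}
```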


Persistent programming languages
Persistent programming languages are designed to handle data that remains in existence
beyond the execution of a program. They integrate data storage mechanisms directly into the
language, allowing objects and data structures to be stored and retrieved seamlessly without
requiring explicit read/write operations for databases or files.

This makes data management more transparent and reduces the impedance mismatch
between the in-memory data structures of a program and the data stored on disk or in
databases.

Key Features of Persistent Programming Languages:

1. Seamless Data Persistence: The language allows data structures and objects to be
stored persistently and retrieved in their original form.

2. Automatic Serialization: Objects are automatically converted to a format suitable for
storage and deserialized when accessed.

3. Transaction Support: Some persistent languages support transactions, ensuring data
integrity and consistency.

4. Type Safety: Persistent languages often provide type-checking features to ensure that
the stored data matches the expected data types.

Examples of Persistent Programming Languages:

1. Java (with libraries like Hibernate for Object-Relational Mapping)

2. C++ (with persistent frameworks such as Objectivity/DB)

3. Smalltalk (integrated persistence with image-based storage)


4. Python (with libraries like pickle, shelve, or frameworks such as ZODB for object
persistence)

5. Ruby (ActiveRecord for database persistence in Ruby on Rails)

6. Lisp (Common Lisp can use external libraries for persistent storage)

Specialized Persistent Languages:

• Napier88: A language specifically designed to provide integrated support for
persistent data.
• PS-algol: An extension of the Algol language that included built-in support for
persistent objects and data structures.

Persistent programming languages help bridge the gap between in-memory data structures
and long-term data storage, streamlining development, and improving code maintainability.
Object Identity and Pointers

1. The association of an object with a physical location in storage (as in
C++) may change over time.

2. There are several degrees of permanence of identity:

o intraprocedure: Identity persists only during the execution of a
single procedure, e.g., local variables within procedures.

o intraprogram: Identity persists only during the execution of a
single program or query, e.g., global variables in programming languages, and main
memory or virtual memory pointers.

o interprogram: Identity persists from one program execution to
another, e.g., pointers to file system data on disk; these may change if the way
data is stored in the file system is changed.

o persistent: Identity persists not only among program executions
but also among structural reorganizations of the data. This is the form of
identity required for object-oriented systems.

3. In persistent extensions of C++, object identifiers are implemented as
"persistent pointers", each of which can be viewed as a pointer to an object in the
database.
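
The difference between a physical location and a persistent identifier can be sketched with a small indirection table. Everything in the example is an assumption made for illustration: the "storage" is a list of slots, and reorganize() simply reverses it. The point is that the OID handed to the program keeps resolving to the same object after the physical layout changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; a real system would map OIDs to disk addresses.
class PersistentOidSketch {
    private final List<String> slots = new ArrayList<>();        // "physical" storage
    private final Map<Long, Integer> oidToSlot = new HashMap<>(); // OID -> slot index
    private long nextOid = 1;

    // Stores a value and returns the OID that identifies it from now on.
    long store(String value) {
        slots.add(value);
        oidToSlot.put(nextOid, slots.size() - 1);
        return nextOid++;
    }

    String fetch(long oid) {
        return slots.get(oidToSlot.get(oid));
    }

    int physicalSlot(long oid) {
        return oidToSlot.get(oid);
    }

    // Structural reorganization: every object moves, but no OID changes.
    void reorganize() {
        Collections.reverse(slots);
        int last = slots.size() - 1;
        oidToSlot.replaceAll((oid, slot) -> last - slot);
    }

    public static void main(String[] args) {
        PersistentOidSketch db = new PersistentOidSketch();
        long a = db.store("object A");
        db.store("object B");
        System.out.println(db.physicalSlot(a) + " -> " + db.fetch(a));
        db.reorganize();
        System.out.println(db.physicalSlot(a) + " -> " + db.fetch(a));
    }
}
```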

Storage and Access of Persistent Objects

1. How are objects stored in a database?

Code (that implements methods) should be stored in the database as part
of the schema, along with type definitions, but many implementations
store it outside of the database, to avoid having to integrate system
software such as compilers with the database system.

Data: stored individually for each object.

2. How to find the objects?

1. Give names to objects like we give names to files: works only for
small sets of objects.

2. Expose object identifiers or persistent pointers to the objects.

3. Store collections of objects and allow programs to iterate over
the collections to find required objects.

The collections can be modeled as objects of a collection type.

A special case of a collection is a class extent, which is a collection
of all objects belonging to the class.

Most OODB systems support all three ways of accessing persistent
objects.
All objects have object identifiers.

Names are typically given only to class extents and other collection
objects, and perhaps to other selected objects, but most objects are not
given names.

Class extents are usually maintained for all classes that can have
persistent objects, but in many implementations, they contain only
persistent objects of the class.

Figure : Example of ODMG C++ Object Definition Language

Persistent Programming Languages

1. Persistent data: data that continue to exist even after the program that
created it has terminated

2. A persistent programming language is a programming language extended
with constructs to handle persistent data.

It differs from embedded SQL in at least two ways:

1. In a persistent programming language, the query language is fully integrated
with the host language, and both share the same type system.
Any format changes required in the database are carried out transparently.

Comparison with embedded SQL:

(1) The host language and the DML have different type systems; code conversion
operates outside of the OO type system and hence has a higher chance
of undetected errors.

(2) Format conversion takes a substantial amount of code.

2. Using embedded SQL, a programmer is responsible for writing
explicit code to fetch data into memory or store data back to the
database.

In a persistent programming language, a programmer can manipulate
persistent data without having to write such code explicitly.

3. Drawbacks of persistent programming languages:

(1) They are powerful, but it is easy to make programming errors that damage the
database.

(2) Automatic high-level optimization is harder.

(3) They do not support declarative querying well.

Persistence of Objects

Several approaches have been proposed to make objects persistent:

1. Persistence by class. Declare a class to be persistent; all objects of
the class are then persistent objects.

Simple but not flexible, since it is often useful to have both transient and
persistent objects in a single class.

In many OODB systems, declaring a class to be persistent is interpreted
as "persistable": objects of the class can potentially be made persistent.

2. Persistence by creation. Introduce new syntax to create persistent objects.

3. Persistence by marking. Mark an object as persistent after it is created.

4. Persistence by reference. One or more objects are explicitly declared as
(root) persistent objects.

All other objects are persistent if and only if they are referenced, directly
or indirectly, from a root persistent object.

It is easy to make an entire data structure persistent by merely declaring
the root of the structure as persistent, but following the chains of
references to detect which objects are persistent is expensive for the
database system.
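
Persistence by reference amounts to a reachability computation over object references. The following is a minimal sketch of that idea, not the mechanism of any particular OODBMS; the `Obj` class, its `refs` list, and the function names are hypothetical.

```python
class Obj:
    """A hypothetical in-memory object that holds references to other objects."""

    def __init__(self, name):
        self.name = name
        self.refs = []


def persistent_objects(roots):
    """Return the names of all objects reachable from the root persistent objects."""
    seen = set()       # ids of objects already visited, so cycles terminate
    reachable = set()  # names of objects found to be persistent
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue
        seen.add(id(obj))
        reachable.add(obj.name)
        stack.extend(obj.refs)  # follow the reference chains
    return reachable


def demo():
    root, a, b, orphan = Obj("root"), Obj("a"), Obj("b"), Obj("orphan")
    root.refs.append(a)
    a.refs.append(b)
    b.refs.append(root)    # a cycle back to the root
    orphan.refs.append(a)  # orphan points at a persistent object but is not referenced by one
    return persistent_objects([root])
```

Here `demo()` returns the names `root`, `a`, and `b`; `orphan` stays transient because no persistent object refers to it. The traversal visits every reachable object, which illustrates why the notes call this detection expensive.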
Comparison between OODBMS and ORDBMS

| OODBMS | ORDBMS |
|---|---|
| Object-Oriented Database Management System (OODBMS) designed to handle complex data | Object-Relational Database Management System (ORDBMS) designed to handle complex data |
| A database system that supports the creation and management of objects, similar to object-oriented programming. | A hybrid system combining features of both relational databases and object-oriented databases. |
| Uses objects, classes, and inheritance from object-oriented programming. | Uses tables with additional support for complex data types such as objects and collections. |
| Stores data as objects. Each object has attributes and methods. | Stores data in tables but supports objects as data types through extensions. |
| Uses object-oriented query languages like OQL (Object Query Language). | Uses SQL with object extensions (e.g., SQL3). |
| Fully supports inheritance, polymorphism, and encapsulation. | Provides limited inheritance features through table hierarchies. |
| Best for applications requiring complex data models, like CAD, multimedia, and scientific applications. | Suitable for traditional business applications requiring relational structure with some object-oriented features. |
| No universally accepted standard exists. | Well-standardized with SQL and related extensions. |
| Easily integrates with object-oriented languages like Java, C++, and Python. | Requires mapping between objects and relational tables. May need additional ORM (Object-Relational Mapping) tools. |
| More complex due to its object-oriented features. | Less complex because it extends a familiar relational model. |
| OODBMS is ideal for applications requiring complex data modeling and direct object manipulation. | ORDBMS is a practical middle-ground, offering object-oriented capabilities within a familiar relational database framework. |
