Lab 3
The objective of this laboratory work is to familiarize oneself with the data exchange
formats and data representation in MongoDB, as well as data modeling.
Tasks of the laboratory work: learn how to translate data models from relational
databases into document-oriented format.
Theoretical background
  "OSFamily": "iOS",
  "OSVersion": "4"
}
{
  "Brand": "Samsung",
  "Model": "Galaxy S3",
  "OSFamily": "Android",
  "OSVersion": "4.0 Ice Cream Sandwich",
  "Display": "4.8 HD Super AMOLED"
}
{"BSON": ["awesome", 5.05, 1986]} →
"\x31\x00\x00\x00
 \x04BSON\x00
 \x26\x00\x00\x00
 \x020\x00\x08\x00\x00\x00awesome\x00
 \x011\x00\x33\x33\x33\x33\x33\x33\x14\x40
 \x102\x00\xc2\x07\x00\x00
 \x00
 \x00"
Note:
Strings enclosed in quotes represent terminal symbols and should be interpreted with C
semantics (for example, "\x01" represents the single byte 0000 0001).
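The byte layout above can be reproduced with a small hand-written encoder. The following is a minimal sketch for Node.js, not the encoder MongoDB drivers actually use: the function names (cstring, encodeElement, encodeDocument) are hypothetical, and only strings, doubles, 32-bit integers and arrays are supported. It assumes that any whole number should be stored as an int32, which is a simplification.

```javascript
// Minimal hand-rolled BSON encoder (sketch). Supports only the types
// needed for {"BSON": ["awesome", 5.05, 1986]}.

// A BSON cstring is UTF-8 bytes followed by a single \x00.
function cstring(s) {
  return Buffer.concat([Buffer.from(s, "utf8"), Buffer.from([0])]);
}

function encodeElement(key, value) {
  if (typeof value === "string") {
    const body = cstring(value);
    const len = Buffer.alloc(4);
    len.writeInt32LE(body.length); // length includes the trailing \x00
    return Buffer.concat([Buffer.from([0x02]), cstring(key), len, body]);
  }
  if (typeof value === "number" && Number.isInteger(value)) {
    // Assumption of this sketch: whole numbers are stored as int32 (type 0x10).
    const body = Buffer.alloc(4);
    body.writeInt32LE(value);
    return Buffer.concat([Buffer.from([0x10]), cstring(key), body]);
  }
  if (typeof value === "number") {
    const body = Buffer.alloc(8);
    body.writeDoubleLE(value); // 64-bit IEEE 754, little-endian (type 0x01)
    return Buffer.concat([Buffer.from([0x01]), cstring(key), body]);
  }
  if (Array.isArray(value)) {
    // An array (type 0x04) is a document whose keys are "0", "1", "2", ...
    const asDoc = Object.fromEntries(value.map((v, i) => [String(i), v]));
    return Buffer.concat([Buffer.from([0x04]), cstring(key), encodeDocument(asDoc)]);
  }
  throw new TypeError("unsupported type in this sketch");
}

function encodeDocument(doc) {
  const elements = Buffer.concat(
    Object.entries(doc).map(([k, v]) => encodeElement(k, v))
  );
  const total = Buffer.alloc(4);
  // Total size = 4-byte length prefix + elements + terminating \x00.
  total.writeInt32LE(4 + elements.length + 1);
  return Buffer.concat([total, elements, Buffer.from([0])]);
}

const bytes = encodeDocument({ BSON: ["awesome", 5.05, 1986] });
console.log(bytes.length, bytes.toString("hex"));
```

The first four bytes of the result hold the total document size (0x31, that is 49 bytes), and the embedded array carries its own size prefix (0x26, that is 38 bytes), which matches the listing above.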
Data Modeling
Data modeling in MongoDB involves designing the structure and organization of data
within MongoDB collections, which are analogous to tables in relational databases. MongoDB is
a NoSQL database that uses a flexible, document-oriented data model, where data is stored in
BSON (Binary JSON) format.
Here are some key considerations for data modeling in MongoDB:
• Denormalization: MongoDB does not enforce a rigid schema, which allows for
denormalization: related data can be embedded within a single document for efficient
retrieval. This reduces the need for complex joins, but may require duplication of data.
• Embedded documents: MongoDB supports nested (embedded) documents, which allows
hierarchical relationships between data to be modeled directly. This can simplify data
retrieval and improve performance in some cases.
• Document-oriented design: MongoDB is optimized for handling documents, which can
be rich in structure and can vary from one document to another. Data modeling should consider
the document-oriented nature of MongoDB and leverage its flexibility for storing data.
• Query optimization: Data modeling should consider the types of queries that will be
performed on the data and optimize the document structure accordingly. This may involve
creating indexes, using appropriate data types, and organizing data in a way that aligns with
query patterns.
• Scalability: MongoDB is designed to scale horizontally, so data modeling should take
into account the potential for high data volumes and plan for distributed deployments:
sharding for scalability and replica sets for high availability.
• Data integrity: While MongoDB allows for flexible data modeling, care should be taken
to ensure data integrity, consistency, and accuracy. Application-level validation and data
validation rules should be implemented as needed to maintain data quality.
• Performance considerations: Data modeling should take into consideration
performance aspects, such as the size of documents, the frequency of updates, and the read/write
patterns of the application, to optimize for performance and minimize potential bottlenecks.
Overall, data modeling in MongoDB requires careful consideration of the application's
requirements, query patterns, scalability needs, and performance considerations, while
leveraging the flexibility and document-oriented nature of MongoDB to design an efficient and
effective data model.
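The application-level validation mentioned under data integrity can be sketched as a plain function that checks a document before it is written. This is only an illustration: the function name validatePhone is hypothetical, the field names follow the phone examples in this lab, and the rules themselves (which fields are required, which types they must have) are assumptions.

```javascript
// Application-level validation sketch for a phone document.
// Returns a list of error messages; an empty list means the document passed.
function validatePhone(doc) {
  const errors = [];
  // Assumed rule: these fields are required and must be non-empty strings.
  for (const field of ["Brand", "Model", "OSFamily"]) {
    if (typeof doc[field] !== "string" || doc[field].trim() === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  // Assumed rule: OSVersion is optional, but must be a string when present.
  if ("OSVersion" in doc && typeof doc.OSVersion !== "string") {
    errors.push("OSVersion must be a string when present");
  }
  return errors;
}

const goodPhone = {
  Brand: "Samsung",
  Model: "Galaxy S3",
  OSFamily: "Android",
  OSVersion: "4.0 Ice Cream Sandwich",
};
const badPhone = { Brand: "", Model: "Galaxy S3", OSFamily: "Android", OSVersion: 4 };

console.log(validatePhone(goodPhone));
console.log(validatePhone(badPhone));
```

An application would call such a function before inserting or updating a document and reject the write when the returned list is not empty.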
Non-relational databases allow designing the model of a domain as a set of objects. In
contrast to relational databases (RDB), where information about one entity is scattered across
different tables, in a non-relational database, it can be stored in a single object.
A major difference between MongoDB and an RDB is that ordinary queries have no equivalent
of the JOIN operation. When data from several collections must be combined, the join is
typically implemented in the application's source code: to find data related to a particular
document, a second query is executed. (The aggregation framework also provides the $lookup
stage, which performs a join-like operation on the server.)
To associate documents, a document can store the "_id" of each related document.
As an example, let's store the information about a phone manufacturer as a separate
document:
{
  _id: ObjectId("1"),
  "Name": "Nokia",
  "BrandName": "Nokia",
  "BrandCountry": "Finland"
}
Documents that need to refer to the "Nokia" document as the manufacturer store a
reference to its "_id" field. A phone record that references its manufacturer looks like
this:
{
  _id: ObjectId("2"),
  "Name": "L920",
  "Model": "Lumia 920",
  "OSFamily": "Windows",
  "OSVersion": "8",
  "Brand": ObjectId("1")
}
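Note that the value of the "Brand" field in the "L920" document and the "_id" field in the
"Nokia" document are the same.
The "_id" field can hold any value that is unique within its collection. The ObjectId values in
these examples are abbreviated for readability; a real ObjectId is a 12-byte value written as 24
hexadecimal characters.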
To find all phones manufactured under the "Nokia" brand, you would need to execute a query
specifying the value of its "_id" field.
db.phones.find({Brand: ObjectId("1")})
If you need to specify more than one related document, you can use arrays:
"Brand": [ObjectId("1"), ObjectId("3")]
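The two-step lookup described above (first obtain the manufacturer's "_id", then query the phones that reference it) can be sketched in plain JavaScript. In-memory arrays stand in for the collections here, so this is an illustration of the logic rather than driver code. The collection contents, the second brand, the extra phone documents and the helper names findBrandId and findPhonesByBrand are all hypothetical, and plain numbers replace ObjectId values for brevity.

```javascript
// In-memory stand-ins for the "brands" and "phones" collections (hypothetical data).
const brands = [
  { _id: 1, BrandName: "Nokia", BrandCountry: "Finland" },
  { _id: 3, BrandName: "OtherBrand", BrandCountry: "Elsewhere" },
];
const phones = [
  { _id: 2, Model: "Lumia 920", Brand: 1 },           // single reference
  { _id: 4, Model: "Co-branded phone", Brand: [1, 3] }, // array of references
  { _id: 5, Model: "Unrelated phone", Brand: 3 },
];

// Step 1: resolve the brand name to its _id (the first "query").
function findBrandId(name) {
  const brand = brands.find((b) => b.BrandName === name);
  return brand ? brand._id : null;
}

// Step 2: find phones whose Brand field equals the id or, when Brand is an
// array, contains it. This mirrors how an equality match on an array field
// also matches individual array elements.
function findPhonesByBrand(brandId) {
  return phones.filter((p) =>
    Array.isArray(p.Brand) ? p.Brand.includes(brandId) : p.Brand === brandId
  );
}

const nokiaId = findBrandId("Nokia");
const nokiaPhones = findPhonesByBrand(nokiaId);
console.log(nokiaPhones.map((p) => p.Model));
```

Against a real database, each step would be a separate query issued by the application, which is what implementing the join in application code means in practice.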
One way to avoid references between documents is to use nested documents. For example, the
example above could be rewritten with the Nokia company details stored as an embedded
document:
{
  "Name": "L920",
  "Model": "Lumia 920",
  "OSFamily": "Windows",
  "OSVersion": "8",
  "Brand": {
    "BrandName": "Nokia",
    "BrandCountry": "Finland"
  }
}
Embedded documents:
A one-to-many relationship can be modeled in the same way, by embedding an array of
documents in the parent document:
{
  "_id": 1,
  "name": "John",
  "age": 30,
  "comments": [
    {
      "comment_id": 1,
      "text": "Great post!"
    },
    {
      "comment_id": 2,
      "text": "Interesting article!"
    }
  ]
}
In this example, comments are embedded as an array within the user document, creating a
one-to-many relationship between users and comments. You can easily retrieve comments for a
user by querying the "users" collection.
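Translating a relational model into this embedded form amounts to grouping the child rows under their parent. The sketch below assumes two hypothetical relational tables, users and comments, whose rows are given as plain arrays. The column names and the helper embedComments are assumptions made for illustration.

```javascript
// Rows as they might be exported from two relational tables (hypothetical data
// modeled on the example above).
const userRows = [{ id: 1, name: "John", age: 30 }];
const commentRows = [
  { comment_id: 1, user_id: 1, text: "Great post!" },
  { comment_id: 2, user_id: 1, text: "Interesting article!" },
];

// Build one document per user, embedding that user's comments as an array.
// The foreign key user_id is dropped because nesting already expresses it.
function embedComments(users, comments) {
  return users.map((u) => ({
    _id: u.id,
    name: u.name,
    age: u.age,
    comments: comments
      .filter((c) => c.user_id === u.id)
      .map(({ comment_id, text }) => ({ comment_id, text })),
  }));
}

const userDocs = embedComments(userRows, commentRows);
console.log(JSON.stringify(userDocs, null, 2));
```

The resulting documents have the same shape as the user document shown above and could be inserted into a "users" collection as they are.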
References:
MongoDB also supports referencing documents in one collection from another. A reference
plays a role similar to a foreign key in a relational database, but MongoDB does not enforce it.
For example, you can have a "users" collection and a separate "comments" collection, where
each comment has a reference to the user who made the comment:
// Users collection
{
  "_id": 1,
  "name": "John",
  "age": 30
}
// Comments collection
{
  "_id": 101,
  "text": "Great post!",
  "user_id": 1
}
In this example, the "comments" collection includes a "user_id" field that references the
"_id" field of the corresponding user in the "users" collection. This creates a many-to-one
relationship between comments and users. The associated user information can be retrieved
either with a second query or with a join-like aggregation.
To retrieve comments along with their associated user information, you can use
MongoDB's $lookup aggregation stage to perform a join-like operation between the
"comments" and "users" collections:
db.comments.aggregate([
  {
    $lookup: {
      from: "users",
      localField: "user_id",
      foreignField: "_id",
      as: "user_info"
    }
  }
])
This aggregation query will retrieve comments from the "comments" collection, and for
each comment, it will perform a lookup in the "users" collection based on the "user_id" field,
matching it with the "_id" field in the "users" collection. The retrieved user information will be
added to the "user_info" field in the output.
The result of the above aggregation query might look like this:
{
  "_id": 101,
  "text": "Great post!",
  "user_id": 1,
  "user_info": [
    {
      "_id": 1,
      "name": "John",
      "age": 30
    }
  ]
}
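To make the shape of this result easier to reason about, the matching logic can be imitated in plain JavaScript. The sketch below is a simplified model of an equality-based $lookup and not MongoDB's implementation: the function name lookup is hypothetical, in-memory arrays stand in for the collections, and the special handling that the real stage applies to array, null and missing fields is ignored. The second comment is invented to show what happens when no user matches.

```javascript
// Simplified model of an equality $lookup: for every local document, collect
// all foreign documents whose foreignField equals the local document's localField.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => ({
    ...doc,
    // Always an array: empty when nothing matches (left outer join behaviour).
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const users = [{ _id: 1, name: "John", age: 30 }];
const comments = [
  { _id: 101, text: "Great post!", user_id: 1 },
  { _id: 102, text: "Comment without a user", user_id: 99 }, // hypothetical orphan
];

const joined = lookup(comments, users, {
  localField: "user_id",
  foreignField: "_id",
  as: "user_info",
});
console.log(JSON.stringify(joined, null, 2));
```

Every comment is kept in the output, and a comment whose "user_id" matches no user simply receives an empty "user_info" array, which is why the field is an array even when at most one user can match.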
Hybrid Approach:
You can also use a hybrid approach, combining embedded documents and references,
depending on the specific requirements of your application. For example, you can embed some
related data within a document for efficiency, and use references for other related data that may
have more complex relationships or require frequent updates.
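As an illustration of the hybrid approach, a comment document can keep a reference to its author and also embed a copy of the one field that is needed every time the comment is displayed. This is a sketch under assumptions: the helper buildComment and the field name "user_name" are hypothetical, and which fields are worth copying depends on the application.

```javascript
// Hybrid pattern sketch: reference the user, but embed the small, rarely
// changing part of the user that is needed on every read.
function buildComment(commentId, text, user) {
  return {
    _id: commentId,
    text,
    user_id: user._id,    // reference: the full, current profile stays in "users"
    user_name: user.name, // embedded copy: avoids a second query when rendering
  };
}

const author = { _id: 1, name: "John", age: 30 };
const hybridComment = buildComment(101, "Great post!", author);
console.log(hybridComment);
```

The trade-off is that the embedded copy must be updated whenever the user's name changes, while fields that change often or are rarely read (such as "age" here) stay only in the referenced document.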
Safety guidelines
The safety guidelines for carrying out the laboratory work are in line with those generally
accepted for personal computer users. Do not attempt to repair your personal computer or install
and remove software. In case of a malfunction of the personal computer, report it to the
laboratory service staff (operator, administrator). Observe safety regulations when working with
electrical equipment. Do not touch electrical sockets with metallic objects. The user's
workstation should be kept clean. Eating and drinking are not allowed near the personal
computer.
Individual assignments
Before carrying out the laboratory work, each student receives an individual assignment.
The laboratory work can be defended only after the individual assignment has been completed.
During the defense of the laboratory work, the student:
• Answers control questions;
• Explains the process of completing the individual assignment;
• Explains the results obtained from the individual assignment.
The progress of the defense of the laboratory work is monitored by the teacher.
FAQ:
1. What are some considerations to keep in mind when designing data models in
MongoDB?
Answer: Some considerations to keep in mind when designing data models in MongoDB
include understanding the data access patterns, considering the read vs. write trade-offs, selecting
appropriate data types, denormalizing or embedding data for performance, and planning for
future scalability and flexibility.
5. How can you optimize data models in MongoDB for write-heavy workloads?
Answer: To optimize data models in MongoDB for write-heavy workloads, you can
consider denormalizing data so that each write touches a single document, keeping only the
indexes that queries actually need (every index adds work to each write), and leveraging
MongoDB's sharding capabilities to distribute data across multiple shards for horizontal
scalability.