CCS368 - Stream Processing Lab Manual
Name : ……………………………………….
Register No : ………………………………………
Year/Semester: …….……………………………….
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Certified that this is the bonafide Record of the work done by Mr./Ms.
…………………………………….……. in the CCS368 Stream Processing
Laboratory of this institution, as per the Anna University, Chennai for the third Semester
Computer Science and Engineering, during the period of August 2024 to December
2024.
Register Number:
1. Procedure for Install MongoDB
2. Create and Drop a database in MongoDB
3. MongoDB on the Fly
4. Creating Collection
5. Insert document using MongoDB
6. MongoDB Update document
7. MongoDB Delete
8. Java & PHP
9. Procedure for installing Apache Kafka
10. Kafka Cluster & Basic operations
EX.NO: 1 Procedure for Installing MongoDB
DATE:
Step 1: Go to the MongoDB download page and click Download, as shown in the screenshot. An
.msi file such as mongodb-win32-x86_64-2008plus-ssl-3.4.7-signed will be downloaded to
your system. Double-click the file to run the installer.
Step 2: Click Next when the MongoDB installation window pops up.
Step 3: Accept the MongoDB user Agreement and click Next.
Step 4: When the setup asks you to choose the Setup type, choose Complete.
Step 5: Click Install to begin the installation.
Step 6: That’s it. Click Finish once the MongoDB installation is complete.
Result: Thus MongoDB is successfully installed.
EX.NO: 2 Create and Drop a database in MongoDB
DATE:
mongod is the background (daemon) process used by MongoDB. Its main purpose is to manage
all the MongoDB server tasks, for instance accepting requests, responding to clients, and managing
memory. mongo is a command-line shell through which clients (for example, system
administrators and developers) interact with the server.
To run the MongoDB shell, type the following command:
mongo
After pressing Enter, we are at the MongoDB shell, as shown in the figure below.
Once you are in the MongoDB shell, create the database in MongoDB by typing this
command:
>use database_name // creates a new DB or connects to an already existing DB
The DB madavi is now selected, but it is not yet present in the list of all the databases. This is
because a database is not created until you save a document in it.
Note: If the database name you mentioned is already present then this command will
connect you to the database. However if the database doesn’t exist then this will create the
database with the given name and connect you to it.
• Now we are creating a collection student and inserting a document in it.
>db.student.insert({name: "sree", age: 30, address: "vijayawada"})
• You can now see that the database "madavi" is created.
OUTPUT:
Result: -
EX.NO: 3 MongoDB on the fly
DATE:
• A convenient feature of MongoDB is that you need not create a collection before you insert a
document into it. With a single command you can insert a document into a collection, and
MongoDB creates that collection on the fly.
• SYNTAX:
db.collection_name.insert({key:value, key:value…})
• EXAMPLE:
db.student.insert({rollno:"20X41A0441",name:"durga",age:18,city:"Vijayawada"})
• SYNTAX: db.collection_name.find()
• To check whether the collection is created successfully, use the following command.
> show collections // This command shows the list of all the collections in the currently
selected database.
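The on-the-fly behaviour described above can be modelled with a small Python sketch. This is a toy in-memory stand-in, not MongoDB or its driver; the class name Database and its methods are assumptions made only for illustration. It shows that a collection appears in the listing only after the first document is written to it.

```python
class Database:
    """Toy in-memory stand-in for a MongoDB database (hypothetical, not the real driver)."""

    def __init__(self):
        self._collections = {}

    def insert(self, collection_name, document):
        # setdefault creates the collection the first time it is written to
        self._collections.setdefault(collection_name, []).append(document)

    def show_collections(self):
        # Mirrors the idea of "show collections": list the collections that exist
        return sorted(self._collections)


db = Database()
before = db.show_collections()  # nothing has been inserted yet
db.insert("student", {"rollno": "20X41A0441", "name": "durga", "age": 18, "city": "Vijayawada"})
after = db.show_collections()   # the collection now exists

print(before)
print(after)
```

No explicit create step is needed in the model; the first insert does the creation, which is the same idea the exercise demonstrates in the shell.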
OUTPUT:
Result:-
EX.NO: 4 Creating Collection
DATE:
Aim:-
Creating a collection with options before inserting, and dropping the collection.
• We can also create collection before we actually insert data in it. This method provides
you the options that you can set while creating a collection.
SYNTAX:
db.createCollection(name, options)
• name is the collection name
• options is an optional field that we can use to specify certain parameters such as
size, max number of documents etc. in the collection.
To drop a collection:
db.collection_name.drop()
• Note: Once you drop a collection, all its documents and the indexes associated with
them are also dropped. To preserve the indexes, use the remove() function, which
only removes the documents in the collection but doesn't remove the collection
itself or the indexes created on it. The remove() function is covered in
EX.NO: 7.
EXAMPLE:
> db.createCollection("students")
{ "ok" : 1 }
> db.students.drop()
true
OPTIONS field in the above syntax:
> db.createCollection("teachers", { capped : true, size : 9232768 })
{ "ok" : 1 }
• This command creates a capped collection named "teachers" with a maximum size of 9232768
bytes. Once this collection reaches that limit, it starts overwriting the oldest entries.
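The overwrite-the-oldest behaviour can be illustrated with a minimal Python sketch. It is a toy model, not MongoDB code: the class CappedCollection is hypothetical, and for simplicity it caps the number of documents (max_docs) rather than the size in bytes.

```python
from collections import deque


class CappedCollection:
    """Toy model of a capped collection (hypothetical; limits document count, not bytes)."""

    def __init__(self, max_docs):
        # A deque with maxlen discards the oldest item when a new one is appended at capacity
        self._docs = deque(maxlen=max_docs)

    def insert(self, document):
        self._docs.append(document)

    def find(self):
        return list(self._docs)


teachers = CappedCollection(max_docs=3)
for teacher_id in range(1, 6):
    teachers.insert({"teacher_id": teacher_id})

# Only the three most recent documents remain
print(teachers.find())
```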
Result:-
EX.NO: 5 Insert document using MongoDB
DATE:
• The field "course" in the example below is an array of embedded documents, each holding
several key-value pairs.
>db.students.insert(
{
name: "Chaitanya",
age: 20,
email: "[email protected]",
course: [ { name: "MongoDB", duration: 7 }, { name: "Java", duration: 30 } ]
}
)
Output:
WriteResult({ "nInserted" : 1 })
b. Insert multiple documents in collection
To insert multiple documents into a collection, we define an array of documents and then
call the insert() method with the array variable, as shown in the example below. Here we are
inserting three documents into the collection named "students". If the collection is not present,
this command creates the collection and then inserts these documents.
EXAMPLE:
>var beginners =
[
{
"StudentId" : 1001,
"StudentName" : "Steve",
"age": 30
},
{
"StudentId" : 1002,
"StudentName" : "Negan",
"age": 42
},
{
"StudentId" : 3333,
"StudentName" : "Rick",
"age": 35
},
];
db.students.insert(beginners);
output:
BulkWriteResult({
"writeErrors" : [ ],
"writeConcernErrors" : [ ],
"nInserted" : 3,
"nUpserted" : 0,
"nMatched" : 0,
"nModified" : 0,
"nRemoved" : 0,
"upserted" : [ ]
})
As you can see, nInserted shows 3. This means that 3 documents have been inserted by this
command.
To verify that the documents are in the collection, run this command:
db.students.find()
You can print the output data in JSON format so that it is easy to read. To print the data in JSON
format, run the command
db.collection_name.find().forEach(printjson)
In the screenshot below, you can see the difference. First we printed the documents using the
normal find() method, and then we printed the documents of the same collection in JSON format.
The documents in JSON format are neat and easy to read.
OUTPUT:
b. Insert multiple documents in collection
Printing all documents based on Query
Result:-
EX.NO: 6 MongoDB Update document
DATE:
Aim :- To update the document using update method and save method.
a) using update() method.
b) using save() method.
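SYNTAX:
db.collection_name.update(criteria, update_data)
EXAMPLE:
>db.students.update({"name":"sai"},{$set:{"name":"sree"}})
By default, update() modifies only the first matching document. To update all matching
documents, pass the multi option.
SYNTAX:
db.collection_name.update(criteria, update_data, {multi: true})
EXAMPLE:
>db.students.update({"name":"sai"},{$set:{"name":"sree"}},{multi:true})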
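The difference between a plain update and one with multi set to true can be sketched in Python. This is an in-memory model only, not the MongoDB implementation: the function update and its parameters are hypothetical, criteria are matched by simple field equality, and set_fields plays the role of $set.

```python
def update(collection, criteria, set_fields, multi=False):
    """Apply set_fields to matching documents; return how many documents were matched."""
    matched = 0
    for doc in collection:
        if all(doc.get(key) == value for key, value in criteria.items()):
            doc.update(set_fields)  # plays the role of $set
            matched += 1
            if not multi:
                break               # without multi, stop after the first match
    return matched


one = [{"name": "sai"}, {"name": "sai"}]
matched_one = update(one, {"name": "sai"}, {"name": "sree"})

many = [{"name": "sai"}, {"name": "sai"}]
matched_many = update(many, {"name": "sai"}, {"name": "sree"}, multi=True)

print(matched_one, one)
print(matched_many, many)
```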
SYNTAX:
db.collection_name.save(document)
• To work with the save() method you should know the unique _id field of the document.
• A very important point to note is that when you do not provide the _id field while using
the save() method, it calls the insert() method and the passed document is inserted into the
collection as a new document.
• To get the _id of a document, you can run this command:
db.students.find().pretty()
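The save() rules above can be summarised in a short Python sketch. It is a simplified model under stated assumptions, not MongoDB code: the function save is hypothetical, the collection is a plain dict keyed by _id, and an integer counter stands in for ObjectId generation.

```python
import itertools

_id_counter = itertools.count(1)  # stand-in for ObjectId generation (assumption)


def save(collection, document):
    """collection maps _id -> document; returns 'inserted' or 'replaced'."""
    doc = dict(document)
    if "_id" not in doc:
        # No _id supplied: behave like insert() and store a new document
        doc["_id"] = next(_id_counter)
        collection[doc["_id"]] = doc
        return "inserted"
    # _id supplied: replace the stored document if it exists, otherwise insert it
    action = "replaced" if doc["_id"] in collection else "inserted"
    collection[doc["_id"]] = doc
    return action


students = {}
r1 = save(students, {"name": "sai"})              # no _id, so a new document is inserted
r2 = save(students, {"_id": 1, "name": "sree"})   # known _id, so the document is replaced

print(r1, r2, students)
```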
OUTPUT:
To update multiple documents with the update() method:
b) Update a document using save() method:
Result:-
EX.NO: 7 MongoDB Delete
DATE:
The remove() method is used for removing documents from a collection in MongoDB.
SYNTAX:
>db.collection_name.remove(delete_criteria)
EXAMPLE:
> db.students.find().pretty()
{
"_id" : ObjectId("59bcecc7668dcce02aaa6fed"),
"StudentId" : 1001,
"StudentName" : "Steve",
"age" : 30
}
{
"_id" : ObjectId("59bcecc7668dcce02aaa6fef"),
"StudentId" : 3333,
"StudentName" : "Rick",
"age" : 35
}
To remove the student whose student id is 3333 from this collection, write a command using the
remove() method like this:
db.students.remove({"StudentId": 3333})
Output:
WriteResult({ "nRemoved" : 1 })
When more than one document in the collection matches the criteria, all of those documents are
deleted when you run the remove command. However, there is a way to limit the deletion to only
one document so that, even if more documents match the deletion criteria, only one document is
deleted.
SYNTAX:
>db.collection_name.remove(delete_criteria, justOne)
Here justOne is a Boolean parameter that takes true (1) or false (0). If you give 1, it limits the
deletion to only one document. This parameter is optional; as seen above, we used the remove()
method without it.
> db.walkingdead.find().pretty()
{
"_id" : ObjectId("59bf280cb8e797a22c654229"),
"age" : 32
}
{
"_id" : ObjectId("59bf2851b8e797a22c65422a"),
"name" : "Negan",
"age" : 35
}
{
"_id" : ObjectId("59bf28a5b8e797a22c65422b"),
"age" : 32
}
Suppose we want to remove a document that has age equal to 32. Two documents in this collection
match this criterion. To limit the deletion to one, we set the justOne parameter to true (1).
db.walkingdead.remove({"age": 32}, 1)
WriteResult({ "nRemoved" : 1 })
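The effect of the justOne parameter can be modelled with a small Python sketch. It is an in-memory illustration, not MongoDB code: the function remove and the parameter name just_one are hypothetical, and criteria are matched by simple field equality. An empty criteria dictionary matches every document.

```python
def remove(collection, criteria, just_one=False):
    """Delete matching documents in place; return how many were removed."""
    removed = 0
    kept = []
    for doc in collection:
        matches = all(doc.get(key) == value for key, value in criteria.items())
        if matches and not (just_one and removed >= 1):
            removed += 1          # drop this document
        else:
            kept.append(doc)      # keep non-matching or surplus documents
    collection[:] = kept
    return removed


walkingdead = [{"age": 32}, {"name": "Negan", "age": 35}, {"age": 32}]

removed_one = remove(walkingdead, {"age": 32}, just_one=True)
after_one = list(walkingdead)     # one document with age 32 is still present

removed_all = remove(walkingdead, {})  # empty criteria matches everything

print(removed_one, after_one)
print(removed_all, walkingdead)
```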
c) Remove all documents
To remove all the documents from a collection without removing the collection itself, use the
remove() method like this:
SYNTAX:
>db.collection_name.remove({})
To drop a collection, first connect to the database that contains the collection, and then
type the following command:
>db.collection_name.drop()
Note: Once you drop a collection, all its documents and the indexes associated with them are also
dropped. To preserve the indexes, use the remove() function, which only removes the documents
in the collection but doesn't remove the collection itself or the indexes created on it.
> db.teachers.drop()
true
OUTPUT:
b) Remove only one document matching your criteria
(Or)
c) Remove all documents
Result :-
EX.NO: 8 Java & PHP
DATE:
Aim:
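1. To use MongoDB with Java, connect through the MongoDB Java driver:
import com.mongodb.MongoClient;
import com.mongodb.MongoCredential;
import com.mongodb.client.MongoDatabase;

// The class name, host, port and database name below are assumptions for illustration
public class ConnectToDB {
    public static void main(String[] args) {
        try {
            // Create a client for a MongoDB server on the local machine
            MongoClient mongo = new MongoClient("localhost", 27017);
            // Select the database to work with
            MongoDatabase database = mongo.getDatabase("myDb");
            System.out.println("Successfully connected to the database");
        } catch (Exception e) {
            System.out.println("Connection establishment failed");
            System.out.println(e);
        }
    }
}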
2. To use MongoDB with PHP, you need the MongoDB PHP driver. Download the latest release of
the driver from its download page. Then unzip the archive, put php_mongo.dll in your PHP
extension directory ("ext" by default), and add the following line to your php.ini file:
extension = php_mongo.dll
Make a Connection and Select a Database
To make a connection, you need to specify the database name. If the database doesn't exist,
MongoDB creates it automatically.
The following code snippet connects to the database:
<?php
// connect to mongodb
$m = new MongoClient();
echo "Connection to database successfully";
// select a database
$db = $m->mydb;
?>
Result:-
Thus the simple application is successfully created.
EX.NO: 9 Procedure for installing Apache Kafka
DATE:
Apache Kafka can run on all platforms supported by Java. To set up Kafka on an Ubuntu system,
you need to install Java first. Since Oracle Java is now commercially licensed, we use its
open-source counterpart, OpenJDK.
Download the Apache Kafka binary files from its official download website, then extract the
archive and move it to /usr/local/kafka:
wget https://downloads.apache.org/kafka/3.4.0/kafka_2.12-3.4.0.tgz
tar xzf kafka_2.12-3.4.0.tgz
sudo mv kafka_2.12-3.4.0 /usr/local/kafka
Step 3: Creating Systemd Unit Files
Now, you need to create systemd unit files for the ZooKeeper and Kafka services.
nano /etc/systemd/system/zookeeper.service
[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
ExecStop=/usr/local/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
nano /etc/systemd/system/kafka.service
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64"
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
First, you need to start the ZooKeeper service and then start Kafka. Use the systemctl command
to start a single-node ZooKeeper instance.
sudo systemctl start zookeeper
Now start the Kafka server and view the running status:
sudo systemctl start kafka
sudo systemctl status kafka
All done. The Kafka installation has been successfully completed.
Kafka provides multiple pre-built shell scripts to work with it. First, create a topic named
myTopic:
cd /usr/local/kafka
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic myTopic
The replication factor describes how many copies of data will be created. As we are running
with a single instance keep this value 1. Set the partition options as the number of brokers you
want your data to be split between. As we are running with a single broker keep this value 1.
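The relationship between partitions, replication factor and brokers can be sketched in Python. This is a deliberately simplified round-robin model, not Kafka's actual replica assignment algorithm; the function assign_replicas and its parameters are hypothetical. It does reflect one real constraint: the replication factor cannot be larger than the number of available brokers.

```python
def assign_replicas(num_partitions, replication_factor, broker_ids):
    """Return {partition: [broker ids holding a copy]} using simple round-robin placement."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            broker_ids[(partition + replica) % len(broker_ids)]
            for replica in range(replication_factor)
        ]
    return assignment


# Single broker, as in this exercise: one partition, one copy
single = assign_replicas(1, 1, [0])

# Three brokers: every partition has a copy on every broker
cluster = assign_replicas(3, 3, [1, 2, 3])

print(single)
print(cluster)
```

With a single broker, asking for a replication factor of 2 raises an error in this model, which is why the exercise keeps the value at 1.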
You can create multiple topics by running the same command as above.
After that, you can see the created topics on Kafka by running the command below:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Result:-
EX.NO: 10 Kafka Cluster & Basic operations
DATE:
Aim :-
Demonstrate setting up a single-node, single-broker Kafka cluster and show
basic operations such as creating topics and producing/consuming messages.
To set up a Kafka cluster, you will need to follow these general steps:
1. Install Kafka on all nodes of the cluster. You can download Kafka from the Apache
Kafka website.
2. Configure the server.properties file on each node to specify the broker ID, the
ZooKeeper connection string, and other properties.
3. Start the ZooKeeper service on each node. This is required for Kafka to function.
4. Start the Kafka brokers on each node by running the kafka-server-start command
and specifying the location of the server.properties file.
5. Test the cluster by creating a topic, producing and consuming messages, and verifying
that they are replicated across all nodes.
1. Install Kafka on all nodes of the cluster. You can download Kafka from the Apache
Kafka website.
2. Configure the server.properties file on each node to specify the broker ID, the
ZooKeeper connection string, and other properties. For example, here is a
configuration for a simple Kafka cluster with three brokers:
broker.id=1
listeners=PLAINTEXT://localhost:9092
num.partitions=3
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181

broker.id=2
listeners=PLAINTEXT://localhost:9093
num.partitions=3
log.dirs=/tmp/kafka-logs-2
zookeeper.connect=localhost:2181

broker.id=3
listeners=PLAINTEXT://localhost:9094
num.partitions=3
log.dirs=/tmp/kafka-logs-3
zookeeper.connect=localhost:2181
In this example, each broker has a unique broker.id and listens on a different port for client
connections. The num.partitions property specifies the default number of partitions for new
topics, and log.dirs specifies the directory where Kafka should store its data on disk.
zookeeper.connect specifies the ZooKeeper connection string, which should point to the
ZooKeeper ensemble.
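Because the three broker configurations differ only in the broker id, the port and the log directory, they can be generated from one template. The Python sketch below is only an illustration: the function broker_properties and the file names server-N.properties are assumptions, while the property names and values are the ones used in the example configuration above.

```python
def broker_properties(broker_id, port, zookeeper="localhost:2181", partitions=3):
    """Return the text of a properties file for one broker."""
    return "\n".join([
        f"broker.id={broker_id}",
        f"listeners=PLAINTEXT://localhost:{port}",
        f"num.partitions={partitions}",
        f"log.dirs=/tmp/kafka-logs-{broker_id}",
        f"zookeeper.connect={zookeeper}",
    ])


# Broker ids 1, 2, 3 listen on ports 9092, 9093, 9094
configs = {broker_id: broker_properties(broker_id, 9091 + broker_id) for broker_id in (1, 2, 3)}

for broker_id, text in configs.items():
    print(f"# server-{broker_id}.properties (hypothetical file name)")
    print(text)
    print()
```

Every broker shares the same zookeeper.connect value, while broker.id, the listener port and log.dirs must stay unique per broker.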
3. Start the ZooKeeper service on each node. This is required for Kafka to function. You
can start ZooKeeper by running the following command:
bin/zookeeper-server-start.sh config/zookeeper.properties
This will start a single-node ZooKeeper instance using the default configuration.
4. Start the Kafka brokers on each node by running the kafka-server-start command and
specifying the location of the server.properties file. For example:
bin/kafka-server-start.sh config/server.properties
This will start the Kafka broker on the default port (9092) using the configuration in
config/server.properties.
5. Test the cluster by creating a topic, producing and consuming messages, and verifying
that they are replicated across all nodes. You can use the kafka-topics,
kafka-console-producer, and kafka-console-consumer command-line tools to perform
these tasks. The commands shown later in this exercise create a topic with three partitions
and three replicas and produce messages to it; a console consumer then reads them back. You
can verify that the messages are replicated across all nodes by stopping one of the brokers
and observing that the other brokers continue to serve messages.
server.properties
broker.id=1
listeners=PLAINTEXT://localhost:9093
log.dirs=c:/kafka/kafka-logs-1
auto.create.topics.enable=false (optional)
Creating new Broker-1
Follow these steps to add a new broker.
1. Change the broker id to 1.
2. Change the port number to 9093 and set auto-create to false.
Edit: server-2.properties
broker.id=2
listeners=PLAINTEXT://localhost:9094
log.dirs=c:/kafka/kafka-logs-2
auto.create.topics.enable=false
Starting up these 2 Kafka brokers
Together with the original broker, we have now started 3 Kafka brokers, so a Kafka cluster
with 3 brokers is up and running on our machine.
It's time to create a new topic, then we will produce and consume the messages with our new
cluster setup.
.\bin\windows\kafka-topics.bat --create --topic test-topic-replicated --zookeeper localhost:2181 --replication-factor 3 --partitions 3
.\bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic test-topic-replicated
message sent: Hi
Instantiate a new Consumer to receive the messages.
message received: Hi
Now whatever message we have sent is received by the console consumer. The interesting
part is that we now have 3 new Kafka folders. Let's go ahead and check what we have
in them.
Log directories
• Close the producer console now. You can see that the kafka-logs-1 and kafka-logs-2
directories have been created.
• Each broker gets its own folder, and that is where it persists all the messages that are
produced to that particular broker. So we have three different directories, one for each
broker.
Result :-
Thus a Kafka cluster was set up successfully and the basic operations were executed.