A Brief Overview On Apache CouchDB

Apache CouchDB is an open source NoSQL database that stores data in JSON format and uses JavaScript for querying. It allows users to create, read, update and delete databases and documents through a RESTful HTTP API. Documents can contain fields, attachments and metadata. The API uses HTTP methods like GET, POST, PUT and DELETE and returns responses in JSON format.

Uploaded by

harshithyv02
Copyright
© © All Rights Reserved

Apache CouchDB

A Brief Overview on Apache CouchDB

Submitted by
Harshith V
Trainee Software Engineer
Custom Solution Services
Excelsoft Technologies Inc.

Catalog
1.0 DBMS - Database Management System
1.1 RDBMS
1.2 OLAP
1.3 NoSQL
1. Document databases
2. Key-Value stores
3. Column-Family stores
4. Graph databases

2.0 Apache CouchDB Introduction
2.1 Technical Overview
2.1.1 Document storage
2.1.2 RESTful HTTP API
2.1.3 Request format and responses
2.1.4 HTTP Headers
2.1.5 Status codes
2.1.6 JSON Basics

3.0 Apache CouchDB Installation

4.0 CouchDB Interactives

5.0 CRUD Operations
5.1 CRUD Operations by using Fauxton UI
5.1.1 Create database in CouchDB Fauxton
5.1.2 Creating and Deleting Documents in CouchDB Fauxton
5.1.3 Show all DBs

6.0 CouchDB Technical
6.1 The CAP Theorem
6.2 BTree Engine
6.3 ACID Properties
6.4 Multi-version Concurrency Control (MVCC)
6.5 Replication
6.6 CouchDB Index
6.7 CouchDB View
6.7.1 Map Reduce Functions

7.0 Mango Queries
7.1 CouchDB Operators
7.2 CouchDB Architecture

8.0 CouchDB Vs MongoDB



1.0 DBMS - Database Management System

A DBMS provides mechanisms for the storage and retrieval of data.

There are three main types of DBMS:

1. RDBMS
2. OLAP
3. NoSQL

1. RDBMS
RDBMS stands for Relational Database Management System.
It is a type of DBMS software that stores data in the form of tables (rows and columns).

2. OLAP
OLAP stands for Online Analytical Processing.
It is a category of software tools that enable users to interactively analyze and explore
multidimensional data for business intelligence purposes. OLAP systems are designed to handle
complex queries and support analytical processing, allowing users to gain insights from large
volumes of data.

3. NoSQL
NoSQL is a type of database management system (DBMS) that is designed to handle and store large
volumes of unstructured and semi-structured data.

NoSQL databases are generally classified into four main categories:

1. Document databases:
These databases store data as semi-structured documents, such as JSON or XML, or in binary forms
such as BSON.
Each document is addressed in the database via a unique key that represents the document.
Software Systems:
• MongoDB
• CouchDB
• Elasticsearch

2. Key-Value stores:
Every data element in the database is stored as a key-value pair.
The data can be retrieved by using the unique key allotted to each element in the database.
The values can be simple data types like strings and numbers, or complex objects.
Software Systems:
• Redis
• Amazon DynamoDB
• Riak


3. Column-Family stores:
A column-family (column-oriented) database is a non-relational database that stores data grouped
by columns instead of by rows.
Software Systems:
• Apache Cassandra
• HBase

4. Graph databases:
Graph databases focus on the relationships between elements. They store data in the form of nodes.
The connections between the nodes are called links or relationships.
Software Systems:
• Neo4j
• Amazon Neptune
• ArangoDB

Fig:1.1.1 Diagram of NoSQL storage


2.0 Apache CouchDB Introduction

CouchDB is an open source NoSQL database that focuses on ease of use.
It is developed by the Apache Software Foundation.
It is fully compatible with the web: CouchDB uses JSON to store data and JavaScript as its query
language. CouchDB works well with modern web and mobile applications.
Documents can be accessed from a web browser via HTTP.

2.1 Technical Overview

2.1.1 Document Storage:

A CouchDB server hosts named databases, which store documents. Each document is uniquely
named in the database, and CouchDB provides a RESTful HTTP API for reading and updating
(add, edit, delete, fetch) database documents.

Documents are the primary unit of data in CouchDB and consist of any number of fields and
attachments. Documents also include metadata that is maintained by the database system. Document
fields are uniquely named and contain values of varying types (text, number, boolean, lists, etc.),
and there is no set limit to text size or element count.

2.1.2 RESTful HTTP API:

The CouchDB API is the primary method of interfacing with a CouchDB instance. Requests are made
using HTTP and are used to request information from the database, store new data, and
perform views and formatting of the information stored within the documents.

Note: I used Postman to interact with a local CouchDB instance (localhost); Fauxton is CouchDB's
built-in web interface.

Postman is an API client for building and sending HTTP requests and inspecting the responses.


2.1.3 Request Format and Responses

CouchDB is accessed through its REST API using HTTP requests. It supports the following HTTP
request methods:

• GET
Requests the specified item. As with normal HTTP requests, the format of the URL defines what is
returned. With CouchDB this can include static items, database documents, and configuration and
statistical information. In most cases the information is returned in the form of a JSON document.

GET URL: http://username:[email protected]:5984/database_name/document_id

GET URL Example:
http://admin:[email protected]:5984/sample/09e05202727db8222e91081030000a8f

• HEAD
The HEAD method is used to get the HTTP headers of a GET request without the body of the
response.
URL: HEAD http://localhost:5984/mydatabase/mydocument

• POST
POST uploads data.
Within CouchDB, POST is used to set values, including uploading documents and setting document
values.
URL: POST http://localhost:5984/mydatabase

• PUT
PUT stores the specified resource at the given URL.
In CouchDB, PUT is used to create new objects, including databases, documents, views and design
documents.
URL: PUT http://localhost:5984/mydatabase/mydocument


To update a document:

1. Send a PUT request with the updated document in the request body.

2. Send a GET request again to verify the update.

• DELETE
Deletes the specified resource, including documents, views, and design documents.
Delete URL: http://admin:[email protected]:5984/sample/0011

A response containing "ok": true means the resource was deleted successfully.


• COPY
A special (non-standard) method that can be used to copy documents and objects.
Example URL:
COPY http://localhost:5984/mydatabase/mydocument
In the request headers, include a Destination header with the ID of the target document
(Destination: newdocument). The copy is created in the same database.
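
The request methods above can be sketched in Python with the standard library. This is a minimal illustration, not part of the original Postman workflow: it only builds the request objects and does not send them. The helper name build_request is hypothetical, and the host, database and document names are the placeholder values used in this section.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:5984"  # assumed address of a local CouchDB instance


def build_request(method, path, body=None, headers=None):
    """Build (but do not send) an HTTP request for a CouchDB resource."""
    data = None
    all_headers = {"Accept": "application/json"}
    if body is not None:
        # CouchDB expects JSON request bodies
        data = json.dumps(body).encode("utf-8")
        all_headers["Content-Type"] = "application/json"
    all_headers.update(headers or {})
    return Request(BASE_URL + path, data=data, headers=all_headers, method=method)


examples = [
    build_request("GET", "/mydatabase/mydocument"),
    build_request("HEAD", "/mydatabase/mydocument"),
    build_request("POST", "/mydatabase", body={"name": "John", "age": 30}),
    build_request("PUT", "/mydatabase/mydocument", body={"name": "Jane", "age": 25}),
    build_request("DELETE", "/mydatabase/mydocument"),
    # Destination carries the ID of the target document
    build_request("COPY", "/mydatabase/mydocument", headers={"Destination": "newdocument"}),
]

for req in examples:
    print(req.get_method(), req.full_url)
```

To actually send one of these requests, it could be passed to urllib.request.urlopen() while a CouchDB server is running at the assumed address.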


2.1.4 HTTP Headers

Because CouchDB uses HTTP for all communication, we need to ensure that the correct HTTP
headers are supplied.
An HTTP header is a field of an HTTP request or response that passes additional context and
metadata about the request or response.

Main Types of HTTP Headers:

1. Request Headers
2. Response Headers

Request Headers

• Accept:
Indicates the media types that the client is willing to accept from the server in the response.
Example: Accept: application/json tells the server that the client prefers to receive JSON
data.

• Content-Type:
Specifies the format of the data in the request body. It tells the server how to interpret
the data.
Example: Content-Type: application/json indicates that the data is in JSON format.
For the majority of CouchDB requests this will be JSON (application/json).

Response Headers

• Cache-Control:
Provides instructions on how caching should be handled by browsers, proxies, and other
intermediary servers.

• Content-Length:
Indicates the size of the response body in bytes.
Example: Content-Length: 1024 tells the recipient to expect 1024 bytes of data in the body.

• Content-Type:
Specifies the media type of the data being sent in the response body.

• ETag:
Contains the entity tag of the resource, which is a unique identifier representing the current
state of the resource.
Example: ETag: "1-6a0c974072b3e621eb22a387033e83d5" can be compared with a previously
obtained ETag value to check whether a document has changed.
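
One way the ETag comparison can be used is a conditional GET: the client sends the ETag it already holds in an If-None-Match request header, and a 304 Not Modified response means the cached copy is still current. The following Python sketch only prepares the headers and interprets the status code; the function names are hypothetical and no request is sent.

```python
def conditional_get_headers(etag=None):
    """Return request headers for a GET, adding If-None-Match when an ETag is cached."""
    headers = {"Accept": "application/json"}
    if etag is not None:
        headers["If-None-Match"] = etag
    return headers


def needs_refresh(status_code):
    """Return True when the cached copy must be replaced by the response body."""
    # 304 Not Modified means the document still matches the cached ETag
    return status_code != 304


cached_etag = '"1-6a0c974072b3e621eb22a387033e83d5"'  # ETag value from the example above
print(conditional_get_headers(cached_etag))
print(needs_refresh(304))
print(needs_refresh(200))
```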


2.1.5 Status Codes

Status Code | Description
200 - OK | The request completed successfully.
201 - Created | The document was created.
202 - Accepted | The request was accepted, but the operation may not have completed yet.
404 - Not Found | The server is unable to find the requested content.
405 - Resource Not Allowed | The HTTP request type is invalid for the requested URL.
409 - Conflict | The request resulted in an update conflict.
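
A client can turn these status codes into readable messages. The sketch below uses Python's standard http.HTTPStatus enumeration together with the descriptions from the table; the names COUCHDB_STATUS_NOTES, describe and is_success are hypothetical. Note that Python's standard name for 405 is "Method Not Allowed".

```python
from http import HTTPStatus

# Descriptions taken from the status code table in this section
COUCHDB_STATUS_NOTES = {
    HTTPStatus.OK: "request completed successfully",
    HTTPStatus.CREATED: "document created",
    HTTPStatus.ACCEPTED: "request accepted",
    HTTPStatus.NOT_FOUND: "requested content not found",
    HTTPStatus.METHOD_NOT_ALLOWED: "invalid HTTP request type",
    HTTPStatus.CONFLICT: "update conflict",
}


def describe(status_code):
    """Return a one-line description for a status code listed in the table."""
    status = HTTPStatus(status_code)
    note = COUCHDB_STATUS_NOTES.get(status, "not covered in the table")
    return f"{status.value} {status.phrase}: {note}"


def is_success(status_code):
    """2xx codes indicate that the request succeeded."""
    return 200 <= status_code < 300


for code in (200, 201, 404, 409):
    print(describe(code), "| success:", is_success(code))
```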

2.1.6 JSON Basics

The majority of requests and responses to CouchDB use JavaScript Object Notation (JSON) for
formatting the content and structure of the data and responses.

JSON is used because it is the simplest and easiest solution for working with data within a web
browser.

JSON supports the same basic types as JavaScript. These are:

• Array - a list of values enclosed in square brackets.
For example: ["one", "two", "three"]

• Boolean - a true or false value. These are written as bare literals, without quotes.
For example: { "value": true }

• Number - an integer or floating-point number.

• Object - a set of key/value pairs (i.e. an associative array, or hash).
The key must be a string, but the value can be any of the supported JSON values.


For example:

{
  "servings": 4,
  "subtitle": "Cooking without fire",
  "cooktime": 30,
  "title": "Churumuri"
}

In CouchDB, the JSON object is used to represent a variety of structures in the document, including
the main CouchDB document.

• String - this should be enclosed in double quotes.
For example: "A String"
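
To see how these JSON types map onto a programming language, the sketch below parses the recipe object from this section with Python's standard json module. The "vegetarian" and "tags" fields are hypothetical additions, included only to show the Boolean and Array types.

```python
import json

text = """
{
  "servings": 4,
  "subtitle": "Cooking without fire",
  "cooktime": 30,
  "title": "Churumuri",
  "vegetarian": true,
  "tags": ["snack", "street food"]
}
"""

recipe = json.loads(text)  # JSON object -> Python dict

# Show which Python type each JSON value was converted to
for key, value in recipe.items():
    print(f"{key}: {type(value).__name__}")

# Serializing and parsing again gives back an equal structure
round_trip = json.loads(json.dumps(recipe))
print(round_trip == recipe)
```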


3.0 Apache CouchDB Installation

3.1 CouchDB Installation for Windows

• To install CouchDB, visit https://couchdb.apache.org/ and click on the download button.

• On the official CouchDB page, select the operating system of your choice.

• After downloading, open the downloaded package and click on install.


4.0 CouchDB Interactives

There are two ways to communicate with CouchDB:
1. CouchDB Fauxton
2. CouchDB cURL

1. CouchDB Fauxton:
Fauxton is the web-based administration interface built into CouchDB. It provides a simple GUI for
interacting with CouchDB, for example to create and delete databases and documents.

Fig:4.0.1 Fauxton UI

2. CouchDB cURL:
cURL is a command-line utility for transferring data to or from a server. It can be used to
communicate with a CouchDB database from the command line.

It supports protocols such as HTTP, HTTPS, FTP, FTPS, TFTP, DICT, TELNET, LDAP and FILE.
CouchDB is accessed over HTTP or HTTPS.

Examples of curl commands for various HTTP methods in CouchDB:

GET Method (Retrieve Document):

curl -X GET http://localhost:5984/mydatabase/mydocument

POST Method (Create Document):

curl -X POST -H "Content-Type: application/json" -d '{"name": "John", "age": 30}' \
  http://localhost:5984/mydatabase


PUT Method (Update Document):

curl -X PUT -H "Content-Type: application/json" -d '{"name": "Jane", "age": 25}' \
  http://localhost:5984/mydatabase/mydocument

When the document already exists, the body must also include its current revision in the "_rev"
field.

DELETE Method (Delete Document):

curl -X DELETE "http://localhost:5984/mydatabase/mydocument?rev=REV"

Here REV is a placeholder for the current revision of the document.

HEAD Method (Check Existence):

curl -I http://localhost:5984/mydatabase/mydocument

COPY Method (Copy Document):

curl -X COPY -H "Destination: newdocument" \
  http://localhost:5984/mydatabase/mydocument

POST Method with Attachment (Create Document with Attachment):

curl -X POST -H "Content-Type: application/json" \
  -d '{
    "_attachments": {
      "myattachment.txt": {
        "content_type": "text/plain",
        "data": "SGVsbG8gV29ybGQh"
      }
    }
  }' \
  http://localhost:5984/mydatabase

The JSON passed with -d is the body of the POST request. The "data" field holds the
Base64-encoded content of the attachment.
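
The "data" value in the attachment example is the Base64 encoding of the text "Hello World!". A minimal Python sketch of how such a request body could be prepared is shown below; the helper name inline_attachment is hypothetical.

```python
import base64
import json


def inline_attachment(name, content, content_type):
    """Return an _attachments entry with the content Base64-encoded."""
    encoded = base64.b64encode(content).decode("ascii")
    return {name: {"content_type": content_type, "data": encoded}}


document = {
    "_attachments": inline_attachment("myattachment.txt", b"Hello World!", "text/plain")
}

# This JSON text is what would be sent as the body of the POST request
print(json.dumps(document, indent=2))
```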


5.0 CRUD Operations


5.1 CRUD Operations by using Fauxton UI

5.1.1 Create database in CouchDB Fauxton

To create a CouchDB database:

1. Click on the Databases tab in the left menu.
2. Click Create Database.
3. Give the database a name.
4. The created database can then be viewed in the list of databases.


5.1.2 Creating a document in CouchDB Fauxton

1. To create a document in a database, click the Create Document button.

2. A new JSON document with a generated _id is shown.

3. We can keep the _id as is or change it, add more fields to the JSON document, and then click
the Create Document button.


5.1.2 Deleting the document


We can delete any document directly in the Fauxton UI.

5.1.3 Show all DBs


To view all the databases, send an HTTP GET request to the following URL:
http://127.0.0.1:5984/_all_dbs/

In Postman, use a GET request.
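
The response to this request is a JSON array of database names. The Python sketch below builds the URL and parses such a response; the function names are hypothetical, and sample_body is an illustrative response written by hand, not output captured from a real server.

```python
import json


def all_dbs_url(base_url):
    """Build the URL of the _all_dbs endpoint for a CouchDB server."""
    return base_url.rstrip("/") + "/_all_dbs"


def parse_all_dbs(response_body):
    """Parse the JSON array of database names returned by _all_dbs."""
    names = json.loads(response_body)
    if not isinstance(names, list):
        raise ValueError("expected a JSON array of database names")
    return names


sample_body = '["_replicator", "_users", "sample"]'  # hypothetical response body

print(all_dbs_url("http://127.0.0.1:5984"))
print(parse_all_dbs(sample_body))
```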


6.0 CouchDB Technical

6.1 The CAP Theorem

The CAP theorem describes a few different strategies for distributing application logic across a
network. CouchDB's solution uses replication to propagate application changes across the
participating nodes.

CAP stands for:
1. Consistency
2. Availability
3. Partition Tolerance

A distributed system can fully guarantee at most two of these three properties at the same time.

1. Consistency:
All database clients see the same data, even with concurrent updates.
In other words, all nodes in the system have the same data at the same time.

2. Availability:
All database clients are always able to access some version of the data. Every request receives a
response, although it is not guaranteed to contain the most recent write.

3. Partition Tolerance:
The database can be split over multiple servers.
The system continues to operate despite arbitrary message loss or failure of part of the system.

6.2 BTree Engine

A BTree is a self-balancing tree data structure that is commonly used in database systems and file
systems to store and manage large sets of data in sorted order, for efficient search, insertion and
deletion operations.

The BTree structure is designed to maintain balance, ensuring that all leaf nodes are at the same
level, which helps in maintaining efficient performance.

BTrees are frequently used in database management systems (DBMS) to implement indexes.

In the context of databases, the term 'BTree engine' refers to the underlying storage engine or
index structure used by the DBMS. CouchDB uses a B-tree structure to store its documents and views.

Many relational databases, such as MySQL and PostgreSQL, also use B-trees to implement indexes,
providing fast and balanced access to data.
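
The benefit of keeping keys in sorted order can be illustrated with a much simpler structure than a real B-tree. The Python sketch below is a deliberately simplified stand-in: it keeps keys in a sorted list and uses binary search (the standard bisect module) for lookups and range scans. It is not a B-tree and not how CouchDB is implemented; the class name SortedIndex is hypothetical.

```python
import bisect


class SortedIndex:
    """Keeps keys in sorted order so lookups and range scans use binary search."""

    def __init__(self):
        self._keys = []
        self._values = []

    def insert(self, key, value):
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            self._values[pos] = value  # key already present: overwrite
        else:
            self._keys.insert(pos, key)
            self._values.insert(pos, value)

    def get(self, key):
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            return self._values[pos]
        return None

    def range(self, start, end):
        """Return all (key, value) pairs with start <= key <= end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_right(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


index = SortedIndex()
for key, value in [("b", 2), ("a", 1), ("d", 4), ("c", 3)]:
    index.insert(key, value)

print(index.get("c"))
print(index.range("b", "c"))
```

A real B-tree gets the same sorted-order benefits while also keeping insertions cheap, because it splits data into fixed-size nodes instead of shifting one large array.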


6.3 ACID Properties

CouchDB implements ACID properties for data storage and document updates.

Atomicity: CouchDB ensures document updates are "all or nothing." Either the entire update
succeeds and gets committed, or, if any failure occurs, the document is left untouched.

Consistency: Once data in CouchDB has been committed, it is not modified or overwritten in
place. An update is written as a new revision of the document.

Isolation: CouchDB does not lock documents. Readers always see a consistent version of a
document, and when two clients write to the same document at the same time, the write based on
an outdated revision is rejected with a conflict error.

Durability:
• CouchDB ensures durability by writing data to disk and, in a cluster, by maintaining multiple
copies (replicas) of the data across nodes.
• Once a write operation is acknowledged, CouchDB guarantees that the data will persist even
in the event of hardware failures or crashes.

6.4 Multi-version Concurrency Control(MVCC)

MVCC is a core concept that plays a crucial role in handling concurrent access to the database.
The MVCC mechanism in CouchDB enables multiple users or transactions to work with the database
concurrently while maintaining consistency and detecting conflicts.

Here is how MVCC works in CouchDB:

• Every document in CouchDB is assigned a unique identifier ('_id') and a revision ID ('_rev').
The '_rev' represents the version or revision of the document. When a document is updated, a
new revision ('_rev') is created.
• When a client wants to update a document, it needs to provide the current revision ('_rev')
along with the updated document. This ensures that the client is working with the latest version
of the document.
• Concurrency control: CouchDB uses the revision IDs to implement concurrency control.
If two clients attempt to update the same document simultaneously, the client with the
outdated revision receives a conflict error (409 Conflict). This conflict detection mechanism
ensures that updates are applied in a consistent and ordered manner.
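
The revision check described above can be sketched with a small in-memory model in Python. This is only an illustration of the idea, not CouchDB's implementation: the class TinyStore and the exception ConflictError are hypothetical, and the revision strings use a simplified "N-demo" form instead of CouchDB's real "N-hash" revisions.

```python
class ConflictError(Exception):
    """Raised when an update is based on an outdated revision (409 Conflict in CouchDB)."""


class TinyStore:
    """In-memory model of revision-checked document updates."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, fields, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current["_rev"] if current else None
        # The caller must present the revision it last saw
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected {current_rev}, got {rev}")
        generation = int(current_rev.split("-")[0]) + 1 if current_rev else 1
        new_rev = f"{generation}-demo"  # simplified revision ID
        self._docs[doc_id] = {"_id": doc_id, "_rev": new_rev, **fields}
        return new_rev

    def get(self, doc_id):
        return dict(self._docs[doc_id])


store = TinyStore()
rev1 = store.put("user1", {"name": "John"})            # create: no revision needed
rev2 = store.put("user1", {"name": "Jane"}, rev=rev1)  # client A updates with the current revision

try:
    store.put("user1", {"name": "Jim"}, rev=rev1)      # client B still holds the old revision
    conflict = False
except ConflictError as error:
    conflict = True
    print("conflict:", error)

print(store.get("user1"))
```

After a conflict, the usual pattern is for the client to fetch the document again, reapply its change on top of the latest revision, and retry the update.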

6.5 Replication

CouchDB supports replication, allowing us to synchronize data between different CouchDB
databases and instances.

Replication is a fundamental feature of CouchDB that provides data distribution, fault tolerance
and scalability.


Replication Basics:
1. One-way replication
Data is copied from the source database to the target database.

2. Two-way (bidirectional) replication
Data is synchronized between the source and target databases in both directions.

URL for Replication: http://username:[email protected]:5984/_replicate

Request body (sent with POST):

{
  "source": "http://admin:admin123@localhost:5984/database1",
  "target": "http://admin:[email protected]:5984/database2",
  "continuous": false
}

Note: "continuous": false means the replication runs only once.
"continuous": true means the replication keeps running and copies new changes as they occur.

One of CouchDB's strengths is the ability to synchronize two copies of the same database.
This makes it possible to distribute data across several nodes or data centers, and also to move
data closer to the clients.

Replication involves a source and a destination database, which can be on the same or on different
CouchDB instances.

The aim of replication is that, at the end of the process, all active documents in the source
database are also in the destination database, and all documents that were deleted in the source
database are also deleted in the destination database.
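
As a small illustration, the request bodies for one-way and two-way replication can be built in Python as shown below. Two-way replication is expressed here as two one-way replications in opposite directions. The function names are hypothetical, and the database URLs are placeholders without credentials; each body would be sent with POST to the _replicate endpoint.

```python
import json


def replication_body(source, target, continuous=False):
    """Return the JSON body for one replication request."""
    return {"source": source, "target": target, "continuous": continuous}


def two_way_replication(db_a, db_b, continuous=False):
    """Bidirectional sync expressed as two one-way replications."""
    return [
        replication_body(db_a, db_b, continuous),
        replication_body(db_b, db_a, continuous),
    ]


db1 = "http://localhost:5984/database1"  # placeholder URLs
db2 = "http://localhost:5984/database2"

one_way = replication_body(db1, db2)
both_ways = two_way_replication(db1, db2, continuous=True)

print(json.dumps(one_way))
for body in both_ways:
    print(json.dumps(body))
```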

6.6 Indexes

In CouchDB, indexes are known as "views". Views are special functions defined within design
documents that generate indexes for querying documents in a database. These views are created
using map functions, which define how documents should be indexed.

6.7 CouchDB View

In CouchDB, views act as pre-calculated indexes that allow efficient retrieval of documents based
on specific criteria.

6.7.1 Map Reduce Functions

MapReduce is a programming model and processing technique for distributed and parallel
computing.
In the context of CouchDB, MapReduce functions are used to query and aggregate data stored in
the database.
These functions are applied to the documents in the database.


Map Function job:

The map function works like a sorter: it goes through each document and looks for specific
information.
The map function is a JavaScript function that takes a document ('doc') as a parameter.

Map Function: all
function (doc) {
  if (doc.type === "people") {
    emit(doc._id, null);
  }
}

doc: Inside the map function, our logic determines whether the document needs to be mapped or not.
If yes, we use the emit() function to index it.

The emit() function takes two parameters:
1) The key to index.
2) The value to emit.
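
To make the role of emit() concrete, the following Python sketch simulates how a view index is built from a map function. It is a model of the idea only: in CouchDB the map function is written in JavaScript and emit() is provided by the view server, whereas here emit is passed in as a parameter. The names build_view and people_map and the sample documents are hypothetical.

```python
def build_view(map_function, documents):
    """Run map_function over every document and collect the emitted rows, sorted by key."""
    rows = []
    for doc in documents:
        def emit(key, value, _doc=doc):
            # Each emitted row remembers which document produced it
            rows.append({"id": _doc["_id"], "key": key, "value": value})
        map_function(doc, emit)
    return sorted(rows, key=lambda row: row["key"])


def people_map(doc, emit):
    """Python counterpart of the map function: index only documents of type 'people'."""
    if doc.get("type") == "people":
        emit(doc["_id"], None)


docs = [
    {"_id": "p2", "type": "people", "name": "Jane"},
    {"_id": "o1", "type": "order", "total": 12},
    {"_id": "p1", "type": "people", "name": "John"},
]

rows = build_view(people_map, docs)
for row in rows:
    print(row)
```

Only the two documents of type "people" appear in the result, and the rows are ordered by key, which is what makes key lookups on a view efficient.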

URL to access the view in CouchDB:

http://admin:[email protected]:5984/pycouchdb/_design/mydesign/_view/new-view
Here pycouchdb is the database name, mydesign is the design document name, and new-view is the
name of the view inside that design document.

Reduce Function

In CouchDB, a reduce function is used to aggregate data emitted by the map function of a view.

It takes a set of key-value pairs and produces a single result, such as summing up values or finding
maximum/minimum values.

The function iterates over the values and computes the desired result, returning a single value as
the output.

Note: In CouchDB, the reduce function is optional.
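
The aggregation step can be sketched in Python as follows. This is a simplified model: a real CouchDB reduce function is written in JavaScript and receives keys, values and a rereduce flag, while the hypothetical reduce_sum below only receives the list of values. The emitted rows are made-up sample data.

```python
from itertools import groupby


def reduce_sum(values):
    """Reduce a list of emitted values to a single total."""
    return sum(values)


def run_reduce(rows, reduce_function, group=False):
    """Apply reduce_function to (key, value) rows, over all rows or once per key."""
    ordered = sorted(rows, key=lambda row: row[0])
    if not group:
        return reduce_function([value for _, value in ordered])
    return {
        key: reduce_function([value for _, value in key_rows])
        for key, key_rows in groupby(ordered, key=lambda row: row[0])
    }


emitted = [("fruit", 3), ("veg", 5), ("fruit", 4)]  # hypothetical rows from a map function

total = run_reduce(emitted, reduce_sum)
per_key = run_reduce(emitted, reduce_sum, group=True)

print(total)
print(per_key)
```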


7.0 Mango Queries


Mango queries in CouchDB allow us to perform advanced queries on documents using a
JSON-based query language.

Mango is designed to provide a flexible and powerful querying mechanism that goes beyond basic
views.
There are two parts to a Mango query: the index and the selector.
Example of a simple Mango query with a selector:
{
  "selector": {
    "field1": "value1",
    "field2": {"$gt": 42}
  },
  "fields": ["field1", "field2"],
  "sort": [{"field1": "asc"}],
  "limit": 10,
  "skip": 0
}

"selector": This is the main part of the query, where you define the conditions that documents must
meet to be included in the result set.
In the above example, it only selects documents where
field1 equals "value1" and
field2 is greater than 42.

"fields": Specifies which fields should be included in the result.
In this case, only "field1" and "field2" will be returned.

"sort": Defines the sorting order of the result set. It is an array of field and direction pairs.
In this example, the result set will be sorted in ascending order on "field1".

"limit": Specifies the maximum number of documents to return.
In this case, at most 10 documents are returned.

"skip": Specifies the number of documents to skip before starting to include documents in the
result set.
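
A query like the one above can be assembled programmatically. The Python sketch below builds the query as a dictionary and serializes it to JSON; the helper name mango_query is hypothetical. The resulting text would be sent as the body of a POST request to the _find endpoint of a database.

```python
import json


def mango_query(selector, fields=None, sort=None, limit=None, skip=None):
    """Assemble a Mango query, leaving out the optional parts that were not given."""
    query = {"selector": selector}
    if fields is not None:
        query["fields"] = fields
    if sort is not None:
        query["sort"] = sort
    if limit is not None:
        query["limit"] = limit
    if skip is not None:
        query["skip"] = skip
    return query


query = mango_query(
    selector={"field1": "value1", "field2": {"$gt": 42}},
    fields=["field1", "field2"],
    sort=[{"field1": "asc"}],
    limit=10,
    skip=0,
)

body = json.dumps(query, indent=2)
print(body)
```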


7.1 CouchDb Operators

In CouchDB, operators are used in Mango selectors to query documents and filter results.
Operators are identified by the dollar ($) prefix in the field name.

Equality Operator ($eq):


Used to find documents where the specified field is equal to a certain value.
Example: { "field": { "$eq": "value" } }

Comparison Operators ($gt, $gte, $lt, $lte):


Used to perform greater than, greater than or equal to, less than, and less than or equal to
comparisons respectively.
Example: { "field": { "$gt": 10 } }

Inequality Operator ($ne):


Used to find documents where the specified field is not equal to a certain value.
Example: { "field": { "$ne": "value" } }

Logical Operators ($and, $or, $not):


Used to perform logical AND, OR, and NOT operations respectively.
Example (AND): { "$and": [ { "field1": "value1" }, { "field2": "value2" } ] }

Existence Operator ($exists):


Used to find documents where the specified field exists or does not exist.
Example: { "field": { "$exists": true } }

Array Operators ($in, $nin, $all):


Used to match documents where the specified field is in a given array, not in a given array, or
contains all the elements of a given array respectively.
Example: { "field": { "$in": ["value1", "value2"] } }

Regex Operator ($regex):


Used to perform regular expression pattern matching on string fields.
Example: { "field": { "$regex": "^pattern" } }

Element Operators ($type, $size):


Used to match documents based on the type of a field ($type) or the size of an array field ($size).
Example: { "field": { "$type": "number" } }
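
To show what these operators mean, the following Python sketch evaluates a selector against a document held in memory. It is a simplified model written for illustration, not CouchDB's implementation: it supports only a subset of the operators listed above, works on top-level fields only, and does not reproduce CouchDB's rules for comparing values of different types. The names OPERATORS and matches and the sample documents are hypothetical.

```python
import re

_MISSING = object()  # marks a field that is absent from the document

OPERATORS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
    "$exists": lambda value, arg: bool(arg),  # only called when the field is present
    "$regex": lambda value, arg: isinstance(value, str) and re.search(arg, value) is not None,
    "$size": lambda value, arg: isinstance(value, list) and len(value) == arg,
}


def matches(doc, selector):
    """Return True if doc satisfies every condition in selector."""
    for field, condition in selector.items():
        if field == "$and":
            if not all(matches(doc, sub) for sub in condition):
                return False
        elif field == "$or":
            if not any(matches(doc, sub) for sub in condition):
                return False
        else:
            value = doc.get(field, _MISSING)
            if not isinstance(condition, dict):
                condition = {"$eq": condition}  # a bare value is shorthand for $eq
            for op, arg in condition.items():
                if value is _MISSING:
                    # A missing field only matches {"$exists": false}
                    if not (op == "$exists" and arg is False):
                        return False
                elif not OPERATORS[op](value, arg):
                    return False
    return True


people = [
    {"name": "John", "age": 30, "tags": ["a", "b"]},
    {"name": "Jane", "age": 25},
    {"name": "Bob"},
]

selector = {"$and": [{"age": {"$gte": 25}}, {"name": {"$regex": "^J"}}]}
print([p["name"] for p in people if matches(p, selector)])
```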


7.2 CouchDb architecture


8.0 CouchDB Vs MongoDB

Comparison Feature | CouchDB | MongoDB
Data Model | Follows the document-oriented model; data is presented in JSON format. | Follows the document-oriented model; data is presented in BSON format.
Interface | Uses an HTTP/REST based interface. | Uses a custom binary protocol over TCP/IP.
Object Storage | A database contains documents. | A database contains collections, and a collection contains documents.
Query Method | Follows the Map/Reduce query method (JavaScript + others). | Follows Map/Reduce (JavaScript) for creating collections, plus an object-based query language.
Replication | Supports master-master replication with custom conflict resolution functions. | Supports master-slave replication.
Concurrency | Follows MVCC (Multi-Version Concurrency Control). | Update in-place.
Performance | CouchDB is safer than MongoDB. | -
Consistency | CouchDB is eventually consistent. | MongoDB is strongly consistent.
Written in | Erlang | C++
Preferences | CouchDB favors availability. | MongoDB favors consistency.
