Notes On Asymptotic Notation

Consider two algorithms with the following time complexities:
Algorithm 1 : 5n² + 2n + 1
Algorithm 2 : 10n² + 8n + 3
Generally, when we analyze an algorithm, we consider the time complexity for larger
values of the input size (i.e. the value of 'n'). In the above two time complexities, for larger
values of 'n' the term '2n + 1' in algorithm 1 is far less significant than the term '5n²', and the
term '8n + 3' in algorithm 2 is far less significant than the term '10n²'.
Here, for larger values of 'n', the value of the most significant terms (5n² and 10n²) is much
larger than the value of the least significant terms (2n + 1 and 8n + 3). So for larger values
of 'n' we ignore the least significant terms when representing the overall time required by an
algorithm. In asymptotic notation, we use only the most significant term to represent the
time complexity of an algorithm.
Mainly, we use three types of asymptotic notations, and those are as follows...
1. Big - Oh (O)
2. Big - Omega (Ω)
3. Big - Theta (Θ)
Big - Oh Notation (O)
Consider the function f(n) as the time complexity of an algorithm and g(n) as its most
significant term. If f(n) <= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1,
then we can represent f(n) as O(g(n)).
f(n) = O(g(n))
Consider the following graph, drawn for the values of f(n) and C g(n), with the input
value (n) on the X-axis and the time required on the Y-axis.
In the above graph, after a particular input value n0, C g(n) is always greater than f(n),
which indicates the algorithm's upper bound.
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as O(g(n)), then it must satisfy f(n) <= C g(n) for some
constants C > 0 and n0 >= 1 and all n >= n0:
f(n) <= C g(n)
⇒ 3n + 2 <= C n
The above condition is TRUE for C = 4 and all n >= 2.
By using Big - Oh notation we can represent the time complexity as follows...
3n + 2 = O(n)
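As a quick sanity check (a minimal sketch in C, not part of the original notes; the
constants C = 4 and n0 = 2 come from the example above), the following program
verifies the bound over a sample range:

#include <stdio.h>

int main(void) {
    /* f(n) = 3n + 2, g(n) = n, C = 4, n0 = 2 */
    for (int n = 2; n <= 1000; n++) {
        if (3 * n + 2 > 4 * n) {
            printf("bound fails at n = %d\n", n);
            return 1;
        }
    }
    printf("3n + 2 <= 4n holds for all n in [2, 1000]\n");
    return 0;
}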
Big - Omega Notation (Ω)
Consider the function f(n) as the time complexity of an algorithm and g(n) as its most
significant term. If f(n) >= C g(n) for all n >= n0, for some constants C > 0 and n0 >= 1,
then we can represent f(n) as Ω(g(n)).
f(n) = Ω(g(n))
Consider the following graph, drawn for the values of f(n) and C g(n), with the input
value (n) on the X-axis and the time required on the Y-axis.
In the above graph, after a particular input value n0, C g(n) is always less than f(n),
which indicates the algorithm's lower bound.
Example
Consider the following f(n) and g(n)...
f(n) = 3n + 2
g(n) = n
If we want to represent f(n) as Ω(g(n)), then it must satisfy f(n) >= C g(n) for some
constants C > 0 and n0 >= 1 and all n >= n0:
f(n) >= C g(n)
⇒ 3n + 2 >= C n
The above condition is TRUE for C = 1 and all n >= 1.
By using Big - Omega notation we can represent the time complexity as follows...
3n + 2 = Ω(n)
Big - Theta Notation (Θ)
Consider the function f(n) as the time complexity of an algorithm and g(n) as its most
significant term. If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, for some constants
C1 > 0, C2 > 0 and n0 >= 1, then we can represent f(n) as Θ(g(n)).
f(n) = Θ(g(n))
Consider the following graph, drawn for the values of f(n), C1 g(n) and C2 g(n), with the
input value (n) on the X-axis and the time required on the Y-axis.
For example, for f(n) = 3n + 2 and g(n) = n, the condition 1·n <= 3n + 2 <= 4n holds for
all n >= 2, so 3n + 2 = Θ(n).
The following two additional asymptotic notations are also used to represent the time
complexity of algorithms.
Little ο asymptotic notation
Big-O is used as a tight upper bound on the growth of an algorithm’s
effort (this effort is described by the function f(n)), even though, as
written, it can also be a loose upper bound. “Little-ο” (ο()) notation is used
to describe an upper bound that cannot be tight.
Definition: Let f(n) and g(n) be functions that map positive integers to
positive real numbers. We say that f(n) is ο(g(n)) (or f(n) ∈ ο(g(n))) if
for any real constant c > 0, there exists an integer constant n0 ≥ 1 such
that f(n) < c g(n) for every integer n ≥ n0.
For example, 3n + 2 = ο(n²), because for any c > 0 the inequality 3n + 2 < c n²
holds for all sufficiently large n.
Amortized Analysis
This analysis is used when an occasional operation is very slow, but most of
the operations, which execute very frequently, are fast. Data structures that
need amortized analysis include hash tables, disjoint sets, etc.
In a hash table, most of the time the searching time complexity is O(1),
but sometimes it executes O(n) operations. When we want to search or insert
an element in a hash table, for most of the cases it is a constant-time
task, but when a collision occurs, it needs O(n) operations for collision
resolution.
Aggregate Method
The aggregate method is used to find the total cost. If we want to add a
bunch of data, then we find the amortized cost by this formula.
For a sequence of n operations, the amortized cost per operation is the total
cost T(n) divided by n:
Amortized cost = T(n) / n
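As an illustration (a sketch not in the original notes, using the classic doubling
dynamic array), the program below appends n items and counts the total cost,
including the occasional expensive copy when the array grows; dividing by n gives
an amortized cost that stays below 3 per append:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n = 1000000;
    int capacity = 1, size = 0;
    int *arr = malloc(capacity * sizeof(int));
    long long total_cost = 0;

    for (int i = 0; i < n; i++) {
        if (size == capacity) {              /* occasional slow operation */
            capacity *= 2;
            arr = realloc(arr, capacity * sizeof(int));
            total_cost += size;              /* cost of copying the old items */
        }
        arr[size++] = i;                     /* frequent fast operation */
        total_cost += 1;                     /* cost of the append itself */
    }
    printf("amortized cost per append = %.2f\n", (double)total_cost / n);
    free(arr);
    return 0;
}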
Probabilistic Data Structures
A probabilistic data structure works with large data sets, where we want to
perform operations such as finding unique items in a given data set, finding
the most frequent item, or testing whether certain items exist. To do such
operations, a probabilistic data structure uses multiple hash functions to
randomize and represent a set of data. The more hash functions used, the more
accurate the result.
Things to remember
A deterministic data structure can also perform all the operations that a
probabilistic data structure does, but only with small data sets. As stated
earlier, if the data set is too big and cannot fit into memory, then the
deterministic data structure fails and is simply not feasible. Also, in the case
of a streaming application where data is required to be processed in one go
with incremental updates, it is very difficult to manage with a
deterministic data structure.
Use Cases
1. Analyze big data set
2. Statistical analysis
3. Mining terabytes of data sets, etc.
Popular probabilistic data structures
1. Bloom filter
2. Count-Min Sketch
3. HyperLogLog
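As a small example (a sketch, not from the original notes; the two hash
functions are simple string hashes chosen only for illustration), here is a
minimal Bloom filter in C: adding an item sets a few bits, and a query answers
"definitely not present" or "possibly present":

#include <stdio.h>

#define BITS 1024

static unsigned char filter[BITS / 8];

/* Two simple string hashes (djb2 and FNV-1a); a production
   Bloom filter would use stronger, independent hash functions. */
static unsigned hash1(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % BITS;
}
static unsigned hash2(const char *s) {
    unsigned h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h % BITS;
}

static void bloom_add(const char *s) {
    filter[hash1(s) / 8] |= 1u << (hash1(s) % 8);
    filter[hash2(s) / 8] |= 1u << (hash2(s) % 8);
}

/* 0 means definitely absent; 1 means possibly present. */
static int bloom_query(const char *s) {
    return ((filter[hash1(s) / 8] >> (hash1(s) % 8)) & 1) &&
           ((filter[hash2(s) / 8] >> (hash2(s) % 8)) & 1);
}

int main(void) {
    bloom_add("apple");
    bloom_add("banana");
    printf("apple:  %d\n", bloom_query("apple"));  /* 1: possibly present */
    printf("cherry: %d\n", bloom_query("cherry")); /* almost surely 0 */
    return 0;
}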
A binary tree is a special data structure used for data storage purposes. A binary tree has
a special condition that each node can have a maximum of two children. A binary tree
has the benefits of both an ordered array and a linked list, as search is as quick as in a
sorted array and insertion or deletion operations are as fast as in a linked list.
Important Terms
Following are the important terms with respect to trees.
Path − Path refers to the sequence of nodes along the edges of a tree.
Root − The node at the top of the tree is called root. There is only one root per tree
and one path from the root node to any node.
Parent − Any node except the root node has one edge upward to a node called parent.
Child − The node below a given node connected by its edge downward is called its
child node.
Leaf − The node which does not have any child node is called the leaf node.
Subtree − Subtree represents the descendants of a node.
Visiting − Visiting refers to checking the value of a node when control is on the node.
Traversing − Traversing means passing through nodes in a specific order.
Levels − Level of a node represents the generation of a node. If the root node is at
level 0, then its next child node is at level 1, its grandchild is at level 2, and so on.
Keys − Key represents the value of a node, based on which a search operation is
carried out for a node.
Tree Node
The code to write a tree node would be similar to what is given below. It has a data
part and references to its left and right child nodes.
struct node {
    int data;
    struct node *leftChild;
    struct node *rightChild;
};
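A small helper to allocate and initialize such a node might look like this (a
sketch; the name createNode is ours, and it is reused in the insert algorithm
later in these notes):

#include <stdlib.h>

struct node *createNode(int data) {
    struct node *n = malloc(sizeof(struct node));
    if (n != NULL) {
        n->data = data;
        n->leftChild = NULL;
        n->rightChild = NULL;
    }
    return n;
}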
A binary search tree is a data structure that allows us to quickly maintain a sorted
list of numbers. It is called a binary tree because each tree node has a maximum
of two children. It is called a search tree because it can be used to search for the
presence of a number in O(log(n)) time.
The properties that separate a binary search tree from a regular binary tree are:
All nodes of the left subtree are less than the root node
All nodes of the right subtree are greater than the root node
Both subtrees of each node are also BSTs, i.e. they have the above two
properties (see the sketch after this list).
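These properties can be checked directly. Below is a minimal sketch (not part of
the original notes) that verifies the BST property recursively by narrowing the
allowed key range at each step; keys equal to INT_MIN or INT_MAX would need
special handling:

#include <limits.h>
#include <stddef.h>

/* Returns 1 if every key in the subtree lies strictly
   between lo and hi, i.e. the subtree is a valid BST. */
int isBST(struct node *node, int lo, int hi) {
    if (node == NULL)
        return 1;
    if (node->data <= lo || node->data >= hi)
        return 0;
    return isBST(node->leftChild, lo, node->data) &&
           isBST(node->rightChild, node->data, hi);
}

/* Usage: isBST(root, INT_MIN, INT_MAX) */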
Binary Search Tree Application
In multilevel indexing in the database
For dynamic sorting
For managing virtual memory areas in Unix kernel
Operations:
The operations of a binary search tree are
1. Insert
2. Delete
3. Search
Search Operation
The algorithm depends on the property of BST that if each left subtree has
values below root and each right subtree has values above the root.
If the value is below the root, we can say for sure that the value is not in the right
subtree; we need to only search in the left subtree and if the value is above the
root, we can say for sure that the value is not in the left subtree; we need to only
search in the right subtree.
Algorithm:
if root == null
    return null
if number == root->data
    return root->data
if number < root->data
    return search(root->left, number)
if number > root->data
    return search(root->right, number)
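Translated to C using the node structure from earlier (a sketch; the pseudocode
above abbreviates the fields as left/right, while the struct names them
leftChild/rightChild):

#include <stddef.h>

struct node *search(struct node *root, int number) {
    if (root == NULL)
        return NULL;                              /* not found */
    if (number == root->data)
        return root;                              /* found */
    if (number < root->data)
        return search(root->leftChild, number);   /* go left */
    return search(root->rightChild, number);      /* go right */
}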
Insert Operation
Inserting a value in the correct position is similar to searching, because we try to
maintain the rule that the left subtree is less than the root and the right subtree is
larger than the root.
We keep going to either the right subtree or the left subtree depending on the value,
and when we reach a point where the left or right subtree is null, we put the new node there.
Algorithm:
if node == null
    return createNode(data)
if data < node->data
    node->left = insert(node->left, data)
else if data > node->data
    node->right = insert(node->right, data)
return node
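In C, using the createNode helper and the node structure from earlier (a sketch
with the same field-name caveat as the search example):

struct node *insert(struct node *node, int data) {
    if (node == NULL)
        return createNode(data);    /* empty spot found: place the new node */
    if (data < node->data)
        node->leftChild = insert(node->leftChild, data);
    else if (data > node->data)
        node->rightChild = insert(node->rightChild, data);
    /* equal keys are ignored: no duplicates are stored */
    return node;
}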
Time Complexity
Search, insertion and deletion each take O(log n) time on average and O(n) time in
the worst case (a skewed tree). Here, n is the number of nodes in the tree.
Space Complexity
The space complexity for all of these operations is O(n).