
DS Introduction To Tree

The document discusses different types of tree data structures, including their properties, representations, and traversal algorithms. It defines trees as non-linear data structures where multiple items can follow each node. Common tree types discussed include binary trees, where each node has at most two children, and B-trees, which are balanced m-way search trees. The document also explains different traversal orders for trees like preorder, postorder, level order and in-order, and provides examples and pseudocode for implementing tree traversals.

Uploaded by

Prashant Jain
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction

Lists, stacks, and queues are all linear structures: in all three data structures, one item follows another. Trees will be our first non-linear structure:

- More than one item can follow another.
- The number of items that follow can vary from one item to another.

Trees have many uses:


- representing family genealogies
- as the underlying structure in decision-making algorithms
- to represent priority queues (a special kind of tree called a heap)
- to provide fast access to information in a database (a special kind of tree called a B-tree)

Here is the conceptual picture of a tree (of letters):

- each letter represents one node
- the arrows from one node to another are called edges
- the topmost node (with no incoming edges) is the root (node A)
- the bottom nodes (with no outgoing edges) are the leaves (nodes D, I, G & J)

So a (computer science) tree is kind of like an upside-down real tree...

A path in a tree is a sequence of (zero or more) connected nodes; for example, here are 3 of the paths in the tree shown above:

The length of a path is the number of nodes in the path, e.g.:

The height of a tree is the length of the longest path from the root to a leaf; for the above example, the height is 4 (because the longest path from the root to a leaf is A C E G, or A C E J). An empty tree has height = 0.

The depth of a node is the length of the path from the root to that node; for the above example:

- the depth of J is 4
- the depth of D is 3
- the depth of A is 1

The level of a node is the length of the path from the root to that node (the same as its depth). The depth of a tree is the maximum level of any node in the tree. The degree of a node is the number of subtrees (children) of that node. Nodes with degree = 0 are called leaves.

Given two connected nodes like this:

Node A is called the parent, and node B is called the child. A subtree of a given node includes one of its children and all of that child's descendants. The descendants of a node n are all nodes reachable from n (n's children, its children's children, etc.). In the original example, node A has three subtrees:
1. B, D
2. I
3. C, E, F, G, J

An important special kind of tree is the binary tree. In a binary tree:

- Each node has 0, 1, or 2 children.
- Each child is either a left child or a right child.

Here are two examples of binary trees that are different:

The two trees are different because the children of node B are different: in the first tree, B's left child is D and its right child is E; in the second tree, B's left child is E and its right child is D. Also note that lines are used instead of arrows. We sometimes do this because it is clear that the edge goes from the higher node to the lower node.

Representing Trees
Since a binary-tree node never has more than two children, a node can be represented using a class with 3 fields: one for the data in the node, plus two child pointers:
    class BinaryTreenode {
        // *** fields ***
        private Object data;
        private BinaryTreenode leftChild;
        private BinaryTreenode rightChild;
    }

However, since a general-tree node can have an arbitrary number of children, a fixed number of child-pointer fields won't work. Instead, we can use a List to keep all of the child pointers:
    class Treenode {
        // *** fields ***
        private Object data;
        private List children;
    }

(Note that the items in the List will be of type Treenode.)
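To make the List-based representation concrete, here is a minimal sketch of a generic variant of Treenode together with a height computation that follows the definition given earlier (height counted in nodes, empty tree = 0). The names GeneralTreeDemo, addChild, height and sampleTree are assumptions introduced for illustration, and the sample tree is made up rather than taken from the figures.

```java
import java.util.ArrayList;
import java.util.List;

class GeneralTreeDemo {
    // Hypothetical generic variant of the Treenode class above.
    static class Treenode<T> {
        private final T data;
        private final List<Treenode<T>> children = new ArrayList<>();

        Treenode(T data) { this.data = data; }

        T getData() { return data; }
        List<Treenode<T>> getChildren() { return children; }

        // Creates a child holding childData, attaches it, and returns it.
        Treenode<T> addChild(T childData) {
            Treenode<T> child = new Treenode<>(childData);
            children.add(child);
            return child;
        }
    }

    // Height counted in nodes on the longest root-to-leaf path; an empty tree has height 0.
    static <T> int height(Treenode<T> root) {
        if (root == null) return 0;
        int tallest = 0;
        for (Treenode<T> child : root.getChildren()) {
            tallest = Math.max(tallest, height(child));
        }
        return 1 + tallest;
    }

    // Made-up tree: A has children B, I, C; B has D; C has E; E has G.
    static Treenode<Character> sampleTree() {
        Treenode<Character> a = new Treenode<>('A');
        a.addChild('B').addChild('D');
        a.addChild('I');
        a.addChild('C').addChild('E').addChild('G');
        return a;
    }

    public static void main(String[] args) {
        System.out.println(height(sampleTree())); // longest path is A C E G
    }
}
```

Using a typed List of children avoids the casts that the raw List in the class above would require.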

As we know, a list can be represented using either an array or a linked-list. For example, consider this general tree (a simplified version of the original example):

For the array representation of the List (where the array has an initial size of 4) we would have:

TEST YOURSELF #1

Draw a similar picture of the tree when the List fields are implemented using linked lists.

Tree Traversals
It is often useful to iterate through the nodes in a tree:

- to print all values
- to determine if there is a node with some property
- to make a copy

When we iterated through a List, we started with the first node and visited each node in turn. Since each node is visited, the best possible complexity is O(N) for a tree with N nodes. All of our traversal methods will achieve this complexity.

For trees, there are many different orders in which we might visit the nodes. There are three common traversal orders for general trees, and one more for binary trees: preorder, postorder, level order, and in-order, all described below. We will use the following tree to illustrate each traversal:

Preorder

A preorder traversal can be defined (recursively) as follows:

1. visit the root
2. perform a preorder traversal of the first subtree of the root
3. perform a preorder traversal of the second subtree of the root
4. etc. for all the subtrees of the root

If we use a preorder traversal on the example tree given above, and we print the letter in each node when we visit that node, the following will be printed: A B D C E G F H I.

Postorder

A postorder traversal is similar to a preorder traversal, except that the root of each subtree is visited last rather than first:

1. perform a postorder traversal of the first subtree of the root
2. perform a postorder traversal of the second subtree of the root
3. etc. for all the subtrees of the root
4. visit the root

If we use a postorder traversal on the example tree given above, and we print the letter in each node when we visit that node, the following will be printed: D B G E H I F C A.

Level order

The idea of a level-order traversal is to visit the root, then visit all nodes "1 level away" (depth 2) from the root (left to right), then all nodes "2 levels away" (depth 3) from the root, etc. For the example tree, the goal is to visit the nodes in the following order:

A level-order traversal requires using a queue (rather than a recursive algorithm, which implicitly uses a stack). Here's how to print the data in a tree in level order, using a queue Q, and using an iterator to access the children of each node (we assume that the root node is called root, and that the Treenode class provides a getChildren method):
    Q.enqueue(root);
    while (!Q.empty()) {
        // visit the node at the front of the queue
        Treenode n = Q.dequeue();
        System.out.print(n.getData());
        // then enqueue its children, left to right
        List L = n.getChildren();
        Iterator it = L.iterator();
        while (it.hasNext()) {
            Q.enqueue(it.next());
        }
    }

TEST YOURSELF #2

Draw pictures of Q as it would be each time around the outer while loop in the code given above for the example tree given above.
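For readers who want to run the level-order loop, the following is a self-contained sketch that swaps the unspecified queue Q for java.util.ArrayDeque. The class LevelOrderDemo, its nested Treenode, and exampleTree are assumptions; the tree shape is inferred from the traversal outputs quoted in the text, since the figure itself is not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class LevelOrderDemo {
    // Hypothetical general-tree node holding a letter and an ordered child list.
    static class Treenode {
        final char data;
        final List<Treenode> children = new ArrayList<>();

        Treenode(char data, Treenode... kids) {
            this.data = data;
            for (Treenode k : kids) children.add(k);
        }
    }

    // Returns the node letters in level order, separated by spaces.
    static String levelOrder(Treenode root) {
        StringBuilder out = new StringBuilder();
        if (root == null) return "";
        Queue<Treenode> q = new ArrayDeque<>();
        q.add(root);
        while (!q.isEmpty()) {
            Treenode n = q.remove();      // dequeue the front node and visit it
            if (out.length() > 0) out.append(' ');
            out.append(n.data);
            q.addAll(n.children);         // enqueue its children, left to right
        }
        return out.toString();
    }

    // Assumed shape: A has B, C; B has D; C has E, F; E has G; F has H, I.
    static Treenode exampleTree() {
        return new Treenode('A',
            new Treenode('B', new Treenode('D')),
            new Treenode('C',
                new Treenode('E', new Treenode('G')),
                new Treenode('F', new Treenode('H'), new Treenode('I'))));
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(exampleTree()));
    }
}
```

Because a queue is first-in first-out, every node at one depth is dequeued before any node at the next depth, which is what produces the level-by-level order.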

In-order

An in-order traversal involves visiting the root "in between" visiting its left and right subtrees. Therefore, an in-order traversal only makes sense for binary trees. The (recursive) definition is:

1. perform an in-order traversal of the left subtree of the root
2. visit the root
3. perform an in-order traversal of the right subtree of the root

If we print the letters in the nodes of our example tree using an in-order traversal, the following will be printed: D B A E G C H F I

The primary difference between the preorder, postorder and in-order traversals is where the node is visited in relation to the recursive calls; i.e., before, after or in-between.
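The three recursive definitions differ only in where the visit happens, which a short sketch can make visible. The class TraversalDemo and its Node type are assumptions, and the tree built in exampleTree is a shape inferred from the traversal outputs quoted in the text (D as B's left child, G as E's right child), not a copy of the missing figure.

```java
class TraversalDemo {
    // Hypothetical binary-tree node holding one letter.
    static class Node {
        final char data;
        final Node left, right;

        Node(char data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    // Visit the node BEFORE the recursive calls.
    static String preorder(Node n) {
        if (n == null) return "";
        return n.data + preorder(n.left) + preorder(n.right);
    }

    // Visit the node IN BETWEEN the recursive calls.
    static String inorder(Node n) {
        if (n == null) return "";
        return inorder(n.left) + n.data + inorder(n.right);
    }

    // Visit the node AFTER the recursive calls.
    static String postorder(Node n) {
        if (n == null) return "";
        return postorder(n.left) + postorder(n.right) + n.data;
    }

    // Assumed shape of the example tree, inferred from the printed traversals.
    static Node exampleTree() {
        Node b = new Node('B', new Node('D', null, null), null);
        Node e = new Node('E', null, new Node('G', null, null));
        Node f = new Node('F', new Node('H', null, null), new Node('I', null, null));
        return new Node('A', b, new Node('C', e, f));
    }

    public static void main(String[] args) {
        Node root = exampleTree();
        System.out.println(preorder(root));
        System.out.println(inorder(root));
        System.out.println(postorder(root));
    }
}
```

Each method touches every node exactly once, so all three run in O(N) time for a tree with N nodes.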

TEST YOURSELF #3

What is printed when the following tree is visited using (a) a preorder traversal, (b) a postorder traversal, (c) a level-order traversal, and (d) an in-order traversal?

Answers to Self-Study Questions for Trees

Test Yourself #1

Test Yourself #2

The output would be: A B C D E F G H I

Test Yourself #3
(a) A B D H E I C F J K G
(b) H D I E B J K F G C A
(c) A B C D E F G H I J K
(d) D H B I E A J F K C G

Tree Traversal Algorithms:

1: DEPTH-FIRST:

Inorder:
1. Traverse left subtree
2. Visit node (i.e. process node)
3. Traverse right subtree

Preorder:
1. Visit node
2. Traverse left
3. Traverse right

Post-order:
1. Traverse left
2. Traverse right
3. Visit node

Example: An arithmetic expression tree stores operands in leaves, operators in non-leaf nodes:

inorder traversal: (LNR) (A-B)+((C/D)*(E-F)), i.e. A-B+C/D*E-F (parentheses assumed)

postorder traversal: (LRN) AB-CD/EF-*+

preorder traversal: (NLR) +-AB*/CD-EF
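As a rough illustration of how the three notations fall out of one expression tree, the sketch below builds the tree for (A-B)+((C/D)*(E-F)) by hand and traverses it. ExprTreeDemo, Node, leaf, op and sample are hypothetical names; the in-order method adds parentheses around every operator node, so its output is the fully parenthesized form.

```java
class ExprTreeDemo {
    // Hypothetical expression-tree node: operands in leaves, operators in inner nodes.
    static class Node {
        final char symbol;
        final Node left, right;

        Node(char symbol, Node left, Node right) {
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }
    }

    static Node leaf(char operand) { return new Node(operand, null, null); }
    static Node op(char operator, Node l, Node r) { return new Node(operator, l, r); }

    // NLR: yields prefix notation.
    static String preorder(Node n) {
        if (n == null) return "";
        return n.symbol + preorder(n.left) + preorder(n.right);
    }

    // LRN: yields postfix notation.
    static String postorder(Node n) {
        if (n == null) return "";
        return postorder(n.left) + postorder(n.right) + n.symbol;
    }

    // LNR with explicit parentheses around every operator node.
    static String inorder(Node n) {
        if (n == null) return "";
        if (n.left == null && n.right == null) return String.valueOf(n.symbol);
        return "(" + inorder(n.left) + n.symbol + inorder(n.right) + ")";
    }

    // Tree for (A-B)+((C/D)*(E-F)).
    static Node sample() {
        return op('+',
            op('-', leaf('A'), leaf('B')),
            op('*',
                op('/', leaf('C'), leaf('D')),
                op('-', leaf('E'), leaf('F'))));
    }

    public static void main(String[] args) {
        Node root = sample();
        System.out.println(inorder(root));
        System.out.println(postorder(root));
        System.out.println(preorder(root));
    }
}
```

Only the in-order form needs parentheses; the prefix and postfix strings determine the tree on their own because every operator here takes exactly two operands.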

Note: Postorder traversal, with the following implementation of visit:

- if operand: PUSH
- if operator: POP two operands, calculate, push result back
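The push/pop rule in the note can be tried directly on a postfix string, which is what a postorder traversal of an expression tree produces. This is a minimal sketch assuming space-separated tokens, numeric operands and only the four operators + - * /; PostfixEval, evaluate and apply are names made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixEval {
    // Applies a binary operator to its two operands.
    static double apply(char op, double left, double right) {
        switch (op) {
            case '+': return left + right;
            case '-': return left - right;
            case '*': return left * right;
            case '/': return left / right;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // Evaluates a space-separated postfix expression (assumed to be well formed).
    static double evaluate(String expr) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String tok : expr.trim().split("\\s+")) {
            boolean isOperator = tok.length() == 1 && "+-*/".indexOf(tok.charAt(0)) >= 0;
            if (isOperator) {
                // The right operand was pushed last, so it is popped first.
                double right = stack.pop();
                double left = stack.pop();
                stack.push(apply(tok.charAt(0), left, right));
            } else {
                stack.push(Double.parseDouble(tok));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        // Postfix form of (8-2)+((6/3)*(5-1))
        System.out.println(evaluate("8 2 - 6 3 / 5 1 - * +"));
    }
}
```

The order of the two pops matters for the non-commutative operators: the first value popped is the right-hand operand.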

Conditional expression syntax: CONDITION ? True-case-EXP : False-case-EXP

The preorder, postorder and inorder traversals are "depth-first".

A breadth-first traversal algorithm, e.g.:

Traversal sequence: a, b, c, d, e, f

Algorithm Level-Traverse
1. Insert root node in queue
2. While queue is not empty
   2.1. Remove front node from queue and visit it
   2.2. Insert left child
   2.3. Insert right child

M_WAY SEARCH TREE

Definition: An m_way search tree is a tree in which all nodes are of degree <= m. (It may be empty.) A non-empty m_way search tree has the following properties:

a) It has nodes of type:

b) key1 < key2 < ... < key(m-1); in other words, key(i) < key(i+1) for 1 <= i < m-1

c) All key values in subtree Ti are greater than key(i) and less than key(i+1)

Sometimes, we have an additional entry at the leftmost field of every node, indicating the number of nonempty key values in that node.
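Properties (b) and (c) are what make searching possible: inside a node the keys are sorted, and the subtree between two adjacent keys holds only values between them. The sketch below illustrates a search under those properties. The node layout (a keys array plus a children array with one more slot than keys), the class name MWaySearchDemo, and the sample 3-way tree are all assumptions, since the node diagram is not reproduced here.

```java
class MWaySearchDemo {
    // Hypothetical m-way node: keys sorted ascending; children[i] holds values
    // between keys[i-1] and keys[i]; a null child stands for an empty subtree.
    static class Node {
        final int[] keys;
        final Node[] children;

        Node(int[] keys, Node[] children) {
            this.keys = keys;
            this.children = children;
        }
    }

    // Builds a node whose subtrees are all empty.
    static Node leaf(int... keys) {
        return new Node(keys, new Node[keys.length + 1]);
    }

    // Returns true if x occurs somewhere in the tree rooted at node.
    static boolean contains(Node node, int x) {
        while (node != null) {
            int i = 0;
            // Skip keys smaller than x (property b: keys are sorted).
            while (i < node.keys.length && x > node.keys[i]) i++;
            if (i < node.keys.length && node.keys[i] == x) return true;
            // Property c: x can only be in the subtree just before keys[i].
            node = node.children[i];
        }
        return false;
    }

    // Made-up 3-way search tree: root (20, 40) with three leaf children.
    static Node sample() {
        Node[] kids = { leaf(10, 15), leaf(25, 30), leaf(45, 50) };
        return new Node(new int[] {20, 40}, kids);
    }

    public static void main(String[] args) {
        System.out.println(contains(sample(), 30));
        System.out.println(contains(sample(), 35));
    }
}
```

Falling off the tree through a null child corresponds to reaching what the B-tree discussion calls a failure node.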

Example: 3_way search tree: Nodes will be of type:

B_TREES

A B_tree of order m is an m_way search tree (possibly empty) satisfying the following properties (if it is not empty):

a) All nodes other than the root node and leaf nodes have at least ⌈m/2⌉ children.

b) The tree is balanced. (If we modify all null link fields of leaf nodes to point to special nodes called failure nodes, then all failure nodes are at the same level.)

Example: B_tree of order 3:

Inserting new key values to B_tree:

- We want to insert a new key value, x, into a B-tree.
- The resulting tree must also be a B-tree. (It must be balanced.)
- We'll always insert at the leaf nodes.

Example:

1) Insert 38 to the B_tree of the above example: First of all, we do a search for 38 in the given B_tree. We hit the failure node marked with "*". The parent of that failure node has only one key value, so it has space for another one. Insert 38 there, add a new failure node, which is marked as "+" in the following figure, and return.

2) Now, insert 55 to this B_tree. We do the search and hit the failure node "~". However, its parent node does not have any space for a key value. Now, assume we create a new node instead of

So we end up with:

3) Now, let us insert 37 to this B_tree: We search for 37, and hit a failure node between 35 and 38. So, we have to create:

So, we end up with:

If we cannot insert to the root node, we split and create a new root node. For example, try to insert 34, 32, and 33. Thus, the height of the B_tree increases by 1 in such a case.

Deletion algorithm is much more complicated! It will not be considered here.

Binary Search Tree


A prominent data structure used in many systems programming applications for representing and managing dynamic sets. Average case complexity of Search, Insert, and Delete operations is O(log n), where n is the number of nodes in the tree.

DEF: A binary tree in which the nodes are labeled with elements of an ordered dynamic set and the following BST property is satisfied: all elements stored in the left subtree of any node x are less than the element stored at x and all elements stored in the right subtree of x are greater than the element at x.

An Example: Figure 4.14 shows a binary search tree. Notice that this tree is obtained by inserting the values 13, 3, 4, 12, 14, 10, 5, 1, 8, 2, 7, 9, 11, 6, 18 in that order, starting from an empty tree.

Note that inorder traversal of a binary search tree always gives a sorted sequence of the values. This is a direct consequence of the BST property. This provides a way of sorting a given sequence of keys: first, create a BST with these keys and then do an inorder traversal of the BST so created.

Note that the highest valued element in a BST can be found by traversing from the root in the right direction all along until a node with no right link is found (we can call that the rightmost element in the BST). The lowest valued element in a BST can be found by traversing from the root in the left direction all along until a node with no left link is found (we can call that the leftmost element in the BST).

Search is straightforward in a BST. Start with the root and keep moving left or right using the BST property. If the key we are seeking is present, this search procedure will lead us to the key. If the key is not present, we end up in a null link.

Insertion in a BST is also a straightforward operation. If we need to insert an element x, we first search for x. If x is present, there is nothing to do. If x is not present, then our search procedure ends in a null link. It is at the position of this null link that x will be included.

If we repeatedly insert a sorted sequence of values to form a BST, we obtain a completely skewed BST. The height of such a tree (counting edges) is n - 1 if the tree has n nodes. Thus, the worst case complexity of searching or inserting an element into a BST having n nodes is O(n).
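The search and insertion procedures described above are short enough to sketch in full. The code below is an illustration under the stated BST property, with duplicates ignored on insert; BstDemo, Node, fromValues and the edge-counting height helper are names chosen for this example and do not come from the text.

```java
import java.util.ArrayList;
import java.util.List;

class BstDemo {
    // Hypothetical BST node holding an int key.
    static class Node {
        final int key;
        Node left, right;

        Node(int key) { this.key = key; }
    }

    // Inserts x at the null link where a search for x ends; duplicates are ignored.
    static Node insert(Node root, int x) {
        if (root == null) return new Node(x);
        if (x < root.key) root.left = insert(root.left, x);
        else if (x > root.key) root.right = insert(root.right, x);
        return root;
    }

    // Moves left or right using the BST property until x or a null link is reached.
    static boolean search(Node root, int x) {
        Node cur = root;
        while (cur != null && cur.key != x) {
            cur = (x < cur.key) ? cur.left : cur.right;
        }
        return cur != null;
    }

    // Appends the keys in in-order (left, node, right), which is ascending order.
    static void inorder(Node root, List<Integer> out) {
        if (root == null) return;
        inorder(root.left, out);
        out.add(root.key);
        inorder(root.right, out);
    }

    // Height counted in edges; an empty tree is given height -1.
    static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Builds a BST by inserting the values in the given order.
    static Node fromValues(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        Node root = fromValues(13, 3, 4, 12, 14, 10, 5, 1, 8, 2, 7, 9, 11, 6, 18);
        List<Integer> sorted = new ArrayList<>();
        inorder(root, sorted);
        System.out.println(sorted);
        System.out.println(height(fromValues(1, 2, 3, 4, 5))); // sorted input gives a skewed tree
    }
}
```

Inserting already-sorted input makes every node a right child of the previous one, which is the skewed worst case mentioned in the text.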

Figure 4.14: An example of a binary search tree

4. Implementing a Tree in an Array


How can we represent an arbitrary binary tree in an array? In fact, there are numerous ways to do this; we'll just look at one.

Because an array's length is fixed at compile time, if we use an array to implement a tree we have to set a limit on the number of nodes we will permit in the tree. Our strategy is to fix the maximum height of the tree (H), and make the array big enough to hold any binary tree of this height (or less). We'll need an array of size (2**H)-1. Here is the biggest binary tree of depth 3:

If we picked H=3 as our limit, then every tree we might build will be a subtree of this one - this is the key insight behind our implementation. What we do now is assign each of the nodes to a specific position in the array. This could be done any way you like, but a particularly easy and useful way is:

- root of the tree (A): array position 1
- root's left child (B): array position 2
- root's right child (C): array position 3
- ...
- left child of node in array position K: array position 2K
- right child of node in array position K: array position 2K+1

So D is in position 2*2 (4), E is in position 2*2+1 (5), F is in position 2*3 (6), G is in position 2*3+1 (7).

This figure shows the array position associated with each node:

This particular arrangement makes it easy to move from a node to its children, just double the node's index (and add 1 to go right). It also makes it easy to go from a node to its parent: the parent of node I has index (I div 2). Using this strategy, a tree with N nodes does not necessarily occupy the first N positions in the array. For example, the tree:

Somehow we need to keep track of which array elements contain valid information. Two possibilities:

1. as usual, each node stores information saying which of its children exist
2. each array position stores information saying if it is a valid node.

However, if we restrict ourselves to complete trees, these problems go away. Because of the way we assigned nodes to positions, if there are N nodes in a complete tree, they will correspond to the first N positions of the array. So:

1. we only need to keep track of N in order to know which array positions contain valid information.
2. if we add a new value, it must go in position N+1; if we delete a value, we must re-organize the tree so that the "gap" created by deleting is filled.
3. we can traverse the tree by going through the array from the first to the Nth position.
    for (i = 1; i <= N; i++) { process node in position i; }

This gets us level-by-level traversal!
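The position arithmetic can be checked with a few lines of code. In this sketch the array is one slot longer than the number of nodes and slot 0 is left unused, so that the root sits at position 1 as in the scheme above; ArrayTreeDemo and its helper names are assumptions introduced for the example.

```java
class ArrayTreeDemo {
    // Index arithmetic for the 1-based layout described above.
    static int left(int k) { return 2 * k; }
    static int right(int k) { return 2 * k + 1; }
    static int parent(int k) { return k / 2; }   // integer division, i.e. "K div 2"

    // Visits positions 1..n of a complete tree; slot 0 of the array is unused.
    static String levelOrder(char[] tree, int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 1; i <= n; i++) {
            out.append(tree[i]);
        }
        return out.toString();
    }

    // The full binary tree with 7 nodes: A at 1, B and C at 2 and 3, D..G at 4..7.
    static char[] sample() {
        return new char[] {'\0', 'A', 'B', 'C', 'D', 'E', 'F', 'G'};
    }

    public static void main(String[] args) {
        char[] tree = sample();
        System.out.println(tree[left(2)] + " " + tree[right(2)]);  // children of B
        System.out.println(tree[parent(7)]);                       // parent of G
        System.out.println(levelOrder(tree, 7));
    }
}
```

Scanning the positions in increasing order visits the root, then both nodes at the next level, then the four below, which is why the plain loop yields a level-order traversal.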

To see this "amazing" fact, look again at the earlier picture. We have insisted that heaps be complete trees precisely so that they will have this very nice implementation in arrays. It is a useful exercise to work through the insert and delete operations for heaps in this array representation: the textbook gives code implementing INSERT and DELETE for this representation.

Infix, Postfix and Prefix


Infix, Postfix and Prefix notations are three different but equivalent ways of writing expressions. It is easiest to demonstrate the differences by looking at examples of operators that take two operands.

Infix notation: X + Y

Operators are written in-between their operands. This is the usual way we write expressions. An expression such as A * ( B + C ) / D is usually taken to mean something like: "First add B and C together, then multiply the result by A, then divide by D to give the final answer."

Infix notation needs extra information to make the order of evaluation of the operators clear: rules built into the language about operator precedence and associativity, and brackets ( ) to allow users to override these rules. For example, the usual rules for associativity say that we perform operations from left to right, so the multiplication by A is assumed to come before the division by D. Similarly, the usual rules for precedence say that we perform multiplication and division before we perform addition and subtraction. (see CS2121 lecture).

Postfix notation (also known as "Reverse Polish notation"): X Y +

Operators are written after their operands. The infix expression given above is equivalent to
A B C + * D /

The order of evaluation of operators is always left-to-right, and brackets cannot be used to change this order. Because the "+" is to the left of the "*" in the example above, the addition must be performed before the multiplication. Operators act on values immediately to the left of them. For example, the "+" above uses the "B" and "C". We can add (totally unnecessary) brackets to make this explicit:
( (A (B C +) *) D /)

Thus, the "*" uses the two values immediately preceding: "A", and the result of the addition. Similarly, the "/" uses the result of the multiplication and the "D".

Prefix notation (also known as "Polish notation"): + X Y

Operators are written before their operands. The expressions given above are equivalent to
/ * A + B C D

As for Postfix, operators are evaluated left-to-right and brackets are superfluous. Operators act on the two nearest values on the right. I have again added (totally unnecessary) brackets to make this clear:
(/ (* A (+ B C) ) D)

Although Prefix "operators are evaluated left-to-right", they use values to their right, and if these values themselves involve computations then this changes the order that the operators have to be evaluated in. In the example above, although the division is the first operator on the left, it acts on the result of the multiplication, and so the multiplication has to happen before the division (and similarly the addition has to happen before the multiplication).

Because Postfix operators use values to their left, any values involving computations will already have been calculated as we go left-to-right, and so the order of evaluation of the operators is not disrupted in the same way as in Prefix expressions.
ARITHMETIC EXPRESSIONS

One of the most important applications of stacks is the evaluation of arithmetic expressions. Consider the simple arithmetic expression A+B. We have three possibilities for the positioning of the operator:

1. Before the operands, as +AB, which is called Prefix notation or Polish notation (after the Polish logician Jan Łukasiewicz).
2. Between the operands, as A+B, which is called Infix notation.
3. After the operands, as AB+, which is called Postfix notation or Reverse Polish notation.

How to convert:

- from infix to postfix
- from infix to prefix
Arithmetic Expressions Dr. Faruk Tokdemir, METU 2

General rules for conversion

1. Completely parenthesize the infix expression according to the order of precedence to specify the order of operation.
2. Move each operator to its corresponding right parenthesis (for postfix) or left parenthesis (for prefix).
3. Remove all parentheses.

Consider the arithmetic operators and their precedence as:

operators    priority
^            highest
*, /
+, -         lowest
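For the postfix direction, the three rules can be mimicked mechanically once the expression is already parenthesized (step 1 is assumed to have been done by hand). In the sketch below each operator waits on a stack until its matching right parenthesis arrives, which is the "move the operator to its right parenthesis" step. ParenToPostfix and toPostfix are hypothetical names, and operands are assumed to be single letters or digits.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ParenToPostfix {
    // Converts a parenthesized infix expression with single-character operands
    // to postfix. Every operator is assumed to sit inside its own pair of
    // parentheses, except possibly the outermost one.
    static String toPostfix(String infix) {
        StringBuilder out = new StringBuilder();
        Deque<Character> pending = new ArrayDeque<>();
        for (char c : infix.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                out.append(c);               // operands keep their relative order
            } else if ("+-*/^".indexOf(c) >= 0) {
                pending.push(c);             // operator waits for its ')'
            } else if (c == ')') {
                char op = pending.pop();     // move operator to its right parenthesis
                out.append(op);
            }
            // '(' and spaces are simply dropped (rule 3)
        }
        while (!pending.isEmpty()) {         // outermost operator without parentheses
            char op = pending.pop();
            out.append(op);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toPostfix("((A*(B+C))/D)"));
        System.out.println(toPostfix("(A-B)+((C/D)*(E-F))"));
    }
}
```

This shortcut relies on the parentheses to encode precedence; handling unparenthesized input would need the priority table above and a fuller algorithm.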

Construct Binary Tree From Inorder and Preorder/Postorder Traversal


April 20, 2011 in binary tree

Given preorder and inorder traversal of a tree, construct the binary tree.

Hint: A good way to attempt this question is to work backwards. Approach this question by drawing a binary tree, then list down its preorder and inorder traversal. As with most binary tree problems, you want to solve this recursively.

About Duplicates: In this solution, we will assume that duplicates are not allowed in the binary tree. Why? Consider the following case:

    preorder = {7, 7}
    inorder = {7, 7}

We can construct the following trees which are both perfectly valid solutions.
      7             7
     /      or       \
    7                 7

Clearly, there would be ambiguity in constructing the tree if duplicates were allowed.

Solution: Let us look at this example tree.

            _______7______
           /              \
        __10__          ___2
       /      \        /
       4       3      _8
                \    /
                 1  11

The preorder and inorder traversals for the binary tree above are:

    preorder = {7,10,4,3,1,2,8,11}
    inorder = {4,10,3,1,7,11,8,2}

The crucial observation for this problem is that the tree's root always coincides with the first element in the preorder traversal. This must be true because in a preorder traversal you always visit the root node before its children. From the binary tree above, the root node's value is 7.

We easily find that 7 appears at index 4 (counting from 0) in the inorder sequence. (Notice that earlier we assumed that duplicates are not allowed in the tree, so there would be no ambiguity.) For inorder traversal, we visit the left subtree first, then the root node, followed by the right subtree. Therefore, all elements left of 7 must be in the left subtree and all elements to the right must be in the right subtree.

We see a clear recursive pattern from the above observation. After creating the root node (7), we construct its left and right subtrees from the inorder traversals {4, 10, 3, 1} and {11, 8, 2} respectively. We also need the corresponding preorder traversals, which can be found in a similar fashion. If you remember, preorder traversal follows the sequence of root node, left subtree, then right subtree. Therefore, the left and right subtrees' preorder traversals must be {10, 4, 3, 1} and {2, 8, 11} respectively. Since the left and right subtrees are binary trees in their own right, we can solve recursively!
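The recursive pattern described above might be coded along the following lines. This is a sketch under the no-duplicates assumption; BuildTreeDemo, Node, build and the traversal helpers are illustrative names, and the HashMap used to locate the root inside the inorder sequence is a convenience added here (a linear scan would follow the text just as well).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BuildTreeDemo {
    // Hypothetical binary-tree node holding an int value.
    static class Node {
        final int val;
        Node left, right;

        Node(int val) { this.val = val; }
    }

    // Rebuilds the tree from its preorder and inorder sequences (no duplicates).
    static Node build(int[] preorder, int[] inorder) {
        Map<Integer, Integer> pos = new HashMap<>();
        for (int i = 0; i < inorder.length; i++) pos.put(inorder[i], i);
        return build(preorder, 0, 0, inorder.length - 1, pos);
    }

    // preStart: index of this subtree's root in preorder;
    // inLo..inHi: the slice of inorder that belongs to this subtree.
    private static Node build(int[] pre, int preStart, int inLo, int inHi,
                              Map<Integer, Integer> pos) {
        if (inLo > inHi) return null;
        Node root = new Node(pre[preStart]);     // root is first in preorder
        int mid = pos.get(root.val);             // where the root sits in inorder
        int leftSize = mid - inLo;               // number of nodes in the left subtree
        root.left = build(pre, preStart + 1, inLo, mid - 1, pos);
        root.right = build(pre, preStart + 1 + leftSize, mid + 1, inHi, pos);
        return root;
    }

    static void preorder(Node n, List<Integer> out) {
        if (n == null) return;
        out.add(n.val);
        preorder(n.left, out);
        preorder(n.right, out);
    }

    static void inorder(Node n, List<Integer> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.val);
        inorder(n.right, out);
    }

    public static void main(String[] args) {
        int[] pre = {7, 10, 4, 3, 1, 2, 8, 11};
        int[] in = {4, 10, 3, 1, 7, 11, 8, 2};
        Node root = build(pre, in);
        List<Integer> check = new ArrayList<>();
        preorder(root, check);
        System.out.println(check);   // should reproduce the preorder input
    }
}
```

The size of the left subtree, read off the inorder slice, tells us where the right subtree's preorder sequence begins, so no arrays need to be copied.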
