CS 1110 Notes
Types of Programs
Program: a sequence of instructions that specifies how to perform a computation
Algorithm: a general process for solving a category of problems
Declarative Programming: a programming paradigm where the program describes what should be accomplished without describing the control flow of how to actually accomplish it, in stark contrast with imperative programming
Imperative Programming: a programming paradigm that describes computation in terms of statements that change a program state
Procedural Programming: imperative programming sub-paradigm where the program uses functions or subroutines
Structured Programming: a programming paradigm where the program uses lots of subroutines, loops, conditionals, or other methods of imposing structure on the program, as opposed to a giant list of statements or the use of the notorious GOTO statement
Object-Oriented Programming (OOP): a programming paradigm where the program consists of a set of objects, each with a set of data fields and methods, interacting with each other
Functional Programming: a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data; a subtype of declarative programming
Program Use
Portability: the property of a programming language whose programs can be run on different kinds of computers with little or no modification
Interpreter: a program that
Compiler: a program that
Source Code: a high-level program
Object Code (Executable): the translated source code that the computer can execute
Prompt: a set of symbols such as >>> to indicate that the interpreter is ready for user input
Script: a file containing code for an interpreter to execute
Interactive Mode:
Script Mode:
Program Structure
Bug: an error in a program
Debugging: the process of fixing bugs in programs
Syntax: the structure of a program and the rules about that structure
Syntax Error:
Runtime Error (Exception): an error that doesn't appear until a program has started running
Semantics:
Semantic Error: an error in what the program means; the program runs, but doesn't do what you want it to
Natural Language: languages that people speak
Formal Language: languages that are designed for specific applications
natural languages are not explicitly developed by people but rather evolve naturally and have some structure
imposed along the way
programming languages are formal languages that have been designed to express computations
Token: Parse: Print Statement:
math: perform basic mathematical operations like addition and multiplication
conditional execution: check for certain conditions and execute the appropriate code
repetition: perform some action repeatedly, usually with some variation
immutable objects can be read, and a modified copy can be assigned to a new variable, so in practice they can be used much like mutable objects
the Python symbol for a comment is #
the Python line continuation character is \
lines with the line continuation character can NOT contain comments
you can indent and break lines however you want between any sort of delimiter
Python distinguishes between tabs and spaces
always use four spaces, never tabs
the website for Python documentation is: http://docs.python.org/3.2/index.html
Operators
Operator: a special symbol that represents a computation like addition or multiplication
Operand: the values the operator is applied to
Expression: a combination of values, variables, and operators; only one thing is necessary to be considered an expression
Subexpression: an expression in parentheses that acts as a single operand in a larger expression
Statement: a unit of code that the Python interpreter can execute
Binary Operator: an operator that takes two operands
Prefix: writing a mathematical expression with the operators before the operands
Infix: writing a mathematical expression with the operators between the operands
Postfix: writing a mathematical expression with the operators after the operands
Floor Division: division that rounds the result down to the nearest whole number; when both of the operands are integers, the result is also an integer (for negative results this rounds toward negative infinity rather than truncating toward zero)
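Python performs floor division with the // operator (in Python 3, / always performs true division; in Python 2, / performs floor division when both operands are integers)
Python allows multiple assignment in a very intuitive way, e.g.: x = y = 5
this can also be done with boolean expressions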
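As a small illustration of the division operators and multiple assignment described above, here is a sketch that assumes Python 3; the variable names are illustrative only.

```python
# True division always produces a float in Python 3.
true_quotient = 7 / 2        # 3.5

# Floor division rounds toward negative infinity.
floor_positive = 7 // 2      # 3
floor_negative = -7 // 2     # -4, not -3

# The modulus pairs with floor division: a == (a // b) * b + (a % b)
remainder = -7 % 2           # 1

# Multiple (chained) assignment binds both names to the same value.
x = y = 5
```

The negative case is the one worth remembering: rounding down and truncating toward zero only differ when the exact quotient is negative.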
Arithmetic Operators
+      addition
-      subtraction
*      multiplication
/      division
%      modulus
**     exponentiation
//     floor division

Assignment Operators
=      simple assignment
+=     add the RHS to the variable
-=     subtract the RHS from the variable
*=     multiply the variable by the RHS
/=     divide the variable by the RHS
%=     assign the modulus of the variable and RHS to the variable
**=    raise the variable to the power of the RHS
//=    floor divide the variable by the RHS
Logical Operators
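Membership Operators
in        is a member of the sequence (for strings, is a substring)
not in    is not a member of the sequence (for strings, is not a substring)

Identity Operators
is        are the same object
is not    are not the same object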
Operator Precedence
**                                    exponentiation
~, +, -                               complement, unary plus and minus
*, /, %, //                           multiply, divide, modulo, floor division
+, -                                  addition and subtraction
>>, <<                                right and left bitwise shift
&                                     bitwise AND
^, |                                  bitwise exclusive OR and regular OR
<=, <, >, >=                          comparison operators
==, !=                                equality operators
=, %=, /=, //=, -=, +=, *=, **=       assignment operators
is, is not                            identity operators
in, not in                            membership operators
not, or, and                          logical operators
Python Keywords
in Python 3, exec is no longer a keyword
in Python 3, nonlocal is a new keyword
Python does NOT have a GOTO statement
Python does NOT have a switch statement
this is in order to encourage making a polymorphic call instead
anything a switch statement can do can be done with if -- elif -- elif -- else
Python Keywords
and as assert break class continue def del elif else except exec finally for from global if import in is lambda not or pass print raise return try while with yield
Data Structures
Strings
String: an immutable sequence of characters
Python treats single quotes exactly the same as double quotes
in Python 3, strings are stored as Unicode text (in Python 2, normal strings were stored as 8-bit characters)
indexes always start at zero
a negative sign makes indexes start from the right instead of the left
when starting from the right, the indexes start at one instead of zero
for slices, the range selected starts at the first index and ends with the element before the second index
Create A String
< string name > = < string characters >
String Functions
len()           returns the number of characters in the string
count(str)      returns the number of times str occurs in string or the denoted substring
find()
index()
startswith()
endswith()
max()           returns the maximum alphabetical character from the string
min()           returns the minimum alphabetical character from the string
isalpha()       returns True if the string has at least one character and all characters are alphabetic and False otherwise
isdigit()       returns True if the string contains only digits and False otherwise
islower()
isupper()
capitalize()    capitalizes the first letter of the string
title()         returns a titlecased version of the string; all words begin with uppercase and the rest are lowercase
lower()         converts all uppercase letters in the string to lowercase
upper()         converts all lowercase letters in the string to uppercase
swapcase()      inverts the case of all letters in the string
replace( , )    replaces all occurrences of the old substring with the new substring
split()         splits the string according to the delimiter and returns a list of substrings
splitlines()    splits the string at all newlines and returns a list of each line with the newline characters removed
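The indexing, slicing, and method behavior listed above can be sketched as follows. This is a minimal illustration; the sample word and variable names are made up for the example.

```python
word = "programming"

first = word[0]          # index 0 is the first character
last = word[-1]          # index -1 is the last character
middle = word[3:7]       # characters at indexes 3, 4, 5, and 6

# Strings are immutable, so these methods return new strings
# and leave the original unchanged.
shout = word.upper()
swapped = word.replace("gram", "GRAM")

parts = "a,b,c".split(",")   # split on an explicit delimiter
count_m = word.count("m")    # number of times "m" occurs
```

Because every method returns a new string, `word` still holds its original value after all of these calls.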
Tuples
Tuple: an immutable sequence of multiple types of objects
the objects that a tuple contains must remain the same, but the contents of any given object can potentially change
a simple string or number in a tuple can NOT be modified
a string or a number inside a list inside a tuple can be freely modified
the list itself must still exist in its original location in the tuple, even if what's in the list changes
indexes always start at zero
a negative sign makes indexes start from the right instead of the left
when starting from the right, the indexes start at one instead of zero
for slices, the range selected starts at the first index and ends with the element before the second index
Create A Tuple
< tuple name > = (< tuple elements >)
Tuple Functions
tuple()    converts a list to a tuple
len()      returns the number of elements in the tuple
count()
index()
max()      returns the element from the tuple with the maximum value
min()      returns the element from the tuple with the minimum value
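The distinction between a tuple's fixed slots and the mutable objects stored in them can be seen in a short sketch. The record contents here are invented for the example.

```python
record = ("Ada", 36, ["math", "logic"])

# Rebinding a slot of the tuple is not allowed.
try:
    record[1] = 37
    reassigned = True
except TypeError:
    reassigned = False

# The list stored inside the tuple is still mutable,
# and it stays in its original position in the tuple.
record[2].append("computing")

# tuple() converts a list to a tuple.
converted = tuple([3, 1, 2])
largest = max(converted)
```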
Lists
List: a mutable sequence of multiple types of objects
indexes always start at zero
a negative sign makes indexes start from the right instead of the left
when starting from the right, the indexes start at one instead of zero
for slices, the range selected starts at the first index and ends with the element before the second index
Create A List
< list name > = [< list elements >]
List Functions
list()         converts a tuple to a list
len()          returns the number of elements in the list
count()        returns the number of times the passed argument appears in the list
index()        returns the index number of the passed argument
max()          returns the element from the list with the maximum value
min()          returns the element from the list with the minimum value
append()       adds the passed argument to the end of the list
extend()
clear()
reverse()      reverses the order of the elements of the list
remove(obj)    removes the object obj from the list
pop()          removes and returns the last element from the list
sort()
insert()
map()
Dictionaries
Dictionary (Associative Array) (Hash Table) (Struct):
dictionary keys do NOT have to be strings
IF a dictionary key is a string, it needs to be placed in quotes
dictionary keys MUST be immutable
Create A Dictionary
< dictionary name > = {< key name >: <value > , <key name >: <value > , <key name >: <value > }
Dictionary Functions
cmp(dict1, dict2)                     (Python 2 only)
len(dict)                             returns the length of the dictionary
str(dict)
dict.clear()
dict.copy()                           returns a shallow copy of the dictionary
dict.fromkeys()
dict.get(key, default=None)
dict.has_key(key)                     (Python 2 only; use key in dict instead)
dict.items()                          returns a view (a list in Python 2) of the dictionary's (key, value) tuple pairs
dict.keys()                           returns a view (a list in Python 2) of the dictionary's keys
dict.setdefault(key, default=None)
dict.update(dict2)                    adds dictionary dict2 key-value pairs to the dictionary
dict.values()                         returns a view (a list in Python 2) of the dictionary's values
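A brief sketch of the key rules and a few of the functions above, assuming Python 3. The names and ages are invented for the example.

```python
ages = {"alice": 30, "bob": 25}

# Keys do not have to be strings, but they must be immutable:
# a tuple works as a key.
ages[("carol", "smith")] = 41

# get() returns the default instead of raising KeyError.
missing = ages.get("dave", 0)

# update() adds or overwrites key-value pairs from another dictionary.
ages.update({"bob": 26, "erin": 19})

# A list is mutable, so it is rejected as a key.
try:
    ages[["a", "list"]] = 1
    list_key_ok = True
except TypeError:
    list_key_ok = False

string_keys = sorted(key for key in ages if isinstance(key, str))
```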
Conditionals
IF Statement
if < boolean expression >:
    < statements >
IF-ELSE Statement
if < boolean expression >:
    < statements >
else:
    < statements >
IF-ELSEIF Statement
if < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
IF-ELSEIF-ELSE Statement
if < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
elif < boolean expression >:
    < statements >
else:
    < statements >
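One concrete instance of the IF-ELSEIF-ELSE template might look like the following. The function name and the grade cutoffs are hypothetical and chosen only for illustration.

```python
def letter_grade(score):
    """Map a numeric score to a letter grade (hypothetical cutoffs)."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        # reached only when none of the conditions above were true
        return "F"
```

The branches are tested from top to bottom and only the first true one runs, which is why each elif does not need to repeat the upper bound.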
Loops
Loops
Accumulator: a variable used in a loop to accumulate a series of values, such as by concatenating them onto a string or adding them to a running sum
anything that can be done by a for loop can be done by a while loop
NOT everything that can be done by a while loop can be done by a for loop
anything that can be done by a while loop can be done by recursion
anything that can be done by recursion can be done by a while loop
loops tend to be easier and a better choice for simpler tasks
recursion tends to be easier and a better choice for when the amount of repeating is less certain due to complex algorithms and data structures
iteration tends to be slightly faster than recursion, but usually not enough to bother considering
when compiled, some types of recursion will be converted to iteration
recursion resides on the call stack, while iteration does not
iteration is less likely to create a stack overflow and various problems with handling large numbers
index, outerIndex, innerIndex, counter, and increment are useful variable names for use with loops
the pass statement is a null operation; it does nothing
the pass statement is useful as a temporary syntactic placeholder before you get around to putting something there
the pass statement can be used anywhere, not just in loops
the break statement immediately exits a loop entirely
this not only exits the current iteration through the loop, but the loop in its entirety
the continue statement immediately exits a loop for the given iteration
this only exits the current iteration through the loop, but the rest of the iterations will continue on normally
For Loop
for < iterating variable > in < sequence of values >:
    < repetend >
For-Else Loop
for < iterating variable > in < sequence of values >:
    < repetend >
else:
    < statements >
While Loop
while < boolean expression >:
    < repetend >
While-Else Loop
while < boolean expression >:
    < repetend >
else:
    < statements >
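The loop forms above, together with break, continue, and an accumulator, can be sketched as follows. The function names are hypothetical; the else clause of a loop runs only when the loop finishes without hitting break.

```python
def first_even(numbers):
    """Return the first non-negative even number, or None if there is none."""
    for n in numbers:
        if n < 0:
            continue        # skip this iteration, keep looping
        if n % 2 == 0:
            found = n
            break           # leave the loop entirely
    else:
        # runs only if the loop ended without a break
        found = None
    return found


def sum_to(limit):
    """Add up 1 + 2 + ... + limit using a while loop and an accumulator."""
    total = 0               # the accumulator
    index = 1
    while index <= limit:
        total += index
        index += 1
    return total
```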
Testing
Exceptions
Catching: handling an exception with a try statement and its except clauses
Throwing (Raising): signaling that an exception has occurred, in Python with a raise statement
the finally block is always executed whether or not an exception is thrown, and is often used to close files and do similar tidying up
Exceptions Block
try:
    < statements >
except < optional exception type >:
    < statements >
except < optional exception type >:
    < statements >
except < optional exception type >:
    < statements >
else:
    < statements >
finally:
    < statements >
Custom Exceptions
class < exception name >(Exception):
    def __init__(self, value):
        self.value = value

    def __str__(self):
        return repr(self.value)
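To show how raising, catching, else, and finally fit together, here is a minimal sketch. The exception name NegativeValueError, the function checked_sqrt, and the log list are all hypothetical and exist only for this example.

```python
class NegativeValueError(Exception):
    """A custom exception carrying the offending value (hypothetical)."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


log = []


def checked_sqrt(x):
    try:
        if x < 0:
            raise NegativeValueError(x)     # throwing (raising)
        result = x ** 0.5
    except NegativeValueError as err:       # catching
        log.append(f"caught {err.value!r}")
        result = None
    else:
        log.append("no exception")          # only when nothing was raised
    finally:
        log.append("cleanup")               # always runs
    return result


root = checked_sqrt(9)
failed = checked_sqrt(-4)
```

After both calls, the log shows that the else block ran only for the successful call while the finally block ran for both.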
Functions
Functions
Functional Programming: a style of programming where the majority of functions are pure
Pure Function: a function that does not modify any of the objects it receives as arguments; most pure functions are fruitful
Fruitful Function: a function that returns a value
Void Function (Procedure): a function that does NOT return a value
Modifier: a function that changes one or more of the objects it receives as arguments; most modifiers are fruitless
First-Class Citizen: an entity that can be constructed at run-time, passed as a parameter, returned from a subroutine, and assigned into a variable or data structure
Function Call: a statement that executes (calls) a function
Wrapper: a method that acts as a middleman between a caller and a helper method, often making the method easier or less error-prone to invoke
Helper: a method that is not invoked directly by a caller but is used by another method to perform part of an operation
Encapsulation: wrapping a piece of code up in a function
Generalization: adding a parameter to a function
Docstring (Documentation String): a string at the beginning of a function that explains the interface
Invariant: a condition that should always be true during the execution of a program
Precondition: things that are supposed to be true before a function starts executing
Postcondition: things that a function is supposed to do and any side effects of the function
Namespace: a syntactic container providing a context for names so that the same name can reside in different namespaces without ambiguity
Naming Collision: when two or more names in a given namespace cannot be unambiguously resolved
Scope: the part of a program that can access a particular variable
Global Variable: a variable that is declared in the current module and can be accessed anywhere in the program
Local Variable: a variable that is declared inside a function and only exists inside the function, including parameters
Argument (Actual Parameter): a value provided to a function when the function is called, which can be assigned to the corresponding parameter in the function
Parameter (Formal Parameter): a name used inside a function to refer to the value which was passed to it as an argument
variables defined inside a function body have a local scope
a function that does not explicitly return a value will return the value None
Create a Function
def < function name >(< function arguments >):
    < docstring >
    < function body >
    return < optional return expression >
Call a Function
< function name >( < function arguments >)
Proof By Induction
1. prove the base case
2. prove the induction step
ASSUME that the induction hypothesis is true (the general n case)
both steps must hold for the proof to be valid
a common error is making an irrelevant base case or not having enough base cases
it may be necessary to cache not just the recursively calculated value, but also other parameters necessary to calculate it
Tail Recursion: when the last action performed by a recursive function is a recursive call
for tail recursion to occur, it isn't enough for a recursive call to just be in the last expression; it has to be the last action performed
the simple approach to calculating the Fibonacci series recursively is NOT tail recursive because the last action performed is addition, not recursion
fib(n) = fib(n-1) + fib(n-2)
tail recursion can easily be converted to a while loop to save memory overhead and prevent a stack overflow
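The conversion from tail recursion to a while loop can be sketched with factorial, chosen here only because it is short. The function names and the accumulator parameter are assumptions made for the example. Note that CPython does not perform this conversion automatically, so the recursive version still uses one stack frame per call.

```python
def factorial_tail(n, accumulator=1):
    """Tail-recursive factorial: the recursive call is the last action."""
    if n <= 1:
        return accumulator
    # nothing is left to do after this call returns
    return factorial_tail(n - 1, accumulator * n)


def factorial_loop(n):
    """The same computation rewritten as a while loop."""
    accumulator = 1
    while n > 1:
        # these updates mirror the arguments of the recursive call
        accumulator *= n
        n -= 1
    return accumulator
```

The loop version handles inputs far beyond the default recursion limit, for example factorial_loop(5000), because it never grows the call stack.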
Basic Recursion
def recursor(value):
    # handle base cases
    if < base case 1 >:
        return < base case 1 defined value >
    if < base case n >:
        return < base case n defined value >
    # general recursive case
    if < not at a base case >:
        < statements >
        recursor(value)
    return value
Memoization
# cache is a dictionary that maps each argument to its computed result
def recursor(value):
    # check if value is in cache
    if value in cache:
        return cache[value]
    # handle base cases
    if < base case 1 >:
        return < base case 1 defined value >
    elif < base case n >:
        return < base case n defined value >
    # general recursive case
    else:
        < statements >
        result = recursor(< smaller value >)
    # add the result to the cache if it isn't already there
    if value not in cache:
        cache[value] = result
    return result
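A concrete instance of the memoization pattern, using the Fibonacci series mentioned earlier. This is a sketch; the names cache and fib are chosen for the example, and the cache is a plain dictionary keyed by the argument.

```python
cache = {}


def fib(n):
    """Return the n-th Fibonacci number, remembering earlier results."""
    # check if the value is already in the cache
    if n in cache:
        return cache[n]
    # base cases
    if n < 2:
        result = n
    # general recursive case
    else:
        result = fib(n - 1) + fib(n - 2)
    # store the result so each n is computed only once
    cache[n] = result
    return result
```

Without the cache the number of calls grows exponentially with n; with it, each value from 0 to n is computed once.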
IS-A Relationship: the relationship between a child class and its parent class; are all X necessarily a Y?
HAS-A Relationship: the relationship between two classes where instances of one class contain references to instances of the other
Multiplicity: a notation in a class diagram that shows, for a HAS-A relationship, how many references there are to instances of another class
Python searches for attributes along the method resolution order (MRO), which checks a class before its parents and checks the parents from left to right
in Python, everything is public
this is based on the philosophy of: protections? to protect what from whom?
attributes belonging to a given object should be passed in their normal form, e.g. int, string, etc.
class attributes are accessible from any method in the class and are shared by all instances of the class
attributes a given object needs that belong to a different object should be given by passing the name of the object owning the attribute and then using a getter method for the needed attribute
you should plan everything out as being separate objects and then afterwards see how inheritance can help simplify things
don't try to go from parent classes to child classes; instead, go from child classes to parent classes
Define a Class
class < class name >(< optional parent1 class name >, < optional parent2 class name >):
    < doc string >
    < attribute definitions >

    # Python equivalent of a constructor
    def __init__(self, < arguments >):
        self.< arguments > = < argument value >

    # Python equivalent of a toString
    def __str__(self):
        return < string representation >

    # getter methods
    def < method name >(self, < method arguments >):
        < method statements >

    # setter methods
    def < method name >(self, < method arguments >):
        < method statements >

    # other methods
    def < method name >(self, < method arguments >):
        < method statements >
Import a Module
import < module name >
abstract data types simplify the task of specifying an algorithm if you can denote the operations you need
without having to think at the same time about how the operations are performed
abstract data types provide a common high-level language for specifying and talking about algorithms
Interface: the set of operations that define an abstract data type
Implementation: code that satisfies the syntactic and semantic requirements of an interface
Client: a program (or the person who wrote the program) that uses an abstract data type
Provider: the code (or the person who wrote the program) that implements an abstract data type
Veneer: a class definition that implements an abstract data type with method definitions that are invocations of other methods, sometimes with simple transformations
the veneer does no significant work, but it improves or standardizes the interface seen by the client
Linked Lists
Linked List: a data structure that implements a collection using a sequence of linked nodes
Embedded Reference: a reference stored in an attribute of an object
Node: an element of a list, usually implemented as an object that contains a reference to another object of the same type
Cargo: an item of data contained in a node
Link: an embedded reference used to link one object to another
Recursive Data Structure: a data structure with a recursive definition
Collection: multiple objects assembled into a single entity
The Fundamental Ambiguity Theorem: a variable that refers to a list node can treat the node as a single object or as the first in a list of nodes
Node Implementation
class Node:
    """an implementation of the node ADT"""

    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next

    def __str__(self):
        return str(self.cargo)

    def printBackward(self):
        # print the rest of the list first, then this node's cargo
        if self.next is not None:
            tail = self.next
            tail.printBackward()
        print(self.cargo, end=" ")
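To make the linking and the backward traversal concrete, here is a self-contained sketch. It redefines a minimal Node so the block runs on its own, and the helper functions to_list and backward are hypothetical names that return lists instead of printing, which makes the results easier to check.

```python
class Node:
    """A minimal node: one item of cargo plus a link to the next node."""

    def __init__(self, cargo=None, next=None):
        self.cargo = cargo
        self.next = next


def to_list(node):
    """Follow the links from node and collect the cargo in order."""
    items = []
    while node is not None:
        items.append(node.cargo)
        node = node.next
    return items


def backward(node):
    """Collect the cargo in reverse order using recursion."""
    if node is None:
        return []
    # handle the rest of the list first, then this node
    return backward(node.next) + [node.cargo]


# Build the list 1 -> 2 -> 3 by nesting the constructors.
head = Node(1, Node(2, Node(3)))
```

Passing head.next to either helper treats the second node as the first node of a shorter list, which is the ambiguity the theorem above describes.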
Stacks
Stack: a list that operates according to LIFO (Last In, First Out)
Stack Implementation
class Stack:
    """an implementation of the stack ADT"""

    def __init__(self):
        self.items = []

    def __str__(self):
        # __str__ must return a string, not the list itself
        return str(self.items)

    def is_empty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()
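A classic use of LIFO order is checking that brackets are balanced. The sketch below uses a plain Python list as the stack (append to push, pop to pop) so that it is self-contained; the function name is_balanced is hypothetical.

```python
def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for char in text:
        if char in pairs.values():
            stack.append(char)              # push an opening bracket
        elif char in pairs:
            # a closing bracket must match the most recent opener
            if not stack or stack.pop() != pairs[char]:
                return False
    # anything left on the stack was never closed
    return not stack
```

The most recently opened bracket is always the next one that must close, which is exactly the order a stack gives back.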
Queues
Queue: a list that operates according to FIFO (First In, First Out)
Queueing Policy: the rules that determine which member of a queue is removed next
Linked Queue: a queue implemented using a linked list
by maintaining a reference to both the first and the last node, operations can be performed in constant time instead of linear time
Queue Implementation
class Queue:
    """an implementation of the queue ADT"""

    def __init__(self):
        self.length = 0
        self.head = None
        self.last = None

    def is_empty(self):
        return self.length == 0

    def insert(self, item):
        node = Node(item)
        if self.length == 0:
            # if the list is empty, the new node is head and last
            self.head = node
            self.last = node
        else:
            # find the last node
            last = self.last
            # append the new node
            last.next = node
            self.last = node
        self.length = self.length + 1

    def remove(self):
        cargo = self.head.cargo
        self.head = self.head.next
        self.length = self.length - 1
        if self.length == 0:
            self.last = None
        return cargo
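For comparison, the standard library already offers a structure with constant-time operations at both ends. The sketch below uses collections.deque rather than the linked queue from these notes, so it is an alternative technique and not the same implementation; the customer names are invented.

```python
from collections import deque

queue = deque()

# Insert at the back of the queue.
for customer in ["ann", "ben", "cat"]:
    queue.append(customer)

# Remove from the front: the first one in is the first one out.
served = [queue.popleft(), queue.popleft()]
remaining = list(queue)
```

Using list.pop(0) for removal would also give FIFO order, but it shifts every remaining element and so takes linear time.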
Priority Queues
Priority Queue: a queue where each member has a priority determined by external factors and the member with the highest priority is the first to be removed
class PriorityQueue:
    def __init__(self):
        self.items = []

    def is_empty(self):
        return self.items == []

    def insert(self, item):
        self.items.append(item)

    def remove(self):
        # find the index of the item with the highest priority
        maximum = 0
        for index in range(1, len(self.items)):
            if self.items[index] > self.items[maximum]:
                maximum = index
        item = self.items[maximum]
        del self.items[maximum]
        return item

# defined in the class of the queued objects, not in the queue itself
def __gt__(self, other):
    """only necessary for queues of objects, not numbers or strings"""
    return < boolean expression comparing the priority of self and other >
Trees
Tree: a set of nodes connected by edges that indicate the relationships among the nodes
trees are for hierarchical data
the path between a tree's root and any other node is unique
Full Tree:
Complete Tree:
Subtree: a node and its descendants from an original tree
General Tree: a tree where each node can have an arbitrary number of children
n-ary Tree: a tree where each node has no more than n children
Binary Tree: a tree where each node has at most two children
Expression Tree:
Search Tree:
Decision Tree:
Expert System:
Game Tree: a general decision tree that represents the possible moves in any situation in a game
Parse Tree:
Grammar:
$n = 2^h - 1$: number of nodes in a full binary tree of height $h$
$h = \log_2(n + 1)$: height of a full binary tree with $n$ nodes
2-3 Tree: a general tree whose interior nodes must have either two or three children and whose leaves occur on the same level
2-4 Tree (2-3-4 Tree): a general tree whose interior nodes must have two, three, or four children and whose leaves occur on the same level
the root is black
every red node has a black parent
any children of a red node are black
every path from the root to a leaf contains the same number of black nodes
Root: a node in a tree with no parent; the top-most node in a tree
Leaf: a node in a tree with no children; a bottom-most node in a tree
Parent: the node that refers to a given node
Child: one of the nodes referred to by a node
Siblings: nodes with the same parent node
Ancestor: any node above a given node
Descendant: any node below a given node
Up: towards the root
Down: towards the leaves
Depth:
Height: the number of levels in a tree; equivalently, the number of nodes along the longest path between the root and a leaf
Level: the set of nodes the same depth from the root
Path: a set of edges and nodes connecting a starting node and an ending node
Preorder Traversal (Depth-First Traversal): traversing a tree by visiting each node before its children
Inorder Traversal: traversing a tree by visiting the left child of each node, then the parent node, and then the right child of each node
Postorder Traversal: traversing a tree by visiting the children of each node before the node itself
Level-Order Traversal (Breadth-First Traversal): traversing a tree one level at a time, starting at the root and going from left to right within each level; a breadth-first search of the tree
def total(tree):
    # an empty tree contributes 0 to the sum
    if tree is None:
        return 0
    return total(tree.left) + total(tree.right) + tree.cargo

def printTreePreOrder(tree):
    """prints the cargo of every node in the tree
    uses a preorder traversal"""
    if tree is None:
        return None
    print(tree.cargo)
    printTreePreOrder(tree.left)
    printTreePreOrder(tree.right)

def printTreeInOrder(tree):
    """prints the cargo of every node in the tree
    uses an inorder traversal"""
    if tree is None:
        return None
    printTreeInOrder(tree.left)
    print(tree.cargo)
    printTreeInOrder(tree.right)

def printTreePostOrder(tree):
    """prints the cargo of every node in the tree
    uses a postorder traversal"""
    if tree is None:
        return None
    printTreePostOrder(tree.left)
    printTreePostOrder(tree.right)
    print(tree.cargo)

def printTreeLevelOrder(tree):
    """prints the cargo of every node in the tree
    uses a level-order traversal"""
    if tree is None:
        return None
    # visit the nodes level by level using a FIFO queue of pending nodes
    pending = [tree]
    while pending:
        node = pending.pop(0)
        print(node.cargo)
        if node.left is not None:
            pending.append(node.left)
        if node.right is not None:
            pending.append(node.right)
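The four traversal orders are easiest to compare on one small tree. The sketch below is self-contained: the Tree class and the function names are assumptions for the example, and each function returns a list of cargo values instead of printing so the orders can be compared directly.

```python
from collections import deque


class Tree:
    """A minimal binary tree node (hypothetical, for this example)."""

    def __init__(self, cargo, left=None, right=None):
        self.cargo = cargo
        self.left = left
        self.right = right


def preorder(tree):
    if tree is None:
        return []
    return [tree.cargo] + preorder(tree.left) + preorder(tree.right)


def inorder(tree):
    if tree is None:
        return []
    return inorder(tree.left) + [tree.cargo] + inorder(tree.right)


def postorder(tree):
    if tree is None:
        return []
    return postorder(tree.left) + postorder(tree.right) + [tree.cargo]


def level_order(tree):
    result = []
    pending = deque([tree] if tree is not None else [])
    while pending:
        node = pending.popleft()        # oldest pending node first
        result.append(node.cargo)
        if node.left is not None:
            pending.append(node.left)
        if node.right is not None:
            pending.append(node.right)
    return result


#         1
#       /   \
#      2     3
#     / \
#    4   5
tree = Tree(1, Tree(2, Tree(4), Tree(5)), Tree(3))
```

The three depth-first orders differ only in where the node's own cargo is placed relative to the recursive calls, while the level-order version needs a queue instead of recursion.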
Heaps
Heap: a complete binary tree whose nodes contain comparable objects and are organized so that each node's object is no smaller (or alternatively no larger) than the objects in its descendants
Maxheap: a heap where the object in each node is greater than or equal to its descendant objects
Minheap: a heap where the object in each node is less than or equal to its descendant objects
Semiheap: a heap where the root node breaks the order
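Python's standard library provides a minheap through the heapq module, which stores the complete binary tree in a plain list (the children of index i sit at indexes 2i + 1 and 2i + 2). The sketch below is only an illustration; the helper is_min_heap is a hypothetical name written for this example.

```python
import heapq


def is_min_heap(items):
    """Check that every parent is less than or equal to its children."""
    for child in range(1, len(items)):
        parent = (child - 1) // 2
        if items[parent] > items[child]:
            return False
    return True


heap = []
for value in [5, 1, 8, 3, 2]:
    heapq.heappush(heap, value)     # keeps the heap order after each insert

smallest = heap[0]                  # the root of a minheap is the minimum
ordered_while_full = is_min_heap(heap)

# Popping repeatedly removes the items in ascending order.
drained = [heapq.heappop(heap) for _ in range(len(heap))]
```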
Algorithms
Algorithm Strategies and Problems
Brute Force (Guess and Check): an algorithm strategy that systematically calculates all of the possible answers to the problem and sees which one is the best or satisfies the problem statement
Greedy Algorithm: an algorithm that uses the strategy of always picking the locally optimal choice in order to build up a globally optimal solution to a larger problem
Relaxation: the algorithm strategy of approximating a difficult problem with a simpler one and using its solution to work towards the solution to the original problem in an iterative fashion; alternatively, make a simple estimate of the solution to the original problem and iteratively improve the accuracy of the solution
Divide-And-Conquer (D&C): repeatedly reducing a hard problem to multiple simpler subproblems until the subproblems are simple enough to solve, and then combining the results from the subproblems in such a way as to solve the original problem; can be used with non-overlapping subproblems
Multiply And Surrender (Procrastinate And Surrender): an algorithm strategy for reluctant algorithms where the problem is repeatedly replaced by multiple simpler subproblems as long as possible until the problems are so simple that they must be solved (unless you just completely stop making any sort of progress towards the problem)
Traveling Salesman Problem: given a list of locations and distances between each pair of locations, find the shortest possible route that visits each location exactly once and returns to the starting location
Dynamic Programming: an algorithm strategy only for problems with optimal substructure AND overlapping subproblems, where the problem is divided into simpler subproblems which are solved and the subproblem solutions are combined to get the solution to the original problem
Optimal Substructure: a problem whose solution can be obtained by combining the optimal solutions of the subproblems
Overlapping Subproblems: a problem that, when broken down into subproblems, has the same subproblems come up multiple times, so memoization of the solutions can save time
Integer Programming: a mathematical optimization or feasibility problem where some or all of the variables are restricted to be integers; integer programming is NP-hard
Linear Programming (LP): a technique for mathematical optimization of a linear objective function subject to linear equality and inequality constraints
Nonlinear Programming (NLP): techniques for mathematical optimization of an objective function subject to a system of equality and inequality constraints where the objective function and/or some or all of the constraints are nonlinear
Algorithm Analysis
P:
NP:
NP Easy: a problem that is at most as hard as NP, but not necessarily in NP
NP Equivalent: a problem exactly as difficult as the hardest problems in NP, but not necessarily in NP
NP Hard (Non-Deterministic Polynomial-Time Hard): a problem; a class of problems that are at least as hard as the hardest NP problems
NP Complete (NP-C) (NPC):
Satisfiability (The Satisfiability Problem) (SAT):
Turing Machine:
Constant Time: an operation whose runtime does not depend on the size of the data structure
Linear Time: an operation whose runtime is a linear function of the size of the data structure
$f(n) = \Theta(\phi(n))$ IF there exist constants $N$, $A$, and $B$ such that $A\,|\phi(n)| \le |f(n)| \le B\,|\phi(n)|$ for all $n > N$
$\exists$ = there exists
$\forall$ = for all
$f(n) \sim c\,\phi(n)$ = function asymptotic to
only the largest term of each half of the rational expression is relevant
take the largest term from the numerator and the denominator as a fraction and cancel things out to get the big O behavior of a function
searching a list from start to end to find something (linear search) is on average O(n)
searching a binary search tree from the root to find something (binary search) is on average O(log2 n)
in computer science log n is assumed to be log2 n because that's what's useful in computer science
Computing Requirements of Common Constructions
conditional
loop through n items
visit every element in a binary tree to depth n
visit 1 node at every depth to a depth n in a binary tree
Searching Algorithms
Searching Algorithm Comparison
Algorithm        Best    Average     Worst
Linear Search    O(1)    O(n)        O(n)
Binary Search    O(1)    O(log n)    O(log n)
Sequential Search
1. check if the first element in the list is what you're searching for
   (a) IF yes, return it
   (b) IF no, check the next element
2. repeat until you either find what you're searching for or check the entire list
sequential search is your only real option for completely unstructured data
def SequentialSearch(items, value):
    """searches the unsorted list for the given value using a sequential search algorithm
    returns the index(es) containing the desired value"""
    occurrences = []
    # range(len(items)) covers every index, including the last one
    for index in range(len(items)):
        if items[index] == value:
            occurrences.append(index)
    return occurrences
Binary Search
1. check the median or middle element of the list
   (a) IF it is the desired element, return it
   (b) IF it comes after the desired element in the list, get the median element of the first half of the list
   (c) IF it comes before the desired element in the list, get the median element of the second half of the list
2. continue checking and getting the median of the relevant section of the list until the desired element is found or there is nothing in between two elements you've already checked
binary search requires that the data is sorted in the relevant order
in practice, binary search is good, but not as good as one would expect: modern CPUs rely heavily on caching of nearby values, and binary search jumps around the list instead of reading neighboring elements, so the cache generally doesn't help
def binarySearch(items, value):
    """searches the sorted list for the given value using a binary search algorithm
    and returns the index containing the desired value (None if it is absent)"""
    startIndex = 0
    endIndex = len(items) - 1
    while startIndex <= endIndex:
        median = startIndex + (endIndex - startIndex) // 2
        medianValue = items[median]
        if medianValue == value:
            print("the location of the value is:", median)
            return median
        elif medianValue > value:
            endIndex = median - 1
        else:
            startIndex = median + 1
    # the search range is empty, so the value is not in the list
    return None
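def recursiveBinarySearch(items, value, startIndex=0, endIndex=None):
    """searches the sorted list for the given value using a recursive binary search
    and returns the index containing the desired value (None if it is absent)"""
    if endIndex is None:
        endIndex = len(items) - 1
    # base case: the search range is empty
    if startIndex > endIndex:
        return None
    median = startIndex + (endIndex - startIndex) // 2
    medianValue = items[median]
    if medianValue == value:
        print("the location of the value is:", median)
        return median
    elif medianValue > value:
        # passing index bounds (instead of slices) keeps the returned index
        # valid for the original list
        return recursiveBinarySearch(items, value, startIndex, median - 1)
    else:
        return recursiveBinarySearch(items, value, median + 1, endIndex)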
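For comparison, Python's standard library already provides binary search through the bisect module. The sketch below is only an illustration; the helper name bisect_search and the sample data are made up for the example.

```python
import bisect

def bisect_search(sorted_items, value):
    """Return the index of value in sorted_items, or None if it is absent.

    sorted_items must already be sorted in ascending order.
    """
    # bisect_left returns the leftmost position where value could be inserted
    index = bisect.bisect_left(sorted_items, value)
    if index < len(sorted_items) and sorted_items[index] == value:
        return index
    return None

data = [2, 3, 5, 7, 11, 13]
print(bisect_search(data, 7))   # 3
print(bisect_search(data, 8))   # None
```

When the value appears more than once, this returns the leftmost matching index.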
Sorting Algorithms
Stable: equal keys aren't reordered
Adaptable (Adaptive): speeds up to O(n) when data is nearly sorted or when there are few unique keys
Memory Usage: the additional space in memory required to perform the sort beyond storing the initial list of items
sorting is generally considered a solved problem in computer science
the ideal sorting algorithm has the following properties:
    stable
    adaptable
    O(1) memory usage
    worst case O(n log n) comparisons
    worst case O(n) swaps
    minimum overhead
    maximum readability
no algorithm has all of these properties, so the best choice depends on the application
in practice, you can usually just use a built-in sorting function for most programming languages
the Python sorted() function uses Timsort, which is a hybrid of merge sort and insertion sort
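A short sketch of the built-in sorted() function, showing the stability property described above. The sample records are made up for the illustration.

```python
# each record is (name, count); several records share the same count
records = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 1)]

# sort by count only; records with equal counts keep their original order
by_count = sorted(records, key=lambda record: record[1])
print(by_count)

# reverse=True also preserves the original order of equal keys
descending = sorted(records, key=lambda record: record[1], reverse=True)
print(descending)
```

sorted() returns a new list and leaves the original untouched; list.sort() does the same work in place.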
[Sorting algorithm comparison table: the cell layout did not survive extraction. Surviving fragments: a "Lexicographical Sorting Algorithms" group listing MSD Radix Sort, LSD Radix Sort, and Bucket Sort, with entries including O(n) and O(n + r).]
Bogo Sort
1. randomly order the list
2. check if the list is in order, element by element
   (a) IF the list is sorted, THEN return the sorted list
   (b) IF the list is not sorted, repeat the above process until it is
Bogosort is the canonical example of a comically bad algorithm intended to be as inefficient as possible

import random

def bogoSort(items):
    """takes a list and returns the list sorted from smallest to largest"""
    # must shuffle the list first or it's a bug if the list was pre-sorted
    length = len(items)
    if length == 0 or length == 1:
        return items
    random.shuffle(items)
    while not inOrder(items):
        random.shuffle(items)
    return items

def inOrder(items):
    """returns True if the list is sorted from smallest to largest"""
    last = items[0]
    for element in items[1:]:
        if element < last:
            return False
        last = element
    return True
Bubble Sort
1. from beginning to end, compare adjacent pairs of items
   (a) IF in order, continue to the next element
       this means the pair indexes shift by one, so [1,2] becomes [2,3]
   (b) IF out of order, swap
2. repeatedly go through the list until no swaps are required
bubble sort isn't really used in practice

def bubbleSort(items):
    """takes a list and returns the list sorted from smallest to largest"""
    length = len(items)
    if length == 0 or length == 1:
        return items
    # keep going through the list
    for passIndex in range(length - 1):
        swapped = False
        # increment through the unsorted part of the list
        for increment in range(length - 1 - passIndex):
            # swap values if necessary
            if items[increment] > items[increment + 1]:
                items[increment], items[increment + 1] = items[increment + 1], items[increment]
                swapped = True
        # stop early once a full pass requires no swaps
        if not swapped:
            break
    return items
Selection Sort
1. append the smallest value of the given unsorted list to a new list
2. repeat until all of the values have been appended to the new sorted list
selection sort has bad big O performance, but is still useful because it minimizes the number of swaps, so if the cost of swaps is very high, it may be the best choice
def selectionSort(items):
    """takes a list and returns the list sorted from smallest to largest"""
    length = len(items)
    if length == 0 or length == 1:
        return items
    for outerIndex in range(length - 1):
        # find the smallest value in the unsorted portion of the list
        minIndex = outerIndex
        minValue = items[outerIndex]
        for innerIndex in range(outerIndex + 1, length):
            value = items[innerIndex]
            if value < minValue:
                minValue = value
                minIndex = innerIndex
        # swap values at minIndex and outerIndex
        items[minIndex] = items[outerIndex]
        items[outerIndex] = minValue
    return items
Insertion Sort
1. create a new list to put the elements into in the correct order
2. insert the first element of the original list into the sorted list
3. take the next element of the original list and insert it into the correct spot in the new list
   (a) increment through the sorted list, comparing the value to be inserted to the values already in the list
4. continue inserting elements from the original list into the sorted list until everything has been added to it
insertion sort has bad big O performance, but is still useful because it is adaptive and has low overhead

def insertionSort(items):
    """takes a list and returns the list sorted from smallest to largest"""
    length = len(items)
    if length == 0 or length == 1:
        return items
    # for every element in the unsorted portion of the list
    for unsortedIndex in range(1, length):
        # check against every element in the sorted portion of the list
        for insertionIndex in range(unsortedIndex, 0, -1):
            sortedElement = items[insertionIndex - 1]
            unsortedElement = items[insertionIndex]
            # if unsortedElement is in the correct spot
            if sortedElement <= unsortedElement:
                break
            # swap elements as necessary
            items[insertionIndex] = sortedElement
            items[insertionIndex - 1] = unsortedElement
    return items
Shell Sort
shell sort is essentially an improved version of insertion sort: it first insertion-sorts elements that are a large gap apart, then shrinks the gap until the final pass is an ordinary insertion sort on a nearly sorted list

def shellSort(items):
    """takes a list and returns the list sorted from smallest to largest"""
    length = len(items)
    if length == 0 or length == 1:
        return items
    # assumption: the gap starts at half the list length and is halved on each pass
    gap = length // 2
    while gap > 0:
        # insertion sort over elements that are gap positions apart
        for unsortedIndex in range(gap, length):
            unsortedElement = items[unsortedIndex]
            insertionIndex = unsortedIndex
            # shift larger elements up by one gap until the correct spot is found
            while insertionIndex >= gap and items[insertionIndex - gap] > unsortedElement:
                items[insertionIndex] = items[insertionIndex - gap]
                insertionIndex -= gap
            items[insertionIndex] = unsortedElement
        gap //= 2
    return items
Merge Sort
1. recursively divide a list in half
2. keep dividing until all of the pieces consist of either one or two elements
3. sort the lists of one or two elements each
4. recombine the lists into larger sorted lists
def mergeSort(items):
    """takes a list and returns a copy sorted from smallest to largest"""
    # corner cases
    length = len(items)
    if length == 0 or length == 1:
        return items
    else:
        middleIndex = length // 2
        # recursively break the lists down into single elements
        left = mergeSort(items[:middleIndex])
        right = mergeSort(items[middleIndex:])
        # call the merge method to merge the sorted left and right lists
        return merge(left, right)

def merge(left, right):
    """merges the two sorted lists into a sorted list"""
    merged = []
    leftIndex = 0
    rightIndex = 0
    leftLength = len(left)
    rightLength = len(right)
    # don't exit the loop, just go until a return statement is reached
    while True:
        # exit when one of the lists has been completely gone through
        # also can handle if one or both of the lists is empty
        if rightIndex >= rightLength:
            return merged + left[leftIndex:]
        elif leftIndex >= leftLength:
            return merged + right[rightIndex:]
        # add the next smallest element to the new list
        # <= takes from the left list on ties, which keeps the sort stable
        elif left[leftIndex] <= right[rightIndex]:
            merged.append(left[leftIndex])
            leftIndex = leftIndex + 1
        else:
            merged.append(right[rightIndex])
            rightIndex = rightIndex + 1
Quick Sort
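1. select an element to be the pivot (a random element works; the code below uses the last element)
2. copy the other elements into two sublists: those smaller than the pivot and those greater than or equal to it
3. recursively sort each sublist
4. join the sorted left sublist, the pivot, and the sorted right sublist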
def quickSort(items):
    """takes a list and returns a copy sorted from smallest to largest"""
    # corner cases
    length = len(items)
    if length == 0 or length == 1:
        return items
    else:
        pivot = items[-1]
        rest = items[:-1]
        left = []
        right = []
        for element in rest:
            if element < pivot:
                left.append(element)
            else:
                right.append(element)
        left = quickSort(left)
        right = quickSort(right)
        return left + [pivot] + right

def partition(items, startIndex, endIndex):
    # the body of this function is missing from the notes
    raise NotImplementedError
Bucket Sort
1. must be given the largest item in the list
2. all list elements must fall in a predictable number of discrete values, e.g. ints, chars, floats with only one decimal place, etc.
3. make a new list from 0 to the largest value
4. iterate through the list and increment the second list according to whatever value you read in the first list
def bucket_sort(items):
    """assuming items to be a list of integers"""
    length = len(items)
    if length == 0 or length == 1:
        return items
    big = items[0]      # starting to find the extreme values
    small = items[0]
    for term in items:  # to establish max range of possible values
        if term > big:
            big = term
        if term < small:
            small = term
    freq = []           # to hold frequencies for values
    for i in range(small, big + 1):
        freq.append(0)  # initialising freqs to be zero
    for term in items:
        freq[term - small] += 1  # incrementing freqs
    i = 0
    for loc in range(len(freq)):   # run through freq list
        count = freq[loc]          # get frequency of occurrence to see how often to repeat value
        for more in range(count):  # repeat the insertion of the value _count_ times
            items[i] = loc + small
            i += 1
    return items
Radix Sort
IF not all of the numbers have the same number of digits, THEN the corresponding digit places must be filled in with zeros so that all of the items have the same number of digits
from collections import deque

def radix_sort(my_list):
    """assuming my_list to be a list of integers, and we'll proceed base 10 for ...
    This could be vastly more efficient, but I've written the code to be high..."""
    if len(my_list) == 0:
        return my_list
    shifted_list = []   # to hold the values once shifted by the smallest value
    big = my_list[0]    # starting to find the extreme values
    small = my_list[0]
    for term in my_list:  # to establish max range of possible values
        if term > big:
            big = term
        if term < small:
            small = term
    spread = big - small  # the max range of numbers
    for value in my_list:
        shifted_list.append(value - small)  # so sorting a list whose smallest is ...
    print(shifted_list)
    base = 10
    radix = get_radix(spread, base)
    print("radix = " + str(radix))
    digitised = []  # to hold the digitised values of my_list relative to the base
    for value in shifted_list:
        digitised.append(long_get_digits(value, radix, base))
    print(digitised)
    buckets = []  # to hold the various queues of values
    for count in range(base):  # one bucket for each potential value of a digit
        # a deque from the standard library stands in for the Queuer queue
        # class, which is not defined in this section
        buckets.append(deque())
    for k in range(radix):
        # distribute by the k-th least significant digit
        for value in digitised:
            buckets[value[radix - k - 1]].append(value)
        for b in buckets:
            print(b)
        digitised = []  # empty and re-use
        for q in buckets:
            while q:
                digitised.append(q.popleft())
        print(digitised)
    my_list = []  # re-use this list
    for digital in digitised:
        my_list.append(reform_number(digital) + small)  # so unshifting the value
    return my_list

def get_radix(spread, base=10):  # default base is 10
    """assume spread > 0 and base > 1"""
    n = 1  # to explore the least power of base to exceed spread
    temp = base
    while spread >= temp:
        temp *= base
        n += 1  # to try the next power of base
    return n

def get_digits(value, base=10):
    radix = get_radix(value, base)
    digits = []  # to hold the digits of spread relative to the value of base
    for count in range(radix):
        digits.append(value % base)
        value = value // base
    digits.reverse()  # for human sanity!!
    return digits

def long_get_digits(value, radix, base=10):
    """to fill in with leading zeros as needed"""
    digits = get_digits(value, base)
    digits.reverse()  # easy trick to prepend the right number of zeros
    n = len(digits)
    for count in range(radix - n):
        digits.append(0)
    digits.reverse()
    return digits

def reform_number(digital, base=10):
    """assuming digital is a list of digits to that base"""
    radix = len(digital)
    temp_power = base
    temp = digital[radix - 1]
    for k in range(1, radix):
        temp += digital[radix - k - 1] * temp_power
        temp_power *= base
    return temp
Critical Path: the path with the greatest weight in a weighted, directed, acyclic graph
Diameter: for an unweighted graph, the maximum of all the shortest distances between pairs of vertices in the graph
Cycle: a path that begins and ends at the same vertex
Acyclic: a graph that has no cycles
Simple Cycle: a cycle that passes through other vertices only once each
Hamiltonian Graph: a graph that contains a Hamiltonian cycle
Hamiltonian Path (Traceable Path): a path in a directed or undirected graph that visits each vertex exactly once
Hamiltonian Cycle (Hamiltonian Circuit): a Hamiltonian path that is a cycle
all Hamiltonian graphs are biconnected graphs
not all biconnected graphs are Hamiltonian graphs
every platonic solid, considered as a graph, is Hamiltonian
every prism is Hamiltonian
a simple graph with n vertices (n ≥ 3) is Hamiltonian IF every vertex has degree n/2 or greater
a graph with n vertices (n ≥ 3) is Hamiltonian IF for every pair of non-adjacent vertices, the sum of their degrees is n or greater
Eulerian Graph: a graph that contains an Eulerian cycle
Eulerian Trail (Euler Walk): a path in an undirected graph that uses each edge exactly once
Eulerian Cycle (Eulerian Circuit) (Euler Tour): a cycle in an undirected graph that uses each edge exactly once
Topological Order:
Minimum Spanning Tree:
Graph Coloring: assigning a color to every vertex in a graph with the restriction that two vertices of the same color cannot be adjacent
all trees are graphs
not all graphs are trees
a tree is a connected acyclic graph
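The definition of a tree as a connected acyclic graph can be checked directly. The sketch below is only an illustration: the function name is_tree, the vertex numbering 0..n-1, and the edge-list format are assumptions made for the example. It relies on the fact that a connected undirected graph with n vertices and exactly n - 1 edges cannot contain a cycle.

```python
from collections import deque

def is_tree(vertex_count, edges):
    """Return True if the undirected graph is a tree.

    Vertices are assumed to be numbered 0 .. vertex_count - 1 and
    edges is a list of (a, b) pairs.
    """
    if vertex_count == 0:
        return False  # assumption: the empty graph is not counted as a tree
    # a tree on n vertices has exactly n - 1 edges
    if len(edges) != vertex_count - 1:
        return False
    neighbors = {vertex: [] for vertex in range(vertex_count)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # breadth-first search from vertex 0 to test connectivity
    seen = {0}
    queue = deque([0])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == vertex_count

print(is_tree(4, [(0, 1), (1, 2), (1, 3)]))  # True
print(is_tree(4, [(0, 1), (1, 2), (2, 0)]))  # False: has a cycle and vertex 3 is unreachable
```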
Graph Algorithms
Topological Sort
Kruskal's Algorithm
1. pick the edge with the lowest weight on the graph
   if two edges have the same weight, then the choice doesn't matter
   you are never allowed to pick edges that create a cycle
2. continue picking the edge with the next lowest weight until a minimum spanning tree is achieved
finds a minimum spanning tree on a weighted graph
Kruskal's algorithm is better than Prim's algorithm for
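The steps of Kruskal's algorithm can be sketched as follows. This is a minimal illustration, not the notes' own implementation: the function name kruskal, the (weight, u, v) edge format, the integer vertex numbering, and the union-find helper used to detect cycles are all assumptions made for the example.

```python
def kruskal(vertex_count, edges):
    """Return (total_weight, chosen_edges) for a minimum spanning tree.

    Vertices are assumed to be numbered 0 .. vertex_count - 1 and
    edges is a list of (weight, u, v) tuples for an undirected graph.
    """
    # union-find: parent[v] points toward the representative of v's component
    parent = list(range(vertex_count))

    def find(vertex):
        while parent[vertex] != vertex:
            parent[vertex] = parent[parent[vertex]]  # path halving
            vertex = parent[vertex]
        return vertex

    total = 0
    chosen = []
    # consider edges from lowest weight to highest
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        # an edge inside one component would create a cycle, so skip it
        if root_u != root_v:
            parent[root_u] = root_v
            chosen.append((u, v, weight))
            total += weight
            if len(chosen) == vertex_count - 1:
                break
    return total, chosen

example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal(4, example_edges))
```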
Prim's Algorithm
1. pick a random starting vertex
2. pick the edge from the starting vertex with the lowest weight
3. pick the edge with the lowest weight that joins an already connected vertex to a vertex that is not yet connected
4. repeat step 3 until every vertex is connected
finds a minimum spanning tree on a weighted graph
Prim's algorithm is better than Kruskal's algorithm for
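The steps of Prim's algorithm can be sketched with a priority queue from the standard heapq module. The function name prim and the adjacency format (a dict mapping each vertex to a list of (weight, neighbor) pairs) are assumptions made for the example, and the graph is assumed to be connected and undirected.

```python
import heapq

def prim(graph, start):
    """Return (total_weight, chosen_edges) for a minimum spanning tree.

    graph maps each vertex to a list of (weight, neighbor) pairs and is
    assumed to be connected and undirected.
    """
    visited = {start}
    # candidate edges leaving the connected part, as (weight, from, to)
    frontier = [(weight, start, neighbor) for weight, neighbor in graph[start]]
    heapq.heapify(frontier)
    total = 0
    chosen = []
    while frontier and len(visited) < len(graph):
        weight, u, v = heapq.heappop(frontier)
        if v in visited:
            continue  # this edge would create a cycle
        visited.add(v)
        total += weight
        chosen.append((u, v, weight))
        for next_weight, neighbor in graph[v]:
            if neighbor not in visited:
                heapq.heappush(frontier, (next_weight, v, neighbor))
    return total, chosen

example_graph = {
    0: [(1, 1), (4, 2)],
    1: [(1, 0), (3, 2), (2, 3)],
    2: [(4, 0), (3, 1), (5, 3)],
    3: [(2, 1), (5, 2)],
}
print(prim(example_graph, 0))
```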
Dijkstra's Algorithm
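1. assign the starting vertex a distance of zero
2. assign all the other vertexes a tentative distance value of infinity
3. for the current vertex, consider all of its unvisited neighbors and calculate the [distance to the current vertex] plus the [distance from the current vertex to the neighbor]
4. IF this is less than the neighbor's current tentative distance (infinity or otherwise), THEN replace it with the new value
5. mark the current vertex as visited, then make the unvisited vertex with the smallest tentative distance the new current vertex
6. repeat steps 3 to 5 until the destination has been visited or no reachable unvisited vertexes remain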
finds the shortest/cheapest path on a weighted graph
the weights must be non-negative
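A minimal sketch of Dijkstra's algorithm using the standard heapq module as the priority queue. The function name dijkstra and the adjacency format (a dict mapping each vertex to a list of (neighbor, weight) pairs, with every vertex present as a key) are assumptions made for the example.

```python
import heapq

def dijkstra(graph, start):
    """Return a dict of shortest distances from start to every vertex.

    graph maps each vertex to a list of (neighbor, weight) pairs;
    all weights are assumed to be non-negative.
    """
    # every vertex starts with a tentative distance of infinity
    distances = {vertex: float("inf") for vertex in graph}
    distances[start] = 0
    queue = [(0, start)]
    visited = set()
    while queue:
        # the unvisited vertex with the smallest tentative distance is next
        distance, current = heapq.heappop(queue)
        if current in visited:
            continue
        visited.add(current)
        for neighbor, weight in graph[current]:
            candidate = distance + weight
            if candidate < distances[neighbor]:
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances

example_graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 6)],
    "C": [("D", 3)],
    "D": [],
}
print(dijkstra(example_graph, "A"))
```

Vertexes that cannot be reached from the start keep a distance of infinity.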
Bellman-Ford Algorithm
the Bellman-Ford algorithm is slower than Dijkstra's algorithm, but can be used with negative edge weights
IF there are negative cycles, THEN there is no shortest/cheapest path because any path can always be made shorter by another walk through the negative cycle
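A minimal sketch of the Bellman-Ford algorithm, including the negative-cycle check described above. The function name bellman_ford, the integer vertex numbering, and the (u, v, weight) directed edge format are assumptions made for the example.

```python
def bellman_ford(vertex_count, edges, start):
    """Return a list of shortest distances from start, or None if a
    negative cycle is reachable from start.

    Vertices are assumed to be numbered 0 .. vertex_count - 1 and
    edges is a list of directed (u, v, weight) tuples.
    """
    distances = [float("inf")] * vertex_count
    distances[start] = 0
    # a shortest path uses at most vertex_count - 1 edges
    for _ in range(vertex_count - 1):
        changed = False
        for u, v, weight in edges:
            if distances[u] + weight < distances[v]:
                distances[v] = distances[u] + weight
                changed = True
        if not changed:
            break  # nothing improved, so the distances are final
    # if any edge can still be relaxed, a negative cycle is reachable
    for u, v, weight in edges:
        if distances[u] + weight < distances[v]:
            return None
    return distances

print(bellman_ford(4, [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)], 0))
print(bellman_ford(3, [(0, 1, 1), (1, 2, -2), (2, 1, 1)], 0))
```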
A* Search Algorithm
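A* is essentially Dijkstra's shortest path algorithm plus a heuristic to improve time performance
A* is only guaranteed to find the optimal path when the heuristic is admissible (it never overestimates the remaining cost); with an inaccurate heuristic that overestimates, it can return a suboptimal (but hopefully very good) path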
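A minimal sketch of A* on a small grid, to show where the heuristic enters the search. Everything here is an assumption made for the example: the function name a_star, the grid format (a list of strings where "#" is a wall), unit-cost moves in four directions, and the Manhattan-distance heuristic. On such a grid the Manhattan distance never overestimates the remaining cost, so in this setting the path length returned is the shortest one.

```python
import heapq

def a_star(grid, start, goal):
    """Return the length of a shortest path from start to goal, or None.

    grid is a list of equal-length strings where '#' marks a wall;
    start and goal are (row, column) tuples on open cells.
    """
    def heuristic(cell):
        # Manhattan distance to the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    # queue entries are (cost so far + heuristic, cost so far, cell)
    queue = [(heuristic(start), 0, start)]
    while queue:
        _, cost, current = heapq.heappop(queue)
        if current == goal:
            return cost
        if cost > best_cost[current]:
            continue  # a cheaper route to this cell was already found
        row, col = current
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (row + d_row, col + d_col)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(queue, (new_cost + heuristic(nxt), new_cost, nxt))
    return None

example_grid = [
    "..#.",
    "..#.",
    "....",
]
print(a_star(example_grid, (0, 0), (0, 3)))  # 7
```

With a heuristic that always returns zero, this search behaves like Dijkstra's algorithm.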
Ford-Fulkerson Algorithm
Bipartite Graph Matching
Kosaraju-Sharir Algorithm
Hopcroft-Karp Algorithm
Dinitz's Algorithm
File IO
File IO
it is generally advisable to read files line by line rather than all at once in case the input file is extremely large
it is a good practice to close a file as soon as you're done with it
raw_input(): read one line from standard input and return it as a string (without the trailing newline); Python 2 only, renamed input() in Python 3
input(): read one line from standard input and return it as an evaluated expression (Python 2 behavior; in Python 3, input() returns the line as a string)
print(): converts the passed expressions to strings and then writes them to standard output
open(<filename>, <r/w permission>): opens the specified file and returns a pointer to it
<file name>.close(): flushes any unwritten information from memory and closes the file object, after which no more writing can be done
<file name>.name: returns the name of the file
<file name>.mode: returns the access mode with which the file was opened
<file name>.closed: returns true if the file is closed and false if it is open
read(): read the entire file or, optionally, the specified number of characters or ...
readline(): read the next line
readlines(): read the next ...
write(<string>): write the string to the file
writelines(<list of strings>): write the list of strings to the file
truncate():
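A short sketch of the two pieces of advice above (read line by line, close the file promptly), assuming Python 3. The helper name count_nonblank_lines and the temporary example file are made up for the illustration. The with statement closes the file automatically as soon as the block ends, even if an error occurs.

```python
import os
import tempfile

def count_nonblank_lines(path):
    """Count the lines in a text file that contain non-whitespace characters."""
    count = 0
    # the file is closed automatically when the with block ends
    with open(path, "r") as handle:
        # iterating over the file object reads one line at a time,
        # so the whole file never has to fit in memory
        for line in handle:
            if line.strip():
                count += 1
    return count

with tempfile.TemporaryDirectory() as folder:
    example_path = os.path.join(folder, "example.txt")
    with open(example_path, "w") as handle:
        handle.writelines(["first\n", "\n", "second\n"])
    print(count_nonblank_lines(example_path))  # 2
```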
Useful Packages
numpy scipy matplotlib IPython PyQt Sage Cython MLabWrap RPy py2exe py2app wxPython Tkinter
GUIs
GUIs
Widget: one of the elements that makes up a GUI, including buttons, menus, text entry fields, etc.
Option: a value that controls the appearance or function of a widget
Keyword Argument: an argument that indicates the parameter name as part of the function call
Callback: a function associated with a widget that is called when the user performs an action
Bound Method: a method associated with a particular instance
Event-Driven Programming: a style of programming in which the flow of execution is determined by user actions
Event: a user action, like a mouse click or key press, that causes a GUI to respond
Event Loop: an infinite loop that waits for user actions and responds
Item: a graphical element on a Canvas widget
Bounding Box: a rectangle that encloses a set of items, usually specified by two opposing corners
Pack: to arrange and display the elements of a GUI
Geometry Manager: a system for packing widgets
Binding: an association between a widget, an event, and an event handler; the event handler is called when the event occurs in the widget
Tkinter GUIs
Tkinter Widgets
Button: creates buttons in an application
Canvas: creates a rectangular area intended for drawing pictures and other complex objects
Checkbutton: creates checkbox buttons in an application
Entry: creates a textbox for the user to input a single-line string of text
Frame: creates rectangular areas in the screen to organize the layout and to provide padding for other widgets
Label: creates a display box for text and/or images
Listbox: creates a box of selectable lines of text
Menubutton: creates a button to open a drop-down menu
Menu: creates a pop-up, toplevel, or pull-down menu
Message: creates a non-editable display box similar to Label, but with automatic line breaking and justification of the contents
Radiobutton: creates a multiple choice radio button
Scale: creates a graphical sliding object to select a value for a variable
Scrollbar: creates vertical and horizontal scrollbars
Text: creates a display box with advanced text editing abilities
Toplevel: creates a window to put other widgets in
Spinbox: creates a box to select a value with direct input or clickable up and down arrows
PanedWindow: creates a window pane inside a larger window
LabelFrame: creates a container or spacer for other widgets with the properties of both a Frame and a Label
tkMessageBox: creates a pop-up message box
wxPython GUIs
import wx

class TestFrame(wx.Frame):
    def __init__(self, parent, title):
        wx.Frame.__init__(self, parent, wx.ID_ANY, title=title)
        # static text label shown inside the frame
        text = wx.StaticText(self, label="Hello World!")

app = wx.App(redirect=False)
frame = TestFrame(None, "Hello World!")
frame.Show()
# start the event loop; this call blocks until the window is closed
app.MainLoop()
Event-Driven Programming