Graph 1
Introduction and definitions
Typical network modelling applications
Classic graph theory problems and proofs
  The Seven Bridges of Königsberg
  The four colour map colouring theorem
  The three cottage problem
Data structures used for storing graphs
  Incidence and adjacency lists and matrices
Definitions
Graph theory concerns the study of networks based on a mathematical abstraction of the form of a graph. A graph is made up of vertices (singular: vertex) and edges. An edge connects exactly 2 vertices. In the special case where both ends of an edge are the same vertex, the edge is called a loop or loopback.
A vertex can have any number of edges connected to it. Edges may be directed, in which case they are drawn as arrows to indicate that the network flow, traffic or connection represented by the edge works in one direction only. In an undirected graph the edges are bidirectional. Edges may be associated with a numeric cost. The meaning of edge cost will depend upon the graph application.
More definitions
A path is a route through a graph visiting vertices and edges in turn. A cycle is a path that ends at the starting vertex. A Hamiltonian path or cycle visits all vertices exactly once. An Eulerian path or cycle visits all edges exactly once.
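The definitions above can be checked mechanically. The following is a minimal sketch, assuming an undirected graph whose vertices are numbered 0 to V-1 and whose edges are given as pairs of vertex numbers. The function names are illustrative and do not come from any library.

```python
def is_path(vertex_sequence, edges):
    """True if every consecutive pair of vertices is joined by an undirected edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset((a, b)) in edge_set
               for a, b in zip(vertex_sequence, vertex_sequence[1:]))


def is_cycle(vertex_sequence, edges):
    """True if the sequence is a path that ends at its starting vertex."""
    return (len(vertex_sequence) > 1
            and vertex_sequence[0] == vertex_sequence[-1]
            and is_path(vertex_sequence, edges))


def is_hamiltonian_path(vertex_sequence, num_vertices, edges):
    """True if the sequence is a path that visits every vertex exactly once."""
    return (is_path(vertex_sequence, edges)
            and sorted(vertex_sequence) == list(range(num_vertices)))


# Example graph (an assumption for illustration): a square 0-1-2-3-0.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_hamiltonian_path([0, 1, 2, 3], 4, square))  # True
print(is_cycle([0, 1, 2, 3, 0], square))             # True
print(is_hamiltonian_path([0, 2, 1, 3], 4, square))  # False: 0 and 2 are not adjacent
```

Verifying a proposed Hamiltonian path is cheap, as here. Finding one in an arbitrary graph is a much harder problem and is not attempted in this sketch.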
A graph is simple if no edge is a loop, and no 2 edges have the same endpoints. A planar graph can be drawn on a plane surface, e.g. a piece of paper, so that no edges cross over other edges.
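The test for a simple graph translates directly into code. This is a sketch for an undirected graph stored as a list of vertex pairs. The name is_simple and the example edge lists are assumptions made for illustration.

```python
def is_simple(edges):
    """True if no edge is a loop and no two edges share the same endpoints."""
    seen = set()
    for u, v in edges:
        if u == v:                 # a loop joins a vertex to itself
            return False
        endpoints = frozenset((u, v))
        if endpoints in seen:      # a second edge between the same two vertices
            return False
        seen.add(endpoints)
    return True


print(is_simple([(0, 1), (1, 2), (0, 2)]))  # True
print(is_simple([(0, 1), (1, 1)]))          # False: (1, 1) is a loop
print(is_simple([(0, 1), (1, 0)]))          # False: both edges join 0 and 1
```

Using frozenset for the endpoints treats (0, 1) and (1, 0) as the same undirected edge. For a directed graph the ordered pair would be stored instead.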
A directed acyclic graph is a directed graph without cycles, i.e. you can't get back to the vertex where you started by following edges in their defined direction. This is a way to map a forest of trees so that subtrees can be shared between trees. Sources are vertices all of whose edges lead out from them and sinks are vertices all of whose edges lead into them. A tree is a graph which is connected, acyclic and simple.
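Sources and sinks can be found by counting edges into and out of each vertex. The sketch below assumes vertices numbered 0 to V-1 and directed edges given as (tail, head) pairs. It does not check that the graph is acyclic, and it treats a vertex with no edges at all as neither a source nor a sink, which is a choice made for this example only.

```python
def sources_and_sinks(num_vertices, edges):
    """Return (sources, sinks) for a directed graph given as (tail, head) pairs."""
    in_degree = [0] * num_vertices
    out_degree = [0] * num_vertices
    for tail, head in edges:
        out_degree[tail] += 1
        in_degree[head] += 1
    # A source has outgoing edges only; a sink has incoming edges only.
    sources = [v for v in range(num_vertices)
               if in_degree[v] == 0 and out_degree[v] > 0]
    sinks = [v for v in range(num_vertices)
             if out_degree[v] == 0 and in_degree[v] > 0]
    return sources, sinks


# Example DAG (an assumption for illustration): 0 and 1 feed 2, which feeds 3 and 4.
example_edges = [(0, 2), (1, 2), (2, 3), (2, 4)]
sources, sinks = sources_and_sinks(5, example_edges)
print(sources)  # [0, 1]
print(sinks)    # [3, 4]
```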
Terms to remember
graph, vertex (pl. vertices), edge, edge cost, undirected graph, directed graph, simple graph, loop
path, cycle, Hamiltonian, Eulerian, planar graph, directed acyclic graph, forest, tree
"The Seven Bridges of Knigsberg is a problem inspired by an actual place and situation. The city of Kaliningrad, Russia (at the time, Knigsberg, Germany) is set on the Pregolya River, and included two large islands which were connected to each other and the mainland by seven bridges. The question is whether it is possible to walk with a route that crosses each bridge exactly once, and return to the starting point. In 1736, Leonhard Euler proved that it was not possible."
Source: http://en.wikipedia.org/wiki/Seven_Bridges_of_Koenigsberg
"The problem can be modified to ask for a path that traverses all bridges but does not have the same starting and ending point. Such a walk is called an Eulerian trail or Euler walk. Such a path exists if and only if the graph has exactly two nodes of odd degree, those nodes being the starting and ending points. (So this too was impossible for the seven bridges of Knigsberg.)"
"The three cottage problem is a well-known mathematical puzzle. It can be stated like this: Suppose there are three cottages on a plane (or sphere) and each needs to be connected to the gas, water, and electric companies. Is there a way to do so without any of the lines crossing each other?"
Incidence list
The connectivity of an edge is regular in that each edge represents a connection between exactly 2 vertices. This lends itself to fixed-size record designs, e.g. an array of records known as an incidence list. The coding in the next slide uses pointers to the vertex records, as a vertex may appear in any number of edge records. Alternatively, if the vertices are stored in an array which is populated before the edge records are created, and is unchanged during the runtime of the application, the edge records could store the integer array indices. Which approach is optimal depends upon how vertex records are stored and how this collection of data is searched and updated.
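As an illustration of the array-index variant described above, here is a minimal sketch of an incidence list. The record layouts (Vertex, Edge) and the helper edges_at are assumptions made for this example, not the coding used in the slides.

```python
from dataclasses import dataclass


@dataclass
class Vertex:
    name: str


@dataclass
class Edge:
    tail: int          # index into the vertices array
    head: int          # index into the vertices array
    cost: float = 1.0  # optional edge cost


# The vertex array is populated first and then left unchanged,
# so the integer indices stored in the edge records stay valid.
vertices = [Vertex("A"), Vertex("B"), Vertex("C")]
incidence_list = [Edge(0, 1, 4.0), Edge(1, 2, 2.5), Edge(0, 2, 7.0)]


def edges_at(vertex_index, edge_records):
    """Scan the whole incidence list for edges touching one vertex."""
    return [e for e in edge_records if vertex_index in (e.tail, e.head)]


for edge in edges_at(1, incidence_list):
    print(vertices[edge.tail].name, vertices[edge.head].name, edge.cost)
```

Each edge is a fixed-size record, but finding the edges at a given vertex means scanning every record.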
Adjacency list
The connectivity of a vertex is irregular, in the sense that each vertex can be connected using a variable number of edges. Vertex information can be stored in a fixed-width record containing the head pointer of a linked list. Each linked list item represents an edge. It will contain a reference (e.g. an array index, key or pointer) to edge record storage so that edge data isn't duplicated. If the edge records do not themselves identify the connected vertices, the linked list nodes must also reference the vertices at the other ends of the edges. This approach is known as an adjacency list.
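A sketch of an adjacency list follows. It uses a Python list per vertex in place of a hand-built linked list, which is a simplification for this example. Each entry holds the index of the edge record plus the vertex at the other end. The function name and example graph are assumptions.

```python
def build_adjacency_list(num_vertices, edges):
    """Build per-vertex lists of (edge_index, neighbour) for an undirected graph."""
    adjacency = [[] for _ in range(num_vertices)]
    for edge_index, (u, v) in enumerate(edges):
        adjacency[u].append((edge_index, v))
        if u != v:  # record a loop only once at its single vertex
            adjacency[v].append((edge_index, u))
    return adjacency


# Edge records are stored once; the adjacency list only refers to them by index.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
adjacency = build_adjacency_list(4, edges)

print(adjacency[2])  # [(1, 1), (2, 0), (3, 3)]
print(adjacency[3])  # [(3, 2)]
```

Finding the neighbours of a vertex now takes time proportional to its number of edges, rather than a scan of every edge record.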
Incidence matrix
This structure is a 2 dimensional array where the rows index the edges, and columns index the vertices. In the simplest implementation the array element is boolean, with a 1 indicating a connection and a 0 indicating no connection. The advantage of this structure is that it enables data to be rapidly indexed, either for vertices or for edges. The disadvantage is that memory proportional to V x E is required, where V is the number of vertices, and E the number of edges. List or array based structures require memory proportional to V + E.
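The layout can be sketched with nested lists, with one row per edge and one column per vertex. The function name and the small example graph are assumptions for illustration, and the graph is taken to be undirected.

```python
def build_incidence_matrix(num_vertices, edges):
    """Return an E x V matrix of 0s and 1s for an undirected graph."""
    matrix = [[0] * num_vertices for _ in edges]
    for row, (u, v) in enumerate(edges):
        matrix[row][u] = 1
        matrix[row][v] = 1
    return matrix


edges = [(0, 1), (1, 2), (0, 2)]
incidence = build_incidence_matrix(3, edges)
for row in incidence:
    print(row)
# [1, 1, 0]
# [0, 1, 1]
# [1, 0, 1]

# Index by edge: which vertices does edge 1 connect?
print([v for v, bit in enumerate(incidence[1]) if bit])    # [1, 2]
# Index by vertex: which edges touch vertex 0?
print([e for e, row in enumerate(incidence) if row[0]])    # [0, 2]
```

The matrix holds V x E entries regardless of how many of them are 1, which is the memory cost noted above.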
Adjacency matrix
This is a V x V 2D array where V is the number of vertices. In the simplest case, if the data present at index [i][j] is a 1 this indicates a (possibly directed) edge from i to j, while a 0 indicates there being no such edge. If the edges have costs, this could be a floating point number present at index [i][j], with a sentinel value that cannot occur as a real cost, e.g. -1 when costs are non-negative or 0 when costs are strictly positive, indicating lack of an edge from vertex i to vertex j. This data structure makes it easier to search for subgraphs of particular patterns or identities, e.g. K3,3 or K5.
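A sketch of the cost-carrying variant follows. It assumes all real costs are non-negative, so that -1 can safely mark a missing edge. The names NO_EDGE and build_adjacency_matrix are illustrative.

```python
NO_EDGE = -1.0  # sentinel; valid only because costs here are assumed non-negative


def build_adjacency_matrix(num_vertices, weighted_edges, directed=True):
    """Return a V x V matrix with matrix[i][j] holding the cost of edge i -> j."""
    matrix = [[NO_EDGE] * num_vertices for _ in range(num_vertices)]
    for i, j, cost in weighted_edges:
        matrix[i][j] = cost
        if not directed:
            matrix[j][i] = cost  # an undirected edge is stored in both directions
    return matrix


weighted_edges = [(0, 1, 4.0), (1, 2, 2.5)]
costs = build_adjacency_matrix(3, weighted_edges)

print(costs[0][1])             # 4.0
print(costs[1][0] == NO_EDGE)  # True: the edge is directed from 0 to 1 only
```

Testing for an edge between two given vertices is a single array lookup, at the price of V x V storage even when the graph has few edges.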
Conclusions
Graph theory enables us to study and model networks, and to solve some difficult problems that can naturally be modelled using networks. Various terms, e.g. vertex and edge, have special meanings in graph theory. These meanings need to be understood and remembered in order to apply graph theoretic approaches to solving problems.
When solving a problem by developing a graph-based program, careful attention must be given at the design stage to the structuring of data to help make solving the problem tractable, to enable linkages to be traced efficiently and to avoid duplication of data.