2. Instructor
Prof. Amrinder Arora
amrinder@gwu.edu
Please copy TA on emails
Please feel free to call as well
TA
Iswarya Parupudi
iswarya2291@gwmail.gwu.edu
CS 6213 – Arora – L2 Advanced Data Structures - Graphs 2
LOGISTICS
3. CS 6213
Basics
Record / Struct
Arrays / Linked Lists / Stacks / Queues
Graphs / Trees / BSTs
Advanced
Trie, B-Tree
Splay Trees
R-Trees
Heaps and PQs
Union Find
4. Graphs – Basics
Degrees, Number of Edges, Min/Max Degree
Kinds of Graphs
How to Represent in Data Structures
Trees, Paths, Cycles
Journeys
Topological Sorting, DAGs
AGENDA
5. A graph G=(V,E) consists of a finite set V, which is the
set of vertices, and a set E, which is the set of edges.
Each edge in E connects two vertices v1 and v2,
both of which are in V.
A graph can be directed or undirected
Not to be confused with a bar graph!!!!
GRAPH
6. A core data structure that shows up in many
circumstances:
Transportation and Logistics (paths, etc.)
Circuit Design
Social Networking – Model connections between people
Sociology – Model influence
Zoology and Wildlife
Project Task Management
Job Scheduling and Resource Assignment (matching)
Time table scheduling
Task parallelization (graph coloring)
GRAPH – APPLICATIONS
7. (Undirected) Degree of a node
(Directed) Indegree / Outdegree
Min degree in a graph: δ(G), the smallest degree over all nodes
Max degree in a graph: Δ(G), the largest degree over all nodes
Basic observations
(Undirected) Sum of degrees = 2 x number of edges
(Directed) Sum of indegree = Sum of outdegree = number of
edges
DEGREE
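The two degree observations can be checked on a small example. This is a minimal sketch, assuming the graph is given as a list of (u, v) pairs; the function names and the sample graph are hypothetical, not part of the lecture. Isolated nodes (degree 0) do not appear in an edge list, so they would have to be added separately.

```python
from collections import Counter

def undirected_degrees(edges):
    """Return {node: degree} for an undirected graph given as (u, v) pairs."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1  # each edge contributes 1 to both endpoints
        deg[v] += 1
    return deg

def directed_degrees(edges):
    """Return (indegree, outdegree) counters; (u, v) means an edge u -> v."""
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg

# Hypothetical example graph on nodes a, b, c, d
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]

deg = undirected_degrees(edges)
min_degree = min(deg.values())  # degree of d
max_degree = max(deg.values())  # degree of c

indeg, outdeg = directed_degrees(edges)

# Sum of degrees = 2 x number of edges
print(sum(deg.values()), 2 * len(edges))
# Sum of indegree = sum of outdegree = number of edges
print(sum(indeg.values()), sum(outdeg.values()), len(edges))
```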
8. If (x,y) is an edge, then x is said to be adjacent to y, and y is adjacent
from x.
In the case of undirected graphs, if (x,y) is an edge, we just say that x
and y are adjacent (or x is adjacent to y, or y is adjacent to x). Also, we
say that x is a neighbor of y.
The indegree of a node x is the number of nodes adjacent to x
The outdegree of a node x is the number of nodes adjacent from x
The degree of a node x in an undirected graph is the number of
neighbors of x
A path from a node x to a node y in a graph is a sequence of nodes x,
x1, x2, ..., xn, y, such that x is adjacent to x1, x1 is adjacent to x2, ..., and xn
is adjacent to y.
The length of a path is the number of its edges.
A cycle is a path that begins and ends at the same node
The distance from node x to node y is the length of the shortest path
from x to y.
GRAPH DEFINITIONS
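The definition of distance (length of the shortest path, counted in edges) can be illustrated with a breadth-first search. This is a sketch under the assumption that the graph is unweighted and stored as a dict mapping each node to the list of nodes adjacent from it; the name `distance` and the sample graph are hypothetical.

```python
from collections import deque

def distance(adj, source, target):
    """Number of edges on a shortest path from source to target, or None."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        if x == target:
            return dist[x]
        for y in adj.get(x, []):
            if y not in dist:          # first visit is via a shortest path
                dist[y] = dist[x] + 1
                queue.append(y)
    return None                        # target is not reachable from source

# Hypothetical directed graph: 1 -> 2 -> 4 -> 5 and 1 -> 3 -> 4
adj = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}

print(distance(adj, 1, 5))  # 3
print(distance(adj, 5, 1))  # None
```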
9. Using a matrix A[1..n,1..n] where A[i,j] = 1 if (i,j) is an
edge, and is 0 otherwise. This representation is called
the adjacency matrix representation. If the graph is
undirected, then the adjacency matrix is symmetric about
the main diagonal.
Using an array Adj[1..n] of pointers, where Adj[i] is a
linked list of the nodes adjacent from i (for an undirected
graph, the neighbors of i).
The matrix representation requires more memory, since it
has a matrix cell for each possible edge (n x n cells),
whether that edge exists or not. In adjacency list
representation, the space used is proportional to the
number of nodes plus the number of edges.
If the graph is sparse (very few edges), then adjacency
list may be a more efficient choice.
GRAPH REPRESENTATIONS
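Both representations can be built from the same edge list. This is a minimal sketch, assuming nodes are numbered 0..n-1 (the slide uses 1..n) and using Python lists in place of linked lists; the function names are hypothetical.

```python
def to_adjacency_matrix(n, edges, directed=False):
    """n x n matrix with A[i][j] = 1 if (i, j) is an edge, 0 otherwise."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1   # undirected: symmetric about the main diagonal
    return A

def to_adjacency_list(n, edges, directed=False):
    """adj[i] lists the nodes adjacent from i."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        if not directed:
            adj[j].append(i)
    return adj

n = 4
edges = [(0, 1), (0, 2), (2, 3)]
A = to_adjacency_matrix(n, edges)
adj = to_adjacency_list(n, edges)

print(sum(len(row) for row in A))       # n * n cells, regardless of edges
print(sum(len(lst) for lst in adj))     # 2 entries per undirected edge
```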
10. A very practical choice is to use graphing libraries,
such as:
JGraphT (Java)
Boost (C++)
GraphStream (Java)
JUNG (Java)
GRAPH REPRESENTATION (CONT.)
11. Graphs can be characterized in many ways. Two
important ones are:
Directed or Undirected
Weighted or Unweighted
Both Adjacency Matrix (AM) and Adjacency List (AL)
representations can be used for graphs – weighted
or unweighted, directed or undirected.
A[i,j] = A[j,i] if graph is undirected. So, we could decide to use
just the upper triangle.
If graph is weighted, in adjacency list, we can also store the
weight. AL[i] = [(j,w(i,j)), (k,w(i,k)), …]
GRAPH CHARACTERIZATIONS
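The two storage ideas on this slide can be sketched as follows: an undirected adjacency matrix that keeps only the upper triangle (including the diagonal), and a weighted adjacency list holding (neighbor, weight) pairs. The class and function names are hypothetical, nodes are assumed to be numbered 0..n-1, and the packed row-major layout is one possible choice, not necessarily the one intended in the lecture.

```python
class UpperTriangleMatrix:
    """Undirected adjacency matrix storing only cells with i <= j."""

    def __init__(self, n):
        self.n = n
        self.cells = [0] * (n * (n + 1) // 2)

    def _index(self, i, j):
        if i > j:
            i, j = j, i  # A[i,j] = A[j,i], so map to the upper triangle
        # Rows 0..i-1 hold n, n-1, ..., n-i+1 cells respectively
        return i * self.n - i * (i - 1) // 2 + (j - i)

    def set(self, i, j, value=1):
        self.cells[self._index(i, j)] = value

    def get(self, i, j):
        return self.cells[self._index(i, j)]

def weighted_adjacency_list(n, weighted_edges):
    """AL[i] = [(j, w(i,j)), ...] for an undirected weighted graph."""
    al = [[] for _ in range(n)]
    for i, j, w in weighted_edges:
        al[i].append((j, w))
        al[j].append((i, w))
    return al

m = UpperTriangleMatrix(3)
m.set(2, 0, 7)            # stored once, readable in either order
al = weighted_adjacency_list(3, [(0, 1, 2.5), (1, 2, 4.0)])
```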
12. A tree is a connected acyclic graph (i.e., it has no
cycles)
Rooted tree: A tree in which one node is designated
as a root (the top node)
TREE
Example:
Node A is root node
F and D are child nodes of A.
P and Q are child nodes of J.
Etc.
13. Definitions
Leaf is a node that has no children
Ancestors of a node x are all the nodes on the path from x to
the root, including x and the root
Subtree rooted at x is the tree consisting of x, its children and
their children, and so on and so forth all the way down
Height of a tree is the maximum distance from the root to any
node
TREE (CONT.)
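These definitions can be tried on a small rooted tree. This is a sketch only: the tree below reuses the node names A, F, D, J, P, Q, but placing J under F is an assumption made for illustration, and the helper names are hypothetical.

```python
# Assumed tree: A is the root; F and D are children of A;
# J is (by assumption) a child of F; P and Q are children of J.
children = {"A": ["F", "D"], "F": ["J"], "D": [], "J": ["P", "Q"], "P": [], "Q": []}
parent = {c: p for p, cs in children.items() for c in cs}

def is_leaf(x):
    """A leaf is a node that has no children."""
    return not children[x]

def ancestors(x):
    """All nodes on the path from x to the root, including x and the root."""
    path = [x]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def subtree(x):
    """x, its children, their children, and so on all the way down."""
    nodes = [x]
    for c in children[x]:
        nodes.extend(subtree(c))
    return nodes

def height(x):
    """Maximum distance (in edges) from x to any node in its subtree."""
    return 0 if is_leaf(x) else 1 + max(height(c) for c in children[x])

print(ancestors("P"))   # ['P', 'J', 'F', 'A']
print(height("A"))      # 3
```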
14. Option 1
•Trees are graphs, so we can use standard graph representation: Adjacency Matrix or Adjacency List
Option 2
•Use parent node and list for child nodes
Option 3
•Use parent node, and two pointers: one for first child and the other for nextSibling
REPRESENTING TREES
3 Basic Options
15. We can use standard graph representation –
Adjacency Matrix or Adjacency List
These options are overkill for trees, but in some
instances, the graph is not known to be a tree
beforehand, so this option works.
TREE REPRESENTATION – OPTION 1
16. Rather than using Adjacency Matrix
or Adjacency List representation,
we can use a simpler representation
Each node has a pointer to parent,
and a linked list of child nodes
The root node’s parent pointer is null
This representation works for “rooted” trees. If the
tree is not rooted, we can designate one node as
root.
TREE REPRESENTATION – OPTION 2
Node {
    Node parent,             // null for the root node
    List<Node> childNodes    // empty for a leaf node
}
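The Node structure above could be written out as follows. This is a minimal sketch: the `label` field and the `add_child` and `depth` helpers are additions for illustration, not part of the slide.

```python
class Node:
    """Option 2: each node keeps a parent pointer and a list of child nodes."""

    def __init__(self, label):
        self.label = label        # assumed payload, for illustration only
        self.parent = None        # None for the root node
        self.child_nodes = []     # empty for a leaf node

    def add_child(self, child):
        """Attach child under this node and return it."""
        child.parent = self
        self.child_nodes.append(child)
        return child

    def depth(self):
        """Number of edges from this node up to the root."""
        d, node = 0, self
        while node.parent is not None:
            node = node.parent
            d += 1
        return d

root = Node("A")
f = root.add_child(Node("F"))
j = f.add_child(Node("J"))
print(j.depth())  # 2
```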
17. Another representation for trees is the Left
Child, Right Sibling representation.
Each node has 3 pointers:
Parent – Points to parent (null if this is the root
node)
Left pointer – Points to first child (null if this is
a leaf node)
Right pointer – Points to right sibling (null if no
more siblings)
For example: [figure: a tree and its left-child, right-sibling representation]
TREE REPRESENTATION – OPTION 3
Node {
    Node parent,         // null for the root node
    Node firstChild,     // null for a leaf node
    Node nextSibling     // null if there are no more siblings
}
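A sketch of the three-pointer structure above, with a helper that walks the sibling chain to enumerate children. The `label` field and the `add_child` and `children` helpers are hypothetical additions for illustration.

```python
class Node:
    """Option 3: parent pointer, first-child pointer, next-sibling pointer."""

    def __init__(self, label):
        self.label = label          # assumed payload, for illustration only
        self.parent = None          # None for the root node
        self.first_child = None     # None for a leaf node
        self.next_sibling = None    # None if there are no more siblings

    def add_child(self, child):
        """Append child at the end of this node's sibling chain."""
        child.parent = self
        if self.first_child is None:
            self.first_child = child
        else:
            last = self.first_child
            while last.next_sibling is not None:
                last = last.next_sibling
            last.next_sibling = child
        return child

    def children(self):
        """Yield the children by following next_sibling from first_child."""
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling

root = Node("A")
f = root.add_child(Node("F"))
d = root.add_child(Node("D"))
print([c.label for c in root.children()])  # ['F', 'D']
```

Each node uses a fixed number of pointers regardless of how many children it has, at the cost of a linear walk to reach the k-th child.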
18. When maintaining derived values in a tree, such as the size
or the weight of the subtree rooted at a node, there are two
main methods:
Recompute whenever there is a modify operation (addition of a
node, deletion of a node, changing the weight of a node, etc.).
Recomputations usually only need to propagate from the changed
node upwards to the root. [Advantage: Values are always up to date.
Disadvantages: A lot of time is spent in recomputation; the method
cannot be run concurrently.]
Use a dirty flag and set it to true when there is a change. Set the
dirty flag to true for all ancestors as well (navigate to the parent,
until the root). Recompute when there is a need, or as per a schedule.
[Advantages: Efficient; methods (other than the “recompute”
operation) can be run concurrently. Disadvantage: Values are out
of date at times.]
UPDATING VALUES
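The dirty-flag method can be sketched for subtree weights as follows. This is an illustrative, single-threaded sketch, not a concurrent implementation; the class layout and method names are assumptions.

```python
class Node:
    def __init__(self, weight=1):
        self.weight = weight
        self.parent = None
        self.children = []
        self.subtree_weight = weight   # cached value, may be stale
        self.dirty = False

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        self._mark_dirty()
        return child

    def set_weight(self, weight):
        self.weight = weight
        self._mark_dirty()

    def _mark_dirty(self):
        # Flag this node and all ancestors: navigate to parent, until the root
        node = self
        while node is not None:
            node.dirty = True
            node = node.parent

    def get_subtree_weight(self):
        # Recompute only when needed, and only along dirty parts of the tree
        if self.dirty:
            self.subtree_weight = self.weight + sum(
                c.get_subtree_weight() for c in self.children
            )
            self.dirty = False
        return self.subtree_weight

root = Node(1)
a = root.add_child(Node(2))
b = a.add_child(Node(3))
print(root.get_subtree_weight())  # 6
b.set_weight(10)                  # cheap: only sets flags on b, a, root
print(root.subtree_weight)        # 6 (cached value is out of date)
print(root.get_subtree_weight())  # 13
```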
19. Can you hold the entire graph in memory?
If your graph is large and changing fast, like the
Facebook interconnection graph, you simply cannot
hold it in memory using traditional methods. You
need to replicate it across multiple servers and use
reliable services to get partial data out from the
graph. Your graph may simply be backed by a
database with which you interact directly (without
ever loading a complete “graph” object.)
LARGE GRAPHS
20. Given each of the graph representations, how do we
find a path?
Shortest path algorithms
Dijkstra
All Pairs Shortest Paths
[Refer to CS 6212 Notes for details]
PATHS
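As a reminder of how Dijkstra's algorithm looks on the weighted adjacency list representation, here is a minimal sketch. It assumes non-negative edge weights and a dict mapping each node to a list of (neighbor, weight) pairs; the sample graph is hypothetical, and the CS 6212 notes remain the reference for details.

```python
import heapq

def dijkstra(al, source):
    """Shortest distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist.get(x, float("inf")):
            continue                      # stale heap entry, already improved
        for y, w in al.get(x, []):
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return dist

# Hypothetical weighted directed graph
al = {
    "s": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("t", 6)],
    "a": [("t", 1)],
    "t": [],
}
print(dijkstra(al, "s"))  # {'s': 0, 'a': 3, 'b': 1, 't': 4}
```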
21. Path, spread over time
Useful concept if the graph changes over time
Specifically, consider this scenario:
Edge e1 existed from x to y at time t1
Edge e2 existed from y to z at time t2
t1 < t2
Then, we say that there exists a journey from x to z
JOURNEY
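The two-edge scenario generalizes to longer chains of edges with strictly increasing times. The sketch below checks whether a journey exists, assuming each edge is a directed triple (u, v, t) that exists only at time t; the function name and the strictly-increasing rule for longer journeys are assumptions extrapolated from the t1 < t2 condition.

```python
def has_journey(timed_edges, source, target):
    """True if target can be reached from source along edges whose
    timestamps strictly increase along the way."""
    # arrival[x] = earliest time at which we can be standing at x
    arrival = {source: float("-inf")}
    for u, v, t in sorted(timed_edges, key=lambda e: e[2]):
        # Edges are scanned in time order, so the first arrival is the earliest
        if u in arrival and arrival[u] < t and v not in arrival:
            arrival[v] = t
    return target in arrival

# e1 from x to y at time 1, e2 from y to z at time 2: journey x -> z exists
print(has_journey([("x", "y", 1), ("y", "z", 2)], "x", "z"))  # True
# Same edges in the wrong time order: no journey
print(has_journey([("x", "y", 2), ("y", "z", 1)], "x", "z"))  # False
```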
22. How can we detect cycles in a graph?
CYCLES
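One common answer for directed graphs is a depth-first search that looks for an edge back to a node still on the current search path. This is a sketch of that approach, not necessarily the one discussed in class; it assumes every node appears as a key of the adjacency dict, and it uses recursion, so very deep graphs would need an iterative version.

```python
def has_cycle(adj):
    """True if the directed graph adj (node -> list of successors) has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited, on current path, finished
    color = {x: WHITE for x in adj}

    def visit(x):
        color[x] = GRAY
        for y in adj[x]:
            if color[y] == GRAY:       # back edge to the current path: cycle
                return True
            if color[y] == WHITE and visit(y):
                return True
        color[x] = BLACK
        return False

    return any(color[x] == WHITE and visit(x) for x in adj)

print(has_cycle({1: [2], 2: [3], 3: [1]}))      # True
print(has_cycle({1: [2, 3], 2: [3], 3: []}))    # False
```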
23. If the graph is a Directed Acyclic Graph (DAG), a
topological ordering can be produced in linear time,
O(n + m), where n is the number of nodes and m is the
number of edges.
TOPOLOGICAL SORTING
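One way to reach the O(n + m) bound is to repeatedly remove nodes of indegree 0 (often called Kahn's algorithm). The sketch below assumes an adjacency dict in which every node appears as a key; the function name and the sample DAG are hypothetical.

```python
from collections import deque

def topological_order(adj):
    """Return the nodes of a DAG so that every edge goes left to right."""
    indeg = {x: 0 for x in adj}
    for x in adj:
        for y in adj[x]:
            indeg[y] += 1

    queue = deque(x for x in adj if indeg[x] == 0)
    order = []
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in adj[x]:            # "remove" x: its successors lose one in-edge
            indeg[y] -= 1
            if indeg[y] == 0:
                queue.append(y)

    if len(order) != len(adj):      # some nodes never reached indegree 0
        raise ValueError("graph has a cycle; no topological order exists")
    return order

# Hypothetical DAG of tasks
adj = {
    "shirt": ["tie"],
    "tie": ["jacket"],
    "trousers": ["shoes", "jacket"],
    "shoes": [],
    "jacket": [],
}
order = topological_order(adj)
```

Each node enters the queue once and each edge is examined once, which gives the linear running time.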
24. Graphs are important data structures with numerous
applications
Graphs can be of different kinds
Many convenient ways to model graphs
Adjacency Matrix
Adjacency List
Special structures for trees
Many commercial and open source libraries exist, such as
JGraphT
SUMMARY