# Spanning Trees

• Sometimes, you are given graph, but only need one way to connect nodes.
• For example, a network with redundant connections.
• When routing data, you don't care about redundancy: just need a way to send the data to its destination.
• So if we are looking at routing decisions, we basically want to reduce the graph to a tree. The tree tells usexactly how we will send the data.
• If a link fails, we can choose a different tree.
• Given a graph $$G=(V,E)$$, a subgraph of $$G$$ that is connects all of the vertices and is a tree is called a spanning tree.
• We can remove edges until we are left with a tree: the result is a spanning tree.
• Clearly, a spanning tree will have $$|V|-1$$ edges, like any other tree.
• Built into Ethernet, there is a Spanning Tree Protocol.
• The job of STP is to take a network with redundant connections and build a spanning tree.
• … which can be used to send data without worrying about forwarding data around loops and bringing down the network.
• Theorem: A graph is connected iff it has a spanning tree.

Proof: If a graph is connected, we can identify a cycle and remove an edge from it: it will still be connected. We can continue this until no cycles remain. The result is a spanning tree.

If we have a graph with a spanning tree, then every pair of vertices is connected in the tree. Since the spanning tree is a subgraph of the original graph, the vertices were connected in the original as well.

# Minimum Spanning Trees

• If we just want a spanning tree, any $$n-1$$ edges will do.
• If we have edge weights, we can ask for the spanning tree with the lowest total edge weights.
• This is the minimum spanning tree.
• For example, if we add some edge weights, what is the minimum spanning tree?
• Is there a general algorithm to find the MST?
• Actually there are two.
• Prim's Algorithm works by building a tree from the minimum weight edge, out.
procedure prims_algorithm(V, E)
Et = {minimum weight edge in E}
repeat n-2 times
e = the edge in E that is adjacent to an edge in Et, doesn't form a circuit, and has minimum weight
Et = Et + {e}
return (V, Et)
• That is, we start with the minimum weight edge as our partial-tree.
• Look for an adjacent edge that we can add (no loops) with minimum weight.
• Keep adding edges until we have $$n-1$$ of them.
• Kruskal's Algorithm works by building a tree from the minimum weight edge, out.
procedure kruskals_algorithm(V, E)
Et = {}
repeat n-1 times
e = the edge in E that doesn't form a circuit, and has minimum weight
Et = Et + {e}
return (V, Et)
• The only difference here is that we don't limit our search to edges near the tree we're building.
• Both of these algorithms find the MST.
• On our example graph with weighted edges…
• Prim's find the edges of the MST in this order: $$fg, df, ad, ab, be, ce$$.
• Kruskal's find the edges in this order: $$fg, ad, ce, df, ab, be$$.
• It's the same set of edges (so the same MST), just a difference in how we find them.
• Running time?
• It depends on how we store the edges and how quickly we can find the minimum-weight ones.
• But let's assume we're clever about it. Then…
• Prim's runs in $$O(|E|+|V|\log |V|)$$.
• Kruskal's runs in $$O(|E|\log |V|)$$.
• So if we have a lot of edges ($$|E|=O(|V|^2)$$), then Prim's will win.
• If the graph is fairly sparse, Kruskal's will win.
• Both of these are examples of greedy algorithms.
• In a greedy algorithm, you make the choice that seems best right now.
• e.g. the minimum weight edge you can find that doesn't form a circuit.
• If a greedy algorithm works, it's probably fast, because you can make an easy choice.
• But it doesn't get the right answer for every problem.
• e.g. travelling salesman: choosing the minimum weight edge will often find a bad overall solution.
• Don't use a greedy algorithm when it's not appropriate.