6  Walking Around via Euler Tours

Slides coming soon: most of this discussion was done on the board in class.

We revisit the following problem from our introduction to graphs:

The Problem

The city of Königsberg in Prussia (now Kaliningrad, Russia) was set on both sides of the Pregel River, and included two large islands—Kneiphof and Lomse—which were connected to each other, and to the two mainland portions of the city, by seven bridges.

Devise a walk through the city that would cross each of those bridges once and only once. Try this yourself on a few different maps at Mathigon!

We also found such traversals useful for computing de Bruijn sequences, so between success on city exploration challenges and impressing with card tricks, there is plenty of motivation to take an Euler tour in a graph.

In general, we are given a directed or undirected graph G=(V,E)G = (V,E) and we want to know if there is a sequence of edges:

W:=e0=(u0,v0),,em1=(um1,vm1)W := e_0 = (u_0,v_0), \ldots, e_{m-1} = (u_{m-1},v_{m-1})

such that vi=ui+1mod  mv_i = u_{i+1 \mod m} for all i{0,1,,m1}i \in \{0,1,\ldots,m-1\}, and every edge in EE features exactly once in this sequence.

6.1 Necessary and sufficient conditions

Note that if such a sequence does exist for a directed graph GG, then the indegree of every vertex must equal its outdegree, i.e:

indegree(v)(v) = outdegree(v)(v) for all vVv \in V.

Likewise, if such a sequence exists for an undirected graph GG, then every vertex must have even degree:

degree(v)=2kv(v) = 2k_v for all vVv \in V and some integer kvk_v.

You can observe this based on simulating the sequence on the graph and imagining it from the perspective of your favorite vertex vv in it. In particular, assume you are walking around in GG as dictated by WW. Fix your attention on vv: every time you “enter” vv, via, say the edge eie_i, then you must “exit” vv via the edge ei+1e_{i+1}. If GG is directed, eie_i is an incoming edge and ei+1e_{i+1} is an outgoing edge, and if GG is undirected, these are simply two edges incident on vv that can be naturally “paired off”. So any successful walk witnesses the claims above.

It turns out that you could have graphs where these conditions are true, but there are no Euler tours. This happens when the graph is “disconnected”, i.e, when there is a pair of vertices uu and vv such that there is no path from uu to vv. The following are good exercises to work through:

  • Come up with an example of a graph where the degree conditions are met but there is no Euler tour.

  • Convince yourself that if GG is connected and satisfies the degree conditions indicated above, you can always find an Euler tour.

6.2 A naive algorithm

Now we turn to the procedural question: knowing what it takes to find an Euler tour, how do we actually find one? First, we get the simple degree-based sanity check out of the way. For undirected graphs we have:

for v in V(G):
    if deg[v] % 2 != 0:
        return False

and for directed graphs we have:

for v in V(G):
    if indeg[v] != outdeg[v]:
        return False

Assuming we pass these sanity checks, we want to embark on an actual tour. Here’s a reasonable starting point, which essentially amounts to saying that you get to start anywhere, and keep going while you can:

find_tour(G,v):
    // G is the graph
    // and v is our favorite vertex
    set curr := v
    set S := empty
    while there is an outgoing edge e = (curr,u) which is not in S:
        add e to S
        set curr := u
    return S

This is basically a “keep going until stuck” process. By the very nature of the process, the sequence SS that we come up with is walkable and does not repeat edges, but it is unclear if this list is exhaustive. Indeed, you should be able to come up with examples of graphs GG where GG has an Euler tour but find_tour(G) does not output one.

6.3 Fixing the naive approach

How do we fix this? For one, we need to know if we are done or not: if an edge is missing from SS, then that’s a bad sign, and we need to do something about it. How do we know if every edge is enlisted in the output? One way is to compute the length of SS: if it falls short of mm, we are not done yet.

But we also need to know what the missing edges are. We could in principle go through our edge list and ask ourselves if the edge made it to SS or not, but that sounds mildly painstaking. Let’s save ourselves the pain with some additional bookkeeping — let’s track the “residual degree” of the vertices: this is the number of edges incident on vv that are not yet listed in SS. Any vertex with non-zero residual degree gives us concrete hints about missing edges.

So for directed graphs we have:

find_tour(G,v):
    init res_deg[v] = outdeg[v]
    set curr := v
    set S := empty
    while there is an outgoing edge e = (curr,u) which is not in S:
        add e to S
        res_deg[curr] = res_deg[curr]-1
        set curr := u
    return S

and for undirected graphs we have:

find_tour(G,v):
    init res_deg[v] = deg[v]
    set curr := v
    set S := empty
    while there is an outgoing edge e = (curr,u) which is not in S:
        add e to S
        res_deg[curr] = res_deg[curr]-1
        res_deg[u] = res_deg[u]-1
        set curr := u
    return S

So now we know when our algorithm is a fail. What’s the fix? Well, let’s approach vertices who are not done yet as per our intel from their res_deg value. We use these vertices to trigger more happy-go-lucky tours:

find_tour_fr(G):
    i := 0
    marked := emptyset
    res_deg[v] := outdeg[v]
    S := list of lists
    while there is some v with res_deg[v] > 0:
        let S[i] := find_tour(G,v,marked)
        add every edge in S_i to marked
        i = i+1

Since we need to track visited edges across multiple runs now, we actually inform the find_tour function about the edges already visited from past lives. This is tracked with the marked set.

So our updated find_tour function looks like this:

find_tour(G,v,marked):
    set curr := v
    set S := empty
    while there is an outgoing edge e = (curr,u)
    which is not in S or marked:
        add e to S
        res_deg[curr] = res_deg[curr]-1
        set curr := u
    return S

(The change is analogous for the version dealing with undirected graphs.)

What we have now is a bunch of fragments, each of which is essentially a walk that begins and ends at the same vertex. Because of our relentless and careful pursuit (c.f. the while condition and the marked set), every edge in GG features in exactly one of these fragments. Now it’s just a matter of putting everything together.

Start with the first fragment S0S_0. If this is the only fragment we have, that means that our first happy-go-lucky tour was in fact also a lucky one! So we have nothing left to do. Otherwise, there are at least two fragments. Let us look at the set of vertices involved in S0S_0. The crucial observation is that there must be at least one other fragment, say SiS_i, that also features some vertex that appears in S0S_0. Indeed, if this is not the case, then you can argue that S0S_0 is a sad isolated fragment, and we can actually report that GG has no Euler tour.

Otherwise, find the common vertex between S0S_0 and SiS_i, and extend one of them using the other: for example, if vv is the common vertex, take a walk in S0S_0 until you encounter vv, and then instead of following along on S0S_0, take a detour as specified by SiS_i. Remember if you start on SiS_i at vv, then you will eventurally exhaust SiS_i by coming back to vv: and at this point you can “resume” your walk on S0S_0. Note that this process welds two fragments at the vertex vv thereby reducing the total number of fragments by one. This should count as a sure sign of progress: repeating for as long as possible, we have to do this at most (f1)(f-1) many times, where fm2f \leq \frac{m}{2} is the total number of fragments.

patchup():
    Let S[i] for i in 1, 2, ..., f denote the set of all fragments
    while f > 1:
        look for a fragment S[i] that intersects S[0]
        if S[i] does not exist:
            return false
        else:
            expand S[0] along S[i]
            remove S[i] from the set of all fragments

6.4 Expenses

Although arguably a naive algorithm, the cool thing about this procedure is that it is guaranteed to work, and the number of steps involved is not terribly bad either. Here’s a naive back-of-the-envelope analysis, assuming GG is stored as an adjacency list:

  1. The degree-based sanity checks are m\approx m since GG is stored as an adjacency list.
  2. Consider find_tour_fr(G):
    • The outer while loop runs at most mm times since each iteration decreases the residual degree of at least two vertices and the sum of residual degrees is 2m2m at the start (an analogous argument applies for directed graphs).
    • The inner while loop in find_tour(G,v,marked) is executed at most mm times. To find an appropriate edge, we have to go through the negibors of curr and check if the associated edge is already in S or marked. This takes at most (nm)(nm) iterations in the worse case, if implemented directly.
    • Assuming that S is a linked list and res_deg is an array, the innermost operations are all constant time.
    • The overall cost of a direct implementation is therefore nm3nm^3.
  3. The patchup procedure involves a outer while loop that runs at most ff times if we have ff fragments. The looking business to merge fragments will take no more than mm units of time: worst case we have to scan all remaining fragments to find a suitable match, and the merger involves updating a few pointers — noting again that the fragments are stored as lists. This is of course a conservative estimate, but it will do. The overall time here is therefore no worse than m2\approx m^2 since fm2f \leq \frac{m}{2}.

This gives us m+nm3+m2m + nm^3 + m^2 in total damages. However, notice that a more careful analysis helps us do better without doing anything substantially different. If you think about find_tour_fr(G) think about how often the instructions:

add e to S
res_deg[curr] = res_deg[curr]-1
set curr := u

from find_tour(G,v,marked) are actually executed. Every time we execute this set of instructions, we effectively add one edge to the set of marked edges, and since this happens exactly once per edge, these instructions only execute mm times overall. Recall that what it takes to enter this loop is the discovery of an unused edge:

while there is an outgoing edge e = (curr,u)
which is not in S or marked

This is the bit that took us a while, because we assumed we have to examine all possible edges that go out of curr and then also tally up against S and marked. One way to speed this up is to always commit to pulling out the first element in the adjacency list of the vertex u and then in fact deleting this element from the list. If you are paranoid about subjecting your input to this kind of annihilation, then you can just make a working copy upfront. This way, you can be sure that you can find what you want in constant time, a good case in point to show how the way you store your data can impact the performance of your algorithm.

Based on the hints above, convince yourself that find_tour_fr(G) in fact has a total expense amounting to m\approx m, modulo constants. With this, our overall cost comes down to 2m+m2\approx 2m + m^2.

6.5 Afterthoughts

Can we do better? Indeed, it turns out that with a slightly more careful implementation, the business of merging fragments can also be done with an expense proportional to the number of edges. The careful version goes by Hierholzer’s algorithm, and you can read up about this on Wikipedia or watch the video below.