Common Mistakes In Loop-Finding Algorithms For Graphs

Today in Edworking News we look at the article “Possibly all the ways to get loop-finding in graphs wrong”.

Introduction

In my puzzle collection, there are many games in which you have to connect points together by edges, making a graph, and the puzzle rules say you must avoid making any loop in the graph. Examples are Net, Slant, and some configurations of Bridges (although not the default one). Loopy and Pearl also care about whether there’s a loop in a graph, although those two are more subtle: your aim is to make a loop, and only wrong loops must be rejected. Therefore, those puzzle programs need to be able to check whether a graph has a loop in it, in order to decide whether the puzzle solution is correct. If there is a loop, they also have to identify the edges that make up the loop, in order to point out to the player why their solution hasn’t been accepted.
Over the years I’ve been developing these puzzles, I’ve gone through an amazing number of algorithms for doing that job. Each one was unsatisfactory for some reason, and I threw it away, and moved on to the next. I might by now have collected all the ways to do this job wrong! So I thought I’d write up all my mistakes, as a case study in all the ways you can solve this particular problem wrongly – and also in how much effort you can waste by not managing to find the existing solution in the literature.

The Algorithms

Vertex DSF

When one of these puzzles is generated, the generator will run an automated solving algorithm, to check that its solution is unique. So the solving algorithm needs to check whether it’s accidentally made a loop in the graph. Or, more precisely, it’s better if the solver can tell whether it’s about to make a loop: when it’s considering adding some edge to the graph, it wants to know if that edge would form a loop. If so, it decides not to add it at all, which is more efficient than adding it first, finding you’ve made a mess, and having to undo your work.
My automated solvers generally handle this by using an implementation of the disjoint-set forest data structure, otherwise known as a ‘union-find’ data structure, or just ‘disjoint-set data structure’. For those who haven’t encountered a dsf before, it’s a very fast data structure for tracking equivalence between elements of a set, as long as equivalences are added incrementally. This algorithm is correct and very efficient. For automated solving, there’s nothing at all wrong with it, especially because you get to keep just one dsf for the whole run of the solver. However, it won’t generate the nicely highlighted loops I showed in the introduction. This algorithm can only report ‘yes’ or ‘no’: either there is a loop, or there isn’t.
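As a rough illustration of the idea (a minimal sketch, not the code from the puzzle collection), a vertex dsf and the "would this edge form a loop?" query might look like the following. The class name `DSF`, the helper `would_form_loop`, and the convention that vertices are numbered from 0 are all assumptions made for this example.

```python
class DSF:
    """Minimal disjoint-set forest over elements 0..n-1 (illustrative only)."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n           # class sizes, used for union by size

    def find(self, x):
        # First locate the root of x's class...
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # ...then point every element on the path directly at it.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the classes of a and b. Returns False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True


def would_form_loop(dsf, u, v):
    """True if u and v are already connected, so edge (u, v) closes a loop."""
    return dsf.find(u) == dsf.find(v)


# A solver adds edges one at a time, asking before each addition.
dsf = DSF(4)
for u, v in [(0, 1), (1, 2), (2, 3)]:
    if not would_form_loop(dsf, u, v):
        dsf.union(u, v)
print(would_form_loop(dsf, 3, 0))  # True: 0-1-2-3 is already a path
```

Note that the answer is a bare yes or no, which is exactly the limitation described above: nothing here records which edges make up the loop.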

Graph Pruning

So our revised problem is: don’t just say that a loop exists, but identify every edge involved in it. Ideally, if there are multiple loops, identify all their edges. My first attempt to solve this problem worked by iteratively pruning the graph. Find a vertex with only one edge coming into it. Then that edge can’t be part of a loop. So we can remove it, without destroying any loops in the graph. Once you start removing edges, you reduce the degree of further vertices, so now maybe more of them have degree 1. So keep pruning, until you can’t find any more degree-1 vertices.
This algorithm correctly identifies whether there’s a loop, and it guarantees to highlight every edge involved in any loop. Unfortunately, that’s not all that it highlights. If the graph contains a ‘dumb-bell’ shaped subgraph consisting of two loops connected by a path, then the loops can’t ever be pruned, but neither can the connecting path. So that will be highlighted in addition to the loops themselves. This algorithm highlights too many edges!
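The pruning procedure and its dumb-bell failure can be sketched as follows. This is an illustrative reconstruction under simplifying assumptions (a simple graph with no parallel edges or self-loops, edges given as vertex pairs); the function name `prune_edges` is made up for the example.

```python
from collections import defaultdict


def prune_edges(edges):
    """Repeatedly remove the edge at any degree-1 vertex.

    Returns the set of edges (as frozensets) that survive pruning,
    i.e. the edges this approach would highlight.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = {frozenset(e) for e in edges}

    queue = [v for v in adj if len(adj[v]) == 1]
    while queue:
        v = queue.pop()
        if len(adj[v]) != 1:
            continue  # degree changed since v was queued
        (w,) = adj[v]
        adj[v].discard(w)
        adj[w].discard(v)
        remaining.discard(frozenset((v, w)))
        if len(adj[w]) == 1:
            queue.append(w)  # w may have become prunable
    return remaining


# Dumb-bell: triangle 0-1-2, bar 2-3-4, triangle 4-5-6, plus a tail 6-7.
dumbbell = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4),
            (4, 5), (5, 6), (6, 4), (6, 7)]
left = prune_edges(dumbbell)
print(frozenset((6, 7)) in left)  # False: the tail is pruned away
print(frozenset((2, 3)) in left)  # True: the bar survives, wrongly highlighted
```

The tail is removed as intended, but both bar edges remain because every vertex on the bar has degree at least 2.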

Loop Tracing

So I thought about how to solve this ‘dumb-bell’ problem, and after some pondering, came up with a completely different approach. To begin with, we go back to the ‘vertex dsf’ idea. To check the player’s solution for errors during play, we make a fresh dsf with all vertices separate, and iterate through the graph edges, unifying the two endpoints of each edge. Before each unification, we query the dsf to see whether the two endpoints were already connected to each other. If they were, we know that the edge we’re currently processing is part of a loop.
But this time, we don’t stop there. Once we’ve identified one edge that’s part of a loop, we do a graph search around the rest of the graph to find an alternative path between its endpoints. Then we’ve identified a specific loop involving that edge, and we can light up every part of that loop as an error. Once you’ve done that, don’t stop there: return to the main iteration that adds the rest of the edges to the dsf, in case there are more loops you can find and trace around.
This algorithm solves the dumb-bell problem: it only ever lights up an edge when it’s found an actual loop in the graph, with that edge being part of it. So it definitely can’t light up any edge that is not part of a loop. The central bar of the dumb-bell graph is safe from accidental highlighting. But I couldn’t convince myself that this technique would catch all the loops in the graph. What if there are several loops that touch, or intersect each other?
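One way to read the loop-tracing idea is sketched below. It is a hedged reconstruction, not the original implementation: the function name `trace_loops`, the choice of breadth-first search for the "alternative path", and the decision to search the whole graph minus the offending edge are all assumptions.

```python
from collections import defaultdict, deque


def find(parent, x):
    """Root of x in a simple dsf stored as a parent list (path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def trace_loops(n, edges):
    """Return indices of edges lit up by the dsf-plus-search approach."""
    adj = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    parent = list(range(n))
    highlighted = set()
    for i, (u, v) in enumerate(edges):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv  # edge joins two components: no loop yet
            continue
        # u and v were already connected, so edge i closes a loop.
        # Search from u to v without using edge i itself.
        prev = {u: None}
        queue = deque([u])
        while queue and v not in prev:
            x = queue.popleft()
            for y, j in adj[x]:
                if j != i and y not in prev:
                    prev[y] = (x, j)
                    queue.append(y)
        # Light up edge i and every edge on the path found.
        highlighted.add(i)
        x = v
        while prev[x] is not None:
            x, j = prev[x]
            highlighted.add(j)
    return highlighted


# Dumb-bell again: edges 3 and 4 form the bar between the two triangles.
dumbbell = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4),
            (4, 5), (5, 6), (6, 4)]
print(sorted(trace_loops(7, dumbbell)))  # [0, 1, 2, 5, 6, 7]
```

Each highlighted edge lies on a loop that was actually traced, so the bar is never lit. What the sketch does not establish is the converse: it traces only one path per loop-closing edge, which is where the doubt about touching or intersecting loops comes from.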

Face DSF

In 2008, a contributor sent me a patch making major changes to Loopy. Before the changes, it only supported playing on square grids, like conventional Slitherlink. Afterwards, it supported a wide variety of other periodic tilings – triangular grids, hexagonal honeycombs, various tilings of mixed shapes, etc – using a general system for representing the game grid as a planar graph. While I was discussing that patch with its author, I realized that there’s a neat algorithm for loop detection, using the fact that all the graphs involved are planar.
As well as vertices and edges, a graph embedded in the plane has a concept of ‘faces’: the regions of the plane separated by the graph edges. If a planar graph contains no loops, then there’s only one face: from any part of the plane you can reach any other part, without having to cross a graph edge. You might have to take a roundabout route, but there always is a route. Conversely, if a planar graph does contain a loop, then that loop separates the plane into two regions: the inside and the outside. To get from one to the other, you’d have to cross an edge of the loop.
So my new idea was: instead of making a dsf on the vertices of the grid to find a loop that joins them to each other, we make a dsf on the faces of the grid – to find a loop that separates them from each other.
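A small sketch of how a face dsf could be used, assuming the grid is described as a list of edges that each know the two faces on either side. The input format, the function name `face_dsf_loop_edges`, and the rule "a drawn edge is a loop edge when its two sides end up in different classes" are assumptions of this example rather than details given in the text; the rule follows from the inside/outside argument above.

```python
def face_dsf_loop_edges(num_faces, grid_edges):
    """grid_edges: list of (face_a, face_b, present) for every grid edge.

    Faces on either side of an edge the player has NOT drawn can reach each
    other, so they are unified. A drawn edge whose two sides are still in
    different classes separates regions of the plane, so it lies on a loop.
    """
    parent = list(range(num_faces))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, present in grid_edges:
        if not present:
            parent[find(a)] = find(b)

    return {i for i, (a, b, present) in enumerate(grid_edges)
            if present and find(a) != find(b)}


# Two square cells side by side. Face 0 is the outside, 1 the left cell,
# 2 the right cell. Edge 3 is the wall shared by the two cells.
grid = [
    (0, 1, True),   # 0: left cell, top
    (0, 1, True),   # 1: left cell, bottom
    (0, 1, True),   # 2: left cell, left side
    (1, 2, True),   # 3: shared wall
    (0, 2, True),   # 4: right cell, top
    (0, 2, False),  # 5: right cell, bottom (not drawn)
    (0, 2, False),  # 6: right cell, right side (not drawn)
]
print(sorted(face_dsf_loop_edges(3, grid)))  # [0, 1, 2, 3]
```

Here the left cell is fully enclosed, so its four edges are reported, while the lone top edge of the right cell has the same region on both sides and is left alone. The whole approach relies on every edge having a well-defined pair of faces, which is the planar embedding the method depends on.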

Tarjan’s Bridge-Finding Algorithm

At this point – much, much later than I probably should have – I stopped trying to solve the problem from scratch myself, and found a solution somebody else had already invented and proved correct. One of the reasons I hadn’t found this algorithm before is that it’s phrased the opposite way round. A bridge in a graph is an edge which is the only way to get between some pair of vertices. So if you remove the bridge, those vertices become disconnected from each other. On the other hand, a loop edge is precisely one which, if you remove it, doesn’t disconnect any pair of vertices from each other, because there’s still some other route between its two endpoints.
In other words, an edge is a bridge if and only if it is not part of a loop. I was thinking in terms of ‘finding all the loops’, but of course finding all the edges that aren’t part of a loop is just as good – you just invert all the answers once you’re done.
Tarjan’s bridge-finding algorithm starts by finding a spanning forest of the graph – a spanning tree of each of its connected components. That is, we find a subset of the edges which provide a route between any two vertices that the original graph linked, but which don’t contain any loops. Once we’ve done that, every bridge must be one of the edges of the spanning forest, because a bridge is the only route between some pair of vertices, and whichever vertices those are, the spanning forest contains a path between them – so it must contain the bridge. So any edge not in our spanning forest can’t be a bridge – it must be a loop edge instead.
But that’s the easy part. The hard part is that some of the spanning-forest edges are also loop edges. So the problem is to identify which. The next step is to root the spanning forest: for each component, choose a vertex (it doesn’t matter which one) to consider to be the root of the tree, so that every edge has an ‘upward’ direction (towards the root) and a ‘downward’ direction (away from it), and the subtree of that edge is all the vertices further ‘down’. Then an edge e is a bridge if and only if nothing in its subtree connects to anything outside that subtree, by any route other than leaving the subtree out of the very top, via e itself.
To determine that, you iterate over these rooted trees labelling each vertex with a number, in a way that makes every subtree into a consecutive interval, say containing numbers from a to b inclusive. Then, for each subtree, you find the smallest and largest labels of any neighbour of anything in the subtree, say u and v, not counting the step from the top of the subtree to its parent along e itself. From those bounds, you can immediately tell whether anything in the subtree has a neighbour outside it: if it does, then that neighbour will have a label outside the range [a, b], which will either make the smallest reachable label u smaller than a, or make the largest reachable label v larger than b. So the edge at the top of the subtree is a bridge if and only if a ≤ u ≤ v ≤ b, and we’re done!
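The steps above can be sketched in Python as follows. This is an illustrative reading of the description, not a transcription of findloop.c: the function name `find_bridges`, the use of a depth-first search to build the spanning forest, and preorder numbering as the labelling are choices made for this example. Tree edges are tracked by index, so a parallel edge to the parent counts as a separate route.

```python
def find_bridges(n, edges):
    """Return the set of indices of bridge edges in an undirected graph.

    n: number of vertices (0..n-1); edges: list of (u, v) pairs.
    """
    adj = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    label = [None] * n        # preorder number of each vertex
    parent = [None] * n       # parent vertex in the spanning forest
    parent_edge = [None] * n  # index of the tree edge up to the parent
    order = []                # vertices listed in label order

    # Build a rooted spanning forest. Numbering vertices as they are first
    # reached makes each subtree a consecutive interval of labels.
    for root in range(n):
        if label[root] is not None:
            continue
        stack = [(root, None, None)]
        while stack:
            v, p, e = stack.pop()
            if label[v] is not None:
                continue
            label[v] = len(order)
            parent[v], parent_edge[v] = p, e
            order.append(v)
            for w, i in adj[v]:
                if label[w] is None:
                    stack.append((w, v, i))

    size = [1] * n  # number of vertices in each subtree
    lo = label[:]   # smallest label neighbouring anything in the subtree
    hi = label[:]   # largest such label
    bridges = set()

    # Children always have larger labels than their parent, so walking the
    # labels in reverse handles every subtree before the one containing it.
    for v in reversed(order):
        for w, i in adj[v]:
            if i != parent_edge[v]:  # ignore the tree edge e above v
                lo[v] = min(lo[v], label[w])
                hi[v] = max(hi[v], label[w])
        if parent[v] is not None:
            a, b = label[v], label[v] + size[v] - 1
            if a <= lo[v] and hi[v] <= b:
                bridges.add(parent_edge[v])
            p = parent[v]
            size[p] += size[v]
            lo[p] = min(lo[p], lo[v])
            hi[p] = max(hi[p], hi[v])
    return bridges


# Dumb-bell: two triangles joined by the bar made of edges 3 and 4.
dumbbell = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4),
            (4, 5), (5, 6), (6, 4)]
bridges = find_bridges(7, dumbbell)
loop_edges = set(range(len(dumbbell))) - bridges
print(sorted(bridges))     # [3, 4]
print(sorted(loop_edges))  # [0, 1, 2, 5, 6, 7]
```

Inverting the answer gives the edges to highlight: everything that is not a bridge. The only input is a neighbour list per vertex, with no reference to faces or coordinates.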
As far as I know, there’s nothing wrong with this algorithm. But then, I’ve believed that before and been wrong! That previous time, the problem was that I was depending on an extra property of the graph, namely its planar embedding. Tarjan’s algorithm definitely doesn’t depend on any such thing: it’s a ‘pure’ graph algorithm, requiring nothing except a list of neighbours for each vertex. You could run it on a graph embedded on a torus, or on a Möbius strip, or even something inherently not two-dimensional like a cubic lattice in three-dimensional space, or n-dimensional space if you wanted. It wouldn’t care.
So, at the time of writing this, Tarjan’s algorithm lives in findloop.c in my puzzles’ source tree, and I hope not to discover any more fundamental problems that mean I have to throw it away and start again!

Remember these 3 key ideas for your startup:

Iterate and Learn: Just like the journey of finding the right algorithm for loop detection, your startup will go through multiple iterations. Each failure is a learning opportunity. Embrace the process and keep refining your product or service.
Leverage Existing Solutions: Sometimes, the solution to your problem already exists. Instead of reinventing the wheel, look for established methods and adapt them to your needs. This can save time and resources, allowing you to focus on innovation.
Efficiency is Key: In both algorithm design and business operations, efficiency can make a significant difference. Streamline your processes, automate where possible, and always look for ways to optimize.

By focusing on these key areas, your startup can navigate challenges more effectively and position itself for long-term success.
For more details, see the original source.
