Additional example

Original How to effectively walk graphs and trees in Python. Edits from Reddit C/C. Additional example Editable
version 3 of 3

Recently Zed called out Python on some of its warts. His arguments are sound, but the ensuing reddit thread has a few rather confused attempts where folk gripe about non-issues.

This is specifically about the claim that Python's 1000 deep recursion limit makes it impossible to walk unbalanced trees. There are simple and pythonic ways to iterate over trees, and I will illustrate one. I greatly prefer this to the typical tree recursion process. It is actually similar to BFS in Lisp. With one slight modification you can process a tree either breadth-first or depth-first, switching dynamically between the two at run time based on the contents of a node. It can also be adapted to walk cyclic graphs.

Kyle (2012-03-08-18-47-55-179)

Technically this uses an adjacency list, not a true tree or graph.

Typically child links are treated as points where the code should recurse, and this recursion should happen in the language's stack. An alternative is to build your own stack and iterate over that. Each node contains references to child objects that need to be examined. Think of it as a "todo list" of nodes to look at next. Instead of recursing through the nodes, add the nodes you want to look at next (i.e., the child links) onto a "master todo list". Then iterate over the list.

Which is a better data structure for the master list, a stack or a FIFO? A stack would let you walk the tree depth first. Each child is pushed onto the stack. The top child is popped off, and its children are pushed on. A FIFO performs a breadth first search, as children are pushed to the start of the FIFO and older/higher nodes are popped from the end.

A deque provides the best of both. It is a nifty structure (based on a double linked list) that is both stack and FIFO. For each node of the tree, add the children to either the front (FIFO like) or the back (stack like) of the deque. Meanwhile iterate over the deque by popping off the back.

Let's have some code. This is lifted more or less directly from pacgraph. The map is stored as a dictionary of string tokens, instead of a tree of references. The technique works the same for both. The 'master todo deque` is called to_crawl. This function starts at a specified node and returns all of it's children. It operates depth first.

fanta (2015-05-13-22-54-46-794)

It operates depth first.

It operates breadth first, as far as I can tell. It's using a FIFO to track nodes, so naturally siblings get visited before children.

from collections import deque

tree = {'a': ['b', 'c'],
        'b': ['d', 'e'],
        'c': [],
        'd': ['f'],
        'e': [],
        'f': []}

def children(token, tree):
    "returns a list of every child"
    child_list = []
    to_crawl = deque([token])
    while to_crawl:
        current = to_crawl.popleft()
        node_children = tree[current]
    return child_list

> children('b', tree)
... ['d', 'e', 'f']

By keeping track of visited nodes in a set it becomes a blazing fast cyclical graph walker:

def children(token, tree):
    "returns a list of every child"
    visited = set()
    to_crawl = deque([token])
    while to_crawl:
        current = to_crawl.popleft()
        if current in visited:
        node_children = set(tree[current])
        to_crawl.extend(node_children - visited)
    return list(visited)

If you want breadth first search, replace to_crawl.extend() with to_crawl.extendleft() To dynamically switch between BFS and DFS, add a test to decide either extend() or extendleft(). I use this to walk the tree of nonogram solutions, where promising solutions are scanned depth first and improbable solutions are eventually processed breadth first.

Eli Spizzichino (2011-11-17-08-21-40-635)

your code works but you are confusing the two algorithm. to_crawl.extend gives you breadth first search and extendleft is depth first search.

for background on this and other topics on tree search and general AI you can follow this and other related videos


John Tyree (2015-03-19-21-08-49-143)

This is a good place to use something like a priority queue rather than a simple deque. That way you short-circuit your DFS when it becomes obvious that it isn't going to pan out.