SAT Flood Fill
tl;dr: nurikabe solver
Many Nikoli puzzles have a common requirement: there will be a single connected region of some sort across the entire board. This can be done by choosing a single seed location and performing a flood fill, within the other constraints of the puzzle.
A year ago I had a very crude sort of flood worked out. It was messy and needed to be rewritten for every puzzle. A couple of months ago (while test-solving some unusual non-square puzzles) the generalization struck: adjacency tables could be used to create a generic flood fill that works for everything.
Prior to this, the flood fill code used Cartesian coordinates. This was remarkably brittle. Breaks down for puzzles with holes, ragged edges, extra dimensions, or hexgrids. All things you see at Mystery Hunt. An adjacency table can represent all of these special cases.
An adjacency table is a list of named areas and the other named areas they connect to. For example, here are the named regions of a 3x3 tic-tac-toe board:
Add another 9 cells with similar linkages and it models a two sided piece of paper. With a couple of connections between the front and back cells, it now represents a tube. Arrange those connections differently and it becomes a mobius strip instead. Not to mention the silly things you can do with unidirectional links. I could give more examples, but it should be pretty obvious this can represent any 2-D, 3-D or even n-D layout. (Except perhaps weird sub-atomic spin models where you must go around a closed loop twice to get back to where you started.)
So, hopefully that both explains adjacency tables and provides a convincing argument that they are worth the hassle of creating.
Nurikabe is a good example for demonstrating the flood fill. The puzzle has two different types of fills: a large single stream and many small islands of exact size.
Let's walk though nurikabe.py.
It starts by loading the puzzle into a sparse dictionary. The layout of the dictionary looks like {(x,y):size, ...}
The CNF is using the auto_term()
function, aliased to f()
, for mapping labels to term numbers. The final answer will be found in the terms ('base', x, y)
.
First all the squares with numbers are guaranteed to not be part of the river and so can be marked as such on the base layer:
Another requirement is no pools. The following code enumerates all possible 2x2 squares in the puzzle and forbids those combinations from being solid river. The xs
and ys
lists contain the ranges of the X and Y values of the board.
cnf.comment('no pools')
for x,y in product(xs[:-1], ys[:-1]):
cells = [f('base',x,y), f('base',x+1,y), f('base',x,y+1), f('base',x+1,y+1)]
cnf.write([neg(cells)])
Next the adjacency table is created. While the unlimited potential of the adjacency table is nice, the simple rectangular grid happens fairly often. So cartesian_adjacency()
is a helper function to handle this common case. There is a boolean flag for 4-neighbor vs 8-neighbor.
Now it can start flood filling. First are the small walls. Their fills start from their x,y
location on the board and must be an exact size. This is an excerpt from a larger loop:
cnf.comment('wall')
label = 'wall %i %i' % (x,y)
mapping = floodfill(cnf, label, adj, size, exact=True, seed=(x,y))
cnf.comment('link')
for x2,y2 in mapping:
cnf.write(if_then(f(mapping[(x2,y2)]), -f('base', x2, y2)))
- The cnf object to write to
- A unique name prefix label to avoid collisions
- An adjacency table
- How big the flood fill could be
- If the size should be strictly followed (optional)
- A starting location for the fill (optional)
The floodfill()
function goes on to create a large cubic volume of new terms. Only a small handful of these are useful; the final flattened summary layer. The function returns a mapping between normal address and the names of the corresponding summary variables.
After this comes some logic to prevent walls from touching each other. (I am not entirely happy with this code because it makes blind assumptions about the naming convention of the flood fill summary terms. And these are likely to change in the future to make flood fill play better with cross-language autoterm merging. It really should be using the mappings instead.)
Now for the stream fill. There are two different ways to go about it. First is to calculate the exact size of the stream (xdim * ydim - sum(clues)) and do an exact fill. The second is to do an unbounded fill, which is what I'll illustrate here since the walls use exact fills. And unbounded fills are generally less intensive, so use them when you can.
First it finds a seed location. It looks for a cell with a value of 1 (all neighbors will be part of the stream) and picks the first neighbor to use as as starting point for the fill. While this step is optional, using a seed makes the solver 20% faster or so.
cnf.comment("stream")
mapping = floodfill(cnf, 'stream', adj, z_dim, seed=seed)
cnf.comment("stream summary")
for x,y in mapping:
cnf.write(if_then(f(*mapping[(x,y)]), f('base', x, y)))
And then the summary is connected to the base. This looks very similar to code for the walls, with a small tweak to the if_then()
for their relative behaviors. The areas of the walls must not be filled in, so they are linked to the base by if_then(wall_term, -base_term)
while the stream looks like if_then(stream_term, base_term)
because the stream area is supposed to be filled in on the base layer.
There is a little bit of trivial logic to make sure the walls and streams do not overlap and that every square is covered by something.
base_layer = [f('base', x, y) for x,y in product(xs, ys)]
for solution in cnf.solutions(interesting = base_layer):
....
Flood fills are messy and operate by growing from a seed one point at a time. There is nothing deterministic to that growth pattern. So you can get fifty different "solutions" to the same flood fill, purely based on the order it filled. The interesting
field restricts the solution space to a subset of the terms involved, letting the solver disregard how a given solution was arrived at.