Backtracking
Backtracking is a form of recursion — using recursion, we assign values to variables, one variable at a time.1 It's commonly used to solve a CSPs:
Definition: Constraint Satisfaction Problems (CSP). A CSP is a problem that takes the following form: Given variables and labels, assign each of the variables a label, subject to restrictions (called constraints). The restrictions specify what combinations of labels are allowed for some subset of the variables.2
From the definition above, we can think of many problems that can be restated as CSPs:
- Combinatorical optimization problems (e.g., the travelling salesman problem).
- Sudoku puzzles
- The 𝑁-queens problem
- Various optimization problems
Question: Couldn't we solve these problems with iteration? We could, but it would be unwieldy. From the definition above, solving a CSP via iteration — say, brute force — requires our algorithm to evaluate all the possible assignments (or states). For some problems, many of those evaluations are wasteful. Midway through the process, it's apparent that the current assignment is bad (violates a constraint). Unfortunately, because of the way CSPs are, there's usually no easy way to stop the evaluation midway. Recursion, however, handles this problem easily — return on the base case.
Details may be hazy at this point, so let's clarify with some analogy.
Backtracking & Depth-first Search
Backtracking is analogous to a depth-first search for a solution. Say we had some CSP. In backtracking, we imagine treat the CSP as a constraint tree (or more generally, a constraint graph):
Each of the nodes represent a variable, and each edge represents a constraint. That constraint could be anything: "If there's an edge, these two variables can't be assigned together", "If there's an edge, these two variables can be assigned together", etc.
Viewed this way, we see why backtracking has advantages over traditional
iterative methods. If we encounter a particular node (a particular
subtree), we can immediately abandon the search for a solution altogether.
For example, in the tree above, suppose the correct solution is J
. We
start from the root, and traverse the entire tree (the set of all possible
solutions). Say we get all the way down G
. At that point, we make a
determination: Is this a good state or a bad state? I.e., is the path I've
taken from the root down to this node (root--A--C--G
) satisfactory
(doesn't violate the constraints)? No. So, we backtrack to the parent
node, C
, and report that the path is not a valid assignment. This report
goes up to A
, which then determines: "C
, you are not a valid
assignment. Don't bother to continue searching." So, the search goes down
to D
, and eventually to J
, the correct solution.
Where backtracking takes a depth-first search, the brute-force approach
takes a bread-first search. To find the correct solution J
, we check each
node at each level, one by one: A
, then B
, then C
, then D
, then
E
, etc.
Approach | Path | Total Checks |
---|---|---|
Backtracking | G-C-A-I-D-J | 6 |
Brute-force | A-B-C-D-E-F-G-H-I | 9 |
The difference doesn't seem like much (we've purposely kept the solution close for readability) but it's not difficult to imagine a CSP where the correct solution is at the right-most leaf node of some gargantuan tree. Or, worse, if the correct solution doesn't exist at all.
Importantly, many CSPs have a runtime complexity of While backtracking doesn't always evade this upperbound, it can certainly reduce the number of steps necessary. That said, there are some problems (as we'll see shortly) that cannot get around this bound — it would take years, if not decades, to find a solution.
The N-queens Problem
To understand the N-queens problem, recall that a queen can move in any of these directions:
- squares to the left.
- squares to the right.
- squares forwards.
- squares backwards.
- squares diagonally.
With this in mind, the N-queens problem asks the following:
Given an chessboard, how many ways can we place queens such that no two queens can threaten one another?
To start, let's say we have a board:
We place queens column by column, then row by row. Our first column is 0, so we place a queen there:
Let's cross out all the spaces that subsequent queens can't occupy:
Now we place a second queen at column 2, again, crossing out all the paths again:
Now we try to place a third queen — we can't. Column 3 is covered entirely in and 's paths. So, we have to backtrack to and increment their position:
We now consider again. Now we have a space:
When we get to we find ourselves in trouble again — no spaces. So, we backtrack to We try incrementing 's position, and find that we can't. So, we backtrack again to When we get to we again find that we can't increment 's position, so we backtrack again to and increment their position:
We get back to and use the last position, since that's the only available space:
We get to and use the first position, since that's the only available space:
Finally, we get to and use the second-to last space, since that's the only available space:
If we were to use an iterative approach, we'd have to check all of the nodes in this search tree:
Footnotes
-
Paul W. Purdom, Jr. Search Rearrangement Backtracking and Polynomial Average Time, 21 Artificial Intelligence 1-2, 117-133 (1983). ↩
-
Belaid Benhamou. Study of Symmetry in Constraint Satisfaction Problems, Proceedings of the 2nd Workshop on Principles and Practice of Constraint Programming PPCP’94, 18-19 (1994). ↩