Refactoring Varisat: 3. Conflict Driven Clause Learning

Posted on March 18th 2019, tagged algorithms, refactoring-varisat, rust, sat

This is the third post in my series about refactoring varisat. In this post the new code base turns into a working SAT solver. While you can use the command line tool or the library to solve some small and easy SAT problems now, there is still a lot ahead to gain feature and performance parity with varisat 0.1.

In the last post we saw how unit propagation is implemented. When some variables are known, unit propagation allows us to derive the values of new variables or finds a clause that cannot be satisfied. Unit propagation alone isn’t enough though, as there is no guarantee to make progress. To continue the search for a satisfying solution after propagating all assignments, it is necessary to make a guess. A natural way to handle this would be recursion and backtracking. This would give us a variant of the DPLL algorithm from which conflict driven clause learning evolved.

DPLL

When we combine unit propagation with backtracking we get an algorithm like this:

solve(formula, partial assignment):
  update the partial assignment using unit propagation
  if there was a conflict:
    return UNSAT
  else if the assignment is complete:
    return SAT
  else:
    make a decision l
    (this selects an unassigned literal l using a branching heuristic)

    if solve(formula, partial assignment + l) is SAT:
      return SAT

    return solve(formula, partial assignment + ¬l)

This is a variant of the DPLL algorithm. The original DPLL algorithm also performs something called pure literal elimination. Pure literal elimination is a process that finds and assigns literals that appear only with one polarity within the not yet satisfied clauses. Such literals can be safely set as they cannot falsify clauses. This isn’t used in the search procedure of CDCL SAT solvers though, as detecting pure literals is too expensive for the advantage it offers.

The first problem is that this algorithm is recursive and copies the partial assignment to allow for backtracking. The recursion can easily overflow the call stack and has much overhead. We can turn this into an iterative algorithm that undoes assignments by adding two explicit stacks. The first stack, called the trail, records all literals assigned. The other stack records which assignments in the trail correspond to the decisions. This decision stack stores the lengths the trail had when making a decision. The length of the trail is also called the depth. The resulting algorithm is this:

solve iterative(formula):
  trail = [], decisions = [], partial assignment = {}

  loop:
    update the partial assignment using unit propagation
    (record assignments in the trail)

    if there was a conflict:
      if the decision stack is empty:
        return UNSAT

      pop the last decision's depth from the decision stack
      remove all assignments past the last decision from the trail and assignment
      invert the polarity of the decision in the assignment

    else if the assignment is complete:
      return SAT

    else:
      push the current depth to the decision stack
      make a decision l
      add l to the assignment and trail

What I omitted from this description are the watchlists used for unit propagation and described in the last post. They are only updated in the call to unit propagation as they are designed to allow backtracking without requiring updates. Undoing the partial assignments is enough.

The trail and decision stack is implemented pretty much like described here. The backtracking search however is something that can be improved a lot. To see why we have to take a closer look at why it can be inefficient.

Duplicated Work

A backtracking search like this is inefficient as it often duplicates work just to discover the same things over and over again. Let’s consider a formula containing the clauses $C_1 = a \vee b \vee c \vee d$ and $C_2 = a \vee b \vee c \vee \neg d$ as well as a bunch of clauses with variables $x_1, x_2, \ldots, x_n$ . At some point the trail and decision stack could look like this:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg x_3, \ldots], [x_1, \ldots], [\neg a, \ldots, d]$

The brackets group assignments on the same decision level, i.e. the decision stack points to each opening bracket. The ellipses stand for possible propagated assignments not relevant to this example. The final assignment $d$ was propagated by the clause $C_1$ which became unit. This in turn makes the clause $C_2$ false and we have a conflict. Our backtracking search would remove all assignments $[\neg a, \ldots, d]$ of the last decision level and add the negated decision to the then topmost level:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg x_3, \ldots], [x_1, \ldots, a]$

The problem shows up when the decision $x_1$ leads to a conflict. When the assignments of the topmost level are reverted the assignment $a$ is also reverted. This means the solver is free to try $\neg a$ again:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg x_3, \ldots, \neg x_1], [\neg a, \ldots, d]$

As $\neg b$ and $\neg c$ are still assigned this again propagates $d$ using $C_1$ and makes $C_2$ false. This causes another backtracking:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg x_3, \ldots, \neg x_1, a]$

Should we discover that $\neg x_3$ leads to a conflict this could happen a third time leading to:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots, x_3, a]$

In this example only two clauses are involved in the repeated sequence of identical propagations. In realistic scenarios this happens regularly with much longer sequences.

Non-Chronological Backtracking

To avoid some of this duplicated work we can introduce non-chronological backtracking (also called backjumping). Non-chronological backtracking identifies which decisions lead to the conflict and undoes further decisions as long as they are not involved. This is a sound strategy as it results in the same final state as making the conflict causing decision at an earlier point would have.

Let’s consider our example again:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg x_3, \ldots], [x_1, \ldots], [\neg a, \ldots, d]$

Here the decision $\neg x_3$ and $x_1$ are not necessary to cause a conflict. This means we can also remove them while backtracking, adding the assignment $a$ to the decision level of $\neg c$ :

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots, a]$

This is the same state normal backtracking would have produced if the decision $\neg a$ was made after $\neg c$ and the trail looked like this:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots], [\neg a, \ldots, d]$

Implication Graph

To implement non-chronological backtracking we need to determine which prefix of decisions is sufficient to cause the current conflict. This is done by maintaining an implication graph during unit propagation. The implication graph is a DAG with the assigned literals as nodes. Each assigned literal has the reason for its assignment as incoming edges. Decisions therefore have no incoming edges.

This is implemented by storing a reference to the propagating clause for each assignment. As a clause propagates when it becomes unit, there is an edge from the negation of each false literal to the single true literal of the clause.

The involved decisions of a conflict are the 0-in-degree predecessors of the negated falsified clause’s literals. Those can be found by a graph search. We will see how this search is implemented after considering another improvement that avoids even more duplicated work.

Clause Learning

Even with non-chronological backtracking the algorithm can rediscover essentially the same conflict. We continue the example where we backtracked to the decision level of $\neg c$ and arrived at:

$[x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots, a]$

At some point in the future we might get a conflict caused by $x_2$ , resulting in a trail of just $[\neg x_2]$ . If we continue the search it’s likely that we’ll make the decisions $\neg b$ and $\neg c$ again:

$[\neg x_2, \ldots], [\neg b, \ldots], [\neg c, \ldots]$

We already know that $\neg b$ and $\neg c$ imply $a$ , but during backtracking all the way to the top, the algorithm forgot about that. Instead it will have to try $\neg a$ first to realize that this leads to a conflict.

To avoid this, a new clause is added to the formula after every conflict. This is called conflict driven clause learning. That clause must be implied by the formula and should stop the algorithm from repeatedly making the same decisions. The added clause is also important to maintain our implication graph after backtracking. The way the implication graph is implemented, a clause that is unit under the partial assignment is required for each propagated literal. In the example, without learning a clause, there is no clause that justifies assigning $a$ after backtracking from $\neg a$ .

One way to add such a learned clause would be to add a clause consisting of the negation of all involved decisions. This would be $C = b \vee c \vee a$ in the example. This clause then propagates $a$ after backtracking the initial conflict as well as after repeating the decisions $\neg b$ and $\neg c$ at a later point. Such a clause $C$ is implied by the formula $F$ , as the conflict means that $F \wedge \neg C = \bot$ which is equivalent to $F \to C$ .

First Unique Implication Point

In practice there is a way to learn a better clause than the one blocking the involved decisions. We can learn any clause that is implied by the formula and causes unit propagation to assign the negation of the last involved decision given the other involved decisions.

Such a clause can be found by starting with the conflict clauses literals and successively replacing a literal with the negation of the reason it was assigned false. Such a replacement maintains the fact that assigning all literals of the clause causes a conflict. An alternative justification for this process is that it is equivalent to a resolution of the current clause with the reason clause of an assignment falsifying a literal in the current clause.

This doesn’t specify in which order the literals should be resolved (replaced by their reason) and when to stop. It also doesn’t ensure that the clause is compatible with non-chronological backtracking. To make it compatible with backtracking we need to ensure that there is only one literal of the current decision level left in the clause. This causes the clause to become unit after backtracking. Such a single literal is called unique implication point or UIP. To find a clause that includes a UIP we can simply count the number of literals in the conflicting decision level and stop the resolution at any point where there is only one left.

There still might be multiple possible clauses with a UIP and we still don’t know in what order to resolve literals. A useful heuristic observation here is that each resolution step reduces the set of assignments that could cause the learned clause to propagate. When we replace a literal by the reason it was propagated, we block only that reason, while there might be multiple possible ways for that literal to get propagated. This isn’t an exact argument, but gives an intuition that matches what I’ve seen in practice.

To find a UIP containing clause that still blocks as much as possible, we stop as soon as we get a UIP. To only do resolving that will get us closer to a UIP, we resolve the literals in the reverse order they were assigned. This finds the first possible UIP, and thus is called 1-UIP. This means we will only resolve literals of the conflicting decision level. The target level for non-chronological backtracking then is the highest level among the literals apart from the UIP.

Using the reverse-chronological order for resolution means we can walk backwards over the trail. We start with the conflict clause and for each literal in the trail we check if the literal is in the clause. If it is, we replace it with its reason. As soon as there is only one literal of the current decision level left, we are done. The implementation uses a bitmap to store which literals of the current decision level are currently present. Whenever a literal of another decision level is reached it is directly added to the learned clause, as we will not resolve it.

API and Command Line Interface

Together with a dummy branching heuristic that always chooses the first unassigned literal, this gives a working SAT solver. To make it usable I also added a public API and a simple command line interface. To try the current master you can install it from github (replacing any previous version) by running

cargo install --force --git https://github.com/jix/varisat
varisat some_input_formula.cnf

This already includes stuff I’ll describe in a future post.

What’s Next

A working SAT solver doesn’t mean it’s a SAT solver you’d want to use yet. The dummy branching heuristic is very bad for performance and learning more and more clauses causes unit propagation to become slower and slower. In the next post I’ll go over what’s missing to make it as fast as varisat 0.1 was.