Changing Partitions in Rectangle Decision Lists

Rectangle decision lists are a form of decision lists that were recently shown to have applications in the proof complexity of certain OBDD-based QBF-solvers. We consider a version of rectangle decision lists with changing partitions, which corresponds to QBF-solvers that may change the variable order of the OBDDs they produce. We show that even allowing a single partition change generally leads to exponentially more succinct decision lists. More generally, we show that there is a succinctness hierarchy: for every k ∈ N, when going from k partition changes to k + 1, there are functions that can be represented exponentially more succinctly. As an application, we show a similar hierarchy for OBDD-based QBF-solvers.


Introduction
Decision lists are a classical formalism for encoding Boolean functions. Intuitively, they consist of a list of lines of the form "if $L_i(X)$ then $c_i$" where $L_i(X)$ is a condition on the input variables $X$ and $c_i$ is a constant from $\{0, 1\}$. To evaluate a decision list on an input, one traverses the lines from the first to the last and checks the respective condition $L_i(X)$. As soon as one meets a condition $L_i(X)$ that evaluates to true, one stops this process and returns the corresponding value $c_i$. Decision lists were originally introduced by Rivest in learning theory [22] and have since been studied in different areas.
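The evaluation procedure just described can be sketched in a few lines of Python. This is only an illustration: the representation of a line as a pair (condition, constant), the name `evaluate_decision_list`, and the example list are assumptions made here and do not come from the literature cited above.

```python
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, int]
# One line "if L_i(X) then c_i": a condition on the assignment and a constant.
Line = Tuple[Callable[[Assignment], bool], int]


def evaluate_decision_list(lines: List[Line], tau: Assignment) -> int:
    """Return c_i for the first line whose condition L_i holds on tau."""
    for condition, constant in lines:
        if condition(tau):
            return constant
    raise ValueError("no line fired; the last condition should be constantly true")


# Hypothetical example list whose conditions are terms of width 1.
example_list: List[Line] = [
    (lambda t: t["x1"] == 1, 0),  # if x1 then 0
    (lambda t: t["x2"] == 0, 1),  # if not x2 then 1
    (lambda t: True, 0),          # default line
]
```

The order of the lines matters: the second line is only reached on inputs that falsify the first condition.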
There are different types of decision lists, depending on which form the conditions $L_i(X)$ have. In the classical work of Rivest [22], they are terms of width $k$, which already makes the model powerful enough to strictly generalize $k$-DNF, $k$-CNF and $k$-decision trees. Other works consider different classes of functions as the $L_i(X)$, e.g. linear threshold functions [8,24], bounded depth circuits [2], and combinatorial rectangles [15,21].
In recent years, decision lists have become important in the context of proof complexity of quantified Boolean formulas (QBF). The idea is that for several proof systems, one can efficiently extract from a refutation of an input formula a winning strategy of the universal player that is encoded as a decision list. The type of the decision list depends on the respective proof system, see e.g. [1,2,3]. Since the length of the decision list depends on the size of the refutation, lower bounds on decision lists then translate to lower bounds for proofs in the proof system at hand.
In this paper, we focus on so-called rectangle decision lists, i.e., decision lists in which all of the $L_i(X)$ are combinatorial rectangles, that is, functions of the form $L_i(X) = r^1_i(X_1) \wedge r^2_i(X_2)$ where $(X_1, X_2)$ is a partition of $X$ into sets of equal size and $r^1_i$, $r^2_i$ are arbitrary Boolean functions in $X_1$ and $X_2$, respectively. Rectangle decision lists were originally introduced in communication complexity theory [15,21]. Recently, they were also considered in QBF proof complexity in [19] where they were used to show lower bounds for certain OBDD-based proof systems.
While in settings from communication complexity it makes sense to assume that the partition $(X_1, X_2)$ is part of the problem statement and is thus fixed from the outside and the same throughout the decision list, this is not the case in the QBF setting: the solvers modeled by the OBDD-based proof system can freely choose the variable order of the OBDDs they construct. On the side of the decision lists, this results in a choice of the partition $(X_1, X_2)$ that is used throughout the proof. Thus, to show lower bounds, unlike in [15,21] which considered one fixed partition, it was shown in [19] that there are functions for which decision lists must be long for every possible balanced partition. This yields lower bounds for solvers that choose the variable order of the OBDDs optimally at the beginning and then work with that order throughout their run.
While this is in fact what the QBF-solver QBDD [20] does, this result is somewhat unsatisfying since it does not use the full power of modern practical OBDD-libraries: state-of-the-art OBDD-implementations like CUDD [23] allow reordering the variables of OBDDs. So it is conceivable to implement OBDD-based solvers for QBF that adapt the variable order throughout their run when this appears useful. Could this make the resulting solvers more powerful?
This paper answers the above question positively: even if we allow only one change of the variable order in the computation of the solver, this leads to exponentially faster runtime on some formulas. More generally, we show that for every constant k ∈ N, allowing k + 1 variable order changes yields exponential runtime savings over k variable order changes. We show this by using the translation of lower bound questions for OBDD-based QBF-solvers into lower bounds for rectangle decision lists from [19]. In this translation, every variable order change on the OBDDs becomes a potential partition change in the decision list. Thus, we study lower bounds for rectangle decision lists with changing partitions and show that their length decreases exponentially for some functions whenever one additional partition change is allowed.
Note that different partitions for rectangles have been studied before in so-called multi-partition communication complexity. In particular, there is a strict hierarchy with respect to the number of partitions in that setting as well [10]. Since rectangle decision lists are easily seen to simulate multi-partition communication protocols, our lower bound results for fixed k are qualitatively stronger than those of [10], even though the dependence on k is far better in [10] and k need not be constant there. Other related work is found in [6], which considers refutation lower bounds in an OBDD-based proof system for SAT that allows variable order changes. As is common for lower bounds for QBF-systems based on strategy extraction, see e.g. [1,2,3], we do not take into account the cost of reasoning inside NP but only measure the hardness that stems from the addition of quantifiers. Thus, our results are incomparable to those of [6].

Structure of the paper:
We start with some preliminaries and necessary background in Section 2. In Section 3, we showcase our approach by separating rectangle decision lists with a single partition change from those with none. The lower bound in this part of the paper is relatively easy with the results from [19] but can be seen as a warm-up for the more technical bounds later on. In Section 4, we develop the general technique for lower bounds for rectangle decision lists with partition changes. Afterwards, in Section 5, we show how to use this technique to prove that there is a strict hierarchy for rectangle decision lists with respect to the number of partition changes. In Section 6, we apply our techniques to QBF proof systems.
We assume that the reader knows some basic graph theory, see e.g. [9]. All graphs in this paper are finite and have no self-loops but in some cases parallel edges. Given a graph
If the exact values of $c$ and $d$ are not important or clear from the context, we simply speak of a class of expander graphs. It is well-known that for every $d \ge 3$ there is a constant $c > 0$ such that there is an infinite class of expander graphs, see e.g. [14] for background on this.

Boolean Functions.
We use standard notation for Boolean functions, i.e., functions $f : \{0, 1\}^n \to \{0, 1\}$ for some $n \in \mathbb{N}$. In particular, as usual, $\wedge$ denotes conjunction, $\vee$ disjunction, and $\oplus$ exclusive or. An assignment of a set $X$ of variables is a mapping $\tau : X \to \{0, 1\}$ of variables to truth values. Given assignments $\tau : X \to \{0, 1\}$ and $\sigma : Y \to \{0, 1\}$ such that $X$ and $Y$ are disjoint, we let $\tau \cup \sigma$ denote the assignment of $X \cup Y$ that agrees with $\tau$ on $X$ and with $\sigma$ on $Y$. The result of applying an assignment $\tau$ to a formula $\varphi$ and propagating constants is denoted $\varphi[\tau]$.
Unless stated explicitly otherwise, we assume in the remainder of this paper that all variable partitions are balanced. A combinatorial rectangle on variable set $X$ respecting the partition $(X_1, X_2)$ of $X$ is defined to be a function $R(X)$ that can be written as
$$R(X) = R_1(X_1) \wedge R_2(X_2)$$
where $R_1$ and $R_2$ are arbitrary Boolean functions depending only on $X_1$ and $X_2$, respectively. When writing the value table of $R$ as a matrix where rows are indexed by assignments to $X_1$ and columns are indexed by assignments to $X_2$, then, after arranging the rows and columns in such a way that the models of $R_1(X_1)$, resp. $R_2(X_2)$, are listed as consecutive rows, resp. columns, the models of $R$ form a rectangle in the matrix, which explains the name of the concept. Due to this representation, we call the models of $R_1$ and $R_2$ the rows and columns of $R$, respectively. In a slight abuse of notation, it is often convenient to identify combinatorial rectangles with their set of models $R^{-1}(1)$ in which case we simply write

Rectangle Decision Lists.
A rectangle decision list $L$ in variables $X$ is a sequence $(R_1, c_1), \ldots, (R_\ell, c_\ell)$ where every $R_i$ is a combinatorial rectangle in $X$ and $c_i \in \{0, 1\}$ is a constant. The Boolean function $f_L$ computed by $L$ is defined as follows: given an assignment $\tau$ to $X$, for $i = 1, \ldots, \ell$ we evaluate $R_i$ on $\tau$. Let $i^*$ be the first index such that $R_{i^*}(\tau) = 1$, then the value computed by $f_L$ on $\tau$ is $c_{i^*}$. For this to be well-defined, we assume that $R_\ell$ is the constant $1$-function. We say that $\ell$ is the length of $L$ and denote it by $|L|$. If all rectangles $R_i$ respect the partition $\Pi = (X_1, X_2)$ of $X$, we say that $L$ respects $\Pi$. We say that $L$ has $k$ partition changes if there are exactly
Given a rectangle decision list $L$ and a partial assignment $\tau$ to a subset $Y$ of $X$, one can construct from $L$ a rectangle decision list $L'$ that computes the function one gets from $f_L$ by fixing the variables in $Y$ according to $\tau$. The list $L'$ is constructed by simply fixing all rectangles of $L$ according to $\tau$ and thus $|L'| \le |L|$.
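To make the evaluation and the restriction by a partial assignment concrete, the following Python sketch may help. It is a minimal illustration under assumptions made here: the class `Rectangle`, the functions `evaluate` and `restrict`, and the example list are hypothetical names, and a rectangle is stored as two predicates, one per block of its partition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

Assignment = Dict[str, int]
Predicate = Callable[[Assignment], bool]


@dataclass(frozen=True)
class Rectangle:
    x1: FrozenSet[str]  # first block of the partition
    x2: FrozenSet[str]  # second block of the partition
    r1: Predicate       # depends only on the variables in x1
    r2: Predicate       # depends only on the variables in x2

    def __call__(self, tau: Assignment) -> bool:
        left = {v: tau[v] for v in self.x1}
        right = {v: tau[v] for v in self.x2}
        return self.r1(left) and self.r2(right)


RectangleDecisionList = List[Tuple[Rectangle, int]]


def evaluate(rdl: RectangleDecisionList, tau: Assignment) -> int:
    """Return the constant of the first rectangle that contains tau."""
    for rect, constant in rdl:
        if rect(tau):
            return constant
    raise ValueError("the last rectangle should be the constant 1-function")


def restrict(rdl: RectangleDecisionList, sigma: Assignment) -> RectangleDecisionList:
    """Fix the variables of sigma in every rectangle; the length does not grow."""
    def fix(rect: Rectangle) -> Rectangle:
        s1 = {v: b for v, b in sigma.items() if v in rect.x1}
        s2 = {v: b for v, b in sigma.items() if v in rect.x2}
        return Rectangle(
            rect.x1 - frozenset(s1),
            rect.x2 - frozenset(s2),
            lambda a, r=rect.r1, s=s1: r({**a, **s}),
            lambda a, r=rect.r2, s=s2: r({**a, **s}),
        )
    return [(fix(rect), constant) for rect, constant in rdl]


# Hypothetical example: a list of length 2 computing x1 AND y1.
X1, X2 = frozenset({"x1", "x2"}), frozenset({"y1", "y2"})
example = [
    (Rectangle(X1, X2, lambda a: a["x1"] == 1, lambda b: b["y1"] == 1), 1),
    (Rectangle(X1, X2, lambda a: True, lambda b: True), 0),  # default line
]
```

Restricting `example` by the partial assignment that sets `x1` to 1 yields a list of the same length that computes `y1` on the remaining variables.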
We generally assume that rectangle decision lists respect a balanced partition Π, but in some cases it will be convenient to allow for some slight imbalance.The following observation shows that this cannot decrease the length of the decision lists by much.

Observation 1. Let L be a rectangle decision list respecting a partition
Proof (sketch). For every rectangle $R_i$ in $L$ and every assignment $\tau$ to $X'$, consider the sub-rectangle $R_i^\tau$ containing the models that are consistent with $\tau$. Clearly, $R_i$ can be partitioned into at most $2^{k/2}$ such $R_i^\tau$, so we can substitute $(R_i, c_i)$ by $2^{k/2}$ entries $(R_i^\tau, c_i)$. But since the models of $R_i^\tau$ all take the same values on $X'$, the rectangle $R_i^\tau$ can also be rewritten to respect $(X_1 \setminus X', X_2 \cup X')$.
In lower bounds, we often tacitly ignore slight imbalances in partitions due to Observation 1. We will use the following relation between the length of rectangle decision lists and the size of monochromatic rectangles from [15], which we slightly reformulate.
Theorem 2. Let $f$ be a function in variables $X$ and let $\Pi$ be a balanced partition of $X$. If $f$ has a rectangle decision list with partition $\Pi$ of length $s$, then $f$ has a monochromatic rectangle with partition $\Pi$ of size at least $\frac{1}{4es} 2^{|X|}$, where $e \approx 2.718$ denotes Euler's number.
The inner product function $\mathrm{IP}_n$ in variables $X_n := \{x_1, \ldots, x_n\}$ and $Y_n := \{y_1, \ldots, y_n\}$ is defined as the inner product over the two-element field, so
$$\mathrm{IP}_n(X_n, Y_n) := \bigoplus_{i=1}^{n} x_i \wedge y_i.$$
We use a generalization of the inner product function $\mathrm{IP}$ with respect to an underlying graph structure from [19,13], see also [17, Chapter 5.8]. So let $X$ be a set of Boolean variables and let $G$ be a graph with vertex set $X$ and edge set $E$. Then we define
$$\mathrm{IP}_G(X) := \bigoplus_{xy \in E} x \wedge y.$$
Note that with this definition $\mathrm{IP}_n = \mathrm{IP}_{M_n}$ where $M_n$ is a matching with $n$ edges. For the statement of the following lemma, recall that a matching is induced if it can be obtained as the subgraph induced by the endpoints of its edges. We will use the following result from [19].
Lemma 3. Let $G = (X, E)$ be a graph with $n$ vertices. Let $\{e_1, \ldots, e_m\}$ be an induced matching of $G$ and let $(X_1, X_2)$ be a partition of $X$ such that for every $e_i$ we have $e_i \in E(X_1, X_2)$. Then every monochromatic rectangle of $\mathrm{IP}_G$ respecting the partition $(X_1, X_2)$ has size at most $2^{n-m}$.
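The relation between the graph inner product and the classical inner product can be checked by brute force on small instances. The Python sketch below assumes that the graph inner product is the parity, taken over all edges, of the conjunction of the two endpoint variables; the function names `ip_graph`, `ip_n` and `matching` are hypothetical and only serve this illustration.

```python
from itertools import product
from typing import Dict, Iterable, List, Sequence, Tuple

Edge = Tuple[str, str]


def ip_graph(edges: Iterable[Edge], tau: Dict[str, int]) -> int:
    """Parity of the number of edges whose two endpoints are both set to 1."""
    return sum(tau[u] & tau[v] for u, v in edges) % 2


def ip_n(xs: Sequence[int], ys: Sequence[int]) -> int:
    """Inner product of two bit vectors over the two-element field."""
    return sum(x & y for x, y in zip(xs, ys)) % 2


def matching(n: int) -> List[Edge]:
    """The matching M_n with the n edges x_i y_i."""
    return [(f"x{i}", f"y{i}") for i in range(1, n + 1)]


def agree_on_all_inputs(n: int) -> bool:
    """Check that the graph inner product on M_n coincides with ip_n."""
    for bits in product((0, 1), repeat=2 * n):
        xs, ys = bits[:n], bits[n:]
        tau = {f"x{i + 1}": xs[i] for i in range(n)}
        tau.update({f"y{i + 1}": ys[i] for i in range(n)})
        if ip_graph(matching(n), tau) != ip_n(xs, ys):
            return False
    return True
```

Since parallel edges contribute to the sum separately, a pair of parallel edges cancels out in the parity.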

Ordered Binary Decision Diagrams.
Ordered binary decision diagrams (short OBDDs) are a classical representation of Boolean functions [5]. We only give a very short introduction here; see [25] for a textbook treatment.
Let $X$ be a set of variables and $\pi$ an ordering of $X$. An OBDD on variables $X$ with variable order $\pi$ is defined to be a directed acyclic graph $B$ with one source $s$ and two sinks labeled 0 and 1, called the 0-sink and 1-sink, respectively. All non-sink nodes are labeled with variables from $X$ such that on every path $P$ in $B$ the variables appear in the order $\pi$. Moreover, all non-sink nodes have two outgoing edges, one labeled with 0, the other with 1. The size of $B$, denoted by $|B|$, is defined as the number of nodes in $B$. For every assignment $\tau$ to $X$, the OBDD $B$ computes a value $B(\tau)$ as follows: starting in the source, we construct a path by taking for every node $v$ labeled by a variable $x$ the edge labeled with $\tau(x)$. We continue until we end up in a sink, and the label of the sink is the value of $B$ on $\tau$, denoted by $B(\tau)$. This way $B$ computes a Boolean function, and every Boolean function can be computed by an OBDD. The OBDD $B$ is called complete if on every source-sink path $P$ all variables in $X$ appear as node labels. The width of a complete OBDD $B$ is defined as the maximal number of nodes that are labeled with the same variable.
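The evaluation of an OBDD and the width of a complete OBDD can be illustrated with a small Python sketch. The encoding is an assumption made for this example only: non-sink nodes are stored in a dictionary that maps a node name to a triple (variable, 0-successor, 1-successor), and the two sinks are represented by the integers 0 and 1.

```python
from collections import Counter
from typing import Dict, Tuple, Union

Successor = Union[str, int]  # a node name, or one of the sinks 0 and 1
Nodes = Dict[str, Tuple[str, Successor, Successor]]


def evaluate_obdd(nodes: Nodes, source: Successor, tau: Dict[str, int]) -> int:
    """Follow the path selected by tau from the source to a sink."""
    current = source
    while current not in (0, 1):
        variable, child0, child1 = nodes[current]
        current = child1 if tau[variable] else child0
    return current


def width(nodes: Nodes) -> int:
    """Maximal number of nodes labeled with the same variable."""
    return max(Counter(variable for variable, _, _ in nodes.values()).values())


# Hypothetical complete OBDD for x1 XOR x2 with variable order x1 < x2.
xor_obdd: Nodes = {
    "s": ("x1", "a", "b"),
    "a": ("x2", 0, 1),
    "b": ("x2", 1, 0),
}
```

In this example both paths from the source test `x1` and then `x2`, so the OBDD is complete, and two nodes carry the label `x2`.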
OBDDs are well-known to be canonical in the sense that, for a fixed variable order $\pi$, there is a unique minimal representation of any Boolean function $f$ by an OBDD with order $\pi$, see again [25, Chapter 3] for background and proofs of this.
Lemma 6. Let $f$ be a Boolean function on variables $X$ and let $\pi$ be a variable order of $X$. Then there is a unique OBDD (up to isomorphism) of minimal size with order $\pi$ computing $f$. Moreover, given an OBDD with order $\pi$ representing $f$, this unique OBDD can be computed in polynomial time. The same is true for complete OBDDs.
We always assume that OBDDs are minimized with the help of the algorithm of Lemma 6.

Lemma 7 ([7]). Let $B$ be an OBDD of width $w$ and let $Y$ be a subset of the variables in $B$. Then there is an OBDD $B'$ of width $2^w$ that encodes $\exists Y.B$ with the same variable order as $B$.

Quantified Boolean Formulas.
We consider quantified Boolean formulas (QBF) in prefix conjunctive normal form (PCNF), i.e., in the form $\Phi = Q_1 x_1 \ldots Q_n x_n . C_1 \wedge \ldots \wedge C_m$ where the $Q_i$ are quantifiers $\exists$ or $\forall$ and the $C_j$ are clauses. We call $Q_1 x_1 \ldots Q_n x_n$ the prefix of $\Phi$ and $C_1 \wedge \ldots \wedge C_m$ the matrix.
We write $D_\Phi(x_i) = \{x_1, \ldots, x_{i-1}\}$ for the set of variables that come before $x_i$ in the quantifier prefix. A variable $x_i$ is existential if $Q_i = \exists$, and universal if $Q_i = \forall$. We write $\mathrm{var}_\exists(\Phi)$ for the set of existential variables, $\mathrm{var}_\forall(\Phi)$ for the set of universal variables, and $\mathrm{var}(\Phi)$ for the set of all variables occurring in $\Phi$. A universal strategy for a PCNF $\Phi$ is a family $f = \{f_u\}_{u \in \mathrm{var}_\forall(\Phi)}$ of functions $f_u : [\mathrm{var}(\Phi)] \to \{0, 1\}$ such that $f_u(\tau) = f_u(\sigma)$ for any assignments $\tau$ and $\sigma$ that agree on $D_\Phi(u)$. If $f$ is a universal strategy and $\tau : \mathrm{var}_\exists(\Phi) \to \{0, 1\}$ an assignment of existential variables, we write $\tau \cup f(\tau)$ for the assignment of $\mathrm{var}(\Phi)$ such that for universal variables $u \in \mathrm{var}_\forall(\Phi)$. A universal strategy $f$ is a universal winning strategy for $\Phi$ if $\tau \cup f(\tau)$ falsifies the matrix of $\Phi$ for every assignment $\tau$ of the existential variables. A QBF is false if it has a universal winning strategy, and true otherwise.
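For very small formulas, the notion of a universal winning strategy can be tested by brute force. The following Python sketch is only an illustration under assumptions made here: a prefix is a list of pairs (quantifier, variable) with quantifiers written as "E" and "A", a clause is a list of pairs (variable, polarity), and each strategy function receives exactly the values of the variables that precede its universal variable in the prefix, which is one way to enforce the dependency condition above.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, int]
Clause = List[Tuple[str, bool]]  # (variable, True for a positive literal)
Prefix = List[Tuple[str, str]]   # (quantifier "E" or "A", variable)
Strategy = Dict[str, Callable[[Assignment], int]]


def matrix_value(clauses: List[Clause], tau: Assignment) -> bool:
    """Evaluate the conjunction of clauses under the full assignment tau."""
    return all(any(tau[v] == int(positive) for v, positive in clause)
               for clause in clauses)


def is_universal_winning_strategy(prefix: Prefix, clauses: List[Clause],
                                  strategy: Strategy) -> bool:
    """Check that every existential assignment, extended by the strategy,
    falsifies the matrix."""
    existential = [v for q, v in prefix if q == "E"]
    for bits in product((0, 1), repeat=len(existential)):
        tau = dict(zip(existential, bits))
        full: Assignment = {}
        for quantifier, v in prefix:
            if quantifier == "E":
                full[v] = tau[v]
            else:
                # The strategy for v only sees the variables preceding v.
                full[v] = strategy[v](dict(full))
        if matrix_value(clauses, full):
            return False
    return True


# Hypothetical example: exists x forall u. (x or u) and (not x or not u).
prefix = [("E", "x"), ("A", "u")]
clauses = [[("x", True), ("u", True)], [("x", False), ("u", False)]]
copy_x: Strategy = {"u": lambda a: a["x"]}
always_zero: Strategy = {"u": lambda a: 0}
```

The matrix of the example is satisfied exactly when `x` and `u` differ, so copying `x` wins for the universal player while the constant strategy does not.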

Rectangle decision lists with more than one order are exponentially shorter
In this section, we will see that even allowing one partition change in a rectangle decision list allows for exponentially shorter lists. We show this with the following function: let $G = (X, E)$ be a $(c, 3)$-expander graph where the vertex set $X$ consists of the variables of the function we will construct. Using Vizing's theorem, see [9, Section 5.3], we know that there is a valid edge coloring $\chi$ of $G$ with at most 4 colors, say, $\{1, 2, 3, 4\}$. Now set
$$E_1 := \{e \in E \mid \chi(e) \in \{1, 2\}\}, \qquad E_2 := \{e \in E \mid \chi(e) \in \{3, 4\}\}$$
and define the two graphs $G_1 := (X, E_1)$ and $G_2 := (X, E_2)$. The function we will consider is then
$$f_G(X, z) := (z \wedge \mathrm{IP}_{G_1}(X)) \vee (\neg z \wedge \mathrm{IP}_{G_2}(X))$$
where $z$ is a fresh variable not in $X$.

Proposition 8. $f_G$ has a constant size rectangle decision list with one variable partition change.
Proof. We will start with a simple observation:
Claim 9. There is a variable partition $\Pi_1$ such that $\mathrm{IP}_{G_1}(X)$ has a constant length rectangle decision list.
Proof. All vertices in $X$ are incident to at most 2 edges in $G_1$. So $G_1$ consists of a collection of cycles and paths. First, consider an order $\pi'$ of $X$ such that for each component the vertices appear in a contiguous sequence. Let $(X'_1, X'_2)$ be the partition of $X$ that we get by cutting $X$ into two parts in the middle of $\pi'$. Then there is at most one component $C$ of $G_1$ that has vertices in both $X'_1$ and $X'_2$. Sorting $C$ in DFS order then yields an order $\pi$ and a corresponding partition $(X_1, X_2)$ such that there are at most two edges $e_1, e_2$ between $X_1$ and $X_2$ in $G_1$.
To complete the proof of the claim, we explain now how to compute $\mathrm{IP}_{G_1}(X)$ with the help of few rectangles. Let $e_1 = x_1 y_1$ and $e_2 = x_2 y_2$. Then, given an assignment $\tau$ to $X$, we can decide if $\tau$ satisfies $\mathrm{IP}_{G_1}(X)$ from the values $\mathrm{IP}_{G_1[X_1]}(\tau)$, $\mathrm{IP}_{G_1[X_2]}(\tau)$, $\tau(x_1)$, $\tau(x_2)$, $\tau(y_1)$, $\tau(y_2)$. Note that the set of assignments $\tau'$ that coincide with $\tau$ on all these values is a rectangle with partition $(X_1, X_2)$ and these rectangles are monochromatic with respect to $f_G$ and partition the space of all assignments to $X$. Moreover, on half of the 32 resulting rectangles $\mathrm{IP}_{G_1}$ evaluates to 1. As a consequence, we can construct a rectangle decision list respecting $(X_1, X_2)$ by iteratively testing for containment in one of the 16 rectangles on which $\mathrm{IP}_{G_1}$ evaluates to 1 and rejecting if the input is in none of them.
With Claim 9, we get partitions $\Pi_1$ and $\Pi_2$ such that $\mathrm{IP}_{G_1}(X)$ has a constant length decision list with respect to the partition $\Pi_1$ and $\mathrm{IP}_{G_2}(X)$ has a constant length decision list with partition $\Pi_2$. By construction, in both of the constructed decision lists, all outputs but that of the last (default) rectangle are 1. Thus, we get constant size decision lists for $z \wedge \mathrm{IP}_{G_1}(X)$ and $\neg z \wedge \mathrm{IP}_{G_2}(X)$ by adding the variable $z$ and fixing it to 1, respectively 0, in all rectangles. We get a decision list for $f_G$ by deleting the last default line from the list for $z \wedge \mathrm{IP}_{G_1}(X)$ and concatenating the list for $\neg z \wedge \mathrm{IP}_{G_2}(X)$ to it.
We next show that if we allow no variable order changes, then rectangle decision lists for $f_G$ are exponentially long.
Proposition 10. For every balanced partition $(X_1, X_2)$, every rectangle decision list for $f_G$ respecting the partition $(X_1, X_2)$ has size $2^{\Omega(|X|)}$.
SAT 2022

Proof. Fix a partition $(X_1, X_2)$. Then, because $G$ is an expander graph, there are $\Omega(|X|)$ edges between $X_1$ and $X_2$. Call this set of edges $E'$. Assume w.l.o.g. that $|E' \cap E_1| \ge |E' \cap E_2|$. Now consider an $(X_1, X_2)$-rectangle decision list for $f_G$. By fixing in all rectangles the variable $z$ to 1, we get an $(X_1, X_2)$-rectangle decision list for $\mathrm{IP}_{G_1}(X)$ without increasing the size of the list. Now greedily extract from $E' \cap E_1$ an induced matching $M$. Since the degree of all vertices in $G_1$ is at most 2, we have $|M| = \Omega(|X|)$. Using Lemma 3, we get that any monochromatic rectangle of $\mathrm{IP}_{G_1}(X)$ respecting the partition has size at most $2^{|X| - \Omega(|X|)}$. Plugging this into Theorem 2, we get
$$\frac{1}{4es} 2^{|X|} \le 2^{|X| - \Omega(|X|)},$$
where $s$ is the length of the rectangle decision list. It follows that $s = 2^{\Omega(|X|)}$ as claimed.
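The greedy extraction of an induced matching used in this proof can be sketched as follows. This is one possible greedy procedure, written under assumptions made here: the function names are hypothetical, the candidate edges are expected to be edges of the graph, and after choosing an edge all vertices in the closed neighborhood of its endpoints are blocked.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def adjacency(graph_edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Neighborhoods of all vertices that occur in some edge."""
    adj: Dict[str, Set[str]] = {}
    for u, v in graph_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def greedy_induced_matching(graph_edges: List[Edge],
                            candidates: Iterable[Edge]) -> List[Edge]:
    """Pick candidate edges one by one; after each pick, block the endpoints
    and all their neighbors so that later picks are not adjacent to it."""
    adj = adjacency(graph_edges)
    blocked: Set[str] = set()
    chosen: List[Edge] = []
    for u, v in candidates:
        if u in blocked or v in blocked:
            continue
        chosen.append((u, v))
        blocked |= {u, v} | adj.get(u, set()) | adj.get(v, set())
    return chosen


def is_induced_matching(graph_edges: List[Edge], chosen: List[Edge]) -> bool:
    """Check that the only graph edges among the endpoints are the chosen ones."""
    endpoints = {x for e in chosen for x in e}
    chosen_set = {frozenset(e) for e in chosen}
    if len(endpoints) != 2 * len(chosen):
        return False
    return all(frozenset((u, v)) in chosen_set
               for u, v in graph_edges if u in endpoints and v in endpoints)


# Hypothetical example: the path a - b - c - d - e - f.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
```

In a graph of maximum degree 2, every pick blocks at most six vertices, and every blocked vertex lies on at most two candidate edges, so the greedy procedure keeps a constant fraction of the candidates.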

Lower Bounds for Lists with Partition Changes
In this section, we will develop a lower bound technique for rectangle decision lists that also works when some partition changes are allowed. The main problem when trying to generalize the proof of Section 3 is that the argument of the proof of Theorem 2 breaks down when allowing even a single partition change. This is because a rectangle $R$ for a partition $\Pi_1$ might not contain any big rectangles anymore when considered with respect to a different partition $\Pi_2$. To avoid this problem, we develop a new lower bound technique that can play the role of Theorem 2 when some partition changes can occur. Since bounds on rectangle sizes seem to be not quite strong enough to show such a lower bound, we base it on the more restrictive notion of discrepancy, which we introduce first.

Discrepancy
Discrepancy of Boolean functions is a well-known tool in communication complexity, in particular in randomized models, see e.g. [18, Chapter 3]. Here we will consider a variant of discrepancy with respect to different partitions of the variables. To this end, we make some definitions. Let $f : \{0, 1\}^n \to \{0, 1\}$ be a Boolean function. As usual, we define the discrepancy of a rectangle $R$ with respect to the function $f$ and a probability distribution $\mu$ of inputs to $f$ as
$$\mathrm{Disc}_\mu(R, f) := \left| \Pr_\mu[f(x) = 1 \wedge x \in R] - \Pr_\mu[f(x) = 0 \wedge x \in R] \right|$$
where $x$ is chosen randomly according to the distribution $\mu$. Now we define the discrepancy of $f$ for the partition $\Pi$ and the distribution $\mu$ as
$$\mathrm{Disc}_\mu(f, \Pi) := \max_R \mathrm{Disc}_\mu(R, f)$$
where the maximum is over all rectangles respecting the partition $\Pi$. We will exclusively work with the uniform distribution so we leave out the subscript $\mu$ and get
$$\mathrm{Disc}(R, f) = \frac{1}{2^n} \left| |f^{-1}(1) \cap R| - |f^{-1}(0) \cap R| \right|.$$
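For functions in very few variables, the discrepancy for a given partition under the uniform distribution can be computed by exhaustive search. The Python sketch below assumes the standard reading of the definition, namely that the discrepancy of a rectangle is the absolute difference between its number of 1-inputs and its number of 0-inputs, divided by the total number of inputs. The function name `discrepancy` and the example function are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence


def discrepancy(f: Callable[[Dict[str, int]], int],
                x1: Sequence[str], x2: Sequence[str]) -> float:
    """Brute-force discrepancy of f for the partition (x1, x2), uniform case."""
    rows = list(product((0, 1), repeat=len(x1)))
    cols = list(product((0, 1), repeat=len(x2)))
    n = len(x1) + len(x2)
    # sign[r][c] is +1 if f is 1 on the input (rows[r], cols[c]) and -1 otherwise.
    sign = [[1 if f({**dict(zip(x1, a)), **dict(zip(x2, b))}) else -1
             for b in cols] for a in rows]
    best = 0
    for row_mask in range(1, 2 ** len(rows)):
        # Column sums restricted to the selected set of rows.
        col_sums = [sum(sign[r][c] for r in range(len(rows)) if row_mask >> r & 1)
                    for c in range(len(cols))]
        # For a fixed row set, the best column set collects all columns
        # of one sign.
        positive = sum(s for s in col_sums if s > 0)
        negative = -sum(s for s in col_sums if s < 0)
        best = max(best, positive, negative)
    return best / 2 ** n


def ip2(t: Dict[str, int]) -> int:
    """Inner product on two pairs of variables."""
    return (t["x1"] & t["y1"]) ^ (t["x2"] & t["y2"])
```

For `ip2`, the partition that separates the `x` variables from the `y` variables gives a smaller discrepancy than the partition that keeps each pair `x_i`, `y_i` on the same side, which reflects why the choice of partition matters. The search enumerates all row sets and is therefore only feasible for a handful of variables.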

S. Mengel 17:9
We now give a bound on the discrepancy of the graph inner product function $\mathrm{IP}_G$. It turns out that, as for the size of monochromatic rectangles in Lemma 3, the discrepancy is bounded exponentially in the size of an induced matching between the two sides of the considered partition.
Lemma 11. Let $G = (X, E)$ be a graph with $n$ vertices. Let $\{e_1, \ldots, e_m\}$ be an induced matching of $G$ and let $\Pi = (X_1, X_2)$ be a partition of $X$ such that for every $e_i$ one of the end points is in $X_1$ and one is in $X_2$. Then
The proof of Lemma 11 is very similar to that of Lemma 3, so we sketch it in Appendix A.

Rectangles in Partial Functions
It will be useful to consider partial functions which we model as functions $\tilde f : \{0, 1\}^n \to \{0, 1, *\}$. Here, as usual, 1 and 0 stand for true and false, respectively, while $*$ denotes inputs on which $\tilde f$ is undefined. We say that a Boolean function $f$ is consistent with $\tilde f$ if $f(a) = \tilde f(a)$ whenever $\tilde f(a) \in \{0, 1\}$. Essentially, the Boolean functions consistent with $\tilde f$ are all functions we can get from $\tilde f$ by defining all undefined values. As a special case, we say that a rectangle $R$ is consistent with $\tilde f$ if $\tilde f(R) \subseteq \{0, *\}$ or $\tilde f(R) \subseteq \{1, *\}$. Note that we assume in this that $R$ is a Boolean function and in particular is not partial.
We will use the following simple observation.
Lemma 12. Let $f : \{0, 1\}^n \to \{0, 1\}$ be a Boolean function with discrepancy $d$. Let $\tilde f$ be a partial Boolean function with at most $u$ undefined values such that $f$ is consistent with $\tilde f$. Then any rectangle consistent with $\tilde f$ has size at most $2^n d + 2u$.
Proof. Let $R$ be a rectangle consistent with $\tilde f$. Assume w.l.o.g. that $\tilde f(R) \subseteq \{1, *\}$. Then we bound the size of $R$ as follows:
$$|R| = |f^{-1}(1) \cap R| - |f^{-1}(0) \cap R| + 2 |f^{-1}(0) \cap R| \le 2^n \mathrm{Disc}(R, f) + 2 |f^{-1}(0) \cap R| \le 2^n d + 2u.$$
The first step is true because by definition of discrepancy we have $|f^{-1}(1) \cap R| - |f^{-1}(0) \cap R| \le 2^n \mathrm{Disc}(R, f)$. In the second step we use that, since $R$ is consistent with $\tilde f$, all inputs in $f^{-1}(0) \cap R$ must be undefined in $\tilde f$.

Lower Bounds for Functions with Small Discrepancy
We can now formulate and prove the main result of this section, which shows that discrepancy can be used to show lower bounds for decision lists with partition changes.
Proposition 13. Let $f : \{0, 1\}^n \to \{0, 1\}$ be computed by a rectangle decision list with $k - 1$ partition changes using the $k$ partitions $\Pi_1, \ldots, \Pi_k$. Assume that for every $i \in [k]$ we have $\mathrm{Disc}(f, \Pi_i) \le 2^{-cn}$. Then the length of the rectangle decision list is at least Ω(2
Proof. Let $f$ be computed by the decision list $(R_1, c_1), \ldots, (R_t, c_t)$. Assume that the partitions $\Pi_1, \ldots, \Pi_k$ are used in that order in the decision list.
We will iteratively construct a big rectangle $\hat R$ that is consistent with a partial function $\hat f$ that is consistent with $f$ and has relatively few unknown values. The idea is similar to that in [15] but more complicated due to the partition changes. We think of the rectangles $R_1, \ldots, R_t$ as organized in phases: rectangle $R_i$ is in phase $j$ if it is with respect to the partition $\Pi_j$. We construct a rectangle $\tilde R_i$ iteratively for all $i \in [t]$ such that
$$\tilde R_i \cap R_{i'} = \emptyset \text{ for all } i' \le i \text{ such that } R_{i'} \text{ is in the same phase as } R_i. \qquad (1)$$
Moreover, we construct for every phase $j \in [k]$ a partial function $\tilde f_j$ that is consistent with $f$. We start by setting $\tilde f_1 := f$. We now construct the $\tilde R_i$. If $R_i$ is the first rectangle in phase $j$, we check if
If so, we set $\hat R = R_i$, $\hat f = \tilde f_j$ and stop. Otherwise, if $R_i$ has fewer than $(2t + 1)^{j-1} 2^{(n - cn/2^{j-1})/2}$ rows, we set $\tilde R_i$ to be the rectangle we get from $\{0, 1\}^n$ by deleting the rows of $R_i$. Otherwise, $R_i$ has fewer than $(2t + 1)^{j-1} 2^{(n - cn/2^{j-1})/2}$ columns which we then delete from $\{0, 1\}^n$ to get $\tilde R_i$.
By construction, the property (1) is satisfied: the only rectangle $R_{i'}$ that is in the same phase as $R_i$ and has $i' \le i$ is in fact $R_i$. But since we deleted either all rows or all columns of $R_i$ in the construction of $\tilde R_i$, the intersection is empty.
If $R_i$ is not the first rectangle in the phase, then we have already constructed $\tilde R_{i-1}$ which has the same partition $\Pi_j$ as $R_i$. Note that $R_i \cap \tilde R_{i-1}$ is a rectangle. We proceed similarly to before but consider $R_i \cap \tilde R_{i-1}$ instead of $R_i$. Otherwise, we delete the rows or columns of $R_i \cap \tilde R_{i-1}$ from $\tilde R_{i-1}$, whichever are fewer, to construct $\tilde R_i$.
Finally, we define $\tilde f_j$ for $j > 1$ inductively as the partial function we get from $\tilde f_{j-1}$ by making all entries of all rows and columns that have ever been deleted in the construction of the $\tilde R_i$ in an earlier phase take the value $*$. Obviously, $f$ and $\tilde f_j$ are consistent for all $j \in [k]$. Let us analyze how many inputs to $\tilde f_j$ evaluate to $*$. In the worst case, we have deleted columns or rows in $t$ steps of the construction. The undefined values in $\tilde f_j$ are from the deletions of rows and columns in phases $1, \ldots, j - 1$. The highest number of rows or columns deleted for one such $R_i$ is in phase $j - 1$ and is $(2t + 1)^{j-2} 2^{(n - cn/2^{j-2})/2}$ there, so in that step at most $2^{n/2} (2t + 1)^{j-2} 2^{(n - cn/2^{j-2})/2} = (2t + 1)^{j-2} 2^{n - cn/2^{j-1}}$ values have been set to $*$. So $\tilde f_j$ has at most $u := t (2t + 1)^{j-2} 2^{n - cn/2^{j-1}}$ undefined values.
Now assume that $\hat R$ and $\hat f$ are assigned in phase $j$ which happens by construction if we have that the first rectangle $R_i$ in phase $j$ satisfies
Then in any case we get that
Now assume w.l.o.g. that $c_i = 1$, i.e., the rectangle $R_i$ assigns the value 1 to all inputs that end up at this test during the evaluation of the decision list and satisfy $R_i$. We claim that the function
is consistent with $\hat f$. Assume this were not the case. Then there must be an element $(x, y) \in \hat R$ on which $\hat f(x, y) = 0$, so this value must be assigned in the test for a rectangle $R_{i'}$ that is tested before $R_i$. Assume first that this happens in the same phase $j$. Then $\hat R \subseteq \tilde R_{i-1} \subseteq \tilde R_{i'}$.
But by (1), the rectangle $R_{i'}$ can then not be responsible for assigning any value in $\hat R$. So the rectangle $R_{i'}$ must be in an earlier phase $j' < j$. But note that when constructing $\tilde R_{i'}$, we have by (1) deleted all entries of $R_{i'}$. Thus, in $\tilde f_j$ we have the value $*$ for the corresponding inputs which is a contradiction to $\hat f$ taking the value 0 there. But then we get a contradiction with Lemma 12 and (2), because the rectangle $\hat R$ is too big. It follows that $\hat R$ and $\hat f$ can never be assigned in the construction. As a consequence, the construction of the $\tilde R_i$ goes through to the end. Reasoning as before, all non-$*$-values in $\tilde R_t$ must have the same value. As a consequence, we get with Lemma 12 that
Reasoning as for the number of unknown values in $\tilde f_j$ before, we also know that $\tilde R_t$ is constructed from $\{0, 1\}^n$ by deleting in at most $t$ rounds at most $(2t + 1)^{k-2} 2^{n - cn/2^{k-1}}$ values each. From this we get
Putting this together, it follows that

Separating the Hierarchy for Partition Changes
In this section, we will use Proposition 13 to show that increasing the number $k$ of partition changes allowed in a rectangle decision list makes them exponentially more succinct. To this end, we will need functions that have small discrepancy when considered for any set of $k$ partitions but that become easy as soon as we allow $k + 1$ partitions. The functions that we will consider are constructed from functions of the form $\mathrm{IP}_G$ for carefully chosen graphs.
To construct these graphs, let us make some additional definitions. Given two graphs $G_1$ and $G_2$ on the same vertex set $V$, denote by $G_1 + G_2$ the graph that we get by taking the union of the edge sets of $G_1$ and $G_2$.
Proposition 14. For every $k \in \mathbb{N}$ there is a constant $c$ such that for every $n \in \mathbb{N}$ large enough, there are $k + 1$ graphs $G_1, \ldots, G_{k+1}$ on a vertex set $V$ of size $n$ with the following properties: no graph $G_i$ has any parallel edges, every vertex in every $G_i$ has degree at most 2, and for every $k$ partitions $(V^1_1, V^1_2), \ldots, (V^k_1, V^k_2)$ of $V$ there is an $i \in [k + 1]$ such that for every $j \in [k]$ the graph $G_i[V^j_1, V^j_2]$ has an induced matching of size $cn$.
Proof. Let $V$ be a vertex set that is big enough. We choose all graphs $G_i$ randomly in the so-called configuration model which we briefly describe next. Let $d$ be even. We then get a multigraph by projecting the edges of the configuration to $V$. It is known that almost every 4-regular multi-graph chosen in the configuration model is an expander, see [4], i.e., with probability going to 1 for $n \to \infty$ a graph chosen in this model is an $(\alpha, 4)$-expander for some constant $\alpha > 0$.
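As an illustration of the random construction, the following Python sketch samples a multigraph in the usual textbook form of the configuration model: every vertex receives $d$ points, a uniformly random perfect matching on all points is chosen, and each matched pair is projected to an edge between the corresponding vertices. The function names are hypothetical, and the sketch makes no attempt to filter out self-loops or parallel edges, so any such conditioning would have to be added separately.

```python
import random
from collections import Counter
from typing import List, Tuple

Edge = Tuple[int, int]


def configuration_model(n: int, d: int, rng: random.Random) -> List[Edge]:
    """Sample a d-regular multigraph on the vertices 0, ..., n-1."""
    if (n * d) % 2 != 0:
        raise ValueError("n * d must be even")
    # Every vertex contributes d points (half-edges).
    points = [v for v in range(n) for _ in range(d)]
    rng.shuffle(points)
    # Pairing consecutive points of a uniformly shuffled list yields a
    # uniformly random perfect matching on the points.
    return [(points[i], points[i + 1]) for i in range(0, len(points), 2)]


def graph_sum(g1: List[Edge], g2: List[Edge]) -> List[Edge]:
    """The graph G1 + G2: union of the two edge lists on the same vertex set."""
    return list(g1) + list(g2)


def degrees(n: int, edges: List[Edge]) -> List[int]:
    """Degree sequence; a self-loop contributes 2 to the degree of its vertex."""
    deg: Counter = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[v] for v in range(n)]
```

Summing two samples with $d = 2$ on the same vertex set gives a multigraph in which every vertex has degree 4.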
An alternative way to construct a random 4-regular graph is to choose two random 2-regular graphs $G_1, G_2$ in the configuration model and take their sum $G_1 + G_2$; call this the sum model. It is known that the configuration model and the sum model are contiguous, which intuitively means that both of them have asymptotically the same properties almost surely, see [16, Chapter 9] for exact definitions and details on this. In particular, since graphs chosen in the configuration model are $(\alpha, 4)$-expanders for some constant $\alpha > 0$ with probability going to 1 for $n \to \infty$, the same is true for graphs chosen randomly in the sum model. From this convergence, it follows that there is a constant $n_0$ such that for a random graph with at least $n_0$ vertices chosen in the sum model the probability of not being an $(\alpha, 4)$-expander is bounded by the constant $\frac{1}{10(k+1)^2}$. We now choose the graphs $G_1, \ldots, G_{k+1}$ that were promised in the statement of the proposition as random 2-regular graphs in the configuration model with more than $n_0$ vertices. Applying the union bound, we get the following bound on the sum graphs $G_i + G_j$:
$$\Pr[\exists i \ne j : G_i + G_j \text{ is not an } (\alpha, 4)\text{-expander}] \le (k+1)^2 \cdot \frac{1}{10(k+1)^2} = \frac{1}{10}.$$
So with probability at least $0.9$ all sums $G_i + G_j$ with $i, j \in [k + 1]$, $i \ne j$ are $(\alpha, 4)$-expanders. We assume in the remainder that this is the case for our graphs $G_i$. By definition, the degree bound of the $G_i$ is clear. We next show the third item of the claim. We first show that there is a $G_i$ such that for every $j \in [k]$ the induced bipartite graph $G_i[V^j_1, V^j_2]$ has at least $\alpha|V|/3$ edges. By way of contradiction, assume that this were not the case, so for every $i \in [k + 1]$ there is a $j \in [k]$ such that $G_i[V^j_1, V^j_2]$ has less than $\alpha|V|/3$ edges. Then there is a $j$ such that there are two graphs $G_{i_1}$, $G_{i_2}$ for which $G_{i_1}[V^j_1, V^j_2]$ and $G_{i_2}[V^j_1, V^j_2]$ both have at most $\alpha|V|/3$ edges, which contradicts $G_{i_1} + G_{i_2}$ being an $(\alpha, 4)$-expander. So there is a graph $G_i$ such that for every $j$ we have that $G_i[V^j_1, V^j_2]$ has at least $\alpha|V|/3$ edges. It only remains to greedily extract an induced matching from each $G_i[V^j_1, V^j_2]$, which due to the fact that $G_i$ has bounded degree is of size linear in $\alpha|V|/3$.
The only problem we still have to take care of is that the graphs $G_i$ might have parallel edges. Because of the degree bound, there are at most two parallel edges between each pair of vertices. Deleting one of them reduces the number of edges in the induced matchings by at most half, which completes the proof.
Theorem 15. For every constant $k \in \mathbb{N}$ and $n \in \mathbb{N}$ sufficiently big, there is a Boolean function $f_{n,k}$ in $n + k + 1$ variables such that $f_{n,k}$ can be computed by a rectangle decision list of length $O(k)$ with $k$ partition changes, but any rectangle decision list with $k - 1$ partition changes computing $f_{n,k}$ has length $2^{\Omega(n)}$.
Proof. Let $G_1, \ldots, G_{k+1}$ be graphs on a vertex set $X$ with the properties guaranteed by Proposition 14. The function we consider is
Reasoning as in Claim 9, we see that for all $i \in [k+1]$ there is a variable partition such that $\mathrm{IP}_{G_i}(X)$, and thus $y_i \wedge \mathrm{IP}_{G_i}(X)$, has a constant-length rectangle decision list. Concatenating these decision lists yields one of length $O(k)$ computing $f$. This proves the first claim.
For the second claim, consider any rectangle decision list with at most $k - 1$ partition changes computing $f$. Let $(X_1^1, X_2^1), \ldots, (X_1^k, X_2^k)$ be the partitions used. Then, by Proposition 14, there is an $i \in [k+1]$ such that for every $j \in [k]$ the graph $G_i[X_1^j, X_2^j]$ has an induced matching of size $cn$ for some constant $c$ only depending on $k$. Fixing the variables $y_1, \ldots, y_{k+1}$ in the right way, we get from the rectangle decision list for $f$ one for $\mathrm{IP}_{G_i}$ of the same length and with the same partitions. By Lemma 11 we get that $\mathrm{Disc}(\mathrm{IP}_{G_i}, (X_1^j, X_2^j)) \le 2^{-\Omega(n)}$ for every $j \in [k]$. Plugging this into Proposition 13, we get that the rectangle decision list for $\mathrm{IP}_{G_i}$, and thus that for $f$, must be of length $2^{\Omega(n)}$, which completes the proof.

Application to QBF Proof Complexity
In this section, we present a consequence of the results developed above for certain QBF proof systems. We consider an extension of the proof systems introduced in [19], which model the behavior of certain OBDD-based QBF-solvers, that allows changing variable orders in derivations. We show that in this setting there is a hierarchy for refutations similar to that of Theorem 15.

OBDD-Refutations with Reordering
In this section, we introduce the model that we will consider in the remainder of this paper. We work with the proof system introduced in [19] that uses OBDDs as lines in derivations as follows: let $\Phi = Q_1 x_1 \ldots Q_n x_n . C_1 \wedge \ldots \wedge C_m$ be a PCNF. A derivation of an OBDD $L_k$ from $\Phi$ is a sequence $L_1, \ldots, L_s$ of OBDDs such that for all $i \in [m]$ the OBDD $L_i$ is equivalent to the clause $C_i$ and for $i > m$ the OBDD $L_i$ is derived by one of the following rules:
1. conjunction ($\wedge$): $L_i$ represents $L_j \wedge L_{j'}$ for $j, j' < i$.
2. projection ($\exists$): $L_i$ represents $\exists x . L_j$ for some $x \in \mathrm{var}(L_j)$ and $j < i$.
Here, $L_j[u/c]$ denotes the OBDD obtained from $L_j$ by removing each node labeled with variable $u$ and rerouting all incoming edges to its neighbor along the $c$-labeled edge (effectively substituting $c$ for $u$).
In [19] it is explained how the above system corresponds to practical solvers like the QBDD-solver of [20] in the sense that lower bounds for the proof system give lower bounds for the runtime of the solver. In [19], it is assumed that the variable order of all OBDDs in a derivation is the same, which corresponds to the fact that the QBDD-solver uses a fixed variable order in each run. Since we want to model QBF-solvers that are allowed to change variable orders, we introduce a new rule:
5. reordering ($r$): $L_i$ is equivalent to a line $L_j$ where $j < i$ but has a different variable order.

We assume that, whenever applying the rules 1.-4., all OBDDs mentioned in those rules have the same variable order. As a consequence, whenever we want to change the variable order in a derivation, we have to do so by explicitly using rule 5. In the following, we assume only a bounded number of different variable orders are used in derivations. To this end, we make the following definition: Let $L_1, \ldots, L_s$ be a derivation. We say that it has $k$ variable order changes if there are $k$ indices $i \in [s-1]$ such that the variable order of $L_i$ is different from that of $L_{i+1}$. We denote by $r_{\le k}$ the reordering rule from above restricted to the case in which there are at most $k$ variable order changes allowed in a derivation.
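The definition of variable order changes can be illustrated with a minimal sketch. It assumes that each line of a derivation is summarized only by its variable order, given as a tuple of variable names; the function name `count_order_changes` and the example orders are hypothetical and chosen for illustration.

```python
from typing import Sequence, Tuple

def count_order_changes(orders: Sequence[Tuple[str, ...]]) -> int:
    """Count the indices i such that the order of line i differs from that of line i+1."""
    return sum(1 for a, b in zip(orders, orders[1:]) if a != b)

# Hypothetical derivation with four lines: two lines under the order (x, y, z),
# then one reordering step, then one more line under the new order (z, x, y).
orders = [("x", "y", "z"), ("x", "y", "z"), ("z", "x", "y"), ("z", "x", "y")]
print(count_order_changes(orders))  # prints 1
```

A derivation summarized by `orders` would thus be admissible under the rule $r_{\le k}$ for every $k \ge 1$.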
The size of a derivation $L_1, \ldots, L_s$ is $\sum_{i \in [s]} |L_i|$, i.e., the sum of the sizes of all occurring OBDDs. The derivation is called a refutation if $L_s \equiv 0$. For every subset of rules $S \subseteq \{\wedge, \exists, \forall, \models, r, r_{\le k}\}$, the proof system OBDD($S$) is the restriction of the proof system from above to the rules in $S$. Of particular interest in [19] were OBDD($\wedge, \exists, \forall$) and OBDD($\wedge, \exists, \forall, \models$): the former is a formalization of the practical solver QBDD [20] while the latter is the strengthening that allows "free" reasoning in NP, so that any lower bound for it reflects genuine hardness caused by quantification. We here will study the fragments OBDD($\wedge, \exists, \forall, r_{\le k}$) and OBDD($\wedge, \exists, \forall, \models, r_{\le k}$) which add up to $k$ variable order changes in derivations. Note that it was shown in [19] that the systems OBDD($\wedge, \exists, \forall$) and OBDD($\wedge, \exists, \forall, \models$) are sound, so in particular only false PCNF allow refutations, and it is easy to see that this remains true when allowing changes of variable orders.

Statement of the Hierarchy for OBDD-Refutations and the Separating Functions
The main aim of this section is the following hierarchy with respect to the number of variable order changes. We remark that in Theorem 16 the lower bound is for the stronger model with entailment ($\models$) while the upper bound does not use it. To prove Theorem 16, we again consider the graphs $G_1, \ldots, G_{k+1}$ from Proposition 14. We here use them to define a separating function for OBDD-refutations with an increasing number of variable order changes. Remember that $X$ is the underlying vertex set of all the $G_i$ and thus the variable set of the $\mathrm{IP}_{G_i}$. Let $y_1, \ldots, y_{k+1}, z$ be additional variables not appearing in $X$. We use the following observation as a building block.
Observation 17. There is a constant $w$ such that for every $i \in [k+1]$, there is a CNF-encoding $\varphi_i$ of $y_i \vee (\mathrm{IP}_{G_i} = z)$ with auxiliary variables and an order $\pi_i$ of the variables of $\varphi_i$ such that every sub-formula of $\varphi_i$ has an OBDD-representation of width at most $w$ and with order $\pi_i$.
The proof of Observation 17 is not hard but somewhat tedious, so we defer it to Appendix B.
For every $i \in [k+1]$, let $Z_i$ denote the set of auxiliary variables used in the construction of $\varphi_i$ and let $Z := Z_1 \cup \ldots \cup Z_{k+1}$. Assume that for all $i, i' \in [k+1]$ with $i \neq i'$, the sets $Z_i$ and $Z_{i'}$ are disjoint. We now define the matrix of the PCNF we want to construct as

A Proof of Lemma 11
When fixing all variables not incident to any edge $e_i$ according to a partial assignment $\tau$, we get a function $f_\tau$ in the variables $\{x_1, \ldots, x_m\} \cup \{y_1, \ldots, y_m\}$ where we assume for every $i \in [m]$ that $e_i = x_i y_i$. As in the proof of Lemma 3 in [19], $f_\tau$ is essentially $\mathrm{IP}_m$ up to flipping the output and the sign of some inputs. Let $R$ be the rectangle with partition $(X_m, Y_m)$ that maximizes $|\mathrm{IP}_m^{-1}(1) \cap R| - |\mathrm{IP}_m^{-1}(0) \cap R|$. Then it follows that for every $\tau$ and every rectangle $R$ respecting $\Pi$ we have where we use in the last line that the discrepancy of $\mathrm{IP}_m$ is known to be $2^{-m/2}$.

B Proof of Observation 17
Remember that we want to encode $y_i \vee (\mathrm{IP}_{G_i} = z)$, where $G_i = (X, E)$ is a graph such that all vertices have degree at most 2 and $\mathrm{IP}_{G_i}(X) = \bigoplus_{xy \in E} x \wedge y$.
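As a small illustration of the function being encoded, the following sketch evaluates $\mathrm{IP}_G$ on an assignment, assuming that $\mathrm{IP}_G$ is the parity of the number of edges whose two endpoints are both set to 1. The names `ip_graph` and `path_edges` are hypothetical.

```python
from typing import Dict, Iterable, Tuple

def ip_graph(edges: Iterable[Tuple[str, str]], tau: Dict[str, int]) -> int:
    """Parity of the number of edges xy with tau[x] = tau[y] = 1."""
    value = 0
    for x, y in edges:
        value ^= tau[x] & tau[y]
    return value

# Hypothetical example: the path a - b - c, in which every vertex has degree at most 2.
path_edges = [("a", "b"), ("b", "c")]
print(ip_graph(path_edges, {"a": 1, "b": 1, "c": 0}))  # one edge has both endpoints set: prints 1
print(ip_graph(path_edges, {"a": 1, "b": 1, "c": 1}))  # two edges have both endpoints set: prints 0
```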
In our construction, it will be helpful to use the notion of pathwidth of a graph. So consider a graph $G = (V, E)$. A path decomposition of $G$ is defined to be a sequence $B_1, \ldots, B_s$ of so-called bags where $B_i \subseteq V$ for all $i \in [s]$ with the following properties: for every $v \in V$ there is an $i \in [s]$ such that $v \in B_i$, for every $e = uv \in E$ there is an $i \in [s]$ such that $\{u, v\} \subseteq B_i$, and for every $i, j \in [s]$ with $i \le j$, if $v \in B_i \cap B_j$ then for all $k$ with $i \le k \le j$ we have $v \in B_k$. The width of a path decomposition is defined as $\max_{i \in [s]} (|B_i| - 1)$. The pathwidth of $G$ is the minimal width of a path decomposition of $G$.
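The three conditions of a path decomposition can be checked mechanically. The following is a minimal sketch under the definition above; the function names and the 4-cycle example are hypothetical and not taken from the paper.

```python
from typing import Hashable, Iterable, List, Sequence, Set, Tuple

def is_path_decomposition(
    vertices: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
    bags: Sequence[Iterable[Hashable]],
) -> bool:
    """Check the three defining properties of a path decomposition."""
    vertex_set: Set[Hashable] = set(vertices)
    bag_sets: List[Set[Hashable]] = [set(b) for b in bags]
    # Every bag must be a subset of the vertex set.
    if any(not b <= vertex_set for b in bag_sets):
        return False
    # Every edge must be contained in some bag.
    if any(all(not {u, v} <= b for b in bag_sets) for u, v in edges):
        return False
    for v in vertex_set:
        idx = [i for i, b in enumerate(bag_sets) if v in b]
        # Every vertex must appear in some bag ...
        if not idx:
            return False
        # ... and the bags containing it must be consecutive.
        if idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True

def width(bags: Sequence[Iterable[Hashable]]) -> int:
    """Width of a decomposition: maximal bag size minus one."""
    return max(len(set(b)) for b in bags) - 1

# Hypothetical example: the 4-cycle a - b - c - d - a.
cycle_vertices = ["a", "b", "c", "d"]
cycle_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
good_bags = [{"a", "b", "c"}, {"a", "c", "d"}]
bad_bags = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "a"}]  # "a" occurs in non-consecutive bags

print(is_path_decomposition(cycle_vertices, cycle_edges, good_bags), width(good_bags))  # True 2
print(is_path_decomposition(cycle_vertices, cycle_edges, bad_bags))  # False
```

The first decomposition has width 2, in line with the fact that cycles have pathwidth at most 2.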
We will first encode $\mathrm{IP}_{G_i}$ as follows: because of the degree bound, we know that $G_i$ consists of a union of paths and cycles. It follows that $G_i$ has a path decomposition $B_1, \ldots, B_s$ of width 2. Let $V_r := \bigcup_{i \in [r]} B_i$ and let $G_i^r$ be the graph $G_i[V_r]$ induced by $V_r$ in $G_i$. We inductively construct a CNF-formula $F_r$ with auxiliary variables $z_1, \ldots, z_r$ and of size $O(r)$ such that for every assignment $\tau$ to $X$, there is exactly one extension of $\tau$ to a model $\tau'$ of $F_r$ and $\tau'(z_r)$ is the value of $\mathrm{IP}_{G_i^r}(\tau)$. For $r = 1$, this is easy to do, since the number of variables and edges is constant. For the induction step, observe that the additional clauses for $F_r$ only need to contain $z_{r-1}$, $z_r$ and the variables in $B_{r-1}$ and $B_r$. In particular, it follows that there is a linear size CNF-encoding $F$ of $\mathrm{IP}_{G_i}$ whose models compute the value of that function in a variable $z_n$.
Define the primal graph of a CNF-formula $F$ to be the graph whose vertices are the variables in $F$ and whose edge set contains an edge $xy$ if and only if $x$ and $y$ appear in a common clause of $F$. Define for every $i \in [s-1]$ new bags $B'_i := B_i \cup B_{i+1} \cup \{z_i, z_{i+1}\}$. Then $B'_1, \ldots, B'_{s-1}$ is a path decomposition of width at most 7 of the primal graph of $F$.
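The primal graph is straightforward to compute from a clause list. The sketch below assumes DIMACS-style clauses, i.e., each clause is a list of nonzero integers whose absolute values name the variables; this input format and the name `primal_graph` are assumptions made for illustration.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple

def primal_graph(clauses: Iterable[Iterable[int]]) -> Tuple[Set[int], Set[Tuple[int, int]]]:
    """Return (vertices, edges); edges are pairs (x, y) with x < y sharing a clause."""
    vertices: Set[int] = set()
    edges: Set[Tuple[int, int]] = set()
    for clause in clauses:
        clause_vars = sorted({abs(lit) for lit in clause})
        vertices.update(clause_vars)
        edges.update(combinations(clause_vars, 2))
    return vertices, edges

# Hypothetical example: the three clauses encoding z <-> (x AND y) with x = 1, y = 2, z = 3.
and_gate = [[-3, 1], [-3, 2], [3, -1, -2]]
print(primal_graph(and_gate))
```

For this example the primal graph is a triangle on the variables 1, 2, 3, so a single bag containing all three variables is a path decomposition of width 2.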
Next, add two clauses that force $z = z_n$. Finally, add $y_i$ to all clauses of the resulting formula. Call the result $\varphi_i$. Note that $\varphi_i$ encodes $y_i \vee (\mathrm{IP}_{G_i} = z)$ as required. Moreover, since adding two variables increases the pathwidth by at most two, the primal graph of $\varphi_i$ has pathwidth at most 9. Now using the fact that taking subgraphs does not increase the pathwidth of a graph and that all CNF-formulas whose primal graph has pathwidth $k$ have OBDD-representations of width $O(2^k)$, see [11], completes the proof.

C Strategy extraction
In this section, we show how to perform strategy extraction for OBDD-proof systems with changes of variable orders, proving Proposition 21. Instead of directly extracting rectangle decision lists, we make an intermediate step using OBDD-decision lists, which we define next.
Definition 23. An OBDD-decision list of length $s$ is a sequence $(L_1, c_1), \ldots, (L_s, c_s)$ where the $c_i \in \{0, 1\}$ are truth values and the $L_i$ are OBDDs, and $L_s$ computes the constant function 1. Let $V$ be the set of variables occurring in the OBDDs $L_i$. The decision list computes a function $f : \{0, 1\}^V \to \{0, 1\}$ as follows. Given an assignment $\tau : V \to \{0, 1\}$, let $i = \min\{1 \le j \le s \mid L_j(\tau) = 1\}$. Then we have $f(\tau) = c_i$. The width of the OBDD-decision list is the maximal width of the $L_i$. We say that there are $k$ variable order changes in the decision list if there are $k$ indices $i \in [s-1]$ for which the variable order of $L_i$ and $L_{i+1}$ differ.
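The evaluation rule of Definition 23 can be sketched as follows. As a simplifying assumption, each condition $L_i$ is modeled by an arbitrary Python predicate on assignments rather than by an actual OBDD, so the sketch only illustrates the first-match semantics; all names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, int]
Line = Tuple[Callable[[Assignment], int], int]  # (condition L_i, constant c_i)

def evaluate(decision_list: List[Line], tau: Assignment) -> int:
    """Return c_i for the smallest i whose condition evaluates to 1 on tau."""
    for condition, value in decision_list:
        if condition(tau) == 1:
            return value
    # Cannot happen if the last condition is the constant 1 function.
    raise ValueError("the last condition must compute the constant function 1")

# Hypothetical decision list of length 3 over the variables x and y.
example_list: List[Line] = [
    (lambda t: t["x"] & t["y"], 1),  # if x AND y then 1
    (lambda t: t["x"], 0),           # if x then 0
    (lambda t: 1, 1),                # default line: constant 1 condition
]

print(evaluate(example_list, {"x": 1, "y": 1}))  # first line fires: prints 1
print(evaluate(example_list, {"x": 1, "y": 0}))  # second line fires: prints 0
print(evaluate(example_list, {"x": 0, "y": 1}))  # default line fires: prints 1
```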
The next result states that OBDD-decision lists can be efficiently extracted from OBDD-refutations.
Proposition 24. There is a linear-time algorithm that takes an OBDD($\wedge, \exists, \forall, \models, r$)-refutation of length $s$ of a PCNF formula $\Phi$ and outputs a family of OBDD-decision lists of

Lemma 5. There is a polynomial time algorithm that, given an OBDD $B$, computes an equivalent complete OBDD $B'$. Moreover, $|B'| \le (|X| + 1)|B|$.
The algorithm for Observation 4 can e.g. be found in the proof of [25, Lemma 6.2.2]. OBDDs can be combined by arbitrary binary Boolean functions efficiently, see e.g. [25, Chapter 3]: Let $f : \{0, 1\}^2 \to \{0, 1\}$ be a binary Boolean function. Then there is an algorithm that, given two OBDDs $B_1$ and $B_2$ with order $\pi$ over the variable set $X$, computes in time polynomial in $|B_1| + |B_2|$ an OBDD $B$ with order $\pi$ such that $B$ computes on every assignment $\tau : X \to \{0, 1\}$ the value B(τ

Then a random $d$-regular multi-graph in the configuration model is chosen by first considering the set $W := V \times [d]$ and choosing a random matching of $W$ which we call the edges of the configuration.
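The sampling step can be sketched as follows. The sketch assumes that the chosen matching is a uniformly random perfect matching of $W$, that each matched pair $((u, a), (v, b))$ is turned into an edge $uv$ of the multi-graph, and that the sum of two multi-graphs on the same vertex set is the union of their edge multisets; the function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Edge = Tuple[int, int]

def configuration_model(n: int, d: int, rng: random.Random) -> List[Edge]:
    """Sample a d-regular multi-graph on vertices 0..n-1 (loops and parallel edges possible)."""
    if (n * d) % 2 != 0:
        raise ValueError("n * d must be even for a perfect matching of W to exist")
    # W = V x [d]: d copies ("points") of every vertex.
    points = [(v, a) for v in range(n) for a in range(d)]
    # Shuffling and pairing consecutive points yields a uniform perfect matching of W.
    rng.shuffle(points)
    return [(points[t][0], points[t + 1][0]) for t in range(0, len(points), 2)]

def degrees(n: int, edges: List[Edge]) -> List[int]:
    """Degree of every vertex, counting a loop twice."""
    deg: Counter = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[v] for v in range(n)]

rng = random.Random(0)
g1 = configuration_model(8, 2, rng)
g2 = configuration_model(8, 2, rng)
g_sum = g1 + g2  # assumed sum model: union of the two edge multisets
print(degrees(8, g_sum))  # every vertex has degree 4
```

Since every point of $W$ is matched exactly once, each sampled graph is 2-regular as a multi-graph, and the sum is 4-regular regardless of the random choices.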