On Redundancy in Constraint Satisfaction Problems

A constraint language Γ has non-redundancy f(n) if every instance of CSP(Γ) with n variables contains at most f(n) non-redundant constraints. If Γ has maximum arity r then it has non-redundancy O(n^r), but there are notable examples for which this upper bound is far from the best possible. In general, the non-redundancy of constraint languages is poorly understood and little is known beyond the trivial bounds Ω(n) and O(n^r). In this paper, we introduce an elementary algebraic framework dedicated to the analysis of the non-redundancy of constraint languages. This framework relates redundancy-preserving reductions between constraint languages to closure operators known as pattern partial polymorphisms, which can be interpreted as generic mechanisms to generate redundant constraints in CSP instances. We illustrate the power of this framework by deriving a simple characterisation of all languages of arity r having non-redundancy Θ(n^r).


Introduction
The constraint satisfaction problem (CSP) is a fundamental computer science problem with many applications in artificial intelligence and operational research. An instance of the CSP is a set of variables, a set of domain values, and a set of constraints, which are relations imposed upon certain sequences of variables. The goal is to decide whether it is possible to assign domain values to variables in such a way that all constraints are satisfied. The CSP is a natural common framework for a wide variety of well-studied combinatorial problems, such as satisfiability and graph homomorphism, and is in general intractable.
Following early work of Schaefer on the Boolean domain [25], Feder and Vardi initiated the systematic study of CSPs with fixed constraint languages and famously conjectured that all these "non-uniform" CSPs are either polynomial-time solvable or NP-complete [14]. This conjecture prompted a considerable research effort aimed at identifying generic sufficient conditions for the tractability of non-uniform CSPs, which eventually coalesced into a powerful, unified algebraic framework for analysing and classifying the complexity of constraint languages [2,4]. After more than two decades of research, the Feder-Vardi conjecture was finally settled in the affirmative with two independent proofs by Bulatov [8] and Zhuk [26].
The success and flexibility of the algebraic framework motivated the study of constraint languages from a broader perspective. Beyond the classical "P versus NP-complete" question, classifications of constraint languages have been obtained for a wide variety of properties,
In this paper we will study non-uniform CSPs from a different perspective. The central question we ask is the following: given a finite constraint language Γ of arity r, what is the maximum number of non-redundant constraints in a CSP instance over Γ? If we denote by n the number of variables, then this quantity (which we call the non-redundancy of Γ) is O(n^r), and if Γ is non-trivial (i.e. at least one relation is neither empty nor complete) then it is Ω(n). As extreme examples, a set of affine relations over a finite field has non-redundancy Θ(n), while sets of r-clauses are easily seen to have non-redundancy Θ(n^r). Curiously, very little is known beyond these trivial bounds, especially outside the Boolean domain. The purpose of this paper is to describe an elementary algebraic framework for classifying the non-redundancy of constraint languages, which we illustrate by deriving a simple combinatorial characterisation of r-ary constraint languages with non-redundancy Θ(n^r).
We draw motivation for studying non-redundancy from two different lines of work. The task of learning a constraint network from answers to queries (sometimes called constraint acquisition) has attracted considerable interest in the past decades [7,10], and a significant effort has been devoted to designing systems that can learn CSPs with as few queries as possible. In this context, it was observed in [5,6] that the non-redundancy of a language Γ corresponds exactly to its VC-dimension, which is a lower bound on the number of yes/no queries (of any kind) that is necessary in order to learn exactly a constraint network over Γ. Therefore, any progress on lower bounds for non-redundancy immediately translates into unconditional, universal lower bounds for constraint acquisition. More generally, for applications where non-uniform CSPs are used to represent knowledge, the non-redundancy of a constraint language is a good estimate of its representational power: if Γ has non-redundancy f(n) and arity r, then the number of n-variable CSP instances over Γ with pairwise distinct solution sets is Ω(2^{f(n)}) and O(2^{f(n) r log n}).
Our second motivation comes from a series of recent results on the sparsification of non-uniform Boolean CSPs [9,20]. In these papers, the goal is to determine whether there exists a polynomial-time algorithm that takes as input an instance of CSP(Γ) (with up to roughly n^r constraints if Γ has arity r) and outputs an equisatisfiable instance of size q(n) with q(n) = o(n^r). On the surface, this question looks quite different from estimating the non-redundancy of Γ: sparsification is in essence an algorithmic question, and sparsification algorithms are not limited to removing redundant constraints because they only have to maintain equisatisfiability. Nevertheless, all sparsification algorithms for NP-hard Boolean CSPs presented in [9,20] operate purely by removing redundant constraints, and to the best of our knowledge all CSPs whose non-redundancy is known to be O(n^q) also have an O(n^q) sparsification algorithm. While non-redundancy and sparsifiability cannot be equivalent in general (for instance, all polynomial-time non-uniform CSPs have a sparsification algorithm that outputs an instance of size O(1)), this suggests that an improved understanding of non-redundancy in constraint languages would help design sparsification algorithms.

Our results
Our first contribution is a generic algebraic framework for the asymptotic study of non-redundancy in non-uniform CSPs. More precisely, we establish a tight connection between redundancy-preserving reductions for constraint languages and pattern partial polymorphisms, a type of closure operator that was recently introduced in the context of exponential algorithms for certain classes of non-uniform Boolean CSPs [21]. A key property of this algebraic duality is that both sides are easily interpretable in terms of non-redundancy. We observe that each pattern partial polymorphism of a constraint language Γ describes a rule to identify (or produce) redundant constraints in CSP instances over Γ. In some cases, knowledge of a single non-trivial pattern partial polymorphism of Γ can be sufficient to establish an improved upper bound on its non-redundancy.
Then, we combine our framework with a theorem of Erdős on the maximum cardinality of K^r_2-free hypergraphs [13] to obtain an explicit characterisation of those constraint languages of arity r having non-redundancy Θ(n^r). Incidentally, we show the existence of a small gap: either a constraint language of arity r has non-redundancy Θ(n^r), or it has non-redundancy O(n^{r−ϵ}) for ϵ = 2^{1−r}. This (improperly) extends a result of Chen et al. [9] for Boolean languages, which was obtained using very different methods. Beyond non-redundancy, our main result has direct consequences for sparsification, which will be discussed towards the end of the paper.

Related work
A recent series of papers on the sparsification of Boolean languages has established a number of results on the non-redundancy of constraint languages as byproducts. In [9], Chen et al. show that every Boolean language of arity r that does not contain an r-clause can be expressed using multivariate polynomials of total degree at most r − 1. Coupled with elementary arguments on Boolean clauses (see e.g. the proof of Lemma 15 in Section 3), this implies that the non-redundancy of any Boolean constraint language of arity at most r is either Θ(n^r) or O(n^{r−1}). Other results in the same paper imply a non-redundancy classification for Boolean constraint languages of arity at most 3, and a characterisation of symmetric Boolean constraint languages with linear non-redundancy. The framework presented in our paper is inspired by their methods, although it is extended to work with arbitrary domains and adapted to study specifically the non-redundancy of constraint languages.
Building upon these results, Lagerkvist and Wahlström [20] devised an O(n) sparsification algorithm for the class of languages with a Mal'tsev embedding, which generalises linear equations over finite fields. Their algorithm operates by removing redundant constraints, and hence implies a similar bound on the non-redundancy of these languages. To the best of our knowledge, all languages known to have non-redundancy O(n) belong to this class. The same paper also provides a sufficient condition for having non-redundancy O(n^q), q > 1, based on the closely related notion of k-edge embedding.
Bessiere et al. [5] initiated the direct study of non-redundancy of constraint languages, with a focus on applications in machine learning. They established the equivalence between non-redundancy and VC-dimension, classified the non-redundancy of constraint languages of arity at most 2, and identified a class of ternary constraint languages whose non-redundancy is o(n^2) and cannot be fully determined using results based on algebraic embeddings.

A (finite) constraint language Γ is a finite set of relations over a finite domain D, and the arity of a constraint language Γ is defined as the maximum arity of its relations. Given a constraint language Γ, a CSP instance over Γ is a pair (X, C), where X is a finite set of variables and C is a finite set of constraints, that is, pairs (R, S) with R ∈ Γ and S ∈ X^{ar(R)}. A solution to a CSP instance (X, C) is a mapping ϕ : X → D such that for every (R, S) ∈ C, we have ϕ(S) ∈ R. We will denote the set of all solutions to a CSP instance I by sol(I). The constraint satisfaction problem over Γ, denoted by CSP(Γ), takes as input a CSP instance I over Γ and asks whether sol(I) is non-empty.
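As a rough illustration of these definitions, the following Python sketch enumerates sol(I) by brute force. The function name `solutions` and the disequality relation `NEQ` are illustrative choices rather than notation from the paper, and the exhaustive search is only practical for very small instances.

```python
from itertools import product

def solutions(variables, domain, constraints):
    """Enumerate sol(I) for the instance (X, C) by exhaustive search.

    constraints is a list of pairs (R, S): R is a set of tuples over the
    domain and S is a tuple of variables of the same length (the scope).
    """
    sols = []
    for values in product(domain, repeat=len(variables)):
        phi = dict(zip(variables, values))
        # phi is a solution if phi(S) belongs to R for every constraint (R, S)
        if all(tuple(phi[x] for x in scope) in rel for rel, scope in constraints):
            sols.append(phi)
    return sols

# Hypothetical example: Boolean disequality constraints
NEQ = {(0, 1), (1, 0)}
X = ["x", "y", "z"]
path = [(NEQ, ("x", "y")), (NEQ, ("y", "z"))]
triangle = path + [(NEQ, ("x", "z"))]

print(len(solutions(X, [0, 1], path)))      # x != y and y != z
print(len(solutions(X, [0, 1], triangle)))  # additionally x != z
```

In this toy example CSP({NEQ}) answers "yes" on the path instance and "no" on the triangle, since sol(I) is non-empty only in the first case.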

Primitive-positive definitions and polymorphisms
Given a constraint language Γ, a relation R of arity r is primitive-positive definable (pp-definable) over Γ if there exists a first-order formula ψ with r free variables x_1, ..., x_r that only uses existential quantification, conjunction, equality, and relations from Γ such that R is exactly the set of tuples (d_1, ..., d_r) ∈ D^r for which ψ(d_1, ..., d_r) holds. In that case, we will often write R(x_1, ..., x_r) ≡ ψ.
If ψ is quantifier-free, then R is qfpp-definable over Γ. We denote by ⟨Γ⟩ (resp. ⟨Γ⟩_∄) the set of all relations that are pp-definable (resp. qfpp-definable) from Γ. It is well-known that CSP(Γ′) is log-space reducible to CSP(Γ) for all Γ′ ⊆ ⟨Γ⟩ [17]. If in addition we have [...] )) belongs to R. By extension, an operation is a partial polymorphism of a language if it is a partial polymorphism of each of its relations. A polymorphism of a relation over D is a partial polymorphism f with D_f = D^k. Given a language Γ, we denote by pol(Γ) the set of polymorphisms of Γ.
Geiger's theorem [15] states that for any two languages Γ, Γ′ over the same domain, we have Γ′ ⊆ ⟨Γ⟩ if and only if pol(Γ) ⊆ pol(Γ′). A similar duality was observed between qfpp-definability and partial polymorphisms by Romov [24]. These results form the foundation of the algebraic approach to non-uniform CSPs, in which the complexity of constraint languages is studied through the lens of their (partial) polymorphisms. We refer the reader to recent surveys for a more in-depth treatment of the subject [4,11].

Redundancy
In a CSP instance (X, C), a constraint c ∈ C is non-redundant if and only if (X, C) and (X, C\{c}) have different solution sets. Given a constraint language Γ, the non-redundancy of Γ, denoted by NRD_Γ, is the function that maps each n ∈ N to the maximum number of non-redundant constraints in an instance of CSP(Γ) with n variables. It is easily seen that if Γ is a constraint language of arity r that does not contain only empty or complete relations, then NRD_Γ(n) = O(n^r) and NRD_Γ(n) = Ω(n). It is also known that the asymptotic behaviour of the NRD_Γ function for a finite language Γ is governed by that of its individual relations, as witnessed by these two inequalities:

NRD_Γ(n) ≤ Σ_{R ∈ Γ} NRD_{{R}}(n)   and   NRD_{{R}}(n) ≤ NRD_Γ(n) for all R ∈ Γ.

The second inequality holds because each instance over {R} is also over Γ, and the first holds because the property of being non-redundant is monotone. (If c = (R, S) is non-redundant in I, then it is non-redundant in the subinstance of I consisting only of those constraints with relation R. Repeating this reasoning with all R ∈ Γ provides the desired upper bound.) Formal proofs can be found in [5]. In this paper we are only interested in the asymptotic behaviour of the NRD_Γ function; it follows from the inequalities above that classifying single-relation languages is sufficient to deduce a classification for all finite constraint languages.
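The definition of a non-redundant constraint can be checked directly on small instances. The sketch below is a brute-force illustration only: the helper names (`solution_set`, `non_redundant`) and the two toy instances are assumptions made for this example. The first instance places a binary clause on every pair from two groups of variables, in the spirit of the r-clause lower bound mentioned in the introduction; the second is a triangle of equalities.

```python
from itertools import product

def solution_set(variables, domain, constraints):
    """Set of solutions, each encoded as a tuple of values in variable order."""
    sols = set()
    for values in product(domain, repeat=len(variables)):
        phi = dict(zip(variables, values))
        if all(tuple(phi[x] for x in scope) in rel for rel, scope in constraints):
            sols.add(values)
    return sols

def non_redundant(variables, domain, constraints):
    """Constraints c such that (X, C) and (X, C \\ {c}) have different solution sets."""
    full = solution_set(variables, domain, constraints)
    result = []
    for i, c in enumerate(constraints):
        rest = constraints[:i] + constraints[i + 1:]
        if solution_set(variables, domain, rest) != full:
            result.append(c)
    return result

# Binary clause (x or y) on all pairs (a_i, b_j): setting a_i = b_j = 0 and
# every other variable to 1 violates exactly one constraint.
OR2 = frozenset({(0, 1), (1, 0), (1, 1)})
clause_vars = ["a1", "a2", "b1", "b2"]
clause_instance = [(OR2, (a, b)) for a in ("a1", "a2") for b in ("b1", "b2")]

# Triangle of equalities: each constraint is implied by the other two.
EQ = frozenset({(0, 0), (1, 1)})
eq_vars = ["x", "y", "z"]
eq_instance = [(EQ, ("x", "y")), (EQ, ("y", "z")), (EQ, ("x", "z"))]

print(len(non_redundant(clause_vars, [0, 1], clause_instance)))
print(len(non_redundant(eq_vars, [0, 1], eq_instance)))
```

All four clause constraints are non-redundant, whereas none of the three equality constraints is, which matches the contrast between clauses and affine relations drawn earlier.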

Redundancy-preserving reductions
It is easily observed that primitive-positive definability does not preserve non-redundancy in general, in the sense that two constraint languages Γ_1 and Γ_2 with Γ_1 ⊆ ⟨Γ_2⟩ and Γ_2 ⊆ ⟨Γ_1⟩ may have very different non-redundancy asymptotics. (An extreme example is Γ_1 = {(0, 0, 1), (0, 1, 0), (1, 0, 0)} and Γ_2 being the set of all ternary Boolean clauses. By the results of [9], NRD_{Γ_1}(n) = Θ(n) and NRD_{Γ_2}(n) = Θ(n^3). The pp-interdefinability of these languages is well known and can be verified by inspecting Post's lattice [23].) On the other hand, qfpp-definitions do preserve non-redundancy, but have limited expressive power. In this section, we attempt to construct an ideal notion of definability tailored for non-redundancy, with three goals in mind: the corresponding reductions between constraint languages must preserve non-redundancy bounds, a useful algebraic duality must exist, and the framework should be as general as possible.
We start by presenting our proposed notion of definability.
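▶ Definition 1. Let D be a set and Γ be a constraint language over D. We say that a relation R of arity r has an fgpp-definition over Γ if R has a pp-definition R(x_1, ..., x_r) ≡ ∃y_1, ..., y_q : ψ(x_1, ..., x_r, y_1, ..., y_q) over Γ ∪ {Q_g | g : D → D}, where Q_g = {(d, g(d)) | d ∈ D} denotes the graph of g, and for each existentially quantified variable y_i there exists some x_j such that Q_g(x_j, y_i) is an atom in ψ.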
In Definition 1, "fgpp-definition" stands for functionally guarded pp-definition. Note that qfpp-definability implies fgpp-definability, but that fgpp-definability does not imply pp-definability in general. (This is due to the functional atoms Q_g, which may not belong to Γ.) On the Boolean domain, fgpp-definitions are equivalent to the cone-definitions of Chen et al. [9].
Given a constraint language Γ over D, let ⟨Γ⟩_fg denote the set of relations over D that are fgpp-definable over Γ. The next proposition is the first step towards proving that fgpp-definitions are suitable for studying the NRD function.
▶ Proposition 2. Let Γ_1 and Γ_2 be two non-trivial languages over the same finite domain D. If Γ_2 ⊆ ⟨Γ_1⟩_fg, then NRD_{Γ_2}(n) = O(NRD_{Γ_1}(n)).
Proof. Let I be an instance of CSP(Γ_2) with variable set X, |X| = n, and exactly NRD_{Γ_2}(n) non-redundant constraints. Without loss of generality, we assume that no constraint in I is redundant.
Let R ∈ Γ_2 be some relation and R(x_1, ..., x_r) ≡ ∃y_1, ..., y_q : ψ(x_1, ..., x_r, y_1, ..., y_q) be an fgpp-definition of R over Γ_1. For each constraint c_i = (R, (x^i_1, ..., x^i_r)) in I, we introduce a set Y_i of q fresh variables y^i_1, ..., y^i_q and replace c_i with the set S_i of constraints given by the atoms of ψ(x^i_1, ..., x^i_r, y^i_1, ..., y^i_q). Repeating this process for all R ∈ Γ_2 and constraint c_i yields a CSP instance I* over Γ_1 ∪ {Q_g | g : D → D} whose solution set, when projected onto X, is exactly sol(I).

By construction, for each y ∈ Y = ∪_i Y_i there exist g : D → D and x ∈ X such that for all ϕ ∈ sol(I*), we have ϕ(y) = g(ϕ(x)). In particular, if there exist y_1, y_2 ∈ Y, x ∈ X and g : D → D such that y_1 = g(x) and y_2 = g(x) then we have ϕ(y_1) = ϕ(y_2) for all ϕ ∈ sol(I*). It follows that y_1 and y_2 can be merged into a single variable without changing the number of non-redundant constraints in I*. After exhaustive application of this rule, each remaining variable of Y is associated with a distinct pair (x, g), so we have |Y| ≤ |D|^{|D|} · n = O(n).
Now, we greedily remove redundant constraints from I* until all constraints are non-redundant. Observe that this process cannot remove all constraints from a set S_i, for any i. Indeed, by assumption, for each constraint c_i in I there exists an assignment ϕ : X → D that only violates c_i in I. This assignment can be extended to an assignment ϕ* : X ∪ Y → D that is not a solution to I* and may only violate constraints in S_i. Therefore, removing all of S_i would increase the solution set of I*, which cannot happen since only redundant constraints are removed.
In addition, the language {Q_g | g : [...] which implies that R ∈ ⟨{R_lin}⟩_fg. From Proposition 2 and the fact that linear equations over finite fields have linear non-redundancy, we deduce that {R} has non-redundancy O(n).
▶ Example 4. Following [20], a language Γ_1 over a non-empty domain D_1 has an embedding over a language Γ_2 over domain D_2 ⊇ D_1 if there exists a bijective function h : Γ_1 → Γ_2 such that for all R ∈ Γ_1, ar(R) = ar(h(R)) and R = h(R) ∩ D_1^{ar(R)}. If we interpret both Γ_1 and Γ_2 as languages over D_2 and define g : D_2 → D_2 by g(d) = d if d ∈ D_1 and g(d) = d*_1 otherwise (where d*_1 is an arbitrary value in D_1), then each R ∈ Γ_1 can be written as [...] and hence Γ_1 ⊆ ⟨Γ_2⟩_fg. Therefore, by Proposition 2, embeddings preserve the non-redundancy asymptotics of constraint languages.
We will establish an algebraic duality for fgpp-definitions based on pattern partial polymorphisms, which were introduced by Lagerkvist and Wahlström [21] in a different context (the study of exponential algorithms for sign-symmetric Boolean languages).
A polymorphism pattern of arity k is a set of pairs (t, x), where t is a sequence of variables of length k and x occurs in t. A k-ary partial operation f : D_f → D satisfies a k-ary polymorphism pattern P if D_f = {(ϕ(x_1), ..., ϕ(x_k)) | ((x_1, ..., x_k), x) ∈ P, ϕ : {x_1, ..., x_k} → D} and f(ϕ(x_1), ..., ϕ(x_k)) = ϕ(x) for all ((x_1, ..., x_k), x) ∈ P, ϕ : {x_1, ..., x_k} → D. It follows from the definition that for any pattern P and finite set D, there is at most one partial operation on D that satisfies P. We denote this function by f^D_P and call it the interpretation of P on D.
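To make the notion of interpretation concrete, here is a minimal Python sketch that builds f^D_P as a dictionary from a pattern P and a finite set D. The function name `interpret` and the encoding of patterns as lists of (tuple of variable names, variable name) pairs are assumptions of this example. A pattern that forces two different outputs on the same input has no interpretation, which the sketch reports with an exception.

```python
from itertools import product

def interpret(pattern, domain):
    """Return the interpretation of a pattern on a domain as a dict.

    pattern: iterable of pairs (t, x), where t is a tuple of variable names
    and x is a variable occurring in t. Keys of the result form D_f.
    """
    f = {}
    for t, x in pattern:
        variables = sorted(set(t))
        # Range over all mappings phi from the variables of t to the domain
        for values in product(domain, repeat=len(variables)):
            phi = dict(zip(variables, values))
            args = tuple(phi[v] for v in t)
            out = phi[x]
            if f.setdefault(args, out) != out:
                raise ValueError("no partial operation satisfies this pattern")
    return f

# Pattern with pairs ((x, x, y), y) and ((y, x, x), y)
MALTSEV = [(("x", "x", "y"), "y"), (("y", "x", "x"), "y")]
f = interpret(MALTSEV, [0, 1])
print(sorted(f.items()))
```

On the Boolean domain this operation is defined on the six triples whose first two or last two entries coincide, and undefined on (0, 1, 0) and (1, 0, 1).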
We say that a partial operation f is a pattern partial operation if it satisfies some polymorphism pattern P. We will often use the following equivalent characterisation.
▶ Observation 5. Let D be a finite set, k be a nonnegative integer and D_f ⊆ D^k. A partial operation f : D_f → D is a pattern partial operation if and only if for every t ∈ D_f and g : D → D, we have that g(t) ∈ D_f and f ∘ g(t) = g ∘ f(t).

Proof. Suppose that f is a pattern partial operation because it satisfies a certain polymorphism pattern P. In particular, for every t ∈ D_f there exists some ((x_1, ..., x_k), x) ∈ P and ϕ : {x_1, ..., x_k} → D such that t = (ϕ(x_1), ..., ϕ(x_k)). Then, for any mapping g : D → D we have g(t) = (g(ϕ(x_1)), ..., g(ϕ(x_k))), which must belong to D_f as witnessed by the mapping g ∘ ϕ, and f ∘ g(t) = g(ϕ(x)) = g ∘ f(t).

Conversely, suppose that for every t ∈ D_f and g : D → D, we have that g(t) ∈ D_f and f ∘ g(t) = g ∘ f(t). Let D_P = {x_1, ..., x_q} be a set of variables in bijection with D = {d_1, ..., d_q}, and let P denote the pattern P = {((x_{i_1}, ..., x_{i_k}), x_j) | f(d_{i_1}, ..., d_{i_k}) = d_j}. Then, we must have x_j ∈ {x_{i_1}, ..., x_{i_k}} for any ((x_{i_1}, ..., x_{i_k}), x_j) ∈ P. Indeed, if it were not the case then there would exist a tuple t = (d_{i_1}, ..., [...] Furthermore, mappings ϕ from D_P to D can be identified with mappings from D to D, so with a slight abuse of notation we have [...] for all ϕ : D_P → D and ((x_{i_1}, ..., x_{i_k}), x_j) ∈ P, and f satisfies P. ◀

On the Boolean domain, pattern partial operations are called pSDI operations [21] (for partial self-dual idempotent operations). Beyond the Boolean domain, notable examples of pattern partial operations are the first Pixley partial operation of [5] and the universal Mal'tsev partial operations of [20], the simplest of which is presented in Example 6.

▶ Example 6. Let P_{M_2} denote the polymorphism pattern {((x, x, y), y), ((y, x, x), y)} and consider the partial operation f^D_{P_{M_2}} over some set D, which is an example of a pattern partial operation with domain {(d_1, d_2, d_3) ∈ D^3 | (d_1 = d_2) or (d_2 = d_3)}. By definition, a binary relation R admits f^D_{P_{M_2}} as a partial polymorphism if and only if it is rectangular, that is, R does not contain three tuples (a, b), (a, c), (d, c) such that (d, b) ∉ R. It can be further observed (although it is not immediately obvious) that a binary relation admits f^D_{P_{M_2}} as a partial polymorphism if and only if it is fgpp-definable from the empty constraint language. This polymorphism pattern plays a critical role in the characterisation of the non-redundancy of binary constraint languages obtained in [5], and we will revisit it in the next section.
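Both Observation 5 and the rectangularity claim of Example 6 can be sanity-checked by exhaustive search on a small domain. The sketch below assumes the standard coordinate-wise notion of partial polymorphism (apply f to the columns of k tuples of R whenever every column lies in D_f); the helper names are hypothetical and the check covers only binary relations over a three-element domain, so it is an illustration and not a proof.

```python
from itertools import product

def maltsev_partial(domain):
    """Partial operation with f(x, x, y) = y and f(y, x, x) = y."""
    f = {}
    for a, b in product(domain, repeat=2):
        f[(a, a, b)] = b
        f[(b, a, a)] = b
    return f

def commutes_with_unary_maps(f, domain):
    """Condition of Observation 5: g(t) is in D_f and f(g(t)) = g(f(t))."""
    dom = list(domain)
    for images in product(dom, repeat=len(dom)):
        g = dict(zip(dom, images))
        for t, out in f.items():
            gt = tuple(g[d] for d in t)
            if gt not in f or f[gt] != g[out]:
                return False
    return True

def is_partial_polymorphism(f, relation, k):
    """Coordinate-wise application of f to any k tuples whose columns lie in D_f."""
    for rows in product(relation, repeat=k):
        columns = list(zip(*rows))
        if all(col in f for col in columns):
            if tuple(f[col] for col in columns) not in relation:
                return False
    return True

def is_rectangular(relation):
    """No (a, b), (a, c), (d, c) in R with (d, b) missing from R."""
    return all((d, b) in relation
               for (a, b) in relation
               for (a2, c) in relation if a2 == a
               for (d, c2) in relation if c2 == c)

D = [0, 1, 2]
f = maltsev_partial(D)
pairs = list(product(D, repeat=2))
relations = [{p for i, p in enumerate(pairs) if mask >> i & 1}
             for mask in range(2 ** len(pairs))]
agree = all(is_partial_polymorphism(f, R, 3) == is_rectangular(R) for R in relations)
print(commutes_with_unary_maps(f, D), agree)
```

The exhaustive comparison ranges over all 512 binary relations on D, which is cheap here but grows doubly exponentially with the domain size.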

CP 2022
Throughout this note we will use p²pol(Γ) to denote the set of all pattern partial polymorphisms of Γ. The following proposition shows that p²pol(Γ) determines precisely the set of relations that are fgpp-definable over Γ.
▶ Proposition 7. Let Γ_1 and Γ_2 be two constraint languages over the same finite domain D. Then p²pol(Γ_1) ⊆ p²pol(Γ_2) if and only if Γ_2 ⊆ ⟨Γ_1⟩_fg.
Proof. We first prove the backward implication. Suppose that Γ_2 ⊆ ⟨Γ_1⟩_fg but there exists some pattern partial operation f ∈ p²pol(Γ_1) of arity k that is not a partial polymorphism of some relation R ∈ Γ_2. Let R(x_1, ..., x_r) ≡ ∃y_1, ..., y_q : ψ(x_1, ..., x_r, y_1, ..., y_q) be an fgpp-definition of R over Γ_1 and define R_∄(x_1, ..., x_r, y_1, ..., y_q) ≡ ψ(x_1, ..., x_r, y_1, ..., y_q). First, observe that for all g : D → D and k tuples [...] l[1, ..., r] = t_l for all l ≤ k. By Definition 1, there exists for each r < i ≤ r + q an index j ≤ r and a mapping g : [...]

The forward implication is a bit more difficult. Let ℛ denote the set of all relations R over D such that R ∉ ⟨Γ_1⟩_fg and every pattern partial polymorphism of Γ_1 is a partial polymorphism of R. Towards a contradiction, suppose that ℛ is non-empty. Let R be a relation in ℛ with minimum arity r. Note that ⟨Γ_1⟩_fg contains all unary relations over D, so we may assume that r ≥ 2. Now, we define R̂ as the intersection of all relations Q ∈ ⟨Γ_1⟩_fg such that R ⊆ Q, and observe that R̂ is well defined (because D^r ∈ ⟨Γ_1⟩_fg) and strictly contains R. In particular, there exists a certain tuple t ∈ R̂\R. We pick an arbitrary ordering t_1, ..., t_m of the tuples of R, and for all l ≤ r we define the lth column of R as c_l = (t_1[l], ..., t_m[l]). Then, we define D_f = {g(c_l) | l ≤ r, g : D → D} and let p = |D_f|, as well as σ : D_f → {1, ..., p} be an arbitrary bijection such that σ^{-1}(i) = c_i for i ≤ r. Now, consider the relation R_f(y_1, ..., y_r) ≡ ∃y_{r+1}, ..., y_p : ψ(y_1, ..., y_p), where ψ(y_1, ..., y_p) is given by [...] and the first conjunction is restricted to tuples of variables that are well-defined with respect to σ. By construction, the tuples of R_f are in one-to-one correspondence with the pattern partial polymorphisms of Γ_1 of arity m whose domain is the closure of c_1, ..., c_r under all unary operations D → D.
In particular, R_f contains the tuples corresponding to the m partial projection operations on D_f and hence R_f contains R. Then, since R_f is fgpp-definable over Γ_1, it follows that t ∈ R_f. This particular tuple t corresponds to a certain pattern partial polymorphism f_t of Γ_1, of arity m, domain D_f and such that f_t(c_l) = t[l] for all l ≤ r. Since t ∉ R, f_t is not a partial polymorphism of R, which concludes the proof. ◀

Pattern partial polymorphisms and redundancy
Recall from Section 2 that in order to study the function NRD_Γ, we can assume without loss of generality that Γ contains a single relation R. Then, it will be convenient to rephrase CSP(Γ) as a homomorphism problem: given a relation R_X over some finite set X of the same arity as R, is there a homomorphism from R_X to R? Here, a homomorphism is a mapping ϕ from X to D such that ϕ(t) ∈ R for all t ∈ R_X. We will use hom(R_X, R) to denote the set of all homomorphisms from R_X to R. In this formulation, the constraint scopes are given by the tuples of R_X and a constraint (R, t), t ∈ R_X, is redundant if and only if hom(R_X, R) = hom(R_X\{t}, R).
▶ Lemma 8. Let R_X, R be relations with respective domains X, D and let f^D_P be a k-ary partial polymorphism of R that satisfies a pattern P. If t, t_1, ..., t_k are tuples of R_X such that t = f^X_P(t_1, ..., t_k), then hom(R_X, R) = hom(R_X\{t}, R).
Proof. For the sake of contradiction, suppose that there exists a homomorphism h : X → D such that h(t) ∉ R but h(t_1), ..., h(t_k) ∈ R. Observe that f^{X∪D}_P is a partial polymorphism of R (when interpreted as a relation over X ∪ D) and define g : X ∪ D → X ∪ D such that g(u) = h(u) if u ∈ X and g(u) = u otherwise. Since f^{X∪D}_P is a pattern partial operation, we have that h(t) = g(f^{X∪D}_P(t_1, ..., t_k)) = f^{X∪D}_P(g(t_1), ..., g(t_k)) = f^{X∪D}_P(h(t_1), ..., h(t_k)) by Observation 5. This tuple belongs to R because f^{X∪D}_P is a partial polymorphism of R, a contradiction. ◀

In essence, a (partial) polymorphism is an operator that combines solutions (tuples of values) to produce new ones. What this lemma says is that pattern partial polymorphisms can also be used to combine constraints and produce new ones that are valid for the instance, i.e. redundant. This is particularly interesting in light of the algebraic duality uncovered in Proposition 7: if Γ can fgpp-define a relation R with high non-redundancy, then Γ has high non-redundancy by Proposition 2, and if it cannot then Proposition 7 and Lemma 8 provide a non-trivial mechanism to identify redundant constraints that is valid for CSP(Γ) but not for CSP({R}).
▶ Example 9. Let R be a relation with the operation f^D_{P_{M_2}} of Example 6 as partial polymorphism. Consider a CSP instance (R_X, R) and suppose that there exist four variables x_1, x_2, y_1, y_2 ∈ X such that (x_1, y_1), (x_1, y_2), (x_2, y_2), (x_2, y_1) are tuples of R_X (i.e. are scopes of constraints with relation R). Then, the pattern partial polymorphism f^D_{P_{M_2}} combined with Lemma 8 implies that the constraint (R, (x_2, y_1)) is redundant, as it is the image through f^X_{P_{M_2}} of the first three constraints.
Given a relation R over a set X and a set F of partial operations on X, we denote by F(R) the transitive closure of R under operations from F. If no tuple t of R can be generated from tuples in R\{t} via an operation in F, we say that R is F-independent. The following two propositions are natural consequences of Lemma 8 regarding upper bounds on the NRD function.
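The computation behind Example 9 can be replayed mechanically: interpret the pattern of Example 6 on the variable set X and apply it coordinate-wise to constraint scopes. The following Python sketch does this under the assumption that the generating tuples must differ from the generated one (as in the definition of F-independence); the function names are illustrative only.

```python
from itertools import product

def maltsev_partial(universe):
    """Interpretation on `universe` of the pattern ((x, x, y), y), ((y, x, x), y)."""
    f = {}
    for a, b in product(universe, repeat=2):
        f[(a, a, b)] = b
        f[(b, a, a)] = b
    return f

def generated_by(f, k, scopes, target):
    """True if target = f(t_1, ..., t_k) coordinate-wise for scopes t_i != target."""
    others = [s for s in scopes if s != target]
    for rows in product(others, repeat=k):
        columns = list(zip(*rows))
        if all(col in f for col in columns):
            if tuple(f[col] for col in columns) == target:
                return True
    return False

X = ["x1", "x2", "y1", "y2"]
f_X = maltsev_partial(X)  # the operation is interpreted over variables, not values
scopes = [("x1", "y1"), ("x1", "y2"), ("x2", "y2"), ("x2", "y1")]

# First coordinate: f(x1, x1, x2) = x2; second coordinate: f(y1, y2, y2) = y1.
print(generated_by(f_X, 3, scopes, ("x2", "y1")))
# Without the fourth scope, nothing generates (x2, y2) from the other two.
print(generated_by(f_X, 3, scopes[:3], ("x2", "y2")))
```

By Lemma 8, a positive answer certifies redundancy of the corresponding constraint for every relation R that has this operation as a partial polymorphism; a negative answer says nothing about redundancy.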
▶ Proposition 10. Let R be a relation over a set D, P_R be the set of polymorphism patterns that are satisfied by partial polymorphisms of R, and P^S_R denote the set of interpretations of P_R on set S. If for every relation R_X over a set X of n elements such that ar(R_X) = ar(R) there exists a relation R* [...]

[...] equipped with an algebraic structure; this discrepancy makes any bound obtained this way quite loose. For instance, in the elementary case r = 2, Corollary 20 only produces an upper bound of O(n^{3/2}) for binary rectangular relations while more direct arguments (Example 11) easily establish the tight bound Θ(n). Similarly, for Boolean languages the same result holds for ϵ = 1, but proving such a bound using Lemma 8 (rather than polynomials, as in [9]) would necessitate a much deeper analysis of the pattern partial polymorphisms of constraint languages preserved by f^D_{P_{u_r}}. Finally, we remark that the proof of Corollary 20 implies a simple polynomial-time sparsification algorithm for all languages Γ of arity r with NRD_Γ(n) = o(n^r).
▶ Theorem 22. Let Γ be a constraint language with domain D and maximum arity r ≥ 2. If f^D_{P_{u_r}} ∈ p²pol(Γ), then there exists a polynomial time algorithm that takes an instance of CSP(Γ) as input and outputs an equisatisfiable instance of CSP(Γ) with O(n^{r−ϵ}) constraints, where ϵ = 2^{1−r} > 0.
Proof. Let I = (X, C) be an instance of CSP(Γ). For each relation R ∈ Γ, the algorithm constructs the relation R_X = {(x_1, ..., x_r) | (R, (x_1, ..., x_r)) ∈ C} and enumerates all sequences t_1, ..., t_{2^r} of tuples of R_X. For each sequence, it tests whether t_1 = f^X_{P_{u_r}}(t_2, ..., t_{2^r}) and discards the constraint (R, t_1) from I when the test succeeds. By Lemma 8, this process only removes redundant constraints. The algorithm then outputs the residual instance.
After this algorithm has terminated, for each relation R the corresponding relation R_X contains at most O(n^{r−ϵ}) tuples because the r-uniform r-partite hypergraph H_M(R_X) has cardinality |R_X| and does not contain K^r_2 as a subhypergraph. There are O(1) distinct relations in Γ, so the total number of remaining constraints is O(n^{r−ϵ}). ◀
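The removal loop in this proof can be sketched in a few lines of Python. Since the pattern behind Theorem 22 is not spelled out in this part of the text, the sketch below substitutes the ternary pattern of Example 6 as an assumed stand-in for the binary case (its arity 3 matches 2^r − 1 for r = 2). The function names are hypothetical, and the loop repeats until no tuple can be generated, which is a simplification of the single enumeration described in the proof.

```python
from itertools import product

def maltsev_partial(universe):
    """Assumed stand-in pattern for r = 2: f(x, x, y) = y and f(y, x, x) = y."""
    f = {}
    for a, b in product(universe, repeat=2):
        f[(a, a, b)] = b
        f[(b, a, a)] = b
    return f

def is_generated(f, k, target, others):
    """True if target is the coordinate-wise image under f of k tuples from others."""
    for rows in product(others, repeat=k):
        columns = list(zip(*rows))
        if all(col in f for col in columns):
            if tuple(f[col] for col in columns) == target:
                return True
    return False

def sparsify(scopes, f, k):
    """Discard scopes generated from the remaining ones until none is left."""
    remaining = list(dict.fromkeys(scopes))  # drop duplicates, keep order
    changed = True
    while changed:
        changed = False
        for target in remaining:
            others = [s for s in remaining if s != target]
            if is_generated(f, k, target, others):
                remaining = others  # redundant by Lemma 8 for suitable R
                changed = True
                break
    return remaining

X = ["a1", "a2", "b1", "b2"]
f_X = maltsev_partial(X)
scopes = [(a, b) for a in ("a1", "a2") for b in ("b1", "b2")]
kept = sparsify(scopes, f_X, 3)
print(len(scopes), "->", len(kept))
```

On the four scopes of a 4-cycle the sketch keeps three of them; more generally, with this stand-in pattern the surviving scopes form a bipartite graph without 4-cycles, which is the r = 2 shape of the hypergraph argument above. The running time of this naive version is polynomial but far from optimised.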

Conclusion
We have presented an algebraic framework based on fgpp-definitions and pattern partial polymorphisms dedicated to the study of non-redundancy of constraint languages, extending earlier work on Boolean languages [9,21]. Based on this framework, we have established a loose connection with extremal hypergraph theory and deduced a characterisation of constraint languages of arity r with non-redundancy Θ(n^r). The progress we have made in this paper is modest, and much is still unknown on this topic. We believe that the following challenges are the natural next steps towards a better understanding of non-redundancy.

Find a characterisation of constraint languages with non-redundancy O(n).
In this paper we have characterised constraint languages whose non-redundancy is the highest possible with respect to their arity, so it would be interesting to do the same for languages whose non-redundancy is the lowest possible. It is conceivable that this class coincides with that of languages with a finite Mal'tsev embedding [21] since no counter-example is known. However, proving that it is the case will likely require a better understanding of the pattern partial polymorphisms of these languages and lower bounds more sophisticated than those based on Boolean clauses.

Determine whether all r-ary constraint languages with non-redundancy o(n^r) have non-redundancy O(n^{r−1}).
This is known to be true for the Boolean domain by the results of Chen et al. [9], but for larger domains we are only able to prove the existence of a considerably smaller gap which vanishes as r grows. Both our approach and that of Chen et al. have intrinsic limitations when dealing simultaneously with large domains and large arities, so it would be interesting to see how they could be combined.

Determine the non-redundancy of all ternary constraint languages.
A classification is known for binary languages (see [5], although a more direct proof follows from Example 11 and Lemma 15) and ternary Boolean languages [9], but not for ternary languages with arbitrary domains.

Clarify the relationship between non-redundancy, sparsification, and learnability.
In particular, it would be interesting to determine whether non-redundancy O(n^q) implies sparsification algorithms with output size O(n^q) and whether non-redundancy is asymptotically equivalent to chain length, a closely related measure that characterises the efficiency of a class of learning algorithms for constraint acquisition [5].

A relation R of arity r = ar(R) over a domain D is a subset of D^r. Given a tuple t of length r and S ⊆ {1, ..., r}, we denote by t[S] the tuple obtained from t by discarding elements whose index is not in S. Similarly, the projection on S ⊆ {1, ..., r} of a relation R of arity r is denoted by R[S] = {t[S] | t ∈ R}. A (finite) constraint language Γ is a finite set of

▶ Example 6. Let P_{M_2} denote the polymorphism pattern {((x, x, y), y), ((y, x, x), y)} and consider the partial operation f^D_{P_{M_2}} over some set D, which is an example of a pattern partial operation with domain {(d_1, d_2, d_3) ∈ D^3 | (d_1 = d_2) or (d_2 = d_3)}. By definition, a binary relation R admits f^D_{P_{M_2}} as a partial polymorphism if and only if it is rectangular, that is, R does not contain three tuples (a, b),