Convexity preserving deformations of digital sets: Characterization of removable and insertable pixels

In this paper, we are interested in digital convexity, a notion applied in several domains such as image processing and discrete tomography. We study the inflation and deflation of digital convex sets while maintaining the convexity property. Since any digital convex set can be read and identified by its boundary word, we adopt a combinatorics-on-words perspective instead of a purely geometric approach. In this context, we characterize the points that can be added to or removed from a digital convex set without losing its convexity. Algorithms are given at the end of each section, with examples of each process.


Introduction
In this paper, we provide characterizations of convexity-preserving removable and insertable points of digital convex sets. In $\mathbb{Z}^2$, even a simple transformation (such as a rotation) of a digital convex set can cause the loss of the digital convexity property. We aim to provide the discrete equivalent of an infinitesimal transform, from which we can derive convexity-preserving set deformations. In order to study such atomic deformations that preserve digital convexity, we need to investigate deflation and inflation processes [30]. The theory of combinatorics on words provides useful tools and techniques for our investigation [22]. Relying on this theory, we are able to characterize the removable points $x$ of a digital convex set $C$, such that $C \setminus \{x\}$ is still digitally convex. Similarly, we provide the characterization of the insertable points $x \in \overline{C}$ (the complement of $C$ in $\mathbb{Z}^2$) such that $C \cup \{x\}$ is still digitally convex. These two actions can only be applied at some specific points of the digital convex set if we want to preserve digital convexity. As a matter of fact, deflation is easy: finding the correct point to remove from a digital set is simple from a geometrical point of view. Inflation, on the other hand, is more involved. Propositions and theorems for inflation and deflation are naturally associated with algorithms in the mathematical sense. In [31], we provide an implementable algorithm, containing all the necessary details.
The plan of the paper is the following. Section 2 provides an overview of digital convexity from a combinatorics-on-words perspective, and gives the basic notations needed to understand our results. In Section 3, we characterize removable points and prove that such a point can be any simple point that is a corner of the convex hull of $C$. As said before, finding a characterization of insertable points is not an easy task. We provide necessary and sufficient conditions to determine candidate points in Section 4. The necessary condition is based on a result from [13]: the authors proved that adding the closest outer point of a segment maintains its digital convexity. Based on this result, we provide in this paper the characterization of all insertable points for the whole digital convex set. We provide two main results for the sufficient condition. The first one is the general case and leads to propagation after inflation. The second result imposes a strong constraint on the sufficient condition. Figure 1 shows an example of a digitally convex set with some removable and insertable points. For both procedures and after each iteration, we must consider an update of the segments of the convex hull. In this paper, we discuss all the possible cases for this update. One of them is presented in Figure 2. The last section is left for the conclusions and perspectives.

Figure 2: The first picture shows an example of deflation. The green point $x$ is the point to be removed from a digital convex set $C$ (orange points). The second picture shows an example of inflation. The green point $x$ is the new point to be added. In both pictures, the blue and red segments are respectively $\mathrm{Conv}(C)$ and its modification after each process.

Digital convexity and combinatorics on words
In this section, we first give the definition of digital convex sets based on the convex hull [19], which are also called H-convex sets [16]. We then show that their boundaries can be expressed by words written in the Freeman chain code [18], called boundary words. After recalling basic notions of combinatorics on words, we present the characterization of digital convexity along boundary words using those notions [8].

Definitions of digital convex sets
In $\mathbb{R}^2$, a subset $R$ is said to be convex if, for any pair of points $x, y \in R$, every point on the straight line segment joining $x$ and $y$ is also within $R$. This notion, however, cannot be straightforwardly applied to subsets of $\mathbb{Z}^2$. In order to tackle this issue, various notions of convexity of a set $S$ of $\mathbb{Z}^2$ have been proposed. They are categorized into four approaches. The first approach is based on the non-existence of a triplet of collinear points $(p, q, r)$ such that $p, r \in S$ and $q \in \overline{S}$ [25]. The second one is based on the existence of a convex set $R \subset \mathbb{R}^2$ such that the digitization of $R$ is $S$ [29]. Kim proposed another definition, which can be characterized by verifying whether the digitization of the convex hull of $S$ is equal to $S$, and also showed its equivalence with the two former ones [19]; the definition based on the convex hull was later called H-convexity by Eckhardt [16]. The fourth approach is based on the notion of digital line segments [20]: $S$ is digitally convex if the digital line segment joining any pair of points of $S$ belongs to $S$. Note that this last notion guarantees the connectivity of $S$, contrary to the other three. Under the connectivity assumption, it is then shown that the above different concepts coincide [16].
The definition of connectivity on $\mathbb{Z}^2$ is based on the notion of neighborhood. Given a point $p \in \mathbb{Z}^2$, the neighborhood of $p$ is defined by $N(p) = \{q \in \mathbb{Z}^2 : \|p - q\|_1 \le 1\}$, also called the 4-neighborhood. We say that a point $q$ is 4-adjacent to $p$ if $q \in N(p) \setminus \{p\}$. From the reflexive-transitive closure of this adjacency relation on a finite subset $X \subset \mathbb{Z}^2$, we derive the 4-connectivity relation on $X$, which is an equivalence relation. If there is exactly one equivalence class for this relation, then we say that $X$ is 4-connected. In this article, we consider finite and 4-connected sets of $\mathbb{Z}^2$. As this 4-connectivity assumption yields the coincidence of the above different definitions of digital convexity [16], we use the one based on the convex hull [16,19]. For a finite subset $Y \subset \mathbb{R}^2$, the convex hull is defined as:
$$\mathrm{Conv}(Y) = \Big\{ \sum_{i} \lambda_i y_i \;:\; y_i \in Y,\ \lambda_i \ge 0,\ \sum_{i} \lambda_i = 1 \Big\}.$$
Definition 1. A finite subset $S \subset \mathbb{Z}^2$ is said to be digitally convex if $\mathrm{Conv}(S) \cap \mathbb{Z}^2 = S$.
From this definition, we have the following remark, which is contrary to the implication of connectivity in the concept of convexity in $\mathbb{R}^2$ (see Figure 3 for an example).
Remark 1. Digital convexity does not imply connectivity in Z 2 .
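The convex-hull definition and Remark 1 can be checked on small sets with the following minimal sketch. It is only an illustration under our own assumptions: the function names are hypothetical, the hull is computed with Andrew's monotone chain (not a method taken from the paper), and the lattice points are enumerated by brute force over the bounding box.

```python
from itertools import product

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(hull, q):
    """True if q lies inside or on the boundary of the convex polygon."""
    n = len(hull)
    if n == 1:
        return q == hull[0]
    if n == 2:  # degenerate hull: a segment
        a, b = hull
        return (cross(a, b, q) == 0
                and min(a[0], b[0]) <= q[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= q[1] <= max(a[1], b[1]))
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))

def is_digitally_convex(S):
    """Check Conv(S) ∩ Z^2 == S by scanning the bounding box of S."""
    S = set(S)
    if not S:
        return True
    hull = convex_hull(S)
    xs, ys = [p[0] for p in S], [p[1] for p in S]
    box = product(range(min(xs), max(xs) + 1), range(min(ys), max(ys) + 1))
    return all(q in S for q in box if in_hull(hull, q))

def is_4_connected(S):
    """Depth-first search over the 4-adjacency relation."""
    S = set(S)
    if not S:
        return True
    start = next(iter(S))
    seen, stack = {start}, [start]
    while stack:
        x, y = stack.pop()
        for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if q in S and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen == S
```

For instance, the two-point set {(0,0), (1,1)} passes `is_digitally_convex` but fails `is_4_connected`, which is the situation described in Remark 1.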
The above definition of digital convexity is also used in [12], whose aim is to verify whether a given finite 4-connected set $S$ of $\mathbb{Z}^2$ is digitally convex. Their approach focuses rather on the boundary of $S$, on which maximal line segments and their arithmetic properties are analyzed [12]. On the other hand, properties of the boundary of $S$ based on combinatorics on words are also studied [8].
The problems treated in this article are different from the ones in [8,12], but are related. Our problem is stated as follows: given a finite, 4-connected and digitally convex set $C$ of $\mathbb{Z}^2$ and a point $p$ of $C$ (resp. of the complement $\overline{C}$), we would like to verify whether $C \setminus \{p\}$ (resp. $C \cup \{p\}$) is still digitally convex (and 4-connected). In order to answer such questions, we use the boundary properties based on combinatorics on words that are presented in [8].

Boundary words of digitally convex sets
Let $C \subset \mathbb{Z}^2$ be a finite, 4-connected digitally convex set. The border points of $C$ can be tracked by a classical border following algorithm (for example, see [1] for "left-hand-on-wall" border following, i.e. placing the left hand on a wall (border) and then following the wall by maintaining contact between the hand and the wall), which generates a 4-connected sequence of the border points of $C$. Note that the sequence can include dead-ends and thus sometimes turnaround sub-sequences. Such a sequence is also called the boundary path of $C$, denoted by $\mathrm{Bd}(C)$, and is represented by a word obtained by the Freeman chain code [18], denoted by $W(\mathrm{Bd}(C))$ or simply $W(C)$ and called the boundary word of $C$ [17]. Boundary words are thus defined over an alphabet of four letters $0, 1, \bar{0}, \bar{1}$, which are associated with the right, up, left and down steps, respectively. See Figure 4 for an example.
It is then observed that the boundary word $W(C)$ of any digital convex set $C$ can be factorized into four sub-words, such that each sub-word is a binary word, i.e., it contains only two letters. For such a factorization, we first enclose the boundary path $\mathrm{Bd}(C)$ in its bounding box, and cut $\mathrm{Bd}(C)$ into four parts at four intersections with the bounding box; for example, on the left side of the bounding box, we consider the lowest intersection point W as a cutting position. Similarly, we can find the other three cutting positions, denoted by N, E, and S, as seen in Figure 5. Starting from W in the clockwise direction and ending at N, we obtain the WN-path as a part of $\mathrm{Bd}(C)$, which is associated with the WN-word of $W(C)$. Similarly, we obtain the NE-, ES-, and SW-paths and their associated words. Figure 5 illustrates the factorization of boundary words and shows that each word is a binary word. This factorization allows us to treat the four parts of the boundary of any digital convex set independently, and to introduce the notion of digital convexity adapted to each part [8].
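As a small illustration of the Freeman chain code described above, the sketch below decodes a boundary word into its lattice path. The ASCII encoding is our own assumption: the characters `2` and `3` stand for the barred letters (left and down steps), and the function name is hypothetical.

```python
# Assumed ASCII encoding of the four letters: "0" right, "1" up,
# "2" left (barred 0), "3" down (barred 1).
STEPS = {"0": (1, 0), "1": (0, 1), "2": (-1, 0), "3": (0, -1)}

def decode(word, start=(0, 0)):
    """Return the list of grid points visited when reading `word` from `start`."""
    path = [start]
    for letter in word:
        dx, dy = STEPS[letter]
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path

# Clockwise boundary path of the 2x2 block of points, starting at (0, 0):
# up, right, down, left.
square = decode("1032")
```

A boundary word is closed, so the decoded path returns to its starting point.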

Definition 2 ([8]). A word $w$ is said to be WN-convex if it codes the WN-word of the boundary word of some finite, 4-connected and digitally convex set of $\mathbb{Z}^2$.
Similarly, we can define NE-, ES- and SW-convex words. Our aim is to deform digital convex sets while preserving their digital convexity, and our approach is based on combinatorics on words. In order to present the important characterization of digital convexity by combinatorics on words [8], we first recall the necessary notions of combinatorics on words in the following.

Basic notations of words
We first present some terminology on words that can be found in [22]. An alphabet $A$ is a nonempty finite set of symbols called letters; in this article, we have the four letters $0, 1, \bar{0}, \bar{1}$ mentioned above. A word $w$ is a sequence of concatenated letters from $A$. The empty word $\epsilon$ is the sequence of zero symbols. $A^*$ denotes the set of all finite words over $A$. We let $|w|$ represent the length of a word $w$, while $|w|_a$ represents the number of occurrences of the letter $a$ in $w$. We thus have $|w| = \sum_{a \in A} |w|_a$. The $n$-times concatenation of $w$ is denoted by $w^n$. We write $s$ for the sub-word of $w$ from position $k$ to position $l$. A word is said to be primitive if it is not a proper power of a nonempty word. We say that $w$ and $w'$ are conjugate, denoted by $w \equiv w'$, if there exist two words $u, v$ such that $w = uv$ and $w' = vu$. The reversal of a word $w = a_1 a_2 \cdots a_n$ is $\widetilde{w} = a_n \cdots a_2 a_1$, where each $a_i$ is a letter. If $w = \widetilde{w}$, the word is called a palindrome. In this paper, we use the total lexicographic order, denoted by $<$.
We recall that the boundary word over $\{0, 1, \bar{0}, \bar{1}\}$ can be divided into 4 parts and that each of them belongs to a different binary alphabet. For each of these parts, we give the lexicographic order between the two letters in such a way that we preserve the decreasing order of the Lyndon factorization (see below for the definition).
In this case, the slope of each factor is calculated depending on which part it belongs to. Table 1 shows this information with respect to each part of the boundary word.
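The basic notions on words recalled above translate directly into a few lines of code. The following is a minimal sketch on Python strings; the function names are ours and purely illustrative.

```python
def conjugates(w):
    """All conjugates (rotations) of w, i.e. the words vu for w = uv."""
    return {w[i:] + w[:i] for i in range(len(w))}

def is_primitive(w):
    """True if w is nonempty and not a proper power of a shorter word."""
    n = len(w)
    return n > 0 and all(
        w != w[:d] * (n // d) for d in range(1, n) if n % d == 0
    )

def reversal(w):
    """The word read from right to left."""
    return w[::-1]

def is_palindrome(w):
    """True if w equals its reversal."""
    return w == reversal(w)

def letter_counts(w):
    """Map each letter a occurring in w to |w|_a."""
    return {a: w.count(a) for a in set(w)}
```

For example, `0101` is the square of `01` and hence not primitive, while `001` is primitive and has the three conjugates `001`, `010` and `100`.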

Words Alphabet Order Slopes
Table 1: The alphabet, the lexicographic order and the slope calculation used in each of the 4 parts of the boundary word.

In the following, we introduce the two families of words needed in this article, Lyndon and Christoffel words.

Lyndon words
We introduce the lexicographic order over words in order to present the Lyndon family, which was introduced by R. C. Lyndon [23].
Definition 3 ([22]). Let $u$ and $v$ be two words. We say $u > v$ if and only if either $v$ is a proper prefix of $u$, or $u = x a u'$ and $v = x b v'$ for some words $x, u', v'$ and some letters $a > b$.
Definition 4 ([23]). A word $w$ is Lyndon if it is the smallest among its conjugates with respect to the lexicographic order.
We then have the following unique factorization of any word, which is called Lyndon factorization, introduced by Lyndon, Chen and Fox in 1958.
Theorem 1 ([28]). Every non-empty word $w$ admits a unique factorization as a lexicographically decreasing sequence of Lyndon words,
$$w = \ell_1^{n_1} \ell_2^{n_2} \cdots \ell_k^{n_k},$$
where every $n_i \in \mathbb{N}$ and every $\ell_i$ is a primitive Lyndon word such that $\ell_1 > \ell_2 > \cdots > \ell_k$.
A linear-time algorithm to compute the Lyndon factorization was proposed by Duval [15], while there also exists an $O(\log n)$-time parallel algorithm proposed by Apostolico and Crochemore [2], where $n$ is the length of the word. Geometrically, Lyndon points correspond to the vertices of the convex hull of $C$. They are used in the next section and appear in Figure 1 as the green points in $C$.
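To make Theorem 1 concrete, here is a minimal sketch of Duval's linear-time factorization in its usual textbook form, written for words over the alphabet {0, 1} with the order 0 < 1 (Python's default string order). The function names are ours; the grouping of equal consecutive factors into powers is done with `itertools.groupby`.

```python
from itertools import groupby

def lyndon_factorization(w):
    """Duval's algorithm: non-increasing sequence of Lyndon factors of w."""
    factors = []
    n, i = len(w), 0
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power
        # of a Lyndon word.
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def grouped_factorization(w):
    """Pairs (l_i, n_i) such that w = l_1^{n_1} ... l_k^{n_k}."""
    return [(f, len(list(g))) for f, g in groupby(lyndon_factorization(w))]
```

On the word `1011010010`, the factors are `1`, `011`, `01`, `001`, `0`, which are lexicographically decreasing; on `010101`, the single Lyndon word `01` appears with exponent 3.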

Christoffel words
Christoffel words are regarded as the discretization of line segments with rational slopes.
Geometrically, a Christoffel word is determined by encoding with the Freeman chain code the discretization of a line segment of rational slope [3]. More precisely, for any two non-negative co-prime integers $a$ and $b$, the discretization of a line segment of rational slope $\frac{a}{b}$ is the closest digital path below the line segment such that no integer point exists between the path and the line segment. This digital path is called the Christoffel path. If $a$ and $b$ are positive co-prime numbers, a Christoffel word $w$ of slope $\frac{a}{b}$, denoted by $C(\frac{a}{b})$, is a sequence of $a + b$ letters chosen from the binary alphabet $\{0, 1\}$. The choice of letters is not random: it is obtained by assigning the letter 0 (resp. 1) to each increasing (resp. decreasing) step in the sequence of all the multiples of $a$ modulo $(a + b)$, as given below.
Definition 6 ([9,28]). Given a pair of non-negative co-prime integers $a$ and $b$, the Christoffel word $w$ of slope $\frac{a}{b}$ is the sequence of $n = a + b$ letters of $\{0, 1\}^*$ such that the $i$-th letter of $w$ is given by:
$$\forall i \in \{1, \ldots, n\}, \quad w[i] = \begin{cases} 0 & \text{if } r_{i-1} < r_i, \\ 1 & \text{if } r_{i-1} > r_i, \end{cases}$$
where the remainder $r_i$ is defined by $r_i = i\,a \bmod (a + b)$.
The remainder sequence $r_i$ and the $i$-th letter of $C(\frac{5}{8})$ are shown in Table 2. Note also that each Christoffel word starts with a horizontal step, i.e. 0, and ends with a vertical step, i.e. 1, while the central part is a palindrome.
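The remainder-based construction of Definition 6 can be sketched as follows. The function names are ours; the test `r + a < n` expresses that the step from one remainder to the next is increasing (no wrap-around modulo $n$), and it also handles the two one-letter words of slopes $\frac{0}{1}$ and $\frac{1}{0}$.

```python
from math import gcd

def remainders(a, b):
    """Sequence r_0, ..., r_n with r_i = i * a mod (a + b)."""
    n = a + b
    return [(i * a) % n for i in range(n + 1)]

def christoffel(a, b):
    """Christoffel word of slope a/b over {0, 1} (a ones, b zeros)."""
    if a < 0 or b < 0 or gcd(a, b) != 1:
        raise ValueError("a and b must be non-negative co-prime integers")
    n = a + b
    r = remainders(a, b)
    # Letter i is 0 when r_{i-1} -> r_i is an increasing step, i.e. when
    # adding a to r_{i-1} does not wrap around modulo n; otherwise it is 1.
    return "".join("0" if r[i - 1] + a < n else "1" for i in range(1, n + 1))

w = christoffel(5, 8)  # the word of slope 5/8 used in the text
```

For slope $\frac{5}{8}$, the remainders $r_1, \ldots, r_{13}$ are 5, 10, 2, 7, 12, 4, 9, 1, 6, 11, 3, 8, 0 and the resulting word is `0010010100101`, which contains 5 ones and 8 zeros.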

Property 1 ([4]). Let $C(\frac{a}{b})$ be the Christoffel word of slope $\frac{a}{b}$ with $a$ and $b$ two positive co-prime numbers. Then we can write $C(\frac{a}{b}) = 0w'1$ where $w'$ is a palindrome.
As the slope of a Christoffel word is exactly the number of occurrences of the letter 1 over the number of occurrences of the letter 0, the slope of each Christoffel word $w$ is defined as $\rho(w) = \frac{|w|_1}{|w|_0}$.

Property 2 ([6]). For any two non-empty Christoffel words $u$ and $v$, we have $u < v \iff \rho(u) < \rho(v)$.
For this family of words, Borel and Laubie introduced the standard factorization [6,7], which allows writing any Christoffel word as the concatenation of two other Christoffel words in a unique way as follows.
Theorem 2 ([6,7]). Any Christoffel word $w$ of length greater than 1 can be written in a unique way as $w = uv$ where $u$ and $v$ are both primitive Christoffel words. The couple $(u, v)$ is called the standard factorization of $w$.
Geometrically, this factorization can be seen as the decomposition of $w$ into $u$ and $v$ at the closest point of the path corresponding to $w$ with respect to the line segment from the origin to $(|w|_0, |w|_1)$. Equivalently, this factorization corresponds exactly to the position where $r_i = 1$. This closest point of $w$ is denoted by cl(w). Figure 6 illustrates the standard factorization of the Christoffel word in Table 2, that is (00100101, 00101).
Figure 6: The standard factorization of the Christoffel word $w$ of slope $\frac{5}{8}$. P corresponds to the closest point cl(w) while Q corresponds to the furthest point fu(w). P is also called the interior Bézout point, while B is the exterior Bézout point.
A direct application of Theorem 2 and Property 1 induces the palindromic factorization, which allows writing any Christoffel word as a concatenation of two palindromes [6,28].
Property 3 ([6,10,28]). Any primitive Christoffel word $w$ of length strictly greater than 1 can be written in a unique way as $w = p_1 p_2$ where $p_1$ and $p_2$ are palindromes.
This property comes from the fact that the central part of each factor of the couple $(u, v)$, which forms the standard factorization of $w$, is a palindrome (i.e. $u$ can be written as $0u_1 1$ with $u_1$ a palindrome). Namely, $w$ can be written both as $0p1$ (Property 1) and as $p_1 p_2$, where $p$, $p_1$ and $p_2$ are palindromes (Property 1 and Theorem 2).
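Both factorizations can be read off the remainder sequence of Definition 6. The sketch below cuts the Christoffel word at the position where the remainder equals 1 (standard factorization, closest point) or $|w| - 1$ (palindromic factorization, at the furthest point discussed in Corollary 2). Function names are ours, and `christoffel` is repeated so that the block is self-contained.

```python
from math import gcd

def christoffel(a, b):
    """Christoffel word of slope a/b over {0, 1}, built from remainders."""
    if a < 0 or b < 0 or gcd(a, b) != 1:
        raise ValueError("a and b must be non-negative co-prime integers")
    n, r, letters = a + b, 0, []
    for _ in range(n):
        letters.append("0" if r + a < n else "1")  # increasing step -> 0
        r = (r + a) % n
    return "".join(letters)

def cut_at_remainder(a, b, target):
    """Cut C(a/b) after the position i (1 <= i < n) where i*a mod n == target."""
    n = a + b
    w = christoffel(a, b)
    k = next(i for i in range(1, n) if (i * a) % n == target)
    return w[:k], w[k:]

def standard_factorization(a, b):
    """Couple (u, v): cut at the closest point, where r_i = 1."""
    return cut_at_remainder(a, b, 1)

def palindromic_factorization(a, b):
    """Couple (p1, p2): cut at the furthest point, where r_i = n - 1."""
    return cut_at_remainder(a, b, a + b - 1)

u, v = standard_factorization(5, 8)
p1, p2 = palindromic_factorization(5, 8)
```

For slope $\frac{5}{8}$ this gives $(u, v) = (00100101, 00101)$, which are the Christoffel words of slopes $\frac{3}{5}$ and $\frac{2}{3}$, and $(p_1, p_2) = (00100, 10100101)$, both palindromes. Both functions assume $a, b \ge 1$.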

Corollary 2 ([13]). Let $w$ be a Christoffel word of length greater than 1. Then the palindromic factorization decomposes $w$ into $p_1$ and $p_2$ exactly at the furthest point of the Christoffel path of $w$ with respect to the line segment from the origin to the grid point $(|w|_0, |w|_1)$, which is unique with respect to $w$. This furthest point is denoted by fu(w).
Note that the position of the grid point fu(w) over $w$ is the index $i$ such that $r_i = |w| - 1$, as seen in Figure 6. The two types of points, the closest point cl(w) and the furthest point fu(w), on the Christoffel path are highlighted because of their particular interest for the rest of the paper; fu(w) is used in Section 4 for the inflation of digital convex sets. We call closest outer point of a word the point that is diagonally opposite to fu(w).
In order to link Lyndon and Christoffel words, we need the notion of $k$-balanced words [24]: a word $w \in \{0, 1\}^*$ is $k$-balanced if and only if for every pair of sub-words $s, t$ of $w$, we have:
$$|s| = |t| \implies \big|\, |s|_1 - |t|_1 \,\big| \le k.$$
It is shown that Christoffel words are 1-balanced [6,7], and the following theorem connects the two families of words.

Theorem 3 ([24]). A word $w$ is a Christoffel word if and only if it is a 1-balanced Lyndon word.
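Theorem 3 suggests a direct, if naive, membership test: check the Lyndon property by comparing the word with its rotations, and the balance property by counting ones in all factors of the same length. The sketch below is quadratic or worse and only meant for small examples; the function names are ours.

```python
def is_lyndon(w):
    """True if w is strictly smaller than all its proper rotations (0 < 1)."""
    return len(w) > 0 and all(w < w[i:] + w[:i] for i in range(1, len(w)))

def is_k_balanced(w, k=1):
    """True if any two factors of w of equal length differ by at most k ones."""
    for m in range(1, len(w) + 1):
        counts = [w[i:i + m].count("1") for i in range(len(w) - m + 1)]
        if max(counts) - min(counts) > k:
            return False
    return True

def is_christoffel_word(w):
    """Membership test based on Theorem 3: 1-balanced Lyndon word."""
    return is_lyndon(w) and is_k_balanced(w, 1)
```

The word `0010010100101` (slope $\frac{5}{8}$) passes both tests. The word `0011` is Lyndon but contains the factors `00` and `11`, so it is not 1-balanced and hence not a Christoffel word; the word `0101` is balanced but not Lyndon, since it is a square.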

Digital convexity interpreted by combinatorics on words
The authors in [8] gave a characterization for the boundary word W (C) using the notions of combinatorics on words, in particular Lyndon and Christoffel words.

Theorem 4 ([8]). A word of $\{0, 1\}^*$ is WN-convex if and only if the factors of its unique Lyndon factorization are all primitive Christoffel words.
Thanks to Theorem 3, if each factor of the Lyndon factorization is in addition 1-balanced, then the factors are also Christoffel words. From Theorem 4, we can also conclude that the slopes of the Christoffel factors are in decreasing order.
Corollary 3. Let $w = w_1^{n_1} w_2^{n_2} \cdots w_k^{n_k}$ be the Lyndon factorization of the boundary word of a certain digital convex set $C$; then $\rho(w_1) > \rho(w_2) > \cdots > \rho(w_k)$.
The following example shows a WN-path and the factorization of its boundary word. The slopes $\rho$ of the factors are decreasing. In [8], we can find an algorithm, linear in the word length, that checks the WN-convexity of a path encoded over a binary alphabet. Then, Theorem 4 allows us to detect the exact position of the vertices of the convex hull of $\mathrm{Bd}(C)$ over the boundary word $w$.
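Theorem 4 can be turned into a simple test of WN-convexity: compute the Lyndon factorization and check that each factor is the Christoffel word of its own slope. The sketch below is a straightforward illustration with hypothetical function names; it is not the linear-time algorithm of [8], since each factor is rebuilt from scratch.

```python
from math import gcd

def christoffel(a, b):
    """Christoffel word with a ones and b zeros (a, b co-prime)."""
    n, r, letters = a + b, 0, []
    for _ in range(n):
        letters.append("0" if r + a < n else "1")
        r = (r + a) % n
    return "".join(letters)

def lyndon_factorization(w):
    """Duval's algorithm with the order 0 < 1."""
    factors, n, i = [], len(w), 0
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def is_primitive_christoffel(f):
    """True if f is the Christoffel word of slope |f|_1 / |f|_0."""
    a, b = f.count("1"), f.count("0")
    return gcd(a, b) == 1 and f == christoffel(a, b)

def is_wn_convex(w):
    """Theorem 4: every Lyndon factor must be a primitive Christoffel word."""
    return all(is_primitive_christoffel(f) for f in lyndon_factorization(w))
```

For example, `1011010010` factorizes into `1`, `011`, `01`, `001`, `0`, with slopes $\infty$, 2, 1, $\frac{1}{2}$, 0 in decreasing order, so it is WN-convex; `0011` is a single Lyndon factor that is not a Christoffel word, so it is not.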

Property 4 ([8]). Given a digital convex 4-connected set $C$, the vertices of the convex hull of $C$ or its boundary $\mathrm{Bd}(C)$ correspond to the Lyndon points of $\mathrm{Bd}(C)$.
In the next sections, we show how to deflate and inflate a digital convex 4-connected set using the above tools of combinatorics on words. In such a context, we characterize removable points for the deflation process in Section 3 and similarly identify insertable points for inflation in Section 4.

Deflation of digital convex sets
In this section, we first define removable points that allow deflating a digital convex 4-connected set $C$ while preserving its convexity. We then characterize such points in order to make a list of all removable points using concepts of digital topology and combinatorics on words. They are defined along the boundary path $\mathrm{Bd}(C)$ via the boundary word $W(C)$ (see Section 2.2). Applying such a point-wise deflation operation iteratively may be required in various applications of computer imagery. This implies that, at each step, we need to choose a removable point among the list of removable points and then to update the list, which is obtained from the new Lyndon factorization of $W(C)$. We show that this update can be made locally. In practice, we need to choose one removable point in the list by using some priority. In this article, however, we do not discuss how to define such a priority and simply assume that it is given a priori [30]. We focus on the characterization of removable points and the update of the list of removable points, which are critical issues in the deflation algorithm.

Removable points
We first give a definition of removable points, which are used for deflating a 4-connected digital convex set C while preserving its convexity.
First of all, let us consider the preservation of connectivity. In digital topology, given a subset $X \subset \mathbb{Z}^2$, a point $x \in X$ is said to be simple if deleting it from $X$ preserves the topological characteristics of $X$, i.e. the number of connected components of both $X$ and its complement in $\mathbb{Z}^2$ [11,21]. As we consider $X$ to be 4-connected and digitally convex here, preserving the topological characteristics of $X$ is equivalent to preserving the connectivity of $X$; $X$ can have no hole due to the digital convexity. Therefore, removable points must be simple if we would like to preserve the connectivity of $C$. Note that digital convexity does not imply connectivity (see Remark 1).
The definition of simple points relies on the notion of connected components, which is a global characteristic such that the whole object must be taken into account. However, it is well known that simple points can be characterized locally [11,26], for example, using the connectivity number defined in the 8-neighborhood [5].
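Since, for a 4-connected digitally convex set, simplicity amounts to preserving connectivity, a point can be tested with a plain graph traversal. The sketch below uses this global check only for illustration; it is not the local characterization by the connectivity number mentioned above, and the function names are ours.

```python
def is_4_connected(X):
    """Depth-first search over the 4-adjacency relation on the set X."""
    X = set(X)
    if not X:
        return True
    start = next(iter(X))
    seen, stack = {start}, [start]
    while stack:
        x, y = stack.pop()
        for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if q in X and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen == X

def keeps_connectivity(C, x):
    """True if C without x is still 4-connected (C assumed 4-connected)."""
    return is_4_connected(set(C) - {x})

segment = {(0, 0), (1, 0), (2, 0)}
```

On the horizontal segment above, an end point such as (0, 0) can be deleted without breaking connectivity, whereas deleting the middle point (1, 0) splits the set into two components.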
Given a digital convex 4-connected set $C$, we recall that the boundary path $\mathrm{Bd}(C)$ is decomposed into four parts $\mathrm{Bd}_{WN}(C)$, $\mathrm{Bd}_{NE}(C)$, $\mathrm{Bd}_{ES}(C)$ and $\mathrm{Bd}_{SW}(C)$, whose boundary words are respectively $W_{WN}(C)$, $W_{NE}(C)$, $W_{ES}(C)$ and $W_{SW}(C)$. Let us consider removing a point on $W_{WN}(C)$, except its extremities, while preserving the convexity. We observe that if boundary points are removable, then these points and their neighboring points back and forth in the path $\mathrm{Bd}_{WN}(C)$ may form the sub-word $10$ in $W_{WN}(C)$. Removing a point from $C$ means switching a factor of the form $10$ into $01$ in $W_{WN}(C)$ (resp. $0\bar{1}$, $\bar{1}\bar{0}$, $\bar{0}1$ into $\bar{1}0$, $\bar{0}\bar{1}$, $1\bar{0}$ in $W_{NE}(C)$, $W_{ES}(C)$, $W_{SW}(C)$), as seen in Figure 8. For the extremity points, we usually use the same concept; a particular study will be considered further in this paragraph. Note that any point of $\mathrm{Bd}(C)$ whose neighboring points back and forth in $\mathrm{Bd}(C)$ form a sub-word of the same letter can also be removed; in this case, however, we lose the convexity, as we can see in Figure 9.
The switch operator can be defined over the alphabet $\{0, 1, \bar{0}, \bar{1}\}$ so that two consecutive letters are exchanged at a position $k$ as follows.
Definition 9. Given a word $w = a_1 a_2 \cdots a_n$, the switch operator at position $k$, $k < n$, on $w$ is defined by:
$$\mathrm{switch}_k(w) = a_1 \cdots a_{k-1}\, a_{k+1}\, a_k\, a_{k+2} \cdots a_n.$$
We should mention that sometimes the same grid point appears twice in the boundary path $\mathrm{Bd}(C)$, so that the boundary word $W(C)$ contains two consecutive letters of opposite directions. Such letters are always positioned at the junction of two of the four decomposed paths $\mathrm{Bd}_{WN}(C)$, $\mathrm{Bd}_{NE}(C)$, $\mathrm{Bd}_{ES}(C)$ and $\mathrm{Bd}_{SW}(C)$, such that the two consecutive letters are in different paths. We can also remove such a grid point, by replacing the sub-word consisting of two letters of opposite directions by $\epsilon$, as seen in Figure 10.
Over $W(C)$, we can find several points that form with their neighboring points one of the sub-words $10$, $0\bar{1}$, $\bar{1}\bar{0}$, $\bar{0}1$, $\bar{0}0$, $1\bar{1}$, $0\bar{0}$ or $\bar{1}1$. According to Definition …
By Property 1, we can write $\ell_i = 0u1$ and $\ell_{i+1} = 0v1$ where $u$ and $v$ are both palindromes, so that $\ell_i \ell_{i+1} = 0u10v1$. By applying the switch operator on $\ell_i \ell_{i+1}$ at position $k = |\ell_i|$, we obtain $\mathrm{switch}_k(\ell_i \ell_{i+1}) = 0u01v1$, which is also a Christoffel word if $u01v$ is a palindrome, as seen in Property 3; otherwise, we get several other Christoffel words. This effect is studied and detailed in the next subsection, where we describe how the sub-word is updated after removing a point (see also the proof of Theorem 1 in [30]). Finally, when the sub-word corresponding to the point to remove is $\bar{0}0$ (resp. $1\bar{1}$), this means that we are at the intersection between the SW and WN paths (resp. WN and NE paths). Replacing this sub-word with $\epsilon$ means that the position of the point W changes: $\mathrm{Bd}_{WN}(C)$ becomes longer and starts from the first letter after the last $\bar{0}$ in $\mathrm{Bd}_{SW}(C)$. □
Property 4 provides the geometrical interpretation of Theorem 5: any vertex of the convex hull of $C$ corresponds to a Lyndon point on $\mathrm{Bd}(C)$. We note that detecting all the removable points of $C$ is performed in linear time with respect to $|W(C)|$, as it is done via the Lyndon factorization of $W(C)$.
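The switch operator is a plain exchange of two consecutive letters. A minimal sketch on Python strings, using 1-based positions as in the text (the function name is ours):

```python
def switch(w, k):
    """Exchange the letters at 1-based positions k and k + 1 of w."""
    if not 1 <= k < len(w):
        raise ValueError("k must satisfy 1 <= k < len(w)")
    # w[:k-1] is a_1..a_{k-1}; w[k] is a_{k+1}; w[k-1] is a_k.
    return w[:k - 1] + w[k] + w[k - 1] + w[k + 1:]

# Joint of the two Christoffel words 011 and 01: the factor "10" at
# positions 3 and 4 becomes "01".
merged = switch("011" + "01", 3)
```

Here `merged` is `01011`, which is itself the Christoffel word of slope $\frac{3}{2}$. Applying the operator twice at the same position gives back the original word.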

Updating the removable points
Given a digital convex 4-connected set $C$ of $\mathbb{Z}^2$, the first iteration of our deflation process of $C$ starts by making the list of all the possible removable points with respect to $C$. Once a candidate has been chosen among them, the boundary word $W(C)$ is modified and the Lyndon factorization is computed again. This means that candidates can appear in or disappear from the list of removable points.
… 3 ) and w 4 = C( 1419 )C( 4 17 ) be three different factors of the boundary word $W(C)$ of a digital convex 4-connected set $C$. As proved before, the removable points with respect to $C$ are simple and Lyndon points. From the definition, the Lyndon points of each $w_i$ are positioned at the joint between consecutive distinct Christoffel words. If we apply the switch operator on each $w_i$ at the Lyndon point, we obtain: … )) 4 . These examples show that Lyndon points can disappear or newly appear due to the switch operation (see Figure 11).
The above deflation step is iterated in general, and we remark that several removable point candidates exist at each iteration step. Various heuristics can be considered to choose one of the removable points [30]. It is also important to know what happens after each iteration because of the update of the Lyndon factorization. Property 5 shows all the possible effects that arise after removing a point. We can also see that the update of removable points is made locally, which means that the Lyndon factorization before the factor $u$ and after the consecutive factor $v$ is unchanged if the Lyndon point is removed at the joint of $u$ and $v$.
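The local nature of the update can be observed on a small WN-word. The sketch below lists the Lyndon points as the joints between consecutive distinct Lyndon factors, applies the switch at one of them, and recomputes the factorization from scratch (a real implementation would only update the two factors involved). The function names and the example word are ours.

```python
def lyndon_factorization(w):
    """Duval's algorithm with the order 0 < 1."""
    factors, n, i = [], len(w), 0
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def lyndon_points(w):
    """1-based positions of the joints between distinct consecutive factors."""
    points, pos = [], 0
    factors = lyndon_factorization(w)
    for f, g in zip(factors, factors[1:]):
        pos += len(f)
        if f != g:  # joints inside a power l^n are not hull vertices
            points.append(pos)
    return points

def switch(w, k):
    """Exchange the letters at 1-based positions k and k + 1."""
    return w[:k - 1] + w[k] + w[k - 1] + w[k + 1:]

w = "1011010010"                    # factors: 1, 011, 01, 001, 0
before = lyndon_factorization(w)
after = lyndon_factorization(switch(w, 4))  # joint between 011 and 01
```

Switching at position 4 merges `011` and `01` into the single Christoffel word `01011`, while the factors `1` before and `001`, `0` after are unchanged. The Lyndon point at position 6 disappears and the remaining ones are at positions 1, 6 (new joint) and 9.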

Algorithm
We give a short recap of the previous results from the combinatorics-on-words point of view. In order to deflate a digital convex 4-connected set $C$, we start the procedure by considering the boundary word $w$ of $C$. By Property 4, it is sufficient to apply the Lyndon factorization on $w$ in order to get the positions of all the Lyndon points. At each step of the deflation procedure, several points are considered good candidates. In order to know which point is removable, we need to be sure that this point is also a simple point, thanks to Theorem 5. We then choose a certain heuristic in order to store these candidates in a priority queue. Once a point is selected from the queue, we apply the switch operator of Definition 9 at its position. Some local updates must then be made to the Lyndon factorization. This affects the priority queue of removable points, which requires an update too since the Lyndon points may vary. This procedure can be repeated a fixed number of times. Since we remove a single point at each step, the number of iterations is bounded by the cardinality of the digital set. To remove $k$ points from $C$ while preserving digital convexity, the procedure must be iterated $k$ times. Algorithm 1 represents such an iterative pointwise deflation over a 4-connected digital convex set $C$.

Algorithm 1
    ...
        Decrement k
    end while
    return R

Figure 12 shows the deflation procedure applied on a digital convex 4-connected set using the area-change heuristic, which minimizes or maximizes the area difference between the convex hull of $C$ and that of $C \setminus \{p\}$ (see [30] for more details).
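The recap above can be sketched as a loop acting on a single WN-word. This is only an illustration under strong assumptions of ours: it ignores the three other quadrants and the extremities, it does not test simplicity (the set is assumed thick enough for every Lyndon point to be simple), it recomputes the full factorization instead of updating it locally, and the priority is a hypothetical `choose` function defaulting to the first candidate.

```python
from math import gcd

def christoffel(a, b):
    """Christoffel word with a ones and b zeros (a, b co-prime)."""
    n, r, letters = a + b, 0, []
    for _ in range(n):
        letters.append("0" if r + a < n else "1")
        r = (r + a) % n
    return "".join(letters)

def lyndon_factorization(w):
    """Duval's algorithm with the order 0 < 1."""
    factors, n, i = [], len(w), 0
    while i < n:
        j, k = i + 1, i
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        while i <= k:
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def is_wn_convex(w):
    """Every Lyndon factor must be a primitive Christoffel word."""
    return all(
        gcd(f.count("1"), f.count("0")) == 1
        and f == christoffel(f.count("1"), f.count("0"))
        for f in lyndon_factorization(w)
    )

def lyndon_points(w):
    """1-based joints between distinct consecutive Lyndon factors."""
    points, pos = [], 0
    factors = lyndon_factorization(w)
    for f, g in zip(factors, factors[1:]):
        pos += len(f)
        if f != g:
            points.append(pos)
    return points

def switch(w, k):
    """Exchange the letters at 1-based positions k and k + 1."""
    return w[:k - 1] + w[k] + w[k - 1] + w[k + 1:]

def deflate_wn(w, k, choose=min):
    """Apply up to k switches at Lyndon points; `choose` is a stand-in priority."""
    removed = []
    while k > 0:
        candidates = lyndon_points(w)
        if not candidates:
            break
        p = choose(candidates)
        w = switch(w, p)
        assert is_wn_convex(w)  # sanity check of the convexity invariant
        removed.append(p)
        k -= 1
    return w, removed

result, positions = deflate_wn("1011010010", 2)
```

Starting from `1011010010`, the first step switches at position 1 and the second at position 4, giving `0110110010`, whose Lyndon factorization is $(011)^2 \cdot 001 \cdot 0$.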

Inflation of digital convex sets
In this section, we study the pixel-wise inflation of any digital convex 4-connected set $C$ with the preservation of digital convexity, which is more complex than the deflation process seen in the previous section. We first give a necessary condition and then a sufficient condition for characterizing insertable points for such inflation. We will show that such insertable points cannot be at an arbitrary place around $C$. Here also, the key process, if such inflation is applied iteratively, is the update of the Lyndon factorization of the boundary word $W(C)$. The question raised is whether the induced factorization can be made locally and efficiently.
Here, we focus on the closest outer points, each of which is uniquely defined for every Lyndon factor of the boundary word $W(C)$ with length greater than 1 (see Corollary 2). Lyndon factors of length 1 have no closest outer point, and this case will be treated differently later on in this section. In order to inflate $C$, we need to determine the insertable points, which are necessarily closest outer points. In order to verify the insertability of a closest outer point, we have to take into consideration the local and global effects on $W(C)$ after adding this point, which requires the Lyndon factorization update of $W(C)$. In fact, we might lose the digital convexity after adding a closest outer point if we do not choose the right one. We characterize insertable points with some iterative local digital convexity verification. In order to avoid such an expensive iterative verification, we also propose some strong constraints to locally characterize a subset of insertable points.

Definition of insertable points
We define insertable points, which are used for pixel-wise inflation of a digital convex 4-connected set C without losing its digital convexity.
In this section, and unlike Section 3, there is no need to verify whether an insertable point is a simple point or not. We only need to verify the digital convexity, as the simple 4-connectedness of $C \cup \{x\}$ is kept. We recall that the Lyndon factorization of the boundary word $W(C)$ is of the following form: $W(C) = \ell_1^{n_1} \cdots \ell_s^{n_s}$, where all $\ell_i$ are primitive Christoffel words of length greater than or equal to 1, with decreasing slopes, and every $n_i$ is a positive integer. If $n_i > 1$, this signifies that the Christoffel word $\ell_i$ is repeated $n_i$ times in $W(C)$. To find insertable points, we need to consider the following two cases for each Lyndon factor $\ell_i$ of $W(C)$:

The link between insertable points and furthest points
In this part, we show the link between insertable points and furthest points associated with primitive Christoffel words. Let us call closest outer point the diagonally opposite point, in $\overline{C}$, of each furthest point in $C$. We show that for each primitive Christoffel word of length strictly greater than 1, we can find a unique position in the boundary word $W(C)$ where a point is added at its closest outer point. We first present the definition of the split operator applied on primitive Christoffel words, proposed in [13]. This operator helps us to see the local modification of the boundary word $W(C)$, which can sometimes influence the Lyndon factorization of $W(C)$ globally.

Necessary condition for insertable points
Adding a point to a digital convex 4-connected set C at neither its vertical nor horizontal parts of the bounding box of C correlates, in a viewpoint of combinatorics on words, to applying the switch operator, given in Definition 9, to a sub-word of form 01 (resp.10, 01 , 1 0) over the boundary word W WN (C) (resp.W N E (C), W ES (C), W SW (C)); on the vertical and horizontal sides, the letters 010 (resp.10 1, 0 10 , 10 1) replace 1 (resp.0, 1, 0) instead.We aim at inflating C without losing its digital convexity.This means that after adding a point, the updated Lyndon factorization of W (C) must remain made of factors of Christoffel words whose slopes are in decreasing order, as seen in Corollary 3. In general, this condition is not satisfied if we add a point randomly.We will show that any insertable point must be a closest outer point, which is the diagonally opposite point of a furthest point.Note that the reverse is not always true.Definition 11 ([13]).The split operator applied on a primitive Christoffel word w decomposes w into two Christoffel words w + and w − , defined as split(w) := (w + , w − ) such that: where w ′ = switch k (w); with k the position of the furthest point of w, In Figure 13, we show examples of the case that the length of a Christoffel word is equal to 1.Note that, if the Christoffel word w is not primitive, i.e. w = ℓ ni i with n i > 1, we can apply the split operator on any of these ℓ i .
Corollary 2 shows that the split operator applied at the furthest point position of a Christoffel word of length greater than 1 gives two other Christoffel words, which are in decreasing order with respect to their slopes. Lemma 1 also shows that the concatenation of these two new factors coincides with the result of the switch operator, defined previously, applied at the same position. This can be seen in Example 3 and is illustrated in Figure 14.
Lemma 1 ([13]). Let w be a Christoffel word with |w| > 1 such that w = uv, where u and v are the factors of the standard factorization of w. Then we have $\mathrm{split}(w) = (v, u)$, where ρ(v) > ρ(u).
From Theorem 2, we know that u and v, the two factors of the standard factorization of w, are both primitive Christoffel words. Lemma 1 then shows that the result of the split operator consists exactly of the two primitive Christoffel words v and u, taken in the reverse order of the standard factorization.
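The relation stated in Lemma 1 can be checked numerically. The following is a minimal sketch, not the authors' implementation: words are Python strings over '0' and '1', the slope convention (number of 1's over number of 0's) is inferred from Example 3, the standard factorization is found by brute force, and all function names are hypothetical.

```python
from math import gcd

def christoffel(a: int, b: int) -> str:
    """Lower Christoffel word with a letters '1' and b letters '0' (slope a/b)."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def is_christoffel(w: str) -> bool:
    """True if w is a primitive (lower) Christoffel word."""
    a, b = w.count("1"), w.count("0")
    return len(w) > 0 and gcd(a, b) == 1 and w == christoffel(a, b)

def standard_factorization(w: str) -> tuple[str, str]:
    """Brute force: the cut w = u v where both u and v are Christoffel words."""
    for i in range(1, len(w)):
        if is_christoffel(w[:i]) and is_christoffel(w[i:]):
            return w[:i], w[i:]
    raise ValueError("w must be a Christoffel word of length > 1")

def switch(w: str, k: int) -> str:
    """Swap the letters at 1-indexed positions k and k+1."""
    return w[:k - 1] + w[k] + w[k - 1] + w[k + 1:]

w = christoffel(4, 7)              # the word of Example 3
u, v = standard_factorization(w)   # w = u v
# Lemma 1: split(w) = (v, u), and v u is w switched at position |v|.
print(u, v, v + u == switch(w, len(v)))
```

The brute-force search is quadratic and only meant for small illustrative words.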

Example 3. (Example of the split operator)
Let w = 00100100101 be the Christoffel word of slope 4/7 with its standard factorization $w = w^- w^+$, where $w^- = C(1/2)$, $w^+ = C(3/5)$ and $\rho(w^-) < \rho(w^+)$. By applying the split operator on w, we obtain $\mathrm{split}(w) = (w^+, w^-) = (00100101, 001)$. Property 3 ensures the uniqueness of the furthest point for each Christoffel word. Based on that, it is shown in [13,14] that any Christoffel word of length greater than 1 can only be split at this position. In other words, if we need to add a point from the complement of C to C in order to inflate a certain digital line segment of the boundary of C, we must choose a closest outer point associated with this digital line segment.
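Example 3 can be reproduced with a short sketch of the split operator. It assumes that the furthest point of a lower Christoffel word with a letters 1 and b letters 0 is the path point (x, y) maximizing a·x − b·y, where x and y count the 0's and 1's read so far; this is a standard way to measure the distance to the segment, not a formula taken from the text, and the function names are hypothetical.

```python
def christoffel(a: int, b: int) -> str:
    """Lower Christoffel word with a letters '1' and b letters '0' (slope a/b)."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def furthest_position(w: str) -> int:
    """1-indexed position k of the furthest point (assumed: argmax of a*x - b*y)."""
    a, b = w.count("1"), w.count("0")
    best_k, best_d, d = 0, -1, 0
    for i, letter in enumerate(w, start=1):
        d += a if letter == "0" else -b   # a step '0' moves away from the segment
        if d > best_d:
            best_k, best_d = i, d
    return best_k

def split(w: str) -> tuple[str, str]:
    """split(w) = (w+, w-): switch 01 -> 10 at the furthest point, then cut there."""
    if len(w) < 2:
        raise ValueError("this sketch only handles words of length > 1")
    k = furthest_position(w)
    switched = w[:k - 1] + "10" + w[k + 1:]   # letters k, k+1 are '0', '1'
    return switched[:k], switched[k:]

w = christoffel(4, 7)
print(furthest_position(w), split(w))
```

On the word of Example 3, the furthest point is found at position 8 and the two returned words are C(3/5) and C(1/2), in that order.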
So far, we have shown how to split one Christoffel word with or without multiplicity; Lemma 1 ensures that the order of the slopes of the new Christoffel words, obtained after splitting, is still decreasing. The next question is the following: will this order also be preserved around the boundary word? In other words, does this operation affect the order of the slopes around the Christoffel word chosen to be split? These questions are answered in the following part. They help us to give the characterization of insertable points.

Example of closest outer points that are not insertable
Let us consider a digital convex 4-connected set C and its boundary word, whose Lyndon factorization is given by $W(C) = \ell_1^{n_1} \dots \ell_s^{n_s}$. Each factor $\ell_i^{n_i}$ represents one of the polygonal line segments of Conv(C), and each $\ell_i$ is a primitive Christoffel word. As we have seen, in order to add a point around C, we choose one of the factors of W(C) together with its closest outer point. This must be done by respecting the conditions given in Definition 11 and Lemma 1. Two examples are illustrated in Figure 15. During the inflation process, we can also encounter the case where the convexity property is completely lost. In this case, we know that this closest outer point cannot be chosen as an insertable point. This is illustrated in Example 4.
Example 4. (Example of a closest outer point that is not insertable) Let $w_1 = C(30/41)$ and $w_2 = C(5/7)$ be two consecutive Christoffel words on the boundary word of a certain digital convex 4-connected set. From Theorem 4, we have $\rho(w_1) > \rho(w_2)$. If we apply the split operator on $w_2$, i.e. add the closest outer point of $w_2$, we lose the digital convexity, while this is not the case if we apply it on $w_1$. This example indicates that a closest outer point is not always insertable. In the following, we study sufficient conditions for insertable points in detail.
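Example 4 can be checked on the two-factor word w1 w2 alone. The sketch below assumes the criterion recalled in this section (a WN word is convex when every factor of its Lyndon factorization is a Christoffel word), uses Duval's algorithm for the factorization, and locates the furthest point as the maximizer of a·x − b·y; the helper names are hypothetical and the rest of the boundary word is ignored.

```python
from math import gcd

def christoffel(a, b):
    """Lower Christoffel word with a letters '1' and b letters '0'."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def is_christoffel(w):
    a, b = w.count("1"), w.count("0")
    return len(w) > 0 and gcd(a, b) == 1 and w == christoffel(a, b)

def lyndon_factorization(s):
    """Duval's algorithm: non-increasing factorization into Lyndon words."""
    out, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            out.append(s[i:i + j - k])
            i += j - k
    return out

def split(w):
    """Switch 01 -> 10 at the furthest point (argmax of a*x - b*y) and cut there."""
    a, b = w.count("1"), w.count("0")
    k, best, d = 0, -1, 0
    for i, c in enumerate(w, start=1):
        d += a if c == "0" else -b
        if d > best:
            k, best = i, d
    s = w[:k - 1] + "10" + w[k + 1:]
    return s[:k], s[k:]

def is_wn_convex(w):
    """Every Lyndon factor must be a Christoffel word."""
    return all(is_christoffel(f) for f in lyndon_factorization(w))

w1, w2 = christoffel(30, 41), christoffel(5, 7)
after_split_w2 = w1 + "".join(split(w2))   # add the closest outer point of w2
after_split_w1 = "".join(split(w1)) + w2   # add the closest outer point of w1
print(is_wn_convex(w1 + w2), is_wn_convex(after_split_w2), is_wn_convex(after_split_w1))
```

In the first case the left part of the split, C(3/4), has a larger slope than w1, so the two words merge into a Lyndon factor that is not a Christoffel word; in the second case the three slopes 11/15, 19/26 and 5/7 remain decreasing.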

Characterization of insertable points and Lyndon factorization update
As seen before, not all closest outer points correspond to insertable points. In order to verify such insertability, we need to update, after adding a closest outer point, the Lyndon factorization of the boundary word W(C). As mentioned before, we might need to do some propagation in certain cases. This propagation can be made on the right and/or left side of the Christoffel word where we split, and can reach the beginning or the end of the sub-word in the worst case. We give the characterization of an insertable point in Theorem 6. The proof will be given at the end of this section, since we need to introduce some notions first. Definition 12 introduces two types of insertability verification for grid points, one on the left side and the other one on the right side. A point is insertable on the left (resp. right) if we are able to concatenate a finite number of the previous (resp. next) consecutive Christoffel words. This concatenation is called the propagation and is possible under some conditions, as shown in Definition 12.
Definition 12. Given a digital convex 4-connected set C, let us consider the boundary word W(C) and its Lyndon factorization $W(C) = \ell_1^{n_1} \dots \ell_m^{n_m}$. Let x be the closest upper point, in the complement of C, of the j-th Lyndon factor. We say that: • x is insertable on the left if there exists some non-negative integer k )). The point added is insertable on the right since we can do a finite propagation: 19  15 ). After this finite propagation, we end up with C( 21 )R 2 . For more details, see Figure 16.
Theorem 6 is one of the main results of this paper. It provides the characterization of the insertability of such a closest upper point x in the complement of C.
Theorem 6. Given a digital convex 4-connected set C, let x be the closest upper point, in the complement of C, with respect to one of the Lyndon factors of the boundary word W(C). Then, x is insertable if and only if x is insertable on both the left and right sides.
Theorem 6 characterizes insertable points. Using this characterization, we can obtain the positions of all the insertable points to inflate a digital convex set while preserving its digital convexity. Once one of these eligible candidates is chosen and added, we update W(C), which also updates the list of insertable points for the next step of inflation. Theorem 6 and Definition 12 show all the possible cases we can face after adding an insertable point. In order to prove this theorem, we first need to define the following morphism that maps the set of Christoffel words to the set of binary words. From the lexicographic order, we have that $C(3/5) < C(2/3)$ and $w_1 < w_2$. By applying Definition 13 and Property 6, we get the following words: Lemma 2 studies the effect of the split operator on a Christoffel word with multiplicity higher than 1.
Proof: First, we prove that the two factors $\ell_j^{i-1}\ell_j^+$ and $\ell_j^-\ell_j^{n_j-i}$ are Christoffel words. This is obtained by applying the Christoffel morphism $\Theta_A$ over $A = (\ell_j^-, \ell_j^+)$ on the words $(01)^{i-1}1$ and $0(01)^{n_j-i}$, respectively. Second, we must prove that $\ell_j^-\ell_j^{n_j-i} < \ell_j^{i-1}\ell_j^+$. This inequality comes from the fact that $\Theta_A$ is an increasing morphism, as seen in Property 6, since $\ell_j^- < \ell_j^+$ and $0(01)^{n_j-i} < (01)^{i-1}1$. □
When the length of a Christoffel word is equal to one, the split operator gives two Christoffel words over different binary alphabets, such as split(1) = (1 0, 0). These two new Christoffel words cannot be compared with each other, as they do not belong to the same part among the four parts of the boundary path (e.g. the WN-path). When they belong to different parts of the boundary word of a digital convex 4-connected set C, the right insertability of the associated point is verified in the initial binary boundary sub-word, while the left insertability is verified in the previous one. In the case of split(1) = (1 0, 0), the right insertability verification is made in $W_{WN}(C)$, while the left one is made in $W_{SW}(C)$.
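The first step of this proof can be illustrated on a concrete factor. The sketch below takes ℓ = C(3/5) with multiplicity 4 and splits its third occurrence; these values are chosen for illustration only, and the function names are hypothetical. The morphism is implemented as a plain letter substitution.

```python
def christoffel(a: int, b: int) -> str:
    """Lower Christoffel word with a letters '1' and b letters '0'."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def theta(base: tuple[str, str], w: str) -> str:
    """Christoffel morphism: substitute base[0] for '0' and base[1] for '1'."""
    return "".join(base[0] if letter == "0" else base[1] for letter in w)

# l = C(3/5) with standard factorization l = l_minus l_plus.
l_minus, l_plus = christoffel(1, 2), christoffel(2, 3)
l = l_minus + l_plus
A = (l_minus, l_plus)

n_j, i = 4, 3                          # split the 3rd occurrence of l in l^4
left = l * (i - 1) + l_plus            # l^(i-1) l+
right = l_minus + l * (n_j - i)        # l- l^(n_j-i)

# Both factors are images of Christoffel words under theta ...
print(left == theta(A, "01" * (i - 1) + "1"), right == theta(A, "0" + "01" * (n_j - i)))
# ... hence Christoffel words themselves, in decreasing order.
print(left == christoffel(8, 13), right == christoffel(4, 7), right < left)
```

Python's string comparison is lexicographic with '0' < '1', which matches the order used on binary words here.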
To simplify the proof of Theorem 6, we treat only the right insertability as the left one can be proved similarly.We consider all the possible situations that can arise after applying the split operation.

Proof of Theorem 6:
Let us consider the boundary word W(C) and its Lyndon factorization $W(C) = \ell_1^{n_1} \dots \ell_m^{n_m}$. Let x be the closest upper point, in the complement of C, of the Lyndon factor $\ell_i$ for $i \in [1, m]$. We compare $R_0$ to $\ell_{i+1}$ and get the following cases:
1. If $\ell_{i+1} < R_0$, no further verification is needed and the digital convexity is maintained.
2. If $\ell_{i+1} = R_0$, the convexity is also preserved, since the line segments discretized by $R_0$ and $\ell_{i+1}^{n_{i+1}}$ will be aligned, and the multiplicity of $\ell_{i+1}$ is increased by 1 in the factorization of w.
3. If $\ell_{i+1} > R_0$ and $\ell_{i+1} = R_0^{m_i} \ell_i$, then $R_1$ is set and is now compared to $\ell_{i+2}^{n_{i+2}}$ using the same reasoning. This propagation continues until the right insertability verification ends with one of the cases 1, 2 and 4, or until the end of $W_{WN}(C)$ is reached. Hence the sequence $R_k$ is constructed.
4. If $\ell_{i+1} > R_0$ and $\ell_{i+1} \neq R_0^{m_i} \ell_i$, then we lose the decreasing order, namely the WN-convexity. Hence, the point x is not insertable.
A similar proof can be given for the left insertability. Note that if the length of $\ell_i$ is equal to one, the left insertability should be verified in the previous binary boundary sub-word. □
Note that the propagation test is limited by the extremity of the WN side. In Figure 16, we show an example of the propagation following Theorem 6.
With Theorem 6, we have the characterization of the insertable pixels over W(C). In Examples 7, 8 and 9, we give several numerical cases showing the inflation process.
so that $\ell_3 = (\ell_2^- \ell_2^2)^3 \ell_2$, and consider splitting the third word of the second factor C(3/5). We get: whose Lyndon factorization is $(C(3/4))^4\, C(8/13)\, C(31/53)$.

Strong sufficient condition to insertable points
Given a digital convex 4-connected set C, so far, for each closest upper point of C, we must check the conditions given in Theorem 6 in order to know whether it is an insertable point or not. However, checking whether a point is insertable using Theorem 6 may require processing the whole boundary word W(C), due to the propagation process for the left and right insertability verification (see Definition 12). In this section, we consider some extra local constraints on a chosen Lyndon factor, which ensure that the associated closest upper point is always insertable without the propagation process. To this end, the authors in [7] restricted the study to the case where all the factors of the Lyndon factorization of the boundary word are primitive. They choose the closest upper point of the segment that corresponds to a primitive Christoffel word of maximal length with respect to the previous and next primitive Christoffel words. With this constraint, removing the corresponding closest upper point of the locally longest Christoffel word always preserves the digital convexity. This result was neither proved nor generalized to the case where the factors of the boundary word are not primitive. We provide here the general result with a full study of all the possible cases and updates.
In fact, we would like to show that splitting a locally longest primitive Christoffel word of W(C) guarantees inflation at any step, while preserving the digital convexity. In other terms, the closest upper points of all the primitive Christoffel words of local maximal length correspond to insertable points. Before reaching this theorem, we first prove that if we have three consecutive decreasing Christoffel words such that the middle one is longer than its neighbors, then its split preserves this decreasing order locally.
From Definition 13 and Property 6, we recall that the Christoffel morphism $\Theta_B$ induces an increasing bijection between the set of Christoffel words and itself. Based on this, we can get the following corollary.

Definition 14. Let $W(C) = \ell_1^{n_1} \dots \ell_s^{n_s}$ be the Lyndon factorization of a digital convex set C, and let $\ell_j$, $1 \le j \le s$, be one of the Christoffel words, seen in a cyclic way at j = 1 and j = s. We say that $\ell_j$ has a local maximal primitive length

By inflating the Christoffel word $\ell_j$ with a local maximal primitive length at its closest upper point, we preserve the digital convexity, thanks to Theorem 7. □

This means that adding this constraint ensures that this particular closest upper point is an insertable one. After inflating C with respect to this strong constraint, we can only face one out of four cases when we update W(C). They are given in Lemma 3.

Lemma 3. Let $\ell_j$ be one of the Christoffel words of local maximal primitive length in the boundary word $W(C) = \ell_1^{n_1} \dots \ell_s^{n_s}$ of a 4-connected digital convex set C. By applying the split operator on the i-th $\ell_j$, for any $1 \le i \le n_j$, with $\mathrm{split}(\ell_j) = (\ell_j^+, \ell_j^-)$, the Lyndon factorization can be updated by the following local replacement:
1. if not ($i = 1$ and $\ell_{j-1} = \ell_j^+$) and not ($i = n_j$ and $\ell_{j+1} = \ell_j^-$): $\ell_j^{n_j} \mapsto (\ell_j^{i-1}\ell_j^+)(\ell_j^-\ell_j^{n_j-i})$;
2. if ($i = 1$ and $\ell_{j-1} = \ell_j^+$) and not ($i = n_j$ and $\ell_{j+1} = \ell_j^-$): $\ell_{j-1}^{n_{j-1}}\ell_j^{n_j} \mapsto \ell_{j-1}^{n_{j-1}+1}(\ell_j^-\ell_j^{n_j-1})$;
3. if not ($i = 1$ and $\ell_{j-1} = \ell_j^+$) and ($i = n_j$ and $\ell_{j+1} = \ell_j^-$): $\ell_j^{n_j}\ell_{j+1}^{n_{j+1}} \mapsto (\ell_j^{n_j-1}\ell_j^+)\ell_{j+1}^{n_{j+1}+1}$;
4. if ($i = 1$ and $\ell_{j-1} = \ell_j^+$) and ($i = n_j$ and $\ell_{j+1} = \ell_j^-$): $\ell_{j-1}^{n_{j-1}}\ell_j\ell_{j+1}^{n_{j+1}} \mapsto \ell_{j-1}^{n_{j-1}+1}\ell_{j+1}^{n_{j+1}+1}$.

Proof: The proof of this lemma relies on the following two points:
1. Proving that $u = \ell_j^{i-1}\ell_j^+$ and $v = \ell_j^-\ell_j^{n_j-i}$ are Christoffel words. This follows from Lemma 2 by taking the base $B = (\ell_j^-, \ell_j^+)$. We obtain $u = \Theta_B((01)^{i-1}1)$ and $v = \Theta_B(0(01)^{n_j-i})$.
2. Proving the following inequalities: $\ell_{j-1} \ge \ell_j^{i-1}\ell_j^+ > \ell_j^-\ell_j^{n_j-i} \ge \ell_{j+1}$.
• The inequality in the middle comes from the fact that $\Theta_B$, defined earlier, is increasing and $0(01)^{n_j-i} < (01)^{i-1}1$.
• If the last inequality is not correct, we have $\ell_j^- \le \ell_j^-\ell_j^{n_j-i} \le \ell_{j+1} < \ell_j$. Then $\ell_{j+1}$ is a Christoffel word in the base B and the condition does not allow the equality $\ell_{j+1} = \ell_j$; in this case it has to be longer than $\ell_j$, contradicting the condition that $\ell_j$ is longer than $\ell_{j+1}$.
• The first inequality is treated symmetrically to the previous one.
□
From Lemma 3, we can remark that in all four cases, the digital convexity is preserved. The inflated segment of conv(C) is replaced by two others. In the second case, $\ell_j^+$ is the same as $\ell_{j-1}$ and a concatenation from the left side arises. Similarly and by symmetry, in the third case, $\ell_j^-$ is the same as $\ell_{j+1}$ and a concatenation from the right side arises. In the last case, $\ell_j^+$ (resp. $\ell_j^-$) is the same as $\ell_{j-1}$ (resp. $\ell_{j+1}$), and concatenations from both sides arise. In other words, we lose one of the segments of conv(C). In all the cases, we can note that the propagation does not exceed the neighboring Christoffel words $\ell_{j+1}^{n_{j+1}}$ and $\ell_{j-1}^{n_{j-1}}$.

Algorithm
We now give a general algorithm to inflate C based on the previous results, for the general case or the case with the strong constraint. In Algorithm 2, we determine the list of all the possible insertable points obtained by applying either Theorem 6 or Theorem 7. At each iteration, we choose the point with the highest priority in the queue and we apply the necessary updates on the Lyndon factorization. We recall that for the general case, this update can lead to a propagation, while for the case with the strong constraint, this update is local. Two different and more detailed algorithms, one for each of the cases, will be given in future work.
By using this approach for inflation, if we keep choosing the side with primitive local maximal length at each iteration, we remark that the horizontal and vertical segments of conv(C) will not be chosen. In fact, the discretizations of these segments are factors of the form $0^p$, $1^q$, $\bar{0}^r$ or $\bar{1}^s$ for certain integers p, q, r or s. This means that their primitive length is always equal to 1. Hence, at any step of the inflation, they cannot be selected. In this case, the inflation first happens on the WN, NE, ES and SW sides, without considering the horizontal and vertical sides, which leads to an octagonal shape as seen in Figure 17. Once we are at this step, all the remaining factors are of length 2 and 1. The inflation continues until we reach the form $1^k 0^l \bar{1}^k \bar{0}^l$ for certain integers l, k, which is the rectangular bounding box.

Algorithm 2
4: Pull the highest-priority insertable point p from P and add p to I
5: Let $\ell_j$ be the associated primitive Christoffel word of p in F
6: Compute split($\ell_j$) and update F and P following Lemma 2 or 3
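The inflation loop can be sketched on a single WN word. This is a simplified illustration, not Algorithm 2 itself: it works on one binary side only, recomputes the whole Lyndon factorization with Duval's algorithm instead of applying the local updates of Lemma 2 or 3, uses the primitive length of a factor as an assumed priority, and accepts a split only when every Lyndon factor of the result is a Christoffel word. All names are hypothetical.

```python
from math import gcd

def christoffel(a, b):
    """Lower Christoffel word with a letters '1' and b letters '0'."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def is_christoffel(w):
    a, b = w.count("1"), w.count("0")
    return len(w) > 0 and gcd(a, b) == 1 and w == christoffel(a, b)

def lyndon_factorization(s):
    """Duval's algorithm."""
    out, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            out.append(s[i:i + j - k])
            i += j - k
    return out

def is_wn_convex(w):
    return all(is_christoffel(f) for f in lyndon_factorization(w))

def split(w):
    """Switch 01 -> 10 at the furthest point (argmax of a*x - b*y) and cut there."""
    a, b = w.count("1"), w.count("0")
    k, best, d = 0, -1, 0
    for i, c in enumerate(w, start=1):
        d += a if c == "0" else -b
        if d > best:
            k, best = i, d
    s = w[:k - 1] + "10" + w[k + 1:]
    return s[:k], s[k:]

def inflate_wn(w):
    """Repeatedly split the longest splittable factor; return all intermediate words."""
    steps = [w]
    while True:
        factors = lyndon_factorization(w)
        # Candidates: factors of length > 1, longest first (assumed priority).
        order = sorted((j for j, f in enumerate(factors) if len(f) > 1),
                       key=lambda j: (-len(factors[j]), j))
        for j in order:
            plus, minus = split(factors[j])
            candidate = "".join(factors[:j]) + plus + minus + "".join(factors[j + 1:])
            if is_wn_convex(candidate):
                w = candidate
                steps.append(w)
                break
        else:
            return steps   # no factor can be split any more

print(inflate_wn("101001"))
```

Starting from the WN word 101001 of Figure 5, the loop stops on a word of the form 1^k 0^l, in line with the final shape described above. Each accepted step moves one letter 1 in front of one letter 0, so the loop always terminates.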

Conclusion
We have proposed a combinatorics-on-words study of the points that can be chosen to inflate and deflate a 4-connected digital convex set C while preserving its digital convexity property. The approach relies on Christoffel words and Lyndon factorizations of the boundary word of C, which is represented on a four-letter alphabet. Some update procedures have to be applied to these factors in order to maintain the Lyndon factorizations while adding or removing specific points. For both operations, we have characterized for C the set of points that can be inserted or removed while maintaining the convexity. For the deflation process, the updates and modifications are local. In contrast, for the inflation process, the updates, using the general procedure, can be global. In the worst case, these updates do not go past the side where the inflation is applied. When adding the strong condition on the choice of the insertable points, corresponding to the local maximal primitive Christoffel word, the updates during this procedure become local.
In this work, we have focused on the characterization and theoretical properties of geometrical set operations on boundary word factorizations.The algorithmic details and optimization can be found in [31].Another question that can arise is to determine if there exists an optimal heuristic for deflating a digital convex set.The choice of the heuristic is crucial, since for each choice we get a different convergent shape.Another stimulating perspective would be to apply these algorithms on non-convex shapes by studying the locally convex boundary using combinatorics on words [12,27].

Figure 1: The green points are the removable points, which are the vertices of the convex hull of a digital convex set C. The red points are the insertable points, some of which are the exterior Bézout points of the segments of the convex hull.

Figure 3: An example of a digital set C that is digitally convex but not 4-connected. The orange points represent the elements x ∈ C, while the blue line represents Conv(C).

Figure 4: A digital convex set C is represented by the orange points, and its boundary path Bd(C) is represented by the ordered orange points linked by black line segments. The boundary word W(C) is then given by 10100100 10 11 0 10010100 1 011 0, while the blue polygonal line represents the convex hull Conv(C).

Figure 5: The four parts WN, NE, ES and SW of the boundary of a digital convex set C are represented in four different colors. The word that codes the WN-path is w = 101001.

Definition 5. Let C ⊂ Z² be a finite 4-connected, digitally convex set. The points on Bd(C) that separate different Lyndon factors of W(C) are called Lyndon points of C.

Figure 7 shows the result of Theorem 4 and Property 4 for the digitally convex set with boundary word W(C) = 10100100 10 11 0 10010100 1 011 0.

Figure 8: a) A is a point in the WN-path whose neighboring points, back and forth in $Bd_{WN}(C)$, form the sub-word 10. b) Removing the point A corresponds to switching the sub-word 10 into 01, i.e. applying the switch operator at the position of the point A.

Figure 9: a) B is a point in the WN-path whose neighboring points, back and forth in $Bd_{WN}(C)$, form the sub-word 00. b) The digital convexity is lost when removing the point B.

Figure 10: a) The digital set C whose boundary word is W(C) = 01100000 111010100 1 011 0. The point A ∈ C creates the sub-word 00 in W(C). b) Removing A means replacing this factor by ϵ.

they correspond to Lyndon points; we recall that Lyndon points are geometrically the vertices of Conv(C), each of which is at the end of a factor of the Lyndon factorization of W(C), according to Property 4. However, not all Lyndon points are removable. Indeed, the switch operation on such factors may lead to losing the connectivity. To avoid this problem, Theorem 5 gives the characterization of removable points for any C. In order to prove this theorem, we recall that the Christoffel words, which are the one-balanced Lyndon words of the Lyndon factorization of W(C) (Theorem 4), have the following forms: 0u1, 1k0, 0ℓ 1 and 1m 0, where u, k, ℓ and m are palindromes in $W_{WN}(C)$, $W_{NE}(C)$, $W_{ES}(C)$ and $W_{SW}(C)$ (Property 1).

Theorem 5. Given a digital convex 4-connected set C of Z², a point x ∈ C is removable if and only if x is a simple point with respect to C and a Lyndon point of the boundary Bd(C).

Proof: Let us consider the boundary word W(C) of C, which is decomposed by the Lyndon factorization such that $W(C) = \ell_1^{n_1} \ell_2^{n_2} \dots \ell_s^{n_s}$. Since C is digitally convex, each $\ell_i$, $1 \le i \le s$, is a Christoffel word (Theorems 3 and 4). We give the proof only for the binary sub-word $W_{WN}(C)$ of W(C); similar proofs hold for the three other sub-words $W_{NE}(C)$, $W_{ES}(C)$ and $W_{SW}(C)$. As mentioned above, removing a point from the boundary means applying the switch operator at a Lyndon point, so that the corresponding sub-word 10 is replaced by 01, or 00, 1 1 by ϵ. The simplicity of a point x guarantees the 4-connectivity of C \ {x}. If a point is not a Lyndon point, i.e. the boundary sub-word belongs to one of the Lyndon factors $\ell_i$ of $W_{WN}(C)$, it cannot be removed, as we lose the WN-convexity. Let us consider any boundary sub-word 10 such that 1 appears at the end of one of the Lyndon factors in $W_{WN}(C)$, and 0 appears at the beginning of the consecutive Lyndon factor. If these two Lyndon factors are identical, then switching the pair 10 makes us lose the WN-convexity. Now, let us focus on sub-words 10 that are obtained from two consecutive distinct Lyndon factors $\ell_i \ell_{i+1}$ of $W_{WN}(C)$.

Figure 11: Before and after the pixel-wise deflation of (a) w 1 , (b) w 2 , (c) w 3 and (d) w 4 and their Lyndon factorization (left and right); a) and b) no removable point remains after the update; c) a removable point arises at a different position; d) several new removable points appear.

Algorithm 1: Point-wise deflation
Input: a digital convex 4-connected set C, a number k of points to remove
Output: a sequence R of removed points
1: Compute the Lyndon factorization F of the boundary word of C
2: Insert all the Lyndon points in a priority queue L
3: while k > 0 do
4:   Pull the highest-priority simple point p from L and add p to R
5:   Let u and v be the two distinct Christoffel words of F around p
6:   Compute $w = \mathrm{switch}_{|u|}(uv)$
7:   Compute the Lyndon factorization of w and update F and L
8:
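Steps 5 to 7 can be illustrated on a single WN word. The sketch below is a simplification under stated assumptions: it only handles the binary WN side, it does not test whether the removed point is simple (so 4-connectivity is not checked), it has no priority queue, and it recomputes the Lyndon factorization from scratch with Duval's algorithm. The function names are hypothetical.

```python
from math import gcd

def christoffel(a, b):
    """Lower Christoffel word with a letters '1' and b letters '0'."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def is_christoffel(w):
    a, b = w.count("1"), w.count("0")
    return len(w) > 0 and gcd(a, b) == 1 and w == christoffel(a, b)

def lyndon_factorization(s):
    """Duval's algorithm."""
    out, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            out.append(s[i:i + j - k])
            i += j - k
    return out

def lyndon_cuts(w):
    """1-indexed positions |u| where two distinct consecutive Lyndon factors u, v meet."""
    factors = lyndon_factorization(w)
    cuts, pos = [], 0
    for u, v in zip(factors, factors[1:]):
        pos += len(u)
        if u != v:
            cuts.append(pos)
    return cuts

def remove_point(w, k):
    """Apply the switch operator at position k: the sub-word 10 becomes 01."""
    if w[k - 1:k + 1] != "10":
        raise ValueError("position k must carry the sub-word 10")
    return w[:k - 1] + "01" + w[k + 1:]

w = "101001"                        # WN word of Figure 5
for k in lyndon_cuts(w):
    deflated = remove_point(w, k)
    factors = lyndon_factorization(deflated)
    print(k, deflated, factors, all(is_christoffel(f) for f in factors))
```

For this word, both Lyndon points lead to a word whose Lyndon factors are again Christoffel words, which is what the WN part of Theorem 5 predicts.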

Figure 12: The deflation process of a digital convex 4-connected set (a) represented after 150 and 250 iterations respectively in (b) and (c) using the heuristic of area-change.

Proposition 1 ([14]). Let w be a Christoffel word of length n and k the position of fu(w).
1. The words u = w[1, k − 1]1 and v = 0w[k + 2, n] are two Christoffel words, where the notation w[i, j] indicates the subword of w from position i to j, with 1 ≤ i ≤ j ≤ n.
2. For each non-negative integer k′ different from k, the words u′ = w[1, k′ − 1]1 and v′ = 0w[k′ + 2, n] are not both Christoffel words.
Proof: Let w be a Christoffel word of length n and k the position of fu(w). From Property 1, we know that w = 0p1, where p is a palindrome. From Corollary 2, we can write w in a unique way as $w = p_1 p_2$, where $p_1$ and $p_2$ are two palindromes with $|p_1| = k$. This gives w = 0s01t1, where the length of s is k − 2. Therefore, u = w[1, k − 1]1 = 0s1 and v = 0w[k + 2, n] = 0t1 are two Christoffel words. The uniqueness of the position of fu(w) ensures the uniqueness of u and v. □
A useful consequence follows.
Corollary 4. Let w, u and v be as defined in Proposition 1. It holds that ρ(u) > ρ(v).

Figure 15: Figure (a) shows that the inflation at this position maintains the convexity. Figure (b) shows that there is an additional step of propagation to verify the convexity on the left side of the segment.

Definition 13 ([7]). Given an ordered pair $B = (C(a/b), C(c/d))$ of Christoffel words over $\{0, 1\}^*$, we define the Christoffel morphism $\Theta_B$ from the set of Christoffel words to $A^*$ such that $\Theta_B(0) = C(a/b)$ and $\Theta_B(1) = C(c/d)$. Being a morphism, $\Theta_B(uv) = \Theta_B(u)\Theta_B(v)$, and it is ordered, as we can see in the following property.
Property 6 ([7]). If $C(a/b) < C(c/d)$, then the Christoffel morphism $\Theta_B$ with $B = (C(a/b), C(c/d))$ is an increasing morphism. In other words, for any two Christoffel words $w_1$ and $w_2$ such that $w_1 < w_2$, we have $\Theta_B(w_1) < \Theta_B(w_2)$.
Example 6. Let $B = (C(3/5), C(2/3))$, $w_1 = C(3/4)$ and $w_2 = C(3/2)$.
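The data of Example 6 can be run through a small sketch of the Christoffel morphism. The morphism is implemented as a letter substitution on Python strings, and the order on words is Python's lexicographic string order with '0' < '1'; the function names are hypothetical.

```python
def christoffel(a: int, b: int) -> str:
    """Lower Christoffel word with a letters '1' and b letters '0' (slope a/b)."""
    n = a + b
    return "".join(str((i + 1) * a // n - i * a // n) for i in range(n))

def theta(base: tuple[str, str], w: str) -> str:
    """Theta_B: replace each '0' by base[0] and each '1' by base[1]."""
    return "".join(base[0] if letter == "0" else base[1] for letter in w)

B = (christoffel(3, 5), christoffel(2, 3))
w1, w2 = christoffel(3, 4), christoffel(3, 2)

image1, image2 = theta(B, w1), theta(B, w2)
print(B[0] < B[1], w1 < w2)          # hypotheses of Property 6
print(image1 < image2)               # conclusion of Property 6
print(image1 == christoffel(18, 29), image2 == christoffel(12, 19))
```

The last line checks that, for this base, the two images are themselves Christoffel words, of slopes 18/29 and 12/19.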

Corollary 5. Any Christoffel word C such that $C(a/b) < C < C(c/d)$ satisfies $|C| \ge |C(a/b)| + |C(c/d)| > \max(|C(a/b)|, |C(c/d)|)$.
Proof: Let $B = (C(a/b), C(c/d))$, and let $\Theta = \Theta_B$ be the Christoffel morphism defined by $\Theta(0) = C(a/b)$ and $\Theta(1) = C(c/d)$. If $C(a/b) < C < C(c/d)$, then $C = \Theta(U)$, where U is a Christoffel word other than 0 or 1. Hence, U contains at least one letter 0 and one letter 1. Therefore, the image contains two disjoint factors $C(a/b)$ and $C(c/d)$. □
Corollary 5 allows us to give an algorithm that inflates a digital convex set C while preserving its digital convexity. We start by defining the notion of maximal primitive length of a Christoffel word with respect to the previous and next one.

Theorem 7. Given a digital convex 4-connected set C, if x, in the complement of C, is the closest upper point of a Christoffel word of local maximal primitive length in the Lyndon factorization of the boundary word W(C), then x is an insertable point.
Proof: The proof is deduced from Corollary 5.

Corollary 6. Splitting the Christoffel word of the Lyndon factorization that has the local maximal primitive length limits the propagation, which is bounded by the previous and next factors.

Figure 17: Figure b) shows the inflation of the digital convex set represented in a), obtained by applying the algorithm with the stronger constraint based on the local maximal primitive length of a Lyndon factor.

Table 2: The Christoffel word w of slope 5/8 with the remainder sequence $(r_i)_{0 \le i \le 13}$.