Multi-Robot Active Sensing for Bearing Formations

This paper proposes a novel distributed active sensing control strategy for formations of drones measuring relative bearings. To be able to localize their relative positions from bearing measurements, the drone formation must satisfy specific Persistency of Excitation (PE) conditions. We propose a solution that can meet these PE conditions by maximizing the information collected from onboard cameras via a distributed gradient-based algorithm. Additionally, we also consider presence of a (concurrent) position-based formation control task using Quadratic Program-based control with Control Lyapunov Functions (CLFs). The results show that the inclusion of active sensing in the formation control law enhances the localization accuracy and, as a consequence, the precision of reaching the desired formation. The improvement is especially important when the underlying graphs are not Infinitesimally Bearing Rigid (IBR), as it can be expected.


I. INTRODUCTION
Cooperative localization from relative sensing is a relevant topic in the multi-robot community [1]- [5], it is especially relevant in GPS-denied environments, such as indoors.One specific area of interest is cooperative localization from bearing measurements [2]- [5], primarily due to the ease of retrieving relative bearings using onboard cameras.Much of the previous work in this domain has been built around the notion of Infinitesimal Bearing Rigidity (IBR) [2], [3].IBR characterizes the conditions, primarily related to graph topology, under which a group of drones measuring constant relative bearings can localize themselves up to a global translation and scaling factor.In case of constant bearings, to resolve the scale ambiguity, a distance measurement among a pair of robots is necessary.
Maintaining formation rigidity can be, however, quite restrictive in terms of motion flexibility, especially when the presence of edges among robots depends on sensing constraints such as limited range, occlusions, and limited field of view (FoV).Recently, in [5], [6] the notion of Bearing Persistently Exciting (BPE) formations has been introduced, where at least a subset of the relative bearings is time-varying.By leveraging the concept of PE, this allows resolving the scale ambiguity and to relax the IBR assumption.A distributed observer is also proposed with Uniform Exponential (UE) convergence under the condition of the formation being BPE.
However, there remains the problem of how to ensure the BPE of the formation.Although a coordinated rotation was proposed as a reference trajectory satisfying the BPE condition [5], this strategy may not be desirable in practical applications where more generic motions are needed.An alternative approach proposed in this work is to instead make use of an active sensing strategy that enables the optimization of robot motion for gathering informative measurements about the system state [7]- [9].This can be exploited for producing a group motion that can (i) actively satisfy the necessary BPE conditions and (ii) be generic enough for fulfilling at the same time other tasks of interest related to the particular mission.
In the past, active sensing has been applied to the cooperative localization problem, e.g., in [10]- [12].where distance measurements together with rigidity and the presence of multiple anchors (i.e.robots knowing their poses) was assumed.An active sensing strategy was then used to steer the formation towards the best geometry from the localization point of view, without considering any PE condition.In [13], the authors considered the active information gathering for ensuring observability of a multi-robot system with two anchors and range measurements.In our previous work [14], we considered a formation with relative distance measurements, and we maximized the observability of the system over a future prediction horizon.The proposed method could be made distributed but at the cost of being suboptimal w.r.t. the centralized one and it was computationally heavy.In [15], an active sensing strategy to estimate the scale of an IBR formation was proposed.Most of the previous works mainly consider relative distance measurements instead of bearings, which are instead quite interesting from an application point of view.Some of these previous works also solely focus on optimizing the geometric arrangement of the group instead of the robots trajectories.In addition, the assumption of rigid formations is also often exploited (which, as explained, can be quite restrictive in terms of group mobility when considering sensing constraints).
In this paper, we propose a gradient-based control law that can address the above shortcomings.We consider maximization of the group observability by optimizing a global information measure: the derivative of the minimum nonzero eigenvalue of the weighted Observability Gramian (OG).This approach is simpler and computationally cheaper than receding horizon strategies and it allows maximizing a global quantity in a distributed way.Additionally, we address the coupling of the active cooperative localization strategy with presence of an independent primary task (formation control) whose performance is however enhanced by the active sensing action.This is obtained by leveraging distributed Quadratic Programming (QP)-based Control Lyapunov Functions (CLFs) [16].The combination of active sensing and CLFs has previously been successfully applied to single robot scenarios employing a receding horizon strategy [9].We believe that the approach we propose is a promising direction as it allows to achieve cooperative tasks while optimizing the trajectories to enhance the observability of the formation.
The rest of the paper is organized as follows: Section II introduces the main modeling assumptions and the employed observer.Section III presents a gradient-based active sensing controller.Section IV outlines a strategy to integrate active sensing into a position-based formation control task, accommodating not necessarily bearing rigid formations.Section V presents statistical results on various graph topologies, and finally, Sect.VI concludes the paper.

A. Graph Theory
We consider a group of N connected drones.The drone sensing and communication interactions are modeled by a constant connected and undirected graph G := (V, E), where V = {1, ..., N } is the set of vertices and E ⊆ V × V is the set of edges.Two drones i and j are called neighbors if (i, j) ∈ E. The set of neighbors of drone i is denoted by N i := {j ∈ V|(i, j) ∈ E} and its cardinality is |N i |.With a slight abuse of notation, we may refer to variables related to the edge e k := (i, j) with a subscript k or ij.The incidence matrix for an arbitrary orientation of the graph is denoted by E, with 0 otherwise.The Laplacian matrix associated to the graph G can be computed as L = EE T , while a weighted Laplacian is given by L W = EWE T , with W ⪰ 0 (i.e.positive semidefinite).The Laplacian matrix is positive semidefinite with L1 N = 0, i.e. it has zero row-sum, and it has positive diagonal and non-positive off-diagonal entries.Its eigenvalues are 0 = µ 1 ≤ µ 2 ≤ ... ≤ µ N with eigenvectors {1 N / √ N , v 2 , ..., v N } and µ 2 > 0 if and only if the graph is connected.

B. Formation Model
Each drone state is represented by its position p i ∈ R 3 and we refer to the stack of the drones position as T ∈ R 3N .The positions and linear velocities of the drones are expressed in a common inertial frame and the velocity of drone i is indicated as u i ∈ R 3 while the stack of the drones velocities as The dynamics of each drone is modeled as a simple single integrator: Each drone is able to measure the relative bearing with respect to its neighbors, i.e.: where p ij = p j − p i and d ij = ∥p ij ∥ 2 .We also define the stack of bearing measurements . The inclusion of sensing limitations is left as a future work.

C. Localization of Bearing Formations
The matrix-valued Laplacian [17] associated to the classical position-based formation control is given as L 3 := E 3 E T 3 [18], with E 3 = E ⊗ I 3 where ⊗ is the Kronecker product and I 3 the identity matrix.If the graph is connected, then rank(L 3 ) = 3N − 3 and the null-space is N (L 3 ) = U, where U = 1 N ⊗ I 3 represents a basis for a common 3D translation of the formation.For bearing formations the bearing Laplacian matrix is defined as IBR is a necessary and sufficient condition to localize the relative positions of the robots up to a scale factor from constant bearing measurements.As it was shown in [5], this condition can be relaxed if the bearings are time-varying, implying the IBR condition is not anymore necessary and the scale ambiguity can be resolved.We give some necessary definitions before proceeding.
Definition II.1.Given a system with state x ∈ R d , state sensitivity matrix Φ(t, t 0 ) = ∂x(t) ∂x(t 0 ) , output y(t), and output Jacobian H(t) = ∂y(t) ∂x(t) , the Observability Gramian (OG) is defined as follows [7]: Observability of the system linearized along the nominal trajectory is characterized by the invertibility of the OG [19], i.e. the state can be observed from the measured output y(t) if there exists some ϵ > 0 such that: Definition II.2.
[5] A formation is defined bearing persistently exciting (BPE) if the graph G is connected and given T > 0 and γ > 0 the following PE condition holds: It should be noted that this condition is less stringent than the classical PE condition since dim(N (L 3 )) = 3.
We point out that the previous integral in ( 5) represents a weighted Observability Gramian (OG) of the system.In fact, the state sensitivity matrix for a single integrator is the identity matrix and the output Jacobian is the so-called bearing rigidity matrix, i.e.: From ( 4) and ( 6), using the properties of the orthogonal projector, it follows that the OG of the system can be expressed as: where we see the similarity to (5), in which the information associated to each edge is weighted by the squared distance.
Since the scaling by the positive distances does not affect the rank of the matrix, it follows that the formation being BPE is a local observability condition of the subspace orthogonal to the common translation.We now introduce the observer presented in [5], which we use for localizing the drones in a bearing formation.In [5], it is shown that, if the formation is BPE, the bearings are always well-defined (i.e.no collisions occur) and the velocities are bounded, then the following distributed observer achieves UE convergence to the true state up to a common translation: where p is the estimated position vector and k e > 0.
The convergence rate of this observer depends on the BPE parameter γ [5].The aim of the following section is to then provide a control law that aims at indirectly increasing the value of γ for ensuring a satisfactory observer performance even for non-rigid bearing formations.

III. ACTIVE SENSING CONTROL
In this section, we propose a gradient-based control law for actively collecting information about the drone relative positions.To quantify the amount of information, we adopt the minimum nonzero eigenvalue, specifically the fourth eigenvalue denoted as λ 4 , of the weighted OG (5) but with an additional forgetting factor, that is, an OG with dynamics The forgetting factor ρ > 0 ensures that the OG remains bounded, since otherwise the term L β ⪰ 0(t), ∀t ≥ 0 would make the OG grow unbounded over time.Notice that if G β (t 0 ) is a matrix-weighted Laplacian [17] associated to the graph G, then this also holds for G β (t), ∀t ≥ t 0 , that is, (i) the required definiteness pattern of the matrix weights (negative semidefinite off-diagonal terms) is preserved: where and; These properties are used in the next derivations.In particular, the preservation of the sparsity pattern allows for a distributed implementation of the proposed control law.
Consider the state of the full system ζ = p T vec G β T T , where vec(•) is the vectorization operator.Then, the system dynamics can be written as: where the system has a cascade structure with and g = I 3N 0 3N ×(3N ) 2 T .Note that the first derivative of λ 4 (our 'information metric') depends on the positions of the drones, so that λ 4 has relative degree 2 w.r.t. the drone velocities (the available control inputs).This poses a challenge for direct control since the drone positions are solely involved in the dynamics of λ 4 within the orthogonal projector Π β .However, the derivative of λ 4 can be more easily controlled.This has expression where is the Lie derivative of λ 4 along f (analogously we also define L gi ) and T is the eigenvector associated to λ 4 .For convenience, let us define The second order derivative of λ 4 is then where T is the hessian of λ 4 with respect to the matrix entries.The important point to note here is that ( 14) has a direct (affine) dependence on the control inputs u i , which can then be exploited for controlling the quantity λ4 , for instance by applying the gradient-based control law The control action (15) implements the sought active sensing since it aims at maximizing λ4 that, indirectly, maximizes the eigenvalue λ 4 itself (which is the metric we care about, but which is less directly controllable by acting on the control inputs u i ).
We now discuss some properties of the control law (15).
Proposition III.1.From an information perspective, it is intuitive that the gradient of λ4 associated to an edge is orthogonal to the corresponding bearing, i.e.: Proof.First of all, notice that and the first term does not depend on p i .Furthermore, using the properties of the Kronecker product, we have One can then use the Kronecker product properties and show that: Proposition III.2.Assuming that no collisions happen i.e. d ij ≥ D min > 0, ∀i, j ∈ {1, 2, ..., N }, then the resulting input is always bounded.
Proof.w.l.o.g.considering k = 2: and where we used the subadditivity property of norms, the facts that Finally, the active sensing gradient control (15) requires each robot i to only know quantities which are locally available or communicated by neighboring robots, i.e., • v 4i : the components of the eigenvector corresponding to the i-th robot itself, which can be estimated in a distributed way by suitably modifying the distributed power iteration method (see e.g.[20], [21]) as done e.g. in [22]; • v 4j ∀j ∈ N i : which can be communicated by neighboring robots; • d ij : for which an estimate dij can be computed from the estimated positions; pi , locally available, and pj which can be communicated by the neighbors • β ij which is measured.In the next section, we show how to embed the active sensing action (15) within a primary task of formation control.

IV. FORMATION CONTROL
We establish a hierarchical framework that prioritizes position-based formation control over active sensing by using distributed QP-based Control Lyapunov Functions [16].We now formally introduce the definition of CLFs which will be instrumental for the next developments.
with α of class K.
The position-based formation control task considered in this work is defined based on the error associated with each edge, denoted as e ij = p j − p i − p d ij .Here, p d ij represents the desired relative position between robot i and j.We also define the desired position vector as T , which relates to the relative desired position vector It is worth noting that p d is defined up to a common translation.Then, the position error T with dynamics ė = u − ṗd .
We consider the following potential function: We emphasize that, while this function is only positive semidefinite i.e.V (w) = 0 for any vector w ∈ span(U) representing a common translation, it is positive definite with respect to the desired equilibrium set {e 1 = e 2 = ... = e N }.Let us de- is thus the unit-norm direction of the active sensing control ( 15)).The centralized QP, including the active sensing task, is formulated as: where Π ci = I 3 − c i c T i is the orthogonal projector onto the plane perpendicular to c i , η > 0 and α i (•) is an extended class K.
The first term in the cost function aims at achieving the same information gain as the one obtained by u i = kc i (ζ) and, therefore, it represents the active sensing task.The second (regularization) term in the cost function is meant to address two issues, namely (i) avoiding excessive inputs in the direction orthogonal to the active sensing task due to constraint satisfaction, and (ii) obtaining a strongly convex cost function.Indeed, the Hessian of the QP is given by Our objective now is to transform this problem into one that is suitable for distributed implementation.Inspired by [24], we compute the analytical expression for the solution of the QP.First, we define a = [a T 2 ), so that the constraint can be written as We point out that, in absence of input limits, the constraint is always feasible, i.e. a = 0 implies b = 0, in which case the constraint is trivially satisfied.The analytical solution to the QP can be obtained by using the Karush-Kuhn-Tucker (KKT) optimality conditions.Let us define the Lagrangian of the problem with λ L being the Lagrange multiplier.The resulting KKT conditions, using * to indicate the optimal solution, are: From the first equation one obtains: from which two cases are possible: 1) the constraint is not active at the unconstrained solution kH −1 QP,i c i , hence from the last condition one has λ * L = 0 and u * i = kH −1 QP,i c i ; 2) the constraint is active, hence substituting (26) into the constraint equation with equality yields Substituting back in (26), one obtains the complete solution for the two cases as: Since H QP,i ≻ 0, a zero denominator implies a i = 0 ∀i, that is, accomplishment of the formation control task.Hence, in this case the input can be set to zero.Otherwise, the first term represents the solution to the unconstrained problem.The term in the numerator ) is negative when the constraint is satisfied by the solution to the unconstrained problem.Since the term at denominator is always greater than zero, then the second term is different from zero only when the constraint is active and it acts to correct the input in order to satisfy the constraint.Also, notice that the proposed controller is piecewise (because of the eigenvalues re-ordering) locally Lipschitz continuous [25].
The two terms QP,i a i are not locally available but one can estimate their average through dynamic consensus, e.g.[26].Finally notice that the terms a i and b i depend on the estimated positions of robot i and its neighbors, which must then be estimated as accurately as possible to correctly achieve the formation task.This is achieved by the gradient-based active sensing strategy embedded in the cost function of (23).

V. SIMULATION RESULTS
In this section, we show the effectiveness of our approach through extensive numerical simulations.We compare the results obtained applying the proposed control law (28), which achieves the desired formation while performing active sensing against a control law which only implements formation control by satisfying the constraint in (23) while minimizing the input norm.For convenience, we refer to the two methods, respectively, as AS and CLF-only.We consider a group of N = 7 drones, which can sense the neighboring robots without any sensing constraints and three different graph topologies with different levels of connectivity.We performed a set of 50 simulations starting from random initial positions for each of the following graph topologies (ordered in increasing connectivity level): line graph, cycle graph and 1-redundantly bearing rigid (RBR) graph [4] (Fig. 1).Notice that the first two graphs are not bearing rigid.The drones initial estimated position is drawn from a gaussian distribution centered around the real position with standard deviation 0.8m 2 along each axis.As desired formation, we chose to have the drones equally spaced along a circle lying on a plane parallel to the x-y plane.The decrease of the CLF imposed in the constraint is the same for CLF-only and AS, hence, the converge speed of the formation is not of particular interest.We instead focus on the (more relevant) evaluation of the steady state error at convergence, and of the estimation error.In the simulations, we used the following parameters: the observer gain is k e = 0.1, the forgetting factor is ρ = 0.04, the OG is initialized as a matrix weighted Laplacian with matrix weights 0.5I 3 , the active sensing gain is k = 0.2, the regularization term is η = 0.01.We also implemented Control Barrier Functions (CBFs), which filters the inputs provided by the proposed controller to ensure collision-free trajectories [27].
We present the results for the line graph in Fig. 2, the cycle graph in Fig. 3, and the 1-redundant bearing rigid graph in The results clearly show that the use of active sensing leads to higher acquired information and, thus, a more accurate localization and, as a consequence, a better performance for the formation control.The use of active sensing provides benefits, particularly in scenarios where the connectivity is lower, as for the line and cycle graphs.In these cases, the active sensing task reduces both the estimation and formation errors.Conversely, in highly connected and rigid graphs like the 1-redundantly rigid graph, the improvement margin is quite small.Indeed, in this case, rigidity of the graph greatly simplifies the localization task since the robot group becomes "instantaneously localizable" (up to a scale factor).This is also evident from Fig. 4(c), where the information growth is considerably high even when active sensing is not used (CLF-only case).

VI. CONCLUSIONS AND FUTURE WORKS
In this paper, we have presented a novel distributed control strategy for bearing formations to maximize the information gathered for cooperative localization.Our active sensing strategy is based on the minimum eigenvalue of the weighted Observability Gramian with a forgetting factor as an information measure.This active sensing strategy can be combined with other additional tasks of interest (a position-based formation control task in our case) by making use of distributed QP-based control with Control Lyapunov Functions.This integration enables the execution of higherlevel tasks while concurrently enhancing the minimum level of "localizability" for the robot group, thereby improving the overall task execution.
Numerical simulations over graphs with different levels of connectivity demonstrated the benefits of the approach, in particular for non-rigid graphs.Future works will focus on including sensing constraints such as limited range and field of view, as well as accounting for the drones orientation.By addressing these factors, we can develop a more comprehensive and practical framework that considers the corresponding directed time-varying graph.Such advancements will enhance the applicability and robustness of our proposed strategy in real-world scenarios.

1 .
a i := j∈Ni (e i − e j ), and b = N i=1 b i , with b (a) Line graph (b) Cycle graph (c) 1-RBR graph Fig. 1: Graphs corresponding to the sets of simulations.

Fig. 4 .
In each case, the figures are arranged from left to right to display: (a) the trajectory of λ 4 (t), (b) the estimation error, (c) the formation error, (d) and (e) the violin plot respectively for the estimation and formation error at the end of the trajectory.As a reminder, λ 4 (t) is the minimum non-zero eigenvalue of the OG and it is a measure of the collected information which affects the worst case convergence rate of the employed localization observer.The plots (a), (b) and (c) show the average trajectories along with the standard deviation of the results across simulations.