A hybrid metaheuristic approach for the capacitated arc routing problem

The capacitated arc routing problem (CARP) is a diﬃcult combinatorial optimization problem that has been intensively studied in the last decades. We present a hybrid metaheuristic approach (HMA) to solve this problem which incorporates an eﬀective local reﬁnement procedure, coupling a randomized tabu thresholding procedure with an infeasible descent procedure, into the memetic framework. Other distinguished features of HMA include a specially designed route-based crossover operator for solution recombination and a distance-and-quality based replacement criterion for pool updating. Extensive experimental studies show that HMA is highly scalable and is able to quickly identify either the best known results or improved best known results for almost all currently available CARP benchmark instances. In particular, it discovers an improved best known result for 15 benchmark instances (6 classical instances and 9 large-sized instances all with unknown optima). Furthermore, we analyze some key elements and properties of the HMA-CARP algorithm to better understand its behavior.


Introduction
The capacitated arc routing problem (CARP) has been the subject of a large number of studies during the last decades due to its wide applicability in logis-tics, such as household waste collection, product distribution, winter gritting and postal deliveries, among others [10].The CARP model can be informally described as follows.We are given a graph with a set of vertices and edges, where each edge has a predefined traversal cost and where a subset of edges, which are required to be serviced by some vehicles, are additionally associated with a service cost and a demand.A fleet of identical vehicles with a limited capacity is based at the depot vertex.The objective of CARP is to find a set of vehicle routes with a minimum cost such that: 1) each required edge is serviced on one of the routes; 2) each route must start and end at the depot vertex; and 3) the total demand serviced on the route of a vehicle must not exceed the vehicle capacity.
From a theoretical point of view, CARP is known to be NP-hard [12], and hence is not expected to be solved by any exact algorithm in a polynomial time in the general case.The computational difficulty of solving CARP is also confirmed in practice.Indeed, the best existing exact algorithms are limited to moderate instances with only 140 vertices and 190 edges [3,6,7].For these reasons, intensive research has been devoted to developing heuristic and metaheuristic methods.Representative heuristic methods include Augment-Merge [12], Path-Scanning [17], the route first-cluster second heuristic [33] and Ulusoys Heuristic [35].Among the metaheuristic methods, neighborhood search approaches are popular, e.g., tabu search [4,19,24], variable neighborhood search [20,30], guided local search [2], GRASP with evolutionary path relinking [36].As another class of popular metaheuristics for tackling CARP, population-based algorithms generally achieve better performances, such as the memetic algorithm [22,34], the ant colony algorithm [32] and the cooperative co-evolution algorithm [26].Among these methods, two population-based algorithms (MEANS [34] and Ant-CARP [32]) and one local search algorithm (GLS [2]) represent the state-of-the-art solution methods for the classical test instance set, while RDG-MEANS [26] is the current best performing algorithm for the large-scale CARP (LSCARP) instances.Finally, for a thorough and up-to-date discussion of arc routing problems, the reader is referred to the recent book [8] edited by Á. Corberán and G. Laporte and in particular, chapter 7 by C. Prins dedicated to heuristic approaches.
In this work, we investigate a new population-based algorithm under the memetic search framework [27].Memetic algorithms (MAs) have been proved to be very effective for solving a large number of difficult combinatorial optimization problems [28,29], including CARP [22,34].The success of a MA highly depends on a careful design of two key search components: the crossover operator and the local refinement procedure [18].Based on our previous experiences on MAs applied to various combinatorial problems, we go one step further by innovating especially on these two key components (crossover and local refinement) with the goal of elaborating a powerful memetic algorithm able to surpass the current state-of-the-art CARP methods.
The main contributions of our work can be summarized as follows.
• From the algorithmic perspective, the proposed population-based hybrid metaheuristic approach (HMA) combines a powerful local refinement procedure to ensure an effective search intensification with a dedicated crossover operator specially designed for CARP to guarantee a valid search diversification.The local refinement procedure couples a randomized tabu thresholding procedure to locate high-quality feasible solutions, with an infeasible descent procedure to enable tunneling between feasible and infeasible regions.The dedicated crossover operator relies on route information that can be embodied in exchanges of parent solutions to create new promising solutions.Additionally, to maintain a healthy population diversity and to avoid premature convergence, HMA employs a quality-and-distance strategy to manage the pool of solutions using a dedicated distance measure.• In terms of computational results, extensive experiments carried out on 8 sets of widely used benchmarks show the competitiveness of the proposed method compared to the state-of-the-art CARP algorithms in solution quality and computational efficiency.For the 7 sets of 181 small-sized and medium-sized instances, HMA consistently matches or improves on all the best known results.In particular, HMA discovers a new best known result (new upper bound) for 6 well-studied instances.For the last set of 10 large-sized CARP benchmarks, HMA exhibits an even better performance.
It easily dominates the state-of-the-art algorithms, including those specially designed for these CARP instances, by finding 9 new best known solutions, demonstrating the outstanding scalability of the proposed method.
The rest of the paper is organized as follows.Section 2 introduces preliminary notation and the solution representation.Sections 3 and 4 are dedicated to the description of the main HMA algorithm.Section 5 presents the computational results.Section 6 investigates some key elements of HMA, followed by the conclusions in Section 7.

Notation and solution representation
We are given a graph G(V, E) with a set of vertices (V ), a set of edges (E), a set of required edges (E R ⊂ E) and a fleet of identical vehicles with a capacity of Q that is based at the depot vertex v d (v d ∈ V ).Each edge e = (i, j) ∈ E is represented by a pair of arcs < i, j > and < j, i >.A required edge is said to be served if and only if one of its two arcs is included in one vehicle route of the routing plan.For the sake of simplicity, we use the term task to represent a required edge hereafter.Let n be the number of tasks, i.e., n = |E R |.Each arc of a task, say u, is characterized by four elements: the head vertex (head(u)), the tail vertex (tail(u)), the traversal cost (tc(u)) and the demand (q(u)).
To represent a CARP solution, we assign to each task (i.e., a required edge) two IDs (i, i + n) where i is an integer number in [1, n], i.e., one ID for each arc of the task.We also define a dummy task with 0 as its task ID and both its head and tail vertices being the depot vertex v d .This dummy task is to be inserted somewhere in the solution as a trip delimiter.Suppose a solution S involves m vehicle routes, S can then be encoded as an order list of (n+m+1) task IDs among which (m+1) are dummy tasks: S = {S(1), S(2), ..., S(n + m + 1)}, where S(i) denotes a task ID (an arc of the task or a dummy task) in the i th position of S. S can be also written as a set of m routes (one route per vehicle): , where R i denotes the i th route composed of the task ID at the j th position of R i .Let dist(u, v) denote the shortest path distance between the head vertex of arc u (head(u)) and the tail vertex of arc v (tail(v)), the total cost of a solution S can be calculated as: The total load load(R i ) of a route R i can be calculated as: 3 A hybrid metaheuristic algorithm for CARP In this section, we describe the proposed hybrid metaheuristic algorithm (HMA) for CARP including the main procedure, the procedure for generating initial solutions, the specific route-based crossover as well as the quality-and-distance based pool maintenance procedure.The local refinement procedure of HMA is presented in Section 4.

Main scheme
Our HMA algorithm can be considered as a hybrid steady-state evolutionary algorithm which updates only one population solution at each generation of the evolution process [14].Algorithm 1 shows the main scheme of the HMA algorithm.HMA starts with an initial population of solutions (Line 1 of Algo. 1) which are first generated by a random path scanning heuristic (Sect.3.2) and further improved with the local refinement procedure (Sect.4).Before entering the main loop, HMA initializes a counter array Cnt (Lines 3-4) which is used to record the accumulated number of successful pool updates with the related threshold ratio value in a given set Sr (an external input).
At each generation, HMA randomly selects two parent solutions S 1 and S 2 from the population (Line 6), and performs a route-based crossover (RBX) operation (Line 7, see Sect.3.3) to generate an offspring solution S 0 .RBX basically replaces one route of one parent solution with one route from the other parent solution, and repairs, if needed, the resulting solution to ensure the feasibility of S 0 .HMA then applies the local refinement procedure (Line 10, Sect.4) to further improve S 0 .The local refinement procedure involves two sub-procedures, namely a randomized tabu thresholding procedure (RTTP) and an infeasible descent procedure (IDP), which can be carried out in two possible orders: RTTP followed by IDP (RTTP→IDP) and IDP followed by RTTP (IDP→RTTP).The applied order is determined randomly before running the local refinement procedure (Line  4).If S 0 is successfully inserted into the population, the counter in relation to the threshold ratio used in the current generation is incremented by one (Lines 14-15).HMA terminates when a stopping condition is reached, which is typically a lower bound cutoff or a maximum number of generations.

Population initialization
To generate one initial solution of the population, HMA uses a randomized path-scanning heuristic (RPSH) to construct a solution which is then further improved by the local refinement procedure described in Section 4. RPSH is adapted from the well-known path-scanning heuristic [17] by randomizing its five arc selecting rules.Specifically, RPSH builds one route at a time in a step-by-step way, each route starting at the depot vertex.At each step, RPSH identifies a set A of arcs (belonging to a set of unserved tasks) that are closest to the end of the current route and satisfy the vehicle capacity constraint.If A is empty, RPSH completes the current route by following a shortest deadheading path to the depot vertex and starts a new route.Otherwise RPSH randomly selects one arc from A and extends the current route with the selected arc.The selected arc as well as its inverse arc are marked served.This process continues until all tasks are served.
The solution constructed by RPSH is further improved by the local refinement procedure of Section 4. The improved solution is inserted to the population if it is unique in terms of solution cost relative to the existing solutions, or discarded otherwise.The population initialization procedure stops when the population is filled with P size (population size) different individuals or when a maximum of 3 * P size trials is reached.The latter case helps to fill the population with P size distinct individuals.If ever only k < P size distinct solutions are obtained after 3 * P size trials, we set the population size to k.

Route-based crossover operator for CARP
At each of its generations, HMA applies a crossover operator to create an offspring solution by recombining two parent solutions randomly selected from the population.It has been commonly recognized that the success of memetic algorithms relies greatly on the recombination operator which should be adapted to the problem at hand and be able to transfer meaningful properties (building blocks) from parents to offspring [18].This idea is closely related to the idea of using structured combinations and vocabulary building [16].
By considering that the solution of CARP is composed of a set of routes, it is a natural idea to manipulate routes of tasks rather than individual tasks.In this regard, the route-based crossover (RBX) operator used for the vehicle routing problem (VRP) [31] seems attractive for CARP.However, given that CARP is quite different from the VRP, RBX must be properly adapted in our context within HMA.Given two parent solutions with m 2 routes, our RBX crossover basically copies S 1 to an offspring solution S 0 and replaces a route of S 0 with a route from S 2 , and then repairs S 0 to establish feasibility if needed.The RBX crossover procedure consists of three main steps: • Step 1: Copy S 1 to an offspring solution S 0 = {R 0 1 , R 0 2 , ..., R 0 m 1 } and replace a route of S 0 with a route from S 2 .Generate two random integer values a a of solution S 0 with the route R 2 b of solution S 2 , and collect the tasks that are not served in S 0 to a set U T ; • Step 2: Remove duplicated tasks by the following rule.Let S 0 (p i ) be the task in position p i , and let p i − 1 be the position before p i and p i + 1 be the position after p i .Also let dist(u, v) denote the shortest path distance between vertex head(u) and vertex tail(v).Given a task t 0 which appears twice respectively in position p 1 and p 2 , RBX removes the appearance with the largest value of s(p i ) (i ∈ {1, 2}), where s(p i ) = dist(S 0 (p i −1), S 0 (p i ))+ dist(S 0 (p i ), S 0 (p i + 1)) − dist(S 0 (p i − 1), S 0 (p i + 1)).• Step 3: Insert the unserved tasks of U T in S 0 .Before task insertion, RBX sorts the tasks in set U T in random order.Then for each task t in U T , RBX scans all possible positions of S 0 to insert t.If a position is able to accommodate t while respecting the vehicle capacity, RBX further calculates the saving (change of the total cost) with t inserted.The two arcs of t are both considered for insertion, and the minimum saving is recorded.RBX finally inserts the task to a position which causes the overall least augmentation of the total cost while maintaining the solution feasibility.Ties are broken randomly.This process is repeated until U T becomes empty.
Our proposed RBX operator not only introduces a new route (taken from S 2 ) into S 0 , but also modifies other existing routes due to deletion of duplicated tasks and insertion of unserved tasks.Clearly, RBX could lead to an offspring solution which is structurally different from its parent solutions.This is a desirable feature which promotes the overall diversity of HMA.Moreover, the quality of the offspring is not much deteriorated due to the use of greedy heuristics in Steps 2 and 3.As such, when the offspring is used as a seeding solution of local refinement, it helps the search to move into a new promising region.RBX can be realized in O(n 2 ), where n is the number of tasks.

Population management
In population-based algorithms, one important goal aims to avoid premature convergence of the population.This can be achieved by adopting a carefully designed strategy for population management.In HMA, we use a quality-anddistance strategy (QNDS) for this purpose.QNDS takes into account not only the solution quality, but also the diversity that the solution contributes to the whole population by resorting to a solution distance measure.
We propose in this work to adapt for the first time the Hamming distance in the context of CARP and use it as our distance measure.Any pair of consecutive tasks (S(i), S(i + 1)) of a solution S is linked by a shortest path (a path with minimum deadheading cost) between head(S(i)) and tail(S(i + 1)), called deadheading-link hereafter.Thus, solution S has n + m deadheadinglinks where n is the number of tasks and m is the number of routes.Let V R ⊂ V be a set of vertices that belong to the required edges, let } be the set of all possible deadheading-links.Given two solutions S i with m i routes and S j with m j routes, their Hamming distance D i,j is defined as the number of different deadheading-links between S i and S j : where m = min{m i , m j }, ) is a deadheading − link of both S i and S j 0, otherwise Given a population P OP = {S 1 , S 2 , ..., S P size } of size P size and the distance D i,j between any two individuals S i and S j (i = j ∈ [1, P size]), the average distance between S i and any other individual in P OP is given by: AD i,P OP = ( QNDS evaluates each solution in the population using the following qualityand-distance fitness (QDF for short) function: where OR(f (S i )) and DR(AD i,P OP ) represent respectively the rank of solution S i with respect to its objective value and the average distance to the population (objective value is ranked in ascending order while average distance is ranked in descending order), and α is a parameter.We require the value of α to be higher than 0.5 to ensure that the best individual in terms of objective value will never be removed from the population, which formalizes the elitism property of QNDS.
Given an offspring S 0 (which has undergone both crossover and local refinement), QNDS first inserts S 0 into P OP , evaluates the QDF value of each individual and finally removes from P OP the solution S w with the largest QDF value.

Local refinement procedure of HMA
The local refinement procedure is another key component of our HMA algorithm and plays en essential role in enforcing intensification which ensures the high performance of HMA.Our local refinement procedure involves two sub-procedures, i.e., a randomized tabu thresholding procedure which explores only the feasible region, and an infeasible descent procedure which visits both feasible and infeasible regions.Both sub-procedures are based on a set of move operators which are explained below.The implementation of the two sub-procedures are also described.

Move operators
Our local refinement procedure employs six move operators, including five traditional small-step-size operators: inversion, single insertion, double insertion, swap, two-opt; as well as a large-step-size operator called merge-split recently proposed in [34].These operators are briefly described as follows.
Let u and v be a pair of tasks in the current solution S, tasks x and y be respectively the successor of u and v, rt(u) be the route including task u.
• Inversion (IV): replace the current arc of task u with its reverse arc in S; • Single insertion (SI): displace task u after task v (also before task v if v is the first task of rt(v)); both arcs of u are considered when inserting u in the target position, and the one yielding the best solution is selected; • Double insertion (DI): displace a sequence (u, x) after task v (also before task v if v is the first task of rt(v)); similar to SI, both directions are considered for each task and the resulting best move is chosen; • Swap (SW): exchange task u and task v; similar to SI, both directions are considered for each task to be swapped and the resulting best move is chosen; • Two-opt (TO): two cases exist for this move operator: 1) if rt(u) = rt(v), reverse the direction of the sequence (x, v); 2) if rt(u) = rt(v), cut the link between (u, x) and (v, y) , and establish a link between (u, y) and (v, x); • Merge-split (MS): this operator obtains an unordered list of tasks by merging multiple routes of the current solution, and sorts the unordered list with the path scanning heuristic [17].It then optimally splits the ordered list into new routes using the Ulusoy's splitting procedure [35].Each application of this operator results in five new solutions and the best one is chosen.Interested readers are referred to [34] for more details.
In the following two subsections, we explain how these operators are used in our two local refinement sub-procedures.

Randomized tabu thresholding procedure
The proposed randomized tabu thresholding procedure (RTTP) follows the general principle of the Tabu Thresholding (TT) method whose basis was first proposed in [13].A main ingredient of TT is the candidate list strategy (CLS) which is dedicated to reduce the number of moves to be considered in order to accelerate the neighborhood examination.CLS subdivides the possible moves of the current solution into subsets and executes one move for each subset rather than for the whole neighborhood.CLS, along with the elements of probabilistic tabu search, simulates the tabu mechanism with memory structure.RTTP is a randomized procedure in the sense that it explores multiple neighborhoods in a random order.

Outline of the randomized tabu thresholding procedure
The randomized tabu thresholding procedure basically alternates between a Mixed phase and an Improving phase where for both phases, five traditional move operators are employed: inversion, single insertion, double insertion, swap and two-opt.Algorithm 2 sketches the outline of the RTTP procedure for CARP.RTTP starts by initializing a set of global variables with an initial solution S 0 taken from an external input.RTTP then enters the main loop where Mixed phase and Improving phase alternate.
In the Mixed phase, for any move operator o and for a given task i, RTTP examines the candidate list MOVE CL(i, S, o) in random order and accepts the first improving feasible move if any, or the best admissible feasible move otherwise.The admissible feasible move satisfies a quality threshold T V , i.e., f (S ′ ) ≤ T V where S ′ is the neighboring solution generated by the accepted move.T V is calculated as: T V = (1 + r) * f p , where f p is the current best local optimum objective value, and r is a threshold ratio.With this quality threshold, deteriorating solutions are allowed in order to diversify the search.
Solution cycling is prevented through the complete reshuffling of the order in which candidate lists are examined before each neighborhood examination.An iteration of the Mixed phase is based on the examination of the complete neighborhoods of all move operators.The Mixed phase is repeated for T iterations.T is called a tabu timing parameter, which is analogous to the tabu tenure when an explicit tabu list is used.T is randomly selected among the values of a given set St.
In the Improving phase, RTTP always seeks an improving move among the feasible moves within each candidate list MOVE CL(i, S, o).If no improving move is found in a given candidate list, RTTP skips to the next candidate list.This phase is iterated until no improving move can be found in any candidate list.
If the local optimum reached in the Improving phase has a better objective value than the recorded best objective value f p , the algorithm updates f p and resumes a new round of M ixed − Improving phases.RTTP terminates when f p has not been updated for a consecutive W M ixed − Improving phases.

Construct candidate list
When using the five traditional move operators, neighborhoods of the current solution can always be obtained by operating on two distinct tasks.For instance, insertion is to insert one task after or before another task; swap is to swap one task with another task; two-opt is to exchange the subsequent part of a task with that of another one.As such, given a move operator o, a natural choice for the subsets to be used in the candidate list strategy is to define one subset MOVE SUBSET(i, S, o) for each task i.In order to speed up neighborhood examination, we further use an estimation criterion to discard moves from MOVE SUBSET(i, S, o) that are unlikely to lead to a promising solution.This estimation criterion is based on a distance measure between two tasks t 1 , t 2 which is defined as: where ) is the traversing distance between t 1 's ath end node v a (t 1 ) and t 2 's bth end node v b (t 2 ).(This distance measure was first used in [26] to define the distance between two routes.)The candidate move list associated to task i (MOVE CL(i, S, o)) is restricted to contain Csize most promising moves such that for each move which is associated with two tasks (i, t), t is a member of i 's Csize closest neighboring tasks according to the distance measure of formula 7.

Infeasible descent procedure
For a constrained optimization problem like CARP, it is known that allowing a controlled exploration of infeasible solutions may facilitate transitions between structurally different solutions and help discover high-quality solutions that are difficult to locate if the search is limited to the feasible region.This observation is highlighted by discoveries made with the strategic oscillation approach (see, e.g., [15]) which alternates between phases of infeasible descent and phases of improving feasible search.To further intensify the search, we employ in our local refinement procedure, as a complement to RTTP, an infeasible descent procedure (IDP) which allows visiting infeasible solutions.IDP is a best-improvement descent procedure based on three traditional move operators, i.e., single insertion, double insertion, swap, as well as a large-step-size merge-split operator that was recently proposed and proved to be effective for CARP [34].We use the merge-split operator in the way as suggested in [34].
IDP basically involves two different stages.In the first stage, IDP examines the complete neighborhoods induced by the SI, DI and SW operators and chooses the best move to perform if it improves the current solution.When no improvement can be found in the first stage, IDP switches to the second stage where it examines the neighboring solutions generated by the MS operator.Since MS is computationally expensive, IDP restricts the examination to a maximum of 100 neighboring solutions which are randomly sampled from the C 2 m possibilities where m is the number of routes.If C 2 m ≤ 100, all neighboring solutions will be examined.Still, the best improving move is performed until no improvement is reported in this stage.If any improvement is found in the second stage, IDP switches back to the first stage to explore the new local region and terminates the algorithm when this stage is finished; otherwise, IDP terminates at the end of the second stage.As in many previous CARP algorithms which allow intermediate infeasible solutions [2,4,19,34], we evaluate the solution quality generated in the search process of IDP by adding a penalty item to the original cost: where EX(S) is the total excess demand of S and β is a self-adjusting penalty parameter.β is halved (doubled) if feasible (infeasible) solutions have been achieved for five consecutive iterations, and its initiating value is set to: where Q is the vehicle capacity.One notices that we don't consider the violation of S in Eq. 9.This is because we always ensure that IDP starts the search from a feasible solution.

Combination of RTTP and IDP
After presenting the implementation of RTTP and IDP, the order of combining them in the local refinement procedure remains an issue to be addressed.RTTP is the most important component of our HMA algorithm which compared to IDP, makes more contribution to the high performance of HMA, but also consumes more computing time (see the analysis in Sect.6.1).IDP is a very simple descent procedure which, when used alone, is not expected to identify very high quality solutions (see the analysis in Sect.6.1).However, the search ability of RTTP and IDP can be mutually strengthened when they are combined.Indeed, it is beneficial to put IDP either before or after RTTP.When IDP is placed before RTTP, the best feasible solution found by IDP can be considered as a good starting point for RTTP.This is because the performance of neighborhood search algorithms may highly depend on the initial solution and a high-quality initial solution could help discover still better solutions.When IDP is put after RTTP, the property of tunneling through infeasible regions and the large-step-size MS operator of IDP may help to further improve the high quality solution provided by RTTP.For the above reasons, both orders (i.e., RTTP→IDP and IDP→RTTP) are allowed in our HMA algorithm.The order is randomly determined before the local refinement procedure is carried out.

Computational experiments
To evaluate the efficacy of the proposed HMA algorithm, we carry out extensive experiments on a large number of well-known CARP benchmark instances, and compare the results1 with those of the state-of-the-art algorithms as well as the best known solutions ever reported in the literature.
HMA was coded in C++ and compiled by GNU g++ 4.1.2with the '-O3' option.The experiments were conducted on a computer with an AMD Opteron 4184 processor (2.8GHz and 2GB RAM) running Ubuntu 12.04.When solving the DIMACS machine benchmarks2 without compilation optimization flag, the run time on our machine is 0.40, 2.50 and 9.55 seconds respectively for instances r300.5, r400.5 and r500.5.

Experimental setup
Our HMA algorithm was evaluated on a total of 191 benchmark graphs with 7 to 255 vertices and 11 to 375 edges.These instances are very popular and widely used in the CARP literature.They cover both random instances and real-life applications, and are typically classified into eight sets: • gdb: 23 instances randomly generated by DeArmon [9], with 7-27 nodes and 11-55 required edges.• val : 34 instances derived from 10 randomly generated graphs proposed by Benavent et al. [1], with 25-50 nodes and 34-97 required edges.• egl : 24 instances proposed by Eglese [11], which originate from the data of a winter gritting application in Lancashire (UK), with 77-140 nodes and 98-190 edges that include 51-190 required edges.• C : 25 instances generated by Beullens et al. [2] based on the intercity road network in Flanders, with 32-97 nodes and 42-140 edges that include 32-107 required edges.
• D: 25 instances modified from the instances of set C by doubling the vehicle capacity for each instance.• E : 25 instances, also generated by Beullens et al. [2] based on the intercity road network in Flanders, with 26-97 nodes and 35-142 edges that include 28-107 required edges.• F : 25 instances modified from the instances of set E by doubling the vehicle capacity for each instance.• EGL-G: 10 large-sized CARP instances, which like the set egl, were also generated based on the road network of Lancashire (UK) [4], each having 255 nodes and 375 edges with 374 to 375 required edges.
Following the common practice in the literature, we compare the results produced by our HMA algorithm on these benchmarks to those of the following eight state-of-the-art algorithms: 1 A guided local search (GLS) algorithm proposed by Beullens et al. [2], who reported results on the instance set gdb, val and C-F. 2 A deterministic tabu search algorithm (TSA) proposed by Brando & Eglese [4], who reported results on all eight instance sets.Two sets of results ("TSA1" and "TSA2") were reported, and the one ("TSA2") yielding better performance will be considered for comparative study for all instance sets except for EGL-G where only results of "TSA1" were reported.3 A variable neighbourhood search (VNS) algorithm proposed by Polacek et al. [30], who reported results on set val and egl.Two sets of results ("993 MHz" and "3.6 GHz") were reported, and the one ("3.6GHz") yielding better performance will be considered for comparative study.4 A memetic algorithm with extended neighbourhood search (MAENS) proposed by Tang et al. [34], who reported results on set gdb, val, egl and C-F. 5 An improved ant colony optimization based algorithm (Ant-CARP) proposed by Santos et al. [32], who reported results on set gdb, val, egl and C-F.Two sets of results ("Ant-CARP 6" and "Ant-CARP 12") were reported, and the one ("Ant-CARP 12") yielding overall better performance will be considered for comparative study.Hereafter, we use "Ant 12" to represent "Ant-CARP 12", and use its median results for comparison when we study the average performance of the reference algorithms since their average results are not available.on set EGL-G.
These reference algorithms were tested on different computers with a CPU frequency ranging from 500 MHz to 3.6 GHz.To make a relatively fair comparison of the runtime, all CPU times reported in the reference papers are scaled here into the equivalent AMD Opteron 4184 2.8 GHz run times.Like in previous CARP literature [25,26,32,34], our time conversion is based on the assumption that the CPU speed is approximately linearly proportional to the CPU frequency.We provide in Table 1 the CPU type and its frequency of each reference algorithm, as well as its resulting scaling factors.This time conversion is only made for indicative purposes, since the computing time of each algorithm is not only influenced by the processor, but also by some inaccessible factors such as the operating systems, compilers and coding skills of the programmer.Nevertheless, we show in the following experiments, the outcomes provide interesting information about the performance of the proposed algorithm relative to the best performing algorithms.

Parameter tuning
The HMA algorithm relies on a set of correlated parameters.To achieve a reasonable tuning of the parameters, we adopt the Iterated F-race (IFR) method [5], which allows an automatic parameter configuration, using the IFR algorithm that is implemented and integrated in the irace package [23].Table 2 summarizes the parameters of our HMA algorithm, along with the range of values that were determined by preliminary experiments.Among these parameters, four of them (Psize, α, W, Csize) need to be tuned and the other two parameters (threshold ratio r and tabu timing parameter T ) are adaptively or randomly chosen among the values in the given sets (Sr and St) during the search process.We set the tuning budget to 1000 runs of HMA and each run is given 100 generations.We restrict the training set to contain 8 challenging instances taken from val, egl, C, E and EGL-G sets: val-10D, egl-e3-B, egl-s3-C, C11, E12, E15, EGL-G1-B, EGL-G2-B.The final choices of the parameter values are presented in Table 2 and they are used in all experiments in the following sections unless otherwise mentioned.

Comparative results on 7 classical instance sets
We first assess HMA on the 7 most commonly used instance sets (181 instances): gdb, val, egl, C, D, E, F. It is compared to 5 current state-of-the-art algorithms: GLS [2], TSA2 [4], VNS [30], Ant-CARP [32], GRASP [36], and MAENS [34].To give a general picture of the performance of each compared algorithm, we summarize in Table 3, for each instance set and for each algorithm, the number of best results that match or improve on the best known results (#Best), the number of average results that match or improve on the best known results (#BestAvg), the average gap between the average results and the best known results in percentage (AvgGap, the gap is calculated as (f avg -f bk )×100/f bk where f avg is the average solution value obtained by the algorithm and f bk is the best known solution value reported in the literature), and the average of the instance computing time in seconds (AvgTime).When we count #Best, we refer to the current best known results (BKRs) which are compiled from the "best results" reported in all previous CARP literature.These "best results" could be those obtained by a single algorithm with various parameter settings (e.g., [22,30,32]) or even with a specific setting tuned for each instance (e.g., TSA best in [4]).Finally, to complement these summarized results, Appendix A (Tables A.1-A.7) reports, for each of the 181 CARP instance, the detailed results of our HMA algorithm as well as the average results of the reference algorithms.These tables permit a thorough assessment of all compared algorithms.
Note that some results were obtained from a single run of the algorithms (GLS and TSA) whereas other results came from multiple runs (VNS, GRASP, Ant-CARP, MAENS, HMA1 and HMA2).Clearly, #Best favors multiple-run results.To make a fair comparison, we refer to average statistics (#BestAvg, AvgGap, AvgTime) when we compare single-run results with multiple-run results.
For each instance, our HMA algorithm was run 30 times under two different stop criteria: 500 generations and 2000 generations.To ease presentation, we denote HMA with 500 generations as HMA 1 , and HMA with 2000 generations as HMA 2 .Studying the outcomes of these two termination criteria affords insights into how HMA behaves when more computing time is available.
From Table 3 Now we turn to compare our HMA algorithm to the single-run reference algorithms.As mentioned before, we should look at average statistics when comparing multiple-run algorithms to single-run algorithms.According to two average indicators, namely #BestAvg and AvgGap, GLS is clearly the best performing single-run algorithm among all 6 reference algorithms (including single-run algorithms and multiple-run ones).Still, compared to GLS, our HMA algorithm remains competitive on 6 instance sets (i.e., gdb, val and C-F).Indeed, when the short time limit (500 generations) is applied, HMA 1 performs better in items of AvgGap (by achieving an equal or lower AvgGap for more instance sets: 5 vs. 3), but worse in terms of #BestAvg (by attaining an equal or higher #BestAvg for less instance sets: 2 vs. 6).On set D, both HMA 1 and HMA 2 are dominated by GLS in terms of both indicators.Finally, one observes that when given more computing time, HMA 2 is able to further improve its results.
To validate the above observations, we apply a Wilcoxon test with a significance factor of 0.05 for a pairwise comparison of the average performance between HMA 1 and TSA2, Ant 12 as well as MAENS, which are three approaches that have been tested on all 181 instances.The resulting p-values of 2.15E-10, 9.31E-10 and 2.20E-16 confirm that the results of HMA 1 are significantly better than those of these current best performing algorithms.This conclusion remains valid for HMA 2 since it always performs better than When it comes to computational time ('AvgTime' in Table 3), our HMA algorithm also remains competitive.Recall that the indicated time for the reference algorithms are scaled according to our computer and the average time of a multiple-run algorithm can be compared to the time of a singlerun algorithm.Table 3 shows that HMA 1 is in overall not slower than any of the reference algorithms.Compared to the fast GLS, TSA2 and Ant 12 algorithms, HMA 1 generally requires comparable computing time.Compared to the remaining reference algorithms (i.e., VNS, GRASP and MAENS), HMA 1 is clearly more efficient.By extending the stop condition to 2000 generations, HMA 2 , which finds improved solutions, consumes more computing time than HMA 1 as expected.
Finally, the "best results" reported by previous CARP studies were often achieved by executing tests involving multiple parameter settings to show the extreme performance of the associated algorithms.Following this practice, we report in Appendix A some new best known results discovered by our HMA algorithm with parameter settings other than the standard one given in Table 2.The form of HMA using these additional parameter settings, which we call HMA * , further attains two new BKRs (for S4-A, S4-B) and matches the BKR for S3-C, which finally makes HMA * consistently match or improve on all 181 BKRs.

Comparative results on the EGL-G set
To test the scalability of HMA, we carried out experiments on the EGL-G set containing 10 large scale CARP (LSCARP) instances.As stated in [26], solving LSCARP is much more challenging than solving small-sized or medium-sized instances since the solution space increases exponentially as the problem size increases.Compared to the classical instance sets which involve instances having at most 190 required edges, all instances in EGL-G have more than 347 required edges.Such a size, as was shown in previous studies [4,25], is large enough to pose a scalability challenge to the existing CARP algorithms.For this reason, a dedicated algorithm called RDG-MAENS [26] has been proposed specifically for solving LSCARP instances.In this section, we evaluate the capacity of our HMA algorithm to solve these 10 LSCARP instances by comparing its performance to those of the current best performing CARP algorithms including RDG-MAENS.
As before, HMA was executed 30 runs to solve each instance under two termination criteria: 500 generations (HMA 1 ) and 2000 generations (HMA 2 ).We also report the results obtained by HMA with various other parameter settings (HMA * ).Table 4 summarizes our results on the EGL-G set, along with those of the current best performing algorithms: TSA1, ILS, MAENS, RDG-MAENS.In [26], the authors report the results of RDG-MAENS for 6 parameter combinations of (g,α) where g = 2 and 3, α = 1, 5 and 10.We include the results of the best version (g = 2, α = 10) for our comparative study.Table 4 lists the average results of each algorithm, the solution time of HMA, the best lower bounds (LB), the best known results (BKR), and the best results of HMA.The last two rows show, for each algorithm, the average of the average gaps to the BKR (AvgGap), and the "scaled" average of the average solution time (AvgT ime).
Table 4 discloses that HMA 1 dominates all reference algorithms.In terms of average results, HMA 1 is much better than any of the reference algorithms for all 10 instances.The very small AvgGap value of -0.02% of HMA 1 indicates that HMA 1 is on average better than the previous BKRs, and compares favorably to the AvgGap value of more than 0.89% for the other approaches.
In terms of best results, HMA 1 discovers an improved solution relative to the A Wilcoxon test is finally applied to a pairwise comparison of the average performance between HMA 1 and each of the four reference methods, which always results in a p-value of 0.001953 (<0.05) for all tested pairs, indicating the superiority of our method relative to the compared approaches.

Analysis
In this section, we present additional experimental results to analyze the performance of each algorithmic component of the proposed HMA algorithm, in order to understand their contribution to the overall performance of HMA and how they should be combined in the proposed algorithm.These experiments were performed on the egl set which contains a number of challenging instances of medium size, and helps to better distinguish the performance of the algorithm variants to be considered.

Analysis of algorithmic components
To provide insight of the performance of the local refinement procedure and the crossover operator, we test in this experiment several algorithm variants based on a single solution rather than a population of solutions.The "IDP" version is a very simple algorithm having only solution initialization (random path scanning) and IDP as its two components.Similar to "IDP", the "RTTP" version includes only the solution initialization and the RTTP procedure.The "TTP" version is exactly the same as "RTTP" except that the move operators are used in a fixed order as they are presented in Section 4.1.The "I+R" version combines IDP and RTTP in random order.The "RST" version is a random restart algorithm that simply starts "I+R" for 500 times.We also include the results of HMA 1 (500 generations) for comparative purposes.For each algorithm variant, we report in Table 5 the average of the best gap to the BKR in percentage (BestGap), and the average of the average gap to the BKR in percentage (AvgGap), as well as the average computing time in seconds (AvgTime).
Table 5 shows that TTP performs much better than IDP.RTTP further improves on TTP which shows the effectiveness of the random use of move operators.The results of "I+R" indicates that though RTTP is a very crucial component, its performance can be still ameliorated by a random collaboration with IDP.The comparative results of RST and HMA 1 clearly shows the relevance of the crossover operator and the population-based framework.

Analysis under the population-based framework
Given that both the local refinement procedures and the crossover operator are effective, this part of analysis investigates how they should be combined to achieve the best performance.For this purpose, we propose another set of algorithm variants based on a population of solutions.The "XO+I" version is obtained by removing the RTTP procedure from HMA, "XO+R" version by removing IDP from HMA. "XO+I→R" version works all the same as HMA, except that the order of using IDP and RTTP is fixed to I→R.Similarly, "XO+R→I" uses the order R→I.Table 6 summarizes the results of these algorithm variants, along with those of HMA 1 .Without surprise, using only IDP in the local refinement procedure shows a rather poor performance, while employing RTTP in the local refinement procedure leads to a much better performance.Compared to "XO+R", including an additional IDP in the local refinement procedure never leads to a definitely better performance if the order of using IDP and RTTP is fixed.However, if IDP and RTTP are used in a random order as in HMA, a better performance can be observed in terms of both BestGap and AvgGap.

Analysis on the pool updating strategy
The third part of the analysis investigates the effectiveness of our pool updating strategy, which uses the hamming distance to control the diversity of the population.We therefore compare the adopted strategy to a traditional "pool worst" strategy which simply replaces the worst solution in terms of fitness in the population, leading to an algorithm variant denoted as HMA P W . HMA P W was tested under two termination criteria: 500 generations (HMA P W 1 ) and 2000 generations (HMA P W 2 ), and the outcomes are compared to those of HMA 1 and HMA 2 .The computational results are summarized in Table 7, from which we can clearly see that HMA 1 performs better than HMA P W 1 , and the superiority of HMA 2 relative to HMA P W 2 enlarges when more computing time is allowed.This experiment confirms the usefulness of the diversity control mechanism used in our HMA.
The capacitated arc routing problem (CARP) is of great practical interest and represents a significant computational challenge due to its NP-hardness.We developed a new hybrid metaheuristic approach (HMA) for effectively solving CARP, which employs a randomized tabu thresholding procedure (RTTP) coupled with an infeasible descent procedure to explore both feasible and infeasible regions.HMA relies on a specialized route-based crossover operator to generate diversified and promising new solutions.Thanks to its qualityand-distance based pool updating strategy, HMA prevents the search process from premature convergence.
The proposed approach demonstrates an excellent performance over the eight sets of 191 popular CARP benchmarks.Specifically, on the 7 sets of 181 classical instances, HMA with a standard parameter setting outperforms the current best performing algorithms, in terms of both solution quality and computational efficiency.HMA further improves its own performance when more computing time is available (to run 2000 generations), attaining the best known results for all 181 cases including 6 improved new best results.HMA also proves to be scalable to handle the last set of 10 large-sized instances, by obtaining 9 new best results, dominating the current state-of-the-art algorithms including the approaches which were specially designed for the large-sized CARP instances.We additionally conducted experiments to analyze the contribution of the two sub-procedures for local refinement, the relevance of the route-based crossover operator (and thus the population-based framework), the strategy for combining crossover with the local optimization procedure, as well as the quality-and-distance pool updating strategy.
Finally, we observe that the proposed method can be adapted to handle other CARP variants with slight modifications of the route-based crossover operator and of the local refinement procedure to accommodate additional constraints.

A Appendix
This appendix shows the detailed results of our HMA algorithm on the 181 conventional CARP instances (Table A.1-A.7).We also include in Table A.3 the new best known results (BKRs) discovered by HMA on egl instances with various parameter settings (indicated by HMA * ) other than the standard setting given in Table 2.

Algorithm 2 :
Outline of the RTTP for CARP Data: P -a CARP instance, St -a set of tabu timing values, r -threshold ratio, W -the number of non-improving attractors visited, S 0 -an initial solution; Result: the best solution S * found so far; O ← {IV, SI, DI, SW, T O} ; /* O contains five move operators */ S * ← S 0 ; /* S * records the global best solution */ S ← S 0 ; /* S records the current solution */ f p ← f (S * ); ; /* f p records the best local optimum objective value */ w ← 0 ; /* set counter for consecutive non-improving local optima */ while w < W do T ← random select(St) ; // Mixed Phase for k ← 1 to T do Randomly shuffle all operators in O ; for each o ∈ O do Randomly shuffle tasks of S in M ; for each i ∈ M do (S, S * ) ← Apply operator o to task i by searching MOVE CL(i, S, o) and accepting a move according to the quality threshold; // Improving Phase while Improving moves can be found do Randomly shuffle all operators in O ; for each o ∈ O do Randomly shuffle tasks of S in M ; for each i ∈ M do (S, S * ) ← Apply operator o to task j by searching MOVE CL(i, S, o) and accepting the first met improving move; If the improved solution reaches the lower bound LB, HMA terminates immediately and returns this solution (Line 11-12).Otherwise, HMA ends a generation by updating the recorded best solution and the population with the offspring solution S c (Line 13, see Sect. 3. 8, seeSect 4.4).RTTP requires a threshold ratio which is probabilistically chosen among the values of a given set Sr according to the probability formula: P r(i) = Cnt(i)/ |Sr| i=1 Cnt(i), where P r(i) denotes the probability of selecting the i th value of Sr (Line 9).Algorithm 1: Main Scheme of HMA for CARP Data: P -a CARP instance; P size -population size; St -a set of tabu timing values; W -the number of non-improving attractors visited; Sr -a set of threshold ratios; LB -the best known lower bound of P Result: the best solution S * found // Population initialization, Section 3.2 P OP ← P ool Initialization(P size) ; S * ← Best(P OP ) ; /* S * records the best solution found so far */ for i = 1 to |Sr| do Cnt(i) = 1 ; /* Initialize the counter array Cnt */ // Main search procedure while stopping condition not reached do Randomly select two solutions S 1 and S 2 from P OP ; S 0 ← Crossover(S 1 , S 2 ) ; /* Route-based crossover, Sect.3.3 */ Od ← random determine(RT T P → IDP, IDP → RT T P ) ; /* Determine the order of conducting RTTP and IDP, Sect.4.4 */ k ← probabilistic select(Cnt) ; /* Select a ratio, Sect.4.2 */ S 0 ← Local ref ine(P, S 0 , St, Sr(k), W, Od) ; /* Improve S 0 , Sect. 4 */ if f (S 0 ) = LB then return S 0 (S * , P OP ) ← P ool U pdating(S 0 , P OP ) ; /* Pool updating, Sect.3.4 */ if pool updating is successful then Cnt(k) ← Cnt(k) + 1 ; return S *

Table 1
Scaling factors for computers used in the reference algorithms.Our computer (AMD Opteron 4184) is used as the basis.

Table 2
Parameter tuning results , we can see that HMA 1 shows a remarkable performance on all 7 tested instance sets compared to the multiple-run reference algorithms.Indeed, it attains the largest number of best known results for all 7 data sets and the lowest average gap to the best known results for 6 out of 7 sets.Compared to Ant 12 and MAENS which, like HMA 1 , are both population-based algorithms, HMA 1 clearly shows its dominance in terms of both best results and average results.For set D, HMA 1 is the only algorithm which is able to find all BKRs.Additionally, HMA 1 obtains improved best known results on three well-studied instances from set egl.By increasing the HMA 1 termination criterion of 500 generations to 2000 generations, HMA 2 achieves a still better performance, always obtaining equal or better results in terms of both #Best and AvgGap.In particular, for set egl, HMA 2 discovers 6 new BKRs and matches 6 more BKRs, leading to #Best = 23 which is significantly larger than those obtained by the reference algorithms.HMA 2 is able to achieve overall 180 current or new BKRs out of 181 instances with one standard parameter setting, while the previous BKRs are compiled from many previous articles, among which some were obtained with parameters specifically tuned for individual instance.

Table 3
Comparative statistical results on the 7 classical instance sets: gdb, val, egl, C-F.The best result of each row is indicated in bold.The results of HMA 2 are starred if they improve on the results of HMA 1 .

Table 4
Comparative results of our HMA algorithm with 4 state-of-the-art algorithms on the 10 instances of EGL-G set.The best average results and best results are in bold.The best result of HMA * is starred if it improves on the BKR.BKRs for 9 out of 10 instances (90%).As the current best and highly specialized algorithm, the best version of the reference algorithm RDG-MAENS is outperformed by HMA 1 .Moreover, the average computational time of HMA 1 is comparable to that of the reference algorithms.HMA 1 requires much less time to find substantially better results than MAENS.Compared to the best version of RDG-MAENS, HMA 1 is also on average faster (1522.86 vs. 1633.86seconds).The fact that HMA 2 further significantly improves on HMA 1 demonstrates that HMA can reach better performance when more computational time is allowed.By testing several other parameter settings, HMA * further discovers 6 improved best known results.

Table 5
Analysis of the components of the local refinement procedure using a single solution approach.

Table 6
Analysis of different combinations of the crossover operator and the local refinement procedure using a population of solutions.

Table 7
Analysis of the quality-and-distance pool updating strategy.

Table A .
1 Detailed comparative results of HMA algorithm with five state-of-the-art algorithms on the 23 instances of Set gdb.The average results of all algorithms and the best results of HMA 1 (500 generations) are highlighted in bold if they match the BKRs.

Table A .
2 Detailed comparative results of HMA algorithm with 6 state-of-the-art algorithms on the 34 instances of Set val.The average results of all algorithms and the best results of HMA Detailed comparative results of HMA algorithm with five state-of-the-art algorithms on the 24 instances of Set egl.The average results of all algorithms and the best results of HMA 1 (500 generations) and HMA 2 (2000 generations) are highlighted in bold if they match the BKRs.The best results of HMA * are starred if they are new BKRs.The optimal BKRs are underlined.

Table A .
4Detailed comparative results of HMA algorithm with 4 state-of-the-art algorithms on the 25 instances of Set C. The average results of all algorithms and the best results of HMA 1 (500 generations) and HMA 2 (2000 generations) are highlighted in bold if they match the BKRs.The optimal BKRs are underlined.

Table A .
5Detailed comparative results of HMA algorithm with 4 state-of-the-art algorithms on the 25 instances of Set D. The average results of all algorithms and the best results of HMA 1 (500 generations) and HMA 2 (2000 generations) are highlighted in bold if they match the BKRs.The optimal BKRs are underlined.Table A.6 Detailed comparative results of HMA algorithm with 4 state-of-the-art algorithms on the 25 instances of Set E. The average results of all algorithms and the best results of HMA 1 (500 generations) and HMA 2 (2000 generations) are highlighted in bold if they match the BKRs.The optimal BKRs are underlined.Table A.7 Detailed comparative results of HMA algorithm with 4 state-of-the-art algorithms on the 25 instances of Set F. The average results of all algorithms and the best results of HMA 1 (500 generations) are highlighted in bold if they match the BKRs.The optimal BKRs are underlined.