Randomized Heuristic Repair For Large-Scale Multidimensional Knapsack Problem
Jean P. Martins
Ericsson Research, ER, Brazil
[email protected]
arXiv:2405.15569v1 [cs.AI] 24 May 2024
Abstract
The multidimensional knapsack problem (MKP) is an NP-hard combinatorial optimization problem
whose solution consists of determining a subset of items of maximum total profit that does not violate
capacity constraints. Due to its hardness, large-scale MKP instances are usually a target for metaheuristics,
a context in which effective feasibility maintenance strategies are crucial. In 1998, Chu and Beasley
proposed an effective heuristic repair that is still relevant for recent metaheuristics. However, due to its
deterministic nature, the diversity of solutions this heuristic provides is not sufficient for long runs. As
a result, the search ceases to find new solutions after a while. This paper proposes an efficiency-based
randomization strategy for the heuristic repair that increases the variability of the repaired solutions
without deteriorating their quality, thereby improving the overall results.
1 Introduction
The Multidimensional Knapsack Problem (MKP) is a well-known strongly NP-Hard combinatorial optimization
problem [Freville, 2004, Puchinger et al., 2010]. To solve an instance of the MKP one must choose, from a set
of n items, a subset that yields the maximum total profit. Every chosen item contributes a profit pj > 0
(j = 1, . . . , n) but also consumes wij of each available resource ri > 0 (i = 1, . . . , m). Therefore, a solution
is feasible only if the total of resources it consumes does not surpass the amount available. By representing
solutions as n-dimensional binary vectors, the following integer programming model defines the MKP:
max f(x) = Σ_{j=1}^{n} pj xj,
subject to Σ_{j=1}^{n} wij xj ≤ ri,  i = 1, . . . , m,
           xj ∈ {0, 1},  j = 1, . . . , n.
The MKP is considerably harder than its uni-dimensional counterpart, not admitting an efficient
polynomial-time approximation scheme even for m = 2 [Kulik and Shachnai, 2010]. Furthermore, MKP’s
hardness increases with m, and larger instances still cannot be efficiently solved to optimality [Mansini and
Speranza, 2012]. Due to these limitations, metaheuristics have been the most successful alternatives to
solve large instances (OR-Library1). Hybrid methods, incorporating tabu search, evolutionary algorithms,
linear programming, and branch and bound techniques, produced most of the best-known solutions for such
instances [Puchinger et al., 2010].
1 http://people.brunel.ac.uk/ mastjjb/jeb/info.html
The results of Chu and Beasley [1998] can be considered a landmark regarding metaheuristics for the MKP.
The method they proposed, denoted the Chu & Beasley Genetic Algorithm (CBGA), was one of the first
metaheuristics to deal with large-scale MKP instances, leading to several succeeding research studies Gottlieb
[2000, 2001], Tavares et al. [2006, 2008]. Additionally, the heuristic repair applied by the CBGA has been an
effective alternative for feasibility maintenance, commonly applied by evolutionary algorithms and related
techniques. As a result, there have been many attempts to improve or replace it Kong et al. [2008], Wang
et al. [2012b,a], Martins et al. [2013], Martins and Delbem [2013], Chih et al. [2014], Azad et al. [2014],
Martins and Ribas [2020], along with attempts to explain it Martins et al. [2014b], Martins and Delbem
[2016, 2019].
This paper revisits Chu and Beasley [1998]’s heuristic repair and tackles one of its weaknesses: determinism.
To overcome this limitation, we propose a randomized heuristic repair and compare its performance against
CBGA’s results. Section 2 reviews the main concepts needed to define the type of heuristic repair discussed
in the paper. Section 3 describes how to enable an effective randomization. Sections 4, 5 and 6 present the
implementation of the proposed heuristic repair on top of the CBGA, along with the analysis of the results.
Section 7 concludes the paper.
2 Background
Many metaheuristics produce infeasible candidate solutions during the search. For the MKP, an infeasible
solution is a subset of items whose resource consumption exceeds the available amounts, i.e., it violates
the capacity constraints. Therefore, the only way to repair an infeasible solution is to remove some of
the items it contains. However, removing a particular item may free enough resources to allow the
addition of another item. As a result, a heuristic repair for the MKP usually consists of two phases:
(1) DROP items, (2) ADD items.
Naturally, removing items of high value and low resource consumption in exchange for items of low
value and high resource consumption will yield a solution of low total profit. Therefore, a useful heuristic
repair must remove and insert items guided by an ordering that favors highly valuable solutions.
Algorithm 1 formalizes these ideas: consider a candidate solution x and an ordering O. The ordering indicates
how to compare items in terms of the total value of the solutions they compose. Assume a non-increasing
order of value for the items, i.e., Oi is probably better than Oj if i < j.
Algorithm 1 heuristic-repair(x, O)
1: for all j = On , . . . , O1 do ▷ DROP items:
2: if x is feasible then break
3: Remove j from the knapsack
4: for all j = O1 , . . . , On do ▷ ADD items:
5: if j fits in the knapsack then
6: Add j to the knapsack
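For concreteness, the two phases of Algorithm 1 can be sketched in Python as follows. This is an illustrative transcription, not the paper’s implementation: `weights[i][j]` stands for wij, `capacities[i]` for ri, and `order` holds item indices sorted from most to least efficient.

```python
def heuristic_repair(x, order, weights, capacities):
    """DROP/ADD repair (Algorithm 1): x is a 0/1 list, order lists item
    indices from most to least efficient; x is repaired in place."""
    m, n = len(capacities), len(x)
    used = [sum(weights[i][j] for j in range(n) if x[j]) for i in range(m)]

    # DROP phase: remove items, least efficient first, until feasible.
    for j in reversed(order):
        if all(used[i] <= capacities[i] for i in range(m)):
            break
        if x[j]:
            x[j] = 0
            for i in range(m):
                used[i] -= weights[i][j]

    # ADD phase: insert items, most efficient first, whenever they fit.
    for j in order:
        if not x[j] and all(used[i] + weights[i][j] <= capacities[i]
                            for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += weights[i][j]
    return x
```

For example, with one resource of capacity 5, item weights (3, 4, 2), and items already sorted by efficiency, repairing the all-ones solution drops the least efficient items until feasible and then re-adds the last item, which now fits.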
Algorithm 1 will produce high-quality solutions only if the ordering matches the efficiency of the items.
For example, in the uni-dimensional knapsack problem, every item j has a profit of pj and consumes wj from
the resource available. Therefore, by considering the items in non-increasing order of efficiency ej = pj /wj , we
would have an ordering that favors valuable items of small dimensions during the heuristic repair. Efficiency
measures are also desirable for the multidimensional case. However, due to the multiple resources wij involved,
it is not straightforward to define a denominator for the equation. The usual approach consists of a weighted
sum of the resources consumed by every item, as defined by Equation (1).
ej = pj / (Σ_{i=1}^{m} λi · wij),  ∀j = 1, . . . , n.  (1)
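Equation (1) can be transcribed directly. Note that the weight values below are arbitrary placeholders used purely for illustration; in the setting discussed next, they would come from the dual of the LP relaxation.

```python
def efficiencies(profits, weights, lam):
    """Weighted-sum efficiencies e_j = p_j / sum_i lam_i * w_ij (Equation 1)."""
    n, m = len(profits), len(weights)
    return [profits[j] / sum(lam[i] * weights[i][j] for i in range(m))
            for j in range(n)]
```

With two items, two resources, and unit weights, the efficiencies reduce to profit divided by total resource consumption.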
From a comprehensive set of experiments, Puchinger et al. [2010] argued that the most effective weights
λi would be the optimal solutions of the dual linear programming relaxation of the MKP. Indeed, that was the
weight vector employed by Chu and Beasley [1998] in their experiments. We denote by ej^dual the efficiencies
computed by using such weight vectors in Equation (1). Additionally, we denote by Odual the ordering of
the knapsack items that follows from these efficiencies.
Efficiencies provide reasonable estimates of how likely each item is to belong to optimal solutions. Therefore,
if an item has a high efficiency value, it will probably be present in an optimal solution. On the other
hand, if it has a low value, it will most likely be rejected. Unfortunately, as the number of constraints m
grows, the efficiency values of many items become too close to discriminate between them (these are known
as the core items). As a result, for large MKP instances, the ordering Odual is less informative for the
heuristic repair, which hinders its effectiveness. As shown by Martins et al. [2014a], that seems to be the
case for the CBGA, with the algorithm struggling to decide whether core items should be put in the knapsack
even in very long runs.
[Figure 1: Efficiency groups produced by rounding the original efficiencies2 to the first decimal place. Panel (a): original efficiencies; panel (b): efficiencies rounded to the first decimal place. Both panels plot efficiency against item index.]
Figure 1(b) shows the same efficiencies, now rounded to the first decimal place (d = 1). Rounding
makes other plateaus evident, indicating groups of items whose efficiencies are very close, i.e., efficiency
groups. Therefore, by adjusting the number of decimal places used for rounding, we can roughly control the
sizes of these groups. In summary, an efficiency group is a subset of two or more items whose efficiencies are equal
2 Scaled to the [0, 1] interval.
after rounding. Given this fact, we expect that items in the same group can have their ordering randomly
modified without massively deteriorating the effectiveness of the heuristic repair. In this sense, efficiency
groups enable a smooth exploration of the space of orderings during the search.
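Under these definitions, extracting efficiency groups amounts to grouping consecutive positions of the ordering whose rounded efficiencies coincide. A minimal sketch follows; the function name and the choice to represent each group as a list of positions within the ordering are ours, not the paper's.

```python
def get_efficiency_groups(order, eff, d):
    """Return lists of positions in `order` whose items have equal
    efficiencies after rounding to d decimal places (groups of size >= 2)."""
    groups, current, key = [], [], None
    for pos, j in enumerate(order):
        k = round(eff[j], d)
        if current and k == key:
            current.append(pos)
        else:
            if len(current) >= 2:
                groups.append(current)
            current, key = [pos], k
    if len(current) >= 2:
        groups.append(current)
    return groups
```

For five items sorted by efficiency, rounding to one decimal place merges near-equal neighbors into groups while leaving isolated efficiencies ungrouped.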
4 Methodology
In what follows, we evaluate the possible benefits of two simple randomizing operators that modify the
ordering of intra-group items, given an initial ordering Odual.
Random-group swap (rg-swap). Randomly chooses an efficiency group and swaps the positions of two
randomly chosen items in the group.
Random-group shuffle (rg-shuffle). Randomly chooses an efficiency group and shuffles the positions of
possibly all of its items.
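Both operators can be sketched as small in-place permutations of the ordering. Here each group is assumed to be a list of positions within `order` (an assumption of ours, consistent with items of equal rounded efficiency being consecutive in Odual).

```python
import random

def rg_swap(order, groups, rng=random):
    """Swap the items at two randomly chosen positions of one random group."""
    g = rng.choice(groups)
    p, q = rng.sample(g, 2)
    order[p], order[q] = order[q], order[p]

def rg_shuffle(order, groups, rng=random):
    """Shuffle the items occupying the positions of one random group."""
    g = rng.choice(groups)
    items = [order[p] for p in g]
    rng.shuffle(items)
    for p, j in zip(g, items):
        order[p] = j
```

Both operators only permute items within a group, so the set of items and all positions outside the chosen group are left untouched.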
To verify the effectiveness of these operators, we must first decide how frequently randomization should
take place. Since the CBGA is effective at finding high-quality solutions quickly, intensive randomization
during the first generations would slow down its progress. To avoid this drawback, we tie the exploration
of new orderings to the number of improving solutions produced by the heuristic repair during a generation.
Whenever an ordering ceases to produce improving solutions, randomization takes place. The next section
describes a modified version of the CBGA that implements this strategy.
Algorithm 2 CBGA-newsolution(P)
1: Select x1 ∈ P by a binary tournament
2: Select x2 ∈ P by a binary tournament
3: x ← uniformCrossover(x1 , x2 )
4: flipTwoRandomBits(x)
5: return x.
CBGA is a steady-state algorithm, which means it generates and evaluates one solution at a time.
However, to control the activation of the randomization of efficiency orderings, we must count the
number of improving solutions produced during a generation. Therefore, we adapted the CBGA to make
that possible, with a generation consisting of N attempts to produce improving solutions (where N is the
population size). Every time CBGA produces a solution x, it applies the heuristic repair to it. The resulting
solution is an improvement if it is both unique and better than the worst solution in P (see Algorithm 3).
The adapted algorithm CBGAdϵ supports the randomization of efficiency groups, with d referring to the
number of decimal places used for rounding the efficiencies and ϵ standing for efficiency (see Algorithm 4)3.
The first step in CBGAdϵ is to sort the indexes of the items in non-increasing order according to edual;
the next step stores the efficiency groups in g. A population of N random candidate solutions is then
generated, all candidate solutions undergo heuristic repair, and the search starts. At the end of every
generation, if no improving solutions were produced, the randomization of the ordering takes place (where
rand-ordering is a placeholder for rg-swap or rg-shuffle).
3 https://gitlab.com/jeanpm/mkp-egroups
Algorithm 3 CBGA-generation(P, N, Odual )
1: m ← 0
2: improvements ← 0
3: while m < N do
4: x ← CBGA-newsolution(P)
5: heuristic-repair(x, Odual )
6: if x is unique and better than the worst solution in P then
7: x substitutes the worst solution in P
8: improvements ← improvements +1
9: m←m+1
10: return improvements.
Algorithm 4 CBGAdϵ
Require: The population size N
1: O dual ← sort({1, . . . , n}, edual )
2: g ← get-efficiency-groups(O dual , d)
3: Generates N random candidate solutions in P
4: heuristic-repair(x, O dual ), ∀x ∈ P
5: while Stop criteria not met do
6: m ← CBGA-generation(P, N , Odual )
7: if m ≤ 0 then rand-ordering(Odual , g)
8: return the best solution in P.
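The control flow of Algorithm 4 reduces to a loop that triggers the randomization whenever a generation yields no improvements. A minimal sketch, with the generation and randomization steps passed in as callables (this framing is ours; the paper embeds these steps directly in the CBGA):

```python
def cbga_eps_loop(run_generation, rand_ordering, order, groups, max_generations):
    """Run generations; after any generation that produces no improving
    solutions, randomize the ordering within an efficiency group
    (cf. Algorithm 4, line 7)."""
    for _ in range(max_generations):
        improvements = run_generation(order)
        if improvements <= 0:
            rand_ordering(order, groups)
```

A generation that keeps improving never perturbs the ordering; once the search stalls, every stalled generation yields one randomization step.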
5 Experiments
The instances provided by Chu and Beasley [1998] have been widely used in the literature to benchmark
methods for the MKP, and it will also be the basis for our experiments. In this paper we use larger-scale
instances provided by Glover, the set comprises instances of size n ∈ {100, 150, 200, 500, 1500, 2500}, for each
size there are instances of dimension m ∈ {15, 25, 50, 100}. Not every pair n − m is available, there are 11
instances4 .
To evaluate how the rounding impacts the efficiency groups, we consider two options d = 1, 2. With d = 1
the efficiency groups are large, enabling a more intense exploration of the ordering space. As we increase d,
the efficiency groups shrink, restricting the exploration of the orderings.
Regarding the randomizing operators, there are also two options. Given a randomly chosen efficiency
group, the rg-swap operator exchanges the positions of two items, leading to a subtle modification of the
previous ordering. The rg-shuffle operator, on the other hand, possibly exchanges the positions of all
items in the group, leading to a drastic modification of the previous ordering.
The combination of the two operators with the two rounding options d = 1, 2 yields four algorithms to be
compared with the original CBGA, which we abbreviate as follows:
We ran all algorithms until finding a solution of quality equivalent to the best known, or until performing
10^8 evaluations of the objective function. The average results from 30 runs are reported for every instance.
The tables of results contain four main parts:
1. Instance name,
2. Best known: indicates the objective value of the best known feasible solution in the literature,
3. Gap: group of columns indicating the average gap of the solutions found in relation to the best known;
4. Time: group of columns indicating the average running time (in seconds) of every algorithm.
Additionally, a star symbol next to the gap indicates that the best-known solution was found at least
once during the runs. The last row in every table summarizes the results by giving the overall number
of “wins” of every algorithm, regarding solution quality (smallest gap) and running time (fastest).
7 Conclusion
The use of efficiency measures to estimate the quality of knapsack items is a common heuristic to guide
the search for solutions to the MKP. Such estimates induce an ordering of the items that can be used
during the search to prefer some items over others. However, if the ordering employed is not accurate, it
might bias the search away from regions of the search space containing high-quality solutions.
This paper evaluated how a search in the space of orderings could improve the search for solutions. To
that end, we proposed a randomization strategy that modifies the items’ ordering whenever improving
solutions cease to be found. Such a strategy relies on the concept of efficiency groups to avoid deteriorating
the heuristic information provided by the initial ordering. Four variants of our proposal were implemented
and compared to the original CBGA on 270 OR-Library MKP instances. The variants differ in the size of
the efficiency groups they induce (which depends on the parameter d) and in how intensely they modify the
items’ ordering (swapping two items or shuffling all items within a group).
The results were encouraging, and all variants of the randomized heuristic repair led to improvements
over the CBGA. Such improvements consist of considerably smaller running times (more than ten times
faster in some cases), smaller average gaps to the best-known solutions, and the ability to find solutions
equivalent to the best known in cases where the CBGA had failed.
As a drawback, the quality of the results seems to depend on the randomization strategy chosen to modify
the efficiency groups and on the characteristics of the individual instances. So far, we could not identify a
pattern for when to choose an aggressive strategy (rg-shuffle) or a less disruptive one (rg-swap). Additionally,
there were cases where the CBGA still achieved the best results overall, which makes the analysis of the
results even more difficult.
For future work, it would be interesting to understand how the rounding parameter d and the randomization
strategies interact on different problem instances, in order to propose a more robust combination. Another
possibility would be to investigate the usefulness of efficiency groups for improving local-search procedures,
and to verify whether they could be employed to define instance-specific strategies for modifying the orderings.
References
Md. Abul Kalam Azad, Ana Maria A.C. Rocha, and Edite M.G.P. Fernandes. Improved binary artificial fish
swarm algorithm for the 0–1 multidimensional knapsack problems. Swarm and Evolutionary Computation,
14(0):66–75, 2014. doi:10.1016/j.swevo.2013.09.002.
Mingchang Chih, Chin-Jung Lin, Maw-Sheng Chern, and Tsung-Yin Ou. Particle swarm optimization with
time-varying acceleration coefficients for the multidimensional knapsack problem. Applied Mathematical
Modelling, 38(4):1338–1350, 2014. doi:10.1016/j.apm.2013.08.009.
P.C. Chu and J.E. Beasley. A Genetic Algorithm for the Multidimensional Knapsack Problem. Journal of
Heuristics, 4:63–86, 1998. doi:10.1023/A:1009642405419.
Arnaud Freville. The multidimensional 0–1 knapsack problem: An overview. European Journal of Operational
Research, 155(1):1–21, 2004. doi:10.1016/S0377-2217(03)00274-1.
D.E. Goldberg. Genetic algorithms and Walsh functions: Part I, a gentle introduction. Complex Systems, 3
(2):129–152, 1989.
Jens Gottlieb. Permutation-based Evolutionary Algorithms for Multidimensional Knapsack Problems. In
Proceedings of the 2000 ACM Symposium on Applied Computing - Volume 1, SAC ’00, pages 408–414.
ACM, 2000. doi:10.1145/335603.335866.
Jens Gottlieb. On the Feasibility Problem of Penalty-Based Evolutionary Algorithms for Knapsack Problems.
In Applications of Evolutionary Computing, volume 2037 of Lecture Notes in Computer Science, pages
50–59. Springer Berlin Heidelberg, 2001. doi:10.1007/3-540-45365-2_6.
Min Kong, Peng Tian, and Yucheng Kao. A new ant colony optimization algorithm for the multidimensional
Knapsack problem. Computers & Operations Research, 35(8):2672–2683, 2008. doi:10.1016/j.cor.2006.12.029.
Ariel Kulik and Hadas Shachnai. There is no EPTAS for two-dimensional knapsack. Information Processing
Letters, 110(16):707–710, 2010. doi:10.1016/j.ipl.2010.05.031.
Renata Mansini and M. Grazia Speranza. Coral: An exact algorithm for the multidimensional knapsack
problem. INFORMS Journal on Computing, 24(3):399–415, 2012. doi:10.1287/ijoc.1110.0460.
Jean P. Martins and Alexandre C.B. Delbem. Pairwise independence and its impact on estimation of distribu-
tion algorithms. Swarm and Evolutionary Computation, 27:80–96, apr 2016. doi:10.1016/j.swevo.2015.10.001.
Jean P. Martins and Alexandre C.B. Delbem. Reproductive bias, linkage learning and diversity preservation
in bi-objective evolutionary optimization. Swarm and Evolutionary Computation, 48:145–155, aug 2019.
doi:10.1016/j.swevo.2019.04.005.
Jean P. Martins and Bruno C. Ribas. A randomized heuristic repair for the multidimensional knapsack
problem. Optimization Letters, 15(2):337–355, jun 2020. doi:10.1007/s11590-020-01611-1.
Jean P. Martins, Constancio Bringel Neto, Marcio K. Crocomo, Karla Vittori, and Alexandre C. B. Delbem.
A Comparison of Linkage-learning-based Genetic Algorithms in Multidimensional knapsack Problems. In
IEEE Congress on Evolutionary Computation, volume 1 of CEC’2013, pages 502–509, June 20–23, 2013.
doi:10.1109/CEC.2013.6557610.
Jean P. Martins, Humberto Longo, and Alexandre C.B. Delbem. On the effectiveness of genetic algorithms
for the multidimensional knapsack problem. In Proceedings of the Companion of Genetic and Evolutionary
Computation, GECCO Comp ’14, pages 73–74. ACM, 2014a. doi:10.1145/2598394.2598477.
Jean Paulo Martins and Alexandre Claudio Botazzo Delbem. The influence of linkage-learning in the linkage-
tree GA when solving multidimensional knapsack problems. In Proceeding of the conference on Genetic
and Evolutionary Computation, GECCO ’13, pages 821–828. ACM, 2013. doi:10.1145/2463372.2463476.
J.P. Martins, C.M. Fonseca, and A.C.B. Delbem. On the performance of linkage-tree genetic algorithms for the
multidimensional knapsack problem. Neurocomputing, 146:17–29, 2014b. doi:10.1016/j.neucom.2014.04.069.
Jakob Puchinger, Günther R Raidl, and Ulrich Pferschy. The multidimensional knapsack problem: Structure
and algorithms. INFORMS Journal on Computing, 22(2):250–265, 2010.
J. Tavares, F.B. Pereira, and E. Costa. The Role of Representation on the Multidimensional Knapsack
Problem by means of Fitness Landscape Analysis. In IEEE Congress on Evolutionary Computation,
CEC’2006, pages 2307–2314, 2006. doi:10.1109/CEC.2006.1688593.
J. Tavares, F.B. Pereira, and E. Costa. Multidimensional Knapsack Problem: A Fitness Landscape Analysis.
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 38(3):604–616, june 2008.
ISSN 1083-4419. doi:10.1109/TSMCB.2008.915539.
Ling Wang, Xiping Fu, Yunfei Mao, Muhammad Ilyas Menhas, and Minrui Fei. A novel modified
binary differential evolution algorithm and its applications. Neurocomputing, 98(0):55–75, 2012a.
doi:10.1016/j.neucom.2011.11.033. Bio-inspired computing and applications (LSMS-ICSEE ’ 2010).
Ling Wang, Sheng-yao Wang, and Ye Xu. An effective hybrid EDA-based algorithm for solving
multidimensional knapsack problem. Expert Systems with Applications, 39(5):5593–5599, 2012b.
doi:10.1016/j.eswa.2011.11.058.