This paper presents a bilevel programming model for optimizing compensation and routing decisions in last-mile delivery logistics, where a platform allocates orders to independent carriers who maximize their own profits. It introduces two settings: one with fixed compensation margins and another allowing margin decisions, and proposes single-level reformulations and a branch-and-cut algorithm for solving the bilevel models. Extensive computational tests are conducted to evaluate the proposed formulations and their impact on the platform's profit and carrier acceptance of delivery requests.

A bilevel approach for compensation and routing decisions in last-mile delivery


Martina Cerulli¹, Claudia Archetti¹, Elena Fernández², and Ivana Ljubić¹
arXiv:2304.09170v3 [math.OC] 6 Jun 2024

¹ Department of Information Systems, Decision Sciences and Statistics, ESSEC Business School, 95021 Cergy Pontoise, France
² Department of Statistics and Operational Research, Universidad de Cádiz, Puerto Real, Spain

Abstract
In last-mile delivery logistics, peer-to-peer logistic platforms play an important role
in connecting senders, customers, and independent carriers to fulfill delivery requests.
Since the carriers are not under the platform’s control, the platform has to antici-
pate their reactions, while deciding how to allocate the delivery operations. Indeed,
carriers’ decisions largely affect the platform’s revenue. In this paper, we model this
problem using bilevel programming. At the upper level, the platform decides how to
assign the orders to the carriers; at the lower level, each carrier solves a profitable tour
problem to determine which offered requests to accept, based on her own profit max-
imization. Possibly, the platform can influence carriers’ decisions by determining also
the compensation paid for each accepted request. The two considered settings result in
two different formulations: the bilevel profitable tour problem with fixed compensation
margins and with margin decisions, respectively. For each of them, we propose single-
level reformulations and alternative formulations where the lower-level routing variables
are projected out. A branch-and-cut algorithm is proposed to solve the bilevel models,
with a tailored warm-start heuristic used to speed up the solution process. Extensive
computational tests are performed to compare the proposed formulations and analyze
solution characteristics.

1 Introduction
The term “last-mile delivery” refers to the final leg in a Business-To-Consumer delivery
service whereby the consignment is delivered to the recipient, either at the recipient’s home
or at a collection point (Archetti and Bertazzi, 2021). In today’s last-mile delivery systems, a
rather common scenario involves a platform that receives orders from customers who require
delivery (Punel and Stathopoulos, 2017; Agatz et al., 2012; Le et al., 2019). The platform
charges customers for delivery, but delivery operations are performed by independent carriers
who have spot contracts with the platform, and receive compensation for each delivered order.
The platform is responsible for allocating the orders to the carriers, and, possibly, determining
the compensation for each order. On the other hand, carriers decide whether to accept the
assignments offered by the platform. This situation contrasts with delivery systems in which
there is a single decision maker and applies to real-world routing applications where multiple
stakeholders pursue their own objectives, which may be conflicting.
Clearly, the goal of the platform is to maximize its own profit. However, carriers also act
on the basis of their own profit maximization, and may reject delivery assignments
if the corresponding compensation is deemed insufficient. Thus, the platform has to
anticipate carriers’ decisions in order to optimize its own profit. In this paper, we model this
sequential decision-making process using bilevel programming (Dempe, 2002). The platform
acts as a leader, getting a profit associated with each delivered order, corresponding to the
price paid by the customer minus the compensation given to the carriers. At the lower level,
each carrier maximizes the difference between the compensation offered by the platform for
the assigned order and the routing cost incurred in the delivery.
Problems of this type are faced in practice by so-called peer-to-peer transportation
platforms such as, among others, Uber Eats, Uber’s food delivery service that connects
customers with local restaurants and independent delivery drivers; Glovo, which offers a wide
range of delivery services, including food, groceries, and courier services; and Amazon Flex, which
allows individuals to deliver Amazon packages using their own vehicles, offering flexibility
and earning opportunities. These are just a few examples, and we refer the reader to
Alnaggar et al. (2021) and Horner et al. (2021) for a more complete list of existing
peer-to-peer delivery platforms. These platforms dynamically connect service (e.g., a ride, a delivery)
requests with independent carriers that are not under the platform’s control. As a result, the
platform cannot guarantee that an offered request will be accepted by the service providers
(carriers), who base their selection on their net profits. In some cases, the platform may
influence carriers’ decisions by selecting not only the requests offered to each of them, but
also the compensation associated with each request.
Throughout this paper we assume that the carriers deliver all accepted orders in one
single route, hence the lower-level problem faced by each driver corresponds to a Profitable
Tour Problem (PTP) (Feillet et al., 2005). The PTP belongs to the class of Vehicle Routing
Problems (VRPs) with profits. In the PTP, a vehicle, starting from a central depot, can visit a
subset of the available customers, collecting a specific profit whenever a customer is visited.
The objective of the problem is the maximization of the net profit, i.e., the total collected
profit minus the total route cost.
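
To make the PTP definition concrete, the following Python sketch enumerates every subset of customers and every visiting order on a tiny instance and keeps the tour with the largest net profit. The function names `route_cost` and `solve_ptp` and the instance data are illustrative assumptions, not part of the original formulation; exhaustive enumeration is only viable for a handful of customers.

```python
from itertools import combinations, permutations


def route_cost(order, cost, depot=0):
    """Cost of the closed tour depot -> order[0] -> ... -> order[-1] -> depot."""
    stops = (depot, *order, depot)
    return sum(cost[a][b] for a, b in zip(stops, stops[1:]))


def solve_ptp(profit, cost, depot=0):
    """Enumerate every subset and visiting order; return (net profit, route).

    profit: dict customer -> profit collected when the customer is visited
    cost:   square matrix (list of lists) of arc costs, depot included
    """
    best_value, best_route = 0.0, ()  # visiting nobody yields a net profit of 0
    customers = sorted(profit)
    for size in range(1, len(customers) + 1):
        for subset in combinations(customers, size):
            collected = sum(profit[i] for i in subset)
            for order in permutations(subset):
                value = collected - route_cost(order, cost, depot)
                if value > best_value:
                    best_value, best_route = value, order
    return best_value, best_route


# Hypothetical 3-customer instance (node 0 is the depot).
cost = [
    [0, 2, 2, 5],
    [2, 0, 1, 4],
    [2, 1, 0, 4],
    [5, 4, 4, 0],
]
profit = {1: 3, 2: 3, 3: 4}
best_value, best_route = solve_ptp(profit, cost)
```

In this hypothetical instance no single customer is worth a dedicated trip, but customers 1 and 2 are close enough to each other that serving both is profitable, while customer 3 is too far from the depot to be included.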
Two different settings for the studied problem are considered: the Bilevel PTP with Fixed
Margins (BPFM) and the Bilevel PTP with Margin Decisions (BPMD). In the first setting,
the compensations paid to the carriers are fixed in advance. At the upper level, a platform
offers disjoint subsets of a given set of items (parcels/orders) to the set K of carriers, while, at
the lower level, each driver k ∈ K solves a PTP and decides which items she accepts to serve,
as well as the route she follows. Both the platform and the carriers aim to maximize their
net profits, calculated differently at the two levels: the profit of the platform is the difference
between the price paid by the customers and the compensation paid to the carriers; the profit
of the carriers is the difference between the compensation received for the delivered items and
the routing costs. We solve both a value function reformulation and a no-good cuts based
reformulation of this problem through a branch-and-cut approach. The second setting models
the case where the leader may influence the decisions of the carriers by determining not only
the sets of items to offer to each driver but also the compensation paid for each of them. We
propose two different bilevel formulations for the latter problem and solve a value function
reformulation (in two versions) through a branch-and-cut approach. We further discuss the
link between these two bilevel formulations, before comparing them computationally.
The contributions of this paper can be summarized as follows:

• We show that considering a single-level formulation for the problem leads either to an
overestimation or to an underestimation of the platform’s profit. The former is obtained
assuming that the carriers can only accept the whole offered bundle of deliveries or
refuse it. The latter is obtained assuming a profit-sharing between the platform and
the set of carriers.

• We introduce the bilevel PTP with fixed compensation margins (i.e., BPFM), and
with margin decisions (i.e., BPMD), where the platform acts as the leader who assigns
customer orders to carriers, who act as followers; in the BPMD, the leader also defines
the compensation for each item. With it, we fill the gap existing in the last-mile
delivery literature regarding simultaneously considering upper-level compensation and
lower-level routing decisions.

• We provide bilevel formulations for both problems, as well as their corresponding single-
level reformulations. We start with the BPFM, and its formulation, properties, and
single-level reformulations are used to introduce the more complex case, i.e., the BPMD.
For this problem, we propose two alternative bilevel formulations, which are compared to
prove that they provide equivalent bounds. For readers who are not familiar with bilevel
optimization, which is at the core of our contribution, we present a brief introduction
to this topic in Appendix A.

• We propose alternative formulations where the lower-level routing variables are either
included or projected out. We also introduce some valid inequalities which can be used
to strengthen the proposed formulations.

• We develop a branch-and-cut algorithm for the solution of the problems where optimal-
ity, as well as feasibility cuts, are inserted dynamically. We also propose a warm-start
Mixed-Integer Programming (MIP) heuristic to speed up the solution process. The
heuristic is based on the solution of the BPFM under additional constraints.

• We perform extensive tests on instances adapted from benchmark instances for related
problems to compare the different formulations we propose. We also compare the
bilevel solutions with the ones obtained through two single-level formulations, modeling
different problem settings, with the aim of highlighting the advantage for the platform of
considering the carriers as independent agents optimizing their own profit. We further
analyze the gain of the platform when considering margin decisions in our setting.
The paper is organized as follows. In Section 2 we review the relevant literature. Section 3
formally introduces the problems under study. Then, Section 4 focuses on the case
where the margins are fixed, whereas Section 5 is devoted to the case where the margins are
decision variables, for which we propose two formulations, which we compare in Section 5.3.
In particular, in Section 4.3 we derive, for the problem with fixed margins, a new formu-
lation by projecting out the routing variables of the follower’s problem in the upper-level
formulation; in Section 4.4, we discuss how the proposed formulation changes if the setting
of the followers’ problem is modified by considering a maximum route duration constraint
instead of a capacity constraint (measured as the maximum number of packages that each
carrier can deliver). In Section 6, we introduce some valid inequalities that strengthen the
proposed formulations. In Section 7 the solution approaches for the proposed formulations
are discussed. Section 8 describes the computational experiments and presents the numerical
results. Finally, Section 9 concludes the paper.

2 Literature review
The study of peer-to-peer logistic platforms has experienced a significant increase in recent
years. For a comprehensive overview, we refer the reader to Agatz et al. (2012); Cleophas
et al. (2019); Wang and Yang (2019); Le et al. (2019).
A large body of literature on supplier (carrier, driver) selection in peer-to-peer logistic services
has either overlooked the behavior of suppliers or assumed that their preferences are known in
advance, sometimes together with a carrier’s bid on services (Kafle et al., 2017). Often,
suppliers’ responses are assumed to be predeterminable: all suppliers will accept matches
as long as they are stable (Wang et al., 2018) or meet some constraints, in the form, for
example, of an upper bound on the extra driving time/distance (Masoud and Jayakrishnan,
2017; Arslan et al., 2019). More recently, the setting of peer-to-peer transportation platforms
offering a menu of packages to occasional drivers, which we consider in the current work, has
been addressed in Mofidi and Pazour (2019); Horner et al. (2021); Ausseil et al. (2022). As
in our framework, the suppliers are not employed by the platform, thus the platform does
not have perfect knowledge of the suppliers’ preferences related to which requests they would
be willing to accept. In Mofidi and Pazour (2019), the platform decides the composition
of multiple, simultaneous, personalized recommendations to the suppliers, who then select
from this set. It is assumed that the platform is able to estimate the expected value of
suppliers’ utility for each alternative assignment. A deterministic bilevel optimization model
is thus presented, in which the platform takes as input the expectation of suppliers’ estimated
utilities to make recommendation decisions. Horner et al. (2021) propose another bilevel
formulation, based on the deterministic formulation presented in Mofidi and Pazour (2019),
but adjusted by considering stochastic selection behaviors. A single-level relaxation is then
proposed and a Sample Average Approximation method is used to optimize the expected
value of the objective function over a sample of scenarios for the drivers’ behavior.
Ausseil et al. (2022) also consider a multiple-scenario approach, repeatedly sampling potential
drivers’ selections, solving the corresponding two-stage decision problems, and combining the
multiple different solutions through a consensus function. Neither routing nor compensation
decisions are taken into account in these models, which we instead consider in our paper.
The deficiency in customized incentive systems, in particular, could potentially jeopardize
the satisfaction of both the requester and the deliverer in terms of their utility and profit,
respectively. This is implicitly highlighted in Horner et al. (2021), when stating that the
proposed methods achieve good performances as long as the drivers are well compensated,
i.e., when a percentage of 80% of the platform revenues goes to them. In support of this
statement, Hong et al. (2019) propose a Stackelberg game to model the interaction between
the platform, which decides both the delivery fees and the paths, and the drivers, who, based
on the distance and their utility, make decisions about whether to participate in the delivery
process or not. The computational experiments show that including decisions about the
compensation level in the process can significantly improve delivery efficiency, and reduce
delivery costs compared to traditional delivery methods. In Gdowska et al. (2018), both a
professional delivery fleet and a set of occasional carriers are taken into account. While the
professional fleet is owned by the platform, the occasional carriers are independent, and can
only deliver one parcel. Each delivery request has a fixed probability of being rejected by the
occasional carriers. A compensation mechanism is considered to determine the fee to pay to
each occasional driver in case a delivery request is accepted. Barbosa et al. (2023) extend the
model introduced in Gdowska et al. (2018) by implementing a golden-section search method
that determines the best compensation to offer for each request, considering the probability
of rejection dependent on the compensation.
While compensation decisions have been optimized in a bilevel setting in some of the
studies listed above, none of them explicitly models routing decisions at the lower level,
as is done in this paper. In fact, an important feature of our approaches relies on the
routing nature of the lower-level problem, and in the following, we review works where bilevel
optimization is used to model VRPs, such as Du et al. (2017); Nikolakopoulos (2015); Parvasi
et al. (2019); Marinakis and Marinaki (2008); Ning and Su (2017). All these works propose
metaheuristics to solve the considered problems, and, more specifically, genetic algorithms.
In Du et al. (2017), a multi-depot VRP is considered, and at the upper level, the assignment
of customers to depots is decided, while depots-customers routing decisions are taken at the
lower level. Nikolakopoulos (2015) addresses the VRP with backhauls and time windows,
where a backhaul is a return trip to the depot, during which the vehicles pick up loads from
the visited customers. The goal of the leader is to minimize the number of vehicles involved,
whereas the follower aims to minimize the duration of the routes. A bilevel bi-objective
formulation is proposed in Parvasi et al. (2019) to model the VRP where the involved vehicles
are school buses. At the upper level, a transportation company selects some locations from
a set of potential bus stop locations (first objective) and determines the optimal bus routes
among the selected stops (second objective). At the lower level, students are allocated to a
stop or to another transportation company in order to minimize the time spent on buses. A
bilevel location VRP is studied by Marinakis and Marinaki (2008). The upper level concerns
decisions at the strategic level, i.e., the optimal locations of facilities. The lower level is about
operational decisions regarding optimal vehicle routes. In Ning and Su (2017), a bilevel model
is used to formulate the VRP with uncertain travel times. The leader aims at minimizing the
total waiting times of the customers, and the followers want to minimize the waiting times of
the vehicles before the beginning of customers’ time windows. The uncertain bilevel model
is reformulated into an equivalent deterministic one.
Calvete et al. (2011) consider a multi-depot VRP within a production–distribution plan-
ning problem. At the upper level, a distribution company orders from a manufacturing
company the items that must be supplied to the retailers, while deciding on the allocation
of these retailers (who play the role of customers) to each depot and on the routes that
serve them. The manufacturing company, which is the follower, decides which manufacturing
plants will produce the ordered items. Both players want to minimize their own costs. An
ant colony optimization approach is developed to solve the bilevel model. The same con-
flicting agents are considered in Camacho-Vallejo et al. (2021), but with different objectives:
the distribution company aims at maximizing the profit gained from the distribution process
and minimizing CO2 emissions; the manufacturer aims at minimizing its total costs. The
upper level is thus a bi-objective problem. A tabu search heuristic is designed to obtain non-
dominated feasible solutions for the distribution company. A hybrid algorithm combining
ant colony optimization and tabu search is proposed in Wang et al. (2021) to solve a bilevel
problem modeling the location-routing problem with cargo splitting, under four different
low-carbon policies. The leader is the engineering construction department, which decides
on the distribution center location. The follower is the distribution department, which acts as the
decision-maker solving the VRP.
In Santos et al. (2021), a bilevel formulation is proposed to model the VRP with backhauls,
without taking into account time windows considered in Nikolakopoulos (2015). At the upper
level, a shipper aims at minimizing the transportation costs by integrating delivery and pickup
operations in the routes, whereas at the lower level a set of carriers, acting together, want
to maximize their total net profit. The carriers, who may also serve requests from other
shippers, may not be willing to collaborate with the shipper. To motivate the carriers to
perform integrated routes, the shipper pays them an additional incentive. A reformulation
is used to build an equivalent single-level problem.
In some cases, a bilevel approach has been proposed to address a routing problem even
if the decision maker of both levels is the same. For instance, Marinakis et al. (2007) for-
mulate the capacitated VRP as a bilevel problem, where the first-level decisions concern the
assignment of customers to the routes, and the second-level decisions determine the actual
routes. In Handoko et al. (2015), the last-mile delivery problem faced in an urban consoli-
dation center, which corresponds to a PTP with multiple vehicles, is modeled using bilevel
programming. At the upper level, the customers to serve are selected, in order to maximize
the profit of the carriers’ alliance. At the lower level, a Capacitated VRP deals with the
optimization of the route given the set of selected customers.
None of the works mentioned above deals with BPFM or BPMD so we now introduce a
formal description of these problems.

3 Problems definition and formulations
In this section, we provide a formal description of the problems we address, whose corresponding
mathematical formulations are proposed in the next sections (Sections 4 and 5). After
introducing the notations we will use throughout the paper, as well as the two problems
we address (BPFM and BPMD), we show in Section 3.1 that using single-level formulations
leads to a misprediction of the platform’s profit.
Input sets and parameters
Sets and parameters used in the definition of the problems are listed below.

• $G = (V, A)$: routing network over all nodes $V_0 = \{0, \dots, n\}$; in particular, $V = V_0 \setminus \{0\}$
corresponds to the set of customers to serve, and node 0 to the depot where routes
start and end;

• $K$: index set of carriers/drivers/followers/vehicles;

• $p_i$: price that is paid to the platform if customer $i$ is served;

• $\bar{p}_i^k$: compensation paid by the platform to carrier $k$ if she accepts to serve customer $i$;

• $c_{ij}^k$: weight of arc $(i, j)$ representing travel time for carrier $k$; we assume that arc weights
$c_{ij}^k$ satisfy the triangle inequality;

• $b_k$: upper bound on the number of items carrier $k$ can deliver ($b_k < n$ for all $k$);

• $t^k_{\max}$: upper bound on the duration of the route for carrier $k$.

Notations
In the rest of this paper, we denote by $\delta^+(i)$ ($\delta^-(i)$) the set of arcs exiting from (entering)
vertex $i$, and by $\mathcal{T}$ the set of all routes in graph $G = (V, A)$. Moreover, we denote by
$T \in \mathcal{T}$ an arbitrary route, with $V(T)$ and $A(T)$ being the sets of vertices and arcs visited and
traversed by the route, respectively. Furthermore, let $C^k(T) = \sum_{(i,j) \in A(T)} c_{ij}^k$ be the arc cost of
tour $T$ for driver $k$.
The BPFM and the BPMD
We consider a single-leader multiple-follower Stackelberg game in which there is a set of
items I that needs to be delivered to a corresponding set of customers V . Each customer
i ∈ V requires exactly one item in i ∈ I (multiple items required by the same customer are
considered as multiple duplicated customers). For this reason, in the following, we will refer
just to the set V (and V0 when including the depot), either when referring to the delivered
items, or to the served customers. An intermediary platform, acting as a leader, receives a
price pi for each item to be delivered. Given a set K of potential carriers (e.g., occasional
drivers), the intermediary searches for carriers that can deliver these items, and pays to
carrier k ∈ K a compensation p̄ki , 0 < p̄ki < pi , for each delivered item, i.e., for each served
customer. The difference between the price $p_i$ and the compensation $\bar{p}_i^k$ is the net profit of
the platform in case item $i$ is delivered by carrier $k$. Note that it is 0 in case the item is
not delivered by any carrier. The net profit associated with item $i$ expressed as a fraction of
the price is defined as the “profit margin”, i.e., $(p_i - \bar{p}_i^k)/p_i$. The leader has to create $|K|$ disjoint
subsets of items, each of them to be offered to a carrier. We call Pk the subset of items
offered to carrier k ∈ K. Each carrier k ∈ K receives the proposal, and, based on her net
profit, decides on a subset of customers Qk ⊆ Pk to accept to serve. To this end, each carrier
solves a PTP with respect to the given set of items (customers) Pk . A carrier can refuse to
deliver some items, in which case the intermediary’s margin for this item becomes zero. The
goal of the leader is to make a call to the carriers, so as to maximize its revenue, which is
defined as
\[
\sum_{k \in K} \sum_{i \in Q_k} (p_i - \bar{p}_i^k).
\]

We consider two alternative assumptions concerning the number of packages to be served:
either we assume that carrier $k$ cannot deliver more than $b_k$ items, or we assume that there
is a travel time limit for the follower, $t^k_{\max}$.
We assume, without loss of generality, an optimistic bilevel setting, i.e., for a given leader’s
choice, if follower k has multiple optimal responses determined by different sets Q̃k of items
to be delivered, she will accept to deliver the items that are more favorable to the leader.
This means that the follower $k$ will choose to deliver the subset
\[
Q_k^* \in \arg\max_{\tilde{Q}_k} \Big\{ \sum_{i \in \tilde{Q}_k} (p_i - \bar{p}_i^k) : \tilde{Q}_k \text{ is optimal for the follower } k \Big\},
\]
where $\tilde{Q}_k$ is optimal for the follower $k$ if it is an optimal solution of the PTP solved by the
follower $k$ with respect to the set of items $P_k$ assigned to her by the leader, i.e.:
\[
\tilde{Q}_k \in \arg\max_{Q_k} \Big\{ \sum_{i \in Q_k} \bar{p}_i^k - \sum_{(i,j) \in A:\, i,j \in Q_k} c_{ij}^k : Q_k \subseteq P_k, \text{ and the route serving nodes } i \in Q_k \text{ is feasible} \Big\}.
\]

This is without loss of generality because, in the implementation, when we find the optimistic
solution, the leader can a-posteriori add a small ϵ to the compensation value for the proposed
parcels to break the ties. We further assume that there is no communication between the
carriers, i.e., a carrier k is not aware of what is offered to carriers k ′ ∈ K \ {k}. Thus, there
is no possible bargaining, and no need to establish a generalized Nash equilibrium between
the multiple followers’ solutions. Nevertheless, the problems cannot be seen as single-level,
because the carriers are selfish agents and do not have to collaborate with the intermediary.
Their major goal is to maximize their own net profits, and, therefore, a solution that is
optimal for the leader is not necessarily optimal for the follower. We provide in the following
section two different single-level formulations, as well as an illustrative example to support
this claim.
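
The optimistic rule can be sketched in Python as a two-step selection: first keep the subsets that are optimal for the follower, then choose among them the one with the largest platform profit. The function name `optimistic_response`, the tolerance `tol`, and all numerical values below are hypothetical, and the follower's net profits are assumed to have been computed beforehand (e.g., by solving the PTP for each subset).

```python
def optimistic_response(follower_profit, leader_profit, tol=1e-9):
    """Pick, among the follower-optimal subsets, the one the leader prefers.

    follower_profit: dict frozenset(items) -> carrier's net profit for that subset
    leader_profit:   dict item -> platform profit (price minus compensation)
    """
    best = max(follower_profit.values())
    # All subsets whose follower profit ties with the best one (up to tol).
    optimal = [q for q, v in follower_profit.items() if abs(v - best) <= tol]
    # Optimistic assumption: ties are broken in favor of the leader.
    return max(optimal, key=lambda q: sum(leader_profit[i] for i in q))


# Hypothetical values: the carrier is indifferent between {2} and {1, 2}.
follower_profit = {
    frozenset(): 0.0,
    frozenset({1}): -1.0,
    frozenset({2}): 2.0,
    frozenset({1, 2}): 2.0,
}
leader_profit = {1: 3.0, 2: 4.0}
chosen = optimistic_response(follower_profit, leader_profit)
```

Here the carrier earns the same net profit from {2} and from {1, 2}, so under the optimistic assumption she is taken to deliver both items, which is the outcome the platform prefers.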
In the BPFM we assume that the leader does not decide on the compensations to pay to
the carriers, i.e., p̄ki is fixed and given a-priori. In the BPMD, instead, we assume that the
intermediary platform decides, in addition to the assignment of customers to carriers, the
compensation to pay to each carrier (measured as the fractional margin of the price obtained
by the platform). Since the compensation is a fraction of the price, what remains is the
“margin” gained by the platform, so we use the term “margin optimization”. In particular,
we consider |M | different margin values that the intermediary platform can choose for each
item. This set is the same for all items. We denote by $p_i^m$ the profit gained by the platform
when applying margin $m \in M$ to item $i$.
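
As an illustration of discrete margin levels, the Python sketch below tabulates the platform profit and the carrier compensation per item, assuming that a margin m is the fraction of the price kept by the platform, so that the platform profit is m times the price and the compensation is the remainder. The margin set, the prices, and the helper `margin_options` are hypothetical.

```python
def margin_options(price, margins):
    """For each margin m, return (platform profit, carrier compensation).

    Assumption: m is the fraction of the price kept by the platform, so the
    platform profit is m * price and the compensation is (1 - m) * price.
    """
    return {m: (m * price, (1 - m) * price) for m in margins}


# Hypothetical margin set M, shared by all items, and hypothetical prices.
M = (0.25, 0.5, 0.75)
prices = {1: 10.0, 2: 8.0}
table = {i: margin_options(p, M) for i, p in prices.items()}
```

A larger margin raises the platform profit on a delivered item but lowers the compensation, which makes it less likely that the carrier finds the item worth serving.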

3.1 Single-level formulations of related problems


We discuss here two different single-level formulations of the problem faced by the platform,
which lead to an upper and a lower bound of the platform’s profit, respectively.
Let us suppose that the platform maximizes its own profit while imposing that the profit
of the carriers is larger than a certain threshold, e.g., their willingness to accept (WTA),
and assuming that Pk = Qk once this threshold is satisfied. This last assumption defines
a different problem setting in which carriers are not allowed to decide on the subset of the
assigned items to serve, but can only accept or reject the whole assigned bundle. Assume also
that the WTA value for all carriers is 0, i.e., they are willing to accept to serve in case the
net profit is non-negative. This single-level setting, which we call the WTA-PTP, involves
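two types of binary variables: $\alpha_i^k$ for each $i \in V_0$ and $k \in K$, which is 1 if item $i$ is offered
to carrier $k$; and $z_{ij}^k$ for all $(i, j) \in A$ and $k \in K$, which is 1 if arc $(i, j) \in A$ is traversed by
carrier $k$. The WTA-PTP can be formulated as:
\begin{align}
\max_{\alpha, z} \quad & \sum_{k \in K} \sum_{i \in V} (p_i - \bar{p}_i^k)\, \alpha_i^k \tag{1a} \\
\text{s.t.} \quad & \sum_{k \in K} \alpha_i^k \le 1 && \forall\, i \in V \tag{1b} \\
& \sum_{i \in V} \alpha_i^k \le b_k && \forall\, k \in K \tag{1c} \\
& \sum_{i \in V} \bar{p}_i^k \alpha_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge 0 && \forall\, k \in K \tag{1d} \\
& (\alpha^k, z^k) \text{ is a route} && \forall\, k \in K \tag{1e} \\
& \alpha^k \in \{0, 1\}^{n+1},\ z^k \in \{0, 1\}^{|A|} && \forall\, k \in K. \tag{1f}
\end{align}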

The objective function (1a) represents the net profit of the platform to be maximized. The
first set of constraints (1b) imposes that each item is served by at most one carrier; the
second ones (1c) are the capacity constraints (equivalently one could consider the route
duration limit); constraints (1d) state that each carrier should have a nonnegative profit;
finally, constraints (1e) ensure that $z^k$ is the incidence vector of a route that visits the depot
and all customers $i$ such that $\alpha_i^k = 1$. The value of WTA-PTP as calculated in (1a) provides
an upper bound to BPFM. Indeed, given constraints (1d), in WTA-PTP, we assume carriers
accept to serve the whole assigned bundle of items, as long as the associated net profit is
non-negative. Instead, in BPFM, carriers determine the subset of assigned items maximizing
their net profit. However, we note that, given any feasible solution ᾱ of WTA-PTP, a bilevel
feasible solution can be recovered by solving a PTP for each carrier, on the subset of items
assigned to that carrier in the WTA-PTP solution. Indeed, in this way, we can compute the
accepted subset of the items among the ones for which ᾱi is 1, and these subsets, associated
with the optimal tours to serve them, represent a feasible solution of the BPFM.
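
A candidate WTA-PTP solution can be checked against (1b)-(1e) and evaluated with (1a) as in the following Python sketch. It assumes that the solution is given as a bundle and a visiting order per carrier; the function name `wta_ptp_value` and the instance are hypothetical, and this is a feasibility check rather than a solver.

```python
def wta_ptp_value(bundles, routes, price, comp, cost, cap, depot=0):
    """Return objective (1a) if (1b)-(1e) hold for the candidate, else None.

    bundles: dict carrier -> set of items offered to (and served by) the carrier
    routes:  dict carrier -> tuple with the visiting order of those items
    cost:    dict carrier -> square matrix of arc costs, depot included
    """
    offered = [i for items in bundles.values() for i in items]
    if len(offered) != len(set(offered)):           # (1b): item offered at most once
        return None
    for k, items in bundles.items():
        if len(items) > cap[k]:                     # (1c): capacity
            return None
        if sorted(routes[k]) != sorted(items):      # (1e): route visits exactly the bundle
            return None
        stops = (depot, *routes[k], depot) if items else ()
        travel = sum(cost[k][a][b] for a, b in zip(stops, stops[1:]))
        if sum(comp[k][i] for i in items) - travel < 0:  # (1d): nonnegative carrier profit
            return None
    return sum(price[i] - comp[k][i] for k, items in bundles.items() for i in items)


# Hypothetical instance: 3 items, 2 carriers, unit cost on every arc.
unit = [[0 if a == b else 1 for b in range(4)] for a in range(4)]
price = {1: 10, 2: 10, 3: 6}
comp = {k: {1: 4, 2: 1, 3: 5} for k in ("a", "b")}
cost = {"a": unit, "b": unit}
cap = {"a": 2, "b": 2}

value = wta_ptp_value({"a": {1, 2}, "b": {3}}, {"a": (1, 2), "b": (3,)},
                      price, comp, cost, cap)
rejected = wta_ptp_value({"a": {2}, "b": set()}, {"a": (2,), "b": ()},
                         price, comp, cost, cap)
```

The second call violates (1d): item 2 alone pays 1 but costs 2 to deliver, so the candidate is rejected even though the same item is acceptable as part of the bundle {1, 2}.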
The WTA-PTP formulation can be easily generalized to the case in which the platform can
also decide on the compensation paid to each carrier, in terms of the price margin the platform
gains from each item (see Appendix C).
Let us now consider a different setting. If a profit-sharing between the platform and the
set of carriers is considered, a different single-level formulation is obtained. This situation
arises when dealing with the so-called Urban Consolidation Centers (UCC), or City Logistics
Centers, i.e., logistics facilities strategically located within urban areas to optimize the effi-
ciency of last-mile deliveries (Handoko et al., 2015). In this case, the platform corresponds
to the alliance of all carriers, and the objective function is the total profit of this alliance.
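The corresponding single-level formulation is then:
\begin{align}
\max_{\alpha, z} \quad & \sum_{k \in K} \Big( \sum_{i \in V} \bar{p}_i^k \alpha_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \Big) \tag{2a} \\
\text{s.t.} \quad & \sum_{k \in K} \alpha_i^k \le 1 && \forall\, i \in V \tag{2b} \\
& \sum_{i \in V} \alpha_i^k \le b_k && \forall\, k \in K \tag{2c} \\
& (\alpha^k, z^k) \text{ is a route} && \forall\, k \in K \tag{2d} \\
& \alpha^k \in \{0, 1\}^{n+1},\ z^k \in \{0, 1\}^{|A|} && \forall\, k \in K. \tag{2e}
\end{align}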

This single-level formulation, which we define as UCC-PTP, can be used to obtain a lower
bound on the profit of our delivery platform in the setting we study. Indeed, the solution of
UCC-PTP is a feasible solution for the BPFM, as well as for the BPMD.
With the following example, we better clarify the relationships between the two single-level
problems presented above and the bilevel problem we focus on.
Consider the complete graph with 7 nodes in Figure 1, where node 0 is the depot. Assume
that there are two carriers $k_a$ and $k_b$, and that for all $k \in \{k_a, k_b\}$:
• $c^k_{0j} = 0.5$ for all $j \in V$, $c^k_{12} = c^k_{23} = c^k_{34} = c^k_{45} = c^k_{51} = 0.5$, $c^k_{13} = c^k_{14} = c^k_{24} = c^k_{25} = c^k_{35} = 1$, and $c^k_{ij} = c^k_{ji}$ for all $(i,j) \in A$;

• $b^k = 2$;

• $(p_1, \bar{p}^k_1) = (10, 0.5)$, $(p_2, \bar{p}^k_2) = (10, 2)$, $(p_3, \bar{p}^k_3) = (10, 4)$, $(p_4, \bar{p}^k_4) = (5.5, 5.1)$, $(p_5, \bar{p}^k_5) = (5.5, 5)$, $(p_6, \bar{p}^k_6) = (10, 8)$.
In the WTA-PTP setting, the leader would assign to carriers $k_a$ and $k_b$ items 1, 2, 3, and 6
(two to each carrier), predicting a total profit of 25.5. The platform would exclude items 4
and 5 because they correspond to the smallest margins of 0.4 and 0.5, respectively. Let us
now consider the BPFM as described above. The carrier who is assigned item 1, together
with any other item among 2, 3, and 6, would not deliver it, since serving only the other
offered item (2, 3, or 6) would produce a higher profit than serving the assigned pair. Thus,
the profit of the platform associated with their optimal WTA-PTP solution is 16 (value of
the recovered bilevel feasible solution).

Figure 1: Example graph. Weights of gray/red edges are 0.5/1 respectively. Labels next to
each vertex $i$ display $(p_i, \bar{p}^k_i)$ for all $k$.
In the UCC-PTP setting, the UCC would assign to carrier ka items 3, and 4, and to
carrier kb items 5, and 6 (or vice-versa), with a UCC-PTP value of 19.1. This is a feasible
solution in the BPFM, with a value of 8.9, but it is not the optimal one. Indeed, the optimal
solution of BPFM would assign items 2, 3, 5 and 6 (two to each carrier, see the colors of the
nodes in Figure 1 for a graphical representation), yielding a total profit of 16.5.
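The platform-side figures quoted in this example can be rechecked with a few lines of arithmetic. The sketch below only sums the margins $p_i - \bar{p}_i$ over the items that end up being delivered in each scenario; it does not solve any routing problem, and the helper name `platform_profit` is introduced here purely for illustration.

```python
# Item data from the example: item -> (price p_i, compensation pbar_i),
# identical for both carriers.
ITEMS = {1: (10, 0.5), 2: (10, 2), 3: (10, 4), 4: (5.5, 5.1), 5: (5.5, 5), 6: (10, 8)}


def platform_profit(delivered):
    """Sum of the platform margins p_i - pbar_i over the delivered items."""
    return sum(ITEMS[i][0] - ITEMS[i][1] for i in delivered)


wta_predicted = platform_profit([1, 2, 3, 6])  # all assigned items assumed accepted
wta_recovered = platform_profit([2, 3, 6])     # item 1 is dropped by its carrier
ucc_in_bpfm = platform_profit([3, 4, 5, 6])    # UCC-PTP assignment evaluated in the BPFM
bpfm_optimal = platform_profit([2, 3, 5, 6])   # optimal BPFM assignment from the text

for name, value in [("WTA-PTP prediction", wta_predicted),
                    ("recovered from WTA-PTP", wta_recovered),
                    ("UCC-PTP assignment", ucc_in_bpfm),
                    ("BPFM optimum", bpfm_optimal)]:
    print(f"{name}: {value:.1f}")
```

The four printed values correspond to the 25.5, 16, 8.9 and 16.5 reported in the text.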
This example illustrates that the problem we study is bilevel in nature and that, to avoid
misprediction of the true profit of the leader, it is crucial to integrate the optimal followers’
response inside the optimization problem, as it is done with the BPFM. Approximating this
optimal response, by either replacing it with the WTA assumption or considering the carriers
as part of the platform, leads to a suboptimal decision for the platform. Nevertheless, the
values of WTA-PTP and UCC-PTP can be used to derive an upper and a lower bound on
the net profit of the leader in the bilevel setting, respectively.

4 The Bilevel PTP with Fixed Margins
In this section, we introduce a formulation for the BPFM, first, with a limit on the number
of packages, and then, in Subsection 4.4, with a limit on the duration of the route.
We recall that in the BPFM the leader does not decide on the compensations to be paid to
the carriers, i.e., $\bar{p}_i^k$ is fixed and given a priori. To model the leader's decision on the proposal
to each carrier, we use the binary decision variable $x_i^k$ for each $i \in V$, $k \in K$, which takes the
value 1 if the platform assigns customer $i$ to carrier $k$, i.e., $i \in P_k$. In addition, we define the
lower-level binary variable $y_i^k$ for each $i \in V_0$ and $k \in K$, to model the acceptance decision
of the carriers: $y_i^k$ is 1 if carrier $k$ accepts to serve customer $i$, i.e., $i \in Q_k$; in particular, $y_0^k$
is equal to 1 in case carrier $k$ accepts to make at least one delivery. Finally, we consider the
set of lower-level binary variables $z_{ij}^k$ for all $(i,j) \in A$ and $k \in K$ (already introduced for
formulations (1) and (2)) to model the routing decisions of the carriers. In particular, $z_{ij}^k = 1$
if arc $(i,j) \in A$ is traversed by carrier $k$.
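Then, the BPFM formulation is as follows:

\begin{align}
\max_{x, y} \quad & \sum_{i \in V} \sum_{k \in K} \big( p_i - \bar{p}_i^k \big) y_i^k \tag{3a} \\
\text{s.t.} \quad & \sum_{k \in K} x_i^k \le 1 && \forall\, i \in V \tag{3b} \\
& y^k \in S_{\varphi^k}(x^k) && \forall\, k \in K \tag{3c} \\
& x^k \in \{0,1\}^{n},\ y^k \in \{0,1\}^{n+1} && \forall\, k \in K, \tag{3d}
\end{align}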

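where $S_{\varphi^k}(x^k)$ is the set of optimal solutions of the $k$-th follower problem, which, for a given
$\tilde{x}^k$, is formulated as:

\begin{align}
\varphi^k(\tilde{x}^k) = \max_{y, z} \quad & \sum_{i \in V} \bar{p}_i^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \tag{4a} \\
\text{s.t.} \quad & y_i^k \le \tilde{x}_i^k && \forall\, i \in V \tag{4b} \\
& \sum_{i \in V} y_i^k \le b^k \tag{4c} \\
& (y^k, z^k) \text{ is a route} \tag{4d} \\
& y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|}. \tag{4e}
\end{align}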

Constraints (3b) state that each item is offered to at most one carrier. Constraints (3c)
ensure that the solution y k returned by the k-th follower is an optimal response with respect
to the set of items offered to her by the leader. The objective function (4a) of follower k is
the difference between the sum of the delivered items’ compensations and the total travel
cost (length of the route that visits all customers accepted by the carrier). Constraints (4b)
link the decisions of the leader with the ones of the follower and establish that an item i can
be delivered by carrier k only if it is offered to her. Constraints (4c) impose that carrier k
can accept to serve at most bk items. Finally, constraint (4d) states that z k is the incidence

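vector of a route that visits the depot and all customers $i$ such that $y_i^k = 1$. More in detail,
constraint (4d) is given by:

\begin{align}
& \sum_{(i,j) \in \delta^+(i)} z_{ij}^k = y_i^k && \forall\, i \in V \tag{5a} \\
& \sum_{(j,i) \in \delta^-(i)} z_{ji}^k = y_i^k && \forall\, i \in V \tag{5b} \\
& \sum_{i \in S,\, j \in S} z_{ij}^k \le \sum_{i \in S} y_i^k - y_h^k && \forall\, S \subseteq V,\ |S| \ge 2,\ h \in S. \tag{5c}
\end{align}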

The sets of equalities (5a)–(5b) impose that one arc enters and leaves each visited vertex.
The set of exponentially many inequalities (5c) ensures subtour elimination and connection
to the depot.
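To make the follower problem (4) concrete, the sketch below solves it by brute force for one carrier: it enumerates every subset of the offered items of size at most $b^k$, computes the cheapest depot-to-depot tour through it, and keeps the most profitable subset. This is only an illustration for tiny instances, not the solution method of the paper; the function names and the toy data are assumptions, and ties are broken in favour of the first subset found rather than in the leader's favour.

```python
from itertools import combinations, permutations


def route_cost(order, cost):
    """Cost of the closed tour depot (node 0) -> order... -> depot."""
    path = (0, *order, 0)
    return sum(cost[a][b] for a, b in zip(path, path[1:]))


def best_response(offered, comp, cost, capacity):
    """Brute-force follower problem: returns (profit, accepted items)."""
    best = (0.0, ())  # rejecting every item yields zero profit
    for size in range(1, min(capacity, len(offered)) + 1):
        for subset in combinations(sorted(offered), size):
            tour = min(route_cost(p, cost) for p in permutations(subset))
            profit = sum(comp[i] for i in subset) - tour
            if profit > best[0] + 1e-9:
                best = (profit, subset)
    return best


# Hypothetical toy instance: depot 0, customers 1..3, all arcs of cost 1.
cost = [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]]
comp = {1: 0.5, 2: 3, 3: 3}

print(best_response({1, 2}, comp, cost, capacity=2))  # item 1 is not worth the detour
print(best_response({2, 3}, comp, cost, capacity=2))  # both items are accepted
```

In the first call the carrier keeps only item 2, which is exactly the kind of partial acceptance that the WTA assumption ignores.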
In what follows, in order to derive a single-level reformulation of the BPFM formula-
tion (3), we propose two approaches. The first one, presented in Subsection 4.1, is the value
function reformulation approach typically used in bilevel literature (see Appendix A for more
details). The relationship of the so-obtained single-level formulation with the WTA-PTP and
UCC-PTP formulations is discussed in Subsection 4.2, and an alternative formulation is ob-
tained by projecting out the routing variables in Subsection 4.3. A variant of the problem
when considering a limit on the route duration instead of the capacity constraints is pre-
sented in Subsection 4.4. Then, in Subsection 4.5, the second reformulation approach, based
on bilevel no-good cuts (Tahernejad and Ralphs, 2020) exploiting the binary nature of the
upper-level decisions, is discussed.

4.1 Value function reformulation


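As explained in Appendix A, one possible way to reformulate optimistic bilevel problems is
through the so-called value function approach, which, for model (3), leads to the following
reformulation:

\begin{align}
\max_{x, y, z} \quad & \sum_{i \in V} \sum_{k \in K} \big( p_i - \bar{p}_i^k \big) y_i^k \tag{6a} \\
\text{s.t.} \quad & \sum_{k \in K} x_i^k \le 1 && \forall\, i \in V \tag{6b} \\
& \sum_{i \in V} x_i^k \le b^k && \forall\, k \in K \tag{6c} \\
& \sum_{i \in V} \bar{p}_i^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \varphi^k(x^k) && \forall\, k \in K \tag{6d} \\
& y_i^k \le x_i^k && \forall\, i \in V,\ k \in K \tag{6e} \\
& (y^k, z^k) \text{ is a route} && \forall\, k \in K \tag{6f} \\
& x^k \in \{0,1\}^{n},\ y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|} && \forall\, k \in K. \tag{6g}
\end{align}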
In formulation (6), we replace constraints
\[
\sum_{i \in V} y_i^k \le b^k \qquad \forall\, k \in K \tag{7}
\]
of the classic value function reformulation of the BPFM (3) with constraints (6c), which,
combined with (6e), imply (7). Constraints (6c) on $x$ are indeed stronger than (7), and this
helps in the solution of the single-level formulation (6).
The difficulty of solving formulation (6) lies in the value-function constraints (6d): the
function φk (xk ) is non-convex and non-continuous. Thus, in order to derive a single-level
MIP formulation of the problem, we further analyze this function, which we try to convexify.
Under the assumption that arc weights ckij satisfy the triangle inequality (in order to ensure
that zijk = 1 implies yik = yjk = 1 for all k), we derive the following result.
Proposition 1. For any $k \in K$, given a vector $\tilde{x}^k \in \{0,1\}^n$ satisfying constraints (6b)–(6c),
there always exists an optimal solution of the following problem, which is also optimal for
$\varphi^k(\tilde{x}^k)$:

\begin{align}
\bar{\varphi}^k(\tilde{x}^k) = \max_{y, z} \quad & \sum_{i \in V} \bar{p}_i^k y_i^k \tilde{x}_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \tag{8a} \\
\text{s.t.} \quad & (y^k, z^k) \text{ is a route} \tag{8b} \\
& y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|}. \tag{8c}
\end{align}

Proof. Let $\tilde{x}^k \in \{0,1\}^n$ be a vector satisfying constraints (6b)–(6c). Let us consider a given
$k \in K$ and let $P_k \subseteq V$ be the subset of vertices associated with $\tilde{x}^k$, i.e., $P_k = \{i \in V : \tilde{x}_i^k = 1\}$,
with $|P_k| \le b^k$. Being $S_{\bar{\varphi}^k}(\tilde{x}^k)$ the set of optimal solutions of problem $\bar{\varphi}^k(\tilde{x}^k)$, we want
to prove that there exists $(\hat{y}^k, \hat{z}^k) \in S_{\bar{\varphi}^k}(\tilde{x}^k)$ such that a) its objective function value is
$\sum_{i \in P_k} \bar{p}_i^k \hat{y}_i^k - \sum_{(i,j) \in A} c_{ij}^k \hat{z}_{ij}^k$, and b) $\hat{y}_i^k = 0$ for all $i \in V \setminus P_k$, despite the relaxation of constraints (4b)
in formulation (8). If this is true, constraint (4c) will also hold because of constraints (6c),
which impose that $|P_k| \le b^k$. The reasoning is the following. If an $i' \in V \setminus P_k$ exists s.t.
$\hat{y}_{i'}^k = 1$, then the compensation collected from $i'$ would be 0 as $\tilde{x}_{i'}^k = 0$. Also, by the triangle
inequality, going directly from the predecessor to the successor of $i'$ in the optimal solution
is cheaper (or at most has the same cost) than going through $i'$. Thus, either the solution
visiting $i'$ is not optimal, or there exists a solution that does not visit $i'$ with the same
objective function value. This procedure can be iterated over all $i \in V \setminus P_k$.
According to Proposition 1, the problem of follower k could be solved by considering the
entire graph G and multiplying the compensation associated with each customer i ∈ V by
the value x̃ki , representing the assignment made by the leader. In this way, in case x̃ki = 0,
customer i would not be visited in the optimal solution of follower k.
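The statement of Proposition 1 can be checked numerically on a small instance. The sketch below compares the value of the restricted problem (only offered customers, capacity $b^k$) with the value of the relaxed problem (8) (all customers, compensations multiplied by $\tilde{x}_i^k$, no capacity constraint). The coordinates, compensations and function names are hypothetical; Euclidean distances are used so that the triangle inequality holds, and the offered set respects the capacity, as the proposition requires.

```python
import math
from itertools import combinations, permutations


def tour_cost(subset, cost):
    """Cheapest closed tour from depot 0 through all nodes of subset (brute force)."""
    if not subset:
        return 0.0
    return min(sum(cost[a][b] for a, b in zip((0, *p), (*p, 0)))
               for p in permutations(subset))


def best_profit(candidates, comp, cost, capacity):
    """Best compensation minus tour cost over subsets of candidates of size <= capacity."""
    return max(sum(comp[i] for i in s) - tour_cost(s, cost)
               for r in range(capacity + 1)
               for s in combinations(candidates, r))


# Hypothetical instance: node 0 is the depot, Euclidean costs.
points = [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]
cost = [[math.dist(a, b) for b in points] for a in points]
comp = {1: 3.0, 2: 3.0, 3: 3.0, 4: 3.0}
x = {1: 1, 2: 0, 3: 1, 4: 0}  # leader offers customers 1 and 3
capacity = 2
customers = sorted(comp)

# Restricted follower problem (4): offered customers only, capacity enforced.
phi = best_profit([i for i in customers if x[i]], comp, cost, capacity)
# Relaxed problem (8): whole graph, compensations scaled by x, no capacity.
phi_bar = best_profit(customers, {i: comp[i] * x[i] for i in customers},
                      cost, len(customers))
print(phi, phi_bar)
```

Both values coincide on this instance, since visiting a customer that was not offered only adds travel cost without adding compensation.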
Let $P^k_{ext}$ denote the set of all the extreme points $(y^k, z^k)$ of the convex hull of the follower's
feasible solutions space determined by constraints (8b)–(8c). It holds that:
\[
\varphi^k(x^k) = \max_{(\hat{y}^k, \hat{z}^k) \in P^k_{ext}} \Big\{ \sum_{i \in V} \bar{p}_i^k \hat{y}_i^k x_i^k - \sum_{(i,j) \in A} c_{ij}^k \hat{z}_{ij}^k \Big\},
\]
which is a convex function in $x^k$.
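We notice that, in terms of routes $T \in \mathcal{T}$, problem (8) can be restated as:
\[
\bar{\varphi}^k(\tilde{x}^k) = \max_{T \in \mathcal{T}} \Big\{ \sum_{i \in V(T)} \bar{p}_i^k \tilde{x}_i^k - C^k(T) \Big\}.
\]
In other words, constraints (6d) can be replaced by constraints
\[
\sum_{i \in V} \bar{p}_i^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \sum_{i \in V(T)} \bar{p}_i^k x_i^k - C^k(T) \qquad \forall\, T \in \mathcal{T},\ k \in K, \tag{9}
\]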

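obtaining the following single-level reformulation of problem (6):

(BPFM)
\begin{align}
\max_{x, y, z} \quad & \sum_{i \in V} \sum_{k \in K} \big( p_i - \bar{p}_i^k \big) y_i^k \tag{10a} \\
\text{s.t.} \quad & \sum_{k \in K} x_i^k \le 1 && \forall\, i \in V \tag{10b} \\
& \sum_{i \in V} x_i^k \le b^k && \forall\, k \in K \tag{10c} \\
& \sum_{i \in V} \bar{p}_i^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \sum_{i \in V(T)} \bar{p}_i^k x_i^k - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K \tag{10d} \\
& y_i^k \le x_i^k && \forall\, i \in V,\ k \in K \tag{10e} \\
& (y^k, z^k) \text{ is a route} && \forall\, k \in K \tag{10f} \\
& x^k \in \{0,1\}^{n},\ y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|} && \forall\, k \in K, \tag{10g}
\end{align}

to which we will refer in the rest of the paper when considering the BPFM.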
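Since constraints (10d) are indexed by all routes, in practice they would be generated only when violated. The sketch below illustrates such a check for a single carrier on a tiny instance: given a candidate $(\tilde{x}^k, \tilde{y}^k)$ and the cost of the candidate route, it searches by brute force for the route $T$ with the largest right-hand side and reports it if the constraint is violated. The function names, the tolerance and the toy data are assumptions made for illustration; a real implementation would call a PTP solver instead of enumerating subsets.

```python
from itertools import combinations, permutations


def cheapest_tour(subset, cost):
    """Return (cost, visiting order) of the cheapest depot-to-depot tour through subset."""
    return min((sum(cost[a][b] for a, b in zip((0, *p), (*p, 0))), p)
               for p in permutations(subset))


def separate_value_function_cut(x, y, z_cost, comp, cost, eps=1e-9):
    """Return the route of a violated constraint (10d), or None if none is violated.

    x, y: dicts customer -> 0/1 for one carrier; z_cost: cost of the candidate route.
    The empty route (carrier stays at the depot, profit 0) is returned as ().
    """
    lhs = sum(comp[i] * y[i] for i in y) - z_cost
    best_rhs, best_route = 0.0, ()
    customers = sorted(x)
    for r in range(1, len(customers) + 1):
        for subset in combinations(customers, r):
            c, order = cheapest_tour(subset, cost)
            rhs = sum(comp[i] * x[i] for i in subset) - c
            if rhs > best_rhs + eps:
                best_rhs, best_route = rhs, order
    return best_route if best_rhs > lhs + eps else None


# Hypothetical toy instance: depot 0, customers 1 and 2, all arcs of cost 1.
cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
comp = {1: 0.5, 2: 3}
x = {1: 1, 2: 1}

violated = separate_value_function_cut(x, {1: 1, 2: 1}, 3, comp, cost)
satisfied = separate_value_function_cut(x, {1: 0, 2: 1}, 2, comp, cost)
print(violated, satisfied)
```

The first candidate assumes the carrier serves both offered customers, which is cut off by the route visiting customer 2 only; the second candidate is a true best response, so no cut is returned.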

4.2 Relationship with WTA-PTP and UCC-PTP formulations


WTA-PTP formulation (1) and UCC-PTP formulation (2) are respectively a relaxation and
a restriction of the BPFM formulation. On the one hand, the WTA-PTP formulation (1) can
be obtained from formulation (10) by setting x = y = α and replacing constraints (10d) with
the weaker constraints (1d). On the other hand, to see the relationship with the UCC-PTP
formulation, we set x = α, which gives us a valid partitioning of the items. Then, we set
y = α as a follower response. In order to prove that (x, y) is a feasible bilevel solution (and
thus provides a lower bound for BPFM), it remains to prove that y is the optimal followers’
response for the given x = α. This follows from the model (2) itself, since, once α is fixed,
model (2) separates into |K| independent subproblems, each of them corresponding to the
lower-level problem (4) for a given x = α.

4.3 Projecting out the z variables
In this section, we present a new formulation for the BPFM derived from projecting out
the z variables in the value-function reformulation presented above. We introduce the new
continuous variables θk , for each k ∈ K, which represent the cost of the route followed by
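carrier $k$. In this case, problem (BPFM) becomes

(BPFM-z)
\begin{align}
\max_{x, y, \theta} \quad & \sum_{i \in V} \sum_{k \in K} \big( p_i - \bar{p}_i^k \big) y_i^k \tag{11a} \\
\text{s.t.} \quad & \sum_{k \in K} x_i^k \le 1 && \forall\, i \in V \tag{11b} \\
& \sum_{i \in V} x_i^k \le b^k && \forall\, k \in K \tag{11c} \\
& \sum_{i \in V} \bar{p}_i^k y_i^k - \theta^k \ge \varphi^k(x^k) && \forall\, k \in K \tag{11d} \\
& \theta^k \ge c^k_{TSP}(y^k) && \forall\, k \in K \tag{11e} \\
& y_i^k \le x_i^k && \forall\, i \in V,\ k \in K \tag{11f} \\
& x^k \in \{0,1\}^{n},\ y^k \in \{0,1\}^{n+1},\ \theta^k \in \mathbb{R} && \forall\, k \in K \tag{11g}
\end{align}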

where ckTSP (y k ) is the cost of the optimal route associated with vector yk (TSP standing for
Travelling Salesman Problem), and φk (xk ) is the optimal solution value of the k-th follower
problem, which, for a given (x̃k ) is formulated as in (4).
According to Proposition 1, constraints (11d) can be replaced by:
\[
\sum_{i \in V} \bar{p}_i^k y_i^k - \theta^k \ge \sum_{i \in V(T)} \bar{p}_i^k x_i^k - C^k(T) \qquad \forall\, T \in \mathcal{T},\ k \in K, \tag{12a}
\]
whereas constraints (11e) can be replaced by the cuts:
\[
\theta^k \ge c^k_{TSP}(V(T)) \Big( \sum_{i \in V(T)} y_i^k - |V(T)| + 1 \Big) \qquad \forall\, T \in \mathcal{T},\ k \in K. \tag{12b}
\]
Additionally, we can add to the formulation the following strengthening inequalities for
all $k \in K$ in order to provide a non-trivial lower bound on the value of variables $\theta^k$:
\[
\theta^k \ge \sum_{i \in V} d_i^k y_i^k, \tag{13}
\]
where $d_i^k = \max\{ \min_{j \in \delta^-(i)} c_{ji}^k,\ \min_{j \in \delta^+(i)} c_{ij}^k \}$. These inequalities state that the cost of the route
associated with $y^k$ cannot be lower than the sum of the costs of the arcs of minimum cost
incident to all the visited nodes.
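As an illustration of inequality (13), the sketch below computes the coefficients $d_i^k$ for one carrier and compares the resulting lower bound with the optimal tour cost of every subset of customers. The cost matrix is hypothetical and symmetric, the tour cost is obtained by brute force, and the helper names are introduced only for this example.

```python
from itertools import combinations, permutations


def min_incident_costs(cost):
    """d_i = max(cheapest incoming arc, cheapest outgoing arc) for each customer i >= 1."""
    n = len(cost)
    return {i: max(min(cost[j][i] for j in range(n) if j != i),
                   min(cost[i][j] for j in range(n) if j != i))
            for i in range(1, n)}


def tsp_cost(subset, cost):
    """Cheapest closed tour from depot 0 through all nodes of subset (brute force)."""
    return min(sum(cost[a][b] for a, b in zip((0, *p), (*p, 0)))
               for p in permutations(subset))


# Hypothetical symmetric cost matrix: node 0 is the depot, customers are 1..3.
cost = [[0, 2, 3, 4],
        [2, 0, 1, 5],
        [3, 1, 0, 2],
        [4, 5, 2, 0]]
d = min_incident_costs(cost)

# Compare the bound of (13) with the true tour cost for every nonempty subset.
checks = []
for r in range(1, 4):
    for subset in combinations((1, 2, 3), r):
        bound = sum(d[i] for i in subset)
        checks.append((subset, bound, tsp_cost(subset, cost)))
        print(subset, "bound:", bound, "tour cost:", checks[-1][2])
```

On this instance the bound is valid for every subset but fairly loose, which is consistent with its role as a cheap strengthening inequality rather than a replacement for the cuts (12b).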

4.4 Bounded route duration variant
In the setting considered above, each carrier cannot deliver more than bk packages (con-
straints (1c), (2c), (4c), (10c), (11c)). Another common practical setting is the one where
the limitation is instead imposed on route duration, thus having a time limit $t^k_{\max}$. To model
this case, we remove the constraints on the maximum number of packages and we add the
following constraint:
\[
\sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \le t^k_{\max} \tag{14}
\]
in each lower level. The considerations on the bilevel nature of the problem also apply in
this case.
The formulation presented in Section 4.3, where $z$ variables are projected out, has to
include constraint
\[
\theta^k \le t^k_{\max} \qquad \forall\, k \in K \tag{15}
\]
in the single-level formulations, and constraint (14) in each lower level.

4.5 Reformulation based on no-good cuts


An alternative to the value function approach for obtaining a single-level reformulation of
model (3) consists in introducing no-good cuts based on the solution of the PTPs associated
with the carriers.
Consider a set of items V k , such that, when they are offered to carrier k (αik = 1 for all
i ∈ V k ), not all are accepted. In other words, there exists a proper subset of items within
V k that constitutes an optimal PTP solution for carrier k. Using the α variables introduced
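for the WTA-PTP formulation (1), we can obtain the following reformulation of BPFM with
no-good cuts:

(BPFM′)
\begin{align}
\max_{\alpha} \quad & \sum_{i \in V} \sum_{k \in K} \big( p_i - \bar{p}_i^k \big) \alpha_i^k \tag{16a} \\
\text{s.t.} \quad & \sum_{k \in K} \alpha_i^k \le 1 && \forall\, i \in V \tag{16b} \\
& \sum_{i \in V} \alpha_i^k \le b^k && \forall\, k \in K \tag{16c} \\
& \sum_{i \in V^k} (1 - \alpha_i^k) + \sum_{i \notin V^k} \alpha_i^k \ge 1 && \forall\, k \in K,\ V^k \subset V \tag{16d} \\
& \alpha^k \in \{0,1\}^{n} && \forall\, k \in K. \tag{16e}
\end{align}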

Constraints (16d) are the no-good cuts and they are exponentially many. Note that for any
given subset of items whose incidence vector is given by αk , in order to determine whether
it coincides with a k-th carrier’s optimal response, one has to solve the lower-level PTP
problem (4), obtaining the set of accepted parcels y k . If the vectors αk and y k do not

coincide, the no-good cut has to be added to the model. Hence, this is an alternative way
of restating the bilevel problem as a single-level reformulation. However, it is worth noting
that the no-good cuts are known to be weak, only cutting off one point at a time.
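The remark that a no-good cut removes a single point can be verified by enumeration. The sketch below evaluates the left-hand side of (16d) for one carrier at every binary assignment vector over three items and lists the vectors that violate the cut; the bundle and the function name are hypothetical.

```python
from itertools import product


def no_good_lhs(alpha, bundle):
    """Left-hand side of cut (16d) for bundle V^k at the 0/1 assignment alpha."""
    return sum((1 - a) if i in bundle else a for i, a in alpha.items())


items = [1, 2, 3]
bundle = {1, 2}  # hypothetical bundle that the carrier would not accept in full

# Collect every binary assignment vector that violates the cut (lhs < 1).
cut_off = [a for a in product((0, 1), repeat=len(items))
           if no_good_lhs(dict(zip(items, a)), bundle) < 1]
print(cut_off)
```

Only the vector that offers exactly the items of the bundle is excluded, so one such cut may be needed for every bundle that is not fully accepted.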

5 The Bilevel PTP with Margin Decisions


In this section, we turn our attention to the BPMD and present two different formulations
for this problem in Section 5.1 and Section 5.2, respectively. In Section 5.3, we compare these
two bilevel formulations, referring to their value function reformulations.
As already discussed, in the BPMD we assume that, in addition to assignment decisions,
the leader decides also the margin m ∈ M to gain from each item i delivered by carrier k.
Let $m_{\min}$ and $m_{\max}$ be the minimum and maximum margin, respectively. In order to
model the leader's choice among the different margins $m \in M$, we have considered the
following two alternative options:
i) We define a new upper-level binary variable $X_{mi}^k$ for each carrier, margin, and item,
which takes value 1 if $i \in P_k$ and the selected margin is $m$ for item $i \in V$, and 0
otherwise. Since $\sum_{m \in M} X_{mi}^k = x_i^k$ for all $i \in V$ and $k \in K$, we can discard the variables
$x$. However, we still use variables $y$ as defined for the BPFM.
ii) We define also the lower-level binary disaggregated variable $Y_{mi}^k$, which substitutes the
previous decision variable $y_i^k$, and takes value 1 if carrier $k$ accepts to serve customer $i$
(i.e., $i \in Q_k$) with margin level $m$. In particular $Y_{m0}^k = 1$ for an arbitrarily selected $m$
if carrier $k$ accepts to make at least one delivery.
In both cases, we model the routing decisions with the same binary variables z as defined
for the BPFM. The decision variables of alternative i) lead to what we call the aggregated
formulation, in contrast to alternative ii) which leads to a disaggregated formulation. We
prove in Section 5.3 that these two formulations provide the same bounds.
Before presenting the formulations we observe that if the selected margin is m, the cor-
responding net profit for the leader will be pmi yik , where pmi = m · pi . Furthermore, in this
context, we denote by p̄mi = pi − pmi the compensation paid to follower k when the selected
margin is m.

5.1 Aggregated formulation


Using the decision variables of alternative i), the upper-level variable $x_i^k$ can be replaced
by $\sum_{m \in M} X_{mi}^k$. Furthermore, the upper-level objective function, representing the profit of the
leader to be maximized, reads
\[
\sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} X_{mi}^k y_i^k.
\]
We can linearize this function by using Fortet's inequalities (Fortet, 1960), i.e., a special case
of the McCormick inequalities for products of binary variables. These inequalities define the
convex envelopes of the bilinear terms $X_{mi}^k y_i^k$. In order to do this, we have to introduce
additional binary variables $w_{mi}^k$ defined as $X_{mi}^k y_i^k$ and insert the Fortet's inequalities (17c)–
(17e) in the upper-level problem formulation obtaining the following bilevel problem:

\begin{align}
\max_{X, w, y} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} w_{mi}^k \tag{17a} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} X_{mi}^k \le 1 && \forall\, i \in V \tag{17b} \\
& X_{mi}^k + y_i^k \le w_{mi}^k + 1 && \forall\, m \in M,\ i \in V,\ k \in K \tag{17c} \\
& w_{mi}^k \le X_{mi}^k && \forall\, m \in M,\ i \in V,\ k \in K \tag{17d} \\
& w_{mi}^k \le y_i^k && \forall\, m \in M,\ i \in V,\ k \in K \tag{17e} \\
& y^k \in S_{\Phi^k}(X^k) && \forall\, k \in K \tag{17f} \\
& X_m^k, w_m^k \in \{0,1\}^{n} && \forall\, m \in M,\ k \in K \tag{17g} \\
& y^k \in \{0,1\}^{n+1} && \forall\, k \in K \tag{17h}
\end{align}

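with $S_{\Phi^k}(X^k)$ the set of optimal solutions of the $k$-th follower problem, which, for a given $\tilde{X}^k$,
reads:

\begin{align}
\Phi^k(\tilde{X}^k) = \max_{y, z} \quad & \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} \tilde{X}_{mi}^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \tag{18a} \\
\text{s.t.} \quad & y_i^k \le \sum_{m \in M} \tilde{X}_{mi}^k && \forall\, i \in V \tag{18b} \\
& \sum_{i \in V} y_i^k \le b^k \tag{18c} \\
& (y^k, z^k) \text{ is a route} \tag{18d} \\
& y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|}. \tag{18e}
\end{align}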
We note that, if $\sum_{m \in M} X_{mi}^k = 0$, then $X_{mi}^k = 0$ for all $m$, and hence $y_i^k$ will be 0 (from
constraints (18b)), so the corresponding term of the upper-level objective function, as well as
of the first term of the lower-level objective function, will also be 0. Thus, we can replace constraints (17c) by:
\[
y_i^k \le w_{mi}^k + \sum_{u \in M : u \ne m} X_{ui}^k \qquad \forall\, m \in M,\ i \in V,\ k \in K. \tag{19a}
\]
Furthermore, to strengthen the Linear Programming (LP) relaxation of the problem, we
add the following constraint, implicitly satisfied for binary solutions, but not necessarily for
LP solutions:
\[
y_i^k = \sum_{m \in M} w_{mi}^k \qquad \forall\, i \in V,\ k \in K. \tag{19b}
\]
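The linearization can be sanity-checked by enumeration for a single item and carrier. The sketch below lists all binary triples $(X, y, w)$ that satisfy the single-carrier version of (17b), the lower-level constraint (18b), and Fortet's inequalities (17c)–(17e), and then tests that $w_{mi}^k = X_{mi}^k y_i^k$, (19a) and (19b) hold at each of them. Two margin levels are assumed, and the function name is introduced only for this example.

```python
from itertools import product


def linearization_points(num_margins):
    """Binary (X, y, w) for one item/carrier satisfying (17b)-(17e) and (18b)."""
    M = range(num_margins)
    points = []
    for X in product((0, 1), repeat=num_margins):
        if sum(X) > 1:        # (17b): at most one margin is selected
            continue
        for y in (0, 1):
            if y > sum(X):    # (18b): only an offered item can be accepted
                continue
            for w in product((0, 1), repeat=num_margins):
                if all(X[m] + y <= w[m] + 1 and w[m] <= X[m] and w[m] <= y
                       for m in M):
                    points.append((X, y, w))
    return points


points = linearization_points(2)
# w must coincide with the bilinear product X * y at every feasible point.
products_ok = all(w == tuple(Xm * y for Xm in X) for X, y, w in points)
# (19a): y <= w_m + sum of X_u over u != m.
c19a_ok = all(y <= w[m] + sum(X) - X[m] for X, y, w in points for m in range(len(X)))
# (19b): y equals the sum of the w variables.
c19b_ok = all(y == sum(w) for X, y, w in points)
print(len(points), products_ok, c19a_ok, c19b_ok)
```

The enumeration covers binary points only; it says nothing about the LP relaxation, where (19b) is precisely what is not implied.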

A single-level reformulation of the proposed bilevel problem may be obtained using the
value function reformulation. Also in this case, we restrict $X$ instead of $y$ when moving
constraints (18c) to the upper level, i.e., in the single-level value function reformulation we
have constraints
\[
\sum_{i \in V} \sum_{m \in M} X_{mi}^k \le b^k \qquad \forall\, k \in K. \tag{20}
\]

As in Section 4, assuming parameters ckij satisfy the triangle inequality, the following result
holds.

Proposition 2. For any $k \in K$, given a vector $\tilde{X}^k \in \{0,1\}^n$ satisfying constraints (17b)–
(17e), and (20), there always exists an optimal solution of the following problem, which is
also optimal for $\Phi^k(\tilde{X}^k)$:
\[
\bar{\Phi}^k(\tilde{X}^k) = \max_{T \in \mathcal{T}} \Big\{ \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} \tilde{X}_{mi}^k - C^k(T) \Big\}. \tag{21}
\]

Proof. Problem (21) is obtained from (18) by relaxing constraints (18b) and (18c). When, for
a given $i \in V$, $\sum_{m \in M} \tilde{X}_{mi}^k = 1$, constraint (18b) is implicitly satisfied, being the components
of $y$ at most 1. Thus, being $S_{\bar{\Phi}^k}(\tilde{X}^k)$ the set of optimal solutions of problem $\bar{\Phi}^k(\tilde{X}^k)$, we want
to prove that, if, for a given $i'$, $\sum_{m \in M} \tilde{X}_{mi'}^k = 0$, there exists $(\hat{y}^k, \hat{z}^k) \in S_{\bar{\Phi}^k}(\tilde{X}^k)$ such that $\hat{y}_{i'}^k = 0$.
Assuming that $\hat{y}_{i'}^k = 1$, the compensation collected from $i'$ would be 0 as $\tilde{X}_{mi'}^k = 0$ for all $m$.
Also, by the triangle inequality, going directly from the predecessor to the successor of $i'$ in
the optimal solution is cheaper (or at most has the same cost) than going through $i'$. Thus,
either the solution visiting $i'$ is not optimal, or there exists a solution that does not visit $i'$
with the same value of the objective function. Straightforwardly, since constraints (20) hold
for $\tilde{X}^k$, constraint (18c) will hold for $\hat{y}^k$.
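Following the same approach as in Proposition 1, we have the following single-level
reformulation:

(BPMD)
\begin{align}
\max_{X, w, y, z} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} w_{mi}^k \tag{22a} \\
\text{s.t.} \quad & \text{(17b), (19a)–(19b), (17d)–(17e)} \notag \\
& \sum_{i \in V} \sum_{m \in M} X_{mi}^k \le b^k && \forall\, k \in K \tag{22b} \\
& y_i^k \le \sum_{m \in M} X_{mi}^k && \forall\, i \in V,\ k \in K \tag{22c} \\
& (y^k, z^k) \text{ is a route} && \forall\, k \in K \tag{22d} \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} w_{mi}^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} X_{mi}^k - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K \tag{22e} \\
& X_m^k, w_m^k \in \{0,1\}^{n} && \forall\, m \in M,\ k \in K \tag{22f} \\
& y^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|} && \forall\, k \in K \tag{22g}
\end{align}

where constraints (22e) replace the value function constraints
\[
\sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} w_{mi}^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \Phi^k(X^k) \qquad \forall\, k \in K.
\]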

In the same way as we did in Section 4.3 for BPFM, by introducing the new real variable
$\theta^k$, constraints (22d) and (22e) in (BPMD) can be replaced by (12b) and
\[
\sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} w_{mi}^k - \theta^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} X_{mi}^k - C^k(T) \qquad \forall\, T \in \mathcal{T},\ k \in K, \tag{23}
\]

respectively, obtaining the new single-level formulation (BPMD-z), which is reported in Ap-
pendix B.
Furthermore, also for BPMD, as discussed for BPFM in Section 4.4, we can consider a
route duration limit instead of a bound on the number of packages to serve, by replacing
constraints (22b) with constraints (14) for (BPMD), or with constraints (15) for (BPMD-z).

5.2 Disaggregated formulation


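Using the disaggregated decision variables $Y_{mi}^k$ of alternative ii) we obtain the following
formulation:

\begin{align}
\max_{X, Y} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} Y_{mi}^k \tag{24a} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} X_{mi}^k \le 1 && \forall\, i \in V \tag{24b} \\
& Y^k \in S_{\Psi^k}(X^k) && \forall\, k \in K \tag{24c} \\
& X_m^k \in \{0,1\}^{n},\ Y_m^k \in \{0,1\}^{n+1} && \forall\, m \in M,\ k \in K, \tag{24d}
\end{align}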
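where $S_{\Psi^k}(X^k)$ is the set of optimal solutions of the $k$-th follower problem, which, for a given
$\tilde{X}^k$, reads:

\begin{align}
\Psi^k(\tilde{X}^k) = \max_{Y, z} \quad & \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} Y_{mi}^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \tag{25a} \\
\text{s.t.} \quad & Y_{mi}^k \le \tilde{X}_{mi}^k && \forall\, i \in V,\ m \in M \tag{25b} \\
& \sum_{i \in V} \sum_{m \in M} Y_{mi}^k \le b^k \tag{25c} \\
& \Big( \sum_{m \in M} Y_m^k,\ z^k \Big) \text{ is a route} \tag{25d} \\
& Y_m^k \in \{0,1\}^{n+1},\ z^k \in \{0,1\}^{|A|} && \forall\, m \in M. \tag{25e}
\end{align}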


In this formulation, thanks to the disaggregated variables $Y_{mi}^k$, there is no bilinear product to
linearize. In this case, a route $T \in \mathcal{T}$ corresponds to the pair $(\sum_{m \in M} Y_m^k, z^k)$. As before, under
the assumption that the costs $c_{ij}^k$ satisfy the triangle inequality, the following result holds.

Proposition 3. For any $k \in K$, given a vector $\tilde{X}^k \in \{0,1\}^n$ satisfying constraints (24b),
and such that $\sum_{i \in V} \sum_{m \in M} \tilde{X}_{mi}^k \le b^k$, $\forall\, k \in K$, there always exists an optimal solution of the
following problem, which is also optimal for $\Psi^k(\tilde{X}^k)$:
\[
\bar{\Psi}^k(\tilde{X}^k) = \max_{T \in \mathcal{T}} \Big\{ \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} \tilde{X}_{mi}^k - C^k(T) \Big\}. \tag{26a}
\]

Note that the proof of Proposition 3 is similar to the one of Proposition 1, so it is omitted.
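As for BPFM, we can obtain single-level reformulations of formulation (24) through the
value function approach or the no-good cuts approach. The value function reformulation
of (24) reads:

(BPMD$_d$)
\begin{align}
\max_{X, Y, z} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} Y_{mi}^k \tag{27a} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} X_{mi}^k \le 1 && \forall\, i \in V \tag{27b} \\
& \sum_{i \in V} \sum_{m \in M} X_{mi}^k \le b^k && \forall\, k \in K \tag{27c} \\
& Y_{mi}^k \le X_{mi}^k && \forall\, i \in V,\ m \in M,\ k \in K \tag{27d} \\
& \Big( \sum_{m \in M} Y_m^k,\ z^k \Big) \text{ is a route} && \forall\, k \in K \tag{27e} \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} Y_{mi}^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} X_{mi}^k - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K \tag{27f} \\
& X_m^k \in \{0,1\}^{n},\ Y_m^k \in \{0,1\}^{n+1} && \forall\, m \in M,\ k \in K \tag{27g} \\
& z^k \in \{0,1\}^{|A|} && \forall\, k \in K, \tag{27h}
\end{align}

where constraints (27f) reformulate, according to Proposition 3, the value function constraints
\[
\sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} Y_{mi}^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \Psi^k(X^k) \qquad \forall\, k \in K.
\]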

We can alternatively formulate the BPMD problem through the no-good cuts approach,
obtaining formulation (BPMD′ ), presented and discussed in Appendix D.
An equivalent single-level formulation, which we call (BPMD$_d$-z), can be obtained by
projecting out the $z$ variables and introducing the new variables $\theta^k$, replacing constraints (27e)
and (27f) in (BPMD$_d$) by
\begin{align}
& \theta^k \ge c^k_{TSP}(V(T)) \Big( \sum_{i \in V(T)} \sum_{m \in M} Y_{mi}^k - |V(T)| + 1 \Big) && \forall\, T \in \mathcal{T},\ k \in K, \tag{28a} \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} Y_{mi}^k - \theta^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} X_{mi}^k - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K, \tag{28b}
\end{align}
respectively. The obtained formulation (BPMD$_d$-z) is reported in Appendix B.
Furthermore, if a route duration limit has to be considered instead of the capacity con-
straint, constraints (27c) are replaced by constraints (14) for (BPMDd ), or by constraints (15)
for (BPMDd -z).

5.3 Comparing the BPMD formulations


In this section, we compare the two BPMD value function formulations, (BPMD) and (BPMDd ),
in terms of the value of their linear relaxations. In the following theorem, we prove that the
aggregated formulation (BPMD) is as strong as formulation (BPMDd ) (i.e., its linear relax-
ation provides equivalent upper bounds). Let us define vLP (BPMD) and vLP (BPMDd ) as the
optimal values of the LP relaxation of (BPMD) and (BPMDd ), respectively.

Theorem 1. Any LP feasible solution $(\tilde{X}, \tilde{y}, \tilde{w}, \tilde{z})$ of the model (BPMD) can be translated
into an LP feasible solution $(\tilde{X}, \tilde{Y}, \tilde{z})$ of the model (BPMD$_d$) by imposing that, for all $i \in V$,
$m \in M$, $k \in K$:
\[
\tilde{Y}_{mi}^k = \tilde{w}_{mi}^k, \tag{29}
\]
and vice versa. Hence $v_{LP}(\text{BPMD}) = v_{LP}(\text{BPMD}_d)$.

Proof. Let us start by proving that $v_{LP}(\text{BPMD}) \le v_{LP}(\text{BPMD}_d)$, i.e., that any LP solution
$(\tilde{X}, \tilde{y}, \tilde{w}, \tilde{z})$ of (BPMD) is also feasible in (BPMD$_d$), when imposing Eq. (29). Indeed,
constraints (27b) and (27c) are constraints (17b) and (22b), respectively. Constraints (27d)
correspond to (17d). Constraints (27f) are equivalent to constraints (22e). Finally,
constraints (27e) correspond to (22d) combined with constraints (19b). Hence the inequality
$v_{LP}(\text{BPMD}) \le v_{LP}(\text{BPMD}_d)$ holds. Let us now consider an LP solution $(\tilde{X}, \tilde{Y}, \tilde{z})$ of (BPMD$_d$).
We need to show that it is LP feasible also for (BPMD) if Eq. (29) is imposed. Indeed, as
already said, constraints (17b), (17d), (22b), and (22e) correspond to (27b), (27d), (27c),
and (27f), respectively. For the remaining constraints, in model (BPMD), we can replace
variable $y_i^k$ by $\sum_{m \in M} w_{mi}^k$ by constraint (19b). Therefore:

• Constraints (17e) read $w_{mi}^k \le \sum_{u \in M} w_{ui}^k$. They are satisfied by $(\tilde{X}, \tilde{Y}, \tilde{z})$, because $\tilde{Y}_{mi}^k \ge 0$
for all $m \in M$, $i \in V$, $k \in K$, and thus $\tilde{Y}_{mi}^k \le \sum_{u \in M} \tilde{Y}_{ui}^k$.

• Constraints (19a) read $\sum_{u \in M} w_{ui}^k \le w_{mi}^k + \sum_{u \in M : u \ne m} X_{ui}^k$. They are satisfied by $(\tilde{X}, \tilde{Y}, \tilde{z})$
because for all $m \in M$, $i \in V$, $k \in K$:
\[
\sum_{u \in M} \tilde{Y}_{ui}^k \le \tilde{Y}_{mi}^k + \sum_{u \in M : u \ne m} \tilde{X}_{ui}^k \iff \sum_{u \in M : u \ne m} \tilde{Y}_{ui}^k \le \sum_{u \in M : u \ne m} \tilde{X}_{ui}^k
\]
holds since $\tilde{Y}_{ui}^k, \tilde{X}_{ui}^k \ge 0$ and $\tilde{Y}_{ui}^k \le \tilde{X}_{ui}^k$ for all $u \in M$, $i \in V$, $k \in K$ from (27d).

• Constraints (22c) read $\sum_{m \in M} w_{mi}^k \le \sum_{m \in M} X_{mi}^k$. They are satisfied by $(\tilde{X}, \tilde{Y}, \tilde{z})$ because
$\sum_{m \in M} \tilde{Y}_{mi}^k \le \sum_{m \in M} \tilde{X}_{mi}^k$ holds since $\tilde{Y}_{mi}^k, \tilde{X}_{mi}^k \ge 0$ and $\tilde{Y}_{mi}^k \le \tilde{X}_{mi}^k$ for all $m \in M$, $i \in V$,
$k \in K$ from (27d).

• Constraints (22d) read "$(\sum_{m \in M} w_m^k, z^k)$ is a route" for all $k \in K$, which correspond to
constraints (27e).

Hence, also inequality $v_{LP}(\text{BPMD}) \ge v_{LP}(\text{BPMD}_d)$ holds. Consequently, we have that
$v_{LP}(\text{BPMD}) = v_{LP}(\text{BPMD}_d)$.

6 Valid inequalities
In this section, we present some valid inequalities that strengthen the proposed formulations,
cutting off parts of the feasible domain due to symmetries or dominance conditions.
First of all, we consider the setting in which the carriers’ problems are all equivalent, i.e.,
when bk (or tkmax ) and ck are the same for all k in K. In this case (which is the case we
consider in our numerical experiments, see Section 8), whenever we impose an inequality for
a given k̄, we can add the same inequality for all k ∈ K.
In the same setting, we can consider the so-called symmetry-breaking inequalities which
help reduce the number of equivalent solutions in the platform’s feasible region, making
it faster for optimization algorithms to find the optimal solution. The symmetry-breaking
inequalities read, for formulation (BPFM):
\[
\sum_{i \in V} x_i^{k-1} \ge \sum_{i \in V} x_i^k \qquad \forall\, k \in K \setminus \{1\},
\]
and for formulations (BPMD) and (BPMD$_d$):
\[
\sum_{i \in V} \sum_{m \in M} X_{mi}^{k-1} \ge \sum_{i \in V} \sum_{m \in M} X_{mi}^k \qquad \forall\, k \in K \setminus \{1\},
\]
imposing that the number of customers assigned to carrier $k-1$ is at least the number
of customers assigned to carrier $k$.
The second type of valid inequalities we can add is from the family of so-called cover
inequalities. They can be added to the model also if the carriers are not necessarily equiv-
alent, but have a capacity constraint (i.e., we consider formulations with constraints on the
maximum number of packages to serve). This family of inequalities prevents the formation of
suboptimal or inefficient customer visit sequences by specifying certain patterns that should
be avoided. Indeed, for the models with the constraint on the maximum number of cus-
tomers each carrier k can serve, the platform could identify, for each carrier k, all the sets of
bk customers that, even when setting the compensations to the highest values for the carrier
(and performing the optimal tour to serve them), would not be served together, because
they would yield a negative profit for carrier $k$. Set $S \subset V$ is called a cover with respect to
carrier $k \in K$, if $|S| = b^k$ and $\sum_{i \in S} (1 - m_{\min}) p_i - c^k_{TSP}(S) < 0$. For any $k \in K$, given its cover
$S$, we can add the following valid inequality for formulations (BPFM), and (BPFM-z):
\[
\sum_{i \in S} x_i^k \le b^k - 1, \tag{30}
\]
and for formulations (BPMD), (BPMD$_d$), (BPMD-z), and (BPMD$_d$-z):
\[
\sum_{i \in S} \sum_{m \in M} X_{mi}^k \le b^k - 1, \tag{31}
\]
which exclude the identified suboptimal tour.


Identifying all the sets $S$ which correspond to inefficient assignment decisions may be
computationally heavy. Indeed, checking for all $k \in K$ all the subsets of cardinality $b^k$ means
enumerating $\sum_{k \in K} \binom{n}{b^k}$ many subsets $S$. Furthermore, computing the optimal tour serving the
set of customers $S$ is NP-hard. Thus, even for small values of $|K|$ and $b^k$, this approach would
be computationally intractable. Alternatively, one could separate these cover inequalities
dynamically and/or heuristically (see Section 7.1).
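The cover definition can be illustrated by plain enumeration on a toy instance. The sketch below lists, for one carrier, every set $S$ with $|S| = b^k$ whose best-case profit $\sum_{i \in S} (1 - m_{\min}) p_i - c^k_{TSP}(S)$ is negative, and also counts how many subsets a full enumeration over all carriers would inspect. The prices, costs, minimum margin and function names are hypothetical, and the tour cost is computed by brute force, which is only viable for very small sets.

```python
from itertools import combinations, permutations
from math import comb


def tsp_cost(subset, cost):
    """Cheapest closed tour from depot 0 through all nodes of subset (brute force)."""
    return min(sum(cost[a][b] for a, b in zip((0, *p), (*p, 0)))
               for p in permutations(subset))


def find_covers(prices, cost, capacity, m_min):
    """All sets S with |S| = capacity whose best-case carrier profit is negative."""
    return [S for S in combinations(sorted(prices), capacity)
            if sum((1 - m_min) * prices[i] for i in S) - tsp_cost(S, cost) < 0]


# Hypothetical data: depot 0, three customers, symmetric costs, b^k = 2.
cost = [[0, 2, 3, 4],
        [2, 0, 1, 5],
        [3, 1, 0, 2],
        [4, 5, 2, 0]]
prices = {1: 2, 2: 2, 3: 30}
capacities = [2, 2]  # two identical carriers

covers = find_covers(prices, cost, capacity=2, m_min=0.5)
num_subsets = sum(comb(len(prices), b) for b in capacities)
print(covers, num_subsets)
```

Here only the pair of low-priced customers forms a cover, so inequality (30) would forbid offering both of them to the same carrier.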
Additionally to the above presented valid inequalities, we note that all the formulations
could be further strengthened by leveraging the WTA-PTP formulation. Indeed, including
constraints imposing the non-negativity of the carriers' profits, i.e., constraints (1d)
\[
\sum_{i \in V} \bar{p}_i^k \alpha_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge 0 \qquad \forall\, k \in K
\]

for the setting with fixed margins, or constraints (35d)


XX X
p̄kmi Aki − ckij zijk ≥ 0 ∀k∈K
i∈V m∈M (i,j)∈A

for the one with margin decisions, leads to better LP relaxations. As for the no-good cuts
based formulation (16) (or (36)), with the same purpose, introducing the binary variable $z$,
besides constraints (1d) (or (35d)) one could include also constraints (1e) (or (35e)), which
impose that $z^k$ is the incidence vector of a route that visits the customers $i$ s.t. $\alpha_i^k = 1$ (or
$\sum_{m \in M} A_{mi}^k = 1$, respectively).

7 Solution approach
In this section, we present the solution approach we developed to solve the single-level problem
reformulations proposed in Sections 4 and 5. We designed a branch-and-cut algorithm where
value function constraints, the no-good cuts for formulation (16), as well as the ones related
to the value of θ when projecting out z variables, are separated dynamically as they are
exponentially many. First, in Section 7.1, we discuss the separation procedures for the ex-
ponentially many constraints included in the models. Second, in Section 7.2, we present a
heuristic algorithm, based on the solution of the BPFM, to generate a feasible solution used
as a warm-start for the exact approach for the BPMD.

7.1 Separation procedures
In this section, we describe the separation procedures we use to dynamically detect violated
value function constraints in the single-level reformulations presented in Section 3. Indeed,
we notice that these constraints are exponentially many and NP-hard to separate, as they
require finding an optimal PTP solution, corresponding to an optimal follower response for
a given assignment x of the leader. As for the formulation with the no-good cuts, these
constraints themselves (i.e., constraints (16d) and (36d)) need to be separated by solving
a PTP for each carrier. Concerning the formulations with the z variables, we notice that
there is an exponential number of constraints of type (10d) in (BPFM), and of type (22e)
and (27f) in (BPMD) and (BPMDd ). In the formulations obtained by projecting out the z
variables, not only constraints (12a) in (BPFM-z), (23) in (BPMD-z), and (28b) in (BPMDd -z)
are exponential in number, but also constraints (12b) and (28a), respectively. The separation
of cover inequalities introduced in Section 6 also requires solving an NP-hard problem.
We thus propose a separation procedure for these constraints, which we describe in the
following referring to the problem with fixed margins, without loss of generality. We note
that all constraints are separated on integer solutions only, in order to speed up the solution
process.
Separation of constraints (10d)
When dealing with formulation (BPFM), for any given solution $(\tilde{x}, \tilde{y}, \tilde{z})$ of the master problem, obtained by relaxing constraints (10d), we solve the PTP with $x = \tilde{x}$ for each $k$. Let $(\hat{y}^k, \hat{z}^k)$ be the optimal solution of the latter problem, with $\hat{T}^k$ and $\hat{\phi}^k$ being the corresponding tour and value, respectively. If there exists a $k$ such that $\sum_{i \in V} \bar{p}_i^k \tilde{y}_i^k - \sum_{(i,j) \in A} c_{ij}^k \tilde{z}_{ij}^k < \hat{\phi}^k$, then the following constraint

$$\sum_{i \in V} \bar{p}_i^k y_i^k - \sum_{(i,j) \in A} c_{ij}^k z_{ij}^k \ge \sum_{i \in V(\hat{T}^k)} \bar{p}_i^k x_i^k - C^k(\hat{T}^k)$$

is violated by the current solution of the master problem. Thus, we insert it into the master
problem. Otherwise, the obtained solution is feasible (optimal if we are at the root node) for
the original bilevel formulation.
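The check above can be sketched in a few lines for toy instances. The snippet below is only an illustration: it solves the follower problem by brute force (which is viable only for a handful of customers), it ignores capacity and duration limits, and all names (`tour_cost`, `solve_ptp`, `value_function_cut_violated`) are hypothetical and not part of the implementation used in the experiments.

```python
from itertools import combinations, permutations


def tour_cost(order, cost, depot=0):
    """Cost of the closed route depot -> order[0] -> ... -> order[-1] -> depot."""
    if not order:
        return 0
    path = (depot, *order, depot)
    return sum(cost[a][b] for a, b in zip(path, path[1:]))


def solve_ptp(assigned, comp, cost, depot=0):
    """Brute-force follower response for one carrier.

    Returns (phi_hat, tour): the best profit and the visiting order achieving
    it. Serving nothing (profit 0) is always allowed.
    """
    best_value, best_tour = 0, ()
    for r in range(1, len(assigned) + 1):
        for subset in combinations(assigned, r):
            revenue = sum(comp[i] for i in subset)
            for order in permutations(subset):
                value = revenue - tour_cost(order, cost, depot)
                if value > best_value:
                    best_value, best_tour = value, order
    return best_value, best_tour


def value_function_cut_violated(master_profit, phi_hat, tol=1e-6):
    """True if the carrier's profit in the master solution is below phi_hat."""
    return master_profit < phi_hat - tol


# Toy data (assumed): depot 0, customers 1 and 2, symmetric costs.
cost = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
comp = {1: 6, 2: 3}
phi_hat, tour = solve_ptp([1, 2], comp, cost)
master_profit = comp[1] + comp[2] - tour_cost((1, 2), cost)
print(phi_hat, tour, value_function_cut_violated(master_profit, phi_hat))
```

In this toy case the master solution lets the carrier serve both customers at a loss, while the carrier would rather serve customer 1 only, so a value function cut would be added.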
Separation of constraints (12b)
When considering formulation (BPFM-z), we have to separate constraints (12b) as well, which are also exponentially many. Thus, we first solve the master problem obtained by relaxing both (12a) and (12b), finding the solution $(\tilde{x}, \tilde{y}, \tilde{\theta})$, with the corresponding tour $\tilde{T}^k$ (with $c^k_{\mathrm{TSP}}(V(\tilde{T}^k))$ the cost of the optimal route visiting the nodes of $\tilde{T}^k$). Then, we solve the lower-level problems with $x = \tilde{x}$ for each $k$, obtaining the optimal solution $(\hat{y}^k, \hat{z}^k)$, the corresponding tour $\hat{T}^k$, and the value $\hat{\phi}^k$.
In the case in which there exists a $k$ such that $\tilde{\theta}^k < c^k_{\mathrm{TSP}}(V(\tilde{T}^k))$, we insert the following constraint:

$$\theta^k \ge c^k_{\mathrm{TSP}}(V(\tilde{T}^k)) \Bigl( \sum_{i \in V(\tilde{T}^k)} y_i^k - |V(\tilde{T}^k)| + 1 \Bigr). \tag{32}$$

If instead, for all $k$, inequality

$$\tilde{\theta}^k \ge c^k_{\mathrm{TSP}}(V(\tilde{T}^k)) \tag{33}$$

does hold, then we proceed with the separation of (12a) as we did for constraints (10d) above, i.e., if there exists a $k$ such that $\sum_{i \in V} \bar{p}_i^k \tilde{y}_i^k - \tilde{\theta}^k < \hat{\phi}^k$, then we add the following constraint

$$\sum_{i \in V} \bar{p}_i^k y_i^k - \theta^k \ge \sum_{i \in V(\hat{T}^k)} \bar{p}_i^k x_i^k - C^k(\hat{T}^k)$$

to the master.
Separation of constraints (15)
When taking into account the maximum duration constraints (15) on variable θk , we use the
following separation procedure for each k:

• if $c^k_{\mathrm{TSP}}(V(\tilde{T}^k)) \le t^k_{\max}$, we add cut (32);

• otherwise (i.e., if $c^k_{\mathrm{TSP}}(V(\tilde{T}^k)) > t^k_{\max}$), we cut off the solution $\tilde{y}^k$ because it is infeasible, adding the following no-good cut: $\sum_{i \in V(\tilde{T}^k)} y_i^k \le |V(\tilde{T}^k)| - 1$.
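The two cases above, together with cut (32), can be illustrated with a small sketch. It assumes the optimal TSP cost of the current node set is already known, and it adds cut (32) only when the current value of θ is too small, which is our reading of the procedure. The function names and return format are hypothetical.

```python
def route_cost_cut_rhs(tsp_cost, tour_nodes, y):
    """Right-hand side of cut (32) evaluated at a 0/1 acceptance vector y."""
    return tsp_cost * (sum(y[i] for i in tour_nodes) - len(tour_nodes) + 1)


def separate_route_cut(theta, tsp_cost, tour_nodes, t_max=None, tol=1e-6):
    """Decide which cut, if any, to add for one carrier.

    Returns ("no_good", rhs) when the route exceeds the duration limit,
    ("route_cost", tsp_cost) when theta underestimates the route cost,
    and None when nothing is violated.
    """
    if t_max is not None and tsp_cost > t_max:
        # No-good cut: sum of y_i over the tour nodes <= |V(T)| - 1.
        return ("no_good", len(tour_nodes) - 1)
    if theta < tsp_cost - tol:
        return ("route_cost", tsp_cost)
    return None


# The bound of cut (32) is active only if every node of the tour is served.
print(route_cost_cut_rhs(11, [1, 2], {1: 1, 2: 1}))
print(route_cost_cut_rhs(11, [1, 2], {1: 1, 2: 0}))
```

The second call shows why cut (32) is valid: as soon as one node of the tour is dropped, the right-hand side becomes non-positive and the constraint no longer restricts θ.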

Separation of no-good cuts (16d)


Given formulation (BPFM′), the separation procedure we follow consists of: a) solving the relaxation obtained by dropping the no-good cuts (16d) to get a solution $\tilde{\alpha}$ and the corresponding sets $\tilde{V}^k = \{i \in V : \tilde{\alpha}_i^k = 1\}$ for all $k \in K$; b) solving the $|K|$ PTP problems (4) with $x = \tilde{\alpha}$, obtaining the optimal carriers’ responses $y^k$; and c) in case there is a $k \in K$ for which $y^k < \tilde{\alpha}^k$ (i.e., carrier $k$ rejects at least one assigned item), adding the following no-good cut to the relaxation of (16):

$$\sum_{i \in \tilde{V}^k} (1 - \alpha_i^k) + \sum_{i \notin \tilde{V}^k} \alpha_i^k \ge 1;$$

otherwise, the obtained solution is feasible for the original formulation.


We highlight here that this type of cut only excludes the current tentative assignment of the master problem, which explains why the use of these no-good cuts turns out to be inefficient in practice (see the numerical results in Section 8).
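The weakness of the no-good cut can be verified by enumeration on a toy example. The sketch below (with hypothetical names and an assumed set of four customers) evaluates the left-hand side of the cut on every binary assignment vector and counts how many are excluded.

```python
from itertools import product


def no_good_cut_lhs(alpha, support):
    """Left-hand side of the no-good cut built from the set `support`.

    alpha   -- dict mapping each customer to 0 or 1 (candidate assignment)
    support -- set of customers assigned in the solution being cut off
    """
    inside = sum(1 - alpha[i] for i in support)
    outside = sum(alpha[i] for i in alpha if i not in support)
    return inside + outside


customers = [1, 2, 3, 4]
support = {1, 3}  # assumed tentative assignment of the master problem
excluded = [
    bits
    for bits in product((0, 1), repeat=len(customers))
    if no_good_cut_lhs(dict(zip(customers, bits)), support) < 1
]
print(excluded)
```

Only the vector that coincides with the tentative assignment violates the cut, so each separation round removes a single assignment for the carrier.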
Separation of cover inequalities (30) and (31)
In Section 6, cover inequalities have been introduced as valid inequalities for formulations with the bound on the number of packages each carrier can deliver. They can be separated dynamically in the following way. For a given $k \in K$, let $\tilde{x}^k$ be a solution to the master problem. If $\sum_{i \in V} \tilde{x}_i^k = b^k$, let $\tilde{S}$ be the support set of $\tilde{x}^k$. If $\sum_{i \in \tilde{S}} (1 - m_{\min}) p_i - c^k_{\mathrm{TSP}}(\tilde{S}) < 0$, then $\tilde{S}$ defines a cover, and we can add the corresponding cover inequality (30) or (31) to the model.
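A minimal sketch of this dynamic check is given below. The TSP cost is computed by brute force, so the snippet is meant for toy instances only. The function names and data are assumptions made for illustration.

```python
from itertools import permutations


def tsp_cost(nodes, cost, depot=0):
    """Cost of the cheapest closed route from the depot through all `nodes`."""
    best = float("inf")
    for perm in permutations(nodes):
        path = (depot, *perm, depot)
        best = min(best, sum(cost[a][b] for a, b in zip(path, path[1:])))
    return best


def is_cover(x_k, prices, m_min, cost, b, depot=0):
    """Check whether the support of the 0/1 vector x_k defines a cover.

    The support must contain exactly b items, and the highest possible
    compensation must not pay for the optimal route serving them.
    """
    support = [i for i, value in x_k.items() if value == 1]
    if len(support) != b:
        return False
    revenue = sum((1 - m_min) * prices[i] for i in support)
    return revenue - tsp_cost(support, cost, depot) < 0


# Toy data (assumed): depot 0 and two customers.
cost = [[0, 2, 5], [2, 0, 4], [5, 4, 0]]
print(is_cover({1: 1, 2: 1}, {1: 6, 2: 3}, 0.5, cost, b=2))
```

When `is_cover` returns `True`, inequality (30) (or (31)) for the support set can be added to the master problem.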
As an alternative, the following heuristic can be executed as a preprocessing step in order to find some of the covers $S$. For each $k$, sort the customers in non-decreasing order of the associated compensation $(1 - m_{\min}) p_i$; take the first $\lceil b^k / 2 \rceil$ customers according to the sorting, defining the set $\bar{S}$; iteratively, for each vertex $u \in \bar{S}$, select the $\lfloor b^k / 2 \rfloor$ customers whose distances from $u$ are maximal. In this way, one obtains $\lceil b^k / 2 \rceil$ sets $S$ for which a valid cover inequality can be added.
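The preprocessing heuristic can be sketched as follows. The text does not spell out how each set $S$ is assembled, so the snippet relies on two assumptions of ours: each candidate is the union of $\bar{S}$ with the farthest customers from $u$, and those farthest customers are picked outside $\bar{S}$, so that $|S| = b^k$. Whether a candidate actually satisfies the cover condition is not checked here. All names are hypothetical.

```python
def candidate_covers(comp, dist, b):
    """Generate candidate cover sets for one carrier.

    comp -- dict customer -> compensation (1 - m_min) * p_i
    dist -- dist[u][j] is the distance between customers u and j
    b    -- capacity b^k of the carrier
    """
    ranked = sorted(comp, key=lambda i: comp[i])  # non-decreasing compensation
    base = ranked[: (b + 1) // 2]                 # first ceil(b / 2) customers
    others = [i for i in comp if i not in base]
    covers = []
    for u in base:
        # floor(b / 2) customers outside the base set that are farthest from u
        farthest = sorted(others, key=lambda j: dist[u][j], reverse=True)[: b // 2]
        covers.append(frozenset(base) | frozenset(farthest))
    return covers


# Toy data (assumed): five customers, capacity 3.
comp = {1: 5, 2: 1, 3: 9, 4: 2, 5: 7}
dist = {2: {1: 3, 3: 8, 5: 4}, 4: {1: 6, 3: 2, 5: 5}}
print(candidate_covers(comp, dist, b=3))
```

The intuition is that low-compensation customers that are far apart are the most likely to yield a route whose cost exceeds the total compensation.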
Finally, one can note that if

$$\min_{\substack{S \subset V \\ |S| = b^k}} \Bigl\{ \sum_{i \in S} (1 - m_{\min}) p_i - C^k(S) \Bigr\} \ge 0$$

for a given k, no violated cover inequality exists for the corresponding carrier k. We perform
this check as a preprocessing step in our experiments.

7.2 Heuristic solution approach


In this section, we describe a heuristic procedure used to obtain a warm-start feasible solution
for the branch-and-cut algorithm solving the BPMD, which possibly corresponds to a tighter
lower bound of the problem compared to the one provided by the UCC-PTP, or the one
found by solving the WTA-PTP and retrieving its associated feasible solution. It consists of
three phases, involving the solution of the BPFM.
First, the BPFM (modeled either as (BPFM), or as (BPFM-z), or as (BPFM′)) is solved
by setting all the compensations to their highest values, i.e., p̄ki = (1 − mmin )pi for all
i ∈ V, k ∈ K. In this way, we obtain a first feasible solution to the BPMD. Note that, in
case no package is assigned in this solution, then it means that no other solution exists in
which any package is served. Indeed, this is the most rewarding compensation decision for
the followers, and assigning no packages means that the compensations are still too low with
respect to the routing costs.
In case the solution to the first step is nonempty, in the second step, BPMD is solved with
all the variables, but the margin variables, fixed to the values returned by the first phase, plus
an additional constraint imposing that the profit of each carrier should be nonnegative. This
corresponds to solving the following auxiliary problem, which is a multiple-choice knapsack
solved independently for each follower:
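$$\max_{X} \; \sum_{k \in K} \sum_{m \in M} \sum_{i \in V(\hat{T}^k)} p_{mi} X_{mi}^k \tag{34a}$$

$$\text{s.t.} \quad \sum_{m \in M} X_{mi}^k = \hat{x}_i^k \qquad \forall\, i \in V,\ k \in K \tag{34b}$$

$$\sum_{m \in M} \sum_{i \in V(\hat{T}^k)} p_{mi} X_{mi}^k \le \sum_{i \in V(\hat{T}^k)} p_i - C^k(\hat{T}^k) \qquad \forall\, k \in K, \tag{34c}$$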

where x̂, and T̂ come from the assignment and routing decisions made at the first step of the
heuristic. The solution to the above problem determines which is the best margin decision
for the leader, i.e., X̌, given the assignment of the items decided in the previous phase.
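For a single follower, problem (34) can be illustrated by brute force on a toy example. The sketch assumes that $p_{mi} = m \cdot p_i$, i.e., that the platform keeps a share $m$ of the price of item $i$; this interpretation and all names below are assumptions made for illustration, and a real implementation would use a knapsack algorithm or a MIP solver instead of enumeration.

```python
from itertools import product


def best_margins(prices, margins, route_cost, tol=1e-9):
    """Brute-force multiple-choice knapsack for one carrier.

    prices     -- dict item -> price p_i of the items served in the route
    margins    -- iterable of admissible margins M
    route_cost -- cost C^k of the route found in the first step
    Returns (platform_share, {item: margin}) or None if nothing is feasible.
    """
    items = list(prices)
    budget = sum(prices.values()) - route_cost  # right-hand side of (34c)
    best = None
    for choice in product(margins, repeat=len(items)):
        share = sum(m * prices[i] for m, i in zip(choice, items))
        if share <= budget + tol and (best is None or share > best[0]):
            best = (share, dict(zip(items, choice)))
    return best


# Toy data (assumed): two items, two margins, route cost 18.
print(best_margins({1: 10, 2: 20}, (0.2, 0.5), route_cost=18))
```

In this toy case the platform can apply the high margin to one item only; with the selected margins the carrier's profit is exactly zero, so constraint (34c) is tight.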

Finally, in the third step, the BPFM is solved again by fixing the margins according to the solution obtained in the second step, i.e., for each $i$ and $k$, we set $\bar{p}_i^k = p_i - \sum_{m \in M} \check{X}_{mi}^k p_{mi}$. The
solution obtained in the third step, together with the X̌ found in the second step, is a feasible
solution to the problem. Note that this last step is needed as, otherwise, the solution obtained
from the second step might not be bilevel feasible, as it does not necessarily optimize the
followers’ objective. The algorithm will then return the best solution in terms of the leader’s
objective function between the one found in the first and the third step.
As already noted, the separation of no-good cuts is computationally more expensive than
that of value function cuts (see Section 8.1), thus the employment of (BPFM′ ) within the
heuristic is not preferable. Depending on the presence of a budget constraint ((4c), (18c) or
(25c)) or a route duration constraint ((14) or (15)) the problem to solve at each step changes.
Indeed, when there is a constraint on the duration of the route, it is better to have a problem
with the z variables in the upper level, i.e., (BPFM), since constraints (14) do not need to
be separated. Instead, some preliminary experiments showed that, as expected, when there
is no route duration constraint, the problem without the z variables (BPFM-z) turns out to
be faster. The overall heuristic is described in Algorithm 1.

Algorithm 1: Heuristic algorithm for BPMD


Input: Sets $V$, $A$, $K$, $M = \{m_{\min}, \dots, m_{\max}\}$, and parameters $p_i$ for all $i \in V$, $c_{ij}^k$ for all $(i,j) \in A$, and $b^k$ (or $t^k_{\max}$) for all $k \in K$.
Output: Feasible solution for BPMD.
1 Set $\bar{p}_i^k = (1 - m_{\min}) p_i$ for all $i \in V$, $k \in K$. Solve BPFM. Return the optimal solution in terms of assignment of the leader $\hat{x}$, and acceptance and routing decisions of the followers $\hat{T}$.
2 Solve problem (34). Return the optimal solution $\check{X}$.
3 Set $\bar{p}_i^k = p_i - \sum_{m \in M} \check{X}_{mi}^k p_{mi}$ for all $i \in V$, $k \in K$. Solve BPFM. Return the optimal solution $(\check{x}, \check{T})$.
4 return the best solution, in terms of the leader’s objective function, between $(\hat{x}, \hat{T})$ and $(\check{x}, \check{T})$.

8 Numerical Results
We test the different models developed in the previous sections on two sets of instances.
The first set of 15 instances is generated in the following way: the graph G = (V, A)
and the number of vehicles |K| are taken from the “Chao instances” (Chao et al., 1996),
originally proposed for the team orienteering problem (the number of customers |V | is in
{20, 31, 32, 63, 65} and the number of vehicles |K| is in {2, 3, 4}); costs ckij are assumed to
be the same for all the vehicles k ∈ K and equal to the Euclidean distance between i and j
for each arc (i, j), rounded up to the nearest integer; node prices pi are generated

pseudo-randomly in [0, 100] following the Generation 2 procedure proposed in (Fischetti et al.,
1998, Section 6), consisting in setting pi := 1 + (7141 · i + 73) mod(100) for each i ∈ V . The
prices are available as a feature in the instances proposed in Chao et al. (1996). However, these instances are conceived for team orienteering problem applications with two depots (one is the starting point and the other is the ending point of the vehicles’ routes). Since we are in a one-depot framework, we generate the node prices according to the formula above. We recall that each customer is associated with the demand for a single item.
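The price formula is fully deterministic and can be reproduced in one line. The sketch below assumes that customers are indexed by consecutive integers starting from 1; the function name is hypothetical.

```python
def node_price(i: int) -> int:
    """Pseudo-random price p_i = 1 + (7141 * i + 73) mod 100."""
    return 1 + (7141 * i + 73) % 100


# Prices of the first customers (indices assumed to start at 1).
print([node_price(i) for i in range(1, 5)])
```

By construction, every generated price is an integer between 1 and 100.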
The second set of 12 instances has the following characteristics: the graphs G = (V, A) are
taken from the famous benchmark set of instances known as “Solomon instances” R101 – ran-
domly distributed customers – (Solomon, 1987), with a number of customers in {20, 25, 30, 35};
we set |K| ∈ {1, 2, 3} following the same setting of the Chao instances; costs ckij are again
assumed to be the same for all the vehicles k ∈ K and equal to the Euclidean distances; node
prices pi are generated as for the former set of instances.
The bound $b^k$ on the number of items each vehicle can serve is set to $\lceil |V| / |K| \rceil + 2$. The upper bound on the duration of the route $t^k_{\max}$, which we discussed in Section 4.4, is only available for the Chao instances, thus we only solve this type of instances when considering the duration
constraint. In particular, each Chao instance has a specific value of tkmax identified by an
alphabetic letter. We take the instances of type “k”.
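As a quick numerical illustration of the capacity rule $b^k = \lceil |V| / |K| \rceil + 2$, the following sketch evaluates it with integer arithmetic. The function name and the instance sizes in the usage line are illustrative choices within the ranges mentioned above.

```python
def capacity_bound(num_customers: int, num_vehicles: int) -> int:
    """Bound b^k = ceil(|V| / |K|) + 2, computed without floating point."""
    return -(-num_customers // num_vehicles) + 2


# Example sizes chosen for illustration.
print(capacity_bound(20, 2), capacity_bound(31, 3), capacity_bound(65, 4))
```

With this rule the total capacity of the fleet always exceeds the number of customers, so capacity alone never prevents serving all of them.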
The proposed formulations are implemented in Python 3.10 and solved by using the Cplex
solver (version 22.1.0.0) (IBM, 2017), with a time limit of one hour. All the experiments are
conducted on a 3.7 GHz Intel Xeon W-2255 CPU, 128 GB RAM.
We present the numerical results obtained by testing the formulations of the problem
with fixed margin in Section 8.1, and the ones obtained by testing the formulations of the
problem with margin decisions in Section 8.2. In Section 8.3, we discuss the gain of the
margin decisions through the analysis of two Solomon instances.

8.1 Results on the fixed margins problem


In this section, we report the summary of the results obtained by solving the two formulations
proposed for the problem with fixed margins: (BPFM) and (BPFM′). We test these models on
the Chao and the Solomon instances described above, considering margins: 0.2, 0.5, 0.7, 0.8,
and 0.9. For each instance, we assume the same margin m̂ is applied to all items and carriers,
i.e., p̄ki = (1 − m̂)pi for all i ∈ V, k ∈ K. In Table 1, there are two blocks of columns, one
for each of the formulations considered. For each of the two we report: LB and UB, the
average lower and upper bound at termination, respectively; gap, the average percentage gap
returned by Cplex at termination; time, the average computing time in seconds; #nodes, the
average number of nodes of the branch-and-cut tree at termination; %served, the percentage
of served customers. The first part of the table is associated with the 15 Chao instances, and
the second part with the 12 Solomon instances.
It is clear from these results that (BPFM) is the best formulation in terms of computational
efficiency. For all the tested margins, indeed, both the gap and the time of (BPFM) are better
than the ones of (BPFM′ ). These results motivate the use of model (BPFM) in the execution

| | Model (BPFM) | | | | | | Model (BPFM′) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Margin | LB | UB | gap | time | #nodes | %served | LB | UB | gap | time | #nodes | %served |
| Chao instances |
| 0.2 | 430.4 | 430.4 | 0.00 | 562.7 | 37 | 100 | 394.2 | 430.4 | 8.41 | 1292.7 | 48 | 94.1 |
| 0.5 | 1076.0 | 1076.0 | 0.00 | 260.8 | 120 | 100 | 1076.0 | 1076.0 | 0.00 | 836.6 | 112 | 99.9 |
| 0.7 | 1377.6 | 1506.7 | 8.55 | 1516.5 | 2060 | 92.9 | 1230.1 | 1506.4 | 18.34 | 1572.9 | 800 | 83.6 |
| 0.8 | 1631.2 | 1721.6 | 5.25 | 2876.4 | 16745 | 92.7 | 1074.0 | 1721.6 | 37.61 | 2990.0 | 3180 | 64.5 |
| 0.9 | 1601.4 | 1936.8 | 17.32 | 3600 | 202061 | 76.5 | 831.4 | 1936.8 | 57.07 | 3600 | 259301 | 43.1 |
| Solomon instances |
| 0.2 | 270.5 | 270.5 | 0.00 | 256.5 | 1707 | 100 | 270.3 | 270.5 | 0.08 | 538.9 | 694 | 99.5 |
| 0.5 | 674.6 | 676.3 | 0.24 | 1307 | 39087 | 98.3 | 620.2 | 676.3 | 8.29 | 2034 | 8682 | 90.5 |
| 0.7 | 779.7 | 942.1 | 17.2 | 2564 | 420935 | 74.1 | 774.7 | 943.4 | 17.9 | 2742 | 725464 | 73.5 |
| 0.8 | 357.3 | 357.3 | 0.00 | 205.3 | 57739 | 20.4 | 357.3 | 362.6 | 1.43 | 498.1 | 102156 | 21.0 |
| 0.9 | 0.0 | 0.0 | 0.00 | 0.3 | 0 | 0 | 0.0 | 0.0 | 0.00 | 0.3 | 0 | 0 |

Table 1: Comparison between model (BPFM) and (BPFM′ ).

of steps 1 and 3 of the heuristic given in Algorithm 1, as well as the choice of discarding
(BPMD′ ) in the tests presented in the following section.

8.2 Results on the problem with margin decisions


In this section, we discuss the numerical results obtained by testing the formulations pro-
posed for the problem with margin decisions. We consider the following sets of margins M :
{0.2, 0.5}, {0.5, 0.9}, {0.2, 0.5, 0.8}, and {0.5, 0.7, 0.9}. We restrict our tests to two or three
margins as one might reasonably assume that, in practical settings, the choice may be among
low and high margins or low, medium, and high margins. The value of the margins is se-
lected after some preliminary experiments aimed at identifying values that generate different
solution structures (as shown in the following).
The feasible solutions found by the heuristic Algorithm 1 are used as MIP start for
Cplex. At steps 1 and 3, the heuristic solves either the (BPFM-z) formulation or the (BPFM) formulation, when the budget constraint or the route duration limit is considered, respectively. A time limit of 1 hour is set for each heuristic phase, as well as for solving each model.
A first set of experiments is performed comparing the bilevel solutions obtained through
the heuristic proposed in Algorithm 1, with the ones of the WTA-PTP formulation (35) with
margin decisions (see the corresponding formulation in Appendix C) and the UCC-PTP
formulation (2) for the lowest values of margins in M (i.e., the highest compensation for
the carriers). Indeed, we cannot include margin decisions in the UCC-PTP problem, as it is
reasonable to assume that, in this setting, the carriers will always select the highest possible
compensation if multiple ones are available, which means setting p̄ki = (1 − mmin )pi ∀i ∈
V, k ∈ K in the objective function of formulation (2). We perform these experiments first
of all in order to evaluate the quality of the feasible solutions returned by the heuristic
proposed in Algorithm 1, which solves a sequence of BPFM formulations, against the quality
of the solutions of the single-level UCC-PTP and of the recovered solutions of the single-
level WTA-PTP. Furthermore, these tests allow us to evaluate the impact of changes in the
problem setting on the value of the platform solution.
Tables 2 and 3 summarize the results on the instances with capacity constraints and

route duration constraints, respectively. For the heuristic Algorithm 1, and the UCC-PTP
formulation (2), we report: LB, the average lower bound returned by the method; time,
the average computing time (in seconds) needed to return the solution; gap LB, the average
percentage gap between LB and the best lower bound LB∗ on the optimal solution returned by the four methods (BPMD), (BPMDd), (BPMD-z) and (BPMDd-z) (gap LB = $100\,\frac{\mathrm{LB} - \mathrm{LB}^*}{\mathrm{LB}^*}$).
For the WTA-PTP formulation (35), we report: LBrec, the average profit gained by the
platform when considering carriers’ reaction to the assignment corresponding to the WTA-
PTP solution (i.e., the value of what we define as recovered solution, see Section 3.1); UB,
the average upper bound returned by the method; time, the average computing time needed
to return the corresponding solution; gap UB, the average percentage gap between UB and
the best lower bound LB∗ on the optimal solution returned by the four methods (BPMD), (BPMDd), (BPMD-z) and (BPMDd-z) (gap UB = $100\,\frac{\mathrm{UB} - \mathrm{LB}^*}{\mathrm{LB}^*}$); gap LBrec, the average percentage gap between the profit associated with the recovered solution, i.e., LBrec, and LB∗ (gap LBrec = $100\,\frac{\mathrm{LBrec} - \mathrm{LB}^*}{\mathrm{LB}^*}$).
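All three gaps share the same formula, which can be written as a small helper. The function name and the numbers in the usage lines are illustrative assumptions. Note that the tables report averages of per-instance gaps, which in general differ from the gap computed on the averaged bounds.

```python
def percentage_gap(value: float, best_lb: float) -> float:
    """Percentage gap 100 * (value - LB*) / LB* with respect to LB*."""
    return 100.0 * (value - best_lb) / best_lb


# A bound below LB* gives a negative gap, a bound above LB* a positive one.
print(percentage_gap(90, 100))
print(percentage_gap(110, 100))
```

Negative values therefore indicate solutions worse than the best known bilevel feasible one, while positive values are obtained only for upper bounds.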

| | Heuristic | | | UCC-PTP | | | WTA-PTP | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LB | time | gap LB | LB | time | gap LB | LBrec | UB | time | gap UB | gap LBrec |
| Chao instances |
| {0.2, 0.5} | 1076 | 141 | 0.00 | 430 | 1759 | -60.1 | 1072 | 1076 | 0.48 | 0.00 | -0.45 |
| {0.5, 0.9} | 1876 | 3484 | -1.88 | 1074 | 1782 | -43.8 | 1730 | 1937 | 243 | 1.54 | -13.6 |
| {0.2, 0.5, 0.8} | 1716 | 2344 | -0.28 | 430 | 1759 | -75.0 | 1664 | 1722 | 0.91 | 0.18 | -4.58 |
| {0.5, 0.7, 0.9} | 1881 | 3336 | -2.02 | 1074 | 1782 | -43.9 | 1734 | 1937 | 16.5 | 1.19 | -10.5 |
| Solomon instances |
| {0.2, 0.5} | 672 | 1865 | -0.63 | 266 | 330.4 | -60.6 | 624 | 676 | 0.45 | 0.08 | -7.97 |
| {0.5, 0.9} | 770 | 4902 | -15.4 | 656 | 315.9 | -28.0 | 576 | 1006 | 1478 | 9.4 | -36.3 |
| {0.2, 0.5, 0.8} | 790 | 3927 | -16.8 | 266 | 330.4 | -71.9 | 727 | 1007 | 1119 | 5.36 | -23.7 |
| {0.5, 0.7, 0.9} | 773 | 4197 | -18.1 | 656 | 315.9 | -30.9 | 628 | 1014 | 1544 | 6.28 | -32.0 |

Table 2: Comparison between Algorithm 1, the single-level models UCC-PTP (2) and WTA-
PTP (35).

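| | Heuristic | | | UCC-PTP | | | WTA-PTP | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LB | time | gap LB | LB | time | gap LB | LBrec | UB | time | gap UB | gap LBrec |
| Chao instances |
| {0.2, 0.5} | 529 | 3277 | -0.04 | 211 | 1783 | -60.1 | 526 | 585 | 2014 | 9.22 | -0.47 |
| {0.5, 0.9} | 941 | 3765 | -1.77 | 529 | 1790 | -44.1 | 920 | 1056 | 1983 | 10.1 | -4.60 |
| {0.2, 0.5, 0.8} | 841 | 3665 | -0.74 | 211 | 1783 | -75.1 | 839 | 944 | 2008 | 9.82 | -0.90 |
| {0.5, 0.7, 0.9} | 942 | 3573 | -1.55 | 529 | 1790 | -44.3 | 922 | 1057 | 2017 | 9.78 | -4.64 |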

Table 3: Comparison between Algorithm 1, the single-level models UCC-PTP (2) and WTA-
PTP (35), when considering the route duration limit.

As expected, we observe that the average gaps with respect to the bilevel solutions re-
turned by the heuristic are much tighter than the ones obtained by solving the UCC-PTP
formulation with the highest compensations. When instead the WTA-PTP model is solved,
we assume that the carriers can only either accept or reject the whole bundle of assigned
items. This reduced decision power of the carriers with respect to the bilevel setting leads
to an overestimation of the platform profit, as shown by the fact that the gap UB values are
always positive, while the absolute value of the gap LBrec is always higher than the one of

the gap LB returned by the heuristic. Indeed, although the UB gaps returned by the WTA-
PTP formulation are tight, the solutions that correspond to them are not bilevel feasible in
our setting, i.e., when the assignment is given to the carriers and they solve their own PTP
models, some of the items are not accepted and the true profit of the platform is lower than expected (with a gap w.r.t. LB∗ of up to -63%).
At this point, we move to discuss the results of the BPMD models. The results on
models (BPMD) and (BPMDd ), and their respective versions without z variables introduced in
Section 4.3, (BPMD-z) and (BPMDd -z), either with the constraints on the number of packages
or the duration of the route, are reported in Tables 4–5 and Tables 6–7, respectively. Tables 4
and 6 have two blocks of columns: one for model (BPMD) and the other one for (BPMD-z).
Similarly, Tables 5 and 7 have two blocks of columns: one for model (BPMDd ) and the other
one for (BPMDd -z). For each model, we report: LBh , the average lower bound returned by the
heuristic given in Algorithm 1; #opt, the number of instances solved to optimality; LB, the
average lower bound at termination; UB, the average upper bound at termination; gap (first gap column), the average percentage gap returned by Cplex at termination; gap (last column of each block), the average percentage gap between UB min (the minimum between UB and the upper bound returned by the solution of the WTA-PTP formulation (35)) and LB; time, the average computing time in seconds;
septime, the average time in seconds needed for the separation of both the value function
constraints, and either the subtour constraints or the route value constraint (when z variables
are projected out); #sep, the average number of separated integer solutions; #nodes, the
average number of nodes of the branch-and-cut tree at termination. Tables 4 and 5 consist
of two parts: the first part is associated with the 15 Chao instances, and the second part
with the 12 Solomon instances. Instead, Tables 6 and 7 report average results on the Chao
instances only, since they are the only ones with the information on the route duration. Each
row reports the margin set M considered.
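| | Heuristic | Model (BPMD) | | | | | | | | | Model (BPMD-z) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LBh | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) |
| Chao instances |
| {0.2, 0.5} | 1076 | 15 | 1076 | 1076 | 0.00 | 6.3 | 0.0 | 1 | 0 | 0.00 | 15 | 1076 | 1076 | 0.00 | 6.8 | 1.6 | 3 | 0 | 0.00 |
| {0.5, 0.9} | 1876 | 0 | 1892 | 1937 | 2.36 | 3600 | 1254 | 384 | 193346 | 2.47 | 0 | 1907 | 1937 | 1.60 | 3600 | 2541 | 4472 | 121306 | 1.60 |
| {0.2, 0.5, 0.8} | 1716 | 11 | 1719 | 1722 | 0.18 | 1477 | 723 | 309 | 36777 | 0.18 | 11 | 1719 | 1722 | 0.17 | 1337 | 1037 | 2337 | 19481 | 0.17 |
| {0.5, 0.7, 0.9} | 1881 | 0 | 1893 | 1937 | 2.33 | 3600 | 1310 | 430 | 170169 | 2.33 | 0 | 1911 | 1937 | 1.34 | 3600 | 2397 | 4844 | 143357 | 1.34 |
| Solomon instances |
| {0.2, 0.5} | 672 | 9 | 675 | 676 | 0.20 | 904 | 114 | 98 | 50971 | 0.20 | 9 | 676 | 676 | 0.08 | 986 | 827 | 1409 | 18437 | 0.08 |
| {0.5, 0.9} | 770 | 2 | 898 | 1003 | 9.04 | 3163 | 245 | 377 | 579250 | 8.76 | 0 | 875 | 1068 | 17.0 | 3600 | 980 | 6279 | 435079 | 12.0 |
| {0.2, 0.5, 0.8} | 790 | 5 | 945 | 1010 | 5.25 | 2584 | 145 | 235 | 571582 | 4.71 | 0 | 918 | 1059 | 12.4 | 3600 | 981 | 5947 | 434403 | 7.77 |
| {0.5, 0.7, 0.9} | 773 | 5 | 929 | 1015 | 6.92 | 2714 | 218 | 342 | 579739 | 6.75 | 0 | 915 | 1083 | 14.5 | 3600 | 948 | 5051 | 489491 | 8.86 |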

Table 4: Comparison between model (BPMD) and (BPMD-z).

These tables clearly show that, if the high margin is not so high, i.e., {0.2, 0.5}, then
the leader can assign all (or almost all) the margins to high, i.e., 0.5, and all the followers
accept their assignment, thus the branching tree is very small, as the heuristic finds an
optimal solution in most of the cases. Otherwise, more iterations will be performed, and
more computing time is required. In Tables 4–5, we can observe two different trends for the
Chao and the Solomon instances. For the Chao instances, the gap at termination returned by
the models without z is lower. The opposite is true for the Solomon instances. This is related

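| | Heuristic | Model (BPMDd) | | | | | | | | | Model (BPMDd-z) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LBh | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) |
| Chao instances |
| {0.2, 0.5} | 1076 | 15 | 1076 | 1076 | 0.00 | 7.2 | 0.00 | 1 | 0 | 0.00 | 15 | 1076 | 1076 | 0.00 | 7.8 | 1.5 | 2 | 0 | 0.00 |
| {0.5, 0.9} | 1876 | 0 | 1889 | 1937 | 2.69 | 3600 | 1822 | 420 | 292460 | 2.69 | 0 | 1900 | 1937 | 1.91 | 3600 | 2655 | 5148 | 113127 | 1.91 |
| {0.2, 0.5, 0.8} | 1716 | 9 | 1718 | 1722 | 0.27 | 1446 | 893 | 176 | 87529 | 0.27 | 10 | 1718 | 1722 | 0.23 | 1320 | 1161 | 1544 | 25102 | 0.23 |
| {0.5, 0.7, 0.9} | 1881 | 0 | 1891 | 1937 | 2.50 | 3600 | 1606 | 336 | 243992 | 2.50 | 0 | 1908 | 1937 | 1.52 | 3600 | 2598 | 5286 | 146093 | 1.52 |
| Solomon instances |
| {0.2, 0.5} | 672 | 9 | 675 | 676 | 0.20 | 904 | 132 | 94 | 81940 | 0.20 | 9 | 675 | 676 | 0.10 | 944 | 807 | 1733 | 15618 | 0.10 |
| {0.5, 0.9} | 770 | 1 | 867 | 1015 | 13.3 | 3495 | 347 | 417 | 809036 | 12.2 | 0 | 842 | 1078 | 22.0 | 3600 | 1297 | 9379 | 305963 | 16.1 |
| {0.2, 0.5, 0.8} | 790 | 1 | 929 | 1021 | 8.24 | 3405 | 300 | 418 | 655635 | 6.79 | 0 | 873 | 1072 | 18.5 | 3600 | 1211 | 8422 | 415383 | 12.8 |
| {0.5, 0.7, 0.9} | 773 | 0 | 907 | 1031 | 11.0 | 3600 | 517 | 435 | 628041 | 9.5 | 0 | 884 | 1094 | 19.2 | 3600 | 1305 | 8771 | 366167 | 12.4 |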

Table 5: Comparison between model (BPMDd ) and (BPMDd -z).


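| | Heuristic | Model (BPMD) | | | | | | | | | Model (BPMD-z) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LBh | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) |
| {0.2, 0.5} | 529 | 7 | 529 | 598 | 8.76 | 2402 | 305 | 34 | 615771 | 7.29 | 2 | 529 | 756 | 26.2 | 3141 | 2501 | 8893 | 61311 | 7.29 |
| {0.5, 0.9} | 941 | 6 | 954 | 1077 | 8.81 | 2465 | 141 | 44 | 530289 | 7.46 | 2 | 951 | 1367 | 27.0 | 3281 | 2181 | 11178 | 93651 | 7.89 |
| {0.2, 0.5, 0.8} | 841 | 6 | 848 | 958 | 8.80 | 2480 | 143 | 43 | 477236 | 7.47 | 2 | 841 | 1222 | 28.2 | 3179 | 2062 | 9675 | 82686 | 8.14 |
| {0.5, 0.7, 0.9} | 942 | 5 | 949 | 1089 | 9.85 | 2734 | 331 | 47 | 465270 | 7.96 | 3 | 948 | 1363 | 26.4 | 3216 | 2037 | 10116 | 70487 | 8.01 |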

Table 6: Comparison between model (BPMD) and (BPMD-z) when considering the route
duration limit.
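| | Heuristic | Model (BPMDd) | | | | | | | | | Model (BPMDd-z) | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | LBh | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) | #opt | LB | UB | gap | time | septime | #sep | #nodes | gap (UB min) |
| {0.2, 0.5} | 529 | 7 | 529 | 612 | 10.3 | 2564 | 395 | 71 | 608464 | 7.26 | 0 | 529 | 805 | 34.0 | 3600 | 2206 | 12492 | 68683 | 7.29 |
| {0.5, 0.9} | 941 | 6 | 949 | 1099 | 10.9 | 2760 | 406 | 74 | 722421 | 8.07 | 0 | 946 | 1449 | 34.5 | 3600 | 2279 | 12192 | 75795 | 8.71 |
| {0.2, 0.5, 0.8} | 839 | 5 | 849 | 991 | 11.8 | 3039 | 472 | 76 | 778847 | 7.57 | 0 | 844 | 1313 | 36.5 | 3600 | 2433 | 11663 | 62590 | 7.97 |
| {0.5, 0.7, 0.9} | 942 | 3 | 950 | 1129 | 14.4 | 3063 | 406 | 69 | 787943 | 8.10 | 0 | 946 | 1479 | 36.9 | 3600 | 2415 | 12466 | 79820 | 8.69 |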

Table 7: Comparison between model (BPMDd ) and (BPMDd -z) when considering the route
duration limit.

to the fact that in the Solomon instances the customers are more randomly distributed and
further apart from each other, thus having information on the z variables at the master level
helps in the resolution. This information on the routing decisions is also useful for solving the
Chao instances when the limit on the route duration is imposed, as it is clear from Tables 6–
7. Indeed, in models having z variables in the upper level, constraint (14) can be directly
imposed at the upper level and does not impact the separation procedure, as it happens
instead in the case of constraint (15).
The values of the second gap (the last column of each block) are calculated ex-post, after taking into consideration both the upper bound coming from the branch-and-cut and the one coming from the solution of the WTA-PTP formulation. While the gap returned by Cplex can be used to compare the performances of the LP relaxations of the different formulations considered, the second gap, which is on average the smaller of the two, provides us with information regarding the quality of the best solution found by the different methods. The values of the second gap follow the same trends as those of the Cplex gap for the Chao and the Solomon instances.
We can further notice that, in general, the number of branch-and-cut nodes required by
the models without variables z is smaller than the one required by the models with z, even
if the percentage of computing time that is required for the separation procedure is higher.
Indeed, for these models, at each node, we need to separate not only the value function cuts

and the subtour elimination constraints (5c), but also constraints (11e), which involve the
solution of a TSP.

Figure 2: Cumulative chart of the number of Chao instances solved within a given gap at
termination.

Figure 3: Cumulative chart of the number of Solomon instances solved within a given gap
at termination.

We further provide two summary charts in Figures 2 and 3 related to the performance of
the four models on the Chao instances and Solomon instances, respectively. They report the
number of instances (on the vertical axis) for which the gap at termination is smaller than
or equal to the value reported on the horizontal axis. They confirm what is shown in the
summary tables presented above, since, in Figure 2 the curves associated with the models
without z are higher than the curves associated with the models with z; the opposite is true

in Figure 3. In addition, we can notice that the disaggregated models (BPMDd) and (BPMDd-z)
are overall performing worse than models (BPMD) and (BPMD-z), respectively. This might
be due to the fact that the disaggregated model has more variables in the lower level. Indeed,
on the one hand, these variables are also part of the single-level reformulation (due to the
value-function approach), and on the other hand, they slow down the separation procedure
which involves the solution of the lower level.

(a) Box plots obtained by aggregating the Chao instances with the same number of
customers.

(b) Box plots obtained by aggregating the Chao instances with the same number of
vehicles.

Figure 4: Box plots representing the distribution of the gap at termination of Chao instances.

8.3 The gain related to margin decisions


To understand the structure of the solutions and the gain related to margin decisions, we
choose two illustrative instances, namely the Solomon instances R20 2 and R20 3, both of

(a) Box plots obtained by aggregating the Solomon instances with the same number
of customers.

(b) Box plots obtained by aggregating the Solomon instances with the same number
of vehicles.

Figure 5: Box plots representing the distribution of the gap at termination of Solomon
instances.

which could be solved to optimality by model (22) for all the tested margin values, even
without the warm-start solution provided by the heuristic.
Table 8 has two blocks of columns: one is devoted to the results on the two considered
instances when margin decisions are considered, the other to the results when the margins
are fixed to different values of m ∈ M . For the BPMD, for every tested set of margins M ,
we report the profit of the platform, the percentage of items served with high, medium, and
low margins respectively, the percentage of served customers, and the computational time.
For the BPFM, in every row we report the profit of the platform and the computational time
when considering all margins fixed to either low, or high or, if |M | = 3, medium, as well as
when considering different random margins chosen within M for every item.
From the first block of the table, it is evident that the higher the considered margins are,

the more diverse the solutions are, in terms of the margins applied in optimal solutions, and
the more difficult it is to solve them. Furthermore, for the leader, it seems more convenient
to consider i) higher margins, and ii) a higher number of margins. When comparing the
first and the second blocks of columns of the tables, we can clearly notice the added value
in terms of the platform’s profit of making margin decisions as compared to operating under
fixed margins.
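| | BPMD | | | | | | BPFM low | | BPFM medium | | BPFM high | | BPFM random | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M | profit | %high | %medium | %low | %served | time | profit | time | profit | time | profit | time | profit | time |
| R20 2 |
| {0.2, 0.5} | 487.5 | 100.0 | - | 0.0 | 100.0 | 29.9 | 195.0 | 90.4 | - | - | 487.5 | 12.6 | 315.3 | 28.7 |
| {0.5, 0.9} | 675.7 | 41.2 | - | 58.8 | 89.5 | 805.2 | 487.5 | 12.6 | - | - | 0.0 | 0.23 | 527.9 | 980.7 |
| {0.2, 0.5, 0.8} | 691.8 | 68.8 | 31.2 | 0.0 | 84.2 | 399.3 | 195.0 | 90.4 | 487.5 | 12.6 | 0.0 | 0.30 | 514.1 | 1462 |
| {0.5, 0.7, 0.9} | 731.8 | 33.3 | 44.4 | 22.2 | 90.0 | 500 | 487.5 | 12.6 | 692.3 | 59.6 | 0.0 | 54.9 | 612.5 | 436.1 |
| R20 3 |
| {0.2, 0.5} | 487.5 | 100.0 | - | 0.0 | 100.0 | 10.4 | 195.0 | 6.9 | - | - | 487.5 | 36.1 | 351.6 | 30.9 |
| {0.5, 0.9} | 661.3 | 41.2 | - | 58.8 | 89.5 | 1495 | 487.5 | 36.1 | - | - | 0.0 | 0.28 | 384 | 195.6 |
| {0.2, 0.5, 0.8} | 675.0 | 62.5 | 37.5 | 0.0 | 84.2 | 1649 | 195.0 | 6.9 | 487.5 | 36.1 | 0.0 | 0.28 | 434.0 | 552.6 |
| {0.5, 0.7, 0.9} | 705.6 | 22.2 | 50.0 | 27.8 | 90.0 | 42.4 | 487.5 | 36.1 | 674.8 | 98.7 | 0.0 | 122.5 | 612.2 | 13.5 |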

Table 8: Structure of optimal solutions of models (BPMD) and (BPFM) on Solomon instances
R20 2 and R20 3.
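
As a rough illustration of how the gain from margin decisions can be read off Table 8, the following sketch compares each BPMD profit with the best BPFM profit reported in the same row. The profit values are copied from the table; the dictionary layout and the function name `relative_gain` are illustrative choices of ours, not part of the paper.

```python
# (instance, margin set) -> (BPMD profit, BPFM profits reported in Table 8).
# BPFM profits are listed in the table's column order: low, [medium], high, random.
TABLE_8 = {
    ("R20 2", (0.2, 0.5)): (487.5, [195.0, 487.5, 315.3]),
    ("R20 2", (0.5, 0.9)): (675.7, [487.5, 0.0, 527.9]),
    ("R20 2", (0.2, 0.5, 0.8)): (691.8, [195.0, 487.5, 0.0, 514.1]),
    ("R20 2", (0.5, 0.7, 0.9)): (731.8, [487.5, 692.3, 0.0, 612.5]),
    ("R20 3", (0.2, 0.5)): (487.5, [195.0, 487.5, 351.6]),
    ("R20 3", (0.5, 0.9)): (661.3, [487.5, 0.0, 384.0]),
    ("R20 3", (0.2, 0.5, 0.8)): (675.0, [195.0, 487.5, 0.0, 434.0]),
    ("R20 3", (0.5, 0.7, 0.9)): (705.6, [487.5, 674.8, 0.0, 612.2]),
}


def relative_gain(bpmd_profit, bpfm_profits):
    """Relative improvement of BPMD over the best fixed-margin (BPFM) profit."""
    best_fixed = max(bpfm_profits)
    return (bpmd_profit - best_fixed) / best_fixed


gains = {key: relative_gain(*value) for key, value in TABLE_8.items()}

for (instance, margins), gain in gains.items():
    print(f"{instance} {margins}: {100 * gain:.1f}% over best fixed margin")
```

On these two instances, the BPMD profit is never below the best fixed-margin profit, and the two coincide for M = {0.2, 0.5}, where all items are served with the high margin.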

9 Conclusions
The last-mile delivery field is undergoing an unprecedented transformation in its operational
procedures, primarily driven by the surge in e-commerce. This shift has had significant con-
sequences on the way business is conducted. First, the volume of deliveries has increased
substantially: as customers opt for online ordering over in-store purchases, their orders must
be efficiently dispatched for delivery. Second, e-buyers are more and more demanding in
terms of delivery times. Consequently, the demand for delivery services has become unpre-
dictable and volatile, while opportunities for consolidation are reduced. To address these
challenges, companies in the field have started developing new delivery strategies. One such
strategy, which is gaining significant traction, is peer-to-peer delivery. In this model,
companies (referred to as platforms in this paper) receive delivery requests from customers
and match them with independent carriers available to perform the deliveries. Unlike in the
traditional setting, carriers in peer-to-peer delivery do not work directly for the company
but have their own objectives, which might not always align with those of the company.
Therefore, the challenge for the company lies in maximizing the profit from the delivery
operations, taking into account carriers’ objectives and behavior.
In this paper, we study a bilevel compensation and routing problem arising in the context
of peer-to-peer delivery. The problem combines the peer-to-peer logistics platform’s
decisions about the assignment of items to the carriers and the compensation for each delivered
item. The objective of the platform is to maximize the profit generated from the delivered
items, while factoring in carriers’ individual objectives when making assignment and
compensation decisions. The bilevel nature of this problem is highlighted by presenting
two single-level formulations that lead to either an overestimation or an underestimation
of the platform’s profit. After considering the fixed compensation setting, we propose two
bilevel formulations for the compensation and routing problem, one with aggregated variables
and the other with disaggregated variables. These formulations are then reformulated into
single-level models, which are compared in terms of the quality of their linear relaxations.
Additionally, we present equivalent formulations where routing variables are projected out.
Computational tests show that the performance of the formulations depends on the features
of the instances, i.e., compensation values and customers’ geography. While on average the
disaggregated models perform worse than the aggregated ones, projecting out the
routing variables only helps for one type of instance. The numerical results confirm that
solving the single-level formulations (with reduced or increased power of the carriers,
respectively) leads to biased estimates of the platform’s true profit. Furthermore, the analysis
of the structure of the solutions reveals that i) including decisions on the margins results in
better profits for the platform, and ii) the platform may benefit from offering higher
compensations to carriers, resulting in a higher number of accepted offers.
Besides some natural extensions of the problem, such as considering multiple depots for
the carriers, multiple vehicles in each subproblem, or introducing a penalty for the undelivered
items, future research may explore the introduction of stochasticity into the problem setting,
especially with regard to carriers’ behavior. The challenge would be to model the uncertainty,
on the one hand, and to adapt the methodologies proposed in this paper to deal with it, on the
other, or potentially to propose ad hoc models and approaches.
Acknowledgements: The research of E. Fernández has been partially supported by
the Spanish Ministerio de Ciencia y Tecnología and European Regional Development Funds
(ERDF) through project MTM2019-105824GB-I00. The research of C. Archetti, M. Cerulli,
and I. Ljubić was partially funded by CY Initiative of Excellence, France (grant
“Investissements d’Avenir ANR-16-IDEX-0008”). This support is gratefully acknowledged.

References
N. Agatz, A. Erera, M. Savelsbergh, and X. Wang. Optimization for dynamic ride-sharing:
A review. European Journal of Operational Research, 223(2):295–303, 2012. doi: 10.1016/
j.ejor.2012.05.028.

A. Alnaggar, F. Gzara, and J. H. Bookbinder. Crowdsourced delivery: A review of platforms
and academic literature. Omega, 98:102139, 2021. doi: 10.1016/j.omega.2019.102139.

C. Archetti and L. Bertazzi. Recent challenges in routing and inventory routing: E-commerce
and last-mile delivery. Networks, 77(2):255–268, 2021. doi: 10.1002/net.21995.

A. M. Arslan, N. Agatz, L. Kroon, and R. Zuidwijk. Crowdsourced delivery—a dynamic
pickup and delivery problem with ad hoc drivers. Transportation Science, 53(1):222–235,
2019. doi: 10.1287/trsc.2017.0803.

R. Ausseil, J. A. Pazour, and M. W. Ulmer. Supplier menus for dynamic matching in
peer-to-peer transportation platforms. Transportation Science, 56(5):1304–1326, 2022. doi:
10.1287/trsc.2022.1133.

M. Barbosa, J. P. Pedroso, and A. Viana. A data-driven compensation scheme for last-mile
delivery with crowdsourcing. Computers & Operations Research, 150:106059, 2023. doi:
10.1016/j.cor.2022.106059.

H. Calvete, C. Galé, and M.-J. Oliveros. Bilevel model for production-distribution planning
solved by using ant colony optimization. Computers & Operations Research, 38:320–327,
2011. doi: 10.1016/j.cor.2010.05.007.

J.-F. Camacho-Vallejo, L. López-Vera, A. Smith, and J.-L. González-Velarde. A tabu search
algorithm to solve a green logistics bi-objective bi-level problem. Annals of Operations
Research, 2021. doi: 10.1007/s10479-021-04195-w.

W. Candler and R. Norton. Multi-level programming and development policy. Technical
report, The World Bank Development Research Center, Washington D.C., 1977.

M. Cerulli. Bilevel optimization and applications. PhD thesis, Institut Polytechnique de
Paris, 2021. URL http://www.theses.fr/2021IPPAX108.

I.-M. Chao, B. L. Golden, and E. A. Wasil. The team orienteering problem. European Journal
of Operational Research, 88(3):464–474, 1996. doi: 10.1016/0377-2217(94)00289-4.

C. Cleophas, C. Cottrill, J. F. Ehmke, and K. Tierney. Collaborative urban transportation:
Recent advances in theory and practice. European Journal of Operational Research, 273
(3):801–816, 2019. doi: 10.1016/j.ejor.2018.04.037.

B. Colson, P. Marcotte, and G. Savard. An overview of bilevel optimization. Annals of
Operations Research, 153:235–256, 2007. doi: 10.1007/s10479-007-0176-2.

S. Dempe. Foundations of bilevel programming. Springer Science & Business Media, 2002.
doi: 10.1007/b101970.

J. Du, X. Li, L. Yu, R. Dan, and J. Zhou. Multi-depot vehicle routing problem for hazardous
materials transportation: A fuzzy bilevel programming. Information Sciences, 399:201–218,
2017. doi: 10.1016/j.ins.2017.02.011.

D. Feillet, P. Dejax, and M. Gendreau. Traveling salesman problems with profits.
Transportation Science, 39(2):188–205, 2005. doi: 10.1287/trsc.1030.0079.

M. Fischetti, J. J. S. González, and P. Toth. Solving the orienteering problem through
branch-and-cut. INFORMS Journal on Computing, 10(2):133–148, 1998. doi: 10.1287/
ijoc.10.2.133.

R. Fortet. Applications de l’algebre de boole en recherche opérationelle. Revue Française de
Recherche Opérationelle, 4(14):17–26, 1960.

K. Gdowska, A. Viana, and J. P. Pedroso. Stochastic last-mile delivery with crowdshipping.
Transportation Research Procedia, 30:90–100, 2018. doi: 10.1016/j.trpro.2018.09.011.
EURO Mini Conference on “Advances in Freight Transportation and Logistics”.

S. D. Handoko, L. H. Chuin, A. Gupta, O. Y. Soon, H. C. Kim, and T. P. Siew. Solving
multi-vehicle profitable tour problem via knowledge adoption in evolutionary bi-level
programming. In 2015 IEEE Congress on Evolutionary Computation (CEC), pages 2713–2720.
IEEE, 2015. doi: 10.1109/CEC.2015.7257225.

H. Hong, X. Li, D. He, Y. Zhang, and M. Wang. Crowdsourcing incentives for multi-hop
urban parcel delivery network. IEEE Access, 7:26268–26277, 2019. doi: 10.1109/ACCESS.
2019.2896912.

H. Horner, J. Pazour, and J. E. Mitchell. Optimizing driver menus under stochastic selec-
tion behavior for ridesharing and crowdsourced delivery. Transportation Research Part E:
Logistics and Transportation Review, 153:102419, 2021. doi: 10.1016/j.tre.2021.102419.

IBM. ILOG CPLEX 12.7 User’s Manual. IBM, 2017.

N. Kafle, B. Zou, and J. Lin. Design and modeling of a crowdsource-enabled system for urban
parcel relay and delivery. Transportation Research Part B: Methodological, 99:62–82, 2017.
doi: 10.1016/j.trb.2016.12.022.

T. Kleinert, M. Labbé, I. Ljubić, and M. Schmidt. A survey on mixed-integer programming
techniques in bilevel optimization. EURO Journal on Computational Optimization, 9:
100007, 2021. doi: 10.1016/j.ejco.2021.100007.

T. V. Le, A. Stathopoulos, T. Van Woensel, and S. V. Ukkusuri. Supply, demand,
operations, and management of crowd-shipping services: A review and empirical
evidence. Transportation Research Part C: Emerging Technologies, 103:83–103, 2019. doi:
10.1016/j.trc.2019.03.023.

Y. Marinakis and M. Marinaki. A bilevel genetic algorithm for a real life location routing
problem. International Journal of Logistics Research and Applications, 11(1):49–65, 2008.
doi: 10.1080/13675560701410144.

Y. Marinakis, A. Migdalas, and P. Pardalos. A new bilevel formulation for the Vehicle
Routing Problem and a solution method using a genetic algorithm. Journal of Global
Optimization, 38:555–580, 2007. doi: 10.1007/s10898-006-9094-0.

N. Masoud and R. Jayakrishnan. A decomposition algorithm to solve the multi-hop
peer-to-peer ride-matching problem. Transportation Research Part B: Methodological, 99:1–29,
2017. doi: 10.1016/j.trb.2017.01.004.

S. S. Mofidi and J. A. Pazour. When is it beneficial to provide freelance suppliers with choice?
a hierarchical approach for peer-to-peer logistics platforms. Transportation Research Part
B: Methodological, 126:1–23, 2019. doi: 10.1016/j.trb.2019.05.008.

A. Nikolakopoulos. A metaheuristic reconstruction algorithm for solving bi-level vehicle
routing problems with backhauls for army rapid fielding. In V. Zeimpekis, G. Kaimakamis,
and N. Daras, editors, Military Logistics, Operations Research/Computer Science Interfaces
Series, pages 141–157. Springer Cham, 2015. doi: 10.1007/978-3-319-12075-1_8.

Y. Ning and T. Su. A multilevel approach for modelling vehicle routing problem with un-
certain travelling time. Journal of Intelligent Manufacturing, 28(3):683–688, 2017. doi:
10.1007/s10845-014-0979-3.

S. P. Parvasi, R. Tavakkoli-Moghaddam, A. Taleizadeh, and M. Soveizy. A bi-level
bi-objective mathematical model for stop location in a school bus routing problem.
IFAC-PapersOnLine, 52:1120–1125, 2019. doi: 10.1016/j.ifacol.2019.11.346.

A. Punel and A. Stathopoulos. Modeling the acceptability of crowdsourced goods deliveries:
Role of context and experience effects. Transportation Research Part E: Logistics and
Transportation Review, 105(C):18–38, 2017. doi: 10.1016/j.tre.2017.06.007.

M. J. Santos, E. Curcio, P. Amorim, M. Carvalho, and A. Marques. A bilevel approach for
the collaborative transportation planning problem. International Journal of Production
Economics, 233, 2021. doi: 10.1016/j.ijpe.2020.108004.

M. M. Solomon. Algorithms for the vehicle routing and scheduling problems with time window
constraints. Operations Research, 35(2):254–265, 1987. doi: 10.1287/opre.35.2.254.

S. Tahernejad and T. K. Ralphs. Valid inequalities for mixed integer bilevel linear optimiza-
tion problems. Technical report, COR@L Technical Report 20T-013, 2020.

C. Wang, Z. Peng, and X. Xu. A bi-level programming approach to the location-routing
problem with cargo splitting under low-carbon policies. Mathematics, 9:2325, 2021. doi:
10.3390/math9182325.

H. Wang and H. Yang. Ridesourcing systems: A framework and review. Transportation
Research Part B: Methodological, 129:122–155, 2019. doi: 10.1016/j.trb.2019.07.009.

X. Wang, N. Agatz, and A. Erera. Stable matching for dynamic ride-sharing systems.
Transportation Science, 52(4):850–867, 2018. doi: 10.1287/trsc.2017.0768.

A Brief introduction to bilevel optimization
A general bilevel problem (Candler and Norton, 1977; Colson et al., 2007; Dempe, 2002;
Cerulli, 2021; Kleinert et al., 2021) is defined as a nested optimization problem where a
subset of variables is constrained to be optimal for another optimization problem (the so-
called lower-level problem, in contrast to the upper-level problem, which is how the outer
problem is usually referred to), parametrized w.r.t. the remaining variables. It arises anytime
there is a hierarchical relationship between two autonomous, and possibly conflictual, decision
makers. In mathematical terms, a bilevel problem can be written as follows:
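\[
\begin{aligned}
\text{``}\max_{x \in X}\text{''} \quad & F(x, y) \\
\text{s.t.} \quad & G(x, y) \le 0 \\
& y \in \arg\max_{y' \in Y} \{ f(x, y') \mid g(x, y') \le 0 \}
\end{aligned}
\tag{BP}
\]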

(BP) can be seen as a Stackelberg game, where two players (a leader and a follower) make
their decisions following a hierarchical order. Firstly, the leader makes his/her choice and
communicates it to the follower, who will select the best response taking into account the
choice of the leader. Thus, the leader’s task is to determine the optimal decision x, while
anticipating the follower’s optimal response y. The quotation marks around “max” indicate
that the problem is not well-posed if multiple optimal responses y exist for a single
upper-level decision x. In
this case, one may distinguish between optimistic and pessimistic settings. In the optimistic
bilevel optimization, the leader assumes that the lower-level problem will be solved in such a
way as to be as beneficial as possible to the upper-level problem, i.e., maximizing its objective
function. On the contrary, the pessimistic approach consists in assuming that the follower
will select the optimal solution y which corresponds to the worst possible outcome for the
leader, in terms of the upper-level objective function.
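
To make the distinction between the optimistic and pessimistic settings concrete, the following minimal sketch enumerates a tiny bilevel problem with finite sets X and Y. The instance (the functions `F`, `f`, `g` and the sets) is a hypothetical toy of ours, not taken from the paper, and for simplicity it has no upper-level coupling constraints G.

```python
def solve_bilevel(X, Y, F, f, g, optimistic=True):
    """Solve a finite bilevel problem (both levels maximize) by enumeration.

    For each leader decision x, collect the follower's optimal responses and
    break ties in favor of the leader (optimistic) or against it (pessimistic).
    """
    pick = max if optimistic else min
    best = None  # (x, y, leader value)
    for x in X:
        feasible = [y for y in Y if g(x, y) <= 0]
        if not feasible:
            continue  # the follower has no feasible response to x
        phi = max(f(x, y) for y in feasible)  # follower's optimal value
        responses = [y for y in feasible if f(x, y) == phi]
        y = pick(responses, key=lambda r: F(x, r))
        if best is None or F(x, y) > best[2]:
            best = (x, y, F(x, y))
    return best


# Hypothetical toy instance: the follower is indifferent when x == 1.
X = [0, 1, 2]
Y = [0, 1]
F = lambda x, y: 3 * y - x    # leader's objective
f = lambda x, y: y * (x - 1)  # follower's objective
g = lambda x, y: 0            # no lower-level constraints in this toy

optimistic_solution = solve_bilevel(X, Y, F, f, g, optimistic=True)
pessimistic_solution = solve_bilevel(X, Y, F, f, g, optimistic=False)
print(optimistic_solution)   # (1, 1, 2)
print(pessimistic_solution)  # (2, 1, 1)
```

In this toy, x = 1 leaves the follower indifferent between y = 0 and y = 1, so the optimistic leader chooses x = 1 and counts on y = 1, whereas the pessimistic leader prefers x = 2, for which the follower's response is unique.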
One possible way to deal with bilevel programs is through the so-called value function
reformulation. Let
\[
\varphi(x) = \max_{y' \in Y} \{ f(x, y') \mid g(x, y') \le 0 \}
\]
denote the value function of the lower level, which gives the optimal value of the follower’s
problem for any upper-level decision x. This approach consists in formulating the optimistic
bilevel problem as
\[
\begin{aligned}
\max_{x, y} \quad & F(x, y) \\
\text{s.t.} \quad & G(x, y) \le 0 \\
& g(x, y) \le 0 \\
& f(x, y) \ge \varphi(x) \\
& x \in X, \; y \in Y.
\end{aligned}
\]
In this formulation, the lower-level constraints are lifted to the upper level, and the objective
function of the lower-level problem is bounded from below by its value function.
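
As a small sanity check of the value function reformulation, the sketch below enumerates all pairs (x, y) of a finite toy problem that satisfy G(x, y) ≤ 0, g(x, y) ≤ 0 and f(x, y) ≥ φ(x), and maximizes F over them. The instance is a hypothetical example of ours (not from the paper), chosen so that the follower is indifferent for one leader decision.

```python
from itertools import product

# Hypothetical toy instance (both levels maximize).
X = [0, 1, 2]
Y = [0, 1]


def F(x, y):
    return 3 * y - x      # leader's objective


def f(x, y):
    return y * (x - 1)    # follower's objective


def G(x, y):
    return 0              # no upper-level constraints in this toy


def g(x, y):
    return 0              # no lower-level constraints in this toy


def phi(x):
    """Value function: the follower's optimal objective value for a given x."""
    return max(f(x, y) for y in Y if g(x, y) <= 0)


def solve_value_function_reformulation():
    """Maximize F over pairs (x, y) that are feasible and follower-optimal."""
    feasible_pairs = [
        (x, y)
        for x, y in product(X, Y)
        if G(x, y) <= 0 and g(x, y) <= 0 and f(x, y) >= phi(x)
    ]
    return max(feasible_pairs, key=lambda pair: F(*pair))


x_opt, y_opt = solve_value_function_reformulation()
print(x_opt, y_opt, F(x_opt, y_opt))  # 1 1 2
```

Because the single-level problem maximizes F over all follower-optimal responses, ties in the lower level are resolved in the leader's favor, which is why this reformulation corresponds to the optimistic setting.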

B Formulations of (BPMD-z) and (BPMDd-z)


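The formulation (BPMD-z), obtained by projecting out the z variables in (BPMD), is:

\[
\text{(BPMD-z)} \quad
\left\{
\begin{aligned}
\max_{X, y, w, z} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} w^k_{mi} \\
\text{s.t.} \quad & \text{(17b), (19a)--(19b), (17d)--(17e)} \\
& \sum_{i \in V} \sum_{m \in M} X^k_{mi} \le b^k && \forall\, k \in K \\
& y^k_i \le \sum_{m \in M} X^k_{mi} && \forall\, i \in V,\ k \in K \\
& \theta^k \ge c^k_{\mathrm{TSP}}(V(T)) \Big( \sum_{i \in V(T)} y^k_i - |V(T)| + 1 \Big) && \forall\, T \in \mathcal{T},\ k \in K \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} w^k_{mi} - \theta^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} X^k_{mi} - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K \\
& X^k_m, w^k_m \in \{0, 1\}^n && \forall\, m \in M,\ k \in K \\
& y^k \in \{0, 1\}^{n+1},\ \theta^k \in \mathbb{R} && \forall\, k \in K.
\end{aligned}
\right.
\]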

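The formulation (BPMDd-z), obtained by projecting out the z variables in (BPMDd), is:

\[
\text{(BPMDd-z)} \quad
\left\{
\begin{aligned}
\max_{X, Y, z} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} Y^k_{mi} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} X^k_{mi} \le 1 && \forall\, i \in V \\
& \sum_{i \in V} \sum_{m \in M} X^k_{mi} \le b^k && \forall\, k \in K \\
& Y^k_{mi} \le X^k_{mi} && \forall\, i \in V,\ m \in M,\ k \in K \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}_{mi} Y^k_{mi} - \theta^k \ge \sum_{i \in V(T)} \sum_{m \in M} \bar{p}_{mi} \tilde{X}^k_{mi} - C^k(T) && \forall\, T \in \mathcal{T},\ k \in K \\
& \theta^k \ge c^k_{\mathrm{TSP}}(V(T)) \Big( \sum_{i \in V(T)} \sum_{m \in M} Y^k_{mi} - |V(T)| + 1 \Big) && \forall\, T \in \mathcal{T},\ k \in K \\
& X^k_m \in \{0, 1\}^n,\ Y^k_m \in \{0, 1\}^{n+1} && \forall\, m \in M,\ k \in K \\
& \theta^k \in \mathbb{R} && \forall\, k \in K.
\end{aligned}
\right.
\]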
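
The role of the lower bounds on θ^k in the two formulations above can be illustrated numerically. The sketch below evaluates the right-hand side c^k_TSP(V(T)) (Σ_{i∈V(T)} y^k_i − |V(T)| + 1) for a fixed customer subset, treating c^k_TSP(V(T)) simply as a given nonnegative constant. The subset, the cost value, and the function name `tsp_cut_rhs` are hypothetical choices for illustration only.

```python
def tsp_cut_rhs(c_tsp, subset, y):
    """Right-hand side c_TSP(V(T)) * (sum_{i in V(T)} y_i - |V(T)| + 1).

    `subset` plays the role of V(T); `y` maps each customer to 0 or 1.
    """
    selected = sum(y[i] for i in subset)
    return c_tsp * (selected - len(subset) + 1)


subset = [1, 2, 3]  # hypothetical V(T)
c_tsp = 12.0        # hypothetical value of c_TSP(V(T)), assumed nonnegative

all_selected = {1: 1, 2: 1, 3: 1, 4: 0}
one_missing = {1: 1, 2: 0, 3: 1, 4: 1}
two_missing = {1: 1, 2: 0, 3: 0, 4: 0}

print(tsp_cut_rhs(c_tsp, subset, all_selected))  # 12.0
print(tsp_cut_rhs(c_tsp, subset, one_missing))   # 0.0
print(tsp_cut_rhs(c_tsp, subset, two_missing))   # -12.0
```

Read this way, the inequality forces θ^k up to c^k_TSP(V(T)) only when every customer of V(T) is selected; as soon as one customer is missing, the right-hand side is at most zero.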

C WTA-PTP with margin decisions


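If we incorporate margin decisions into the WTA-PTP formulation (1), by replacing variable
$\alpha^k_i$ with $A^k_{mi}$, which is 1 if item i is served by carrier k with a margin of m for the platform,
we obtain the following formulation, which gives a valid upper bound for BPMD:

\[
\begin{aligned}
\max_{A, z} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} A^k_{mi} && && \text{(35a)} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} A^k_{mi} \le 1 && \forall\, i \in V && \text{(35b)} \\
& \sum_{i \in V} \sum_{m \in M} A^k_{mi} \le b^k && \forall\, k \in K && \text{(35c)} \\
& \sum_{i \in V} \sum_{m \in M} \bar{p}^k_{mi} A^k_{mi} - \sum_{(i,j) \in A} c^k_{ij} z^k_{ij} \ge 0 && \forall\, k \in K && \text{(35d)} \\
& \Big( \sum_{m \in M} A^k_m,\ z^k \Big) \text{ is a route} && \forall\, k \in K && \text{(35e)} \\
& A^k_m \in \{0, 1\}^{n+1},\ z^k \in \{0, 1\}^{|A|} && \forall\, m \in M,\ k \in K. && \text{(35f)}
\end{aligned}
\]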

D BPMD no-good cuts based formulation


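Given the binary variable $A^k_{mi}$ introduced in Appendix C, and letting $M^k$ denote any refused
offer, i.e., $(m, i) \in M^k \iff A^k_{mi} = 1$, we obtain the following formulation:

\[
\text{(BPMD )} \quad
\left\{
\begin{aligned}
\max_{A} \quad & \sum_{k \in K} \sum_{i \in V} \sum_{m \in M} p_{mi} A^k_{mi} && && \text{(36a)} \\
\text{s.t.} \quad & \sum_{k \in K} \sum_{m \in M} A^k_{mi} \le 1 && \forall\, i \in V && \text{(36b)} \\
& \sum_{i \in V} \sum_{m \in M} A^k_{mi} \le b^k && \forall\, k \in K && \text{(36c)} \\
& \sum_{(m,i) \in M^k} (1 - A^k_{mi}) + \sum_{(m,i) \notin M^k} A^k_{mi} \ge 1 && \forall\, k \in K,\ M^k \subset M \times V && \text{(36d)} \\
& A^k_m \in \{0, 1\}^{n+1} && \forall\, m \in M,\ k \in K. && \text{(36e)}
\end{aligned}
\right.
\]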

We recall that, in order to determine the sets Mk for which constraints (36d) must be
imposed, an optimization problem (i.e., carrier’s PTP) needs to be solved.
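
The effect of a single no-good cut of type (36d) can be checked by brute force. The sketch below, which uses a hypothetical index set of two margins and two items, computes the left-hand side of the cut for every binary assignment and confirms that it equals the Hamming distance to the refused offer, so that only the refused offer itself violates the cut. The names `no_good_lhs`, `margins`, `items` and `refused_offer` are illustrative and not part of the paper.

```python
from itertools import product


def no_good_lhs(refused_offer, assignment):
    """Left-hand side of (36d) for one carrier.

    `refused_offer` is the set of (m, i) pairs with A_mi = 1 in the refused
    offer; `assignment` maps every (m, i) pair to 0 or 1.
    """
    return sum(
        (1 - value) if pair in refused_offer else value
        for pair, value in assignment.items()
    )


margins = [0.5, 0.9]  # hypothetical margin set M
items = [1, 2]        # hypothetical item set V
pairs = list(product(margins, items))
refused_offer = {(0.5, 1), (0.9, 2)}

# Enumerate all binary assignments and keep those violating the cut (lhs < 1).
violating = []
for values in product([0, 1], repeat=len(pairs)):
    assignment = dict(zip(pairs, values))
    if no_good_lhs(refused_offer, assignment) < 1:
        violating.append({pair for pair, v in assignment.items() if v == 1})

print(violating == [refused_offer])  # True
```

Since each cut removes exactly one offer for one carrier, many such cuts may be needed in general, which is consistent with the need to identify the sets M^k by solving the carrier's PTP.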
