
1104 IEEE TRANSACTIONS ON COGNITIVE COMMUNICATIONS AND NETWORKING, VOL. 10, NO. 3, JUNE 2024

Integrating Convex Optimization and Deep Learning for Downlink Resource Allocation in LEO Satellites Networks

Xiufeng Sui, Ziqi Jiang, Yifeng Lyu, Rongfei Fan, Member, IEEE, Han Hu, Member, IEEE, and Zhi Liu, Senior Member, IEEE

Abstract: This paper investigates the satellite communication network (SCN) and jointly optimizes the channel allocation of the fixed ground base station and the transmission power allocation of the low Earth orbit (LEO) satellites while considering the freshness of the information. We present a mathematical model for the problem and formulate it as a mixed-integer programming (MIP) problem, which is NP-hard. To tackle this challenge, we propose a two-step approach that decomposes the problem into a channel allocation problem and a power allocation problem. For the power allocation problem, we propose a convex optimization algorithm termed Opt. For the channel allocation problem, we introduce two learning-based schemes, Ptr and DNN-Ptr. Combining these two steps, we develop two novel algorithms, i.e., Opt-Ptr and Opt-DNN-Ptr. In particular, the Opt-Ptr algorithm devises a novel Pointer Network to obtain the channel allocation decision and then solves the remaining power allocation problem using convex optimization algorithms. To further improve the performance, the Opt-DNN-Ptr algorithm utilizes a DNN to predict a transmission power allocation, which is then combined with the channel allocation decision obtained from the pointer network to solve the remaining power allocation problem. The simulation results verify the superiority of the proposed algorithm.

Index Terms: Satellite communication, resource allocation, deep neural network, pointer network.

Manuscript received 11 June 2023; revised 1 November 2023; accepted 21 January 2024. Date of publication 1 February 2024; date of current version 7 June 2024. This work was supported in part by the National Natural Science Foundation of China under Grant 61971457 and Grant U23A20275. The associate editor coordinating the review of this article and approving it for publication was Z. Xiao. (Corresponding author: Han Hu.)
Xiufeng Sui, Ziqi Jiang, Yifeng Lyu, Rongfei Fan, and Han Hu are with the Beijing Institute of Technology, Beijing 100081, China (e-mail: [email protected]; [email protected]; yifenglyu@bit.edu.cn; [email protected]; [email protected]).
Zhi Liu is with the Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCCN.2024.3361071
2332-7731 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

SATELLITE communication networks (SCNs) are widely utilized in critical areas such as navigation, environmental monitoring, and emergency assistance. Due to their excellent properties, including high bandwidth, global coverage, and low transmission delay, SCNs have become a research hotspot [1]. Typically, satellites can be categorized into three types based on their orbit altitude: geostationary orbit (GEO), medium earth orbit (MEO), and low earth orbit (LEO) satellites [2]. Among these types, LEO satellites are positioned at an altitude ranging from 500 to 2000 km [3]. The lower orbit of LEO satellites provides them with better coverage, shorter communication distance, and less path loss [4], leading to lower time delay and energy consumption compared to higher-orbiting satellites [5]. As a result of these advantages, the deployment of LEO satellites has become more active, and the percentage of LEO satellites in orbit has increased. Notably, several significant projects on LEO satellite communication systems, such as OneWeb and SpaceX, have been launched [6].
However, the power and spectrum resources of LEO satellites are very limited and can no longer meet the growing demand for communications. Therefore, the effective use of LEO satellites is on the agenda, as the number of devices in need of service around the world is rapidly increasing [7]. On the other hand, research on SCNs is still in its infancy, and most of the existing studies are about architecture, applications, multibeam satellite communication, or other research topics [1]. Only a few works optimize the resource allocation for effective data downloading to match the network resources and data demand [8], [9], [10], [11], [12], [13]. In particular, Zhou et al. have proposed a joint scheme for satellite channel and power resource allocation, along with Internet of Remote Things data scheduling, aiming to maximize the network's capacity for Internet of Remote Things data using a model-free reinforcement learning framework [8]. Jia et al. [9] have presented a collaborative data downloading algorithm that optimizes the data reallocation between satellites by utilizing an inter-satellite link (ISL) routing method, considering the communication resource allocation of ISLs to maximize the throughput of data downloading. Reference [10] maximizes the minimum number of successfully scheduled missions over all user satellites by jointly optimizing contact (i.e., potential available communication links) plan design, power allocation in relay satellites, and mission schedules using a time-expanded graph. However, data sharing between satellites is non-trivial due to the high mobility of the satellites. To achieve efficient data downloading, a terrestrial-satellite network (TSN) architecture that integrates the ultra-dense LEO networks and the terrestrial networks is proposed in [11]. The authors have proposed two matching algorithms to solve the joint user scheduling and backhaul transmission power resource allocation problem to maximize the total data rate and the number of accessed users
in the TSN. In addition, to maximize the delay-constrained throughput in satellite networks, Liu et al. [12] have extended a traditional time-expanded graph to model data acquisition, data transmission, and energy management. Reference [13] ensures the data transmission rate and minimizes energy consumption by addressing intermittent link issues [14] between satellites and ground stations through a joint data downloading and resource management approach.
The aforementioned studies have investigated the data downloading problems for LEO satellite networks. However, the channel resources and the transmission power resources are ignored. Note that channel allocation and transmission power allocation have been widely studied in traditional wireless networks [15], [16]. These challenging optimization problems are usually divided into sub-problems and optimized interactively. However, unlike in traditional wireless networks, we cannot afford such interaction due to the high communication cost. Deep RL and other learning-based approaches are more adaptive [17], but cannot be directly used for downlink resource allocation in SCNs due to the unique features of the SCNs.
In this paper, we investigate the downlink resource allocation in a satellite communication network (SCN) consisting of a fixed ground base station and multiple LEO satellites, and we address the aforementioned problems. We jointly optimize the channel allocation of the ground base station and the transmission power allocation of the LEO satellites to maximize a weighted sum transmission rate that accounts for information freshness. In particular, we first mathematically model this downlink transmission and formulate the joint optimization of channel allocation and power allocation as a mixed-integer programming (MIP) problem. To tackle this challenging NP-hard MIP problem, we decouple it into two sub-problems: the channel allocation problem and the transmission power allocation problem.
To solve the transmission power allocation problem, we apply a convex optimization algorithm (i.e., the Lagrange multiplier method and the sub-gradient method) to obtain the optimal power allocation for the input and the given channel allocation; this algorithm is termed Opt. Thus, the main difficulty in solving our problem lies in the channel allocation problem. Inspired by the wide application of learning-based schemes to such problems, we introduce two learning-based schemes, Ptr and DNN-Ptr, for the channel allocation problem. Combining these two steps, we develop two novel algorithms, i.e., Opt-Ptr and Opt-DNN-Ptr. In particular, the Opt-Ptr algorithm devises a novel Pointer Network [18] to obtain the channel allocation decision and then solves the remaining power allocation problem using convex optimization algorithms, such as the Lagrange multiplier method and the sub-gradient method. To further improve the performance, the Opt-DNN-Ptr algorithm utilizes a deep neural network (DNN) to predict a transmission power allocation, which is then combined with the channel allocation decision obtained from the pointer network to solve the remaining power allocation problem. By considering the power allocation during the channel allocation in the Opt-DNN-Ptr algorithm, the performance can be further improved. Note that the predicted power allocation is only used to obtain the channel allocation more accurately. The optimal transmission power allocation of the problem is still obtained by the corresponding convex optimization algorithm. Extensive simulations are conducted, and the results show that the proposed learning-based algorithms achieve encouraging performance compared to other representative benchmark methods. The main contributions can be summarized as follows:
• We formulate the downlink transmission scenario as a MIP problem, allowing for the joint optimization of the ground station's channel allocation and transmission power allocation.
• We decompose the challenging NP-hard problem into two sub-problems, namely the channel allocation problem and the transmission power allocation problem, to avoid the complexity of directly solving the MIP problem. For the power allocation, we propose a convex optimization algorithm called Opt, while for the channel allocation, we introduce two learning-based schemes, namely Ptr and DNN-Ptr.
• We develop two novel algorithms, Opt-Ptr and Opt-DNN-Ptr, by combining these two steps. The Opt-Ptr algorithm uses a Pointer Network in conjunction with convex optimization algorithms such as the Lagrange multiplier method and the sub-gradient method to determine the channel allocation decision and solve the remaining power allocation problem. To further enhance performance, Opt-DNN-Ptr leverages a DNN to predict the transmission power allocation and incorporates it with the channel allocation decision from the pointer network to effectively allocate resources.
• We compare the proposed algorithms with benchmark schemes, and the simulation results validate the excellent performance of our learning-based algorithms across various scenarios. Furthermore, Opt-DNN-Ptr surpasses the other algorithms and exhibits superior stability in performance.

II. RELATED WORK

A. Resource Allocation in SCN
1) Channel and Bandwidth Allocation: Considering the limited spectrum resources of LEO satellites, research has been conducted on allocating channels or bandwidth in satellite communication. For example, Xiao et al. [19] propose a long-short term bandwidth allocation strategy for beam-hopping LEO satellites to improve the transmission rate. To efficiently utilize limited spectrum resources, dynamic channel allocation is crucial, and Liu et al. [20] address this by proposing a centralized channel allocation approach based on DRL to minimize the average transmission latency.
2) Power Allocation: In addition to channel allocation, power allocation has also been studied. For example, Dai et al. [21] incorporate the dynamic and unpredictable channel conditions into the power allocation problem, which is then solved by the proposed DRL-based power allocation algorithm. Similarly, the power allocation optimization with
the objective of extending the battery life of LEO satellites is presented in [22], where Q-learning is applied to selectively share the workload of overworked satellites with less loaded neighbor satellites and to allocate the transmission power resource according to the traffic volume and battery status. Q-learning is also applied in the conjecture-based multi-agent power allocation algorithm proposed in [23], which seeks an optimum power allocation that maximizes the traffic matching degree and fairness for a multi-beam satellite system.
3) Joint Channel and Power Resource Allocation: Joint optimization of power allocation with channel or bandwidth allocation is a research focus, as it is anticipated to offer superior performance compared to non-joint resource allocation approaches. For example, Paris et al. [24] utilize a genetic algorithm to jointly allocate power and bandwidth resources for multi-beam satellites, aiming to enhance system capacity. To improve convergence and avoid local optima, Pachler et al. [25] suggest a hybrid approach that combines a particle swarm algorithm with a genetic algorithm; however, these heuristic algorithms may not be suitable for scenarios requiring timely solutions due to their high complexity and over-sensitivity. To meet the real traffic demand in very high throughput satellite systems, three algorithms, namely Q-Learning, Deep Q-Learning, and Double Deep Q-Learning, are discussed in [26] to dynamically and jointly manage the power, bandwidth, and beamwidth resources. Compared with the research on joint optimization of power allocation and bandwidth allocation, few works focus on the joint optimization of power allocation and channel allocation. Note that channel allocation and transmission power allocation have been widely studied in traditional wireless networks [15], [16], and solved via mathematical optimization or learning-based approaches. However, these solutions cannot be directly applied to resource allocation in SCNs due to the unique features of SCNs.

B. Solving MIP Problems
The joint optimization of channel allocation and power allocation can in general be formulated as a MIP problem. Multiple algorithms have been proposed to tackle such MIP problems. Among them, some learning-based algorithms have shown the potential to solve MIP problems efficiently and quickly. For example, to minimize latency and energy costs in mobile edge computing enhanced SIoT networks, Cui et al. [27] and Wu et al. [28] propose a MIP problem formulation for joint optimization of user association, offloading decision, and resource allocation, which is solved by decomposing it into two sub-problems and applying the Deep Q-Network (DQN) and the Lagrange multiplier method separately. DQN can directly solve the MIP problem [29]; however, its exhaustive search nature in action selection makes it unsuitable for high-dimensional action spaces. To tackle the curse of dimensionality in MIP problems with high-dimensional action spaces, Huang et al. [30] present the DROO algorithm, which decomposes the problem into two sub-problems and solves them independently using an optimization algorithm and a DNN. However, the generation of the channel allocation decision depends heavily on the quantization function, and it is difficult to design the most suitable one. Besides, a quantization function that suits one scenario may become unsuitable after simple changes in the problem or even in the network size. In addition, the input may be insufficient to obtain the channel allocation accurately, especially when the network size is large. These technical issues are addressed in Sections III, IV, and V of this article.

III. SYSTEM MODEL AND PROBLEM FORMULATION

A. System Model
We consider the downlink of a Ka-band SCN consisting of a fixed ground base station and $K$ communicable satellites, denoted by $\mathcal{K} = \{1, \dots, K\}$, as illustrated in Fig. 1. The downlink channel of the ground base station is divided into $N$ sub-channels according to the frequency, denoted by $\mathcal{N} = \{1, \dots, N\}$, where a larger $n \in \mathcal{N}$ indicates a higher carrier frequency of the sub-channel. In general, the number of sub-channels is set to be less than ten [31]. We assume that the ground base station has sufficient information about the satellites in orbit, so that the specific communicable satellites and the corresponding parameters at each time instance can be predicted based on the predictable satellite orbital period.

Fig. 1. An example model with K satellites and a ground station.

The system time is divided into consecutive time frames (periods), and we consider scenarios within a certain time frame of duration $T$, e.g., on the scale of a few minutes [8], [32], [33]. The time period is set to be much less than the duration of each contact of the satellite with the ground base station, and we assume that the network topology remains unchanged within a time frame [34], [35]. In different time frames, the number of satellites ranges from a few to twenty, as the orbital cycles of the satellites differ. In frames where the number of sub-channels $N$ is less than the number of satellites $K$, which is usually the case, the ground base station cannot contact all $K$ satellites and receive data from all of them within one frame. In addition, considering the short contact time and the limited computing resources of LEO satellites, a proper sub-channel allocation and power allocation scheme is in critical demand. Since the base station has more computational resources and knowledge of the satellite orbital cycles, it can predict the communication models between satellites. Therefore, it is assumed that the ground base station assigns sub-channels to the satellites according to the proposed channel assignment scheme, and then the satellite transmits
the collected data to the ground base station accordingly at each time frame through the assigned downlink sub-channel. Due to the high cost of satellite communication, we assume that the channel allocation decision remains the same within a time frame. Because channel conditions may change within a frame, we further allocate transmission power at a finer granularity inside each frame to obtain a more appropriate power allocation. The satellite allocates its limited power to the assigned sub-channels, taking into account other communication tasks with limited and time-varying available power, in order to maximize the defined objective. Details are described later.
Let $\alpha_{n,k} \in \{0, 1\}$ be the indicator variable, where $\alpha_{n,k} = 1$ means that sub-channel $n$ is assigned to satellite $k$, and $\alpha_{n,k} = 0$ means that sub-channel $n$ is not assigned to satellite $k$. We assume that each sub-channel can be assigned to at most one satellite, i.e., $\sum_{k=1}^{K} \alpha_{n,k} \le 1, \ \forall n \in \mathcal{N}$. Satellites that are not assigned any sub-channel in the current frame store the collected data and await the next connection.
We further divide a frame evenly into $M$ time slots of length $\tau = T/M$, indexed by $\mathcal{M} = \{1, \dots, M\}$. Let $C_{n,k,m}$ denote the amount of data transmitted by satellite $k$ through sub-channel $n$ in time slot $m$. To better respond to service latency requirements and to encourage earlier completion of transmissions, we define an information freshness-oriented objective function, i.e., higher priority is given to transmission in the early time slots. The objective function is defined as $\sum_{n=1}^{N} \alpha_{n,k} \sum_{m=1}^{M} w_m C_{n,k,m}$, where $w_m$ is a weight parameter that decreases as $m$ increases. This objective function is the weighted sum of the data transmitted by satellite $k$ and accounts for information freshness.
Thus, to maximize the objective function, we mainly need to optimize the channel allocation decision $\alpha_{n,k}$ and $C_{n,k,m}$, where $C_{n,k,m} = r_{n,k,m} \cdot \tau$. Here $r_{n,k,m}$ denotes the transmission rate of sub-channel $n$ in slot $m$ when sub-channel $n$ is allocated to satellite $k$.
Let $p_{n,k,m}$ denote the transmission power that satellite $k$ allocates to sub-channel $n$ in slot $m$, which is zero if $\alpha_{n,k} = 0$. Let $h_{n,k,m}$ denote the downlink channel gain of sub-channel $n$ assigned to satellite $k$ in slot $m$. Considering the channel fading caused by rain attenuation, weather, air quality, and other physical particulates [36], [37], $h_{n,k,m}$ can be modeled as
\[
h_{n,k,m} = \frac{A_R A_{T,k}}{c^2 l_k^2} \cdot \frac{f_n^2}{A^{t}_{n,k,m}},
\]
where $A_R$ and $A_{T,k}$ are the effective areas of the ground receiving antenna and the transmitting antenna of satellite $k$, respectively, $c$ is the speed of light, $l_k$ is the distance between satellite $k$ and the ground BS, $f_n$ is the carrier frequency of sub-channel $n$, and $A^{t}_{n,k,m}$ is the total channel fading of sub-channel $n$ allocated to satellite $k$ in slot $m$, which increases rapidly with the carrier frequency. The achievable transmission rate of sub-channel $n$ allocated to satellite $k$ in slot $m$, denoted by $r_{n,k,m}$, can be modeled using the Shannon formula as
\[
r_{n,k,m} = B_n \cdot \log_2\!\left(1 + \frac{p_{n,k,m} \cdot h_{n,k,m}^2}{\sigma_n^2}\right), \tag{1}
\]
where $B_n$ and $\sigma_n^2$ are the transmission bandwidth and the noise power of sub-channel $n$, respectively.
The satellite-to-ground data link is mainly free space optical (FSO) communication, and the downlink channel gain varies slowly with time and is mainly affected by weather [38]. We can therefore assume that the channel gain is approximately constant within a time slot of length $\tau$, where $\tau$ is small enough that the gain stays approximately constant despite weather changes, satellite movement, etc. Considering the limited power resource of satellites, the transmission power provided by satellite $k$ at each time instant cannot surpass a specific threshold $p_k^{\max}$. Thus we have the maximum power constraint for satellite $k$, $p_{n,k,m} \le p_k^{\max}$. However, continuous transmission at maximum power may heavily drain the satellite's battery [39]. To keep the long-term power budget, we further consider an average power constraint. Specifically, the average transmit power within each frame is restricted below a certain level $p_k^{\mathrm{av}}$, where $p_k^{\mathrm{av}} \le p_k^{\max}$. For the power allocation of satellite $k$, the average transmit power constraint can be formulated as $\sum_{n=1}^{N} \sum_{m=1}^{M} p_{n,k,m}/M \le p_k^{\mathrm{av}}$. For brevity, we denote $p_k^{\mathrm{tot}} = M p_k^{\mathrm{av}}$. Thus the average power constraint can be transformed into the total transmission power constraint $\sum_{n=1}^{N} \sum_{m=1}^{M} p_{n,k,m} \le p_k^{\mathrm{tot}}$, where $p_k^{\mathrm{tot}} \le M p_k^{\max}$.

B. Problem Formulation
With given maximum power $p_k^{\max}$ and total power $p_k^{\mathrm{tot}}$, we denote the parameters of satellite $k$ as $g_k = (p_k^{\max}, p_k^{\mathrm{tot}})$, $k \in \mathcal{K}$. For convenience of expression, we use $I_t = \{h_t, G_t\}$ to denote the channel gains and satellite parameters in the $t$-th frame. Without loss of generality, we omit the subscript $t$ for brevity in the following. Let $\boldsymbol{\alpha} \triangleq \{\alpha_{n,k}\}$ denote the sub-channel allocation and $\mathbf{P} \triangleq \{p_{n,k,m}\}$ denote the overall transmit power allocation of the $K$ satellites. For each frame with the input $I$, we are interested in optimizing the channel allocation $\boldsymbol{\alpha}$ and the power allocation $\mathbf{P}$ with the goal of maximizing the sum weighted transmission data $\bar{C}(I) \triangleq \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k}$, where $\bar{C}_{n,k} \triangleq \alpha_{n,k} \sum_{m=1}^{M} w_m C_{n,k,m}$.
We consider two cases with different channel allocation constraints in the formulation. The first case assumes that each satellite can be assigned at most one sub-channel, i.e., $\sum_{n=1}^{N} \alpha_{n,k} \le 1, \ \forall k \in \mathcal{K}$, so that as many satellites as possible can transmit data back to the ground station when the channel resources are not enough. The downlink resource allocation with the aforementioned constraints can then be formulated as follows:
\[
\begin{aligned}
(\mathrm{P1}):\quad \bar{C}^{*}(I) = \max_{\boldsymbol{\alpha},\,\mathbf{P}}\ & \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k} \\
\text{s.t.}\quad & 0 \le p_{n,k,m} \le p_k^{\max}, \quad n \in \mathcal{N},\ k \in \mathcal{K},\ m \in \mathcal{M}, && \text{(2a)} \\
& \sum_{m=1}^{M} p_{n,k,m} \le p_k^{\mathrm{tot}}, \quad n \in \mathcal{N},\ k \in \mathcal{K}, && \text{(2b)} \\
& \alpha_{n,k} \in \{0, 1\}, \quad n \in \mathcal{N},\ k \in \mathcal{K}, && \text{(2c)} \\
& \sum_{k=1}^{K} \alpha_{n,k} \le 1, \quad n \in \mathcal{N}, && \text{(2d)} \\
& \sum_{n=1}^{N} \alpha_{n,k} \le 1, \quad k \in \mathcal{K}. && \text{(2e)}
\end{aligned}
\]
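As a minimal sketch (not the authors' implementation), the channel-gain model, the rate in (1), and the per-link weighted data $\bar{C}_{n,k}$ that enters (P1) can be evaluated in a few lines of Python. The function names, the use of SI units, and the sample weights are assumptions made here for illustration; the text only requires that $w_m$ decrease with $m$.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light c in m/s


def channel_gain(a_r, a_t, f_n, l_k, fading):
    """Gain h_{n,k,m}: A_R * A_{T,k} / (c^2 * l_k^2) * f_n^2 / A^t_{n,k,m}."""
    return (a_r * a_t) / (C_LIGHT ** 2 * l_k ** 2) * f_n ** 2 / fading


def rate(p, h, bandwidth, noise_power):
    """Shannon rate of eq. (1): B_n * log2(1 + p * h^2 / sigma_n^2)."""
    return bandwidth * math.log2(1.0 + p * h ** 2 / noise_power)


def weighted_data(alpha, p, h, w, bandwidth, noise_power, tau):
    """C-bar_{n,k} = alpha_{n,k} * sum_m w_m * C_{n,k,m}, with C = r * tau.

    p, h and w are per-slot lists of equal length M for one (n, k) pair.
    """
    return alpha * sum(
        w_m * rate(p_m, h_m, bandwidth, noise_power) * tau
        for w_m, p_m, h_m in zip(w, p, h)
    )


# Hypothetical normalized example: two slots of length tau = 2 with
# freshness weights 1.0 and 0.5 (illustrative values, not from the paper).
example = weighted_data(1, [1.0, 1.0], [1.0, 1.0], [1.0, 0.5], 1.0, 1.0, 2.0)
```

With unit bandwidth, noise power, gain, and transmit power, each slot has rate 1, so the two slots contribute 2 and 1 after weighting and `example` evaluates to 3. Setting `alpha` to 0 removes the link from the objective, which mirrors the role of $\alpha_{n,k}$ in $\bar{C}_{n,k}$.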

Additionally, we also consider the case in which each satellite can be assigned multiple sub-channels. The downlink resource allocation with the aforementioned constraints can be formulated as follows:
\[
\begin{aligned}
(\mathrm{P2}):\quad \bar{C}^{*}(I) = \max_{\boldsymbol{\alpha},\,\mathbf{P}}\ & \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k} \\
\text{s.t.}\quad & 0 \le \sum_{n=1}^{N} p_{n,k,m} \le p_k^{\max}, \quad k \in \mathcal{K},\ m \in \mathcal{M}, && \text{(3a)} \\
& \sum_{n=1}^{N} \sum_{m=1}^{M} p_{n,k,m} \le p_k^{\mathrm{tot}}, \quad k \in \mathcal{K}, && \text{(3b)} \\
& \alpha_{n,k} \in \{0, 1\}, \quad n \in \mathcal{N},\ k \in \mathcal{K}, && \text{(3c)} \\
& \sum_{k=1}^{K} \alpha_{n,k} \le 1, \quad n \in \mathcal{N}. && \text{(3d)}
\end{aligned}
\]
(P1) and (P2) are non-convex MIP problems with binary variables $\boldsymbol{\alpha}$ and continuous variables $\mathbf{P}$, which are difficult to solve directly. To tackle them, we decompose each MIP problem into two sub-problems, i.e., the channel allocation problem, which is a combinatorial optimization problem, and the power allocation problem, which is a convex optimization problem. The major difficulty lies in the channel allocation problem, as the power allocation problem can be solved by convex optimization algorithms once the channel allocation is given.
Since there are too many possible channel allocations in our problems, it is difficult to search for an optimal or a satisfying sub-optimal channel allocation decision among all of them. Instead, we utilize learning-based methods to obtain the channel allocation decision. With the obtained channel allocation, the overall problem is reduced to a convex optimization problem, which can be solved using convex optimization. The details are given next.

IV. OPTIMIZATION BASED POWER ALLOCATION

A. Power Allocation for (P1)
With a given channel allocation decision $\boldsymbol{\alpha}$, the original optimization problem (P1) is reduced to the following convex optimization problem:
\[
\begin{aligned}
(\mathrm{P3}):\quad \bar{C}^{*}(I, \boldsymbol{\alpha}) = \max_{\mathbf{P}}\ & \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k} \\
\text{s.t.}\quad & \text{(2a), (2b)}. && \text{(4a)}
\end{aligned}
\]
A standard convex solver such as CVX can be applied to solve (P3) and obtain the power allocation. However, using the CVX optimization toolbox with the interior point method to obtain the optimal solution is time-consuming, so we propose a more efficient method. Note that the sum weighted transmission data increases with the transmission power. It can therefore be shown that equality holds in (2b) at the optimal solution to (P3): otherwise, we could always increase $p_{n,k,m}$ until equality holds in (2b), which would yield a larger objective value. As a result, the constraint (2b) in (P3) can be transformed into an equality constraint, and we can solve this problem with the Lagrange multiplier method. To be specific, we first introduce the Lagrangian multiplier $\lambda$ and obtain the Lagrangian function
\[
L(\mathbf{P}, \lambda) = \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k} + \lambda \left( \sum_{m=1}^{M} p_{n,k,m} - p_k^{\mathrm{tot}} \right). \tag{5}
\]
It can be shown that finding the optimal solution to (P3) is equivalent to finding the optimal solution of $L(\mathbf{P}, \lambda)$ under the constraint (2a), where $\partial L(\mathbf{P}, \lambda)/\partial \lambda = 0$ and $\partial L(\mathbf{P}, \lambda)/\partial \mathbf{P} = 0$. As a result, we can express the optimal power allocation $p^{*}_{n,k,m}$ of satellite $k$ in slot $m$ as
\[
p^{*}_{n,k,m} = -\alpha_{n,k} \left( \frac{\tau w_m \cdot B_n}{\lambda^{*} \ln 2} + \frac{\sigma_n^2}{h_{n,k,m}^2} \right). \tag{6}
\]
Thus, finding the optimal power allocation $\mathbf{P}^{*}$ is equivalent to finding the optimal $\lambda^{*}$, which can be obtained by the efficient binary search method.

B. Power Allocation for (P2)
Since one satellite can be assigned multiple sub-channels in (P2), the corresponding power allocation of a satellite changes from one-dimensional to multi-dimensional. In this case, the power allocation is written as the following convex optimization problem:
\[
\begin{aligned}
(\mathrm{P4}):\quad \bar{C}^{*}(I, \boldsymbol{\alpha}) = \max_{\mathbf{P}}\ & \sum_{n=1}^{N} \sum_{k=1}^{K} \bar{C}_{n,k} \\
\text{s.t.}\quad & \text{(3a), (3b)}. && \text{(7a)}
\end{aligned}
\]
In the following, we solve problem (P4) by the sub-gradient method. In particular, we set an initial feasible power allocation $\mathbf{P}^{(0)} \triangleq \{p^{(0)}_{n,k,m}\}$. Starting from $\mathbf{P}^{(0)}$, we repeatedly update the power allocation using the simple iteration
\[
\mathbf{P}^{(i+1)} = \mathbf{P}^{(i)} + \beta_i \mathbf{g}^{(i)}, \tag{8}
\]
where $i = 0, \dots, Z$ denotes the step, $Z$ is the total number of update steps, and $\beta_i$ is the step size of step $i$. $\mathbf{g}^{(i)} \triangleq \{g^{(i)}_{n,k,m}\}$ is a sub-gradient of the objective function of (P4) or of one of the constraint functions (3a) and (3b). The sub-gradient $g^{(i)}_{n,k,m}$ is generated as follows:
1) If the constraints (3a) and (3b) are satisfied, we use the objective sub-gradient
\[
g^{(i)}_{n,k,m} = \alpha_{n,k} \frac{\tau w_m \cdot B_n h_{n,k,m}^2}{\ln 2 \left( \sigma_n^2 + p^{(i)}_{n,k,m} h_{n,k,m}^2 \right)}. \tag{9}
\]
2) Otherwise, we choose any unsatisfied constraint function and use a sub-gradient of that constraint function. Specifically, if constraint (3b) is not satisfied, then $g^{(i)}_{n,k,m} = -1$. If constraint (3a) is not satisfied, then $g^{(i)}_{n,k,m}$ is obtained as
\[
g^{(i)}_{n,k,m} =
\begin{cases}
1, & \sum_{n=1}^{N} p_{n,k,m} < 0, \\
-1, & \sum_{n=1}^{N} p_{n,k,m} > p_k^{\max}.
\end{cases} \tag{10}
\]
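The two power-allocation routines of this section can be sketched and cross-checked in Python. This is an illustrative sketch under stated assumptions, not the authors' code: it treats a single satellite whose listed sub-channels are all assigned ($\alpha_{n,k} = 1$), writes $\nu = -\lambda > 0$ so that (6) takes the familiar water-filling form, clips (6) to the box (2a), and uses a diminishing step $\beta_i = \beta_0/\sqrt{i+1}$ together with clamping at zero, neither of which the text specifies. The function names `waterfill`, `subgradient`, and `weighted_data` are hypothetical.

```python
import math

LN2 = math.log(2.0)


def weighted_data(p, w, h, B, sigma2, tau):
    """Objective of (P3)/(P4) for one satellite; p and h are indexed [n][m]."""
    return sum(tau * w[m] * B[n] * math.log2(1.0 + p[n][m] * h[n][m] ** 2 / sigma2[n])
               for n in range(len(p)) for m in range(len(w)))


def waterfill(w, h, B, sigma2, tau, p_max, p_tot, iters=200):
    """Eq. (6) for one sub-channel, clipped to (2a); bisection on nu = -lambda."""
    def alloc(nu):
        return [min(max(tau * wm * B / (LN2 * nu) - sigma2 / hm ** 2, 0.0), p_max)
                for wm, hm in zip(w, h)]
    lo = hi = 1.0
    for _ in range(200):            # shrink lo until the budget is reached
        if sum(alloc(lo)) >= p_tot:
            break
        lo /= 2.0
    for _ in range(200):            # grow hi until the budget is respected
        if sum(alloc(hi)) <= p_tot:
            break
        hi *= 2.0
    for _ in range(iters):          # total power is non-increasing in nu
        mid = math.sqrt(lo * hi)
        if sum(alloc(mid)) > p_tot:
            lo = mid
        else:
            hi = mid
    return alloc(hi)


def subgradient(w, h, B, sigma2, tau, p_max, p_tot, steps=3000, beta0=0.5):
    """Iteration (8) with sub-gradients (9)-(10); returns the best feasible point."""
    N, M = len(h), len(w)
    p = [[0.0] * M for _ in range(N)]           # P^(0): zero power is feasible
    best_p, best_val = [row[:] for row in p], 0.0
    for i in range(steps):
        beta = beta0 / math.sqrt(i + 1.0)       # assumed diminishing step size
        slot = [sum(p[n][m] for n in range(N)) for m in range(M)]
        if sum(slot) > p_tot + 1e-12:           # (3b) violated: g = -1
            g = [[-1.0] * M for _ in range(N)]
        elif any(s > p_max + 1e-12 for s in slot):   # (3a): second case of (10)
            g = [[-1.0 if slot[m] > p_max + 1e-12 else 0.0 for m in range(M)]
                 for _ in range(N)]
        else:                                   # feasible: record, then use (9)
            val = weighted_data(p, w, h, B, sigma2, tau)
            if val > best_val:
                best_p, best_val = [row[:] for row in p], val
            g = [[tau * w[m] * B[n] * h[n][m] ** 2
                  / (LN2 * (sigma2[n] + p[n][m] * h[n][m] ** 2))
                  for m in range(M)] for n in range(N)]
        # Clamping at zero is an added safeguard; with it, the first case of
        # (10) (negative per-slot power) never occurs.
        p = [[max(p[n][m] + beta * g[n][m], 0.0) for m in range(M)]
             for n in range(N)]
    return best_p, best_val
```

For a single sub-channel, (P4) coincides with (P3), so the best value returned by `subgradient` should approach the objective of the `waterfill` allocation from below. For instance, with weights `[1.0, 0.5]`, unit gains, bandwidth, noise and slot length, `p_max = 10` and `p_tot = 2`, the closed form gives the allocation (5/3, 1/3).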


During the updating, we record the best feasible allocation, i.e., the one with the maximum objective value, as $P_{best}$ and output $P_{best}$ as the result.

V. LEARNING BASED CHANNEL ALLOCATION

The power allocation problem in (P1) and (P2) can be solved with convex optimization algorithms, as demonstrated in Section IV. However, we need to obtain the optimal channel allocation in advance. As there are $A_K^N$ and $K^N$ possible channel allocations in problems (P1) and (P2), respectively, it is difficult to search for an optimal channel allocation strategy among all the possible ones. Deep learning can approximate arbitrary continuous functions, which allows it to mimic the behavior of highly nonlinear and complex systems and makes it a universal approximator that can be applied to a variety of problems in communications. Therefore, we introduce two learning-based schemes, Ptr and DNN-Ptr, to solve the channel allocation problem. Combining the learning-based schemes with convex optimization based power allocation schemes, we develop two novel algorithms, i.e., Opt-Ptr and Opt-DNN-Ptr. The Opt-Ptr algorithm introduces a novel Pointer Network to determine the channel allocation decision. It then addresses the remaining power allocation problem by employing convex optimization algorithms. Additionally, the Opt-DNN-Ptr algorithm enhances performance by incorporating a DNN to predict the transmission power allocation. This predicted allocation is combined with the channel allocation decision obtained from the Pointer Network to solve the remaining power allocation problem. By leveraging more information on power allocation, Opt-DNN-Ptr guides the channel allocation, leading to a more rational policy generated by the Pointer Network.

A. Pointer Network Based Channel Allocation (Ptr)

With the input $I$, we need to select an optimal action (a channel allocation for each sub-channel) through continuous training, in order to learn the policy $\pi : I \to \alpha^*$. Since the input is dynamic, owing to the highly dynamic channel conditions and the varying parameters of different satellites in different time frames, we apply RL to train the agent. RL learns from the agent's own experience and is a suitable and powerful tool for automatic control and decision-making problems in random dynamic environments. In addition, because RL does not need manually labeled training samples, it is more robust to changes in the satellite parameters and the channel conditions. Note that traditional methods, such as DQN and DNN, can be applied to meet the above demands. However, these solutions are not scalable, i.e., the models are rigid with respect to problem size. As mentioned before, the size of our problem is not fixed, as the number of satellites $K$ may change. Thus, we need a solution that is flexible and scalable with respect to problem size, and we devise a novel Pointer Network to allocate the channels.

The Pointer Network is a sequence-to-sequence network with an encoder and a decoder, both of which consist of Long Short-Term Memory (LSTM) cells, as presented in Fig. 2. The encoder reads the input sequence and converts the sequence into memory states. To be specific, for the input in step $k$, the encoder first embeds $I_k$ into a $d$-dimensional embedding $e_k$, which is then transformed into the memory state $h_k$ by the LSTM cell. The decoder also maintains its memory states and generates a distribution over the output selection. For the input to the decoder, the LSTM cells in the decoder directly convert the input into the memory states $q$. At the first decoder step, the initial input of the LSTM cell in the decoder is the start of sequence (SOS) token ⇒, which can be treated as a trainable parameter of our neural network. The LSTM cell in the decoder takes ⇒ as input and outputs $q_1$. The output of the LSTM cell is combined with the encoder output, $h$, to create the probability distribution over the input sequence in the Pointer Generation (Ptr-Gen), where the probability distribution in the $n$th decoder step, $p^n$, is generated as

\[
u^n_k =
\begin{cases}
v^T \tanh(W_1 h_k + W_2 q_n), & m_k = 0, \\
-\infty, & m_k = 1,
\end{cases}
\tag{11}
\]

\[
p^n = \mathrm{softmax}(u^n), \quad u^n = \{u^n_1, u^n_2, \ldots, u^n_K\}.
\tag{12}
\]

In (11) and (12), $W_1$, $W_2$ and $v$ are trainable parameters. To guarantee that an input node can be selected at most once in some problems, a binary vector $m$ (the mask) with length equal to the input size is used to mask the input nodes that have been selected. $m_k = 1$ denotes that input node $k$ has been selected and cannot be selected again, while $m_k = 0$ denotes that node $k$ has not been selected. With the probability distribution in the $n$th decoder step, $p^n$, the pointer is then selected from $p^n$ via a selection strategy. The simplest strategy is $\arg\max_k p^n(k)$, where $p^n(k)$ is the probability of the $k$th input node. Once the pointer is selected, it is passed as the input to the next decoder step to generate the next selection. It can be noted that the output of the decoder is a sequence of pointers (i.e., indexes) to the input sequence. Taking the Pointer Network architecture in Fig. 2 as an example, the output (green arrows) is the sequence $2, K, k$. Because of this, the Pointer Network is suitable for such selection problems. More importantly, the output size of the Pointer Network can vary, which provides flexibility and scalability with respect to the problem size for our problem.

B. DNN Based Power Allocation Prediction

As mentioned above, there are $A_K^N = \frac{K!}{(K-N)!}$ possible channel allocations in the single sub-channel case (P1) and $K^N$ possible channel allocations in the multi sub-channel case (P2). In the learning based channel allocation discussed above, the Pointer Network based scheme Ptr outputs the optimal channel allocation based solely on an input consisting of the channel gains and the parameters of the satellites, which may be insufficient to output the channel allocation accurately in our problems, especially when $N$ and $K$ are large. Thus, we add a DNN based scheme to predict the power allocation, and the prediction results are added to the input of the Pointer Network based channel allocation to obtain a more accurate channel allocation policy. As we can obtain the optimal power allocation through the convex optimization algorithms, we can


Fig. 2. The structure of the proposed algorithms. The solid part and dashed part represent the structure of Opt-Ptr and Opt-DNN-Ptr, respectively. The
structure consists of five steps; Step1: generate the input sequence Î , where Î = (I , Ṗ) in Opt-DNN-Ptr, Î = I in Opt-Ptr. Step2: with Î , the pointer
network outputs the relaxed channel allocation α̇ and critic network generates predicted reward b; Step3: generate the candidate channel allocation (1 in
Opt-Ptr and Q in Opt-DNN-Ptr) based on the relaxed channel allocation α̇; Step4: calculate the power allocation and corresponding C̄ ∗ according to the
optimization algorithm; Step5: train the networks.
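The masked pointer scoring of (11) and (12), which the Pointer Network in this structure uses at every decoder step, can be sketched in a few lines. This is a minimal illustration in plain Python rather than the authors' implementation: the function names, the list-based matrix layout and the explicit error for a fully masked input are assumptions, and the LSTM states $h_k$ and $q_n$ are taken as given.

```python
import math


def pointer_distribution(h, q_n, W1, W2, v, mask):
    """Return the distribution p^n over the K encoder states, following (11)-(12).

    h    : list of K encoder memory states, each a list of length d
    q_n  : decoder memory state at step n, list of length d
    W1/W2: trainable d x d matrices given as lists of rows
    v    : trainable vector of length d
    mask : list of K flags, 1 = node already selected, 0 = still available
    """
    if all(m_k == 1 for m_k in mask):
        raise ValueError("every input node is masked")  # assumption of this sketch

    def matvec(W, x):
        return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

    w2q = matvec(W2, q_n)
    scores = []
    for h_k, m_k in zip(h, mask):
        if m_k == 1:
            scores.append(-math.inf)  # masked node: u_k = -infinity
        else:
            w1h = matvec(W1, h_k)
            scores.append(sum(v_i * math.tanh(a + b) for v_i, a, b in zip(v, w1h, w2q)))

    # Numerically stable softmax; exp(-inf) evaluates to 0.0 for masked nodes.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pointer(p_n):
    """Simplest selection strategy: index of the most probable input node."""
    return max(range(len(p_n)), key=lambda k: p_n[k])
```

For the single sub-channel case, the selected index would be marked in `mask` before the next decoder step, so a satellite cannot be pointed to twice; in the multi sub-channel case the mask would stay all zeros.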

train the neural network with training data obtained from the output of the convex optimization algorithms, to achieve the power allocation prediction policy $\pi : I \to P^*$.

In particular, we adopt a DNN to learn the power allocation policy. This DNN is characterized by the coefficient vector $\theta_D$, e.g., the weights that connect the hidden neurons. Specifically, as the DNN architecture in Fig. 2 shows, the fully connected DNN consists of an input layer, two hidden layers and one output layer, where we use ReLU as the activation function in the hidden layers and a Sigmoid activation function in the output layer. As the convex optimization algorithms can calculate the optimal power allocation, we train the DNN with the experience replay technique [40]. Every training interval, a batch of training data pairs (with the optimal power allocation calculated by the convex optimization algorithm) is sampled from the replay memory to update the parameters of the DNN. The power allocation prediction in the next time frame is then obtained from the newly updated DNN. Such iterations repeat thereafter as the input of each time frame is different, and the policy of the DNN is gradually improved.

In the next subsections, we introduce two algorithms, the Opt-Ptr algorithm and the Opt-DNN-Ptr algorithm, to solve the MIP problems.

C. Opt-Ptr Algorithm

The Opt-Ptr algorithm combines convex optimization and the Pointer Network, as shown in Fig. 2 (the solid line part): we obtain the channel allocation from the Pointer Network and solve the remaining convex optimization problem with the power allocation algorithm introduced in Section IV. For a given input $I$, the Opt-Ptr algorithm directly outputs the optimized channel allocation of the ground base station $\alpha^*$ through the Pointer Network. With the obtained channel allocation $\alpha^*$, we utilize the power allocation optimization algorithm in Section IV to calculate the optimal power allocation $P^*$ and the corresponding $\bar{C}^*$. After obtaining the optimal $\bar{C}^*$, we update the Pointer Network using a policy-based RL method. The architecture of the Opt-Ptr algorithm is detailed as follows.

The input $I$ consists of the satellite parameters $g_k = (D_{k,0}, p_k^{\max}, p_k^{tot})$, $k \in \mathcal{K}$, and the channel gains $h \triangleq \{h_{n,k,m}\}$. We divide it into $K$ input sequences, one per satellite, as $I_k = \{h_k, g_k\}$, $k \in \{1, \ldots, K\}$. Here $h_k$ denotes the channel gains of the channels connected to satellite $k$, which can be expressed as $h_k = \{h_{n,k,m}\}$, $n \in \{1, \ldots, N\}$, $m \in \{1, \ldots, M\}$. The encoder of the Pointer Network takes the sequence of satellite parameters $I_k$, $k \in \{1, \ldots, K\}$, as the input and embeds it into a vector

representation. Based on the vector representation, the decoder of the Pointer Network points to one of the input satellites $k$ in each step. Setting the total number of decoder steps equal to the number of sub-channels $N$, we can take the output of the Pointer Network in the $n$-th step as an optimized allocation of sub-channel $n$, denoted as $\alpha_n^* \in \{1, \ldots, K\}$, $n = 1, \ldots, N$. However, it is worthwhile to note that the channel allocation in Section IV is denoted as $\alpha_{n,k}$, not $\alpha_n$, which means that we cannot apply the optimization algorithm based on $\alpha_n$ directly. In fact, it is easy to obtain the channel allocation $\alpha_{n,k}$ from $\alpha_n$, i.e.,

\[
\alpha_{n,k} =
\begin{cases}
1, & \alpha_n = k, \\
0, & \alpha_n \neq k.
\end{cases}
\tag{13}
\]

Thus, the channel allocation expression utilized in this section ($\alpha_n \in \{1, \ldots, K\}$) is essentially equivalent to the expression $\alpha_{n,k}$ in Section IV. Without loss of generality, we substitute the channel allocation $\alpha_{n,k}$ with $\alpha_n \in \{1, \ldots, K\}$ in the following. Thus, we can take the output of the Pointer Network as the optimized channel allocation $\alpha^* \triangleq \{\alpha_n^*\}$, $n \in \{1, \ldots, N\}$. Note that each satellite in the single sub-channel case (P1) can be assigned to at most one sub-channel. In this problem (P1), we therefore restrict the decoder to point to each input at most once by using a mask mechanism that sets the parameter of the selected input to $-\infty$, so that the decoder will not point to it anymore. As there is no such limit in (P2), we apply the Pointer Network without the mask mechanism. For convenience, we express this optimized channel allocation generation process as $\alpha^* = p_{\theta_P}(\cdot \mid I)$, where $p_{\theta_P}$ denotes the function of the Pointer Network and $\theta_P$ denotes the parameters of the Pointer Network. Based on the optimal channel allocation $\alpha^*$, we obtain the optimal power allocation $P^*(I, \alpha^*)$ and the corresponding sum weighted transmission data $\bar{C}^*(I, \alpha^*)$ by applying the aforementioned convex optimization algorithms. Then we update the Pointer Network accordingly.

For the Pointer Network, our goal is to obtain an optimal channel allocation based on the input channel gains and satellite parameters, which requires a reasonable training method. Thus, we apply the policy-based RL method to train the Pointer Network. Taking the optimal sum weighted transmission data $\bar{C}^*$ in each frame as the reward signal, we update the parameters of the Pointer Network $\theta_P$ by using the policy gradient method. In each frame, with the input $I$, the output optimal channel allocation $\alpha^*$, and the reward signal $\bar{C}^*$, we obtain the policy gradient $\nabla J(\theta_P)$ in this frame as follows:

\[
\nabla J(\theta_P) \leftarrow (\bar{C}^* - b) \nabla_{\theta_P} \log p_{\theta_P}(\alpha^* \mid I),
\tag{14}
\]

where $b$ is the baseline value of the sum weighted transmission data, which can effectively reduce the variance of the gradients and thus improve the performance of the Pointer Network. To obtain the baseline $b$, we need a critic network, whose parameters are denoted by $\theta_V$. The critic network has a similar architecture to the Pointer Network and takes the same input as the Pointer Network, but outputs the baseline $b$ instead. In order to accurately predict the baseline $b$, we also need to train the critic network. Based on the reward signal $\bar{C}^*$, we update the parameters of the critic network using the loss function $L_v$, which is obtained as $L_v = \| b - \bar{C}^* \|_2^2$. With the policy gradient of the Pointer Network and the loss function of the critic network, we update the whole network with the ADAM algorithm. The pseudocode of the Opt-Ptr algorithm is provided in Algorithm 1.

Algorithm 1 Opt-Ptr Algorithm
Input: Total input $I$ of each frame.
Output: Optimized channel allocation of ground base station $\alpha^*$ and optimized power allocation of satellites $P^*$.
1: Initialize Pointer Network parameters $\theta_P$ and Critic Network parameters $\theta_V$;
2: Set training number $O$;
3: for all $t = 1, \ldots, O$ do
4:     Generate solution of the channel allocation problem $\alpha^*$ with Pointer Network;
5:     Get the baseline of the Pointer Network $b$ with critic network;
6:     Compute optimal $\bar{C}^*(I, \alpha^*)$ and $P^*(I, \alpha^*)$ utilizing the power allocation optimization algorithm;
7:     $g_{\theta_P} = (\bar{C}^* - b) \nabla_{\theta_P} \log p_{\theta_P}(\alpha^* \mid I)$, $L_v = \| b - \bar{C}^* \|_2^2$;
8:     Update pointer network and critic network, $\theta_P \leftarrow \mathrm{ADAM}(\theta_P, g_{\theta_P})$, $\theta_V \leftarrow \mathrm{ADAM}(\theta_V, \nabla_{\theta_V} L_v)$;
9: end for

D. Opt-DNN-Ptr Algorithm

In the Opt-DNN-Ptr approach, we utilize a DNN to obtain a predicted power allocation, which is then combined with the input $I$ and used to obtain the optimal channel allocation by applying the Pointer Network. As shown in Fig. 2 (the dashed part), there are three parts in the Opt-DNN-Ptr framework: the DNN, the Pointer Network, and the aforementioned power allocation optimization algorithm. Different from the Opt-Ptr algorithm, we take the predicted power allocation $\dot{P}$, together with the input $I$, as the basis to generate the optimal channel allocation $\alpha^*$. For the input $I$, the DNN predicts a power allocation $\dot{P}$. Then the Pointer Network takes the predicted power allocation $\dot{P}$ together with the initial input $I$ as the input sequence and outputs the basis channel allocation $\dot{\alpha}$. Based on the corresponding quantization function, we quantize the basis channel allocation into $Q$ candidate channel allocations. For each candidate channel allocation, we calculate the optimal power allocation and the corresponding sum weighted transmission data with the corresponding power allocation optimization algorithm. By comparing the sum weighted transmission data corresponding to each candidate channel allocation, we obtain the optimal solution for our problems and the corresponding sum weighted transmission data $\bar{C}^*$. According to the input $I$ and the optimal power allocation $P^*$, we update the DNN using ADAM. In addition, we also update the Pointer Network based on the reward $\bar{C}^*$ and the input. Details of Opt-DNN-Ptr are described below.

For the input $I$ consisting of the channel gains and satellite parameters, we first predict the overall power allocation of the satellites $\dot{P} \triangleq \{\dot{p}_{n,k,m}\}$ by applying the DNN with parameters $\theta_D$. Based on the predicted power allocation $\dot{P}$ and the initial


input $I$, we generate the input sequence of the Pointer Network, denoted as $(I_k, \dot{P}_k)$. The generation from $I$ to $I_k$ is the same as the division in the Opt-Ptr algorithm. In addition, we obtain the predicted power allocation sequence for each satellite as $\dot{P}_k = \{\dot{p}_{n,k,m}\}$, $n \in \{1, \ldots, N\}$, $m \in \{1, \ldots, M\}$. Taking the sequence $(I_k, \dot{P}_k)$, $k = 1, \ldots, K$, as input, the Pointer Network with parameters $\theta_P$ outputs a basis channel allocation $\dot{\alpha} = \{\dot{\alpha}_n \mid \dot{\alpha}_n \in \{1, \ldots, K\}, n = 1, \ldots, N\}$. The basis channel allocation $\dot{\alpha}$ is itself a specific channel allocation. However, note that the DNN is updated based on the data pairs from experience, so the quality of the training data pair generated at each frame is directly related to the output quality of the DNN. The quality of the training data pairs cannot be guaranteed if we generate only one candidate channel allocation in each frame, which may result in an inaccurate predicted power allocation. Furthermore, an inaccurate predicted power allocation may lead to bad performance of the whole framework. Thus, we apply the quantization procedure and obtain the best solution and training data pair from $Q$ candidate channel allocations, which can guarantee a good output with a reasonable quantization function.

Algorithm 2 Opt-DNN-Ptr Algorithm
Input: Total input $I$ of each frame.
Output: Optimized channel allocation of ground base station $\alpha^*$ and optimal power allocation of satellites $P^*$.
1: Initialize DNN parameters $\theta_D$;
2: Initialize Pointer Network parameters $\theta_P$ and Critic Network parameters $\theta_V$;
3: Set training number $O$, training interval $\delta$ and batch size $B$;
4: for all $t = 1, \ldots, O$ do
5:     Predict the power allocation of satellites $\dot{P}$ with DNN;
6:     Generate the relaxed channel allocation $\dot{\alpha}$ based on the total input $(I, \dot{P})$ with Pointer Network;
7:     Get the baseline of the Pointer Network $b$ by applying critic network;
8:     Quantize $\dot{\alpha}$ into $Q$ specific channel allocations of ground base station $\alpha_q \leftarrow Q_q(\dot{\alpha})$;
9:     Compute $\bar{C}^*(I, \alpha_q)$ and $P^*(I, \alpha_q)$ for all $\alpha_q$ by solving the power allocation optimization problem (P3) or (P4);
10:    Select the optimal channel allocation $\alpha^* \leftarrow \arg\max_{\{\alpha_q\}} \bar{C}^*(I, \alpha_q)$ and obtain corresponding optimal power allocation $P^*$ and $\bar{C}^*$;
11:    Update the DNN memory by adding training data pair $(I, P^*)$;
12:    $g_{\theta_P} \leftarrow (\bar{C}^* - b) \nabla_{\theta_P} \log p_{\theta_P}(\dot{\alpha} \mid I, \dot{P})$, $L_v \leftarrow \| b - \bar{C}^* \|_2^2$;
13:    Update pointer network and critic network, $\theta_P \leftarrow \mathrm{ADAM}(\theta_P, g_{\theta_P})$, $\theta_V \leftarrow \mathrm{ADAM}(\theta_V, \nabla_{\theta_V} L_v)$;
14:    if $t \bmod \delta = 0$ then
15:        Sample a batch of training data set $\{(I, P^*)\}$ from DNN memory and update DNN network using ADAM algorithm;
16:    end if
17: end for

According to the corresponding quantization function, the basis channel allocation obtained from the Pointer Network, $\dot{\alpha}$, is quantized into $Q$ candidate channel allocations, where the set of $Q$ candidate channel allocations is denoted as $\alpha_Q \triangleq \{\alpha_q\}$. Here $\alpha_q$, $q \in \{1, \ldots, Q\}$, denotes the $q$-th candidate channel allocation, composed of the allocations of the $N$ sub-channels, which can be expressed as $\alpha_q \triangleq \{\alpha_{q,n}\}$, $n \in \{1, \ldots, N\}$, in which $\alpha_{q,n} \in \{1, \ldots, K\}$. The generation procedure of the set of $Q$ candidate channel allocations is as follows:

1) To obtain the first candidate channel allocation $\alpha_1$, we set the channel allocation basis $\dot{\alpha}$ as the first channel allocation, $\alpha_1 = \dot{\alpha}$;
2) As for the remaining $Q - 1$ candidate channel allocations $\alpha_q$, $q \in \{2, \ldots, Q\}$, in $\alpha_Q$, we sequentially generate $\alpha_q \triangleq \{\alpha_{q,n}\}$, $n \in \{1, \ldots, N\}$, as follows. $\alpha_q$ is generated based on $\alpha_{q-1}$ as $\alpha_{q,1} = [\alpha_{q-1,1} \cdot \alpha_{q-1,N} - (\alpha_{q-1,1} - \alpha_{q-1,N})/2] \bmod K$ and $\alpha_{q,n} = [\alpha_{q-1,n} \cdot \alpha_{q,n-1} - (\alpha_{q-1,n} - \alpha_{q,n-1} + n - 1)/2] \bmod K$, $n \in \{2, \ldots, N\}$.

As each satellite can be assigned to at most one sub-channel in (P1), the allocations of different sub-channels should be different, which requires that $\alpha_{q,n_1} \neq \alpha_{q,n_2}$, $n_1, n_2 \in \{1, \ldots, N\}$, $n_1 \neq n_2$. Thus, we add a checking mechanism to the generation of $\alpha_{q,n}$ in (P1). If the satellite allocated to sub-channel $n$ in candidate $q$, denoted as $\alpha_{q,n}$, has not yet been assigned, i.e., $\alpha_{q,n} \neq \alpha_{q,n'}$, $n' \in \{1, \ldots, n-1\}$, we add $\alpha_{q,n}$ to $\alpha_q$. However, if the satellite has been allocated to another sub-channel, we reallocate the sub-channel to the next satellite as $\alpha_{q,n} = (\alpha_{q,n} + 1) \bmod K$ until sub-channel $n$ is allocated to an unassigned satellite.

After obtaining the $Q$ candidate channel allocations, we obtain the optimal channel allocation $\alpha^*$ and the corresponding power allocation $P^*$ and reward $\bar{C}^*$ by comparing the sum weighted transmission data $\bar{C}^*(I, \alpha_q)$ under the different channel allocations $\alpha_q$. In addition to obtaining the optimal channel allocation $\alpha^*$ and power allocation $P^*$ as the solution to our problems, we need to update the framework accordingly. To update and train the DNN, we add the training data pairs $(I, P^*)$ to the DNN memory and update the parameters of the DNN every training interval using the ADAM algorithm. Specifically, we adopt an initially empty replay memory with limited capacity for the DNN. In each time frame, a new training data pair $(I, P^*)$ is added to the DNN memory. If the memory is full, the oldest training data pair is replaced by the newest one. We train the DNN based on the training data stored in the replay memory according to the experience replay technique [40]. During each training interval $\delta$, we randomly sample a batch of training data from the memory. Based on the sampled training data, the DNN is updated using the ADAM algorithm.

As for the Pointer Network, we also adopt the actor-critic network to reduce the variance of the gradients and thus improve the performance of the Pointer Network. Taking the Pointer Network as the actor network, we add a critic network that has the same input architecture as the Pointer Network but outputs the baseline $b$ instead. During the training process, we obtain the policy gradient of the Pointer Network and the loss of the critic

network based on the optimal output channel allocation $\alpha^*$, the reward signal $\bar{C}^*$ and the input $(I, \dot{P})$, similar to the Opt-Ptr algorithm. Based on the policy gradient and the loss, we update the Pointer Network and the corresponding critic network using the ADAM algorithm in each frame. During the execution phase after training, only the actor network is utilized, without any additional critic network. The whole procedure of the Opt-DNN-Ptr algorithm is organized in Algorithm 2.

Although each satellite needs to send basic information to the ground station for decision-making in each time slot, the time interval between time slots is approximately a few minutes, which is not frequent. Moreover, compared to the actual data transmission volume, the time and bandwidth consumed by transmitting this information are negligible.

VI. SIMULATION RESULTS AND DISCUSSION

A. Parameters Settings

In the simulation, we assume that the ground base station communicates with the satellites in the Ka band, and the carrier frequency of the sub-channels is in the range of (20, 30) GHz. The specific carrier frequencies of the sub-channels differ with the number of sub-channels $N$, while the spacing between the carrier frequencies of the sub-channels is essentially the same. The average gain of sub-channel $n$ connected to satellite $k$, denoted by $\bar{h}_{n,k}$, is modeled as

\[
\bar{h}_{n,k} = \frac{A_R A_{T,k} f_n^2}{c^2 l^2} \cdot A_{n,k},
\]

where the average distance between the satellites and the ground station $l$ is uniformly distributed in the range of (500, 2000) km according to the orbital altitude of LEO satellites, and $c$ is the speed of light, $c = 3 \times 10^8$ m/s. The link attenuation, denoted by $A_{n,k}$, and the product of the areas of the receiving and transmitting antennas $A_R A_{T,k}$ are randomly distributed in the ranges presented in Table I. For the average channel gain $\bar{h}_{n,k}$, we further consider the varying distance and communication angle between the satellites and the ground station during one time frame as $\phi$, which denotes the coefficient set of the change of channel gain influenced by the distance and angle. The time-varying channel gains of the $M$ mini-slots, $h_{n,k} = [h_{n,k,1}, h_{n,k,2}, \ldots, h_{n,k,M}]$, are modeled as $h_{n,k} = \bar{h}_{n,k} \phi_{n,k} + N_w$, where $\phi_{n,k}$ denotes the $M$ coefficients selected from $\phi$ and $N_w$ denotes the channel gain variation caused by other factors such as weather. In addition, the total power limit of the satellites $p_k^{tot}$ is obtained as $p_k^{tot} = \mu M p_k^{\max}$. The corresponding parameters follow the literature [41], [42] and are uniformly generated from the ranges specified in Table I. In the experiment, we consider the information freshness and the amount of downloaded data, assuming $w_m = \frac{\tau}{2}[2(M - m) + 1]$. As for the quantization parameter $Q$, we set the quantization number equal to the number of satellites, $Q = K$.

In the Opt-Ptr algorithm, we set the size of the embedding vector to 120 and the learning rate to 0.00025. Besides, we set 1 glimpse for the Pointer Network and 3 glimpses for the critic network. The parameter settings for Opt-DNN-Ptr are listed in Table II.

B. Convergence Analysis

In Fig. 3, we plot the training loss of the DNN in Opt-DNN-Ptr for P1 and P2. As shown in Fig. 3(a), the DNN training loss in case $N = 5$, $K = 7$ for P1 gradually decreases and stabilizes at around 0.14, with a fluctuation that is mainly due to the random sampling of training data. Meanwhile, as shown in Fig. 3(b), the training loss of the DNN in case $N = 7$, $K = 12$ for P2 converges within 1000 time frames and stabilizes at around 0.16.

C. Performance Comparison

Regarding the sum weighted transmission data performance, we also compare the proposed learning-based methods with the following benchmark schemes:

• Opt-Greedy: Each sub-channel is allocated to the satellite with the maximum average channel gain. For P1, a checking mechanism is added to guarantee that each satellite is allocated to at most one sub-channel. The power allocation is obtained by the corresponding power allocation optimization algorithm.
• Opt-RDS: The channel allocation is randomly selected from the set $S$ in each time frame, and the corresponding power allocation and sum weighted data are obtained using the power allocation algorithms in Section IV.
• Opt-DNN [30]: The Opt-DNN algorithm obtains the channel allocation by applying a fully connected DNN and the corresponding quantization function. The optimal power allocation of each candidate channel allocation can be obtained by solving (P3) or (P4) with the optimization algorithm in Section IV.

1) Simulation Results for Single Sub-Channel Case (P1): In Fig. 4, we evaluate the sum weighted transmission data performance of the proposed Opt-Ptr and Opt-DNN-Ptr algorithms. As shown in Fig. 4(a), the Opt-DNN-Ptr algorithm obtains a larger $\bar{C}^*$ than Opt-Ptr in most frames when there are 5 sub-channels and 7 satellites. In Fig. 4(b), it can be noted that the performance gap between Opt-Ptr and Opt-DNN-Ptr in case $N = 5$, $K = 12$ becomes smaller than in case $N = 5$, $K = 7$.

In Fig. 5, we investigate the average sum weighted transmission data performance of different schemes under different $K$. We can observe that the proposed Opt-DNN-Ptr algorithm achieves the best performance in all the cases. Opt-Ptr performs worse than Opt-DNN when $K = 7$ and better than Opt-DNN when $K > 7$. In addition, the performance gap between Opt-Ptr and Opt-DNN increases as $K$ increases. The performance of the other two benchmark algorithms, Opt-RDS and Opt-Greedy, increases more slowly than that of the proposed learning based algorithms and Opt-DNN, especially for Opt-RDS. Compared with the benchmark schemes Opt-RDS and Opt-Greedy, the two proposed learning-based methods and Opt-DNN have better performance. Among the three learning-based methods (i.e., Opt-DNN-Ptr, Opt-Ptr and Opt-DNN), the performance of Opt-DNN-Ptr and Opt-Ptr increases faster than that of Opt-DNN. In Opt-DNN, the DNN model is updated based on training data pairs sampled from memory, where the training data pairs are composed of the best data pairs among the $Q$ candidates in each frame. When $K$ is small, the proportion of the quantization number $Q$ to $S$ is sufficient to obtain an excellent solution. The excellent solutions in each frame constitute the training data pairs of the DNN, allowing

TABLE I
SIMULATION PARAMETERS
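The freshness weights and the total power limit used in the parameter settings are simple closed-form expressions, and a short sketch may make them concrete. It reads the weight as $w_m = \frac{\tau}{2}[2(M - m) + 1]$, which is an assumption about the flattened expression in the text, and computes the power limit as $p_k^{tot} = \mu M p_k^{\max}$. The function names and any numeric values passed to them are illustrative and are not taken from Table I.

```python
from typing import List


def freshness_weights(tau: float, M: int) -> List[float]:
    """Weights w_m for mini-slots m = 1..M, read as (tau / 2) * [2 * (M - m) + 1].

    Earlier mini-slots get larger weights, so data delivered sooner counts more.
    """
    return [tau / 2.0 * (2 * (M - m) + 1) for m in range(1, M + 1)]


def total_power_limit(mu: float, M: int, p_max: float) -> float:
    """Total power limit of one satellite, p_tot = mu * M * p_max."""
    return mu * M * p_max
```

Under this reading, $w_m$ equals the time remaining in the frame measured from the midpoint of mini-slot $m$ when each mini-slot lasts $\tau$, so the weights decrease linearly with $m$.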

TABLE II
HYPERPARAMETERS OF Opt-DNN-Ptr ALGORITHM
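Several of the Opt-DNN-Ptr hyperparameters (memory capacity, training interval $\delta$, batch size $B$) govern the replay memory that feeds the DNN. The sketch below shows one plausible way to realize the behaviour described for it: an initially empty memory with limited capacity, replacement of the oldest pair when full, and uniform random sampling of a batch. The class and method names are hypothetical, and the stored objects stand in for the pairs $(I, P^*)$.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity store of (input, optimal power allocation) training pairs."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen drops the oldest entry automatically when full.
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, frame_input, optimal_power):
        """Store the pair (I, P*) produced in the current time frame."""
        self._buffer.append((frame_input, optimal_power))

    def sample(self, batch_size):
        """Draw a random batch without replacement (smaller if memory is short)."""
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)
```

In a training loop, `add` would be called once per frame, and `sample(B)` only in frames where `t % delta == 0`, with the sampled batch then passed to the optimizer step of the DNN.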

Fig. 3. DNN training loss of Opt-DNN-Ptr.

Fig. 4. The sum weighted transmission data C̄ ∗ for Opt-Ptr and Opt-DNN-Ptr algorithms.

the DNN to further improve the solution quality. In other words, the DNN learns from high quality training data pairs in Opt-DNN when $K$ is small, thus Opt-DNN can obtain a better performance than Opt-Ptr when $K = 7$. However, as $K$ increases and $Q$ cannot increase accordingly, the quality of the training data pairs may be insufficient when $K$ is large, which leads to a slower performance increase than for Opt-Ptr. Compared to Opt-DNN, which learns from the experience data pairs, the RL based Opt-DNN-Ptr and Opt-Ptr algorithms perform better, especially when the number of satellites $K$ is large. More specifically, from the obtained results, the proposed Opt-DNN-Ptr algorithm always obtains the best sum weighted transmission data among all the schemes. This is because a DNN is added for predicting the power allocation on the basis of Opt-Ptr, which solves the problem of insufficient input, especially when $N$ and $K$ are large.

Fig. 5. The average sum weighted transmission data performance over different K.

In Fig. 6, we further study the effect of different $N$ on the performance. The more sub-channels can be connected to the satellites, the more data can be transferred back to the ground base station. Thus, we can observe that the sum weighted transmission data of all five algorithms increases with $N$. For every $N$, the proposed Opt-DNN-Ptr has the best performance. Similar to the performance under different $K$, Opt-DNN achieves better performance than Opt-Ptr when $N = 3$. However, the performance of Opt-Ptr increases faster than that of the Opt-DNN algorithm, and Opt-Ptr achieves better performance than Opt-DNN when $N > 3$. In Fig. 5, Opt-Ptr outperforms Opt-DNN, but in Fig. 6, their performance becomes comparable due to the absence of a mask mechanism in Opt-Ptr when the satellite can be allocated multiple channels. The Opt-DNN algorithm selects $Q$ candidate channel allocation strategies and then chooses the optimal strategy as output, which helps to ensure the performance of the algorithm to some extent.

Fig. 6. The average sum weighted transmission data performance over different N.

2) Simulation Results for Multiple Sub-Channel Case (P2): In the multiple sub-channels case (P2), we first compare the sum weighted transmission data performance of the proposed Opt-Ptr and Opt-DNN-Ptr algorithms in Fig. 7. As shown in Fig. 7, the Opt-DNN-Ptr approach has clearly better performance than Opt-Ptr in case $N = 7$, $K = 9$ and in case $N = 7$, $K = 21$. In addition, it is worth noting that the performance gap between Opt-Ptr and Opt-DNN-Ptr in P2 is much larger than in P1.

We further study the average sum weighted transmission data performance of different schemes under different $K$ in Fig. 8. We can find that the Opt-DNN-Ptr algorithm outperforms the other algorithms. In addition, the performance gap between the Opt-DNN-Ptr algorithm and the other algorithms is larger than in problem (P1). The performance of Opt-Ptr is worse than that of Opt-DNN. This is because, according to the framework of the Opt-Ptr algorithm demonstrated in Section V, the Pointer Network in this scenario does not apply the mask mechanism. In addition, as there are $K^N$ channel allocations in the allocation set $S$ in P2, which is much larger than in P1, the performance of the Pointer Network may deteriorate accordingly.

Although Opt-DNN has encouraging performance in this problem (P2), its performance grows more slowly than in problem (P1). This is because Opt-DNN applies the same quantization function as in problem (P1) but does not use the checking mechanism in this problem (P2). The quantization function may not be suitable for obtaining good candidate channel allocations based on the relaxed channel allocation in this case, which leads to the slower performance increase of the Opt-DNN algorithm. As for Opt-DNN-Ptr, its performance also increases more slowly than in (P1). This is because Opt-DNN-Ptr also applies the same quantization function as in problem (P1) but without the checking mechanism in this problem (P2). In addition, the mask mechanism is also not applied in the Pointer Network of the Opt-DNN-Ptr algorithm in this problem (P2). However, the Opt-DNN-Ptr algorithm can still obtain the best performance among all the algorithms. Moreover, despite the significant performance gap between the Opt-DNN-Ptr algorithm and the other algorithms, it consistently achieves remarkable performance across a wide range of $K$ in (P2). The larger action space slows down the performance improvement of Opt-RDS.

In Fig. 9, we investigate the effect of different $N$ on the performance. Similar to the results in P1, the sum weighted transmission data of all the algorithms increases with $N$. For every $N$, the proposed learning-based methods and Opt-DNN perform better than the remaining schemes, and Opt-DNN-Ptr has the best performance. Opt-DNN has better performance than Opt-Ptr in P2.

These observations demonstrate that the proposed learning-based algorithms can achieve encouraging performance. In P1

and P2, the proposed Opt-DNN-Ptr obtains the maximum sum weighted transmission data under different cases. In addition, comparing the simulation results under (P1) and (P2), the stability and superiority of Opt-DNN-Ptr are further demonstrated, as Opt-DNN-Ptr always has the best performance in the different cases. We also note that the performance of the proposed Opt-Ptr and Opt-DNN is highly dependent on the network size and the problem. The Opt-DNN algorithm performs better when the network is small, while Opt-Ptr is less affected by the network size. On the other hand, we can also clearly see that the Opt-Ptr algorithm with the mask mechanism is more stable and performs better than without the mask mechanism.

Fig. 7. The sum weighted transmission data $\bar{C}^{*}$ for Opt-Ptr and Opt-DNN-Ptr algorithms.

Fig. 8. The average sum weighted transmission data performance over different K.

Fig. 9. The average sum weighted transmission data performance over different N.

VII. CONCLUSION
In this paper, we jointly optimize the channel allocation of the fixed ground station and the transmission power allocation of satellites in an SCN. We mathematically model the downlink transmission and formulate the resource allocation as an MIP problem. To address this NP-hard problem, we develop two novel algorithms, i.e., Opt-Ptr and Opt-DNN-Ptr. Extensive simulations show that our learning-based algorithms achieve impressive performance compared to other benchmark methods.

APPENDIX
Here we introduce the method for obtaining the weight parameters of the objective function. Our goal, for joint sub-channel allocation and transmit power allocation, is to maximize the amount of downloaded data while minimizing data transmission delay. Specifically, combining the storage loss proposed in [43], we minimize the integral of the total amount of stored data across all K satellites within each frame. The integral of the amount of data of satellite k in one frame is formulated as

$$O_k = \int_{0}^{T} D_k(t)\,dt, \tag{15}$$

where $D_k(t)$ is the amount of data stored in satellite k at time instant t. Using a time discretization approach with slot length $\tau$, we can approximate the overall $O_k$ by the Trapezoid Method:

$$O_k = \int_{0}^{\tau_1} D_k(t)\,dt + \cdots + \int_{\tau_{M-1}}^{\tau_M} D_k(t)\,dt \approx \tau \sum_{m=1}^{M} \frac{D_{k,m-1} + D_{k,m}}{2} \triangleq \tilde{O}_k. \tag{16}$$

Here $D_{k,m}$ denotes the amount of data stored in satellite k at the end of slot m, where $1 \le m \le M$. $D_{k,0}$ represents the initial amount of data stored in satellite k at the beginning of the first slot. The difference in the amount of stored data within a time slot is equal to the amount of data transmitted back to the ground in that time slot. Thus, we have

$$D_{k,m-1} - D_{k,m} = \sum_{n=1}^{N} \alpha_{n,k} C_{n,k,m}, \quad \forall k, m, \tag{17}$$

where $C_{n,k,m}$ denotes the amount of data transmitted by satellite k through sub-channel n in slot m. Applying the equality (17) recursively gives $D_{k,m} = D_{k,0} - \sum_{j=1}^{m} \sum_{n=1}^{N} \alpha_{n,k} C_{n,k,j}$, so we can reformulate (16) as

$$\tilde{O}_k = \frac{\tau}{2} \left[ 2 M D_{k,0} - \sum_{n=1}^{N} \sum_{m=1}^{M} \alpha_{n,k} \bigl(2(M-m)+1\bigr) C_{n,k,m} \right]. \tag{18}$$

As $D_{k,0}$ is a constant, minimizing the integral of the amount of data of satellite k in one frame, $O_k$, is equivalent to maximizing $\sum_{n=1}^{N} \frac{\alpha_{n,k}\tau}{2} \left[ \sum_{m=1}^{M} \bigl(2(M-m)+1\bigr) C_{n,k,m} \right]$, representing the weighted sum of data from satellite k.
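The step from (16) and (17) to (18) can be checked numerically. The following is a minimal sketch in Python, not the authors' implementation: the function names, the toy values, and the treatment of $\alpha_{n,k}$ as a 0/1 allocation indicator for a single satellite k are assumptions made for illustration only. It evaluates the trapezoid sum in (16) slot by slot using the update (17), and compares it with the closed form (18).

```python
def trapezoid_storage(d0, alpha, c, tau):
    """Evaluate (16): tau * sum_m (D_{k,m-1} + D_{k,m}) / 2, updating D via (17).

    d0    -- initial stored data D_{k,0}
    alpha -- alpha[n]: allocation indicator of sub-channel n (assumed 0/1)
    c     -- c[n][m-1]: data sent on sub-channel n in slot m
    tau   -- slot length
    """
    n_channels, n_slots = len(c), len(c[0])
    total, d_prev = 0.0, d0
    for m in range(n_slots):
        sent = sum(alpha[n] * c[n][m] for n in range(n_channels))
        d_curr = d_prev - sent            # equality (17)
        total += (d_prev + d_curr) / 2.0  # one trapezoid
        d_prev = d_curr
    return tau * total


def slot_weights(n_slots):
    """Weights 2(M - m) + 1 for m = 1..M that multiply C_{n,k,m} in (18)."""
    return [2 * (n_slots - m) + 1 for m in range(1, n_slots + 1)]


def closed_form_storage(d0, alpha, c, tau):
    """Evaluate the closed form (18)."""
    n_channels, n_slots = len(c), len(c[0])
    w = slot_weights(n_slots)
    weighted = sum(
        alpha[n] * w[m] * c[n][m]
        for n in range(n_channels)
        for m in range(n_slots)
    )
    return tau / 2.0 * (2 * n_slots * d0 - weighted)


# Hypothetical toy instance: N = 3 sub-channels, M = 4 slots.
ALPHA = [1, 0, 1]
C = [[4, 3, 2, 1], [9, 9, 9, 9], [1, 1, 1, 1]]
print(trapezoid_storage(100.0, ALPHA, C, 0.5))    # 183.5
print(closed_form_storage(100.0, ALPHA, C, 0.5))  # 183.5
```

For M = 4 the weights are 7, 5, 3, 1, so data sent in earlier slots reduces the stored-data integral more than the same amount sent later. This is how the weighted objective rewards early transmission.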
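
REFERENCES

[1] O. Kodheli et al., “Satellite communications in the new space era: A survey and future challenges,” IEEE Commun. Surveys Tuts., vol. 23, no. 1, pp. 70–109, 1st Quart., 2021.
[2] Y. Su, Y. Liu, Y. Zhou, J. Yuan, H. Cao, and J. Shi, “Broadband LEO satellite communications: Architectures and key technologies,” IEEE Wireless Commun., vol. 26, no. 2, pp. 55–61, Apr. 2019.
[3] H. Zhou and H. Liu, “Development review of foreign emerging commercial LEO satellite communication constellations,” Telecommun. Eng., vol. 58, no. 9, pp. 1108–1114, 2018.
[4] J. Li, M. Li, and W. Li, “Satellite communication on the non-geostationary system and the geostationary system in the fixed-satellite service,” in Proc. 28th Wireless Opt. Commun. Conf. (WOCC), 2019, pp. 1–5.
[5] L. You, K.-X. Li, J. Wang, X. Gao, X.-G. Xia, and B. Ottersten, “Massive MIMO transmission for LEO satellite communications,” IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1851–1865, Aug. 2020.
[6] B. Di, L. Song, Y. Li, and H. V. Poor, “Ultra-dense LEO: Integration of satellite access networks into 5G and beyond,” IEEE Wireless Commun., vol. 26, no. 2, pp. 62–69, Apr. 2019.
[7] Z. Qu, G. Zhang, H. Cao, and J. Xie, “LEO satellite constellation for Internet of Things,” IEEE Access, vol. 5, pp. 18391–18401, 2017.
[8] D. Zhou, M. Sheng, Y. Wang, J. Li, and Z. Han, “Machine learning-based resource allocation in satellite networks supporting Internet of Remote Things,” IEEE Trans. Wireless Commun., vol. 20, no. 10, pp. 6606–6621, Oct. 2021.
[9] X. Jia, T. Lv, F. He, and H. Huang, “Collaborative data downloading by using inter-satellite links in LEO satellite networks,” IEEE Trans. Wireless Commun., vol. 16, no. 3, pp. 1523–1532, Mar. 2017.
[10] D. Zhou, M. Sheng, R. Liu, Y. Wang, and J. Li, “Channel-aware mission scheduling in broadband data relay satellite networks,” IEEE J. Sel. Areas Commun., vol. 36, no. 5, pp. 1052–1064, May 2018.
[11] B. Di, H. Zhang, L. Song, Y. Li, and G. Y. Li, “Ultra-dense LEO: Integrating terrestrial-satellite networks into 5G and beyond for data offloading,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 47–62, Jan. 2019.
[12] R. Liu, M. Sheng, K.-S. Lui, X. Wang, Y. Wang, and D. Zhou, “An analytical framework for resource-limited small satellite networks,” IEEE Commun. Lett., vol. 20, no. 2, pp. 388–391, Feb. 2016.
[13] S. Zhang, G. Cui, and W. Wang, “Joint data downloading and resource management for small satellite cluster networks,” IEEE Trans. Veh. Technol., vol. 71, no. 1, pp. 887–901, Jan. 2022.
[14] H. Yao, L. Wang, X. Wang, Z. Lu, and Y. Liu, “The space-terrestrial integrated network: An overview,” IEEE Commun. Mag., vol. 56, no. 9, pp. 178–185, Sep. 2018.
[15] S. Aboagye, T. M. N. Ngatched, and O. A. Dobre, “Subchannel and power allocation in downlink VLC under different system configurations,” IEEE Trans. Wireless Commun., vol. 21, no. 5, pp. 3179–3191, May 2022.
[16] Y. Qiu, H. Zhang, K. Long, and M. Guizani, “Subchannel assignment and power allocation for time-varying fog radio access network with NOMA,” IEEE Trans. Wireless Commun., vol. 20, no. 6, pp. 3685–3697, Jun. 2021.
[17] A. A. Khan and R. S. Adve, “Centralized and distributed deep reinforcement learning methods for downlink sum-rate optimization,” IEEE Trans. Wireless Commun., vol. 19, no. 12, pp. 8410–8426, Dec. 2020.
[18] O. Vinyals, M. Fortunato, and N. Jaitly, “Pointer networks,” in Proc. Adv. Neural Inf. Process. Syst., 2015, pp. 1–9.
[19] A. Xiao, Z. Chen, S. Wu, S. Jin, and L. Ma, “Collaborative long-short term bandwidth allocation for satellite-terrestrial networks,” IEEE Commun. Lett., vol. 26, no. 5, pp. 1121–1125, May 2022.
[20] J. Liu, B. Zhao, Q. Xin, and H. Liu, “Dynamic channel allocation for satellite Internet of Things via deep reinforcement learning,” in Proc. IEEE Int. Conf. Inf. Netw. (ICOIN), 2020, pp. 465–470.
[21] N. Dai, D. Zhou, M. Sheng, and J. Li, “Deep reinforcement learning based power allocation for high throughput satellites,” in Proc. IEEE 94th Veh. Technol. Conf. (VTC-Fall), 2021, pp. 1–5.
[22] H. Tsuchida et al., “Efficient power control for satellite-borne batteries using Q-learning in low-earth-orbit satellite constellations,” IEEE Wireless Commun. Lett., vol. 9, no. 6, pp. 809–812, Jun. 2020.
[23] R. Chen, X. Hu, X. Li, and W. Wang, “Optimum power allocation based on traffic matching service for multi-beam satellite system,” in Proc. 5th Int. Conf. Comput. Commun. Syst. (ICCCS), 2020, pp. 655–659.
[24] A. Paris, I. Del Portillo, B. Cameron, and E. Crawley, “A genetic algorithm for joint power and bandwidth allocation in multibeam satellite systems,” in Proc. IEEE Aerosp. Conf., 2019, pp. 1–15.
[25] N. Pachler, J. J. G. Luis, M. Guerster, E. Crawley, and B. Cameron, “Allocating power and bandwidth in multibeam satellite systems using particle swarm optimization,” in Proc. IEEE Aerosp. Conf., 2020, pp. 1–11.
[26] F. G. Ortiz-Gomez, D. Tarchi, R. Martínez, A. Vanelli-Coralli, M. A. Salas-Natera, and S. Landeros-Ayala, “Cooperative multi-agent deep reinforcement learning for resource management in full flexible VHTS systems,” IEEE Trans. Cogn. Commun. Netw., vol. 8, no. 1, pp. 335–349, Mar. 2022.
[27] G. Cui, X. Li, L. Xu, and W. Wang, “Latency and energy optimization for MEC enhanced SAT-IoT networks,” IEEE Access, vol. 8, pp. 55915–55926, 2020.
[28] Y.-C. Wu, T. Q. Dinh, Y. Fu, C. Lin, and T. Q. S. Quek, “A hybrid DQN and optimization approach for strategy and resource allocation in MEC networks,” IEEE Trans. Wireless Commun., vol. 20, no. 7, pp. 4282–4295, Jul. 2021.
[29] S. Chen, J. Chen, Y. Miao, Q. Wang, and C. Zhao, “Deep reinforcement learning-based cloud-edge collaborative mobile computation offloading in industrial networks,” IEEE Trans. Signal Inf. Process. Netw., vol. 8, pp. 364–375, 2022. [Online]. Available: https://ieeexplore.ieee.org/document/9776583
[30] L. Huang, S. Bi, and Y.-J. A. Zhang, “Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks,” IEEE Trans. Mobile Comput., vol. 19, no. 11, pp. 2581–2593, Nov. 2020.
[31] B. V. R. Gorantla and N. B. Mehta, “Resource and computationally efficient subchannel allocation for D2D in multi-cell scenarios with partial and asymmetric CSI,” IEEE Trans. Wireless Commun., vol. 18, no. 12, pp. 5806–5817, Dec. 2019.
[32] Z. Wang, J. Li, Y. Wang, Z. Su, S. Yu, and W. Meng, “Optimal repair strategy against advanced persistent threats under time-varying networks,” IEEE Trans. Inf. Forensics Security, vol. 18, pp. 5964–5979, 2023.
[33] Y. Wang, Z. Su, T. H. Luan, J. Li, Q. Xu, and R. Li, “SEAL: A strategy-proof and privacy-preserving UAV computation offloading framework,” IEEE Trans. Inf. Forensics Security, vol. 18, pp. 5213–5228, 2023.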


[34] H. Liao, Z. Zhou, X. Zhao, and Y. Wang, “Learning-based queue-aware task offloading and resource allocation for space–air–ground-integrated power IoT,” IEEE Internet Things J., vol. 8, no. 7, pp. 5250–5263, Apr. 2021.
[35] Z. Li et al., “Energy efficient resource allocation for UAV-assisted space-air-ground Internet of Remote Things networks,” IEEE Access, vol. 7, pp. 145348–145362, 2019.
[36] W. Lyu, B. Cong, J. Han, and J. Xu, “OFDM and self-coherent detection based satellite-to-ground communication system,” in Proc. 15th Int. Conf. Opt. Commun. Netw. (ICOCN), 2016, pp. 1–3.
[37] J. M. Tang, P. M. Lane, and K. A. Shore, “High-speed transmission of adaptively modulated optical OFDM signals over multimode fibers using directly modulated DFBs,” J. Lightw. Technol., vol. 24, no. 1, pp. 429–441, Jan. 2006.
[38] C. Loo, “Impairment of digital transmission through a Ka band satellite channel due to weather conditions,” Int. J. Satell. Commun., vol. 16, no. 3, pp. 137–145, 1998.
[39] H. Zhang, Q. Li, Y. Zhang, and X. Li, “Game theory based power allocation method for inter-satellite links in LEO/MEO two-layered satellite networks,” in Proc. IEEE/CIC Int. Conf. Commun. China (ICCC), 2021, pp. 398–403.
[40] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[41] M. Chen, R. Chai, and Q. Chen, “Joint route selection and resource allocation algorithm for data relay satellite systems based on energy efficiency optimization,” in Proc. IEEE 11th Int. Conf. Wireless Commun. Signal Process. (WCSP), 2019, pp. 1–6.
[42] A. Wang, L. Lei, X. Hu, E. Lagunas, A. I. Pérez-Neira, and S. Chatzinotas, “Adaptive beam pattern selection and resource allocation for NOMA-based LEO satellite systems,” in Proc. IEEE Glob. Commun. Conf. (GLOBECOM), 2022, pp. 674–679.
[43] A. Sadeghi, F. Sheikholeslami, A. G. Marques, and G. B. Giannakis, “Reinforcement learning for adaptive caching with dynamic storage pricing,” IEEE J. Sel. Areas Commun., vol. 37, no. 10, pp. 2267–2281, Oct. 2019.

Xiufeng Sui received the Ph.D. degree in computer science from the School of Computer Science, University of Science and Technology of China, Anhui, China. He is currently an Associate Professor with the School of Information and Electronics, Beijing Institute of Technology. His main research interests include computer architecture, operating systems, strategies of engineering science and technology, and management of technological innovation.

Ziqi Jiang received the B.E. and M.Eng. degrees from the School of Information and Electronics, Beijing Institute of Technology. Her research focuses on satellite communication, convex optimization, and deep learning.

Yifeng Lyu received the B.E. degree from the University of Electronic Science and Technology of China, Chengdu, China, in 2015, and the M.Eng. degree from the University of Toronto, Toronto, Canada, in 2017. He is currently pursuing the Ph.D. degree with the School of Information and Electronics, Beijing Institute of Technology. His research interests include satellite communication, satellite networks, and reinforcement learning.

Rongfei Fan (Member, IEEE) received the B.E. degree in communication engineering from the Harbin Institute of Technology, Harbin, China, in 2007, and the Ph.D. degree in electrical engineering from the University of Alberta, Edmonton, Alberta, Canada, in 2012. Since 2013, he has been a Faculty Member with the Beijing Institute of Technology, Beijing, China, where he is currently an Associate Professor with the School of Cyberspace Science and Technology. His research interests include edge computing, federated learning, resource allocation in wireless networks, and statistical signal processing.

Han Hu (Member, IEEE) received the B.E. and Ph.D. degrees from the University of Science and Technology of China, China, in 2007 and 2012, respectively. He is currently a Professor with the School of Information and Electronics, Beijing Institute of Technology, China. His research interests include multimedia networking, edge intelligence, and space-air-ground integrated network. He received several academic awards, including the Best Paper Award of IEEE TCSVT 2019, the Best Paper Award of IEEE Multimedia Magazine 2015, and the Best Paper Award of IEEE Globecom 2013. He served as an Associate Editor for IEEE TRANSACTIONS ON MULTIMEDIA and Ad Hoc Networks, and a TPC Member of Infocom, ACM MM, AAAI, and IJCAI.

Zhi Liu (Senior Member, IEEE) received the B.E. degree from the University of Science and Technology of China, China, and the Ph.D. degree in informatics from the National Institute of Informatics. He is currently an Associate Professor with The University of Electro-Communications, Japan. His research interest includes video network transmission, vehicular networks, and mobile edge computing. He was a recipient of the IEEE StreamComm 2011 Best Student Paper Award, the 2015 IEICE Young Researcher Award, and the ICOIN 2018 Best Paper Award. He is currently an Editorial Board Member of Wireless Networks (Springer) and IEEE OPEN JOURNAL OF THE COMPUTER SOCIETY.
