Response to the Reviewers
Journal: IEEE Transactions on Consumer Electronics
Manuscript ID: TCE-2024-04-1017
Paper Title: Cloud-Edge Collaborative Scalable Tucker-Based Tensor Computations
for Ubiquitous Consumer Electronics Data
Authors: Huazhong Liu, Weiyuan Zhang, Ren Li, Yunfan Zhang, Jihong Ding,
Guangshun Zhang, Hanning Zhang, Laurence T. Yang
First of all, we would like to thank the reviewers for the valuable time they spent reviewing the paper, and we deeply appreciate the constructive comments and helpful suggestions from all reviewers. We also thank the associate editor for handling the paper and giving us the chance to revise it. Following the reviewers' suggestions, we have addressed all review comments. The detailed responses and descriptions of the revisions are presented below.
Response to the Comments of Reviewer 1
Comment 1.3: The paper uses many different expressions about computational
paradigms, such as "parallel execution," "scalable scheme," and "distributed manner."
What are their differences?
Response 1.3: Thanks for your constructive comment.
In summary, "parallel execution" means performing multiple computations or processes simultaneously; its main idea is to divide a large problem into smaller ones and solve them concurrently. "Distributed manner" emphasizes that data are stored and tasks are executed on worker nodes at different locations, which collaborate through network communication to complete the tasks. "Scalable scheme" focuses more on whether the system or method can effectively handle a growing workload when resources such as computing power, storage space, and bandwidth are increased. A good scalable scheme should include both parallel execution and distributed methods, as both are means of achieving system scalability.
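To make the distinction concrete, below is a minimal Python sketch of the "parallel execution" idea; the array, the eight-way split, and the sum reduction are our own illustrative assumptions, not material from the paper.

```python
# Illustrative only: divide a large problem into smaller ones,
# solve them concurrently, then combine the partial results.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

if __name__ == "__main__":
    data = np.arange(10_000_000)
    parts = np.array_split(data, 8)            # divide the problem
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(np.sum, parts))   # solve concurrently, reduce
    assert total == data.sum()
```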
Comment 1.4: Is there only one way to achieve intra-core parallelism as shown in Fig. 7? This seems different from the intra-core parallelism shown in Fig. 5.
Response 1.4: Thanks for your constructive comments.
There is more than one intra-core parallel scheme; Figure 7 only illustrates the idea of intra-core parallelism. Specifically, Figure 7 shows intra-core parallel partitioning when computing between core tensors and vectors. Intra-core parallel partitioning between the core tensor and the factor matrices can be performed in the same way, while Figure 5 reflects intra-core parallel partitioning when performing operations between different core tensors.
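As an illustration of the core-tensor-by-vector case, the following numpy sketch splits the core tensor and the vector along the contracted mode and sums the independent partial products. It is a minimal sketch of the partition-and-reduce idea, not the exact scheme in the paper; the thread pool and the block count are our assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def mode_n_vector_product(G, v, n, num_blocks=4):
    # Split the core tensor G and the vector v along mode n; each block's
    # contraction is independent, so the map step can run on separate
    # workers, and the reduce step sums the partial results.
    g_blocks = np.array_split(G, num_blocks, axis=n)
    v_blocks = np.array_split(v, num_blocks)
    with ThreadPoolExecutor() as pool:
        partials = pool.map(
            lambda pair: np.tensordot(pair[0], pair[1], axes=([n], [0])),
            zip(g_blocks, v_blocks))
    return sum(partials)
```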
Comment 1.6: What is the reason for choosing different parameters, such as orders, dimensionalities, ranks, etc., for different tensor operations during the experimental design? The authors need to provide a detailed explanation.
Response 1.6: Thank you for your insightful comment.
The parameters for each tensor operation were chosen according to its computational characteristics. Since the experimental metric is execution time, we analyzed the time complexity of each operation to decide which parameters to vary. According to the complexity analysis in Section VI, the time complexity of the Tucker mode-n product is O(I^2 r), so changing the dimension I of the original tensor significantly affects the amount of data in the factor matrix. Therefore, we varied the dimensionality of the original tensor in these experiments to better understand its impact. For the same reason, the time complexity of the Hadamard product and the inner product is O((r_n)^{2N}); thus, we varied the rank r of the core tensor to obtain the experimental results for the Hadamard product and the inner product.
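To make the parameter choice concrete, the sketch below evaluates the two complexity terms above for a few illustrative values (the specific numbers are our assumptions, not the paper's experimental settings): the mode-n product cost is dominated by I, while the Hadamard/inner-product cost is dominated by r.

```python
# Illustrative operation counts from the complexity terms discussed above.
N, r = 3, 10                               # assumed values, not the paper's
for I in (200, 400, 800):                  # mode-n product: O(I^2 * r)
    print(f"mode-n product, I={I}: ~{I**2 * r:,} ops")
for r_ in (5, 10, 20):                     # inner product: O(r^(2N))
    print(f"inner product,  r={r_}: ~{r_**(2 * N):,} ops")
```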
Comment 1.8: What is the precision in the relevant experiments of Tucker rounding? The paper did not explain this, which is really confusing.
Response 1.8: Thanks for your helpful comment.
Precision here refers to the prescribed accuracy used when executing the Tucker decomposition in the Tucker rounding method. For the definition of accuracy and further explanation, please refer to Ref. [32]. In the revised version, we have rewritten the sentence as follows: "...using different levels of precision [32] and compare the dimensions of the core tensor ..."
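For intuition, a common way to turn a prescribed accuracy eps into truncation ranks is to keep, per mode, the smallest rank whose discarded singular-value tail stays below eps times the norm. The sketch below shows this rule under our own assumptions; the exact criterion used in Ref. [32] may differ.

```python
import numpy as np

def rank_for_precision(s, eps):
    # s: singular values of a mode-n unfolding, in descending order.
    # Keep the smallest rank whose discarded tail satisfies
    # ||s[rank:]||_2 <= eps * ||s||_2 (assumed truncation rule).
    tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]   # tail[k] = ||s[k:]||_2
    ok = np.nonzero(tail <= eps * np.linalg.norm(s))[0]
    return int(ok[0]) if ok.size else len(s)
```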
Response to the Comments of Reviewer 2
Comment 2.2: In Section III, the Mode-n Kronecker product is mentioned two times, but there is still a lack of explanation for this operation.
Response 2.2: Thanks for your valuable comment.
The Mode-n Kronecker product is a widely used operation: it performs the Kronecker product only along the nth mode (order) of another tensor. We have provided detailed explanations in the revised version.
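For illustration, here is a minimal numpy sketch of one plausible reading of this definition, in which every mode-n fiber of the result is the Kronecker product of the corresponding fibers of the two inputs; the function name and the requirement that all other modes match are our assumptions.

```python
import numpy as np

def mode_n_kron(A, B, n):
    # Kronecker product along mode n only; all other modes of A and B
    # are assumed to match. Result size along mode n is I_n * J_n.
    A = np.moveaxis(A, n, -1)                 # bring mode n to the end
    B = np.moveaxis(B, n, -1)
    C = A[..., :, None] * B[..., None, :]     # outer product of fibers
    C = C.reshape(*A.shape[:-1], -1)          # flatten to length I_n * J_n
    return np.moveaxis(C, -1, n)
```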
Comment 2.3: The paper divides tensor operations into three categories, but the reasons for doing so are explained too briefly. I think the authors should provide additional explanations.
Response 2.3: Thanks for your comment.
We divided tensor operations into three categories based on their functions.
1. Operations that extract values from a single tensor are classified as basic operations, mainly including extracting values, extracting fibers, extracting slices, and extracting subtensors.
2. Operations that cause numerical changes within a tensor or compute a certain value about a tensor are classified as mathematical operations, mainly including the Hadamard product, addition, subtraction, the inner product, and the Frobenius norm.
3. Contraction operations are special operations that change the order and rank of the tensor, mainly including the Tucker mode-n product and multilinear contraction.
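As a concrete example of the first category, the basic extraction operations map directly onto array indexing; the small tensor below is our own illustration, not data from the paper.

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)   # a small third-order tensor
value = X[1, 2, 3]                   # extract a single value
fiber = X[1, :, 3]                   # a fiber: fix all indices but one
slc = X[1, :, :]                     # a slice: fix all indices but two
sub = X[0:2, 1:3, 0:2]               # a subtensor: index ranges per mode
```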
Comment 2.5: This paper points out that the time complexity of the scalable tensor inner product operation is O(r^{2N}), while performing the same operation on the original tensor is only O(I^N); how do you explain the advantages of the scalable computing?
Response 2.5: Thanks for your constructive comment.
Although the exponent of the inner product time complexity based on the Tucker decomposition is 2N, r is the rank of the core tensor, which is much smaller than the dimension I of the original tensor. In particular, the Tucker-based approach wins whenever r^2 < I, since r^{2N} = (r^2)^N < I^N, and r << I typically holds in practice. Therefore, in general, the scalable tensor inner product operation still has a computational efficiency advantage. From the above analysis, it can be inferred that the scalable approach also has advantages in terms of space complexity.
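A quick numeric check of this argument, with illustrative values of our own choosing (not the paper's experimental settings):

```python
# Multiplication counts for the inner product: original vs. Tucker-based.
N, I, r = 3, 1000, 10          # assumed values; r << I
print(I ** N)                  # 1,000,000,000 on the original tensor
print(r ** (2 * N))            # 1,000,000 on the Tucker factors
```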
Comment 2.7: In Fig. 10, when the number of working nodes is 1, I believe that the serial and parallel execution times should be exactly the same. Why do your experimental results appear to show better parallel performance instead? In fact, setting the node count to 1 is meaningless for comparing serial and parallel execution.
Response 2.7: Thanks for your insightful comment.
When there is a single node, both serial and parallel computations are performed on that node, so there is no difference between the two and the execution times should be the same; it is therefore unnecessary to show this case in the experimental results. The small difference in the original figure was measurement noise from averaging over multiple repeated runs. In the improved version, we have removed the experiment with one node on the real-world dataset.
Comment 2.8: Long sentences and typing errors should be removed to improve the readability of this paper.
Response 2.8: Done. Thanks for your comment.
In the revised version, we have carefully polished the paper several times and corrected grammatical mistakes and flawed sentences. Besides, we have also asked a native speaker to proofread the paper again.
Response to the Comments of Reviewer 3
Comment 3.1: This paper is of great significance in addressing the computational efficiency of handling consumer electronics data. By introducing the calculation rules of Tucker-based tensor operations and their scalable computing architecture, this paper has improved computational efficiency and provided a feasible solution for data processing in cloud-edge collaborative environments. However, I have the following comments for improving your paper.
Response 3.1: Thanks for your positive comments.
Comment 3.7: This paper mentions cloud-edge collaboration frequently; how is it reflected in the experiments?
Response 3.7: Thank you for your comment.
We use EdgeCloudSim to simulate the edge computing environment. This tool is specially designed for edge computing scenarios: EdgeCloudSim builds on CloudSim to address the specific needs of edge computing research and supports the necessary computing and networking capabilities. EdgeCloudSim provides a modular architecture in which each module addresses a specific aspect of edge computing and clearly defines its interfaces with the other modules. The system consists of five main modules, namely: core simulation, network, load generator, mobility, and edge orchestrator. The overall architecture is shown in the figure below. Each module has configuration files whose simulation parameters can be adjusted to realize different cloud-edge computing environments. In the experiments, the network module and the edge orchestrator module were mainly used to calculate the delay in the task allocation process.
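To indicate what calculating the task-allocation delay involves, here is a hypothetical Python sketch (not the EdgeCloudSim API) of a simple delay model: transmission time over the WLAN hop to an edge node, plus an additional WAN hop when the task is offloaded to the cloud. All bandwidths, propagation delays, and the task size are assumptions for illustration.

```python
# Hypothetical delay model, not EdgeCloudSim code.
def transfer_delay(task_bits, bandwidth_bps, propagation_s):
    # transmission time plus propagation for one network hop
    return task_bits / bandwidth_bps + propagation_s

def allocation_delay(task_bits, on_edge):
    wlan = transfer_delay(task_bits, 100e6, 0.002)   # device -> edge node
    if on_edge:
        return wlan
    wan = transfer_delay(task_bits, 20e6, 0.05)      # edge -> cloud
    return wlan + wan

print(allocation_delay(8e6, on_edge=True))    # ~0.082 s, served at the edge
print(allocation_delay(8e6, on_edge=False))   # ~0.532 s, offloaded to cloud
```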