Wireless Networks

Distributed Optimization in Networked Systems: Algorithms and Applications
Wireless Networks
Series Editor
Xuemin Sherman Shen, University of Waterloo, Waterloo, ON, Canada
The purpose of Springer’s Wireless Networks book series is to establish the state
of the art and set the course for future research and development in wireless
communication networks. The scope of this series includes not only all aspects
of wireless networks (including cellular networks, WiFi, sensor networks, and
vehicular networks), but related areas such as cloud computing and big data.
The series serves as a central source of references for wireless networks research
and development. It aims to publish thorough and cohesive overviews on specific
topics in wireless networks, as well as works that are larger in scope than survey
articles and that contain more detailed background information. The series also
provides coverage of advanced and timely topics worthy of monographs, contributed
volumes, textbooks and handbooks.
** Indexing: Wireless Networks is indexed in EBSCO databases and DBLP **
Qingguo Lü • Xiaofeng Liao • Huaqing Li • Shaojiang Deng • Shanfu Gao

Distributed Optimization in Networked Systems
Algorithms and Applications

Qingguo Lü
College of Computer Science, Chongqing University, Chongqing, China

Xiaofeng Liao
College of Computer Science, Chongqing University, Chongqing, China

Shanfu Gao
College of Computer Science, Chongqing University, Chongqing, China
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore
Pte Ltd. 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
To My Family
Q. Lü
To my family
X. Liao
To my family
H. Li
To my family
S. Deng
To my family
S. Gao
Preface
In recent years, the Internet of Things (IoT) and big data have been interconnected
to a wide and deep extent through the sensing, computing, communication, and
control of intelligent information. Networked systems are playing an increasingly
important role in the interconnected information environment, profoundly affecting
computer science, artificial intelligence, and other related fields. The core of such systems, which are composed of many nodes, is to accomplish global goals efficiently through mutual collaboration while each node makes its own decisions based on different preferences. In this way, they can solve large-scale complex problems that are difficult for individual nodes to handle, with strong resistance to interference and strong environmental adaptability. In addition, such systems require participating nodes to access only
their own local information. This may be due to the consideration of security and
privacy issues in the network, or simply because the network is too large, making
the aggregation of global information to a central node practically impossible or
very inefficient. Currently, as a hot research topic with wide applicability and great
application value across multiple disciplines, distributed optimization of networked
systems has laid an important foundation for promoting and leading the frontier
development in computer science and artificial intelligence. However, networked
systems cover a large number of intelligent devices (nodes), and the network
environment is often dynamic and changing, making it extremely hard to optimize
and analyze them. It is problematic for existing theories and methods to effectively
address the new needs and challenges of optimization brought about by the rapid
development of technologies related to networked systems. Hence, it is urgent to
develop new theories and methods of distributed optimization over networks.
This monograph thoroughly studies the analysis and synthesis of distributed unconstrained optimization, distributed constrained optimization, distributed nonsmooth optimization, distributed online optimization, and distributed economic dispatch in smart grids, over undirected, directed, and time-varying networks, covering consensus control protocols, the gradient tracking technique, event-triggered communication strategies, Nesterov and heavy-ball accelerated mechanisms, the variance-reduction technique, differential privacy strategies, gradient descent algorithms, accelerated algorithms, stochastic gradient algorithms, and online algorithms.
This book was supported in part by the Natural Science Foundation of Chongqing
under Grant CSTB2022NSCQ-MSX1627, in part by the Chongqing Postdoctoral
Science Foundation under Grant 2021XM1006, in part by the China Postdoctoral
Science Foundation under Grant 2021M700588, in part by the National Natural
Science Foundation of China under Grant 62173278, in part by the Science and
Technology Research Program of Chongqing Municipal Education Commission
under Grant KJQN202100228, in part by the project of Key Laboratory of Industrial
Internet of Things & Networked Control, Ministry of Education under Grant
2021FF09, in part by the project funded by Hubei Province Key Laboratory
of Intelligent Information Processing and Real-time Industrial System (Wuhan
University of Science and Technology) under Grant ZNXX2022004, in part by
the project funded by Hubei Key Laboratory of Intelligent Robot (Wuhan Institute
of Technology) under Grant HBIR202205, and in part by the National Key R&D
Program of China under Grant 2018AAA0100101. We would like to begin by acknowledging Yingjue Chen
and Keke Zhang who have unselfishly given their valuable time in arranging raw
materials. Their assistance has been invaluable to the completion of this book. The
authors are especially grateful to their families for their encouragement and never-ending support when it was most required. Finally, we would like to thank the editors
at Springer for their professional and efficient handling of this book.
Chapter 1
Accelerated Algorithms for Distributed
Convex Optimization
1.1 Introduction
In the past decades, with the development of artificial intelligence and the emergence of 5G, many researchers have become interested in distributed optimization. This chapter considers a class of widely studied distributed optimization problems in which each node cooperatively attempts to optimize a global cost function
in the context of local interactions and local computations [1]. Instances of such
formulation characterized by distributed computing have several important and
widespread applications in various fields, including wireless sensor networks for
decision-making and information-processing [2], distributed resource allocation
in smart grids [3], distributed learning in robust control [4], and time-varying
formation control [5, 6], among many others [7–13]. Unlike traditional centralized
optimization, distributed optimization involves multiple nodes that gain access to
their private local information over networks, and typically no central coordinator
(node) can acquire the entire information over the networks.
Recently, an increasing number of distributed algorithms have emerged, built on various local computation schemes for individual nodes. Some well-known approaches for different networks depend on distributed (sub)gradient descent, with extensions that handle interaction delays, asynchronous updates, stochastic (sub)gradient scenarios, etc. [14–22]. These algorithms are intuitive and flexible with respect to the cost functions and networks; however, their convergence rates are quite slow owing to the diminishing step-size that is required to guarantee convergence to an exact optimal solution [14]. The convergence rate of these algorithms, even for strongly convex functions, is only sublinear [15]. With a constant step-size, an algorithm can attain a linear rate, but only at the cost of converging to a suboptimal solution [20]. Methods that resolve this exactness-speed dilemma, such as the distributed alternating direction method of multipliers (ADMM) [23, 24] and distributed dual decomposition [25], are based on the Lagrangian dual and have nice provable convergence rates (a linear rate for strongly convex functions) [26]. In addition, extensions to various real-world factors, including stochastic errors [27] and privacy preservation [28], and techniques including the proximal (sub)gradient [29] and formation-containment control [30], have been extensively studied. However, because a sub-problem must be solved at each iteration, the computational complexity of these methods is considerably high. To overcome these difficulties effectively, quite a few approaches have been proposed that achieve linear convergence for smooth and strongly convex cost functions [31–38]. Nonetheless, these approaches [31–38] are only suitable for undirected networks.
Distributed optimization over directed networks was first studied in [39], where the (sub)gradient-push (SP) method was employed to eliminate the requirement of network balancing, i.e., by using column-stochastic weights. Since SP is built on (sub)gradient descent with diminishing step-size, it also suffers from a slow sublinear convergence rate. To accelerate convergence, Xi and Khan [40] proposed a linearly convergent distributed method (DEXTRA) with constant step-size by combining the push-sum strategy with the protocol (EXTRA) in [31]. Further, Xi et al. [41] (fixed directed networks) and Nedic et al. [42] (time-varying directed networks) combined the push-sum strategy with distributed inexact gradient tracking under constant step-size (ADD-OPT [41] and Push-DIGing [42]) to achieve linear convergence to the exact optimal solution. Then, Lü et al. [43, 44] extended the work of [42] to non-uniform step-sizes and showed linear convergence. A different class of approaches that do not utilize the push-sum mechanism has recently been proposed in [45–50], where both row- and column-stochastic weights are adopted simultaneously to achieve linear convergence over directed networks. It is noteworthy that although these approaches [39–50] avoid the construction of doubly-stochastic weights, they require each node to know (at least) its own out-degree exactly. Only then can the nodes in the networks of [39–50] adjust their outgoing weights to ensure that each column of the weight matrix sums to one. This requirement, however, is likely to be unrealistic in broadcast-based interaction schemes (i.e., where a node neither accesses its out-neighbors nor regulates its outgoing weights).
In this chapter, the algorithm that we construct depends crucially on gradient tracking and is a variant of the methods that appeared in [47–55]. To be specific, Qu and Li [54] combined gradient tracking with the distributed Nesterov gradient descent (DNGD) method [55] and thereby investigated two accelerated distributed Nesterov methods, Acc-DNGD-SC and Acc-DNGD-NSC, which exhibit fast convergence compared with the centralized gradient descent (CGD) method for different cost functions. Although the convergence rates are improved, the two approaches in [54] assume that the interaction networks are undirected, which limits the applicability of the methods in many fields, such as wireless sensor networks. To remove this deficiency, Xin et al. [48] established an acceleration and generalization of first-order methods with gradient tracking and a momentum term, i.e., ABm, which overcame the conservatism (eigenvector estimation or doubly-stochastic weights) of the related work by implementing both row- and column-stochastic weights. In this setting, some interesting generalized methods for random link failures [46] and interaction delays [49] were proposed. Regrettably, the construction of column-stochastic weights demands that each node possess at least its out-degree information, which is arduous to implement, for example, in broadcast-based interaction scenarios. In light of this challenge, Xin et al. [52] investigated the case of a row-stochastic weight matrix, which avoids the need for global information about the network, and proposed a fast distributed method (FROST) with non-uniform step-sizes motivated by the idea of [51]. Related works also involve the issues of demand response and economic scheduling in power systems [53, 56]. However, the methods in [51–53, 56] do not adopt momentum terms [54, 55, 57], through which nodes acquire more information from in-neighbors in the network for fast convergence. Moreover, two accelerated methods based on Nesterov's momentum for distributed optimization over arbitrary networks were presented in [50]. Unfortunately, the related work [50] neither considers non-uniform step-sizes nor provides a rigorous theoretical analysis of the methods. Hence, it is of great significance to discuss such a challenging issue due to its practicality.
The main interest of this chapter is to study the distributed convex optimization problem over a directed network. To solve this problem, a linearly convergent algorithm is designed that utilizes non-uniform step-sizes, momentum terms, and a row-stochastic weight matrix. We hope to develop a broad theory of distributed convex optimization, and the underlying purpose of designing a distributed optimization algorithm is to adapt to and promote real scenarios.
1.2 Preliminaries
1.2.1 Notation
If not particularly stated, the vectors mentioned in this chapter are column vectors. Let R and R^p denote the set of real numbers and of p-dimensional real column vectors, respectively. The subscripts i and j are utilized to denote the indices of nodes, and the superscript t denotes the iteration index of an algorithm; e.g., x_i^t denotes the variable of node i at time t. We let the notations 1n and 0n denote the column vectors with all entries equal to one and zero, respectively. Let In and zij denote the identity matrix of size n and the entry of matrix Z in its i-th row and j-th column, respectively. The Euclidean norm for vectors and the induced 2-norm for matrices are represented by the symbol || · ||2. Let the notation Z = diag{y} represent the diagonal matrix of the vector y = [y1, y2, . . . , yn]^T, which satisfies zii = yi, ∀i = 1, . . . , n, and zij = 0, ∀i ≠ j. We define the symbol diag{Z} as a diagonal matrix whose diagonal elements coincide with those of the matrix Z. The transposes of a vector z and a matrix W are denoted by z^T and W^T, respectively. Let e_i = [0, . . . , 1, . . . , 0]^T denote the i-th canonical basis vector (with the 1 in the i-th position). The gradient of a differentiable function f : R^p → R at z is denoted as ∇f(z) ∈ R^p. A non-negative square matrix Z ∈ R^{n×n} is row-stochastic if Z1n = 1n, column-stochastic if Z^T 1n = 1n, and doubly stochastic if Z1n = 1n and Z^T 1n = 1n.
Consider a set of n nodes connected over a directed network. The global objective is to find x ∈ R^p that minimizes the average of all local cost functions, i.e.,

    min_{x ∈ R^p} f(x) = (1/n) Σ_{i=1}^{n} fi(x),    (1.1)
where fi : R^p → R is the local cost function privately known by node i. If there exists a directed path between any two nodes, then the network G is said to be strongly connected. In addition, the following assumptions are adopted.
Assumption 1.1 ([51]) The network G corresponding to the set of nodes is directed
and strongly connected.
Remark 1.1 Assumption 1.1 is fundamental to assure that nodes in the network can
always affect others directly or indirectly when studying distributed optimization
problems [39–53].
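Assumption 1.1 is straightforward to verify numerically. The following is a minimal sketch (not from the book; the adjacency-matrix representation and the ring example are our own) that tests strong connectivity by checking reachability of every node from node 0 along forward and reversed edges:

import numpy as np

def reachable(adj, root):
    # Breadth-first search over a boolean adjacency matrix; adj[i, j] = True
    # means there is a directed edge from j to i (j is an in-neighbor of i).
    seen = {root}
    frontier = [root]
    while frontier:
        nxt = []
        for u in frontier:
            for v in np.nonzero(adj[:, u])[0]:  # edges leaving u
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        frontier = nxt
    return seen

def is_strongly_connected(adj):
    n = adj.shape[0]
    # Strongly connected iff all nodes are reachable from node 0 along
    # forward edges and along reversed edges (Kosaraju-style test).
    return len(reachable(adj, 0)) == n and len(reachable(adj.T, 0)) == n

# Directed ring on 4 nodes: 0 -> 1 -> 2 -> 3 -> 0 (strongly connected).
ring = np.zeros((4, 4), dtype=bool)
for u in range(4):
    ring[(u + 1) % 4, u] = True
print(is_strongly_connected(ring))  # True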
1.3 Algorithm Development

On the basis of the above section, we first review the centralized Nesterov gradient descent method (CNGD) and then propose the directed distributed Nesterov-like gradient tracking algorithm, named D-DNGT, to solve problem (1.1).
Here, CNGD, derived from [57], is briefly introduced for an L̄-smooth and μ̄-strongly convex cost function. At each time t ≥ 0, CNGD keeps three vectors y^t, x^t, v^t ∈ R^p, updated as

    y^t = (x^t + αv^t)/(1 + α)
    x^{t+1} = y^t − (1/L̄)∇f(y^t)    (1.4)
    v^{t+1} = (1 − α)v^t + (αμ̄/γ)y^t − (α/γ)∇f(y^t),

where α = √(μ̄/L̄) and γ is the parameter of the underlying estimate sequence. Taking γ = μ̄, (1.4) can be written in the simplified momentum form

    x^{t+1} = y^t − (1/L̄)∇f(y^t)    (1.5)
    y^{t+1} = x^{t+1} + β(x^{t+1} − x^t),

where β = (√L̄ − √μ̄)/(√L̄ + √μ̄). It is well known that, among all centralized gradient approaches, CNGD [57] achieves the optimal convergence rate in terms of first-order oracle complexity. Under Assumptions 1.2 and 1.3, the convergence rate of CNGD (1.5) is O((1 − √(μ̄/L̄))^t), whose dependence on the condition number L̄/μ̄ improves over CGD's rate O((1 − μ̄/L̄)^t) in the large L̄/μ̄ regime. In this chapter, we devote ourselves to the study of a directed distributed Nesterov-like gradient tracking (D-DNGT) algorithm, which is not only suitable for a directed network but also converges linearly and exactly to the optimal solution of (1.1). To the best of our knowledge, this problem has not yet been addressed and is worthwhile to study.
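To make the scheme concrete, the following is a minimal sketch of CNGD in the momentum form (1.5) on a strongly convex quadratic; the test function, its data, and the iteration count are illustrative choices of ours, not from the book:

import numpy as np

# CNGD in the form (1.5) on f(x) = 0.5 * x^T A x - b^T x, whose gradient is
# A x - b. The quadratic test problem below is a toy example of our own.
rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 5))
A = Q.T @ Q + np.eye(5)           # positive definite Hessian
b = rng.standard_normal(5)
x_star = np.linalg.solve(A, b)    # exact minimizer, for reference

eigs = np.linalg.eigvalsh(A)
L, mu = eigs.max(), eigs.min()    # smoothness / strong-convexity constants
beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))

grad = lambda x: A @ x - b
x = np.zeros(5)
y = x.copy()
for t in range(200):
    x_next = y - grad(y) / L            # x^{t+1} = y^t - (1/L) grad f(y^t)
    y = x_next + beta * (x_next - x)    # y^{t+1} = x^{t+1} + beta (x^{t+1} - x^t)
    x = x_next
print(np.linalg.norm(x - x_star))       # close to machine precision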
We now describe D-DNGT for dealing with problem (1.1) in a distributed manner. Each node i ∈ V at time t ≥ 0 stores four variables: x_i^t ∈ R^p, y_i^t ∈ R^p, s_i^t ∈ R^n, and z_i^t ∈ R^p. For t > 0, node i ∈ V updates its variables as follows:

    x_i^{t+1} = Σ_{j=1}^{n} rij y_j^t + βi(x_i^t − x_i^{t−1}) − αi z_i^t
    y_i^{t+1} = x_i^{t+1} + βi(x_i^{t+1} − x_i^t)
    s_i^{t+1} = Σ_{j=1}^{n} rij s_j^t    (1.6)
    z_i^{t+1} = Σ_{j=1}^{n} rij z_j^t + ∇fi(y_i^{t+1})/[s_i^{t+1}]_i − ∇fi(y_i^t)/[s_i^t]_i,

where αi > 0 and βi ≥ 0 are the local step-size and momentum coefficient of node i, and the weights rij satisfy

    rij > ε, j ∈ N_i^in;  rij = 0, j ∉ N_i^in;  rii = 1 − Σ_{j∈N_i^in} rij > ε, ∀i,    (1.7)

where 0 < ε < 1 (see footnote 1). Each node i ∈ V starts with initial states x_i^0 = y_i^0 ∈ R^p, s_i^0 = e_i, and z_i^0 = ∇fi(y_i^0) (see footnote 2).
Denote R = [rij] ∈ R^{n×n} as the collection of the weights rij, i, j ∈ V, in (1.7), which is obviously row-stochastic. In essence, the update of z_i^t in (1.6) is a distributed inexact gradient tracking step, where each local cost function's gradient is scaled by [s_i^t]_i, which is generated by the third update in (1.6). Actually, the update of s_i^t in (1.6) is a consensus iteration aiming to estimate the left Perron eigenvector w = [w1, . . . , wn]^T (associated with the eigenvalue 1) of the weight matrix R, satisfying 1n^T w = 1. This iteration is similar to that employed in [51–53]. To sum up, D-DNGT (1.6) transforms CNGD (1.5) into a distributed form via gradient tracking and can be applied to a directed network.
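The update (1.6) can be prototyped in a few lines. The sketch below runs D-DNGT on a directed ring with uniform row-stochastic weights (in the spirit of footnote 1) and scalar quadratic local costs; the graph, costs, step-sizes, and momentum coefficients are toy choices of ours, not the simulation setup of Sect. 1.5:

import numpy as np

# Toy run of D-DNGT (1.6): n nodes, scalar decision variables (p = 1, as in
# Remark 1.3), quadratic local costs f_i(x) = 0.5 * a_i * (x - c_i)^2.
rng = np.random.default_rng(1)
n = 5
a = rng.uniform(1.0, 2.0, n)
c = rng.uniform(-1.0, 1.0, n)
x_star = np.sum(a * c) / np.sum(a)        # minimizer of (1/n) sum f_i

grad = lambda i, y: a[i] * (y - c[i])

# Directed ring plus self-loops; uniform row-stochastic weights.
R = np.zeros((n, n))
for i in range(n):
    in_nbrs = [i, (i - 1) % n]            # node i hears itself and node i-1
    for j in in_nbrs:
        R[i, j] = 1.0 / len(in_nbrs)

alpha = 0.05 * np.ones(n)                 # non-uniform step-sizes allowed
beta = 0.1 * np.ones(n)                   # momentum coefficients
x = rng.standard_normal(n)
x_old = x.copy()                          # x^{-1} = x^0
y = x.copy()
S = np.eye(n)                             # s_i^0 = e_i
z = np.array([grad(i, y[i]) for i in range(n)])

for t in range(500):
    x_new = R @ y + beta * (x - x_old) - alpha * z
    y_new = x_new + beta * (x_new - x)
    S_new = R @ S                         # Perron eigenvector estimation
    z = R @ z + np.array([grad(i, y_new[i]) / S_new[i, i]
                          - grad(i, y[i]) / S[i, i] for i in range(n)])
    x_old, x, y, S = x, x_new, y_new, S_new

print(np.abs(x - x_star).max())           # residual decays linearly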
Remark 1.3 For the sake of brevity, we mainly concentrate on the one-dimensional case, i.e., p = 1; the multi-dimensional case can be proven similarly.
Define x^t = [x_1^t, . . . , x_n^t]^T ∈ R^n, y^t = [y_1^t, . . . , y_n^t]^T ∈ R^n, z^t = [z_1^t, . . . , z_n^t]^T ∈ R^n, S^t = [s_1^t, . . . , s_n^t]^T ∈ R^{n×n}, ∇F(y^t) = [∇f1(y_1^t), . . . , ∇fn(y_n^t)]^T ∈ R^n, and S̃^t = diag{S^t}. Therefore, with Dα = diag{α1, . . . , αn} and Dβ = diag{β1, . . . , βn}, the aggregated form of D-DNGT (1.6) can be written as follows:

    x^{t+1} = Ry^t + Dβ(x^t − x^{t−1}) − Dα z^t
    y^{t+1} = x^{t+1} + Dβ(x^{t+1} − x^t)
    S^{t+1} = RS^t    (1.8)
    z^{t+1} = Rz^t + [S̃^{t+1}]^{−1}∇F(y^{t+1}) − [S̃^t]^{−1}∇F(y^t).
1 It is worth noticing that the weights rij, i, j ∈ V, associated with the network G given in (1.7) are valid. For all i ∈ V, the conditions on the weights in (1.7) can be satisfied by setting rij = 1/|N_i^in|, ∀j ∈ N_i^in, and rij = 0 otherwise.
2 Suppose that each node possesses and knows its unique identifier in the network, e.g., 1, . . . , n [45–50].
In this subsection, some distributed optimization methods that are not only suitable for directed networks but also related to D-DNGT (1.6) are discussed on the basis of an intuitive explanation. In particular, we consider ADD-OPT/Push-DIGing [41, 42], FROST [52], and ABm [48] (see footnote 3).
(a) Relation to ADD-OPT/Push-DIGing ADD-OPT [41] (Push-DIGing [42] is suitable for time-varying networks, in comparison with ADD-OPT) keeps updating four variables x_i^t, s_i^t, y_i^t, and z_i^t ∈ R for each node i. Starting from the initial states s_i^0 = 1, z_i^0 = ∇fi(y_i^0), and an arbitrary x_i^0, the updating rule of ADD-OPT is given by

    x_i^{t+1} = Σ_{j=1}^{n} cij x_j^t − α z_i^t
    s_i^{t+1} = Σ_{j=1}^{n} cij s_j^t,   y_i^{t+1} = x_i^{t+1}/s_i^{t+1}    (1.9)
    z_i^{t+1} = Σ_{j=1}^{n} cij z_j^t + ∇fi(y_i^{t+1}) − ∇fi(y_i^t),

where α > 0 is a constant step-size and the weights C = [cij] ∈ R^{n×n} are column-stochastic.
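The auxiliary variable s_i^t in (1.9) implements the push-sum correction: with column-stochastic weights, the raw iterates alone drift toward a Perron-weighted combination, but the ratio x^t/s^t recovers consensus on the right value. A small sketch (toy weights and initial values of our own) illustrates this debiasing on a pure averaging task:

import numpy as np

# Push-sum debiasing sketch: with a column-stochastic C, the iterates x^t
# alone do not converge to the average, but the ratios x^t / s^t do.
C = np.array([[0.5, 0.0, 0.3],
              [0.5, 0.5, 0.3],
              [0.0, 0.5, 0.4]])      # every column sums to one
x = np.array([3.0, 6.0, 9.0])        # average is 6
s = np.ones(3)
for t in range(60):
    x = C @ x
    s = C @ s
print(x / s)                         # -> [6. 6. 6.] (approximately)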
3 Notice that some notation used in the methods reviewed here may conflict with the notation describing the distributed optimization problem/algorithm/analysis throughout the chapter. Therefore, the symbols of this subsection should not be applied to other parts.
(b) Relation to FROST FROST [52] corresponds to D-DNGT (1.6) without the momentum terms (i.e., βi = 0, ∀i):

    x_i^{t+1} = Σ_{j=1}^{n} rij x_j^t − αi z_i^t
    s_i^{t+1} = Σ_{j=1}^{n} rij s_j^t    (1.10)
    z_i^{t+1} = Σ_{j=1}^{n} rij z_j^t + ∇fi(x_i^{t+1})/[s_i^{t+1}]_i − ∇fi(x_i^t)/[s_i^t]_i,

where αi > 0 is a step-size locally chosen at each node i and the row-stochastic weights R = [rij] ∈ R^{n×n} comply with (1.7); the initialization is x_i^0 ∈ R, s_i^0 = e_i, and z_i^0 = ∇fi(x_i^0). FROST utilizes row-stochastic weights with non-uniform step-sizes among the nodes and exhibits fast convergence over a directed network, converging at a linear rate to the optimal solution under Assumptions 1.1–1.3.
(c) Relation to ABm The ABm method, investigated in [48], combines gradient tracking with a momentum term and utilizes non-uniform step-sizes; it is described as follows:

    x_i^{t+1} = Σ_{j=1}^{n} rij x_j^t − αi z_i^t + βi(x_i^t − x_i^{t−1})
    z_i^{t+1} = Σ_{j=1}^{n} cij z_j^t + ∇fi(x_i^{t+1}) − ∇fi(x_i^t),    (1.11)

initialized with z_i^0 = ∇fi(x_i^0) and an arbitrary x_i^0 at each node i, where, as before, αi > 0 and βi ≥ 0 represent the local step-size and the momentum coefficient of node i. By simultaneously implementing both row-stochastic (R = [rij] ∈ R^{n×n}) and column-stochastic (C = [cij] ∈ R^{n×n}) weights, it is deduced from [48] that ABm reduces to AB [45] when βi = 0, ∀i, and AB lies at the heart of existing methods that employ gradient tracking [42, 43, 48].
Notice that ADD-OPT/Push-DIGing, FROST, and D-DNGT, described above, each contain a non-linear term derived from the division by the eigenvector learning term ((1.6), (1.9), and (1.10)). ABm eliminates this non-linear calculation and is still suitable for directed networks. However, ABm requires each node to access its out-degree information to build column-stochastic weights, which, as explained earlier, is difficult to realize directly in a distributed manner. It is worth highlighting that our algorithm, D-DNGT, extends CNGD to a distributed form and, in comparison with CNGD [57] and Acc-DNGD-SC/Acc-DNGD-NSC [54], is suitable for directed networks. In addition, D-DNGT combines FROST with two kinds of momentum terms (heavy-ball momentum and Nesterov momentum), which ensures that nodes acquire more information from their in-neighbors than under FROST and thus achieve much faster convergence.
1.4 Convergence Analysis
In this section, we will prove that D-DNGT (1.6) converges at a linear rate to the optimal solution x* provided that the coefficients (non-uniform step-sizes and momentum coefficients) are bounded by properly chosen constants. The following notations and relations are employed. Recalling that R is irreducible and row-stochastic with positive diagonals, under Assumption 1.1, there exists a normalized left Perron eigenvector w = [w1, . . . , wn]^T ∈ R^n (wi > 0, ∀i) of R such that w^T R = w^T, 1n^T w = 1, and (R)^∞ := lim_{t→∞}(R)^t = 1n w^T (see footnote 4).
Before showing the main results, we introduce some auxiliary results. First, the following crucial lemma is given, which is a direct implication of Assumption 1.1 and (1.7) (see Section II-B in [32]).

Lemma 1.4 ([32]) Suppose that Assumption 1.1 holds and that the weight matrix R = [rij] ∈ R^{n×n} follows (1.7). Then, there exist a norm || · || and a constant 0 < ρ < 1 such that

    ||Rx − (R)^∞ x|| ≤ ρ||x − (R)^∞ x||

for all x ∈ R^n.
According to the result established in Lemma 1.4, in the following, we present
an additional lemma from the Markov chain and consensus theory [60].
4 Throughout the chapter, for any arbitrary matrix/vector/scalar Z, we utilize the symbol (Z)^t to represent the t-th power of Z, to distinguish it from the iteration index of variables.
Lemma 1.5 ([60]) Let S^t be generated by (1.8). Then, there exist 0 < θ < ∞ and 0 < λ < 1 such that

    ||S^t − S^∞||2 ≤ θ(λ)^t, ∀t ≥ 0,

where S^∞ := lim_{t→∞} S^t = (R)^∞.
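Lemma 1.5 is easy to observe numerically: iterating S^{t+1} = RS^t from S^0 = In drives every row of S^t geometrically toward w^T. A small sketch with toy ring weights (our own choice):

import numpy as np

# Numerical illustration of Lemma 1.5: S^{t+1} = R S^t with S^0 = I converges
# geometrically to S^infty = 1 w^T, where w is the left Perron eigenvector of
# the row-stochastic R. The ring weights below are a toy choice.
n = 5
R = np.zeros((n, n))
for i in range(n):
    R[i, i] = R[i, (i - 1) % n] = 0.5

# Left Perron eigenvector: w^T R = w^T, normalized so that 1^T w = 1.
vals, vecs = np.linalg.eig(R.T)
w = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
w = w / w.sum()
S_inf = np.outer(np.ones(n), w)

S = np.eye(n)
for t in range(1, 31):
    S = R @ S
    if t % 10 == 0:
        print(t, np.linalg.norm(S - S_inf, 2))  # decays like lambda^t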
For convenience of the convergence analysis, we will make frequent use of the following well-known lemma (see, e.g., [32] for a proof).

Lemma 1.8 ([32]) Suppose that Assumptions 1.2–1.3 hold. Since the global cost function f is μ̄-strongly convex and L̄-smooth, for all x ∈ R and 0 < ε < 2/L̄, we get

    ||x − ε∇f(x) − x*||2 ≤ l||x − x*||2,

where l = max{|1 − L̄ε|, |1 − μ̄ε|}, x* is the optimal solution to (1.1), and ∇f(x) is the gradient of f(x) at x.
Lemma 1.9 Suppose that Assumption 1.1 holds. Then, for all t > 0, the consensus error ||x^{t+1} − (R)^∞ x^{t+1}|| admits the bound stated in (1.13), where the inequality in (1.13) is obtained from Lemma 1.4 and the fact that (R)^∞ R = (R)^∞. The desired result of Lemma 1.9 is then acquired.
The next lemma presents the bound on the optimality residual associated with the weighted average ||(R)^∞ x^{t+1} − 1n x*||2 (notice that (R)^∞ x^{t+1} = 1n x̄^{t+1}).

Lemma 1.10 Suppose that Assumptions 1.2 and 1.3 hold. If 0 < n(w^T α) < 2/L̄, then the inequality (1.14) holds for all t > 0, where κ3 = d1||(R)^∞||2 and l1 = max{|1 − L̄n(w^T α)|, |1 − μ̄n(w^T α)|}; θ and λ are introduced in Lemma 1.5.
Proof Notice that (R)^∞ R = (R)^∞. Recalling the updates of x^t and y^t in D-DNGT (1.8), we get from Lemma 1.7 that

    ||(R)^∞ x^{t+1} − 1n x*||2
    = ||(R)^∞(x^t + 2Dβ(x^t − x^{t−1}) − Dα z^t + (Dα − D̄α)(R)^∞ z^t) − 1n x*||2
    ≤ ||(R)^∞ x^t − (R)^∞ Dα(R)^∞[S̃^t]^{−1}∇F(y^t) − 1n x*||2 + · · · .    (1.15)
We now discuss the first term in the inequality of (1.15). Note that (R)^∞ = 1n w^T and ∇F(1n x̄^t) = [∇f1(x̄^t), . . . , ∇fn(x̄^t)]^T. By utilizing 1n w^T Dα 1n w^T = (w^T α)1n w^T, one obtains (1.16), where ∇f(x̄^t) = (1/n)1n^T ∇F(1n x̄^t). By Lemma 1.8, when 0 < n(w^T α) < 2/L̄, Λ1 is bounded by

    Λ1 ≤ l1 √n ||w^T x^t − x*||2 = l1 ||(R)^∞ x^t − 1n x*||2,    (1.17)

where l1 = max{|1 − L̄n(w^T α)|, |1 − μ̄n(w^T α)|}. Then, Λ2 can be bounded as in (1.18),
where ∇F(x^t) = [∇f1(x_1^t), . . . , ∇fn(x_n^t)]^T. Since ∇f(x̄^t) = (1/n)1n^T ∇F(1n x̄^t), the bound (1.19) follows from Assumption 1.2. Next, by employing Lemma 1.6 and the relation S^∞[S̃^∞]^{−1} = 1n 1n^T, we have (1.20), where ŝ = sup_{t≥0} ||S^t||2 and s̃ = sup_{t≥0} ||[S̃^t]^{−1}||2. The lemma follows by plugging (1.16)–(1.20) into (1.15).
For the bound on the estimate difference ||x^{t+1} − x^t||, the following lemma is shown.

Lemma 1.11 Suppose that Assumption 1.2 holds. For all t > 0, the stated bound on ||x^{t+1} − x^t|| holds.

Proof Recalling that (R)^∞ R = (R)^∞, the claim follows from the updates of x^t and y^t in D-DNGT (1.8).

The next lemma (Lemma 1.12) bounds the gradient tracking error:

    ||z^{t+1} − (R)^∞ z^{t+1}|| ≤ κ4 κ6(1 + β̂)||x^t − (R)^∞ x^t|| + κ6 d2(1 + β̂)α̂||z^t||2 + · · · .

Proof By the update of z^t in (1.8),

    ||z^{t+1} − (R)^∞ z^{t+1}||
    ≤ ||In − (R)^∞|| ||[S̃^{t+1}]^{−1}∇F(y^{t+1}) − [S̃^t]^{−1}∇F(y^t)|| + ρ||z^t − (R)^∞ z^t||,    (1.24)

where we employ the triangle inequality and Lemma 1.4 to deduce the inequality. As for the first term of the inequality in (1.24), we apply the update of y^t in D-DNGT (1.8) and the result in Lemma 1.6 to obtain (1.25). Combining Lemma 1.11 with (1.25), the result in Lemma 1.12 is obtained.
The final lemma provides the bound on the estimate ||z^t||2 that is needed to derive the aforementioned linear system.

Lemma 1.13 Assume that Assumption 1.2 holds. Then, a corresponding inequality can be established for all t > 0. In view of Lemma 1.7, using S^∞[S̃^∞]^{−1} = 1n 1n^T and (R)^∞ = S^∞, one obtains (1.28). Substituting (1.28) and (1.29) into (1.27) yields the desired result in Lemma 1.13. The proof is completed.
With the supporting relationships, i.e., Lemmas 1.9–1.13, in hand, the main convergence results of D-DNGT are now established as follows.
For the sake of convenience, we define wmin = min_{i∈V}{wi}, ν1 = κ2 d1 nL̂, ν2 = κ2 nL̂, ν3 = κ2 d1, ν4 = d1 nL̂, ν5 = d1 ŝ s̃ L̂, ν6 = d2 d1 nL̂, ν7 = d2 nL̂, ν8 = d2 d1, ν9 = κ4 κ6, ν10 = κ6 d2 d1 nL̂, ν11 = κ6 d2 nL̂, ν12 = κ6 + κ5 κ6, ν13 = κ5 κ6, ν14 = κ6 d2 d1, ν15 = κ2 α̂ ŝ(s̃)^2 θ, ν16 = ŝ(s̃)^2 θ α̂, ν17 = d2 α̂ ŝ(s̃)^2 θ, ν18 = (2||In − (R)^∞|| + κ6(1 + β̂)α̂ ŝ)(s̃)^2 θ d2, ν19 = ν13 η3 + ν10 η3 α̂, ν20 = ν9 η1 + ν10 η1 α̂ + ν11 η2 α̂ + ν12 η3 + ν10 η3 α̂ + ν14 η4 α̂, and ν21 = η4(1 − ρ) − ν9 η1 − (ν10 η1 + ν11 η2 + ν14 η4)α̂. Then, the first result, i.e., Theorem 1.14, is introduced below.
and γ41 = ν9 + ν10 α̂ + ν9 β̂ + ν10 α̂ β̂, γ42 = ν11 α̂ + ν11 α̂ β̂, γ43 = ν12 β̂ + ν13 β̂^2 + ν10 α̂ β̂ + ν10 α̂ β̂^2, and γ44 = ρ + ν14 α̂ + ν14 α̂ β̂; φ1^t = ν15(λ)^t ||∇F(y^t)||2, φ2^t = ν16(λ)^t ||∇F(y^t)||2, φ3^t = ν17(λ)^t ||∇F(y^t)||2, and φ4^t = ν18(λ)^t ||∇F(y^t)||2.
Assuming in addition that the largest step-size satisfies

    0 < α̂ < min{ 1/(nL̄), η1(1 − ρ)/(ν1 η1 + ν2 η2 + ν3 η4), (η3 − κ4 η1)/(ν6 η1 + ν7 η2 + ν8 η4), (η4(1 − ρ) − ν9 η1)/(ν10 η1 + ν11 η2 + ν14 η4) },    (1.31)
then the spectral radius of Γ, denoted ρ(Γ), is strictly less than 1, where η1, η2, η3, and η4 are arbitrary constants such that

    η1 > 0,  η2 > (ν4 η1 + κ3 η4)/(μ̄nwmin),  η3 > κ4 η1,  η4 > ν9 η1/(1 − ρ).    (1.33)
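The mechanism behind Theorem 1.14 is the classical result (Theorem 8.1.29 in [60]) that a nonnegative matrix Γ satisfying Γη < η elementwise for some positive η must have ρ(Γ) < 1. A toy numerical check (the matrix below is our own example, not the Γ of (1.30)):

import numpy as np

# Toy check of the spectral-radius argument used in Theorem 1.14: for a
# nonnegative Gamma, Gamma @ eta < eta elementwise with eta > 0 implies
# rho(Gamma) < 1.
Gamma = np.array([[0.6, 0.1, 0.0, 0.1],
                  [0.2, 0.5, 0.1, 0.0],
                  [0.0, 0.2, 0.6, 0.1],
                  [0.1, 0.0, 0.1, 0.6]])
eta = np.ones(4)
print(np.all(Gamma @ eta < eta))                 # True: all row sums < 1
print(np.max(np.abs(np.linalg.eigvals(Gamma))))  # spectral radius < 1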
Proof First, plugging Lemma 1.13 into Lemmas 1.9–1.12 and rearranging the acquired inequalities, it is immediate to verify (1.30). Next, we provide conditions under which the relation ρ(Γ) < 1 holds. According to Theorem 8.1.29 in [60], we know that, for a positive vector η = [η1, . . . , η4]^T ∈ R^4, if Γη < η, then ρ(Γ) < 1 holds. By the definition of Γ, the inequality Γη < η is equivalent to

    (κ1 η3 + ν1 η3 α̂)β̂ < η1(1 − ρ) − (ν1 η1 + ν2 η2 + ν3 η4)α̂
    (2κ2 η3 + ν5 η3 α̂)β̂ < η2(1 − l1) − (ν4 η1 + κ3 η4)α̂
    (κ5 η3 + ν6 η3 α̂)β̂ < η3 − κ4 η1 − (ν6 η1 + ν7 η2 + ν8 η4)α̂    (1.34)
    2ν19 β̂ < −ν20 + √((ν20)^2 + 4ν19 ν21).
When 0 < α̂ < 1/(nL̄), it follows from Lemma 1.10 that l1 = 1 − μ̄n(w^T α) ≤ 1 − μ̄nwmin α̂. To ensure the positivity of β̂ (i.e., that the right-hand sides of (1.34) are always positive), (1.34) further implies that

    α̂ < η1(1 − ρ)/(ν1 η1 + ν2 η2 + ν3 η4)
    η2 > (ν4 η1 + κ3 η4)/(μ̄nwmin)
    α̂ < (η3 − κ4 η1)/(ν6 η1 + ν7 η2 + ν8 η4),  η3 > κ4 η1    (1.35)
    α̂ < (η4(1 − ρ) − ν9 η1)/(ν10 η1 + ν11 η2 + ν14 η4),  η4 > ν9 η1/(1 − ρ).
Lemma 1.16 ([39]) Let {v^t}, {u^t}, {a^t}, and {b^t} be non-negative sequences such that, for all t ≥ 0,

    v^{t+1} ≤ (1 + a^t)v^t − u^t + b^t.

Also, let Σ_{t=0}^{∞} a^t < ∞ and Σ_{t=0}^{∞} b^t < ∞. Then, lim_{t→∞} v^t = v for a non-negative constant v, and Σ_{t=0}^{∞} u^t < ∞.
    ϕ^{t+1} ≤ Γϕ^t + P^t Q^t.    (1.36)
Iterating (1.36) over t yields

    ϕ^t ≤ (Γ)^t ϕ^0 + Σ_{k=0}^{t−1} (Γ)^{t−k−1} P^k Q^k.    (1.37)
Since the spectral radius of Γ is strictly less than 1, it can be concluded from Lemma 1.16 in [52] that ||(Γ)^t||2 ≤ ϑ(δ0)^t and ||(Γ)^{t−k−1} P^k||2 ≤ ϑ(δ0)^t for some ϑ > 0 and λ < δ0 < 1. Taking the 2-norm on both sides of (1.37) yields

    ||ϕ^t||2 ≤ ||(Γ)^t||2 ||ϕ^0||2 + Σ_{k=0}^{t−1} ||(Γ)^{t−k−1} P^k||2 ||Q^k||2
            ≤ ϑ||ϕ^0||2 (δ0)^t + ϑ(δ0)^t Σ_{k=0}^{t−1} ||Q^k||2.    (1.38)
Further manipulation yields

    ||ϕ^t||2 ≤ (ϑ||ϕ^0||2 + (1 + d1)(1 + β̂)L̂ϑ Σ_{k=0}^{t−1} ||ϕ^k||2 + ϑt||∇F(1n x*)||2)(δ0)^t.    (1.40)

Define v^t = Σ_{k=0}^{t−1} ||ϕ^k||2, ν22 = (1 + d1)(1 + β̂)L̂ϑ, and p^t = ϑ||ϕ^0||2 + ϑt||∇F(1n x*)||2; then (1.40) implies (1.41), which is equivalent to (1.42). Applying Lemma 1.16, we achieve that v^t converges and thus is bounded. Following from (1.41), we obtain that lim_{t→∞} ||ϕ^t||2/(δ1)^t ≤ lim_{t→∞} (ν22 v^t + p^t)(δ0)^t/(δ1)^t = 0 for all δ0 < δ1 < 1, and thus there exist a positive constant m and an arbitrarily small constant τ such that the bound (1.43) holds for all t ≥ 0.
Hence, each node can choose a relatively wide step-size. This is in contrast to the earlier work on non-uniform step-sizes within the framework of gradient tracking [33, 35, 43, 44], which depends on the heterogeneity of the step-sizes (||(In − W)α||2/||Wα||2, with W the weight matrix, in [35], and α̂/α̃, α̃ = min_{i∈V}{αi}, in [33, 43, 44]). Moreover, the analysis in [33, 35, 43, 44] showed that those algorithms converge linearly to the optimal solution if and only if both the heterogeneity and the largest step-size are small. However, the largest step-size there obeys a bound that is a function of the heterogeneity, so there is a trade-off between the tolerance of heterogeneity and the largest step-size that can be achieved. Finally, the bounds on the non-uniform step-sizes in this chapter allow the existence of zero step-sizes among some (not all) of the nodes, provided the largest step-size is positive and sufficiently small.
1.4.4 Discussion
The idea of D-DNGT can be applied to other directed distributed gradient tracking methods to relax the condition that the weight matrices be only column-stochastic [41, 42] or both row- and column-stochastic [45, 46]. Next, three possible Nesterov-like optimization algorithms are presented. In this chapter, we only highlight these extensions and verify their feasibility by means of simulations; a rigorous theoretical analysis of the three possible algorithms is left for future work.
(a) D-DNGT with Only Column-Stochastic Weights [41, 42] Here, we present an extended algorithm, named D-DNGT-C, obtained by applying the momentum terms to ADD-OPT [41]/Push-DIGing [42] (whose weight matrices are only column-stochastic). Specifically, the updates of D-DNGT-C are stated as follows:

    x_i^{t+1} = Σ_{j=1}^{n} cij h_j^t + βi(x_i^t − x_i^{t−1}) − αi z_i^t
    h_i^{t+1} = x_i^{t+1} + βi(x_i^{t+1} − x_i^t)
    s_i^{t+1} = Σ_{j=1}^{n} cij s_j^t,   y_i^{t+1} = h_i^{t+1}/s_i^{t+1}    (1.44)
    z_i^{t+1} = Σ_{j=1}^{n} cij z_j^t + ∇fi(y_i^{t+1}) − ∇fi(y_i^t),

initialized with x_i^0 = h_i^0 = y_i^0 ∈ R, s_i^0 = 1, and z_i^0 = ∇fi(y_i^0), where, as before, C = [cij] ∈ R^{n×n} is column-stochastic, and αi > 0 and βi ≥ 0 represent the local step-size and the momentum coefficient of node i. Unlike ADD-OPT [41]/Push-DIGing [42], D-DNGT-C adds, by means of column-stochastic weights, two types of momentum terms (heavy-ball momentum and Nesterov momentum) to ensure that nodes acquire more information from in-neighbors in the network and thereby achieve fast convergence.
(b) D-DNGT with Both Row- and Column-Stochastic Weights [45, 46] Consider that D-DNGT with both row- and column-stochastic weights does not need the eigenvector estimation present in D-DNGT (1.6) or D-DNGT-C (1.44). Hence, an extended algorithm (named D-DNGT-RC), which utilizes both row-stochastic (R = [rij] ∈ R^{n×n}) and column-stochastic (C = [cij] ∈ R^{n×n}) weights, is presented as follows:

    x_i^{t+1} = Σ_{j=1}^{n} rij y_j^t + βi(x_i^t − x_i^{t−1}) − αi z_i^t
    y_i^{t+1} = x_i^{t+1} + βi(x_i^{t+1} − x_i^t)    (1.45)
    z_i^{t+1} = Σ_{j=1}^{n} cij z_j^t + ∇fi(y_i^{t+1}) − ∇fi(y_i^t),

where x_i^0 = y_i^0 ∈ R and z_i^0 = ∇fi(y_i^0), and αi > 0 and βi ≥ 0 represent the local step-size and the momentum coefficient of node i. D-DNGT-RC not only avoids the additional iterations of eigenvector learning but also guarantees that nodes obtain more information from in-neighbors, which may yield faster convergence than [45] and [46].
(c) D-DNGT-RC with Interaction Delays [49] Note that nodes may confront arbitrary but uniformly bounded interaction delays in the process of gaining information from in-neighbors [49]. Specifically, to solve problem (1.1), we denote by ς_ij^t (see footnote 5) an arbitrary, a priori unknown delay induced by the interaction link (j, i) at time t ≥ 0. Then, the updates of D-DNGT-RC with delays (D-DNGT-RC-D) become

    x_i^{t+1} = Σ_{j=1}^{n} rij y_j^{t−ς_ij^t} + βi(x_i^t − x_i^{t−1}) − αi z_i^t
    y_i^{t+1} = x_i^{t+1} + βi(x_i^{t+1} − x_i^t)    (1.46)
    z_i^{t+1} = Σ_{j=1}^{n} cij z_j^{t−ς_ij^t} + ∇fi(y_i^{t+1}) − ∇fi(y_i^t).
5 For all t > 0, the interaction delays ς_ij^t are assumed to be uniformly bounded; that is, there exists some finite ς̂ > 0 such that 0 ≤ ς_ij^t ≤ ς̂. In addition, each node can access its own estimate without delay, i.e., ς_ii^t = 0, ∀i ∈ V and t > 0.
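An implementation of (1.46) only has to keep a short history of in-neighbor states and read the copy that is ς_ij^t steps old. The following is a minimal sketch of the delayed mixing step; the sizes, weights, and delay draws are toy choices of ours:

import numpy as np

# Sketch of the delayed mixing step in (1.46): node i combines the values
# y_j^{t - delay[i][j]} kept in a history buffer. Delays are bounded by
# s_hat and self-delays are zero (footnote 5). Toy sizes and delays.
n, s_hat = 4, 2
R = np.full((n, n), 1.0 / n)              # toy row-stochastic weights
rng = np.random.default_rng(2)
history = [rng.standard_normal(n) for _ in range(s_hat + 1)]  # y^{t-2..t}

def delayed_mix(history, R, delay):
    # delay[i][j] in {0, ..., s_hat}; history[-1 - d] is the state d steps old.
    n = R.shape[0]
    out = np.zeros(n)
    for i in range(n):
        for j in range(n):
            d = 0 if i == j else delay[i][j]   # own state is never delayed
            out[i] += R[i, j] * history[-1 - d][j]
    return out

delay = rng.integers(0, s_hat + 1, size=(n, n))
print(delayed_mix(history, R, delay))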
1.5 Numerical Examples
In this section, the proposed algorithms are tested on a distributed logistic regression problem:

    min_{x,v} f(x, v) = Σ_{i=1}^{n} fi(x, v),

where x ∈ R^p and v ∈ R are the optimization variables for learning the separating hyperplane. Here, the local cost function fi is given by

    fi(x, v) = (ω/2)(||x||2^2 + v^2) + Σ_{j=1}^{mi} ln(1 + exp(−(c_ij^T x + v)b_ij)),
where each node i ∈ {1, . . . , n} privately knows mi training examples; (c_ij, b_ij) ∈ R^p × {−1, +1}, where c_ij is the p-dimensional feature vector of the j-th training sample at the i-th node, drawn from a Gaussian distribution with zero mean, and b_ij is the corresponding label following a Bernoulli distribution. In terms of parameter design, we choose n = 10, mi = 10 for all i, and p = 2. The network topology, a directed and strongly connected network, is depicted in Fig. 1.1. In addition, we utilize a simple uniform weighting strategy, rij = 1/|N_i^in|, ∀i, to construct the row-stochastic weights.
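The local cost fi and its gradient used in this experiment can be coded directly. The sketch below follows the stated setup (Gaussian features, labels in {−1, +1}, mi = 10, p = 2); the regularization weight ω = 0.1 and the random seed are our own assumptions, since the excerpt does not specify them:

import numpy as np

# Local logistic-regression cost f_i and its gradient. The value omega = 0.1
# is an assumption; the text leaves it unspecified.
rng = np.random.default_rng(3)
p, m_i, omega = 2, 10, 0.1
C = rng.standard_normal((m_i, p))              # rows are the c_ij
b = rng.choice([-1.0, 1.0], size=m_i)          # labels b_ij

def f_i(x, v):
    margins = -(C @ x + v) * b
    return 0.5 * omega * (x @ x + v * v) + np.log1p(np.exp(margins)).sum()

def grad_f_i(x, v):
    margins = -(C @ x + v) * b
    sig = 1.0 / (1.0 + np.exp(-margins))       # d/dm log(1 + e^m) = sigma(m)
    gx = omega * x - C.T @ (sig * b)
    gv = omega * v - np.dot(sig, b)
    return gx, gv

x, v = np.zeros(p), 0.0
print(f_i(x, v), grad_f_i(x, v))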
The simulation results are plotted in Figs. 1.2, 1.3, and 1.4. Figure 1.2 indicates that D-DNGT with momentum terms accelerates convergence in comparison with the applicable algorithms without momentum terms. Figure 1.3 shows that
Fig. 1.2 Performance comparisons between D-DNGT and the methods without momentum terms (residual versus time steps)
D-DNGT with two momentum terms (heavy-ball momentum [48] and Nesterov momentum [50, 54, 55]) improves the convergence compared with the applicable algorithms with a single momentum term. We note that although the eigenvector learning embedded in D-DNGT may slow down convergence, D-DNGT is more suitable for broadcast-based protocols than the other optimization methods (AB, ADD-OPT/Push-DIGing, ABm, and ABN) because it only requires row-stochastic weights. Finally, it is concluded from Fig. 1.4 that the algorithms with momentum terms successfully accelerate convergence regardless of whether the interaction links undergo interaction delays or the weight matrices are only column-stochastic or both row- and column-stochastic.
Fig. 1.3 Performance comparisons between D-DNGT and the methods with momentum terms (residual versus time steps)
Fig. 1.4 Performance comparisons between the extensions of D-DNGT and their closely related methods (residual versus time steps)
1.6 Conclusion
References
1. S. Yang, Q. Liu, J. Wang, Distributed optimization based on a multiagent system in the presence
of communication delays. IEEE Trans. Syst., Man, Cybern., Syst. 47(5), 717–728 (2017)
2. J. Chen, A. Sayed, Diffusion adaptation strategies for distributed optimization and learning
over networks. IEEE Trans. Signal Process. 60(8), 4289–4305 (2012)
3. K. Li, Q. Liu, S. Yang, J. Cao, G. Lu, Cooperative optimization of dual multiagent system for
optimal resource allocation. IEEE Trans. Syst., Man, Cybern., Syst. 50(11), 4676–4687 (2020)
4. S. Wang, C. Li, Distributed robust optimization in networked system. IEEE Trans. Cybern.
47(8), 2321–2333 (2017)
5. X. Dong, G. Hu, Time-varying formation tracking for linear multi-agent systems with multiple
leaders. IEEE Trans. Autom. Control 62(7), 3658–3664 (2017)
6. X. Dong, G. Hu, Time-varying formation control for general linear multi-agent systems with
switching directed topologies. Automatica 73, 47–55 (2016)
7. C. Shi, G. Yang, Augmented Lagrange algorithms for distributed optimization over multi-agent
networks via edge-based method. Automatica 94, 55–62 (2018)
8. S. Zhu, C. Chen, W. Li, B. Yang, X. Guan, Distributed state estimation of sensor-network
systems subject to Markovian channel switching with application to a chemical process. IEEE
Trans. Syst. Man Cybern. Syst. 48(6), 864–874 (2018)
9. D. Jakovetic, A unification and generalization of exact distributed first order methods. IEEE
Trans. Signal Inform. Process. Over Netw. 5(1), 31–46 (2019)
10. Z. Wu, Z. Li, Z. Ding, Z. Li, Distributed continuous-time optimization with scalable adaptive
event-based mechanisms. IEEE Trans. Syst. Man Cybern. Syst. 50(9), 3252–3257 (2020)
11. K. Scaman, F. Bach, S. Bubeck, Y. Lee, L. Massoulie, Optimal algorithms for smooth and strongly convex distributed optimization in networks, in Proceedings of the 34th International Conference on Machine Learning (PMLR), vol. 70 (2017), pp. 3027–3036
12. X. He, T. Huang, J. Yu, C. Li, Y. Zhang, A continuous-time algorithm for distributed
optimization based on multiagent networks. IEEE Trans. Syst. Man Cybern. Syst. 49(12),
2700–2709 (2019)
13. Y. Zhu, W. Ren, W. Yu, G. Wen, Distributed resource allocation over directed graphs via
continuous-time algorithms. IEEE Trans. Syst. Man Cybern. Syst. 51(2), 1097–1106 (2021)
14. A. Nedic, A. Ozdaglar, Distributed subgradient methods for multi-agent optimization. IEEE
Trans. Autom. Control 54(1), 48–61 (2009)
15. A. Nedic, A. Ozdaglar, P. Parrilo, Constrained consensus and optimization in multi-agent
networks. IEEE Trans. Autom. Control 55(4), 922–938 (2010)
16. H. Li, S. Liu, Y. Soh, L. Xie, Event-triggered communication and data rate constraint for
distributed optimization of multiagent systems. IEEE Trans. Syst. Man Cybern. Syst. 48(11),
1908–1919 (2018)
17. D. Yuan, Y. Hong, D. Ho, G. Jiang, Optimal distributed stochastic mirror descent for strongly
convex optimization. Automatica 90, 196–203 (2018)
18. I. Matei, J. Baras, Performance evaluation of the consensus-based distributed subgradient
method under random communication topologies. IEEE J. Sel. Topics Signal Process. 5(4),
754–771 (2011)
19. C. Xi, U. Khan, Distributed subgradient projection algorithm over directed graphs. IEEE Trans.
Autom. Control 62(8), 3986–3992 (2017)
20. D. Yuan, D. Ho, G. Jiang, An adaptive primal-dual subgradient algorithm for online distributed
constrained optimization. IEEE Trans. Cybern. 48(11), 3045–3055 (2018)
21. C. Li, P. Zhou, L. Xiong, Q. Wang, T. Wang, Differentially private distributed online learning. IEEE Trans. Knowl. Data Eng. 30(8), 1440–1453 (2018)
22. J. Zhu, C. Xu, J. Guan, D. Wu, Differentially private distributed online algorithms over time-
varying directed networks. IEEE Trans. Signal Inform. Process. Over Netw. 4(1), 4–17 (2018)
23. W. Shi, Q. Ling, K. Yuan, G. Wu, W. Yin, On the linear convergence of the ADMM in
decentralized consensus optimization. IEEE Trans. Signal Process. 62(7), 1750–1761 (2014)
24. J. Mota, J. Xavier, P. Aguiar, M. Puschel, D-ADMM: a communication-efficient distributed
algorithm for separable optimization. IEEE Trans. Signal Process. 61(10), 2718–2723 (2013)
25. H. Terelius, U. Topcu, R. Murray, Decentralized multi-agent optimization via dual decomposi-
tion. IFAC Proc. Volumes 44(1), 11245–11251 (2011)
26. E. Wei, A. Ozdaglar, On the O(1/k) convergence of asynchronous distributed alternating
direction method of multipliers, in 2013 IEEE Global Conference on Signal and Information
Processing (2013). https://doi.org/10.1109/GlobalSIP.2013.6736937
27. M. Hong, T. Chang, Stochastic proximal gradient consensus over random networks. IEEE
Trans. Signal Process. 65(11), 2933–2948 (2017)
28. H. Xiao, Y. Yu, S. Devadas, On privacy-preserving decentralized optimization through
alternating direction method of multipliers (2019). Preprint arXiv:1902.06101
29. A. Chen, A. Ozdaglar, A fast distributed proximal-gradient method, in 2012 50th Annual
Allerton Conference on Communication, Control, and Computing (Allerton) (2012). https://
doi.org/10.1109/Allerton.2012.6483273
30. X. Dong, Y. Hua, Y. Zhou, Z. Ren, Y. Zhong, Theory and experiment on formation-containment
control of multiple multirotor unmanned aerial vehicle systems. IEEE Trans. Autom. Sci. Eng.
16(1), 229–240 (2019)
31. W. Shi, Q. Ling, G. Wu, W. Yin, EXTRA: an exact first-order algorithm for decentralized consensus optimization. SIAM J. Optim. 25(2), 944–966 (2015)
32. G. Qu, N. Li, Harnessing smoothness to accelerate distributed optimization. IEEE Trans.
Control Netw. Syst. 5(3), 1245–1260 (2018)
33. A. Nedic, A. Olshevsky, W. Shi, C. Uribe, Geometrically convergent distributed optimization
with uncoordinated step-sizes, in 2017 American Control Conference (ACC) (2017). https://
doi.org/10.23919/ACC.2017.7963560
34. M. Maros, J. Jalden, A geometrically converging dual method for distributed optimization over
time-varying graphs. IEEE Trans. Autom. Control 66(6), 2465–2479 (2021)
35. J. Xu, S. Zhu, Y. Soh, L. Xie, Convergence of asynchronous distributed gradient methods over
stochastic networks. IEEE Trans. Autom. Control 63(2), 434–448 (2018)
36. S. Pu, A. Nedic, Distributed stochastic gradient tracking methods. Math. Program. 187(1),
409–457 (2021)
37. Y. Tian, Y. Sun, B. Du, G. Scutari, ASY-SONATA: Achieving geometric convergence for
distributed asynchronous optimization, in 2018 56th Annual Allerton Conference on Communi-
cation, Control, and Computing (Allerton) (2018). https://doi.org/10.1109/ALLERTON.2018.
8636055
38. M. Maros, J. Jalden, Panda: A dual linearly converging method for distributed optimization
over time-varying undirected graphs, in 2018 IEEE Conference on Decision and Control
(CDC) (2018). https://doi.org/10.1109/CDC.2018.8619626
39. A. Nedic, A. Olshevsky, Distributed optimization over time-varying directed graphs. IEEE
Trans. Autom. Control 60(3), 601–615 (2015)
40. C. Xi, U. Khan, DEXTRA: a fast algorithm for optimization over directed graphs. IEEE Trans.
Autom. Control 62(10), 4980–4993 (2017)
41. C. Xi, R. Xin, U. Khan, ADD-OPT: accelerated distributed directed optimization. IEEE Trans.
Autom. Control 63(5), 1329–1339 (2018)
42. A. Nedic, A. Olshevsky, W. Shi, Achieving geometric convergence for distributed optimization over time-varying graphs. SIAM J. Optim. 27(4), 2597–2633 (2017)
43. Q. Lü, H. Li, D. Xia, Geometrical convergence rate for distributed optimization with time-
varying directed graphs and uncoordinated step-sizes. Inf. Sci. 422, 516–530 (2018)
44. Q. Lü, H. Li, Z. Wang, Q. Han, W. Ge, Performing linear convergence for distributed
constrained optimisation over time-varying directed unbalanced networks. IET Control Theory
Appl. 13(17), 2800–2810 (2019)
45. R. Xin, U. Khan, A linear algorithm for optimization over directed graphs with geometric
convergence. IEEE Control Syst. Lett. 2(3), 315–320 (2018)
46. S. Pu, W. Shi, J. Xu, A. Nedic, Push-pull gradient methods for distributed optimization in
networks. IEEE Trans. Autom. Control 66(1), 1–16 (2021)
47. F. Saadatniaki, R. Xin, U. Khan, Decentralized optimization over time-varying directed graphs
with row and column-stochastic matrices. IEEE Trans. Autom. Control 65(11), 4769–4780
(2020)
48. R. Xin, U. Khan, Distributed heavy-ball: a generalization and acceleration of first-order
methods with gradient tracking. IEEE Trans. Autom. Control 65(6), 2627–2633 (2020)
49. C. Zhao, X. Duan, Y. Shi, Analysis of consensus-based economic dispatch algorithm under
time delays. IEEE Trans. Syst. Man Cybern. Syst. 50(8), 2978–2988 (2020)
50. R. Xin, D. Jakovetic, U. Khan, Distributed Nesterov gradient methods over arbitrary graphs.
IEEE Signal Process. Lett. 26(8), 1247–1251 (2019)
51. C. Xi, V. Mai, E. Abed, U. Khan, Linear convergence in optimization over directed graphs with
row-stochastic matrices. IEEE Trans. Autom. Control 63(10), 3558–3565 (2018)
52. R. Xin, C. Xi, U. Khan, FROST-Fast row-stochastic optimization with uncoordinated step-
sizes. EURASIP J. Advanc. Signal Process. 2019(1), 1–14 (2019)
53. H. Li, Q. Lü, T. Huang, Convergence analysis of a distributed optimization algorithm with a
general unbalanced directed communication network. IEEE Trans. Netw. Sci. Eng. 6(3), 237–
248 (2019)
54. G. Qu, N. Li, Accelerated distributed Nesterov gradient descent. IEEE Trans. Autom. Control
65(6), 2566–2581 (2020)
55. D. Jakovetic, J. Xavier, J. Moura, Fast distributed gradient methods. IEEE Trans. Autom.
Control 59(5), 1131–1146 (2014)
56. H. Li, Q. Lü, X. Liao, T. Huang, Accelerated convergence algorithm for distributed constrained
optimization under time-varying general directed graphs. IEEE Trans. Syst. Man Cybern. Syst.
50(7), 2612–2622 (2020)
57. Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course (Springer Science
& Business Media, Berlin, 2013)
58. H. Wang, X. Liao, T. Huang, C. Li, Cooperative distributed optimization in multiagent
networks with delays. IEEE Trans. Syst. Man Cybern. Syst. 45(2), 363–369 (2015)
59. A. Defazio, On the curved geometry of accelerated optimization (2018). Preprint
arXiv:1812.04634
60. R. Horn, C. Johnson, Matrix Analysis (Cambridge University Press, New York, 2013)
61. T. Yang, Q. Lin, Z. Li, Unified convergence analysis of stochastic momentum methods for
convex and non-convex optimization (2016). Preprint arXiv:1604.03257
Chapter 2
Projection Algorithms for Distributed
Stochastic Optimization
Abstract This chapter focuses on introducing and solving the problem of composite constrained convex optimization with a sum of smooth convex functions and non-smooth regularization terms (ℓ1 norm) subject to locally general constraints. Each of the smooth objective functions is further regarded as the average of several constituent functions, a structure motivated by modern large-scale information processing problems in machine learning (where the samples of a training dataset are randomly distributed across multiple computing nodes). We present a novel computation-efficient distributed stochastic gradient algorithm that makes use of both the variance-reduction methodology and the distributed stochastic gradient projection method with constant step-size to solve the problem in a distributed manner. Theoretical study shows that the proposed algorithm can find the exact optimal solution in expectation when each constituent (smooth) function is strongly convex, provided that the constant step-size is less than an explicitly calculated upper bound. Compared with existing distributed methods, the proposed technique not only has a low computation cost in terms of the overall number of local gradient evaluations but is also suited to addressing general constrained optimization problems. Finally, numerical evidence is provided to show the appealing performance of the proposed algorithm.
2.1 Introduction
Given the limited computational and storage capacity of nodes, it has become unrealistic to handle large-scale tasks centrally on a single computing node [1]. Distributed optimization is a classic topic [2–9], yet it has recently aroused considerable interest in many emerging applications (large-scale tasks), such as parameter estimation [3, 4], network attacks [5], machine learning [6], IoT networks [7], and others. At least two facts [8] have contributed to this resurgence of interest: (a) recent developments in high-performance computing platforms
gradient tracking [38]. However, in practice, these methods converge slowly due to the large variance coming from the stochastic gradients and the adoption of a carefully tuned sequence of decaying step-sizes. To address this deficiency, various variance-reduction techniques have been leveraged in developing stochastic gradient descent methods, yielding representative centralized methods such as S2GD [39], SAG [40], SAGA [41], SVRG [42, 43], and SARAH [44]. The idea of the variance-reduction technique is to reduce the variance of the stochastic gradient and thereby substantially improve convergence.
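To make the variance-reduction idea concrete, an SVRG-style estimator [42] replaces the plain stochastic gradient ∇fj(x) by ∇fj(x) − ∇fj(x̃) + ∇f(x̃), where x̃ is a periodically refreshed snapshot; the estimator stays unbiased while its variance vanishes as x and x̃ approach the optimum. A minimal centralized sketch on a toy least-squares problem (data, step-size, and epoch counts are our own choices):

import numpy as np

# SVRG-style variance-reduced estimator on a toy least-squares problem
# f(x) = (1/m) sum_j 0.5 * (a_j^T x - y_j)^2. At a snapshot point x_tilde,
# the full gradient is stored; each inner step samples one index j.
rng = np.random.default_rng(4)
m, p = 50, 3
A = rng.standard_normal((m, p))
y = rng.standard_normal(m)
grad_j = lambda x, j: (A[j] @ x - y[j]) * A[j]   # constituent gradient
full_grad = lambda x: A.T @ (A @ x - y) / m

x = np.zeros(p)
step = 0.05
for epoch in range(30):
    x_tilde = x.copy()
    mu = full_grad(x_tilde)                      # snapshot full gradient
    for _ in range(m):
        j = rng.integers(m)
        # Unbiased estimator whose variance shrinks near the optimum:
        g = grad_j(x, j) - grad_j(x_tilde, j) + mu
        x = x - step * g
print(np.linalg.norm(full_grad(x)))              # close to zero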
Motivated by the centralized variance-reduced methods, distributed variance-reduced methods have been extensively studied and outperform their centralized counterparts in handling large-scale tasks. Of relevance to our work are the recent developments in [45] and [46]. The distributed stochastic averaging gradient method (DSA) proposed in [45] incorporates the variance-reduction technique of SAGA [41] into the algorithm design ideas of EXTRA [14]; it not only obtains the expected linear convergence for distributed stochastic optimization for the first time but also performs better than the previous works [14, 35] in dealing with machine learning problems. Similar works include DSBA [47], diffusion-AVRG [48], ADFS [49], SAL-Edge [50], GT-SAGA/GT-SVRG [2, 51, 52], and Network-DANE [8], utilizing various strategies. However, to the best of the authors' knowledge, no existing methods focus on solving general composite constrained convex optimization problems. Recently, the distributed neurodynamic-based consensus algorithm proposed in [46] was developed to solve the problem of a sum of smooth convex functions and ℓ1 norms subject to locally general constraints (linear equality, convex inequality, and bounded constraints), which generalizes the work in [53] to the case where the objective function and the constraint conditions are more general. In particular, based on Lyapunov stability theory, the method in [46] can achieve consensus at the global optimal solution with constant step-size. The work in [46] is insightful, but unfortunately, the algorithm does not take into account the high computational cost of evaluating the full gradient of the local objective function at each iteration.
In this chapter, we are concerned with solving the composite constrained convex
optimization problem whose objective is a sum of smooth convex functions and
non-smooth regularization terms (the $\ell_1$ norm), where each smooth objective
function is further composed of the average of several constituent functions and
the locally general constraints are constituted by linear equality, convex inequality,
and bounded constraints.
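Although the formal problem statement is given later in the chapter, a plausible
formalization consistent with the above description reads as follows, where the
symbols $n$, $d$, $q_i$, $\lambda$, $A_i$, $b_i$, $g_i$, and $\Omega_i$ are our
own illustrative notation:

$$\min_{x \in \mathbb{R}^d} \; \frac{1}{n}\sum_{i=1}^{n} f_i(x) + \lambda \|x\|_1, \qquad f_i(x) = \frac{1}{q_i}\sum_{j=1}^{q_i} f_{i,j}(x),$$

$$\text{s.t.} \quad A_i x = b_i, \quad g_i(x) \leq \mathbf{0}, \quad x \in \Omega_i, \quad i = 1, \ldots, n,$$

where each constituent function $f_{i,j}$ is smooth and only node $i$ has access
to $f_i$ and its local constraints.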
To this end, we propose a computation-efficient distributed stochastic gradient
algorithm that is adaptable and facilitates real-world applications. In general, the
novelties of the present work are summarized as follows:
(i) We propose and analyze a novel computation-efficient distributed stochastic
gradient algorithm by leveraging the variance-reduction technique and the
distributed stochastic gradient projection method with constant step-size. In
contrast with most existing distributed methods [29–33, 45, 47–51, 53], the
2.2 Preliminaries
2.2.1 Notation
If not particularly stated, the vectors mentioned in this chapter are column vectors.
Let $\mathbb{R}$, $\mathbb{R}^n$, and $\mathbb{R}^{m \times n}$ denote the set of
real numbers, $n$-dimensional real column vectors, and $m \times n$ real matrices,
respectively. The $n \times n$ identity matrix is denoted as $I_n$, and the two
column vectors of all ones and all zeros (of appropriate dimensions) are denoted as
$\mathbf{1}$ and $\mathbf{0}$, respectively. A quantity (possibly a vector) of node
$i$ is indexed by a subscript $i$; e.g., let $x_i^k$ be the estimate of node $i$ at
time $k$. We use $\chi_{\max}(A)$ and $\chi_{\min}(A)$ to represent the largest and
the smallest eigenvalues of a real symmetric matrix $A$, respectively. We let the
symbols $x^T$ and $A^T$ denote the transposes of a vector $x$ and a matrix $A$.
The Euclidean norm (of vectors) and the $\ell_1$ norm are denoted as $\|\cdot\|$
and $\|\cdot\|_1$, respectively. We let $\|x\|_A = \sqrt{x^T A x}$, where
$A \in \mathbb{R}^{n \times n}$ is a positive semi-definite matrix. The Kronecker
product and the Cartesian product are represented by the symbols $\otimes$ and
$\prod$, respectively. Given a random estimator $x$, the probability and expectation
are represented by $\mathbb{P}[x]$ and $\mathbb{E}[x]$, respectively. We utilize
$Z = \mathrm{diag}\{x\}$ to represent the diagonal matrix of a vector
$x = [x_1, x_2, \ldots, x_n]^T$, which satisfies $z_{ii} = x_i$ for all
$i = 1, \ldots, n$ and $z_{ij} = 0$ for all $i \neq j$. Denote
$(\cdot)^+ = \max\{0, \cdot\}$.
For a set $\Omega \subseteq \mathbb{R}^d$, the projection of a vector
$x \in \mathbb{R}^d$ onto $\Omega$ is denoted by $P_\Omega(x)$, i.e.,
$P_\Omega(x) = \arg\min_{y \in \Omega} \|y - x\|^2$. Notice that this projection
always exists and is unique if $\Omega$ is nonempty, closed, and convex [53].
Moreover, let $\Omega$ be a nonempty closed convex set; then the projection
operator $P_\Omega(\cdot)$ has the following properties: (a)
$(y - P_\Omega(y))^T (P_\Omega(y) - x) \geq 0$ for any $x \in \Omega$ and
$y \in \mathbb{R}^d$; and (b) $\|P_\Omega(y) - P_\Omega(x)\| \leq \|y - x\|$ for
any $x, y \in \mathbb{R}^d$.
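As a concrete illustration, the following Python sketch computes the Euclidean
projection onto a box constraint, where it reduces to coordinate-wise clipping, and
numerically checks the nonexpansiveness property (b); the constraint set, names,
and step-size are our own illustrative choices, not objects defined in this chapter.

import numpy as np

def project_box(y, lo, hi):
    # Euclidean projection onto the box {x : lo <= x <= hi}; for a box
    # the projection reduces to coordinate-wise clipping.
    return np.clip(y, lo, hi)

def projected_step(x, grad, step, lo, hi):
    # One projected (stochastic) gradient step: descend, then project,
    # so the iterate always remains feasible.
    return project_box(x - step * grad, lo, hi)

rng = np.random.default_rng(1)
x, y = rng.normal(size=5), rng.normal(size=5)
# Property (b): the projection operator is nonexpansive.
assert (np.linalg.norm(project_box(y, -1.0, 1.0) - project_box(x, -1.0, 1.0))
        <= np.linalg.norm(y - x) + 1e-12)

Nonexpansiveness is what keeps projected gradient iterations stable: projecting
after each step never increases the distance between two trajectories, while
feasibility is preserved at every iterate.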