Jonathan Schneider
SRE with Java Microservices
Patterns for Reliable Microservices
in the Enterprise
Beijing Boston Farnham Sebastopol Tokyo
978-1-492-07392-5
[GP]
SRE with Java Microservices
by Jonathan Schneider
Copyright © 2020 Jonathan Schneider. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional
sales department: 800-998-9938 or corporate@oreilly.com.
Acquisitions Editor: Melissa Duffield
Development Editor: Melissa Potter
Production Editor: Deborah Baker
Copyeditor: JM Olejarz
Proofreader: Amanda Kersey
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: O’Reilly Media, Inc.
October 2020: First Edition
Revision History for the First Edition
2020-08-26: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492073925 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. SRE with Java Microservices, the cover
image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher’s views.
While the publisher and the author have used good faith efforts to ensure that the information and
instructions contained in this work are accurate, the publisher and the author disclaim all responsibility
for errors or omissions, including without limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is subject to open source
licenses or the intellectual property rights of others, it is your responsibility to ensure that your use
thereof complies with such licenses and/or rights.
Table of Contents
Foreword
Preface
1. The Application Platform
Platform Engineering Culture
Monitoring
Monitoring for Availability
Monitoring as a Debugging Tool
Learning to Expect Failure
Effective Monitoring Builds Trust
Delivery
Traffic Management
Capabilities Not Covered
Testing Automation
Chaos Engineering and Continuous Verification
Configuration as Code
Encapsulating Capabilities
Service Mesh
Summary
2. Application Metrics
Black Box Versus White Box Monitoring
Dimensional Metrics
Hierarchical Metrics
Micrometer Meter Registries
Creating Meters
Naming Metrics
Common Tags
Classes of Meters
Gauges
Counters
Timers
“Count” Means “Throughput”
“Count” and “Sum” Together Mean “Aggregable Average”
Maximum Is a Decaying Signal That Isn’t Aligned to the Push Interval
The Sum of Sum Over an Interval
The Base Unit of Time
Using Timers
Common Features of Latency Distributions
Percentiles/Quantiles
Histograms
Service Level Objective Boundaries
Distribution Summaries
Long Task Timers
Choosing the Right Meter Type
Controlling Cost
Coordinated Omission
Load Testing
Meter Filters
Deny/Accept Meters
Transforming Metrics
Configuring Distribution Statistics
Separating Platform and Application Metrics
Partitioning Metrics by Monitoring System
Meter Binders
Summary
3. Debugging with Observability
The Three Pillars of Observability…or Is It Two?
Logs
Distributed Tracing
Metrics
Which Telemetry Is Appropriate?
Components of a Distributed Trace
Types of Distributed Tracing Instrumentation
Manual Tracing
Agent Tracing
Framework Tracing
Service Mesh Tracing
Blended Tracing
Sampling
No Sampling
Rate-Limiting Samplers
Probabilistic Samplers
Boundary Sampling
Impact of Sampling on Anomaly Detection
Distributed Tracing and Monoliths
Correlation of Telemetry
Metric to Trace Correlation
Using Trace Context for Failure Injection and Experimentation
Summary
4. Charting and Alerting
Differences in Monitoring Systems
Effective Visualizations of Service Level Indicators
Styles for Line Width and Shading
Errors Versus Successes
“Top k” Visualizations
Prometheus Rate Interval Selection
Gauges
Counters
Timers
When to Stop Creating Dashboards
Service Level Indicators for Every Java Microservice
Errors
Latency
Garbage Collection Pause Times
Heap Utilization
CPU Utilization
File Descriptors
Suspicious Traffic
Batch Runs or Other Long-Running Tasks
Building Alerts Using Forecasting Methods
Naive Method
Single-Exponential Smoothing
Universal Scalability Law
Summary
5. Safe, Multicloud Continuous Delivery
Types of Platforms
Resource Types
Delivery Pipelines
Packaging for the Cloud
Packaging for IaaS Platforms
Packaging for Container Schedulers
The Delete + None Deployment
The Highlander
Blue/Green Deployment
Automated Canary Analysis
Spinnaker with Kayenta
General-Purpose Canary Metrics for Every Microservice
Summary
6. Source Code Observability
The Stateful Asset Inventory
Release Versioning
Maven Repositories
Build Tools for Release Versioning
Capturing Resolved Dependencies in Metadata
Capturing Method-Level Utilization of the Source Code
Structured Code Search with OpenRewrite
Dependency Management
Version Misalignments
Dynamic Version Constraints
Unused Dependencies
Undeclared Explicitly Used Dependencies
Summary
7. Traffic Management
Microservices Offer More Potential Failure Points
Concurrency of Systems
Platform Load Balancing
Gateway Load Balancing
Join the Shortest Queue
Instance-Reported Availability and Utilization
Health Checks
Choice of Two
Instance Probation
Knock-On Effects of Smarter Load Balancing
Client-Side Load Balancing
Hedge Requests
Call Resiliency Patterns
Retries
Rate Limiters
Bulkheads
Circuit Breakers
Adaptive Concurrency Limits
Choosing the Right Call Resiliency Pattern
Implementation in Service Mesh
Implementation in RSocket
Summary
Index
Foreword
“To production and beyond!”
—Buzz Lightyear (paraphrasing)
I know Buzz said “to infinity and beyond,” but that whole notion never sat well with
me as a child. How could you go beyond infinity? It was only later in life, when I
became a software engineer, that it dawned on me—software is never complete. It’s
never finished. It’s...infinite. Buzz missed his calling in software!
Software has no end. Software is like the oceans, the stars, and the bugs in your code:
endless! Hopefully, that’s not controversial. The last few decades have seen all of us in
the software field pivot around this insight: the endless tail of software maintenance is
the most expensive part of what we do. A good deal of the significant movements in
software figure around that. Testing and continuous integration. Continuous delivery.
Cloud computing. Microservices. It’s not hard to get to production the first time, but
these practices optimize for the many subsequent trips to production. They optimize
for day two. They optimize for cycle time: how quickly can you take an idea and see it
delivered into production, from concept to customer? They optimize for “and
beyond.”
This insight—that software has no end—introduces a ton of new practices and puts
the lie to as many existing practices. It changes the focus from the initial development
and MVP to the maintenance and management of that software. The focus is on
production.
I love production. You should love production. You should go to production, as early
and often as possible. Bring the kids. Bring the family. The weather’s amazing.
Great software engineers build for production and for the return trip. They build for
the endless journey. It is no longer acceptable to wring our collective hands and
throw our code over the proverbial wall: “Well, it works on my machine! Now you
deploy it!” There’s a whole new frontier out there, production, that demands a
different set of skills. Site reliability engineering (SRE) is, according to Ben Treynor,
founder of Google’s site reliability team, “what happens when a software engineer is
tasked with what used to be called operations.” SREs share a lot of skills with
traditional software engineers, but they target them differently, with an eye on production.
Spring Boot developers can and should develop SRE skills, and few know the ropes
better than this book’s author, Jonathan Schneider. Spring Boot succeeds by being
laser-focused on production. It is extraordinary because it, along with frameworks
like Dropwizard, is built stem to stern for production. It supports easy aggregation of
metrics with Micrometer, so-called fat .jar deployments, the Actuator management
endpoints, application life cycle events, 12 Factor-style configuration, and more.
Spring Cloud is a set of Spring Boot extensions designed to support the microservices
architecture. And all that is to say nothing of the rich platform support. Spring Boot
is container-native. It and platforms like Cloud Foundry or Kubernetes are two sides
of the same coin. Spring Boot supports graceful shutdown, health groups, liveness
and readiness probes, Docker image generation (leveraging CNCF buildpacks), and
so much more.
And I’ll bet you didn’t know about most of those features! But Jonathan knows.
Jonathan lives and breathes this stuff. He created the Micrometer project, a dimensional
metrics framework that supports dozens of different metrics and monitoring
platforms, and then helped integrate it into Spring Boot’s Actuator module, and into
countless other third-party open source projects. I’ve watched him work on metrics,
continuous delivery tools like Spinnaker, and observability tools like Micrometer, and
generally pave the path to production for others. He’s got years of experience in
leveraging Spring and Spring Boot at a global scale, and while he can sling Spring Data
repositories and craft HTTP APIs with the best of them, his genius is in the way he
builds for that endless journey.
And now we can learn from him in this book.
Chapter 1 is a manifesto of sorts—read this to get in the right frame of mind for the
book. I should’ve read this chapter first!
I should’ve, but I didn’t. I skipped ahead to Chapter 2, which introduces
instrumentation and metrics. While I was eager to read the book cover to cover, I was most
excited about this chapter. It’s not surprising that the creator of Micrometer could so well
articulate the concepts in this chapter. Chapters 2 through 4 are a few hours very well
spent. They’re brilliant. 5/5: would (and did) read again.
Chapter 5 introduces the clouds, core concepts, types of platforms, and patterns
unique to these platforms. This chapter was one of the most insightful for me. It starts
slowly, and the next thing you know, you’re up to your deployment scripts in a
game-changing discussion of continuous delivery, canary analysis, and more. Read this
chapter twice.
Chapter 6 gives you a framework and specific solutions to understand your codebase
and its dependencies. I’ve never seen all these concerns presented in a comprehensive
framework like this.
Chapter 7 is a fitting closing chapter to an amazing book in that it looks at the
interactions of deployed services in production. By this point you’ll have learned how to
get the software to production and how to observe services and their source code.
This chapter is all about the service interactions and the dynamics of load on an
architecture.
I learned something new in every chapter. The book is full of the wisdom of a true
cloud native, and one that I think the community needs. Jonathan is a fantastic guide
on the endless journey to production...and beyond.
— Josh Long (@starbuxman)
Spring Developer Advocate
Spring team, VMware
San Francisco, CA
July 2020
Preface
This book presents a phased approach to building and deploying more reliable Java
microservices. The capabilities presented in each chapter are meant to be followed in
order, each building upon the capabilities of earlier chapters. There are five phases in
this journey:
1. Measure and monitor your services for availability.
2. Add debuggability signals that allow you to ask questions about periods of
unavailability.
3. Improve your software delivery pipeline to limit the chance of introducing more
failure.
4. Build the capability to observe the state of deployed assets all the way down to
source code.
5. Add just enough traffic management to bring your services up to a level of
availability you are satisfied with.
Our goal isn’t to build a perfect system, to eliminate all failure. Our goal is to end with
a highly reliable system and avoid spending time in the space of diminishing returns.
Avoiding diminishing returns is why we will spend so much time talking about
effective measurement and monitoring, and why this discipline precedes all others.
If you are in engineering management, Chapter 1 is your mission statement: to build
an application platform renowned for its reliability and the culture of an effective
platform engineering team that can deliver these capabilities to a broader engineering
organization.
The chapters that follow contain the blueprints for achieving this mission, targeted at
engineers. This book is intentionally narrowed in scope to Java microservices
precisely so that I can offer detailed advice on how to go about this, including specific
measurements battle-tested for Java microservices, code samples, and other
idiosyncrasies like dependency management concerns that are unique to the Java
virtual machine (JVM). Its focus is on immediate actionability.
My Journey
My professional journey in software engineering forms an arc that led me to write
this book:
• A scrappy custom software startup
• A traditional insurance company called Shelter Insurance in Missouri
• Netflix in Silicon Valley
• A Spring team engineer working remotely
• A Gradle engineer
When I left Shelter Insurance, despite my efforts, I didn’t understand public cloud. In
almost seven years there, I had interacted with the same group of named virtual
machines (bare metal actually, originally). I was used to quarterly release cycles and
extensive manual testing before releases. I felt like leaders emphasized and
reemphasized how “hard” we expected code freezes to be leading up to releases, how after a
release a code freeze wasn’t as hard as we would have liked, etc. I had never
experienced production monitoring of an application; that was the responsibility of a
network operations center, a room my badge didn’t provide access to because I didn’t
need to know what happened there. This organization was successful by most
measures. It has changed significantly in some ways since then, and little in others. I’m
thankful for the opportunity to have learned under some fantastic engineers there.
At Netflix I learned valuable lessons about engineering and culture. I left after a time
with a great sense of hope that some of these same practices could be applied to a
company like Shelter Insurance, and joined the Spring team. When I founded the
open source metrics library Micrometer, it was with a deep appreciation of the fact
that organizations are on a journey. Rather than supporting just the best-in-class
monitoring systems of today, Micrometer’s first five monitoring system
implementations contained three legacy monitoring systems that I knew were still in significant
use.
A couple of years working with and advising enterprises of various sizes on
application monitoring and delivery automation with Spinnaker gave me an idea of both the
diversity of organizational dynamics and their commonalities. It is my understanding
of the commonalities, those practices and techniques that every enterprise could
benefit from, that forms the substance of this book. Every enterprise Java organization can
apply these techniques, given a bit of time and practice. That includes your
organization.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program
elements such as variable or function names, databases, data types, environment
variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values
determined by context.
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
O’Reilly Online Learning
For more than 40 years, O’Reilly Media has provided
technology and business training, knowledge, and insight to help
companies succeed.
Our unique network of experts and innovators share their knowledge and expertise
through books, articles, and our online learning platform. O’Reilly’s online learning
platform gives you on-demand access to live training courses, in-depth learning
paths, interactive coding environments, and a vast collection of text and video from
O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata and any additional
information. You can access this page at https://oreil.ly/SRE_with_Java_Microservices.
Email bookquestions@oreilly.com to comment or ask technical questions about this
book.
For news and information about our books and courses, visit http://oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Acknowledgments
Olga Kundzich
What I didn’t know before writing this book is how much voices from an author’s
circle of colleagues find their way into a book. It makes complete sense, of course.
We influence each other simply by working together! Olga’s insightful views on a
wide range of topics have probably had the greatest single influence on my
thinking in the last couple of years, and her voice is everywhere in this book (or at
least the best approximation of it I can represent). Thoughts you’ll find on “the
application platform,” continuous delivery (no no, not continuous deployment—I
kept confusing the two), asset inventory, monitoring, and elements of traffic
management are heavily influenced by her. Thank you Olga for investing so
much energy into this book.
Troy Gaines
To Troy I owe my initial introduction to dependency management, build
automation, continuous integration, unit testing, and so many other essential skills.
He was an early and significant influence in my growth as a software developer,
as I know he has been to many others. Thank you, old friend, for taking the time
to review this work as well.
Tommy Ludwig
Tommy is one of the rare telemetry experts who contribute to both distributed
tracing and aggregated metrics technologies. It is so common that contributors in
the observability space are hyper-focused on one area of it, and Tommy is one of
the few who float between them. To put it mildly, I dreaded Tommy’s review of
Chapter 3, but was happy to find that we had more in common on this than I
expected. Thanks for pointing out the more nuanced view of distributed tracing
tag cardinality that made its way into Chapter 3.
Sam Snyder
I haven’t known Sam for long, but it didn’t take long for me to understand that
Sam is an excellent mentor and patient teacher. Thank you Sam for agreeing to
subject yourself to the arduous task of reviewing a technical book, and leaving so
much positive and encouraging feedback.
Mike McGarr
I received an email out of the blue from Mike in 2014 that, a short time later,
resulted in me packing everything up and moving to California. That email set
me on a course that changed everything. I came to know so many experts at
Netflix that accelerated me through the learning process because Mike took a chance
on me. It radically changed the way I view software development and operations.
Mike is also just a fantastic human being—a kind and inquisitive friend and
leader. Thanks, Mike.
Josh Long
Once in the book, I quoted a typical Josh Long phrase about there being “no
place like” production. I thought I was being cheeky and fun. And then Josh
wrote a foreword that features Buzz Lightyear…Josh is an unstoppable ball of
energy. Thank you Josh for injecting a bit of that energy into this work.
CHAPTER 1
The Application Platform
Martin Fowler and James Lewis, who initially proposed the term microservices, define
the architecture in their seminal blog post as:
…a particular way of designing software applications as suites of independently
deployable services. While there is no precise definition of this architectural style,
there are certain common characteristics around organization around business
capability, automated deployment, intelligence in the endpoints, and decentralized control
of languages and data.
Adopting microservices promises to accelerate software development by separating
applications into independently developed and deployed components produced by
independent teams. It reduces the need to coordinate and plan large-scale software
releases. Each microservice is built by an independent team to meet a specific
business need (for internal or external customers). Microservices are deployed in a
redundant, horizontally scaled way across different cloud resources and communicate
with each other over the network using different protocols.
A number of challenges arise due to this architecture that haven’t been seen
previously in monolithic applications. Monolithic applications used to be primarily
deployed on the same server and infrequently released as a carefully choreographed
event. The software release process was the main source of change and instability in
the system. In microservices, communications and data transfer costs introduce
additional latencies and the potential to degrade end-user experience. A chain of tens or
hundreds of microservices now works together to create that experience. Microservices
are released independently of each other, but each one can inadvertently impact other
microservices and therefore the end-user experience, too.
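To get a feel for how quickly a long call chain can erode the end-user experience, consider a rough calculation. The sketch below rests on assumptions that are not taken from any measured system: a user request passes serially through every service in the chain, each service succeeds independently of the others, and all of them have the same availability. The class and method names are hypothetical.

```java
// Illustrative sketch: how per-service availability compounds across a serial
// call chain. Assumes independent failures and that every service in the
// chain must succeed for the user request to succeed.
final class ChainAvailability {

    /** Probability that all calls succeed when each succeeds with the given availability. */
    static double serialAvailability(double perServiceAvailability, int services) {
        if (perServiceAvailability < 0.0 || perServiceAvailability > 1.0) {
            throw new IllegalArgumentException("availability must be in [0, 1]");
        }
        if (services < 0) {
            throw new IllegalArgumentException("services must not be negative");
        }
        // Independent events multiply, so the chain succeeds with probability a^n.
        return Math.pow(perServiceAvailability, services);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 100}) {
            System.out.printf("%d services at 99.9%% each: %.2f%% end to end%n",
                    n, 100 * serialAvailability(0.999, n));
        }
    }
}
```

Under these assumptions, 100 services that each succeed 99.9% of the time yield only about 90.5% end-to-end success. Real systems violate the independence assumption in both directions (shared dependencies correlate failures, while retries and fallbacks mask them), so treat the number as intuition rather than a forecast.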
Managing these types of distributed systems requires new practices, tools, and
engineering culture. Accelerating software releases doesn’t need to come at the cost of
stability and safety. In fact, these go hand in hand. This chapter introduces the culture
of an effective platform engineering team and describes the basic building blocks of
reliable systems.
Platform Engineering Culture
To manage microservices, an organization needs to standardize specific
communication protocols and supporting frameworks. Many inefficiencies arise if each team
needs to maintain its own full development stack, as does friction when
communicating with other parts of a distributed application. In practice, standardization leads to
a platform team that is focused on providing these services to the rest of the teams,
who are in turn focused on developing software to meet business needs.
We want to provide guardrails, not gates.
—Dianne Marsh, director of engineering tools at Netflix
Instead of building gates, allow teams to build solutions that work for them first,
learn from them, and generalize to the rest of the organization.
Organizations which design systems are constrained to produce designs which are
copies of the communication structures of these organizations.
—Conway’s Law
Figure 1-1 shows an engineering organization built around specialties. One group
specializes in user interface and experience design, another building backend
services, another managing the database, another working on business process
automation, and another managing network resources.
Figure 1-1. Organization built around technical silos
The lesson often taken from Conway’s Law is that cross-functional teams, as in
Figure 1-2, can iterate faster. After all, when team structure is aligned to technical
specialization, any new business requirement will require coordination across all of
these specializations.
Figure 1-2. Cross-functional teams
There is obviously waste in this system as well, though: specialists on
each team are developing capabilities independently of one another. Netflix did not
have dedicated site reliability engineers per team, as Google promotes in Site
Reliability Engineering edited by Betsy Beyer et al. (O’Reilly). Perhaps because of a greater
degree of homogeneity in the type of software being written by product teams (mostly
Java, mostly stateless horizontally scaled microservices), the centralization of product
engineering functions was more efficient. Does your organization more resemble
Google, working on very different types of products from automated cars to search to
mobile hardware to browsers? Or does it more resemble Netflix, composed of a series
of business applications written in a handful of languages running on a limited
variety of platforms?
Cross-functional teams and completely siloed teams are just the opposite ends of a
spectrum. Effective platform engineering can reduce the need for a specialist per
team for some set of problems. An organization with dedicated platform engineering
is more of a hybrid, like in Figure 1-3. A central platform engineering team is
strongest when it views product teams as customers that need to be constantly won over
and exercises little to no control over the behavior of its customers.
Figure 1-3. Product teams with dedicated platform engineering
For example, when monitoring instrumentation is distributed throughout the
organization as a common library included in each microservice, it shares the hard-won
knowledge of availability indicators known to be broadly applicable. Each product
team can spend just a little time adding availability indicators that are unique to its
business domain. It can communicate with the central monitoring team for
information and advice on how to build effective signals as necessary.
At Netflix, the strongest cultural current was “freedom and responsibility,” defined in
a somewhat famous culture deck from 2009. I was a member of the engineering tools
team but we could not require that everyone else adopt a particular build tool. A
small team of engineers managed Cassandra clusters on behalf of many product
teams. There is an efficiency to this concentration of build tool or Cassandra skill, a
natural communication hub through which undifferentiated problems with these
products flowed and lessons were transferred to product-focused teams.
The build tools team at Netflix, at its smallest point, was just two engineers serving
the interests of roughly 700 other engineers while transitioning between recom‐
mended build tools (Ant to Gradle) and performing two major Java upgrades (Java 6
to 7 and then Java 7 to 8), among other daily routines. Each product team completely
owned its build. Because of “freedom and responsibility,” we could not set a hard date
for when we would completely retire Ant-based build tooling. We could not set a
hard date for when every team had to upgrade its version of Java (except to the extent
that a new Oracle licensing model did this for us). The cultural imperative drove us to
4 | Chapter 1: The Application Platform
focus so heavily on developer experience that product teams wanted to migrate with
us. It required a level of effort and empathy that could only be guaranteed by abso‐
lutely preventing us from setting hard requirements.
When a platform engineer like myself serves the interests of so many diverse product
teams in a focused technical speciality like build tooling, inevitably patterns emerge.
My team saw the same script playing out over and over again with binary dependency
problems, plug-in versioning, release workflow problems, etc. We worked initially to
automate the discovery of these patterns and emit warnings in build output. Without
the freedom-and-responsibility culture, perhaps we would have skipped warnings
and just failed the build, requiring product teams to fix issues. This would have been
satisfying to the build tools team—we wouldn’t be responsible for answering ques‐
tions related to failures that we tried to warn teams about. But from the product team
perspective, every “lesson” the build tools team learned would be disruptive to them
at random points in time, and especially disruptive when they had more pressing (if
temporary) priorities.
The softer, non-failing warning approach was shockingly ineffective. Teams rarely
paid any attention to successful build logs, regardless of how many warnings were
emitted. And even if they did see the warnings, attempting to fix them incurred risk: a
working build with warnings is better than a misbehaving one without warnings. As a
result, carefully crafted deprecation warnings could go ignored for months or years.
The “guardrails not gates” approach required our build tools team to think about how
we could share our knowledge with product teams in a way that was visible to them,
required little time and effort to act on, and reduced the risk of coming along with us
on the paved path. The tooling that emerged from this was almost over the top in its
focus on developer experience.
First, we wrote tooling that could rewrite the Groovy code of Gradle builds to autore‐
mediate common patterns. This was much more difficult than just emitting warnings
in the log. It required making indentation-preserving abstract syntax
tree modifications to imperative build logic, an impossible problem to solve in gen‐
eral, but surprisingly effective in specific cases. Autoremediation was opt-in though,
through the use of a simple command that product teams could run to accept
recommendations.
Next, we wrote monitoring instrumentation that reported patterns that were poten‐
tially remediable but for which product teams did not accept the recommendation.
We could monitor each harmful pattern in the organization over time and watch it
decline in impact as teams accepted remediations. When we reached the long tail of
a small number of teams that just wouldn’t opt in, we knew who they were, so we
could walk over to their desks and work with them one on one to hear their concerns
and help them move forward. (I did this enough that I started carrying my own
mouse around. There was a suspicious correlation between Netflix engineers who
used trackballs and Netflix engineers who were on the long tail of accepting remedia‐
tions.) Ultimately, this proactive communication established a bond of trust that
made future recommendations from us seem less risky.
We went to fairly extreme lengths to improve the visibility of recommendations
without resorting to breaking builds to get developers’ attention. Build output was
carefully colorized and stylized, sometimes with visual indicators like Unicode check
marks and X marks that were hard to miss. Recommendations always appeared at the
end of the build because we knew that they were the last thing emitted on the termi‐
nal and our CI tools by default scrolled to the end of the log output when engineers
examined build output. We taught Jenkins how to masquerade as a TTY terminal to
colorize build output but ignore cursor movement escape sequences to still serialize
build task progress.
Crafting this kind of experience was technically costly, but compare it with the two
options:
Freedom and responsibility culture
Led us to build self-help autoremediation with monitoring that helped us under‐
stand and communicate with the teams that struggled.
Centralized control culture
We probably would have been led to break builds eagerly because we “owned” the
build experience. Teams would have been distracted from their other priorities to
accommodate our desire for a consistent build experience. Every change, because
it lacked autoremediation, would have generated far more questions to us as the
build tools team. The total amount of toil for every change would have been far
greater.
An effective platform engineering team cares deeply about developer experience, a
singular focus that is at least as keen as the focus product teams place on customer
experience. This should be no surprise: in a well-calibrated platform engineering
organization, developers are the customer! The presence of a healthy product man‐
agement discipline, expert user experience designers, and UI engineers and designers
that care deeply about their craft should all be indicators of a platform engineering
team that is aligned for the benefit of their customer developers.
More detail on team structure is out of the scope of this book, but refer to Team Top‐
ologies by Matthew Skelton and Manuel Pais (IT Revolution Press) for a thorough
treatment of the topic.
Once the team is culturally calibrated, the question becomes how to prioritize capa‐
bilities that a platform engineering team can deliver to its customer base. The remain‐
der of this book is a call to action, delivered in capabilities ordered from (in my view)
most essential to less essential.
Monitoring
Monitoring your application infrastructure requires the least organizational commit‐
ment of all the stages on the journey to more resilient systems. As we’ll show in the
subsequent chapters, framework-level monitoring instrumentation has matured to
such an extent that you really just need to turn it on and start taking advantage. The
cost-benefit ratio has been skewed so heavily toward benefit that if you do nothing
else in this book, start monitoring your production applications now. Chapter 2 will
discuss metrics building blocks, and Chapter 4 will provide the specific charts and
alerts you can employ, mostly based on instrumentation that Java frameworks pro‐
vide without you having to do any additional work.
Metrics, logs, and distributed tracing are three forms of observability that enable the
measure of service availability and aid in debugging complex distributed systems
problems. Before going further in the workings of any of these, it is useful to under‐
stand what capabilities each enables.
Monitoring for Availability
Availability signals measure the overall state of the system and whether that system is
functioning as intended in the large. Availability is quantified by service level indicators (SLIs).
These indicators include signals for the health of the system (e.g., resource consump‐
tion) and business metrics like number of sandwiches sold or streaming video starts
per second. SLIs are tested against a threshold called a service level objective (SLO)
that sets an upper or lower bound on the range of an SLI. An SLO, in turn, is somewhat
more restrictive or conservative than the threshold you agree upon with your business
partners for the level of service you are expected to provide, which is known as a
service level agreement (SLA). The idea is that an SLO should provide
some amount of advance warning of an impending violation of an SLA so that you
don’t actually get to the point where you violate that SLA.
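The relationship between an SLI, its SLO, and the looser SLA can be sketched as a simple threshold test. This is a minimal illustration only: the class name, the status names, and the numbers used with it are assumptions made for the example, not values or APIs from this book or any library.

```java
// Hypothetical helper: classifies a measured SLI (here, a latency in
// milliseconds) against an SLO and the less restrictive SLA behind it.
final class SloCheck {
    enum Status { HEALTHY, SLO_BREACHED, SLA_BREACHED }

    static Status evaluate(double sliMillis, double sloMillis, double slaMillis) {
        // The SLO is meant to be the more conservative of the two thresholds.
        if (sloMillis > slaMillis) {
            throw new IllegalArgumentException("SLO must not be looser than the SLA");
        }
        if (sliMillis >= slaMillis) {
            return Status.SLA_BREACHED;   // the agreement itself is violated
        }
        if (sliMillis >= sloMillis) {
            return Status.SLO_BREACHED;   // advance warning: SLA still intact
        }
        return Status.HEALTHY;
    }
}
```

With an assumed SLO of 100 ms and an assumed SLA of 150 ms, a measurement of 120 ms breaches the SLO but not the SLA, which is exactly the advance-warning window the SLO exists to provide.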
Metrics are the primary observability tool for measuring availability. They are a
measure of SLIs. Metrics are the most common availability signal because they repre‐
sent an aggregation of all activity happening in the system. They are cheap enough to
not require sampling (discarding some portion of the data to limit overhead), which
risks discarding important indicators of unavailability.
Metrics are numerical values arranged in a time series representing a sample at a par‐
ticular time or an aggregate of individual events that have occurred in an interval:
1 I first learned of the USE criteria from Brendan Gregg’s description of his method for monitoring Unix sys‐
tems. In that context, latency measurement isn’t as granular, thus the missing L.
Metrics
Metrics should have a fixed cost irrespective of throughput. For example, a metric
that counts executions of a particular block of code should only ship the number
of executions seen in an interval regardless of how many there are. By this I mean
that a metric should ship “N requests were observed” at publish time, not “I saw a
request N distinct times” throughout the publishing interval.
Metrics data
Metrics data cannot be used to reason about the performance or function of any
individual request. Metrics telemetry trades off reasoning about an individual
request for the application’s behavior across all requests in an interval.
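The fixed-cost property can be illustrated with a counter that accumulates in memory and ships a single number per publishing interval. This is a minimal sketch using only the JDK; the class and method names are hypothetical and do not come from any metrics library.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical interval counter: recording is cheap, and the amount of
// data shipped per interval is one number regardless of throughput.
final class IntervalCounter {
    private final LongAdder count = new LongAdder();

    // Called on every observed request.
    void increment() {
        count.increment();
    }

    // Called once per publishing interval: returns "N requests were
    // observed" and resets the count for the next interval.
    long publishAndReset() {
        return count.sumThenReset();
    }
}
```

Whether the counter sees ten requests or ten million in an interval, the publisher still ships one value, which is why this style of telemetry does not need sampling.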
To effectively monitor the availability of a Java microservice, a variety of availability
signals need to be monitored. Common signals are given in Chapter 4, but in general
they fall into four categories, known together as the L-USE method:1
Latency
This is a measure of how much time was spent executing a block of code. For the
common REST-based microservice, REST endpoint latency is a useful measure of
the availability of the application, particularly max latency. This will be discussed
in greater detail in “Latency” on page 153.
Utilization
A measure of how much of a finite resource is consumed. Processor utilization is
a common utilization indicator. See “CPU Utilization” on page 170.
Saturation
Saturation is a measurement of extra work that can’t be serviced. “Garbage Col‐
lection Pause Times” on page 161 shows how to measure the Java heap, which
during times of excessive memory pressure leads to a buildup of work that can‐
not be completed. It’s also common to monitor pools like database connection
pools, request pools, etc.
Errors
In addition to looking at purely performance-related concerns, it is essential to
find a way to quantify the error ratio relative to total throughput. Measurements
of error include unanticipated exceptions yielding unsuccessful HTTP responses
on a service endpoint (see “Errors” on page 148), but also more indirect meas‐
ures like the ratio of requests attempted against an open circuit breaker (see “Cir‐
cuit Breakers” on page 280).
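As an illustration of quantifying errors relative to total throughput, the following sketch computes an error ratio and tests it against an upper bound. The class name, method names, and the choice to treat an interval with no traffic as a ratio of zero are assumptions made for this example.

```java
// Hypothetical helper for an error-ratio SLI.
final class ErrorRatio {
    // Ratio of failed requests to all requests seen in an interval.
    static double of(long errors, long total) {
        if (errors < 0 || total < 0 || errors > total) {
            throw new IllegalArgumentException("need 0 <= errors <= total");
        }
        // Assumption: no traffic in the interval counts as no errors.
        return total == 0 ? 0.0 : (double) errors / total;
    }

    // True when the ratio is strictly below the given upper bound.
    static boolean meetsSlo(long errors, long total, double maxRatio) {
        return of(errors, total) < maxRatio;
    }
}
```

A ratio is usually easier to alert on than a raw error count, because a fixed count threshold means something very different at 100 requests per minute than at 100,000.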
Utilization and saturation may seem similar at first, and internalizing the difference
will have an impact on how you think about charting and alerting on resources that
can be measured both ways. A great example is JVM memory. You can measure JVM
memory as a utilization metric by reporting on the amount of bytes consumed in
each memory space. You can also measure JVM memory in terms of the proportion
of time spent garbage collecting it relative to doing anything else, which is a measure
of saturation. In most cases, when both utilization and saturation measurements are
possible, the saturation metric leads to better-defined alert thresholds. It’s hard to
alert when memory utilization exceeds 95% of a space (because garbage collection
will bring that utilization rate back below this threshold), but if memory utilization
routinely and frequently exceeds 95%, the garbage collector will kick in more fre‐
quently, more time will be spent proportionally doing garbage collection than any‐
thing else, and the saturation measurement will thus be higher.
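One way to approximate the saturation view of JVM memory is to sample the accumulated garbage collection time exposed by the JDK's management beans and divide the change between two samples by the elapsed wall-clock time. This is a rough sketch, not the instrumentation discussed later in the book: the class and method names are hypothetical, and the accumulated collection time reported by the JVM is only an approximation of pause time for collectors that do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical sampler for GC-based memory saturation.
final class GcSaturation {
    // Total accumulated collection time across all collectors, in ms.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Proportion of an interval spent collecting, given the change in
    // accumulated GC time and the elapsed time between two samples.
    static double proportion(long gcMillisDelta, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillisDelta / elapsedMillis;
    }
}
```

Sampling `totalGcMillis()` at the start and end of each publishing interval and passing the difference to `proportion` yields a value between 0 and roughly 1 that rises steadily under memory pressure, which makes for a more stable alert condition than a momentary utilization spike.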
Some common availability signals are listed in Table 1-1.
Table 1-1. Examples of availability signals
SLI                               SLO                                     L-USE criteria
Process CPU usage                 <80%                                    Utilization
Heap utilization                  <80% of available heap space            Utilization
Error ratio for a REST endpoint   <1% of total requests to the endpoint   Errors
Max latency for a REST endpoint   <100 ms                                 Latency
Google has a much more prescriptive view on how to use SLOs.
Google’s approach to SLOs
Site Reliability Engineering by Betsy Beyer et al. (O’Reilly) presents service availability
as a tension between competing organizational imperatives: to deliver new features
and to run the existing feature set reliably. It proposes that product teams and dedica‐
ted site reliability engineers agree on an error budget that provides a measurable
objective for how unreliable a service is allowed to be within a given window of time.
Exceeding this objective should refocus the team on reliability over feature develop‐
ment until the objective is met.
The Google view on SLOs is explained in great detail in the “Alerting on SLOs” chap‐
ter in The Site Reliability Workbook edited by Betsy Beyer et al. (O’Reilly). Basically,
Google engineers alert on the probability that an error budget is going to be depleted
in any given time frame, and they react in an organizational way by shifting engineer‐
ing resources from feature development to reliability as necessary. The word “error”
in this case means exceeding any SLO. This might mean exceeding an acceptable ratio
of server failed outcomes in a RESTful microservice, but could also mean exceeding
an acceptable latency threshold, getting too close to overwhelming file descriptors on
the underlying operating system, or any other combination of measurements. With
this definition, the time that a service is unreliable in a prescribed window is the pro‐
portion when one or more SLOs were not being met.
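Under this definition, the arithmetic behind an error budget is a proportion of a time window. The sketch below is a simplified illustration: the class name, the use of whole minutes, and the sample numbers are assumptions, and it does not reproduce the probability-based alerting described in Google's books.

```java
// Hypothetical error-budget arithmetic over a fixed window.
final class ErrorBudget {
    // Fraction of the window during which one or more SLOs were not met.
    static double unreliableFraction(long minutesOutOfSlo, long windowMinutes) {
        if (windowMinutes <= 0 || minutesOutOfSlo < 0 || minutesOutOfSlo > windowMinutes) {
            throw new IllegalArgumentException("need 0 <= minutesOutOfSlo <= windowMinutes");
        }
        return (double) minutesOutOfSlo / windowMinutes;
    }

    // True when the unreliable fraction exceeds the agreed allowance.
    static boolean budgetExceeded(long minutesOutOfSlo, long windowMinutes, double allowedFraction) {
        return unreliableFraction(minutesOutOfSlo, windowMinutes) > allowedFraction;
    }
}
```

For example, with an assumed 30-day window (43,200 minutes) and an assumed allowance of 0.1%, 30 minutes out of SLO stays within budget, while 60 minutes exceeds it.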
Your organization doesn’t need to have separate functions for product engineer and
site reliability engineer for error budgeting to be a useful concept. Even a single engi‐
neer working on a product completely alone and wholly responsible for its operation
can benefit from thinking about where to pause feature development in favor of
improving reliability and vice versa.
I think the overhead of the Google error budget scheme is overkill for a lot of organi‐
zations. Start measuring, discover how alerting functions fit into your unique organi‐
zation, and once practiced at measuring, consider whether you want to go all in on
Google’s process or not.
Collecting, visualizing, and alerting on application metrics is an exercise in continu‐
ously testing the availability of your services. Sometimes an alert itself will contain
enough contextual data that you know how to fix a problem. In other cases, you’ll
want to isolate a failing instance in production (e.g., by moving it out of the load bal‐
ancer) and apply further debugging techniques to discover the problem. Other forms
of telemetry are used for this purpose.
A less formal approach to SLOs
A less formal system worked well for Netflix, where individual engineering teams
were responsible for their services’ availability, there was no SRE/product engineer
separation of responsibility on individual product teams, and there wasn’t such a for‐
malized reaction to error budgets, at least not cross-organizationally. Neither system
is right or wrong; find a system that works well for you.
For the purpose of this book, we’ll talk about how to measure for availability in sim‐
pler terms: as tests against an error rate or error ratio, latencies, saturation, and uti‐
lization indicators. We won’t present violations of these tests as particular “errors” of
reliability that are deducted from an error budget over a window of time. If you want
to then take those measurements and apply the error-budgeting and organizational
dynamics of Google’s SRE culture to your organization, you can do that by following
the guidance given in Google’s writings on the topic.
Monitoring as a Debugging Tool
Logs and distributed traces, covered in detail in Chapter 3, are used mainly for
troubleshooting, once you have become aware of a period of unavailability. Profiling
tools are also debuggability signals.
It is very common (and easy, given a confusing market) for organizations to center
their entire performance management investment around debuggability tools.
Application performance management (APM) vendors can sometimes sell them‐
selves as an all-in-one solution, but with a core technology built entirely on tracing or
logging and providing availability signals by aggregating these debugging signals.
In order to not single out any particular vendor, consider YourKit, a valuable profil‐
ing (debuggability) tool that does this task well without selling itself as more. YourKit
excels at highlighting computation- and memory-intensive hotspots in Java code, and
looks like Figure 1-4. Some popular commercial APM solutions have a similar focus,
which, while useful, is not a substitute for a focused availability signal.
Figure 1-4. YourKit excels at profiling
These solutions are more granular, recording in different ways the specifics of what
occurred during a particular interaction with the system. With this increased granu‐
larity comes cost, and this cost is frequently mitigated with downsampling or even
turning off these signals entirely until they are needed.
Attempts to measure availability from log or tracing signals generally force you to
trade off accuracy for cost, and neither can be optimized. This trade-off exists for
traces because they are generally sampled. The storage footprint for traces is higher
than for metrics.
Learning to Expect Failure
If you aren’t already monitoring applications in a user-facing way, as soon as you
start, you’re likely to be confronted with the sobering reality of your software as it
exists today. Your impulse is going to be to look away. Reality is likely to be ugly.
At a midsize property-casualty insurance company, we added monitoring to the main
business application that the company’s insurance agents use to conduct their normal
business. Despite strict release processes and a reasonably healthy testing culture, the
application manifested over 5 failures per minute for roughly 1,000 requests per
minute. From one perspective, this is only a 0.5% error ratio (maybe acceptable and
maybe not), but the failure rate was still a shock to a company that thought its service
was well tested.
The realization that the system is not going to be perfect switches the focus from try‐
ing to be perfect to monitoring, alerting, and quickly resolving issues that the system
experiences. No amount of process control around the rate of change will yield per‐
fect outcomes.
Before evolving the delivery and release process further, the first step on the path to
resilient software is adding monitoring to your software as it is released now.
With the move to microservices and changing application practices and infrastruc‐
ture, monitoring has become even more important. Many components are not
directly under an organization’s control. For example, latency and errors can be
caused by failures in the networking layer, infrastructure, and third-party compo‐
nents and services. Each team producing a microservice has the potential to nega‐
tively impact other parts of the system not under its direct control.
End users of software also do not expect perfection, but do want their service pro‐
vider to be able to effectively resolve issues. This is what is known as the service recov‐
ery paradox, when a user of the service will trust a service more after a failure than
they did before the failure.
Businesses need to understand and capture the user experience they want to provide
to the end users—what type of system behavior will cause issues to the business and
what type of behavior is acceptable to users. Site Reliability Engineering and The Site
Reliability Workbook have more on how to pick these for your business.
Once identified and measured, you can adopt Google style, as seen in “Google’s
approach to SLOs” on page 9, or Netflix’s more informal “context and guardrails”
style, or anywhere in between to help you reason about your software or the next
steps. See the first chapter on Netflix in Seeking SRE by David N. Blank-Edelman
(O’Reilly) to learn more about context and guardrails. Whether you follow the Goo‐
gle practice or a simpler one is up to your organization, the type of software you
develop, and the engineering culture you want to promote.
With the goal of never failing replaced with the goal of being able to meet SLAs, engi‐
neering can start building multiple layers of resiliency into systems, minimizing the
effects of failures on end-user experience.
Effective Monitoring Builds Trust
In certain enterprises, engineering can still be seen as a service organization rather
than a core business competency. At the insurance company with a five-failures-per-minute
error rate, this was the prevailing attitude. In many cases where the engineering
organization served the company’s insurance agents in the field, the primary interac‐
tion between them happened through reporting and tracking software issues through
a call center.
Engineering routinely prioritized bug resolution, based on defects learned from the
call center, against new feature requests and did a little of both for each software
release. I wondered how many times the field agents simply didn’t report issues,
either because a growing bug backlog suggested that it wasn’t an effective use of their
time or because the issue had a good-enough workaround. The problem with becom‐
ing aware of issues primarily through the call center is that it made the relationship
entirely one way. Business partners report and engineering responds (eventually).
A user-centric monitoring culture makes this relationship more two-way. An alert
may provide enough contextual information to recognize that rating for a particular
class of vehicle is failing for agents in some region today. Engineering has the oppor‐
tunity to reach out to the agents proactively with enough contextual information to
explain to the agent that the issue is already known.
Delivery
Improving the software delivery pipeline lessens the chance that you introduce more
failure into an existing system (or at least helps you recognize and roll back such
changes quickly). It turns out that good monitoring is a nonobvious prerequisite to
evolving safe and effective delivery practices.
The division between continuous integration (CI) and continuous delivery (CD)
tends to be blurred by the fact that teams frequently script deployment automation
and run these scripts as part of continuous integration builds. It is easy to repurpose a
CI system as a flexible general-purpose workflow automation tool. To make a clear
conceptual delineation between the two, regardless of where the automation runs,
we’ll say that continuous integration ends at the publication of a microservice artifact
to an artifact repository, and delivery begins at that point. In Figure 1-5, the software
delivery life cycle is drawn as a sequence of events from code commit to deployment.
Figure 1-5. The boundary between continuous integration and delivery
The individual steps are subject to different frequencies and organizational needs for
control measures. They also have fundamentally different goals. The goal of continu‐
ous integration is to accelerate developer feedback, fail fast through automated test‐
ing, and encourage eager merging to prevent promiscuous integration. The goal of
delivery automation is to accelerate the release cycle, ensure security and compliance
measures are met, provide safe and scalable deployment practices, and contribute to
an understanding of the deployed landscape for the monitoring of deployed assets.
The best delivery platforms also act as an inventory of currently deployed assets, fur‐
ther magnifying the effect of good monitoring: they help turn monitoring into action.
In Chapter 6, we’ll talk about how you can build an end-to-end asset inventory, end‐
ing with a deployed asset inventory, that allows you to reason about the smallest
details of your code all the way up to your deployed assets (i.e., containers, virtual
machines, and functions).
Continuous Delivery Doesn’t Necessarily Mean Continuous Deployment
Truly continuous deployment (every commit passing automated
checks goes all the way to production automatically) may or may
not be a goal for your organization. All things being equal, a tighter
feedback loop is preferable to a longer feedback loop, but it comes
with technical, operational, and cultural costs. Any delivery topics
discussed in this book apply to continuous delivery in general, as
well as continuous deployment in particular.
Once effective monitoring is in place and less failure is being introduced into the sys‐
tem by further changes to the code, we can focus on adding more reliability to the
running system by evolving traffic management practices.
Traffic Management
So much of a distributed system’s resiliency is based on the expectation of and com‐
pensation for failure. Availability monitoring reveals these actual points of failure,
debuggability monitoring helps understand them, and delivery automation helps pre‐
vent you from introducing too many more of them in any incremental release. Traffic
management patterns will help live instances cope with the ever-present reality of
failure.
In Chapter 7, we’ll introduce particular mitigation strategies involving load balancing
(platform, gateway, and client-side) and call resilience patterns (retrying, rate limiters,
bulkheads, and circuit breakers) that provide a safety net for running systems.
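To give a sense of what a call resilience pattern looks like in code, here is a deliberately minimal retry sketch. It is an assumption-laden illustration rather than a recommended implementation: the class name is hypothetical, and it omits the backoff, jitter, and failure classification that a production-grade retry would normally need.

```java
import java.util.function.Supplier;

// Hypothetical bounded retry: re-invokes an action until it succeeds or
// the attempt limit is reached, then rethrows the last failure.
final class Retry {
    static <T> T call(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed
    }
}
```

Even this small sketch shows why the pattern requires per-project effort: the caller has to decide which calls are safe to repeat and how many attempts are tolerable before the failure is surfaced.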
This is covered last because it requires the highest degree of manual coding effort on
a per-project basis, and because the investment you make in doing the work can be
guided by what you learn from the earlier steps.
Capabilities Not Covered
Certain capabilities that are common focuses of platform engineering teams are not
included in this book. I’d like to call out a couple of them, testing and configuration
management, and explain why.
Testing Automation
My view on testing is that testing automation available in open source takes you a
certain way. Any investment beyond that is likely to suffer from diminishing returns.
Following are some problems that are well solved already:
• Unit testing
• Mocking/stubbing
• Basic integration testing, including test containers
• Contract testing
• Build tooling that helps separate computationally expensive and inexpensive test
suites
There are a couple other problems that I think are worth avoiding unless you really
have a lot of resources (both computationally and in engineering time) to expend.
Contract testing is an example of a technique that covers some of what both of these
test, but in a far cheaper way:
• Downstream testing (i.e., whenever a commit happens to a library, build all other
projects that depend on this library, either directly or indirectly, to determine
whether the change will cause failure downstream)
• End-to-end integration testing of whole suites of microservices
I’m very much for automated tests of various sorts and very suspicious of the whole
enterprise. At times, feeling the social pressure of testing enthusiasts around me, I
may have gone along with the testing fad of the day for a little while: 100% test cover‐
age, behavior-driven development, efforts to involve nonengineer business partners
in test specification, Spock, etc. Some of the cleverest engineering work in the open
source Java ecosystem has taken place in this space: consider Spock’s creative use of
bytecode manipulation to achieve data tables and the like.
Traditionally, working with monolithic applications, software releases were viewed as
the primary source of change in the system and therefore potential for failure.
Emphasis was placed on making sure the software release process didn’t fail. Much
effort was expended to ensure that lower-level environments mirrored production to
verify that pending software releases were stable. Once deployed and stable, the sys‐
tem was assumed to remain stable.
Realistically, this has never been the case. Engineering teams adopt and double down
on automated testing practices as a cure for failure, only to have failure stubbornly
persist. Management is skeptical of testing in the first place. When tests fail to capture
problems, what little faith they had is gone. Production environments have a stub‐
born habit of diverging from test environments in subtle and seemingly always cata‐
strophic ways. At this point, if you forced me to choose between having close to 100%
test coverage and an evolved production monitoring system, I’d eagerly choose the
monitoring system. This isn’t because I think less of tests, but because even in reason‐
ably well-defined traditional businesses whose practices don’t change quickly, 100%
test coverage is mythical. The production environment will simply behave differently.
As Josh Long likes to say: “There is no place like it.”
Effective monitoring warns us when a system isn’t working correctly due to
conditions we can anticipate (e.g., hardware failures or downstream service unavailability).
It also continually adds to our knowledge of the system, which can actually lead to
tests covering cases we didn’t previously imagine.
Layers of testing practice can limit the occurrence of failure, but will never eliminate
it, even in industries with the tightest quality control practices. Actively measuring
outcomes in production lowers time to discovery and ultimately remediation of fail‐
ures. Testing and monitoring together are then complementary practices reducing
how much failure end users experience. At their best, testing prevents whole classes of
regressions, and monitoring quickly identifies those that inevitably remain.
Our automated test suites prove (to the extent they don’t contain logical errors them‐
selves) what we know about the system. Production monitoring shows us what hap‐
pens. An acceptance that automated tests won’t cover everything should be a
tremendous relief.
Because application code will always contain flaws stemming from unanticipated
interactions, environmental factors like resource constraints, and imperfect tests,
effective monitoring might be considered even more of a requirement than testing for
any production application. A test proves what we think will happen. Monitoring
shows what is happening.
Chaos Engineering and Continuous Verification
There is a whole discipline around continuously verifying that your software behaves
as you expect by introducing controlled failures (chaos experiments) and observing the results.
Because distributed systems are complex, we cannot anticipate all of their myriad
interactions, and this form of testing helps surface unexpected emergent properties of
complex systems.
The overall discipline of chaos engineering is broad, and as it is covered in detail in
Chaos Engineering by Casey Rosenthal and Nora Jones (O’Reilly), I won’t go into it in
this book.
Configuration as Code
The 12-Factor App teaches that configuration ought to be separated from code. The
basic form of this concept, configuration stored as an environment variable or
fetched at startup from a centralized configuration server like Spring Cloud Config
Server, I think is straightforward enough to not require any explanation here.
The more complicated case involving dynamic configuration, whereby changes to a
central configuration source propagate to running instances and influence their
behavior, is in practice exceedingly dangerous and must be handled with care. The
open source Netflix Archaius configuration client (which is present in Spring Cloud
Netflix dependencies and elsewhere) was paired with a proprietary Archaius server
that served this purpose. Unintended consequences resulting from dynamic
configuration propagation to running instances caused a number of production incidents of
such magnitude that the delivery engineers wrote a whole canary analysis process
around scoping and incrementally rolling out dynamic configuration changes, using
the lessons they had learned from automated canary analysis for different versions of
code. This is beyond the scope of this book, since many organizations will never
receive substantial enough benefit from automated canary analysis of code changes to
make that effort worthwhile.
Declarative delivery is an entirely different form of configuration as code, popular‐
ized again by the rise of Kubernetes and its YAML manifests. My early career left me
with a permanent suspicion of the completeness of declarative-only solutions. I think
there is always a place for both imperative and declarative configuration. I worked on
a policy administration system for an insurance company that consisted of a backend
API returning XML responses and a frontend of XSLT transformations of these API
responses into static HTML/JavaScript to be rendered in the browser.
It was a bizarre sort of templating scheme. Its proponents argued that the XSLT lent
the rendering of each page a declarative nature. And yet, it turns out that XSLT itself
is Turing complete with a convincing existence proof. The typical point in favor of
declarative definition is simplicity leading to an amenability to automation like static
analysis and remediation. But as in the XSLT case, these technologies have a seem‐
ingly unavoidable way of evolving toward Turing completeness. The same forces are
in play with JSON (Jsonnet) and Kubernetes (Kustomize). These technologies are
undoubtedly useful, but I can’t be another voice in the chorus calling for purely
declarative configuration. Short of making that point, I don’t think there is much this
book can add.
Encapsulating Capabilities
As under fire as object-oriented programming (OOP) may be today, one of its funda‐
mental concepts is encapsulation. In OOP, encapsulation is about bundling state and
behavior within some unit, e.g., a class in Java. A key idea is to hide the state of an
object from the outside, called information hiding. In some ways, the task of the plat‐
form engineering team is to perform a similar encapsulation task for resiliency best
practices for its customer developer teams, hiding information not out of control, but
to unburden them from the responsibility of dealing with it. Maybe the highest praise
a central team can receive from a product engineer is “I don’t have to care about what
you do.”
The subsequent chapters are going to introduce a series of best practices as I under‐
stand them. The challenge to you as a platform engineer is to deliver them to your
organization in a minimally intrusive way, to build “guardrails not gates.” As you
read, think about how you can encapsulate hard-won knowledge that’s applicable to
every business application and how you can deliver it to your organization.
If the plan involves getting approval from a sufficiently powerful executive and send‐
ing an email to the whole organization requiring adoption by a certain date, it’s a gate.
You still want buy-in from your leadership, but you need to deliver common func‐
tionality in a way that feels more like a guardrail:
Explicit runtime dependencies
If you have a core library that every microservice includes as a runtime depend‐
ency, this is almost certainly your delivery mechanism. Turn on key metrics, add
common telemetry tagging, configure tracing, add traffic management patterns,
etc. If you have heavy Spring usage, use autoconfiguration classes. You can simi‐
larly conditionalize configuration with CDI if you are using Java EE.
Service clients as dependencies
For traffic management patterns especially (fallbacks, retry logic, etc.), consider
making it the responsibility of the team producing the service to also produce a
service client that interacts with the service. After all, the team producing and
operating it has more knowledge than anybody about where its weaknesses and
potential failure points are. Those engineers are likely the best ones to formalize
this knowledge in a client dependency such that each consumer of their service
uses it in the most reliable way.
Injecting a runtime dependency
If the deployment process is relatively standardized, you have an opportunity to
inject runtime dependencies in the deployed environment. This was the approach
employed by the Cloud Foundry buildpack team to inject a platform metrics
implementation into Spring Boot applications running on Cloud Foundry. You
can do something similar.
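To make the "service clients as dependencies" idea concrete, here is a minimal sketch, using only the JDK, of a wrapper that the team owning a service might ship inside its client library so that every consumer gets the same retry and fallback behavior. The names `ResilientCall`, `maxAttempts`, and the simulated flaky dependency are hypothetical and exist only for illustration; in practice a library such as Resilience4j would typically supply these patterns.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical wrapper a service team could ship inside its client library. */
final class ResilientCall<T> {
    private final Supplier<T> call;      // the remote call to protect
    private final int maxAttempts;       // total attempts, including the first
    private final Supplier<T> fallback;  // used when every attempt fails

    ResilientCall(Supplier<T> call, int maxAttempts, Supplier<T> fallback) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.call = call;
        this.maxAttempts = maxAttempts;
        this.fallback = fallback;
    }

    /** Tries the call up to maxAttempts times, then returns the fallback value. */
    T execute() {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Swallow and retry; a real client would also back off and record a metric
            }
        }
        return fallback.get();
    }
}

final class ResilientCallDemo {
    /** Simulates a dependency that fails a fixed number of times before succeeding. */
    static Supplier<String> flaky(int failuresBeforeSuccess) {
        AtomicInteger calls = new AtomicInteger();
        return () -> {
            if (calls.incrementAndGet() <= failuresBeforeSuccess) {
                throw new IllegalStateException("simulated downstream failure");
            }
            return "live";
        };
    }

    public static void main(String[] args) {
        // Recovers on the third attempt
        System.out.println(new ResilientCall<String>(flaky(2), 3, () -> "cached").execute());
        // Never recovers within three attempts, so the fallback is used
        System.out.println(new ResilientCall<String>(flaky(5), 3, () -> "cached").execute());
    }
}
```

Because the retry count and fallback live in the client, the producing team can tune them as it learns about its own failure modes without asking each consumer to change code.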
Before encapsulating too eagerly, find a handful of teams and practice this discipline
explicitly in code in a handful of applications. Generalize what you learn.
Service Mesh
As a last resort, encapsulate common platform functionality in sidecar processes (or
containers) alongside the application, which when paired with a control plane man‐
aging them is called a service mesh.
The service mesh is an infrastructure layer outside of application code that manages
interaction between microservices. One of the most recognizable implementations
today is Istio. These sidecars perform functions like traffic management, service dis‐
covery, and monitoring on behalf of the application process so that the application
does not need to be aware of these concerns. At its best, this simplifies application
development, trading off increased complexity and cost in deploying and running the
service.
Over a long enough time horizon, trends in software engineering are often cyclic. In
the case of site reliability, the pendulum swings from increased application and devel‐
oper responsibility (e.g., Netflix OSS, DevOps) to centralized operations team respon‐
sibility. The rise of interest in service mesh represents a shift back to centralized
operations team responsibility.
Istio promotes the concept of managing and propagating policy across a suite of
microservices from its centralized control plane, at the behest of an organizationally
centralized team that specializes in understanding the ramifications of these policies.
The venerable Netflix OSS suite (the important pieces of which have alternative
incarnations like Resilience4j for traffic management, HashiCorp Consul for discov‐
ery, Micrometer for metrics instrumentation, etc.) made these application concerns.
Largely, though, the application code impact was just the addition of one or more
binary dependencies, at which point some form of autoconfiguration took over and
decorated otherwise untouched application logic. The obvious downside of this
approach is language support, with support for each site reliability pattern requiring
library implementations in every language/framework that the organization uses.
Figure 1-6 shows an optimistic view of the effect of this engineering cycle on derived
value. With any luck, at each transition from decentralization to centralization and
back, we learn from and fully encapsulate the benefits of the prior cycle. For example,
Istio could conceivably fully encapsulate the benefits of the Netflix OSS stack, only for
the next decentralization push to unlock potential that was unrealizable in Istio’s
implementation. This is already underway in Resilience4j, for example, with discus‐
sion about adaptive forms of patterns like bulkheads that are responsive to
application-specific indicators.
Figure 1-6. The cyclic nature of software engineering, applied to traffic management
Sizing of sidecars is also tricky, given this lack of domain-specific knowledge. How
does a sidecar know that an application process is going to consume 10,000 requests
per second, or only 1? Zooming out, how do we size the sidecar control plane up
front not knowing how many sidecars will eventually exist?
Sidecars Are Limited to Lowest-Common-Denominator Knowledge
A sidecar proxy will always be weakest where domain-specific
knowledge of the application is the key to the next step in
resiliency. By definition, being separate from the application, sidecars
cannot encode knowledge that is specific to the application’s domain
without requiring coordination between the application and the
sidecar. That is likely at least as hard as implementing the
sidecar-provided functionality in a language-specific library
includable by the application.
I believe testing automation available in open source takes you a certain way. Any
investment beyond that is likely to suffer from diminishing returns. I argue against
relying on a service mesh for tracing in “Service Mesh Tracing” on page 111, and against
using sidecars for traffic management in “Implementation in Service Mesh” on page 285,
unpopular as these opinions might be. These implementations are lossy compared to what
you can achieve via a binary dependency either explicitly included or injected into the
runtime, both of which add a far greater degree of functionality that only becomes
cost-prohibitive if you have a significant number of distinct languages to support (and
even then, I’m not convinced).
Summary
In this chapter we defined platform engineering as at least a placeholder phrase for
the functions of reliability engineering that we will discuss through the remainder of
this book. The platform engineering team is most effective when it has a customer-
oriented focus (where the customer is other developers in the organization) rather
than one of control. Test tools, the adoption path for those tools, and any processes
you develop against the “guardrails not gates” rule.
Ultimately, designing your platform is in part designing your organization. What do
you want to be known for?
CHAPTER 2
Application Metrics
The complexity of distributed systems comprised of many communicating microser‐
vices means it is especially important to be able to observe the state of the system. The
rate of change is high, including new code releases, independent scaling events with
changing load, changes to infrastructure (cloud provider changes), and dynamic con‐
figuration changes propagating through the system. In this chapter, we will focus on
how to measure and alert on the performance of the distributed system and some
industry best practices to adopt.
An organization must commit at a minimum to one or more monitoring solutions.
There is a wide range of choices, including open source, commercial on-premises,
and SaaS offerings with a broad spectrum of capabilities. The market is mature
enough that an organization of any size and complexity can find a solution that fits its
requirements.
The choice of monitoring system is important to preserve the fixed-cost characteris‐
tic of metrics data. The StatsD protocol, for example, requires an emission to a StatsD
agent from an application on a per-event basis. Even if this agent is running as a side‐
car process on the same host, the application still suffers the allocation cost of creat‐
ing the payload on a per-event basis, so this protocol breaks at least this advantage of
metrics telemetry. This isn’t always (or even commonly) catastrophic, but be aware of
this cost.
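The difference in cost can be sketched with the JDK alone. The example below contrasts a per-event emitter, which builds one StatsD-style counter line (`name:1|c`) for every increment, with a counter that only accumulates in memory and produces a single payload per publishing interval. The class names are hypothetical, a list stands in for the socket to the agent, and the publish format of the aggregating counter is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

/** Per-event style: every increment allocates a payload destined for the agent. */
final class PerEventCounter {
    private final String name;
    final List<String> sentPayloads = new ArrayList<>(); // stands in for the socket

    PerEventCounter(String name) {
        this.name = name;
    }

    void increment() {
        sentPayloads.add(name + ":1|c"); // one StatsD counter line per event
    }
}

/** Aggregating style: increments only touch an in-memory accumulator. */
final class AggregatingCounter {
    private final String name;
    private final LongAdder count = new LongAdder();

    AggregatingCounter(String name) {
        this.name = name;
    }

    void increment() {
        count.increment(); // no payload is created per event
    }

    /** Called once per publishing interval, regardless of throughput. */
    String publish() {
        return name + " " + count.sumThenReset();
    }
}

final class EmissionCostDemo {
    public static void main(String[] args) {
        PerEventCounter perEvent = new PerEventCounter("requests");
        AggregatingCounter aggregated = new AggregatingCounter("requests");
        for (int i = 0; i < 10_000; i++) {
            perEvent.increment();
            aggregated.increment();
        }
        System.out.println("per-event payloads: " + perEvent.sentPayloads.size());
        System.out.println("aggregated payload: " + aggregated.publish());
    }
}
```

The per-event cost grows with throughput, while the aggregating counter's publishing cost depends only on how often it is published.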
Black Box Versus White Box Monitoring
Approaches to metrics collection can be categorized according to what the method is
able to observe:
Black box
The collector can observe inputs and outputs (e.g., HTTP requests into a system
and responses out of it), but the mechanism of the operation is not known to the
collector. Black box collectors somehow intercept or wrap the observed process
to measure it.
White box
The collector can observe inputs and outputs and also the internal mechanisms
of the operation. White box collectors do this in application code.
Many monitoring system vendors provide agents that can be attached to application
processes and that provide black box monitoring. Sometimes these agent collectors
reach so deep into well-known application frameworks that they start to resemble
white box collectors in some ways. Still, black box monitoring in whatever form is
limited to what the writer of the agent can generalize about all applications that might
apply the agent. For example, an agent might be able to intercept and time Spring
Boot’s mechanism for database transactions. An agent will never be able to reason
that a java.util.Map field in some class represents a form of near-cache and instru‐
ment it as such.
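The near-cache case can be illustrated with a small JDK-only sketch. `TimedFunction` plays the role of a black box collector: it wraps any function and times it without knowing what happens inside. `PriceLookup` is white box instrumentation: its author knows the `Map` field is a near-cache and counts hits and misses. All names are hypothetical, the "remote lookup" is simulated, and the sketch is single-threaded for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

/** Black box: wraps any function and times it without knowing what it does. */
final class TimedFunction<K, V> implements Function<K, V> {
    private final Function<K, V> delegate;
    final LongAdder calls = new LongAdder();
    final LongAdder totalNanos = new LongAdder();

    TimedFunction(Function<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public V apply(K key) {
        long start = System.nanoTime();
        try {
            return delegate.apply(key);
        } finally {
            calls.increment();
            totalNanos.add(System.nanoTime() - start);
        }
    }
}

/** White box: the author knows this Map is a near-cache and counts hits and misses. */
final class PriceLookup implements Function<String, Integer> {
    private final Map<String, Integer> nearCache = new HashMap<>();
    final LongAdder hits = new LongAdder();
    final LongAdder misses = new LongAdder();

    @Override
    public Integer apply(String sku) {
        Integer cached = nearCache.get(sku);
        if (cached != null) {
            hits.increment();
            return cached;
        }
        misses.increment();
        Integer loaded = sku.length(); // stands in for a slow remote lookup
        nearCache.put(sku, loaded);
        return loaded;
    }
}

final class BoxDemo {
    public static void main(String[] args) {
        PriceLookup lookup = new PriceLookup();
        TimedFunction<String, Integer> timed = new TimedFunction<>(lookup);
        timed.apply("abc");
        timed.apply("abc");
        timed.apply("de");
        System.out.println("black box sees calls=" + timed.calls.sum());
        System.out.println("white box sees hits=" + lookup.hits.sum()
                + " misses=" + lookup.misses.sum());
    }
}
```

The wrapper can report call counts and latency for any function it is handed, but only code written with knowledge of the cache can report its hit ratio.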
Service-mesh-based instrumentation is also black box and is generally less capable
than an agent. While agents can observe and decorate individual method invocations,
a service mesh’s finest-grained observation is at the RPC level.
On the other side, white box collection sounds like a lot of work. Some useful metrics
are truly generalizable across applications (e.g., HTTP request timings, CPU
utilization) and are well instrumented by black box approaches. A white box
instrumentation library that encapsulates some of these generalizations, when paired
with an application autoconfiguration mechanism, resembles a black box approach.
Autoconfigured white box instrumentation requires the same level of developer effort as
black box instrumentation: specifically none!
Good white box metrics collectors should capture everything that a black box
collector does but also support capturing more internal details that black box collectors
by definition cannot. The difference between the two for your engineering practices is
minimal. For a black box agent, you must alter your delivery practice to package and
configure the agent (or couple yourself to a runtime platform integration that does
this for you). For autoconfigured white box metrics collection that captures the same
set of detail, you must include a binary dependency at build time.
Vendor-specific instrumentation libraries don’t tend to have this black box feel with a
white box approach because framework and library authors aren’t inclined to add a
wide range of proprietary instrumentation clients even as optional dependencies and
instrument their code N different times. A vendor-neutral instrumentation facade
like Micrometer has the advantage of the “write once, publish anywhere” experience
for framework and library authors.
Black box and white box collectors can of course be complementary, even when there
is some overlap between them. There is no across-the-board requirement to choose
one over the other.
Dimensional Metrics
Most modern monitoring systems employ a dimensional naming scheme that con‐
sists of a metric name and a series of key-value tags.
While the storage mechanism varies substantially from one monitoring system to
another, in general every unique combination of name and tags is represented as a
distinct entry or row in storage. The total storage cost of a metric is then bounded by
the product of the cardinalities of its tags, where the cardinality of a tag is the
number of distinct values observed for that tag key.
For example, an application-wide counter metric named http.server.requests that
contains a tag for an HTTP method of which only GET and POST are ever observed,
an HTTP status code where the service returns one of three status codes, and a URI
of which there are two in the application results in up to 2 * 3 * 2 = 12 distinct time
series sent to and stored in the monitoring system. This metric could be represented
in storage roughly as in Table 2-1. Correlation between tags, like the fact that only
endpoint /a1 will ever have a GET method and only /a2 will ever have a POST method,
can limit the total number of unique time series below the theoretical maximum, to
only six rows in this example. In many dimensional time series databases, for each
row representing a unique set of name and tags, there will be a value ring buffer that
holds the samples for this metric over a defined period of time. When the system
contains a bounded ring buffer like this, the total cost of your metrics is fixed to the
product of the number of permutations of unique metric names/tags and the size of
the ring buffer.
Table 2-1. The storage of a dimensional metric
Metric name and tags                                    Values
http.server.requests{method=GET,status=200,uri=/a1}     [10,11,10,10]
http.server.requests{method=GET,status=400,uri=/a1}     [1,0,0,0]
http.server.requests{method=GET,status=500,uri=/a1}     [0,0,0,4]
http.server.requests{method=POST,status=200,uri=/a2}    [10,11,10,10]
http.server.requests{method=POST,status=400,uri=/a2}    [0,0,0,1]
http.server.requests{method=POST,status=500,uri=/a2}    [1,1,1,1]
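The arithmetic behind the http.server.requests example can be checked with a short sketch. It computes the theoretical maximum as the product of the number of distinct values per tag, counts the series that are actually observed when /a1 only serves GET and /a2 only serves POST, and multiplies by a ring buffer size to get the fixed number of samples held. The class and method names are hypothetical, and the ring buffer size of four is an assumption chosen to match the four values per row in the table.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class SeriesCardinality {
    /** Upper bound: product of the number of distinct values per tag key. */
    static long theoreticalMax(int... distinctValuesPerTag) {
        long product = 1;
        for (int n : distinctValuesPerTag) {
            product *= n;
        }
        return product;
    }

    /** Actual series: only the tag combinations that are ever recorded. */
    static Set<String> observedSeries() {
        Set<String> series = new LinkedHashSet<>();
        String[] statuses = {"200", "400", "500"};
        for (String status : statuses) {
            // /a1 only ever serves GET and /a2 only ever serves POST
            series.add("http.server.requests{method=GET,status=" + status + ",uri=/a1}");
            series.add("http.server.requests{method=POST,status=" + status + ",uri=/a2}");
        }
        return series;
    }

    /** Fixed storage cost: one bounded ring buffer of samples per series. */
    static long totalSamples(long seriesCount, int ringBufferSize) {
        return seriesCount * ringBufferSize;
    }

    public static void main(String[] args) {
        long max = theoreticalMax(2, 3, 2); // methods, statuses, URIs
        int actual = observedSeries().size();
        System.out.println("theoretical max series: " + max);
        System.out.println("observed series: " + actual);
        System.out.println("samples held: " + totalSamples(actual, 4));
    }
}
```

Adding a tag with many distinct values, such as a user ID, multiplies the upper bound by that tag's cardinality, which is why unbounded tag values are the usual cause of runaway metrics cost.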
In some cases, metrics are periodically moved to long-term storage. At this point,
there is an opportunity to squash or drop tags to reduce storage cost at the expense of
some dimensional granularity.
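Squashing a tag amounts to removing one key from every series and summing the samples of series that become identical. The following sketch applies that to the six rows of Table 2-1 by dropping the status tag. The `TagSquasher` class is hypothetical, it assumes every series holds the same number of samples, and real monitoring systems implement roll-ups in their own ways.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

final class TagSquasher {
    /** Builds a sorted tag map from alternating key/value strings. */
    static Map<String, String> tags(String... keyValues) {
        Map<String, String> tags = new TreeMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            tags.put(keyValues[i], keyValues[i + 1]);
        }
        return tags;
    }

    /** Drops one tag key and sums the samples of series that become identical. */
    static Map<Map<String, String>, long[]> dropTag(
            Map<Map<String, String>, long[]> series, String tagKey) {
        Map<Map<String, String>, long[]> squashed = new LinkedHashMap<>();
        for (Map.Entry<Map<String, String>, long[]> entry : series.entrySet()) {
            Map<String, String> remaining = new TreeMap<>(entry.getKey());
            remaining.remove(tagKey);
            long[] samples = entry.getValue();
            long[] sums = squashed.computeIfAbsent(remaining, k -> new long[samples.length]);
            for (int i = 0; i < samples.length; i++) {
                sums[i] += samples[i];
            }
        }
        return squashed;
    }

    /** The six rows of Table 2-1. */
    static Map<Map<String, String>, long[]> table21() {
        Map<Map<String, String>, long[]> series = new LinkedHashMap<>();
        series.put(tags("method", "GET", "status", "200", "uri", "/a1"), new long[] {10, 11, 10, 10});
        series.put(tags("method", "GET", "status", "400", "uri", "/a1"), new long[] {1, 0, 0, 0});
        series.put(tags("method", "GET", "status", "500", "uri", "/a1"), new long[] {0, 0, 0, 4});
        series.put(tags("method", "POST", "status", "200", "uri", "/a2"), new long[] {10, 11, 10, 10});
        series.put(tags("method", "POST", "status", "400", "uri", "/a2"), new long[] {0, 0, 0, 1});
        series.put(tags("method", "POST", "status", "500", "uri", "/a2"), new long[] {1, 1, 1, 1});
        return series;
    }

    public static void main(String[] args) {
        // Six series collapse to two once the status tag is gone
        dropTag(table21(), "status").forEach((tags, samples) ->
                System.out.println(tags + " " + Arrays.toString(samples)));
    }
}
```

After the squash it is no longer possible to separate successful requests from errors, which is the loss of granularity the trade-off refers to.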
Hierarchical Metrics
Before dimensional metrics systems became popular, many monitoring systems
employed a hierarchical scheme. In these systems, metrics were defined only by
name, with no key-value tag pairs. Tags are so useful that a convention emerged to
append tag-like data to metric names with something like dot separators. So a dimen‐
sional metric like httpServerRequests, which has a method tag of GET in a dimen‐
sional system, might be represented as httpServerRequests.method.GET in a
hierarchical system. Out of this arose query features like wildcard operators to allow
simple aggregation across “tags,” as in Table 2-2.
Table 2-2. Aggregation of hierarchical metrics with wildcards
Metric query                      Value
httpServerRequests.method.GET     10
httpServerRequests.method.POST    20
httpServerRequests.method.*       30
Still, tags are not a first-class citizen in hierarchical systems, and wildcarding like this
breaks down. In particular, when an organization decides that a metric like
httpServerRequests that is common to many applications across the stack should receive a
new tag, it has the potential to break existing queries. In Table 2-3, the true number of
requests independent of method is 40, but since some application in the stack has
introduced a new status tag in the metric name, that application’s requests are no longer
included in the aggregation. Even assuming we can agree as a whole organization to
standardize on this new tag, our wildcarding queries (and therefore any dashboards or
alerts built off of them) misrepresent the state of the system from the time the tag is
introduced in the first application until it is fully propagated through the codebase and
redeployed everywhere.
Table 2-3. Failures of aggregation of hierarchical metrics with wildcards
Metric query                                  Value
httpServerRequests.method.GET                 10
httpServerRequests.method.POST                20
httpServerRequests.status.200.method.GET      10
httpServerRequests.method.*                   30 (!!)
Effectively, the hierarchical approach has forced an ordering of tags when they are
really independent key-value pairs.
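The failure in Table 2-3 can be reproduced in a few lines. The sketch below assumes that a wildcard matches exactly one dot-separated segment of a hierarchical name; the `HierarchicalQuery` class is hypothetical and is not the query engine of any particular monitoring system.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class HierarchicalQuery {
    /** Sums every metric whose dotted name matches the query; '*' matches one segment. */
    static long sum(Map<String, Long> metrics, String query) {
        // Quote the literal parts and let '*' match exactly one dot-free segment
        String regex = Pattern.quote(query).replace("*", "\\E[^.]+\\Q");
        Pattern pattern = Pattern.compile(regex);
        long total = 0;
        for (Map.Entry<String, Long> metric : metrics.entrySet()) {
            if (pattern.matcher(metric.getKey()).matches()) {
                total += metric.getValue();
            }
        }
        return total;
    }

    /** The three stored metrics of Table 2-3. */
    static Map<String, Long> table23() {
        Map<String, Long> metrics = new LinkedHashMap<>();
        metrics.put("httpServerRequests.method.GET", 10L);
        metrics.put("httpServerRequests.method.POST", 20L);
        // One application has started putting a status tag ahead of the method tag
        metrics.put("httpServerRequests.status.200.method.GET", 10L);
        return metrics;
    }

    public static void main(String[] args) {
        Map<String, Long> metrics = table23();
        long wildcard = sum(metrics, "httpServerRequests.method.*");
        long actual = metrics.values().stream().mapToLong(Long::longValue).sum();
        System.out.println("wildcard total: " + wildcard + ", true total: " + actual);
    }
}
```

The query is still syntactically valid and still returns a number, so nothing signals that a quarter of the traffic has silently dropped out of the total.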
If you are starting with real-time application monitoring now, you should be using a
dimensional monitoring system. This means you will also have to use a dimensional
metrics instrumentation library in order to record metrics in a way that fully takes
advantage of the name/tag combination that makes these systems so powerful. If you
already have some instrumentation using a hierarchical collector, the most popular
being Dropwizard Metrics, you are going to have to ultimately rewrite this instru‐
mentation. It’s possible to flatten dimensional metrics into hierarchical metrics by
developing a naming convention that in some way iterates over all the tags and com‐
bines them with the metric name. Going the other direction is difficult to generalize,
because the lack of consistency in naming schemes makes it difficult to split a hier‐
archical name into dimensional metrics.
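One possible flattening convention is sketched below: append each tag as `.key.value`, iterating keys in sorted order so that the same tag set always produces the same name. The `HierarchicalNames` class and the choice to replace dots in tag values with underscores are assumptions made for this illustration, not a standard.

```java
import java.util.Map;
import java.util.TreeMap;

final class HierarchicalNames {
    /** Appends each tag as ".key.value", iterating keys in sorted order. */
    static String flatten(String name, Map<String, String> tags) {
        StringBuilder flattened = new StringBuilder(name);
        for (Map.Entry<String, String> tag : new TreeMap<>(tags).entrySet()) {
            flattened.append('.').append(tag.getKey())
                     .append('.').append(sanitize(tag.getValue()));
        }
        return flattened.toString();
    }

    /** Replaces dots in tag values so that they cannot be mistaken for separators. */
    static String sanitize(String value) {
        return value.replace('.', '_');
    }

    public static void main(String[] args) {
        Map<String, String> tags = Map.of("status", "200", "method", "GET");
        System.out.println(flatten("httpServerRequests", tags));
    }
}
```

The reverse direction has no such mechanical answer: given only `httpServerRequests.method.GET`, a parser cannot tell whether `method.GET` is a tag or part of a three-segment metric name unless it already knows the convention that produced it.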
From this point on, we’ll be examining dimensional metrics instrumentation alone.
Micrometer Meter Registries
The remainder of this chapter will use Micrometer, a dimensional metrics instru‐
mentation library for Java that supports many of the most popular monitoring sys‐
tems on the market. There are only two main alternatives to Micrometer available:
Monitoring system vendors often provide Java API clients
While these work for white box instrumentation at the application level, there is
little to no chance that the remainder of the Java ecosystem, especially of third-
party open source libraries, will adopt a particular vendor’s instrumentation cli‐
ent for its metrics collection. Probably the closest we have come to this is some
spotty adoption in open source libraries of the Prometheus client.
OpenTelemetry
OpenTelemetry is a hybrid metrics and tracing library. At the time of this writing,
OpenTelemetry does not have a 1.0 release, and its focus has certainly been more
on tracing than metrics, so metrics support is much more basic.
While there is some variation in capabilities from one dimensional metrics instru‐
mentation library to another, most of the key concepts described apply to each of
them, or at least you should develop an idea of how alternatives should be expected to
mature.
In Micrometer, a Meter is the interface for collecting a set of measurements (which
we individually call metrics) about your application.
Meters are created from and held in a MeterRegistry. Each supported monitoring
system has an implementation of MeterRegistry. How a registry is created varies for
each implementation.
Each MeterRegistry implementation that is supported by the Micrometer project has
a library published to Maven Central and JCenter (e.g.,
io.micrometer:micrometer-registry-prometheus, io.micrometer:micrometer-registry-atlas):
MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
MeterRegistry implementations with more options contain a fluent builder as well,
for example the InfluxDB registry shown in Example 2-1.
Example 2-1. Influx fluent builder
MeterRegistry registry = InfluxMeterRegistry.builder(InfluxConfig.DEFAULT)
    .httpClient(myCustomizedHttpClient)
    .build();
Metrics can be published to multiple monitoring systems simultaneously with
CompositeMeterRegistry.
In Example 2-2, a composite registry is created that ships metrics to both Prometheus
and Atlas. Meters should be created with the composite.
Example 2-2. Composite meter registry that ships to Prometheus and Atlas
MeterRegistry prometheusMeterRegistry = new PrometheusMeterRegistry(
    PrometheusConfig.DEFAULT);
MeterRegistry atlasMeterRegistry = new AtlasMeterRegistry(AtlasConfig.DEFAULT);
// Declare the composite by its concrete type: add(MeterRegistry) is defined on
// CompositeMeterRegistry, not on the MeterRegistry base class
CompositeMeterRegistry registry = new CompositeMeterRegistry();
registry.add(prometheusMeterRegistry);
registry.add(atlasMeterRegistry);
// Create meters like counters against the composite,
// not the individual registries that make up the composite
registry.counter("my.counter");
Micrometer ships with a global static CompositeMeterRegistry that can be used in
much the same way that we use an SLF4J LoggerFactory. The purpose of this static
registry is to allow for instrumentation in components that cannot leak Micrometer as
an API dependency, as they would by offering a way to dependency-inject a
MeterRegistry. Example 2-3 shows the similarity between the use of the global static
registry and what we are used to from logging libraries like SLF4J.
Example 2-3. Using the static global registry
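import io.micrometer.core.instrument.Metrics;
import io.micrometer.core.instrument.Timer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class MyComponent {
  // Registered against the global static registry, much like obtaining a logger
  Timer timer = Timer.builder("time.something")
    .description("time some operation")
    .register(Metrics.globalRegistry);
  Logger logger = LoggerFactory.getLogger(MyComponent.class);

  public void something() {
    timer.record(() -> {
      // Do something
      logger.info("I did something");
    });
  }
}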
By adding any MeterRegistry implementations that you wire in your application to
the global static registry, any low-level libraries using the global registry like this wind
up registering metrics to your implementations. Composite registries can be added to
other composite registries. In Figure 2-1, we’ve created a composite registry in our
application that publishes metrics to both Prometheus and Stackdriver (i.e., we’ve
called CompositeMeterRegistry#add(MeterRegistry) for both the Prometheus and
Stackdriver registries). Then we’ve added that composite to the global static compo‐
site. The composite registry you created can be dependency-injected by something
like Spring, CDI, or Guice throughout your application for your components to regis‐
ter metrics against. But other libraries are often outside of this dependency-injection
context, and since they don’t want Micrometer to leak through their API signatures,
they register with the static global registry. In the end, metrics registration flows
down this hierarchy of registries. So library metrics flow down from the global com‐
posite to your application composite to the individual registries. Application metrics
flow down from the application composite to the individual Prometheus and Stack‐
driver registries.
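The flow of registrations down this hierarchy can be modeled with a toy sketch. The classes below (`ToyRegistry`, `ToyLeaf`, `ToyComposite`) are hypothetical stand-ins written only for this illustration; they are not Micrometer's classes and do not reflect how Micrometer is implemented. The replay inside `add` is an assumption of the sketch, included so that a library that registers a meter before the application has wired its registries still ends up in every leaf.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a registry; not Micrometer's MeterRegistry. */
interface ToyRegistry {
    void register(String meterName);
}

/** Leaf registry that would publish to one monitoring system. */
final class ToyLeaf implements ToyRegistry {
    final String system;
    final Set<String> meters = new LinkedHashSet<>();

    ToyLeaf(String system) {
        this.system = system;
    }

    @Override
    public void register(String meterName) {
        meters.add(meterName);
    }
}

/** Composite that forwards registrations to every child, including late joiners. */
final class ToyComposite implements ToyRegistry {
    private final List<ToyRegistry> children = new ArrayList<>();
    private final Set<String> known = new LinkedHashSet<>();

    void add(ToyRegistry child) {
        children.add(child);
        known.forEach(child::register); // replay meters registered before the child joined
    }

    @Override
    public void register(String meterName) {
        if (known.add(meterName)) {
            children.forEach(child -> child.register(meterName));
        }
    }
}

final class RegistryFlowDemo {
    public static void main(String[] args) {
        ToyComposite global = new ToyComposite();      // plays the global static registry
        ToyComposite application = new ToyComposite(); // plays the application's composite
        ToyLeaf prometheus = new ToyLeaf("prometheus");
        ToyLeaf stackdriver = new ToyLeaf("stackdriver");

        global.register("library.timer");    // a library registers before any wiring exists
        application.add(prometheus);
        application.add(stackdriver);
        global.add(application);
        application.register("app.counter"); // application code uses the injected composite

        System.out.println("prometheus: " + prometheus.meters);
        System.out.println("stackdriver: " + stackdriver.meters);
    }
}
```

In this model both leaves end up with the library's meter and the application's meter, while the global composite never sees `app.counter`, mirroring the one-way, downward flow described above.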
Figure 2-1. Relationship between global static registry and your application’s registries
Spring Boot Autoconfiguration of MeterRegistry
Spring Boot autoconfigures a composite registry and adds a regis‐
try for each supported implementation that it finds on the class‐
path. A dependency on micrometer-registry-{system} in your
runtime classpath along with any required configuration for that
system causes Spring Boot to configure the registry. Spring Boot
also adds any MeterRegistry found as a @Bean to the global static
composite. In this way, any libraries that you add to your applica‐
tion that provide Micrometer instrumentation automatically ship
their metrics to your monitoring system! This is how the black-
box-like experience is achieved through white box instrumenta‐
tion. As the developer, you don’t need to explicitly register these
metrics; just their presence in your application makes it work.
Creating Meters
Micrometer provides two styles to register metrics for each supported Meter type,
depending on how many options you need. The fluent builder, as shown in
Example 2-4, provides the most options. Generally, core libraries should use the flu‐
ent builder because the extra verbosity required to provide robust description and
base unit detail adds value to all of their users. In instrumentation for a particular
microservice with a small set of engineers, opting for more compact code and less
detail is fine. Some monitoring systems support attaching description text and base
units to metrics, and for those, Micrometer will publish this data. Furthermore, some
monitoring systems will use base unit information on a metric to automatically scale
and label the y-axis of charts in a way that is human readable. So if you publish a
Random documents with unrelated
content Scribd suggests to you:
Vicksburg on Saturday next, and celebrating the Fourth of July
by a grand dinner, and so forth. When asked if he would invite
General Joe Johnston to join, he said, ‘No, for fear there will be
a row at the table.’ Ulysses must get into the city before he
dines in it. The way to cook a rabbit is ’first to catch the rabbit.’”
“Victimized.—We learned of an instance wherein a ‘knight of the
quill’ and a ‘disciple of the black art,’ with malice in their hearts
and vengeance in their eyes, ruthlessly put a period to the
existence of a venerable feline that has for a time, not within
the recollection of ‘the oldest inhabitant,’ faithfully performed
the duties to be expected of him, to the terror of sundry vermin
in his neighborhood. Poor defunct Thomas was then prepared,
not for the grave, but for the pot, and several friends invited to
partake of a nice rabbit. As a matter of course, no one would
wound the feelings of another, especially in these times, by
refusing a cordial invitation to dinner, and the guests assisted in
consuming the poor animal with a relish that did honor to their
epicurean tastes. The ‘sold’ assure us the meat was delicious,
and that pussy must look out for her safety.”
“Mule Meat.—We are indebted to Major Gillespie for a steak of
Confederate beef, alias mule. We have tried it, and can assure
our friends that, if it is rendered necessary, they need have no
scruples at eating the meat. It is sweet, savory, and tender, and
so long as we have a mule left, we are satisfied our soldiers will
be content to subsist upon it.”
As stated, the city was surrendered on the morning of the 4th of
July, and the army of General Grant marched in and took possession.
Some of the Federal soldiers who went into the city entered the
office of the “Citizen,” and finding the type for the paper all set in
the forms, added the following note, and struck off a large number
of copies, which were extensively distributed among our troops:—
“Note (at foot of last column).—July 4, 1863.
“Two days bring about great changes: the banner of the Union
floats over Vicksburg; General Grant has ‘caught the rabbit’; he
has dined in Vicksburg, and he brought his dinner with him. The
‘Citizen’ lives to see it. For the last time, it appears on wall-
paper. No more will it eulogize the luxury of mule meat and
fricasseed kitten, or urge Southern warriors to such diet
nevermore. This is the last wall-paper edition, and is, excepting
this note, an exact copy of it. It will be valuable hereafter as a
curiosity.”
The author, deeming this paper a curious chapter in the history of
the siege of Vicksburg, has thought it not improper to quote thus
fully from its columns.
CHAPTER XXII.
The Regiment Marches on Jackson—Jefferson Davis’s House—Siege of
Jackson—The Regiment Under Fire—Evacuation of the City—A Part of
the City is Burnt by the Enemy—Return to Vicksburg—A Hard March
—“French Joe’s” Mule—The Dead of the Regiment—Return to
Cincinnati—March Over Cumberland Mountains to Knoxville, Tenn.
As soon as the siege was concluded, General Grant immediately
turned his attention to General Johnston, who up to this time had
held the line of the Big Black, watching for a chance to strike our
besieging army. The time had now arrived for the Ninth Corps to
perform its part of the work of that memorable campaign. As soon
as General Johnston learned of Pemberton’s surrender, he began to
fall back to Jackson, the capital of the State. The Ninth Corps under
General Parke, together with General Smith’s division of the
Sixteenth Corps, and General W. T. Sherman’s own corps, all under
command of General Sherman, were ordered by General Grant to
pursue the retreating enemy. This movement began as early as the
evening of the 4th of July, but the Brigade of Colonel Christ did not
commence to move till the afternoon of the 7th, the Twenty-ninth
leaving camp at two o’clock in the afternoon. Toward nightfall the
Big Black was reached, the men crossing the river on a floating
bridge which had been constructed by the advance forces. The
march was continued far into the night, no halt being made till
twelve o’clock. The day had been severely hot, and a large number
of the men were left beside the road, where they had fallen, stunned
and bewildered, by the overpowering rays of the sun. When the
night came on, it began to rain, and for a space of two hours the
overcharged clouds poured torrents of water upon the soldiers, who
were toiling along over the muddy roads so faint from exhaustion
that they could scarcely drag one foot after the other. As soon as the
halt was made, fires were kindled, and the men contrived to dry
their clothing and steep a little coffee, the solace of the soldier. That
was a wet and intensely uncomfortable bivouac; there was no
recourse left the men but to spread their rubber blankets upon the
flooded earth, and, lying down upon them, cover themselves with
the half of a shelter-tent. They had barely fallen asleep when the
storm broke out afresh, and the rain came down upon them in great
sheets. Sleep was wholly banished, and huddling around the
smouldering fires, the “poor boys” thus passed the balance of that
gloomy night. The day which followed this was also very hot, and
the officers having learned that the troops could not endure the sun,
wisely concluded to allow them to remain quiet till near nightfall. At
four o’clock, P. M., the order came to break camp, and a long march
was performed, the Brigade marching till one o’clock on the morning
of the 9th. On the 9th, the line was formed as early as six o’clock in
the morning; but the men were not hurried through the day, being
allowed to make frequent but brief halts. The troops halted at nine
o’clock in the evening near the plantation of Jefferson Davis, where
the regiment was ordered on guard for the remainder of the night.
A part of the regiment on this occasion was posted very near the
house of Davis, and though the men were led by curiosity to visit it,
yet they refrained from destroying the property of this prominent
traitor, or committing any acts unbecoming a regiment of
Massachusetts soldiers. As early as seven o’clock on the following
morning, the men having had no sleep during the preceding night,
and scarcely any for three consecutive nights, the regiment was
ordered to start. At two o’clock that afternoon the rear guard of the
retreating enemy was suddenly encountered, a line of battle was
quickly formed, and slight skirmishing ensued; but the Twenty-ninth,
though very near the front, did not become engaged. Toward
evening the Confederates retreated, and our troops started in
pursuit, the Brigade proceeding only about two miles, when it halted
for the night on the plantation of Mr. Hardeman, on the line of the
Mississippi Central Railroad.
Early the next morning, while the regiments were resting, the order
was given for the Brigade to go to the front, taking position on a
ridge of land upon which stood the State Lunatic Asylum, about five
miles from Jackson. On the previous day, the enemy had occupied
this place, but were driven from it by the First Division under
General Welch. The Confederates on the 11th held another line of
works a little nearer the city of Jackson, but within easy range of this
ridge; the place was thickly wooded, and the Brigade lay concealed
among the trees during the day, the Twenty-ninth supporting
Captain Edward’s Rhode Island Battery, which did but little firing,
however.
When it grew dark, shovels were called into requisition, and every
man in the Brigade was set to work throwing up entrenchments,
laboring till daylight the next morning; but our men were not to be
allowed to enjoy the fruits of their night’s labor, for in the early
morning, they were ordered out of the works, up to the extreme
front, in support of our skirmish line. Fortunately they were not
obliged to endure the scorching rays of the sun, but found shelter in
a piece of woods; it was only a shelter from the sun, however, for
the enemy, knowing our position, poured into the woods a
continuous fire of shell, canister, and spherical case during the whole
of the two days that the regiment was here. The other regiments in
the Brigade suffered more or less loss, but the Twenty-ninth escaped
without a single casualty. In addition to the storm of larger missiles,
many of the musket-balls fired from the enemy’s lines found their
way into the woods, and so severe was the fire, that nearly every
tree along our line bore the marks of the leaden tempest. Many of
our comrades had narrow escapes from death and wounds, one
soldier in Company K especially, a ball passing through his tin dipper,
upon which he was resting his head.
On the morning of the 14th, the Brigade was relieved and ordered to
the rear, resuming its former position near the lunatic asylum; but in
the afternoon of the same day it was again ordered forward, and
again supported Captain Edward’s battery. Here it remained till the
morning of the 16th, when an advance of the whole line was made,
the Twenty-ninth passing up under a heavy fire to within forty rods
of the enemy’s works, bristling with cannon, the right of the
regiment going into the rifle-pits. Once in the pits, there was no such
thing as leaving them while it was daylight, and here the “boys”
spent the day, constantly engaged with the enemy’s sharpshooters.
Though considerably exposed, there was but one casualty during the
day, Private John Scully of Company A being instantly killed, the ball
penetrating his brain. The regiment in this position held the extreme
left of the picket line of our army, its right resting in the rifle-pits,
and its left in dense woods, retired so as to form nearly a half-circle.
The night of the 16th was dark, and hence favorable for secret
movements by both besiegers and besieged. About nine o’clock,
unusual noises were heard within the enemy’s lines, resembling the
rattling of wheels. Colonel Barnes became anxious to learn the cause
of these noises, and Captain Clarke was requested to use every
effort to ascertain what, if any, movement was going on in the
enemy’s camp. That officer had no difficulty in carrying out his
instructions, for one of his men, a fearless soldier, named David
Scully, unhesitatingly consented to undertake the perilous task of
approaching the hostile picket line. The ground descended quite
rapidly from Clarke’s line towards that of the Confederates. Scully
was left to execute his adventure in his own way. Prostrating himself
upon the ground, he rolled slowly down the hill, till he approached
within a few yards of the enemy’s pickets, and then pausing,
overheard their conversation, which was to the effect that their army
was retreating, and that they were soon to be relieved. Listening
here, Scully heard more distinctly than before, the noises in the
enemy’s camp. They were evidently removing their guns from the
works; and, beside this, the regular tread of marching men was
plainly distinguishable. In due time Scully returned, making this
report. About this time, a similar report was brought in by Charles
Logue of Company F, who went forward into the woods, very near
the enemy, exhibiting great courage. In order to verify the
statements of Scully and Logue, Colonel Barnes, with one or more of
the captains, advanced some distance beyond our picket line, when
they soon became convinced that the whole body of the enemy was
moving. Thereupon one of the sergeants was despatched to General
Ferrero, who was in command of the trenches, with information that
the enemy was moving in large numbers, and shortly after a
lieutenant was sent, with the message that the enemy was
abandoning his works and retiring from the city.
The night was intensely dark, and the ground over which these
officers were obliged to pass, in delivering their messages, beset
with difficulties, being broken, and in some places covered with
fallen timber and a thick growth of bushes. But, like faithful soldiers,
they persevered till they found General Ferrero, when they delivered
their messages. The substance of the reply that was sent back was,
“The movements of the enemy are well understood at headquarters.
The enemy are not retiring.” The rumbling of the enemy’s trains and
the neighing of their horses continued; and the Colonel and his
comrades stood at their posts all night, listening to these sounds,
which grew fainter and more distant every hour, as the Confederates
were slipping out of the grasp of General Sherman, and retiring
beyond the Pearl River. When the night was almost gone, a message
was received from General Ferrero, that the regiment might move
forward in the gray of the morning, if Colonel Barnes thought it
advisable.
When the morning came, a flag of truce was seen waving from the
enemy’s works, and at the same time the city appeared to be in
flames. During the night, General Johnston retired with his whole
army, artillery, and baggage, and even the large guns upon his
works. As soon as it was fairly day, the whole line was ordered
forward, and the regiment entered the city. The works were found to
be deserted, and the railroad depot and several public buildings in
flames; but the fire was quickly extinguished by our troops, and thus
a large portion of the city was doubtless saved from destruction.
After the regiment had finished its part of the generous work of
subduing the flames, the men were dismissed for a couple of hours,
during which time they contrived to “do” Jackson quite thoroughly.
The gardens were filled with melons and fruits, but of other and
more desirable food there was a small supply. Everything of much
value had been removed, and many of the deluded inhabitants had
followed in the steps of the retreating army, taking with them their
personal effects, thus giving the place the appearance of a deserted
town. The negroes had the good sense to stay, and, as was
invariably the case, they were overjoyed at the appearance of the
Union soldiers, testifying to their happiness in the way peculiar to
their race.
In the afternoon of the 17th, the regiment had orders to leave the
city, marching back to the ground occupied on the 14th. Here it
remained, enjoying much-needed rest, till Monday the 20th. Another
severe march was before them, a march needlessly hard; and at an
unreasonable hour in the morning of the 20th, the reveille aroused
the men from their slumbers.
Before the movement began, an order was issued from
headquarters, detailing Colonel Barnes Provost Marshal of the corps,
and the whole of the regiment as provost guard, with orders to
move in the rear of the corps, and to keep everything—men, horses,
and wagons—in front. This was the hardest duty the regiment ever
performed in the same number of days. For some reason, the march
was a forced one; the weather was of the same tropical character
that it had been during the three weeks previous, and water not only
scarce, but of poor quality. The story among the men was, that the
corps was racing with another, the Sixteenth (?); but the more
probable statement is, that the corps reaching Vicksburg first would
take the transports to go North, there being only a sufficient number
of steamers for the transportation of a single corps. The imperative
orders given to Colonel Barnes to prevent straggling, required
constant watchfulness and almost superhuman efforts, not only on
his part, but on the part of his brother officers and the men. Many
soldiers gave out, from the combined effects of over-exertion and
the enervating influence of the weather. On the second day out,
matters in this respect became so bad, that it became necessary to
impress into the service, ox-carts, horses, and vehicles of all
descriptions which could be found about the country, and use them
for the conveyance of the invalids, many of whom had received fatal
sunstrokes. The spectacle which the corps presented on the road
was wholly unbecoming a victorious army: nearly every regiment
had lost even the semblance of an organized body; everybody was
straggling along the roads, some riding in carts, and others mounted
upon horses and mules, while miles in the rear of this mob was the
gallant old Twenty-ninth Regiment, driving the crowd before them.
Violent menaces, and sometimes absolute force, were required to
keep the stragglers in motion.
For want of ambulances, nearly all the wounded in the battles and
skirmishes before Jackson were carried the whole distance from the
latter city to Vicksburg on litters or stretchers by details of men. To
protect these unfortunate soldiers from the sun, hoods made of
pieces of tent cloth were placed about their heads, and green
boughs arranged at the sides of the litters.
A large number of disabled horses and mules were left about the
country, in the track of Johnston’s retreat, and these were
systematically gathered up by General Sherman, when he returned
from Jackson, and driven along to the various landings in the vicinity
of Vicksburg and Milldale, where, together with the horses and other
animals captured by the soldiers on the march, they were delivered
up to the quartermasters. Nearly every company of the Twenty-ninth
had a large number of saddle and pack animals, which they had
ridden and used for the conveyance of their baggage during the
march. Company A had some twenty horses and mules, and
Company G nearly as many, when they returned to Milldale, having,
as they swept along the stragglers of the column, as the extreme
rear guard, collected these animals, as well as the jaded and tired-
out men, and their work was much lightened by these mounts. As
the rear guard approached the Big Black, the soldiers on foot were
sent forward into camp, and then about thirty or forty mounted men
came in together, most of the latter being men who had fallen out or
got foot-sore, and had been picked up and mounted to keep them
along with the army.
When one of these motley crowds came in, the commander of the
regiment, who was somewhat indignant at the appearance of the
thing, hailed the captain in command, “I should like to know, sir,
what this means; what sort of a command is this for an infantry
officer?” “Irregular mounted infantry, I should think,” replied the
leader, as he looked at his crew.
It was on this march that Captain Richardson’s man, nicknamed
“French Joe,” came to the conclusion that his captain’s mess kit
might just as well be carried by a mule as by Joseph, and, in fact,
that the mule might carry “Joe” too, and took one of the mules for
this purpose. He had only his belt and some old scraps of rope for a
tackling; but this he thought might serve well enough. He contrived
a pad out of his own and the Captain’s blankets, and, warned by the
example of John Gilpin, he attempted to balance his load and to tie
it securely to the sides of the mule, which were well festooned with
pots, pans, gridirons, camp kettles, and tin dippers, giving the
animal the appearance of the “hawker’s” donkey. After all this varied
assortment of wares had been piled upon the animal, Joe kindly
allowed a knapsack or two to be strapped on behind, and then
mounted, guiding the mule with a rope halter. He had not proceeded
far before some of the knots began to slip, for Joe was not a sailor,
nor was he a very skilful disposer of weights. Very soon one of the
knapsack straps got loose and insinuated itself on the inside of the
mule’s hind leg. It tickled him—he kicked. This displaced a camp
kettle, which slipped under his belly—he “buck-jumped,” and
unseated Joe. Then all the load shifted, the most of it getting under
the beast’s belly. He curveted and pranced, he reared and kicked,
and cleared the road right and left for more than a mile. The men
scattered on every side, for the mule was in earnest, and was no
respecter of persons, kicking just as viciously at the officers as at the
men. Captain Richardson had no dinner that day, save what he got
through the kindness of others; for his coffee, hard bread, and
bacon, tin plates and cups, flour, butter, and roasting corn—all the
materials of many a savory feast—lay in the dust.
On the 22d, the Ninth Corps reached the Big Black River. General
Parke and his division commanders now deemed it impossible, as it
certainly was disgraceful, for the corps to continue to march in this
manner. The different regiments were here, on the banks of the
river, gathered together, and forced to resume their organization.
One whole day was spent in this work, during which the men were
permitted to rest.
Toward evening of the 22d, the corps moved out of camp, and
marching slowly, crossed the Big Black on a pontoon bridge, in the
midst of a pouring rain; the troops camped near the river for the
night, and the next morning started for Milldale. The regiment was
the last to arrive, in consequence of its peculiar duty, and by being
the last, lost the first chance to go on board the transports, and was
thus forced to remain here till the 12th of August.
During the campaign now closed, the roll of the regiment’s dead had
been somewhat increased; and this, with a few exceptions, had
been occasioned by disease contracted in the sickly regions of the
Yazoo and Vicksburg. Private John Scully of Company A, a faithful
soldier, was the first to fall in the campaign, having been killed by a
bullet while bravely doing his duty in the rifle-pits before Jackson,
July 16. Second Lieutenant Horace A. Jenks of Company E came
next, dying of malarial fever, July 26. Lieutenant Jenks had at one
time been a sergeant in his company, and was promoted to be
second lieutenant for his good soldierly qualities. His death was
mourned by all the members of the regiment. First Lieutenant Ezra
Ripley of Company B, who died of fever at Helena, Ark., July 28, was
a member of the Middlesex Bar before entering the service. He was
a gentleman of liberal culture and rarest qualities of both heart and
mind. No sacrifice for his country was too great in his estimation,
and though not of a robust constitution, yet he never shrank from
any exposure or hardship. He performed the terrible march to
Jackson, but the seeds of disease sown during those days, already
described, soon ripened into death. Private Lyford Gilman of
Company B also died of disease at Vicksburg, August 2. He was also
a victim of the exhaustive march.
When the Ninth Corps was about to leave Vicksburg, General Grant,
desirous of recognizing its services in the late campaign, issued the
following order:—
“Headquarters Department of the Tennessee,
“Vicksburg, Miss., July 31, 1863.
[EXTRACT.]
“Special Orders, No. 207.
“In returning the Ninth Corps to its former command, it is with
pleasure that the general commanding acknowledges its
valuable services in the campaign just closed.
“Arriving at Vicksburg opportunely, taking position to hold at bay
Johnston’s army, then threatening the forces investing the city,
it was ready and eager to assume the aggressive at any
moment.
“After the fall of Vicksburg, it formed a part of the army which
drove Johnston from his position near the Big Black River, into
his entrenchments at Jackson, and after a siege of eight days,
compelled him to fly in disorder from the Mississippi Valley.
“The endurance, valor, and general good conduct of the Ninth
Corps are admired by all; and its valuable co-operation in
achieving the final triumph of the campaign is gratefully
acknowledged by the Army of the Tennessee.
“Major-General Parke will cause the different regiments and
batteries of his command to inscribe upon their banners and
guidons, ‘Vicksburg’ and ‘Jackson.’
“By order of
“Major-General U. S. Grant.
“P. S. Bowen, A. A. A. G.”
The time spent at Milldale, after the return from Jackson, was
occupied by the ordinary duties of camp life. The weather continued
very warm, and the destructive effects of the campaign now became
manifest. Deaths were very frequent among the troops here during
this time, burial parties were almost constantly engaged, and the
funeral notes of the fife and drum could be heard nearly every hour
in the day. None save the strongest came out of that campaign in
sound health.
On the 12th of August, the regiment embarked on the steamer
“Catahoula,” one of the slowest boats on the river, to go North; the
steamer left Milldale without a sufficient supply of fuel, and
accordingly frequent stoppages on the route, to gather wood,
became necessary. The trip to Cairo, including one day spent at
Memphis, occupied eight days, the boat reaching its destination on
the 20th.
At midnight on the 20th, the regiment took the cars for Cincinnati,
reaching that city on the afternoon of Sunday the 23d, and receiving
the same kind treatment as on its two former visits.
At night, the regiment left the city, crossed the Ohio to Covington,
Ky., and went into camp on the outskirts of the town, and remained
here till the 27th. At this time, probably nearly half of all the
members of the regiment were on the sick-list, and unable to do
duty. In the course of a few days they had come from the tropical
climate of the South into the cool bracing air of the West, and now
the chills and fever broke out among them to an alarming extent.
While here, Colonel Barnes left the regiment on a furlough to his
home in Massachusetts; he was very sick from the effects of a
malarial fever and overwork; from the eighteenth day of May, 1861,
till he was seized with this sickness, he had never been off duty, for
any cause, a day,—a fact that is not only remarkable, but,
considering the great hardships to which he had been subjected,
one that shows him to have been possessed of an iron constitution.
The author, in the preparation of this work, has endeavored, as far
as possible, to avoid the diary form of narrative, because he is aware
that such does not interest the general reader; but the record of the
regiment would be incomplete if it did not give somewhat in detail
the events of long and memorable marches, and the various
localities visited by it.
The march from Covington, Ky., into East Tennessee, which we are
about to describe, was one of the longest which the regiment ever
performed, and, for the reasons stated, we shall give a very
particular account of it. On the 27th, it broke camp, under the
command of Major Chipman, went to the railroad station in
Covington, took the cars for Nicholasville, arrived there at seven
o’clock the next morning, and camped near the depot. On the 29th,
Colonel Pierce, who had for several months been absent on special
duty in Massachusetts, joined the regiment and assumed command,
and on the same day a march on the Lancaster pike of about four
miles was performed.
August 31. The regiment was mustered for pay; Colonel Pierce
ordered to the command of the Brigade; the Second Michigan
Infantry joined the Brigade, and Major Chipman again took
command of the regiment.
September 1. Reveille at four o’clock, A. M. Started for Crab Orchard,
in Lincoln County; spent the night for the third time at Camp Dick
Robinson.
September 2. Reveille at an early hour; marched all day; camped
near Lancaster.
September 3. Another early start. Reached Crab Orchard, a place of
five hundred inhabitants, and abounding with mineral springs. Here
and at Nicholasville convalescent camps were established, and
during the time which the regiment remained at these places, a very
large number of its members went into the hospitals, where not a
few of them subsequently died.
September 10. The Brigade left Crab Orchard, and had a hard march
of about fourteen miles, and went into camp at a place called Mount
Vernon. The road for a considerable portion of the way was very
rough and mountainous, being so steep in some places that the
horsemen were obliged to dismount and lead their animals. The men
were in light marching order, having left the most of their extra
clothing at Crab Orchard, and had eight days’ rations served out to
them, being thus prepared for a long march.
September 11. The reveille sounded at half-past three o’clock in the
morning, and at half-past four the column was in motion. At night,
after a very fatiguing march, the camp was formed near Wild Cat
Mountain, Kentucky.
September 12. The men were routed out early in the morning, and
the day’s march began at five o’clock, but the road was good all day.
The weather, which had been fine ever since the march began,
became stormy at the end of this day, and at night it rained hard.
The camp was formed at London, Laurel County, Ky. On this march
the regiment passed over the battle-field of Mill Spring, where the
notorious Zollicoffer was killed.
September 13 was Sunday. The men were paid off and allowed to
rest all day. Since this famous march began, the Brigade had passed
through and into three counties; namely, Gerrard, Rock Castle, and
Laurel. The country through which they had travelled was thinly
populated, and with the exception of a few wild fruits and nuts
which they found on the journey, the men were obliged to subsist
upon their rations. It has been stated, that the wild fruits which the
men ate on this march proved very beneficial to their health, and
resulted in curing them of the complaints they had contracted in the
sickly swamps of the Yazoo.
September 14. The march was resumed at five o’clock in the
morning, and at night a halt was made at Laurel Spring.
September 15. Only a part of the day was occupied by marching, a
halt being made at the town of Barboursville, in Knox County, Ky.
September 16. Marched from Barboursville to Flat Lick; a long
march, pausing till the 19th.
September 19. A distance of about ten miles was travelled this day;
the camp was formed at Log Mountain. The column was nearing the
far-famed Cumberland Gap, and the roads were growing rougher
and more broken at every advance in that direction. The night was
very cold, water froze, and the crops of tobacco, sugar-cane, and
cotton in that region were nearly all destroyed. When the sun rose the
next morning, it revealed the earth white with frost.
September 20. At ten o’clock in the morning, the Brigade reached
Cumberland Gap, and entered the State of Tennessee. After passing
into this gap, which was defended by a small force of infantry and
cavalry, the road became more and more elevated, till at last it
reached the summits of the mountains. The view from these heights
well paid the men for all their toil in climbing their rugged and
broken sides. In the far distance, ridge after ridge seemed to rise up
toward the heavens, the highest actually invading the clouds, which,
with a beautiful curtain of blue, hid from sight the lofty peaks. The
night was spent in the mountains near the gap.
September 21. Sycamore, Tenn. Camped for the night. An inquiry
having been made at one of the mountain huts, regarding the
distance between this place and Tazewell, the answer was, “Two
rises to go up and two rises to go down and a right smart plain.”
September 22. Morristown, Tenn. Here the Brigade remained till the
24th.
September 24. Marched to New Market.
September 25. Marched to Holston River and forded it.
September 26. Entered the city of Knoxville.
The distance marched between the first of September and the 26th was
something over two hundred miles. The march over the mountains
has furnished the theme of many interesting conversations among
the men who performed it. The hardships of the road were manifold
and serious. It was enough to be compelled to climb day after day
the rugged and precipitous path along the side of these mountains;
it was enough, indeed, to bivouac on their cold and barren summits,
with only a single woollen blanket to protect the foot-sore soldier
from the searching and chilling night-air; but when we add to these
discomforts, that of intense and unsatisfied hunger, which was
actually endured during the entire march, the measure of the
sufferings of our comrades seems full to overflowing. They endured
these sufferings and hardships, however, for a good purpose.
Together with the troops which had gone on before them, they had
wrought the long-prayed-for deliverance of East Tennessee. On the
3d of this month, General Burnside, together with the Twenty-third
Corps and other troops, had entered the city of Knoxville, the
Confederate General Buckner retiring from the place with his army
and retreating toward Chattanooga.
The people of this region had long suffered from rebel rule, and the
barbarities which had been practised upon them have never been
fully related to the world. Some had been imprisoned, others
tortured, and others murdered. Their property had been mercilessly
confiscated, and not a few had been forced to perform military duty
in the service of a cause that they loathed and hated. When the
army of General Burnside appeared bearing the old flag, and the
colors of the cruel foe departed in haste and confusion, the loyal
people were overwhelmed with joy. The flag of the Union, which had
been carefully hid under carpets, concealed in cellars and between
mattresses, to save its owners from persecution, was now brought
forth from its hiding-places, and flaunted on every hand; from
windows and liberty-poles, it floated to the breeze.
A considerable part of General Burnside’s army was composed of
loyal Tennesseeans, who had been forced to fly into Kentucky during
the continuance of the enemy’s rule. These native troops, among
which was the cavalry under Lieutenant-Colonel Brownlow, son of
the famous parson, “were kept constantly in advance, and were
received with expressions of the profoundest gratitude by the
people. There were many thrilling scenes of the meeting of our
Tennessee soldiers with their families, from whom they had so long
been separated. The East Tennesseeans were so glad to see our
soldiers, that they cooked everything they had and gave it to them
freely, not asking pay, and apparently not thinking of it. Women
stood by the roadside with pails of water, and displayed Union flags.
The wonder was, where all the stars and stripes came from.
Knoxville was radiant with flags. At one point on the road from
Kingston to Knoxville seventy women and girls stood by the roadside
waving Union flags and shouting, ‘Hurrah for the Union.’ Old ladies
rushed out of their houses and wanted to see General Burnside and
shake hands with him, and cried, ‘Welcome, General Burnside, to
East Tennessee.’” [41]
These constitute but a small part of all the demonstrations of loyalty
by this intensely loyal people, and this brief account of their wrongs
but a trifling part of the manifold abuses heaped upon them by a
merciless and savage soldiery,—abuses and wrongs of the same
barbarous nature as those perpetrated at Andersonville and Belle
Isle, forming as they do the saddest chapter in the history of the
war. It should be among the proudest boasts of the people of
Massachusetts, that in the persons of her soldiers of the Twenty-
first, Twenty-ninth, Thirty-fifth, and Thirty-sixth regiments, she
helped deliver a people loyal to the old flag from a thraldom such as
has been imperfectly depicted in this chapter,—a thraldom worse
than death itself.
CHAPTER XXIII.
Battles of Blue Springs, Hough’s Ferry, and Campbell’s Station—Siege of
Knoxville—The Sufferings of the Men—Battle of Fort Sanders—
Gallant Conduct of the Regiment—It Captures Two Battle-flags—The
Siege Raised—General Sherman Re-enforces Burnside.
During the early part of October, a portion of the Ninth Corps under
General Potter, and a large body of cavalry under General
Shakleford, were sent up the valley some fifty miles in the direction
of Morristown, Jefferson County. A force of the enemy had crossed
into Eastern Tennessee from Virginia, and were threatening our
communications with Cumberland Gap. This movement on the part
of the Federals was made for the purpose of clearing the enemy
away from the flank of our army.
On the 8th of October, the regiment with its brigade was ordered
forward from Knoxville to join the rest of the corps, and on the night
of the 9th halted at Bull’s Gap, a pass in the mountains near the line
between Jefferson and Green counties.
The movement of the enemy was a very important one; they had
reached and occupied Greenville, and moved out beyond as far as
Blue Springs. Foster’s brigade of cavalry and mounted infantry was
sent out from Knoxville, up the valley of the French Broad River, to
turn the right of the enemy and get upon his rear, which movement
was accomplished on the 9th. Foster got himself into position, and
on the 10th, General Custer with his mounted infantry came up with
the enemy at Blue Springs, and began to skirmish. Ferrero’s division
of twelve small regiments, of which the Twenty-ninth was one,
arrived about noon, and went into position a half-mile from the field,
where they had a good view of the skirmish for nearly half an hour.
At the end of this time, two brigades of the division—namely,
Humphrey’s and Christ’s—were sent forward.
The enemy had a battery well supported on the left of the main road
leading to Greenville, on a high hill. They had thrown forward their
first line and skirmishers well advanced to a distance of perhaps
three-quarters of a mile from their battery, across the road and
across a rivulet, and had advanced another body of skirmishers
through a corn-field to the crest of a hill about three hundred yards
from where the Twenty-ninth was lying. Custer’s men had slowly
retired before the Confederates, and passed to our rear, when the
order came for our two brigades to charge. The men rose to their
feet and went forward at a rapid run, with arms aport and bayonets
fixed, up the hill. The enemy, closely followed by our men, fell back
rapidly down the hill, across the rivulet, into and through a belt of
woods, where the pursuit ended by the direct orders of our generals.
Here Colonel Christ re-formed his Brigade, to carry one of the
Confederate batteries that had begun to fire shell into our lines. The
enemy, seeing the preparations for a charge, wheeled their guns
about and fled; and at this stage in the affair, it became so dark that
all further hostilities ceased. Captain Leach, then sixty-three years of
age, led his company on this charge; and when the rivulet was
reached, which was some eight feet wide, sprang into it and
scrambled up the opposite bank as actively as the youngest of his
men, refusing the proffered assistance of Major Chipman, who was
leading the regiment.
Captains Leach and Clarke messed together; their negro servants,
Bob and Isaac, were left in the rear of the field, where this fight had
occurred, with their rations and baggage, and when the battle was
over, were sought to prepare supper; but the darkies could not be
found,—neither the rations nor baggage. Upon investigation, it
appeared that a rumor had spread to the rear that both these
officers had been killed in the fight. The negroes had of course
heard of it, and, considering themselves absolved from all further
obligations as servants, had gone back towards Bull’s Gap, taking
the effects of the officers with them, where at night they held a sort
of barbecue, feasted on the rations, and concluded their
entertainment with an auction sale of the baggage. These recreant
negroes were found the next morning and subjected to a severe
questioning. “Where are our rations?” “Where’s the coffee-pot?”
“What has become of our blankets?” Bob acted as spokesman: “De
rations and blankets is done gone; de coffee-pot is done gone, too,
—dey’s stole.” This ended the examination, and these two
unfortunate captains had short rations and hard fare for the rest of
the march.
The enemy retired during the night, and soon after
daylight our army started in pursuit. After marching a mile or two,
the infantry halted, and Shakleford’s brigade of mounted men, with
several horse batteries, swept by the head of the column, and then
the infantry marched again. The most annoying information came
from the farmers along the road. They scarcely knew which were our
enemy,—the troops that had passed the night before, or the
mounted column of Shakleford,—and these were some of the
answers they gave in reply to questions of the whereabouts of the
Confederates: “They are just ahead”; “Not far from an hour ago,
they went by”; “A good gallop off”; and so forth.
When our troops reached Greenville, they learned to their surprise
that the enemy had passed through there six hours before, and that
they had a sharp engagement with General Foster’s men a few miles
out at Henderson’s. The tired troops pressed on; at Henderson’s,
they saw some signs of a fight, but the bridge was intact. General
Foster had refrained from destroying it, and the enemy had
neglected to do so. Toward night the regiment went into camp at
Rheatown, twenty-one miles from Blue Springs. Shakleford and
Foster followed the enemy into Virginia, inflicting upon them great
injury, and, upon returning, took up the line of the Watauga, to
cover the passes from Virginia into East Tennessee.
One of the abandoned wagons of the Confederates, found near
Rheatown, furnished our regiment with a liberal supply of excellent
bread and some other food. At this place our troops had two full
days’ rest, and it was much needed, for the men had performed a
forced march hither, and in the course of it had an encounter with
the enemy.
At the close of the second day, the columns were turned towards
Bull’s Gap, making the distance by easy marches, and upon arriving
there the regiment took the cars, but had proceeded but a short
distance when an accident rendered it necessary for them to march
six miles to Morristown, at which place they again took the cars and
went to Knoxville, reaching there on the 10th of October.
While the Confederates held East Tennessee, a merciless
conscription had been enforced by them, to avoid which many of the
male population had abandoned their homes and taken refuge in the
deep forests, or fled into Kentucky. After the country had been
occupied by Burnside, many of these loyal people returned to their
homes, and signified their willingness to enlist in the Federal army.
Burnside issued an order encouraging such enlistments, and
especially into the veteran regiments of the Ninth Corps, which had
been greatly depleted by their recent campaigns. Shortly after the
Twenty-ninth returned to Knoxville, Captain Clarke and Lieutenant
Atherton were detailed for this recruiting service, and ordered to
station themselves at Rheatown, where they spent several weeks,
and secured a number of recruits. On the 11th of November, a force
of Confederates again invaded Tennessee from Virginia, and evading
the left of our army on the Watauga, attacked with about 3,500
cavalry our post at Rogersville, and captured its small garrison. This,
and other hostile movements at various points, rendered necessary
the evacuation of Rheatown, and the drawing in of all our forces in
that part of the State, nearer Knoxville. Our recruiting party,
therefore, returned to the latter place, and went on after their
regiment, which, in the meantime, had gone out to Lenoir’s Station.
A serious invasion of East Tennessee, by General Longstreet, had
already begun. That officer, with a large force, had early in
November been detached from Bragg’s army, in the vicinity of
Chattanooga, and was now marching up the valley towards
Knoxville. On the 20th of October, the Ninth Corps left Knoxville and
went to Campbell’s Station, fifteen miles southwest of the city, on
the East Tennessee and Virginia Railroad; on the 21st, it moved
down the railroad to Lenoir’s Station, and remained there, with the
exception of a few days, till the 14th of November. On the night of
the 10th of November, Longstreet made his appearance on the south
side of the Holston River, at Hough’s Ferry, about six miles below
Loudon, and where was stationed General White, with one division
of the Twenty-third Corps. November the 14th, early in the morning,
General Potter, in a hard rain-storm, started with the whole of the
Ninth Corps to re-enforce General White. The Twenty-ninth with its
brigade (Christ’s) was in advance, and toward noon arrived at a
point five miles from the ferry, when rapid and heavy firing was
distinctly heard. Now the clouds parted and the storm slackened, but
the roads were as heavy and broken as before, making it
exceedingly difficult to get the artillery along, and rendering the
progress of the troops very slow. It was nearly dark when the
Brigade reached the ferry; by this time the battle there had nearly
ceased, nothing save an occasional musket-shot indicating the near
presence of the enemy. Immediately upon its arrival, the regiment
was ordered to the right of the line, marched nearly two miles
through a thick woods, and formed in line of battle within one
hundred yards of that of the enemy. The night soon came on, and
early in the evening the storm broke out again with increased fury;
the wind blew with the force of a tornado; the trees swayed to and
fro in the blast, threatening to fall upon the heads of the men, who
stood to arms all night without fires.
Very early the next morning (15th), when the men were expecting to
march against the enemy, the order came to fall back, and taking
the same track by which it had entered the gloomy forest, the
Brigade picked its way back to the place where it had first halted the
night before. All along the way brightly-burning camp-fires were
passed, but no troops were seen; these had already left, and were
well under way towards Lenoir’s. At noon the regiment reached the
latter place. The men had tasted no food for several hours, and were
nearly worn out with fatigue; during the march here, they had
managed to pluck a few ears of corn from the fields by the roadside,
and as soon as a pause was made and the arms stacked, the place
was ablaze with fires; every man at once went to work making
coffee and preparing little messes for dinner. Happily the poor,
hungry men had time to finish their meal, but they had barely
finished it when they were ordered under arms. The enemy had just
then appeared a half-mile away on the Kingston Road, and thither
the Brigade was hurried at the double-quick. This movement of the
Confederates was at once checked, and the rest of the day passed
without any further hostile demonstrations, except a night attack
upon our pickets.
The morning of the 16th was sharp and cold; as early as two o’clock
the regiment was ordered to march. The roads that had been muddy
the day before were now frozen; the artillery horses were pinched
with cold and hunger, and quite unable to drag the heavy cannon. It
was resolved to sacrifice a portion of the baggage train, which, to
the number of many wagons, was parked at Lenoir’s. The horses
and mules were detached and harnessed into the guns; the spokes
of the wagon-wheels were hacked, and, with their contents, set on
fire,—not, however, till the soldiers had replenished their haversacks
with a goodly quantity of smoked pork, coffee, sugar, and hard
bread.
The whole corps was in full retreat soon after daylight, and the
enemy at once began the pursuit, harassing our rear guard
continually. The road from Lenoir’s Station to Knoxville intersects at
Campbell’s with the road from Kingston, and Longstreet had
detached a column on his left to seize the junction of these roads.
The possession of Campbell’s Station was, therefore, of great
moment to Burnside, for should the enemy arrive there before him,
his retreat to Knoxville would surely be cut off. A division of troops
under Hartranft, by rapid marching, succeeded, in the early part of
the forenoon, in reaching Campbell’s, and going out on the Kingston
Road deployed across it, his left on the Loudon Road, along which
our army and trains were moving. Hartranft was just fifteen minutes
ahead of the enemy; he had only time to form his line, when the
Confederate column appeared hurrying up the Kingston Road. A
sharp engagement ensued; but the enemy was foiled in his attempt,
and driven back in confusion. Soon after, all our trains passed this
dangerous point in safety, and moved on to Knoxville. At about
noon, the rest of the army came up, and went into position on “a
low range of hills about a half-mile from the cross-roads.” The Ninth
Corps was posted on the right of the field, which was nearly a mile
broad, and extended a half-mile along the main road, and was
bordered by heavy woods, passable for infantry. Christ’s brigade was
on the right of the corps, and the Twenty-ninth on the right of the
Brigade, fifty yards from the woods in front, while its right flank
actually touched them.
The lines had been formed but a short time, when the blue uniforms
of our rear guard were seen, and finally our skirmishers,—the latter
crossing the fields, creeping along the fences, and coming up the
road, guns in hand, occasionally pausing to load and fire. Now and
then a soldier in gray showed himself on the edge of the woods, but
he would soon dart back out of sight. Colonel Pierce, now in
command of the regiment, had orders to cover his front and flank
with skirmishers, and Companies A and I, under Captain Clarke and
Lieutenant Williams, were detailed for this purpose. The companies
had proceeded but a short distance into the woods, when they came
upon the enemy, who were approaching stealthily from tree to tree,
evidently attempting what Colonel Christ had feared; namely, to
flank the Brigade. A brisk fire began at once, but our men kept their
line intact, and maintained perfect coolness. After the lapse of about
an hour, the officers on the skirmish line discovered that the enemy
were gradually overlapping the right of the Brigade, and promptly
informed Colonel Christ of the fact. The skirmishers were ordered to
come in at once, and the Brigade changed front and began to fall
back. This movement was not made a moment too soon, for a dense
mass of the enemy’s infantry immediately poured out of the woods
in the rear of the retreating Brigade; while his flanking party, which
had not yet lapped over our old position, also at the same moment,
emerged from the woods, and, with loud yells, joined in the pursuit,
firing an occasional shot, and with terrible oaths, shouting to our
men to surrender and lay down their arms.
Our men, loading as they marched, halted by files, turned about and
fired, and again took their places in the ranks. At last, the regiment,
which was in the rear, reached a sunken road, and, leaping into it,
moved rapidly to the left of our lines; while over the heads of the
men, now fully protected by the high bank, played the cannon of our
reserve batteries, at last free to fire without endangering the lives of
our own troops. The slaughter wrought upon the pursuing enemy is
described as terrible; and as the Twenty-ninth came up the hill,
gaining the plateau of the Knoxville side, Generals Burnside and
Ferrero, standing on either side of the road, clapped their hands as it
filed proudly between them.
It was now, perhaps, five o’clock in the afternoon, and the battle
degenerated into an artillery duel on our side, varied by the enemy
with occasional charges, by which they took nothing but disaster.
One by one, as it grew dark, the batteries retired, and after nightfall
the Brigade moved off and took up its weary march for Knoxville,
where it arrived at about three o’clock the next morning, and lay
down for a few brief hours to rest upon the bleak hillside near Fort
Sanders.
During this battle, Charles H. Dwinnell of Company A, a worthy
comrade and brave soldier, was killed, and William O’Conner of
Company H was captured. Dwinnell was shot through the brain by a
sharpshooter stationed in a tall pine. The ball was probably aimed at
Captain Clarke, who was quite conspicuous at the time; the
sharpshooter was instantly marked and shot by two of Dwinnell’s
comrades, who fired simultaneously, the enemy’s body being seen to
fall out of the tree.
The siege of the city commenced on the 17th, and progressed rather
gradually, beginning on the west and northwest, and finally
extending around the entire city, from river to river. As the work of
investing the place continued, our pickets were constantly pressed in
close upon the main works, so that by the 29th of November we
scarcely held more than the slope of the plateau crowned by our
main fortifications, and in some cases not even that.
To the right of Fort Sanders, named after a brilliant cavalry general
who was killed early in the siege, and west of the city, Humphrey’s
and Christ’s brigades picketed one side of the railroad cut, and the
enemy the other.
On one occasion, before the pickets were drawn in, a little squad of
the Twenty-ninth assaulted a house in front of them, and driving
away the enemy’s pickets there stationed, captured it, and brought
in the supplies, which consisted of a small sack of meal, a few
pounds of bacon, a box of tobacco, an eight-gallon keg of blackberry
brandy, and two boxes of cartridges. The enemy re-formed and
recaptured the house, but our men brought their booty safely into
camp. There was meal enough to give each man in the company to
which these adventurers belonged, a dish of hasty-pudding, and
tobacco enough to furnish every man in the regiment with a good-
sized piece. The brandy and cartridges were accounted for during
the night by some of the wildest picket-firing that occurred during
the siege. There was by no means a large supply of food in the city
when the siege began, but long before it concluded, all kinds of
provisions became extremely scarce.
On the 19th, the Confederates drove in our outer pickets and took
possession of the woods. On the evening of the 23d, they attacked
our picket line in front of the Brigade, and seemed to be on the point
of bringing on a general engagement. The order was given to set
fire to a long line of buildings between the two armies. This was
done to break the enemy’s lines and unmask their movements, and
resulted very successfully. The conflagration that followed was both
grand and awful. The dark wintry sky was lighted up by the flames,
which roared and crackled with an unearthly sound, casting a broad
belt of dazzling light over the fields and into the forests. In the
round-house of the railroad, there was stored a large amount of
condemned ammunition, and when the flames reached that, there
was an explosion that shook the earth, and startled the anxious
residents of the city.
The 26th of November was Thanksgiving Day. The men got a full
ration of bullets, but only a half-ration of bread.
About midnight of the 28th, the picket line near the river on the
southwest was driven in, and could not be re-established by the
brigade which furnished it. The line in front of Fort Sanders had also
been assailed and taken by the enemy, and about nine o’clock in the
evening an order was sent to take the regiment out of the lines and
place it in the immediate rear of the fort for special duty; Major
Chipman had command. A little later in the evening, Companies A,
C, D, and K were detached, and ordered to our lines near the river,
where the enemy had a few hours before captured our rifle-pits.
The night had nearly gone, and the first glimmer of day had
appeared, when the familiar charging yell of the enemy was heard
directly in front of the fort. Our pickets at this point were forced in,
and in a moment more a large body of the enemy’s infantry were
swarming at the very edge of the ditch. The battalion of the Twenty-
ninth, under Chipman, were hurried into the fort, and the four
detached companies at once sent for. The latter had a perilous
experience in joining their comrades, and though exposed to the fire
of the enemy’s cannon, reached the works without the loss of a
man, and in ample time to lend a hand in the severe contest which
was now well under way. The Confederates, led by fearless officers,
crowded the ditch, and crossing it on each other’s shoulders, began
to ascend the bank; one of their standard-bearers came running up
and planted his colors upon the parapet, in the very faces of Major
Chipman’s men; but he had hardly performed his deed of daring,
when one of our soldiers shot him through the heart, and he fell
forward into the works. Inspired by the example of their color-
bearer, a large body of the Confederates, led by a gray-haired old
officer (Colonel Thomas of Georgia), with wild shouts made a dash
up the bank. All seemed lost; but at this moment Companies A, C, D,
and K of the regiment came running into the fort, and ranging
themselves along the parapet, opened a deadly fire upon the
assaulting party. The gray old leader of the enemy, while waving his
sword and shouting to his men to come on, was shot dead. Many of
his brave followers suffered the same fate, and the handful of
survivors fell hurriedly back into the ditch. At the same instant, like
scenes were transpiring all along the works. The Seventy-ninth New
York was sharply engaged, and the artillerymen, not being able to
use their pieces, busied themselves by tossing among the enemy
lighted shell with their fuses cut to a few seconds’ length. Finally a
sergeant of one of the batteries, observing a renewed preparation of
the enemy to charge up the bank, slewed one of the large guns
about so as to make it bear upon the edge of the ditch, and, with a
single charge of canister, raked it for a distance of several yards with
deadly effect. About this time the assault slackened; but in a few
moments another column of the enemy came rushing towards the
fort, and with almost sublime courage faced the withering fire of our
troops, and large numbers of them gained the bank. The first
terrible scenes of the battle were re-enacted; three of the enemy’s
standards were planted simultaneously upon the parapet, but they
were quickly torn away by our men. The resistance was as desperate
as the assault: officers used freely their swords, the men clubbed
their muskets, others used their bayonets, and others still axes and
the rammers of the cannon. A struggle so severe as this could not
be otherwise than of short duration. In a few minutes the enemy’s
soldiers began to falter and fall back into the ditch. Seeing this,
General Ferrero, who was in command of the fort and closely
watching the fight, ordered one company of the Twenty-ninth on the
left, and one company of the Second Michigan on the right, to go
through the embrasures and charge the disorganized enemy.
Sweeping down the ditch, these commands captured about two
hundred of the enemy, and drove them into the fort, the little squad
of the Twenty-ninth following their captives and bearing triumphantly
two battle-flags of the foe; the capturers of which were Sergeant
Jeremiah Mahoney of Company A, and Private Joseph S. Manning of
Company K, both of whom afterwards received the medals of honor
voted by the Congress of the United States.
The fight immediately died away in front of Fort Sanders, and the
remnant of the enemy’s charging column shrank back within their
lines in dismay and confusion. But on the left, where the Federal
rifle-pits had been captured on the afternoon of the 28th, a fierce
battle was heard. Hartranft’s division was sharply engaged with the
enemy in its efforts to recapture the pits, and the effort was soon
successful. The Confederates were everywhere routed, our entire
line re-established, and by ten o’clock that Sunday morning
quietness had settled down over the whole field. The enemy seemed
appalled by the dreadful calamity that had overtaken him,—a
calamity, as we shall presently see, that practically ended the siege.
Ninety-eight dead bodies were taken out of the fatal ditch from a
space of four hundred square feet around the salient. General
Humphrey, who commanded the Mississippi brigade, was found dead
on the glacis, within twenty feet of the face of the ditch. Lying
among the dead in the moat, in every conceivable condition, were
the wounded; and scattered all over the open space in front of the
fort, through which telegraph wires had been stretched from stump
to stump to impede the movements of the assailants, were
hundreds of both dead and wounded, and among them not a few of
the enemy’s soldiers unhurt, who, dismayed at the awful storm of
shell and grape that poured upon them, had lain prone upon the
earth until the battle was over, only too willing to be captured.
Nearly five hundred stand of small arms were collected on the field
within our picket lines. Pollard states the enemy’s loss in this battle
at seven hundred.
The great bravery of this charge entitles those who participated in it
to honorable mention. The troops who engaged in this assault
“consisted of three brigades of McLaw’s division; that of General
Wolford,—the Sixteenth, Eighteenth, and Twenty-fourth Georgia
regiments, and Cobb’s and Phillips’s Georgia legions; that of General
Humphrey,—the Thirteenth, Seventeenth, Twenty-first, Twenty-
second, and Twenty-third Mississippi regiments; and a brigade
composed of Generals Anderson’s and Bryant’s brigades, embracing
among others, the Palmetto State Guard, the Fifteenth South
Carolina Regiment, and the Fifty-first, Fifty-third, and Fifty-ninth
Georgia regiments.”42 The troops that garrisoned the fort were
Benjamin’s United States Battery, Buckley’s Rhode Island Battery, a
part of Roemer’s New York Battery, the Seventy-ninth New York
Highlanders, and, at the very beginning of the fight, a battalion of
the Twenty-ninth under Major Chipman, and before the repulse of
the assault on the salient, Captain Clarke’s and the other companies
of the regiment already named. When the battle was well advanced,
and affairs had assumed a serious aspect, the One Hundredth
Pennsylvania was moved up in the rear of the fort, and a few
minutes before the close of the fight, the Second Michigan was
ordered into the works on the right, one of its companies being
detailed to sweep the ditch. Our loss in the fort was eight killed and
five wounded, and among the former were two members of the
Twenty-ninth; namely, Sergeant John F. Smith of Company H, and
Corporal Gilbert T. Litchfield of Company K, both most excellent
soldiers. The loss of the enemy in this encounter doubtless exceeded
greatly that given by Mr. Pollard; one of our officers engaged stating
it to be fourteen hundred.
When Longstreet had drawn off his troops from the scene of his
defeat, General Burnside kindly directed General Potter to send out a
flag of truce, granting the enemy permission to remove his dead and
wounded from the field. The flag was courteously received, and for
the space of several hours there was a complete cessation of all
hostilities. As a reward for its services in this action, the regiment
was retained in Fort Sanders as a part of its garrison, and
consequently relieved from much severe picket duty, only

Beena E S
 
Ad

Sre With Java Microservices Patterns For Reliable Microservices In The Enterprise 1st Edition Jonathan Schneider

  • 7. Jonathan Schneider. SRE with Java Microservices: Patterns for Reliable Microservices in the Enterprise. Boston, Farnham, Sebastopol, Tokyo, Beijing
  • 8. 978-1-492-07392-5 [GP]
SRE with Java Microservices by Jonathan Schneider
Copyright © 2020 Jonathan Schneider. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].
Acquisitions Editor: Melissa Duffield
Development Editor: Melissa Potter
Production Editor: Deborah Baker
Copyeditor: JM Olejarz
Proofreader: Amanda Kersey
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: O’Reilly Media, Inc.
October 2020: First Edition
Revision History for the First Edition
2020-08-26: First Release
See http://oreilly.com/catalog/errata.csp?isbn=9781492073925 for release details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. SRE with Java Microservices, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author, and do not represent the publisher’s views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.
  • 9. Table of Contents
Foreword
Preface
1. The Application Platform: Platform Engineering Culture; Monitoring; Monitoring for Availability; Monitoring as a Debugging Tool; Learning to Expect Failure; Effective Monitoring Builds Trust; Delivery; Traffic Management; Capabilities Not Covered; Testing Automation; Chaos Engineering and Continuous Verification; Configuration as Code; Encapsulating Capabilities; Service Mesh; Summary
2. Application Metrics: Black Box Versus White Box Monitoring; Dimensional Metrics; Hierarchical Metrics; Micrometer Meter Registries; Creating Meters; Naming Metrics
  • 10. Common Tags; Classes of Meters; Gauges; Counters; Timers; “Count” Means “Throughput”; “Count” and “Sum” Together Mean “Aggregable Average”; Maximum Is a Decaying Signal That Isn’t Aligned to the Push Interval; The Sum of Sum Over an Interval; The Base Unit of Time; Using Timers; Common Features of Latency Distributions; Percentiles/Quantiles; Histograms; Service Level Objective Boundaries; Distribution Summaries; Long Task Timers; Choosing the Right Meter Type; Controlling Cost; Coordinated Omission; Load Testing; Meter Filters; Deny/Accept Meters; Transforming Metrics; Configuring Distribution Statistics; Separating Platform and Application Metrics; Partitioning Metrics by Monitoring System; Meter Binders; Summary
3. Debugging with Observability: The Three Pillars of Observability…or Is It Two?; Logs; Distributed Tracing; Metrics; Which Telemetry Is Appropriate?; Components of a Distributed Trace; Types of Distributed Tracing Instrumentation; Manual Tracing; Agent Tracing; Framework Tracing; Service Mesh Tracing
  • 11. Blended Tracing; Sampling; No Sampling; Rate-Limiting Samplers; Probabilistic Samplers; Boundary Sampling; Impact of Sampling on Anomaly Detection; Distributed Tracing and Monoliths; Correlation of Telemetry; Metric to Trace Correlation; Using Trace Context for Failure Injection and Experimentation; Summary
4. Charting and Alerting: Differences in Monitoring Systems; Effective Visualizations of Service Level Indicators; Styles for Line Width and Shading; Errors Versus Successes; “Top k” Visualizations; Prometheus Rate Interval Selection; Gauges; Counters; Timers; When to Stop Creating Dashboards; Service Level Indicators for Every Java Microservice; Errors; Latency; Garbage Collection Pause Times; Heap Utilization; CPU Utilization; File Descriptors; Suspicious Traffic; Batch Runs or Other Long-Running Tasks; Building Alerts Using Forecasting Methods; Naive Method; Single-Exponential Smoothing; Universal Scalability Law; Summary
5. Safe, Multicloud Continuous Delivery: Types of Platforms; Resource Types
  • 12. Delivery Pipelines; Packaging for the Cloud; Packaging for IaaS Platforms; Packaging for Container Schedulers; The Delete + None Deployment; The Highlander; Blue/Green Deployment; Automated Canary Analysis; Spinnaker with Kayenta; General-Purpose Canary Metrics for Every Microservice; Summary
6. Source Code Observability: The Stateful Asset Inventory; Release Versioning; Maven Repositories; Build Tools for Release Versioning; Capturing Resolved Dependencies in Metadata; Capturing Method-Level Utilization of the Source Code; Structured Code Search with OpenRewrite; Dependency Management; Version Misalignments; Dynamic Version Constraints; Unused Dependencies; Undeclared Explicitly Used Dependencies; Summary
7. Traffic Management: Microservices Offer More Potential Failure Points; Concurrency of Systems; Platform Load Balancing; Gateway Load Balancing; Join the Shortest Queue; Instance-Reported Availability and Utilization; Health Checks; Choice of Two; Instance Probation; Knock-On Effects of Smarter Load Balancing; Client-Side Load Balancing; Hedge Requests; Call Resiliency Patterns; Retries
  • 13. Rate Limiters; Bulkheads; Circuit Breakers; Adaptive Concurrency Limits; Choosing the Right Call Resiliency Pattern; Implementation in Service Mesh; Implementation in RSocket; Summary
Index
  • 15. Foreword
“To production and beyond!” —Buzz Lightyear (paraphrasing)
I know Buzz said “to infinity and beyond,” but that whole notion never sat well with me as a child. How could you go beyond infinity? It was only later in life, when I became a software engineer, that it dawned on me—software is never complete. It’s never finished. It’s...infinite. Buzz missed his calling in software! Software has no end. Software is like the oceans, the stars, and the bugs in your code: endless!
Hopefully, that’s not controversial. The last few decades have seen all of us in the software field pivot around this insight: the endless tail of software maintenance is the most expensive part of what we do. A good deal of the significant movements in software figure around that. Testing and continuous integration. Continuous delivery. Cloud computing. Microservices. It’s not hard to get to production the first time, but these practices optimize for the many subsequent trips to production. They optimize for day two. They optimize for cycle time: how quickly can you take an idea and see it delivered into production, from concept to customer? They optimize for “and beyond.”
This insight—that software has no end—introduces a ton of new practices and puts the lie to as many existing practices. It changes the focus from the initial development and MVP to the maintenance and management of that software. The focus is on production. I love production. You should love production. You should go to production, as early and often as possible. Bring the kids. Bring the family. The weather’s amazing.
Great software engineers build for production and for the return trip. They build for the endless journey. It is no longer acceptable to wring our collective hands and throw our code over the proverbial wall: “Well, it works on my machine! Now you deploy it!” There’s a whole new frontier out there—production—that demands a different set of skills.
Site reliability engineering (SRE) is, according to Ben Treynor,
  • 16. founder of Google’s site reliability team, “what happens when a software engineer is tasked with what used to be called operations.” SREs share a lot of skills with traditional software engineers, but they target them differently, with an eye on production. Spring Boot developers can and should develop SRE skills, and few know the ropes better than this book’s author, Jonathan Schneider.
Spring Boot succeeds by being laser-focused on production. It is extraordinary because it, along with frameworks like Dropwizard, is built stem to stern for production. It supports easy aggregation of metrics with Micrometer, so-called fat .jar deployments, the Actuator management endpoints, application life cycle events, 12 Factor-style configuration, and more. Spring Cloud is a set of Spring Boot extensions designed to support the microservices architecture.
And all that is to say nothing of the rich platform support. Spring Boot is container-native. It and platforms like Cloud Foundry or Kubernetes are two sides of the same coin. Spring Boot supports graceful shutdown, health groups, liveness and readiness probes, Docker image generation (leveraging CNCF buildpacks), and so much more.
And I’ll bet you didn’t know about most of those features! But Jonathan knows. Jonathan lives and breathes this stuff. He created the Micrometer project, a dimensional metrics framework that supports dozens of different metrics and monitoring platforms, and then helped integrate it into Spring Boot’s Actuator module, and into countless other third-party open source projects. I’ve watched him work on metrics, continuous delivery tools like Spinnaker, and observability tools like Micrometer, and generally pave the path to production for others. He’s got years of experience in leveraging Spring and Spring Boot at a global scale, and while he can sling Spring Data repositories and craft HTTP APIs with the best of them, his genius is in the way he builds for that endless journey.
And now we can learn from him in this book. Chapter 1 is a manifesto of sorts—read this to get in the right frame of mind for the book. I should’ve read this chapter first! I should’ve, but I didn’t. I skipped ahead to Chapter 2, which introduces instrumentation and metrics. While I was eager to read the book cover to cover, I was most excited about this chapter. It’s not surprising that the creator of Micrometer could so well articulate the concepts in this chapter. Chapters 2 through 4 are a few hours very well spent. They’re brilliant. 5/5: would (and did) read again.
Chapter 5 introduces the clouds, core concepts, types of platforms, and patterns unique to these platforms. This chapter was one of the most insightful for me. It starts slowly, and the next thing you know, you’re up to your deployment scripts in a game-changing discussion of continuous delivery, canary analysis, and more. Read this chapter twice.
  • 17. Chapter 6 gives you a framework and specific solutions to understand your codebase and its dependencies. I’ve never seen all these concerns presented in a comprehensive framework like this.
Chapter 7 is a fitting closing chapter to an amazing book in that it looks at the interactions of deployed services in production. By this point you’ll have learned how to get the software to production and how to observe services and their source code. This chapter is all about the service interactions and the dynamics of load on an architecture.
I learned something new in every chapter. The book is full of the wisdom of a true cloud native, and one that I think the community needs. Jonathan is a fantastic guide on the endless journey to production...and beyond.
— Josh Long (@starbuxman)
Spring Developer Advocate
Spring team, VMware
San Francisco, CA
July 2020
  • 19. Preface
This book presents a phased approach to building and deploying more reliable Java microservices. The capabilities presented in each chapter are meant to be followed in order, each building upon the capabilities of earlier chapters. There are five phases in this journey:
1. Measure and monitor your services for availability.
2. Add debuggability signals that allow you to ask questions about periods of unavailability.
3. Improve your software delivery pipeline to limit the chance of introducing more failure.
4. Build the capability to observe the state of deployed assets all the way down to source code.
5. Add just enough traffic management to bring your services up to a level of availability you are satisfied with.
Our goal isn’t to build a perfect system, to eliminate all failure. Our goal is to end with a highly reliable system and avoid spending time in the space of diminishing returns. Avoiding diminishing returns is why we will spend so much time talking about effective measurement and monitoring, and why this discipline precedes all others.
If you are in engineering management, Chapter 1 is your mission statement: to build an application platform renowned for its reliability and the culture of an effective platform engineering team that can deliver these capabilities to a broader engineering organization.
The chapters that follow contain the blueprints for achieving this mission, targeted at engineers. This book is intentionally narrowed in scope to Java microservices precisely so that I can offer detailed advice on how to go about this, including specific measurements battle-tested for Java microservices, code samples, and other
  • 20. idiosyncrasies like dependency management concerns that are unique to the Java virtual machine (JVM). Its focus is on immediate actionability.
My Journey
My professional journey in software engineering forms an arc that led me to write this book:
• A scrappy custom software startup
• A traditional insurance company called Shelter Insurance in Missouri
• Netflix in Silicon Valley
• A Spring team engineer working remotely
• A Gradle engineer
When I left Shelter Insurance, despite my efforts, I didn’t understand public cloud. In almost seven years there, I had interacted with the same group of named virtual machines (bare metal actually, originally). I was used to quarterly release cycles and extensive manual testing before releases. I felt like leaders emphasized and reemphasized how “hard” we expected code freezes to be leading up to releases, how after a release a code freeze wasn’t as hard as we would have liked, etc. I had never experienced production monitoring of an application—that was the responsibility of a network operations center, a room my badge didn’t provide access to because I didn’t need to know what happened there. This organization was successful by most measures. It has changed significantly in some ways since then, and little in others. I’m thankful for the opportunity to have learned under some fantastic engineers there.
At Netflix I learned valuable lessons about engineering and culture. I left after a time with a great sense of hope that some of these same practices could be applied to a company like Shelter Insurance, and joined the Spring team. When I founded the open source metrics library Micrometer, it was with a deep appreciation of the fact that organizations are on a journey. Rather than supporting just the best-in-class monitoring systems of today, Micrometer’s first five monitoring system implementations contained three legacy monitoring systems that I knew were still in significant use.
A couple of years working with and advising enterprises of various sizes on application monitoring and delivery automation with Spinnaker gave me an idea of both the diversity of organizational dynamics and their commonalities. It is my understanding of the commonalities, those practices and techniques that every enterprise could benefit from, that form the substance of this book. Every enterprise Java organization can apply these techniques, given a bit of time and practice. That includes your organization.
  • 21. Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
Constant width bold
Shows commands or other text that should be typed literally by the user.
Constant width italic
Shows text that should be replaced with user-supplied values or by values determined by context.
This element signifies a tip or suggestion.
This element signifies a general note.
This element indicates a warning or caution.
  • 22. O’Reilly Online Learning
For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit http://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata and any additional information. You can access this page at https://oreil.ly/SRE_with_Java_Microservices.
Email [email protected] to comment or ask technical questions about this book.
For news and information about our books and courses, visit http://oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
  • 23. Acknowledgments
Olga Kundzich
What I didn’t know before writing this book is how much voices from an author’s circle of colleagues find their way into a book. It makes complete sense, of course. We influence each other simply by working together! Olga’s insightful views on a wide range of topics have probably had the greatest single influence on my thinking in the last couple of years, and her voice is everywhere in this book (or at least the best approximation of it I can represent). Thoughts you’ll find on “the application platform,” continuous delivery (no no, not continuous deployment—I kept confusing the two), asset inventory, monitoring, and elements of traffic management are heavily influenced by her. Thank you Olga for investing so much energy into this book.
Troy Gaines
To Troy I owe my initial introduction to dependency management, build automation, continuous integration, unit testing, and so many other essential skills. He was an early and significant influence in my growth as a software developer, as I know he has been to many others. Thank you, old friend, for taking the time to review this work as well.
Tommy Ludwig
Tommy is one of the rare telemetry experts that contributes to both distributed tracing and aggregated metrics technologies. It is so common that contributors in the observability space are hyper-focused on one area of it, and Tommy is one of the few that floats between them. To put it mildly, I dreaded Tommy’s review of Chapter 3, but was happy to find that we had more in common on this than I expected. Thanks for pointing out the more nuanced view of distributed tracing tag cardinality that made its way into Chapter 3.
Sam Snyder
I haven’t known Sam for long, but it didn’t take long for me to understand that Sam is an excellent mentor and patient teacher. Thank you Sam for agreeing to subject yourself to the arduous task of reviewing a technical book, and leaving so much positive and encouraging feedback.
Mike McGarr
I received an email out of the blue from Mike in 2014 that, a short time later, resulted in me packing everything up and moving to California. That email set me on a course that changed everything. I came to know so many experts at Netflix that accelerated me through the learning process because Mike took a chance on me. It radically changed the way I view software development and operations. Mike is also just a fantastic human being—a kind and inquisitive friend and leader. Thanks, Mike.
  • 24. Josh Long
Once in the book, I quoted a typical Josh Long phrase about there being “no place like” production. I thought I was being cheeky and fun. And then Josh wrote a foreword that features Buzz Lightyear…Josh is an unstoppable ball of energy. Thank you Josh for injecting a bit of that energy into this work.
  • 25. CHAPTER 1
The Application Platform
Martin Fowler and James Lewis, who initially proposed the term microservices, define the architecture in their seminal blog post as:
…a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data.
Adopting microservices promises to accelerate software development by separating applications into independently developed and deployed components produced by independent teams. It reduces the need to coordinate and plan large-scale software releases. Each microservice is built by an independent team to meet a specific business need (for internal or external customers). Microservices are deployed in a redundant, horizontally scaled way across different cloud resources and communicate with each other over the network using different protocols.
A number of challenges arise due to this architecture that haven’t been seen previously in monolithic applications. Monolithic applications used to be primarily deployed on the same server and infrequently released as a carefully choreographed event. The software release process was the main source of change and instability in the system. In microservices, communications and data transfer costs introduce additional latencies and potential to degrade end-user experience. A chain of tens or hundreds of microservices now work together to create that experience. Microservices are released independently of each other, but each one can inadvertently impact other microservices and therefore the end-user experience, too.
Managing these types of distributed systems requires new practices, tools, and engineering culture.
Accelerating software releases doesn’t need to come at the cost of stability and safety. In fact, these go hand in hand. This chapter introduces the culture
  • 26. of an effective platform engineering team and describes the basic building blocks of reliable systems.
Platform Engineering Culture
To manage microservices, an organization needs to standardize specific communication protocols and supporting frameworks. A lot of inefficiencies arise if each team needs to maintain its own full stack development, as does friction when communicating with other parts of a distributed application. In practice, standardization leads to a platform team that is focused on providing these services to the rest of the teams, who are in turn focused on developing software to meet business needs.
We want to provide guardrails, not gates. —Dianne Marsh, director of engineering tools at Netflix
Instead of building gates, allow teams to build solutions that work for them first, learn from them, and generalize to the rest of the organization.
Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations. —Conway’s Law
Figure 1-1 shows an engineering organization built around specialties. One group specializes in user interface and experience design, another building backend services, another managing the database, another working on business process automation, and another managing network resources.
Figure 1-1. Organization built around technical silos
  • 27. The lesson often taken from Conway’s Law is that cross-functional teams, as in Figure 1-2, can iterate faster. After all, when team structure is aligned to technical specialization, any new business requirement will require coordination across all of these specializations.
Figure 1-2. Cross-functional teams
There is obviously waste in this system as well though, specifically that specialists on each team are developing capabilities independently of one another. Netflix did not have dedicated site reliability engineers per team, as Google promotes in Site Reliability Engineering edited by Betsy Beyer et al. (O’Reilly). Perhaps because of a greater degree of homogeneity to the type of software being written by product teams (mostly Java, mostly stateless horizontally scaled microservices), the centralization of product engineering functions was more efficient. Does your organization more resemble Google, working on very different types of products from automated cars to search to mobile hardware to browsers? Or does it more resemble Netflix, composed of a series of business applications written in a handful of languages running on a limited variety of platforms?
Cross-functional teams and completely siloed teams are just on the opposite ends of a spectrum. Effective platform engineering can reduce the need for a specialist per team for some set of problems. An organization with dedicated platform engineering is more of a hybrid, like in Figure 1-3. A central platform engineering team is strongest when it views product teams as customers that need to be constantly won over and exercises little to no control over the behavior of its customers.
  • 28. Figure 1-3. Product teams with dedicated platform engineering
For example, when monitoring instrumentation is distributed throughout the organization as a common library included in each microservice, it shares the hard-won knowledge of availability indicators known to be broadly applicable. Each product team can spend just a little time adding availability indicators that are unique to its business domain. It can communicate with the central monitoring team for information and advice on how to build effective signals as necessary.
At Netflix, the strongest cultural current was “freedom and responsibility,” defined in a somewhat famous culture deck from 2009. I was a member of the engineering tools team, but we could not require that everyone else adopt a particular build tool. A small team of engineers managed Cassandra clusters on behalf of many product teams. There is an efficiency to this concentration of build tool or Cassandra skill: it forms a natural communication hub through which undifferentiated problems with these products flowed and lessons were transferred to product-focused teams.
The build tools team at Netflix, at its smallest point, was just two engineers serving the interests of roughly 700 other engineers while transitioning between recommended build tools (Ant to Gradle) and performing two major Java upgrades (Java 6 to 7 and then Java 7 to 8), among other daily routines. Each product team completely owned its build. Because of “freedom and responsibility,” we could not set a hard date for when we would completely retire Ant-based build tooling. We could not set a hard date for when every team had to upgrade its version of Java (except to the extent that a new Oracle licensing model did this for us). The cultural imperative drove us to
  • 29. focus so heavily on developer experience that product teams wanted to migrate with us. It required a level of effort and empathy that could only be guaranteed by absolutely preventing us from setting hard requirements.
When a platform engineer like myself serves the interests of so many diverse product teams in a focused technical specialty like build tooling, inevitably patterns emerge. My team saw the same script playing out over and over again with binary dependency problems, plug-in versioning, release workflow problems, etc. We worked initially to automate the discovery of these patterns and emit warnings in build output. Without the freedom-and-responsibility culture, perhaps we would have skipped warnings and just failed the build, requiring product teams to fix issues. This would have been satisfying to the build tools team: we wouldn’t be responsible for answering questions related to failures that we tried to warn teams about. But from the product team perspective, every “lesson” the build tools team learned would be disruptive to them at random points in time, and especially disruptive when they had more pressing (if temporary) priorities.
The softer, non-failing warning approach was shockingly ineffective. Teams rarely paid any attention to successful build logs, regardless of how many warnings were emitted. And even if they did see the warnings, attempting to fix them incurred risk: a working build with warnings is better than a misbehaving one without warnings. As a result, carefully crafted deprecation warnings could go ignored for months or years.
The “guardrails not gates” approach required our build tools team to think about how we could share our knowledge with product teams in a way that was visible to them, required little time and effort to act on, and reduced the risk of coming along with us on the paved path. The tooling that emerged from this was almost over the top in its focus on developer experience.
First, we wrote tooling that could rewrite the Groovy code of Gradle builds to autoremediate common patterns. This was much more difficult than just emitting warnings in the log. It required making indentation-preserving abstract syntax tree modifications to imperative build logic, an impossible problem to solve in general, but surprisingly effective in specific cases. Autoremediation was opt-in, though: product teams ran a simple command to accept recommendations.
Next, we wrote monitoring instrumentation that reported patterns that were potentially remediable but for which product teams did not accept the recommendation. We could monitor each harmful pattern in the organization over time and watch it decline in impact as teams accepted remediations. When we reached the long tail of a small number of teams that just wouldn’t opt in, we knew who they were, so we could walk over to their desks and work with them one on one to hear their concerns and help them move forward. (I did this enough that I started carrying my own mouse around. There was a suspicious correlation between Netflix engineers who
  • 30. used trackballs and Netflix engineers who were on the long tail of accepting remediations.) Ultimately, this proactive communication established a bond of trust that made future recommendations from us seem less risky.
We went to fairly extreme lengths to improve the visibility of recommendations without resorting to breaking builds to get developers’ attention. Build output was carefully colorized and stylized, sometimes with visual indicators like Unicode check marks and X marks that were hard to miss. Recommendations always appeared at the end of the build because we knew that they were the last thing emitted on the terminal and our CI tools by default scrolled to the end of the log output when engineers examined build output. We taught Jenkins how to masquerade as a TTY terminal to colorize build output but ignore cursor movement escape sequences to still serialize build task progress.
Crafting this kind of experience was technically costly, but compare it with the two options:
Freedom and responsibility culture
    Led us to build self-help autoremediation with monitoring that helped us understand and communicate with the teams that struggled.
Centralized control culture
    We probably would have been led to break builds eagerly because we “owned” the build experience. Teams would have been distracted from their other priorities to accommodate our desire for a consistent build experience. Every change, because it lacked autoremediation, would have generated far more questions to us as the build tools team. The total amount of toil for every change would have been far greater.
An effective platform engineering team cares deeply about developer experience, a singular focus that is at least as keen as the focus product teams place on customer experience. This should be no surprise: in a well-calibrated platform engineering organization, developers are the customer!
The presence of a healthy product management discipline, expert user experience designers, and UI engineers and designers that care deeply about their craft should all be indicators of a platform engineering team that is aligned for the benefit of their customer developers.
More detail on team structure is out of the scope of this book, but refer to Team Topologies by Matthew Skelton and Manuel Pais (IT Revolution Press) for a thorough treatment of the topic.
Once the team is culturally calibrated, the question becomes how to prioritize capabilities that a platform engineering team can deliver to its customer base. The remainder of this book is a call to action, delivered in capabilities ordered from (in my view) most essential to less essential.
  • 31. Monitoring
Monitoring your application infrastructure requires the least organizational commitment of all the stages on the journey to more resilient systems. As we’ll show in the subsequent chapters, framework-level monitoring instrumentation has matured to such an extent that you really just need to turn it on and start taking advantage. The cost-benefit ratio has been skewed so heavily toward benefit that if you do nothing else in this book, start monitoring your production applications now. Chapter 2 will discuss metrics building blocks, and Chapter 4 will provide the specific charts and alerts you can employ, mostly based on instrumentation that Java frameworks provide without you having to do any additional work.
Metrics, logs, and distributed tracing are three forms of observability that enable the measurement of service availability and aid in debugging complex distributed systems problems. Before going further into the workings of any of these, it is useful to understand what capabilities each enables.
Monitoring for Availability
Availability signals measure the overall state of the system and whether that system is functioning as intended in the large. Availability is quantified by service level indicators (SLIs). These indicators include signals for the health of the system (e.g., resource consumption) and business metrics like number of sandwiches sold or streaming video starts per second. SLIs are tested against a threshold called a service level objective (SLO) that sets an upper or lower bound on the range of an SLI. An SLO in turn is somewhat more restrictive or conservative than the threshold you agree upon with your business partners for the level of service you are expected to provide, known as a service level agreement (SLA). The idea is that an SLO should provide some amount of advance warning of an impending violation of an SLA so that you don’t actually get to the point where you violate that SLA.
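As a rough illustration of how these three terms relate, the following Java sketch tests a single SLI sample (a latency measurement) against an SLO that is deliberately tighter than the SLA. The class name, the status names, and the 80 ms and 100 ms thresholds are hypothetical values chosen for illustration; they do not come from any library.

```java
/**
 * Sketch: an SLI tested against an SLO that sits inside the SLA.
 * All names and thresholds here are illustrative assumptions.
 */
final class LatencyObjective {
    enum Status { HEALTHY, SLO_VIOLATED, SLA_VIOLATED }

    private final double sloMillis; // internal, more conservative upper bound
    private final double slaMillis; // upper bound agreed upon with business partners

    LatencyObjective(double sloMillis, double slaMillis) {
        if (sloMillis >= slaMillis) {
            // An SLO that is not stricter than the SLA gives no advance warning.
            throw new IllegalArgumentException("SLO must be stricter than the SLA");
        }
        this.sloMillis = sloMillis;
        this.slaMillis = slaMillis;
    }

    /** Classifies one SLI sample (here, a latency in milliseconds). */
    Status evaluate(double sliMillis) {
        if (sliMillis > slaMillis) {
            return Status.SLA_VIOLATED;
        }
        if (sliMillis > sloMillis) {
            return Status.SLO_VIOLATED; // warning zone: SLO breached, SLA still intact
        }
        return Status.HEALTHY;
    }

    public static void main(String[] args) {
        LatencyObjective objective = new LatencyObjective(80.0, 100.0);
        System.out.println(objective.evaluate(92.0)); // falls between the SLO and the SLA
    }
}
```

The gap between the two thresholds is the advance warning: an alert that fires on the SLO leaves room to react before the SLA itself is violated.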
Metrics are the primary observability tool for measuring availability. They are a measure of SLIs. Metrics are the most common availability signal because they represent an aggregation of all activity happening in the system. They are cheap enough to not require sampling (discarding some portion of the data to limit overhead), which risks discarding important indicators of unavailability.
Metrics are numerical values arranged in a time series representing a sample at a particular time or an aggregate of individual events that have occurred in an interval:
  • 32.
Metrics
    Metrics should have a fixed cost irrespective of throughput. For example, a metric that counts executions of a particular block of code should only ship the number of executions seen in an interval regardless of how many there are. By this I mean that a metric should ship “N requests were observed” at publish time, not “I saw a request N distinct times” throughout the publishing interval.
Metrics data
    Metrics data cannot be used to reason about the performance or function of any individual request. Metrics telemetry trades off reasoning about an individual request for the application’s behavior across all requests in an interval.
To effectively monitor the availability of a Java microservice, a variety of availability signals need to be monitored. Common signals are given in Chapter 4, but in general they fall into four categories, known together as the L-USE method (footnote 1: I first learned of the USE criteria from Brendan Gregg’s description of his method for monitoring Unix systems. In that context, latency measurement isn’t as granular, thus the missing L):
Latency
    This is a measure of how much time was spent executing a block of code. For the common REST-based microservice, REST endpoint latency is a useful measure of the availability of the application, particularly max latency. This will be discussed in greater detail in “Latency” on page 153.
Utilization
    A measure of how much of a finite resource is consumed. Processor utilization is a common utilization indicator. See “CPU Utilization” on page 170.
Saturation
    Saturation is a measurement of extra work that can’t be serviced. “Garbage Collection Pause Times” on page 161 shows how to measure the Java heap, which during times of excessive memory pressure leads to a buildup of work that cannot be completed. It’s also common to monitor pools like database connection pools, request pools, etc.
Errors
    In addition to looking at purely performance-related concerns, it is essential to find a way to quantify the error ratio relative to total throughput. Measurements of error include unanticipated exceptions yielding unsuccessful HTTP responses on a service endpoint (see “Errors” on page 148), but also more indirect measures like the ratio of requests attempted against an open circuit breaker (see “Circuit Breakers” on page 280).
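The fixed-cost property and two of the L-USE signals (max latency and error ratio) can be sketched with nothing but the JDK. The class below is a hypothetical illustration, not the instrumentation library discussed in later chapters: each request only updates in-memory aggregates, and a single snapshot is shipped per publishing interval regardless of how many requests were seen.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

/** Sketch of interval-based aggregation; all names are illustrative assumptions. */
final class IntervalMetrics {
    /** Immutable aggregate shipped once per publishing interval. */
    static final class Snapshot {
        final long requests;
        final long errors;
        final long maxLatencyNanos;

        Snapshot(long requests, long errors, long maxLatencyNanos) {
            this.requests = requests;
            this.errors = errors;
            this.maxLatencyNanos = maxLatencyNanos;
        }

        /** Errors relative to total throughput; 0 when nothing was observed. */
        double errorRatio() {
            return requests == 0 ? 0.0 : (double) errors / requests;
        }
    }

    private final LongAdder requests = new LongAdder();
    private final LongAdder errors = new LongAdder();
    private final AtomicLong maxLatencyNanos = new AtomicLong();

    /** Called once per request; nothing is shipped here. */
    void record(long latencyNanos, boolean failed) {
        requests.increment();
        if (failed) {
            errors.increment();
        }
        maxLatencyNanos.accumulateAndGet(latencyNanos, Math::max);
    }

    /**
     * Called once per interval: returns "N requests were observed" and resets.
     * The three reads are not atomic as a group, which is acceptable for a sketch.
     */
    Snapshot publish() {
        return new Snapshot(requests.sumThenReset(), errors.sumThenReset(), maxLatencyNanos.getAndSet(0));
    }

    public static void main(String[] args) {
        IntervalMetrics metrics = new IntervalMetrics();
        for (int i = 0; i < 1000; i++) {
            metrics.record(1_000_000L + i, i < 5); // 5 of 1,000 requests fail
        }
        Snapshot snapshot = metrics.publish();
        System.out.println(snapshot.requests + " requests, error ratio " + snapshot.errorRatio());
    }
}
```

Note what the snapshot cannot tell you: which individual request was slow or failed. That is the trade-off described above for metrics data.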
  • 33. Utilization and saturation may seem similar at first, and internalizing the difference will have an impact on how you think about charting and alerting on resources that can be measured both ways. A great example is JVM memory. You can measure JVM memory as a utilization metric by reporting on the amount of bytes consumed in each memory space. You can also measure JVM memory in terms of the proportion of time spent garbage collecting it relative to doing anything else, which is a measure of saturation. In most cases, when both utilization and saturation measurements are possible, the saturation metric leads to better-defined alert thresholds. It’s hard to alert when memory utilization exceeds 95% of a space (because garbage collection will bring that utilization rate back below this threshold), but if memory utilization routinely and frequently exceeds 95%, the garbage collector will kick in more frequently, more time will be spent proportionally doing garbage collection than anything else, and the saturation measurement will thus be higher.
Some common availability signals are listed in Table 1-1.
Table 1-1. Examples of availability signals
SLI                             | SLO                                   | L-USE criteria
Process CPU usage               | <80%                                  | Saturation
Heap utilization                | <80% of available heap space          | Saturation
Error ratio for a REST endpoint | <1% of total requests to the endpoint | Errors
Max latency for a REST endpoint | <100 ms                               | Latency
Google has a much more prescriptive view on how to use SLOs.
Google’s approach to SLOs
Site Reliability Engineering by Betsy Beyer et al. (O’Reilly) presents service availability as a tension between competing organizational imperatives: to deliver new features and to run the existing feature set reliably. It proposes that product teams and dedicated site reliability engineers agree on an error budget that provides a measurable objective for how unreliable a service is allowed to be within a given window of time.
Exceeding this objective should refocus the team on reliability over feature development until the objective is met.
The Google view on SLOs is explained in great detail in the “Alerting on SLOs” chapter in The Site Reliability Workbook edited by Betsy Beyer et al. (O’Reilly). Basically, Google engineers alert on the probability that an error budget is going to be depleted in any given time frame, and they react in an organizational way by shifting engineering resources from feature development to reliability as necessary. The word “error” in this case means exceeding any SLO. This might mean exceeding an acceptable ratio of server failed outcomes in a RESTful microservice, but could also mean exceeding an acceptable latency threshold, getting too close to overwhelming file descriptors on
  • 34. the underlying operating system, or any other combination of measurements. With this definition, the time that a service is unreliable in a prescribed window is the proportion when one or more SLOs were not being met.
Your organization doesn’t need to have separate functions for product engineer and site reliability engineer for error budgeting to be a useful concept. Even a single engineer working on a product completely alone and wholly responsible for its operation can benefit from thinking about where to pause feature development in favor of improving reliability and vice versa.
I think the overhead of the Google error budget scheme is overkill for a lot of organizations. Start measuring, discover how alerting functions fit into your unique organization, and once practiced at measuring, consider whether you want to go all in on Google’s process or not.
Collecting, visualizing, and alerting on application metrics is an exercise in continuously testing the availability of your services. Sometimes an alert itself will contain enough contextual data that you know how to fix a problem. In other cases, you’ll want to isolate a failing instance in production (e.g., by moving it out of the load balancer) and apply further debugging techniques to discover the problem. Other forms of telemetry are used for this purpose.
A less formal approach to SLOs
A less formal system worked well for Netflix, where individual engineering teams were responsible for their services’ availability, there was no SRE/product engineer separation of responsibility on individual product teams, and there wasn’t such a formalized reaction to error budgets, at least not cross-organizationally. Neither system is right or wrong; find a system that works well for you.
For the purpose of this book, we’ll talk about how to measure for availability in simpler terms: as tests against an error rate or error ratio, latencies, saturation, and utilization indicators.
We won’t present violations of these tests as particular “errors” of reliability that are deducted from an error budget over a window of time. If you want to then take those measurements and apply the error-budgeting and organizational dynamics of Google’s SRE culture to your organization, you can do that by following the guidance given in Google’s writings on the topic.
Monitoring as a Debugging Tool
Logs and distributed traces, covered in detail in Chapter 3, are used mainly for troubleshooting, once you have become aware of a period of unavailability. Profiling tools are also debuggability signals.
It is very common (and easy, given a confusing market) for organizations to center their entire performance management investment around debuggability tools.
  • 35. Application performance management (APM) vendors sometimes sell themselves as an all-in-one solution even though their core technology is built entirely on tracing or logging, and they provide availability signals only by aggregating these debugging signals.
In order to not single out any particular vendor, consider YourKit, a valuable profiling (debuggability) tool that does this task well without selling itself as more. YourKit excels at highlighting computation- and memory-intensive hotspots in Java code, and looks like Figure 1-4. Some popular commercial APM solutions have a similar focus, which, while useful, is not a substitute for a focused availability signal.
Figure 1-4. YourKit excels at profiling
These solutions are more granular, recording in different ways the specifics of what occurred during a particular interaction with the system. With this increased granularity comes cost, and this cost is frequently mitigated with downsampling or even turning off these signals entirely until they are needed.
Attempts to measure availability from log or tracing signals generally force you to trade off accuracy for cost, and neither can be optimized. This trade-off exists for traces because they are generally sampled. The storage footprint for traces is higher than for metrics.
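To make the accuracy side of this trade-off concrete, here is a small Java sketch that compares an error ratio computed over every request with one estimated from a sampled subset. It is a deliberate simplification: real tracing systems usually sample probabilistically, whereas this hypothetical example keeps every Nth request so that the result is deterministic. The class name, method names, and failure positions are all assumptions made for illustration.

```java
/** Sketch: availability measured from all requests versus from a sample. */
final class SamplingTradeoff {
    /** Error ratio over every request, as an aggregated metric would report it. */
    static double exactErrorRatio(boolean[] failed) {
        int errors = 0;
        for (boolean f : failed) {
            if (f) {
                errors++;
            }
        }
        return failed.length == 0 ? 0.0 : (double) errors / failed.length;
    }

    /** Error ratio estimated from only every Nth request (a stand-in for trace sampling). */
    static double sampledErrorRatio(boolean[] failed, int keepEveryNth) {
        if (keepEveryNth <= 0) {
            throw new IllegalArgumentException("keepEveryNth must be positive");
        }
        int kept = 0;
        int errors = 0;
        for (int i = 0; i < failed.length; i += keepEveryNth) {
            kept++;
            if (failed[i]) {
                errors++;
            }
        }
        return kept == 0 ? 0.0 : (double) errors / kept;
    }

    public static void main(String[] args) {
        boolean[] failed = new boolean[1000];
        // Hypothetical failure positions: 5 failures among 1,000 requests.
        for (int i : new int[] {3, 207, 411, 615, 819}) {
            failed[i] = true;
        }
        System.out.println("all requests:  " + exactErrorRatio(failed));
        System.out.println("1-in-100 kept: " + sampledErrorRatio(failed, 100));
    }
}
```

In this contrived run none of the five failures lands on a kept request, so the sampled view reports no errors at all. Keeping more requests narrows the gap but raises cost.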
  • 36. Learning to Expect Failure
If you aren’t already monitoring applications in a user-facing way, as soon as you start, you’re likely to be confronted with the sobering reality of your software as it exists today. Your impulse is going to be to look away. Reality is likely to be ugly.
At a midsize property-casualty insurance company, we added monitoring to the main business application that the company’s insurance agents use to conduct their normal business. Despite strict release processes and a reasonably healthy testing culture, the application manifested over 5 failures per minute for roughly 1,000 requests per minute. From one perspective, this is only a 0.5% error ratio (maybe acceptable and maybe not), but the failure rate was still a shock to a company that thought its service was well tested.
The realization that the system is not going to be perfect switches the focus from trying to be perfect to monitoring, alerting, and quickly resolving issues that the system experiences. No amount of process control around the rate of change will yield perfect outcomes.
Before evolving the delivery and release process further, the first step on the path to resilient software is adding monitoring to your software as it is released now.
With the move to microservices and changing application practices and infrastructure, monitoring has become even more important. Many components are not directly under an organization’s control. For example, latency and errors can be caused by failures in the networking layer, infrastructure, and third-party components and services. Each team producing a microservice has the potential to negatively impact other parts of the system not under its direct control.
End users of software also do not expect perfection, but do want their service provider to be able to effectively resolve issues.
This is what is known as the service recovery paradox: a user will trust a service more after a failure that was resolved well than they did before the failure.
Businesses need to understand and capture the user experience they want to provide to the end users: what type of system behavior will cause issues to the business and what type of behavior is acceptable to users. Site Reliability Engineering and The Site Reliability Workbook have more on how to pick these for your business.
Once these are identified and measured, you can adopt the Google style, as seen in “Google’s approach to SLOs” on page 9, or Netflix’s more informal “context and guardrails” style, or anything in between, to help you reason about your software or the next steps. See the first chapter on Netflix in Seeking SRE by David N. Blank-Edelman (O’Reilly) to learn more about context and guardrails. Whether you follow the Google practice or a simpler one is up to your organization, the type of software you develop, and the engineering culture you want to promote.
  • 37. With the goal of never failing replaced with the goal of being able to meet SLAs, engineering can start building multiple layers of resiliency into systems, minimizing the effects of failures on end-user experience.
Effective Monitoring Builds Trust
In certain enterprises, engineering can still be seen as a service organization rather than a core business competency. At the insurance company with a five-failures-per-minute error rate, this was the prevailing attitude. In many cases where the engineering organization served the company’s insurance agents in the field, the primary interaction between them happened through reporting and tracking software issues through a call center.
Engineering routinely prioritized bug resolution, based on defects learned from the call center, against new feature requests and did a little of both for each software release. I wondered how many times the field agents simply didn’t report issues, either because a growing bug backlog suggested that it wasn’t an effective use of their time or because the issue had a good-enough workaround. The problem with becoming aware of issues primarily through the call center is that it made the relationship entirely one way. Business partners report and engineering responds (eventually).
A user-centric monitoring culture makes this relationship more two-way. An alert may provide enough contextual information to recognize that rating for a particular class of vehicle is failing for agents in some region today. Engineering has the opportunity to reach out to the agents proactively with enough contextual information to explain to the agent that the issue is already known.
Delivery
Improving the software delivery pipeline lessens the chance that you introduce more failure into an existing system (or at least helps you recognize and roll back such changes quickly). It turns out that good monitoring is a nonobvious prerequisite to evolving safe and effective delivery practices.
The division between continuous integration (CI) and continuous delivery (CD) tends to be blurred by the fact that teams frequently script deployment automation and run these scripts as part of continuous integration builds. It is easy to repurpose a CI system as a flexible general-purpose workflow automation tool. To make a clear conceptual delineation between the two, regardless of where the automation runs, we’ll say that continuous integration ends at the publication of a microservice artifact to an artifact repository, and delivery begins at that point. In Figure 1-5, the software delivery life cycle is drawn as a sequence of events from code commit to deployment.
  • 38. Figure 1-5. The boundary between continuous integration and delivery
The individual steps are subject to different frequencies and organizational needs for control measures. They also have fundamentally different goals. The goal of continuous integration is to accelerate developer feedback, fail fast through automated testing, and encourage eager merging to prevent promiscuous integration. The goal of delivery automation is to accelerate the release cycle, ensure security and compliance measures are met, provide safe and scalable deployment practices, and contribute to an understanding of the deployed landscape for the monitoring of deployed assets.
The best delivery platforms also act as an inventory of currently deployed assets, further magnifying the effect of good monitoring: they help turn monitoring into action. In Chapter 6, we’ll talk about how you can build an end-to-end asset inventory, ending with a deployed asset inventory, that allows you to reason about the smallest details of your code all the way up to your deployed assets (i.e., containers, virtual machines, and functions).
Continuous Delivery Doesn’t Necessarily Mean Continuous Deployment
Truly continuous deployment (every commit passing automated checks goes all the way to production automatically) may or may not be a goal for your organization. All things being equal, a tighter feedback loop is preferable to a longer feedback loop, but it comes with technical, operational, and cultural costs. Any delivery topics discussed in this book apply to continuous delivery in general, as well as continuous deployment in particular.
Once effective monitoring is in place and less failure is being introduced into the system by further changes to the code, we can focus on adding more reliability to the running system by evolving traffic management practices.
  • 39. Traffic Management
So much of a distributed system’s resiliency is based on the expectation of and compensation for failure. Availability monitoring reveals these actual points of failure, debuggability monitoring helps understand them, and delivery automation helps prevent you from introducing too many more of them in any incremental release. Traffic management patterns will help live instances cope with the ever-present reality of failure.
In Chapter 7, we’ll introduce particular mitigation strategies involving load balancing (platform, gateway, and client-side) and call resilience patterns (retrying, rate limiters, bulkheads, and circuit breakers) that provide a safety net for running systems.
This is covered last because it requires the highest degree of manual coding effort on a per-project basis, and because the investment you make in doing the work can be guided by what you learn from the earlier steps.
Capabilities Not Covered
Certain capabilities that are common focuses of platform engineering teams are not included in this book. I’d like to call out a couple of them, testing and configuration management, and explain why.
Testing Automation
My view on testing is that testing automation available in open source takes you a certain way. Any investment beyond that is likely to suffer from diminishing returns. Following are some problems that are well solved already:
    • Unit testing
    • Mocking/stubbing
    • Basic integration testing, including test containers
    • Contract testing
    • Build tooling that helps separate computationally expensive and inexpensive test suites
There are a couple of other problems that I think are worth avoiding unless you really have a lot of resources (both computationally and in engineering time) to expend. Contract testing covers some of what both of them test, but in a far cheaper way. The two problems are:
  • 40.
    • Downstream testing (i.e., whenever a commit happens to a library, build all other projects that depend on this library, directly or indirectly, to determine whether the change will cause failure downstream)
    • End-to-end integration testing of whole suites of microservices
I’m very much for automated tests of various sorts and very suspicious of the whole enterprise. At times, feeling the social pressure of testing enthusiasts around me, I may have gone along with the testing fad of the day for a little while: 100% test coverage, behavior-driven development, efforts to involve nonengineer business partners in test specification, Spock, etc. Some of the cleverest engineering work in the open source Java ecosystem has taken place in this space: consider Spock’s creative use of bytecode manipulation to achieve data tables and the like.
Traditionally, working with monolithic applications, software releases were viewed as the primary source of change in the system and therefore potential for failure. Emphasis was placed on making sure the software release process didn’t fail. Much effort was expended to ensure that lower-level environments mirrored production to verify that pending software releases were stable. Once deployed and stable, the system was assumed to remain stable.
Realistically, this has never been the case. Engineering teams adopt and double down on automated testing practices as a cure for failure, only to have failure stubbornly persist. Management is skeptical of testing in the first place. When tests fail to capture problems, what little faith they had is gone. Production environments have a stubborn habit of diverging from test environments in subtle and seemingly always catastrophic ways.
At this point, if you forced me to choose between having close to 100% test coverage and an evolved production monitoring system, I’d eagerly choose the monitoring system.
This isn’t because I think less of tests, but because even in reasonably well-defined traditional businesses whose practices don’t change quickly, 100% test coverage is mythical. The production environment will simply behave differently. As Josh Long likes to say: “There is no place like it.”
Effective monitoring warns us when a system isn’t working correctly due to conditions we can anticipate (e.g., hardware failures or downstream service unavailability). It also continually adds to our knowledge of the system, which can actually lead to tests covering cases we didn’t previously imagine.
Layers of testing practice can limit the occurrence of failure, but will never eliminate it, even in industries with the tightest quality control practices. Actively measuring outcomes in production lowers time to discovery and ultimately remediation of failures. Testing and monitoring together are then complementary practices reducing how much failure end users experience. At their best, testing prevents whole classes of regressions, and monitoring quickly identifies those that inevitably remain.
  • 41.
Our automated test suites prove (to the extent they don’t contain logical errors themselves) what we know about the system. Production monitoring shows us what happens. An acceptance that automated tests won’t cover everything should be a tremendous relief.

Because application code will always contain flaws stemming from unanticipated interactions, environmental factors like resource constraints, and imperfect tests, effective monitoring might be considered even more of a requirement than testing for any production application.

A test proves what we think will happen. Monitoring shows what is happening.

Chaos Engineering and Continuous Verification

There is a whole discipline around continuously verifying that your software behaves as you expect by introducing controlled failures (chaos experiments) and verifying the outcome. Because distributed systems are complex, we cannot anticipate all of their myriad interactions, and this form of testing helps surface unexpected emergent properties of complex systems.

The overall discipline of chaos engineering is broad, and as it is covered in detail in Chaos Engineering by Casey Rosenthal and Nora Jones (O’Reilly), I won’t go into it in this book.

Configuration as Code

The 12-Factor App teaches that configuration ought to be separated from code. The basic form of this concept, configuration stored as an environment variable or fetched at startup from a centralized configuration server like Spring Cloud Config Server, is I think straightforward enough to not require any explanation here.

The more complicated case involving dynamic configuration, whereby changes to a central configuration source propagate to running instances and influence their behavior, is in practice exceedingly dangerous and must be handled with care. Paired with the open source Netflix Archaius configuration client (which is present in Spring Cloud Netflix dependencies and elsewhere) was a proprietary Archaius server that served this purpose.
Unintended consequences resulting from dynamic configuration propagation to running instances caused a number of production incidents of such magnitude that the delivery engineers wrote a whole canary analysis process around scoping and incrementally rolling out dynamic configuration changes, using the lessons they had learned from automated canary analysis for different versions of code. This is beyond the scope of this book, since many organizations will never receive substantial enough benefit from automated canary analysis of code changes to make that effort worthwhile.
  • 42.
Declarative delivery is an entirely different form of configuration as code, popularized again by the rise of Kubernetes and its YAML manifests. My early career left me with a permanent suspicion of the completeness of declarative-only solutions. I think there is always a place for both imperative and declarative configuration.

I worked on a policy administration system for an insurance company that consisted of a backend API returning XML responses and a frontend of XSLT transformations of these API responses into static HTML/JavaScript to be rendered in the browser. It was a bizarre sort of templating scheme. Its proponents argued that the XSLT lent the rendering of each page a declarative nature. And yet, it turns out that XSLT itself is Turing complete, with a convincing existence proof.

The typical point in favor of declarative definition is simplicity leading to an amenability to automation like static analysis and remediation. But as in the XSLT case, these technologies have a seemingly unavoidable way of evolving toward Turing completeness. The same forces are in play with JSON (Jsonnet) and Kubernetes (Kustomize). These technologies are undoubtedly useful, but I can’t be another voice in the chorus calling for purely declarative configuration. Short of making that point, I don’t think there is much this book can add.

Encapsulating Capabilities

As under fire as object-oriented programming (OOP) may be today, one of its fundamental concepts is encapsulation. In OOP, encapsulation is about bundling state and behavior within some unit, e.g., a class in Java. A key idea is to hide the state of an object from the outside, called information hiding. In some ways, the task of the platform engineering team is to perform a similar encapsulation task for resiliency best practices for its customer developer teams, hiding information not for the sake of control, but to unburden them from the responsibility of dealing with it.
Maybe the highest praise a central team can receive from a product engineer is “I don’t have to care about what you do.”

The subsequent chapters are going to introduce a series of best practices as I understand them. The challenge to you as a platform engineer is to deliver them to your organization in a minimally intrusive way, to build “guardrails not gates.” As you read, think about how you can encapsulate hard-won knowledge that’s applicable to every business application and how you can deliver it to your organization.

If the plan involves getting approval from a sufficiently powerful executive and sending an email to the whole organization requiring adoption by a certain date, it’s a gate. You still want buy-in from your leadership, but you need to deliver common functionality in a way that feels more like a guardrail:
  • 43.
Explicit runtime dependencies
If you have a core library that every microservice includes as a runtime dependency, this is almost certainly your delivery mechanism. Turn on key metrics, add common telemetry tagging, configure tracing, add traffic management patterns, etc. If you have heavy Spring usage, use autoconfiguration classes. You can similarly conditionalize configuration with CDI if you are using Java EE.

Service clients as dependencies
For traffic management patterns especially (fallbacks, retry logic, etc.), consider making it the responsibility of the team producing the service to also produce a service client that interacts with the service. After all, the team producing and operating it has more knowledge than anybody about where its weaknesses and potential failure points are. Those engineers are likely the best ones to formalize this knowledge in a client dependency such that each consumer of their service uses it in the most reliable way.

Injecting a runtime dependency
If the deployment process is relatively standardized, you have an opportunity to inject runtime dependencies in the deployed environment. This was the approach employed by the Cloud Foundry buildpack team to inject a platform metrics implementation into Spring Boot applications running on Cloud Foundry. You can do something similar.

Before encapsulating too eagerly, find a handful of teams and practice this discipline explicitly in code in a handful of applications. Generalize what you learn.

Service Mesh

As a last resort, encapsulate common platform functionality in sidecar processes (or containers) alongside the application, which when paired with a control plane managing them is called a service mesh.

The service mesh is an infrastructure layer outside of application code that manages interaction between microservices. One of the most recognizable implementations today is Istio.
These sidecars perform functions like traffic management, service discovery, and monitoring on behalf of the application process so that the application does not need to be aware of these concerns. At its best, this simplifies application development at the price of increased complexity and cost in deploying and running the service.

Over a long enough time horizon, trends in software engineering are often cyclic. In the case of site reliability, the pendulum swings from increased application and developer responsibility (e.g., Netflix OSS, DevOps) to centralized operations team responsibility. The rise of interest in service mesh represents a shift back to centralized operations team responsibility.
  • 44.
Istio promotes the concept of managing and propagating policy across a suite of microservices from its centralized control plane, at the behest of an organizationally centralized team that specializes in understanding the ramifications of these policies.

The venerable Netflix OSS suite (the important pieces of which have alternative incarnations like Resilience4j for traffic management, HashiCorp Consul for discovery, Micrometer for metrics instrumentation, etc.) made these application concerns. Largely, though, the application code impact was just the addition of one or more binary dependencies, at which point some form of autoconfiguration took over and decorated otherwise untouched application logic. The obvious downside of this approach is language support, with support for each site reliability pattern requiring library implementations in every language/framework that the organization uses.

Figure 1-6 shows an optimistic view of the effect of this engineering cycle on derived value. With any luck, at each transition from decentralization to centralization and back, we learn from and fully encapsulate the benefits of the prior cycle. For example, Istio could conceivably fully encapsulate the benefits of the Netflix OSS stack, only for the next decentralization push to unlock potential that was unrealizable in Istio’s implementation. This is already underway in Resilience4j, for example, with discussion about adaptive forms of patterns like bulkheads that are responsive to application-specific indicators.

Figure 1-6. The cyclic nature of software engineering, applied to traffic management

Sizing of sidecars is also tricky, given this lack of domain-specific knowledge. How does a sidecar know that an application process is going to consume 10,000 requests per second, or only 1? Zooming out, how do we size the sidecar control plane up front not knowing how many sidecars will eventually exist?
  • 45.
Sidecars Are Limited to Lowest-Common-Denominator Knowledge
A sidecar proxy will always be weakest where domain-specific knowledge of the application is the key to the next step in resiliency. By definition, being separate from the application, sidecars cannot encode knowledge this specific to the application’s domain without requiring coordination between the application and sidecar. That is likely at least as hard as implementing the sidecar-provided functionality in a language-specific library includable by the application.

I believe testing automation available in open source takes you a certain way. Any investment beyond that is likely to suffer from diminishing returns. I argue against using a service mesh for tracing, as discussed in “Service Mesh Tracing” on page 111, and against using sidecars for traffic management, as in “Implementation in Service Mesh” on page 285, unpopular as these opinions might be. These implementations are lossy compared to what you can achieve via a binary dependency either explicitly included or injected into the runtime, both of which add a far greater degree of functionality that only becomes cost-prohibitive if you have a significant number of distinct languages to support (and even then, I’m not convinced).

Summary

In this chapter we defined platform engineering as at least a placeholder phrase for the functions of reliability engineering that we will discuss through the remainder of this book. The platform engineering team is most effective when it has a customer-oriented focus (where the customer is other developers in the organization) rather than one of control. Test tools, the adoption path for those tools, and any processes you develop against the “guardrails not gates” rule.

Ultimately, designing your platform is in part designing your organization. What do you want to be known for?
  • 47.
CHAPTER 2
Application Metrics

The complexity of distributed systems composed of many communicating microservices means it is especially important to be able to observe the state of the system. The rate of change is high, including new code releases, independent scaling events with changing load, changes to infrastructure (cloud provider changes), and dynamic configuration changes propagating through the system. In this chapter, we will focus on how to measure and alert on the performance of the distributed system and some industry best practices to adopt.

An organization must commit at a minimum to one or more monitoring solutions. There is a wide range of choices including open source, commercial on-premises, and SaaS offerings with a broad spectrum of capabilities. The market is mature enough that an organization of any size and complexity can find a solution that fits its requirements.

The choice of monitoring system is important to preserve the fixed-cost characteristic of metrics data. The StatsD protocol, for example, requires an emission to a StatsD agent from an application on a per-event basis. Even if this agent is running as a sidecar process on the same host, the application still suffers the allocation cost of creating the payload on a per-event basis, so this protocol breaks at least this advantage of metrics telemetry. This isn’t always (or even commonly) catastrophic, but be aware of this cost.
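To make the per-event cost concrete, here is a minimal sketch contrasting the two emission styles. It is not a StatsD client: the class and method names are hypothetical, nothing is sent over the network, and the only detail borrowed from the protocol is the counter line format `<name>:<value>|c`. The point is only that per-event emission allocates one payload per event, while in-process aggregation allocates one payload per publishing interval regardless of event volume.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration; not part of any metrics library.
class EmissionCostSketch {

    /** Per-event style: every event allocates its own payload string. */
    public static List<String> emitPerEvent(String name, int events) {
        List<String> payloads = new ArrayList<>();
        for (int i = 0; i < events; i++) {
            payloads.add(name + ":1|c"); // one StatsD-style counter line per event
        }
        return payloads;
    }

    /** Aggregated style: events only bump a counter; one payload per interval. */
    public static List<String> emitAggregated(String name, int events) {
        LongAdder counter = new LongAdder();
        for (int i = 0; i < events; i++) {
            counter.increment(); // no allocation on the hot path
        }
        List<String> payloads = new ArrayList<>();
        // Simulates the end of a single publishing interval.
        payloads.add(name + ":" + counter.sumThenReset() + "|c");
        return payloads;
    }

    public static void main(String[] args) {
        System.out.println("per-event payloads:  " + emitPerEvent("requests", 1000).size());
        System.out.println("aggregated payloads: " + emitAggregated("requests", 1000).size());
    }
}
```

With 1,000 events the first method builds 1,000 payloads and the second builds one, which is the fixed-cost property the paragraph above describes.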
  • 48.
Black Box Versus White Box Monitoring

Approaches to metrics collection can be categorized according to what the method is able to observe:

Black box
The collector can observe inputs and outputs (e.g., HTTP requests into a system and responses out of it), but the mechanism of the operation is not known to the collector. Black box collectors somehow intercept or wrap the observed process to measure it.

White box
The collector can observe inputs and outputs and also the internal mechanisms of the operation. White box collectors do this in application code.

Many monitoring system vendors provide agents that can be attached to application processes and that provide black box monitoring. Sometimes these agent collectors reach so deep into well-known application frameworks that they start to resemble white box collectors in some ways. Still, black box monitoring in whatever form is limited to what the writer of the agent can generalize about all applications that might apply the agent. For example, an agent might be able to intercept and time Spring Boot’s mechanism for database transactions. An agent will never be able to reason that a java.util.Map field in some class represents a form of near-cache and instrument it as such.

Service-mesh-based instrumentation is also black box and is generally less capable than an agent. While agents can observe and decorate individual method invocations, a service mesh’s finest-grained observation is at the RPC level.

On the other side, white box collection sounds like a lot of work. Some useful metrics are truly generalizable across applications (e.g., HTTP request timings, CPU utilization) and are well instrumented by black box approaches. A white box instrumentation library with some of these generalizations encapsulated, when paired with an application autoconfiguration mechanism, resembles a black box approach.
Autoconfigured white box instrumentation requires the same level of developer effort as black box instrumentation: specifically none!

Good white box metrics collectors should capture everything that a black box collector does but also support capturing more internal details that black box collectors by definition cannot. The difference between the two for your engineering practices is minimal. For a black box agent, you must alter your delivery practice to package and configure the agent (or couple yourself to a runtime platform integration that does this for you). For autoconfigured white box metrics collection that captures the same set of detail, you must include a binary dependency at build time.
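The near-cache case mentioned earlier is a useful illustration of what only white box instrumentation can capture. The sketch below is an assumption-laden example rather than anything from a real library: the class name is hypothetical, and plain `LongAdder` counters stand in for the counters a metrics library would provide. The application author knows the map is a cache, so hits and misses can be counted at exactly the point where that knowledge lives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Function;

// Hypothetical near-cache; LongAdder counters stand in for real meters.
class InstrumentedNearCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();
    private final Function<K, V> loader; // assumed to return non-null values

    InstrumentedNearCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            hits.increment();   // only code that knows this map is a cache can count this
            return cached;
        }
        misses.increment();
        V loaded = loader.apply(key);
        cache.put(key, loaded);
        return loaded;
    }

    long hits() { return hits.sum(); }

    long misses() { return misses.sum(); }

    public static void main(String[] args) {
        InstrumentedNearCache<String, Integer> lengths =
                new InstrumentedNearCache<>(String::length);
        lengths.get("a");   // miss
        lengths.get("a");   // hit
        lengths.get("bb");  // miss
        System.out.println("hits=" + lengths.hits() + " misses=" + lengths.misses());
    }
}
```

An agent looking at the same class from the outside would see only a `java.util.Map` field and a method call, with no way to tell a cache hit from a miss.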
  • 49.
Vendor-specific instrumentation libraries don’t tend to have this black box feel with a white box approach because framework and library authors aren’t inclined to add a wide range of proprietary instrumentation clients even as optional dependencies and instrument their code N different times. A vendor-neutral instrumentation facade like Micrometer has the advantage of the “write once, publish anywhere” experience for framework and library authors.

Black box and white box collectors can of course be complementary, even when there is some overlap between them. There is no across-the-board requirement to choose one over the other.

Dimensional Metrics

Most modern monitoring systems employ a dimensional naming scheme that consists of a metric name and a series of key-value tags.

While the storage mechanism varies substantially from one monitoring system to another, in general every unique combination of name and tags is represented as a distinct entry or row in storage. The total storage cost of a metric is then bounded by the product of the cardinalities of its tags (the number of distinct values observed for each tag key).

For example, an application-wide counter metric named http.server.requests that contains a tag for an HTTP method of which only GET and POST are ever observed, an HTTP status code where the service returns one of three status codes, and a URI of which there are two in the application results in up to 2 * 3 * 2 = 12 distinct time series sent to and stored in the monitoring system. This metric could be represented in storage roughly as in Table 2-1. Coordination between tags, like the fact that only endpoint /a1 will ever have a GET method and only /a2 will ever have a POST method, can limit the total number of unique time series below the theoretical maximum, to only six rows in this example.
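The arithmetic in this example can be sketched in a few lines of plain Java. The class and method names below are hypothetical and exist only for illustration: one method computes the theoretical maximum as the product of per-tag cardinalities, and the other counts the distinct tag combinations actually observed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for reasoning about time series counts.
class TagCardinalitySketch {

    /** Upper bound on time series: product of the distinct value count of each tag. */
    public static long theoreticalMax(Map<String, List<String>> tagValues) {
        long product = 1;
        for (List<String> values : tagValues.values()) {
            product *= values.size();
        }
        return product;
    }

    /** Time series that actually exist: distinct tag combinations seen. */
    public static int observedSeries(List<String> tagCombinations) {
        return new HashSet<>(tagCombinations).size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> tags = new LinkedHashMap<>();
        tags.put("method", List.of("GET", "POST"));
        tags.put("status", List.of("200", "400", "500"));
        tags.put("uri", List.of("/a1", "/a2"));
        System.out.println("theoretical max: " + theoreticalMax(tags)); // 2 * 3 * 2

        // /a1 is only ever called with GET, /a2 only with POST.
        List<String> observed = new ArrayList<>();
        for (String status : tags.get("status")) {
            observed.add("method=GET,status=" + status + ",uri=/a1");
            observed.add("method=POST,status=" + status + ",uri=/a2");
        }
        System.out.println("observed series: " + observedSeries(observed));
    }
}
```

Because method and URI are correlated, the observed count is half the theoretical maximum, matching the six rows described above.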
In many dimensional time series databases, for each row representing a unique set of name and tags, there will be a value ring buffer that holds the samples for this metric over a defined period of time. When the system contains a bounded ring buffer like this, the total cost of your metrics is fixed to the product of the number of unique metric name/tag combinations and the size of the ring buffer.
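A bounded ring buffer of this kind might look like the following sketch. It assumes nothing about how any particular time series database is implemented; the class name, the `long` sample type, and the `fixedCost` helper are all hypothetical. It shows only why the cost is fixed: once the buffer is full, each new sample overwrites the oldest one, so storage never grows beyond series count times buffer size.

```java
import java.util.Arrays;

// Hypothetical fixed-size sample buffer for a single time series.
class SampleRingBuffer {
    private final long[] samples;
    private int next = 0; // index the next sample will be written to
    private int size = 0; // number of valid samples, at most samples.length

    SampleRingBuffer(int capacity) {
        this.samples = new long[capacity];
    }

    /** Stores a sample, overwriting the oldest one when the buffer is full. */
    void add(long sample) {
        samples[next] = sample;
        next = (next + 1) % samples.length;
        if (size < samples.length) {
            size++;
        }
    }

    /** Returns the retained samples, oldest first. */
    long[] snapshot() {
        long[] out = new long[size];
        int start = (size < samples.length) ? 0 : next;
        for (int i = 0; i < size; i++) {
            out[i] = samples[(start + i) % samples.length];
        }
        return out;
    }

    /** Total retained samples across all series: the fixed cost. */
    static long fixedCost(int seriesCount, int bufferSize) {
        return (long) seriesCount * bufferSize;
    }

    public static void main(String[] args) {
        SampleRingBuffer buffer = new SampleRingBuffer(4);
        for (long sample : new long[] {10, 11, 10, 10, 12}) {
            buffer.add(sample);
        }
        System.out.println(Arrays.toString(buffer.snapshot()));
        System.out.println("fixed cost for 6 series: " + fixedCost(6, 4));
    }
}
```

After the fifth sample the oldest value has been dropped, and six series with four slots each retain 24 samples no matter how long the application runs.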
  • 50.
Table 2-1. The storage of a dimensional metric

Metric name and tags                                     Values
http.server.requests{method=GET,status=200,uri=/a1}      [10,11,10,10]
http.server.requests{method=GET,status=400,uri=/a1}      [1,0,0,0]
http.server.requests{method=GET,status=500,uri=/a1}      [0,0,0,4]
http.server.requests{method=POST,status=200,uri=/a2}     [10,11,10,10]
http.server.requests{method=POST,status=400,uri=/a2}     [0,0,0,1]
http.server.requests{method=POST,status=500,uri=/a2}     [1,1,1,1]

In some cases, metrics are periodically moved to long-term storage. At this point, there is an opportunity to squash or drop tags to reduce storage cost at the expense of some dimensional granularity.

Hierarchical Metrics

Before dimensional metrics systems became popular, many monitoring systems employed a hierarchical scheme. In these systems, metrics were defined only by name, with no key-value tag pairs. Tags are so useful that a convention emerged to append tag-like data to metric names with something like dot separators. So a dimensional metric like httpServerRequests, which has a method tag of GET in a dimensional system, might be represented as httpServerRequests.method.GET in a hierarchical system. Out of this arose query features like wildcard operators to allow simple aggregation across “tags,” as in Table 2-2.

Table 2-2. Aggregation of hierarchical metrics with wildcards

Metric query                       Value
httpServerRequests.method.GET      10
httpServerRequests.method.POST     20
httpServerRequests.method.*        30

Still, tags are not a first-class citizen in hierarchical systems, and wildcarding like this breaks down. In particular, when an organization decides that a metric like httpServerRequests that is common to many applications across the stack should receive a new tag, it has the potential to break existing queries. In Table 2-3, the true number of requests independent of method is 40, but since some application in the stack has introduced a new status tag in the metric name, it is no longer included in the aggregation. Even assuming we can agree as a whole organization to standardize on this new tag, our wildcarding queries (and therefore any dashboards or alerts built off of them) misrepresent the state of the system from the time the tag is introduced in the first application until it is fully propagated through the codebase and redeployed everywhere.

  • 51.
Table 2-3. Failures of aggregation of hierarchical metrics with wildcards

Metric query                                 Value
httpServerRequests.method.GET                10
httpServerRequests.method.POST               20
httpServerRequests.status.200.method.GET     10
httpServerRequests.method.*                  30 (!!)

Effectively, the hierarchical approach has forced an ordering of tags when they are really independent key-value pairs.

If you are starting with real-time application monitoring now, you should be using a dimensional monitoring system. This means you will also have to use a dimensional metrics instrumentation library in order to record metrics in a way that fully takes advantage of the name/tag combination that makes these systems so powerful. If you already have some instrumentation using a hierarchical collector, the most popular being Dropwizard Metrics, you are going to have to ultimately rewrite this instrumentation. It’s possible to flatten dimensional metrics into hierarchical metrics by developing a naming convention that in some way iterates over all the tags and combines them with the metric name. Going the other direction is difficult to generalize, because the lack of consistency in naming schemes makes it difficult to split a hierarchical name into dimensional metrics.

From this point on, we’ll be examining dimensional metrics instrumentation alone.

Micrometer Meter Registries

The remainder of this chapter will use Micrometer, a dimensional metrics instrumentation library for Java that supports many of the most popular monitoring systems on the market.
There are only two main alternatives to Micrometer available:

Monitoring system vendors often provide Java API clients
While these work for white box instrumentation at the application level, there is little to no chance that the remainder of the Java ecosystem, especially of third-party open source libraries, will adopt a particular vendor’s instrumentation client for its metrics collection. Probably the closest we have come to this is some spotty adoption in open source libraries of the Prometheus client.
  • 52.
OpenTelemetry
OpenTelemetry is a hybrid metrics and tracing library. At the time of this writing, OpenTelemetry does not have a 1.0 release, and its focus has certainly been more on tracing than metrics, so metrics support is much more basic.

While there is some variation in capabilities from one dimensional metrics instrumentation library to another, most of the key concepts described apply to each of them, or at least you should develop an idea of how alternatives should be expected to mature.

In Micrometer, a Meter is the interface for collecting a set of measurements (which we individually call metrics) about your application.

Meters are created from and held in a MeterRegistry. Each supported monitoring system has an implementation of MeterRegistry. How a registry is created varies for each implementation.

Each MeterRegistry implementation that is supported by the Micrometer project has a library published to Maven Central and JCenter (e.g., io.micrometer:micrometer-registry-prometheus, io.micrometer:micrometer-registry-atlas):

    MeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);

MeterRegistry implementations with more options contain a fluent builder as well, for example the InfluxDB registry shown in Example 2-1.

Example 2-1. Influx fluent builder

    MeterRegistry registry = InfluxMeterRegistry.builder(InfluxConfig.DEFAULT)
        .httpClient(myCustomizedHttpClient)
        .build();

Metrics can be published to multiple monitoring systems simultaneously with CompositeMeterRegistry. In Example 2-2, a composite registry is created that ships metrics to both Prometheus and Atlas. Meters should be created with the composite.

Example 2-2. Composite meter registry that ships to Prometheus and Atlas

    MeterRegistry prometheusMeterRegistry = new PrometheusMeterRegistry(
        PrometheusConfig.DEFAULT);
    MeterRegistry atlasMeterRegistry = new AtlasMeterRegistry(AtlasConfig.DEFAULT);

    // Declared as CompositeMeterRegistry because add(..) is defined on the
    // composite, not on the MeterRegistry base type
    CompositeMeterRegistry registry = new CompositeMeterRegistry();
    registry.add(prometheusMeterRegistry);
    registry.add(atlasMeterRegistry);

    // Create meters like counters against the composite,
    // not the individual registries that make up the composite
    registry.counter("my.counter");

  • 53.
Micrometer ships with a global static CompositeMeterRegistry that can be used in a similar way that we use an SLF4J LoggerFactory. The purpose of this static registry is to allow instrumentation in components that cannot expose Micrometer in their API, as they would by offering a way to dependency-inject a MeterRegistry. Example 2-3 shows the similarity between the use of the global static registry and what we are used to from logging libraries like SLF4J.

Example 2-3. Using the static global registry

    import io.micrometer.core.instrument.Metrics;
    import io.micrometer.core.instrument.Timer;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    class MyComponent {
      // Registered against the global static composite, much like a static logger
      Timer timer = Timer.builder("time.something")
        .description("time some operation")
        .register(Metrics.globalRegistry);

      Logger logger = LoggerFactory.getLogger(MyComponent.class);

      public void something() {
        timer.record(() -> {
          // Do something
          logger.info("I did something");
        });
      }
    }

By adding any MeterRegistry implementations that you wire in your application to the global static registry, any low-level libraries using the global registry like this wind up registering metrics to your implementations. Composite registries can be added to other composite registries. In Figure 2-1, we’ve created a composite registry in our application that publishes metrics to both Prometheus and Stackdriver (i.e., we’ve called CompositeMeterRegistry#add(MeterRegistry) for both the Prometheus and Stackdriver registries). Then we’ve added that composite to the global static composite. The composite registry you created can be dependency-injected by something like Spring, CDI, or Guice throughout your application for your components to register metrics against. But other libraries are often outside of this dependency-injection context, and since they don’t want Micrometer to leak through their API signatures, they register with the static global registry.
In the end, metrics registration flows down this hierarchy of registries. So library metrics flow down from the global composite to your application composite to the individual registries. Application metrics flow down from the application composite to the individual Prometheus and Stackdriver registries.
  • 54.
Figure 2-1. Relationship between global static registry and your application’s registries

Spring Boot Autoconfiguration of MeterRegistry
Spring Boot autoconfigures a composite registry and adds a registry for each supported implementation that it finds on the classpath. A dependency on micrometer-registry-{system} in your runtime classpath along with any required configuration for that system causes Spring Boot to configure the registry. Spring Boot also adds any MeterRegistry found as a @Bean to the global static composite. In this way, any libraries that you add to your application that provide Micrometer instrumentation automatically ship their metrics to your monitoring system! This is how the black-box-like experience is achieved through white box instrumentation. As the developer, you don’t need to explicitly register these metrics; just their presence in your application makes it work.

Creating Meters

Micrometer provides two styles to register metrics for each supported Meter type, depending on how many options you need. The fluent builder, as shown in Example 2-4, provides the most options. Generally, core libraries should use the fluent builder because the extra verbosity required to provide robust description and base unit detail adds value to all of their users. In instrumentation for a particular microservice with a small set of engineers, opting for more compact code and less detail is fine. Some monitoring systems support attaching description text and base units to metrics, and for those, Micrometer will publish this data. Furthermore, some monitoring systems will use base unit information on a metric to automatically scale and label the y-axis of charts in a way that is human readable. So if you publish a
  • 60. for the night on the plantation of Mr. Hardeman, on the line of the Mississippi Central Railroad. Early the next morning, while the regiments were resting, the order was given for the Brigade to go to the front, taking position on a ridge of land upon which stood the State Lunatic Asylum, about five miles from Jackson. On the previous day, the enemy had occupied this place, but were driven from it by the First Division under General Welch. The Confederates on the 11th held another line of works a little nearer the city of Jackson, but within easy range of this ridge; the place was thickly wooded, and the Brigade lay concealed among the trees during the day, the Twenty-ninth supporting Captain Edward’s Rhode Island Battery, which did but little firing, however. When it grew dark, shovels were called into requisition, and every man in the Brigade was set to work throwing up entrenchments, laboring till daylight the next morning; but our men were not to be allowed to enjoy the fruits of their night’s labor, for in the early morning, they were ordered out of the works, up to the extreme front, in support of our skirmish line. Fortunately they were not obliged to endure the scorching rays of the sun, but found shelter in a piece of woods; it was only a shelter from the sun, however, for the enemy, knowing our position, poured into the woods a continuous fire of shell, canister, and spherical case during the whole of the two days that the regiment was here. The other regiments in the Brigade suffered more or less loss, but the Twenty-ninth escaped without a single casualty. In addition to the storm of larger missiles, many of the musket-balls fired from the enemy’s lines found their way into the woods, and so severe was the fire, that nearly every tree along our line bore the marks of the leaden tempest. 
Many of our comrades had narrow escapes from death and wounds, one soldier in Company K especially, a ball passing through his tin dipper, upon which he was resting his head.
  • 61. On the morning of the 11th, the Brigade was relieved and ordered to the rear, resuming its former position near the lunatic asylum; but in the afternoon of the same day it was again ordered forward, and again supported Captain Edward’s battery. Here it remained till the morning of the 16th, when an advance of the whole line was made, the Twenty-ninth passing up under a heavy fire to within forty rods of the enemy’s works, bristling with cannon, the right of the regiment going into the rifle-pits. Once in the pits, there was no such thing as leaving them while it was daylight, and here the “boys” spent the day, constantly engaged with the enemy’s sharpshooters. Though considerably exposed, there was but one casualty during the day, Private John Scully of Company A being instantly killed, the ball penetrating his brain. The regiment in this position held the extreme left of the picket line of our army, its right resting in the rifle-pits, and its left in dense woods, retired so as to form nearly a half-circle. The night of the 16th was dark, and hence favorable for secret movements by both besiegers and besieged. About nine o’clock, unusual noises were heard within the enemy’s lines, resembling the rattling of wheels. Colonel Barnes became anxious to learn the cause of these noises, and Captain Clarke was requested to use every effort to ascertain what, if any, movement was going on in the enemy’s camp. That officer had no difficulty in carrying out his instructions, for one of his men, a fearless soldier, named David Scully, unhesitatingly consented to undertake the perilous task of approaching the hostile picket line. The ground descended quite rapidly from Clarke’s line towards that of the Confederates. Scully was left to execute his adventure in his own way. 
Prostrating himself upon the ground, he rolled slowly down the hill, till he approached within a few yards of the enemy’s pickets, and then pausing, overheard their conversation, which was to the effect that their army was retreating, and that they were soon to be relieved. Listening here, Scully heard more distinctly than before, the noises in the enemy’s camp. They were evidently removing their guns from the works; and, beside this, the regular tread of marching men was plainly distinguishable. In due time Scully returned, making this
  • 62. report. About this time, a similar report was brought in by Charles Logue of Company F, who went forward into the woods, very near the enemy, exhibiting great courage. In order to verify the statements of Scully and Logue, Colonel Barnes, with one or more of the captains, advanced some distance beyond our picket line, when they soon became convinced that the whole body of the enemy was moving. Thereupon one of the sergeants was despatched to General Ferrero, who was in command of the trenches, with information that the enemy was moving in large numbers, and shortly after a lieutenant was sent, with the message that the enemy was abandoning his works and retiring from the city. The night was intensely dark, and the ground over which these officers were obliged to pass, in delivering their messages, beset with difficulties, being broken, and in some places covered with fallen timber and a thick growth of bushes. But, like faithful soldiers, they persevered till they found General Ferrero, when they delivered their messages. The substance of the reply that was sent back was, “The movements of the enemy are well understood at headquarters. The enemy are not retiring.” The rumbling of the enemy’s trains and the neighing of their horses continued; and the Colonel and his comrades stood at their posts all night, listening to these sounds, which grew fainter and more distant every hour, as the Confederates were slipping out of the grasp of General Sherman, and retiring beyond the Pearl River. When the night was almost gone, a message was received from General Ferrero, that the regiment might move forward in the gray of the morning, if Colonel Barnes thought it advisable. When the morning came, a flag of truce was seen waving from the enemy’s works, and at the same time the city appeared to be in flames. During the night, General Johnston retired with his whole army, artillery, and baggage, and even the large guns upon his works. 
As soon as it was fairly day, the whole line was ordered forward, and the regiment entered the city. The works were found to be deserted, and the railroad depot and several public buildings in
  • 63. flames; but the fire was quickly extinguished by our troops, and thus a large portion of the city was doubtless saved from destruction. After the regiment had finished its part of the generous work of subduing the flames, the men were dismissed for a couple of hours, during which time they contrived to “do” Jackson quite thoroughly. The gardens were filled with melons and fruits, but of other and more desirable food there was a small supply. Everything of much value had been removed, and many of the deluded inhabitants had followed in the steps of the retreating army, taking with them their personal effects, thus giving the place the appearance of a deserted town. The negroes had the good sense to stay, and, as was invariably the case, they were overjoyed at the appearance of the Union soldiers, testifying to their happiness in the way peculiar to their race. In the afternoon of the 17th, the regiment had orders to leave the city, marching back to the ground occupied on the 14th. Here it remained, enjoying much-needed rest, till Monday the 20th. Another severe march was before them, a march needlessly hard; and at an unreasonable hour in the morning of the 20th, the reveille aroused the men from their slumbers. Before the movement began, an order was issued from headquarters, detailing Colonel Barnes Provost Marshal of the corps, and the whole of the regiment as provost guard, with orders to move in the rear of the corps, and to keep everything—men, horses, and wagons—in front. This was the hardest duty the regiment ever performed in the same number of days. For some reason, the march was a forced one; the weather was of the same tropical character that it had been during the three weeks previous, and water not only scarce, but of poor quality. 
The story among the men was, that the corps was racing with another, the Sixteenth (?); but the more probable statement is, that the corps reaching Vicksburg first would take the transports to go North, there being only a sufficient number of steamers for the transportation of a single corps. The imperative orders given to Colonel Barnes to prevent straggling, required
  • 64. constant watchfulness and almost superhuman efforts, not only on his part, but on the part of his brother officers and the men. Many soldiers gave out, from the combined effects of over-exertion and the enervating influence of the weather. On the second day out, matters in this respect became so bad, that it became necessary to impress into the service, ox-carts, horses, and vehicles of all descriptions which could be found about the country, and use them for the conveyance of the invalids, many of whom had received fatal sunstrokes. The spectacle which the corps presented on the road was wholly unbecoming a victorious army: nearly every regiment had lost even the semblance of an organized body; everybody was straggling along the roads, some riding in carts, and others mounted upon horses and mules, while miles in the rear of this mob was the gallant old Twenty-ninth Regiment, driving the crowd before them. Violent menaces, and sometimes absolute force, were required to keep the stragglers in motion. For want of ambulances, nearly all the wounded in the battles and skirmishes before Jackson were carried the whole distance from the latter city to Vicksburg on litters or stretchers by details of men. To protect these unfortunate soldiers from the sun, hoods made of pieces of tent cloth were placed about their heads, and green boughs arranged at the sides of the litters. A large number of disabled horses and mules were left about the country, in the track of Johnston’s retreat, and these were systematically gathered up by General Sherman, when he returned from Jackson, and driven along to the various landings in the vicinity of Vicksburg and Milldale, where, together with the horses and other animals captured by the soldiers on the march, they were delivered up to the quartermasters. Nearly every company of the Twenty-ninth had a large number of saddle and pack animals, which they had ridden and used for the conveyance of their baggage during the march. 
Company A had some twenty horses and mules, and Company G nearly as many, when they returned to Milldale, having, as they swept along the stragglers of the column, as the extreme
  • 65. rear guard, collected these animals, as well as the jaded and tired-out men, and their work was much lightened by these mounts. As the rear guard approached the Big Black, the soldiers on foot were sent forward into camp, and then about thirty or forty mounted men came in together, most of the latter being men who had fallen out or got foot-sore, and had been picked up and mounted to keep them along with the army. When one of these motley crowds came in, the commander of the regiment, who was somewhat indignant at the appearance of the thing, hailed the captain in command, “I should like to know, sir, what this means; what sort of a command is this for an infantry officer?” “Irregular mounted infantry, I should think,” replied the leader, as he looked at his crew. It was on this march that Captain Richardson’s man, nicknamed “French Joe,” came to the conclusion that his captain’s mess kit might just as well be carried by a mule as by Joseph, and, in fact, that the mule might carry “Joe” too, and took one of the mules for this purpose. He had only his belt and some old scraps of rope for a tackling; but this he thought might serve well enough. He contrived a pad out of his own and the Captain’s blankets, and, warned by the example of John Gilpin, he attempted to balance his load and to tie it securely to the sides of the mule, which were well festooned with pots, pans, gridirons, camp kettles, and tin dippers, giving the animal the appearance of the “hawker’s” donkey. After all this varied assortment of wares had been piled upon the animal, Joe kindly allowed a knapsack or two to be strapped on behind, and then mounted, guiding the mule with a rope halter. He had not proceeded far before some of the knots began to slip, for Joe was not a sailor, nor was he a very skilful disposer of weights. Very soon one of the knapsack straps got loose and insinuated itself on the inside of the mule’s hind leg. It tickled him—he kicked. 
This displaced a camp kettle, which slipped under his belly—he “buck-jumped,” and unseated Joe. Then all the load shifted, the most of it getting under the beast’s belly. He curveted and pranced, he reared and kicked,
  • 66. and cleared the road right and left for more than a mile. The men scattered on every side, for the mule was in earnest, and was no respecter of persons, kicking just as viciously at the officers as at the men. Captain Richardson had no dinner that day, save what he got through the kindness of others; for his coffee, hard bread, and bacon, tin plates and cups, flour, butter, and roasting corn—all the materials of many a savory feast—lay in the dust. On the 22d, the Ninth Corps reached the Big Black River. General Parke and his division commanders now deemed it impossible, as it certainly was disgraceful, for the corps to continue to march in this manner. The different regiments were here, on the banks of the river, gathered together, and forced to resume their organization. One whole day was spent in this work, during which the men were permitted to rest. Toward evening of the 22d, the corps moved out of camp, and marching slowly, crossed the Big Black on a pontoon bridge, in the midst of a pouring rain; the troops camped near the river for the night, and the next morning started for Milldale. The regiment was the last to arrive, in consequence of its peculiar duty, and by being the last, lost the first chance to go on board the transports, and was thus forced to remain here till the 12th of August. During the campaign now closed, the roll of the regiment’s dead had been somewhat increased; and this, with a few exceptions, had been occasioned by disease contracted in the sickly regions of the Yazoo and Vicksburg. Private John Scully of Company A, a faithful soldier, was the first to fall in the campaign, having been killed by a bullet while bravely doing his duty in the rifle-pits before Jackson, July 16. Second Lieutenant Horace A. Jenks of Company E came next, dying of malarial fever, July 26. Lieutenant Jenks had at one time been a sergeant in his company, and was promoted to be second lieutenant for his good soldierly qualities. 
His death was mourned by all the members of the regiment. First Lieutenant Ezra Ripley of Company B, who died of fever at Helena, Ark., July 28, was
  • 67. a member of the Middlesex Bar before entering the service. He was a gentleman of liberal culture and rarest qualities of both heart and mind. No sacrifice for his country was too great in his estimation, and though not of a robust constitution, yet he never shrank from any exposure or hardship. He performed the terrible march to Jackson, but the seeds of disease sown during those days, already described, soon ripened into death. Private Lyford Gilman of Company B also died of disease at Vicksburg, August 2. He was also a victim of the exhaustive march. When the Ninth Corps was about to leave Vicksburg, General Grant, desirous of recognizing its services in the late campaign, issued the following order:— “Headquarters Department of the Tennessee, “Vicksburg, Miss., July 31, 1863. [EXTRACT.] “Special Orders, No. 207. “In returning the Ninth Corps to its former command, it is with pleasure that the general commanding acknowledges its valuable services in the campaign just closed. “Arriving at Vicksburg opportunely, taking position to hold at bay Johnston’s army, then threatening the forces investing the city, it was ready and eager to assume the aggressive at any moment. “After the fall of Vicksburg, it formed a part of the army which drove Johnston from his position near the Big Black River, into his entrenchments at Jackson, and after a siege of eight days, compelled him to fly in disorder from the Mississippi Valley. “The endurance, valor, and general good conduct of the Ninth Corps are admired by all; and its valuable co-operation in
  • 68. achieving the final triumph of the campaign is gratefully acknowledged by the Army of the Tennessee. “Major-General Parke will cause the different regiments and batteries of his command to inscribe upon their banners and guidons, ‘Vicksburg’ and ‘Jackson.’ “By order of “Major-General U. S. Grant. “P. S. Bowen, A. A. A. G.” The time spent at Milldale, after the return from Jackson, was occupied by the ordinary duties of camp life. The weather continued very warm, and the destructive effects of the campaign now became manifest. Deaths were very frequent among the troops here during this time, burial parties were almost constantly engaged, and the funeral notes of the fife and drum could be heard nearly every hour in the day. None save the strongest came out of that campaign in sound health. On the 12th of August, the regiment embarked on the steamer “Catahoula,” one of the slowest boats on the river, to go North; the steamer left Milldale without a sufficient supply of fuel, and accordingly frequent stoppages on the route, to gather wood, became necessary. The trip to Cairo, including one day spent at Memphis, occupied eight days, the boat reaching its destination on the 20th. At midnight on the 20th, the regiment took the cars for Cincinnati, reaching that city on the afternoon of Sunday the 23d, and receiving the same kind treatment as on its two former visits. At night, the regiment left the city, crossed the Ohio to Covington, Ky., and went into camp on the outskirts of the town, and remained here till the 27th. At this time, probably nearly half of all the members of the regiment were on the sick-list, and unable to do
  • 69. duty. In the course of a few days they had come from the tropical climate of the South into the cool bracing air of the West, and now the chills and fever broke out among them to an alarming extent. While here, Colonel Barnes left the regiment on a furlough to his home in Massachusetts; he was very sick from the effects of a malarial fever and overwork; from the eighteenth day of May, 1861, till he was seized with this sickness, he had never been off duty, for any cause, a day,—a fact that is not only remarkable, but, considering the great hardships to which he had been subjected, one that shows him to have been possessed of an iron constitution. The author, in the preparation of this work, has endeavored, as far as possible, to avoid the diary form of narrative, because he is aware that such does not interest the general reader; but the record of the regiment would be incomplete if it did not give somewhat in detail the events of long and memorable marches, and the various localities visited by it. The march from Covington, Ky., into East Tennessee, which we are about to describe, was one of the longest which the regiment ever performed, and, for the reasons stated, we shall give a very particular account of it. On the 27th, it broke camp, under the command of Major Chipman, went to the railroad station in Covington, took the cars for Nicholasville, arrived there at seven o’clock the next morning, and camped near the depot. On the 29th, Colonel Pierce, who had for several months been absent on special duty in Massachusetts, joined the regiment and assumed command, and on the same day a march on the Lancaster pike of about four miles was performed. August 31. The regiment was mustered for pay; Colonel Pierce ordered to the command of the Brigade; the Second Michigan Infantry joined the Brigade, and Major Chipman again took command of the regiment.
  • 70. September 1. Reveille at four o’clock, A. M. Started for Crab Orchard, in Lincoln County; spent the night for the third time at Camp Dick Robinson. September 2. Reveille at an early hour; marched all day; camped near Lancaster. September 3. Another early start. Reached Crab Orchard, a place of five hundred inhabitants, and abounding with mineral springs. Here and at Nicholasville convalescent camps were established, and during the time which the regiment remained at these places, a very large number of its members went into the hospitals, where not a few of them subsequently died. September 10. The Brigade left Crab Orchard, and had a hard march of about fourteen miles, and went into camp at a place called Mount Vernon. The road for a considerable portion of the way was very rough and mountainous, being so steep in some places that the horsemen were obliged to dismount and lead their animals. The men were in light marching order, having left the most of their extra clothing at Crab Orchard, and had eight days’ rations served out to them, being thus prepared for a long march. September 11. The reveille sounded at half-past three o’clock in the morning, and at half-past four the column was in motion. At night, after a very fatiguing march, the camp was formed near Wild Cat Mountain, Kentucky. September 12. The men were routed out early in the morning, and the day’s march began at five o’clock, but the road was good all day. The weather, which had been fine ever since the march began, became stormy at the end of this day, and at night it rained hard. The camp was formed at London, Laurel County, Ky. On this march the regiment passed over the battle-field of Mill Spring, where the notorious Zollicoffer was killed.
  • 71. September 13 was Sunday. The men were paid off and allowed to rest all day. Since this famous march began, the Brigade had passed through and into three counties; namely, Gerrard, Rock Castle, and Laurel. The country through which they had travelled was thinly populated, and with the exception of a few wild fruits and nuts which they found on the journey, the men were obliged to subsist upon their rations. It has been stated, that the wild fruits which the men ate on this march proved very beneficial to their health, and resulted in curing them of the complaints they had contracted in the sickly swamps of the Yazoo. September 14. The march was resumed at five o’clock in the morning, and at night a halt was made at Laurel Spring. September 15. Only a part of the day was occupied by marching, a halt being made at the town of Barboursville, in Knox County, Ky. September 16. Marched from Barboursville to Flat Lick; a long march, pausing till the 19th. September 19. A distance of about ten miles was travelled this day; the camp was formed at Log Mountain. The column was nearing the far-famed Cumberland Gap, and the roads were growing rougher and more broken at every advance in that direction. The night was very cold, water froze, and the crops of tobacco, sugar-cane, and cotton in that region nearly all destroyed. When the sun rose the next morning, it revealed the earth white with frost. September 20. At ten o’clock in the morning, the Brigade reached Cumberland Gap, and entered the State of Tennessee. After passing into this gap, which was defended by a small force of infantry and cavalry, the road became more and more elevated, till at last it reached the summits of the mountains. The view from these heights well paid the men for all their toil in climbing their rugged and broken sides. In the far distance, ridge after ridge seemed to rise up toward the heavens, the highest actually invading the clouds, which,
  • 72. with a beautiful curtain of blue, hid from sight the lofty peaks. The night was spent in the mountains near the gap. September 21. Sycamore, Tenn. Camped for the night. An inquiry having been made at one of the mountain huts, regarding the distance between this place and Tazewell, the answer was, “Two rises to go up and two rises to go down and a right smart plain.” September 22. Morristown, Tenn. Here the Brigade remained till the 24th. September 24. Marched to New Market. September 25. Marched to Holston River and forded it. September 26. Entered the city of Knoxville. The distance marched between the first of September and 26th was something over two hundred miles. The march over the mountains has furnished the theme of many interesting conversations among the men who performed it. The hardships of the road were manifold and serious. It was enough to be compelled to climb day after day the rugged and precipitous path along the side of these mountains; it was enough, indeed, to bivouac on their cold and barren summits, with only a single woollen blanket to protect the foot-sore soldier from the searching and chilling night-air; but when we add to these discomforts, that of intense and unsatisfied hunger, which was actually endured during the entire march, the measure of the sufferings of our comrades seems full to overflowing. They endured these sufferings and hardships, however, for a good purpose. Together with the troops which had gone on before them, they had wrought the long-prayed-for deliverance of East Tennessee. On the 3d of this month, General Burnside, together with the Twenty-third Corps and other troops, had entered the city of Knoxville, the Confederate General Buckner retiring from the place with his army and retreating toward Chattanooga.
  • 73. The people of this region had long suffered from rebel rule, and the barbarities which had been practised upon them have never been fully related to the world. Some had been imprisoned, others tortured, and others murdered. Their property had been mercilessly confiscated, and not a few had been forced to perform military duty in the service of a cause that they loathed and hated. When the army of General Burnside appeared bearing the old flag, and the colors of the cruel foe departed in haste and confusion, the loyal people were overwhelmed with joy. The flag of the Union, which had been carefully hid under carpets, concealed in cellars and between mattresses, to save its owners from persecution, was now brought forth from its hiding-places, and flaunted on every hand; from windows and liberty-poles, it floated to the breeze. A considerable part of General Burnside’s army was composed of loyal Tennesseeans, who had been forced to fly into Kentucky during the continuance of the enemy’s rule. These native troops, among which was the cavalry under Lieutenant-Colonel Brownlow, son of the famous parson, “were kept constantly in advance, and were received with expressions of the profoundest gratitude by the people. There were many thrilling scenes of the meeting of our Tennessee soldiers with their families, from whom they had so long been separated. The East Tennesseeans were so glad to see our soldiers, that they cooked everything they had and gave it to them freely, not asking pay, and apparently not thinking of it. Women stood by the roadside with pails of water, and displayed Union flags. The wonder was, where all the stars and stripes came from. Knoxville was radiant with flags. 
At one point on the road from Kingston to Knoxville seventy women and girls stood by the roadside waving Union flags and shouting, ‘Hurrah for the Union.’ Old ladies rushed out of their houses and wanted to see General Burnside and shake hands with him, and cried, ‘Welcome, General Burnside, to East Tennessee.’”41 These constitute but a small part of all the demonstrations of loyalty by this intensely loyal people, and this brief account of their wrongs
  • 74. but a trifling part of the manifold abuses heaped upon them by a merciless and savage soldiery,—abuses and wrongs of the same barbarous nature as those perpetrated at Andersonville and Belle Isle, forming as they do the saddest chapter in the history of the war. It should be among the proudest boasts of the people of Massachusetts, that in the persons of her soldiers of the Twenty-first, Twenty-ninth, Thirty-fifth, and Thirty-sixth regiments, she helped deliver a people loyal to the old flag from a thraldom such as has been imperfectly depicted in this chapter,—a thraldom worse than death itself.
  • 75. CHAPTER XXIII. Battles of Blue Springs, Hough’s Ferry, and Campbell’s Station—Siege of Knoxville—The Sufferings of the Men—Battle of Fort Sanders—Gallant Conduct of the Regiment—It Captures Two Battle-flags—The Siege Raised—General Sherman Re-enforces Burnside. During the early part of October, a portion of the Ninth Corps under General Potter, and a large body of cavalry under General Shakleford, were sent up the valley some fifty miles in the direction of Morristown, Jefferson County. A force of the enemy had crossed into Eastern Tennessee from Virginia, and were threatening our communications with Cumberland Gap. This movement on the part of the Federals was made for the purpose of clearing the enemy away from the flank of our army. On the 8th of October, the regiment with its brigade was ordered forward from Knoxville to join the rest of the corps, and on the night of the 9th halted at Bull’s Gap, a pass in the mountains near the line between Jefferson and Green counties. The movement of the enemy was a very important one; they had reached and occupied Greenville, and moved out beyond as far as Blue Springs. Foster’s brigade of cavalry and mounted infantry was sent out from Knoxville, up the valley of the French Broad River, to turn the right of the enemy and get upon his rear, which movement was accomplished on the 9th. Foster got himself into position, and on the 10th, General Custer with his mounted infantry came up with the enemy at Blue Springs, and began to skirmish. Ferrero’s division of twelve small regiments, of which the Twenty-ninth was one, arrived about noon, and went into position a half-mile from the field,
  • 76. where they had a good view of the skirmish for nearly half an hour. At the end of this time, two brigades of the division—namely, Humphrey’s and Christ’s—were sent forward. The enemy had a battery well supported on the left of the main road leading to Greenville, on a high hill. They had thrown forward their first line and skirmishers well advanced to a distance of perhaps three-quarters of a mile from their battery, across the road and across a rivulet, and had advanced another body of skirmishers through a corn-field to the crest of a hill about three hundred yards from where the Twenty-ninth was lying. Custer’s men had slowly retired before the Confederates, and passed to our rear, when the order came for our two brigades to charge. The men rose to their feet and went forward at a rapid run, with arms aport and bayonets fixed, up the hill. The enemy, closely followed by our men, fell back rapidly down the hill, across the rivulet, into and through a belt of woods, where the pursuit ended by the direct orders of our generals. Here Colonel Christ re-formed his Brigade, to carry one of the Confederate batteries that had begun to fire shell into our lines. The enemy, seeing the preparations for a charge, wheeled their guns about and fled; and at this stage in the affair, it became so dark that all further hostilities ceased. Captain Leach, then sixty-three years of age, led his company on this charge; and when the rivulet was reached, which was some eight feet wide, sprang into it and scrambled up the opposite bank as actively as the youngest of his men, refusing the proffered assistance of Major Chipman, who was leading the regiment. Captains Leach and Clarke messed together; their negro servants, Bob and Isaac, were left in the rear of the field, where this fight had occurred, with their rations and baggage, and when the battle was over, were sought to prepare supper; but the darkies could not be found,—neither the rations nor baggage. 
Upon investigation, it appeared that a rumor had spread to the rear that both these officers had been killed in the fight. The negroes had of course heard of it, and, considering themselves absolved from all further
  • 77. obligations as servants, had gone back towards Bull’s Gap, taking the effects of the officers with them, where at night they held a sort of barbecue, feasted on the rations, and concluded their entertainment with an auction sale of the baggage. These recreant negroes were found the next morning and subjected to a severe questioning. “Where are our rations?” “Where’s the coffee-pot?” “What has become of our blankets?” Bob acted as spokesman: “De rations and blankets is done gone; de coffee-pot is done gone, too, —dey’s stole.” This ended the examination, and these two unfortunate captains had short rations and hard fare for the rest of the march. The enemy retired during the night, and soon after daylight our army started in pursuit. After marching a mile or two, the infantry halted, and Shakleford’s brigade of mounted men, with several horse batteries, swept by the head of the column, and then the infantry marched again. The most annoying information came from the farmers along the road. They scarcely knew which were our enemy,—the troops that had passed the night before, or the mounted column of Shakleford,—and these were some of the answers they gave in reply to questions of the whereabouts of the Confederates: “They are just ahead”; “Not far from an hour ago, they went by”; “A good gallop off”; and so forth. When our troops reached Greenville, they learned to their surprise that the enemy had passed through there six hours before, and that they had a sharp engagement with General Foster’s men a few miles out at Henderson’s. The tired troops pressed on; at Henderson’s, they saw some signs of a fight, but the bridge was intact. General Foster had refrained from destroying it, and the enemy had neglected to do so. Toward night the regiment went into camp at Rheatown, twenty-one miles from Blue Springs. 
Shakleford and Foster followed the enemy into Virginia, inflicting upon them great injury, and, upon returning, took up the line of the Watauga, to cover the passes from Virginia into East Tennessee. One of the abandoned wagons of the Confederates, found near Rheatown, furnished our regiment with a liberal supply of excellent
  • 78. bread and some other food. At this place our troops had two full days’ rest, and it was much needed, for the men had performed a forced march hither, and in the course of it had an encounter with the enemy. At the close of the second day, the columns were turned towards Bull’s Gap, making the distance by easy marches, and upon arriving there the regiment took the cars, but had proceeded but a short distance when an accident rendered it necessary for them to march six miles to Morristown, at which place they again took the cars and went to Knoxville, reaching there on the 10th of October. While the Confederates held East Tennessee, a merciless conscription had been enforced by them, to avoid which many of the male population had abandoned their homes and taken refuge in the deep forests, or fled into Kentucky. After the country had been occupied by Burnside, many of these loyal people returned to their homes, and signified their willingness to enlist in the Federal army. Burnside issued an order encouraging such enlistments, and especially into the veteran regiments of the Ninth Corps, which had been greatly depleted by their recent campaigns. Shortly after the Twenty-ninth returned to Knoxville, Captain Clarke and Lieutenant Atherton were detailed for this recruiting service, and ordered to station themselves at Rheatown, where they spent several weeks, and secured a number of recruits. On the 11th of November, a force of Confederates again invaded Tennessee from Virginia, and evading the left of our army on the Watauga, attacked with about 3,500 cavalry our post at Rogersville, and captured its small garrison. This, and other hostile movements at various points, rendered necessary the evacuation of Rheatown, and the drawing in of all our forces in that part of the State, nearer Knoxville. Our recruiting party, therefore, returned to the latter place, and went on after their regiment, which, in the meantime, had gone out to Lenoir’s Station. 
A serious invasion of East Tennessee, by General Longstreet, had already begun. That officer, with a large force, had early in
  • 79. November been detached from Bragg’s army, in the vicinity of Chattanooga, and was now marching up the valley towards Knoxville. On the 20th of October, the Ninth Corps left Knoxville and went to Campbell’s Station, fifteen miles southwest of the city, on the East Tennessee and Virginia Railroad; on the 21st, it moved down the railroad to Lenoir’s Station, and remained there, with the exception of a few days, till the 14th of November. On the night of the 10th of November, Longstreet made his appearance on the south side of the Holston River, at Hough’s Ferry, about six miles below Loudon, and where was stationed General White, with one division of the Twenty-third Corps. November the 14th, early in the morning, General Potter, in a hard rain-storm, started with the whole of the Ninth Corps to re-enforce General White. The Twenty-ninth with its brigade (Christ’s) was in advance, and toward noon arrived at a point five miles from the ferry, when rapid and heavy firing was distinctly heard. Now the clouds parted and the storm slackened, but the roads were as heavy and broken as before, making it exceedingly difficult to get the artillery along, and rendering the progress of the troops very slow. It was nearly dark when the Brigade reached the ferry; by this time the battle there had nearly ceased, nothing save an occasional musket-shot indicating the near presence of the enemy. Immediately upon its arrival, the regiment was ordered to the right of the line, marched nearly two miles through a thick woods, and formed in line of battle within one hundred yards of that of the enemy. The night soon came on, and early in the evening the storm broke out again with increased fury; the wind blew with the force of a tornado; the trees swayed to and fro in the blast, threatening to fall upon the heads of the men, who stood to arms all night without fires. 
Very early the next morning (15th), when the men were expecting to march against the enemy, the order came to fall back, and taking the same track by which it had entered the gloomy forest, the Brigade picked its way back to the place where it had first halted the night before. All along the way brightly-burning camp-fires were passed, but no troops were seen; these had already left, and were
  • 80. well under way towards Lenoir’s. At noon the regiment reached the latter place. The men had tasted no food for several hours, and were nearly worn out with fatigue; during the march here, they had managed to pluck a few ears of corn from the fields by the roadside, and as soon as a pause was made and the arms stacked, the place was ablaze with fires; every man at once went to work making coffee and preparing little messes for dinner. Happily the poor, hungry men had time to finish their meal, but they had barely finished it when they were ordered under arms. The enemy had just then appeared a half-mile away on the Kingston Road, and thither the Brigade was hurried at the double-quick. This movement of the Confederates was at once checked, and the rest of the day passed without any further hostile demonstrations, except a night attack upon our pickets. The morning of the 16th was sharp and cold; as early as two o’clock the regiment was ordered to march. The roads that had been muddy the day before were now frozen; the artillery horses were pinched with cold and hunger, and quite unable to drag the heavy cannon. It was resolved to sacrifice a portion of the baggage train, which, to the number of many wagons, was parked at Lenoir’s. The horses and mules were detached and harnessed into the guns; the spokes of the wagon-wheels were hacked, and, with their contents, set on fire,—not, however, till the soldiers had replenished their haversacks with a goodly quantity of smoked pork, coffee, sugar, and hard bread. The whole corps was in full retreat soon after daylight, and the enemy at once began the pursuit, harassing our rear guard continually. The road from Lenoir’s Station to Knoxville intersects at Campbell’s with the road from Kingston, and Longstreet had detached a column on his left to seize the junction of these roads. 
The possession of Campbell’s Station was, therefore, of great moment to Burnside, for should the enemy arrive there before him, his retreat to Knoxville would surely be cut off. A division of troops under Hartranft, by rapid marching, succeeded, in the early part of
  • 81. the forenoon, in reaching Campbell’s, and going out on the Kingston Road deployed across it, his left on the Loudon Road, along which our army and trains were moving. Hartranft was just fifteen minutes ahead of the enemy; he had only time to form his line, when the Confederate column appeared hurrying up the Kingston Road. A sharp engagement ensued; but the enemy was foiled in his attempt, and driven back in confusion. Soon after, all our trains passed this dangerous point in safety, and moved on to Knoxville. At about noon, the rest of the army came up, and went into position on “a low range of hills about a half-mile from the cross-roads.” The Ninth Corps was posted on the right of the field, which was nearly a mile broad, and extended a half-mile along the main road, and was bordered by heavy woods, passable for infantry. Christ’s brigade was on the right of the corps, and the Twenty-ninth on the right of the Brigade, fifty yards from the woods in front, while its right flank actually touched them. The lines had been formed but a short time, when the blue uniforms of our rear guard were seen, and finally our skirmishers,—the latter crossing the fields, creeping along the fences, and coming up the road, guns in hand, occasionally pausing to load and fire. Now and then a soldier in gray showed himself on the edge of the woods, but he would soon dart back out of sight. Colonel Pierce, now in command of the regiment, had orders to cover his front and flank with skirmishers, and Companies A and I, under Captain Clarke and Lieutenant Williams, were detailed for this purpose. The companies had proceeded but a short distance into the woods, when they came upon the enemy, who were approaching stealthily from tree to tree, evidently attempting what Colonel Christ had feared; namely, to flank the Brigade. A brisk fire began at once, but our men kept their line intact, and maintained perfect coolness. 
After the lapse of about an hour, the officers on the skirmish line discovered that the enemy were gradually overlapping the right of the Brigade, and promptly informed Colonel Christ of the fact. The skirmishers were ordered to come in at once, and the Brigade changed front and began to fall back. This movement was not made a moment too soon, for a dense
  • 82. mass of the enemy’s infantry immediately poured out of the woods in the rear of the retreating Brigade; while his flanking party, which had not yet lapped over our old position, also at the same moment, emerged from the woods, and, with loud yells, joined in the pursuit, firing an occasional shot, and with terrible oaths, shouting to our men to surrender and lay down their arms. Our men, loading as they marched, halted by files, turned about and fired, and again took their places in the ranks. At last, the regiment, which was in the rear, reached a sunken road, and, leaping into it, moved rapidly to the left of our lines; while over the heads of the men, now fully protected by the high bank, played the cannon of our reserve batteries, at last free to fire without endangering the lives of our own troops. The slaughter wrought upon the pursuing enemy is described as terrible; and as the Twenty-ninth came up the hill, gaining the plateau of the Knoxville side, Generals Burnside and Ferrero, standing on either side of the road, clapped their hands as it filed proudly between them. It was now, perhaps, five o’clock in the afternoon, and the battle degenerated into an artillery duel on our side, varied by the enemy with occasional charges, by which they took nothing but disaster. One by one, as it grew dark, the batteries retired, and after nightfall the Brigade moved off and took up its weary march for Knoxville, where it arrived at about three o’clock the next morning, and lay down for a few brief hours to rest upon the bleak hillside near Fort Sanders. During this battle, Charles H. Dwinnell of Company A, a worthy comrade and brave soldier, was killed, and William O’Conner of Company H was captured. Dwinnell was shot through the brain by a sharpshooter stationed in a tall pine. 
The ball was probably aimed at Captain Clarke, who was quite conspicuous at the time; the sharpshooter was instantly marked and shot by two of Dwinnell’s comrades, who fired simultaneously, the enemy’s body being seen to fall out of the tree.
  • 83. The siege of the city commenced on the 17th, and progressed rather gradually, beginning on the west and northwest, and finally extending around the entire city, from river to river. As the work of investing the place continued, our pickets were constantly pressed in close upon the main works, so that by the 29th of November we scarcely held more than the slope of the plateau crowned by our main fortifications, and in some cases not even that. To the right of Fort Sanders, named after a brilliant cavalry general who was killed early in the siege, and west of the city, Humphrey’s and Christ’s brigades picketed one side of the railroad cut, and the enemy the other. On one occasion, before the pickets were drawn in, a little squad of the Twenty-ninth assaulted a house in front of them, and driving away the enemy’s pickets there stationed, captured it, and brought in the supplies, which consisted of a small sack of meal, a few pounds of bacon, a box of tobacco, an eight-gallon keg of blackberry brandy, and two boxes of cartridges. The enemy re-formed and recaptured the house, but our men brought their booty safely into camp. There was meal enough to give each man in the company to which these adventurers belonged, a dish of hasty-pudding, and tobacco enough to furnish every man in the regiment with a good-sized piece. The brandy and cartridges were accounted for during the night by some of the wildest picket-firing that occurred during the siege. There was by no means a large supply of food in the city when the siege began, but long before it concluded, all kinds of provisions became extremely scarce. On the 19th, the Confederates drove in our outer pickets and took possession of the woods. On the evening of the 23d, they attacked our picket line in front of the Brigade, and seemed to be on the point of bringing on a general engagement. The order was given to set fire to a long line of buildings between the two armies. 
This was done to break the enemy’s lines and unmask their movements, and resulted very successfully. The conflagration that followed was both
  • 84. grand and awful. The dark wintry sky was lighted up by the flames, which roared and crackled with an unearthly sound, casting a broad belt of dazzling light over the fields and into the forests. In the round-house of the railroad, there was stored a large amount of condemned ammunition, and when the flames reached that, there was an explosion that shook the earth, and startled the anxious residents of the city. The 26th of November was Thanksgiving Day. The men got a full ration of bullets, but only a half-ration of bread. About midnight of the 28th, the picket line near the river on the southwest was driven in, and could not be re-established by the brigade which furnished it. The line in front of Fort Sanders had also been assailed and taken by the enemy, and about nine o’clock in the evening an order was sent to take the regiment out of the lines and place it in the immediate rear of the fort for special duty; Major Chipman had command. A little later in the evening, Companies A, C, D, and K were detached, and ordered to our lines near the river, where the enemy had a few hours before captured our rifle-pits. The night had nearly gone, and the first glimmer of day had appeared, when the familiar charging yell of the enemy was heard directly in front of the fort. Our pickets at this point were forced in, and in a moment more a large body of the enemy’s infantry were swarming at the very edge of the ditch. The battalion of the Twenty-ninth, under Chipman, were hurried into the fort, and the four detached companies at once sent for. The latter had a perilous experience in joining their comrades, and though exposed to the fire of the enemy’s cannon, reached the works without the loss of a man, and in ample time to lend a hand in the severe contest which was now well under way. 
The Confederates, led by fearless officers, crowded the ditch, and crossing it on each other’s shoulders, began to ascend the bank; one of their standard-bearers came running up and planted his colors upon the parapet, in the very faces of Major Chipman’s men; but he had hardly performed his deed of daring,
  • 85. when one of our soldiers shot him through the heart, and he fell forward into the works. Inspired by the example of their color-bearer, a large body of the Confederates, led by a gray-haired old officer (Colonel Thomas of Georgia), with wild shouts made a dash up the bank. All seemed lost; but at this moment Companies A, C, D, and K of the regiment came running into the fort, and ranging themselves along the parapet, opened a deadly fire upon the assaulting party. The gray old leader of the enemy, while waving his sword and shouting to his men to come on, was shot dead. Many of his brave followers suffered the same fate, and the handful of survivors fell hurriedly back into the ditch. At the same instant, like scenes were transpiring all along the works. The Seventy-ninth New York was sharply engaged, and the artillerymen, not being able to use their pieces, busied themselves by tossing among the enemy lighted shell with their fuses cut to a few seconds’ length. Finally a sergeant of one of the batteries, observing a renewed preparation of the enemy to charge up the bank, slewed one of the large guns about so as to make it bear upon the edge of the ditch, and, with a single charge of canister, raked it for a distance of several yards with deadly effect. About this time the assault slackened; but in a few moments another column of the enemy came rushing towards the fort, and with almost sublime courage faced the withering fire of our troops, and large numbers of them gained the bank. The first terrible scenes of the battle were re-enacted; three of the enemy’s standards were planted simultaneously upon the parapet, but they were quickly torn away by our men. The resistance was as desperate as the assault: officers used freely their swords, the men clubbed their muskets, others used their bayonets, and others still axes and the rammers of the cannon. A struggle so severe as this could not be otherwise than of short duration. 
In a few minutes the enemy’s soldiers began to falter and fall back into the ditch. Seeing this, General Ferrero, who was in command of the fort and closely watching the fight, ordered one company of the Twenty-ninth on the left, and one company of the Second Michigan on the right, to go through the embrasures and charge the disorganized enemy. Sweeping down the ditch, these commands captured about two
  • 86. hundred of the enemy, and drove them into the fort, the little squad of the Twenty-ninth following their captives and bearing triumphantly two battle-flags of the foe; the capturers of which were Sergeant Jeremiah Mahoney of Company A, and Private Joseph S. Manning of Company K, both of whom afterwards received the medals of honor voted by the Congress of the United States. The fight immediately died away in front of Fort Sanders, and the remnant of the enemy’s charging column shrank back within their lines in dismay and confusion. But on the left, where the Federal rifle-pits had been captured on the afternoon of the 28th, a fierce battle was heard. Hartranft’s division was sharply engaged with the enemy in its efforts to recapture the pits, and the effort was soon successful. The Confederates were everywhere routed, our entire line re-established, and by ten o’clock that Sunday morning quietness had settled down over the whole field. The enemy seemed appalled by the dreadful calamity that had overtaken him,—a calamity, as we shall presently see, that practically ended the siege. Ninety-eight dead bodies were taken out of the fatal ditch from a space of four hundred square feet around the salient. General Humphrey, who commanded the Mississippi brigade, was found dead on the glacis, within twenty feet of the face of the ditch. Lying among the dead in the moat, in every conceivable condition, were the wounded; and all over the open space in front of the fort, through which telegraph wires had been stretched from stump to stump to impede the movements of the assailants, were scattered hundreds of both dead and wounded, and among them not a few of the enemy’s soldiers unhurt, who, dismayed at the awful storm of shell and grape that poured upon them, had lain prone upon the earth until the battle was over, only too willing to be captured. Nearly five hundred stand of small arms were collected on the field within our picket lines. 
Pollard states the enemy’s loss in this battle at seven hundred. The great bravery of this charge entitles those who participated in it to honorable mention. The troops who engaged in this assault
  • 87. “consisted of three brigades of McLaw’s division; that of General Wolford,—the Sixteenth, Eighteenth, and Twenty-fourth Georgia regiments, and Cobb’s and Phillips’s Georgia legions; that of General Humphrey,—the Thirteenth, Seventeenth, Twenty-first, Twenty-second, and Twenty-third Mississippi regiments; and a brigade composed of Generals Anderson’s and Bryant’s brigades, embracing among others, the Palmetto State Guard, the Fifteenth South Carolina Regiment, and the Fifty-first, Fifty-third, and Fifty-ninth Georgia regiments.” [42] The troops that garrisoned the fort were Benjamin’s United States Battery, Buckley’s Rhode Island Battery, a part of Roemer’s New York Battery, the Seventy-ninth New York Highlanders, and, at the very beginning of the fight, a battalion of the Twenty-ninth under Major Chipman, and before the repulse of the assault on the salient, Captain Clarke’s and the other companies of the regiment already named. When the battle was well advanced, and affairs had assumed a serious aspect, the One Hundredth Pennsylvania was moved up in the rear of the fort, and a few minutes before the close of the fight, the Second Michigan was ordered into the works on the right, one of its companies being detailed to sweep the ditch. Our loss in the fort was eight killed and five wounded, and among the former were two members of the Twenty-ninth; namely, Sergeant John F. Smith of Company H, and Corporal Gilbert T. Litchfield of Company K, both most excellent soldiers. The loss of the enemy in this encounter doubtless exceeded greatly that given by Mr. Pollard; one of our officers engaged stating it to be fourteen hundred. When Longstreet had drawn off his troops from the scene of his defeat, General Burnside kindly directed General Potter to send out a flag of truce, granting the enemy permission to remove his dead and wounded from the field. 
The flag was courteously received, and for the space of several hours there was a complete cessation of all hostilities. As a reward for its services in this action, the regiment was retained in Fort Sanders as a part of its garrison, and consequently relieved from much severe picket duty, only