Performance Monitoring 2016
PERFORMANCE
& MONITORING
VOLUME III
DZONE.COM/GUIDES
DZONE'S GUIDE TO PERFORMANCE AND MONITORING VOLUME III
DEAR READER,
TABLE OF CONTENTS
EXECUTIVE SUMMARY
GLOSSARY
EDITORIAL
BUSINESS
RICK ROSS
JOHN ESPOSITO
RESEARCH@DZONE.COM
EDITOR-IN-CHIEF
CAITLIN CANDELMO
PUBLICATIONS MANAGER
Read it, monitor your results, and let us know what you think.
BY JOHN ESPOSITO
EDITOR-IN-CHIEF, DZONE
RESEARCH@DZONE.COM
MATT SCHMIDT
PRESIDENT & CTO
JESSE DAVIS
EVP & COO
ANDRE POWELL
KELLET ATKINSON
VP OF MARKETING
G. RYAN SPAIN
MATT OBRIAN
SALES@DZONE.COM
DIRECTOR OF BUSINESS DEVELOPMENT
ASSOCIATE EDITOR
MATT WERNER
ASSOCIATE EDITOR
CEO
MICHAEL THARRINGTON
ASSOCIATE EDITOR
TOM SMITH
RESEARCH ANALYST
ALEX CRAFTS
CHRIS SMITH
JIM HOWARD
SR ACCOUNT EXECUTIVE
ANDREW BARKER
ACCOUNT EXECUTIVE
JIM DWYER
ACCOUNT EXECUTIVE
CHRIS BRUMFIELD
ACCOUNT MANAGER
ART
ASHLEY SLATE
DESIGN DIRECTOR
SPECIAL THANKS
to our topic experts, Zone Leaders, trusted DZone Most Valuable Bloggers, and
dedicated users for all their help and feedback in making this report a great success.
EXECUTIVE SUMMARY
As Donald Knuth famously wrote (in a line often attributed to Tony Hoare),
"premature optimization is the root of all evil": that is, the benefits
of absolutely maximal optimization are usually much
lower than the increased cost of maintenance and
debugging that results from the brittleness caused
by that optimization. On the other hand, the natural
tendency of OOP to prioritize form over performance
can generate a codebase that is highly readable but
partitioned such that performance-oriented refactoring
may prove extremely difficult. To help you steer between
the Scylla of overeager optimization and the Charybdis
of runtime-indifferent code structure, we've split
this publication between ways to design performant
systems and ways to monitor performance in the real
world. To shed light on how developers are approaching
application performance, and what performance
problems they encounter (and where, and at what
frequency), we present the following points in summary
of the most important takeaways of our research.
by far, with 47% using it often. Race conditions and thread locks
are encountered monthly by roughly one fifth of developers (21%
and 19% respectively). Of major parallel programming models, only
multithreading is often used by more than 30% of developers (81%).
IMPLICATIONS
RECOMMENDATIONS
KEY
RESEARCH
FINDINGS
Respondents were also asked to note the last time they had to solve
a performance problem in their infrastructure. The largest share
(21%) said in the last three months, followed by this month at
17% and this week at 14%. Compared to 2015's survey results,
where the largest share of respondents (19%) gave "over a year ago" as the
last time they worked on infrastructure performance problems,
there is a clear shift toward more frequent performance
problems that require immediate attention.
[Figure 01: When was the last time you had to solve a performance problem in your software? Response categories: this week, in the last 2 weeks, this month, in the last 3 months, in the last 6 months, in the past year, 1+ year ago, never.]
the last two weeks at 17%. All in all, 81% of respondents answered
in the last 3 months or less, showing that software still has
frequent performance problems that developers need to address.
Application code (43%) remains the area of the technology stack
that tends to have the highest frequency of performance issues,
while malware remains the one with little to no issues, where 61% of
respondents had either very few issues or none at all.
though down from 62% in 2015) said that they build their application
functionality first, and then they worry about performance. More
people this year have performance in mind from the start when
building applications, as 41% said that they build performance into the
application from the start, which is up from 35% in 2015.
Much like 2015, this year's respondents also favored application logs,
as 89% of them said that these were one of the main tools their teams
use to find the root cause of a performance problem. The second most
commonly used tool for finding the root cause of a performance
issue is database logs, with 68% of respondents relying on them.
Monitoring, logging, and tests are three of the key components used to
help discover problems early enough to fix them before they begin to
negatively affect an application's performance.
answers were fairly evenly split among the options when the
survey takers were asked what the max simultaneous user load is for
the main application their team works on. The largest group, only 17%,
said 1,001-5,000; 13% said 101-500; and 12% said 21-100.
When asked how many servers they use at their organizations, 38% of
the respondents said they use fewer than 20 (this included IaaS
and on-premises servers).
Over half (57%) of the developers surveyed do not regularly design
their programs for parallel execution. When asked which parallel
programming frameworks, standards, and APIs they use, 47% said
they often used Executor Service (Java), while 33% occasionally use
ForkJoin (Java) and 29% occasionally use Web Workers (JavaScript).
As for parallel algorithm design techniques used, 63% most often
use load balancing. 81% of respondents often use multithreading as
their parallel programming model of choice. The respondents noted
that they run into concurrency issues (race conditions, thread locks,
mutual exclusion) only a few times a year.
[Figure: Tools used to find the root cause of performance problems. Labels: application logs, database logs, profilers, debuggers, other.]
[Figure: Max simultaneous user load for the team's main application. Ranges: 21-100, 101-500, 501-1,000, 1,001-5,000, 5,001-10,000, 10,001-50,000, other.]
[Figure: Build performance into the app from the start (41%) versus functionality first, then worry about performance.]
See all your data. Boost performance. Drive accountability for everyone.
IT Operations
Mobile Developers
Faster delivery.
Fewer bottlenecks.
More stability.
End-to-end visibility,
24/7 alerting, and
crash analysis.
Front-end Developers
Reduce resolution times
and spend more time
writing new code.
App Owners
Track engagement.
Pinpoint issues.
Optimize usability.
SPONSORED OPINION
PARTNER SPOTLIGHT
BY NEW RELIC
New Relic gives you deep performance analytics for every part of your
software environment.
CATEGORY: APM
NEW RELEASES: Daily
OPEN SOURCE? No
STRENGTHS
CASE STUDY
One of the fastest-growing digital properties in the U.S., Bleacher Report is
the leading digital destination for team-specific sports content and real-time
event coverage. To improve performance, the company embarked
on a multi-year journey to turn its monolithic web application into a
microservices-based architecture. New Relic has been there each step
of the way, helping the Bleacher Report team stay on top of performance
monitoring, proactive load testing, and capacity planning. Not only is the
software analytics tool helping save time and money by making the team's
code more efficient (and, in turn, requiring fewer servers), but it also helps
Bleacher Report respond more quickly and effectively to issues reported by
users. "I use New Relic every day," says Eddie Dombrowski, senior software
engineer. "It helps me find ways to make our applications perform better
and prioritize which areas to address."
BLOG blog.newrelic.com
NOTABLE CUSTOMERS
Hearst
Trulia
Lending Club
TWITTER @newrelic
HauteLook/Nordstromrack.com
MLBAM
Airbnb
MercadoLibre
WEBSITE newrelic.com
QUICK VIEW
Effective APM:
Find and Fix the
Things That Matter
BY JON C. HODGSON
01
Data granularity is critical.
Transaction & metric sampling can
completely miss intermittent issues
and may mislead you into solving
symptoms instead of the root cause.
02
Beware the Flaw of Averages. The
only way to truly understand the
end-user experience of all users is
by capturing all transactions and
leveraging Big Data to analyze them.
03
Methodology is as important as the
data. Ask the wrong questions, or ask
the wrong way, and you'll waste time
fixing the wrong things.
Time: 3.277, 3.875, 2.825, 69.954, 35.047, 4.194, 5.171, 4.679, 3.795, 5.159, 3.778, 34.376, 24.971, 4.004, 3.552, 10.735, 3.686, 5.200
If you really think about it, you'll realize how ludicrous the
initial question was in the first place. A single value will
never convey the range of experience for all of those users.
There are actually over 10,000 answers to the question:
one for each individual call, and others for subsets of calls
grouped by user type, location, etc. If you really want to know if
ALL of your users are happy with ALL of their interactions
with your application, you have to consider each user
interaction as individually as possible, and beware the Flaw
of Averages.
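To make the Flaw of Averages concrete, here is a minimal sketch that summarizes the sample timings listed above in three different ways. The variable names are illustrative, and the values are used exactly as they appear in the source, with whatever unit the original table used.

```python
import statistics

# Sample timings copied from the "Time" values above (unit as in the source).
times = [3.277, 3.875, 2.825, 69.954, 35.047, 4.194, 5.171, 4.679, 3.795,
         5.159, 3.778, 34.376, 24.971, 4.004, 3.552, 10.735, 3.686, 5.200]

mean = statistics.mean(times)      # pulled upward by a handful of slow calls
median = statistics.median(times)  # what a "typical" call looked like
below_mean = sum(1 for t in times if t < mean)

print(f"mean={mean:.3f} median={median:.3f} max={max(times):.3f}")
print(f"{below_mean} of {len(times)} calls were faster than the mean")
```

In this small sample the mean describes almost nobody: most calls are far faster than it, and the slow calls are far slower. That is why a single summary number is a poor answer to "how are my users doing?"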
IT is about decisions
BMC TrueSight transforms IT by turning data into actionable insights while
eliminating the noise of traditional IT management tools
Bring IT to Life with TrueSight
bmc.com/truesight
SPONSORED OPINION
PARTNER SPOTLIGHT
BY BMC SOFTWARE
CATEGORY: Application and Infrastructure Monitoring
NEW RELEASES: Continuous Delivery
OPEN SOURCE? Some components
STRENGTHS
CASE STUDY
NOTABLE CUSTOMERS
SEI
Northwestern Univ.
Société Générale
InContact
Harmony Information Systems
Lockheed Martin
HealthMEDX
IKEA
TWITTER @truesightpulse
WEBSITE bmc.com/truesight
Know When
(and When Not)
to Blame Your
Network
BY NICK KEPHART
QUICK VIEW
01
In distributed, cloud-based
environments, it's equally important
to understand both application and
network performance.
02
Active monitoring, often used for
website performance, can also
provide you with insights into cloud
provider networks.
03
Active monitoring can provide you a
stack trace for your network, showing
the performance of each network that
your traffic traverses.
04
Consider adding key network
connectivity and service metrics to
your arsenal in order to get ahead of
cloud outages.
you likely already use to monitor application experience can also help
with network experience. Getting better visibility into application
delivery may not be as hard as it seems.
HOW IT WORKS
So how does it work? It starts with loading a page in a browser
and monitoring user timing. Each object on the page is loaded, and its
load time (DNS, blocked, wait, response, etc.), wire size, and
uncompressed size are measured. These page loads can be linked together as entire
user transactions with button clicks, form fills, and more. This can be
particularly useful for JavaScript-heavy pages where user interactions
Object Errors and Load Time: Most apps and webpages have
NETWORK CONNECTIVITY
region, and city can inform decisions about how fast bulky
objects can be loaded by users.
NETWORK SERVICES
CDN Latency: Measure performance from users to edge
widespread?
Is your CDN correctly caching and serving your content from an
your users?
often, but when it does, you're hosed. Keep an eye on your DNS
service provider.
Routing Path Changes: Keeping a pulse on routing changes
APP PERFORMANCE
Page Load and Transaction Time: A standard metric in many
Active monitoring can save you from huge headaches. One major
payment processor that I've worked with spent an entire holiday
weekend with multiple senior engineers trying to track down what
they thought was a database transaction fault. Another team had just
started deploying active monitoring in their environment and, upon
reviewing the data, was able to trace the problem to a routing issue
that was causing unstable network connectivity. Upon seeing the
data, the application development team became instant converts to
adding active monitoring to the runbook for issue resolution.
As your applications increasingly rely on IaaS, microservices,
and APIs from far-flung parts of the Internet, your app is more reliant
on the network than ever. That means that, in order to have a complete
view of application experience, you should add active network
monitoring to your application troubleshooting arsenal. With this
data, your development team can avoid dead ends and be more
confident the next time you need to ask the network guys to dive into
an issue.
NICK KEPHART leads Product Marketing at ThousandEyes, which
develops Network Intelligence software, where he reports on Internet
health and digs into the causes of outages that impact important online
services. Prior to ThousandEyes, Nick worked to promote new approaches
to cloud application architectures and automation while at cloud
management firm RightScale.
Performance Monitoring
+
Powerful Analytics
=
Satisfied Customers
SPONSORED OPINION
PARTNER SPOTLIGHT
AutoPilot Insight
BY NASTEL TECHNOLOGIES
Nastel Technologies provides a unified suite of analytic and monitoring tools for
end-to-end transaction tracking, logs, end-users, and apps
CATEGORY
NEW RELEASES
OPEN SOURCE?
STRENGTHS
Every quarter
No
CASE STUDY
Sky Mexico, a satellite television delivery service, needed a product to span its
UNIX and Windows infrastructure and monitor the health of ERP, CRM, billing, IVR,
provisioning, and other business and IT transactions. The lack of which was causing:
Achieved a reduction of help desk tickets and costly Tier 3 support of 30 and
70 percent, respectively
NOTABLE CUSTOMERS
CitiBank
BestBuy
Fiserve
BNY Mellon
UnitedHealth Group
Sky
Dell
NY Power Authority
BLOG www.nastel.com/blog
TWITTER @nastel
WEBSITE www.nastel.com
Performance Patterns in Microservices-Based Integrations
BY ROHIT DHALL
QUICK VIEW
01
Understand how integration with
multiple systems poses potential
performance issues.
02
Learn what performance patterns
are and how these can help
you avoid common potential
performance issues.
03
Understand five different performance patterns and how they work.
04
Understand the importance of
asynchronous communication/
integration.
PERFORMANCE PATTERNS
THROTTLING
TIMEOUTS
[Figure labels: Microservice A, Microservice X, Microservice Y, pool for X, pool for Y; decision flow: yes, decrease number of available connections, return connection, return exception.]
CIRCUIT BREAKERS
ASYNCHRONOUS INTEGRATION
[Figure: publishers publish event data to a message broker, which serves subscribers.]
CONCLUSION
BULKHEAD
[Figure labels: request for a connection, Microservice A, Microservice X.]
www.ca.com/apm
SPONSORED OPINION
PARTNER SPOTLIGHT
BY CA TECHNOLOGIES
CATEGORY: APM
NEW RELEASES: Quarterly
OPEN SOURCE? No
STRENGTHS
CASE STUDY
TWITTER @cainc
NOTABLE CUSTOMERS
Lexmark
Innovapost
Vodafone
Blue Cross
Blue Shield of
Tennessee
Itau Unibanco
U.S. Cellular
Expeditors
Produban
WEBSITE ca.com/apm
Working in Parallel:
ON THE COMPLICATIONS OF PARALLEL ALGORITHM DESIGN
BY ALAN HOHN
SOFTWARE ARCHITECT, LOCKHEED MARTIN MISSION SYSTEMS AND TRAINING
QUICK VIEW
01
In parallel programming, work
is all the steps you have to do,
and depth is how much work
you can do at once. Both use Big
O notation.
02
Available parallelism is work
divided by depth.
03
Sometimes you have to waste work
to improve parallelism.
04
Sometimes the algorithm with the
best available parallelism is not
the best algorithm.
05
After you find a good parallel
algorithm, the next challenge is
tuning it to run efficiently on real
hardware.
AVAILABLE PARALLELISM
Putting work and depth together, we can define "available
parallelism" (where bigger is better):
Available Parallelism = Work / Depth
With our search through an unsorted list, the work
was O(n) and the depth was O(1), giving an available
parallelism of O(n). This means that as the size of the
input increases, the amount of work increases linearly, but
our ability to do it in parallel also increases linearly. So
as long as we have more processors the problem will take
about the same amount of time (ignoring for a moment the
overhead of splitting the work).
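The claim that more processors keep the running time roughly constant can be checked with a little arithmetic. The sketch below is not from the article: it assumes unit-cost steps, ignores the cost of splitting the work, and uses the standard upper bound on parallel running time, T_p <= work / p + depth (often called Brent's bound). The function names are illustrative.

```python
def available_parallelism(work, depth):
    """Work divided by depth, as defined in the text above."""
    return work / depth


def estimated_time(work, depth, processors):
    """Upper bound on parallel running time: work / p + depth.

    Assumes unit-cost steps and no overhead for splitting the work.
    """
    return work / processors + depth


# Searching an unsorted list: work grows with n, depth stays at 1.
for n, processors in [(1_000, 10), (2_000, 20), (4_000, 40)]:
    bound = estimated_time(work=n, depth=1, processors=processors)
    print(f"n={n} processors={processors} "
          f"parallelism={available_parallelism(n, 1):.0f} time<={bound:.0f}")
```

Doubling both the input and the processor count leaves the bound unchanged, which is what linear available parallelism means in practice.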
In a marginally more realistic example, let's say that
instead of just identifying duplicates, we wanted to count
the number of duplicates for each duplicate we find. Now,
instead of just comparing each item in the list to every
other item, we also need to keep track of how many
matches we've found. So we can't split up the comparisons
completely. Let's take a simple approach. We will split up
the "left" side of the comparison, then just iterate over the
list. This way we count the number of matches in parallel
for each item in the list. Of course, this is a very poor
approach, because we are finding the same duplicates
many times, which is a lot of wasted work.
For this example, while the work is still O(n^2), the depth
is now O(n). This means our available parallelism is O(n).
This is still quite good, because we still see linear speedup
from adding more processors.
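A minimal sketch of this naive approach follows. It is an illustration rather than the author's code: the "left" side of the comparison is split across a thread pool, and each task iterates over the whole list. The helper names and the default pool size are assumptions, and in CPython the threads show the structure of the algorithm more than they deliver a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def count_matches(item, items):
    """O(n) scan: how many entries in items equal item (including itself)."""
    return sum(1 for other in items if other == item)


def count_duplicates_naive(items, max_workers=4):
    """O(n^2) total work: one independent O(n) scan per list entry."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = list(pool.map(lambda item: count_matches(item, items), items))
    # The same value is counted once per occurrence, which is the wasted work;
    # the dict simply keeps one copy of each repeated answer.
    return {item: count for item, count in zip(items, counts) if count > 1}


print(count_duplicates_naive(["a", "b", "a", "c", "b", "a"]))
```

Every scan is independent of the others, so the depth is one scan, O(n), while the total work stays at O(n^2).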
Of course, it would be nice to avoid that wasted work.
Those experienced with map and reduce may have
noticed that a map can emit a value for each item, then
a reducer can add them up. In fact, this is Hadoop's
WordCount example. The work in this case is O(n), and
if the reducer is written correctly the depth is O(log n).
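The map-and-reduce idea can be sketched in a few lines. This is a single-process illustration, not Hadoop's WordCount: the map step emits a count of one per item, and the reduce step merges partial counts in pairs. The merges within a round are independent of each other, so they could run in parallel, and the number of rounds stands in for the depth (treating each merge as one step, which is a simplification). The function names are hypothetical.

```python
from collections import Counter


def map_phase(items):
    """Emit one partial count per item; every emit is independent."""
    return [Counter({item: 1}) for item in items]


def tree_reduce(partials):
    """Merge partial counts pairwise; return (totals, number of rounds)."""
    level = list(partials)
    if not level:
        return Counter(), 0
    rounds = 0
    while len(level) > 1:
        # All merges in this round are independent and could run in parallel.
        level = [level[i] + level[i + 1] if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
        rounds += 1
    return level[0], rounds


words = ["a", "b", "a", "c", "b", "a", "d", "a"]
totals, rounds = tree_reduce(map_phase(words))
print(dict(totals), "rounds:", rounds)
```

With 8 items the reduction finishes in 3 rounds, and each doubling of the input adds only one more round, which is the O(log n) depth mentioned above.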
CONCLUSION
This is a pretty basic discussion of how parallel
algorithms are analyzed and compared to each other. If
you'd like to see how parallel code might work in practice,
I have a GitHub repository that runs a Net Present Value
simulator using Java fork/join's RecursiveTask that
might be of interest.
ALAN HOHN is a software architect with Lockheed Martin Mission
Systems and Training. Much of his recent work has been with Java,
especially Java EE (JBoss) and OSGi (Karaf), but he's worked with C,
C++, C#, Ada, Python, MATLAB, Hadoop, and a few other things over
time. He had a great professor at Georgia Tech for a high performance
computing class, which is lucky because he stumbled into using it at
work soon after.
dynatrace.com
SPONSORED OPINION
CONCLUSION
If you're responsible for the performance of your company's code
from development and test through the transition to production, gap-free
APM data helps isolate and resolve every issue quickly and
efficiently, with no finger pointing, since no data point is missed. In a
world ruled by complexity, gap-free data not only creates a strong IT
foundation but also confidence in the digital experience delivered.
PARTNER SPOTLIGHT
NEW RELEASES
OPEN SOURCE?
STRENGTHS
Monthly
No
CASE STUDY
NOTABLE CUSTOMERS
Verizon
Costco
Panera
Volkswagen
AAA
TWITTER @dynatrace
Fidelity
Investments
Best Buy
WEBSITE dynatrace.com
LATENCIES AND BOTTLENECKS
[Infographic. One panel rates how often performance issues occur (frequent issues, some issues, rare issues, no issues) in application code, clients, CPU, memory, network, application server, storage, and database. The other panel scales hardware latencies to human time: .3 ns = 1 sec (clapping your hands); L1 cache access, .9 ns = 3 sec (blowing your nose); L2 cache access, 2.8 ns = 9 sec (Bill Gates earning $2,250); L3 cache access, 12.9 ns. Further entries include 100 ns, several reads of 1M bytes, and Internet: SF to NYC, 71 ms.]
SMART CONTENT FOR TECH PROFESSIONALS
DZONE.COM
SPONSORED OPINION
PARTNER SPOTLIGHT
SteelCentral AppInternals
BY RIVERBED
"We're now able to look inside of the developers' code without having to modify the code while it's running in our
production environment. That's fantastic. I can't imagine someone running a site of any real size without this capability."
- ERIC MCCRAW, GLOBAL WEB SYSTEMS MANAGER, NATIONAL INSTRUMENTS
CATEGORY: Application Performance Management
NEW RELEASES: Quarterly
OPEN SOURCE? No
CASE STUDY
National Instruments' public-facing website, ni.com, is updated frequently. The
web systems team, which is charged with keeping the site running optimally, spent
thousands of hours each year troubleshooting issues caused by new releases. This
caused tension between the web systems team and the developers, and impacted
customers as well.
The web systems team now uses AppInternals to find and fix root causes of
application performance problems. Developers use it as well to test their code in
QA. As a result, the team has:
BLOG rvbd.ly/20s7pW1
STRENGTHS
NOTABLE CUSTOMERS
ABB
Hertz
National Instruments
Allianz
Linkon
SLS
Asurion
Michelin
TWITTER @SteelCentral
WEBSITE www.appinternals.com
Latency, in general terms, is the amount of time between a cause and the observation of its effect. In a computer
network, latency is defined as the amount of time it takes for a packet of data to get from one designated point to
another. The table below presents the latency for the most common operations on commodity hardware. These data
points are only approximations and will vary with the hardware and the execution environment of your code. However,
their primary purpose is to enable you to make informed technical decisions to reduce latency.
OPERATION | NOTE | LATENCY | SCALED LATENCY
L1 cache reference | - | 0.5 ns | Consider L1 cache reference duration to be 1 sec
Branch misprediction | During the execution of a program, the CPU predicts the next set of instructions. Branch misprediction is when it makes the wrong prediction. Hence, the previous prediction has to be erased and a new one must be calculated and placed on the execution stack. | 5 ns | 10 s
L2 cache reference | - | 7 ns | 14 s
Mutex lock/unlock | - | 25 ns | 50 s
Main memory reference | - | 100 ns | 3m 20s
Compress 1K bytes with Snappy | - | - | 1h 40m
- | - | 10,000 ns | 5h 33m 20s
Read 1 MB sequentially from memory | This includes the seek time as well as the time to read 1 MB of data. | 250,000 ns | -
- | We can assume that the DNS lookup will be much faster within a data center than it is to go over an external router. | 500,000 ns | -
Read 1 MB sequentially from SSD disk | Assumes this is a SSD disk. SSD boasts random data access times of 100,000 ns or less. | 1,000,000 ns | -
Disk seek | Disk seek is the method used to get to the sector and head in the disk where the required data exists. | 10,000,000 ns | -
Read 1 MB sequentially from disk | - | 20,000,000 ns | -
- | Round trip for packet data from U.S.A. to Europe and back. | 150,000,000 ns | 3472d 5h 20m
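The scaled latency column follows from one rule: stretch time so that an L1 cache reference (0.5 ns) lasts 1 second. A minimal sketch of that arithmetic is below. The function names and the output format are illustrative choices, not part of the original table.

```python
BASELINE_NS = 0.5  # an L1 cache reference, scaled to exactly 1 second


def scaled_seconds(latency_ns):
    """Convert a real latency in nanoseconds to scaled seconds."""
    return latency_ns / BASELINE_NS


def format_scaled(latency_ns):
    """Render the scaled latency as days, hours, minutes, and seconds."""
    total = int(round(scaled_seconds(latency_ns)))
    days, rest = divmod(total, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (seconds, "s")]
    text = " ".join(f"{value}{unit}" for value, unit in parts if value)
    return text or "0s"


for name, ns in [("Main memory reference", 100),
                 ("10,000 ns operation", 10_000),
                 ("Round trip U.S.A. to Europe and back", 150_000_000)]:
    print(f"{name}: {ns} ns -> {format_scaled(ns)}")
```

The same function fills in any scaled value missing from the table, since every row uses the same factor of 2 scaled seconds per nanosecond.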
REFERENCES: DESIGNS, LESSONS AND ADVICE FROM BUILDING LARGE DISTRIBUTED SYSTEMS; PETER NORVIG'S POST ON TEACH YOURSELF PROGRAMMING IN TEN YEARS
ibm.biz/apiconnect
SPONSORED OPINION
PARTNER SPOTLIGHT
API Connect
IBM API Connect is a complete solution that addresses all aspects of the API lifecycle (Create, Run, Manage, Secure) for both on-premises and cloud environments.
CATEGORY: API Management
NEW RELEASES: Agile
OPEN SOURCE? No
STRENGTHS
Simplify discovery of enterprise systems of record for automated
API creation
API LIFECYCLE
IBM API Connect offers features to manage the API lifecycle, including:
Run: take advantage of integrated tooling to build, debug, and deploy APIs and
microservices using Node.js or Java.
TWITTER @ibmapiconnect
FEATURES
Unified Console
Quickly run APIs and microservices
Manage APIs with ease
Readily secure APIs and microservices
Create APIs in minutes
WEBSITE ibm.com/apiconnect
How HTTP/2 Is
Changing Web
Performance Best
Practices
BY CLAY SMITH
QUICK VIEW
01
HTTP/2 is the successor of HTTP
that was ratified in May 2015.
02
It is changing long-standing web
performance optimizations.
03
Best practices for migrating and
using it in production are still being
finalized.
04
This article covers how HTTP/2 is
different, how it improves latency,
and how to debug it in production.
05
Measuring real-user performance is
critical during a HTTP/2 migration.
[Figure labels: TCP connection, cat.jpg, robots.txt, application.js; unbundled CSS files fonts.css, news.css, about.css, and footer.css, shown alongside fingerprinted file names.]
Requests for multiple assets on a single host use a single TCP
connection in HTTP/2.
A WORK IN PROGRESS
Most users don't care what application protocol your site
uses; they just want it to be fast and work as expected.
Although HTTP/2 has been officially ratified for almost a year,
developers are still learning best practices when building
faster websites on top of it. The benefits of switching to HTTP/2
depend largely on the makeup of the particular website and
what percentage of its users have modern browsers. Moreover,
debugging the new protocol is challenging, and easy-to-use
developer tools are still under construction.
Despite these challenges, HTTP/2 adoption is growing.
According to researchers scanning popular web properties,
the number of top sites that use HTTP/2 is increasing,
especially after CloudFlare and WordPress announced
their support in late 2015. When considering a switch, it's
important to carefully measure and monitor asset- and
page-load time in a variety of environments. As vendors and
web professionals educate themselves on the implications of
this massive change, making decisions from real user data is
critical. In the midst of a website obesity crisis, now is a great
time to cut down on the total number of assets regardless of
the protocol.
HTTP/2 SUPPORT
Apache: > 2.4.17
nginx: > 1.9.5
Microsoft IIS
Heroku: No (as of 1/16)
Google AppEngine
Amazon S3: No (as of 1/16)
Akamai: Yes
CloudFlare: Yes
KeyCDN: Yes
Amazon CloudFront: No
ADDITIONAL RESOURCES
Let's Encrypt
Why isn't HTTPS everywhere yet?
HTTP/2 on IIS
Moving to HTTP/2 on nginx 1.9.5
is-http npm module
Is TLS Fast Yet?
This article was written by Clay Smith, with contributions
of technical feedback and invaluable suggestions by Jeff
Martens, Product Manager for New Relic Browser, and web
performance expert Andy Davies.
SPONSORED OPINION
PARTNER SPOTLIGHT
BY APPDYNAMICS
CATEGORY: Application Performance Management
NEW RELEASES: Bi-Yearly
OPEN SOURCE? No
STRENGTHS
CASE STUDY
NOTABLE CUSTOMERS
BLOG blog.appdynamics.com
NASDAQ
eHarmony
DIRECTV
Cisco
Citrix
Hallmark
TWITTER @AppDynamics
WEBSITE appdynamics.com
Benchmarking Java Logging Frameworks
BY ANDRE NEWMAN
SOFTWARE DEVELOPER
QUICK VIEW
01
In distributed, cloud-based
environments, it's equally important
to understand both application and
network performance.
02
Active monitoring, often used for
website performance, can also
provide you with insights into cloud
provider networks.
03
Active monitoring can provide you a
stack trace for your network, showing
the performance of each network that
your traffic traverses.
04
Consider adding key network
connectivity and service metrics to
your arsenal in order to get ahead of
cloud outages.
THE CONTENDERS
For this test, we investigated four of the most commonly used Java
logging frameworks:
1. Log4j 1.2.17
2. Log4j 2.3
3. Logback 1.1.3 using SLF4J 1.7.7
4. JUL (java.util.logging)
We actually ran 11 iterations of each test, but the first was discarded
because of the long startup time needed to warm up the JIT. To simulate
a workload, we generated prime numbers in the
background. We repeated this test three times and averaged the
results. This stress test also drives the logging frameworks harder
than a typical workload would, because we wanted to push
them to their limit. For example, in a typical workload, you won't
see as many dropped events, because events will be more spread out
over time, allowing the system to catch up.
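A benchmark loop of the kind this article describes (timed iterations, one discarded warm-up pass, results averaged) might be sketched as follows. This is only an illustration in Python using the standard logging module as a stand-in for the Java frameworks; it is not the code used for the tests, and the function names, logger name, and event counts are assumptions chosen for the example. CPython has no JIT, so the warm-up pass here only mirrors the structure of the test.

```python
import io
import logging
import time
from statistics import mean


def build_logger(stream):
    """Create a DEBUG-level logger that writes to the given text stream."""
    logger = logging.getLogger("benchmark_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    return logger


def time_logging(logger, events, measured_iterations):
    """Time each iteration; run one extra warm-up pass first and discard it."""
    durations = []
    for iteration in range(measured_iterations + 1):
        start = time.perf_counter()
        for n in range(events):
            logger.debug("event %d", n)
        elapsed = time.perf_counter() - start
        if iteration > 0:  # iteration 0 is the warm-up and is not recorded
            durations.append(elapsed)
    return durations


stream = io.StringIO()
durations = time_logging(build_logger(stream), events=1000, measured_iterations=10)
print(f"mean seconds per iteration: {mean(durations):.6f}")
```

Writing to an in-memory stream keeps the sketch self-contained; a real comparison would write to files, sockets, or syslog, as the tests in this article did.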
We performed all testing on an Intel Core i7-4500U CPU with 8 GB of
RAM and Java SE 7 update 79.
In the interest of fairness, we chose to keep each framework as close
to its default configuration as possible. You might experience a boost
in performance or reliability by tweaking your framework to suit
your application.
APPENDER CONFIGURATION
We configured our file appenders to append entries to a single file
using a PatternLayout of %d{HH:mm:ss.SSS} %-5level - %msg%n.
Our socket appenders sent log data to a local socket server, which
then wrote the entries to a file (see this link for an example using
Log4j 1.2.17). Our syslog appenders sent log data to a local rsyslog
server, which then forwarded the entries to Loggly.
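To make the layout pattern concrete, the sketch below approximates `%d{HH:mm:ss.SSS} %-5level - %msg%n` with Python's standard `logging.Formatter`. It is an analogue under stated assumptions, not Log4j or Logback configuration: the record contents are made up, and the trailing newline that `%n` produces is left to the handler, which is how Python's stream handlers normally terminate each line.

```python
import logging

# Time as HH:MM:SS plus milliseconds, level name left-justified and padded
# to five characters, then " - " and the message.
formatter = logging.Formatter(
    fmt="%(asctime)s.%(msecs)03d %(levelname)-5s - %(message)s",
    datefmt="%H:%M:%S",
)

# A hand-built record (hypothetical values) so the example needs no handler.
record = logging.LogRecord(
    "demo", logging.INFO, "demo.py", 1, "cache warmed", None, None
)
line = formatter.format(record)
print(line)
```

The output has the shape `HH:MM:SS.mmm INFO  - cache warmed`, where the extra space after `INFO` comes from the five-character padding.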
The AsyncAppender was used with the default configuration, which
has a buffer size of 128 events (256 events for Logback) and does not
block when the buffer is full.
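The behavior described here, a bounded buffer that refuses new events rather than blocking when it is full, is what makes dropped events possible under load. A minimal Python sketch of that idea follows. The class and method names are hypothetical, and this is not how any of the Java frameworks implement their asynchronous appenders; it only shows the drop-on-full policy with a capacity of 128.

```python
import queue


class NonBlockingBuffer:
    """Bounded event buffer that drops new events instead of blocking when full."""

    def __init__(self, capacity=128):
        self._queue = queue.Queue(maxsize=capacity)
        self.dropped = 0

    def offer(self, event):
        """Try to enqueue an event; return False and count a drop if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self):
        """Remove and return every buffered event, oldest first."""
        events = []
        while True:
            try:
                events.append(self._queue.get_nowait())
            except queue.Empty:
                return events


buffer = NonBlockingBuffer(capacity=128)
for n in range(200):  # burst of events with no consumer running
    buffer.offer(n)
print(buffer.dropped)  # events lost because the buffer was full
```

In this burst, nothing drains the buffer, so every event past the first 128 is dropped. With a consumer thread draining in the background, drops only occur when producers outpace it.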
TEST RESULTS
Our goal was to measure the amount of time needed to log a number
of events. Our application logged 100,000 DEBUG events (INFO
events for JUL) over 10 iterations, with an additional first iteration
run as a warm-up and discarded.
FILE APPENDER
SOCKET APPENDER
UDP
TCP
SYSLOG APPENDER
UDP
TCP
As expected, TCP with Log4j 2.3 proved to be a much more reliable
transmission method. (You can view the test results here.) We saw
a small number of dropped messages, but it was negligible when
compared with UDP. The cost of this higher reliability is a run time
that's nearly twice as long.
With an asynchronous appender, we saw a decent boost in
performance with no drop in throughput.
CONCLUSION
The combination that we found to offer the best performance and
reliability is Log4j 1.2.17's FileAppender using an AsyncAppender.
This setup consistently completed in the fastest time with no
dropped events. For raw performance, the clear winner was
Logback's FileAppender using an AsyncAppender.
SPONSORED OPINION
for multiple connections between the client and the server. This
reduces network latency, which in turn makes web pages load
faster. HTTP/2 also compresses HTTP headers, allows the server to
push resources to the client that haven't been requested yet, and
allows the client to indicate to the servers which resources are more
important than others.
Why should you migrate your web applications to HTTP/2? The main
reason is speed. An HTTP/2-based site will simply load faster than
a site served over HTTP/1.1, a nearly 20-year-old protocol that doesn't do
a very efficient job of handling the network handshake between
browser client and web server that happens every time a user tries to
access a web page. As websites have grown larger and more complex,
these inefficiencies have proved to be a drag on web performance.
Organizations have had to adapt by using techniques such as domain
sharding, in-line images, and file concatenation.
The end result is that a browser client can make faster and fewer
connections to a web host, speeding up the time it takes to
download content from that server. The smaller content payloads
and optimized TCP connections of HTTP/2 are especially ideal for
mobile applications and sites.
There are various ways to start using HTTP/2. You can upgrade
your web server to the latest versions of Apache and Nginx. Your
hosting or CDN provider can upgrade your site to HTTP/2 even
faster. No coding changes are required. Then keep monitoring your
sites to make sure they live up to their potential.
PARTNER SPOTLIGHT
Catchpoint Synthetic
BY CATCHPOINT SYSTEMS
NEW RELEASES: 8x Annually
OPEN SOURCE? No
STRENGTHS
DNS Monitoring
Hosts & Zone Monitoring
Real User Measurement
CASE STUDY
Priceline.com relies on innovative proprietary architecture that combines internal and third-party
partner components to offer high-performing websites and services to millions of customers. Speed,
scalability, and consistency are keys to Priceline.com's continued success.
THE SOLUTION
Utilizing Catchpoint's monitoring locations to proactively monitor
multistep transactions, DNS services, and API calls, Priceline continuously
benchmarks performance with industry peers to define appropriate goals to
maintain its leadership position.
Catchpoint Insight: Priceline automatically correlated internal data with
synthetic monitoring metrics to diagnose problems and rapidly find root
causes across complex multi-tier architectures.
Zones and Hosts: Underperforming components (third-party vendors,
internal components, etc.) were quickly troubleshot.
NOTABLE CUSTOMERS
Business Insider, Honeywell, Verizon, Comcast, Kate Spade, Wayfair, Trip Advisor
BLOG blog.catchpoint.com
TWITTER @catchpoint
WEBSITE catchpoint.com
Executive Insights on Performance + Monitoring
QUICK VIEW
01
Performance and monitoring grow
more challenging as more data and
more layers of abstraction are added
with no end in sight.
02
Customers need real user monitoring
from the server to the application
across all devices to understand the
performance of their apps for the
optimal UX.
03
Developers need to measure
performance earlier in the
development process and be
sensitive to how latency can accrue
as their application integrates with
other apps.
BY TOM SMITH
MARKETING STRATEGIST AND RESEARCH ANALYST, DZONE
www.fusion-reactor.com
Copyright 2016, Intergral GmbH. All rights reserved. All trademarks, names, logos referenced, belong to their respective companies.
SPONSORED OPINION
Traditional APM tools provide some neat metric graphs and can alert
you when something seems wrong, but they don't tell you much at
the level of detail software engineers need to get to the actual root
of the issue. So we grep through our logs, dive into the heap, tally
our object instances, run stack traces over and over, guess at some
breakpoints, or add debug output to our code.
transactions, web & JDBC requests not just measured by time, but
by memory consumed
PARTNER SPOTLIGHT
FusionReactor
BY INTERGRAL GMBH
FusionReactor goes beyond traditional APM tools to give you unrivaled insight
into how your Java code performs and executes in production environments.
NEW RELEASES: 3 Months
OPEN SOURCE? No
NOTABLE CUSTOMERS
Auto Europe, Bullhorn, Allianz, Hasbro
BLOG blog.fusion-reactor.com
TWITTER @Fusion_Reactor
WEBSITE fusion-reactor.com
QUICK VIEW
01
Not all web pages are created equal.
People react differently to slowdowns on
different pages in the transaction path.
02
Knowing your pages' load times is
just a first step. You need to correlate
load time with other metrics that are
meaningful to your business.
03
Conversion Impact Scoring keeps you
from wasting limited performance optimization resources on the wrong pages.
04
Every site is different. Page groups
that have high Conversion Impact
Scores for another retailer may not
generate the same scores for you.
That's why you need to use your own
user data.
BY TAMMY EVERTS
SENIOR RESEARCHER AND EVANGELIST, SOASTA
WHAT IS A CONVERSION?
A conversion is what happens when a person who's browsing a site
converts to being a user or buyer of the service or product that site
offers. So if you're a SaaS vendor, a conversion happens when a person
signs up to use your service, or if you're an e-commerce shop, when
a person buys something. Conversions can also include actions like
signing up for a newsletter or making a donation.
The conversion funnel is the start-to-finish path that a user takes when they
convert from browsing to buying/downloading/etc. A conversion funnel for an
e-commerce site might look something like this (note that percentages are
arbitrary and extremely optimistic):
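As a rough numerical illustration of such a funnel, the Python sketch below computes the fraction of visitors who continue from each stage to the next, plus the overall conversion rate. The stage names and visitor counts are hypothetical values invented for this example; they are not taken from the article or from any real site.

```python
# Hypothetical visitor counts at each funnel stage (illustrative only).
funnel = [
    ("Home", 1000),
    ("Product page", 600),
    ("Shopping cart", 300),
    ("Checkout", 150),
    ("Order confirmation", 100),
]


def step_rates(stages):
    """Fraction of visitors who continue from each stage to the next one."""
    return [
        (f"{name} -> {next_name}", next_count / count)
        for (name, count), (next_name, next_count) in zip(stages, stages[1:])
    ]


def overall_conversion_rate(stages):
    """Visitors at the final stage divided by visitors at the first stage."""
    return stages[-1][1] / stages[0][1]


for step, rate in step_rates(funnel):
    print(f"{step}: {rate:.0%}")
print(f"overall: {overall_conversion_rate(funnel):.0%}")
```

Looking at the per-step rates rather than only the overall figure shows where in the path visitors drop out.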
[Figure: conversion rate versus page load time (seconds)]
[Figure: page load times by page group: Checkout SendTo, Home, Account SignIn, Checkout OrderConfirmation, Category Browse 1]
If you looked only at page load times, you might believe that you need
to prioritize the Checkout SendTo group because its performance
is dramatically poorer than the other groups. But if you knew its
Conversion Impact Score, you'd realize that page speed doesn't
have much impact on conversion rate, so making this group faster
wouldn't be the best use of your limited optimization resources.
PAGE GROUP                    RELATIVE CONVERSION    MEDIAN FULL PAGE
                              IMPACT SCORE           LOAD TIME (SECONDS)
                              -0.12                  2.9
Category Browse 1             -0.085                 3.0
Home                          -0.08                  3.8
                              -0.045                 2.1
Shopping Bag                  -0.01
Checkout Send To              -0.005
Wishlist                      -0.004                 2.8
Checkout Order Confirmation   -0.003                 3.2
Account SignIn                -0.0025                3.3

Still looking solely at load times, you might also guess that, because
these pages look fairly speedy, you don't need to worry about them.
This is where you'd make your biggest mistake. Because these
groups have the highest Conversion Impact Scores, they have the
potential to deliver the most benefit to you if you make them faster.

CONCLUSION
Knowing the Conversion Impact Scores for this set of page groups,
this is the order in which you might actually want to prioritize
their optimization to give you the best ROI:
1. Home
2. Category Browse 1
3. Product Detail Page
4. Choose Your Country
5. Shopping Bag
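The contrast drawn above, between prioritizing by load time and prioritizing by Conversion Impact Score, can be seen by sorting the same page groups both ways. The Python sketch below uses only the table rows for which a group name, a score, and a load time all survive. Treating a larger score magnitude as a higher impact is an assumption based on how the table is ordered, and the function names are hypothetical. It is not meant to reproduce the article's own priority list, which covers groups missing from this subset.

```python
# (page group, relative conversion impact score, median full page load time in seconds)
page_groups = [
    ("Category Browse 1", -0.085, 3.0),
    ("Home", -0.08, 3.8),
    ("Wishlist", -0.004, 2.8),
    ("Checkout Order Confirmation", -0.003, 3.2),
    ("Account SignIn", -0.0025, 3.3),
]


def rank_by_load_time(groups):
    """Slowest page groups first."""
    ordered = sorted(groups, key=lambda group: group[2], reverse=True)
    return [name for name, _, _ in ordered]


def rank_by_impact(groups):
    """Largest score magnitude first (assumed to mean highest conversion impact)."""
    ordered = sorted(groups, key=lambda group: abs(group[1]), reverse=True)
    return [name for name, _, _ in ordered]


print(rank_by_load_time(page_groups))
print(rank_by_impact(page_groups))
```

Ranked by load time alone, Account SignIn lands second; ranked by score, it lands last, which is the kind of misallocation the article warns about.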
Solutions Directory
This directory of monitoring, hosting, and optimization services provides comprehensive,
factual comparisons of data gathered from third-party sources and the tool creators'
organizations. Solutions in the directory are selected based on several impartial criteria,
including solution maturity, technical innovativeness, relevance, and data availability.
PRODUCT NAME | PRODUCT TYPE | FREE TRIAL | HOSTING | WEBSITE
Akamai Ion | | | SaaS | akamai.com
Alertsite by Smartbear Software | | Available by request | On-premise or SaaS | smartbear.com/product/alertsite/overview/
Apica Systems | | Limited by usage | SaaS | apicasystems.com
AppDynamics | | | On-premise or SaaS | appdynamics.com
AppFirst | | 30 days | SaaS | appfirst.com
 | | Available by request | SaaS | appneta.com
AppNomic AppsOne | ITOA | Upon request | On-premise or SaaS | appnomic.com
Aternity | | Upon request | On-premise | aternity.com
BigPanda | | 21 days | SaaS | bigpanda.io
 | | 14 days | SaaS | bmc.com/it-solutions/truesight.html
BrowserStack | FEO | Limited by usage | SaaS | browserstack.com
 | | Free Trial | SaaS | ca.com/us/products/ca-appsynthetic-monitor.html
 | Mobile APM | | SaaS | ca.com/us/products/ca-mobileapp-analytics.html
CA Unified Infrastructure Management | Infrastructure Monitoring | Free Trial | On-premise | ca.com/us/products/ca-unifiedinfrastructure-management.html
Catchpoint Suite | | 14 days | On-premise or SaaS | catchpoint.com/products/
Censum by jClarity | | 7 days | SaaS w/ on-premise option | jclarity.com
Circonus | | | SaaS | circonus.com
CloudFlare | CDN | | | cloudflare.com
Correlsense SharePath | | Upon request | On-premise or SaaS | correlsense.com
CoScale | | 30 days | SaaS | coscale.com
Datadog | | 14 days | SaaS | datadoghq.com
Dotcom Monitor | | 30 days | SaaS | dotcom-monitor.com
Dyn | | 7 days | On-premise | dyn.com
Dynatrace Application Monitoring | APM, ITOA | 30 days | On-premise | dynatrace.com/en/applicationmonitoring/
 | | Demo on request | On-premise | dynatrace.com/en/data-centerrum/
Dynatrace Ruxit | | | On-premise or SaaS | dynatrace.com/en/ruxit/
Dynatrace Synthetic | | Demo on request | SaaS | dynatrace.com/en/syntheticmonitoring/
Dynatrace UEM | | 30 days | On-premise | dynatrace.com
eg Innovations Monitors | | 14 days | SaaS | eginnovations.com
Evolven | ITOA | Upon request | On-premise | evolven.com
Extrahop Networks | ITOA | | SaaS | extrahop.com
F5 Big IP Software | | 30 days | On-premise or SaaS | f5.com
Foglight by Dell | | Available by request | On-premise | software.dell.com
Fusion Reactor | | 14 days | On-premise | fusion-reactor.com
HPE APM | | 30 days | On-premise | hp.com
 | | | On-premise or SaaS | ibm.com/software/products/en/api-connect
 | | 30 days | On-premise or SaaS | ibm.com/software/products/en/ibm-application-performancemanagement
 | DB monitoring | 14 days | SaaS | idera.com
 | | 14 days | SaaS | idera.com
Illuminate by jClarity | | 14 days | SaaS w/ on-premise option | jclarity.com
Impact by Cedexis | | Upon request | SaaS | cedexis.com
Inetco Insight | | Upon request | On-premise | inetco.com
 | | Upon request | On-premise | infovista.com
JenniferSoft | APM | 14 days | On-premise | jennifersoft.com
 | | 7 days | SaaS | keynote.com
Librato | | 30 days | SaaS | librato.com
Logentries | | | SaaS | logentries.com
Loggly | | 30 days | SaaS | loggly.com
LogMatrix NerveCenter | | Available by request | On-premise | logmatrix.com
ManageEngine Applications Manager | | Available by request | On-premise | manageengine.com
 | APM | 180 days | On-premise | microsoft.com
Moogsoft | | Available by request | On-premise or SaaS | moogsoft.com
Nagios XI | | Open source | On-premise | nagios.com
Nastel Autopilot | | Upon request | SaaS | nastel.com
NetScout nGeniusOne | | Upon request | On-premise | netscout.com
Netuitive | | 21 days | SaaS | netuitive.com
 | FEO | 30 days | SaaS | neustar.biz
 | | | SaaS | newrelic.com/applicationmonitoring
op5 Monitor | | | SaaS | op5.com
OpsGenie | Alert Software | Upon request | On-premise | opsgenie.com
OpsView | | 30 days | On-premise | opsview.com
PA Server Monitor | | 30 days | On-premise | poweradmin.com
PagerDuty | | 14 days | SaaS | pagerduty.com
Pingdom | APM, FEO | 30 days | SaaS | pingdom.com
Rackspace Monitoring | Cloud monitoring | | SaaS | rackspace.com/cloud/monitoring
Riverbed SteelCentral | | 30-90 days | On-premise | riverbed.com
SauceLabs | | 14 days | SaaS | saucelabs.com
ScienceLogic Platform | | Upon request | SaaS | sciencelogic.com
SevOne | | Upon request | SaaS | sevone.com
SIEM by AccelOps | | 30 days | SaaS | accelops.com
Site24x7 by ManageEngine | | Limited by usage | SaaS | site24x7.com
Soasta Platform | | Up to 100 users | SaaS | soasta.com
Solarwinds Network Performance Monitor | | 30 days | On-premise | solarwinds.com
SpeedCurve | FEO, ITOA | None | SaaS | speedcurve.com
Spiceworks | | Free | On-premise | spiceworks.com
Stackify | | Upon request | SaaS | stackify.com
TeamQuest | ITOA | Upon request | On-premise | teamquest.com
Telerik Analytics | | Free | On-premise | telerik.com
ThousandEyes | | 15 days | SaaS | thousandeyes.com
TINGYUN App | | Available by request | SaaS | tingyun.com
VictorOps | Alert Software | 14 days | On-premise | victorops.com
Zabbix | Network Monitoring | Open source | On-premise | zabbix.com
 | | | On-premise or SaaS | zenoss.com
DIVING DEEPER
INTO PERFORMANCE + MONITORING
TOP 10 #PERFORMANCE TWITTER FEEDS
@Souders
@brendangregg
@tameverts
@paul_irish
@bbinto
@mdaoudi
@ChrisLove
@firt
@Perf_Rocks
@duhroach
dzone.com/performance
DevOps Zone: dzone.com/devops
Java Zone: dzone.com/java
TOP PERFORMANCE REFCARDZ
bit.ly/dz-userm
bit.ly/dz-javaperf
bit.ly/dz-scale
TOP PERFORMANCE WEBSITES
Planet Performance: perfplanet.com
ResponsiveDesign.is: responsivedesign.is
brendangregg.com/blog
TOP SPEED TEST TOOLS
webpagetest.org
tools.pingdom.com/fpt
developers.google.com/speed/pagespeed/insights/
gtmetrix.com
GLOSSARY
ACTIVE MONITORING Also known as
synthetic monitoring, this is a type of
website monitoring where scripts are
created to simulate an ordered series of
actions that an end-user might take (as
opposed to comparatively atomic functional
or integration tests). Tests overall site
functionality and response time and helps
identify any problems that hinder overall
site performance.
APPENDER In logging systems, specifies
destination, message format, behavior (non/
blocking, response timeouts, retry intervals,
exception handling), accept/reject filters,
compression details, etc.
APPLICATION-LAYER PROTOCOL
NEGOTIATION (ALPN) An extension of
Transport Layer Security (TLS) protocol
negotiation that helps client and server
figure out, within the TLS handshake,
which application-layer protocols to use.
Handles HTTP/1.1 vs. HTTP/2 selection but
is also app-layer protocol indifferent.
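The selection step in the ALPN entry can be illustrated with a small simulation: the client offers a list of protocol identifiers and the server picks one of them. The Python sketch below is not a TLS implementation. The function name is hypothetical, and choosing by server preference order is an assumption about one common policy. `h2` and `http/1.1` are the registered ALPN identifiers for HTTP/2 over TLS and HTTP/1.1.

```python
def select_protocol(client_offers, server_preferences):
    """Return the first server-preferred protocol the client also offered, else None."""
    offered = set(client_offers)
    for protocol in server_preferences:
        if protocol in offered:
            return protocol
    return None  # no overlap between what the client offers and the server supports


# A client that supports both protocols talking to an HTTP/2-capable server.
print(select_protocol(["h2", "http/1.1"], ["h2", "http/1.1"]))
# The same client talking to a server that only speaks HTTP/1.1.
print(select_protocol(["h2", "http/1.1"], ["http/1.1"]))
```

Because the choice happens inside the TLS handshake, the client learns which protocol to speak without an extra round trip.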