0% found this document useful (0 votes)

104 views14 pages

High-Performance Linux Cluster Monitoring Using Java: Curtis Smith and David Henry Linux Networx, Inc

This document discusses using Java for high-performance Linux cluster monitoring. It describes dividing monitoring into three stages - gathering data from nodes, consolidating the data, and transmitting it. The key points are that gathering data from Linux's /proc file system using Java can provide efficient, high-frequency monitoring without impacting performance.

Uploaded by

Fabryziani Ibrahim Asy Sya'bani

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

104 views14 pages

High-Performance Linux Cluster Monitoring Using Java: Curtis Smith and David Henry Linux Networx, Inc

Uploaded by

Fabryziani Ibrahim Asy Sya'bani

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

High-performance Linux cluster monitoring

using Java
Curtis Smith and David Henry
Linux NetworX, Inc.
USA

Abstract

Monitoring is at the heart of cluster management. Instrumentation data is used to

schedule tasks, load-balance devices and services, notify administrators of
hardware and software failures, and generally monitor the health and usage of a
system. The information used to perform these operations must be gathered from
the cluster without impacting performance. This paper discusses some of the
performance barriers to efficient high-performance monitoring, and presents an
optimized technique to gather monitored data using the /proc file system and Java.

1 Introduction

Java technology has much to offer to the developers of cluster management

solutions. Java is dynamic, flexible, and portable. These unique features make it an
ideal foundation for building cluster management solutions over heterogeneous
networks and platforms.
Java has an extensive library of routines for coping easily with IP protocols
such as TCP and UDP, and for network programming on multi-homed hosts. This
makes creating network connections much easier than in C or C++. Through the
Java Native Interface (JNI), Java code that runs within a Java Virtual Machine
(JVM) can interoperate with applications and libraries written in other languages,
such as C, C++, and assembly.
Java is already a language of choice when building cluster monitoring or
management systems. However the language is usually used for the front end or
cluster host portion of the system, while a daemon written in C is installed on the
cluster nodes [1] [10]. Despite the benefits provided by the Java programming
language, the question remains: is Java efficient enough to replace the C daemon
running on each node for high-performance cluster monitoring?
2 High Performance Monitoring

Tools for monitoring Linux clusters have traditionally provided a limited amount
of data with delivery frequencies measured in seconds. High performance cluster
monitoring is defined as the ability to efficiently gather data from nodes at intra-
second frequencies. When dealing with large clusters, inefficiencies in the
monitoring software become increasingly problematic. This is because running
applications must coordinate with each other or with shared global resources.
Interference on one node can affect jobs running on other nodes. An example of
this is an MPI job that needs to synchronize all participating nodes.
One solution to this problem is to gather less data and transmit it less
frequently. However if the objective is high-performance monitoring, this solution
is unacceptable. A cluster that is heavily utilized should be monitored constantly
and frequently. Local job schedulers must be able to make decisions quickly based
on resource usage. Administrators often want to be notified of critical events and
view historic trend data, which is unavailable unless the cluster is monitored
constantly and consistently. Therefore utilizing more efficient algorithms,
increasing transmission parallelism, increasing the efficiency of the transmission
protocol and data format, and reducing redundancy are needed.
The performance numbers given in this paper are from code that runs the
monitor at 100% CPU utilization. Therefore higher numbers indicate higher
efficiency of the monitoring algorithm. Typically only those researching
monitoring algorithms are interested in exclusively running monitoring code on a
cluster. For all others, the monitoring rate must be throttled back to a reasonable
level. For most cluster administrators, monitoring dynamic data every few seconds
is adequate. So what are the advantages to monitoring at higher frequencies?
Profiling applications under development or tracking resource usage during
execution can be useful for debugging or optimizing. The usage of dynamic
resources such as memory, network, and CPUs can change very quickly for a given
application. Being able to watch how an application utilizes these resources as it
starts or as it runs is possibly one of the most interesting uses of high frequency
monitoring.
Even if the user is uninterested in monitoring at high frequencies, if an
algorithm is efficient it will consume fewer resources no matter what the
monitoring frequency. This efficiency is more important in a heterogeneous
cluster, where user jobs may be spread among faster and slower nodes. The slower
nodes need the entire CPU to keep up and synchronize with faster nodes. A
monitoring daemon taking CPU time on the slow node is on the job's critical path.

3 Monitoring Stages

Cluster monitoring primarily consumes two important resources: CPU cycles and
network bandwidth. However, the resource consumption problem is
fundamentally different for these two resources. The CPU usage problem is
completely localized to the node, and is addressed by creating efficient gathering
and consolidating algorithms. Network bandwidth however is a shared resource
and is a problem of scale. The network bandwidth usage problem is addressed by
finding ways to minimize the amount of data transmitted over the network.

2
To address these two issues, we divide cluster monitoring into three stages:
gathering, consolidation, and transmission. The gathering stage is responsible for
loading the data from the operating system, parsing the values, and storing the data
in memory. The consolidation stage is responsible for bringing the data from
multiple sources together, determining if values have changed, and filtering. The
transmission stage is responsible for the compression and transmission of the data.
This paper is concerned with optimizing the gathering stage of Linux cluster
monitoring. The other stages are equally important to efficient monitoring and
offer as many opportunities for optimization, but will only be discussed briefly in
this document.

3.1 Gathering Stage

In Linux there are several ways to gather system statistics, each with advantages
and disadvantages:
• Use existing tools: Standard and nonstandard tools exist which can perform
one or more of the gathering, consolidation, and transmission stages, such as
rstatd or SNMP tools. However the standard rstat daemon provides limited
information and is slow and inefficient [2].
• Kernel patch/module: Several system-monitoring projects rely on the use of
kernel modules or patches to provide access to monitored data [3]. In general,
this is a very efficient way to gather system data. A problem with this
approach however is keeping the code consistent with other changes in the
main kernel source. A patch may conflict with other kernel patches the user
wishes to use. In addition, the user must obtain and apply the patch or module
before they can use the monitoring system.
• The /proc virtual file system: The /proc virtual file system is generally a fast
and efficient way to perform system monitoring. The main drawback to using
/proc however is keeping the parsing code synchronized with /proc file format
changes. History has shown that the Linux kernel changes more frequently
than the /proc file formats, so this should be less of a problem than using
kernel modules or patches.
• Hybrids: Some monitoring systems implement a hybrid system, using a kernel
patch/module for gathering the data and the /proc virtual file system as the
interface to that data. The ideas presented in this paper apply in this case,
since the data must be gathered from /proc using the same techniques.

3.2 Consolidation Stage

The responsibilities of the consolidation stage can be performed on the node, a

cluster management host, or split between the two. In the interest of efficiency
however, we almost exclusively consolidate on the node. The reason for this is that
the node is the gatherer and provider of the monitored data. Two or more
simultaneous requests for data should not result in two calls to the operating
system to gather the data. Instead the data from the first request should be cached
and provided to the second. This approach reduces the burden on the operating
system and increases the responsiveness of the monitoring system.
The consolidation stage is also used to combine data from multiple data sources
and at independent gathering rates. This is necessary because not all data can be

3
gathered at the same degree of efficiency, and because not all data changes at the
same rate, or needs to be gathered at the same rate.
Another reason for consolidating at the node level is to reduce the amount of
information included in the transmission. Many of the /proc files contain both
dynamic and static data. By eliminating values that haven’t changed since the last
transmission, the amount of data sent by the node can be reduced substantially.
Consolidation not only eliminates the transmission of dynamic values that change
infrequently, but also addresses static values that never change.

3.3 Transmission Stage

Monitored data is almost always organized into a hierarchy. This is true of many of
the files in the /proc file system. An effective means of encoding the hierarchical
data into a form that can be transmitted efficiently is a responsibility of the
transmission stage. The Java property file format is an effective and efficient
means of storing hierarchical data, and can be easy produced with the provided
Java APIs. S-Expressions have been suggested as another efficient way of
transmitting this data [2].
A common point of discussion about transmitting monitored data is whether the
data should be encoded in binary or text form. The issue usually is that binary data
is more compact and is therefore more efficiently transmitted. However when
using the /proc file system, monitored data is usually stored in human-readable
form. Converting the data to binary form prior to transmission requires more
processing resources and time. By leaving the gathered data as text, the node
resources can be applied more to non-monitoring related work.
Leaving the data in text form provides these additional benefits:
• Platform independence: When monitoring in heterogeneous cluster
configurations the byte order of the data is not always the same between
machines. The use of a textual format solves this problem in a language and
architectural independent way without imposing further processing
requirements.
• Human readable formats: Textual data can be organized into formats that are
humanly readable. If desired, this feature can ease the process of debugging or
allow users to watch the data stream.
• Efficient compression: The text representation of numbers is comprised of
characters from a set of 10 bytes as opposed to the set of 256 used in binary.
The relative frequency of the digits and the patterns that they produce, allows
dictionary and entropy based compression algorithms to be applied effectively.

4 The /proc Virtual File System

The /proc virtual file system (also referred to as procfs) is the Linux
implementation of a virtual file system used by several UNIX operating systems,
including Sun's Solaris, Linux, and *BSD [4][5]. It appears as a normal file system
beginning at /proc and contains files having the same names as the process IDs of
currently running processes. However files in /proc do not occupy disk space - they
exist in working memory. The original purpose of /proc was to allow easy access
to information about processes, but in Linux it is now used by every part of the
kernel that has something to report.

4
Of the hundreds or even thousands of values provided by the /proc file system,
we will focus on a minimal set necessary for cluster monitoring, this includes:
• /proc/loadavg: containing system load averages,
• /proc/memory: containing memory management statistics,
• /proc/net/dev: containing network interface card metrics,
• /proc/stat: containing kernel statistics,
• /proc/uptime: containing total system uptime and idle time.
A complete list of values available in these files is provided in Table 1. As
would be expected the number of values provided by each file is different. In
addition the file formats and time to gather the supplied information from the
kernel are also different as will be seen later.
An important note about the /proc virtual file system is that each time a /proc
file is read, a handler function is called by the kernel or owning module to generate
the data. The data is generated on the fly, and the entire file is reconstructed
whether a single character or large block is read. This is a critical point for
efficiency, as any system mo nitors using /proc should gulp the whole file instead
of nibbling at it.
Table 1. Monitor values.

File Name Value

/proc/loadavg 1 minute load average
5 minute load average
15 minute load average
total jobs
total jobs running
/proc/meminfo active memory
inactive memory
buffered memory
cached memory
total free memory
total high memory
free high memory
total low memory
free low memory
shared memory
swap memory
swap cached memory
swap free memory
total memory
/proc/net/dev for each network interface card:
received bytes
received bytes compressed
receiving errors total
receiving dropped errors
receiving fifo errors
receiving frame errors
received multicast bytes
total packets received

5
transmitted bytes
transmitted bytes compressed
transmission errors total
transmission carrier errors
transmission collision errors
transmission dropped errors
transmission fifo errors
total packets transmitted
/proc/stat boot time
number of context switches
total number of interrupts
total number of pages paged in
total number of pages paged out
total number of process
total number of swaps in
total number of swaps out
aggregate cpu idle time
aggregate cpu nice time
aggregate cpu system time
aggregate cpu user time

for each CPU:

individual cpu idle time
individual cpu nice time
individual cpu system time
individual cpu user time

for each disk device:

individual disk block reads
individual disk block writes
individual disk io total
individual disk io reads
individual disk io writes
/proc/uptime total system uptime
total system idle time

The most critical inefficiency in monitoring /proc in the past has been reading
/proc/meminfo. In order for Linux kernels earlier than 2.4.7 to gather memory
statistics, the entire page structure and every block of swap had to be scanned. [2]
This was a substantial problem that was fixed by version 2.4.14.

5 Implementation

So how do we go about writing a high-performance monitor in Java? Just as with

any programming language there are efficient and inefficient ways to implement
any algorithm. It is not always obvious which APIs will render the most efficient
implementation.
Java provides a rich set of file IO classes. Included are stream based classes,
block device based classes, and the new IO library provided in J2SDK 1.4.

6
Experimentation shows that in general the RandomAccessFile class is the best
performing of the IO classes for basic block reads and writes to files. The newer
1.4 memory mapped classes do not provide improved performance in this case
since the /proc virtual file system files are already memory mapped.
We begin with a basic implementation that at first glance appears quite
reasonable. Knowing that the RandomAccessFile class is the most efficient, we
base our code on it. For each load operation, the file is opened, parsed on a line-by-
line basis, and then closed. We rely on the already implemented and optimized
functions for string loading and manipulation provided by the Java language.
Our test’s main function is designed to simply count the number of times a
particular file (in this case /proc/meminfo) can be loaded, parsed, store its values in
memory, and time the result. The tests were performed on a 1GHz Pentium III with
1GB memory and 99.8% idle CPU using the 2.4.18 version of the Linux kernel.
The code for our first implementation is listed in Figure 1.
Figure 1. Code for parsing the /proc/meminfo file.

import java.io.*;
import java.util.*;

public class memoryfile

{
private static final int TIME = 5000;

private static RandomAccessFile mFile;

private static ArrayList mList = new ArrayList();

public static void main( String[] inArguments )

throws Throwable
{
System.out.print( "Running... " );

int counter = 0;
long end = System.currentTimeMillis() + TIME;
while ( System.currentTimeMillis() < end )
{
load();
++counter;
}

double count = counter / ( ( double ) TIME / 1000 );

System.out.println(
count + "/s (" + ( 1 / count ) + ")" );
}

private static
void load()
throws Throwable
{
mFile = new RandomAccessFile( "/proc/meminfo", "r" );

parse();

mFile.close();
}

private static
void parse()
throws Throwable
{
// Skip the old header.

mFile.readLine();
mFile.readLine();
mFile.readLine();

// Store the values.

mList.clear();

store(); // total
store(); // free

7
store(); // shared
store(); // buffers
store(); // cached
store(); // swap cached
store(); // active
store(); // inactive
store(); // high
store(); // high free
store(); // low
store(); // low free
store(); // swap
store(); // swap free
}

private static
void store()
throws Throwable
{
// Get the next value.

String line = mFile.readLine();

// Skip fixed name length.

int i = 14;

// Skip column padding.

while ( line.charAt( i ) == ' ' )

{
++i;
}

// Find the end of the value.

int j = line.indexOf( ' ', i );

// Store the value as a String.

mList.add( line.substring( i, j ) );
}
}

We expect that this implementation will perform reasonably well. However

after the first run we see that the code only renders 85 loads per second at 100%
CPU utilization. The problem with this implementation is that it does not take into
account what is going on inside the kernel’s /proc file system or Java’s
RandomAccessFile readLine method.
As stated earlier, each time a read operation is performed on the /proc file
system, the kernel must produce the entire set of data provided in the file. This
means that if a request is made for the entire file or only a single byte, the kernel
must gather the entire set of values for the file. Unfortunately the readLine method
in the RandomAccessFile class gathers the characters of a line one character at a
time. This means that the /proc file is generated as many times as there are bytes in
the file. This is obviously extremely inefficient no matter what programming
language is used.
To get around this problem, we simply need to read the file once in whole and
then parse its contents. We do this by adding a byte buffer and changing the
parsing code as shown in Figure 2. The result of this change is a gathering rate of
4,173 times per second, or a 4,809% increase in performance.
Figure 2. Using block reads.

import java.io.*;
import java.util.*;

public class memoryfilebuffer

{
private s tatic final int TIME = 5000;

8
private static RandomAccessFile mFile;
private static ArrayList mList = new ArrayList();
private static byte[] mBuffer = new byte[ 4096 ];
private static int mOffset;

public static void main( String[] inArguments )

throws Throwable
{
System.out.print( "Running... " );

int counter = 0;
long end = System.currentTimeMillis() + TIME;
while ( System.currentTimeMillis() < end )
{
load();
++counter;
}

double count = counter / ( ( double ) TIME / 1000 );

System.out.println(
count + "/s (" + ( 1 / count ) + ")" );
}

private static
void load()
throws Throwable
{
mFile = new RandomAccessFile( "/proc/meminfo", "r" );
mFile.read( mBuffer );

parse();

mFile.close();
}

private static
void parse()
{
mOffset = 0;

// Skip the old header.

while ( mBuffer[ mOffset++ ] != '\n' );

while ( mBuffer[ mOffset++ ] != '\n' );
while ( mBuffer[ mOffset++ ] != '\n' );

// Position to the first value.

mOffset += 14; // 'MemTotal: '

// Store the values.

mList.clear();

store(); // total
store(); // free
store(); // shared
store(); // buffers
store(); // cached
store(); // swap cached
store(); // active
store(); // inactive
store(); // high
store(); // high free
store(); // low
store(); // low free
store(); // swap
store(); // swap free
}

private static
void store()
{
// Skip column padding.

while ( mBuffer[ mOffset ] == ' ' )

{
++mOffset;
}

// Find the end of the value.

9
int offset = mOffset;
while ( mBuffer[ mOffset ] != ' ' )
{
++mOffset;
}

// Store the value as a String.

mList.add(
new String( mBuffer, offset, mOffset - offset ) );

// Skip to the next value.

mOffset += 18; // ' KB\nXXXXXXXXX: '

}
}

Now we will examine the approach used to extract the values from the buffer.
We are using standard methods for locating characters and creating strings.
However what isn’t obvious is that we are incurring overhead for implicit Unicode
character encoding when we extract bytes from the buffer to create a string. Given
that we already have the character data in the buffer, and given that the /proc data
is always stored in the standard ASCII character range 0-127, there is no reason for
us to convert the data from the buffer into Unicode strings. Instead we need to
store the offset and length of the data in the buffer. We can do this with a couple of
integer arrays. These changes to our parsing code eliminate the memory allocation
and character encoding that is occurring now, and will eliminate the need for the
character-to-byte translation that would have been necessary prior to transmission.
The simple modifications to our parsing code displayed in Figure 3 yields another
236% increase in performance, or a monitoring rate of 14,031 times per second.
Figure 3. Removing unwanted memory allocation and character encoding.

private static int[] mOffsets = new int[ FIELDS ];

private static int[] mLengths = new int[ FIELDS ];
private static int mField;

void parse()
{
mOffset = 0;
mField = 0;

// Skip the old header.

while ( mBuffer[ mOffset++ ] != '\n' );

while ( mBuffer[ mOffset++ ] != '\n' );
while ( mBuffer[ mOffset++ ] != '\n' );

// Position to the first value.

mOffset += 14; // 'MemTotal: '

// Store the values.

10
store(); // swap free
}

private static
void store()
{
// Skip column padding.

while ( mBuffer[ mOffset ] == ' ' )

{
++mOffset;
}

// Find the end of the value.

int offset = mOffset;

while ( mBuffer[ mOffset ] != ' ' )

{
++mOffset;
}

// Store the value's offset and length.

mOffsets[ mField ] = offset;

mLengths[ mField++ ] = mOffset - offset;

// Skip to the next value.

mOffset += 18; // ' KB\nXXXXXXXXX: '

}

Given that the /proc file system files are mapped to memory rather than disk,
and because the code for opening and closing files is usually highly optimized on
any operating system, one might assume that the overhead associated with opening
and closing the file would not contribute a significant amount of time to the
gathering process. However this is not true.
Figure 4. Leaving the file open.

public static void main( String[] inArguments )

throws Throwable
{
System.out.print( "Running... " );

mFile = new RandomAccessFile( "/proc/meminfo", "r" );

int counter = 0;
long end = System.currentTimeMillis() + TIME;
while ( System.currentTimeMillis() < end )
{
load();
++counter;
}

mFile.close();

double count = counter / ( ( double ) TIME / 1000 );

System.out.println( count +
"/s (" + ( 1 / count ) + ")" );
}

private static void load() throws Throwable

{
mFile.read( mBuffer );
mFile.seek( 0 );

parse();
}

Our final modification is to keep the file open across multiple requests. To do
this we simply move existing code from the load method to the main method, and
add a seek call as shown in Figure 4. The result of this modification yields an

11
additional 141% increase in performance. Combined with the other three increases,
the gathering rate is now 33,855 times per second or a 39,729% increase in
performance over the first implementation. This translates to approximately 2.95
one hundred thousandths (0.000295) of the CPU per request. In other words
approximately 5 seconds of CPU time per hour at a monitoring rate of 50 times per
second. We have verified this result on the previously described machine.

6 Results

Tables 2 and 3 present the results of applying the described techniques to the five
specified /proc files using the Java and C programming languages. The Loads/s
and Parses/s columns represent the number of times the file can be loaded or
parsed per second using 100% of the CPU. The parsing times are dependent on the
complexity and size of the files. The Total/s column represents the combined
loading and parsing rate for each file. The performance numbers in these tables do
not take into account the time needed by the consolidation and transmission stages,
where equal attention should be paid to performance and efficiency.
Table 2. Gathering rates for Java.

Monitor Loads/s Parses/s Total/s

/proc/stat 31,297 240,871 27,954
/proc/loadavg 136,627 1,066,010 132,384
/proc/meminfo 39,727 262,371 35,265
/proc/net/dev 49,921 162,026 39,038
/proc/uptime 165,885 1,202,597 161,634
Table 3. Gathering rates for C.

Monitor Loads/s Parses/s Total/s

/proc/stat 34,365 289,318 31,255
/proc/loadavg 205,795 1,287,964 188,623
/proc/meminfo 44,952 319,070 39,980
/proc/net/dev 58,108 207,342 46,291
/proc/uptime 203,912 1,746,182 197,704

It should be noted that in both Java and C implementations, the parsing rates
are substantially higher than their corresponding loading rates. At the same time,
the overall rate differences between the Java and C implementations are relatively
small. This indicates that the monitoring throughput is now more dependent on the
efficiency of the kernel code for the /proc file system, than it is on the performance
differences of the implementation languages. Assuming a monitoring rate of 1 time
per second, Table 4 lists the projected CPU savings that would be achieved by
using C rather than Java. The values are computed by determining the time per call
for each monitor and then taking their differences. The difference amounts to a
combined savings of about 1 second per day.

12
Table 4. Performance comparison.

Monitor Per Call Hourly Daily

/proc/stat 3.778e-6 seconds 0.014 seconds 0.326 seconds
/proc/loadavg 2.252e-6 seconds 0.008 seconds 0.195 seconds
/proc/meminfo 3.344e-6 seconds 0.012 seconds 0.289 seconds
/proc/net/dev 4.014e-6 seconds 0.014 seconds 0.347 seconds
/proc/uptime 1.129e-6 seconds 0.004 seconds 0.098 seconds

7 Conclusion

Optimizing takes times and experimentation, and isolating poorly performing code
always involves multiple implementations. The best way to optimize any code is to
identify and replace the worst performing code first. By resolving the most
inefficient pieces first, we are more able to clearly identify the medium and smaller
performance barriers that remain.
By taking this approach, this paper has demonstrated that the Java language can
be used efficiently for high performance monitoring on the nodes of Linux clusters.
In the process it has presented the following techniques for an efficient
implementation:
• using the /proc file system,
• reading /proc files as a block rather than by lines or characters,
• keeping /proc files open between reads.
• eliminating unnecessary data conversions,
• consolidating data on the node,
• transmitting data in compressed form,
• being aware of language or library related performance issues,
• being aware of inefficiencies in specific kernel versions.
It has been shown that kernel modules or patches are not a requirement for high
performance monitoring. This is important because it provides a greater degree of
portability between Linux versions and distributions, and a greater choice of
monitor implementation languages. The performance of the /proc file system
however is very dependent on the efficiency of the kernel code that produces the
file, and a proper understanding of the mechanisms involved can greatly affect the
performance of monitors written in any language.

References

[1] Linux NetworkX, ClusterWorX: Cluster Management Solution. Available at:

http://www.linuxnetworx.com/
news/pdf/whitepaper_cwx.pdf
[2] Ron Minnich and Karen Reid, Supermon: High performance monitoring for
Linux clusters. The Fifth Annual Linux Showcase and Conference, Oakland,
CA Nov, 2001.
[3] lm_sensors project, Available at http://www2.lm-sensors.nu/~lm78/.

13
[4] T. J. Killian. Processes as Files. Proceedings of the USENIX Software Tools
Users Group Summer Conference, pp 203-207, June 1984.
[5] R. Faulkner and R. Gomes, The Process File System and Process Model in
UNIX System V. USENIX Conference Proceedings, January 1991.
[6] Rajkummar Buyya, PARMON: a portable and scalable monitoring system for
clusters. Software--Practice and Experience, John Wiley and Sons, Ltd, 2000.
[7] Eric Anderson and Dave Patterson, Exensible, Scalable Monitoring for
Clusters of Computers. The proceeeding of the 11th Systems Administration
Conference (Lisa '97), Oct. 1997.
[8] T. Roney, A. Bailey, and J. Fullop, Cluster Monitoring at NCSA. Linux
Revolution Conference, June 2001. Available at:
http://www.ncsa.uiuc.edu/Divisions/CC/systems/LCI-Cluster-Monitor.html
[9] Z. Liang, Y. Sun, C. Wang, ClusterProbe: An Open, Flexible and Scalable
Cluster Monitoring Tool. IEEE Int. Workshop on Cluster Computing, pp. 261-
268, 1999.
[10] Quadrics RMS Pandora Reference Manual, Available at:
http://www.quadrics.com.

OGP P1/11 Geophysical Position Data Exchange Format: Positioning Integrity
No ratings yet
OGP P1/11 Geophysical Position Data Exchange Format: Positioning Integrity
88 pages
Asotić, Din - Strings in C++ - The Fourth Step in C++ Learning (2023, Independent) - Libgen - Li
No ratings yet
Asotić, Din - Strings in C++ - The Fourth Step in C++ Learning (2023, Independent) - Libgen - Li
197 pages
Sandeep Nagar Introduction To Python - For Scientists and Engineers PDF
No ratings yet
Sandeep Nagar Introduction To Python - For Scientists and Engineers PDF
141 pages
Supermon: A High-Speed Cluster Monitoring System
No ratings yet
Supermon: A High-Speed Cluster Monitoring System
8 pages
Arabic E-Reading: Studies On Legibility and Readability For Personal Digital Assistants
No ratings yet
Arabic E-Reading: Studies On Legibility and Readability For Personal Digital Assistants
117 pages
C Fundamentals Running Language Supports
100% (1)
C Fundamentals Running Language Supports
349 pages
Visual Fox Pro 1
No ratings yet
Visual Fox Pro 1
53 pages
The C Programming Handbook For Beginners
No ratings yet
The C Programming Handbook For Beginners
93 pages
C Programming For Beginners
No ratings yet
C Programming For Beginners
18 pages
JAVA - Variables PDF
100% (1)
JAVA - Variables PDF
23 pages
Chapter-6 Data Types
No ratings yet
Chapter-6 Data Types
37 pages
Chapter 3 C# Essentials
No ratings yet
Chapter 3 C# Essentials
75 pages
AQA-8525-SP-2020
No ratings yet
AQA-8525-SP-2020
42 pages
SQL Qyeries-1notes
No ratings yet
SQL Qyeries-1notes
38 pages
CS101 Lecture 05 - Fundamentals of Programming
No ratings yet
CS101 Lecture 05 - Fundamentals of Programming
38 pages
EDA
100% (1)
EDA
9 pages
Howto Curses
No ratings yet
Howto Curses
8 pages
9618_s24_qp_21
No ratings yet
9618_s24_qp_21
28 pages
Programs on strings
No ratings yet
Programs on strings
17 pages
Microsoft PowerPoint - MS SQL Server
No ratings yet
Microsoft PowerPoint - MS SQL Server
26 pages
SQL Server Standards-Naming Convention PDF
No ratings yet
SQL Server Standards-Naming Convention PDF
24 pages
Pure Anal
No ratings yet
Pure Anal
2 pages
Datatypes
No ratings yet
Datatypes
7 pages
C Sharap Lectures Aso Khaleel Ameen
No ratings yet
C Sharap Lectures Aso Khaleel Ameen
17 pages
Cambridge International AS and A Level Computer Science Coursebook 2nd Edition Sylvia Langfield - The ebook in PDF format with all chapters is ready for download
100% (1)
Cambridge International AS and A Level Computer Science Coursebook 2nd Edition Sylvia Langfield - The ebook in PDF format with all chapters is ready for download
57 pages
String Class Method
No ratings yet
String Class Method
5 pages
Characters Sets
No ratings yet
Characters Sets
2 pages
(Unicode) Character
No ratings yet
(Unicode) Character
4 pages
CHAPTER - 7 Data Handling - Data Types
No ratings yet
CHAPTER - 7 Data Handling - Data Types
5 pages
String Manipulation Problem Statement
No ratings yet
String Manipulation Problem Statement
6 pages
String Handling in Java
No ratings yet
String Handling in Java
3 pages
Fluentd Configuration and Deployment Strategies: Definitive Reference for Developers and Engineers
From Everand
Fluentd Configuration and Deployment Strategies: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
uWSGI Deployment and Configuration Guide: Definitive Reference for Developers and Engineers
From Everand
uWSGI Deployment and Configuration Guide: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Nginx Configuration and Deployment Guide: Definitive Reference for Developers and Engineers
From Everand
Nginx Configuration and Deployment Guide: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Apache Flume Solutions: Definitive Reference for Developers and Engineers
From Everand
Apache Flume Solutions: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
RabbitMQ in Practice: Definitive Reference for Developers and Engineers
From Everand
RabbitMQ in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Rsync Solutions: Definitive Reference for Developers and Engineers
From Everand
Rsync Solutions: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Presto in Practice: Definitive Reference for Developers and Engineers
From Everand
Presto in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
UrBackup Solutions for Reliable System Backup: Definitive Reference for Developers and Engineers
From Everand
UrBackup Solutions for Reliable System Backup: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
LogDNA Essentials: Definitive Reference for Developers and Engineers
From Everand
LogDNA Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Logstash Essentials: Definitive Reference for Developers and Engineers
From Everand
Logstash Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Comprehensive Guide to BackupPC: Definitive Reference for Developers and Engineers
From Everand
Comprehensive Guide to BackupPC: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
LACP Configuration and Implementation Guide: Definitive Reference for Developers and Engineers
From Everand
LACP Configuration and Implementation Guide: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Systemd-nspawn in Practice: Definitive Reference for Developers and Engineers
From Everand
Systemd-nspawn in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Storm Systems for Real-Time Data Processing: Definitive Reference for Developers and Engineers
From Everand
Storm Systems for Real-Time Data Processing: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Debezium in Action: Definitive Reference for Developers and Engineers
From Everand
Debezium in Action: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Prefect Workflow Orchestration Essentials: Definitive Reference for Developers and Engineers
From Everand
Prefect Workflow Orchestration Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenShift Platforms and Operations: Definitive Reference for Developers and Engineers
From Everand
OpenShift Platforms and Operations: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Distributed Cluster Operations with DC/OS: Definitive Reference for Developers and Engineers
From Everand
Distributed Cluster Operations with DC/OS: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Network File System in Practice: Definitive Reference for Developers and Engineers
From Everand
Network File System in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenTracing in Distributed Systems: Definitive Reference for Developers and Engineers
From Everand
OpenTracing in Distributed Systems: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Tcpdump in Depth: Definitive Reference for Developers and Engineers
From Everand
Tcpdump in Depth: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
LEMP Architecture and Administration: Definitive Reference for Developers and Engineers
From Everand
LEMP Architecture and Administration: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
StreamSets Pipeline Design and Best Practices: Definitive Reference for Developers and Engineers
From Everand
StreamSets Pipeline Design and Best Practices: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Duplicati Essentials: Definitive Reference for Developers and Engineers
From Everand
Duplicati Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Data Pipeline Automation with Airbyte: Definitive Reference for Developers and Engineers
From Everand
Data Pipeline Automation with Airbyte: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Advanced Network Backup with Amanda: Definitive Reference for Developers and Engineers
From Everand
Advanced Network Backup with Amanda: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Stream Processing Techniques and Patterns: Definitive Reference for Developers and Engineers
From Everand
Stream Processing Techniques and Patterns: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Practical Observability Engineering with Relic: Definitive Reference for Developers and Engineers
From Everand
Practical Observability Engineering with Relic: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Kafka for Distributed Systems: Definitive Reference for Developers and Engineers
From Everand
Kafka for Distributed Systems: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Comprehensive Guide to Mattermost Administration: Definitive Reference for Developers and Engineers
From Everand
Comprehensive Guide to Mattermost Administration: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Netdata in Practice: Definitive Reference for Developers and Engineers
From Everand
Netdata in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Daemon Architecture and Implementation: Definitive Reference for Developers and Engineers
From Everand
Daemon Architecture and Implementation: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
R1Soft Administration and Implementation: Definitive Reference for Developers and Engineers
From Everand
R1Soft Administration and Implementation: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Coralogix Essentials: Definitive Reference for Developers and Engineers
From Everand
Coralogix Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Prometheus Administration and Deployment: Definitive Reference for Developers and Engineers
From Everand
Prometheus Administration and Deployment: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Optimized Caching Techniques: Application for Scalable Distributed Architectures
From Everand
Optimized Caching Techniques: Application for Scalable Distributed Architectures
Peter Jones
No ratings yet
DNP3 Protocol Engineering: Definitive Reference for Developers and Engineers
From Everand
DNP3 Protocol Engineering: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
NetFlow Protocols and Applications: Definitive Reference for Developers and Engineers
From Everand
NetFlow Protocols and Applications: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Commvault Administration and Best Practices: Definitive Reference for Developers and Engineers
From Everand
Commvault Administration and Best Practices: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
ZooKeeper Systems and Techniques: Definitive Reference for Developers and Engineers
From Everand
ZooKeeper Systems and Techniques: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Mastering System Programming with C: Files, Processes, and IPC
From Everand
Mastering System Programming with C: Files, Processes, and IPC
Larry Jones
No ratings yet
Principles of Multiple Spanning Tree Protocol: Definitive Reference for Developers and Engineers
From Everand
Principles of Multiple Spanning Tree Protocol: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
InfluxDB Essentials: Definitive Reference for Developers and Engineers
From Everand
InfluxDB Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Jaeger Distributed Tracing in Practice: Definitive Reference for Developers and Engineers
From Everand
Jaeger Distributed Tracing in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Blue-Green Deployment Engineering: Definitive Reference for Developers and Engineers
From Everand
Blue-Green Deployment Engineering: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenTelemetry in Practice: Definitive Reference for Developers and Engineers
From Everand
OpenTelemetry in Practice: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Snowflake Data Platform Engineering: Definitive Reference for Developers and Engineers
From Everand
Snowflake Data Platform Engineering: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
NetWorker Configuration and Administration Reference: Definitive Reference for Developers and Engineers
From Everand
NetWorker Configuration and Administration Reference: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Observium Network Monitoring Solutions: Definitive Reference for Developers and Engineers
From Everand
Observium Network Monitoring Solutions: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
DataDog Operations and Monitoring Guide: Definitive Reference for Developers and Engineers
From Everand
DataDog Operations and Monitoring Guide: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
SystemTap Essentials: Definitive Reference for Developers and Engineers
From Everand
SystemTap Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
PHP Microservices
From Everand
PHP Microservices
Carlos Pérez Sánchez
3/5 (1)
Comprehensive Guide to Zipkin: Definitive Reference for Developers and Engineers
From Everand
Comprehensive Guide to Zipkin: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data
From Everand
Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data
Byron Ellis
No ratings yet
The HAProxy Handbook: Load Balancing for Modern Infrastructure
From Everand
The HAProxy Handbook: Load Balancing for Modern Infrastructure
Robert Johnson
No ratings yet
Airflow for Data Workflow Automation
From Everand
Airflow for Data Workflow Automation
Richard Johnson
No ratings yet
Advanced Backend Code Optimization
From Everand
Advanced Backend Code Optimization
Sid Touati
No ratings yet
Mastering OpenTelemetry: Building Scalable Observability Systems for Cloud-Native Applications
From Everand
Mastering OpenTelemetry: Building Scalable Observability Systems for Cloud-Native Applications
Robert Johnson
No ratings yet
Networking Programming with C++: Build Efficient Communication Systems
From Everand
Networking Programming with C++: Build Efficient Communication Systems
Robert Johnson
No ratings yet

High-Performance Linux Cluster Monitoring Using Java: Curtis Smith and David Henry Linux Networx, Inc

Uploaded by

High-Performance Linux Cluster Monitoring Using Java: Curtis Smith and David Henry Linux Networx, Inc

Uploaded by

High-performance Linux cluster monitoring

Monitoring is at the heart of cluster management. Instrumentation data is used to

Java technology has much to offer to the developers of cluster management

3.1 Gathering Stage

3.2 Consolidation Stage

The responsibilities of the consolidation stage can be performed on the node, a

3.3 Transmission Stage

4 The /proc Virtual File System

File Name Value

for each CPU:

for each disk device:

So how do we go about writing a high-performance monitor in Java? Just as with

public class memoryfile

private static RandomAccessFile mFile;

public static void main( String[] inArguments )

double count = counter / ( ( double ) TIME / 1000 );

// Store the values.

String line = mFile.readLine();

// Skip fixed name length.

// Skip column padding.

while ( line.charAt( i ) == ' ' )

// Find the end of the value.

int j = line.indexOf( ' ', i );

// Store the value as a String.

We expect that this implementation will perform reasonably well. However

public class memoryfilebuffer

public static void main( String[] inArguments )

double count = counter / ( ( double ) TIME / 1000 );

// Skip the old header.

while ( mBuffer[ mOffset++ ] != '\n' );

// Position to the first value.

mOffset += 14; // 'MemTotal: '

// Store the values.

while ( mBuffer[ mOffset ] == ' ' )

// Find the end of the value.

// Store the value as a String.

// Skip to the next value.

mOffset += 18; // ' KB\nXXXXXXXXX: '

private static int[] mOffsets = new int[ FIELDS ];

// Skip the old header.

while ( mBuffer[ mOffset++ ] != '\n' );

// Position to the first value.

mOffset += 14; // 'MemTotal: '

// Store the values.

while ( mBuffer[ mOffset ] == ' ' )

// Find the end of the value.

int offset = mOffset;

while ( mBuffer[ mOffset ] != ' ' )

// Store the value's offset and length.

mOffsets[ mField ] = offset;

// Skip to the next value.

mOffset += 18; // ' KB\nXXXXXXXXX: '

public static void main( String[] inArguments )

mFile = new RandomAccessFile( "/proc/meminfo", "r" );

double count = counter / ( ( double ) TIME / 1000 );

private static void load() throws Throwable

Monitor Loads/s Parses/s Total/s

Monitor Loads/s Parses/s Total/s

Monitor Per Call Hourly Daily

[1] Linux NetworkX, ClusterWorX: Cluster Management Solution. Available at:

You might also like