Geronimo 1.1.1 Performance: Understanding The Current Performance Profile
Understanding the current performance profile
Matt Hogstrom
[email protected]
November 2006
Version 1.0
Java 2 Enterprise Edition (J2EE), Java 2 Standard Edition (J2SE) and Enterprise Java Beans
are trademarks or registered trademarks of Sun Microsystems, Inc.
This report is not sponsored by, or endorsed by, the Apache Software Foundation. It is
simply the work of a committer on the project.
CONTENTS
Acknowledgments
Summary
Introduction
Disclaimers
Testing Environment
Hardware
Software
Workload
Results
Web Tier
EJB Primitives
Trade Scenario
PingServlet2Session
PingServlet2EntityLocal
PingServlet2EntityEJBRemote
PingServlet2Session2Entity
PingServlet2EntityCollection
PingServlet2Session2CMROne2One
PingServlet2CMROne2Many
Scenario - JDBC
ACKNOWLEDGMENTS
I would like to acknowledge the generous contribution of hardware used for this report. Intel Corporation has been immensely helpful in providing four systems for testing. Three of the systems have been used in performance and regression testing of Geronimo and are the foundation of this report. The other system is used for TCK testing and other miscellaneous functions.
On a side note I'd like to thank Marc The'Berge from Intel, who has been my primary contact for coordinating the use of these systems. Marc has been a great partner over the years and a good friend. He has been very supportive of the project's goals and very attentive to supporting us. Thanks bubba ;-)
Belinda, my wife, gets lots of credit for being patient with me and my odd habits. I married so well; I'm not sure why she got stuck with me.
SUMMARY
The benchmark application used for testing is flexible enough to compare performance of the Web tier, the EJB tier, and an application that represents a stock trading simulation. The Web tests show that Apache Geronimo is very competitive against alternatives in the marketplace. In almost all cases Apache Geronimo was either neck and neck with the other containers or exceeded their best number. Also, the Web tier scaled to almost 100% CPU utilization while servicing 100 simultaneous clients, which shows there are no inherent bottlenecks that artificially limit its scalability. Note that the 100 simultaneous clients ran with zero think time and were intended simply to generate load. A separate set of tests will be conducted on multi-user scalability.
The EJB tier shows that where pass-by-value semantics are required, Apache Geronimo has room for improvement. Apache Geronimo lacks two significant features (a Stateless Session bean cache and the ability to specify SELECT FOR UPDATE on EJB queries), and their absence prevents good performance under a heavy workload. In essence, where there are multiple concurrent updates to tables represented by CMP Entity beans, -911 deadlocks may be inevitable. The Apache DayTrader application highlights this problem. These issues are being addressed in the upcoming Apache Geronimo 1.2 release.
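To make the SELECT FOR UPDATE point concrete, here is a minimal JDBC sketch. This is not DayTrader or Geronimo code; the ACCOUNT table and column names are made up. It contrasts the plain SELECT a CMP engine typically generates with the locking form that avoids the read-then-update race.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ForUpdateSketch {

    // Without FOR UPDATE, two transactions can both read the same row and then
    // both attempt to update it; under load this is the classic recipe for the
    // DB2 -911 deadlock described above.
    static final String PLAIN_SELECT =
        "SELECT BALANCE FROM ACCOUNT WHERE ACCOUNT_ID = ?";

    // With FOR UPDATE the reader acquires an update lock up front, so a second
    // transaction waits on the lock instead of deadlocking later at UPDATE time.
    static final String SELECT_FOR_UPDATE =
        "SELECT BALANCE FROM ACCOUNT WHERE ACCOUNT_ID = ? FOR UPDATE";

    static void debit(Connection con, int accountId, double amount) throws SQLException {
        con.setAutoCommit(false);

        // Read the row while taking an update lock.
        PreparedStatement select = con.prepareStatement(SELECT_FOR_UPDATE);
        select.setInt(1, accountId);
        ResultSet rs = select.executeQuery();
        rs.next();
        double balance = rs.getDouble(1);
        rs.close();
        select.close();

        // Write the new balance; no other transaction holds the row at this point.
        PreparedStatement update = con.prepareStatement(
            "UPDATE ACCOUNT SET BALANCE = ? WHERE ACCOUNT_ID = ?");
        update.setDouble(1, balance - amount);
        update.setInt(2, accountId);
        update.executeUpdate();
        update.close();

        con.commit();
    }
}

Whether the container issues the first or the second form is exactly the choice Apache Geronimo 1.1.1 does not yet expose for CMP queries.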
Overall, Apache Geronimo is very competitive for Web based workloads. EJB workloads that do not require pass-by-value semantics also perform acceptably. Workloads using Container Managed Persistence may encounter deadlocks under high load.
The Performance Target (PT) is a number that represents, in general, the best performance
number of those Application Servers tested. There is no hidden AppServer A or AppServer
B in the report. Based on my testing I found that in many instances, Open Source
Application Servers outperformed Commercial Application Servers while in other areas
(specifically the EJB workloads) the Commercial Application Servers outperformed their
Open Source counterparts. The Performance Target (PT) is a worst-case comparison for the Apache Geronimo server in that it represents the best competing number, and in some cases a bit more. The bottom line is that these numbers represent a respectable throughput for any application server. That said, performance isn't everything.
An Application Server could be the best performing one on the market and still not meet
many of the Non-Functional Requirements (NFRs) that people look at when making a
selection. Price, performance, usability, and footprint, among others, are all factors in making
that selection. This report comments primarily on the performance component.
DISCLAIMERS
In the interest of openness and full disclosure the reader should be aware of the following
facts:
First, the author, Matt Hogstrom, works for IBM. I am also a committer on the Apache
Geronimo project. I’ve been a performance analyst for several years and was the
Performance Architect for the WebSphere Application Server. I participated in ECperf 1.0
(JSR-004) as well as ECperf 1.1 (JSR-131). In addition, I represented IBM to the Standard
Performance Evaluation Corporation (SPEC) and participated in the development of
SPECjAppServer 2001, 2002 and 2004 in the OSG-Java subcommittee. I also like to
SCUBA dive and have three very cool children. I don’t like cats, but we have one anyway.
As far as databases are concerned, I decided to use DB2 for a couple of reasons. First, I use it all the time so I'm familiar with it. I wanted to complete this report quickly and decided to avoid the learning curve of another database. I'd like to produce other results with MySQL, but given a limited timeframe I chose to use what I know works well. Oracle would have required me to get permission from Oracle to release benchmark results and I didn't want to go down the path of having someone else review the results. This is an Application Server test and not a commercial benchmark result like SPECjAppServer 2004 (http://www.spec.org/jAppServer2004).
As with any testing there are an infinite number of possible combinations of tuning options.
I chose to use a set of generally accepted defaults and test with those. One could certainly
turn out other results that would be higher or lower when tweaking those options. I’m very
interested in options that I may not have tried so please feel free to provide feedback.
Please send me e-mail directly at [email protected] and I’ll do my best to incorporate
your ideas and feedback into future results.
Next, the other significant feature of this report is the choice of a Java Virtual Machine. I decided to use IBM's 1.5 version of its JRE for Linux (32-bit SR3). I had originally started testing with the Java 1.5 Virtual Machine from Sun, but it wasn't clear to me that benchmarking with that VM was free of legal issues. Also, I could not for the life of me get the Sun VM past 91% CPU utilization. I have to admit that played a factor in my decision as well.
Geronimo is certified on a 1.4.2 JVM. However, since most people are interested in Web
based applications and less in CORBA applications I felt that 1.5 was more relevant to a
larger number of people. Apache Geronimo runs with either version but lacks CORBA
functionality when running on Java 1.5.
Finally, I used a commercial load driver to run the workload. There was no fancy scripting used for the tests, and the Open Source load drivers were not fast enough for the high-volume, low-level primitives. I'd very much like to use something well known like Grinder or JMeter from the Open Source world, but they couldn't keep up (or maybe I was too boneheaded to figure them out). If someone has suggestions or donations, I'm interested.
HARDWARE
First, these systems were so excellent I have to add some personal comments here. The AppServer (Gordy) was a two-chip / dual-core system and ran as a 4-way. I hadn't realized how fast these machines were. With the 4MB of L2 cache they are almost unstoppable. They ran really well and the upgrade from dual-core to quad-core was simple: update the BIOS, swap the chips and bada bing... I had 8-ways. I used the 8-ways for the driver system (Wilbur) and for the database system (Porky). I hate shameless plugs but these machines really impressed me.
I had to run the AppServer system as a 4-way in these tests, as there wasn't enough driver capacity for the low-level primitives (like PingServlet). Rather than configure and re-configure the environment, I chose to leave the setup in a constant configuration. Here is the block-level view of the setup.
The Application Servers tested were all run with a 1 gigabyte heap size.
Application Server:
Database Software:
WORKLOAD
The application is very flexible and can execute in a variety of runtime modes. Each of these runtime modes exercises various elements of the J2EE architecture. It includes runtime modes for standard Web Tier applications based on JSPs and Servlets, JDBC and EJB modes, as well as a WebServices mode for conducting remote operations. For this report, since Apache Geronimo was not able to execute in EJB mode under load, I only included the JDBC runtime mode.
The application was originally built by IBM as a way to characterize performance. It was
used internally and donated to Apache Geronimo in 2005. Even though the application is
maintained at the Apache Software Foundation it does not depend on Geronimo or in any
way favor that Application Server. Work is currently being conducted on DayTrader Version
2.0 to enhance the runtime modes and prepare it as a benchmark example for Java
Enterprise Edition 5.0.
Note: The website is in flux at the time of publishing. If the content is not currently visible, check back in a few days.
The Web Tier contains simple primitives that start with the simplest of primitives, PingServlet, and build to increasingly complex tests up to PingServlet2JNDI. The goal of these tests is to characterize the runtime performance of the various components of the Web Container; these results were based on the Geronimo distribution that includes Tomcat.
The names of the primitives covered in this section all begin with the simple primitive PingServlet. All other primitives are built on this foundation. Since the other primitives are built on PingServlet, in most cases the PingServlet prefix is not mentioned in the charts. So, for example, where you see 2Include you can assume that the actual primitive name is PingServlet2Include.
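For readers who have not looked at DayTrader, the sketch below approximates what the base PingServlet primitive does. It is not the actual DayTrader source, just an assumed minimal servlet that returns a tiny page so that only the Web Container's request path is measured.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Approximation of the simplest primitive: no session, no JDBC, no EJB,
// just a servlet that answers with a small page so the test exercises the
// Web Container's request path and little else.
public class PingServletSketch extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.setContentType("text/html");
        PrintWriter out = resp.getWriter();
        out.println("<html><body>Ping Servlet</body></html>");
    }
}

The more complex primitives layer one feature at a time on top of this (an include, a forward, an HttpSession, a JDBC call, a JNDI lookup, an EJB call), which is what makes the results easy to attribute to individual container components.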
The results were gathered by warming up the workload for a few minutes and then stopping it. The server was allowed to settle down, and then the workload was restarted and run for approximately 2 minutes. During this time the server reached a steady state and the throughput, measured in Requests Per Second, was captured. The goal for all tests was to achieve 100% CPU utilization and a steady throughput.
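As a rough illustration of how the Requests Per Second figure is derived, the following single-threaded sketch counts completed round trips over a fixed measurement window and divides by the elapsed time. The URL is hypothetical, and the real runs used 100 zero-think-time clients driven by a commercial load tool rather than code like this.

import java.io.InputStream;
import java.net.URL;

// Simplified, single-threaded sketch of the throughput calculation only:
// fetch the primitive's URL in a loop for a fixed window and report the
// number of completed round trips per second.
public class ThroughputSketch {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://gordy:8080/daytrader/servlet/PingServlet"); // hypothetical URL
        long windowMillis = 2 * 60 * 1000;   // roughly the 2 minute measurement window
        long end = System.currentTimeMillis() + windowMillis;
        long requests = 0;

        byte[] buffer = new byte[4096];
        while (System.currentTimeMillis() < end) {
            InputStream in = url.openStream();
            while (in.read(buffer) != -1) {
                // drain the response
            }
            in.close();
            requests++;
        }
        System.out.println("Requests Per Second: " + (requests / (windowMillis / 1000)));
    }
}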
WEB TIER
The Web Tier is by far the best performing part of the Apache Geronimo server. For all operations good CPU utilization was obtained (in excess of 98% on the 4-way Application Server). Unfortunately, for PingJDBCWrite the database system was not able to keep up with the amount of disk write activity and only a CPU utilization of approximately 73% could be obtained. I'm working to improve performance by using RAM disks for the logs, as I'm more concerned with Application Server performance than with the database. For this reason I have excluded PingJDBCWrite from the tests, as this was a problem for all tested Application Servers.
All of the following results are measured in Requests Per Second, which is the number of round-trip requests sent from the driver to the tested Application Server.
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the Servlet, HTTPSession1 and PingJSP primitives (y-axis 0 to 85,000)]
Servlet, HTTPSession1 and PingJSP all show excellent, competitive results compared against the Performance Target (PT). This initial set of tests shows that Apache Geronimo is on an even footing with the other competitors in the arena.
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the Writer, 2Servlet and 2JSPEL primitives (y-axis 0 to 85,000)]
Just like the previous primitives, PingServlet2Writer and PingServlet2Servlet show very competitive results.
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the JDBCRead, JDBCWrite and 2JNDI primitives (y-axis 0 to 50,000)]
Finally, the JDBC primitives also show excellent performance. The significantly higher JNDI result is due in part to the fact that Geronimo uses a simple HashMap for JNDI name resolution, whereas the other implementations use a mutable global JNDI implementation.
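In essence, the 2JNDI primitive performs a name lookup on every request. A hedged sketch of that kind of lookup is shown below; the JNDI name is hypothetical and the actual primitive may structure the lookup differently.

import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Sketch of the kind of per-request lookup a 2JNDI-style primitive performs.
// In Geronimo the component's java:comp/env context is essentially a simple
// read-only map built at deployment time, so a lookup like this resolves to
// little more than a HashMap get, which is consistent with the high 2JNDI
// numbers in the chart above.
public class JndiLookupSketch {
    static DataSource lookupDataSource() throws NamingException {
        Context ctx = new InitialContext();
        return (DataSource) ctx.lookup("java:comp/env/jdbc/TradeDataSource"); // hypothetical name
    }
}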
Overall, for applications that depend on the Web Tier and JDBC, Apache Geronimo is very competitive among the variety of choices available to users.
EJB PRIMITIVES
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the 2SessionEJB, 2EntityLocal and 2EntityRemote primitives (y-axis 0 to 30,000)]
Apache Geronimo does not perform as well as the competition when running in a pass-by-value mode of operation (the J2EE default mode of operation). In many instances this is not much of an issue, because most containers run in a mode of operation that bypasses parameter copying, but for those that require the isolation provided by value copying, Apache Geronimo is at a disadvantage. You'll note that for operations involving Local Interfaces (2EntityLocal) the performance is again on par with the competition. 2EntityRemote simply highlights the performance degradation noted above for parameter copying. Apache Geronimo allows parameter copying to be disabled (like most Application Servers), but I have not re-run these experiments in this mode of operation.
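To show why parameter copying is expensive, the sketch below imitates what a container generally does when pass-by-value is in effect: it hands the bean a serialized-and-deserialized copy of each argument so the bean can never see the caller's object. The class and method names are made up for illustration; this is not Geronimo's actual implementation.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustration only: with pass-by-value (the spec-required behavior for remote
// interfaces) each argument and return value is copied, commonly via a
// serialization round trip like this one. With local interfaces or a
// pass-by-reference mode this whole step disappears, which is why the
// 2EntityLocal numbers sit so much closer to the Performance Target.
public class PassByValueSketch {

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T value) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(value);
        out.close();
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        return (T) in.readObject();
    }
}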
The next set of primitives shows an area that needs improvement: complicated SQL operations using CMRs and Collections.
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the 2EntityCollection, 2CMROne2One and 2CMROne2Many primitives (y-axis 0 to 8,000)]
In each of the above primitives the sequence of events is PingServlet2Session2... One of the
areas that needs to be investigated further is the comparison of these same primitives
without parameter copying as indicated previously.
Overall, the performance of EJBs is competitive when using local interfaces. Where an application developer requires the isolation provided via pass-by-value, Apache Geronimo lags the other alternatives. However, as previously noted, many installations run in a pass-by-reference mode, and for these deployments Apache Geronimo will continue to provide competitive performance.
[Chart: Requests Per Second, Geronimo 1.1.1 vs. PT, for the Trade scenario in JDBC Direct mode (y-axis up to 4,000)]
The overall CPU utilization for the Geronimo server being tested was 93.7 percent. It appears that the database system in use was somewhat limited in its disk subsystem. Other servers ran slightly hotter or about the same as the Apache Geronimo server.
Although DayTrader can run using EJBs, Apache Geronimo lacks two features required to sustain heavy load in this mode. These are a Stateless Session Bean cache as well as the ability to specify whether SQL operations should be conducted with a SELECT FOR UPDATE or not. These features are being added to the upcoming Apache Geronimo 1.2 release.
Download Size: 35MB
Unzipped Disk Size: 41.6MB
Disk space after initial startup: 52.5MB
Initial Startup time: 15 seconds
After DayTrader 1.1.1 installed: 19 seconds
Java Command Line: java -Xmx1g -Xms1g -jar server.jar
PINGSERVLET
Configuration
Results
PINGSERVLET2WRITER
Configuration
PINGSERVLET2INCLUDE
Configuration
Results
PINGSERVLET2SERVLET
Configuration
Results
PINGJSP
Configuration
PINGJSPEL
Configuration
PINGSERVLET2JSP
Configuration
Results
PINGHTTPSESSION1
Configuration
Results
PINGHTTPSESSION2
Configuration
PINGHTTPSESSION3
Configuration
Results
PINGJDBCREAD
Configuration
Results
PINGJDBCWRITE
PINGSERVLET2JNDI
Configuration
PINGSERVLET2SESSION
Configuration
PINGSERVLET2ENTITYLOCAL
Configuration
PINGSERVLET2ENTITYEJBREMOTE
Configuration
Results
PINGSERVLET2SESSION2ENTITY
Configuration
Results
PINGSERVLET2ENTITYCOLLECTION
Configuration
Results
PINGSERVLET2SESSION2CMRONE2ONE
Configuration
Results
PINGSERVLET2CMRONE2MANY
Configuration
Results
SCENARIO - JDBC
Configuration
Results