Mastering GeoServer - DBMS Connection Parameters Explained
Table of Contents
GeoServer DataStores and DBMS Connections
General considerations
Internal Connection Pool
Prepared statement notes
A few more thoughts
JNDI Connection Pool
Configuring pools for productions
Connection waiting time and relation with other params
Maximizing sharing of pools
Validation queries
© 2016 GeoSolutions SAS - All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form
or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission
of the publisher, except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted
by copyright law. For permission requests, write to the publisher at [email protected]
Ok, now let’s go into GeoServer specifics. In most GeoServer DataStores you can choose between
the standard store and the JNDI [2] one, which basically means you can either have GeoServer
manage the connection pool for you, or configure the pool externally (within the Application
Server of choice) and have GeoServer lean on it to get connections. The bottom line is that,
one way or the other, you will always end up using a connection pool in GeoServer.
[1] These recommendations apply not only to GeoServer but are of general usage when it comes
to working with a DBMS.
[2] http://en.wikipedia.org/wiki/Java_Naming_and_Directory_Interface
[3] http://commons.apache.org/proper/commons-dbcp/
max connections: The maximum number of connections the pool can hold. When the maximum number
of connections is exceeded, additional requests that require a database connection are halted
until a connection from the pool becomes available, and eventually time out if that is not
possible within the time specified in the connection timeout. The maximum number of connections
limits the number of concurrent requests that can be made against the database.

min connections: The minimum number of connections the pool will hold. This number of
connections is held even when there are no active requests. When this number is exceeded while
serving incoming requests, additional connections are opened until the pool reaches its maximum
size (described above). How far this value sits from max connections has two implications:
1. If it is very far from max connections, this might limit the ability of GeoServer to
respond quickly to unexpected or random heavy load, because it takes a non-negligible time to
create new connections. However, this setup is very good when the DBMS is quite loaded, since
it tends to use as few connections as possible at all times.
2. If it is very close to max connections, GeoServer will be very fast to respond to random
load. However, in this case GeoServer puts a big burden on the DBMS, as the pool will try to
hold all needed connections at all times.

validate connections: Flag indicating whether connections from the pool should be validated
before they are used. A connection in the pool can become invalid for a number of reasons,
including network breakdown, database server timeout, etc. The benefit of setting this flag is
that an invalid connection will never be used, which can prevent client errors. The downside is
that a small performance penalty is paid each time a connection is borrowed from the pool, since
the validation is done by sending a small query to the server. However, the cost of this query
is usually small; for instance, on PostGIS the validation query is “SELECT 1”.

fetch size: The number of records read from the database in each network exchange. If set too
low (below 50), network latency will negatively affect performance; if set too high, it might
consume a significant portion of GeoServer memory and push it towards an Out Of Memory error.
Defaults to 1000. It might be beneficial to push it to a higher number if the typical database
query reads much more data than this and there is enough heap memory to hold the results.

connection timeout: Time, in seconds, the connection pool will wait before giving up its attempt
to get a new connection from the database. Defaults to 20 seconds. This timeout kicks in during
heavy load, when the number of requests needing a connection to a certain DB greatly outnumbers
the connections available in the pool, so some requests might get error messages due to timeouts
while acquiring a connection. This condition is not problematic per se, since a request usually
does not use a DB connection for its entire lifecycle, hence we do not need 100 connections at
hand to respond to 100 requests [4]. However, we should strive to limit it, since it queues
threads on the connection pool after they might have already allocated memory (e.g. for
rendering). We will get back to this later on.

prepared statements: Activates the usage of prepared statements (see the section on prepared
statements below).

max open prepared statements: Maximum number of prepared statements kept open and cached for
each connection in the pool.
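As an illustration, these parameters end up as entries in the configuration GeoServer stores for each DataStore. The fragment below is only a sketch under stated assumptions: the key names are written as they are believed to appear for a PostGIS store and should be checked against an actual datastore.xml, and every value other than the defaults quoted above (fetch size 1000, connection timeout 20) is hypothetical.

```xml
<!-- Sketch: pool-related entries of an internally pooled PostGIS store.
     Key names are assumptions to verify against a real datastore.xml. -->
<connectionParameters>
  <!-- Upper bound on concurrent DB connections for this store (hypothetical value) -->
  <entry key="max connections">10</entry>
  <!-- Connections kept open even when idle (hypothetical value) -->
  <entry key="min connections">2</entry>
  <!-- Send the validation query each time a connection is borrowed -->
  <entry key="validate connections">true</entry>
  <!-- Records read per network exchange (default quoted in the text) -->
  <entry key="fetch size">1000</entry>
  <!-- Seconds to wait for a connection before giving up (default quoted in the text) -->
  <entry key="Connection timeout">20</entry>
  <!-- Disabled here, following the rule of thumb given for PostGIS -->
  <entry key="preparedStatements">false</entry>
</connectionParameters>
```

With min connections this far below max connections, the store favours a lightly loaded DBMS over fast reaction to load spikes, as discussed in the min connections entry.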
Prepared statement notes
In business applications that fetch a small amount of data at a time, prepared statements are
beneficial for performance. In spatial applications, however, where we typically fetch thousands
of rows, the benefit is limited and sometimes turns into a performance problem.
This is the case with PostGIS, which is able to tune the access plan by inspecting the requested
BBOX and deciding whether a sequential scan is preferable (the BBOX really accesses most of the
data) or whether using the spatial index is best instead. So, as a rule of thumb, when working
with PostGIS it is better not to enable prepared statements.
With other databases there is no choice: Oracle currently works only with prepared statements,
SQL Server only without them (this is more often due to implementation limitations than to
database-specific issues).
[4] For WMS requests the connection is held long enough to fetch and paint data for one
particular layer, then the connection is released to the pool and might be fetched back if the
next layer is against the same database; in any case no connection is held during the final
PNG/JPEG encoding. WFS GetFeature instead keeps the connection open for as long as is needed to
write out the results (as GML is written while we fetch data from the database).
There is an upside to using prepared statements, though: no SQL injection attacks are possible
when using them. GeoServer code tries hard to avoid this kind of attack when working without
prepared statements, but enabling them makes an attack via filter parameters basically
impossible.
[5] More information can be found at this link:
http://docs.geoserver.org/stable/en/user/tutorials/tomcat-jndi/tomcat-jndi.html
minIdle="0"
maxIdle="3"
maxWait="10000"
timeBetweenEvictionRunsMillis="30000"
minEvictableIdleTimeMillis="120000"
testWhileIdle="true"
poolPreparedStatements="true"
maxOpenPreparedStatements="100"
validationQuery="SELECT SYSDATE FROM DUAL"
/>
For more recent Oracle installations (from Oracle 9i onwards) the “oracle.jdbc.OracleDriver”
should be used instead of “oracle.jdbc.driver.OracleDriver” since oracle.jdbc.driver.OracleDriver
is deprecated.
maxAge="600000"
rollbackOnReturn="true"
The maxAge parameter specifies the maximum number of milliseconds a connection will be kept
in the pool before it is discarded. When a connection is returned to the pool, its age is
checked against this parameter; if maxAge has been reached, the connection is closed instead of
being returned to the pool. The default value is 0, meaning that connections are left open and
no age checks are done.
If rollbackOnReturn is set to “true”, the pool can terminate the transaction by calling rollback
on the connection as it is returned to the pool. Defaults to false.
Using values that are too low for minEvictableIdleTimeMillis is not advisable and may result in
connection failures.
The Application Server will be responsible for managing the lifecycle of the connection pool
(as such, the DBMS driver must be provided to the Application Server), which will be available
to GeoServer DataStores via the standard JNDI mechanism: with this mechanism the stores do not
handle an internal connection pool, but are allowed to share connections from a single large
pool.
A nice side effect of using JNDI is that, at the cost of higher complexity, the administrator
is granted much finer-grained control over the configuration of the pool. For instance, it is
possible to specify more sophisticated strategies for the number of connections in the pool. It
is also possible to change the validation strategy for the connections; for instance, it is
possible to:
1. test the connections while they sit idle in the pool rather than when acquiring them (this
would speed up the acquisition of a connection)
2. test the connections only when we give them back to the pool
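The two strategies listed above map onto standard pool attributes. The fragment below is a minimal sketch of the first one for a Tomcat-managed pool; the resource name, URL and credentials are hypothetical placeholders, and the eviction interval is only an example value.

```xml
<!-- Hypothetical resource: name, url, username and password are placeholders.
     Strategy 1: validate connections while they sit idle. The idle check is
     performed by the evictor thread, so timeBetweenEvictionRunsMillis must be
     positive for testWhileIdle to have any effect. -->
<Resource name="jdbc/postgis"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/gisdata"
          username="geoserver"
          password="changeme"
          validationQuery="SELECT 1"
          testOnBorrow="false"
          testWhileIdle="true"
          timeBetweenEvictionRunsMillis="30000"
          testOnReturn="false"
/>
```

For the second strategy, testOnReturn would be set to "true" and testWhileIdle to "false". Either way, a connection can still go stale between the last check and the next borrow, which is the price paid for the faster acquisition.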
One last thing to point out: a big pro of using JNDI is that it makes it possible to share the
same pool between stores in different workspaces, thus allowing us to serve the same layers in
different workspaces while efficiently sharing a single connection pool across them.
The max wait time should in general be set according to the expected maximum execution time of
a request, end-to-end. This includes things like accessing the file system and loading the data.
For instance, for WMS requests we are allowed to specify a maximum response time; if we set
this to N seconds, the max wait time should be set to a smaller value, since we don’t want to
waste resources by having threads blocked unnecessarily while waiting for a connection. In this
case it is preferable to fail fast, releasing resources that would otherwise be held
unnecessarily.
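As a sketch of this rule, assume the WMS maximum response time has been set to N = 60 seconds (a hypothetical value). The pool's maxWait, which is expressed in milliseconds, would then be kept well below 60000. The resource name, URL and credentials below are placeholders.

```xml
<!-- Assumption: WMS requests are limited to 60 seconds end-to-end.
     A thread therefore never waits more than 20 seconds for a connection,
     leaving time for the actual data loading and rendering. -->
<Resource name="jdbc/postgis"
          auth="Container"
          type="javax.sql.DataSource"
          driverClassName="org.postgresql.Driver"
          url="jdbc:postgresql://localhost:5432/gisdata"
          username="geoserver"
          password="changeme"
          maxWait="20000"
/>
```

The 20-second figure is only illustrative; the point is the ordering between the two values, not the specific numbers. Pool implementations based on Commons DBCP 2 name this attribute maxWaitMillis instead.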
Long story short, whenever possible strive to use a small number of database users and, if not
using JNDI, a small number of schemas; JNDI, however, is a must for organizations willing to
create a complex setup where different workspaces [6] (i.e. Virtual Services) serve the same
content differently.
Validation queries
Regardless of how we configure the validation query, it is extremely important that we always
remember to validate connections before using them in GeoServer; not doing so might lead to
spurious errors due to stale connections sitting in the pool. This can be achieved with the
internal connection pool (via the validate connections box) as well as with pools declared in
JNDI (via the validation query mechanism); it is worth remembering that the latter allows
finer-grained configuration.
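For JNDI pools, validating before use corresponds to testing on borrow. The sketch below shows this for the two validation queries mentioned in this document (“SELECT 1” for PostGIS, “SELECT SYSDATE FROM DUAL” for Oracle); resource names, URLs and credentials are hypothetical placeholders.

```xml
<!-- Sketch: both pools validate each connection right before handing it out.
     Names, URLs and credentials are placeholders, not real settings. -->
<Context>
  <Resource name="jdbc/postgis"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="org.postgresql.Driver"
            url="jdbc:postgresql://localhost:5432/gisdata"
            username="geoserver"
            password="changeme"
            validationQuery="SELECT 1"
            testOnBorrow="true"
  />
  <Resource name="jdbc/oralocal"
            auth="Container"
            type="javax.sql.DataSource"
            driverClassName="oracle.jdbc.OracleDriver"
            url="jdbc:oracle:thin:@localhost:1521:xe"
            username="geoserver"
            password="changeme"
            validationQuery="SELECT SYSDATE FROM DUAL"
            testOnBorrow="true"
  />
</Context>
```

Testing on borrow is the closest equivalent of the validate connections box of the internal pool: it costs one small query per acquisition but ensures that a stale connection is never handed to a request.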
[6] http://docs.geoserver.org/latest/en/user/services/virtual-services.html