Solace JMS Integration With Spark Streaming 1.3
Document Version 1.0
October 2018
This document is an integration guide for using Solace JMS as a JMS provider for a Spark
Streaming custom receiver.
Apache Spark is a fast and general-purpose cluster computing system. It provides an optimized
engine that supports general execution graphs. It also supports a rich set of higher-level tools
including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX
for graph processing, and Spark Streaming for high-throughput, fault-tolerant stream processing of
live data streams. The Spark Streaming custom receiver is a simple interface that allows
third-party applications to push data into Spark in an efficient manner.
The Solace message router supports persistent and non-persistent JMS messaging with high
throughput and low, consistent latency. Thanks to very high capacity and built-in virtualization,
each Solace message router can replace dozens of software-based JMS brokers in multi-tenant
deployments. Since JMS is a standard API, client applications connect to Solace like any other
JMS broker so companies whose applications are struggling with performance or reliability issues
can easily overcome them by upgrading to Solace’s hardware.
© Solace Corporation.
http://www.solace.com
Solace JMS Integration with Spark Streaming v1.3
Table of Contents
1 Overview
    1.1 Related Documentation
2 Why Solace
    Superior Performance
    Robustness
    Simple Architecture
    Simple Operations
    Cost Savings
3 Integrating with Spark Streaming
    3.1 Description of Resources Required
        3.1.1 Solace Resources
        3.1.2 Spark Resources
    3.2 Step 1 – Obtain access to Solace message router and JMS API
    3.3 Step 2 – Configuring the Solace Message Router
        3.3.1 Creating a Message VPN
        3.3.2 Configuring Client Usernames & Profiles
        3.3.3 Setting up Guaranteed Messaging Endpoints
        3.3.4 Setting up Solace JNDI References
    3.4 Step 3 – Coding a JMS custom receiver
    3.5 Step 4 – Deploying JMS Receiver
4 Performance Considerations
5 Working with Solace High Availability (HA)
6 Debugging Tips for Solace JMS API Integration
    6.1 How to enable Solace JMS API logging
7 Advanced Topics
    7.1 Authentication
        7.1.1 Configuring the Solace Message Router
        7.1.2 Configuring Spark
    7.2 Working with the Solace Disaster Recovery Solution
        7.2.1 Configuring a Host List within the Spring Framework
        7.2.2 Configuring reasonable JMS Reconnection Properties within Solace JNDI
        7.2.3 Disaster Recovery Behavior Notes
8 Appendix - Configuration and Java Source Reference
    8.1 JMSReceiver.java
    8.2 JMSReceiverTest.java
1 Overview
This document demonstrates how to integrate Solace Java Message Service (JMS) with the Spark Streaming custom
receiver for consumption of JMS messages. The goal of this document is to outline best practices for this integration to
enable efficient use of both Spark Streaming and Solace JMS.
The target audience of this document is developers using Hadoop v2 with general knowledge of both Spark and JMS.
As such, this document focuses on the technical steps required to achieve the integration. For detailed
background on either Solace JMS or Spark, refer to the referenced documents below.
This document is divided into the following sections to cover the Solace JMS integration with Spark Streaming:
o Integrating with Spark Streaming
o Performance Considerations
o Debugging Tips
2 Why Solace
Solace technology efficiently moves information between all kinds of applications, users and devices, anywhere in the
world, over all kinds of networks. Solace makes its state-of-the-art data movement capabilities available via hardware
and software “message routers” that can meet the needs of any application or deployment environment. Solace’s
unique solution offers unmatched capacity, performance, robustness and TCO so our customers can focus on seizing
business opportunities instead of building and maintaining complex data distribution infrastructure.
Superior Performance
Solace’s hardware and software messaging middleware products can cost-effectively meet the performance needs of
any application, with feature parity and interoperability that lets companies start small and scale to support higher
volume or more demanding requirements over time, and purpose-built appliances that offer 50-100x higher
performance than any other technology for customers or applications that require extremely high capacity or low
latency.
Robustness
Solace offers high availability (HA) and disaster recovery (DR) without the need for 3rd party products, and fast failover
times no other solution can match. Distributing data via dedicated TCP connections ensures an orderly, well-behaved
system under load, and patented techniques ensure that the performance of publishers and high-speed consumers is
never impacted by slow consumers.
Simple Architecture
Modern enterprises run applications that demand many kinds of data movement such as persistent messaging, web
streaming, WAN distribution and cloud-based communications. By supporting all kinds of data movement with a unified
platform that can be deployed as a small-footprint software broker or high-capacity rack-mounted appliance, Solace lets
architects design an end-to-end infrastructure that’s easy to build applications for, integrate with existing technologies,
secure and scale.
Simple Operations
Solace’s solution features a shared administration framework for all kinds of data movement, deployment models and
network environments so it’s easy for IT staff to deploy, monitor, manage and upgrade their Solace-based messaging
environment.
Cost Savings
Solace reduces expenses with high-capacity hardware, flexible software, and the ability to deploy the right solution for
each problem. Solace’s support for many kinds of messaging lets you replace multiple messaging products with just
one, and built-in HA, DR, WAN and Web functionality eliminates the need for third-party products.
Resource                       Value             Description
Solace Message Router IP:Port  __IP:Port__       The IP address and port of the Solace Message Router message
                                                 backbone. This is the address clients use when connecting to the
                                                 Solace Message Router to send and receive messages. This
                                                 document uses a value of __IP:Port__.
Message VPN                    Solace_Spark_VPN  A Message VPN, or virtual message broker, to scope the integration
                                                 on the Solace Message Router.
JNDI Connection Factory        JNDI/CF/spark     The JNDI Connection factory for controlling Solace JMS connection
                                                 properties
JNDI Queue Name                JNDI/Q/receiver   The JNDI name of the queue used in the samples
Table 2 – Solace Resources
Resource Value
org.apache.spark.storage.StorageLevel
org.apache.spark.streaming.receiver.Receiver
Table 3 – Spark Resources
3.2 Step 1 – Obtain access to Solace message router and JMS API
The Solace message router can be obtained in one of two ways.
1. If you are in an organization that is an existing Solace customer, it is likely your organization already has
Solace Message Routers and corporate policies about their use. You will have to contact your middleware
operational team regarding access to a Solace Message Router.
2. If you are new to Solace or your company does not have development message routers, you can obtain a trial
Solace Virtual Message Router from the [Solace-Portal] in the Downloads -> Products -> Virtual Message
Router section.
The following Solace libraries are required. They can be obtained on [Solace-Portal] in the Downloads-> Enterprise
Messaging APIs-> JMS section.
Apache Geronimo  geronimo-jms_1.1_spec-1.1.1.jar  Apache Geronimo is an open source server runtime that
                                                  integrates the best open source projects to create Java/OSGi
                                                  server runtimes that meet the needs of enterprise developers
                                                  and system administrators. Our most popular distribution is a
                                                  fully certified Java EE 6 application server runtime.
o Appropriate JNDI mappings enabling JMS clients to connect to the Solace appliance configuration.
For reference, the CLI commands in the following sections are from SolOS version 6.2 but will generally be forward
compatible. For more details related to Solace appliance CLI see [Solace-CLI]. Wherever possible, default values will
be used to minimize the required configuration. The CLI commands listed also assume that the CLI user has a Global
Access Level set to Admin. For details on CLI access levels please see [Solace-FG] section “User Authentication and
Authorization”.
Note that this configuration can also be easily performed using SolAdmin, Solace’s GUI management tool. This is
in fact the recommended approach for configuring a Solace appliance. This document uses the CLI as the reference to
remain concise.
public class JMSReceiver extends Receiver<String> implements MessageListener {
    @Override
    public void onStart() {
        // TODO Auto-generated from spark.streaming.receiver
    }

    @Override
    public void onStop() {
        // TODO Auto-generated from spark.streaming.receiver
    }

    @Override
    public void onMessage(Message arg0) {
        // TODO Auto-generated from javax.jms.MessageListener
    }
}
In the constructor we need to collect the information needed to connect to Solace and build the JMS
environment.
String jndiQueue_s;
String connectionFactory_s;
jndiQueue_s = jndiQueue;
connectionFactory_s = connectionFactory;
}
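As a rough illustration of the "JMS environment" the constructor collects, the sketch below assembles a JNDI environment table using only the standard javax.naming constants. It is a minimal sketch under stated assumptions, not the guide's own constructor: the class and method names are hypothetical, the initial context factory class name and the username@vpn form of the security principal are assumed Solace conventions that should be checked against the Solace JMS API documentation, and the host and password are placeholders.

```java
import java.util.Hashtable;
import javax.naming.Context;

// Hypothetical helper; not part of the original sample code.
class JndiEnvironmentSketch {

    // Assumed class name of the Solace JNDI initial context factory.
    static final String SOLACE_JNDI_FACTORY =
            "com.solacesystems.jndi.SolJNDIInitialContextFactory";

    // Builds the table that would later be passed to new InitialContext(env).
    static Hashtable<String, Object> buildEnvironment(
            String brokerUrl, String vpn, String username, String password) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, SOLACE_JNDI_FACTORY);
        env.put(Context.PROVIDER_URL, brokerUrl);
        // Assumption: the Message VPN is appended to the username.
        env.put(Context.SECURITY_PRINCIPAL, username + "@" + vpn);
        env.put(Context.SECURITY_CREDENTIALS, password);
        return env;
    }

    public static void main(String[] args) {
        // Placeholder values; 192.0.2.10 is a documentation-only address.
        Hashtable<String, Object> env = buildEnvironment(
                "smf://192.0.2.10:55555", "Solace_Spark_VPN", "spark_user", "changeme");
        System.out.println(env.get(Context.SECURITY_PRINCIPAL));
    }
}
```

Building the table separately from creating the InitialContext keeps the constructor free of network calls, so the connection is only attempted once Spark invokes onStart().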
• Next in the onStart() method we need to look up the JMS connection factory and queue then connect to
receive messages.
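@Override
public void onStart() {
    try {
        // Assumption: env is the JNDI environment Hashtable built in the
        // constructor (initial context factory, provider URL, credentials).
        InitialContext initialContext = new InitialContext(env);
        ConnectionFactory factory = (ConnectionFactory)
                initialContext.lookup(connectionFactory_s);
        connection_s = factory.createConnection();
        Destination queue = (Destination) initialContext.lookup(jndiQueue_s);
        // Client acknowledgement lets onMessage() acknowledge only after store().
        Session session = connection_s.createSession(false, Session.CLIENT_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);
        consumer.setMessageListener(this);
        connection_s.start();
    } catch (NamingException e) {
        e.printStackTrace();
    } catch (JMSException e) {
        e.printStackTrace();
    }
}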
Finally, when receiving messages from Solace they need to be stored into Spark.
@Override
public void onMessage(Message message) {
    try {
        store(message.toString());
        message.acknowledge();
    } catch (JMSException e) {
        e.printStackTrace();
    }
}
This is set as such to easily allow execution within the Spark example directory structure and may need to be changed
to best fit your operational environment.
To invoke the JMS receiver:
4 Performance Considerations
In the provided example above persistent messaging was used on the appliance and the Spark Streaming client
connected to a queue. This design pattern provides the highest level of reliability as each message is persisted on the
Solace message router and will not be lost in case of a client failure. This message pattern consumes the most
resources on the Solace Message Router and is not the most performant.
If the client does not want to receive messages that were missed while it was offline, does not want to receive older
messages if it is unable to keep up with the published message flow, or wants the highest throughput with the lowest
latency, then direct messaging is the correct pattern.
In section 3.3.4 Setting up Solace JNDI References, the Solace CLI commands correctly configured the required JNDI
properties to reasonable values. These commands are repeated here for completeness.
(config)# jndi message-vpn Solace_Spark_VPN
(config-jndi)# connection-factory JNDI/CF/spark
(config-jndi-connection-factory)# property-list transport-properties
(config-jndi-connection-factory-pl)# property "reconnect-retry-wait" "3000"
(config-jndi-connection-factory-pl)# property "reconnect-retries" "20"
(config-jndi-connection-factory-pl)# property "connect-retries-per-host" "5"
(config-jndi-connection-factory-pl)# property "connect-retries" "1"
(config-jndi-connection-factory-pl)# exit
(config-jndi-connection-factory)# exit
(config-jndi)# exit
(config)#
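To get a feel for what these values imply, the following sketch estimates how long a client might keep trying to reconnect. It is only an approximation under an assumed model that the text above does not confirm: each of the reconnect-retries passes is taken to make connect-retries-per-host attempts against each host, with reconnect-retry-wait milliseconds between attempts, and the time spent on the connection attempts themselves is ignored. The class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical helper; the retry model is an assumption, not the API's documented behaviour.
class ReconnectWindowSketch {

    // Estimates the total waiting time across all reconnect attempts.
    static Duration approximateWindow(int reconnectRetries,
                                      int connectRetriesPerHost,
                                      long reconnectRetryWaitMs,
                                      int hostCount) {
        long attempts = (long) reconnectRetries * connectRetriesPerHost * hostCount;
        return Duration.ofMillis(attempts * reconnectRetryWaitMs);
    }

    public static void main(String[] args) {
        // Values from the CLI commands above, with a single host in the host list.
        Duration window = approximateWindow(20, 5, 3000, 1);
        System.out.println(window.getSeconds() + " seconds");
    }
}
```

With the values above and one host, the estimate is 20 x 5 x 3000 ms, or 300 seconds, which gives a sense of how long a failover can take before the receiver gives up.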
Finally ensure that the JNDI Destination you are using points to a Topic not a Queue:
By default, info logs will be written to the console. This section focuses on using log4j as the logging library and tuning
Solace JMS API logs using the log4j properties. Therefore, in order to enable Solace JMS API logging, a user must do
two things:
o Put Log4j on the classpath.
Below is an example Log4j properties file that will enable debug logging within the Solace JMS API.
log4j.rootCategory=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{ABSOLUTE} %5p %t %c{2}:%L - %m%n
log4j.category.com.solacesystems.jms=DEBUG
log4j.category.com.solacesystems.jcsmp=DEBUG
With this you can get output in a format similar to the following which can help in understanding what is happening
within the Solace JMS API.
14:35:01,171 DEBUG main client.ClientRequestResponse:75 - Starting request timer (SMP-EstablishP2pSub) (10000 ms)
14:35:01,171 DEBUG Context_2_ReactorThread client.ClientRequestResponse:83 - Stopping request timer (SMP-EstablishP2pSub)
14:35:01,173 INFO main jms.SolConnection:151 - Connection created.
14:35:01,173 INFO main connection.CachingConnectionFactory:298 - Established shared JMS Connection: com.solacesystems.jms.SolConnection@ca3f2d
14:35:01,180 INFO main jms.SolConnection:327 - Entering start()
14:35:01,180 INFO main jms.SolConnection:338 - Leaving start() : Connection started.
14:35:01,180 INFO jmsContainer-1 jms.SolConnection:252 - Entering createSession()
7 Advanced Topics
7.1 Authentication
JMS client authentication is handled by the Solace appliance. The Solace appliance supports a variety of
authentication schemes as described in [Solace-FG] in the section “Client Authentication and Authorization”.
In this section we will show how to configure the Solace Message Router to pass the authentication
username/password through to an LDAP (Active Directory) server to integrate with enterprise-level authentication
mechanisms. TLS certificates and Kerberos are also possible.
o First an LDAP profile needs to be created, this includes:
Finally, the LDAP profile will need to be enabled for the message VPN. Note that no code change is required in the
application or API, because the authentication is passed through from the appliance to the LDAP server.
Then set the server certificate for the Solace Message Router.
(config)# ssl server-certificate mycert.pem
(config)#
To connect to :
smf://spark_user@__IP:Port__
This specifies a URI scheme of “smf”, which is the plain-text method of communicating with the Solace Message
Router. This should be updated to “smfs” to switch to secure communication, giving you the following configuration:
smfs://spark_user@__IP:Port__
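If the broker URL is supplied as a program argument rather than hard-coded, the scheme switch can be applied in code. The helper below is a minimal, hypothetical sketch using plain string handling; it is not part of the Solace JMS API, and the class and method names are invented for illustration.

```java
// Hypothetical helper; not part of the Solace JMS API.
class SecureUrlSketch {

    // Rewrites a plain-text "smf://" URL to the secure "smfs://" scheme.
    static String toSecure(String url) {
        if (url.startsWith("smfs://")) {
            return url; // already secure
        }
        if (url.startsWith("smf://")) {
            return "smfs://" + url.substring("smf://".length());
        }
        throw new IllegalArgumentException("Unexpected URI scheme: " + url);
    }

    public static void main(String[] args) {
        System.out.println(toSecure("smf://spark_user@__IP:Port__"));
    }
}
```

Rejecting unknown schemes, rather than passing them through, avoids silently connecting in plain text when a URL is mistyped.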
It is also required to provide a trust store password. This password allows the Solace JMS API to validate the integrity
of the contents of the trust store. This is done through the following parameter.
env.put(SupportedProperty.Solace_JMS_SSL_TrustStorePassword, ___Password___)
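The integrity check that a trust store password enables can be seen with the JDK's own java.security.KeyStore class, independently of the Solace JMS API. The sketch below is illustrative only: it creates an empty in-memory JKS store protected by a password, then shows that loading succeeds with the right password and fails with a wrong one. The class name, method names and the "changeit" password are hypothetical, and how the Solace API performs its own check internally is not stated in this document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

// Hypothetical illustration using only JDK classes.
class TrustStoreCheckSketch {

    // Creates an empty in-memory store of the given type, protected by the password.
    static byte[] createEmptyStore(String type, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance(type);
            store.load(null, null); // initialise an empty store
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            store.store(out, password); // integrity data is derived from the password
            return out.toByteArray();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if the store loads and passes its integrity check with this password.
    static boolean passwordMatches(String type, byte[] storeBytes, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance(type);
            store.load(new ByteArrayInputStream(storeBytes), password);
            return true;
        } catch (IOException e) {
            // KeyStore.load reports a wrong password or corrupted data as an IOException.
            return false;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] jks = createEmptyStore("JKS", "changeit".toCharArray());
        System.out.println(passwordMatches("JKS", jks, "changeit".toCharArray()));
        System.out.println(passwordMatches("JKS", jks, "wrong".toCharArray()));
    }
}
```

Running a check like this against the real trust store file before deploying the receiver can separate password mistakes from connection problems.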
There are multiple formats for the trust store file. By default Solace JMS assumes a format of Java Key Store (JKS). So
if the trust store file follows the JKS format then this parameter may be omitted. Solace JMS supports two formats for
the trust store: “jks” for Java Key Store or “pkcs12”. Setting the trust store format is done through the following
parameter.
env.put(SupportedProperty.Solace_JMS_SSL_TrustStoreFormat, "jks")
And finally, the authentication scheme must be selected. Solace JMS supports the following authentication schemes for
secure connections:
o AUTHENTICATION_SCHEME_BASIC
o AUTHENTICATION_SCHEME_CLIENT_CERTIFICATE
This integration example will use basic authentication. So the required parameter is as follows:
env.put(SupportedProperty.Solace_JMS_Authentication_Scheme,AUTHENTICATION_SCHEME_BASIC)
import com.solacesystems.jms.SupportedProperty;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Destination;
import javax.jms.ExceptionListener;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageListener;
import javax.jms.Session;
import javax.naming.Context;
import javax.naming.InitialContext;
import java.util.Hashtable;
        log.info("Starting up...");
        try
        {
            _connection = factory.createConnection();
            _connection.setExceptionListener(new JMSReceiverExceptionListener());
            MessageConsumer consumer;
            consumer = session.createConsumer(queue);
            consumer.setMessageListener(this);
            _connection.start();
            log.info("Completed startup.");
        } catch (Exception ex)
        {
            // Caught exception, try a restart
            log.error("Callback onStart caught exception, restarting ", ex);
            restart("Callback onStart caught exception, restarting ", ex);
        }
    }

    @Override
    public void onMessage(Message message)
    {
        log.info("Callback onMessage received " + message);
        // Store the message in Spark before acknowledging it to the router
        store(message.toString());
        try {
            message.acknowledge();
        } catch (JMSException ex) {
            log.error("Callback onMessage failed to ack message", ex);
        }
    }
    private class JMSReceiverExceptionListener implements ExceptionListener
    {
        @Override
        public void onException(JMSException ex)
        {
            log.error("JMS exceptionListener caught exception, restarting ", ex);
            restart("JMS exceptionListener caught exception, restarting ");
        }
    }

    @Override
    public String toString()
    {
        return "JMSReceiver{" +
                "brokerURL='" + _brokerURL + '\'' +
                ", vpn='" + _vpn + '\'' +
                ", username='" + _username + '\'' +
                ", queueName='" + _queueName + '\'' +
                ", connectionFactory='" + _connectionFactory + '\'' +
                '}';
    }
}
8.2 JMSReceiverTest.java
Note that this test is the simple word count test in Spark, against the entire JMS message as a
string. More practical use of this receiver would likely parse the JMS message into a serializable
object prior to Spark analysis.
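One way to picture such a serializable object is sketched below using only JDK classes. It is a hypothetical example, not part of the sample source: the ReceivedEvent class and its fields are invented for illustration, and the constructor arguments stand in for values a receiver might read from the JMS message (for example via Message.getJMSMessageID() and TextMessage.getText()). The round trip through Java serialization shows the property Spark relies on when it stores or replicates received data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value object; not part of the original sample.
class ReceivedEvent implements Serializable {
    private static final long serialVersionUID = 1L;

    final String messageId; // stands in for the JMS message ID
    final String body;      // stands in for the text payload

    ReceivedEvent(String messageId, String body) {
        this.messageId = messageId;
        this.body = body;
    }

    // Serializes and deserializes the event to confirm it survives the round trip.
    static ReceivedEvent roundTrip(ReceivedEvent event) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(event);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ReceivedEvent) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReceivedEvent copy = roundTrip(new ReceivedEvent("ID:1", "hello solace"));
        System.out.println(copy.messageId + " -> " + copy.body);
    }
}
```

A receiver built this way would extend Receiver<ReceivedEvent> rather than Receiver<String>, so downstream transformations work on typed fields instead of re-parsing a string.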
package com.solacesystems.jms.samples;
import java.util.regex.Pattern;
import javax.naming.NamingException;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;
import com.google.common.collect.Lists;
        // Create an input stream with the custom receiver on target ip:port and count the
        // words in input stream of \n delimited text (eg. generated by 'nc')
        JavaReceiverInputDStream<String> lines;
        lines = ssc.receiverStream(
                new JMSReceiver(args[0], args[1], args[2], args[3], args[4],
                        args[5], StorageLevel.MEMORY_ONLY_SER_2()));
        wordCounts.print();
        ssc.start();
        ssc.awaitTermination();
        ssc.close();
    }
}
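The word-count computation that produces wordCounts is not shown in the listing above. As a plain-Java illustration of the same idea, and not the Spark code itself, the sketch below splits each received line on spaces and counts occurrences of each word. The class and method names are hypothetical, and splitting on a single space is an assumption suggested by the java.util.regex.Pattern import in the test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical plain-Java illustration; the real test uses Spark DStream operations.
class WordCountSketch {
    private static final Pattern SPACE = Pattern.compile(" ");

    // Splits every line on spaces and counts how often each word occurs.
    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(SPACE::splitAsStream)
                .filter(word -> !word.isEmpty())
                .collect(Collectors.toMap(word -> word, word -> 1, Integer::sum, TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello solace", "hello spark");
        System.out.println(countWords(lines));
    }
}
```

In the Spark version the same three steps (split, map each word to a count of one, sum by key) are applied to every batch of the input stream rather than to a fixed list.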