SlideShare a Scribd company logo
Search | Discover | Analyze
Confidential and Proprietary © Copyright 2013
Benchmarking Solr
Performance
June 18, 2014
Timothy Potter
Confidential and Proprietary © Copyright 2013
My SolrCloud Experience
• At LucidWorks, mostly focused on hardening SolrCloud; Lucene/Solr
committer
• Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18
shards ~900M docs)
• Built a Fabric/boto framework for deploying and managing a cluster in
EC2
• Co-author of Solr In Action
Confidential and Proprietary © Copyright 2013
Agenda
• Indexing performance tests
• Solr Scale Toolkit
• Next steps
Confidential and Proprietary © Copyright 2013
Cluster sizing
How many servers do I need to index X docs?
... shards ... ?
... replicas ... ?
I need N queries per second over
M docs, how many servers do I need?
It depends?!?
Confidential and Proprietary © Copyright 2013
Methodology
• Transparent repeatable results
– Ideally hoping for something owned by the community
• Synthetic docs ~ 1K each on disk, mix of field types
– Data set created using code borrowed from PigMix
– English text fields generated using a Zipfian distribution
• Java 1.7u55, Amazon Linux, r3.2xlarge nodes
– enhanced networking enabled, placement group, same AZ
• Stock Solr (cloud) 4.8.1
– Using Shawn Heisey’s GC tuning parameters
• Use Elastic MapReduce to generate load
– As many nodes as I need to drive Solr!
Confidential and Proprietary © Copyright 2013
Indexing Results
Cluster Size # of Shards # of Replicas Reducers Time (secs) Docs / sec
10 10 1 48 1762 73,780
10 10 2 34 3727 34,881
10 20 1 48 1282 101,404
10 20 2 34 3207 40,536
10 30 1 72 1070 121,495
10 30 2 60 3159 41,152
15 15 1 60 1106 117,541
15 15 2 42 2465 52,738
15 30 1 60 827 157,195
15 30 2 42 2129 61,062
Confidential and Proprietary © Copyright 2013
Direct Updates
Indexing
Client 1
CloudSolrServer
(SolrJ)
ZooKeeper
/clusterstate.json
Shard 1
(leader)
Shard 2
(leader)
Shard 3
(leader)
<doc>
<doc>
Watch
/clusterstate.json
<doc>
<doc>
compute shard
assignment on
clientbatch
Confidential and Proprietary © Copyright 2013
Replication
CloudSolrServer
(SolrJ)
ZooKeeper
/clusterstate.json
Shard 1
(leader)
Shard 2
(leader)
Shard 3
(leader)
<doc>
<doc>
Watch
/clusterstate.json
<doc>
Shard 1
(replica)
Shard 2
(replica)
Shard 3
(replica)
Blocks for response
from replica(s)
Confidential and Proprietary © Copyright 2013
Don’t swamp your servers!
Confidential and Proprietary © Copyright 2013
Lessons Learned
• Know what throughput your client side is capable of
generating
– If in MapReduce, index from reducers with speculative execution
disabled
• Don’t change Solr config without good reasons for doing
so
• Overshard (but not too much)
• Near-linear scalability as I added nodes!
Confidential and Proprietary © Copyright 2013
Query Performance Tests
• All nodes in SolrCloud perform indexing and execute queries
• Using the TermsComponent to build queries based on the terms in
each field.
• Harder to accurately simulate user queries over synthetic data
– Need mix of faceting, paging, sorting, grouping, boolean clauses, range
queries, boosting, filters (some cached, some not), etc ...
• Does the randomness in your test queries model (expected) user
behavior?
• Start with one server (1 shard) to determine baseline query
performance.
– Look for inefficiencies in your schema and other config settings
Confidential and Proprietary © Copyright 2013
Solr Scale Toolkit
• Fabric / Python based toolset for deploying and
managing SolrCloud clusters
• SolrJ-based client application useful for building
tools that need access to cluster state information
in ZooKeeper
• Code to support benchmarks for Solr
Confidential and Proprietary © Copyright 2013
Python-based Tools
boto – Python API for AWS (EC2, S3, etc)
Fabric – Python-based tool for automating system admin tasks
over SSH
pysolr – Python library for Solr (sending commits, queries, ...)
kazoo – Python client tools for ZooKeeper
Supporting Cast:
JMeter – run tests, generate reports
collectd – system monitoring
Logstash4Solr – log aggregation
JConsole/VisualVM – monitor JVM during indexing / queries
Confidential and Proprietary © Copyright 2013
Solr Scale Toolkit: Demo
• Launch a meta node
– Log agg / basic monitoring using SiLK
• Launch ZooKeeper Ensemble
– 3 nodes to establish quorum
– Setup cron job to clean-up snapshots
• Launch SolrCloud cluster
• Create new collection and index some docs
– Attach JConsole while indexing
• Run a healthcheck on the collection
• Checkout Banana Dashboard
• Backup / Restore
– Requires patch for SOLR-5956
– Use fab patch_jars to update jars and do a rolling restart
Confidential and Proprietary © Copyright 2013
• Custom built AMI?
• Block device mapping
– dedicated disk per Solr node
• Launch and then poll status until they are live
– verify SSH connectivity
• Tag each instance with a cluster ID and username
Provisioning machines
fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
Confidential and Proprietary © Copyright 2013
• Two options:
– provision 1 to N nodes when you launch Solr cluster
– use existing named ensemble
• Fabric command simply creates the myid
files and zoo.cfg file for the ensemble
– and some cron scripts for managing snapshots
• Basic health checking of ZooKeeper status:
– echo srvr | nc localhost 2181
ZooKeeper
fab new_zk_ensemble:zk1,n=3
Confidential and Proprietary © Copyright 2013
• Upload a BASH script that starts/stops Solr
• Set system props: jetty.port, host, zkHost, JVM
opts
• One or more Solr nodes per machine
• JVM mem opts dependent on instance type and
# of Solr nodes per instance
• Optionally configure log4j.properties to append
messages to Rabbitmq for Logstash4Solr
integration
SolrCloud
fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
Confidential and Proprietary © Copyright 2013
• BASH script that implements:
– start/stop Solr nodes on each EC2 instance
– sets JVM memory options, system properties
(jetty.port), enable remote JMX, etc
– backup log files before restarting nodes
– ensure JVM is killed correctly before restarting
• Environment variables in:
solr-ctl-env.sh
solr-ctl.sh
Confidential and Proprietary © Copyright 2013
• Deploy a configuration directory to ZooKeeper
• Create a new collection
• Attach a local JConsole/VisualVM to a remote JVM
• Rolling restart (with Overseer awareness)
• Build Solr locally and patch remote
– Use a relay server to scp the JARs to Amazon network once and then
scp them to other nodes from within the network
• Put/get files
• Grep over all log files (across the cluster)
Miscellaneous Utility Tasks
Confidential and Proprietary © Copyright 2013
• fab mine: See clusters I’m running (or for other users too)
• fab kill_mine: Terminate all instances I’m running
– Use termination protection in production
• fab ssh_to: Quick way to SSH to one of the nodes in a
cluster
• fab stop/recover/kill: Basic commands for controlling
specific Solr nodes in the cluster
• fab jmeter: Execute a JMeter test plan against your cluster
– Example test plan and Java sampler is included with the source
Other useful stuff ...
Confidential and Proprietary © Copyright 2013
• Java-based command-line application that uses SolrJ’s
CloudSolrServer to perform advanced cluster
management operations:
– healthcheck: collect metadata and health information from all
replicas for a collection from ZooKeeper
– backup: create a snapshot of each shard in a collection for
backing up to remote storage (S3)
• Framework for building complex tools that benefit from
having access to cluster state information in ZooKeeper
SolrCloud Tools (SolrJ client app)
./tools.sh –tool healthcheck
Confidential and Proprietary © Copyright 2013
SiLK Integration
• SiLK: Solr integrated with Logstash and Kibana
– Index time-series data, such as log data (collectd, Solr logs, ...)
– Build cool dashboards with Banana (fork of Kibana)
• Easily aggregate all WARN and more severe log
messages from all Solr servers into logstash4solr
• Send collectd metrics to logstash4solr
Confidential and Proprietary © Copyright 2013
SiLK Integration
Confidential and Proprietary © Copyright 2013
What’s Next?
• Migrate to using Apache libcloud instead of using boto
directly
• Benchmark mixed work-loads (queries and indexing)
• SiLK is improving rapidly!
• Chaos monkey tests
– integrate jepsen?
• Open source so please kick the tires!
Confidential and Proprietary © Copyright 2013
Wrap-up
• Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
• LucidWorks: http://www.lucidworks.com
• SiLK: http://www.lucidworks.com/lucidworks-silk/
• Solr In Action: http://www.manning.com/grainger/
• Connect: @thelabdude / tim.potter@lucidworks.com
Questions?
Ad

More Related Content

What's hot (20)

ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integration
thelabdude
 
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the streamSpark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
Varun Thacker
 
How to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr clusterHow to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr cluster
lucenerevolution
 
Solrcloud Leader Election
Solrcloud Leader ElectionSolrcloud Leader Election
Solrcloud Leader Election
ravikgiitk
 
Solr Exchange: Introduction to SolrCloud
Solr Exchange: Introduction to SolrCloudSolr Exchange: Introduction to SolrCloud
Solr Exchange: Introduction to SolrCloud
thelabdude
 
Scaling Solr with Solr Cloud
Scaling Solr with Solr CloudScaling Solr with Solr Cloud
Scaling Solr with Solr Cloud
Sematext Group, Inc.
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Shalin Shekhar Mangar
 
How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
MongoDB
 
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Lucidworks
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
Anshum Gupta
 
How SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded EnvironmentHow SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded Environment
lucenerevolution
 
Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)
searchbox-com
 
Scaling search with SolrCloud
Scaling search with SolrCloudScaling search with SolrCloud
Scaling search with SolrCloud
Saumitra Srivastav
 
Solr4 nosql search_server_2013
Solr4 nosql search_server_2013Solr4 nosql search_server_2013
Solr4 nosql search_server_2013
Lucidworks (Archived)
 
High Performance Solr
High Performance SolrHigh Performance Solr
High Performance Solr
Shalin Shekhar Mangar
 
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, AolMail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Lucidworks
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big Data
Shalin Shekhar Mangar
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Spark Summit
 
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Lucidworks
 
ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integration
thelabdude
 
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the streamSpark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit EU talk by Miklos Christine paddling up the stream
Spark Summit
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
Varun Thacker
 
How to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr clusterHow to make a simple cheap high availability self-healing solr cluster
How to make a simple cheap high availability self-healing solr cluster
lucenerevolution
 
Solrcloud Leader Election
Solrcloud Leader ElectionSolrcloud Leader Election
Solrcloud Leader Election
ravikgiitk
 
Solr Exchange: Introduction to SolrCloud
Solr Exchange: Introduction to SolrCloudSolr Exchange: Introduction to SolrCloud
Solr Exchange: Introduction to SolrCloud
thelabdude
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Shalin Shekhar Mangar
 
How To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own DatasourceHow To Connect Spark To Your Own Datasource
How To Connect Spark To Your Own Datasource
MongoDB
 
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Lucidworks
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
Anshum Gupta
 
How SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded EnvironmentHow SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded Environment
lucenerevolution
 
Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)Solr cluster with SolrCloud at lucenerevolution (tutorial)
Solr cluster with SolrCloud at lucenerevolution (tutorial)
searchbox-com
 
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, AolMail Search As A Sercive: Presented by Rishi Easwaran, Aol
Mail Search As A Sercive: Presented by Rishi Easwaran, Aol
Lucidworks
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big Data
Shalin Shekhar Mangar
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Spark Summit
 
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Streaming Aggregation in Solr - New Horizons for Search: Presented by Erick E...
Lucidworks
 

Viewers also liked (20)

Data, Cesium.js, StreetView
Data, Cesium.js, StreetViewData, Cesium.js, StreetView
Data, Cesium.js, StreetView
Outliers Collective
 
Adobe Photoshop
Adobe PhotoshopAdobe Photoshop
Adobe Photoshop
LaRue
 
Нестандартные методы интернет рекламы
Нестандартные методы интернет рекламыНестандартные методы интернет рекламы
Нестандартные методы интернет рекламы
Vladimir
 
Linked In Introduction
Linked In IntroductionLinked In Introduction
Linked In Introduction
CMD Training Institute
 
E learning At The Library
E learning At The LibraryE learning At The Library
E learning At The Library
CMD Training Institute
 
Pista American Idiot
Pista American IdiotPista American Idiot
Pista American Idiot
tanica
 
Webテクノロジー@2012
Webテクノロジー@2012Webテクノロジー@2012
Webテクノロジー@2012
彰 村地
 
The mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the marketThe mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the market
Paul Williamson
 
"Search, APIs,Capability Management and the Sensis Journey"
"Search, APIs,Capability Management and the Sensis Journey""Search, APIs,Capability Management and the Sensis Journey"
"Search, APIs,Capability Management and the Sensis Journey"
Lucidworks (Archived)
 
最新ブラウザー UI 比較
最新ブラウザー UI 比較最新ブラウザー UI 比較
最新ブラウザー UI 比較
彰 村地
 
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Marty Kaszubowski
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search Performance
Lucidworks (Archived)
 
Davis mark advanced search analytics in 20 minutes
Davis mark   advanced search analytics in 20 minutesDavis mark   advanced search analytics in 20 minutes
Davis mark advanced search analytics in 20 minutes
Lucidworks (Archived)
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Lucidworks (Archived)
 
across the universe
across the universeacross the universe
across the universe
tanica
 
Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your Technology
Marty Kaszubowski
 
Hellosong
HellosongHellosong
Hellosong
tanica
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様
彰 村地
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
Lucidworks (Archived)
 
Adobe Photoshop
Adobe PhotoshopAdobe Photoshop
Adobe Photoshop
LaRue
 
Нестандартные методы интернет рекламы
Нестандартные методы интернет рекламыНестандартные методы интернет рекламы
Нестандартные методы интернет рекламы
Vladimir
 
Pista American Idiot
Pista American IdiotPista American Idiot
Pista American Idiot
tanica
 
Webテクノロジー@2012
Webテクノロジー@2012Webテクノロジー@2012
Webテクノロジー@2012
彰 村地
 
The mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the marketThe mobile as a health hub, and how bluetooth low energy enables the market
The mobile as a health hub, and how bluetooth low energy enables the market
Paul Williamson
 
"Search, APIs,Capability Management and the Sensis Journey"
"Search, APIs,Capability Management and the Sensis Journey""Search, APIs,Capability Management and the Sensis Journey"
"Search, APIs,Capability Management and the Sensis Journey"
Lucidworks (Archived)
 
最新ブラウザー UI 比較
最新ブラウザー UI 比較最新ブラウザー UI 比較
最新ブラウザー UI 比較
彰 村地
 
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Presentation to the Old Dominion University (ODU) MBA Association, 3/20/13
Marty Kaszubowski
 
Understanding Lucene Search Performance
Understanding Lucene Search PerformanceUnderstanding Lucene Search Performance
Understanding Lucene Search Performance
Lucidworks (Archived)
 
Davis mark advanced search analytics in 20 minutes
Davis mark   advanced search analytics in 20 minutesDavis mark   advanced search analytics in 20 minutes
Davis mark advanced search analytics in 20 minutes
Lucidworks (Archived)
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Lucidworks (Archived)
 
across the universe
across the universeacross the universe
across the universe
tanica
 
Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your Technology
Marty Kaszubowski
 
Hellosong
HellosongHellosong
Hellosong
tanica
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様
彰 村地
 
Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...Building specialized industry applications using Solr, and migration from FAS...
Building specialized industry applications using Solr, and migration from FAS...
Lucidworks (Archived)
 
Ad

Similar to SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance (20)

Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
Lucidworks
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
thelabdude
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Lucidworks (Archived)
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Lucidworks
 
Effective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant ClustersEffective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant Clusters
DataWorks Summit/Hadoop Summit
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin Presentation
Nitin Sharma
 
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Nitin S
 
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
James Chen
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
Cominvent AS
 
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisationMySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
Mark Swarbrick
 
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - NitinSolr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
bloomreacheng
 
Automated Cluster Management and Recovery for Large Scale Multi-Tenant Sea...
  Automated Cluster Management and Recovery  for Large Scale Multi-Tenant Sea...  Automated Cluster Management and Recovery  for Large Scale Multi-Tenant Sea...
Automated Cluster Management and Recovery for Large Scale Multi-Tenant Sea...
Lucidworks
 
Spark etl
Spark etlSpark etl
Spark etl
Imran Rashid
 
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
 DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and... DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
PROIDEA
 
How to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experienceHow to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experience
Docker, Inc.
 
Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
Rahul Singh
 
Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
Anant Corporation
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache Zookeeper
Anshul Patel
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco
Angel Borroy López
 
Low latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache StormLow latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache Storm
Julien Nioche
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
Lucidworks
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
thelabdude
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Lucidworks (Archived)
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Lucidworks
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin Presentation
Nitin Sharma
 
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Nitin S
 
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
James Chen
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
Cominvent AS
 
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisationMySQL Webinar 2/4 Performance tuning, hardware, optimisation
MySQL Webinar 2/4 Performance tuning, hardware, optimisation
Mark Swarbrick
 
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - NitinSolr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
bloomreacheng
 
Automated Cluster Management and Recovery for Large Scale Multi-Tenant Sea...
  Automated Cluster Management and Recovery  for Large Scale Multi-Tenant Sea...  Automated Cluster Management and Recovery  for Large Scale Multi-Tenant Sea...
Automated Cluster Management and Recovery for Large Scale Multi-Tenant Sea...
Lucidworks
 
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
 DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and... DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
DOD 2016 - Stefan Thies - Monitoring and Log Management for Docker Swarm and...
PROIDEA
 
How to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experienceHow to accelerate docker adoption with a simple and powerful user experience
How to accelerate docker adoption with a simple and powerful user experience
Docker, Inc.
 
Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
Rahul Singh
 
Building Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source TechnologiesBuilding Enterprise Search Engines using Open Source Technologies
Building Enterprise Search Engines using Open Source Technologies
Anant Corporation
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache Zookeeper
Anshul Patel
 
(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco(Re)Indexing Large Repositories in Alfresco
(Re)Indexing Large Repositories in Alfresco
Angel Borroy López
 
Low latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache StormLow latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache Storm
Julien Nioche
 
Ad

More from Lucidworks (Archived) (20)

Integrating Hadoop & Solr
Integrating Hadoop & SolrIntegrating Hadoop & Solr
Integrating Hadoop & Solr
Lucidworks (Archived)
 
The Data-Driven Paradigm
The Data-Driven ParadigmThe Data-Driven Paradigm
The Data-Driven Paradigm
Lucidworks (Archived)
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Lucidworks (Archived)
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
Lucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
Lucidworks (Archived)
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Lucidworks (Archived)
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Lucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Lucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Lucidworks (Archived)
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Lucidworks (Archived)
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Lucidworks (Archived)
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Lucidworks (Archived)
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Lucidworks (Archived)
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
Lucidworks (Archived)
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Lucidworks (Archived)
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLK
Lucidworks (Archived)
 
Introducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarIntroducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinar
Lucidworks (Archived)
 
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucidworks (Archived)
 
Seeley yonik solr performance key innovations
Seeley yonik   solr performance key innovationsSeeley yonik   solr performance key innovations
Seeley yonik solr performance key innovations
Lucidworks (Archived)
 
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Lucidworks (Archived)
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Lucidworks (Archived)
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
Lucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
Lucidworks (Archived)
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Lucidworks (Archived)
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Lucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Lucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Lucidworks (Archived)
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Lucidworks (Archived)
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Lucidworks (Archived)
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Lucidworks (Archived)
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Lucidworks (Archived)
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
Lucidworks (Archived)
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Lucidworks (Archived)
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLK
Lucidworks (Archived)
 
Introducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinarIntroducing LucidWorks App for Splunk Enterprise webinar
Introducing LucidWorks App for Splunk Enterprise webinar
Lucidworks (Archived)
 
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucidworks (Archived)
 
Seeley yonik solr performance key innovations
Seeley yonik   solr performance key innovationsSeeley yonik   solr performance key innovations
Seeley yonik solr performance key innovations
Lucidworks (Archived)
 
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks EnterpriseImplementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Implementing Click-through Relevance Ranking in Solr and LucidWorks Enterprise
Lucidworks (Archived)
 

Recently uploaded (20)

Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Hands On: Create a Lightning Aura Component with force:RecordData
Hands On: Create a Lightning Aura Component with force:RecordDataHands On: Create a Lightning Aura Component with force:RecordData
Hands On: Create a Lightning Aura Component with force:RecordData
Lynda Kane
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
Image processinglab image processing image processing
Image processinglab image processing  image processingImage processinglab image processing  image processing
Image processinglab image processing image processing
RaghadHany
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Hands On: Create a Lightning Aura Component with force:RecordData
Hands On: Create a Lightning Aura Component with force:RecordDataHands On: Create a Lightning Aura Component with force:RecordData
Hands On: Create a Lightning Aura Component with force:RecordData
Lynda Kane
 
Linux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdfLinux Professional Institute LPIC-1 Exam.pdf
Linux Professional Institute LPIC-1 Exam.pdf
RHCSA Guru
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
Image processinglab image processing image processing
Image processinglab image processing  image processingImage processinglab image processing  image processing
Image processinglab image processing image processing
RaghadHany
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Learn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step GuideLearn the Basics of Agile Development: Your Step-by-Step Guide
Learn the Basics of Agile Development: Your Step-by-Step Guide
Marcel David
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Leading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael JidaelLeading AI Innovation As A Product Manager - Michael Jidael
Leading AI Innovation As A Product Manager - Michael Jidael
Michael Jidael
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Procurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptxProcurement Insights Cost To Value Guide.pptx
Procurement Insights Cost To Value Guide.pptx
Jon Hansen
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 

SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance

  • 1. Search | Discover | Analyze Confidential and Proprietary © Copyright 2013 Benchmarking Solr Performance June 18, 2014 Timothy Potter
  • 2. Confidential and Proprietary © Copyright 2013 My SolrCloud Experience • At LucidWorks, mostly focused on hardening SolrCloud; Lucene/Solr committer • Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18 shards ~900M docs) • Built a Fabric/boto framework for deploying and managing a cluster in EC2 • Co-author of Solr In Action
  • 3. Confidential and Proprietary © Copyright 2013 Agenda • Indexing performance tests • Solr Scale Toolkit • Next steps
  • 4. Confidential and Proprietary © Copyright 2013 Cluster sizing How many servers do I need to index X docs? ... shards ... ? ... replicas ... ? I need N queries per second over M docs, how many servers do I need? It depends?!?
  • 5. Confidential and Proprietary © Copyright 2013 Methodology • Transparent repeatable results – Ideally hoping for something owned by the community • Synthetic docs ~ 1K each on disk, mix of field types – Data set created using code borrowed from PigMix – English text fields generated using a Zipfian distribution • Java 1.7u55, Amazon Linux, r3.2xlarge nodes – enhanced networking enabled, placement group, same AZ • Stock Solr (cloud) 4.8.1 – Using Shawn Heisey’s GC tuning parameters • Use Elastic MapReduce to generate load – As many nodes as I need to drive Solr!
  • 6. Confidential and Proprietary © Copyright 2013 Indexing Results Cluster Size # of Shards # of Replicas Reducers Time (secs) Docs / sec 10 10 1 48 1762 73,780 10 10 2 34 3727 34,881 10 20 1 48 1282 101,404 10 20 2 34 3207 40,536 10 30 1 72 1070 121,495 10 30 2 60 3159 41,152 15 15 1 60 1106 117,541 15 15 2 42 2465 52,738 15 30 1 60 827 157,195 15 30 2 42 2129 61,062
  • 7. Confidential and Proprietary © Copyright 2013 Direct Updates Indexing Client 1 CloudSolrServer (SolrJ) ZooKeeper /clusterstate.json Shard 1 (leader) Shard 2 (leader) Shard 3 (leader) <doc> <doc> Watch /clusterstate.json <doc> <doc> compute shard assignment on clientbatch
  • 8. Confidential and Proprietary © Copyright 2013 Replication CloudSolrServer (SolrJ) ZooKeeper /clusterstate.json Shard 1 (leader) Shard 2 (leader) Shard 3 (leader) <doc> <doc> Watch /clusterstate.json <doc> Shard 1 (replica) Shard 2 (replica) Shard 3 (replica) Blocks for response from replica(s)
  • 9. Confidential and Proprietary © Copyright 2013 Don’t swamp your servers!
  • 10. Confidential and Proprietary © Copyright 2013 Lessons Learned • Know what throughput your client side is capable of generating – If in MapReduce, index from reducers with speculative execution disabled • Don’t change Solr config without good reasons for doing so • Overshard (but not too much) • Near-linear scalability as I added nodes!
  • 11. Confidential and Proprietary © Copyright 2013 Query Performance Tests • All nodes in SolrCloud perform indexing and execute queries • Using the TermsComponent to build queries based on the terms in each field. • Harder to accurately simulate user queries over synthetic data – Need mix of faceting, paging, sorting, grouping, boolean clauses, range queries, boosting, filters (some cached, some not), etc ... • Does the randomness in your test queries model (expected) user behavior? • Start with one server (1 shard) to determine baseline query performance. – Look for inefficiencies in your schema and other config settings
  • 12. Confidential and Proprietary © Copyright 2013 Solr Scale Toolkit • Fabric / Python based toolset for deploying and managing SolrCloud clusters • SolrJ-based client application useful for building tools that need access to cluster state information in ZooKeeper • Code to support benchmarks for Solr
  • 13. Confidential and Proprietary © Copyright 2013 Python-based Tools boto – Python API for AWS (EC2, S3, etc) Fabric – Python-based tool for automating system admin tasks over SSH pysolr – Python library for Solr (sending commits, queries, ...) kazoo – Python client tools for ZooKeeper Supporting Cast: JMeter – run tests, generate reports collectd – system monitoring Logstash4Solr – log aggregation JConsole/VisualVM – monitor JVM during indexing / queries
  • 14. Confidential and Proprietary © Copyright 2013 Solr Scale Toolkit: Demo • Launch a meta node – Log agg / basic monitoring using SiLK • Launch ZooKeeper Ensemble – 3 nodes to establish quorum – Setup cron job to clean-up snapshots • Launch SolrCloud cluster • Create new collection and index some docs – Attach JConsole while indexing • Run a healthcheck on the collection • Checkout Banana Dashboard • Backup / Restore – Requires patch for SOLR-5956 – Use fab patch_jars to update jars and do a rolling restart
  • 15. Confidential and Proprietary © Copyright 2013 • Custom built AMI? • Block device mapping – dedicated disk per Solr node • Launch and then poll status until they are live – verify SSH connectivity • Tag each instance with a cluster ID and username Provisioning machines fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
  • 16. Confidential and Proprietary © Copyright 2013 • Two options: – provision 1 to N nodes when you launch Solr cluster – use existing named ensemble • Fabric command simply creates the myid files and zoo.cfg file for the ensemble – and some cron scripts for managing snapshots • Basic health checking of ZooKeeper status: – echo srvr | nc localhost 2181 ZooKeeper fab new_zk_ensemble:zk1,n=3
  • 17. Confidential and Proprietary © Copyright 2013 • Upload a BASH script that starts/stops Solr • Set system props: jetty.port, host, zkHost, JVM opts • One or more Solr nodes per machine • JVM mem opts dependent on instance type and # of Solr nodes per instance • Optionally configure log4j.properties to append messages to Rabbitmq for Logstash4Solr integration SolrCloud fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
  • 18. Confidential and Proprietary © Copyright 2013 • BASH script that implements: – start/stop Solr nodes on each EC2 instance – sets JVM memory options, system properties (jetty.port), enable remote JMX, etc – backup log files before restarting nodes – ensure JVM is killed correctly before restarting • Environment variables in: solr-ctl-env.sh solr-ctl.sh
  • 19. Confidential and Proprietary © Copyright 2013 • Deploy a configuration directory to ZooKeeper • Create a new collection • Attach a local JConsole/VisualVM to a remote JVM • Rolling restart (with Overseer awareness) • Build Solr locally and patch remote – Use a relay server to scp the JARs to Amazon network once and then scp them to other nodes from within the network • Put/get files • Grep over all log files (across the cluster) Miscellaneous Utility Tasks
  • 20. Confidential and Proprietary © Copyright 2013 • fab mine: See clusters I’m running (or for other users too) • fab kill_mine: Terminate all instances I’m running – Use termination protection in production • fab ssh_to: Quick way to SSH to one of the nodes in a cluster • fab stop/recover/kill: Basic commands for controlling specific Solr nodes in the cluster • fab jmeter: Execute a JMeter test plan against your cluster – Example test plan and Java sampler is included with the source Other useful stuff ...
  • 21. Confidential and Proprietary © Copyright 2013 • Java-based command-line application that uses SolrJ’s CloudSolrServer to perform advanced cluster management operations: – healthcheck: collect metadata and health information from all replicas for a collection from ZooKeeper – backup: create a snapshot of each shard in a collection for backing up to remote storage (S3) • Framework for building complex tools that benefit from having access to cluster state information in ZooKeeper SolrCloud Tools (SolrJ client app) ./tools.sh –tool healthcheck
  • 22. Confidential and Proprietary © Copyright 2013 SiLK Integration • SiLK: Solr integrated with Logstash and Kibana – Index time-series data, such as log data (collectd, Solr logs, ...) – Build cool dashboards with Banana (fork of Kibana) • Easily aggregate all WARN and more severe log messages from all Solr servers into logstash4solr • Send collectd metrics to logstash4solr
  • 23. Confidential and Proprietary © Copyright 2013 SiLK Integration
  • 24. Confidential and Proprietary © Copyright 2013 What’s Next? • Migrate to using Apache libcloud instead of using boto directly • Benchmark mixed work-loads (queries and indexing) • SiLK is improving rapidly! • Chaos monkey tests – integrate jepsen? • Open source so please kick the tires!
  • 25. Confidential and Proprietary © Copyright 2013 Wrap-up • Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk • LucidWorks: http://www.lucidworks.com • SiLK: http://www.lucidworks.com/lucidworks-silk/ • Solr In Action: http://www.manning.com/grainger/ • Connect: @thelabdude / [email protected] Questions?

Editor's Notes

  • #5: Yes, it does, but we need to start somewhere
  • #8: more shards == better indexing throughput (if your servers can handle it)
  • #9: replication is as fast as your slowest replica replicas have to re-analyze documents future – would like to send pre-analyzed docs between leader and replica, esp if text analysis is complex
  • #10: Not much capacity available for running queries on this node
  • #12: First couple of passes, I either had really bad performance (too random) or really fast (not random enough)
  • #14: Make it very easy to launch and manage SolrCloud clusters in Amazon of sizes 1 node to 100’s User has basic control over instance type, # of instances, # of nodes per instance, ZooKeeper ensemble Doesn’t have to care about how to start each Solr, how to connect it with ZooKeeper, host names / IPs, etc.
  • #16: Custom AMI that we just need to “start” stuff on is preferred to configuring a barebones instance each time