Building responsive Symbology & Suggest web service with MongoDB
Andrei Palchys, @apalchys
Alex Kosau, @alexkosau
Introduction
• Customer: Thomson Reuters
• Business domain: Financial markets
• Goal: Implement Next-Gen financial web
  services
• The project started: July 2011
• The project finished: (Dec 2011)
• Team: 1 team lead, 5+1 developers, 2 QA
Web services
• Symbology Web Service
Provides reference data about financial instruments, via symbols,
codes or instrument names




• Suggest Web Service
Architecture
Old: Sources -> ETL -> Search Engine -> Web services -> Front End -> Desktop
New: Sources -> ETL -> The New Web Services -> Desktop
Reasons to write the new web services
• Bad performance

• Expensive to scale or extend

• Some types of data are hard to manage
Requirements for the new web services
• Performance
    95% of Symbology requests should complete within 50 ms.
    95% of Suggest requests should complete within 25 ms.

• Use normalized data

• Use as little memory as possible

• Fast data loading into DB

• Windows environment and .Net platform
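
One way to check a target such as "95% of requests within 50 ms" is to compute the 95th-percentile latency from measured samples. The sketch below uses the nearest-rank method; the function names and the sample latencies are made up for illustration and are not taken from the project.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct (1..100)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 gives the 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def meets_target(latencies_ms, limit_ms, pct=95):
    """True if pct% of the requests finished within limit_ms."""
    return percentile(latencies_ms, pct) <= limit_ms


# Hypothetical measurements in milliseconds.
symbology_latencies = [12, 8, 15, 9, 11, 48, 10, 7, 13, 14,
                       9, 10, 12, 11, 8, 16, 9, 10, 70, 12]
```

With these sample numbers a single 70 ms outlier does not break the 50 ms target, because the requirement only constrains the fastest 95% of requests.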
What we considered from commercial databases
• Microsoft SQL Server
  • 13 ms, too slow



• Oracle TimesTen
  • Relational
  • Completely in-memory: guaranteed latency but slow startup
  • Expensive



• McObject's eXtremeDB
  • Object DB
  • Native C interface: designed for performance
  • Very high reliability
  • Still expensive
What we considered from free databases
 • Redis
 • HBase
 • Cassandra
 • RavenDB

     Each of these databases misses at least one of the requirements
MongoDB
• Document-oriented
• Simple to use (a decent .NET interface is available)
• Simple to maintain (monitoring, replication, sharding)
• Data stays in memory once it has been used
• 1 ms average response time
• Cross-platform (native Windows support)
Databases
• Symbology DB – about 30 GB of data
• Suggest DB – more than 22 GB of data

[Diagram: Symbology DB is used by Symbology WS and Suggest WS; Suggest DB is used by Suggest WS]
Deployment (planned)
• 6 “clusters” all around the world (TR data
  centers), in replica set.

• “cluster” – 3 servers (replica set +
  sharding) + 1 arbiter

• 2 of them are also used to load data.

• 128GB of memory per server
Symbology DB: challenge
• Fast search by full key
• Minimize the space taken by the data, since
  we need it to fit into RAM
  •   Data is text only (no pictures etc.)
  •   The full document is always required
  •   Only some fields are used to query data, and these fields are short (3..10 characters)
  •   It should be easy to add new fields to the "queryable" list

• Composite queries are needed sometimes
  • AB and CD and not EF or GH

• Fast data loading
Symbology DB: solution
Map the names of the document fields to ints
RIC -> 1
Name -> 2

{
    "1":   "GOOG.O",
    "2":   "Google"
}
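
The renaming step can be sketched as a pair of lookup tables applied before storing and after reading a document. This is only an illustration: the slide gives just the two mappings RIC -> 1 and Name -> 2, and the helper names `shorten` and `expand` are hypothetical.

```python
# Mapping from the slide; a real table would cover every field.
FIELD_IDS = {"RIC": 1, "Name": 2}
FIELD_NAMES = {number: name for name, number in FIELD_IDS.items()}


def shorten(doc):
    """Replace readable field names with short numeric keys.

    Keys are kept as strings because document keys are strings.
    """
    return {str(FIELD_IDS[name]): value for name, value in doc.items()}


def expand(doc):
    """Inverse of shorten(): restore the readable field names."""
    return {FIELD_NAMES[int(key)]: value for key, value in doc.items()}


stored = shorten({"RIC": "GOOG.O", "Name": "Google"})
```

Since every stored document repeats its field names, shorter keys reduce the size of each document, which matters when the whole data set has to fit in RAM.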
Symbology DB: solution
Unite all queryable fields into arrays

• Query syntax is the same
• Single index – less space occupied
• Easy to add new searchable data


"s":[{
           "k": 1,    "v": "MSFT.O"
     },{
           "k": 2,    "v": "Microsoft Inc."
     }
]
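
A minimal sketch of how the "s" array might be built from a source record. The `QUERYABLE` table and both helper names are assumptions for illustration; `find` is only an in-memory stand-in for a database query that matches one array element on both "k" and "v".

```python
# Hypothetical list of queryable fields and their numeric ids.
QUERYABLE = {"RIC": 1, "Name": 2}


def to_search_array(record):
    """Collect every queryable field of a record into one list of k/v pairs."""
    return [{"k": key_id, "v": record[name]}
            for name, key_id in QUERYABLE.items() if name in record]


def find(docs, key_id, value):
    """Return documents whose "s" array has an element with this k and v."""
    return [doc for doc in docs
            if any(e["k"] == key_id and e["v"] == value for e in doc["s"])]


docs = [
    {"s": to_search_array({"RIC": "MSFT.O", "Name": "Microsoft Inc."})},
    {"s": to_search_array({"RIC": "GOOG.O", "Name": "Google"})},
]
```

Making another field searchable only means adding an entry to the table; the array field and its single index stay the same.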
Symbology DB: solution
Combine key and value properties
• Takes less space
• Query with a prefix regex such as /^a../
• No performance decrease: MongoDB can use the index for a
  case-sensitive regex that is anchored at the start with ^

"s":[
        "MSFT.O|1",
        "Microsoft Inc.|2"
]

Query: { s: { $regex: "^MSFT\\.O\\|" } }
(. and | are regex metacharacters, so they are escaped to keep the pattern a literal prefix)
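
The combined "value|key" strings and the matching prefix pattern can be sketched as follows. The helper names are hypothetical. Note that `.` and `|` are regex metacharacters, so the prefix has to be escaped: an unescaped `|` at the end of the pattern adds an empty alternative that matches every string.

```python
import re


def pack(value, key_id):
    """Combine a value and its numeric field id into one string."""
    return f"{value}|{key_id}"


def prefix_pattern(value):
    """Anchored pattern matching entries for this value under any field id."""
    return "^" + re.escape(value + "|")


entries = [pack("MSFT.O", 1), pack("Microsoft Inc.", 2)]
pattern = prefix_pattern("MSFT.O")
```

The escaped pattern matches "MSFT.O|1" but not "MSFTXO|1", whereas the unescaped form "^MSFT.O|" also matches unrelated entries such as "GOOG.O|1".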
Symbology DB: solution
    Compress non-queryable data and store it as a
    single field (binary data)
       • Encode with Protocol Buffers or MsgPack
          – In our case, MsgPack was 2x faster than Protobuf

       • Compress with Snappy
          – Optimized for speed rather than compression ratio

{
    "b" : BinData(0,"CgcxMDkwMzcwEgZ1cztJQk0xAAAAAAAA8D86A05ZU0IXTmV3IFlvcmsgU3RvY2sgRXhjaGFuZ2VZAAAAAAAA8D9gAXABeAGJAQAAAAAAAPA/ogEFNDc0MU6qAQU0NzQxTrI…")
}
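
The encode-then-compress step can be illustrated with the Python standard library. This is a stand-in, not the stack from the slide: JSON replaces MsgPack/Protocol Buffers and zlib replaces Snappy, because those need third-party packages. The payload field names are invented for the example.

```python
import json
import zlib


def pack_payload(fields):
    """Serialize the non-queryable fields and compress them into one blob.

    Stand-ins: json instead of MsgPack/Protobuf, zlib instead of Snappy.
    """
    raw = json.dumps(fields, separators=(",", ":"), sort_keys=True)
    return zlib.compress(raw.encode("utf-8"))


def unpack_payload(blob):
    """Reverse pack_payload(): decompress, then deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical non-queryable fields of one instrument.
payload = {"exchange": "New York Stock Exchange", "currency": "USD", "lot": 1}
document = {"s": ["IBM.N|1"], "b": pack_payload(payload)}
```

The queryable strings stay in "s" in plain form, while everything else travels as an opaque binary field that the web service unpacks after the lookup.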
Symbology DB: solution
Change the ETL output format to JSON and insert
directly into MongoDB

This reduced loading time from 9h to
1h.
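
A minimal sketch of what a JSON-emitting ETL step could look like, assuming the output is written as one JSON document per line, a layout that bulk-import tools such as mongoimport accept. The slides do not say which loader or file layout was actually used; the function name and sample documents are illustrative.

```python
import io
import json


def write_json_lines(docs, out):
    """Write each document as one line of JSON; return the number written."""
    count = 0
    for doc in docs:
        out.write(json.dumps(doc, ensure_ascii=False))
        out.write("\n")
        count += 1
    return count


buffer = io.StringIO()  # stands in for the ETL output file
written = write_json_lines(
    [{"s": ["GOOG.O|1", "Google|2"]}, {"s": ["MSFT.O|1", "Microsoft Inc.|2"]}],
    buffer,
)
lines = buffer.getvalue().splitlines()
```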
Suggest DB: challenge
• Fast search by partial text

• Keep only top 50 entities per term

• Generate Suggest DB from existing
  Symbology DB
Suggest DB: solution
Use an "inverted" index for fast search by partial
text

{"term": "g", "references": […]},
{"term": "go", "references": […]},
{"term": "goo", "references": […]},
{"term": "goog", "references": […]},
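
The term documents above can be thought of as one entry per prefix of each searchable name. The sketch below builds such an index in memory; the function names and the two sample instruments are assumptions made for illustration.

```python
def prefixes(text):
    """All leading substrings of text, lowercased: 'Goog' -> g, go, goo, goog."""
    lowered = text.lower()
    return [lowered[:end] for end in range(1, len(lowered) + 1)]


def build_index(names):
    """Map each prefix to the list of reference ids whose name starts with it.

    names: dict of reference id -> display name.
    """
    index = {}
    for ref, name in names.items():
        for term in prefixes(name):
            index.setdefault(term, []).append(ref)
    return index


index = build_index({"GOOG.O": "Google", "GS.N": "Goldman Sachs"})
```

A suggest request then becomes a single exact-match lookup on the typed text, for example `index["go"]`, instead of a scan over all names.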
Suggest DB: solution
Generate Suggest DB from existing Symbology DB
  • About 750 million temporary documents
  • MongoDB Map Reduce is too slow
  • All MongoDB-based algorithms take a lot of
    time

  Use Amazon Elastic MapReduce!
  10h -> 40 mins
  Practical usage Amazon Elastic MapReduce (Viktar Basharymau)
  http://bit.ly/usage_mapreduce
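
The generation job can be pictured as a map phase that emits one pair per prefix and a reduce phase that keeps the best entries per term. The sketch below runs both phases in a single process. It is not the Elastic MapReduce job itself, and the numeric `rank` used to pick the top entries is an assumption, since the slides do not say how the top 50 were chosen.

```python
import heapq
from collections import defaultdict

TOP_N = 50  # "keep only top 50 entities per term"


def map_phase(instruments):
    """Emit (term, (rank, ref)) for every prefix of every instrument name."""
    for ref, name, rank in instruments:
        lowered = name.lower()
        for end in range(1, len(lowered) + 1):
            yield lowered[:end], (rank, ref)


def reduce_phase(pairs, top_n=TOP_N):
    """Group pairs by term and keep the top_n highest-ranked references."""
    grouped = defaultdict(list)
    for term, scored in pairs:
        grouped[term].append(scored)
    return {term: [ref for _, ref in heapq.nlargest(top_n, scored)]
            for term, scored in grouped.items()}


# Hypothetical instruments: (reference id, name, rank).
instruments = [("GOOG.O", "Google", 90), ("GS.N", "Goldman Sachs", 80),
               ("GPRO.O", "GoPro", 10)]
suggest = reduce_phase(map_phase(instruments), top_n=2)
```

Because each term is reduced independently, the work splits naturally across many machines, which is what makes a distributed MapReduce service a reasonable fit for this step.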
.Net MongoDB driver
- Use IBsonSerializer interface instead of
  BsonElement attributes
- Driver has good performance – we have not
  found any bottlenecks.
Questions?

Meetup#2: Building responsive Symbology & Suggest WebService


Editor's Notes

  • #3: In a "couple" of words
  • #5: How the data traveled from the data source to the end user.