PistonHead's use of MongoDB for Analytics - Andrew Morgan
Haymarket Media Group is building a reporting and analytics suite called PistonHub to provide dealers and administrators with insights into classifieds and stock performance data. PistonHub will aggregate data from various sources like classifieds, calls, emails, and stock information to generate daily statistics for each dealer that can be viewed on a dashboard. This consolidated data will give dealers and sales teams the visibility they need to improve performance. The initial feedback on PistonHub has been positive for providing extra insights.
One of MongoDB’s primary appeals to developers is that it gives them the ability to start application development without needing to define a formal, up-front schema. Operations teams appreciate the fact that they don't need to perform a time-consuming schema upgrade operation every time the developers need to store a different attribute (as an example, The Weather Channel is now able to launch new features in hours whereas it used to take weeks). For business leaders, the application gets launched much faster, and new features can be rolled out more frequently. MongoDB powers agility.
Some projects reach a point where it's necessary to define rules on what's being stored in the database – for example, that for any document in a particular collection, you can be assured that certain attributes are present.
To address the challenges discussed above, while at the same time maintaining the benefits of a dynamic schema, MongoDB 3.2 introduces document validation.
There is significant flexibility to customize which parts of the documents are **and are not** validated for any collection.
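As a sketch of what such a rule might look like, MongoDB 3.2 validators are expressed as query documents attached to a collection. The collection and field names below are illustrative, not from the article, and the small checker only approximates the semantics of the two operators it uses:

```python
import re

# A MongoDB 3.2-style validator, expressed as a query document. With a live
# server it would be attached via:
#   db.create_collection("contacts", validator=validator)
validator = {
    "$and": [
        {"name": {"$type": "string"}},   # name must exist and be a string
        {"email": {"$regex": "@"}},      # email must exist and contain an '@'
    ]
}

def passes(doc, rules=validator):
    """Tiny pure-Python approximation of the validator's semantics,
    covering only the $type and $regex operators used above."""
    for clause in rules["$and"]:
        ((field, cond),) = clause.items()
        value = doc.get(field)
        if "$type" in cond and not isinstance(value, str):
            return False
        if "$regex" in cond:
            if not isinstance(value, str) or not re.search(cond["$regex"], value):
                return False
    return True
```

Documents that satisfy every clause are accepted on insert or update; documents that miss a required attribute are rejected, while attributes not mentioned in the validator remain free-form.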
Data Management 3: Bulletproof Data Management - MongoDB
This session focuses on delivering operationally robust deployments of MongoDB via specific design capabilities and varying data feeds. Learn how to use services or driver wrappers to unify design patterns for managing data. This talk will address the following questions:
How do you enforce a schema?
How do you redact or remove sensitive data in queries and feeds?
How do you detect and police "out of profile" queries and make sure they do not threaten your system?
The document discusses using MongoDB as a tick store for financial data. It provides an overview of MongoDB and its benefits for handling tick data, including its flexible data model, rich querying capabilities, native aggregation framework, ability to do pre-aggregation for continuous data snapshots, language drivers and Hadoop connector. It also presents a case study of AHL, a quantitative hedge fund, using MongoDB and Python as their market data platform to easily onboard large volumes of financial data in different formats and provide low-latency access for backtesting and research applications.
The integration between the Spring Framework and MongoDB tends to be somewhat unknown. This presentation covers the projects that make up the Spring ecosystem (Spring Data, Spring Boot, Spring IO, and others) and how to combine them, from pure Java projects up to massive enterprise systems that require these components to interact.
MongoDB .local Chicago 2019: Practical Data Modeling for MongoDB: Tutorial - MongoDB
For 30 years, developers have been taught that relational data modeling was THE way to model, but as more companies adopt MongoDB as their data platform, the approaches that work well in relational design actually work against you in a document model design. In this talk, we will discuss how to conceptually approach modeling data with MongoDB, focusing on practical foundational techniques, paired with tips and tricks, and wrapping with discussing design patterns to solve common real world problems.
Python and MongoDB as a Market Data Platform by James Blackburn - PyData
This document discusses using Python and MongoDB as a scalable platform for storing time series market data. It outlines some of the challenges of storing different types and sizes of financial data from various sources. The goals are to access 10 years of 1-minute data in under 1 second, store all data types in a single location, and have a system that is fast, complete, scalable, and supports agile development. MongoDB is chosen as it matches well with Python and allows for fast, low-latency access. The system implemented uses MongoDB to store data in a way that supports versioning, arbitrary data types, and efficient storage and retrieval of pandas DataFrames. Performance tests show significant speed improvements over SQL and other tick databases for accessing intraday data.
Learn how you can enjoy the developer productivity, low TCO, and unlimited scale of MongoDB as a tick database for capturing, analyzing, and taking advantage of opportunities in tick data. This presentation will illustrate how MongoDB can easily and quickly store variable data formats, like top and depth of book, multiple asset classes, and even news and social networking feeds. It will explore aggregating and analyzing tick data in real-time for automated trading or in batch for research and analysis, and how auto-sharding enables MongoDB to scale with commodity hardware to satisfy unlimited storage and performance requirements.
Implementing and Visualizing Clickstream data with MongoDB - MongoDB
Having recently implemented a new framework for the real-time collection, aggregation and visualization of web and mobile generated clickstream traffic (realizing daily clickstream volumes of 1M+ events), this walkthrough covers the motivations, thought process and key decisions made, as well as an in-depth look at how to build out a data-collection, analytics and visualization framework using MongoDB. Technologies covered in this presentation (alongside MongoDB) are Java, Spring, Django and PyMongo.
Joins and Other Aggregation Enhancements Coming in MongoDB 3.2 - MongoDB
Applications get great efficiency from MongoDB by combining data that is accessed together into a single document. There are however situations where it is more efficient to have references between documents rather than embedding everything into a single document. This led to joins being our most requested feature. MongoDB 3.2 addresses this through the introduction of the $lookup stage in the aggregation pipeline to implement left-outer joins.
This webinar looks at $lookup as well as the other significant aggregation enhancements coming with MongoDB 3.2—why they're needed, what they deliver, and how to use them.
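To make the new stage concrete, here is the shape of a `$lookup` stage along with a pure-Python emulation of its left-outer-join semantics. The collection and field names are hypothetical, chosen only for illustration:

```python
# The shape of a $lookup stage (MongoDB 3.2+); names are illustrative.
lookup_stage = {
    "$lookup": {
        "from": "addresses",          # collection to join with
        "localField": "address_id",   # field in the input documents
        "foreignField": "_id",        # field in the 'from' collection
        "as": "address_docs",         # output array field
    }
}

def lookup(local_docs, foreign_docs, spec=lookup_stage["$lookup"]):
    """Pure-Python emulation of $lookup's left-outer-join semantics:
    every input document is kept; matches land in an array (possibly empty)."""
    out = []
    for doc in local_docs:
        matches = [f for f in foreign_docs
                   if f.get(spec["foreignField"]) == doc.get(spec["localField"])]
        out.append({**doc, spec["as"]: matches})
    return out
```

The key point the webinar makes is visible here: an input document with no match is not dropped, it simply carries an empty array, which is what makes the join left-outer.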
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data - MongoDB
Immediate feedback is an essential part of modern application development where developers want to sync across platforms, systems, and users to provide better end-user experiences. Change streams empower developers to easily leverage the power of MongoDB's internal real-time functionality to react to relevant data changes immediately. Change streams also provide the backbone of MongoDB Atlas triggers. This session introduces change streams and walks you through developing with them. We will discuss use cases, integrating with Kafka, and explore how to make good architectural decisions around this new functionality.
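A minimal sketch of the developer-facing side of this, assuming a hypothetical `orders` collection: a change stream takes an aggregation-style filter pipeline, and the event fields shown (`operationType`, `documentKey`, `fullDocument`) are part of the change event format. The PyMongo call itself needs a replica set or Atlas cluster, so it appears only as a comment:

```python
# A change-stream filter pipeline: only surface inserts and updates.
change_filter = [
    {"$match": {"operationType": {"$in": ["insert", "update"]}}},
    {"$project": {"documentKey": 1, "fullDocument": 1, "operationType": 1}},
]

# With PyMongo against a replica set or Atlas cluster, this pipeline would
# be consumed roughly like so (not executed here, since it needs a server):
#
#   with db.orders.watch(change_filter) as stream:
#       for event in stream:
#           react_to(event["operationType"], event["fullDocument"])
```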
The document discusses building a CouchDB application to store human protein data, describing how each protein document would contain information like name, sequence, and other defining features extracted from public databases. It provides an example protein document to demonstrate the type of data that would be stored.
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds - Andrew Morgan
Understand how you can get the benefits you're looking for from NoSQL data stores without sacrificing the power and flexibility of the world's most popular open source database - MySQL.
Hermes: Free the Data! Distributed Computing with MongoDB - MongoDB
Moving data throughout an organization is an art form. Whether mastering the art of ETL or building microservices, we are often left with either business logic embedded where it doesn't belong or monolithic apps that do too much. In this talk, we will show you how we built a persisted messaging bus to 'Free the Data' from the apps, making it available across the organization without having to write custom ETL code. This in turn makes it possible for business apps to be standalone, testable and more reliable. We will discuss the basic architecture and how it works, go through some code samples (server side and client side), and present some statistics and visualizations.
How Thermo Fisher Is Reducing Mass Spectrometry Experiment Times from Days to... - MongoDB
Mass spectrometry is the gold standard for determining chemical compositions, with spectrometers often measuring the mass of a compound down to a single electron. This level of granularity produces an enormous amount of hierarchical data that doesn't fit well into rows and columns. In this talk, learn how Thermo Fisher is using MongoDB Atlas on AWS to allow their users to get near real-time insights from mass spectrometry experiments—a process that used to take days. We also share how the underlying database service used by Thermo Fisher was built on AWS.
Data Management 2: Conquering Data Proliferation - MongoDB
Today's customers demand applications which integrate intelligently with data from mobile, social media and cloud sources. A system of engagement meets these expectations by applying data and analytics drawn from an array of master systems. The enormous scale and performance required overwhelm relational approaches, but we can use MongoDB to meet the challenge. We'll learn to capture and transmit data changes among disparate systems, expose batch data as interactive operational queries and build systems with strong division of concerns, agility and flexibility.
The document discusses the FCC's Measuring Broadband America project which collects mobile broadband speed test data from over 100,000 Android and 50,000 iOS device installs of their app. It describes how they store the speed test results in MongoDB and use aggregation and geo-spatial queries to analyze the data and build heat maps visualizing carrier performance. Their goal is to release these visualizations and the underlying data and source code publicly in autumn 2014 to provide transparency into mobile broadband speeds.
MongoDB and Hadoop: Driving Business Insights - MongoDB
MongoDB and Hadoop can work together to solve big data problems facing today's enterprises. We will take an in-depth look at how the two technologies complement and enrich each other with complex analyses and greater intelligence. We will take a deep dive into the MongoDB Connector for Hadoop and how it can be applied to enable new business insights with MapReduce, Pig, and Hive, and demo a Spark application to drive product recommendations.
MongoDB has been conceived for the cloud age. Making sure that MongoDB is compatible and performant across cloud providers is mandatory to achieve complete integration with platforms and systems. Azure is one of the biggest IaaS platforms available and very popular among developers who work on the Microsoft stack.
This document provides examples of using aggregations in Elasticsearch to calculate statistics and group documents. It shows terms, range, and histogram facets/aggregations to group documents by fields like state or population range and calculate statistics like average density. It also demonstrates nesting aggregations to first group by one field like state and then further group and calculate stats within each state group. Finally it lists the built-in aggregation bucketizers and calculators available in Elasticsearch.
MongoDB .local Paris 2020: Adéo @MongoDB: MongoDB Atlas & Leroy Merlin: et ... - MongoDB
Adeo, and in particular Leroy Merlin, make heavy use of MongoDB to power numerous applications, including the leroymerlin.fr website.
Emmanuel Dieval, Software Engineer at ADEO, presents OPUS, the new system at the heart of publishing the Leroy Merlin product offer.
OPUS relies heavily on MongoDB to build product-family pages while sustaining a large daily data flow.
After a refresher on aggregation pipelines and a presentation of MongoDB Atlas by Maxime Beugnet, Developer Advocate at MongoDB, Emmanuel discusses the use of aggregation pipelines to build product-family pages, as well as Google Cloud Platform and the advantages of using MongoDB Atlas.
Webinar: Building Your First App with MongoDB and Java - MongoDB
The document discusses building Java applications that use MongoDB as the database. It covers connecting to MongoDB from Java using the driver, designing schemas for embedded documents and arrays, building Java objects to represent and insert data, and performing basic operations like inserts. The document also mentions using an object-document mapper like Morphia to simplify interactions between Java objects and MongoDB documents.
IBM Cloudant is a NoSQL Database-as-a-Service. Discover how you can outsource the data layer of your mobile or web application to Cloudant to provide high availability, scalability and tools to take you to the next level.
MongoDB .local Toronto 2019: MongoDB Atlas Search Deep Dive - MongoDB
Come and hear more about our new full-text search operator for MongoDB Atlas. This is a significant enhancement to MongoDB search features and is the easiest and most powerful full-text search solution for databases on MongoDB Atlas.
This talk is important for anyone who has implemented search or is considering a search feature in their MongoDB application.
You will see a demo of $searchBeta, learn about how it works, discover specific features to help you deliver relevant search results, and learn how you can start using full-text search in your application today.
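For orientation, a hedged sketch of roughly what a `$searchBeta` stage looked like in an aggregation pipeline around the time of this talk (the operator was later renamed `$search`); the query text and field path below are made up for illustration:

```python
# Approximate shape of a $searchBeta aggregation stage; the query string
# and the "description" field path are illustrative assumptions.
search_stage = {
    "$searchBeta": {
        "search": {
            "query": "diesel hatchback",   # terms to search for
            "path": "description",         # document field(s) to search
        }
    }
}

# The stage runs first in a normal aggregation pipeline:
pipeline = [search_stage, {"$limit": 10}]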
What's the Scoop on Hadoop? How It Works and How to WORK IT! - MongoDB
MongoDB and Hadoop work powerfully together as complementary technologies. Learn how the Hadoop connector allows you to leverage the power of MapReduce to process data sourced from your MongoDB cluster.
MongoDB Evenings Dallas: What's the Scoop on MongoDB & Hadoop - MongoDB
What's the Scoop on MongoDB & Hadoop
Jake Angerman, Sr. Solutions Architect, MongoDB
MongoDB Evenings Dallas
March 30, 2016 at the Addison Treehouse, Dallas, TX
Analyze and visualize non-relational data with DocumentDB + Power BI - Sriram Hariharan
This session will show how to analyze and visualize non-relational data with DocumentDB + Power BI. We are in the midst of a paradigm shift in how we store and analyze data. Unstructured or flexible-schema data represents a large portion of data within an organization, and everyone wants to turn this data into meaningful business information. Unstructured data analytics does not need to be time-consuming and complex. Come learn how to analyze and visualize unstructured data in DocumentDB.
Evolve Your Data Access with MongoDB Stitch - MongoDB
You have valuable data in MongoDB, and while it is important to use that data to deliver value to your users and customers, doing so safely and securely can prove difficult. In this session, you will learn how to easily connect your users to the data they need using MongoDB Stitch.
Webinar: Application Development with MongoDB, Part 5: Reporting & Aggregation - MongoDB
This document provides an agenda for a MongoDB basics session. It will cover reporting and aggregation options in MongoDB like MapReduce, the Aggregation Framework, and examples of using aggregation for common reports like popular tags, popular articles, and aggregating geospatial data. It also discusses using the aggregation framework pipeline and operators to build these reports and tuning performance with the explain plan.
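As a sketch of the kind of "popular tags" report such a session covers, here is the pipeline shape alongside a pure-Python equivalent; the article documents and field names are invented for the example:

```python
from collections import Counter

# Hypothetical articles, each carrying an array of tags.
articles = [
    {"title": "A", "tags": ["mongodb", "aggregation"]},
    {"title": "B", "tags": ["mongodb"]},
    {"title": "C", "tags": ["reporting", "mongodb"]},
]

# The aggregation pipeline a "popular tags" report would run on the server:
popular_tags_pipeline = [
    {"$unwind": "$tags"},                                   # one doc per tag
    {"$group": {"_id": "$tags", "count": {"$sum": 1}}},     # count per tag
    {"$sort": {"count": -1}},                               # most popular first
]

# Pure-Python equivalent of the same report (no server needed):
def popular_tags(docs):
    counts = Counter(tag for doc in docs for tag in doc["tags"])
    return counts.most_common()
```

`$unwind` is the step that makes array-valued fields groupable: it turns each document into one document per array element before `$group` counts them.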
1403 app dev series - session 5 - analytics - MongoDB
This document provides an agenda for a session on reporting and analytics options in MongoDB, including Map Reduce, the Aggregation Framework, and examples using geospatial and text search features. It discusses building reports in an application, tuning aggregation pipelines with explain plans, and computing aggregations on the fly or pre-computing and storing them. The next session will cover operational topics like scaling out, high availability, production preparation, and sizing.
The document discusses MongoDB and how it allows storing data in flexible, document-based collections rather than rigid tables. Some key points:
- MongoDB uses a flexible document model that allows embedding related data rather than requiring separate tables joined by foreign keys.
- It supports dynamic schemas that allow fields within documents to vary unlike traditional SQL databases that require all rows to have the same structure.
- Aggregation capabilities allow complex analytics to be performed directly on the data without requiring data warehousing or manual export/import like with SQL databases. Pipelines of aggregation operations can be chained together.
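The chaining mentioned in the last point can be sketched with a toy pipeline executor. This is not MongoDB's implementation, just a minimal illustration of the idea that each stage transforms the document stream produced by the stage before it; it handles only equality `$match`, a counting `$group`, and single-key `$sort`:

```python
# Toy illustration of pipeline chaining: each stage consumes the output of
# the previous one. Supports only equality $match, counting $group, and
# single-key $sort - enough to show the chaining idea.
def run_pipeline(docs, stages):
    for stage in stages:
        (op, spec), = stage.items()
        if op == "$match":
            docs = [d for d in docs
                    if all(d.get(k) == v for k, v in spec.items())]
        elif op == "$group":
            # Only the {"$sum": 1} counting accumulator is emulated here.
            groups = {}
            for d in docs:
                key = d.get(spec["_id"].lstrip("$"))
                groups[key] = groups.get(key, 0) + 1
            docs = [{"_id": k, "count": n} for k, n in groups.items()]
        elif op == "$sort":
            field, direction = next(iter(spec.items()))
            docs = sorted(docs, key=lambda d: d[field],
                          reverse=(direction == -1))
    return docs
```

Because stages compose, complex reports are built by stacking simple transformations rather than exporting data to an external tool.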
This presentation shows how to use the Aggregation Framework, the powerful aggregation language of MongoDB. Using real data from the USA Census, we will discover the most important operations.
Data Processing and Aggregation with MongoDB - MongoDB
The document discusses data processing and aggregation using MongoDB. It provides an example of using MongoDB's map-reduce functionality to count the most popular pub names in a dataset of UK pub locations and attributes. It shows the map and reduce functions used to tally the name occurrences and outputs the top 10 results. It then demonstrates performing a similar analysis on just the pubs located in central London using MongoDB's aggregation framework pipeline to match, group and sort the results.
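The map-reduce half of that example can be sketched in pure Python: map emits `(name, 1)` for each pub, reduce sums the emitted values per key. The sample documents stand in for the UK open-data set the talk uses:

```python
from collections import defaultdict

# Hypothetical pub documents standing in for the UK pub data set.
pubs = [
    {"name": "The Red Lion"}, {"name": "The Crown"},
    {"name": "The Red Lion"}, {"name": "The Royal Oak"},
]

def map_reduce_counts(docs, top_n=10):
    """Pure-Python analogue of the talk's map-reduce job."""
    emitted = defaultdict(list)
    for doc in docs:                         # map phase: emit(doc.name, 1)
        emitted[doc["name"]].append(1)
    reduced = {k: sum(v) for k, v in emitted.items()}   # reduce phase
    return sorted(reduced.items(), key=lambda kv: -kv[1])[:top_n]
```

The aggregation-framework version of the same report replaces the two functions with a `$group`/`$sort` pipeline, which is the comparison the document draws.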
The document provides a comparison of MongoDB query and aggregation capabilities versus Couchbase N1QL capabilities. It begins with an introduction and overview of the different approaches and use cases for each. It then delves into detailed comparisons of specific query and aggregation operations such as CRUD, nested queries, array queries, text search, and joins. Overall, it finds that while both provide robust querying, Couchbase N1QL expressions tend to be more declarative due to its basis in SQL, whereas MongoDB requires more familiarity with its syntax.
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes - MongoDB
This is the fourth webinar of a Back to Basics series that will introduce you to the MongoDB database. This webinar covers advanced indexing, including text and geospatial indexes.
This document discusses using MongoDB for risk use cases. It highlights MongoDB's support for unstructured, semi-structured and polymorphic data types as well as its ability to handle large volumes of data and queries per second. It then provides an example of how MongoDB can be used to collect and aggregate risk data from multiple sources, perform calculations in real-time, and generate pre-aggregated reports using aggregation frameworks.
Webinar: General Technical Overview of MongoDB for Dev Teams - MongoDB
In this talk we will focus on several of the reasons why developers have come to love the richness, flexibility, and ease of use that MongoDB provides. First we will give a brief introduction of MongoDB, comparing and contrasting it to the traditional relational database. Next, we’ll give an overview of the APIs and tools that are part of the MongoDB ecosystem. Then we’ll look at how MongoDB CRUD (Create, Read, Update, Delete) operations work, and also explore query, update, and projection operators. Finally, we will discuss MongoDB indexes and look at some examples of how indexes are used.
On Tuesday 18th March, the MongoDB team held an online Cloud Workshop in place of the in-person event that had been planned.
Attendees learnt how to build modern, event-driven applications powered by MongoDB Atlas in Google Cloud Platform (GCP) and were shown relevant operational and security best practices so they could get started immediately with their own digital transformations.
The document discusses various options for processing and aggregating data in MongoDB, including the Aggregation Framework, MapReduce, and connecting MongoDB to external systems like Hadoop. The Aggregation Framework is described as a flexible way to query and transform data in MongoDB using a JSON-like syntax and pipeline stages. MapReduce is presented as more versatile but also more complex to implement. Connecting to external systems like Hadoop allows processing large amounts of data across clusters.
Couchbase Tutorial: Big data Open Source Systems: VLDB2018Keshav Murthy
The document provides an agenda and introduction to Couchbase and N1QL. It discusses Couchbase architecture, data types, data manipulation statements, query operators like JOIN and UNNEST, indexing, and query execution flow in Couchbase. It compares SQL and N1QL, highlighting how N1QL extends SQL to query JSON data.
Data analytics can offer insights into your business and help take it to the next level. In this talk you'll learn about MongoDB tools for building visualizations, dashboards and interacting with your data. We'll start with exploratory data analysis using MongoDB Compass.
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
A Kafka cluster stores streams of records (messages) in categories called topics. It is the architectural backbone for integrating streaming data with a Data Lake, Microservices and Stream Processing. Today's enterprises often have their core systems implemented on top of relational databases, such as the Oracle RDBMS. Implementing a new solution supporting the digital strategy using Kafka and its ecosystem cannot always be done completely separately from the traditional legacy solutions. Often streaming data has to be enriched with state data which is held in the RDBMS of a legacy application. It's important to cache this data in the stream processing solution so that it can be efficiently joined to the data stream. But how do we make sure that the cache is kept up to date if the source data changes? We can either poll for changes using Kafka Connect or let the RDBMS push the data changes to Kafka. But what about writing data back to the legacy application – e.g. an anomaly detected inside the stream processing solution should trigger an action inside the legacy application? Using Kafka Connect we can write to a database table or view, which could trigger the action, but this is not always the best option. If you have an Oracle RDBMS, there are many other ways to integrate the database with Kafka, such as Advanced Queuing (a message broker in the database), CDC through GoldenGate or Debezium, Oracle REST Data Services (ORDS) and more. In this session, we present various blueprints for integrating an Oracle RDBMS with Apache Kafka in both directions and discuss how these blueprints can be implemented using the products mentioned above.
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafk...confluent
MongoDB offers two native data processing tools: MapReduce and the Aggregation Framework. MongoDB's built-in aggregation framework is a powerful tool for performing analytics and statistical analysis in real time and generating pre-aggregated reports for dashboarding. In this session, we will demonstrate how to use the aggregation framework for different types of data processing, including ad-hoc queries, pre-aggregated reports, and more. At the end of this talk, you should walk away with a greater understanding of the built-in data processing options in MongoDB and how to use the aggregation framework in your next project.
This document provides an overview of MongoDB, including:
- The speaker's credentials and agenda for the presentation
- Key advantages and concepts of MongoDB like its document-oriented and schemaless nature
- Products, characteristics, schema design, data modeling, installation types, and CRUD operations in MongoDB
- Data analytics using the aggregation framework and tools
- Topics like indexing, replica sets, sharded clusters, and scaling in MongoDB
- Security, Python driver examples, and resources for learning more about MongoDB
Webinar: Data Processing and Aggregation OptionsMongoDB
MongoDB scales easily to store mass volumes of data. However, when it comes to making sense of it all, what options do you have? In this talk, we'll take a look at three different ways of aggregating your data with MongoDB, and determine the reasons why you might choose one way over another. No matter what your big data needs are, you will find out how MongoDB, the big data store, is evolving to help you make sense of your data.
These are slides from our Big Data Warehouse Meetup in April. We talked about NoSQL databases: What they are, how they’re used and where they fit in existing enterprise data ecosystems.
Mike O’Brian from 10gen introduced the syntax and usage patterns for the new aggregation system in MongoDB and gave some demonstrations of aggregation using the new system. The new MongoDB aggregation framework makes it simple to do tasks such as counting, averaging, and finding minima or maxima while grouping by keys in a collection, complementing MongoDB’s built-in map/reduce capabilities.
For more information, visit our website at http://casertaconcepts.com/ or email us at [email protected].
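The counting and averaging that a `$group` stage performs can be sketched in plain JavaScript with no MongoDB server involved. The `orders` data, its field names, and the `groupByItem` helper below are all invented for illustration; this models, but does not use, the real aggregation framework.

```javascript
// Sample documents, as they might sit in an "orders" collection (made up).
const orders = [
  { item: "laptop", qty: 2, price: 1000 },
  { item: "mouse",  qty: 5, price: 20 },
  { item: "laptop", qty: 1, price: 1000 },
];

// Rough equivalent of:
//   db.orders.aggregate([{ $group: { _id: "$item",
//     count: { $sum: 1 }, avgQty: { $avg: "$qty" } } }])
function groupByItem(docs) {
  const groups = {};
  for (const d of docs) {
    // One accumulator object per distinct grouping key.
    const g = (groups[d.item] ??= { _id: d.item, count: 0, totalQty: 0 });
    g.count += 1;
    g.totalQty += d.qty;
  }
  // Finalize: turn running totals into the averages $avg would report.
  return Object.values(groups).map(g =>
    ({ _id: g._id, count: g.count, avgQty: g.totalQty / g.count }));
}

console.log(groupByItem(orders));
```

Running this groups the three documents into two result documents, one per distinct `item` value, just as `$group` would.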
In the age of digital transformation and disruption, your ability to thrive depends on how you adapt to the constantly changing environment. MongoDB 3.4 is the latest release of the leading database for modern applications, a culmination of native database features and enhancements that will allow you to easily evolve your solutions to address emerging challenges and use cases.
In this webinar, we introduce you to what’s new, including:
- Multimodel Done Right. Native graph computation, faceted navigation, rich real-time analytics, and powerful connectors for BI and Apache Spark bring additional multimodel database support right into MongoDB.
- Mission-Critical Applications. Geo-distributed MongoDB zones, elastic clustering, tunable consistency, and enhanced security controls bring state-of-the-art database technology to your most mission-critical applications.
- Modernized Tooling. Enhanced DBA and DevOps tooling for schema management, fine-grained monitoring, and cloud-native integration allow engineering teams to ship applications faster, with less overhead and higher quality.
Powering Microservices with MongoDB, Docker, Kubernetes & Kafka – MongoDB Eur...Andrew Morgan
Organisations are building their applications around microservice architectures because of the flexibility, speed of delivery, and maintainability they deliver.
Want to try out MongoDB on your laptop? Execute a single command and you have a lightweight, self-contained sandbox; another command removes all trace when you're done. Need an identical copy of your application stack in multiple environments? Build your own container image and then your entire development, test, operations, and support teams can launch an identical clone environment.
Containers are revolutionizing the entire software lifecycle: from the earliest technical experiments and proofs of concept through development, test, deployment, and support. Orchestration tools manage how multiple containers are created, upgraded and made highly available. Orchestration also controls how containers are connected to build sophisticated applications from multiple, microservice containers.
This presentation introduces you to technologies such as Docker, Kubernetes & Kafka which are driving the microservices revolution. Learn about containers and orchestration – and most importantly how to exploit them for stateful services such as MongoDB.
Data Streaming with Apache Kafka & MongoDB - EMEAAndrew Morgan
A new generation of technologies is needed to consume and exploit today's real-time, fast-moving data sources. Apache Kafka, originally developed at LinkedIn, has emerged as one of these key new technologies.
This webinar explores the use-cases and architecture for Kafka, and how it integrates with MongoDB to build sophisticated data-driven applications that exploit new sources of data.
The rise of microservices - containers and orchestrationAndrew Morgan
Organisations are building their applications around microservice architectures because of the flexibility, speed of delivery, and maintainability they deliver. In this session, the concepts behind containers and orchestration will be explained and how to use them with MongoDB.
What's new in MySQL Cluster 7.4 webinar chartsAndrew Morgan
MySQL Cluster powers the subscriber databases of major communication services providers as well as next generation web, cloud, social and mobile applications. It is designed to deliver:
- Real-time, in-memory performance for both OLTP and analytics workloads
- Linear scale-out for both reads and writes
- 99.999% High Availability
- Transparent, cross-shard transactions and joins
- Update-Anywhere Geographic replication
- SQL or native NoSQL APIs
All that while still providing full ACID transactions.
MySQL High Availability Solutions - Feb 2015 webinarAndrew Morgan
How important is your data? Can you afford to lose it? What about just some of it? What would be the impact if you couldn’t access it for a minute, an hour, a day or a week?
Different applications can have very different requirements for High Availability. Some need 100% data reliability with 24x7x365 read & write access while many others are better served by a simpler approach with more modest HA ambitions.
MySQL has an array of High Availability solutions ranging from simple backups, through replication and shared storage clustering – all the way up to 99.999% available shared nothing, geographically replicated clusters. These solutions also have different ‘bonus’ features such as full InnoDB compatibility, in-memory real-time performance, linear scalability and SQL & NoSQL APIs.
The purpose of this presentation is to help you decide where your application sits in terms of HA requirements and discover which of the MySQL solutions best fit the bill. It will also cover what you need outside of the database to ensure High Availability – state of the art monitoring being a prime example.
FOSDEM 2015 - NoSQL and SQL the best of both worldsAndrew Morgan
This document discusses the benefits and limitations of both SQL and NoSQL databases. It argues that while NoSQL databases provide benefits like simple data formats and scalability, relying solely on them can result in data duplication and inconsistent data when relationships are not properly modeled. The document suggests that MySQL Cluster provides a hybrid approach, allowing both SQL queries and NoSQL interfaces while ensuring ACID compliance and referential integrity through its transactional capabilities and handling of foreign keys.
MySQL Replication: What’s New in MySQL 5.7 and BeyondAndrew Morgan
Continuing in the footsteps of its predecessor, MySQL 5.7 is set to be a groundbreaking release. In this webinar, the engineers behind the product provide insights into what’s new for MySQL replication in the latest 5.7 Development Milestone Release and review the early access features available via labs.mysql.com. The next generation of replication features cover several technical areas such as better semi-synchronous replication, an enhanced multithreaded slave (per-transaction parallelism), improved monitoring with performance schema tables, online configuration changes, options for fine-tuning replication performance, support for more-advanced topologies with multisource replication, and much more. This is also a great chance to learn about MySQL Group Replication – the next generation of active-active, update-anywhere replication for MySQL.
NoSQL and SQL - Why Choose? Enjoy the best of both worlds with MySQLAndrew Morgan
There's a lot of excitement around NoSQL data stores, with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside comes in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds? This session shows how MySQL Cluster provides simultaneous SQL and native NoSQL access to your data, whether through a simple key-value API (Memcached), REST, JavaScript, Java or C++. You will hear how the MySQL Cluster architecture delivers in-memory real-time performance, 99.999% availability, on-line maintenance and linear, horizontal scalability through transparent auto-sharding.
MySQL Cluster - Latest Developments (up to and including MySQL Cluster 7.4)Andrew Morgan
MySQL Cluster is the distributed, shared-nothing version of MySQL. It’s typically used for applications that need any combination of high availability, real-time performance, and scaling of reads and writes. After a brief introduction to the technology, its uses, and the new features added in MySQL Cluster 7.3, this session focuses on the very latest developments happening in MySQL Cluster 7.4. As you’d expect from a real-time, scalable, distributed, in-memory database, performance continues to be a top priority, as do simplicity of use and robustness. Come hear firsthand what’s being done to make sure MySQL Cluster continues to dominate in mission-critical, high-performance applications.
NoSQL & SQL - Best of both worlds - BarCamp Berkshire 2013Andrew Morgan
The document discusses blending NoSQL and SQL databases by leveraging the strengths of both. It describes how MySQL Cluster provides massively scalable performance through its NoSQL-style data storage and replication abilities, while also supporting SQL queries, joins, and ACID transactions like a traditional relational database. This allows applications to use NoSQL for simple operations and scalability while still using SQL for complex queries and transactions as needed.
NoSQL and SQL - blending the best of both worldsAndrew Morgan
The document discusses blending NoSQL and SQL databases to take advantage of their respective strengths. NoSQL excels at scalability, performance and ease of use through simplified data models and operations, while SQL databases provide ACID transactions, complex queries and joins. The best approach is to use different database types depending on application needs rather than choosing one over the other. MySQL Cluster is presented as an example of a database that combines NoSQL and SQL features.
MySQL Cluster is a database that provides high scalability, 99.999% availability, and real-time performance. It uses an auto-sharding and multi-master architecture that is ACID compliant. MySQL Cluster has a shared-nothing architecture with no single point of failure and self-healing capabilities.
Developing high-throughput services with NoSQL APIs to InnoDB and MySQL Clu...Andrew Morgan
Ever-increasing performance demands of Web-based services have generated significant interest in providing NoSQL access methods to MySQL (MySQL Cluster from Oracle and the InnoDB storage engine of MySQL), enabling users to maintain all the advantages of their existing relational databases while providing blazing-fast performance for simple queries. Get the best of both worlds: persistence; consistency; rich SQL queries; high availability; scalability; and simple, flexible APIs and schemas for agile development. This session describes the memcached connectors and examines some use cases for how MySQL and memcached fit together in application architectures. It does the same for the newest MySQL Cluster native connector, an easy-to-use, fully asynchronous connector for Node.js.
Discover why Wi-Fi 7 is set to transform wireless networking and how Router Architects is leading the way with next-gen router designs built for speed, reliability, and innovation.
Exceptional Behaviors: How Frequently Are They Tested? (AST 2025)Andre Hora
Exceptions allow developers to handle error cases expected to occur infrequently. Ideally, good test suites should test both normal and exceptional behaviors to catch more bugs and avoid regressions. While current research analyzes exceptions that propagate to tests, it does not explore other exceptions that do not reach the tests. In this paper, we provide an empirical study to explore how frequently exceptional behaviors are tested in real-world systems. We consider both exceptions that propagate to tests and the ones that do not reach the tests. For this purpose, we run an instrumented version of test suites, monitor their execution, and collect information about the exceptions raised at runtime. We analyze the test suites of 25 Python systems, covering 5,372 executed methods, 17.9M calls, and 1.4M raised exceptions. We find that 21.4% of the executed methods do raise exceptions at runtime. In methods that raise exceptions, on the median, 1 in 10 calls exercise exceptional behaviors. Close to 80% of the methods that raise exceptions do so infrequently, but about 20% raise exceptions more frequently. Finally, we provide implications for researchers and practitioners. We suggest developing novel tools to support exercising exceptional behaviors and refactoring expensive try/except blocks. We also call attention to the fact that exception-raising behaviors are not necessarily “abnormal” or rare.
Explaining GitHub Actions Failures with Large Language Models Challenges, In...ssuserb14185
GitHub Actions (GA) has become the de facto tool that developers use to automate software workflows, seamlessly building, testing, and deploying code. Yet when GA fails, it disrupts development, causing delays and driving up costs. Diagnosing failures becomes especially challenging because error logs are often long, complex and unstructured. Given these difficulties, this study explores the potential of large language models (LLMs) to generate correct, clear, concise, and actionable contextual descriptions (or summaries) for GA failures, focusing on developers’ perceptions of their feasibility and usefulness. Our results show that over 80% of developers rated LLM explanations positively in terms of correctness for simpler/small logs. Overall, our findings suggest that LLMs can feasibly assist developers in understanding common GA errors, thus, potentially reducing manual analysis. However, we also found that improved reasoning abilities are needed to support more complex CI/CD scenarios. For instance, less experienced developers tend to be more positive on the described context, while seasoned developers prefer concise summaries. Overall, our work offers key insights for researchers enhancing LLM reasoning, particularly in adapting explanations to user expertise.
https://arxiv.org/abs/2501.16495
Who Watches the Watchmen (SciFiDevCon 2025)Allon Mureinik
Tests, especially unit tests, are the developers’ superheroes. They allow us to mess around with our code and keep us safe.
We often trust them with the safety of our codebase, but how do we know that we should? How do we know that this trust is well-deserved?
Enter mutation testing – by intentionally injecting harmful mutations into our code and seeing if they are caught by the tests, we can evaluate the quality of the safety net they provide. By watching the watchmen, we can make sure our tests really protect us, and we aren’t just green-washing our IDEs to a false sense of security.
Talk from SciFiDevCon 2025
https://www.scifidevcon.com/courses/2025-scifidevcon/contents/680efa43ae4f5
AgentExchange is Salesforce’s latest innovation, expanding upon the foundation of AppExchange by offering a centralized marketplace for AI-powered digital labor. Designed for Agentblazers, developers, and Salesforce admins, this platform enables the rapid development and deployment of AI agents across industries.
Email: [email protected]
Phone: +1(630) 349 2411
Website: https://www.fexle.com/blogs/agentexchange-an-ultimate-guide-for-salesforce-consultants-businesses/?utm_source=slideshare&utm_medium=pptNg
Join Ajay Sarpal and Miray Vu to learn about key Marketo Engage enhancements. Discover improved in-app Salesforce CRM connector statistics for easy monitoring of sync health and throughput. Explore new Salesforce CRM Synch Dashboards providing up-to-date insights into weekly activity usage, thresholds, and limits with drill-down capabilities. Learn about proactive notifications for both Salesforce CRM sync and product usage overages. Get an update on improved Salesforce CRM synch scale and reliability coming in Q2 2025.
Key Takeaways:
Improved Salesforce CRM User Experience: Learn how self-service visibility enhances satisfaction.
Utilize Salesforce CRM Synch Dashboards: Explore real-time weekly activity data.
Monitor Performance Against Limits: See threshold limits for each product level.
Get Usage Over-Limit Alerts: Receive notifications for exceeding thresholds.
Learn About Improved Salesforce CRM Scale: Understand upcoming cloud-based incremental sync.
What Do Contribution Guidelines Say About Software Testing? (MSR 2025)Andre Hora
Software testing plays a crucial role in the contribution process of open-source projects. For example, contributions introducing new features are expected to include tests, and contributions with tests are more likely to be accepted. Although most real-world projects require contributors to write tests, the specific testing practices communicated to contributors remain unclear. In this paper, we present an empirical study to understand better how software testing is approached in contribution guidelines. We analyze the guidelines of 200 Python and JavaScript open-source software projects. We find that 78% of the projects include some form of test documentation for contributors. Test documentation is located in multiple sources, including CONTRIBUTING files (58%), external documentation (24%), and README files (8%). Furthermore, test documentation commonly explains how to run tests (83.5%), but less often provides guidance on how to write tests (37%). It frequently covers unit tests (71%), but rarely addresses integration (20.5%) and end-to-end tests (15.5%). Other key testing aspects are also less frequently discussed: test coverage (25.5%) and mocking (9.5%). We conclude by discussing implications and future research.
Mastering Fluent Bit: Ultimate Guide to Integrating Telemetry Pipelines with ...Eric D. Schabell
It's time you stopped letting your telemetry data pressure your budgets and get in the way of solving issues with agility! No more I say! Take back control of your telemetry data as we guide you through the open source project Fluent Bit. Learn how to manage your telemetry data from source to destination using the pipeline phases covering collection, parsing, aggregation, transformation, and forwarding from any source to any destination. Buckle up for a fun ride as you learn by exploring how telemetry pipelines work, how to set up your first pipeline, and exploring several common use cases that Fluent Bit helps solve. All this backed by a self-paced, hands-on workshop that attendees can pursue at home after this session (https://o11y-workshops.gitlab.io/workshop-fluentbit).
2. DISCLAIMER: MongoDB's product plans are for informational purposes only. MongoDB's plans may change and you should not rely on them for delivery of a specific feature at a specific time.
4. Agenda
- Document vs. Relational Model
- Analytics on MongoDB data
- 60,000 feet – what is the aggregation pipeline?
- Aggregation pipeline operators
- $lookup (Left Outer Equi Joins) in MongoDB 3.2
- Other aggregation enhancements
- Worked examples
6. Existing Alternatives to Joins
{ "_id": 10000,
"items": [
{
"productName": "laptop",
"unitPrice": 1000,
"weight": 1.2,
"remainingStock": 23
},
{
"productName": "mouse",
"unitPrice": 20,
"weight": 0.2,
"remainingStock": 276
}
],
…
}
• Option 1: Include all data for an order in the same document
– Fast reads
• One find delivers all the required data
– Captures the full description at the time of the event
– Consumes extra space
• Details of each product are stored in many order documents
– Complex to maintain
• A change to any product attribute must be propagated to all affected orders
(the orders collection)
7. Existing Alternatives to Joins
{
"_id": 10000,
"items": [
12345,
54321
],
...
}
• Option 2: The order document references product documents
– Slower reads
• Multiple trips to the database
– Space efficient
• Product details are stored once
– Loses the point-in-time snapshot of the full record
– Extra application logic
• Must iterate over the product IDs in the order document and find the product documents
• An RDBMS would automate this through a JOIN
(the orders collection)
{
"_id": 12345,
"productName": "laptop",
"unitPrice": 1000,
"weight": 1.2,
"remainingStock": 23
}
{
"_id": 54321,
"productName": "mouse",
"unitPrice": 20,
"weight": 0.2,
"remainingStock": 276
}
(the products collection)
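The "extra application logic" that Option 2 demands can be sketched in a few lines. This is a minimal Python simulation, not MongoDB driver code; the in-memory dictionaries stand in for the orders and products collections, using the laptop/mouse documents from the slides:

```python
# Simulated "products" collection, keyed by _id for fast lookup.
products = {
    12345: {"productName": "laptop", "unitPrice": 1000, "weight": 1.2, "remainingStock": 23},
    54321: {"productName": "mouse", "unitPrice": 20, "weight": 0.2, "remainingStock": 276},
}

# An order that only stores product IDs (Option 2).
order = {"_id": 10000, "items": [12345, 54321]}

def resolve_order(order, products):
    """Replace each product ID with the full product document,
    mimicking the extra per-order find() calls the application must issue."""
    resolved = dict(order)
    resolved["items"] = [products[pid] for pid in order["items"]]
    return resolved

full_order = resolve_order(order, products)
print(full_order["items"][0]["productName"])  # laptop
```

Every read of an order pays this second round of lookups; an RDBMS would hide the same work behind a JOIN.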
8. The Winner?
• In general, Option 1 wins
– Performance and keeping everything in the same place beat the space efficiency of normalization
– There are exceptions
• e.g. comments on a blog post -> unbounded size
• However, analytics benefit from combining data from multiple collections
– Keep listening...
17. Aggregation Pipeline Stages
• $match – Filter documents
• $geoNear – Geospatial query
• $project – Reshape documents
• $lookup – New: left-outer equi joins
• $unwind – Expand documents
• $group – Summarize documents
• $sample – New: randomly selects a subset of documents
• $sort – Order documents
• $skip – Jump over a number of documents
• $limit – Limit the number of documents
• $redact – Restrict documents
• $out – Sends results to a new collection
18. $lookup
• Left-outer join
– Includes all documents from the left collection
– For each document in the left collection, finds the matching documents in the right collection and embeds them
(Diagram: Left Collection joined to Right Collection)
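The left-outer equi-join semantics described above can be mimicked in a few lines of plain Python. The lookup helper below is a hypothetical simulation of what $lookup does to each document, not the server implementation; the parameter names echo $lookup's localField/foreignField/as options:

```python
def lookup(left_docs, right_docs, local_field, foreign_field, as_field):
    """Left-outer equi join: every left document is kept, and the right
    documents whose foreign_field equals its local_field are embedded
    as an array (empty when there is no match)."""
    out = []
    for left in left_docs:
        joined = dict(left)
        joined[as_field] = [r for r in right_docs
                            if r.get(foreign_field) == left.get(local_field)]
        out.append(joined)
    return out

orders = [{"_id": 1, "productId": 12345},
          {"_id": 2, "productId": 99999}]       # 99999 has no match
products = [{"_id": 12345, "productName": "laptop"}]

result = lookup(orders, products, "productId", "_id", "productDocs")
# Order 1 embeds the laptop document; order 2 keeps an empty array.
```

Note the "outer" part: the unmatched order survives with an empty array rather than being dropped, which is exactly why $lookup includes all documents from the left collection.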
31. Aggregation With a Sharded Database
• Workload split between shards
– The client works through mongos, as with any query
– Shards execute the pipeline up to a point
– A single shard then merges the cursors and continues processing
– Use explain to analyze how the pipeline is split
– An early $match on the shard key may exclude whole shards
– Potential CPU and memory implications for the primary shard's host
– $lookup & $out are performed within the primary shard for the database
33. Restrictions
• $lookup only supports equality for the match
• $lookup can only be used in the aggregation pipeline (e.g. not with find)
• The pipeline is linear; no forks. Each stage can remove data, and new raw data can only be added through $lookup
• The right collection for $lookup cannot be sharded
• Indexes are only used at the beginning of the pipeline (and for the right collections in subsequent $lookups), before any data transformations
• $out can only be used in the final stage of the pipeline
• $geoNear can only be the first stage in the pipeline
• The BI Connector for MongoDB is part of MongoDB Enterprise Advanced
– Not in the community edition
34. Next Steps
• Documentation
– https://docs.mongodb.org/manual/release-notes/3.2/#aggregation-framework-enhancements
• Not yet ready for production but download and try!
– https://www.mongodb.org/downloads#development
• Detailed blog
– https://www.mongodb.com/blog/post/joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2-part-1-of-3-introduction
• Webinars
– Tomorrow: What's New in MongoDB 3.2 https://www.mongodb.com/webinar/whats-new-in-mongodb-3-2
– Replay: 3.2 $lookup & aggregation https://www.mongodb.com/presentations/webinar-joins-and-other-aggregation-enhancements-coming-in-mongodb-3-2
• Feedback
– MongoDB 3.2 Bug Hunt
• https://www.mongodb.com/blog/post/announcing-the-mongodb-3-2-bug-hunt
– https://jira.mongodb.org/
35. MongoDB Days 2015
• France – October 6, 2015
• Germany – October 20, 2015
• UK – November 5, 2015
• Silicon Valley – December 2, 2015