A brief overview of currently popular & available key/value, column-oriented & document-oriented databases, along with implementation suggestions for the CakePHP web application framework.
2. @JPERRAS - JOEL PERRAS
Canadian Geek
Blog: http://nerderati.com
GitHub: http://github.com/jperras
CakePHP Core since Early 2009, PHP dev. since 2001
McGill University, Montréal, Canada - Physics, Mathematics & Computer Science
Employer: Plank Design (http://plankdesign.com)
(Twitter: @plankdesign)
3. RELATIONAL DATABASES
Many different vendors: MySQL, PostgreSQL, SQLite, Oracle, ...
Same basic implementation:
B(+)-Trees for pages
B(+)-Trees or hash tables for secondary indexes
Possibly R-Trees for spatial indexes
5. Schemas (relational models)
Familiar BCNF structure
Strong consistency
Transactions
Very “mature” & well tested (mostly)
Easy adoption/integration
6. RDBMS’ES ARE NOT GOING ANYWHERE
FriendFeed
Wikipedia
Google AdWords
Facebook
7. Most small to medium size applications will never need to go beyond a single database server.
8. Always try and follow the Golden Web Application Development Rule:
9. DON’T TRY TO SOLVE A PROBLEM YOU DON’T HAVE
10. The web has created new problem domains in data storage and querying.
11. MODERN WEB APPS
Often use variable schemas
Optional fields: contact lists, addresses, favourite movies/books, etc.
NULL-itis: null values should not be permitted in BCNF, but are everywhere in web applications.
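To make the point concrete, a minimal illustrative sketch (names and fields invented): two contact records as schema-less PHP arrays, each carrying only the optional fields it actually has, where a normalized table would need NULL-able columns or extra join tables.

<?php
// Illustrative only: two contacts with different optional fields.
// A strict relational schema would model these as NULL-able columns
// or additional join tables; a document or key/value store simply
// persists whatever fields are present.
$alice = array(
    'name'    => 'Alice',
    'city'    => 'Montréal',
    'twitter' => '@alice',                    // optional, present here
);
$bob = array(
    'name'            => 'Bob',
    'favourite_books' => array('SICP', 'Refactoring'),
    // no city, no twitter -- and no NULL placeholders either
);
// Either can be stored as-is, e.g. json_encode($alice) into CouchDB,
// or serialize($bob) under a key in a key/value store.
echo json_encode($alice);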
12. MODERN WEB APPS
‘Social’ apps => high write/read ratios
Complex Many-to-Many relationships
Joins become a problem in federated architectures
Eventual consistency is usually acceptable
Downtime unacceptable
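One concrete (illustrative) way this plays out: instead of resolving a follower relationship with a JOIN that may span shards, the list is denormalized under a single key and accepted as eventually consistent.

<?php
// Illustrative sketch: a denormalized follower list kept under one key
// instead of a cross-shard JOIN over a join table. Host/port and key
// names are invented; the read-modify-write below is not atomic, which
// is exactly the kind of trade-off "eventual consistency is usually
// acceptable" refers to (stores like Redis offer atomic set operations
// for this case).
$store = new Memcache();
$store->connect('127.0.0.1', 11211);

// Write side: user 99 follows user 42.
$followers = $store->get('user:42:followers');
$followers = ($followers === false) ? array() : $followers;
$followers[] = 99;
$store->set('user:42:followers', array_values(array_unique($followers)));

// Read side: one key fetch, no JOIN.
$followers = $store->get('user:42:followers');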
14. RULES OF APP AGING
http://push.cx/2009/rules-of-database-app-aging
1. All fields become optional
2. All relationships become many-to-many
3. Chatter (comments explaining hacks) grows with time.
15. SOME GOOD PROBLEMS TO HAVE
Even if they are “Hard” ones to solve.
16. Load Balancing
(you can only live with one machine for so long)
17. High Availability
(because disks fail, and replication fails)
18. What’s a web application developer to do?
20. Not a silver bullet.
These can solve some problems, but cause others and have their own limitations.
It’s up to you to weigh the cost/benefit of your chosen solution.
21. THE LANDSCAPE
Key/Value Stores/Distributed Hash Tables (DHT)
Document-oriented databases
Column-oriented databases
22. KEY/VALUE STORES
Voldemort
Scalaris
Tokyo Cabinet
Redis
MemcacheDB
23. DOCUMENT ORIENTED DATA STORES
CouchDB <- (my favourite!)
MongoDB
SimpleDB (Amazon)
24. COLUMN-ORIENTED STORES
BigTable (Google)
HBase (Hadoop Database)
Hypertable (BigTable Open Source clone)
Cassandra (Facebook)
25. How do we use these technologies alongside CakePHP?
27. CASE STUDY - COUCHDB
http://github.com/jperras/divan
(I will make zip/tar available when more stable - stay tuned)
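For context (this is not the Divan code itself, just the raw interface any CouchDB datasource ultimately wraps): CouchDB stores schema-less JSON documents behind plain HTTP, so a document round-trip looks like this. Host, database name and document ID below are placeholders.

<?php
// Minimal sketch of CouchDB's HTTP/JSON API (not the Divan datasource).
// Assumes a CouchDB server on localhost:5984 and a 'contacts' database.
$doc = json_encode(array('name' => 'Alice', 'city' => 'Montréal'));

// PUT a document under a chosen ID.
$ch = curl_init('http://localhost:5984/contacts/alice');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, $doc);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = json_decode(curl_exec($ch), true);
curl_close($ch);
// $result holds the new revision, e.g. array('ok' => true, 'id' => 'alice', 'rev' => '1-...')

// GET it back.
$ch = curl_init('http://localhost:5984/contacts/alice');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$fetched = json_decode(curl_exec($ch), true);
curl_close($ch);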
28. CASE STUDY - TOKYO CABINET/TYRANT
http://github.com/jperras/tyrannical
(I will make zip/tar available when more stable - stay tuned)
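Again as background rather than the Tyrannical code itself: Tokyo Tyrant (ttserver) exposes Tokyo Cabinet over the network and also speaks a memcached-compatible protocol, so a quick way to poke at it from PHP is the Memcache extension. Host and port below are assumptions for a local instance.

<?php
// Quick experiment against Tokyo Tyrant via its memcached-compatible
// protocol (not the Tyrannical datasource itself). Host/port are
// assumptions for a local ttserver.
$tt = new Memcache();
$tt->connect('127.0.0.1', 1978);           // ttserver's default port

$tt->set('user:42:name', 'Joel');          // plain key/value write
echo $tt->get('user:42:name');             // => 'Joel'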
30. So don’t try to force the interface to be relational.
31. DESIGNING A NON-RELATIONAL DATASOURCE
Favour simplicity over transparency
Don’t try to implement everything that the MySQL driver implements
Use the strengths of the alternative store
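As a rough sketch of what that means in CakePHP 1.x (illustrative, not the actual Divan or Tyrannical source): a datasource only needs to implement the handful of callbacks models actually use, and can simply omit the SQL-specific machinery. The internal $_store array below stands in for whatever client library a real datasource would wrap.

<?php
// Illustrative sketch of a minimal key/value DataSource for CakePHP 1.x.
// Method names follow the DataSource base class; $_store is a stand-in
// for a real client (Redis, Tokyo Tyrant, CouchDB, ...).
class KeyValueSource extends DataSource {

    var $description = 'Generic key/value datasource (illustrative)';
    var $_store = array();   // stand-in for the real connection/client

    // A key/value store has no tables to enumerate or describe.
    function listSources() {
        return null;
    }

    function describe(&$model) {
        return array();
    }

    // Reads are plain key lookups: no joins, no SQL condition parsing.
    // (A real adapter would inspect $queryData['conditions'] more carefully.)
    function read(&$model, $queryData = array()) {
        $key = $model->alias . ':' . $queryData['conditions'][$model->alias . '.id'];
        if (!isset($this->_store[$key])) {
            return array();
        }
        return array(array($model->alias => $this->_store[$key]));
    }

    function create(&$model, $fields = null, $values = null) {
        $data = array_combine($fields, $values);
        $this->_store[$model->alias . ':' . $model->id] = $data;
        return true;
    }

    function update(&$model, $fields = null, $values = null) {
        return $this->create($model, $fields, $values);
    }

    function delete(&$model, $id = null) {
        unset($this->_store[$model->alias . ':' . $id]);
        return true;
    }
}

A real adapter obviously handles connections, errors and store-specific features (views and revisions for CouchDB, for instance), but the surface the framework asks for stays about this small.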
33. KEY/VALUE STORES
Most have atomic increment/decrement operations
Great for API rate limiters (e.g. 300 API reqs/hour/account)
Counts & sums of normalized data
Most popular items, votes, ratings, some statistics
And more.
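A minimal sketch of the rate-limiter case (the key scheme, the 300/hour limit and the use of the memcached protocol are illustrative; MemcacheDB and Tokyo Tyrant both speak that protocol, and Redis offers its own INCR command):

<?php
// Sketch of an API rate limiter built on an atomic increment.
function withinRateLimit(Memcache $store, $accountId, $limit = 300) {
    // One counter per account per hour; expires on its own.
    $key = 'api:' . $accountId . ':' . date('YmdH');

    // add() only succeeds if the key does not exist yet, so the
    // counter is initialised exactly once per window.
    $store->add($key, 0, 0, 3600);

    // increment() is atomic on the server, so concurrent requests
    // from several web frontends cannot race each other.
    $count = $store->increment($key);

    return $count !== false && $count <= $limit;
}

$store = new Memcache();
$store->connect('127.0.0.1', 11211);

if (!withinRateLimit($store, 'account-42')) {
    header('HTTP/1.1 403 Forbidden');
    exit('API rate limit exceeded');
}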
34. DOCUMENT STORES
Filesystem objects (pdfs, images, excel sheets etc.) - stored as document attachments (size limited).
Allows you to reduce reliance on shared filesystems (NFS)
Address book
Volatile schema situations
CouchDB has a very interesting feature set
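For the attachment case, CouchDB's standalone attachment API is just another HTTP PUT against the document. A sketch (database, document ID, revision and file path are placeholders):

<?php
// Sketch: attach a PDF to an existing CouchDB document via the
// standalone attachment API. 'contacts', 'alice', the revision and
// the file path are placeholders.
$rev  = '1-967a00dff5e02add41819138abb3284d';   // current _rev of the doc
$file = file_get_contents('/tmp/resume.pdf');

$url = 'http://localhost:5984/contacts/alice/resume.pdf?rev=' . $rev;

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, $file);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/pdf'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = json_decode(curl_exec($ch), true);   // new revision comes back
curl_close($ch);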
36. Thanks to the DataSource adapter implementation in CakePHP, creating a model-based interface is simple.
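Concretely (an illustrative sketch; exact connection keys vary by CakePHP version and datasource), a model is pointed at such a datasource with one config entry plus $useDbConfig, and is then used like any other model:

<?php
// app/config/database.php -- illustrative connection entries; the
// non-relational keys below are placeholders, not the real Divan config.
class DATABASE_CONFIG {
    var $default = array(
        'driver'   => 'mysql',
        'host'     => 'localhost',
        'login'    => 'app',
        'password' => 'secret',
        'database' => 'app',
    );
    var $couch = array(
        'datasource' => 'divan',       // the custom DataSource class
        'host'       => 'localhost',
        'port'       => 5984,
        'database'   => 'contacts',
    );
}

// app/models/contact.php -- point the model at the new connection.
class Contact extends AppModel {
    var $useDbConfig = 'couch';
}

// Somewhere in a controller, the model API stays the same:
// $this->Contact->save(array('Contact' => array('name' => 'Alice')));
// $contact = $this->Contact->find('first',
//     array('conditions' => array('Contact.id' => 'alice')));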
37. Thank you!
@jperras
http://nerderati.com
http://github.com/jperras
38. CODE
Divan - CouchDB datasource
Yantra - State Machine component for application control flow
CakePHP TextMate Bundle
CakeMate - TextMate/Vim Plugin
Tyrannical - Tokyo Tyrant datasource
Originally by Martin Samson ([email protected])
Working to improve code - commits coming soon.
Currently working on a framework-agnostic, distributed, plugin/library server.