SQL For Programmers -- Boston Big Data Techcon April 27th (Dave Stokes)
SQL For Programmers is an introduction to SQL concepts, when SQL is a better choice, and a look at the future of databases. Presented April 27th, 2015 at Big Data Techcon Boston.
1. The Covers relationship is ternary, involving Employees, Policies, and Dependents relations, while Purchaser and Beneficiary are binary relationships.
2. Ternary relationships impose stronger constraints - a policy must be linked to a specific employee and dependent.
3. The second diagram models the relationships more accurately by separating the purchaser, beneficiary, and policy linkages. This removes ambiguities of the ternary relationship.
Relational Theory for Budding Einsteins -- LonestarPHP 2016 (Dave Stokes)
This document provides an overview of relational database theory and normalization for developers. It defines key terms like relational databases, logical and physical data models, database schemas, and data normalization. It explains the concepts of first, second, third and Boyce-Codd normal forms and how to normalize data to these forms by removing redundant and unnecessary data through a multi-step process. The goal of normalization is to organize data to minimize duplication and ensure integrity. An example demonstrates normalizing a dog owner database from first to third normal form.
Here are the key entities and relationships based on the information provided:
Entities:
- Department
- Employee
- Supervisor
- Project
Relationships:
- Department has one supervisor (1:1)
- Department has many employees (1:M)
- Employee works in one or more departments (M:N)
- Project has many employees assigned (1:M)
- Employee works on one or more projects (M:N)
The important attributes that uniquely identify each entity are also specified, such as employee number, department code, project number. This provides the foundation for modeling the database schema to represent these real world entities and relationships.
The document describes a set of PowerPoint slides for a Database Management Systems course. It includes an index listing the topics covered in each lecture and the corresponding slide numbers. The slides cover the basics of SQL queries, including the SELECT, FROM, and WHERE clauses. They also describe concepts like aggregates, null values, triggers, and designing active databases. Integrity constraints and different data types are discussed in the context of the CREATE TABLE statement.
This document discusses using R for initial data analysis. It covers loading data into R from files or by typing it in, exploring and visualizing the data using basic statistics and graphs, and saving outputs. R allows importing data from various sources, creating and editing data structures, and exporting objects and plots for sharing results. The key is becoming familiar with R's programming environment and functions for summarizing, transforming, and visualizing data.
The document discusses spatial query languages and SQL. It begins by introducing learning objectives about understanding query languages, using SQL, extending SQL for spatial data, and trends in query languages. It then provides examples of using SQL to query spatial data by making use of the Open Geodata Interchange Standard (OGIS) spatial data types and operations within SQL queries.
This document discusses database normalization and different normal forms including 1NF, 2NF, 3NF, and BCNF. It defines anomalies like insertion, update, and deletion anomalies that can occur when data is not normalized. Examples are provided to illustrate the different normal forms and how denormalizing data can lead to anomalies. The key aspects of each normal form are explained: removing repeating groups (1NF), removing partial dependencies of non-prime attributes on a candidate key (2NF), and removing transitive dependencies (3NF, BCNF).
BEGINNER
1. Nodes
2. Parent Nodes & Child Nodes
3. Leaf Nodes
4. Root Node
5. Sub Tree
6. Level of a tree
7. m-ary Tree
8. Binary Tree (BT)
9. Binary Search Tree (BST)
10. BST - Insert, Delete
Common Use of Tree as a Data Structure
INTERMEDIATE
1. Nodes
2. Parent Nodes & Child Nodes
3. Leaf Nodes
4. Root Node
5. Sub Tree
6. Level of a tree:
7. m-ary Tree
8. Binary Tree (BT)
9. Complete and Full Binary Tree
10. Traversal
11. Binary Search Tree (BST)
12. BST - Insert, Delete
A tree is a non-linear data structure that can represent hierarchical relationships. It consists of nodes connected by edges. The root node has child nodes, and each node is either a leaf or has child nodes of its own. Binary trees restrict nodes to at most two children. A general tree can be converted to a binary tree by making each node's first child its left child and its next sibling its right child. Binary trees can be built from input data by comparing values to determine left or right placement, or reconstructed from traversal sequences.
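The comparison-based placement mentioned above can be sketched as a binary search tree insert. This is a minimal illustration, not code from the slides; the names `Node`, `insert`, `in_order`, and `build` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One node of a binary search tree (hypothetical helper type)."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], value: int) -> Node:
    """Place value by comparison: smaller values go left, all others go right."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root


def in_order(root: Optional[Node]) -> List[int]:
    """Visit left subtree, node, right subtree; for a BST this is sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.value] + in_order(root.right)


def build(values: List[int]) -> Optional[Node]:
    """Build a tree by inserting the values one at a time."""
    root: Optional[Node] = None
    for v in values:
        root = insert(root, v)
    return root
```

Calling `build([50, 30, 70, 20, 40])` puts 50 at the root with 30 and 70 as its children, and `in_order` then returns the values in ascending order.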
The document summarizes external sorting techniques used in database management systems. It describes a two-phase sorting approach using limited buffer space in memory. The first phase creates runs by sorting each page individually. The second phase repeatedly merges runs by pairs until a single sorted run is produced, using three buffer pages - two for input runs and one for the output merged run. The process of merging two sorted runs by comparing elements and writing the smallest to the output page is also explained.
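The two phases described above can be sketched as follows. This is a simplified illustration under a loud assumption: in-memory Python lists stand in for disk pages and runs, whereas a real DBMS would stream the merge through its three buffer pages. The function names are hypothetical.

```python
from typing import List


def merge_two_runs(a: List[int], b: List[int]) -> List[int]:
    """Merge two sorted runs by repeatedly writing the smaller front element."""
    out: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])  # at most one of these two tails is non-empty
    out.extend(b[j:])
    return out


def external_sort(pages: List[List[int]]) -> List[int]:
    """Two-phase sort: one-page runs first, then pairwise merge passes."""
    # Phase 1: sort each page on its own, producing one run per page.
    runs = [sorted(page) for page in pages]
    # Phase 2: merge runs in pairs until a single sorted run remains.
    while len(runs) > 1:
        merged: List[List[int]] = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged.append(merge_two_runs(runs[k], runs[k + 1]))
            else:
                merged.append(runs[k])  # an unpaired run carries over to the next pass
        runs = merged
    return runs[0] if runs else []
```

Each pass halves the number of runs, so N pages need about log2(N) merge passes after the initial sorting pass.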
The document defines various normal forms for database normalization including 1NF, 2NF, 3NF and BCNF. It explains the concepts of functional dependencies, full functional dependencies, partial dependencies and transitive dependencies. The goals of normalization are to eliminate data anomalies, reduce data redundancy and improve data integrity. Normalization is achieved by decomposing relations and removing dependencies between attributes.
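To make the idea of a functional dependency concrete, here is a small sketch that checks whether a dependency holds in a given set of rows. The helper `fd_holds` and the `ENROLLMENT` sample data are hypothetical and not from the summarized document. A dependency holding in one sample does not prove it holds for the schema in general; the check can only refute a dependency.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def fd_holds(rows: Iterable[Row], lhs: List[str], rhs: List[str]) -> bool:
    """Return True if equal lhs values always come with equal rhs values."""
    seen: Dict[Tuple, Tuple] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs seen for this lhs and returns it afterwards
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical relation whose key is (student, course).
ENROLLMENT: List[Row] = [
    {"student": 1, "course": "DB", "student_name": "Ann", "grade": "A"},
    {"student": 1, "course": "OS", "student_name": "Ann", "grade": "B"},
    {"student": 2, "course": "DB", "student_name": "Raj", "grade": "A"},
]
```

In this sample, `student -> student_name` holds even though `student` is only part of the key, which is the kind of partial dependency that 2NF removes. `student -> grade` does not hold, while `(student, course) -> grade` does.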
We have described normalization (first normal form, second normal form, and so on up to fifth normal form) in simple, easy-to-understand language.
We at BIWHIZ are committed to equipping you with the hottest skills in BI, Analytics, Big Data, Database and Data Science.
This document provides information about relational algebra operators including select, project, join, set operations, and more. It defines each operator, provides examples of how to write them using relational algebra notation, and explains how to apply them to sample tables and queries. Key learning outcomes covered are using relational algebra operators to retrieve information and write expressions based on relational tables.
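As a rough illustration of the select, project, and join operators named above, the sketch below treats a relation as a list of dictionaries. The function names and the `EMPLOYEE` and `DEPARTMENT` sample tables are assumptions made for this example and do not come from the summarized document.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Relation = List[Row]


def select(rel: Relation, predicate: Callable[[Row], bool]) -> Relation:
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rel if predicate(row)]


def project(rel: Relation, attrs: List[str]) -> Relation:
    """Projection: keep only the named attributes and drop duplicate rows."""
    seen = set()
    out: Relation = []
    for row in rel:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(r: Relation, s: Relation) -> Relation:
    """Natural join: combine rows that agree on every attribute they share."""
    out: Relation = []
    for a in r:
        for b in s:
            common = a.keys() & b.keys()
            if all(a[c] == b[c] for c in common):
                out.append({**a, **b})
    return out


# Hypothetical sample relations.
EMPLOYEE: Relation = [
    {"name": "Ann", "dept": "CS"},
    {"name": "Raj", "dept": "Math"},
    {"name": "Lee", "dept": "CS"},
]
DEPARTMENT: Relation = [
    {"dept": "CS", "building": "A"},
    {"dept": "Math", "building": "B"},
]
```

For example, `project(EMPLOYEE, ["dept"])` collapses the two CS rows into one, because a relation is a set and projection removes duplicates.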
Normalization (Brief Overview)
Functional Dependencies and Keys
1st Normal Form
2nd Normal Form
3rd Normal Form
3.5 Normal Form (Boyce-Codd Normal Form, BCNF)
4th Normal Form
5th Normal Form (Project-Join Normal Form, PJNF)
Domain-Key Normal Form (DKNF)
6th Normal Form
The document discusses database normalization. The goals of normalization are to eliminate redundant data and ensure related data is stored together. It describes the various normal forms including 1NF, 2NF, 3NF and 4NF. 1NF focuses on atomic values and unique identifiers. 2NF builds on 1NF by removing subsets of data that apply to multiple rows. 3NF then removes columns not dependent on the primary key. The benefits of normalization include greater organization, less redundancy, consistency and flexibility.
The document discusses normalization in database design. Normalization is the process of organizing data to avoid redundancy and dependency. It involves splitting tables and restructuring relationships between tables. The document outlines various normal forms including 1NF, 2NF, 3NF, BCNF, 4NF and 5NF and provides examples to illustrate how to normalize tables to conform to each form.
Data Warehousing and Business Intelligence is one of the hottest skills today, and is the cornerstone for reporting, data science, and analytics. This course teaches the fundamentals with examples plus a project to fully illustrate the concepts.
Introduction to the R Statistical Computing Environment (izahn)
Get an introduction to R, the open-source system for statistical computation and graphics. With hands-on exercises, learn how to import and manage datasets, create R objects, and conduct basic statistical analyses. Full workshop materials can be downloaded from http://projects.iq.harvard.edu/rtc/event/introduction-r
This document provides an overview of MS SQL Server tips covering topics such as relational databases, database design including normalization, indexes, and useful queries. Relational databases organize information into tables that can be related through primary and foreign keys. Database design involves normalization to eliminate anomalies and improve performance. Indexes help optimize queries and common types include clustered, nonclustered, unique and full-text. Useful queries are provided to check index fragmentation and monitor currently running processes.
Normalization is the process of removing redundant data from your tables to improve storage efficiency, data integrity, and scalability.
Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.
Why normalization?
The relation derived from the user view or data store will most likely be unnormalized.
The problem usually arises when an existing system uses an unstructured file, e.g. an MS Excel sheet.
Database Systems - Normalization of Relations (Chapter 4/3) (Vidyasagar Mundroy)
The document discusses normalization, which is a process for relational database design that reduces data redundancy and improves data integrity. It involves decomposing relations to eliminate anomalies like insertion, deletion, and modification anomalies. Several normal forms are described - 1NF, 2NF, 3NF, BCNF, 4NF, and 5NF - each addressing different types of dependencies and anomalies. The goal of normalization is to organize the data in a logical manner and break relations into smaller, less redundant relations without affecting the information contained.
This document provides a summary of questions and answers related to data structures from Anna University regulation papers from 2008 to 2013. It covers topics like linear data structures (lists, stacks, queues), non-linear data structures (trees), and abstract data types. The document is compiled by Dr. P. Subathra and contains questions from various regulation years with detailed explanations and examples for each question.
This document provides guidance on naming and structuring tables and columns in a SQL database. It discusses best practices for naming tables and columns with up to 30 characters and avoiding spaces. It also covers the different data types that can be used as well as considerations for determining the appropriate data type based on the type and range of data. User defined data types can be created to enforce consistency. The document also discusses identity columns, creating tables, and other table properties and limitations.
SQL for PHP Programmers -- Dallas PHP Users Group Jan 2015 (Dave Stokes)
This document provides an overview of a SQL tutorial being held on November 11th presented by Dave Stokes, MySQL Community Manager. It discusses some of the challenges PHP programmers face with SQL and relational database concepts. It provides explanations of relational algebra, database normalization forms, and SQL components like DDL and DML. Examples are given around creating tables, joins, foreign keys and other SQL statements. The goal is to help PHP programmers improve their skills with structured query language for working with databases.
SQL is a language used to manage and query relational databases. It allows users to create, modify, retrieve, and delete data from the database. The main components of SQL include DDL for defining database schema, DML for manipulating data, and DQL for querying data. SQL tables store data in rows and columns and can be queried using commands like SELECT, WHERE, GROUP BY and JOIN.
A relational database management system (RDBMS) is a database management system that is based on the relational model. An RDBMS makes it possible for end users to create, read, update and delete data in a database systematically. Normalization is a technique used to organize data in a database to eliminate redundancy and improve data integrity. It involves decomposing tables and relations to their lowest sets of attributes. Some common types of normalization forms are first normal form, second normal form, third normal form and Boyce-Codd normal form.
This document provides an overview of SQL programming. It covers the history of SQL and SQL Server, SQL fundamentals including database design principles like normalization, and key SQL statements like SELECT, JOIN, UNION and stored procedures. It also discusses database objects, transactions, and SQL Server architecture concepts like connections. The document is intended as a training guide, walking through concepts and providing examples to explain SQL programming techniques.
This document discusses dimensional modeling (DM) as a way to simplify entity-relationship (ER) data models that are used for data warehousing and online analytical processing (OLAP). DM results in a star schema with one central fact table linked to multiple dimension tables. This structure is simpler for users to understand and for query tools to navigate compared to complex ER models. While DM uses more storage space by duplicating dimensional data, it improves query performance through fewer joins. The document provides an example comparing the storage requirements of a phone call fact table under a star schema versus a snowflake schema.
Open Source 1010 and Quest InSync presentations March 30th, 2021 on MySQL Ind... (Dave Stokes)
Speeding up queries on a MySQL server with indexes and histograms is not a mysterious art but simple engineering. This presentation is an in-depth introduction that was presented on March 30th at the Quest InSync and Open Source 101 conferences.
This document provides an introduction and tutorial to SQL and the Oracle relational database system. It covers the basics of SQL, including defining and querying tables, modifying data, and more advanced query techniques. It also discusses additional Oracle topics like PL/SQL, integrity constraints, triggers, and the overall Oracle system architecture. The tutorial is intended to provide a detailed overview of SQL and how to work with Oracle databases.
This document provides an introduction and tutorial to SQL and the Oracle relational database system. It covers the basics of SQL, including defining and querying tables, modifying data, and more advanced queries using views, joins, and subqueries. It also introduces PL/SQL for database programming and covers other Oracle-specific topics like integrity constraints, triggers, and system architecture. The tutorial is intended to provide a detailed overview of the Oracle database and SQL.
SQL is a language used to communicate with databases and manage data. It allows users to create, update, and retrieve data from databases. The document outlines the history of SQL and its evolution over time. It also describes key SQL concepts like data types, commands, primary keys, database normalization, and techniques for ensuring data integrity.
Data Science, Statistical Analysis and R... Learn what those mean, how they can help you find answers to your questions and complement the existing toolsets and processes you are currently using to make sense of data. We will explore R and the RStudio development environment, installing and using R packages, basic and essential data structures and data types, plotting graphics, manipulating data frames and how to connect R and SQL Server.
SQL is a standard language for accessing and manipulating databases. It allows users to retrieve, insert, update, and delete data as well as create, modify and delete tables. The main SQL commands are grouped into four categories: data definition language for creating/modifying database structures, data manipulation language for interacting with data, transaction control language for managing transactions, and data control language for security. Common SQL commands include CREATE, SELECT, INSERT, UPDATE, DELETE, ALTER, and DROP.
1. The document describes the laboratory manual for the subject "Fundamentals of Database Management Systems" for the III B. Tech - II Semester at Malla Reddy Engineering College.
2. It provides the sample database for the Railway Reservation System and lists 13 experiments involving SQL commands for creating tables, inserting data, querying, joining, aggregation and more.
3. The goals of the course are to practice concepts learned in DBMS by developing and querying a database using SQL and PL/SQL statements.
L1 Intro to Relational DBMS LP.pdf, Intro to Relational .docx (DIPESH30)
L1 Intro to Relational DBMS LP.pdf
Intro to Relational Databases
CS 2215 Introduction to Databases
What Is a DBMS?
- Database: a collection of information. Examples: library, university.
- Database Management System (DBMS): software package designed to store and manage databases. E.g. Oracle, SQL Server, MySQL, Access.
- Files vs. DBMS: why bother with databases? Why not just store all the data in a big file and write C or Java programs to manipulate the data?
Why Use a DBMS?
- Naïve users are sheltered from messy details.
- Data integrity: e.g. if Bob works in Marketing, make sure there is a dept. called Marketing.
- Reduced application development time: avoid writing special programs from scratch each time to access data.
- Standard application interface: increased reliability.
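The Bob and Marketing data-integrity example above can be sketched with a foreign key constraint. This is an illustration only: it uses Python's built-in `sqlite3` module rather than any DBMS named on the slides, and the table names, column names, and the employee "Eve" are hypothetical.

```python
import sqlite3


def demo_integrity() -> bool:
    """Return True if the DBMS rejects an employee in a non-existent department."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE dept (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " name TEXT PRIMARY KEY,"
        " dept TEXT NOT NULL REFERENCES dept(name))"
    )
    conn.execute("INSERT INTO dept VALUES ('Marketing')")
    conn.execute("INSERT INTO employee VALUES ('Bob', 'Marketing')")  # accepted
    try:
        # There is no 'Sales' row in dept, so this insert violates the constraint.
        conn.execute("INSERT INTO employee VALUES ('Eve', 'Sales')")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

The check lives in the schema, so every program that writes to `employee` gets it for free instead of each application re-implementing the rule.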
Why Use a DBMS?
- Data independence: easier to make changes. If how data is stored changes, don’t have to change views, forms, etc.
- Security: easier to control how data is shared.
- Concurrent access: allow multiple users to access simultaneously, but in a controlled way!
Different people involved
- DBMS implementers: build the DBMS, like Oracle, MS SQL Server.
- End users: use forms & reports, might write SQL queries.
- DB application programmers: write programs to make life easier for end users, e.g. the person who creates forms for a library. Must know how databases work.
- DB administrator (DBA): handles security and authorization, crash recovery, and database tuning as needs evolve.
Overview of course: Relational Model: Student Database, Fig 1.2
STUDENT
Name   StudentNumber  Class  Major
Smith  17             1      CS
Brown  8              2      CS
Overview of course:
Data Models:
- High level: Entity-Relationship (E.R.) model
- Intermediate level: relational model (student database)
- Low level: physical database, covered in CSCI 4524 Advanced Databases
Relational databases:
- Integrity constraints
- Good design: normalization
- Query languages: relational algebra, SQL
- Views, assertions, triggers
Relational Data Model
- Relation: 2-dimensional table. All info is stored in tables, e.g. student, course. See Elmasri Fig 1.2.
- Rows (or tuples): student has 2 rows.
- Records: a row may correspond with a record in a file. The term is commonly used when talking about the physical storage of databases.
- Columns (or attributes): student has 4 columns.
Relational Data Model
- Relational model proposed by E. F. Codd, 1970.
- Dominant model in commercial DBMS products, e.g. Oracle, SQL Server, MySQL, Access.
- Compared to previous models (network, hierarchical, etc.), info in tables is easier to understand, a casual user can write simple SQL queries, and complex queries are much easier to understand.
Basic Terminology
- Relational Schema (or head): set of all the column names i.e. what info is bei ...
A database is a collection of information organized in a way that allows a computer program to select desired data quickly. A traditional database is organized into fields, records, and files. A field contains a single piece of information, a record contains one set of fields, and a file contains records.
A database management system (DBMS) is a collection of programs that allows users to enter, organize, and select data in a database. It performs functions like user management, data creation/modification/access, and database maintenance. Popular DBMS include Microsoft Access, Oracle, MySQL, SQL Server, and others.
Good database systems have ACID properties - Atomicity, Consistency, Isolation, and Durability.
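Atomicity, the first of the ACID properties above, can be illustrated with a transfer that either applies both of its updates or neither. This sketch uses Python's built-in `sqlite3` module as a stand-in for the systems listed. The `account` table, the `transfer` helper, and the amounts are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> bool:
    """Credit dst, then debit src; if the debit fails, the credit is rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo_atomicity() -> Tuple[bool, bool, List[Tuple[int, int]]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
    conn.commit()
    ok = transfer(conn, 1, 2, 30)       # succeeds: balances become 70 and 30
    failed = transfer(conn, 2, 1, 500)  # debit would overdraw account 2
    balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
    conn.close()
    return ok, failed, balances
```

In the failing transfer the credit to account 1 has already run when the `CHECK` constraint rejects the debit, yet the final balances show no trace of it, because the whole transaction was rolled back.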
This document contains study material prepared by D.GAYA, Assistant Professor of Computer Science at Pondicherry University Community College, for the subject Relational Database Management System. It covers various topics related to SQL including basic SQL reports and commands, data types, joins, DDL, DML, DCL commands, and binary data types. Examples are provided to explain concepts such as creating and dropping databases, creating tables, commenting in SQL, and using the TO_HEX and HEX_TO_BINARY functions for binary data.
The document discusses SQL commands and concepts. It begins by explaining the different types of SQL statements: Data Definition Language (DDL) for creating and modifying database objects, Data Manipulation Language (DML) for manipulating data, Data Retrieval Language (DRL) for querying data, Transaction Control Language (TCL) for managing transactions, and Data Control Language (DCL) for managing user access. It then provides examples of key DDL commands like CREATE, ALTER, and DROP TABLE and DML commands like INSERT, UPDATE, DELETE. It concludes by introducing aggregate functions in SQL like COUNT for summarizing data.
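The statement categories above can be seen side by side in a short sketch. It runs the SQL through Python's built-in `sqlite3` module, which is an assumption made for portability; the summarized document may target a different DBMS. The `student` table reuses names from the STUDENT example elsewhere on this page, and the row for "Jones" is hypothetical.

```python
import sqlite3
from typing import List, Tuple


def demo_statements() -> List[Tuple[str, int]]:
    conn = sqlite3.connect(":memory:")
    # DDL: define a table, then alter its structure.
    conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("ALTER TABLE student ADD COLUMN major TEXT")
    # DML: insert, update, and delete rows.
    conn.executemany(
        "INSERT INTO student (id, name, major) VALUES (?, ?, ?)",
        [(17, "Smith", "CS"), (8, "Brown", "CS"), (3, "Jones", "Math")],
    )
    conn.execute("UPDATE student SET major = 'Physics' WHERE id = 3")
    conn.execute("DELETE FROM student WHERE id = 8")
    # Query with an aggregate: COUNT summarizes the rows in each group.
    counts = conn.execute(
        "SELECT major, COUNT(*) FROM student GROUP BY major ORDER BY major"
    ).fetchall()
    # DDL again: remove the table.
    conn.execute("DROP TABLE student")
    conn.close()
    return counts
```

After the update and delete, one CS student and one Physics student remain, so the query returns one count per major.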
Valkey 101 - SCaLE 22x March 2025 Stokes.pdf (Dave Stokes)
An introduction to Valkey, presented March 2025 at the Southern California Linux Expo, Pasadena CA. Valkey is an open-source replacement for Redis: a very fast in-memory database used for caches and other low-latency applications.
MySQL is a ubiquitous open source database, but do you know how to make it secure? This talk is from the 2022 Texas Cyber Summit on how to do just that. Make sure your data and database are secure.
MySQL Indexes and Histograms - RMOUG Training Days 2022 (Dave Stokes)
Nobody complains when the database is too fast. But they do gripe when it slows down. The two most popular ways to increase query speed are indexes and histograms. But there are dozens of options for indexes and lots of bad information on how to use them. Histograms are great but not for all types of data. This session covers the hows and whys of both approaches.
MySQL Basics - Texas Linuxfest beginners tutorial May 31st, 2019 (Dave Stokes)
MySQL is a relational database management system. The document provides an introduction to MySQL, including:
- MySQL is available in both community and enterprise editions. The community edition is free to use while the enterprise edition starts at $5K/4 core CPU before discounts.
- Data in MySQL is organized into tables within schemas (or databases). Tables contain rows of data organized into columns.
- Structured Query Language (SQL) is used to interact with MySQL databases. Common SQL commands include SELECT to retrieve data, INSERT to add data, UPDATE to modify data, and DELETE to remove data.
- JOIN clauses allow retrieving data from multiple tables by linking them together on common columns. This helps normalize data
Develop PHP Applications with MySQL X DevAPIDave Stokes
The X DevAPI provides a way to use MySQL as a NoSQL JSON Document Store and this presentation covers how to use it with the X DevAPI PHP PECL extension. And it also works with traditional relational tables. Presented at Oracle CodeOne 24 October 2018
MySQL 8 Tips and Tricks from Symfony USA 2018, San FranciscoDave Stokes
This document discusses several new features in MySQL 8 including:
1. A new transactional data dictionary that stores metadata instead of files for improved simplicity and crash safety.
2. The addition of histograms to help the query optimizer understand data distributions without indexes for better query planning.
3. Resource groups that allow assigning threads to groups with specific CPU and memory limits to control resource usage.
4. Enhancements to JSON support like in-place updates and new functions for improved flexibility with semi-structured data.
The Proper Care and Feeding of MySQL DatabasesDave Stokes
Many Linux System Administrators are 'also' accidental database administrators. This is a guide for them to keep their MySQL database instances happy, health, and glowing
This document discusses MySQL Document Store, which allows both SQL and NoSQL functionality on the MySQL platform. It provides benefits for developers, operations teams, and business owners. MySQL Document Store uses JSON documents stored in MySQL tables, providing a schemaless document model with ACID transactions. This allows flexible data structures while maintaining SQL's reliability. The document demonstrates CRUD operations and querying documents using either SQL or NoSQL-style APIs. It concludes that MySQL Document Store provides the best of both SQL and NoSQL worlds in a single product.
MySQL Without The SQL -- Oh My! PHP[Tek] June 2018Dave Stokes
The MySQL Document Store allows developers to use MySQL as a JSON Document Store -- no normalizing of data, setting up relational tables, and you do not have to use SQL to query data. And you get the both the SQL and NoSQL worlds on one server
Presentation Skills for Open Source FolksDave Stokes
Do you want to present at a Linuxfest or other open source conference but do not know where or how to start. Follow these recommendations and you will be on your way to being a speaking all star. Discover how write your presentation. what tools you need, and other items of note
MySQL Without the SQL -- Oh My! Longhorn PHP ConferenceDave Stokes
You can now use MySQL without needing to know Structured Query Language (SQL) with the MySQL Document Store. Access JSON documents and/or relational tables using the new X DevAPI
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)Dave Stokes
MySQL 8 has many new features and this presentation covers the new data dictionary, improved JSON functions, roles, histograms, and much more. Updated after SunshinePHP 2018 after feedback
ConFoo MySQL Replication Evolution : From Simple to Group ReplicationDave Stokes
MySQL Replication has been around for many years but how wee do you under stand it? Do you know about read/write splitting, RBR vs SBR style replication, and InnoDB cluster?
This presentation is an INTRODUCTION to intermediate MySQL query optimization for the Audience of PHP World 2017. It covers some of the more intricate features in a cursory overview.
SwanseaCon 2017 presentation on Making MySQL Agile-ish. Relational Databases are not usually considered part of the Agile Programming movement but there are many new features in MySQL to make it easier to include it. This presentation covers how MySQL is moving to help support agile development while maintaining the traditional 'non agile' stability expected from a database.
The very basics of programming in PHP to store/retrieve data on a relational database management system (RDMS). For those looking for intermediate to advanced material, please see 'What Your Database Query is Really Doing'.
SQL For PHP Programmers
1. Tutorial Day: Nov 11th,
9:00a - 12:30p
Dave Stokes
MySQL Community Manager
[email protected]
@Stoker
SQL For PHP Programmers
2. 2
Safe Harbor Statement
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated
into any contract. It is not a commitment to
deliver any material, code, or functionality, and
should not be relied upon in making purchasing
decisions. The development, release, and timing
of any features or functionality described for
Oracle’s products remains at the sole discretion
of Oracle.
6. 6
The Problem with PHP Programmers
You are up to date on the latest version of PHP
The latest version of JavaScript – no problemo!
Frameworks – you know two or three or more – plus the
ones you wrote yourself
But only roughly 2-3% have had any training in Structured
Query Language (SQL)
7. 7
So what is SQL?!?!??!?!??!??!
http://en.wikipedia.org/wiki/SQL
SQL (/ˈɛs kjuː ˈɛl/ or /ˈsiːkwəl/; Structured Query
Language) is a special-purpose programming language
designed for managing data held in a relational database
management system (RDBMS), or for stream processing
in a relational data stream management system
(RDSMS).
Originally based upon relational algebra and tuple
relational calculus, SQL consists of a data definition
language and a data manipulation language. The scope of
SQL includes data insert, query, update and delete,
schema creation and modification, and data access
control.
8. 8
Oh Crap!!!
He Said 'relational
algebra' and 'tuple
relational
calculus'!
10. 10
Relational algebra
http://en.wikipedia.org/wiki/Relational_algebra
Relational algebra is a family of algebras with a well-founded
semantics used for modelling the data stored in
relational databases, and defining queries on it.
To organize the data, first the redundant data and
repeating groups of data are removed, which we call
normalized. By doing this the data is organized or
normalized into what is called first normal form (1NF).
Typically a logical data model documents and
standardizes the relationships between data entities (with
its elements). A primary key uniquely identifies an
instance of an entity, also known as a record.
11. 11
Relational Algebra Continued
Once the data is normalized and in sets of data (entities
and tables), the main operations of the relational algebra
can be performed which are the set operations (such as
union, intersection, and cartesian product), selection
(keeping only some rows of a table) and the projection
(keeping only some columns). Set operations are
performed in the WHERE clause in SQL, which is where
one set of data is related to another set of data.
12. 12
Database Normalization Forms
1nf
– No columns with repeated or similar data
– Each data item cannot be broken down further
– Each row is unique (has a primary key)
– Each field has a unique name
2nf
– Move non-key attributes that only depend on part of the
key to a new table
● Ignore tables with simple keys or no non-key attributes
3nf
– Move any non-key attributes that are more dependent
on other non-key attributes than the table key to a
new table.
● Ignore tables with zero or only one non-key attribute
13. 13
In more better English, por favor!
3NF means there are no transitive dependencies.
A transitive dependency is when two columnar
relationships imply another relationship. For example,
person -> phone# and phone# -> ringtone, so person ->
ringtone
– A → B
– It is not the case that B → A
– B → C
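As a rough illustration of removing the transitive dependency above, the person/phone/ringtone example can be split so that each fact is stored once. This is a minimal sketch; the table and column names (person, phone, phone_nbr, ringtone) are made up for illustration and do not come from the slides.

```sql
-- Hypothetical tables: names are illustrative, not from the slides.
-- phone_nbr -> ringtone lives in its own table ...
CREATE TABLE phone (
  phone_nbr CHAR(15) NOT NULL PRIMARY KEY,
  ringtone  VARCHAR(30)
);

-- ... so the person table only records person -> phone_nbr.
CREATE TABLE person (
  person_id INT UNSIGNED NOT NULL PRIMARY KEY,
  name      VARCHAR(30),
  phone_nbr CHAR(15),
  FOREIGN KEY (phone_nbr) REFERENCES phone (phone_nbr)
);

-- person -> ringtone is now derived with a join instead of being stored
-- on every person row.
SELECT person.name, phone.ringtone
FROM person
JOIN phone ON person.phone_nbr = phone.phone_nbr;
```

Changing a ringtone now means updating one row in phone, no matter how many people share that number.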
14. 14
And the rarely seen 4nf & 5nf
You can break the information down further but very rarely
do you need to go to 4nf or 5nf
15. 15
So why do all this normalization?
http://databases.about.com/od/specificproducts/a/normalization.htm
Normalization is the process of efficiently
organizing data in a database. There are two
goals of the normalization process: eliminating
redundant data (for example, storing the same
data in more than one table ) and ensuring data
dependencies make sense (only storing related
data in a table). Both of these are worthy goals as
they reduce the amount of space a database
consumes and ensure that data is logically
stored.
16. 16
Example – Cars
Name Gender Color Model
Heather F Blue Mustang
Heather F White Challenger
Eli M Blue F-type
Oscar M Blue 911
Dave M Blue Mustang
There is redundant information
across multiple rows but each
row is unique
17. 17
2nf – split into tables
Name Gender
Heather F
Eli M
Oscar M
Dave M
Color Model Owner
Blue Mustang Heather
White Challenger Heather
Blue F-type Eli
Blue 911 Oscar
Blue Mustang Dave
Split data into
two tables –
one for owner
data and one
for car data
18. 18
3nf – split owner and car info into different tables
Car_ID Color Model Owner_ID
1 Blue Mustang 1
2 White Challenger 1
3 Blue F-type 2
4 Blue 911 3
5 Blue Mustang 4
Owner_ID Name Gender
1 Heather F
2 Eli M
3 Oscar M
4 Dave M
The owner info is separated from the car info. Note that
the car table has a column for the owner's ID from the
owner table.
19. 19
But what if the Blue Mustang is shared? Or 4nf
Owner_ID Name Gender
1 Heather F
2 Eli M
3 Oscar M
4 Dave M
Car_id Model Color
1 Mustang Blue
2 Challenger White
3 F-type Blue
4 911 Blue
Car_id Owner_id
1 1
2 1
3 2
4 3
1 4
Tables for Owner,
Car, & Ownership
data
Now we have a flexible way to
search data about owners, cars, and
their relations.
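The Owner, Car, and Ownership tables above could be declared roughly as follows. This is a sketch only: the column types, the plural table names (owners, cars), and the composite primary key are assumptions, since the slide shows only the data.

```sql
-- Assumed definitions; the slide gives the rows but not the DDL.
CREATE TABLE owners (
  owner_id INT UNSIGNED NOT NULL PRIMARY KEY,
  name     VARCHAR(30) NOT NULL,
  gender   CHAR(1)
);

CREATE TABLE cars (
  car_id INT UNSIGNED NOT NULL PRIMARY KEY,
  model  VARCHAR(30) NOT NULL,
  color  VARCHAR(20)
);

-- One row per (car, owner) pair, so a car can have several owners.
CREATE TABLE ownership (
  car_id   INT UNSIGNED NOT NULL,
  owner_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (car_id, owner_id),
  FOREIGN KEY (car_id)   REFERENCES cars (car_id),
  FOREIGN KEY (owner_id) REFERENCES owners (owner_id)
);

INSERT INTO owners VALUES (1,'Heather','F'), (2,'Eli','M'), (3,'Oscar','M'), (4,'Dave','M');
INSERT INTO cars VALUES (1,'Mustang','Blue'), (2,'Challenger','White'), (3,'F-type','Blue'), (4,'911','Blue');
INSERT INTO ownership VALUES (1,1), (2,1), (3,2), (4,3), (1,4);

-- Who owns (or shares) the Mustang?
SELECT owners.name
FROM owners
JOIN ownership ON ownership.owner_id = owners.owner_id
JOIN cars ON cars.car_id = ownership.car_id
WHERE cars.model = 'Mustang';
```

With the slide's data the final query returns Heather and Dave, because car 1 appears twice in the ownership table.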
20. 20
So now what!!!
By normalizing to 3nf (or 4th), we are storing the data with
no redundancies (or very, very few)
Now we need a way to define how the data is stored
And a way to manipulate it.
21. 21
SQL
SQL is a declarative language made up of
– DDL – Data Definition Language
– DML – Data Manipulation Language
SQL was one of the first commercial languages for Edgar
F. Codd's relational model, as described in his influential
1970 paper, "A Relational Model of Data for Large Shared
Data Banks." --Wikipedia
– Codd, Edgar F (June 1970). "A Relational Model of
Data for Large Shared Data Banks". Communications
of the ACM (Association for Computing Machinery)
13 (6): 377–87. doi:10.1145/362384.362685.
Retrieved 2007-06-09.
23. 23
SQL is declarative
Describe what you want, not how to process
Hard to tell whether a query is efficient just by looking at it
Optimizer picks GPS-like best route
– Can pick wrong – traffic, new construction, washed out
roads, and road kill! Oh my!!
24. 24
SQL is made up of two parts
Data Definition Language (DDL)
– For defining data structures
●CREATE, DROP, ALTER, and
RENAME
Data Manipulation Language
– Used to SELECT, INSERT, DELETE, and
UPDATE data
26. 26
The stuff in the parentheses
CHAR(30) or VARCHAR(30) will hold strings up to 30
characters long.
– SQL MODE (more later) tells server to truncate or
return error if value is longer than 30 characters
INT(5) tells the server to show five digits of data
(display width only; the storage range is unchanged)
DECIMAL(5,3) stores five digits with three decimals, i.e.
-99.999 to 99.999
FLOAT(7,4) -999.9999 to 999.9999
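To make the numbers in parentheses concrete, here is a small sketch. The table name width_demo and its columns are hypothetical, and what happens to the oversized value depends on the server's SQL mode.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE width_demo (
  label  VARCHAR(30),     -- up to 30 characters
  amount DECIMAL(5,3)     -- 5 digits in total, 3 of them after the decimal point
);

-- Fits: 2 digits before the point, 3 after.
INSERT INTO width_demo VALUES ('fits', 99.999);

-- Does not fit: needs 3 digits before the point.
-- Strict SQL mode rejects it with an out-of-range error; without strict
-- mode the value is clipped to 99.999 and a warning is raised.
INSERT INTO width_demo VALUES ('too big', 100.000);

-- Check which mode the session is running in.
SELECT @@SESSION.sql_mode;
```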
28. 28
NULL No Value
Null is used to indicate a lack of value or no data
– Gender : Male, Female, NULL
Nulls are very messy in B-tree Indexing, try to avoid
Math with NULLs is best avoided
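A quick way to see why NULL math and comparisons are awkward is to try them in the client. A minimal sketch; the column aliases are only labels chosen for this example.

```sql
SELECT NULL = NULL            AS eq_result,       -- NULL, not 1
       NULL <=> NULL          AS null_safe_eq,    -- 1 (MySQL's NULL-safe equality)
       NULL + 1               AS sum_result,      -- NULL
       NULL IS NULL           AS is_null_result,  -- 1
       COALESCE(NULL, 0) + 1  AS with_default;    -- 1
```

Any ordinary comparison or arithmetic involving NULL yields NULL, so tests need IS NULL (or <=>) and sums need a default such as COALESCE.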
29. 29
DESC City in detail
Describe table tells us the names of the columns (Fields),
the data type, if the column is NULLABLE, Keys, any
default value, and Extras.
30. 30
Data Types
Varies with vendor
Usually have types for text, integers, BLOBs, etc.
Refer to manual
31. 31
MySQL World Database
http://dev.mysql.com/doc/index-other.html
Used in MySQL documentation, books, online tutorials,
etc.
Three tables
– City
– Country
– CountryLanguage
34. 34
Join two tables
To get a query that provides the names of the cities and the
names of their countries, JOIN the two tables on data that is
common to a column in each table (columns that are hopefully
indexed!)
36. 36
Simple join
Both City and Country
have columns that
can be used for JOINs
– Country.Code
– City.CountryCode
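Using the two columns listed above, the join can be written along these lines. A sketch against the World sample database; the aliases, ordering, and LIMIT are choices made for this example.

```sql
-- Pair each city with its country via the shared country code.
SELECT City.Name    AS city,
       Country.Name AS country
FROM City
JOIN Country ON City.CountryCode = Country.Code
ORDER BY Country.Name, City.Name
LIMIT 10;
```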
37. 37
What happens when you send a query
Server receives the query
The user's permissions are checked
– Database, table, and/or column level
Syntax is checked
Optimizer
– Statistics on data
– Cost model
● Pick cheapest option (DISK I/O)
● Cardinality of indexes
Get the data
Sorting/Grouping/etc
Data returned
38. 38
EXPLAIN
EXPLAIN is prepended to the query to show the plan
chosen by the optimizer
39. 39
VISUAL Explain
MySQL Workbench
MySQL 5.6/5.7
Uses JSON output from
EXPLAIN and turns it into
something more visually
appealing
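Both forms of EXPLAIN can be tried from the command-line client. A short sketch against the World sample database; the query itself is just an example join.

```sql
-- Traditional tabular plan.
EXPLAIN
SELECT City.Name, Country.Name
FROM City
JOIN Country ON City.CountryCode = Country.Code;

-- The same plan as JSON (available from MySQL 5.6), the form that
-- Visual Explain draws from.
EXPLAIN FORMAT=JSON
SELECT City.Name, Country.Name
FROM City
JOIN Country ON City.CountryCode = Country.Code;
```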
41. 41
MySQL Internals Manual :: 7 The Optimizer :: 7.1 Code
and Concepts :: 7.1.2 The Optimizer Code
handle_select()
  mysql_select()
    JOIN::prepare()
      setup_fields()
    JOIN::optimize() /* optimizer is from here ... */
      optimize_cond()
      opt_sum_query()
      make_join_statistics()
        get_quick_record_count()
        choose_plan()
          /* Find the best way to access tables */
          /* as specified by the user. */
          optimize_straight_join()
            best_access_path()
          /* Find a (sub-)optimal plan among all or subset */
          /* of all possible query plans where the user */
          /* controls the exhaustiveness of the search. */
          greedy_search()
            best_extension_by_limited_search()
              best_access_path()
          /* Perform an exhaustive search for an optimal plan */
          find_best()
      make_join_select() /* ... to here */
    JOIN::exec()
42. 42
Data and Data types
Use the smallest reasonable field
– BIGINT are not needed for customer id numbers
– Signed/unsigned
● Customer “negative seven four two four three”
– All those extra bits have to be moved disk → memory
→ buffer → Ether → buffer → program
CHAR versus VARCHAR
– Space and compression
– Overhead slight
ENUMs and BITs
– Have to plan ahead
– Sorting issues
● Value of the ENUM
43. 43
INTEGER, INT, SMALLINT, TINYINT, MEDIUMINT, BIGINT
Type       Storage (bytes)  Minimum Signed        Maximum Signed       Minimum Unsigned  Maximum Unsigned
TINYINT    1                -128                  127                  0                 255
SMALLINT   2                -32768                32767                0                 65535
MEDIUMINT  3                -8388608              8388607              0                 16777215
INT        4                -2147483648           2147483647           0                 4294967295
BIGINT     8                -9223372036854775808  9223372036854775807  0                 18446744073709551615
44. 44
Creating a table with MySQL command line client
CREATE TABLE fooint (a INT(1), b INT(4), c INT(10));
INSERT INTO fooint (a,b,c) VALUES (1,100,10000);
INSERT INTO fooint VALUES (777,777,0);
SELECT * FROM fooint\g
INSERT INTO fooint (a) VALUES (1234567890);
SELECT * FROM fooint\G
Note that ; and \g are equivalent and \G is for vertical
output.
What happens with INSERT INTO fooint (a) VALUES
(12345678900); ??
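What happens to a value beyond the INT range depends on the SQL mode. The sketch below uses a separate, hypothetical table (int_range_demo) so it can be run on its own; the two session modes are chosen here only to show the contrast.

```sql
-- Hypothetical table for this demo only.
CREATE TABLE int_range_demo (a INT);

-- Strict mode: the out-of-range value is rejected with an error
-- and no row is stored.
SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO int_range_demo (a) VALUES (12345678900);

-- Non-strict mode: the value is clipped to the signed INT maximum
-- (2147483647) and a warning is raised instead of an error.
SET SESSION sql_mode = '';
INSERT INTO int_range_demo (a) VALUES (12345678900);
SHOW WARNINGS;
SELECT a FROM int_range_demo;
```

The display width in INT(1) or INT(10) plays no part here; only the type's storage range matters.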
45. 45
ALTER, TRUNCATE, and DROP table
ALTER TABLE fooint ADD COLUMN id INT UNSIGNED
NOT NULL FIRST;
– Also AFTER col_name (MySQL offers FIRST and AFTER, but no BEFORE)
SELECT * FROM fooint;
TRUNCATE fooint;
– Used to remove data but not schema definition
DROP TABLE fooint;
– Goodbye to the table: both its data and its definition
46. 46
Another table
CREATE TABLE fooint (id INT UNSIGNED NOT NULL
PRIMARY KEY AUTO_INCREMENT, name CHAR(30));
DESC fooint; and SHOW CREATE TABLE fooint;
INSERT INTO fooint (name) VALUES ('Alpha'),('Beta'),
('Gamma');
SELECT * FROM fooint;
– Note the id column incremented automatically
47. 47
Foreign Keys
CREATE TABLE employee (
e_id INT NOT NULL,
name CHAR(20),
PRIMARY KEY (e_id)
);
CREATE TABLE building (
office_nbr INT NOT NULL,
description CHAR(20),
e_id INT NOT NULL,
PRIMARY KEY (office_nbr),
FOREIGN KEY (e_id)
REFERENCES employee
(e_id)
ON UPDATE CASCADE
ON DELETE CASCADE);
48. 48
More on foreign keys
mysql> INSERT INTO employee VALUES (10,'Larry'),
(20,'Shemp'), (40,'Moe');
Query OK, 3 rows affected (0.04 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> INSERT INTO building VALUES (100,'Corner Office',10),
(101,'Lobby',40);
Query OK, 2 rows affected (0.04 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM employee;
+------+-------+
| e_id | name |
+------+-------+
| 10 | Larry |
| 20 | Shemp |
| 40 | Moe |
+------+-------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM building;
+------------+---------------+------+
| office_nbr | description | e_id |
+------------+---------------+------+
| 100 | Corner Office | 10 |
| 101 | Lobby | 40 |
+------------+---------------+------+
2 rows in set (0.00 sec)
49. 49
Using foreign keys
mysql> SELECT * FROM employee JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 40 | Moe | 101 | Lobby | 40 |
+------+-------+------------+---------------+------+
2 rows in set (0.02 sec)
50. 50
Left Join with foreign keys
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 40 | Moe | 101 | Lobby | 40 |
| 20 | Shemp | NULL | NULL | NULL |
+------+-------+------------+---------------+------+
3 rows in set (0.00 sec)
51. 51
Foreign keys keep you from messy data
mysql> INSERT INTO building VALUES (120,'Cubicle',77);
ERROR 1452 (23000): Cannot add or update a child row: a
foreign key constraint fails (`test`.`building`, CONSTRAINT
`building_ibfk_1` FOREIGN KEY (`e_id`) REFERENCES
`employee` (`e_id`) ON DELETE CASCADE ON UPDATE
CASCADE)
mysql>
52. 52
Taking advantage of CASCADE
mysql> DELETE FROM employee WHERE e_id=40;
Query OK, 1 row affected (0.08 sec)
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 20 | Shemp | NULL | NULL | NULL |
+------+-------+------------+---------------+------+
53. 53
Cascade keeping foreign key data updated
mysql> UPDATE employee SET e_id=21 WHERE e_id=20;
Query OK, 1 row affected (0.04 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM employee LEFT JOIN building ON
(employee.e_id=building.e_id);
+------+-------+------------+---------------+------+
| e_id | name | office_nbr | description | e_id |
+------+-------+------------+---------------+------+
| 10 | Larry | 100 | Corner Office | 10 |
| 21 | Shemp | NULL | NULL | NULL |
+------+-------+------------+---------------+------+
54. 54
Indexing
MySQL uses mainly B-trees for indexes
Great for =, <, >, <=, >=, BETWEEN, some LIKE 'string%'
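The "some LIKE" caveat can be checked with EXPLAIN. A sketch against the World sample database; the index name city_name_idx is hypothetical, and the optimizer may still prefer a full scan if it estimates that to be cheaper.

```sql
-- Hypothetical index on the city name.
CREATE INDEX city_name_idx ON City (Name);

-- Known prefix: the B-tree can be searched as a range.
EXPLAIN SELECT * FROM City WHERE Name LIKE 'San%';

-- Leading wildcard: there is no prefix to search on, so the B-tree
-- cannot narrow the rows to read.
EXPLAIN SELECT * FROM City WHERE Name LIKE '%ville';
```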
55. 55
13.1.8 CREATE INDEX Syntax
CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index_name
[index_type]
ON tbl_name (index_col_name,...)
[index_option]
[algorithm_option | lock_option] ...
index_col_name:
col_name [(length)] [ASC | DESC]
index_type:
USING {BTREE | HASH}
index_option:
KEY_BLOCK_SIZE [=] value
| index_type
| WITH PARSER parser_name
| COMMENT 'string'
algorithm_option:
ALGORITHM [=] {DEFAULT|INPLACE|COPY}
lock_option:
LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE}
56. 56
You may not know
Can create multi-column indexes
– Year,month,day index can be used for
● Year,Month,Day searches
● Year,Month searches
● Year searches
– But not day or month, day searches
Can index on part of a column
– CREATE INDEX ndx1 ON customer (name(10));
NULLs are to be avoided
– Extra optimizer steps
UNIQUE indexes enforce no duplicates
InnoDB will create a hidden clustered index if you do not
define a PRIMARY KEY (or a suitable UNIQUE NOT NULL index)
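The year, month, day rule above can be tried with a small table. This is a sketch; event_log and its column names (yr, mo, dy) are hypothetical, and on a nearly empty table the optimizer may report a different plan than it would with real data.

```sql
-- Hypothetical table with a three-column index.
CREATE TABLE event_log (
  id   INT UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
  yr   SMALLINT UNSIGNED NOT NULL,
  mo   TINYINT UNSIGNED NOT NULL,
  dy   TINYINT UNSIGNED NOT NULL,
  note VARCHAR(50),
  INDEX ymd_idx (yr, mo, dy)
);

-- These three can use ymd_idx, because each starts at the leftmost column.
EXPLAIN SELECT * FROM event_log WHERE yr = 2014 AND mo = 11 AND dy = 11;
EXPLAIN SELECT * FROM event_log WHERE yr = 2014 AND mo = 11;
EXPLAIN SELECT * FROM event_log WHERE yr = 2014;

-- This one skips yr, so the index cannot be used to look up the rows.
EXPLAIN SELECT * FROM event_log WHERE mo = 11 AND dy = 11;
```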
57. 57
13.2.5.2 INSERT … ON DUPLICATE KEY UPDATE
If you specify ON DUPLICATE KEY UPDATE, and a row
is inserted that would cause a duplicate value in a
UNIQUE index or PRIMARY KEY, an UPDATE of the old
row is performed. For example, if column a is declared as
UNIQUE and contains the value 1, the following two
statements have identical effect:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
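A common use of this clause is a counter that is created on first sight and incremented afterwards. A minimal sketch; the page_hits table and its columns are hypothetical.

```sql
-- Hypothetical counter table: one row per page.
CREATE TABLE page_hits (
  page VARCHAR(50) NOT NULL,
  hits INT UNSIGNED NOT NULL DEFAULT 1,
  UNIQUE KEY (page)
);

-- First run: no row for this page yet, so a row is inserted with hits = 1.
INSERT INTO page_hits (page) VALUES ('/index.php')
  ON DUPLICATE KEY UPDATE hits = hits + 1;

-- Second run: the UNIQUE key collides, so the existing row is updated to hits = 2.
INSERT INTO page_hits (page) VALUES ('/index.php')
  ON DUPLICATE KEY UPDATE hits = hits + 1;

SELECT page, hits FROM page_hits;
```

The application sends the same statement every time and never has to check first whether the row exists.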
58. 58
FIND Asian countries with Population > 5 million
EXPLAIN SELECT * FROM Country
WHERE Continent = 'Asia'
AND Population > 5000000;
59. 59
Will indexes make query faster?
SELECT * FROM Country
WHERE Continent = 'Asia'
AND Population > 5000000;
Index columns
– Population
– Continent
– Population and Continent
– Continent and Population
● Which to pick? Optimizer will look at option that takes
the least CPU and I/O
● Statistics from storage engine
– Indexes to use, order to join tables, avoiding sorts,
is sorting expensive
60. 60
Index on Population
ALTER TABLE Country ADD INDEX p (Population)
Down from 239!!
61. 61
A reading from the Book of MySQL – Chapter 8 Optimization
READ this section of the MySQL Manual
– Hardware
– Configuration
– Queries
– Locking
– Benchmarking
Set the InnoDB buffer pool size to 75-80% of system RAM
(on a dedicated database server)
Move logs off disks/controllers with data
Use bigger stronger machines for replication slaves
Look at SSDs, FusionIO cards
Monitor! Monitor! Monitor!
62. 62
ALTER TABLE Country ADD INDEX C (continent)
42 is better than
239 or 54
63. 63
CREATE INDEX cpidx ON Country (continent, Population)
Even better still!
64. 64
CREATE INDEX pcidx on Country (Population,continent)
FORCING the use of the
new index shows
it is not optimal.
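Forcing an index for comparison can be done with an index hint. A sketch, assuming the pcidx index from the slide title has been created on the World sample database.

```sql
-- Let the optimizer choose freely.
EXPLAIN SELECT * FROM Country
WHERE Continent = 'Asia' AND Population > 5000000;

-- Force the (Population, continent) index and compare the estimated rows.
EXPLAIN SELECT * FROM Country FORCE INDEX (pcidx)
WHERE Continent = 'Asia' AND Population > 5000000;

-- If the forced plan is worse, the index is a candidate for removal.
DROP INDEX pcidx ON Country;
```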
65. 65
Index-es
Indexes need maintenance
– Run OPTIMIZE TABLE when system quiet
– Each INSERT/DELETE takes overhead
● Slows you down
– Therefore remove unused indexes
– MySQL Utilities mysqlindexcheck to look for
unused indexes on long-running systems (do not
use right AFTER a restart)
● Use good naming convention, document
– Statistics can be saved/reloaded at shutdown/reboot
● After a reboot w/o saving, the system is going to
need to rebuild stats from scratch, run slower
– Log queries not using Indexes
● Not all of these are bad, just recognize them
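The maintenance items above map to a few statements. This is a sketch only; it assumes the Country table and the index p from the earlier slides, and an account with enough privileges to set global variables.

```sql
-- Rebuild the table and refresh its index statistics during a quiet period.
OPTIMIZE TABLE Country;

-- Remove an index that monitoring has shown to be unused
-- (p is the Population-only index created earlier).
ALTER TABLE Country DROP INDEX p;

-- Record queries that run without an index. They are written to the
-- slow query log, so that log has to be enabled as well.
SET GLOBAL slow_query_log = ON;
SET GLOBAL log_queries_not_using_indexes = ON;
```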
67. 67
Transactions
You will need to use a
transactional storage
engine like InnoDB, NDB
You need to START a
transaction, do the work
and COMMIT to record the
changes or ROLLBACK to
discard them.
To avoid using ROLLBACK, you can
employ the following strategy:
– Use LOCK TABLES to lock
all the tables you want to
access.
– Test the conditions that must
be true before performing
the update.
– Update if the conditions are
satisfied.
– Use UNLOCK TABLES to
release your locks.
Note
– This solution does not
handle the situation when
someone kills the threads
in the middle of an
update. In that case, all
locks are released but
some of the updates may
not have been executed.
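The LOCK TABLES strategy can be sketched as follows. It assumes the account table from the transactions example in this deck; the amounts and the balance check are made up for illustration, and the decision between the SELECT and the UPDATEs is taken by the application.

```sql
-- 1. Lock every table the work will touch.
LOCK TABLES account WRITE;

-- 2. Test the condition that must hold (here: account 2 has at least 100).
SELECT balance FROM account WHERE id = 2;

-- 3. Only if the application saw a sufficient balance, perform the updates.
UPDATE account SET balance = balance - 100 WHERE id = 2;
UPDATE account SET balance = balance + 100 WHERE id = 3;

-- 4. Release the locks so other sessions can proceed.
UNLOCK TABLES;
```

If the session is killed between the two UPDATE statements, the first change stays in place, which is exactly the limitation the note above describes.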
68. 68
CREATE TABLE account (id INT NOT NULL, balance
DECIMAL(6,2) DEFAULT 0) ENGINE=InnoDB;
INSERT INTO account VALUES (1,1000.10), (2,400),
(3,0), (15,.99);
-- Both updates are recorded together at COMMIT
START TRANSACTION;
UPDATE account SET balance = 1010.10 WHERE id = 1;
UPDATE account SET balance = 300 WHERE id = 2;
COMMIT;
-- This update is discarded by ROLLBACK
START TRANSACTION;
UPDATE account SET balance = 1000 WHERE id = 3;
ROLLBACK;
69. 69
AUTOCOMMIT
AUTOCOMMIT is set to 1 by default or 'do things as I type
them mode'.
– START TRANSACTION overrides
– Some APIs like JDBC have own way to handle
transactions (see chapter 23 in MySQL docs)
AUTOCOMMIT set to 0 requires a COMMIT to store
changes
Some statements cannot be rolled back. In general, these
include data definition language (DDL) statements, such
as those that create or drop databases, those that create,
drop, or alter tables or stored routines.
Some statements cause an implicit commit – DDL, user
account changes, transaction control or locking
statements, data loading, and replication control
statements.
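A short sketch of these rules, assuming the InnoDB account table from the transactions example; the audit_note table is hypothetical and exists only to trigger the implicit commit.

```sql
SET autocommit = 0;

-- With autocommit off, this change is pending until COMMIT or ROLLBACK.
UPDATE account SET balance = balance + 1 WHERE id = 15;
ROLLBACK;   -- the update is undone

-- Same update again, followed by a DDL statement.
UPDATE account SET balance = balance + 1 WHERE id = 15;
CREATE TABLE audit_note (note VARCHAR(100));   -- DDL: causes an implicit commit

ROLLBACK;   -- too late: the update was already committed by the DDL

SET autocommit = 1;   -- back to the default
```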
70. 70
SET autocommit=0;
LOCK TABLES t1 WRITE, t2 READ, ...;
... do something with tables t1 and t2 here ...
COMMIT;
UNLOCK TABLES;
Note: ROLLBACK does not release table locks
73. 73
First steps of scaling
✔ Upgrade MySQL versions
✔ 5.5 is 25% faster than 5.1
✔ 5.6 is 20% faster
✔ 5.7 will be faster still
✔ Memory
✔ DISKS
✔ Move logs to different platters, controllers
✔ Move large databases to own disks
✔ TUNE queries
✔ Scale horizontally, especially if the server is old
✔ Use bigger, more powerful boxes for replicants
✔ Optimize network