
SQL

SQL basic

Uploaded by

Suman Saurabh
Copyright
© All Rights Reserved
Julia Evans @b0rk

row-oriented vs column-oriented

Some SQL databases are row-oriented and some are column-oriented (aka "columnar").

(Diagram: the same table of ages and names, stored on disk row by row vs column by column.)

row-oriented: a row's values are stored together on disk
column-oriented: a column's values are stored together on disk

row-oriented

Examples: PostgreSQL, MySQL, SQLite, SQL Server

Mainly used for: records that users need to look up/change all the time (eg a web service)

Pros:
- easy to look up or update a single row (it's in one place!)

Cons:
- big expensive queries make your website less responsive
- you have to be super careful about which queries you run on your production database

column-oriented

Examples: Redshift, Presto, BigQuery, Vertica, Snowflake

Pros:
- querying all data in a column to do analysis is way faster
- usually distribute data across many machines so 1000 machines can run your query

Cons:
- SELECT * can be SUPER SLOW if you have 100 columns (avoid doing it!)
- updating a row is slow (do batch imports instead)

ways to count rows

Here are three ways to count rows:

1. COUNT(*): count all rows

This counts every row, regardless of the values in the row. Often used with a GROUP BY to get common values, like in this "top 50 baby names" query:

    SELECT name, COUNT(*)
    FROM baby_names
    GROUP BY name
    ORDER BY COUNT(*) DESC
    LIMIT 50

2. COUNT(DISTINCT column): get the number of distinct values

Really useful when a column has duplicate values. For example, this query finds out how many species every animal genus has:

    SELECT genus, COUNT(DISTINCT species)
    FROM all_animals
    GROUP BY 1
    ORDER BY 2 DESC

3. SUM(CASE WHEN expression THEN 1 ELSE 0 END)

Want to know how many dogs are named 'boxer'? You can use SUM and CASE WHEN to count them!
For example, this query counts each owner's dogs, cats, and other pets:

    SELECT owner
         , SUM(CASE WHEN type = 'dog' THEN 1 ELSE 0 END) AS num_dogs
         , SUM(CASE WHEN type = 'cat' THEN 1 ELSE 0 END) AS num_cats
         , SUM(CASE WHEN type NOT IN ('dog', 'cat') THEN 1 ELSE 0 END) AS num_other
    FROM pets
    GROUP BY 1

    pets                      result
    owner | type              owner | num_dogs | num_cats | num_other
    1     | dog               1     | 1        | 1        | 0
    1     | cat               2     | 1        | 0        | 1
    2     | dog
    2     | parakeet

questions to ask about your data

It's really easy to make incorrect assumptions about the data in a table, like "every hospital patient has a doctor, right?"

Some questions you might want to ask:

- Does this column have NULL or 0 or "" values?
  "Some patients have NULL names, that's good to know."
- "Doctors in the system who never worked at this hospital, I should ..."
- "Sometimes there are 2 doctor's appointments at the exact same time, that shouldn't happen..."
- Does the id column in table A always have a match in table B?
  "Why are there 213 doctor IDs with no match in the doctors table?!"

SQL query steps

When this SQL query runs, here's how I think of what happens: every line in the query changes a table into another table. The numbers show the order the steps happen in.

    SELECT owner, count(*)    (5)
    FROM cats                 (1)
    WHERE owner != 3          (2)
    GROUP BY owner            (3)
    HAVING count(*) = 2       (4)
    ORDER BY owner DESC       (6)

(1) FROM cats

    owner | name
    1     | libra
    2     | cinnamon
    2     | chanceuse
    3     | astra
    4     | lime
    4     | nikola

(2) WHERE owner != 3

    owner | name
    1     | libra
    2     | cinnamon
    2     | chanceuse
    4     | lime
    4     | nikola

(3) GROUP BY owner

    owner | name
    1     | libra
    ------+-----------
    2     | cinnamon
    2     | chanceuse
    ------+-----------
    4     | lime
    4     | nikola

(4) HAVING count(*) = 2

    owner | name
    2     | cinnamon
    2     | chanceuse
    ------+-----------
    4     | lime
    4     | nikola

(5) SELECT owner, count(*)

    owner | count(*)
    2     | 2
    4     | 2

(6) ORDER BY owner DESC

    owner | count(*)
    4     | 2
    2     | 2

how LEFT JOIN works

Let's run this join:

    cats LEFT JOIN people ON owner_id = id

Here are the 2 tables:

    cats                  people
    owner_id | name       id | name
    1        | bella      1  | juan
    2        | luna       2  | ahmed
    3        | lime       4  | ryan

(1) Combine every cats row with every people row
(2) Find rows where the ON condition (owner_id = id) is true
    cats                 people
    owner_id | name   |  id | name    | owner_id = id?
    1        | bella  |  1  | juan    | yes
    1        | bella  |  2  | ahmed   | no
    1        | bella  |  4  | ryan    | no
    2        | luna   |  1  | juan    | no
    2        | luna   |  2  | ahmed   | yes
    2        | luna   |  4  | ryan    | no
    3        | lime   |  1  | juan    | no
    3        | lime   |  2  | ahmed   | no
    3        | lime   |  4  | ryan    | no

(3) Add any missing rows from the left table (cats)

    cats                 people
    owner_id | name   |  id   | name
    1        | bella  |  1    | juan
    2        | luna   |  2    | ahmed
    3        | lime   |  NULL | NULL

lime had no match, so we add him back and put NULLs for the people columns.

Step 3 seems weird at first but it's really useful to know which rows had no match! LEFT JOIN is my favourite.

Find duplicate emails with HAVING

This query finds duplicate email addresses in a clients table:

    SELECT email, count(*), group_concat(name, ',') AS names
    FROM clients
    GROUP BY email
    HAVING count(*) > 1

Here's how it breaks down:

(1) FROM clients

    id | name     | email
    1  | mr darcy | [email protected]
    2  | luna     | [email protected]
    3  | nala     | [email protected]
    4  | tigger   | [email protected]

(2) GROUP BY email

    id | name     | email
    1  | mr darcy | [email protected]
    ---+----------+------------------
    2  | luna     | [email protected]
    ---+----------+------------------
    3  | nala     | [email protected]
    4  | tigger   | [email protected]

(3) SELECT email, count(*), group_concat(name, ',') AS names

    email             | count(*) | names
    [email protected] | 1        | mr darcy
    [email protected] | 1        | luna
    [email protected] | 2        | nala,tigger

(4) HAVING count(*) > 1

    email             | count(*) | names
    [email protected] | 2        | nala,tigger

A simple LEFT JOIN

Here's a join query to figure out which treats Luna bought:

    SELECT clients.name AS client_name, sales.item
    FROM sales
    LEFT JOIN clients
      ON sales.client_id = clients.id
    WHERE clients.name = 'luna'
Let's go through that query one step at a time:

(1) FROM sales

    client_id | item
    1         | catnip
    1         | blanket
    1         | tuna
    2         | tuna
    5         | laser pointer

(2) LEFT JOIN clients

Here's the clients table:

    id | name     | email
    1  | mr darcy | darcy@pemberley.com
    2  | luna     | [email protected]
    3  | nala     | [email protected]
    4  | tigger   | [email protected]

(3) LEFT JOIN clients ON sales.client_id = clients.id

Sales data on the left, clients on the right:

    client_id | item          | id   | name     | email
    1         | catnip        | 1    | mr darcy | darcy@pemberley.com
    1         | blanket       | 1    | mr darcy | darcy@pemberley.com
    1         | tuna          | 1    | mr darcy | darcy@pemberley.com
    2         | tuna          | 2    | luna     | [email protected]
    5         | laser pointer | NULL | NULL     | NULL

(4) WHERE clients.name = 'luna'

    client_id | item | id | name | email
    2         | tuna | 2  | luna | [email protected]

(5) SELECT clients.name AS client_name, sales.item

    client_name | item
    luna        | tuna

Get the time between thunderstorms with LAG()

Window functions let you reference values in other rows, like the previous row! This means you can subtract the day in the previous row to get the time between thunderstorms.

    SELECT type, day,
           day - lag(day) OVER (PARTITION BY type ORDER BY day ASC) AS days_since_prev
    FROM weather
    ORDER BY day ASC

Let's go through that query one step at a time:

(1) FROM weather

    type         | day
    rain         | 9
    thunderstorm | 11
    rain         | 13
    rain         | 21
    thunderstorm | 22
    rain         | 30
    thunderstorm | 36
    rain         | 38
    thunderstorm | 41
    rain         | 48

(2) PARTITION BY type

The rows are split into one group for rain and one group for thunderstorm.

(3) ORDER BY day ASC

In this case the rows already look ordered, but you should always use an ORDER BY if you expect a specific order.

(4) SELECT type, day, day - lag(day) OVER (PARTITION BY type ORDER BY day ASC) AS days_since_prev

    type         | day | days_since_prev
    rain         | 9   | NULL
    rain         | 13  | 4
    rain         | 21  | 8
    rain         | 30  | 9
    rain         | 38  | 8
    rain         | 48  | 10
    thunderstorm | 11  | NULL
    thunderstorm | 22  | 11
    thunderstorm | 36  | 14
    thunderstorm | 41  | 5

anatomy of a SQL query

Every SQL database contains a bunch of tables (for example clients and cities).

Every SELECT query takes data from those tables and generates a new table.

SELECT column FROM table is the simplest query. A query can have these parts:

    SELECT ...
    FROM ...
    INNER JOIN ...
    WHERE ...
    GROUP BY ...
    HAVING ...
    ORDER BY ...
    LIMIT ...

Everything from INNER JOIN down to LIMIT is optional.
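To see those clauses working together, here is a minimal sketch that runs the cats query from the "SQL query steps" page with Python's built-in sqlite3 module. The in-memory database, the added LIMIT 10, and the owner id for libra (1) are assumptions made for illustration; they are not from the zine.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (owner INTEGER, name TEXT)")

# Rows modelled on the "SQL query steps" example.
# Assumption: libra belongs to owner 1.
conn.executemany(
    "INSERT INTO cats VALUES (?, ?)",
    [
        (1, "libra"),
        (2, "cinnamon"),
        (2, "chanceuse"),
        (3, "astra"),
        (4, "lime"),
        (4, "nikola"),
    ],
)

# The clauses are written in the fixed order
# SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT.
query = """
    SELECT owner, COUNT(*)
    FROM cats
    WHERE owner != 3
    GROUP BY owner
    HAVING COUNT(*) = 2
    ORDER BY owner DESC
    LIMIT 10
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(4, 2), (2, 2)]
```

Owner 3 is removed by WHERE, owner 1 is removed by HAVING because it has only one cat, and the two remaining owners come back sorted from highest to lowest.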
SELECT ... and FROM ... are required.

A few basic facts to start out:

- You always need to use the order SELECT ... FROM ... WHERE ... GROUP BY ...
- SQL keywords aren't case sensitive: select * from table is fine too
- There are other kinds of queries like INSERT / UPDATE / DELETE but this zine is just about SELECT

WHERE

WHERE filters the table you start with. For example, let's break down this simple query:

    SELECT owner
    FROM cats
    WHERE name IS NULL

FROM cats

    owner | name
    1     | simba
    2     | bella
    3     | NULL

WHERE name IS NULL

    owner | name
    3     | NULL

SELECT owner

    owner
    3

What you can put in a WHERE:

- Check if a string contains a substring! (% is a wildcard)
      WHERE name LIKE '%darcy%'
- Check if an expression is in a list of values
      WHERE name IN ('bella', 'simba')
- Comparison operators like >=. These work the way you'd guess
      WHERE revenue - costs >= 0
- expr IS NULL and expr IS NOT NULL
  = NULL doesn't work, you need to use IS NULL

You can AND together as many conditions as you want. If I'm using lots of ANDs I like to write them like this, and put all the ORs in the brackets:

        (...)
    AND (...)
    AND (...)

rules for simple JOINs

Joins in SQL let you take 2 tables and combine them into one.

(Diagram: two tables combined into one with INNER JOIN.)

Joins can get really complicated, so we'll start with the simplest way to join. Here are the rules I use for 90% of my joins:

Rule 1: only use LEFT JOIN and INNER JOIN

This is every join type: INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, CROSS JOIN

Rule 2: Only include 1 condition in your join

Here's the syntax for a join:

    table1 LEFT JOIN table2 ON <condition>

I usually stick to a very simple condition, like this:

    table1 LEFT JOIN table2 ON table1.some_column = table2.other_column

Rule 3: One of the joined columns should be unique

If neither of the columns is unique, you'll get strange results like this:

    people INNER JOIN foods ON people.name = foods.name

    people              foods
    name  | age         name  | favourite food
    julia | 18          julia | bananas
    julia | 19          julia | ketchup
    kevin | 16

    result
    name  | favourite food | age
    julia | bananas        | 19
    julia | ketchup        | 19
    julia | bananas        | 18
    julia | ketchup        | 18

SELECT

SELECT is where you pick the final columns that appear in the output of the query. Here's the syntax:

    SELECT expression_1 [AS alias],
           expression_2 [AS alias2],
           ...
    FROM ...

Some useful things to know about SELECT:

1. You can combine many columns with SQL expressions

A few examples:

    CONCAT(first_name, ' ', last_name)
    MAX(last_year_profit, this_year_profit)
    DATE_TRUNC('month', created)

DATE_TRUNC is PostgreSQL syntax for rounding a date; other SQL dialects have different syntax.

2. Alias an expression with AS

CONCAT(first_name, ' ', last_name) is a mouthful! It's nice to give your complicated expressions an easy-to-read alias, like:

    SELECT CONCAT(first_name, ' ', last_name) AS full_name

3. Select all columns with SELECT *

When I'm starting to figure out a query, I'll often write something like

    SELECT * FROM some_table LIMIT 10

just to quickly see what the columns in the table look like.

SELECT COUNT(*) and SELECT * are totally different: COUNT(*) means "count all rows", which isn't really related to SELECT *.

ORDER BY and LIMIT

ORDER BY and LIMIT happen at the end and affect the final output of the query.

ORDER BY lets you sort by anything you want! The syntax:

    ORDER BY <expression> [ASC | DESC]

ASC stands for ascending and DESC stands for descending.

Example:

    SELECT * FROM cats
    ORDER BY LENGTH(name) ASC

    cats                     result
    owner | name             owner | name
    1     | daisy            4     | rose
    1     | drareasoan       1     | daisy
    3     | buttercup        3     | buttercup
    4     | rose             1     | drareasoan

LIMIT lets you limit the number of rows output. The syntax:

    LIMIT <integer>

The value must be a number.

Example:

    SELECT * FROM cats
    ORDER BY LENGTH(name) ASC
    LIMIT 2

    cats                     result
    owner | name             owner | name
    1     | daisy            4     | rose
    1     | drareasoan       1     | daisy
    3     | buttercup
    4     | rose
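The ORDER BY and LIMIT examples above can be tried out with a small script. This is a sketch that assumes Python's built-in sqlite3 module and an in-memory database; the table holds only three of the cat names from the example, so the row set is illustrative rather than a copy of the zine's table.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cats (owner INTEGER, name TEXT)")

# Assumption: a three-row subset of the cats table, for illustration.
conn.executemany(
    "INSERT INTO cats VALUES (?, ?)",
    [(1, "daisy"), (3, "buttercup"), (4, "rose")],
)

# ORDER BY accepts any expression, here the length of the name.
shortest_first = conn.execute(
    "SELECT * FROM cats ORDER BY LENGTH(name) ASC"
).fetchall()

# LIMIT is applied after the sort, so it keeps the two shortest names.
top_two = conn.execute(
    "SELECT * FROM cats ORDER BY LENGTH(name) ASC LIMIT 2"
).fetchall()

print(shortest_first)  # [(4, 'rose'), (1, 'daisy'), (3, 'buttercup')]
print(top_two)         # [(4, 'rose'), (1, 'daisy')]
```

Because LIMIT runs after ORDER BY, changing ASC to DESC would keep the two longest names instead.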
