Teradata Vantage SQL Basics
One of the most important differences between joins and subqueries is the need for
establishing a one-to-many relationship in a join, something that is automatically provided
when writing a subquery.
This example illustrates what happens when the join relationship is many-to-many. In such
a case, unintended rows appear in the final result set. In the example, employee 1015
works only in department 501, but because the join is on the manager number, the result set
shows that person working in both department 501 and department 402, which is not the case.
Other employees (1018, 1023, 1006, and 1008) share the same fate. It would be difficult, if not
impossible, to view the result set and know who truly works in which department.
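The effect described above can be reproduced with a minimal sketch. It runs on SQLite through Python's standard library rather than on Teradata, and the table layout, the column names, and manager number 1017 are assumptions made up for illustration; only employee 1015 and departments 501 and 402 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        department_number INTEGER,
        manager_employee_number INTEGER
    );
    CREATE TABLE employee (
        employee_number INTEGER,
        department_number INTEGER,
        manager_employee_number INTEGER
    );
    -- Hypothetical rows: manager 1017 manages both departments.
    INSERT INTO department VALUES (501, 1017), (402, 1017);
    INSERT INTO employee VALUES (1015, 501, 1017);
""")

# Many-to-many: the manager number is not unique on either side of the join.
many_to_many = conn.execute("""
    SELECT e.employee_number, d.department_number
    FROM employee e INNER JOIN department d
      ON e.manager_employee_number = d.manager_employee_number
    ORDER BY d.department_number
""").fetchall()

# One-to-many: department_number identifies exactly one department row.
one_to_many = conn.execute("""
    SELECT e.employee_number, d.department_number
    FROM employee e INNER JOIN department d
      ON e.department_number = d.department_number
""").fetchall()

# Subquery: each employee row is returned at most once,
# no matter how many department rows match.
via_subquery = conn.execute("""
    SELECT employee_number, department_number
    FROM employee
    WHERE manager_employee_number IN
          (SELECT manager_employee_number FROM department)
""").fetchall()

print(many_to_many)   # employee 1015 appears under 402 and 501
print(one_to_many)    # employee 1015 appears under 501 only
print(via_subquery)   # employee 1015 appears once
```

The join on the manager number pairs employee 1015 with every department that manager runs, while the join on the department number and the subquery both return the single true row.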
Notes
There are multiple ways to rewrite NOT IN. The best choice for finding rows in table A that
have no matching data in table B is a NOT EXISTS correlated subquery (covered in a
later module).
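One commonly cited reason to prefer NOT EXISTS is NULL handling: if the NOT IN subquery returns a NULL, no row qualifies. The sketch below shows this under stated assumptions. It uses SQLite through Python rather than Teradata, and table_a, table_b, and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER);
    CREATE TABLE table_b (id INTEGER);
    INSERT INTO table_a VALUES (1), (2), (3);
    -- The NULL in table_b is what makes NOT IN return nothing.
    INSERT INTO table_b VALUES (1), (NULL);
""")

# NOT IN: "2 NOT IN (1, NULL)" evaluates to unknown, so the row is dropped.
not_in_rows = conn.execute("""
    SELECT id FROM table_a
    WHERE id NOT IN (SELECT id FROM table_b)
    ORDER BY id
""").fetchall()

# NOT EXISTS: the correlated subquery only asks whether a matching row exists.
not_exists_rows = conn.execute("""
    SELECT a.id FROM table_a a
    WHERE NOT EXISTS (SELECT 1 FROM table_b b WHERE b.id = a.id)
    ORDER BY a.id
""").fetchall()

print(not_in_rows)      # empty because of the NULL in table_b
print(not_exists_rows)  # ids 2 and 3, which have no match in table_b
```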
Cross Join
A CROSS JOIN is a join where no join condition is specified. Since no qualification exists, the
database establishes a pseudo-condition of “WHERE 1=1”, which is true for each and every
comparison. An example is illustrated below demonstrating Cross Join.
Notes
CROSS JOIN is a rarely used syntax. In our example, the one on the left is preferred
because it shows the reader that a cross join is intended, while the one on the right may or
may not be (perhaps the writer forgot the join condition).
Since no join condition exists, the database invents one for us, whether we are pleased
with it or not. Explain shows a condition of “1=1”, which always evaluates true. Thus, you
can read the row for employee “Short” as “Project the employee’s number, last name, and
department name for each row in the department table where 1=1 is true.” The result is to
project these column values (from the “Short” row) for each department row. The same
thing happens all over again for each employee row.
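A minimal sketch of the behaviour described in these notes, run on SQLite through Python rather than on Teradata. The employee numbers and department names are made up for illustration. It compares the explicit CROSS JOIN syntax with a comma-separated FROM list that has no join condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_number INTEGER, last_name TEXT);
    CREATE TABLE department (department_name TEXT);
    -- Hypothetical rows: 2 employees, 3 departments.
    INSERT INTO employee VALUES (1, 'Short'), (2, 'Brown');
    INSERT INTO department VALUES ('Research'), ('Sales'), ('Support');
""")

# Explicit syntax: the reader can see that a cross join is intended.
explicit = conn.execute("""
    SELECT e.employee_number, e.last_name, d.department_name
    FROM employee e CROSS JOIN department d
    ORDER BY e.employee_number, d.department_name
""").fetchall()

# Implicit syntax: same result, but it may just be a forgotten join condition.
implicit = conn.execute("""
    SELECT e.employee_number, e.last_name, d.department_name
    FROM employee e, department d
    ORDER BY e.employee_number, d.department_name
""").fetchall()

# Every employee row is paired with every department row.
short_rows = [row for row in explicit if row[1] == "Short"]
print(len(explicit), len(short_rows))
```

With 2 employee rows and 3 department rows, both forms return 2 x 3 = 6 rows, and the "Short" row is projected once per department.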
Cartesian Products
A completely unconstrained cross join is called a Cartesian product. It results when a CROSS JOIN is
issued without a WHERE clause. In this case, each row of one table is joined to each row of another
table. The output of a Cartesian product is often not meaningful; however, Cartesian products do have
useful applications. Any query requiring all elements of one table to be compared to all elements of
another is a candidate for a Cartesian product.
All others will cause a three-table join: An Inner Join between dept and emp followed by a Cross Join
to employee!
Notes
A table alias is not really an alias; it replaces the table name within that query. When using
aliases in joins, always reference the alias and never the original name of an aliased table.
In our examples, the table Employee has been aliased as “emp” in the FROM clause, but the
join condition references Employee by its table name rather than by the alias. In the first
example, SQL-92 join syntax requires a join condition on the joined tables, and thus the query
fails with "3782: Improper column reference in the search condition of a joined table". The
SQL-89 syntax, however, is interpreted as having three tables.
a Teradata SQL request (but in Standard SQL). Teradata was implemented before there was a
Standard SQL; the initial query language was called TEQUEL (TEradata QUEry Language).
RETRIEVE employee.last_name contains enough information for the Parser to resolve table
last_name
----------------------------------------
Hopkins
Ratzlaff
Rogers
Rogers
Kanieski
Crane
Stein
Johnson
Short
Brown
...
joins and then attempts to determine the most cost effective join order. Because the Optimizer uses
column statistics to choose the least costly join plan from the candidate set it generates, the plans it
Column projection and row selection are done prior to doing the join.
n-way joins are reduced to a series of binary joins.
Query optimizers use trees to build and analyze optimal join orders; the most common are left-deep and bushy trees.
Notes
Query optimizers use trees to build and analyze optimal join orders. The join
search tree types used most frequently by relational database optimizers are the
left-deep tree and the bushy tree.
When a left-deep search tree is used to analyze possible join orders, the number
Bushy trees are an optimal method for generating more join order combinations
than the left-deep tree method. Bushy trees also provide the capability of
The Optimizer uses various combinations of join plan search trees, sometimes
mixing left-deep, bushy, and even right-deep branches within the same tree.
The possibilities for ordering binary joins escalate rapidly as the total number of
relations joined increases. The Optimizer is very intelligent about looking for
plans and uses numerous field-proven heuristics to ensure that more costly
plans are eliminated from consideration early in the costing process in order to
keep the join plan search space manageable.
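To give a feel for how quickly the possibilities escalate, the sketch below evaluates two standard combinatorial counts for n joined tables: n! orderings for left-deep trees and (2(n-1))!/(n-1)! for bushy trees. These are textbook formulas used here as an assumption; they are not a description of what the Optimizer actually enumerates, since it prunes plans with heuristics.

```python
from math import factorial


def left_deep_orders(n):
    """Number of left-deep join orders for n tables: n!."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return factorial(n)


def bushy_orders(n):
    """Number of bushy join trees for n tables: (2(n-1))! / (n-1)!."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return factorial(2 * (n - 1)) // factorial(n - 1)


# Print the growth for a few table counts.
for n in (2, 4, 6, 8):
    print(n, left_deep_orders(n), bushy_orders(n))
```

For four tables this gives 24 left-deep orders against 120 bushy trees, which illustrates why early elimination of costly plans matters.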
Summary
● Column values may be projected from any table of a join
● Subqueries and inner joins can both return inner result sets
● Inner joins cannot return outer (NOT IN) result sets as can subqueries
● Incorrect table and column references can cause incorrect result sets
LAB 1
Select all currently active (legacy_flag = 0) jobs with a job_code in the
33x.xxx range with assigned (= exist in br_payroll) active employees.
HELP TABLE br_payroll;
SOLUTION
br_jobs.job_code is unique, but one row per matching employee is returned, resulting in
duplicate rows. DISTINCT is needed to remove them.
SOLUTION
No DISTINCT needed as br_payroll.job_code is non-unique.
LAB 3
LAB 4
Select all districts with fewer than 60,000 inhabitants (num_inhabitants) having
accounts with a loan at status 'D' (= running contract, client in debt).
HELP TABLE fin_district;
HELP TABLE fin_account;
HELP TABLE fin_loan;
SOLUTION
Again, DISTINCT needed to remove duplicate rows.
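The duplicate-row effect behind these lab notes can be sketched as follows. This is not the lab solution: it runs on SQLite through Python rather than on Teradata, and the simplified jobs and payroll tables, their columns, and their rows are hypothetical stand-ins for a parent table with a unique key and a child table that references it several times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (job_code TEXT PRIMARY KEY, legacy_flag INTEGER);
    CREATE TABLE payroll (employee_number INTEGER, job_code TEXT);
    -- Hypothetical rows: two employees share job 331.001, nobody holds 332.002.
    INSERT INTO jobs VALUES ('331.001', 0), ('332.002', 0);
    INSERT INTO payroll VALUES (1, '331.001'), (2, '331.001');
""")

# Join: one result row per matching payroll row, so the job is repeated.
joined = conn.execute("""
    SELECT j.job_code
    FROM jobs j INNER JOIN payroll p ON j.job_code = p.job_code
    WHERE j.legacy_flag = 0
""").fetchall()

# Join with DISTINCT: the repeated job collapses to a single row.
joined_distinct = conn.execute("""
    SELECT DISTINCT j.job_code
    FROM jobs j INNER JOIN payroll p ON j.job_code = p.job_code
    WHERE j.legacy_flag = 0
""").fetchall()

# IN subquery: each jobs row is tested once, so no duplicates arise.
via_subquery = conn.execute("""
    SELECT job_code
    FROM jobs
    WHERE legacy_flag = 0
      AND job_code IN (SELECT job_code FROM payroll)
""").fetchall()

print(joined)           # the job appears once per employee
print(joined_distinct)  # the job appears once
print(via_subquery)     # the job appears once
```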