Monday, January 25, 2021

JOINs - The Basics

 JOINs confuse a lot of those new to Structured Query Language (SQL). If you read the various web forums popularly used to ask questions then you know that using JOINs is pretty scary for a lot of folks. The sheer number of the 'how do I use JOIN to link two tables together?' questions is staggering. Not just because the JOIN itself is often obfuscated with relational algebra, Venn Diagrams, and the syntactical oddities of SQL -- and that is when you do find an online resource that tries to help. Plus adding to the frustration on top of all that are that the various options for using JSON can be downright perplexing.

Example 1

Let us start with an example of customers and their orders.  There is one table named customer for the information pertaining to the customer. There is another table with order information named orders holding the details of any orders from those folks in the customer table. 

Each of these tables has a column for the identification number of a customer.  In the customer table that column is named id and in the orders table that column is named customer_id.  Besides the inconsistencies in plurals of the table names, which is all too common in databases, there is the difference in column names for what is essentially the same data.  If you can get past that issue and realize that that you can use either column to link to the other table, then you are almost ready for the 'tricky stuff'.

To find the list of orders and their correspond customer, we can JOIN the two tables using the id/customer_id columns between the table with a simple JOIN. 

SELECT
orders.order_id
, customer.id
, customer.name  
from orders 
join customer 
on (customer.id = orders.customer_id) ;

And the results show the data from the orders table with the corresponding data with customer table.

The results of JOIN






LEFT Joins

But sometimes the two tables do not always evenly match up.  If we change the query above from join customer to LEFT join customer we get a much different result.

The results of LEFT JOIN




The LEFT JOIN displays all the orders and the corresponding customer information, if any.  If there is no corresponding data for the customer information, the server reports NULL(1).  In this case NULL is used to represent that we have no data for that element and order number 2 does not have a matching entry in the customer table.

RIGHT Joins

Changing the query to a RIGHT join will display all the customers and an orders they may have placed.

The Results of a RIGHT JOIN




The above shows that two orders have been placed by customers 1 & 2 and no orders from the customers 3 & 4.

CROSS Join

A CROSS Join will match every row in the first table with every row in the second table. Notice that the query is modified with no ON statement.  

The revised query for a
CROSSJOIN and the results











STRAIGHT_JOIN 

The STRAIGHT_JOIN is used to force the left table to be read first in cases where the optimizer wants to change things around for you.  With optimizer hints, this maybe redundant but it is still very handy.


The STRAIGHT_JOIN









The Natural JOIN

The NATURAL join creates an implicit join clause on the common columns in the two tables being joined. Common columns are columns that have the same name in both tables which means that you need to be VERY certain the columns with the same names are actually using the same data -- You do not want the 'id' for customers to get confused with those from the orders, employees, or other tables.


The Natural Join










(1) NULL

Null is used to indicate the absence of data. If you go back to the second example. the LEFT JOIN, we received records where there was no matching customer id number or name.  This could be caused by bad data in the tables, a badly entered record, or many other (frustrating) causes.  But the records that are holding NULLs can be found.  Hopefully I can loop back to that at a later date.