SQL Join Tutorial
This is part two of my SQL tutorial, I guess. I hadn’t intended to go beyond the basics when I wrote the first article, but it became necessary to add to it. In this section I wanted to get into Join statements.
Lets start by looking at a table named Characters:
| Characters | |||
| Name | Role | Kingdom | RuledBy |
| Van Fanel | King | Fanelia | Fanel family |
| Balgus | General | Fanelia | Fanel family |
| Allen Schezar | Knight | Asturia | Grava Aston |
| Millerna Aston | Princess | Asturia | Grava Aston |
You’ll see a lot of repeated information in the records. The Kingdom field only two vales, which are repeated in a few of the records. The RuledBy field is really holding information about the Kingdom, and its values are based on the Kingdom field.
In general you want to avoid making repetitive changes to a database; for example if we need to update the RuledBy value for the Fanelia Kingdom we have to modify two records. Even though it could be done with one UPDATE statement, its still a repetitive change. Most databases are setup such that you update as few records as possible when you make changes, since that will reduce the possibility that values get out of synch. You don’t want to goof the update and have records showing different rulers for the kingdom.
Space is also very important to a database administrators like to keep tables as small as possible. So databases are also designed to minimize the amount of space each table takes. Copying the Kingdom name and RuledBy fields multiple times can take up a lot of space if there are hundreds of rows.
For both of these reasons databases are Normalized. What this means is that we create multiple tables that are related to each other and are used together to hold the data. These are also called Relational Tables.
In the example we have two sections of information, the characters and the kingdoms. So lets split them into relational tables:
| Characters | ||
| Name | Role | KingdomID |
| Van Fanel | King | 1 |
| Balgus | General | 1 |
| Allen Schezar | Knight | 2 |
| Millerna Aston | Princess | 2 |
| Kingdoms | ||
| KingdomID | Kingdom | RuledBy |
| 1 | Fanelia | Fanel family |
| 2 | Asturia | Grava Aston |
What we’re doing here is moving all of that repeated data into a second table. We’re using an ID number to line up the records. If we look at the first record in the Characters table it has a KingdomID value of 1, so we look for a record in the Kingdoms table that has the same ID; put the two records together and you have the complete information. So now we don’t have the kingdom information repeated anymore; just that number, so the total space used should be reduced. Also if we need to update kingdom information all we do is update a single record and through the table relationship all of the records are updated. Theres a lot more that can be said about Normalization and Relational Tables, but thats the basics of it.
Unfortunately there is a catch to it, database servers aren’t able to piece all this together themselves. When you construct your SQL queries you need to explain the relationship to the server. This is done with a JOIN clause in the SQL statement. What that does is joins two tables together in a single statement. Lets say that we’d like to grab all the records joined together from our two tables, the statement would look like this:
SELECT * FROM Characters INNER JOIN Kingdoms ON Characters.KingdomID = Kingdoms.KingdomID
This will return the following records:
| Name | Role | Characters.KingdomID | Kingdoms.KingdomID | Kingdom | RuledBy |
| Van Fanel | King | 1 | 1 | Fanelia | Fanel family |
| Balgus | General | 1 | 1 | Fanelia | Fanel family |
| Allen Schezar | Knight | 2 | 2 | Asturia | Grava Aston |
| Millerna Aston | Princess | 2 | 2 | Asturia | Grava Aston |
The SQL statement should look fairly similar to a regular SELECT statement, except we’ve added a JOIN clause to it. In fact it is your average SELECT statement, you can still list the specific fields you’d like to retrieve and add a WHERE clause to filter the results.
Lets take a look at the JOIN clause. The keyword INNER defines the type of join to do, INNER means that we’d like to get all of the records that have data in both tables. Lets pretend there was a third record in the Kingdoms table, with a KingdomID of 3; that record would not be returned through the INNER JOIN since it doesn’t have a counterpart in the Characters table. The next word, JOIN, says that we are joining a second table in the results, its followed by the name of the table. Then the word ON is followed by the relationship to use when joining the records together. What you do is set the two fields that match equal to each other, using the notation TableName.FieldName so you can uniquely identify the fields, since in this case the tables use the same field names.
The records returned include every field from both tables. Since most of the fields have unique names only their name is given. The two KingdomID fields include the table names as well, to make them unique.
This is the default style for joins, in most databases you can omit the word INNER and get the same results. There are other types of JOIN statements that handle the data in different ways. To show those I’m going to add a row to each of the tables. These rows won’t have KingdomID values that match with the other table, so the INNER JOIN would have ignore them. The other JOIN statements are intended to help you work with these records. So here are the new tables:
| Characters | |||
| Name | Role | KingdomID | |
| Van Fanel | King | 1 | |
| Balgus | General | 1 | |
| Allen Schezar | Knight | 2 | |
| Millerna Aston | Princess | 2 | |
| Hitomi Kanzaki | Fortune Teller | 4 | |
| Kingdoms | ||
| KingdomID | Kingdom | RuledBy |
| 1 | Fanelia | Fanel family |
| 2 | Asturia | Grava Aston |
| 3 | Zaibach | Dornkirk |
Now the remaining JOIN types are OUTER joins; and there are three types LEFT, RIGHT, and FULL. The LEFT OUTER JOIN will grab all records that are in the first table listed (the table on the left side of the word JOIN) whether or not it has a counterpart in the other table. So with this statement:
SELECT * FROM Characters LEFT OUTER JOIN Kingdoms ON Characters.KingdomID = Kingdoms.KingdomID
Would give us this return:
| Name | Role | Characters.KingdomID | Kingdoms.KingdomID | Kingdom | RuledBy |
| Van Fanel | King | 1 | 1 | Fanelia | Fanel family |
| Balgus | General | 1 | 1 | Fanelia | Fanel family |
| Allen Schezar | Knight | 2 | 2 | Asturia | Grava Aston |
| Millerna Aston | Princess | 2 | 2 | Asturia | Grava Aston |
| Hitomi Kanzaki | Fortune Teller | 4 |
A RIGHT OUTER JOIN will grab all records that are in the second table listed (the table on the right side of the word JOIN) whether or not it has a counterpart in the other table. So with this statement:
SELECT * FROM Characters LEFT OUTER JOIN Kingdoms ON Characters.KingdomID = Kingdoms.KingdomID
You will get this as the return:
| Name | Role | Characters.KingdomID | Kingdoms.KingdomID | Kingdom | RuledBy |
| Van Fanel | King | 1 | 1 | Fanelia | Fanel family |
| Balgus | General | 1 | 1 | Fanelia | Fanel family |
| Allen Schezar | Knight | 2 | 2 | Asturia | Grava Aston |
| Millerna Aston | Princess | 2 | 2 | Asturia | Grava Aston |
| 3 | Zaibach | Dornkirk |
The final join, the FULL OUTER JOIN will show you all records from both tables regardless of what is in the other table.
SELECT * FROM Characters FULL OUTER JOIN Kingdoms ON Characters.KingdomID = Kingdoms.KingdomID
Will return this:
| Name | Role | Characters.KingdomID | Kingdoms.KingdomID | Kingdom | RuledBy |
| Van Fanel | King | 1 | 1 | Fanelia | Fanel family |
| Balgus | General | 1 | 1 | Fanelia | Fanel family |
| Allen Schezar | Knight | 2 | 2 | Asturia | Grava Aston |
| Millerna Aston | Princess | 2 | 2 | Asturia | Grava Aston |
| Hitomi Kanzaki | Fortune Teller | 4 | |||
| 3 | Zaibach | Dornkirk |
For the LEFT and RIGHT OUTER JOINS you need to be careful of the order you list the tables. For the INNER JOIN and FULL OUTER JOIN the table order doesn’t matter, you will still get the same results. The database server will reorder the fields in the return so that the fields first table will show up on the left.
In all of the examples so far the each record from one table only matched with one record from the other table, this is known as a one-to-one relationship. The JOIN clauses can still work if a record has multiple matches in the other table; a one-to-many or many-to-many relationship. In these cases the JOIN will return the records multiple times so that each match is shown. Lets adjust the tables a bit more to illustrate this:
| Characters | |||
| Name | Role | KingdomID | |
| Van Fanel | King | 1 | |
| Allen Schezar | Knight | 2 | |
| Kingdoms | ||
| KingdomID | Kingdom | RuledBy |
| 1 | Fanelia | Fanel family |
| 2 | Asturia | Grava Aston |
| 2 | Zaibach | Dornkirk |
Allen Schezar has a KingdomID of 2, which matches 2 records in the Kingdom table. So when we join these tables, we will get 2 records for Allen Schezar; one for each on the matches. So lets try this SQL statement:
SELECT Name, Kingdom FROM Characters INNER JOIN Kingdoms ON Characters.KingdomID = Kingdoms.KingdomID
Which gives us this result:
| Name | Kingdom |
| Van Fanel | Fanelia |
| Allen Schezar | Asturia |
| Allen Schezar | Zaibach |




