SQL Joins


SQL JOINS can be confusing sometimes, so I will go over an example here. Before I show the example, I would like to present three types of data models.

One-to-One Relationship: A can only link to B, and vice versa. Say A is a country and B is the country's capital. Canada only has one capital city and Ottawa is Canada's only capital. You can't associate any other national capitals with Canada and Ottawa can't be the capital of other nations. 

One-to-Many Relationship: A can link to many elements of B, but B can't link back to multiple As. Say A is mom and B is her children. A is the mom of all the children in B and one child from B can only link to one biological mom. 

Many-to-Many Relationship: Many As connect to many Bs, and vice versa. Say A is blog posts and B is labels. A blog can have many labels and a label can have several blog posts.


Next, we have three popular joins on SQL which are the inner joins, the left joins, and the right joins. To show this, we are going to use a TV series selection. This database has a many-to-many relationship because one reviewer can rate several movies, and one movie can be reviewed by several reviewers. We have three tables in this series database: reviewers, series, and reviews. Of course, the reviews table connects with both reviewers table and the series table. Thus, we need something called foreign keys to connect all these tables together. 

Below is the reviewers, series, and reviews tables, respectively. 




The series listing is mostly my favorite TV shows. I only showed 10 entries of the reviews table because the actual table has 47 entries. This reviews table connects the series and reviewers table with foreign keys series_id and reveiwer_id. 

Inner Join: This type of join only takes the data that both tables A and B have. If A or B has Null entries, these entries will be ignored. This will be the diagram in the middle of the SQL JOINS chart. 


Above is an example of an inner join. Here series is Table A and reviews is Table B. I only want to present the title and the average rating of the series. 


Notice that we only have 12 shows here when we actually have 14 shows. This is because Friends and The Walking Dead have no ratings yet so this part of the series table does not overlap with the reviews tables; thus, the average calculations have been ignored. It is also cool to note that Game of Thrones has the highest average rating! WOOHOO! 

Left Join: This type of join appears on the top left corner diagram of the SQL JOINS chart where table A has the red shade. This means that we only take the data inside table A and the overlap with table B. For instance, we want to find the reviewers rating behavior. 


Above tries to find the number of ratings, min/max ratings, average ratings, and status of reviewers. Here Table A is reviewers and Table B is the reviews. The keys connecting the two tables are reviewers.id and reviews.reviewer_id. Since we did a left join, so all the data in Table A reviewers will be used, even including those who did not rate any shows. 


It appears that Bruce Wayne has gone to sleep during the rating season. He is the only inactive user in the database. On the other hand,  Sabrina and Fiona are equally active with 10 ratings. 

Right Join: This type of join appears on the top right corner diagram of the SQL JOINS chart where table B has the red shade. This means we want to take all data from B even if it is empty and stick it to the overlap of tables A and B.  For instance, we want to find the average rating of each genre. 


Table A is reviews and Table B is series, and they are connected by the keys series.id and reviews.series_id. We want every entry from series, which is all the TV series. If NULL occurs because of no rating, then will be replaced by 0 in this case. 


As a result, we found out that Fantasy has the highest rating, which is merely reflecting the average rating of Game of Thrones. Well, I can't complain about not having enough of data points there because it is Game of Thrones!!!
Here is the SQL code!

1 comment:

  1. Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. big data projects for students But it’s not the amount of data that’s important.Project Center in Chennai


    Python Training in Chennai
    Python Training in Chennai
    The new Angular TRaining will lay the foundation you need to specialise in Single Page Application developer. Angular Training Project Centers in Chennai

    ReplyDelete