Blog Blog Posts Business Management Process Analysis

Self Join In SQL

Each table in a database typically holds a certain sort of data. As a result, in a database, there are sometimes hundreds of tables that are connected to one another. This means that joins are required. Using the Link keyword, you may join various tables based on their common columns. A table can also be joined to itself, which is known as a self join.

What is SQL?

Structured Query Language (SQL) is a programming language that is used to maintain relational databases and execute different operations on the data contained inside them. SQL, which was developed in the 1970s, is now widely used not just by database administrators but also by developers building data integration scripts and data analysts wanting to set up and run analytical queries.

In this blog, we will define a self-join, explain how it works, and when you should use it in your SQL queries.

Table of Contents:

If you are interested in SQL databases, Refer to this youtube video

What is Self Join in SQL

As the names indicate, the self join joins a table to itself. To utilize a self join, the table must include a column (let’s call it X) that operates as the primary key and another column (let’s call it Y) that holds data that can be matched up with the values in Column X. Columns X and Y do not have to have the same value for every given row, and Column Y may even be null.

Using table abbreviations, we can do Self Join. The table abbreviations prevent us from using the same table name again in a single sentence. It will produce an error if we use the same table name more than once in a single query without using table aliases.

Table abbreviations allow us to utilize the temporary name of the table that will be used in the query. The following explanation will help us know the table abbreviations.

Want to Ace the Job Interview of SQL, Here’s a Jackpot for you Top 72 SQL Interview Questions and Answers for 2022

Syntax and Query in Self Join SQL

You may link a table to itself using a self join, It is useful for querying hierarchical data or comparing rows inside the same database.

In a self join, the inner join or left join clause is applied. Because the self join query refers to the same table, the table alias is used to give multiple identities to the same table within the query.

Self Join Syntax

The syntax of a self join is identical to that of connecting two separate tables. We utilize aliases names for tables here because the table names are the same.

To learn more about Self Join SQL, visit our SQL Tutorial!

Examples of Self Join SQL

Here’s an example of the employees and their management. Let’s have look at the Employees table:

Each employee has  his/her unique id, which we call “Column X.” The column Manager Id holds the Id of the employee or manager; this is our “Column Y.” If we use these columns to track the employee-manager relationships in this table:

Id   Full Name Salary Manager Id
1  Chirs Hemsworth  200000 5
2  Tom Holland 250000 5
3 Ben Affleck 120000 1
4 Christian Bale 150000  
5 Gal Gadot 300000 4

In hierarchies, this form of table arrangement is particularly common. Now, we can execute the following query to display the manager’s name for each employee in the same row:

Code:

SELECT
    employee.Id,
        employee.FullName,
        employee.ManagerId,
        manager.FullName as ManagerName
FROM Employees employee
JOIN Employees manager
ON employee.ManagerId = manager.Id

which returns the following result:

Id FullName ManagerId ManagerName
1 Chirs Hemsworth 5 Gal Galdot
2 Tom Holland 5 Gal Galdot
3 Ben Affleck 1 Chirs Hemsworth
5 Gal Galdot 4 Christian Bale

The query collects the columns Id, FullName, and ManagerId from the employee table. It also chooses the FullName column of the table changes are taking place manager and labels it as ManagerName. As an outcome, the ID and name of every employee who has a manager are produced.

If you want to learn about SQL, do not miss the opportunity to enroll in Microsoft SQL Certification Training

The Employees table is connected with itself in this query and performs two functions:

By doing so, we are treating the two copies of the Employees table as if they were two separate tables, one for the employees and one for the supervisors.

Id FullName Age
1 Gal Galdot 25
2 Chirs Evans 70
3 Tom Holland 35
4 Jon Snow 38

Assume we need to find all possible pairings of coworkers so that everyone gets an opportunity to meet everyone else at the business introducing the event. The SQL code is as follows:

Code:

SELECT
    teammate1.FullName as Teammate1FullName,
    teammate1.Age as Teammate1Age,
        teammate2.FullName as Teammate2FullName,
    teammate2.Age as Teammate2Age
FROM Colleagues teammate1
CROSS JOIN Colleagues teammate2
ON teammate1.FullName <> teammate2.FullName

The result:

Teammate1FullName Teammate1Age Teammate2FullName Teammate2Age
Chirs Evans 70 Gal Galdot 25
Tom Holland 35 Gal Galdot 25
Jon Snow 38 Gal Galdot 25
Gal Galdot                 25 Chirs Evans 70
Tom Holland 35 Chirs Evans 70
Jon Snow 38 Chirs Evans 70
Gal Galdot                 25 Tom Holland 35
Chirs Evans 70 Tom Holland 35
Jon Snow 38 Tom Holland 35
Gal Galdot                 25 Jon Snow 38
Chirs Evans 70 Jon Snow 38
Tom Holland 35 Jon Snow 38

Every single individual in the table is matched with the outcome. We don’t want anyone partnered with himself or herself, therefore we have the ON clause condition teammate1.FullName <> teammate2.FullName.Because there are four colleagues at this event, each participant will be partnered with three other colleagues.

A real-world example is flight information in airports, which generates massive amounts of data each hour. Assume we want to look for a flight identification number as well as information about the departure and destination airports. Consider the tables below:

AirportId Country City
1 India Delhi
2 Germany Hamburg
3 Canada ottawa
4 France Paris
5 Bangladesh Dhaka

Table Flight:

FlightId AirplaneId StartTimesStamp EndTimesStamp StartAirportId EndAirportId
1 889191 2020-01-14 13:00:00 2020-01-14 15:00:00 3 4
2 868821 2020-02-04 01:00:00 2020-02-04 16:00:00 1 5
3 836755 2020-02-15 09:00:00 2020-02-15 12:00:00 5 4
4 897867 2020-02-24 03:00:00 2020-02-24 19:00:00 4 2
5 834567 2020-03-25 10:00:00 2020-03-25 12:00:00 2 1
6 899905 2020-04-01 00:00:00 2020-04-01 17:00:00 3 1
7 989898 2020-01-01 02:00:00 2020-01-01 04:00:00 1 2

Make mention that the Airport table’s AirportId column is the foreign key to the Flight table’s StartAirportId and EndAirportId columns. We will connect the Airport table to the Flight table twice, as follows:

The inquiry is as follows:

SELECT
	flight.FlightId,
    	flight.AirplaneId,
    	flight.StartAirportId,
    	startairport.Country as StartAirportCountry,
    	startairport.City as StartAirportCity,
    	flight.EndAirportId,
    	endairport.Country as EndAirportCountry,
    	endairport.City as EndAirportCity
FROM Flight flight
JOIN Airport startairport
ON flight.StartAirportId = startairport.AirportId
JOIN Airport endairport
ON flight.EndAirportId = endairport.AirportId

And the result of the query looks like this:

FlightId AirplaneId StartAirportId StartAirportCountry StartAirportcity EndAirportId EndAirportCountry EndAirportcity
1 889191 3 Canada Ottawa 4 France Paris
2 868821 1 India Delhi 5 Bangladesh Dhaka
3 836755 5 Bangladesh Dhaka 4 France Paris
4 897867 4 France Paris 2 Germany Hamburg
5 834567 2 Germany Hamburg 1 India Delhi
6 899905 3 Canada Ottawa 1 India Delhi
7 989898 1 India Delhi 2 Germany Hamburg

Let us now review the outcome. The first three columns are from a simpleSELECT of the Flight table. The following two columns are from Airport, which serves as the starting airport table; The rows are matched using in  AirportId and StartAirportId from the Airport and Flight tables, respectively.

Become a Database Architect

Conclusion

As we’ve learned, the self join  SQL is a significant subset of the join. We’ve seen examples of self join applications ranging from processing a hierarchy in a database to matching rows inside a table. We can join the same table numerous times, but each reference must have an alias that defines its purpose. These table abbreviations are used to retrieve columns from this single table based on the role that it is associated with.

You can visit the SQL Community  page to get your doubts resolved!

The post Self Join In SQL appeared first on Intellipaat Blog.

Blog: Intellipaat - Blog

Leave a Comment

Get the BPI Web Feed

Using the HTML code below, you can display this Business Process Incubator page content with the current filter and sorting inside your web site for FREE.

Copy/Paste this code in your website html code:

<iframe src="https://www.businessprocessincubator.com/content/self-join-in-sql/?feed=html" frameborder="0" scrolling="auto" width="100%" height="700">

Customizing your BPI Web Feed

You can click on the Get the BPI Web Feed link on any of our page to create the best possible feed for your site. Here are a few tips to customize your BPI Web Feed.

Customizing the Content Filter
On any page, you can add filter criteria using the MORE FILTERS interface:

Customizing the Content Filter

Customizing the Content Sorting
Clicking on the sorting options will also change the way your BPI Web Feed will be ordered on your site:

Get the BPI Web Feed

Some integration examples

BPMN.org

XPDL.org

×