You've seen the Russian dolls where you've got one doll, inside another, inside another, etc.? A SQL subquery is very much like this; it's tucked inside other commands, or even other SQL subqueries.
They may be placed inside one of the following commands:
Subqueries are used to refine a SQL query when you'd rather just use one operation to find out what you need to know rather than two. This is best when you are planning to use a particular query over again in the future, as for a monthly sales figure or other regular metric.
Sometimes when you're working on a SQL query, you find that you need to find out something else to finish your operations properly. For instance, suppose you need to find the top customers at your grocery store who have shopped during a certain period? Or the opposite, find among customers who have shopped during a certain period which spend the most money?
A SQL subquery is perfect for finding out this information. In plain English, your command would be something like "Select the best customers Where (Select customers who shopped during January)." Assuming you have your top customers segregated in a second table, in SQL code this would translate to:
SELECT "TopCustomer" WHERE (SELECT "shoppingdate" = "October")
Another example is when you need to copy a table or portion of a table into a new table so you can perform other operations on it.
INSERT INTO TableNew SELECT TableNeeded
Where this gets really confusing is when you start nesting subquery after subquery. Like the Russian dolls, you can insert new subqueries for as long as you can keep track of them. Most SQL programmers are creating subqueries long before they start to use them. It's just that most people create new tables from the nested query to perform the operations in the outside query before actually compiling subquery statements.
Subqueries can form three types of statements:
Syntax will be as follows:
SELECT "column_name1"
FROM "table_name1"
WHERE "column_name2" [Comparison Operator]
(SELECT "column_name3"
FROM "table_name2"
WHERE [Condition])
[Comparison Operator] could be equality operators such as =, >, <, >=, <=. It can also be a text operator such as "LIKE". The portion in red is considered as the "inner query", while the portion in black is considered as the "outer query".
Examples:
Table Store_Information
Store_city | Sales | Date |
---|---|---|
Los Angeles | $1500 | Jan-05-1999 |
San Diego | $250 | Jan-07-1999 |
Los Angeles | $300 | Jan-08-1999 |
Boston | $700 | Jan-08-1999 |
Table Geography
Region_name | Store_city |
---|---|
East | Boston |
East | New York |
West | Los Angeles |
West | San Diego |
We want to use a subquery to find the sales of all stores in the West region. To do so, we use the following SQL statement:
SELECT SUM(Sales) FROM Store_Information
WHERE Store_city IN
(SELECT Store_city FROM Geography
WHERE Region_name = 'West')
Result:
SUM(Sales) |
2050 |
In this example, instead of joining the two tables directly and then adding up only the sales amount for stores in the West region, we first use the subquery to find out which stores are in the West region, and then we sum up the sales amount for these stores.
In the above example, the inner query is first executed, and the result is then fed into the outer query. This type of subquery is called a simple subquery. If the inner query is dependent on the outer query, we will have a correlated subquery. An example of a correlated subquery is shown below:
SELECT SUM(a1.Sales) FROM Store_Information a1
WHERE a1.Store_city IN
(SELECT Store_city FROM Geography a2
WHERE a2.Store_city = a1.Store_city)
Notice the WHERE clause in the inner query, where the condition involves a table from the outer query.