Advanced querying

Physical modelIn the first module we used the SQL SELECT statement to retrieve data from both single and multiple tables. In this topic we look at some more sophisticated querying methods including:

  • Using the same table more than once in a query: a variation on joining tables
  • Subqueries: an inner query is placed within an outer query
  • Combining queries: Like subqueries, derived tables and set operations allow for the results of two or more queries to be combined

Using the same table more than once in a query

To begin this module, the first reading shows how a table can be joined to itself to answer queries that need data from more than one row in a table. As well as illustrating a new technique, this reading also provides a review of using the SELECT statement.

have a go: SELECT processing order

 

Subqueries

Textbook reading
Ch8 pp 340-345

Subqueries is another method to retrieve information from multiple tables. Two readings are provided:

  1. The first reading, subqueries and Correlated Subqueries explains subqueries
  2. The second reading provides a brief summary of subqueries.
    As you read think about the processing order of the query execution.

If you wish, you can now complete questions 1-4 of Getting your hands dirty.

 

Combining queries

Textbook reading
Ch8 pp 345-349

The reading Using Derived Tables, Combining Queries, Conditional Expressions and More Complicated SQL Queries cover:

  • Constructing a temporary table to use in a query, and
  • Combining the output of different queries together.

The following reading in Derived Tables provides a summary of the UNION Statement.

 

Another viewpoint

Writing SQL queries is a knack and does take time and practice. To help you, an additional reading is included:

  • Elmasri provides an alternative explanation of the
    SELECT statement. As you work through, you might wish to try
    and write the queries before looking at the answers.

Alternatively, jump in the deep end and start working on the queries in the Getting Your Hands Dirty section.

 

Getting your hands dirty

INVOICE databaseINVOICE database

Try the following queries on the INVOICE database.

Remember
There is more than one way to write a query, therefore your answer may be quite different to ours. If it returns the right rows then it is correct.

 

1. Find the items that have a price that is greater than the average item price Your query should return

outer query

2. Find the most common item sold Your query should return

outer query

3. Find all the customers that have not had an invoice. (Note: This is the opposite of an earlier query that finds all the customers that have an invoice against them.) Your query should return

query

4. Find all the products that have never been sold.Your query should return

query

5. Find the items that have a price less than the average item price. Your query should return

query

6. Find the product descriptions and item price of the most expensive and least expensive items stocked Your query should return

query

INVOICE databaseCheck

1. Find the items that have a price that is greater than the average item price Your query should return

outer query

View the answer

      SELECT ItemNo, ItemDesc, ItemPrice
      FROM ITEM
      WHERE ItemPrice > (SELECT AVG(ItemPrice)
                         FROM ITEM);

The inner query is executed first and returns a single row:

inner query

The outer query is then executed, the value returned is placed in each row of the table created by the outer query and evaluated (as we know, SQL can only compare a single row at a time). The rows that match the criteria are returned.

2. Find the most common item sold Your query should return

outer query

View the answer

      SELECT init.ItemNo, ItemDesc
      FROM InvoiceItem init, Item it
      WHERE init.ItemNo = it.ItemNo
      GROUP BY init.ItemNo, ItemDesc
      HAVING SUM(Quantity) >= ALL (SELECT SUM(Quantity)
                              FROM InvoiceItem
                              GROUP BY ItemNo);

If we used the ANY operator in the above query, all of the values from the inner query will be evaluated as true in the outer query and displayed.  Try it and see.

The inner query returns:

inner query

This is then evaluated by the outer query.

3. Find all the customers that have not had an invoice. (Note: This is the opposite of an earlier query that finds all the customers that have an invoice against them.) Your query should return

query

View the answer

This is the opposite of an earlier query that finds all the customers that have an invoice against them.

      SELECT CustABN AS ABN, CustName AS Name
      FROM CUSTOMER
      WHERE CustABN NOT IN (SELECT c.CustABN
                            FROM CUSTOMER c, INVOICE i
                            WHERE c.CustABN = i.CustABN);

Here the logical NOT operator is used with the IN operator. The inner query is the earlier query that found all customers with an invoice against them. Using NOT IN will now finds all the records that weren’t retuned by the inner query.

4. Find all the products that have never been sold.Your query should return

query

View the answer

      SELECT i.ItemNo, ItemDesc
      FROM ITEM i
      WHERE NOT EXISTS (SELECT *
                        FROM INVOICEITEM ii
                        WHERE i.ItemNo = ii.ItemNo);

5. Find the items that have a price less than the average item price. Your query should return

query

View the answer

      SELECT ItemDesc, ItemPrice, AvgPrice
      FROM (SELECT AVG(ItemPrice) AS AvgPrice FROM ITEM), ITEM
      WHERE ItemPrice < AvgPrice;

6. Find the product descriptions and item price of the most expensive and least expensive items stocked Your query should return

query

View the answer

In an earlier exercise we had found the most expensive and least expensive items stocked using:

      SELECT MAX(ItemPrice) AS Dearest, MIN(ItemPrice) AS Cheapest
      FROM ITEM;

To now include the product descriptions:

     (SELECT ItemDesc AS Item, ItemPrice
      FROM ITEM
      WHERE ItemPrice IN (SELECT MAX(ItemPrice)
                          FROM ITEM))
      UNION

     (SELECT ItemDesc AS Item, ItemPrice
      FROM ITEM
      WHERE ItemPrice IN (SELECT MIN(ItemPrice)
                          FROM ITEM));

STUDENT databaseSTUDENT database

View the Student Tables

1. For students that took both HIT999 and HIT232, find their grades in both units?

2. Find all students who have the same coordinator as Donald Duck?

3. Find all students whose coordinator has no other students?

4. Find students who have coord no 23 or who did HIT999

5. Which Coordinators advise more than 10 students?

6. Find the number of students advised by coordinators who advise a student in 3rd year (completed more than 160 credit points) or advise a student taking HIT232

STUDENT databaseCheck

1. For students that took both HIT999 and HIT232, find their grades in both units? View the answer

      SELECT g1.student_No, g1.Course_No, g1.Grade, g2.Course_No, g2.Grade
      FROM Grade g1, Grade g2
      WHERE g1.student_No = g2.student_No
      AND g1.Course_No = 'HIT232'
      AND g2.Course_No = 'HIT999';

2. Find all students who have the same coordinator as Donald Duck? View the answer

      SELECT Name
      FROM Students
      WHERE Coordinator_No = (SELECT Coordinator_No
                              FROM Students
                              WHERE Name = 'Donald Duck');

In this query the subquery only returned a single value and so the = operator could be used. Had the subquery returned more than one row or no rows and = still be used there would have been a runtime error.

In cases where a subquery could return no value, one value or many values, ANY or ALL should be used. This query could be rewritten as:

      SELECT Name
      FROM Students
      WHERE Coordinator_No = ANY (SELECT Coordinator_No
                                  FROM Students
                                  WHERE Name = 'Donald Duck');

3. Find all students whose coordinator has no other students? View the answer

      SELECT name
      FROM student s1
      WHERE NOT EXISTS ( SELECT *
                        FROM student s2
                        WHERE s1. name <> s2.name
                        AND s1.coordinator_no = s2.coordinator_no);

4. Find students who have coord no 23 or who did HIT999 View the answer

      (SELECT Student_No
      FROM student;
      WHERE Coord_No = 23)
      UNION
      (SELECT Student_No
      FROM Grade
      WHERE Course_No='HIT999' );

5. Which Coordinators advise more than 10 students?View the answer

      SELECT Name
      FROM Coordinator
      WHERE (SELECT Count(*)
            FROM Student
            WHERE Student.coord_no=Coordinator.coord_no) >10;

6. Find the number of students advised by coordinators who advise a student in 3rd year (completed more than 160 credit points) or advise a student taking HIT232 View the answer

      SELECT coord_no, Count(*)
      FROM Student
      GROUP BY coord_no
      HAVING max(total_units) > 160
      OR coord_no IN (SELECT coord_no
                     FROM grade, student
                     WHERE course_no = 'HIT232'
                     AND student.student_no = grade.student_no);

PRODUCT databasePRODUCT database

For those of you who want or need some more practice, here are some additional queries based on a product database. If you are feeling confident you may wish to skip or skim through these.

First read the description of the product database.

Then try the following queries.

1. Find the manufacturers that sell printers but not PCs.

2. Find those hard disk sizes that occur in two or more PCs

3. Find those PC models that have both the same speed and RAM as at least one other PC. A PC should only be listed once.

4. Find those manufacturers of at least two different computers (Pcs or laptops) with speeds of at least 1000. Note the two computers could be a pc and laptop or 2 pcs or 2 laptops.

5. Find the PCs whose speed is slower than that of all Laptops?

6. Find the model number of the item (PC, laptop, or printer) with the highest price.

7. Find the maker(s) of the PC(s) with the fastest processor among all those PC's that have the smallest amount of RAM.

PRODUCT databaseCheck

1. Find the manufacturers that sell printers but not PCs. View the answer

      (SELECT maker
      FROM Product p, printer pr
      WHERE p.model=pr.model)
      MINUS
      (SELECT maker
      FROM Product p, pc
      WHERE p.model=pc.model);

2. Find those hard disk sizes that occur in two or more PCs. View the answer

      SELECT hd
      FROM pc
      GROUP by hd
      HAVING count(*) >=2;

3. Find those PC models that have both the same speed and RAM as at least one other PC. A PC should only be listed once. View the answer

      SELECT DISTINCT pc1.model
      FROM pc pc1, pc pc2
      WHERE pc1.speed = pc2.speed
      AND pc1.ram = pc2.ram
      AND pc1.model <> pc2.model;

4. Find those manufacturers of at least two different computers (Pcs or laptops) with speeds of at least 1000. Note the two computers could be a pc and laptop or 2 pcs or 2 laptops. View the answer

      SELECT maker
      FROM (SELECT maker, p.model, pc.speed
            FROM product p, pc
            WHERE p.model=pc.model
            AND pc.speed >= 1000
            UNION
            SELECT maker, p.model, l.speed
            FROM product p, laptop l
            WHERE p.model=l.model
            AND l.speed >= 1000)
      GROUP BY maker
      HAVING count(*) >= 2;

5. Find the PCs whose speed is slower than that of all Laptops?View the answer

      SELECT *
      FROM pc
      WHERE speed <= ALL (select speed from laptop);

OR

      SELECT *
      FROM pc
      WHERE speed < (SELECT min(speed) FROM laptop);

6. Find the model number of the item (PC, laptop, or printer) with the highest price. View the answer

      SELECT model
      FROM
            (SELECT model, price FROM pc
            UNION
            SELECT model, price FROM laptop
            UNION
            SELECT model, price FROM printer)
      WHERE price = (SELECT max(price)
                  FROM
                  (SELECT model, price FROM pc
                  UNION
                  SELECT model, price FROM laptop
                  UNION
                  SELECT model, price FROM printer));

7. Find the maker(s) of the PC(s) with the fastest processor among all those PC's that have the smallest amount of RAM. View the answer

      SELECT p.maker
      FROM pc pc1, product p
      WHERE pc1.model = p.model
      AND ram = (SELECT min(ram) FROM pc)
      AND speed >= ALL( SELECT speed
                        FROM pc
                        WHERE ram=pc1.ram);

      SELECT p.model, p.maker
      FROM pc, product p
      WHERE pc.model = p.model
      AND (ram, speed) IN (SELECT ram, max(speed)
                        FROM pc
                              GROUP BY ram
                              HAVING ram = (SELECT min(ram) FROM pc) );