Tuesday, 5 November 2024

SQL clauses and the order in which they are executed

SQL Execution Order (Logical Order):

  1. FROM:
    • This clause specifies the tables (or other data sources) from which to retrieve data. It happens first because it determines which rows are available to the rest of the query.
  2. JOIN:
    • If you're using JOIN, it’s performed immediately after FROM. The join operation combines data from multiple tables based on a condition.
  3. WHERE:
    • Filters the rows returned by the FROM and JOIN clauses based on a condition. The rows are selected after the join operations.
  4. GROUP BY:
    • Groups the rows based on the columns you specify. After filtering (WHERE), SQL groups the data based on specified columns.
  5. HAVING:
    • Similar to WHERE, but it is used to filter groups created by GROUP BY. You would use HAVING to filter out groups based on aggregated results, such as SUM, COUNT, etc.
  6. SELECT:
    • Selects the columns to be displayed in the final output. The SELECT clause is executed after FROM, JOIN, WHERE, and GROUP BY (and HAVING if used), because it operates on the filtered and grouped data.
  7. DISTINCT:
    • If you’re using DISTINCT, it happens immediately after SELECT to remove duplicate rows from the result set.
  8. ORDER BY:
    • The data is sorted based on the columns specified in the ORDER BY clause. Sorting occurs after SELECT and DISTINCT (if used), and it organizes the rows in ascending or descending order.
  9. LIMIT / OFFSET:
    • These clauses are used to restrict the number of rows returned. They occur last, after all sorting and selection of rows are done.

Example:

SELECT DISTINCT s.student_id, s.student_name, SUM(p.score) AS total_score FROM student s JOIN point p ON s.student_id = p.student_id WHERE s.age > 18 GROUP BY s.student_id, s.student_name HAVING SUM(p.score) > 50 ORDER BY total_score DESC LIMIT 1 OFFSET 2;

Execution Order for the Above Query:

  1. FROM: Start with the student and point tables.
  2. JOIN: Combine student and point based on the student_id column.
  3. WHERE: Filter the students who are older than 18 years.
  4. GROUP BY: Group the remaining rows by student_id and student_name.
  5. HAVING: Filter the groups where the total score is greater than 50.
  6. SELECT: Select the columns student_id, student_name, and the aggregated SUM(p.score) as total_score.
  7. DISTINCT: Remove any duplicate rows if necessary.
  8. ORDER BY: Sort the results by total_score in descending order.
  9. LIMIT/OFFSET: Select the 3rd row in the sorted result (because of LIMIT 1 OFFSET 2).

Thank you

No comments:

Post a Comment

Golang Advanced Interview Q&A