SQL GROUP BY, then WHERE, then COUNT: A Detailed Guide to Counting Courses with Passed Tests
In this article, we’ll explore how to write an efficient SQL query that counts the number of courses where both evaluations (test1 and test2) have been passed on the first attempt. We’ll break down the problem into two steps: first, retrieving the first attempts for each course, and then filtering out the courses that don’t meet the condition.
Understanding the Problem
We're given a database table tbl with columns row #, course_id, eval_type, eval_date, and Passed?. The queries in this article refer to eval_date as date and to Passed? as passed, where passed holds 'Y' or 'N'. Our goal is to write an SQL query that returns the number of courses in which both test1 and test2 were passed on the first attempt.
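To make the table layout concrete, here is a minimal sketch that builds it in an in-memory SQLite database from Python. The rows are hypothetical sample data, and the column names date and passed follow the queries rather than the original headers; neither comes from a real dataset.

```python
import sqlite3

# Hypothetical sample rows: (course_id, eval_type, date, passed).
SAMPLE_ROWS = [
    (1, "test1", "2023-01-10", "Y"),
    (1, "test2", "2023-01-20", "Y"),
    (2, "test1", "2023-01-11", "N"),  # failed first attempt
    (2, "test1", "2023-02-01", "Y"),  # retake
    (2, "test2", "2023-02-10", "Y"),
    (3, "test1", "2023-01-12", "Y"),
    (3, "test2", "2023-01-25", "N"),  # failed first attempt
    (3, "test2", "2023-02-15", "Y"),  # retake
]


def build_sample_db():
    """Create an in-memory database holding the sample table tbl."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl (course_id INTEGER, eval_type TEXT, date TEXT, passed TEXT)"
    )
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


conn = build_sample_db()
# Number of recorded attempts per course, retakes included.
attempts_per_course = conn.execute(
    "SELECT course_id, COUNT(*) FROM tbl GROUP BY course_id ORDER BY course_id"
).fetchall()
print(attempts_per_course)
```

In this sample only course 1 passes both tests on the first attempt, so a correct query should count exactly one course.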
Step 1: Retrieving First Attempts for Each Course
To start, we’ll create an inner query that retrieves the first attempts for each course. We can achieve this by using a subquery or a Common Table Expression (CTE). In our case, we’ll use a CTE to make the query more readable.
WITH FirstAttempts AS (
    SELECT
        row_number() over (partition by course_id, eval_type order by date) as seq,
        course_id,
        eval_type,
        date,
        passed
    FROM tbl
)
SELECT *
FROM FirstAttempts
WHERE seq = 1 AND passed = 'Y'
In this query:
- We create a CTE named FirstAttempts that selects the required columns from the tbl table.
- The row_number() function numbers the rows within each partition (here, each combination of course_id and eval_type), starting at 1. The rows in each partition are ordered by date.
- We then select only the rows where seq equals 1 (i.e., the first attempt for each course and evaluation type) and passed equals 'Y'.
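As a quick check of the ranking step, the sketch below runs the same CTE against a few hypothetical rows in an in-memory SQLite database (window functions need SQLite 3.25 or later). The sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (course_id INTEGER, eval_type TEXT, date TEXT, passed TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, "test1", "2023-01-10", "Y"),
        (1, "test2", "2023-01-20", "Y"),
        (2, "test1", "2023-01-11", "N"),  # failed first attempt
        (2, "test1", "2023-02-01", "Y"),  # retake, gets seq = 2
        (2, "test2", "2023-02-10", "Y"),
    ],
)

# Keep only first attempts (seq = 1) that were passed.
first_passed = conn.execute(
    """
    WITH FirstAttempts AS (
        SELECT
            ROW_NUMBER() OVER (PARTITION BY course_id, eval_type ORDER BY date) AS seq,
            course_id, eval_type, date, passed
        FROM tbl
    )
    SELECT course_id, eval_type, date
    FROM FirstAttempts
    WHERE seq = 1 AND passed = 'Y'
    ORDER BY course_id, eval_type
    """
).fetchall()
print(first_passed)
```

The retake of test1 in course 2 is passed, but it has seq = 2, so it does not appear in the result.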
Step 2: Filtering Out Courses with Unpassed Tests
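Now that we have the first attempts, we need to filter out courses where either test1 or test2 was not passed on its first attempt. An alternative to the window function is to find the earliest date per course and evaluation type with GROUP BY, discard any first attempt that was failed, and keep only the courses for which both first attempts remain.
SELECT t1.course_id
FROM (
    SELECT
        course_id,
        eval_type,
        min(date) as date
    FROM tbl
    WHERE eval_type IN ('test1', 'test2')
    GROUP BY course_id, eval_type
) t1
WHERE NOT EXISTS (
    SELECT *
    FROM tbl t2
    WHERE
        t2.course_id = t1.course_id
        AND t2.eval_type = t1.eval_type
        AND t2.date = t1.date
        AND t2.passed = 'N'
)
GROUP BY t1.course_id
HAVING COUNT(*) = 2
In this query:
- The subquery selects the minimum date for each course and evaluation type (using min(date)), restricted to test1 and test2. This assumes eval_type stores exactly those two values for the tests.
- The NOT EXISTS condition drops a first attempt when a row with the same course ID, evaluation type, and date has passed equal to 'N'. Matching on eval_type matters because two evaluations of one course can fall on the same date.
- GROUP BY with HAVING COUNT(*) = 2 keeps only the courses for which both first attempts survived the filter. This assumes at most one attempt per course, evaluation type, and date.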
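One subtlety when a first attempt is identified by its date: two evaluation types of the same course can share a date. The sketch below uses hypothetical rows in an in-memory SQLite database to compare a NOT EXISTS correlated on course_id and date alone with one that also matches eval_type. The placeholder extra_condition is only a device for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (course_id INTEGER, eval_type TEXT, date TEXT, passed TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (5, "test1", "2023-04-01", "Y"),  # passed on the first attempt
        (5, "test2", "2023-04-01", "N"),  # same day, failed
        (5, "test2", "2023-04-08", "Y"),  # retake
    ],
)

# First attempts (earliest date per course and type) that were not failed.
QUERY = """
SELECT t1.course_id, t1.eval_type
FROM (
    SELECT course_id, eval_type, MIN(date) AS date
    FROM tbl
    GROUP BY course_id, eval_type
) t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl t2
    WHERE t2.course_id = t1.course_id
      AND t2.date = t1.date
      AND t2.passed = 'N'
      {extra_condition}
)
ORDER BY t1.course_id, t1.eval_type
"""

by_date_only = conn.execute(QUERY.format(extra_condition="")).fetchall()
by_date_and_type = conn.execute(
    QUERY.format(extra_condition="AND t2.eval_type = t1.eval_type")
).fetchall()
print(by_date_only, by_date_and_type)
```

Correlating on date alone lets the failed test2 row hide the passed test1 attempt from the same day, so the first result is empty. Adding the eval_type match keeps the test1 row.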
Combining the Two Queries
To get our final result, we combine the two steps in a single statement: the CTE ranks the attempts, an inner query groups the passed first attempts by course and keeps the courses that have both, and the outer query counts those courses.
WITH FirstAttempts AS (
    SELECT
        row_number() over (partition by course_id, eval_type order by date) as seq,
        course_id,
        eval_type,
        passed
    FROM tbl
    WHERE eval_type IN ('test1', 'test2')
)
SELECT COUNT(*) as count_passed_courses
FROM (
    SELECT course_id
    FROM FirstAttempts
    WHERE seq = 1 AND passed = 'Y'
    GROUP BY course_id
    HAVING COUNT(*) = 2
) passed_courses
In this query:
- The CTE is defined in the same statement that uses it, since a CTE is visible only within its own statement. It considers only test1 and test2 rows.
- The inner query keeps the passed first attempts (seq = 1 and passed = 'Y') and groups them by course_id. Because each course has at most one first attempt per evaluation type, HAVING COUNT(*) = 2 keeps exactly the courses where both tests were passed on the first attempt.
- The outer query counts the remaining courses and returns a single row with count_passed_courses.
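As an end-to-end sketch, the snippet below counts the qualifying courses in an in-memory SQLite database and cross-checks the number with plain Python. The sample rows and the 'test1'/'test2' values in eval_type are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows: (course_id, eval_type, date, passed).
rows = [
    (1, "test1", "2023-01-10", "Y"),
    (1, "test2", "2023-01-20", "Y"),
    (2, "test1", "2023-01-11", "N"),
    (2, "test1", "2023-02-01", "Y"),
    (2, "test2", "2023-02-10", "Y"),
    (3, "test1", "2023-01-12", "Y"),
    (3, "test2", "2023-01-25", "N"),
    (3, "test2", "2023-02-15", "Y"),
    (4, "test1", "2023-03-01", "Y"),
    (4, "test2", "2023-03-05", "Y"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (course_id INTEGER, eval_type TEXT, date TEXT, passed TEXT)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)

# Rank attempts, keep passed first attempts, require both tests, then count.
(count,) = conn.execute(
    """
    WITH FirstAttempts AS (
        SELECT
            ROW_NUMBER() OVER (PARTITION BY course_id, eval_type ORDER BY date) AS seq,
            course_id, eval_type, passed
        FROM tbl
        WHERE eval_type IN ('test1', 'test2')
    )
    SELECT COUNT(*)
    FROM (
        SELECT course_id
        FROM FirstAttempts
        WHERE seq = 1 AND passed = 'Y'
        GROUP BY course_id
        HAVING COUNT(*) = 2
    ) AS passed_courses
    """
).fetchone()

# Cross-check: record the outcome of the earliest attempt per (course, type).
first_outcome = {}
for course_id, eval_type, date, passed in sorted(rows, key=lambda r: r[2]):
    first_outcome.setdefault((course_id, eval_type), passed)
expected = sum(
    1
    for c in {r[0] for r in rows}
    if first_outcome.get((c, "test1")) == "Y"
    and first_outcome.get((c, "test2")) == "Y"
)
print(count, expected)
```

Courses 1 and 4 qualify here. Courses 2 and 3 each pass a test only on a retake, so they are not counted.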
Conclusion
We broke the problem of counting courses with both test1 and test2 passed on the first attempt into two steps: retrieving the first attempt for each course and evaluation type, and filtering out courses where either test was not passed on that attempt. The combined query applies both steps to return the count.
Last modified on 2023-06-30