Using the COUNT() function with GROUP BY is one of the most common SQL constructs in aggregate queries. Read this article to find out how to use COUNT() with GROUP BY correctly using 5 examples.
In this article, we will explain the importance of using COUNT
with GROUP BY
. We’ll talk about why it is essential in SQL and how it enables data analysis and summarization based on specific criteria. This combination empowers users to extract meaningful insights, calculate counts, and generate statistical summaries from large datasets.
If you’re looking for an in-depth review of basic SQL concepts like COUNT()
and GROUP BY
, I recommend our interactive SQL Basics course. It contains 129 exercises, which will help you review all key SQL concepts.
How to Use COUNT() and GROUP BY
If you’re here just for the quick answer, here’s the TLDR:
The correct way of using COUNT()
with GROUP BY
is shown in the query below:
SELECT Store, COUNT(*) as NumberOfSales, FROM SalesGROUP BY Store;
Let's break down the result to understand how this query works. The rows with the same value in the Store
column are grouped together. Imagine an intermediate table where these rows are grouped and marked with different colors, like the image below. This would be our intermediary table containing only the Store
column, since that is the column that is part of our SELECT
statement.
Store | ProductID | CustomerID |
---|---|---|
Store A | 165 | 7 |
Store A | 111 | 16 |
Store A | 145 | 25 |
Store B | 136 | 9 |
Store B | 111 | 38 |
Store C | 136 | 16 |
Store C | 111 | 8 |
The database then logically counts the number of rows in each group using the COUNT(*) function. This count represents the number of orders for each store.
Once the rows are counted, there is no need to have duplicate rows with the same Store value. So, GROUP BY will reduce the number of rows to only unique values. In other words, it will get rid of the individual rows and give us a summary of each group. We end up with the following result:
Store | NumberOfSales |
---|---|
Store A | 3 |
Store B | 2 |
Store C | 2 |
When to Use GROUP BY
GROUP BY
is an SQL clause that groups rows based on one or more column values. It is often used in combination with aggregate functions like COUNT()
, SUM()
, AVG()
, MAX()
, and MIN()
to perform calculations on grouped data.
The GROUP BY
clause is useful when you want to:
- Make calculations and aggregations on subsets of data.
- Generate summary statistics and metrics for different groups or categories.
- Identify patterns and trends within specific groups.
- Generate reports and analyze data based on different dimensions or attributes.
- Apply filters and conditions on grouped data, using the HAVING clause
In summary, GROUP BY
is used to organize and summarize data based on specific columns, functions or expressions, which will allow you to gain insights and perform calculations on distinct groups within a dataset.
When to Use the COUNT() Function
COUNT()
is one of SQL’s most common aggregate functions. It returns the number of rows that match a specified condition or are included in a result set. It is often used to retrieve the total number of records in a table or to calculate the number of occurrences of a particular value within a column.
5 Examples of Using COUNT() with GROUP BY
Now that we’ve gone through the basic scenarios where COUNT()
and GROUP BY
are used, let’s go over some of the more complicated examples. We’ll start off with simpler examples and work our way up to more complex scenarios.
Example #1: GROUP BY a Single Column
The simplest scenario you could encounter is when you need to GROUP BY a single column. In the following example, we need to find out how our company’s employee count is distributed across different job titles.
Before seeing the SQL solution to this scenario, let’s go over the sample data. This is the employees
table:
EmployeeID | FirstName | LastName | JobTitle |
---|---|---|---|
1 | John | Doe | Manager |
2 | Jane | Smith | Supervisor |
3 | Mark | Johnson | Developer |
4 | Emily | Williams | Analyst |
5 | Michael | Brown | Designer |
6 | Sarah | Davis | Developer |
7 | Robert | Wilson | Designer |
8 | Jessica | Taylor | Developer |
By using the COUNT
function with GROUP BY
on the JobTitle
column, we can get a breakdown of the number of employees in each specific role. You can see the query and the output (based on the sample data) below:
SELECT JobTitle, COUNT(*) AS NumberOfEmployeesFROM employeesGROUP BY JobTitle;
JobTitle | NumberOfEmployees |
---|---|
Analyst | 1 |
Designer | 2 |
Developer | 3 |
Manager | 1 |
Supervisor | 1 |
This example works the same way as our initial query. GROUP BY
puts the rows for employees with the same job title into one group. Then the COUNT()
function counts the rows in each group. GROUP BY
then collapses the rows in each group, keeping only the value of the column JobTitle
and the count.
Example #2: GROUP BY Multiple Columns
Of course, you can group rows by more than one column.
In this example, we will look at a sample orders
table containing basic order information:
OrderID | CustomerID | ProductID | ProductCategory | OrderDate | Status | Amount |
---|---|---|---|---|---|---|
1 | 101 | 1001 | Electronics | 2023-05-01 | Completed | 150.00 |
2 | 102 | 1002 | Clothing | 2023-05-02 | Completed | 80.00 |
3 | 101 | 1001 | Home Goods | 2023-06-03 | In progress | 60.00 |
4 | 103 | NULL | Accessories | 2023-06-03 | Canceled | 200.00 |
5 | 101 | 1002 | Electronics | 2023-07-04 | NULL | 120.00 |
6 | 102 | 1001 | NULL | 2023-07-05 | NULL | NULL |
7 | 103 | 1002 | Clothing | 2023-07-06 | In progress | 90.00 |
8 | 102 | 1002 | Accessories | 2023-08-07 | NULL | 75.00 |
9 | 103 | NULL | NULL | 2023-08-08 | NULL | 100.00 |
10 | 101 | 1001 | Home Goods | 2023-09-09 | NULL | NULL |
11 | 102 | 1001 | Home Goods | 2023-06-05 | In progress | 80.00 |
12 | 103 | 1004 | Accessories | 2023-06-06 | Completed | 75.00 |
13 | 102 | 1005 | Electronics | 2023-08-06 | Completed | 88.00 |
We need to write a query that will show the number of orders placed by each customer and the ProductCategory
of that order. This means that we will have to return the CustomerID
and the category that the order falls within.
The query will look like this:
SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrdersFROM ordersGROUP BY CustomerID, ProductCategory;
And the result of running this query can be seen below:
CustomerID | ProductCategory | NumberOfOrders |
---|---|---|
101 | Electronics | 2 |
101 | Home Goods | 2 |
102 | Electronics | 2 |
102 | Accessories | 2 |
102 | Clothing | 1 |
103 | Accessories | 2 |
103 | Clothing | 1 |
103 | NULL | 1 |
Our query groups rows by two columns: CustomerID
and ProductCategory
. This means that GROUP BY
groups rows with the same values of CustomerID
and ProductCategory
into one group. (Rows for CustomerID
101 and the category Electronics
are in one group, but rows for CustomerID
101 and the category Home Goods
are in a different group.) Then the rows in each group are counted by COUNT()
.
Example #3: Using WHERE with COUNT() and GROUP BY
Our previous example analyzed a scenario where we wanted to create an aggregate of all of the information in our orders
table. But sometimes, we might want to slice into this information and only see orders for selected categories.
Using the same sample data that we had before, we will now write a query that shows the same information for orders that fall within the “Accessories” or “Clothing” ProductCategory
.
To do this, we can use the query from Example 2 and just add a WHERE
clause. This clause will filter for records where ProductCategory
is equal to “Accessories” or “Clothing”.
SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrdersFROM ordersWHERE ProductCategory IN (‘Accessories’, ‘Clothing’)GROUP BY CustomerID, ProductCategory;
CustomerID | ProductCategory | NumberOfOrders |
---|---|---|
102 | Accessories | 2 |
103 | Accessories | 1 |
102 | Clothing | 1 |
103 | Clothing | 1 |
If it’s not entirely intuitive how the database has generated the results using the query above, here’s a step-by-step explanation of what happened behind the scenes:
- First, the database scans the
orders
table and reads all the rows. - It then applies the filtering condition in
WHERE ProductCategory IN (‘Accessories’, ‘Clothing’ )
to filter the rows. After this step, only the rows where the product category is “Accessories” or “Clothing” are considered for further processing. - The filtered rows are then grouped based on the values in the
CustomerID
andProductCategory
columns, which are specified in theGROUP BY
clause. - For each of the unique combinations of
CustomerID
andProductCategory
, theCOUNT(*)
function is applied. This will count the number of rows within each group. - The final result set includes the
CustomerID
,ProductCategory
and the count of orders (shown in theNumberOfOrders
column) for each group.
In summary, the database will filter the rows that respect the specified filtering condition. It will then group them according to the specified columns in the GROUP BY
clause, and then calculate the count of orders within each of those groups. The final output will include the CustomerID
, ProductCategory
and the corresponding count of orders for each unique combination of CustomerID
and ProductCategory
.
Example #4: Using ORDER BY with COUNT() and GROUP BY
Looking at the result from example number 4, we can see that some of the rows in the output are mixed. This is because the columns in the GROUP BY
list will generally require – but not guarantee – sorting the output based on the list of columns in the GROUP BY
section.
But if, for example, we need to organize the results on different conditions ( e.g. a timeline or by date) we will need to use an ORDER BY clause:
SELECT CustomerID, ProductCategory, COUNT(*) AS NumberOfOrdersFROM ordersWHERE ProductCategory IN (‘Accessories’, ‘Electronics’)GROUP BY CustomerID, ProductCategoryORDER BY ProductCategory, CustomerID;
CustomerID | ProductCategory | NumberOfOrdes |
---|---|---|
101 | Accessories | 2 |
102 | Accessories | 2 |
102 | Electronics | 1 |
103 | Electronics | 2 |
We’ve added the ORDER BY
clause with the list of columns we want to sort the data by. As you can see, this sorts the output information based on the order of the columns listed.
Example #5: COUNT(expression)
So far, we’ve looked at simple examples of working with COUNT()
; the purpose was to count all of the rows in the source dataset or table.
However, there are more complex ways of using the COUNT–GROUP BY
combination. To explain this, we’ll build a new sample dataset.
We have a table called SurveyResponses
that stores answers to a survey. Some of the questions are optional, which is why there are some respondents with NULL answer values; they’ve skipped the optional questions. You can see the sample data below:
ResponseID | RespondentID | RespondentName | QuestionID | Answer |
---|---|---|---|---|
1 | 101 | John | 1 | Agree |
2 | 101 | John | 2 | No |
3 | 101 | John | 3 | Yes |
4 | 102 | Sarah | 1 | Yes |
5 | 102 | Sarah | 2 | Not Sure |
6 | 102 | Sarah | 3 | NULL |
7 | 103 | Mark | 1 | No |
8 | 103 | Mark | 2 | Maybe |
9 | 103 | Mark | 3 | No |
10 | 104 | Emily | 1 | Yes |
11 | 104 | Emily | 2 | Not Sure |
12 | 104 | Emily | 3 | Disagree |
COUNT(*) - Counting Rows
The most common usage of the COUNT
function (and its default functionality, even if it’s used with GROUP BY
) is to count the number of rows. For example, if we want to count the answer types of survey questions, we can use the following query:
SELECT COUNT(*) AS NumberOfYesAnswers, AnswerFROM SurveyResponsesGROUP BY Answer;
The output will look like this:
NumberOfAnswers | Answer |
---|---|
3 | Yes |
3 | No |
2 | Not Sure |
1 | Agree |
1 | Disagree |
1 | Maybe |
1 | NULL |
The outcome of this is a count of all rows that match a certain answer. It returns the aggregate number of survey answers and the type of the answer.
Counting NON-NULL Values in a Column
Let’s look at another example that might seem to generate the same results. Actually, this example has something particular and important about it; instead of using *
as a parameter in our COUNT()
function, we’re using COUNT()
with a column name.
We will modify the query in the previous example. Instead of the *
parameter for the COUNT()
function, we will replace it with the Answer
column. Our new query looks like this:
SELECT COUNT(Answer) AS NumberOfAnswers, AnswerFROM SurveyResponsesGROUP BY Answer;
If we run this new query, we will see that the results returned are almost exactly the same:
NumberOfAnswers | Answer |
---|---|
3 | Yes |
3 | No |
2 | Not Sure |
1 | Agree |
1 | Disagree |
1 | Maybe |
0 | NULL |
We can see that the output for the NULL
answer value has changed from 1 to 0. This is because the COUNT()
function only takes into consideration non-NULL
values when doing the aggregation.
Before we were doing COUNT(*)
, which implicitly means count the rows; COUNT(Answer)
will count the values in the Answer
column. And because we had 1 value with NULL
, it will skip those values in its calculation – returning 0 in this second scenario.
Counting Distinct NON-NULL Values in a Column
In this third variation of using the COUNT
function, we will use the same query as in our previous example. This time, though, we will add the DISTINCT keyword before the column name.
SELECT Answer, COUNT(DISTINCT Answer) AS DistinctCountFROM SurveyResponsesGROUP BY Answer;
Status | DistinctCount |
---|---|
Yes | 1 |
No | 1 |
Not Sure | 1 |
Agree | 1 |
Disagree | 1 |
Maybe | 1 |
NULL | 0 |
We can see in the output above that the result of this query has turned all of the positive values in the DistinctCount
column to 1. Adding the DISTINCT
keyword means that each time the COUNT
function finds a new value it hasn’t seen before, it will add that status to its list and add a 1 to its count. However, if it finds the same status value a second time or more, it will skip counting it.
Want to Know More About COUNT() and GROUP BY?
Using COUNT()
with GROUP BY
is only one of the many powerful features that SQL offers. If you’re eager to explore SQL further and truly master its capabilities, I encourage you to check out our comprehensive SQL Basics course. If you’re looking for SQL practice, I recommend our SQL Practice track, which has over 600 SQL practice exercises.
After taking our courses, maybe you feel like you want to find a job working with SQL. To help you prepare for your interview, we have a list of SQL interview questions that will show you some real-world examples of questions and problems in SQL.