# SQL Tutorial 5 Aggregating Data

## SQL Aggregating Data with Example

Aggregating data in SQL involves the use of aggregate functions that perform calculations on a set of values and return a single value.

Some common aggregate functions include COUNT, SUM, AVG, MIN, and MAX. Here’s an example of how to use these functions in SQL queries:

Let’s consider a simple table Orders:

CREATE TABLE Orders (

OrderID int,

CustomerID int,

OrderDate date,

Amount float

);

INSERT INTO Orders (OrderID, CustomerID, OrderDate, Amount)

VALUES

(1, 101, ‘2023-10-01’, 100.5),

(2, 102, ‘2023-10-02’, 75.2),

(3, 103, ‘2023-10-03’, 50.0),

(4, 101, ‘2023-10-04’, 200.3),

(5, 101, ‘2023-10-05’, 125.0),

(6, 103, ‘2023-10-06’, 180.5);

Now, we can use various aggregate functions to perform analysis on this data:

COUNT – Counts the number of records that meet a certain condition.

SELECT COUNT(*) AS TotalOrders FROM Orders;

SUM – Calculates the sum of a numeric column.

SELECT SUM(Amount) AS TotalAmount FROM Orders;

AVG – Calculates the average of a numeric column.

SELECT AVG(Amount) AS AverageAmount FROM Orders;

MIN – Finds the minimum value in a column.

SELECT MIN(Amount) AS MinimumAmount FROM Orders;

MAX – Finds the maximum value in a column.

SELECT MAX(Amount) AS MaximumAmount FROM Orders;

You can also use these aggregate functions with the GROUP BY clause to perform aggregate functions on subsets of data. For example, to get the total amount for each customer:

SELECT CustomerID, SUM(Amount) AS TotalAmount FROM Orders GROUP BY CustomerID;

These examples illustrate how to use various aggregate functions in SQL to perform analysis on data.

## SQL Group Functions with example

In SQL, group functions, also known as aggregate functions, are used to perform calculations on a group of rows and return a single value. Common group functions include COUNT, SUM, AVG, MIN, and MAX. Here’s an explanation of each function along with an example:

Consider a table called Sales with the following structure:

CREATE TABLE Sales (

SaleID int,

ProductID int,

Quantity int,

Amount float

);

INSERT INTO Sales (SaleID, ProductID, Quantity, Amount)

VALUES

(1, 101, 2, 50.5),

(2, 102, 1, 30.2),

(3, 101, 3, 75.0),

(4, 103, 1, 20.3),

(5, 101, 2, 40.0),

(6, 103, 2, 45.5);

COUNT – Counts the number of rows that meet a specific condition. To count the number of sales:

SELECT COUNT(*) AS TotalSales FROM Sales;

SUM – Calculates the sum of a numeric column. To find the total amount of sales:

SELECT SUM(Amount) AS TotalAmount FROM Sales;

AVG – Calculates the average of a numeric column. To find the average quantity of products sold:

SELECT AVG(Quantity) AS AverageQuantity FROM Sales;

MIN – Finds the minimum value in a column. To find the minimum amount of a sale:

SELECT MIN(Amount) AS MinimumAmount FROM Sales;

MAX – Finds the maximum value in a column. To find the maximum quantity of a product sold:

SELECT MAX(Quantity) AS MaximumQuantity FROM Sales;

Additionally, you can use these group functions along with the GROUP BY clause to perform these calculations for each group. For example, to find the total amount and quantity for each product:

SELECT ProductID, SUM(Amount) AS TotalAmount, SUM(Quantity) AS TotalQuantity

FROM Sales

GROUP BY ProductID;

Using these examples, you can understand how to use group functions in SQL to perform various calculations on a set of rows.

## SQL The GROUP BY Clause with example

The GROUP BY clause in SQL is used to group rows that have the same values in one or more columns.

It is often used in conjunction with aggregate functions such as COUNT, SUM, AVG, MIN, and MAX to perform operations on each group. Here’s an explanation of the GROUP BY clause with an example:

Consider a table called Orders with the following structure:

CREATE TABLE Orders (

OrderID int,

CustomerID int,

OrderDate date,

Amount float

);

INSERT INTO Orders (OrderID, CustomerID, OrderDate, Amount)

VALUES

(1, 101, ‘2023-10-01’, 100.5),

(2, 102, ‘2023-10-02’, 75.2),

(3, 101, ‘2023-10-03’, 50.0),

(4, 103, ‘2023-10-04’, 200.3),

(5, 101, ‘2023-10-05’, 125.0),

(6, 103, ‘2023-10-06’, 180.5);

Now, let’s use the GROUP BY clause to group orders by CustomerID and find the total order amount for each customer:

SELECT CustomerID, SUM(Amount) AS TotalAmount

FROM Orders

GROUP BY CustomerID;

This query will return the total order amount for each customer.

You can also use multiple columns in the GROUP BY clause to group the data based on multiple columns. For instance, to find the total order amount for each customer on each order date, you can use:

SELECT CustomerID, OrderDate, SUM(Amount) AS TotalAmount

FROM Orders

GROUP BY CustomerID, OrderDate;

The GROUP BY clause is particularly useful when you need to perform aggregate functions on subsets of data based on specific criteria. It allows you to perform operations on groups of rows rather than on individual rows, providing aggregated results for each group.

## SQL Grouping by Manipulated Columns with example

When dealing with SQL, it’s common to require some manipulation of columns before applying the GROUP BY clause. This can be achieved using various functions like CONCAT, DATE_FORMAT, and other string or date functions depending on the specific requirements. Here’s an example that demonstrates grouping by manipulated columns:

Consider a table Employees with the following structure:

CREATE TABLE Employees (

EmployeeID int,

FirstName varchar(255),

LastName varchar(255),

JoiningDate date,

Salary int

);

INSERT INTO Employees (EmployeeID, FirstName, LastName, JoiningDate, Salary)

VALUES

(1, ‘John’, ‘Doe’, ‘2023-01-15’, 50000),

(2, ‘Jane’, ‘Smith’, ‘2023-02-20’, 60000),

(3, ‘Michael’, ‘Johnson’, ‘2023-02-20’, 55000),

(4, ‘Emily’, ‘Williams’, ‘2023-03-10’, 65000),

(5, ‘William’, ‘Brown’, ‘2023-03-10’, 70000);

Now, let’s say you want to group employees by their joining month and year. You can use the DATE_FORMAT function to manipulate the JoiningDate column and then apply the GROUP BY clause:

SELECT DATE_FORMAT(JoiningDate, ‘%Y-%m’) AS JoiningMonth, COUNT(*) AS EmployeeCount

FROM Employees

GROUP BY DATE_FORMAT(JoiningDate, ‘%Y-%m’);

This query will give you the count of employees who joined in each month and year. The DATE_FORMAT function is used to extract and format the year and month from the JoiningDate column.

Similarly, you can use other string manipulation functions like CONCAT to group by concatenated columns or any other required transformation based on your specific requirements.

## SQL The HAVING Clause with example

The HAVING clause in SQL is used to filter the results of a GROUP BY clause based on specified conditions. It is typically used with aggregate functions such as COUNT, SUM, AVG, MIN, and MAX. Here’s an example to illustrate the usage of the HAVING clause:

Consider a table Orders with the following structure:

CREATE TABLE Orders (

OrderID int,

CustomerID int,

OrderDate date,

Amount float

);

INSERT INTO Orders (OrderID, CustomerID, OrderDate, Amount)

VALUES

(1, 101, ‘2023-10-01’, 100.5),

(2, 102, ‘2023-10-02’, 75.2),

(3, 101, ‘2023-10-03’, 50.0),

(4, 103, ‘2023-10-04’, 200.3),

(5, 101, ‘2023-10-05’, 125.0),

(6, 103, ‘2023-10-06’, 180.5);

Suppose you want to find customers who have placed more than 2 orders. You can use the HAVING clause as follows:

SELECT CustomerID, COUNT(*) AS OrderCount

FROM Orders

GROUP BY CustomerID

HAVING COUNT(*) > 2;

This query will return the CustomerID and the count of orders for each customer, but only for customers who have placed more than 2 orders.

You can apply various conditions to the HAVING clause, just like you would in a WHERE clause, but the HAVING clause is used specifically with groups created by the GROUP BY clause. It allows you to filter the results based on the results of aggregate functions.