• SQL aggregate functions - SUM, MIN, MAX, AVG, COUNT. Aggregate functions

    The following subsections describe other SELECT statement clauses that can be used in queries, as well as aggregate functions and sets of statements. As a reminder, at this point we've looked at using the WHERE clause, and in this article we'll look at the GROUP BY, ORDER BY, and HAVING clauses, and provide some examples of using these clauses in combination with the aggregate functions that are supported in Transact-SQL.

    GROUP BY clause

    Offer GROUP BY groups a selected set of rows to produce a set of summary rows by the values ​​of one or more columns or expressions. A simple case of using the GROUP BY clause is shown in the example below:

    USE SampleDb; SELECT Job FROM Works_On GROUP BY Job;

    This example selects and groups employee positions.

    In the example above, the GROUP BY clause creates a separate group for all possible values ​​(including a NULL value) for the Job column.

    Using Columns in GROUP offer BY must meet certain conditions. Specifically, each column in the query's select list must also appear in the GROUP BY clause. This requirement does not apply to constants and columns that are part of an aggregate function. (Aggregation functions are covered in the next subsection.) This makes sense because Only columns in the GROUP BY clause are guaranteed to have one value per group.

    You can group a table by any combination of its columns. The example below demonstrates grouping the rows of the Works_on table into two columns:

    USE SampleDb; SELECT ProjectNumber, Job FROM Works_On GROUP BY ProjectNumber, Job;

    The result of this query:

    Based on the query results, you can see that there are nine groups with different combinations of project number and position. The sequence of column names in the GROUP BY clause does not have to be the same as in the list of columns in the SELECT.

    Aggregate functions

    Aggregate functions are used to obtain total values. All aggregate functions can be divided into the following categories:

      ordinary aggregate functions;

      statistical aggregate functions;

      user-defined aggregate functions;

      analytical aggregate functions.

    Here we will look at the first three types of aggregate functions.

    Common aggregate functions

    Transact-SQL supports the following six aggregate functions: MIN, MAX, SUM, AVG, COUNT, COUNT_BIG.

    All aggregate functions perform calculations on a single argument, which can be either a column or an expression. (The only exception is the second form of the two functions: COUNT and COUNT_BIG, namely COUNT(*) and COUNT_BIG(*), respectively.) The result of any aggregate function calculation is a constant value, displayed in a separate result column.

    Aggregate functions are specified in the column list of the SELECT statement, which may also contain a GROUP BY clause. If there is no GROUP BY clause in the SELECT statement, and the list of select columns contains at least one aggregate function, then it must not contain simple columns (except columns that serve as arguments to the aggregate function). Therefore the code in the example below is incorrect:

    USE SampleDb; SELECT LastName, MIN(Id) FROM Employee;

    Here, the LastName column of the Employee table should not be in the column select list because it is not an argument to the aggregate function. On the other hand, the column select list can contain column names that are not arguments to the aggregate function if those columns are arguments to the GROUP BY clause.

    An argument to an aggregate function can be preceded by one of two possibilities: keywords:

    ALL

    Specifies that calculations are performed on all values ​​in the column. This is the default value.

    DISTINCT

    Specifies that only unique column values ​​are used for calculations.

    Aggregate functions MIN and MAX

    The MIN and MAX aggregate functions calculate the smallest and highest value column accordingly. If a query contains a WHERE clause, the MIN and MAX functions return the smallest and largest values ​​of the rows that match the specified conditions. The example below shows the use of the MIN aggregate function:

    USE SampleDb; -- Return 2581 SELECT MIN(Id) AS "Minimum Id value" FROM Employee;

    The result returned in the example above is not very informative. For example, the last name of the employee who owns this number is unknown. But it is not possible to get this last name in the usual way because, as mentioned earlier, you are not allowed to explicitly specify the LastName column. In order to also obtain the last name of this employee along with the lowest personnel number of an employee, a subquery is used. The example below shows the use of such a subquery, where the subquery contains the SELECT statement from the previous example:

    Result of the request:

    The use of the MAX aggregate function is shown in the example below:

    The MIN and MAX functions can also accept strings and dates as arguments. In the case of a string argument, the values ​​are compared using the actual sort order. For all time data arguments of type "date", the smallest column value will be the most early date, and the largest is the latest.

    You can use the DISTINCT keyword with the MIN and MAX functions. Before using the MIN and MAX aggregate functions, all NULL values ​​are eliminated from their argument columns.

    Aggregate function SUM

    Aggregate SUM function calculates total amount column values. The argument to this aggregate function must always have numeric type data. The use of the SUM aggregate function is shown in the example below:

    USE SampleDb; SELECT SUM (Budget) "Total budget" FROM Project;

    This example calculates the total amount of budgets for all projects. Result of the request:

    In this example, the aggregate function groups all project budget values ​​and determines their total amount. For this reason, the query contains an implicit grouping function (as do all similar queries). The implicit grouping function from the example above can be specified explicitly, as shown in the example below:

    USE SampleDb; SELECT SUM (Budget) "Total budget" FROM Project GROUP BY();

    Using the DISTINCT parameter eliminates all duplicate values ​​in a column before applying the SUM function. Likewise, all NULL values ​​are removed before applying this aggregate function.

    AVG aggregate function

    Aggregate AVG function returns the arithmetic average of all values ​​in a column. The argument to this aggregate function must always be a numeric data type. Before using the AVG function, all NULL values ​​are removed from its argument.

    The use of the AVG aggregate function is shown in the example below:

    USE SampleDb; -- Return 133833 SELECT AVG (Budget) "Average budget for the project" FROM Project;

    Here the arithmetic average of the budget value for all budgets is calculated.

    Aggregate functions COUNT and COUNT_BIG

    Aggregate COUNT function has two different forms:

    COUNT(col_name) COUNT(*)

    The first form of the function counts the number of values ​​in the col_name column. If the query uses the DISTINCT keyword, all duplicate values ​​in the column are removed before using the COUNT function. This form of the COUNT function does not take NULL values ​​into account when counting the number of values ​​in a column.

    The use of the first form of the COUNT aggregate function is shown in the example below:

    USE SampleDb; SELECT ProjectNumber, COUNT(DISTINCT Job) "Works in the project" FROM Works_on GROUP BY ProjectNumber;

    Here the number of different positions for each project is counted. The result of this query:

    As you can see from the example query, NULL values ​​were not taken into account by the COUNT function. (The sum of all values ​​in the position column turned out to be 7, not 11, as it should be.)

    The second form of the COUNT function, i.e. The COUNT(*) function counts the number of rows in a table. And if the SELECT statement of a query with the COUNT(*) function contains a WHERE clause with a condition, the function returns the number of rows that satisfy the specified condition. Unlike the first version of the COUNT function, the second form does not ignore NULL values ​​because this function operates on rows, not columns. The example below demonstrates the use of the COUNT(*) function:

    USE SampleDb; SELECT Job AS "Type of work", COUNT(*) "Need workers" FROM Works_on GROUP BY Job;

    Here the number of positions in all projects is calculated. Result of the request:

    COUNT_BIG function similar to the COUNT function. The only difference between them is the type of result they return: the COUNT_BIG function always returns BIGINT values, while the COUNT function returns INTEGER data values.

    Statistical aggregate functions

    The following functions make up the group of statistical aggregate functions:

    VAR

    Calculates the statistical variance of all values ​​represented in a column or expression.

    VARP

    Calculates the statistical variance of the population of all values ​​represented in a column or expression.

    STDEV

    Calculates the standard deviation (which is calculated as square root from the corresponding variance) of all values ​​of a column or expression.

    STDEVP

    Calculates the standard deviation of the population of all values ​​in a column or expression.

    User-defined aggregate functions

    The Database Engine also supports the implementation of user-defined functions. This capability allows users to augment system aggregate functions with functions that they can implement and install themselves. These functions represent a special class of user-defined functions and are discussed in detail later.

    HAVING offer

    In a sentence HAVING defines a condition that applies to a group of rows. Thus, this clause has the same meaning for groups of rows as the WHERE clause does for the contents of the corresponding table. The syntax of the HAVING clause is:

    HAVING condition

    Here the condition parameter represents the condition and contains aggregate functions or constants.

    The use of the HAVING clause in conjunction with the COUNT(*) aggregate function is illustrated in the example below:

    USE SampleDb; -- Returns "p3" SELECT ProjectNumber FROM Works_on GROUP BY ProjectNumber HAVING COUNT(*)

    In this example, the system groups all rows by the values ​​of the ProjectNumber column using the GROUP BY clause. After this, the number of rows in each group is counted and groups containing less than four rows (three or less) are selected.

    The HAVING clause can also be used without aggregate functions, as shown in the example below:

    USE SampleDb; -- Returns "Consultant" SELECT Job FROM Works_on GROUP BY Job HAVING Job LIKE "K%";

    This example groups the rows of the Works_on table by job title and eliminates those jobs that do not start with the letter "K".

    The HAVING clause can also be used without the GROUP BY clause, although this is not common practice. In this case, all table rows are returned in one group.

    ORDER BY clause

    Offer ORDER BY determines the sort order of the rows in the result set returned by the query. This sentence has the following syntax:

    The sort order is specified in the col_name parameter. The col_number parameter is an alternative sort order indicator that identifies columns by the order in which they appear in the select list of the SELECT statement (1 is the first column, 2 is the second column, etc.). ASC parameter specifies sorting in ascending order, and DESC parameter- in the downward direction. The default is ASC.

    The column names in the ORDER BY clause do not have to be in the list of select columns. But this does not apply to queries like SELECT DISTINCT, because in such queries, the column names specified in the ORDER BY clause must also be specified in the list of select columns. In addition, this clause cannot contain column names from tables not specified in the FROM clause.

    As you can see from the syntax of the ORDER BY clause, the result set can be sorted by multiple columns. This sorting is shown in the example below:

    In this example, department numbers and employee names are selected for employees whose personnel number is less than 20,000, and sorted by last name and first name. The result of this query:

    Columns in the ORDER BY clause can be specified not by their names, but by their order in the select list. Accordingly, the sentence in the example above can be rewritten as follows:

    Such alternative way specifying columns by their position instead of names is used if the ordering criterion contains an aggregate function. (Another way is to use the column names, which then appear in the ORDER BY clause.) However, in the ORDER BY clause, it is recommended that you specify columns by their names rather than by numbers, to make it easier to update the query if columns need to be added or removed from the select list. Specifying columns in the ORDER BY clause by their numbers is shown in the example below:

    USE SampleDb; SELECT ProjectNumber, COUNT(*) "Number of employees" FROM Works_on GROUP BY ProjectNumber ORDER BY 2 DESC;

    Here, for each project, the project number and the number of employees participating in it are selected, arranging the result in descending order by the number of employees.

    Transact-SQL places NULL values ​​at the beginning of the list when sorting in ascending order, and at the end of the list when sorting in descending order.

    Using an ORDER BY clause to paginate results

    Displaying query results on the current page can either be implemented in user application, or instruct the database server to do this. In the first case, all the database rows are sent to the application, whose job is to select the required rows and display them. In the second case, on the server side, only the rows required for the current page are selected and displayed. As you might expect, creating server-side pages usually provides better performance, because Only the lines needed for display are sent to the client.

    To support server-side page creation, SQL Server 2012 introduces two new SELECT clauses: OFFSET and FETCH. The application of these two sentences is demonstrated in the example below. Here, from the AdventureWorks2012 database (which you can find in the source), the business ID, job title, and birthday of all female employees are retrieved, sorting the result by job title in ascending order. The resulting set of rows is split into 10-line pages and the third page is displayed:

    In a sentence OFFSET specifies the number of result lines to skip in the displayed result. This number is calculated after the rows are sorted using the ORDER BY clause. In a sentence FETCH NEXT specifies the number of rows that satisfy the WHERE condition and are sorted to be returned. The parameter of this clause can be a constant, an expression, or the result of another query. The FETCH NEXT clause is similar to the FETCH FIRST.

    The main goal when creating server-side pages is to be able to implement common page forms using variables. This task can be performed using the SQL Server package.

    SELECT Statement and IDENTITY Property

    IDENTITY property allows you to define values ​​for a specific table column in the form of an automatically increasing counter. Numeric data type columns such as TINYINT, SMALLINT, INT, and BIGINT can have this property. For such a table column, the Database Engine automatically creates sequential values ​​starting from the specified starting value. Therefore, the IDENTITY property can be used to create single-digit numeric values ​​for a selected column.

    A table can only contain one column with the IDENTITY property. The table owner has the option to specify a starting value and an increment, as shown in the example below:

    USE SampleDb; CREATE TABLE Product (Id INT IDENTITY(10000, 1) NOT NULL, Name NVARCHAR(30) NOT NULL, Price MONEY) INSERT INTO Product(Name, Price) VALUES ("Product1", 10), ("Product2", 15) , ("Product3", 8), ("Product4", 15), ("Product5", 40); -- Returns 10004 SELECT IDENTITYCOL FROM Product WHERE Name = "Product5"; -- Similar to the previous statement SELECT $identity FROM Product WHERE Name = "Product5";

    This example first creates a Product table that contains an Id column with an IDENTITY property. The values ​​in the Id column are created automatically by the system, starting at 10,000 and increasing in unit increments for each subsequent value: 10,000, 10,001, 10,002, etc.

    Several system functions and variables are associated with the IDENTITY property. For example, the example code uses $identity system variable. As you can see from the output of this code, this variable automatically references the IDENTITY property. You can also use the system function instead IDENTITYCOL.

    The initial value and increment of a column with the IDENTITY property can be found using the functions IDENT_SEED And IDENT_INCR respectively. These functions are used as follows:

    USE SampleDb; SELECT IDENT_SEED("Product"), IDENT_INCR("Product")

    As already mentioned, IDENTITY values ​​are set automatically by the system. But the user can explicitly specify their values ​​for certain rows by assigning the parameter IDENTITY_INSERT ON value before inserting an explicit value:

    SET IDENTITY INSERT table name ON

    Because you can use the IDENTITY_INSERT parameter to set a column with the IDENTITY property to any value, including a duplicate value, the IDENTITY property generally does not enforce uniqueness of the column's values. Therefore, to enforce uniqueness of column values, you should use UNIQUE or PRIMARY KEY constraints.

    When you insert values ​​into a table after setting IDENTITY_INSERT to on, the system creates the next value of the IDENTITY column by incrementing the largest current value of that column.

    CREATE SEQUENCE statement

    There are several significant disadvantages to using the IDENTITY property, the most significant of which are the following:

      the application of the property is limited to the specified table;

      the new column value cannot be obtained in any way other than by applying it;

      the IDENTITY property can only be specified when creating a column.

    For these reasons, SQL Server 2012 introduces sequences that have the same semantics as the IDENTITY property, but without the previously listed disadvantages. In this context, a sequence is a database functionality that allows you to specify counter values ​​for different database objects, such as columns and variables.

    Sequences are created using instructions CREATE SEQUENCE. The CREATE SEQUENCE statement is defined in the SQL standard and is supported by other relational database systems such as IBM DB2 and Oracle.

    The example below shows how to create a sequence in SQL Server:

    USE SampleDb; CREATE SEQUENCE dbo.Sequence1 AS INT START WITH 1 INCREMENT BY 5 MINVALUE 1 MAXVALUE 256 CYCLE;

    In the example above, the values ​​of Sequence1 are created automatically by the system, starting with value 1 and incrementing 5 for each subsequent value. Thus, in START offer the initial value is indicated, and in INCREMENT offer- step. (The step can be either positive or negative.)

    In the next two optional sentences MINVALUE And MAXVALUE minimum and maximum value sequence object. (Note that the MINVALUE value must be less than or equal to the initial value, and the MAXVALUE value cannot be greater than the upper limit of the data type specified for the sequence.) In the clause CYCLE indicates that the sequence is repeated from the beginning when the maximum (or minimum for a sequence with a negative step) value is exceeded. By default, this clause is set to NO CYCLE, which means that exceeding the maximum or minimum value of the sequence will throw an exception.

    The main feature of sequences is their independence from tables, i.e. they can be used with any database objects, such as table columns or variables. (This property has a positive effect on storage and therefore performance. The specific sequence does not need to be stored; only its last value is stored.)

    New sequence values ​​are created using NEXT VALUE FOR expressions, the application of which is shown in the example below:

    USE SampleDb; -- Return 1 SELECT NEXT VALUE FOR dbo.sequence1; -- Returns 6 (next step) SELECT NEXT VALUE FOR dbo.sequence1;

    You can use the NEXT VALUE FOR expression to assign the result of a sequence to a variable or column cell. The example below shows how to use this expression to assign results to a column:

    USE SampleDb; CREATE TABLE Product (Id INT NOT NULL, Name NVARCHAR(30) NOT NULL, Price MONEY) INSERT INTO Product VALUES (NEXT VALUE FOR dbo.sequence1, "Product1", 10); INSERT INTO Product VALUES (NEXT VALUE FOR dbo.sequence1, "Product2", 15); -- ...

    In the example above, we first create a Product table with four columns. Next, two INSERT statements insert two rows into this table. The first two cells of the first column will have the values ​​11 and 16.

    The example below shows the use of a directory view sys.sequences to view the current value of a sequence without using it:

    Typically, the NEXT VALUE FOR statement is used in an INSERT statement to cause the system to insert the generated values. This expression can also be used as part of a multi-row query using the OVER clause.

    To change a property of an existing sequence, use ALTER SEQUENCE statement. One of the most important applications This statement is associated with the RESTART WITH option, which resets the specified sequence. The example below shows the use of ALTER SEQUENCE to reset almost all of the properties of Sequence1:

    USE SampleDb; ALTER SEQUENCE dbo.sequence1 RESTART WITH 100 INCREMENT BY 50 MINVALUE 50 MAXVALUE 200 NO CYCLE;

    Deleting a sequence using an instruction DROP SEQUENCE.

    Set operators

    In addition to the operators discussed earlier, Transact-SQL supports three more set operators: UNION, INTERSECT, and EXCEPT.

    UNION operator

    UNION operator combines the results of two or more queries into a single result set that includes all rows belonging to all queries in the union. Accordingly, the result of joining two tables is a new table containing all the rows included in one of the original tables or in both of these tables.

    The general form of the UNION operator looks like this:

    select_1 UNION select_2 ( select_3])...

    The parameters select_1, select_2, ... are SELECT statements that create a join. If the ALL option is used, all rows are displayed, including duplicates. In a UNION operator, the ALL parameter has the same meaning as in a SELECT selection list, but with one difference: for a SELECT selection list, this parameter is used by default, but for a UNION operator it must be specified explicitly.

    In its original form, the SampleDb database is not suitable for demonstrating the use of the UNION operator. Therefore, this section creates a new table, EmployeeEnh, which is identical to the existing Employee table, but has an additional City column. This column indicates the place of residence of employees.

    Creating the EmployeeEnh table provides us with a good opportunity to demonstrate the use of the clause INTO in the SELECT statement. The SELECT INTO statement performs two operations. First, a new table is created with the columns listed in the SELECT list. Then the rows from the original table are inserted into the new table. The name of the new table is specified in the INTO clause, and the name of the source table is specified in the FROM clause.

    The example below shows how to create the EmployeeEnh table from the Employee table:

    USE SampleDb; SELECT * INTO EmployeeEnh FROM Employee; ALTER TABLE EmployeeEnh ADD City NCHAR(40) NULL;

    In this example, the SELECT INTO statement creates the EmployeeEnh table, inserts all the rows from the Employee source table into it, and then the ALTER TABLE statement adds the City column to the new table. But the added City column does not contain any values. Values ​​in this column can be inserted using Management Studio or using the following code:

    USE SampleDb; UPDATE EmployeeEnh SET City = "Kazan" WHERE Id = 2581; UPDATE EmployeeEnh SET City = "Moscow" WHERE Id = 9031; UPDATE EmployeeEnh SET City = "Ekaterinburg" WHERE Id = 10102; UPDATE EmployeeEnh SET City = "St. Petersburg" WHERE Id = 18316; UPDATE EmployeeEnh SET City = "Krasnodar" WHERE Id = 25348; UPDATE EmployeeEnh SET City = "Kazan" WHERE Id = 28559; UPDATE EmployeeEnh SET City = "Perm" WHERE Id = 29346;

    Now we are ready to demonstrate the use of the UNION instruction. The example below shows a query to create a join between the EmployeeEnh and Department tables using this statement:

    USE SampleDb; SELECT City AS "City" FROM EmployeeEnh UNION SELECT Location FROM Department;

    The result of this query:

    Only compatible tables can be joined using the UNION statement. By compatible tables, we mean that both lists of select columns must contain the same number of columns, and the corresponding columns must have compatible data types. (In terms of compatibility, the INT and SMALLINT data types are not compatible.)

    The result of a join can only be ordered by using the ORDER BY clause in last instruction SELECT, as shown in the example below. GROUP BY and HAVING clauses can be used with individual SELECT statements, but not within the join itself.

    The query in this example selects employees who either work in department d1 or started working on the project before January 1, 2008.

    The UNION operator supports the ALL parameter. Using this option does not remove duplicates from the result set. You can use the OR operator instead of the UNION operator if all SELECT statements joined by one or more UNION operators refer to the same table. In this case, the set of SELECT statements is replaced by one with the SELECT statement with a set of OR operators.

    INTERSECT and EXCEPT Operators

    Two other operators for working with sets, INTERSECT And EXCEPT, determine the intersection and difference, respectively. Under intersection in this context there is a set of rows that belong to both tables. And the difference between two tables is defined as all the values ​​that belong to the first table and are not present in the second. The example below shows the use of the INTERSECT operator:

    Transact-SQL does not support the use of the ALL parameter with either the INTERSECT or EXCEPT operator. The use of the EXCEPT operator is shown in the example below:

    Remember that these three set operators have different execution priorities: the INTERSECT operator has the highest priority, followed by the EXCEPT operator, and the UNION operator has the lowest priority. Inattention to execution priority when using multiple different operators for working with sets may lead to unexpected results.

    CASE Expressions

    In the field of database application programming, it is sometimes necessary to modify the representation of data. For example, people can be subdivided by coding them according to their social class, using the values ​​1, 2 and 3, denoting men, women and children respectively. This programming technique can reduce the time required to implement a program. CASE expression The Transact-SQL language makes it easy to implement this type of encoding.

    Unlike most programming languages, CASE is not a statement, but an expression. Therefore, a CASE expression can be used almost anywhere that Transact-SQL allows expressions. The CASE expression has two forms:

      simple CASE expression;

      search expression CASE.

    The syntax of a simple CASE expression is as follows:

    A statement with a simple CASE expression first searches a list of all expressions in WHEN clause the first expression that matches expression_1, then executes the corresponding THEN clause. If there is no matching expression in the WHEN list, the ELSE clause.

    The CASE search expression syntax is as follows:

    IN in this case it searches for the first matching condition and then executes the corresponding THEN clause. If none of the conditions meet the requirements, the ELSE clause is executed. The use of a CASE search expression is shown in the example below:

    USE SampleDb; SELECT ProjectName, CASE WHEN Budget > 0 AND Budget 100000 AND Budget 150000 AND Budget

    The result of this query:

    This example weighs the budgets of all projects and then displays their calculated weights along with their corresponding project names.

    The example below shows another way to use a CASE expression, where the WHEN clause contains subqueries that form part of the expression:

    USE SampleDb; SELECT ProjectName, CASE WHEN p1.Budget (SELECT AVG(p2.Budget) FROM Project p2) THEN "above average" END "Budget Category" FROM Project p1;

    The result of this query is as follows:

    Let's learn to summarize. No, these are not the results of studying SQL, but the results of the values ​​of the columns of the database tables. Aggregate SQL functions act on the column values ​​to produce a single resulting value. The most commonly used SQL aggregate functions are SUM, MIN, MAX, AVG, and COUNT. It is necessary to distinguish between two cases of using aggregate functions. First, aggregate functions are used on their own and return a single resulting value. Second, aggregate functions are used with the SQL GROUP BY clause, that is, grouping by fields (columns) to obtain the resulting values ​​in each group. Let's first consider cases of using aggregate functions without grouping.

    SQL SUM function

    The SQL SUM function returns the sum of the values ​​in a database table column. It can only be applied to columns whose values ​​are numbers. The SQL queries to get the resulting sum start like this:

    SELECT SUM(COLUMN_NAME) ...

    This expression is followed by FROM (TABLE_NAME), and then a condition can be specified using the WHERE clause. Additionally, the column name can be preceded by DISTINCT, which means that only unique values ​​will be counted. By default, all values ​​are taken into account (for this you can specifically specify not DISTINCT, but ALL, but the word ALL is not required).

    Example 1. There is a company database with data about its divisions and employees. In addition to everything else, the Staff table has a column with data on employee salaries. The selection from the table looks like this (to enlarge the picture, click on it with the left mouse button):

    To obtain the sum of all salaries, use the following query:

    SELECT SUM(Salary) FROM Staff

    This query will return the value 287664.63.

    And now. In the exercises we are already beginning to complicate the tasks, bringing them closer to those encountered in practice.

    SQL MIN function

    The SQL MIN function also operates on columns whose values ​​are numbers and returns the minimum of all values ​​in the column. This function has a syntax similar to that of the SUM function.

    Example 3. The database and table are the same as in example 1.

    We need to find out the minimum wage for employees of department number 42. To do this, we write the following request:

    The query will return the value 10505.90.

    And again exercise for independent decision . In this and some other exercises, you will need not only the Staff table, but also the Org table, containing data about the company’s divisions:


    Example 4. The Org table is added to the Staff table, containing data about the company's departments. Print the minimum number of years worked by one employee in a department located in Boston.

    SQL MAX function

    The SQL MAX function works similarly and has a similar syntax, which is used when you need to determine the maximum value among all values ​​in a column.

    Example 5.

    We need to find out the maximum salary of employees of department number 42. To do this, write the following request:

    The query will return the value 18352.80

    The time has come exercises for independent solution.

    Example 6. We again work with two tables - Staff and Org. Display the name of the department and the maximum value of the commission received by one employee in the department belonging to the group of departments (Division) Eastern. Use JOIN (joining tables) .

    SQL AVG function

    What is stated regarding the syntax for the previous functions described is also true for the SQL AVG function. This function returns the average of all values ​​in a column.

    Example 7. The database and table are the same as in the previous examples.

    Suppose you want to find out the average length of service of employees of department number 42. To do this, write the following query:

    The result will be 6.33

    Example 8. We work with one table - Staff. Display the average salary of employees with 4 to 6 years of experience.

    SQL COUNT function

    The SQL COUNT function returns the number of records in a database table. If you specify SELECT COUNT(COLUMN_NAME) ... in the query, the result will be the number of records without taking into account those records in which the column value is NULL (undefined). If you use an asterisk as an argument and start a SELECT COUNT(*) ... query, the result will be the number of all records (rows) of the table.

    Example 9. The database and table are the same as in the previous examples.

    You want to know the number of all employees who receive commissions. The number of employees whose Comm column values ​​are not NULL will be returned by the following query:

    SELECT COUNT(Comm) FROM Staff

    The result will be 11.

    Example 10. The database and table are the same as in the previous examples.

    If you want to find out the total number of records in the table, then use a query with an asterisk as an argument to the COUNT function:

    SELECT COUNT(*) FROM Staff

    The result will be 17.

    In the next exercise for independent solution you will need to use a subquery.

    Example 11. We work with one table - Staff. Display the number of employees in the planning department (Plains).

    Aggregate Functions with SQL GROUP BY

    Now let's look at using aggregate functions together with the SQL GROUP BY statement. The SQL GROUP BY statement is used to group result values ​​by columns in a database table.

    Example 12. There is a database of the advertisement portal. It has an Ads table containing data about ads submitted for the week. The Category column contains data about large ad categories (for example, Real Estate), and the Parts column contains data about smaller parts included in the categories (for example, the Apartments and Summer Houses parts are parts of the Real Estate category). The Units column contains data on the number of advertisements submitted, and the Money column contains data on the amount of money received for submitting advertisements.

    CategoryPartUnitsMoney
    TransportCars110 17600
    Real estateApartments89 18690
    Real estateDachas57 11970
    TransportMotorcycles131 20960
    Construction materialsBoards68 7140
    Electrical engineeringTVs127 8255
    Electrical engineeringRefrigerators137 8905
    Construction materialsRegips112 11760
    LeisureBooks96 6240
    Real estateAt home47 9870
    LeisureMusic117 7605
    LeisureGames41 2665

    Using the SQL GROUP BY statement, find the amount of money earned by posting ads in each category. We write the following request.

    The lesson will cover the topic of sql renaming a column (fields) using the service word AS; The topic of aggregate functions in sql is also covered. Will be dismantled specific examples requests

    Column names in queries can be renamed. This makes the results more readable.

    IN SQL language renaming fields involves using AS keyword, which is used to rename field names in result sets

    Syntax:

    SELECT<имя поля>AS<псевдоним>FROM...

    Let's look at an example of renaming in SQL:

    Example of the “Institute” database: Display the names of teachers and their salaries, for those teachers whose salary is below 15,000, rename the zarplata field to "low_salary"


    ✍ Solution:

    Renaming columns in SQL is often necessary when calculating values ​​associated with multiple fields tables. Let's look at an example:

    Example of the “Institute” database: From the teachers table, display the name field and calculate the amount of salary and bonus, naming the field "salary_bonus"


    ✍ Solution:
    1 2 SELECT name, (zarplata+ premia) AS zarplata_premia FROM teachers;

    SELECT name, (zarplata+premia) AS zarplata_premia FROM teachers;

    Result:

    Aggregate functions in SQL

    To obtain total values ​​and evaluate expressions, aggregate functions in sql are used:

    All aggregate functions return a single value.

    The COUNT, MIN, and MAX functions apply to any data type.

    The SUM and AVG functions are used only for numeric fields.
    There is a difference between the COUNT(*) and COUNT() functions: the second does not take into account NULL values ​​when calculating.

    Important: When working with aggregate functions in SQL, a function word is used AS


    Example of the “Institute” database: Get the value of the highest salary among teachers, display the result as "max_salary"


    ✍ Solution:
    SELECT MAX (zarplata) AS max_salary FROM teachers;

    SELECT MAX(zarplata) AS max_salary FROM teachers;

    Results:

    Let's consider more complex example using aggregate functions in sql.


    ✍ Solution:

    GROUP BY clause in SQL

    The group by operator in sql is usually used in conjunction with aggregate functions.

    Aggregate functions are executed on all resulting query rows. If the query contains a GROUP BY clause, each set of rows specified in the GROUP BY clause constitutes a group, and the aggregate functions are executed for each group separately.

    Let's look at an example with the lessons table:

    Example:

    Important: Thus, by using GROUP BY, all query output rows are divided into groups characterized by the same combinations of values ​​in those columns (that is, aggregate functions are performed on each group separately).

    It is worth considering that when grouping by a field containing NULL values, all such records will fall into one group.

    For various types printers, determine their average cost and quantity (i.e. separately for laser, inkjet and matrix). Use aggregate functions. The result should look like this:

    Having SQL statement

    The HAVING clause in SQL is needed to check the values, which are obtained using the aggregate function after grouping(after using GROUP BY). Such a check cannot be contained in a WHERE clause.

    Example: DB Computer store. Calculate the average price of computers with the same processor speed. Perform calculations only for those groups whose average price is less than 30,000.

    Can perform generalized group processing of field values. This is done using aggregate functions. Aggregate functions produce a single value for an entire table group. SQL provides the following aggregate functions:

    • COUNT– counts the number of table rows with non-NULL values ​​of the field specified as an argument.
    • SUM– calculates the arithmetic sum of all selected values ​​for a given field.
    • AVG– averages all selected values ​​of this field.
    • MAX– displays the largest value of all selected values ​​for this field.
    • MIN– displays the smallest value of all selected values ​​for this field.

      Using Aggregate Functions

      Aggregate functions are used similarly to field names in the SELECT clause of a query, with one exception: they take field names as an argument. Only numeric fields can be used with SUM And AVG. WITH COUNT, MAX, And MIN Both numeric and character fields can be used. When used with character fields MAX And MIN will translate them to the ASCII equivalent. This means that MIN will choose the first one, and MAX last value in alphabetical order.

      To find the total sales amount in the sales table, we have to write the following query:

      SELECT SUM(SSum) FROM Sells

      As a result we get:

      This query counted the number of non-blank values ​​in the SNum field of the Sells table. If we rewrite the query as follows:

      SELECT COUNT(SDate) FROM Sells

      As a result we get:

      COUNT OF SDate
      4

      Different query results when calculating what appears to be the same thing are obtained because one of the SDate field values ​​is empty ( NULL). Be careful when using such queries.

    Using Aggregate Functions

    SQL has many built-in functions of various categories, among which a special place is occupied by aggregate functions, which operate on the values ​​of columns of many rows and return a single value. Arguments to aggregate functions can be both table columns and the results of expressions over them. Aggregate functions themselves can be included in other arithmetic expressions. The following table shows the most commonly used standard unary aggregate functions.


    The general format of a unary aggregate function is as follows:

    function_name([ALL | DISTINCT] expression)

    where DISTINCT specifies that the function should consider only distinct values ​​of the argument, and ALL specifies all values, including duplicates (this is the default). For example, the AVG function with the DISTINCT keyword for column rows with values ​​1, 1, 1 and 3 will return 2, and if the ALL keyword is present, it will return 1.5.

    Aggregate functions are used in SELECT and HAVING clauses. Here we will look at their use in the SELECT clause. In this case, the expression in the function argument applies to all rows in the input table of the SELECT clause. Additionally, you cannot use both aggregate functions and table columns (or expressions with them) in a SELECT clause unless you have a GROUP BY clause, which we'll look at in the next section.

    The COUNT function has two formats. In the first case, the number of rows in the input table is returned; in the second case, the number of argument values ​​in the input table is returned:

    • COUNT(*)
    • COUNT(expression)

    The simplest way to use this function is to count the number of rows in a table (all or those that satisfy a specified condition). For this, the first syntax option is used.

    Query: Number of product types for which information is available in the database.

    SELECT COUNT(*) AS "Number of product types"

    FROM Product

    The second version of the COUNT function syntax can take the name of a single column as an argument. In this case, the number of either all values ​​in this column of the input table is counted, or only non-repeating ones (using the DISTINCT keyword).

    Query: Number of distinct names contained in the Customer table.

    SELECT COUNT (DISTINCT FNAME)

    FROM Customer

    The use of the remaining unary aggregate functions is similar to COUNT, except that for the MIN and MAX functions, the use of the DISTINCT and ALL keywords does not make sense. With the functions COUNT, MAX and MIN, in addition to numeric fields, character fields can also be used. If the argument to an aggregate function contains no values, the COUNT function returns 0 and all others return NULL.

    SELECT MAX (OrdDate)

    FROM

    WHERE OrdDate"1.09.2010"

    Assignment for independent work: State it in language SQL queries to sample the following data:

    • Total cost of all orders;
    • The number of different cities contained in the Customer table.