FIRST_VALUE
FIRST_VALUE is an SQL window function that returns the first value in an ordered set of values. The order of the values is defined by the ORDER BY clause in the OVER() function used with FIRST_VALUE.
FIRST_VALUE(expression) OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression [ASC|DESC]] [ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW])
- expression: This is the value that will be analyzed. FIRST_VALUE will return the first value in the ordered set based on this expression.
- partition by partition_expression: This clause divides the result set into partitions to which the FIRST_VALUE function is applied. If it’s not specified, the function treats the entire result set as a single partition.
- order by sort_expression [asc|desc]: This clause specifies the order of the data in each partition. The ASC or DESC determines the sequence. If it’s not specified, the function treats all rows of the query result set as a single group.
- rows between unbounded preceding and current row: This clause defines the window frame in which the FIRST_VALUE function operates. In this case, it is from the first row of the partition to the current row. The setting of the window frame size can impact the result of the FIRST_VALUE function.
Example
Output
Explanation
In this example, the FIRST_VALUE()
function is used to return the first value in an ordered set based on the price of the product. Here it returns the ‘MacBook Pro’ which has the highest price in the ‘Products’ table.
FIRST_VALUE(expression) OVER ([PARTITION BY partition_expression] ORDER BY sort_expression [ASC | DESC] [NULLS {FIRST | LAST}])
- expression: The expression to be calculated. This can be any expression composed of input data types in the column that will be examined by the function.
- partition by partition_expression: This is an optional clause that splits the rows into groups, or partitions based on their values in partition_expression. The function will then be applied to each partition separately.
- order by sort_expression: This sorts the rows in each partition which are specified in the sort_expression. The function is applied to the sorted set of rows.
- asc | desc: These are optional keywords that determine the sorting order of the rows. ASC sorts in ascending order, while DESC sorts in descending order.
- nulls {first | last}: Another optional clause. NULLS FIRST will place NULL values at the start of the sorted list, while NULLS LAST will place NULL values at the end.
Example
Output
Explanation
In this example, the FIRST_VALUE
SQL function is used to return the revenue of the first quarter for each year in the Sales
table. The OVER
clause partitions data by the Year
column and orders by the Quarter
column. As such, for each row in a partition, FIRST_VALUE
will return the value of Revenue
column for the first quarter (Quarter
= 1).
FIRST_VALUE(expression) OVER ([partition_by_clause] order_by_clause [rows_range_clause])
- expression: A provision to point to a column in an SQL statement, to get the first value of. It will operate over the result set defined by this column.
- partition_by_clause: An optional categorization statement that divides the result set into particular groups or partitions. Results of `FIRST_VALUE` function within each partition are calculated independently of other partitions.
- order_by_clause: A necessary provision to order the values in each partition or the entire result set. The `FIRST_VALUE` function returns the first value in the ordered set.
- rows_range_clause: An optional specification that controls the window frame or subset of rows to be processed by the function. When not provided, but `order_by_clause` is given, the function considers all rows from the start of the partition to the current row.
Example
Output
Explanation
In the above SQL query, the FIRST_VALUE
function is used to get the employee with the highest salary from each department. The result of the function is presented in a new column ‘Highest_Paid_Employee’. The PARTITION BY
clause segments the data into departments and the ORDER BY
clause orders employees within these departments in descending order of their salary. The FIRST_VALUE
function then picks the first value after this ordering, which is the employee with the highest salary.
FIRST_VALUE(expression) IGNORE NULLS OVER ([query_partition_clause] [order_by_clause]);
- expression: This is the value or column that FIRST_VALUE will be used against. It defines the data against which the operation will be performed.
- ignore nulls: This optional clause effects the handling of null values. If this clause is included, the function will ignore null values in the chosen data set, and will instead return the first non-null value.
- query_partition_clause: This allows the function to be applied to separate groups or “partitions” of the data. The partitions are defined by the query_partition_clause. If left unspecified, the function will consider all data in the table to be a single group.
- order_by_clause: This clause is used to order the data before applying the function. This is critical to the function, as it determines which value is considered “first” in the dataset. The FIRST_VALUE function will then return the first value in the ordered dataset.
Example
Output
Explanation
In the presented SQL example, the FIRST_VALUE
function is used to find the name of the top scoring player for each team in a hypothetical Players table. The PARTITION BY
clause categorizes players by their team_id
, while the ORDER BY
clause sorts players within their respective teams based on their points, in descending order. The FIRST_VALUE
function then picks the first value of player’s name from each sorted partition, effectively giving the top scorer of each team. This top scorer’s name is then added as a new column ‘top_scorer’ in the resulting table.