SQL Reference - CUME

CUME_DIST

CUME_DIST is an analytical function in SQL used to compute the cumulative distribution of a row in a group of rows. It calculates the relative position of a specific row within the group. The value returned is a ratio between 0 and 1 and can be interpreted as the cumulative proportion of rows that are less than or equal to the current row, sorted by the ORDER BY clause in the function.

CUME_DIST() OVER ( [ PARTITION BY partition_expression ] ORDER BY sort_expression [ ASC | DESC ] )

Example

SELECT
  product,
  price,
  CUME_DIST() OVER (ORDER BY price) as cum_dist
FROM
  pricing_table;

Output

| product | price | cum_dist |
|---------|-------|----------|
| A       | 10    | 0.20     |
| B       | 20    | 0.40     |
| C       | 30    | 0.60     |
| D       | 40    | 0.80     |
| E       | 50    | 1.00     |

Explanation

The CUME_DIST function computes the cumulative distribution of a value in a group of values in Oracle. It represents the percentage of rows that are less than or equal to the current row. The example calculates the cumulative distribution of prices in the pricing_table. For each row, it determines the percentage of prices that are lower or equal to the price in the current row.

CUME_DIST() OVER ([PARTITION BY partition_expression] ORDER BY sort_expression [ASC | DESC])

Example

CREATE TABLE sales (
    id SERIAL PRIMARY KEY,
    sales_amount INTEGER NOT NULL
);

INSERT INTO sales (sales_amount)
VALUES (10), (20), (30), (10), (50);

SELECT sales_amount, CUME_DIST() OVER (ORDER BY sales_amount)
FROM sales;

Output

| sales_amount | cume_dist |
|--------------|-----------|
| 10           | 0.4       |
| 10           | 0.4       |
| 20           | 0.6       |
| 30           | 0.8       |
| 50           | 1.0       |

Explanation

The example demonstrates the usage of the CUME_DIST() function in PostgreSQL. CUME_DIST() is a window function that calculates the cumulative distribution of a value in a set of values. In the given table sales, the CUME_DIST() function sorts sales_amount in ascending order and computes the cumulative distribution of each row. The result is a ratio between the total number of rows and the row number of each row in the sorted set.

CUME_DIST() OVER ( [ PARTITION BY partition_expression ] ORDER BY sort_expression [ ASC | DESC ] )

Example

SELECT productid, productname,
CUME_DIST() OVER (
ORDER BY unitprice
) AS cumedist
FROM products;

Output

| productid | productname | cumedist |
|-----------|-------------|----------|
|    1      |    Soap     |   0.1    |
|    2      |    Brush    |   0.2    |
|    3      |    Cream    |   0.3    |
|    4      |    Paste    |   0.4    |
|    5      |    Rinse    |   0.5    |

Explanation

The above SQL statement calculates the cumulative distribution of the unit prices of the products in the products table. The CUME_DIST() function calculates the cumulative distribution of a value in a group of values in SQL Server. In the given example, the CUME_DIST() function compares each product’s unit price to other unit prices in the ‘products’ table, then calculates the cumulative distribution.