WIDTH_BUCKET

WIDTH_BUCKET is a function in SQL that is often used in data analysis, as it distributes a set of data into buckets or bins of a certain defined range. The function takes a input value and assigns it to a specific bucket based on the stated boundaries, thus aiding in the categorization of data values into groups.

WIDTH_BUCKET(expr, min_value, max_value, num_buckets)

  • expr: This is an expression that specifies the input value to be sorted into a bucket.
  • min_value: This parameter specifies the minimum value of the range that the bucketing algorithm should consider.
  • max_value: This signifies the maximum value of the range for the bucketing algorithm.
  • num_buckets: This parameter denotes the number of equal width buckets to be created within the provided range.

Example

SELECT WIDTH_BUCKET(salary, 1000, 9000, 8) AS bucket, COUNT(*)
FROM employees
GROUP BY WIDTH_BUCKET(salary, 1000, 9000, 8)
ORDER BY bucket;

Output

| bucket | count |
|--------|-------|
| 1 | 5 |
| 2 | 8 |
| 3 | 12 |
| 4 | 7 |
| 5 | 9 |
| 6 | 4 |
| 7 | 3 |
| 8 | 7 |

Explanation

In this example, WIDTH_BUCKET function distributes the range of salary values into 8 buckets. The range is from 1000 to 9000. Then, it counts the number of salary values within each bucket. The result is a frequency distribution of salary ranges.

WIDTH_BUCKET(operand, b1, b2, count)

  • operand: This is the numeric expression or value that will be evaluated against the specified range for bucketing. It represents the values for which buckets are to be created. This argument must be a numeric type.
  • b1: This marks the start of the bucketing range. It sets the minimum boundary for bucketing. The operand values that are less or equal to this value fall into the first bucket. The value must be of numeric type.
  • b2: This represents the end of the bucketing range. It sets the maximum boundary for bucketing. The operand values that are greater than this value fall into the last bucket. The value must be of numeric type.
  • count: This defines the total number of evenly divided buckets to be created between b1 and b2. It sets the number of intervals or sections between the start and end of the bucketing range. Count must be an integer greater than zero.

Example

SELECT
score,
WIDTH_BUCKET(score, 0, 100, 5) AS bucket
FROM
scores;

Output

score | bucket
-------+--------
85 | 5
82 | 5
55 | 3
10 | 1

Explanation

The WIDTH_BUCKET() function divides the range 0 to 100 into 5 equal width buckets and determines the bucket number for each score in the scores table. For example, a score of 85 falls into the 5th bucket (from 80 to 100).

For in-depth explanations and examples SQL keywords where you write your SQL, install our extension.