SQL Reference - REGR

REGR_SLOPE(Y, X)

Y: The dependent variable in the regression equation, that is, the column whose values are to be predicted.
X: The independent variable in the regression equation, that is, the column used to predict the Y values.

Example

SELECT
  REGR_SLOPE(sales, cost) AS slope
FROM
  transactions;

Output

SLOPE
-------------
1.25

Explanation

The REGR_SLOPE function calculates the slope of the line of best fit through a dataset. The line is chosen to minimize the sum of the squared differences between the y-coordinate of each data point and the y-coordinate of the corresponding point (the “expected value”) on the line.
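As a sketch of what this minimization yields, the least-squares slope has a closed form. The notation below (n pairs (x_i, y_i) in which both values are non-null, with means \bar{x} and \bar{y}) is introduced here for illustration.

```latex
\text{slope}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sum_{i=1}^{n} (x_i - \bar{x})^{2}}
  = \frac{\operatorname{COVAR\_POP}(Y, X)}{\operatorname{VAR\_POP}(X)}
```

If every x value is the same, the denominator is zero and the slope is undefined.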

In this example, REGR_SLOPE measures the linear relationship between sales and cost in the transactions table. The output 1.25 means that sales increase by an estimated 1.25 units for every one-unit increase in cost.
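To make the interpretation concrete, the following minimal Python sketch computes a least-squares slope by hand. The function name least_squares_slope and the cost and sales values are hypothetical, chosen so that the result matches the 1.25 shown above; they are not taken from the transactions table, and this is not the database's actual implementation.

```python
from typing import Optional, Sequence


def least_squares_slope(ys: Sequence[float], xs: Sequence[float]) -> Optional[float]:
    """Slope of the least-squares line of ys on xs.

    Arguments are ordered (Y, X) to mirror REGR_SLOPE(Y, X).
    Returns None when the slope is undefined (no data, or all xs equal);
    this is an assumption meant to resemble a SQL NULL result.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n == 0:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None
    return sxy / sxx


# Hypothetical data: sales = 1.25 * cost + 5 exactly.
cost = [10.0, 20.0, 30.0, 40.0]
sales = [17.5, 30.0, 42.5, 55.0]

slope = least_squares_slope(sales, cost)
print(slope)  # 1.25
```

Because the sample points lie exactly on a line, the computed slope equals that line's slope. With real data the points scatter around the line, and the slope is only an average rate of change.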

REGR_SLOPE(Y, X)

Y: The dependent variable in the regression equation. Its value is modeled as depending on the value of X. In PostgreSQL, this should be a numeric expression; the function's declared argument type is double precision.
X: The independent variable in the regression equation. Its values are treated as given and are not modeled as depending on Y. In PostgreSQL, this should be a numeric expression; the function's declared argument type is double precision.

Example

SELECT REGR_SLOPE(sales, quantity_sold) AS SLOPE
FROM sales_data;

Output

 slope
-------
  0.12
(1 row)

Explanation

In this PostgreSQL query, the REGR_SLOPE function calculates the slope of the line of best fit through the dataset, with the sales column as Y and the quantity_sold column as X, both from the sales_data table. The output 0.12 means that sales increase by an estimated 0.12 units for every one-unit increase in quantity_sold.