REGR_R2

REGR_R2 is a SQL aggregate function that returns the coefficient of determination (R-squared, the square of the correlation coefficient) for a linear regression of the dependent variable (y) on the independent variable (x). The value measures how well the regression line fits the observed values. It ranges from 0 to 1, with 1 indicating a perfect fit.
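As a sketch of the quantity being computed, assuming the usual definition of the squared Pearson correlation over the n rows in which both values are non-NULL (the symbols are notation introduced here for illustration):

```latex
R^2 = \frac{\left( \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \right)^2}
           {\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Here \bar{x} and \bar{y} are the means of x and y over those rows. The ratio is undefined when either sum in the denominator is zero, so databases apply special rules in those cases. Oracle, for example, returns NULL when x has zero variance and 1 when only y has zero variance.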

REGR_R2(expr1, expr2)

  • expr1: The dependent variable (y). In statistical terms, this is the outcome being modeled, the variable whose variation depends on that of another. It should be a numeric column or expression, or a non-numeric value that Oracle can implicitly convert to a numeric data type.
  • expr2: The independent variable (x). Statistically, this is the variable considered to cause, influence, or affect the outcome. It is the predictor in the regression analysis. It should also be a numeric column or expression, or a non-numeric value that Oracle can implicitly convert to a numeric data type.

Example

SELECT REGR_R2(sales, quantity) OVER (PARTITION BY store_id) as r2
FROM store_sales;

Output

| R2
| -------------
| 0.9056284
| 0.8565223
| 0.9342156
| ...

Explanation

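This example uses REGR_R2() as an analytic function in Oracle. For each store_id partition, the function calculates R-squared (the square of the correlation coefficient) for sales regressed on quantity, indicating how closely the sales column tracks the quantity column along a straight line. Because the function is called with OVER, the query returns one row for each row of store_sales, and every row of a given store carries that store's value. The result, r2, is a value between 0 and 1. A higher value indicates a stronger linear relationship between sales and quantity.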
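When one value per store is enough, the same calculation can be written as a plain aggregate with GROUP BY. The following is a minimal sketch that reuses the store_sales table and columns from the example above. The n_pairs column is an addition made here for illustration, and it relies on the REGR_COUNT function.

```sql
-- One row per store: REGR_R2 used as an aggregate with GROUP BY
-- rather than as an analytic function with OVER.
SELECT store_id,
       REGR_R2(sales, quantity)    AS r2,
       REGR_COUNT(sales, quantity) AS n_pairs  -- rows where both values are non-NULL
FROM store_sales
GROUP BY store_id
ORDER BY store_id;
```

Showing the pair count next to r2 helps when reading the result, since an R-squared computed from only a handful of rows says little about the relationship.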

REGR_R2(Y, X)

  • y: This is the dependent variable in the regression equation. It represents the outcome or result that the model will predict.
  • x: This is the independent variable in the regression equation. It represents the predictor or factor that affects the dependent variable.

Example

SELECT
REGR_R2(y_variable, x_variable)
FROM
table_name;

Output

regr_r2
-----------------
0.9123456789012

Explanation

The function REGR_R2(y_variable, x_variable) calculates the square of the Pearson correlation coefficient (R-squared) between y_variable and x_variable. The value of R-squared ranges from 0 to 1. In this example, the R-squared value is approximately 0.912, suggesting a strong linear relationship between the variables.
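One way to sanity-check the result is to compare it with the squared output of CORR. This sketch assumes the same table_name, y_variable, and x_variable as the example above, and that the database provides CORR and POWER. The corr_squared alias is hypothetical.

```sql
-- Cross-check: REGR_R2 against the squared correlation coefficient.
SELECT REGR_R2(y_variable, x_variable)        AS r2,
       POWER(CORR(y_variable, x_variable), 2) AS corr_squared
FROM table_name;
```

The two columns should agree up to floating-point rounding when both variables actually vary. They can differ in degenerate cases: if y_variable is constant while x_variable is not, REGR_R2 is typically defined to return 1, whereas CORR has no defined value and returns NULL.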
