VAR_SAMP
VAR_SAMP is a statistical function in SQL that calculates the sample variance of a set of numeric values. The sample variance is determined by the squared deviation from the mean, divided by the number of elements minus one. It's used to quantify the spread of data points.
VAR_SAMP(X)
- x: This represents the variable or expression of a set of values from which the sample variance is to be calculated. It is typically a column name in a MySQL table.
Example
SELECT VAR_SAMP(grade) AS Sample_Variance FROM Students;
Output
Sample_Variance-----------------13.2
Explanation
The SQL statement calculates the sample variance of the grade
column from the Students
table. The sample variance is a measure of the dispersion of a set of data points around their mean value.
VAR_SAMP(expression)
- expression: A column or field in the table for which the sample variance is to be calculated. This parameter must contain numeric values. It can also be an expression that results in a numeric value.
Example
CREATE TABLE sample_data(value INT);INSERT INTO sample_data VALUES (1),(2),(3),(4),(5),(6);SELECT var_samp(value) AS sample_variance FROM sample_data;
Output
sample_variance----------------- 3.5
Explanation
The VAR_SAMP function is utilized to calculate the sample variance of the specified numeric column ‘value’. With the given values from 1 to 6, the sample variance resulting from calculation is 3.5.
VAR_SAMP( expression ) OVER ( [ query_partition_clause ] [ order_by_clause ] )
- expression: The column or expression for which the variance sample is to be calculated. It should contain numeric data type.
- query_partition_clause: Defines the grouping to calculate the variance sample. In absence, the whole result set is used.
- order_by_clause: The sorting order for the data before calculating the sample variance. This is often used with analytic functions. Not specifying this clause will treat all rows of the partition equally.
Example
SELECT VAR_SAMP(age) FROM employees;
Output
VAR_SAMP(AGE)-------------400.23489956
Explanation
The code calculates the sampled variance of the ages in the ‘employees’ table. The difference between this function and VAR_POP
is that VAR_SAMP
considers only a sample of the population, hence it divides the sum of squared difference by n-1
instead of n
. The output indicates the sampled variance of the ages.