Standardized coefficient
In statistics, standardized coefficients or beta coefficients are the estimates resulting from a regression analysis in which the variables have been standardized so that the variances of the dependent and independent variables are 1.[1] A standardized coefficient therefore expresses how many standard deviations the dependent variable changes per one-standard-deviation increase in the predictor variable. In a simple regression with a single predictor, the standardized coefficient equals the correlation coefficient between the predictor and the dependent variable. Coefficients are usually standardized to answer the question of which independent variable has the greater effect on the dependent variable in a multiple regression analysis when the variables are measured in different units (for example, income measured in dollars and family size measured in number of individuals).
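The link between a single-predictor regression and the correlation coefficient can be checked numerically. The following is a minimal sketch using only the Python standard library; the data and the helper names `ols_slope` and `pearson_r` are made up for illustration and do not come from the cited sources. It rescales an ordinary least squares slope by the ratio of sample standard deviations and compares the result with Pearson's r, including its sign.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data with a negative relationship.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
errors = [9.0, 7.0, 8.0, 4.0, 5.0, 2.0]

b = ols_slope(hours, errors)               # unstandardized slope
beta = b * stdev(hours) / stdev(errors)    # standardized slope
r = pearson_r(hours, errors)

print(f"b = {b:.4f}, beta = {beta:.4f}, r = {r:.4f}")
```

In this sketch `beta` and `r` agree up to floating-point rounding, and both are negative because the relationship in the made-up data is negative.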
Some statistical software packages, such as PSPP, SPSS and SYSTAT, label the standardized regression coefficients "Beta" and the unstandardized coefficients "B". Others, such as DAP/SAS, label them "Standardized Coefficient". Sometimes the unstandardized coefficients are also labeled "b".
A regression carried out on the original (unstandardized) variables produces unstandardized coefficients. A regression carried out on standardized variables produces standardized coefficients. Either set of coefficients can also be derived from the other after the analysis, using the standard deviations of the variables.
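For ordinary least squares regression with an intercept, the conversion after the analysis can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the cited sources: b_j is the unstandardized coefficient of predictor x_j, \beta_j its standardized counterpart, and s_{x_j} and s_y the sample standard deviations of that predictor and of the dependent variable.

```latex
\beta_j = b_j \, \frac{s_{x_j}}{s_y},
\qquad
b_j = \beta_j \, \frac{s_y}{s_{x_j}}
```

The intercept of the standardized regression is zero, because every standardized variable has mean zero.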
Before solving a multiple regression problem, all variables (independent and dependent) can be standardized. Each variable is standardized by subtracting its mean from each of its values and then dividing the resulting differences by the variable's standard deviation. Standardizing all variables in a multiple regression yields standardized regression coefficients, which show the change in the dependent variable, measured in standard deviations, per one-standard-deviation change in a predictor.
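The standardization procedure can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not a reference implementation: the data are invented, the helpers `zscore` and `ols_two_predictors` are hypothetical names, the sample standard deviation (n - 1 denominator) is used, and the regression is limited to two predictors so that the normal equations can be solved directly without a linear algebra library.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize: subtract the mean, then divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def ols_two_predictors(x1, x2, y):
    """Least squares slopes of y on x1 and x2 (model with intercept),
    obtained by solving the 2x2 normal equations on centered data."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * c for a, c in zip(c1, cy))
    s2y = sum(b * c for b, c in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # zero only if the predictors are perfectly collinear
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Hypothetical data: income in dollars, family size in persons, spending in dollars.
income = [30_000.0, 42_000.0, 55_000.0, 61_000.0, 78_000.0, 90_000.0]
family_size = [2.0, 4.0, 3.0, 5.0, 1.0, 4.0]
spending = [900.0, 1500.0, 1400.0, 2100.0, 2000.0, 2900.0]

# Regression on the original variables: coefficients depend on the units.
b1, b2 = ols_two_predictors(income, family_size, spending)

# Regression on standardized variables: coefficients are in standard deviation units.
beta1, beta2 = ols_two_predictors(
    zscore(income), zscore(family_size), zscore(spending)
)

print(f"unstandardized: income = {b1:.5f}, family size = {b2:.3f}")
print(f"standardized:   income = {beta1:.3f}, family size = {beta2:.3f}")
```

The unstandardized coefficients differ by orders of magnitude simply because dollars and persons are different units, whereas the standardized ones are on a common scale. Rescaling each unstandardized coefficient by the ratio of the predictor's standard deviation to that of the dependent variable gives the same values as the regression on standardized variables.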
- Advantages
- Advocates of standardized coefficients note that the coefficients do not depend on the units in which the independent variables are measured, which makes comparisons easy.
- Disadvantages
- Critics voice concerns that such standardization can be misleading. Because standardizing a variable removes the unit of measurement from its values, a standardized coefficient for a given relationship represents its strength only relative to the variation in the sample distributions. This invites bias due to sampling error when variables are standardized using means and standard deviations estimated from small samples. Furthermore, a change of one standard deviation in one predictor is equivalent to a change of one standard deviation in another predictor only insofar as the shapes of the two variables' distributions resemble one another. The meaning of a standard deviation may vary markedly between non-normal distributions (e.g., skewed or otherwise asymmetrical ones). This underscores the importance of normality assumptions in parametric statistics and poses an additional problem for interpreting standardized coefficient estimates, one that even nonparametric regression does not solve when the distributions are non-normal.
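The dependence on sample variation can be illustrated with a small constructed example. The sketch below is an assumption-laden illustration rather than an analysis from the cited sources: two invented samples share exactly the same underlying slope (2 units of y per unit of x) and the same noise, and differ only in how widely the predictor is spread. The helper names `ols_slope` and `fit` are hypothetical.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Ordinary least squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Invented noise pattern, chosen to be uncorrelated with both predictors below.
noise = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

# Two hypothetical samples: same relationship, different spread of the predictor.
x_narrow = [4.0, 5.0, 6.0, 4.0, 5.0, 6.0]
x_wide = [1.0, 5.0, 9.0, 1.0, 5.0, 9.0]

def fit(x):
    """Return (unstandardized slope, standardized slope) for y = 2x + noise."""
    y = [2.0 * xi + e for xi, e in zip(x, noise)]
    b = ols_slope(x, y)
    return b, b * stdev(x) / stdev(y)

b_narrow, beta_narrow = fit(x_narrow)
b_wide, beta_wide = fit(x_wide)

print(f"narrow sample: b = {b_narrow:.3f}, beta = {beta_narrow:.3f}")
print(f"wide sample:   b = {b_wide:.3f}, beta = {beta_wide:.3f}")
```

The unstandardized slope is 2 in both samples, while the standardized coefficient is roughly 0.85 in the narrow sample and 0.99 in the wide one. The difference reflects only how much the predictor happens to vary in each sample, not a change in the underlying relationship.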
References
- Larry D. Schroeder, David L. Sjoquist and Paula E. Stephan (1986). Understanding Regression Analysis. Sage Publications. ISBN 0-8039-2758-4, pp. 31-32.
- Eric Vittinghoff, David V. Glidden, Stephen C. Shiboski and Charles E. McCulloch (2005). Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Springer, pp. 75-76.
External links
- Glossary of social science terms
- Which Predictors Are More Important? - why standardized coefficients are used