Score test

Rao's score test, or simply the score test (known in econometrics as the Lagrange multiplier test^[1]), is a statistical test of a simple null hypothesis that a parameter of interest $\theta$ is equal to some particular value $\theta _{0}$ . It is the most powerful test when the true value of $\theta$ is close to $\theta _{0}$ . The main advantage of the score test is that it requires neither an estimate of the information under the alternative hypothesis nor the unconstrained maximum likelihood estimate. This is a potential advantage over other tests, such as the Wald test and the generalized likelihood ratio test (GLRT), and it makes testing feasible when the unconstrained maximum likelihood estimate is a boundary point of the parameter space.

Single parameter test

The statistic

Let $L$ be the likelihood function which depends on a univariate parameter $\theta$ and let $x$ be the data. The score $U(\theta )$ is defined as

U(\theta )={\frac {\partial \log L(\theta \mid x)}{\partial \theta }}.

The Fisher information is^[2]

I(\theta )=-\operatorname {E} \left[\left.{\frac {\partial ^{2}}{\partial \theta ^{2}}}\log L(\theta \mid X)\right|\theta \right]\,.

The statistic to test ${\mathcal {H}}_{0}:\theta =\theta _{0}$ is $S(\theta _{0})={\frac {U(\theta _{0})^{2}}{I(\theta _{0})}}$ ,

which asymptotically follows a $\chi _{1}^{2}$ distribution when $\mathcal{H}_0$ is true.
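
As an illustration of the formulas above, the following minimal sketch computes $S(\theta _{0})$ for a binomial proportion. The Bernoulli model, the function name `bernoulli_score_test`, and the example counts are assumptions chosen for illustration; they are not part of the original text. The p-value uses the identity $P(\chi _{1}^{2}>s)=\operatorname {erfc} ({\sqrt {s/2}})$ .

```python
import math

def bernoulli_score_test(successes, n, p0):
    """Score test of H0: p = p0 for n independent Bernoulli(p) trials.

    Illustrative sketch only; assumes 0 < p0 < 1.
    """
    # Score: derivative of x*log(p) + (n - x)*log(1 - p), evaluated at p0.
    score = successes / p0 - (n - successes) / (1.0 - p0)
    # Fisher information of n Bernoulli trials, evaluated at p0.
    information = n / (p0 * (1.0 - p0))
    # S(p0) = U(p0)^2 / I(p0)
    statistic = score ** 2 / information
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
stat, p = bernoulli_score_test(successes=60, n=100, p0=0.5)
print(f"S = {stat:.3f}, p-value = {p:.4f}")
```

Note that only quantities evaluated at $p_{0}$ enter the computation; no estimate of $p$ under the alternative is needed.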

Note on notation

Some texts use an alternative notation, in which the statistic $S^{*}(\theta )={\sqrt {S(\theta )}}$ is compared against a standard normal distribution. The two approaches are equivalent and give identical results for a two-sided test.

As most powerful test for small deviations

In its one-sided form, the score test rejects $H_{0}$ when

\left({\frac {\partial \log L(\theta \mid x)}{\partial \theta }}\right)_{\theta =\theta _{0}}\geq C

where $L$ is the likelihood function, $\theta _{0}$ is the value of the parameter of interest under the null hypothesis, and $C$ is a constant chosen according to the desired size of the test (i.e. the probability of rejecting $H_{0}$ if $H_{0}$ is true; see Type I error).

The score test is the most powerful test for small deviations from $H_{0}$ . To see this, consider testing $\theta =\theta _{0}$ versus $\theta =\theta _{0}+h$ . By the Neyman–Pearson lemma, the most powerful test has the form

{\frac {L(\theta _{0}+h\mid x)}{L(\theta _{0}\mid x)}}\geq K.

Taking the log of both sides yields

\log L(\theta _{0}+h\mid x)-\log L(\theta _{0}\mid x)\geq \log K.

The score test follows by making the substitution (by first-order Taylor series expansion)

\log L(\theta _{0}+h\mid x)\approx \log L(\theta _{0}\mid x)+h\times \left({\frac {\partial \log L(\theta \mid x)}{\partial \theta }}\right)_{\theta =\theta _{0}}

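and identifying the $C$ above with $\log(K)/h$ (for $h>0$ ), since the substitution turns the inequality into $h\times U(\theta _{0})\geq \log K$ .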
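
The quality of the Taylor approximation used in this argument can be checked numerically. The sketch below is only an illustration under assumed conditions: it uses a Poisson model with made-up counts, and the helper names `poisson_loglik` and `poisson_score` are hypothetical. It compares the exact log likelihood ratio with $h$ times the score at $\theta _{0}$ for shrinking $h$ .

```python
import math

def poisson_loglik(theta, data):
    # Poisson log-likelihood, dropping the additive term that does not
    # depend on theta (it cancels in the likelihood ratio).
    return sum(data) * math.log(theta) - len(data) * theta

def poisson_score(theta, data):
    # Derivative of the log-likelihood with respect to theta.
    return sum(data) / theta - len(data)

data = [3, 5, 4, 6, 2, 5, 4, 7]  # made-up counts for illustration
theta0 = 4.0

errors = []
for h in (0.5, 0.1, 0.01):
    exact = poisson_loglik(theta0 + h, data) - poisson_loglik(theta0, data)
    linear = h * poisson_score(theta0, data)
    errors.append(abs(exact - linear))
    print(f"h={h}: log LR={exact:.6f}, h*score={linear:.6f}")
```

The gap between the two quantities shrinks roughly like $h^{2}$ , which is why the score test approximates the Neyman–Pearson test only for small deviations from $\theta _{0}$ .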

Relationship with other hypothesis tests

The likelihood ratio test, the Wald test, and the score test are asymptotically equivalent tests of hypotheses.^[3] When testing nested models, the statistic for each test converges in distribution, under the null hypothesis, to a chi-squared distribution with degrees of freedom equal to the difference in degrees of freedom between the two models.
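
To make the comparison concrete, the following sketch evaluates the three statistics for a single binomial proportion. The model, the function name `three_statistics`, and the counts are assumptions for illustration; the Wald statistic here uses the information evaluated at the unrestricted estimate, which is one common convention.

```python
import math

def three_statistics(successes, n, p0):
    """Likelihood ratio, Wald and score statistics for H0: p = p0.

    Illustrative sketch; assumes 0 < successes < n and 0 < p0 < 1.
    """
    p_hat = successes / n  # unrestricted maximum likelihood estimate
    # Likelihood ratio: twice the log-likelihood gain of p_hat over p0.
    lr = 2.0 * (successes * math.log(p_hat / p0)
                + (n - successes) * math.log((1.0 - p_hat) / (1.0 - p0)))
    # Wald: squared distance, scaled by the variance estimated at p_hat.
    wald = (p_hat - p0) ** 2 / (p_hat * (1.0 - p_hat) / n)
    # Score: same distance, scaled by the variance under the null.
    score = (p_hat - p0) ** 2 / (p0 * (1.0 - p0) / n)
    return lr, wald, score

# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
lr, wald, score = three_statistics(successes=60, n=100, p0=0.5)
print(f"LR = {lr:.4f}, Wald = {wald:.4f}, score = {score:.4f}")
```

The three values differ in finite samples but are each referred to the same $\chi _{1}^{2}$ distribution; only the score statistic avoids computing the unrestricted estimate in its variance term.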

Multiple parameters

A more general score test can be derived when there is more than one parameter. Suppose that ${\hat {\theta }}_{0}$ is the maximum likelihood estimate of $\theta$ under the null hypothesis $H_{0}$ . Then

U^{T}({\hat {\theta }}_{0})I^{-1}({\hat {\theta }}_{0})U({\hat {\theta }}_{0})\sim \chi _{k}^{2}

asymptotically under $H_{0}$ , where $k$ is the number of constraints imposed by the null hypothesis and

U({\hat {\theta }}_{0})={\frac {\partial \log L({\hat {\theta }}_{0}\mid x)}{\partial \theta }}

and

I({\hat {\theta }}_{0})=-\operatorname {E} \left({\frac {\partial ^{2}\log L({\hat {\theta }}_{0}\mid x)}{\partial \theta \,\partial \theta ^{T}}}\right).

This can be used to test $H_{0}$ .
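
A minimal sketch of the multi-parameter statistic follows, assuming a normal model with parameters $(\mu ,\sigma ^{2})$ and the null hypothesis $\mu =\mu _{0}$ , so that $\sigma ^{2}$ is a nuisance parameter and $k=1$ . The model, the function name `normal_mean_score_test`, and the data are illustrative assumptions. Because the expected information of the normal model is diagonal, its inverse is applied componentwise.

```python
import math

def normal_mean_score_test(data, mu0):
    """Score test of H0: mu = mu0 in a normal model with unknown variance.

    Illustrative sketch; theta = (mu, sigma^2), one constraint (k = 1).
    """
    n = len(data)
    ss0 = sum((x - mu0) ** 2 for x in data)
    # Restricted MLE of the variance under H0.
    var0 = ss0 / n
    # Score vector evaluated at the restricted MLE (mu0, var0).
    u_mu = sum(x - mu0 for x in data) / var0
    u_var = -n / (2.0 * var0) + ss0 / (2.0 * var0 ** 2)  # zero at var0
    # Diagonal entries of the expected information matrix.
    i_mu = n / var0
    i_var = n / (2.0 * var0 ** 2)
    # Quadratic form U^T I^{-1} U for a diagonal information matrix.
    statistic = u_mu ** 2 / i_mu + u_var ** 2 / i_var
    # Upper tail of chi-squared with k = 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value, (u_mu, u_var)

sample = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0, 5.3, 5.7]  # made-up measurements
stat_mv, p_mv, score_vec = normal_mean_score_test(sample, mu0=5.0)
print(f"statistic = {stat_mv:.4f}, p-value = {p_mv:.4f}")
```

The score component for the nuisance parameter vanishes at the restricted estimate, so only the constrained direction contributes to the statistic.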

Special cases

In many situations, the score statistic reduces to another commonly used statistic.^[4]

When the data follows a normal distribution, the score statistic is the same as the t statistic.

When the data consists of binary observations, the score statistic is the same as the chi-squared statistic in Pearson's chi-squared test.

When the data consists of failure time data in two groups, the score statistic for the Cox partial likelihood is the same as the log-rank statistic in the log-rank test. Hence the log-rank test for difference in survival between two groups is most powerful when the proportional hazards assumption holds.
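
The binary-data case above can be checked numerically. The sketch below, with hypothetical function names and made-up counts, computes Pearson's chi-squared statistic over the two cells (successes and failures) and the score statistic for a Bernoulli proportion, and confirms that they agree.

```python
def pearson_chi2_binary(successes, n, p0):
    # Pearson statistic: sum over cells of (observed - expected)^2 / expected.
    observed = (successes, n - successes)
    expected = (n * p0, n * (1.0 - p0))
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def score_statistic_binary(successes, n, p0):
    # Score and Fisher information of the Bernoulli likelihood at p0.
    score = successes / p0 - (n - successes) / (1.0 - p0)
    information = n / (p0 * (1.0 - p0))
    return score ** 2 / information

# Made-up counts: 37 successes in 120 trials, null proportion 0.25.
pearson = pearson_chi2_binary(37, 120, 0.25)
score_stat = score_statistic_binary(37, 120, 0.25)
print(pearson, score_stat)
```

Both reduce algebraically to $(x-np_{0})^{2}/(np_{0}(1-p_{0}))$ , where $x$ is the number of successes.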

References

[1] Bera, Anil K.; Bilias, Yannis (2001). "Rao's score, Neyman's C(α) and Silvey's LM tests: An essay on historical developments and some new results". Journal of Statistical Planning and Inference. 97: 9–44. doi:10.1016/S0378-3758(00)00343-8. See also Engle, Robert F. (1984). "Wald, Likelihood Ratio and Lagrange Multiplier Tests in Econometrics". In Griliches, Z.; Intriligator, M. D. Handbook of Econometrics. II. Elsevier Science Publishers BV.
[2] Lehmann and Casella, eq. (2.5.16).
[3] Engle, Robert F. (1983). "Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics". In Intriligator, M. D.; Griliches, Z. Handbook of Econometrics. II. Elsevier. pp. 796–801. ISBN 978-0-444-86185-6.
[4] Cook, T. D.; DeMets, D. L., eds. (2007). Introduction to Statistical Methods for Clinical Trials. Chapman and Hall. pp. 296–297. ISBN 1-58488-027-9.

This article is issued from Wikipedia - version of the 10/21/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.