Entry | Influential observation |
Definition |
In statistics, an influential observation is one whose deletion from the dataset would noticeably change the result of a statistical calculation.[1] In particular, in regression analysis an influential point is one whose deletion has a large effect on the parameter estimates.[2]

Assessment

Various methods have been proposed for measuring influence.[2][3] Assume an estimated regression $\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e}$, where $\mathbf{y}$ is an n×1 column vector for the response variable, $\mathbf{X}$ is the n×k design matrix of explanatory variables (including a constant), $\mathbf{e}$ is the n×1 residual vector, and $\mathbf{b}$ is a k×1 vector of estimates of some population parameter $\boldsymbol{\beta} \in \mathbb{R}^{k}$. Also define $\mathbf{H} \equiv \mathbf{X}(\mathbf{X}^{\mathsf{T}}\mathbf{X})^{-1}\mathbf{X}^{\mathsf{T}}$, the projection matrix of $\mathbf{X}$. Measures of influence defined in terms of these quantities include DFBETA, DFFITS, and Cook's distance.
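As a rough illustration of the quantities defined above, the following sketch computes the projection matrix H, its diagonal (the leverages), and two widely used deletion diagnostics: DFBETA (the change in the coefficient vector when observation i is dropped) and Cook's distance. The function name `influence_measures` and the toy data are assumptions made for this example and do not come from the cited sources; the formulas are the standard closed forms for ordinary least squares.

```python
import numpy as np

def influence_measures(X, y):
    """Return OLS estimates, leverages, DFBETA rows and Cook's distances.

    X is the n x k design matrix (including a constant column),
    y is the response vector of length n.
    """
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    H = X @ XtX_inv @ X.T            # projection ("hat") matrix
    h = np.diag(H)                   # leverage of each observation
    b = XtX_inv @ X.T @ y            # coefficient estimates
    e = y - X @ b                    # residuals
    s2 = e @ e / (n - k)             # residual variance estimate

    # Row i is b - b_(-i): the change in b when observation i is deleted.
    dfbeta = (X @ XtX_inv) * (e / (1.0 - h))[:, None]

    # Cook's distance combines residual size and leverage.
    cooks_d = e**2 * h / (k * s2 * (1.0 - h) ** 2)
    return b, h, dfbeta, cooks_d

# Hypothetical toy data: the last point lies far from the others in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 4.0])
X = np.column_stack([np.ones_like(x), x])

b, h, dfbeta, cooks_d = influence_measures(X, y)
print("leverage:", np.round(h, 3))
print("Cook's D:", np.round(cooks_d, 3))
```

In this toy dataset the last observation has both the highest leverage and the largest Cook's distance, which is the pattern one would expect from an influential point. The closed-form DFBETA avoids refitting the regression n times, but it breaks down when a leverage equals 1, since the expression divides by 1 − h.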
Outliers, leverage and influence

An outlier may be defined as a surprising data point. Leverage is a measure of how much the estimated value of the dependent variable changes when the point is removed. There is one value of leverage for each data point.[6] Data points with high leverage force the regression line to be close to the point.[5] In Anscombe's quartet, only the fourth dataset (the bottom right panel in the usual figure) has a point with high leverage.
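The leverage claim about Anscombe's quartet can be checked numerically. The sketch below uses the x values of the quartet's fourth dataset (ten observations at x = 8 and one at x = 19) and reads the leverages off the diagonal of the projection matrix for a simple regression with an intercept. The variable names are chosen for this example only.

```python
import numpy as np

# x values of the fourth dataset in Anscombe's quartet.
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)

# Design matrix with a constant column, as in a simple linear regression.
X4 = np.column_stack([np.ones_like(x4), x4])

# Leverages are the diagonal entries of the projection matrix.
H4 = X4 @ np.linalg.inv(X4.T @ X4) @ X4.T
leverage = np.diag(H4)

print(np.round(leverage, 3))
```

The point at x = 19 has leverage 1 (the largest possible value), while each of the other ten points has leverage 0.1, so the fitted line is forced to pass through that single observation. Leverage depends only on the explanatory variables, which is why the y values are not needed here.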
References

1. Burt, James E.; Barber, Gerald M.; Rigby, David L. (2009). Elementary Statistics for Geographers. Guilford Press. p. 513. ISBN 9781572304840. https://books.google.com/books?id=p7YMOPuu8ugC&pg=PA513
2. Winner, Larry (March 25, 2002). "Influence Statistics, Outliers, and Collinearity Diagnostics". http://stat.ufl.edu/~winner/sta6127/influence.doc
3. Belsley, David A.; Kuh, Edwin; Welsch, Roy E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. pp. 11–16. ISBN 0-471-05856-4. https://books.google.com/books?id=GECBEUJVNe0C&pg=PA11
4. "Outliers and DFBETA". http://www.albany.edu/faculty/kretheme/PAD705/SupportMat/DFBETA.pdf (archived May 11, 2013 at https://web.archive.org/web/20130511013229/http://www.albany.edu/faculty/kretheme/PAD705/SupportMat/DFBETA.pdf)
5. Everitt, Brian (1998). The Cambridge Dictionary of Statistics. Cambridge, UK; New York: Cambridge University Press. ISBN 0-521-59346-8.
6. Hurvich, Clifford. "Simple Linear Regression VI: Leverage and Influence". NYU Stern. http://pages.stern.nyu.edu/~churvich/Undergrad/Handouts2/31-Reg6.pdf (archived September 21, 2006 at https://web.archive.org/web/20060921160749/http://pages.stern.nyu.edu/~churvich/Undergrad/Handouts2/31-Reg6.pdf)
Categories: Actuarial science | Regression diagnostics | Robust statistics