Entry | Generalized additive model for location, scale and shape |
Definition |
The generalized additive model for location, scale and shape (GAMLSS) is a distribution-based approach to (semiparametric) regression and, more broadly, to statistical modelling and learning. A parametric distribution is assumed for the response (target) variable, but the parameters of this distribution can vary according to explanatory variables through linear, nonlinear or smooth functions. In machine learning parlance, GAMLSS is a form of supervised learning. The framework allows flexible regression and smoothing models to be fitted to the data. The assumed response distribution may be heavy- or light-tailed and positively or negatively skewed, and all of its parameters [location (e.g., mean), scale (e.g., variance) and shape (skewness and kurtosis)] can be modelled as linear, nonlinear or smooth functions of explanatory variables.

Overview of the model

GAMLSS is a statistical model developed by Rigby and Stasinopoulos, and later expanded, to overcome some of the limitations associated with the popular generalized linear models (GLMs) and generalized additive models (GAMs). For background on these models, see Nelder and Wedderburn (1972)[1] and Hastie and Tibshirani's book.[2]

In GAMLSS the exponential family distribution assumption for the response variable, y (essential in GLMs and GAMs), is relaxed and replaced by a general distribution family, including highly skew and/or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as linear and/or nonlinear, parametric and/or additive non-parametric functions of explanatory variables and/or random effects.
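The idea that each distribution parameter gets its own predictor can be illustrated with a minimal sketch. It assumes, purely for illustration, a normal response with an identity link for the location and a log link for the scale, both linear in a single explanatory variable. The function names, coefficient values and toy data are hypothetical and are not part of any GAMLSS package.

```python
import math


def predict_parameters(x, beta_mu, beta_sigma):
    """Map one explanatory value x to (mu, sigma) of a normal response.

    mu uses an identity link; sigma uses a log link so it stays positive.
    Both choices are assumptions made for this sketch.
    """
    mu = beta_mu[0] + beta_mu[1] * x
    sigma = math.exp(beta_sigma[0] + beta_sigma[1] * x)
    return mu, sigma


def negative_log_likelihood(xs, ys, beta_mu, beta_sigma):
    """Sum of -log f(y_i | mu_i, sigma_i) over independent observations."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu, sigma = predict_parameters(x, beta_mu, beta_sigma)
        z = (y - mu) / sigma
        total += 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * z * z
    return total


# Hypothetical toy data whose spread grows with x.
xs = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
ys = [-0.1, 0.1, 0.7, 1.3, 1.0, 3.0]

# Same mean model in both cases; only the scale model differs.
constant_scale = negative_log_likelihood(xs, ys, (0.0, 1.0), (0.0, 0.0))
varying_scale = negative_log_likelihood(xs, ys, (0.0, 1.0), (-2.3, 1.15))
print(constant_scale, varying_scale)
```

On this toy data the scale model that depends on x gives the smaller negative log-likelihood, which is the kind of heterogeneity a mean-only model cannot express. A real fit would estimate the coefficients by (penalized) maximum likelihood rather than fixing them by hand.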
GAMLSS is especially suited to modelling a leptokurtic or platykurtic and/or positively or negatively skewed response variable. For count response data it handles over-dispersion by using proper over-dispersed discrete distributions. Heterogeneity is handled by modelling the scale or shape parameters using explanatory variables. There are several packages written in R related to GAMLSS models.[3]

A GAMLSS model assumes independent observations $y_i$ for $i = 1, 2, \ldots, n$ with probability (density) function $f(y_i \mid \mu_i, \sigma_i, \nu_i, \tau_i)$ conditional on $(\mu_i, \sigma_i, \nu_i, \tau_i)$, a vector of four distribution parameters, each of which can be a function of the explanatory variables. The first two population distribution parameters $\mu_i$ and $\sigma_i$ are usually characterized as location and scale parameters, while the remaining parameter(s), if any, are characterized as shape parameters, e.g. skewness and kurtosis parameters. The model may nevertheless be applied more generally to the parameters of any population distribution with up to four distribution parameters, and can be generalized to more than four.

$$
\begin{aligned}
g_1(\mu) &= \eta_1 = X_1 \beta_1 + \sum_{j=1}^{J_1} h_{j1}(x_{j1}) \\
g_2(\sigma) &= \eta_2 = X_2 \beta_2 + \sum_{j=1}^{J_2} h_{j2}(x_{j2}) \\
g_3(\nu) &= \eta_3 = X_3 \beta_3 + \sum_{j=1}^{J_3} h_{j3}(x_{j3}) \\
g_4(\tau) &= \eta_4 = X_4 \beta_4 + \sum_{j=1}^{J_4} h_{j4}(x_{j4})
\end{aligned}
$$

where $\mu$, $\sigma$, $\nu$, $\tau$ and $\eta_k$ are vectors of length $n$, $g_k$ is a known monotonic link function, $\beta_k^T = (\beta_{1k}, \beta_{2k}, \ldots, \beta_{J'_k k})$ is a parameter vector of length $J'_k$, $X_k$ is a fixed known design matrix of order $n \times J'_k$ and $h_{jk}$ is a smooth non-parametric function of explanatory variable $x_{jk}$, for $j = 1, 2, \ldots, J_k$ and $k = 1, 2, 3, 4$.

For centile estimation, the WHO Multicentre Growth Reference Study Group recommended GAMLSS and the Box-Cox power exponential (BCPE) distribution[4] for the construction of the WHO Child Growth Standards.[5][6]

What distributions can be used

The form of the distribution assumed for the response variable y is very general. For example, an implementation of GAMLSS in R[7] has around 50 different distributions available. Such implementations also allow the use of truncated distributions and censored (or interval) response variables.[7]

References

1. Nelder, J.A.; Wedderburn, R.W.M. (1972). "Generalized linear models". J. R. Stat. Soc. A. 135 (3): 370–384. doi:10.2307/2344614. JSTOR 2344614.
2. Hastie, T.J.; Tibshirani, R.J. (1990). Generalized additive models. London: Chapman and Hall.
3. Stasinopoulos, D. Mikis; Rigby, Robert A. (December 2007). "Generalized additive models for location scale and shape (GAMLSS) in R". Journal of Statistical Software. 23 (7). doi:10.18637/jss.v023.i07.
4. Rigby, Robert; Stasinopoulos, D. Mikis (February 2004). "Smooth Centile Curves for Skew and Kurtotic data Modelled Using the Box-Cox Power Exponential Distribution". Statistics in Medicine. 23 (19): 3053–3076. doi:10.1002/sim.1861. PMID 15351960.
5. Borghi, E.; De Onis, M.; Garza, C.; Van Den Broeck, J.; Frongillo, E. A.; Grummer-Strawn, L.; Van Buuren, S.; Pan, H.; Molinari, L.; Martorell, R.; Onyango, A. W.; Martines, J. C.; WHO Multicentre Growth Reference Study Group (2006). "Construction of the World Health Organization child growth standards: Selection of methods for attained growth curves". Statistics in Medicine. 25 (2): 247–265. doi:10.1002/sim.2227. PMID 16143968.
6. WHO Multicentre Growth Reference Study Group (2006). WHO Child Growth Standards: Length/height-for-age, weight-for-age, weight-for-length, weight-for-height and body mass index-for-age: Methods and development. Geneva: World Health Organization.
7. R packages for GAMLSS can be downloaded from here.

Further reading
External links
Categories: Generalized linear models | Semi-parametric models