On Model Selection in Some Regularized Linear Regression Methods

Kubus, Mariusz

dc.contributor.author	Kubus, Mariusz
dc.date.accessioned	2015-06-22T09:51:28Z
dc.date.available	2015-06-22T09:51:28Z
dc.date.issued	2013
dc.identifier.issn	0208-6018
dc.identifier.uri	http://hdl.handle.net/11089/10042
dc.description.abstract	Regularization formulas for linear models have developed rapidly in recent years. Penalizing the values of coefficients reduces variance (by shrinking coefficients toward zero) and performs feature selection (by setting some coefficients exactly to zero). In high-dimensional problems, feature selection via regularized linear models is preferred over popular wrapper methods because it carries a lower computational burden and is less prone to overfitting. However, the estimated coefficients, and consequently the quality of the model, depend on tuning parameters. The model selection criteria available in R implementations do not guarantee that the optimal model will be chosen. Based on a simulation study, we propose the EDC criterion as an alternative.	pl_PL
dc.description.abstract	In recent years, a dynamic development of various forms of regularization in linear models can be observed. Introducing a penalty for large coefficient values results in reduced variance (coefficient values are "pulled" toward zero) and in the elimination of some variables (some coefficients become zero). In high-dimensional problems, variable selection by means of regularized linear models is preferred over the popular approach of searching the feature space and evaluating subsets of variables with a model quality criterion (wrappers). The reasons are lower computational cost and lower susceptibility to overfitting. However, the values of the estimated coefficients (and hence also the quality of the model) depend on the regularization parameters. The model quality criteria implemented in R for this purpose do not guarantee the choice of the optimal model. Based on the simulations carried out, the article proposes using the EDC criterion.	pl_PL
dc.language.iso	en	pl_PL
dc.publisher	Wydawnictwo Uniwersytetu Łódzkiego	pl_PL
dc.relation.ispartofseries	Acta Universitatis Lodziensis. Folia Oeconomica;285
dc.subject	model selection	pl_PL
dc.subject	EDC	pl_PL
dc.subject	regularization	pl_PL
dc.subject	linear models	pl_PL
dc.subject	feature selection	pl_PL
dc.title	On Model Selection in Some Regularized Linear Regression Methods	pl_PL
dc.title.alternative	O wyborze postaci modelu w wybranych metodach regularyzowanej regresji liniowej	pl_PL
dc.type	Article	pl_PL
dc.page.number	[115]-123	pl_PL
dc.contributor.authorAffiliation	Opole University of Technology, Department of Mathematics and Applied Computer Science	pl_PL
dc.references	Bai Z.D., Krishnaiah P.R., Zhao L.C. (1986), On the detection of the number of signals in the presence of white noise, J. Multivariate Anal. 20, p. 1–25	pl_PL
dc.references	Breiman L., Spector P. (1992), Submodel selection and evaluation in regression: the X-random case, International Statistical Review 60: p. 291–319	pl_PL
dc.references	Burnham K. P., Anderson D.R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach, 2nd ed. Springer-Verlag	pl_PL
dc.references	Efron B., Hastie T., Johnstone I., Tibshirani R. (2004), Least Angle Regression, Annals of Statistics 32 (2): p. 407–499	pl_PL
dc.references	Guyon I., Gunn S., Nikravesh M., Zadeh L. (2006), Feature Extraction: Foundations and Applications. Springer, New York	pl_PL
dc.references	Hastie T., Tibshirani R., Friedman J. (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edition, Springer, New York	pl_PL
dc.references	Hurvich C. M., Tsai C.-L. (1989), Regression and time series model selection in small samples, Biometrika, 76: p. 297–307	pl_PL
dc.references	Kundu D., Murali G. (1996), Model selection in linear regression, Computational Statistics & Data Analysis 22, p. 461–469	pl_PL
dc.references	Maddala G.S. (2008), Ekonometria, PWN, Warszawa	pl_PL
dc.references	Nemenyi P. B. (1963), Distribution-free multiple comparisons, PhD thesis, Princeton University	pl_PL
dc.references	Tibshirani R. (1996), Regression shrinkage and selection via the lasso, J. Royal Statist. Soc. B, 58: p. 267–288	pl_PL
dc.references	Wahba G. (1980), Spline bases, regularization, and generalized cross-validation for solving approximation problems with large quantities of noisy data, Proc. of the Inter. Conf. on Approximation Theory in Honour of George Lorentz, Academic Press, Austin, Texas, p. 905–912	pl_PL
dc.references	Zou H., Hastie T. (2005), Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society Series B, 67(2): p. 301–320	pl_PL

Files in this item

Name: 13-kubus.pdf
Size: 214.7KB
Format: PDF

This item appears in the following collections

Acta Universitatis Lodziensis. Folia Oeconomica nr 285/2013 [28]
