Classification of Large Data Sets. Comparison of Performance of Chosen Algorithms

Dudek, Andrzej

dc.contributor.author	Dudek, Andrzej
dc.date.accessioned	2015-06-22T09:46:22Z
dc.date.available	2015-06-22T09:46:22Z
dc.date.issued	2013
dc.identifier.issn	0208-6018
dc.identifier.uri	http://hdl.handle.net/11089/10038
dc.description.abstract	Researchers who apply cluster analysis to large data sets (more than 100,000 objects) often run into the computational complexity of the algorithms, which sometimes makes it impossible to complete the analysis in an acceptable time. A common solution is to use less computationally complex algorithms (such as k-means), which in many cases give much worse results than, for example, algorithms based on eigenvalue decomposition. The results of analyses of real data sets of this type are therefore usually a compromise between quality and the computational capabilities of computers. This article attempts to present the current state of knowledge on the classification of large data sets and to identify directions of development and open problems.	pl_PL
dc.description.abstract	Researchers who use cluster analysis methods on large data sets (more than 100,000 objects) often face the problem of the computational complexity of algorithms, which sometimes makes it impossible to carry out the analysis in an acceptable time. One solution to this problem is to use less computationally complex algorithms (hierarchical agglomerative, k-means), which in turn can in many situations give decidedly worse results than, for example, algorithms that use eigenvalue decomposition. The results of real analyses of data sets of this type are therefore usually a compromise between quality and the computational capabilities of computers. The article attempts to present the current state of knowledge on the classification of large data sets and to indicate directions of development and open problems.	pl_PL
dc.language.iso	en	pl_PL
dc.publisher	Wydawnictwo Uniwersytetu Łódzkiego	pl_PL
dc.relation.ispartofseries	Acta Universitatis Lodziensis. Folia Oeconomica;285
dc.subject	clustering	pl_PL
dc.subject	classification	pl_PL
dc.subject	large data sets	pl_PL
dc.title	Classification of Large Data Sets. Comparison of Performance of Chosen Algorithms	pl_PL
dc.title.alternative	Klasyfikacja dużych zbiorów porównanie wydajności wybranych algorytmów	pl_PL
dc.type	Article	pl_PL
dc.page.number	[71]-77	pl_PL
dc.contributor.authorAffiliation	Wrocław, University of Economics, Chair of Econometrics and Informatics	pl_PL
dc.references	Bock H.H., Diday E. (eds.) (2000), Analysis of symbolic data. Explanatory methods for extracting statistical information from complex data, Springer-Verlag, Berlin	pl_PL
dc.references	Diday E., Noirhomme-Fraiture M. (eds.) (2008), Symbolic Data Analysis with SODAS Software, John Wiley & Sons, Chichester	pl_PL
dc.references	Dimitriadou E., Weingessel A., Hornik K. (2001), Voting-Merging: An Ensemble Method for Clustering, [in:] G. Dorffner, H. Bischof, K. Hornik (eds.), Artificial Neural Networks – ICANN 2001, Lecture Notes in Computer Science, volume 2130, Springer, Berlin / Heidelberg, 217–224	pl_PL
dc.references	Everitt B.S., Landau S., Leese M. (2001), Cluster analysis, Edward Arnold, London	pl_PL
dc.references	Gordon A.D. (1999), Classification, Chapman & Hall/CRC, London	pl_PL
dc.references	Hubert L.J., Arabie P. (1985), Comparing partitions. „Journal of Classification”, no. 2, 193–218	pl_PL
dc.references	Kaufman L., Rousseeuw P.J. (1990), Finding groups in data: an introduction to cluster analysis, Wiley, New York	pl_PL
dc.references	Ng A., Jordan M., Weiss Y. (2002), On spectral clustering: analysis and an algorithm, [in:] T. Dietterich, S. Becker, Z. Ghahramani (eds.), Advances in Neural Information Processing Systems 14, MIT Press, 849–856	pl_PL
dc.references	Walesiak M., Dudek A. (2010), Klasyfikacja spektralna z wykorzystaniem odległości GDM, Prace Naukowe UE we Wrocławiu nr 107, 161–171	pl_PL
dc.references	Walesiak M., Dudek A. (2011), clusterSim package, URL http://www.R-project.org	pl_PL

Files in this item

Name: 08-dudek.pdf
Size: 378.0Kb
Format: PDF


This item appears in the following Collection(s)

Acta Universitatis Lodziensis. Folia Oeconomica nr 285/2013 [28]
