Statystyczne metody klasyfikacji tekstów

Idczak, Adam; Korzeniewski, Jerzy

dc.contributor.author	Idczak, Adam
dc.contributor.author	Korzeniewski, Jerzy
dc.date.accessioned	2022-04-04T12:07:21Z
dc.date.available	2022-04-04T12:07:21Z
dc.date.issued	2022
dc.identifier.citation	Idczak A., Korzeniewski J., Statystyczne metody klasyfikacji tekstów, WUŁ, Łódź 2022, https://doi.org/10.18778/8220-786-6	pl_PL
dc.identifier.isbn	978-83-8220-786-6
dc.identifier.uri	http://hdl.handle.net/11089/41490
dc.description.abstract	In recent years, with the rapid development of computer and internet technologies, computational methods of text analysis have gained importance, in particular methods for determining the sentiment, or tone, of a text. These methods can subsequently be applied to problems such as text summarization, information retrieval from text, text proofreading, machine translation, and many others. This monograph contains a review of sentiment analysis methods developed mainly for English-language documents, a study of the effectiveness of selected sentiment analysis methods when applied to Polish-language documents, and proposals of new methods that may improve classification quality. In the new proposals, the emphasis is on binary classification problems, on not using external sources, and on relying on the training set as little as possible. We propose shifting the burden of text classification from a large training set to finding and analyzing relationships between the words that make up a document, and even between groups of words. The proposed method has a simple interpretation, can compete with standard methods, and can be applied to other problems related to determining the sentiment of texts.	en_US
dc.language.iso	pl	pl_PL
dc.publisher	Wydawnictwo Uniwersytetu Łódzkiego	pl_PL
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	document sentiment	en_US
dc.subject	document classification	en_US
dc.subject	machine learning methods	en_US
dc.subject	linear correlation	en_US
dc.subject	SVM method	en_US
dc.subject	Bayes classifier	en_US
dc.title	Statystyczne metody klasyfikacji tekstów	pl_PL
dc.type	Book	pl_PL
dc.page.number	142	pl_PL
dc.contributor.authorAffiliation	Uniwersytet Łódzki, Wydział Ekonomiczno-Socjologiczny, Instytut Statystyki i Demografii, Katedra Metod Statystycznych	pl_PL
dc.contributor.authorAffiliation	Uniwersytet Łódzki, Wydział Ekonomiczno-Socjologiczny, Instytut Statystyki i Demografii, Katedra Demografii	pl_PL
dc.identifier.eisbn	978-83-8220-787-3
dc.identifier.doi	10.18778/8220-786-6