Ensembles of Classifiers for Parallel Categorization of Large Number of Text Documents Expressing Opinions

František Dařena and Jan Zizka
Additional contact information
Jan Zizka: Department of Informatics, Faculty of Business and Economics, Mendel University in Brno, Zemedelska 1, 613 00 Brno, Czech Republic

No 2016-65, MENDELU Working Papers in Business and Economics from Mendel University in Brno, Faculty of Business and Economics

Abstract: Opinions provided by people who have used services or purchased goods are a rich source of knowledge. Opinion classification, mostly performed with supervised classifiers, is one of the essential tasks. Computers' technological capabilities remain a major obstacle, especially when processing huge volumes of data. This study proposes and experimentally evaluates an application of parallelism to the classification of a very large number of contrary opinions expressed as freely written text reviews. Instead of training a single classifier on the entire data set, an ensemble of classifiers is trained on disjoint subsets of the data, and a group decision is used to classify unlabelled items. The main assessment criteria are computational efficiency and error rates, combined into a single measure so that ensembles of different sizes can be compared. Support vector machines, artificial neural networks, and decision trees, all frequently used classification methods, were examined. The paper demonstrates the viability of the suggested method when the number of text reviews leads to computational complexity beyond the capabilities of a contemporary common PC. Classification accuracy and the values of other classification performance measures (Precision, Recall, F-measure) did not decrease, which is a positive finding.
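
The scheme outlined in the abstract (split the labelled reviews into non-overlapping subsets, train one classifier per subset, and let the group decide) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy word-count learner stands in for the support vector machines, neural networks, and decision trees examined in the paper, a simple majority vote is assumed as the group decision, and all names (`split_disjoint`, `WordCountClassifier`, `train_ensemble`, `majority_vote`) and the sample reviews are hypothetical.

```python
from collections import Counter


def split_disjoint(items, k):
    """Partition items into k non-overlapping subsets (round-robin)."""
    return [items[i::k] for i in range(k)]


class WordCountClassifier:
    """Toy base learner (an assumption, not a method from the paper)."""

    def fit(self, labelled):
        # Count how often each word occurs in the reviews of each class.
        self.counts = {}
        for text, label in labelled:
            self.counts.setdefault(label, Counter()).update(text.lower().split())
        return self

    def predict(self, text):
        words = text.lower().split()

        def score(label):
            c = self.counts[label]
            total = sum(c.values()) or 1
            # Sum of relative word frequencies within this class.
            return sum(c[w] / total for w in words)

        # Sorting the labels makes tie-breaking deterministic.
        return max(sorted(self.counts), key=score)


def train_ensemble(labelled, k):
    # Each member sees only its own subset, so the fits are independent
    # and could be dispatched to separate workers or machines.
    return [WordCountClassifier().fit(s) for s in split_disjoint(labelled, k)]


def majority_vote(ensemble, text):
    """Group decision assumed here: the label predicted by most members."""
    votes = Counter(member.predict(text) for member in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical labelled reviews, for illustration only.
reviews = [
    ("great room friendly staff", "pos"),
    ("dirty room rude staff", "neg"),
    ("excellent breakfast great location", "pos"),
    ("terrible breakfast noisy location", "neg"),
    ("friendly service excellent value", "pos"),
    ("rude service terrible value", "neg"),
]
ensemble = train_ensemble(reviews, k=3)
label = majority_vote(ensemble, "great friendly excellent")
```

Because every member is fitted on a fraction of the data, the per-member training cost drops as the ensemble grows, which is the trade-off against error rate that the abstract's combined measure is meant to capture.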

Keywords: text documents; natural language; classification; parallel processing; ensembles of classifiers; machine learning
JEL-codes: C38 C89
Pages: 17
Date: 2016-12
New Economics Papers: this item is included in nep-cmp

Downloads: (external link)
http://ftp.mendelu.cz/RePEc/men/wpaper/65_2016.pdf Full text (application/pdf)

Persistent link: https://EconPapers.repec.org/RePEc:men:wpaper:65_2016

Bibliographic data for series maintained by Luděk Kouba.

 
Page updated 2023-06-15
Handle: RePEc:men:wpaper:65_2016