Veri Madenciliği Teknikleriyle Türkçe Web Sayfalarının Kategorize Edilmesi

Translated title of the contribution: Categorizing the Turkish web pages by data mining techniques

Seçil Şekerci Hüsem, Ayla Gülcü

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

The growing volume of data can no longer be handled by human effort alone. Automated methods are therefore needed to group similar documents together or to assign documents to predefined categories according to certain rules, and automated classification techniques are becoming increasingly important. In this study, a database of 22 thousand samples was created to address the need for Turkish data, and various text classification methods from the literature were tested on it. Multinomial Naive Bayes (M-NB) and Support Vector Machines (SVM), two algorithms frequently used for text classification, were compared using n-gram word vector selection and information gain ratio. The study also examines how the number of categories, the content of the data used to train the model, and the completeness of that data affect classification success.
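As a rough illustration of one of the techniques named in the abstract, the sketch below implements Multinomial Naive Bayes over word n-gram counts in plain Python. It is not the authors' implementation: the function and class names, the add-one smoothing, the choice of unigrams plus bigrams, and the four toy training sentences are all assumptions made for this example. SVM and information-gain feature selection are not shown.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from a whitespace-split text."""
    # Note: str.lower() does not handle the Turkish dotted/dotless I correctly;
    # the toy data below is already lowercase, so this is ignored here.
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            grams = word_ngrams(doc)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.total_docs = len(labels)
        return self

    def predict(self, doc):
        # n-grams never seen in training are ignored
        grams = [g for g in word_ngrams(doc) if g in self.vocab]
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            for g in grams:
                score += math.log(
                    (counts[g] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, invented for illustration only (not from the study).
train_docs = [
    "takım maçı kazandı gol attı",
    "futbol maçı gol",
    "borsa dolar faiz yükseldi",
    "dolar faiz kararı",
]
train_labels = ["spor", "spor", "ekonomi", "ekonomi"]

model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("gol maçı"))
print(model.predict("dolar faiz"))
```

On a real corpus, a library implementation with proper tokenization and a held-out test set would be the more sensible starting point; the sketch only shows how n-gram counts feed the class scores.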

Translated title of the contribution: Categorizing the Turkish web pages by data mining techniques
Original language: Turkish
Title of host publication: 2nd International Conference on Computer Science and Engineering, UBMK 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 255-260
Number of pages: 6
ISBN (Electronic): 9781538609309
Publication status: Published - 31 Oct 2017
Event: 2nd International Conference on Computer Science and Engineering, UBMK 2017 - Antalya, Turkey
Duration: 5 Oct 2017 to 8 Oct 2017

Publication series

Name: 2nd International Conference on Computer Science and Engineering, UBMK 2017

Conference

Conference: 2nd International Conference on Computer Science and Engineering, UBMK 2017
Country/Territory: Turkey
City: Antalya
Period: 5/10/17 to 8/10/17
