TY - GEN
T1 - Analysis of English Language Groups with Regular Expressions
AU - Duru, Ismail
AU - Diri, Banu
AU - Özçevik, M. Emir
AU - Ataseven, Kerim
AU - Doğan, Gülüstan
AU - White, Su
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/11/29
Y1 - 2018/11/29
N2 - On most widely used distance education platforms of the MOOC (Massive Open Online Course) type, lectures are delivered in English, yet participants come from many different countries. This leads to differences in learners' usage behavior and performance. In our previous studies, we tried to divide users into language groups according to their English language proficiency. In this study, we used natural language processing techniques to improve the assignment of students to language groups and to automatically generate per-group datasets from the distance education platform FutureLearn. On FutureLearn, as on other distance education platforms, learners do not have to provide their country when registering. In addition, for some learners the country provided is where they currently live, which differs from their home country. In such cases, it is not possible to determine whether English is their first, official, or secondary language. Our study focused on using regex patterns to update learners' language group labels, with the aim of using them in future studies such as predicting learners' language groups. The data source is the datasets of the «Understanding Language: Learning and Teaching-4» course on the FutureLearn platform. To update the language groups, we mainly used features such as learners' comments, IDs, and country information. By analyzing users' comments, we identified the language group of 63.06% of all users who commented; the groups are: English is the official and primary language, English is official but not the primary language, and English is not an official language. Of these learners, 78.19% belong to the same language group as the country they provided during registration, while for 21.81% the language group identified from their comments differs from the one implied by their registered country. When only the country information provided at registration was used, fewer learners could be assigned to an English language group, and the assigned groups could be wrong.
KW - FutureLearn
KW - MOOC
KW - Regex
KW - identification of English language groups
KW - natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85059977237&partnerID=8YFLogxK
U2 - 10.1109/ASYU.2018.8554018
DO - 10.1109/ASYU.2018.8554018
M3 - Conference contribution
AN - SCOPUS:85059977237
T3 - Proceedings - 2018 Innovations in Intelligent Systems and Applications Conference, ASYU 2018
BT - Proceedings - 2018 Innovations in Intelligent Systems and Applications Conference, ASYU 2018
A2 - Yildirim, Tulay
A2 - Ozyildirim, Buse Melis
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 Innovations in Intelligent Systems and Applications Conference, ASYU 2018
Y2 - 4 October 2018 through 6 October 2018
ER -