Mandatory Fields

Authors

Alakrot, A; Murray, L; Nikolov, NS

Conference Title

ARABIC COMPUTATIONAL LINGUISTICS

Title of Paper

Towards Accurate Detection of Offensive Language in Online Communication in Arabic

Year

2018

Month

January

Status

Published

Peer Reviewed

Times Cited

25

Optional Fields

Start Page

315

End Page

320

Abstract

We present the results of predictive modelling for the detection of anti-social behaviour in online communication in Arabic, such as comments which contain obscene or offensive words and phrases. We collected and labelled a large dataset of YouTube comments in Arabic which contains a broad range of both offensive and inoffensive comments. We used this dataset to train a Support Vector Machine classifier and experimented with combinations of word-level features, N-gram features and a variety of pre-processing techniques. We summarise the pre-processing steps and features that allow training a classifier which is more precise, with 90.05% accuracy, than classifiers reported by previous studies on Arabic text. © 2018 The Authors. Published by Elsevier B.V.

DOI Link

10.1016/j.procs.2018.10.491
