A genomics-based profanity-safe Web forum

Author

Christian Mogollón Pinzón;Sergio Rojas-Galeano

Author_Institution

School of Engineering, Universidad Distrital, Bogotá

fYear

2015

Firstpage

425

Lastpage

430

Abstract

User-generated text is the primary means of interaction in virtual communities on Web 2.0 applications such as forums, blogs and social networks. Unfortunately, some users abuse this freedom to disseminate unauthorised profane content (foul language, insults, advertising, or the promotion or denigration of a name or trademark). Naïve filters based on literal comparison against blacklists of forbidden terms fail to detect variants obtained by character transliteration or masking (e.g. writing piss as P!55 or p.i.s.s). Recent approaches to this problem, inspired by sequence alignment methods from comparative genomics in bioinformatics, have shown promise in catching such variants. Building upon those results, we have developed an experimental Web forum in which user-generated text is screened for transliterated profanity. In this paper we introduce the software (ForumForte) and briefly describe the technique and engineering behind it. We anticipate that tools of this kind might prove beneficial for content moderation in mainstream applications such as newspaper forums and micro-blogging social networking sites. Our software is open-source under the New BSD License and is available at: http://tinyurl.com/ForumForte.
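
To illustrate why an alignment-based comparison can catch variants that a literal blacklist match misses, the following is a minimal sketch of a Smith-Waterman style local alignment between a forbidden term and a candidate string. It is not ForumForte's actual algorithm: the look-alike table, the scoring values, the threshold and all function names are assumptions made for illustration only.

```python
# Hypothetical look-alike table (assumption, not taken from ForumForte).
LOOKALIKES = {"!": "i", "1": "i", "5": "s", "$": "s",
              "0": "o", "3": "e", "4": "a", "@": "a"}

MATCH, MISMATCH, GAP = 2, -2, -1  # illustrative scoring values


def char_score(a, b):
    """Score one aligned pair: identical or look-alike characters match."""
    a, b = a.lower(), b.lower()
    if a == b or LOOKALIKES.get(a) == b or LOOKALIKES.get(b) == a:
        return MATCH
    return MISMATCH


def local_alignment_score(term, text):
    """Best local alignment score between term and text (Smith-Waterman)."""
    prev = [0] * (len(text) + 1)
    best = 0
    for i in range(1, len(term) + 1):
        curr = [0] * (len(text) + 1)
        for j in range(1, len(text) + 1):
            curr[j] = max(
                0,                                                  # restart
                prev[j - 1] + char_score(term[i - 1], text[j - 1]),  # pair
                prev[j] + GAP,                                      # skip term char
                curr[j - 1] + GAP,                                  # skip text char
            )
            best = max(best, curr[j])
        prev = curr
    return best


def is_flagged(term, text, threshold=0.6):
    """Flag text whose score reaches a fraction of the perfect-match score."""
    return local_alignment_score(term, text) >= threshold * MATCH * len(term)


if __name__ == "__main__":
    for candidate in ["P!55", "p.i.s.s", "pass", "hello world"]:
        print(candidate, is_flagged("piss", candidate))
```

With these assumed values, P!55 aligns as a perfect match and p.i.s.s loses only one point per masking dot, so both are flagged, while pass and hello world fall below the threshold. The threshold and penalties would need tuning on real data to balance missed variants against false positives.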

Keywords

"Software","Genomics","Bioinformatics","Blogs","Vegetation","Servers","Organisms"

Publisher

IEEE

Conference_Titel

Computing Colombian Conference (10CCC), 2015 10th

Type

conf

DOI

10.1109/ColumbianCC.2015.7333455

Filename

7333455