Difference between revisions of "Publications/guillaume.22.egc"
From LRDE
Latest revision as of 19:07, 7 April 2023
- Authors
- Pierre Guillaume, Corentin Duchene, Réda Dehak
- Where
- Workshop EGC 2022 DL for NLP
- Type
- inproceedings
- Keywords
- IA
- Date
- 2022-01-12
Abstract
Hate speech and toxic comment detection on social media has proven to be an essential issue for content moderation. This paper presents a comparison of different Transformer models for hate speech detection, such as HateBERT (a BERT-based model), RoBERTa, and BERTweet (a RoBERTa-based model). These Transformer models are tested on the Jibes&Delight 2021 Reddit dataset under the same training and testing conditions. Multiple approaches involving feature extraction and data augmentation are detailed in this paper. The paper concludes that our RoBERTa st4-aug model trained with data augmentation outperforms the simple RoBERTa and HateBERT models.
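The paper's exact augmentation scheme is not described on this page; purely as an illustration of the data-augmentation idea the abstract mentions, here is a minimal token-dropout augmenter in Python. All names here (`augment_dropout`, `augment_dataset`, the dropout probability) are hypothetical and not taken from the paper.

```python
import random

def augment_dropout(text, p=0.1, seed=0):
    """Randomly drop each whitespace token with probability p,
    always keeping at least one token."""
    rng = random.Random(seed)
    tokens = text.split()
    kept = [t for t in tokens if rng.random() >= p]
    return " ".join(kept) if kept else tokens[0]

def augment_dataset(samples, n_copies=2):
    """Return the original (text, label) pairs plus n_copies perturbed
    variants of each, so a classifier sees several noisy views of
    every comment with its label preserved."""
    out = []
    for i, (text, label) in enumerate(samples):
        out.append((text, label))
        for k in range(n_copies):
            out.append((augment_dropout(text, p=0.15, seed=i * 31 + k), label))
    return out

samples = [("you are a wonderful person", 0), ("some toxic insult here", 1)]
augmented = augment_dataset(samples)
print(len(augmented))  # 6: each original sample plus two noisy copies
```

In practice such perturbed copies would be fed to the Transformer fine-tuning loop alongside the originals; this sketch only shows the label-preserving expansion step.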
Bibtex (lrde.bib)
@InProceedings{guillaume.22.egc,
  author    = {Pierre Guillaume and Corentin Duchene and Reda Dehak},
  title     = {Hate Speech and Toxic Comment Detection using Transformers},
  booktitle = {Workshop EGC 2022 DL for NLP},
  month     = jan,
  year      = {2022},
  abstract  = {Hate speech and toxic comment detection on social media has
               proven to be an essential issue for content moderation. This
               paper displays a comparison between different Transformer
               models for Hate Speech detection such as Hate BERT, a
               BERT-based model, RoBERTa and BERTweet which is a RoBERTa
               based model. These Transformer models are tested on
               Jibes&Delight 2021 reddit dataset using the same training and
               testing conditions. Multiple approaches are detailed in this
               paper considering feature extraction and data augmentation.
               The paper concludes that our RoBERTa st4-aug model trained
               with data augmentation outperforms simple RoBERTa and
               HateBERT models.},
  category  = {national},
  note      = {accepted},
  nodoi     = {}
}