Cross-Domain Toxic Spans Detection @ NLDB 2023

Abstract

Given the dynamic nature of toxic language use, automated methods for detecting toxic spans are likely to encounter distributional shift. To explore this phenomenon, we evaluate three approaches for detecting toxic spans under cross-domain conditions: lexicon-based, rationale extraction, and fine-tuned language models. Our findings indicate that a simple method using off-the-shelf lexicons performs best in the cross-domain setup. The cross-domain error analysis suggests that (1) rationale extraction methods are prone to false negatives, while (2) language models, despite performing best for the in-domain case, recall fewer explicitly toxic words than lexicons and are prone to certain types of false positives. Our code is publicly available at: https://github.com/sfschouten/toxic-cross-domain.
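To make the lexicon-based idea concrete, here is a minimal sketch of how such a detector might mark toxic spans as character offsets. It is not the implementation from the repository above: the three-word lexicon, the function names `lexicon_toxic_offsets` and `offset_f1`, and the choice of a character-offset F1 score are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon for illustration only; real off-the-shelf
# lexicons are far larger and are not reproduced here.
TOY_LEXICON = {"idiot", "stupid", "moron"}


def lexicon_toxic_offsets(text, lexicon=TOY_LEXICON):
    """Return the set of character offsets covered by words found in the lexicon."""
    offsets = set()
    for match in re.finditer(r"\w+", text):
        if match.group().lower() in lexicon:
            offsets.update(range(match.start(), match.end()))
    return offsets


def offset_f1(predicted, gold):
    """F1 between predicted and gold sets of character offsets (assumed metric)."""
    if not predicted and not gold:
        return 1.0  # nothing to find and nothing predicted
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


text = "You are a stupid idiot, honestly."
pred = lexicon_toxic_offsets(text)
```

Because the detector only looks words up, it has no training domain to overfit to, which is one plausible reading of why lexicons hold up under distributional shift. The same property means it cannot flag toxicity expressed without any listed word.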

Full paper

Stefan F. Schouten
