A Xu, E Pathak, E Wallace, S Gururangan… - arXiv preprint arXiv …, 2021 - arxiv.org

Detoxifying language models risks marginalizing minority voices.

Minority STEM

Gururangan, Pathak, Wallace, Xu

GSID: dk1Do6tXnp8J

Excerpt

… We identify that these failures stem from detoxification methods exploiting spurious correlations in toxicity datasets. Overall, our results highlight the tension between the controllability …
