Detection of Misogyny in Hindi Code-Mixed Texts by BiGRU with Bahdanau Attention using ByT5 Embeddings

Published online: Jun 1, 2026
DOI: https://doi.org/10.24138/jcomss-2025-0240
Authors:
S. Karishma, V. Akila

Abstract

Social media platforms have become hubs for sharing messages and reacting to current events, but they also cause harm by reinforcing negative stereotypes, spreading false information, and, in some cases, enabling misogyny. Detecting misogynistic language on social media is especially challenging in code-mixed text because of demographic variation, transliteration, and noise. We propose a hybrid misogyny classification model that combines byte-level ByT5 encoder embeddings with a Bidirectional Gated Recurrent Unit (BiGRU) augmented by the Bahdanau attention mechanism. ByT5 produces robust, subword-agnostic representations that reduce sensitivity to spelling variations and code switching. The BiGRU captures contextual sequential patterns and bidirectional dependencies, while the attention mechanism emphasizes the tokens most indicative of abusive intent. The model is evaluated on a Hindi-English code-mixed dataset of misogynistic comments. The results show that the proposed hybrid model outperforms recurrent neural networks with static and dynamic embeddings and produces more stable misogyny predictions on low-resource and noisy texts.
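To make two of the ideas above concrete, the following is a minimal, dependency-free sketch rather than the authors' implementation. The function names `byte_ids` and `additive_attention`, the id offset of 3 for special tokens, and the query-free form of additive attention (a variant often used for pooling in classifiers) are all assumptions made for illustration. In the pipeline the abstract describes, the hidden states would come from a BiGRU running over ByT5 encoder outputs; here they are plain lists of floats.

```python
import math


def byte_ids(text, offset=3):
    """Map text to byte-level ids in the style of ByT5.

    Each UTF-8 byte is shifted by `offset` to leave room for special
    tokens (the offset of 3 is an assumption for this sketch).
    """
    return [b + offset for b in text.encode("utf-8")]


def additive_attention(states, W, v):
    """Bahdanau-style (additive) attention pooling, query-free variant.

    states: list of T hidden vectors of length d (e.g. BiGRU outputs)
    W:      k x d projection matrix (list of rows)
    v:      scoring vector of length k
    Returns (weights, context): T attention weights that sum to 1 and
    the weighted sum of the hidden states.
    """
    scores = []
    for h in states:
        # score_t = v . tanh(W h_t)
        proj = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]
        scores.append(sum(a * p for a, p in zip(v, proj)))

    # Numerically stable softmax over time steps.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the hidden states.
    dim = len(states[0])
    context = [sum(a * h[j] for a, h in zip(weights, states)) for j in range(dim)]
    return weights, context


# Toy usage: three hypothetical hidden states of dimension 2.
states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
W = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]
weights, context = additive_attention(states, W, v)
```

Because the input is encoded byte by byte, a Romanized word and a respelled variant of it share most of their ids, and no out-of-vocabulary token can occur. In the toy usage, the second state receives the largest attention weight, so it dominates the context vector that a classifier head would consume.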

Keywords

Bahdanau Attention Mechanism, Bidirectional Gated Recurrent Unit, ByT5 Embedding, Code-Mixed, Misogyny, Noisy Text
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.