The increasing demand for cross-lingual communication underscores the importance of speech-to-speech (S2S) translation systems, particularly in multilingual countries like India. This study develops a structured Hindi-to-English S2S model that achieves a mean opinion score (MOS) of 3.8, helping to bridge language gaps for effective communication. The results can be found here.
This study focuses on detecting hate speech in code-mixed Telugu-English content, a critical task for social media platforms striving to maintain a safe and inclusive online space. Telugu is a language with limited NLP resources, and the scarcity of data makes this task challenging. To address this, we have built a corpus of 4,500 code-mixed hate speech comments collected from YouTube. We detail the corpus creation process and report the results of hate speech detection models trained on this dataset.