10th International Congress on Information and Communication Technology, concurrent with the ICT Excellence Awards (ICICT 2025), will be held in London, United Kingdom | February 18 - 21, 2025.
Authors - Varsha Naik, Rajeswari K, Kshitij Jadhav, Aniket Rahalkar
Abstract - This study examines cross-lingual natural language processing (NLP) techniques to address the challenges of developing conversational AI systems for low-resource languages. These languages often lack extensive linguistic resources such as large-scale corpora, annotated datasets, and language-specific tools, making it difficult to capture the linguistic distinctions and contextual meaning essential for high-quality dialogue systems. This language gap restricts accessibility and inclusivity, preventing speakers of these underrepresented languages from fully benefiting from advancements in technology. The study compares various factors that affect model performance, including transformer model architecture, cross-lingual embeddings, fine-tuning strategies, and transfer learning approaches. Despite these challenges, the research shows that cross-lingual models offer promising solutions, especially when utilizing techniques like transfer learning and multilingual pre-training. By transferring knowledge from high-resource languages, these models can compensate for the scarcity of data in low-resource languages, enabling the development of more accurate, culturally sensitive, and inclusive AI systems. The findings highlight the importance of bridging linguistic divides to foster greater language diversity, accessibility, and technological inclusivity, ultimately supporting cultural preservation and revitalization.