A Comparative Analysis of Large Language Models for English-to-Tamil Machine Translation with Performance Evaluation


SyedKhaleel Jageer, Priyaradhikadevi T., Madhan K., Prasanna S.

Abstract

Machine translation is essential for cross-linguistic communication, especially for low-resource languages such as Tamil, whose complex morphology, syntax, and cultural nuances make translation difficult. This study compares three advanced Large Language Models (LLMs), namely Claude, ChatGPT, and Gemini, on English-to-Tamil translation, focusing on culturally sensitive, political, technical, and idiomatic content. For Claude, prompt tuning, batch processing, and temperature control improve coherence, making it effective for document-level translation. ChatGPT, which uses a decoder-only architecture and Reinforcement Learning from Human Feedback (RLHF), aligns its translations with human cultural and contextual expectations. Gemini's transformer-based framework integrates advanced tokenization and attention mechanisms that preserve tense, gender, and politeness markers in Tamil, and it excels at idiomatic and technical translation. The study used a diverse corpus that included poetry, technical documentation, and conversational text. Translation performance was evaluated using BLEU, BERT-based metrics, METEOR, and Translation Edit Rate (TER). The results show that Gemini ranks highest in accuracy and precision, especially for complex syntax and idioms. Claude produces fluent output but has a relatively high TER, which indicates that its translations need more edits to match the reference. ChatGPT scores well on semantic alignment, with high BERT-based and METEOR scores. This analysis highlights the capabilities and limitations of each model for Tamil translation and offers insight into LLM performance on low-resource languages.
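
To make the TER figures easier to interpret, the following is a minimal sketch of an edit-rate computation. It is an illustration only and not the evaluation code used in the study: it counts word-level insertions, deletions, and substitutions and divides by the reference length, whereas full TER also counts block shifts. The function names and the Tamil sentences are hypothetical examples, not items from the study's corpus.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insert, delete, substitute)."""
    # prev[j] holds the distance between the hypothesis prefix seen so far
    # and the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis, reference):
    """Edits needed per reference word; lower is better.

    Assumption: whitespace tokenization and no shift operations,
    so this is a simplification of the full TER metric.
    """
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


# Hypothetical example: the hypothesis differs from the reference in one word.
reference = "நான் பள்ளிக்கு செல்கிறேன்"
hypothesis = "நான் பள்ளி செல்கிறேன்"
score = simplified_ter(hypothesis, reference)
print(f"simplified TER: {score:.3f}")
```

Here one substitution against a three-word reference gives a score of about 0.333. Because Tamil is agglutinative, a single wrong suffix changes a whole whitespace-delimited token, which is one reason word-level edit rates can run high for otherwise fluent output.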
