Arabic Fine-Tuning of LLMs Gives Related Languages No Edge; Task Alignment Explains the Gains

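A study of seven LLMs ranging from 4B to 671B parameters finds that fine-tuning on Arabic improves zero-shot reading comprehension in both Semitic and non-Semitic languages, which points to learning the task format rather than cross-lingual learning.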

cross lingual transfer, large language models, fine tuning, zero shot reading comprehension, chain of thought, mixed architectures

Across seven large language models spanning two architectures, fine-tuning on Arabic gives related Semitic languages no advantage over unrelated ones. The gains come from learning how to answer reading comprehension questions.

Why Linguistic Relatedness Doesn't Matter

The experiment is clean: fine-tune seven LLMs (from 4B to 671B parameters, covering both dense and Mixture-of-Experts architectures) on Arabic, then test zero-shot on Semitic languages like Hebrew and Amharic plus non-Semitic controls like Turkish and English. If linguistic relatedness mattered, Semitic languages should show bigger improvements. They don't.

Models that start with weak baseline scores improve dramatically across all languages, regardless of family. Models that already score well show only marginal gains, again uniform across languages. The pattern holds for every architecture tested. This is a strong signal that fine-tuning teaches task alignment (how to produce the answer format) rather than transferring knowledge about Arabic grammar or vocabulary to cognate languages.
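The family comparison described above can be sketched in a few lines of Python. This is only an illustration: the accuracy values, the `SCORES` table, and the function names are hypothetical placeholders, not numbers or code from the study. The idea is to compute each language's gain (fine-tuned accuracy minus baseline accuracy), average the gains per family, and check whether the Semitic average exceeds the non-Semitic one.

```python
from statistics import mean

# Hypothetical accuracies for illustration only (NOT results from the study).
# language -> (family, baseline accuracy, accuracy after Arabic fine-tuning)
SCORES = {
    "Hebrew": ("semitic", 0.40, 0.70),
    "Amharic": ("semitic", 0.30, 0.60),
    "Turkish": ("non_semitic", 0.40, 0.70),
    "English": ("non_semitic", 0.50, 0.80),
}


def gains_by_family(scores):
    """Return the mean accuracy gain (fine-tuned minus baseline) per family."""
    per_family = {}
    for family, baseline, tuned in scores.values():
        per_family.setdefault(family, []).append(tuned - baseline)
    return {family: mean(gains) for family, gains in per_family.items()}


def relatedness_advantage(scores):
    """Mean Semitic gain minus mean non-Semitic gain.

    A value near zero means related languages improved no more than
    unrelated ones, which is the pattern the article describes.
    """
    family_gains = gains_by_family(scores)
    return family_gains["semitic"] - family_gains["non_semitic"]


if __name__ == "__main__":
    print(gains_by_family(SCORES))
    print(relatedness_advantage(SCORES))
```

With real evaluation results in place of the placeholders, a clearly positive `relatedness_advantage` would be evidence for language-specific transfer, while a value near zero would favor the task-alignment explanation.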

What the Ablation Reveals

Chain-of-thought reasoning without any fine-tuning produces the same pattern. The models that benefit most from fine-tuning also benefit most from inference-time chain-of-thought, and the sizes of the two improvements are correlated across models. Both mechanisms address the same bottleneck: understanding the reading comprehension task format. Neither transfers language-specific knowledge.
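One way to check a relationship like this is a Pearson correlation between each model's gain from fine-tuning and its gain from chain-of-thought prompting. The sketch below assumes one gain value per model for each mechanism. The lists `FT_GAINS` and `COT_GAINS` are hypothetical placeholders, not data from the study, and the article does not say which correlation measure the authors used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-model accuracy gains for illustration only
# (NOT results from the study); one entry per model, same order in both lists.
FT_GAINS = [0.30, 0.20, 0.10, 0.02]   # gain from Arabic fine-tuning
COT_GAINS = [0.25, 0.18, 0.08, 0.01]  # gain from chain-of-thought prompting

if __name__ == "__main__":
    print(pearson(FT_GAINS, COT_GAINS))
```

A coefficient close to 1 would match the article's claim that models helped most by fine-tuning are also helped most by chain-of-thought. With only seven models, any such estimate would be a rough indication, not a precise measurement.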

This result challenges a core assumption in multilingual NLP. If you thought fine-tuning on a high-resource language like Arabic would bootstrap understanding of low-resource Semitic languages, your money is on the wrong mechanism. The models learn to better parse questions and locate answer spans, not to map Arabic lexicons onto Hebrew or Amharic.

Future work on cross-lingual transfer should focus on explicit knowledge injection or alignment across language families, because fine-tuning alone isn't doing what we thought it was.

Source: Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer
Domain: arxiv.org
