The study published in Nature examines how accuracy-based evaluation shapes the behavior of large language models and finds that it incentivizes hallucination. Because current metrics score only whether the model's output is correct, they inadvertently reward models for generating confident text that is not grounded in their training data. This has significant implications for building reliable AI systems, since hallucinated text undermines a system's trustworthiness and real-world accuracy. The authors argue for a reevaluation of these metrics and suggest that alternatives crediting the model's ability to produce coherent, relevant text may better promote the development of reliable AI systems.
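The incentive problem can be illustrated with a toy expected-score calculation. This is a minimal sketch, not from the paper: it assumes a hypothetical binary grader that awards 1 point for a correct answer and 0 points both for a wrong answer and for abstaining ("I don't know"), so a low-confidence guess always has higher expected value than abstaining.

```python
def expected_score(p_correct, abstain, wrong_penalty=0.0):
    """Expected score for a model that either guesses or abstains.

    p_correct: probability the model's guess is right.
    abstain: True if the model declines to answer.
    wrong_penalty: points subtracted for a wrong answer
                   (0.0 reproduces plain accuracy grading).
    """
    if abstain:
        return 0.0
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

# Under plain accuracy, guessing at even 10% confidence beats abstaining:
print(expected_score(0.1, abstain=False))  # positive, vs. 0.0 for abstaining

# Penalizing wrong answers flips the incentive for low-confidence guesses:
print(expected_score(0.1, abstain=False, wrong_penalty=1.0))  # negative
```

Under the hypothetical `wrong_penalty` scheme, abstaining dominates guessing whenever `p_correct` falls below the break-even point, which is one way an alternative metric could stop rewarding fabrication.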
Source: Evaluating large language models for accuracy incentivizes hallucinations