LLM Benchmarking
What does the MMLU benchmark name stand for?
Massive Multitask Language Understanding (the benchmark was introduced in the paper “Measuring Massive Multitask Language Understanding”).
LLM Benchmarking
What is Goodhart’s Law?
An adage, often stated as:
“When a measure becomes a target, it ceases to be a good measure.”
The original wording is reported to come from a 1975 article, in which Goodhart stated:
“Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”