Comparing LLMs for Enterprise: Interpreting 0.7% vs 20.2% and What Really Matters
https://iris-wiki.win/index.php/Why_a_Dash_in_a_Benchmark_Table_Usually_Means_%E2%80%9CNo_Data,%E2%80%9D_Not_Zero
Vendor numbers are tempting: "0.7% hallucination on basic summarization" or "20.2% hallucination rate." Those figures matter, but not in the way many product decks imply.