The best Chinese evaluation benchmark to date
Yao Fu: I know you've always wondered: through the fog, how far have Chinese LLMs come?
This is why we created the C-Eval benchmark. Inspired by MMLU, it is an evaluation suite of 52 subjects that tests LLMs' Chinese capabilities.