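Quite a few recent papers claim to support ultra-long context, but their practical usability is poor. It is a bit like a human skimming ten lines at a glance: the LLM has read the text but missed the key information. For now, the practical sweet spot is probably still under 100k tokens.

Jim Fan: I'm calling the Myth of Context Length: Don't get too excited by claims of 1M or even 1B context tokens. You know what, LSTMs already achieve infinite context length 25 yrs ago! What truly matters is how well the model actually uses the context. It's easy to make seemingly wild…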
