“Low-Resource” Text Classification: A Parameter-Free Classification Method with Compressors

Deep neural networks (DNNs) are often used for text classification due to their high accuracy. However, DNNs can be computationally intensive, requiring millions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize, and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that’s easy, lightweight, and universal in text classification: a combination of a simple compressor like gzip with a k-nearest-neighbor classifier. Without any training parameters, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also excels in the few-shot setting, where labeled data are too scarce to train DNNs effectively.

> Holy shit this paper, and the insight behind it.
>
> You know how every receiver is also a transmitter? Well: every text predictor is also a text compressor, and vice versa.
>
> You can outperform massive neural networks running millions of parameters with a few lines of Python and a novel application of gzip.
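The abstract only names the ingredients (gzip plus a k-nearest-neighbor classifier), so the sketch below fills in the rest under common assumptions: the gzip-compressed length of a string stands in for its information content, the distance between two documents is the normalized compression distance, and the label is decided by majority vote among the k nearest training examples. Function names and the toy data are illustrative, not taken from the paper.

```python
import gzip
import numpy as np

def ncd(x: str, y: str) -> float:
    """Normalized compression distance: gzip-compressed length
    approximates the information content of each string and of
    their concatenation."""
    cx = len(gzip.compress(x.encode()))
    cy = len(gzip.compress(y.encode()))
    cxy = len(gzip.compress((x + " " + y).encode()))
    return (cxy - min(cx, cy)) / max(cx, cy)

def classify(test_text: str, train_texts: list[str],
             train_labels: list[str], k: int = 3) -> str:
    """Predict a label for test_text by majority vote among its
    k nearest training documents under the compression distance."""
    distances = [ncd(test_text, t) for t in train_texts]
    nearest = np.argsort(distances)[:k]
    top_labels = [train_labels[i] for i in nearest]
    return max(set(top_labels), key=top_labels.count)

# Hypothetical two-class toy example, just to show the call pattern.
train_texts = ["the team won the championship game",
               "stocks fell sharply on inflation fears"]
train_labels = ["sports", "finance"]
print(classify("the striker scored in the final match",
               train_texts, train_labels, k=1))
```

There are no learned parameters anywhere: all the "knowledge" lives in the training texts themselves and in how well gzip exploits their redundancy when a test document shares vocabulary and phrasing with them.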