FUTO Swipe: Swipe Typing Model Architecture
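FUTO Swipe uses three models. The Encoder model is universal and independent of layout and language; it handles general swipe typing prediction but does not offer top accuracy. The ContextLM model is a small single-language language model that uses context to eliminate nonsensical words and needs only text data for training. The Decoder model learns the peculiarities of a specific language and layout to reach top accuracy; only a QWERTY English decoder exists so far. With all three combined and a beam width of 300, the top-4 fail rate on the test set is about 4%, and the error rate is below 1% when out-of-vocabulary cases are ignored.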
Our architecture includes three model types.
The Encoder model is a universal, layout-agnostic and language-agnostic model used for making swipe typing predictions in the general case. However, it does not offer cutting-edge accuracy.
The ContextLM model is a very small language model that is trained for a single language. It's used to improve the quality of predictions by eliminating nonsensical words given the preceding words in the sentence. It only requires text data for training.
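To illustrate how a context language model can push nonsensical words down the ranking, here is a minimal sketch. It is not the FUTO Swipe implementation: the function `rescore`, the `weight` parameter, the log-linear score combination, and all numbers are assumptions made up for this example.

```python
import math

# Hypothetical rescoring step: combine a swipe model's score for each
# candidate word with a context language model's score for that word.
# The log-linear mix and the default weight are illustrative assumptions.
def rescore(candidates, context_logprob, weight=0.5):
    """Rank candidates by swipe log-probability plus weighted context log-probability.

    candidates: list of (word, swipe_logprob) pairs.
    context_logprob: dict mapping word -> log P(word | preceding words).
    """
    floor = math.log(1e-6)  # fallback score for words the context model has not scored
    scored = [
        (word, swipe_lp + weight * context_logprob.get(word, floor))
        for word, swipe_lp in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up scores: the swipe path slightly favors "jello",
# but the preceding words make "hello" far more plausible.
candidates = [("hello", -1.2), ("jello", -1.0), ("hells", -2.5)]
context = {
    "hello": math.log(0.2),
    "jello": math.log(0.001),
    "hells": math.log(0.0005),
}
ranked = rescore(candidates, context)
```

In this toy input the swipe score alone would rank "jello" first; after the context score is mixed in, "hello" moves to the top.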
Finally, the Decoder model is a language-specific and layout-specific model that learns the layout's peculiarities and achieves leading accuracy. Because training it requires swipe typing data for a specific layout and language, we only have a QWERTY English decoder for now.
With all 3 models and with a beam width of 300, we achieve a top-4 fail rate of only ~4% on our test set. Ignoring out-of-vocabulary cases, the error rate is below 1%.
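For readers unfamiliar with the metric, the sketch below shows one common way to compute a top-k fail rate, with and without out-of-vocabulary targets. It assumes a sample counts as a failure when the intended word is absent from the first k predictions; the function name `top_k_fail_rate` and the sample data are hypothetical and are not taken from the project's evaluation code.

```python
# Hypothetical metric: a sample fails when the intended word is not among
# the first k ranked predictions. Passing a vocabulary drops samples whose
# intended word is out of vocabulary before the rate is computed.
def top_k_fail_rate(samples, k=4, vocabulary=None):
    """samples: list of (target_word, ranked_predictions) pairs."""
    if vocabulary is not None:
        samples = [(t, preds) for t, preds in samples if t in vocabulary]
    if not samples:
        return 0.0
    failures = sum(1 for target, preds in samples if target not in preds[:k])
    return failures / len(samples)


# Made-up evaluation data, for illustration only.
samples = [
    ("hello", ["hello", "jello", "hells", "help"]),
    ("world", ["would", "word", "world", "wold"]),
    ("futo", ["fury", "fit", "foot", "fruit"]),  # target is out of vocabulary
    ("typing", ["tying", "tipping", "toying", "typo", "typing"]),  # target ranked 5th
]
vocab = {"hello", "world", "typing"}

overall = top_k_fail_rate(samples)                    # counts all four samples
in_vocab = top_k_fail_rate(samples, vocabulary=vocab)  # skips the "futo" sample
```

Excluding out-of-vocabulary targets removes the failures the models could never have avoided, which is why the second figure is lower than the first.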