Automatic Text Correction for Chatbots

Authors:

Vasileios Palassopoulos, Ion Androutsopoulos, Themos Stafylakis

Collaborators:

Department of Informatics, Athens University of Economics and Business

Publication Date:

January 1, 2020

This thesis addresses an important open problem in Machine Learning: the automatic correction of the unintentional errors that humans make when communicating with chatbots through written messages. First, the problem is formulated as a “noisy-channel model” problem, and the required algorithms are developed using both n-gram and Transformer-based language models. Next, a complete software framework for solving the problem with Machine Learning methods is developed using Python and C++ libraries, some of which are partially modified, yielding a 20-fold increase in processing speed for this specific problem. Finally, the framework is used to run Machine Learning experiments on the publicly available “WikEd” and “W&I” corpora. Although only a simple personal computer and limited cloud computing were used, and the publicly available corpora are not entirely appropriate for training, tuning, and testing, certain interesting results are obtained regarding the relative efficiency of the various available language processing methods. If appropriate corpora become available in the future and sufficient computing resources are used, the framework is expected to provide acceptably efficient methods for automatic text correction for chatbots.
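To make the noisy-channel formulation concrete, the following minimal Python sketch picks, for an observed word x, the candidate w that maximizes log P(w) + log P(x | w). It is an illustration under simplifying assumptions and not the framework developed in the thesis: the prior P(w) is an add-one-smoothed unigram model rather than an n-gram or Transformer-based language model, the channel term is approximated by a fixed penalty per Levenshtein edit, and the names `train_unigram`, `correct`, `edit_penalty`, and `max_dist` are hypothetical.

```python
import math
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def train_unigram(tokens):
    """Return log P(w) for each seen word, with add-one smoothing.

    Assumption: a unigram prior stands in for the language model.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab_size = len(counts)
    return {w: math.log((c + 1) / (total + vocab_size))
            for w, c in counts.items()}


def correct(observed, log_prior, edit_penalty=3.0, max_dist=2):
    """Return argmax over w of log P(w) + log P(observed | w).

    Assumption: log P(observed | w) is modeled as
    -edit_penalty * edit_distance(observed, w), and candidates farther
    than max_dist edits are ignored. If no candidate qualifies, the
    observed word is returned unchanged.
    """
    best, best_score = observed, -math.inf
    for cand, lp in log_prior.items():
        dist = edit_distance(observed, cand)
        if dist > max_dist:
            continue
        score = lp - edit_penalty * dist
        if score > best_score:
            best, best_score = cand, score
    return best


if __name__ == "__main__":
    # Toy corpus, invented for this illustration only.
    corpus = ("the chatbot answers the question and "
              "the user thanks the chatbot").split()
    prior = train_unigram(corpus)
    for word in ["chatbt", "the", "xyzzy"]:
        print(word, "->", correct(word, prior))
```

In this sketch the penalty per edit controls how strongly the corrector trusts the typed text over the language model: a larger `edit_penalty` keeps more words unchanged, while a smaller one lets frequent words override rare but correctly typed ones.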