
Neural networks for language model smoothing

Smoothing is critical for a language model to perform accurately in the presence of data sparseness. In this report, we investigate whether a neural network can smooth a language model, and whether it can address the over-estimation of events that sometimes occurs with current methods. Background material on current smoothing techniques, such as Good-Turing smoothing, is discussed. For comparison, we develop a baseline language model with Good-Turing smoothing. We then develop two neural networks: the first predicts a discounted probability for a trigram and is trained on the Good-Turing estimates; the second uses an alternative training approach to calculate the estimates for a trigram directly. Both neural networks achieve the same perplexity as the baseline model, and through further analysis we show how the second neural network can prevent over-estimation.
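As background, the following is a minimal sketch (not the repository's own code) of the Good-Turing re-estimation the baseline relies on: each raw count r is replaced by r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct trigrams observed exactly r times. The function name and the toy counts are illustrative assumptions.

```python
from collections import Counter

def good_turing_discounts(trigram_counts):
    """Re-estimate counts via Good-Turing: r* = (r + 1) * N_{r+1} / N_r,
    where N_r is the number of distinct trigrams seen exactly r times.
    Dividing r* by the total token count would give the smoothed probability."""
    freq_of_freq = Counter(trigram_counts.values())  # N_r for each count r
    adjusted = {}
    for trigram, r in trigram_counts.items():
        n_r = freq_of_freq[r]
        n_r1 = freq_of_freq.get(r + 1, 0)
        # Fall back to the raw count when N_{r+1} is zero (sparse high counts).
        adjusted[trigram] = (r + 1) * n_r1 / n_r if n_r1 else r
    return adjusted

# Toy example: two distinct trigrams, one seen twice, one seen once.
counts = Counter([("the", "cat", "sat"),
                  ("the", "cat", "ran"),
                  ("the", "cat", "sat")])
print(good_turing_discounts(counts))
```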

This repository includes:

Self-written code for:
- the baseline trigram language model with Good-Turing smoothing
- the neural network trained on the Good-Turing estimates to predict discounted trigram probabilities
- the neural network trained to calculate the trigram estimates directly

Other files included are examples of basic interactions with neural networks and language modelling.
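As an illustration of the first approach described above, here is a minimal, hypothetical sketch of a network trained to regress Good-Turing discounted probabilities for trigrams. The architecture, layer sizes, and training details are assumptions for illustration and do not reflect the report's actual model.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; the report does not specify the architecture.
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 10_000, 32, 64

class TrigramDiscountNet(nn.Module):
    """Maps a trigram (three word ids) to a discounted probability in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.mlp = nn.Sequential(
            nn.Linear(3 * EMBED_DIM, HIDDEN_DIM),
            nn.ReLU(),
            nn.Linear(HIDDEN_DIM, 1),
            nn.Sigmoid(),  # keep the output a valid probability
        )

    def forward(self, trigrams):             # trigrams: (batch, 3) word ids
        e = self.embed(trigrams).flatten(1)  # (batch, 3 * EMBED_DIM)
        return self.mlp(e).squeeze(1)        # (batch,)

model = TrigramDiscountNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Toy batch: random word ids with made-up Good-Turing target probabilities.
trigrams = torch.randint(0, VOCAB_SIZE, (8, 3))
targets = torch.rand(8)

for _ in range(100):                         # minimal training loop
    optimizer.zero_grad()
    loss = loss_fn(model(trigrams), targets)
    loss.backward()
    optimizer.step()
```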