Language model = a probability distribution over sequences of words (traditionally estimated with n-gram counts)
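In symbols, this is the standard chain-rule factorization plus the n-gram truncation (the notation here is mine, not from these notes):

```latex
P(w_1, \dots, w_T) = \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1})
\;\approx\; \prod_{t=1}^{T} P(w_t \mid w_{t-n+1}, \dots, w_{t-1}) \quad \text{(n-gram approximation)}
```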
This model trains a CNN + RNN to build a language model on top of character-level embeddings
Advantages of character-level embedding:
No need for morphological tagging
Can embed morphological properties of the input language
Free from out-of-vocabulary issues, since any word decomposes into known characters (see the sketch after this list)
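A minimal sketch of why character-level input sidesteps OOV; the vocabularies and the word below are hypothetical, not from the paper:

```python
# A closed word vocabulary fails on unseen words; a character
# vocabulary covers any word spelled with known characters.
word_vocab = {"the": 0, "cat": 1, "sat": 2}   # hypothetical word vocab
char_vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}

word = "cats"                                 # unseen at the word level
print(word_vocab.get(word, "<UNK>"))          # -> <UNK>  (out of vocabulary)
print([char_vocab[c] for c in word])          # -> [2, 0, 19, 18]  valid char ids
```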
Architecture (a minimal end-to-end sketch follows this list):
The embedding vectors of a word's characters are concatenated to form a matrix representation of the word
Convolution with multiple filters of varying widths, which essentially pick out character n-grams
Max-pooling over time to create a fixed-dimensional word vector
Apply a Highway Network to the word vector
Instead of a full non-linear transformation, it is a weighted mix of a non-linear transformation and the identity: z = t ⊙ g(W_H·y + b_H) + (1 − t) ⊙ y
Using an additional affine transformation + non-linearity as the gate t (with 1 − t acting as the "carry gate"), the network adaptively carries meaningful information from the input vector directly to the output vector.
LSTM to predict the next word from the current word's vector (trained with cross-entropy loss)
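A minimal PyTorch sketch of the pipeline above (character embeddings → multi-width CNN → max-pooling → highway layer → LSTM → softmax over words). All names, dimensions, and hyperparameters are illustrative assumptions, not the paper's exact settings:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CharWordLM(nn.Module):
    def __init__(self, n_chars, n_words, char_dim=15,
                 filter_widths=(2, 3, 4), n_filters=50, hidden=300):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # One 1-D convolution per filter width; each picks out character n-grams
        self.convs = nn.ModuleList(
            nn.Conv1d(char_dim, n_filters, w) for w in filter_widths)
        word_dim = n_filters * len(filter_widths)
        # Highway layer: transform and gate both read the pooled word vector
        self.transform = nn.Linear(word_dim, word_dim)
        self.gate = nn.Linear(word_dim, word_dim)
        self.lstm = nn.LSTM(word_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_words)

    def word_vectors(self, char_ids):
        # char_ids: (batch, seq_len, max_word_len) character indices per word
        b, s, l = char_ids.shape
        x = self.char_emb(char_ids.reshape(b * s, l))     # (b*s, l, char_dim)
        x = x.transpose(1, 2)                             # (b*s, char_dim, l)
        # Convolve and max-pool over time -> fixed-size vector per word
        pooled = [torch.tanh(conv(x)).max(dim=2).values for conv in self.convs]
        y = torch.cat(pooled, dim=1)                      # (b*s, word_dim)
        # Highway: z = t * g(W_H y + b_H) + (1 - t) * y
        t = torch.sigmoid(self.gate(y))
        z = t * torch.relu(self.transform(y)) + (1 - t) * y
        return z.reshape(b, s, -1)

    def forward(self, char_ids):
        h, _ = self.lstm(self.word_vectors(char_ids))
        return self.out(h)                                # logits over next word

# One training step: predict word t+1 from words <= t, cross-entropy loss
model = CharWordLM(n_chars=60, n_words=10000)
char_ids = torch.randint(0, 60, (8, 20, 12))              # dummy batch
targets = torch.randint(0, 10000, (8, 20))                # dummy next-word ids
logits = model(char_ids)
loss = F.cross_entropy(logits.reshape(-1, 10000), targets.reshape(-1))
loss.backward()
```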
While word-embedding vectors (word2vec-style) capture contextual information from a word's surrounding words, this model relies on the context of each character within the word to generate word vectors.
In a way, it creates a more isolated/localized embedding of a word, one that captures the word's internal structure and morphological properties (representing words in a different embedding space)
No wonder combining this localized embedding method with a traditional word2vec-style word embedding method improves performance on various NLP tasks (sketch below)
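One simple way to realize this combination is to concatenate the two vectors per word; the tensors below are placeholders, not a specific published setup:

```python
import torch

word2vec_vec = torch.randn(300)   # placeholder: pretrained word2vec embedding
char_cnn_vec = torch.randn(150)   # placeholder: vector from the char CNN above
combined = torch.cat([word2vec_vec, char_cnn_vec])  # (450,) fed to the downstream model
```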