Language Model Adaptation
ADAPTATION TECHNIQUES
MODEL INTERPOLATION
• Also known as Mixture Models.
• This method combines two or more models: one trained on a large, general dataset
(background model) and another trained on a smaller, specific dataset (in-domain model).
Example:
• A model trained on general news might predict the next word in "The doctor prescribed" with
probabilities that favor common words.
• A specific model trained on medical texts would favor words like "medication" or "treatment."
• The adapted model interpolates the two probability estimates to make more accurate
predictions in medical contexts, as sketched below.
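A minimal Python sketch of this interpolation; the probability tables, the example words, and the weight lam are toy values chosen for illustration, not figures from the slides.

```python
# Linear interpolation (mixture) of a background and an in-domain model.
# All probabilities below are invented toy values.

background = {"medication": 0.01, "a": 0.20, "the": 0.25}   # general-news model
in_domain = {"medication": 0.15, "a": 0.05, "the": 0.10}    # medical-text model

def p_adapted(word, lam=0.4):
    """P_adapted(w) = lam * P_in_domain(w) + (1 - lam) * P_background(w)."""
    return lam * in_domain.get(word, 0.0) + (1 - lam) * background.get(word, 0.0)

# After "The doctor prescribed", the mixture boosts "medication":
print(p_adapted("medication"))   # 0.066, versus 0.01 under the background model alone
```

The interpolation weight lam is typically tuned on held-out in-domain text.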
Example (choosing a model by document topic):
• For a document about technology, use a model trained on tech articles.
• For a document about health, use a model trained on health articles.
• This technique adapts the model dynamically based on the words that have already
been seen in the text.
Example:
In a financial document, seeing the word "stock" increases the likelihood of subsequently
seeing "market," "price," or "exchange."
UNSUPERVISED ADAPTATION
• This method uses the output of a language model or speech recognizer as additional
data for adaptation, even if the output isn't perfect.
Example:
• Even if the transcript has errors, it can still provide useful information to adapt the model
to the speaker's style or vocabulary.
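A minimal sketch of this idea: relative frequencies from a possibly errorful recognizer transcript are mixed back into the base model. The transcript and probabilities are invented for illustration.

```python
# Unsupervised adaptation sketch: counts from an imperfect ASR hypothesis are
# turned into a small adaptation model and interpolated with the base model.

from collections import Counter

base_p = {"prescribed": 0.001, "medication": 0.001, "the": 0.05}  # toy base model

hypothesis = "the doctor prescribed the medication twice a day"   # ASR output, may contain errors
counts = Counter(hypothesis.split())
total = sum(counts.values())

def p_adapted(word, lam=0.3):
    """Interpolate the base model with relative frequencies from the hypothesis."""
    return lam * counts[word] / total + (1 - lam) * base_p.get(word, 0.0)

print(p_adapted("medication"))   # boosted toward the speaker's observed vocabulary
```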
Example:
• If you're adapting a model to understand new slang or jargon, you can collect recent
tweets or forum posts that use that slang and update your model with them.
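A minimal sketch of that workflow: counts from freshly collected posts are added to the model's vocabulary counts so the new jargon receives probability mass. The posts and counts are made-up examples.

```python
# Updating a count-based model with collected in-domain text (toy example).

from collections import Counter

vocab_counts = Counter({"good": 50, "bad": 30})   # existing model counts

collected_posts = [
    "that rollout was mid",        # hypothetical posts using new slang
    "the update is mid at best",
]

for post in collected_posts:
    vocab_counts.update(post.split())

total = sum(vocab_counts.values())
print(vocab_counts["mid"] / total)   # the new slang word now has nonzero probability
```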