Week 2, Dialogue Systems
Content
- Great homework submissions! Thanks!
- Simon’s solutions
- Vojta’s solutions
- Python & nice code style are common practice!
- Deadline: Tue 7 AM, so I can review the solutions before the next class
- Why unit tests?
- Kaldi & Slurm experience
- Phonetic examples walk-through
- Submission formats: github.com or compressed folder
- Barge-in
- Examples
- How to detect it? At the audio/word/semantic/pragmatic level?
- Voice Activity Detection (VAD) vs. wake words (a toy VAD sketch follows this list)
- Speaker diarization
- Adaptive echo cancellation
- End-pointing and hesitations
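To make the VAD bullet concrete, here is a minimal energy-threshold sketch. It assumes 16 kHz mono audio as a 1-D NumPy float array, and the frame length and threshold values are arbitrary assumptions; real systems use trained classifiers rather than a fixed threshold.

```python
import numpy as np

def frame_energies(samples, frame_len=400):
    """Split a 1-D float signal into frames of frame_len samples
    (25 ms at an assumed 16 kHz) and return each frame's mean energy."""
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    return (frames ** 2).mean(axis=1)

def energy_vad(samples, threshold=0.01):
    """Toy VAD: a frame counts as speech if its energy exceeds the
    threshold. The threshold value here is an arbitrary assumption."""
    return frame_energies(samples) > threshold
```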
- Meaning abstraction
- opinionated stance
- words, sentences
- speech acts: assertive, directive, commissive, expressive, declaration
- Actions
- Gricean maxims - are they hard to follow?
- M. of quantity – don’t give too little/too much information
- M. of quality – be truthful
- M. of relation – be relevant
- M. of manner – be clear
- Grounding and dialogue recovery
- Entropy
- Definition \(H(\text{text}) = - \sum_{x \in \text{text}}{\frac{freq(x)}{len(\text{text})} \log_2\left(\frac{freq(x)}{len(\text{text})}\right)}\)
- Simplification - Find it!
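A minimal Python sketch of the entropy definition above; it groups the sum over the distinct tokens of the text, and the whitespace tokenization is an assumption you should adapt per dataset.

```python
import math
from collections import Counter

def entropy(tokens):
    """H(text) = -sum over distinct tokens x of freq(x)/N * log2(freq(x)/N),
    where N = len(tokens)."""
    n = len(tokens)
    return -sum(c / n * math.log2(c / n) for c in Counter(tokens).values())

# Whitespace tokenization is an assumption; adapt it per dataset.
print(entropy("how may I help you".lower().split()))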
- Cross-entropy and LM
-
\[H(p, q) = -\sum_{x}{p(x) \log_2(q(x))}\]
-
\[H(\text{text}, LM) = -\sum_{x \in \text{text}}{\frac{1}{N} \log_2(LM(x))}\]
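The second formula maps directly to code. A sketch, where `lm_prob` is a hypothetical stand-in for a real language model that returns the probability of a token in context:

```python
import math

def cross_entropy(tokens, lm_prob):
    """H(text, LM) = -(1/N) * sum over tokens x of log2(LM(x)).
    `lm_prob` is a hypothetical callable, not a real library API."""
    return -sum(math.log2(lm_prob(x)) for x in tokens) / len(tokens)
```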
- n-gram language models – see the detailed description in Jurafsky & Martin's Speech and Language Processing
Homework
- (1 point) Implement entropy calculations and compute the entropy for the following datasets:
- DSTC2 dataset
- Facebook bAbI tasks 1-6. See GitHub for details.
- All the News - use just the “Article Content” field
- Use at most the first 10,000 utterances/sentences if a dataset is large.
- Describe in 5 sentences the properties of each dataset and explain how they relate to the computed entropy value.
- (2 points) Train a Language Model and compute cross-entropy on the Vystadial dataset
- Recommended toolkit - KenLM
- Read the README and train a model on the Vystadial training set:
bin/lmplz -o 5 <text >text.arpa
- Compute the cross-entropy of the training set, the dev set, and the first sentence from the dev set. See the example usage and the sketch after this list.
- Describe and explain the results in 5 to 10 sentences.
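A sketch of the cross-entropy computation using KenLM's Python bindings. The file name text.arpa matches the lmplz command above; the per-token accounting (one extra </s> per sentence) follows the usual KenLM convention but is an assumption you should sanity-check against the README.

```python
import math
import kenlm  # KenLM's Python bindings

model = kenlm.Model("text.arpa")  # the ARPA model trained with lmplz above

def corpus_cross_entropy(sentences):
    """Per-token cross-entropy in bits. model.score returns the total
    log10 probability of a sentence, so convert log10 -> log2."""
    total_log10, n_tokens = 0.0, 0
    for sentence in sentences:
        total_log10 += model.score(sentence, bos=True, eos=True)
        n_tokens += len(sentence.split()) + 1  # +1 for the </s> token
    return -total_log10 / (n_tokens * math.log10(2))
```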
- BONUS (3 points) Train a wake word model and evaluate it with your voice!
- Recommended model: Mycroft Precise
- Write a short summary of what you did and what problems you faced.
- Include your dataset with your source code.
- Include F1 values on the training, development, and test sets (a scoring sketch follows this list).
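For the F1 numbers, a minimal sketch using scikit-learn; the label arrays are made-up placeholders to be replaced with your detector's actual decisions.

```python
from sklearn.metrics import f1_score

# Placeholder labels: 1 = wake word present, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # your detector's decisions go here
print(f"F1 = {f1_score(y_true, y_pred):.2f}")
```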
- BONUS (3 points) Write a conditional language model using an RNN (Recurrent Neural Network); a sketch follows this list.
- A conditional language model is a decoder RNN whose initial state is initialized with (i.e. conditioned on) additional information.
- Run the conditional language model on user inputs from the DSTC2 dataset.
- Use the previous dialogue state (or a part of it) as the initialization for your conditional language model.
- Compare the perplexity of a vanilla RNN (zero-initialized) and your conditional implementation on the user inputs.
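A PyTorch sketch of the conditional LM idea. All dimensions, the GRU choice, and the dialogue-state featurization are illustrative assumptions, not a prescribed architecture.

```python
import torch
import torch.nn as nn

class ConditionalRNNLM(nn.Module):
    """GRU decoder whose initial hidden state is projected from a
    dialogue-state feature vector; with state=None it degrades to the
    zero-initialized vanilla baseline."""

    def __init__(self, vocab_size, emb_dim=64, hid_dim=128, state_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.init_proj = nn.Linear(state_dim, hid_dim)  # dialogue state -> h0
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) token ids; state: (batch, state_dim) or None
        h0 = None if state is None else torch.tanh(self.init_proj(state)).unsqueeze(0)
        hidden, _ = self.rnn(self.embed(tokens), h0)
        return self.out(hidden)  # next-token logits at every position

# Perplexity = exp(mean token cross-entropy); compute it with state=None
# (vanilla baseline) and with the dialogue-state-conditioned model, then compare.
```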