
Lab 3, SDS 2016

Content

  • Dialogue State Tracking (DST)
  • Using Recurrent Neural Networks (RNN) for DST
  • Recap of TensorFlow and RNNs needed for the homework
  • Suggested topics
    1. Mean Square Error (MSE) vs cross-entropy loss functions (see the first sketch after this list)
    2. Backpropagation (BP): what you need to know
    3. Recurrent NNs (RNNs): Backpropagation Through Time (BPTT), what you need to know (see the BPTT sketch after this list)
    4. How to debug Neural Networks
      • TensorBoard, watching the loss function, early stopping (see the early-stopping sketch after this list)
  • If you have any questions about using RNNs and TensorFlow, please ask them in advance via email. I may need to prepare some data or code to demonstrate the answer.
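
To make topic 1 concrete, here is a minimal numpy sketch (my illustration, not lab code) computing both losses for the same 3-class prediction; all names and numbers are made up. Note that cross-entropy only depends on the probability assigned to the gold class, while MSE penalizes every output coordinate.

```python
# Contrast MSE and cross-entropy on a 3-class toy example.
import numpy as np

def softmax(logits):
    """Numerically stable softmax: shift logits before exponentiating."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

logits = np.array([2.0, 0.5, -1.0])
probs = softmax(logits)               # predicted distribution
target = np.array([1.0, 0.0, 0.0])    # one-hot gold label

mse = np.mean((probs - target) ** 2)
cross_entropy = -np.sum(target * np.log(probs))  # = -log P(gold class)

print("probs:", probs)
print("MSE: %.4f  cross-entropy: %.4f" % (mse, cross_entropy))
```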
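
For topic 3, the sketch below (again my own toy illustration, not lab code) unrolls a vanilla tanh RNN for a few steps and backpropagates through time by hand: the gradient of the recurrent matrix is accumulated across all time steps, and a numerical gradient check confirms the result.

```python
# Backpropagation through time for a tiny vanilla RNN with an MSE
# readout on the last hidden state; the gradient w.r.t. the recurrent
# matrix Wh is checked numerically.
import numpy as np

rng = np.random.RandomState(0)
T, n_in, n_hid = 4, 3, 5
Wx = rng.randn(n_hid, n_in) * 0.1
Wh = rng.randn(n_hid, n_hid) * 0.1
w_out = rng.randn(n_hid) * 0.1
xs = rng.randn(T, n_in)
target = 0.5

def forward(Wh):
    hs = [np.zeros(n_hid)]                        # h_0 = 0
    for t in range(T):
        hs.append(np.tanh(Wx @ xs[t] + Wh @ hs[-1]))
    y = w_out @ hs[-1]                            # scalar readout
    loss = 0.5 * (y - target) ** 2
    return loss, y, hs

def bptt(Wh):
    loss, y, hs = forward(Wh)
    dWh = np.zeros_like(Wh)
    dh = (y - target) * w_out                     # dL/dh_T
    for t in reversed(range(T)):
        dz = dh * (1.0 - hs[t + 1] ** 2)          # back through tanh
        dWh += np.outer(dz, hs[t])                # accumulate over time
        dh = Wh.T @ dz                            # pass gradient to h_{t-1}
    return loss, dWh

loss, dWh = bptt(Wh)
eps, (i, j) = 1e-5, (1, 2)                        # check one weight entry
Wp, Wm = Wh.copy(), Wh.copy()
Wp[i, j] += eps; Wm[i, j] -= eps
num = (forward(Wp)[0] - forward(Wm)[0]) / (2 * eps)
print("analytic %.6f  numerical %.6f" % (dWh[i, j], num))
```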
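
And for topic 4, a self-contained early-stopping sketch: training is replaced by a hard-coded dev-loss curve so the loop runs as-is; the comments mark where a real training step, dev evaluation, and checkpointing (or a TensorBoard summary write) would go.

```python
# Stop training once the dev loss has not improved for `patience` epochs.
simulated_dev_losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.64, 0.66]

best_dev = float("inf")
patience = 2            # non-improving epochs we tolerate
bad_epochs = 0

for epoch, dev_loss in enumerate(simulated_dev_losses):
    # real run: train one epoch here, then evaluate on the dev set
    if dev_loss < best_dev:
        best_dev, bad_epochs = dev_loss, 0
        # real run: save a checkpoint of the best model here
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print("stopping at epoch %d, best dev loss %.2f" % (epoch, best_dev))
            break
```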

Homework - RNN DST

  • Explore the DSTC2 tracking challenge data
    • Especially the semantics field in data/*/label.json, which you should use to parse DAIs (dialogue act items); see the parsing sketch after this list
    • Use a dev set of size 200. After extracting the DSTC2 data you can create it with `ls data/*/*/label.json | shuf | head -n 200 > dstc2_200_dev_set.txt`
      • If you want to compare results against each other, use this split
    • Suggest how to evaluate a tracker on the DSTC2 data.
    • Suggest what to classify and how often to classify (After each word? After each utterance? After each turn? …).
  • Understand the Word embeddings tutorial
    • Prepare answers to the following questions
      • What is the range and domain of the softmax function?
      • Find at least two reasons for minimizing \(-\log P(x)\) instead of maximizing \(P(x)\) as the loss function (see the underflow sketch after this list).
      • What is stochastic about SGD? What do you need to do after each epoch?
      • What is a batch? Why use it? Is SGD more stochastic with larger batches?
      • What does the reduce_mean operation do?
      • What will happen if we initialize all the embeddings to the same vector? What if they are all zero?
      • What is the motivation behind NCE and negative sampling?
  • Try to understand backpropagation
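
Below is a hedged sketch of inspecting the semantics annotation in one label.json file, as suggested in the first homework item. The key layout (`turns` → `semantics` → `json`, with `act` and `slots` per dialogue act item) is how I recall the DSTC2 format; verify it against the extracted data before building on it.

```python
# Peek at the dialogue-act annotation of one DSTC2 session.
# NOTE: the key names below are assumptions about the DSTC2 layout.
import json

path = "label.json"  # one session's label file from the extracted data
with open(path) as f:
    label = json.load(f)

for turn in label["turns"]:
    print(turn["transcription"])
    # each DAI should carry an act type plus slot-value pairs
    for dai in turn["semantics"]["json"]:
        print("  act:", dai["act"], "slots:", dai["slots"])
```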
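
As a hint for the \(-\log P(x)\) question, this small numpy sketch (illustrative only) shows one numerical reason: a product of many probabilities underflows to zero, while the equivalent sum of negative logs stays representable. A second reason to look for: a sum of per-example log terms decomposes into the per-batch averages that SGD works with.

```python
# One numerical reason to prefer -log P: products of probabilities underflow.
import numpy as np

rng = np.random.RandomState(1)
probs = rng.uniform(0.1, 0.9, size=2000)   # e.g. per-token probabilities

product = np.prod(probs)                   # underflows to exactly 0.0
neg_log_sum = -np.sum(np.log(probs))       # the same information in log space

print("product of probabilities:", product)
print("sum of -log p: %.1f" % neg_log_sum)
```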