# Lab 3, SDS 2016

## Content

• Dialogue State Tracking (DST)
• Using Recurrent Neural Networks (RNN) for DST
• TensorFlow and RNN recapitulation needed for homework
• Suggested topics
1. Mean Square Error (MSE) vs. cross-entropy loss functions
2. Backpropagation (BP): what you need to know
3. Recurrent NNs (RNNs): Backpropagation Through Time (BPTT), what you need to know
4. How to debug Neural Networks
• TensorBoard, loss function, early stopping
• If you have any questions about using RNN and TensorFlow, please ask them in advance via email. I may need to prepare some data or code to demonstrate the answer.

## Homework - RNN DST

• Explore DSTC2 tracking challenge data
• Especially the `semantics` field in data/*/label.json, which you should use to parse the dialogue act items (DAIs)
• Use a dev set of size 200. After extracting the DSTC2 data, you can create it with `ls data/*/*/label.json | shuf | head -n 200 > dstc2_200_dev_set.txt`
• If you want to compare results with each other, use this split
• Suggest how to evaluate the DSTC data.
• Suggest what to classify and how often to classify (after each word? after each utterance? after each turn? …).
• Understand the Word embeddings tutorial
• Prepare answers for the following questions
• What is the range and domain of the softmax function?
• Find at least two reasons for minimizing $-\log P(x)$ instead of maximizing $P(x)$ as the loss function.
• What is stochastic about SGD? What do you need to do after each epoch?
• What is a batch? Why use it? Is SGD more stochastic with larger batches?
• What does the `reduce_mean` operation do?
• What will happen if we initialize all the embeddings to the same vector? What if they are all zero?
• What is the motivation behind NCE and negative sampling?
• Try to understand backpropagation
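As a starting point for the softmax question above, a minimal NumPy sketch (the `softmax` helper is my own, not part of the assignment code):

```python
import numpy as np

def softmax(z):
    """Map an arbitrary real vector (domain: R^n) to a probability
    distribution (range: each output in (0, 1), outputs sum to 1)."""
    z = z - np.max(z)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([1.0, 2.0, 3.0]))
# p is a valid probability distribution; the largest logit gets the largest mass
```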
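One reason for minimizing $-\log P(x)$ can be seen numerically: multiplying many small probabilities underflows in floating point, while summing log-probabilities stays stable. A sketch, assuming a toy sequence of 200 tokens each with probability 0.01:

```python
import math

probs = [0.01] * 200  # 200 tokens, each with probability 0.01

product = 1.0
for p in probs:
    product *= p      # 0.01**200 = 1e-400 underflows to 0.0 in float64

# Summing logs instead of multiplying probabilities stays finite
neg_log_likelihood = -sum(math.log(p) for p in probs)  # 200 * ln(100) ≈ 921.03
```

The log also turns a product over tokens into a sum, which gives per-token gradient terms instead of one fragile product.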
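For the `reduce_mean` question, its behavior can be mimicked with NumPy, where `np.mean` plays the role of TensorFlow's `reduce_mean` (a sketch for intuition, not the TF API itself):

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])

overall = x.mean()       # reduce over all elements -> a scalar
cols = x.mean(axis=0)    # collapse the row axis -> per-column means
rows = x.mean(axis=1)    # collapse the column axis -> per-row means
```

In training code this is typically used to average a per-example loss over the batch into a single scalar objective.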