
Lab 3, SDS 2016

Content

  • Dialogue State Tracking (DST)
  • Using Recurrent Neural Networks (RNN) for DST
  • Recap of TensorFlow and RNNs needed for the homework
  • Suggested topics
    1. Mean Square Error (MSE) vs cross-entropy loss functions (see the first sketch after this list)
    2. Backpropagation (BP): what you need to know
    3. Recurrent NNs (RNNs): Backpropagation Through Time (BPTT), what you need to know (see the BPTT sketch after this list)
    4. How to debug Neural Networks
      • TensorBoard, watching the loss function, early stopping (see the early-stopping sketch after this list)
  • If you have any questions about using RNNs and TensorFlow, please ask them in advance via email. I may need to prepare some data or code to demonstrate the answer.
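
To make topic 1 concrete, here is a minimal numpy sketch (my illustration, not lab code) computing both losses for the same 3-class prediction; all names and numbers are made up. Note that cross-entropy only depends on the probability assigned to the gold class, while MSE penalizes every output coordinate.

```python
# Contrast MSE and cross-entropy on a 3-class toy example.
import numpy as np

def softmax(logits):
    """Numerically stable softmax: shift logits before exponentiating."""
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

logits = np.array([2.0, 0.5, -1.0])
probs = softmax(logits)               # predicted distribution
target = np.array([1.0, 0.0, 0.0])    # one-hot gold label

mse = np.mean((probs - target) ** 2)
cross_entropy = -np.sum(target * np.log(probs))  # = -log P(gold class)

print("probs:", probs)
print("MSE: %.4f  cross-entropy: %.4f" % (mse, cross_entropy))
```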
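
For topic 3, the sketch below (again my own toy illustration, not lab code) unrolls a vanilla tanh RNN for a few steps and backpropagates through time by hand: the gradient of the recurrent matrix is accumulated across all time steps, and a numerical gradient check confirms the result.

```python
# Backpropagation through time for a tiny vanilla RNN with an MSE
# readout on the last hidden state; the gradient w.r.t. the recurrent
# matrix Wh is checked numerically.
import numpy as np

rng = np.random.RandomState(0)
T, n_in, n_hid = 4, 3, 5
Wx = rng.randn(n_hid, n_in) * 0.1
Wh = rng.randn(n_hid, n_hid) * 0.1
w_out = rng.randn(n_hid) * 0.1
xs = rng.randn(T, n_in)
target = 0.5

def forward(Wh):
    hs = [np.zeros(n_hid)]                        # h_0 = 0
    for t in range(T):
        hs.append(np.tanh(Wx @ xs[t] + Wh @ hs[-1]))
    y = w_out @ hs[-1]                            # scalar readout
    loss = 0.5 * (y - target) ** 2
    return loss, y, hs

def bptt(Wh):
    loss, y, hs = forward(Wh)
    dWh = np.zeros_like(Wh)
    dh = (y - target) * w_out                     # dL/dh_T
    for t in reversed(range(T)):
        dz = dh * (1.0 - hs[t + 1] ** 2)          # back through tanh
        dWh += np.outer(dz, hs[t])                # accumulate over time
        dh = Wh.T @ dz                            # pass gradient to h_{t-1}
    return loss, dWh

loss, dWh = bptt(Wh)
eps, (i, j) = 1e-5, (1, 2)                        # check one weight entry
Wp, Wm = Wh.copy(), Wh.copy()
Wp[i, j] += eps; Wm[i, j] -= eps
num = (forward(Wp)[0] - forward(Wm)[0]) / (2 * eps)
print("analytic %.6f  numerical %.6f" % (dWh[i, j], num))
```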
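
And for topic 4, a self-contained early-stopping sketch: training is replaced by a hard-coded dev-loss curve so the loop runs as-is; the comments mark where a real training step, dev evaluation, and checkpointing (or a TensorBoard summary write) would go.

```python
# Stop training once the dev loss has not improved for `patience` epochs.
simulated_dev_losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.64, 0.66]

best_dev = float("inf")
patience = 2            # non-improving epochs we tolerate
bad_epochs = 0

for epoch, dev_loss in enumerate(simulated_dev_losses):
    # real run: train one epoch here, then evaluate on the dev set
    if dev_loss < best_dev:
        best_dev, bad_epochs = dev_loss, 0
        # real run: save a checkpoint of the best model here
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            print("stopping at epoch %d, best dev loss %.2f" % (epoch, best_dev))
            break
```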

Homework - RNN DST

  • Explore the DSTC2 tracking challenge data
    • Especially the semantics field in data/*/label.json, which you should use to parse DAIs (dialogue act items); see the parsing sketch after this list
    • Use a dev set of size 200. After extracting the DSTC2 data you can create it with `ls data/*/*/label.json | shuf | head -n 200 > dstc2_200_dev_set.txt`
      • If you want to compare results against each other, use this split
    • Suggest how to evaluate a tracker on the DSTC2 data.
    • Suggest what to classify and how often to classify (After each word? After each utterance? After each turn? …).
  • Understand the Word embeddings tutorial
    • Prepare answers to the following questions
      • What is the range and domain of the softmax function?
      • Find at least two reasons for minimizing \(-\log P(x)\) instead of maximizing \(P(x)\) as the loss function (see the underflow sketch after this list).
      • What is stochastic about SGD? What do you need to do after each epoch?
      • What is a batch? Why use it? Is SGD more stochastic with larger batches?
      • What does the reduce_mean operation do?
      • What will happen if we initialize all the embeddings to the same vector? What if they are all zero?
      • What is the motivation behind NCE and negative sampling?
  • Try to understand backpropagation
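
Below is a hedged sketch of inspecting the semantics annotation in one label.json file, as suggested in the first homework item. The key layout (`turns` → `semantics` → `json`, with `act` and `slots` per dialogue act item) is how I recall the DSTC2 format; verify it against the extracted data before building on it.

```python
# Peek at the dialogue-act annotation of one DSTC2 session.
# NOTE: the key names below are assumptions about the DSTC2 layout.
import json

path = "label.json"  # one session's label file from the extracted data
with open(path) as f:
    label = json.load(f)

for turn in label["turns"]:
    print(turn["transcription"])
    # each DAI should carry an act type plus slot-value pairs
    for dai in turn["semantics"]["json"]:
        print("  act:", dai["act"], "slots:", dai["slots"])
```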
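
As a hint for the \(-\log P(x)\) question, this small numpy sketch (illustrative only) shows one numerical reason: a product of many probabilities underflows to zero, while the equivalent sum of negative logs stays representable. A second reason to look for: a sum of per-example log terms decomposes into the per-batch averages that SGD works with.

```python
# One numerical reason to prefer -log P: products of probabilities underflow.
import numpy as np

rng = np.random.RandomState(1)
probs = rng.uniform(0.1, 0.9, size=2000)   # e.g. per-token probabilities

product = np.prod(probs)                   # underflows to exactly 0.0
neg_log_sum = -np.sum(np.log(probs))       # the same information in log space

print("product of probabilities:", product)
print("sum of -log p: %.1f" % neg_log_sum)
```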