Ondřej Plátek Archive
PhD candidate@UFAL, Prague. LLM & TTS evaluation. Engineer. Researcher. Speaker. Father.

Speech recognition


  1. Vector quantisation - k-means, also known as Lloyd's algorithm
    $ u_i = Centroid(R_i) $
    $ C_i = \frac{1}{|R_i|} \sum_{x \in R_i} { (x-u_i) * (x-u_i)^T} $

  2. we can change the distance to the (squared) Mahalanobis distance $ d_{Mahalanobis}(x,u) = (x-u)^T * C^{-1} * (x-u) $

  3. we can assign each point to the region of its most likely Gaussian: $ R_i = \{ x \mid i = \arg\max_j Norm(x | u_j, C_j) \} $

  4. Expectation-maximisation (EM) algorithm
    $ p(x|\theta) = \sum_{i=1}^{N}{ c_i * Norm(x | u_i, C_i)} $ where $ \theta = (c_i, u_i = E(Norm), C_i = Var(Norm)) $ for $ i = 1, ..., N $
    $ w = \{ x_1, ..., x_T \} $
    MLE (the observations x are independent, so the log-likelihood is a sum of logarithms):
    $ L^{'}(\theta|w) = p(x_1, ..., x_{T}|\theta) = \prod_{x \in w} { p(x|\theta) } $
    $ L(\theta|w) = \ln(L^{'}(\theta|w)) = \sum_{x \in w} { \ln( p(x|\theta)) } $

    1. E-step $ w = \{ x_1, ..., x_{T} \} $

    2. M-step N



  5. Rest in Fing: basically, we recalculate the Gaussians after each update step

  6. MFCC, mel function, Fourier transform: todo

  7. HDK todo
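
The vector quantisation step from item 1 can be sketched roughly as follows. This is a minimal NumPy illustration, not a reference implementation: the function names `assign`, `lloyd_kmeans` and `region_covariances` are made up for this note, the initial centroids are passed in by hand, and the covariance is normalised by the region size (an assumption).

```python
import numpy as np


def assign(points, centroids):
    """Return, for every point, the index of its nearest centroid (Euclidean)."""
    sq_dist = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


def lloyd_kmeans(points, init_centroids, n_iter=20):
    """Lloyd's algorithm: alternate region assignment and centroid update."""
    points = np.asarray(points, dtype=float)
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = points[labels == i]
            # u_i = Centroid(R_i); an empty region keeps its previous centroid.
            if len(members) > 0:
                centroids[i] = members.mean(axis=0)
    return centroids, assign(points, centroids)


def region_covariances(points, labels, centroids):
    """C_i = average of (x - u_i)(x - u_i)^T over the points x in R_i."""
    points = np.asarray(points, dtype=float)
    covs = []
    for i, u in enumerate(centroids):
        diff = points[labels == i] - u
        covs.append(diff.T @ diff / max(len(diff), 1))
    return np.array(covs)
```

Calling `lloyd_kmeans(points, init_centroids)` and then `region_covariances(points, labels, centroids)` gives one mean and one covariance per region, which is what the later items build on.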

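Items 2 and 3 swap the Euclidean distance for the Mahalanobis distance and then for the Gaussian density itself. A small sketch of both, assuming full covariance matrices that are invertible; the helper names `mahalanobis_sq`, `gaussian_log_density` and `most_likely_region` are illustrative only, and the density is computed in the log domain, which does not change the argmax.

```python
import numpy as np


def mahalanobis_sq(x, u, cov):
    """Squared Mahalanobis distance (x - u)^T C^{-1} (x - u)."""
    diff = np.asarray(x, dtype=float) - np.asarray(u, dtype=float)
    # Solve C z = diff rather than forming the explicit inverse.
    return float(diff @ np.linalg.solve(cov, diff))


def gaussian_log_density(x, u, cov):
    """ln Norm(x | u, C) for a full-covariance Gaussian."""
    d = len(u)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + mahalanobis_sq(x, u, cov))


def most_likely_region(x, means, covs):
    """Index i = argmax_j Norm(x | u_j, C_j)."""
    scores = [gaussian_log_density(x, u, c) for u, c in zip(means, covs)]
    return int(np.argmax(scores))
```

With a broad Gaussian at the origin and a very narrow one at (5, 0), the point (3, 0) is closer to the second mean in Euclidean terms but is still assigned to the first region, because the narrow Gaussian gives it almost no probability mass.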

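The E-step and M-step in item 4 are left unfinished in these notes. The sketch below fills them with the standard Gaussian-mixture updates, which is an assumption about what was intended: the E-step computes responsibilities $ \gamma_{t,i} = c_i * Norm(x_t | u_i, C_i) / p(x_t|\theta) $, and the M-step re-estimates $ c_i $, $ u_i $ and $ C_i $ from those soft counts. The names `gaussian_pdf`, `log_likelihood`, `em_step` and the `reg` ridge term are hypothetical.

```python
import numpy as np


def gaussian_pdf(X, u, cov):
    """Norm(x | u, C) evaluated for every row x of X."""
    d = X.shape[1]
    diff = X - u
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm


def log_likelihood(X, weights, means, covs):
    """L(theta | w) = sum over x of ln sum_i c_i Norm(x | u_i, C_i)."""
    mix = sum(c * gaussian_pdf(X, u, C) for c, u, C in zip(weights, means, covs))
    return float(np.log(mix).sum())


def em_step(X, weights, means, covs, reg=1e-6):
    """One EM iteration; returns the updated (weights, means, covs)."""
    T, d = X.shape
    # E-step: gamma[t, i] = c_i Norm(x_t | u_i, C_i) / p(x_t | theta).
    joint = np.stack(
        [c * gaussian_pdf(X, u, C) for c, u, C in zip(weights, means, covs)],
        axis=1,
    )
    gamma = joint / joint.sum(axis=1, keepdims=True)
    # M-step: recalculate every Gaussian from the soft counts.
    counts = gamma.sum(axis=0)
    new_weights = counts / T
    new_means = (gamma.T @ X) / counts[:, None]
    new_covs = []
    for i in range(len(counts)):
        diff = X - new_means[i]
        cov = (gamma[:, i, None] * diff).T @ diff / counts[i]
        # Small ridge keeps C_i invertible (an assumption, not from the notes).
        new_covs.append(cov + reg * np.eye(d))
    return new_weights, new_means, np.array(new_covs)
```

Repeating `em_step` until `log_likelihood` stops improving is the usual stopping rule; the k-means centroids and region covariances are a common choice for the starting parameters.
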
trellis - "mřížka" in Czech (grid, lattice)