# Machine Learning basics for Computational Linguistics

### Machine learning general structure

Processing steps

1. Objects/Instances

2. Training data (features, selection of data)

3. Machine learning algorithm (simple generalisation, decision trees, example-based / memory-based learning, support vector machines)

4. Classifier / Model

5. Test Data

6. Evaluation, then go back and revise the key components (features, data, algorithm)
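The pipeline above can be sketched in a few lines of pure Python. The dataset, feature names, and the "majority class" baseline classifier (a simple generalisation) are invented here purely for illustration:

```python
from collections import Counter

# Toy instances: each is a feature dict paired with a class label.
# Features and labels are invented for illustration.
training_data = [
    ({"ends_in_ly": True}, "adverb"),
    ({"ends_in_ly": True}, "adverb"),
    ({"ends_in_ly": False}, "noun"),
]
test_data = [
    ({"ends_in_ly": True}, "adverb"),
    ({"ends_in_ly": False}, "noun"),
]

def train_majority_baseline(data):
    """Simple generalisation: always predict the most frequent class."""
    majority = Counter(label for _, label in data).most_common(1)[0][0]
    return lambda features: majority

def evaluate(classifier, data):
    """Accuracy of the classifier/model on held-out test data."""
    correct = sum(classifier(f) == label for f, label in data)
    return correct / len(data)

model = train_majority_baseline(training_data)   # steps 1-4
print(evaluate(model, test_data))                # steps 5-6: prints 0.5
```

A low score at step 6 is the signal to go back and revise the features, the data, or the algorithm.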

### Decision trees

Pseudocode for building a decision tree:

```
tree BuildDecTree(Training Data T, Classes C)
BEGIN
    IF all examples in T belong to the same class Ci THEN
        create a leaf node for Ci
    ELSE
        1.a Select an attribute F with values V1, ..., Vm
        1.b Create a new decision node for F
        1.c Divide T according to F into subsets T1, ..., Tm
        2.  FOREACH non-empty Tn in T1, ..., Tm:
                run BuildDecTree(Tn, C)
END
```

How do we select the attribute? Introduce *information gain* (reduction of entropy; entropy measures confusion):

Entropy(T) = -Σ_t p_t log2 p_t, where p_t is the probability (relative frequency) of class t in T.

GAIN(T, A) = Entropy(T) - Σ_v (|T_v| / |T|) · Entropy(T_v), where T_v is the subset of T whose value for attribute A is v.

### The Word Sense Disambiguation Problem

A word can have several different meanings. Example word: *chair*.

Sentence examples:

- I sit on my new chair.
- The chair of the local newspaper earns an unknown amount of money.

Here *chair* has two senses:

- chair - a kind of furniture
- chair - a role in an institution

### Lexical Matrix (WordNet)

In the lexical matrix, the rows are synsets (a synset is a set of synonyms, e.g. home = {home#1, habitation#1}), and the senses of one word form a column, e.g. home = ("place to live", "cell on a chess board"). This is a lexical relation.

Semantic relations are NOT stored in the lexical matrix! For example, hypernyms are not stored there: semantic relations hold between synsets.
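The BuildDecTree pseudocode together with the entropy and gain definitions can be sketched in Python. The toy dataset and feature names below are invented for illustration:

```python
from collections import Counter
from math import log2

def entropy(examples):
    """Entropy(T) = -sum_t p_t * log2(p_t), p_t = relative frequency of class t."""
    counts = Counter(label for _, label in examples)
    total = len(examples)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def split(examples, attr):
    """Divide T according to attribute attr into subsets T1, ..., Tm."""
    subsets = {}
    for features, label in examples:
        subsets.setdefault(features[attr], []).append((features, label))
    return subsets

def gain(examples, attr):
    """GAIN(T, A) = Entropy(T) - sum_v |T_v|/|T| * Entropy(T_v)."""
    total = len(examples)
    remainder = sum(len(s) / total * entropy(s)
                    for s in split(examples, attr).values())
    return entropy(examples) - remainder

def build_dec_tree(examples, attrs):
    """Leaf if all examples share a class; otherwise split on the best attribute."""
    labels = {label for _, label in examples}
    if len(labels) == 1 or not attrs:
        return Counter(l for _, l in examples).most_common(1)[0][0]  # leaf
    best = max(attrs, key=lambda a: gain(examples, a))
    rest = [a for a in attrs if a != best]
    return (best, {v: build_dec_tree(s, rest)
                   for v, s in split(examples, best).items()})

# Toy data (invented): guess a word's class from two binary features.
data = [
    ({"capitalised": True,  "ends_in_s": False}, "name"),
    ({"capitalised": True,  "ends_in_s": True},  "name"),
    ({"capitalised": False, "ends_in_s": True},  "verb"),
    ({"capitalised": False, "ends_in_s": False}, "verb"),
]
tree = build_dec_tree(data, ["capitalised", "ends_in_s"])
print(tree)  # ('capitalised', {True: 'name', False: 'verb'})
```

Splitting on "capitalised" yields two pure subsets (gain 1.0), while "ends_in_s" yields mixed subsets (gain 0.0), so the tree uses "capitalised" at the root.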
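A minimal lexical matrix can be modelled as a list of synsets. The synset contents below (including "chairperson#1" and the numbering of senses) are invented for illustration, not taken from the real WordNet:

```python
# Rows are synsets (sets of word#sense entries); the senses of one word
# form a column. All entries here are illustrative, not real WordNet data.
lexical_matrix = [
    {"home#1", "habitation#1"},    # synset: "place to live"
    {"home#2"},                    # synset: "cell on a chess board"
    {"chair#1"},                   # synset: furniture sense
    {"chair#2", "chairperson#1"},  # synset: institutional role
]

def senses(word):
    """The column for `word`: every synset containing a sense of it."""
    return [s for s in lexical_matrix
            if any(e.split("#")[0] == word for e in s)]

def synonyms(word):
    """Words sharing a synset (row) with `word` - a lexical relation."""
    return {e.split("#")[0] for s in senses(word) for e in s} - {word}

print(len(senses("chair")))  # 2 senses of "chair"
print(synonyms("chair"))     # {'chairperson'}

# Semantic relations such as hypernymy are NOT representable here:
# they hold between synsets and need a separate structure (e.g. a
# mapping from synset to hypernym synset).
```

Only lexical relations (synonymy within a row) live in the matrix itself, which mirrors the point above that hypernyms are stored between synsets, outside the lexical matrix.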