There are some support functions in the read_data.py module.
from pathlib import Path
from read_data import read_baby_names, test_dev_train_split
import nltk
from nltk import NaiveBayesClassifier
import numpy as np

baby_names = read_baby_names("data/baby_2017.csv")
 
def make_name_features(
        data_line: list[str]
    ) -> tuple[dict, str]:
    gender = data_line[0]
    name = data_line[1]

    features = {
        "first_letter": name[0],
        "second_letter": name[1],
        "last_letter": name[-1]
    }

    return (features, gender)
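As a quick standalone check, the extractor maps a (gender, name) row to a (features, label) pair. The function is repeated here so the snippet runs on its own, and the example row is made up, not from the actual data file:

```python
def make_name_features(data_line: list[str]) -> tuple[dict, str]:
    # Same feature extractor as above: first, second, and last letters.
    gender = data_line[0]
    name = data_line[1]
    features = {
        "first_letter": name[0],
        "second_letter": name[1],
        "last_letter": name[-1]
    }
    return (features, gender)

# A made-up data row in the same (gender, name) shape as baby_names.
example = make_name_features(["F", "Emma"])
print(example)
# → ({'first_letter': 'E', 'second_letter': 'm', 'last_letter': 'a'}, 'F')
```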
 
all_name_features = [
    make_name_features(line)
    for line in baby_names
]
train_name, dev_name, test_name = test_dev_train_split(all_name_features)
 
Before we train a whole Naive Bayes classifier, let’s think about how well we might do with a much simpler classifier.
def joes_v_good_classifier(features: dict) -> str:
    return "F"
 
Let’s score its accuracy on the dev set.
def classifier_metric(classifier, data):
    guess = classifier(data[0])
    answer = data[1]

    if guess == answer:
        return 1
    return 0
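To illustrate what the metric function returns, here is a standalone sketch: 1 for a correct guess, 0 otherwise. Both functions are repeated, and the feature dicts are invented, so the snippet runs on its own:

```python
def joes_v_good_classifier(features: dict) -> str:
    # Same baseline as above: always guess "F".
    return "F"

def classifier_metric(classifier, data):
    # 1 if the classifier's guess matches the gold label, else 0.
    guess = classifier(data[0])
    answer = data[1]
    if guess == answer:
        return 1
    return 0

# Two made-up (features, label) pairs.
hit = classifier_metric(joes_v_good_classifier, ({"last_letter": "a"}, "F"))
miss = classifier_metric(joes_v_good_classifier, ({"last_letter": "n"}, "M"))
print(hit, miss)  # → 1 0
```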
 
joe_guesses = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
])

joe_guesses.mean()
 
By just always guessing "F", I did better than chance!
Moving beyond accuracy 
Recall: Of all of the names that were "F", how many did the classifier label "F"?
Precision: Of all of the names labelled "F", how many were "F"?
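These two definitions can be phrased in terms of counts over (gold label, guess) pairs. A minimal sketch with made-up pairs, not the real dev set:

```python
# Made-up (gold, guess) pairs, for illustration only.
pairs = [
    ("F", "F"),  # correctly labelled "F"
    ("F", "M"),  # a missed "F": hurts recall
    ("M", "F"),  # wrongly labelled "F": hurts precision
    ("M", "M"),
]

true_pos = sum(1 for gold, guess in pairs if gold == "F" and guess == "F")
recall = true_pos / sum(1 for gold, _ in pairs if gold == "F")
precision = true_pos / sum(1 for _, guess in pairs if guess == "F")
print(recall, precision)  # → 0.5 0.5
```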
 
joe_recall = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
    if data[1] == "F"
])

joe_recall_est = joe_recall.mean()
joe_recall_est
 
joe_precision = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
    if joes_v_good_classifier(data[0]) == "F"
])

joe_precision_est = joe_precision.mean()
joe_precision_est
 
These two measures are often combined into a single score called the “F Measure”.
\[
F = 2\frac{pr}{p+r}
\] 
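As a worked example with made-up numbers: because F is the harmonic mean of precision and recall, it gets pulled toward the lower of the two. With perfect recall but middling precision:

```python
p = 0.5  # made-up precision
r = 1.0  # made-up recall

# F = 2pr / (p + r) = 2 * 0.5 / 1.5
f = 2 * (p * r) / (p + r)
print(round(f, 3))  # → 0.667
```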
joe_f = 2 * ((joe_recall_est * joe_precision_est) / (joe_recall_est + joe_precision_est))
joe_f
 
A high-precision, low-recall classifier 
def a_classifier(features: dict) -> str:
    if features["last_letter"] == "a":
        return "F"

    return "M"
 
a_recall = np.array([
    classifier_metric(a_classifier, data)
    for data in dev_name
    if data[1] == "F"
])

a_recall_est = a_recall.mean()
a_recall_est
 
a_precision = np.array([
    classifier_metric(a_classifier, data)
    for data in dev_name
    if a_classifier(data[0]) == "F"
])

a_precision_est = a_precision.mean()
a_precision_est
 
The way this classifier worked, it was really reluctant to label a name "F", but when it did, it was mostly right. For this data set, this results in a much worse F measure than just labelling every single name "F".
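To see that pattern with made-up numbers (not the actual dev-set estimates): a cautious, high-precision classifier can still lose on F to an always-"F" baseline whose recall is perfect:

```python
def f_measure(p, r):
    # Harmonic mean of precision and recall.
    return 2 * (p * r) / (p + r)

# Hypothetical numbers, chosen only for illustration.
cautious = f_measure(0.9, 0.2)   # high precision, low recall
always_f = f_measure(0.38, 1.0)  # precision = share of "F" names, recall = 1

print(round(cautious, 3), round(always_f, 3))  # → 0.327 0.551
```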
a_f = 2 * ((a_recall_est * a_precision_est) / (a_recall_est + a_precision_est))
a_f
 
 
 
Training and evaluating the Naive Bayes classifier 
nb_classifier = NaiveBayesClassifier.train(train_name)
 
nb_recall = np.array([
    classifier_metric(nb_classifier.classify, data)
    for data in dev_name
    if data[1] == "F"
])

nb_recall_est = nb_recall.mean()
nb_recall_est
 
nb_precision = np.array([
    classifier_metric(nb_classifier.classify, data)
    for data in dev_name
    if nb_classifier.classify(data[0]) == "F"
])

nb_precision_est = nb_precision.mean()
nb_precision_est
 
nb_f = 2 * ((nb_precision_est * nb_recall_est) / (nb_precision_est + nb_recall_est))
nb_f
 
 
Citation
BibTeX citation:
@online{fruehwald2024,
  author = {Fruehwald, Josef},
  title = {Training and Evaluating Classifiers},
  date = {2024-03-21},
  url = {https://lin511-2024.github.io/notes/programming/06_classifier.html},
  langid = {en}
}
For attribution, please cite this work as:
Fruehwald, Josef. 2024. “Training and Evaluating Classifiers.” March 21, 2024. https://lin511-2024.github.io/notes/programming/06_classifier.html.