There are some support functions in the read_data.py module.
from pathlib import Path
from read_data import read_baby_names, \
    test_dev_train_split
import nltk
from nltk import NaiveBayesClassifier
import numpy as np

baby_names = read_baby_names("data/baby_2017.csv")
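From the way baby_names is indexed below, read_baby_names presumably returns each CSV row as a [gender, name] list, and test_dev_train_split hands back a (train, dev, test) triple. A rough, hypothetical sketch of those helpers, just for orientation (not the actual module):

# Hypothetical sketch of the read_data.py helpers -- an assumption, not the real code.
import random
from pathlib import Path

def read_baby_names_sketch(path: str) -> list[list[str]]:
    # Assume each row starts with gender then name (that's how it's indexed below).
    lines = Path(path).read_text().strip().split("\n")
    return [line.split(",")[:2] for line in lines]

def test_dev_train_split_sketch(data: list) -> tuple[list, list, list]:
    # Shuffle, then return (train, dev, test), matching how it's unpacked below:
    # roughly 80% train, 10% dev, 10% test.
    random.seed(511)
    shuffled = random.sample(data, k=len(data))
    tenth = len(shuffled) // 10
    return shuffled[:-2 * tenth], shuffled[-2 * tenth:-tenth], shuffled[-tenth:]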
def make_name_features(
    data_line: list[str]
) -> tuple[dict, str]:
    # Turn a [gender, name] row into a (features, gender) tuple.
    gender = data_line[0]
    name = data_line[1]
    features = {
        "first_letter": name[0],
        "second_letter": name[1],
        "last_letter": name[-1]
    }
    return (features, gender)
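So a single data line like ["F", "Emma"] (an illustrative example, not necessarily a row from the file) becomes a (features, label) tuple:

make_name_features(["F", "Emma"])
# ({'first_letter': 'E', 'second_letter': 'm', 'last_letter': 'a'}, 'F')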
all_name_features = [
    make_name_features(line)
    for line in baby_names
]
train_name, dev_name, test_name = test_dev_train_split(all_name_features)
Before we train a whole Naive Bayes classifier, let’s think about how well we might do with a much simpler classifier.
def joes_v_good_classifier(features: dict) -> str:
    # Ignore the features entirely and always guess "F".
    return "F"
Let’s score its accuracy on the dev set.
def classifier_metric(classifier, data):
    # data is a (features, label) tuple: return 1 if the classifier
    # guesses the label correctly, 0 otherwise.
    guess = classifier(data[0])
    answer = data[1]
    if guess == answer:
        return 1
    return 0
joe_guesses = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
])
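Since classifier_metric returns 1 for a correct guess and 0 otherwise, averaging joe_guesses gives an accuracy estimate on the dev set (the variable name below is my own):

joe_accuracy_est = joe_guesses.mean()
joe_accuracy_est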
By just always guessing "F", I did better than chance!
Moving beyond accuracy
Recall: Of all of the names that were "F", how many did the classifier label "F"?
Precision: Of all of the names labelled "F", how many were "F"?
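In the usual confusion-matrix terms, treating "F" as the positive class, recall (r) and precision (p) are
\[
r = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}},
\qquad
p = \frac{\text{true positives}}{\text{true positives} + \text{false positives}}
\]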
joe_recall = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
    if data[1] == "F"
])
joe_recall_est = joe_recall.mean()
joe_recall_est
joe_precision = np.array([
    classifier_metric(joes_v_good_classifier, data)
    for data in dev_name
    if joes_v_good_classifier(data[0]) == "F"
])
joe_precision_est = joe_precision.mean()
joe_precision_est
These two measures are often combined into a single score called the “F Measure”.
\[
F = 2\frac{pr}{p+r}
\]
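This is the harmonic mean of precision and recall, so it gets pulled toward whichever of the two is smaller. With made-up values p = 1.0 and r = 0.5, for instance:
\[
F = 2\frac{(1.0)(0.5)}{1.0 + 0.5} \approx 0.67
\]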
joe_f = 2 * ((joe_recall_est * joe_precision_est) / (joe_recall_est + joe_precision_est))
joe_f
A high-precision, low-recall classifier
def a_classifier(features: dict) -> str:
    # Guess "F" only when the name ends in "a"; otherwise guess "M".
    if features["last_letter"] == "a":
        return "F"
    return "M"
a_recall = np.array([
    classifier_metric(a_classifier, data)
    for data in dev_name
    if data[1] == "F"
])
a_recall_est = a_recall.mean()
a_recall_est
a_precision = np.array([
    classifier_metric(a_classifier, data)
    for data in dev_name
    if a_classifier(data[0]) == "F"
])
a_precision_est = a_precision.mean()
a_precision_est
This classifier was really reluctant to label a name "F", but when it did, it was mostly right. For this data set, the low recall results in a way worse F measure than just labelling every single name "F".
a_f = 2 * ((a_recall_est * a_precision_est) / (a_recall_est + a_precision_est))
a_f
Training and evaluating the Naive Bayes classifier
nb_classifier = NaiveBayesClassifier.train(train_name)
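If you want to peek at what the model actually learned, NLTK classifiers can print their most informative features (output not shown here):

nb_classifier.show_most_informative_features(10)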
nb_recall = np.array([
    classifier_metric(nb_classifier.classify, data)
    for data in dev_name
    if data[1] == "F"
])
nb_recall_est = nb_recall.mean()
nb_recall_est
nb_precision = np.array([
    classifier_metric(nb_classifier.classify, data)
    for data in dev_name
    if nb_classifier.classify(data[0]) == "F"
])
nb_precision_est = nb_precision.mean()
nb_precision_est
nb_f = 2 * ((nb_precision_est * nb_recall_est) / (nb_precision_est + nb_recall_est))
nb_f
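As a cross-check, NLTK also has a built-in accuracy function that takes a list of (features, label) pairs, and once a classifier has been chosen using the dev set, the same metrics should finally be reported on the held-out test_name data. For example:

nltk.classify.accuracy(nb_classifier, dev_name)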