In the last article, we discussed why AdaBoost works, which gave us an intuitive picture of the algorithm through the optimization of an exponential loss function. That gives a good enough understanding, but it may not give us the complete picture. We also used the AdaBoostClassifier class from the scikit-learn library to measure the error on a given classification problem.
Here we will implement an AdaBoost classifier function in Python, based on the pseudocode in the first AdaBoost article of this series. We are dealing with a classification problem with two classes.
We will use a decision tree as the base estimator, and in each round we will fit a new classifier using the current sample weights.
[Image: AdaBoost pseudocode]
First we import the necessary packages for our AdaBoost Classifier.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn import datasets
We generate the dataset with make_hastie_10_2 and divide it into training and test sets using the train_test_split function.
hastie = datasets.make_hastie_10_2()
x, y = hastie
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3)
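To make the data less of a black box, here is a small NumPy sketch of the labeling rule that the scikit-learn documentation gives for make_hastie_10_2: ten independent standard Gaussian features, with the label 1 when the sum of squared features exceeds 9.34 and -1 otherwise. The helper name hastie_like, the sample count, and the seed are assumptions made for this illustration; it is not the library's own code.

```python
import numpy as np


def hastie_like(n_samples=1000, seed=0):
    """Hypothetical helper that mimics the documented labeling rule."""
    rng = np.random.default_rng(seed)
    # Ten independent standard Gaussian features per sample
    X = rng.standard_normal((n_samples, 10))
    # Label is +1 when the squared norm exceeds 9.34, otherwise -1
    y = np.where((X ** 2).sum(axis=1) > 9.34, 1.0, -1.0)
    return X, y


X_demo, y_demo = hastie_like()
print(X_demo.shape, np.unique(y_demo))
```

The labels being -1 and +1 matters later on: the final prediction is the sign of a weighted sum of the individual classifiers' predictions, which only works with this encoding.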
The function to calculate the error_rate is as below:
def error_rate(pred, Y):
    error = 0
    for i in range(len(Y)):
        if pred[i] != Y[i]:
            error = error + 1
    return error / len(Y)
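The same quantity can also be computed without an explicit loop. The following is a minimal sketch of a vectorized variant; the name error_rate_vectorized is made up for this example, and it assumes both inputs have the same length.

```python
import numpy as np


def error_rate_vectorized(pred, Y):
    """Fraction of positions where pred and Y disagree (hypothetical name)."""
    pred, Y = np.asarray(pred), np.asarray(Y)
    # The comparison gives a boolean array; its mean is the share of mismatches
    return float(np.mean(pred != Y))


# Two of the four predictions differ from the labels
print(error_rate_vectorized([1, -1, 1, 1], [1, 1, 1, -1]))
```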
Next, we write the AdaBoost function itself, following the algorithm in the pseudocode above. The function returns the error rate on the training set and on the test set, in that order.
def adaboost(X_train, Y_train, X_test, Y_test, M, clf):
    n_train = len(X_train)
    n_test = len(X_test)
    # Initializing the weights uniformly
    w = np.ones(n_train) / n_train
    pred_train, pred_test = np.zeros(n_train), np.zeros(n_test)
    for m in range(M):
        # Fitting the classifier with the current weights
        clf.fit(X_train, Y_train, sample_weight=w)
        pred_train_ite = clf.predict(X_train)
        pred_test_ite = clf.predict(X_test)
        # diff is 1 for a misclassified training sample and 0 otherwise
        diff = (pred_train_ite != Y_train).astype(float)
        # diff2 is +1 for a misclassified sample and -1 otherwise
        diff2 = 2 * diff - 1
        # Weighted error of this round
        err_m = np.dot(w, diff) / np.sum(w)
        # Keep err_m away from 0 and 1 so that the logarithm stays finite
        err_m = np.clip(err_m, 1e-10, 1 - 1e-10)
        # Alpha value
        alpha_m = 0.5 * np.log((1 - err_m) / err_m)
        # Updating weights: up for mistakes, down for correct predictions
        w = w * np.exp(alpha_m * diff2)
        # Rescale so the weights do not underflow over many rounds;
        # err_m is unaffected because it divides by the sum of the weights
        w = w / np.sum(w)
        # Summation of the weighted predictions for the training and test set
        pred_train = pred_train + alpha_m * pred_train_ite
        pred_test = pred_test + alpha_m * pred_test_ite
    # Signum function applied to the summation
    pred_train, pred_test = np.sign(pred_train), np.sign(pred_test)
    # Returning the training and test error rates
    return error_rate(pred_train, Y_train), error_rate(pred_test, Y_test)
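To see what a single boosting round does to the numbers, here is a small worked example that follows the same formulas for the error, the alpha value, and the weight update. The five-sample setup with one mistake is invented for illustration, and the final rescaling of the weights is an extra step added here only to make the result easier to read.

```python
import numpy as np

# Toy setup (illustrative values): 5 samples, only the last one misclassified
w = np.ones(5) / 5
diff = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # 1 marks a mistake
diff2 = 2 * diff - 1                        # +1 for a mistake, -1 otherwise

# Weighted error: the share of total weight sitting on mistakes
err_m = np.dot(w, diff) / np.sum(w)

# Alpha value: larger when the round's error is smaller
alpha_m = 0.5 * np.log((1 - err_m) / err_m)

# Weight update, then rescaling so the weights sum to 1 (added for readability)
w_new = w * np.exp(alpha_m * diff2)
w_new = w_new / np.sum(w_new)

print(err_m, alpha_m, w_new)
```

With an error of 0.2, the alpha value is 0.5 * ln(4), about 0.693. After rescaling, the single misclassified sample carries a weight of about 0.5 and each correctly classified sample about 0.125, so the next classifier is pushed to get that sample right.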
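We use a decision tree as the base classifier. The tree is limited to depth 1 (a stump) here, because a fully grown tree can classify every training sample correctly, which makes the weighted error zero and the alpha value infinite.
clf_tree = DecisionTreeClassifier(max_depth=1)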
err_train = []
err_test = []
x_range = range(10, 500, 10)
for i in x_range:
    err_i = adaboost(X_train, Y_train, X_test, Y_test, i, clf_tree)
    err_train.append(err_i[0])
    err_test.append(err_i[1])
The loop stores the training and test error rates for each number of boosting rounds (10, 20, ..., 490) in the err_train and err_test lists.
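Once the error rates are collected, one simple way to read them is to look up the number of rounds with the lowest test error. The sketch below shows the idea; the helper name best_rounds is hypothetical, and the numbers in the demo lists are made up for illustration only. They are not results from the run above.

```python
def best_rounds(rounds, errors):
    """Return (number of rounds, error) for the smallest error in the list."""
    best_index = min(range(len(errors)), key=errors.__getitem__)
    return rounds[best_index], errors[best_index]


# Made-up values for illustration only, not measured results
demo_rounds = [10, 20, 30, 40]
demo_err_test = [0.30, 0.22, 0.19, 0.21]

print(best_rounds(demo_rounds, demo_err_test))
```

With the lists from the loop, the call would be best_rounds(list(x_range), err_test).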
Next, we will look into the Multi-Class AdaBoost algorithm by Trevor Hastie and Ji Zhu and try to implement it in the coming articles.