CS1675: Homework 8

Due: 11/29/2018, 11:59pm

This assignment is worth 50 points.

In this exercise, you will implement a decision stump (a very basic classifier) and a boosting algorithm. You will also complete an exercise to help review basic probability, in preparation for discussing probabilistic graphical models.


Part I: Decision stumps (15 points)

Implement a set of decision stumps in a function decision_stump_set.

Instructions:
Inputs:
Outputs:
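The full spec for decision_stump_set (its inputs and outputs) is not shown above, so the following is only a minimal Python sketch (the course scripts themselves are in MATLAB). It assumes each stump thresholds a single feature at that feature's weighted mean, tries both polarities, and returns the stump with the lowest weighted error; the exact stump family and return values are assumptions.

```python
import numpy as np

def decision_stump_set(X, y, w):
    """Pick the best single-feature threshold stump under sample weights w.

    One candidate stump per feature: threshold at the weighted mean of that
    feature, predicting +1 above the threshold and -1 below (plus the
    polarity-flipped version). Labels y are assumed to be in {-1, +1}.
    """
    n, d = X.shape
    best = None
    for j in range(d):
        thresh = np.average(X[:, j], weights=w)
        for polarity in (1, -1):
            pred = np.where(X[:, j] > thresh, polarity, -polarity)
            err = np.sum(w[pred != y])          # weighted misclassification
            if best is None or err < best[3]:
                best = (j, thresh, polarity, err)
    j, thresh, polarity, err = best
    preds = np.where(X[:, j] > thresh, polarity, -polarity)
    return j, thresh, polarity, err, preds
```

On a trivially separable 1-D dataset, the stump that splits at the weighted mean achieves zero weighted error.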
Part II: AdaBoost (20 points)

In a function adaboost, implement the AdaBoost method defined on pages 658-659 in Bishop (Section 14.3). Use decision stumps as your weak classifiers. If a classifier produces an α value less than 0, set that α to 0 (which effectively discards the classifier) and exit the iteration loop.

Instructions:
  1. [3 pts] Initialize all weights to 1/N. Then iterate:
  2. [7 pts] Find the best decision stump, and evaluate the quantities ε and α.
  3. [7 pts] Recompute and normalize the weights.
  4. [3 pts] Compute the final labels on the test set, using all classifiers (one per iteration).
Inputs:
Outputs:
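The four steps above, combined with the early-exit rule, can be sketched as follows. This is a hedged Python illustration of Bishop's loop (eqs. 14.15-14.19), not the required MATLAB submission; the function signature and the inlined weak learner are assumptions.

```python
import numpy as np

def fit_stump(X, y, w):
    # Minimal weak learner: threshold each feature at its weighted mean,
    # try both polarities, keep the stump with the lowest weighted error.
    best = None
    for j in range(X.shape[1]):
        t = np.average(X[:, j], weights=w)
        for s in (1, -1):
            pred = np.where(X[:, j] > t, s, -s)
            err = w[pred != y].sum()
            if best is None or err < best[0]:
                best = (err, j, t, s)
    return best

def adaboost(X_train, y_train, X_test, n_iters):
    """AdaBoost sketch (Bishop Sec. 14.3); labels must be in {-1, +1}."""
    n = len(y_train)
    w = np.full(n, 1.0 / n)           # step 1: uniform initial weights
    score = np.zeros(len(X_test))     # running weighted vote on the test set
    for _ in range(n_iters):
        eps, j, t, s = fit_stump(X_train, y_train, w)   # step 2: best stump
        eps = max(eps, 1e-12)                           # guard against log(inf)
        alpha = np.log((1 - eps) / eps)                 # Bishop eq. (14.17)
        if alpha < 0:                 # worse than chance: discard this
            break                     # classifier and stop, per the assignment
        pred_train = np.where(X_train[:, j] > t, s, -s)
        w *= np.exp(alpha * (pred_train != y_train))    # step 3: reweight...
        w /= w.sum()                                    # ...and normalize
        score += alpha * np.where(X_test[:, j] > t, s, -s)
    return np.sign(score)             # step 4: final test labels (eq. 14.19)
```

On separable 1-D data, a few boosting rounds reproduce the correct test labels.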
Part III: Testing boosting on Pima Indians (10 pts)

In a script adaboost_demo.m, test the performance of your AdaBoost method on the Pima Indians dataset. Use the train/test split code (10-fold cross-validation) from HW4, and convert all 0 labels to -1. Try 10, 20, and 50 iterations. Compute and report (in report.pdf/docx) the accuracy on the test set, using the final test set labels computed above.
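Two bits of bookkeeping in the demo are easy to get wrong: mapping the dataset's 0/1 labels to the -1/+1 convention AdaBoost expects, and computing test accuracy from the final boosted labels. A tiny Python sketch of both (the actual script is adaboost_demo.m, and the HW4 cross-validation helper is not reproduced here):

```python
import numpy as np

def to_pm1(y):
    """Map 0/1 class labels to -1/+1, as the assignment requires."""
    return np.where(np.asarray(y) == 0, -1, 1)

def accuracy(y_true, y_pred):
    """Fraction of test examples whose final boosted label matches the truth."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

In the demo you would apply to_pm1 once to the whole label vector, then average accuracy over the 10 folds for each iteration count (10, 20, 50) and report those numbers.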


Part IV: Probability review (5 points)

In your report file, complete Bishop Exercise 1.3. Show your work.


Submission: Please include the following files: