Beginning Classification in Machine Learning
A baby-steps guide using a Decision Tree Classifier in Python
with interactive “Run & Try” panels for students.
🎓 In this mini-lab, you’ll train a Decision Tree to classify marks into 1st, 2nd, 3rd, or Fail divisions. Every section expands to show code you can copy or download and run yourself.
Step 1 — Import required libraries
from sklearn import tree
import matplotlib.pyplot as plt
import numpy as np
💡 Make sure you have installed the required packages: pip install scikit-learn matplotlib numpy
Step 2 — Define helper functions
def classifications(n):
    # Map a mark (0-100) to a division code: 1 = 1st, 2 = 2nd, 3 = 3rd, 4 = fail.
    if n < 40: return 4
    if n < 50: return 3
    if n < 60: return 2
    return 1

def divisions(code):
    # Map a division code back to its text label.
    return {1: "1st", 2: "2nd", 3: "3rd", 4: "fail"}.get(int(code), "unknown")
These functions convert marks → division codes and back to text.
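A quick sanity check of the two helpers on a few boundary marks (this snippet is just an illustration, assuming the functions above are defined as shown):
# Marks right at the cut-offs should flip to the next division.
for m in (39, 40, 55, 72):
    code = classifications(m)
    print(m, "->", code, divisions(code))
# Expected: 39 -> 4 fail, 40 -> 3 3rd, 55 -> 2 2nd, 72 -> 1 1st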
Step 3 — Prepare data
inputmarks = [50,45,66,7,89,21,39,40,89]
inputmarks.sort()
marks = [[x] for x in inputmarks]
results = [classifications(x) for x in inputmarks]
print(marks)
print(results)
We sort the marks and wrap each one in its own list so scikit-learn receives the 2-D (samples × features) input it expects.
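If you prefer, the same 2-D shape can be built with the numpy import from Step 1 (a sketch; the list-of-lists version above works just as well):
# Equivalent 2-D input built with NumPy: shape (n_samples, 1).
marks_array = np.array(inputmarks).reshape(-1, 1)
print(marks_array.shape)   # (9, 1)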
Step 4 — Train the Decision Tree
classifier = tree.DecisionTreeClassifier(random_state=0)
model = classifier.fit(marks, results)
print("Classes learned:", classifier.classes_)
Step 5 — Plot training points
plt.plot([m[0] for m in marks], results, '-o', label='Training')
plt.scatter([m[0] for m in marks], results, color='red')
plt.yticks([1,2,3,4],['1st','2nd','3rd','fail'])
plt.xlabel('Marks'); plt.ylabel('Division')
plt.title('Marks → Division')
plt.grid(True); plt.legend()
plt.show()
Step 6 — Predict and see probabilities
for i in range(101):
    value = [[i]]
    result = model.predict(value)
    prob = model.predict_proba(value)
    print(f"Marks {i:3d}: → {divisions(result[0])} | Probs {prob[0]}")
This loop prints the predicted division and the class probabilities for every mark from 0 to 100.
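Instead of looping over every value, you can also predict a whole batch in one call. Here is a small sketch that checks only the marks around the pass boundaries:
# Predict a batch of boundary marks in a single call.
boundary = [[39], [40], [49], [50], [59], [60]]
print(model.predict(boundary))          # predicted division codes
print(model.predict_proba(boundary))    # one probability row per mark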
Step 7 — Visualize predicted divisions
fullmarks = [[i] for i in range(101)]
preds = model.predict(fullmarks)
plt.plot(range(101), preds, color='green', label='Predicted')
plt.scatter([m[0] for m in marks], results, color='brown')
plt.yticks([1,2,3,4],['1st','2nd','3rd','fail'])
plt.xlabel('Marks'); plt.ylabel('Division')
plt.title('Predicted Marks → Division')
plt.legend(); plt.grid(); plt.show()
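Another way to visualize what the model learned is to draw the tree itself. This is a sketch using scikit-learn's plot_tree (available in recent versions):
# Draw the fitted tree: each node shows its split threshold and class counts.
plt.figure(figsize=(8, 5))
tree.plot_tree(model, feature_names=["marks"],
               class_names=["1st", "2nd", "3rd", "fail"], filled=True)
plt.show()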
Step 8 — Experiment with different criteria
def classifications(n):
    if n < 50: return 4
    if n < 70: return 2
    return 1

# Recalculate results and retrain to see new divisions.
Stricter pass criteria — more students fail. Try comparing plots!
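Putting Step 8 together, here is a minimal sketch of the retraining, assuming the stricter classifications function above has replaced the original one:
# Recompute labels with the stricter criteria and fit a fresh tree.
strict_results = [classifications(x) for x in inputmarks]
strict_model = tree.DecisionTreeClassifier(random_state=0).fit(marks, strict_results)

# Compare predictions over the full 0-100 range against the first model.
strict_preds = strict_model.predict(fullmarks)
plt.plot(range(101), preds, label='Original criteria')
plt.plot(range(101), strict_preds, label='Stricter criteria')
plt.yticks([1, 2, 3, 4], ['1st', '2nd', '3rd', 'fail'])
plt.xlabel('Marks'); plt.ylabel('Division')
plt.legend(); plt.grid(True); plt.show()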
Download full runnable program
One click to get the entire script containing all steps.