
Beginning Classification in Machine Learning

A baby-steps guide using a Decision Tree Classifier in Python, with interactive “Run & Try” panels for students.

🎓 In this mini-lab, you’ll train a Decision Tree to classify marks into 1st, 2nd, 3rd, or Fail divisions. Every section expands to show code you can copy or download and run yourself.

Step 1 — Import required libraries
from sklearn import tree
import matplotlib.pyplot as plt
import numpy as np

💡 Make sure you have installed: pip install scikit-learn matplotlib
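If you are not sure whether the install worked, this optional check (not part of the original steps) prints the installed versions:

# Optional check: confirm the libraries are available before starting.
import sklearn
import matplotlib
print("scikit-learn:", sklearn.__version__)
print("matplotlib:", matplotlib.__version__)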

Step 2 — Define helper functions
def classifications(n):
    # Map a mark (0-100) to a division code: 1 = 1st, 2 = 2nd, 3 = 3rd, 4 = fail.
    if n < 40:
        return 4
    if n < 50:
        return 3
    if n < 60:
        return 2
    return 1

def divisions(code):
    # Map a division code back to its text label.
    return {1: "1st", 2: "2nd", 3: "3rd", 4: "fail"}.get(int(code), "unknown")
These functions convert marks → division codes and back to text.
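As a quick sanity check, you can call the helpers directly; the expected outputs below follow the bands defined above:

# Quick check of the helper functions.
print(classifications(55))   # 2  (marks in [50, 60) are 2nd division)
print(divisions(2))          # "2nd"
print(classifications(35))   # 4  (marks below 40 are a fail)
print(divisions(4))          # "fail"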
Step 3 — Prepare data
inputmarks = [50, 45, 66, 7, 89, 21, 39, 40, 89]
inputmarks.sort()
marks = [[x] for x in inputmarks]
results = [classifications(x) for x in inputmarks]
print(marks)
print(results)

We sort the marks and wrap each one in its own list so the input has the 2-D shape (n_samples, n_features) that scikit-learn expects.
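The same 2-D shape can also be built with NumPy (imported in Step 1); this is an optional, equivalent sketch rather than part of the original code:

# Equivalent preparation with NumPy: one column, one row per student.
marks_arr = np.array(inputmarks).reshape(-1, 1)
print(marks_arr.shape)   # (9, 1)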

Step 4 — Train the Decision Tree
classifier = tree.DecisionTreeClassifier(random_state=0)
model = classifier.fit(marks, results)
print("Classes learned:", classifier.classes_)
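If you want to peek inside the fitted tree, scikit-learn's export_text prints the learned split thresholds as plain-text rules; this is an optional extra, not one of the original steps:

# Print the decision rules the tree learned from the marks.
print(tree.export_text(classifier, feature_names=['marks']))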
Step 5 — Plot training points
plt.plot([m[0] for m in marks], results, '-o', label='Training')
plt.scatter([m[0] for m in marks], results, color='red')
plt.yticks([1, 2, 3, 4], ['1st', '2nd', '3rd', 'fail'])
plt.xlabel('Marks')
plt.ylabel('Division')
plt.title('Marks → Division')
plt.grid(True)
plt.legend()
plt.show()
Step 6 — Predict and see probabilities
for i in range(101):
    value = [[i]]
    result = model.predict(value)
    prob = model.predict_proba(value)
    print(f"Marks {i:3d}: → {divisions(result[0])} | Probs {prob[0]}")

This loop prints predictions for every mark 0–100.
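Instead of reading 101 lines, you can also report only the marks at which the predicted division changes; this short sketch (not in the original post) does that with the same model:

# Print only the marks where the predicted division changes.
prev = None
for i in range(101):
    label = divisions(model.predict([[i]])[0])
    if label != prev:
        print(f"From marks {i}: {label}")
        prev = label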

Step 7 — Visualize predicted divisions
fullmarks = [[i] for i in range(101)]
preds = model.predict(fullmarks)
plt.plot(range(101), preds, color='green', label='Predicted')
plt.scatter([m[0] for m in marks], results, color='brown')
plt.yticks([1, 2, 3, 4], ['1st', '2nd', '3rd', 'fail'])
plt.xlabel('Marks')
plt.ylabel('Division')
plt.title('Predicted Marks → Division')
plt.legend()
plt.grid()
plt.show()
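You can also draw the tree itself rather than its predictions; scikit-learn's plot_tree shows each split and the class chosen in each leaf. This is an optional sketch, not part of the original steps:

# Visualize the structure of the trained decision tree.
plt.figure(figsize=(10, 6))
tree.plot_tree(classifier, feature_names=['marks'],
               class_names=['1st', '2nd', '3rd', 'fail'], filled=True)
plt.show()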
Step 8 — Experiment with different criteria
def classifications(n):
    # Stricter bands: below 50 is a fail, 50-69 is 2nd division, 70 and above is 1st.
    if n < 50:
        return 4
    if n < 70:
        return 2
    return 1

# Recalculate results and retrain to see the new divisions.

Stricter pass criteria — more students fail. Try comparing plots!
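A minimal sketch of that comparison, assuming you have run the redefined classifications above (the names new_results and new_model are just illustrative):

# Retrain a second tree with the stricter criteria and compare predictions.
new_results = [classifications(x) for x in inputmarks]
new_model = tree.DecisionTreeClassifier(random_state=0).fit(marks, new_results)

fullmarks = [[i] for i in range(101)]
plt.plot(range(101), model.predict(fullmarks), color='green', label='Original criteria')
plt.plot(range(101), new_model.predict(fullmarks), color='purple', label='Stricter criteria')
plt.yticks([1, 2, 3, 4], ['1st', '2nd', '3rd', 'fail'])
plt.xlabel('Marks')
plt.ylabel('Division')
plt.title('Original vs stricter criteria')
plt.legend()
plt.grid(True)
plt.show()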

Download full runnable program

One click gets you the entire script containing all the steps.


Code on Substack
Code on Colab
