The Model
The classifier is a scikit-learn model trained on the EMNIST Balanced dataset (47 classes: the digits 0–9, the uppercase letters A–Z, and 11 lowercase letters). It is serialised to artifacts/emnist_model.joblib and loaded lazily on the first call, then cached with @lru_cache so it isn't reloaded on every request.

from functools import lru_cache
import os

import joblib


@lru_cache(maxsize=1)
def load_model():
    path = os.path.join(
        os.path.dirname(__file__), "..", "artifacts", "emnist_model.joblib"
    )
    return joblib.load(os.path.abspath(path))

Preprocessing the Canvas Input
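The browser sends the canvas as a base64-encoded PNG, optionally wrapped in a data: URL. The backend decodes it, converts it to grayscale, rotates it to match EMNIST's orientation, crops to the bounding box of the drawn pixels, resizes to 28×28, and returns a flattened 784-element vector normalised to [0, 1].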

import base64
import io

import numpy as np
from PIL import Image


def preprocess(b64_data: str) -> np.ndarray:
    # Drop an optional "data:image/png;base64," prefix before decoding.
    img_bytes = base64.b64decode(b64_data.split(",", 1)[-1])
    img = Image.open(io.BytesIO(img_bytes)).convert("L")

    # Rotate 90 degrees clockwise (PIL angles are counter-clockwise) to match EMNIST orientation
    img = img.rotate(-90, expand=True)
    arr = np.array(img)

    # Crop to the bounding box of pixels brighter than the background threshold.
    rows = np.any(arr > 10, axis=1)
    cols = np.any(arr > 10, axis=0)
    if not rows.any():
        raise ValueError("canvas is empty")
    rmin, rmax = np.where(rows)[0][[0, -1]]
    cmin, cmax = np.where(cols)[0][[0, -1]]
    arr = arr[rmin:rmax+1, cmin:cmax+1]

    img = Image.fromarray(arr).resize((28, 28), Image.LANCZOS)
    return np.array(img).flatten().astype("float32") / 255.0

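The bounding-box crop is the least obvious step, so here is a minimal sketch of it on a toy array. The helper name crop_to_ink and the 5×6 canvas are made up for illustration; the threshold of 10 is the one used in preprocess.

```python
import numpy as np


def crop_to_ink(arr: np.ndarray, threshold: int = 10) -> np.ndarray:
    """Return the smallest sub-array containing every pixel above threshold."""
    rows = np.flatnonzero(np.any(arr > threshold, axis=1))
    cols = np.flatnonzero(np.any(arr > threshold, axis=0))
    if rows.size == 0:
        raise ValueError("canvas is empty")
    return arr[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# 5x6 toy canvas: dark background with a 2x3 bright stroke.
canvas = np.zeros((5, 6), dtype=np.uint8)
canvas[1:3, 2:5] = 255

cropped = crop_to_ink(canvas)
print(cropped.shape)  # (2, 3)
```

Because the test is arr > 10, the crop treats bright pixels as ink. That matches a light stroke on a dark background; a dark-on-light canvas would need inverting first.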
The Predict View
The view is marked CSRF-exempt so the canvas's JavaScript client can POST JSON without sending a CSRF token. It returns the predicted character, its confidence, and the five most probable classes.

import json

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt


@csrf_exempt
def predict(request):
    if request.method != "POST":
        return JsonResponse({"error": "POST only"}, status=405)

    # Malformed JSON, a missing "image" key, or undecodable image data gets a 400.
    try:
        data = json.loads(request.body)
        features = preprocess(data["image"])
    except (ValueError, KeyError, TypeError, AttributeError, OSError):
        return JsonResponse({"error": "invalid image payload"}, status=400)

    model = load_model()
    prediction = model.predict([features])[0]
    proba = model.predict_proba([features])[0]
    classes = model.classes_

    top5 = sorted(zip(classes, proba), key=lambda x: -x[1])[:5]
    # .item() turns NumPy scalars into plain Python values that JSON can encode.
    return JsonResponse({
        "prediction": prediction.item(),
        "confidence": float(max(proba)),
        "top5": [{"label": c.item(), "prob": round(float(p), 4)} for c, p in top5],
    })

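The ranking step and the response shape can be tried without Django or a trained model. The sketch below uses made-up labels and probabilities, and a hypothetical top_k helper. As a simplification, it takes the prediction from the top-ranked entry rather than from a separate model.predict call.

```python
import json


def top_k(classes, proba, k=5):
    """Pair each label with its probability and keep the k most probable."""
    ranked = sorted(zip(classes, proba), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "prob": round(float(p), 4)} for label, p in ranked[:k]]


# Made-up labels and probabilities, for illustration only.
classes = ["A", "B", "C", "D", "E", "F"]
proba = [0.05, 0.61, 0.02, 0.20, 0.09, 0.03]

top5 = top_k(classes, proba)
payload = {
    "prediction": top5[0]["label"],
    "confidence": top5[0]["prob"],
    "top5": top5,
}
print(json.dumps(payload, indent=2))
```

Here every value is already a plain Python str or float, so json.dumps accepts it directly. With a real model the labels and probabilities arrive as NumPy scalars and have to be converted first.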
Tip: EMNIST stores characters rotated 90° relative to how most people draw them. The rotate(-90) step is crucial: skip it and accuracy drops dramatically, even though the model's offline metrics look fine.
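The orientation step can be sanity-checked on a toy array. The sketch below assumes the model was trained on EMNIST arrays as they are commonly distributed, which is stored transposed. The text does not say how the training data was loaded, so treat that as an assumption. It compares a 90° clockwise rotation with a transpose using NumPy only.

```python
import numpy as np

# Toy 2x3 "glyph"; the values just make each pixel identifiable.
upright = np.array([[1, 2, 3],
                    [4, 5, 6]])

rotated = np.rot90(upright, k=-1)  # 90 degrees clockwise, like img.rotate(-90, expand=True)
transposed = upright.T             # rows and columns swapped

print(rotated.tolist())     # [[4, 1], [5, 2], [6, 3]]
print(transposed.tolist())  # [[1, 4], [2, 5], [3, 6]]

# A clockwise rotation followed by a left-right mirror equals a transpose.
assert np.array_equal(np.fliplr(rotated), transposed)
```

A rotation and a transpose differ by a mirror. It is therefore worth rendering one training sample next to one preprocessed canvas image to confirm that the rotation alone reproduces the layout the model was trained on.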