DistillKit

Library/Toolkit • Edge AI • Knowledge Distillation • Quantization • QAT

🎯 Project Overview

DistillKit is a comprehensive PyTorch implementation of knowledge distillation combined with model quantization for efficient deep learning. This project demonstrates how to train compact student models using knowledge from larger teacher models, then apply quantization techniques for deployment optimization.

Key Achievement: A complete model-compression pipeline that retains roughly 97% of the teacher's accuracy while reducing model size by more than 90% (230 MB → 3.8 MB).

🌟 Key Features

  • 🧠 Knowledge Distillation with temperature scaling and soft/hard loss combination
  • 📱 Mobile-Friendly Models using a MobileNetV2 architecture optimized for CIFAR-10
  • Quantization Support with both static and quantization-aware training
  • 📊 Comprehensive Evaluation including accuracy, latency, throughput, and model size
  • 🔧 Modular Design with separate components for easy experimentation
  • 📈 Performance Tracking with detailed metrics and visualization tools

🏗️ Technical Architecture

Pipeline Overview

  1. Teacher Training: Train a large ResNet152 teacher model on CIFAR-10
  2. Knowledge Distillation: Transfer knowledge from the teacher to a smaller MobileNetV2 student
  3. Model Quantization: Apply static and quantization-aware training (QAT) for deployment
  4. Performance Evaluation: Compare accuracy, speed, and model size across all variants
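
The repository's exact model definitions aren't reproduced here, but a minimal sketch of how the teacher and student in steps 1–2 can be set up with torchvision, assuming pretrained backbones with their classifier heads swapped for CIFAR-10's 10 classes (the `build_teacher`/`build_student` helper names are illustrative):

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # CIFAR-10

def build_teacher() -> nn.Module:
    # ResNet152 backbone; final fully-connected layer replaced for 10 classes.
    model = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
    return model

def build_student() -> nn.Module:
    # MobileNetV2 backbone; classifier head replaced for 10 classes.
    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    model.classifier[1] = nn.Linear(model.classifier[1].in_features, NUM_CLASSES)
    return model

teacher, student = build_teacher(), build_student()
```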

Model Performance Comparison

| Model                 | Accuracy | Size   | Inference Time | Throughput |
|-----------------------|----------|--------|----------------|------------|
| Teacher (ResNet152)   | 96.8%    | 230 MB | 45 ms          | 22 FPS     |
| Student (MobileNetV2) | 94.2%    | 14 MB  | 12 ms          | 83 FPS     |
| Quantized Student     | 93.8%    | 3.8 MB | 8 ms           | 125 FPS    |
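
For context, the sketch below shows one way accuracy, model size, latency, and throughput can be measured. The figures in the table come from the project's own evaluation; the `benchmark` helper is an illustrative stand-in, not DistillKit's API:

```python
import io
import time
import torch

@torch.no_grad()
def benchmark(model: torch.nn.Module, loader, device: str = "cpu", warmup: int = 5):
    # Illustrative measurement loop; CPU timing only (GPU timing would need
    # torch.cuda.synchronize() around the forward pass).
    model.eval().to(device)

    # Model size: serialize the weights and count the bytes.
    buf = io.BytesIO()
    torch.save(model.state_dict(), buf)
    size_mb = buf.getbuffer().nbytes / 1e6

    correct, total, latencies = 0, 0, []
    for i, (images, labels) in enumerate(loader):
        images, labels = images.to(device), labels.to(device)
        start = time.perf_counter()
        logits = model(images)
        if i >= warmup:  # ignore warm-up batches when timing
            latencies.append(time.perf_counter() - start)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        total += labels.numel()

    per_batch = sum(latencies) / len(latencies)
    return {
        "accuracy": correct / total,
        "size_mb": size_mb,
        "latency_ms": 1000 * per_batch,
        "throughput_fps": loader.batch_size / per_batch,
    }
```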

🔬 Technical Implementation

Knowledge Distillation Process

  • Temperature Scaling: Adjustable temperature parameter for soft target generation
  • Loss Combination: Weighted combination of hard and soft losses
  • Feature Alignment: Intermediate layer knowledge transfer
  • Training Strategy: Progressive distillation with learning rate scheduling
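
The soft/hard loss combination with temperature scaling typically reduces to the standard Hinton-style objective sketched below; the `temperature=4.0` and `alpha=0.7` defaults are illustrative values, not necessarily DistillKit's settings:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Weighted combination of soft (teacher) and hard (label) losses."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, targets)

    return alpha * soft + (1 - alpha) * hard
```

During student training, the teacher's logits are computed under `torch.no_grad()` and passed in alongside the ground-truth labels for each batch.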

Quantization Techniques

  • Static Quantization: Post-training INT8 quantization
  • Quantization-Aware Training (QAT): Training with quantization simulation
  • Dynamic Quantization: Runtime quantization for inference
  • Custom Quantization: Layer-specific quantization strategies
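
DistillKit's exact quantization configuration isn't shown here, but the standard eager-mode PyTorch workflow for dynamic quantization and QAT looks roughly like the sketch below. It reuses the `student` module from the earlier sketch; the `fbgemm` backend choice and default qconfigs are assumptions:

```python
import torch
from torch import nn
from torch.ao import quantization as tq

# Dynamic quantization: weights are quantized ahead of time,
# activations are quantized on the fly at inference.
dynamic_student = tq.quantize_dynamic(student, {nn.Linear}, dtype=torch.qint8)

# Quantization-aware training (eager mode): fake-quant ops are inserted so
# training sees INT8 rounding, then the model is converted to true INT8.
# (A real model also needs QuantStub/DeQuantStub wrapping and module fusion.)
student.train()
student.qconfig = tq.get_default_qat_qconfig("fbgemm")  # x86 server/desktop backend
qat_student = tq.prepare_qat(student)

# ... fine-tune qat_student for a few epochs, e.g. with the distillation loss ...

qat_student.eval()
int8_student = tq.convert(qat_student)
```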

📊 Results & Analysis

Compression Results

  • Size Reduction: 90%+ model size reduction (230MB → 3.8MB)
  • Speed Improvement: 5.6x faster inference
  • Accuracy Retention: 97% of original teacher accuracy maintained
  • Memory Efficiency: 85% reduction in memory footprint

Deployment Benefits

  • Edge Device Compatible: Optimized for mobile and embedded systems
  • Real-time Performance: Achieves real-time inference on resource-constrained devices
  • Energy Efficient: Reduced computational requirements for battery-powered devices
  • Scalable Architecture: Easy integration into larger ML pipelines

🚀 Applications & Use Cases

Computer Vision Applications

  • Image Classification: Efficient image recognition systems
  • Object Detection: Lightweight detection for mobile apps
  • Edge AI: Deployment on IoT and embedded devices
  • Real-time Processing: Video analysis with minimal latency

Educational Use

  • ML Research: Framework for distillation experiments
  • Academic Projects: Teaching tool for model compression
  • Benchmarking: Standardized evaluation of compression techniques
  • Prototyping: Rapid development of efficient models

DistillKit provides a comprehensive framework for researchers and practitioners to explore and implement state-of-the-art model compression techniques, making advanced AI accessible on resource-constrained devices.
