madhavkarthi committed
Commit c0b5fbb · verified · 1 Parent(s): 88fed6d

Create README.md

Files changed (1)
  1. README.md +202 -6
README.md CHANGED
@@ -1,12 +1,208 @@
1
  ---
2
- title: 24679 HW3 Q2
3
- emoji: 🍃
4
- colorFrom: purple
5
- colorTo: purple
6
  sdk: gradio
7
- sdk_version: 5.47.2
8
  app_file: app.py
9
  pinned: false
10
  ---
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: Tomato Classifier
3
+ emoji: 🍅
4
+ colorFrom: red
5
+ colorTo: green
6
  sdk: gradio
7
+ sdk_version: 4.0.0
8
  app_file: app.py
9
  pinned: false
10
+ license: mit
11
+ short_description: Binary image classifier to detect tomatoes using MobileNetV3
12
+ tags:
13
+ - image-classification
14
+ - pytorch
15
+ - mobilenet
16
+ - food
17
+ - binary-classification
18
+ - computer-vision
19
  ---
20
 
21
+ # 🍅 Tomato vs Not-Tomato Classifier
22
+
23
+ An interactive web application for classifying images as tomato or not-tomato using a MobileNetV3-Small neural network trained with AutoML.
24
+
25
+ ## 🎯 Overview
26
+
27
+ This Gradio application provides a user-friendly interface for a binary image classifier that predicts whether an image contains a tomato. The model was trained using AutoML techniques (Optuna) on a small food dataset as part of a machine learning course assignment.
28
+
29
+ ## 🚀 Features
30
+
31
+ - **Image Upload**: Support for PNG/JPG files up to 10MB
32
+ - **Multiple Input Sources**: Upload from file, webcam, or clipboard
33
+ - **Real-time Preview**: View both original and preprocessed images
34
+ - **Confidence Visualization**: Interactive bar chart showing class probabilities
35
+ - **Adjustable Threshold**: Control minimum confidence for predictions
36
+ - **Example Images**: Pre-loaded examples to test the model
37
+ - **Graceful Error Handling**: Validates file types and sizes with helpful error messages
38
+
39
+ ## 🤖 Model Information
40
+
41
+ ### Architecture
42
+ - **Base Model**: MobileNetV3-Small (pretrained on ImageNet, fine-tuned)
43
+ - **Task**: Binary classification (0 = not_tomato, 1 = tomato)
44
+ - **Input Size**: 224×224 pixels
45
+ - **Dropout**: 0.476
46
+ - **Final Layers**: Custom classifier with dropout regularization
47
+
48
+ ### Training Details
49
+ - **Framework**: PyTorch 2.4.1
50
+ - **AutoML**: Optuna with 10 trials, pruning enabled
51
+ - **Optimizer**: AdamW
52
+ - **Learning Rate**: 1.186×10⁻⁵
53
+ - **Weight Decay**: 0.000433
54
+ - **Batch Size**: 16
55
+ - **Early Stopping**: Patience of 6 epochs on validation F1
56
+ - **Seed**: 42 (for reproducibility)
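The early-stopping rule listed above (patience of 6 epochs on validation F1) can be sketched as follows. This is a minimal illustration, not the project's training code: the `EarlyStopping` class and the F1 values are hypothetical, and the real run may count "improvement" differently (for example with a minimum delta).

```python
class EarlyStopping:
    """Stop training when a monitored score (higher is better) stops improving."""

    def __init__(self, patience=6):
        self.patience = patience
        self.best_score = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, score):
        """Record one epoch's validation score; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Hypothetical validation F1 values, one per epoch (not taken from the real run).
val_f1_history = [0.50, 0.62, 0.70, 0.69, 0.70, 0.68, 0.66, 0.70, 0.65, 0.64]

stopper = EarlyStopping(patience=6)
stopped_at = None
for epoch, f1 in enumerate(val_f1_history, start=1):
    if stopper.step(f1):
        stopped_at = epoch
        break

print(f"best F1 = {stopper.best_score:.2f}, stopped at epoch {stopped_at}")
```

In this sketch a tie with the best score does not reset the counter, so training stops six epochs after the last strict improvement.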
57
+
58
+ ### Performance Metrics
59
+ - **Test Accuracy**: 83%
60
+ - **Test F1 Score**: 0.80
61
+ - **Dataset Size**: ~30 images in total (very small)
62
+ - **Data Split**: 60/20/20 (train/val/test)
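To make the reported accuracy and F1 concrete, the sketch below computes both from confusion-matrix counts. The README does not publish those counts, so the ones used here are hypothetical: they assume a test split of about 6 images (20% of ~30) and were chosen only because they reproduce roughly 83% accuracy and an F1 of 0.80.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 for the positive (tomato) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for a 6-image test split (not the project's actual results).
metrics = binary_metrics(tp=2, fp=1, fn=0, tn=3)
print({name: round(value, 3) for name, value in metrics.items()})
```

With a test set this small, a single misclassified image moves accuracy by about 17 percentage points, so the headline numbers carry wide uncertainty.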
63
+
64
+ ## 📊 Dataset
65
+
66
+ - **Source**: [Iris314/Food_tomatoes_dataset](https://huggingface.co/datasets/Iris314/Food_tomatoes_dataset)
67
+ - **Size**: Approximately 30 images total
68
+ - **Classes**: Binary (tomato / not-tomato)
69
+ - **Stratification**: Stratified splits to maintain class balance
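A stratified 60/20/20 split with a fixed seed could look like the sketch below. It uses only the standard library and is an illustration of the idea, not the code used for this project; the function name `stratified_split` and the 15/15 class balance are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=42):
    """Return (train, val, test) index lists that preserve per-class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


# Hypothetical label list: 15 tomato (1) and 15 not-tomato (0) images.
labels = [1] * 15 + [0] * 15
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting each class separately keeps the tomato/not-tomato ratio the same in all three subsets, which matters when each subset holds only a handful of images.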
70
+
71
+ ## 🔧 Preprocessing Pipeline
72
+
73
+ ### Training Augmentations
74
+ - Random resized crop
75
+ - Horizontal flip (p=0.5)
76
+ - Color jitter
77
+ - Normalization (ImageNet statistics)
78
+
79
+ ### Evaluation Transforms
80
+ 1. **Resize**: 256×256 pixels
81
+ 2. **Center Crop**: 224×224 pixels
82
+ 3. **Normalize**:
83
+ - Mean: [0.485, 0.456, 0.406]
84
+ - Std: [0.229, 0.224, 0.225]
85
+
86
+ The application displays both the original image and the preprocessed version that the model actually processes, helping users understand how the model "sees" the input.
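The arithmetic behind the evaluation transforms can be worked through in plain Python. The sketch below computes the center-crop box for a 256×256 image and normalizes one pixel with the ImageNet statistics listed above. It assumes pixel values are first scaled from 0-255 to [0, 1], which is the usual convention for these statistics but is not stated in this README; the helper names are hypothetical.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def center_crop_box(width, height, crop):
    """Return (left, top, right, bottom) of a crop x crop box centered in the image."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    return [
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    ]


box = center_crop_box(256, 256, 224)      # 16-pixel margin on every side
red_pixel = normalize_pixel((255, 0, 0))  # a pure red pixel after normalization
print(box, [round(channel, 3) for channel in red_pixel])
```

After normalization the values are no longer in a displayable range, which is why a preview of the preprocessed image typically has to undo or rescale this step before showing it.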
87
+
88
+ ## 📈 Usage Guide
89
+
90
+ ### Basic Classification
91
+ 1. Upload an image using the file uploader, webcam, or paste from clipboard
92
+ 2. Click "Classify Image" to get predictions
93
+ 3. View results including:
94
+ - Predicted class (Tomato or Not Tomato)
95
+ - Confidence score
96
+ - Probability distribution
97
+ - Visual confidence chart
98
+
99
+ ### Advanced Options
100
+ - **Confidence Threshold**: Adjust the minimum confidence required (default: 50%)
101
+ - **Show Preprocessing**: Toggle display of preprocessed image to see model input
102
+ - **Examples**: Click example images to quickly test the model
103
+
104
+ ## ⚠️ Limitations & Known Issues
105
+
106
+ ### Dataset Limitations
107
+ - **Very Small Dataset**: Only ~30 images in total (and fewer for training after the 60/20/20 split), which increases overfitting risk
108
+ - **Limited Diversity**: May not generalize well to unusual tomato varieties or presentations
109
+
110
+ ### Known Failure Modes
111
+ The model may struggle with:
112
+ - Cartoon or illustrated tomatoes
113
+ - Extreme viewing angles
114
+ - Heavy shadows or overexposure
115
+ - Multiple food items in one image
116
+ - Cherry tomatoes or heirloom varieties
117
+ - Processed tomato products (sauce, paste, soup)
118
+ - Out-of-distribution backgrounds
119
+
120
+ ### Performance Considerations
121
+ - Background and lighting variations can bias predictions
122
+ - Not suitable for production or consequential decisions
123
+ - Educational demonstration only
124
+
125
+ ## 🔗 Links & Resources
126
+
127
+ - **Model Repository**: [kevinkyi/Homework2_NN](https://huggingface.co/kevinkyi/Homework2_NN)
128
+ - **Dataset**: [Iris314/Food_tomatoes_dataset](https://huggingface.co/datasets/Iris314/Food_tomatoes_dataset)
129
+ - **Framework**: [PyTorch](https://pytorch.org/)
130
+ - **AutoML Tool**: [Optuna](https://optuna.org/)
131
+ - **Model Architecture**: [MobileNetV3](https://arxiv.org/abs/1905.02244)
132
+
133
+ ## 🛠️ Technical Stack
134
+
135
+ - **Frontend**: Gradio 4.x
136
+ - **Backend**: PyTorch 2.x, TorchVision
137
+ - **Model Loading**: Hugging Face Hub
138
+ - **Visualization**: Matplotlib
139
+ - **Compute**: CPU inference (no GPU required)
140
+
141
+ ## 📝 Inference Parameters
142
+
143
+ The interface exposes the following key parameters:
144
+
145
+ 1. **Confidence Threshold** (0.0-1.0): Minimum confidence for classification
146
+ 2. **Show Preprocessing** (boolean): Display preprocessed image
147
+ 3. **Input Validation**: Automatic file size and type checking
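One way a confidence threshold can be applied to the model output is sketched below. The README does not say what the app does when the top probability falls below the threshold, so the `"uncertain"` label is an assumption, as is the model emitting two logits in the order (not_tomato, tomato).

```python
import math

CLASS_NAMES = ["not_tomato", "tomato"]  # index 0 and 1, as in the Architecture section


def softmax(logits):
    """Convert raw logits to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.5):
    """Return (label, confidence, probabilities) for one image's logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    # Assumption: below the threshold the prediction is reported as uncertain.
    label = CLASS_NAMES[best] if confidence >= threshold else "uncertain"
    return label, confidence, probs


# Hypothetical logits for a single image.
print(classify([0.0, 2.0], threshold=0.5))
print(classify([0.0, 2.0], threshold=0.9))
```

For a two-class softmax the top probability is never below 0.5, so under these assumptions the default threshold of 50% never rejects a prediction; only higher settings have an effect.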
148
+
149
+ ## 🎓 Educational Context
150
+
151
+ This project was created as part of a machine learning course assignment (Homework 2) to demonstrate:
152
+ - Neural network training with AutoML
153
+ - Transfer learning with pretrained models
154
+ - Hyperparameter optimization with Optuna
155
+ - Model deployment with Gradio
156
+ - Documentation best practices
157
+
158
+ ## 📄 License
159
+
160
+ - **Code & Weights**: MIT License
161
+ - **Dataset**: Follow the original dataset's license terms
162
+ - **Educational Use**: This model is for coursework demonstration only
163
+
164
+ ## 🙏 Acknowledgments
165
+
166
+ - Dataset provided by a classmate (Iris314)
167
+ - AutoML powered by Optuna
168
+ - Pretrained models from TorchVision
169
+ - Trained on Google Colab (T4 GPU)
170
+ - GenAI tools assisted with documentation and boilerplate code
171
+
172
+ ## ⚡ Quick Start
173
+
174
+ To run locally:
175
+
176
+ ```bash
177
+ # Clone the space
178
+ git clone https://huggingface.co/spaces/YOUR_USERNAME/tomato-classifier
179
+
180
+ # Install dependencies
181
+ pip install -r requirements.txt
182
+
183
+ # Run the app
184
+ python app.py
185
+ ```
186
+
187
+ The application will automatically download the model weights from Hugging Face Hub on first run.
188
+
189
+ ## 🐛 Troubleshooting
190
+
191
+ **Model won't load?**
192
+ - Ensure you have an internet connection for downloading weights
193
+ - Check that all dependencies are installed
194
+ - Verify PyTorch is properly installed
195
+
196
+ **Low accuracy on your images?**
197
+ - The model was trained on a very small dataset (~30 images)
198
+ - Performance may vary significantly on images that differ from the training data
199
+ - Try adjusting lighting and background for better results
200
+
201
+ **File upload errors?**
202
+ - Ensure image is under 10MB
203
+ - Supported formats: PNG, JPG, JPEG
204
+ - Try converting or compressing large images
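The upload checks described above (file type and size) might be implemented along these lines. This is a sketch under assumptions: `validate_upload` is a hypothetical helper, "10MB" is read as 10 × 1024 × 1024 bytes, and the file type is judged by extension only, which may differ from what the app actually does.

```python
import os

MAX_BYTES = 10 * 1024 * 1024  # assumption: the 10MB limit interpreted as 10 MiB
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def validate_upload(filename, size_bytes):
    """Return an error message for a rejected upload, or None when it passes."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"Unsupported file type '{extension or 'none'}'. Use PNG, JPG, or JPEG."
    if size_bytes > MAX_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        return f"File is {size_mb:.1f} MB; the limit is 10 MB."
    return None


# Hypothetical uploads: one valid, one wrong type, one too large.
print(validate_upload("tomato.JPG", 2_000_000))
print(validate_upload("notes.txt", 10))
print(validate_upload("big.png", 11 * 1024 * 1024))
```

Returning a message instead of raising keeps the caller free to show the text in the interface, which fits the "helpful error messages" behavior listed under Features.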
205
+
206
+ ---
207
+
208
+ **Note**: This is an educational project demonstrating ML deployment practices. It should not be used for production applications or any consequential decision-making.