Facial Expression Classification Using CNNs
Overview
The goal of this project was to build an image classification model that recognizes facial expressions using deep learning. The dataset contains face images labeled by emotion, and the task was to classify each image into its emotion class. The final model was a ConvNeXt-Tiny pretrained on ImageNet-22k, which reached about 0.72 accuracy on the held-out 'test' folder.
Data
Dataset downloaded here: https://www.kaggle.com/datasets/msambare/fer2013/data
The dataset consists of images grouped into emotion-specific subfolders for seven standard emotions: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral. A 'train' folder contains the seven emotion subfolders, each holding the images for that emotion, and a 'test' folder with the same layout is used to evaluate the model. In this notebook the 'test' folder also serves as the validation set for early stopping.
Albumentations was used for data augmentation. Training transformations included resizing, horizontal flips, small shifts, scaling and rotations, color jittering, and normalization. For the validation data, only resizing and normalization were applied.
Methods & Materials
We implemented an AlbumentationsDataset class (a subclass of torchvision's ImageFolder) to load images from the subfolders and apply the transforms, and used a PyTorch DataLoader to iterate over the datasets in mini-batches. For exploratory data analysis, we displayed sample images and their labels using matplotlib.
We first used a pretrained EfficientNet-B3 with resizing, flips, rotation, color jitter, and normalization as augmentations, cross-entropy loss as the criterion, and an AdamW optimizer. With the CosineAnnealingLR scheduler, the model reached an accuracy of about 70% on the validation set; training ran for 9 epochs before early stopping triggered with a patience of 2.
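To make the learning-rate schedule concrete, here is a minimal sketch of the closed-form cosine annealing expression, using the same base learning rate (3e-4) and T_max (10) as the training cells. The function name `cosine_annealing_lr` is ours, and the sketch assumes `eta_min = 0` and one scheduler step per epoch; it is an illustration of the formula, not a replacement for PyTorch's scheduler.

```python
import math

def cosine_annealing_lr(epoch, base_lr=3e-4, t_max=10, eta_min=0.0):
    """Closed-form cosine annealing learning rate for a given epoch index."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)

# Print the rate at a few epochs, including some past T_max
for epoch in (0, 5, 10, 15, 20):
    print(f"epoch {epoch:2d}: lr = {cosine_annealing_lr(epoch):.2e}")
```

Under this formula the rate reaches its minimum at epoch 10 and then climbs again, so a run that continues past T_max trains at a rising learning rate. That may be worth keeping in mind for runs longer than 10 epochs.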
To attempt to break past 0.70, we added label smoothing and increased patience to 3 and reran the model, which actually ended up performing marginally worse than before.
For our final model, we used ConvNeXt-Tiny from timm (PyTorch Image Models), pretrained on ImageNet-22k. We replaced the classification head to output 7 emotion classes and fine-tuned with cross-entropy loss and the AdamW optimizer, using early stopping to keep the best model.
The best model was saved to a PTH file.
For Evaluation, we used classification reports and confusion matrices.
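As a rough illustration of how the numbers in a classification report relate to a confusion matrix, the sketch below derives per-class precision, recall, and F1 in plain Python. The helper `per_class_metrics` and the 3-class matrix are hypothetical (the counts are made up and are not results from this project); the notebook itself relies on scikit-learn for this.

```python
def per_class_metrics(cm, class_names):
    """Return {class: (precision, recall, f1)} from a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    metrics = {}
    for k, name in enumerate(class_names):
        tp = cm[k][k]
        predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
        actual_k = sum(cm[k])                    # row sum: everything that truly is k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = (precision, recall, f1)
    return metrics

# Toy 3-class matrix with made-up counts (not project results)
toy_cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
toy_metrics = per_class_metrics(toy_cm, ["happy", "sad", "neutral"])
for name, (p, r, f1) in toy_metrics.items():
    print(f"{name:8s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```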
Outcomes
ConvNeXt-Tiny achieved higher accuracy (0.72 vs. 0.70) and an equal or higher F1 score in every class compared to EfficientNet-B3. The F1 scores for each class were: Angry: 0.65, Disgust: 0.73, Fear: 0.59, Happy: 0.90, Neutral: 0.69, Sad: 0.60, Surprise: 0.83.
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Angry | 0.64 | 0.66 | 0.65 | 958 |
Disgust | 0.79 | 0.68 | 0.73 | 111 |
Fear | 0.62 | 0.56 | 0.59 | 1024 |
Happy | 0.89 | 0.90 | 0.90 | 1774 |
Neutral | 0.67 | 0.70 | 0.69 | 1233 |
Sad | 0.60 | 0.60 | 0.60 | 1247 |
Surprise | 0.83 | 0.84 | 0.83 | 831 |
Accuracy | | | 0.72 | 7178 |
Macro Avg | 0.72 | 0.71 | 0.71 | 7178 |
Weighted Avg | 0.72 | 0.72 | 0.72 | 7178 |
The model was best at classifying happiness and surprise, possibly because these expressions have very distinctive features, such as a big smile or wide-open eyes and mouth, and are therefore easier for the model to read.
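The macro and weighted averages in the report can be checked by hand from the per-class F1 scores and supports. The short sketch below does that arithmetic using the rounded ConvNeXt-Tiny values from the report, so the results agree with the report only to two decimals.

```python
# (F1, support) per class, copied from the ConvNeXt-Tiny report
report = {
    "Angry": (0.65, 958),
    "Disgust": (0.73, 111),
    "Fear": (0.59, 1024),
    "Happy": (0.90, 1774),
    "Neutral": (0.69, 1233),
    "Sad": (0.60, 1247),
    "Surprise": (0.83, 831),
}

total_support = sum(n for _, n in report.values())

# Macro average: every class counts equally, regardless of size
macro_f1 = sum(f1 for f1, _ in report.values()) / len(report)

# Weighted average: each class counts in proportion to its support
weighted_f1 = sum(f1 * n for f1, n in report.values()) / total_support

print(f"support={total_support} macro F1={macro_f1:.2f} weighted F1={weighted_f1:.2f}")
```

The weighted figure comes out slightly higher than the macro figure because the largest class (Happy) is also the best-classified one.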
Conclusion
This project demonstrated the effectiveness of using modern pretrained convolutional architectures (like ConvNeXt) for facial emotion classification. The use of transfer learning and Albumentations improved generalization despite limited and potentially noisy data.
However, accuracy plateaued at around 0.70 to 0.72, possibly because of confusion between classes with overlapping facial features (e.g., sad vs. neutral). Additionally, the dataset consists of static, low-resolution 2D photos with unbalanced classes.
Discussion
ConvNeXt-Tiny outperformed the EfficientNet-B3 baseline, and pretraining let both models reach most of their validation accuracy within a few epochs. The Albumentations augmentations were intended to limit overfitting, although training accuracy still climbed to about 0.99 while validation accuracy stayed near 0.72. The performance ceiling may be due in part to class imbalance, which could be addressed with techniques such as class weighting or focal loss to reduce bias toward majority classes.
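As one possible way to set up class weighting, the sketch below computes inverse-frequency weights, w_c = N / (K * n_c). The helper name is ours, and the counts are the 'test' folder supports from the report, used only as a stand-in; in practice the weights would be computed from the training-set counts, which are not listed in this notebook.

```python
# Stand-in class counts: supports from the 'test' report (assumption:
# real weights should come from the training-set counts instead)
counts = {
    "angry": 958,
    "disgust": 111,
    "fear": 1024,
    "happy": 1774,
    "neutral": 1233,
    "sad": 1247,
    "surprise": 831,
}

def inverse_frequency_weights(class_counts):
    """Weight each class by N / (K * n_c), so rare classes get larger weights."""
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {c: total / (num_classes * n) for c, n in class_counts.items()}

weights = inverse_frequency_weights(counts)
for name, w in weights.items():
    print(f"{name:9s} weight={w:.3f}")
```

The resulting values could then be passed, in class-index order, as a tensor to the `weight` argument of `nn.CrossEntropyLoss`. We did not try this in the notebook, so its effect on accuracy here is untested.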
Future Improvements
Additional improvements could include adopting Vision Transformers to better capture global spatial features, integrating attention mechanisms to focus on emotionally salient facial regions, or enriching the model input with facial landmarks or action units (AUs)—specific muscle movements like brow raises or lip corner pulls—that provide structured, expression-relevant cues.
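# Show a few random example pictures (brought this cell up to the front just for show)
import numpy as np
import matplotlib.pyplot as plt

# Helper to unnormalize the pictures so we can show the original (this cell was coded last in the notebook)
# A.Normalize() defaults to the ImageNet mean/std, so we reverse those rather than 0.5/0.5
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def unnormalize(img_tensor):
    img = img_tensor.clone().permute(1, 2, 0).cpu().numpy()  # CHW -> HWC
    img = img * IMAGENET_STD + IMAGENET_MEAN  # reverse normalization
    img = np.clip(img, 0, 1)
    return img

# Assumption: images, labels and emotion_labels were not defined in this cell;
# one way to get them is from a batch of train_loader (defined further down)
images, labels = next(iter(train_loader))
emotion_labels = train_ds.classes

# Plot
plt.figure(figsize=(14, 8))
for i in range(8):
    img = unnormalize(images[i])
    label = emotion_labels[int(labels[i])]
    plt.subplot(2, 4, i + 1)
    plt.imshow(img.squeeze(), cmap='gray' if img.shape[2] == 1 else None)
    plt.title(label)
    plt.axis('off')
plt.tight_layout()
plt.show()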
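For displaying images, the reverse normalization has to match the statistics used by the forward transform. The sketch below shows the round trip on a single RGB pixel in plain Python. It assumes `A.Normalize()` is left at its defaults, which to our knowledge are the ImageNet mean and standard deviation with `max_pixel_value=255`; the helper names are ours.

```python
# Assumed defaults of A.Normalize(): ImageNet statistics, pixels scaled by 255
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD, max_pixel_value=255.0):
    """Forward transform: scale to [0, 1], then subtract mean and divide by std."""
    return tuple((v / max_pixel_value - m) / s for v, m, s in zip(rgb, mean, std))

def unnormalize_pixel(norm, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Reverse transform: multiply by std, add mean, clip to [0, 1] for display."""
    return tuple(min(max(v * s + m, 0.0), 1.0) for v, m, s in zip(norm, mean, std))

pixel = (128, 64, 255)
restored = unnormalize_pixel(normalize_pixel(pixel))
print(restored)  # values in [0, 1], close to pixel / 255
```

Reversing with different constants (for example 0.5 and 0.5) would still produce a viewable image, but with shifted brightness and color.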
Transformations and image loading:
import os
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from torchvision import transforms
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchvision.datasets.folder import default_loader
import numpy as np

class AlbumentationsDataset(ImageFolder):
    def __init__(self, root, transform=None):
        # Keep ImageFolder's own transform disabled; Albumentations is applied in __getitem__
        super().__init__(root, transform=None)
        self.albumentations_transform = transform

    def __getitem__(self, index):
        path, target = self.samples[index]
        sample = default_loader(path)  # PIL image, converted to RGB
        sample = np.array(sample)      # Albumentations expects a NumPy array
        if self.albumentations_transform:
            sample = self.albumentations_transform(image=sample)["image"]
        return sample, target

# Transforms
train_transform = A.Compose([
    A.Resize(224, 224),
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=10),
    A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    A.Normalize(),
    ToTensorV2()
])
val_transform = A.Compose([
    A.Resize(224, 224),
    A.Normalize(),
    ToTensorV2()
])

train_ds = AlbumentationsDataset("train", transform=train_transform)
val_ds = AlbumentationsDataset("test", transform=val_transform)
train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64, shuffle=False)
import timm
import torch.nn as nn
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = timm.create_model("efficientnet_b3", pretrained=True, num_classes=7)
model = model.to(device)
We first ran a model that trained for 20 epochs over about 20 hours and peaked at around epoch 10. That code did not include early stopping, so we scrapped the whole block.
Below is a better training loop that tracks train and validation accuracy, saves the best model, and stops early when the validation accuracy stops improving.
We set patience to 2, so training stops after 2 consecutive epochs without improvement. Based on the previous run, we expected it to stop after about 10-12 epochs.
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# ---- Setup ----
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
best_val_acc = 0
epochs_no_improve = 0
patience = 2
num_epochs = 20
train_acc_list = []
val_acc_list = []

# ---- Training Loop ----
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0
    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()
    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # ---- Validation ----
    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)
    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)
    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # ---- Early Stopping ----
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_model.pth")
        print("New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")
        if epochs_no_improve >= patience:
            print("Early stopping triggered.")
            break
    scheduler.step()

# ---- Plot Accuracies ----
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Training vs Validation Accuracy")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
Epoch 1/20: 100%|██████████| 449/449 [1:08:56<00:00, 9.21s/it]
Epoch 1, Train Acc: 0.5422, Loss: 563.4186, Val Acc: 0.6310 New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [1:06:05<00:00, 8.83s/it]
Epoch 2, Train Acc: 0.6780, Loss: 392.2814, Val Acc: 0.6636 New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [1:06:54<00:00, 8.94s/it]
Epoch 3, Train Acc: 0.7505, Loss: 304.4303, Val Acc: 0.6808 New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [1:06:14<00:00, 8.85s/it]
Epoch 4, Train Acc: 0.8263, Loss: 217.6914, Val Acc: 0.6796 No improvement. Patience: 1/2
Epoch 5/20: 100%|██████████| 449/449 [1:06:54<00:00, 8.94s/it]
Epoch 5, Train Acc: 0.8932, Loss: 137.3717, Val Acc: 0.6906 New best model saved.
Epoch 6/20: 100%|██████████| 449/449 [1:06:40<00:00, 8.91s/it]
Epoch 6, Train Acc: 0.9360, Loss: 83.2908, Val Acc: 0.6874 No improvement. Patience: 1/2
Epoch 7/20: 100%|██████████| 449/449 [1:08:01<00:00, 9.09s/it]
Epoch 7, Train Acc: 0.9634, Loss: 49.8292, Val Acc: 0.7042 New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [1:07:07<00:00, 8.97s/it]
Epoch 8, Train Acc: 0.9778, Loss: 29.6626, Val Acc: 0.7030 No improvement. Patience: 1/2
Epoch 9/20: 100%|██████████| 449/449 [1:05:42<00:00, 8.78s/it]
Epoch 9, Train Acc: 0.9848, Loss: 21.1281, Val Acc: 0.7040 No improvement. Patience: 2/2 Early stopping triggered.
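To double-check the stopping behavior in the log above, the patience rule can be replayed on the logged validation accuracies without retraining. This is a small sketch in plain Python; the function name `early_stopping_replay` is ours, and it mirrors the rule used in the training loop (a strict improvement resets the counter, and training stops once the counter reaches the patience).

```python
def early_stopping_replay(val_accs, patience):
    """Replay the patience rule; return (stop_epoch, best_epoch, best_acc), epochs 1-indexed."""
    best_acc = float("-inf")
    best_epoch = None
    epochs_no_improve = 0
    for epoch, acc in enumerate(val_accs, start=1):
        if acc > best_acc:
            # Strict improvement: remember it and reset the counter
            best_acc, best_epoch, epochs_no_improve = acc, epoch, 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                return epoch, best_epoch, best_acc
    return len(val_accs), best_epoch, best_acc

# Validation accuracies copied from the EfficientNet-B3 log above
logged_val_acc = [0.6310, 0.6636, 0.6808, 0.6796, 0.6906, 0.6874, 0.7042, 0.7030, 0.7040]
result = early_stopping_replay(logged_val_acc, patience=2)
print(result)  # (9, 7, 0.7042): stopped at epoch 9, best model from epoch 7
```

The saved checkpoint therefore corresponds to epoch 7, which matches the final accuracy of 0.7042 reported after reloading the best model.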
# -------- Model Eval -----------
model.load_state_dict(torch.load("best_model.pth"))  # Load best model
model.eval()
all_preds = []
all_labels = []
with torch.no_grad():
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(1)
        all_preds.extend(preds.cpu().numpy())
        all_labels.extend(y.cpu().numpy())
print(classification_report(all_labels, all_preds, target_names=train_ds.classes))

# Confusion matrix
cm = confusion_matrix(all_labels, all_preds)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=train_ds.classes, yticklabels=train_ds.classes)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
angry | 0.63 | 0.61 | 0.62 | 958 |
disgust | 0.75 | 0.71 | 0.73 | 111 |
fear | 0.57 | 0.54 | 0.55 | 1024 |
happy | 0.89 | 0.87 | 0.88 | 1774 |
neutral | 0.64 | 0.69 | 0.66 | 1233 |
sad | 0.60 | 0.59 | 0.60 | 1247 |
surprise | 0.79 | 0.85 | 0.82 | 831 |
accuracy | | | 0.70 | 7178 |
macro avg | 0.70 | 0.69 | 0.69 | 7178 |
weighted avg | 0.70 | 0.70 | 0.70 | 7178 |
correct = sum([p == t for p, t in zip(all_preds, all_labels)])
print(f"Final Accuracy: {correct / len(all_labels):.4f}")
Final Accuracy: 0.7042
Below we raise patience to 3 and add label smoothing (along with dropout and weight decay) to help generalization and hopefully push past 70% validation accuracy. The augmentation pipeline is the same as before, since the data loaders are reused.
import torch, timm
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt

# --- Model Definition ---
class CustomModel(nn.Module):
    def __init__(self, base_model, num_classes):
        super().__init__()
        self.backbone = base_model
        in_features = self.backbone.classifier.in_features
        self.backbone.classifier = nn.Identity()  # drop the original head
        self.dropout = nn.Dropout(0.3)
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        x = self.backbone(x)
        x = self.dropout(x)
        return self.fc(x)

base_model = timm.create_model("efficientnet_b3a", pretrained=True)
model = CustomModel(base_model, num_classes=7).to(device)

# --- Label Smoothing Loss ---
class LabelSmoothingLoss(nn.Module):
    def __init__(self, classes, smoothing=0.1):
        super().__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.cls = classes

    def forward(self, pred, target):
        pred = pred.log_softmax(dim=-1)
        true_dist = torch.zeros_like(pred)
        # Spread the smoothing mass over the wrong classes, then set the true class
        true_dist.fill_(self.smoothing / (self.cls - 1))
        true_dist.scatter_(1, target.data.unsqueeze(1), self.confidence)
        return torch.mean(torch.sum(-true_dist * pred, dim=-1))

# --- Training Setup ---
criterion = LabelSmoothingLoss(classes=7, smoothing=0.1)
optimizer = optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
best_val_acc = 0
epochs_no_improve = 0
patience = 3
num_epochs = 20
train_acc_list, val_acc_list = [], []

# --- Training Loop ---
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0
    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()
    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # --- Validation ---
    model.eval()
    val_correct, val_total = 0, 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)
    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)
    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # --- Early Stopping ---
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_model_improved.pth")
        print(" New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")
        if epochs_no_improve >= patience:
            print(" Early stopping triggered.")
            break
    scheduler.step()

# --- Plot Accuracy Curves ---
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Training vs Validation Accuracy (Improved)")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
C:\Users\forca\anaconda3\Lib\site-packages\timm\models\_factory.py:138: UserWarning: Mapping deprecated model name efficientnet_b3a to current efficientnet_b3.
  model = create_fn(
Epoch 1/20: 100%|██████████| 449/449 [1:09:48<00:00, 9.33s/it]
Epoch 1, Train Acc: 0.5174, Loss: 645.5958, Val Acc: 0.6245 New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [1:08:04<00:00, 9.10s/it]
Epoch 2, Train Acc: 0.6634, Loss: 525.2005, Val Acc: 0.6654 New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [1:05:10<00:00, 8.71s/it]
Epoch 3, Train Acc: 0.7318, Loss: 468.2296, Val Acc: 0.6786 New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [1:04:41<00:00, 8.64s/it]
Epoch 4, Train Acc: 0.8015, Loss: 414.8496, Val Acc: 0.6888 New best model saved.
Epoch 5/20: 100%|██████████| 449/449 [1:04:42<00:00, 8.65s/it]
Epoch 5, Train Acc: 0.8524, Loss: 370.4793, Val Acc: 0.6865 No improvement. Patience: 1/3
Epoch 6/20: 100%|██████████| 449/449 [1:04:53<00:00, 8.67s/it]
Epoch 6, Train Acc: 0.8983, Loss: 333.6452, Val Acc: 0.6923 New best model saved.
Epoch 7/20: 100%|██████████| 449/449 [1:06:45<00:00, 8.92s/it]
Epoch 7, Train Acc: 0.9255, Loss: 310.8068, Val Acc: 0.6970 New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [1:06:13<00:00, 8.85s/it]
Epoch 8, Train Acc: 0.9426, Loss: 296.1793, Val Acc: 0.6948 No improvement. Patience: 1/3
Epoch 9/20: 100%|██████████| 449/449 [1:04:43<00:00, 8.65s/it]
Epoch 9, Train Acc: 0.9525, Loss: 287.9650, Val Acc: 0.6969 No improvement. Patience: 2/3
Epoch 10/20: 100%|██████████| 449/449 [1:04:45<00:00, 8.65s/it]
Epoch 10, Train Acc: 0.9576, Loss: 283.2365, Val Acc: 0.6956 No improvement. Patience: 3/3 Early stopping triggered.
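To show what the LabelSmoothingLoss class above computes, here is a plain-Python sketch of the same calculation for a single sample: the true class gets 1 - smoothing, the remaining mass is split evenly over the other classes, and the loss is the cross-entropy against that soft target. The function names are ours, and the logits are arbitrary illustrative values.

```python
import math

def smoothed_targets(target, num_classes, smoothing=0.1):
    """Soft target: 1 - smoothing on the true class, the rest split over the others."""
    dist = [smoothing / (num_classes - 1)] * num_classes
    dist[target] = 1.0 - smoothing
    return dist

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum_exp for z in logits]

def label_smoothing_loss(logits, target, smoothing=0.1):
    """Cross-entropy between the smoothed target and the predicted distribution."""
    log_probs = log_softmax(logits)
    dist = smoothed_targets(target, len(logits), smoothing)
    return -sum(t * lp for t, lp in zip(dist, log_probs))

dist = smoothed_targets(target=3, num_classes=7)
print([round(p, 4) for p in dist])

# Arbitrary logits that favor class 3 (illustrative values only)
logits = [0.1, -0.5, 0.3, 4.0, 0.0, -1.0, 0.2]
print(label_smoothing_loss(logits, target=3, smoothing=0.1))
print(label_smoothing_loss(logits, target=3, smoothing=0.0))  # plain cross-entropy
```

Because the smoothed loss never reaches zero, its summed value per epoch is not directly comparable to the plain cross-entropy totals in the first run. PyTorch's `nn.CrossEntropyLoss` also accepts a `label_smoothing` argument; as far as we know it spreads the smoothing mass over all classes including the true one, so it would not give exactly the same numbers as this class.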
Below we try ConvNeXt-Tiny (pretrained on ImageNet-22k) instead of EfficientNet-B3 (loaded above under the deprecated name efficientnet_b3a).
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import timm

# ---- Load ConvNeXt Tiny Backbone ----
model = timm.create_model("convnext_tiny.fb_in22k", pretrained=True, num_classes=7)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
device = next(model.parameters()).device

# ---- Loss, Optimizer, Scheduler ----
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

# ---- Training Config ----
best_val_acc = 0
epochs_no_improve = 0
patience = 3
num_epochs = 20
train_acc_list = []
val_acc_list = []

# ---- Training Loop ----
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0
    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()
    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # ---- Validation ----
    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)
    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)
    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # ---- Early Stopping ----
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_convnext_model.pth")
        print("New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")
        if epochs_no_improve >= patience:
            print(" Early stopping triggered.")
            break
    scheduler.step()

# ---- Plot Accuracy ----
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("ConvNeXt: Train vs Validation Accuracy")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

# ---- Model Evaluation ----
model.load_state_dict(torch.load("best_convnext_model.pth"))
model.eval()
all_preds = []
all_labels = []
with torch.no_grad():
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(1)
        all_preds.extend(preds.cpu().numpy())
        all_labels.extend(y.cpu().numpy())
print(classification_report(all_labels, all_preds, target_names=train_ds.classes))

cm = confusion_matrix(all_labels, all_preds)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=train_ds.classes, yticklabels=train_ds.classes)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix - ConvNeXt Tiny')
plt.tight_layout()
plt.show()
Epoch 1/20: 100%|██████████| 449/449 [47:35<00:00, 6.36s/it]
Epoch 1, Train Acc: 0.5935, Loss: 482.8378, Val Acc: 0.6468 New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [47:56<00:00, 6.41s/it]
Epoch 2, Train Acc: 0.6915, Loss: 374.9729, Val Acc: 0.6737 New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [50:55<00:00, 6.81s/it]
Epoch 3, Train Acc: 0.7429, Loss: 311.3200, Val Acc: 0.6946 New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [47:58<00:00, 6.41s/it]
Epoch 4, Train Acc: 0.8070, Loss: 239.1212, Val Acc: 0.7112 New best model saved.
Epoch 5/20: 100%|██████████| 449/449 [47:36<00:00, 6.36s/it]
Epoch 5, Train Acc: 0.8690, Loss: 164.2940, Val Acc: 0.7088 No improvement. Patience: 1/3
Epoch 6/20: 100%|██████████| 449/449 [48:35<00:00, 6.49s/it]
Epoch 6, Train Acc: 0.9258, Loss: 96.7903, Val Acc: 0.7069 No improvement. Patience: 2/3
Epoch 7/20: 100%|██████████| 449/449 [48:07<00:00, 6.43s/it]
Epoch 7, Train Acc: 0.9626, Loss: 51.6013, Val Acc: 0.7151 New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [49:41<00:00, 6.64s/it]
Epoch 8, Train Acc: 0.9808, Loss: 27.4399, Val Acc: 0.7189 New best model saved.
Epoch 9/20: 100%|██████████| 449/449 [2:05:57<00:00, 16.83s/it]
Epoch 9, Train Acc: 0.9871, Loss: 17.7518, Val Acc: 0.7208 New best model saved.
Epoch 10/20: 100%|██████████| 449/449 [1:43:31<00:00, 13.83s/it]
Epoch 10, Train Acc: 0.9914, Loss: 11.9923, Val Acc: 0.7210 New best model saved.
Epoch 11/20: 100%|██████████| 449/449 [48:24<00:00, 6.47s/it]
Epoch 11, Train Acc: 0.9924, Loss: 11.4604, Val Acc: 0.7210 No improvement. Patience: 1/3
Epoch 12/20: 100%|██████████| 449/449 [48:23<00:00, 6.47s/it]
Epoch 12, Train Acc: 0.9917, Loss: 11.3885, Val Acc: 0.7232 New best model saved.
Epoch 13/20: 100%|██████████| 449/449 [48:19<00:00, 6.46s/it]
Epoch 13, Train Acc: 0.9905, Loss: 12.8140, Val Acc: 0.7190 No improvement. Patience: 1/3
Epoch 14/20: 100%|██████████| 449/449 [1:08:25<00:00, 9.14s/it]
Epoch 14, Train Acc: 0.9847, Loss: 20.4750, Val Acc: 0.7147 No improvement. Patience: 2/3
Epoch 15/20: 100%|██████████| 449/449 [2:18:13<00:00, 18.47s/it]
Epoch 15, Train Acc: 0.9709, Loss: 37.6122, Val Acc: 0.7165 No improvement. Patience: 3/3 Early stopping triggered.
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
angry | 0.64 | 0.66 | 0.65 | 958 |
disgust | 0.79 | 0.68 | 0.73 | 111 |
fear | 0.62 | 0.56 | 0.59 | 1024 |
happy | 0.89 | 0.90 | 0.90 | 1774 |
neutral | 0.67 | 0.70 | 0.69 | 1233 |
sad | 0.60 | 0.60 | 0.60 | 1247 |
surprise | 0.83 | 0.84 | 0.83 | 831 |
accuracy | | | 0.72 | 7178 |
macro avg | 0.72 | 0.71 | 0.71 | 7178 |
weighted avg | 0.72 | 0.72 | 0.72 | 7178 |