Facial Expression Classification Using CNNs¶

Overview¶

The goal of this project was to build an image classification model to recognize facial expressions using deep learning. The dataset contained face images categorized by emotion labels, and the project aimed to classify each image into its respective emotional class. A high-performing ConvNeXt architecture was used to achieve this.

Data¶

The dataset can be downloaded here: https://www.kaggle.com/datasets/msambare/fer2013/data

The dataset consists of face images grouped into subfolders for seven standard emotions: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral. A 'train' folder contains the seven emotion subfolders, each holding the images for that emotion, and a 'test' folder mirrors the same structure for evaluating the model.
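
For reference, the on-disk layout looks like this (the folder names are lowercase, matching the train_ds.classes values shown in the evaluation output later in the notebook):

train/
    angry/  disgust/  fear/  happy/  neutral/  sad/  surprise/
test/
    angry/  disgust/  fear/  happy/  neutral/  sad/  surprise/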

Albumentations was used for data augmentation; the training transforms included resizing, horizontal flips, shift/scale/rotate, color jitter, and normalization. For the validation data, only resizing and normalization were applied.

Methods & Materials¶

We implemented an AlbumentationsDataset to load images from the class subfolders and apply transforms, and used PyTorch's DataLoader to iterate over the datasets in mini-batches. For exploratory data analysis, we displayed sample images and their labels using matplotlib.

We used a pretrained EfficientNet-B3 with resizing, flips, rotation, color jitter, and normalization as augmentations, cross-entropy loss as the criterion, and an AdamW optimizer. With a CosineAnnealingLR scheduler, we achieved about 70% accuracy on the validation set over 9 epochs, with early stopping at a patience of 2.

To attempt to break past 0.70, we added label smoothing, increased the patience to 3, and reran the model, which ended up performing marginally worse than before.

For our final model, we used ConvNeXt-Tiny imported from timm (PyTorch Image Models) and pretrained on ImageNet. We modified the classification head to output 7 emotion classes and fine-tuned with cross-entropy loss and the AdamW optimizer, using early stopping to retain the best model.

The best model weights were saved to a .pth file.

For evaluation, we used classification reports and confusion matrices.

Outcomes¶

ConvNeXt-Tiny achieved better accuracy (0.72) than EfficientNet-B3 and matched or exceeded its F1 score in every class. The per-class F1 scores were: Angry 0.65, Disgust 0.73, Fear 0.59, Happy 0.90, Neutral 0.69, Sad 0.60, Surprise 0.83.

Class         Precision   Recall   F1-Score   Support
Angry              0.64     0.66       0.65       958
Disgust            0.79     0.68       0.73       111
Fear               0.62     0.56       0.59      1024
Happy              0.89     0.90       0.90      1774
Neutral            0.67     0.70       0.69      1233
Sad                0.60     0.60       0.60      1247
Surprise           0.83     0.84       0.83       831
Accuracy                               0.72      7178
Macro Avg          0.72     0.71       0.71      7178
Weighted Avg       0.72     0.72       0.72      7178

The model was best at classifying happiness and surprise, possibly because these expressions tend to have well-defined facial features, such as a big smile or a wide-eyed look of surprise, making them easier for the model to read.

Conclusion¶

This project demonstrated the effectiveness of using modern pretrained convolutional architectures (like ConvNeXt) for facial emotion classification. The use of transfer learning and Albumentations improved generalization despite limited and potentially noisy data.

However, accuracy plateaued around 0.70, possibly due to confusion between classes with overlapping facial features (e.g., sad vs. neutral). Additionally, the dataset uses static, low-resolution 2D photos and has unbalanced classes.

Discussion¶

ConvNeXt captured hierarchical facial features better than the EfficientNet-B3 baseline, and pretraining helped the model generalize quickly with fewer epochs, while Albumentations' aggressive augmentations helped prevent overfitting. The performance ceiling may be due in part to class imbalance, which can be addressed using techniques like class weighting or focal loss to reduce bias toward majority classes, as in the sketch below.
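
As a rough illustration, here is a minimal sketch of class weighting (not something we ran in this project), assuming the train_ds and device defined in the code below; focal loss would be a drop-in alternative for the criterion:

import numpy as np
import torch
import torch.nn as nn

# Class counts over the training set (train_ds.targets holds one class index per image)
counts = np.bincount(train_ds.targets, minlength=len(train_ds.classes))
# Inverse-frequency weights: rare classes (e.g. disgust, only 111 test samples) get larger weights
weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(
    weight=torch.tensor(weights, dtype=torch.float32).to(device)
)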

Future Improvements:¶

Additional improvements could include adopting Vision Transformers to better capture global spatial features, integrating attention mechanisms to focus on emotionally salient facial regions, or enriching the model input with facial landmarks or action units (AUs)—specific muscle movements like brow raises or lip corner pulls—that provide structured, expression-relevant cues.
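
For instance, a hypothetical one-line backbone swap to a Vision Transformer via timm (untested here); the rest of the training pipeline would be reused unchanged:

import timm
# "vit_base_patch16_224" is a standard timm ViT; any timm ViT variant would slot in the same way
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=7)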

In [64]:
#show a few random example pictures (brought this cell up to the front just for show)
# `images`, `labels`, and `emotion_labels` are defined later in the notebook,
# e.g. a batch drawn from train_loader

# Helper to unnormalize the pictures so we can show the original (this cell was coded last in the notebook)
# A.Normalize() defaults to ImageNet statistics, so we reverse those here
def unnormalize(img_tensor):
    img = img_tensor.clone().permute(1, 2, 0).cpu().numpy()
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    img = (img * std) + mean  # reverse normalization
    img = np.clip(img, 0, 1)
    return img

# Plot
plt.figure(figsize=(14, 8))
for i in range(8):
    img = unnormalize(images[i])
    label = emotion_labels[int(labels[i])]

    plt.subplot(2, 4, i + 1)
    plt.imshow(img.squeeze(), cmap='gray' if img.shape[2] == 1 else None)
    plt.title(label)
    plt.axis('off')

plt.tight_layout()
plt.show()
[Output image: grid of eight sample face images with their emotion labels]

Transformations and image loading:

In [17]:
import os
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
from torchvision import transforms
import albumentations as A
from albumentations.pytorch import ToTensorV2
from torchvision.datasets.folder import default_loader
import numpy as np

class AlbumentationsDataset(ImageFolder):
    def __init__(self, root, transform=None):
        super().__init__(root, transform=None)
        self.albumentations_transform = transform

    def __getitem__(self, index):
        path, target = self.samples[index]
        sample = default_loader(path)
        sample = np.array(sample)
        if self.albumentations_transform:
            sample = self.albumentations_transform(image=sample)["image"]
        return sample, target

# Transforms
train_transform = A.Compose([
    A.Resize(224, 224),
    A.HorizontalFlip(p=0.5),
    A.ShiftScaleRotate(shift_limit=0.05, scale_limit=0.05, rotate_limit=10),
    A.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    A.Normalize(),
    ToTensorV2()
])

val_transform = A.Compose([
    A.Resize(224, 224),
    A.Normalize(),
    ToTensorV2()
])

train_ds = AlbumentationsDataset("train", transform=train_transform)
val_ds = AlbumentationsDataset("test", transform=val_transform)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=64, shuffle=False)
In [18]:
import timm
import torch.nn as nn
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = timm.create_model("efficientnet_b3", pretrained=True, num_classes=7)
model = model.to(device)

We ran a model that trained for 20 epochs over ~20 hours and peaked at about epoch 10, but we did not include any early stopping in the code, so we scrapped that whole block of code.

Below is an improved training loop that tracks train and validation accuracy, saves the best model, and stops early when validation accuracy stops improving.

We set patience to 2 so training stops after 2 epochs without improvement, and we expect it to stop after about 10-12 epochs based on previous performance.

In [22]:
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns

# ---- Setup ----
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
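# Note: T_max=10 anneals the LR over ~10 epochs, roughly the run length we
# expect before early stopping (patience=2) kicks in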

best_val_acc = 0
epochs_no_improve = 0
patience = 2
num_epochs = 20

train_acc_list = []
val_acc_list = []

# ---- Training Loop ----
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0

    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()

    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # ---- Validation ----
    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)

    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)

    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # ---- Early Stopping ----
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_model.pth")
        print("New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")

    if epochs_no_improve >= patience:
        print("Early stopping triggered.")
        break

    scheduler.step()

# ---- Plot Accuracies ----
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Training vs Validation Accuracy")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
Epoch 1/20: 100%|██████████| 449/449 [1:08:56<00:00,  9.21s/it]
Epoch 1, Train Acc: 0.5422, Loss: 563.4186, Val Acc: 0.6310
New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [1:06:05<00:00,  8.83s/it]
Epoch 2, Train Acc: 0.6780, Loss: 392.2814, Val Acc: 0.6636
New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [1:06:54<00:00,  8.94s/it]
Epoch 3, Train Acc: 0.7505, Loss: 304.4303, Val Acc: 0.6808
New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [1:06:14<00:00,  8.85s/it]
Epoch 4, Train Acc: 0.8263, Loss: 217.6914, Val Acc: 0.6796
No improvement. Patience: 1/2
Epoch 5/20: 100%|██████████| 449/449 [1:06:54<00:00,  8.94s/it]
Epoch 5, Train Acc: 0.8932, Loss: 137.3717, Val Acc: 0.6906
New best model saved.
Epoch 6/20: 100%|██████████| 449/449 [1:06:40<00:00,  8.91s/it]
Epoch 6, Train Acc: 0.9360, Loss: 83.2908, Val Acc: 0.6874
No improvement. Patience: 1/2
Epoch 7/20: 100%|██████████| 449/449 [1:08:01<00:00,  9.09s/it]
Epoch 7, Train Acc: 0.9634, Loss: 49.8292, Val Acc: 0.7042
New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [1:07:07<00:00,  8.97s/it]
Epoch 8, Train Acc: 0.9778, Loss: 29.6626, Val Acc: 0.7030
No improvement. Patience: 1/2
Epoch 9/20: 100%|██████████| 449/449 [1:05:42<00:00,  8.78s/it]
Epoch 9, Train Acc: 0.9848, Loss: 21.1281, Val Acc: 0.7040
No improvement. Patience: 2/2
Early stopping triggered.
[Output image: training vs. validation accuracy curve (EfficientNet-B3)]
In [45]:
#-------- Model Eval -----------
model.load_state_dict(torch.load("best_model.pth"))  # Load best model
model.eval()

all_preds = []
all_labels = []

with torch.no_grad():
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(1)
        all_preds.extend(preds.cpu().numpy())
        all_labels.extend(y.cpu().numpy())

print(classification_report(all_labels, all_preds, target_names=train_ds.classes))

# Confusion matrix
cm = confusion_matrix(all_labels, all_preds)
plt.figure(figsize=(8,6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=train_ds.classes, yticklabels=train_ds.classes)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()
              precision    recall  f1-score   support

       angry       0.63      0.61      0.62       958
     disgust       0.75      0.71      0.73       111
        fear       0.57      0.54      0.55      1024
       happy       0.89      0.87      0.88      1774
     neutral       0.64      0.69      0.66      1233
         sad       0.60      0.59      0.60      1247
    surprise       0.79      0.85      0.82       831

    accuracy                           0.70      7178
   macro avg       0.70      0.69      0.69      7178
weighted avg       0.70      0.70      0.70      7178

[Output image: confusion matrix (EfficientNet-B3)]
In [46]:
correct = sum([p == t for p, t in zip(all_preds, all_labels)])
print(f"Final Accuracy: {correct / len(all_labels):.4f}")
Final Accuracy: 0.7042

Below we will increase the patience to 3 and implement label smoothing (along with a dropout head and weight decay) to help generalization and hopefully push past 70% val acc.

In [52]:
import torch, timm
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt

# --- Model Definition ---
class CustomModel(nn.Module):
    def __init__(self, base_model, num_classes):
        super().__init__()
        self.backbone = base_model
        in_features = self.backbone.classifier.in_features
        self.backbone.classifier = nn.Identity()
        self.dropout = nn.Dropout(0.3)
        self.fc = nn.Linear(in_features, num_classes)

    def forward(self, x):
        x = self.backbone(x)
        x = self.dropout(x)
        return self.fc(x)

base_model = timm.create_model("efficientnet_b3a", pretrained=True)
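# Note: "efficientnet_b3a" is a deprecated timm alias that maps to
# "efficientnet_b3" (see the UserWarning in the output below)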
model = CustomModel(base_model, num_classes=7).to(device)

# --- Label Smoothing Loss ---
class LabelSmoothingLoss(nn.Module):
    def __init__(self, classes, smoothing=0.1):
        super().__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.cls = classes

    def forward(self, pred, target):
        pred = pred.log_softmax(dim=-1)
        true_dist = torch.zeros_like(pred)
        true_dist.fill_(self.smoothing / (self.cls - 1))
        true_dist.scatter_(1, target.data.unsqueeze(1), self.confidence)
        return torch.mean(torch.sum(-true_dist * pred, dim=-1))
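
# Note: PyTorch 1.10+ implements this natively via
# nn.CrossEntropyLoss(label_smoothing=0.1), which should be equivalent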

# --- Training Setup ---
criterion = LabelSmoothingLoss(classes=7, smoothing=0.1)
optimizer = optim.AdamW(model.parameters(), lr=1e-4, weight_decay=1e-2)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

best_val_acc = 0
epochs_no_improve = 0
patience = 3
num_epochs = 20
train_acc_list, val_acc_list = [], []

# --- Training Loop ---
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0

    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()

    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # --- Validation ---
    model.eval()
    val_correct, val_total = 0, 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)

    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)

    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # --- Early Stopping ---
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_model_improved.pth")
        print(" New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")

    if epochs_no_improve >= patience:
        print(" Early stopping triggered.")
        break

    scheduler.step()

# --- Plot Accuracy Curves ---
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("Training vs Validation Accuracy (Improved)")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
C:\Users\forca\anaconda3\Lib\site-packages\timm\models\_factory.py:138: UserWarning: Mapping deprecated model name efficientnet_b3a to current efficientnet_b3.
  model = create_fn(
Epoch 1/20: 100%|██████████| 449/449 [1:09:48<00:00,  9.33s/it]
Epoch 1, Train Acc: 0.5174, Loss: 645.5958, Val Acc: 0.6245
 New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [1:08:04<00:00,  9.10s/it]
Epoch 2, Train Acc: 0.6634, Loss: 525.2005, Val Acc: 0.6654
 New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [1:05:10<00:00,  8.71s/it]
Epoch 3, Train Acc: 0.7318, Loss: 468.2296, Val Acc: 0.6786
 New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [1:04:41<00:00,  8.64s/it]
Epoch 4, Train Acc: 0.8015, Loss: 414.8496, Val Acc: 0.6888
 New best model saved.
Epoch 5/20: 100%|██████████| 449/449 [1:04:42<00:00,  8.65s/it]
Epoch 5, Train Acc: 0.8524, Loss: 370.4793, Val Acc: 0.6865
No improvement. Patience: 1/3
Epoch 6/20: 100%|██████████| 449/449 [1:04:53<00:00,  8.67s/it]
Epoch 6, Train Acc: 0.8983, Loss: 333.6452, Val Acc: 0.6923
 New best model saved.
Epoch 7/20: 100%|██████████| 449/449 [1:06:45<00:00,  8.92s/it]
Epoch 7, Train Acc: 0.9255, Loss: 310.8068, Val Acc: 0.6970
 New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [1:06:13<00:00,  8.85s/it]
Epoch 8, Train Acc: 0.9426, Loss: 296.1793, Val Acc: 0.6948
No improvement. Patience: 1/3
Epoch 9/20: 100%|██████████| 449/449 [1:04:43<00:00,  8.65s/it]
Epoch 9, Train Acc: 0.9525, Loss: 287.9650, Val Acc: 0.6969
No improvement. Patience: 2/3
Epoch 10/20: 100%|██████████| 449/449 [1:04:45<00:00,  8.65s/it]
Epoch 10, Train Acc: 0.9576, Loss: 283.2365, Val Acc: 0.6956
No improvement. Patience: 3/3
 Early stopping triggered.
[Output image: training vs. validation accuracy curve (improved EfficientNet-B3)]

Below we will try ConvNeXt-Tiny (pretrained on ImageNet-22k) instead of efficientnet_b3a.

In [54]:
import torch
import torch.nn as nn
import torch.optim as optim
from tqdm import tqdm
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix
import seaborn as sns
import timm

# ---- Load ConvNeXt Tiny Backbone ----
model = timm.create_model("convnext_tiny.fb_in22k", pretrained=True, num_classes=7)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
device = next(model.parameters()).device

# ---- Loss, Optimizer, Scheduler ----
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

# ---- Training Config ----
best_val_acc = 0
epochs_no_improve = 0
patience = 3
num_epochs = 20
train_acc_list = []
val_acc_list = []

# ---- Training Loop ----
for epoch in range(num_epochs):
    model.train()
    total_loss = 0
    correct = 0

    for x, y in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs}"):
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        correct += (out.argmax(1) == y).sum().item()

    train_acc = correct / len(train_ds)
    train_acc_list.append(train_acc)

    # ---- Validation ----
    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for x_val, y_val in val_loader:
            x_val, y_val = x_val.to(device), y_val.to(device)
            y_pred = model(x_val)
            val_correct += (y_pred.argmax(1) == y_val).sum().item()
            val_total += y_val.size(0)

    val_acc = val_correct / val_total
    val_acc_list.append(val_acc)

    print(f"Epoch {epoch+1}, Train Acc: {train_acc:.4f}, Loss: {total_loss:.4f}, Val Acc: {val_acc:.4f}")

    # ---- Early Stopping ----
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_no_improve = 0
        torch.save(model.state_dict(), "best_convnext_model.pth")
        print("New best model saved.")
    else:
        epochs_no_improve += 1
        print(f"No improvement. Patience: {epochs_no_improve}/{patience}")

    if epochs_no_improve >= patience:
        print(" Early stopping triggered.")
        break

    scheduler.step()

# ---- Plot Accuracy ----
plt.figure(figsize=(10, 5))
plt.plot(train_acc_list, label="Train Accuracy")
plt.plot(val_acc_list, label="Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.title("ConvNeXt: Train vs Validation Accuracy")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

# ---- Model Evaluation ----
model.load_state_dict(torch.load("best_convnext_model.pth"))
model.eval()

all_preds = []
all_labels = []

with torch.no_grad():
    for x, y in val_loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(1)
        all_preds.extend(preds.cpu().numpy())
        all_labels.extend(y.cpu().numpy())

print(classification_report(all_labels, all_preds, target_names=train_ds.classes))

cm = confusion_matrix(all_labels, all_preds)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=train_ds.classes, yticklabels=train_ds.classes)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix - ConvNeXt Tiny')
plt.tight_layout()
plt.show()
model.safetensors:   0%|          | 0.00/178M [00:00<?, ?B/s]
C:\Users\forca\anaconda3\Lib\site-packages\huggingface_hub\file_download.py:143: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\forca\.cache\huggingface\hub\models--timm--convnext_tiny.fb_in22k. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
  warnings.warn(message)
Epoch 1/20: 100%|██████████| 449/449 [47:35<00:00,  6.36s/it]
Epoch 1, Train Acc: 0.5935, Loss: 482.8378, Val Acc: 0.6468
New best model saved.
Epoch 2/20: 100%|██████████| 449/449 [47:56<00:00,  6.41s/it]
Epoch 2, Train Acc: 0.6915, Loss: 374.9729, Val Acc: 0.6737
New best model saved.
Epoch 3/20: 100%|██████████| 449/449 [50:55<00:00,  6.81s/it]
Epoch 3, Train Acc: 0.7429, Loss: 311.3200, Val Acc: 0.6946
New best model saved.
Epoch 4/20: 100%|██████████| 449/449 [47:58<00:00,  6.41s/it]
Epoch 4, Train Acc: 0.8070, Loss: 239.1212, Val Acc: 0.7112
New best model saved.
Epoch 5/20: 100%|██████████| 449/449 [47:36<00:00,  6.36s/it]
Epoch 5, Train Acc: 0.8690, Loss: 164.2940, Val Acc: 0.7088
No improvement. Patience: 1/3
Epoch 6/20: 100%|██████████| 449/449 [48:35<00:00,  6.49s/it]
Epoch 6, Train Acc: 0.9258, Loss: 96.7903, Val Acc: 0.7069
No improvement. Patience: 2/3
Epoch 7/20: 100%|██████████| 449/449 [48:07<00:00,  6.43s/it]
Epoch 7, Train Acc: 0.9626, Loss: 51.6013, Val Acc: 0.7151
New best model saved.
Epoch 8/20: 100%|██████████| 449/449 [49:41<00:00,  6.64s/it]
Epoch 8, Train Acc: 0.9808, Loss: 27.4399, Val Acc: 0.7189
New best model saved.
Epoch 9/20: 100%|██████████| 449/449 [2:05:57<00:00, 16.83s/it]  
Epoch 9, Train Acc: 0.9871, Loss: 17.7518, Val Acc: 0.7208
New best model saved.
Epoch 10/20: 100%|██████████| 449/449 [1:43:31<00:00, 13.83s/it]  
Epoch 10, Train Acc: 0.9914, Loss: 11.9923, Val Acc: 0.7210
New best model saved.
Epoch 11/20: 100%|██████████| 449/449 [48:24<00:00,  6.47s/it]
Epoch 11, Train Acc: 0.9924, Loss: 11.4604, Val Acc: 0.7210
No improvement. Patience: 1/3
Epoch 12/20: 100%|██████████| 449/449 [48:23<00:00,  6.47s/it]
Epoch 12, Train Acc: 0.9917, Loss: 11.3885, Val Acc: 0.7232
New best model saved.
Epoch 13/20: 100%|██████████| 449/449 [48:19<00:00,  6.46s/it]
Epoch 13, Train Acc: 0.9905, Loss: 12.8140, Val Acc: 0.7190
No improvement. Patience: 1/3
Epoch 14/20: 100%|██████████| 449/449 [1:08:25<00:00,  9.14s/it]
Epoch 14, Train Acc: 0.9847, Loss: 20.4750, Val Acc: 0.7147
No improvement. Patience: 2/3
Epoch 15/20: 100%|██████████| 449/449 [2:18:13<00:00, 18.47s/it]  
Epoch 15, Train Acc: 0.9709, Loss: 37.6122, Val Acc: 0.7165
No improvement. Patience: 3/3
 Early stopping triggered.
[Output image: ConvNeXt training vs. validation accuracy curve]
              precision    recall  f1-score   support

       angry       0.64      0.66      0.65       958
     disgust       0.79      0.68      0.73       111
        fear       0.62      0.56      0.59      1024
       happy       0.89      0.90      0.90      1774
     neutral       0.67      0.70      0.69      1233
         sad       0.60      0.60      0.60      1247
    surprise       0.83      0.84      0.83       831

    accuracy                           0.72      7178
   macro avg       0.72      0.71      0.71      7178
weighted avg       0.72      0.72      0.72      7178

[Output image: confusion matrix (ConvNeXt-Tiny)]
