From Data Experiments to Business Impact

Explore real-world data science & machine learning work. Each post breaks down the approach and takeaways — and how the same methods can uncover opportunities, reduce risk, and guide better decisions.

Recent writing & project breakdowns

Insider Threats: Anomaly Detection on User Logs

Unsupervised anomaly detection with Isolation Forest flags user behavior (USB events, file access) that deviates significantly from the norm, surfacing potential insider threats early.

Topic Modeling Nike Product Reviews: Marketing and Product Insights

This project uses unsupervised text modeling techniques to explore online reviews of Nike products and extract actionable insights for marketing and product development.

Facial Expression Classification with CNNs

This project builds a deep learning image classifier that recognizes facial expressions. The dataset contains face images labeled by emotion, and a ConvNeXt architecture classifies each image into its emotional class.

Contextual Advertising: Supervised Text Classification For Targeted Marketing

A transformer-based text classifier was fine-tuned using the ktrain wrapper over TensorFlow/Keras and HuggingFace transformers to classify news articles for targeted marketing in the health & wellness niche.

Detecting Metastatic Cancer (ResNet18, PCam)

Binary classification on histopathologic scans; pipeline, metrics, and a clean Kaggle-ready submission.

Zeek Logs: Finding C2 / Beaconing with ML

Supervised learning over MITRE-labeled network data to highlight command-and-control patterns.

Life Stressors & Depression — Regression in R

Stepwise regression linking environmental and genetic factors (5-HTT) with depression outcomes.

Pitch Mining: A Case Study on Gerrit Cole

Feature engineering from pitch-level data to predict run-risk innings with ~83% accuracy.

CycleGAN: Turning Photos into Monet-Style Art

Training a CycleGAN to translate real-world images to Monet-style outputs; data, training, and evaluation with MiFID (Memorization-informed Fréchet Inception Distance).

Disaster Tweets with DistilBERT

Transformer-based classification for disaster vs. non-disaster tweets with minimal preprocessing.

Classifying News Articles with NLP & ML

Dimensionality reduction + supervised/unsupervised models to categorize news at scale.

Spotting Spam Job Posts via Logistic Regression

From feature selection to evaluation — a practical model to filter low-quality job listings.

Online Retail: Price Optimization

Finding revenue-maximizing price points by product with historical demand and price variation.

Single-Predictor Linear Regression in R

Transformations, lack-of-fit testing, and why the power model won — a compact walkthrough.