Predicting Customer Churn
Why are customers canceling their subscriptions? Consumer data is used to evaluate the causes of canceled subscriptions and to train a logistic regression classifier that predicts the churn probability of new customers.
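A minimal sketch of the idea in Python, using only the standard library. The two features (support tickets and years as a customer), the toy data, and the function names are hypothetical and are not taken from the project's dataset or code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def churn_probability(w, x):
    """Predicted probability that a customer with features x churns."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical toy data: [support_tickets, years_as_customer] -> churned (1) or not (0)
X = [[5, 0.5], [4, 1.0], [6, 0.2], [0, 3.0], [1, 4.0], [0, 5.0]]
y = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(X, y)
p_risky = churn_probability(weights, [5, 0.3])  # many tickets, new customer
p_loyal = churn_probability(weights, [0, 4.5])  # no tickets, long-time customer
```

In practice a library implementation with regularization and a held-out test set would be the more likely choice. The hand-rolled version here only shows what the classifier computes.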
Online Retail Price Optimization Analysis
Many of the products sold were priced differently over the period our data covers. We can take advantage of this by finding, for each product, the price point that generated the most revenue.
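As a rough illustration, the sketch below totals revenue at each observed price and keeps the best price per product. The records, product names, and function name are hypothetical. It also ignores how long each price was in effect, which a real analysis would need to account for.

```python
from collections import defaultdict

# Hypothetical transaction records: (product, unit_price, quantity_sold)
sales = [
    ("mug", 8.0, 120),
    ("mug", 10.0, 100),
    ("mug", 12.0, 70),
    ("lamp", 25.0, 30),
    ("lamp", 30.0, 28),
    ("lamp", 30.0, 2),
]

def best_price_per_product(records):
    """Return {product: (price, revenue)} for the highest-revenue price."""
    revenue = defaultdict(float)
    for product, price, qty in records:
        revenue[(product, price)] += price * qty  # total revenue at this price
    best = {}
    for (product, price), total in revenue.items():
        if product not in best or total > best[product][1]:
            best[product] = (price, total)
    return best

best = best_price_per_product(sales)
```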
Power BI Dashboards
Use BI tools such as Power BI, Tableau & Looker to drive business growth and data-backed decision making with interactive automated visualizations for easy understanding.
Building a Logistic Regression Model to Identify Spam Job Posts
Many job listings these days could be considered spam. 500 job applications were analyzed, and a machine learning model was deployed to predict whether your current job application is a new opportunity or just spam.
Examining Credit Card Fraud using Python
Given a dataset from DataCamp containing all credit card purchases on the West Coast, we cleaned and analyzed the data to pinpoint fraudulent transactions and identify the most common trends, with the aim of strengthening the credit card company's fraud detection parameters.
Regression Model in R to Determine the Effect of Various Life Stresses on Depression in Adults
Used stepwise regression techniques in R to analyze the relationship between stressful life experiences and depression, focusing on genetic predisposition related to the serotonin transporter gene (5-HTT). The analysis confirmed statistically significant associations between environmental stressors (E1, E3, E4), genetic markers (G3, G8), and the depression outcome, suggesting that individuals carrying the short allele of the 5-HTT polymorphism are more prone to depression. Project from Stony Brook University.
Single Predictor Linear Regression in R
Given a dataset in R, I merged files containing multiple variables, handled and imputed missing data, transformed the data to fit a linear regression, and applied an approximate lack-of-fit test. I compared the results of the exponential model (DV = ln(y)), the quadratic model (DV = sqrt(y)), the reciprocal model (DV = 1/y), the logarithmic model (IV = ln(x)), and the power model (DV = ln(y), IV = ln(x)) and found the power model to be the best fit, with the highest R-squared value. Project from Stony Brook University.
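The comparison of the five transformations can be sketched as follows. This is a Python illustration rather than the original R code, and the data are hypothetical values chosen to roughly follow a power law. Note that R-squared values computed on differently transformed dependent variables are only loosely comparable, so this shows the mechanics and is not a full model-selection procedure.

```python
import math

def r_squared(x, y):
    """R-squared of the simple least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Hypothetical data, roughly y = 2 * x^1.5 with small noise
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 5.5, 10.6, 15.8, 22.6, 29.1, 37.3, 45.0]

# Each entry is (independent variable, dependent variable) after transformation
models = {
    "exponential": (x, [math.log(v) for v in y]),
    "quadratic": (x, [math.sqrt(v) for v in y]),
    "reciprocal": (x, [1 / v for v in y]),
    "logarithmic": ([math.log(v) for v in x], y),
    "power": ([math.log(v) for v in x], [math.log(v) for v in y]),
}

scores = {name: r_squared(iv, dv) for name, (iv, dv) in models.items()}
best_model = max(scores, key=scores.get)
```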
Data Analyst Certification SQL/Python
Using SQL and Python, I analyzed restaurant data from DataCamp to help company lawyers gain key insights into food poisoning claims.
Certification awarded:
