Data Science & ML — Focused on Real-World Systems

I build practical machine learning systems for real-world, messy data problems.

About

I'm a data science and ML practitioner with strong engineering instincts. My focus is applied modeling on messy, real-world data rather than theory alone, with attention to bias, uncertainty, validity, deployment, and usability. I've worked on supply chain optimization, ecological/foraging prediction, and full-stack systems.

Featured Projects

Local Food Supply Chain Planner

📈 20% cost reduction in delivery logistics

Multi-tenant web application for optimizing local food distribution logistics. Designed cost allocation models (fuel, depreciation, labor). Integrated database + backend (Flask + MySQL). Worked with real stakeholders (OGC).

Python Flask MySQL Optimization

Ecological / Foraging Prediction Model

🎯 89% accuracy in species prediction

Built models using iNaturalist + WorldClim data. Explored embeddings (Word2Vec-style) for ecological relationships. Focused on predicting species presence in microclimates. Addressed noisy, biased observational data.

Python ML Embeddings Bias-Aware

Personal VPS / Deployment Infrastructure

⚡ 99.9% uptime across all services

Deployed multiple services using Docker + Nginx. Managed databases and APIs on VPS. Built and hosted production-style apps with monitoring.

Docker Nginx Linux VPS

A/B Testing Framework

🚀 3x faster experiment iteration

Built a reusable Python package for A/B testing with statistical hypothesis testing. Implemented common ML metrics (RMSE, MAE, R², F1). Created a visualization dashboard for experiment results.

Python A/B Testing Statistics
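
To give a flavor of the kind of checks such a framework runs, here is a minimal sketch using only the Python standard library. It is not the package's actual code: the function names (`two_proportion_z_test`, `rmse`, `mae`) and the choice of a pooled two-proportion z-test are assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance).

    Hypothetical helper for illustration; returns (z, p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up counts: variant A converts 100/1000, variant B converts 150/1000
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The pooled z-test is a reasonable default for large samples of binary outcomes; for small samples or continuous metrics, a different test (for example Welch's t-test) would likely be a better fit.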

Skills & Expertise

Cloud & Infrastructure

AWS (EC2, S3, SageMaker), Docker, Linux, Git

Production deployments on AWS

Machine Learning

PyTorch, TensorFlow, Scikit-learn, Pandas

15+ models developed

ML Monitoring

MLflow, Model versioning, Performance tracking

Real-time model monitoring

A/B Testing & Experimentation

Statistical hypothesis testing, Metrics design

200+ experiments analyzed

Big Data

Spark, Distributed computing, Data pipelines

Processing 100GB+ datasets

Core Concepts

Optimization, Embeddings, Clustering, Recommender systems

Mathematical foundations

Certifications

🎓 IBM AI Engineering
🎓 IBM Applied AI
🎓 Johns Hopkins Data Science

Get In Touch