Data Scientist
I turn messy data into models people can trust.
A junior data scientist with a background in physics and business informatics. I work end-to-end, from hand-labeling data to deploying models for inference, with a bias toward results that hold up under scrutiny.
About
I care about the part of machine learning that's easy to skip — knowing when a model is genuinely better, not just different.
I came to data science from physics and business informatics. I like the full loop: framing the question, building the dataset when one doesn't exist yet, and pressure-testing a result until it's honest.
My main project, OIRseg, took a multi-class segmentation model from several hundred hand-drawn masks to a validated, deployed web app, reaching a Dice score of 0.916 on the primary class. That mix of careful labeling, honest evaluation, and actually shipping is the work I want more of.
Selected Work
A few things I've built.
OIRseg — Retinal Image Segmentation
A multi-class U-Net (PyTorch) measuring disease zones in retinal microscopy — Dice 0.916 on the primary class, deployed as a public web app.
PubMed RAG
Retrieval-augmented Q&A over 980+ PubMed abstracts with local embeddings and citation-grounded answers, served via CLI and FastAPI.
Classical ML Studies
House-price regression, a CatBoost mushroom-edibility classifier (Kaggle), and audio-feature song clustering for mood-based playlists.
Sentiment Analysis Pipeline
A production-style NLP pipeline with split train/predict modules, Docker packaging, and CI-enforced quality gates.
Let's work together.
Open to data science roles and collaborations. The fastest way to reach me is email.