Evaluation is the real skill.
Anyone can call an API. The hard part is knowing whether it works — golden datasets, hallucination detection, LLM-as-judge, and drift monitoring are what separate prototypes from systems you can trust in production.
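As a rough illustration of the golden-dataset idea, here is a minimal sketch of a pass-rate check. Everything named in it (`GoldenExample`, `evaluate`, `stub_model`, and the two sample questions) is hypothetical and stands in for a real model call and a real hand-verified dataset. Substring matching is used only to keep the sketch small; an LLM-as-judge or a stricter grader would replace it in practice.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoldenExample:
    """One hand-verified question plus facts the answer must contain."""
    question: str
    required_facts: list[str]


def evaluate(model: Callable[[str], str], golden: list[GoldenExample]) -> float:
    """Return the fraction of golden examples whose answer contains every required fact."""
    if not golden:
        raise ValueError("golden set is empty")
    passed = 0
    for example in golden:
        answer = model(example.question).lower()
        # Naive grader (assumption): case-insensitive substring match on each fact.
        if all(fact.lower() in answer for fact in example.required_facts):
            passed += 1
    return passed / len(golden)


def stub_model(question: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned answers."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "Who wrote Hamlet?": "Hamlet was written by Christopher Marlowe.",
    }
    return canned.get(question, "I don't know.")


# Illustrative golden set; a real one would be larger and domain-specific.
GOLDEN_SET = [
    GoldenExample("What is the capital of France?", ["Paris"]),
    GoldenExample("Who wrote Hamlet?", ["Shakespeare"]),
]

pass_rate = evaluate(stub_model, GOLDEN_SET)
print(f"pass rate: {pass_rate:.2f}")
```

Tracking this pass rate across model or prompt changes is one simple way to notice regressions before they reach production.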
I build robust, scalable machine learning systems, advanced RAG pipelines, and autonomous agentic workflows that deliver real-world impact.
AI/ML Engineer & Founder. First-author researcher (BUET CSE Fest 2026). Kaggle Top 1% (29/4,082 in Road Accident Risk). Deployed a recommendation engine that delivered +10% client sales lift in 90 days.
Build → Deploy → Measure → Improve. Great AI systems are reliable, observable, and cost-effective under real constraints.
I optimize for what matters in production: evaluation quality, inference latency, and system trust.
Philosophy
A model that scores 0.95 in a notebook but can't handle latency, cost, or scale is a liability. I optimize for the full stack: serving, observability, failure modes, and the infrastructure that keeps AI reliable under real load.
95% of AI roles are applied, not research. I focus on end-to-end delivery — from data pipeline to deployed API — because companies hire engineers who can build, deploy, scale, and measure. Not just present.
Project Selection
A curated collection of systems I've built. These projects are selected to demonstrate end-to-end engineering—from custom deep learning architectures (Transformers from scratch) and agentic AI middleware, to production pipelines that solve real business constraints like latency, scalability, and vendor lock-in.
Curriculum
A structured, project-driven curriculum of production-grade AI systems. Each project is numbered, open-source, and designed to take you from zero to shipping real AI in the real world.
| # | Project | Description | Tags |
|---|---|---|---|
| 101 | LLM Playground | Production-level large language model training and deployment framework | LLM, Training, MLOps |
| 102 | Customer Support Chatbot | Production-ready AI customer support with RAG, hybrid search, reranking, and PEFT fine-tuning | RAG, Production, LLM |
| 103 | Ask the Web | Perplexity-like AI agent with ReAct/ReWOO/Reflexion reasoning, MCP and A2A protocols, and multi-agent orchestration | Agents, Search, RAG |
| 104 | Deep Research | Deep research AI using web search, OpenAI o3, and DeepSeek-R1 with inference-time scaling | Research, LLM, Agents |
| 105 | Multi-Modal Generation | Production T2I and T2V synthesis with full VAE, GAN, DiT, and DDPM/DDIM/DPM-Solver++ implementation | T2I, T2V, Diffusion, DL |
I'm an AI/ML Engineer from Dhaka, Bangladesh. My journey started during Computer Science studies at BNIST, where courses in linear algebra, probability, and data structures sparked a deep curiosity about how machines learn from data. That curiosity turned into 2+ years of building production ML systems — from end-to-end pipelines with ZenML and MLflow to Dockerized inference services with FastAPI.
I am an AI/ML Engineer and founder with a strong focus on production-ready systems. I hold a Kaggle Master rank (across 26 competitions) and published a first-author conference paper on efficient speaker diarization (BUET CSE Fest 2026). As an entrepreneur, I launched Toolly—an AI tool discovery platform—and ReWoo (2026), an AI startup building advanced agentic workflows and AI solutions. Alongside my ventures, I work remotely as an AI Engineer for a UK-based company, building robust AI products. In my past work, I deployed a recommender system that delivered a 10% sales lift for a client in 3 months.
When I'm not training models, I explore the frontier of Generative AI — LLMs, RAG pipelines, and LangChain/LangGraph agents. I believe the best ML work happens at the intersection of strong engineering and genuine curiosity.
BNIST
CAREER
BUET CSE Fest 2026
Bangla Diarizz: Domain-Adapted Bengali Speaker Diarization via Knowledge Distillation. DER 0.19 (dev) / 0.286 (private LB) · 56% inference speedup · 3.4× real-time on CPU. Preprint on ResearchGate.
UK-based Company (Remote)
Working remotely as an AI Engineer, focusing on building and deploying robust AI products and scalable machine learning solutions for production environments.
Kaggle
Top 1% (29/4,082) in Road Accident Risk. 26 competitions total, including Top 2% in BPM Prediction.
Freelance / Client Project
Built and delivered a hybrid recommendation system (collaborative + content-based). Successfully deployed and achieved a +10% sales increase in 3 months for the retail client.
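One common way to combine collaborative and content-based signals is a weighted blend of per-item scores. The sketch below shows that idea only; it is not the deployed system, and the function names, the `alpha` weight, and the sample scores are all assumptions made for illustration.

```python
def hybrid_scores(
    collab: dict[str, float],
    content: dict[str, float],
    alpha: float = 0.7,
) -> dict[str, float]:
    """Blend scores per item: alpha * collaborative + (1 - alpha) * content-based.

    Items missing from one source get a score of 0.0 from that source.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    items = set(collab) | set(content)
    return {
        item: alpha * collab.get(item, 0.0) + (1 - alpha) * content.get(item, 0.0)
        for item in items
    }


def top_k(scores: dict[str, float], k: int) -> list[str]:
    """Return the k highest-scoring items, breaking ties alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Hypothetical scores for one user, each already normalized to [0, 1].
collab_scores = {"shoes": 0.9, "socks": 0.4}
content_scores = {"socks": 0.8, "laces": 0.6}

blended = hybrid_scores(collab_scores, content_scores, alpha=0.5)
recommendations = top_k(blended, k=2)
print(recommendations)
```

The content-based term lets items with no interaction history (here, `laces`) still receive a score, which is the usual motivation for a hybrid over pure collaborative filtering.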
Personal Project
Built and launched an AI tool discovery platform (toolly.tech). Defined product vision, led full-stack development, implemented submission moderation and analytics. 400+ curated tools across 15 categories.
Research
A lightweight pipeline for Bengali long-form audio that reaches DER 0.19 (dev) / 0.286 (private LB), with a distilled student model that runs at 3.4× real-time on CPU and roughly 56% faster inference than the baseline — aimed at deployments without heavy GPU infrastructure.
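For readers unfamiliar with the two metrics, the sketch below shows how they are typically computed. DER uses its standard definition (missed speech, false alarm, and speaker confusion time divided by total reference speech time). Reading "3.4× real-time" as audio duration divided by processing time is an assumption, and the numbers are made up to illustrate the arithmetic; they are not the paper's error breakdown.

```python
def diarization_error_rate(
    missed: float,
    false_alarm: float,
    confusion: float,
    total_speech: float,
) -> float:
    """DER = (missed + false alarm + speaker confusion) / total reference speech, all in seconds."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def real_time_speed(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio processed per second of compute (assumed meaning of 'N× real-time')."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Illustrative values only: 1000 s of reference speech with 190 s of total error.
der = diarization_error_rate(missed=80.0, false_alarm=50.0, confusion=60.0, total_speech=1000.0)

# Illustrative values only: 340 s of audio processed in 100 s of CPU time.
speed = real_time_speed(audio_seconds=340.0, processing_seconds=100.0)

print(f"DER: {der:.2f}, speed: {speed:.1f}x real-time")
```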
Preprint on ResearchGate
Right now
Focusing on my own AI startup, ReWoo, while also working remotely with a UK-based company.
Containerized ML services on AWS with FastAPI, Docker, and CI/CD — production inference that handles real traffic, not just localhost demos.
Agentic workflows with LangGraph — multi-step orchestration, tool calling, and the evaluation challenges that come with autonomous AI systems.
Open to AI/ML engineer roles, production ML consulting, and research collaborations where the bar is shipping, not slides.
Kaggle Master: Top 1% (29/4,082) in Road Accident Risk. 26 competitions total.
CONTACT
I'm open to AI/ML engineer roles, production ML consulting, and research collaborations. If you want a system that actually ships — reach out.