sanjaychelliah/README.md

Senior ML Engineer · LLM Infrastructure · AI Platform

LinkedIn GitHub Email Chennai


🧠 About Me

Building the infrastructure that makes AI run fast, cheap, and reliably at scale.

I'm a Senior MLOps / AI Platform Engineer with 5+ years of experience shipping production systems across two deep specializations:

  • 🚀 LLM Inference Infrastructure: vLLM, SGLang, MCP-based agents, RAG architectures.
  • 👁️ Computer Vision Pipelines: real-time object detection, multi-object tracking, and segmentation at millions of frames per week.

I've led teams of 4+ engineers, contributed to open-source SDKs (helping grow downloads 10x), and tuned models for high inference throughput.

🔭 Currently: working on MCP-based agent orchestration and pushing down LLM inference latency with vLLM and SGLang.


⚡ What I've Shipped

I've taken systems from prototype to production across LLM infrastructure and computer vision. A few highlights:

  • 3x latency reduction on a multi-modal RAG platform over a 1M+ document knowledge base
  • 80% GPU memory savings with LoRA/PEFT adapters for cost-efficient production fine-tuning
  • 0.97 mAP on car dent detection & segmentation for an insurance client
  • 90% MOTA on sports analytics pipelines processing millions of frames/week
  • 10x SDK download growth via Clarifai Python SDK & CLI contributions
  • NVIDIA Smart City Hackathon finalist (Asia-Pacific): pothole detection with RT-DETR

🚀 Featured Projects

🗂️ Docwhisper

Ask questions directly against your documents, powered by local LLMs with no cloud required.

A document Q&A system built on a full RAG pipeline: ingest PDFs, chunk and embed them into a vector store, then retrieve and answer with a locally running LLM via Ollama. Fast, private, and runs entirely on your machine. Includes MLflow-based observability: query traces, retrieval quality metrics, and latency are tracked per run for easy debugging and iteration.

Python FastAPI Ollama RAG Vector Search MLflow
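
The chunk, embed, and retrieve steps described above can be sketched in a few lines. This is a minimal illustration and not Docwhisper's actual code: `chunk_text`, `embed`, `retrieve`, and `build_prompt` are hypothetical names, the bag-of-words `embed` is a toy stand-in for a real embedding model, and an in-memory list stands in for the vector store. The final call to a local LLM is left out.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the prompt that would be sent to the local LLM."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

In a real pipeline the `embed` function would call an embedding model and the sorted scan would be replaced by a vector-store query; the overall flow stays the same.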

Benchmark and compare small language models side by side, entirely offline with zero cloud dependency.

Run up to 3 models simultaneously with real-time token streaming and per-model metrics. A built-in benchmark suite covers 18 prompts across 10 categories (including reasoning, code, math, and safety). Interactive Plotly charts visualize throughput, time to first token (TTFT), RAM usage, and quality-vs-speed trade-offs, all on local hardware.

Python Streamlit Ollama Plotly Pandas
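
As a rough illustration of the per-model metrics mentioned above, the sketch below measures TTFT and throughput over any iterable of streamed tokens. It is not the project's code: `StreamMetrics` and `measure_stream` are hypothetical names, and throughput is assumed here to mean total tokens divided by total wall time (some tools instead exclude the time before the first token).

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StreamMetrics:
    tokens: int          # number of tokens received
    ttft_s: float        # time to first token, in seconds
    total_s: float       # wall time for the whole stream, in seconds
    tokens_per_s: float  # tokens divided by total wall time


def measure_stream(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamMetrics:
    """Consume a token stream and record simple timing metrics."""
    start = clock()
    first = None
    count = 0
    for _ in stream:
        now = clock()
        if first is None:
            first = now  # timestamp of the first token
        count += 1
    total = clock() - start
    ttft = (first - start) if first is not None else float("nan")
    tps = count / total if total > 0 else 0.0
    return StreamMetrics(count, ttft, total, tps)
```

The `clock` parameter is injectable so the function can be checked with a fake clock; with a real model, `stream` would be whatever token iterator the serving client returns.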

🤝 Clarifai Python SDK (Contributor)

The official Python client for the Clarifai AI platform: models, workflows, datasets, and deployments via a clean API.

Contributed CLI improvements, new API surface coverage, and developer-experience (DX) enhancements, which helped drive a 10x increase in monthly downloads.

Python gRPC CLI Open Source

Decompose complex queries into subtasks and coordinate specialized agents to research topics end-to-end.

Production-grade multi-agent pipeline: Orchestrator → Search Agent → Summarizer → Critic, connected via A2A typed messaging and MCP servers for tools, memory, and web search. Tracks token usage, latency, cost, and confidence scores per agent. Streamlit dashboard with Plotly charts for visual analytics.

Python LangChain MCP Streamlit Plotly
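
The Orchestrator → Search Agent → Summarizer → Critic flow with per-agent tracking can be sketched as a simple chain of typed messages. This is an illustrative sketch only: it does not use LangChain, MCP, or A2A, the agents are plain placeholder functions, every name (`Message`, `AgentStats`, `Orchestrator`) is hypothetical, and a whitespace word count stands in for real token accounting. Query decomposition, cost, and confidence scoring are omitted.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    sender: str
    content: str


@dataclass
class AgentStats:
    calls: int = 0
    tokens: int = 0
    latency_s: float = 0.0


@dataclass
class Orchestrator:
    # Ordered (name, agent) pairs; each agent maps a Message to a Message.
    agents: list[tuple[str, Callable[[Message], Message]]]
    stats: dict[str, AgentStats] = field(default_factory=dict)

    def run(self, query: str) -> Message:
        msg = Message("user", query)
        for name, agent in self.agents:
            start = time.perf_counter()
            msg = agent(msg)
            entry = self.stats.setdefault(name, AgentStats())
            entry.calls += 1
            # Crude token proxy: whitespace-separated words in the output.
            entry.tokens += len(msg.content.split())
            entry.latency_s += time.perf_counter() - start
        return msg


# Placeholder agents; real ones would call an LLM or a tool server.
def search_agent(msg: Message) -> Message:
    return Message("search", f"results for: {msg.content}")


def summarizer(msg: Message) -> Message:
    return Message("summarizer", f"summary of [{msg.content}]")


def critic(msg: Message) -> Message:
    return Message("critic", f"approved: {msg.content}")
```

Keeping the stats keyed by agent name makes it straightforward to feed them into a dashboard table or chart afterwards.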

Other Projects
| Project | Description | Stack |
| --- | --- | --- |
| PyTorch Object Detect & Track | Real-time multi-object detection and tracking in video | Python, PyTorch, YOLO |

🛠️ Tech Stack

LLM & GenAI

vLLM SGLang LangChain RAG MCP LoRA TensorRT-LLM HuggingFace

Models I've Worked With

GPT Claude Llama Qwen DeepSeek Minimax Gemma Whisper

Computer Vision

OpenCV YOLO SAM PyTorch TensorFlow

MLOps & Infrastructure

Docker Kubernetes GitHub Actions ONNX TensorRT AWS Azure GCP

Data & Storage

PostgreSQL Qdrant Python



💬 Let's Build Something

Open to collaborations on LLM infrastructure, computer vision systems, and AI platform engineering.

LinkedIn Email


Pinned

  1. Clarifai/clarifai-python (Public)

    Experience the power of Clarifai’s AI platform with the python SDK. 🌟 Star to support our work!

    Python 42 8

  2. Clarifai/clarifai-python-datautils (Public)

    Extract Transform and Load unstructured data into the Clarifai's AI platform

    Python 8