The Mother of AI project

A real-world, hands-on AI project series

Shantanu Ladhwe and Shirin Khosravi Jam

Jul 30, 2025

Hey there 👋

We’re Shirin Khosravi Jam and Shantanu Ladhwe - two AI practitioners with 17+ years of combined experience building production-grade ML systems in the wild.

Over the years, we’ve built RAG pipelines, LLM-based agents, recommender systems, MLOps platforms, and everything in between. All in production!

Now, we’re opening that experience to the world - through a hands-on, open-source, no-hype learning initiative we call The Mother of AI Project.

This isn’t a typical AI course or bootcamp.

It’s a build-first, learn-by-doing, production-ready AI roadmap - made up of multiple focused phases, each guiding you through building a complete system from scratch.

The Project Phases

Phase 1 – RAG Systems: Zero to Hero

Build a personalized AI research assistant from scratch

Ingest 1000+ research papers, then chunk, embed, and serve them through hybrid search, LLM-based RAG, and monitoring
(This is what we’re starting with - full details below)

Phase 2 – AI Agents + Tool Use + Monitoring (new project)

Move beyond static RAGs.
Build decision-making agents with memory, planning, and tool use.
Multi-step reasoning, better interactivity, and deeper retrieval.

Phase 3 – Recommendation Systems (new subproject)

Build real-time content-based or hybrid multi-stage recommender systems.
Cover ranking, user personalization, feedback loops, and metrics.

Phase 4 – MLOps + LLMOps (continued)

Take your models to production.
Use the most popular cloud services and pipelines.
CI/CD, evaluation harnesses, fine-tuning, prompt versioning, data pipelines, security, and testing - everything!

Phase 5 – Full App Integration + Cloud Deployment (continued)

Based on the most suitable of the projects above!
Bring everything together.
Containerization, API orchestration, front-end demo, cloud deployment (AWS/GCP), CI/CD, IaaS, cost optimization, etc.

Phase 6 – Monitoring + Alerting Mastery (continued)

Build reliable systems that never go silent.
Logging, tracing, drift detection, alerts, and incident-ready dashboards.

Each phase isn’t a toy demo. It’s a production-grade system you’d be proud to showcase.

You’ll learn how to think and build like a real-world AI/ML engineer - using tools that teams actually use in production: Docker, FastAPI, Airflow, Ollama, LangGraph, OpenSearch, Langfuse, and many more.
(We only teach what’s truly useful and community-backed.)

💼 Everything you build is CV-worthy!

We won’t just walk you through best practices.
We’ll also show you how to extend each system, adapt it for your own use case, and communicate it effectively on your resume or portfolio.

Who is this for?

If you’re tired of surface-level tutorials and want to go deeper - this is for you.

Whether you’re:

A student working toward your first real AI project
A Data Scientist looking to level up your software and infra skills
A Data/Backend/Software Engineer curious about LLMs, RAG, or Agents
A working AI/ML Engineer who wants to learn GenAI systems, for real
Anyone looking to enter this industry by building substantial projects

Prerequisites: Good Python knowledge and a general understanding of software development.

This will accelerate your path - with structure, hands-on examples, and mentorship from people who’ve built this in the wild.

Next: dive into Phase 1: RAG Systems (Zero → Hero)

Phase 1: Build Your Own AI Research Assistant

We’re kicking off the Mother of AI Project with what we believe is one of the most impactful AI skills right now:

RAG (Retrieval-Augmented Generation) - used in everything from chatbots to search, internal knowledge bases, agents, and assistants.

But we’re not stopping at “just” RAG.

You’ll build a complete research assistant that:

Runs automated data ingestion pipelines
Downloads and parses 100+ academic PDFs (e.g. arXiv papers) through APIs
Searches across papers with keywords and embeddings
Answers questions using local or hosted LLMs (Ollama, OpenAI, etc.)
Shows sources, gives feedback, and maintains privacy
All in a complete, production-grade system!

This is what we call the ArXiv Paper Curator - an AI system that turns the flood of new AI research into structured, searchable knowledge.
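Before a paper can be searched by keywords or embeddings, its text has to be split into retrievable pieces. Here is a minimal sketch of that step, assuming plain fixed-size word windows with overlap. The `Chunk` schema, the `chunk_words` helper, and the default sizes are hypothetical illustrations, not the project's actual code, and a semantic-aware chunker would replace the splitting logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrievable unit of a paper (hypothetical schema)."""
    paper_id: str  # any stable identifier for the source paper
    index: int     # position of the chunk within the paper
    text: str


def chunk_words(paper_id: str, text: str, size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split text into windows of `size` words; neighbours share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        chunks.append(Chunk(paper_id, len(chunks), " ".join(window)))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Tiny example with made-up text so the windows are easy to inspect.
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words("paper-001", sample, size=4, overlap=1):
    print(chunk.index, chunk.text)
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some words twice.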

What you’ll build (Technical Breakdown)

You will build, from scratch, a production-grade RAG system that runs fully locally with optional API integration, including:

Data Ingestion: Auto-download PDFs daily from arXiv using Airflow
Dual Parsing: Extract structured content via GROBID + Docling fallback
Metadata Storage: Store paper metadata (authors, titles, abstracts, etc.) in PostgreSQL
Search Engine: Use OpenSearch with BM25 + semantic vectors (hybrid)
Chunking Engine: Semantic-aware chunking (evaluate different chunking strategies)
Embedding Store: SentenceTransformers + LlamaIndex indexing
RAG Pipeline: Query expansion + retrieval + prompt templating
Local LLM: Answer questions using Ollama or an API (Llama 3, OpenAI, etc.)
Observability: Use Langfuse for prompt versioning, tracing, and quality tracking
Evaluation: RAGAS metrics, nDCG scoring, accuracy, latency tracking
Frontend: Ask questions and explore results via Streamlit or Gradio
FastAPI Backend: Async API server for integration and extensions
Dev Best Practices: uv, ruff, pre-commit, pydantic, pytest, logging, etc.
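To give a feel for the hybrid part of the search engine, here is a minimal sketch of one common way to merge a keyword (BM25) ranking with a vector ranking: reciprocal rank fusion. This is an assumption for illustration only. The list above does not say which fusion method the project uses, and the function name, document IDs, and `k = 60` default are hypothetical choices that do not call any OpenSearch API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several best-first lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1, so items ranked well by several retrievers rise.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by ID so the output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results for one query, ordered best-first by each retriever.
bm25_hits = ["paper_a", "paper_b", "paper_c"]
vector_hits = ["paper_c", "paper_a", "paper_d"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Rank-based fusion sidesteps the fact that BM25 scores and vector similarities live on different scales. Score normalization followed by a weighted average is another option worth comparing.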