Study Assistant

Try It

Paste your notes, then ask a question. The AI answers from your content only — nothing external.

Demo Video

Runs locally — clone the repo and follow the setup instructions to get started.

Description

Inspiration: Built out of frustration with tools like Quizlet and NotebookLM. I have ADHD and strongly detest studying, so I wanted a tool that could ingest my own notes and help me learn the material in more depth than those tools allow, and faster.

What it does: Accepts Markdown files, PDFs, and images as input and uses a local RAG pipeline to extract and index the content. You can then query your notes conversationally — the assistant retrieves the most relevant chunks from ChromaDB and responds using a local Ollama model. Available as both a CLI and a Streamlit Web UI. The Web UI additionally supports direct camera uploads for photographing physical notes.

How it was built: Python is the core language. ChromaDB handles the vector database layer for semantic retrieval. Ollama runs the local LLM inference so nothing leaves your machine. Streamlit powers the Web UI.

Status: Actively in development. Submitted to the 2026 Hack America Hackathon.

How the RAG Pipeline Works

  RAG Pipeline — Study Assistant
  ════════════════════════════════

  INGEST (run once per document)           QUERY (run each conversation turn)

  ┌──────────────────┐                     ┌──────────────────┐
  │   Your Notes     │                     │   Your Question  │
  │  .md / PDF / img │                     │  "Explain X to   │
  │  (or camera snap)│                     │   me simply"     │
  └────────┬─────────┘                     └────────┬─────────┘
           │                                        │
           ▼                                        ▼
  ┌────────────────┐                      ┌─────────────────┐
  │    Chunker     │                      │   Embed Query   │
  │ split text     │                      │ text → vector   │
  │ into segments  │                      │ [...0.2, 0.8..] │
  └────────┬───────┘                      └────────┬────────┘
           │                                       │
           ▼                                       │
  ┌─────────────────────────────────────────────────────────┐
  │                       ChromaDB                          │
  │                    (Vector Store)                       │
  │                                                         │
  │  chunk₁ → [0.1, 0.7, ...]   ◄── cosine similarity ────►│◄──┘
  │  chunk₂ → [0.3, 0.2, ...]       picks top-K chunks     │
  │  chunk₃ → [0.8, 0.1, ...]                               │
  └──────────────────────────────┬──────────────────────────┘
                                 │  top-K relevant chunks
                                 ▼
                       ┌─────────────────┐
                       │   Ollama / LLM  │  (or Groq / Anthropic
                       │                 │   if no local GPU)
                       │  prompt =       │
                       │  context chunks │
                       │  + your query   │
                       └────────┬────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │     Answer      │
                       │  grounded in    │
                       │  your own notes │
                       └─────────────────┘

When you use Ollama, nothing leaves your machine. Cloud models (Groq, Anthropic) are opt-in, intended for hardware without a local GPU.
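
The query side of the diagram can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's code: `embed` is a toy bag-of-words stand-in for a real embedding model, the helper names (`top_k_chunks`, `build_prompt`) and the prompt wording are hypothetical, and in the actual tool ChromaDB performs the similarity search.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the LLM prompt: retrieved chunks first, then the question."""
    joined = "\n\n".join(context)
    return (
        "Answer using only the notes below.\n\n"
        f"Notes:\n{joined}\n\n"
        f"Question: {query}"
    )


notes = [
    "Mitosis is cell division that produces two identical daughter cells.",
    "The French Revolution began in 1789.",
    "Meiosis is cell division that produces four gametes.",
]
question = "What does mitosis produce in cell division?"
context = top_k_chunks(question, notes, k=2)
prompt = build_prompt(question, context)
print(prompt)
```

Only the retrieved chunks reach the prompt, which is what keeps the answer grounded in your notes: the unrelated history note is never shown to the model.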

Dev Notes

Problem Solved

Existing study tools don't understand your specific notes — they're generic. This tool ingests your own material and queries it semantically, so the answers are grounded in what you actually need to learn.

New Tech Learned

ChromaDB for vector storage and semantic retrieval, and Ollama for running local LLM inference with no API cost and no data leaving the machine.
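
For a sense of what local inference looks like, here is a minimal sketch of sending a prompt to Ollama's HTTP API using only the standard library. It assumes an Ollama server running on its default port (11434); the model name `llama3` and the function names are placeholders for illustration, not necessarily what this project uses.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build the POST request; "stream": False asks for a single JSON reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_ollama("Explain mitosis simply")` only talks to `localhost`, so the prompt and your notes stay on the machine.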

What's Next

More input formats, a better chunking strategy for long PDFs, and possibly a CLI-side camera capture feature to match the Web UI. Check the GitHub repo for the latest.
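
One possible direction for the chunking work is a sliding window with overlap, so a sentence that straddles a chunk boundary still appears whole in at least one chunk. The sketch below is an assumption about how that could look, not the project's current chunker; the function name and the default sizes are made up for illustration.

```python
def chunk_with_overlap(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_with_overlap(sample, size=4, overlap=2))
# ['0 1 2 3', '2 3 4 5', '4 5 6 7', '6 7 8 9']
```

Larger overlap improves recall at chunk boundaries but stores more duplicate text, so the two values would need tuning against real PDFs.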