Quiet Signals Lab

Alexander Hepburn
Data Scientist & Scientific Programmer
Amsterdam, NL

I'm a data scientist and AI engineer based in Amsterdam with a background in information retrieval and applied ML research.

The interesting problem is rarely which model to use—it's how to structure the problem so the model is actually useful. That means knowing when a language model is the right tool, how to build retrieval systems that retrieve the right thing, and how to diagnose why a model that worked in development stops working in production.

Services

A sales or support agent needs two things to work well: a clear model of the conversation it has to hold, and a reliable way to interpret what the user actually means at each step. I work with clients to build both, translating business logic into a structured system and implementing the AI layer that makes it responsive to real language.

Deliverables:

  • Custom agent architecture with configurable business rules
  • Integration with messaging platforms (Meta, WhatsApp, Telegram)
  • State management and fallback handling
  • Human escalation routing
  • Documentation and deployment support
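
To make the two pieces concrete, here is a minimal sketch of a conversation model with fallback handling and human escalation. Everything in it is an assumption for illustration: the state and intent names, the `MAX_FALLBACKS` threshold, and the keyword-based `classify_intent`, which stands in for the AI layer that would interpret real language in a deployed agent.

```python
import re
from dataclasses import dataclass, field

# Hypothetical states and intents, chosen only for illustration.
TRANSITIONS = {
    ("greeting", "ask_price"): "quote",
    ("greeting", "ask_support"): "troubleshoot",
    ("quote", "confirm"): "done",
    ("troubleshoot", "resolved"): "done",
}
MAX_FALLBACKS = 2  # assumed threshold before handing over to a human

# Keyword rules standing in for a real intent model (assumption).
KEYWORDS = {
    "ask_price": {"price", "cost", "quote"},
    "ask_support": {"broken", "help", "error"},
    "confirm": {"yes", "ok", "sure"},
    "resolved": {"works", "fixed", "solved"},
}


@dataclass
class Conversation:
    state: str = "greeting"
    fallbacks: int = 0
    history: list = field(default_factory=list)


def classify_intent(message: str) -> str:
    """Map a message to an intent by whole-word keyword match."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    for intent, keywords in KEYWORDS.items():
        if words & keywords:
            return intent
    return "unknown"


def step(conv: Conversation, message: str) -> str:
    """Advance the conversation by one user message and return the new state."""
    intent = classify_intent(message)
    conv.history.append((conv.state, intent))
    next_state = TRANSITIONS.get((conv.state, intent))
    if next_state is None:
        # No valid transition: stay in place, and escalate after repeated misses.
        conv.fallbacks += 1
        if conv.fallbacks > MAX_FALLBACKS:
            conv.state = "human_escalation"
        return conv.state
    conv.fallbacks = 0
    conv.state = next_state
    return conv.state
```

Keeping the transition table separate from the intent classifier means business rules can change without touching the language layer, and the classifier can be swapped for a model without changing the flow.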

The part of a RAG system most likely to fail is retrieval. A model can only work with what it's given: if the retrieved context is wrong or poorly ranked, the answer will be too, regardless of the model. I build retrieval pipelines where chunking strategy, embedding selection, and reranking are treated as the core engineering problems. The backend uses Python and FastAPI, with support for local models (Ollama) and commercial providers (OpenAI, Mistral), and vector storage via ChromaDB or pgvector.

Deliverables:

  • Production-ready RAG backend with REST API
  • Document ingestion pipeline and vector database setup
  • Retrieval optimisation: chunking, embedding tuning, reranking
  • Source attribution and generation constraints
  • Documentation and handover
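
As a rough sketch of why chunking and ranking carry so much weight, the example below splits documents into overlapping word chunks, scores them against a query, and returns the best matches with their source. It is a toy under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for ChromaDB or pgvector, and all function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=40, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    stride = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, stride)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents, size=40, overlap=10):
    """Return a list of (source, chunk, vector) entries, keeping the source for attribution."""
    return [
        (source, chunk, embed(chunk))
        for source, text in documents.items()
        for chunk in chunk_words(text, size, overlap)
    ]


def retrieve(query, index, k=3):
    """Return the k highest-scoring chunks as (score, source, chunk), best first."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, vector), source, chunk) for source, chunk, vector in index]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:k]


# Hypothetical documents, for illustration only.
documents = {
    "refunds.md": "Refunds are issued within 14 days of purchase.",
    "shipping.md": "Orders ship from the Amsterdam warehouse every weekday.",
}
index = build_index(documents)
top = retrieve("how long do refunds take", index, k=1)
```

Because every entry keeps its source alongside the chunk, the generation step can cite where an answer came from, and retrieval quality can be measured on its own before any model is involved.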

When a model degrades in production, there's usually a specific, isolatable reason — a shift in input distribution, a data quality issue, a pattern that didn't generalise. I run controlled ablation studies to identify and rank contributing factors, so fixes can be prioritised by evidence rather than intuition.

Deliverables:

  • Root cause diagnosis with quantified impact analysis
  • Ablation studies isolating contributing factors
  • Recommendations ranked by priority
  • Documented failure modes and monitoring thresholds
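
The idea of ranking contributing factors can be sketched as a one-at-a-time ablation: measure the metric with all suspected factors present, remove each in turn, and sort by how much the metric recovers. The factor names, the penalty values, and `toy_evaluate` below are invented for illustration; in a real study the evaluation function would rerun the model on held-out data under each condition.

```python
# Hypothetical factors and their effect on accuracy, for illustration only.
PENALTY = {"input_drift": 0.08, "label_noise": 0.03, "new_locale": 0.01}
CLEAN_ACCURACY = 0.90  # assumed accuracy with no degrading factor present


def toy_evaluate(active_factors):
    """Stand-in for a real evaluation run: accuracy drops by each active factor's penalty."""
    return CLEAN_ACCURACY - sum(PENALTY[name] for name in active_factors)


def ablate(evaluate, factors):
    """Remove one factor at a time and rank factors by how much the metric recovers."""
    all_factors = frozenset(factors)
    baseline = evaluate(all_factors)
    recovered = {
        name: evaluate(all_factors - {name}) - baseline
        for name in factors
    }
    ranked = sorted(recovered.items(), key=lambda item: item[1], reverse=True)
    return baseline, ranked


baseline, ranked = ablate(toy_evaluate, list(PENALTY))
for name, gain in ranked:
    print(f"{name}: +{gain:.2f} accuracy when removed")
```

One-at-a-time ablation assumes the factors act roughly independently; when they interact, removing combinations of factors gives a more faithful picture at the cost of more evaluation runs.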

Stack & Methods

Python · PyTorch · FastAPI · PostgreSQL · ChromaDB · AWS · Hybrid search · Semantic reranking · Embedding optimisation · pgvector · Ablation studies · Distribution shift analysis · Hypothesis testing · Causal inference