Clinical Knowledge Base — RAG Retrieval Demo

A live retrieval pipeline over clinically structured biomarker briefs. Your question is embedded by Pinecone's hosted model, then matched against the index behind a hard metadata filter. Answers are returned verbatim from approved entries; no language model generates them.

Demo only. The underlying entries use illustrative UK reference values for demonstration and are not medical advice. Always consult a healthcare professional. Not affiliated with any clinical provider or product.

How consistency is enforced

The discipline behind the demo — the part that usually breaks in a RAG knowledge base.

One schema, enforced in code
Every brief is validated against a Pydantic schema with controlled vocabularies before it is ever embedded. Invalid entries never reach the index.
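As a rough illustration of that validation gate, here is a minimal sketch assuming Pydantic is installed. The field names and vocabulary values (BiomarkerBrief, Category, Status, validate_briefs) are hypothetical, chosen for this example; the demo's actual schema is not shown in the source.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError

# Hypothetical controlled vocabularies; the real lists are not given here.
Category = Literal["lipids", "thyroid", "vitamins"]
Status = Literal["draft", "approved"]


class BiomarkerBrief(BaseModel):
    """Illustrative shape of one brief (assumed fields)."""

    source_id: str = Field(min_length=1)
    title: str = Field(min_length=1)
    category: Category
    status: Status
    language: Literal["en"]
    body: str = Field(min_length=1)


def validate_briefs(raw_entries):
    """Split raw dicts into validated briefs and (entry, error) rejections."""
    valid, rejected = [], []
    for raw in raw_entries:
        try:
            valid.append(BiomarkerBrief(**raw))
        except ValidationError as exc:
            # Rejected entries are reported and never passed on to embedding.
            rejected.append((raw, str(exc)))
    return valid, rejected
```

Only the entries in the first returned list would go on to be embedded; an entry whose category falls outside the vocabulary ends up in the second list with its error message.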
Deterministic IDs, no duplicates
Each entry's vector ID is a hash of its source brief, so re-ingesting upserts in place rather than creating duplicates. A content hash lets the pipeline skip unchanged entries and avoid needless re-embedding.
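One way this could look, as a sketch under stated assumptions: the ID is derived from a stable source identifier (assumed here to be a path-like string), and the content hash is compared with the value recorded on the previous run. The function names and the stored_hashes mapping are hypothetical.

```python
import hashlib


def vector_id(source_id: str) -> str:
    """Stable ID from the brief's source identifier (assumed to be a path-like string)."""
    return hashlib.sha256(source_id.encode("utf-8")).hexdigest()[:32]


def content_hash(text: str) -> str:
    """Fingerprint of the brief's text, used only to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_upserts(briefs, stored_hashes):
    """Return the records that need (re-)embedding.

    briefs: iterable of (source_id, text) pairs.
    stored_hashes: dict mapping vector ID -> content hash from the last run.
    """
    to_upsert = []
    for source_id, text in briefs:
        vid = vector_id(source_id)
        digest = content_hash(text)
        if stored_hashes.get(vid) == digest:
            continue  # unchanged since the last ingest: skip re-embedding
        to_upsert.append({"id": vid, "text": text, "content_hash": digest})
    return to_upsert
```

Because the ID depends only on the source identifier, an edited brief keeps its ID and overwrites the old vector, while an untouched brief is skipped entirely.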
Hard metadata filter
Queries filter on approved status, language and optional category before vector similarity — so unapproved or off-category content can't surface.
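A sketch of how such a filter might be assembled, using Pinecone's documented $eq metadata operator. The metadata field names (status, language, category) and the build_filter helper are assumptions made for illustration, not the demo's confirmed schema.

```python
def build_filter(language, category=None):
    """Build a metadata filter dict in Pinecone's filter syntax.

    Field names are assumed. The approved-status clause is always present,
    so callers cannot opt out of it.
    """
    flt = {
        "status": {"$eq": "approved"},
        "language": {"$eq": language},
    }
    if category is not None:
        flt["category"] = {"$eq": category}
    return flt
```

The resulting dict would be passed as the filter argument of the query, so similarity ranking only ever sees entries that already satisfy every clause.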
Anti-hallucination threshold
Matches scoring below a minimum cosine-similarity threshold are dropped, so the assistant declines rather than inventing an answer. No LLM generates the response text.
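The cutoff logic can be sketched as follows. The threshold value, the shape of each match (a dict with score and text keys), and the select_answer name are all hypothetical; the demo's actual cutoff is not stated.

```python
MIN_SCORE = 0.5  # hypothetical cutoff; the real value is not given


def select_answer(matches, min_score=MIN_SCORE):
    """Return the best approved text verbatim, or None to signal a decline.

    matches: list of dicts with 'score' (cosine similarity) and 'text'
    (the approved entry's stored text). This shape is assumed.
    """
    kept = [m for m in matches if m["score"] >= min_score]
    if not kept:
        return None  # nothing is close enough: decline instead of guessing
    best = max(kept, key=lambda m: m["score"])
    return best["text"]
```

A None result is what the caller would turn into a "no approved answer found" message; the function never synthesizes text of its own.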

Stack: Next.js · Pinecone serverless with integrated inference (llama-text-embed-v2, cosine) — embeddings hosted, no third-party LLM key on the public endpoint.
