PH AI Works RAG Demo
phaiworks.com

Clinical Knowledge Base — RAG Retrieval Demo

A live retrieval pipeline over clinically structured biomarker briefs. Your question is embedded by Pinecone's hosted model, then matched against the index behind a hard metadata filter. Answers are returned verbatim from approved entries; no language model generates them.

Demo only. The underlying entries use illustrative UK reference values for demonstration and are not medical advice. Always consult a healthcare professional. Not affiliated with any clinical provider or product.

How consistency is enforced

The discipline behind the demo — the part that usually breaks in a RAG knowledge base.

  • One schema, enforced in code

    Every brief is validated against a Pydantic schema with controlled vocabularies before it is ever embedded. Invalid entries never reach the index.

  • Deterministic IDs, no duplicates

    Each entry's vector ID is a hash of its source brief, so re-ingesting upserts in place. A content hash skips unchanged entries to avoid needless re-embedding.

  • Hard metadata filter

    Queries filter on approved status, language and optional category before vector similarity is scored, so unapproved or off-category content can't surface.

  • Anti-hallucination threshold

    Matches scoring below a minimum cosine-similarity threshold are dropped, so the assistant declines rather than inventing an answer. No LLM generates the response text.
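
The ingest-side rules above (validate first, deterministic IDs, skip unchanged content) can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code. It assumes the vector ID is hashed from a stable source key such as the brief's file path, so that edited content still upserts in place. The vocabularies, field names, and the functions `validate`, `vector_id`, `content_hash` and `ingest` are all hypothetical. A plain function stands in for the Pydantic schema, and a dict stands in for the Pinecone index.

```python
import hashlib
import json
from typing import Dict, List

# Hypothetical controlled vocabularies; the real schema is not shown in the source.
ALLOWED_STATUS = {"draft", "approved"}
ALLOWED_LANGUAGE = {"en"}


def validate(brief: Dict[str, str]) -> None:
    """Stdlib stand-in for the Pydantic check: reject out-of-vocabulary values."""
    if brief.get("status") not in ALLOWED_STATUS:
        raise ValueError(f"invalid status: {brief.get('status')!r}")
    if brief.get("language") not in ALLOWED_LANGUAGE:
        raise ValueError(f"invalid language: {brief.get('language')!r}")
    if not brief.get("text"):
        raise ValueError("brief text is empty")


def vector_id(source_key: str) -> str:
    """Deterministic ID: the same source key always yields the same vector ID."""
    return hashlib.sha256(source_key.encode("utf-8")).hexdigest()[:32]


def content_hash(brief: Dict[str, str]) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(brief, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ingest(briefs: Dict[str, Dict[str, str]], index: Dict[str, dict]) -> List[str]:
    """Upsert validated briefs into `index`; return the IDs that needed embedding."""
    embedded = []
    for source_key, brief in briefs.items():
        validate(brief)  # invalid entries never reach the index
        vid = vector_id(source_key)
        digest = content_hash(brief)
        existing = index.get(vid)
        if existing is not None and existing["content_hash"] == digest:
            continue  # unchanged since last ingest: skip re-embedding
        # A real pipeline would embed and upsert here; this sketch stores metadata only.
        index[vid] = {"content_hash": digest, "metadata": dict(brief)}
        embedded.append(vid)
    return embedded
```

Running `ingest` twice on the same briefs returns an empty list the second time. Editing a brief's text re-embeds it under the same ID, so the index never accumulates duplicates.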

Stack: Next.js · Pinecone serverless with integrated inference (llama-text-embed-v2, cosine) — embeddings hosted, no third-party LLM key on the public endpoint.
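
On the query side, the hard metadata filter and the cosine threshold described above combine into a filter-then-score-then-decline flow. The sketch below shows that flow in memory with plain Python, as an illustration only. In the real stack the filter and similarity search run inside Pinecone. The `MIN_SCORE` value, the field names, and the `cosine` and `retrieve` functions are assumptions, not taken from the demo.

```python
import math
from typing import Dict, List, Optional, Sequence

MIN_SCORE = 0.5  # hypothetical threshold; the demo's actual value is not published


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Sequence[float],
    entries: List[Dict],
    language: str,
    category: Optional[str] = None,
    min_score: float = MIN_SCORE,
) -> Optional[str]:
    """Return the best approved entry's text verbatim, or None to decline."""
    # 1. Hard metadata filter: applied before any similarity is computed.
    candidates = [
        e for e in entries
        if e["status"] == "approved"
        and e["language"] == language
        and (category is None or e["category"] == category)
    ]
    # 2. Score the survivors and drop anything under the threshold.
    scored = [(cosine(query_vec, e["vector"]), e) for e in candidates]
    scored = [(score, e) for score, e in scored if score >= min_score]
    if not scored:
        return None  # decline rather than return a weak match
    # 3. Return the stored text unchanged; nothing is generated.
    best_score, best_entry = max(scored, key=lambda pair: pair[0])
    return best_entry["text"]
```

Because filtering happens first, a draft entry cannot surface even when its vector is identical to the query. A `None` result is the signal for the caller to show a "no approved answer" message.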