Documentation - RAGTrace

Quick Start

Get RAGTrace running in your project in under 5 minutes. This guide covers the essentials for .NET backends with optional Vue.js debugging panels.

Prerequisites: .NET 8+ runtime, a RAGTrace API key (available upon whitelist approval), and a supported vector database (Pinecone, Qdrant, Weaviate, pgvector, or Milvus).

1. Install the NuGet Package

Add the RAGTrace SDK to your ASP.NET Core project:

dotnet add package RAGTrace.SDK --version 0.9.0-beta

2. Configure the Middleware

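Add RAGTrace to your service container and request pipeline. The middleware automatically intercepts calls to supported embedding providers and vector databases. The example endpoint below additionally wraps each pipeline stage in a manual span using the injected RAGTraceContext.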

using RAGTrace.SDK;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRAGTrace(options =>
{
    options.ApiKey = builder.Configuration["RAGTrace:ApiKey"];
    options.CaptureEmbeddings = true;
    options.TracePipeline = true;
    options.SampleRate = 1.0;  // Capture 100% of requests
});

var app = builder.Build();

app.UseRAGTrace();  // Add before your RAG endpoint handlers

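// embeddingService, vectorDb, and llm stand for your application's own
// embedding, vector search, and LLM clients; they are not part of the SDK.
app.MapPost("/api/ask", async (AskRequest req, RAGTraceContext ctx) =>
{
    // Span 1: turn the query text into an embedding vector
    ctx.StartSpan("embedding");
    var embedding = await embeddingService.Embed(req.Query);
    ctx.EndSpan();

    // Span 2: fetch the top 5 matches from the vector database
    ctx.StartSpan("retrieval");
    var results = await vectorDb.Search(embedding, topK: 5);
    ctx.EndSpan();

    // Span 3: generate the answer from the query and the retrieved results
    ctx.StartSpan("generation");
    var answer = await llm.Generate(req.Query, results);
    ctx.EndSpan();

    return Results.Ok(new { answer });
});

app.Run();

// Minimal request body for /api/ask; adjust to your own model
record AskRequest(string Query);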

Traces & Spans

RAGTrace organizes debugging data into traces and spans. A trace represents a single end-to-end RAG request. Spans represent individual operations within that trace (embedding, retrieval, prompt assembly, generation).

Each span captures:

Start and end timestamps with microsecond precision
Input and output data (query text, vectors, retrieved documents)
Metadata (model name, index name, similarity scores)
Error states and exception details
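
To make the four kinds of span data listed above concrete, here is a minimal TypeScript sketch of a span record and a helper that fills it in. The names SpanRecord, Clock, recordSpan, and durationMicros are hypothetical and chosen for illustration; they are not the RAGTrace SDK's actual types. The clock is passed in so the example does not depend on any particular timer API.

```typescript
// Hypothetical shapes for illustration only; not part of the RAGTrace SDK.
interface SpanRecord {
  name: string;
  startMicros: number; // start timestamp in microseconds
  endMicros: number; // end timestamp in microseconds
  input: unknown; // e.g. query text
  output: unknown; // e.g. vector or retrieved documents
  metadata: Record<string, string | number>; // e.g. model name, index name
  error?: { type: string; message: string }; // set only when the operation threw
}

// Returns the current time in microseconds.
type Clock = () => number;

// Runs `operation`, timing it and recording either its output or its error.
function recordSpan<I, O>(
  name: string,
  input: I,
  metadata: Record<string, string | number>,
  clock: Clock,
  operation: (input: I) => O,
): SpanRecord {
  const startMicros = clock();
  try {
    const output = operation(input);
    return { name, startMicros, endMicros: clock(), input, output, metadata };
  } catch (err) {
    const error =
      err instanceof Error
        ? { type: err.name, message: err.message }
        : { type: "Unknown", message: String(err) };
    return {
      name,
      startMicros,
      endMicros: clock(),
      input,
      output: undefined,
      metadata,
      error,
    };
  }
}

// Elapsed time of a finished span, in microseconds.
function durationMicros(span: SpanRecord): number {
  return span.endMicros - span.startMicros;
}
```

Capturing the error inside the record, rather than rethrowing, keeps a failed stage visible in the trace with its timing intact. A real tracer would likely also rethrow so the request still fails as usual.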

Embedding Capture

When CaptureEmbeddings is enabled, RAGTrace records the full embedding vector for each query. This enables similarity analysis and drift detection in the dashboard.

// Embeddings are captured automatically when using supported providers:
// - OpenAI (text-embedding-3-small, text-embedding-3-large)
// - Azure OpenAI
// - Cohere (embed-v3)
// - HuggingFace Inference API

// For custom embedding providers, use manual instrumentation:
ctx.CaptureEmbedding("custom-model", inputText, vectorOutput);
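
How the dashboard computes similarity and drift is not described here. As a rough illustration of what captured vectors make possible, the TypeScript sketch below uses cosine similarity and compares the centroid of a baseline set of query embeddings with the centroid of a recent set. The function names and the centroid-based drift score are assumptions made for this example, not RAGTrace behavior.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Component-wise mean of a set of vectors.
function centroid(vectors: readonly (readonly number[])[]): number[] {
  if (vectors.length === 0) {
    throw new Error("need at least one vector");
  }
  const dim = vectors[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) {
      throw new Error("all vectors must have the same dimension");
    }
    for (let i = 0; i < dim; i++) {
      sum[i] += v[i];
    }
  }
  return sum.map((x) => x / vectors.length);
}

// Hypothetical drift score: 0 means the two centroids point the same way,
// larger values mean the recent queries have moved away from the baseline.
function driftScore(
  baseline: readonly (readonly number[])[],
  recent: readonly (readonly number[])[],
): number {
  return 1 - cosineSimilarity(centroid(baseline), centroid(recent));
}
```

A centroid comparison is cheap but coarse: it can miss drift that changes the spread of queries without moving their mean.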

Retrieval Tracking

RAGTrace hooks into your vector database client to capture search parameters, result sets, and similarity scores. Supported databases include:

Pinecone — Full query and response capture with namespace tracking
Qdrant — Filter conditions, payload inspection, and scroll operations
Weaviate — GraphQL query capture with hybrid search support
pgvector — SQL query logging with explain plans
Milvus — Collection stats and partition tracking

Prompt Tracing

Understand exactly what your LLM receives as input. RAGTrace captures the assembled prompt template, marks where retrieved context was injected, and tracks token counts for each section.

// Prompts are captured automatically for supported LLM providers.
// For custom setups, instrument the prompt manually:

ctx.TracePrompt(new PromptTrace
{
    Template = "Answer based on: {context}\n\nQuestion: {query}",
    Variables = new Dictionary<string, string>
    {
        ["context"] = string.Join("\n", retrievedDocs),
        ["query"] = userQuery
    },
    TotalTokens = tokenCount
});
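
The idea of marking injected context and counting tokens per section can be sketched independently of the SDK. The TypeScript example below fills a template that uses the same {name} placeholder style as above and reports which parts came from the template and which from variables. It is an illustrative sketch only: assemblePrompt and PromptSection are hypothetical names, and approximateTokens counts whitespace-separated words as a stand-in for a real model tokenizer.

```typescript
interface PromptSection {
  name: string; // "template" for literal text, otherwise the variable name
  text: string;
  tokens: number;
}

// Rough stand-in for a tokenizer: counts whitespace-separated words.
// Real token counts depend on the model's tokenizer and will differ.
function approximateTokens(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Fills {name} placeholders and records each literal or injected section.
function assemblePrompt(
  template: string,
  variables: Record<string, string>,
): { prompt: string; sections: PromptSection[]; totalTokens: number } {
  // Splitting on a capturing group keeps the placeholder names:
  // even indexes are literal text, odd indexes are variable names.
  const parts = template.split(/\{(\w+)\}/);
  const sections: PromptSection[] = [];
  parts.forEach((part, i) => {
    const isVariable = i % 2 === 1;
    let text = part;
    if (isVariable) {
      const value = variables[part];
      if (value === undefined) {
        throw new Error(`missing variable: ${part}`);
      }
      text = value;
    }
    if (text !== "") {
      sections.push({
        name: isVariable ? part : "template",
        text,
        tokens: approximateTokens(text),
      });
    }
  });
  const prompt = sections.map((s) => s.text).join("");
  const totalTokens = sections.reduce((sum, s) => sum + s.tokens, 0);
  return { prompt, sections, totalTokens };
}
```

With the template shown above, the result contains four sections in order: template, context, template, query. Comparing the context section's token count with the total shows how much of the prompt budget the retrieved documents consume.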

.NET SDK Reference

The .NET SDK provides the following core APIs:

AddRAGTrace() — Registers RAGTrace services in the DI container
UseRAGTrace() — Adds the tracing middleware to the pipeline
RAGTraceContext — Injected context for manual span creation
StartSpan() / EndSpan() — Manual span lifecycle management
CaptureEmbedding() — Record custom embedding vectors
TracePrompt() — Capture assembled prompt templates

Vue.js Debug Panel

The embeddable Vue.js component provides a real-time debugging interface for development environments.

npm install @ragtrace/vue

<template>
  <div id="app">
    <!-- Your app content -->

    <RAGTracePanel
      :api-key="rtApiKey"
      :session-id="currentSessionId"
      :show-vectors="true"
      :show-prompt-diff="true"
      theme="dark"
      position="bottom-right"
    />
  </div>
</template>

<script setup>
import { RAGTracePanel } from '@ragtrace/vue'
import { ref } from 'vue'

const rtApiKey = ref(import.meta.env.VITE_RAGTRACE_KEY)
const currentSessionId = ref(null)
</script>

REST API

The RAGTrace REST API provides programmatic access to all trace data. Authenticate by sending your API key as a Bearer token in the Authorization header.

GET /api/v1/traces
Authorization: Bearer rt_live_...

# Response
{
  "traces": [
    {
      "id": "tr_abc123",
      "timestamp": "2026-03-13T10:30:00Z",
      "duration_ms": 245,
      "spans": [
        { "name": "embedding", "duration_ms": 42 },
        { "name": "retrieval", "duration_ms": 18 },
        { "name": "generation", "duration_ms": 185 }
      ],
      "metadata": {
        "model": "text-embedding-3-small",
        "vector_db": "qdrant",
        "top_k": 5,
        "max_similarity": 0.9247
      }
    }
  ]
}
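
Once a response like the one above has been fetched, it can be analyzed client-side. The TypeScript sketch below types the payload using only the field names visible in the sample response and derives a few numbers from each trace. It assumes spans run sequentially and do not overlap, which the sample suggests but this page does not state; the helper names are made up for the example.

```typescript
// Types inferred from the sample response; other fields may exist.
interface SpanSummary {
  name: string;
  duration_ms: number;
}

interface Trace {
  id: string;
  timestamp: string;
  duration_ms: number;
  spans: SpanSummary[];
  metadata: Record<string, string | number>;
}

interface TraceListResponse {
  traces: Trace[];
}

// Parses a response body and checks the top-level shape.
function parseTraceList(body: string): TraceListResponse {
  const parsed = JSON.parse(body) as Partial<TraceListResponse>;
  if (!Array.isArray(parsed.traces)) {
    throw new Error("response has no 'traces' array");
  }
  return { traces: parsed.traces };
}

// Time in the trace not covered by any span (assumes non-overlapping spans).
function untracedMs(trace: Trace): number {
  const traced = trace.spans.reduce((sum, s) => sum + s.duration_ms, 0);
  return trace.duration_ms - traced;
}

// The span with the largest duration, or undefined if there are no spans.
function slowestSpan(trace: Trace): SpanSummary | undefined {
  return trace.spans.reduce<SpanSummary | undefined>(
    (slowest, s) =>
      slowest === undefined || s.duration_ms > slowest.duration_ms ? s : slowest,
    undefined,
  );
}

// Each span's fraction of the total trace duration.
function spanShares(trace: Trace): { name: string; share: number }[] {
  return trace.spans.map((s) => ({
    name: s.name,
    share: s.duration_ms / trace.duration_ms,
  }));
}
```

For the sample trace, the three spans add up to 42 + 18 + 185 = 245 ms, so untracedMs returns 0, and generation is the slowest span at roughly 76% of the total.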

A/B Testing

Compare different RAG configurations side by side. Create experiments to test embedding models, chunking strategies, or retrieval parameters.

ctx.StartExperiment("embedding-comparison", new ExperimentConfig
{
    Variants = new[]
    {
        new Variant("openai-small", () => openai.Embed(query, "text-embedding-3-small")),
        new Variant("openai-large", () => openai.Embed(query, "text-embedding-3-large")),
        new Variant("cohere-v3", () => cohere.Embed(query))
    },
    Metric = ExperimentMetric.RetrievalRelevance,
    SampleSize = 1000
});
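
This page does not say how RAGTrace scores or ranks variants. As a rough illustration of a side-by-side comparison, the TypeScript sketch below groups per-request relevance scores by variant name and ranks variants by mean relevance. Observation, VariantSummary, and summarizeVariants are hypothetical names, and the relevance values are assumed to come from whatever metric the experiment records.

```typescript
// One scored request, tagged with the variant that served it (hypothetical shape).
interface Observation {
  variant: string;
  relevance: number;
}

interface VariantSummary {
  variant: string;
  count: number;
  meanRelevance: number;
}

// Groups observations by variant and sorts by mean relevance, best first.
function summarizeVariants(
  observations: readonly Observation[],
): VariantSummary[] {
  const totals = new Map<string, { count: number; sum: number }>();
  for (const { variant, relevance } of observations) {
    const entry = totals.get(variant) ?? { count: 0, sum: 0 };
    entry.count += 1;
    entry.sum += relevance;
    totals.set(variant, entry);
  }
  return Array.from(totals.entries())
    .map(([variant, { count, sum }]) => ({
      variant,
      count,
      meanRelevance: sum / count,
    }))
    .sort((a, b) => b.meanRelevance - a.meanRelevance);
}
```

A difference in means is only a starting point. With a sample size such as the 1000 configured above, a significance test or confidence interval would be needed before declaring a winner.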

Alerts & Webhooks

Set up quality regression alerts to catch issues before they impact production. RAGTrace monitors key metrics and triggers notifications when thresholds are crossed.

Similarity Drop Alert — Fires when average retrieval similarity falls below a threshold
Latency Spike Alert — Triggers when P95 retrieval latency exceeds a limit
Context Overflow Alert — Warns when prompts approach token limits
Webhook Destinations — Slack, Discord, PagerDuty, custom HTTP endpoints
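
The three alert conditions above can be expressed as simple checks over a window of recent measurements. The TypeScript sketch below is an assumption-laden illustration, not RAGTrace's implementation: the type names, the alert identifiers, the nearest-rank percentile method, and the overflowRatio threshold are all invented for the example.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of
// samples at or below it. Other percentile definitions interpolate.
function percentile(values: readonly number[], p: number): number {
  if (values.length === 0) {
    throw new Error("no samples");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in (0, 100]");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Hypothetical measurement window and thresholds.
interface MetricWindow {
  similarities: number[]; // retrieval similarity scores
  retrievalLatenciesMs: number[]; // retrieval latencies in milliseconds
  promptTokens: number; // tokens in the latest assembled prompt
}

interface Thresholds {
  minMeanSimilarity: number;
  maxP95LatencyMs: number;
  tokenLimit: number; // model context limit
  overflowRatio: number; // warn at this fraction of the limit, e.g. 0.9
}

// Returns the identifiers of every alert whose condition is met.
function evaluateAlerts(w: MetricWindow, t: Thresholds): string[] {
  const fired: string[] = [];
  if (w.similarities.length > 0) {
    const mean =
      w.similarities.reduce((sum, x) => sum + x, 0) / w.similarities.length;
    if (mean < t.minMeanSimilarity) {
      fired.push("similarity_drop");
    }
  }
  if (
    w.retrievalLatenciesMs.length > 0 &&
    percentile(w.retrievalLatenciesMs, 95) > t.maxP95LatencyMs
  ) {
    fired.push("latency_spike");
  }
  if (w.promptTokens >= t.tokenLimit * t.overflowRatio) {
    fired.push("context_overflow");
  }
  return fired;
}
```

Using P95 rather than the mean for latency means a handful of very slow retrievals will trigger the alert even when typical requests are fast, which is usually the point of a spike alert.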

Self-Hosting

RAGTrace can be self-hosted for teams with strict data residency requirements. The self-hosted edition runs as a Docker container with PostgreSQL for trace storage.

docker pull ragtrace/server:latest

docker run -d \
  --name ragtrace \
  -p 8080:8080 \
  -e DATABASE_URL="postgresql://user:pass@host:5432/ragtrace" \
  -e RT_LICENSE_KEY="your-license-key" \
  ragtrace/server:latest