
Overview

ExuluReranker is a component that improves search result quality by reordering chunks based on relevance to the user’s query. After initial vector or hybrid search retrieves candidate results, a reranker applies more sophisticated relevance scoring to surface the most useful chunks at the top.

Key features

Post-search refinement

Reorder search results after initial retrieval for better relevance

Model-agnostic

Use any reranking API (Cohere, Voyage AI, custom models)

Simple integration

Drop-in enhancement for ExuluContext search

Flexible scoring

Custom reranking logic or third-party services

What is a reranker?

A reranker is a specialized model or algorithm that:
  1. Receives initial search results from vector or hybrid search
  2. Analyzes query-document pairs to compute refined relevance scores
  3. Reorders the results to surface the most relevant chunks first
  4. Returns the reordered chunks to the application
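Rerankers typically use cross-attention models (cross-encoders) or larger language models that read the query and the document together. This gives better relevance than comparing precomputed embeddings, at the cost of higher latency per document. That trade-off is why rerankers are applied to a small candidate set after retrieval rather than to the whole index.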
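The four steps above can be sketched as a small function. This is a minimal illustration, not how ExuluReranker is implemented: the `Chunk` type is trimmed to two fields, and `overlapScore` is a hypothetical stand-in for a real cross-encoder or reranking API.

```typescript
// Trimmed-down chunk shape for illustration (assumption, not the full type).
type Chunk = { chunk_id: string; chunk_content: string };

// Hypothetical stand-in for a real relevance model: the fraction of
// query terms that appear somewhere in the document text.
function overlapScore(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((term) => haystack.includes(term)).length;
  return hits / terms.length;
}

// Steps 1-4: receive candidates, score each query-document pair,
// sort by score (highest first), and return the reordered chunks.
function rerank(query: string, candidates: Chunk[]): Chunk[] {
  return candidates
    .map((chunk) => ({ chunk, score: overlapScore(query, chunk.chunk_content) }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.chunk);
}
```

In a real setup the scoring function would be replaced by a model call; the surrounding map, sort, map shape usually stays the same.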

Why use reranking?

Reranking models analyze the relationship between query and document more deeply than embedding similarity, leading to more accurate results.
Fast vector search retrieves candidates, then slower but more accurate reranking refines the top results. This balances speed and quality.
Custom rerankers can implement business logic, user preferences, or domain-specific relevance criteria.
Rerankers can better understand negations, comparisons, and complex queries that embedding models might miss.
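As an illustration of the business-logic case, the sketch below favors recently updated chunks. The rule itself, the 180-day half-life, and the assumption that `chunk_updated_at` is an ISO 8601 timestamp are all hypothetical choices made for this example, not something the library prescribes.

```typescript
// Subset of the chunk fields used by this example.
type DatedChunk = {
  chunk_id: string;
  chunk_updated_at: string; // assumed to be an ISO 8601 timestamp
  chunk_hybrid_score?: number;
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical business rule: scale the retrieval score by a decay
// factor that halves every `halfLifeDays` since the last update.
function recencyWeightedScore(chunk: DatedChunk, now: Date, halfLifeDays = 180): number {
  const ageMs = now.getTime() - new Date(chunk.chunk_updated_at).getTime();
  const ageDays = Math.max(0, ageMs / MS_PER_DAY);
  const decay = Math.pow(0.5, ageDays / halfLifeDays);
  return (chunk.chunk_hybrid_score ?? 0) * decay;
}

// Returns a new array sorted by the weighted score; the input is not mutated.
function rerankByRecency<T extends DatedChunk>(chunks: T[], now: Date = new Date()): T[] {
  return [...chunks].sort(
    (a, b) => recencyWeightedScore(b, now) - recencyWeightedScore(a, now)
  );
}
```

A function like this could be called from inside a reranker's `execute` callback, either on its own or after a model-based reranking pass.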

Quick start

import { ExuluContext, ExuluReranker } from "@exulu/backend";

// Create a reranker using Cohere
const cohereReranker = new ExuluReranker({
  id: "cohere_reranker",
  name: "Cohere Rerank",
  description: "Reranks search results using Cohere's rerank-english-v3.0 model",
  execute: async ({ query, chunks }) => {
    const response = await fetch("https://api.cohere.com/v1/rerank", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.COHERE_API_KEY}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        model: "rerank-english-v3.0",
        query: query,
        documents: chunks.map(chunk => chunk.chunk_content),
        top_n: 10,
        return_documents: false
      })
    });

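    // Fail loudly on HTTP errors instead of reading an error body as results
    if (!response.ok) {
      throw new Error(`Cohere rerank request failed with status ${response.status}`);
    }

    const data = await response.json();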

    // Reorder chunks based on Cohere's ranking
    return data.results
      .sort((a, b) => b.relevance_score - a.relevance_score)
      .map(result => chunks[result.index]);
  }
});

// Use with ExuluContext
const docsContext = new ExuluContext({
  id: "documentation",
  name: "Documentation",
  description: "Product docs",
  active: true,
  fields: [/* ... */],
  sources: [],
  resultReranker: async (results) => {
    // Read the query from the first chunk's context
    const query = results[0]?.context?.query || "";
    return cohereReranker.run(query, results);
  }
});

Architecture

Integration with ExuluContext

Rerankers integrate with ExuluContext through the resultReranker configuration:
const context = new ExuluContext({
  // ... other config
  resultReranker: async (chunks) => {
    // Access the query from the first chunk's context
    const query = chunks[0]?.context?.query || "";

    // Apply reranking
    return reranker.run(query, chunks);
  }
});
When a search is performed:
  1. Vector/hybrid search retrieves initial candidates
  2. resultReranker is called with the chunks
  3. Reranker reorders the chunks
  4. Reordered results are returned to the agent/user
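Because reranking sits between retrieval and the final response in the flow above, an error inside the reranker could otherwise surface as a failed search. One defensive option is to fall back to the original retrieval order. The wrapper below is a sketch; `withFallback` and `RerankFn` are hypothetical names, not part of the Exulu API.

```typescript
// Generic signature for a rerank call: query plus chunks in, chunks out.
type RerankFn<T> = (query: string, chunks: T[]) => Promise<T[]>;

// Wraps a rerank function so that any thrown error or rejected promise
// results in the original, un-reranked chunks being returned.
function withFallback<T>(
  rerank: RerankFn<T>,
  onError: (err: unknown) => void = () => {}
): RerankFn<T> {
  return async (query, chunks) => {
    try {
      return await rerank(query, chunks);
    } catch (err) {
      onError(err); // e.g. log the failure
      return chunks; // keep the initial retrieval order
    }
  };
}
```

Inside `resultReranker`, the wrapped function would be called in place of the direct `reranker.run(query, chunks)` call, for example `withFallback((q, c) => reranker.run(q, c))(query, chunks)`.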

Chunk structure

Rerankers work with VectorSearchChunkResult objects:
type VectorSearchChunkResult = {
  chunk_content: string;           // The text content
  chunk_index: number;             // Position in original document
  chunk_id: string;                // Unique chunk ID
  chunk_source: string;            // Source item ID
  chunk_metadata: Record<string, string>;
  chunk_created_at: string;
  chunk_updated_at: string;
  item_id: string;                 // Parent item ID
  item_external_id: string;
  item_name: string;               // Parent item name
  item_updated_at: string;
  item_created_at: string;
  chunk_cosine_distance?: number;  // Similarity score
  chunk_fts_rank?: number;         // Keyword score
  chunk_hybrid_score?: number;     // Combined score
  context?: {
    name: string;
    id: string;
  };
};
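Since the three score fields are optional, custom rerankers usually need to handle chunks where some of them are missing. The helper below is a sketch of one way to pick a single "higher is better" value. It assumes `chunk_cosine_distance` holds a standard cosine distance (lower means more similar), which is why it is inverted; `retrievalScore` is a hypothetical name.

```typescript
// Only the optional score fields from VectorSearchChunkResult.
type ChunkScores = {
  chunk_cosine_distance?: number;
  chunk_fts_rank?: number;
  chunk_hybrid_score?: number;
};

// Returns the most specific score available, or undefined if the chunk
// carries no score at all. Values from different fields are on different
// scales, so compare them only among chunks from the same search method.
function retrievalScore(chunk: ChunkScores): number | undefined {
  if (chunk.chunk_hybrid_score !== undefined) return chunk.chunk_hybrid_score;
  if (chunk.chunk_fts_rank !== undefined) return chunk.chunk_fts_rank;
  if (chunk.chunk_cosine_distance !== undefined) {
    // Assumption: distance = 1 - similarity, so invert it.
    return 1 - chunk.chunk_cosine_distance;
  }
  return undefined;
}
```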

Common reranking strategies

Use third-party reranking APIs like Cohere, Voyage AI, or Jina AI:
const apiReranker = new ExuluReranker({
  id: "cohere_rerank",
  name: "Cohere Reranker",
  description: "Uses Cohere rerank API",
  execute: async ({ query, chunks }) => {
    const response = await fetch("https://api.cohere.com/v1/rerank", {
      method: "POST",
      headers: {
        // Read the API key from the environment rather than hard-coding it
        "Authorization": `Bearer ${process.env.COHERE_API_KEY}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        model: "rerank-english-v3.0",
        query,
        documents: chunks.map(c => c.chunk_content),
        top_n: chunks.length
      })
    });

    const data = await response.json();
    return data.results.map(r => chunks[r.index]);
  }
});
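Mapping an API response back onto chunks is the step most likely to go wrong silently, for example when an index is out of range or appears twice. The helper below is a defensive sketch that assumes the `{ index, relevance_score }` result shape used in the examples above; `applyRanking` is a hypothetical name.

```typescript
// Result shape assumed from the rerank responses used in this guide.
type RerankResult = { index: number; relevance_score: number };

// Reorders `chunks` according to `results`, highest relevance first.
// Out-of-range and duplicate indices are skipped instead of producing
// undefined entries or repeated chunks.
function applyRanking<T>(chunks: T[], results: RerankResult[]): T[] {
  const seen = new Set<number>();
  const ordered: T[] = [];
  const byRelevance = [...results].sort(
    (a, b) => b.relevance_score - a.relevance_score
  );
  for (const result of byRelevance) {
    const { index } = result;
    if (!Number.isInteger(index) || index < 0 || index >= chunks.length) continue;
    if (seen.has(index)) continue;
    seen.add(index);
    ordered.push(chunks[index] as T);
  }
  return ordered;
}
```

With a helper like this, the last line of `execute` could become `return applyRanking(chunks, data.results);`.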

Usage patterns

With ExuluContext

The most common pattern is integrating with ExuluContext:
import { ExuluContext, ExuluReranker } from "@exulu/backend";

// Create reranker
const reranker = new ExuluReranker({
  id: "my_reranker",
  name: "My Reranker",
  description: "Custom reranking logic",
  execute: async ({ query, chunks }) => {
    // Your reranking logic goes here; this placeholder keeps the original order
    return chunks;
  }
});

// Use with context
const context = new ExuluContext({
  id: "docs",
  name: "Documentation",
  // ... other config
  resultReranker: async (chunks) => {
    const query = chunks[0]?.context?.query || "";
    return reranker.run(query, chunks);
  }
});

Standalone usage

You can also use rerankers independently:
const chunks = await context.search({
  query: "How do I authenticate?",
  method: "hybridSearch",
  // ... other params
});

const rerankedChunks = await reranker.run(
  "How do I authenticate?",
  chunks.chunks
);

console.log(rerankedChunks[0].chunk_content); // Most relevant result

Conditional reranking

Apply reranking only when needed:
resultReranker: async (chunks) => {
  // Only rerank if we have enough results
  if (chunks.length < 3) {
    return chunks;
  }

  // Only rerank if scores are similar (ambiguous results)
  const scores = chunks
    .map(c => c.chunk_hybrid_score || 0)
    .filter(s => s > 0);

  if (scores.length === 0) return chunks;

  const avgScore = scores.reduce((a, b) => a + b) / scores.length;
  const variance = scores.reduce((sum, score) => sum + Math.pow(score - avgScore, 2), 0) / scores.length;

  // High variance means the scores already separate the results clearly, skip reranking
  if (variance >= 0.01) {
    return chunks;
  }

  // Low variance means similar scores (ambiguous ranking), apply reranking
  const query = chunks[0]?.context?.query || "";
  return reranker.run(query, chunks);
}
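The decision of whether to rerank can be pulled out into a pure function, which makes the thresholds easy to test and tune. This sketch follows the stated intent of reranking only when scores are similar. The defaults (3 results, variance threshold 0.01) come from the example above; the function names are hypothetical.

```typescript
// Population variance of a list of scores (0 for an empty list).
function scoreVariance(scores: number[]): number {
  if (scores.length === 0) return 0;
  const mean = scores.reduce((sum, s) => sum + s, 0) / scores.length;
  return scores.reduce((sum, s) => sum + Math.pow(s - mean, 2), 0) / scores.length;
}

// Returns true when there are enough results and their positive scores
// are tightly clustered, i.e. the initial ranking is ambiguous.
function shouldRerank(scores: number[], minResults = 3, threshold = 0.01): boolean {
  if (scores.length < minResults) return false;
  const positive = scores.filter((s) => s > 0);
  if (positive.length === 0) return false;
  return scoreVariance(positive) < threshold;
}
```

A suitable threshold depends on the scale of your scores, so treat 0.01 as a starting point rather than a recommendation.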

Best practices

Rerank the right amount: Reranking is slower than initial retrieval. Retrieve more candidates (e.g., 50-100) with fast vector search, then rerank the top 10-20.
Preserve metadata: Ensure your reranker returns chunks with all their original metadata intact. Only change the order, not the content.
Watch latency: Reranking adds latency to search. Test with production query volumes and consider caching or async reranking for non-critical queries.
Measure impact: A/B test your reranker to ensure it improves relevance. Track metrics like click-through rate or user satisfaction.
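One way to apply the first practice is to rerank only the head of the candidate list and append the remaining candidates in their original order. The sketch below assumes the candidates arrive sorted by retrieval score; `rerankHead` and its default limit of 20 are hypothetical.

```typescript
// Generic signature for a rerank call: query plus chunks in, chunks out.
type Rerank<T> = (query: string, chunks: T[]) => Promise<T[]>;

// Sends only the first `limit` candidates to the (slow) reranker and
// keeps the rest in their initial retrieval order after the reranked head.
async function rerankHead<T>(
  query: string,
  candidates: T[],
  rerank: Rerank<T>,
  limit = 20
): Promise<T[]> {
  const head = candidates.slice(0, limit);
  const tail = candidates.slice(limit);
  const rerankedHead = await rerank(query, head);
  return [...rerankedHead, ...tail];
}
```

Because the chunks are passed through as-is, their metadata is preserved and only the order changes.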

Performance considerations

Some APIs support batch reranking for better throughput:
execute: async ({ query, chunks }) => {
  // Send all documents at once
  // (cohereClient is assumed to be a Cohere SDK client created elsewhere)
  const response = await cohereClient.rerank({
    model: "rerank-english-v3.0",
    query,
    documents: chunks.map(c => c.chunk_content),
    top_n: 20 // Limit results
  });

  return response.results.map(r => chunks[r.index]);
}
Cache reranking results for repeated queries:
const cache = new Map();

execute: async ({ query, chunks }) => {
  const cacheKey = `${query}:${chunks.map(c => c.chunk_id).join(",")}`;

  if (cache.has(cacheKey)) {
    return cache.get(cacheKey);
  }

  // performReranking is a placeholder for your actual reranking call
  const reranked = await performReranking(query, chunks);
  cache.set(cacheKey, reranked);

  return reranked;
}
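A plain `Map` used this way grows without bound in a long-running process. If that is a concern, a small least-recently-used cache is one option. The class below is a sketch that relies on `Map` iterating keys in insertion order; `LruCache` is a hypothetical name, not part of the Exulu API.

```typescript
// Minimal LRU cache: evicts the least recently used entry once
// `maxEntries` is exceeded.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

It can replace the `Map` in the snippet above by swapping the `cache.has` / `cache.get` pair for a single `cache.get(cacheKey)` that is checked against `undefined`.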
For custom scoring, process chunks in parallel:
execute: async ({ query, chunks }) => {
  const scored = await Promise.all(
    chunks.map(async (chunk) => ({
      chunk,
      score: await computeScore(query, chunk)
    }))
  );

  scored.sort((a, b) => b.score - a.score);
  return scored.map(s => s.chunk);
}
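`Promise.all` over every chunk starts all scoring calls at once, which can run into rate limits when each call hits a remote model. The sketch below caps the number of calls in flight while preserving result order; `mapWithConcurrency` is a hypothetical helper, and the right limit depends on the service being called.

```typescript
// Runs `fn` over `items` with at most `limit` calls in flight at a time.
// Results are returned in the same order as the input items.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i] as T, i);
    }
  }

  const workerCount = Math.min(Math.max(1, limit), items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

In the snippet above, `Promise.all(chunks.map(...))` could then be replaced with `mapWithConcurrency(chunks, 5, async (chunk) => ({ chunk, score: await computeScore(query, chunk) }))`.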

Next steps