Overview

The Materials API supports semantic search based on Retrieval-Augmented Generation (RAG). This allows you to:
  • Search materials by meaning, not just keywords
  • Find relevant content across all your materials
  • Get contextual results with relevance scores
  • Search within specific materials or folders

How It Works

  1. Automatic Indexing - When materials are uploaded, they’re automatically processed:
    • Text is extracted (OCR for PDFs, transcription for audio/video)
    • Content is split into chunks
    • Vector embeddings are generated
    • Embeddings are stored in a vector database
  2. Semantic Search - When you search:
    • Your query is converted to a vector embedding
    • Similar content is found using vector similarity
    • Results are ranked by relevance
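
The search step above can be sketched with a toy example. This is only an illustration, not the service's implementation: the similarity metric is assumed to be cosine similarity (the source does not say which metric is used), the 3-dimensional vectors are made up, and `cosineSimilarity` and `rankChunks` are hypothetical helper names.

```javascript
// Toy illustration of vector-similarity ranking.
// Real embeddings come from a model and have many more dimensions.

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query, sort by descending score, keep the top K.
function rankChunks(queryVector, chunks, topK) {
  return chunks
    .map(chunk => ({
      text: chunk.text,
      score: cosineSimilarity(queryVector, chunk.vector)
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Made-up chunk embeddings, standing in for the vector database.
const chunks = [
  { text: 'Photosynthesis converts light into chemical energy.', vector: [0.9, 0.1, 0.0] },
  { text: 'The French Revolution began in 1789.', vector: [0.0, 0.2, 0.9] },
  { text: 'Chloroplasts contain chlorophyll.', vector: [0.7, 0.3, 0.1] }
];

// Made-up embedding of the search query.
const queryVector = [1.0, 0.0, 0.0];

const ranked = rankChunks(queryVector, chunks, 2);
ranked.forEach(r => console.log(r.score.toFixed(2), r.text));
```

The two biology chunks point in nearly the same direction as the query vector, so they rank first; the unrelated chunk is dropped by the top-K cut.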

Search Materials

Perform a semantic search across all your materials.

const searchResults = await client.v1.materials.search({
  query: 'What is photosynthesis and how does it work?',
  topK: 5 // Number of results to return
});

searchResults.results.forEach(result => {
  console.log(`Score: ${result.score}`);
  console.log(`Material: ${result.material.name}`);
  console.log(`Chunk: ${result.text.substring(0, 200)}...`);
  console.log('---');
});

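Search Response Format

A search returns a JSON object like the following (only the first two of the five results are shown):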

{
  "query": "What is photosynthesis and how does it work?",
  "totalResults": 5,
  "results": [
    {
      "score": 0.92,
      "text": "Photosynthesis is the process by which plants convert light energy into chemical energy...",
      "chunkIndex": 3,
      "material": {
        "id": "mat_123abc",
        "name": "Biology Chapter 3 - Plant Processes",
        "contentType": "pdf"
      }
    },
    {
      "score": 0.87,
      "text": "The light-dependent reactions of photosynthesis occur in the thylakoid membranes...",
      "chunkIndex": 5,
      "material": {
        "id": "mat_456def",
        "name": "Plant Biology Textbook",
        "contentType": "pdf"
      }
    }
  ]
}
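
Because each result is a chunk, several results can belong to the same material. A minimal sketch of collapsing chunks into one entry per material, assuming the response shape shown above: `bestPerMaterial` is a hypothetical helper, not part of the SDK, and the third result in the sample data is made up to show two chunks from the same material.

```javascript
// Groups result chunks by material and records each material's best score.
// Assumes the response shape documented above.
function bestPerMaterial(response) {
  const byId = new Map();
  for (const result of response.results) {
    const { id, name } = result.material;
    const entry = byId.get(id) ?? { id, name, bestScore: -Infinity, chunkIndexes: [] };
    entry.bestScore = Math.max(entry.bestScore, result.score);
    entry.chunkIndexes.push(result.chunkIndex);
    byId.set(id, entry);
  }
  // Most relevant material first.
  return [...byId.values()].sort((a, b) => b.bestScore - a.bestScore);
}

// Sample data in the documented shape (third entry is invented for illustration).
const sampleResponse = {
  query: 'What is photosynthesis and how does it work?',
  totalResults: 3,
  results: [
    { score: 0.92, text: 'Photosynthesis is the process...', chunkIndex: 3,
      material: { id: 'mat_123abc', name: 'Biology Chapter 3 - Plant Processes', contentType: 'pdf' } },
    { score: 0.87, text: 'The light-dependent reactions...', chunkIndex: 5,
      material: { id: 'mat_456def', name: 'Plant Biology Textbook', contentType: 'pdf' } },
    { score: 0.81, text: 'Chlorophyll absorbs light...', chunkIndex: 4,
      material: { id: 'mat_123abc', name: 'Biology Chapter 3 - Plant Processes', contentType: 'pdf' } }
  ]
};

const materials = bestPerMaterial(sampleResponse);
materials.forEach(m =>
  console.log(`${m.name}: best score ${m.bestScore}, chunks ${m.chunkIndexes.join(', ')}`)
);
```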

Advanced Search Options

Search with More Results

// Get more results for comprehensive coverage
const results = await client.v1.materials.search({
  query: 'cellular respiration and ATP production',
  topK: 20 // Get top 20 results
});

console.log(`Found ${results.totalResults} relevant chunks`);
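
A larger topK tends to pull in lower-scoring chunks. One way to handle that is a client-side score cutoff, sketched below. It assumes that higher scores mean higher relevance, as the sample response suggests; `filterByScore` is a hypothetical helper, and the 0.5 threshold is an arbitrary value chosen for illustration, not an API parameter or a recommended setting.

```javascript
// Keep only results whose score meets a minimum.
// `minScore` is an application-chosen cutoff, not an API parameter.
function filterByScore(results, minScore) {
  return results.filter(result => result.score >= minScore);
}

// Made-up results in the documented shape (fields not needed here are omitted).
const sampleResults = [
  { score: 0.92, text: 'Cellular respiration produces ATP.' },
  { score: 0.61, text: 'ATP synthase sits in the inner mitochondrial membrane.' },
  { score: 0.34, text: 'Mitochondria have their own DNA.' }
];

const strongMatches = filterByScore(sampleResults, 0.5);
console.log(`Kept ${strongMatches.length} of ${sampleResults.length} results`);
```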

Filter by Material Type

The search endpoint does not currently support filtering by material type, but you can filter the results client-side after retrieval:

const searchResults = await client.v1.materials.search({
  query: 'DNA replication process',
  topK: 10
});

// Filter for only PDF results
const pdfResults = searchResults.results.filter(
  result => result.material.contentType === 'pdf'
);

console.log(`Found ${pdfResults.length} results in PDFs`);
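
Filtering after retrieval can leave fewer results than you wanted. A possible workaround is to request a larger topK than you need and trim after filtering, as sketched below. `takeByContentType` is a hypothetical helper, the sample data is made up, and 'video' as a contentType value is an assumption (only 'pdf' appears in the sample response).

```javascript
// Filter ranked results by content type, then keep at most `limit` of them.
// filter() preserves the input order, so the ranking carries over.
function takeByContentType(results, contentType, limit) {
  return results
    .filter(result => result.material.contentType === contentType)
    .slice(0, limit);
}

// Made-up results, already sorted by score as the search endpoint returns them.
const overFetched = [
  { score: 0.92, material: { id: 'mat_1', contentType: 'pdf' } },
  { score: 0.88, material: { id: 'mat_2', contentType: 'video' } }, // assumed type value
  { score: 0.80, material: { id: 'mat_3', contentType: 'pdf' } },
  { score: 0.71, material: { id: 'mat_4', contentType: 'pdf' } }
];

const topPdfs = takeByContentType(overFetched, 'pdf', 2);
console.log(topPdfs.map(r => r.material.id));
```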