When building AI agents that rely on vector stores to fetch information from your own data, you might notice that the answers aren’t always accurate. The challenge often lies in how vector search handles queries. While it excels at understanding the meaning behind natural language queries, it can struggle with specific names, acronyms, codes, or exact phrases present in the knowledge base.
I created an automation that addresses this issue by implementing hybrid search, combining both vector (semantic) search and keyword (full-text) search to improve accuracy and relevance. In this article, I’ll walk you through the concepts behind hybrid search and show you how I built working examples using Supabase and Pinecone, integrated into n8n workflows.
Understanding Vector Search
Vector search is powerful because it captures the semantic intent behind a user’s query. Instead of matching exact words, it converts the query into a dense vector—a numerical representation that embodies the meaning of the query. This vector is then compared with vectors representing chunks of content in the knowledge base to find the most semantically relevant results.
For example, imagine an AI agent deployed on an e-commerce store. A customer asks, “Can I see the various blue cotton t-shirts?” The vector search interprets this request by looking for products semantically related to “blue,” “cotton,” and “t-shirts.” It might return blue cotton t-shirts but also other cotton t-shirts or blue shirts that share similar attributes.
Behind the scenes, the query is transformed into a dense vector. The knowledge base contains dense vector embeddings for all products. Comparing the query vector with product vectors retrieves items with similar meaning.
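That comparison is typically a similarity measure such as cosine similarity between the query vector and each stored vector. A minimal JavaScript sketch, using toy 3-dimensional vectors in place of real 1536-dimensional embeddings:

```javascript
// Cosine similarity between two equal-length vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 (opposite) to 1 (identical direction).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy example: the query vector sits closer to product A than product B.
const query = [0.9, 0.1, 0.2];
const productA = [0.8, 0.2, 0.3]; // e.g. "blue cotton t-shirt"
const productB = [0.1, 0.9, 0.5]; // e.g. an unrelated item

console.log(cosineSimilarity(query, productA) > cosineSimilarity(query, productB)); // true
```

The vector store does exactly this (or an equivalent distance metric) at scale, returning the chunks whose embeddings score highest against the query embedding.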
When the customer asks for “lightweight cotton t-shirts,” the vector search could return products described with terms like “comfy summer tops,” “breathable shirts for hot weather,” or “soft tees for everyday wear.” This broad matching is the strength of semantic search—it understands meaning beyond exact words.
Limitations of Vector Search
Despite its strengths, vector search has weaknesses with precise queries. If a customer asks for “blue t-shirts that are medium sized,” the system might return blue t-shirts of various sizes, not necessarily medium. Or if the customer requests a blue t-shirt with a specific product code, the search might show blue t-shirts generally, but not the exact item with that code.
This happens because semantic search focuses on meaning rather than exact matches. It can be too broad when the user’s query contains specific terms that need exact matching.
Keyword Search: The Other Side of the Coin
Keyword search, or full-text search, contrasts with vector search by focusing on exact or partial word matches. It excels at precision but lacks semantic understanding.
Returning to the e-commerce example, if the AI agent uses keyword search and the customer asks for “blue cotton t-shirts,” the system looks for the exact words “blue,” “cotton,” and “t-shirt” in product titles or descriptions. This approach ensures precise matches: searching for a specific product name such as “Apex 25 Black Cotton T-Shirt” returns exactly that item.
Behind the scenes, keyword search often converts the query into a sparse vector representation, which captures the presence or absence of terms rather than their meaning. This sparse vector is stored alongside dense vectors in the vector store, enabling quick retrieval of exact matches.
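As a simplified illustration, a sparse vector can be built from token counts against a vocabulary. Real sparse encoders (BM25-style scoring or learned models) assign weighted values rather than raw counts, but the shape — a list of indices and a list of values — is the same:

```javascript
// Simplified sparse encoding: map each token to a vocabulary index and
// count its occurrences. Tokens outside the vocabulary are dropped.
function sparseEncode(text, vocabulary) {
  const counts = new Map();
  for (const token of text.toLowerCase().match(/[a-z0-9-]+/g) ?? []) {
    const idx = vocabulary.indexOf(token);
    if (idx !== -1) counts.set(idx, (counts.get(idx) ?? 0) + 1);
  }
  const indices = [...counts.keys()].sort((a, b) => a - b);
  return { indices, values: indices.map((i) => counts.get(i)) };
}

const vocab = ["blue", "cotton", "t-shirt", "summer", "breathable"];
console.log(sparseEncode("Blue cotton t-shirt, blue edition", vocab));
// { indices: [0, 1, 2], values: [2, 1, 1] }
```

Because only the non-zero positions are stored, a sparse vector stays compact even over a vocabulary of tens of thousands of terms — which is why vector stores can keep it alongside the dense embedding at little extra cost.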
However, keyword search has drawbacks. It doesn’t understand synonyms or related concepts. For example, “t-shirts” and “tees” are not treated as equivalent unless they are explicitly linked as synonyms. This rigidity limits flexibility in search results.
Hybrid Search: Combining the Best of Both Worlds
Hybrid search merges the precision of keyword search with the semantic understanding of vector search. It creates two sets of results—one from the dense vector search and one from the sparse keyword search—and then fuses them into a single ranked list.
In practice, when a customer searches for “blue cotton t-shirt,” the hybrid search system positions the exact match (found via keyword search) at the top of the results, while still including semantically similar products (found via vector search) below. This approach ensures both accuracy and breadth.
Behind the scenes, the query is converted into two embeddings: a dense vector for semantic search and a sparse vector for keyword search. Two separate result sets are retrieved and then combined using a ranking method that weighs the scores from both searches. This fusion produces a more reliable and relevant result list.
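One common fusion method — and the one used in the Supabase implementation later in this article — is reciprocal rank fusion (RRF). A minimal sketch, using the conventional smoothing constant k = 60:

```javascript
// Reciprocal rank fusion: each result contributes 1 / (k + rank) for every
// list it appears in; k (commonly 60) dampens the dominance of top ranks.
function reciprocalRankFusion(resultLists, k = 60) {
  const scores = new Map();
  for (const list of resultLists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Dense (semantic) and sparse (keyword) result lists, best match first.
const semantic = ["shirt-7", "shirt-2", "shirt-9"];
const keyword = ["shirt-2", "shirt-4"];
console.log(reciprocalRankFusion([semantic, keyword]));
// "shirt-2" rises to the top because it ranks highly in both lists.
```

Note that RRF only uses the *rank* of each result, not the raw scores, which makes it robust to the fact that cosine similarities and full-text relevance scores live on entirely different scales.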
Challenges with Hybrid Search in n8n
In typical AI agent setups within n8n, vector stores only support semantic search. There’s no built-in option for full-text keyword search or hybrid search. To implement hybrid search, some custom work is necessary, especially to integrate both search methods and combine their results.
If you’ve experienced inaccurate retrievals in your RAG (Retrieval-Augmented Generation) agents, hybrid search is worth testing. It addresses the common problem of semantic search returning results that are too broad or irrelevant for specific queries.
Implementing Hybrid Search with Supabase
Supabase offers hybrid search capabilities by combining dense vector embeddings and full-text search using PostgreSQL’s tsvector feature. I built an n8n workflow that leverages this functionality, with most of the heavy lifting done on the Supabase side.
Setting up the Supabase Database
My documents table in Supabase contains two essential columns:
- embedding: stores dense vector embeddings representing the semantic content of each document chunk.
- tsvector: stores tokenized full-text search vectors used for keyword search.
When data is ingested into this table, the tsvector column is automatically populated with keywords and their positions within each chunk, enabling efficient exact matching.
Creating Indexes and Hybrid Search Function
I created indexes on both the embedding column and the tsvector column to speed up search queries. Then, I added a custom database function that performs both vector and full-text searches, combining their results using a reciprocal rank fusion method.
This fusion ranks results by balancing their semantic similarity and keyword match scores, ensuring the best results surface at the top.
Connecting Supabase to n8n via Edge Functions
To trigger hybrid search from n8n, I created an Edge function in Supabase. This function takes the user’s query, generates the dense embedding using OpenAI’s API, and runs the combined search query against the documents table.
The Edge function requires an OpenAI API key, which I securely added as a secret in Supabase. This setup offloads the embedding generation to Supabase, simplifying the n8n workflow.
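The core of the Edge function looks roughly like the JavaScript sketch below. The `hybrid_search` RPC name and its parameter names (`query_text`, `query_embedding`, `match_count`) are illustrative assumptions rather than an exact signature:

```javascript
// Generate a dense embedding for the user's query via OpenAI's
// embeddings endpoint (same model used at ingestion time).
async function embedQuery(query, apiKey) {
  const res = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "text-embedding-3-small", input: query }),
  });
  const json = await res.json();
  return json.data[0].embedding;
}

// Build the payload for the hybrid_search Postgres function: the raw text
// drives the tsvector match, the embedding drives the vector match.
function buildRpcPayload(query, embedding, matchCount = 20) {
  return {
    query_text: query,
    query_embedding: embedding,
    match_count: matchCount,
  };
}
```

The Edge function then passes this payload to the database function (e.g. via the Supabase client’s `rpc` call) and returns the fused result set to the caller.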
Building the n8n Workflow
In n8n, I started with a chat trigger node to capture user queries. Then, I added an HTTP request node configured to call the Supabase Edge function, passing the user query as input.
This setup sends the query to Supabase, which returns the top matching chunks based on the hybrid search. Initially, no results were returned because the documents table was empty. So, I created a manual trigger workflow to ingest data into Supabase.
Ingesting Documents into Supabase
I used the “Add documents to vector store” action in n8n, connecting it to Supabase with the appropriate credentials. It was important that the embedding model used for ingestion matched the one used for querying (OpenAI’s text-embedding-3-small with 1536 dimensions).
One key detail was the need to add a metadata column manually in the Supabase table to avoid errors during ingestion. The metadata column stores JSON data about each chunk.
After setup, I loaded a large document—the 180-page Formula One technical regulations PDF—into Supabase. This resulted in over 680 chunks being stored with both dense embeddings and full-text search vectors.
Querying the Data
With data in place, I tested queries like “engine intake air.” The system returned 10 results by default, which I increased to 20 to get more comprehensive results. The returned data included the chunk content, full-text search data, and embeddings.
To improve the response, I later updated the hybrid search function to also return ranking scores from both semantic and keyword searches. This helped me understand which results were favored by each search type before fusion.
Integrating Hybrid Search into the AI Agent
Because the Supabase node in n8n doesn’t support the custom parameters required by the hybrid search function, I replaced it with a simple HTTP request node to call the Edge function directly.
I set the AI agent’s system message to generate answers only based on the knowledge base results. Testing queries such as “What are the rules for the engine air intake?” produced detailed, accurate answers.
Examples Demonstrating Hybrid Search Strengths
For technical terms like “metal matrix composites (MMCs),” full-text search shines because it looks for exact matches. Semantic search might surface related terms like “metal” or “steel” while missing the specific acronym.
In tests, the top chunk ranked first in both semantic and full-text searches, showing hybrid search’s ability to balance both approaches effectively.
Another example involved searching for “ISO 16220.” The best match appeared at the top of the full-text search results but was ranked very low by semantic search. Hybrid search ensured this exact match surfaced prominently.
Conversely, broad questions like “What is the impact of wind on an F1 car?” returned results dominated by semantic search, since the exact terms weren’t present. The system retrieved information about aerodynamics and airflow, demonstrating semantic search’s ability to handle vague queries.
Improving Result Ordering with Re-ranking
While reciprocal rank fusion merges results well, it isn’t perfect. To further refine result ordering, I integrated a re-ranking model, Cohere’s Rerank 3.5. This model takes the hybrid search results and orders them more accurately before they are fed to the AI agent for answer generation.
This multi-step approach greatly improves answer quality, combining the strengths of hybrid search with advanced re-ranking.
Implementing Hybrid Search with Pinecone
Next, I built a similar hybrid search setup using Pinecone. Pinecone supports hybrid search within a single index, combining dense and sparse vectors. This differs from Supabase, which uses separate columns and a custom fusion function.
Creating the Pinecone Hybrid Index
I created a Pinecone index with these settings:
- Vector type: Dense
- Dimensions: 1024 (matching the multilingual E5 large embedding model)
- Metric: Dot product (required for hybrid search)
This setup enables Pinecone to store dense embeddings and sparse vectors for keyword search in the same index.
Ingesting Data into Pinecone
My ingestion workflow starts with a manual trigger, followed by setting the index host and namespace. I download the same Formula One technical regulations PDF and extract its text.
Since n8n lacks standalone text splitter nodes, I created a custom JavaScript chunking script with ChatGPT’s help. This script splits the document text into chunks with specified sizes and overlaps.
Chunking produced over 1,150 chunks for the 180-page document. I then batch process these chunks in groups of 96, which is the max batch size allowed by the embedding model I used.
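The chunking and batching logic works along these lines (the sizes shown are illustrative defaults, not my exact values):

```javascript
// Split text into overlapping chunks by character count. Each step
// advances by (chunkSize - overlap) so consecutive chunks share context.
function chunkText(text, chunkSize = 1000, overlap = 200) {
  const chunks = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Group chunks into batches of at most 96 items — the maximum batch
// size accepted by the embedding model.
function batchItems(items, size = 96) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

A character-based splitter like this is deliberately simple; splitting on sentence or paragraph boundaries generally gives cleaner chunks, but the overlap already prevents most mid-thought cutoffs from losing context.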
For each batch, I generate dense embeddings using Pinecone’s embedding API with the multilingual E5 large model and sparse embeddings using Pinecone’s sparse English V0 model.
The sparse embeddings include indices and values, reflecting keyword presence for full-text search.
Next, I built a vector array combining unique IDs, dense embeddings, sparse embeddings, and chunk metadata in the format required by Pinecone’s upsert endpoint. This array is then sent to Pinecone for storage.
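A sketch of that payload construction; the field names (`values`, `sparseValues`, `metadata`) follow Pinecone’s REST upsert format, while the helper itself and the chunk-based IDs are illustrative:

```javascript
// Build the upsert payload in Pinecone's expected shape: each vector
// carries an id, dense values, sparseValues (indices + values), and
// metadata holding the original chunk text.
function buildUpsertPayload(chunks, denseEmbeddings, sparseEmbeddings, namespace) {
  return {
    namespace,
    vectors: chunks.map((chunk, i) => ({
      id: `chunk-${i}`,
      values: denseEmbeddings[i],
      sparseValues: {
        indices: sparseEmbeddings[i].indices,
        values: sparseEmbeddings[i].values,
      },
      metadata: { text: chunk },
    })),
  };
}
```

Storing the chunk text in metadata means a query response already contains everything the agent needs, with no second lookup against the source document.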
Querying Pinecone Hybrid Search
For inference, I can’t use Pinecone’s standard vector store node in n8n because it doesn’t support hybrid search. Instead, I call a dedicated n8n workflow as a tool within my AI agent.
This workflow accepts a query input, generates dense and sparse embeddings for it (setting the embedding input type to “query” rather than “passage” is crucial here), and queries Pinecone’s hybrid search endpoint.
The query returns a hybrid result set with combined scores from both search types. I retrieve the chunk text and scores, then feed them to the AI agent for answer generation.
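The query body combines both vectors in a single request. The field names below follow Pinecone’s REST query endpoint; the helper and its defaults are illustrative:

```javascript
// Query body for Pinecone's /query endpoint: a dense `vector` plus a
// `sparseVector` in one request yields hybrid (dense + sparse) scoring
// under the dot-product metric.
function buildHybridQuery(denseVector, sparseVector, topK = 10, namespace = "") {
  return {
    namespace,
    topK,
    vector: denseVector,
    sparseVector: {
      indices: sparseVector.indices,
      values: sparseVector.values,
    },
    includeMetadata: true, // return the chunk text stored as metadata
  };
}
```

Because both vectors travel in one request, Pinecone scores each stored item against the dense and sparse components together — there is no separate fusion step to implement on the n8n side.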
Benefits of Pinecone Hybrid Search
When searching for very specific terms like “plank assembly,” hybrid search returns many exact matches, ensuring precise results. This is a clear advantage over pure semantic search, which might miss exact codes or names.
The integration of dense and sparse vectors into one index simplifies management and improves performance.
Summary of Key Takeaways
- Vector search excels at understanding meaning and handling broad or vague queries.
- Keyword search provides exact matching for specific terms, codes, or acronyms but lacks semantic flexibility.
- Hybrid search combines both approaches, merging results and ranking them to maximize accuracy and relevance.
- Supabase uses a separate dense embedding column and a full-text search column with a custom fusion function.
- Pinecone supports hybrid search within a single index using dot product metric and combined dense and sparse vectors.
- Integrating hybrid search into n8n requires custom workflows and HTTP requests, as built-in vector store nodes don’t support hybrid search directly.
- Advanced re-ranking models can further improve result ordering before feeding data to AI agents.
This hybrid RAG approach significantly improves the reliability of AI agents by ensuring they return accurate answers for both precise and broad queries. It’s especially useful when your data contains many specific terms, codes, or technical language that pure semantic search might misinterpret.