The Way This Agentic RAG Blogging System Thinks Is SO IMPRESSIVE (n8n)

In this post, I explore the Agentic RAG blogging system I developed using n8n. This approach overcomes the limitations of traditional RAG, enabling AI agents to gather and synthesize information from multiple data sources and produce more accurate, engaging content.

Demo of Agentic RAG

In this section, I’ll walk you through a live demo of the Agentic RAG system. Imagine a local news website in Columbus utilizing this powerful tool. The setup is straightforward, yet incredibly effective. I’ll first show you the overall architecture before diving into the specifics of how the agent retrieves and processes information.

Overview of Agentic RAG system

Understanding the RAG Agent

The RAG agent acts as a bridge between various data sources. It retrieves information from both our curated datasets and publicly available sources. For instance, we have a Pinecone vector store at our disposal, which contains relevant data about Columbus’s capital projects. Additionally, we utilize a no-code database to store information that the agents can query.

Pinecone vector store overview

Initiating the Article Generation

To kick things off, I input a title for an article: “Update on Progress of Capital Projects in Columbus.” I also set it as an opinion piece and provide some tone of voice directions. After selecting the article length, I change the status to “ready for outline” and initiate the process.

Inputting article title and settings

Retrieving Information

Once the process starts, the first action is to query the dataset related to capital projects. The agent begins to hit the vector database multiple times to fetch relevant information from our knowledge base. After gathering internal data, it then moves on to public searches for additional insights.

Retrieving information from dataset

Deep Research and Validation

During the retrieval process, the agent also accesses deep research tools like Perplexity and Jina AI. This allows it to validate the information it gathers in real-time. If the retrieved data isn’t satisfactory, the agent can adjust its queries and search again until it finds what it needs.

Using deep research tools for validation

Creating the Article Outline

After gathering all the necessary information, the agent constructs a comprehensive outline, complete with statistics and citations. This outline is then saved in the no-code database, ready for the next steps in the article generation process.

Outline generated and saved

Understanding Traditional RAG

Before we delve deeper into the Agentic RAG system, it’s essential to understand traditional RAG. The conventional approach typically involves a two-step process: data ingestion and query execution.

Data Ingestion

In the data ingestion phase, various documents, such as web pages or PDFs, need to be chunked. Each chunk is then sent to an embedding model, which converts the text into dense numeric representations. These vectors are stored in a vector database like Pinecone.

Data ingestion process overview
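In the actual system this step lives in an n8n workflow, but the core of it fits in a few lines of Python. Here's a minimal sketch of chunking and embedding; the chunk size, overlap, and embedding model are my own illustrative choices, not values from the original setup.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive fixed-size chunking with overlap (sizes are illustrative)."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


def embed_chunks(chunks: list[str]) -> list[list[float]]:
    """Turn each chunk into a dense vector using an OpenAI embedding model."""
    response = client.embeddings.create(
        model="text-embedding-3-small",  # assumed model; the video doesn't name one
        input=chunks,
    )
    return [item.embedding for item in response.data]
```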

Query Execution

The second step involves querying the vector database. When a question is posed, it's converted into a vector using the same embedding model. The goal is to find stored vectors similar to the query and return the most relevant chunks for the LLM to generate an answer from.

Query execution process overview
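At query time the flow is the mirror image: embed the question with the same model, then ask the vector store for its nearest neighbours. A minimal sketch against Pinecone's Python client; the index name, `top_k`, and the `text` metadata field are assumptions.

```python
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("columbus-knowledge-base")  # hypothetical index name


def retrieve(question: str, top_k: int = 5) -> list[str]:
    """Embed the question with the ingestion model, then run a similarity search."""
    query_vector = client.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding
    results = index.query(vector=query_vector, top_k=top_k, include_metadata=True)
    # Hand the stored chunk text back for the LLM to ground its answer on
    return [match.metadata["text"] for match in results.matches]
```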

Limitations of Traditional RAG

Traditional RAG is limited to vector stores and can struggle with complex queries. The agentic version overcomes these limitations by integrating multiple data sources and employing reasoning models to refine its approach.

Limitations of traditional RAG

Web Scraping with Spider Cloud

For this project, I utilized Spider Cloud for web scraping. This platform excels at crawling websites quickly and affordably, making it an excellent choice for gathering large amounts of data from multiple sources.

Setting Up the Scraping Workflow

The scraping process is scheduled to run regularly, ensuring that the information remains up-to-date. I crawled two primary data sources: the Columbus city government website and Experience Columbus, the local tourism board’s site.

Spider Cloud scraping setup
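Inside n8n this is just a scheduled HTTP Request node, but the call itself is easy to sketch. Treat the endpoint and parameters below as my assumptions about Spider Cloud's crawl API rather than gospel, and check their docs before copying this.

```python
import requests

SPIDER_API_KEY = "YOUR_SPIDER_CLOUD_KEY"


def crawl_site(url: str, limit: int = 100) -> list[dict]:
    """Ask Spider Cloud to crawl a site and return its pages (assumed request/response shape)."""
    response = requests.post(
        "https://api.spider.cloud/crawl",  # assumed endpoint
        headers={"Authorization": f"Bearer {SPIDER_API_KEY}"},
        json={"url": url, "limit": limit, "return_format": "markdown"},  # assumed parameters
        timeout=120,
    )
    response.raise_for_status()
    return response.json()


# The two local sources from the video (URLs shown for illustration)
pages = crawl_site("https://www.columbus.gov") + crawl_site("https://www.experiencecolumbus.com")
```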

Handling Data Input

After scraping, the data is saved into a no-code database. This system is efficient for managing large volumes of API calls and operations, making it easier to track which pages have already been scraped.

Data input into No Code DB

Checking for Updates

Each time the scraping process runs, it generates a hash to check if any changes have occurred on the pages. If updates are found, the new data is saved to the database, ensuring that the information remains current.

Checking for updates in scraped data
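The change check itself is nothing exotic: hash the page content and compare it to the hash stored with the record from the last run. A minimal sketch, assuming the previous hash lives alongside each page row in the database.

```python
import hashlib


def content_hash(page_text: str) -> str:
    """Stable fingerprint of a page's content."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def has_changed(page_text: str, stored_hash: str | None) -> bool:
    """True if the page is new or its content differs from the previous crawl."""
    return stored_hash is None or content_hash(page_text) != stored_hash
```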

Document Upserting to Pinecone

Once the data is in the no-code database, the next step is to create embeddings for the new content. This process involves checking if the vectors already exist in Pinecone to avoid duplicates.

Embedding Creation

Using the OpenAI embeddings model, I create embeddings based on the text chunks. This data is then upserted into the Pinecone vector store, allowing for efficient querying later on.

Creating embeddings for new data
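Here's how that duplicate check and upsert might look, reusing `content_hash`, `embed_chunks`, and the Pinecone `index` from the earlier sketches. Using the chunk hash as the vector ID is my own convention for illustration; the metadata fields are assumptions too.

```python
def upsert_chunks(source_url: str, chunks: list[str]) -> None:
    """Embed new chunks and upsert them into Pinecone, skipping IDs that already exist."""
    ids = [content_hash(chunk) for chunk in chunks]      # deterministic IDs (assumed scheme)
    existing = index.fetch(ids=ids).vectors.keys()       # IDs already present in the index
    new = [(i, c) for i, c in zip(ids, chunks) if i not in existing]
    if not new:
        return
    vectors = embed_chunks([c for _, c in new])
    index.upsert(vectors=[
        {"id": i, "values": v, "metadata": {"text": c, "source": source_url}}
        for (i, c), v in zip(new, vectors)
    ])
```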

Handling PDF and Word Documents

Another common use case is indexing PDF or Word documents. For instance, I’ve indexed the Columbus government policies, allowing for more structured queries later on.

Indexing PDF documents
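For documents, the only extra step is pulling the text out before chunking. A small sketch using pypdf, which is my choice for illustration; the video doesn't say how the documents are parsed, and the filename is hypothetical.

```python
from pypdf import PdfReader


def pdf_to_chunks(path: str) -> list[str]:
    """Extract text from every page of a PDF and reuse the chunking helper from earlier."""
    reader = PdfReader(path)
    full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    return chunk_text(full_text)


# policy_chunks = pdf_to_chunks("columbus_policies.pdf")  # hypothetical filename
```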

Structured Query Management

Structured queries play a crucial role in retrieving specific data efficiently. I imported a dataset of capital projects into the no-code database, which includes budget amounts and project details.

Dynamic Query Generation

With the structured data in place, I can create dynamic queries using the AI agent. It generates filters based on the data schema provided, allowing for precise and relevant results.

Dynamic query generation using AI
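The trick is to hand the model the table schema and ask for a structured filter instead of prose. Here's a rough sketch of that pattern; the schema, field names, model, and prompt are all hypothetical stand-ins for whatever the workflow actually passes in.

```python
import json

from openai import OpenAI

client = OpenAI()

# Hypothetical schema describing the capital-projects table
SCHEMA = {
    "project_name": "string",
    "neighborhood": "string",
    "budget_usd": "number",
    "status": "one of: planned, in_progress, completed",
}


def generate_filter(question: str) -> dict:
    """Ask the LLM to turn a natural-language question into a JSON filter over the schema."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": f"Given this schema {json.dumps(SCHEMA)}, return a JSON object of "
                        "field/value filters that answers the user's question. "
                        "Use only fields from the schema."},
            {"role": "user", "content": question},
        ],
    )
    return json.loads(response.choices[0].message.content)


# generate_filter("Which in-progress projects have budgets over $5 million?")
# -> e.g. {"status": "in_progress", "budget_usd": {"gt": 5000000}}
```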

Utilizing External Data Sources

The agent can also access external data sources, enriching the context of the information retrieved. Tools like Perplexity and Jina AI allow for deeper research and validation, making the overall system more robust.

Accessing external data sources for enrichment
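As one concrete example, Jina AI's Reader endpoint returns a clean, LLM-ready rendering of a public page, while Perplexity fills a similar role for search-backed answers. A minimal sketch of the Reader call; the prefix URL matches Jina's documented pattern, but treat the details (and the example page) as assumptions to verify.

```python
import requests


def read_page(url: str) -> str:
    """Fetch an LLM-friendly markdown rendering of a web page via Jina AI's Reader."""
    response = requests.get(f"https://r.jina.ai/{url}", timeout=60)
    response.raise_for_status()
    return response.text


# context = read_page("https://www.columbus.gov")  # page chosen purely for illustration
```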

Deep Research Capabilities

Deep research is a key feature of the Agentic RAG system. It enhances the information retrieval process, allowing the agent to access a variety of sources beyond just internal datasets. This capability is essential for gathering comprehensive data, especially when dealing with complex topics.

Using deep research tools for validation

During the research phase, I integrated tools like Perplexity and Jina AI. These tools enable the agent to validate the information it collects in real-time. If the agent finds that the quality of the retrieved data isn’t satisfactory, it can modify its queries and search again. This iterative process ensures that the final output is accurate and well-informed.
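Conceptually, the loop the agent runs looks something like the sketch below: retrieve, ask a model whether the evidence is sufficient, and rewrite the query if it isn't. This is my own simplified rendering of that behaviour, not the actual n8n workflow; it reuses the `client` and `retrieve` helpers from the earlier sketches, and the prompt and model are assumptions.

```python
def research(question: str, max_rounds: int = 3) -> list[str]:
    """Iteratively retrieve and refine the query until the evidence looks sufficient."""
    query, evidence = question, []
    for _ in range(max_rounds):
        evidence = retrieve(query)  # vector-store helper from the earlier sketch
        verdict = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model
            messages=[{
                "role": "user",
                "content": (f"Question: {question}\nEvidence: {evidence}\n"
                            "Reply SUFFICIENT if the evidence answers the question; "
                            "otherwise reply with a better search query."),
            }],
        ).choices[0].message.content.strip()
        if verdict.upper().startswith("SUFFICIENT"):
            break
        query = verdict  # try again with the rewritten query
    return evidence
```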

Benefits of Deep Research

  • Increased Accuracy: By cross-referencing multiple sources, the agent can produce content that is not only accurate but also rich in context.
  • Real-Time Validation: The ability to validate information as it’s retrieved helps avoid errors that can lead to misinformation.
  • Complex Query Handling: The system can break down complex queries into manageable parts, ensuring a thorough exploration of the topic.

Evaluating Traditional vs. Agentic RAG

Understanding the differences between traditional RAG and Agentic RAG helps clarify why the latter is more effective for modern applications. Traditional RAG typically relies on a two-step process: data ingestion and query execution.

Data ingestion process overview

Traditional RAG Limitations

Traditional RAG has its limitations, especially when faced with complex queries. It often struggles to provide accurate results due to its reliance on a single vector store. This can lead to incomplete or irrelevant information.

Limitations of traditional RAG

Agentic RAG Advantages

In contrast, Agentic RAG is not restricted to just vector stores. It can query various data sources, including APIs and structured databases. This flexibility allows for a more nuanced approach to information retrieval.

Agentic RAG flexibility overview

The agent can also rewrite queries based on the responses it receives. This iterative process enhances the quality of the information being retrieved. By validating and refining its queries, the agent can produce more accurate and relevant content.

Setting Up the RAG Pipeline

Setting up the RAG pipeline involves several key steps. I began by scraping data from multiple sources, ensuring that I had a comprehensive dataset to work with. The pipeline is designed to run automatically, making it efficient for ongoing content generation.

Spider Cloud scraping setup

Crawling Local Data Sources

For this project, I focused on two primary data sources: the Columbus city government website and Experience Columbus, the local tourism board’s site. These sources provide valuable local information that enriches the content I create.

Data input into No Code DB

Using No Code DB

After scraping, the data is stored in a no-code database. This choice simplifies managing large volumes of data and allows for efficient tracking of scraped pages. No Code DB is particularly useful for automating the ingestion process.

Creating embeddings for new data

Each time the scraping process runs, it generates a hash to check for updates. If changes are detected, the new data is automatically saved to the database. This ensures that the information remains current and relevant.

Checking for updates in scraped data


Cost Analysis of Agentic RAG

Cost is a significant factor when considering the implementation of Agentic RAG. This system operates differently from traditional RAG, primarily due to its ability to validate and refine queries across multiple data sources. Each interaction with the agent can consume a considerable number of tokens, leading to increased costs over time.

For instance, I tracked the costs associated with a single article generation. The total token consumption was around eighty-two thousand tokens, which amounted to approximately thirty cents at the current rate of three dollars per million tokens. This is a clear contrast to traditional RAG, where costs are generally lower due to less complex operations.
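As a sanity check on those numbers: at a flat three dollars per million tokens, 82,000 tokens comes out to roughly 25 cents, and the quoted thirty cents is plausible once output tokens are counted at their higher rate (around fifteen dollars per million on Claude 3.7 Sonnet-class models). The input/output split below is purely an assumption for illustration.

```python
# Rough per-article cost estimate; the 77k/5k input/output split is an assumption
input_tokens, output_tokens = 77_000, 5_000
input_rate, output_rate = 3 / 1_000_000, 15 / 1_000_000  # $ per token

cost = input_tokens * input_rate + output_tokens * output_rate
print(f"${cost:.2f} per article")  # ~$0.31 with this split, in line with the quoted figure
```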

Factors Affecting Cost

  • Token Consumption: Each query and response contributes to the overall token count. More complex queries lead to higher token usage.
  • Tool Utilization: The ability to call multiple tools for validation and data retrieval increases the frequency of API calls, further raising costs.
  • Model Selection: Using advanced reasoning models like Claude 3.7 can enhance performance but also adds to the cost. Cheaper models may not provide the same level of capability.

Speed and Latency Considerations

When evaluating the speed of Agentic RAG, it’s essential to consider the latency introduced by multiple tool calls. The system’s design allows for complex queries, which can result in longer response times. This trade-off is often acceptable in automated content generation scenarios where speed is less critical than accuracy.

However, if implemented in a real-time chat interface, latency could become a challenge. The system might require careful selection of tools and optimizations to ensure prompt responses while maintaining the integrity of the output.

Latency issues in Agentic RAG

Strategies to Mitigate Latency

  • Selective Tool Calls: Limit the number of tools called in a single query to reduce processing time.
  • Asynchronous Processing: Implement asynchronous workflows so results can stream back without waiting for every process to complete (see the sketch after this list).
  • Optimized Query Structures: Design queries that are efficient and reduce the number of back-and-forth calls needed for validation.
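To illustrate the asynchronous point, independent tool calls can be fired concurrently instead of one after another. The endpoints here are placeholders; the pattern (`asyncio.gather` over async HTTP calls) is what matters.

```python
import asyncio

import httpx


async def call_tool(client: httpx.AsyncClient, url: str) -> str:
    """One independent tool call (placeholder endpoint)."""
    response = await client.get(url, timeout=60)
    return response.text


async def gather_context(urls: list[str]) -> list[str]:
    """Run all tool calls concurrently to cut end-to-end latency."""
    async with httpx.AsyncClient() as client:
        return await asyncio.gather(*(call_tool(client, u) for u in urls))


# results = asyncio.run(gather_context(["https://r.jina.ai/https://www.columbus.gov"]))
```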

Complex Queries and Retrieval Accuracy

Agentic RAG excels at handling complex queries, which is a significant advantage over traditional RAG systems. The ability to break down questions into manageable parts allows for more accurate retrieval of information. This capability is particularly valuable when dealing with nuanced topics or when data is spread across various sources.

During the retrieval process, the agent can validate and refine its queries based on the responses it receives. This iterative approach significantly enhances the accuracy of the information gathered, making it a powerful tool for content generation.
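Breaking a broad question into sub-queries only takes one extra model call before retrieval. A minimal sketch of that decomposition step, again reusing `client` and `retrieve` from earlier; the prompt and model are assumptions.

```python
def decompose(question: str) -> list[str]:
    """Split a broad question into focused sub-queries, each retrieved separately."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{
            "role": "user",
            "content": ("Break this question into 2-4 short, self-contained search queries, "
                        f"one per line:\n{question}"),
        }],
    )
    lines = response.choices[0].message.content.splitlines()
    return [line.strip("-• ").strip() for line in lines if line.strip()]


# sub_queries = decompose("How are Columbus capital projects progressing and what do they cost?")
# evidence = [chunk for q in sub_queries for chunk in retrieve(q)]
```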

Benefits of Enhanced Retrieval Accuracy

  • Improved Content Quality: Higher accuracy leads to more reliable and informative content, which is crucial for maintaining reader trust.
  • Real-Time Feedback: The system’s ability to adjust queries in real-time ensures that the information stays relevant and up-to-date.
  • Better User Engagement: Accurate and well-researched articles are more likely to engage readers, enhancing their experience and encouraging return visits.

Autonomy in Agentic RAG

One of the standout features of Agentic RAG is its autonomy. Unlike traditional systems that rely on predefined workflows, this approach allows the agent to make decisions about data retrieval and query formulation on its own. This self-sufficiency is crucial for optimizing the research and writing process.

The agent can adapt its methods based on the quality of the data it retrieves. If the initial results aren’t satisfactory, it can reformulate queries and search again, ensuring that the final output meets a high standard of quality.

Advantages of Autonomous Operation

  • Efficiency: The agent can operate without human intervention, allowing for seamless content generation.
  • Continuous Improvement: The system learns from past interactions, refining its approach to enhance future outputs.
  • Flexibility: It can easily adapt to changes in data sources or query requirements, ensuring that it remains effective over time.
