How to Build AI Agents with Human-In-The-Loop

AI agents are changing how workflows run, but they can also make mistakes nobody intended. In this post, I’ll share a technique I call an Agent Safeguard, which requires human approval before critical actions are taken and makes your AI agents more reliable.

Understanding the Risks of AI Agents

AI agents can greatly enhance productivity and streamline tasks. However, they also come with potential pitfalls. Mistakes can happen, and these agents may act without proper oversight. This can lead to unintended consequences, such as posting incorrect information on social media or sending emails to the wrong recipients.

Errors can arise from misinterpretation of commands or unexpected behavior in response to certain inputs. When relying on AI agents, it’s crucial to acknowledge these risks and implement safeguards that ensure human oversight. This is where the concept of an agent safeguard becomes essential.

Introducing the Agent Safeguard

The agent safeguard is a technique I developed to ensure that any critical action taken by an AI agent requires human approval. This system acts as a checkpoint, preventing the agent from proceeding with actions like posting on social media without explicit consent.

By integrating this safeguard, I can maintain control over important tasks while still utilizing the efficiency of AI. This method is particularly effective for actions that cannot afford mistakes, such as publishing content or sending messages to clients.

Demo: The Agent in Action

To illustrate the agent safeguard, I built a demo in which I talk to the AI agent through a Telegram bot. I can ask it to research topics, generate images, and draft posts for Facebook. The whole setup is orchestrated in Make.com, which keeps it simple.

In practice, when I ask the agent to draft a Facebook post, it processes the request and generates a response. However, before anything gets published, the agent safeguard activates, sending me a message through Telegram. This message contains a link that I must click to approve the post.

Explaining the Safeguard Workflow

The safeguard workflow is triggered when the AI agent is ready to post to Facebook. Instead of posting directly, the workflow saves an approval key in a data store and sends me a message through my Telegram bot asking for approval. The AI itself never sees the approval link.

This approach is safer than building the approval step into the AI’s prompts, because the model can still misread or skip an instruction. The agent safeguard ensures that I maintain control over what gets published, minimizing the risk of errors.
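For readers who think in code rather than visual scenarios, the split described above can be sketched in Python. This is an illustration, not the Make.com scenario itself: the webhook URL, chat ID, and function names are hypothetical. The only outside fact it relies on is that Telegram's `sendMessage` method accepts `chat_id` and `text` fields.

```python
from urllib.parse import urlencode

# Hypothetical values: replace with your own webhook URL and chat ID.
APPROVAL_WEBHOOK_URL = "https://example.com/approve"
TELEGRAM_CHAT_ID = 123456789


def build_approval_message(approval_id: str, draft_text: str) -> dict:
    """Build the payload for Telegram's sendMessage method."""
    link = f"{APPROVAL_WEBHOOK_URL}?{urlencode({'approval_id': approval_id})}"
    # Keep the preview short so the message stays readable on a phone.
    preview = draft_text if len(draft_text) <= 200 else draft_text[:197] + "..."
    return {
        "chat_id": TELEGRAM_CHAT_ID,
        "text": f"Approve this Facebook post?\n\n{preview}\n\nApprove: {link}",
    }


def reply_to_agent() -> dict:
    """What the agent gets back: a status only, never the approval link."""
    return {"status": "pending_human_approval"}
```

The point of the two functions is that the link travels only in the message to the human, while the agent receives nothing it could act on.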

How the Approval Mechanism Works

Once the AI agent is ready to post, it calls a webhook that creates an approval key. This key combines a random number, a timestamp, and a secret key, so every request gets a unique key. The approval key is stored in a data store alongside relevant information like the request type and status.
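One way to combine those three ingredients is sketched below in Python. The post does not say how they are combined, so HMAC-SHA256 is an assumption here, as are the names `SECRET_KEY` and `make_approval_key`.

```python
import hashlib
import hmac
import secrets
import time

# Assumption: the secret lives in configuration, never in the AI prompt.
SECRET_KEY = b"replace-with-a-long-random-secret"


def make_approval_key(secret: bytes = SECRET_KEY) -> str:
    """Combine a random number, a timestamp, and a secret into one key."""
    random_part = secrets.randbelow(10**12)
    timestamp_ms = time.time_ns() // 1_000_000
    message = f"{random_part}:{timestamp_ms}".encode()
    # HMAC-SHA256 mixes in the secret, so the key cannot be rebuilt
    # from the random number and timestamp alone.
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

Each call returns a 64-character hex string that can be used as the approval ID in the link and as the lookup key in the data store.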

When I click the approval link in Telegram, it triggers the next part of the workflow, bypassing the AI assistant entirely. Because this step depends only on my click, there is no prompt for the AI to misinterpret and no way for it to skip the approval.
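The bypass can be pictured as a plain handler with no model call in it. The Python sketch below assumes the link carries an `approval_id` query parameter; `publish_to_facebook` is a stand-in that only records what would be posted, and all names are hypothetical.

```python
from typing import Callable
from urllib.parse import parse_qs, urlparse

published: list[str] = []  # stand-in for the real Facebook call


def publish_to_facebook(content: str) -> None:
    published.append(content)


# Maps a request type to the plain function that performs it.
ACTIONS: dict[str, Callable[[str], None]] = {"facebook_post": publish_to_facebook}

# Pending requests keyed by approval ID (normally kept in the data store).
PENDING = {"abc123": {"request_type": "facebook_post", "content": "Hello!"}}


def handle_approval_click(url: str) -> bool:
    """Run the approved action directly; no model call is involved."""
    query = parse_qs(urlparse(url).query)
    approval_id = query.get("approval_id", [""])[0]
    request = PENDING.pop(approval_id, None)
    if request is None:
        return False  # unknown or already used link
    ACTIONS[request["request_type"]](request["content"])
    return True
```

Clicking `https://example.com/approve?approval_id=abc123` publishes the stored draft exactly as it was saved; a second click finds nothing pending and does nothing.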

The workflow also handles failure cases gracefully. If the approval link has expired or something is wrong with the request, it checks the stored status and processes only valid requests, which adds a layer of security.

This safeguard not only enhances reliability but also instills confidence in using AI agents for sensitive tasks. It ensures that I remain in control, allowing the AI to assist without risking critical errors.

Managing Approval Keys and Data Storage

Managing approval keys is a critical part of the agent safeguard workflow. Each time an AI agent asks to perform a critical action, a unique approval key is generated. This key ensures that every request is distinct, reducing the chances of errors.

I store these approval keys in a data store, allowing for easy access and management. The data store holds various pieces of information, including the approval ID, request type, status, and content details. By organizing this information effectively, I can ensure that the approval process runs smoothly.
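As a rough picture of one row in that data store, here is a Python sketch. The field names follow the list above; `created_at` is an added assumption (something has to record when the request was made for expiry to work), and a plain dict stands in for the data store.

```python
import time
from dataclasses import asdict, dataclass, field


@dataclass
class ApprovalRecord:
    approval_id: str
    request_type: str        # e.g. "facebook_post"
    content: str             # the draft waiting for approval
    status: str = "pending"  # later "processed" or "expired"
    created_at: float = field(default_factory=time.time)


# A plain dict stands in for the data store in this sketch.
data_store: dict[str, dict] = {}


def save_request(record: ApprovalRecord) -> None:
    """Store one record under its approval ID."""
    data_store[record.approval_id] = asdict(record)


def pending_requests() -> list[str]:
    """List approval IDs still waiting for a decision."""
    return [k for k, v in data_store.items() if v["status"] == "pending"]
```

Keeping the status next to the content is what makes it possible to see at a glance which approvals are still outstanding.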

When a request comes in, the system checks the data store to verify the status of the approval key. If the key is valid and the request type matches, the system can proceed with the action. If not, it handles the situation accordingly, whether that means notifying me of an expired link or indicating that the request has already been processed.
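The checks described above can be written as a single function that returns one of a few named outcomes. This Python sketch is an illustration of that logic rather than the actual scenario; the `Outcome` names, the record layout, and the one-hour default are assumptions.

```python
import time
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OK = "ok"
    NOT_FOUND = "not_found"
    WRONG_TYPE = "wrong_type"
    EXPIRED = "expired"
    ALREADY_PROCESSED = "already_processed"


def check_approval(store: dict, approval_id: str, request_type: str,
                   ttl_seconds: float = 3600,
                   now: Optional[float] = None) -> Outcome:
    """Decide whether a clicked approval may proceed."""
    now = time.time() if now is None else now
    record = store.get(approval_id)
    if record is None:
        return Outcome.NOT_FOUND
    if record["request_type"] != request_type:
        return Outcome.WRONG_TYPE
    if record["status"] != "pending":
        return Outcome.ALREADY_PROCESSED
    if now - record["created_at"] > ttl_seconds:
        return Outcome.EXPIRED
    return Outcome.OK
```

Only `Outcome.OK` lets the action run; every other value can be turned into a short notification explaining why nothing was posted.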

This method not only streamlines the approval process but also adds a layer of security. By keeping track of each request and its status, I can maintain a clear overview of pending approvals and ensure that nothing slips through the cracks.

Triggering Actions Without Mistakes

When AI agents automate tasks, the actions they trigger have to be the right ones. The agent safeguard ensures that actions like posting on social media or sending emails only occur after my explicit approval.

Once the AI agent is ready to take an action, the workflow sends me a Telegram message with the approval link. This way, I have the final say before anything goes live. It’s a straightforward process, but it significantly reduces the risk of errors.

To further minimize mistakes, I’ve designed the workflow to bypass the AI assistant entirely during the approval stage. This means that once I click the approval link, the action is executed without any further input from the AI agent. I can trust that the only commands coming through are the ones I’ve consciously approved.

This direct approach eliminates miscommunication. Since the AI does not have access to the approval link, it cannot inadvertently trigger actions on its own. This setup is crucial for ensuring that sensitive tasks are handled with care and precision.

Handling Edge Cases in the Workflow

Every workflow has potential edge cases, and it’s essential to address them proactively. In my setup, I account for various scenarios that could disrupt the approval process.

For instance, if I don’t click the approval link within a specified time frame, the link expires. This prevents outdated requests from being processed. I can adjust the expiration time based on my needs, ensuring flexibility in my workflow.
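A configurable expiry window could look like the following Python sketch. The 30-minute default is an arbitrary assumption, and `expire_stale` is a hypothetical helper that sweeps a dict-based store rather than a real Make.com data store.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: 30 minutes; pick whatever window suits the workflow.
APPROVAL_TTL = timedelta(minutes=30)


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def expire_stale(store: dict, now: Optional[datetime] = None,
                 ttl: timedelta = APPROVAL_TTL) -> list[str]:
    """Mark pending requests older than `ttl` as expired; return their IDs."""
    now = now or utc_now()
    expired = []
    for approval_id, record in store.items():
        if record["status"] == "pending" and now - record["created_at"] > ttl:
            record["status"] = "expired"
            expired.append(approval_id)
    return expired
```

Changing `APPROVAL_TTL` is the only adjustment needed to make links live longer or shorter.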

Additionally, the workflow includes checks for invalid approval IDs or expired requests. If an approval ID doesn’t exist or has already been processed, the system won’t proceed with the action. Instead, it sends me a notification, allowing me to decide the next steps.
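The "already processed" case is worth a closer look, because a double click on the same link should never post twice. The Python sketch below shows one way to make approval a one-time transition. The lock, the record layout, and the message wording are all assumptions for illustration.

```python
import threading

_lock = threading.Lock()


def claim_once(store: dict, approval_id: str) -> tuple[bool, str]:
    """Flip a request from pending to processed at most once.

    Returns (may_proceed, text); when may_proceed is False, the text
    is what would be sent back as a notification.
    """
    with _lock:  # two simultaneous clicks cannot both pass the check
        record = store.get(approval_id)
        if record is None:
            return False, f"Approval {approval_id} does not exist."
        if record["status"] != "pending":
            return False, f"Approval {approval_id} is already {record['status']}."
        record["status"] = "processed"
        return True, "approved"
```

The action runs only when the first element is `True`; in every other case the second element explains what happened, and the decision about next steps stays with the human.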

These safeguards ensure that my workflows are resilient. They allow me to focus on the tasks at hand without worrying about unexpected behaviors or mistakes from the AI agent.

Alternative Data Storage Solutions

While I use a data store within my setup, there are alternative solutions available for managing approval keys and other essential data. For example, Airtable is a great option for those who prefer a more visual approach to data management.

Airtable allows for easy organization and collaboration, making it simple to track approval requests and their statuses. Using external databases can also enhance flexibility and scalability, especially for more complex workflows.

Regardless of the storage solution chosen, the key is to ensure that it integrates smoothly with the automation process. The goal is to maintain an efficient workflow while keeping oversight in place. Whether I’m using a data store, Airtable, or another solution, the focus remains on reliability and security.
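If the approval logic were written in code, one way to keep the storage choice open is to depend on a small interface rather than on a specific backend. The Python sketch below is a hypothetical design: `ApprovalStore` and `InMemoryStore` are invented names, and no real Airtable or Make.com API calls are shown.

```python
from typing import Optional, Protocol


class ApprovalStore(Protocol):
    """Anything that can read and write approval records by ID."""

    def get(self, approval_id: str) -> Optional[dict]: ...
    def put(self, approval_id: str, record: dict) -> None: ...


class InMemoryStore:
    """Simplest possible backend; useful for tests."""

    def __init__(self) -> None:
        self._rows: dict[str, dict] = {}

    def get(self, approval_id: str) -> Optional[dict]:
        return self._rows.get(approval_id)

    def put(self, approval_id: str, record: dict) -> None:
        self._rows[approval_id] = dict(record)


def mark_processed(store: ApprovalStore, approval_id: str) -> bool:
    """Works with any backend that satisfies ApprovalStore."""
    record = store.get(approval_id)
    if record is None or record["status"] != "pending":
        return False
    store.put(approval_id, {**record, "status": "processed"})
    return True
```

A class backed by Airtable or another database would only need to provide the same `get` and `put` methods, and `mark_processed` would not change.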
