
Automating customer support in banking using agentic RAG

Maksym Buleshnyi
October 21, 2024

In recent years, LLM Agents have emerged as a transformative force across industries. By utilizing a wide range of tools and advanced interaction strategies, they not only boost efficiency but also redefine how we communicate and engage with digital platforms.

Another significant advancement is Retrieval-Augmented Generation (RAG). RAG integrates vector databases with large language models, grounding responses in verified data and making information easier to scale and supervise.

Integrating Retrieval-Augmented Generation with LLMs tailors sophisticated reasoning to specific tasks and datasets, which is particularly useful in domains that require precise terminology, such as finance, medicine, and law.

Understanding Retrieval-Augmented Generation (RAG) 

RAG is an innovative AI framework that integrates vector databases with large language models. This combination offers several key advantages (a minimal code sketch follows the list):

  • Custom Information: Enterprises can leverage their own databases to create AI-powered applications tailored to their specific needs.
  • Grounded in Relevant Facts: RAG addresses hallucination, a major problem with LLMs, by ensuring responses are based on verified data.
  • Flexible Access to Information: This framework provides an easy way to manage authority levels, ensuring that sensitive information is accessible only to authorized users.
  • Ease of Scalability and Supervision: When new information is needed, it can be effortlessly incorporated by simply adding additional chunks to the vector database. Accessing this data is straightforward, which allows human supervisors to review it for quality assurance.
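
To make the mechanics concrete, here is a minimal retrieve-then-generate sketch in plain Python. It assumes only the official OpenAI client and a toy in-memory store; the documents, query, and helper names are illustrative, not part of the workflow we build later.


# A minimal RAG sketch: embed documents, retrieve the closest one, answer with context.
# Illustrative only; assumes OPENAI_API_KEY is set in the environment.
import numpy as np
from openai import OpenAI

client = OpenAI()

documents = [
    "To block a card, verify the customer's card number and PIN first.",
    "Transactions above the daily limit require a manager's approval.",
]

def embed(text: str) -> np.ndarray:
    response = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(response.data[0].embedding)

doc_vectors = [embed(doc) for doc in documents]

def retrieve(query: str, top_k: int = 1) -> list[str]:
    # Cosine similarity between the query and every stored document.
    q = embed(query)
    scores = [q @ v / (np.linalg.norm(q) * np.linalg.norm(v)) for v in doc_vectors]
    return [documents[i] for i in np.argsort(scores)[::-1][:top_k]]

query = "How do I block a lost card?"
context = "\n".join(retrieve(query))
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"}],
)
print(answer.choices[0].message.content)

In production, the in-memory list is replaced by a vector database such as Pinecone, which is exactly what the workflow below does.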

What are LLM Agents? 

LLM Agents are autonomous systems powered by LLMs that can perform various tasks, access tools (APIs), and be connected into complex structures. For more details about LLM Agents, check out our previous article.

How to use LLM agents with RAG 

Agents excel at complex reasoning, but their stochastic nature can pose challenges in domain-specific tasks where specialized knowledge is required. Integrating RAG and LLM Agents into a unified workflow yields sophisticated reasoning tailored to specific tasks and datasets.

In fields such as finance, medicine, and law, where precise terminology and nuanced understanding are essential, agents may yield less reliable results without extensive training on domain-specific datasets. Leveraging RAG effectively combines human expertise with agent capabilities, ensuring more accurate and contextually relevant outcomes.

Implementing LLM Agents + RAG flow 

This section explores employing LLM Agents with RAG, using Dynamiq as a foundational framework. Dynamiq provides a flexible interface that allows developers to easily configure flows with Agents and RAG tailored to specific application needs.

In this article, we will build an example application that lets users interact with a bank API through an LLM Agent that follows instructions retrieved from a database (DB).

The workflow will consist of two agents: 

  • The RAG Agent will have access to documentation via RAG, which will provide instructions on how to proceed with operations. 
  • The API Agent will access the API and interact with users. It will adhere to the instructions from the RAG Agent.

Understanding the code behind the workflow construction 

Test setup 

  • Pinecone will be used as a vector database provider 
  • OpenAI API will be used as the LLM provider 

Connecting to the LLM provider 

First, we create a connection to the LLM of interest (in our case, GPT-4o-mini). The temperature is set to 0.01 to ensure stable, near-deterministic execution.


# Import paths assumed from the Dynamiq SDK; adjust to your installed version.
from dynamiq.connections import OpenAI as OpenAIConnection
from dynamiq.nodes.llms import OpenAI

connection = OpenAIConnection()  # reads OPENAI_API_KEY from the environment
llm = OpenAI(
    connection=connection,
    model="gpt-4o-mini",
    temperature=0.01,  # low temperature for stable, reproducible behavior
)

The RAG Agent 

The first agent will be responsible for taking requests from the user and searching the vector database for guidance on how to proceed. 

RAG tool 

Before creating the Agent, we must first create the RAG tool that the Agent will have access to.


# Import paths assumed from the Dynamiq SDK; adjust to your installed version.
from dynamiq.nodes.embedders import OpenAITextEmbedder
from dynamiq.nodes.retrievers import PineconeDocumentRetriever
from dynamiq.nodes.tools import RetrievalTool
from dynamiq.storages.vector import PineconeVectorStore

# Embeds the Agent's search queries with the same model used to index the docs.
text_embedder = OpenAITextEmbedder(model="text-embedding-ada-002")

# Retrieves the 3 most similar chunks from the Pinecone index.
document_retriever = PineconeDocumentRetriever(
    top_k=3, vector_store=PineconeVectorStore(index_name="default", dimension=1536)
)

bank_retriever_tool = RetrievalTool(
    name="Bank FAQ Search",
    text_embedder=text_embedder,
    document_retriever=document_retriever,
)

This tool will have its own workflow including:

  • text_embedder to create embeddings for requests from the Agent (OpenAITextEmbedder Node)
  • document_retriever to retrieve chunks of data from the DB using embeddings (PineconeDocumentRetriever Node)

This tool will be executed by the Agent. The Agent will determine when to run it and what input to provide.
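
Note that the tool only reads from the index; the documentation has to be ingested beforehand. Below is a minimal seeding sketch, assuming the official OpenAI and Pinecone Python clients and an existing index named "default" (the sample chunks and the content metadata key are illustrative assumptions):


# Seed the Pinecone index with documentation chunks.
# Assumes OPENAI_API_KEY and PINECONE_API_KEY are set in the environment.
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone()
index = pc.Index("default")

docs = [
    "To block a card: ask the customer for the card number and PIN, then call 'block_card'.",
    "To issue an account report: verify the card number and PIN, then call 'request_report'.",
]

vectors = []
for i, doc in enumerate(docs):
    embedding = client.embeddings.create(model="text-embedding-ada-002", input=doc)
    vectors.append(
        {"id": f"doc-{i}", "values": embedding.data[0].embedding, "metadata": {"content": doc}}
    )

index.upsert(vectors=vectors)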

Configuring RAG Agent 

Once the tool is created, we can configure our first agent. 


# Import path assumed from the Dynamiq SDK.
from dynamiq.nodes.agents.react import ReActAgent

# Create a ReActAgent for handling bank documentation queries
agent_bank_documentation = ReActAgent(
    name="RAG Agent",
    role="Customer support assistant for Internal Bank Documentation",
    llm=llm,
    tools=[bank_retriever_tool],
)

Here, we initialize a ReActAgent, linking it to our model and RAG tool. The agent now has access to the Pinecone vector database and can query it for the required documentation. Notably, the Agent handles rephrasing of unclear human requests: if the initial query does not yield appropriate information, it generates several similar queries and tries again. The Agent also structures its output so it is readable for the second Agent.
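
Before wiring up the full workflow, the agent can be sanity-checked on its own. The quick test below assumes Dynamiq nodes expose a run(input_data=...) method and return their answer under a content key; verify against your installed version.


# Hypothetical standalone test of the RAG Agent.
result = agent_bank_documentation.run(
    input_data={"input": "How do I block a lost card?"}
)
print(result.output.get("content"))  # instructions found in the documentation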

The API Agent 

The second agent will be responsible for executing the operation. It will have access to the Bank API and can interact with users to gather required information. It will use responses from the RAG Agent to address requests according to the documentation. 

Tools 

First, we’ll need to define the tools:

  • HttpApiCall to allow the agent to send requests to the server, managing multiple endpoints.
  • HumanFeedbackTool to allow the agent to communicate with users in order to gather required information.

# Import paths assumed from the Dynamiq SDK; adjust to your installed version.
from dynamiq.connections import Http as HttpConnection
from dynamiq.nodes.tools.http_api_call import HttpApiCall

# Create connection to Bank API
connection = HttpConnection(
    method="POST",
    url="http://localhost:8000/",
)

# Create API call tool
api_call = HttpApiCall(
    connection=connection,
    name="Bank API",
    description="""
    An internal bank API.

    Available endpoints:
        * 'block_card' (int card_number, int pin_code)
        * 'make_transaction' (int card_number_sender, int card_number_reciever, int amount)
        * 'request_report' (int card_number, int pin_code)

    Choose an endpoint and pass its name in the url_path parameter.
    Parameters for the endpoint have to be passed in the `data` object.
    """,
)
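
The server side is part of the linked example code; for reference, a minimal mock of these endpoints could look like the FastAPI sketch below. The response payloads are illustrative assumptions, not the article's actual implementation.


# A minimal mock of the bank endpoints, assuming FastAPI.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class CardRequest(BaseModel):
    card_number: int
    pin_code: int

class TransactionRequest(BaseModel):
    card_number_sender: int
    card_number_reciever: int  # spelling kept to match the tool description
    amount: int

@app.post("/block_card")
def block_card(request: CardRequest):
    return {"status": "success", "message": f"Card {request.card_number} has been blocked."}

@app.post("/make_transaction")
def make_transaction(request: TransactionRequest):
    return {"status": "success", "message": f"Sent {request.amount} to card {request.card_number_reciever}."}

@app.post("/request_report")
def request_report(request: CardRequest):
    return {"status": "success", "message": f"Report for card {request.card_number} will be sent by email."}

Saved as server.py, it can be started with uvicorn server:app --port 8000 so that the tool's base URL matches.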

Configuring API Agent 

Once the tools and RAG Agent are created, we can configure our second agent.


# Import paths assumed from the Dynamiq SDK; adjust to your installed version.
from dynamiq.nodes import NodeDependency
from dynamiq.nodes.tools.human_feedback import HumanFeedbackTool

# Create user interaction tool
human_feedback_tool = HumanFeedbackTool()

def combine_inputs(inputs: dict, outputs: dict[str, dict]):
    # Merge the original user request with the RAG Agent's instructions.
    # (Assumes the workflow input dict is passed as the first argument.)
    return (
        f"Request: {inputs['input']}\n"
        f"Follow this instruction: {outputs[agent_bank_documentation.id]['content']}"
    )

# Create a ReActAgent for handling internal bank API queries
agent_bank_support = ReActAgent(
    name="API Agent",
    role="Customer support assistant with access to Internal Bank API",
    llm=llm,
    tools=[api_call, human_feedback_tool],
    depends=[NodeDependency(node=agent_bank_documentation)],
).inputs(input=combine_inputs)

Here, we initialize another ReActAgent, linking it to our model and tools. We configure this agent to depend on the RAG Agent, ensuring that it executes only after the documentation has been retrieved.

The combine_inputs function combines both the initial request (desired action) and the response from the RAG Agent (steps to follow), and passes them as an input to the API Agent.
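
As a quick illustration of the string the API Agent receives, with a hypothetical request and instruction:


# Hypothetical example of the combined input.
mock_outputs = {
    agent_bank_documentation.id: {
        "content": "Ask for the card number and PIN code, then call 'block_card'."
    }
}
print(combine_inputs({"input": "I lost my card, please block it."}, mock_outputs))
# Request: I lost my card, please block it.
# Follow this instruction: Ask for the card number and PIN code, then call 'block_card'.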

Creating and running flow 

Now these agents can be connected into a single flow. 


# Import paths assumed from the Dynamiq SDK.
from dynamiq import Workflow
from dynamiq.flows import Flow

workflow = Workflow(flow=Flow(nodes=[agent_bank_documentation, agent_bank_support]))

The flow can be run easily with the following command:


user_request = "I lost my card. Please block it."  # example request
result = workflow.run(input_data={"input": user_request})
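
To inspect the final answer, read the API Agent's output from the result. The exact shape below (per-node outputs keyed by node id) is an assumption about the Dynamiq result object; print result.output to explore it in your version.


# Assumed result shape; verify with print(result.output).
print(result.output[agent_bank_support.id]["output"]["content"])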

All the code for this example is available here.

Key Takeaways 

  • Advanced reasoning with specific data: A complex architecture can now be built on top of highly specific data, significantly expanding the range of use cases. 
  • Flexible Architecture: This framework makes systems easier to update and scale. If we need to change the behavior of our customer support example, we can now simply update the documentation in our database.
  • Domain knowledge: Domain expertise can now be easily integrated into AI applications by using well-structured and human-readable databases.

Concluding Insights 

LLM Agents and Retrieval-Augmented Generation have proven to be powerful tools. By leveraging the strengths of both frameworks, developers can create complex applications that are both flexible and scalable, setting a new standard in the industry. 

Curious to find out how Dynamiq can help you extract ROI and boost productivity in your organization?

Book a demo