How to build a research agent for your team
At Veratai, we've been building AI agents for organisations that want to automate complex information gathering workflows. After several implementations, we've developed a methodology that reliably delivers working systems rather than expensive proof-of-concept graveyards.
Our methodology consists of five distinct phases, designed to deliver early and iterate quickly. Here is exactly what we do to take an agent from concept to completion:
Phase 1: Envision
Start by running two structured workshops to understand the foundation of your research agent:
In the first, map everyone affected by - or participating in - the research workflow. This includes not just the obvious players but indirect supporters and downstream consumers of the research outputs. Document who they are and what their responsibilities are.
Scoping and governance discussions happen here too. What sources should the agent prefer or avoid? Which actions require supervision or review? How should sensitive data be handled? These constraints will shape your design, evaluations and guardrail catalogue.
In the second workshop, invite multiple experts and collectively map the research process. Research workflows are rarely standardised; practitioners will have evolved distinct approaches. Understanding these variations - and the reasoning behind them - is vital when designing agentic workflows.
As well as the process, catalogue every data source and tool currently used, then determine how an agent might access equivalent resources.
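For illustration only, here is one way the workshop outputs could be captured as a simple, structured catalogue. The field names and example entries below are hypothetical - a shared spreadsheet works just as well at this stage - but capturing sources, tools and guardrails in one place makes the later design work much easier.

    # Hypothetical workshop-output catalogue; names and entries are illustrative only.
    research_catalogue = {
        "sources": [
            {"name": "internal_reports_drive", "access": "read-only API", "preferred": True},
            {"name": "public_web_search", "access": "search API", "preferred": False},
        ],
        "tools": [
            {"name": "market_data_portal", "current_user": "analysts", "agent_equivalent": "REST API"},
        ],
        "guardrails": [
            {"rule": "never cite paywalled sources without a licence check", "enforcement": "review"},
            {"rule": "personal data must not leave the internal network", "enforcement": "hard block"},
        ],
    }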
-
At the end of this phase, you should:
Know your stakeholders:
Who is affected by the workflow
Who provides input or other forms of support (direct or indirect)
Who has a stake in the outputs
Who will help you to describe, evaluate and debug the workflow
Know your process:
What the process is at a high level
What resources (data sources, external systems etc.) the process needs
What guardrails the agent needs (i.e. what it should and shouldn’t do)
How the system will be accessed (e.g. through a bespoke UI or via integration into other systems)
What the outputs look like
Phase 2: Explore
Following the workshops, move on to designing the agent itself.
You will need to:
map the process to a state graph
define the tools the agent will use
design guardrails and human hand-off criteria
specify schemas for inputs, outputs and intermediate state (see the sketch after this list).
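As a rough illustration of what this design work produces, the sketch below models a tiny research workflow as a state graph with a typed state schema and an explicit hand-off flag. It assumes a plain-Python approach; the node names, fields and the "needs_review" criterion are all hypothetical and would come from your own workshop outputs.

    from dataclasses import dataclass, field

    # Hypothetical state schema; the fields would come from your Phase 1 mapping.
    @dataclass
    class ResearchState:
        question: str
        sources_found: list = field(default_factory=list)
        draft_findings: str = ""
        needs_review: bool = False  # human hand-off flag (a guardrail criterion)

    # Each node is one step of the mapped process: it reads state and returns updated state.
    def gather_sources(state: ResearchState) -> ResearchState:
        state.sources_found = ["..."]  # e.g. call a search tool here
        return state

    def synthesise(state: ResearchState) -> ResearchState:
        state.draft_findings = "..."  # e.g. call an LLM with the gathered sources
        state.needs_review = len(state.sources_found) < 2  # hand thin evidence to a human
        return state

    # A minimal "state graph": each node name maps to (function, next node).
    GRAPH = {
        "gather_sources": (gather_sources, "synthesise"),
        "synthesise": (synthesise, None),
    }

    def run(state: ResearchState, start: str = "gather_sources") -> ResearchState:
        node = start
        while node is not None:
            fn, node = GRAPH[node]
            state = fn(state)
        return state

In practice you might reach for an orchestration framework rather than a hand-rolled loop, but the design artefacts are the same: a state schema, a set of nodes, the edges between them, and explicit hand-off criteria.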
It’s impossible to overstate the value of collecting or curating high-quality exemplar data to cover instructions and evaluations, and this is the other task you must complete during this phase. Both collection and curation require significant effort! Past research outputs, intermediate artifacts, and source catalogues should all feed into the dataset. Where gaps exist, fill them through human labelling, synthetic generation, or additional data gathering.
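To make "exemplar data" concrete, here is one hedged example of a record format. The field names are assumptions, not a standard; what matters is that each record pairs a realistic input with a reference output and notes where it came from, so it can drive both prompting and evaluation.

    # Illustrative exemplar record; field names are assumptions, not a standard.
    exemplar = {
        "id": "ex-0042",
        "input": {"question": "Summarise recent regulatory changes affecting product X"},
        "reference_output": "Two-page briefing text...",         # e.g. a past research briefing
        "intermediate_artifacts": ["source shortlist", "notes"],  # optional, if you have them
        "provenance": "historical",   # historical | human_labelled | synthetic
        "use": ["few_shot", "eval"],  # how the record will be used
    }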
-
At the end of this phase you should:
Have prepared exemplar data to use in the build phase:
Collected relevant inputs, outputs and intermediate artifacts (e.g. past research briefings)
Filled in any gaps in data with curated or synthetic data
Have a detailed design for the agent:
Developed a logical data model for inputs, outputs and intermediate results
Agreed the metrics you will use to evaluate the agent
Documented the system design for the agent (where it will run, what technology users will interact with, and what will orchestrate the agent)
Built the agent design (state graph, tool registry, data indexes, initial prompts)
Phase 3: Execute
Aim to release a first version of the agent within eight weeks.
Work in development cycles with weekly or fortnightly beats.
During early cycles, focus on core functionality but bake in observability from the start: you need to see what the agent is doing in order to improve it. Early releases should simply demonstrate that the deployment pipeline works and real users can interact with the system.
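One lightweight way to bake in observability, sketched under the assumption of a plain-Python agent loop like the one in Phase 2: wrap every node so its status, latency and errors are logged as structured events. A tracing framework or observability platform can replace this later; the point is that every step is visible from day one.

    import json, logging, time, functools

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("agent")

    def observed(step_name):
        """Decorator that logs a structured event for every agent step (a minimal sketch)."""
        def wrap(fn):
            @functools.wraps(fn)
            def inner(*args, **kwargs):
                start = time.time()
                try:
                    result = fn(*args, **kwargs)
                    status = "ok"
                    return result
                except Exception:
                    status = "error"
                    raise
                finally:
                    log.info(json.dumps({
                        "step": step_name,
                        "status": status,
                        "duration_s": round(time.time() - start, 3),
                    }))
            return inner
        return wrap

    @observed("gather_sources")
    def gather_sources(state):
        ...  # the real step
        return state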
Each cycle should follow a structured pattern: implement functionality, add observability, conduct structured reviews with domain experts, then finesse the plan based on their feedback. Only during later cycles should you shift focus toward performance, cost engineering, and edge-case handling.
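To ground those structured reviews, a simple evaluation pass over the Phase 2 exemplar set gives experts something concrete to react to. The scoring function below is a placeholder assumption - crude word overlap - and in practice you would swap in the metrics agreed during Explore (exact-match checks, reviewer rubric scores, or LLM-assisted grading).

    def score(output: str, reference: str) -> float:
        """Placeholder metric: crude word overlap. Swap in your agreed metrics."""
        out_words, ref_words = set(output.lower().split()), set(reference.lower().split())
        return len(out_words & ref_words) / max(len(ref_words), 1)

    def evaluate(agent_fn, exemplars):
        """Run the agent over each exemplar and report the average score."""
        results = []
        for ex in exemplars:
            output = agent_fn(ex["input"]["question"])
            results.append({"id": ex["id"], "score": score(output, ex["reference_output"])})
        avg = sum(r["score"] for r in results) / max(len(results), 1)
        return avg, results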
The primary goal of a greenfield agent build is to demonstrate successful integration into your business workflow and to gain buy-in across the organisation. Don’t fall into the trap of over-optimising the agentic AI elements at this point - provided your agent shows reasonable performance, focus instead on building a usable system that earns further buy-in.
-
At the end of this phase you should have:
A deployed, working agent that you can successfully demo!
Phase 4: Engage
Collect feedback on agent performance by carefully monitoring usage and end-user assessments of the research outputs.
The early life of a deployed agent is a golden opportunity to get feedback, and this is your primary task during the Engage phase.
We typically recommend tracking usage volumes, costs, and quality metrics to establish that business value is being generated and that the software is proving “sticky” with end users. Surveys and interviews with users will reveal gaps between your assumptions and reality.
Collect any data you can on end users’ assessments of the research outputs. We do this by adding the facility for users to accept, reject, rerun or edit the research artifacts and logging what users choose to do. This data can be used in Phase 5 to optimise your prompts, model weights or workflow.
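Here is a sketch of what that feedback logging might look like, assuming nothing more than an append-only event log. The event fields mirror the accept / reject / rerun / edit options described above and are otherwise hypothetical.

    import json, time

    FEEDBACK_LOG = "feedback_events.jsonl"
    ALLOWED_ACTIONS = {"accept", "reject", "rerun", "edit"}

    def record_feedback(user_id: str, artifact_id: str, action: str, edited_text: str | None = None):
        """Append one user decision about a research artifact to a JSONL log."""
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        event = {
            "ts": time.time(),
            "user_id": user_id,
            "artifact_id": artifact_id,
            "action": action,
            "edited_text": edited_text,  # only present when the user edits the output
        }
        with open(FEEDBACK_LOG, "a") as f:
            f.write(json.dumps(event) + "\n")

These events double as quality metrics (acceptance rate per research type is a useful early signal) and as raw material for the Enhance phase.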
The Engage phase typically surfaces additional needs - things that only become apparent when the system encounters real-world complexity. Refining the human-in-the-loop mechanisms proves particularly important here, allowing graceful handling of cases the agent cannot resolve independently.
-
At the end of this phase, you should have:
Buy-in from across the organisation on future use of the agent
A roadmap of improvement requests
An initial estimate of the business value generated
Phase 5: Enhance
Steady-state operations focus on continuous improvement.
Tasks include model migrations (as better models become available), fine-tuning based on accumulated usage data and expanding the agent's capabilities both in breadth (handling more research types) and depth (more sophisticated analysis within existing domains).
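For example, the feedback events logged during Engage can be turned into supervised training pairs. This is only a sketch of the data-preparation step: accepted and edited outputs become targets, and the exact output format would depend on the fine-tuning API you use. The artifacts lookup assumed here is hypothetical.

    import json

    def build_training_pairs(feedback_path: str, artifacts: dict) -> list[dict]:
        """Turn accepted/edited research outputs into (prompt, target) pairs for fine-tuning."""
        pairs = []
        with open(feedback_path) as f:
            for line in f:
                event = json.loads(line)
                artifact = artifacts.get(event["artifact_id"])
                if artifact is None:
                    continue
                if event["action"] == "accept":
                    pairs.append({"prompt": artifact["question"], "target": artifact["output"]})
                elif event["action"] == "edit" and event.get("edited_text"):
                    pairs.append({"prompt": artifact["question"], "target": event["edited_text"]})
        return pairs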
Our approach, where possible, is to launch an agent with 50% of its capability in place and 50% handed off to humans. Then we systematically decrease the hand-offs by increasing the number of “edge cases” the agent can handle. The most successful deep research agent projects are those that plan for this continuous development rather than expecting a one-time implementation.
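One simple way to operationalise that 50/50 split and then shrink it, sketched with a hypothetical routing rule: maintain an explicit list of case types the agent is trusted to finish on its own, and route everything else to a human queue. Expanding that list (or raising the confidence threshold) is how the hand-off share comes down over time.

    # Hypothetical routing rule; the case types and threshold are illustrative.
    AUTONOMOUS_CASE_TYPES = {"standard_market_brief", "competitor_snapshot"}  # grows over time
    CONFIDENCE_THRESHOLD = 0.8

    def route(case_type: str, agent_confidence: float) -> str:
        """Decide whether the agent completes the case or hands it off to a human."""
        if case_type in AUTONOMOUS_CASE_TYPES and agent_confidence >= CONFIDENCE_THRESHOLD:
            return "agent_completes"
        return "human_handoff"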
This phase ends whenever you decide it should. You’ll very quickly have a long roadmap of features to add, so keep going as adoption and usage grow.
Afterword
This methodology isn't revolutionary, but it has worked well for us and for our clients across diverse domains. The structured approach ensures you build something useful, rather than something merely impressive.
We wish you the best of luck with your own projects. If this is helpful, please let us know, and if you’d like help (or just to know more), please give us a call.
