How Portia ensures reliable agents with evals and our in-house framework
At Portia we spend a lot of time thinking about what it takes to make agents reliable and production-worthy. Many people find it easy to build an agent for a proof of concept, but much harder to get it into production. Doing so takes a real focus on production readiness and a suite of supporting features, many of which are available in our SDK and which we've covered in previous blog posts:
- User Led Learning for reliable planning
- Agent Memory for large data sets
- Human-in-the-loop clarifications that let agents raise questions back to humans
- Separate planning and execution phases for constrained execution
But today we want to focus on the meta question: how do we know these features actually improve the reliability of agents built on top of them? The answer is evals.