2 posts tagged with "browser-agents"

A unified framework for browser and API authentication

· 5 min read
Emma Burrows
Co-founder and CTO

The core of the Portia authorization framework is the ability for an agent to pause itself and ask the user to authorize an action it wants to perform. With delegated OAuth, we do this by creating an OAuth link that the user clicks to grant Portia a token, which the agent then uses for its API requests. We generally prefer API-based agents for reliability reasons: they're fast and predictable, and the rise of MCP means integration is getting easier.

However, some actions are not easily accessible by API (surprisingly, my supermarket doesn't have a delegated OAuth flow!), so there is huge power in being able to switch seamlessly between browser-based and API-based tasks. The question was how to do this consistently and securely within our authorization framework.
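To make the pause-and-authorize idea concrete, here is a minimal sketch of an agent run that stops when it has no token, hands the user an OAuth link, and resumes once a token is granted. This is not Portia's actual API: `AgentRun`, `step`, `resume`, the state names, the endpoint URL and the client ID are all hypothetical names chosen for illustration. Only the query parameters follow the standard OAuth authorization request.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Hypothetical values, for illustration only.
AUTH_ENDPOINT = "https://auth.example.com/oauth/authorize"
CLIENT_ID = "demo-client"


@dataclass
class AgentRun:
    """Tracks one agent run and whether it is waiting on the user."""
    state: str = "IN_PROGRESS"
    auth_url: Optional[str] = None
    token: Optional[str] = None


def build_auth_url(scope: str, run_id: str) -> str:
    """Build the link the user clicks, using standard OAuth query parameters."""
    query = urlencode({
        "response_type": "code",
        "client_id": CLIENT_ID,
        "scope": scope,
        "state": run_id,  # lets the callback be matched to the paused run
    })
    return f"{AUTH_ENDPOINT}?{query}"


def step(run: AgentRun, run_id: str, scope: str) -> AgentRun:
    """Advance the run: pause if there is no token, otherwise proceed."""
    if run.token is None:
        run.state = "NEEDS_AUTHORIZATION"
        run.auth_url = build_auth_url(scope, run_id)
        return run
    # With a token in hand, the agent would make its API request here.
    run.state = "COMPLETE"
    return run


def resume(run: AgentRun, token: str) -> None:
    """Called once the user has completed the OAuth flow."""
    run.token = token
    run.auth_url = None
    run.state = "IN_PROGRESS"
```

In this sketch the caller shows `run.auth_url` to the user whenever `step` returns a run in the `NEEDS_AUTHORIZATION` state, then calls `resume` and `step` again once the token arrives. A browser-based task could presumably reuse the same pause state with a different kind of link, which is the consistency question the post raises.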

What's next for Browser Agents? 🤔

· 5 min read
Emma Burrows
Co-founder and CTO
Mounir Mouawad
Co-founder and CEO
TLDR

I've been tinkering with browser automation recently (e.g., building a bot to search and buy on Amazon), and Operator’s release got me thinking about the future of these tools. Here are 3 key challenges browser agents face today:
1️⃣ Moving from text-only to multi-modal AI models.
2️⃣ Solving authentication without blending in with bad bots.
3️⃣ Enabling human-in-the-loop collaboration that's seamless and smart.

In this post we unpack these challenges, share insights, and explore what’s next for browser agents. Would you trust browser agents with your day-to-day tasks? Let me know your thoughts! 👇