You’re planning your next weekend. You ask Anton, your autonomous agent, to look up the best restaurants that fit your schedule and match your preferences, and then to make a reservation.
That’s the promise of autonomous agents: intelligent bots powered by language models that can break down intricate problems and resolve them step by step while performing tasks for users. An agent segments a larger task into smaller components and uses memory at each stage to direct its subsequent actions.
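To make the decompose-then-remember loop concrete, here is a minimal sketch. It is not how Auto-GPT or any particular agent is implemented: `Agent`, `toy_plan`, and `toy_act` are hypothetical names, and the two callables stand in for what would be language-model calls in a real agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    # Hypothetical stand-ins for language-model calls (assumptions, not a real API).
    plan: Callable[[str], List[str]]      # splits a task into smaller steps
    act: Callable[[str, List[str]], str]  # performs one step, given earlier results
    memory: List[str] = field(default_factory=list)

    def run(self, task: str) -> List[str]:
        for step in self.plan(task):
            # Each step receives a copy of everything produced so far.
            result = self.act(step, list(self.memory))
            self.memory.append(result)
        return self.memory


def toy_plan(task: str) -> List[str]:
    # A fixed plan; a real agent would ask a model to produce this.
    return [f"search: {task}", "filter by schedule and preferences", "make reservation"]


def toy_act(step: str, memory: List[str]) -> str:
    # Records the step and how much earlier context was available to it.
    return f"done[{step}] using {len(memory)} earlier result(s)"


agent = Agent(plan=toy_plan, act=toy_act)
history = agent.run("dinner on Saturday")
for line in history:
    print(line)
```

The point of the sketch is the shape of the loop: every step sees the results of the steps before it, which is what lets an agent carry context forward within a single task.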
The concept of autonomous agents captured the public imagination, propelling Auto-GPT to 125k stars on GitHub in just five weeks. To put this number in context, Auto-GPT has surpassed repositories that are widely used in production by large-scale organizations, such as Apache Spark (36k stars), StableDiffusion (22k stars), and PyTorch (74k stars).
There are many real use cases that people have discovered. In one example, Auto-GPT is able to recursively debug, develop, and self-improve its own code so that it works out of the box. This avoids the problem where we end up with code that doesn’t work and have to ask ChatGPT to debug many issues.
In yet another example, you can set Auto-GPT to act as an autonomous agent to build a business. It researches low-cost business models, identifies target markets, and more, all autonomously. Auto-GPT's ability to brainstorm ideas and evaluate profitability showcases the power of autonomous AI.
Key limitations
While Auto-GPT and similar autonomous agents certainly make for impressive demos, there’s still a lot of work left to do before these projects can be used in production. As it stands today, autonomous agents are limited by 1) restricted subsets of functionality, 2) lack of reliability, 3) lack of memory, and 4) cost.
Limited functionality
Auto-GPT’s functionality is very limited. Today, it supports only searching the web, managing memory, interacting with files, executing code, and generating images. It can’t write functioning applications, or scrape data from the web and summarize it for you in a digestible way (e.g. in a spreadsheet). Demos are carefully curated to showcase the art of the possible, but what the tool can actually do today is narrow and not entirely reliable.
Lack of reliability
Many prompts to Auto-GPT take around 50 chained steps to solve. GPT-4 is already known to hallucinate and misunderstand a person’s intent. When many prompts are strung together in sequence, each step builds on the output of the one before, so the impact of hallucination is magnified to the point that the end result may not resemble the original ask at all.
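A rough back-of-the-envelope calculation shows why long chains are fragile. The sketch below assumes every step succeeds independently with the same probability, which is a simplification, and the per-step rates are illustrative values rather than measured ones. `chain_success` is a hypothetical helper name.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes steps are independent and equally reliable (a simplification).
    """
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    return per_step ** steps


# Illustrative per-step success rates, not measurements of any model.
for rate in (0.99, 0.95, 0.90):
    print(f"per-step {rate:.2f} -> 50-step chain {chain_success(rate, 50):.3f}")
```

Under these assumptions, a step that is right 99% of the time still leaves a 50-step chain succeeding only about 60% of the time, and at 95% per step the chain succeeds less than 8% of the time.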
Lack of memory
Autonomous agents available today don’t retain memory across runs. Every time you spin up an agent for a specific task, it has no knowledge of what you previously asked or of the broader context of your inquiry. This limits an agent’s ability to reuse the outputs of previous tasks, which hurts performance, and its ability to create a truly personalized experience.
Cost
Autonomous agents like Auto-GPT are built to run sequences of GPT API calls, so the cost of using such tools today is high. A typical task carried out by Auto-GPT may cost north of $15, which makes widespread adoption hard to envision. It’s possible to bring costs down by using cheaper models like the open-source GPT4All, but the overall compute cost will still be high.
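To see how a chain of API calls adds up, here is a simple cost estimate. Every input is a placeholder chosen for illustration: the step count, token counts, and per-1,000-token prices are assumptions, not figures quoted from any provider or measured from Auto-GPT. `estimate_task_cost` is a hypothetical helper name.

```python
def estimate_task_cost(
    steps: int,
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price_per_1k: float,
    completion_price_per_1k: float,
) -> float:
    """Estimated cost in dollars for a task made of identical model calls."""
    per_call = (
        prompt_tokens / 1000 * prompt_price_per_1k
        + completion_tokens / 1000 * completion_price_per_1k
    )
    return steps * per_call


# Placeholder inputs for illustration only.
cost = estimate_task_cost(
    steps=50,
    prompt_tokens=4000,
    completion_tokens=500,
    prompt_price_per_1k=0.03,
    completion_price_per_1k=0.06,
)
print(f"Estimated cost: ${cost:.2f}")
```

With these made-up inputs the estimate comes to about $7.50. The cost scales linearly with the number of steps, and in practice prompts tend to grow as an agent accumulates context, so later calls can cost more than earlier ones.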
What we’re excited about
We’re currently most excited about approaches to incorporating autonomous agents inside of applications themselves, to execute tasks and present the results for human review. Applications have been moving toward multi-player for years, and now we expect AI agents to be the next “players”.
Several companies have begun creating their own versions of an agentic copilot, which is the first step toward this paradigm. Early examples include Height, Durable, and Monterey. Microsoft also seems to be launching a copilot for every application in its stack. Some agents are further along the path to autonomy than others, but our expectation is that people will become more comfortable giving away greater control as products in the space mature and demonstrate reliability. If the rapid ascent of Auto-GPT is any indication, the appetite for autonomy seems to be there.
-
Building something in AI or data infrastructure? Reach out! brittany at crv dot com or brian at crv dot com.
–
If you like learning about where AI is today, and where it might be heading, you might also want to check out these two recent AI-focused posts (#PowerToTheConsumer: Insights from 30 Leading Consumer AI Founders, Operators and Thinkers and #PowerToTheAIBuilder — AI Will Be Core to Every Software Application) from our CRV colleagues.