AutoGPT

Traditional autonomous agents operated with limited knowledge, often confined to specific tasks or environments. They were like calculators — efficient but limited to predefined functions. LLM-based agents, on the other hand, are akin to having an encyclopedia combined with a calculator. They don’t just compute; they understand, reason, and then act, drawing from a vast reservoir of information.

The Agent Landscape Survey underscores this evolution, detailing the remarkable potential LLMs have shown in achieving human-like intelligence. They’re not just about more data; they represent a more holistic approach to AI, bridging gaps between isolated task knowledge and expansive web information.

Further expanding on this, The Rise and Potential of Large Language Model Based Agents: A Survey portrays LLMs as the foundational blocks for the next generation of AI agents. These agents sense, decide, and act, all backed by the comprehensive knowledge and adaptability of LLMs. It is an incredible source of knowledge on AI agent research, with almost 700 papers referenced and organised by research area.

1. Profile

When we humans focus on various tasks, we condition ourselves for those tasks. Whether we’re writing, chopping vegetables, driving, or playing sports, we concentrate and even adopt different mindsets. This adaptability is what the concept of profile alludes to when discussing agents. Research has shown that simply informing an agent that it is an expert in a specific task can enhance its performance.

The profiling module has potential applications beyond just prompt engineering. It could be used to adjust an agent’s memory functions, available actions, or even the underlying large language model (LLM) that drives the agent.
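As a minimal sketch of the idea, a profile can be modelled as a small data object that produces the agent's system prompt and also carries settings for the model and the permitted actions. Everything named here (`AgentProfile`, `model`, `allowed_abilities`, the placeholder model name) is a hypothetical illustration, not an AutoGPT API.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    """Hypothetical profile object; names are illustrative only."""

    role: str
    expertise: str
    model: str = "example-model"  # placeholder, not a real model name
    allowed_abilities: list[str] = field(default_factory=list)

    def system_prompt(self) -> str:
        # Telling the model it is an expert is the simplest form of profiling.
        return f"You are {self.role}, an expert in {self.expertise}."


profile = AgentProfile(
    role="a technical writer",
    expertise="API documentation",
    allowed_abilities=["read_file", "write_file"],
)
print(profile.system_prompt())
# prints: You are a technical writer, an expert in API documentation.
```

Keeping the model name and ability list on the same object is one way to let a profile influence more than the prompt text, as described above.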

2. Memory

Memory, for an agent, is more than just storage: it is the bedrock of its identity and capabilities, and fundamental to its ability to learn. Just as our memories inform our decisions, reactions, and even our very personalities, an agent’s memory serves as its cumulative record of past interactions, learnings, and feedback. Two primary types of memory shape an agent’s cognition: long-term and short-term.

The Long-Term Memory is akin to the agent’s foundational knowledge, a vast reservoir that encompasses data and interactions spanning extended periods. It’s the agent’s historical archive, guiding its core behaviors and understanding.

On the other hand, the Short-Term (or Working) Memory focuses on the immediate, handling transient memories much like our recollection of recent events. While essential for real-time tasks, not all short-term memories make it to the agent’s long-term storage.

An emerging concept in this realm is Memory Reflection. Here, the agent doesn’t just store memories but actively revisits them. This introspection allows the agent to reassess, prioritize, or even discard information, akin to a human reminiscing and learning from past experiences.
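The three ideas above can be sketched together in a few lines. This is an illustrative toy, assuming a bounded short-term buffer, an unbounded long-term list, and a numeric `importance` score supplied by the caller; the class names, the score, and the threshold are all assumptions made for the example. A real agent would more likely use a vector store and have the LLM judge importance.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    importance: float  # assumed score in [0, 1]; how it is computed is out of scope


class AgentMemory:
    """Toy memory with a bounded short-term buffer and a long-term list."""

    def __init__(self, short_term_size: int = 3) -> None:
        # The oldest short-term items fall off automatically once the buffer is full.
        self.short_term: deque[MemoryItem] = deque(maxlen=short_term_size)
        self.long_term: list[MemoryItem] = []

    def remember(self, text: str, importance: float) -> None:
        self.short_term.append(MemoryItem(text, importance))

    def reflect(self, threshold: float = 0.5) -> int:
        """Revisit short-term items: promote the important ones, discard the rest."""
        kept = [m for m in self.short_term if m.importance >= threshold]
        self.long_term.extend(kept)
        self.short_term.clear()
        return len(kept)


memory = AgentMemory()
memory.remember("user prefers metric units", importance=0.9)
memory.remember("user said hello", importance=0.1)
promoted = memory.reflect()  # only the first item reaches long-term memory
```

The `reflect` step is where "not all short-term memories make it to long-term storage" becomes concrete: anything below the threshold is dropped.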

3. Planning

Planning is the agent’s roadmap to problem-solving. When faced with a complex challenge, humans instinctively break it down into bite-sized, manageable tasks — a strategy mirrored in LLM-based agents. This methodical approach enables agents to navigate problems with a structured mindset, ensuring comprehensive and systematic solutions.

There are two dominant strategies in the agent’s planning toolkit. The first, Planning with Feedback, is an adaptive approach. Here, the agent refines its strategy based on outcomes, much like iterating through versions of a design based on user feedback.

The second, Planning without Feedback, sees the agent as a strategist, relying solely on its pre-existing knowledge and foresight. It’s a game of chess, with the agent anticipating challenges and preparing several moves in advance.
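The difference between the two strategies can be shown with a small sketch. The functions `execute` and `revise` below are hypothetical stand-ins for what would normally be LLM or tool calls; the step names are made up for the example.

```python
from typing import Callable

Step = str
Log = list[tuple[Step, bool]]


def run_without_feedback(plan: list[Step], execute: Callable[[Step], bool]) -> Log:
    """Execute a fixed plan; outcomes are recorded but never change the plan."""
    return [(step, execute(step)) for step in plan]


def run_with_feedback(
    plan: list[Step],
    execute: Callable[[Step], bool],
    revise: Callable[[Step], list[Step]],
    max_steps: int = 10,
) -> Log:
    """After each failed step, ask `revise` for replacement steps to try next."""
    queue = list(plan)
    log: Log = []
    while queue and len(log) < max_steps:  # max_steps guards against endless re-planning
        step = queue.pop(0)
        ok = execute(step)
        log.append((step, ok))
        if not ok:
            queue = revise(step) + queue
    return log


# Stubs standing in for real tool and LLM calls.
def execute(step: Step) -> bool:
    return step != "download report"  # pretend this one step always fails


def revise(failed: Step) -> list[Step]:
    return ["download report from mirror"] if failed == "download report" else []


plan = ["download report", "summarize report"]
fixed = run_without_feedback(plan, execute)
adaptive = run_with_feedback(plan, execute, revise)
```

With these stubs, `fixed` carries on to the summary step even though the download failed, while `adaptive` inserts the replacement step before continuing.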

4. Action

After the introspection of memory and the strategizing of planning comes the finale: Action. This is where the agent’s cognitive processes manifest as tangible outcomes through the agent’s Abilities. Every decision, every thought, culminates in the action phase, translating abstract concepts into definitive results.

Whether it’s penning a response, saving a file, or initiating a new process, the action component is the culmination of the agent’s decision-making journey. It’s the bridge between digital cognition and real-world impact, turning the agent’s electronic impulses into meaningful and purposeful outcomes.
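One common way to connect a decision to an action is a registry that maps ability names to ordinary functions. The sketch below assumes the agent's decision arrives as a dictionary with `name` and `args` keys; the registry, the decorator, and the two example abilities are hypothetical and are not taken from the AutoGPT codebase.

```python
from typing import Callable

# Hypothetical registry mapping ability names to plain Python functions.
ABILITIES: dict[str, Callable[..., str]] = {}


def ability(name: str):
    """Decorator that registers a function under the given ability name."""

    def register(func: Callable[..., str]) -> Callable[..., str]:
        ABILITIES[name] = func
        return func

    return register


@ability("write_response")
def write_response(text: str) -> str:
    return f"response: {text}"


@ability("word_count")
def word_count(text: str) -> str:
    return str(len(text.split()))


def act(decision: dict) -> str:
    """Turn a decision of the assumed form {"name": ..., "args": {...}} into a result."""
    name = decision["name"]
    if name not in ABILITIES:
        raise ValueError(f"unknown ability: {name}")
    return ABILITIES[name](**decision.get("args", {}))


result = act({"name": "word_count", "args": {"text": "turn plans into outcomes"}})
# result == "4"
```

Rejecting unknown ability names explicitly keeps a malformed or hallucinated decision from silently doing nothing.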
