2025-04-08 claude [article](https://techcrunch.com/2025/03/31/amazon-unveils-nova-act-an-ai-agent-that-uses-a-web-browser/)
### Amazon Unveils Nova Act: A General-Purpose AI Agent
#### SUMMARY
Amazon has unveiled Nova Act, a general-purpose AI agent that can control a web browser and perform simple tasks independently. Nova Act will power key features of Amazon's upcoming Alexa+ upgrade and is being released alongside an SDK for developers to build agent prototypes. Although it is entering a crowded space, Amazon claims Nova Act outperforms competing agents from OpenAI and Anthropic in several of the company's internal tests.
#### Detailed Summary
Amazon has introduced Nova Act, a general-purpose AI agent designed to take control of web browsers and independently perform simple actions. Released as a "research preview," the agent comes with the Nova Act SDK, a toolkit for developers to build agent prototypes. This technology will also power key features in the upcoming Alexa+ upgrade, Amazon's generative AI-enhanced version of its popular voice assistant.
Nova Act represents Amazon's entry into the AI agent market, where it competes with similar technologies such as OpenAI's Operator and Anthropic's Computer Use. These systems aim to navigate the web on behalf of users, which would make current AI chatbots significantly more practical. Although Amazon isn't first to market with this technology, its integration with Alexa+ could give it the widest reach among consumers.
According to Amazon, developers using the Nova Act SDK will be able to automate basic user actions such as ordering food or making reservations. The toolkit enables developers to create tools that help AI agents navigate web pages, complete forms, and interact with calendar interfaces. Amazon claims Nova Act outperforms competitors in internal tests, scoring 94% on the ScreenSpot Web Text evaluation compared to OpenAI's 88% and Anthropic's 90%, though it hasn't been benchmarked using more standard agent evaluations like WebVoyager.
Nova Act is the first public release from Amazon's new AGI (artificial general intelligence) lab in San Francisco, led by former OpenAI researchers David Luan and Pieter Abbeel. Although the agent focuses on seemingly simple tasks like ordering food, Luan sees agent technology as a crucial step toward developing superintelligent AI systems. He defines AGI as "an AI system that can help you do anything a human does on a computer" and notes that the Nova Act SDK was designed to reliably handle short, simple tasks while allowing developers to specify when human intervention is needed.
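The design idea in the last sentence, short steps plus developer-specified points for human intervention, can be sketched in plain Python. This is an illustration only and not the Nova Act SDK's API: `Step`, `run_workflow`, and the `act` and `ask_human` callbacks are hypothetical names introduced here, and the stub callbacks stand in for a real browser agent and a real approval prompt.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One short instruction for a (hypothetical) browser agent."""
    instruction: str
    needs_human: bool = False  # developer marks steps that require approval


def run_workflow(
    steps: List[Step],
    act: Callable[[str], bool],
    ask_human: Callable[[str], bool],
) -> List[str]:
    """Run steps in order; pause for approval where requested.

    `act` performs one instruction and returns True on success.
    `ask_human` returns True if a person approves the flagged step.
    Both are assumptions for this sketch, not real SDK calls.
    """
    log: List[str] = []
    for step in steps:
        if step.needs_human and not ask_human(step.instruction):
            log.append(f"stopped before: {step.instruction}")
            break
        ok = act(step.instruction)
        log.append(f"{'done' if ok else 'failed'}: {step.instruction}")
        if not ok:
            break  # stop at the first failure instead of continuing blindly
    return log


if __name__ == "__main__":
    workflow = [
        Step("open the restaurant's ordering page"),
        Step("add a salad to the cart"),
        Step("submit payment", needs_human=True),
    ]
    # Stub agent that always succeeds; stub human who declines the payment.
    for line in run_workflow(workflow, act=lambda s: True, ask_human=lambda s: False):
        print(line)
```

Keeping each step small means a failure or a declined approval stops the workflow at a known point, which is one plausible reading of "reliable on short, simple tasks."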
The release comes at a critical time for Amazon's AI strategy: early tests of Nova Act may indicate what the delayed Alexa+ upgrade will be capable of. Reliability across different domains remains a significant challenge for all current AI agents. Competitors' existing systems are slow, struggle to operate independently, and make mistakes a human would not. It remains to be seen whether Amazon has overcome these limitations or will face similar issues.
#### OUTLINE
- Nova Act Introduction
  - General-purpose AI agent for web browser control
  - Released as a "research preview"
  - Accompanied by Nova Act SDK for developers
  - Will power features in upcoming Alexa+ upgrade
- Market Position
  - Competing with OpenAI's Operator and Anthropic's Computer Use
  - Not first to market but potentially widest reach through Alexa+
  - Available at nova.amazon.com alongside other Nova foundation models
- Capabilities and Performance
  - Automates basic user actions (ordering food, making reservations)
  - Tools for web navigation, form completion, calendar interaction
  - Claims superior performance in Amazon's internal testing
  - 94% score on ScreenSpot Web Text vs. competitors' 88% (OpenAI) and 90% (Anthropic)
  - Not benchmarked on common evaluations like WebVoyager
- Development Background
  - First product from Amazon's San Francisco AGI lab
  - Led by former OpenAI researchers David Luan and Pieter Abbeel
  - Luan views agents as stepping stone to superintelligent AI
  - Designed for reliability on simple tasks with defined human intervention
- Strategic Importance
  - Critical for Amazon's AI strategy
  - May provide insights into delayed Alexa+ capabilities
  - Entering a field where competitors struggle with reliability
  - Challenges include speed, independence, and error avoidance
#### What Makes This Development Special
##### Genius
The integration of agent technology directly into Alexa represents a strategic fusion of AI research with consumer products, potentially allowing Amazon to leapfrog competitors in practical AI deployment despite not being first to market with the underlying technology.
##### Interesting
Amazon's approach emphasizes reliability for simple tasks rather than full autonomy, suggesting a more pragmatic development path than competitors who may be struggling with autonomous operation for complex tasks.
##### Significant
As the first public product from Amazon's AGI lab, Nova Act signals Amazon's serious investment in agentic AI. It also shows how the company defines and approaches AGI: as systems that can do what a human does on a computer, rather than as abstract intelligence.
##### Surprising
Despite the ambition implied by an "AGI lab," Amazon's first offering focuses on relatively mundane tasks like ordering food, indicating a bottom-up approach to building increasingly capable systems rather than attempting to solve AGI all at once.
#### TABLE
|Feature|Nova Act|Competitors|Significance|
|---|---|---|---|
|Developer|Amazon AGI lab|OpenAI (Operator), Anthropic (Computer Use)|First product from Amazon's new AGI initiative|
|Leadership|David Luan, Pieter Abbeel (ex-OpenAI)|N/A|Brings experience from both research and startups|
|Performance|94% on ScreenSpot Web Text|OpenAI: 88%, Anthropic: 90%|Claims a performance edge in Amazon's internal testing|
|Consumer Integration|Will power Alexa+|Limited consumer deployment|Potential for widest consumer reach|
|Availability|Research preview with SDK|Various stages of limited release|Focused on developer adoption first|
|Design Philosophy|Reliable for simple tasks with human oversight|Aim for autonomous operation but reported to be slow and error-prone|Pragmatic approach to agent development|
|Target Use Cases|Food ordering, reservations, form filling|Web browsing, information retrieval|Focuses on high-frequency consumer actions|
|Strategic Importance|Critical for Alexa+ success|Important but not tied to existing products|Represents a "make-or-break" moment for Amazon's AI|