Amazon Automates Payment System Testing with Multi-Agent AI
- Amazon AMET Payments team reduces test case generation time from one week to hours using SAARAM.
- Multi-agent system utilizes Amazon Bedrock and Strands Agents SDK to automate complex QA workflows.
- Human-centric design mirrors expert cognitive patterns to minimize AI hallucinations and improve testing coverage.
Amazon's payments team has launched SAARAM, a system that uses multiple AI agents to automate software testing. Traditionally, quality assurance engineers spent a full week manually analyzing documents to create test cases. By leveraging the Strands Agents SDK and Large Language Models (LLMs) via Amazon Bedrock, the team now generates specific, actionable test scenarios in just a few hours. Lead developers Jayashree R and Fahim Surani designed the system to free engineers for strategic tasks rather than repetitive documentation.

The breakthrough occurred when developers shifted from simple instructions to a human-centric architecture. Instead of treating the AI as a single brain, they decomposed the testing process into specialized steps that mirror how human experts think. These steps include analyzing customer journeys, identifying business rules, and mapping data flows. This modular design helps the system handle complex logic, such as regional payment regulations, without getting confused or hallucinating, that is, making up false information.

The current version of SAARAM uses a pipeline of specialized agents. An Intelligent Gateway first routes different file types, such as design mocks or code repositories, to specialized Data Extractors. The extractors then feed a Visualizer that creates diagrams to map out every possible user path. The agents are guided by careful prompt engineering: instructions are crafted to walk the model through specific logical phases rather than asking for a single output. Finally, the system uses knowledge distillation principles to synthesize all this information into a structured summary. This ensures the AI has a complete, clear picture of the product requirements before writing a single test case.

This solution is currently being scaled across Amazon’s international s
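
The routing stage described above, in which a gateway sends each input to a specialized extractor and the results are merged into one structured summary, can be illustrated with a minimal Python sketch. This is not SAARAM's code and does not use the Strands Agents SDK or Amazon Bedrock. Every name in it (`Extraction`, `EXTRACTORS`, `gateway`, `summarize`) is hypothetical, routing by file extension is an assumption made for illustration, and the extractors are placeholders standing in for model calls.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Callable, Dict, List


@dataclass
class Extraction:
    """Hypothetical container for what one extractor pulled from one input."""
    source: str
    kind: str
    notes: List[str]


# Placeholder extractors: a real system would call a model here.
def extract_design_mock(source: str) -> Extraction:
    return Extraction(source, "design_mock", [f"screens found in {source}"])


def extract_code(source: str) -> Extraction:
    return Extraction(source, "code", [f"business rules found in {source}"])


def extract_document(source: str) -> Extraction:
    return Extraction(source, "document", [f"requirements found in {source}"])


# Assumed routing table: file extension -> extractor (illustrative only).
EXTRACTORS: Dict[str, Callable[[str], Extraction]] = {
    ".png": extract_design_mock,
    ".fig": extract_design_mock,
    ".py": extract_code,
    ".java": extract_code,
    ".md": extract_document,
    ".pdf": extract_document,
}


def gateway(source: str) -> Extraction:
    """Route an input to an extractor based on its file extension."""
    suffix = PurePath(source).suffix.lower()
    # Unknown types fall back to the generic document extractor.
    extractor = EXTRACTORS.get(suffix, extract_document)
    return extractor(source)


def summarize(extractions: List[Extraction]) -> Dict[str, List[str]]:
    """Merge all extractor output into one summary grouped by input kind."""
    summary: Dict[str, List[str]] = {}
    for item in extractions:
        summary.setdefault(item.kind, []).extend(item.notes)
    return summary


inputs = ["checkout.png", "PaymentRules.java", "requirements.pdf"]
summary = summarize([gateway(s) for s in inputs])
```

Keeping the routing table separate from the extractors means a new input type can be supported by adding one entry, which is one plausible reason to prefer a gateway over a single all-purpose prompt.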