Bedrock Brief 20 Aug 2025

Welcome to another week of AI shenanigans, AWS enthusiasts! It seems the cloud giants are playing a game of "who can make the most controversial AI statement" lately. Amazon's cloud chief, Matt Garman, took the cake by declaring that replacing junior employees with AI is "one of the dumbest things" he's ever heard. Apparently, he's not keen on the idea of a future workforce consisting solely of AI and grumpy senior developers who haven't touched a keyboard in years.
Speaking of touching keyboards, AWS's new AI-powered coding tool, Kiro, is causing quite the stir. What was initially hailed as a wallet-friendly developer's dream has quickly turned into a "wallet-wrecking tragedy." It turns out that those nifty AI-assisted coding sessions might cost you more than your morning coffee habit. Who knew that asking an AI to fix your buggy code could be more expensive than therapy?
But fear not, aspiring tech wizards! According to Garman, the key to surviving the AI revolution isn't mastering the latest programming language or becoming an AI expert. No, the most valuable skill in this brave new world is... drumroll, please... critical thinking! So put down that machine learning textbook and pick up a Rubik's Cube. Your future career may depend on how well you can solve puzzles while simultaneously explaining your thought process to a confused AI assistant.
Fresh Cut
- Amazon Bedrock users can now instantly access OpenAI's GPT-OSS models without manual activation, streamlining AI development for developers who want to experiment with open-weight language models (see the invoke sketch after this list). Read announcement →
- AWS introduces R8i and R8i-flex EC2 instances with custom Intel Xeon 6 processors, offering up to 15% better price-performance and 2.5x more memory bandwidth than previous generations, ideal for memory-intensive workloads like databases and AI models. Read announcement →
- Developers can now use TwelveLabs' Pegasus 1.2, a powerful video understanding AI model, in two additional AWS regions to build applications that generate text from video content with reduced latency. Read announcement →
- Amazon QuickSight users can now create up to 2000 calculated fields per analysis and 500 per dataset, enabling more complex data transformations and insights for large datasets and diverse user needs. Read announcement →
- Amazon Bedrock adds batch inference for Claude Sonnet 4 and OpenAI GPT-OSS models, enabling developers to process large datasets more efficiently at half the cost of on-demand inference (a batch-job sketch follows the list). Read announcement →
- Amazon Neptune now works with Cognee, allowing AI agents to store and access long-term memory in graph format, enabling more personalized and context-aware AI experiences for developers building generative AI applications. Read announcement →
- SageMaker HyperPod allows admins to allocate GPU, Trainium, vCPU, and memory resources at a granular level, helping teams optimize compute usage for LLM tasks without wasting entire instances. Read announcement →
- SageMaker HyperPod's new Topology Aware Scheduling feature optimizes large language model training by automatically placing tasks on the most efficient network layout, reducing communication delays and improving performance for data scientists. Read announcement →
- Amazon Braket's new program sets feature lets quantum researchers submit up to 100 quantum circuits in a single request, running complex workloads like variational quantum algorithms and error mitigation techniques up to 24x faster. Read announcement →
- Amazon Q Business introduces AI agents to break down complex queries, improving accuracy and explanations for enterprise data searches, making it easier for developers to find and understand company information. Read announcement →
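For the curious, here is roughly what calling one of those GPT-OSS models looks like. A minimal sketch using boto3's Converse API; the region and model ID are illustrative placeholders, so verify the exact identifier in the Bedrock model catalog for your region.

```python
# Minimal sketch: invoking an OpenAI GPT-OSS model on Amazon Bedrock through
# the Converse API. The model ID below is illustrative; check the Bedrock
# model catalog in your region for the real identifier.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

response = bedrock.converse(
    modelId="openai.gpt-oss-120b-1:0",  # illustrative ID, verify in the console
    messages=[
        {"role": "user", "content": [{"text": "Explain open-weight models in two sentences."}]},
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.5},
)

# The Converse API returns the assistant reply under output.message.content.
print(response["output"]["message"]["content"][0]["text"])
```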
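And here is the shape of the new batch option. Another minimal sketch: the bucket paths, IAM role ARN, and Claude Sonnet 4 model ID are placeholders you would swap for your own values.

```python
# Minimal sketch: submitting a Bedrock batch inference job over JSONL records
# staged in S3. All ARNs, bucket names, and the model ID are placeholders.
import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

job = bedrock.create_model_invocation_job(
    jobName="nightly-summaries-batch",
    modelId="anthropic.claude-sonnet-4-20250514-v1:0",  # placeholder, verify
    roleArn="arn:aws:iam::123456789012:role/BedrockBatchRole",  # placeholder
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://my-bucket/batch-input/records.jsonl"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://my-bucket/batch-output/"}
    },
)

# Results land in the output bucket once the job completes; poll with
# get_model_invocation_job(jobIdentifier=job["jobArn"]) to track status.
print(job["jobArn"])
```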
The Quarry
How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
Amazon's Rufus AI shopping assistant got a major upgrade thanks to some clever engineering with AWS Trainium chips and vLLM. The team built a multi-node inference stack around a leader/follower setup that keeps requests flowing smoothly across nodes, then layered in hybrid parallelism (splitting the model both within and across nodes) and a custom abstraction layer on Amazon ECS, letting Rufus flex its large language model muscles without breaking a sweat or the bank. Read blog →
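The post doesn't ship Rufus's actual code, but the hybrid-parallelism idea maps onto knobs vLLM exposes publicly: tensor parallelism shards each layer across accelerators within a node, while pipeline parallelism splits the layer stack across nodes. A minimal sketch; the model name and parallel degrees are illustrative, and the Trainium/ECS leader/follower orchestration the team built on top is not shown here.

```python
# Minimal sketch of vLLM hybrid parallelism (not Rufus's actual setup):
# tensor parallelism splits each layer across 8 accelerators per node,
# pipeline parallelism splits the layer stack across 2 nodes.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # illustrative model choice
    tensor_parallel_size=8,
    pipeline_parallel_size=2,
)

outputs = llm.generate(
    ["What accessories pair well with a new road bike?"],
    SamplingParams(max_tokens=128, temperature=0.7),
)
print(outputs[0].outputs[0].text)
```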
More posts:
- Simplify access control and auditing for Amazon SageMaker Studio using trusted identity propagation
- Benchmarking document information localization with Amazon Nova
- How Infosys built a generative AI solution to process oil and gas drilling data with Amazon Bedrock
- Streamline employee training with an intelligent chatbot powered by Amazon Q Business
- Create a travel planning agentic workflow with Amazon Nova
- Introducing Amazon Bedrock AgentCore Gateway: Transforming enterprise AI agent tool development
- Build a scalable containerized web application on AWS using the MERN stack with Amazon Q Developer – Part 1
- Optimizing Salesforce’s model endpoints with Amazon SageMaker AI inference components
- Building a RAG chat-based assistant on Amazon EKS Auto Mode and NVIDIA NIMs
- Introducing Amazon Bedrock AgentCore Identity: Securing agentic AI at scale
- Scalable intelligent document processing using Amazon Bedrock Data Automation
- Whiteboard to cloud in minutes using Amazon Q, Amazon Bedrock Data Automation, and Model Context Protocol
- Bringing agentic Retrieval Augmented Generation to Amazon Q Business
- Empowering students with disabilities: University Startups’ generative AI solution for personalized student pathways
- Citations with Amazon Nova understanding models
- Securely launch and scale your agents and tools on Amazon Bedrock AgentCore Runtime
- PwC and AWS Build Responsible AI with Automated Reasoning on Amazon Bedrock
- Build an intelligent financial analysis agent with LangGraph and Strands Agents
- Amazon Bedrock AgentCore Memory: Building context-aware agents
Core Sample
AWS Founder Spotlight: Latent Labs
Latent Labs is revolutionizing protein design by applying generative AI to biological systems, viewing cells as "mini-computers" and biology as a computational system. Through the AWS Generative AI Accelerator Program, they've leveraged Amazon SageMaker HyperPod to train proprietary models and scale compute resources efficiently. This AI-powered approach allows Latent Labs to sidestep traditional lab experiments, potentially accelerating drug discovery and development in ways that could reshape the pharmaceutical industry. Watch video →
More videos:
- Amazon Bedrock Cost Optimization
- A CTO’s POV: Speaking With Your CEO About Agentic AI
- Cognigy Uses AWS to Power Scalable, Agentic AI for Enterprises