Bedrock Brief 20 Aug 2025

Welcome to another week of AI shenanigans, AWS enthusiasts! It seems the cloud giants are playing a game of "who can make the most controversial AI statement" lately. Amazon's cloud chief, Matt Garman, took the cake by declaring that replacing junior employees with AI is "one of the dumbest things" he's ever heard. Apparently, he's not keen on the idea of a future workforce consisting solely of AI and grumpy senior developers who haven't touched a keyboard in years.

Speaking of touching keyboards, AWS's new AI-powered coding tool, Kiro, is causing quite the stir. What was initially hailed as a wallet-friendly developer's dream has quickly turned into a "wallet-wrecking tragedy." It turns out that those nifty AI-assisted coding sessions might cost you more than your morning coffee habit. Who knew that asking an AI to fix your buggy code could be more expensive than therapy?

But fear not, aspiring tech wizards! According to Garman, the key to surviving the AI revolution isn't mastering the latest programming language or becoming an AI expert. No, the most valuable skill in this brave new world is... drumroll, please... critical thinking! So put down that machine learning textbook and pick up a Rubik's Cube. Your future career may depend on how well you can solve puzzles while simultaneously explaining your thought process to a confused AI assistant.

Fresh Cut

  • Amazon Bedrock users can now instantly access OpenAI's GPT-OSS models without manual activation, streamlining AI development for developers who want to experiment with open-weight language models. Read announcement →
  • AWS introduces R8i and R8i-flex EC2 instances with custom Intel Xeon 6 processors, offering up to 15% better price-performance and 2.5x more memory bandwidth than previous generations, ideal for memory-intensive workloads like databases and AI models. Read announcement →
  • Developers can now use TwelveLabs' Pegasus 1.2, a powerful video understanding AI model, in two additional AWS regions to build applications that generate text from video content with reduced latency. Read announcement →
  • Amazon QuickSight users can now create up to 2000 calculated fields per analysis and 500 per dataset, enabling more complex data transformations and insights for large datasets and diverse user needs. Read announcement →
  • Amazon Bedrock adds batch inference for Claude Sonnet 4 and OpenAI GPT-OSS models, enabling developers to process large datasets more efficiently at half the cost of on-demand inference. Read announcement →
  • Amazon Neptune now works with Cognee, allowing AI agents to store and access long-term memory in graph format, enabling more personalized and context-aware AI experiences for developers building generative AI applications. Read announcement →
  • SageMaker HyperPod allows admins to allocate GPU, Trainium, vCPU, and memory resources at a granular level, helping teams optimize compute usage for LLM tasks without wasting entire instances. Read announcement →
  • SageMaker HyperPod's new Topology Aware Scheduling feature optimizes large language model training by automatically placing tasks on the most efficient network layout, reducing communication delays and improving performance for data scientists. Read announcement →
  • Amazon Braket's new program sets feature lets quantum researchers submit batches of up to 100 quantum circuits and run them up to 24 times faster, significantly speeding up complex workloads like variational quantum algorithms and error mitigation techniques. Read announcement →
  • Amazon Q Business introduces AI agents to break down complex queries, improving accuracy and explanations for enterprise data searches, making it easier for developers to find and understand company information. Read announcement →
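
On the batch inference item: Bedrock batch jobs take a JSONL file where each line pairs a `recordId` with a `modelInput` body in the target model's native request format. Here's a minimal sketch of building that input for a Claude model; `build_batch_records` is a hypothetical helper, and you'd still upload the file to S3 and submit it via the Bedrock `create_model_invocation_job` API, so treat this as an illustration of the record shape rather than a full workflow.

```python
import json

def build_batch_records(prompts, max_tokens=256):
    """Build JSONL lines in the shape Bedrock batch inference expects:
    one record per line, each with a recordId and a modelInput body.
    The modelInput here uses the Anthropic Messages format that Claude
    models accept; other model families use their own body schema."""
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "recordId": f"REC{i:07d}",
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

# Two records ready to be written out as a .jsonl file and uploaded to S3.
jsonl = build_batch_records(
    ["Summarize our Q3 sales data.", "Draft a release note."]
)
print(len(jsonl.splitlines()))
```

The half-price economics come from trading latency for throughput: the job runs asynchronously against the whole file instead of one on-demand call per prompt.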

The Quarry

How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM

Amazon's Rufus AI shopping assistant got a major upgrade thanks to some clever engineering with AWS Trainium chips and vLLM. The team built a multi-node inference solution that uses a leader/follower setup to keep nodes marching in lockstep, hybrid parallelism tricks to split the model across them, and a custom abstraction layer on Amazon ECS, allowing Rufus to flex its large language model muscles without breaking a sweat, or the bank. Read blog →
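
The leader/follower arrangement the post describes can be caricatured in a few lines of plain Python. This is a rough sketch of the control flow only, not Amazon's implementation: the class names are hypothetical, and real followers would execute model shards on Trainium via vLLM rather than return a dict.

```python
# Toy sketch of a leader/follower inference topology: the leader receives
# requests and broadcasts each step to every follower, so all nodes holding
# a slice of the model stay in sync.

class Follower:
    """Stands in for one node holding a shard of the model."""
    def __init__(self, shard_id):
        self.shard_id = shard_id

    def run_step(self, token_batch):
        # Placeholder for running this node's model shard on the batch.
        return {"shard": self.shard_id, "tokens": len(token_batch)}

class Leader:
    """Receives requests, schedules batches, and broadcasts each
    inference step to every follower."""
    def __init__(self, followers):
        self.followers = followers

    def infer(self, token_batch):
        # Every follower sees the same batch; with tensor parallelism each
        # computes against its own weight slice, and the leader combines
        # the partial results.
        return [f.run_step(token_batch) for f in self.followers]

cluster = Leader([Follower(i) for i in range(4)])
out = cluster.infer(["hello", "world"])
print(len(out))  # one partial result per follower
```

The appeal of the pattern is that clients only ever talk to the leader, so the multi-node cluster presents itself as a single inference endpoint.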

Core Sample

AWS Founder Spotlight: Latent Labs

Latent Labs is revolutionizing protein design by applying generative AI to biological systems, viewing cells as "mini-computers" and biology as a computational system. Through the AWS Generative AI Accelerator Program, they've leveraged Amazon SageMaker HyperPod to train proprietary models and scale compute resources efficiently. This AI-powered approach allows Latent Labs to sidestep traditional lab experiments, potentially accelerating drug discovery and development in ways that could reshape the pharmaceutical industry. Watch video →
