Bedrock Brief 19 Aug 2025
Welcome back, AI enthusiasts and cloud tinkerers! This week in the wild world of AWS, we've got a mixed bag of treats that'll make you laugh, cry, and possibly contemplate a career change to goat farming. (Don't worry, I hear there's an AWS service for that too.)
First up, AWS CEO Matt Garman is dishing out some parental wisdom that doesn't involve "turn it off and on again." He's betting big on critical thinking as the superhero skill of the AI age. Forget coding bootcamps—apparently, the secret sauce is asking more questions and playing board games. Who knew your Risk addiction was actually professional development? The full piece covers how to future-proof your career without a CS degree. Read story →
In less heartwarming news, AWS's new AI coding tool Kiro is causing wallets to weep across the developer landscape. What started as a promising spec-driven IDE has turned into a "wallet-wrecking tragedy" faster than you can say "unexpected charges." With pricing that makes some devs contemplate selling a kidney, it's clear that AWS missed the memo on "make it rain" not being a pricing strategy. Meanwhile, Amazon Q sits in the corner, looking suspiciously affordable by comparison.
Amidst the chaos, Arm is playing chess while everyone else is playing checkers. They've snagged Amazon's AI chip whiz, Rami Sinno, in a move that screams "if you can't beat 'em, poach 'em." With plans to build their own chips, Arm is stepping up from backseat driver to taking the wheel. Will this shake up the chip world, or is it just another tech giant midlife crisis? Only time (and probably a few overheated prototypes) will tell.
Fresh Cut
- Amazon QuickSight increases calculated field limits to 2000 per analysis and 500 per dataset, enabling data analysts to create more complex insights from large datasets without hitting previous restrictions. Read announcement →
- Amazon Bedrock enables batch processing of large datasets using Anthropic's Claude and OpenAI's GPT models at half the cost of on-demand inference, allowing developers to efficiently analyze documents, generate content, and extract data at scale. Read announcement →
- Amazon Neptune's integration with Cognee enables AI agents to have long-term memory and reasoning capabilities, allowing developers to create more personalized and effective AI applications using graph databases. Read announcement →
- SageMaker HyperPod allows administrators to allocate GPU, Trainium, vCPU, and memory resources at a granular level, helping teams optimize resource usage and costs for large language model tasks. Read announcement →
- SageMaker HyperPod's new Topology Aware Scheduling optimizes LLM task performance by automatically placing workloads on compute instances with minimal network hops, reducing communication latency and improving training efficiency for distributed AI models. Read announcement →
- Quantum researchers can now run up to 100 quantum programs in a single task on Amazon Braket, speeding up complex workloads by up to 24 times and reducing overhead for variational quantum algorithms and error mitigation techniques. Read announcement →
- Amazon Q Business introduces AI agents that break down complex queries, retrieve data in parallel, and generate more accurate responses, helping developers find information and gain insights from enterprise data more effectively. Read announcement →
- Amazon SageMaker Studio enables admins to trace user actions and manage fine-grained data access permissions, improving security and accountability for machine learning workflows. Read announcement →
- Amazon EC2 G6 instances, powered by NVIDIA L4 GPUs, are now available in AWS GovCloud (US-East), offering up to 8 GPUs with 24 GB memory each for graphics-intensive and machine learning tasks. Read announcement →
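For the Bedrock batch-inference item above, here's a rough sketch of what using it looks like with boto3: you write your requests as JSONL records to S3, then kick off a batch job. The record schema shown (Anthropic's Messages format), the model ID, and the job name are illustrative assumptions, not taken from the announcement; check the Bedrock docs for your model's exact body format.

```python
import json

def build_batch_records(prompts, max_tokens=512):
    """Build JSONL lines for a Bedrock batch inference input file.

    Each record pairs a recordId with a model-specific request body
    (here, assuming Anthropic's Messages schema).
    """
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "recordId": f"rec-{i:06d}",
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

def submit_batch_job(input_s3_uri, output_s3_uri, role_arn):
    """Submit the batch job (runs at roughly half the on-demand price).

    Needs AWS credentials and boto3; defined here but not executed.
    Model ID and job name below are placeholders.
    """
    import boto3  # lazy import so the builder above runs without AWS deps
    bedrock = boto3.client("bedrock")
    return bedrock.create_model_invocation_job(
        jobName="doc-analysis-batch",
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        roleArn=role_arn,
        inputDataConfig={"s3InputDataConfig": {"s3Uri": input_s3_uri}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    )
```

Upload the JSONL to S3, point `inputDataConfig` at it, and Bedrock writes one output record per `recordId` to the output bucket when the job completes.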
The Quarry
How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
Amazon's Rufus AI shopping assistant got a serious power-up thanks to some clever engineering with AWS Trainium chips and vLLM. The team cooked up a multi-node inference solution that's like a well-oiled machine, using a leader/follower setup to keep things running smoothly across multiple nodes. What's really cool is how they mixed and matched parallelism strategies, creating a hybrid approach that squeezes every ounce of performance out of those Trainium chips while keeping the whole system as reliable as your grandma's secret recipe. Read blog →
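To make the leader/follower idea concrete, here's a toy sketch of the pattern: a leader shards each request batch across follower nodes, runs them in parallel, and merges results back in order. This is an illustrative stand-in, not Amazon's actual Trainium/vLLM implementation; the classes and sharding scheme are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

class Follower:
    """A toy inference node: stands in for a Trainium worker serving a shard."""
    def __init__(self, node_id):
        self.node_id = node_id

    def infer(self, prompts):
        # Placeholder for real model execution on this node.
        return [f"node{self.node_id}:{p}" for p in prompts]

class Leader:
    """Shards each batch across followers and merges outputs in request order."""
    def __init__(self, followers):
        self.followers = followers

    def infer(self, prompts):
        n = len(self.followers)
        # Round-robin sharding: prompt i goes to follower i % n.
        shards = [prompts[i::n] for i in range(n)]
        with ThreadPoolExecutor(max_workers=n) as pool:
            shard_outputs = list(pool.map(
                lambda pair: pair[0].infer(pair[1]),
                zip(self.followers, shards),
            ))
        # Re-interleave shard outputs back into the original order.
        merged = [None] * len(prompts)
        for i, out in enumerate(shard_outputs):
            merged[i::n] = out
        return merged
```

The real system layers model parallelism (splitting one model across chips) under this data-parallel fan-out, which is the "hybrid" part the blog post describes.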
More posts:
- Create a travel planning agentic workflow with Amazon Nova
- Introducing Amazon Bedrock AgentCore Gateway: Transforming enterprise AI agent tool development
- Build a scalable containerized web application on AWS using the MERN stack with Amazon Q Developer – Part 1
- Optimizing Salesforce’s model endpoints with Amazon SageMaker AI inference components
- Building a RAG chat-based assistant on Amazon EKS Auto Mode and NVIDIA NIMs
- Introducing Amazon Bedrock AgentCore Identity: Securing agentic AI at scale
- Scalable intelligent document processing using Amazon Bedrock Data Automation
- Whiteboard to cloud in minutes using Amazon Q, Amazon Bedrock Data Automation, and Model Context Protocol
- Bringing agentic Retrieval Augmented Generation to Amazon Q Business
- Empowering students with disabilities: University Startups’ generative AI solution for personalized student pathways
- Citations with Amazon Nova understanding models
- Securely launch and scale your agents and tools on Amazon Bedrock AgentCore Runtime
- PwC and AWS Build Responsible AI with Automated Reasoning on Amazon Bedrock
- How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
- Build an intelligent financial analysis agent with LangGraph and Strands Agents
- Amazon Bedrock AgentCore Memory: Building context-aware agents
- Build a conversational natural language interface for Amazon Athena queries using Amazon Nova
- Train and deploy AI models at trillion-parameter scale with Amazon SageMaker HyperPod support for P6e-GB200 UltraServers
- How Indegene’s AI-powered social intelligence for life sciences turns social media conversations into insights
- Unlocking enhanced legal document review with Lexbe and Amazon Bedrock
Core Sample
Smarter Tech Investing: How to Make Your CFO Your Strongest Ally
Chris Hennesey, a former IT CFO, spills the beans on turning your CFO from a budget gatekeeper into your secret weapon for tech initiatives. He argues that success isn't about snagging a bigger piggy bank, but about squeezing every drop of value from your existing resources and communicating that value in a way that makes your CFO's spreadsheets sing. Hennesey's approach emphasizes the importance of aligning technology investments with specific business outcomes, using metrics like Net Present Value (NPV) to quantify the long-term financial impact of initiatives like generative AI adoption. Watch video →
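If NPV isn't part of your daily vocabulary, here's the whole trick in a few lines: discount each year's cash flow back to today and sum. The figures below are a made-up gen-AI initiative, not from the talk.

```python
def npv(rate, cash_flows):
    """Net Present Value: sum of each period's cash flow discounted to today.

    cash_flows[0] is the upfront cost (usually negative); cash_flows[t]
    is the net benefit in year t.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical initiative: $500k upfront, $220k/yr benefit for 3 years,
# discounted at a 10% hurdle rate.
flows = [-500_000, 220_000, 220_000, 220_000]
print(round(npv(0.10, flows), 2))  # positive, so the project clears the hurdle
```

A positive NPV at the CFO's hurdle rate is exactly the kind of spreadsheet-singing evidence Hennesey is talking about.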
More videos:
- Cognigy Uses AWS to Power Scalable, Agentic AI for Enterprises
- AWS Founder Spotlight: Latent Labs
- Smarter Tech Investing: How to Make Your CFO Your Strongest Ally