Bedrock Brief 19 Aug 2025
Welcome back, AI enthusiasts and cloud tinkerers! This week in the wild world of AWS, we've got a mixed bag of treats that'll make you laugh, cry, and possibly contemplate a career change to goat farming. (Don't worry, I hear there's an AWS service for that too.)
First up, AWS CEO Matt Garman is dishing out some parental wisdom that doesn't involve "turn it off and on again." He's betting big on critical thinking as the superhero skill of the AI age. Forget coding bootcamps—apparently, the secret sauce is asking more questions and playing board games. Who knew your Risk addiction was actually professional development? The full piece covers how to future-proof your career without a CS degree. Read story →
In less heartwarming news, AWS's new AI coding tool Kiro is causing wallets to weep across the developer landscape. What started as a promising spec-driven IDE has turned into a "wallet-wrecking tragedy" faster than you can say "unexpected charges." With pricing that makes some devs contemplate selling a kidney, it's clear that AWS missed the memo on "make it rain" not being a pricing strategy. Meanwhile, Amazon Q sits in the corner, looking suspiciously affordable by comparison.
Amidst the chaos, Arm is playing chess while everyone else is playing checkers. They've snagged Amazon's AI chip whiz, Rami Sinno, in a move that screams "if you can't beat 'em, poach 'em." With plans to build their own chips, Arm is stepping up from backseat driver to taking the wheel. Will this shake up the chip world, or is it just another tech giant midlife crisis? Only time (and probably a few overheated prototypes) will tell.
Fresh Cut
- Amazon QuickSight increases calculated field limits to 2000 per analysis and 500 per dataset, enabling data analysts to create more complex insights from large datasets without hitting previous restrictions. Read announcement →
- Amazon Bedrock enables batch processing of large datasets using Anthropic's Claude and OpenAI's GPT models at half the cost of on-demand inference, allowing developers to efficiently analyze documents, generate content, and extract data at scale. Read announcement →
- Amazon Neptune's integration with Cognee enables AI agents to have long-term memory and reasoning capabilities, allowing developers to create more personalized and effective AI applications using graph databases. Read announcement →
- SageMaker HyperPod allows administrators to allocate GPU, Trainium, vCPU, and memory resources at a granular level, helping teams optimize resource usage and costs for large language model tasks. Read announcement →
- SageMaker HyperPod's new Topology Aware Scheduling optimizes LLM task performance by automatically placing workloads on compute instances with minimal network hops, reducing communication latency and improving training efficiency for distributed AI models. Read announcement →
- Quantum researchers can now run up to 100 quantum programs in a single task on Amazon Braket, speeding up complex workloads by up to 24 times and reducing overhead for variational quantum algorithms and error mitigation techniques. Read announcement →
- Amazon Q Business introduces AI agents that break down complex queries, retrieve data in parallel, and generate more accurate responses, helping developers find information and gain insights from enterprise data more effectively. Read announcement →
- Amazon SageMaker Studio enables admins to trace user actions and manage fine-grained data access permissions, improving security and accountability for machine learning workflows. Read announcement →
- Amazon EC2 G6 instances, powered by NVIDIA L4 GPUs, are now available in AWS GovCloud (US-East), offering up to 8 GPUs with 24 GB memory each for graphics-intensive and machine learning tasks. Read announcement →
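For the Bedrock batch-inference item above, here's a rough sketch of what using it looks like with boto3: you write your requests as JSONL records to S3, then kick off a batch job. The record schema shown (Anthropic's Messages format), the model ID, and the job name are illustrative assumptions, not taken from the announcement; check the Bedrock docs for your model's exact body format.

```python
import json

def build_batch_records(prompts, max_tokens=512):
    """Build JSONL lines for a Bedrock batch inference input file.

    Each record pairs a recordId with a model-specific request body
    (here, assuming Anthropic's Messages schema).
    """
    lines = []
    for i, prompt in enumerate(prompts):
        record = {
            "recordId": f"rec-{i:06d}",
            "modelInput": {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

def submit_batch_job(input_s3_uri, output_s3_uri, role_arn):
    """Submit the batch job (runs at roughly half the on-demand price).

    Needs AWS credentials and boto3; defined here but not executed.
    Model ID and job name below are placeholders.
    """
    import boto3  # lazy import so the builder above runs without AWS deps
    bedrock = boto3.client("bedrock")
    return bedrock.create_model_invocation_job(
        jobName="doc-analysis-batch",
        modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
        roleArn=role_arn,
        inputDataConfig={"s3InputDataConfig": {"s3Uri": input_s3_uri}},
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    )
```

Upload the JSONL to S3, point `inputDataConfig` at it, and Bedrock writes one output record per `recordId` to the output bucket when the job completes.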
The Quarry
How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
Amazon's Rufus AI shopping assistant got a serious power-up thanks to some clever engineering with AWS Trainium chips and vLLM. The team cooked up a multi-node inference solution that's like a well-oiled machine, using a leader/follower setup to keep things running smoothly across multiple nodes. What's really cool is how they mixed and matched parallelism strategies, creating a hybrid approach that squeezes every ounce of performance out of those Trainium chips while keeping the whole system as reliable as your grandma's secret recipe. Read blog →
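To make the leader/follower idea concrete, here's a toy sketch of the pattern: a leader shards each request batch across follower nodes, runs them in parallel, and merges results back in order. This is an illustrative stand-in, not Amazon's actual Trainium/vLLM implementation; the classes and sharding scheme are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

class Follower:
    """A toy inference node: stands in for a Trainium worker serving a shard."""
    def __init__(self, node_id):
        self.node_id = node_id

    def infer(self, prompts):
        # Placeholder for real model execution on this node.
        return [f"node{self.node_id}:{p}" for p in prompts]

class Leader:
    """Shards each batch across followers and merges outputs in request order."""
    def __init__(self, followers):
        self.followers = followers

    def infer(self, prompts):
        n = len(self.followers)
        # Round-robin sharding: prompt i goes to follower i % n.
        shards = [prompts[i::n] for i in range(n)]
        with ThreadPoolExecutor(max_workers=n) as pool:
            shard_outputs = list(pool.map(
                lambda pair: pair[0].infer(pair[1]),
                zip(self.followers, shards),
            ))
        # Re-interleave shard outputs back into the original order.
        merged = [None] * len(prompts)
        for i, out in enumerate(shard_outputs):
            merged[i::n] = out
        return merged
```

The real system layers model parallelism (splitting one model across chips) under this data-parallel fan-out, which is the "hybrid" part the blog post describes.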
More posts:
- Create a travel planning agentic workflow with Amazon Nova
- Introducing Amazon Bedrock AgentCore Gateway: Transforming enterprise AI agent tool development
- Build a scalable containerized web application on AWS using the MERN stack with Amazon Q Developer – Part 1
- Optimizing Salesforce’s model endpoints with Amazon SageMaker AI inference components
- Building a RAG chat-based assistant on Amazon EKS Auto Mode and NVIDIA NIMs
- Introducing Amazon Bedrock AgentCore Identity: Securing agentic AI at scale
- Scalable intelligent document processing using Amazon Bedrock Data Automation
- Whiteboard to cloud in minutes using Amazon Q, Amazon Bedrock Data Automation, and Model Context Protocol
- Bringing agentic Retrieval Augmented Generation to Amazon Q Business
- Empowering students with disabilities: University Startups’ generative AI solution for personalized student pathways
- Citations with Amazon Nova understanding models
- Securely launch and scale your agents and tools on Amazon Bedrock AgentCore Runtime
- PwC and AWS Build Responsible AI with Automated Reasoning on Amazon Bedrock
- How Amazon scaled Rufus by building multi-node inference using AWS Trainium chips and vLLM
- Build an intelligent financial analysis agent with LangGraph and Strands Agents
- Amazon Bedrock AgentCore Memory: Building context-aware agents
- Build a conversational natural language interface for Amazon Athena queries using Amazon Nova
- Train and deploy AI models at trillion-parameter scale with Amazon SageMaker HyperPod support for P6e-GB200 UltraServers
- How Indegene’s AI-powered social intelligence for life sciences turns social media conversations into insights
- Unlocking enhanced legal document review with Lexbe and Amazon Bedrock
Core Sample
Smarter Tech Investing: How to Make Your CFO Your Strongest Ally
Chris Hennesey, a former IT CFO, spills the beans on turning your CFO from a budget gatekeeper into your secret weapon for tech initiatives. He argues that success isn't about snagging a bigger piggy bank, but about squeezing every drop of value from your existing resources and communicating that value in a way that makes your CFO's spreadsheets sing. Hennesey's approach emphasizes the importance of aligning technology investments with specific business outcomes, using metrics like Net Present Value (NPV) to quantify the long-term financial impact of initiatives like generative AI adoption. Watch video →
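If NPV isn't part of your daily vocabulary, here's the whole trick in a few lines: discount each year's cash flow back to today and sum. The figures below are a made-up gen-AI initiative, not from the talk.

```python
def npv(rate, cash_flows):
    """Net Present Value: sum of each period's cash flow discounted to today.

    cash_flows[0] is the upfront cost (usually negative); cash_flows[t]
    is the net benefit in year t.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical initiative: $500k upfront, $220k/yr benefit for 3 years,
# discounted at a 10% hurdle rate.
flows = [-500_000, 220_000, 220_000, 220_000]
print(round(npv(0.10, flows), 2))  # positive, so the project clears the hurdle
```

A positive NPV at the CFO's hurdle rate is exactly the kind of spreadsheet-singing evidence Hennesey is talking about.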
More videos:
- Cognigy Uses AWS to Power Scalable, Agentic AI for Enterprises
- AWS Founder Spotlight: Latent Labs
- Smarter Tech Investing: How to Make Your CFO Your Strongest Ally