Bedrock Brief 18 Mar 2026

Welcome to this week's Bedrock Brief, where AWS's AI ambitions are soaring higher than a drone carrying your next Prime delivery. Amazon CEO Andy Jassy is feeling bullish, projecting AWS could hit a whopping $600 billion in annual sales within a decade – double his previous estimate. Why the sudden optimism? Two words: artificial intelligence. Jassy's betting big, with plans to pour $200 billion into AI development and infrastructure this year alone. Wall Street might be skeptical, but Jassy insists they're not just throwing money at a hunch.

Speaking of AI bets, OpenAI is making moves in the government sector, inking a deal with AWS to sell its AI products for both classified and unclassified work. This partnership puts OpenAI in direct competition with Anthropic on AWS's turf, potentially opening doors to lucrative government contracts. It's a high-stakes game of AI chess, with national security and billion-dollar contracts on the line.

But before we get too carried away with AI's potential, let's remember it's not all sunshine and rainbows in the land of machine learning. Amazon recently had to clarify that a series of outages weren't caused by AI-written code (phew!), though one incident did involve an engineer following bad advice from an AI tool working off outdated info. It's a sobering reminder that even as companies race to integrate AI, the technology still has its "oops" moments. Maybe those human developers aren't so expendable after all?

Fresh Cut

  • Amazon Connect's AI-powered voice agents now offer new Spanish and UK English voices in the Europe (London) Region, enabling more natural, region-specific automated customer service. Read announcement →
  • Amazon Connect's voice AI agents can now handle customer service tasks in 13 additional languages, including Arabic and Welsh, expanding its ability to automate conversations for a more diverse global audience. Read announcement →
  • SageMaker Training Plans now lets you extend GPU capacity reservations for up to 182 days without interrupting your AI workloads, ensuring your long-running projects don't suddenly lose compute resources. Read announcement →
  • Amazon Bedrock AgentCore Runtime introduces a new API allowing direct shell command execution within running sessions, streamlining AI agent workflows by eliminating the need for custom container logic to handle deterministic operations alongside LLM-powered tasks. Read announcement →
  • Amazon Bedrock, a service for building generative AI applications with various foundation models, is now available in New Zealand, offering local developers easier access to powerful AI tools and models from companies like Anthropic and Amazon. Read announcement →
  • SageMaker HyperPod's new idle resource sharing feature allows teams to borrow unused compute capacity from shared clusters, maximizing utilization of expensive GPU instances and helping developers access more resources for their AI workloads. Read announcement →
  • AWS introduces AI-powered agents in Partner Central that help sales teams streamline co-selling by providing instant insights, automating data entry, and simplifying funding requests, accessible through the console or integrated with CRM systems. Read announcement →
  • Amazon Bedrock AgentCore Runtime adds AG-UI protocol support, enabling developers to create responsive, real-time AI agent experiences with features like streaming text chunks and tool result visualization, without worrying about authentication or scaling. Read announcement →
  • AI-assisted serverless development comes to Kiro with AWS SAM support, enabling developers to build, deploy, and test Lambda functions locally while enforcing best practices for security and observability. Read announcement →
  • Developers can now use AI assistants to manage AWS Landing Zone Accelerator deployments through natural language conversations, thanks to the open-source LZA MCP Server with 20 specialized tools for configuration, monitoring, and troubleshooting. Read announcement →

The Quarry

Introducing Disaggregated Inference on AWS powered by llm-d

Buckle up, ML enthusiasts—AWS just dropped a game-changer for large language model inference with its new disaggregated serving approach. By intelligently splitting massive models across multiple GPUs and orchestrating requests with surgical precision, this tech promises to squeeze every last drop of performance from your hardware while slashing costs. The secret sauce? A clever combo of expert parallelism and request batching that lets you serve more concurrent users without breaking a sweat (or your budget). Read blog →
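If you're wondering what "disaggregated" actually means here, the core idea is separating the two phases of LLM inference onto different worker pools: a compute-bound prefill pass over the prompt, and a memory-bound decode loop that reuses the resulting KV cache. Here's a minimal conceptual sketch of that split—all names and structures are illustrative, not the llm-d API:

```python
# Conceptual sketch of prefill/decode disaggregation (hypothetical names,
# NOT the llm-d API). Prefill processes the whole prompt once and builds a
# KV cache; decode reuses that cache to emit tokens one at a time. A
# disaggregated server runs the two phases on separate worker pools and
# hands the cache between them.

from dataclasses import dataclass, field

@dataclass
class KVCache:
    tokens: list = field(default_factory=list)

def prefill_worker(prompt: str) -> KVCache:
    """Compute-heavy pass over the full prompt; builds the KV cache."""
    return KVCache(tokens=prompt.split())

def decode_worker(cache: KVCache, max_new_tokens: int) -> list:
    """Memory-heavy autoregressive loop; appends one token per step."""
    out = []
    for i in range(max_new_tokens):
        tok = f"<tok{i}>"          # stand-in for real sampling
        cache.tokens.append(tok)   # cache grows with each decoded token
        out.append(tok)
    return out

# A router would batch prompts into the prefill pool, then ship each
# finished cache to a decode worker on different hardware:
cache = prefill_worker("explain disaggregated inference")
generated = decode_worker(cache, max_new_tokens=3)
print(generated)  # ['<tok0>', '<tok1>', '<tok2>']
```

The payoff is that each pool can be sized and batched independently: prefill workers chase throughput on big compute, while decode workers chase latency and memory bandwidth—which is also the logic behind the Trainium3/Cerebras pairing covered below.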

Core Sample

AWS and Cerebras are teaming up to build the fastest possible AI inference

AWS and Cerebras are teaming up to supercharge AI inference in the cloud, combining AWS Trainium3 servers with Cerebras CS-3 systems for a speed boost that'll make your head spin. They're using a clever trick called "inference disaggregation" to split the workload, letting each system do what it does best: Trainium3 handles the compute-heavy "prefill" stage, while the CS-3's wafer-scale design tackles the memory-hungry "decode" phase. This hardware tag-team promises to deliver the fastest inference speeds in Amazon Bedrock, potentially revolutionizing how we run large language models in the cloud. Watch video →
