AWS Announces General Availability of Amazon EC2 G7 Instances, Ushering in a New Era of High-Performance GPU Acceleration

Amazon Web Services (AWS) today announced the general availability of Amazon Elastic Compute Cloud (Amazon EC2) G7 instances, marking a significant advancement in high-performance GPU acceleration for a broad spectrum of demanding workloads. These new instances are specifically engineered to deliver superior performance for artificial intelligence (AI) inference, sophisticated graphics rendering, and intensive data analytics, positioning AWS at the forefront of cloud-based accelerated computing.

The G7 instances are the first offering from a major cloud provider to integrate NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs. This collaboration with NVIDIA, combined with custom sixth-generation Intel Xeon Scalable processors, underpins the instances' performance gains. AWS reports up to 4.6 times the AI inference performance and up to 2.1 times the graphics performance of the previous-generation G6 instances. These gains are aimed at businesses and researchers running computationally intensive tasks.

Beyond AI inference and graphics, G7 instances also accelerate GPU-driven analytics workloads running on Amazon EMR (Elastic MapReduce) and Amazon Elastic Kubernetes Service (Amazon EKS). This support extends the utility of G7 instances across a diverse range of applications, including video transcoding and analytics, spatial computing, virtual desktop infrastructure (VDI), and scientific simulations. The added processing power and memory bandwidth matter most for applications that demand real-time data processing and high-fidelity visual output.

Key Features and Technical Specifications of G7 Instances

The architecture of the G7 instances is meticulously designed to maximize the potential of the NVIDIA Blackwell Server Edition GPUs. Each G7 instance can feature up to 8 NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, collectively offering up to 256 GB of total GPU memory, with each individual GPU equipped with 32 GB of dedicated memory. This substantial memory capacity is crucial for handling large AI models, complex graphical scenes, and extensive datasets that characterize modern high-performance computing.
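
As a rough illustration of why per-GPU memory matters, the sketch below estimates the memory needed for model weights alone and the number of 32 GB GPUs that implies. The model sizes, the 2-bytes-per-parameter figure, and the function names are illustrative assumptions, not AWS or NVIDIA sizing guidance. Real deployments also need room for the KV cache, activations, and framework overhead.

```python
import math

GPU_MEMORY_GB = 32  # per-GPU memory stated for G7 instances
MAX_GPUS = 8        # largest GPU count stated for a single G7 instance


def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights only, in decimal GB."""
    # (params_billions * 1e9 parameters) * bytes_per_param / 1e9 bytes per GB
    return params_billions * bytes_per_param


def min_gpus_for_weights(params_billions: float, bytes_per_param: float) -> int:
    """Smallest GPU count whose combined memory holds the weights."""
    return math.ceil(weights_gb(params_billions, bytes_per_param) / GPU_MEMORY_GB)


# Hypothetical models: 7B and 70B parameters at 2 bytes per parameter (FP16/BF16).
for size_b in (7, 70):
    gpus = min_gpus_for_weights(size_b, 2)
    verdict = "fits" if gpus <= MAX_GPUS else "does not fit"
    print(f"{size_b}B params: {weights_gb(size_b, 2):.0f} GB of weights, "
          f"at least {gpus} GPU(s), {verdict} on one instance")
```

Under these assumptions a 7B-parameter model fits on a single GPU, while a 70B-parameter model needs at least five GPUs for weights alone and so would have to be split across the GPUs of a larger instance.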

The integration of custom Intel Xeon Scalable processors ensures a balanced and robust computing environment, optimizing the interplay between CPU and GPU resources. This synergy is vital for applications where both general-purpose processing and specialized acceleration are required.

G7 instances are available in seven distinct sizes, offering a flexible range of configurations to meet varied workload demands. These configurations support up to 192 vCPUs, providing ample parallel processing capability for multi-threaded applications. System memory options extend up to 768 GiB, facilitating the handling of large in-memory datasets. Furthermore, the instances boast impressive I/O capabilities, with up to 700 Gbps of network bandwidth and up to 7.6 TB of local NVMe SSD storage. The high network bandwidth is particularly beneficial for data-intensive workloads, enabling rapid data transfer to and from storage, as well as between nodes in distributed computing environments. The local NVMe SSD storage ensures low-latency access to frequently used data, further boosting overall application performance.
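
To put the bandwidth figures in perspective, the following sketch computes an idealized lower bound on the time needed to move a dataset over the network. It assumes the link is fully saturated and ignores protocol overhead, storage throughput, and contention, so real transfers will take longer. The function name and the chosen dataset size are illustrative.

```python
def ideal_transfer_seconds(data_gb: float, link_gbps: float) -> float:
    """Lower bound on transfer time for decimal gigabytes over a gigabit/s link."""
    return data_gb * 8 / link_gbps  # 8 bits per byte


# 7,600 GB matches the stated maximum local NVMe capacity (7.6 TB).
for gbps in (100, 350, 700):
    seconds = ideal_transfer_seconds(7600, gbps)
    print(f"{gbps} Gbps: about {seconds:.1f} s")
```

At the stated 700 Gbps maximum, filling the full 7.6 TB of local storage would take roughly a minute and a half in this best case, compared with about ten minutes at 100 Gbps.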

The detailed specifications across the various G7 instance sizes illustrate this scalability:

g7.2xlarge: 1 GPU, 32 GB GPU memory, 8 vCPUs, 32 GiB system memory, 1 x 600 GB storage, up to 8 Gbps EBS bandwidth, up to 60 Gbps network bandwidth.
g7.4xlarge: 1 GPU, 32 GB GPU memory, 16 vCPUs, 64 GiB system memory, 1 x 600 GB storage, 8 Gbps EBS bandwidth, up to 100 Gbps network bandwidth.
g7.8xlarge: 1 GPU, 32 GB GPU memory, 32 vCPUs, 128 GiB system memory, 1 x 950 GB storage, 16 Gbps EBS bandwidth, up to 100 Gbps network bandwidth.
g7.12xlarge: 2 GPUs, 64 GB GPU memory, 48 vCPUs, 192 GiB system memory, 1 x 1900 GB storage, 20 Gbps EBS bandwidth, 175 Gbps network bandwidth.
g7.24xlarge: 4 GPUs, 128 GB GPU memory, 96 vCPUs, 384 GiB system memory, 1 x 3800 GB storage, 40 Gbps EBS bandwidth, 350 Gbps network bandwidth.
g7.48xlarge: 8 GPUs, 256 GB GPU memory, 192 vCPUs, 768 GiB system memory, 2 x 3800 GB storage, 80 Gbps EBS bandwidth, 700 Gbps network bandwidth.
g7.metal: (Coming soon) 8 GPUs, 256 GB GPU memory, 192 vCPUs, 768 GiB system memory, 2 x 3800 GB storage, 80 Gbps EBS bandwidth, 700 Gbps network bandwidth. The g7.metal instance type provides direct access to the underlying server hardware, offering maximum control and performance for specialized applications that benefit from bare-metal access.
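
The specifications above can be captured in a small lookup table for capacity planning. The sketch below is a hypothetical helper, not an AWS API: the `G7Size` type and `smallest_size` function are names introduced here for illustration, and the values are copied from the list above. The g7.metal size is left out because it is listed as coming soon.

```python
from typing import NamedTuple, Optional


class G7Size(NamedTuple):
    name: str
    gpus: int
    gpu_memory_gb: int
    vcpus: int
    memory_gib: int


# Ordered from smallest to largest, values taken from the list above.
G7_SIZES = [
    G7Size("g7.2xlarge", 1, 32, 8, 32),
    G7Size("g7.4xlarge", 1, 32, 16, 64),
    G7Size("g7.8xlarge", 1, 32, 32, 128),
    G7Size("g7.12xlarge", 2, 64, 48, 192),
    G7Size("g7.24xlarge", 4, 128, 96, 384),
    G7Size("g7.48xlarge", 8, 256, 192, 768),
]


def smallest_size(min_gpu_memory_gb: int = 0,
                  min_vcpus: int = 0,
                  min_memory_gib: int = 0) -> Optional[G7Size]:
    """Return the first (smallest) size meeting all minimums, or None."""
    for size in G7_SIZES:
        if (size.gpu_memory_gb >= min_gpu_memory_gb
                and size.vcpus >= min_vcpus
                and size.memory_gib >= min_memory_gib):
            return size
    return None


# Example: a workload needing 64 GB of GPU memory and at least 32 vCPUs.
choice = smallest_size(min_gpu_memory_gb=64, min_vcpus=32)
print(choice.name if choice else "no single G7 size fits")
```

A selector like this only compares headline capacities. Price, EBS and network bandwidth, and regional availability would also factor into a real sizing decision.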

Advanced Connectivity for Distributed Workloads

For multi-GPU and multi-node workloads, G7 instances incorporate advanced connectivity features designed to minimize latency and maximize throughput. These include NVIDIA GPUDirect P2P for efficient GPU-to-GPU communication within a single instance, and NVIDIA GPUDirect RDMA with Elastic Fabric Adapter (EFA) for high-speed, low-latency communication across multiple instances. Furthermore, GPUDirect RDMA with EFA is supported for Amazon FSx for Lustre, enabling direct data transfer between GPUs and high-performance file systems, bypassing the CPU and system memory to significantly accelerate data-intensive applications. This capability is particularly critical for large-scale AI training, scientific simulations, and big data analytics where data movement can often be a bottleneck.

Strategic Partnerships and Industry Context

The introduction of G7 instances underscores the deepening partnership between AWS and NVIDIA, a collaboration that has consistently pushed the boundaries of cloud-based GPU computing. NVIDIA’s Blackwell architecture, known for its significant advancements in AI and high-performance computing, forms the backbone of these new instances. The RTX PRO 4500 Blackwell Server Edition GPUs are tailored for professional applications, providing the reliability, performance, and features demanded by enterprise-grade workloads.

This launch also highlights AWS’s commitment to offering a diverse portfolio of compute options. The market for accelerated computing has seen explosive growth, driven primarily by the rapid advancements in AI and machine learning, alongside the increasing complexity of graphics and data analytics. Enterprises across various sectors are seeking more powerful, scalable, and cost-effective solutions to train larger models, perform real-time inference, render high-fidelity content, and analyze massive datasets. The G7 instances directly address these evolving needs, reinforcing AWS’s position as a leading provider of cloud infrastructure for these cutting-edge applications.

An AWS spokesperson, commenting on the launch, stated, "The introduction of G7 instances underscores AWS’s unwavering commitment to providing our customers with cutting-edge infrastructure that empowers innovation across AI, graphics, and data-intensive applications. By being the first major cloud provider to integrate NVIDIA’s RTX PRO 4500 Blackwell Server Edition GPUs, we are delivering unparalleled performance and efficiency, enabling our customers to achieve breakthroughs in fields ranging from scientific research to entertainment."

Similarly, an NVIDIA representative remarked, "The integration of NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on AWS EC2 G7 instances marks a significant milestone in bringing the power of Blackwell to a wider array of cloud users. This collaboration with AWS ensures that developers and enterprises have immediate access to the high-performance computing capabilities required to accelerate their most challenging AI and graphics workloads, further fueling innovation in the cloud."

Implications for Diverse Industries and Workloads

The enhanced capabilities of G7 instances have far-reaching implications across numerous industries:

Artificial Intelligence and Machine Learning: The substantial increase in AI inference performance is critical for deploying large language models (LLMs), computer vision applications, natural language processing (NLP), and recommendation engines in production environments. Lower inference latency and higher throughput mean faster response times for user-facing AI services and more efficient processing of batch AI tasks.
Graphics and Media & Entertainment: For graphics rendering, video transcoding, and visual effects, the reported graphics performance gain of up to 2.1x over G6 translates into faster content creation workflows. This benefits animation studios, game developers, and media companies requiring rapid processing of high-resolution video and complex 3D scenes.
Virtual Desktop Infrastructure (VDI): G7 instances provide a robust foundation for high-performance VDI, enabling engineers, designers, and creative professionals to run demanding applications like CAD, CAM, and professional visualization tools from virtually anywhere with a seamless user experience.
Spatial Computing and Digital Twins: The ability to process complex 3D data and render sophisticated environments in real-time makes G7 instances ideal for spatial computing applications, including augmented reality (AR), virtual reality (VR), and the development of digital twins for industrial and urban planning.
Data Analytics and Scientific Computing: When integrated with Amazon EMR and EKS, G7 instances accelerate GPU-enabled data analytics, allowing for faster processing of large datasets for insights, machine learning feature engineering, and complex simulations in fields like genomics, physics, and financial modeling.
Autonomous Systems: The real-time processing capabilities are invaluable for autonomous vehicles and robotics, enabling faster sensor data fusion, path planning, and decision-making.

Ease of Deployment and Ecosystem Integration

AWS has ensured that getting started with G7 instances is straightforward for developers and enterprises. Customers can leverage the AWS Deep Learning AMIs (DLAMI) or NVIDIA Workstation AMIs, which come pre-packaged with the necessary GPU drivers and frameworks for AI inference and graphics workloads. For those utilizing Amazon EKS, AWS provides automation scripts to build EKS AMIs with NVIDIA driver version R595, ensuring seamless integration into Kubernetes environments.

G7 instances support a wide array of operating systems, including Amazon Linux, Ubuntu, Red Hat Enterprise Linux (RHEL), and Windows Server. This broad compatibility, coupled with comprehensive NVIDIA driver integration, ensures support for industry-standard graphics libraries such as DirectX, Vulkan, and OpenGL, making it easier for developers to migrate existing applications or build new ones.

Availability and Flexible Purchasing Options

Amazon EC2 G7 instances are now generally available in two AWS Regions: US East (Ohio) and US West (Oregon). AWS plans further regional expansion, and customers can monitor these plans via the CloudFormation resources tab on the AWS Capabilities by Region page.

AWS offers multiple purchasing options for G7 instances, providing flexibility and cost optimization strategies for various use cases:

On-Demand: Ideal for short-term, irregular workloads where flexibility is paramount.
Savings Plans: Offer significant discounts in exchange for a commitment to a consistent amount of compute usage (measured in $/hour) over a 1-year or 3-year term. This is suitable for workloads with predictable usage.
Spot Instances: Provide access to unused EC2 capacity at steep discounts, perfect for fault-tolerant and flexible workloads that can tolerate interruptions.
Dedicated Instances: For the g7.12xlarge, g7.24xlarge, and g7.48xlarge sizes, customers can opt for Dedicated Instances, which run on hardware dedicated to a single customer, addressing specific compliance or licensing requirements.

Detailed pricing information for all purchasing options is available on the Amazon EC2 Pricing page.

Looking Ahead: AWS’s Commitment to Accelerated Computing

The release of G7 instances underscores AWS’s continuous investment in advanced computing infrastructure. As AI and other data-intensive applications become increasingly central to business operations and scientific discovery, the demand for powerful and scalable GPU solutions will only grow. By consistently innovating and partnering with industry leaders like NVIDIA and Intel, AWS aims to provide the foundational infrastructure that empowers its customers to push the boundaries of what is possible in the cloud. This latest offering solidifies AWS’s position in the competitive cloud market, ensuring that developers and enterprises have access to the cutting-edge tools needed to accelerate their most ambitious projects. The G7 instances are more than just a performance upgrade; they represent a strategic step forward in making high-performance accelerated computing accessible and scalable for a global customer base.