Stanford Megatron-LM compatibility#
2025-07-30
Stanford Megatron-LM is a large-scale language model training framework maintained by Stanford as a fork of NVIDIA's Megatron-LM. It is designed to train massive transformer-based language models efficiently through model and data parallelism.
ROCm support for Stanford Megatron-LM is hosted in the official ROCm/Stanford-Megatron-LM repository.
Because ROCm compatibility is maintained independently, this repository differs from the upstream stanford-futuredata/Megatron-LM repository.
Use the prebuilt Docker image with ROCm, PyTorch, and Megatron-LM preinstalled.
See the ROCm Stanford Megatron-LM installation guide to install and get started.
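For example, the container can be pulled and started as follows. The image name and tag are placeholders rather than a validated tag; take the exact tag from the Docker Hub listing described in the Docker image compatibility section below. The device, group, and IPC flags shown are the standard way to expose AMD GPUs to a container.

```bash
# Pull a prebuilt image; rocm/<image>:<tag> is a placeholder for the
# validated Stanford Megatron-LM image tag listed on Docker Hub.
docker pull rocm/<image>:<tag>

# Start an interactive container with the AMD GPUs exposed.
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host --shm-size 16G \
  rocm/<image>:<tag>
```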
Note
Stanford Megatron-LM is supported on ROCm 6.3.0.
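To confirm that the environment matches, a quick check like the following works inside the container (a minimal sketch; torch.version.hip is set only on ROCm builds of PyTorch):

```bash
# Print the installed ROCm release.
cat /opt/rocm/.info/version

# Confirm PyTorch is a ROCm/HIP build and can see the GPUs.
python3 -c "import torch; print(torch.version.hip, torch.cuda.is_available(), torch.cuda.device_count())"
```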
Supported Devices#
Officially Supported: AMD Instinct MI300X
Partially Supported (functionality or performance limitations): AMD Instinct MI250X, MI210
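To verify which Instinct GPUs a node exposes, query the device and architecture names reported by ROCm; for reference, MI300X reports the gfx942 architecture, while MI250X and MI210 report gfx90a:

```bash
# Show GPU product names as seen by the ROCm runtime.
rocm-smi --showproductname

# Show the GPU architecture identifiers (for example, gfx942 or gfx90a).
rocminfo | grep -E "Marketing Name|gfx"
```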
Supported models and features#
This section details the models and features supported by Stanford Megatron-LM on ROCm.
Models:
BERT
GPT
T5
ICT
Features (see the launch sketch after this list):
Distributed Pre-training
Activation Checkpointing and Recomputation
Distributed Optimizer
Mixture-of-Experts
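As a sketch of how these features are typically enabled, the fragment below uses flag names that follow upstream Megatron-LM conventions; they are assumptions here, so confirm the exact argument names against the scripts in the ROCm/Stanford-Megatron-LM repository (for example, with python pretrain_gpt.py --help). MoE-specific flags are covered in the Megablocks blog post linked in the next section.

```bash
# Hypothetical GPT pre-training fragment. Flag names follow upstream
# Megatron-LM conventions and may differ in the Stanford/ROCm fork:
#   --tensor-/--pipeline-model-parallel-size : model-parallel degrees
#   --checkpoint-activations                 : activation checkpointing and recomputation
#   --use-distributed-optimizer              : shard optimizer state across data-parallel ranks
# Model, data, and tokenizer arguments are omitted for brevity.
python pretrain_gpt.py \
  --tensor-model-parallel-size 2 \
  --pipeline-model-parallel-size 1 \
  --micro-batch-size 4 \
  --global-batch-size 64 \
  --checkpoint-activations \
  --use-distributed-optimizer
```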
Use cases and recommendations#
See the Efficient MoE training on AMD ROCm: How-to use Megablocks on AMD GPUs blog post to learn how to use the Stanford Megatron-LM framework on the ROCm platform to pre-process datasets and run pre-training on AMD GPUs. Coverage includes the following (a minimal multi-GPU launch sketch follows the list):
Single-GPU pre-training
Multi-GPU pre-training
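For multi-GPU pre-training, Megatron-LM scripts are normally started through a PyTorch distributed launcher, one process per GPU. A minimal single-node sketch follows; the launcher choice and GPU count are assumptions for illustration:

```bash
# Launch one training process per GPU on a single 8-GPU node. torchrun is
# the current PyTorch launcher; older Stanford Megatron-LM example scripts
# use the equivalent python -m torch.distributed.launch.
torchrun --nproc_per_node=8 pretrain_gpt.py \
  --tensor-model-parallel-size 2
# (remaining model and data arguments as in the sketch in the previous section)
```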
Docker image compatibility#
AMD validates and publishes Stanford Megatron-LM images with ROCm and PyTorch backends on Docker Hub. The following Docker image tags and associated inventories represent the latest Stanford Megatron-LM version from the official Docker Hub listing. These Docker images have been validated for ROCm 6.3.0. Click to view the image on Docker Hub.