# ROCm Documentation has moved to docs.amd.com
For the latest HIP Programming Guide documentation and environment variables, refer to the PDF version of the HIP Programming Guide v4.5 at:
https://github.com/RadeonOpenCompute/ROCm/blob/master/AMD_HIP_Programming_Guide.pdf
System Level Debug
ROCm Language & System Level Debug, Flags, and Environment Variables
ROCr Error Codes
2 Invalid Dimension
4 Invalid Group Memory
8 Invalid (or Null) Code
32 Invalid Format
64 Group is too large
128 Out of VGPRs
0x80000000 Debug Trap
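Every value in the list above is a distinct power of two, which suggests the codes can be combined as bit flags. The source does not say so explicitly, so treat that as an assumption. Under that assumption, a minimal sketch of a decoder could look like the following; `decode_rocr_error` is a hypothetical helper name, not part of ROCm.

```shell
# decode_rocr_error CODE
# Hypothetical helper, not shipped with ROCm.
# Assumption: the codes listed above are bit flags that may be OR-ed together.
# CODE may be decimal or 0x-prefixed hex. Bits not in the table are ignored.
decode_rocr_error() {
    local code=$(( $1 ))
    local found=0
    local entry bit name
    for entry in \
        "2:Invalid Dimension" \
        "4:Invalid Group Memory" \
        "8:Invalid (or Null) Code" \
        "32:Invalid Format" \
        "64:Group is too large" \
        "128:Out of VGPRs" \
        "0x80000000:Debug Trap"
    do
        bit=${entry%%:*}    # text before the first colon
        name=${entry#*:}    # text after the first colon
        if [ $(( $code & $bit )) -ne 0 ]; then
            echo "$name"
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "Unknown code: $1"
    fi
    return 0
}
```

For example, `decode_rocr_error 6` prints one line for each of the two set bits (2 and 4).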
Commands to dump the firmware version and get the Linux kernel version
sudo cat /sys/kernel/debug/dri/1/amdgpu_firmware_info
uname -a
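The index in `dri/1` depends on how many DRM devices the system exposes, so the firmware file may sit under a different number. A minimal sketch that checks every index instead of hard-coding one is shown here. It assumes debugfs is mounted at `/sys/kernel/debug` and that the shell has root privileges; `dump_amdgpu_firmware` and its optional directory argument are hypothetical.

```shell
# dump_amdgpu_firmware [DRI_DIR]
# Hypothetical helper. DRI_DIR defaults to the debugfs path used above.
dump_amdgpu_firmware() {
    local dri_dir=${1:-/sys/kernel/debug/dri}
    local found=0
    local f
    for f in "$dri_dir"/*/amdgpu_firmware_info; do
        # Skips unreadable files and the literal pattern when nothing matches.
        [ -r "$f" ] || continue
        echo "== $f =="
        cat "$f"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no readable amdgpu_firmware_info under $dri_dir (run as root?)" >&2
        return 1
    fi
    return 0
}
```

Call `dump_amdgpu_firmware` with no argument from a root shell to print the firmware table of every device that has one.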
Debug Flags
These debug messages are useful when developing or debugging the base ROCm driver. You can enable printing from libhsakmt.so by setting the environment variable HSAKMT_DEBUG_LEVEL. Available debug levels are 3 to 7; the higher the level, the more messages are printed.
export HSAKMT_DEBUG_LEVEL=3 : only pr_err() will print.
export HSAKMT_DEBUG_LEVEL=4 : pr_err() and pr_warn() will print.
export HSAKMT_DEBUG_LEVEL=5 : “notice” is not currently implemented, so setting 5 is the same as setting 4.
export HSAKMT_DEBUG_LEVEL=6 : pr_err(), pr_warn(), and pr_info() will print.
export HSAKMT_DEBUG_LEVEL=7 : Everything, including pr_debug(), will print.
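Using `export` changes the level for the whole shell session. To limit it to a single run, the variable can be set for one command only. The sketch below does that and rejects levels outside the documented 3 to 7 range; `run_with_hsakmt_debug` is a hypothetical wrapper name.

```shell
# run_with_hsakmt_debug LEVEL COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND with HSAKMT_DEBUG_LEVEL set for that
# process only, after checking LEVEL against the documented range.
run_with_hsakmt_debug() {
    if [ $# -lt 2 ]; then
        echo "usage: run_with_hsakmt_debug LEVEL COMMAND [ARGS...]" >&2
        return 2
    fi
    local level=$1
    shift
    case $level in
        3|4|5|6|7) ;;
        *)
            echo "HSAKMT_DEBUG_LEVEL must be 3..7, got '$level'" >&2
            return 2
            ;;
    esac
    env HSAKMT_DEBUG_LEVEL="$level" "$@"
}
```

A typical call would be `run_with_hsakmt_debug 7 ./my_app`, where `./my_app` stands in for any ROCm application.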
ROCr-level environment variables for debugging
HSA_ENABLE_SDMA=0
HSA_ENABLE_INTERRUPT=0
HSA_SVM_GUARD_PAGES=0
HSA_DISABLE_CACHE=1
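The four settings above can be applied to a single command rather than exported into the session. The following is a sketch under that assumption; `run_with_rocr_debug_env` is a hypothetical name, and the values are copied from the list above without any claim about which combination a given problem needs.

```shell
# run_with_rocr_debug_env COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND with the four ROCr variables listed
# above set for that process only.
run_with_rocr_debug_env() {
    if [ $# -lt 1 ]; then
        echo "usage: run_with_rocr_debug_env COMMAND [ARGS...]" >&2
        return 2
    fi
    env HSA_ENABLE_SDMA=0 \
        HSA_ENABLE_INTERRUPT=0 \
        HSA_SVM_GUARD_PAGES=0 \
        HSA_DISABLE_CACHE=1 \
        "$@"
}
```

Removing one line at a time from the `env` call is a simple way to find which setting changes the behavior being investigated.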
Turn Off Page Retry on GFX9/Vega devices
sudo -s
echo 1 > /sys/module/amdkfd/parameters/noretry
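To confirm the setting before or after writing it, the parameter file can be read back. The sketch below assumes the same sysfs path as the command above; on a kernel where that path does not exist it reports the problem instead of guessing another location. `show_kfd_noretry` and its optional path argument are hypothetical.

```shell
# show_kfd_noretry [PARAM_FILE]
# Hypothetical helper. PARAM_FILE defaults to the sysfs path used above.
show_kfd_noretry() {
    local param=${1:-/sys/module/amdkfd/parameters/noretry}
    if [ -r "$param" ]; then
        echo "noretry=$(cat "$param")"
    else
        echo "cannot read $param (is the amdkfd module loaded?)" >&2
        return 1
    fi
}
```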
HIP Environment Variables 3.x
OpenCL Debug Flags
AMD_OCL_WAIT_COMMAND=1 (0 = Off, 1 = On)
PCIe-Debug
Refer here for ROCm PCIe Debug
More information here on how to debug and profile HIP applications