Occupancy

hipError_t hipModuleOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit)

Determines the grid and block sizes that achieve maximum potential occupancy for a kernel.

Note: HIP does not support kernel launches in which the total number of work items in any dimension, gridDim x blockDim, is 2^32 or greater.

Parameters:
  • gridSize[out] minimum grid size for maximum potential occupancy

  • blockSize[out] block size for maximum potential occupancy

  • f[in] kernel function for which occupancy is calculated

  • dynSharedMemPerBlk[in] dynamic shared memory usage (in bytes) intended for each block

  • blockSizeLimit[in] the maximum block size for the kernel, use 0 for no limit

Returns:

hipSuccess, hipErrorInvalidValue
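
The 2^32 limit noted above can be checked on the host before a launch. The following is a minimal sketch in plain C++ and is not part of the HIP API: `gridSizeFor` and `fitsLaunchLimit` are hypothetical helper names, and `blockSize` stands for whatever value the occupancy query returned. It assumes a one-dimensional launch in which the grid is sized to cover every work item, which is usually computed separately because the returned `gridSize` is only the minimum grid needed for maximum potential occupancy.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of blocks needed to cover n work items
// with the given block size (ceiling division).
inline std::uint64_t gridSizeFor(std::uint64_t n, std::uint32_t blockSize) {
    assert(blockSize > 0);
    return (n + blockSize - 1) / blockSize;
}

// Hypothetical helper: true if gridDim x blockDim stays below 2^32
// in a single dimension, as the note above requires.
inline bool fitsLaunchLimit(std::uint64_t gridDim, std::uint32_t blockDim) {
    constexpr std::uint64_t kLimit = std::uint64_t{1} << 32;
    // Checking gridDim first keeps the 64-bit product from overflowing.
    return gridDim < kLimit && gridDim * blockDim < kLimit;
}
```

If `fitsLaunchLimit` returns false, one option is to have each thread process several elements (for example with a grid-stride loop) so that a smaller grid suffices.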

hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags(int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags)

Determines the grid and block sizes that achieve maximum potential occupancy for a kernel.

Note: HIP does not support kernel launches in which the total number of work items in any dimension, gridDim x blockDim, is 2^32 or greater.

Parameters:
  • gridSize[out] minimum grid size for maximum potential occupancy

  • blockSize[out] block size for maximum potential occupancy

  • f[in] kernel function for which occupancy is calculated

  • dynSharedMemPerBlk[in] dynamic shared memory usage (in bytes) intended for each block

  • blockSizeLimit[in] the maximum block size for the kernel, use 0 for no limit

  • flags[in] Extra flags for occupancy calculation (only default supported)

Returns:

hipSuccess, hipErrorInvalidValue

hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk)

Returns occupancy for a device function.

Parameters:
  • numBlocks[out] Returned occupancy

  • f[in] Kernel function (hipFunction_t) for which occupancy is calculated

  • blockSize[in] Block size the kernel is intended to be launched with

  • dynSharedMemPerBlk[in] Dynamic shared memory usage (in bytes) intended for each block

Returns:

hipSuccess, hipErrorInvalidValue
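
The returned `numBlocks` is a count of blocks per multiprocessor, not a percentage. One common way to turn it into a fraction is to compare active warps with the maximum number of resident warps. The sketch below is plain C++ under stated assumptions: `occupancyFraction` is a hypothetical helper, and `warpSize` and `maxThreadsPerMultiProcessor` are assumed to be taken from the device properties of the current device.

```cpp
#include <cassert>

// Hypothetical helper: fraction of a multiprocessor's warp slots that are
// occupied when numBlocks blocks of blockSize threads are resident.
inline double occupancyFraction(int numBlocks, int blockSize,
                                int warpSize, int maxThreadsPerMultiProcessor) {
    assert(warpSize > 0 && maxThreadsPerMultiProcessor >= warpSize);
    // A partially filled warp still occupies a whole warp slot.
    const int warpsPerBlock = (blockSize + warpSize - 1) / warpSize;
    const int activeWarps = numBlocks * warpsPerBlock;
    const int maxWarps = maxThreadsPerMultiProcessor / warpSize;
    return static_cast<double>(activeWarps) / maxWarps;
}
```

For example, 4 resident blocks of 256 threads with a warp size of 64 and 2048 threads per multiprocessor give 16 active warps out of 32, a fraction of 0.5.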

hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)

Returns occupancy for a device function.

Parameters:
  • numBlocks[out] Returned occupancy

  • f[in] Kernel function (hipFunction_t) for which occupancy is calculated

  • blockSize[in] Block size the kernel is intended to be launched with

  • dynSharedMemPerBlk[in] Dynamic shared memory usage (in bytes) intended for each block

  • flags[in] Extra flags for occupancy calculation (only default supported)

Returns:

hipSuccess, hipErrorInvalidValue

hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk)

Returns occupancy for a device function.

Parameters:
  • numBlocks[out] Returned occupancy

  • f[in] Kernel function for which occupancy is calculated

  • blockSize[in] Block size the kernel is intended to be launched with

  • dynSharedMemPerBlk[in] Dynamic shared memory usage (in bytes) intended for each block

Returns:

hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue

hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)

Returns occupancy for a device function.

Parameters:
  • numBlocks[out] Returned occupancy

  • f[in] Kernel function for which occupancy is calculated

  • blockSize[in] Block size the kernel is intended to be launched with

  • dynSharedMemPerBlk[in] Dynamic shared memory usage (in bytes) intended for each block

  • flags[in] Extra flags for occupancy calculation (currently ignored)

Returns:

hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue

hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit)

Determines the grid and block sizes that achieve maximum potential occupancy for a kernel.

Note: HIP does not support kernel launches in which the total number of work items in any dimension, gridDim x blockDim, is 2^32 or greater.

Parameters:
  • gridSize[out] minimum grid size for maximum potential occupancy

  • blockSize[out] block size for maximum potential occupancy

  • f[in] kernel function for which occupancy is calculated

  • dynSharedMemPerBlk[in] dynamic shared memory usage (in bytes) intended for each block

  • blockSizeLimit[in] the maximum block size for the kernel, use 0 for no limit

Returns:

hipSuccess, hipErrorInvalidValue

template<class T>
inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk)

Returns occupancy for a kernel function.

Parameters:
  • numBlocks[out] - Pointer to the returned occupancy, in number of blocks.

  • f[in] - The kernel function to launch on the device.

  • blockSize[in] - The block size the kernel is launched with.

  • dynSharedMemPerBlk[in] - Dynamic shared memory in bytes per block.

Returns:

hipSuccess, hipErrorInvalidValue

template<class T>
inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags(int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)

Returns occupancy for a device function with the specified flags.

Parameters:
  • numBlocks[out] - Pointer to the returned occupancy, in number of blocks.

  • f[in] - The kernel function to launch on the device.

  • blockSize[in] - The block size the kernel is launched with.

  • dynSharedMemPerBlk[in] - Dynamic shared memory in bytes per block.

  • flags[in] - Flags that control the behavior of the occupancy calculator.

Returns:

hipSuccess, hipErrorInvalidValue

template<typename UnaryFunction, class T>
static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMemWithFlags(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0, unsigned int flags = 0)

Returns the grid and block sizes that achieve maximum potential occupancy for a device function.

Returns in *min_grid_size and *block_size a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).

Parameters:
  • min_grid_size[out] minimum grid size needed to achieve the best potential occupancy

  • block_size[out] block size required for the best potential occupancy

  • func[in] device function symbol

  • block_size_to_dynamic_smem_size[in] - a unary function/functor that takes block size, and returns the size, in bytes, of dynamic shared memory needed for a block

  • block_size_limit[in] the maximum block size func is designed to work with. 0 means no limit.

  • flags[in] reserved

Returns:

hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown
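
The `block_size_to_dynamic_smem_size` argument can be any callable that maps a candidate block size to a byte count. As an illustration only, the sketch below defines a hypothetical functor, `PerThreadSmem`, for a kernel assumed to need a fixed number of bytes per thread plus a fixed per-block overhead. Neither the name nor the memory model comes from the HIP headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical functor: dynamic shared memory grows linearly with block size.
struct PerThreadSmem {
    std::size_t bytesPerThread;  // assumed per-thread requirement
    std::size_t fixedBytes;      // assumed per-block overhead

    // Takes a candidate block size and returns the dynamic shared
    // memory, in bytes, that a block of that size would need.
    std::size_t operator()(int blockSize) const {
        assert(blockSize >= 0);
        return static_cast<std::size_t>(blockSize) * bytesPerThread + fixedBytes;
    }
};
```

An instance such as `PerThreadSmem{2 * sizeof(float), 0}` could then be passed as the fourth argument, so that each block size the calculator considers is evaluated with its own shared memory requirement.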

template<typename UnaryFunction, class T>
static inline hipError_t hipOccupancyMaxPotentialBlockSizeVariableSMem(int *min_grid_size, int *block_size, T func, UnaryFunction block_size_to_dynamic_smem_size, int block_size_limit = 0)

Returns the grid and block sizes that achieve maximum potential occupancy for a device function.

Returns in *min_grid_size and *block_size a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).

Parameters:
  • min_grid_size[out] minimum grid size needed to achieve the best potential occupancy

  • block_size[out] block size required for the best potential occupancy

  • func[in] device function symbol

  • block_size_to_dynamic_smem_size[in] - a unary function/functor that takes block size, and returns the size, in bytes, of dynamic shared memory needed for a block

  • block_size_limit[in] the maximum block size func is designed to work with. 0 means no limit.

Returns:

hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown

template<typename F>
inline hipError_t hipOccupancyMaxPotentialBlockSize(int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit)

Returns the grid and block sizes that achieve maximum potential occupancy for a device function.

Returns in *gridSize and *blockSize a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).

Returns:

hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue