Occupancy
Functions
| Return type | Function | Description |
| --- | --- | --- |
| hipError_t | hipModuleOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit) | Determines the grid and block sizes that achieve maximum occupancy for a kernel. |
| hipError_t | hipModuleOccupancyMaxPotentialBlockSizeWithFlags (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags) | Determines the grid and block sizes that achieve maximum occupancy for a kernel. |
| hipError_t | hipModuleOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk) | Returns occupancy for a device function. |
| hipError_t | hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags) | Returns occupancy for a device function. |
| hipError_t | hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk) | Returns occupancy for a device function. |
| hipError_t | hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags) | Returns occupancy for a device function. |
| hipError_t | hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit) | Determines the grid and block sizes that achieve maximum occupancy for a kernel. |
| hipError_t | hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, const void *f, int numBlocks, int blockSize) | Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM. |
| hipError_t | hipOccupancyMaxActiveClusters (int *numClusters, const void *f, const hipLaunchConfig_t *config) | Determines the number of active kernel clusters that can co-exist on a device at the same time. |
| hipError_t | hipOccupancyMaxPotentialClusterSize (int *clusterSize, const void *f, const hipLaunchConfig_t *config) | Returns the maximum cluster size (in number of blocks) that can run on the device. |
| template<class T> hipError_t | hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk) | Returns occupancy for a kernel function. |
| template<class T> hipError_t | hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags) | Returns occupancy for a device function with the specified flags. |
| template<typename F> hipError_t | hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit) | Returns the grid and block size that achieve maximum potential occupancy for a device function. |
| template<typename F> hipError_t | hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, F f, int numBlocks, int blockSize) | Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM. |
Detailed Description
This section describes the occupancy functions of the HIP runtime API.
Function Documentation
◆ hipModuleOccupancyMaxActiveBlocksPerMultiprocessor()
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a device function.
- Parameters
  - [out] numBlocks - Returned occupancy
  - [in] f - Kernel function (hipFunction_t) for which occupancy is calculated
  - [in] blockSize - Block size the kernel is intended to be launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
- Returns
- hipSuccess, hipErrorInvalidValue
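The value written to *numBlocks is a count of resident blocks per multiprocessor. A common way to interpret it, which this reference does not itself define, is as a fraction of the multiprocessor's thread capacity. The sketch below assumes that capacity comes from hipDeviceProp_t::maxThreadsPerMultiProcessor; occupancyFraction is a hypothetical helper, not part of HIP.

```cpp
#include <cassert>

// Hypothetical helper (not a HIP API): converts the block count returned by an
// occupancy query into a fraction of the multiprocessor's thread capacity.
//   activeBlocks                - value written to *numBlocks by the query
//   blockSize                   - block size passed to the same query
//   maxThreadsPerMultiprocessor - device limit, assumed to be read from
//                                 hipDeviceProp_t::maxThreadsPerMultiProcessor
double occupancyFraction(int activeBlocks, int blockSize,
                         int maxThreadsPerMultiprocessor) {
    assert(activeBlocks >= 0 && blockSize > 0 && maxThreadsPerMultiprocessor > 0);
    return static_cast<double>(activeBlocks) * blockSize /
           maxThreadsPerMultiprocessor;
}
```

For example, 4 resident blocks of 256 threads on a multiprocessor that holds 2048 threads gives a fraction of 0.5.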
◆ hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags()
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function.
- Parameters
  - [out] numBlocks - Returned occupancy
  - [in] f - Kernel function (hipFunction_t) for which occupancy is calculated
  - [in] blockSize - Block size the kernel is intended to be launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
  - [in] flags - Extra flags for occupancy calculation (only default supported)
- Returns
- hipSuccess, hipErrorInvalidValue
◆ hipModuleOccupancyMaxPotentialBlockSize()
hipError_t hipModuleOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Parameters
  - [out] gridSize - Minimum grid size for maximum potential occupancy
  - [out] blockSize - Block size for maximum potential occupancy
  - [in] f - Kernel function for which occupancy is calculated
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
  - [in] blockSizeLimit - Maximum block size for the kernel; use 0 for no limit
Note: HIP does not support kernel launches in which the total number of work items in a dimension, gridDim x blockDim, is 2^32 or greater.
- Returns
- hipSuccess, hipErrorInvalidValue
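The work-item limit in the note above can be checked on the host before a launch. The following is a minimal sketch that reads the note as a per-dimension limit on gridDim x blockDim; fitsWorkItemLimit and maxGridDimFor are hypothetical helper names, not HIP functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when gridDim * blockDim in one dimension stays
// strictly below 2^32. The product is formed in 64 bits to avoid wraparound.
bool fitsWorkItemLimit(std::uint32_t gridDim, std::uint32_t blockDim) {
    const std::uint64_t total = static_cast<std::uint64_t>(gridDim) * blockDim;
    return total < (std::uint64_t{1} << 32);
}

// Hypothetical helper: largest gridDim for which gridDim * blockDim < 2^32.
std::uint32_t maxGridDimFor(std::uint32_t blockDim) {
    assert(blockDim > 0);
    const std::uint64_t lastValidTotal = (std::uint64_t{1} << 32) - 1;
    return static_cast<std::uint32_t>(lastValidTotal / blockDim);
}
```

A caller could clamp the grid dimension to maxGridDimFor(blockDim) and cover any remaining elements with a grid-stride loop inside the kernel.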
◆ hipModuleOccupancyMaxPotentialBlockSizeWithFlags()
hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Parameters
  - [out] gridSize - Minimum grid size for maximum potential occupancy
  - [out] blockSize - Block size for maximum potential occupancy
  - [in] f - Kernel function for which occupancy is calculated
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
  - [in] blockSizeLimit - Maximum block size for the kernel; use 0 for no limit
  - [in] flags - Extra flags for occupancy calculation (only default supported)
Note: HIP does not support kernel launches in which the total number of work items in a dimension, gridDim x blockDim, is 2^32 or greater.
- Returns
- hipSuccess, hipErrorInvalidValue
◆ hipOccupancyAvailableDynamicSMemPerBlock() [1/2]
hipError_t hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, const void *f, int numBlocks, int blockSize)
Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.
Returns in *dynamicSmemSize the maximum size of dynamic shared memory that allows numBlocks blocks per SM.
- Parameters
  - [out] dynamicSmemSize - Returned maximum dynamic shared memory
  - [in] f - Kernel function for which occupancy is calculated
  - [in] numBlocks - Number of blocks to fit on the SM
  - [in] blockSize - Size of the block
◆ hipOccupancyAvailableDynamicSMemPerBlock() [2/2]
template<typename F> inline hipError_t hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, F f, int numBlocks, int blockSize)
Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.
Returns in *dynamicSmemSize the maximum size of dynamic shared memory that allows numBlocks blocks per SM.
- Parameters
  - [out] dynamicSmemSize - Returned maximum dynamic shared memory
  - [in] f - Kernel function for which occupancy is calculated
  - [in] numBlocks - Number of blocks to fit on the SM
  - [in] blockSize - Size of the block
◆ hipOccupancyMaxActiveBlocksPerMultiprocessor() [1/2]
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a device function.
- Parameters
  - [out] numBlocks - Returned occupancy
  - [in] f - Kernel function for which occupancy is calculated
  - [in] blockSize - Block size the kernel is intended to be launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
◆ hipOccupancyMaxActiveBlocksPerMultiprocessor() [2/2]
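template<class T> inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk)
Returns occupancy for a kernel function.
- Parameters
  - [out] numBlocks - Pointer to the returned occupancy, in number of blocks
  - [in] f - The kernel function to launch on the device
  - [in] blockSize - The block size the kernel is launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory per block, in bytes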
- Returns
- hipSuccess, hipErrorInvalidValue
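One way to use this query is to sweep candidate block sizes and keep the one that leaves the most threads resident per multiprocessor. The sketch below is an illustration of that idea only, not the algorithm HIP uses internally. The callback activeBlocksFor is a hypothetical stand-in for a call to hipOccupancyMaxActiveBlocksPerMultiprocessor with a fixed kernel and dynamic shared memory size, and BlockChoice and pickBlockSize are illustrative names.

```cpp
#include <cassert>
#include <functional>

// Result of the sweep (illustrative type, not part of HIP).
struct BlockChoice {
    int blockSize;      // chosen threads per block, 0 if nothing fit
    int activeThreads;  // resident blocks * blockSize at that block size
};

// Tries block sizes step, 2*step, ... up to limit and keeps the one with the
// most resident threads per multiprocessor. On ties the smaller block size
// wins because only a strictly better candidate replaces the current best.
// activeBlocksFor(blockSize) is assumed to return the value an occupancy
// query would write to *numBlocks for that block size.
BlockChoice pickBlockSize(const std::function<int(int)>& activeBlocksFor,
                          int step, int limit) {
    assert(step > 0 && limit >= step);
    BlockChoice best{0, 0};
    for (int blockSize = step; blockSize <= limit; blockSize += step) {
        const int threads = activeBlocksFor(blockSize) * blockSize;
        if (threads > best.activeThreads) {
            best = BlockChoice{blockSize, threads};
        }
    }
    return best;
}
```

In real code the callback would wrap the occupancy query and check its hipError_t result; a step equal to the device's warp size is a natural choice.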
◆ hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() [1/2]
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function.
- Parameters
  - [out] numBlocks - Returned occupancy
  - [in] f - Kernel function for which occupancy is calculated
  - [in] blockSize - Block size the kernel is intended to be launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
  - [in] flags - Extra flags for occupancy calculation (currently ignored)
◆ hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() [2/2]
template<class T> inline hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
Returns occupancy for a device function with the specified flags.
- Parameters
  - [out] numBlocks - Pointer to the returned occupancy, in number of blocks
  - [in] f - The kernel function to launch on the device
  - [in] blockSize - The block size the kernel is launched with
  - [in] dynSharedMemPerBlk - Dynamic shared memory per block, in bytes
  - [in] flags - Flags that control the behavior of the occupancy calculator
- Returns
- hipSuccess, hipErrorInvalidValue
◆ hipOccupancyMaxActiveClusters()
hipError_t hipOccupancyMaxActiveClusters (int *numClusters, const void *f, const hipLaunchConfig_t *config)
Determines the number of active kernel clusters that can co-exist on a device at the same time.
- Parameters
  - [out] numClusters - The number of clusters
  - [in] f - Kernel function for which occupancy is calculated
  - [in] config - Pointer to the kernel launch configuration structure
- Returns
- hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidClusterSize, hipErrorInvalidValue
◆ hipOccupancyMaxPotentialBlockSize() [1/2]
hipError_t hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit)
Determines the grid and block sizes that achieve maximum occupancy for a kernel.
- Parameters
  - [out] gridSize - Minimum grid size for maximum potential occupancy
  - [out] blockSize - Block size for maximum potential occupancy
  - [in] f - Kernel function for which occupancy is calculated
  - [in] dynSharedMemPerBlk - Dynamic shared memory usage (in bytes) intended for each block
  - [in] blockSizeLimit - Maximum block size for the kernel; use 0 for no limit
Note: HIP does not support kernel launches in which the total number of work items in a dimension, gridDim x blockDim, is 2^32 or greater.
- Returns
- hipSuccess, hipErrorInvalidValue
◆ hipOccupancyMaxPotentialBlockSize() [2/2]
template<typename F> inline hipError_t hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit)
Returns the grid and block size that achieve maximum potential occupancy for a device function.
Returns in *gridSize and *blockSize a suggested grid and block size pair that achieves the best potential occupancy, that is, the maximum number of active warps on the current device with the smallest number of blocks for the given function.
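The suggested pair still has to be turned into a launch configuration for a concrete problem size. The sketch below shows two common host-side choices under stated assumptions: one thread per element, or a grid-stride loop capped at the suggested minimum grid size. blocksToCover and gridForStrideLoop are hypothetical helpers, and myKernel in the comment is a placeholder kernel name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Typical call site (needs the HIP runtime, so shown as a comment only):
//   int minGridSize = 0, blockSize = 0;
//   hipOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, myKernel, 0, 0);

// Hypothetical helper: number of blocks needed so that n elements each get
// one thread (ceiling division). Assumes the result fits in an int.
int blocksToCover(std::size_t n, int blockSize) {
    assert(blockSize > 0);
    const std::size_t bs = static_cast<std::size_t>(blockSize);
    return static_cast<int>((n + bs - 1) / bs);
}

// Hypothetical helper: grid size for a kernel written as a grid-stride loop.
// Launches no more blocks than the suggested minimum grid size, and fewer
// when the problem is too small to need that many.
int gridForStrideLoop(std::size_t n, int blockSize, int minGridSize) {
    return std::min(blocksToCover(n, blockSize), minGridSize);
}
```

With a suggested block size of 256 and a minimum grid size of 440, one million elements would need 3907 blocks for one thread per element, while the grid-stride variant would launch 440 blocks.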
◆ hipOccupancyMaxPotentialClusterSize()
hipError_t hipOccupancyMaxPotentialClusterSize (int *clusterSize, const void *f, const hipLaunchConfig_t *config)
Returns the maximum cluster size (in number of blocks) that can run on the device.
- Parameters
  - [out] clusterSize - The maximum cluster size
  - [in] f - Kernel function for which occupancy is calculated
  - [in] config - Pointer to the kernel launch configuration structure
- Returns
- hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidClusterSize, hipErrorInvalidValue