Occupancy

HIP Runtime API Reference: Occupancy

Functions

hipError_t hipModuleOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit)
 Determines the grid and block sizes that achieve maximum occupancy for a kernel.
 
hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags (int *gridSize, int *blockSize, hipFunction_t f, size_t dynSharedMemPerBlk, int blockSizeLimit, unsigned int flags)
 Determines the grid and block sizes that achieve maximum occupancy for a kernel.
 
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk)
 Returns occupancy for a device function.
 
hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, hipFunction_t f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
 Returns occupancy for a device function.
 
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk)
 Returns occupancy for a device function.
 
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, const void *f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
 Returns occupancy for a device function.
 
hipError_t hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, const void *f, size_t dynSharedMemPerBlk, int blockSizeLimit)
 Determines the grid and block sizes that achieve maximum occupancy for a kernel.
 
hipError_t hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, const void *f, int numBlocks, int blockSize)
 Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.
 
template<class T >
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk)
 Returns occupancy for a kernel function.
 
template<class T >
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags (int *numBlocks, T f, int blockSize, size_t dynSharedMemPerBlk, unsigned int flags)
 Returns occupancy for a device function with the specified flags.
 
template<typename F >
hipError_t hipOccupancyMaxPotentialBlockSize (int *gridSize, int *blockSize, F kernel, size_t dynSharedMemPerBlk, uint32_t blockSizeLimit)
 Returns the grid and block size that achieve maximum potential occupancy for a device function.
 
template<typename F >
hipError_t hipOccupancyAvailableDynamicSMemPerBlock (size_t *dynamicSmemSize, F f, int numBlocks, int blockSize)
 Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.
 

Detailed Description



This section describes the occupancy functions of the HIP runtime API.

Function Documentation

◆ hipModuleOccupancyMaxActiveBlocksPerMultiprocessor()

hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessor ( int *  numBlocks,
hipFunction_t  f,
int  blockSize,
size_t  dynSharedMemPerBlk 
)

Returns occupancy for a device function.

Parameters
[out] numBlocks: Returned occupancy (maximum number of active blocks per multiprocessor)
[in] f: Kernel function (hipFunction_t) for which occupancy is calculated
[in] blockSize: Block size the kernel is intended to be launched with
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
Returns
hipSuccess, hipErrorInvalidValue
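
The value written to numBlocks is a block count, not a percentage. A minimal sketch of how it could be turned into an occupancy fraction is shown below. The helper name occupancyFraction is hypothetical and not part of the HIP API, and the sketch assumes that the warp size and the maximum number of resident threads per multiprocessor are read from the device properties (for example the warpSize and maxThreadsPerMultiProcessor fields of hipDeviceProp_t).

```cpp
#include <cassert>

// Hypothetical helper, not part of the HIP API.
// activeBlocksPerSM: the value an occupancy query wrote to *numBlocks.
// blockSize:         the block size passed to that same query.
// warpSize, maxThreadsPerSM: assumed to be taken from the device properties.
double occupancyFraction(int activeBlocksPerSM, int blockSize,
                         int warpSize, int maxThreadsPerSM) {
    assert(activeBlocksPerSM >= 0 && blockSize > 0);
    assert(warpSize > 0 && maxThreadsPerSM >= warpSize);

    // A partially filled warp still occupies a whole warp slot, so round up.
    int warpsPerBlock = (blockSize + warpSize - 1) / warpSize;
    int activeWarps = activeBlocksPerSM * warpsPerBlock;
    int maxWarps = maxThreadsPerSM / warpSize;

    return static_cast<double>(activeWarps) / maxWarps;
}
```

For instance, 4 active blocks of 256 threads on a device assumed to have a warp size of 64 and 2048 resident threads per multiprocessor give 16 of 32 warp slots, so the helper returns 0.5.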

◆ hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags()

hipError_t hipModuleOccupancyMaxActiveBlocksPerMultiprocessorWithFlags ( int *  numBlocks,
hipFunction_t  f,
int  blockSize,
size_t  dynSharedMemPerBlk,
unsigned int  flags 
)

Returns occupancy for a device function.

Parameters
[out] numBlocks: Returned occupancy (maximum number of active blocks per multiprocessor)
[in] f: Kernel function (hipFunction_t) for which occupancy is calculated
[in] blockSize: Block size the kernel is intended to be launched with
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
[in] flags: Extra flags for occupancy calculation (only the default is supported)
Returns
hipSuccess, hipErrorInvalidValue

◆ hipModuleOccupancyMaxPotentialBlockSize()

hipError_t hipModuleOccupancyMaxPotentialBlockSize ( int *  gridSize,
int *  blockSize,
hipFunction_t  f,
size_t  dynSharedMemPerBlk,
int  blockSizeLimit 
)

Determines the grid and block sizes that achieve maximum occupancy for a kernel.

Parameters
[out] gridSize: Minimum grid size for maximum potential occupancy
[out] blockSize: Block size for maximum potential occupancy
[in] f: Kernel function for which occupancy is calculated
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
[in] blockSizeLimit: The maximum block size for the kernel; use 0 for no limit

Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or greater.

Returns
hipSuccess, hipErrorInvalidValue
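
The 2^32 work-item limit can be checked on the host before a launch. The following is a minimal sketch in plain C++; the helper names blocksToCover and fitsWorkItemLimit are hypothetical and not part of the HIP API. It assumes a one-dimensional launch in which the block size comes from one of the occupancy functions and the grid size is chosen to cover n work items.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of blocks needed to cover n work items
// in one dimension (ceiling division).
std::uint64_t blocksToCover(std::uint64_t n, std::uint32_t blockSize) {
    assert(blockSize > 0);
    return n / blockSize + (n % blockSize != 0 ? 1 : 0);
}

// Hypothetical helper: true when gridDim * blockDim < 2^32 in this dimension.
bool fitsWorkItemLimit(std::uint64_t gridDim, std::uint32_t blockDim) {
    assert(blockDim > 0);
    const std::uint64_t maxWorkItems = (std::uint64_t{1} << 32) - 1;
    // gridDim * blockDim <= 2^32 - 1, written as a division so the
    // product cannot overflow 64 bits for very large gridDim.
    return gridDim <= maxWorkItems / blockDim;
}
```

With a block size of 256, a grid of 16777215 blocks is accepted by this check, while 16777216 blocks reaches exactly 2^32 work items and is rejected.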

◆ hipModuleOccupancyMaxPotentialBlockSizeWithFlags()

hipError_t hipModuleOccupancyMaxPotentialBlockSizeWithFlags ( int *  gridSize,
int *  blockSize,
hipFunction_t  f,
size_t  dynSharedMemPerBlk,
int  blockSizeLimit,
unsigned int  flags 
)

Determines the grid and block sizes that achieve maximum occupancy for a kernel.

Parameters
[out] gridSize: Minimum grid size for maximum potential occupancy
[out] blockSize: Block size for maximum potential occupancy
[in] f: Kernel function for which occupancy is calculated
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
[in] blockSizeLimit: The maximum block size for the kernel; use 0 for no limit
[in] flags: Extra flags for occupancy calculation (only the default is supported)

Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or greater.

Returns
hipSuccess, hipErrorInvalidValue

◆ hipOccupancyAvailableDynamicSMemPerBlock() [1/2]

hipError_t hipOccupancyAvailableDynamicSMemPerBlock ( size_t *  dynamicSmemSize,
const void *  f,
int  numBlocks,
int  blockSize 
)

Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.

Returns in *dynamicSmemSize the maximum size of dynamic shared memory that allows numBlocks blocks per SM.

Parameters
[out] dynamicSmemSize: Returned maximum dynamic shared memory.
[in] f: Kernel function for which occupancy is calculated.
[in] numBlocks: Number of blocks to fit on the SM.
[in] blockSize: Size of the block.
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown
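
The returned byte count usually has to be converted into an element count before it is useful for sizing a shared-memory tile. The sketch below shows one way to do that on the host. The helper tileElements is hypothetical and not part of the HIP API, and rounding the count down to a multiple of the block size is an assumption made for illustration (so that every thread handles the same number of elements), not something the API requires.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the HIP API.
// availableBytes: the value an availability query wrote to *dynamicSmemSize.
// blockSize:      the block size passed to that same query.
// Returns the largest element count of type T that fits in availableBytes
// and is a whole multiple of blockSize (0 if not even one multiple fits).
template <typename T>
std::size_t tileElements(std::size_t availableBytes, int blockSize) {
    assert(blockSize > 0);
    std::size_t fit = availableBytes / sizeof(T);
    return fit - fit % static_cast<std::size_t>(blockSize);
}
```

The corresponding dynamic shared memory size for the launch would then be tileElements<T>(...) * sizeof(T) bytes, which never exceeds the queried value.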

◆ hipOccupancyAvailableDynamicSMemPerBlock() [2/2]

template<typename F >
hipError_t hipOccupancyAvailableDynamicSMemPerBlock ( size_t *  dynamicSmemSize,
F  f,
int  numBlocks,
int  blockSize 
)
inline

Returns the dynamic shared memory available per block when launching numBlocks blocks on an SM.

Returns in *dynamicSmemSize the maximum size of dynamic shared memory that allows numBlocks blocks per SM.

Parameters
[out] dynamicSmemSize: Returned maximum dynamic shared memory.
[in] f: Kernel function for which occupancy is calculated.
[in] numBlocks: Number of blocks to fit on the SM.
[in] blockSize: Size of the block.
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidDeviceFunction, hipErrorInvalidValue, hipErrorUnknown

◆ hipOccupancyMaxActiveBlocksPerMultiprocessor() [1/2]

hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor ( int *  numBlocks,
const void *  f,
int  blockSize,
size_t  dynSharedMemPerBlk 
)

Returns occupancy for a device function.

Parameters
[out] numBlocks: Returned occupancy (maximum number of active blocks per multiprocessor)
[in] f: Kernel function for which occupancy is calculated
[in] blockSize: Block size the kernel is intended to be launched with
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
Returns
hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue

◆ hipOccupancyMaxActiveBlocksPerMultiprocessor() [2/2]

template<class T >
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessor ( int *  numBlocks,
T  f,
int  blockSize,
size_t  dynSharedMemPerBlk 
)
inline

Returns occupancy for a kernel function.

Parameters
[out] numBlocks: Pointer to the returned occupancy, in number of blocks.
[in] f: The kernel function to launch on the device.
[in] blockSize: The block size the kernel is launched with.
[in] dynSharedMemPerBlk: Dynamic shared memory per block, in bytes.
Returns
hipSuccess, hipErrorInvalidValue

◆ hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() [1/2]

hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags ( int *  numBlocks,
const void *  f,
int  blockSize,
size_t  dynSharedMemPerBlk,
unsigned int  flags 
)

Returns occupancy for a device function.

Parameters
[out] numBlocks: Returned occupancy (maximum number of active blocks per multiprocessor)
[in] f: Kernel function for which occupancy is calculated
[in] blockSize: Block size the kernel is intended to be launched with
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
[in] flags: Extra flags for occupancy calculation (currently ignored)
Returns
hipSuccess, hipErrorInvalidDeviceFunction, hipErrorInvalidValue

◆ hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags() [2/2]

template<class T >
hipError_t hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags ( int *  numBlocks,
T  f,
int  blockSize,
size_t  dynSharedMemPerBlk,
unsigned int  flags 
)
inline

Returns occupancy for a device function with the specified flags.

Parameters
[out] numBlocks: Pointer to the returned occupancy, in number of blocks.
[in] f: The kernel function to launch on the device.
[in] blockSize: The block size the kernel is launched with.
[in] dynSharedMemPerBlk: Dynamic shared memory per block, in bytes.
[in] flags: Flags that control the behavior of the occupancy calculator.
Returns
hipSuccess, hipErrorInvalidValue

◆ hipOccupancyMaxPotentialBlockSize() [1/2]

hipError_t hipOccupancyMaxPotentialBlockSize ( int *  gridSize,
int *  blockSize,
const void *  f,
size_t  dynSharedMemPerBlk,
int  blockSizeLimit 
)

Determines the grid and block sizes that achieve maximum occupancy for a kernel.

Parameters
[out] gridSize: Minimum grid size for maximum potential occupancy
[out] blockSize: Block size for maximum potential occupancy
[in] f: Kernel function for which occupancy is calculated
[in] dynSharedMemPerBlk: Dynamic shared memory usage (in bytes) intended for each block
[in] blockSizeLimit: The maximum block size for the kernel; use 0 for no limit

Note: HIP does not support kernel launches in which the total number of work items in a dimension (gridDim x blockDim) is 2^32 or greater.

Returns
hipSuccess, hipErrorInvalidValue

◆ hipOccupancyMaxPotentialBlockSize() [2/2]

template<typename F >
hipError_t hipOccupancyMaxPotentialBlockSize ( int *  gridSize,
int *  blockSize,
F  kernel,
size_t  dynSharedMemPerBlk,
uint32_t  blockSizeLimit 
)
inline

Returns the grid and block size that achieve maximum potential occupancy for a device function.

Returns in *gridSize and *blockSize a suggested grid / block size pair that achieves the best potential occupancy (i.e. the maximum number of active warps on the current device with the smallest number of blocks for a particular function).

Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue
See also
hipOccupancyMaxPotentialBlockSize
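
The idea of "maximum number of active warps with the smallest number of blocks" can be illustrated with a small host-side model. This is a conceptual sketch only and is not how HIP implements the function: suggestLaunch and LaunchSuggestion are hypothetical names, the blocksPerSM callback stands in for a per-block-size occupancy query such as hipOccupancyMaxActiveBlocksPerMultiprocessor, and the tie-breaking rule (prefer the larger block size) is an assumption.

```cpp
#include <cassert>
#include <functional>

// Hypothetical result type for the model below.
struct LaunchSuggestion {
    int gridSize;   // blocks per SM at the chosen block size, times the SM count
    int blockSize;  // chosen block size; 0 if no candidate was usable
};

// Conceptual model, not the HIP implementation.
// blocksPerSM(bs): active blocks per multiprocessor for block size bs.
// blockSizeLimit:  0 means "no limit", as in the parameter description.
LaunchSuggestion suggestLaunch(const std::function<int(int)>& blocksPerSM,
                               int warpSize, int maxBlockSize,
                               int numSMs, int blockSizeLimit) {
    assert(warpSize > 0 && maxBlockSize >= warpSize && numSMs > 0);
    int limit = (blockSizeLimit > 0 && blockSizeLimit < maxBlockSize)
                    ? blockSizeLimit : maxBlockSize;

    LaunchSuggestion best{0, 0};
    int bestThreads = 0;
    // Candidates are whole multiples of the warp size; if the limit is
    // below one warp, no candidate is tried and {0, 0} is returned.
    for (int bs = warpSize; bs <= limit; bs += warpSize) {
        int blocks = blocksPerSM(bs);
        int threads = blocks * bs;  // active threads per SM (warps * warpSize)
        // ">=" keeps the larger block size on ties, which means fewer
        // blocks for the same number of active warps.
        if (threads > 0 && threads >= bestThreads) {
            bestThreads = threads;
            best = {blocks * numSMs, bs};
        }
    }
    return best;
}
```

With a toy device model of 2048 resident threads and at most 16 blocks per multiprocessor, a warp size of 64, a maximum block size of 1024, and 10 multiprocessors, this model picks a block size of 1024 and a grid size of 20 when no limit is given, and a block size of 256 with a grid size of 80 when blockSizeLimit is 256.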