Device Management

HIP Runtime API Reference: Device Management

Functions

hipError_t hipDeviceSynchronize (void)
 Waits on all active streams on the current device.

hipError_t hipDeviceReset (void)
 Discards the state of the current device and resets it to a fresh state.

hipError_t hipSetDevice (int deviceId)
 Sets the default device to be used for subsequent HIP API calls from this thread.

hipError_t hipGetDevice (int *deviceId)
 Returns the default device ID for the calling host thread.

hipError_t hipGetDeviceCount (int *count)
 Returns the number of compute-capable devices.

hipError_t hipDeviceGetAttribute (int *pi, hipDeviceAttribute_t attr, int deviceId)
 Queries a specific device attribute.

hipError_t hipDeviceGetDefaultMemPool (hipMemPool_t *mem_pool, int device)
 Returns the default memory pool of the specified device.

hipError_t hipDeviceSetMemPool (int device, hipMemPool_t mem_pool)
 Sets the current memory pool of a device.

hipError_t hipDeviceGetMemPool (hipMemPool_t *mem_pool, int device)
 Gets the current memory pool for the specified device.

hipError_t hipGetDeviceProperties (hipDeviceProp_t *prop, int deviceId)
 Returns device properties.

hipError_t hipDeviceSetCacheConfig (hipFuncCache_t cacheConfig)
 Sets the L1/shared cache partition.

hipError_t hipDeviceGetCacheConfig (hipFuncCache_t *cacheConfig)
 Gets the cache configuration for the current device.

hipError_t hipDeviceGetLimit (size_t *pValue, enum hipLimit_t limit)
 Gets resource limits of the current device.

hipError_t hipDeviceSetLimit (enum hipLimit_t limit, size_t value)
 Sets resource limits of the current device.

hipError_t hipDeviceGetSharedMemConfig (hipSharedMemConfig *pConfig)
 Returns the bank width of shared memory for the current device.

hipError_t hipGetDeviceFlags (unsigned int *flags)
 Gets the flags set for the current device.

hipError_t hipDeviceSetSharedMemConfig (hipSharedMemConfig config)
 Sets the bank width of shared memory on the current device.

hipError_t hipSetDeviceFlags (unsigned flags)
 Changes the behavior of the current device according to the flags passed.

hipError_t hipChooseDevice (int *device, const hipDeviceProp_t *prop)
 Returns the device that matches the given hipDeviceProp_t.

hipError_t hipExtGetLinkTypeAndHopCount (int device1, int device2, uint32_t *linktype, uint32_t *hopcount)
 Returns the link type and hop count between two devices.

hipError_t hipIpcGetMemHandle (hipIpcMemHandle_t *handle, void *devPtr)
 Gets an interprocess memory handle for an existing device memory allocation.

hipError_t hipIpcOpenMemHandle (void **devPtr, hipIpcMemHandle_t handle, unsigned int flags)
 Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.

hipError_t hipIpcCloseMemHandle (void *devPtr)
 Closes memory mapped with hipIpcOpenMemHandle.

hipError_t hipIpcGetEventHandle (hipIpcEventHandle_t *handle, hipEvent_t event)
 Gets an opaque interprocess handle for an event.

hipError_t hipIpcOpenEventHandle (hipEvent_t *event, hipIpcEventHandle_t handle)
 Opens an interprocess event handle.
 

Detailed Description

This section describes the device management functions of the HIP runtime API.

Function Documentation

◆ hipChooseDevice()

hipError_t hipChooseDevice ( int *  device,
const hipDeviceProp_t *  prop 
)

Returns the device that matches the given hipDeviceProp_t.

Parameters
[out] device - Pointer to the ID of the selected device
[in] prop - Pointer to the properties to match
Returns
hipSuccess, hipErrorInvalidValue

◆ hipDeviceGetAttribute()

hipError_t hipDeviceGetAttribute ( int *  pi,
hipDeviceAttribute_t  attr,
int  deviceId 
)

Query for a specific device attribute.

Parameters
[out] pi - Pointer to the returned value
[in] attr - Attribute to query
[in] deviceId - Device to query for information
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue

◆ hipDeviceGetCacheConfig()

hipError_t hipDeviceGetCacheConfig ( hipFuncCache_t *  cacheConfig)

Gets the cache configuration for the current device.

Parameters
[out] cacheConfig - Pointer to the cache configuration
Returns
hipSuccess, hipErrorNotInitialized

Note: AMD devices do not support reconfigurable cache. This hint is ignored on these architectures.

◆ hipDeviceGetDefaultMemPool()

hipError_t hipDeviceGetDefaultMemPool ( hipMemPool_t *  mem_pool,
int  device 
)

Returns the default memory pool of the specified device.

Parameters
[out] mem_pool - Default memory pool to return
[in] device - Device index whose default memory pool is queried
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue, hipErrorNotSupported
See also
hipDeviceGetDefaultMemPool, hipMallocAsync, hipMemPoolTrimTo, hipMemPoolGetAttribute, hipDeviceSetMemPool, hipMemPoolSetAttribute, hipMemPoolSetAccess, hipMemPoolGetAccess
Warning
This API is marked as beta: it is feature complete, but it is still open to changes and may have outstanding issues.

◆ hipDeviceGetLimit()

hipError_t hipDeviceGetLimit ( size_t *  pValue,
enum hipLimit_t  limit 
)

Gets resource limits of current device.

The function queries the value of the limit selected by the hipLimit_t input, which can be either hipLimitStackSize or hipLimitMallocHeapSize. For any other input, the function returns hipErrorUnsupportedLimit.

Parameters
[out] pValue - Returns the size of the limit in bytes
[in] limit - The limit to query
Returns
hipSuccess, hipErrorUnsupportedLimit, hipErrorInvalidValue

◆ hipDeviceGetMemPool()

hipError_t hipDeviceGetMemPool ( hipMemPool_t *  mem_pool,
int  device 
)

Gets the current memory pool for the specified device.

Returns the last pool provided to hipDeviceSetMemPool for this device. If hipDeviceSetMemPool has never been called, the current memory pool is the device's default memory pool.

Parameters
[out] mem_pool - Current memory pool on the specified device
[in] device - Device index whose current memory pool is queried
Returns
hipSuccess, hipErrorInvalidValue, hipErrorNotSupported
See also
hipDeviceGetDefaultMemPool, hipMallocAsync, hipMemPoolTrimTo, hipMemPoolGetAttribute, hipDeviceSetMemPool, hipMemPoolSetAttribute, hipMemPoolSetAccess, hipMemPoolGetAccess
Warning
This API is marked as beta: it is feature complete, but it is still open to changes and may have outstanding issues.

◆ hipDeviceGetSharedMemConfig()

hipError_t hipDeviceGetSharedMemConfig ( hipSharedMemConfig *  pConfig)

Returns the bank width of shared memory for the current device.

Parameters
[out] pConfig - Pointer to the returned bank width for shared memory
Returns
hipSuccess, hipErrorInvalidValue, hipErrorNotInitialized

Note: AMD devices and some NVIDIA GPUs do not support shared cache banking, and the hint is ignored on those architectures.

◆ hipDeviceReset()

hipError_t hipDeviceReset ( void  )

Discards the state of the current device and resets it to a fresh state.

Calling this function destroys all streams and events created, frees all memory allocated, and discards all running kernels on the current device. Make sure that no other thread is using the device or any streams, memory, kernels, or events associated with it.

Returns
hipSuccess
See also
hipDeviceSynchronize

◆ hipDeviceSetCacheConfig()

hipError_t hipDeviceSetCacheConfig ( hipFuncCache_t  cacheConfig)

Sets the L1/shared cache partition.

Parameters
[in] cacheConfig - Cache configuration
Returns
hipSuccess, hipErrorNotInitialized, hipErrorNotSupported

Note: AMD devices do not support reconfigurable cache. This API is not implemented on the AMD platform. If the function is called, it returns hipErrorNotSupported.

◆ hipDeviceSetLimit()

hipError_t hipDeviceSetLimit ( enum hipLimit_t  limit,
size_t  value 
)

Sets resource limits of current device.

When limit is hipLimitStackSize, the function sets the per-thread stack size on the current GPU device. The current value can be read with hipDeviceGetLimit. The size is in units of 256 dwords, up to the limit (128K - 16).

When limit is hipLimitMallocHeapSize, the function sets the size of the heap used by the malloc()/free() calls. The current value can be read with hipDeviceGetLimit.

For any other input, the function returns hipErrorUnsupportedLimit.

Parameters
[in] limit - The hipLimit_t value selecting the limit to set
[in] value - The size of the limit value in bytes
Returns
hipSuccess, hipErrorUnsupportedLimit, hipErrorInvalidValue

◆ hipDeviceSetMemPool()

hipError_t hipDeviceSetMemPool ( int  device,
hipMemPool_t  mem_pool 
)

Sets the current memory pool of a device.

The memory pool must be local to the specified device. hipMallocAsync allocates from the current mempool of the provided stream's device. By default, a device's current memory pool is its default memory pool.

Note
Use hipMallocFromPoolAsync for asynchronous memory allocations from a device different from the one the stream runs on.
Parameters
[in] device - Device index to update
[in] mem_pool - Memory pool to set as the current pool on the specified device
Returns
hipSuccess, hipErrorInvalidValue, hipErrorInvalidDevice, hipErrorNotSupported
See also
hipDeviceGetDefaultMemPool, hipMallocAsync, hipMemPoolTrimTo, hipMemPoolGetAttribute, hipDeviceSetMemPool, hipMemPoolSetAttribute, hipMemPoolSetAccess, hipMemPoolGetAccess
Warning
This API is marked as beta: it is feature complete, but it is still open to changes and may have outstanding issues.

◆ hipDeviceSetSharedMemConfig()

hipError_t hipDeviceSetSharedMemConfig ( hipSharedMemConfig  config)

Sets the bank width of shared memory on the current device.

Parameters
[in] config - Configuration for the bank width of shared memory
Returns
hipSuccess, hipErrorInvalidValue, hipErrorNotInitialized

Note: AMD devices and some NVIDIA GPUs do not support shared cache banking, and the hint is ignored on those architectures.

◆ hipDeviceSynchronize()

hipError_t hipDeviceSynchronize ( void  )

Waits on all active streams on the current device.

When this function is invoked, the host thread is blocked until all the commands in all streams associated with the device have completed. HIP does not yet support multiple blocking modes.

Returns
hipSuccess
See also
hipSetDevice, hipDeviceReset

◆ hipExtGetLinkTypeAndHopCount()

hipError_t hipExtGetLinkTypeAndHopCount ( int  device1,
int  device2,
uint32_t *  linktype,
uint32_t *  hopcount 
)

Returns the link type and hop count between two devices.

Parameters
[in] device1 - Ordinal for device1
[in] device2 - Ordinal for device2
[out] linktype - Returns the link type (see hsa_amd_link_info_type_t) between the two devices
[out] hopcount - Returns the hop count between the two devices

Queries and returns the HSA link type and the hop count between the two specified devices.

Returns
hipSuccess, hipErrorInvalidValue

◆ hipGetDevice()

hipError_t hipGetDevice ( int *  deviceId)

Returns the default device ID for the calling host thread.

Parameters
[out] deviceId - *deviceId is written with the default device

HIP maintains a default device for each thread using thread-local storage. This device is used implicitly for HIP runtime APIs called by this thread. hipGetDevice returns in *deviceId the default device for the calling host thread.

Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue
See also
hipSetDevice, hipGetDevicesizeBytes

◆ hipGetDeviceCount()

hipError_t hipGetDeviceCount ( int *  count)

Returns the number of compute-capable devices.

Parameters
[out] count - Returns the number of compute-capable devices
Returns
hipSuccess, hipErrorNoDevice

Returns in *count the number of devices that are able to run compute commands. If there are no such devices, hipGetDeviceCount returns hipErrorNoDevice. If one or more devices are found, hipGetDeviceCount returns hipSuccess.

◆ hipGetDeviceFlags()

hipError_t hipGetDeviceFlags ( unsigned int *  flags)

Gets the flags set for current device.

Parameters
[out] flags - Pointer to the returned flags
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorInvalidValue

◆ hipGetDeviceProperties()

hipError_t hipGetDeviceProperties ( hipDeviceProp_t *  prop,
int  deviceId 
)

Returns device properties.

Parameters
[out] prop - Written with the device properties
[in] deviceId - Device to query for information
Returns
hipSuccess, hipErrorInvalidDevice
Bug:

  • HCC always returns 0 for maxThreadsPerMultiProcessor
  • HCC always returns 0 for regsPerBlock
  • HCC always returns 0 for l2CacheSize

Populates the hipDeviceProp_t structure pointed to by prop with information for the specified device.

◆ hipIpcCloseMemHandle()

hipError_t hipIpcCloseMemHandle ( void *  devPtr)

Closes memory mapped with hipIpcOpenMemHandle.

Unmaps memory returned by hipIpcOpenMemHandle. The original allocation in the exporting process, as well as imported mappings in other processes, is unaffected.

Any resources used to enable peer access will be freed if this is the last mapping using them.

Parameters
devPtr - Device pointer returned by hipIpcOpenMemHandle
Returns
hipSuccess, hipErrorMapFailed, hipErrorInvalidHandle
Note
This IPC memory related feature API on Windows may behave differently from Linux.

◆ hipIpcGetEventHandle()

hipError_t hipIpcGetEventHandle ( hipIpcEventHandle_t *  handle,
hipEvent_t  event 
)

Gets an opaque interprocess handle for an event.

This opaque handle may be copied into other processes and opened with hipIpcOpenEventHandle. Then hipEventRecord, hipEventSynchronize, hipStreamWaitEvent and hipEventQuery may be used in either process. Operations on the imported event after the exported event has been freed with hipEventDestroy will result in undefined behavior.

Parameters
[out] handle - Pointer to hipIpcEventHandle_t to return the opaque event handle
[in] event - Event allocated with hipEventInterprocess and hipEventDisableTiming flags
Returns
hipSuccess, hipErrorInvalidConfiguration, hipErrorInvalidValue
Note
This IPC event related feature API is currently applicable on Linux.

◆ hipIpcGetMemHandle()

hipError_t hipIpcGetMemHandle ( hipIpcMemHandle_t *  handle,
void *  devPtr 
)

Gets an interprocess memory handle for an existing device memory allocation.

Takes a pointer to the base of an existing device memory allocation created with hipMalloc and exports it for use in another process. This is a lightweight operation and may be called multiple times on an allocation without adverse effects.

If a region of memory is freed with hipFree and a subsequent call to hipMalloc returns memory with the same device address, hipIpcGetMemHandle will return a unique handle for the new memory.

Parameters
handle - Pointer to a user-allocated hipIpcMemHandle_t in which to return the handle
devPtr - Base pointer to previously allocated device memory
Returns
hipSuccess, hipErrorInvalidHandle, hipErrorOutOfMemory, hipErrorMapFailed
Note
This IPC memory related feature API on Windows may behave differently from Linux.

◆ hipIpcOpenEventHandle()

hipError_t hipIpcOpenEventHandle ( hipEvent_t *  event,
hipIpcEventHandle_t  handle 
)

Opens an interprocess event handle.

Opens an interprocess event handle exported from another process with hipIpcGetEventHandle. The returned hipEvent_t behaves like a locally created event with the hipEventDisableTiming flag specified. This event must be freed with hipEventDestroy. Operations on the imported event after the exported event has been freed with hipEventDestroy will result in undefined behavior. If the function is called within the same process where handle was returned by hipIpcGetEventHandle, it returns hipErrorInvalidContext.

Parameters
[out] event - Pointer to hipEvent_t to return the event
[in] handle - The opaque interprocess handle to open
Returns
hipSuccess, hipErrorInvalidValue, hipErrorInvalidContext
Note
This IPC event related feature API is currently applicable on Linux.

◆ hipIpcOpenMemHandle()

hipError_t hipIpcOpenMemHandle ( void **  devPtr,
hipIpcMemHandle_t  handle,
unsigned int  flags 
)

Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.

Maps memory exported from another process with hipIpcGetMemHandle into the current device address space. For contexts on different devices hipIpcOpenMemHandle can attempt to enable peer access between the devices as if the user called hipDeviceEnablePeerAccess. This behavior is controlled by the hipIpcMemLazyEnablePeerAccess flag. hipDeviceCanAccessPeer can determine if a mapping is possible.

Contexts that may open hipIpcMemHandles are restricted in the following way. hipIpcMemHandles from each device in a given process may only be opened by one context per device per other process.

Memory returned from hipIpcOpenMemHandle must be freed with hipIpcCloseMemHandle.

Calling hipFree on an exported memory region before calling hipIpcCloseMemHandle in the importing context will result in undefined behavior.

Parameters
devPtr - Returned device pointer
handle - hipIpcMemHandle_t to open
flags - Flags for this operation. Must be specified as hipIpcMemLazyEnablePeerAccess
Returns
hipSuccess, hipErrorInvalidValue, hipErrorInvalidContext, hipErrorInvalidDevicePointer
Note
When multiple processes open the same memory handle, there is no guarantee that the same device pointer will be returned in *devPtr. This is different from CUDA.
This IPC memory related feature API on Windows may behave differently from Linux.

◆ hipSetDevice()

hipError_t hipSetDevice ( int  deviceId)

Sets the default device to be used for subsequent HIP API calls from this thread.

Parameters
[in] deviceId - Valid device in the range 0...(hipGetDeviceCount()-1)

Sets deviceId as the default device for the calling host thread. Valid device IDs are 0...(hipGetDeviceCount()-1).

Many HIP APIs implicitly use the "default device":

  • Any device memory subsequently allocated from this host thread (using hipMalloc) will be allocated on the default device.
  • Any streams or events created from this host thread will be associated with the default device.
  • Any kernels launched from this host thread (using hipLaunchKernel) will be executed on the default device (unless a specific stream is specified, in which case the device associated with that stream will be used).

This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use hipSetDevice to quickly switch the default device before making a HIP runtime call which uses the default device.

The default device is stored in thread-local storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call hipSetDevice at the start of a HIP coding sequence to establish a known default device.

Returns
hipSuccess, hipErrorInvalidDevice, hipErrorNoDevice
See also
hipGetDevice, hipGetDeviceCount

◆ hipSetDeviceFlags()

hipError_t hipSetDeviceFlags ( unsigned  flags)

Changes the behavior of the current device according to the flags passed.

Parameters
[in] flags - Flags to set on the current device

The schedule flags impact how HIP waits for the completion of a command running on a device.

  • hipDeviceScheduleSpin : The HIP runtime actively spins in the thread that submitted the work until the command completes. This offers the lowest latency, but consumes a CPU core and may increase power.
  • hipDeviceScheduleYield : The HIP runtime yields the CPU to the system so that other tasks can use it. This may increase the latency to detect completion, but consumes less power and is friendlier to other tasks in the system.
  • hipDeviceScheduleBlockingSync : On the ROCm platform, this is a synonym for hipDeviceScheduleYield.
  • hipDeviceScheduleAuto : Uses a heuristic to select between Spin and Yield modes. If the number of HIP contexts is greater than the number of logical processors in the system, Spin scheduling is used. Otherwise, Yield scheduling is used.
  • hipDeviceMapHost : Allows mapping host memory. On ROCm, this is always allowed and the flag is ignored.
  • hipDeviceLmemResizeToMax :

Warning
ROCm silently ignores this flag.
Returns
hipSuccess, hipErrorInvalidDevice, hipErrorSetOnActiveProcess