In some operating systems, such as Linux with the AMD ROCm platform installed, a single program may have multiple threads in the same process, executing on different devices which may have different target architectures. Such a system is termed a heterogeneous system and a program that uses the multiple devices is termed a heterogeneous program.
The multiple devices of a heterogeneous system are termed heterogeneous agents. They can include the following kinds of devices: CPU (Central Processing Unit), GPU (Graphics Processing Unit), DSP (Digital Signal Processor), FPGA (Field Programmable Gate Array), as well as other specialized hardware.
The device of a heterogeneous system that starts the execution of the program is termed the heterogeneous host agent.
The precise way threads are created on different heterogeneous agents may vary from one heterogeneous system to another, but in general the threads behave similarly no matter what heterogeneous agent is executing them, except that the target architecture may be different.
A heterogeneous program can create heterogeneous queues associated with a heterogeneous agent. The heterogeneous program can then place heterogeneous packets on a heterogeneous queue to control the actions of the associated heterogeneous agent. A heterogeneous agent removes heterogeneous packets from the heterogeneous queues associated with it and performs the requested actions. The packet actions and the scheduling of packet processing vary depending on the heterogeneous system and the target architecture of the heterogeneous agent. See Architectures.
A heterogeneous dispatch packet is used to initiate code execution on a heterogeneous agent. A single heterogeneous dispatch packet may specify that the heterogeneous agent create a set of threads that are all associated with a corresponding heterogeneous dispatch. Each thread typically has an associated position within the heterogeneous dispatch, possibly expressed as a multi-dimensional grid position. The heterogeneous agent typically can create multiple threads that execute concurrently. If a heterogeneous dispatch is larger than the number of concurrent threads that can be created, the heterogeneous agent creates threads of the heterogeneous dispatch as other threads complete. When all the threads of a heterogeneous dispatch have been created and have completed, the heterogeneous dispatch is considered complete.
The threads of a heterogeneous dispatch may be grouped into heterogeneous work-groups. The threads that belong to the same heterogeneous work-group may have special shared memory, and efficient execution synchronization abilities. A thread that is part of a heterogeneous work-group typically has an associated position within the heterogeneous work-group, possibly also expressed as a multi-dimensional grid position.
Other heterogeneous packets may control heterogeneous packet scheduling, memory visibility between the threads of a heterogeneous dispatch and other threads, or other services supported by the heterogeneous system.
On some heterogeneous systems there can be heterogeneous agents that support SIMD (Single Instruction Multiple Data) or SIMT (Single Instruction Multiple Threads) machine instructions. On these target architectures, a single machine instruction can operate in parallel on multiple heterogeneous lanes.
When a heterogeneous lane is not associated with any work-group, it is said to be unused. For example, lanes are left over when the work-group size is not a multiple of the number of lanes in a SIMD/SIMT thread, or when the grid size is not a multiple of the work-group size (resulting in partial work-groups on the dimension edges of the grid). ROCGDB hides unused lanes by default.
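The arithmetic behind left-over lanes can be sketched as follows. This is only an illustration: the function name lane_usage is hypothetical, and the sketch assumes that the work-items of a work-group are packed contiguously into threads, which may not hold on every target architecture.

```python
def lane_usage(workgroup_size, lanes_per_thread):
    """Return a (used, unused) lane count pair for each thread of one work-group.

    Assumption: work-items fill threads contiguously, so only the last
    thread of the work-group can have unused lanes.
    """
    usage = []
    remaining = workgroup_size
    while remaining > 0:
        used = min(remaining, lanes_per_thread)
        usage.append((used, lanes_per_thread - used))
        remaining -= used
    return usage


# A work-group of 100 work-items on hypothetical 64-lane threads needs two
# threads; the second has 36 used lanes and 28 unused lanes.
print(lane_usage(100, 64))  # [(64, 0), (36, 28)]
```

Under these assumptions, the 28 left-over lanes of the second thread are the ones that would be hidden by default.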
Source languages used by heterogeneous programs can be implemented on target architectures that support multiple heterogeneous lanes by mapping a source language thread of execution onto a heterogeneous lane of a single target architecture thread. Control flow in the source language may be implemented by controlling which heterogeneous lanes are active. If the source language control flow may result in some heterogeneous lanes becoming inactive while some remain active, the control flow is said to be divergent. Typically, the machine code may execute different divergent paths for different sets of heterogeneous lanes, before the control flow reconverges and all heterogeneous lanes become active.
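The effect of divergent control flow on lane activity can be modeled with a small simulation. The sketch below is a simplified model, not a description of any particular target architecture: the name run_if_else and the use of Python lists as lane masks are assumptions made for illustration.

```python
def run_if_else(values, cond, then_fn, else_fn):
    """Model one multi-lane thread executing `if cond: then_fn else: else_fn`.

    Each element of `values` is the per-lane value of a variable.
    Returns the new per-lane values and whether the branch diverged.
    """
    then_mask = [bool(cond(v)) for v in values]   # lanes active on the 'then' path
    else_mask = [not m for m in then_mask]        # lanes active on the 'else' path
    result = list(values)
    for mask, fn in ((then_mask, then_fn), (else_mask, else_fn)):
        if any(mask):  # a path with no active lanes can be skipped entirely
            # Inactive lanes keep their value: the instruction has no effect in them.
            result = [fn(v) if active else v for v, active in zip(result, mask)]
    # The branch is divergent if some lanes took each path.
    divergent = any(then_mask) and any(else_mask)
    return result, divergent


# Even lanes take the 'then' path, odd lanes the 'else' path.
print(run_if_else([0, 1, 2, 3], lambda v: v % 2 == 0,
                  lambda v: v * 10, lambda v: v + 1))
# ([0, 2, 20, 4], True)
```

After both paths have run, all lanes are active again, which corresponds to the reconvergence point described above.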
Just because a target architecture supports multiple lanes, does not mean that the source language is mapped to use them to implement source language threads of execution. Therefore, a thread is only considered to have multiple heterogeneous lanes if it’s current frame corresponds to a source language that does do such a mapping.
On some heterogeneous systems there can be heterogeneous agents with target architectures that support multiple address spaces. In these target architectures, there may be memory that is physically disjoint from regular global virtual memory. There can also be cases when the same underlying memory can be accessed using linear addresses that map to the underlying physical memory in an interleaved manner. In these target architectures there can be distinct machine instructions to access the distinct address spaces. For example, there may be physical hardware scratch pad memory that is allocated and accessible only to the threads that are associated with the same heterogeneous work-group. There may be hardware address swizzle logic that allows regular global virtual memory to be allocated per heterogeneous lane such that they have a linear address view, which in fact maps to an interleaved global virtual memory access to improve cache performance.
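To make the idea of a swizzled per-lane address space concrete, the following sketch maps a linear per-lane offset to an interleaved offset in the underlying memory. The interleave scheme (4-byte elements interleaved across 64 lanes) and the name swizzle are assumptions chosen for illustration; the real mapping depends on the target architecture.

```python
def swizzle(lane, offset, lanes=64, elem=4):
    """Map a per-lane linear byte offset to an interleaved backing-store offset.

    Assumed scheme: consecutive `elem`-byte elements of one lane are placed
    `lanes * elem` bytes apart, so that element N of every lane is adjacent
    in the underlying memory.
    """
    element, byte = divmod(offset, elem)
    return element * lanes * elem + lane * elem + byte


print(swizzle(lane=0, offset=0))  # 0
print(swizzle(lane=1, offset=0))  # 4: element 0 of lane 1 follows element 0 of lane 0
print(swizzle(lane=0, offset=4))  # 256: element 1 of lane 0 starts a new row
print(swizzle(lane=2, offset=5))  # 265
```

With such a layout, lanes that access the same linear offset at the same time touch adjacent underlying memory, which is the cache benefit mentioned above.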
ROCGDB provides these facilities for debugging heterogeneous programs:
info sharedlibrary command supports code objects for multiple architectures
show architecture, x/i, disassemble commands to disassemble multiple architectures in the same inferior
info threads, thread commands support threads executing on multiple heterogeneous agents
info agents, info queues, info packets, info dispatches commands to inquire about the heterogeneous system
queue find, dispatch find commands to find heterogeneous entities
info lanes, lane commands support source language threads of execution that are mapped to SIMD-like lanes of a thread
$_thread_find, $_thread_find_first_gid, $_lane_find, $_lane_find_first_gid debugger convenience functions can find threads and heterogeneous lanes associated with specific heterogeneous entities
maint print address-spaces command together with address qualifiers supports multiple address spaces
A heterogeneous system may use separate code objects for the different target architectures of the heterogeneous agents. The info sharedlibrary command lists all the code objects currently loaded, regardless of their target architecture.
The following rules apply in determining the target architecture used by commands when debugging heterogeneous programs:
The set architecture command (see Specifying a Debugging Target) can be used to change this target architecture.
The target architecture of other heterogeneous agents is typically the target architecture of the associated device.
ROCGDB handles the heterogeneous agent, queue, and dispatch entities in a similar manner to threads (see Debugging Programs with Multiple Threads):
Each lane of a thread is assigned a single number, also known as lane ID. This is a single integer, which is an index number starting at 0, going up to the number of hardware-supported lanes per thread. This ID is unique per thread, not global, and thus multiple threads can have lanes with the same ID.
Some commands accept a space-separated lane ID list as argument. A list element can be:
The following debugger convenience variables (see Convenience Variables) are related to heterogeneous debugging. You may find these useful in writing breakpoint conditional expressions, command scripts, and so forth.
$_thread
$_gthread
$_thread_systag
$_thread_name
$_agent
$_queue
$_dispatch
There are debugger convenience variables that contain the number of each heterogeneous entity associated with the current thread if it was created by a heterogeneous dispatch, or 0 otherwise. $_agent, $_queue, and $_dispatch contain the corresponding per-inferior heterogeneous entity number.
$_lane
The heterogeneous lane number of the current lane of the current thread. If the current thread does not support SIMD/SIMT lanes, it is treated as if it has a single lane.
$_dispatch_pos
The heterogeneous dispatch position string of the current thread within its associated heterogeneous dispatch if it was created by a heterogeneous dispatch, or the empty string otherwise. The format varies depending on the heterogeneous system and target architecture of the heterogeneous agent. See Architectures.
$_lane_name
The heterogeneous lane name string of the current heterogeneous lane, or the empty string if no name has been assigned by the lane name command.
$_thread_workgroup_pos
$_lane_workgroup_pos
The heterogeneous work-group position string of the current thread or heterogeneous lane within its associated heterogeneous dispatch if it was created by a heterogeneous dispatch, or the empty string otherwise. The format varies depending on the heterogeneous system and target architecture of the heterogeneous agent. See Architectures.
$_lane_systag
The target system’s heterogeneous lane identifier (lane_systag) string of the current heterogeneous lane. See target system lane identifier.
$_agent_systag
The target system’s heterogeneous agent identifier (agent_systag) string of the heterogeneous agent of the current thread.
$_queue_systag
The target system’s heterogeneous queue identifier (queue_systag) string of the heterogeneous queue of the current thread.
$_dispatch_systag
The target system’s heterogeneous dispatch identifier (dispatch_systag) string of the heterogeneous dispatch of the current thread.
The following debugger convenience functions (see Convenience Functions) are related to heterogeneous debugging. Given the very large number of threads on heterogeneous systems, these may be very useful. They allow threads or thread lists to be specified based on the target system’s thread identifier (systag) or thread name, and allow heterogeneous lanes or heterogeneous lane lists to be specified based on the target system’s heterogeneous lane identifier (lane_systag) or heterogeneous lane name.
$_thread_find
$_thread_find_first_gid
$_lane_find(regex)
Searches for heterogeneous lanes whose name or lane_systag matches the supplied regular expression. The syntax of the regular expression is that specified by Python’s regular expression support.
Returns a string that is the space-separated list of per-inferior heterogeneous lane numbers of the found heterogeneous lanes. If debugging multiple inferiors, the heterogeneous lane numbers are qualified with the inferior number. If no heterogeneous lanes are found, the empty string is returned. The string can be used in commands that accept a heterogeneous lane ID list. See heterogeneous entity ID list.
For example, the following command lists all heterogeneous lanes that are part of a heterogeneous work-group with work-group position ‘(1,2,3)’ (see Debugging Heterogeneous Programs):
(gdb) info lanes $_lane_find ("(1,2,3)")
$_lane_find_first_gid(regex)
Similar to the $_lane_find convenience function, except it returns a number that is the global heterogeneous lane number of one of the heterogeneous lanes found, or 0 if no heterogeneous lanes were found. The number can be used in commands that accept a global heterogeneous lane number. See global heterogeneous entity numbers.
For example, the following command sets the current heterogeneous lane to one of the heterogeneous lanes that are part of a heterogeneous work-group with work-item position ‘[1,2,3]’:
(gdb) lane -gid $_lane_find_first_gid ("[1,2,3]")
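Because the pattern syntax is that of Python’s regular expressions, parentheses and square brackets in a position string are metacharacters unless they are escaped. The sketch below uses Python’s re module directly to show the difference. The lane_systag strings are made up in the style of the examples in this section, the helper name find is hypothetical, and the sketch assumes the pattern is applied as an unanchored search, which this section does not state.

```python
import re

# Hypothetical lane_systag strings, numbered from 1.
systags = [
    "AMDGPU Lane 1:1:1:1/0 (1,2,3)[0,0,0]",
    "AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,2,3]",
    "AMDGPU Lane 1:1:1:2/0 (11,2,3)[0,0,0]",
]


def find(pattern):
    """Return the numbers of the systags that the pattern matches anywhere."""
    return [n for n, tag in enumerate(systags, start=1) if re.search(pattern, tag)]


# Unescaped parentheses form a group, so this matches the bare text "1,2,3".
print(find("(1,2,3)"))      # [1, 2, 3]
# Unescaped brackets form a character class matching any one of "1", ",", "2", "3".
print(find("[1,2,3]"))      # [1, 2, 3]
# Escaped delimiters match the literal position strings.
print(find(r"\(1,2,3\)"))   # [1]
print(find(r"\[1,2,3\]"))   # [2]
```

If the debugger treats the pattern the same way, escaping the delimiters in the convenience function argument may be needed to restrict the match to a single work-group or work-item position.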
The following commands are related to heterogeneous debugging:
info agents [agent-id-list]
The info agents command lists the following information for each heterogeneous agent (in this order):
An asterisk ‘*’ to the left of the ROCGDB heterogeneous agent number indicates the heterogeneous agent executing the current thread.
Some heterogeneous agents may not be listed until the inferior has started execution of the program.
With no arguments, the command displays information about all heterogeneous agents. You can specify the list of heterogeneous agents that you want to display using the heterogeneous entity ID list syntax (see heterogeneous entity ID list).
For example,
(gdb) info agents
  Id State Target Id                  Architecture Device Name Cores Threads Location
* 1  A     AMDGPU Agent (GPUID 45151) gfx906       vega20      240   2400    0a:00.0
  2  A     AMDGPU Agent (GPUID 39113) gfx906       vega20      240   2400    44:00.0
If you’re debugging multiple inferiors, ROCGDB displays heterogeneous agent IDs using the qualified inferior-num.agent-num format. Otherwise, only agent-num is shown.
info queues [queue-id-list]
The info queues command lists the following information for each heterogeneous queue (in this order):
An asterisk ‘*’ to the left of the ROCGDB heterogeneous queue number indicates the heterogeneous queue executing the current thread.
With no arguments, the command displays information about all heterogeneous queues. You can specify the list of heterogeneous queues that you want to display using the heterogeneous entity ID list syntax (see heterogeneous entity ID list).
For example,
(gdb) info queues
  Id Target Id                Type        Read Write Size    Address
* 1  AMDGPU Queue 1:1 (QID 1) HSA (Multi) 0    2     65536   0x00007ffff7f60000
  2  AMDGPU Queue 1:2 (QID 2) DMA                    1048576 0x00007ffde4e00000
  3  AMDGPU Queue 1:3 (QID 0) HSA (Multi) 4    4     262144  0x00007ffff7f00000
If you’re debugging multiple inferiors, ROCGDB displays heterogeneous queue IDs using the qualified inferior-num.queue-num format. Otherwise, only queue-num is shown.
info dispatches [-full] [dispatch-id-list]
The info dispatches command lists the following information for each heterogeneous dispatch (in this order):
An asterisk ‘*’ to the left of the ROCGDB heterogeneous dispatch number indicates the heterogeneous dispatch executing the current thread.
With no arguments, the command displays information about all heterogeneous dispatches. You can specify the list of heterogeneous dispatches that you want to display using the heterogeneous entity ID list syntax (see heterogeneous entity ID list).
For example,
(gdb) info dispatches -full
  Id Target Id                      Grid      Workgroup Fence Address Spaces          Kernel Descriptor  Kernel Args        Completion Signal Kernel Function
* 1  AMDGPU Dispatch 1:1:1 (PKID 0) [256,1,1] [128,1,1] B|As  Shared(0), Private(220) 0x00007ffde5409800 0x00007ffff7e00000 (nil)             bit_extract_kernel(unsigned int*, unsigned int const*, unsigned long)
If you’re debugging multiple inferiors, ROCGDB displays heterogeneous dispatch IDs using the qualified inferior-num.dispatch-num format. Otherwise, only dispatch-num is shown.
queue find regexp
dispatch find regexp
These commands operate the same way as the ‘thread find’ command (see ‘thread find’) except that they use the target system’s heterogeneous agent, queue, and dispatch identifiers respectively.
info packets [queue-id-list]
Display information about the heterogeneous packets on one or more heterogeneous queues. With no arguments, the command displays information about all heterogeneous queues. You can specify the list of heterogeneous queues that you want to display using the heterogeneous queue ID list syntax (see heterogeneous entity ID list).
Since heterogeneous agents may be processing heterogeneous packets asynchronously, the display is at best a snapshot, and may be inconsistent due to the heterogeneous queues being updated while they are being inspected.
The heterogeneous packets are listed contiguously for each heterogeneous agent, and for each heterogeneous queue of that heterogeneous agent, with the oldest packet first.
ROCGDB displays for each heterogeneous packet (in this order):
info threads [-gid] [thread-id-list]
The info threads command (see Debugging Programs with Multiple Threads) lists the threads created on all the heterogeneous agents.
If any of the threads listed have multiple heterogeneous lanes, then an additional Lanes column is displayed before the target system’s thread identifier (systag) column. For threads that have multiple heterogeneous lanes, the number of heterogeneous lanes that are active followed by a slash and the total number of heterogeneous lanes of the current frame of the thread is displayed. Otherwise, nothing is displayed.
The target system’s thread identifier (systag) (see target system thread identifier) for threads associated with heterogeneous dispatches varies depending on the heterogeneous system and target architecture of the heterogeneous agent. However, it typically will include information about the heterogeneous agent, heterogeneous queue, heterogeneous dispatch, heterogeneous work-group position within the heterogeneous dispatch, and thread position within the heterogeneous work-group. See Architectures.
The stack frame summary displayed is for the active lanes of the thread. This may differ from the stack frame information for the current lane if the focus is on an inactive lane. Use the info lanes command for information about individual lanes of a thread. See Debugging Programs with Multiple Threads.
For example,
(gdb) info threads
  Id Lanes Target Id                                       Frame
  1        Thread 0x7ffff7fc4cc0 (LWP 74764) "bit_extract" 0x00007ffff6b56f37 in sched_yield () from /lib/x86_64-linux-gnu/libc.so.6
  2        Thread 0x7ffff59cb700 (LWP 74773) "bit_extract" 0x00007ffff6b696d7 in ioctl () from /lib/x86_64-linux-gnu/libc.so.6
  4        Thread 0x7ffff7fc1700 (LWP 74775) "bit_extract" 0x00007ffff6b696d7 in ioctl () from /lib/x86_64-linux-gnu/libc.so.6
* 5  62/64 AMDGPU Wave 1:1:1:1 (0,0,0)/0 "bit_extract"     bit_extract_kernel (C_d=0x7ffde8800000, A_d=0x7ffde8e00000, N=4000000) at bit_extract.cpp:38
  6  2/64  AMDGPU Wave 1:1:1:2 (0,0,0)/1 "bit_extract"     __hip_get_block_dim_x () at /opt/rocm-3.8.0-3471/hip/include/hip/hcc_detail/hip_runtime.h:462
  7  64/64 AMDGPU Wave 1:1:1:3 (1,0,0)/0 "bit_extract"     __hip_get_block_dim_x () at /opt/rocm-3.8.0-3471/hip/include/hip/hcc_detail/hip_runtime.h:462
  8  8/64  AMDGPU Wave 1:1:1:4 (1,0,0)/1 "bit_extract"     __hip_get_block_dim_x () at /opt/rocm-3.8.0-3471/hip/include/hip/hcc_detail/hip_runtime.h:462
thread [-gid] thread-id [lane-index]
The thread command has an optional lane-index argument to specify the heterogeneous lane index. If the value is not between 1 and the number of heterogeneous lanes of the current frame of the thread, then ROCGDB will print an error. If omitted, it defaults to 1.
The current thread is set to thread-id and the current heterogeneous lane is set to the heterogeneous lane corresponding to the specified heterogeneous lane index.
If the thread has multiple heterogeneous lanes, ROCGDB responds by displaying the system identifier of the heterogeneous lane you selected, otherwise it responds with the system identifier of the thread you selected, followed by its current stack frame summary.
thread apply [thread-id-list | all [-ascending]] [flag]… command
taas [option]… command
tfaas [option]… command
thread name [name]
These commands operate the same way for all threads, regardless of whether or not the thread is associated with a heterogeneous dispatch.
If the thread’s frame has multiple heterogeneous lanes, then heterogeneous lane index 1 is used. Use the heterogeneous lane counterpart commands to perform the command on each lane of a thread.
thread find regexp
In addition to searching thread information, if a thread’s frame has multiple heterogeneous lanes then the command also searches the thread’s lanes for those whose name or systag matches the supplied regular expression.
info lanes [-all | -active | -inactive] [lane-id-list]
Display information about one or more heterogeneous lanes of the current thread. With no arguments, the command displays information about all used lanes of the current thread. You can specify the list of lanes that you want to display using the lane ID list syntax (see lane ID list).
By default, ROCGDB lists all the used lanes of the thread; lanes which are not associated with any work-group (see unused heterogeneous lane) are not displayed. The following options can be used to fine-tune this behavior:
-all
All lanes, including active, inactive and unused lanes.
-active
Only active lanes.
-inactive
Only used inactive lanes.
ROCGDB displays for each heterogeneous lane (in this order):
lane name
, below).
An asterisk ‘*’ to the left of the ROCGDB heterogeneous lane number indicates the current heterogeneous lane.
For example,
(gdb) info lanes 1-2
  Id State Target Id                               Frame
* 1  A     AMDGPU Lane 1:2:3:463/2 (2,3,4)[1,2,4]  0x34e5 in saxpy ()
  2  I     AMDGPU Lane 1:2:4:456/12 (2,4,4)[1,2,3] (inactive)
lane lane-id
Make lane ID lane-id the current lane of the current thread. The command argument lane-id is the ROCGDB per-thread lane ID (see lane ID), as shown in the first field of the info lanes display.
ROCGDB responds by displaying the system identifier of the heterogeneous lane you selected, and its current stack frame summary:
(gdb) lane 2
[Switching to thread 5, lane 2 (AMDGPU Lane 1:1:1:1/2 (0,0,0)[2,0,0])]
#0  some_function (ignore=0x0) at example.c:8
8       printf ("hello\n");
As with the ‘[New …]’ message, the form of the text in parentheses after ‘Switching to’ depends on your system’s conventions for identifying heterogeneous lanes.
If you switch to an inactive lane, ROCGDB displays a warning to make you aware that local variables, function arguments, and so on will have meaningless values. For example:
(gdb) lane 1
[Switching to thread 6, lane 1 (AMDGPU Lane 1:1:1:2/1 (0,0,0)[65,0,0])]
warning: Current lane is inactive.
#0  some_function (ignore=0x0) at example.c:8
8       printf ("hello\n");
See description of how execution commands and heterogeneous debugging interact, for what happens if you run a stepping command with an inactive lane selected.
lane name [name]
This command assigns a name to the current heterogeneous lane. If no argument is given, any existing user-specified name is removed. The heterogeneous lane name appears in the info lanes display.
lane find [regexp]
Search for and display heterogeneous lane IDs whose name or lane_systag matches the supplied regular expression. The syntax of the regular expression is that specified by Python’s regular expression support.
As well as being the complement to the lane name command, this command also allows you to identify a heterogeneous lane by its target lane_systag. For instance, on the AMD ROCm platform, the target lane_systag is the heterogeneous agent, heterogeneous queue, heterogeneous dispatch, heterogeneous work-group position and heterogeneous work-item position.
(gdb) lane find "work-group(2,3,4)"
Lane 2 has lane id 'ROCm process 35 agent 1 queue 2 dispatch 3 work-group(2,3,4) work-item(1,2,4)'
(gdb) info lane 2
  Id Thread Active Target Id                              Frame
  2  5/2    Y      AMDGPU Lane 1:2:3:324/2 (2,3,4)[1,2,4] 0x34e5 in saxpy ()
lane apply [lane-id-list | all] [-all | -active | -inactive] [flag]… command
laas [option]… command
lfaas [option]… command
The lane apply, laas, and lfaas commands are similar to their thread counterparts thread apply, taas, and tfaas respectively, except they operate on heterogeneous lanes. See Debugging Programs with Multiple Threads.
By default, ROCGDB operates on all the used lanes of the thread. The following flags can be used to fine-tune this behavior:
-all
All lanes, including active, inactive and unused lanes.
-active
Only active lanes.
-inactive
Only used inactive lanes.
The flag arguments control what output to produce and how to handle errors raised when applying command to a lane. See the info threads command (see Debugging Programs with Multiple Threads) for available flags and their meaning.
backtrace [option]… [qualifier]… [count]
frame [ frame-selection-spec ]
frame apply [all | count | -count | level level…] [option]… command
select-frame [ frame-selection-spec ]
up-silently n
down-silently n
info frame
info args [-q] [-t type_regexp] [regexp]
info locals [-q] [-t type_regexp] [regexp]
faas command
The frame commands apply to the current heterogeneous lane.
If the frame is switched from one that has multiple heterogeneous lanes to one with fewer (including only one) then the current lane is switched to the heterogeneous lane corresponding to the highest heterogeneous lane index of the new frame and ROCGDB responds by displaying the system identifier of the heterogeneous lane selected.
See Examining the Stack.
set libthread-db-search-path
show libthread-db-search-path
set debug libthread-db
show debug libthread-db
These commands only apply to threads created on the heterogeneous host agent that are not associated with a heterogeneous dispatch. There are no commands that support reporting of heterogeneous dispatch thread events.
x/i
display/i
The x/i and display/i commands (see Examining Memory) can be used to disassemble machine instructions. They use the current target architecture.
disassemble
The disassemble command (see Source and Machine Code) can also be used to disassemble machine instructions. If the start address of the range is within a loaded code object, then the target architecture of the code object is used. Otherwise, the current target architecture is used.
info registers
info all-registers
maint print reggroups
The register commands display information about the current architecture.
print
The print command evaluates the source language expression in the context of the current heterogeneous lane.
step
next
finish
until
stepi
nexti
Execution commands such as step, next, finish and until (see Continuing and Stepping) automatically skip over code regions where the current heterogeneous lane becomes inactive. Similarly, if the current heterogeneous lane is inactive when the execution command is entered, ROCGDB advances the thread until the current lane becomes active.
At the hardware level, all lanes of a thread execute instructions in lock-step, though some lanes may be set inactive (as a result of special instructions emitted by the compiler) so that instructions have no effect in them. Thus, stepping a lane may cause other active heterogeneous lanes of the same thread to also execute code. This may even result in other heterogeneous lanes completing whole functions.
If the current heterogeneous lane is set to an inactive heterogeneous lane, then the stepi and nexti commands (see Continuing and Stepping) may not cause the source position to appear to move until execution reaches a point that makes the current heterogeneous lane active. However, other heterogeneous lanes of the same thread will advance.
break [-lane lane-index] [location] [if cond]
tbreak [-lane lane-index] [location] [if cond]
hbreak [-lane lane-index] [location] [if cond]
thbreak [-lane lane-index] [location] [if cond]
rbreak [-lane lane-index] regex
info breakpoints [list…]
watch [-lane lane-index] [-l|-location] expr [thread thread-id] [mask maskvalue]
rwatch [-lane lane-index] [-l|-location] expr [thread thread-id] [mask maskvalue]
awatch [-lane lane-index] [-l|-location] expr [thread thread-id] [mask maskvalue]
info watchpoints [list…]
catch [-lane lane-index] event
tcatch [-lane lane-index] event
When a breakpoint, watchpoint, or catchpoint (see Breakpoints; Watchpoints; and Catchpoints) is hit by a frame of a thread with multiple heterogeneous lanes:
lane apply
command.
(gdb) c
Continuing.
[Switching to thread 5, lane 1 (AMDGPU Lane 1:1:1:1/1 (0,0,0)[1,0,0])]
Thread 5 hit Breakpoint 2, with lanes [1-10 13], func () at example.cpp:48
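When post-processing such a stop message in a script, the bracketed lane list can be expanded into individual lane indexes. The sketch below is an assumption-laden illustration: the helper name expand_lanes is hypothetical, and the list format (space-separated single indexes and inclusive first-last ranges) is inferred only from the example above.

```python
def expand_lanes(text):
    """Expand a lane list such as "1-10 13" into a list of lane indexes.

    Assumed format: space-separated items, each either a single index or
    an inclusive range written as first-last.
    """
    lanes = []
    for item in text.split():
        first, sep, last = item.partition("-")
        start = int(first)
        end = int(last) if sep else start
        lanes.extend(range(start, end + 1))
    return lanes


print(expand_lanes("1-10 13"))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 13]
```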
If a heterogeneous lane causes a thread to halt, then the other heterogeneous lanes of the thread will no longer execute even if in non-stop mode.
For break, watch, catch, and their variants, the -lane lane-index option can be specified. This limits ROCGDB to only process breakpoints if the heterogeneous lane has a heterogeneous lane index that matches lane-index.
If any breakpoint has a lane-index specified, the info break and info watch commands add a Lane column before the Address column to display the heterogeneous lane index.
maint set lane-divergence-support
maint show lane-divergence-support
Control whether or not ROCGDB understands lane divergence.
When set to ‘on’, which is the default, execution commands such as step, next, finish and until automatically skip over code regions where the selected lane is inactive, and ROCGDB ignores breakpoint hits that trigger with all lanes inactive.
When set to ‘off’, execution commands ignore lane active/inactive state, and may thus present stops in code regions where the selected lane is inactive. Also, breakpoint hits are reported even if all lanes are inactive.
maint print address-spaces
The maint print address-spaces command displays the address space names supported by the current architecture.
The address spaces info looks like this:
(gdb) maint print address-spaces
Name
global
generic
local
private_lane
private_wave
Any expression that evaluates to an integral value can be used as an address. It can optionally specify an address space qualifier by prepending an address space name followed by a ‘#’. ROCGDB will print an error if the address space name is not supported by the current architecture. If no address space is specified, the current architecture’s default address space (generally the global address space) is used.
The same syntax is used when an address that is not in the current architecture’s default address space is displayed.
For example,
(gdb) x/x local#0x10021608
local#0x10021608:       0x0022fd98
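The qualifier rule described above can be sketched as a small parser, which may be useful when scripting around debugger output. This is not ROCGDB’s implementation: the name parse_address is hypothetical, the set of address space names is copied from the example listing and differs between architectures, and the sketch accepts only integer literals, whereas ROCGDB accepts any integral expression.

```python
# Address space names taken from the example listing above; the real set
# depends on the current architecture.
KNOWN_SPACES = {"global", "generic", "local", "private_lane", "private_wave"}


def parse_address(text, default_space="global"):
    """Split "space#value" into (space, integer address).

    An address without a qualifier is assumed to be in `default_space`.
    """
    space, sep, value = text.rpartition("#")
    if not sep:
        space = default_space
    elif space not in KNOWN_SPACES:
        raise ValueError(f"unsupported address space: {space!r}")
    return space, int(value, 0)  # base 0 accepts 0x..., 0o..., and decimal


print(parse_address("local#0x10021608"))  # ('local', 268572168)
print(parse_address("0x10"))              # ('global', 16)
```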
Heterogeneous systems often have very large numbers of threads. Breakpoint conditions can be used to limit the number of threads reporting breakpoint hits. For example,
break kernel_foo if $_streq($_lane_workgroup_pos, "(0,0,0)")
The tbreak command can be used so only one heterogeneous lane will report the breakpoint. Before continuing execution, the breakpoint will need to be set again if necessary.
The set scheduler-locking on command (see Non-Stop Mode) together with the -lane breakpoint option can be used to lock ROCGDB to only resume the current thread, and only report breakpoints for a fixed heterogeneous lane index. This avoids the overhead of resuming a large number of threads every time execution resumes from a breakpoint, and also avoids the focus being switched to other threads that hit the breakpoints. Note however that other threads will not be executed.
The scheduler locking commands can also be helpful to prevent ROCGDB switching to other threads while concentrating on debugging one particular thread. Non-stop mode can be helpful to prevent the continue command from resuming other threads that are intentionally halted, or from cancelling a single step command that is in progress by another thread and resuming it instead. See Non-Stop Mode.