This page contains proposed changes for a future release of ROCm. Read the latest Linux release of ROCm documentation for your production environments.
rocPRIM 4.0.0 Documentation

Search

Configuring the kernel

template<unsigned int BlockSize, unsigned int ItemsPerThread, unsigned int MaxSharedKeyBytes>
struct search_config : public rocprim::detail::search_config_params

Configuration of device-level search/find_end.

Template Parameters:
  • BlockSize – number of threads in a block.

  • ItemsPerThread – number of items processed by each thread.

  • MaxSharedKeyBytes – maximum number of bytes for which a shared key is used.

search

template<class Config = default_config, class InputIterator1, class InputIterator2, class OutputIterator, class BinaryFunction = rocprim::equal_to<typename std::iterator_traits<InputIterator1>::value_type>>
inline hipError_t rocprim::search(void *temporary_storage, size_t &storage_size, InputIterator1 input, InputIterator2 keys, OutputIterator output, size_t size, size_t keys_size, BinaryFunction compare_function = BinaryFunction(), hipStream_t stream = 0, bool debug_synchronous = false)

Searches for the first occurrence of the sequence.

Searches the input for the first occurrence of the key sequence, according to a particular comparison function. If the sequence is found, the index of its first item in the input is written to output. Otherwise, the size of the input is written to output.
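The result described above can be illustrated with a host-side reference. This is only a sketch built on std::search from the C++ standard library, not rocPRIM code; search_reference is a hypothetical helper name introduced for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical host-side reference (NOT part of rocPRIM): computes the value
// that the documentation says the device-level search produces, namely the
// index of the first occurrence of `keys` in `input`, or input.size() when
// the key sequence does not occur.
// Note: for an empty `keys` this follows std::search and yields 0; the page
// does not state what rocprim::search does in that case.
template <class T, class BinaryFunction = std::equal_to<T>>
std::size_t search_reference(const std::vector<T>& input,
                             const std::vector<T>& keys,
                             BinaryFunction compare_function = BinaryFunction())
{
    // std::search returns input.end() when there is no match, so the
    // iterator difference below equals input.size() in that case.
    auto it = std::search(input.begin(), input.end(),
                          keys.begin(), keys.end(),
                          compare_function);
    return static_cast<std::size_t>(it - input.begin());
}
```

With the input [ 6, 3, 5, 4, 1, 8, 2, 5, 4, 1 ] and the key [ 5, 4, 1 ], this reference yields 2; with a key that never occurs, such as [ 4, 5 ], it yields 10, the size of the input.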

Overview

  • The contents of the inputs are not altered by the function.

  • Returns the required size of temporary_storage in storage_size if temporary_storage is a null pointer.

  • Accepts custom compare_functions for search across the device.

  • Streams in graph capture mode are supported.

Example

In this example a device-level search is performed where input values are represented by an array of unsigned integers and the key is also an array of unsigned integers.

#include <rocprim/rocprim.hpp>

// Prepare input and output (declare pointers, allocate device memory etc.)
size_t size;           // e.g., 10
size_t key_size;       // e.g., 3
unsigned int * input;  // e.g., [ 6, 3, 5, 4, 1, 8, 2, 5, 4, 1 ]
unsigned int * key;    // e.g., [ 5, 4, 1 ]
unsigned int * output; // e.g., empty array of size 1

size_t temporary_storage_size_bytes;
void * temporary_storage_ptr = nullptr;
// Get required size of the temporary storage
rocprim::search(
    temporary_storage_ptr, temporary_storage_size_bytes,
    input, key, output, size, key_size
);

// allocate temporary storage
hipMalloc(&temporary_storage_ptr, temporary_storage_size_bytes);

// perform search
rocprim::search(
    temporary_storage_ptr, temporary_storage_size_bytes,
    input, key, output, size, key_size
);
// output:   [ 2 ]

Template Parameters:
  • Config – [optional] configuration of the primitive, must be default_config or search_config.

  • InputIterator1 – [inferred] random-access iterator type of the input range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.

  • InputIterator2 – [inferred] random-access iterator type of the key range. Must meet the requirements of a C++ InputIterator concept. It can be a simple pointer type.

  • OutputIterator – [inferred] random-access iterator type of the output range. Must meet the requirements of a C++ OutputIterator concept. It can be a simple pointer type.

  • BinaryFunction – [inferred] Type of binary function that accepts two arguments of the value type of InputIterator1 and returns a value convertible to bool. Default type is rocprim::equal_to<> instantiated with the value type of InputIterator1.

Parameters:
  • temporary_storage – [in] pointer to a device-accessible temporary storage. When a null pointer is passed, the required allocation size (in bytes) is written to storage_size and the function returns without performing the search.

  • storage_size – [inout] reference to the size (in bytes) of temporary_storage.

  • input – [in] iterator to the input range.

  • keys – [in] iterator to the key range.

  • output – [out] iterator to the output range. The output is one element.

  • size – [in] number of elements in the input range.

  • keys_size – [in] number of elements in the key range.

  • compare_function – [in] binary operation function object that will be used for comparison. The signature of the function should be equivalent to the following: bool f(const T &a, const T &b);. The signature does not need to have const &, but function object must not modify the objects passed to it. The comparator must meet the C++ named requirement BinaryPredicate. The default value is BinaryFunction().

  • stream – [in] [optional] HIP stream object. Default is 0 (default stream).

  • debug_synchronous – [in] [optional] If true, synchronization after every kernel launch is forced in order to check for errors. Default value is false.

Returns:

hipSuccess (0) after successful search; otherwise a HIP runtime error of type hipError_t.
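As an illustration of the compare_function parameter, the sketch below defines a custom predicate and exercises it on the host with std::search as a stand-in for the device call. The names same_parity and first_match_index are hypothetical and not part of rocPRIM. When such a functor is passed to rocprim::search, its call operator would presumably also have to be callable from device code (for example, marked __host__ __device__ when compiling with hipcc); that qualifier is omitted here so the sketch builds with a plain C++ compiler.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical comparator: two values "match" when they have the same parity.
// It follows the documented shape bool f(const T &a, const T &b) and does not
// modify its arguments.
struct same_parity
{
    bool operator()(const unsigned int& a, const unsigned int& b) const
    {
        return (a % 2u) == (b % 2u);
    }
};

// Host-side stand-in for the device-level search (NOT rocPRIM code): returns
// the index of the first position in `input` where every element matches the
// corresponding key under same_parity, or input.size() if there is none.
std::size_t first_match_index(const std::vector<unsigned int>& input,
                              const std::vector<unsigned int>& keys)
{
    auto it = std::search(input.begin(), input.end(),
                          keys.begin(), keys.end(),
                          same_parity{});
    return static_cast<std::size_t>(it - input.begin());
}
```

For the input [ 6, 3, 5, 4, 1, 8, 2, 5, 4, 1 ] and the key [ 1, 0, 0 ] (odd, even, even), the first matching window is [ 1, 8, 2 ], so the result is 4. A key of [ 0, 0, 0 ] has no match because the input never contains three consecutive even values, so the result is 10.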

© 2025 Advanced Micro Devices, Inc