API

This section lists the library API in detail.

Host Utility Functions#

template<typename DataType>
void rocalution::allocate_host(int64_t n, DataType **ptr)#

Allocate buffer on the host.

allocate_host allocates a buffer on the host.

Parameters:
  • n[in] number of elements for which the buffer is allocated

  • ptr[out] pointer to the position in memory where the buffer should be allocated; *ptr is expected to be NULL on entry

Template Parameters:

DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.

template<typename DataType>
void rocalution::free_host(DataType **ptr)#

Free buffer on the host.

free_host deallocates a buffer on the host. *ptr will be set to NULL after successful deallocation.

Parameters:

ptr[inout] pointer to the position in memory where the buffer should be deallocated; *ptr is expected to be non-NULL on entry

Template Parameters:

DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.

template<typename DataType>
void rocalution::set_to_zero_host(int64_t n, DataType *ptr)#

Set a host buffer to zero.

set_to_zero_host sets a host buffer to zero.

Parameters:
  • n[in] number of elements

  • ptr[inout] pointer to the host buffer

Template Parameters:

DataType – can be char, int, unsigned int, float, double, std::complex<float> or std::complex<double>.
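
The contract shared by the three helpers above (a pointer-to-pointer that is expected to be NULL before allocation and is reset to NULL after deallocation) can be illustrated with a minimal stand-alone sketch. The names sketch_allocate, sketch_set_to_zero and sketch_free are hypothetical and rely on plain new[]/delete[]; they only mimic the documented behaviour and are not the library's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for allocate_host: *ptr must be empty on entry.
template <typename DataType>
void sketch_allocate(int64_t n, DataType** ptr)
{
    assert(ptr != nullptr && *ptr == nullptr);
    if(n > 0)
    {
        *ptr = new DataType[n];
    }
}

// Hypothetical stand-in for set_to_zero_host.
template <typename DataType>
void sketch_set_to_zero(int64_t n, DataType* ptr)
{
    for(int64_t i = 0; i < n; ++i)
    {
        ptr[i] = DataType(0);
    }
}

// Hypothetical stand-in for free_host: *ptr is reset to NULL afterwards.
template <typename DataType>
void sketch_free(DataType** ptr)
{
    assert(ptr != nullptr && *ptr != nullptr);
    delete[] *ptr;
    *ptr = nullptr;
}
```

A caller would start from a null pointer, pass its address to the allocation helper, and can rely on the pointer being null again after the free helper returns.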

double rocalution::rocalution_time(void)#

Return current time in microseconds.
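
Because the returned value is in microseconds, an elapsed time in seconds is the difference of two readings divided by 1e6. The sketch below shows that arithmetic with a hypothetical stand-in, sketch_time_us, built on std::chrono::steady_clock; the clock used by rocalution_time itself is not specified in this section.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in: a monotonic time stamp in microseconds.
double sketch_time_us()
{
    auto since_epoch = std::chrono::steady_clock::now().time_since_epoch();
    return std::chrono::duration<double, std::micro>(since_epoch).count();
}

// Convert two microsecond readings into elapsed seconds.
double elapsed_seconds(double tick_us, double tack_us)
{
    assert(tack_us >= tick_us);
    return (tack_us - tick_us) / 1e6;
}
```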

Backend Manager#

int rocalution::init_rocalution(int rank = -1, int dev_per_node = 1)#

Initialize rocALUTION platform.

init_rocalution defines a backend descriptor with information about the hardware and its specifications. All objects created after that contain a copy of this descriptor. If the specifications of the global descriptor are changed (e.g. a different number of threads is set) and new objects are created, only the new objects use the new configuration.

For control, the library provides the functions documented in the rest of this section, such as set_device_rocalution and set_omp_threads_rocalution.

Example
#include <rocalution/rocalution.hpp>

using namespace rocalution;

int main(int argc, char* argv[])
{
    init_rocalution();

    // ...

    stop_rocalution();

    return 0;
}

Parameters:
  • rank[in] specifies the MPI rank when running in a multi-node environment

  • dev_per_node[in] number of accelerator devices per node when running in a multi-GPU environment

int rocalution::stop_rocalution(void)#

Shutdown rocALUTION platform.

stop_rocalution shuts down the rocALUTION platform.

void rocalution::set_device_rocalution(int dev)#

Set the accelerator device.

set_device_rocalution lets the user select the accelerator device that is supposed to be used for the computation.

Parameters:

dev[in] accelerator device ID for computation

void rocalution::set_omp_threads_rocalution(int nthreads)#

Set number of OpenMP threads.

The number of threads which rocALUTION will use can be set with set_omp_threads_rocalution or by the global OpenMP environment variable (for Unix-like OS this is OMP_NUM_THREADS). During the initialization phase, the library provides affinity thread-core mapping:

  • If the number of cores (including SMT cores) is greater than or equal to two times the number of threads, then all the threads can occupy every second core ID (e.g. 0, 2, 4, \(\ldots\)). This avoids having two threads working on the same physical core when SMT is enabled.

  • If the number of threads is less than or equal to the number of cores (including SMT cores), and the previous clause is false, then the threads can occupy every core ID (e.g. 0, 1, 2, 3, \(\ldots\)).

  • If none of the above criteria is met, then the default thread-core mapping is used (typically set by the OS).

Note

The thread-core mapping is available only for Unix-like OS.

Note

The user can disable the thread affinity by calling set_omp_affinity_rocalution(), before initializing the library (i.e. before init_rocalution()).
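
The three mapping rules listed above can be written as a small decision function. This is a stand-alone sketch with a hypothetical name (sketch_thread_core_map); it reproduces only the documented rules and does not pin any threads.

```cpp
#include <cassert>
#include <vector>

// Returns the core ID assigned to each thread under the documented rules.
// An empty result means "use the default mapping" (typically set by the OS).
std::vector<int> sketch_thread_core_map(int num_cores, int num_threads)
{
    assert(num_cores > 0 && num_threads > 0);

    int stride = 0;
    if(num_cores >= 2 * num_threads)
    {
        stride = 2; // every second core ID: 0, 2, 4, ...
    }
    else if(num_threads <= num_cores)
    {
        stride = 1; // every core ID: 0, 1, 2, ...
    }

    std::vector<int> cores;
    if(stride == 0)
    {
        return cores;
    }
    for(int t = 0; t < num_threads; ++t)
    {
        cores.push_back(t * stride);
    }
    return cores;
}
```

For example, 4 threads on 8 cores map to core IDs 0, 2, 4, 6, while 6 threads on 8 cores map to 0 through 5.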

Parameters:

nthreads[in] number of OpenMP threads

void rocalution::set_omp_affinity_rocalution(bool affinity)#

Enable/disable OpenMP host affinity.

set_omp_affinity_rocalution enables / disables OpenMP host affinity.

Parameters:

affinity[in] boolean to turn on/off OpenMP host affinity

void rocalution::set_omp_threshold_rocalution(int threshold)#

Set OpenMP threshold size.

For small problems, the OpenMP host backend can be (slightly) slower than running without OpenMP. This is mainly due to the small amount of work per thread combined with the large overhead of forking and joining threads. This can be avoided with the OpenMP threshold size parameter in rocALUTION. The default threshold is 10000, which means that all matrices of this size or smaller use only one thread (regardless of the number of OpenMP threads set in the system). The threshold can be modified with set_omp_threshold_rocalution.

Parameters:

threshold[in] OpenMP threshold size
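
The threshold rule described above amounts to a single comparison. The sketch below uses a hypothetical helper, sketch_effective_threads, to show which thread count would apply to a problem of a given size; how the library measures "size" internally is not stated in this section.

```cpp
#include <cassert>
#include <cstdint>

// Problems at or below the threshold run on one thread; larger problems
// use the requested number of OpenMP threads.
int sketch_effective_threads(int64_t size, int requested_threads, int64_t threshold = 10000)
{
    assert(requested_threads >= 1);
    return (size <= threshold) ? 1 : requested_threads;
}
```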

void rocalution::info_rocalution(void)#

Print info about rocALUTION.

info_rocalution prints information about the rocALUTION platform.

void rocalution::info_rocalution(const struct Rocalution_Backend_Descriptor &backend_descriptor)#

Print info about specific rocALUTION backend descriptor.

info_rocalution prints information about the rocALUTION platform of the specific backend descriptor.

Parameters:

backend_descriptor[in] rocALUTION backend descriptor

void rocalution::disable_accelerator_rocalution(bool onoff = true)#

Disable/Enable the accelerator.

If you want to disable the accelerator (without re-compiling the code), you need to call disable_accelerator_rocalution before init_rocalution().

Parameters:

onoff[in] boolean to turn on/off the accelerator

void rocalution::_rocalution_sync(void)#

Sync rocALUTION.

_rocalution_sync blocks the host until all active asynchronous transfers are completed (this is a global sync).

Base Rocalution#

template<typename ValueType>
class BaseRocalution : public rocalution::RocalutionObj#

Base class for all operators and vectors.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> or std::complex<double>.

Subclassed by rocalution::Operator< ValueType >, rocalution::Vector< ValueType >

Public Functions

virtual void MoveToAccelerator(void) = 0#

Move the object to the accelerator backend.

virtual void MoveToHost(void) = 0#

Move the object to the host backend.

virtual void MoveToAcceleratorAsync(void)#

Move the object to the accelerator backend with async move.

virtual void MoveToHostAsync(void)#

Move the object to the host backend with async move.

virtual void Sync(void)#

Synchronize the object, i.e. wait for a pending asynchronous move to complete.

virtual void CloneBackend(const BaseRocalution<ValueType> &src)#

Clone the Backend descriptor from another object.

With CloneBackend, the backend can be cloned without copying any data. This is especially useful, if several objects should reside on the same backend, but keep their original data.

Example
LocalVector<ValueType> vec;
LocalMatrix<ValueType> mat;

// Allocate and initialize vec and mat
// ...

LocalVector<ValueType> tmp;
// By cloning backend, tmp and vec will have the same backend as mat
tmp.CloneBackend(mat);
vec.CloneBackend(mat);

// The following matrix vector multiplication will be performed on the backend
// selected in mat
mat.Apply(vec, &tmp);

Parameters:

src[in] object from which the backend is cloned.

virtual void Info(void) const = 0#

Print object information.

Info can print object information about any rocALUTION object. This information consists of object properties and backend data.

Example
mat.Info();
vec.Info();

virtual void Clear(void) = 0#

Clear (free all data) the object.

Operator#

template<typename ValueType>
class Operator : public rocalution::BaseRocalution<ValueType>#

Operator class.

The Operator class defines the generic interface for applying an operator (e.g. matrix or stencil) from/to global and local vectors.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> or std::complex<double>.

Subclassed by rocalution::GlobalMatrix< ValueType >, rocalution::LocalMatrix< ValueType >, rocalution::LocalStencil< ValueType >

Public Functions

virtual int64_t GetM(void) const = 0#

Return the number of rows in the matrix/stencil.

virtual int64_t GetN(void) const = 0#

Return the number of columns in the matrix/stencil.

virtual int64_t GetNnz(void) const = 0#

Return the number of non-zeros in the matrix/stencil.

virtual int64_t GetLocalM(void) const#

Return the number of rows in the local matrix/stencil.

virtual int64_t GetLocalN(void) const#

Return the number of columns in the local matrix/stencil.

virtual int64_t GetLocalNnz(void) const#

Return the number of non-zeros in the local matrix/stencil.

virtual int64_t GetGhostM(void) const#

Return the number of rows in the ghost matrix/stencil.

virtual int64_t GetGhostN(void) const#

Return the number of columns in the ghost matrix/stencil.

virtual int64_t GetGhostNnz(void) const#

Return the number of non-zeros in the ghost matrix/stencil.

virtual void Transpose(void)#

Transpose the operator.

virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Apply the operator, out = Operator(in), where in and out are local vectors.

virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#

Apply and add the operator, out += scalar * Operator(in), where in and out are local vectors.

virtual void Apply(const GlobalVector<ValueType> &in, GlobalVector<ValueType> *out) const#

Apply the operator, out = Operator(in), where in and out are global vectors.

virtual void ApplyAdd(const GlobalVector<ValueType> &in, ValueType scalar, GlobalVector<ValueType> *out) const#

Apply and add the operator, out += scalar * Operator(in), where in and out are global vectors.
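
The difference between Apply (out = Operator(in)) and ApplyAdd (out += scalar * Operator(in)) can be shown with a plain CSR matrix-vector product. The SketchCsr struct and both functions are hypothetical stand-ins written for this illustration; they do not use rocALUTION types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal CSR container, used only for this sketch.
struct SketchCsr
{
    int64_t              nrow;
    std::vector<int64_t> row_ptr; // nrow + 1 entries
    std::vector<int>     col;     // column index of each non-zero
    std::vector<double>  val;     // value of each non-zero
};

// out += scalar * A * in
void sketch_apply_add(const SketchCsr& A, const std::vector<double>& in, double scalar,
                      std::vector<double>* out)
{
    assert(out != nullptr && static_cast<int64_t>(out->size()) == A.nrow);
    for(int64_t i = 0; i < A.nrow; ++i)
    {
        double sum = 0.0;
        for(int64_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
        {
            sum += A.val[k] * in[A.col[k]];
        }
        (*out)[i] += scalar * sum;
    }
}

// out = A * in
void sketch_apply(const SketchCsr& A, const std::vector<double>& in, std::vector<double>* out)
{
    assert(out != nullptr);
    out->assign(static_cast<std::size_t>(A.nrow), 0.0);
    sketch_apply_add(A, in, 1.0, out);
}
```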

Vector#

template<typename ValueType>
class Vector : public rocalution::BaseRocalution<ValueType>#

Vector class.

The Vector class defines the generic interface for local and global vectors.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> or std::complex<double>.

Subclassed by rocalution::LocalVector< int >, rocalution::GlobalVector< ValueType >, rocalution::LocalVector< ValueType >

Unnamed Group

virtual void CopyFrom(const LocalVector<ValueType> &src)#

Copy vector from another vector.

CopyFrom copies values from another vector.

Example
LocalVector<ValueType> vec1, vec2;

// Allocate and initialize vec1 and vec2
// ...

// Move vec1 to the accelerator
vec1.MoveToAccelerator();

// Now, vec1 is on the accelerator (if available)
// and vec2 is on the host

// Copying vec2 into vec1 moves data from the host to the accelerator
// backend (copying in the other direction works the same way)
vec1.CopyFrom(vec2);

Note

This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.

Parameters:

src[in] vector from which values are copied.

virtual void CopyFrom(const GlobalVector<ValueType> &src)#

Copy vector from another vector.

CopyFrom copies values from another vector.

Note

This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.

Parameters:

src[in] vector from which values are copied.

Unnamed Group

virtual void CloneFrom(const LocalVector<ValueType> &src)#

Clone the vector.

CloneFrom clones the entire vector, with data and backend descriptor from another Vector.

Example
LocalVector<ValueType> vec;

// Allocate and initialize vec (host or accelerator)
// ...

LocalVector<ValueType> tmp;

// By cloning vec, tmp will have identical values and will be on the same
// backend as vec
tmp.CloneFrom(vec);

Parameters:

src[in] Vector to clone from.

virtual void CloneFrom(const GlobalVector<ValueType> &src)#

Clone the vector.

CloneFrom clones the entire vector, with data and backend descriptor from another Vector.

Parameters:

src[in] Vector to clone from.

Public Functions

virtual int64_t GetSize(void) const = 0#

Return the size of the vector.

virtual int64_t GetLocalSize(void) const#

Return the size of the local vector.

virtual bool Check(void) const = 0#

Perform a sanity check of the vector.

Checks whether the vector contains valid data, i.e. that no value is infinite or NaN (not a number).

Return values:
  • true – if the vector is ok (empty vector is also ok).

  • false – if there is something wrong with the values.
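
The documented criterion (no infinity, no NaN, and an empty vector counts as valid) corresponds to a finiteness test over all entries. The function below is a hypothetical stand-alone sketch on std::vector<double>, not the library's Check.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// True if every value is finite; an empty vector is also considered valid.
bool sketch_check(const std::vector<double>& values)
{
    for(double x : values)
    {
        if(!std::isfinite(x))
        {
            return false;
        }
    }
    return true;
}
```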

virtual void Clear(void) = 0#

Clear (free all data) the object.

virtual void Zeros(void) = 0#

Set all values of the vector to 0.

virtual void Ones(void) = 0#

Set all values of the vector to 1.

virtual void SetValues(ValueType val) = 0#

Set all values of the vector to given argument.

virtual void SetRandomUniform(unsigned long long seed, ValueType a = static_cast<ValueType>(-1), ValueType b = static_cast<ValueType>(1)) = 0#

Fill the vector with random values from interval [a,b].

virtual void SetRandomNormal(unsigned long long seed, ValueType mean = static_cast<ValueType>(0), ValueType var = static_cast<ValueType>(1)) = 0#

Fill the vector with random values from normal distribution.

virtual void ReadFileASCII(const std::string &filename) = 0#

Read vector from ASCII file.

Read a vector from ASCII file.

Example
LocalVector<ValueType> vec;
vec.ReadFileASCII("my_vector.dat");

Parameters:

filename[in] name of the file containing the ASCII data.

virtual void WriteFileASCII(const std::string &filename) const = 0#

Write vector to ASCII file.

Write a vector to ASCII file.

Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileASCII("my_vector.dat");

Parameters:

filename[in] name of the file to write the ASCII data to.

virtual void ReadFileBinary(const std::string &filename) = 0#

Read vector from binary file.

Read a vector from binary file. For details on the format, see WriteFileBinary().

Example
LocalVector<ValueType> vec;
vec.ReadFileBinary("my_vector.bin");

Parameters:

filename[in] name of the file containing the data.

virtual void WriteFileBinary(const std::string &filename) const = 0#

Write vector to binary file.

Write a vector to binary file.

The binary format contains a header, the rocALUTION version and the vector data as follows

// Header
out << "#rocALUTION binary vector file" << std::endl;

// rocALUTION version
out.write((char*)&version, sizeof(int));

// Vector data
out.write((char*)&size, sizeof(int));
out.write((char*)vec_val, size * sizeof(double));

Example
LocalVector<ValueType> vec;

// Allocate and fill vec
// ...

vec.WriteFileBinary("my_vector.bin");

Note

The vector values array is always stored in double precision (i.e. as double or std::complex<double>).

Parameters:

filename[in] name of the file to write the data to.
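
To make the layout above concrete, the following sketch writes and reads a real-valued vector using exactly the fields shown: the header line, an int version, an int size, and the values as double. The function names and the use of generic streams are assumptions made for this illustration; the library's own reader and writer may differ in details such as integer width or error handling.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical writer following the layout printed above.
void sketch_write_vector(std::ostream& out, int version, const std::vector<double>& values)
{
    out << "#rocALUTION binary vector file" << std::endl;
    out.write(reinterpret_cast<const char*>(&version), sizeof(int));

    int size = static_cast<int>(values.size());
    out.write(reinterpret_cast<const char*>(&size), sizeof(int));
    out.write(reinterpret_cast<const char*>(values.data()), size * sizeof(double));
}

// Hypothetical reader for the same layout; returns false on malformed input.
bool sketch_read_vector(std::istream& in, int& version, std::vector<double>& values)
{
    std::string header;
    std::getline(in, header);
    if(header != "#rocALUTION binary vector file")
    {
        return false;
    }

    int size = 0;
    in.read(reinterpret_cast<char*>(&version), sizeof(int));
    in.read(reinterpret_cast<char*>(&size), sizeof(int));
    if(!in || size < 0)
    {
        return false;
    }

    values.resize(size);
    in.read(reinterpret_cast<char*>(values.data()), size * sizeof(double));
    return static_cast<bool>(in);
}
```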

virtual void CopyFromAsync(const LocalVector<ValueType> &src)#

Async copy from another local vector.

virtual void CopyFromFloat(const LocalVector<float> &src)#

Copy values from another local float vector.

virtual void CopyFromDouble(const LocalVector<double> &src)#

Copy values from another local double vector.

virtual void CopyFrom(const LocalVector<ValueType> &src, int64_t src_offset, int64_t dst_offset, int64_t size)#

Copy vector from another vector with offsets and size.

CopyFrom copies values with specific source and destination offsets and sizes from another vector.

Note

This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.

Parameters:
  • src[in] vector from which values are copied.

  • src_offset[in] source offset.

  • dst_offset[in] destination offset.

  • size[in] number of entries to be copied.
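
The roles of the three integer parameters can be seen in a host-only sketch on std::vector: size entries are read starting at src_offset and written starting at dst_offset. The helper name sketch_copy_range and the bounds checks are assumptions of this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Copy src[src_offset, src_offset + size) into dst[dst_offset, dst_offset + size).
void sketch_copy_range(std::vector<double>& dst, const std::vector<double>& src,
                       int64_t src_offset, int64_t dst_offset, int64_t size)
{
    assert(src_offset >= 0 && dst_offset >= 0 && size >= 0);
    assert(src_offset + size <= static_cast<int64_t>(src.size()));
    assert(dst_offset + size <= static_cast<int64_t>(dst.size()));

    std::copy(src.begin() + src_offset, src.begin() + src_offset + size,
              dst.begin() + dst_offset);
}
```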

virtual void AddScale(const LocalVector<ValueType> &x, ValueType alpha)#

Perform vector update of type this = this + alpha * x.

virtual void AddScale(const GlobalVector<ValueType> &x, ValueType alpha)#

Perform vector update of type this = this + alpha * x.

virtual void ScaleAdd(ValueType alpha, const LocalVector<ValueType> &x)#

Perform vector update of type this = alpha * this + x.

virtual void ScaleAdd(ValueType alpha, const GlobalVector<ValueType> &x)#

Perform vector update of type this = alpha * this + x.

virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta)#

Perform vector update of type this = alpha * this + x * beta.

virtual void ScaleAddScale(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta)#

Perform vector update of type this = alpha * this + x * beta.

virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, int64_t src_offset, int64_t dst_offset, int64_t size)#

Perform vector update of type this = alpha * this + x * beta with offsets.

virtual void ScaleAddScale(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta, int64_t src_offset, int64_t dst_offset, int64_t size)#

Perform vector update of type this = alpha * this + x * beta with offsets.

virtual void ScaleAdd2(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, const LocalVector<ValueType> &y, ValueType gamma)#

Perform vector update of type this = alpha * this + x * beta + y * gamma.

virtual void ScaleAdd2(ValueType alpha, const GlobalVector<ValueType> &x, ValueType beta, const GlobalVector<ValueType> &y, ValueType gamma)#

Perform vector update of type this = alpha * this + x * beta + y * gamma.

virtual void Scale(ValueType alpha) = 0#

Perform vector scaling this = alpha * this.
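
The update formulas above differ only in which operand is scaled. The sketch below spells out four of them element by element on std::vector<double>, with the first argument playing the role of this. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// self = self + alpha * x
void sketch_add_scale(std::vector<double>& self, const std::vector<double>& x, double alpha)
{
    assert(self.size() == x.size());
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = self[i] + alpha * x[i];
}

// self = alpha * self + x
void sketch_scale_add(std::vector<double>& self, double alpha, const std::vector<double>& x)
{
    assert(self.size() == x.size());
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i];
}

// self = alpha * self + x * beta
void sketch_scale_add_scale(std::vector<double>& self, double alpha,
                            const std::vector<double>& x, double beta)
{
    assert(self.size() == x.size());
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i] * beta;
}

// self = alpha * self + x * beta + y * gamma
void sketch_scale_add2(std::vector<double>& self, double alpha, const std::vector<double>& x,
                       double beta, const std::vector<double>& y, double gamma)
{
    assert(self.size() == x.size() && self.size() == y.size());
    for(std::size_t i = 0; i < self.size(); ++i)
        self[i] = alpha * self[i] + x[i] * beta + y[i] * gamma;
}
```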

virtual ValueType Dot(const LocalVector<ValueType> &x) const#

Compute dot (scalar) product, return this^T x.

virtual ValueType Dot(const GlobalVector<ValueType> &x) const#

Compute dot (scalar) product, return this^T x.

virtual ValueType DotNonConj(const LocalVector<ValueType> &x) const#

Compute non-conjugate dot (scalar) product, return this^T x.

virtual ValueType DotNonConj(const GlobalVector<ValueType> &x) const#

Compute non-conjugate dot (scalar) product, return this^T x.
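
For real value types the two products coincide; they differ only for complex vectors. The sketch below contrasts a conjugated and a non-conjugated dot product on std::complex<double>. Which operand the library conjugates in Dot is not stated in this section, so conjugating the first operand (the role of this) is an assumption of the sketch, and both function names are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using SketchComplex = std::complex<double>;

// Conjugated product: sum over conj(a[i]) * b[i] (assumed convention).
SketchComplex sketch_dot_conj(const std::vector<SketchComplex>& a,
                              const std::vector<SketchComplex>& b)
{
    assert(a.size() == b.size());
    SketchComplex sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
        sum += std::conj(a[i]) * b[i];
    return sum;
}

// Non-conjugated product: sum over a[i] * b[i].
SketchComplex sketch_dot_nonconj(const std::vector<SketchComplex>& a,
                                 const std::vector<SketchComplex>& b)
{
    assert(a.size() == b.size());
    SketchComplex sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
        sum += a[i] * b[i];
    return sum;
}
```

With a single entry equal to the imaginary unit in both vectors, the conjugated product is 1 and the non-conjugated product is -1.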

virtual ValueType Norm(void) const = 0#

Compute \(L_2\) norm of the vector, return = sqrt(this^T this)

virtual ValueType Reduce(void) const = 0#

Reduce the vector.

virtual ValueType InclusiveSum(void) = 0#

Compute Inclusive sum.

virtual ValueType InclusiveSum(const LocalVector<ValueType> &vec)#

Compute Inclusive sum.

virtual ValueType InclusiveSum(const GlobalVector<ValueType> &vec)#

Compute Inclusive sum.

virtual ValueType ExclusiveSum(void) = 0#

Compute exclusive sum.

virtual ValueType ExclusiveSum(const LocalVector<ValueType> &vec)#

Compute exclusive sum.

virtual ValueType ExclusiveSum(const GlobalVector<ValueType> &vec)#

Compute exclusive sum.
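
Inclusive and exclusive sums are prefix sums that differ in whether an entry contributes to its own position. The sketch below applies both in place to a std::vector<int>. The function names are hypothetical, and the value returned by the library functions is not described in this section, so the sketch returns nothing.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Inclusive prefix sum: out[i] = v[0] + ... + v[i].
void sketch_inclusive_sum(std::vector<int>& v)
{
    std::partial_sum(v.begin(), v.end(), v.begin());
}

// Exclusive prefix sum: out[i] = v[0] + ... + v[i - 1], with out[0] = 0.
void sketch_exclusive_sum(std::vector<int>& v)
{
    int running = 0;
    for(int& x : v)
    {
        int current = x;
        x           = running;
        running += current;
    }
}
```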

virtual ValueType Asum(void) const = 0#

Compute the sum of absolute values of the vector, return = sum(|this|)

virtual int64_t Amax(ValueType &value) const = 0#

Compute the absolute max of the vector, return = index(max(|this|))
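
The three reductions Norm, Asum and Amax can be written out for a real vector as follows. This is a stand-alone sketch with hypothetical names. It assumes that the value argument of Amax receives the largest absolute value and that the first such index is returned on ties; neither detail is stated in this section.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// L2 norm: sqrt(sum of squares).
double sketch_norm(const std::vector<double>& v)
{
    double sum = 0.0;
    for(double x : v)
        sum += x * x;
    return std::sqrt(sum);
}

// Sum of absolute values.
double sketch_asum(const std::vector<double>& v)
{
    double sum = 0.0;
    for(double x : v)
        sum += std::abs(x);
    return sum;
}

// Index of the entry with the largest absolute value; that value is
// written to 'value' (assumed meaning of the output argument).
int64_t sketch_amax(const std::vector<double>& v, double& value)
{
    assert(!v.empty());
    int64_t index = 0;
    value         = std::abs(v[0]);
    for(std::size_t i = 1; i < v.size(); ++i)
    {
        if(std::abs(v[i]) > value)
        {
            value = std::abs(v[i]);
            index = static_cast<int64_t>(i);
        }
    }
    return index;
}
```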

virtual void PointWiseMult(const LocalVector<ValueType> &x)#

Perform point-wise multiplication (element-wise) of this = this * x.

virtual void PointWiseMult(const GlobalVector<ValueType> &x)#

Perform point-wise multiplication (element-wise) of this = this * x.

virtual void PointWiseMult(const LocalVector<ValueType> &x, const LocalVector<ValueType> &y)#

Perform point-wise multiplication (element-wise) of this = x * y.

virtual void PointWiseMult(const GlobalVector<ValueType> &x, const GlobalVector<ValueType> &y)#

Perform point-wise multiplication (element-wise) of this = x * y.

virtual void Power(double power) = 0#

Raise each element of the vector to the given power.

Local Matrix#

template<typename ValueType>
class LocalMatrix : public rocalution::Operator<ValueType>#

LocalMatrix class.

A LocalMatrix is called local because it always stays on a single system. The system can contain several CPUs via a UMA or NUMA memory system, or it can contain an accelerator.

A number of matrix formats are supported. These are CSR, BCSR, MCSR, COO, DIA, ELL, HYB, and DENSE.

Note

For CSR type matrices, the column indices must be sorted in increasing order. For COO matrices, the row indices must be sorted in increasing order. The function Check can be used to check whether a matrix contains valid data. For CSR and COO matrices, the function Sort can be used to sort the column or row indices, respectively.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> or std::complex<double>.
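
The ordering requirement for CSR matrices stated in the note above can be checked on raw arrays before they are handed to the library. The function below is a hypothetical host-side sketch. It treats repeated column indices within a row as invalid (strictly increasing order), which is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True if, within every row, the column indices are strictly increasing.
bool sketch_csr_columns_sorted(int64_t nrow, const std::vector<int64_t>& row_ptr,
                               const std::vector<int>& col)
{
    assert(static_cast<int64_t>(row_ptr.size()) == nrow + 1);
    for(int64_t i = 0; i < nrow; ++i)
    {
        for(int64_t k = row_ptr[i] + 1; k < row_ptr[i + 1]; ++k)
        {
            if(col[k - 1] >= col[k])
            {
                return false;
            }
        }
    }
    return true;
}
```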

Unnamed Group

void AllocateCSR(const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

Example
LocalMatrix<ValueType> mat;

mat.AllocateCSR("my CSR matrix", 456, 100, 100);
mat.Clear();

mat.AllocateCOO("my COO matrix", 200, 100, 100);
mat.Clear();

void AllocateBCSR(const std::string &name, int64_t nnzb, int64_t nrowb, int64_t ncolb, int blockdim)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateMCSR(const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateCOO(const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateDIA(const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol, int ndiag)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateELL(const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol, int max_row)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateHYB(const std::string &name, int64_t ell_nnz, int64_t coo_nnz, int ell_max_row, int64_t nrow, int64_t ncol)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

void AllocateDENSE(const std::string &name, int64_t nrow, int64_t ncol)#

Allocate a local matrix with name and sizes.

The local matrix allocation functions require a name of the object (this is only for information purposes) and corresponding number of non-zero elements, number of rows and number of columns. Furthermore, depending on the matrix format, additional parameters are required.

Unnamed Group

void SetDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int64_t nnz, int64_t nrow, int64_t ncol)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

void SetDataPtrCSR(PtrType **row_offset, int **col, ValueType **val, std::string name, int64_t nnz, int64_t nrow, int64_t ncol)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Example
// Allocate a CSR matrix
int* csr_row_ptr   = new int[100 + 1];
int* csr_col_ind   = new int[345];
ValueType* csr_val = new ValueType[345];

// Fill the CSR matrix
// ...

// rocALUTION local matrix object
LocalMatrix<ValueType> mat;

// Set the CSR matrix data; the csr_row_ptr, csr_col_ind and csr_val pointers
// become invalid (they are set to NULL)
mat.SetDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val, "my_matrix", 345, 100, 100);

Note

Setting data pointers will leave the original pointers empty (set to NULL).
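
The ownership transfer described in the note (the object takes over the buffer and the caller's pointer becomes NULL) is the reason the functions take pointer-to-pointer arguments. The following sketch shows the pattern with a hypothetical class, SketchBuffer, that manages a single value array with new[]/delete[]; it is not the LocalMatrix implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical owner of one externally allocated array.
class SketchBuffer
{
public:
    SketchBuffer()                               = default;
    SketchBuffer(const SketchBuffer&)            = delete;
    SketchBuffer& operator=(const SketchBuffer&) = delete;
    ~SketchBuffer() { delete[] val_; }

    // Take ownership of *val and leave the caller's pointer empty.
    void SetDataPtr(double** val, int64_t size)
    {
        assert(val != nullptr && *val != nullptr && size > 0);
        delete[] val_; // release anything held before
        val_  = *val;
        size_ = size;
        *val  = nullptr;
    }

    int64_t       size() const { return size_; }
    const double* data() const { return val_; }

private:
    double* val_  = nullptr;
    int64_t size_ = 0;
};
```

After the call, the caller must not free or dereference the original pointer; the buffer is released by the owning object.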

void SetDataPtrBCSR(int **row_offset, int **col, ValueType **val, std::string name, int64_t nnzb, int64_t nrowb, int64_t ncolb, int blockdim)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

void SetDataPtrMCSR(int **row_offset, int **col, ValueType **val, std::string name, int64_t nnz, int64_t nrow, int64_t ncol)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

void SetDataPtrELL(int **col, ValueType **val, std::string name, int64_t nnz, int64_t nrow, int64_t ncol, int max_row)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

void SetDataPtrDIA(int **offset, ValueType **val, std::string name, int64_t nnz, int64_t nrow, int64_t ncol, int num_diag)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

void SetDataPtrDENSE(ValueType **val, std::string name, int64_t nrow, int64_t ncol)#

Initialize a LocalMatrix on the host with externally allocated data.

SetDataPtr functions have direct access to the raw data via pointers. Already allocated data can be set by passing their pointers.

Note

Setting data pointers will leave the original pointers empty (set to NULL).

Unnamed Group

void LeaveDataPtrCOO(int **row, int **col, ValueType **val)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

void LeaveDataPtrCSR(PtrType **row_offset, int **col, ValueType **val)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

void LeaveDataPtrBCSR(int **row_offset, int **col, ValueType **val, int &blockdim)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

void LeaveDataPtrMCSR(int **row_offset, int **col, ValueType **val)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

void LeaveDataPtrELL(int **col, ValueType **val, int &max_row)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

void LeaveDataPtrDIA(int **offset, ValueType **val, int &num_diag)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

void LeaveDataPtrDENSE(ValueType **val)#

Leave a LocalMatrix to host pointers.

LeaveDataPtr functions have direct access to the raw data via pointers. A LocalMatrix object can leave its raw data to host pointers. This will leave the LocalMatrix empty.

Example
// rocALUTION CSR matrix object
LocalMatrix<ValueType> mat;

// Allocate the CSR matrix
mat.AllocateCSR("my_matrix", 345, 100, 100);

// Fill CSR matrix
// ...

int* csr_row_ptr   = NULL;
int* csr_col_ind   = NULL;
ValueType* csr_val = NULL;

// Get (steal) the data from the matrix, this will leave the local matrix
// object empty
mat.LeaveDataPtrCSR(&csr_row_ptr, &csr_col_ind, &csr_val);

Public Functions

virtual void Info(void) const#

Shows simple info about the matrix.

unsigned int GetFormat(void) const#

Return the matrix format id (see matrix_formats.hpp)

int GetBlockDimension(void) const#

Return the matrix block dimension.

virtual int64_t GetM(void) const#

Return the number of rows in the local matrix.

virtual int64_t GetN(void) const#

Return the number of columns in the local matrix.

virtual int64_t GetNnz(void) const#

Return the number of non-zeros in the local matrix.

bool Check(void) const#

Perform a sanity check of the matrix.

Checks whether the matrix contains valid data, i.e. that the values are neither infinity nor NaN (not a number) and that the structure of the matrix is correct (e.g. indices cannot be negative, CSR and COO matrices have to be sorted, etc.).

Return values:
  • true – if the matrix is ok (empty matrix is also ok).

  • false – if there is something wrong with the structure or values.
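
The kind of test performed by a sanity check can be sketched for plain CSR arrays as follows. This is an illustration only, not the library's implementation: csr_is_valid is a hypothetical helper, and treating duplicate column indices within a row as invalid is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if the CSR arrays are structurally
// consistent, sorted by column within each row, and hold finite values.
bool csr_is_valid(int nrow, int ncol,
                  const std::vector<int>&    row_ptr,
                  const std::vector<int>&    col_ind,
                  const std::vector<double>& val)
{
    if(nrow < 0 || ncol < 0 || col_ind.size() != val.size())
        return false;

    // An empty matrix counts as valid.
    if(nrow == 0 && row_ptr.empty())
        return col_ind.empty();

    if(row_ptr.size() != static_cast<std::size_t>(nrow) + 1 || row_ptr[0] != 0)
        return false;

    // Row offsets must be non-decreasing and end at the number of non-zeros.
    for(int i = 0; i < nrow; ++i)
        if(row_ptr[i] > row_ptr[i + 1])
            return false;
    if(static_cast<std::size_t>(row_ptr[nrow]) != col_ind.size())
        return false;

    for(int i = 0; i < nrow; ++i)
    {
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            if(col_ind[k] < 0 || col_ind[k] >= ncol)
                return false; // index out of range
            if(k > row_ptr[i] && col_ind[k - 1] >= col_ind[k])
                return false; // unsorted (assumption: duplicates rejected too)
            if(!std::isfinite(val[k]))
                return false; // infinity or NaN
        }
    }
    return true;
}
```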

virtual void Clear(void)#

Clear (free) the matrix.

void Zeros(void)#

Set all matrix values to zero.

void Scale(ValueType alpha)#

Scale all values in the matrix.

void ScaleDiagonal(ValueType alpha)#

Scale the diagonal entries of the matrix with alpha. All diagonal elements must exist.

void ScaleOffDiagonal(ValueType alpha)#

Scale the off-diagonal entries of the matrix with alpha. All diagonal elements must exist.

void AddScalar(ValueType alpha)#

Add a scalar to all matrix values.

void AddScalarDiagonal(ValueType alpha)#

Add alpha to the diagonal entries of the matrix. All diagonal elements must exist.

void AddScalarOffDiagonal(ValueType alpha)#

Add alpha to the off-diagonal entries of the matrix. All diagonal elements must exist.

void ExtractSubMatrix(int64_t row_offset, int64_t col_offset, int64_t row_size, int64_t col_size, LocalMatrix<ValueType> *mat) const#

Extract a sub-matrix with row/col_offset and row/col_size.

void ExtractSubMatrices(int row_num_blocks, int col_num_blocks, const int *row_offset, const int *col_offset, LocalMatrix<ValueType> ***mat) const#

Extract an array of non-overlapping sub-matrices. row/col_num_blocks define the number of blocks for rows/columns; row/col_offset have sizes row/col_num_blocks+1, where [i+1]-[i] defines the size of the i-th sub-matrix.

void ExtractDiagonal(LocalVector<ValueType> *vec_diag) const#

Extract the diagonal values of the matrix into a LocalVector.

void ExtractInverseDiagonal(LocalVector<ValueType> *vec_inv_diag) const#

Extract the inverse (reciprocal) diagonal values of the matrix into a LocalVector.

void ExtractU(LocalMatrix<ValueType> *U, bool diag) const#

Extract the upper triangular matrix.

void ExtractL(LocalMatrix<ValueType> *L, bool diag) const#

Extract the lower triangular matrix.

void Permute(const LocalVector<int> &permutation)#

Perform (forward) permutation of the matrix.

void PermuteBackward(const LocalVector<int> &permutation)#

Perform (backward) permutation of the matrix.

void CMK(LocalVector<int> *permutation) const#

Create permutation vector for CMK reordering of the matrix.

The Cuthill-McKee ordering reduces the bandwidth of a given sparse matrix.

Example
LocalVector<int> cmk;

mat.CMK(&cmk);
mat.Permute(cmk);

Parameters:

permutation[out] permutation vector for CMK reordering
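
To judge the effect of a reordering, the bandwidth can be measured before and after applying a permutation. The sketch below is not part of rocALUTION: csr_bandwidth is a hypothetical helper, and the convention that perm[i] holds the new index of old row/column i is an assumption of this sketch that may differ from the convention used by Permute.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical helper: bandwidth max|i - j| over all stored entries (i, j)
// of a square CSR matrix, after a symmetric permutation.
// An empty perm means "no permutation".
int csr_bandwidth(const std::vector<int>& row_ptr,
                  const std::vector<int>& col_ind,
                  const std::vector<int>& perm)
{
    const int nrow = row_ptr.empty() ? 0 : static_cast<int>(row_ptr.size()) - 1;
    assert(perm.empty() || static_cast<int>(perm.size()) == nrow);

    int bw = 0;
    for(int i = 0; i < nrow; ++i)
    {
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            const int j  = col_ind[k];
            const int pi = perm.empty() ? i : perm[i];
            const int pj = perm.empty() ? j : perm[j];
            bw = std::max(bw, std::abs(pi - pj));
        }
    }
    return bw;
}
```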

void RCMK(LocalVector<int> *permutation) const#

Create permutation vector for reverse CMK reordering of the matrix.

The Reverse Cuthill-McKee ordering reduces the bandwidth of a given sparse matrix.

Example
LocalVector<int> rcmk;

mat.RCMK(&rcmk);
mat.Permute(rcmk);

Parameters:

permutation[out] permutation vector for reverse CMK reordering

void ConnectivityOrder(LocalVector<int> *permutation) const#

Create permutation vector for connectivity reordering of the matrix.

Connectivity ordering returns a permutation that sorts the matrix by the number of non-zero entries per row.

Example
LocalVector<int> conn;

mat.ConnectivityOrder(&conn);
mat.Permute(conn);

Parameters:

permutation[out] permutation vector for connectivity reordering

void MultiColoring(int &num_colors, int **size_colors, LocalVector<int> *permutation) const#

Perform multi-coloring decomposition of the matrix.

The Multi-Coloring algorithm builds a permutation (coloring of the matrix) in a way such that no two adjacent nodes in the sparse matrix have the same color.

Example
LocalVector<int> mc;
int num_colors;
int* block_colors = NULL;

mat.MultiColoring(num_colors, &block_colors, &mc);
mat.Permute(mc);

Parameters:
  • num_colors[out] number of colors

  • size_colors[out] pointer to array that holds the number of nodes for each color

  • permutation[out] permutation vector for multi-coloring reordering
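
The coloring property (no two adjacent nodes share a color) can be illustrated with a simple greedy scheme on the CSR sparsity pattern. This is a sketch under stated assumptions, not the algorithm used by the library: greedy_coloring is a hypothetical helper, the pattern is assumed to be structurally symmetric, and the function returns one color per node instead of a permutation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: greedy coloring of the graph given by a square,
// structurally symmetric CSR pattern. Returns the color of each node and
// writes the number of colors used to num_colors.
std::vector<int> greedy_coloring(const std::vector<int>& row_ptr,
                                 const std::vector<int>& col_ind,
                                 int&                    num_colors)
{
    num_colors = 0;
    if(row_ptr.empty())
        return std::vector<int>();

    const int         n = static_cast<int>(row_ptr.size()) - 1;
    std::vector<int>  color(n, -1);
    std::vector<char> used;

    for(int i = 0; i < n; ++i)
    {
        // Mark the colors already taken by neighbors of node i.
        used.assign(num_colors + 1, 0);
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            const int j = col_ind[k];
            assert(j >= 0 && j < n);
            if(j != i && color[j] >= 0)
                used[color[j]] = 1;
        }

        // Pick the smallest free color; used[num_colors] is always free.
        int c = 0;
        while(used[c])
            ++c;
        color[i]   = c;
        num_colors = std::max(num_colors, c + 1);
    }
    return color;
}
```

Grouping the nodes by color then yields a permutation in which each color forms a block without couplings between its own nodes.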

void MaximalIndependentSet(int &size, LocalVector<int> *permutation) const#

Perform maximal independent set decomposition of the matrix.

The Maximal Independent Set algorithm finds a maximal set of elements that do not depend on any other element in this set.

Example
LocalVector<int> mis;
int size;

mat.MaximalIndependentSet(size, &mis);
mat.Permute(mis);

Parameters:
  • size[out] number of independent sets

  • permutation[out] permutation vector for maximal independent set reordering

void ZeroBlockPermutation(int &size, LocalVector<int> *permutation) const#

Return a permutation for saddle-point problems (zero diagonal entries)

For saddle-point problems (i.e. matrices with zero diagonal entries), the Zero Block Permutation maps all zero-diagonal elements to the last block of the matrix.

Example
LocalVector<int> zbp;
int size;

mat.ZeroBlockPermutation(size, &zbp);
mat.Permute(zbp);

Parameters:
  • size[out]

  • permutation[out] permutation vector for zero block permutation

void ILU0Factorize(void)#

Perform ILU(0) factorization.

void ItILU0Factorize(ItILU0Algorithm alg, int option, int max_iter, double tolerance)#

Perform Iterative ILU(0) factorization.

void LUFactorize(void)#

Perform LU factorization.

void ILUTFactorize(double t, int maxrow)#

Perform ILU(t,m) factorization based on threshold and maximum number of elements per row.

void ILUpFactorize(int p, bool level = true)#

Perform ILU(p) factorization based on power.

void LUAnalyse(void)#

Analyse the structure (level-scheduling)

void LUAnalyseClear(void)#

Delete the analysed data (see LUAnalyse)

void LUSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve LU out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.

void ICFactorize(LocalVector<ValueType> *inv_diag)#

Perform IC(0) factorization.

void LLAnalyse(void)#

Analyse the structure (level-scheduling)

void LLAnalyseClear(void)#

Delete the analysed data (see LLAnalyse)

void LLSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve LL^T out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.

void LLSolve(const LocalVector<ValueType> &in, const LocalVector<ValueType> &inv_diag, LocalVector<ValueType> *out) const#

Solve LL^T out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.

void LAnalyse(bool diag_unit = false)#

Analyse the structure (level-scheduling) L-part.

  • diag_unit == true the diag is 1;

  • diag_unit == false the diag is 0;

void LAnalyseClear(void)#

Delete the analysed data (see LAnalyse) L-part.

void LSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve L out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.
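
For reference, a sequential solve of L out = in is a forward substitution. The sketch below works on plain CSR arrays and is not the library's implementation. lower_solve is a hypothetical helper, and two assumptions are made: entries above the diagonal are ignored, and with diag_unit == true the diagonal is taken to be 1 regardless of what is stored.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: forward substitution for L x = b, where L is the
// lower-triangular part of a square CSR matrix.
std::vector<double> lower_solve(const std::vector<int>&    row_ptr,
                                const std::vector<int>&    col_ind,
                                const std::vector<double>& val,
                                const std::vector<double>& b,
                                bool                       diag_unit)
{
    const int n = static_cast<int>(b.size());
    assert(row_ptr.size() == b.size() + 1);

    std::vector<double> x(n, 0.0);
    for(int i = 0; i < n; ++i)
    {
        double sum  = b[i];
        double diag = 1.0;
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            const int j = col_ind[k];
            if(j < i)
                sum -= val[k] * x[j]; // x[j] is already known
            else if(j == i && !diag_unit)
                diag = val[k];
            // j > i: upper part, ignored in this sketch
        }
        assert(diag != 0.0);
        x[i] = sum / diag;
    }
    return x;
}
```

Row i depends only on rows j < i that appear in its pattern, which is why rows without mutual dependencies can be grouped into levels and processed in parallel.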

void UAnalyse(bool diag_unit = false)#

Analyse the structure (level-scheduling) U-part.

  • diag_unit == true the diag is 1;

  • diag_unit == false the diag is 0;

void UAnalyseClear(void)#

Delete the analysed data (see UAnalyse) U-part.

void USolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve U out = in; if level-scheduling algorithm is provided then the graph traversing is performed in parallel.

void ItLUAnalyse(void)#

Analyse the structure for Iterative solve.

void ItLUAnalyseClear(void)#

Delete the analysed data (see ItLUAnalyse)

void ItLUSolve(int max_iter, double tolerance, bool use_tol, const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve LU out = in iteratively using the Jacobi method.

void ItLLAnalyse(void)#

Analyse the structure (level-scheduling)

void ItLLAnalyseClear(void)#

Delete the analysed data (see ItLLAnalyse)

void ItLLSolve(int max_iter, double tolerance, bool use_tol, const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve LL^T out = in iteratively using the Jacobi method.

void ItLLSolve(int max_iter, double tolerance, bool use_tol, const LocalVector<ValueType> &in, const LocalVector<ValueType> &inv_diag, LocalVector<ValueType> *out) const#

Solve LL^T out = in iteratively using the Jacobi method.

void ItLAnalyse(bool diag_unit = false)#

Analyse the structure (level-scheduling) L-part.

  • diag_unit == true the diag is 1;

  • diag_unit == false the diag is 0;

void ItLAnalyseClear(void)#

Delete the analysed data (see ItLAnalyse) L-part.

void ItLSolve(int max_iter, double tolerance, bool use_tol, const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve L out = in iteratively using the Jacobi method.
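
A Jacobi iteration for a lower-triangular system updates every row from the previous iterate, x_new[i] = (b[i] - sum_{j<i} L[i][j] * x[j]) / L[i][i], so all rows can be computed independently. The sketch below is an illustration on plain CSR arrays, not the library's implementation. jacobi_lower_solve is a hypothetical helper; the zero initial guess, the treatment of a missing diagonal as 1, and the stopping test on the largest change per sweep are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: Jacobi sweeps for L x = b (L lower triangular, CSR).
// Returns the number of sweeps performed.
int jacobi_lower_solve(const std::vector<int>&    row_ptr,
                       const std::vector<int>&    col_ind,
                       const std::vector<double>& val,
                       const std::vector<double>& b,
                       int                        max_iter,
                       double                     tolerance,
                       bool                       use_tol,
                       std::vector<double>&       x)
{
    const int n = static_cast<int>(b.size());
    assert(row_ptr.size() == b.size() + 1);

    x.assign(n, 0.0); // assumption: zero initial guess
    std::vector<double> x_new(n, 0.0);

    for(int it = 1; it <= max_iter; ++it)
    {
        double max_change = 0.0;
        for(int i = 0; i < n; ++i)
        {
            double sum  = b[i];
            double diag = 1.0; // assumption: missing diagonal counts as 1
            for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            {
                const int j = col_ind[k];
                if(j < i)
                    sum -= val[k] * x[j]; // uses the previous iterate only
                else if(j == i)
                    diag = val[k];
            }
            x_new[i]   = sum / diag;
            max_change = std::max(max_change, std::fabs(x_new[i] - x[i]));
        }
        x.swap(x_new);
        if(use_tol && max_change <= tolerance)
            return it;
    }
    return max_iter;
}
```

For a triangular matrix of size n the iteration reproduces the exact forward-substitution result after at most n sweeps, since row i is final once all rows j < i are final.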

void ItUAnalyse(bool diag_unit = false)#

Analyse the structure (level-scheduling) U-part.

  • diag_unit == true the diag is 1;

  • diag_unit == false the diag is 0;

void ItUAnalyseClear(void)#

Delete the analysed data (see ItUAnalyse) U-part.

void ItUSolve(int max_iter, double tolerance, bool use_tol, const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve U out = in iteratively using the Jacobi method.

void Householder(int idx, ValueType &beta, LocalVector<ValueType> *vec) const#

Compute Householder vector.

void QRDecompose(void)#

QR Decomposition.

void QRSolve(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Solve QR out = in.

void Invert(void)#

Matrix inversion using QR decomposition.

void ReadFileMTX(const std::string &filename)#

Read matrix from MTX (Matrix Market Format) file.

Read a matrix from Matrix Market Format file.

Example
LocalMatrix<ValueType> mat;
mat.ReadFileMTX("my_matrix.mtx");

Parameters:

filename[in] name of the file containing the MTX data.

void WriteFileMTX(const std::string &filename) const#

Write matrix to MTX (Matrix Market Format) file.

Write a matrix to Matrix Market Format file.

Example
LocalMatrix<ValueType> mat;

// Allocate and fill mat
// ...

mat.WriteFileMTX("my_matrix.mtx");

Parameters:

filename[in] name of the file to write the MTX data to.

void ReadFileCSR(const std::string &filename)#

Read matrix from CSR (rocALUTION binary format) file.

Read a CSR matrix from binary file. For details on the format, see WriteFileCSR().

Example
LocalMatrix<ValueType> mat;
mat.ReadFileCSR("my_matrix.csr");

Parameters:

filename[in] name of the file containing the data.

void WriteFileCSR(const std::string &filename) const#

Write CSR matrix to binary file.

Write a CSR matrix to binary file.

The binary format contains a header, the rocALUTION version and the matrix data as follows

// Header
out << "#rocALUTION binary csr file" << std::endl;

// rocALUTION version
out.write((char*)&version, sizeof(int));

// CSR matrix data
out.write((char*)&m, sizeof(int));
out.write((char*)&n, sizeof(int));
out.write((char*)&nnz, sizeof(int64_t));
out.write((char*)csr_row_ptr, (m + 1) * sizeof(int));
out.write((char*)csr_col_ind, nnz * sizeof(int));
out.write((char*)csr_val, nnz * sizeof(double));

Example
LocalMatrix<ValueType> mat;

// Allocate and fill mat
// ...

mat.WriteFileCSR("my_matrix.csr");

Note

The matrix values array is always stored in double precision (e.g. double or std::complex<double>).

Parameters:

filename[in] name of the file to write the data to.
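
The layout listed above can be mirrored with standard streams, for example to inspect a file without the library. The sketch below follows only the field order shown in this section. CsrData, write_csr, read_csr and to_bytes are hypothetical names, real-valued data is assumed, and compatibility with files written by a particular rocALUTION version is not guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the arrays named in the layout above.
struct CsrData
{
    int                 m = 0;
    int                 n = 0;
    std::vector<int>    row_ptr;
    std::vector<int>    col_ind;
    std::vector<double> val;
};

void write_csr(std::ostream& out, int version, const CsrData& a)
{
    assert(a.row_ptr.size() == static_cast<std::size_t>(a.m) + 1);
    assert(a.col_ind.size() == a.val.size());
    const std::int64_t nnz = static_cast<std::int64_t>(a.val.size());

    out << "#rocALUTION binary csr file" << std::endl; // header line
    out.write(reinterpret_cast<const char*>(&version), sizeof(int));
    out.write(reinterpret_cast<const char*>(&a.m), sizeof(int));
    out.write(reinterpret_cast<const char*>(&a.n), sizeof(int));
    out.write(reinterpret_cast<const char*>(&nnz), sizeof(std::int64_t));
    out.write(reinterpret_cast<const char*>(a.row_ptr.data()),
              static_cast<std::streamsize>(a.row_ptr.size() * sizeof(int)));
    out.write(reinterpret_cast<const char*>(a.col_ind.data()),
              static_cast<std::streamsize>(a.col_ind.size() * sizeof(int)));
    out.write(reinterpret_cast<const char*>(a.val.data()),
              static_cast<std::streamsize>(a.val.size() * sizeof(double)));
}

bool read_csr(std::istream& in, int& version, CsrData& a)
{
    std::string header;
    if(!std::getline(in, header) || header != "#rocALUTION binary csr file")
        return false;

    std::int64_t nnz = 0;
    in.read(reinterpret_cast<char*>(&version), sizeof(int));
    in.read(reinterpret_cast<char*>(&a.m), sizeof(int));
    in.read(reinterpret_cast<char*>(&a.n), sizeof(int));
    in.read(reinterpret_cast<char*>(&nnz), sizeof(std::int64_t));
    if(!in || a.m < 0 || a.n < 0 || nnz < 0)
        return false;

    a.row_ptr.resize(static_cast<std::size_t>(a.m) + 1);
    a.col_ind.resize(static_cast<std::size_t>(nnz));
    a.val.resize(static_cast<std::size_t>(nnz));
    in.read(reinterpret_cast<char*>(a.row_ptr.data()),
            static_cast<std::streamsize>(a.row_ptr.size() * sizeof(int)));
    in.read(reinterpret_cast<char*>(a.col_ind.data()),
            static_cast<std::streamsize>(a.col_ind.size() * sizeof(int)));
    in.read(reinterpret_cast<char*>(a.val.data()),
            static_cast<std::streamsize>(a.val.size() * sizeof(double)));
    return static_cast<bool>(in);
}

// Convenience: serialize into an in-memory byte string.
std::string to_bytes(int version, const CsrData& a)
{
    std::ostringstream out;
    write_csr(out, version, a);
    return out.str();
}
```

When writing to a file instead of a string stream, the stream should be opened with std::ios::binary.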

virtual void MoveToAccelerator(void)#

Move all data (i.e. move the matrix) to the accelerator.

virtual void MoveToAcceleratorAsync(void)#

Move all data (i.e. move the matrix) to the accelerator asynchronously.

virtual void MoveToHost(void)#

Move all data (i.e. move the matrix) to the host.

virtual void MoveToHostAsync(void)#

Move all data (i.e. move the matrix) to the host asynchronously.

virtual void Sync(void)#

Synchronize the matrix.

void CopyFrom(const LocalMatrix<ValueType> &src)#

Copy matrix from another LocalMatrix.

CopyFrom copies values and structure from another local matrix. Source and destination matrix should be in the same format.

Example
LocalMatrix<ValueType> mat1, mat2;

// Allocate and initialize mat1 and mat2
// ...

// Move mat1 to accelerator
mat1.MoveToAccelerator();

// Now, mat1 is on the accelerator (if available)
// and mat2 is on the host

// Copy mat2 into mat1; this moves data between the host and the
// accelerator backend
mat1.CopyFrom(mat2);

Note

This function allows cross platform copying. One of the objects could be allocated on the accelerator backend.

Parameters:

src[in] Local matrix where values and structure should be copied from.

void CopyFromAsync(const LocalMatrix<ValueType> &src)#

Async copy matrix (values and structure) from another LocalMatrix.

void CloneFrom(const LocalMatrix<ValueType> &src)#

Clone the matrix.

CloneFrom clones the entire matrix, including values, structure and backend descriptor from another LocalMatrix.

Example
LocalMatrix<ValueType> mat;

// Allocate and initialize mat (host or accelerator)
// ...

LocalMatrix<ValueType> tmp;

// By cloning mat, tmp will have identical values and structure and will be on
// the same backend as mat
tmp.CloneFrom(mat);

Parameters:

src[in] LocalMatrix to clone from.

void UpdateValuesCSR(ValueType *val)#

Update CSR matrix entries only, structure will remain the same.

void CopyFromCSR(const PtrType *row_offsets, const int *col, const ValueType *val)#

Copy (import) CSR matrix described in three arrays (offsets, columns, values). The object data has to be allocated (call AllocateCSR first)

void CopyToCSR(PtrType *row_offsets, int *col, ValueType *val) const#

Copy (export) CSR matrix described in three arrays (offsets, columns, values). The output arrays have to be allocated.

void CopyFromCOO(const int *row, const int *col, const ValueType *val)#

Copy (import) COO matrix described in three arrays (rows, columns, values). The object data has to be allocated (call AllocateCOO first)

void CopyToCOO(int *row, int *col, ValueType *val) const#

Copy (export) COO matrix described in three arrays (rows, columns, values). The output arrays have to be allocated.

void CopyFromHostCSR(const PtrType *row_offset, const int *col, const ValueType *val, const std::string &name, int64_t nnz, int64_t nrow, int64_t ncol)#

Allocates and copies (imports) a host CSR matrix.

If the CSR matrix data pointers are only accessible as constant, the user can create a LocalMatrix object and pass const CSR host pointers. The LocalMatrix is then allocated and the data is copied to the backend on which the LocalMatrix object is located.

Parameters:
  • row_offset[in] CSR matrix row offset pointers.

  • col[in] CSR matrix column indices.

  • val[in] CSR matrix values array.

  • name[in] Matrix object name.

  • nnz[in] Number of non-zero elements.

  • nrow[in] Number of rows.

  • ncol[in] Number of columns.

void CreateFromMap(const LocalVector<int> &map, int64_t n, int64_t m)#

Create a restriction matrix operator based on an int vector map.

void CreateFromMap(const LocalVector<int> &map, int64_t n, int64_t m, LocalMatrix<ValueType> *pro)#

Create a restriction and prolongation matrix operator based on an int vector map.

void ConvertToCSR(void)#

Convert the matrix to CSR structure.

void ConvertToMCSR(void)#

Convert the matrix to MCSR structure.

void ConvertToBCSR(int blockdim)#

Convert the matrix to BCSR structure.

void ConvertToCOO(void)#

Convert the matrix to COO structure.

void ConvertToELL(void)#

Convert the matrix to ELL structure.

void ConvertToDIA(void)#

Convert the matrix to DIA structure.

void ConvertToHYB(void)#

Convert the matrix to HYB structure.

void ConvertToDENSE(void)#

Convert the matrix to DENSE structure.

void ConvertTo(unsigned int matrix_format, int blockdim = 1)#

Convert the matrix to specified matrix ID format.

virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Perform matrix-vector multiplication, out = this * in.

Example
// rocALUTION structures
LocalMatrix<T> A;
LocalVector<T> x;
LocalVector<T> y;

// Allocate matrices and vectors
A.AllocateCSR("my CSR matrix", 456, 100, 100);
x.Allocate("x", A.GetN());
y.Allocate("y", A.GetM());

// Fill data in A matrix and x vector

A.Apply(x, &y);

virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#

Perform matrix-vector multiplication and add the result to out, out = out + scalar * this * in.

Example
// rocALUTION structures
LocalMatrix<T> A;
LocalVector<T> x;
LocalVector<T> y;

// Allocate matrices and vectors
A.AllocateCSR("my CSR matrix", 456, 100, 100);
x.Allocate("x", A.GetN());
y.Allocate("y", A.GetM());

// Fill data in A matrix and x vector

T scalar = 2.0;
A.ApplyAdd(x, scalar, &y);
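
What the two operations compute can be sketched on plain CSR arrays. csr_apply and csr_apply_add are hypothetical helpers, not the library's kernels; the sketch assumes that ApplyAdd accumulates into the output vector, i.e. out = out + scalar * A * in.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: out = A * in for a CSR matrix A.
void csr_apply(const std::vector<int>&    row_ptr,
               const std::vector<int>&    col_ind,
               const std::vector<double>& val,
               const std::vector<double>& in,
               std::vector<double>&       out)
{
    assert(row_ptr.size() == out.size() + 1);
    for(std::size_t i = 0; i < out.size(); ++i)
    {
        double sum = 0.0;
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * in[col_ind[k]];
        out[i] = sum;
    }
}

// Hypothetical helper: out = out + scalar * A * in (assumed semantics).
void csr_apply_add(const std::vector<int>&    row_ptr,
                   const std::vector<int>&    col_ind,
                   const std::vector<double>& val,
                   const std::vector<double>& in,
                   double                     scalar,
                   std::vector<double>&       out)
{
    assert(row_ptr.size() == out.size() + 1);
    for(std::size_t i = 0; i < out.size(); ++i)
    {
        double sum = 0.0;
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * in[col_ind[k]];
        out[i] += scalar * sum;
    }
}
```

The input vector needs GetN() entries and the output vector GetM() entries, matching the allocation in the examples above.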

void SymbolicPower(int p)#

Perform symbolic computation (structure only) of \(|this|^p\).

void MatrixAdd(const LocalMatrix<ValueType> &mat, ValueType alpha = static_cast<ValueType>(1), ValueType beta = static_cast<ValueType>(1), bool structure = false)#

Perform matrix addition, this = alpha*this + beta*mat.

  • if structure==false the sparsity pattern of the matrix is not changed;

  • if structure==true a new sparsity pattern is computed

void MatrixMult(const LocalMatrix<ValueType> &A, const LocalMatrix<ValueType> &B)#

Multiply two matrices, this = A * B.

void DiagonalMatrixMult(const LocalVector<ValueType> &diag)#

Multiply the matrix with a diagonal matrix (stored in a LocalVector); equivalent to DiagonalMatrixMultR().

void DiagonalMatrixMultL(const LocalVector<ValueType> &diag)#

Multiply the matrix from the left with a diagonal matrix (stored in a LocalVector), this = diag * this.

void DiagonalMatrixMultR(const LocalVector<ValueType> &diag)#

Multiply the matrix from the right with a diagonal matrix (stored in a LocalVector), this = this * diag.
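
The difference between the left and right variants is that diag * A scales rows while A * diag scales columns. A minimal sketch on plain CSR arrays, using the hypothetical helpers diag_mult_left and diag_mult_right (not library functions):

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: A = diag * A, i.e. entry (i, j) is scaled by diag[i].
void diag_mult_left(const std::vector<int>&    row_ptr,
                    std::vector<double>&       val,
                    const std::vector<double>& diag)
{
    assert(row_ptr.size() == diag.size() + 1);
    for(std::size_t i = 0; i < diag.size(); ++i)
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            val[k] *= diag[i];
}

// Hypothetical helper: A = A * diag, i.e. entry (i, j) is scaled by diag[j].
void diag_mult_right(const std::vector<int>&    col_ind,
                     std::vector<double>&       val,
                     const std::vector<double>& diag)
{
    assert(col_ind.size() == val.size());
    for(std::size_t k = 0; k < val.size(); ++k)
    {
        assert(col_ind[k] >= 0 && static_cast<std::size_t>(col_ind[k]) < diag.size());
        val[k] *= diag[col_ind[k]];
    }
}
```

In both cases the sparsity pattern stays unchanged; only the values array is modified.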

void TripleMatrixProduct(const LocalMatrix<ValueType> &R, const LocalMatrix<ValueType> &A, const LocalMatrix<ValueType> &P)#

Triple matrix product, this = R * A * P.

void Gershgorin(ValueType &lambda_min, ValueType &lambda_max) const#

Compute the spectrum approximation with Gershgorin circles theorem.
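
By the Gershgorin circle theorem, every eigenvalue lies in at least one disc centered at a diagonal entry a_ii with radius r_i, the sum of |a_ij| over j != i. For a real matrix with real eigenvalues this gives the bounds min_i(a_ii - r_i) and max_i(a_ii + r_i). The sketch below illustrates the computation on CSR arrays. gershgorin_bounds is a hypothetical helper and restricting the sketch to real values is an assumption; it is not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: Gershgorin bounds for a real, square CSR matrix.
void gershgorin_bounds(const std::vector<int>&    row_ptr,
                       const std::vector<int>&    col_ind,
                       const std::vector<double>& val,
                       double&                    lambda_min,
                       double&                    lambda_max)
{
    assert(col_ind.size() == val.size());
    const int n = row_ptr.empty() ? 0 : static_cast<int>(row_ptr.size()) - 1;
    if(n == 0)
    {
        lambda_min = lambda_max = 0.0;
        return;
    }

    lambda_min = std::numeric_limits<double>::infinity();
    lambda_max = -std::numeric_limits<double>::infinity();

    for(int i = 0; i < n; ++i)
    {
        double diag   = 0.0; // center of the disc
        double radius = 0.0; // sum of absolute off-diagonal entries
        for(int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        {
            if(col_ind[k] == i)
                diag += val[k];
            else
                radius += std::fabs(val[k]);
        }
        lambda_min = std::min(lambda_min, diag - radius);
        lambda_max = std::max(lambda_max, diag + radius);
    }
}
```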

void Compress(double drop_off)#

Delete all entries a_ij of the matrix for which abs(a_ij) <= drop_off; the diagonal elements are never deleted.
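
The drop rule can be sketched as an in-place compaction of the CSR arrays. csr_compress is a hypothetical helper written for this illustration, not the library's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: remove entries with |a_ij| <= drop_off from a CSR
// matrix in place; diagonal entries are always kept.
void csr_compress(std::vector<int>&    row_ptr,
                  std::vector<int>&    col_ind,
                  std::vector<double>& val,
                  double               drop_off)
{
    assert(col_ind.size() == val.size());
    if(row_ptr.empty())
        return;

    const int nrow      = static_cast<int>(row_ptr.size()) - 1;
    int       write     = 0;          // next free slot in the compacted arrays
    int       row_begin = row_ptr[0]; // start of the current row (old layout)

    for(int i = 0; i < nrow; ++i)
    {
        const int row_end = row_ptr[i + 1]; // read before it is overwritten
        row_ptr[i]        = write;
        for(int k = row_begin; k < row_end; ++k)
        {
            if(col_ind[k] == i || std::fabs(val[k]) > drop_off)
            {
                col_ind[write] = col_ind[k]; // write <= k, so this is safe
                val[write]     = val[k];
                ++write;
            }
        }
        row_begin = row_end;
    }
    row_ptr[nrow] = write;
    col_ind.resize(write);
    val.resize(write);
}
```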

virtual void Transpose(void)#

Transpose the matrix.

void Transpose(LocalMatrix<ValueType> *T) const#

Transpose the matrix.

void Sort(void)#

Sort the matrix indices.

Sorts the matrix by indices.

  • For CSR matrices, column indices are sorted.

  • For COO matrices, row indices are sorted.
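
For the CSR case, sorting means ordering the column indices within each row while keeping every value attached to its index. A minimal sketch using the hypothetical helper csr_sort_columns (not the library's implementation):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: sort the column indices of every CSR row in
// increasing order and move the values along with them.
void csr_sort_columns(const std::vector<int>& row_ptr,
                      std::vector<int>&       col_ind,
                      std::vector<double>&    val)
{
    assert(col_ind.size() == val.size());
    const int nrow = row_ptr.empty() ? 0 : static_cast<int>(row_ptr.size()) - 1;

    std::vector<std::pair<int, double> > row;
    for(int i = 0; i < nrow; ++i)
    {
        const int begin = row_ptr[i];
        const int end   = row_ptr[i + 1];

        // Gather (column, value) pairs of this row.
        row.clear();
        for(int k = begin; k < end; ++k)
            row.push_back(std::make_pair(col_ind[k], val[k]));

        // Sort by column index only.
        std::stable_sort(row.begin(),
                         row.end(),
                         [](const std::pair<int, double>& a,
                            const std::pair<int, double>& b) { return a.first < b.first; });

        // Scatter the sorted pairs back.
        for(int k = begin; k < end; ++k)
        {
            col_ind[k] = row[k - begin].first;
            val[k]     = row[k - begin].second;
        }
    }
}
```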

void Key(long int &row_key, long int &col_key, long int &val_key) const#

Compute a unique hash key for the matrix arrays.

It is typically hard to check whether two matrices have the same structure (and values). To make this easier, rocALUTION provides a keying function that generates three keys: one for the row index array, one for the column index array, and one for the values array.

Parameters:
  • row_key[out] row index array key

  • col_key[out] column index array key

  • val_key[out] values array key

void ReplaceColumnVector(int idx, const LocalVector<ValueType> &vec)#

Replace a column vector of a matrix.

void ReplaceRowVector(int idx, const LocalVector<ValueType> &vec)#

Replace a row vector of a matrix.

void ExtractColumnVector(int idx, LocalVector<ValueType> *vec) const#

Extract values from a column of a matrix to a vector.

void ExtractRowVector(int idx, LocalVector<ValueType> *vec) const#

Extract values from a row of a matrix to a vector.

void AMGConnect(ValueType eps, LocalVector<int> *connections) const#

Strong couplings for aggregation-based AMG.

void AMGAggregate(const LocalVector<int> &connections, LocalVector<int> *aggregates) const#

Plain aggregation - Modification of a greedy aggregation scheme from Vanek (1996)

void AMGPMISAggregate(const LocalVector<int> &connections, LocalVector<int> *aggregates) const#

Parallel aggregation - Parallel maximal independent set aggregation scheme from Bell, Dalton, & Olsen (2012)

void AMGSmoothedAggregation(ValueType relax, const LocalVector<int> &aggregates, const LocalVector<int> &connections, LocalMatrix<ValueType> *prolong, int lumping_strat = 0) const#

Interpolation scheme based on smoothed aggregation from Vanek (1996)

void AMGAggregation(const LocalVector<int> &aggregates, LocalMatrix<ValueType> *prolong) const#

Aggregation-based interpolation scheme.

void AMGGreedyAggregate(ValueType eps, LocalVector<bool> *connections, LocalVector<int64_t> *aggregates, LocalVector<int64_t> *aggregate_root_nodes) const#

Plain aggregation - Modification of a greedy aggregation scheme from Vanek (1996)

void AMGPMISAggregate(ValueType eps, LocalVector<bool> *connections, LocalVector<int64_t> *aggregates, LocalVector<int64_t> *aggregate_root_nodes) const#

Parallel aggregation - Parallel maximal independent set aggregation scheme from Bell, Dalton, & Olsen (2012)

void AMGSmoothedAggregation(ValueType relax, const LocalVector<bool> &connections, const LocalVector<int64_t> &aggregates, const LocalVector<int64_t> &aggregate_root_nodes, LocalMatrix<ValueType> *prolong, int lumping_strat = 0) const#

Interpolation scheme based on smoothed aggregation from Vanek (1996)

void AMGUnsmoothedAggregation(const LocalVector<int64_t> &aggregates, const LocalVector<int64_t> &aggregate_root_nodes, LocalMatrix<ValueType> *prolong) const#

Aggregation-based interpolation scheme.

void RSCoarsening(float eps, LocalVector<int> *CFmap, LocalVector<bool> *S) const#

Ruge Stueben coarsening.

void RSPMISCoarsening(float eps, LocalVector<int> *CFmap, LocalVector<bool> *S) const#

Parallel maximal independent set coarsening for RS AMG.

void RSDirectInterpolation(const LocalVector<int> &CFmap, const LocalVector<bool> &S, LocalMatrix<ValueType> *prolong) const#

Ruge Stueben Direct Interpolation.

void RSExtPIInterpolation(const LocalVector<int> &CFmap, const LocalVector<bool> &S, bool FF1, LocalMatrix<ValueType> *prolong) const#

Ruge Stueben Ext+i Interpolation.

void RSExtPIProlongNnz(int64_t global_column_begin, int64_t global_column_end, bool FF1, const LocalVector<int64_t> &l2g, const LocalVector<int> &CFmap, const LocalVector<bool> &S, const LocalMatrix<ValueType> &ghost, const LocalVector<PtrType> &bnd_csr_row_ptr, const LocalVector<int64_t> &bnd_csr_col_ind, LocalVector<int> *f2c, LocalMatrix<ValueType> *prolong_int, LocalMatrix<ValueType> *prolong_gst) const#

Ruge Stueben Prolongation matrix non-zeros.

void RSExtPIProlongFill(int64_t global_column_begin, int64_t global_column_end, bool FF1, const LocalVector<int64_t> &l2g, const LocalVector<int> &f2c, const LocalVector<int> &CFmap, const LocalVector<bool> &S, const LocalMatrix<ValueType> &ghost, const LocalVector<PtrType> &bnd_csr_row_ptr, const LocalVector<int64_t> &bnd_csr_col_ind, const LocalVector<PtrType> &ext_csr_row_ptr, const LocalVector<int64_t> &ext_csr_col_ind, const LocalVector<ValueType> &ext_csr_val, LocalMatrix<ValueType> *prolong_int, LocalMatrix<ValueType> *prolong_gst, LocalVector<int64_t> *global_ghost_col) const#

Ruge Stueben fill Prolongation matrix non-zeros.

void FSAI(int power, const LocalMatrix<ValueType> *pattern)#

Factorized Sparse Approximate Inverse assembly for given system matrix power pattern or external sparsity pattern.

void SPAI(void)#

SParse Approximate Inverse assembly for given system matrix pattern.

void InitialPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Initial Pairwise Aggregation scheme.

void InitialPairwiseAggregation(const LocalMatrix<ValueType> &mat, ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Initial Pairwise Aggregation scheme for split matrices.

void FurtherPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Further Pairwise Aggregation scheme.

void FurtherPairwiseAggregation(const LocalMatrix<ValueType> &mat, ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Further Pairwise Aggregation scheme for split matrices.

void CoarsenOperator(LocalMatrix<ValueType> *Ac, int nrow, int ncol, const LocalVector<int> &G, int Gsize, const int *rG, int rGsize) const#

Build coarse operator for pairwise aggregation scheme.

void CompressAdd(const LocalVector<int64_t> &l2g, const LocalVector<int64_t> &global_ghost_col, const LocalMatrix<ValueType> &ext, LocalVector<int64_t> *global_col)#

Merge ghost and Ext matrix.

Local Stencil#

template<typename ValueType>
class LocalStencil : public rocalution::Operator<ValueType>#

LocalStencil class.

A LocalStencil is called local, because it will always stay on a single system. The system can contain several CPUs via UMA or NUMA memory system or it can contain an accelerator.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> and std::complex<double>

Public Functions

LocalStencil(unsigned int type)#

Initialize a local stencil with a type.

virtual void Info() const#

Shows simple info about the stencil.

int64_t GetNDim(void) const#

Return the dimension of the stencil.

virtual int64_t GetM(void) const#

Return the number of rows in the local stencil.

virtual int64_t GetN(void) const#

Return the number of columns in the local stencil.

virtual int64_t GetNnz(void) const#

Return the number of non-zeros in the local stencil.

void SetGrid(int size)#

Set the stencil grid size.

virtual void Clear()#

Clear (free) the stencil.

virtual void Apply(const LocalVector<ValueType> &in, LocalVector<ValueType> *out) const#

Perform stencil-vector multiplication, out = this * in.

virtual void ApplyAdd(const LocalVector<ValueType> &in, ValueType scalar, LocalVector<ValueType> *out) const#

Perform stencil-vector multiplication and add the result to out, out = out + scalar * this * in.

virtual void MoveToAccelerator(void)#

Move all data (i.e. move the stencil) to the accelerator.

virtual void MoveToHost(void)#

Move all data (i.e. move the stencil) to the host.

Global Matrix#

template<typename ValueType>
class GlobalMatrix : public rocalution::Operator<ValueType>#

GlobalMatrix class.

A GlobalMatrix is called global, because it can stay on a single or on multiple nodes in a network. For this type of communication, MPI is used.

A number of matrix formats are supported. These are CSR, BCSR, MCSR, COO, DIA, ELL, HYB, and DENSE.

Note

For CSR matrices, the column indices must be sorted in increasing order. For COO matrices, the row indices must be sorted in increasing order. The function Check can be used to check whether a matrix contains valid data. For CSR and COO matrices, the function Sort can be used to sort the column or row indices, respectively.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> and std::complex<double>

Public Functions

explicit GlobalMatrix(const ParallelManager &pm)#

Initialize a global matrix with a parallel manager.

virtual int64_t GetM(void) const#

Return the number of rows in the global matrix.

virtual int64_t GetN(void) const#

Return the number of columns in the global matrix.

virtual int64_t GetNnz(void) const#

Return the number of non-zeros in the global matrix.

virtual int64_t GetLocalM(void) const#

Return the number of rows in the interior matrix.

virtual int64_t GetLocalN(void) const#

Return the number of columns in the interior matrix.

virtual int64_t GetLocalNnz(void) const#

Return the number of non-zeros in the interior matrix.

virtual int64_t GetGhostM(void) const#

Return the number of rows in the ghost matrix.

virtual int64_t GetGhostN(void) const#

Return the number of columns in the ghost matrix.

virtual int64_t GetGhostNnz(void) const#

Return the number of non-zeros in the ghost matrix.

unsigned int GetFormat(void) const#

Return the global matrix format id (see matrix_formats.hpp)

virtual void MoveToAccelerator(void)#

Move all data (i.e. move the part of the global matrix stored on this rank) to the accelerator.

virtual void MoveToHost(void)#

Move all data (i.e. move the part of the global matrix stored on this rank) to the host.

virtual void Info(void) const#

Shows simple info about the matrix.

virtual bool Check(void) const#

Perform a sanity check of the matrix.

Checks whether the matrix contains valid data, i.e. whether the values are neither infinity nor NaN (not a number) and whether the structure of the matrix is correct (e.g. indices cannot be negative, CSR and COO matrices have to be sorted, etc.).

Return values:
  • true – if the matrix is ok (empty matrix is also ok).

  • false – if there is something wrong with the structure or values.
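The conditions listed for Check can be illustrated on a bare CSR triple. The following is a minimal sketch under stated assumptions, not the library's implementation: `csr_is_valid` is a hypothetical helper that works on plain `std::vector` arrays, and it treats "sorted" as "column indices do not decrease within a row".

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of rocALUTION): validates a CSR matrix given
// as raw arrays, following the conditions described for Check().
bool csr_is_valid(int64_t                     nrow,
                  int64_t                     ncol,
                  const std::vector<int64_t>& row_offset,
                  const std::vector<int>&     col,
                  const std::vector<double>&  val)
{
    // An empty matrix is considered valid.
    if(nrow == 0 && col.empty() && val.empty())
    {
        return true;
    }

    // Structure: nrow + 1 offsets, starting at 0 and ending at nnz.
    if(row_offset.size() != static_cast<std::size_t>(nrow) + 1 || row_offset.front() != 0
       || row_offset.back() != static_cast<int64_t>(col.size()) || col.size() != val.size())
    {
        return false;
    }

    // Offsets must not decrease; verify this before indexing col and val.
    for(int64_t i = 0; i < nrow; ++i)
    {
        if(row_offset[i] > row_offset[i + 1])
        {
            return false;
        }
    }

    for(int64_t i = 0; i < nrow; ++i)
    {
        for(int64_t j = row_offset[i]; j < row_offset[i + 1]; ++j)
        {
            // Column indices must be non-negative and within the matrix.
            if(col[j] < 0 || col[j] >= ncol)
            {
                return false;
            }
            // Column indices must be sorted within each row.
            if(j > row_offset[i] && col[j - 1] > col[j])
            {
                return false;
            }
            // Values must be neither infinity nor NaN.
            if(!std::isfinite(val[j]))
            {
                return false;
            }
        }
    }

    return true;
}
```

A matrix that fails only the ordering test can usually be repaired by sorting its indices, whereas non-finite values point to a problem in the data itself.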

void AllocateCSR(const std::string &name, int64_t local_nnz, int64_t ghost_nnz)#

Allocate CSR Matrix.

void AllocateCOO(const std::string &name, int64_t local_nnz, int64_t ghost_nnz)#

Allocate COO Matrix.

virtual void Clear(void)#

Clear (free) the matrix.

void SetParallelManager(const ParallelManager &pm)#

Set the parallel manager of a global matrix.

void SetDataPtrCSR(PtrType **local_row_offset, int **local_col, ValueType **local_val, PtrType **ghost_row_offset, int **ghost_col, ValueType **ghost_val, std::string name, int64_t local_nnz, int64_t ghost_nnz)#

Initialize a CSR matrix on the host with externally allocated data.

void SetDataPtrCOO(int **local_row, int **local_col, ValueType **local_val, int **ghost_row, int **ghost_col, ValueType **ghost_val, std::string name, int64_t local_nnz, int64_t ghost_nnz)#

Initialize a COO matrix on the host with externally allocated data.

void SetLocalDataPtrCSR(PtrType **row_offset, int **col, ValueType **val, std::string name, int64_t nnz)#

Initialize a CSR matrix on the host with externally allocated local data.

void SetLocalDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int64_t nnz)#

Initialize a COO matrix on the host with externally allocated local data.

void SetGhostDataPtrCSR(PtrType **row_offset, int **col, ValueType **val, std::string name, int64_t nnz)#

Initialize a CSR matrix on the host with externally allocated ghost data.

void SetGhostDataPtrCOO(int **row, int **col, ValueType **val, std::string name, int64_t nnz)#

Initialize a COO matrix on the host with externally allocated ghost data.

void LeaveDataPtrCSR(PtrType **local_row_offset, int **local_col, ValueType **local_val, PtrType **ghost_row_offset, int **ghost_col, ValueType **ghost_val)#

Leave a CSR matrix to host pointers.

void LeaveDataPtrCOO(int **local_row, int **local_col, ValueType **local_val, int **ghost_row, int **ghost_col, ValueType **ghost_val)#

Leave a COO matrix to host pointers.

void LeaveLocalDataPtrCSR(PtrType **row_offset, int **col, ValueType **val)#

Leave a local CSR matrix to host pointers.

void LeaveLocalDataPtrCOO(int **row, int **col, ValueType **val)#

Leave a local COO matrix to host pointers.

void LeaveGhostDataPtrCSR(PtrType **row_offset, int **col, ValueType **val)#

Leave a CSR ghost matrix to host pointers.

void LeaveGhostDataPtrCOO(int **row, int **col, ValueType **val)#

Leave a COO ghost matrix to host pointers.

void CloneFrom(const GlobalMatrix<ValueType> &src)#

Clone the entire matrix (values, structure and backend descriptor) from another GlobalMatrix.

void CopyFrom(const GlobalMatrix<ValueType> &src)#

Copy matrix (values and structure) from another GlobalMatrix.

void ConvertToCSR(void)#

Convert the matrix to CSR structure.

void ConvertToMCSR(void)#

Convert the matrix to MCSR structure.

void ConvertToBCSR(int blockdim)#

Convert the matrix to BCSR structure.

void ConvertToCOO(void)#

Convert the matrix to COO structure.

void ConvertToELL(void)#

Convert the matrix to ELL structure.

void ConvertToDIA(void)#

Convert the matrix to DIA structure.

void ConvertToHYB(void)#

Convert the matrix to HYB structure.

void ConvertToDENSE(void)#

Convert the matrix to DENSE structure.

void ConvertTo(unsigned int matrix_format, int blockdim = 1)#

Convert the matrix to specified matrix ID format.

virtual void Apply(const GlobalVector<ValueType> &in, GlobalVector<ValueType> *out) const#

Perform matrix-vector multiplication, out = this * in.

virtual void ApplyAdd(const GlobalVector<ValueType> &in, ValueType scalar, GlobalVector<ValueType> *out) const#

Perform matrix-vector multiplication and add the scaled result to out, out = out + scalar * this * in.
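The difference between Apply and ApplyAdd can be sketched on a single-process CSR matrix. This is an illustration only: it assumes that ApplyAdd accumulates into out (out = out + scalar * this * in), and `CsrMatrix` and `apply_add` are hypothetical names that are not part of the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not a rocALUTION type.
struct CsrMatrix
{
    std::vector<int>    row_offset; // size nrow + 1
    std::vector<int>    col;        // size nnz
    std::vector<double> val;        // size nnz
};

// Assumed ApplyAdd semantics: out = out + scalar * A * in.
void apply_add(const CsrMatrix&           A,
               const std::vector<double>& in,
               double                     scalar,
               std::vector<double>&       out)
{
    for(std::size_t i = 0; i + 1 < A.row_offset.size(); ++i)
    {
        // Row i of A times in.
        double sum = 0.0;
        for(int j = A.row_offset[i]; j < A.row_offset[i + 1]; ++j)
        {
            sum += A.val[j] * in[A.col[j]];
        }

        // Accumulate instead of overwriting; Apply would assign out[i] = sum.
        out[i] += scalar * sum;
    }
}
```

In a distributed matrix, the same row product would also include the contribution of the ghost part, which needs values received from neighboring ranks.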

virtual void Transpose(void)#

Transpose the matrix.

void Transpose(GlobalMatrix<ValueType> *T) const#

Transpose the matrix and store the result in T.

void TripleMatrixProduct(const GlobalMatrix<ValueType> &R, const GlobalMatrix<ValueType> &A, const GlobalMatrix<ValueType> &P)#

Triple matrix product C=RAP.

void ReadFileMTX(const std::string &filename)#

Read matrix from MTX (Matrix Market Format) file.

void WriteFileMTX(const std::string &filename) const#

Write matrix to MTX (Matrix Market Format) file.

void ReadFileCSR(const std::string &filename)#

Read matrix from CSR (ROCALUTION binary format) file.

void WriteFileCSR(const std::string &filename) const#

Write matrix to CSR (ROCALUTION binary format) file.

void Sort(void)#

Sort the matrix indices.

Sorts the matrix by indices.

  • For CSR matrices, column indices are sorted.

  • For COO matrices, row indices are sorted.
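For the CSR case, sorting means reordering the entries of each row by column index while keeping every value attached to its column. A minimal sketch, assuming plain `std::vector` storage and a hypothetical helper `sort_csr_rows` (not the library's implementation):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sorts the entries of each CSR row by column index.
void sort_csr_rows(const std::vector<int>& row_offset,
                   std::vector<int>&       col,
                   std::vector<double>&    val)
{
    for(std::size_t i = 0; i + 1 < row_offset.size(); ++i)
    {
        const int begin = row_offset[i];
        const int end   = row_offset[i + 1];

        // Indices of this row's entries, ordered by their column index.
        std::vector<int> perm(end - begin);
        std::iota(perm.begin(), perm.end(), begin);
        std::sort(perm.begin(), perm.end(), [&](int a, int b) { return col[a] < col[b]; });

        // Gather columns and values in sorted order, then write them back.
        std::vector<int>    sorted_col(perm.size());
        std::vector<double> sorted_val(perm.size());
        for(std::size_t k = 0; k < perm.size(); ++k)
        {
            sorted_col[k] = col[perm[k]];
            sorted_val[k] = val[perm[k]];
        }
        std::copy(sorted_col.begin(), sorted_col.end(), col.begin() + begin);
        std::copy(sorted_val.begin(), sorted_val.end(), val.begin() + begin);
    }
}
```

The row offsets are unchanged, because sorting only reorders entries inside a row.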

void ExtractDiagonal(GlobalVector<ValueType> *vec_diag) const#

Extract the diagonal values of the matrix into a GlobalVector.

void ExtractInverseDiagonal(GlobalVector<ValueType> *vec_inv_diag) const#

Extract the inverse (reciprocal) diagonal values of the matrix into a GlobalVector.
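As an illustration of what "inverse diagonal" means, the sketch below collects 1 / a_ii for each row of a CSR matrix. It is not the library's code: `extract_inverse_diagonal` is a hypothetical helper, and leaving the entry at zero when a row has no stored diagonal element is an assumption made only for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the reciprocal of each diagonal entry of a CSR matrix.
std::vector<double> extract_inverse_diagonal(const std::vector<int>&    row_offset,
                                             const std::vector<int>&    col,
                                             const std::vector<double>& val)
{
    const std::size_t   nrow = row_offset.empty() ? 0 : row_offset.size() - 1;
    std::vector<double> inv_diag(nrow, 0.0);

    for(std::size_t i = 0; i < nrow; ++i)
    {
        for(int j = row_offset[i]; j < row_offset[i + 1]; ++j)
        {
            // The diagonal entry of row i is the one stored in column i.
            if(col[j] == static_cast<int>(i))
            {
                inv_diag[i] = 1.0 / val[j];
                break;
            }
        }
    }

    return inv_diag;
}
```

Such a vector is what a Jacobi-type smoother multiplies the residual with, which is why it is convenient to extract it once.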

void Scale(ValueType alpha)#

Scale all the values in the matrix.

void InitialPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Initial Pairwise Aggregation scheme.

void FurtherPairwiseAggregation(ValueType beta, int &nc, LocalVector<int> *G, int &Gsize, int **rG, int &rGsize, int ordering) const#

Further Pairwise Aggregation scheme.

void CoarsenOperator(GlobalMatrix<ValueType> *Ac, int nrow, int ncol, const LocalVector<int> &G, int Gsize, const int *rG, int rGsize) const#

Build coarse operator for pairwise aggregation scheme.

void CreateFromMap(const LocalVector<int> &map, int64_t n, int64_t m, GlobalMatrix<ValueType> *pro)#

Create a restriction and prolongation matrix operator based on an int vector map.

void AMGGreedyAggregate(ValueType eps, LocalVector<bool> *connections, LocalVector<int64_t> *aggregates, LocalVector<int64_t> *aggregate_root_nodes) const#

Plain aggregation - Modification of a greedy aggregation scheme from Vanek (1996)

void AMGPMISAggregate(ValueType eps, LocalVector<bool> *connections, LocalVector<int64_t> *aggregates, LocalVector<int64_t> *aggregate_root_nodes) const#

Parallel aggregation - Parallel maximal independent set aggregation scheme from Bell, Dalton, & Olsen (2012)

void AMGSmoothedAggregation(ValueType relax, const LocalVector<bool> &connections, const LocalVector<int64_t> &aggregates, const LocalVector<int64_t> &aggregate_root_nodes, GlobalMatrix<ValueType> *prolong, int lumping_strat = 0) const#

Interpolation scheme based on smoothed aggregation from Vanek (1996)

void AMGUnsmoothedAggregation(const LocalVector<int64_t> &aggregates, const LocalVector<int64_t> &aggregate_root_nodes, GlobalMatrix<ValueType> *prolong) const#

Aggregation-based interpolation scheme.

void RSCoarsening(float eps, LocalVector<int> *CFmap, LocalVector<bool> *S) const#

Ruge Stueben coarsening.

void RSPMISCoarsening(float eps, LocalVector<int> *CFmap, LocalVector<bool> *S) const#

Parallel maximal independent set coarsening for RS AMG.

void RSDirectInterpolation(const LocalVector<int> &CFmap, const LocalVector<bool> &S, GlobalMatrix<ValueType> *prolong) const#

Ruge Stueben Direct Interpolation.

void RSExtPIInterpolation(const LocalVector<int> &CFmap, const LocalVector<bool> &S, bool FF1, GlobalMatrix<ValueType> *prolong) const#

Ruge Stueben Ext+i Interpolation.

Local Vector#

template<typename ValueType>
class LocalVector : public rocalution::Vector<ValueType>#

LocalVector class.

A LocalVector is called local, because it will always stay on a single system. The system can contain several CPUs via UMA or NUMA memory system or it can contain an accelerator.

Template Parameters:

ValueType – can be int, float, double, std::complex<float> or std::complex<double>.

Unnamed Group

ValueType &operator[](int64_t i)#

Access operator (only for host data)

The elements in the vector can be accessed via [] operators, when the vector is allocated on the host.

Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate vector
vec.Allocate("my_vector", 100);

// Initialize vector with 1
vec.Ones();

// Set even elements to -1
for(int64_t i = 0; i < vec.GetSize(); i += 2)
{
  vec[i] = -1;
}

Parameters:

i[in] access data at index i

Returns:

value at index i

const ValueType &operator[](int64_t i) const#

Access operator (only for host data)

The elements in the vector can be accessed via [] operators, when the vector is allocated on the host.

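Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate vector
vec.Allocate("my_vector", 100);

// Initialize vector with 1
vec.Ones();

// Read the elements through a const reference (selects the const overload)
const LocalVector<ValueType>& cvec = vec;
ValueType sum = static_cast<ValueType>(0);
for(int64_t i = 0; i < cvec.GetSize(); i++)
{
  sum += cvec[i];
}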

Parameters:

i[in] access data at index i

Returns:

value at index i

Unnamed Group

virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta)#

Perform scalar-vector multiplication and add another scaled vector (i.e. axpby), this = alpha * this + beta * x.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

T alpha = 2.0;
T beta = -1.0;
y.ScaleAddScale(alpha, x, beta);

virtual void ScaleAddScale(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, int64_t src_offset, int64_t dst_offset, int64_t size)#

Perform scalar-vector multiplication and add another scaled vector (i.e. axpby), this = alpha * this + beta * x, on a sub-range of size elements that starts at src_offset in x and at dst_offset in this vector.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

T alpha = 2.0;
T beta = -1.0;
// Update y[50..99] using x[0..49]
y.ScaleAddScale(alpha, x, beta, 0, 50, 50);
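The two ScaleAddScale forms can be sketched with plain `std::vector` storage. This is only an illustration of the update formula: the function names are hypothetical, and the way the offsets are applied in the ranged form (src_offset into x, dst_offset into the updated vector) is an assumption based on the parameter names.

```cpp
#include <cstddef>
#include <vector>

// y = alpha * y + beta * x over the whole vector.
void scale_add_scale(std::vector<double>& y, double alpha, const std::vector<double>& x, double beta)
{
    for(std::size_t i = 0; i < y.size(); ++i)
    {
        y[i] = alpha * y[i] + beta * x[i];
    }
}

// Same update restricted to `size` elements (assumed offset semantics).
void scale_add_scale_range(std::vector<double>&       y,
                           double                     alpha,
                           const std::vector<double>& x,
                           double                     beta,
                           std::size_t                src_offset,
                           std::size_t                dst_offset,
                           std::size_t                size)
{
    for(std::size_t i = 0; i < size; ++i)
    {
        y[dst_offset + i] = alpha * y[dst_offset + i] + beta * x[src_offset + i];
    }
}
```

With x and y filled with ones, alpha = 2 and beta = -1, every updated element of y becomes 2 * 1 - 1 * 1 = 1.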

virtual void ScaleAdd2(ValueType alpha, const LocalVector<ValueType> &x, ValueType beta, const LocalVector<ValueType> &y, ValueType gamma)#

Perform vector update of type this = alpha*this + x*beta + y*gamma.

Unnamed Group

virtual ValueType InclusiveSum(void)#

Compute inclusive sum of vector.

Example
// rocALUTION structures
LocalVector<T> y;

// Allocate vectors
y.Allocate("y", 100);

y.Ones();

T sum = y.InclusiveSum();

For example, given the starting vector this = [1, 1, 1, 1], after performing the inclusive sum the vector will be this = [1, 2, 3, 4]. The function returns 4.
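The inclusive sum is a running total in which every element is replaced by the sum of itself and all elements before it. A minimal host-side sketch, using a hypothetical `inclusive_sum` function on a `std::vector` rather than the library's implementation:

```cpp
#include <vector>

// In-place inclusive prefix sum; returns the total, which is also the last element.
double inclusive_sum(std::vector<double>& v)
{
    double running = 0.0;
    for(double& e : v)
    {
        running += e; // include the current element
        e = running;
    }
    return running;
}
```

For the input [1, 1, 1, 1] this produces [1, 2, 3, 4] and returns 4, matching the example above.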

virtual ValueType InclusiveSum(const LocalVector<ValueType> &vec)#

Compute the inclusive sum of vec and store the result in this vector.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();

T sum = y.InclusiveSum(x);

For example, given the input vector vec = [1, 1, 1, 1], after performing the inclusive sum this vector will be this = [1, 2, 3, 4]. The function returns 4.

Unnamed Group

virtual ValueType ExclusiveSum(void)#

Compute exclusive sum of vector.

Example
// rocALUTION structures
LocalVector<T> y;

// Allocate vectors
y.Allocate("y", 100);

y.Ones();

T sum = y.ExclusiveSum();

For example, given the starting vector this = [1, 1, 1, 1], after performing the exclusive sum the vector will be this = [0, 1, 2, 3]. The function returns 3.
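The exclusive sum differs from the inclusive sum in that each element is replaced by the sum of the elements strictly before it, so the first element becomes zero. The sketch below uses a hypothetical `exclusive_sum` function on a `std::vector`; following the example above, it returns the last element of the scanned vector rather than the total of all inputs.

```cpp
#include <vector>

// In-place exclusive prefix sum; returns the last element of the result.
double exclusive_sum(std::vector<double>& v)
{
    double running = 0.0; // sum of all elements seen so far
    double last    = 0.0; // value written to the most recent position
    for(double& e : v)
    {
        const double current = e;
        e    = running; // exclude the current element
        last = running;
        running += current;
    }
    return last;
}
```

For the input [1, 1, 1, 1] this produces [0, 1, 2, 3] and returns 3.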

virtual ValueType ExclusiveSum(const LocalVector<ValueType> &vec)#

Compute the exclusive sum of vec and store the result in this vector.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();

T sum = y.ExclusiveSum(x);

For example, given the input vector vec = [1, 1, 1, 1], after performing the exclusive sum this vector will be this = [0, 1, 2, 3]. The function returns 3.

Unnamed Group

virtual void PointWiseMult(const LocalVector<ValueType> &x)#

Perform pointwise multiplication of vector.

Perform pointwise multiplication of vector components with the vector components of x, this = this * x. Alternatively, one can also perform pointwise multiplication of vector components of x with vector components of y and set that to the current ‘this’ vector, this = x * y.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

y.PointWiseMult(x);

virtual void PointWiseMult(const LocalVector<ValueType> &x, const LocalVector<ValueType> &y)#

Perform pointwise multiplication of vector.

Perform pointwise multiplication of vector components with the vector components of x, this = this * x. Alternatively, one can also perform pointwise multiplication of vector components of x with vector components of y and set that to the current ‘this’ vector, this = x * y.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

y.PointWiseMult(x);

Public Functions

virtual void MoveToAccelerator(void)#

Move all data (i.e. move the vector) to the accelerator.

virtual void MoveToAcceleratorAsync(void)#

Move all data (i.e. move the vector) to the accelerator asynchronously.

virtual void MoveToHost(void)#

Move all data (i.e. move the vector) to the host.

virtual void MoveToHostAsync(void)#

Move all data (i.e. move the vector) to the host asynchronously.

virtual void Sync(void)#

Synchronize the vector.

virtual void Info(void) const#

Shows simple info about the vector.

virtual int64_t GetSize(void) const#

Return the size of the vector.

virtual bool Check(void) const#

Perform a sanity check of the vector.

Checks whether the vector contains valid data, i.e. whether the values are neither infinity nor NaN (not a number).

Return values:
  • true – if the vector is ok (empty vector is also ok).

  • false – if there is something wrong with the values.

void Allocate(std::string name, int64_t size)#

Allocate a local vector with name and size.

The local vector allocation function requires a name of the object (this is only for information purposes) and corresponding size description for vector objects.

Example
LocalVector<ValueType> vec;

vec.Allocate("my vector", 100);
vec.Clear();

Parameters:
  • name[in] object name

  • size[in] number of elements in the vector

void SetDataPtr(ValueType **ptr, std::string name, int64_t size)#

Initialize a LocalVector on the host with externally allocated data.

SetDataPtr has direct access to the raw data via pointers. Already allocated data can be set by passing the pointer.

Example
// Allocate vector
ValueType* ptr_vec = new ValueType[200];

// Fill vector
// ...

// rocALUTION local vector object
LocalVector<ValueType> vec;

// Set the vector data, ptr_vec will become invalid
vec.SetDataPtr(&ptr_vec, "my_vector", 200);

Note

Setting data pointer will leave the original pointer empty (set to NULL).

void LeaveDataPtr(ValueType **ptr)#

Leave a LocalVector to host pointers.

LeaveDataPtr has direct access to the raw data via pointers. A LocalVector object can leave its raw data to a host pointer. This will leave the LocalVector empty.

Example
// rocALUTION local vector object
LocalVector<ValueType> vec;

// Allocate the vector
vec.Allocate("my_vector", 100);

// Fill vector
// ...

ValueType* ptr_vec = NULL;

// Get (steal) the data from the vector, this will leave the local vector object empty
vec.LeaveDataPtr(&ptr_vec);
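The pointer hand-over performed by SetDataPtr and LeaveDataPtr can be pictured with a small stand-in class. `OwningBuffer` is hypothetical and only mirrors the ownership rules described above (the caller's pointer is set to NULL on set, and the object is left empty on leave); it is not how the library stores its data.

```cpp
#include <cstddef>

// Hypothetical stand-in for an object that adopts and releases raw buffers.
template <typename T>
class OwningBuffer
{
public:
    OwningBuffer()                               = default;
    OwningBuffer(const OwningBuffer&)            = delete;
    OwningBuffer& operator=(const OwningBuffer&) = delete;

    ~OwningBuffer()
    {
        delete[] data_;
    }

    // Adopt *ptr without copying; the caller's pointer is reset to nullptr.
    void set_data_ptr(T** ptr, std::size_t size)
    {
        delete[] data_;
        data_ = *ptr;
        size_ = size;
        *ptr  = nullptr;
    }

    // Hand the buffer back to the caller; this object becomes empty.
    void leave_data_ptr(T** ptr)
    {
        *ptr  = data_;
        data_ = nullptr;
        size_ = 0;
    }

    std::size_t size() const
    {
        return size_;
    }

private:
    T*          data_ = nullptr;
    std::size_t size_ = 0;
};
```

Because no data is copied in either direction, exactly one party owns the buffer at any time: after leaving the data pointer, the caller becomes responsible for freeing it.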

virtual void Clear()#

Clear (free) the vector.

virtual void Zeros()#

Set the values of the vector to zero.

virtual void Ones()#

Set the values of the vector to one.

virtual void SetValues(ValueType val)#

Set the values of the vector to given argument.

virtual void SetRandomUniform(unsigned long long seed, ValueType a = static_cast<ValueType>(-1), ValueType b = static_cast<ValueType>(1))#

Set the values of the vector to uniformly distributed random values between a and b (by default between -1 and 1).

virtual void SetRandomNormal(unsigned long long seed, ValueType mean = static_cast<ValueType>(0), ValueType var = static_cast<ValueType>(1))#

Set the values of the vector to normally distributed random values with the given mean and variance (by default mean 0 and variance 1).
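The role of the seed and of the distribution parameters can be sketched with the C++ standard library. This is an illustration under assumptions: the generator used by the library is not specified here, `random_uniform` and `random_normal` are hypothetical names, and the third parameter of the normal variant is taken to be a variance, as its name `var` suggests.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Uniformly distributed values in [a, b), reproducible for a given seed.
std::vector<double> random_uniform(std::size_t n, unsigned long long seed, double a = -1.0, double b = 1.0)
{
    std::mt19937_64                        gen(seed);
    std::uniform_real_distribution<double> dist(a, b);

    std::vector<double> v(n);
    for(double& e : v)
    {
        e = dist(gen);
    }
    return v;
}

// Normally distributed values with the given mean and variance.
std::vector<double> random_normal(std::size_t n, unsigned long long seed, double mean = 0.0, double var = 1.0)
{
    std::mt19937_64 gen(seed);
    // std::normal_distribution expects the standard deviation, not the variance.
    std::normal_distribution<double> dist(mean, std::sqrt(var));

    std::vector<double> v(n);
    for(double& e : v)
    {
        e = dist(gen);
    }
    return v;
}
```

Note that normally distributed values are unbounded, so the defaults 0 and 1 describe the mean and the spread, not an interval.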

virtual void ReadFileASCII(const std::string &filename)#

Read LocalVector from ASCII file.

virtual void WriteFileASCII(const std::string &filename) const#

Write LocalVector to ASCII file.

virtual void ReadFileBinary(const std::string &filename)#

Read LocalVector from binary file.

virtual void WriteFileBinary(const std::string &filename) const#

Write LocalVector to binary file.

virtual void CopyFrom(const LocalVector<ValueType> &src)#

Copy the vector values from another LocalVector.

virtual void CopyFromAsync(const LocalVector<ValueType> &src)#

Async copy from another local vector.

virtual void CopyFromFloat(const LocalVector<float> &src)#

Copy values from another local float vector.

virtual void CopyFromDouble(const LocalVector<double> &src)#

Copy values from another local double vector.

virtual void CopyFrom(const LocalVector<ValueType> &src, int64_t src_offset, int64_t dst_offset, int64_t size)#

Copy size values from another vector, starting at src_offset in src and at dst_offset in this vector.

void CopyFromPermute(const LocalVector<ValueType> &src, const LocalVector<int> &permutation)#

Copy a vector under permutation (forward permutation)

void CopyFromPermuteBackward(const LocalVector<ValueType> &src, const LocalVector<int> &permutation)#

Copy a vector under permutation (backward permutation)
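The difference between the forward and backward variants lies in which side of the copy is indexed through the permutation. The sketch below assumes the convention dst[perm[i]] = src[i] for the forward case and dst[i] = src[perm[i]] for the backward case; this convention is an assumption for illustration and should be checked against the library before relying on it. Both function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Forward permutation (assumed convention): dst[perm[i]] = src[i].
std::vector<double> copy_permute(const std::vector<double>& src, const std::vector<int>& perm)
{
    std::vector<double> dst(src.size());
    for(std::size_t i = 0; i < src.size(); ++i)
    {
        dst[perm[i]] = src[i];
    }
    return dst;
}

// Backward permutation (assumed convention): dst[i] = src[perm[i]].
std::vector<double> copy_permute_backward(const std::vector<double>& src, const std::vector<int>& perm)
{
    std::vector<double> dst(src.size());
    for(std::size_t i = 0; i < src.size(); ++i)
    {
        dst[i] = src[perm[i]];
    }
    return dst;
}
```

Under this convention the two operations are inverses of each other: applying the backward copy to the result of the forward copy, with the same permutation, restores the original ordering.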

virtual void CloneFrom(const LocalVector<ValueType> &src)#

Clone the entire vector (values, structure and backend descriptor) from another LocalVector.

void CopyFromData(const ValueType *data)#

Copy (import) vector.

Copy (import) vector data that is described in one array (values). The object data has to be allocated with Allocate(), using the corresponding size of the data, first.

Parameters:

data[in] data to be imported.

void CopyFromHostData(const ValueType *data)#

Copy (import) vector from host data.

Copy (import) vector data that is described in one host array (values). The object data has to be allocated with Allocate(), using the corresponding size of the data, first.

Parameters:

data[in] data to be imported from host.

void CopyToData(ValueType *data) const#

Copy (export) vector.

Copy (export) vector data that is described in one array (values). The output array has to be allocated first, using the corresponding size of the data. The size can be obtained with GetSize().

Parameters:

data[out] exported data.

void CopyToHostData(ValueType *data) const#

Copy (export) vector to host data.

Copy (export) vector data that is described in one array (values). The output array has to be allocated on the host first, using the corresponding size of the data. The size can be obtained with GetSize().

Parameters:

data[out] exported data on host.

void Permute(const LocalVector<int> &permutation)#

Perform in-place permutation (forward) of the vector.

void PermuteBackward(const LocalVector<int> &permutation)#

Perform in-place permutation (backward) of the vector.

void Restriction(const LocalVector<ValueType> &vec_fine, const LocalVector<int> &map)#

Restriction operator based on restriction mapping vector.

void Prolongation(const LocalVector<ValueType> &vec_coarse, const LocalVector<int> &map)#

Prolongation operator based on restriction mapping vector.

virtual void AddScale(const LocalVector<ValueType> &x, ValueType alpha)#

Perform scalar-vector multiplication and add it to another vector, this = this + alpha * x.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

T alpha = 2.0;
y.AddScale(x, alpha);

virtual void ScaleAdd(ValueType alpha, const LocalVector<ValueType> &x)#

Perform scalar-vector multiplication and add another vector, this = alpha * this + x.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

T alpha = 2.0;
y.ScaleAdd(alpha, x);

virtual void Scale(ValueType alpha)#

Scale vector, this = alpha * this.

Example
// rocALUTION structures
LocalVector<T> y;

// Allocate vectors
y.Allocate("y", 100);

y.Ones();

T alpha = 2.0;
y.Scale(alpha);

virtual ValueType Dot(const LocalVector<ValueType> &x) const#

Perform dot product.

Perform dot product of ‘this’ vector and the vector x. In the case of complex types, this performs conjugate dot product.

Example
// rocALUTION structures
LocalVector<T> x;
LocalVector<T> y;

// Allocate vectors
x.Allocate("x", 100);
y.Allocate("y", 100);

x.Ones();
y.Ones();

y.Dot(x);
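For complex value types, the conjugate and non-conjugate dot products give different results. The sketch below shows both on `std::vector<std::complex<double>>`. Which operand is conjugated is an assumption here (the first one, corresponding to 'this'), and both function names are hypothetical.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Conjugate dot product: sum over conj(a[i]) * b[i] (assumed convention).
cplx dot_conj(const std::vector<cplx>& a, const std::vector<cplx>& b)
{
    cplx sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        sum += std::conj(a[i]) * b[i];
    }
    return sum;
}

// Non-conjugate dot product: sum over a[i] * b[i].
cplx dot_non_conj(const std::vector<cplx>& a, const std::vector<cplx>& b)
{
    cplx sum(0.0, 0.0);
    for(std::size_t i = 0; i < a.size(); ++i)
    {
        sum += a[i] * b[i];
    }
    return sum;
}
```

The conjugate form applied to a vector and itself yields the squared 2-norm, a non-negative real number, which is why it is the usual choice in iterative solvers. For real value types both forms coincide.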

virtual ValueType DotNonConj(const