heat.sparse

Adds sparse matrix functions and classes to the ht.sparse namespace.

Submodules

Package Contents

add(t1: heat.sparse.dcsr_matrix.DCSR_matrix, t2: heat.sparse.dcsr_matrix.DCSR_matrix) heat.sparse.dcsr_matrix.DCSR_matrix

Element-wise addition of values from two operands, commutative. Takes the first and second operands (scalar or DCSR_matrix) whose elements are to be added and returns a DCSR_matrix containing the results of the element-wise addition of t1 and t2.

Parameters:
  • t1 (DCSR_matrix) – The first operand involved in the addition

  • t2 (DCSR_matrix) – The second operand involved in the addition

Examples

>>> heat_sparse_csr
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 2., 3.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.]], dtype=ht.float32, device=cpu:0, split=0)
>>> sum_sparse = heat_sparse_csr + heat_sparse_csr
    (or)
>>> sum_sparse = ht.sparse.sparse_add(heat_sparse_csr, heat_sparse_csr)
>>> sum_sparse
(indptr: tensor([0, 2, 3], dtype=torch.int32), indices: tensor([0, 2, 2], dtype=torch.int32), data: tensor([2., 4., 6.]), dtype=ht.float32, device=cpu:0, split=0)
>>> sum_sparse.todense()
DNDarray([[2., 0., 4.],
          [0., 0., 6.]], dtype=ht.float32, device=cpu:0, split=0)
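Conceptually, adding two CSR matrices means merging the stored entries of each row by column index and summing values that land in the same position. The following is a minimal pure-Python sketch of that idea (not Heat's actual distributed implementation, which operates on PyTorch sparse tensors per process):

```python
def csr_add(indptr1, indices1, data1, indptr2, indices2, data2):
    """Element-wise sum of two CSR matrices of identical shape.

    Plain-Python sketch for illustration only; Heat's distributed
    implementation differs.
    """
    indptr, indices, data = [0], [], []
    for row in range(len(indptr1) - 1):
        # Accumulate column -> value for this row from both operands.
        acc = {}
        for k in range(indptr1[row], indptr1[row + 1]):
            acc[indices1[k]] = acc.get(indices1[k], 0) + data1[k]
        for k in range(indptr2[row], indptr2[row + 1]):
            acc[indices2[k]] = acc.get(indices2[k], 0) + data2[k]
        for col in sorted(acc):
            indices.append(col)
            data.append(acc[col])
        indptr.append(len(indices))  # row i ends at indptr[i + 1]
    return indptr, indices, data
```

Applied to the matrix from the example above added to itself, this yields indptr [0, 2, 3], indices [0, 2, 2], and data [2., 4., 6.], matching the output shown.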
mul(t1: heat.sparse.dcsr_matrix.DCSR_matrix, t2: heat.sparse.dcsr_matrix.DCSR_matrix) heat.sparse.dcsr_matrix.DCSR_matrix

Element-wise multiplication (NOT matrix multiplication) of values from two operands, commutative. Takes the first and second operands (scalar or DCSR_matrix) whose elements are to be multiplied and returns a DCSR_matrix containing the results of the element-wise multiplication of t1 and t2.

Parameters:
  • t1 (DCSR_matrix) – The first operand involved in the multiplication

  • t2 (DCSR_matrix) – The second operand involved in the multiplication

Examples

>>> heat_sparse_csr
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 2., 3.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.]], dtype=ht.float32, device=cpu:0, split=0)
>>> pdt_sparse = heat_sparse_csr * heat_sparse_csr
    (or)
>>> pdt_sparse = ht.sparse.sparse_mul(heat_sparse_csr, heat_sparse_csr)
>>> pdt_sparse
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 4., 9.]), dtype=ht.float32, device=cpu:0, split=0)
>>> pdt_sparse.todense()
DNDarray([[1., 0., 4.],
          [0., 0., 9.]], dtype=ht.float32, device=cpu:0, split=0)
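Unlike addition, the element-wise product has a non-zero entry only where both operands store a value at the same position, so each output row is the intersection of the operands' column indices. A minimal pure-Python sketch of this (again, not Heat's actual distributed implementation):

```python
def csr_mul(indptr1, indices1, data1, indptr2, indices2, data2):
    """Element-wise product of two CSR matrices of identical shape.

    A result entry is non-zero only where both operands store a value
    in the same row and column. Illustration only.
    """
    indptr, indices, data = [0], [], []
    for row in range(len(indptr1) - 1):
        # Column -> value map for this row of the first operand.
        a = {indices1[k]: data1[k] for k in range(indptr1[row], indptr1[row + 1])}
        for k in range(indptr2[row], indptr2[row + 1]):
            col = indices2[k]
            if col in a:  # intersection of stored columns
                indices.append(col)
                data.append(a[col] * data2[k])
        indptr.append(len(indices))
    return indptr, indices, data
```

Applied to the matrix from the example above multiplied by itself, this yields data [1., 4., 9.] with unchanged indptr and indices, matching the output shown.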
class DCSR_matrix(array: torch.Tensor, gnnz: int, gshape: Tuple[int, Ellipsis], dtype: heat.core.types.datatype, split: int | None, device: heat.core.devices.Device, comm: Communication, balanced: bool)

Distributed Compressed Sparse Row Matrix. It is composed of PyTorch sparse_csr_tensors local to each process.

Parameters:
  • array (torch.Tensor (layout ==> torch.sparse_csr)) – Local sparse array

  • gnnz (int) – Total number of non-zero elements across all processes

  • gshape (Tuple[int,...]) – The global shape of the array

  • dtype (datatype) – The datatype of the array

  • split (int or None) – If split is not None, it denotes the axis on which the array is divided between processes. DCSR_matrix only supports distribution along axis 0.

  • device (Device) – The device on which the local arrays are allocated (cpu or gpu)

  • comm (Communication) – The communications object for sending and receiving data

  • balanced (bool or None) – Describes whether the data are evenly distributed across processes.

global_indptr() heat.core.dndarray.DNDarray

Global indptr of the DCSR_matrix as a DNDarray
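Since each process stores an indptr that starts at 0 for its local rows, a global indptr can be assembled by offsetting every local pointer by the number of non-zeros held on earlier processes. A hedged pure-Python sketch of that idea (Heat's actual implementation works on DNDarrays and communicates the offsets):

```python
def assemble_global_indptr(local_indptrs):
    """Combine per-process local indptr lists into one global indptr.

    Illustration only: offsets each chunk by the non-zero count of all
    preceding processes and drops each chunk's duplicate leading 0.
    """
    global_indptr = [0]
    offset = 0
    for local in local_indptrs:
        global_indptr.extend(p + offset for p in local[1:])
        offset += local[-1]  # last local pointer == local nnz count
    return global_indptr
```

For the two-process example used later in this page (local indptrs [0, 2, 3] and [0, 3]), this produces the global indptr [0, 2, 3, 6].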

is_distributed() bool

Determines whether the data of this DCSR_matrix is distributed across multiple processes.

counts_displs_nnz() Tuple[Tuple[int], Tuple[int]]

Returns actual counts (number of non-zero items per process) and displacements (offsets) of the DCSR_matrix. Does not assume load balance.
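The counts/displacements pattern is the usual prefix-sum layout used for variable-sized MPI exchanges: each displacement is the sum of the counts of all lower-ranked processes. A small pure-Python sketch, assuming the per-process non-zero counts are already known:

```python
def counts_displs_from_nnz(nnz_per_process):
    """Derive (counts, displs) from per-process non-zero counts.

    counts: number of non-zero items owned by each process.
    displs: prefix sums giving each process's starting offset into the
    globally concatenated data/indices arrays. Illustration only.
    """
    counts = tuple(nnz_per_process)
    displs, total = [], 0
    for c in counts:
        displs.append(total)
        total += c
    return counts, tuple(displs)
```

For example, two processes holding 3 non-zeros each give counts (3, 3) and displacements (0, 3).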

astype(dtype, copy=True) DCSR_matrix

Returns a version of this matrix cast to the given datatype. The result is a new matrix of the same shape. If copy is False, the cast is performed in-place and this matrix is returned instead.

Parameters:
  • dtype (datatype) – HeAT type to which the matrix is cast

  • copy (bool, optional) – By default the operation returns a copy of this matrix. If copy is set to False the cast is performed in-place and this matrix is returned

__repr__() str

Computes a printable representation of the passed DCSR_matrix.

sparse_csr_matrix(obj: Iterable, dtype: Type[heat.core.types.datatype] | None = None, split: int | None = None, is_split: int | None = None, device: heat.core.devices.Device | None = None, comm: heat.core.communication.Communication | None = None) heat.sparse.dcsr_matrix.DCSR_matrix

Create a DCSR_matrix.

Parameters:
  • obj (array_like) – A tensor or array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence. Sparse tensor that needs to be distributed.

  • dtype (datatype, optional) – The desired data-type for the sparse matrix. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the astype() method.

  • split (int or None, optional) – The axis along which the passed array content obj is split and distributed in memory. DCSR_matrix only supports distribution along axis 0. Mutually exclusive with is_split.

  • is_split (int or None, optional) – Specifies the axis along which the local data portions, passed in obj, are split across all machines. DCSR_matrix only supports distribution along axis 0. Useful for interfacing with other distributed-memory code. The shape of the global array is automatically inferred. Mutually exclusive with split.

  • device (str or Device, optional) – Specifies the Device the array shall be allocated on (i.e. globally set default device).

  • comm (Communication, optional) – Handle to the nodes holding distributed array chunks.

Raises:

ValueError – If the split or is_split parameter is not one of 0 or None.

Examples

Create a DCSR_matrix from torch.Tensor (layout ==> torch.sparse_csr)

>>> indptr = torch.tensor([0, 2, 3, 6])
>>> indices = torch.tensor([0, 2, 2, 0, 1, 2])
>>> data = torch.tensor([1, 2, 3, 4, 5, 6], dtype=torch.float)
>>> torch_sparse_csr = torch.sparse_csr_tensor(indptr, indices, data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(torch_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix from scipy.sparse.csr_matrix

>>> scipy_sparse_csr = scipy.sparse.csr_matrix((data, indices, indptr))
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(scipy_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6], dtype=torch.int32), indices: tensor([0, 2, 2, 0, 1, 2], dtype=torch.int32), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix using data that is already distributed (with is_split)

>>> indptrs = [torch.tensor([0, 2, 3]), torch.tensor([0, 3])]
>>> indices = [torch.tensor([0, 2, 2]), torch.tensor([0, 1, 2])]
>>> data = [torch.tensor([1, 2, 3], dtype=torch.float),
...         torch.tensor([4, 5, 6], dtype=torch.float)]

>>> rank = ht.MPI_WORLD.rank
>>> local_indptr = indptrs[rank]
>>> local_indices = indices[rank]
>>> local_data = data[rank]
>>> local_torch_sparse_csr = torch.sparse_csr_tensor(local_indptr, local_indices, local_data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(local_torch_sparse_csr, is_split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix from List

>>> ht.sparse.sparse_csr_matrix([[0, 0, 1], [1, 0, 2], [0, 0, 3]])
(indptr: tensor([0, 1, 3, 4]), indices: tensor([2, 0, 2, 2]), data: tensor([1, 1, 2, 3]), dtype=ht.int64, device=cpu:0, split=None)

to_dense(sparse_matrix: heat.sparse.dcsr_matrix.DCSR_matrix, order='C', out: heat.core.dndarray.DNDarray = None) heat.core.dndarray.DNDarray

Convert a DCSR_matrix to a dense DNDarray. The output follows the same distribution among processes as the input.

Parameters:
  • sparse_matrix (DCSR_matrix) – The sparse csr matrix which is to be converted to a dense array

  • order (str, optional) – Options: 'C' or 'F'. Specifies the memory layout of the newly created DNDarray. Default is order='C', meaning the array will be stored in row-major order (C-like). If order='F', the array will be stored in column-major order (Fortran-like).

  • out (DNDarray) – Output buffer in which the values of the dense format are stored. If not specified, a new DNDarray is created.

Raises:
  • ValueError – If shape of output buffer does not match that of the input.

  • ValueError – If split axis of output buffer does not match that of the input.

Examples

>>> indptr = torch.tensor([0, 2, 3, 6])
>>> indices = torch.tensor([0, 2, 2, 0, 1, 2])
>>> data = torch.tensor([1, 2, 3, 4, 5, 6], dtype=torch.float)
>>> torch_sparse_csr = torch.sparse_csr_tensor(indptr, indices, data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(torch_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.],
          [4., 5., 6.]], dtype=ht.float32, device=cpu:0, split=0)
to_sparse(array: heat.core.dndarray.DNDarray) heat.sparse.dcsr_matrix.DCSR_matrix

Convert the distributed array to a sparse DCSR_matrix representation.

Parameters:

array (DNDarray) – The distributed array to be converted to a sparse DCSR_matrix.

Returns:

A sparse DCSR_matrix representation of the input DNDarray.

Return type:

DCSR_matrix

Notes

This method allows for the conversion of a DNDarray into a sparse DCSR_matrix representation, which is useful for handling large and sparse datasets efficiently.

Examples

>>> dense_array = ht.array([[1, 0, 0], [0, 0, 2], [0, 3, 0]])
>>> sparse_matrix = dense_array.to_sparse()
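The dense-to-CSR conversion underlying this method can be sketched in a few lines of plain Python: scan each row, record the column index and value of every non-zero entry, and let indptr[i+1] mark where row i ends. This is an illustration of the CSR layout, not Heat's distributed implementation:

```python
def dense_to_csr(dense):
    """Convert a dense 2-D list into CSR (indptr, indices, data).

    Pure-Python sketch of the CSR layout for illustration only.
    """
    indptr, indices, data = [0], [], []
    for row in dense:
        for col, val in enumerate(row):
            if val != 0:
                indices.append(col)  # column of the non-zero entry
                data.append(val)
        indptr.append(len(indices))  # row boundary
    return indptr, indices, data
```

For the dense array in the example above, [[1, 0, 0], [0, 0, 2], [0, 3, 0]], this yields indptr [0, 1, 2, 3], indices [0, 2, 1], and data [1, 2, 3].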