heat.sparse

Adds sparse matrix functions and classes to the ht.sparse namespace.

Submodules

Package Contents

add(t1: heat.sparse.dcsr_matrix.DCSR_matrix, t2: heat.sparse.dcsr_matrix.DCSR_matrix) heat.sparse.dcsr_matrix.DCSR_matrix

Element-wise addition of values from two operands, commutative. Takes the first and second operands (scalar or DCSR_matrix) whose elements are to be added and returns a DCSR_matrix containing the results of the element-wise addition of t1 and t2.

Parameters:
  • t1 (DCSR_matrix) – The first operand involved in the addition

  • t2 (DCSR_matrix) – The second operand involved in the addition

Examples

>>> heat_sparse_csr
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 2., 3.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.]], dtype=ht.float32, device=cpu:0, split=0)
>>> sum_sparse = heat_sparse_csr + heat_sparse_csr
    (or)
>>> sum_sparse = ht.sparse.sparse_add(heat_sparse_csr, heat_sparse_csr)
>>> sum_sparse
(indptr: tensor([0, 2, 3], dtype=torch.int32), indices: tensor([0, 2, 2], dtype=torch.int32), data: tensor([2., 4., 6.]), dtype=ht.float32, device=cpu:0, split=0)
>>> sum_sparse.todense()
DNDarray([[2., 0., 4.],
          [0., 0., 6.]], dtype=ht.float32, device=cpu:0, split=0)
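Conceptually, adding two CSR matrices means merging the stored entries of each row by column index and summing values that land in the same position. The following is a minimal pure-Python sketch of that idea (not Heat's actual distributed implementation, which operates on PyTorch sparse tensors per process):

```python
def csr_add(indptr1, indices1, data1, indptr2, indices2, data2):
    """Element-wise sum of two CSR matrices of identical shape.

    Plain-Python sketch for illustration only; Heat's distributed
    implementation differs.
    """
    indptr, indices, data = [0], [], []
    for row in range(len(indptr1) - 1):
        # Accumulate column -> value for this row from both operands.
        acc = {}
        for k in range(indptr1[row], indptr1[row + 1]):
            acc[indices1[k]] = acc.get(indices1[k], 0) + data1[k]
        for k in range(indptr2[row], indptr2[row + 1]):
            acc[indices2[k]] = acc.get(indices2[k], 0) + data2[k]
        for col in sorted(acc):
            indices.append(col)
            data.append(acc[col])
        indptr.append(len(indices))  # row i ends at indptr[i + 1]
    return indptr, indices, data
```

Applied to the matrix from the example above added to itself, this yields indptr [0, 2, 3], indices [0, 2, 2], and data [2., 4., 6.], matching the output shown.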
mul(t1: heat.sparse.dcsr_matrix.DCSR_matrix, t2: heat.sparse.dcsr_matrix.DCSR_matrix) heat.sparse.dcsr_matrix.DCSR_matrix

Element-wise multiplication (NOT matrix multiplication) of values from two operands, commutative. Takes the first and second operands (scalar or DCSR_matrix) whose elements are to be multiplied and returns a DCSR_matrix containing the results of the element-wise multiplication of t1 and t2.

Parameters:
  • t1 (DCSR_matrix) – The first operand involved in the multiplication

  • t2 (DCSR_matrix) – The second operand involved in the multiplication

Examples

>>> heat_sparse_csr
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 2., 3.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.]], dtype=ht.float32, device=cpu:0, split=0)
>>> pdt_sparse = heat_sparse_csr * heat_sparse_csr
    (or)
>>> pdt_sparse = ht.sparse.sparse_mul(heat_sparse_csr, heat_sparse_csr)
>>> pdt_sparse
(indptr: tensor([0, 2, 3]), indices: tensor([0, 2, 2]), data: tensor([1., 4., 9.]), dtype=ht.float32, device=cpu:0, split=0)
>>> pdt_sparse.todense()
DNDarray([[1., 0., 4.],
          [0., 0., 9.]], dtype=ht.float32, device=cpu:0, split=0)
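Unlike addition, the element-wise product has a non-zero entry only where both operands store a value at the same position, so each output row is the intersection of the operands' column indices. A minimal pure-Python sketch of this (again, not Heat's actual distributed implementation):

```python
def csr_mul(indptr1, indices1, data1, indptr2, indices2, data2):
    """Element-wise product of two CSR matrices of identical shape.

    A result entry is non-zero only where both operands store a value
    in the same row and column. Illustration only.
    """
    indptr, indices, data = [0], [], []
    for row in range(len(indptr1) - 1):
        # Column -> value map for this row of the first operand.
        a = {indices1[k]: data1[k] for k in range(indptr1[row], indptr1[row + 1])}
        for k in range(indptr2[row], indptr2[row + 1]):
            col = indices2[k]
            if col in a:  # intersection of stored columns
                indices.append(col)
                data.append(a[col] * data2[k])
        indptr.append(len(indices))
    return indptr, indices, data
```

Applied to the matrix from the example above multiplied by itself, this yields data [1., 4., 9.] with unchanged indptr and indices, matching the output shown.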
class DCSR_matrix(array: torch.Tensor, gnnz: int, gshape: Tuple[int, Ellipsis], dtype: heat.core.types.datatype, split: int | None, device: heat.core.devices.Device, comm: Communication, balanced: bool)

Distributed Compressed Sparse Row Matrix. It is composed of PyTorch sparse_csr_tensors local to each process.

Parameters:
  • array (torch.Tensor (layout ==> torch.sparse_csr)) – Local sparse array

  • gnnz (int) – Total number of non-zero elements across all processes

  • gshape (Tuple[int,...]) – The global shape of the array

  • dtype (datatype) – The datatype of the array

  • split (int or None) – If split is not None, it denotes the axis on which the array is divided between processes. DCSR_matrix only supports distribution along axis 0.

  • device (Device) – The device on which the local arrays are allocated (cpu or gpu)

  • comm (Communication) – The communications object for sending and receiving data

  • balanced (bool or None) – Describes whether the data are evenly distributed across processes.

global_indptr() heat.core.dndarray.DNDarray

Global indptr of the DCSR_matrix as a DNDarray
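Since each process stores an indptr that starts at 0 for its local rows, a global indptr can be assembled by offsetting every local pointer by the number of non-zeros held on earlier processes. A hedged pure-Python sketch of that idea (Heat's actual implementation works on DNDarrays and communicates the offsets):

```python
def assemble_global_indptr(local_indptrs):
    """Combine per-process local indptr lists into one global indptr.

    Illustration only: offsets each chunk by the non-zero count of all
    preceding processes and drops each chunk's duplicate leading 0.
    """
    global_indptr = [0]
    offset = 0
    for local in local_indptrs:
        global_indptr.extend(p + offset for p in local[1:])
        offset += local[-1]  # last local pointer == local nnz count
    return global_indptr
```

For the two-process example used later in this page (local indptrs [0, 2, 3] and [0, 3]), this produces the global indptr [0, 2, 3, 6].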

is_distributed() bool

Determines whether the data of this DCSR_matrix is distributed across multiple processes.

counts_displs_nnz() Tuple[Tuple[int], Tuple[int]]

Returns actual counts (number of non-zero items per process) and displacements (offsets) of the DCSR_matrix. Does not assume load balance.
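The counts/displacements pattern is the usual prefix-sum layout used for variable-sized MPI exchanges: each displacement is the sum of the counts of all lower-ranked processes. A small pure-Python sketch, assuming the per-process non-zero counts are already known:

```python
def counts_displs_from_nnz(nnz_per_process):
    """Derive (counts, displs) from per-process non-zero counts.

    counts: number of non-zero items owned by each process.
    displs: prefix sums giving each process's starting offset into the
    globally concatenated data/indices arrays. Illustration only.
    """
    counts = tuple(nnz_per_process)
    displs, total = [], 0
    for c in counts:
        displs.append(total)
        total += c
    return counts, tuple(displs)
```

For example, two processes holding 3 non-zeros each give counts (3, 3) and displacements (0, 3).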

astype(dtype, copy=True) DCSR_matrix

Returns a version of this matrix cast to the given datatype. The result is a new matrix of the same shape. If copy is False, the cast is performed in-place and this matrix is returned instead.

Parameters:
  • dtype (datatype) – HeAT type to which the matrix is cast

  • copy (bool, optional) – By default the operation returns a copy of this matrix. If copy is set to False the cast is performed in-place and this matrix is returned

__repr__() str

Computes a printable representation of the passed DCSR_matrix.

sparse_csr_matrix(obj: Iterable, dtype: Type[heat.core.types.datatype] | None = None, split: int | None = None, is_split: int | None = None, device: heat.core.devices.Device | None = None, comm: heat.core.communication.Communication | None = None) heat.sparse.dcsr_matrix.DCSR_matrix

Create a DCSR_matrix.

Parameters:
  • obj (array_like) – A tensor or array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence. Sparse tensor that needs to be distributed.

  • dtype (datatype, optional) – The desired data-type for the sparse matrix. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the astype() method.

  • split (int or None, optional) – The axis along which the passed array content obj is split and distributed in memory. DCSR_matrix only supports distribution along axis 0. Mutually exclusive with is_split.

  • is_split (int or None, optional) – Specifies the axis along which the local data portions, passed in obj, are split across all machines. DCSR_matrix only supports distribution along axis 0. Useful for interfacing with other distributed-memory code. The shape of the global array is automatically inferred. Mutually exclusive with split.

  • device (str or Device, optional) – Specifies the Device the array shall be allocated on (i.e. globally set default device).

  • comm (Communication, optional) – Handle to the nodes holding distributed array chunks.

Raises:

ValueError – If the split or is_split parameter is not one of 0 or None.

Examples

Create a DCSR_matrix from torch.Tensor (layout ==> torch.sparse_csr)

>>> indptr = torch.tensor([0, 2, 3, 6])
>>> indices = torch.tensor([0, 2, 2, 0, 1, 2])
>>> data = torch.tensor([1, 2, 3, 4, 5, 6], dtype=torch.float)
>>> torch_sparse_csr = torch.sparse_csr_tensor(indptr, indices, data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(torch_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix from scipy.sparse.csr_matrix

>>> scipy_sparse_csr = scipy.sparse.csr_matrix((data, indices, indptr))
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(scipy_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6], dtype=torch.int32), indices: tensor([0, 2, 2, 0, 1, 2], dtype=torch.int32), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix using data that is already distributed (with is_split)

>>> indptrs = [torch.tensor([0, 2, 3]), torch.tensor([0, 3])]
>>> indices = [torch.tensor([0, 2, 2]), torch.tensor([0, 1, 2])]
>>> data = [torch.tensor([1, 2, 3], dtype=torch.float),
...         torch.tensor([4, 5, 6], dtype=torch.float)]

>>> rank = ht.MPI_WORLD.rank
>>> local_indptr = indptrs[rank]
>>> local_indices = indices[rank]
>>> local_data = data[rank]
>>> local_torch_sparse_csr = torch.sparse_csr_tensor(local_indptr, local_indices, local_data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(local_torch_sparse_csr, is_split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)

Create a DCSR_matrix from List

>>> ht.sparse.sparse_csr_matrix([[0, 0, 1], [1, 0, 2], [0, 0, 3]])
(indptr: tensor([0, 1, 3, 4]), indices: tensor([2, 0, 2, 2]), data: tensor([1, 1, 2, 3]), dtype=ht.int64, device=cpu:0, split=None)

to_dense(sparse_matrix: heat.sparse.dcsr_matrix.DCSR_matrix, order='C', out: heat.core.dndarray.DNDarray = None) heat.core.dndarray.DNDarray

Convert a DCSR_matrix to a dense DNDarray. The output follows the same distribution among processes as the input.

Parameters:
  • sparse_matrix (DCSR_matrix) – The sparse csr matrix which is to be converted to a dense array

  • order (str, optional) – Options: 'C' or 'F'. Specifies the memory layout of the newly created DNDarray. Default is order='C', meaning the array will be stored in row-major order (C-like). If order='F', the array will be stored in column-major order (Fortran-like).

  • out (DNDarray) – Output buffer in which the values of the dense format are stored. If not specified, a new DNDarray is created.

Raises:
  • ValueError – If shape of output buffer does not match that of the input.

  • ValueError – If split axis of output buffer does not match that of the input.

Examples

>>> indptr = torch.tensor([0, 2, 3, 6])
>>> indices = torch.tensor([0, 2, 2, 0, 1, 2])
>>> data = torch.tensor([1, 2, 3, 4, 5, 6], dtype=torch.float)
>>> torch_sparse_csr = torch.sparse_csr_tensor(indptr, indices, data)
>>> heat_sparse_csr = ht.sparse.sparse_csr_matrix(torch_sparse_csr, split=0)
>>> heat_sparse_csr
(indptr: tensor([0, 2, 3, 6]), indices: tensor([0, 2, 2, 0, 1, 2]), data: tensor([1., 2., 3., 4., 5., 6.]), dtype=ht.float32, device=cpu:0, split=0)
>>> heat_sparse_csr.todense()
DNDarray([[1., 0., 2.],
          [0., 0., 3.],
          [4., 5., 6.]], dtype=ht.float32, device=cpu:0, split=0)
to_sparse(array: heat.core.dndarray.DNDarray) heat.sparse.dcsr_matrix.DCSR_matrix

Convert the distributed array to a sparse DCSR_matrix representation.

Parameters:

array (DNDarray) – The distributed array to be converted to a sparse DCSR_matrix.

Returns:

A sparse DCSR_matrix representation of the input DNDarray.

Return type:

DCSR_matrix

Notes

This method allows for the conversion of a DNDarray into a sparse DCSR_matrix representation, which is useful for handling large and sparse datasets efficiently.

Examples

>>> dense_array = ht.array([[1, 0, 0], [0, 0, 2], [0, 3, 0]])
>>> sparse_matrix = dense_array.to_sparse()
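The dense-to-CSR conversion underlying this method can be sketched in a few lines of plain Python: scan each row, record the column index and value of every non-zero entry, and let indptr[i+1] mark where row i ends. This is an illustration of the CSR layout, not Heat's distributed implementation:

```python
def dense_to_csr(dense):
    """Convert a dense 2-D list into CSR (indptr, indices, data).

    Pure-Python sketch of the CSR layout for illustration only.
    """
    indptr, indices, data = [0], [], []
    for row in dense:
        for col, val in enumerate(row):
            if val != 0:
                indices.append(col)  # column of the non-zero entry
                data.append(val)
        indptr.append(len(indices))  # row boundary
    return indptr, indices, data
```

For the dense array in the example above, [[1, 0, 0], [0, 0, 2], [0, 3, 0]], this yields indptr [0, 1, 2, 3], indices [0, 2, 1], and data [1, 2, 3].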