:mod:`heat.communication`
==============================
.. py:module:: heat.core.communication

.. autoapi-nested-parse::

   Module implementing the communication layer of HeAT


Module Contents
---------------


.. py:class:: MPIRequest(handle, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any] = None, recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any] = None, tensor: torch.Tensor = None, permutation: Tuple[int, Ellipsis] = None)

   Represents a handle on a non-blocking operation

   :param handle: Handle for the mpi4py Communicator
   :type handle: MPI.Communicator
   :param sendbuf: The buffer for the data to be send
   :type sendbuf: DNDarray or torch.Tensor or Any
   :param recvbuf: The buffer to the receive data
   :type recvbuf: DNDarray or torch.Tensor or Any
   :param tensor: Internal Data
   :type tensor: torch.Tensor
   :param permutation: Permutation of the tensor axes
   :type permutation: Tuple[int,...]


   .. attribute:: handle
      

   .. attribute:: tensor
      :annotation: = None

      
   .. attribute:: recvbuf
      :annotation: = None

      
   .. attribute:: sendbuf
      :annotation: = None

      
   .. attribute:: permutation
      :annotation: = None

      
   .. role:: raw-html(raw)
      :format: html
   .. method:: Wait(status: mpi4py.MPI.Status = None)

      Waits for an MPI request to complete


   .. method:: __getattr__(name: str) -> Callable

      Default pass-through for the communicator methods.

      :param name: The name of the method to be called.
      :type name: str


.. py:class:: Communication

   Base class for Communications (inteded for other backends)


   .. role:: raw-html(raw)
      :format: html
   .. method:: is_distributed() -> NotImplementedError

      Whether or not the Communication is distributed


   .. method:: chunk(shape, split) -> NotImplementedError

      Calculates the chunk of data that will be assigned to this compute node given a global data shape and a split
      axis. Returns ``(offset, local_shape, slices)``: the offset in the split dimension, the resulting local shape if the
      global input shape is chunked on the split axis and the chunk slices with respect to the given shape

      :param shape: The global shape of the data to be split
      :type shape: Tuple[int,...]
      :param split: The axis along which to chunk the data
      :type split: int


.. py:class:: MPICommunication(handle=MPI.COMM_WORLD)

   Bases: :class:`Communication`

   Class encapsulating all MPI Communication

   :param handle: Handle for the mpi4py Communicator
   :type handle: MPI.Communicator


   .. attribute:: COUNT_LIMIT
      

   .. attribute:: __mpi_type_mappings
      

   .. attribute:: __mpi_dtype2ctype
      

   .. attribute:: handle
      

   .. role:: raw-html(raw)
      :format: html
   .. method:: __del__()


   .. method:: is_distributed() -> bool

      Determines whether the communicator is distributed, i.e. handles more than one node.


   .. method:: chunk(shape: Tuple[int], split: int, rank: int = None, w_size: int = None, sparse: bool = False) -> Tuple[int, Tuple[int], Tuple[slice]]

      Calculates the chunk of data that will be assigned to this compute node given a global data shape and a split
      axis.
      Returns ``(offset, local_shape, slices)``: the offset in the split dimension, the resulting local shape if the
      global input shape is chunked on the split axis and the chunk slices with respect to the given shape

      :param shape: The global shape of the data to be split
      :type shape: Tuple[int,...]
      :param split: The axis along which to chunk the data
      :type split: int
      :param rank: Process for which the chunking is calculated for, defaults to ``self.rank``.
                   Intended for creating chunk maps without communication
      :type rank: int, optional
      :param w_size: The MPI world size, defaults to ``self.size``.
                     Intended for creating chunk maps without communication
      :type w_size: int, optional
      :param sparse: Specifies whether the array is a sparse matrix
      :type sparse: bool, optional


   .. method:: counts_displs_shape(shape: Tuple[int], axis: int) -> Tuple[Tuple[int], Tuple[int], Tuple[int]]

      Calculates the item counts, displacements and output shape for a variable sized all-to-all MPI-call (e.g.
      ``MPI_Alltoallv``). The passed shape is regularly chunk along the given axis and for all nodes.

      :param shape: The object for which to calculate the chunking.
      :type shape: Tuple[int,...]
      :param axis: The axis along which the chunking is performed.
      :type axis: int


   .. method:: mpi_type_of(dtype: torch.dtype) -> mpi4py.MPI.Datatype

      Determines the MPI Datatype from the torch dtype.

      :param dtype: PyTorch data type
      :type dtype: torch.dtype


   .. method:: _handle_large_count(mpi_type: mpi4py.MPI.Datatype, elements: int) -> Tuple[mpi4py.MPI.Datatype, int]

      Handles large counts for MPI data types by creating vector types to circumvent the MAX_INT limit on certain MPI implementations.

      :param mpi_type: The base MPI data type
      :type mpi_type: MPI.Datatype
      :param elements: The total number of elements to be sent
      :type elements: int

      :returns: A tuple containing the constructed MPI data type and the count (always 1 in this case)
      :rtype: Tuple[MPI.Datatype, int]

      :raises ValueError: If the tensor is too large to be handled

      .. rubric:: Notes

      Uses vector type to get around the MAX_INT limit on certain MPI implementations
      This is at the moment only applied when sending contiguous data, as the construction of data types to get around non-contiguous data naturally aliviates the problem to a certain extent.
      Thanks to: J. R. Hammond, A. Schäfer and R. Latham, "To INT_MAX... and Beyond! Exploring Large-Count Support in MPI," 2014 Workshop on Exascale MPI at Supercomputing Conference, New Orleans, LA, USA, 2014, pp. 1-8


   .. method:: mpi_type_and_elements_of(obj: Union[heat.core.dndarray.DNDarray, torch.Tensor], counts: Optional[Tuple[int]], displs: Tuple[int], is_contiguous: Optional[bool]) -> Tuple[mpi4py.MPI.Datatype, Tuple[int, Ellipsis]]

      Determines the MPI data type and number of respective elements for the given tensor (:class:`~heat.core.dndarray.DNDarray`
      or ``torch.Tensor). In case the tensor is contiguous in memory, a native MPI data type can be used.
      Otherwise, a derived data type is automatically constructed using the storage information of the passed object.

      :param obj: The object for which to construct the MPI data type and number of elements
      :type obj: DNDarray or torch.Tensor
      :param counts: Optional counts arguments for variable MPI-calls (e.g. Alltoallv)
      :type counts: Tuple[ints,...], optional
      :param displs: Optional displacements arguments for variable MPI-calls (e.g. Alltoallv)
      :type displs: Tuple[ints,...], optional
      :param is_contiguous: Information on global contiguity of the memory-distributed object. If `None`, it will be set to local contiguity via ``torch.Tensor.is_contiguous()``.
      :type is_contiguous: bool
      :param # ToDo:
      :type # ToDo: The option to explicitely specify the counts and displacements to be send still needs propper implementation


   .. method:: as_mpi_memory(obj: torch.Tensor) -> mpi4py.MPI.memory

      Converts the passed ``torch.Tensor`` into an MPI compatible memory view.

      :param obj: The tensor to be converted into a MPI memory view.
      :type obj: torch.Tensor


   .. method:: as_buffer(obj: torch.Tensor, counts: Optional[Tuple[int]] = None, displs: Optional[Tuple[int]] = None, is_contiguous: Optional[bool] = None) -> List[Union[mpi4py.MPI.memory, Tuple[int, int], mpi4py.MPI.Datatype]]

      Converts a passed ``torch.Tensor`` into a memory buffer object with associated number of elements and MPI data type.

      :param obj: The object to be converted into a buffer representation.
      :type obj: torch.Tensor
      :param counts: Optional counts arguments for variable MPI-calls (e.g. Alltoallv)
      :type counts: Tuple[int,...], optional
      :param displs: Optional displacements arguments for variable MPI-calls (e.g. Alltoallv)
      :type displs: Tuple[int,...], optional
      :param is_contiguous: Optional information on global contiguity of the memory-distributed object.
      :type is_contiguous: bool, optional


   .. method:: _moveToCompDevice(x: torch.Tensor, func: Callable | None) -> torch.Tensor

      Moves the torch tensor to the relevant device, in case the function is not compatible with the MPI+GPU library.


   .. method:: alltoall_sendbuffer(obj: torch.Tensor) -> List[Union[mpi4py.MPI.memory, Tuple[int, int], mpi4py.MPI.Datatype]]

      Converts a passed ``torch.Tensor`` into a memory buffer object with associated number of elements and MPI data type.
      XXX: might not work for all MPI stacks. Might require multiple type commits or so

      :param obj: The object to be transformed into a custom MPI datatype
      :type obj: torch.Tensor


   .. method:: alltoall_recvbuffer(obj: torch.Tensor) -> List[Union[mpi4py.MPI.memory, Tuple[int, int], mpi4py.MPI.Datatype]]

      Converts a passed ``torch.Tensor`` into a memory buffer object with associated number of elements and MPI data type.
      XXX: might not work for all MPI stacks. Might require multiple type commits or so

      :param obj: The object to be transformed into a custom MPI datatype
      :type obj: torch.Tensor


   .. method:: Free() -> None

      Free a communicator.


   .. method:: Split(color: int = 0, key: int = 0) -> MPICommunication

      Split communicator by color and key.

      :param color: Determines the new communicator for a process.
      :type color: int, optional
      :param key: Ordering within the new communicator.
      :type key: int, optional


   .. method:: Irecv(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], source: int = MPI.ANY_SOURCE, tag: int = MPI.ANY_TAG) -> MPIRequest

      Nonblocking receive

      :param buf: Buffer address where to place the received message
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param source: Rank of source process, that send the message
      :type source: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Recv(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], source: int = MPI.ANY_SOURCE, tag: int = MPI.ANY_TAG, status: mpi4py.MPI.Status = None)

      Blocking receive

      :param buf: Buffer address where to place the received message
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param source: Rank of the source process, that send the message
      :type source: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional
      :param status: Details on the communication
      :type status: MPI.Status, optional


   .. method:: __send_like(func: Callable, buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int) -> Tuple[Optional[Union[heat.core.dndarray.DNDarray, torch.Tensor]]]

      Generic function for sending a message to process with rank "dest"

      :param func: The respective MPI sending function
      :type func: Callable
      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Bsend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0)

      Blocking buffered send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Index of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Ibsend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0) -> MPIRequest

      Nonblocking buffered send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Irsend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0) -> MPIRequest

      Nonblocking ready send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Isend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0) -> MPIRequest

      Nonblocking send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Issend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0) -> MPIRequest

      Nonblocking synchronous send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Rsend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0)

      Blocking ready send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Ssend(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0)

      Blocking synchronous send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: Send(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], dest: int, tag: int = 0)

      Blocking send

      :param buf: Buffer address of the message to be send
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param dest: Rank of the destination process, that receives the message
      :type dest: int, optional
      :param tag: A Tag to identify the message
      :type tag: int, optional


   .. method:: __broadcast_like(func: Callable, buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int) -> Tuple[Optional[heat.core.dndarray.DNDarray, torch.Tensor]]

      Generic function for broadcasting a message from the process with rank "root" to all other processes of the
      communicator

      :param func: The respective MPI broadcast function
      :type func: Callable
      :param buf: Buffer address of the message to be broadcasted
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of the root process, that broadcasts the message
      :type root: int


   .. method:: Bcast(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0) -> None

      Blocking Broadcast

      :param buf: Buffer address of the message to be broadcasted
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of the root process, that broadcasts the message
      :type root: int


   .. method:: Ibcast(buf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0) -> MPIRequest

      Nonblocking Broadcast

      :param buf: Buffer address of the message to be broadcasted
      :type buf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of the root process, that broadcasts the message
      :type root: int


   .. method:: __derived_op(tensor: torch.Tensor, datatype: mpi4py.MPI.Datatype, operation: mpi4py.MPI.Op) -> Callable[[mpi4py.MPI.memory, mpi4py.MPI.memory, mpi4py.MPI.Datatype], None]


   .. method:: _minmax_op(dtype: torch.dtype, total_count: int, shape: Tuple[int], stride: Tuple[int], offset: int = 0) -> Callable[[mpi4py.MPI.memory, mpi4py.MPI.memory, mpi4py.MPI.Datatype], None]

      Create an MPI.Op for elementwise min/max combine of a packed buffer [mins; maxs].

      :param dtype: torch.dtype of underlying elements
      :type dtype: torch.dtype
      :param total_count: Number of elements per mins OR per max (so recv buffer has 2*total_count elements)
      :type total_count: int
      :param shape: Shape of the packed buffer that the MPI callback will operate on.
                    This describes the logical shape of the concatenated buffer [mins; maxs]
      :type shape: Tuple[int]
      :param stride: Stride (in elements) of the packed buffer's storage, matching the layout
      :type stride: Tuple[int]
      :param offset: Storage offset (if needed), default 0
      :type offset: int, optional


   .. method:: __reduce_like(func: Callable, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op, *args: Any, **kwargs: Any) -> Tuple[Optional[heat.core.dndarray.DNDarray, torch.Tensor]]

      Generic function for reduction operations.

      :param func: The respective MPI reduction operation
      :type func: Callable
      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: Operation to apply during the reduction.
      :type op: MPI.Op
      :param \*args: Additional positional arguments to be passed to the function
      :type \*args: Any
      :param \*\*kwargs: Additional keyword arguments to be passed to the function
      :type \*\*kwargs: Any


   .. method:: Allreduce(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM)

      Combines values from all processes and distributes the result back to all processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: Exscan(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM)

      Computes the exclusive scan (partial reductions) of data on a collection of processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: Iallreduce(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM) -> MPIRequest

      Nonblocking allreduce reducing values on all processes to a single value

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: Iexscan(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM) -> MPIRequest

      Nonblocking Exscan

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: Iscan(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM) -> MPIRequest

      Nonblocking Scan

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: Ireduce(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM, root: int = 0) -> MPIRequest

      Nonblocking reduction operation

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op
      :param root: Rank of the root process
      :type root: int


   .. method:: Reduce(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM, root: int = 0)

      Reduce values from all processes to a single value on process "root"

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op
      :param root: Rank of the root process
      :type root: int


   .. method:: Scan(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], op: mpi4py.MPI.Op = MPI.SUM)

      Computes the scan (partial reductions) of data on a collection of processes in a nonblocking way

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result of the reduction
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param op: The operation to perform upon reduction
      :type op: MPI.Op


   .. method:: __allgather_like(func: Callable, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], axis: int, **kwargs)

      Generic function for allgather operations.

      :param func: Type of MPI Allgather function (i.e. allgather, allgatherv, iallgather)
      :type func: Callable
      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param axis: Concatenation axis: The axis along which ``sendbuf`` is packed and along which ``recvbuf`` puts together individual chunks
      :type axis: int
      :param \*\*kwargs: Extra arguments to be passed to the function.


   .. method:: Allgather(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recv_axis: int = 0)

      Gathers data from all tasks and distribute the combined data to all tasks

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param recv_axis: Concatenation axis: The axis along which ``sendbuf`` is packed and along which ``recvbuf`` puts together individual chunks
      :type recv_axis: int


   .. method:: Allgatherv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recv_axis: int = 0)

      v-call of Allgather: Each process may contribute a different amount of data.

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param recv_axis: Concatenation axis: The axis along which ``sendbuf`` is packed and along which ``recvbuf`` puts together individual chunks
      :type recv_axis: int


   .. method:: Iallgather(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recv_axis: int = 0) -> MPIRequest

      Nonblocking Allgather.

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param recv_axis: Concatenation axis: The axis along which ``sendbuf`` is packed and along which ``recvbuf`` puts together individual chunks
      :type recv_axis: int


   .. method:: Iallgatherv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recv_axis: int = 0)

      Nonblocking v-call of Allgather: Each process may contribute a different amount of data.

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param recv_axis: Concatenation axis: The axis along which ``sendbuf`` is packed and along which ``recvbuf`` puts together individual chunks
      :type recv_axis: int


   .. method:: __alltoall_like(func: Callable, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int, recv_axis: int, **kwargs)

      Generic function for alltoall operations.

      :param func: Specific alltoall function
      :type func: Callable
      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis:
                        Future split axis, along which data blocks will be created that will be send to individual ranks

                            - if ``send_axis==recv_axis``, an error will be thrown
                            - if ``send_axis`` or ``recv_axis`` are ``None``, an error will be thrown
      :type send_axis: int
      :param recv_axis: Prior split axis, along which blocks are received from the individual ranks
      :type recv_axis: int
      :param \*\*kwargs: Extra arguments to be passed to the function.


   .. method:: Alltoall(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int = 0, recv_axis: int = None)

      All processes send data to all processes: The jth block sent from process i is received by process j and is
      placed in the ith block of recvbuf.

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis:
                        Future split axis, along which data blocks will be created that will be send to individual ranks

                            - if ``send_axis==recv_axis``, an error will be thrown
                            - if ``send_axis`` or ``recv_axis`` are ``None``, an error will be thrown
      :type send_axis: int
      :param recv_axis: Prior split axis, along which blocks are received from the individual ranks
      :type recv_axis: int


   .. method:: Alltoallv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int = 0, recv_axis: int = None)

      v-call of Alltoall: All processes send different amount of data to, and receive different amount of data
      from, all processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis:
                        Future split axis, along which data blocks will be created that will be send to individual ranks

                            - if ``send_axis==recv_axis``, an error will be thrown
                            - if ``send_axis`` or ``recv_axis`` are ``None``, an error will be thrown
      :type send_axis: int
      :param recv_axis: Prior split axis, along which blocks are received from the individual ranks
      :type recv_axis: int


   .. method:: Alltoallw(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any])

      Generalized All-to-All communication allowing different counts, displacements and datatypes for each partner. See MPI standard for more information.

      :param sendbuf: Buffer address of the send message. The buffer is expected to be a tuple of the form (buffer, (counts, displacements), subarray_params_list), where subarray_params_list is a list of tuples of the form (lshape, subsizes, substarts).
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result. The buffer is expected to be a tuple of the form (buffer, (counts, displacements), subarray_params_list), where subarray_params_list is a list of tuples of the form (lshape, subsizes, substarts).
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]


   .. method:: _create_recursive_vectortype(datatype: mpi4py.MPI.Datatype, tensor_stride: Tuple[int], subarray_sizes: List[int], start: List[int]) -> mpi4py.MPI.Datatype

      Create a recursive vector to handle non-contiguous tensor data. The created datatype will be a recursively defined vector datatype that will enable the collection of  non-contiguous tensor data in the specified subarray sizes.

      :param datatype: The base datatype to create the recursive vector datatype from.
      :type datatype: MPI.Datatype
      :param tensor_stride: A list of tensor strides for each dimension.
      :type tensor_stride: Tuple[int]
      :param subarray_sizes: A list of subarray sizes for each dimension.
      :type subarray_sizes: List[int]
      :param start: Index of the first element of the subarray in the original array.
      :type start: List[int]

      .. rubric:: Notes

      This function creates a recursive vector datatype by defining vectors out of the previous datatype with specified strides and sizes. The extent (size of the data type in bytes) of the new datatype is set to the extent of the basic datatype to allow interweaving of data.

      .. rubric:: Examples

      >>> datatype = MPI.INT
      >>> tensor_stride = [1, 2, 3]
      >>> subarray_sizes = [4, 5, 6]
      >>> recursive_vectortype = create_recursive_vectortype(
      ...     datatype, tensor_stride, subarray_sizes
      ... )


   .. method:: Ialltoall(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int = 0, recv_axis: int = None) -> MPIRequest

      Nonblocking Alltoall

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis:
                        Future split axis, along which data blocks will be created that will be send to individual ranks

                            - if ``send_axis==recv_axis``, an error will be thrown
                            - if ``send_axis`` or ``recv_axis`` are ``None``, an error will be thrown
      :type send_axis: int
      :param recv_axis: Prior split axis, along which blocks are received from the individual ranks
      :type recv_axis: int


   .. method:: Ialltoallv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int = 0, recv_axis: int = None) -> MPIRequest

      Nonblocking v-call of Alltoall: All processes send different amount of data to, and receive different amount of
      data from, all processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis:
                        Future split axis, along which data blocks will be created that will be send to individual ranks

                            - if ``send_axis==recv_axis``, an error will be thrown
                            - if ``send_axis`` or ``recv_axis`` are ``None``, an error will be thrown
      :type send_axis: int
      :param recv_axis: Prior split axis, along which blocks are received from the individual ranks
      :type recv_axis: int


   .. method:: __gather_like(func: Callable, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int, recv_axis: int, send_factor: int = 1, recv_factor: int = 1, **kwargs)

      Generic function for gather operations.

      :param func: Type of MPI Scatter/Gather function
      :type func: Callable
      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis: The axis along which ``sendbuf`` is packed
      :type send_axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int
      :param send_factor: Number of elements to be scattered (vor non-v-calls)
      :type send_factor: int
      :param recv_factor: Number of elements to be gathered (vor non-v-calls)
      :type recv_factor: int
      :param \*\*kwargs: Extra arguments to be passed to the function.


   .. method:: Gather(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None)

      Gathers together values from a group of processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of receiving process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Gatherv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None)

      v-call for Gather: All processes send different amount of data

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of receiving process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Igather(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None) -> MPIRequest

      Non-blocking Gather

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of receiving process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Igatherv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None) -> MPIRequest

      Non-blocking v-call for Gather: All processes send different amount of data

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of receiving process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: __scatter_like(func: Callable, sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], send_axis: int, recv_axis: int, send_factor: int = 1, recv_factor: int = 1, **kwargs)

      Generic function for scatter operations.

      :param func: Type of MPI Scatter/Gather function
      :type func: Callable
      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param send_axis: The axis along which ``sendbuf`` is packed
      :type send_axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int
      :param send_factor: Number of elements to be scattered (vor non-v-calls)
      :type send_factor: int
      :param recv_factor: Number of elements to be gathered (vor non-v-calls)
      :type recv_factor: int
      :param \*\*kwargs: Extra arguments to be passed to the function.


   .. method:: Iscatter(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None) -> MPIRequest

      Non-blocking Scatter

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of sending process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Iscatterv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None) -> MPIRequest

      Non-blocking v-call for Scatter: Sends different amounts of data to different processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of sending process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Scatter(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], root: int = 0, axis: int = 0, recv_axis: int = None)

      Sends data parts from one process to all other processes in a communicator

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of sending process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: Scatterv(sendbuf: Union[heat.core.dndarray.DNDarray, torch.Tensor, Any], recvbuf: int, root: int = 0, axis: int = 0, recv_axis: int = None)

      v-call for Scatter: Sends different amounts of data to different processes

      :param sendbuf: Buffer address of the send message
      :type sendbuf: Union[DNDarray, torch.Tensor, Any]
      :param recvbuf: Buffer address where to store the result
      :type recvbuf: Union[DNDarray, torch.Tensor, Any]
      :param root: Rank of sending process
      :type root: int
      :param axis: The axis along which ``sendbuf`` is packed
      :type axis: int
      :param recv_axis: The axis along which ``recvbuf`` is packed
      :type recv_axis: int


   .. method:: __getattr__(name: str)

      Default pass-through for the communicator methods.

      :param name: The name of the method to be called.
      :type name: str


.. function:: get_comm() -> Communication

   Retrieves the currently globally set default communication.


.. function:: sanitize_comm(comm: Optional[Communication]) -> Communication

   Sanitizes a device or device identifier, i.e. checks whether it is already an instance of :class:`heat.core.devices.Device`
   or a string with known device identifier and maps it to a proper ``Device``.

   :param comm: The comm to be sanitized
   :type comm: Communication

   :raises TypeError: If the given communication is not the proper type


.. function:: use_comm(comm: Communication = None)

   Sets the globally used default communicator.

   :param comm: The communication to be set
   :type comm: Communication or None