heat.communication
Module implementing the communication layer of HeAT
Module Contents
- class MPIRequest(handle, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any = None, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any = None, tensor: torch.Tensor = None, permutation: Tuple[int, Ellipsis] = None)[source]
Represents a handle on a non-blocking operation
- Parameters:
handle (MPI.Communicator) – Handle for the mpi4py Communicator
sendbuf (DNDarray or torch.Tensor or Any) – The buffer for the data to be send
recvbuf (DNDarray or torch.Tensor or Any) – The buffer to the receive data
tensor (torch.Tensor) – Internal Data
permutation (Tuple[int,...]) – Permutation of the tensor axes
- handle
- tensor = None
- recvbuf = None
- sendbuf = None
- permutation = None
- class Communication[source]
Base class for Communications (inteded for other backends)
- chunk(shape, split) NotImplementedError[source]
Calculates the chunk of data that will be assigned to this compute node given a global data shape and a split axis. Returns
(offset, local_shape, slices): the offset in the split dimension, the resulting local shape if the global input shape is chunked on the split axis and the chunk slices with respect to the given shape- Parameters:
shape (Tuple[int,...]) – The global shape of the data to be split
split (int) – The axis along which to chunk the data
- class MPICommunication(handle=MPI.COMM_WORLD)[source]
Bases:
CommunicationClass encapsulating all MPI Communication
- Parameters:
handle (MPI.Communicator) – Handle for the mpi4py Communicator
- COUNT_LIMIT
- __mpi_type_mappings
- __mpi_dtype2ctype
- handle
- is_distributed() bool[source]
Determines whether the communicator is distributed, i.e. handles more than one node.
- chunk(shape: Tuple[int], split: int, rank: int = None, w_size: int = None, sparse: bool = False) Tuple[int, Tuple[int], Tuple[slice]][source]
Calculates the chunk of data that will be assigned to this compute node given a global data shape and a split axis. Returns
(offset, local_shape, slices): the offset in the split dimension, the resulting local shape if the global input shape is chunked on the split axis and the chunk slices with respect to the given shape- Parameters:
shape (Tuple[int,...]) – The global shape of the data to be split
split (int) – The axis along which to chunk the data
rank (int, optional) – Process for which the chunking is calculated for, defaults to
self.rank. Intended for creating chunk maps without communicationw_size (int, optional) – The MPI world size, defaults to
self.size. Intended for creating chunk maps without communicationsparse (bool, optional) – Specifies whether the array is a sparse matrix
- counts_displs_shape(shape: Tuple[int], axis: int) Tuple[Tuple[int], Tuple[int], Tuple[int]][source]
Calculates the item counts, displacements and output shape for a variable sized all-to-all MPI-call (e.g.
MPI_Alltoallv). The passed shape is regularly chunk along the given axis and for all nodes.- Parameters:
shape (Tuple[int,...]) – The object for which to calculate the chunking.
axis (int) – The axis along which the chunking is performed.
- mpi_type_of(dtype: torch.dtype) mpi4py.MPI.Datatype[source]
Determines the MPI Datatype from the torch dtype.
- Parameters:
dtype (torch.dtype) – PyTorch data type
- _handle_large_count(mpi_type: mpi4py.MPI.Datatype, elements: int) Tuple[mpi4py.MPI.Datatype, int][source]
Handles large counts for MPI data types by creating vector types to circumvent the MAX_INT limit on certain MPI implementations.
- Parameters:
mpi_type (MPI.Datatype) – The base MPI data type
elements (int) – The total number of elements to be sent
- Returns:
A tuple containing the constructed MPI data type and the count (always 1 in this case)
- Return type:
Tuple[MPI.Datatype, int]
- Raises:
ValueError – If the tensor is too large to be handled
Notes
Uses vector type to get around the MAX_INT limit on certain MPI implementations This is at the moment only applied when sending contiguous data, as the construction of data types to get around non-contiguous data naturally aliviates the problem to a certain extent. Thanks to: J. R. Hammond, A. Schäfer and R. Latham, “To INT_MAX… and Beyond! Exploring Large-Count Support in MPI,” 2014 Workshop on Exascale MPI at Supercomputing Conference, New Orleans, LA, USA, 2014, pp. 1-8
- mpi_type_and_elements_of(obj: heat.core.dndarray.DNDarray | torch.Tensor, counts: Tuple[int] | None, displs: Tuple[int], is_contiguous: bool | None) Tuple[mpi4py.MPI.Datatype, Tuple[int, Ellipsis]][source]
Determines the MPI data type and number of respective elements for the given tensor (
DNDarrayor ``torch.Tensor). In case the tensor is contiguous in memory, a native MPI data type can be used. Otherwise, a derived data type is automatically constructed using the storage information of the passed object.- Parameters:
obj (DNDarray or torch.Tensor) – The object for which to construct the MPI data type and number of elements
counts (Tuple[ints,...], optional) – Optional counts arguments for variable MPI-calls (e.g. Alltoallv)
displs (Tuple[ints,...], optional) – Optional displacements arguments for variable MPI-calls (e.g. Alltoallv)
is_contiguous (bool) – Information on global contiguity of the memory-distributed object. If None, it will be set to local contiguity via
torch.Tensor.is_contiguous().ToDo (#)
- as_mpi_memory(obj: torch.Tensor) mpi4py.MPI.memory[source]
Converts the passed
torch.Tensorinto an MPI compatible memory view.- Parameters:
obj (torch.Tensor) – The tensor to be converted into a MPI memory view.
- as_buffer(obj: torch.Tensor, counts: Tuple[int] | None = None, displs: Tuple[int] | None = None, is_contiguous: bool | None = None) List[mpi4py.MPI.memory | Tuple[int, int] | mpi4py.MPI.Datatype][source]
Converts a passed
torch.Tensorinto a memory buffer object with associated number of elements and MPI data type.- Parameters:
obj (torch.Tensor) – The object to be converted into a buffer representation.
counts (Tuple[int,...], optional) – Optional counts arguments for variable MPI-calls (e.g. Alltoallv)
displs (Tuple[int,...], optional) – Optional displacements arguments for variable MPI-calls (e.g. Alltoallv)
is_contiguous (bool, optional) – Optional information on global contiguity of the memory-distributed object.
- _moveToCompDevice(x: torch.Tensor, func: Callable | None) torch.Tensor[source]
Moves the torch tensor to the relevant device, in case the function is not compatible with the MPI+GPU library.
- alltoall_sendbuffer(obj: torch.Tensor) List[mpi4py.MPI.memory | Tuple[int, int] | mpi4py.MPI.Datatype][source]
Converts a passed
torch.Tensorinto a memory buffer object with associated number of elements and MPI data type. XXX: might not work for all MPI stacks. Might require multiple type commits or so- Parameters:
obj (torch.Tensor) – The object to be transformed into a custom MPI datatype
- alltoall_recvbuffer(obj: torch.Tensor) List[mpi4py.MPI.memory | Tuple[int, int] | mpi4py.MPI.Datatype][source]
Converts a passed
torch.Tensorinto a memory buffer object with associated number of elements and MPI data type. XXX: might not work for all MPI stacks. Might require multiple type commits or so- Parameters:
obj (torch.Tensor) – The object to be transformed into a custom MPI datatype
- Split(color: int = 0, key: int = 0) MPICommunication[source]
Split communicator by color and key.
- Parameters:
color (int, optional) – Determines the new communicator for a process.
key (int, optional) – Ordering within the new communicator.
- Irecv(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, source: int = MPI.ANY_SOURCE, tag: int = MPI.ANY_TAG) MPIRequest[source]
Nonblocking receive
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to place the received message
source (int, optional) – Rank of source process, that send the message
tag (int, optional) – A Tag to identify the message
- Recv(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, source: int = MPI.ANY_SOURCE, tag: int = MPI.ANY_TAG, status: mpi4py.MPI.Status = None)[source]
Blocking receive
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to place the received message
source (int, optional) – Rank of the source process, that send the message
tag (int, optional) – A Tag to identify the message
status (MPI.Status, optional) – Details on the communication
- __send_like(func: Callable, buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int) Tuple[heat.core.dndarray.DNDarray | torch.Tensor | None]
Generic function for sending a message to process with rank “dest”
- Parameters:
func (Callable) – The respective MPI sending function
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Bsend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0)[source]
Blocking buffered send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Index of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Ibsend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0) MPIRequest[source]
Nonblocking buffered send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Irsend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0) MPIRequest[source]
Nonblocking ready send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Isend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0) MPIRequest[source]
Nonblocking send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Issend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0) MPIRequest[source]
Nonblocking synchronous send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Rsend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0)[source]
Blocking ready send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Ssend(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0)[source]
Blocking synchronous send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- Send(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, dest: int, tag: int = 0)[source]
Blocking send
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be send
dest (int, optional) – Rank of the destination process, that receives the message
tag (int, optional) – A Tag to identify the message
- __broadcast_like(func: Callable, buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int) Tuple[heat.core.dndarray.DNDarray | torch.Tensor | None]
Generic function for broadcasting a message from the process with rank “root” to all other processes of the communicator
- Parameters:
func (Callable) – The respective MPI broadcast function
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be broadcasted
root (int) – Rank of the root process, that broadcasts the message
- Bcast(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0) None[source]
Blocking Broadcast
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be broadcasted
root (int) – Rank of the root process, that broadcasts the message
- Ibcast(buf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0) MPIRequest[source]
Nonblocking Broadcast
- Parameters:
buf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the message to be broadcasted
root (int) – Rank of the root process, that broadcasts the message
- __derived_op(tensor: torch.Tensor, datatype: mpi4py.MPI.Datatype, operation: mpi4py.MPI.Op) Callable[[mpi4py.MPI.memory, mpi4py.MPI.memory, mpi4py.MPI.Datatype], None]
- _minmax_op(dtype: torch.dtype, total_count: int, shape: Tuple[int], stride: Tuple[int], offset: int = 0) Callable[[mpi4py.MPI.memory, mpi4py.MPI.memory, mpi4py.MPI.Datatype], None][source]
Create an MPI.Op for elementwise min/max combine of a packed buffer [mins; maxs].
- Parameters:
dtype (torch.dtype) – torch.dtype of underlying elements
total_count (int) – Number of elements per mins OR per max (so recv buffer has 2*total_count elements)
shape (Tuple[int]) – Shape of the packed buffer that the MPI callback will operate on. This describes the logical shape of the concatenated buffer [mins; maxs]
stride (Tuple[int]) – Stride (in elements) of the packed buffer’s storage, matching the layout
offset (int, optional) – Storage offset (if needed), default 0
- __reduce_like(func: Callable, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op, *args: Any, **kwargs: Any) Tuple[heat.core.dndarray.DNDarray | torch.Tensor | None]
Generic function for reduction operations.
- Parameters:
func (Callable) – The respective MPI reduction operation
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result of the reduction
op (MPI.Op) – Operation to apply during the reduction.
*args (Any) – Additional positional arguments to be passed to the function
**kwargs (Any) – Additional keyword arguments to be passed to the function
- Allreduce(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM)[source]
Combines values from all processes and distributes the result back to all processes
- Exscan(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM)[source]
Computes the exclusive scan (partial reductions) of data on a collection of processes
- Iallreduce(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM) MPIRequest[source]
Nonblocking allreduce reducing values on all processes to a single value
- Iexscan(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM) MPIRequest[source]
Nonblocking Exscan
- Iscan(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM) MPIRequest[source]
Nonblocking Scan
- Ireduce(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM, root: int = 0) MPIRequest[source]
Nonblocking reduction operation
- Reduce(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM, root: int = 0)[source]
Reduce values from all processes to a single value on process “root”
- Scan(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, op: mpi4py.MPI.Op = MPI.SUM)[source]
Computes the scan (partial reductions) of data on a collection of processes in a nonblocking way
- __allgather_like(func: Callable, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, axis: int, **kwargs)
Generic function for allgather operations.
- Parameters:
func (Callable) – Type of MPI Allgather function (i.e. allgather, allgatherv, iallgather)
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
axis (int) – Concatenation axis: The axis along which
sendbufis packed and along whichrecvbufputs together individual chunks**kwargs – Extra arguments to be passed to the function.
- Allgather(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recv_axis: int = 0)[source]
Gathers data from all tasks and distribute the combined data to all tasks
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
recv_axis (int) – Concatenation axis: The axis along which
sendbufis packed and along whichrecvbufputs together individual chunks
- Allgatherv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recv_axis: int = 0)[source]
v-call of Allgather: Each process may contribute a different amount of data.
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
recv_axis (int) – Concatenation axis: The axis along which
sendbufis packed and along whichrecvbufputs together individual chunks
- Iallgather(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recv_axis: int = 0) MPIRequest[source]
Nonblocking Allgather.
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
recv_axis (int) – Concatenation axis: The axis along which
sendbufis packed and along whichrecvbufputs together individual chunks
- Iallgatherv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recv_axis: int = 0)[source]
Nonblocking v-call of Allgather: Each process may contribute a different amount of data.
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
recv_axis (int) – Concatenation axis: The axis along which
sendbufis packed and along whichrecvbufputs together individual chunks
- __alltoall_like(func: Callable, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int, recv_axis: int, **kwargs)
Generic function for alltoall operations.
- Parameters:
func (Callable) – Specific alltoall function
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) –
Future split axis, along which data blocks will be created that will be send to individual ranks
if
send_axis==recv_axis, an error will be thrownif
send_axisorrecv_axisareNone, an error will be thrown
recv_axis (int) – Prior split axis, along which blocks are received from the individual ranks
**kwargs – Extra arguments to be passed to the function.
- Alltoall(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int = 0, recv_axis: int = None)[source]
All processes send data to all processes: The jth block sent from process i is received by process j and is placed in the ith block of recvbuf.
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) –
Future split axis, along which data blocks will be created that will be send to individual ranks
if
send_axis==recv_axis, an error will be thrownif
send_axisorrecv_axisareNone, an error will be thrown
recv_axis (int) – Prior split axis, along which blocks are received from the individual ranks
- Alltoallv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int = 0, recv_axis: int = None)[source]
v-call of Alltoall: All processes send different amount of data to, and receive different amount of data from, all processes
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) –
Future split axis, along which data blocks will be created that will be send to individual ranks
if
send_axis==recv_axis, an error will be thrownif
send_axisorrecv_axisareNone, an error will be thrown
recv_axis (int) – Prior split axis, along which blocks are received from the individual ranks
- Alltoallw(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any)[source]
Generalized All-to-All communication allowing different counts, displacements and datatypes for each partner. See MPI standard for more information.
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message. The buffer is expected to be a tuple of the form (buffer, (counts, displacements), subarray_params_list), where subarray_params_list is a list of tuples of the form (lshape, subsizes, substarts).
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result. The buffer is expected to be a tuple of the form (buffer, (counts, displacements), subarray_params_list), where subarray_params_list is a list of tuples of the form (lshape, subsizes, substarts).
- _create_recursive_vectortype(datatype: mpi4py.MPI.Datatype, tensor_stride: Tuple[int], subarray_sizes: List[int], start: List[int]) mpi4py.MPI.Datatype[source]
Create a recursive vector to handle non-contiguous tensor data. The created datatype will be a recursively defined vector datatype that will enable the collection of non-contiguous tensor data in the specified subarray sizes.
- Parameters:
datatype (MPI.Datatype) – The base datatype to create the recursive vector datatype from.
tensor_stride (Tuple[int]) – A list of tensor strides for each dimension.
subarray_sizes (List[int]) – A list of subarray sizes for each dimension.
start (List[int]) – Index of the first element of the subarray in the original array.
Notes
This function creates a recursive vector datatype by defining vectors out of the previous datatype with specified strides and sizes. The extent (size of the data type in bytes) of the new datatype is set to the extent of the basic datatype to allow interweaving of data.
Examples
>>> datatype = MPI.INT >>> tensor_stride = [1, 2, 3] >>> subarray_sizes = [4, 5, 6] >>> recursive_vectortype = create_recursive_vectortype( ... datatype, tensor_stride, subarray_sizes ... )
- Ialltoall(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int = 0, recv_axis: int = None) MPIRequest[source]
Nonblocking Alltoall
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) –
Future split axis, along which data blocks will be created that will be send to individual ranks
if
send_axis==recv_axis, an error will be thrownif
send_axisorrecv_axisareNone, an error will be thrown
recv_axis (int) – Prior split axis, along which blocks are received from the individual ranks
- Ialltoallv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int = 0, recv_axis: int = None) MPIRequest[source]
Nonblocking v-call of Alltoall: All processes send different amount of data to, and receive different amount of data from, all processes
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) –
Future split axis, along which data blocks will be created that will be send to individual ranks
if
send_axis==recv_axis, an error will be thrownif
send_axisorrecv_axisareNone, an error will be thrown
recv_axis (int) – Prior split axis, along which blocks are received from the individual ranks
- __gather_like(func: Callable, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int, recv_axis: int, send_factor: int = 1, recv_factor: int = 1, **kwargs)
Generic function for gather operations.
- Parameters:
func (Callable) – Type of MPI Scatter/Gather function
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packedsend_factor (int) – Number of elements to be scattered (vor non-v-calls)
recv_factor (int) – Number of elements to be gathered (vor non-v-calls)
**kwargs – Extra arguments to be passed to the function.
- Gather(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None)[source]
Gathers together values from a group of processes
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of receiving process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Gatherv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None)[source]
v-call for Gather: All processes send different amount of data
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of receiving process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Igather(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None) MPIRequest[source]
Non-blocking Gather
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of receiving process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Igatherv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None) MPIRequest[source]
Non-blocking v-call for Gather: All processes send different amount of data
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of receiving process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- __scatter_like(func: Callable, sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, send_axis: int, recv_axis: int, send_factor: int = 1, recv_factor: int = 1, **kwargs)
Generic function for scatter operations.
- Parameters:
func (Callable) – Type of MPI Scatter/Gather function
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
send_axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packedsend_factor (int) – Number of elements to be scattered (vor non-v-calls)
recv_factor (int) – Number of elements to be gathered (vor non-v-calls)
**kwargs – Extra arguments to be passed to the function.
- Iscatter(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None) MPIRequest[source]
Non-blocking Scatter
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of sending process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Iscatterv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None) MPIRequest[source]
Non-blocking v-call for Scatter: Sends different amounts of data to different processes
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of sending process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Scatter(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, root: int = 0, axis: int = 0, recv_axis: int = None)[source]
Sends data parts from one process to all other processes in a communicator
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of sending process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- Scatterv(sendbuf: heat.core.dndarray.DNDarray | torch.Tensor | Any, recvbuf: int, root: int = 0, axis: int = 0, recv_axis: int = None)[source]
v-call for Scatter: Sends different amounts of data to different processes
- Parameters:
sendbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address of the send message
recvbuf (Union[DNDarray, torch.Tensor, Any]) – Buffer address where to store the result
root (int) – Rank of sending process
axis (int) – The axis along which
sendbufis packedrecv_axis (int) – The axis along which
recvbufis packed
- get_comm() Communication[source]
Retrieves the currently globally set default communication.
- sanitize_comm(comm: Communication | None) Communication[source]
Sanitizes a device or device identifier, i.e. checks whether it is already an instance of
heat.core.devices.Deviceor a string with known device identifier and maps it to a properDevice.- Parameters:
comm (Communication) – The comm to be sanitized
- Raises:
TypeError – If the given communication is not the proper type
- use_comm(comm: Communication = None)[source]
Sets the globally used default communicator.
- Parameters:
comm (Communication or None) – The communication to be set