:mod:`heat.graph`
=================

.. py:module:: heat.graph

.. autoapi-nested-parse::

   Import the graph functions into the graph namespace.


Submodules
----------

.. toctree::
   :titlesonly:
   :maxdepth: 1

   laplacian/index.rst


Package Contents
----------------

.. py:class:: DNDarray(array: torch.Tensor, gshape: Tuple[int, Ellipsis], dtype: heat.core.types.datatype, split: Union[int, None], device: heat.core.devices.Device, comm: Communication, balanced: bool)

   Distributed N-Dimensional array. The core element of HeAT. It is composed of
   PyTorch tensors local to each process.

   :param array: Local array elements
   :type array: torch.Tensor
   :param gshape: The global shape of the array
   :type gshape: Tuple[int,...]
   :param dtype: The datatype of the array
   :type dtype: datatype
   :param split: The axis on which the array is divided between processes
   :type split: int or None
   :param device: The device on which the local arrays reside (cpu or gpu)
   :type device: Device
   :param comm: The communications object for sending and receiving data
   :type comm: Communication
   :param balanced: Describes whether the data are evenly distributed across processes.
                    If this information is not available (``self.balanced is None``), it can be
                    gathered via the :func:`is_balanced()` method (requires communication).
   :type balanced: bool or None

   .. attribute:: __array

   .. attribute:: __gshape

   .. attribute:: __dtype

   .. attribute:: __split

   .. attribute:: __device

   .. attribute:: __comm

   .. attribute:: __balanced

   .. attribute:: __ishalo
      :annotation: = False

   .. attribute:: __halo_next
      :annotation: = None

   .. attribute:: __halo_prev
      :annotation: = None

   .. attribute:: __partitions_dict__
      :annotation: = None

   .. attribute:: __lshape_map
      :annotation: = None

   .. role:: raw-html(raw)
       :format: html

   .. 
method:: __prephalo(start, end) -> torch.Tensor

      Extracts the halo indexed by start, end from ``self.array`` in the direction of ``self.split``

      :param start: Start index of the halo extracted from ``self.array``
      :type start: int
      :param end: End index of the halo extracted from ``self.array``
      :type end: int

   .. method:: get_halo(halo_size: int, prev: bool = True, next: bool = True)

      Fetch halos of size ``halo_size`` from neighboring ranks and save them in
      ``self.halo_next/self.halo_prev``.

      :param halo_size: Size of the halo.
      :type halo_size: int
      :param prev: If True, fetch the halo from the previous rank. Default: True.
      :type prev: bool, optional
      :param next: If True, fetch the halo from the next rank. Default: True.
      :type next: bool, optional

   .. method:: __cat_halo() -> torch.Tensor

      Return local array concatenated to halos if they are available.

   .. method:: __array__() -> numpy.ndarray

      Returns a view of the process-local slice of the :class:`DNDarray` as a numpy ndarray, if the
      ``DNDarray`` resides on CPU. Otherwise, it returns a copy, on CPU, of the process-local slice
      of ``DNDarray`` as numpy ndarray.

   .. method:: __array_ufunc__(ufunc, method, *inputs, **kwargs)

      Override NumPy's universal functions.

   .. method:: __array_function__(func, types, args, kwargs)

      Augments NumPy's functions.

   .. method:: astype(dtype, copy=True) -> DNDarray

      Returns a cast version of this array. The cast array is a new array of the same shape but
      with the given type. If ``copy`` is ``False``, the cast is performed in place and this array
      is returned instead.

      :param dtype: Heat type to which the array is cast
      :type dtype: datatype
      :param copy: By default the operation returns a copy of this array. If copy is set to
                   ``False`` the cast is performed in-place and this array is returned
      :type copy: bool, optional

   .. method:: balance_() -> DNDarray

      Function for balancing a :class:`DNDarray` between all nodes. To determine if this is needed use the :func:`is_balanced()` function.
      If the ``DNDarray`` is already balanced, this function does nothing. This function modifies
      the ``DNDarray`` itself and does not return anything.

      .. rubric:: Examples

      >>> a = ht.zeros((10, 2), split=0)
      >>> a[:, 0] = ht.arange(10)
      >>> b = a[3:]
      >>> b
      [0/2] tensor([[3., 0.]])
      [1/2] tensor([[4., 0.],
                    [5., 0.],
                    [6., 0.]])
      [2/2] tensor([[7., 0.],
                    [8., 0.],
                    [9., 0.]])
      >>> print(b.gshape, b.lshape)
      [0/2] (7, 2) (1, 2)
      [1/2] (7, 2) (3, 2)
      [2/2] (7, 2) (3, 2)
      >>> b.balance_()
      >>> b
      [0/2] tensor([[3., 0.],
                    [4., 0.],
                    [5., 0.]])
      [1/2] tensor([[6., 0.],
                    [7., 0.]])
      [2/2] tensor([[8., 0.],
                    [9., 0.]])
      >>> print(b.gshape, b.lshape)
      [0/2] (7, 2) (3, 2)
      [1/2] (7, 2) (2, 2)
      [2/2] (7, 2) (2, 2)

   .. method:: __bool__() -> bool

      Boolean scalar casting.

   .. method:: __cast(cast_function) -> Union[float, int]

      Implements a generic cast function for ``DNDarray`` objects.

      :param cast_function: The actual cast function, e.g. ``float`` or ``int``
      :type cast_function: function

      :raises TypeError: If the ``DNDarray`` object cannot be converted into a scalar.

   .. method:: collect_(target_rank: Optional[int] = 0) -> None

      Collects a distributed ``DNDarray`` on one MPI rank, chosen by the ``target_rank`` argument.
      It is a specific case of the ``redistribute_`` method.

      :param target_rank: The rank to which the DNDarray will be collected. Default: 0.
      :type target_rank: int, optional

      :raises TypeError: If the target rank is not an integer.
      :raises ValueError: If the target rank is out of bounds.

      .. rubric:: Examples

      >>> st = ht.ones((50, 81, 67), split=2)
      >>> print(st.lshape)
      [0/2] (50, 81, 23)
      [1/2] (50, 81, 22)
      [2/2] (50, 81, 22)
      >>> st.collect_()
      >>> print(st.lshape)
      [0/2] (50, 81, 67)
      [1/2] (50, 81, 0)
      [2/2] (50, 81, 0)
      >>> st.collect_(1)
      >>> print(st.lshape)
      [0/2] (50, 81, 0)
      [1/2] (50, 81, 67)
      [2/2] (50, 81, 0)

   .. method:: __complex__() -> DNDarray

      Complex scalar casting.

   .. 
method:: counts_displs() -> Tuple[Tuple[int], Tuple[int]]

      Returns actual counts (number of items per process) and displacements (offsets) of the
      DNDarray. Does not assume load balance.

   .. method:: cpu() -> DNDarray

      Returns a copy of this object in main memory. If this object is already in main memory, then
      no copy is performed and the original object is returned.

   .. method:: create_lshape_map(force_check: bool = False) -> torch.Tensor

      Generate a 'map' of the lshapes of the data on all processes.
      Units are ``(process rank, lshape)``

      :param force_check: If False (default) and the lshape map has already been created, use the
                          previous result. Otherwise, create the lshape_map
      :type force_check: bool, optional

   .. method:: create_partition_interface()

      Create a partition interface in line with the DPPY proposal. This is subject to change.
      The intention is to facilitate the usage of a general format for referencing distributed
      datasets.

      An example of the output and shape is shown below::

         __partitioned__ = {
             'shape': (27, 3, 2),
             'partition_tiling': (4, 1, 1),
             'partitions': {
                 (0, 0, 0): {
                     'start': (0, 0, 0),
                     'shape': (7, 3, 2),
                     'data': tensor([...], dtype=torch.int32),
                     'location': [0],
                     'dtype': torch.int32,
                     'device': 'cpu'
                 },
                 (1, 0, 0): {
                     'start': (7, 0, 0),
                     'shape': (7, 3, 2),
                     'data': None,
                     'location': [1],
                     'dtype': torch.int32,
                     'device': 'cpu'
                 },
                 (2, 0, 0): {
                     'start': (14, 0, 0),
                     'shape': (7, 3, 2),
                     'data': None,
                     'location': [2],
                     'dtype': torch.int32,
                     'device': 'cpu'
                 },
                 (3, 0, 0): {
                     'start': (21, 0, 0),
                     'shape': (6, 3, 2),
                     'data': None,
                     'location': [3],
                     'dtype': torch.int32,
                     'device': 'cpu'
                 }
             },
             'locals': [(rank, 0, 0)],
             'get': lambda x: x,
         }

      :rtype: dictionary containing the partition interface as shown above.

   .. method:: __float__() -> DNDarray

      Float scalar casting.

      .. seealso:: :func:`~heat.core.manipulations.flatten`

   .. method:: fill_diagonal(value: float) -> DNDarray

      Fill the main diagonal of a 2D :class:`DNDarray`.
      This function modifies the input tensor in-place, and returns the input array.

      :param value: The value to be placed in the ``DNDarray``'s main diagonal
      :type value: float

   .. method:: __getitem__(key: Union[int, Tuple[int, Ellipsis], List[int, Ellipsis]]) -> DNDarray

      Global getter function for DNDarrays.
      Returns a new DNDarray composed of the elements of the original tensor selected by the
      indices given. This does *NOT* redistribute or rebalance the resulting tensor. If the
      selection of values is unbalanced then the resultant tensor is also unbalanced!
      To redistribute the ``DNDarray`` use :func:`balance()` (issue #187)

      :param key: Indices to get from the tensor.
      :type key: int, slice, Tuple[int,...], List[int,...]

      .. rubric:: Examples

      >>> a = ht.arange(10, split=0)
      (1/2) >>> tensor([0, 1, 2, 3, 4], dtype=torch.int32)
      (2/2) >>> tensor([5, 6, 7, 8, 9], dtype=torch.int32)
      >>> a[1:6]
      (1/2) >>> tensor([1, 2, 3, 4], dtype=torch.int32)
      (2/2) >>> tensor([5], dtype=torch.int32)
      >>> a = ht.zeros((4, 5), split=0)
      (1/2) >>> tensor([[0., 0., 0., 0., 0.],
                        [0., 0., 0., 0., 0.]])
      (2/2) >>> tensor([[0., 0., 0., 0., 0.],
                        [0., 0., 0., 0., 0.]])
      >>> a[1:4, 1]
      (1/2) >>> tensor([0.])
      (2/2) >>> tensor([0., 0.])

   .. method:: gpu() -> DNDarray

      Returns a copy of this object in GPU memory. If this object is already in GPU memory, then
      no copy is performed and the original object is returned.

   .. method:: __int__() -> DNDarray

      Integer scalar casting.

   .. method:: is_balanced(force_check: bool = False) -> bool

      Determine if ``self`` is distributed evenly (or as evenly as possible) across all processes.
      This is equivalent to returning ``self.balanced``. If no information is available
      (``self.balanced is None``), the balanced status will be assessed via collective
      communication.

      :param force_check: If True, the balanced status of the ``DNDarray`` will be assessed via
                          collective communication in any case.
      :type force_check: bool, optional

   .. 
method:: is_distributed() -> bool

      Determines whether the data of this ``DNDarray`` is distributed across multiple processes.

   .. method:: __key_is_singular(key: any, axis: int, self_proxy: torch.Tensor) -> bool

   .. method:: __key_adds_dimension(key: any, axis: int, self_proxy: torch.Tensor) -> bool

   .. method:: item()

      Returns the only element of a 1-element :class:`DNDarray`.
      Mirror of the PyTorch command by the same name. If the ``DNDarray`` has more than one
      element, a ``ValueError`` is raised (by PyTorch).

      .. rubric:: Examples

      >>> import heat as ht
      >>> x = ht.zeros((1))
      >>> x.item()
      0.0

   .. method:: __len__() -> int

      The length of the ``DNDarray``, i.e. the number of items in the first dimension.

   .. method:: numpy() -> numpy.array

      Returns a copy of the :class:`DNDarray` as numpy ndarray. If the ``DNDarray`` resides on the
      GPU, the underlying data will be copied to the CPU first.

      If the ``DNDarray`` is distributed, an MPI Allgather operation will be performed before
      converting to np.ndarray, i.e. each MPI process will end up holding a copy of the entire
      array in memory. Make sure process memory is sufficient!

      .. rubric:: Examples

      >>> import heat as ht
      >>> T1 = ht.random.randn(10, 8)
      >>> T1.numpy()

   .. method:: _repr_pretty_(p, cycle)

      Pretty print for IPython.

   .. method:: __repr__() -> str

      Returns a printable representation of the passed DNDarray, targeting developers.

   .. method:: ravel()

      Flattens the ``DNDarray``.

      .. seealso:: :func:`~heat.core.manipulations.ravel`

      .. rubric:: Examples

      >>> a = ht.ones((2, 3), split=0)
      >>> b = a.ravel()
      >>> a[0, 0] = 4
      >>> b
      DNDarray([4., 1., 1., 1., 1., 1.], dtype=ht.float32, device=cpu:0, split=0)

   .. method:: redistribute_(lshape_map: Optional[torch.Tensor] = None, target_map: Optional[torch.Tensor] = None)

      Redistributes the data of the :class:`DNDarray` *along the split axis* to match the given
      target map. This function does not modify the non-split dimensions of the ``DNDarray``.
      This is an abstraction and extension of the balance function.
      :param lshape_map: The current lshape of processes.
                         Units are ``[rank, lshape]``.
      :type lshape_map: torch.Tensor, optional
      :param target_map: The desired distribution across the processes.
                         Units are ``[rank, target lshape]``.
                         Note: the only important parts of the target map are the values along the
                         split axis; values which are not along this axis are there to mimic the
                         shape of the ``lshape_map``.
      :type target_map: torch.Tensor, optional

      .. rubric:: Examples

      >>> st = ht.ones((50, 81, 67), split=2)
      >>> target_map = torch.zeros((st.comm.size, 3), dtype=torch.int32)
      >>> target_map[0, 2] = 67
      >>> print(target_map)
      [0/2] tensor([[ 0,  0, 67],
      [0/2]         [ 0,  0,  0],
      [0/2]         [ 0,  0,  0]], dtype=torch.int32)
      [1/2] tensor([[ 0,  0, 67],
      [1/2]         [ 0,  0,  0],
      [1/2]         [ 0,  0,  0]], dtype=torch.int32)
      [2/2] tensor([[ 0,  0, 67],
      [2/2]         [ 0,  0,  0],
      [2/2]         [ 0,  0,  0]], dtype=torch.int32)
      >>> print(st.lshape)
      [0/2] (50, 81, 23)
      [1/2] (50, 81, 22)
      [2/2] (50, 81, 22)
      >>> st.redistribute_(target_map=target_map)
      >>> print(st.lshape)
      [0/2] (50, 81, 67)
      [1/2] (50, 81, 0)
      [2/2] (50, 81, 0)

   .. method:: __redistribute_shuffle(snd_pr: Union[int, torch.Tensor], send_amt: Union[int, torch.Tensor], rcv_pr: Union[int, torch.Tensor], snd_dtype: torch.dtype)

      Abstracts the operation used during redistribution to shuffle data between processes along
      the split axis

      :param snd_pr: Sending process
      :type snd_pr: int or torch.Tensor
      :param send_amt: Amount of data to be sent by the sending process
      :type send_amt: int or torch.Tensor
      :param rcv_pr: Receiving process
      :type rcv_pr: int or torch.Tensor
      :param snd_dtype: Torch type of the data in question
      :type snd_dtype: torch.dtype

   .. method:: resplit_(axis: int = None)

      In-place option for resplitting a :class:`DNDarray`.

      :param axis: The new split axis. ``None`` denotes gathering; an int sets the new split axis
      :type axis: int or None

      .. rubric:: Examples

      >>> a = ht.zeros(
      ...     (
      ...         4,
      ...         5,
      ...     ),
      ...     split=0,
      ... 
)
      >>> a.lshape
      (0/2) (2, 5)
      (1/2) (2, 5)
      >>> a.resplit_(None)
      >>> a.split
      None
      >>> a.lshape
      (0/2) (4, 5)
      (1/2) (4, 5)
      >>> a = ht.zeros((4, 5), split=0)
      >>> a.lshape
      (0/2) (2, 5)
      (1/2) (2, 5)
      >>> a.resplit_(1)
      >>> a.split
      1
      >>> a.lshape
      (0/2) (4, 3)
      (1/2) (4, 2)

   .. method:: __setitem__(key: Union[int, Tuple[int, Ellipsis], List[int, Ellipsis]], value: Union[float, DNDarray, torch.Tensor])

      Global item setter

      :param key: Index/indices to be set
      :type key: Union[int, Tuple[int,...], List[int,...]]
      :param value: Value to be set to the specified positions in the DNDarray (self)
      :type value: Union[float, DNDarray,torch.Tensor]

      .. rubric:: Notes

      If a ``DNDarray`` is given as the value to be set, then the split axes are assumed to be
      equal. If they are not, PyTorch raises an error when the values are set on the local array.

      .. rubric:: Examples

      >>> a = ht.zeros((4, 5), split=0)
      (1/2) >>> tensor([[0., 0., 0., 0., 0.],
                        [0., 0., 0., 0., 0.]])
      (2/2) >>> tensor([[0., 0., 0., 0., 0.],
                        [0., 0., 0., 0., 0.]])
      >>> a[1:4, 1] = 1
      >>> a
      (1/2) >>> tensor([[0., 0., 0., 0., 0.],
                        [0., 1., 0., 0., 0.]])
      (2/2) >>> tensor([[0., 1., 0., 0., 0.],
                        [0., 1., 0., 0., 0.]])

   .. method:: __setter(key: Union[int, Tuple[int, Ellipsis], List[int, Ellipsis]], value: Union[float, DNDarray, torch.Tensor])

      Utility function for checking ``value`` and forwarding to :func:`__setitem__`

      :raises NotImplementedError: If the type of ``value`` is not supported

   .. method:: __str__() -> str

      Computes a string representation of the passed ``DNDarray``.

   .. method:: tolist(keepsplit: bool = False) -> List

      Return a copy of the local array data as a (nested) Python list. For scalars, a standard
      Python number is returned.

      :param keepsplit: Whether the list should be returned locally or globally.
      :type keepsplit: bool

      .. 
rubric:: Examples

      >>> a = ht.array([[0, 1], [2, 3]])
      >>> a.tolist()
      [[0, 1], [2, 3]]

      >>> a = ht.array([[0, 1], [2, 3]], split=0)
      >>> a.tolist()
      [[0, 1], [2, 3]]

      >>> a = ht.array([[0, 1], [2, 3]], split=1)
      >>> a.tolist(keepsplit=True)
      (1/2) [[0], [2]]
      (2/2) [[1], [3]]

   .. method:: __torch_function__(func, types, args=(), kwargs=None)

      Supports PyTorch's dispatch mechanism.

   .. method:: __torch_proxy__() -> torch.Tensor

      Return a 1-element ``torch.Tensor`` strided as the global ``self`` shape.
      Used internally for sanitation purposes.

   .. method:: __xitem_get_key_start_stop(rank: int, actives: list, key_st: int, key_sp: int, step: int, ends: torch.Tensor, og_key_st: int) -> Tuple[int, int]


.. py:class:: Laplacian(similarity: Callable, weighted: bool = True, definition: str = 'norm_sym', mode: str = 'fully_connected', threshold_key: str = 'upper', threshold_value: float = 1.0, neighbours: int = 10)

   Graph Laplacian from a dataset

   :param similarity: Metric function that defines similarity between vertices. Should accept a
                      data matrix :math:`n \times f` as input and return an :math:`n\times n`
                      similarity matrix. Additional required parameters can be passed via a lambda function.
   :type similarity: Callable
   :param definition: Type of Laplacian

                      - ``'simple'``: Laplacian matrix for simple graphs :math:`L = D - A`
                      - ``'norm_sym'``: Symmetric normalized Laplacian :math:`L^{sym} = I - D^{-1/2} A D^{-1/2}`
                      - ``'norm_rw'``: Random walk normalized Laplacian :math:`L^{rw} = D^{-1} L = I - D^{-1} A`
   :type definition: str
   :param mode: How to calculate adjacency from the similarity matrix

                - ``'fully_connected'`` is fully-connected, so :math:`A = S`
                - ``'eNeighbour'`` is the epsilon neighbourhood, with :math:`A_{ij} = 0` if
                  :math:`S_{ij} > upper` or :math:`S_{ij} < lower`; for eNeighbour an upper or
                  lower boundary needs to be set
   :type mode: str
   :param threshold_key: ``'upper'`` or ``'lower'``, defining the type of threshold for the
                         epsilon-neighborhood
   :type threshold_key: str
   :param threshold_value: Boundary value for the epsilon-neighborhood
   :type threshold_value: float
   :param neighbours: Number of nearest neighbors to be considered for adjacency definition.
                      Currently not implemented
   :type neighbours: int

   .. attribute:: similarity_metric

   .. attribute:: weighted
      :annotation: = True

   .. attribute:: neighbours
      :annotation: = 10

   .. role:: raw-html(raw)
       :format: html

   .. method:: _normalized_symmetric_L(A: heat.core.dndarray.DNDarray) -> heat.core.dndarray.DNDarray

      Helper function to calculate the normalized symmetric Laplacian

      .. math:: L^{sym} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}

      :param A: The adjacency matrix of the graph
      :type A: DNDarray

   .. method:: _simple_L(A: heat.core.dndarray.DNDarray)

      Helper function to calculate the simple graph Laplacian

      .. math:: L = D - A

      :param A: The adjacency matrix of the graph
      :type A: DNDarray

   .. method:: construct(X: heat.core.dndarray.DNDarray) -> heat.core.dndarray.DNDarray

      Callable to get the Laplacian matrix from the dataset ``X`` according to the specified
      Laplacian

      :param X: The data matrix, Shape = (n_samples, n_features)
      :type X: DNDarray
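
The three Laplacian definitions listed for the ``definition`` parameter can be checked by hand on a tiny graph. The following is a minimal, single-process sketch in plain Python, not Heat's implementation: the helper names (``degrees``, ``simple_laplacian``, ``norm_sym_laplacian``, ``norm_rw_laplacian``) are hypothetical and exist only for this illustration, and the adjacency matrix is assumed to be symmetric with strictly positive degrees.

```python
import math


def degrees(A):
    """Row sums of the adjacency matrix, i.e. the diagonal of D."""
    return [sum(row) for row in A]


def simple_laplacian(A):
    """Simple graph Laplacian: L = D - A."""
    d, n = degrees(A), len(A)
    return [[(d[i] if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]


def norm_sym_laplacian(A):
    """Symmetric normalized Laplacian: L_sym = I - D^{-1/2} A D^{-1/2}.

    Assumes every vertex has a positive degree.
    """
    d, n = degrees(A), len(A)
    return [
        [(1.0 if i == j else 0.0) - A[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
        for i in range(n)
    ]


def norm_rw_laplacian(A):
    """Random walk normalized Laplacian: L_rw = I - D^{-1} A.

    Assumes every vertex has a positive degree.
    """
    d, n = degrees(A), len(A)
    return [[(1.0 if i == j else 0.0) - A[i][j] / d[i] for j in range(n)] for i in range(n)]


# Illustrative input: adjacency matrix of a path graph with three vertices (0 - 1 - 2).
A = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

L = simple_laplacian(A)
L_sym = norm_sym_laplacian(A)
L_rw = norm_rw_laplacian(A)
```

In this sketch every row of ``L`` and ``L_rw`` sums to zero, and ``L_sym`` stays symmetric whenever ``A`` is symmetric, which is a quick sanity check for any of the three definitions.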