:mod:`heat.core.io`
===================

.. py:module:: heat.core.io

.. autoapi-nested-parse::

   Enables parallel I/O with data on disk.


Module Contents
---------------

.. function:: supports_netcdf() -> bool

   Returns ``True`` if Heat supports reading from and writing to netCDF4 files, ``False`` otherwise.


.. function:: supports_hdf5() -> bool

   Returns ``True`` if Heat supports reading from and writing to HDF5 files, ``False`` otherwise.


.. function:: load(path: str, *args: Optional[List[object]], **kwargs: Optional[Dict[str, object]]) -> heat.core.dndarray.DNDarray

   Attempts to load data from a file stored on disk. The file format is auto-detected from the file extension.
   CSV files are always supported; HDF5 and netCDF4 are additionally available if the corresponding libraries are installed.

   :param path: Path to the file to be read.
   :type path: str
   :param args: Additional positional options passed on to the format-specific loading function.
   :type args: list, optional
   :param kwargs: Additional keyword options passed on to the format-specific loading function.
   :type kwargs: dict, optional

   :raises ValueError: If the file extension is not understood or known.
   :raises RuntimeError: If the optional dependency for a file extension is not available.

   .. rubric:: Examples

   >>> ht.load("data.h5", dataset="DATA")
   DNDarray([ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981], dtype=ht.float32, device=cpu:0, split=None)
   >>> ht.load("data.nc", variable="DATA")
   DNDarray([ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981], dtype=ht.float32, device=cpu:0, split=None)
   >>> ht.load("my_data.zarr", variable="RECEIVER_1/DATA")
   DNDarray([ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981], dtype=ht.float32, device=cpu:0, split=0)
   >>> ht.load("my_data.zarr", variable="RECEIVER_*/DATA")
   DNDarray([[ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981],
             [ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981],
             [ 1.0000, 2.7183, 7.3891, 20.0855, 54.5981]], dtype=ht.float32, device=cpu:0, split=0)

   .. seealso::

      :func:`load_csv`
          Loads data from a CSV file.

      :func:`load_csv_from_folder`
          Loads multiple .csv files into a single DNDarray.

      :func:`load_hdf5`
          Loads data from an HDF5 file.

      :func:`load_netcdf`
          Loads data from a netCDF4 file.

      :func:`load_npy_from_path`
          Loads multiple .npy files into a single DNDarray.

      :func:`load_zarr`
          Loads data in Zarr format into a DNDarray.


.. function:: load_csv(path: str, header_lines: int = 0, sep: str = ',', dtype: heat.core.types.datatype = types.float32, encoding: str = 'utf-8', split: Optional[int] = None, device: Optional[str] = None, comm: Optional[heat.core.communication.Communication] = None) -> heat.core.dndarray.DNDarray

   Loads data from a CSV file. The data will be distributed along axis 0.

   :param path: Path to the CSV file to be read.
   :type path: str
   :param header_lines: The number of lines at the beginning of the file that should not be considered as data.
   :type header_lines: int, optional
   :param sep: The single ``char`` or ``str`` that separates the values in each row.
   :type sep: str, optional
   :param dtype: Data type of the resulting array.
   :type dtype: datatype, optional
   :param encoding: The encoding used to interpret the lines of the CSV file as strings.
   :type encoding: str, optional
   :param split: The axis along which the resulting array should be split. The default is ``None``, which means each node will have the full array.
   :type split: int or None, optional
   :param device: The device id on which to place the data. Defaults to the globally set default device.
   :type device: str, optional
   :param comm: The communication to use for the data distribution. Defaults to the global default.
   :type comm: Communication, optional

   :raises TypeError: If any of the input parameters is not of the correct type.

   .. rubric:: Examples

   >>> import heat as ht
   >>> a = ht.load_csv("data.csv", split=0)
   >>> a.shape
   [0/3] (150, 4)
   [1/3] (150, 4)
   [2/3] (150, 4)
   [3/3] (150, 4)
   >>> a.lshape
   [0/3] (38, 4)
   [1/3] (38, 4)
   [2/3] (37, 4)
   [3/3] (37, 4)
   >>> b = ht.load_csv("data.csv", header_lines=10, split=0)
   >>> b.shape
   [0/3] (140, 4)
   [1/3] (140, 4)
   [2/3] (140, 4)
   [3/3] (140, 4)
   >>> b.lshape
   [0/3] (35, 4)
   [1/3] (35, 4)
   [2/3] (35, 4)
   [3/3] (35, 4)


.. function:: save_csv(data: heat.core.dndarray.DNDarray, path: str, header_lines: Iterable[str] = None, sep: str = ',', decimals: int = -1, encoding: str = 'utf-8', comm: Optional[heat.core.communication.Communication] = None, truncate: bool = True)

   Saves data to a CSV file. Only 2D data is supported, with any split axis.

   :param data: The DNDarray to be saved to CSV.
   :type data: DNDarray
   :param path: The path of the output file, as a string.
   :type path: str
   :param header_lines: Optional iterable of strings to prepend at the beginning of the file. No pound sign or any other comment marker will be inserted.
   :type header_lines: Iterable[str]
   :param sep: The separator character used in this CSV.
   :type sep: str
   :param decimals: Number of digits after the decimal point.
   :type decimals: int
   :param encoding: The encoding to be used in this CSV.
   :type encoding: str
   :param comm: An optional object of type Communication to be used.
   :type comm: Optional[Communication]
   :param truncate: Whether to truncate an existing file before writing, i.e. fully overwrite it. The sane default is ``True``. If set to ``False``, an existing file that is longer than the new content is not shortened, which may leave garbage at its end.
   :type truncate: bool


.. function:: save(data: heat.core.dndarray.DNDarray, path: str, *args: Optional[List[object]], **kwargs: Optional[Dict[str, object]])

   Attempts to save data from a :class:`~heat.core.dndarray.DNDarray` to disk. The file format is auto-detected from the file extension.

   :param data: The array holding the data to be stored.
   :type data: DNDarray
   :param path: Path to the file to be stored.
   :type path: str
   :param args: Additional positional options passed on to the format-specific saving function.
   :type args: list, optional
   :param kwargs: Additional keyword options passed on to the format-specific saving function.
   :type kwargs: dict, optional

   :raises ValueError: If the file extension is not understood or known.
   :raises RuntimeError: If the optional dependency for a file extension is not available.

   .. rubric:: Examples

   >>> x = ht.arange(100, split=0)
   >>> ht.save(x, "data.h5", "DATA", mode="a")


.. function:: load_npy_from_path(path: str, dtype: heat.core.types.datatype = types.int32, split: int = 0, device: Optional[str] = None, comm: Optional[heat.core.communication.Communication] = None) -> heat.core.dndarray.DNDarray

   Loads multiple .npy files into a single DNDarray. The data will be concatenated along the split axis provided as input.

   :param path: Path to the directory in which the .npy files are located.
   :type path: str
   :param dtype: Data type of the resulting array.
   :type dtype: datatype, optional
   :param split: The axis along which the loaded arrays should be concatenated.
   :type split: int
   :param device: The device id on which to place the data. Defaults to the globally set default device.
   :type device: str, optional
   :param comm: The communication to use for the data distribution. The default is ``heat.MPI_WORLD``.
   :type comm: Communication, optional


.. function:: supports_zarr() -> bool

   Returns ``True`` if zarr is installed, ``False`` otherwise.
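
The extension-based auto-detection described for :func:`load` and :func:`save`, together with the ``supports_*`` checks, can be illustrated with a small standalone sketch. This is not Heat's implementation: the extension table, the optional module names, and the helpers ``has_module`` and ``detect_format`` are hypothetical and only mirror the documented behaviour (``ValueError`` for an unknown extension, ``RuntimeError`` for a missing optional dependency).

```python
import importlib.util
from pathlib import Path

# Hypothetical table: file extension -> (format name, optional module it needs).
# None means no optional dependency is required. The extensions and module
# names are assumptions made for this sketch, not taken from Heat.
_FORMATS = {
    ".csv": ("csv", None),
    ".h5": ("hdf5", "h5py"),
    ".hdf5": ("hdf5", "h5py"),
    ".nc": ("netcdf", "netCDF4"),
    ".zarr": ("zarr", "zarr"),
}


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def detect_format(path: str, is_available=has_module) -> str:
    """Map a path to a format name based on its file extension.

    Raises ValueError for an unknown extension and RuntimeError if the
    optional module required for the format is not installed.
    """
    extension = Path(path).suffix.lower()
    if extension not in _FORMATS:
        raise ValueError(f"Unsupported file extension {extension!r}")
    format_name, module = _FORMATS[extension]
    if module is not None and not is_available(module):
        raise RuntimeError(
            f"{format_name} support requires the optional module {module!r}"
        )
    return format_name


if __name__ == "__main__":
    # CSV needs no optional dependency, so this always succeeds.
    print(detect_format("data.csv"))
```

Passing the availability check in as the ``is_available`` argument is a choice made for this sketch so that both error paths can be exercised without installing or removing any libraries.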