Bamboost
bamboost.core.hdf5.file

This module provides an enhanced interface for working with HDF5 files. Its main goal is to provide automatic file management with heavy caching to limit file access to a minimum.

Key features:

  • Lazy h5py.File wrapper: HDF5File postpones file opening until necessary.
  • Cached file map: FileMap caches all groups and datasets in the file. A singleton pattern ensures that only one file map exists per file, so later uses of the file map (and of the file in general) can skip file access.
  • Automatic file opening: the file is opened when it is needed, and a context stack tracks when it is no longer needed.
  • Queued operations: operations that must be executed on the root process only (such as attribute updates) can be bundled. The queued operations are executed once the serial context is closed. If the file is not in use when an operation is added to the queue, the operation is executed immediately.
  • Utility classes for working with HDF5 paths.
  • Decorators that handle file opening and file mutability checks (to avoid unintended writes).
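The context-stack idea from the feature list can be illustrated with a small sketch. This is not bamboost's implementation: LazyFile and open_count are hypothetical names, and no real file is touched. It only shows how a counter of active with blocks lets a wrapper open the file on first use and close it when the last user leaves.

```python
from contextlib import contextmanager


class LazyFile:
    """Hypothetical stand-in for a lazily opened file (performs no real I/O)."""

    def __init__(self, filename):
        self.filename = filename
        self._context_stack = 0  # number of currently active `with` blocks
        self.is_open = False
        self.open_count = 0  # how many times the file was "really" opened

    @contextmanager
    def open(self):
        if self._context_stack == 0:
            # First user: a real wrapper would open the file here.
            self.is_open = True
            self.open_count += 1
        self._context_stack += 1
        try:
            yield self
        finally:
            self._context_stack -= 1
            if self._context_stack == 0:
                # Last user left: a real wrapper would close the file here.
                self.is_open = False


f = LazyFile("data.h5")
with f.open():
    with f.open():  # nested use does not reopen the file
        pass
```

Nested with blocks share a single open file, which is what keeps file access to a minimum.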

Decorators

  • mutable_only: Ensures that a method can only be executed if the file is mutable.
  • with_file_open: Opens and closes the file automatically when executing a method.
  • add_to_file_queue: Adds a method call to the process queue instead of executing it immediately.

Attributes

  • _T_HasFile=TypeVar('_T_HasFile', bound=HasFile)
  • log:Logger=BAMBOOST_LOGGER.getChild('hdf5')

    Logger instance for this module.

  • HDF_MPI_ACTIVE='mpio' in h5py.h5py.registered_drivers() and h5py.h5py.get_config().mpi and bamboost.mpi.MPI_ON

    Indicates whether MPI support is available for HDF5.

  • _T_H5Object=TypeVar('_T_H5Object', bound=H5Object)

Functions

mutable_only(method) -> Callable[Concatenate[_T_HasFile, _P], _T]

Decorator to raise an error if the file is not mutable.

Arguments:
  • method:typing.Callable[typing_extensions.Concatenate[_T_HasFile, bamboost._typing._P], bamboost._typing._T]
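
A mutability guard of this kind can be sketched as below. The Store class is hypothetical, and the choice of PermissionError is an assumption; the exception that bamboost actually raises is not stated here.

```python
from functools import wraps


def mutable_only(method):
    """Sketch: refuse to run `method` unless `self.mutable` is true."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self.mutable:
            # Assumption: the real decorator may raise a different error type.
            raise PermissionError(f"{method.__name__} requires a mutable object")
        return method(self, *args, **kwargs)

    return wrapper


class Store:
    """Hypothetical class with a `mutable` flag, used only for illustration."""

    def __init__(self, mutable=False):
        self.mutable = mutable
        self.data = {}

    @mutable_only
    def put(self, key, value):
        self.data[key] = value
```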
with_file_open(mode=FileMode.READ, driver=None) -> Callable[[Callable[Concatenate[_T_HasFile, _P], _T]], Callable[Concatenate[_T_HasFile, _P], _T]]

Decorator that opens the file before a method runs and closes it afterwards. The method must belong to a class with a file attribute (self._file).

Arguments:
  • mode:FileMode=bamboost.core.hdf5.file.FileMode.READ

    The mode to open the file with. Defaults to FileMode.READ.

  • driver:typing.Optional[typing.Literal['mpio']]=None

    The driver to use for file I/O. If "mpio" and MPI is active, it will use MPI I/O. Defaults to None.
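
The decorator-factory shape described above can be sketched as follows. FakeFile and Reader are hypothetical stand-ins that do no real I/O; only the self._file convention and the mode and driver parameters come from the documentation.

```python
from contextlib import contextmanager
from functools import wraps


class FakeFile:
    """Hypothetical file object that records how it was opened."""

    def __init__(self):
        self.is_open = False
        self.modes = []

    @contextmanager
    def open(self, mode="r", driver=None):
        self.is_open = True
        self.modes.append(mode)
        try:
            yield self
        finally:
            self.is_open = False


def with_file_open(mode="r", driver=None):
    """Sketch: run the decorated method inside `self._file.open(...)`."""

    def decorator(method):
        @wraps(method)
        def wrapper(self, *args, **kwargs):
            with self._file.open(mode, driver):
                return method(self, *args, **kwargs)

        return wrapper

    return decorator


class Reader:
    """Hypothetical class that follows the `self._file` convention."""

    def __init__(self, file):
        self._file = file

    @with_file_open("r")
    def read_state(self):
        # The file is guaranteed to be open while this body runs.
        return self._file.is_open
```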

add_to_file_queue(method) -> Callable[Concatenate[_T_H5Object, _P], None]

Decorator to add a method call to the single process queue of the file object instead of executing it immediately.

Arguments:
  • method:typing.Callable[typing_extensions.Concatenate[_T_H5Object, bamboost._typing._P], None]
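
One way such a decorator could defer a call is sketched below. FakeQueue, FakeFile and AttrsWriter are hypothetical; the sketch assumes the queue is reachable as self._file.single_process_queue and exposes add_instruction, as listed elsewhere in this reference, and it always defers rather than checking whether the call could run immediately.

```python
from functools import partial, wraps


class FakeQueue:
    """Hypothetical queue that simply stores instructions."""

    def __init__(self):
        self.pending = []

    def add_instruction(self, instruction):
        self.pending.append(instruction)


class FakeFile:
    """Hypothetical file object that owns a queue."""

    def __init__(self):
        self.single_process_queue = FakeQueue()


def add_to_file_queue(method):
    """Sketch: enqueue the call instead of running it; always returns None."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        # Bind the arguments now; the call itself happens when the queue runs.
        self._file.single_process_queue.add_instruction(
            partial(method, self, *args, **kwargs)
        )

    return wrapper


class AttrsWriter:
    """Hypothetical object whose writes are deferred."""

    def __init__(self, file):
        self._file = file
        self.attrs = {}

    @add_to_file_queue
    def set_attr(self, key, value):
        self.attrs[key] = value
```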

Classes

HasFile

Attributes:

FileMode

Attributes:
  • READ='r'
  • READ_WRITE='r+'
  • APPEND='a'
  • WRITE='w'
  • WRITE_FAIL='w-'
  • WRITE_CREATE='x'
  • __hirarchy__={'r': 0, 'r+': 1, 'a': 1, 'w': 1, 'w-': 1, 'x': 1}
FileMode.__eq__(self, other) -> bool
Arguments:
  • other
FileMode.__lt__(self, other) -> bool
Arguments:
  • other
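
The ordering implied by the hierarchy mapping can be illustrated with a reduced sketch. Mode and _RANK are hypothetical names, only four of the modes are included, and only the less-than comparison is shown; how the real __eq__ behaves is not assumed here.

```python
from enum import Enum

# Same values as the documented hierarchy: read-only ranks below all write modes.
_RANK = {"r": 0, "r+": 1, "a": 1, "w": 1, "w-": 1, "x": 1}


class Mode(Enum):
    """Hypothetical, reduced version of a file-mode enum."""

    READ = "r"
    READ_WRITE = "r+"
    APPEND = "a"
    WRITE = "w"

    def __lt__(self, other):
        # Mode(other) accepts both Mode members and raw strings such as "r+".
        return _RANK[self.value] < _RANK[Mode(other).value]


needs_upgrade = Mode.READ < Mode.APPEND  # read-only ranks below append
```

A comparison like this could be used to decide whether a file that is already open in one mode is sufficient for a requested mode, although that use is an assumption.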

H5Object

H5Object(self, file) -> None
Arguments:
Attributes:
  • _file:HDF5File[bamboost._typing._MT]=file
  • _comm=Communicator()
  • mutable:bool
Bases
ElligibleForPlugin1
H5Object.open(self, mode='r', driver=None) -> HDF5File[_MT]

Use this as a context manager in a with statement. It keeps the file open so that you can directly access or edit something in the HDF5 file of this simulation.

Arguments:
  • mode:FileMode | str='r'

    file mode (see h5py docs)

  • driver:typing.Optional[typing.Literal['mpio']]=None

    file driver (see h5py docs)

H5Object.post_write_instruction(self, instruction) -> None
Arguments:
  • instruction:typing.Callable[[], None]
H5Object.suspend_immediate_write(self) -> Generator[None, None, None]

Context manager to suspend immediate write operations. Patches self._file.available_for_single_process_write to return False.

SingleProcessQueue

SingleProcessQueue(self, file)

A queue to defer execution of write operations that need to be executed on the root only. Only relevant for parallelized code.

This class is a deque of instructions that are executed in order when the file is available for writing (i.e., not open with MPI I/O, or closed). Instructions are appended on the right and popped from the left.

This class uses the RootProcessMeta metaclass to ensure that all methods are only executed on the root process.

Arguments:
Attributes:
  • _comm=Communicator()
  • _file=file
SingleProcessQueue.add_instruction(self, instruction) -> None
Arguments:
  • instruction:typing.Callable[[], None]
SingleProcessQueue.apply_instruction(self, instruction) -> None
Arguments:
  • instruction:typing.Callable[[], None]

Applies all write instructions in the queue.
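
The deque behaviour described for this class can be sketched without MPI. InstructionQueue and can_write_now are hypothetical names; the root-only execution provided by the metaclass is left out, and the availability check is reduced to a callable supplied by the caller.

```python
from collections import deque


class InstructionQueue:
    """Sketch of a FIFO queue of deferred write instructions."""

    def __init__(self, can_write_now):
        self._can_write_now = can_write_now  # callable returning a bool
        self._queue = deque()

    def add_instruction(self, instruction):
        if self._can_write_now():
            # File is available: run immediately instead of queueing.
            instruction()
        else:
            # Append on the right; instructions are popped from the left later.
            self._queue.append(instruction)

    def apply(self):
        # Run everything that was deferred, in insertion order.
        while self._queue:
            self._queue.popleft()()


state = {"busy": True}
log = []
queue = InstructionQueue(lambda: not state["busy"])
queue.add_instruction(lambda: log.append("a"))  # deferred while busy
queue.add_instruction(lambda: log.append("b"))  # deferred while busy
queue.apply()  # runs "a" then "b"
```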

WriteInstruction

Abstract base class for write instructions. Not useful currently, but could be extended in the future (e.g. provide logging).

HDF5File

HDF5File(self, file, comm=None, mutable=False)

Lazy h5py.File wrapper with deferred process execution and file map caching.

Arguments:
  • file:StrPath
  • comm:typing.Optional[Comm]=None
  • mutable:bool=False
Attributes:
  • _filename:str=file.as_posix() if isinstance(file, pathlib.Path) else file
  • _comm=Communicator()
  • _context_stack:int=0
  • mutable:bool=mutable
  • file_map:FileMap=FileMap(self)
  • _is_open_on_root_only:bool=False
  • _attrs_dict_instances:dict[str, 'AttrsDict[_MT]']={}
  • _path=Path(self._filename).absolute()
  • is_open:bool
  • single_process_queue:SingleProcessQueue

    The single process queue of this file object. See SingleProcessQueue for details.

  • root:Group[bamboost._typing._MT]

    Returns the root group of the file. Same as Group("/", file)

HDF5File.__repr__(self) -> str
HDF5File.open(self, mode='r', driver=None) -> HDF5File

Context manager that opens the HDF5 file with the specified mode and driver.

This method attempts to open the file, and if it's locked, it will retry until the file becomes available.

Arguments:
  • mode:typing.Union[FileMode, str]='r'

    The mode to open the file with. Defaults to "r" (read-only).

  • driver=None

    The driver to use for file I/O. If "mpio" and MPI is active, it will use MPI I/O. Defaults to None.

Returns
HDF5File: The opened HDF5 file object.
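
The retry-on-lock behaviour can be pictured with a generic loop. This is a sketch under assumptions: open_with_retry, max_attempts and delay are hypothetical, the loop is bounded although the description above suggests retrying until the file becomes available, and a locked file is assumed to surface as BlockingIOError.

```python
import time


def open_with_retry(opener, max_attempts=5, delay=0.01):
    """Call `opener()` until it succeeds or the attempts are used up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return opener()
        except BlockingIOError:
            # Assumption: a locked file raises BlockingIOError.
            if attempt == max_attempts:
                raise
            time.sleep(delay)  # wait briefly before trying again
```

Passing the opener as a callable keeps the retry logic independent of how the file is actually opened.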
HDF5File.delete_object(self, path) -> None

Deletes an object in the file. In addition to deleting the object, invoking this method also removes the object from the file map.

Arguments:
  • path:typing.Union[HDF5Path, str]

    The path to the object to delete.

Whether single process write instructions can be executed immediately.

HDF5File._create_file(self) -> HDF5File[Mutable]

Opens and closes the file to create it while doing nothing to it.

HDF5File._try_open_repeat(self, mode, driver=None) -> Self
Arguments:
  • driver:typing.Optional[typing.Literal['mpio']]=None