datalad_next.iter_collections.tarfile

Report on the content of TAR archives

The main functionality is provided by the iter_tar() function.

class datalad_next.iter_collections.tarfile.TarfileItem(type: FileSystemItemType, name: str, size: int, mtime: float | None = None, mode: int | None = None, uid: int | None = None, gid: int | None = None, link_target: str | None = None, fp: IO | None = None)

Bases: FileSystemItem

link_target: str | None = None

Just as for name, a link target is also reported in POSIX format.

property link_target_path: PurePosixPath

Returns the link_target as a PurePosixPath instance

name: str

TAR uses POSIX paths as item identifiers. Not all POSIX paths can be represented on all (non-POSIX) file systems. Therefore, the item name is reported in POSIX form rather than in platform conventions.

property path: PurePosixPath

Returns the item name as a PurePosixPath instance
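To illustrate why names are kept in POSIX form, the following minimal sketch uses only the standard library pathlib module. The member name is a made-up example (it is not taken from any real archive), and the snippet does not call datalad_next at all; it only shows what a PurePosixPath built from such a name looks like, regardless of the platform the code runs on.

```python
from pathlib import PurePosixPath

# Hypothetical archive member name. The colon is legal in a POSIX path,
# but it is not allowed in file names on some non-POSIX file systems.
name = "data/raw/sub-01/scan:1.nii"

# Same conversion that the `path` property is documented to perform
path = PurePosixPath(name)

print(path.parts)   # individual path components
print(path.name)    # final component
print(path.parent)  # containing directory, still in POSIX form
```

Because PurePosixPath is a pure path class, no file system access takes place, so the name can be inspected even where it could not be created on disk.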

datalad_next.iter_collections.tarfile.iter_tar(path: Path, *, fp: bool = False) → Generator[TarfileItem, None, None]

Uses the standard library tarfile module to report on TAR archives

A TAR archive can represent more or less the full range of file system properties. Therefore, reporting on archive members is implemented similarly to iter_dir(). The iterator produces a TarfileItem instance for each archive member, with standard information on file system elements, such as size or mtime.

Parameters:
  • path (Path) -- Path of the TAR archive to report content for (iterate over).

  • fp (bool, optional) -- If True, each file-type item includes a file-like object to access the file's content. This file handle will be closed automatically when the next item is yielded or the function returns.

Yields:

TarfileItem -- The name attribute of an item is a str with the corresponding archive member name (in POSIX conventions).
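The reporting pattern described above, including the lifetime of the optional file handle, can be sketched with the standard library alone. This is not the datalad_next implementation: sketch_iter_tar(), SketchItem, and demo() are hypothetical names introduced only for this illustration, and the item carries fewer fields than TarfileItem. The sketch assumes an uncompressed archive on a seekable file.

```python
import io
import tarfile
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import IO, Generator, Optional


@dataclass
class SketchItem:
    """Hypothetical, reduced stand-in for TarfileItem."""
    name: str                 # member name, in POSIX conventions
    size: int
    mtime: float
    fp: Optional[IO] = None   # only set for regular files when requested


def sketch_iter_tar(path: Path, *, fp: bool = False) -> Generator[SketchItem, None, None]:
    """Yield one item per archive member (illustrative sketch only)."""
    with tarfile.open(path, "r") as tar:
        for member in tar:
            # Open a handle only for regular files, and only on request
            handle = tar.extractfile(member) if fp and member.isfile() else None
            try:
                yield SketchItem(member.name, member.size, member.mtime, handle)
            finally:
                # Runs when the consumer asks for the next item, or when the
                # generator is closed, so the handle never outlives its item
                if handle is not None:
                    handle.close()


def demo():
    """Build a one-file archive in a temporary directory and iterate over it."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "example.tar"
        payload = b"hello"
        with tarfile.open(archive, "w") as tar:
            info = tarfile.TarInfo("dir/hello.txt")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

        seen, handles = [], []
        for item in sketch_iter_tar(archive, fp=True):
            # The handle is only usable inside the loop body
            seen.append((item.name, item.size, item.fp.read()))
            handles.append(item.fp)
        return seen, handles


if __name__ == "__main__":
    print(demo()[0])
```

The try/finally around the yield is the design choice that ties the handle to the item: content has to be read while the item is current, which mirrors the documented behavior of the fp parameter.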