palgen.ingest.filter
Module Contents
- class palgen.ingest.filter.Filter(*needles, regex=False, unix=False)
Generic filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
Important
Needles that start with ^ or end with $ are always converted to regex patterns.
- __slots__ = ('needles',)
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
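The role of the attribute parameter can be illustrated with a small standalone sketch. It does not use palgen itself: the function name match_files_sketch and the matches callback are hypothetical, and it assumes that attribute names a pathlib.Path attribute (such as "name" or "suffix") looked up with getattr, with the whole path checked when attribute is None.

```python
from pathlib import Path
from typing import Callable, Iterable, Iterator, Optional


def match_files_sketch(
    files: Iterable[Path],
    matches: Callable[[str], bool],
    attribute: Optional[str] = None,
) -> Iterator[Path]:
    """Yield each path whose selected attribute satisfies `matches`."""
    for file in files:
        # Assumption: with no attribute given, the full path string is checked.
        candidate = file if attribute is None else getattr(file, attribute)
        if matches(str(candidate)):
            yield file


files = [Path("src/main.py"), Path("docs/index.md"), Path("src/util.py")]

# Check only the `suffix` attribute of each path.
by_suffix = list(match_files_sketch(files, lambda s: s == ".py", attribute="suffix"))
```

Here by_suffix contains the two .py paths, because only the suffix of each path is handed to the callback.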
- filter(files)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
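The rule that needles starting with ^ or ending with $ are always treated as regex patterns could be sketched roughly as follows. This is not palgen's implementation: normalize_needle and matches are hypothetical helpers, and the sketch assumes that plain string needles are compared by equality while regex needles are applied with re.search.

```python
import re
from typing import Union

Needle = Union[str, re.Pattern]


def normalize_needle(needle: Needle, regex: bool = False) -> Needle:
    """Compile a needle to a regex when requested or when it looks anchored."""
    if isinstance(needle, re.Pattern):
        return needle  # already compiled
    if regex or needle.startswith("^") or needle.endswith("$"):
        return re.compile(needle)
    return needle  # stays a plain string


def matches(needle: Needle, other: str) -> bool:
    """Assumption: regex needles use search, plain needles use equality."""
    if isinstance(needle, re.Pattern):
        return needle.search(other) is not None
    return needle == other


plain = normalize_needle("main.py")      # remains a str
anchored = normalize_needle("^test_")    # compiled because of the leading ^
trailing = normalize_needle(r"\.py$")    # compiled because of the trailing $
```

With these helpers, matches(anchored, "test_filter.py") is true even though regex=False was never passed.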
- class palgen.ingest.filter.Pattern(*patterns, unix=False)
Bases: palgen.ingest.filter.Filter
Filter by regex or unix pattern. Equivalent to Filter(…, regex=True)
- __slots__ = ()
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- filter(files)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- class palgen.ingest.filter.Folder(*needles, regex=False, unix=False)
Bases: palgen.ingest.filter.Filter
Generic filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
Important
Needles that start with ^ or end with $ are always converted to regex patterns.
- __slots__ = ()
- filter(files)
Filters by folder name. This will match folder names at any level.
- Parameters:
files (Iterable[Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
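Matching folder names "at any level" can be pictured with the following standalone sketch. It is an approximation rather than palgen's code: filter_by_folder is a hypothetical function, and it assumes plain string needles compared by equality against every directory component of the path.

```python
from pathlib import Path
from typing import Iterable, Iterator


def filter_by_folder(files: Iterable[Path], *needles: str) -> Iterator[Path]:
    """Yield files that have any parent directory named like one of the needles."""
    for file in files:
        # file.parent.parts holds every directory component, excluding the file name.
        if any(part in needles for part in file.parent.parts):
            yield file


files = [
    Path("project/src/main.py"),
    Path("project/tests/unit/test_main.py"),
    Path("project/README.md"),
]

# "tests" is not the immediate parent of test_main.py, but it still matches.
in_tests = list(filter_by_folder(files, "tests"))
```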
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- class palgen.ingest.filter.Suffix(*needles, regex=False, unix=False)
Bases: palgen.ingest.filter.Filter
Generic filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
Important
Needles that start with ^ or end with $ are always converted to regex patterns.
- __slots__ = ()
- filter(files)
Filters by suffix, including the leading dot. Unlike pathlib's default behavior, this concatenates all suffixes:
while pathlib.Path.suffix for foo.tar.gz is only .gz, this filter checks against .tar.gz.
- Parameters:
files (Iterable[Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
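The difference between pathlib's suffix and the concatenated suffix can be checked with the standard library alone. The snippet below only shows the pathlib side; that palgen builds its comparison value by joining Path.suffixes is an assumption.

```python
from pathlib import Path

path = Path("foo.tar.gz")

# pathlib's `suffix` only reports the final extension.
last_suffix = path.suffix

# Joining `suffixes` gives the concatenated form this filter is described as using.
full_suffix = "".join(path.suffixes)

print(last_suffix)  # .gz
print(full_suffix)  # .tar.gz
```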
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- class palgen.ingest.filter.Suffixes(*needles, regex=False, unix=False, position=None)
Bases: palgen.ingest.filter.Filter
Multiple suffixes filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
position (Optional[int]) – check against the suffix at this position only. If None, all parts of the suffix are tried.
- __slots__ = ('position',)
- filter(files)
Filters by suffix
- Parameters:
files (Iterable[Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
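How the position parameter narrows the check can be sketched as below. This is a standalone illustration, not palgen's implementation: filter_by_suffixes is a hypothetical function, and it assumes that position indexes into pathlib.Path.suffixes, that an out-of-range position simply matches nothing, and that needles are plain strings compared by equality.

```python
from pathlib import Path
from typing import Iterable, Iterator, Optional


def filter_by_suffixes(
    files: Iterable[Path], *needles: str, position: Optional[int] = None
) -> Iterator[Path]:
    """Yield files whose suffix parts contain one of the needles."""
    for file in files:
        suffixes = file.suffixes  # e.g. ['.tar', '.gz'] for data.tar.gz
        if position is None:
            candidates = suffixes  # try every part of the suffix
        else:
            try:
                candidates = [suffixes[position]]  # only the part at `position`
            except IndexError:
                candidates = []  # assumption: out of range means no match
        if any(part in needles for part in candidates):
            yield file


files = [Path("data.tar.gz"), Path("notes.txt"), Path("archive.gz")]

any_position = list(filter_by_suffixes(files, ".gz"))
first_only = list(filter_by_suffixes(files, ".gz", position=0))
```

With position left at None both archives match, while position=0 keeps only archive.gz, whose first suffix part is .gz.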
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- class palgen.ingest.filter.Stem(*needles, regex=False, unix=False)
Bases: palgen.ingest.filter.Filter
Generic filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
Important
Needles that start with ^ or end with $ are always converted to regex patterns.
- __slots__ = ()
- filter(files)
Filters by stem, i.e. the file's name without any extension(s). For example, the stem of foobar.tar.gz is foobar.
- Parameters:
files (Iterable[Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
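This notion of a stem differs from pathlib.Path.stem, which strips only the last extension. The sketch below shows one way to compute the extension-free stem with the standard library; full_stem is a hypothetical helper, and whether palgen derives the value this way is an assumption.

```python
from pathlib import Path


def full_stem(path: Path) -> str:
    """Return the file name with all extensions removed."""
    suffix = "".join(path.suffixes)
    # Slice off the combined suffix; names without a suffix are returned unchanged.
    return path.name[: -len(suffix)] if suffix else path.name


path = Path("foobar.tar.gz")

print(path.stem)        # foobar.tar
print(full_stem(path))  # foobar
```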
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- class palgen.ingest.filter.Name(*needles, regex=False, unix=False)
Bases: palgen.ingest.filter.Filter
Generic filter
- Parameters:
*needles (str | re.Pattern[str]) – list of strings or regex patterns
regex (bool) – if True, all needles will be interpreted as regex patterns
unix (bool) – if True, all needles will be interpreted as unix patterns
Important
Needles that start with ^ or end with $ are always converted to regex patterns.
- __slots__ = ()
- filter(files)
Filters by name
- Parameters:
files (Iterable[Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- match_str(other)
- Parameters:
other (str) – string to check against
- Returns:
True if other matches any of the needles
- match_files(files, attribute=None)
- Parameters:
files (Iterable[pathlib.Path]) – input Iterable of files to filter
attribute – attribute of the pathlib.Path object to check against
- Yields:
Path – for every file that matches any of the needles
- __call__(file_cache)
- Parameters:
file_cache (Iterable[pathlib.Path]) – input Iterable of files to filter
- Yields:
Path – for every file that matches any of the needles
- Return type:
Iterable[pathlib.Path]
- palgen.ingest.filter.Passthrough(data)
No-op, yields everything from the input Iterable
- Parameters:
data (Iterable[Any]) – any Iterable
- Yields:
Every item of the input Iterable, unchanged.
- Return type:
Iterable[Any]
- palgen.ingest.filter.Nothing(data)
Consumes the input iterable but does not yield anything
- Parameters:
data (Iterable[Any]) – any Iterable
- Yields:
Nothing whatsoever.
- Return type:
Iterable[Any]
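The behavior described for Passthrough and Nothing can be reproduced with two small generators. The lowercase names passthrough and nothing are stand-ins written for this sketch, not palgen's own functions, and the bodies are an assumption about how such helpers are typically written.

```python
from typing import Any, Iterable, Iterator


def passthrough(data: Iterable[Any]) -> Iterator[Any]:
    """Yield every item of the input unchanged."""
    yield from data


def nothing(data: Iterable[Any]) -> Iterator[Any]:
    """Exhaust the input but never yield an item."""
    for _ in data:
        pass
    return
    yield  # unreachable; its presence makes this function a generator


source = iter([1, 2, 3])
drained = list(nothing(source))  # consumes `source` and produces no items
leftover = list(source)          # nothing remains in the iterator afterwards
```

The distinction from simply returning an empty iterable is that the input iterator is fully consumed, which matters when the input is a generator with side effects.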