Ingesting files#
To make common operations such as ingesting files from your project’s source directory a little easier, palgen comes with a “magic” Pipeline
helper. This helper allows palgen to automatically parallelize tasks if possible.
A Pipeline such as
pipeline = Sources() >> foo >> bar >> baz
when invoked as pipeline(iterable) would essentially expand to
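the following nested call (a conceptual sketch only; the actual Pipeline object takes care of chaining the steps and, where possible, parallelizing them):

baz(bar(foo(iterable)))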
You can use the @max_jobs(...) decorator from palgen.ext to control the maximum number of jobs a step may be run with. Additionally, annotating the data parameter with list is equivalent to decorating the step with @max_jobs(1).
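As a sketch (the extension and step bodies below are made up purely to show the decorator and the list annotation):

from typing import Iterable

from palgen import Extension
from palgen.ext import max_jobs


class Example(Extension):
    @max_jobs(4)                    # let palgen run this step with at most four parallel jobs
    def transform(self, data: Iterable):
        for path, content in data:
            yield path, content

    def render(self, data: list):   # annotating data with list implies @max_jobs(1)
        yield from data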
Every step must be a function or function object returning an iterable, a generator, a class implementing the iterator protocol, or an instance of such a class. The palgen.machinery.pipelines.Pipeline class is aliased as palgen.ext.Sources for convenience. The default run(...) method calls the pipeline class attribute of your extension with a pre-filtered list of source files.
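For illustration, a step can therefore be a plain generator function or a small callable class. The Under filter below is hypothetical, not part of palgen:

from pathlib import Path
from typing import Iterable


class Under:
    ''' Hypothetical path filter: keeps only paths beneath a given directory '''

    def __init__(self, root: str):
        self.root = Path(root)

    def __call__(self, files: Iterable[Path]) -> Iterable[Path]:
        for path in files:
            if self.root in path.parents:
                yield path

An instance such as Under('src') can then be chained into a pipeline like any other step.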
Filters and loaders#
For convenience, some commonly used path filters and file loaders are implemented in palgen.ingest. All built-in filter and loader steps expect an iterable of Paths, so make sure to use only one loader and make it the last step: a loader yields tuples of the Path to each file and that file's loaded content.
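To make the shape of that output concrete, a loader is conceptually just a step like the following (read_text is a hypothetical stand-in for built-in loaders such as Raw or Toml):

from pathlib import Path
from typing import Iterable


def read_text(files: Iterable[Path]):
    ''' Hypothetical loader: yields each Path together with its content as text '''
    for path in files:
        yield path, path.read_text(encoding="utf-8")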
Default pipelines#
By default palgen searches for TOML files named after the extension. For example, an extension called error would by default ingest all files called error.toml from the source directories.
ingest = (Sources() >> Suffix('toml')
          >> Name(<extension name>)
          >> Toml)
This behavior can be overridden by defining the ingest class variable of your extension. Possible values are None to disable ingest entirely, a pipeline, or a dictionary of pipelines with string keys.
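A sketch of those possibilities (the extensions and dictionary keys below are made up, and Suffix, Name, Toml and Raw are assumed to live in palgen.ingest like the other built-in steps):

from palgen import Extension, Sources
from palgen.ingest import Suffix, Name, Toml, Raw


class Disabled(Extension):
    ingest = None        # ingest nothing for this extension


class Single(Extension):
    # a single pipeline replacing the default <name>.toml search
    ingest = Sources() >> Suffix('toml') >> Name('single') >> Toml


class Split(Extension):
    # a dictionary of pipelines with string keys
    ingest = {
        'settings': Sources() >> Suffix('toml') >> Name('settings') >> Toml,
        'assets': Sources() >> Suffix('bin') >> Raw,
    }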
The overall pipeline of an extension can be overridden by setting the pipeline class variable of your extension. By default palgen will execute the ingest pipeline and then the transform, validate, render and write methods of your extension, in that order. This is equivalent to setting something like
pipeline = (Sources() >> <extension>.ingest
            >> <extension>.transform
            >> <extension>.validate
            >> <extension>.render
            >> <extension>.write)
Full example#
from hashlib import md5
from pathlib import Path
from typing import Iterable

from palgen import Extension, Sources, max_jobs
from palgen.ingest import Suffixes, Relative, Raw
from palgen.template.string import Template

# extension class must inherit from palgen.Extension


class Embed(Extension):
    ''' Converts arbitrary files to C arrays '''

    ingest = (Sources                            # Paths to all files
              >> Suffixes('.embed', position=0)  # Select files with .embed as first extension (ie foo.embed.bar)
              >> Relative                        # Convert paths to relative paths
              >> Raw)                            # Read in files as raw bytes

    def transform(self, data: Iterable[tuple[Path, bytes]]):
        for file, content in data:
            # remove the `.embed` suffix
            path = file.with_stem(file.name[: -len(''.join(file.suffixes))])

            # add `.hpp` suffix
            outpath = path.with_suffix(''.join([*file.suffixes[1:], '.hpp']))

            # hash the input path without the `.embed` suffix
            filehash = md5(bytes(str(path), encoding="utf-8")).hexdigest()

            # pass this along to the next step
            yield outpath, filehash, Template("resource.tpl.hpp")(hash = filehash,
                                                                  data = ','.join([hex(value) for value in content]),
                                                                  path = path)

    # this collects all previous results, so limit this to 1 job
    @max_jobs(1)
    def render(self, data: Iterable[tuple[Path, str, str]]):
        includes = []
        pairs = []
        for file, filehash, content in data:
            includes.append(f"#include <{file}>")   # include directives, joined with newlines below
            pairs.append(f"Embed::r{filehash}")      # resource references, comma-joined below
            yield file, content

        indent = " " * 16
        yield Path('src') / 'embed.hpp', Template("embed.tpl.hpp")(amount = len(pairs),
                                                                   pairs = f',\n{indent}'.join(pairs),
                                                                   includes = '\n'.join(includes))