banffprocessor.metadata package#
Subpackages#
- banffprocessor.metadata.models package
- Submodules
- banffprocessor.metadata.models.algorithms module
- banffprocessor.metadata.models.donorspecs module
- banffprocessor.metadata.models.editgroups module
- banffprocessor.metadata.models.edits module
- banffprocessor.metadata.models.errorlocspecs module
- banffprocessor.metadata.models.estimators module
- banffprocessor.metadata.models.estimatorspecs module
- banffprocessor.metadata.models.expressions module
- banffprocessor.metadata.models.jobs module
- banffprocessor.metadata.models.massimputationspecs module
- banffprocessor.metadata.models.metadataclass module
- banffprocessor.metadata.models.outlierspecs module
- banffprocessor.metadata.models.processcontrols module
- banffprocessor.metadata.models.processoutputs module
- banffprocessor.metadata.models.proratespecs module
- banffprocessor.metadata.models.uservars module
- banffprocessor.metadata.models.varlists module
- banffprocessor.metadata.models.verifyeditsspecs module
- banffprocessor.metadata.models.weights module
- Module contents
Submodules#
banffprocessor.metadata.metaobjects module#
- class banffprocessor.metadata.metaobjects.MetaObjects(metadata_folder: str | Path | None = None, job_id: str | None = None, dbconn: DuckDBPyConnection | None = None)[source]#
Bases: object
Container class for collections of metadata objects.
- add_objects_of_single_type(objects: list[MetadataClass]) None [source]#
Add a list of metadata objects, all of the same type, to the MetaObjects collection. The list can later be retrieved using that type.
Only one list per object type can be stored; adding a second list of the same type overwrites the list originally stored under that type.
- Parameters:
objects (list[src.banffprocessor.metadata.MetadataClass]) – The list of metadata objects to load
- Raises:
ValueError – If objects is empty or None
TypeError – If objects contains objects of more than one type
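The storage rules described above (one list per type, overwrite on re-add, ValueError for an empty list, TypeError for mixed types) can be sketched as follows. This is only an illustration: `TypeKeyedStore`, `Edit` and `Weight` are hypothetical stand-ins, not the real MetaObjects or metadata classes, and the actual implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class Edit:  # hypothetical stand-in for a metadata class
    editid: str


@dataclass
class Weight:  # hypothetical stand-in for a metadata class
    weightid: str


class TypeKeyedStore:
    """Illustrative container only; not the real MetaObjects class."""

    def __init__(self) -> None:
        self._by_type = {}  # maps a class to the list stored for it

    def add_objects_of_single_type(self, objects: list) -> None:
        if not objects:
            raise ValueError("objects must be a non-empty list")
        types_found = {type(obj) for obj in objects}
        if len(types_found) > 1:
            raise TypeError("objects must all be of the same type")
        # A second list of the same type replaces the first one.
        self._by_type[types_found.pop()] = list(objects)

    def get_objects_of_type(self, cls: type) -> list:
        # An unknown type yields an empty list.
        return self._by_type.get(cls, [])


store = TypeKeyedStore()
store.add_objects_of_single_type([Edit("E1"), Edit("E2")])
store.add_objects_of_single_type([Edit("E3")])  # overwrites the first Edit list
print(store.get_objects_of_type(Edit))
```

Keying the store on the class itself is what makes the overwrite behaviour fall out naturally: there is only ever one slot per type.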
- check_constraints() None [source]#
Perform the check_constraints method on each metadata class type loaded to this object.
- cleanup_metadata() None [source]#
Perform the cleanup() method on each metadata class type loaded to this object.
- property dbconn: DuckDBPyConnection | None#
The currently connected database used to store metadata objects.
- Returns:
The DuckDBPyConnection currently being used to store metadata objects.
- Return type:
duckdb.DuckDBPyConnection | None
- display_load_summary() None [source]#
Display a summary of the Metadata files that were loaded to memory.
- get_algorithm(algorithmname: str) Algorithms | None [source]#
Get and return the src.banffprocessor.metadata.Algorithms object associated with the specified algorithmname.
- Parameters:
algorithmname – The algorithmname of the src.banffprocessor.metadata.Algorithms object to retrieve
- Returns:
The src.banffprocessor.metadata.Algorithms object with the specified algorithmname
- Return type:
src.banffprocessor.metadata.Algorithms | None
- get_edits_string(editgroupid: str) str [source]#
Get and return a string containing the list of edits in the src.banffprocessor.metadata.Edits objects associated with the specified editgroupid.
Each edit is prefixed with its modifier (if present) followed by a colon and a space, and the formed edits are joined with a semi-colon and a space. If no edits are found under the editgroupid, an empty string is returned.
e.g. “PASS: a > b; FAIL: c + d <= e; f - g = h;”
- Parameters:
editgroupid (str) – the ID to filter the src.banffprocessor.metadata.Editgroups on
- Returns:
The semi-colon separated list of formed edits as a single string, empty if no edits were found
- Return type:
str
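The string format described for get_edits_string can be reproduced with a small sketch. `EditRow` and `format_edits` are hypothetical names introduced for illustration, and the field names `modifier` and `edit` are assumptions; the real Edits class and its lookup by editgroupid are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditRow:  # hypothetical stand-in, not the real Edits class
    modifier: Optional[str]  # e.g. "PASS" or "FAIL"; None when absent
    edit: str  # the already-formed edit expression, e.g. "a > b"


def format_edits(edits: list) -> str:
    parts = []
    for row in edits:
        # Prepend "MODIFIER: " only when a modifier is present.
        prefix = f"{row.modifier}: " if row.modifier else ""
        parts.append(f"{prefix}{row.edit};")
    # Joining with a space gives "...; ...;" and an empty string for no edits.
    return " ".join(parts)


edits_string = format_edits([
    EditRow("PASS", "a > b"),
    EditRow("FAIL", "c + d <= e"),
    EditRow(None, "f - g = h"),
])
print(edits_string)  # PASS: a > b; FAIL: c + d <= e; f - g = h;
```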
- get_estimators(estid: str) list[Estimators] [source]#
Get and return the list of src.banffprocessor.metadata.Estimators objects associated with the specified estid, sorted by their seqno.
If no estimators are found under the estid, an empty list is returned.
- Parameters:
estid (str) – the ID to filter the src.banffprocessor.metadata.Estimators on
- Returns:
A list of src.banffprocessor.metadata.Estimators objects, or an empty list if no src.banffprocessor.metadata.Estimators are found under the estid
- Return type:
list[src.banffprocessor.metadata.Estimators]
- get_expression(exprid: str) str [source]#
Get the expression string associated with the specified exprid.
- Parameters:
exprid (str) – The identifier of the Expression to get.
- Returns:
The expressions field value of the Expression object fetched.
- Return type:
str
- get_job_steps(jobid: str | None) list[Jobs] [source]#
Get and return the list of src.banffprocessor.metadata.Jobs objects with the specified jobid, sorted in ascending order of their seqno.
If no objects are found under the jobid, an empty list is returned.
- Parameters:
jobid (str | None) – The jobid to filter the src.banffprocessor.metadata.Jobs objects on
- Returns:
A list of the src.banffprocessor.metadata.Jobs objects with jobid jobid
- Return type:
list[src.banffprocessor.metadata.Jobs]
- get_objects_of_type(cls: type[MetadataClass]) list[MetadataClass] | dict[str, Any] [source]#
Get the list of metadata objects of type cls.
If no objects are found, an empty list is returned.
- Parameters:
cls (type[src.banffprocessor.metadata.MetadataClass]) – The class reference of the object type to fetch
- Returns:
A list of all objects of type cls found in this MetaObjects object, or a special dictionary if the objects are of type src.banffprocessor.metadata.ProcessControls
- Return type:
list[src.banffprocessor.metadata.MetadataClass] | dict[str, Any]
- get_process_controls(controlid: str) dict[str, dict[ProcessControlType, list[ProcessControls]]] [source]#
Get and return a mapping of targetfile names to a dict of parameter values to their list of src.banffprocessor.metadata.ProcessControls objects associated with the specified controlid.
The lists are sorted on the enum value of the parameter field to ensure that a regular list traversal will always pass over controls in the order that they should be applied to the targetfile. If no process controls are found under the controlid, an empty dict is returned.
e.g.
{
    "indata": {
        ProcessControlType.ROW_FILTER: [ProcessControls1, ProcessControls2],
        ProcessControlType.COLUMN_FILTER: [ProcessControls3, ProcessControls4],
    },
}
- Parameters:
controlid (str) – the ID to filter the src.banffprocessor.metadata.ProcessControls on
- Returns:
A dict of targetfile names mapped to dicts of src.banffprocessor.metadata.ProcessControlType mapped to lists of src.banffprocessor.metadata.ProcessControls of that type for that targetfile, or an empty dict if no records are found under the controlid
- Return type:
dict[str, dict[src.banffprocessor.metadata.ProcessControlType, list[src.banffprocessor.metadata.ProcessControls]]]
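The nested mapping returned by get_process_controls can be pictured with the sketch below. It is one possible reading of the description, under stated assumptions: `ControlType`, `ControlRow` and `group_controls` are hypothetical names, the enum members and their numeric values are invented for the example, and sorting by enum value before grouping is an interpretation of "sorted on the enum value of the parameter field".

```python
from dataclasses import dataclass
from enum import Enum


class ControlType(Enum):  # hypothetical; real ProcessControlType members may differ
    ROW_FILTER = 1
    COLUMN_FILTER = 2


@dataclass
class ControlRow:  # hypothetical stand-in, not the real ProcessControls class
    controlid: str
    targetfile: str
    parameter: ControlType
    value: str


def group_controls(rows: list, controlid: str) -> dict:
    selected = [row for row in rows if row.controlid == controlid]
    # Sort on the enum value so insertion order (and therefore iteration
    # order) follows the order in which the controls would be applied.
    selected.sort(key=lambda row: row.parameter.value)
    grouped = {}
    for row in selected:
        by_type = grouped.setdefault(row.targetfile, {})
        by_type.setdefault(row.parameter, []).append(row)
    return grouped


rows = [
    ControlRow("C1", "indata", ControlType.COLUMN_FILTER, "a b"),
    ControlRow("C1", "indata", ControlType.ROW_FILTER, "x > 0"),
    ControlRow("C2", "indata", ControlType.ROW_FILTER, "y > 0"),
]
grouped = group_controls(rows, "C1")
for control_type, controls in grouped["indata"].items():
    print(control_type.name, [control.value for control in controls])
```

Because Python dicts preserve insertion order, iterating the inner dict visits ROW_FILTER before COLUMN_FILTER in this sketch, regardless of the order of the input rows.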
- get_process_outputs(process: str) list[str] [source]#
Get and return the list of output_name strings for process. If no objects are found under process an empty list is returned.
- Parameters:
process (str) – The name of the process value to retrieve records of
- Returns:
A list of the output_name attributes of the banffprocessor.metadata.ProcessOutputs objects with process process
- Return type:
list[str]
- get_specs_obj(cls: type[MetadataClass], specid: str) MetadataClass [source]#
Get and return the object of type cls with the specid specid.
Only one result should be found for the specified specid as it is effectively a primary key for its metadata table. If no object is found for the specid None is returned.
- Parameters:
cls (type[src.banffprocessor.metadata.MetadataClass]) – The metadata object type to search for
specid (str) – The specid to match on objects of type cls
- Raises:
MetadataConstraintError – If multiple src.banffprocessor.metadata.MetadataClass objects are found under specid
- Returns:
The object with type cls and specid specid or None if not found
- Return type:
src.banffprocessor.metadata.MetadataClass
- get_user_vars_dict(specid: str, process: str) dict[str, str] [source]#
Get Uservars objects identified by the specid and process and return a dict mapping the Uservars var to its value.
- Parameters:
specid (str) – The specid identifying Uservars to fetch.
process (str) – The process value of the Uservars to fetch.
- Returns:
A dictionary mapping the fetched Uservars var field to their value
- Return type:
dict[str,str]
- get_varlist_fieldids(varid: str | None) list[str] [source]#
Get and return the list of fieldids of the varlist objects associated with the specified varid, sorted on seqno.
If no varlist objects are found under the varid, an empty list is returned.
- Parameters:
varid (str | None) – the ID to filter the varlists on
- Returns:
A list of the fieldids of the varlist objects with varid varid, sorted on their seqno, or an empty list if no objects are found
- Return type:
list[str]
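Several getters in this class (get_estimators, get_job_steps, get_varlist_fieldids) share the same filter-then-sort-by-seqno pattern. A minimal sketch of that pattern for fieldids follows; `VarlistRow` and `fieldids_for` are hypothetical names, and the attribute names `varid`, `seqno` and `fieldid` are assumed from the description rather than taken from the real Varlists class.

```python
from dataclasses import dataclass


@dataclass
class VarlistRow:  # hypothetical stand-in, not the real Varlists class
    varid: str
    seqno: int
    fieldid: str


def fieldids_for(rows: list, varid: str) -> list:
    # Keep only the rows for this varid, then order them by seqno.
    matching = [row for row in rows if row.varid == varid]
    return [row.fieldid for row in sorted(matching, key=lambda row: row.seqno)]


rows = [
    VarlistRow("VL1", 2, "income"),
    VarlistRow("VL2", 1, "age"),
    VarlistRow("VL1", 1, "expenses"),
]
print(fieldids_for(rows, "VL1"))  # ['expenses', 'income']
print(fieldids_for(rows, "missing"))  # []
```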
- get_weights_string(weightid: str) str [source]#
Get and return a string containing the list of weights in the src.banffprocessor.metadata.Weights objects associated with the specified weightid, sorted in descending order by weight and formed by concatenating the formed weight strings from the objects with a semi-colon and space.
If no src.banffprocessor.metadata.Weights are found under the weightid, an empty string is returned.
e.g. “field1=9.0; field2=7.0; field3=5.0;”
- Parameters:
weightid (str) – the ID to filter the src.banffprocessor.metadata.Weights on
- Returns:
The semi-colon separated list of formed weights as a single string, empty if no src.banffprocessor.metadata.Weights were found
- Return type:
str
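The weights string format can be sketched in the same spirit. `WeightRow` and `format_weights` are hypothetical names, and the attribute names `weightid`, `fieldid` and `weight` are assumptions based on the example string; the real Weights class may expose different fields.

```python
from dataclasses import dataclass


@dataclass
class WeightRow:  # hypothetical stand-in, not the real Weights class
    weightid: str
    fieldid: str
    weight: float


def format_weights(rows: list, weightid: str) -> str:
    selected = sorted(
        (row for row in rows if row.weightid == weightid),
        key=lambda row: row.weight,
        reverse=True,  # descending order by weight
    )
    # Each entry ends with ";" and entries are separated by a space.
    return " ".join(f"{row.fieldid}={row.weight};" for row in selected)


weight_rows = [
    WeightRow("W1", "field2", 7.0),
    WeightRow("W1", "field3", 5.0),
    WeightRow("W1", "field1", 9.0),
]
print(format_weights(weight_rows, "W1"))  # field1=9.0; field2=7.0; field3=5.0;
```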
- initialize_metadata() None [source]#
Perform the initialize() method on each metadata class type loaded to this object.
- job_proc_names: set[str]#
- load_xml_file(metadata_file: Path, cls: type[MetadataClass]) None [source]#
Load the metadata found in metadata_file into this src.banffprocessor.metadata.MetaObjects object.
The new entry added will have a key of the cls name and value of a collection of objects of type cls.
- Parameters:
metadata_file (pathlib.Path) – The path of the XML file to load
cls (type[src.banffprocessor.metadata.MetadataClass]) – The metadata object type to load the file into
- Raises:
EmptyMetadataFileError – If the metadata file does not contain any valid entries
MetadataConstraintError – If an entry in the metadata file contains values that violate the constraints on the object type being loaded
- total_job_steps: int#
- static validate_job_sequence(job_steps: list[Jobs], job_id: str | None = None) tuple[set[str], int] [source]#
Iterate through job_steps and validate the sequence of all steps and process blocks contained in or referenced by the job with job_id.
If job_id is not provided, the first job found in job_steps will be used as the starting point. Returns a set of the unique proc names contained in the job sequence, together with the total number of job steps across the entire job.
- Parameters:
job_steps (list[Jobs]) – A collection of Jobs metadata objects
job_id (str | None, optional) – The job_id to be run and whose job steps to validate, defaults to None
- Raises:
MetadataConstraintError – If a job step of process “JOB” has a specid pointing to a job_id that does not exist in the current Jobs metadata collection
MetadataConstraintError – If a cycle exists in the graph of job_steps (i.e. a step points to a process block which points back to the calling block, thus creating an infinite loop)
- Returns:
A set of the unique proc names contained in the job sequence and the total number of job steps across the entire job.
- Return type:
tuple[set[str], int]
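The two failure cases listed under Raises (a "JOB" step pointing at a missing job, and a cycle between process blocks) can be illustrated with a depth-first walk. This is a sketch under several assumptions, not the library's implementation: `JobStep` and `validate_sequence` are hypothetical names, ValueError stands in for MetadataConstraintError, and the choices to count a "JOB" step as one step and to leave "JOB" out of the proc-name set are guesses.

```python
from dataclasses import dataclass


@dataclass
class JobStep:  # hypothetical stand-in, not the real Jobs class
    jobid: str
    seqno: int
    process: str
    specid: str = ""  # for process "JOB": the jobid of the nested block


def validate_sequence(steps: list, job_id: str) -> tuple:
    by_job = {}
    for step in steps:
        by_job.setdefault(step.jobid, []).append(step)

    proc_names = set()

    def walk(current: str, active: tuple) -> int:
        # `active` holds the jobs on the current call path; meeting one
        # of them again means a block eventually points back at itself.
        if current in active:
            raise ValueError(f"cycle detected at job {current!r}")
        if current not in by_job:
            raise ValueError(f"job {current!r} does not exist")
        count = 0
        for step in sorted(by_job[current], key=lambda s: s.seqno):
            count += 1  # assumption: a "JOB" step itself counts as a step
            if step.process.upper() == "JOB":
                count += walk(step.specid, active + (current,))
            else:
                proc_names.add(step.process)
        return count

    total = walk(job_id, ())
    return proc_names, total


steps = [
    JobStep("main", 1, "OUTLIER"),
    JobStep("main", 2, "JOB", "sub"),
    JobStep("sub", 1, "ERRORLOC"),
    JobStep("sub", 2, "DONORIMP"),
]
names, total = validate_sequence(steps, "main")
print(sorted(names), total)  # ['DONORIMP', 'ERRORLOC', 'OUTLIER'] 4
```

Tracking only the jobs on the current path, rather than every job visited so far, lets the same process block be referenced from two different places without being mistaken for a cycle.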
Module contents#
Contains the modules used to interact and store Banff Processor metadata.