banffprocessor.metadata.models package#

Submodules#

banffprocessor.metadata.models.algorithms module#

Metadata model for user-defined estimator algorithms.

class banffprocessor.metadata.models.algorithms.Algorithms(algorithmname: str, status: str, formula: str, type: str, description: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for user-defined algorithms which define estimators.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

to_dict() dict[str, str][source]#

Return the algorithm metadata as a dictionary.

banffprocessor.metadata.models.donorspecs module#

Metadata model for Donor Imputation specifications.

class banffprocessor.metadata.models.donorspecs.Donorspecs(specid: str, n: int, dataexclvar: str | None = None, posteditgroupid: str | None = None, mustmatchid: str | None = None, mindonors: int | None = None, pcentdonors: float | None = None, eligdon: str | None = None, random: bool | None = None, nlimit: int | None = None, mrl: float | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for donor imputation specifications.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constraints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.editgroups module#

Metadata model for defining edit groups.

class banffprocessor.metadata.models.editgroups.Editgroups(editgroupid: str, editid: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining edit groups.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

All edits defined in an edit group must exist.
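A check of this kind can be pictured as an anti-join between the two metadata tables. The following is a minimal sketch, not the library's implementation: it uses the standard-library sqlite3 module as a stand-in for DuckDB so that it runs without extra dependencies, the table and column names are assumed from the constructor arguments, and MetadataConstraintError is redefined locally purely for illustration.

```python
import sqlite3


class MetadataConstraintError(Exception):
    """Local stand-in for the exception named in this documentation."""


def check_editgroup_edits(conn: sqlite3.Connection) -> None:
    """Raise if an editgroups.editid has no matching row in edits (assumed layout)."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.editid
        FROM editgroups AS g
        LEFT JOIN edits AS e ON e.editid = g.editid
        WHERE e.editid IS NULL
        ORDER BY g.editid
        """
    ).fetchall()
    missing = [row[0] for row in rows]
    if missing:
        raise MetadataConstraintError(
            f"editgroups.editid values not found in edits.editid: {', '.join(missing)}"
        )


def demo_missing_edits() -> list[str]:
    """Build two tiny tables where edit E3 is referenced but never defined."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edits (editid TEXT)")
    conn.execute("CREATE TABLE editgroups (editgroupid TEXT, editid TEXT)")
    conn.executemany("INSERT INTO edits VALUES (?)", [("E1",), ("E2",)])
    conn.executemany(
        "INSERT INTO editgroups VALUES (?, ?)", [("G1", "E1"), ("G1", "E3")]
    )
    try:
        check_editgroup_edits(conn)
    except MetadataConstraintError as exc:
        return [str(exc)]
    finally:
        conn.close()
    return []
```

Calling demo_missing_edits() returns a single message that names E3, the only edit referenced by a group without being defined.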

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.edits module#

Metadata model for edits.

class banffprocessor.metadata.models.edits.Edits(editid: str, leftside: str, rightside: str, operator: str, modifier: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining edits.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.errorlocspecs module#

Metadata model for Errorloc specifications.

class banffprocessor.metadata.models.errorlocspecs.Errorlocspecs(specid: str, cardinality: float | None = None, timeperobs: float | None = None, weightid: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for ErrorLoc specifications.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constraints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.estimators module#

Metadata model for Estimators.

class banffprocessor.metadata.models.estimators.Estimators(estimatorid: str, seqno: float, fieldid: str, algorithmname: str, randomerror: bool, auxvariables: str | None = None, weightvariable: str | None = None, variancevariable: str | None = None, varianceexponent: float | None = None, varianceperiod: str | None = None, excludeimputed: bool | None = None, excludeoutliers: bool | None = None, countcriteria: int | None = None, percentcriteria: float | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Estimators metadata class.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

to_dict() dict[str, str | int | float][source]#

Return the object as a dictionary.

Used for creating a DataFrame from the object. Any field for which no value was provided is explicitly given a value that reflects its type. This way there is no possibility of an empty field being read with the wrong datatype (character seen as numeric, or vice versa) when the constructed DataFrame is passed to the Banff package C code.
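The idea of giving empty fields a typed placeholder can be illustrated with a small stand-alone function. This is only a sketch under stated assumptions: the documentation does not say which placeholder values to_dict() actually uses, so NaN for numeric fields and an empty string for character fields are guesses, and NUMERIC_FIELDS and to_typed_dict are hypothetical names.

```python
import math

# Assumption: these fields are numeric, based on the constructor signature.
NUMERIC_FIELDS = {"seqno", "varianceexponent", "countcriteria", "percentcriteria"}


def to_typed_dict(record: dict) -> dict:
    """Replace None with a placeholder of the field's own type (assumed values)."""
    out = {}
    for name, value in record.items():
        if value is None:
            # NaN keeps a numeric column numeric; "" keeps a text column text.
            out[name] = math.nan if name in NUMERIC_FIELDS else ""
        else:
            out[name] = value
    return out


row = to_typed_dict(
    {
        "estimatorid": "EST1",
        "seqno": 1.0,
        "auxvariables": None,
        "varianceexponent": None,
    }
)
```

With placeholders like these, a column built from many such dictionaries keeps a single consistent type even when every value in it is missing.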

banffprocessor.metadata.models.estimators.builtin_estimators() dict[source]#

Return built-in estimators as a dictionary, where the key is the name and the value is the type.

banffprocessor.metadata.models.estimatorspecs module#

Metadata model for Estimator specifications.

class banffprocessor.metadata.models.estimatorspecs.Estimatorspecs(specid: str, estimatorid: str, dataexclvar: str | None = None, histexclvar: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for Estimator specifications.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constraints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.expressions module#

Metadata model for Expressions.

class banffprocessor.metadata.models.expressions.Expressions(expressionid: str, expressions: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for expressions.

static get_expression(expression_id: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) str[source]#

Return the expression as a string for the given id.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.jobs module#

Metadata model for Jobs.

class banffprocessor.metadata.models.jobs.Jobs(jobid: str, seqno: float, process: str, specid: str | None = None, editgroupid: str | None = None, byid: str | None = None, acceptnegative: str | None = None, controlid: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining Jobs.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constraints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.massimputationspecs module#

Metadata model for Mass Imputation specifications.

class banffprocessor.metadata.models.massimputationspecs.Massimputationspecs(specid: str, mustimputeid: str, mindonors: int | None = None, pcentdonors: float | None = None, mustmatchid: str | None = None, random: bool | None = None, nlimit: int | None = None, mrl: float | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for mass imputation specifications.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constraints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.metadataclass module#

Abstract Class for Banff Processor Metadata models.

class banffprocessor.metadata.models.metadataclass.MetadataClass[source]#

Bases: object

Banff Processor Metadata Class.

Abstract class definition for all metadata classes to extend. It allows for easier type hinting without needing to explicitly write out all class names, and it reduces duplicated code.

DATA_FIELD_SCHEMA_MAX_LENGTH: str = '64'#
classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Constraints to check after all metadata has been loaded.

Subclasses may implement a set of checks, if necessary. If a constraint fails, an exception is raised.

classmethod cleanup(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Cleanup metadata in the given database.

The metadata table will be deleted if the connection is still open and the table exists. If the database is not open, the connection object will have no default_connection attribute.

classmethod get_record_count(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) int[source]#

Return the number of records in the metadata table.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return a string that contains the XML Schema Definition for the class's metadata.

By default the root element will be banffProcessor, but this may be changed as some XML generators use a standard root, like ‘root’ or ‘data’.
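To make the role of root_element_name concrete, here is a minimal sketch of a function that builds an XSD string around a configurable root element. The schema body is invented for illustration (a single editid field) and is not the schema shipped with the package; sketch_schema and root_name are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def sketch_schema(root_element_name: str = "banffProcessor") -> str:
    """Return a toy XSD whose top-level element uses the requested name."""
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="{XS}">
  <xs:element name="{root_element_name}">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="edits" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="editid" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def root_name(xsd_text: str) -> str:
    """Parse the XSD and report the name of its top-level element."""
    schema = ET.fromstring(xsd_text.encode("utf-8"))
    return schema.find(f"{{{XS}}}element").get("name")
```

Passing "root" or "data" changes only the outermost element name, which is what allows files produced by generic XML generators to validate against the same record definitions.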

static handle_foreign_key_violation(table1_name: str, table1_column: str, table2_name: str, table2_column: str, values_not_found: str) None[source]#

Handle foreign key violations.

When a relationship error is detected, a MetadataConstraintError exception is raised.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Perform any initialization before loading metadata.

This is typically creating the database table to store the metadata in.

classmethod load_xml(xml_file_name: str) None[source]#

Attempt to load the given XML file.

An attempt is made to load the XML file into the Banff Processor metadata, based on the XML schema.

classmethod setup(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Perform setup.

This is called by sub-classes to ensure the the standard setup is performed during the initialization process.

banffprocessor.metadata.models.outlierspecs module#

Metadata model for Outlier specifications.

class banffprocessor.metadata.models.outlierspecs.Outlierspecs(specid: str, method: str, varid: str | None = None, minobs: int | None = None, side: str | None = None, withid: str | None = None, mii: float | None = None, mei: float | None = None, mdm: float | None = None, exponent: float | None = None, sigma: str | None = None, betai: float | None = None, betae: float | None = None, startcentile: float | None = None, acceptzero: bool | None = None, weight: str | None = None, dataexclvar: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for donor imputation specifications.

classmethod check_constraints(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Check constaints after all metadata has been loaded (typically foreign key constraints).

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

invalid_min_obs: int = 3#
methods: ClassVar[dict[str, str]] = {'C': 'CURRENT', 'H': 'HISTORIC', 'R': 'RATIO', 'S': 'SIGMAGAP'}#

banffprocessor.metadata.models.processcontrols module#

Metadata model for Process Controls.

class banffprocessor.metadata.models.processcontrols.ProcessControlType(*values)[source]#

Bases: Enum

Define process control type.

Ordering of these enum values could be used to dictate order of operations but is not currently used as all current control types are applied simultaneously.

COLUMN_FILTER = 2#
EDIT_GROUP_FILTER = 4#
EXCLUDE_REJECTED = 3#
ROW_FILTER = 1#
class banffprocessor.metadata.models.processcontrols.ProcessControls(controlid: str, parameter: str, value: str | None = None, targetfile: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining process control specifications.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.processoutputs module#

Metadata model for ProcessOutputs.

class banffprocessor.metadata.models.processoutputs.ProcessOutputs(process: str, output_name: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Allow the user to define which additional outputs they would like for each process type.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return the XSD for the metadata model as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Initialize metadata model.

An empty metadata table is created to load the metadata into.

banffprocessor.metadata.models.proratespecs module#

Metadata model for Prorate specifications.

class banffprocessor.metadata.models.proratespecs.Proratespecs(specid: str, decimal: int | None = None, lowerbound: float | None = None, upperbound: float | None = None, modifier: str | None = None, method: str | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for prorate imputation specifications.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.uservars module#

Metadata model for user-defined variables.

class banffprocessor.metadata.models.uservars.Uservars(process: str, specid: str, var: str, value: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

User-defined variables metadata class.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

static uservars_to_dict(specid: str, process: str, dbconn: DuckDBPyConnection) dict[str, str][source]#

Return the user variables for the given specid as a dictionary.

banffprocessor.metadata.models.varlists module#

Metadata model for Varlists.

class banffprocessor.metadata.models.varlists.Varlists(varlistid: str, seqno: float, fieldid: str, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining variable lists.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.verifyeditsspecs module#

Metadata model for VerifyEdits specifications.

class banffprocessor.metadata.models.verifyeditsspecs.Verifyeditsspecs(specid: str, imply: int | None = None, extremal: int | None = None, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for verify edit procedure specifications.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

banffprocessor.metadata.models.weights module#

Metadata model for Weights.

class banffprocessor.metadata.models.weights.Weights(weightid: str, fieldid: str, weight: float, dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>)[source]#

Bases: MetadataClass

Metadata class for defining the weights of variables.

static get_schema(root_element_name: str = 'banffProcessor') str[source]#

Return schema (XSD) contents as a string.

classmethod initialize(dbconn: ~duckdb.duckdb.DuckDBPyConnection = <module 'duckdb' from '/builds/gensys/banff/banff-processor/venv/lib/python3.12/site-packages/duckdb/__init__.py'>) None[source]#

Create duckdb table to store the metadata.

Module contents#

The group of modules defining models to represent Banff Processor metadata files.