banffprocessor.util package#

Submodules#

banffprocessor.util.case_insensitive_enum_meta module#

Metaclass for a case-insensitive version of an Enum.

class banffprocessor.util.case_insensitive_enum_meta.CaseInsensitiveEnumMeta(cls, bases, classdict, *, boundary=None, _simple=False, **kwds)[source]#

Bases: EnumType

Metaclass for a case-insensitive version of an Enum.
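How the metaclass achieves case-insensitivity is not described here. As a minimal sketch of the general technique, assuming that member lookup by name is what gets relaxed, a metaclass can override `__getitem__` and compare casefolded names. The names `CaseInsensitiveMeta` and `Colour` below are hypothetical and are not part of banffprocessor.

```python
from enum import Enum, EnumMeta


class CaseInsensitiveMeta(EnumMeta):
    """Hypothetical metaclass: look up enum members by name, ignoring case."""

    def __getitem__(cls, name):
        if isinstance(name, str):
            folded = name.casefold()
            # Compare casefolded member names so "red" and "RED" both match.
            for member_name, member in cls.__members__.items():
                if member_name.casefold() == folded:
                    return member
        # Fall back to the standard lookup, which raises KeyError if absent.
        return super().__getitem__(name)


class Colour(Enum, metaclass=CaseInsensitiveMeta):
    RED = 1
    GREEN = 2
```

With this sketch, `Colour["red"]` and `Colour["RED"]` resolve to the same member, while an unknown name still raises `KeyError`.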

banffprocessor.util.dataset module#

Model for a Banff Processor working dataset.

class banffprocessor.util.dataset.Dataset(name: str, ds: Table, dbconn: DuckDBPyConnection | None = None)[source]#

Bases: object

Model for a Banff Processor working dataset.

property dbconn: DuckDBPyConnection | None#

Return a connection to the database being used.

property ds: Table#

Return the dataset as an Arrow Table.

property ds_curr_output: Table#

Return the dataset output by the current proc as an Arrow table.

property ds_filtered: Table#

Return the filtered version of the dataset as an Arrow table.

name: str#

register_table() None[source]#

Register ds in the in-memory DuckDB instance under name, or under the alias of name if one exists.

unregister_table() None[source]#

Unregister the dataset registered in the in-memory DuckDB instance under name, or under the alias of name if one exists.

banffprocessor.util.dataset.add_single_value_column(table: Table, column_name: str, value: Any, dtype: DataType | None = None) Table[source]#

Add a new column to table where every row of the column contains the same value and the column is the same length as table.

Parameters:
  • table (pa.Table) – The table to append the new column to

  • column_name (str) – The name for the new column

  • value (Any) – The value to use for each row of the new column

  • dtype (pa.DataType, optional) – The PyArrow DataType to use for the column, defaults to None

Returns:

table with the new column appended to it

Return type:

pa.Table

banffprocessor.util.dataset.copy_table(to_copy: Table) Table[source]#

Return a copy of a PyArrow Table.

Parameters:

to_copy (pa.Table) – The table to make a copy of

Returns:

A new table containing the data and metadata of to_copy

Return type:

pa.Table

banffprocessor.util.dataset.get_dataset_name_alias(name: str) str | None[source]#

Get the alias name of the given dataset name.

Casefolds name and returns its alias, if one exists. If no alias exists, None is returned.

Parameters:

name (str) – The name of the dataset to get the aliased name of

Returns:

The alias of name, or None if no alias exists

Return type:

str | None

banffprocessor.util.dataset.get_dataset_real_name(name: str) str[source]#

Get the real name of the given dataset name.

Casefolds name and returns the real name of the dataset if name is an alias. If name is not an alias, the casefolded name is returned.

Parameters:

name (str) – The name of the dataset to get the proper name of

Returns:

name casefolded, or the proper dataset name if name is an alias

Return type:

str
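The relationship between a dataset name and its alias, as described for get_dataset_name_alias and get_dataset_real_name, can be illustrated with a small lookup table. This is only a sketch: the mapping `ALIAS_TO_REAL`, the dataset names in it, and the helper names `real_name` and `name_alias` are hypothetical, and the actual module may store its aliases differently.

```python
from typing import Optional

# Hypothetical alias table: casefolded alias -> casefolded real dataset name.
ALIAS_TO_REAL = {"sd": "survey_data"}
# Reverse view of the same table: real dataset name -> alias.
REAL_TO_ALIAS = {real: alias for alias, real in ALIAS_TO_REAL.items()}


def real_name(name: str) -> str:
    """Return the real name if name is an alias, else the casefolded name."""
    folded = name.casefold()
    return ALIAS_TO_REAL.get(folded, folded)


def name_alias(name: str) -> Optional[str]:
    """Return the alias of name, or None if no alias exists."""
    return REAL_TO_ALIAS.get(name.casefold())
```

Casefolding before the lookup is what makes both directions insensitive to the case of the caller's input.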

banffprocessor.util.dataset.table_empty(table: Table) bool[source]#

Determine if table is empty (has neither rows nor columns).

Parameters:

table (pa.Table) – The table to check

Returns:

True if the table’s shape is (0, 0), False otherwise

Return type:

bool

banffprocessor.util.metadata_excel_to_xml module#

A program to convert a Banff Excel spreadsheet to XML files.

Statistics Canada 2024 K Williamson

banffprocessor.util.metadata_excel_to_xml.convert_excel_to_xml(in_file: str, out_dir: str | None = None) None[source]#

Create XML files from an Excel file that was created using the Banff Processor Metadata Template.

To facilitate the creation of Banff Processor metadata, users can create the metadata in Excel using the Banff Processor metadata template. This function reads the Excel file indicated by the in_file parameter and writes XML files to the directory indicated by out_dir.

banffprocessor.util.metadata_excel_to_xml.get_args(args: list | str | None = None) ArgumentParser[source]#

Create an argument parser.

Example args -> ["my_filename.xlsx", "-o", "/my/out/folder", "-l", "fr"]
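To show how an argument list of this shape can be parsed, here is a minimal `argparse` sketch. It assumes that `-o` names an output directory and `-l` a language code, which the example suggests but this page does not confirm. The function name `build_parser` and the destination names `filename`, `outdir` and `lang` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser accepting the example argument list."""
    parser = argparse.ArgumentParser(
        description="Convert a metadata workbook to XML (sketch)."
    )
    # Positional argument: the Excel file to read.
    parser.add_argument("filename", help="Path to the Excel file")
    # Assumed meaning: directory in which to write the XML files.
    parser.add_argument("-o", dest="outdir", default=None, help="Output directory")
    # Assumed meaning: language code such as "fr".
    parser.add_argument("-l", dest="lang", default=None, help="Language code")
    return parser


parsed = build_parser().parse_args(
    ["my_filename.xlsx", "-o", "/my/out/folder", "-l", "fr"]
)
```

Passing the list explicitly, rather than relying on `sys.argv`, is what makes a parser like this straightforward to exercise in tests.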

banffprocessor.util.metadata_excel_to_xml.init() None[source]#

Call the main function.

Used when running this module from the command line. Created to facilitate testing.

banffprocessor.util.metadata_excel_to_xml.main(iargs: list | str | None = None) None[source]#

Call the convert_excel_to_xml function.

Used when running this module from the command line. Created to facilitate testing.

Module contents#