Model

`verifia.models.build_from_model_card(model_card_input)`

Build a model instance from a model card.

This function accepts either the path (`str`) to a YAML model card file or the model card itself as a dictionary. The model card must contain at least the following keys:

- name: The model name.
- version: The model version.
- framework: The machine learning framework used.
- type: The model type.
- feature_names: A list of feature names.
- target_name: The target variable name.

Optionally, the model card may contain:

- cat_feature_names: A list of categorical feature names.
- classification_threshold: A threshold for classification tasks.
- description: A description of the model.
- local_dirpath: The local directory path for the model.
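The key lists above can be checked before a card is handed to the builder. The following is a minimal sketch, not part of verifia: `missing_required_keys` is a hypothetical helper, and every value in `example_card` (including the `framework` and `type` strings) is a placeholder rather than a documented accepted value.

```python
from typing import Any, Dict, List

# Required keys, copied from the list above.
REQUIRED_KEYS = (
    "name",
    "version",
    "framework",
    "type",
    "feature_names",
    "target_name",
)


def missing_required_keys(card: Dict[str, Any]) -> List[str]:
    """Return the required keys absent from the card, in the listed order."""
    return [key for key in REQUIRED_KEYS if key not in card]


# Hypothetical card: all values are placeholders chosen for illustration.
example_card = {
    "name": "credit_scorer",
    "version": "1",
    "framework": "sklearn",
    "type": "classification",
    "feature_names": ["age", "income", "region"],
    "target_name": "default",
    # Optional keys
    "cat_feature_names": ["region"],
    "classification_threshold": 0.5,
}

print(missing_required_keys(example_card))
print(missing_required_keys({"name": "credit_scorer"}))
```

An empty list means the card carries every required key; the optional keys are never reported.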

Parameters:

Name	Type	Description	Default
`model_card_input`	`Union[str, Dict[str, Any]]`	Either the file path to the model card YAML file or the model card dictionary itself.	required

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	An instance of a model (a subclass of BaseModel) configured as per the model card.

Raises:

Type	Description
`FileNotFoundError`	If the specified model card file does not exist (when given a file path).
`KeyError`	If any required key is missing from the model card.
`YAMLError`	If the YAML file cannot be parsed.
`ValueError`	If the framework specified in the model card is not supported.
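One way to handle the exceptions listed above is to wrap the call and turn each failure into a message. This is a sketch under assumptions: `try_build` is a hypothetical helper that takes the builder as an argument so the block runs without verifia installed; in real use the builder would be `verifia.models.build_from_model_card`. The YAML parsing error is not caught here because its class lives in the YAML library, not the standard library.

```python
from typing import Any, Callable, Optional, Tuple


def try_build(
    builder: Callable[[Any], Any], model_card_input: Any
) -> Tuple[Optional[Any], Optional[str]]:
    """Call the builder and return (model, None) or (None, error message)."""
    try:
        return builder(model_card_input), None
    except FileNotFoundError as exc:
        return None, f"model card file not found: {exc}"
    except KeyError as exc:
        return None, f"model card is missing a required key: {exc}"
    except ValueError as exc:
        return None, f"unsupported framework: {exc}"


def fake_builder(card: dict) -> str:
    """Stand-in builder for the demo; it only checks for the 'name' key."""
    return f"model:{card['name']}"


print(try_build(fake_builder, {"name": "credit_scorer"}))
print(try_build(fake_builder, {}))
```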

`verifia.models.SKLearnModel`

Bases: BaseModel

A scikit-learn model wrapper that implements the BaseModel interface.

This class provides methods to save, load, and perform predictions using models trained with scikit-learn.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize the SKLearn model wrapper.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	The type of the model (regression or classification).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	The target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Threshold for classification tasks.	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`load_model(path=None, *args, **kwargs)`

Load a scikit-learn model from the specified file path using cloudpickle.

If the provided path is a directory, the default model file path is used.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path (or directory) from which to load the model.	`None`
`*args`		Additional positional arguments for cloudpickle.load.	`()`
`**kwargs`		Additional keyword arguments for cloudpickle.load.	`{}`

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	Self, to allow method chaining.

Raises:

Type	Description
`Exception`	Propagates any exception raised during file reading.
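The directory rule described above can be pictured as a small path-resolution step. This is a sketch, not verifia's code: `resolve_model_path` and the `default_filename` argument are hypothetical, and the real default file name is not stated in this reference.

```python
import tempfile
from pathlib import Path
from typing import Union


def resolve_model_path(path: Union[str, Path], default_filename: str) -> Path:
    """Return path unchanged if it is a file path, else path/default_filename."""
    candidate = Path(path)
    if candidate.is_dir():
        # A directory was given: fall back to the default file inside it.
        return candidate / default_filename
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    from_dir = resolve_model_path(tmp, "model.pkl")
    from_file = resolve_model_path(Path(tmp) / "custom.pkl", "model.pkl")
    print(from_dir.name, from_file.name)
```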

`predict(data)`

Generate predictions for the given input data.

For regression, only predictions are produced. For classification, predictions and probabilities are produced.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	An object containing predictions and probabilities (if applicable).
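Since `data` may be 1D or 2D, a wrapper typically normalizes the input to a batch of rows before calling the underlying estimator. The sketch below uses plain lists so it runs without NumPy; with NumPy arrays, `np.atleast_2d` plays the same role. Treating a 1D input as a single sample is an assumption, and `as_batch` is a hypothetical helper.

```python
from typing import List, Sequence, Union

Sample = Sequence[float]


def as_batch(data: Union[Sample, Sequence[Sample]]) -> List[List[float]]:
    """Wrap a single 1D sample into a one-row batch; copy 2D input as rows."""
    if len(data) > 0 and isinstance(data[0], (list, tuple)):
        return [list(row) for row in data]
    # Assumption: a flat sequence is one sample, not a column of samples.
    return [list(data)]


print(as_batch([1.0, 2.0, 3.0]))
print(as_batch([[1.0, 2.0], [3.0, 4.0]]))
```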

`predict_score(data)`

Compute prediction scores for the given input data.

For regression models, returns the raw predictions. For classification models, returns the probability of the positive class.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Type	Description
`NDArray`	Array of prediction scores.
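For a binary classifier, the relationship between the positive-class score and `classification_threshold` can be sketched as follows. Both helpers are hypothetical, and two details are assumptions: the positive class is taken to be the second probability column, and a score equal to the threshold is counted as positive.

```python
from typing import List, Sequence


def positive_class_scores(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Pick the positive-class probability (assumed to be column 1) per row."""
    return [row[1] for row in probabilities]


def apply_threshold(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map scores to 0/1 labels; a score >= threshold is labeled 1 (assumption)."""
    return [1 if score >= threshold else 0 for score in scores]


probabilities = [[0.9, 0.1], [0.3, 0.7], [0.5, 0.5]]
scores = positive_class_scores(probabilities)
print(scores)
print(apply_threshold(scores))
print(apply_threshold(scores, threshold=0.6))
```

Raising the threshold trades recall for precision: the third sample flips from positive to negative at 0.6.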

`save_model(path=None, *args, **kwargs)`

Save the scikit-learn model to the specified file path using cloudpickle.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path where the model should be saved.	`None`
`*args`		Additional positional arguments for cloudpickle.dump.	`()`
`**kwargs`		Additional keyword arguments for cloudpickle.dump.	`{}`

Raises:

Type	Description
`ValueError`	If there is no model instance available to save.
`Exception`	Propagates any exception raised during file writing.
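The save and load contract described for this wrapper (a `ValueError` when there is nothing to save, and `load_model` returning `self` for chaining) can be illustrated with a toy class. This sketch swaps in the standard-library `pickle` module in place of cloudpickle so it runs without extra dependencies; `PickledModelSketch` is hypothetical and is not the `SKLearnModel` implementation.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


class PickledModelSketch:
    """Toy wrapper that mirrors the documented save/load behavior."""

    def __init__(self, model: Optional[Any] = None) -> None:
        self.model = model

    def save_model(self, path: Path) -> None:
        if self.model is None:
            raise ValueError("no model instance available to save")
        with open(path, "wb") as fh:
            pickle.dump(self.model, fh)

    def load_model(self, path: Path) -> "PickledModelSketch":
        with open(path, "rb") as fh:
            self.model = pickle.load(fh)
        return self  # returning self allows method chaining


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model.pkl"
    PickledModelSketch({"coef": [0.5, -1.2]}).save_model(target)
    # Chained call: load, then read the attribute in one expression.
    restored = PickledModelSketch().load_model(target).model

print(restored)
```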

`verifia.models.TFModel`

Bases: BaseModel

A TensorFlow/Keras model wrapper that implements the BaseModel interface.

This class provides methods to save, load, and perform predictions with Keras models.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize the TFModel.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	The type of the model (regression or classification).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	The target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Threshold for classification tasks.	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`load_model(path=None, *args, **kwargs)`

Load a Keras model from the specified path.

If the provided path is a directory, the default model file path is used.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path (or directory) from which to load the model.	`None`
`*args`		Additional positional arguments for load_model.	`()`
`**kwargs`		Additional keyword arguments for load_model.	`{}`

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	Self, to allow method chaining.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during loading.

`predict(data)`

Generate predictions for the given data.

For classification, the output includes probabilities and mapped labels. For regression, it includes only predictions.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	An object containing predictions, probabilities, and labels (if applicable).

Raises:

Type	Description
`Exception`	Propagates exceptions raised during data conversion or prediction.

`predict_score(data)`

Predict probabilities or regression scores for the given input data.

Assumes a single output for the model.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Type	Description
`NDArray`	Array of prediction scores.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during data conversion or prediction.

`save_model(path=None, *args, **kwargs)`

Save the Keras model to the specified path.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path where the model should be saved.	`None`
`*args`		Additional positional arguments for the Keras save method.	`()`
`**kwargs`		Additional keyword arguments for the Keras save method.	`{}`

Raises:

Type	Description
`Exception`	Propagates exceptions raised during saving.

`verifia.models.PytorchModel`

Bases: BaseModel

A PyTorch model wrapper that implements the BaseModel interface.

Provides methods to save, load, and perform inference with PyTorch models. Expects the underlying model to implement a `build_floating_tensors` method for preprocessing.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize the PyTorch model wrapper.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	Model type (classification or regression).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	The target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Threshold for classification tasks.	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`load_model(path=None)`

Load the PyTorch model from the specified file path.

If the provided path is a directory, the default model file path is used.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The path (or directory) from which to load the model.	`None`

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	Self, to allow method chaining.

Raises:

Type	Description
`MissingBuildFloatingTensorsError`	If the loaded model does not implement `build_floating_tensors`.
`Exception`	Propagates any exception raised during model loading.
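The requirement that a loaded model expose `build_floating_tensors` amounts to an attribute check at load time. The sketch below is an illustration only: the exception class is a local stand-in that reuses the documented name, `ensure_build_floating_tensors` is hypothetical, and the one-argument signature given to `build_floating_tensors` is an assumption.

```python
from typing import Any


class MissingBuildFloatingTensorsError(Exception):
    """Local stand-in for the error named in the table above."""


def ensure_build_floating_tensors(model: Any) -> Any:
    """Return the model if it has a callable build_floating_tensors, else raise."""
    if not callable(getattr(model, "build_floating_tensors", None)):
        raise MissingBuildFloatingTensorsError(
            f"{type(model).__name__} does not implement build_floating_tensors"
        )
    return model


class WithHook:
    def build_floating_tensors(self, data):
        # Assumed signature: takes raw input and returns model-ready tensors.
        return data


class WithoutHook:
    pass


print(type(ensure_build_floating_tensors(WithHook())).__name__)
try:
    ensure_build_floating_tensors(WithoutHook())
except MissingBuildFloatingTensorsError as exc:
    print(exc)
```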

`predict(data)`

Generate predictions for the input data.

For regression, only predictions are generated. For classification, generates predictions, probabilities, and maps predictions to labels.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	Object containing predictions and probabilities (if applicable).

Raises:

Type	Description
`Exception`	Propagates any exception raised during prediction.

`predict_score(data)`

Compute prediction scores for the given input data.

For regression, returns raw predictions. For classification, returns probability scores or class scores as appropriate.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Type	Description
`NDArray`	Array of prediction scores.

Raises:

Type	Description
`Exception`	Propagates any exception raised during prediction.

`save_model(path=None)`

Save the PyTorch model to the specified file path.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The path (or directory) where the model should be saved.	`None`

`verifia.models.XGBModel`

Bases: BaseModel

An XGBoost model wrapper that implements the BaseModel interface for both regression and classification tasks.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize the XGBModel wrapper.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	Type of the model (e.g., classification or regression).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	The target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Classification threshold (for classification tasks).	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`load_model(path=None, *args, **kwargs)`

Load an XGBoost model from the specified file path.

If the provided path is a directory, the default model filepath is used.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path (or directory) from which to load the model.	`None`
`*args`		Additional positional arguments.	`()`
`**kwargs`		Additional keyword arguments.	`{}`

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	Self, to allow method chaining.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during model loading.

`predict(data)`

Generate predictions for the given input data.

For regression tasks, only predictions are produced. For classification tasks, produces predictions, probabilities, and maps predictions to labels.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	An object containing predictions and, for classification, probabilities.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during prediction.

`predict_score(data)`

Compute prediction scores for the given input data.

For regression models, returns raw predictions. For classification models, returns the probability of the positive class.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input data array (1D or 2D).	required

Returns:

Type	Description
`NDArray`	Array of prediction scores.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during prediction.

`prepare_inputs(X)`

Prepare input data by converting the NumPy array to a DataFrame with appropriate dtypes.

This method assumes that the model's feature types are stored in its `feature_types` attribute, where a type value of `"c"` indicates a categorical feature.

Parameters:

Name	Type	Description	Default
`X`	`NDArray`	Input feature data.	required

Returns:

Type	Description
`DataFrame`	Processed data with columns renamed and dtypes set.

Raises:

Type	Description
`Exception`	Propagates exceptions raised during data processing.
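The `"c"` convention described above can be sketched as a mapping from feature name to target dtype. This is an illustration in plain Python, not the `prepare_inputs` implementation: `plan_column_dtypes` is hypothetical, and mapping every non-categorical feature to `"float"` is an assumption. With pandas, a mapping of this shape could be passed to `DataFrame.astype`.

```python
from typing import Dict, Sequence


def plan_column_dtypes(
    feature_names: Sequence[str], feature_types: Sequence[str]
) -> Dict[str, str]:
    """Map each feature to 'category' when its type is 'c', else 'float'."""
    if len(feature_names) != len(feature_types):
        raise ValueError("feature_names and feature_types must have the same length")
    return {
        name: "category" if ftype == "c" else "float"
        for name, ftype in zip(feature_names, feature_types)
    }


# Hypothetical features; "q" stands for a non-categorical type in this example.
plan = plan_column_dtypes(["age", "income", "region"], ["q", "q", "c"])
print(plan)
```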

`save_model(path=None, *args, **kwargs)`

Save the XGBoost model to the specified file path.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path where the model should be saved.	`None`
`*args`		Additional positional arguments for the XGBoost save_model method.	`()`
`**kwargs`		Additional keyword arguments for the XGBoost save_model method.	`{}`

Raises:

Type	Description
`Exception`	Propagates exceptions raised during model saving.

`verifia.models.LGBModel`

Bases: BaseModel

A LightGBM model wrapper that extends BaseModel for regression and classification tasks.

This class provides methods to save, load, and make predictions using LightGBM models.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize a LightGBM model wrapper.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	Type of the model (classification or regression).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	The target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Threshold for classification tasks.	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`predict(data)`

Generate predictions for the input data.

For regression tasks, only predictions are produced. For classification tasks, both predictions and probabilities are produced.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input feature data as a NumPy array (1D or 2D).	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	An object containing predictions and probabilities (if applicable).

`predict_score(data)`

Compute prediction scores for the input data.

For regression, returns raw predictions. For classification, returns the probability of the positive class.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	Input feature data as a NumPy array (1D or 2D).	required

Returns:

Type	Description
`NDArray`	A NumPy array of prediction scores.

`prepare_inputs(X)`

Prepare and convert the input NumPy array into a Pandas DataFrame with proper dtypes.

Parameters:

Name	Type	Description	Default
`X`	`NDArray`	Input feature data as a NumPy array.	required

Returns:

Type	Description
`DataFrame`	DataFrame with column names matching the model's features and appropriate types.

`save_model(path=None, *args, **kwargs)`

Save the LightGBM model to the specified file path.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path (or directory) where the model should be saved.	`None`
`*args`		Additional positional arguments for LightGBM's save_model.	`()`
`**kwargs`		Additional keyword arguments for LightGBM's save_model.	`{}`

Raises:

Type	Description
`ValueError`	If no model instance is available.
`Exception`	Propagates exceptions raised during model saving.

`verifia.models.CBModel`

Bases: BaseModel

A CatBoost model wrapper that extends the BaseModel interface.

Provides methods to save, load, and make predictions using CatBoost models. Depending on the model type (classification or regression), the appropriate CatBoost estimator is used.

`__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)`

Initialize a CatBoost model wrapper.

Parameters:

Name	Type	Description	Default
`name`	`str`	The model name.	required
`version`	`str`	The model version.	required
`model_type`	`SupportedModelTypes`	Type of the model (classification or regression).	required
`feature_names`	`Iterable`	Iterable of feature names.	required
`target_name`	`str`	Target variable name.	required
`local_dirpath`		The local directory path for the model.	required
`cat_feature_names`	`Optional[Iterable]`	Optional iterable of categorical feature names.	`None`
`classification_threshold`	`Optional[float]`	Threshold for classification decisions.	`0.5`
`description`	`Optional[str]`	Optional model description.	`None`

`load_model(path=None, *args, **kwargs)`

Load the CatBoost model from the specified file path.

If the provided path is a directory, the default model file path (constructed by default_model_filepath) is used.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	Path (or directory) from which to load the model.	`None`
`*args`		Additional positional arguments.	`()`
`**kwargs`		Additional keyword arguments.	`{}`

Returns:

Name	Type	Description
`BaseModel`	`BaseModel`	Self, to allow method chaining.

Raises:

Type	Description
`ValueError`	If the model type is unsupported.
`Exception`	Propagates any exception raised by the underlying load_model method.
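The choice of estimator by model type, together with the `ValueError` for unsupported types, can be sketched as a lookup table. This is an assumption-laden illustration: plain strings stand in for `SupportedModelTypes` members, the function returns the CatBoost class name rather than importing the library, and `estimator_name_for` is hypothetical.

```python
from typing import Dict

# CatBoost's public estimator classes, keyed by a stand-in model type string.
ESTIMATOR_BY_TYPE: Dict[str, str] = {
    "classification": "CatBoostClassifier",
    "regression": "CatBoostRegressor",
}


def estimator_name_for(model_type: str) -> str:
    """Return the estimator class name for a model type, or raise ValueError."""
    try:
        return ESTIMATOR_BY_TYPE[model_type]
    except KeyError:
        raise ValueError(f"unsupported model type: {model_type!r}") from None


print(estimator_name_for("classification"))
print(estimator_name_for("regression"))
```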

`predict(data)`

Generate predictions and, if applicable, probabilities for the given data.

For regression, only predictions are populated. For classification, both predictions and probabilities are populated.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	The input data as a NumPy array. Can be 1D or 2D.	required

Returns:

Name	Type	Description
`ModelOutputs`	`ModelOutputs`	An instance containing predictions and probabilities (if classification).

Raises:

Type	Description
`Exception`	Propagates any exception raised during prediction.

`predict_score(data)`

Compute prediction scores for the given data.

For regression, the raw predictions are returned. For classification, the probability of the positive class is returned.

Parameters:

Name	Type	Description	Default
`data`	`NDArray`	The input data as a NumPy array. Can be 1D or 2D.	required

Returns:

Type	Description
`Union[NDArray, float]`	A NumPy array of prediction scores, or a single score if one sample is provided.

Raises:

Type	Description
`Exception`	Propagates any exception raised during prediction.
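Because the return value may be either an array or a single float, callers that always want a sequence can normalize it. A minimal sketch with a hypothetical helper, written against plain Python values so it runs without NumPy; for NumPy results, `np.atleast_1d` serves the same purpose.

```python
from numbers import Real
from typing import Iterable, List, Union


def as_score_list(scores: Union[Real, Iterable[Real]]) -> List[float]:
    """Return scores as a list of floats, wrapping a lone scalar in a list."""
    if isinstance(scores, Real):
        return [float(scores)]
    return [float(score) for score in scores]


print(as_score_list(0.8))
print(as_score_list([0.1, 0.9]))
```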

`save_model(path=None, *args, **kwargs)`

Save the CatBoost model to the specified file path.

Parameters:

Name	Type	Description	Default
`path`	`PathLike`	The file path where the model should be saved.	`None`
`*args`		Additional positional arguments.	`()`
`**kwargs`		Additional keyword arguments.	`{}`

Raises:

Type	Description
`ValueError`	If the model attribute is not set.
`Exception`	Propagates any exception raised by the underlying save_model method.
