
Model

verifia.models.build_from_model_card(model_card_input)

Build a model instance from a model card.

This function accepts either a file path (str) to a YAML model card or the model card itself as a dictionary. The model card must contain at least the following keys:

- name: The model name.
- version: The model version.
- framework: The machine learning framework used.
- type: The model type.
- feature_names: A list of feature names.
- target_name: The target variable name.

Optionally, the model card may contain:

- cat_feature_names: A list of categorical feature names.
- classification_threshold: A threshold for classification tasks.
- description: A description of the model.
- local_dirpath: The local directory path for the model.

Parameters:

Name Type Description Default
model_card_input Union[str, Dict[str, Any]]

Either the file path to the model card YAML file or the model card dictionary itself.

required

Returns:

Name Type Description
BaseModel BaseModel

An instance of a model (a subclass of BaseModel) configured as per the model card.

Raises:

Type Description
FileNotFoundError

If the specified model card file does not exist (when given a file path).

KeyError

If any required key is missing from the model card.

YAMLError

If the YAML file cannot be parsed.

ValueError

If the framework specified in the model card is not supported.
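The key requirements above can be checked before a card is handed to build_from_model_card. The following is a minimal sketch, not verifia's own validation: the helper missing_required_keys and every value in the example card (including the spellings "sklearn" and "classification") are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Keys the documentation above lists as required in a model card.
REQUIRED_KEYS = ("name", "version", "framework", "type", "feature_names", "target_name")


def missing_required_keys(card: Dict[str, Any]) -> List[str]:
    """Return the required keys absent from the card, in documented order."""
    return [key for key in REQUIRED_KEYS if key not in card]


# Illustrative card; every value is a placeholder, not a documented literal.
card = {
    "name": "churn_model",
    "version": "1",
    "framework": "sklearn",         # assumed spelling of the framework value
    "type": "classification",       # assumed spelling of the model type value
    "feature_names": ["age", "tenure", "plan"],
    "target_name": "churned",
    "cat_feature_names": ["plan"],  # optional key
}

missing = missing_required_keys(card)
if missing:
    raise KeyError(f"model card is missing required keys: {missing}")

# With verifia installed, the card could then be passed on:
# from verifia.models import build_from_model_card
# model = build_from_model_card(card)
```

Checking the keys up front gives a single error message naming every missing key, whereas the documented KeyError may only surface the first one encountered.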

verifia.models.SKLearnModel

Bases: BaseModel

A scikit-learn model wrapper that implements the BaseModel interface.

This class provides methods to save, load, and perform predictions using models trained with scikit-learn.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize the SKLearn model wrapper.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

The type of the model (regression or classification).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Threshold for classification tasks.

0.5
description Optional[str]

Optional model description.

None

load_model(path=None, *args, **kwargs)

Load a scikit-learn model from the specified file path using cloudpickle.

If the provided path is a directory, the default model file path is used.

Parameters:

Name Type Description Default
path PathLike

The file path (or directory) from which to load the model.

None
*args

Additional positional arguments for cloudpickle.load.

()
**kwargs

Additional keyword arguments for cloudpickle.load.

{}

Returns:

Name Type Description
BaseModel BaseModel

Self, to allow method chaining.

Raises:

Type Description
Exception

Propagates any exception raised during file reading.
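The directory fallback and the chaining behaviour described for load_model can be sketched with a stand-in class. This is an illustration under stated assumptions, not the SKLearnModel implementation: ToyModel, DEFAULT_FILENAME, and the fallback to local_dirpath when no path is given are hypothetical, and the standard-library pickle module is swapped in for cloudpickle.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


class ToyModel:
    """Stand-in wrapper; not part of verifia."""

    DEFAULT_FILENAME = "model.pkl"  # hypothetical default file name

    def __init__(self, local_dirpath: str) -> None:
        self.local_dirpath = Path(local_dirpath)
        self.model: Any = None

    def _resolve(self, path: Optional[str]) -> Path:
        # Assumption: with no path given, fall back to the wrapper's directory.
        target = Path(path) if path is not None else self.local_dirpath
        # A directory is replaced by the default model file inside it.
        return target / self.DEFAULT_FILENAME if target.is_dir() else target

    def load_model(self, path: Optional[str] = None) -> "ToyModel":
        with open(self._resolve(path), "rb") as handle:
            self.model = pickle.load(handle)
        return self  # returning self is what enables chaining


with tempfile.TemporaryDirectory() as tmp:
    # Write a dummy "model" to the default location, then load it by directory.
    with open(Path(tmp) / ToyModel.DEFAULT_FILENAME, "wb") as handle:
        pickle.dump({"weights": [1, 2, 3]}, handle)
    loaded = ToyModel(tmp).load_model(tmp).model
```

Because load_model returns the wrapper itself, construction, loading, and a follow-up call can be written as one expression.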

predict(data)

Generate predictions for the given input data.

For regression, only predictions are produced. For classification, predictions and probabilities are produced.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

An object containing predictions and probabilities (if applicable).

predict_score(data)

Compute prediction scores for the given input data.

For regression models, returns the raw predictions. For classification models, returns the probability of the positive class.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Type Description
NDArray

Array of prediction scores.
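The relationship between a positive-class score and classification_threshold can be illustrated in plain Python. This is a sketch under assumptions, not the wrapper's code: both helper names are hypothetical, the positive class is assumed to be the last probability column, and the comparison is assumed to be inclusive (>=).

```python
from typing import List, Sequence


def positive_class_scores(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Take the last column of an (n_samples, n_classes) probability table."""
    return [row[-1] for row in probabilities]


def apply_threshold(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a sample 1 when its score reaches the threshold (assumed >=)."""
    return [1 if score >= threshold else 0 for score in scores]


# Three samples, two classes each: [P(class 0), P(class 1)].
proba = [[0.9, 0.1], [0.35, 0.65], [0.5, 0.5]]
scores = positive_class_scores(proba)
labels = apply_threshold(scores, threshold=0.5)
```

Raising the threshold trades recall for precision: with threshold=0.6, only the second sample would be labelled positive.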

save_model(path=None, *args, **kwargs)

Save the scikit-learn model to the specified file path using cloudpickle.

Parameters:

Name Type Description Default
path PathLike

The file path where the model should be saved.

None
*args

Additional positional arguments for cloudpickle.dump.

()
**kwargs

Additional keyword arguments for cloudpickle.dump.

{}

Raises:

Type Description
ValueError

If there is no model instance available to save.

Exception

Propagates any exception raised during file writing.
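The documented ValueError for a missing model instance corresponds to a guard placed before any file is opened. A minimal sketch follows, assuming a free function rather than a method and using the standard-library pickle module in place of cloudpickle; the function and file names are illustrative.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


def save_model(model: Any, path: Path) -> None:
    """Serialize model to path, refusing to write when there is no model."""
    if model is None:
        raise ValueError("no model instance available to save")
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


# Saving a real object writes the file.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model.pkl"
    save_model({"coef": [0.2, 0.8]}, target)
    saved_ok = target.exists()

# Saving None is rejected before any file is created.
try:
    save_model(None, Path("unused.pkl"))
    raised = False
except ValueError:
    raised = True
```

Checking for the model first means a failed save never leaves an empty or partial file behind.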

verifia.models.TFModel

Bases: BaseModel

A TensorFlow/Keras model wrapper that implements the BaseModel interface.

This class provides methods to save, load, and perform predictions with Keras models.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize the TFModel.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

The type of the model (regression or classification).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Threshold for classification tasks.

0.5
description Optional[str]

Optional model description.

None

load_model(path=None, *args, **kwargs)

Load a Keras model from the specified path.

If the provided path is a directory, the default model file path is used.

Parameters:

Name Type Description Default
path PathLike

The file path (or directory) from which to load the model.

None
*args

Additional positional arguments for load_model.

()
**kwargs

Additional keyword arguments for load_model.

{}

Returns:

Name Type Description
BaseModel BaseModel

Self, to allow method chaining.

Raises:

Type Description
Exception

Propagates exceptions raised during loading.

predict(data)

Generate predictions for the given data.

For classification, includes probabilities and mapped labels. For regression, includes only predictions.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

An object containing predictions, probabilities, and labels (if applicable).

Raises:

Type Description
Exception

Propagates exceptions raised during data conversion or prediction.

predict_score(data)

Predict probabilities or regression scores for the given input data.

Assumes a single output for the model.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Type Description
NDArray

Array of prediction scores.

Raises:

Type Description
Exception

Propagates exceptions raised during data conversion or prediction.

save_model(path=None, *args, **kwargs)

Save the Keras model to the specified path.

Parameters:

Name Type Description Default
path PathLike

The file path where the model should be saved.

None
*args

Additional positional arguments for the Keras save method.

()
**kwargs

Additional keyword arguments for the Keras save method.

{}

Raises:

Type Description
Exception

Propagates exceptions raised during saving.

verifia.models.PytorchModel

Bases: BaseModel

A PyTorch model wrapper that implements the BaseModel interface.

Provides methods to save, load, and perform inference with PyTorch models. Expects the underlying model to implement a 'build_floating_tensors' method for preprocessing.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize the PyTorch model wrapper.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

Model type (classification or regression).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Threshold for classification tasks.

0.5
description Optional[str]

Optional model description.

None

load_model(path=None)

Load the PyTorch model from the specified file path.

If the provided path is a directory, the default model file path is used.

Parameters:

Name Type Description Default
path PathLike

The path (or directory) from which to load the model.

None

Returns:

Name Type Description
BaseModel BaseModel

Self, to allow method chaining.

Raises:

Type Description
MissingBuildFloatingTensorsError

If the loaded model does not implement 'build_floating_tensors'.

Exception

Propagates any exception raised during model loading.
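The requirement that a loaded model implement build_floating_tensors amounts to an attribute check after loading. The sketch below is illustrative only: the exception class is a local stand-in for the one named above, ensure_preprocessing_hook is a hypothetical helper, and the method signature shown on WithHook is an assumption.

```python
class MissingBuildFloatingTensorsError(Exception):
    """Local stand-in for the exception named in the documentation."""


def ensure_preprocessing_hook(model: object) -> None:
    """Raise if the model lacks a callable build_floating_tensors attribute."""
    hook = getattr(model, "build_floating_tensors", None)
    if not callable(hook):
        raise MissingBuildFloatingTensorsError(
            f"{type(model).__name__} does not implement build_floating_tensors"
        )


class WithHook:
    def build_floating_tensors(self, rows):
        # Hypothetical signature: convert rows of numbers to floats.
        return [[float(value) for value in row] for row in rows]


class WithoutHook:
    pass


ensure_preprocessing_hook(WithHook())  # passes silently

try:
    ensure_preprocessing_hook(WithoutHook())
    rejected = False
except MissingBuildFloatingTensorsError:
    rejected = True
```

In a real PyTorch module, the method would presumably return tensors rather than nested lists; the list version keeps the example free of third-party imports.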

predict(data)

Generate predictions for the input data.

For regression, only predictions are generated. For classification, generates predictions, probabilities, and maps predictions to labels.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

Object containing predictions and probabilities (if applicable).

Raises:

Type Description
Exception

Propagates any exception raised during prediction.

predict_score(data)

Compute prediction scores for the given input data.

For regression, returns raw predictions. For classification, returns probability scores or class scores as appropriate.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Type Description
NDArray

Array of prediction scores.

Raises:

Type Description
Exception

Propagates any exception raised during prediction.

save_model(path=None)

Save the PyTorch model to the specified file path.

Parameters:

Name Type Description Default
path PathLike

The path (or directory) to save the model.

None

verifia.models.XGBModel

Bases: BaseModel

An XGBoost model wrapper that implements the BaseModel interface for both regression and classification tasks.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize the XGBModel wrapper.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

Type of the model (e.g., classification or regression).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Classification threshold (for classification tasks).

0.5
description Optional[str]

Optional model description.

None

load_model(path=None, *args, **kwargs)

Load an XGBoost model from the specified file path.

If the provided path is a directory, the default model file path is used.

Parameters:

Name Type Description Default
path PathLike

The file path (or directory) from which to load the model.

None
*args

Additional positional arguments.

()
**kwargs

Additional keyword arguments.

{}

Returns:

Name Type Description
BaseModel BaseModel

Self, to allow method chaining.

Raises:

Type Description
Exception

Propagates exceptions raised during model loading.

predict(data)

Generate predictions for the given input data.

For regression tasks, only predictions are produced. For classification tasks, produces predictions, probabilities, and maps predictions to labels.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

An object containing predictions and, for classification, probabilities.

Raises:

Type Description
Exception

Propagates exceptions raised during prediction.

predict_score(data)

Compute prediction scores for the given input data.

For regression models, returns raw predictions. For classification models, returns the probability of the positive class.

Parameters:

Name Type Description Default
data NDArray

Input data array (1D or 2D).

required

Returns:

Type Description
NDArray

Array of prediction scores.

Raises:

Type Description
Exception

Propagates exceptions raised during prediction.

prepare_inputs(X)

Prepare input data by converting the NumPy array to a DataFrame with appropriate dtypes.

This method assumes that the model's feature types are stored in its 'feature_types' attribute, where a type value of "c" indicates a categorical feature.

Parameters:

Name Type Description Default
X NDArray

Input feature data.

required

Returns:

Type Description
DataFrame

Processed data with columns renamed and dtypes set.

Raises:

Type Description
Exception

Propagates exceptions raised during data processing.
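The "c" convention for feature_types can be shown without a DataFrame. This is a sketch of the selection step only, assuming feature_types is a sequence aligned with the feature names; categorical_columns is a hypothetical helper, and the non-"c" type codes in the example are illustrative.

```python
from typing import List, Sequence


def categorical_columns(feature_names: Sequence[str], feature_types: Sequence[str]) -> List[str]:
    """Return the names whose type code is "c" (categorical)."""
    if len(feature_names) != len(feature_types):
        raise ValueError("feature_names and feature_types must have the same length")
    return [name for name, kind in zip(feature_names, feature_types) if kind == "c"]


names = ["age", "plan", "tenure", "region"]
types = ["int", "c", "float", "c"]  # "c" marks categorical; other codes are illustrative
cat_cols = categorical_columns(names, types)
```

The resulting names are the columns that a DataFrame-based implementation would cast to a categorical dtype before prediction.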

save_model(path=None, *args, **kwargs)

Save the XGBoost model to the specified file path.

Parameters:

Name Type Description Default
path PathLike

The file path where the model should be saved.

None
*args

Additional positional arguments for the XGBoost save_model method.

()
**kwargs

Additional keyword arguments for the XGBoost save_model method.

{}

Raises:

Type Description
Exception

Propagates exceptions raised during model saving.

verifia.models.LGBModel

Bases: BaseModel

A LightGBM model wrapper that extends BaseModel for regression and classification tasks.

This class provides methods to save, load, and make predictions using LightGBM models.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize a LightGBM model wrapper.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

Type of the model (classification or regression).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Threshold for classification tasks.

0.5
description Optional[str]

Optional model description.

None

predict(data)

Generate predictions for the input data.

For regression tasks, only predictions are produced. For classification tasks, both predictions and probabilities are produced.

Parameters:

Name Type Description Default
data NDArray

Input feature data as a NumPy array (1D or 2D).

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

An object containing predictions and probabilities (if applicable).

predict_score(data)

Compute prediction scores for the input data.

For regression, returns raw predictions. For classification, returns the probability of the positive class.

Parameters:

Name Type Description Default
data NDArray

Input feature data as a NumPy array (1D or 2D).

required

Returns:

Type Description
NDArray

A NumPy array of prediction scores.

prepare_inputs(X)

Prepare and convert the input NumPy array into a Pandas DataFrame with proper dtypes.

Parameters:

Name Type Description Default
X NDArray

Input feature data as a NumPy array.

required

Returns:

Type Description
DataFrame

DataFrame with column names matching the model's features and appropriate types.

save_model(path=None, *args, **kwargs)

Save the LightGBM model to the specified file path.

Parameters:

Name Type Description Default
path PathLike

The file path (or directory) where the model should be saved.

None
*args

Additional positional arguments for LightGBM's save_model.

()
**kwargs

Additional keyword arguments for LightGBM's save_model.

{}

Raises:

Type Description
ValueError

If no model instance is available.

Exception

Propagates exceptions raised during model saving.

verifia.models.CBModel

Bases: BaseModel

A CatBoost model wrapper that extends the BaseModel interface.

Provides methods to save, load, and make predictions using CatBoost models. Depending on the model type (classification or regression), the appropriate CatBoost estimator is used.

__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)

Initialize a CatBoost model wrapper.

Parameters:

Name Type Description Default
name str

The model name.

required
version str

The model version.

required
model_type SupportedModelTypes

Type of the model (classification or regression).

required
feature_names Iterable

Iterable of feature names.

required
target_name str

The target variable name.

required
cat_feature_names Optional[Iterable]

Optional iterable of categorical feature names.

None
classification_threshold Optional[float]

Threshold for classification decisions.

0.5
description Optional[str]

Optional model description.

None

load_model(path=None, *args, **kwargs)

Load the CatBoost model from the specified file path.

If the provided path is a directory, the default model file path (constructed by default_model_filepath) is used.

Parameters:

Name Type Description Default
path PathLike

Path (or directory) from which to load the model.

None
*args

Additional positional arguments.

()
**kwargs

Additional keyword arguments.

{}

Returns:

Name Type Description
BaseModel BaseModel

Self, to allow method chaining.

Raises:

Type Description
ValueError

If the model type is unsupported.

Exception

Propagates any exception raised by the underlying load_model method.
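Choosing an estimator by model type, with a ValueError for anything else, can be sketched as a lookup table. This illustration makes several assumptions: ModelType is a hypothetical stand-in for SupportedModelTypes, estimator_name is a hypothetical helper, and the estimators are referenced by name (CatBoostClassifier and CatBoostRegressor, the two estimator classes in the catboost package) so the example runs without catboost installed.

```python
from enum import Enum


class ModelType(Enum):
    """Hypothetical stand-in for SupportedModelTypes."""

    CLASSIFICATION = "classification"
    REGRESSION = "regression"


# Estimator class names from the catboost package, keyed by model type.
ESTIMATOR_BY_TYPE = {
    ModelType.CLASSIFICATION: "CatBoostClassifier",
    ModelType.REGRESSION: "CatBoostRegressor",
}


def estimator_name(model_type: object) -> str:
    """Look up the estimator for a model type, rejecting unknown types."""
    try:
        return ESTIMATOR_BY_TYPE[model_type]
    except (KeyError, TypeError):
        # TypeError covers unhashable inputs such as lists.
        raise ValueError(f"unsupported model type: {model_type!r}") from None


chosen = estimator_name(ModelType.REGRESSION)

try:
    estimator_name("ranking")
    unsupported_rejected = False
except ValueError:
    unsupported_rejected = True
```

A table lookup keeps the supported types in one place, so adding a type means adding one entry rather than another branch.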

predict(data)

Generate predictions and, if applicable, probabilities for the given data.

For regression, only predictions are populated. For classification, both predictions and probabilities are populated.

Parameters:

Name Type Description Default
data NDArray

The input data as a NumPy array. Can be 1D or 2D.

required

Returns:

Name Type Description
ModelOutputs ModelOutputs

An instance containing predictions and probabilities (if classification).

Raises:

Type Description
Exception

Propagates any exception raised during prediction.

predict_score(data)

Compute prediction scores for the given data.

For regression, the raw predictions are returned. For classification, the probability of the positive class is returned.

Parameters:

Name Type Description Default
data NDArray

The input data as a NumPy array. Can be 1D or 2D.

required

Returns:

Type Description
Union[NDArray, float]

A NumPy array of prediction scores, or a single score if one sample is provided.

Raises:

Type Description
Exception

Propagates any exception raised during prediction.
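Because the return value may be either an array or a single float, calling code may want to normalize it before iterating. A minimal sketch, assuming a hypothetical helper scores_as_list that is not part of verifia:

```python
from numbers import Real
from typing import Iterable, List, Union


def scores_as_list(scores: Union[Real, Iterable[Real]]) -> List[float]:
    """Wrap a lone score in a list; convert any iterable of scores to floats."""
    if isinstance(scores, Real):
        return [float(scores)]
    return [float(score) for score in scores]


single = scores_as_list(0.7)        # one sample: a bare float comes back
batch = scores_as_list([0.1, 0.9])  # several samples: an array-like comes back
```

Normalizing at the call site lets downstream code treat the one-sample and many-sample cases identically.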

save_model(path=None, *args, **kwargs)

Save the CatBoost model to the specified file path.

Parameters:

Name Type Description Default
path PathLike

The file path where the model should be saved.

None
*args

Additional positional arguments.

()
**kwargs

Additional keyword arguments.

{}

Raises:

Type Description
ValueError

If the model attribute is not set.

Exception

Propagates any exception raised by the underlying save_model method.