Model
verifia.models.build_from_model_card(model_card_input)
Build a model instance from a model card.
This function accepts either a file path (str) to a YAML model card or a model card directly provided as a dictionary. The model card must contain at least the following keys:
- name: The model name.
- version: The model version.
- framework: The machine learning framework used.
- type: The model type.
- feature_names: A list of feature names.
- target_name: The target variable name.
Optionally, the model card may contain:
- cat_feature_names: A list of categorical feature names.
- classification_threshold: A threshold for classification tasks.
- description: A description of the model.
- local_dirpath: The local directory path for the model.
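The key requirements above can be checked before a dictionary is handed to build_from_model_card. The following is a minimal sketch, not the library's own validation: the helper missing_required_keys and every value in the example card (including the framework and type strings) are hypothetical placeholders.

```python
from typing import Any, Dict, List

# Required keys, copied from the list above.
REQUIRED_KEYS = (
    "name",
    "version",
    "framework",
    "type",
    "feature_names",
    "target_name",
)


def missing_required_keys(card: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent from a model card dictionary."""
    return [key for key in REQUIRED_KEYS if key not in card]


# Hypothetical model card; all values are illustrative placeholders.
card = {
    "name": "churn_model",
    "version": "1",
    "framework": "sklearn",
    "type": "classification",
    "feature_names": ["age", "tenure", "plan"],
    "target_name": "churned",
    "cat_feature_names": ["plan"],    # optional key
    "classification_threshold": 0.5,  # optional key
}

print(missing_required_keys(card))  # prints []
```

A non-empty result corresponds to the case in which the documented KeyError would be expected.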
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_card_input | Union[str, Dict[str, Any]] | Either the file path to the model card YAML file or the model card dictionary itself. | required |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | An instance of a model (a subclass of BaseModel) configured as per the model card. |
Raises:
Type | Description |
---|---|
FileNotFoundError | If the specified model card file does not exist (when given a file path). |
KeyError | If any required key is missing from the model card. |
YAMLError | If the YAML file cannot be parsed. |
ValueError | If the framework specified in the model card is not supported. |
verifia.models.SKLearnModel
Bases: BaseModel
A scikit-learn model wrapper that implements the BaseModel interface.
This class provides methods to save, load, and perform predictions using models trained with scikit-learn.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize the SKLearn model wrapper.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | The type of the model (regression or classification). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | The target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Threshold for classification tasks. | 0.5 |
description | Optional[str] | Optional model description. | None |
load_model(path=None, *args, **kwargs)
Load a scikit-learn model from the specified file path using cloudpickle.
If the provided path is a directory, the default model file path is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path (or directory) from which to load the model. | None |
*args | | Additional positional arguments for cloudpickle.load. | () |
**kwargs | | Additional keyword arguments for cloudpickle.load. | {} |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | Self, to allow method chaining. |
Raises:
Type | Description |
---|---|
Exception | Propagates any exception raised during file reading. |
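The directory fallback described for load_model can be pictured as a small path-resolution step. This is a sketch under stated assumptions rather than the library's code: resolve_model_path, default_dir, and default_filename are hypothetical names, and the behaviour for path=None (falling back to the default location) is assumed, since the reference only documents the directory case.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_model_path(
    path: Optional[Union[str, Path]],
    default_dir: Path,
    default_filename: str,
) -> Path:
    """Pick the file to read: an explicit file, or the default file in a directory."""
    if path is None:
        # Assumption: no path means "use the default location".
        return default_dir / default_filename
    candidate = Path(path)
    if candidate.is_dir():
        # Documented case: a directory resolves to the default model file inside it.
        return candidate / default_filename
    return candidate


# The current directory exists, so the default file name is appended to it.
print(resolve_model_path(".", Path("."), "model.pkl"))
```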
predict(data)
Generate predictions for the given input data.
For regression, only predictions are produced. For classification, predictions and probabilities are produced.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | An object containing predictions and probabilities (if applicable). |
predict_score(data)
Compute prediction scores for the given input data.
For regression models, returns the raw predictions. For classification models, returns the probability of the positive class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Type | Description |
---|---|
NDArray | Array of prediction scores. |
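For classification, the positive-class scores returned by predict_score relate to class labels through classification_threshold. The sketch below illustrates one plausible rule and is not taken from the library: the helper scores_to_labels is hypothetical, and treating a score at or above the threshold as the positive class is an assumption.

```python
from typing import List, Sequence


def scores_to_labels(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map positive-class probabilities to 0/1 labels.

    Assumption: a score greater than or equal to the threshold is positive.
    """
    return [1 if score >= threshold else 0 for score in scores]


scores = [0.12, 0.5, 0.91]
print(scores_to_labels(scores))       # [0, 1, 1]
print(scores_to_labels(scores, 0.6))  # [0, 0, 1]
```

Raising the threshold trades recall for precision, which is why it is exposed as a constructor argument rather than fixed.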
save_model(path=None, *args, **kwargs)
Save the scikit-learn model to the specified file path using cloudpickle.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path where the model should be saved. | None |
*args | | Additional positional arguments for cloudpickle.dump. | () |
**kwargs | | Additional keyword arguments for cloudpickle.dump. | {} |
Raises:
Type | Description |
---|---|
ValueError | If there is no model instance available to save. |
Exception | Propagates any exception raised during file writing. |
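The save and load contract above (a ValueError when nothing is loaded, and load_model returning self for chaining) can be illustrated with a small stand-in. This sketch uses the standard library's pickle in place of cloudpickle so that it has no third-party dependency; PickledModelHolder is a hypothetical class, not part of verifia, and it takes file objects instead of paths to keep the example short.

```python
import io
import pickle
from typing import Any, BinaryIO, Optional


class PickledModelHolder:
    """Hypothetical stand-in for a model wrapper (not the library's class)."""

    def __init__(self, model: Optional[Any] = None) -> None:
        self.model = model

    def save_model(self, fileobj: BinaryIO) -> None:
        # Mirror the documented guard: refuse to save when no model is set.
        if self.model is None:
            raise ValueError("No model instance available to save.")
        pickle.dump(self.model, fileobj)

    def load_model(self, fileobj: BinaryIO) -> "PickledModelHolder":
        self.model = pickle.load(fileobj)
        return self  # returning self is what enables method chaining


buffer = io.BytesIO()
PickledModelHolder({"coef": [0.1, 0.2]}).save_model(buffer)
buffer.seek(0)

# Construct and load in one chained expression.
restored = PickledModelHolder().load_model(buffer)
print(restored.model)  # {'coef': [0.1, 0.2]}
```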
verifia.models.TFModel
Bases: BaseModel
A TensorFlow/Keras model wrapper that implements the BaseModel interface.
This class provides methods to save, load, and perform predictions with Keras models.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize the TFModel.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | The type of the model (regression or classification). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | The target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Threshold for classification tasks. | 0.5 |
description | Optional[str] | Optional model description. | None |
load_model(path=None, *args, **kwargs)
Load a Keras model from the specified path.
If the provided path is a directory, the default model file path is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path (or directory) from which to load the model. | None |
*args | | Additional positional arguments for load_model. | () |
**kwargs | | Additional keyword arguments for load_model. | {} |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | Self, to allow method chaining. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during loading. |
predict(data)
Generate predictions for the given data.
For classification, includes probabilities and mapped labels. For regression, includes only predictions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | An object containing predictions, probabilities, and labels (if applicable). |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during data conversion or prediction. |
predict_score(data)
Predict probabilities or regression scores for the given input data.
Assumes a single output for the model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Type | Description |
---|---|
NDArray | Array of prediction scores. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during data conversion or prediction. |
save_model(path=None, *args, **kwargs)
Save the Keras model to the specified path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path where the model should be saved. | None |
*args | | Additional positional arguments for the Keras save method. | () |
**kwargs | | Additional keyword arguments for the Keras save method. | {} |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during saving. |
verifia.models.PytorchModel
Bases: BaseModel
A PyTorch model wrapper that implements the BaseModel interface.
Provides methods to save, load, and perform inference with PyTorch models. Expects the underlying model to implement a 'build_floating_tensors' method for preprocessing.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize the PyTorch model wrapper.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | Model type (classification or regression). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | The target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Threshold for classification tasks. | 0.5 |
description | Optional[str] | Optional model description. | None |
load_model(path=None)
Load the PyTorch model from the specified file path.
If the provided path is a directory, the default model file path is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The path (or directory) from which to load the model. | None |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | Self, to allow method chaining. |
Raises:
Type | Description |
---|---|
MissingBuildFloatingTensorsError | If the loaded model does not implement 'build_floating_tensors'. |
Exception | Propagates any exception raised during model loading. |
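The build_floating_tensors requirement amounts to a capability check on the loaded object. The sketch below shows one way such a check could look. It is not the library's implementation: the exception class is redefined locally as a stand-in, ensure_build_floating_tensors is a hypothetical helper, and the single-argument signature and float-casting body of build_floating_tensors are assumptions.

```python
class MissingBuildFloatingTensorsError(Exception):
    """Local stand-in for the exception named in the table above."""


def ensure_build_floating_tensors(model: object) -> None:
    """Raise if the model does not expose a callable build_floating_tensors."""
    if not callable(getattr(model, "build_floating_tensors", None)):
        raise MissingBuildFloatingTensorsError(
            f"{type(model).__name__} must implement build_floating_tensors()."
        )


class WithHook:
    def build_floating_tensors(self, data):
        # Hypothetical preprocessing: cast every value to float.
        return [[float(value) for value in row] for row in data]


class WithoutHook:
    pass


ensure_build_floating_tensors(WithHook())  # passes silently

try:
    ensure_build_floating_tensors(WithoutHook())
except MissingBuildFloatingTensorsError as exc:
    print(exc)  # WithoutHook must implement build_floating_tensors().
```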
predict(data)
Generate predictions for the input data.
For regression, only predictions are generated. For classification, generates predictions, probabilities, and maps predictions to labels.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | Object containing predictions and probabilities (if applicable). |
Raises:
Type | Description |
---|---|
Exception | Propagates any exception raised during prediction. |
predict_score(data)
Compute prediction scores for the given input data.
For regression, returns raw predictions. For classification, returns probability scores or class scores as appropriate.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Type | Description |
---|---|
NDArray | Array of prediction scores. |
Raises:
Type | Description |
---|---|
Exception | Propagates any exception raised during prediction. |
save_model(path=None)
Save the PyTorch model to the specified file path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The path (or directory) to save the model. | None |
verifia.models.XGBModel
Bases: BaseModel
XGBoost model wrapper that implements the BaseModel interface for both regression and classification tasks.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize the XGBModel wrapper.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | Type of the model (e.g., classification or regression). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | The target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Classification threshold (for classification tasks). | 0.5 |
description | Optional[str] | Optional model description. | None |
load_model(path=None, *args, **kwargs)
Load an XGBoost model from the specified file path.
If the provided path is a directory, the default model filepath is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path (or directory) from which to load the model. | None |
*args | | Additional positional arguments. | () |
**kwargs | | Additional keyword arguments. | {} |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | Self, to allow method chaining. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during model loading. |
predict(data)
Generate predictions for the given input data.
For regression tasks, only predictions are produced. For classification tasks, produces predictions, probabilities, and maps predictions to labels.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | An object containing predictions and, for classification, probabilities. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during prediction. |
predict_score(data)
Compute prediction scores for the given input data.
For regression models, returns raw predictions. For classification models, returns the probability of the positive class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input data array (1D or 2D). | required |
Returns:
Type | Description |
---|---|
NDArray | Array of prediction scores. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during prediction. |
prepare_inputs(X)
Prepare input data by converting the NumPy array to a DataFrame with appropriate dtypes.
This method assumes that the model's feature types are stored in its 'feature_types' attribute, where a type value of "c" indicates a categorical feature.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | NDArray | Input feature data. | required |
Returns:
Type | Description |
---|---|
DataFrame | Processed data with columns renamed and dtypes set. |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during data processing. |
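The role of the 'feature_types' attribute in prepare_inputs can be pictured as building a per-column dtype plan. The sketch below is an illustration only and does not reproduce the library's conversion: dtype_plan is a hypothetical helper, the feature names and type codes in the example are made up, and mapping every non-categorical column to "float64" is an assumption. Only the meaning of "c" as categorical comes from the description above.

```python
from typing import Dict, Sequence


def dtype_plan(
    feature_names: Sequence[str],
    feature_types: Sequence[str],
) -> Dict[str, str]:
    """Map each feature name to the dtype its DataFrame column would receive."""
    if len(feature_names) != len(feature_types):
        raise ValueError("feature_names and feature_types must have equal length.")
    return {
        # "c" marks a categorical feature; everything else is treated as numeric
        # here (an assumption made for this sketch).
        name: "category" if ftype == "c" else "float64"
        for name, ftype in zip(feature_names, feature_types)
    }


plan = dtype_plan(["age", "plan", "tenure"], ["q", "c", "q"])
print(plan)  # {'age': 'float64', 'plan': 'category', 'tenure': 'float64'}
```

A dictionary of this shape is the form accepted by pandas' DataFrame.astype, which is one way the dtypes could then be applied.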
save_model(path=None, *args, **kwargs)
Save the XGBoost model to the specified file path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path where the model should be saved. | None |
*args | | Additional positional arguments for the XGBoost save_model method. | () |
**kwargs | | Additional keyword arguments for the XGBoost save_model method. | {} |
Raises:
Type | Description |
---|---|
Exception | Propagates exceptions raised during model saving. |
verifia.models.LGBModel
Bases: BaseModel
A LightGBM model wrapper that extends BaseModel for regression and classification tasks.
This class provides methods to save, load, and make predictions using LightGBM models.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize a LightGBM model wrapper.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | Type of the model (classification or regression). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | The target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Threshold for classification tasks. | 0.5 |
description | Optional[str] | Optional model description. | None |
predict(data)
Generate predictions for the input data.
For regression tasks, only predictions are produced. For classification tasks, both predictions and probabilities are produced.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input feature data as a NumPy array (1D or 2D). | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | An object containing predictions and probabilities (if applicable). |
predict_score(data)
Compute prediction scores for the input data.
For regression, returns raw predictions. For classification, returns the probability of the positive class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | Input feature data as a NumPy array (1D or 2D). | required |
Returns:
Type | Description |
---|---|
NDArray | A NumPy array of prediction scores. |
prepare_inputs(X)
Prepare and convert the input NumPy array into a Pandas DataFrame with proper dtypes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | NDArray | Input feature data as a NumPy array. | required |
Returns:
Type | Description |
---|---|
DataFrame | DataFrame with column names matching the model's features and appropriate types. |
save_model(path=None, *args, **kwargs)
Save the LightGBM model to the specified file path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path (or directory) where the model should be saved. | None |
*args | | Additional positional arguments for LightGBM's save_model. | () |
**kwargs | | Additional keyword arguments for LightGBM's save_model. | {} |
Raises:
Type | Description |
---|---|
ValueError | If no model instance is available. |
Exception | Propagates exceptions raised during model saving. |
verifia.models.CBModel
Bases: BaseModel
A CatBoost model wrapper that extends the BaseModel interface.
Provides methods to save, load, and make predictions using CatBoost models. Depending on the model type (classification or regression), the appropriate CatBoost estimator is used.
__init__(name, version, model_type, feature_names, target_name, local_dirpath, cat_feature_names=None, classification_threshold=0.5, description=None)
Initialize a CatBoost model wrapper.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | The model name. | required |
version | str | The model version. | required |
model_type | SupportedModelTypes | Type of the model (classification/regression). | required |
feature_names | Iterable | Iterable of feature names. | required |
target_name | str | Target variable name. | required |
cat_feature_names | Optional[Iterable] | Optional iterable of categorical feature names. | None |
classification_threshold | Optional[float] | Threshold for classification decisions. | 0.5 |
description | Optional[str] | Optional model description. | None |
load_model(path=None, *args, **kwargs)
Load the CatBoost model from the specified file path.
If the provided path is a directory, the default model file path (constructed by default_model_filepath) is used.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | Path (or directory) from which to load the model. | None |
*args | | Additional positional arguments. | () |
**kwargs | | Additional keyword arguments. | {} |
Returns:
Name | Type | Description |
---|---|---|
BaseModel | BaseModel | Self, to allow method chaining. |
Raises:
Type | Description |
---|---|
ValueError | If the model type is unsupported. |
Exception | Propagates any exception raised by the underlying load_model method. |
predict(data)
Generate predictions and, if applicable, probabilities for the given data.
For regression, only predictions are populated. For classification, both predictions and probabilities are populated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | The input data as a NumPy array. Can be 1D or 2D. | required |
Returns:
Name | Type | Description |
---|---|---|
ModelOutputs | ModelOutputs | An instance containing predictions and probabilities (if classification). |
Raises:
Type | Description |
---|---|
Exception | Propagates any exception raised during prediction. |
predict_score(data)
Compute prediction scores for the given data.
For regression, the raw predictions are returned. For classification, the probability of the positive class is returned.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | NDArray | The input data as a NumPy array. Can be 1D or 2D. | required |
Returns:
Type | Description |
---|---|
Union[NDArray, float] | A NumPy array of prediction scores, or a single score if one sample is provided. |
Raises:
Type | Description |
---|---|
Exception | Propagates any exception raised during prediction. |
save_model(path=None, *args, **kwargs)
Save the CatBoost model to the specified file path.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | PathLike | The file path where the model should be saved. | None |
*args | | Additional positional arguments. | () |
**kwargs | | Additional keyword arguments. | {} |
Raises:
Type | Description |
---|---|
ValueError | If the model attribute is not set. |
Exception | Propagates any exception raised by the underlying save_model method. |