Rule Consistency Verifier

`verifia.verification.RuleConsistencyVerifier`

Manages the verification of model rules based on a domain definition and search strategy.

Evaluates a dataset against specified rules, collects statistics, and generates detailed reports.

`__init__(domain_cfg_dict=None, domain_cfg_fpath=None)`

Initialize the verifier from a domain configuration. Provide either a domain configuration dictionary or a path to a domain configuration file to load the domain.

Parameters:

Name	Type	Description	Default
`domain_cfg_dict`	`Optional[Dict]`	A domain configuration dictionary.	`None`
`domain_cfg_fpath`	`Optional[PathLike]`	Path to a domain configuration YAML file.	`None`

`calculate_dataset_statistics()`

Compute detailed statistics for the original dataset based on domain constraints and model predictions.

This method performs the following steps:

1. Validates that the model and dataset have been set. If not, a ValueError is raised instructing the user to call the appropriate setup methods (verify() for the model and on() for the dataset).
2. Filters out rows containing any feature value that falls outside its allowed domain. This is done using an internal helper function that checks each row against the domain constraints.
3. Records the total number of original rows (n_orig) and the number of rows removed because they are out-of-domain (n_ood).
4. Computes the model's predictive performance on the entire dataset and stores the performance metric name and score.
5. Further refines the filtered dataset based on the model's predictions:
   - For regression models:
     - Retrieves the error tolerance (err_thresh) from the domain of the target variable.
     - Removes rows where the absolute prediction error exceeds the tolerance.
     - Records the count of in-domain rows removed due to high error (n_herr).
   - For classification models:
     - Removes rows where the model's predictions do not match the true target values.
     - Records the count of in-domain rows removed due to misclassification (n_miscls).

Returns:

Name	Type	Description
`OriginalStatistics`	`OriginalStatistics`	An object containing: n_orig (total number of rows in the original dataset); n_ood (number of rows removed because they are out-of-domain); n_herr (for regression, number of rows removed due to prediction error exceeding the tolerance); n_miscls (for classification, number of rows removed due to misclassification); metric_name (name of the performance metric used); metric_score (score of the performance metric); err_thresh (for regression, the error tolerance threshold applied).

Raises:

Type	Description
`ValueError`	If the model or dataset has not been set.
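The regression branch of the steps above can be illustrated with a small, self-contained sketch. It is not the library's implementation: `StatsSketch`, `in_domain`, and `regression_stats` are hypothetical names, rows are plain dictionaries rather than a DataFrame, and the domain is assumed to be a simple `(low, high)` interval per feature.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Row = Dict[str, float]
Bounds = Dict[str, Tuple[float, float]]


@dataclass
class StatsSketch:
    """Hypothetical stand-in holding three of the documented counters."""
    n_orig: int
    n_ood: int
    n_herr: int


def in_domain(row: Row, bounds: Bounds) -> bool:
    # A row is in-domain only if every constrained feature lies within its bounds.
    return all(low <= row[name] <= high for name, (low, high) in bounds.items())


def regression_stats(rows: List[Row], targets: List[float], preds: List[float],
                     bounds: Bounds, err_thresh: float) -> StatsSketch:
    n_orig = len(rows)
    # Drop out-of-domain rows first, keeping (target, prediction) pairs.
    kept = [(y, p) for r, y, p in zip(rows, targets, preds) if in_domain(r, bounds)]
    n_ood = n_orig - len(kept)
    # Among in-domain rows, count those whose absolute error exceeds the tolerance.
    n_herr = sum(1 for y, p in kept if abs(y - p) > err_thresh)
    return StatsSketch(n_orig=n_orig, n_ood=n_ood, n_herr=n_herr)


rows = [{"x": 0.5}, {"x": 2.0}, {"x": 0.9}, {"x": 0.1}]
targets = [1.0, 1.0, 1.0, 1.0]
preds = [1.05, 1.0, 1.5, 0.95]
stats = regression_stats(rows, targets, preds, {"x": (0.0, 1.0)}, err_thresh=0.1)
print(stats)  # one row is out-of-domain, one in-domain row has a high error
```

The classification branch would follow the same shape, replacing the error comparison with an equality check between prediction and target.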

`clean_results()`

Remove all previously generated verification results.

This deletes the directory where rule-violation reports and checkpoints are stored. You must have already run a verification (i.e., called verify()) before cleaning; otherwise, no results will be available.

Raises:

Type	Description
`ValueError`	If no results are available (i.e., verify() has not been called).
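As a rough illustration of the behaviour described for `clean_results()`, the sketch below removes a results directory and raises `ValueError` when none is known. The function name `clean_results_sketch` and the use of `None` to represent "no results yet" are assumptions made for this example; the library's internal bookkeeping may differ.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def clean_results_sketch(results_dir: Optional[Path]) -> None:
    # Assumption: an unset directory means verify() was never called.
    if results_dir is None:
        raise ValueError("No results available; call verify() first.")
    # Remove the directory and everything in it (reports, checkpoints).
    shutil.rmtree(results_dir)


# Demonstrate on a throwaway directory.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "report.txt").write_text("placeholder report")
clean_results_sketch(demo_dir)
removed = not demo_dir.exists()

try:
    clean_results_sketch(None)
    raised = False
except ValueError:
    raised = True
```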

`on(dataframe=None, data_fpath=None, dataset=None)`

Set the dataset to be verified.

Provide one of a Dataset, a DataFrame, or a file path to the data.

Parameters:

Name	Type	Description	Default
`dataframe`	`Optional[DataFrame]`	A pandas DataFrame.	`None`
`data_fpath`	`Optional[PathLike]`	File path to the data.	`None`
`dataset`	`Optional[Dataset]`	A pre-constructed Dataset object.	`None`

Returns:

Name	Type	Description
`RuleConsistencyVerifier`	`RuleConsistencyVerifier`	Self, to allow method chaining.

Raises:

Type	Description
`ValueError`	If none of a Dataset, a DataFrame, or a file path is provided, or if the model has not been set (call verify() first).

`run(pop_size, max_iters, orig_seed_ratio=None, orig_seed_size=None, persistance=True)`

Execute the verification run.

The method performs the following steps:

1. Validates input parameters.
2. Samples the original dataset.
3. Filters out rows violating domain constraints.
4. Loads original seed predictions.
5. Iterates over rules and original inputs to search for rule violations.
6. Records any inconsistent candidates.
7. Persists the results if requested.

Parameters:

Name	Type	Description	Default
`pop_size`	`int`	Population size for the search algorithm.	required
`max_iters`	`int`	Maximum iterations for the search.	required
`orig_seed_ratio`	`Optional[float]`	Ratio of original seed samples to use.	`None`
`orig_seed_size`	`Optional[int]`	Number of original seed samples to use.	`None`
`persistance`	`bool`	Whether to persist the run results.	`True`

Returns:

Name	Type	Description
`RulesViolationResult`	`RulesViolationResult`	The final verification result.

Raises:

Type	Description
`TypeError`	If pop_size or max_iters is not an integer, or if a seed parameter has an incorrect type.
`ValueError`	If pop_size or max_iters is outside its valid range, if neither seed parameter is provided, or if the model, dataset, or searcher has not been set.
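The validation rules listed for `run()` can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `resolve_seed_count` is a hypothetical helper, the valid ranges (values of at least 1, ratio in (0, 1]) are guesses, and giving `orig_seed_size` precedence when both seed parameters are passed is also an assumption.

```python
from typing import Optional


def _require_int(name: str, value: object) -> None:
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{name} must be an int, got {type(value).__name__}")


def resolve_seed_count(n_rows: int, pop_size: int, max_iters: int,
                       orig_seed_ratio: Optional[float] = None,
                       orig_seed_size: Optional[int] = None) -> int:
    """Validate run-style arguments and return how many seed rows to sample."""
    for name, value in (("pop_size", pop_size), ("max_iters", max_iters)):
        _require_int(name, value)
        if value < 1:  # assumed lower bound
            raise ValueError(f"{name} must be >= 1")

    if orig_seed_size is not None:  # assumed to take precedence over the ratio
        _require_int("orig_seed_size", orig_seed_size)
        if orig_seed_size < 1:
            raise ValueError("orig_seed_size must be >= 1")
        return min(orig_seed_size, n_rows)

    if orig_seed_ratio is not None:
        if not isinstance(orig_seed_ratio, float):
            raise TypeError("orig_seed_ratio must be a float")
        if not 0.0 < orig_seed_ratio <= 1.0:  # assumed valid range
            raise ValueError("orig_seed_ratio must be in (0, 1]")
        return max(1, int(n_rows * orig_seed_ratio))

    raise ValueError("Provide orig_seed_ratio or orig_seed_size")


print(resolve_seed_count(200, pop_size=10, max_iters=5, orig_seed_ratio=0.25))  # 50
print(resolve_seed_count(200, pop_size=10, max_iters=5, orig_seed_size=500))    # 200
```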

`using(search_algo, search_params=None, search_params_fpath=None)`

Specify the search algorithm and parameters for verification.

Parameters:

Name	Type	Description	Default
`search_algo`	`str`	The identifier of the search algorithm.	required
`search_params`	`Optional[dict]`	A dictionary of search parameters.	`None`
`search_params_fpath`	`Optional[PathLike]`	Path to a configuration file for search parameters.	`None`

Returns:

Name	Type	Description
`RuleConsistencyVerifier`	`RuleConsistencyVerifier`	Self, to allow method chaining.

Warns:

Type	Description
`UserWarning`	If no search parameters are provided.
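Because a missing search configuration produces a warning rather than an error, callers may want to detect it explicitly. The sketch below uses a stand-in function, `using_sketch`, that mimics only the documented warning; the algorithm identifier `"some_algo"` is a placeholder, not a real option. The `warnings.catch_warnings` pattern shown here is standard Python and should apply equally to the real method.

```python
import warnings
from typing import Optional


def using_sketch(search_algo: str, search_params: Optional[dict] = None) -> dict:
    """Stand-in that mimics only the documented warning behaviour."""
    if search_params is None:
        warnings.warn("No search parameters provided.", UserWarning)
        search_params = {}
    return {"algo": search_algo, "params": search_params}


# Record warnings instead of printing them, so the caller can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = using_sketch("some_algo")

warned = any(issubclass(w.category, UserWarning) for w in caught)
print(warned, config)
```

Replacing `simplefilter("always")` with `simplefilter("error", UserWarning)` would turn the warning into an exception, which can be useful in automated pipelines.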

`verify(model=None, model_card_fpath_or_dict=None)`

Set up the model for verification.

Provide either a model instance or a model card (a file path or, as the parameter name indicates, a dictionary) from which to build the model.

Parameters:

Name	Type	Description	Default
`model`	`Optional[BaseModel]`	An instance of a model.	`None`
`model_card_fpath_or_dict`	`Optional[Union[PathLike, Dict]]`	Path to a model card YAML file, or a model card dictionary.	`None`

Returns:

Name	Type	Description
`RuleConsistencyVerifier`	`RuleConsistencyVerifier`	Self, to allow method chaining.

Raises:

Type	Description
`ValueError`	If neither model nor model_card_fpath_or_dict is provided.
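Since `verify()`, `on()`, and `using()` each return the verifier itself, they can be chained, with `verify()` coming before `on()`. The toy class below illustrates that pattern only; `FluentVerifierSketch` and its `ready()` helper are hypothetical stand-ins, not part of the library, and the argument values are placeholders.

```python
from typing import Any, Optional


class FluentVerifierSketch:
    """Toy stand-in showing the chaining pattern; not the verifia class."""

    def __init__(self) -> None:
        self._model: Optional[Any] = None
        self._data: Optional[Any] = None
        self._algo: Optional[str] = None

    def verify(self, model: Any = None) -> "FluentVerifierSketch":
        if model is None:
            raise ValueError("Provide a model.")
        self._model = model
        return self  # returning self is what enables chaining

    def on(self, data: Any = None) -> "FluentVerifierSketch":
        # Mirrors the documented ordering: the model must be set first.
        if self._model is None:
            raise ValueError("Call verify() before on().")
        if data is None:
            raise ValueError("Provide data.")
        self._data = data
        return self

    def using(self, search_algo: str) -> "FluentVerifierSketch":
        self._algo = search_algo
        return self

    def ready(self) -> bool:
        # True once model, data, and search algorithm are all configured.
        return all(x is not None for x in (self._model, self._data, self._algo))


verifier = (
    FluentVerifierSketch()
    .verify(model="placeholder-model")
    .on(data=[1, 2, 3])
    .using("some_algo")
)
print(verifier.ready())  # True
```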
