Validating Your Model with AI Stress Testing

This tutorial guides you through validating a binary classification model with RIME AI Stress Testing.

This model was trained on a slightly modified version of the Adult Census Income dataset and is available in the rime_trial/ bundle provided during installation.

An AI Stress Test is a statistical evaluation of a machine learning model, designed to detect a specific vulnerability. At Robust Intelligence, we are constantly researching new vulnerabilities to test.

For a full list of available stress tests, see our Test Bank.

Running Stress Testing on the Income Example

Stress Testing with a Model and Datasets

In this example, we provide the model directly to RIME, which enables the most thorough analysis. RIME can also be run with prediction logs alone (or even just datasets); the prediction-log workflow is shown further down.

To kick off a run of AI Stress Testing using a model and datasets:

rime-engine run-stress-tests --config-path examples/income/stress_tests_model.json

After this finishes running, you should be able to see the results in the web client, where they will be uploaded to the Default Project.

For additional command line options, please see the CLI Reference.

Stress Testing with Prediction Logs

To kick off a run of AI Stress Testing using prediction logs instead of a model:

rime-engine run-stress-tests --config-path examples/income/stress_tests_prediction_logs.json

Note that the command is exactly the same EXCEPT for the --config-path provided.

Stress Testing for Compliance

The Compliance setting runs a specific suite of tests geared towards bias and fairness.

To kick off a run in the Compliance setting:

rime-engine run-stress-tests --config-path examples/income/stress_tests_compliance.json

The command is exactly the same as the others; however, in the configuration file we have specified Compliance under categories as well as a list of protected_features in our data_info:

{
  "run_name": "Income - Compliance Mode",
  "data_info": {
    "protected_features": ["sex", "race", "education", "age", "native.country"],
    ...
  },
  "test_config": {
    "categories": ["Compliance"],
    "run_default": false
  },
  "model_info": { ... }
}
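
Before launching a Compliance run, it can help to confirm that the configuration names the Compliance category and that every protected feature is an actual column in your data. The following is a minimal sketch of such a local pre-flight check. It is not part of RIME: the function name check_compliance_config and the example column list are hypothetical, and only the keys shown in the excerpt above (data_info.protected_features and test_config.categories) are assumed.

```python
def check_compliance_config(config, dataset_columns):
    """Return a list of problems found in a Compliance-mode config dict.

    Hypothetical helper for illustration; RIME does not ship this function.
    """
    problems = []

    # The Compliance suite is selected through test_config.categories.
    categories = config.get("test_config", {}).get("categories", [])
    if "Compliance" not in categories:
        problems.append('"Compliance" is missing from test_config.categories')

    # Protected features are listed under data_info.protected_features.
    protected = config.get("data_info", {}).get("protected_features", [])
    if not protected:
        problems.append("data_info.protected_features is empty or missing")

    # Every protected feature should be a real column in the dataset.
    missing = [name for name in protected if name not in dataset_columns]
    if missing:
        problems.append(f"protected features not found in dataset: {missing}")

    return problems


config = {
    "run_name": "Income - Compliance Mode",
    "data_info": {
        "protected_features": ["sex", "race", "education", "age", "native.country"]
    },
    "test_config": {"categories": ["Compliance"], "run_default": False},
}

# Hypothetical column list; in practice, read it from your dataset's header.
columns = ["age", "education", "race", "sex", "native.country", "hours.per.week"]

print(check_compliance_config(config, columns))  # prints []
```

An empty list means the check found nothing to flag. It says nothing about whether the rest of the configuration is valid.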

Running Stress Testing on Your Own Model and Datasets

This guide will cover how to run AI Stress Testing on your own model and datasets.

Define a Python Model File

A model is not required for AI Stress Testing, but providing one enables a more thorough analysis than prediction logs or datasets alone.

For step-by-step instructions, please see How to Create a Python Model File.

Gather Datasets

For a detailed specification of data formatting, see Input Data Format.

1. Split the Data

For AI Stress Testing, RIME requires two datasets: a reference dataset (typically the training data) and an evaluation dataset (typically validation, test, or other data the model was not trained on). Currently, RIME expects each dataset to be passed in as a .csv or .parquet file in which each column is a separate feature.
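
If your data currently lives in a single .csv file, one way to produce the two files is a seeded random split. The sketch below uses only the Python standard library. The function name split_csv, the default 20% evaluation fraction, and the file paths are assumptions made for illustration, not RIME requirements. If you already have separate training and test files, use those instead.

```python
import csv
import random


def split_csv(source_path, reference_path, evaluation_path,
              eval_fraction=0.2, seed=0):
    """Split one CSV into a reference file and an evaluation file.

    Both outputs keep the header row, so each column remains a separate
    feature. Returns (reference row count, evaluation row count).
    """
    with open(source_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(rows)
    n_eval = int(len(rows) * eval_fraction)
    eval_rows, ref_rows = rows[:n_eval], rows[n_eval:]

    for path, subset in ((reference_path, ref_rows),
                         (evaluation_path, eval_rows)):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(subset)

    return len(ref_rows), len(eval_rows)
```

A call such as split_csv("all_data.csv", "ref.csv", "eval.csv") would write the two files next to the source. A purely random split is only one option: for time-ordered data, splitting by date may reflect production conditions more faithfully.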

Create Configuration

With your data and model ready, you can now create a configuration file. Examples of these can be found in the rime_trial/ bundle (the ones used for this example are under examples/income/).

For a detailed reference on what the configuration should look like, see AI Stress Testing Configuration Reference.

Run the CLI

To kick off a run of AI Stress Testing using your configuration file, replace the value of the --config-path argument below with the path to your file:

rime-engine run-stress-tests --config-path <PATH-TO-CONFIGURATION>

After this finishes running, you should be able to see the results in the web client, where they will be uploaded to the Default Project.
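
If you want to launch runs from a script (for example, one run per configuration file), the same command can be invoked through Python's subprocess module. This is a minimal sketch that assumes rime-engine is on your PATH. The helper names build_stress_test_command and run_stress_tests are hypothetical, and the only CLI arguments used are the ones shown in this tutorial.

```python
import subprocess


def build_stress_test_command(config_path):
    """Build the argument list for the stress-testing command shown above."""
    return ["rime-engine", "run-stress-tests", "--config-path", str(config_path)]


def run_stress_tests(config_path):
    """Run AI Stress Testing for one configuration file.

    check=True makes subprocess raise CalledProcessError if the command
    exits with a non-zero status, so failures are not silently ignored.
    """
    return subprocess.run(build_stress_test_command(config_path), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the configuration path contains spaces.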


Troubleshooting

If you run into issues, please refer to our Troubleshooting page for help. Your RI representative will also be happy to assist, so feel free to reach out.