Scheduled Stress Testing

Install Dependencies, Import Libraries and Download Data

Run the cells below to install our SDK, import the analysis libraries, and download the example data.

[ ]:
!pip install rime-sdk &> /dev/null

import pandas as pd
from pathlib import Path
from rime_sdk import Client
[ ]:
!pip install https://github.com/RobustIntelligence/ri-public-examples/archive/master.zip

from ri_public_examples.download_files import download_files

download_files('tabular-2.0/fraud', 'fraud')

Establish the RIME Client

To get started, provide the API credentials and the base domain/address of the RIME service. You can generate and copy an API token from the API Access Tokens Page under Workspace settings. For the domain/address of the RIME service, contact your admin.

[ ]:
API_TOKEN = '' # PASTE API_KEY
CLUSTER_URL = '' # PASTE DEDICATED DOMAIN OF RIME SERVICE (e.g., https://rime.example.rbst.io)

client = Client(CLUSTER_URL, API_TOKEN)

Create a new project

After creating the project, note down the ID. You can use it to retrieve the project the next time you use the RIME client.

[ ]:
description = (
    "Create a Stress Test and set up scheduling."
    " Demonstration uses a tabular binary classification dataset"
    " and model that simulates credit card fraud detection."
)
project = client.create_project(
    "Scheduled Stress Testing Demo",
    description,
    "MODEL_TASK_BINARY_CLASSIFICATION"
)
print(f"Project ID: {project.project_id}")
[ ]:
project_id = '' # PASTE FROM ABOVE
project = client.get_project(project_id)

Upload the model and datasets

First let’s see what the data looks like.

[ ]:
df = pd.read_csv(Path('fraud/data/fraud_ref.csv'))
df.head()

For this demo, we are going to use a pretrained CatBoostClassifier Model.

The model predicts whether a particular transaction is fraud or not fraud.

The model makes use of the following features:

  1. category

  2. card_type

  3. card_company

  4. transaction_amount

  5. city

  6. browser_version

  7. country
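
The model artifact we upload below is a small Python file, fraud/models/fraud_model.py, that wraps the pretrained CatBoost model. As a rough sketch of what such a wrapper file can look like (the function name, probability layout, and saved-model filename here are illustrative assumptions; the real interface is defined in the downloaded example file):

[ ]:
# Hedged sketch of a model wrapper file -- not the actual contents of
# fraud/models/fraud_model.py. Filename and interface are assumptions.
import numpy as np
import pandas as pd
from catboost import CatBoostClassifier

model = CatBoostClassifier()
model.load_model("fraud/models/fraud_model.catb")  # hypothetical artifact name

def predict_df(df: pd.DataFrame) -> np.ndarray:
    """Return the probability that each transaction in `df` is fraud."""
    return model.predict_proba(df)[:, 1]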

We now want to kick off RIME Stress Tests to evaluate the model in further depth, beyond basic performance metrics like accuracy, precision, and recall. To do this, we will upload the pre-trained model, the reference dataset the model was trained on, and the evaluation dataset the model was evaluated on to an S3 bucket that RIME can access.

Uploading Artifacts to Blob Storage

For SaaS environments using the default S3 storage location, the Python SDK supports direct file uploads using upload_*().

For other environments and storage technologies, artifacts must be managed through alternate means.
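
For example, if your deployment reads from your own S3 bucket, you could copy the example artifacts there yourself before registering them. Below is a minimal sketch using boto3; the bucket name and prefix are placeholders, AWS credentials are assumed to be configured already, and any equivalent tooling (such as the aws CLI) works just as well.

[ ]:
import boto3

BUCKET = "acmecorp-rime"              # placeholder -- your blob store bucket
PREFIX = "ri_public_examples_fraud"   # matches UPLOAD_PATH below

s3 = boto3.client("s3")
for local_path in [
    "fraud/models/fraud_model.py",
    "fraud/data/fraud_ref.csv",
    "fraud/data/fraud_eval.csv",
    "fraud/data/fraud_ref_preds.csv",
    "fraud/data/fraud_eval_preds.csv",
]:
    # Mirror the layout expected by the non-SaaS paths constructed below.
    key = f"{PREFIX}/{local_path.removeprefix('fraud/')}"
    s3.upload_file(local_path, BUCKET, key)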

[ ]:
IS_SAAS = False # TOGGLE True/False (Note: SaaS environments use URLs ending in "rbst.io" and have an "Internal Agent")

[ ]:
if not IS_SAAS:
    BLOB_STORE_URI = "" # PROVIDE BLOB STORE URI (e.g., "s3://acmecorp-rime")
    assert BLOB_STORE_URI != ""

UPLOAD_PATH = "ri_public_examples_fraud"

[ ]:
if IS_SAAS:
    model_s3_dir = client.upload_directory(
        Path('fraud/models'), upload_path=UPLOAD_PATH
    )
    model_s3_path = model_s3_dir + "/fraud_model.py"

    ref_s3_path = client.upload_file(
        Path('fraud/data/fraud_ref.csv'), upload_path=UPLOAD_PATH
    )
    eval_s3_path = client.upload_file(
        Path('fraud/data/fraud_eval.csv'), upload_path=UPLOAD_PATH
    )

    ref_preds_s3_path = client.upload_file(
        Path("fraud/data/fraud_ref_preds.csv"), upload_path=UPLOAD_PATH
    )
    eval_preds_s3_path = client.upload_file(
        Path("fraud/data/fraud_eval_preds.csv"), upload_path=UPLOAD_PATH
    )
else:
    model_s3_path = f"{BLOB_STORE_URI}/{UPLOAD_PATH}/models/fraud_model.py"

    ref_s3_path = f"{BLOB_STORE_URI}/{UPLOAD_PATH}/data/fraud_ref.csv"
    eval_s3_path = f"{BLOB_STORE_URI}/{UPLOAD_PATH}/data/fraud_eval.csv"

    ref_preds_s3_path = f"{BLOB_STORE_URI}/{UPLOAD_PATH}/data/fraud_ref_preds.csv"
    eval_preds_s3_path = f"{BLOB_STORE_URI}/{UPLOAD_PATH}/data/fraud_eval_preds.csv"

Once the data and model are uploaded to S3, we can register them with RIME. After they are registered, we can refer to these resources using their RIME-generated IDs.

Tip: Note down the RIME-generated IDs for future use so that you don’t have to repeatedly upload datasets and models to RIME every time you want to run Stress Tests.

[ ]:
from datetime import datetime

dt = str(datetime.now())

# Note: models and datasets need to have unique names.
model_id = project.register_model_from_path(f"model_{dt}", model_s3_path)

ref_dataset_id = project.register_dataset_from_file(
    f"ref_dataset_{dt}", ref_s3_path, data_params={"label_col": "label"}
)
eval_dataset_id = project.register_dataset_from_file(
    f"eval_dataset_{dt}", eval_s3_path, data_params={"label_col": "label"}
)

project.register_predictions_from_file(
    ref_dataset_id, model_id, ref_preds_s3_path
)
project.register_predictions_from_file(
    eval_dataset_id, model_id, eval_preds_s3_path
)

print(f"Model ID: {model_id}")
print(f"Reference dataset ID: {ref_dataset_id}")
print(f"Evaluation dataset ID: {eval_dataset_id}")

Running a Stress Test

AI Stress Tests allow you to test your data and model before deployment. They are a comprehensive suite of hundreds of tests that automatically identify implicit assumptions and weaknesses of pre-production models. Each stress test is run on a single model and its associated reference and evaluation datasets.

Below is a sample configuration showing how to set up and run a RIME Stress Test.

[ ]:
model_id = '' # PASTE FROM ABOVE
ref_dataset_id = '' # PASTE FROM ABOVE
eval_dataset_id = '' # PASTE FROM ABOVE

stress_test_config = {
    "run_name": "Onboarding Stress Test Run",
    "data_info": {
        "ref_dataset_id": ref_dataset_id,
        "eval_dataset_id": eval_dataset_id,
    },
    "model_id": model_id,
    "categories": [
            "TEST_CATEGORY_TYPE_ADVERSARIAL",
            "TEST_CATEGORY_TYPE_SUBSET_PERFORMANCE",
            "TEST_CATEGORY_TYPE_TRANSFORMATIONS",
            "TEST_CATEGORY_TYPE_ABNORMAL_INPUTS",
            "TEST_CATEGORY_TYPE_DATA_CLEANLINESS"]
}

stress_test_config
[ ]:
stress_job = client.start_stress_test(
    stress_test_config, project.project_id
)
stress_job.get_status(verbose=True, wait_until_finish=True)
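
Once the job finishes, you can pull up the resulting test run for a quick look. This assumes your rime_sdk version exposes get_test_run() on the job object and get_link() / get_result_df() on the returned test run; the web UI link works regardless.

[ ]:
# Hedged example: inspect the completed run from the SDK.
test_run = stress_job.get_test_run()
print(test_run.get_link())       # link to this run in the web UI
test_run.get_result_df().head()  # summary of test results as a DataFrame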

Set up a Schedule to run Stress Tests automatically

After you have successfully run a manual Stress Test, you can carry over the configuration and use it to set up Stress Tests to run automatically.

First, let’s take another look at stress_test_config and see if we need to make any updates. For example, you may want to update the evaluation dataset used to run your automatically scheduled Stress Tests.

[ ]:
stress_test_config

# e.g.,
# stress_test_config["data_info"]["eval_dataset_id"] = new_eval_dataset_id

We can now create a new Schedule containing the information necessary to automatically run Stress Tests. You will need to provide a configuration dict as well as a string indicating how often you want the Stress Tests to run. Supported strings are: “@hourly”, “@daily”, “@weekly”, and “@monthly”.

[ ]:
schedule = project.create_schedule(
    test_run_config=stress_test_config, frequency_cron_expr="@hourly"
)
# print(schedule.info)

print(f"Schedule ID: {schedule.schedule_id}")

After creating a Schedule, you need to activate it on your project. Feel free to activate and deactivate your Schedule as required.

[ ]:
schedule_id = '' # PASTE FROM ABOVE

project.activate_schedule(schedule_id)
#project.deactivate_schedule(schedule_id)

If you forget which schedule is active on your project, you can always retrieve the information later using either of the methods below.

[ ]:
project.info.active_schedule

project.get_active_schedule().info

You can also update how frequently you want your Schedule to run Stress Tests.

[ ]:
project.update_schedule(schedule_id, "@daily")

Congratulations! Stress test scheduling is now set up. You can go to the Stress Testing tab for your project on the Robust Intelligence UI to view and analyze new Stress Tests as they run over time.