Running Tests on Managed Docker Images

Your model may require dependencies that are not included in the base testing engine image provided with Robust Intelligence.

This guide covers how to run Robust Intelligence with a custom Docker image when your model needs a particular version of a library or any additional dependencies.

The Managed Images feature lets you create Docker images directly from the SDK without writing Dockerfiles by hand. You can also query for existing managed images, so that you can find images with the requirements you need and share the same images across teams. Finally, when you upgrade your version of Robust Intelligence, your old managed images remain usable because the backend upgrades them to be compatible with the latest version.

Note: The following assumes that you have instantiated the Client and that you are familiar with running Robust Intelligence through the Robust Intelligence SDK. To learn more about the SDK, refer to the SDK Reference.

Building a Managed Docker Image

Prerequisite: This guide assumes that your administrator has enabled the Managed Images feature in your Robust Intelligence deployment.

The Managed Images feature allows you to build custom Docker images directly from the SDK. You only need to specify a name for the image, which serves as its alias, the list of additional package requirements to install, and, optionally, the type of base image you want (all, images, nlp, or tabular). If you do not specify a type, the image defaults to the most general type, ‘all’.

Say you have a Tabular model that requires xgboost at version 1.0.2 and tensorflow at a flexible version. You can call this image "xgboost102_tensorflow" as an alias to remember it later.

To begin, initialize the Client and point it to the location of your RIME backend.

from rime_sdk import Client

rime_client = Client("rime.<YOUR_ORG_NAME>.rbst.io", "<YOUR_API_TOKEN>")

Next, you may use the helper method Client.pip_requirement(library_name: str, version_spec: Optional[str]) to construct a list of pip requirements for your image. The first argument is the name of the library (e.g. "tensorflow" or "xgboost") and the second argument is a valid pip version specifier (e.g. ">=0.1.2" or "==1.0.2"). Omitting the version specifier is equivalent to not fixing the version.

requirements = [
  # Fix the version of `xgboost` to `1.0.2`.
  rime_client.pip_requirement("xgboost", "==1.0.2"),
  # We do not care about the installed version of `tensorflow`.
  rime_client.pip_requirement("tensorflow")
]
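To make the name and version specifier pairs concrete, the sketch below formats them the way they would appear as lines in a pip requirements file. It is only an illustration of pip's requirement syntax: `to_requirement_line` is a hypothetical helper, not part of rime_sdk, and it does not show how the SDK represents requirements internally.

```python
from typing import Optional


def to_requirement_line(name: str, version_spec: Optional[str] = None) -> str:
    """Format a library name and optional pip version specifier as one requirements line."""
    # Hypothetical helper for illustration only; it is not part of rime_sdk.
    if not name or not name.strip():
        raise ValueError("library name must be non-empty")
    # An omitted specifier leaves the version unpinned, so only the name is emitted.
    return f"{name.strip()}{(version_spec or '').strip()}"


lines = [
    to_requirement_line("xgboost", "==1.0.2"),  # version fixed to 1.0.2
    to_requirement_line("tensorflow"),          # any version is acceptable
]
print("\n".join(lines))
```

The two printed lines are what you would write by hand in a requirements file for the same pair of libraries.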

To start the job that builds your custom image, use rime_client.create_managed_image(name: str, requirements: List[ManagedImage.PipRequirement]). The return value is a Python object that you can use to track the status of the image building job.

# Start a new image building job
builder_job = rime_client.create_managed_image("xgboost102_tensorflow", requirements)

# Wait until the job has finished and print out status information.
# Once this prints out the `READY` status, your image is available for use in stress tests.
builder_job.get_status(verbose=True, wait_until_finish=True)

As an experimental customization, you can also specify additional Linux packages that your image requires. As with pip, a helper method os_requirement(name: str, version_specifier: Optional[str]) is provided to construct a list of operating system packages. The first argument is the name of the package (e.g. "texlive" or "vim") and the optional second argument is a valid version. Omitting the version specifier is equivalent to taking the system default version.

os_requirements = [
  # Fix the version of `texlive` to `20200406`.
  rime_client.os_requirement("texlive", "20200406"),
  # We do not care about the installed version of `vim`.
  rime_client.os_requirement("vim")
]

You can add these Linux requirements to your create_managed_image request with the optional parameter package_requirements. For example:

builder_job = rime_client.create_managed_image("xgboost102_tensorflow", requirements, package_requirements=os_requirements)

Starting a Stress Test on a Managed Docker Image

Once the image has been built successfully (this will take a few minutes), you may start stress tests using your managed image. The following code snippet shows how you can use your "xgboost102_tensorflow" image.

# Insert your own stress testing configuration here.
config = {
    # rest of configuration goes here...
    "custom_image": {
        "managed_image_name": "xgboost102_tensorflow"
    }
}

# Kick off a stress test on the "xgboost102_tensorflow" image.
stress_test_job = rime_client.start_stress_test(test_run_config=config)

# Wait until the job has finished and print out status information.
stress_test_job.get_status(verbose=True, wait_until_finish=True)
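If you already have a stress testing configuration and want to run it on different managed images, one option is to add the "custom_image" entry programmatically. The following is a minimal sketch under that assumption: `with_managed_image` is a hypothetical helper, not an SDK function, and the contents of `base_config` are placeholders rather than real configuration keys.

```python
import copy
from typing import Any, Dict


def with_managed_image(config: Dict[str, Any], image_name: str) -> Dict[str, Any]:
    """Return a copy of `config` whose "custom_image" entry selects `image_name`."""
    # Hypothetical helper, not part of rime_sdk. The key names "custom_image"
    # and "managed_image_name" follow the configuration shown in this guide.
    updated = copy.deepcopy(config)
    updated["custom_image"] = {"managed_image_name": image_name}
    return updated


# Placeholder contents (an assumption); substitute your own stress testing configuration.
base_config = {"placeholder_key": "placeholder_value"}

config = with_managed_image(base_config, "xgboost102_tensorflow")
print(config["custom_image"])
```

Because the helper works on a deep copy, `base_config` is left unchanged and can be reused with a different image name.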

Querying for Existing Managed Docker Images

The true power of the Managed Images feature is that it allows you to query for existing images with desired requirements. Instead of building a new custom Docker image for each stress test, you can query the backend for existing images that have the desired requirements. That makes managed images much easier to use across a team.

Say you want an image with catboost at version 1.0.3. Perhaps one of your teammates has worked on a model with the same requirement or you previously built a catboost image. To recover those images, you can first construct filters using the helper method Client.pip_library_filter(library_name: str, fixed_version: Optional[str]).

filters = [rime_client.pip_library_filter("catboost", "1.0.3")]

Then, you can provide the filters as an argument to Client.list_managed_images(). This is a paginated call so you can limit the number of images returned.

Note: list_managed_images returns a tuple of (images, next_page_token). See the reference for more detail on the function’s behavior.

images, next_page_token = rime_client.list_managed_images(pip_library_filters=filters)

Each element of images is a dictionary that contains information about all the installed pip libraries, the alias of the image, etc. Here is a code snippet to get all the names of the images in the first page of the query result.

names = [x["name"] for x in rime_client.list_managed_images(pip_library_filters=filters)[0]]
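Because the call is paginated, collecting every matching image means following the page token until the results run out. The sketch below shows that loop in a generic form. It assumes that an empty token marks the last page, and it uses a fake in-memory fetch function in place of the backend; `collect_all_pages` and `fake_fetch` are hypothetical names, not SDK functions.

```python
from typing import Any, Callable, Dict, List, Tuple

# One page of results: (images, next_page_token), matching the tuple described above.
Page = Tuple[List[Dict[str, Any]], str]


def collect_all_pages(fetch_page: Callable[[str], Page], max_pages: int = 100) -> List[Dict[str, Any]]:
    """Call `fetch_page` repeatedly, following page tokens until an empty token is returned."""
    results: List[Dict[str, Any]] = []
    token = ""  # Assumption: an empty token requests the first page.
    for _ in range(max_pages):  # Upper bound guards against an endless token chain.
        items, token = fetch_page(token)
        results.extend(items)
        if not token:  # Assumption: an empty token means there are no more pages.
            break
    return results


# Fake backend with two pages, used only to keep this example self-contained.
_FAKE_PAGES: Dict[str, Page] = {
    "": ([{"name": "catboost103_a"}], "page-2"),
    "page-2": ([{"name": "catboost103_b"}], ""),
}


def fake_fetch(token: str) -> Page:
    return _FAKE_PAGES[token]


all_images = collect_all_pages(fake_fetch)
all_names = [image["name"] for image in all_images]
print(all_names)
```

To use this with the real client, you would pass a function that forwards the token to list_managed_images along with your filters. The name of the token parameter is not given in this guide, so check the SDK Reference for the exact signature before relying on it.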
