Input Data Format
Automated Validation
The Python SDK exposes a command-line utility that can automatically validate your input data:
rime-data-format-check <ARGS>
Inspecting <REFERENCE_SET>
Done!
Inspecting <EVALUATION_SET>
Done!
---
Your data should work with RIME!
Instructions are available here.
Supported File Formats
RIME Tabular currently supports both CSV (.csv) and Parquet (.parquet), with task-specific nuances defined below. Input files should have header columns in string format; these will be used as feature names.
RIME is most effective when both label and prediction columns are provided; however, neither is required for most tasks*.
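As a quick local sanity check before running the validator, the header row of a CSV file can be inspected with the Python standard library. This is only a sketch and not part of the RIME SDK: the function name `check_header` and the sample column names are hypothetical, and the checks for empty or duplicate names are assumptions about what makes a usable feature name, not documented RIME rules.

```python
import csv
import io


def check_header(csv_text):
    """Return the header names, raising ValueError if any is empty or duplicated.

    Hypothetical helper: the empty/duplicate checks are assumptions,
    not requirements stated by RIME.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty; expected a header row")
    names = [name.strip() for name in header]
    if any(name == "" for name in names):
        raise ValueError("header contains an empty column name")
    if len(set(names)) != len(names):
        raise ValueError("header contains duplicate column names")
    return names


# Illustrative data; the column names are made up for this example.
sample = "age,income,label,pred\n34,52000,1,0.81\n"
print(check_header(sample))
```

To check a file on disk, read its contents first and pass the text to the function. A Parquet file would need a third-party reader, which is not shown here.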
Requirements By Task
Regression
Labels should be any real number
Predictions should be any real number
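The regression requirement can be approximated locally by confirming that every label and prediction parses as a finite number. This is a minimal sketch: `is_real_number` is a hypothetical helper, and treating NaN and infinity as invalid is an assumption about what "any real number" means here.

```python
import math


def is_real_number(value):
    """True if value parses as a finite float (assumed meaning of 'real number')."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False


# Illustrative values as they might appear in a CSV column.
labels = ["3.2", "-1", "1e3"]
preds = ["2.9", "nan", "abc"]
print([is_real_number(v) for v in labels])  # [True, True, True]
print([is_real_number(v) for v in preds])   # [True, False, False]
```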
Binary Classification
Labels should be integer values 0 or 1
Predictions should be float values (probabilities) between 0 and 1
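The binary classification rules above can be checked row by row. The sketch below assumes the columns are already loaded as Python lists; `binary_problems` is a hypothetical name, not an SDK function.

```python
def binary_problems(labels, preds):
    """Return (row, column, value) tuples for entries that break the rules above."""
    problems = []
    for i, (label, pred) in enumerate(zip(labels, preds)):
        # Labels: integer 0 or 1 (booleans are rejected in this sketch).
        if type(label) is not int or label not in (0, 1):
            problems.append((i, "label", label))
        # Predictions: float probability in [0, 1]; NaN fails the comparison.
        if not isinstance(pred, float) or not (0.0 <= pred <= 1.0):
            problems.append((i, "pred", pred))
    return problems


print(binary_problems([0, 1, 2, 1], [0.1, 0.93, 0.5, 1.2]))
# [(2, 'label', 2), (3, 'pred', 1.2)]
```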
Multi-Class Classification
Labels should be integers referring to class index
Predictions should be an array summing to 1, with index i representing the probability of the ith class
Predictions should be uploaded as a separate .csv or .parquet file, with columns corresponding to prediction classes
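For multi-class data, a separate predictions file can be checked against the labels before upload. The sketch below makes several assumptions that the text above does not confirm: class indices are 0-based, the predictions file has one row per label, and a small tolerance is acceptable when testing that a row sums to 1. The function name `check_multiclass` and the column names `class_0`, `class_1`, `class_2` are hypothetical.

```python
import csv
import io
import math


def check_multiclass(pred_csv_text, labels, tol=1e-6):
    """Return a list of human-readable problems found in a predictions CSV."""
    rows = list(csv.reader(io.StringIO(pred_csv_text)))
    header, data = rows[0], rows[1:]
    n_classes = len(header)  # one column per prediction class
    problems = []
    if len(data) != len(labels):
        problems.append(f"{len(data)} prediction rows but {len(labels)} labels")
    for i, row in enumerate(data):
        total = sum(float(x) for x in row)
        if not math.isclose(total, 1.0, abs_tol=tol):
            problems.append(f"row {i}: probabilities sum to {total:.4f}")
    for i, label in enumerate(labels):
        # Assumes 0-based class indices.
        if not 0 <= label < n_classes:
            problems.append(f"row {i}: label {label} outside 0..{n_classes - 1}")
    return problems


# Illustrative predictions file with three classes.
sample_preds = "class_0,class_1,class_2\n0.7,0.2,0.1\n0.5,0.3,0.1\n"
for problem in check_multiclass(sample_preds, [0, 3]):
    print(problem)
```

In this example the second row is reported twice: once because its probabilities sum to 0.9 and once because label 3 has no matching column.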
Ranking
* Labels are required
Labels should be any real number
Predictions should be any real number
ranking_info must be provided in the data configuration