🔎 Evaluation Comparison

The Comparison page compares the performance of two models on a given dataset. The input bar at the top of the page accepts four parameters: a data range, a control model that serves as the baseline, a variant model that is tested against the baseline, and tags that define the dataset. The page features two types of charts: a bar chart and a dodge plot.

  • The bar chart displays a side-by-side comparison of the two models, showing counts for each category for both the control and variant models.

  • The dodge plot illustrates the density distribution of responses from the control and variant models side by side.
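
As a rough illustration of the data behind the bar chart, the per-category counts for the two models could be tallied as in the sketch below. The function name `category_counts` and the example labels are hypothetical and are not part of the platform's API; the sketch only assumes that each response has been assigned a category label.

```python
from collections import Counter


def category_counts(control_labels, variant_labels):
    """Return {category: (control_count, variant_count)} for every category seen.

    Hypothetical helper for illustration only; not a platform function.
    """
    control = Counter(control_labels)
    variant = Counter(variant_labels)
    # Include categories that appear for only one of the two models.
    categories = sorted(set(control) | set(variant))
    # A Counter returns 0 for a missing key, so absent categories count as 0.
    return {c: (control[c], variant[c]) for c in categories}


# Example labels are made up for this sketch.
counts = category_counts(["pass", "fail", "pass"], ["pass", "pass", "pass"])
```

Each entry pairs the control count with the variant count, which is the shape a side-by-side bar chart needs for every category.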

A tabular view at the bottom of the page presents detailed data for each input: the responses from both the control and variant models, and the difference between their scores on each evaluation criterion.
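
The score differences in a table row can be pictured with the minimal sketch below. It assumes each model's scores arrive as a mapping from criterion name to a numeric score, and that the difference is taken as variant minus control; both the data shape and the sign convention are assumptions made for illustration, and `score_deltas` is a hypothetical name.

```python
def score_deltas(control_scores, variant_scores):
    """Return variant minus control for each criterion scored by both models.

    Hypothetical helper; the sign convention is an assumption of this sketch.
    """
    # Only criteria present for both models can be compared.
    shared = control_scores.keys() & variant_scores.keys()
    return {name: variant_scores[name] - control_scores[name] for name in sorted(shared)}


# Criterion names and scores are made up for this sketch.
row = score_deltas(
    {"relevance": 3, "toxicity": 1},
    {"relevance": 5, "toxicity": 1},
)
```

Under this convention, a positive value means the variant scored higher than the control on that criterion, and zero means the two models scored the same.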

Features

  • Real-time Charts: The page shows real-time charts for the selected control and variant models, giving immediate insight into the prompts passed to each model.

  • Detailed Evaluation Analysis: The Comparison page lets users examine the evaluations of the two models in detail, input by input.

  • Data Visualization: The page visualizes results from the various evaluation methods as bar and dodge plots, which makes differences between the two models easier to see at a glance.
