📓 API

DEPRECATED: This version of the SDK and the API have been deprecated. To try out our latest API and SDK in beta, please contact us at contactus@evaluable.ai

Our API provides seamless integration between running inferences on various AI models and evaluating those responses on the Evaluable AI platform. Designed to streamline the workflow from inference to evaluation, it supports a range of operations tailored to different use cases, whether the inferences are conducted on our platform or through external model APIs such as those from OpenAI and Mistral AI.

Understanding Your Workflow Needs

The Evaluable AI Python SDK API accommodates three primary workflows:

  1. Evaluating Existing Inferences: For inferences already performed on Evaluable AI and responses stored within our platform.

  2. Submitting and Evaluating New Inferences: For newly conducted inferences via model APIs that need to be evaluated on the Evaluable AI platform.

  3. Storing Inferences for Future Evaluation: For uploading inference results to Evaluable AI for later analysis.

Below, we detail the API endpoints designed for these workflows, ensuring you can efficiently run inferences and evaluations as needed.

All APIs are built on top of the responses you get from the inference APIs of Mistral AI and OpenAI. An additional parameter object, evaluableai_params, carries the evaluation details alongside the inference objects. Its fields are listed below with their definitions.

| Parameter | Type | Description |
| --- | --- | --- |
| eval | Boolean | Determines whether the response should be evaluated (true) or not (false). |
| eval_list | Array | A list of evaluation metrics to be used for evaluation, e.g., ["bleu"]. Refer to this link for guidance on default and custom scores. |
| time_taken | Float | The time taken for the model to generate the response, measured in seconds. |
| ground_truth | String | The correct answer or ground truth against which the model's response will be evaluated. |
| async | Boolean | If set to true, the evaluation process is performed asynchronously. Defaults to false. |
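
For orientation, here is a minimal sketch of an evaluableai_params object built as a Python dictionary. The values are placeholders consistent with the table above; the metric name and ground truth are illustrative only, not required defaults.

```python
# Illustrative evaluableai_params payload (placeholder values only).
evaluableai_params = {
    "eval": True,                    # evaluate the response
    "eval_list": ["bleu"],           # metrics to compute
    "time_taken": 2.5,               # inference latency in seconds
    "ground_truth": "example truth", # reference answer to score against
    "async": False,                  # run the evaluation synchronously
}
```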

Run Eval

POST /run/eval

  • Endpoint: https://api.evaluable.ai/pythonsdk/run/eval

  • Use Case: Ideal when you've performed inferences directly on Evaluable AI or have already submitted them to the platform using the submitData API and need subsequent evaluation.

Headers

| Name | Value |
| --- | --- |
| Content-Type | application/json |
| Authorization | Bearer <token> |

Body

{
  "response_ids": ["response-unique-id"],
  "eval_list": ["evaluation_metric1", "evaluation_metric2"]
}

Response

{
  "scores": [
    {
      "evaluation_metric1": 0.75,
      "responseId": "response-unique-id"
    },
    {
      "evaluation_metric1": 0.70,
      "responseId": "response-unique-id"
    }
  ]
}
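
As an illustration only (not the SDK's own client), the request above can be issued directly against the REST endpoint with the Python requests library. The token, response ID, and metric names below are placeholders.

```python
import requests

API_TOKEN = "<token>"  # replace with your Evaluable AI bearer token

# Minimal sketch of calling POST /run/eval over HTTP.
resp = requests.post(
    "https://api.evaluable.ai/pythonsdk/run/eval",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
    json={
        "response_ids": ["response-unique-id"],
        "eval_list": ["evaluation_metric1", "evaluation_metric2"],
    },
)
resp.raise_for_status()
print(resp.json()["scores"])  # list of per-response metric scores
```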

Submit Data and Evaluate

POST /submitdataandeval

  • Endpoints:

    OpenAI: POST https://api.evaluable.ai/pythonsdk/openai/submitdataandeval

    Mistral: POST https://api.evaluable.ai/pythonsdk/mistralai/submitdataandeval

  • Use Case: Suitable when you run inference on external models and use Evaluable AI for evaluation; both OpenAI and Mistral AI models are supported.

Headers

| Name | Value |
| --- | --- |
| Content-Type | application/json |
| Authorization | Bearer <token> |

Body

{
  "openai_request": {
    "model": "text-davinci-003",
    ...
  },
  "openai_response": {
    "id": "unique_identifier",
    "choices": [],
    "model": "gpt-3.5-turbo",
    ...
  },
  "evaluableai_params": {
    "eval": true,
    "sampling": 0.8,
    "eval_list": [
      "bleu"
    ],
    "time_taken": 2.5,
    "ground_truth": "example truth",
    "async": false
  }
}

Response

{
  "scores": [
    {
      "evaluation_metric1": 0.75,
      "responseId": "response-unique-id"
    },
    {
      "evaluation_metric1": 0.70,
      "responseId": "response-unique-id"
    }
  ]
}
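
Below is a hedged sketch of the same call made with the Python requests library against the OpenAI endpoint. The openai_request and openai_response dictionaries stand in for the request you sent to the OpenAI API and the response object it returned; their contents and the chosen metric are placeholders.

```python
import requests

API_TOKEN = "<token>"  # replace with your Evaluable AI bearer token

# Placeholder request/response pair forwarded for evaluation.
payload = {
    "openai_request": {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "example prompt"}],
    },
    "openai_response": {
        "id": "unique_identifier",
        "model": "gpt-3.5-turbo",
        "choices": [],
    },
    "evaluableai_params": {
        "eval": True,
        "eval_list": ["bleu"],
        "time_taken": 2.5,
        "ground_truth": "example truth",
        "async": False,
    },
}

# Sketch of POST /openai/submitdataandeval; use the mistralai URL for Mistral models.
resp = requests.post(
    "https://api.evaluable.ai/pythonsdk/openai/submitdataandeval",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
    json=payload,
)
resp.raise_for_status()
print(resp.json()["scores"])
```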

Submit Data

POST /submitdata

To store inference results on Evaluable AI without immediate evaluation, use these endpoints.

  • Endpoints:

    OpenAI: POST https://api.evaluable.ai/pythonsdk/openai/submitdata

    Mistral AI: POST https://api.evaluable.ai/pythonsdk/mistralai/submitdata

  • Use Case: Best for uploading inferences for later evaluation, keeping the data ready within Evaluable AI for observability metrics.

Headers

| Name | Value |
| --- | --- |
| Content-Type | application/json |
| Authorization | Bearer <token> |

Body

When calling the Mistral AI endpoint, supply the Mistral request and response objects in place of the OpenAI ones shown here.

{
  "openai_request": {
    "model": "text-davinci-003",
    ...
  },
  "openai_response": {
    "id": "unique_identifier",
    "choices": [],
    "model": "gpt-3.5-turbo",
    ...
  },
  "evaluableai_params": {
    "eval": false
  }
}

Response

[
  {
    "responseId": "response-unique-id"
  }
]
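
For completeness, a short sketch of storing an inference without evaluating it, again using the requests library rather than the SDK itself. The request/response dictionaries are placeholders, as in the example above; the returned responseId can be passed to /run/eval later.

```python
import requests

API_TOKEN = "<token>"  # replace with your Evaluable AI bearer token

# Sketch of POST /openai/submitdata: store the inference, skip evaluation.
resp = requests.post(
    "https://api.evaluable.ai/pythonsdk/openai/submitdata",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}",
    },
    json={
        "openai_request": {"model": "gpt-3.5-turbo"},
        "openai_response": {
            "id": "unique_identifier",
            "model": "gpt-3.5-turbo",
            "choices": [],
        },
        "evaluableai_params": {"eval": False},
    },
)
resp.raise_for_status()
print(resp.json())  # contains the stored responseId for later evaluation
```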
