ℹ️ These docs are for the v2.0 version of Galileo. Documentation for the v1.0 version can be found here.
curl --request GET \
--url https://api.galileo.ai/v2/projects/{project_id}/experiments/{experiment_id}/metric_settings \
--header 'Galileo-API-Key: <api-key>'

{
  "scorers": [
    {
      "id": "<string>",
      "scorer_type": "llm",
      "model_name": "<string>",
      "num_judges": 123,
      "filters": [
        {
          "value": "<string>",
          "operator": "eq",
          "name": "node_name",
          "filter_type": "string",
          "case_sensitive": true
        }
      ],
      "scoreable_node_types": [
        "<string>"
      ],
      "cot_enabled": true,
      "output_type": "boolean",
      "input_type": "basic",
      "name": "<string>",
      "model_type": "slm",
      "scorer_version": {
        "id": "<string>",
        "version": 123,
        "scorer_id": "<string>",
        "generated_scorer": {
          "id": "<string>",
          "name": "<string>",
          "chain_poll_template": {
            "template": "<string>",
            "metric_system_prompt": "<string>",
            "metric_description": "<string>",
            "value_field_name": "rating",
            "explanation_field_name": "explanation",
            "metric_few_shot_examples": [
              {
                "generation_prompt_and_response": "<string>",
                "evaluating_response": "<string>"
              }
            ],
            "response_schema": {}
          },
          "instructions": "<string>",
          "user_prompt": "<string>"
        },
        "registered_scorer": {
          "id": "<string>",
          "name": "<string>",
          "score_type": "<string>"
        },
        "finetuned_scorer": {
          "id": "<string>",
          "name": "<string>",
          "lora_task_id": 123,
          "prompt": "<string>",
          "luna_input_type": "span",
          "luna_output_type": "float",
          "class_name_to_vocab_ix": {},
          "executor": "action_completion_luna"
        },
        "model_name": "<string>",
        "num_judges": 123,
        "scoreable_node_types": [
          "<string>"
        ],
        "cot_enabled": true,
        "output_type": "boolean",
        "input_type": "basic"
      }
    }
  ],
  "segment_filters": [
    {
      "sample_rate": 0.5,
      "filter": {
        "value": "<string>",
        "operator": "eq",
        "name": "node_name",
        "filter_type": "string",
        "case_sensitive": true
      },
      "llm_scorers": false
    }
  ]
}

Successful Response
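For readers calling the endpoint from Python rather than curl, the request at the top of this page could be sketched with the standard library as follows. The helper names `build_metric_settings_request` and `fetch_metric_settings` are hypothetical and not part of any Galileo SDK; the URL and the `Galileo-API-Key` header are taken from the curl example, and error handling is left out.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the curl example on this page.
BASE_URL = "https://api.galileo.ai/v2"


def build_metric_settings_request(project_id, experiment_id, api_key):
    """Build (but do not send) the GET request for an experiment's metric settings.

    Hypothetical helper for illustration; not part of an official client.
    """
    path = "/projects/{}/experiments/{}/metric_settings".format(
        urllib.parse.quote(project_id, safe=""),
        urllib.parse.quote(experiment_id, safe=""),
    )
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Galileo-API-Key": api_key},
        method="GET",
    )


def fetch_metric_settings(project_id, experiment_id, api_key, timeout=30):
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_metric_settings_request(project_id, experiment_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_metric_settings("<project_id>", "<experiment_id>", "<api-key>")` would return a dictionary with the `scorers` and `segment_filters` keys shown in the sample response, assuming the request succeeds.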
scorers[].scorer_type: Options: llm, code, luna, preset
scorers[].filters: List of filters to apply to the scorer. Each entry filters on node names in scorer jobs.
scorers[].filters[].operator: Options: eq, ne, contains
scorers[].filters[].name: Allowed value: "node_name"
scorers[].filters[].filter_type: Allowed value: "string"
scorers[].scoreable_node_types: List of node types that can be scored by this scorer. Defaults to llm/chat.
scorers[].cot_enabled: Whether to enable chain of thought for this scorer. Defaults to False for llm scorers.
scorers[].output_type: What type of output to use for model-based scorers (boolean, categorical, etc.). Options: boolean, categorical, count, discrete, freeform, percentage, multilabel
scorers[].input_type: What type of input to use for model-based scorers (sessions_normalized, trace_io_only, etc.). Options: basic, llm_spans, retriever_spans, sessions_normalized, sessions_trace_io_only, tool_spans, trace_input_only, trace_io_only, trace_normalized, trace_output_only, agent_spans, workflow_spans
scorers[].model_type: Type of model to use for this scorer. slm maps to luna, and llm maps to plus. Options: slm, llm, code
scorers[].scorer_version: ScorerVersion to use for this scorer. If not provided, the latest version will be used.
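To make the `filters` entries more concrete, here is a minimal sketch of how one entry might be evaluated against a node name on the client side. The reference does not spell out the operator semantics, so the reading used here is an assumption: `eq` is equality, `ne` is inequality, `contains` is a substring test, and `case_sensitive: false` compares lowercased strings. The function name `node_name_matches` and the default of `True` for a missing `case_sensitive` key are hypothetical.

```python
def node_name_matches(node_name, flt):
    """Return True if node_name satisfies a single `filters` entry.

    Assumed semantics (not confirmed by the API reference):
    eq = equality, ne = inequality, contains = substring test,
    case_sensitive=False = compare lowercased strings.
    """
    value = flt["value"]
    # Treating a missing case_sensitive key as True is an assumption.
    if not flt.get("case_sensitive", True):
        node_name, value = node_name.lower(), value.lower()

    operator = flt["operator"]
    if operator == "eq":
        return node_name == value
    if operator == "ne":
        return node_name != value
    if operator == "contains":
        return value in node_name
    raise ValueError(f"unsupported operator: {operator!r}")
```

For example, under these assumptions `node_name_matches("Retriever", {"value": "retriever", "operator": "eq", "case_sensitive": False})` evaluates to `True`, while the same filter with `case_sensitive` set to `True` does not match.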
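scorers[].scorer_version.generated_scorer.chain_poll_template: Template for a chainpoll metric prompt, containing all the information needed to send a chainpoll prompt.
chain_poll_template.template: Chainpoll prompt template.
chain_poll_template.metric_system_prompt: System prompt for the metric.
chain_poll_template.metric_description: Description of what the metric should do.
chain_poll_template.value_field_name: Field name to look for in the chainpoll response, for the rating.
chain_poll_template.explanation_field_name: Field name to look for in the chainpoll response, for the explanation.
chain_poll_template.metric_few_shot_examples: Few-shot examples for the metric.
chain_poll_template.response_schema: Response schema for the output.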
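The role of `value_field_name` and `explanation_field_name` can be illustrated with a small sketch that pulls the rating and explanation out of a single judge response. This assumes the chainpoll response is a JSON object keyed by those field names, which the reference implies but does not state outright; `extract_rating` is a hypothetical helper, and the fallbacks to `rating` and `explanation` simply reuse the values from the sample response.

```python
import json


def extract_rating(template, judge_response_text):
    """Read the rating and explanation from one judge response.

    Assumes the response is a JSON object whose keys match the
    template's value_field_name and explanation_field_name.
    """
    # Fallback names are the sample values from the response example (assumed).
    value_field = template.get("value_field_name", "rating")
    explanation_field = template.get("explanation_field_name", "explanation")

    payload = json.loads(judge_response_text)
    if value_field not in payload:
        raise KeyError(f"response has no {value_field!r} field")
    # The explanation is treated as optional in this sketch.
    return payload[value_field], payload.get(explanation_field)
```

A template that sets `value_field_name` to something other than `rating` would make the same function read that field instead, which is the point of keeping the names configurable.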
scorers[].scorer_version.finetuned_scorer.luna_input_type: Options: span, trace_object, trace_input_output_only
scorers[].scorer_version.finetuned_scorer.luna_output_type: Options: float, string, string_list
scorers[].scorer_version.finetuned_scorer.executor: Executor pipeline. Defaults to the finetuned scorer pipeline but can run custom Galileo score pipelines. Options: action_completion_luna, action_advancement_luna, agentic_session_success, agentic_workflow_success, agent_efficiency, agent_flow, bleu, chunk_attribution_utilization_luna, chunk_attribution_utilization, completeness_luna, completeness, context_adherence, context_adherence_luna, context_relevance, context_relevance_luna, conversation_quality, correctness, ground_truth_adherence, input_pii, input_pii_gpt, input_sexist, input_sexist_luna, input_tone, input_tone_gpt, input_toxicity, input_toxicity_luna, instruction_adherence, output_pii, output_pii_gpt, output_sexist, output_sexist_luna, output_tone, output_tone_gpt, output_toxicity, output_toxicity_luna, prompt_injection, prompt_injection_luna, prompt_perplexity, rouge, tool_error_rate, tool_error_rate_luna, tool_selection_quality, tool_selection_quality_luna, uncertainty, user_intent_change
scorers[].scorer_version.scoreable_node_types: List of node types that can be scored by this scorer. Defaults to llm/chat.
scorers[].scorer_version.cot_enabled: Whether to enable chain of thought for this scorer. Defaults to False for llm scorers.
scorers[].scorer_version.output_type: What type of output to use for model-based scorers (boolean, categorical, etc.). Options: boolean, categorical, count, discrete, freeform, percentage, multilabel
scorers[].scorer_version.input_type: What type of input to use for model-based scorers (sessions_normalized, trace_io_only, etc.). Options: basic, llm_spans, retriever_spans, sessions_normalized, sessions_trace_io_only, tool_spans, trace_input_only, trace_io_only, trace_normalized, trace_output_only, agent_spans, workflow_spans
segment_filters: List of segment filters to apply to the run.
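Since each entry in `scorers` nests several optional objects under `scorer_version`, a short summary per scorer can make a response easier to scan. The sketch below is one way to do that; `summarize_scorer` is a hypothetical helper, and it assumes that unused variants among `generated_scorer`, `registered_scorer`, and `finetuned_scorer` come back as null or are omitted, which the sample response does not confirm.

```python
# Keys under scorer_version that describe a concrete scorer implementation.
VARIANT_KEYS = ("generated_scorer", "registered_scorer", "finetuned_scorer")


def summarize_scorer(scorer):
    """Condense one entry of `scorers` into a small dictionary.

    Assumes unused variants are null or absent in the response.
    """
    version = scorer.get("scorer_version") or {}
    # Keep only the variants that carry data.
    variants = [key for key in VARIANT_KEYS if version.get(key)]
    return {
        "name": scorer.get("name"),
        "scorer_type": scorer.get("scorer_type"),
        "version": version.get("version"),
        "variants": variants,
    }
```

Applied to every item of `response["scorers"]`, this gives a compact view of which scorer types and versions are configured for the experiment.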
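segment_filters[].sample_rate: The fraction of the data to sample. Must be between 0 and 1, inclusive (0 <= x <= 1).
segment_filters[].filter: Filter to apply to the segment. By default, sample on all data.
segment_filters[].filter.operator: Options: eq, ne, contains
segment_filters[].filter.name: Allowed value: "node_name"
segment_filters[].filter.filter_type: Allowed value: "string"
segment_filters[].llm_scorers: Whether to sample only on LLM scorers.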
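The `sample_rate` constraint is simple to check before using a segment filter. The following sketch validates the documented range and then illustrates what sampling a fraction of items could look like. The independent per-item draw is purely an assumption for illustration; the reference does not describe how Galileo performs sampling, and both function names are hypothetical.

```python
import random


def validate_sample_rate(sample_rate):
    """Enforce the documented constraint 0 <= sample_rate <= 1."""
    if not 0 <= sample_rate <= 1:
        raise ValueError(f"sample_rate must be between 0 and 1, got {sample_rate}")
    return float(sample_rate)


def sample_items(items, sample_rate, seed=None):
    """Keep each item with probability sample_rate.

    Illustrative only: independent per-item draws are an assumption,
    not a description of the service's sampling strategy.
    """
    rate = validate_sample_rate(sample_rate)
    rng = random.Random(seed)
    # rng.random() is in [0.0, 1.0), so rate 0 keeps nothing and rate 1 keeps everything.
    return [item for item in items if rng.random() < rate]
```

With `sample_rate` set to 0.5, as in the sample response, roughly half of the items would be kept on average under this scheme.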