Metrics

create_custom_llm_metric

def create_custom_llm_metric(self,
                             name: str,
                             user_prompt: str,
                             node_level: StepType=StepType.llm,
                             cot_enabled: bool=True,
                             model_name: str='gpt-4.1-mini',
                             num_judges: int=3,
                             description: str='',
                             tags: Optional[list[str]]=None,
                             output_type: OutputTypeEnum=OutputTypeEnum.BOOLEAN) -> BaseScorerVersionResponse
Create a custom LLM metric.
Arguments
  • name (str): Name of the metric.
  • user_prompt (str): Prompt given to the LLM judge that defines what the metric evaluates.
  • node_level (StepType): Step type (node level) the metric is applied to.
  • cot_enabled (bool): Whether chain-of-thought reasoning is enabled for the judge.
  • model_name (str): Name of the model to use as the judge.
  • num_judges (int): Number of LLM judges used for the metric.
  • description (str): Description of the metric.
  • tags (Optional[list[str]]): Tags associated with the metric.
  • output_type (OutputTypeEnum): Output type of the metric.
Returns
  • BaseScorerVersionResponse: Response containing the created metric details.
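
A minimal usage sketch for the method form above. metrics_client is a placeholder for an instance of the class this method belongs to (the owning class is not named in this reference), and the argument values are illustrative.

# `metrics_client` is a placeholder instance; replace it with the object
# that exposes create_custom_llm_metric in your setup.
metric = metrics_client.create_custom_llm_metric(
    name="politeness",
    user_prompt="Is the assistant's response polite and professional?",
    cot_enabled=True,            # keep the judge's chain-of-thought
    model_name="gpt-4.1-mini",   # judge model (the documented default)
    num_judges=3,                # number of judges (the documented default)
    description="Flags impolite or unprofessional responses.",
    tags=["tone", "quality"],
)
# `metric` is a BaseScorerVersionResponse describing the created metric.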

create_custom_llm_metric

def create_custom_llm_metric(name: str,
                             user_prompt: str,
                             node_level: StepType=StepType.llm,
                             cot_enabled: bool=True,
                             model_name: str='gpt-4.1-mini',
                             num_judges: int=3,
                             description: str='',
                             tags: Optional[list[str]]=None,
                             output_type: OutputTypeEnum=OutputTypeEnum.BOOLEAN) -> BaseScorerVersionResponse
Create a custom LLM metric.
Arguments
  • name (str): Name of the metric.
  • user_prompt (str): Prompt given to the LLM judge that defines what the metric evaluates.
  • node_level (StepType): Step type (node level) the metric is applied to.
  • cot_enabled (bool): Whether chain-of-thought reasoning is enabled for the judge.
  • model_name (str): Name of the model to use as the judge.
  • num_judges (int): Number of LLM judges used for the metric.
  • description (str): Description of the metric.
  • tags (Optional[list[str]]): Tags associated with the metric.
  • output_type (OutputTypeEnum): Output type of the metric.
Returns
  • BaseScorerVersionResponse: Response containing the created metric details.
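
The same call in its module-level form. A short sketch, assuming the function is importable from the SDK's metrics module; the import path is an assumption and may differ in your SDK version.

from galileo.metrics import create_custom_llm_metric  # import path is an assumption

metric = create_custom_llm_metric(
    name="contains_pii",
    user_prompt="Does the response expose personally identifiable information?",
    num_judges=5,
    description="Flags responses that leak PII.",
    tags=["safety"],
)
# Returns a BaseScorerVersionResponse for the created metric.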

delete_metric

def delete_metric(name: str) -> None
Deletes a metric by its name.
Arguments
  • name: The name of the metric to delete.
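
A short sketch of deleting a metric by name; the import path is an assumption, and "contains_pii" refers to the illustrative metric created above.

from galileo.metrics import delete_metric  # import path is an assumption

delete_metric(name="contains_pii")  # returns None; the metric is removed by name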

get_metrics

def get_metrics(project_id: str,
                start_time: datetime.datetime,
                end_time: datetime.datetime,
                experiment_id: Optional[str]=None,
                log_stream_id: Optional[str]=None,
                filters: Optional[list[FilterType]]=None,
                group_by: Optional[str]=None,
                interval: int=5) -> LogRecordsMetricsResponse
Queries for metrics in a project.
Arguments
  • project_id: The unique identifier of the project.
  • start_time: The start of the time range for the query.
  • end_time: The end of the time range for the query.
  • experiment_id: Filter records by a specific experiment ID.
  • log_stream_id: Filter records by a specific log stream ID.
  • filters: A list of filters to apply to the query.
  • group_by: The field to group the results by.
  • interval: The time interval for the query in seconds.
Returns
  • LogRecordsMetricsResponse: Response containing the query results, or None if the query fails.
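
A sketch of querying metrics over the last hour for a single log stream; the import path and the placeholder project and log stream IDs are assumptions.

from datetime import datetime, timedelta, timezone

from galileo.metrics import get_metrics  # import path is an assumption

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

response = get_metrics(
    project_id="your-project-id",        # placeholder project ID
    start_time=start,
    end_time=end,
    log_stream_id="your-log-stream-id",  # placeholder log stream ID
    interval=5,                          # bucket size in seconds
)
if response is not None:
    print(response)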