Dataset

add_rows

def add_rows(self, row_data: list[dict[str, Any]]) -> 'Dataset'
Adds rows to the dataset.
Arguments
  • row_data (List[Dict[str, Any]]): The rows to add to the dataset.
Raises
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Dataset: The updated dataset with the new rows.
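A minimal sketch of preparing `row_data` for `add_rows`, assuming each row is a flat dict keyed by column name. The helper `make_rows` and the column names `input` and `output` are hypothetical and not part of the SDK.

```python
from typing import Any


def make_rows(pairs: list[tuple[str, str]]) -> list[dict[str, Any]]:
    """Build a list of row dicts in the list[dict[str, Any]] shape add_rows takes.

    The "input"/"output" column names are illustrative assumptions.
    """
    rows: list[dict[str, Any]] = []
    for question, answer in pairs:
        rows.append({"input": question, "output": answer})
    return rows


rows = make_rows([("What is 2 + 2?", "4"), ("Capital of France?", "Paris")])

# With a Dataset instance in hand, the call would then look like:
#     dataset = dataset.add_rows(rows)
```

Because `add_rows` returns the updated `Dataset`, reassigning the result keeps the local variable pointing at the refreshed instance.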

get_content

def get_content(self) -> Union[None, DatasetContent]
Gets and returns the content of the dataset. Also refreshes the content of the local dataset instance.
Raises
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Union[None, DatasetContent]: The content of the dataset
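Since most calls on this page can raise `httpx.TimeoutException`, a caller may want to retry a few times before giving up. The wrapper below is a generic sketch, not SDK functionality: `call_with_retry` is a hypothetical helper, and the demo uses the built-in `TimeoutError` as a stand-in so it runs without any network access.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: tuple[type[BaseException], ...] = (TimeoutError,),
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Call fn, retrying on the given exception types up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")


# Demo with a stand-in that times out twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "content"


result = call_with_retry(flaky)
```

Against the real SDK, one would presumably pass `dataset.get_content` as `fn` and `retry_on=(httpx.TimeoutException,)`.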

list_projects

def list_projects(self, limit: Union[Unset, int]=100) -> list
Lists all projects that this dataset is associated with.
Arguments
  • limit (Union[Unset, int]): The maximum number of projects to return. Default is 100.
Raises
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[DatasetProject]: A list of projects this dataset is used in.

Datasets

create

def create(self,
           name: str,
           content: DatasetType,
           *,
           project_id: Optional[str]=None,
           project_name: Optional[str]=None) -> Dataset
Creates a new dataset, optionally associating it with a project.
Arguments
  • name (str): The name of the dataset.
  • content (DatasetType): The content of the dataset.
  • project_id (str): Associate the dataset with this project by ID. Mutually exclusive with project_name.
  • project_name (str): Associate the dataset with this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If both project_id and project_name are provided, or if the specified project does not exist.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Dataset: The created dataset.
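The `project_id` / `project_name` rule above (at most one of the two) can be checked on the caller's side before any request is made. This is a sketch of that rule only; `check_project_args` is a hypothetical helper and does not reflect how the SDK validates internally.

```python
from typing import Optional


def check_project_args(
    project_id: Optional[str] = None,
    project_name: Optional[str] = None,
) -> Optional[tuple[str, str]]:
    """Return ("id", value), ("name", value), or None when no project is given.

    Raises ValueError when both arguments are supplied, mirroring the
    documented mutual-exclusion rule.
    """
    if project_id is not None and project_name is not None:
        raise ValueError("Provide project_id or project_name, not both.")
    if project_id is not None:
        return ("id", project_id)
    if project_name is not None:
        return ("name", project_name)
    return None


by_name = check_project_args(project_name="demo-project")
```

Checking up front only covers the "both provided" case; a nonexistent project can still surface as a `ValueError` from the call itself.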

delete

def delete(self,
           *,
           id: Optional[str]=None,
           name: Optional[str]=None,
           project_id: Optional[str]=None,
           project_name: Optional[str]=None) -> None
Deletes a dataset by id or name (exactly one of id or name must be provided). Optionally validates that the dataset is used in a specific project before deletion.
Arguments
  • id (str): The id of the dataset.
  • name (str): The name of the dataset.
  • project_id (str): Validate that the dataset is used in this project by ID before deletion. Mutually exclusive with project_name.
  • project_name (str): Validate that the dataset is used in this project by name before deletion. Mutually exclusive with project_id.
Raises
  • ValueError: If neither or both of id and name are provided, if both project_id and project_name are provided, if the specified project does not exist, or if the dataset is not used in the specified project.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.

extend

def extend(self,
           *,
           prompt_settings: Optional[dict[str, Any]]=None,
           prompt: Optional[str]=None,
           instructions: Optional[str]=None,
           examples: Optional[builtins.list[str]]=None,
           data_types: Optional[builtins.list[str]]=None,
           count: int=10) -> builtins.list[DatasetRow]
Extends a dataset with synthetically generated data based on the provided parameters. This method initiates a dataset extension job, waits for it to complete by polling its status, and then returns the content of the extended dataset.
Arguments
  • prompt_settings (Dict[str, Any]): Settings for the prompt generation. Should contain a 'model_alias' key. Example: {'model_alias': 'GPT-4o mini'}
  • prompt (str): A description of the assistant’s role.
  • instructions (str): Instructions for the assistant.
  • examples (List[str]): Examples of user prompts.
  • data_types (List[str]): The types of data to generate. Possible values are: ‘General Query’, ‘Prompt Injection’, ‘Off-Topic Query’, ‘Toxic Content in Query’, ‘Multiple Questions in Query’, ‘Sexist Content in Query’.
  • count (int, default 10): The number of synthetic examples to generate.
Raises
  • DatasetAPIException: If the request to extend the dataset fails.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[DatasetRow]: A list of rows from the extended dataset.
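The start, poll, then fetch flow described above can be illustrated with a generic polling loop. This is a sketch under assumptions, not the SDK's implementation: `wait_for_job`, the status strings (`"completed"`, `"failed"`), and the poll limits are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_for_job(
    get_status: Callable[[], str],
    fetch_result: Callable[[], T],
    *,
    done_states: frozenset[str] = frozenset({"completed"}),
    failed_states: frozenset[str] = frozenset({"failed"}),
    poll_interval: float = 0.0,
    max_polls: int = 100,
) -> T:
    """Poll get_status until the job finishes, then return fetch_result()."""
    for _ in range(max_polls):
        status = get_status()
        if status in done_states:
            return fetch_result()
        if status in failed_states:
            raise RuntimeError(f"job ended in state {status!r}")
        time.sleep(poll_interval)  # wait before asking again
    raise TimeoutError(f"job did not finish within {max_polls} polls")


# Demo with stand-in callables: two in-progress polls, then completion.
statuses = iter(["pending", "running", "completed"])
rows = wait_for_job(lambda: next(statuses), lambda: ["row-1", "row-2"])
```

Bounding the loop with `max_polls` keeps a stuck job from blocking the caller indefinitely.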

get

def get(self,
        *,
        id: Optional[str]=None,
        name: Optional[str]=None,
        with_content: bool=False,
        project_id: Optional[str]=None,
        project_name: Optional[str]=None) -> Optional[Dataset]
Retrieves a dataset by id or name (exactly one of id or name must be provided). Optionally validates that the dataset is used in a specific project.
Arguments
  • id (str): The id of the dataset.
  • name (str): The name of the dataset.
  • with_content (bool): Whether to return the content of the dataset. Default is False.
  • project_id (str): Validate that the dataset is used in this project by ID. Mutually exclusive with project_name.
  • project_name (str): Validate that the dataset is used in this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If neither or both of id and name are provided, if both project_id and project_name are provided, if the specified project does not exist, or if the dataset is not used in the specified project.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Dataset: The dataset.
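The "exactly one of id or name" rule is stricter than the project rule, since omitting both is also an error. A small sketch of that check follows; `dataset_selector` is a hypothetical helper, and its parameter names simply mirror the signature above.

```python
from typing import Optional


def dataset_selector(
    id: Optional[str] = None,
    name: Optional[str] = None,
) -> dict[str, str]:
    """Return the single keyword argument that identifies the dataset.

    Raises ValueError when neither or both of id and name are given.
    """
    # Both None or both set: the two "is None" checks agree.
    if (id is None) == (name is None):
        raise ValueError("Exactly one of id or name must be provided.")
    if id is not None:
        return {"id": id}
    return {"name": str(name)}


selector = dataset_selector(name="qa-eval")
# The result can be splatted into a lookup, for example:
#     dataset = datasets.get(**selector, with_content=True)
```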

list

def list(self,
         limit: Union[Unset, int]=100,
         *,
         project_id: Optional[str]=None,
         project_name: Optional[str]=None) -> list[Dataset]
Lists all datasets, optionally filtered by project.
Arguments
  • limit (Union[Unset, int]): The maximum number of datasets to return. Default is 100.
  • project_id (str): Filter datasets used in this project by ID. Mutually exclusive with project_name.
  • project_name (str): Filter datasets used in this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If both project_id and project_name are provided, or if the specified project does not exist.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[Dataset]: A list of datasets.

convert_dataset_row_to_record

def convert_dataset_row_to_record(dataset_row: DatasetRow) -> DatasetRecord
Converts a DatasetRow to a DatasetRecord.
Arguments
  • dataset_row (DatasetRow): The dataset row to convert.
Raises
  • ValueError: If the dataset row does not have an input field.
Returns
  • DatasetRecord: The converted dataset record.
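To illustrate the documented failure mode (a row without an input field raises `ValueError`), here is a simplified stand-in. It does not use the SDK's `DatasetRow` or `DatasetRecord` types: `RecordSketch`, `row_values_to_record`, and the `input`/`output` keys are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RecordSketch:
    """Hypothetical stand-in for a dataset record."""
    input: str
    output: Optional[str] = None


def row_values_to_record(values: dict[str, Any]) -> RecordSketch:
    """Convert a plain dict of row values into a RecordSketch.

    Raises ValueError when the row has no input field, mirroring the
    behaviour documented for convert_dataset_row_to_record.
    """
    if values.get("input") is None:
        raise ValueError("Dataset row must have an input field.")
    output = values.get("output")
    return RecordSketch(
        input=str(values["input"]),
        output=None if output is None else str(output),
    )


record = row_values_to_record({"input": "What is 2 + 2?", "output": "4"})
```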

create_dataset

def create_dataset(name: str,
                   content: DatasetType,
                   *,
                   project_id: Optional[str]=None,
                   project_name: Optional[str]=None) -> Dataset
Creates a new dataset, optionally associating it with a project.
Arguments
  • name (str): The name of the dataset.
  • content (DatasetType): The content of the dataset.
  • project_id (str): Associate the dataset with this project by ID. Mutually exclusive with project_name.
  • project_name (str): Associate the dataset with this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If both project_id and project_name are provided, or if the specified project does not exist.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Dataset: The created dataset.

delete_dataset

def delete_dataset(*,
                   id: Optional[str]=None,
                   name: Optional[str]=None,
                   project_id: Optional[str]=None,
                   project_name: Optional[str]=None) -> None
Deletes a dataset by id or name (exactly one of id or name must be provided). Optionally validates that the dataset is used in a specific project before deletion.
Arguments
  • id (str): The id of the dataset.
  • name (str): The name of the dataset.
  • project_id (str): Validate that the dataset is used in this project by ID before deletion. Mutually exclusive with project_name.
  • project_name (str): Validate that the dataset is used in this project by name before deletion. Mutually exclusive with project_id.
Raises
  • ValueError: If neither or both of id and name are provided, if both project_id and project_name are provided, if the specified project does not exist, or if the dataset is not used in the specified project.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.

extend_dataset

def extend_dataset(*,
                   prompt_settings: Optional[dict[str, Any]]=None,
                   prompt: Optional[str]=None,
                   instructions: Optional[str]=None,
                   examples: Optional[list[str]]=None,
                   data_types: Optional[list[str]]=None,
                   count: int=10) -> list[DatasetRow]
Extends a dataset with synthetically generated data based on the provided parameters. This function initiates a dataset extension job, waits for it to complete by polling its status, and then returns the content of the extended dataset.
Arguments
  • prompt_settings (Dict[str, Any]): Settings for the prompt generation. Should contain a 'model_alias' key. Example: {'model_alias': 'GPT-4o mini'}
  • prompt (str): A description of the assistant’s role.
  • instructions (str): Instructions for the assistant.
  • examples (List[str]): Examples of user prompts.
  • data_types (List[str]): The types of data to generate. Possible values are: ‘General Query’, ‘Prompt Injection’, ‘Off-Topic Query’, ‘Toxic Content in Query’, ‘Multiple Questions in Query’, ‘Sexist Content in Query’.
  • count (int, default 10): The number of synthetic examples to generate.
Raises
  • DatasetAPIException: If the request to extend the dataset fails.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[DatasetRow]: A list of rows from the extended dataset.
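Because `data_types` accepts only a fixed set of strings, it can help to assemble and check the keyword arguments before calling `extend_dataset`. The sketch below uses the values listed above; `build_extend_kwargs` is a hypothetical helper, and whether the SDK itself rejects unknown values is not stated in this reference.

```python
from typing import Any, Optional

# Values copied from the data_types description above.
ALLOWED_DATA_TYPES = (
    "General Query",
    "Prompt Injection",
    "Off-Topic Query",
    "Toxic Content in Query",
    "Multiple Questions in Query",
    "Sexist Content in Query",
)


def build_extend_kwargs(
    *,
    model_alias: str,
    prompt: Optional[str] = None,
    instructions: Optional[str] = None,
    examples: Optional[list[str]] = None,
    data_types: Optional[list[str]] = None,
    count: int = 10,
) -> dict[str, Any]:
    """Assemble keyword arguments matching the extend_dataset signature."""
    unknown = [t for t in (data_types or []) if t not in ALLOWED_DATA_TYPES]
    if unknown:
        raise ValueError(f"Unknown data_types: {unknown}")
    return {
        "prompt_settings": {"model_alias": model_alias},
        "prompt": prompt,
        "instructions": instructions,
        "examples": examples,
        "data_types": data_types,
        "count": count,
    }


kwargs = build_extend_kwargs(
    model_alias="GPT-4o mini",
    prompt="A support assistant for a bank.",
    examples=["How do I reset my PIN?"],
    data_types=["General Query", "Prompt Injection"],
    count=5,
)
# The call itself would then be:
#     rows = extend_dataset(**kwargs)
```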

get_dataset

def get_dataset(*,
                id: Optional[str]=None,
                name: Optional[str]=None,
                project_id: Optional[str]=None,
                project_name: Optional[str]=None) -> Optional[Dataset]
Retrieves a dataset by id or name (exactly one of id or name must be provided). Optionally validates that the dataset is used in a specific project.
Arguments
  • id (str): The id of the dataset.
  • name (str): The name of the dataset.
  • project_id (str): Validate that the dataset is used in this project by ID. Mutually exclusive with project_name.
  • project_name (str): Validate that the dataset is used in this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If neither or both of id and name are provided, if both project_id and project_name are provided, if the specified project does not exist, or if the dataset is not used in the specified project.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • Dataset: The dataset.
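A common pattern built on `get_dataset` and `create_dataset` is "fetch if present, otherwise create". The sketch assumes `get_dataset` returns `None` when no dataset matches, which the `Optional[Dataset]` return type suggests but this reference does not state outright. `get_or_create` is a hypothetical helper; the lookup and creation functions are passed in so the demo can run against in-memory stand-ins.

```python
from typing import Any, Callable, Optional


def get_or_create(
    get: Callable[..., Optional[Any]],
    create: Callable[..., Any],
    *,
    name: str,
    content: Any,
) -> Any:
    """Return the dataset called `name`, creating it from `content` if absent."""
    existing = get(name=name)
    if existing is not None:
        return existing
    return create(name, content)


# In-memory stand-ins for get_dataset / create_dataset.
store: dict[str, dict[str, Any]] = {}


def fake_get(*, name: str) -> Optional[dict[str, Any]]:
    return store.get(name)


def fake_create(name: str, content: Any) -> dict[str, Any]:
    store[name] = {"name": name, "content": content}
    return store[name]


first = get_or_create(fake_get, fake_create, name="qa", content=[{"input": "hi"}])
second = get_or_create(fake_get, fake_create, name="qa", content=[])
```

With the SDK, the call would presumably be `get_or_create(get_dataset, create_dataset, name=..., content=...)`.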

get_dataset_version

def get_dataset_version(*,
                        version_index: int,
                        dataset_name: Optional[str]=None,
                        dataset_id: Optional[str]=None) -> Optional[DatasetContent]
Retrieves a dataset version by dataset name or dataset id.
Arguments
  • version_index (int): The version of the dataset.
  • dataset_name (Optional[str]): The name of the dataset.
  • dataset_id (Optional[str]): The id of the dataset.
Returns
  • DatasetContent: The content of the dataset at the requested version.

get_dataset_version_history

def get_dataset_version_history(*,
                                dataset_name: Optional[str]=None,
                                dataset_id: Optional[str]=None) -> Optional[Union[HTTPValidationError, ListDatasetVersionResponse]]
Retrieves a dataset version history by dataset name or dataset id.
Arguments
  • dataset_name (str): The name of the dataset.
  • dataset_id (str): The id of the dataset.
Raises
  • HTTPValidationError:
Returns
  • ListDatasetVersionResponse: The version history of the dataset.

list_dataset_projects

def list_dataset_projects(*,
                          dataset_id: Optional[str]=None,
                          dataset_name: Optional[str]=None,
                          limit: Union[Unset, int]=100) -> list
Lists all projects that a dataset is associated with.
Arguments
  • dataset_id (str): The ID of the dataset.
  • dataset_name (str): The name of the dataset.
  • limit (Union[Unset, int]): The maximum number of projects to return. Default is 100.
Raises
  • ValueError: If neither or both dataset_id and dataset_name are provided, or if the dataset does not exist.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[DatasetProject]: A list of projects the dataset is used in.

list_datasets

def list_datasets(limit: Union[Unset, int]=100,
                  *,
                  project_id: Optional[str]=None,
                  project_name: Optional[str]=None) -> list[Dataset]
Lists all datasets, optionally filtered by project.
Arguments
  • limit (Union[Unset, int]): The maximum number of datasets to return. Default is 100.
  • project_id (str): Filter datasets used in this project by ID. Mutually exclusive with project_name.
  • project_name (str): Filter datasets used in this project by name. Mutually exclusive with project_id.
Raises
  • ValueError: If both project_id and project_name are provided, or if the specified project does not exist.
  • errors.UnexpectedStatus: If the server returns an undocumented status code and Client.raise_on_unexpected_status is True.
  • httpx.TimeoutException: If the request takes longer than Client.timeout.
Returns
  • List[Dataset]: A list of datasets.