promptquality package¶

Subpackages¶

Submodules¶

promptquality.callback module¶

class GalileoPromptCallback(project_name=None, run_name=None, scorers=None, run_tags=None, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), **kwargs)¶

Bases: BaseCallbackHandler

LangChain callback handler for logging prompts to Galileo.

Parameters:

project_name (str) – Name of the project to log to
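
A minimal usage sketch: attach the callback to each chain invocation, then call finish() to upload. The chain construction below is illustrative and assumes the langchain-openai package is installed; any LangChain runnable works the same way.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

import promptquality as pq

pq.login()  # authenticate first; see promptquality.login
handler = pq.GalileoPromptCallback(project_name="my-project")

prompt = ChatPromptTemplate.from_template("Explain {topic} briefly.")
chain = prompt | ChatOpenAI()

for topic in ["entropy", "backpropagation"]:
    chain.invoke({"topic": topic}, config={"callbacks": [handler]})

handler.finish()  # upload the logged chain runs to Galileo
```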

set_relationships(run_id, node_type, parent_run_id=None)¶
Return type:

None

get_root_id(run_id)¶
Return type:

UUID

mark_step_start(run_id, serialized, prompt=None, node_input=None, node_name=None, **kwargs)¶
Return type:

None

mark_step_end(run_id, response=None, node_output=None, **kwargs)¶
Return type:

None

on_retriever_start(serialized, query, *, run_id, parent_run_id=None, tags=None, metadata=None, **kwargs)¶

Run when Retriever starts running.

Return type:

Any

on_retriever_end(documents, *, run_id, parent_run_id=None, **kwargs)¶

Run when Retriever ends running.

Return type:

Any

on_retriever_error(error, *, run_id, parent_run_id=None, **kwargs)¶

Run when Retriever errors.

Return type:

Any

on_tool_start(serialized, input_str, *, run_id, parent_run_id=None, tags=None, metadata=None, **kwargs)¶

Run when tool starts running.

Return type:

Any

on_tool_end(output, *, run_id, parent_run_id=None, **kwargs)¶

Run when tool ends running.

Return type:

Any

on_tool_error(error, *, run_id, parent_run_id=None, **kwargs)¶

Run when tool errors.

Return type:

Any

on_agent_finish(finish, *, run_id, parent_run_id=None, tags=None, **kwargs)¶

Run on agent finish.

The order of operations for agents is on_chain_start, on_agent_action (zero or more times), on_agent_finish, then on_chain_end. We create the agent node in on_chain_start, then populate all of its agent-specific data in on_agent_finish. We skip on_agent_action because it carries no relevant information yet and may be called zero times.

Return type:

None

on_llm_start(serialized, prompts, *, run_id, parent_run_id=None, **kwargs)¶

Run when LLM starts running.

Return type:

Any

on_chat_model_start(serialized, messages, *, run_id, parent_run_id=None, **kwargs)¶

Run when Chat Model starts running.

Return type:

Any

on_llm_end(response, *, run_id, parent_run_id=None, **kwargs)¶

Run when LLM ends running.

Return type:

Any

on_llm_error(error, *, run_id, parent_run_id=None, **kwargs)¶

Run when LLM errors.

Return type:

Any

on_chain_start(serialized, inputs, *, run_id, parent_run_id=None, **kwargs)¶

Run when chain starts running.

The inputs here are expected to only be a dictionary per the LangChain docs, but from experience, we do see strings and `BaseMessage`s in there, so we support those as well.

Return type:

Any

on_chain_end(outputs, *, run_id, parent_run_id=None, **kwargs)¶

Run when chain ends running.

Return type:

Any

on_chain_error(error, *, run_id, parent_run_id=None, **kwargs)¶

Run when chain errors.

Return type:

Any

static json_serializer(obj)¶

For serializing objects that cannot be serialized by default with json.dumps.

Checks for certain methods to convert object to dict.

Return type:

Union[str, Dict[Any, Any]]
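
A sketch of using this as the json.dumps fallback; the Pydantic model is illustrative, chosen because it exposes a dict-conversion method of the kind the docstring mentions.

```python
import json

from pydantic import BaseModel

from promptquality.callback import GalileoPromptCallback


class Document(BaseModel):  # illustrative payload type
    text: str


# json.dumps cannot serialize Document on its own; the serializer
# converts objects that expose a dict-conversion method.
print(
    json.dumps(
        {"doc": Document(text="hello")},
        default=GalileoPromptCallback.json_serializer,
    )
)
```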

add_targets(targets)¶

Parameters:

targets (List[str]) – A list of target outputs. The list should be the same length as the number of chain invocations. Targets are mapped to chain root nodes.

Return type:

None

finish()¶
Return type:

None
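
Continuing the callback sketch above: one target per chain invocation, attached before the upload; the strings are illustrative.

```python
# One target per chain invocation, in the order the chain was invoked.
handler.add_targets(["<expected answer 1>", "<expected answer 2>"])
handler.finish()  # uploads the logged runs; call add_targets before this
```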

promptquality.chain_run module¶

chain_run(rows, project_name=None, run_name=None, scorers=None, run_tags=None, silent=False, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶
Return type:

None

promptquality.get_metrics module¶

get_metrics(project_id=None, run_id=None, job_id=None, config=None)¶
Return type:

PromptMetrics

promptquality.get_rows module¶

get_rows(project_id=None, run_id=None, task_type=None, config=None, starting_token=0, limit=25)¶
Return type:

List[PromptRow]

promptquality.get_template module¶

get_template(project_name=None, project_id=None, template_name=None)¶

Get a template for a specific project.

Parameters:
  • project_name (Optional[str]) – Project name.

  • project_id (Optional[UUID4]) – Project ID.

  • template_name (Optional[str]) – Template name.

Returns:

Template response.

Return type:

BaseTemplateResponse

promptquality.helpers module¶

create_project(project_name=None, config=None)¶
Return type:

ProjectResponse

get_project(project_id, config=None)¶
Return type:

ProjectResponse

get_project_from_name(project_name, raise_if_missing=True, config=None)¶
Return type:

Optional[ProjectResponse]

create_template(template, project_id=None, template_name=None, config=None)¶

Create a template in the project.

If the project ID is not provided, it will be taken from the config.

If a template with the same name already exists, it will be used. If the template text is the same, the existing template version will be selected. Otherwise, a new template version will be created and selected.

Parameters:
  • template (str) – Template text to use for the new template.

  • project_id (Optional[UUID4], optional) – Project ID, by default None, i.e. use the current project ID in config.

  • template_name (Optional[str], optional) – Name for the template, by default None, i.e. use a random name.

  • config (Optional[Config], optional) – PromptQuality Configuration, by default None, i.e. use the current config on disk.

Returns:

Validated response from the API.

Return type:

BaseTemplateResponse

Raises:

ValueError – If the project ID is not set in config.
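
A sketch of creating a template via the helpers module; the template text and name are illustrative, and a project ID is assumed to already be set in the active config (otherwise a ValueError is raised, per the docstring above).

```python
from promptquality.helpers import create_template

response = create_template(
    template="Answer the question concisely: {question}",
    template_name="qa-template",
)
# If "qa-template" already exists with identical text, the existing
# version is selected; otherwise a new version is created and selected.
print(response)
```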

get_template_from_id(project_id=None, template_id=None, config=None)¶
Return type:

BaseTemplateResponse

get_templates(project_id=None, config=None)¶
Return type:

List[BaseTemplateResponse]

select_template_version(version, project_id=None, template_id=None, config=None)¶
Return type:

BaseTemplateResponse

create_template_version(template, project_id=None, template_id=None, version=None, config=None)¶

Create a template version for the current template ID in config.

Parameters:
  • template (str) – Template text to use for the new template version.

  • project_id (Optional[UUID4], optional) – Project ID, by default None, i.e. use the current project ID in config.

  • template_id (Optional[UUID4], optional) – Template ID, by default None, i.e. use the current template ID in config.

  • version (Optional[int], optional) – Version number, by default None, i.e. use the next version number.

  • config (Optional[Config], optional) – PromptQuality Configuration, by default None, i.e. use the current config on disk.

Returns:

Validated response from the API.

Return type:

CreateTemplateVersionResponse

Raises:
  • ValueError – If the template ID is not set in config.

  • ValueError – If the project ID is not set in config.

upload_dataset(dataset, project_id, template_version_id, config=None)¶
Return type:

UUID

create_run(project_id, run_name=None, task_type=7, run_tags=None, config=None)¶
Return type:

UUID

get_run_from_name(run_name, project_id=None, config=None)¶
Return type:

CreateRunResponse

create_job(project_id=None, run_id=None, dataset_id=None, template_version_id=None, settings=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), registered_scorers=None, config=None)¶
Return type:

UUID

create_prompt_optimization_job(prompt_optimization_configuration, project_id, run_id, train_dataset_id, config)¶

Kicks off a prompt optimization job in Runners.

Parameters:
  • prompt_optimization_configuration (PromptOptimizationConfiguration) – Configuration for the prompt optimization job.

  • project_id (UUID4) – ID of the project to run the job in.

  • run_id (UUID4) – ID of the run to associate the job with.

  • train_dataset_id (UUID4) – ID of the uploaded training dataset.

  • config (Config) – pq config object.

Returns:

job_id – Job ID kicked off for prompt optimization

Return type:

UUID

get_estimated_cost(dataset, template, project_id=None, settings=None, config=None)¶
Return type:

float

get_job_status(job_id=None, config=None)¶
Return type:

GetJobStatusResponse

upload_custom_metrics(custom_scorer, project_id=None, run_id=None, task_type=None, config=None)¶
Return type:

UserSubmittedMetricsResponse

ingest_chain_rows(rows, project_id=None, run_id=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), registered_scorers=None, config=None)¶
Return type:

ChainIngestResponse

get_run_scorer_jobs(project_id=None, run_id=None, config=None)¶
Return type:

List[GetJobStatusResponse]

promptquality.integrations module¶

add_openai_integration(api_key, organization_id=None, config=None)¶

Add an OpenAI integration to your Galileo account.

If you add an integration while one already exists, the new integration will overwrite the old one.

Parameters:
  • api_key (str) – Your OpenAI API key.

  • organization_id (Optional[str], optional) – Organization ID, if you want to include it in OpenAI requests, by default None

  • config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.

Return type:

None
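
A minimal sketch; reading the key from an environment variable is one option, and the top-level import assumes the function is re-exported at the package root (otherwise import it from promptquality.integrations).

```python
import os

import promptquality as pq

pq.login()
pq.add_openai_integration(api_key=os.environ["OPENAI_API_KEY"])
```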

add_azure_integration(api_key, endpoint, headers=None, proxy=None, config=None)¶

Add an Azure integration to your Galileo account.

If you add an integration while one already exists, the new integration will overwrite the old one.

Parameters:
  • api_key (str) – Your Azure API key.

  • endpoint (str) – The endpoint to use for the Azure API.

  • headers (Optional[Dict[str, str]], optional) – Headers to use for making requests to Azure, by default None

  • config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.

Return type:

None

promptquality.job_progress module¶

job_progress(job_id=None, config=None)¶
Return type:

UUID

scorer_jobs_status(project_id=None, run_id=None, config=None)¶
Return type:

None

promptquality.login module¶

login(console_url=None)¶

Log in to a Galileo environment.

By default, this logs in to Galileo Cloud, but it can be used to log in to an enterprise Galileo deployment by passing the console URL for that environment.

Return type:

Config
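
A minimal sketch; the enterprise console URL is illustrative.

```python
import promptquality as pq

config = pq.login()  # Galileo Cloud (default)

# Enterprise deployment:
config = pq.login(console_url="https://console.your-galileo-domain.com")
```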

promptquality.prompt_optimization module¶

optimize_prompt(prompt_optimization_config, train_dataset, val_dataset=None, project_name=None, run_name=None, config=None)¶

Optimize a prompt for a given task.

This function takes a prompt and a list of evaluation criteria, and optimizes the prompt for the given task. The function uses the OpenAI API to generate and evaluate prompts, and returns the best prompt based on the evaluation criteria.

Parameters:
  • prompt_optimization_config (PromptOptimizationConfiguration) – Configuration for the prompt optimization job.

  • train_dataset (Path) – Path to the training dataset.

  • val_dataset (Optional[Path], optional) – Path to the validation dataset, by default None. If None we will use a subset of the training dataset.

  • project_name (Optional[str], optional) – Name of the project, by default None. If None we will generate a name.

  • run_name (Optional[str], optional) – Name of the run, by default None. If None we will generate a name.

  • config (Optional[Config], optional) – pq config object, by default None. If None we will use the default config.

Returns:

Object that can be polled to see progress and current prompt.

Return type:

PromptOptimization

promptquality.registered_scorers module¶

register_scorer(scorer_name, scorer_file, config=None)¶
Return type:

RegisteredScorer

list_registered_scorers(config=None)¶
Return type:

List[RegisteredScorer]

promptquality.run module¶

run(template, dataset=None, project_name=None, run_name=None, template_name=None, scorers=None, settings=None, run_tags=None, wait=True, silent=False, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶

Create a prompt run.

This function creates a prompt run that can be viewed on the Galileo console. The processing of the prompt run is asynchronous, so the function will return immediately. If the wait parameter is set to True, the function will block until the prompt run is complete.

Additionally, all of the scorers are executed asynchronously in the background after the prompt run is complete, regardless of the value of the wait parameter.

Parameters:
  • template (str) – Template text to use for the prompt run.

  • dataset (Optional[DatasetType]) – Dataset to use for the prompt run.

  • project_name (Optional[str], optional) – Project name to use, by default None which translates to a randomly generated name.

  • run_name (Optional[str], optional) – Run name to use, by default None which translates to one derived from the project name, current timestamp and template version.

  • template_name (Optional[str], optional) – Template name to use, by default None which translates to the project name.

  • scorers (List[Union[Scorers, CustomScorer, RegisteredScorer, str]], optional) – List of scorers to use, by default None.

  • settings (Optional[Settings], optional) – Settings to use, by default None which translates to the default settings.

  • run_tags (Optional[List[RunTag]], optional) – List of tags to attribute to a run, by default no tags will be added.

  • wait (bool, optional) – Whether to wait for the prompt run to complete, by default True.

  • silent (bool, optional) – Whether to suppress the console output, by default False.

  • config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.

  • scorers_config (ScorersConfiguration, optional) – Can be used to enable or disable scorers. It can be used instead of the scorers parameter, or to disable default scorers.

Returns:

Metrics for the prompt run. These are only returned if the wait parameter is True, and only for metrics that have been computed up to that point. Other metrics will be computed asynchronously.

Return type:

Optional[PromptMetrics]
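
A minimal sketch of creating a run. The template, dataset, and names are illustrative; the dict-of-lists dataset shape is assumed to be one of the accepted DatasetType forms (a path to a CSV also works).

```python
import promptquality as pq

pq.login()
metrics = pq.run(
    template="Explain {topic} in one sentence.",
    dataset={"topic": ["entropy", "gradient descent"]},
    project_name="my-project",
    scorers=[pq.Scorers.toxicity],  # assumes a Scorers enum member of this name
)
print(metrics)  # populated only if wait=True
```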

promptquality.run_sweep module¶

create_settings_combinations(base_settings, model_aliases=None, temperatures=None, max_token_options=None)¶
Return type:

List[Settings]

run_sweep(templates, dataset, project_name=None, model_aliases=None, temperatures=None, settings=None, max_token_options=None, scorers=None, run_tags=None, execute=False, wait=True, silent=True, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶

Run a sweep of prompt runs over various settings.

If execute is False, this function will estimate the cost of the batch of runs and print the estimated cost. If execute is True, this function will create the batch of runs.

We support optionally providing a subset of settings to override the base settings. If no settings are provided, we will use the base settings.

Return type:

None
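
A sketch of sweeping two templates over two temperatures; the dataset and names are illustrative.

```python
import promptquality as pq

pq.login()
pq.run_sweep(
    templates=["Summarize: {text}", "Give a TL;DR of: {text}"],
    dataset={"text": ["some document text"]},
    project_name="sweep-demo",
    temperatures=[0.0, 0.7],
    execute=False,  # False only prints the estimated cost of the batch
)
```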

promptquality.set_config module¶

set_config(console_url=None)¶

Set the config for promptquality.

If the config file exists and console_url is not passed, read it and return the config. Otherwise, set the default console URL and return the config.

Parameters:

console_url (Optional[str], optional) – URL to the Galileo console, by default None and we use the Galileo Cloud URL.

Returns:

Config object for promptquality.

Return type:

Config

promptquality.sweep module¶

sweep(fn, params)¶

Run a sweep of a function over various settings.

Given a function and a dictionary of parameters, run the function over all combinations of the parameters.

Parameters:
  • fn (Callable) – Function to run.

  • params (Dict[str, Iterable]) – Dictionary of parameters to run the function over. The keys are the parameter names and the values are the values to run the function with.

Return type:

None
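
A sketch of the combinatorial behavior: the function below is called once per combination of parameter values, four times in total, with the parameters passed by keyword.

```python
from promptquality.sweep import sweep


def report(model: str, temperature: float) -> None:
    print(f"model={model} temperature={temperature}")


# Runs report over all 2 x 2 = 4 combinations.
sweep(report, {"model": ["gpt-a", "gpt-b"], "temperature": [0.0, 0.7]})
```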

Module contents¶

PromptQuality.