promptquality package¶
Subpackages¶
- promptquality.constants package
- Submodules
- promptquality.constants.config module
- promptquality.constants.dataset_format module
- promptquality.constants.integrations module
- promptquality.constants.job module
- promptquality.constants.models module
Models
Models.chat_gpt
Models.chat_gpt_16k
Models.gpt_35_turbo
Models.gpt_35_turbo_16k
Models.gpt_35_turbo_16k_0125
Models.gpt_35_turbo_instruct
Models.gpt_4
Models.gpt_4_turbo
Models.gpt_4_turbo_0125
Models.gpt_4_128k
Models.babbage_2
Models.davinci_2
Models.azure_chat_gpt
Models.azure_chat_gpt_16k
Models.azure_gpt_35_turbo
Models.azure_gpt_35_turbo_16k
Models.azure_gpt_35_turbo_instruct
Models.azure_gpt_4
Models.text_bison
Models.text_bison_001
Models.gemini_pro
Models.aws_titan_tg1_large
Models.aws_titan_text_lite_v1
Models.aws_titan_text_express_v1
Models.cohere_command_r_v1
Models.cohere_command_r_plus_v1
Models.cohere_command_text_v14
Models.cohere_command_light_text_v14
Models.ai21_j2_mid_v1
Models.ai21_j2_ultra_v1
Models.anthropic_claude_instant_v1
Models.anthropic_claude_v1
Models.anthropic_claude_v2
Models.anthropic_claude_v21
Models.anthropic_claude_3_sonnet
Models.anthropic_claude_3_haiku
Models.meta_llama2_13b_chat_v1
Models.meta_llama3_8b_instruct_v1
Models.meta_llama3_70b_instruct_v1
Models.mistral_7b_instruct
Models.mistral_8x7b_instruct
Models.mistral_large
Models.palmyra_base
Models.palmyra_large
Models.palmyra_instruct
Models.palmyra_instruct_30
Models.palmyra_beta
Models.silk_road
Models.palmyra_e
Models.palmyra_x
Models.palmyra_x_32k
Models.palmyra_med
Models.examworks_v1
- promptquality.constants.prompt_optimization module
- promptquality.constants.routes module
- promptquality.constants.run module
- promptquality.constants.scorers module
Scorers
Scorers.toxicity
Scorers.factuality
Scorers.correctness
Scorers.groundedness
Scorers.context_adherence
Scorers.context_adherence_plus
Scorers.pii
Scorers.latency
Scorers.context_relevance
Scorers.sexist
Scorers.tone
Scorers.prompt_perplexity
Scorers.chunk_attribution_utilization_gpt
Scorers.chunk_attribution_utilization_plus
Scorers.completeness_gpt
Scorers.completeness_plus
Scorers.prompt_injection
Scorers.adherence_basic
Scorers.context_adherence_basic
Scorers.completeness_basic
Scorers.chunk_attribution_utilization_basic
- Module contents
- promptquality.types package
- Subpackages
- Submodules
- promptquality.types.config module
Config
Config.api_key
Config.console_url
Config.current_dataset_id
Config.current_job_id
Config.current_project_id
Config.current_project_name
Config.current_run_id
Config.current_run_name
Config.current_run_task_type
Config.current_run_url
Config.current_template
Config.current_template_id
Config.current_template_name
Config.current_template_version
Config.current_template_version_id
Config.current_user
Config.password
Config.token
Config.username
Config.http_url
Config.login()
Config.logout()
Config.merge_dataset()
Config.merge_job()
Config.merge_project()
Config.merge_run()
Config.merge_template()
Config.merge_template_version()
Config.read()
Config.serialize_token()
Config.token_login()
Config.validate_api_url
Config.write()
Config.api_client
Config.api_url
Config.config_file
Config.project_url
- promptquality.types.custom_scorer module
- promptquality.types.pagination module
- promptquality.types.prompt_optimization module
PromptOptimizationConfiguration
PromptOptimizationConfiguration.evaluation_criteria
PromptOptimizationConfiguration.evaluation_model_alias
PromptOptimizationConfiguration.generation_model_alias
PromptOptimizationConfiguration.iterations
PromptOptimizationConfiguration.max_tokens
PromptOptimizationConfiguration.num_train_rows
PromptOptimizationConfiguration.num_val_rows
PromptOptimizationConfiguration.prompt
PromptOptimizationConfiguration.task_description
PromptOptimizationConfiguration.temperature
PromptOptimizationConfiguration.validation_dataset_id
- promptquality.types.registered_scorers module
- promptquality.types.rows module
- promptquality.types.run module
CreateProjectRequest
ProjectResponse
BaseTemplateVersionRequest
CreateTemplateRequest
CreateTemplateVersionRequest
BaseTemplateVersionResponse
CreateTemplateVersionResponse
BaseTemplateResponse
BaseDatasetRequest
UploadDatasetRequest
UploadDatasetResponse
EstimateCostRequest
EstimatedCostResponse
RunTag
CreateRunRequest
CreateRunResponse
ScorerSettings
ScorersConfiguration
ScorersConfiguration.adherence_nli
ScorersConfiguration.chunk_attribution_utilization_gpt
ScorersConfiguration.chunk_attribution_utilization_nli
ScorersConfiguration.completeness_gpt
ScorersConfiguration.completeness_nli
ScorersConfiguration.context_relevance
ScorersConfiguration.factuality
ScorersConfiguration.groundedness
ScorersConfiguration.latency
ScorersConfiguration.pii
ScorersConfiguration.prompt_injection
ScorersConfiguration.prompt_perplexity
ScorersConfiguration.sexist
ScorersConfiguration.tone
ScorersConfiguration.toxicity
ScorersConfiguration.from_scorers()
ScorersConfiguration.merge_scorers()
CreateJobRequest
CreateJobRequest.job_name
CreateJobRequest.project_id
CreateJobRequest.prompt_dataset_id
CreateJobRequest.prompt_optimization_configuration
CreateJobRequest.prompt_registered_scorers_configuration
CreateJobRequest.prompt_scorer_settings
CreateJobRequest.prompt_scorers_configuration
CreateJobRequest.prompt_settings
CreateJobRequest.prompt_template_version_id
CreateJobRequest.run_id
CreateJobRequest.task_type
JobInfoMixin
CreateJobResponse
GetMetricsRequest
PromptMetrics
GetJobStatusResponse
GetJobStatusResponse.error_message
GetJobStatusResponse.id
GetJobStatusResponse.job_name
GetJobStatusResponse.progress_message
GetJobStatusResponse.progress_percent
GetJobStatusResponse.project_id
GetJobStatusResponse.request_data
GetJobStatusResponse.run_id
GetJobStatusResponse.status
GetJobStatusResponse.steps_completed
GetJobStatusResponse.steps_total
CreateIntegrationRequest
SelectTemplateVersionRequest
UserSubmittedMetricsResponse
- promptquality.types.settings module
- promptquality.types.user_submitted_metrics module
- Module contents
- promptquality.utils package
- Submodules
- promptquality.utils.api_client module
ApiClient
ApiClient.api_url
ApiClient.token
ApiClient.api_key_login()
ApiClient.create_job()
ApiClient.create_project()
ApiClient.create_run()
ApiClient.create_template()
ApiClient.create_template_version()
ApiClient.get_current_user()
ApiClient.get_estimated_cost()
ApiClient.get_job_status()
ApiClient.get_metrics()
ApiClient.get_project()
ApiClient.get_project_by_name()
ApiClient.get_rows()
ApiClient.get_run_by_name()
ApiClient.get_run_scorer_jobs()
ApiClient.get_template()
ApiClient.get_templates()
ApiClient.healthcheck()
ApiClient.ingest_chain_rows()
ApiClient.list_registered_scorers()
ApiClient.put_integration()
ApiClient.put_template_version_selection()
ApiClient.put_user_metrics()
ApiClient.register_scorer()
ApiClient.upload_dataset()
ApiClient.username_login()
ApiClient.auth_header
ApiClient.base_url
- promptquality.utils.config module
- promptquality.utils.dataset module
- promptquality.utils.dependencies module
- promptquality.utils.logger module
- promptquality.utils.name module
- promptquality.utils.request module
- promptquality.utils.scorer module
- Module contents
Submodules¶
promptquality.callback module¶
- class GalileoPromptCallback(project_name=None, run_name=None, scorers=None, run_tags=None, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), **kwargs)¶
Bases: BaseCallbackHandler
LangChain callback handler for logging prompts to Galileo.
- Parameters:
project_name (str) – Name of the project to log to
- set_relationships(run_id, node_type, parent_run_id=None)¶
- Return type:
None
- get_root_id(run_id)¶
- Return type:
UUID
- mark_step_start(run_id, serialized, prompt=None, node_input=None, node_name=None, **kwargs)¶
- Return type:
None
- mark_step_end(run_id, response=None, node_output=None, **kwargs)¶
- Return type:
None
- on_retriever_start(serialized, query, *, run_id, parent_run_id=None, tags=None, metadata=None, **kwargs)¶
Run when Retriever starts running.
- Return type:
Any
- on_retriever_end(documents, *, run_id, parent_run_id=None, **kwargs)¶
Run when Retriever ends running.
- Return type:
Any
- on_retriever_error(error, *, run_id, parent_run_id=None, **kwargs)¶
Run when Retriever errors.
- Return type:
Any
- on_tool_start(serialized, input_str, *, run_id, parent_run_id=None, tags=None, metadata=None, **kwargs)¶
Run when tool starts running.
- Return type:
Any
- on_tool_end(output, *, run_id, parent_run_id=None, **kwargs)¶
Run when tool ends running.
- Return type:
Any
- on_tool_error(error, *, run_id, parent_run_id=None, **kwargs)¶
Run when tool errors.
- Return type:
Any
- on_agent_finish(finish, *, run_id, parent_run_id=None, tags=None, **kwargs)¶
Run on agent finish.
The order of operations for agents is on_chain_start, on_agent_action x times, on_agent_finish, on_chain_end. We create the agent node in on_chain_start, then populate all of its agent-specific data in on_agent_finish. We skip on_agent_action because it carries no relevant info as of yet, and it may also be called 0 times.
- Return type:
None
- on_llm_start(serialized, prompts, *, run_id, parent_run_id=None, **kwargs)¶
Run when LLM starts running.
- Return type:
Any
- on_chat_model_start(serialized, messages, *, run_id, parent_run_id=None, **kwargs)¶
Run when Chat Model starts running.
- Return type:
Any
- on_llm_end(response, *, run_id, parent_run_id=None, **kwargs)¶
Run when LLM ends running.
- Return type:
Any
- on_llm_error(error, *, run_id, parent_run_id=None, **kwargs)¶
Run when LLM errors.
- Return type:
Any
- on_chain_start(serialized, inputs, *, run_id, parent_run_id=None, **kwargs)¶
Run when chain starts running.
The inputs here are expected to only be a dictionary per the LangChain docs, but from experience we also see strings and `BaseMessage`s in there, so we support those as well.
- Return type:
Any
- on_chain_end(outputs, *, run_id, parent_run_id=None, **kwargs)¶
Run when chain ends running.
- Return type:
Any
- on_chain_error(error, *, run_id, parent_run_id=None, **kwargs)¶
Run when chain errors.
- Return type:
Any
- static json_serializer(obj)¶
For serializing objects that cannot be serialized by default with json.dumps.
Checks for certain methods to convert object to dict.
- Return type:
Union[str, Dict[Any, Any]]
- add_targets(targets)¶
- Return type:
None
- Parameters:
targets (List[str]) – A list of target outputs. The list should be the same length as the number of chain invocations. Targets will be mapped to chain root nodes.
- finish()¶
- Return type:
None
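A minimal usage sketch (assuming a pre-built LangChain runnable named `chain`; the project name, scorers, question, and target string are illustrative placeholders):

    from promptquality.callback import GalileoPromptCallback
    from promptquality.constants.scorers import Scorers

    # Assumes you are already authenticated (see the login / set_config modules).
    handler = GalileoPromptCallback(
        project_name="rag-demo",  # placeholder project name
        scorers=[Scorers.context_adherence_plus, Scorers.toxicity],
    )

    # `chain` is any pre-built LangChain runnable.
    chain.invoke({"question": "What does Galileo do?"}, config={"callbacks": [handler]})

    # Attach one target per chain invocation, then upload the logged rows.
    handler.add_targets(["Galileo evaluates LLM applications."])
    handler.finish()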
promptquality.chain_run module¶
- chain_run(rows, project_name=None, run_name=None, scorers=None, run_tags=None, silent=False, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶
- Return type:
None
promptquality.get_metrics module¶
- get_metrics(project_id=None, run_id=None, job_id=None, config=None)¶
- Return type:
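A short sketch; leaving the IDs as None presumably falls back to the current project, run, and job stored in the config, matching the defaults used throughout this package:

    from promptquality.get_metrics import get_metrics

    # Fetch metrics for the run created earlier in this session.
    metrics = get_metrics()
    print(metrics)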
promptquality.get_rows module¶
promptquality.get_template module¶
- get_template(project_name=None, project_id=None, template_name=None)¶
Get a template for a specific project.
- Parameters:
project_name (Optional[str]) – Project name.
project_id (Optional[UUID4]) – Project ID.
template_name (Optional[str]) – Template name.
- Returns:
Template response.
- Return type:
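A usage sketch (the project and template names are placeholders):

    from promptquality.get_template import get_template

    template = get_template(project_name="rag-demo", template_name="summarizer-v1")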
promptquality.helpers module¶
- create_project(project_name=None, config=None)¶
- Return type:
- get_project(project_id, config=None)¶
- Return type:
- get_project_from_name(project_name, raise_if_missing=True, config=None)¶
- Return type:
Optional[ProjectResponse]
- create_template(template, project_id=None, template_name=None, config=None)¶
Create a template in the project.
If the project ID is not provided, it will be taken from the config.
If a template with the same name already exists, it will be used. If the template text is the same, the existing template version will be selected. Otherwise, a new template version will be created and selected.
- Parameters:
template (str) – Template text to use for the new template.
project_id (Optional[UUID4], optional) – Project ID, by default None, i.e. use the current project ID in config.
template_name (Optional[str], optional) – Name for the template, by default None, i.e. use a random name.
config (Optional[Config], optional) – PromptQuality Configuration, by default None, i.e. use the current config on disk.
- Returns:
Validated response from the API.
- Return type:
- Raises:
ValueError – If the project ID is not set in config.
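A usage sketch (the template text and name are placeholders); per the behavior described above, re-running with identical text selects the existing version rather than creating a duplicate:

    from promptquality.helpers import create_template

    response = create_template(
        template="Summarize the following text:\n{text}",
        template_name="summarizer-v1",
    )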
- get_template_from_id(project_id=None, template_id=None, config=None)¶
- Return type:
- get_templates(project_id=None, config=None)¶
- Return type:
List[BaseTemplateResponse]
- select_template_version(version, project_id=None, template_id=None, config=None)¶
- Return type:
- create_template_version(template, project_id=None, template_id=None, version=None, config=None)¶
Create a template version for the current template ID in config.
- Parameters:
template (str) – Template text to use for the new template version.
project_id (Optional[UUID4], optional) – Project ID, by default None, i.e. use the current project ID in config.
template_id (Optional[UUID4], optional) – Template ID, by default None, i.e. use the current template ID in config.
version (Optional[int], optional) – Version number, by default None, i.e. use the next version number.
config (Optional[Config], optional) – PromptQuality Configuration, by default None, i.e. use the current config on disk.
- Returns:
Validated response from the API.
- Return type:
- Raises:
ValueError – If the template ID is not set in config.
ValueError – If the project ID is not set in config.
- upload_dataset(dataset, project_id, template_version_id, config=None)¶
- Return type:
UUID
- create_run(project_id, run_name=None, task_type=7, run_tags=None, config=None)¶
- Return type:
UUID
- get_run_from_name(run_name, project_id=None, config=None)¶
- Return type:
- create_job(project_id=None, run_id=None, dataset_id=None, template_version_id=None, settings=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), registered_scorers=None, config=None)¶
- Return type:
UUID
- create_prompt_optimization_job(prompt_optimization_configuration, project_id, run_id, train_dataset_id, config)¶
Kicks off a prompt optimization job in Runners.
- Parameters:
prompt_optimization_configuration (PromptOptimizationConfiguration) – Configuration for the prompt optimization job.
project_id (UUID4) – ID of the project to create the job in.
run_id (UUID4) – ID of the run to attach the job to.
train_dataset_id (UUID4) – ID of the uploaded training dataset.
config (Config) – PromptQuality configuration object.
- Returns:
job_id – Job ID kicked off for prompt optimization
- Return type:
UUID
- get_estimated_cost(dataset, template, project_id=None, settings=None, config=None)¶
- Return type:
float
- get_job_status(job_id=None, config=None)¶
- Return type:
- upload_custom_metrics(custom_scorer, project_id=None, run_id=None, task_type=None, config=None)¶
- Return type:
- ingest_chain_rows(rows, project_id=None, run_id=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False), registered_scorers=None, config=None)¶
- Return type:
- get_run_scorer_jobs(project_id=None, run_id=None, config=None)¶
- Return type:
List[GetJobStatusResponse]
promptquality.integrations module¶
- add_openai_integration(api_key, organization_id=None, config=None)¶
Add an OpenAI integration to your Galileo account.
If you add an integration while one already exists, the new integration will overwrite the old one.
- Parameters:
api_key (str) – Your OpenAI API key.
organization_id (Optional[str], optional) – Organization ID, if you want to include it in OpenAI requests, by default None
config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.
- Return type:
None
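A usage sketch, reading the key from the environment:

    import os

    from promptquality.integrations import add_openai_integration

    add_openai_integration(api_key=os.environ["OPENAI_API_KEY"])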
- add_azure_integration(api_key, endpoint, headers=None, proxy=None, config=None)¶
Add an Azure integration to your Galileo account.
If you add an integration while one already exists, the new integration will overwrite the old one.
- Parameters:
api_key (str) – Your Azure API key.
endpoint (str) – The endpoint to use for the Azure API.
headers (Optional[Dict[str, str]], optional) – Headers to use for making requests to Azure, by default None
config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.
- Return type:
None
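A usage sketch (the endpoint is a placeholder):

    import os

    from promptquality.integrations import add_azure_integration

    add_azure_integration(
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        endpoint="https://my-resource.openai.azure.com",
    )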
promptquality.job_progress module¶
- job_progress(job_id=None, config=None)¶
- Return type:
UUID
- scorer_jobs_status(project_id=None, run_id=None, config=None)¶
- Return type:
None
promptquality.login module¶
promptquality.prompt_optimization module¶
- optimize_prompt(prompt_optimization_config, train_dataset, val_dataset=None, project_name=None, run_name=None, config=None)¶
Optimize a prompt for a given task.
This function takes a prompt and a list of evaluation criteria, and optimizes the prompt for the given task. The function uses the OpenAI API to generate and evaluate prompts, and returns the best prompt based on the evaluation criteria.
- Parameters:
prompt_optimization_config (PromptOptimizationConfiguration) – Configuration for the prompt optimization job.
train_dataset (Path) – Path to the training dataset.
val_dataset (Optional[Path], optional) – Path to the validation dataset, by default None. If None we will use a subset of the training dataset.
project_name (Optional[str], optional) – Name of the project, by default None. If None we will generate a name.
run_name (Optional[str], optional) – Name of the run, by default None. If None we will generate a name.
config (Optional[Config], optional) – pq config object, by default None. If None we will use the default config.
- Returns:
Object that can be polled to see progress and current prompt.
- Return type:
PromptOptimization
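A sketch built from the PromptOptimizationConfiguration fields listed earlier; every value below is an illustrative placeholder, and which fields are required (and whether the model aliases are plain strings) is an assumption:

    from pathlib import Path

    from promptquality.prompt_optimization import optimize_prompt
    from promptquality.types.prompt_optimization import PromptOptimizationConfiguration

    po_config = PromptOptimizationConfiguration(
        prompt="Answer the question: {question}",
        task_description="Short-form question answering.",
        evaluation_criteria="Factual accuracy of the answer.",
        iterations=3,
        num_train_rows=50,
        num_val_rows=20,
        max_tokens=256,
        temperature=0.5,
        generation_model_alias="gpt-3.5-turbo",  # placeholder model alias
        evaluation_model_alias="gpt-4",          # placeholder model alias
    )
    job = optimize_prompt(
        prompt_optimization_config=po_config,
        train_dataset=Path("train.csv"),  # placeholder training dataset
        project_name="prompt-opt-demo",
    )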
promptquality.registered_scorers module¶
- register_scorer(scorer_name, scorer_file, config=None)¶
- Return type:
- list_registered_scorers(config=None)¶
- Return type:
List[RegisteredScorer]
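A usage sketch; the scorer file is assumed to follow Galileo's custom-scorer contract, and the name and path are placeholders:

    from promptquality.registered_scorers import (
        list_registered_scorers,
        register_scorer,
    )

    register_scorer(scorer_name="brevity", scorer_file="brevity_scorer.py")
    print(list_registered_scorers())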
promptquality.run module¶
- run(template, dataset=None, project_name=None, run_name=None, template_name=None, scorers=None, settings=None, run_tags=None, wait=True, silent=False, config=None, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶
Create a prompt run.
This function creates a prompt run that can be viewed on the Galileo console. The processing of the prompt run is asynchronous, so the function will return immediately. If the wait parameter is set to True, the function will block until the prompt run is complete.
Additionally, all of the scorers are executed asynchronously in the background after the prompt run is complete, regardless of the value of the wait parameter.
- Parameters:
template (str) – Template text to use for the prompt run.
dataset (Optional[DatasetType]) – Dataset to use for the prompt run.
project_name (Optional[str], optional) – Project name to use, by default None which translates to a randomly generated name.
run_name (Optional[str], optional) – Run name to use, by default None which translates to one derived from the project name, current timestamp and template version.
template_name (Optional[str], optional) – Template name to use, by default None which translates to the project name.
scorers (List[Union[Scorers, CustomScorer, RegisteredScorer, str]], optional) – List of scorers to use, by default None.
settings (Optional[Settings], optional) – Settings to use, by default None which translates to the default settings.
run_tags (Optional[List[RunTag]], optional,) – List of tags to attribute to a run, by default no tags will be added.
wait (bool, optional) – Whether to wait for the prompt run to complete, by default True.
silent (bool, optional) – Whether to suppress the console output, by default False.
config (Optional[Config], optional) – Config to use, by default None which translates to the config being set automatically.
scorers_config (ScorersConfig, optional) – Can be used to enable or disable scorers. Can be used instead of scorers param, or can be used to disable default scorers.
- Returns:
Metrics for the prompt run. These are only returned if the wait parameter is True, and only for metrics that have been computed up to that point. Other metrics will be computed asynchronously.
- Return type:
Optional[PromptMetrics]
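A minimal sketch (the dataset path, project name, and template are placeholders):

    from promptquality.constants.scorers import Scorers
    from promptquality.run import run

    metrics = run(
        template="Explain {topic} in one sentence.",
        dataset="topics.csv",  # placeholder CSV with a `topic` column
        project_name="quickstart",
        scorers=[Scorers.toxicity, Scorers.prompt_perplexity],
    )
    print(metrics)  # metrics computed so far, since wait defaults to True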
promptquality.run_sweep module¶
- create_settings_combinations(base_settings, model_aliases=None, temperatures=None, max_token_options=None)¶
- Return type:
List[Settings]
- run_sweep(templates, dataset, project_name=None, model_aliases=None, temperatures=None, settings=None, max_token_options=None, scorers=None, run_tags=None, execute=False, wait=True, silent=True, scorers_config=ScorersConfiguration(toxicity=True, factuality=False, groundedness=False, context_relevance=False, latency=True, sexist=False, pii=True, prompt_perplexity=False, chunk_attribution_utilization_gpt=False, completeness_gpt=False, tone=False, prompt_injection=False, adherence_nli=False, chunk_attribution_utilization_nli=False, completeness_nli=False))¶
Run a sweep of prompt runs over various settings.
If execute is False, this function will estimate the cost of the batch of runs and print the estimated cost. If execute is True, this function will create the batch of runs.
You may optionally provide a subset of settings to override the base settings; if no settings are provided, the base settings are used.
- Return type:
None
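A sketch; per the description above, execute=False only prints the estimated cost, so flip it to True to launch the runs (the dataset and names are placeholders):

    from promptquality.constants.models import Models
    from promptquality.run_sweep import run_sweep

    run_sweep(
        templates=["Summarize: {text}", "TL;DR: {text}"],
        dataset="articles.csv",
        project_name="sweep-demo",
        model_aliases=[Models.chat_gpt, Models.gpt_4],
        temperatures=[0.0, 0.7],
        execute=False,
    )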
promptquality.set_config module¶
- set_config(console_url=None)¶
Set the config for promptquality.
If the config file exists and console_url is not passed, read and return the stored config. Otherwise, set the default console URL and return the config.
- Parameters:
console_url (Optional[str], optional) – URL to the Galileo console, by default None and we use the Galileo Cloud URL.
- Returns:
Config object for promptquality.
- Return type:
Config
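A usage sketch (the console URL is a placeholder):

    from promptquality.set_config import set_config

    # Omit console_url to reuse the stored config or the Galileo Cloud default.
    config = set_config(console_url="https://console.galileo.example.com")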
promptquality.sweep module¶
- sweep(fn, params)¶
Run a sweep of a function over various settings.
Given a function and a dictionary of parameters, run the function over all combinations of the parameters.
- Parameters:
fn (Callable) – Function to run.
params (Dict[str, Iterable]) – Dictionary of parameters to run the function over. The keys are the parameter names and the values are the values to run the function with.
- Return type:
None
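A sketch, assuming the dictionary keys map onto fn's keyword arguments as the description implies:

    from promptquality.sweep import sweep

    def report(model: str, temperature: float) -> None:
        print(f"model={model} temperature={temperature}")

    # Runs `report` over all 2 x 2 = 4 parameter combinations.
    sweep(report, {"model": ["gpt-3.5", "gpt-4"], "temperature": [0.0, 0.7]})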
Module contents¶
PromptQuality.