get_evaluator

BedrockAgentCoreControl.Client.get_evaluator(**kwargs)

Retrieves detailed information about an evaluator, including its configuration, status, and metadata. Works with both built-in and custom evaluators.

Request Syntax

response = client.get_evaluator(
    evaluatorId='string'
)

Parameters:

evaluatorId (string) –

[REQUIRED]

The unique identifier of the evaluator to retrieve. Can be a built-in evaluator ID (e.g., Builtin.Helpfulness) or a custom evaluator ID.
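
As an illustration of the request above, here is a minimal sketch of a call wrapper. `fetch_evaluator` and `StubClient` are hypothetical names introduced for this example and are not part of the SDK. The stub stands in for a real client so the sketch runs without AWS credentials; in real use the client would presumably come from `boto3.client("bedrock-agentcore-control")`, which is an assumption about the service name.

```python
def fetch_evaluator(client, evaluator_id):
    """Call get_evaluator for a built-in or custom evaluator ID.

    `client` is any object exposing get_evaluator(evaluatorId=...).
    """
    if not evaluator_id:
        # evaluatorId is marked [REQUIRED], so fail early on an empty value.
        raise ValueError("evaluatorId is required")
    return client.get_evaluator(evaluatorId=evaluator_id)


class StubClient:
    """Hypothetical offline stand-in for the real client (illustration only)."""

    def get_evaluator(self, **kwargs):
        # Echo the requested ID back with made-up status and level values.
        return {
            "evaluatorId": kwargs["evaluatorId"],
            "status": "ACTIVE",
            "level": "TRACE",
        }


response = fetch_evaluator(StubClient(), "Builtin.Helpfulness")
print(response["evaluatorId"], response["status"])
```

Passing the client in as an argument keeps the helper easy to test, since a stub can replace the network call.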

Return type:

dict

Returns:

Response Syntax

{
    'evaluatorArn': 'string',
    'evaluatorId': 'string',
    'evaluatorName': 'string',
    'description': 'string',
    'evaluatorConfig': {
        'llmAsAJudge': {
            'instructions': 'string',
            'ratingScale': {
                'numerical': [
                    {
                        'definition': 'string',
                        'value': 123.0,
                        'label': 'string'
                    },
                ],
                'categorical': [
                    {
                        'definition': 'string',
                        'label': 'string'
                    },
                ]
            },
            'modelConfig': {
                'bedrockEvaluatorModelConfig': {
                    'modelId': 'string',
                    'inferenceConfig': {
                        'maxTokens': 123,
                        'temperature': ...,
                        'topP': ...,
                        'stopSequences': [
                            'string',
                        ]
                    },
                    'additionalModelRequestFields': {...}|[...]|123|123.4|'string'|True|None
                }
            }
        }
    },
    'level': 'TOOL_CALL'|'TRACE'|'SESSION',
    'status': 'ACTIVE'|'CREATING'|'CREATE_FAILED'|'UPDATING'|'UPDATE_FAILED'|'DELETING',
    'createdAt': datetime(2015, 1, 1),
    'updatedAt': datetime(2015, 1, 1),
    'lockedForModification': True|False
}
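
The `status` field in the response syntax above can be polled until the evaluator leaves a transitional state. The following is a sketch under stated assumptions, not a documented waiter: `wait_until_settled` and `SequenceStub` are hypothetical names, and treating `CREATING` and `UPDATING` as the only transitional states is an assumption made for this example.

```python
import time

# Assumption: these two statuses mean "still changing"; anything else is settled.
TRANSITIONAL_STATUSES = {"CREATING", "UPDATING"}


def wait_until_settled(client, evaluator_id, delay_seconds=2.0,
                       max_attempts=30, sleep=time.sleep):
    """Poll get_evaluator until the status is no longer transitional."""
    for _ in range(max_attempts):
        response = client.get_evaluator(evaluatorId=evaluator_id)
        if response["status"] not in TRANSITIONAL_STATUSES:
            return response
        sleep(delay_seconds)
    raise TimeoutError(
        f"evaluator {evaluator_id} still transitional after {max_attempts} attempts"
    )


class SequenceStub:
    """Hypothetical stub that returns a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.calls = 0

    def get_evaluator(self, **kwargs):
        # Repeat the last scripted status once the sequence is exhausted.
        index = min(self.calls, len(self._statuses) - 1)
        self.calls += 1
        return {"evaluatorId": kwargs["evaluatorId"], "status": self._statuses[index]}


stub = SequenceStub(["CREATING", "CREATING", "ACTIVE"])
naps = []  # record requested delays instead of actually sleeping
result = wait_until_settled(stub, "my-evaluator-id", sleep=naps.append)
print(result["status"], stub.calls)
```

A caller would still need to inspect the returned status, because `CREATE_FAILED` and `UPDATE_FAILED` also end the loop.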

Response Structure

(dict) –
- evaluatorArn (string) –
  
  The Amazon Resource Name (ARN) of the evaluator.
- evaluatorId (string) –
  
  The unique identifier of the evaluator.
- evaluatorName (string) –
  
  The name of the evaluator.
- description (string) –
  
  The description of the evaluator.
- evaluatorConfig (dict) –
  
  The configuration of the evaluator, including LLM-as-a-Judge settings for custom evaluators.
  Note
  This is a Tagged Union structure. Only one of the following top level keys will be set: llmAsAJudge. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
  'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
  - llmAsAJudge (dict) –
    
    The LLM-as-a-Judge configuration that uses a language model to evaluate agent performance based on custom instructions and rating scales.
    - instructions (string) –
      
      The evaluation instructions that guide the language model in assessing agent performance, including criteria and evaluation guidelines.
    - ratingScale (dict) –
      
      The rating scale that defines how the evaluator should score agent performance, either numerical or categorical.
      Note
      This is a Tagged Union structure. Only one of the following top level keys will be set: numerical, categorical. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
      
      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      - numerical (list) –
        
        The numerical rating scale with defined score values and descriptions for quantitative evaluation.
        
        - (dict) –
          
          The definition of a numerical rating scale option that provides a numeric value with its description for evaluation scoring.
          - definition (string) –
            
            The description that explains what this numerical rating represents and when it should be used.
          - value (float) –
            
            The numerical value for this rating scale option.
          - label (string) –
            
            The label or name that describes this numerical rating option.
      - categorical (list) –
        
        The categorical rating scale with named categories and definitions for qualitative evaluation.
        
        - (dict) –
          
          The definition of a categorical rating scale option that provides a named category with its description for evaluation scoring.
          - definition (string) –
            
            The description that explains what this categorical rating represents and when it should be used.
          - label (string) –
            
            The label or name of this categorical rating option.
    - modelConfig (dict) –
      
      The model configuration that specifies which foundation model to use and how to configure it for evaluation.
      Note
      This is a Tagged Union structure. Only one of the following top level keys will be set: bedrockEvaluatorModelConfig. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
      
      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      - bedrockEvaluatorModelConfig (dict) –
        
        The Amazon Bedrock model configuration for evaluation.
        
        - modelId (string) –
          
          The identifier of the Amazon Bedrock model to use for evaluation. Must be a supported foundation model available in your region.
        - inferenceConfig (dict) –
          
          The inference configuration parameters that control model behavior during evaluation, including temperature, token limits, and sampling settings.
          - maxTokens (integer) –
            
            The maximum number of tokens to generate in the model response during evaluation.
          - temperature (float) –
            
            The temperature value that controls randomness in the model’s responses. Lower values produce more deterministic outputs.
          - topP (float) –
            
            The top-p sampling parameter that controls the diversity of the model’s responses by limiting the cumulative probability of token choices.
          - stopSequences (list) –
            
            The list of sequences that will cause the model to stop generating tokens when encountered.
            - (string) –
        - additionalModelRequestFields (document) –
          
          Additional model-specific request fields to customize model behavior beyond the standard inference configuration.
- level (string) –
  
  The evaluation level (TOOL_CALL, TRACE, or SESSION) that determines the scope of evaluation.
- status (string) –
  
  The current status of the evaluator.
- createdAt (datetime) –
  
  The timestamp when the evaluator was created.
- updatedAt (datetime) –
  
  The timestamp when the evaluator was last updated.
- lockedForModification (boolean) –
  
  Whether the evaluator is locked for modification due to being referenced by active online evaluation configurations.
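
The tagged unions described above (`evaluatorConfig` and `ratingScale`) each set exactly one top-level key, so reading them means checking which key is present. The sketch below shows one way to do that, including the `SDK_UNKNOWN_MEMBER` fallback. `describe_rating_scale` is a hypothetical helper, the sample response is made-up data, and returning `(None, [])` when `evaluatorConfig` is absent is an assumption about how a response without LLM-as-a-Judge settings might look.

```python
def describe_rating_scale(response):
    """Return (kind, options) for the evaluator's rating scale.

    kind is "numerical", "categorical", "unknown:<member name>", or None.
    """
    config = response.get("evaluatorConfig", {})
    if "SDK_UNKNOWN_MEMBER" in config:
        return ("unknown:" + config["SDK_UNKNOWN_MEMBER"]["name"], [])

    judge = config.get("llmAsAJudge")
    if judge is None:
        return (None, [])

    scale = judge.get("ratingScale", {})
    if "SDK_UNKNOWN_MEMBER" in scale:
        return ("unknown:" + scale["SDK_UNKNOWN_MEMBER"]["name"], [])
    if "numerical" in scale:
        # Sort numeric options by value so the lowest score comes first.
        ordered = sorted(scale["numerical"], key=lambda option: option["value"])
        return ("numerical", [(o["label"], o["value"]) for o in ordered])
    if "categorical" in scale:
        return ("categorical", [o["label"] for o in scale["categorical"]])
    return (None, [])


# Made-up response fragment shaped like the Response Syntax above.
sample_response = {
    "evaluatorId": "my-evaluator-id",
    "evaluatorConfig": {
        "llmAsAJudge": {
            "instructions": "Rate how well the answer addresses the question.",
            "ratingScale": {
                "numerical": [
                    {"definition": "Fully answers", "value": 1.0, "label": "Good"},
                    {"definition": "Does not answer", "value": 0.0, "label": "Poor"},
                ]
            },
        }
    },
}

kind, options = describe_rating_scale(sample_response)
print(kind, options)
```

Checking for `SDK_UNKNOWN_MEMBER` before the known keys means a newer union member is reported by name instead of being silently treated as missing.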
