get_evaluator

BedrockAgentCoreControl.Client.get_evaluator(**kwargs)

Retrieves detailed information about an evaluator, including its configuration, status, and metadata. Works with both built-in and custom evaluators.

See also: AWS API Documentation

Request Syntax

response = client.get_evaluator(
    evaluatorId='string'
)

Parameters:

evaluatorId (string) –

[REQUIRED]

The unique identifier of the evaluator to retrieve. Can be a built-in evaluator ID (e.g., Builtin.Helpfulness) or a custom evaluator ID.
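
A minimal sketch of calling this operation through a thin wrapper. The helper name fetch_evaluator is hypothetical and not part of the SDK. It takes an already constructed client, checks the ID locally, and drops the ResponseMetadata key that boto3 adds to responses.

```python
def fetch_evaluator(client, evaluator_id):
    """Call get_evaluator and return the response without SDK metadata.

    `client` is expected to expose get_evaluator(evaluatorId=...), as the
    BedrockAgentCoreControl client does. This wrapper is illustrative only.
    """
    # evaluatorId is required, so reject empty or non-string values early.
    if not isinstance(evaluator_id, str) or not evaluator_id:
        raise ValueError("evaluator_id must be a non-empty string")

    response = client.get_evaluator(evaluatorId=evaluator_id)

    # boto3 attaches a ResponseMetadata entry to each response dict;
    # remove it so only the documented fields remain.
    return {key: value for key, value in response.items()
            if key != "ResponseMetadata"}
```

With a real client, which is assumed here to be created with boto3.client('bedrock-agentcore-control'), the call would look like fetch_evaluator(client, 'Builtin.Helpfulness').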

Return type:

dict

Returns:

Response Syntax

{
    'evaluatorArn': 'string',
    'evaluatorId': 'string',
    'evaluatorName': 'string',
    'description': 'string',
    'evaluatorConfig': {
        'llmAsAJudge': {
            'instructions': 'string',
            'ratingScale': {
                'numerical': [
                    {
                        'definition': 'string',
                        'value': 123.0,
                        'label': 'string'
                    },
                ],
                'categorical': [
                    {
                        'definition': 'string',
                        'label': 'string'
                    },
                ]
            },
            'modelConfig': {
                'bedrockEvaluatorModelConfig': {
                    'modelId': 'string',
                    'inferenceConfig': {
                        'maxTokens': 123,
                        'temperature': ...,
                        'topP': ...,
                        'stopSequences': [
                            'string',
                        ]
                    },
                    'additionalModelRequestFields': {...}|[...]|123|123.4|'string'|True|None
                }
            }
        }
    },
    'level': 'TOOL_CALL'|'TRACE'|'SESSION',
    'status': 'ACTIVE'|'CREATING'|'CREATE_FAILED'|'UPDATING'|'UPDATE_FAILED'|'DELETING',
    'createdAt': datetime(2015, 1, 1),
    'updatedAt': datetime(2015, 1, 1),
    'lockedForModification': True|False
}
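
Because status can report in-progress values, a caller may want to poll until the evaluator settles. The following is a rough sketch, not an SDK waiter. The helper wait_until_settled and its parameters are hypothetical, and treating CREATING, UPDATING, and DELETING as transitional is an assumption based only on the status names listed above.

```python
import time

# Assumption: these statuses mean work is still in progress.
TRANSITIONAL_STATUSES = {"CREATING", "UPDATING", "DELETING"}


def wait_until_settled(client, evaluator_id, delay_seconds=5.0,
                       max_attempts=20, sleep=time.sleep):
    """Poll get_evaluator until status leaves a transitional state.

    Returns the last response. Raises TimeoutError if the evaluator is
    still in a transitional state after max_attempts calls.
    """
    for attempt in range(max_attempts):
        response = client.get_evaluator(evaluatorId=evaluator_id)
        if response["status"] not in TRANSITIONAL_STATUSES:
            return response
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(
        f"evaluator {evaluator_id!r} did not settle "
        f"after {max_attempts} attempts"
    )
```

The sleep argument is injectable so the loop can be exercised without real delays. The returned status may still be CREATE_FAILED or UPDATE_FAILED, so callers should inspect it before proceeding.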

Response Structure

  • (dict) –

    • evaluatorArn (string) –

      The Amazon Resource Name (ARN) of the evaluator.

    • evaluatorId (string) –

      The unique identifier of the evaluator.

    • evaluatorName (string) –

      The name of the evaluator.

    • description (string) –

      The description of the evaluator.

    • evaluatorConfig (dict) –

      The configuration of the evaluator, including LLM-as-a-Judge settings for custom evaluators.

      Note

      This is a Tagged Union structure. Only one of the following top level keys will be set: llmAsAJudge. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      
      • llmAsAJudge (dict) –

        The LLM-as-a-Judge configuration that uses a language model to evaluate agent performance based on custom instructions and rating scales.

        • instructions (string) –

          The evaluation instructions that guide the language model in assessing agent performance, including criteria and evaluation guidelines.

        • ratingScale (dict) –

          The rating scale that defines how the evaluator should score agent performance, either numerical or categorical.

          Note

          This is a Tagged Union structure. Only one of the following top level keys will be set: numerical, categorical. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

          'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
          
          • numerical (list) –

            The numerical rating scale with defined score values and descriptions for quantitative evaluation.

            • (dict) –

              The definition of a numerical rating scale option that provides a numeric value with its description for evaluation scoring.

              • definition (string) –

                The description that explains what this numerical rating represents and when it should be used.

              • value (float) –

                The numerical value for this rating scale option.

              • label (string) –

                The label or name that describes this numerical rating option.

          • categorical (list) –

            The categorical rating scale with named categories and definitions for qualitative evaluation.

            • (dict) –

              The definition of a categorical rating scale option that provides a named category with its description for evaluation scoring.

              • definition (string) –

                The description that explains what this categorical rating represents and when it should be used.

              • label (string) –

                The label or name of this categorical rating option.

        • modelConfig (dict) –

          The model configuration that specifies which foundation model to use and how to configure it for evaluation.

          Note

          This is a Tagged Union structure. Only one of the following top level keys will be set: bedrockEvaluatorModelConfig. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

          'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
          
          • bedrockEvaluatorModelConfig (dict) –

            The Amazon Bedrock model configuration for evaluation.

            • modelId (string) –

              The identifier of the Amazon Bedrock model to use for evaluation. Must be a supported foundation model available in your region.

            • inferenceConfig (dict) –

              The inference configuration parameters that control model behavior during evaluation, including temperature, token limits, and sampling settings.

              • maxTokens (integer) –

                The maximum number of tokens to generate in the model response during evaluation.

              • temperature (float) –

                The temperature value that controls randomness in the model’s responses. Lower values produce more deterministic outputs.

              • topP (float) –

                The top-p sampling parameter that controls the diversity of the model’s responses by limiting the cumulative probability of token choices.

              • stopSequences (list) –

                The list of sequences that will cause the model to stop generating tokens when encountered.

                • (string) –

            • additionalModelRequestFields (document) –

              Additional model-specific request fields to customize model behavior beyond the standard inference configuration.

    • level (string) –

      The evaluation level (TOOL_CALL, TRACE, or SESSION) that determines the scope of evaluation.

    • status (string) –

      The current status of the evaluator.

    • createdAt (datetime) –

      The timestamp when the evaluator was created.

    • updatedAt (datetime) –

      The timestamp when the evaluator was last updated.

    • lockedForModification (boolean) –

      Whether the evaluator is locked for modification because it is referenced by active online evaluation configurations.
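
The tagged-union notes above imply that a reader of the response should check which single key is present, and tolerate SDK_UNKNOWN_MEMBER, before descending into the structure. The sketch below shows one way to do that for the rating scale. The helper names union_member and rating_labels are hypothetical, and the sketch assumes evaluatorConfig may be absent from a response.

```python
UNKNOWN = "SDK_UNKNOWN_MEMBER"


def union_member(union, known_members):
    """Return (member_name, value) for a tagged-union dict.

    If the SDK did not recognize the member, the name is
    SDK_UNKNOWN_MEMBER and the value is {'name': ...}.
    """
    if UNKNOWN in union:
        return UNKNOWN, union[UNKNOWN]
    present = [name for name in known_members if name in union]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {known_members}, got {present}")
    return present[0], union[present[0]]


def rating_labels(response):
    """Return (scale_kind, labels) from a get_evaluator response.

    scale_kind is 'numerical', 'categorical', SDK_UNKNOWN_MEMBER, or None
    when no evaluatorConfig is present.
    """
    config = response.get("evaluatorConfig")
    if not config:
        return None, []

    kind, judge = union_member(config, ["llmAsAJudge"])
    if kind == UNKNOWN:
        return UNKNOWN, []

    scale_kind, options = union_member(
        judge["ratingScale"], ["numerical", "categorical"]
    )
    if scale_kind == UNKNOWN:
        return UNKNOWN, []

    # Both numerical and categorical options carry a 'label' field.
    return scale_kind, [option["label"] for option in options]
```

Returning the union tag alongside the labels lets callers branch on the scale type without re-inspecting the raw response.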

Exceptions