---
title: Azure API Management policy reference - llm-emit-token-metric
description: Reference for the llm-emit-token-metric policy available for use in Azure API Management. Provides policy usage, settings, and examples.
services: api-management
author: dlepow
ms.service: azure-api-management
ms.topic: reference
ms.date: 04/18/2025
ms.author: danlep
ms.collection: ce-skilling-ai-copilot
ms.custom:
---

# Emit metrics for consumption of large language model tokens

[!INCLUDE api-management-availability-all-tiers]

The `llm-emit-token-metric` policy sends custom metrics to Application Insights about consumption of large language model (LLM) tokens through LLM APIs. Token count metrics include: Total Tokens, Prompt Tokens, and Completion Tokens.

> [!NOTE]
> Currently, this policy is in preview.

[!INCLUDE api-management-policy-generic-alert]

[!INCLUDE api-management-llm-models]

## Limits for custom metrics

[!INCLUDE api-management-custom-metrics-limits]

## Prerequisites

## Policy statement

```xml
<llm-emit-token-metric
        namespace="metric namespace" >
        <dimension name="dimension name" value="dimension value" />
        ...additional dimensions...
</llm-emit-token-metric>
```

## Attributes

| Attribute | Description | Required | Default value |
| --------- | ----------- | -------- | ------------- |
| namespace | A string. Namespace of metric. Policy expressions aren't allowed. | No | API Management |

## Elements

| Element | Description | Required |
| ------- | ----------- | -------- |
| dimension | Add one or more of these elements for each dimension included in the metric. | Yes |

### dimension attributes

| Attribute | Description | Required | Default value |
| --------- | ----------- | -------- | ------------- |
| name | A string or policy expression. Name of dimension. | Yes | N/A |
| value | A string or policy expression. Value of dimension. Can only be omitted if `name` matches one of the default dimensions. If so, value is provided as per dimension name. | No | N/A |
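For illustration, the two kinds of dimensions can be mixed in one policy instance: a default dimension whose value is supplied automatically, and a custom dimension whose value comes from a policy expression. This is a sketch, not from the original article; the `x-user-id` header name is a hypothetical example.

```xml
<llm-emit-token-metric namespace="MyLLM">
    <!-- Default dimension: no value attribute needed, because the name
         matches a default dimension -->
    <dimension name="API ID" />
    <!-- Custom dimension: value computed by a policy expression from a
         hypothetical request header -->
    <dimension name="User ID" value="@(context.Request.Headers.GetValueOrDefault("x-user-id", "unknown"))" />
</llm-emit-token-metric>
```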

[!INCLUDE api-management-emit-metric-dimensions-llm]

## Usage

### Usage notes

* This policy can be used multiple times per policy definition.
* You can configure at most 10 custom dimensions for this policy.
* Where available, values in the `usage` section of the response from the LLM API are used to determine token metrics.
* Certain LLM endpoints support streaming of responses. When `stream` is set to `true` in the API request to enable streaming, token metrics are estimated.
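As a sketch of the first two notes above, the policy can appear more than once in the same policy definition, each instance with its own namespace and up to 10 custom dimensions. The namespaces and the use of Operation ID here are illustrative, not from the original article.

```xml
<policies>
    <inbound>
        <!-- First instance: overall token metrics, broken out by API -->
        <llm-emit-token-metric namespace="MyLLM">
            <dimension name="API ID" />
        </llm-emit-token-metric>
        <!-- Second instance: same token metrics emitted under an
             illustrative second namespace, broken out by operation -->
        <llm-emit-token-metric namespace="MyLLM-ByOperation">
            <dimension name="Operation ID" />
        </llm-emit-token-metric>
    </inbound>
    <outbound>
    </outbound>
</policies>
```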

## Example

The following example sends LLM token count metrics to Application Insights, along with the API ID as a default dimension.

```xml
<policies>
    <inbound>
        <llm-emit-token-metric namespace="MyLLM">
            <dimension name="API ID" />
        </llm-emit-token-metric>
    </inbound>
    <outbound>
    </outbound>
</policies>
```

## Related policies

[!INCLUDE api-management-policy-ref-next-steps]