---
title: Scale Azure OpenAI for Python with Azure API Management
description: Learn how to add load balancing with Azure API Management to your application to extend the chat app beyond the Azure OpenAI token and model quota limits.
ms.date: 12/20/2024
ms.topic: get-started
ms.subservice: intelligent-apps
ms.custom: devx-track-python, devx-track-python-ai, build-2024-intelligent-apps
ms.collection: ce-skilling-ai-copilot
---
[!INCLUDE aca-load-balancer-intro]
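To illustrate what the load balancer changes for the chat app, the following is a minimal sketch of a Python client that sends Azure OpenAI requests through an API Management gateway URL instead of calling an Azure OpenAI resource directly. The gateway URL, API version, and deployment name are placeholders, and depending on your gateway configuration a subscription key or other policy may also be required; the deployment procedures later in this article set the real values for the sample app.

```python
# Minimal sketch: route Azure OpenAI traffic through an API Management gateway.
# The endpoint, API version, and deployment name below are placeholders; the
# deployment procedure in this article configures the real values.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Instead of https://<your-openai-resource>.openai.azure.com, point the client
# at the gateway that load balances across Azure OpenAI backends.
APIM_GATEWAY_URL = "https://<your-apim-instance>.azure-api.net"  # placeholder

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint=APIM_GATEWAY_URL,
    azure_ad_token_provider=token_provider,
    api_version="2024-02-15-preview",  # placeholder API version
)

response = client.chat.completions.create(
    model="gpt-35-turbo",  # placeholder deployment name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```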
- An Azure subscription. Create one for free.
- Dev containers are available for both samples, with all the dependencies that are required to complete this article. You can run the dev containers in GitHub Codespaces (in a browser) or locally by using Visual Studio Code.
  - Only a GitHub account is required to use GitHub Codespaces.
  - To run the dev containers locally, you need:
    - Docker Desktop. Start Docker Desktop if it's not already running.
    - Visual Studio Code.
    - Dev Containers extension.
[!INCLUDE scaling-load-balancer-aca-procedure.md]
[!INCLUDE py-deployment-procedure]
[!INCLUDE capacity.md]
[!INCLUDE py-apim-cleanup]
Samples used in this article include the chat app and the load balancer referenced in the procedures above.

Next steps:

- View Azure API Management diagnostic data in Azure Monitor.
- Use Azure Load Testing to load test your chat app. A local load-test sketch follows this list.
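If you want to experiment with load locally before setting up Azure Load Testing, the following is a minimal, hypothetical Locust script. The `/chat` path, request body, and host URL are assumptions about the sample chat app's API and may differ from your deployed app.

```python
# Minimal, hypothetical Locust load test for a chat endpoint.
# The /chat path and JSON payload are assumptions; adjust them to match the
# API exposed by your deployed chat app.
from locust import HttpUser, task, between


class ChatUser(HttpUser):
    # Wait 1-5 seconds between requests from each simulated user.
    wait_time = between(1, 5)

    @task
    def send_chat_message(self):
        self.client.post(
            "/chat",
            json={"messages": [{"role": "user", "content": "What are my options?"}]},
        )
```

You could run it with `locust -f loadtest.py --host https://<your-app-url>`, replacing the placeholder host with your deployed chat app's URL.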