Skip to main content

AzureChatOpenAI

This guide will help you get started with AzureOpenAI chat models. For detailed documentation of all AzureChatOpenAI features and configurations head to the API reference.

Azure OpenAI has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the Azure docs.

Azure OpenAI vs OpenAI

Azure OpenAI refers to OpenAI models hosted on the Microsoft Azure platform. OpenAI also provides its own model APIs. To access OpenAI services directly, use the ChatOpenAI integration.

Overviewโ€‹

Integration detailsโ€‹

ClassPackageLocalSerializableJS supportPackage downloadsPackage latest
AzureChatOpenAIlangchain-openaiโŒbetaโœ…PyPI - DownloadsPyPI - Version

Model featuresโ€‹

Tool callingStructured outputJSON modeImage inputAudio inputVideo inputToken-level streamingNative asyncToken usageLogprobs
โœ…โœ…โœ…โœ…โŒโŒโœ…โœ…โœ…โœ…

Setupโ€‹

To access AzureOpenAI models you'll need to create an Azure account, create a deployment of an Azure OpenAI model, get the name and endpoint for your deployment, get an Azure OpenAI API key, and install the langchain-openai integration package.

Credentialsโ€‹

Head to the Azure docs to create your deployment and generate an API key. Once you've done this set the AZURE_OPENAI_API_KEY and AZURE_OPENAI_ENDPOINT environment variables:

import getpass
import os

if "AZURE_OPENAI_API_KEY" not in os.environ:
os.environ["AZURE_OPENAI_API_KEY"] = getpass.getpass(
"Enter your AzureOpenAI API key: "
)
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://YOUR-ENDPOINT.openai.azure.com/"

If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installationโ€‹

The LangChain AzureOpenAI integration lives in the langchain-openai package:

%pip install -qU langchain-openai

Instantiationโ€‹

Now we can instantiate our model object and generate chat completions.

from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
azure_deployment="gpt-35-turbo", # or your deployment
api_version="2023-06-01-preview", # or your api version
temperature=0,
max_tokens=None,
timeout=None,
max_retries=2,
# other params...
)
API Reference:AzureChatOpenAI

Invocationโ€‹

messages = [
(
"system",
"You are a helpful assistant that translates English to French. Translate the user sentence.",
),
("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-bea4b46c-e3e1-4495-9d3a-698370ad963d-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})
print(ai_msg.content)
J'adore la programmation.

Chainingโ€‹

We can chain our model with a prompt template like so:

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"You are a helpful assistant that translates {input_language} to {output_language}.",
),
("human", "{input}"),
]
)

chain = prompt | llm
chain.invoke(
{
"input_language": "English",
"output_language": "German",
"input": "I love programming.",
}
)
API Reference:ChatPromptTemplate
AIMessage(content='Ich liebe das Programmieren.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-35-turbo', 'system_fingerprint': None, 'prompt_filter_results': [{'prompt_index': 0, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}], 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {'hate': {'filtered': False, 'severity': 'safe'}, 'self_harm': {'filtered': False, 'severity': 'safe'}, 'sexual': {'filtered': False, 'severity': 'safe'}, 'violence': {'filtered': False, 'severity': 'safe'}}}, id='run-cbc44038-09d3-40d4-9da2-c5910ee636ca-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})

Specifying model versionโ€‹

Azure OpenAI responses contain model_name response metadata property, which is name of the model used to generate the response. However unlike native OpenAI responses, it does not contain the specific version of the model, which is set on the deployment in Azure. E.g. it does not distinguish between gpt-35-turbo-0125 and gpt-35-turbo-0301. This makes it tricky to know which version of the model was used to generate the response, which as result can lead to e.g. wrong total cost calculation with OpenAICallbackHandler.

To solve this problem, you can pass model_version parameter to AzureChatOpenAI class, which will be added to the model name in the llm output. This way you can easily distinguish between different versions of the model.

%pip install -qU langchain-community
from langchain_community.callbacks import get_openai_callback

with get_openai_callback() as cb:
llm.invoke(messages)
print(
f"Total Cost (USD): ${format(cb.total_cost, '.6f')}"
) # without specifying the model version, flat-rate 0.002 USD per 1k input and output tokens is used
API Reference:get_openai_callback
Total Cost (USD): $0.000063
llm_0301 = AzureChatOpenAI(
azure_deployment="gpt-35-turbo", # or your deployment
api_version="2023-06-01-preview", # or your api version
model_version="0301",
)
with get_openai_callback() as cb:
llm_0301.invoke(messages)
print(f"Total Cost (USD): ${format(cb.total_cost, '.6f')}")
Total Cost (USD): $0.000074

API referenceโ€‹

For detailed documentation of all AzureChatOpenAI features and configurations head to the API reference: https://python.langchain.com/v0.2/api_reference/openai/chat_models/langchain_openai.chat_models.azure.AzureChatOpenAI.html


Was this page helpful?


You can also leave detailed feedback on GitHub.