Model aliasing

Configure global or provider-specific aliases so that you can refer to your models by user-friendly names.

Before you begin

  1. Set up an agentgateway proxy.
  2. Set up access to the OpenAI LLM provider.

Set up aliases

  1. Update your AgentgatewayBackend to add global model aliases. The following example adds two aliases, fast and smart. Each alias points to a specific model. Note that the example does not specify a default model.

    kubectl apply -f- <<EOF
    apiVersion: agentgateway.dev/v1alpha1
    kind: AgentgatewayBackend 
    metadata:
      name: openai
      namespace: agentgateway-system 
    spec:
      ai:
        provider:
          openai: {}
      policies:
        auth:
          secretRef:
            name: openai-secret
        ai: 
          modelAliases: 
            fast: gpt-3.5-turbo
            smart: gpt-4-turbo      
    EOF
  2. Send a request to the OpenAI provider with the fast alias as the model. Verify that the request succeeds and that the model field in the response shows a gpt-3.5-turbo model. Use the command that matches your gateway address and request path.

    Cloud Provider LoadBalancer, /v1/chat/completions path:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "fast",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Localhost, /v1/chat/completions path:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "fast",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Cloud Provider LoadBalancer, /openai path:

    curl "$INGRESS_GW_ADDRESS/openai" -H content-type:application/json  -d '{
       "model": "fast",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Localhost, /openai path:

    curl "localhost:8080/openai" -H content-type:application/json  -d '{
       "model": "fast",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Example output:

    {
      "model": "gpt-3.5-turbo-0125",
      "usage": {
        "prompt_tokens": 39,
    ...
    
  3. Repeat the request to the OpenAI provider with the smart alias as the model. Verify that the request succeeds and that the model field in the response shows a gpt-4-turbo model.

    Cloud Provider LoadBalancer, /v1/chat/completions path:

    curl "$INGRESS_GW_ADDRESS/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "smart",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Localhost, /v1/chat/completions path:

    curl "localhost:8080/v1/chat/completions" -H content-type:application/json  -d '{
       "model": "smart",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Cloud Provider LoadBalancer, /openai path:

    curl "$INGRESS_GW_ADDRESS/openai" -H content-type:application/json  -d '{
       "model": "smart",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Localhost, /openai path:

    curl "localhost:8080/openai" -H content-type:application/json  -d '{
       "model": "smart",
       "messages": [
         {
           "role": "system",
           "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
         },
         {
           "role": "user",
           "content": "Compose a poem that explains the concept of recursion in programming."
         }
       ]
     }' | jq

    Example output:

    {
      "model": "gpt-4-turbo-2024-04-09",
      "usage": {
        "prompt_tokens": 39,
    ...
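
The two verification steps above can also be scripted. The following is a minimal sketch, not part of agentgateway: `check_alias` is a hypothetical helper that compares the model name returned by the provider against the prefix that you expect the alias to resolve to. The usage comments assume that jq is installed, that the gateway is reachable on localhost:8080, and that the response carries the resolved model in its top-level model field, as in the example output.

```shell
# Hypothetical helper (not part of agentgateway or kubectl):
# succeeds when the model name in a response starts with the expected prefix.
check_alias() {
  expected_prefix="$1"   # for example, gpt-3.5-turbo
  actual_model="$2"      # for example, gpt-3.5-turbo-0125
  case "$actual_model" in
    "$expected_prefix"*)
      echo "ok: resolved to $actual_model"
      ;;
    *)
      echo "mismatch: expected $expected_prefix*, got $actual_model"
      return 1
      ;;
  esac
}

# Usage sketch (assumes jq and a gateway on localhost:8080):
# model=$(curl -s "localhost:8080/v1/chat/completions" \
#   -H content-type:application/json \
#   -d '{"model": "fast", "messages": [{"role": "user", "content": "Hi"}]}' \
#   | jq -r '.model')
# check_alias gpt-3.5-turbo "$model"
```

A prefix match is used because the provider may return a dated snapshot name, such as gpt-3.5-turbo-0125, rather than the exact name configured in the alias.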
    

Cleanup

You can remove the resources that you created in this guide.

    kubectl delete AgentgatewayBackend openai -n agentgateway-system