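How to Configure Multiple Spring AI LLM Clients in a Spring Boot Application

1. Overview

More and more modern applications integrate large language models (LLMs) to build smarter features. Earlier posts on this blog covered many examples of using Spring AI to bring LLMs from different providers into a Spring Boot application. A single LLM can handle many kinds of tasks, but relying on one model alone is not always the best choice.

Different models have different strengths: some excel at technical analysis, while others are better suited to creative writing. Simple tasks are a good fit for lightweight, cost-effective models, while complex tasks can be handed to more capable ones.

This article shows how to integrate multiple LLMs into a Spring Boot application with Spring AI.

We'll configure models from different providers as well as multiple models from the same provider. Building on that configuration, we'll then create a resilient chatbot that automatically switches between models when one of them fails.

2. Configuring LLMs from Different Providers

We'll start by configuring two LLMs from different providers in our application.

In the examples in this article, we'll use OpenAI and Anthropic as the AI model providers.

2.1. Configuring the Primary LLM

We first configure an OpenAI model as the primary LLM.

To begin, add the required dependency to the project's pom.xml file: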

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.2</version>
</dependency>

This OpenAI starter dependency wraps the OpenAI Chat Completions API and lets our application interact with OpenAI models.

Next, configure the OpenAI API key and chat model in application.yaml:

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: ${PRIMARY_LLM}
          temperature: 1

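We use ${} property placeholders to load the property values from environment variables. We also set the temperature to 1, because some newer OpenAI models accept only this default value.

With these properties in place, Spring AI automatically creates a bean of type OpenAiChatModel. We use it to define a ChatClient bean, which serves as the main entry point for interacting with the LLM: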

@Configuration
class ChatbotConfiguration {

    @Bean
    @Primary
    ChatClient primaryChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

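In the ChatbotConfiguration class, we use the OpenAiChatModel bean to create the ChatClient for the primary LLM.

We mark this bean with the @Primary annotation. When a component injects a ChatClient without a @Qualifier, Spring injects this bean.

2.2. Configuring the Secondary LLM

Now we'll configure a model from Anthropic as the secondary LLM.

First, add the Anthropic starter dependency to pom.xml: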

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.2</version>
</dependency>

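This dependency wraps the Anthropic Messages API and provides the classes needed to connect to and interact with Anthropic models.

Next, define the configuration properties for the secondary model: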

spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: ${SECONDARY_LLM}

As with the primary LLM, we load the Anthropic API key and the model ID from environment variables.

Finally, create a dedicated ChatClient bean for the secondary model:

@Bean
ChatClient secondaryChatClient(AnthropicChatModel chatModel) {
    return ChatClient.create(chatModel);
}

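Here, we use the AnthropicChatModel bean that Spring AI autoconfigures to create secondaryChatClient.

3. Configuring Multiple LLMs from the Same Provider

Often, the LLMs we need to configure come from the same AI provider.

Spring AI does not support this scenario out of the box: its autoconfiguration creates only one ChatModel bean per provider. For any additional model, we have to define the ChatModel bean manually.

Let's walk through the process and configure a second Anthropic model in our application: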

spring:
  ai:
    anthropic:
      chat:
        options:
          tertiary-model: ${TERTIARY_LLM}

Under the Anthropic configuration in application.yaml, we add a custom property that holds the model name of the third (tertiary) LLM.

Next, define the beans required for the tertiary LLM:

@Bean
ChatModel tertiaryChatModel(
    AnthropicApi anthropicApi,
    AnthropicChatModel anthropicChatModel,
    @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName
) {
    AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
    chatOptions.setModel(tertiaryModelName);
    return AnthropicChatModel.builder()
      .anthropicApi(anthropicApi)
      .defaultOptions(chatOptions)
      .build();
}

@Bean
ChatClient tertiaryChatClient(@Qualifier("tertiaryChatModel") ChatModel tertiaryChatModel) {
    return ChatClient.create(tertiaryChatModel);
}

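First, to create the custom ChatModel bean, we inject the autoconfigured AnthropicApi bean, the default AnthropicChatModel bean used for the secondary LLM, and the tertiary model name property via @Value.

We copy the default options of the existing AnthropicChatModel and override only the model name.

This setup assumes that both Anthropic models share the same API key and other configuration. If they need different properties, we can customize the AnthropicChatOptions further.

Finally, we use the custom tertiaryChatModel to create the third ChatClient bean in our configuration class.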
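The copy-then-override step can be illustrated without Spring AI on the classpath. The following is a minimal sketch in plain Java; ModelOptions is a hypothetical stand-in for a provider options class (it is not a Spring AI type), and the model names and values are placeholders. It only shows why the options are copied before the model name is changed: the options used by the secondary client stay untouched.

```java
// Hypothetical stand-in for a provider options class; it is NOT part of Spring AI.
final class ModelOptions {
    private String model;
    private final double temperature;
    private final int maxTokens;

    ModelOptions(String model, double temperature, int maxTokens) {
        this.model = model;
        this.temperature = temperature;
        this.maxTokens = maxTokens;
    }

    // Returns an independent copy, so later changes do not affect the original.
    ModelOptions copy() {
        return new ModelOptions(model, temperature, maxTokens);
    }

    void setModel(String model) {
        this.model = model;
    }

    String getModel() {
        return model;
    }

    double getTemperature() {
        return temperature;
    }

    int getMaxTokens() {
        return maxTokens;
    }
}
```

Calling copy() on the secondary options and then setModel(...) on the copy yields tertiary options that inherit every other setting, which mirrors what the tertiaryChatModel bean above does with AnthropicChatOptions.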

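4. Exploring a Practical Use Case

With the multi-model configuration in place, let's implement a practical use case. We'll build a resilient chatbot that automatically falls back to alternative models, in order, when the primary model fails.

4.1. Building a Resilient Chatbot

To implement the fallback logic, we'll use Spring Retry. This requires the spring-retry dependency together with Spring AOP support on the classpath, and @EnableRetry on a configuration class.

Create a new ChatbotService class and inject the three ChatClient beans we defined. Then define an entry-point method that uses the primary LLM: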

@Retryable(retryFor = Exception.class, maxAttempts = 3)
String chat(String prompt) {
    logger.debug("Attempting to process prompt '{}' with primary LLM. Attempt #{}",
        prompt, RetrySynchronizationManager.getContext().getRetryCount() + 1);
    return primaryChatClient
      .prompt(prompt)
      .call()
      .content();
}

Here, we create a chat() method that uses primaryChatClient. The method is annotated with @Retryable, so it is attempted up to three times in total when it throws any Exception.
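To make the meaning of maxAttempts concrete, here is a minimal plain-Java sketch of the retry behaviour. It is an illustration only, not how Spring Retry is implemented: RetrySketch and callWithRetry are hypothetical names, and the sketch leaves out the delay between attempts. The point it shows is that the first call counts as attempt 1, so maxAttempts = 3 means three calls in total.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; Spring Retry does this work for us via @Retryable.
final class RetrySketch {

    // Calls the action up to maxAttempts times in total (the first call is attempt 1)
    // and rethrows the last failure once all attempts are used up.
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if attempts remain
            }
        }
        throw last;
    }
}
```

With an action that always fails and maxAttempts set to 3, the action runs three times before the exception propagates, which is the point at which a recover method would take over.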

Next, define a recover method:

@Recover
String chat(Exception exception, String prompt) {
    logger.warn("Primary LLM failure. Error received: {}", exception.getMessage());
    logger.debug("Attempting to process prompt '{}' with secondary LLM", prompt);
    try {
        return secondaryChatClient
          .prompt(prompt)
          .call()
          .content();
    } catch (Exception e) {
        logger.warn("Secondary LLM failure: {}", e.getMessage());
        logger.debug("Attempting to process prompt '{}' with tertiary LLM", prompt);
        return tertiaryChatClient
          .prompt(prompt)
          .call()
          .content();
    }
}

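The overloaded chat() method annotated with @Recover acts as the fallback once the original chat() method has failed and exhausted its retries.

We first try to get a response from secondaryChatClient; if that also fails, we make a final attempt with tertiaryChatClient.

We use a simple try-catch here because Spring Retry allows only one recover method per method signature. In a production application, however, we should consider a more complete solution such as Resilience4j.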
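As the number of fallback models grows, nested try-catch blocks become hard to read. One possible alternative is an ordered chain that tries each client in turn. The sketch below is plain Java under stated assumptions: FallbackChain is a hypothetical class, each client is modelled as a Function from prompt to response, and retries and logging are left out. In the application above, each function could wrap a call such as client.prompt(prompt).call().content().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Spring AI or Spring Retry.
final class FallbackChain {
    // LinkedHashMap keeps insertion order, so clients are tried in the order they were added.
    private final Map<String, Function<String, String>> clients = new LinkedHashMap<>();

    FallbackChain add(String name, Function<String, String> client) {
        clients.put(name, client);
        return this;
    }

    // Returns the first successful response; if every client fails, throws an
    // exception that carries each individual failure as a suppressed exception.
    String chat(String prompt) {
        IllegalStateException failure = new IllegalStateException("All LLM clients failed");
        for (Map.Entry<String, Function<String, String>> entry : clients.entrySet()) {
            try {
                return entry.getValue().apply(prompt);
            } catch (RuntimeException e) {
                failure.addSuppressed(new RuntimeException(entry.getKey() + " failed", e));
            }
        }
        throw failure;
    }
}
```

Adding a fourth model then means one more add(...) call rather than another level of nesting, and the final exception still records why each earlier model failed.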

With the service layer implemented, we expose a REST API on top of it:

@PostMapping("/api/chatbot/chat")
ChatResponse chat(@RequestBody ChatRequest request) {
    String response = chatbotService.chat(request.prompt());
    return new ChatResponse(response);
}

record ChatRequest(String prompt) {}
record ChatResponse(String response) {}

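Here we define a POST endpoint, /api/chatbot/chat, that accepts a prompt, passes it to the service layer, and returns the response wrapped in a ChatResponse record.

4.2. Testing Our Chatbot

Finally, let's test the chatbot and verify that the fallback mechanism works.

Start the application with environment variables that set invalid model names for the primary and secondary LLMs and a valid model name for the tertiary LLM: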

OPENAI_API_KEY=.... \
ANTHROPIC_API_KEY=.... \
PRIMARY_LLM=gpt-100 \
SECONDARY_LLM=claude-opus-200 \
TERTIARY_LLM=claude-3-haiku-20240307 \
mvn spring-boot:run

In the command above, gpt-100 and claude-opus-200 are invalid model names that cause API errors, while claude-3-haiku-20240307 is a valid model offered by Anthropic.
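Because every setting here comes from an environment variable, a typo in a variable name is easy to miss. The following plain-Java sketch shows one way to check for missing variables before starting the application. EnvPreflight is a hypothetical helper and not part of the original setup; the variable names are the ones used in the command above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical preflight check for illustration; the application itself does not include it.
final class EnvPreflight {
    // Variable names taken from the launch command in this article.
    static final List<String> REQUIRED = List.of(
        "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "PRIMARY_LLM", "SECONDARY_LLM", "TERTIARY_LLM");

    // Returns the required variables that are absent or blank, in declaration order.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }
}
```

Calling EnvPreflight.missing(System.getenv()) returns an empty list when all five variables are set. It does not check whether a model name is valid, which is exactly the failure the fallback test relies on.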

Next, use the HTTPie CLI to call the endpoint and interact with the chatbot:

http POST :8080/api/chatbot/chat prompt="What is the capital of France?"

Here we send the chatbot a simple prompt. Let's look at the result:

{
    "response": "The capital of France is Paris."
}

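As we can see, the chatbot still returns a correct response even though the primary and secondary LLMs are configured with invalid models, which confirms that the system fell back to the tertiary LLM.

To see the fallback logic in action more clearly, let's also look at the application logs: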

[2025-09-30 12:56:03] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #1
[2025-09-30 12:56:05] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #2
[2025-09-30 12:56:06] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with primary LLM. Attempt #3
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Primary LLM failure. Error received: HTTP 404 - {
    "error": {
        "message": "The model `gpt-100` does not exist or you do not have access to it.",
        "type": "invalid_request_error",
        "param": null,
        "code": "model_not_found"
    }
}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with secondary LLM
[2025-09-30 12:56:07] [WARN] [com.baeldung.multillm.ChatbotService] - Secondary LLM failure: HTTP 404 - {"type":"error","error":{"type":"not_found_error","message":"model: claude-opus-200"},"request_id":"req_011CTeBrAY8rstsSPiJyv3sj"}
[2025-09-30 12:56:07] [DEBUG] [com.baeldung.multillm.ChatbotService] - Attempting to process prompt 'What is the capital of France?' with tertiary LLM

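The logs clearly show how the request was processed.

The primary LLM failed on three consecutive attempts; the service then tried the secondary LLM, which also failed; finally, the tertiary LLM processed the prompt and returned the response we saw.

This shows that the fallback mechanism works as designed: even when several LLMs fail at the same time, the chatbot stays available.

5. Summary

This article explored how to integrate multiple LLMs into a single Spring AI application. First, we showed how Spring AI's abstraction layer simplifies configuring models from different providers, such as OpenAI and Anthropic. Next, we tackled a more complex scenario: configuring multiple models from the same provider by creating custom beans where Spring AI's autoconfiguration falls short. Finally, we used the multi-model configuration to build a resilient, highly available chatbot. With Spring Retry, we implemented a cascading fallback pattern that automatically switches between LLMs when a failure occurs.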

Posted on 2025-10-10 20:26 by 程序猿DD
