Integrating the Free EdgeTTS into Spring Boot for Text-to-Speech

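In applications that need text-to-speech (TTS), such as voice assistants, voice notifications, and content narration, the Java ecosystem lacks an Edge TTS client library like the one available for Python. That is not a blocker: you can now call the free EdgeTTS capability through the API provided by UnifiedTTS. UnifiedTTS also supports Azure TTS, MiniMax TTS, Elevenlabs TTS, and other models. Because it wraps them behind one abstract request interface, you can switch between models and voices easily.

Below, we build a Spring Boot application with a text-to-speech feature, using the free EdgeTTS as the target model.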

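Hands-on

1. Create the Spring Boot application

Generate a basic Spring Boot project with start.spring.io or any other tool, then add the dependencies your application needs. For example, if you plan to expose the feature through an HTTP endpoint, add the web module: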

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>

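2. Register with UnifiedTTS and get an API key

Go to the UnifiedTTS website and sign up (you can log in directly with GitHub).
Open the "API Keys" page from the left-hand menu and create an API key.

Store the API key somewhere safe; you will need it later.

3. Integrate the UnifiedTTS API

The following is a runnable reference implementation based on the API documentation (https://unifiedtts.com/zh/api-docs/tts-sync). It covers the configuration file, the request and response models, the service class, and a unit test.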

3.1 Configuration file (`application.properties`)

unified-tts.host=https://unifiedtts.com
unified-tts.api-key=your-api-key-here

Remember to replace the value of unified-tts.api-key with the API key you created earlier.
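If you would rather not store the key in a file that goes into version control, one option is to let Spring resolve it from an environment variable through a property placeholder. This is a minimal sketch; the variable name UNIFIED_TTS_API_KEY is an assumption chosen for illustration, not something UnifiedTTS defines:

```properties
unified-tts.host=https://unifiedtts.com
# Assumption: UNIFIED_TTS_API_KEY is an environment variable you define yourself
unified-tts.api-key=${UNIFIED_TTS_API_KEY}
```

With this form the application fails at startup if the variable is not set, which makes a missing key visible early.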

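3.2 Configuration properties class

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;

// Binds the unified-tts.* entries from application.properties.
// Register it with @EnableConfigurationProperties(UnifiedTtsProperties.class)
// or @ConfigurationPropertiesScan so it can be injected. @Data comes from Lombok.
@Data
@ConfigurationProperties(prefix = "unified-tts")
public class UnifiedTtsProperties {

    private String host;
    private String apiKey;

}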

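3.3 Request and response models

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

// Request body for the TTS endpoint. The all-args constructor takes the
// fields in the order declared below.
@Data
@AllArgsConstructor
@NoArgsConstructor
public class UnifiedTtsRequest {

    private String model;
    private String voice;
    private String text;
    private Double speed;
    private Double pitch;
    private Double volume;
    private String format;

}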

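import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

// JSON response envelope; the payload sits in the nested data object.
@Data
@AllArgsConstructor
@NoArgsConstructor
public class UnifiedTtsResponse {

    private boolean success;
    private String message;
    private long timestamp;
    private UnifiedTtsResponseData data;

    // snake_case JSON keys are mapped to camelCase fields with @JsonProperty
    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public static class UnifiedTtsResponseData {
        @JsonProperty("request_id")
        private String requestId;

        @JsonProperty("audio_url")
        private String audioUrl;

        @JsonProperty("file_size")
        private long fileSize;
    }
}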

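UnifiedTTS abstracts the requests of the different models, so you can call different TTS models with one standard set of request parameters. That keeps the client side simple, which is why UnifiedTTS is a good choice if you want to avoid model-specific client code.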
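To illustrate the point about one parameter set serving several models, here is a minimal, self-contained sketch. The record TtsRequest is a hypothetical stand-in for UnifiedTtsRequest, and "other-model" / "other-voice" are placeholders rather than real identifiers; look up valid values in the API documentation.

```java
class ModelSwitchSketch {

    // Hypothetical stand-in for UnifiedTtsRequest, written as a record
    // so that this sketch compiles without Lombok or Spring.
    public record TtsRequest(String model, String voice, String text,
                             Double speed, Double pitch, Double volume, String format) {

        // Returns a copy that targets another model and voice;
        // every other parameter is carried over unchanged.
        public TtsRequest withModelAndVoice(String newModel, String newVoice) {
            return new TtsRequest(newModel, newVoice, text, speed, pitch, volume, format);
        }
    }

    public static void main(String[] args) {
        TtsRequest edge = new TtsRequest("edge-tts", "en-US-JennyNeural",
                "Hello from UnifiedTTS.", 1.0, 1.0, 1.0, "mp3");
        // Placeholders only: not real model or voice identifiers.
        TtsRequest other = edge.withModelAndVoice("other-model", "other-voice");
        System.out.println(edge);
        System.out.println(other);
    }
}
```

Only the model and voice change between the two requests; the text, the tuning parameters, and the output format stay the same, which is the practical benefit of the unified request shape.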

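3.4 Service implementation (calling UnifiedTTS)

The service class uses RestClient, the HTTP client that ships with Spring (available since Spring Boot 3.2), and provides two methods:

synthesize: receives the audio bytes and returns them.
synthesizeToFile: calls synthesize and writes the audio to a file.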

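import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;

@Service
public class UnifiedTtsService {

    private final RestClient restClient;
    private final UnifiedTtsProperties properties;

    // Spring Boot auto-configures a RestClient.Builder bean (not a RestClient),
    // so build the client here and point it at the configured host.
    public UnifiedTtsService(RestClient.Builder restClientBuilder, UnifiedTtsProperties properties) {
        this.restClient = restClientBuilder.baseUrl(properties.getHost()).build();
        this.properties = properties;
    }

    /**
     * Calls the UnifiedTTS synchronous TTS endpoint and returns the audio bytes.
     *
     * <p>Request headers:
     * <ul>
     *   <li>Content-Type: application/json</li>
     *   <li>X-API-Key: the API key from the configuration</li>
     *   <li>Accept: a binary stream or the common mp3/mpeg audio types</li>
     * </ul>
     *
     * @param request model, voice, text, speed/pitch/volume, output format, etc.
     * @return the audio bytes (for example, mp3)
     * @throws IllegalStateException if the response has no body
     * @throws org.springframework.web.client.RestClientResponseException
     *         if the server returns a 4xx or 5xx status (raised by retrieve())
     */
    public byte[] synthesize(UnifiedTtsRequest request) {
        ResponseEntity<byte[]> response = restClient
                .post()
                .uri("/api/v1/common/tts-sync")
                .contentType(MediaType.APPLICATION_JSON)
                .accept(MediaType.APPLICATION_OCTET_STREAM, MediaType.valueOf("audio/mpeg"), MediaType.valueOf("audio/mp3"))
                .header("X-API-Key", properties.getApiKey())
                .body(request)
                .retrieve()
                .toEntity(byte[].class);

        if (response.getStatusCode().is2xxSuccessful() && response.getBody() != null) {
            return response.getBody();
        }
        throw new IllegalStateException("UnifiedTTS synthesize failed: " + response.getStatusCode());
    }

    /**
     * Synthesizes the audio and writes it to the given file.
     *
     * <p>Missing parent directories of the output path are created automatically;
     * a runtime exception is thrown if writing fails.
     *
     * @param request TTS request parameters
     * @param outputPath target file path (for example, output.mp3)
     * @return the path that was actually written
     */
    public Path synthesizeToFile(UnifiedTtsRequest request, Path outputPath) {
        byte[] data = synthesize(request);
        try {
            if (outputPath.getParent() != null) {
                Files.createDirectories(outputPath.getParent());
            }
            Files.write(outputPath, data);
            return outputPath;
        } catch (IOException e) {
            throw new RuntimeException("Failed to write TTS output to file: " + outputPath, e);
        }
    }
}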
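For readers who want to see the raw HTTP call without Spring, the sketch below builds the equivalent request with the JDK's java.net.http API and stops before sending it. The endpoint path and header names are taken from the service above; the JSON body assumes the request object serializes with its plain field names, and the class name RawTtsRequestSketch is made up for this example.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class RawTtsRequestSketch {

    // Builds (but does not send) a POST with the same path and headers
    // that the Spring service uses.
    public static HttpRequest build(String host, String apiKey, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(host + "/api/v1/common/tts-sync"))
                .header("Content-Type", "application/json")
                .header("Accept", "application/octet-stream, audio/mpeg, audio/mp3")
                .header("X-API-Key", apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // Assumption: the JSON keys match the field names of UnifiedTtsRequest.
        String json = "{\"model\":\"edge-tts\",\"voice\":\"en-US-JennyNeural\","
                + "\"text\":\"Hello\",\"speed\":1.0,\"pitch\":1.0,\"volume\":1.0,\"format\":\"mp3\"}";
        HttpRequest request = build("https://unifiedtts.com", "your-api-key-here", json);
        System.out.println(request.method() + " " + request.uri());
        // To send it for real:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofByteArray());
    }
}
```

Building the request separately from sending it also makes the headers and URL easy to check in a test without touching the network.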

3.5 Unit test

import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class UnifiedTtsServiceTest {

    @Autowired
    private UnifiedTtsService unifiedTtsService;

    @Test
    void testRealSynthesizeAndDownloadToFile() throws Exception {
        UnifiedTtsRequest req = new UnifiedTtsRequest(
            "edge-tts",
            "en-US-JennyNeural",
            "Hello, this is a test of text to speech synthesis.",
            1.0,
            1.0,
            1.0,
            "mp3"
        );

        // Call the real endpoint; synthesize returns the raw audio bytes
        byte[] audio = unifiedTtsService.synthesize(req);
        assertNotNull(audio, "Audio bytes should not be null");
        assertTrue(audio.length > 0, "Audio bytes should not be empty");

        // Create a test-result directory under the project directory and write the file there
        Path projectDir = Paths.get(System.getProperty("user.dir"));
        Path resultDir = projectDir.resolve("test-result");
        Files.createDirectories(resultDir);
        Path out = resultDir.resolve(System.currentTimeMillis() + ".mp3");
        Path written = unifiedTtsService.synthesizeToFile(req, out);
        System.out.println("UnifiedTTS test output: " + written.toAbsolutePath());
        assertTrue(Files.exists(written), "Output file should exist");
        assertTrue(Files.size(written) > 0, "Output file size should be > 0");
    }
}
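A non-empty file is a weak signal, because an error body would also pass that check. As an optional extra, the sketch below is a rough heuristic for "these bytes look like MP3": it accepts data that starts with an ID3v2 tag or with an MPEG audio frame sync. The helper name looksLikeMp3 is made up for this example, and the check is not a full format validation.

```java
import java.nio.charset.StandardCharsets;

class Mp3SniffSketch {

    // Heuristic only: true if the data starts with "ID3" (an ID3v2 tag)
    // or with an MPEG audio frame sync (the first 11 bits all set).
    public static boolean looksLikeMp3(byte[] data) {
        if (data == null || data.length < 3) {
            return false;
        }
        boolean id3 = data[0] == 'I' && data[1] == 'D' && data[2] == '3';
        boolean frameSync = (data[0] & 0xFF) == 0xFF && (data[1] & 0xE0) == 0xE0;
        return id3 || frameSync;
    }

    public static void main(String[] args) {
        byte[] json = "{\"success\":false}".getBytes(StandardCharsets.UTF_8);
        byte[] frame = {(byte) 0xFF, (byte) 0xFB, (byte) 0x90, 0x00};
        System.out.println(looksLikeMp3(json));
        System.out.println(looksLikeMp3(frame));
    }
}
```

In the unit test, an assertion such as assertTrue(Mp3SniffSketch.looksLikeMp3(audio)) could sit next to the length check.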

4. Run and verify

After running the unit test, you will find the generated audio file in the test-result directory under the project directory.

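5. Common parameters and voice selection

The commonly used parameters correspond to the fields of the UnifiedTtsRequest class shown earlier: model, voice, text, speed, pitch, volume, and format.

The available values for model and voice are too numerous to list here; see the API documentation.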

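Summary

This article showed how to integrate the EdgeTTS capability of UnifiedTTS into Spring Boot to convert text to speech and save the result as mp3. UnifiedTTS hides the differences between TTS models behind a unified API, so you can trade off cost against quality by switching models without maintaining multiple SDKs. Depending on your business needs, you can further improve exception handling, caching, and concurrency control to build a more reliable, production-grade TTS service.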
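As one possible starting point for the caching mentioned above, here is a minimal in-memory sketch. The class TtsCacheSketch and its Key record are hypothetical names, and the synthesizer function stands in for a real call such as unifiedTtsService.synthesize(...). The map is unbounded and the remote call runs inside computeIfAbsent, so treat it as an illustration rather than a production cache.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class TtsCacheSketch {

    // Cache key: requests with identical parameters share one synthesized result.
    public record Key(String model, String voice, String text,
                      Double speed, Double pitch, Double volume, String format) {}

    private final Map<Key, byte[]> cache = new ConcurrentHashMap<>();
    private final Function<Key, byte[]> synthesizer;

    public TtsCacheSketch(Function<Key, byte[]> synthesizer) {
        this.synthesizer = synthesizer;
    }

    // computeIfAbsent invokes the synthesizer at most once per key,
    // even when several threads ask for the same key at the same time.
    public byte[] get(Key key) {
        return cache.computeIfAbsent(key, synthesizer);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in synthesizer: counts invocations and returns dummy bytes.
        TtsCacheSketch cache = new TtsCacheSketch(k -> {
            calls.incrementAndGet();
            return k.text().getBytes(StandardCharsets.UTF_8);
        });
        Key key = new Key("edge-tts", "en-US-JennyNeural", "Hello", 1.0, 1.0, 1.0, "mp3");
        cache.get(key);
        cache.get(key);
        System.out.println("synthesizer calls: " + calls.get()); // prints "synthesizer calls: 1"
    }
}
```

Because the key includes every request parameter, changing the speed or the voice produces a separate cache entry instead of returning stale audio.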

Sample project for this article: https://github.com/dyc87112/unified-tts-example

Posted 2025-10-14 15:42 by 程序猿DD
