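Transcription Telegram Bot

Build a Telegram bot that transcribes audio and video messages in 99 languages, using TypeScript with Deno in Supabase Edge Functions.

Introduction

In this tutorial you will learn how to build a Telegram bot that transcribes audio and video messages in 99 languages, using TypeScript and the ElevenLabs Scribe model via the speech-to-text API.

To see what the end result looks like, you can try out the example bot at t.me/ElevenLabsScribeBot.

Find the example project on GitHub.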

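Requirements

An ElevenLabs account with an API key
A Supabase account (you can sign up for free via database.new)
The Supabase CLI installed on your local machine
The Deno runtime installed, and optionally set up in your preferred IDE
A Telegram account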

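Setup

Register a Telegram bot

Use the BotFather to create a new Telegram bot. Run the /newbot command and follow the instructions to create the bot. At the end you will receive your bot token. Store it somewhere safe, as it is needed in later steps.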


Create a Supabase project locally

After installing the Supabase CLI, run the following command to create a new project locally:

    supabase init

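Create a database table to log the transcription results

Next, create a new database table to log the transcription results:

    supabase migrations new init

This creates a new migration file in the supabase/migrations directory. Open the file and add the following SQL: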

    CREATE TABLE IF NOT EXISTS transcription_logs (
      id BIGSERIAL PRIMARY KEY,
      file_type VARCHAR NOT NULL,
      duration INTEGER NOT NULL,
      chat_id BIGINT NOT NULL,
      message_id BIGINT NOT NULL,
      username VARCHAR,
      transcript TEXT,
      language_code VARCHAR,
      created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
      error TEXT
    );

    ALTER TABLE transcription_logs ENABLE ROW LEVEL SECURITY;
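
To make the table's shape easier to work with from TypeScript, the columns above can be mirrored in a type. The following is a minimal sketch, not part of the example project: the names TranscriptionLog, ScribeOutcome and buildLogRow are hypothetical and introduced here only for illustration. The columns id and created_at are omitted because the database fills them in.

```typescript
// Row shape mirroring the transcription_logs columns that the function writes.
// Nullable columns in the SQL (no NOT NULL) are typed as `| null`.
interface TranscriptionLog {
  file_type: string;
  duration: number; // INTEGER column: duration in seconds
  chat_id: number;
  message_id: number;
  username: string | null;
  transcript: string | null;
  language_code: string | null;
  error: string | null;
}

// Hypothetical result type: either a transcript or an error message.
type ScribeOutcome =
  | { ok: true; text: string; languageCode: string }
  | { ok: false; error: string };

// Hypothetical helper: combines message metadata and the outcome into one row.
function buildLogRow(
  meta: Pick<
    TranscriptionLog,
    "file_type" | "duration" | "chat_id" | "message_id" | "username"
  >,
  outcome: ScribeOutcome,
): TranscriptionLog {
  return {
    ...meta,
    transcript: outcome.ok ? outcome.text : null,
    language_code: outcome.ok ? outcome.languageCode : null,
    error: outcome.ok ? null : outcome.error,
  };
}
```

A row built this way has exactly one of transcript or error set, which matches how the table is meant to record both successful and failed transcriptions.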

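Create a Supabase Edge Function to handle Telegram webhook requests

Next, create a new Edge Function to handle Telegram webhook requests:

    supabase functions new scribe-bot

If you are using VS Code or Cursor, select y when the CLI prompts "Generate VS Code settings for Deno? [y/N]".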

Set up the environment variables

In the supabase/functions directory, create a new .env file and add the following variables:

    # Find or create an API key at https://elevenlabs.io/app/settings/api-keys
    ELEVENLABS_API_KEY=your_api_key

    # The bot token you received from the BotFather
    TELEGRAM_BOT_TOKEN=your_bot_token

    # A random secret of your choice, used to secure the function
    FUNCTION_SECRET=random_secret
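
Because a missing variable otherwise only surfaces as a failed API call at runtime, it can help to validate all three variables once at startup. The sketch below is one possible way to do this and is not part of the example project: loadConfig is a hypothetical helper, and it takes the lookup function as a parameter so that it does not depend on any particular runtime.

```typescript
// The three variables defined in supabase/functions/.env.
const REQUIRED_VARS = [
  "ELEVENLABS_API_KEY",
  "TELEGRAM_BOT_TOKEN",
  "FUNCTION_SECRET",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

// Hypothetical helper: reads every required variable through `get` and
// throws a single error that lists all missing or empty ones.
function loadConfig(
  get: (name: string) => string | undefined,
): Record<RequiredVar, string> {
  const missing: string[] = [];
  const config = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    const value = get(name);
    if (value === undefined || value.trim() === "") {
      missing.push(name);
    } else {
      config[name] = value;
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return config;
}
```

Inside an Edge Function the lookup could be passed as (name) => Deno.env.get(name), which is the same accessor the bot code uses.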

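Dependencies

The project uses the following dependencies:

The open-source grammY framework to handle the Telegram webhook requests
The @supabase/supabase-js library to interact with the Supabase database
The ElevenLabs JavaScript SDK to interact with the speech-to-text API

Since Supabase Edge Functions use the Deno runtime, you don't need to install these dependencies. Instead, you can import them directly by specifier, for example with the npm: prefix.

Code the Telegram bot

In the newly created scribe-bot/index.ts file, add the following code: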

    import { Bot, webhookCallback } from 'https://deno.land/x/grammy@v1.34.0/mod.ts'
    import 'jsr:@supabase/functions-js/edge-runtime.d.ts'
    import { createClient } from 'jsr:@supabase/supabase-js@2'
    import { ElevenLabsClient } from 'npm:elevenlabs@1.50.5'

    console.log(`Function "elevenlabs-scribe-bot" up and running!`)

    const elevenLabsClient = new ElevenLabsClient({
      apiKey: Deno.env.get('ELEVENLABS_API_KEY') || '',
    })

    const supabase = createClient(
      Deno.env.get('SUPABASE_URL') || '',
      Deno.env.get('SUPABASE_SERVICE_ROLE_KEY') || ''
    )

    // Downloads the file from Telegram, transcribes it, replies to the user,
    // and logs the outcome to the transcription_logs table.
    async function scribe({
      fileURL,
      fileType,
      duration,
      chatId,
      messageId,
      username,
    }: {
      fileURL: string
      fileType: string
      duration: number
      chatId: number
      messageId: number
      username: string
    }) {
      let transcript: string | null = null
      let languageCode: string | null = null
      let errorMsg: string | null = null
      try {
        const sourceFileArrayBuffer = await fetch(fileURL).then((res) => res.arrayBuffer())
        const sourceBlob = new Blob([sourceFileArrayBuffer], {
          type: fileType,
        })

        const scribeResult = await elevenLabsClient.speechToText.convert({
          file: sourceBlob,
          model_id: 'scribe_v1',
          tag_audio_events: false,
        })

        transcript = scribeResult.text
        languageCode = scribeResult.language_code

        // Reply to the user with the transcript
        await bot.api.sendMessage(chatId, transcript, {
          reply_parameters: { message_id: messageId },
        })
      } catch (error) {
        // `error` is typed as unknown, so narrow it before reading the message
        errorMsg = error instanceof Error ? error.message : String(error)
        console.log(errorMsg)
        await bot.api.sendMessage(chatId, 'Sorry, there was an error. Please try again.', {
          reply_parameters: { message_id: messageId },
        })
      }
      // Write the log to Supabase
      const logLine = {
        file_type: fileType,
        duration,
        chat_id: chatId,
        message_id: messageId,
        username,
        language_code: languageCode,
        error: errorMsg,
      }
      console.log({ logLine })
      await supabase.from('transcription_logs').insert({ ...logLine, transcript })
    }

    const telegramBotToken = Deno.env.get('TELEGRAM_BOT_TOKEN')
    const bot = new Bot(telegramBotToken || '')

    // MarkdownV2 requires reserved characters such as "!" to be escaped
    const startMessage = `Welcome to the ElevenLabs Scribe Bot\\! I can transcribe speech in 99 languages with super high accuracy\\!
        \nTry it out by sending or forwarding me a voice message, video, or audio file\\!
        \n[Learn more about Scribe](https://elevenlabs.io/speech-to-text) or [build your own bot](https://elevenlabs.io/docs/cookbooks/speech-to-text/telegram-bot)\\!
      `

    bot.command('start', (ctx) => ctx.reply(startMessage.trim(), { parse_mode: 'MarkdownV2' }))

    bot.on([':voice', ':audio', ':video'], async (ctx) => {
      try {
        const file = await ctx.getFile()
        const fileURL = `https://api.telegram.org/file/bot${telegramBotToken}/${file.file_path}`
        const fileMeta = ctx.message?.video ?? ctx.message?.voice ?? ctx.message?.audio

        if (!fileMeta) {
          return ctx.reply('No video|audio|voice metadata found. Please try again.')
        }

        // Run the transcription in the background
        EdgeRuntime.waitUntil(
          scribe({
            fileURL,
            fileType: fileMeta.mime_type!,
            duration: fileMeta.duration,
            chatId: ctx.chat.id,
            messageId: ctx.message?.message_id!,
            username: ctx.from?.username || '',
          })
        )

        // Reply to the user immediately to confirm the file was received
        return ctx.reply('Received. Scribing...')
      } catch (error) {
        console.error(error)
        return ctx.reply(
          'Sorry, there was an error getting the file. Please try again with a smaller file!'
        )
      }
    })

    const handleUpdate = webhookCallback(bot, 'std/http')

    Deno.serve(async (req) => {
      try {
        // Reject requests that do not carry the shared secret
        const url = new URL(req.url)
        if (url.searchParams.get('secret') !== Deno.env.get('FUNCTION_SECRET')) {
          return new Response('not allowed', { status: 405 })
        }

        return await handleUpdate(req)
      } catch (err) {
        console.error(err)
        // The handler must always return a Response
        return new Response('Internal error', { status: 500 })
      }
    })
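
The secret check at the end of the function is what keeps arbitrary callers from invoking the bot, so it is worth isolating and testing on its own. The following is a minimal sketch under stated assumptions, not the example project's code: hasValidSecret is a hypothetical helper, and unlike the inline comparison it also refuses every request when the expected secret is unset or empty.

```typescript
// Hypothetical helper: returns true only when the request URL carries a
// `secret` query parameter equal to the configured, non-empty secret.
function hasValidSecret(
  requestUrl: string,
  expectedSecret: string | undefined,
): boolean {
  // An unset or empty secret must never authorize a request.
  if (expectedSecret === undefined || expectedSecret === "") {
    return false;
  }
  const provided = new URL(requestUrl).searchParams.get("secret");
  return provided === expectedSecret;
}
```

In the handler this could replace the inline comparison, for example by calling hasValidSecret(req.url, Deno.env.get('FUNCTION_SECRET')) and returning the error response when it yields false.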

Deploy to Supabase

If you haven't already, create a new Supabase account at database.new, then link the local project to your Supabase account:

    supabase link

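Apply the database migrations

Run the following command to apply the database migrations from the supabase/migrations directory:

    supabase db push

In the table editor of your Supabase dashboard, you should now see an empty transcription_logs table.

Lastly, run the following command to deploy the Edge Function:

    supabase functions deploy --no-verify-jwt scribe-bot

In the Edge Functions view of your Supabase dashboard, you should now see the deployed scribe-bot function. Note down the function URL, as you will need it later. It should look something like https://<project-ref>.functions.supabase.co/scribe-bot.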


Set up the webhook

Set your bot's webhook URL to https://<PROJECT_REFERENCE>.supabase.co/functions/v1/scribe-bot with the secret appended as a query parameter (replace each <...> with the respective value). To do so, send a GET request to the following URL, for example from your browser:

    https://api.telegram.org/bot<TELEGRAM_BOT_TOKEN>/setWebhook?url=https://<PROJECT_REFERENCE>.supabase.co/functions/v1/scribe-bot?secret=<FUNCTION_SECRET>

Note that FUNCTION_SECRET is the secret you set in your .env file.
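
The setWebhook URL above nests one URL, including its own query string, inside another. If the secret contains characters such as & or spaces, the nested value needs to be percent-encoded to survive intact. The sketch below shows one way to assemble the URL with the standard URL API; buildSetWebhookUrl is a hypothetical helper, and the scribe-bot path assumes the function name used in this tutorial.

```typescript
// Hypothetical helper: builds the Telegram setWebhook request URL.
// URLSearchParams percent-encodes the nested function URL, so the inner
// "?secret=..." part is passed to Telegram as a single `url` value.
function buildSetWebhookUrl(
  botToken: string,
  projectRef: string,
  functionSecret: string,
): string {
  const functionUrl = new URL(
    `https://${projectRef}.supabase.co/functions/v1/scribe-bot`,
  );
  functionUrl.searchParams.set("secret", functionSecret);

  const apiUrl = new URL(`https://api.telegram.org/bot${botToken}/setWebhook`);
  apiUrl.searchParams.set("url", functionUrl.toString());
  return apiUrl.toString();
}
```

The returned string can be opened in a browser or requested with fetch, in the same way as the manually assembled URL.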


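Set the function secrets

Now that all your secrets are set locally, run the following command to set them in your Supabase project:

    supabase secrets set --env-file supabase/functions/.env

Test the bot

Finally, you can test the bot by sending it a voice message, audio file, or video file.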


Once you see the transcript in the bot's reply, go back to the table editor in your Supabase dashboard. You should see a new row in the transcription_logs table.
