The Speech Synthesis API
The Speech Synthesis API is an awesome tool provided by modern browsers.
Introduced in 2014, it’s now widely adopted and available in Chrome, Firefox, Safari and Edge. IE is not supported.
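Since IE (and any non-browser environment) lacks the API entirely, a feature-detection guard is a sensible first step. This is a minimal sketch; `canSpeak` and `safeSpeak` are hypothetical helper names, not part of the API:

```js
// Feature-detect the Speech Synthesis API before using it.
// `canSpeak` is false in IE and in non-browser environments.
const canSpeak = typeof window !== 'undefined' && 'speechSynthesis' in window

// Hypothetical helper: speak the text if supported, otherwise no-op.
function safeSpeak(text) {
  if (!canSpeak) return false
  window.speechSynthesis.speak(new window.SpeechSynthesisUtterance(text))
  return true
}
```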
It’s part of the Web Speech API, along with the Speech Recognition API, although the latter is currently supported only in experimental mode, on Chrome.
I used it recently to provide an alert on a page that monitored some parameters. When one of the numbers went up, I was alerted through the computer speakers.
Getting started
The simplest example of using the Speech Synthesis API fits on a single line:
```js
speechSynthesis.speak(new SpeechSynthesisUtterance('Hey'))
```
Copy and paste it in your browser console, and your computer should speak!
The API
The API exposes several objects to the `window` object.
SpeechSynthesisUtterance

`SpeechSynthesisUtterance` represents a speech request. In the example above we passed it a string: that’s the message the browser should read aloud.
Once you have the utterance object, you can perform some tweaks to edit the speech properties:
```js
const utterance = new SpeechSynthesisUtterance('Hey')
```

- `utterance.rate`: sets the speed. Accepts values between [0.1 - 10], defaults to 1
- `utterance.pitch`: sets the pitch. Accepts values between [0 - 2], defaults to 1
- `utterance.volume`: sets the volume. Accepts values between [0 - 1], defaults to 1
- `utterance.lang`: sets the language (values use a BCP 47 language tag, like `en-US` or `it-IT`)
- `utterance.text`: instead of setting it in the constructor, you can pass it as a property. Text can be a maximum of 32767 characters
- `utterance.voice`: sets the voice (more on this below)
Example:
```js
const utterance = new SpeechSynthesisUtterance('Hey')
utterance.pitch = 1.5
utterance.volume = 0.5
utterance.rate = 8
speechSynthesis.speak(utterance)
```
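Browsers differ in how they handle out-of-range values for these properties (some clamp, some throw), so it can help to clamp them yourself before assigning. A sketch, where `clamp` and `tuneUtterance` are hypothetical helper names and the ranges are the ones listed above:

```js
// Clamp a value into [min, max].
const clamp = (value, min, max) => Math.min(max, Math.max(min, value))

// Hypothetical helper: assign rate/pitch/volume to an utterance,
// clamped to the documented ranges.
function tuneUtterance(utterance, { rate = 1, pitch = 1, volume = 1 } = {}) {
  utterance.rate = clamp(rate, 0.1, 10) // speed: [0.1 - 10]
  utterance.pitch = clamp(pitch, 0, 2) // pitch: [0 - 2]
  utterance.volume = clamp(volume, 0, 1) // volume: [0 - 1]
  return utterance
}
```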
Set a voice
Each browser has a different set of voices available.
To see the list, use this code:
```js
console.log(`Voices #: ${speechSynthesis.getVoices().length}`)

speechSynthesis.getVoices().forEach(voice => {
  console.log(voice.name, voice.lang)
})
```
Here is one of the cross browser issues. The above code works in Firefox and Safari (and possibly Edge, but I didn’t test it), but does not work in Chrome. Chrome handles voices differently, and requires a callback that is called when the voices have been loaded:
```js
const voiceschanged = () => {
  console.log(`Voices #: ${speechSynthesis.getVoices().length}`)
  speechSynthesis.getVoices().forEach(voice => {
    console.log(voice.name, voice.lang)
  })
}

speechSynthesis.onvoiceschanged = voiceschanged
```
After the callback is called, we can access the list using `speechSynthesis.getVoices()`.
I believe this is because Chrome, if there is a network connection, checks for additional languages from the Google servers:
If there is no network connection, the number of languages available is the same as in Firefox and Safari. The additional languages are available when the network is enabled, but the API works offline as well.
Cross browser implementation to get the language
Since we have this difference, we need a way to abstract it before using the API. This example performs that abstraction:
```js
const getVoices = () => {
  return new Promise(resolve => {
    let voices = speechSynthesis.getVoices()
    if (voices.length) {
      resolve(voices)
      return
    }
    speechSynthesis.onvoiceschanged = () => {
      voices = speechSynthesis.getVoices()
      resolve(voices)
    }
  })
}

const printVoicesList = async () => {
  ;(await getVoices()).forEach(voice => {
    console.log(voice.name, voice.lang)
  })
}

printVoicesList()
```
Use a custom language
The default voice speaks in English.
You can use any language you want, by setting the utterance `lang` property:
```js
let utterance = new SpeechSynthesisUtterance('Ciao')
utterance.lang = 'it-IT'
speechSynthesis.speak(utterance)
```
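Before setting `lang`, you may want to check that the browser actually ships a voice for it, since an unmatched tag typically falls back to the default voice. A sketch: `hasVoiceFor` is a hypothetical helper that works on the array returned by `speechSynthesis.getVoices()`:

```js
// Hypothetical helper: does any available voice match the given
// BCP 47 tag, or at least its bare language prefix (e.g. 'it')?
function hasVoiceFor(voices, lang) {
  const prefix = lang.split('-')[0]
  return voices.some(v => v.lang === lang || v.lang.startsWith(prefix + '-'))
}
```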
Use another voice
If there is more than one voice available, you might want to choose another one. For example the default Italian voice is female, but maybe I want a male voice. That’s the second one we get from the voices list.
```js
const lang = 'it-IT'
const voiceIndex = 1

const speak = async text => {
  if (!speechSynthesis) {
    return
  }
  const message = new SpeechSynthesisUtterance(text)
  message.voice = await chooseVoice()
  speechSynthesis.speak(message)
}

const getVoices = () => {
  return new Promise(resolve => {
    let voices = speechSynthesis.getVoices()
    if (voices.length) {
      resolve(voices)
      return
    }
    speechSynthesis.onvoiceschanged = () => {
      voices = speechSynthesis.getVoices()
      resolve(voices)
    }
  })
}

const chooseVoice = async () => {
  const voices = (await getVoices()).filter(voice => voice.lang == lang)
  return new Promise(resolve => {
    resolve(voices[voiceIndex])
  })
}

speak('Ciao')
```
Values for the language
Those are some examples of the languages you can use:
- Arabic (Saudi Arabia) ➡️ `ar-SA`
- Chinese (China) ➡️ `zh-CN`
- Chinese (Hong Kong SAR China) ➡️ `zh-HK`
- Chinese (Taiwan) ➡️ `zh-TW`
- Czech (Czech Republic) ➡️ `cs-CZ`
- Danish (Denmark) ➡️ `da-DK`
- Dutch (Belgium) ➡️ `nl-BE`
- Dutch (Netherlands) ➡️ `nl-NL`
- English (Australia) ➡️ `en-AU`
- English (Ireland) ➡️ `en-IE`
- English (South Africa) ➡️ `en-ZA`
- English (United Kingdom) ➡️ `en-GB`
- English (United States) ➡️ `en-US`
- Finnish (Finland) ➡️ `fi-FI`
- French (Canada) ➡️ `fr-CA`
- French (France) ➡️ `fr-FR`
- German (Germany) ➡️ `de-DE`
- Greek (Greece) ➡️ `el-GR`
- Hindi (India) ➡️ `hi-IN`
- Hungarian (Hungary) ➡️ `hu-HU`
- Indonesian (Indonesia) ➡️ `id-ID`
- Italian (Italy) ➡️ `it-IT`
- Japanese (Japan) ➡️ `ja-JP`
- Korean (South Korea) ➡️ `ko-KR`
- Norwegian (Norway) ➡️ `no-NO`
- Polish (Poland) ➡️ `pl-PL`
- Portuguese (Brazil) ➡️ `pt-BR`
- Portuguese (Portugal) ➡️ `pt-PT`
- Romanian (Romania) ➡️ `ro-RO`
- Russian (Russia) ➡️ `ru-RU`
- Slovak (Slovakia) ➡️ `sk-SK`
- Spanish (Mexico) ➡️ `es-MX`
- Spanish (Spain) ➡️ `es-ES`
- Swedish (Sweden) ➡️ `sv-SE`
- Thai (Thailand) ➡️ `th-TH`
- Turkish (Turkey) ➡️ `tr-TR`
Mobile
On iOS the API works, but it must be triggered by a user action callback, like a response to a tap event, to provide a better experience to users and avoid unexpected sounds coming out of your phone.
You can’t do what you can on the desktop, where a web page can start speaking out of the blue.
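In practice, then, on iOS you wire the call to a user gesture such as a button tap. The sketch below injects its dependencies so the wiring logic stands alone; `wireSpeakButton` is a hypothetical helper, and it assumes a button with id `speak-button` exists in the page:

```js
// Hypothetical helper: speak only inside a click/tap handler,
// as iOS requires. Dependencies are injected for clarity; in a
// real page you would pass document, speechSynthesis, and
// SpeechSynthesisUtterance.
function wireSpeakButton(doc, synth, Utterance, text) {
  const button = doc.getElementById('speak-button')
  button.addEventListener('click', () => {
    synth.speak(new Utterance(text))
  })
  return button
}
```

In a browser you would call `wireSpeakButton(document, speechSynthesis, SpeechSynthesisUtterance, 'Hey')` once the page has loaded.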