Forum: Too Lazy BBS

Re: How to properly use py-webrtcvad?

From Stefan Ram@21:1/5 to marc nicole on Sun Jan 26 10:37:40 2025

marc nicole <mk1853387@gmail.com> wrote or quoted:

return _webrtcvad.process(self._vad, sample_rate, buf, length)

Error: Error while processing frame

(I was not able to check the following tips myself!
So, please read them as a mere wild guess!)

That error you're running into is possibly because the
audio format webrtcvad wants isn't jibing with what you're
feeding it. Let me break it down for you:

WebRTC VAD is picky about its audio, like a foodie at a farmers
market:

- It wants 16-bit mono PCM, nothing fancy

- Sample rate has to be 8000, 16000, 32000, or 48000 Hz

- Each frame has to be exactly 10, 20, or 30 ms long, like
  clockwork (at 16000 Hz that's 160, 320, or 480 samples)
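
To make those three rules concrete, here's a rough sketch that
works out how many bytes one frame should be and checks a buffer
against that. The names frame_bytes and is_valid_frame are made up
for illustration and are not part of webrtcvad; only the numbers
come from the list above.

```python
# Hypothetical helpers, not part of webrtcvad.
VALID_RATES = (8000, 16000, 32000, 48000)  # Hz
VALID_FRAME_MS = (10, 20, 30)              # frame durations in ms
BYTES_PER_SAMPLE = 2                       # 16-bit PCM


def frame_bytes(rate, frame_ms):
    """Return the byte length of one mono 16-bit frame."""
    if rate not in VALID_RATES:
        raise ValueError(f"unsupported sample rate: {rate}")
    if frame_ms not in VALID_FRAME_MS:
        raise ValueError(f"unsupported frame duration: {frame_ms} ms")
    return rate * frame_ms // 1000 * BYTES_PER_SAMPLE


def is_valid_frame(buf, rate):
    """True if buf holds exactly 10, 20, or 30 ms of mono 16-bit audio."""
    if rate not in VALID_RATES:
        return False
    return len(buf) in {frame_bytes(rate, ms) for ms in VALID_FRAME_MS}


print(frame_bytes(16000, 30))  # 960 bytes, i.e. 480 samples
```

Running is_valid_frame on whatever you're about to hand to
is_speech should tell you quickly whether the buffer size is the
culprit.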

Tweak your PyAudio setup like you're fine-tuning a classic car:

Python

self.FORMAT = pyaudio.paInt16
self.CHANNELS = 1
self.RATE = 16000
self.FRAMES_PER_BUFFER = 480 # 30 ms at 16000 Hz, smooth as a SoCal highway

Give your audio reading loop a makeover:

Python

for i in range(0, int(self.RATE / self.FRAMES_PER_BUFFER * self.RECORD_SECONDS)):
    data = self.stream.read(self.FRAMES_PER_BUFFER)
    is_speech = self.vad.is_speech(data, self.RATE)
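
If you'd rather keep a bigger read buffer, another option (again,
just a sketch I haven't run against real audio) is to slice
whatever you read into frame-sized pieces before they go to the
VAD. split_frames is a made-up name, and it assumes mono 16-bit
PCM bytes:

```python
def split_frames(pcm, rate, frame_ms=30):
    """Yield consecutive frame_ms-long chunks of mono 16-bit PCM bytes."""
    size = rate * frame_ms // 1000 * 2  # samples per frame * 2 bytes each
    # Stop before a trailing partial frame rather than yielding it.
    for start in range(0, len(pcm) - size + 1, size):
        yield pcm[start:start + size]


# One second of silence at 16 kHz gives 33 full 30 ms frames;
# the 10 ms left over is dropped.
frames = list(split_frames(bytes(32000), 16000))
print(len(frames), len(frames[0]))
```

Each piece it yields would then go to is_speech on its own.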

Make sure your audio data is on point:

Python

import numpy as np

# Turn that audio data into a numpy array, like magic
audio_array = np.frombuffer(data, dtype=np.int16)

# If it's not mono, keep just the first channel - no stereo allowed at this party
if self.CHANNELS > 1:
    audio_array = audio_array[::self.CHANNELS]

# Back to bytes it goes
audio_bytes = audio_array.tobytes()

is_speech = self.vad.is_speech(audio_bytes, self.RATE)
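
The slice above just throws away every channel but the first. If
you'd rather average the channels, something like this might do,
using only the standard library. It's a different technique from
the snippet above, downmix_to_mono is a made-up name, and it
assumes interleaved 16-bit samples in native byte order:

```python
from array import array


def downmix_to_mono(pcm, channels):
    """Average interleaved 16-bit channels into a single mono channel."""
    samples = array("h")  # signed 16-bit, native byte order
    samples.frombytes(pcm)
    mono = array("h", (
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples), channels)
    ))
    return mono.tobytes()


stereo = array("h", [100, 200, -100, -300]).tobytes()
print(len(downmix_to_mono(stereo, 2)))  # 4 bytes: two mono samples
```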

Crank up that VAD aggressiveness:

Python

self.vad = webrtcvad.Vad(3) # modes run 0 to 3; 3 is as aggressive as LA traffic

(Just remember to adjust your sample rate and frame duration
to fit your needs.)

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
