marc nicole <mk1853387@gmail.com> wrote or quoted:
return _webrtcvad.process(self._vad, sample_rate, buf, length)
Error: Error while processing frame
(I was not able to check the following tips myself!
So, please read them as a mere wild guess!)
That error you're running into - it's possibly because the
audio format webrtcvad wants isn't jibing with what you're
feeding it. Let me break it down for you:
WebRTC VAD is picky about its audio, like a foodie at a farmers
market:
- It wants 16-bit mono PCM, nothing fancy
- Sample rate has got to be 8000, 16000, 32000, or 48000 Hz
- Frame durations should be 10, 20, or 30 ms, like clockwork
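Taken together, those three rules pin down exactly how many bytes a frame may have, so you can check a buffer before handing it to is_speech. A rough sketch, as untested as the rest of this post. The helper names (expected_frame_bytes, is_valid_frame) are my own inventions and not part of webrtcvad:

```python
# Constraints as listed above: 16-bit mono PCM, fixed rates and durations.
VALID_RATES = (8000, 16000, 32000, 48000)   # Hz
VALID_FRAME_MS = (10, 20, 30)               # milliseconds
BYTES_PER_SAMPLE = 2                        # 16-bit PCM


def expected_frame_bytes(sample_rate, frame_ms):
    """Return the byte length of one mono 16-bit frame."""
    if sample_rate not in VALID_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    if frame_ms not in VALID_FRAME_MS:
        raise ValueError(f"unsupported frame duration: {frame_ms} ms")
    samples = sample_rate * frame_ms // 1000
    return samples * BYTES_PER_SAMPLE


def is_valid_frame(buf, sample_rate):
    """True if buf is exactly 10, 20 or 30 ms of mono 16-bit audio."""
    if sample_rate not in VALID_RATES:
        return False
    return any(len(buf) == expected_frame_bytes(sample_rate, ms)
               for ms in VALID_FRAME_MS)
```

For instance, a 1024-sample chunk (a size you see in a lot of PyAudio examples) is 64 ms at 16000 Hz, so it would fail this check.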
Tweak your PyAudio setup like you're fine-tuning a classic car:
Python
self.FORMAT = pyaudio.paInt16
self.CHANNELS = 1
self.RATE = 16000
self.FRAMES_PER_BUFFER = 480 # 30 ms at 16000 Hz, smooth as a SoCal highway
Give your audio reading loop a makeover:
Python
for i in range(0, int(self.RATE / self.FRAMES_PER_BUFFER * self.RECORD_SECONDS)):
    data = self.stream.read(self.FRAMES_PER_BUFFER)
    is_speech = self.vad.is_speech(data, self.RATE)
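If you would rather keep reading bigger chunks from the stream, another option is to slice each chunk into VAD-sized frames yourself. A minimal sketch, assuming 16-bit mono PCM bytes; split_frames is a made-up helper name, not a webrtcvad or PyAudio function:

```python
def split_frames(pcm, sample_rate, frame_ms=30):
    """Yield consecutive frame_ms-long chunks of 16-bit mono PCM bytes.

    A trailing chunk shorter than one full frame is dropped.
    """
    frame_len = sample_rate * frame_ms // 1000 * 2  # 2 bytes per sample
    for start in range(0, len(pcm) - frame_len + 1, frame_len):
        yield pcm[start:start + frame_len]


# Possible use inside the reading loop (names taken from the post above):
#     for frame in split_frames(data, self.RATE):
#         is_speech = self.vad.is_speech(frame, self.RATE)
```

Dropping the short tail keeps every frame at a length the VAD accepts, at the cost of ignoring a few milliseconds per chunk.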
Make sure your audio data is on point:
Python
import numpy as np
# Turn that audio data into a numpy array, like magic
audio_array = np.frombuffer(data, dtype=np.int16)
# If it's not mono, keep just the first channel - no stereo allowed at this party
if self.CHANNELS > 1:
    audio_array = audio_array[::self.CHANNELS]
# Back to bytes it goes
audio_bytes = audio_array.tobytes()
is_speech = self.vad.is_speech(audio_bytes, self.RATE)
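The slicing above just keeps the first channel and throws the rest away. If you would rather blend the channels, here is a sketch that averages them instead. It swaps numpy for the standard library's array module so it runs without extra installs, and downmix_to_mono is a name I made up. It assumes interleaved 16-bit samples in native byte order:

```python
from array import array


def downmix_to_mono(pcm, channels):
    """Average interleaved 16-bit channels into one mono channel."""
    if channels == 1:
        return pcm
    samples = array("h")          # signed 16-bit, native byte order
    samples.frombytes(pcm)
    mono = array("h", (
        # Floor-divided mean of one sample from each channel.
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ))
    return mono.tobytes()
```

The result has one sample per input frame, so a 480-frame stereo read still comes out as 480 mono samples, which is 30 ms at 16000 Hz.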
Crank up that VAD aggressiveness (modes run from 0 to 3):
Python
self.vad = webrtcvad.Vad(3) # 3 is as aggressive as LA traffic
(Just remember to adjust your sample rate and frame duration
to fit your needs.)