• Re: How to properly use py-webrtcvad?

    From Stefan Ram@21:1/5 to marc nicole on Sun Jan 26 10:37:40 2025
    marc nicole <mk1853387@gmail.com> wrote or quoted:
    return _webrtcvad.process(self._vad, sample_rate, buf, length)
    Error: Error while processing frame

    (I was not able to check the following tips myself!
    So, please read them as a mere wild guess!)

    That error you're running into - it's possibly because the
    audio format webrtcvad wants isn't jibing with what you're
    feeding it (wrong sample rate, or a frame that isn't exactly
    10, 20, or 30 ms long). Let me break it down for you:

    WebRTC VAD is picky about its audio, like a foodie at a farmers
    market:

    - It wants 16-bit mono PCM, nothing fancy

    - Sample rate has got to be 8000, 16000, 32000, or 48000 Hz

    - Frame durations should be 10, 20, or 30 ms, like clockwork
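
To make those three constraints concrete, here is a minimal sketch of a pre-flight check that runs before a frame is handed to the VAD. The helper names (expected_frame_bytes, frame_is_valid) are made up for illustration and are not part of webrtcvad; the sketch only assumes 16-bit mono PCM, i.e. 2 bytes per sample.

```python
VALID_RATES = (8000, 16000, 32000, 48000)   # Hz, as listed above
VALID_FRAME_MS = (10, 20, 30)               # ms, as listed above
BYTES_PER_SAMPLE = 2                        # 16-bit mono PCM


def expected_frame_bytes(rate, frame_ms):
    """Return the byte length of one mono 16-bit frame (hypothetical helper)."""
    if rate not in VALID_RATES:
        raise ValueError(f"unsupported sample rate: {rate}")
    if frame_ms not in VALID_FRAME_MS:
        raise ValueError(f"unsupported frame duration: {frame_ms} ms")
    return rate * frame_ms // 1000 * BYTES_PER_SAMPLE


def frame_is_valid(data, rate):
    """True if len(data) matches a 10, 20, or 30 ms frame at this rate."""
    return any(len(data) == expected_frame_bytes(rate, ms)
               for ms in VALID_FRAME_MS)
```

For example, a buffer of 1024 samples at 16000 Hz is 2048 bytes, which matches none of the allowed frame lengths (320, 640, or 960 bytes), so frame_is_valid would reject it.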

    Tweak your PyAudio setup like you're fine-tuning a classic car:

    Python

    self.FORMAT = pyaudio.paInt16
    self.CHANNELS = 1
    self.RATE = 16000
    self.FRAMES_PER_BUFFER = 480 # 30 ms at 16000 Hz, smooth as a SoCal highway

    Give your audio reading loop a makeover:

    Python

    for i in range(0, int(self.RATE / self.FRAMES_PER_BUFFER * self.RECORD_SECONDS)):
        data = self.stream.read(self.FRAMES_PER_BUFFER)
        is_speech = self.vad.is_speech(data, self.RATE)
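
If the stream has to keep a different buffer size (say 1024 samples), another option is to slice each chunk into VAD-sized frames before calling is_speech. A rough sketch, assuming 16-bit mono PCM; split_into_frames is a hypothetical helper, not something webrtcvad or PyAudio provides:

```python
def split_into_frames(pcm, rate, frame_ms=30):
    """Yield consecutive frame_ms-long frames from mono 16-bit PCM bytes.

    Trailing bytes that do not fill a whole frame are dropped.
    """
    frame_bytes = rate * frame_ms // 1000 * 2  # 2 bytes per 16-bit sample
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Each yielded frame could then go to self.vad.is_speech(frame, self.RATE). In real code you would probably carry the leftover bytes over to the next chunk instead of dropping them.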

    Make sure your audio data is on point:

    Python

    import numpy as np

    # Turn that audio data into a numpy array, like magic
    audio_array = np.frombuffer(data, dtype=np.int16)

    # If it's not mono, make it mono - no stereo allowed at this party
    # (this keeps only the first channel of the interleaved samples)
    if self.CHANNELS > 1:
        audio_array = audio_array[::self.CHANNELS]

    # Back to bytes it goes
    audio_bytes = audio_array.tobytes()

    is_speech = self.vad.is_speech(audio_bytes, self.RATE)
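
Taking every Nth sample keeps one channel and throws the rest away. If you would rather mix the channels together, here is a sketch that averages them using only the standard library. downmix_to_mono is a hypothetical helper; it assumes interleaved 16-bit samples in native byte order and a buffer holding a whole number of sample groups.

```python
from array import array


def downmix_to_mono(pcm, channels):
    """Average interleaved 16-bit channels into one mono channel."""
    samples = array("h")          # "h" = signed 16-bit integers
    samples.frombytes(pcm)
    mono = array("h", (
        sum(samples[i:i + channels]) // channels   # mean of one sample group
        for i in range(0, len(samples), channels)
    ))
    return mono.tobytes()
```

The result has the same sample rate and 1/channels of the bytes, so the frame length the VAD sees still has to work out to 10, 20, or 30 ms.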

    Crank up that VAD aggressiveness:

    Python

    self.vad = webrtcvad.Vad(3) # 3 is as aggressive as LA traffic

    (Just remember to adjust your sample rate and frame duration
    to fit your needs.)
