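I figured it out. The function below works without saving to disk, buffering, and so on. It takes a wav file, edits it, and sends it straight to the get math embedding function: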
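    # Assumed imports: read() returning (rate, data) matches scipy.io.wavfile,
    # and the AudioSegment constructor arguments match pydub.
    from scipy.io.wavfile import read
    from pydub import AudioSegment
    # VoiceActivityDetection and get_embedding are defined elsewhere in the project.

    def get_customer_voice_and_cutting_10_seconds_embedding(file):
        print('getting customer voice only')
        wav = read(file)
        ch = wav[1].shape[1]   # number of channels (not used below)
        sr = wav[0]            # sample rate
        c1 = wav[1][:, 1]      # second channel: the customer's voice
        vad = VoiceActivityDetection()
        vad.process(c1)
        voice_samples = vad.get_voice_samples()
        # Build a mono segment directly from the raw sample bytes
        audio_segment = AudioSegment(voice_samples.tobytes(), frame_rate=sr,
                                     sample_width=voice_samples.dtype.itemsize,
                                     channels=1)
        audio_segment = audio_segment[0:10000]   # pydub slices in milliseconds: first 10 s
        # As posted, only the new file name is passed on to get_embedding
        file = str(file) + '_10seconds.wav'
        return get_embedding(file)

The key is the tobytes() call in the AudioSegment: it recombines all the voice samples into a single track.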
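To show the same idea in isolation, here is a minimal sketch that turns raw sample bytes into a mono track in memory and keeps only the first 10 seconds. It deliberately swaps in the standard-library array and wave modules for NumPy and pydub, assumes 16-bit samples, and the helper name samples_to_mono_wav_bytes is hypothetical, not part of the code above.

```python
import array
import io
import wave


def samples_to_mono_wav_bytes(samples, sample_rate, max_seconds=10):
    """Pack integer samples into an in-memory mono WAV, keeping at most max_seconds."""
    # One channel, so max_seconds of audio is sample_rate * max_seconds samples.
    trimmed = samples[: sample_rate * max_seconds]
    # array('h') holds signed 16-bit integers; tobytes() yields the raw PCM payload.
    pcm_array = array.array('h', trimmed)
    pcm = pcm_array.tobytes()
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as out:
        out.setnchannels(1)                    # single track
        out.setsampwidth(pcm_array.itemsize)   # bytes per sample
        out.setframerate(sample_rate)
        out.writeframes(pcm)
    return buf.getvalue()


# Usage: 12 seconds of silence at 8 kHz is cut down to 10 seconds, all in memory.
rate = 8000
wav_bytes = samples_to_mono_wav_bytes([0] * (rate * 12), rate)
with wave.open(io.BytesIO(wav_bytes), 'rb') as w:
    print(w.getnchannels(), w.getnframes() / w.getframerate())
```

Nothing touches the disk here; the resulting bytes could be handed to any function that accepts a file-like object.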



