#289: Record Audio With Sounddevice
When we want to turn speech into text, we first need to record some audio. In Python we have two main options: sounddevice and PyAudio. This week we look at how the more modern sounddevice works; PyAudio will be the topic of next week’s post.
Installation
We can install sounddevice with this command:
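The package is published on PyPI under the name sounddevice, so `pip install sounddevice` (or `python -m pip install sounddevice`) should be all we need. To confirm that the install ended up in the interpreter we actually run, a small check like the following can help. It is only a sketch, and the function names `is_installed()` and `install_command()` are made up for this example:

```python
import importlib.util
import sys


def is_installed(package="sounddevice"):
    """Return True if `package` can be found by the running interpreter."""
    return importlib.util.find_spec(package) is not None


def install_command(package="sounddevice"):
    """Build a pip command bound to the interpreter that runs this script."""
    return f'"{sys.executable}" -m pip install {package}'


if is_installed():
    print("sounddevice is available")
else:
    print("sounddevice is missing, install it with:", install_command())
```

Calling pip through `sys.executable` avoids the common trap of installing into a different Python than the one that later runs our recording script.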
Show the devices
Before we start recording, we should first check which audio devices we can access and how many channels each one offers. If we try to record with two channels for stereo but the device only offers one channel (mono), we end up with an exception.
Before you run this script, make sure that you have connected the microphone you want to use. The output lists each device with its index, name, host API, and number of input and output channels:
0 Microsoft Sound Mapper - Input, MME (2 in, 0 out)
> 1 Headset (EarPods), MME (1 in, 0 out)
2 Echo Cancelling Speakerphone (D, MME (2 in, 0 out)
3 Microphone (Logitech BRIO), MME (2 in, 0 out)
4 Microphone (Realtek(R) Audio), MME (4 in, 0 out)
5 Microsoft Sound Mapper - Output, MME (0 in, 2 out)
< 6 Headset (EarPods), MME (0 in, 2 out)
7 Echo Cancelling Speakerphone (D, MME (0 in, 2 out)
8 Speakers/Headphones (Realtek(R), MME (0 in, 2 out)
9 Primary Sound Capture Driver, Windows DirectSound (2 in, 0 out)
10 Headset (EarPods), Windows DirectSound (1 in, 0 out)
11 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows DirectSound (2 in, 0 out)
12 Microphone (Logitech BRIO), Windows DirectSound (2 in, 0 out)
13 Microphone (Realtek(R) Audio), Windows DirectSound (4 in, 0 out)
14 Primary Sound Driver, Windows DirectSound (0 in, 2 out)
15 Headset (EarPods), Windows DirectSound (0 in, 2 out)
16 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows DirectSound (0 in, 2 out)
17 Speakers/Headphones (Realtek(R) Audio), Windows DirectSound (0 in, 2 out)
18 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows WASAPI (0 in, 2 out)
19 Speakers/Headphones (Realtek(R) Audio), Windows WASAPI (0 in, 2 out)
20 Headset (EarPods), Windows WASAPI (0 in, 2 out)
21 Headset (EarPods), Windows WASAPI (1 in, 0 out)
22 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows WASAPI (2 in, 0 out)
23 Microphone (Logitech BRIO), Windows WASAPI (2 in, 0 out)
24 Microphone (Realtek(R) Audio), Windows WASAPI (2 in, 0 out)
25 Speakers 1 (Realtek HD Audio output with SST), Windows WDM-KS (0 in, 2 out)
26 Speakers 2 (Realtek HD Audio output with SST), Windows WDM-KS (0 in, 2 out)
27 PC Speaker (Realtek HD Audio output with SST), Windows WDM-KS (2 in, 0 out)
28 Microphone 1 (Realtek HD Audio Mic input with SST), Windows WDM-KS (2 in, 0 out)
29 Microphone 2 (Realtek HD Audio Mic input with SST), Windows WDM-KS (4 in, 0 out)
30 Microphone 3 (Realtek HD Audio Mic input with SST), Windows WDM-KS (4 in, 0 out)
31 Stereo Mix (Realtek HD Audio Stereo input), Windows WDM-KS (2 in, 0 out)
32 Output (EarPods), Windows WDM-KS (0 in, 2 out)
33 Headset (EarPods), Windows WDM-KS (1 in, 0 out)
34 Microphone (Logitech BRIO), Windows WDM-KS (2 in, 0 out)
35 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows WDM-KS (2 in, 0 out)
36 Echo Cancelling Speakerphone (DELL PROFESSIONAL SOUND BAR AE515), Windows WDM-KS (0 in, 2 out)
The > shows the default input device, while < shows the default output device.
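A listing like the one above is what we get when we print the result of `query_devices()`. Here is a minimal sketch of such a script. The helper `input_devices()` is not part of sounddevice; it is a made-up name for this example that narrows the list down to the devices we can record from:

```python
def input_devices(devices):
    """Return (index, name, channels) for every device that can record."""
    return [
        (index, device["name"], device["max_input_channels"])
        for index, device in enumerate(devices)
        if device["max_input_channels"] > 0
    ]


def show_devices():
    # Imported here so the helper above also works without audio hardware.
    import sounddevice as sd

    devices = sd.query_devices()
    print(devices)  # the full listing, with > and < marking the defaults

    print("\nDevices we can record from:")
    for index, name, channels in input_devices(devices):
        print(f"{index:>3}  {name}  ({channels} in)")
```

Calling `show_devices()` prints the complete list first and then the shorter list of inputs, which makes it easier to spot the channel count of the microphone we plan to use.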
Change the default device
If we are not happy with the current default device, we can use the default.device property of sounddevice and set it to the name of the device we want to use:
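A minimal sketch of how this could look. As far as I understand the library, `default.device` accepts either a device index or a name, and a single value is used for both input and output. Because the same name can show up once per host API (as in the listing above), it can help to append the host API to the name. The helpers `ambiguous_input_names()` and `use_input_device()` are made up for this example:

```python
from collections import Counter


def ambiguous_input_names(devices):
    """Return input device names that occur more than once in `devices`."""
    counts = Counter(
        device["name"] for device in devices if device["max_input_channels"] > 0
    )
    return sorted(name for name, count in counts.items() if count > 1)


def use_input_device(query):
    """Make `query` the default device and return the input name it resolves to."""
    # Imported here so the helper above also works without audio hardware.
    import sounddevice as sd

    sd.default.device = query  # a device index or a name, e.g. "name, host API"
    return sd.query_devices(kind="input")["name"]
```

With the devices from the listing, a call such as `use_input_device("Microphone (Logitech BRIO), Windows WASAPI")` should select the webcam microphone. If the name matches several devices, sounddevice is expected to raise a ValueError instead of guessing.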
Record to a file
Once we have found the device we want to use, we can set the correct number of channels in this script to record our audio and store the file on disk:
The script captures audio from the microphone for 5 seconds and returns the recording as a NumPy array. We can take this array and save it to a *.wav file with Python's wave module.
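A fixed-length recording along these lines could be sketched as follows. The sample rate, the channel count, the file name and the helper names `save_wav()` and `record_to_file()` are assumptions for this example; adjust the channels to what your device offers:

```python
import wave

SAMPLE_RATE = 44100  # samples per second; assumed, use a rate your device supports
CHANNELS = 1         # assumed; must not exceed the input channels of the device
DURATION = 5         # seconds


def save_wav(path, frames, channels, sample_rate, sample_width=2):
    """Write raw PCM bytes to a WAV file (sample_width=2 means 16-bit samples)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


def record_to_file(path="recording.wav"):
    # Imported here so save_wav() can be used without audio hardware.
    import sounddevice as sd

    # rec() starts recording in the background and returns a NumPy array
    # that is filled while the recording runs.
    recording = sd.rec(
        int(DURATION * SAMPLE_RATE),
        samplerate=SAMPLE_RATE,
        channels=CHANNELS,
        dtype="int16",
    )
    sd.wait()  # block until all frames have been recorded
    save_wav(path, recording.tobytes(), CHANNELS, SAMPLE_RATE)
```

Calling `record_to_file()` records five seconds from the default input device. Recording as int16 keeps the hand-over to the wave module simple, because the array bytes can be written as 16-bit PCM without any conversion.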
Record for as long as we need to
A fixed recording duration is usually not what we want. We can modify the script and add threading so that we keep recording until we press the Enter key:
There is a lot more going on here than in the fixed-length script. The callback() function appends the current increment of our recording to our list of recordings, while the wait_for_enter() function makes sure that we can stop the recording.
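The idea described above could be sketched like this. It is not necessarily the exact script from this post: the sample rate, the channel count, the file name and the helper names `make_callback()` and `record_until_enter()` are assumptions for this example:

```python
import threading
import wave

SAMPLE_RATE = 44100  # assumed; use a rate your device supports
CHANNELS = 1         # assumed; must not exceed the input channels of the device


def make_callback(chunks):
    """Build a stream callback that appends every incoming block to `chunks`."""
    def callback(indata, frames, time, status):
        if status:
            print(status)  # report overflows and similar stream warnings
        # Keep a copy: the buffer behind indata may be reused after we return.
        chunks.append(indata.copy())
    return callback


def wait_for_enter(stop_event):
    """Block until the user presses Enter, then signal the recorder to stop."""
    input("Recording... press Enter to stop.\n")
    stop_event.set()


def record_until_enter(path="recording.wav"):
    # Imported here so the helpers above can be used without audio hardware.
    import numpy as np
    import sounddevice as sd

    chunks = []
    stop_event = threading.Event()
    threading.Thread(target=wait_for_enter, args=(stop_event,), daemon=True).start()

    # The stream invokes the callback in the background while we wait here.
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=CHANNELS,
                        dtype="int16", callback=make_callback(chunks)):
        stop_event.wait()

    if not chunks:
        return  # Enter was pressed before any audio arrived

    recording = np.concatenate(chunks)  # shape: (frames, channels)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(2)  # int16 means 2 bytes per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(recording.tobytes())
```

Calling `record_until_enter()` starts the recording right away. The `threading.Event` is what connects the two parts: the helper thread sets it when Enter is pressed, and the main thread leaves the `with` block, which stops and closes the stream.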
Next
Sounddevice is a nice and modern way to record audio with Python. It gives us direct access to our sound devices and has an easy-to-understand API.
Next week we explore PyAudio to see how we can use this library to record audio files.