Want to keep up to date with the latest posts and videos? Subscribe to the newsletter
HELP SUPPORT MY WORK: If you're feeling flush then please stop by Patreon Or you can make a one off donation via ko-fi

Explore the audio capabilities of the M5Stack Core 2, and learn about the potential impact of the plastic case and differences between the PDM and PCM microphones. Discover how to set up the I2S config, manage audio processing, and create a stunning audio visualizer.


[0:00] Hey Everyone!
[0:01] I’ve been playing with the audio capabilities of the M5Stack Core 2
[0:06] The microphone on this device is tucked away in the back

[0:10] and there’s a small outlet in the plastic case to let sound in.
[0:14] The actual microphone is attached to the stack slot cover so I think if you are

[0:19] stacking the Core 2 on top of other modules you may lose the microphone functionality.
[0:24] I don’t have any other modules to confirm this so if anyone does then please leave a comment.
[0:30] This also means that any sound needs to make its way through the plastic case to the microphone

[0:36] which may have an impact on how sensitive the audio pickup will be.
[0:41] The Core 2 has an SPM1423 MEMS microphone which uses PDM instead of PCM.
[0:49] This microphone is slightly different from the other I2S microphones we’ve looked at.
[0:54] The microphones we’ve been looking at have all been PCM microphones

[0:58] where the audio signal is represented as a number on the I2S interface.
[1:03] With a PDM microphone, the audio signal is represented using Pulse Density Modulation.
[1:10] This should not be confused with PWM which is Pulse Width Modulation.
[1:16] The animation I’ve put together here shows a simulation of a PDM signal for a sine wave.
[1:23] Now, obviously, in the real world, the PDM signal would be much higher

[1:27] frequency - around 3 MHz - and the modulation algorithm would be a lot more sophisticated.
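To make the idea of pulse density concrete, here is a minimal sketch of a first-order delta-sigma modulator, one simple way of producing a PDM bit stream. It is only an illustration: real microphones use more sophisticated modulators, and the function names `pdm_encode` and `pdm_decode` are hypothetical. The decoder simply averages blocks of bits, which is a very crude stand-in for the filtering a real PDM receiver performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Encode samples in the range [-1, 1] as a 1-bit stream with a first-order
// delta-sigma modulator: accumulate the error between the input and the
// previous output level, and emit a 1 whenever the accumulator is >= 0.
std::vector<int> pdm_encode(const std::vector<double>& input) {
    std::vector<int> bits;
    bits.reserve(input.size());
    double integrator = 0.0;
    double feedback = 0.0;  // previous output level, -1.0 or +1.0
    for (double x : input) {
        integrator += x - feedback;
        const int bit = integrator >= 0.0 ? 1 : 0;
        feedback = bit ? 1.0 : -1.0;
        bits.push_back(bit);
    }
    return bits;
}

// Recover one coarse PCM value per block of `decimation` bits by measuring
// the density of ones and mapping it back to the range [-1, 1].
std::vector<double> pdm_decode(const std::vector<int>& bits, std::size_t decimation) {
    assert(decimation > 0);
    std::vector<double> pcm;
    for (std::size_t i = 0; i + decimation <= bits.size(); i += decimation) {
        int ones = 0;
        for (std::size_t j = 0; j < decimation; ++j) {
            ones += bits[i + j];
        }
        pcm.push_back(2.0 * ones / decimation - 1.0);
    }
    return pcm;
}
```

Feeding a constant input of 0.5 into `pdm_encode` produces a stream in which three out of every four bits are 1, and `pdm_decode` turns that density back into a value of 0.5.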
[1:34] Fortunately for us, this is all handled by the I2S peripheral.
[1:38] So we don’t need to worry too much about the microphone technology that is being used.
[1:43] There are however a couple of things to be aware of in the I2S setup.
[1:48] Here’s the code for setting up the I2S config.
[1:52] The first thing to note is that we have to specify I2S_MODE_PDM.
[1:58] The second thing to note is the bits per sample.
[2:02] This needs to be 16 bits.
[2:04] I tried using 32 bits and the lower 16 bits were always zero.
[2:09] I believe this is a limitation on the ESP32 side of things but it’s not documented anywhere.
[2:17] The final thing to be aware of is that we must use I2S_NUM_0.
[2:23] This is the only I2S peripheral that supports PDM.
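For reference, a configuration along these lines might look like the sketch below. It is not the code from the video: it assumes the legacy I2S driver (`driver/i2s.h`) as shipped with ESP-IDF 4.4 (older versions lack the `mck_io_num` field and use a differently named communication format), and the pin numbers, DMA buffer sizes, channel selection, and the function name `setup_pdm_microphone` are all assumptions that should be checked against your board and the GitHub project.

```cpp
#include <driver/i2s.h>

// Assumed Core 2 microphone pins; verify against the board documentation.
constexpr int kMicClockPin = 0;  // PDM clock (assumption)
constexpr int kMicDataPin = 34;  // PDM data (assumption)

// Install the legacy I2S driver on I2S_NUM_0 in PDM receive mode.
esp_err_t setup_pdm_microphone(uint32_t sample_rate) {
    i2s_config_t config = {};
    // PDM must be requested explicitly alongside master/receive mode.
    config.mode = static_cast<i2s_mode_t>(I2S_MODE_MASTER | I2S_MODE_RX | I2S_MODE_PDM);
    config.sample_rate = sample_rate;
    // 16 bits per sample, as discussed above.
    config.bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT;
    // Channel choice is an assumption; it may need swapping on your setup.
    config.channel_format = I2S_CHANNEL_FMT_ONLY_RIGHT;
    config.communication_format = I2S_COMM_FORMAT_STAND_I2S;
    config.intr_alloc_flags = ESP_INTR_FLAG_LEVEL1;
    config.dma_buf_count = 4;   // illustrative value
    config.dma_buf_len = 512;   // illustrative value
    config.use_apll = false;

    i2s_pin_config_t pins = {};
    pins.mck_io_num = I2S_PIN_NO_CHANGE;
    pins.bck_io_num = I2S_PIN_NO_CHANGE;
    pins.ws_io_num = kMicClockPin;  // in PDM mode the clock goes out on the WS pin
    pins.data_out_num = I2S_PIN_NO_CHANGE;
    pins.data_in_num = kMicDataPin;

    // Only I2S_NUM_0 supports PDM on the ESP32.
    esp_err_t err = i2s_driver_install(I2S_NUM_0, &config, 0, nullptr);
    if (err != ESP_OK) {
        return err;
    }
    return i2s_set_pin(I2S_NUM_0, &pins);
}
```

Once the driver is installed, samples can be pulled from the peripheral with `i2s_read` on `I2S_NUM_0`.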
[2:28] So, let’s record some audio and see how it sounds.
[2:31] I’m using my standard bit of test code and sending audio from the device to my machine.
[2:37] This first recording will be at a 16 kHz sampling rate.
[2:42] “Testing testing one two three”
[2:48] With the audio captured on my machine, we can open it up in Audacity and have a listen.
[2:54] We need to use the raw data import functionality and tell Audacity our sample size and sample rate.
[3:11] You can see there’s a bit of a wobble when the microphone first starts up.
[3:14] We’ve seen this on other I2S microphones.
[3:18] The audio is very quiet - this is a bit disappointing as I was pretty close to the mic.
[3:24] But we’ve seen this on other I2S microphones as well.
[3:27] We can use Audacity to amplify the signal.
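If you would rather boost the signal in code than in an audio editor, a simple peak normalisation is one option. The sketch below is not what Audacity does internally; it is a minimal, hypothetical helper (`normalize_peak`) that scales 16-bit samples so the loudest one reaches a chosen fraction of full scale. Note that it amplifies the noise by exactly the same factor as the speech.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Scale 16-bit samples in place so that the loudest sample reaches
// `target_peak` (a fraction of full scale). Returns the gain that was
// applied, or 1.0 if the input is empty or completely silent.
double normalize_peak(std::vector<int16_t>& samples, double target_peak = 0.9) {
    assert(target_peak > 0.0 && target_peak <= 1.0);
    int peak = 0;
    for (int16_t s : samples) {
        peak = std::max(peak, std::abs(static_cast<int>(s)));
    }
    if (peak == 0) {
        return 1.0;
    }
    const double gain = target_peak * 32767.0 / peak;
    for (int16_t& s : samples) {
        double scaled = std::round(s * gain);
        // Clamp defensively so rounding can never overflow int16_t.
        scaled = std::max(-32768.0, std::min(32767.0, scaled));
        s = static_cast<int16_t>(scaled);
    }
    return gain;
}
```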
[3:31] There seems to be quite a lot of noise. Let’s have a listen.
[3:43] “Testing testing one two three”
[3:49] It’s actually not too bad.
[3:51] Let’s try the same thing but at a higher sampling rate of 44.1 kHz.
[3:57] This is a simple change in the code.
[4:09] “Testing testing one two three”
[4:13] You can see how much faster it needs to send data to keep up.
[4:17] We’ll load this into Audacity as well and compare.
[4:20] Once again we use the raw data import.
[4:29] We’ve got the little wiggle at the start and we’ll need to amplify the signal.
[4:47] “Testing testing one two three”
[4:52] It is pretty noisy.
[4:53] I’m not sure where this noise is coming from but I think the signal we’re getting is usable.
[4:59] So, I’ve been having a play and I’ve built a small audio visualizer.
[5:03] Let’s have a watch and then I’ll briefly run through the code.
[5:07] 🎶 Music 🎶
[5:29] “Testing testing one two three”
[5:34] We’ll keep this walkthrough quite short.
[5:37] All the code is in GitHub so feel free to check it out and try it on your own device.
[5:43] If you have any suggestions or improvements then please open a pull request - everyone is welcome to contribute.
[5:50] Looking at the code in our main.cpp we define the number of samples we want to analyze.
[5:57] 512 samples at a sample rate of 16 kHz works out at about 32 milliseconds.
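The window length follows directly from dividing the sample count by the sample rate, and the same two numbers also fix the width of each FFT bin. The helper names below are hypothetical and only there to show the arithmetic.

```cpp
#include <cassert>

// Time span in milliseconds covered by `sample_count` samples.
inline double window_duration_ms(int sample_count, int sample_rate_hz) {
    assert(sample_count > 0 && sample_rate_hz > 0);
    return 1000.0 * sample_count / sample_rate_hz;
}

// Width in Hz of each frequency bin for an FFT of `sample_count` points.
inline double fft_bin_width_hz(int sample_count, int sample_rate_hz) {
    assert(sample_count > 0 && sample_rate_hz > 0);
    return static_cast<double>(sample_rate_hz) / sample_count;
}
```

For 512 samples at 16 kHz this gives a 32 ms window and bins that are 31.25 Hz wide, so a larger sample count would give finer frequency resolution at the cost of a slower-updating display.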
[6:04] We have the config for the PDM microphone and we have the pins for the Core 2.
[6:12] I’ve structured my code into three main concerns.
[6:15] We have the audio processing.
[6:18] This code is responsible for running an FFT against the audio samples.
[6:23] We have the I2S code.
[6:25] This is responsible for reading samples from the I2S peripheral.
[6:30] And we have the UI.
[6:32] This contains the graphic equalizer, the spectrogram, and the waveform UI elements.
[6:39] Finally, we have our application class that coordinates everything.
[6:45] When our application class starts up it creates a task to handle processing the samples

[6:50] and passes that task handle to the I2S sampler.
[6:56] If we look at the I2S sampler we can see that when it fills up one of its buffers

[7:01] it notifies the processing task to start processing the samples.
[7:06] The processing task tells our audio processor to update its state with the new samples.
[7:12] And it then tells the UI to update.
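The fill-a-buffer-then-notify pattern described here can be sketched in a few lines. This is not the project's sampler: the class name `DoubleBufferSampler` is hypothetical, and a plain `std::function` callback stands in for the task notification so that the example runs on a desktop. On the device, the callback's role would be played by a FreeRTOS task notification that wakes the processing task.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical double-buffered sampler: samples are written into one buffer
// while the other, already full, buffer is handed to the consumer.
template <std::size_t N>
class DoubleBufferSampler {
public:
    using Callback = std::function<void(const std::array<int16_t, N>&)>;

    explicit DoubleBufferSampler(Callback on_full) : on_full_(std::move(on_full)) {
        assert(static_cast<bool>(on_full_));
    }

    // Append one sample. When the active buffer fills up, switch to the
    // other buffer and notify the consumer that the full one is ready.
    void push(int16_t sample) {
        buffers_[active_][fill_++] = sample;
        if (fill_ == N) {
            const std::size_t full = active_;
            active_ = 1 - active_;
            fill_ = 0;
            on_full_(buffers_[full]);
        }
    }

private:
    std::array<std::array<int16_t, N>, 2> buffers_{};
    std::size_t active_ = 0;
    std::size_t fill_ = 0;
    Callback on_full_;
};
```

Because writing continues into the second buffer, the consumer has one full buffer period to finish with the data before it is overwritten.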
[7:17] Looking at our processor it grabs a copy of the samples from the buffer

[7:22] and applies a Hamming window before running the FFT.
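As a rough illustration of this step, the sketch below applies a Hamming window and then computes the magnitude spectrum. To stay self-contained it uses a naive O(n²) DFT rather than a real FFT, so it shows the maths and is far too slow for the device; the function names are hypothetical and this is not the project's processor code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Symmetric Hamming window: w[i] = 0.54 - 0.46 * cos(2*pi*i / (n - 1)).
std::vector<double> hamming_window(std::size_t n) {
    assert(n > 1);
    std::vector<double> w(n);
    for (std::size_t i = 0; i < n; ++i) {
        w[i] = 0.54 - 0.46 * std::cos(2.0 * kPi * i / (n - 1));
    }
    return w;
}

// Magnitudes of bins 0..n/2 of the windowed signal, computed with a naive
// DFT. A real implementation would use an FFT library here instead.
std::vector<double> windowed_spectrum(const std::vector<double>& samples) {
    const std::size_t n = samples.size();
    const std::vector<double> w = hamming_window(n);
    std::vector<double> magnitudes(n / 2 + 1);
    for (std::size_t k = 0; k <= n / 2; ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            const double angle = 2.0 * kPi * k * i / n;
            const double x = samples[i] * w[i];
            re += x * std::cos(angle);
            im -= x * std::sin(angle);
        }
        magnitudes[k] = std::sqrt(re * re + im * im);
    }
    return magnitudes;
}
```

The window tapers the ends of each block towards a small value, which reduces the spectral leakage that would otherwise smear a single tone across many bins.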
[7:26] For the UI’s update method, we update each individual UI component and then

[7:31] we trigger a drawing task to do the actual rendering of each component.
[7:36] This nicely decouples drawing which can be quite time consuming

[7:39] from the processing of the audio data.
[7:43] I won’t go into too much detail on the code as it’s all in GitHub.
[7:47] If you’d like me to do a more detailed walkthrough

[7:49] then please leave a comment and I’ll do a follow-up video.
[7:53] So that’s it for this video.
[7:55] I hope you enjoyed it and, as always, if you did, please give it a big thumbs up

[7:59] and don’t forget to subscribe.
[8:01] I’ll see you in the next video.
[8:05] 🎶 Music 🎶


Chris Greening



A collection of slightly mad projects, instructive/educational videos, and generally interesting stuff. Building projects around the Arduino and ESP32 platforms - we'll be exploring AI, Computer Vision, Audio, 3D Printing - it may get a bit eclectic...
