View All Posts
Want to keep up to date with the latest posts and videos? Subscribe to the newsletter
HELP SUPPORT MY WORK: If you're feeling flush then please stop by Patreon Or you can make a one off donation via ko-fi

Learn how to decode and play MP3 audio files on the ESP32 with both headphone support and I2S digital amplifiers. Discover techniques to enhance audio quality and reduce power interference for clearer sound.


[0:00] This video is being brought to you with the support of the channel sponsor PCBWay - I’m

[0:04] actually using a PCB that I had assembled by them in this project - I’ll be making

[0:09] some more PCBs in future videos as I want to learn a bit more about KiCad.

[0:13] They also do 3D printing and CNC work. Check out the link to PCBWay in the description.
[0:19] We’re back with a bit of Audio - we’ve played back audio in plenty

[0:22] of previous videos - check out the audio projects playlist I’ve linked

[0:26] to in the description - so you might be asking: what’s new in this video?
[0:30] Well, in previous videos we’ve been playing back uncompressed audio - either

[0:34] raw samples or WAV files. This time we’re playing back an MP3 audio file.
[0:40] What’s so great about playing back MP3 files? Well, when we’re dealing with the ESP32

[0:45] size does matter - size of audio files that is.
[0:49] Typically on the ESP32 we’re limited by the size of the flash storage - on most ESP32 devices

[0:55] this is about 4MBytes and you need to reserve some of that space for your actual firmware.
[1:00] With a small app partition and over the air support enabled you can get almost 2MB of SPIFFS

[1:06] storage. If you don’t need over the air updates then you can get closer to 3MB for SPIFFS. But

[1:11] as your app grows in size you’ll need to decrease the amount of space you’ve allocated to storage.
[1:16] If you’re dealing with audio data it quickly takes up a lot of space. If we have stereo audio sampled

[1:22] at 44.1kHz with 16-bit samples then every second of audio data takes up about 172KBytes. If we’ve

[1:30] got around 1MB spare on our flash for audio data we can only store 5 or 6 seconds worth of audio.
[1:37] A normal single is around 3.5 minutes long and would require about 35MBytes

[1:42] to store uncompressed. Obviously, this is not going to fit in a SPIFFS partition.
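
The figures above can be checked with a little arithmetic. This is a minimal sketch, assuming uncompressed 16-bit stereo PCM at 44.1kHz and ignoring the small WAV header; the function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
CHANNELS = 2              # stereo
BYTES_PER_SAMPLE = 2      # 16-bit samples

KIB = 1024
MIB = 1024 * 1024


def pcm_bytes(seconds: float) -> int:
    """Bytes needed to store this many seconds of uncompressed audio."""
    return int(seconds * SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE)


def seconds_that_fit(storage_bytes: int) -> float:
    """How many seconds of uncompressed audio fit in the given storage."""
    return storage_bytes / pcm_bytes(1)


print(f"one second:       {pcm_bytes(1) / KIB:.0f} KBytes")
print(f"fits in 1MB:      {seconds_that_fit(MIB):.1f} seconds")
print(f"3.5 minute song:  {pcm_bytes(3.5 * 60) / MIB:.0f} MBytes")
```
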
[1:46] There are of course some shorter songs - the Guinness World Record for the shortest song ever

[1:51] published is 1.3 seconds - I’ve put a link to it in the description for you to have a listen.
[1:56] Assuming you want to listen to something that lasts more

[1:58] than 1.3 seconds we’ll need to compress the audio.
[2:02] MP3 is a popular compression technique for audio files and can reduce the size of the audio by 75-95% compared to the original.
[2:09] It became popular back in the mid to late 90s with services like Napster taking off.
[2:14] I’ve run a fairly short song through various different bit rates to see how well it performs.

[2:19] The song is only 2 minutes 41 seconds long so only takes up around 28MBytes in WAV format.
[2:25] As we decrease the bit rate the audio is encoded at we decrease the file size dramatically.
[2:29] Even on the very low setting of 45-85kbps, the quality is very good

[2:35] and we are down to less than a megabyte in size.
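
As a rough guide, the size of an MP3 file is just the bit rate multiplied by the duration. The sketch below assumes a constant bit rate, which is a simplification: the low setting mentioned above is a variable range, so real files will differ. The 320 and 128kbps rows are example values chosen for comparison, not figures from the video.

```python
def mp3_bytes(bitrate_kbps: float, seconds: float) -> int:
    """Approximate MP3 size, assuming a constant bit rate (1 kbps = 1000 bits/s)."""
    return int(bitrate_kbps * 1000 * seconds / 8)


SONG_SECONDS = 2 * 60 + 41                  # the 2 minute 41 second song
WAV_BYTES = SONG_SECONDS * 44_100 * 2 * 2   # 16-bit stereo at 44.1kHz

for kbps in (320, 128, 45):
    size = mp3_bytes(kbps, SONG_SECONDS)
    saving = 1 - size / WAV_BYTES
    print(f"{kbps:>3} kbps: {size / 1e6:.2f} MB, {saving:.0%} smaller than WAV")
```
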
[2:38] With MP3 decoding we can fit a song into SPIFFS reasonably easily. Obviously,

[2:43] if you wanted more than one song then you’d probably need to switch to an SD Card for storage,

[2:47] but even for short audio samples encoding them as MP3 would give you a massive saving in space.
[2:53] So, how do we decode the MP3 data? I’ve found a

[2:56] very nice standalone MP3 decoder that is self-contained in a single header file.

[3:01] We just need to feed this data from the MP3 file and it will give us 16-bit audio samples to play.
[3:07] The decoder decodes one frame of data at a time from the buffer and tells us how much data it has

[3:12] consumed. We shift the data down by this number of bytes and top the buffer up from the file.
[3:17] We keep running this until we have no more data in the file

[3:20] and all the buffered data has been used up.
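
The buffer handling described above can be sketched as follows. This is an illustration of the loop only, not the project's code: `decode_frame` is a hypothetical interface standing in for the real decoder, and `fake_decode_frame` is a toy stand-in so the example runs on its own.

```python
import io
import struct
from typing import BinaryIO, Callable, List, Tuple

# Hypothetical decoder interface: given the buffered bytes, decode at most
# one frame and return (bytes_consumed, decoded_16_bit_samples).
DecodeFrame = Callable[[bytes], Tuple[int, List[int]]]


def decode_stream(source: BinaryIO, decode_frame: DecodeFrame,
                  buffer_size: int = 1024) -> List[int]:
    """Decode a whole stream frame by frame, keeping the buffer topped up."""
    samples: List[int] = []   # a real player would output these as it goes
    buffer = bytearray(source.read(buffer_size))
    while buffer:
        consumed, frame = decode_frame(bytes(buffer))
        samples.extend(frame)
        if consumed == 0:
            break  # leftover bytes do not make up a full frame
        del buffer[:consumed]                              # shift the data down
        buffer += source.read(buffer_size - len(buffer))   # top up from the file
    return samples


def fake_decode_frame(data: bytes) -> Tuple[int, List[int]]:
    """Toy decoder: each 4-byte 'frame' holds two little-endian int16 samples."""
    if len(data) < 4:
        return 0, []
    return 4, list(struct.unpack("<2h", data[:4]))


pcm = struct.pack("<6h", 1, 2, 3, 4, 5, 6)
print(decode_stream(io.BytesIO(pcm), fake_decode_frame, buffer_size=8))
```

The loop ends when the file has no more data and the buffer has been emptied, or when the decoder can no longer make progress on what is left.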
[3:22] So, how do we play back the audio?
[3:25] I’ve added two different options for you - we can play

[3:27] the data using ESP32’s built-in digital to analogue converter and some headphones

[3:32] or we can output I2S data straight to a digital amplifier.
[3:37] Let’s cover the headphone option first. Wiring up headphones is pretty straightforward,

[3:42] you can either get a little breakout board with the headphone socket as I have, or you

[3:46] can hack up an old set of headphones and connect header pins to the cable.
[3:50] Headphone jacks generally come in a couple of variations.

[3:53] You have some that come with a microphone and some that just have headphones. The tip of the

[3:58] socket is connected to the left earpiece, the first ring is connected to the right earpiece.

[4:03] Most headphones have the ground connection next followed by the microphone on the final connector.

[4:09] Be aware that there are some manufacturers such as Nokia who used a different standard

[4:13] and have the microphone on ring 2 and ground on the sleeve.
[4:17] Our DAC outputs the audio signal with a DC bias so we need to put a DC blocking capacitor between

[4:23] the signal and the headphone connector. This will block the DC element and only allow through the AC

[4:28] signal. The size of the capacitor will determine how well the system responds to low-frequency

[4:33] signals. I’m just using a couple of 47 microfarad capacitors here that I had lying around.
[4:38] I’m not trying to build a high quality audio system.
[4:41] I found the ESP32’s DAC output quite capable of driving the headphones,

[4:46] but to make sure I don’t damage the ESP32 by drawing too much current I’ve put a resistor

[4:50] in series. I found that a value of 500Ω to 2kΩ seems to work well and still gives a reasonable volume.
[4:57] But you might want to play with this value yourself.
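
To get a feel for how the capacitor and series resistor interact, here is a rough calculation. It treats the output as a simple series RC high-pass filter and assumes 32Ω headphones, which is a typical value and not one given in the video; the DAC's own output impedance is ignored.

```python
import math


def highpass_cutoff_hz(capacitance_f: float, resistance_ohm: float) -> float:
    """-3dB point of a series RC high-pass filter: f = 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


def headphone_signal_fraction(series_ohm: float, headphone_ohm: float) -> float:
    """Fraction of the DAC voltage that reaches the headphones (voltage divider)."""
    return headphone_ohm / (series_ohm + headphone_ohm)


CAP_F = 47e-6          # the 47 microfarad DC blocking capacitor
HEADPHONE_OHM = 32.0   # assumption: typical headphone impedance

for series_ohm in (500.0, 1000.0, 2000.0):
    total_ohm = series_ohm + HEADPHONE_OHM
    cutoff = highpass_cutoff_hz(CAP_F, total_ohm)
    fraction = headphone_signal_fraction(series_ohm, HEADPHONE_OHM)
    print(f"{series_ohm:>6.0f} ohm: cutoff {cutoff:.1f} Hz, "
          f"{fraction:.1%} of the signal reaches the headphones")
```

Under these assumptions a larger series resistor pushes the cutoff lower but also reduces the volume, which is why the value is worth experimenting with.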
[5:00] The DAC output is quite noisy - we can see this on the oscilloscope, and even with the

[5:03] volume turned down to zero there is audible noise on the headphones.
[5:07] My initial thought was that this could be power supply noise

[5:10] so I tried it with a battery power supply and the noise was still present.
[5:13] So, your mileage may vary, but I found it a little bit noisy.
[5:17] But actually, it’s quite listenable.
[5:20] I’ve recorded the audio as best I can from the headphones,

[5:22] it’s annoying that not many computers actually have a line in port nowadays.
[5:26] So I’ve had to use my headphones and strap them round my phone.
[5:29] Let’s have a listen.
[5:31] MUSIC
[5:51] It’s actually pretty good!
[5:52] I’m quite pleased with the audio output.
[5:54] The other way to get audio out of the ESP32 is to use the I2S interface

[5:59] connected directly to an amplifier such as the MAX98357 - I’ve got a

[6:03] whole video on using this device - there’s a link in the description.
[6:07] Wiring up is very straightforward, we just need three pins on the ESP32 - one for the LR clock,

[6:13] one for the serial clock and one for the serial data. If we want to have stereo sound

[6:19] then we need two MAX98357 boards - one for the left channel and one for the

[6:24] right channel. If you’re using the breakout board from Adafruit then you select the

[6:28] channel for the board by connecting a resistor from the SD pin to the power supply.
[6:32] Calculating this value is a bit complicated - the board has a resistor divider connected to

[6:37] the pin already - so you need to take this into account when calculating your pull up resistor - I

[6:42] actually got it wrong in the previous video. A value that should work is around 390kΩ.
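
The calculation can be sketched like this. Every component value and threshold below is an assumption recalled from the MAX98357A datasheet and the Adafruit board documentation (a 1MΩ pull-up on the board, a 100kΩ pull-down on the pin, and typical thresholds of 0.16V, 0.77V and 1.4V), so please check them against the datasheet before relying on the result. The video does not say which channel 390kΩ selects; under these assumptions it works out as the right channel.

```python
from typing import Optional


def parallel(a_ohm: float, b_ohm: float) -> float:
    """Combined resistance of two resistors in parallel."""
    return a_ohm * b_ohm / (a_ohm + b_ohm)


def sd_pin_voltage(supply_v: float, external_pullup_ohm: Optional[float],
                   board_pullup_ohm: float = 1_000_000.0,   # assumed value
                   pulldown_ohm: float = 100_000.0) -> float:  # assumed value
    """Voltage on the SD pin once an external pull-up is added to the divider."""
    pullup = board_pullup_ohm
    if external_pullup_ohm is not None:
        pullup = parallel(board_pullup_ohm, external_pullup_ohm)
    return supply_v * pulldown_ohm / (pulldown_ohm + pullup)


def channel_for(voltage: float) -> str:
    """Map the SD pin voltage to a mode, using assumed typical thresholds."""
    if voltage < 0.16:
        return "shutdown"
    if voltage < 0.77:
        return "(left + right) / 2"
    if voltage < 1.4:
        return "right"
    return "left"


for pullup in (None, 390_000.0):
    volts = sd_pin_voltage(5.0, pullup)
    print(f"external pull-up {pullup}: {volts:.2f} V -> {channel_for(volts)}")
```
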
[6:47] I’ve got my own stereo breakout board based around the same chip that outputs stereo to two speakers and

[6:52] cuts down on the wiring. I’ve linked to the design video for that board in the description.
[6:57] The MAX98357 board has a selectable gain.

[7:00] At maximum gain, each amplifier will require over 1 amp. That means that if you are using

[7:05] two amplifiers for stereo sound then you’ll need about 3 amps in total.
[7:09] Drawing this much current can pull down the voltage coming from the power supply

[7:13] and can cause the ESP32 to brownout if it gets pulled too low.
[7:17] I’ve connected the voltage coming into the ESP32 dev board from the power supply to my oscilloscope

[7:23] and we can see the effect here. The voltage is being pulled down by almost 2 volts.
[7:28] Eventually, we have a sustained loud piece of

[7:30] music and the ESP32 cuts out due to a brownout and we get a reset.
[7:34] To mitigate this we can add a bunch of capacitors to the power pins of the amplifier. I’ve added

[7:40] 3x1000 microfarad capacitors to mine and now we can get through the whole song at maximum volume.
[7:45] The voltage is still being pulled down, but not as badly as before.
[7:50] Ideally, you’d probably want to use a slightly more powerful power supply

[7:53] along with some reservoir capacitors near the amplifiers.
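
A back-of-the-envelope way to size reservoir capacitors is the relation ΔV = I × Δt / C. The sketch below is a worst-case simplification: it assumes the capacitors alone supply a constant current for the length of a loud burst, with no help from the power supply. The 1 millisecond burst length is an illustrative assumption, not a measurement from the video.

```python
def voltage_droop(current_a: float, duration_s: float, capacitance_f: float) -> float:
    """Voltage drop across a capacitor supplying a constant current: dV = I * dt / C."""
    return current_a * duration_s / capacitance_f


def capacitance_needed(current_a: float, duration_s: float, max_droop_v: float) -> float:
    """Capacitance required to keep the droop within max_droop_v."""
    return current_a * duration_s / max_droop_v


RESERVOIR_F = 3 * 1000e-6   # three 1000 microfarad capacitors
PEAK_CURRENT_A = 3.0        # roughly the total current mentioned for two amplifiers
BURST_S = 1e-3              # assumption: a 1 millisecond burst

droop = voltage_droop(PEAK_CURRENT_A, BURST_S, RESERVOIR_F)
print(f"droop over the burst: {droop:.2f} V")

needed = capacitance_needed(PEAK_CURRENT_A, BURST_S, 0.5)
print(f"capacitance for a 0.5 V droop: {needed * 1e6:.0f} microfarads")
```
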
[7:57] MUSIC
[8:05] As a bonus, I’ve added a volume control using a potentiometer connected between

[8:09] 0V and the 3.3V line. We measure this using the Analog to Digital Converter on the ESP32
[8:15] and use the measured value to set the volume.
[8:17] There’s an interesting thing that’s quite important about volume and how we perceive

[8:21] loudness. We don’t perceive loudness on a linear scale, so if you are using a linear potentiometer

[8:27] like I am to set the volume you should make sure to amplify the sound in a non-linear way.
[8:32] You can do this simply by squaring or cubing the volume.

[8:35] This gives a much more pleasing volume control.
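
Here is a minimal sketch of that idea. It assumes a 12-bit ADC reading (0 to 4095) and 16-bit signed samples; the function names are invented for illustration and are not taken from the project code.

```python
from typing import List

ADC_MAX = 4095  # assumption: 12-bit ADC reading


def volume_gain(adc_raw: int, exponent: int = 2, adc_max: int = ADC_MAX) -> float:
    """Turn a linear potentiometer reading into a non-linear gain from 0.0 to 1.0."""
    position = min(max(adc_raw, 0), adc_max) / adc_max   # linear position, 0..1
    return position ** exponent                          # squared (or cubed) curve


def apply_volume(samples: List[int], gain: float) -> List[int]:
    """Scale 16-bit samples by the gain, clamping to the int16 range."""
    return [max(-32768, min(32767, int(s * gain))) for s in samples]


for raw in (0, 1024, 2048, 3072, 4095):
    print(f"adc={raw:>4}  squared={volume_gain(raw, 2):.3f}  cubed={volume_gain(raw, 3):.3f}")
```

With the squared curve, the pot at its halfway point gives a gain of about a quarter rather than a half, which spreads the quieter settings over more of the travel.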
[8:38] So that’s it for this project - it would be quite easy to hook up an SD Card and

[8:42] an ESP32 with a display and you’ve got yourself a very rudimentary MP3 player!
[8:47] As always the code is on GitHub. Let me know how you get on!


Chris Greening


A collection of slightly mad projects, instructive/educational videos, and generally interesting stuff. Building projects around the Arduino and ESP32 platforms - we'll be exploring AI, Computer Vision, Audio, 3D Printing - it may get a bit eclectic...
