In this video, we set up movie file playback from an SD card on an ESP32. There were a few bumps along the way, such as accidentally wiring the SD card to the USB data pins and the complexity of most video container formats, but our quirky PCBWay board came through. We considered splitting the movie into separate audio and motion JPEG files, or creating a simple custom container format, but settled on AVI files generated with ffmpeg, the same format the tiny TV guys use, because the ESP32 can parse them with fairly basic code and they still play on the desktop. Along the way we learned a thing or two about list chunks, sub formats, and hex dumps. The result? Smooth audio playback, with video frames skipped where needed to stay in sync. Check out the Wi-Fi streaming version for more fun!
[0:19] That’s right, we’re now able to play movie files from an SD card on the ESP32.
[0:24] My little board from PCBWay is now getting slightly bonkers, it’s a bit Heath Robinson.
[0:30] It’s definitely time for version 2.
[0:31] Thanks to the Patreons for their support, it’s really appreciated,
[0:35] and they’ve been getting a few updates along the way.
[0:37] And also thanks to PCBWay for supplying these boards.
[0:40] We’ll do a new version soon and make use of their services again.
[0:43] They are really great, give them a go.
[0:45] If you’re more sensible than me, then you should probably just use an ESP32 board
[0:49] that has an SD card built in, or even just get a display board with one as well.
[0:54] And of course, there are plenty of SD card breakout boards that you can buy.
[0:57] But if you want to have a bit of fun, you can just connect an SD card directly to an ESP32.
[1:03] They use SPI, so in theory it’s quite straightforward.
[1:06] There’s only 4 signal wires plus power and ground.
[1:09] So I’ve got a bunch of these micro SD adapters, and if you’re careful not to melt the plastic,
[1:14] you can tin the connectors and solder some wires on.
[1:16] There were two ground pins, so to cut down on wires I’ve just joined these together.
[1:21] We’ll need a wire for ground and another for power.
[1:23] And then we need MOSI, chip select, MISO and finally serial clock.
[1:27] These can all be connected directly to our ESP32.
[1:31] And if you’re very observant, you’ll see me making a complete cock-up here.
[1:34] I’ve hooked up two of the wires to the USB data pins,
[1:37] which caused quite a bit of head scratching, with things randomly connecting and disconnecting.
[1:41] The ground pin connects nicely to our spare ground pin,
[1:44] but unfortunately we’ve only got one 3.3 volt pad, and that’s already used by the display.
[1:49] To avoid accidentally desoldering this, I’m going to take power from the top of these handy decoupling capacitors.
[1:55] Something for version 2 of the PCB is a whole bunch of power connectors.
[1:59] With the two wires moved so they don’t conflict with the USB data pins,
[2:02] we now have a working SD card. That’s very, very cool.
[2:05] Obviously the real challenge of this project is not connecting up the SD card,
[2:09] it’s what we’re going to store on the SD card.
[2:12] Most movie container formats are ludicrously complicated and require big libraries to read them,
[2:17] so I consulted with an expert on my options.
[2:20] One reasonable idea would be to convert the movies into two files,
[2:24] one containing the audio data and another containing the motion JPEG data,
[2:29] which would just be a sequence of JPEG files concatenated together.
[2:32] This would work quite well and would be very easy to process.
[2:35] Another option would be to create my own very simple custom format
[2:39] and interleave the audio and video data with some very simple headers.
[2:43] That’s also not a bad suggestion, but there’s a couple of downsides to both these approaches.
[2:47] We could probably do the first suggestion using the command line and ffmpeg,
[2:51] but it’s a bit of a pain.
[2:52] The second option would need some custom code to convert the video files,
[2:55] which might be fun but feels like more work.
[2:58] The biggest downside with both these suggestions
[3:01] is that you wouldn’t be able to view the converted video on your desktop,
[3:04] so it’s just not very user friendly.
[3:06] The tiny TV guys use an AVI file to store their videos,
[3:10] and although this is still a bit complicated, this video container is simple enough for us to parse on
[3:15] the ESP32 with fairly basic code, especially if we make some assumptions about what will be in the
[3:20] file and we skip most of the metadata. We can generate an AVI file using ffmpeg,
[3:26] and the resulting file plays nicely on our desktop.
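The video does not give the exact ffmpeg command, so the following is only a sketch of what such a conversion might look like. The function name `build_ffmpeg_command` is hypothetical, and the resolution, frame rate, JPEG quality, sample rate and the choice of 8-bit mono PCM audio are all assumptions picked to suit a small display, not the settings used in the video.

```python
def build_ffmpeg_command(src, dst, width=320, height=240, fps=15, sample_rate=16000):
    """Build an ffmpeg argument list that writes a motion JPEG + PCM audio AVI.

    Every default value here is an assumption for illustration only.
    """
    return [
        "ffmpeg", "-i", src,
        # Scale to the display size and reduce the frame rate.
        "-vf", f"scale={width}:{height},fps={fps}",
        # Motion JPEG video: each frame is stored as a standalone JPEG.
        "-c:v", "mjpeg", "-q:v", "10",
        # Uncompressed unsigned 8-bit mono audio, which needs no decoder.
        "-c:a", "pcm_u8", "-ar", str(sample_rate), "-ac", "1",
        dst,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine with ffmpeg installed. The output file name should end in `.avi` so that ffmpeg picks the AVI container.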
[3:29] If we have a look at the AVI file in a hex dump, we can see how it’s made up.
[3:33] In a valid file, the first four bytes are RIFF.
[3:36] That’s because the AVI file is actually a subformat of the Resource Interchange File Format.
[3:41] The next four bytes are the length of the file,
[3:43] and then we have another four bytes that tell us this is an AVI file.
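As a rough illustration of that 12-byte header, here is a minimal desktop sketch in Python (the ESP32 code itself would be C++). The function name `parse_riff_header` is hypothetical. It assumes the standard RIFF layout: a little-endian 32-bit size, followed by the form type `AVI ` with a trailing space.

```python
import struct

def parse_riff_header(data: bytes) -> int:
    """Validate the first 12 bytes of an AVI file and return the RIFF size field."""
    if len(data) < 12:
        raise ValueError("file too short for a RIFF header")
    # "<4sI4s" = 4-byte tag, little-endian uint32, 4-byte form type.
    magic, size, form = struct.unpack("<4sI4s", data[:12])
    if magic != b"RIFF":
        raise ValueError("not a RIFF file")
    if form != b"AVI ":
        raise ValueError("RIFF file, but not an AVI")
    # In RIFF the size field counts the bytes after the field itself,
    # so it is normally the total file size minus 8.
    return size
```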
[3:47] Following on from this, we have chunks of list data. Within these we have data chunks.
[3:52] We need to find the list chunk that has the subtype "movi", short for movie.
[3:55] This is where our video and audio data can be found.
[3:58] So we just skip through the file, reading the chunk type and chunk length,
[4:01] and then if it’s a list chunk, we can check what the subtype is.
[4:05] Once we find the movie subtype, we can start extracting the video and audio data.
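The skip-through described above could be sketched like this in Python, as a desktop illustration rather than the actual ESP32 code. The function name `find_movi` is hypothetical. The sketch assumes the list type is stored on disk as the four-character code `movi`, and that a LIST chunk's length includes its 4-byte subtype.

```python
import io
import struct

def find_movi(f) -> int:
    """Seek f to the first chunk inside the LIST 'movi' chunk.

    Returns the number of bytes of chunk data inside that list.
    """
    f.seek(12)  # skip "RIFF", the size field and "AVI "
    while True:
        header = f.read(8)
        if len(header) < 8:
            raise ValueError("no movi list found")
        chunk_id, length = struct.unpack("<4sI", header)
        if chunk_id == b"LIST":
            list_type = f.read(4)
            if list_type == b"movi":
                return length - 4  # length includes the 4-byte subtype
            # Not the list we want: skip the rest of it (plus any pad byte).
            f.seek(length - 4 + (length & 1), io.SEEK_CUR)
        else:
            # Ordinary chunk: skip its data (plus any pad byte).
            f.seek(length + (length & 1), io.SEEK_CUR)
```

On return, the file position is at the first audio or video chunk, so the caller can start reading frames straight away.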
[4:09] The video data has type 00dc, followed by four bytes for the length,
[4:14] and if we look at the data we can see that we just have a jpeg file.
[4:17] If we scroll down to the end of the jpeg file, we can find a chunk of audio data.
[4:21] This has the type 01wb. If we search for all occurrences of 00dc,
[4:26] we can see we have a bunch of jpeg frames. That’s our motion jpeg.
[4:30] And if we search for 01wb, we can see our audio data.
[4:34] The only slight gotcha which caught me out is that the chunk data is all word aligned.
[4:39] So if you have a chunk data length that is an odd value,
[4:41] you have to read one extra dummy byte to keep within alignment.
[4:44] So it’s a reasonably simple file to parse. We just need to locate the list chunk with
[4:49] the movie type and then read each chunk of video or audio data.
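Reading the chunks, including the dummy pad byte after odd-length data, might look something like the sketch below. It is a Python illustration, not the ESP32 implementation, and `iter_movi_chunks` is a hypothetical name. It assumes the file is already positioned at the first chunk inside the movie list, that the chunk IDs on disk are lowercase `00dc` and `01wb`, and that video is stream 0 and audio is stream 1.

```python
import io
import struct

def iter_movi_chunks(f, movi_bytes):
    """Yield ("video" | "audio", payload) for each chunk in the movie list."""
    remaining = movi_bytes
    while remaining >= 8:
        header = f.read(8)
        if len(header) < 8:
            break  # truncated file
        chunk_id, length = struct.unpack("<4sI", header)
        payload = f.read(length)
        pad = length & 1
        if pad:
            f.read(1)  # chunks are word aligned: discard the dummy byte
        remaining -= 8 + length + pad
        if chunk_id == b"00dc":
            yield "video", payload  # one complete JPEG frame
        elif chunk_id == b"01wb":
            yield "audio", payload  # raw audio samples
        # Any other chunk type is ignored in this sketch.
```

A generator keeps only one chunk in memory at a time, which matches how a memory-constrained player would need to work.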
[4:52] So as with the network streaming in the previous video,
[4:55] I’m going to read the audio and video data independently.
[4:58] We’ll read and play the audio data sequentially and use that to keep track of elapsed playtime.
[5:03] For the video data, we’ll read each individual jpeg frame,
[5:06] but if needed we’ll allow the code to skip over frames so we can keep
[5:10] in sync with the audio playback. This should keep the audio playing nicely
[5:13] without any stuttering, and we may just drop a few frames now and then due to slow rendering speeds.
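One way to picture the frame skipping is to treat the audio as the clock: the number of samples played so far tells us which frame should be on screen. The following is a simplified Python sketch of that idea, not the code from the video. Both function names are hypothetical, and it assumes a constant sample rate and an integer frame rate.

```python
def frame_for_audio_position(samples_played: int, sample_rate: int, fps: int) -> int:
    """Index of the frame that should be on screen after samples_played samples."""
    # elapsed seconds = samples_played / sample_rate; frame = elapsed * fps
    return samples_played * fps // sample_rate

def frames_to_skip(last_shown: int, samples_played: int, sample_rate: int, fps: int) -> int:
    """How many frames to read past without decoding before drawing the next one."""
    target = frame_for_audio_position(samples_played, sample_rate, fps)
    # If the display is already at or ahead of the audio, skip nothing.
    return max(0, target - last_shown - 1)
```

For example, with 16 kHz audio at 15 frames per second, 6400 samples played corresponds to frame 6. If the last frame drawn was frame 3, frames 4 and 5 are skipped and frame 6 is decoded next.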
[5:19] The code should work on pretty much any ESP32 board.
[5:22] I’ve got it running here on the cheap yellow display.
[5:24] So that’s streaming video from an SD card. If you want to learn about the Wi-Fi streaming version,
[5:29] then watch the video up on the screen right now.