OMX Hardware Transcoding on Raspberry Pi

I have deployed a Pi4 as a Channels DVR server and it works great. It can even software transcode, but barely.

The most recent Emby release has enabled OMX hardware encoding on the Pi. AFAIK Emby also uses ffmpeg for transcoding, so there must be a way to get this goodness working for Channels! Happy to test any such attempt. The Pi4 is a low-power, low-cost, tiny device that enables lots of other cool functionality.

Would be curious to hear if the Emby stuff works on the rpi4. Looks like they originally only tested with rpi3 a few years ago: https://emby.media/community/index.php?/topic/51504-rpi-specific-request

No personal experience, but it looks like it.

Well, I just spent an unproductive couple of hours trying to find the source for Emby’s version of ffmpeg, only to come to understand that their new business model seems to rely on obfuscating the source and not living up to the terms of the licenses of their backends. But I guess the OMX support must be in some new ffmpeg release.

BTW, I poked around briefly, but couldn’t find Channels’ link to the source code for the licensed content you distribute (like ffmpeg). Where can I find that?

Keen to test any OMXified ffmpeg you can deploy. Thanks.

I'm not certain, but Channels' FFmpeg sources might be the fork on Aman's GitHub, perhaps.

We track ffmpeg upstream closely and most of our patches are quickly merged back into official ffmpeg releases.

OMX RPi support is built into ffmpeg already, as is mmal. You can compile with --enable-omx-rpi
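Before going further, it may help to confirm that a given ffmpeg build actually lists the Pi hardware codecs. The sketch below is only an illustration: `has_codec` and `report_hw_codecs` are made-up helper names, and it relies on nothing more than ffmpeg's standard `-hide_banner`, `-encoders` and `-decoders` options.

```shell
#!/bin/sh
# usage: has_codec <ffmpeg-binary> <-encoders|-decoders> <codec-name>
# Succeeds when the named codec appears in the build's encoder or decoder list.
has_codec() {
  "$1" -hide_banner "$2" 2>/dev/null | grep -qw -- "$3"
}

# Print one yes/no line for each Pi hardware codec mentioned in this thread.
# (Helper names are hypothetical; the codec names come from the posts here.)
report_hw_codecs() {
  ffmpeg_bin=$1
  for pair in "-encoders h264_omx" "-decoders h264_mmal" "-decoders mpeg2_mmal"; do
    list=${pair%% *}    # -encoders or -decoders
    name=${pair#* }     # codec name
    if has_codec "$ffmpeg_bin" "$list" "$name"; then
      echo "$name: yes"
    else
      echo "$name: no"
    fi
  done
}
```

Something like `report_hw_codecs ffmpeg` (or the path to a self-compiled binary) would then show which of the three are compiled in. Note that a codec being listed only means it was built; it can still fail at runtime if the hardware or firmware does not support it.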

I did some testing when the patch was originally being discussed: https://ffmpeg.org/pipermail/ffmpeg-devel/2016-May/193895.html

I see. I know the Emby people are catching heat for not providing their patches and source to comply with the LGPL license of ffmpeg, so you might consider linking to your patch repository.

I’d be happy to compile my own ffmpeg with OMX support to test, but I’m not sure of the best approach to substitute it and modify the calling options Channels uses. Any tips?

You would have to replace it with a shell script that mangled options and passed them to another binary.
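A rough sketch of what such a wrapper could look like, with loud assumptions: the path in `REAL_FFMPEG` is hypothetical, and the only rewrite shown is swapping the encoder name `libx264` for `h264_omx`. The real option list Channels passes is not shown in this thread, so this is a starting point rather than a working drop-in.

```shell
#!/bin/sh
# Sketch of a wrapper installed in place of the bundled ffmpeg binary.
# REAL_FFMPEG is a hypothetical path to a self-compiled OMX-enabled build.
REAL_FFMPEG="${REAL_FFMPEG:-/usr/local/bin/ffmpeg-omx}"

# Rewrite the argument list one item at a time, then call the real binary.
run_ffmpeg() {
  n=$#
  while [ "$n" -gt 0 ]; do
    arg=$1
    shift
    case "$arg" in
      libx264) arg=h264_omx ;;   # assumption: the caller selects libx264 by name
    esac
    set -- "$@" "$arg"           # re-append the (possibly rewritten) argument
    n=$((n - 1))
  done
  "$REAL_FFMPEG" "$@"
}

# Forward only when invoked with arguments, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
  run_ffmpeg "$@"
fi
```

Rebuilding the positional parameters with `set --` keeps arguments containing spaces intact, which matters for recording filenames. Any libx264-specific options in the original command line would probably need to be dropped or translated in the same `case` statement.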

Can you post the output of uname -a from the rpi4?

That's doable. Any way to test before going to the trouble? Here's uname output:

Linux datpi 4.19.57-v7l+ #1244 SMP Thu Jul 4 18:48:07 BST 2019 armv7l GNU/Linux

Edit: I was able to compile an ffmpeg from head this morning with omx support. I used:

sudo ./configure --arch=armel --target-os=linux --enable-gpl --enable-omx --enable-omx-rpi --enable-nonfree

I would start by using any of the guides to compile an ffmpeg and test it directly to see if it works.

You want --arch=armhf

According to https://emby.media/community/index.php?/topic/51504-rpi-specific-request you'll also want --enable-decoder=mpeg2_mmal --enable-mmal and presumably --enable-decoder=h264_mmal as well.

There are some test commands in that thread that you can copy to see what performance is like with the various options.

Interestingly, the rpi4 doesn't have an mpeg2 hardware decoder: http://www.raspberrypi.com/mpeg-2-license-key/

Interesting. You have to purchase a license for mpeg2 decode, yes? I guess Channels has that built in when it uses Intel HW transcode? Which is more CPU-intensive: decoding mpeg2 or encoding H.264? (I'd guess the latter.)

The mpeg2 license was only sold for rpi3 and is not available for rpi4. Supposedly the CPU is fast enough to not need it anymore:

The promised 4K video playback is limited to H.265 content, too, while hardware decode for MPEG2, MPEG4, and H.263 has been dropped on the understanding the CPU is powerful enough to decode these formats in software without too much strain.

Encoding is definitely slower, and that's where omx will help. On the rpi3 it seems Emby had to use mmal as well, to engage the hardware decoder, to get decent performance. I never tried that, which is probably why I deemed the device unsuitable for transcoding.

I have an rpi4 on the way and will run some more tests on my rpi3 today.

I'm a little surprised to see armv7l on an armv8 64-bit device.

Is this raspbian lite or something else?

The proprietary bits are only available as 32-bit blobs, so there is no full support for the chipset on aarch64. This includes the wifi/bt stack, as well as the graphics bits for hardware codecs.

There are aarch64 distros for the RPi3/4, including graphics with the V3D drivers. But they lack complete hardware support.

OK, I did some tests. The source file was the short clip of NOVA that I reported on earlier. Recall that Channels' ffmpeg could only transcode it at 0.9–0.95x, right on the edge of stuttering, and that it maxed out the CPUs to do so, up to 380%.

With just the default software codecs, my new ffmpeg does better than this:

$ ffmpeg -i NOVA\ S46E01\ 2019-01-02\ Pluto\ and\ Beyond\ 2019-08-14-2017.mpg nova_out.mp4
ffmpeg version N-94563-g3aeb681f07 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Raspbian 8.3.0-6+rpi1)
  configuration: --arch=armhf --target-os=linux --enable-gpl --enable-omx --enable-omx-rpi --enable-nonfree --enable-decoder=mpeg2_mmal --enable-mmal --enable-decoder=h264_mmal
  libavutil      56. 33.100 / 56. 33.100
  libavcodec     58. 55.100 / 58. 55.100
  libavformat    58. 30.100 / 58. 30.100
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.100 /  7. 58.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
[mpeg2video @ 0x2f96a90] Invalid frame dimensions 0x0.
    Last message repeated 28 times
[mpegts @ 0x2f92320] PES packet size mismatch
Input #0, mpegts, from 'NOVA S46E01 2019-01-02 Pluto and Beyond 2019-08-14-2017.mpg':
  Duration: 00:11:40.33, start: 31840.284300, bitrate: 11978 kb/s
  Program 7 
    Stream #0:0[0x71]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, top first), 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
    Stream #0:1[0x74](eng): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, stereo, fltp, 192 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (native) -> mpeg4 (native))
  Stream #0:1 -> #0:1 (ac3 (native) -> aac (native))
Press [q] to stop, [?] for help
Output #0, mp4, to 'nova_out.mp4':
  Metadata:
    encoder         : Lavf58.30.100
    Stream #0:0: Video: mpeg4 (mp4v / 0x7634706D), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 30k tbn, 29.97 tbc
    Metadata:
      encoder         : Lavc58.55.100 mpeg4
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: 18446744073709551615
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.55.100 aac
frame= 4701 fps= 38 q=31.0 size=   28928kB time=00:02:36.82 bitrate=1511.1kbits/s dup=38 drop=0 speed=1.26x    

Strangely, CPU load for this software-only encode was only 250% or so, far less than the 380% that Channels' ffmpeg pegs at. Is your version doing something extra that could account for this?

Then I tried with mmal and got the expected failure:

[mpegts @ 0x2ea2350] Failed to open codec in avformat_find_stream_info
...
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (mpeg2_mmal) -> mpeg4 (native))
  Stream #0:1 -> #0:1 (ac3 (native) -> aac (native))
Error while opening decoder for input stream #0:0 : Unknown error occurred

Then I tried with the omx hardware encoder:

$ ffmpeg -i NOVA\ S46E01\ 2019-01-02\ Pluto\ and\ Beyond\ 2019-08-14-2017.mpg -codec:v h264_omx nova_out.mp4
ffmpeg version N-94563-g3aeb681f07 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Raspbian 8.3.0-6+rpi1)
  configuration: --arch=armhf --target-os=linux --enable-gpl --enable-omx --enable-omx-rpi --enable-nonfree --enable-decoder=mpeg2_mmal --enable-mmal --enable-decoder=h264_mmal
  libavutil      56. 33.100 / 56. 33.100
  libavcodec     58. 55.100 / 58. 55.100
  libavformat    58. 30.100 / 58. 30.100
  libavdevice    58.  9.100 / 58.  9.100
  libavfilter     7. 58.100 /  7. 58.100
  libswscale      5.  6.100 /  5.  6.100
  libswresample   3.  6.100 /  3.  6.100
  libpostproc    55.  6.100 / 55.  6.100
[mpeg2video @ 0x2cd5aa0] Invalid frame dimensions 0x0.
    Last message repeated 28 times
[mpegts @ 0x2cd1330] PES packet size mismatch
Input #0, mpegts, from 'NOVA S46E01 2019-01-02 Pluto and Beyond 2019-08-14-2017.mpg':
  Duration: 00:11:40.33, start: 31840.284300, bitrate: 11978 kb/s
  Program 7 
    Stream #0:0[0x71]: Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, top first), 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
    Stream #0:1[0x74](eng): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, stereo, fltp, 192 kb/s
File 'nova_out.mp4' already exists. Overwrite ? [y/N] y
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (native) -> h264 (h264_omx))
  Stream #0:1 -> #0:1 (ac3 (native) -> aac (native))
Press [q] to stop, [?] for help
[h264_omx @ 0x2d22da0] Using OMX.broadcom.video_encode
Output #0, mp4, to 'nova_out.mp4':
  Metadata:
    encoder         : Lavf58.30.100
    Stream #0:0: Video: h264 (h264_omx) (avc1 / 0x31637661), yuv420p, 1920x1080 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 29.97 fps, 30k tbn, 29.97 tbc
    Metadata:
      encoder         : Lavc58.55.100 h264_omx
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s
    Metadata:
      encoder         : Lavc58.55.100 aac
frame=  628 fps= 38 q=-0.0 size=     768kB time=00:00:20.88 bitrate= 301.2kbits/s dup=38 drop=0 speed=1.28x    

So... pretty much the same speed. Meanwhile, however, the CPU load dropped from 250% to 160%, so OMX is doing something anyway. Both of these seem to outperform the Channels-provided ffmpeg, and the hardware version leaves quite a bit more CPU headroom for other tasks (like recording other streams), so it would probably be more stable.
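For comparing runs like these, a small helper can pull the `speed=` figure out of an ffmpeg progress line and check whether it is at least real time. This is only a sketch; `extract_speed` and `is_realtime` are made-up names, and it assumes the progress line format shown in the logs above.

```shell
#!/bin/sh
# Read ffmpeg progress lines on stdin and print the speed value without the
# trailing "x". Live ffmpeg output separates progress updates with carriage
# returns, so pipe it through `tr '\r' '\n'` first.
extract_speed() {
  sed -n 's/.*speed= *\([0-9.][0-9.]*\)x.*/\1/p'
}

# Succeed when the given speed is at least 1.0, i.e. the transcode keeps up
# with playback.
is_realtime() {
  awk -v s="$1" 'BEGIN { exit !(s >= 1.0) }'
}
```

For example, `is_realtime 1.26` succeeds while `is_realtime 0.95` fails, which matches the difference between the runs above and the 0.9–0.95x reported for the bundled ffmpeg.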

Haven't yet substituted it for Channels' ffmpeg; I'd probably need to do some serious option editing given the switch from libx264, yes?

I guess some people have already had success forcing Raspbian to run with a 64-bit kernel, but yeah, it does seem weird.

Arch has aarch64 rootfs images for the Pi3/4, but they are lacking support for WiFi and other hardware features. It's been a while since I've checked, but it might be interesting to compare their 32-bit images with the 64-bit ones on the same hardware.

Yes, we have to deinterlace as well. Your NOVA recording is 1080i and needs to be converted to a progressive format before it can be encoded to h264.

Looks like scaling also takes a big hit on the CPU. Converting 1080 down to 720 before encoding reduces throughput by a lot.
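To measure the cost of those two steps separately, one option is to build the filter chain from a helper and time runs with and without scaling. This is a sketch under assumptions: `yadif` and `scale` are standard ffmpeg filters, but whether Channels uses these particular filters is not stated in this thread, and `build_vf` is a made-up name.

```shell
#!/bin/sh
# Print a -vf filter chain: always deinterlace with yadif, and scale to the
# given height only when one is supplied (width -2 keeps the aspect ratio
# with an even pixel count).
# Assumption: yadif stands in for whatever deinterlacer Channels really uses.
build_vf() {
  height=${1:-}
  if [ -n "$height" ]; then
    echo "yadif,scale=-2:$height"
  else
    echo "yadif"
  fi
}
```

A test run might then look like `ffmpeg -i input.mpg -vf "$(build_vf 720)" -c:v h264_omx -b:v 4M output.mp4`, compared against the same command with `"$(build_vf)"`. The `-b:v 4M` value is an arbitrary placeholder; the logs earlier in the thread show both encodes running at the default 200 kb/s target, which is probably not what you would want for real viewing.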

I just got access to a Pi4 and am running some tests.
