I've taken peeks at threads related to hardware decoding (of H.264 and HEVC, mainly) on Allwinner and Rockchip platforms on and off, sometimes dabbled in trying and failing to implement solutions recommended there. Being a complete amateur, I find the topic very opaque and confusing, with various different components that need to interface with each other, be patched in sync, and sometimes change drastically between kernel versions, etc. Today I sat down and read up on these subjects, scouring wikis, documentation, this forum, and assorted other sources to try and understand how this works. In this post I will attempt to compile what I've learned on the different software components involved, their relationships, their current status, and solutions to the problem. I hope people more knowledgeable will correct me when I get something wrong or cite outdated information. Stuff which I am highly uncertain of I will print in italics.
(This post is going to focus on mainline implementations of Cedrus/Allwinner, I haven't looked into Hantro/Rkvdec/Rockchip specifics yet. I will speak only of H.264 and H.265/HEVC; I don't understand the high/low stuff and didn't pay attention to other codecs.)
Basics: Video codecs like H.264, H.265/HEVC, MPEG-2, etc. are standardised methods which serve to more efficiently encode and decode videos, reducing their filesize. Software en-/decoding is very CPU-intensive. Modern GPUs and ARM SoCs therefore contain specialised hardware (VPUs) to delegate these tasks to. Working hardware decoding is particularly important for underpowered ARM CPUs.
Drivers: Topmost in the stack are the VPU drivers. These are Sunxi-Cedrus/Cedrus V4L2 M2M and Cedar [Is this the legacy one?] on Allwinner; Hantro and Rkvdec on Rockchip. These are all still in development, but Cedrus already fully supports H.264, and partially supports HEVC, and is already usable in the mainline kernel.
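A quick way to see whether a VPU driver is actually loaded is to look at the V4L2 video nodes the kernel has registered. The following is only a minimal sketch: it assumes the driver's sysfs name contains the string "cedrus" (an assumption on my part, adjust it for Hantro or Rkvdec), and the function name is made up for illustration.

```python
from pathlib import Path


def find_v4l2_devices(driver_name="cedrus", sysfs_root="/sys/class/video4linux"):
    """Return /dev paths of V4L2 video nodes whose sysfs 'name' mentions driver_name.

    Assumption: the driver reports a name containing `driver_name`
    (e.g. "cedrus"); this is not taken from the post above.
    """
    matches = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return matches  # no V4L2 nodes registered (or not running on Linux)
    for node in sorted(root.iterdir()):
        try:
            name = (node / "name").read_text().strip()
        except OSError:
            continue  # node without a readable name file; skip it
        if driver_name.lower() in name.lower():
            matches.append(f"/dev/{node.name}")
    return matches


if __name__ == "__main__":
    print(find_v4l2_devices() or "no matching video node found")
```

An empty result does not prove that decoding cannot work, only that no node with that name was found.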
APIs: In order for anything (userspace APIs, libraries) to make use of the VPU drivers, you need backends/APIs. For Cedrus, there is the unmaintained libva-v4l2-request backend which implements VA-API, the legacy VDPAU implementation libvdpau-sunxi, and as of kernel version 5.11, H.264 has been merged into the uAPI headers. Different applications may make recourse to one or another of these APIs.
Libraries: FFmpeg and GStreamer provide libraries and APIs of their own to other applications but can (importantly!) also output directly to the framebuffer. FFmpeg must be patched to access either libva-v4l2-request or the Cedrus driver headers. GStreamer directly accesses kernel headers since 1.18 (works on 5.9, not on 5.10; 1.20 will support 5.11.)
Media players: mpv depends on FFmpeg for hardware acceleration (and must be patched together with it). VLC can be set to access libva-v4l2-request directly. Kodi 19.0+ supports hardware acceleration out of the box without any out-of-tree patches.
Display server: An additional complication is drawing the output of any of the above on screen. Most successful implementations I've seen bypass X11 and either output directly to the framebuffer or force a plane/display layer on top of any X windows. Wayland apparently makes this easier by allowing applications to use their own DRM planes, but this hasn't been explored much yet. Kodi 19.0 works with all three windowing systems (X, Wayland, and gbm).
6 hours ago, jernej said:
H264, MPEG2 and VP8 should be good in mainline, although api can still change until codec is promoted to uAPI. HEVC still needs out-of-tree patches for any serious work.
Taken from the LibreElec thread (which reflects LibreElec's status and is ahead of what works elsewhere, but outlines hardware limitations):
only MPEG2, H264 (AVC), H265 (HEVC) and VP8 codecs are supported in hardware, for now. Others are software decoded.
- R40 doesn't support H265 (HEVC) - hardware limitation
- 10-bit H265 videos are supported only on H6 (H3, H5 and A64 don't support 10-bit - hardware limitation)
- 10-bit H264 is not supported (hardware limitation)
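The limitations quoted above can be written down as a small lookup, which is handy when deciding whether a given file has any chance of being hardware decoded. This is a sketch under assumptions: it only knows the SoCs named in the quote, the function and constant names are invented for illustration, and 10-bit MPEG2/VP8 (not mentioned in the quote) is conservatively treated as unsupported.

```python
# Codecs the quoted list names as hardware decodable.
HW_CODECS = {"mpeg2", "h264", "hevc", "vp8"}
# Only the SoCs that appear in the quoted list; anything else is unknown here.
KNOWN_SOCS = {"R40", "H3", "H5", "A64", "H6"}


def hw_decodable(soc, codec, bit_depth=8):
    """Return True if the quoted list says this combination is hardware decodable."""
    soc = soc.upper()
    codec = codec.lower()
    if soc not in KNOWN_SOCS:
        raise ValueError(f"SoC {soc!r} is not covered by the quoted list")
    if codec not in HW_CODECS:
        return False  # everything else is software decoded
    if codec == "hevc" and soc == "R40":
        return False  # R40 has no HEVC support at all
    if bit_depth == 8:
        return True
    if bit_depth == 10:
        # 10-bit HEVC only on H6; 10-bit H264 nowhere.
        # Assumption: other 10-bit cases are treated as unsupported.
        return codec == "hevc" and soc == "H6"
    return False  # other bit depths are not mentioned; assume unsupported


if __name__ == "__main__":
    for case in (("H6", "hevc", 10), ("A64", "hevc", 10), ("R40", "hevc", 8)):
        print(case, hw_decodable(*case))
```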
Many people have managed to make it work on their machines using different approaches. Note that some of these solutions are one or two years old, and kernel developments since may have changed the situation. Ordered from newer to older:
LibreElec – kernel + ffmpeg + Kodi: LibreElec is a Just-Enough-OS with the sole purpose of running Kodi, a media player. It's at the bleeding edge and usually implements codecs and features well before mainline or other distros. It achieves this by heavily patching everything up and down the stack, from the Linux kernel over FFmpeg to Kodi itself. These patches could all be applied to an Armbian build, but there are a lot of them, they're poorly documented, and you'd need to dig into their github to understand what they all do. LibreElec runs Kodi directly without a desktop. kodi-gbm is a package that can be installed on Armbian and functions similarly.
Key contributors to the project are @jernej and @Kwiboo, who sometimes post about their work here (and have been very helpful with questions, thank you). @balbes150 includes some of LibreElec's patches in his Armbian-TV builds, but I don't know which.
6 hours ago, jernej said:
Further clarification: Kodi 19.0 (released recently) is highly recommended for all this - it doesn't require any out of tree patch for video decoding (LE uses patch for HW deinterlacing). Additionally, with version 19.0, there is single binary for all 3 windowing systems (gbm, X11, wayland). Depends on build options. Not sure if this version is available on Armbian but PPA exists, so I guess it should not be hard to test.
LibreElec patches + mpv:
On 8/22/2019 at 8:14 PM, jernej said:
@Alerino Reis If you're using ffmpeg patches from LibreELEC, then you need only this additional patch to make it compatible with mpv. I tested yesterday and it works for me when running without any window manager running with either of these commands:
mpv --vo=gpu --gpu-context=drm --hwdec=auto video.mkv
mpv --vo=drm --hwdec=auto video.mkv
You can append "-v" parameter to check if mpv really uses HW decoding.
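The two mpv invocations quoted above, plus the "-v" check, can be wrapped in a few lines. This is a rough sketch only: the helper names are made up, and the log marker "Using hardware decoding" is my assumption about what mpv prints in verbose mode, so compare it against your own output before relying on it.

```python
import subprocess


def mpv_command(video, vo="gpu"):
    """Build one of the two mpv command lines quoted above, with -v added."""
    if vo == "gpu":
        args = ["mpv", "--vo=gpu", "--gpu-context=drm"]
    elif vo == "drm":
        args = ["mpv", "--vo=drm"]
    else:
        raise ValueError("vo must be 'gpu' or 'drm'")
    return args + ["--hwdec=auto", "-v", video]


def uses_hwdec(log_text, marker="Using hardware decoding"):
    """Return True if any log line contains the marker.

    Assumption: `marker` matches mpv's verbose output; check your own log.
    """
    return any(marker in line for line in log_text.splitlines())


def run_and_check(video, vo="gpu"):
    """Play the file with mpv (must be installed) and scan its output."""
    result = subprocess.run(mpv_command(video, vo), capture_output=True, text=True)
    return uses_hwdec(result.stdout + result.stderr)


if __name__ == "__main__":
    print(" ".join(mpv_command("video.mkv", vo="drm")))
```

run_and_check() plays the whole file before returning, so a short test clip is the better choice.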
@megous – Kernel 5.11 + GStreamer: This implementation, done here on a PinePhone (A64), patches the 5.9 kernel and uses a recent version of GStreamer (1.18 and up), whose output is rendered directly to a DRM plane via kmssink. (No X or Wayland.)
GStreamer 1.18 works with the 5.9 kernel. It does not work with 5.10, because of numerous changes to the kernel headers in this version. In 5.11 the H.264 headers were moved into the uAPI; the master branch of GStreamer reflects this, but there haven't been any releases with these patches yet. It'll probably be in repositories with GStreamer 1.20; until then you can build it from source.
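Since the kernel/GStreamer pairing is easy to get wrong, here is the paragraph above as a tiny lookup. It is a sketch that encodes only the three kernel versions mentioned in this post; the function name is invented, and any other kernel version raises an error rather than guessing.

```python
import re

# Kernel (major, minor) -> GStreamer version that matches it, per the post above.
# None means no known working GStreamer version.
COMPAT = {
    (5, 9): "1.18",
    (5, 10): None,
    (5, 11): "master (expected in 1.20)",
}


def gstreamer_for_kernel(kernel):
    """Map a kernel version string such as '5.10.21-sunxi' to a GStreamer version."""
    match = re.match(r"(\d+)\.(\d+)", kernel)
    if not match:
        raise ValueError(f"cannot parse kernel version {kernel!r}")
    key = (int(match.group(1)), int(match.group(2)))
    if key not in COMPAT:
        raise LookupError(f"kernel {key[0]}.{key[1]} is not covered by this post")
    return COMPAT[key]


if __name__ == "__main__":
    import platform
    try:
        print(gstreamer_for_kernel(platform.release()))
    except (ValueError, LookupError) as err:
        print(err)
```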
@Sash0k – patched libva-v4l2-request + VLC: This updates bootlin's libva-v4l2-request and follows the Sunxi wiki's instructions for enabling VLC to make use of it. It works on the desktop. This only works for H.264 and breaks HEVC. When I tried to replicate this approach on a recent Armbian build, I discovered that the h264.c files in the kernel (that libva-v4l2-request draws on) have changed considerably between 5.8 and 5.10, and I lack the understanding to reconcile libva-v4l2-request with them.
On 7/20/2020 at 4:38 PM, Sash0k said:
Finally got it! No kernel modifications needed, only v4l2-request.
- Use bootlin code, latest master (not release-2019.03 tag)
- I merged just one small patch from https://github.com/bootlin/libva-v4l2-request/pull/30/files (it seems to be unnecessary)
- Download kernel sources with corresponding version. For my Armbian it is: https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.4.45.tar.xz
- Extract 2 files, mpeg2-ctrls.h and h264-ctrls.h, from kernel/include/media and replace the ones in v4l2-request with them
- Replace V4L2_PIX_FMT_H264_SLICE_RAW with V4L2_PIX_FMT_H264_SLICE in the v4l2-request source code
- Compile and install (instruction is as 2 posts above)
- Don't forget to set VLC as in https://linux-sunxi.org/Sunxi-Cedrus
Tools > Preferences > Input / Codecs > Codecs > Hardware-accelerated decoding > VA-API video decoder
Tools > Preferences > Video > Display > Output > X11 video output (XCB)
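The header swap and the identifier rename from the steps above are easy to script. The following is a minimal sketch, not Sash0k's actual procedure: the function names are made up, the two directory arguments are whatever paths you extracted the kernel sources and cloned libva-v4l2-request to, and the headers are located by searching the backend tree rather than by assuming where they live.

```python
import shutil
from pathlib import Path

OLD = "V4L2_PIX_FMT_H264_SLICE_RAW"
NEW = "V4L2_PIX_FMT_H264_SLICE"
HEADERS = ("mpeg2-ctrls.h", "h264-ctrls.h")


def copy_ctrl_headers(kernel_tree, backend_tree):
    """Overwrite the backend's copies of the two headers with the kernel's."""
    src_dir = Path(kernel_tree) / "include" / "media"
    copied = []
    for name in HEADERS:
        # Collect targets first, then overwrite each one found in the backend.
        for target in list(Path(backend_tree).rglob(name)):
            shutil.copyfile(src_dir / name, target)
            copied.append(target)
    return copied


def rename_pixfmt(backend_tree):
    """Replace OLD with NEW in all .c/.h files; return the files changed."""
    changed = []
    for path in sorted(Path(backend_tree).rglob("*")):
        if not path.is_file() or path.suffix not in {".c", ".h"}:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # leave files we cannot decode untouched
        if OLD in text:
            path.write_text(text.replace(OLD, NEW), encoding="utf-8")
            changed.append(path)
    return changed
```

Run copy_ctrl_headers() first and rename_pixfmt() second, then compile and install as described in the steps. Work on a clean checkout so the changes are easy to review with git diff.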
Tested with VLC, usable with issues:
- Artifacts in some videos h264 720p and higher, for example: https://imgur.com/nYFArT4 (360/480 works fine)
- Scaling (fullscreen, resizing) does not work; playback slows down with the message: [a310cb88] main filter error: Failed to create video converter
- Minor issues in console output on playback (see bold)
$ vlc 3-big_buck_bunny_480p_H264_AAC_25fps_1800K.MP4
VLC media player 188.8.131.52 Vetinari (revision 184.108.40.206-0-gd4c1aefe4d)
[02287b98] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libEGL warning: DRI2: failed to authenticate
libva info: VA-API version 1.7.0
libva info: User environment variable requested driver 'v4l2_request'
libva info: Trying to open /usr/lib/arm-linux-gnueabihf/dri/v4l2_request_drv_video.so
libva info: Found init function __vaDriverInit_1_7
libva info: va_openDriver() returns 0
[a272d350] avcodec decoder: Using v4l2-request for hardware decoding
[a3018b98] blend blend error: no matching alpha blending routine (chroma: YUVA -> VAOP)
[a3018b98] main blend error: blending YUVA to VAOP failed
Thanks to: @jernej for this post:
@ubobrov – old kernel + libcedrus + libvdpau-sunxi + ffmpeg + mpv: This approach, which supports decoding of H.264, uses the libvdpau-sunxi API and, if I understand it correctly, ports the legacy driver to mainline as a loadable kernel module. In the post quoted below the kernel is 4.20, but the same method has been successfully applied to 5.7.8 by another user. It requires that the board's device tree be patched, as documented in ubobrov's github repository.
On 4/23/2020 at 4:41 PM, ubobrov said:
Decoding H264 and X11 rendering using vdpau_sunxi, libcedrus, kernel 4.20.17, mpv, vncserver and Armbian Bionic on Orange PI Zero
libvdpau: https://github.com/uboborov/libvdpau-sunxi-H3.git
mpv, ffmpeg, x11 installed using apt
It works extremely slow but it's just a beginning )
video on Orange PI One 1280x720 HDMI (works pretty fine)
The summary seems to be that none of the current implementations on Allwinner boards really play nice with X or desktop sessions, and it's best to output directly to the framebuffer. Kwiboo has forked FFmpeg and mpv to make good use of new and unstable kernel features/hardware acceleration which will take a while to make their way upstream. The recent 5.11 move of stateless H.264 out of staging and into the uAPI should facilitate further developments.
I intend to try some of these things in the nearer future. Thanks to everyone who works on mainlining all of this VPU stuff, and to users here who contribute solutions and readily & patiently answer questions (Jernej especially). I hope I didn't post any falsehoods out of ignorance, and welcome any corrections.
Other related threads here:
Edited by P.P.A.
Edits, corrections and additions to reflect Jernej's and ubobrov's input below.