21 Nov 2012

On supporting Wayland GL clients and proprietary embedded platforms

How would one start implementing support for graphics hardware accelerated Wayland clients on an embedded platform that only has proprietary drivers?

This is a question I have answered more than once recently. Presumably you already have some way to implement a Wayland compositor: some APIs you can use to composite and get images on screen. You may have wl_shm based clients already working, and now you want hardware rendered clients to work. Where to start?

First I will explain the architecture a bit. There are basically three things related to Wayland graphics:
  • the client side support for graphics hardware acceleration
  • the compositor's support for hardware accelerated clients
  • the compositor's own rendering or compositing, and output to screen

Usually the graphics hardware accelerated applications (clients) use EGL for the window system glue, and GL ES 2 for rendering. The client side is the EGL Wayland platform plus the wayland-egl API. The EGL Wayland platform means that you can pass Wayland types as EGL native types, for instance a struct wl_display * as the EGLNativeDisplayType parameter of the eglGetDisplay() function.
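
To make this concrete, here is a minimal sketch of the client side. It assumes a connected struct wl_display and a created struct wl_surface, picks a GL ES 2 capable config, and omits all error checking.

    #include <wayland-client.h>
    #include <wayland-egl.h>
    #include <EGL/egl.h>

    /* A sketch: bring up EGL on the Wayland platform for an existing
     * wl_surface. Error checking omitted for brevity. */
    static EGLSurface
    create_egl_surface(struct wl_display *display, struct wl_surface *surface,
                       int width, int height, EGLDisplay *egl_display_out)
    {
        static const EGLint config_attribs[] = {
            EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
            EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
            EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8,
            EGL_NONE
        };
        EGLDisplay egl_display;
        EGLConfig config;
        EGLint num_configs;
        struct wl_egl_window *native;

        /* The EGL Wayland platform: a wl_display is the native display. */
        egl_display = eglGetDisplay((EGLNativeDisplayType)display);
        eglInitialize(egl_display, NULL, NULL);
        eglChooseConfig(egl_display, config_attribs, &config, 1, &num_configs);

        /* wayland-egl wraps a wl_surface into a native window for EGL. */
        native = wl_egl_window_create(surface, width, height);

        *egl_display_out = egl_display;
        return eglCreateWindowSurface(egl_display, config,
                                      (EGLNativeWindowType)native, NULL);
    }

From there, the client creates a GL ES 2 context as usual, and eglSwapBuffers() takes care of posting the rendered buffer to the compositor.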

The compositor's support for hardware accelerated clients is the server side of Wayland-enabled libEGL. Normally it consists of the EGL_WL_bind_wayland_display extension. For a compositor programmer, this extension allows you to create an EGLImageKHR object from a struct wl_buffer *, and then bind that as a GL texture, so you can composite it.
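
In code, the compositor side might look roughly like the sketch below. The function pointer types are spelled out rather than taken from headers, to avoid depending on particular header versions; EGL_WAYLAND_BUFFER_WL comes with the extension's header definitions, a current GL ES 2 context is assumed, and error checking is again omitted.

    #include <wayland-server.h>
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    static EGLBoolean (*bind_display)(EGLDisplay, struct wl_display *);
    static EGLImageKHR (*create_image)(EGLDisplay, EGLContext, EGLenum,
                                       EGLClientBuffer, const EGLint *);
    static void (*image_target_texture)(GLenum, GLeglImageOES);

    static void
    setup_wayland_display(EGLDisplay egl_display, struct wl_display *wl_display)
    {
        bind_display = (void *)eglGetProcAddress("eglBindWaylandDisplayWL");
        create_image = (void *)eglGetProcAddress("eglCreateImageKHR");
        image_target_texture =
            (void *)eglGetProcAddress("glEGLImageTargetTexture2DOES");

        /* Hand the wl_display to libEGL, so its server side can talk
         * to clients. */
        bind_display(egl_display, wl_display);
    }

    static GLuint
    texture_from_wl_buffer(EGLDisplay egl_display, struct wl_buffer *buffer)
    {
        EGLImageKHR image;
        GLuint tex;

        /* Turn the client's wl_buffer into an EGLImageKHR... */
        image = create_image(egl_display, EGL_NO_CONTEXT,
                             EGL_WAYLAND_BUFFER_WL,
                             (EGLClientBuffer)buffer, NULL);

        /* ...and bind it as a GL ES 2 texture for compositing.
         * Assumes a current GL ES 2 context. */
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        image_target_texture(GL_TEXTURE_2D, image);

        return tex;
    }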

The compositor's own rendering mechanisms are largely irrelevant to the client support. The only requirement is that the compositor can effectively use the kinds of buffers the clients send to it. If you turn a wl_buffer object into a GL texture via EGLImageKHR, you had better composite with a GL API, naturally. Apart from that, it does not matter which APIs the compositor uses for compositing and displaying.

Now, what do we actually need for supporting hardware accelerated clients?

First, forget about using wl_shm buffers; they are not suitable for hardware accelerated clients. Buffers that GPUs render into are often badly suited for CPU usage, or not directly accessible by the CPU at all. Due to GPU requirements, you likely cannot make a GPU render into an shm buffer, either. Therefore, to get the pixel data into an shm buffer you would need to do a copy, like glReadPixels(). Then you send the shm buffer to the server, and the server needs to copy the pixels again to make them accessible to the GPU for compositing, e.g. by calling glTexImage2D(). That is two copies between the CPU and GPU domains, and that is slow; I would say unusably slow. It is far better to not move the pixels into the CPU domain at all, and avoid all copying.
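
For comparison, the two functions below sketch the client and compositor halves of that shm round trip, just to show where the two copies happen; shm_data stands for the mmapped wl_shm buffer storage shared between the processes.

    #include <GLES2/gl2.h>

    /* Copy 1, client side: GPU domain -> CPU domain, stalling until
     * rendering finishes. */
    static void
    client_read_back(void *shm_data, int width, int height)
    {
        glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, shm_data);
        /* ...then attach the wl_shm buffer and commit the surface... */
    }

    /* Copy 2, compositor side: CPU domain -> GPU domain, before
     * compositing can even begin. */
    static void
    compositor_upload(GLuint tex, const void *shm_data, int width, int height)
    {
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                     GL_RGBA, GL_UNSIGNED_BYTE, shm_data);
    }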

Therefore, the most important thing is graphics buffer sharing, or passing. Buffer sharing works by creating a handle for a buffer, and passing that handle to another process, which then uses the handle to make the GPU access the same buffer again. On your graphics platform, find out (a hypothetical sketch of such an API follows the list):
  • Do such handles exist at all?
  • How do you create a buffer and a handle?
  • How do you direct GL ES rendering into that buffer?
  • What is the handle? Does it contain integers, open file descriptors, or opaque pointers? Integers and file descriptors are not a problem, but you cannot pass (host virtual) pointers from one process to another.
  • How do you create something usable, like an EGLImageKHR or a GL texture, from the handle?
It would be good to test that the buffer passing actually works, too.
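
To give those questions some shape, here is a hypothetical sketch. Every vendor_* name below is invented, standing in for whatever your proprietary platform actually provides, and the handle is assumed to be a file descriptor.

    #include <EGL/egl.h>
    #include <EGL/eglext.h>

    /* Hypothetical platform API; all vendor_* names are made up. */
    typedef struct vendor_buffer vendor_buffer_t;
    vendor_buffer_t *vendor_buffer_create(int w, int h, int format);
    int vendor_buffer_export_fd(vendor_buffer_t *buf);
    vendor_buffer_t *vendor_buffer_import_fd(int fd);
    EGLImageKHR vendor_buffer_to_egl_image(EGLDisplay dpy, vendor_buffer_t *buf);

    /* Allocating process: create a buffer, direct GL ES rendering into it
     * (e.g. via an EGLImage-backed FBO), and export a shareable handle.
     * File descriptors and integers can cross a process boundary;
     * host virtual pointers cannot. */
    static int
    export_buffer(int width, int height, int format)
    {
        vendor_buffer_t *buf = vendor_buffer_create(width, height, format);
        return vendor_buffer_export_fd(buf);
    }

    /* Receiving process: import the handle and wrap the same buffer as
     * an EGLImageKHR, ready to be bound as a GL texture. */
    static EGLImageKHR
    import_buffer(EGLDisplay dpy, int fd)
    {
        return vendor_buffer_to_egl_image(dpy, vendor_buffer_import_fd(fd));
    }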

Once you know what the handle is, and whether clients can allocate their own buffers (preferred) or whether the compositor must hand out buffers to clients for some obscure security reason, you can think about how to use the Wayland protocol to pass buffers around. You must invent a new Wayland protocol extension. The extension should allow a client to create a wl_buffer object from the handle. All the standard Wayland interfaces deal with wl_buffer objects, and the server will detect the type of each wl_buffer object when a client uses it. Examples of such protocol extensions are wl_drm of Mesa and my experimental android_wlegl.
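
For illustration, a hypothetical extension of that kind could be described like this. It is loosely modeled on wl_drm and android_wlegl, all the names are invented, and it assumes the handle travels as a file descriptor plus some integer metadata.

    <protocol name="example_buffer_sharing">
      <interface name="example_buffer_factory" version="1">
        <!-- Create a wl_buffer from a platform buffer handle. -->
        <request name="create_buffer">
          <arg name="id" type="new_id" interface="wl_buffer"/>
          <arg name="handle" type="fd"/>
          <arg name="width" type="int"/>
          <arg name="height" type="int"/>
          <arg name="stride" type="int"/>
          <arg name="format" type="uint"/>
        </request>
      </interface>
    </protocol>

The scanner-generated glue would then give clients a create_buffer call on the factory object, and the server implements the matching request handler that imports the handle.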

I recommend you do the first implementation of the protocol extension completely ad hoc. Hack the server to work with your buffer types, and write a custom client that directly uses the protocol without any fancy API like wayland-egl. Once you confirm it works, you can design the real implementation, whether it should be in a wrapper library around the proprietary libEGL or something else.

EGL is the standard interface to implement accelerated Wayland client support and it conveniently hides the details from both servers and clients, but it is not the only way. If you control both server and client code, you can use any other API, or create your own. That is a big if, though.

The key point is buffer sharing; copying will kill your system performance. After you know how to share graphics buffers, your work has only begun. Good luck!

12 comments:

Cosmo Kramer said...

I wonder how you manage to be so productive: you write code, patches, and technical blog posts. Wow.

Michael Hasselmann said...

Well, Pekka has an unfair advantage: He lives in Finland ;-)

Michael Hasselmann said...

See https://plus.google.com/108138837678270193032/posts/EyU7rGuY77r for context.

pq said...

I'd say my unfair advantage is being contracted by Collabora: I get paid to do what I mostly love to do. :-)

Unknown said...

This is very useful in understanding buffer sharing in Wayland. Thanks, Pekka.

Lerc said...

The frustrating bit for me is having many, many ultra-cheap devices readily available with accelerated EGL for Android but not for Linux.

Is it feasible to take an Android device and strip back the config so that it doesn't launch any Android specific stuff but keeps enough library support to let Wayland drive the screen?

If that works you can presumably build the system up in a LinuxFromScratch approach.

pq said...

Lerc, yes, and there are two ways to go about it. One is the thing I did with Wayland on Android: get Wayland running, but keep the rest of the Android OS. That has major drawbacks like... having the Android low-level OS around, including Bionic.

The other approach, which I would recommend, is to indeed go the LinuxFromScratch way, or take some suitable meta-distribution and adapt it. The problem is that the proprietary, binary-only libs and drivers are built for Bionic, hence they are incompatible with any other libc. That is where libhybris aims to help, so I very much advise looking at the libhybris approach by Carsten Munk. Still, you will probably be stuck with Android kernels, but at least you can have a nice userspace.

Abhijit Potnis said...

Hadn't found time to read this one until today. Thanks for writing it!

Andreas Pokorny said...

Nice read, but isn't there one piece missing? How do you know that you can use the buffer at the server? Or do you simply do the transfer/commit to the Wayland server just after eglSwapBuffers in the client?

pq said...

Andreas, eglSwapBuffers internally posts the buffer to the server. However, the actual synchronization of rendering into the buffer vs. the compositor using it is an EGL implementation detail. I hear most EGLImage types require implicit synchronization, but TEXTURE_EXTERNAL does not. Therefore if TEXTURE_EXTERNAL is used, the client side eglSwapBuffers must perform a real glFinish before posting the buffer.

So the short answer is: the EGL implementation must guarantee the order of operations, and it can do it any way it wants.

CSRedRat said...

Great, but when will support for Mir be added?

Unknown said...

Excellent!

I want to implement Wayland EGL platform for Raspberry Pi.