
va: need to move common logic to create a va lib.

Merged: He Junyan requested to merge He_Junyan/gst-plugins-bad:va_lib_support into master

As the va plugins mature, we want to use them more. Would it be possible to extract the common logic (display/memory/surface) from the plugins into a lib, just as gst-plugins-bad/gst-libs/gst/d3d11 does?

The current common way of using vaapi is:

vaapixxxdec ! vaapixxxenc

or

msdkxxxdec ! msdkxxxenc

The linkages between the vaapi and MSDK plugins are few, because the DMA buffer sharing support is imperfect (for example, due to differing video formats and tilings).

When it comes to the vaxxxdec plugins, we must connect vaxxxdec with msdkxxxenc, because the new va plugins do not support encoding and it seems the encoder part will not be ready soon. So the common usage will be

vaxxxdec ! msdkxxxenc

in the future.

And on Intel's new GPUs (Gen12+), the "modifier" is a headache: DMA buffer sharing between the va plugins and MSDK needs an explicit "modifier" negotiation. Also, Intel standalone GPU cards have multiple GPU groups, and we need to make sure vaxxxdec and msdkxxxenc run on the same device and GPU group, which is easy to get wrong. Sharing the VADisplay and VASurface avoids all of this.

Another important point: when we extend our GStreamer support onto Windows, we cannot find a good way to connect d3d11xxxdec with the msdk plugins, since there is no counterpart to DMA buffer sharing on Windows. Fortunately, a d3d11 lib already exists.

So, our ideal target is: using vaxxxdec ! msdkxxxenc on Linux, linking them with VAMemory caps and sharing the VADisplay/VASurface.

And using d3d11xxxdec ! msdkxxxenc on Windows, linking them with divx surface caps (not very accurate, need to ask @seungha.yang).

Then it is symmetric, and so we may want this va lib. The lib could also be used by other modules and customers, such as deep learning projects that are still in development and just want to use our VA surfaces quickly.

Edited by He Junyan

Merge request reports

Merge request pipeline #321350 passed for d09aae68

Merged by GStreamer Marge Bot (May 18, 2021 12:47pm UTC)


Pipeline #321431 waiting for manual action for d09aae68 on master

Activity

  • Author Developer

    We could not do that before, because gstreamer-vaapi was outside of gst-plugins-bad, so the MSDK plugin in -bad could not depend on that lib. But they are all in -bad now.

  • He Junyan changed the description

  • First of all, I have no objection to libgstva as long as it's a -bad scope library at the moment (like libgstd3d11), and I believe libgstva would give us a chance to improve the API design as well.

    Another important point: when we extend our GStreamer support onto Windows

    That sounds like a really good plan! AFAIK MSDK supports d3d very well on Windows, and other open source projects (OBS, for example) already use it. That's a feature I really wanted, and I was looking forward to contributions from Intel people. But

    I have no plan to integrate d3d11 with other APIs (MSDK for Intel, and CUDA/NVCODEC for NVIDIA) because:

    • d3d11 + MediaFoundation already works better than GStreamer MSDK (and GSTMFX) in my tests, in terms of stability and performance. I feel Intel supports d3d11 and MediaFoundation very well
    • d3d11 + MediaFoundation covers all well-known vendors, Intel, NVIDIA, AMD, Qualcomm
    • CUDA + d3d11 interop overhead is big; sometimes it's slower than d3d11/MediaFoundation. So I have no motivation to do that yet.
    • non-d3d11 Graphics APIs are not allowed for UWP

    In short, I prefer native Windows API (d3d11/MediaFoundation) over vendor specific APIs.

    There are some points that d3d11/MediaFoundation doesn't cover but the native APIs (MSDK/CUDA) do; still, I'm focusing on d3d11/MediaFoundation for the reasons I mentioned above.

    I hope Intel people take a look at Windows GStreamer things :blush:

    • Resolved by He Junyan

      @seungha.yang thanks for the feedback. MediaFoundation can only use a subset of MediaSDK. This is why we need a MediaSDK plugin.

      I have no plan to integrate d3d11 with other APIs

      We can do this for gstmsdk. As the first step, we can make gst-va work seamlessly with gstmsdk, then we will look at gst-dxva. We still have many gaps; for example, the gstmsdk d3d allocator is not implemented. Let us fix them step by step.

  • As far as I understand, this library will only expose the GstVaDisplay structure and the methods required for creating it and sharing it along the pipeline via GstContext.

    For VASurfaces, there's already a way to get them without needing any special API.

    Also, there's no need to expose bufferpools or allocators, if I understand correctly.
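
    For illustration, a minimal sketch of that sharing with plain GstContext API (the context type string is the one mentioned below in this thread; the "gst-display" field name is an assumption):

    ```c
    #include <gst/gst.h>

    /* Sketch only: publish a refcounted VA display object in a
     * GstContext so peer elements and the app can pick it up. */
    static void
    post_va_display_context (GstElement * element, GstObject * display)
    {
      GstContext *context = gst_context_new ("gst.va.display.handle", TRUE);
      GstStructure *s = gst_context_writable_structure (context);

      /* storing the display as a refcounted object (not a raw handle)
       * keeps it alive while any peer is still using it */
      gst_structure_set (s, "gst-display", GST_TYPE_OBJECT, display, NULL);

      gst_element_set_context (element, context);
      /* let the parent bin/app cache it; the message owns the context */
      gst_element_post_message (element,
          gst_message_new_have_context (GST_OBJECT (element), context));
    }
    ```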

  • We still need the gstmsdk encoder to provide an allocator and buffer pool, right? And in some cases, like "decode ! tee name=t ! encoder1 t. ! encoder2", tee will not ask for the encoder's allocator, so gstmsdk needs to accept upstream-allocated buffers too. Also, I am not sure whether, in the decode + tee + encoder case, the decoder can get the encoder's GstContext or not.

  • Author Developer

    @vjaquez, if we really do not want to create the lib, one way is to encapsulate the "display" handle inside some miniobject and pass this miniobject between modules in a GstContext. This can track the lifetime of the display handle. If we only export the raw display handle in the GstContext, e.g. as "gst.va.display.handle", the va plugins may close that display while others are still using it.

    I prefer to expose the display, the VA memory, and the VA memory pool, plus some utilities such as context querying and setting.

    GST_MAP_VA does work, but we still need to do some checks, such as "GST_VA_ALLOCATOR (mem->allocator)", before we map it.
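
    A minimal sketch of that check, using the type-check form of the macro; reading a VASurfaceID straight out of info.data is an assumption about the mapped layout:

    ```c
    #include <gst/gst.h>
    #include <va/va.h>

    static VASurfaceID
    peek_va_surface (GstBuffer * buffer)
    {
      GstMemory *mem = gst_buffer_peek_memory (buffer, 0);
      GstMapInfo info;
      VASurfaceID surface = VA_INVALID_ID;

      /* only memory from the VA allocator can answer a VA map request */
      if (!GST_IS_VA_ALLOCATOR (mem->allocator))
        return VA_INVALID_ID;

      if (gst_memory_map (mem, &info, GST_MAP_READ | GST_MAP_VA)) {
        surface = *(VASurfaceID *) info.data;
        gst_memory_unmap (mem, &info);
      }

      return surface;
    }
    ```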

  • I believe a lib to handle the context sharing is a good start. In general, I'd focus on keeping the API as minimal as possible. To answer some of @XuGuangxin's questions, for a pipeline like:

    graph LR
      subgraph vaXYZdec
        VD_sink[sink]
        VD_src[src]
        VD_sink -.- VD_src
      end
    
      subgraph tee
        T_sink[sink]
        T_src0[src0]
        T_src1[src1]
        T_sink -.- T_src0
        T_sink -.- T_src1
      end
    
      subgraph msdkXYZenc0
        ME0_sink[sink]
        ME0_src[src]
        ME0_sink -.- ME0_src
      end
    
      subgraph msdkXYZenc1
        ME1_sink[sink]
        ME1_src[src]
        ME1_sink -.- ME1_src
      end
    
      VD_src -- "video/x-raw(memory:VASurface)" ---> T_sink
      T_src0 -- "video/x-raw(memory:VASurface)" ---> ME0_sink
      T_src1 -- "video/x-raw(memory:VASurface)" ---> ME1_sink

    The context sharing, assuming it is done the GstContext way, will work like this: the app is asked first through a sync message (unless a context is cached in the parent bin), and otherwise the neighbours are queried for a context. If a known context is found, it's used or wrapped (depending on the stack, really) so that zero-copy buffer sharing is possible.
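
    In code, that lookup could look roughly like this (core GStreamer API only; a real element also tracks whether set_context() already delivered a context before falling back to the peer query):

    ```c
    #include <gst/gst.h>

    static void
    request_va_context (GstElement * element, GstPad * peer_pad)
    {
      GstQuery *query;

      /* ask the application (and any caching parent bin) first,
       * through a synchronous bus message */
      gst_element_post_message (element,
          gst_message_new_need_context (GST_OBJECT (element),
              "gst.va.display.handle"));

      /* otherwise, query the neighbours for an existing context */
      query = gst_query_new_context ("gst.va.display.handle");
      if (gst_pad_peer_query (peer_pad, query)) {
        GstContext *context = NULL;

        gst_query_parse_context (query, &context);
        gst_element_set_context (element, context);
      }
      gst_query_unref (query);
    }
    ```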

    Now, most of the work with memory:VASurface happens through caps, so by accepting these caps, you also accept that you can deal with a VASurface, with or without a usable shared context. For D3D11 it was noted that this means we need to be aware of the context the D3D11 textures were created with, so that in the worst case you can resort to a full download/upload roundtrip. The same seems needed here, I believe. This is very boilerplate stuff that should go into the lib.

    Now, let's say the context is compatible; you then have to deal with VASurface allocation. As usual, the decoder will use an allocation query to retrieve downstream information. The outcome could be a usable buffer pool (figuring out whether the pool produces usable memory:VASurface can be done through a pool feature implemented in the lib). But in the tee case you will not get any pool, so you will have to resort to the allocation APIs. As of the current implementation, tee will only keep the APIs that exist on all branches (or legs).
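
    Roughly, the decoder side of that query could look like this; "GstBufferPoolOptionVASurface" is a hypothetical option name standing in for the pool feature the lib would implement:

    ```c
    #include <gst/gst.h>

    /* Returns a usable downstream pool, or NULL (e.g. in the tee case),
     * in which case the caller falls back to the allocation APIs. */
    static GstBufferPool *
    find_va_pool (GstPad * srcpad, GstCaps * caps)
    {
      GstQuery *query = gst_query_new_allocation (caps, TRUE);
      GstBufferPool *pool = NULL;

      if (gst_pad_peer_query (srcpad, query) &&
          gst_query_get_n_allocation_pools (query) > 0) {
        guint size, min, max;

        gst_query_parse_nth_allocation_pool (query, 0, &pool,
            &size, &min, &max);
        /* check that the proposed pool really produces memory:VASurface */
        if (pool && !gst_buffer_pool_has_option (pool,
                "GstBufferPoolOptionVASurface"))
          gst_clear_object (&pool);
      }

      gst_query_unref (query);
      return pool;
    }
    ```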

    Now, for this case, just the API type would not be enough. I believe you will need some extra information (a modifiers list would be an option; it could also be some VA-specific hint, remembering this has to be VA specific). The only catch is that tee does not yet know how to merge allocation API parameters. I need to make this happen for memory:DMABuf modifiers anyway, so just bug me about it when you get there and I can make it happen.

    This is far from a complete plan, but it gives an idea of all the boilerplate needed and how a shared library can help.

    Edited by Nicolas Dufresne
  • And using d3d11xxxdec ! msdkxxxenc on Windows, linking them with divx surface caps

    Regarding this one, I'd say DXGI/D3D11 surface/texture sharing will work only within one physical device (I haven't tested other cases like AMD CrossFire or NVIDIA SLI).

    Another note: d3d11*dec doesn't use a downstream buffer pool (it always uses its internal pool for decoding) because of the DXVA API design. A downstream d3d11-compatible buffer pool will be used only for

    • reverse playback
    • or when the internal DPB pool is about to be full

    One more note: I implemented the MediaFoundation (MF) + d3d11 integration layer so that MF copies incoming d3d11 textures into MF's own texture pool, for performance reasons (intra-GPU copy overhead is sometimes smaller than synchronization overhead, especially in the case of NVIDIA): https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/blob/master/sys/mediafoundation/gstmfvideoenc.cpp#L1014-1086

  • @ndufresne thanks for the information. We will ping you when we have an issue.

    @seungha.yang, do you have a plan to use the downstream pool? Zero-copy is important for large resolutions.

  • I have no plan to use the downstream pool because of the special requirements for DPB textures.

    Btw, I think MSDK should be able to wrap GstD3D11Memory into MSDK form and accept it even if it's not allocated by MSDK?

    • Author Developer
      Resolved by He Junyan

      I think that may be more or less like MSDK using VAMemory: MSDK may provide no pool to the upstream element, but it can still recognize and use the GstD3D11Memory/VAMemory allocated by the upstream element. This way no copy is needed.

  • He Junyan added 26 commits

  • He Junyan marked this merge request as ready

  • He Junyan changed title from [RFC] WIP: va: we need to create a va lib. to va: need to move common logic to create a va lib.

  • Author Developer

    I have completed a first version that makes a pipeline such as:

    gst-launch-1.0 -vf filesrc location=1920x1080.h264 ! h264parse ! vah264dec ! video/x-raw\(memory:VAMemory\) ! msdkh264enc ! fakesink

    work well, and the encoded result is correct.

    I find that we may really need to move the va display logic into a lib to make the code clean.

    What I did here is:

    1. Moved the VA display logic into the va lib. We do not install its headers for now because the lib is only used by va and MSDK inside -bad.
    2. Implemented MSDK's context on top of the common VA display, so all the VA-related plugins can share the same VA display via GstContext (see the sketch after this list).
    3. Imported the VA surface as VAMemory so that MSDK can use it directly.
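
    A minimal sketch of the step-2 handover on the MSDK side, assuming the same context type and "gst-display" field as the earlier sketches (the usual GObject boilerplate is omitted):

    ```c
    static void
    gst_msdk_enc_set_context (GstElement * element, GstContext * context)
    {
      if (g_strcmp0 (gst_context_get_context_type (context),
              "gst.va.display.handle") == 0) {
        const GstStructure *s = gst_context_get_structure (context);
        GstObject *display = NULL;

        if (gst_structure_get (s, "gst-display", GST_TYPE_OBJECT,
                &display, NULL)) {
          /* hand the shared VA display over to the MSDK session here */
          gst_object_unref (display);
        }
      }

      GST_ELEMENT_CLASS (parent_class)->set_context (element, context);
    }
    ```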

    If the idea is correct, we may need to split this into two MRs later: one for the lib, and another one for MSDK.

  • You are so efficient. How about this pipeline? gst-launch-1.0 -vf filesrc location=1920x1080.h264 ! h264parse ! vah264dec ! msdkh264enc ! fakesink

  • Author Developer

    The exact same result. @vjaquez made VAMemory the first choice in the caps, so once the downstream element reports VAMemory caps, it will be used.
