Commits · wip/mst-dual-kref-start-v5 · Lyude Paul / linux

Jan 08, 2019

drm/nouveau: Use atomic VCPI helpers for MST · d23849ed

Lyude Paul authored Oct 17, 2018



Currently, nouveau uses the yolo method of setting up MST displays: it
uses the old VCPI helpers (drm_dp_find_vcpi_slots()) for computing the
display configuration. These helpers don't take care to make sure they
take a reference to the mstb port that they're checking, and
additionally don't actually check whether or not the topology still has
enough bandwidth to provide the VCPI tokens required.

So, drop usage of the old helpers and move entirely over to the atomic
helpers.

Changes since v6:
* Cleanup atomic check logic and remove a bunch of unneeded checks -
  danvet
Changes since v5:
* Update nv50_msto_atomic_check() and nv50_mstc_atomic_check() to the
  new requirements for drm_dp_atomic_find_vcpi_slots() and
  drm_dp_atomic_release_vcpi_slots()

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

d23849ed

drm/dp_mst: Check payload count in drm_dp_mst_atomic_check() · 2204a5d4

Lyude Paul authored Oct 22, 2018



It occurred to me that we never actually check this! So let's start
doing that.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

2204a5d4

drm/dp_mst: Start tracking per-port VCPI allocations · eb668fa1

Lyude Paul authored Oct 17, 2018



There has been a TODO waiting for quite a long time in
drm_dp_mst_topology.c:

	/* We cannot rely on port->vcpi.num_slots to update
	 * topology_state->avail_slots as the port may not exist if the parent
	 * branch device was unplugged. This should be fixed by tracking
	 * per-port slot allocation in drm_dp_mst_topology_state instead of
	 * depending on the caller to tell us how many slots to release.
	 */

That's not the only reason we should fix this: forcing the driver to
track the VCPI allocations throughout a state's atomic check is
error prone, because it means that extra care has to be taken with the
order that drm_dp_atomic_find_vcpi_slots() and
drm_dp_atomic_release_vcpi_slots() are called in in order to ensure
idempotency. Currently the only driver actually using these helpers,
i915, doesn't even do this correctly: multiple ->best_encoder() checks
with i915's current implementation would not be idempotent and would
over-allocate VCPI slots, something I learned trying to implement
fallback retraining in MST.

So: simplify this whole mess, and teach drm_dp_atomic_find_vcpi_slots()
and drm_dp_atomic_release_vcpi_slots() to track the VCPI allocations for
each port. This allows us to ensure idempotency without having to rely
on the driver as much. Additionally: the driver doesn't need to do any
kind of VCPI slot tracking anymore if it doesn't need it for it's own
internal state.

Additionally; this adds a new drm_dp_mst_atomic_check() helper which
must be used by atomic drivers to perform validity checks for the new
VCPI allocations incurred by a state.

Also: update the documentation and make it more obvious that these
/must/ be called by /all/ atomic drivers supporting MST.

Changes since v9:
* Add some missing changes that were requested by danvet that I forgot
  about after I redid all of the kref stuff:
  * Remove unnecessary state changes in intel_dp_mst_atomic_check
  * Cleanup atomic check logic for VCPI allocations - all we need to check in
    compute_config is whether or not this state disables a CRTC, then free
    VCPI based off that

Changes since v8:
 * Fix compile errors, whoops!

Changes since v7:
 - Don't check for mixed stale/valid VCPI allocations, just rely on
 connector registration to stop such erroneous modesets

Changes since v6:
 - Keep a kref to all of the ports we have allocations on. This required
   a good bit of changing to when we call drm_dp_find_vcpi_slots(),
   mainly that we need to ensure that we only redo VCPI allocations on
   actual mode or CRTC changes, not crtc_state->active changes.
   Additionally, we no longer take the registration of the DRM connector
   for each port into account because so long as we have a kref to the
   port in the new or previous atomic state, the connector will stay
   registered.
 - Use the small changes to drm_dp_put_port() to add even more error
   checking to make misusage of the helpers more obvious. I added this
   after having to chase down various use-after-free conditions that
   started popping up from the new helpers so no one else has to
   troubleshoot that.
 - Move some accidental DRM_DEBUG_KMS() calls to DRM_DEBUG_ATOMIC()
 - Update documentation again, note that find/release() should both not be
   called on the same port in a single atomic check phase (but multiple
   calls to one or the other is OK)

Changes since v4:
 - Don't skip the atomic checks for VCPI allocations if no new VCPI
   allocations happen in a state. This makes the next change I'm about
   to list here a lot easier to implement.
 - Don't ignore VCPI allocations on destroyed ports, instead ensure that
   when ports are destroyed and still have VCPI allocations in the
   topology state, the only state changes allowed are releasing said
   ports' VCPI. This prevents a state with a mix of VCPI allocations
   from destroyed ports, and allocations from valid ports.

Changes since v3:
 - Don't release VCPI allocations in the topology state immediately in
   drm_dp_atomic_release_vcpi_slots(), instead mark them as 0 and skip
   over them in drm_dp_mst_duplicate_state(). This makes it so
   drm_dp_atomic_release_vcpi_slots() is still idempotent while also
   throwing warnings if the driver messes up it's book keeping and tries
   to release VCPI slots on a port that doesn't have any pre-existing
   VCPI allocation - danvet
 - Change mst_state/state in some debugging messages to "mst state"

Changes since v2:
 - Use kmemdup() for duplicating MST state - danvet
 - Move port validation out of duplicate state callback - danvet
 - Handle looping through MST topology states in
   drm_dp_mst_atomic_check() so the driver doesn't have to do it
 - Fix documentation in drm_dp_atomic_find_vcpi_slots()
 - Move the atomic check for each individual topology state into it's
   own function, reduces indenting
 - Don't consider "stale" MST ports when calculating the bandwidth
   requirements. This is needed because originally we relied on the
   state duplication functions to prune any stale ports from the new
   state, which would prevent us from incorrectly considering their
   bandwidth requirements alongside legitimate new payloads.
 - Add function references in drm_dp_atomic_release_vcpi_slots() - danvet
 - Annotate atomic VCPI and atomic check functions with __must_check
   - danvet

Changes since v1:
 - Don't use the now-removed ->atomic_check() for private objects hook,
   just give drivers a function to call themselves

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

eb668fa1

drm/dp_mst: Add some atomic state iterator macros · f571f637

Lyude Paul authored Oct 26, 2018



Changes since v6:
 - Move EXPORT_SYMBOL() for drm_dp_mst_topology_state_funcs to this
   commit
 - Document __drm_dp_mst_state_iter_get() and note that it shouldn't be
   called directly

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

f571f637

drm/nouveau: Grab payload lock in nv50_msto_payload() · 7bf2bc49

Lyude Paul authored Dec 13, 2018

Going through the currently programmed payloads isn't safe without
holding mgr->payload_lock, so actually do that and warn if anyone tries
calling nv50_msto_payload() in the future without grabbing the right
locks.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

7bf2bc49

drm/nouveau: Stop unsetting mstc->port, use malloc refs · 60d31584

Lyude Paul authored Dec 10, 2018

Same as we did for i915, but for nouveau this time. Additionally, we
grab a malloc reference to the port that lasts for the entire lifetime
of nv50_mstc, which gives us the guarantee that mstc->port will always
point to valid memory for as long as the mstc stays around.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

60d31584

drm/nouveau: Keep malloc references to MST ports · 46254f42

Lyude Paul authored Dec 10, 2018



Now that we finally have a sane way to keep port allocations around, use
it to fix the potential unchecked ->port accesses that nouveau makes by
making sure we keep the mst port allocated for as long as it's
drm_connector is accessible.

Additionally, now that we've guaranteed that mstc->port is allocated for
as long as we keep mstc around we can remove the connector registration
checks for codepaths which release payloads, allowing us to release
payloads on active topologies properly. These registration checks were
only required before in order to avoid situations where mstc->port could
technically be pointing at freed memory.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

46254f42

drm/nouveau: Remove unnecessary VCPI checks in nv50_msto_cleanup() · 85468c94

Lyude Paul authored Dec 10, 2018



There is no need to look at the port's VCPI allocation before calling
drm_dp_mst_deallocate_vcpi(), as we already have msto->disabled to let
us avoid cleaning up an msto more then once. The DP MST core will never
call drm_dp_mst_deallocate_vcpi() on it's own, which is presumably what
these checks are meant to protect against.

More importantly though, we're about to stop clearing mstc->port in the
next commit, which means if we could potentially hit a use-after-free
error if we tried to check mstc->port->vcpi here. So to make life easier
for anyone who bisects this code in the future, use msto->disabled
instead to check whether or not we need to deallocate VCPI instead.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

85468c94

drm/nouveau: Remove bogus cleanup in nv50_mstm_add_connector() · ddd37dd5

Lyude Paul authored Dec 10, 2018

Trying to destroy the connector using mstc->connector.funcs->destroy()
if connector initialization fails is wrong: there is no possible
codepath in nv50_mstc_new where nv50_mstm_add_connector() would return
<0 and mstc would be non-NULL.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

ddd37dd5

drm/amdgpu/display: Keep malloc ref to MST port · 40d77edc

Lyude Paul authored Dec 19, 2018



Just like i915 and nouveau, it's a good idea for us to hold a malloc
reference to the port here so that we never pass a freed pointer to any
of the DP MST helper functions.

Also, we stop unsetting aconnector->port in
dm_dp_destroy_mst_connector(). There's literally no point to that
assignment that I can see anyway.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

40d77edc

drm/i915: Keep malloc references to MST ports · c027ed90

Lyude Paul authored Dec 13, 2018



So that the ports stay around until we've destroyed the connectors, in
order to ensure that we don't pass an invalid pointer to any MST helpers
once we introduce the new MST VCPI helpers.

Changes since v1:
* Move drm_dp_mst_get_port_malloc() to where we assign
  intel_connector->port - danvet

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>

c027ed90

drm/dp_mst: Fix payload deallocation on hotplugs using malloc refs · 12e23013

Lyude Paul authored Dec 07, 2018



Up until now, freeing payloads on remote MST hubs that just had ports
removed has almost never worked because we've been relying on port
validation in order to stop us from accessing ports that have already
been freed from memory, but ports which need their payloads released due
to being removed will never be a valid part of the topology after
they've been removed.

Since we've introduced malloc refs, we can replace all of the validation
logic in payload helpers which are used for deallocation with some
well-placed malloc krefs. This ensures that regardless of whether or not
the ports are still valid and in the topology, any port which has an
allocated payload will remain allocated in memory until it's payloads
have been removed - finally allowing us to actually release said
payloads correctly.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

12e23013

drm/dp_mst: Stop releasing VCPI when removing ports from topology · 731e2886

Lyude Paul authored Dec 07, 2018

This has never actually worked, and isn't needed anyway: the driver's
always going to try to deallocate VCPI when it tears down the display
that the VCPI belongs to.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

731e2886

drm/dp_mst: Restart last_connected_port_and_mstb() if topology ref fails · aa04d565

Lyude Paul authored Dec 19, 2018



While this isn't a complete fix, this will improve the reliability of
drm_dp_get_last_connected_port_and_mstb() pretty significantly during
hotplug events, since there's a chance that the in-memory topology tree
may not be fully updated when drm_dp_get_last_connected_port_and_mstb()
is called and thus might end up causing our search to fail on an mstb
whose topology refcount has reached 0, but has not yet been removed from
it's parent.

Ideally, we should further fix this problem by ensuring that we deal
with the potential for racing with a hotplug event, which would look
like this:

* drm_dp_payload_send_msg() retrieves the last living relative of mstb
  with drm_dp_get_last_connected_port_and_mstb()
* drm_dp_payload_send_msg() starts building payload message
  At the same time, mstb gets unplugged from the topology and is no
  longer the actual last living relative of the original mstb
* drm_dp_payload_send_msg() tries sending the payload message, hub times
  out
* Hub timed out, we give up and run away-resulting in the payload being
  leaked

This could be fixed by restarting the
drm_dp_get_last_connected_port_and_mstb() search whenever we get a
timeout, sending the payload to the new mstb, then repeating until
either the entire topology is removed from the system or
drm_dp_get_last_connected_port_and_mstb() fails. But since the above
race condition is not terribly likely, we'll address that in a later
patch series once we've improved the recovery handling for VCPI
allocations in the rest of the DP MST helpers.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

aa04d565

drm/dp_mst: Introduce new refcounting scheme for mstbs and ports · fe36c5b5

Lyude Paul authored Nov 27, 2018



The current way of handling refcounting in the DP MST helpers is really
confusing and probably just plain wrong because it's been hacked up many
times over the years without anyone actually going over the code and
seeing if things could be simplified.

To the best of my understanding, the current scheme works like this:
drm_dp_mst_port and drm_dp_mst_branch both have a single refcount. When
this refcount hits 0 for either of the two, they're removed from the
topology state, but not immediately freed. Both ports and branch devices
will reinitialize their kref once it's hit 0 before actually destroying
themselves. The intended purpose behind this is so that we can avoid
problems like not being able to free a remote payload that might still
be active, due to us having removed all of the port/branch device
structures in memory, as per:

commit 91a25e46 ("drm/dp/mst: deallocate payload on port destruction")

Which may have worked, but then it caused use-after-free errors. Being
new to MST at the time, I tried fixing it;

commit 263efde3 ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")

But, that was broken: both drm_dp_mst_port and drm_dp_mst_branch structs
are validated in almost every DP MST helper function. Simply put, this
means we go through the topology and try to see if the given
drm_dp_mst_branch or drm_dp_mst_port is still attached to something
before trying to use it in order to avoid dereferencing freed memory
(something that has happened a LOT in the past with this library).
Because of this it doesn't actually matter whether or not we keep keep
the ports and branches around in memory as that's not enough, because
any function that validates the branches and ports passed to it will
still reject them anyway since they're no longer in the topology
structure. So, use-after-free errors were fixed but payload deallocation
was completely broken.

Two years later, AMD informed me about this issue and I attempted to
come up with a temporary fix, pending a long-overdue cleanup of this
library:

commit c54c7374 ("drm/dp_mst: Skip validating ports during destruction, just ref")

But then that introduced use-after-free errors, so I quickly reverted
it:

commit 9765635b ("Revert "drm/dp_mst: Skip validating ports during destruction, just ref"")

And in the process, learned that there is just no simple fix for this:
the design is just broken. Unfortunately, the usage of these helpers are
quite broken as well. Some drivers like i915 have been smart enough to
avoid accessing any kind of information from MST port structures, but
others like nouveau have assumed, understandably so, that
drm_dp_mst_port structures are normal and can just be accessed at any
time without worrying about use-after-free errors.

After a lot of discussion, me and Daniel Vetter came up with a better
idea to replace all of this.

To summarize, since this is documented far more indepth in the
documentation this patch introduces, we make it so that drm_dp_mst_port
and drm_dp_mst_branch structures have two different classes of
refcounts: topology_kref, and malloc_kref. topology_kref corresponds to
the lifetime of the given drm_dp_mst_port or drm_dp_mst_branch in it's
given topology. Once it hits zero, any associated connectors are removed
and the branch or port can no longer be validated. malloc_kref
corresponds to the lifetime of the memory allocation for the actual
structure, and will always be non-zero so long as the topology_kref is
non-zero. This gives us a way to allow callers to hold onto port and
branch device structures past their topology lifetime, and dramatically
simplifies the lifetimes of both structures. This also finally fixes the
port deallocation problem, properly.

Additionally: since this now means that we can keep ports and branch
devices allocated in memory for however long we need, we no longer need
a significant amount of the port validation that we currently do.

Additionally, there is one last scenario that this fixes, which couldn't
have been fixed properly beforehand:

- CPU1 unrefs port from topology (refcount 1->0)
- CPU2 refs port in topology(refcount 0->1)

Since we now can guarantee memory safety for ports and branches
as-needed, we also can make our main reference counting functions fix
this problem by using kref_get_unless_zero() internally so that topology
refcounts can only ever reach 0 once.

Changes since v3:
* Remove rebase detritus - danvet
* Split out purely style changes into separate patches - hwentlan

Changes since v2:
* Fix commit message - checkpatch
* s/)-1/) - 1/g - checkpatch

Changes since v1:
* Remove forward declarations - danvet
* Move "Branch device and port refcounting" section from documentation
  into kernel-doc comments - danvet
* Export internal topology lifetime functions into their own section in
  the kernel-docs - danvet
* s/@/&/g for struct references in kernel-docs - danvet
* Drop the "when they are no longer being used" bits from the kernel
  docs - danvet
* Modify diagrams to show how the DRM driver interacts with the topology
  and payloads - danvet
* Make suggested documentation changes for
  drm_dp_mst_topology_get_mstb() and drm_dp_mst_topology_get_port() -
  danvet
* Better explain the relationship between malloc refs and topology krefs
  in the documentation for drm_dp_mst_topology_get_port() and
  drm_dp_mst_topology_get_mstb() - danvet
* Fix "See also" in drm_dp_mst_topology_get_mstb() - danvet
* Rename drm_dp_mst_topology_get_(port|mstb)() ->
  drm_dp_mst_topology_try_get_(port|mstb)() and
  drm_dp_mst_topology_ref_(port|mstb)() ->
  drm_dp_mst_topology_get_(port|mstb)() - danvet
* s/should/must in docs - danvet
* WARN_ON(refcount == 0) in topology_get_(mstb|port) - danvet
* Move kdocs for mstb/port structs inline - danvet
* Split drm_dp_get_last_connected_port_and_mstb() changes into their own
  commit - danvet

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

fe36c5b5

drm/dp_mst: Rename drm_dp_mst_get_validated_(port|mstb)_ref and friends · 35c41f0a

Lyude Paul authored Dec 18, 2018



s/drm_dp_get_validated_port_ref/drm_dp_mst_topology_get_port_validated/
s/drm_dp_put_port/drm_dp_mst_topology_put_port/
s/drm_dp_get_validated_mstb_ref/drm_dp_mst_topology_get_mstb_validated/
s/drm_dp_put_mst_branch_device/drm_dp_mst_topology_put_mstb/

This is a much more consistent naming scheme, and will make even more
sense once we redesign how the current refcounting scheme here works.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

35c41f0a

drm/dp_mst: Fix some formatting in drm_dp_mst_deallocate_vcpi() · 8e51be25

Lyude Paul authored Jan 08, 2019



Split some stuff across multiple lines

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

8e51be25

drm/dp_mst: Fix some formatting in drm_dp_mst_allocate_vcpi() · a953de3e

Lyude Paul authored Jan 08, 2019



Fix some indenting, split some stuff across multiple lines.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

a953de3e

drm/dp_mst: Fix some formatting in drm_dp_payload_send_msg() · c24fd238

Lyude Paul authored Jan 08, 2019



Split some stuff across multiple lines, remove some unnecessary braces

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

c24fd238

drm/dp_mst: Fix some formatting in drm_dp_add_port() · 7fa8a6f0

Lyude Paul authored Jan 08, 2019

Reindent some stuff, and split some stuff across multiple lines so we
aren't going over the text width limit.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Juston Li <juston.li@intel.com>

7fa8a6f0

drm-tip: 2019y-01m-08d-17h-49m-49s UTC integration manifest · cae04ddb
Noralf Trønnes authored Jan 08, 2019

cae04ddb
Merge remote-tracking branch 'drm-intel/topic/core-for-CI' into drm-tip · 4f27c5b5
Noralf Trønnes authored Jan 08, 2019

4f27c5b5
Merge remote-tracking branch 'sound/for-next' into drm-tip · 5232929c
Noralf Trønnes authored Jan 08, 2019

5232929c
Merge remote-tracking branch 'sound/for-linus' into drm-tip · 08f3af33
Noralf Trønnes authored Jan 08, 2019

08f3af33
Merge remote-tracking branch 'drm-intel/drm-intel-next-queued' into drm-tip · c90413cf
Noralf Trønnes authored Jan 08, 2019

c90413cf
Merge remote-tracking branch 'drm-misc/drm-misc-next' into drm-tip · 0858d6e9
Noralf Trønnes authored Jan 08, 2019

0858d6e9

drm/fb-helper: fix leaks in error path of drm_fb_helper_fbdev_setup · 00eb5b0d

Peter Wu authored Dec 23, 2018 and

Noralf Trønnes committed Jan 08, 2019

After drm_fb_helper_fbdev_setup calls drm_fb_helper_init,
"dev->fb_helper" will be initialized (and thus drm_fb_helper_fini will
have some effect). After that, drm_fb_helper_initial_config is called
which may call the "fb_probe" driver callback.

This driver callback may call drm_fb_helper_defio_init (as is done by
drm_fb_helper_generic_probe) or set a framebuffer (as is done by bochs)
as documented. These are normally cleaned up on exit by
drm_fb_helper_fbdev_teardown which also calls drm_fb_helper_fini.

If an error occurs after "fb_probe", but before setup is complete, then
calling just drm_fb_helper_fini will leak resources. This was triggered
by df2052cc ("bochs: convert to drm_fb_helper_fbdev_setup/teardown"):

    [   50.008030] bochsdrmfb: enable CONFIG_FB_LITTLE_ENDIAN to support this framebuffer
    [   50.009436] bochs-drm 0000:00:02.0: [drm:drm_fb_helper_fbdev_setup] *ERROR* fbdev: Failed to set configuration (ret=-38)
    [   50.011456] [drm] Initialized bochs-drm 1.0.0 20130925 for 0000:00:02.0 on minor 2
    [   50.013604] WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x280/0x2a0
    [   50.016175] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G                T 4.20.0-rc7 #1
    [   50.017732] EIP: drm_mode_config_cleanup+0x280/0x2a0
    ...
    [   50.023155] Call Trace:
    [   50.023155]  ? bochs_kms_fini+0x1e/0x30
    [   50.023155]  ? bochs_unload+0x18/0x40

This can be reproduced with QEMU and CONFIG_FB_LITTLE_ENDIAN=n.

Link: https://lkml.kernel.org/r/20181221083226.GI23332@shao2-debian
Link: https://lkml.kernel.org/r/20181223004315.GA11455@al


Fixes: 87412163 ("drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20181223005507.28328-1-peter@lekensteyn.nl

00eb5b0d

drm/fb-helper: generic: Fix setup error path · 6e1490cf

Noralf Trønnes authored Jan 05, 2019

If register_framebuffer() fails during fbdev setup we will leak the
framebuffer, the GEM buffer and the shadow buffer for defio. This is
because drm_fb_helper_fbdev_setup() just calls drm_fb_helper_fini() on
error not taking into account that register_framebuffer() can fail.

Since the generic emulation uses DRM client for its framebuffer and
backing buffer in addition to a shadow buffer, it's necessary to open code
drm_fb_helper_fbdev_setup() to properly handle the error path.

Error cleanup is removed from .fb_probe and is handled by one function for
all paths.

Fixes: 9060d7f4 ("drm/fb-helper: Finish the generic fbdev emulation")
Reported-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190105181846.26495-1-noralf@tronnes.org

6e1490cf

Fix ("nohz: Fix local_timer_softirq_pending()") may have revealed another problem · e183b2d2

Frederic Weisbecker authored Dec 28, 2018 and

Chris Wilson committed Jan 08, 2019

On Fri, Dec 28, 2018 at 12:11:12AM +0100, Heiner Kallweit wrote:
>
> OK, did as you advised and here comes the trace. That's the related dmesg part:
>
> [ 1479.025092] x86: Booting SMP configuration:
> [ 1479.025129] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [ 1479.094715] NOHZ: local_softirq_pending 202
> [ 1479.096557] smpboot: CPU 1 is now offline
>
> Hope it helps.
> Heiner
>
>
> # tracer: nop
> #
> #                              _-----=> irqs-off
> #                             / _----=> need-resched
> #                            | / _---=> hardirq/softirq
> #                            || / _--=> preempt-depth
> #                            ||| /     delay
> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> #              | |       |   ||||       |         |
[...]
>           <idle>-0     [001] d.h2  1479.111017: softirq_raise: vec=9 [action=RCU]
>           <idle>-0     [001] d.h2  1479.111026: softirq_raise: vec=7 [action=SCHED]
>           <idle>-0     [001] ..s2  1479.111035: softirq_entry: vec=1 [action=TIMER]
>           <idle>-0     [001] ..s2  1479.111040: softirq_exit: vec=1 [action=TIMER]
>           <idle>-0     [001] ..s2  1479.111040: softirq_entry: vec=7 [action=SCHED]
>           <idle>-0     [001] ..s2  1479.111052: softirq_exit: vec=7 [action=SCHED]
>           <idle>-0     [001] ..s2  1479.111052: softirq_entry: vec=9 [action=RCU]
>           <idle>-0     [001] .Ns2  1479.111079: softirq_exit: vec=9 [action=RCU]
>          cpuhp/1-13    [001] dNh2  1479.112930: softirq_raise: vec=1 [action=TIMER]
>          cpuhp/1-13    [001] dNh2  1479.112935: softirq_raise: vec=9 [action=RCU]

Interesting, the softirq is raised from hardirq but it's not handled in the end of
the IRQ. Are you running threaded IRQS by any chance? If so I would expect ksoftirqd
to handle the pending work before we go idle. However I can imagine a small window
where such an expectation may not be met: if the softirq is raised after the ksoftirqd
thread is parked (CPUHP_AP_SMPBOOT_THREADS), which is right before we disable the CPU
(CPUHP_TEARDOWN_CPU).

I don't know if we can afford to ignore a softirq even at this late stage. We should
probably avoid leaking any. So here is a possible fix, if you don't mind trying:

e183b2d2

Ndrm/i915/debugfs: store rotation string buffer on stack · 5852a15c

Jani Nikula authored Jan 07, 2019



Minimal change to nuke the static buf.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107145149.10069-1-jani.nikula@intel.com

5852a15c

drm/virtio: drop virtio_gpu_fence_cleanup() · cb66c6da

Gerd Hoffmann authored Dec 19, 2018



Just call drm_fence_put directly instead.
Also set vgfb->fence to NULL after dropping the reference.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20181219122708.4586-4-kraxel@redhat.com

cb66c6da

drm/virtio: fix pageflip flush · 6a01d277

Gerd Hoffmann authored Dec 19, 2018

Sending the flush command only makes sense if we actually have
a framebuffer attached to the scanout (handle != 0).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Ezequiel Garcia <ezequiel@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20181219122708.4586-3-kraxel@redhat.com

6a01d277

drm/virtio: log error responses · 3630c2a2

Gerd Hoffmann authored Dec 19, 2018

If we got an error response code from the host, print it to the log.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20181219122708.4586-2-kraxel@redhat.com

3630c2a2

drm/virtio: Add missing virtqueue reset · edde9fc5

Ezequiel Garcia authored Jan 02, 2019 and

Gerd Hoffmann committed Jan 08, 2019



As per the VirtIO spec, the virtqueues must be reset during cleanup
(see "3.3.1 Driver Requirements: Device Cleanup").

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190102175507.4653-2-ezequiel@collabora.com

edde9fc5

drm/virtio: Remove incorrect kfree() · 29cd2e2d

Ezequiel Garcia authored Jan 02, 2019 and

Gerd Hoffmann committed Jan 08, 2019

The virtio_gpu_output is a member of struct virtio_gpu_device
and is not a dynamically-allocated chunk, so it's wrong to kfree() it.
Removing it fixes a memory corruption BUG() that can be triggered
when the virtio-gpu driver is removed.

Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190102175507.4653-1-ezequiel@collabora.com

29cd2e2d

drm/i915: Return immediately if trylock fails for direct-reclaim · d25f71a1

Chris Wilson authored Jan 07, 2019

Ignore trying to shrink from i915 if we fail to acquire the struct_mutex
in the shrinker while performing direct-reclaim. The trade-off being
(much) lower latency for non-i915 clients at an increased risk of being
unable to obtain a page from direct-reclaim without hitting the
oom-notifier. The proviso being that we still keep trying to hard
obtain the lock for kswapd so that we can reap under heavy memory
pressure.

v2: Taint all mutexes taken within the shrinker with the struct_mutex
subclass as an early warning system, and drop I915_SHRINK_ACTIVE from
vmap to reduce the number of dangerous paths. We also have to drop
I915_SHRINK_ACTIVE from oom-notifier to be able to make the same claim
that ACTIVE is only used from outside context, which fits in with a
longer strategy of avoiding stalls due to scanning active during
shrinking.

The danger in using the subclass struct_mutex is that we declare
ourselves more knowledgable than lockdep and deprive ourselves of
automatic coverage. Instead, we require ourselves to mark up any mutex
taken inside the shrinker in order to detect lock-inversion, and if we
miss any we are doomed to a deadlock at the worst possible moment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107115509.12523-1-chris@chris-wilson.co.uk

d25f71a1

Merge drm/drm-next into drm-intel-next-queued · 3eb0930a

Jani Nikula authored Jan 08, 2019

Generally catch up with 5.0-rc1, and specifically get the changes:

96d4f267 ("Remove 'type' argument from access_ok() function")
0b2c8f8b ("i915: fix missing user_access_end() in page fault exception case")
594cc251 ("make 'user_access_begin()' do 'access_ok()'")

Signed-off-by: Jani Nikula <jani.nikula@intel.com>

3eb0930a

drm/todo: Better defio support in the generic fbdev emulation · be5cadc7

Simona Vetter authored Jan 07, 2019



The current one essentially means you need CMA or a vmalloc backed
object, which makes fbdev emulation a special case.

Since implementing this will be quite a bit of work, capture the idea
in a TODO.

Cc: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190107102238.7789-1-daniel.vetter@ffwll.ch

be5cadc7

ALSA: cs46xx: Potential NULL dereference in probe · 1524f4e4

Dan Carpenter authored Jan 08, 2019 and

Takashi Iwai committed Jan 08, 2019



The "chip->dsp_spos_instance" can be NULL on some of the ealier error
paths in snd_cs46xx_create().

Reported-by: "Yavuz, Tuba" <tuba@ece.ufl.edu>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

1524f4e4

Merge tag 'topic/drmp-cleanup-2019-01-02' of... · 481975ca

Jani Nikula authored Jan 08, 2019

Merge tag 'topic/drmp-cleanup-2019-01-02' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next-queued

Make some drm headers self-contained with includes and forward
declarations.

This topic branch has already been merged to drm-misc-next as commit
1c95f662 ("Merge tag 'topic/drmp-cleanup-2019-01-02' of
git://anongit.freedesktop.org/drm/drm-intel

 into drm-misc-next"). Now
merge it to drm-intel-next-queued to unblock some further drmP.h cleanup
without having to wait for a backmerge.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87pntfl6pa.fsf@intel.com

481975ca

Admin message