  1. Dec 17, 2021
  2. Nov 11, 2021
    • arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology() · 4cc4cc28
      Wang ShaoBo authored
      
      When testing CPU online and offline, a warning like this was triggered:
      
      [  146.746743] WARNING: CPU: 92 PID: 974 at kernel/sched/topology.c:2215 build_sched_domains+0x81c/0x11b0
      [  146.749988] CPU: 92 PID: 974 Comm: kworker/92:2 Not tainted 5.15.0 #9
      [  146.750402] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.79 08/21/2021
      [  146.751213] Workqueue: events cpuset_hotplug_workfn
      [  146.751629] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  146.752048] pc : build_sched_domains+0x81c/0x11b0
      [  146.752461] lr : build_sched_domains+0x414/0x11b0
      [  146.752860] sp : ffff800040a83a80
      [  146.753247] x29: ffff800040a83a80 x28: ffff20801f13a980 x27: ffff20800448ae00
      [  146.753644] x26: ffff800012a858e8 x25: ffff800012ea48c0 x24: 0000000000000000
      [  146.754039] x23: ffff800010ab7d60 x22: ffff800012f03758 x21: 000000000000005f
      [  146.754427] x20: 000000000000005c x19: ffff004080012840 x18: ffffffffffffffff
      [  146.754814] x17: 3661613030303230 x16: 30303078303a3239 x15: ffff800011f92b48
      [  146.755197] x14: ffff20be3f95cef6 x13: 2e6e69616d6f642d x12: 6465686373204c4c
      [  146.755578] x11: ffff20bf7fc83a00 x10: 0000000000000040 x9 : 0000000000000000
      [  146.755957] x8 : 0000000000000002 x7 : ffffffffe0000000 x6 : 0000000000000002
      [  146.756334] x5 : 0000000090000000 x4 : 00000000f0000000 x3 : 0000000000000001
      [  146.756705] x2 : 0000000000000080 x1 : ffff800012f03860 x0 : 0000000000000001
      [  146.757070] Call trace:
      [  146.757421]  build_sched_domains+0x81c/0x11b0
      [  146.757771]  partition_sched_domains_locked+0x57c/0x978
      [  146.758118]  rebuild_sched_domains_locked+0x44c/0x7f0
      [  146.758460]  rebuild_sched_domains+0x2c/0x48
      [  146.758791]  cpuset_hotplug_workfn+0x3fc/0x888
      [  146.759114]  process_one_work+0x1f4/0x480
      [  146.759429]  worker_thread+0x48/0x460
      [  146.759734]  kthread+0x158/0x168
      [  146.760030]  ret_from_fork+0x10/0x20
      [  146.760318] ---[ end trace 82c44aad6900e81a ]---
      
      Some architectures, such as RISC-V and arm64, use the common
      clear_cpu_topology() code when shutting down CPUx. When
      CONFIG_SCHED_CLUSTER is set, CPUx is not cleared from the
      cluster_sibling mask in cpu_topology of each of its siblings. This makes
      the check in topology_span_sane() fail, so rebuilding the topology
      eventually fails when the CPU comes back online.

      cluster_sibling of each sibling in cpu_topology[] when CPU92 goes offline
      (CPU 92, 93, 94, 95 are in one cluster):
      
      Before revision:
      CPU                 [92]      [93]      [94]      [95]
      cluster_sibling     [92]     [92-95]   [92-95]   [92-95]
      
      After revision:
      CPU                 [92]      [93]      [94]      [95]
      cluster_sibling     [92]     [93-95]   [93-95]   [93-95]
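
      The before/after table can be reproduced with a small userspace model.
      This is only a sketch: the toy_* names, the 8-CPU limit and the 32-bit
      mask are assumptions made for illustration, not the kernel's cpumask
      API or the actual patch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8u /* hypothetical small system; the report used CPUs 92-95 */

/* Toy stand-in for cpu_topology[].cluster_sibling: one bit per CPU. */
static uint32_t toy_cluster_sibling[TOY_NR_CPUS];

/* Put CPUs first..last (inclusive) into one cluster. */
static void toy_init_cluster(unsigned int first, unsigned int last)
{
    uint32_t mask = 0;
    unsigned int cpu;

    assert(first <= last && last < TOY_NR_CPUS);
    for (cpu = first; cpu <= last; cpu++)
        mask |= UINT32_C(1) << cpu;
    for (cpu = first; cpu <= last; cpu++)
        toy_cluster_sibling[cpu] = mask;
}

/* Take a CPU offline: drop it from every sibling's mask, then reset its own. */
static void toy_remove_cpu(unsigned int cpu)
{
    uint32_t siblings;
    unsigned int s;

    assert(cpu < TOY_NR_CPUS);
    siblings = toy_cluster_sibling[cpu];
    for (s = 0; s < TOY_NR_CPUS; s++)
        if (siblings & (UINT32_C(1) << s))
            toy_cluster_sibling[s] &= ~(UINT32_C(1) << cpu);
    toy_cluster_sibling[cpu] = UINT32_C(1) << cpu;
}
```

      With CPUs 4-7 in one cluster, calling toy_remove_cpu(4) leaves CPU 4
      with only itself and CPUs 5-7 with the mask 5-7, matching the
      "After revision" row.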
      
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Acked-by: Barry Song <song.bao.hua@hisilicon.com>
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Link: https://lore.kernel.org/r/20211110095856.469360-1-bobo.shaobowang@huawei.com
  3. Nov 06, 2021
  4. Nov 05, 2021
    • PM: sleep: Avoid calling put_device() under dpm_list_mtx · 2aa36604
      Rafael J. Wysocki authored
      
      It is generally unsafe to call put_device() with dpm_list_mtx held,
      because the given device's release routine may carry out an action
      depending on that lock which then may deadlock, so modify the
      system-wide suspend and resume of devices to always drop dpm_list_mtx
      before calling put_device() (and adjust white space somewhat while
      at it).
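
      The locking pattern described above can be sketched in userspace as
      follows. Everything here is a simplified assumption for illustration:
      the toy_* names are hypothetical, the lock is a plain flag, and the
      release routine takes the list lock directly, whereas in the kernel
      the dependency may be indirect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_device {
    int refcount;
    bool unplugged; /* the registration reference is dropped while we work */
    bool released;
};

static bool toy_list_lock_held; /* stand-in for dpm_list_mtx */

static void toy_list_lock(void)
{
    assert(!toy_list_lock_held); /* a second acquisition would deadlock */
    toy_list_lock_held = true;
}

static void toy_list_unlock(void)
{
    assert(toy_list_lock_held);
    toy_list_lock_held = false;
}

static void toy_get(struct toy_device *d)
{
    d->refcount++;
}

static void toy_put(struct toy_device *d)
{
    if (--d->refcount == 0) {
        /* The release routine depends on the list lock. */
        toy_list_lock();
        d->released = true;
        toy_list_unlock();
    }
}

/* Walk the list, always dropping the lock before the final toy_put(). */
static void toy_resume_all(struct toy_device **devs, size_t n)
{
    size_t i;

    toy_list_lock();
    for (i = 0; i < n; i++) {
        struct toy_device *d = devs[i];

        toy_get(d);
        toy_list_unlock();

        if (d->unplugged)
            toy_put(d); /* models removal elsewhere during the callback */

        toy_put(d); /* safe: the list lock is not held here */
        toy_list_lock();
    }
    toy_list_unlock();
}
```

      If the last toy_put() were issued before toy_list_unlock(), the
      assertion in toy_list_lock() would trip for the unplugged device,
      which is the toy equivalent of the deadlock.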
      
      For instance, this prevents the following splat from showing up in
      the kernel log after a system resume in certain configurations:
      
      [ 3290.969514] ======================================================
      [ 3290.969517] WARNING: possible circular locking dependency detected
      [ 3290.969519] 5.15.0+ #2420 Tainted: G S
      [ 3290.969523] ------------------------------------------------------
      [ 3290.969525] systemd-sleep/4553 is trying to acquire lock:
      [ 3290.969529] ffff888117ab1138 ((wq_completion)hci0#2){+.+.}-{0:0}, at: flush_workqueue+0x87/0x4a0
      [ 3290.969554]
                     but task is already holding lock:
      [ 3290.969556] ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0
      [ 3290.969571]
                     which lock already depends on the new lock.
      
      [ 3290.969573]
                     the existing dependency chain (in reverse order) is:
      [ 3290.969575]
                     -> #3 (dpm_list_mtx){+.+.}-{3:3}:
      [ 3290.969583]        __mutex_lock+0x9d/0xa30
      [ 3290.969591]        device_pm_add+0x2e/0xe0
      [ 3290.969597]        device_add+0x4d5/0x8f0
      [ 3290.969605]        hci_conn_add_sysfs+0x43/0xb0 [bluetooth]
      [ 3290.969689]        hci_conn_complete_evt.isra.71+0x124/0x750 [bluetooth]
      [ 3290.969747]        hci_event_packet+0xd6c/0x28a0 [bluetooth]
      [ 3290.969798]        hci_rx_work+0x213/0x640 [bluetooth]
      [ 3290.969842]        process_one_work+0x2aa/0x650
      [ 3290.969851]        worker_thread+0x39/0x400
      [ 3290.969859]        kthread+0x142/0x170
      [ 3290.969865]        ret_from_fork+0x22/0x30
      [ 3290.969872]
                     -> #2 (&hdev->lock){+.+.}-{3:3}:
      [ 3290.969881]        __mutex_lock+0x9d/0xa30
      [ 3290.969887]        hci_event_packet+0xba/0x28a0 [bluetooth]
      [ 3290.969935]        hci_rx_work+0x213/0x640 [bluetooth]
      [ 3290.969978]        process_one_work+0x2aa/0x650
      [ 3290.969985]        worker_thread+0x39/0x400
      [ 3290.969993]        kthread+0x142/0x170
      [ 3290.969999]        ret_from_fork+0x22/0x30
      [ 3290.970004]
                     -> #1 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0}:
      [ 3290.970013]        process_one_work+0x27d/0x650
      [ 3290.970020]        worker_thread+0x39/0x400
      [ 3290.970028]        kthread+0x142/0x170
      [ 3290.970033]        ret_from_fork+0x22/0x30
      [ 3290.970038]
                     -> #0 ((wq_completion)hci0#2){+.+.}-{0:0}:
      [ 3290.970047]        __lock_acquire+0x15cb/0x1b50
      [ 3290.970054]        lock_acquire+0x26c/0x300
      [ 3290.970059]        flush_workqueue+0xae/0x4a0
      [ 3290.970066]        drain_workqueue+0xa1/0x130
      [ 3290.970073]        destroy_workqueue+0x34/0x1f0
      [ 3290.970081]        hci_release_dev+0x49/0x180 [bluetooth]
      [ 3290.970130]        bt_host_release+0x1d/0x30 [bluetooth]
      [ 3290.970195]        device_release+0x33/0x90
      [ 3290.970201]        kobject_release+0x63/0x160
      [ 3290.970211]        dpm_resume+0x164/0x3e0
      [ 3290.970215]        dpm_resume_end+0xd/0x20
      [ 3290.970220]        suspend_devices_and_enter+0x1a4/0xba0
      [ 3290.970229]        pm_suspend+0x26b/0x310
      [ 3290.970236]        state_store+0x42/0x90
      [ 3290.970243]        kernfs_fop_write_iter+0x135/0x1b0
      [ 3290.970251]        new_sync_write+0x125/0x1c0
      [ 3290.970257]        vfs_write+0x360/0x3c0
      [ 3290.970263]        ksys_write+0xa7/0xe0
      [ 3290.970269]        do_syscall_64+0x3a/0x80
      [ 3290.970276]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 3290.970284]
                     other info that might help us debug this:
      
      [ 3290.970285] Chain exists of:
                       (wq_completion)hci0#2 --> &hdev->lock --> dpm_list_mtx
      
      [ 3290.970297]  Possible unsafe locking scenario:
      
      [ 3290.970299]        CPU0                    CPU1
      [ 3290.970300]        ----                    ----
      [ 3290.970302]   lock(dpm_list_mtx);
      [ 3290.970306]                                lock(&hdev->lock);
      [ 3290.970310]                                lock(dpm_list_mtx);
      [ 3290.970314]   lock((wq_completion)hci0#2);
      [ 3290.970319]
                      *** DEADLOCK ***
      
      [ 3290.970321] 7 locks held by systemd-sleep/4553:
      [ 3290.970325]  #0: ffff888103bcd448 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xa7/0xe0
      [ 3290.970341]  #1: ffff888115a14488 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x103/0x1b0
      [ 3290.970355]  #2: ffff888100f719e0 (kn->active#233){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1b0
      [ 3290.970369]  #3: ffffffff82661048 (autosleep_lock){+.+.}-{3:3}, at: state_store+0x12/0x90
      [ 3290.970384]  #4: ffffffff82658ac8 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x9f/0x310
      [ 3290.970399]  #5: ffffffff827f2a48 (acpi_scan_lock){+.+.}-{3:3}, at: acpi_suspend_begin+0x4c/0x80
      [ 3290.970416]  #6: ffffffff8280fca8 (dpm_list_mtx){+.+.}-{3:3}, at: dpm_resume+0x12e/0x3e0
      [ 3290.970428]
                     stack backtrace:
      [ 3290.970431] CPU: 3 PID: 4553 Comm: systemd-sleep Tainted: G S                5.15.0+ #2420
      [ 3290.970438] Hardware name: Dell Inc. XPS 13 9380/0RYJWW, BIOS 1.5.0 06/03/2019
      [ 3290.970441] Call Trace:
      [ 3290.970446]  dump_stack_lvl+0x44/0x57
      [ 3290.970454]  check_noncircular+0x105/0x120
      [ 3290.970468]  ? __lock_acquire+0x15cb/0x1b50
      [ 3290.970474]  __lock_acquire+0x15cb/0x1b50
      [ 3290.970487]  lock_acquire+0x26c/0x300
      [ 3290.970493]  ? flush_workqueue+0x87/0x4a0
      [ 3290.970503]  ? __raw_spin_lock_init+0x3b/0x60
      [ 3290.970510]  ? lockdep_init_map_type+0x58/0x240
      [ 3290.970519]  flush_workqueue+0xae/0x4a0
      [ 3290.970526]  ? flush_workqueue+0x87/0x4a0
      [ 3290.970544]  ? drain_workqueue+0xa1/0x130
      [ 3290.970552]  drain_workqueue+0xa1/0x130
      [ 3290.970561]  destroy_workqueue+0x34/0x1f0
      [ 3290.970572]  hci_release_dev+0x49/0x180 [bluetooth]
      [ 3290.970624]  bt_host_release+0x1d/0x30 [bluetooth]
      [ 3290.970687]  device_release+0x33/0x90
      [ 3290.970695]  kobject_release+0x63/0x160
      [ 3290.970705]  dpm_resume+0x164/0x3e0
      [ 3290.970710]  ? dpm_resume_early+0x251/0x3b0
      [ 3290.970718]  dpm_resume_end+0xd/0x20
      [ 3290.970723]  suspend_devices_and_enter+0x1a4/0xba0
      [ 3290.970737]  pm_suspend+0x26b/0x310
      [ 3290.970746]  state_store+0x42/0x90
      [ 3290.970755]  kernfs_fop_write_iter+0x135/0x1b0
      [ 3290.970764]  new_sync_write+0x125/0x1c0
      [ 3290.970777]  vfs_write+0x360/0x3c0
      [ 3290.970785]  ksys_write+0xa7/0xe0
      [ 3290.970794]  do_syscall_64+0x3a/0x80
      [ 3290.970803]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 3290.970811] RIP: 0033:0x7f41b1328164
      [ 3290.970819] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 80 00 00 00 00 8b 05 4a d2 2c 00 48 63 ff 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 55 53 48 89 d5 48 89 f3 48 83
      [ 3290.970824] RSP: 002b:00007ffe6ae21b28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 3290.970831] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f41b1328164
      [ 3290.970836] RDX: 0000000000000004 RSI: 000055965e651070 RDI: 0000000000000004
      [ 3290.970839] RBP: 000055965e651070 R08: 000055965e64f390 R09: 00007f41b1e3d1c0
      [ 3290.970843] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004
      [ 3290.970846] R13: 0000000000000001 R14: 000055965e64f2b0 R15: 0000000000000004
      
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  5. Nov 04, 2021
    • PM: sleep: Fix runtime PM based cpuidle support · a2bd7be1
      Ulf Hansson authored
      
      In the cpuidle-psci case, runtime PM in combination with the generic PM
      domain (genpd) may be used when entering/exiting a shared idle state. More
      precisely, genpd relies on runtime PM to be enabled for the attached device
      (in this case it belongs to a CPU), to properly manage the reference
      counting of its PM domain.
      
      This works fine most of the time, but during system suspend in
      dpm_suspend_late(), the PM core disables runtime PM for all devices. Beyond
      this point, calls to pm_runtime_get_sync() to runtime resume a device may
      fail and therefore it could also mess up the reference counting in genpd.
      
      To fix this problem, let's call wake_up_all_idle_cpus() in
      dpm_suspend_late(), prior to disabling runtime PM. In this way, a device
      that belongs to a CPU becomes runtime resumed through cpuidle-psci and
      stays like that, because the runtime PM usage count has been bumped in
      device_prepare().
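
      The ordering argument can be illustrated with a toy model. This is a
      sketch under simplifying assumptions: the toy_* names are hypothetical,
      one device stands for a CPU's genpd-attached device, and a failed
      resume is modelled as a plain -1 rather than the real runtime PM
      error handling.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_cpu_dev {
    bool rpm_enabled; /* runtime PM enabled for the device */
    bool suspended;   /* device (and so its PM domain) is powered down */
    int usage;        /* runtime PM usage count */
};

/* CPU enters the shared idle state: drop one reference. */
static void toy_idle_enter(struct toy_cpu_dev *d)
{
    if (!d->rpm_enabled)
        return;
    assert(d->usage > 0);
    if (--d->usage == 0)
        d->suspended = true;
}

/* CPU leaves idle: resume the device, which fails once runtime PM is off. */
static int toy_idle_exit(struct toy_cpu_dev *d)
{
    if (!d->rpm_enabled)
        return -1;
    d->usage++;
    d->suspended = false;
    return 0;
}

/* Models the usage count bump done in device_prepare(). */
static void toy_prepare(struct toy_cpu_dev *d)
{
    d->usage++;
}

/* Models dpm_suspend_late(); wake_first selects the fixed ordering. */
static void toy_suspend_late(struct toy_cpu_dev *d, bool wake_first)
{
    if (wake_first && d->suspended)
        toy_idle_exit(d); /* stand-in for wake_up_all_idle_cpus() */
    d->rpm_enabled = false;
}
```

      In this model, a CPU that was idle before toy_prepare() stays suspended
      without the wake-up and can no longer be resumed; with the wake-up it is
      resumed while runtime PM is still enabled.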
      
      Diagnosed-by: Maulik Shah <mkshah@codeaurora.org>
      Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  6. Oct 27, 2021
    • PM / wakeirq: support enabling wake-up irq after runtime_suspend called · 25971410
      Chunfeng Yun authored
      
      When the dedicated wake IRQ is level-triggered and uses the device's
      low-power status as the wakeup source, the wake IRQ will fire, if
      enabled, whenever the device is not in the low-power state. In this
      case, the wake IRQ needs to be enabled after running the device's
      ->runtime_suspend(), which makes it enter the low-power state.

      For example:
      Assume the wake IRQ is a low-level-triggered type, and the wakeup
      signal comes from the low-power status of the device.
      The wakeup signal is at low level at running time (0), and goes to
      high level when the device enters the low-power state (runtime_suspend
      (1) is called). A wakeup event at (2) makes the device exit the
      low-power state, and the wakeup signal then goes back to low level.
      
                      ------------------
                     |           ^     ^|
      ----------------           |     | --------------
       |<---(0)--->|<--(1)--|   (3)   (2)    (4)
      
      If the wake IRQ is enabled before running ->runtime_suspend(), during
      (0), it fires at once and causes an immediate resume.
      It works if the wake IRQ is enabled after running ->runtime_suspend()
      (e.g. at (3) or (4)).
      
      This patch introduces a new status WAKE_IRQ_DEDICATED_REVERSE to
      optionally support enabling wake IRQ after running ->runtime_suspend().
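
      The effect of the enable order can be shown with a small model. This
      is an illustrative sketch only: the toy_* names are hypothetical and
      the "reverse" flag merely stands in for the behaviour selected by
      WAKE_IRQ_DEDICATED_REVERSE; none of this is the wakeirq API.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_wake_dev {
    bool low_power;        /* device is in its low-power state */
    bool irq_enabled;
    int immediate_wakeups; /* times the IRQ fired as soon as it was enabled */
};

/* Wakeup signal: low (0) while running, high (1) in the low-power state. */
static int toy_signal_level(const struct toy_wake_dev *d)
{
    return d->low_power ? 1 : 0;
}

/* Low-level-triggered IRQ: fires at once if enabled while the signal is low. */
static void toy_enable_wake_irq(struct toy_wake_dev *d)
{
    d->irq_enabled = true;
    if (toy_signal_level(d) == 0)
        d->immediate_wakeups++;
}

/* reverse == true enables the wake IRQ only after entering low power. */
static void toy_runtime_suspend(struct toy_wake_dev *d, bool reverse)
{
    assert(!d->low_power && !d->irq_enabled);
    if (!reverse)
        toy_enable_wake_irq(d);
    d->low_power = true; /* stand-in for the driver's ->runtime_suspend() */
    if (reverse)
        toy_enable_wake_irq(d);
}
```

      In this model the default order records one immediate wakeup, while the
      reversed order records none.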
      
      Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  7. Oct 26, 2021
  8. Oct 24, 2021
  9. Oct 23, 2021
  10. Oct 22, 2021
  11. Oct 21, 2021
    • component: do not leave master devres group open after bind · c87761db
      Kai Vehmanen authored
      
      In current code, the devres group for aggregate master is left open
      after call to component_master_add_*(). This leads to problems when the
      master does further managed allocations on its own. When any
      participating driver calls component_del(), this leads to immediate
      release of resources.
      
      This came up when investigating a page fault occurring with i915 DRM
      driver unbind with 5.15-rc1 kernel. The following sequence occurs:
      
       i915_pci_remove()
         -> intel_display_driver_unregister()
           -> i915_audio_component_cleanup()
             -> component_del()
               -> component.c:take_down_master()
                 -> hdac_component_master_unbind() [via master->ops->unbind()]
                 -> devres_release_group(master->parent, NULL)
      
      With older kernels this has not caused issues, but with audio driver
      moving to use managed interfaces for more of its allocations, this no
      longer works. Devres log shows following to occur:
      
      component_master_add_with_match()
      [  126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
      [  126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
      [  126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)
      
      audio driver completes its PCI probe()
      [  126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)
      
      component_del() called at DRM/i915 unbind()
      [  137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
      [  137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
      [  137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)
      
      So the "devres_release_group(master->parent, NULL)" ends up freeing the
      pcim_iomap allocation. Upon next runtime resume, the audio driver will
      cause a page fault as the iomap alloc was released without the driver
      knowing about it.
      
      Fix this issue by using the "struct master" pointer as identifier for
      the devres group, and by closing the devres group after
      the master->ops->bind() call is done. This allows devres allocations
      done by the driver acting as master to be isolated from the binding state
      of the aggregate driver. This modifies the logic originally introduced in
      commit 9e1ccb4a ("drivers/base: fix devres handling for master device")
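
      Why closing the group matters can be sketched with a toy resource
      list. The model below is an assumption-laden illustration: the toy_*
      names are hypothetical, and it only mimics the rule that releasing a
      group frees what was added between its open marker and its close
      marker, or up to the end of the list if the group was never closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ENTRIES 16

enum toy_kind { TOY_RES, TOY_GRP_OPEN, TOY_GRP_CLOSE };

struct toy_entry {
    enum toy_kind kind;
    const void *id; /* group identifier; unused for plain resources */
    bool released;
};

struct toy_devres {
    struct toy_entry e[TOY_MAX_ENTRIES];
    size_t n;
};

/* Append an entry and return its index. */
static size_t toy_add(struct toy_devres *d, enum toy_kind kind, const void *id)
{
    assert(d->n < TOY_MAX_ENTRIES);
    d->e[d->n].kind = kind;
    d->e[d->n].id = id;
    d->e[d->n].released = false;
    return d->n++;
}

/* Release every resource that belongs to the group identified by id. */
static void toy_release_group(struct toy_devres *d, const void *id)
{
    size_t i = 0;

    /* Find the group's open marker. */
    while (i < d->n && !(d->e[i].kind == TOY_GRP_OPEN && d->e[i].id == id))
        i++;

    /* Release until the matching close marker, or the end if still open. */
    for (i++; i < d->n; i++) {
        if (d->e[i].kind == TOY_GRP_CLOSE && d->e[i].id == id)
            break;
        if (d->e[i].kind == TOY_RES)
            d->e[i].released = true;
    }
}
```

      With the group closed right after bind, a later allocation such as the
      iomap in the log above stays outside the group and survives the
      release; with the group left open, it is released along with the bind
      resources.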
      
      Fixes: 9e1ccb4a ("drivers/base: fix devres handling for master device")
      Cc: stable@vger.kernel.org
      Acked-by: Imre Deak <imre.deak@intel.com>
      Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
      BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
      Link: https://lore.kernel.org/r/20211013161345.3755341-1-kai.vehmanen@linux.intel.com
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. Oct 20, 2021
  13. Oct 15, 2021
    • topology: Represent clusters of CPUs within a die · c5e22fef
      Jonathan Cameron authored
      Both ACPI and DT provide the ability to describe additional layers of
      topology between that of individual cores and higher level constructs
      such as the level at which the last level cache is shared.
      In ACPI this can be represented in PPTT as a Processor Hierarchy
      Node Structure [1] that is the parent of the CPU cores and in turn
      has a parent Processor Hierarchy Node Structure representing
      a higher level of topology.
      
      For example, Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
      cluster has 4 CPUs. All clusters share L3 cache data, but each cluster
      has a local L3 tag. In addition, each cluster shares some
      internal system bus.
      
      +-----------------------------------+                          +---------+
      |  +------+    +------+             +--------------------------+         |
      |  | CPU0 |    | CPU1 |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   +----+    L3     |         |         |
      |  +------+    +------+   cluster   |    |    tag    |         |         |
      |  | CPU2 |    | CPU3 |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   |    |    L3     |         |         |
      |  +------+    +------+             +----+    tag    |         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |   L3    |
                                                                     |   data  |
      +-----------------------------------+                          |         |
      |  +------+    +------+             |    +-----------+         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             +----+    L3     |         |         |
      |                                   |    |    tag    |         |         |
      |  +------+    +------+             |    |           |         |         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             +--------------------------+         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   +----+    L3     |         |         |
      |  +------+    +------+             |    |    tag    |         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |   +-----------+          |         |
      |  +------+    +------+             |   |           |          |         |
      |                                   |   |    L3     |          |         |
      |  +------+    +------+             +---+    tag    |          |         |
      |  |      |    |      |             |   |           |          |         |
      |  +------+    +------+             |   +-----------+          |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |  +-----------+           |         |
      |  +------+    +------+             |  |           |           |         |
      |                                   |  |    L3     |           |         |
      |  +------+    +------+             +--+    tag    |           |         |
      |  |      |    |      |             |  |           |           |         |
      |  +------+    +------+             |  +-----------+           |         |
      |                                   |                          +---------+
      +-----------------------------------+
      
      That means spreading tasks among clusters will bring more bandwidth
      while packing tasks within one cluster will lead to smaller cache
      synchronization latency. So both kernel and userspace will have
      a chance to leverage this topology to deploy tasks accordingly to
      achieve either smaller cache latency within one cluster or an even
      distribution of load among clusters for higher throughput.
      
      This patch exposes cluster topology to both kernel and userspace.
      Libraries like hwloc can discover clusters through cluster_cpus and
      related sysfs attributes. A PoC of hwloc support is at [2].

      Note that this patch only handles the ACPI case.
      
      Special consideration is needed for SMT processors, where it is
      necessary to move 2 levels up the hierarchy from the leaf nodes
      (thus skipping the processor core level).
      
      Note that arm64 / ACPI does not provide any means of identifying
      a die level in the topology but that may be unrelated to the cluster
      level.
      
      [1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
          structure (Type 0)
      [2] https://github.com/hisilicon/hwloc/tree/linux-cluster
      
      
      
      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
  14. Oct 12, 2021
    • regmap: Fix possible double-free in regcache_rbtree_exit() · 55e6d803
      Yang Yingliang authored
      
      In regcache_rbtree_insert_to_block(), when the 'present' realloc fails,
      'blk', which was supposed to be assigned to 'rbnode->block', is freed,
      so 'rbnode->block' points to freed memory. In the error handling path of
      regcache_rbtree_init(), 'rbnode->block' is then freed again in
      regcache_rbtree_exit(), and KASAN reports a double-free as follows:
      
      BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
      Call Trace:
       slab_free_freelist_hook+0x10d/0x240
       kfree+0xce/0x390
       regcache_rbtree_exit+0x15d/0x1a0
       regcache_rbtree_init+0x224/0x2c0
       regcache_init+0x88d/0x1310
       __regmap_init+0x3151/0x4a80
       __devm_regmap_init+0x7d/0x100
       madera_spi_probe+0x10f/0x333 [madera_spi]
       spi_probe+0x183/0x210
       really_probe+0x285/0xc30
      
      To fix this, move the assignment of rbnode->block up to immediately after
      the reallocation has succeeded, so that the data structure stays valid
      even if the second reallocation fails.
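
      The ordering of the fix can be sketched in plain C with realloc().
      This is a minimal illustration under stated assumptions: the toy_*
      names and the toy_fail_present test hook are hypothetical, and the
      struct is a simplified stand-in for the regcache rbtree node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_node {
    unsigned char *block;   /* register value storage */
    unsigned char *present; /* one bit per register */
    size_t blklen;
};

static int toy_fail_present; /* test hook: force the second realloc to fail */

/* Grow the node; publish each pointer as soon as its realloc succeeds. */
static int toy_grow(struct toy_node *n, size_t new_len)
{
    unsigned char *blk, *present;

    assert(new_len > 0);

    blk = realloc(n->block, new_len);
    if (!blk)
        return -1;
    n->block = blk; /* the node owns blk from here on, even on later failure */

    present = toy_fail_present ? NULL : realloc(n->present, (new_len + 7) / 8);
    if (!present)
        return -1; /* no free(blk) here: toy_exit() frees it exactly once */
    n->present = present;

    n->blklen = new_len;
    return 0;
}

/* Tear the node down; each buffer is freed once. */
static void toy_exit(struct toy_node *n)
{
    free(n->block);
    free(n->present);
    n->block = NULL;
    n->present = NULL;
    n->blklen = 0;
}
```

      After a failed toy_grow(), the node still holds a valid block pointer,
      so the single free in toy_exit() is the only one.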
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Fixes: 3f4ff561 ("regmap: rbtree: Make cache_present bitmap per node")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
      
      
      Signed-off-by: Mark Brown <broonie@kernel.org>
  15. Oct 07, 2021
  16. Oct 06, 2021
  17. Oct 05, 2021
  18. Sep 28, 2021
  19. Sep 23, 2021
  20. Sep 21, 2021
  21. Sep 16, 2021
    • software node: balance refcount for managed software nodes · 5aeb05b2
      Laurentiu Tudor authored
      
      On KOBJ_REMOVE, software_node_notify() drops the refcount twice on
      managed software nodes, which leads to underflow errors. Balance the
      refcount by bumping it in the device_create_managed_software_node()
      function.
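
      The imbalance can be shown with a toy reference counter. The sketch
      below assumes hypothetical toy_* names and a plain integer count; it
      mirrors only the get/put arithmetic described above, not the kobject
      or software node API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_swnode {
    int refcount;
    bool managed;
    bool underflow; /* set when a put is attempted at refcount 0 */
};

static void toy_get(struct toy_swnode *n)
{
    n->refcount++;
}

static void toy_put(struct toy_swnode *n)
{
    if (n->refcount == 0) {
        n->underflow = true; /* the kernel would warn here */
        return;
    }
    n->refcount--;
}

/* balance == true adds the extra reference that the fix introduces. */
static void toy_create_managed(struct toy_swnode *n, bool balance)
{
    assert(n != NULL);
    n->refcount = 1; /* reference held from node creation */
    n->managed = true;
    n->underflow = false;
    if (balance)
        toy_get(n);
}

/* Removal notification: managed nodes are dropped twice. */
static void toy_notify_remove(struct toy_swnode *n)
{
    if (n->managed)
        toy_put(n);
    toy_put(n);
}
```

      Without the extra reference, the second put runs at a count of zero;
      with it, the two puts bring the count down to exactly zero.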
      
      The error [1] was encountered after adding a .shutdown() op to our
      fsl-mc-bus driver.
      
      [1]
      pc : refcount_warn_saturate+0xf8/0x150
      lr : refcount_warn_saturate+0xf8/0x150
      sp : ffff80001009b920
      x29: ffff80001009b920 x28: ffff1a2420318000 x27: 0000000000000000
      x26: ffffccac15e7a038 x25: 0000000000000008 x24: ffffccac168e0030
      x23: ffff1a2428a82000 x22: 0000000000080000 x21: ffff1a24287b5000
      x20: 0000000000000001 x19: ffff1a24261f4400 x18: ffffffffffffffff
      x17: 6f72645f726f7272 x16: 0000000000000000 x15: ffff80009009b607
      x14: 0000000000000000 x13: ffffccac16602670 x12: 0000000000000a17
      x11: 000000000000035d x10: ffffccac16602670 x9 : ffffccac16602670
      x8 : 00000000ffffefff x7 : ffffccac1665a670 x6 : ffffccac1665a670
      x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff
      x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff1a2420318000
      Call trace:
       refcount_warn_saturate+0xf8/0x150
       kobject_put+0x10c/0x120
       software_node_notify+0xd8/0x140
       device_platform_notify+0x4c/0xb4
       device_del+0x188/0x424
       fsl_mc_device_remove+0x2c/0x4c
       __fsl_mc_device_remove+0x14/0x2c
       device_for_each_child+0x5c/0xac
       dprc_remove+0x9c/0xc0
       fsl_mc_driver_remove+0x28/0x64
       __device_release_driver+0x188/0x22c
       device_release_driver+0x30/0x50
       bus_remove_device+0x128/0x134
       device_del+0x16c/0x424
       fsl_mc_bus_remove+0x8c/0x114
       fsl_mc_bus_shutdown+0x14/0x20
       platform_shutdown+0x28/0x40
       device_shutdown+0x15c/0x330
       __do_sys_reboot+0x218/0x2a0
       __arm64_sys_reboot+0x28/0x34
       invoke_syscall+0x48/0x114
       el0_svc_common+0x40/0xdc
       do_el0_svc+0x2c/0x94
       el0_svc+0x2c/0x54
       el0t_64_sync_handler+0xa8/0x12c
       el0t_64_sync+0x198/0x19c
      ---[ end trace 32eb1c71c7d86821 ]---
      
      Fixes: 151f6ff7 ("software node: Provide replacement for device_add_properties()")
      Reported-by: Jon Nettleton <jon@solid-run.com>
      Suggested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Cc: 5.12+ <stable@vger.kernel.org> # 5.12+
      [ rjw: Fix up the software_node_notify() invocation ]
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>