[Tesla] [G92GLM] WARNING: CPU: 4 PID: 2360 at lib/refcount.c:28 refcount_warn_saturate+0xab/0xf0
With recent kernels (4.x and 5.x), my Quadro card is barely usable.
This happens on a Dell Precision M6500 with vanilla kernel 5.6.18.
[30843.780742] ------------[ cut here ]------------
[30843.780743] refcount_t: underflow; use-after-free.
[30843.780763] WARNING: CPU: 4 PID: 2360 at lib/refcount.c:28 refcount_warn_saturate+0xab/0xf0
[30843.780763] Modules linked in: ebtable_filter ebtables tun xt_nat veth xt_conntrack nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter ip6table_nat ip6table_mangle ip6table_filter ip6_tables xt_CHECKSUM iptable_mangle xt_comment xt_tcpudp bridge xt_MASQUERADE iptable_nat iptable_filter ip_tables x_tables bpfilter nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv4 crypto_simd cryptd glue_helper xts 8021q garp mrp stp llc xfs rfcomm fuse cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel bluetooth ecdh_generic ecc ext4 mbcache jbd2 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc sch_fq_codel iwldvm snd_hda_codec_idt snd_hda_codec_generic mac80211 ledtrig_audio joydev iTCO_wdt gpio_ich iTCO_vendor_support snd_hda_intel snd_intel_dspcfg iwlwifi snd_hda_codec intel_powerclamp coretemp snd_hda_core dell_wmi kvm_intel snd_hwdep dell_smbios dell_smm_hwmon snd_pcm kvm irqbypass pcspkr
[30843.780815] psmouse tifm_7xx1 input_leds sparse_keymap dell_wmi_descriptor i7core_edac i2c_i801 tifm_core lpc_ich cfg80211 snd_timer tg3 snd tpm_tis tpm_tis_core tpm zfs(PO) zunicode(PO) zlua(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) ipv6 crc_ccitt nf_defrag_ipv6 uas hid_generic pcmcia crc32c_intel serio_raw sdhci_pci cqhci sdhci firewire_ohci xhci_pci mmc_core firewire_core yenta_socket xhci_hcd ehci_pci ehci_hcd nouveau mxm_wmi i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm agpgart
[30843.780852] CPU: 4 PID: 2360 Comm: Xorg Tainted: P O 5.6.0-sabayon #1
[30843.780853] Hardware name: Dell Inc. Precision M6500 /0R1203, BIOS A10 06/04/2013
[30843.780857] RIP: 0010:refcount_warn_saturate+0xab/0xf0
[30843.780860] Code: 05 56 12 28 01 01 e8 46 ca c0 ff 0f 0b c3 80 3d 44 12 28 01 00 75 90 48 c7 c7 20 19 2f 82 c6 05 34 12 28 01 01 e8 27 ca c0 ff <0f> 0b c3 80 3d 23 12 28 01 00 0f 85 6d ff ff ff 48 c7 c7 78 19 2f
[30843.780861] RSP: 0018:ffffc90000fa7d08 EFLAGS: 00010286
[30843.780863] RAX: 0000000000000000 RBX: 0000000000000004 RCX: 0000000000000007
[30843.780864] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff88840bd19c10
[30843.780865] RBP: 000000000000000a R08: 000000000000051e R09: 0000000000000001
[30843.780867] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88829de70c00
[30843.780868] R13: ffff8884056c8000 R14: ffffc90000fa7e08 R15: ffffffffa0278ea0
[30843.780870] FS: 00007f1028a6f940(0000) GS:ffff88840bd00000(0000) knlGS:0000000000000000
[30843.780872] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30843.780873] CR2: 00003b1f0b8b9000 CR3: 00000003bda2c000 CR4: 00000000000006e0
[30843.780874] Call Trace:
[30843.780914] nouveau_gem_new+0x10b/0x120 [nouveau]
[30843.780944] ? nouveau_gem_new+0x120/0x120 [nouveau]
[30843.780968] nouveau_gem_ioctl_new+0x4e/0xc0 [nouveau]
[30843.780984] drm_ioctl_kernel+0xa7/0xf0 [drm]
[30843.780997] drm_ioctl+0x1f3/0x3b0 [drm]
[30843.781026] ? nouveau_gem_new+0x120/0x120 [nouveau]
[30843.781056] nouveau_drm_ioctl+0x60/0x15b0 [nouveau]
[30843.781061] ksys_ioctl+0x81/0xc0
[30843.781064] __x64_sys_ioctl+0x11/0x20
[30843.781068] do_syscall_64+0x4d/0x190
[30843.781071] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[30843.781074] RIP: 0033:0x7f1029117667
[30843.781076] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 ed c0 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f9 17 0d 00 f7 d8 64 89 01 48
[30843.781077] RSP: 002b:00007ffe96357758 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[30843.781080] RAX: ffffffffffffffda RBX: 0000562c53b92c80 RCX: 00007f1029117667
[30843.781081] RDX: 00007ffe963577b0 RSI: 00000000c0306480 RDI: 000000000000000c
[30843.781083] RBP: 00007ffe963577b0 R08: 0000000000000000 R09: 000000000000000c
[30843.781084] R10: 0000000000000030 R11: 0000000000000246 R12: 00000000c0306480
[30843.781085] R13: 000000000000000c R14: 0000562c528d4220 R15: 0000562c520cf1b0
[30843.781089] ---[ end trace 0f65c5f7ebdc9bbd ]---
[33664.882517] [TTM] Buffer eviction failed
# lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation G92GLM [Quadro FX 2800M] (rev a2)
# uname -a
Linux ironlight2 5.6.0-sabayon #1 SMP Tue Jun 23 13:11:32 UTC 2020 x86_64 Intel(R) Core(TM) i7 CPU Q 820 @ 1.73GHz GenuineIntel GNU/Linux
I have noticed that this seems to happen more frequently when little RAM is free, or when I use LibreOffice (possibly for the same reason).
It may be related to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1806239
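
For anyone who wants to check the memory-pressure observation on their own machine, here is a minimal sketch (not something I ran for this report) that pulls the timestamps of the two messages above out of saved `dmesg` output and reads `MemAvailable` from the text of `/proc/meminfo`. The function names `find_events` and `parse_mem_available` are made up for this example, and the sample strings are only illustrative; the log lines are copied from the trace above, while the meminfo value is invented.

```python
import re

# Matches a dmesg line of the form "[30843.780743] message text".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Substrings taken from the log in this report.
PATTERNS = {
    "refcount_underflow": "refcount_t: underflow",
    "ttm_eviction_failed": "[TTM] Buffer eviction failed",
}


def find_events(log_text):
    """Return (timestamp_seconds, event_name) for every matching log line."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        timestamp, message = float(match.group(1)), match.group(2)
        for name, needle in PATTERNS.items():
            if needle in message:
                events.append((timestamp, name))
    return events


def parse_mem_available(meminfo_text):
    """Return MemAvailable in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


# Two log lines copied from the report; the meminfo value is a made-up sample.
SAMPLE_LOG = """\
[30843.780743] refcount_t: underflow; use-after-free.
[33664.882517] [TTM] Buffer eviction failed
"""
SAMPLE_MEMINFO = "MemTotal:       16384000 kB\nMemAvailable:     512000 kB\n"

for timestamp, name in find_events(SAMPLE_LOG):
    print(timestamp, name)
print("MemAvailable (kB):", parse_mem_available(SAMPLE_MEMINFO))
```

To use it on a live system, pass the saved output of `dmesg` to `find_events` and the contents of `/proc/meminfo` to `parse_mem_available`, and record both each time the warning shows up.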