So, is the issue related to graphics on-demand frequency management or a similar feature?
will we ever know the answer to your question? who knows...
ORIGINAL REPORT:
I have already reported similar problems and everyone probably knows what I'm talking about. These Intel GPU hangs have been frequent in (at least) the past year, all with apparently different causes and workarounds, but in my case my laptop is still unusable.
what do i have to do? i have tried all possible kernel parameters, drivers and settings that you can find on the internet. most of that was VERY old information, since the problem is so old and so wide-ranging that you can get lost.
At the moment i'm on Arch Linux, using the latest linux-next 5.16 kernel from git (which should include a fix for hangs on Skylake too, but nothing changed) and the latest mesa 22.0 from git (it's been a while and it never made a difference).
#3437
old related issue i had a few months ago. at the time (in many applications including Firefox) the screen froze and i had to reboot. now it doesn't happen anymore in Firefox, and where it does happen, like in PCSX2, the screen freezes for just a second, then PCSX2 stays frozen but i can switch to other applications and do everything i want just fine. after a minute, more or less, PCSX2 crashes and closes. the same thing also happens when playing Minecraft.
it's like heavy GPU load triggers it, but i don't know if that makes any sense.
so here is an update. i've just built linux-drm-tip and while the GPU hangs are still there and frequent, the screen freezes for a few seconds at the worst but then it recovers.
just out of curiosity i updated linux-next (i was using a build from 21/01/2022) to the latest commits and the situation seems identical: only freezes, and nothing crashes anymore.
unfortunately disabling psr (as well as a LOT of other options) doesn't have any effect. also the gpu hangs are the same in 5.15 and 5.16 for me.
i will try again, just to be sure, even though the lack of help given by developers doesn't really make me want to do anything except buying new hardware.
End-user systems don't have i915 debugging enabled. That's the reason you were unable to get the file details.
Please share the dmesg log from when the issue occurred.
more updates. i've tested Minecraft, which was another game crashing badly. unfortunately it still crashes, but at least it does after a few gpu hangs where it freezes badly but recovers. eventually though the Minecraft window will get completely black and then the game crashes.
two days ago i updated linux-drm-tip again and nothing has changed. tried updating linux-next, but something went wrong and it doesn't boot, so i'll try again in some time.
are you kidding me? i have been doing all i can to try to get help here even if i'm not a developer or anything, i've posted several times in other issues and opened more than one, NEVER GOT A SINGLE ANSWER, i even linked another issue i opened in the first comment and if you had tried to read it you would already have the answer to your question about "other distributions".
UNBELIEVABLE! i see these problems in an infinite amount of distributions and platforms, CAN'T YOU SEE HOW MANY SIMILAR BUG REPORTS ARE HERE?? ARE YOU TELLING ME THAT IT'S ALL A COINCIDENCE AND MY PROBLEM IS DIFFERENT FROM ALL OF THOSE SAME REPORTS??
i have been in this situation for YEARS! it's already WAY beyond annoying, i won't ever buy Intel's hardware again because it has almost ruined my life since i can't do anything i would like to do with MY FUCKING LAPTOP!!! and all this during one of the worst periods i've ever lived.
instead of acting like a bitch tell me WHAT THE FUCK IS i915_error_state YOU FUCKING GENIUS!
i'm really disappointed by the support (NOT) given, you clearly haven't even read all i have posted in a fucking YEAR. NO ONE CARES ABOUT THIS, PROBABLY SINCE IT'S BETTER IF ME AND OTHERS AFFECTED JUST SHUT THE FUCK UP AND BUY NEW HARDWARE.
but yeah keep acting like you are doing me a favor just for giving me a useless answer, giving me NO HELP WHATSOEVER TO SOLVE THIS, suggesting to change distribution, like it would change anything. I HAVE BEEN IN DIFFERENT DISTRIBUTIONS FOR THE PAST TWO YEARS AND THE HARDWARE IS UNUSABLE. I HAVE USED ALL KERNELS AND DRIVERS POSSIBLE... AND YOU WOULD KNOW THAT IF YOU HAD SPENT A FEW MINUTES OF YOUR TIME READING.
i'm not the one who is working on this. i'm just one who wants to try to help himself and others in a similar situation, because it's already ridiculous and someone must do something since you all seem like you have better things to do than fix YOUR FUCKING SOFTWARE.
i don't know how to explain this more clearly than i already did, i'm fucking out of my mind right now thanks to your attitude and incompetence.
if someone is willing to help, tell me WHAT THE FUCK I HAVE TO DO, or else go to fucking hell. i'm already tired of all this, it's not fair at all... i just want to use my laptop... fuck off!
insulting people because you're upset is not a cool move; they also aren't your personal tech support, so expecting assistance for every issue is unreasonable. The developers and staff are limited in number and almost certainly have bigger, more important issues to tackle, such as bringing up support for GPUs that have zero support at all (that's even more "unusable"). That's an even bigger deal when you consider that the developers are likely on Intel's payroll. Aside from that...
First of all, take a look at the bug submission guidelines, and follow them if you wish to get more attention to your issue: wiki - How to file i915 bugs
Some key takeaways from that page:
Add more detailed platform information and steps to reproduce
Boot with drm.debug=0x1e log_buf_len=1M when replicating the issue, and upload the resultant dmesg. It would probably help to timestamp the files you upload by adding the current date to the name; that helps with identifying which files are relevant at a glance.
From a linked page: the error state containing the last batch buffer can be obtained (after the first hang) from /sys/class/drm/card0/error, or the older /sys/kernel/debug/dri/0/i915_error_state . Note that only the first error is collected (to my knowledge); to allow further errors to be collected, you can write to the error file after you've copied it (echo > /sys/class/drm/card0/error), however usually only the first error state is helpful.
If you are still using Xorg/X11, attaching /var/log/Xorg.0.log may be helpful as well. On a GNOME desktop, Xorg.0.log may be elsewhere, or the session may be using Wayland (I don't really know since I use MATE instead). It may be worthwhile to switch between X11 and Wayland, if GNOME supports it.
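The error-state step above can be sketched in Python. This is only an illustration under assumptions: the two paths are the ones named in this comment, the function name save_error_state and the output file name are made up for the example, and the "No error state collected" check reflects what the sysfs file appears to contain when no hang has been recorded (treat that as an assumption, not documented behavior). Reading the debugfs path normally requires root.

```python
from datetime import date
from pathlib import Path

# Candidate locations named in the comment above, newer interface first.
ERROR_STATE_PATHS = [
    Path("/sys/class/drm/card0/error"),
    Path("/sys/kernel/debug/dri/0/i915_error_state"),
]


def save_error_state(candidates=ERROR_STATE_PATHS, dest_dir=Path("."), today=None):
    """Copy the first usable error state to a date-stamped file.

    Returns the destination path, or None if no candidate held an error state.
    The helper name and the output file name are hypothetical.
    """
    today = today or date.today()
    for src in candidates:
        try:
            data = Path(src).read_bytes()
        except OSError:
            continue  # file missing, or no permission to read it
        # Assumption: an empty file or this placeholder text means no hang was recorded.
        if not data.strip() or data.startswith(b"No error state"):
            continue
        dest = Path(dest_dir) / f"i915_error_state_{today.isoformat()}.txt"
        dest.write_bytes(data)
        return dest
    return None
```

Calling save_error_state() right after the first hang and attaching the resulting file follows the date-stamping advice given above. If it returns None, either no hang has been recorded yet or neither file is readable on that system.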
See also the Arch Linux wiki page on Intel Graphics. Some points of interest:
In Mesa 20+, the iris OpenGL driver is enabled by default for Gen8+. You can force the older i965 driver with MESA_LOADER_DRIVER_OVERRIDE=i965. Export it as an environment variable before you start GNOME, and verify it by checking for the variable in a terminal shell within GNOME.
See here for additional troubleshooting steps regarding some graphical crashes.
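Since both the boot parameters and the Mesa override only help if they actually took effect, a quick check along these lines may save a wasted test run. It is a minimal sketch: the parameter strings and the variable name are the ones quoted above, while the helper names missing_boot_params and mesa_override are hypothetical.

```python
import os

# Boot parameters quoted from the bug-filing guidelines above.
WANTED_BOOT_PARAMS = ("drm.debug=0x1e", "log_buf_len=1M")


def missing_boot_params(cmdline, wanted=WANTED_BOOT_PARAMS):
    """Return the wanted parameters that are absent from a kernel command line string."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


def mesa_override(environ=None):
    """Return the value of MESA_LOADER_DRIVER_OVERRIDE, or None if it is unset."""
    environ = os.environ if environ is None else environ
    return environ.get("MESA_LOADER_DRIVER_OVERRIDE")
```

On a running system the command line can be read from /proc/cmdline, for example missing_boot_params(open("/proc/cmdline").read()). Run mesa_override() from a terminal inside the desktop session, since that is where the variable needs to be visible.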
One final point: I don't think this is an issue affecting all Skylake GPUs. At least, I've been using Kabylake and haven't had any issues with hangs except for very rare issues that I can't replicate, including the issue of mine you bumped. If all Skylake iGPUs were affected it would make a bigger wave; likely it's a platform bug specific to your hardware vendor or motherboard model. They may be Mesa bugs, rather than drm/i915 ones, so downgrading/upgrading Mesa may fix your issue. It may also be a bug in VA-API, although unlikely.
first of all, i appreciate your answer, i really do 'cause it's the best i've got yet.
i disagree with part of what you said, for example that the staff is "limited" and has other things to do. i mean, i trust your words on that, but we aren't talking about an independent project made of volunteers. i think Intel has sufficient funds to improve technical support.
For the part of Skylake etc... what about all of the other reports i linked below? for sure some are unrelated, but are you saying ALL of them are? there are some (even on different hardware) with the same exact behavior using the same exact applications.
do you know how much support some of those got? that's right, ZERO. not even an answer saying "hey mate that's not the way to report a bug and we can't help". no, just ignored like nothing happened.
your answer, as i was saying, is what i've been waiting for for almost a year now, and the fact that you imply that i want them to be my personal technical support is nonsense. again, this isn't the PCSX2 team we are talking about and i've tried to do my part in solving this problem, even if i'm just an average user, and all i got is an unbearable silence.
not fair at all if you ask me.
thank you for the information though, i'll see what i'm able to do.
just want to point out that some people (myself included) don't have anything at /sys/class/drm/card0/error and i don't know how to solve that, while i've never seen or read anywhere about the other /sys/kernel/debug/dri/0/i915_error_state.
i'm using wayland so no Xorg.0.log unfortunately.
about the Archwiki solutions, when i say that i've tried EVERYTHING i've read in all possible reports, threads and whatnot regarding GPU hangs i mean that i've literally tried EVERYTHING and nothing has ever made the slightest difference.
i perfectly know this bug isn't widespread in all Skylake hardware and i've never said it is, but i've said that it seems to happen on many different Intel platforms in the same ways.
i don't know where this bug is from, but i can tell you that i've tried to upgrade to latest mesa, kernel and whatever, but still here i am.
i'd gladly downgrade, but i don't know which version i should try first, especially because similar issues have been reported on all possible kernel and mesa versions, so... i have no clue.
i reported my issues to get this kind of answer as well, because i'm not asking people to magically make my laptop work, i'm asking for help to understand what i have to do in order to make it work, which is different in my opinion.
Sadly, I get here looking for the same answers to the same problems, and I'm a Linux-newb. I can poke around, get logs, etc. but I have to rely on others to create the fix. One thing I do understand is F-Bombing gets you sent to the ignore box by anyone researching the problem on your behalf. Anyone that puts up with a problem for over a year is a masochist. I would have shelved that computer and got another a long time ago.
My situation is a little different - I have to support computers that we've put in the field, and our "upgraded" application (which I'm testing) has the same errors "everyone" has reported. So far, I've tied it to the newest CPUs/GPUs from Intel - Gen 10 or better. This particular box runs an i3-1115G4 with a Gen 12 GPU - not something "normal/tested/sturdy/stable/well understood", and the fixed drivers take FOREVER to move through the Linux build chain/process. And, to top it off, my developer is building a custom OS on Yocto, so support is very thin.
So good luck with your computer, and know that Windows doesn't have this problem...lol.
first, if you pay more attention this thread has gotten a lot more activity after my "colorful" comments than it had in months and way more activity than countless similar threads. i'm not saying it's good to rant like that, but i think that it wasn't negative as well.
second, i'm not rich, my computer works and i care about the free software enough to spend years following development about this bug. it's not for everyone, that's for sure, but it's worth the pain.
i know windows doesn't have this problem, but it has a lot more than this, so...
good luck to you too and let's hope that what we are doing here will help you too.
Hi, I'm facing the same problem, and I can't believe this stupid bug is still not fixed. I also tried the i915.enable_psr=0 kernel parameter, but it doesn't help. The laptop is completely unusable because of the shitty intel driver.
May 06 07:22:31 laptop-gnu kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
May 06 14:11:40 laptop-gnu kernel: i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun
May 06 14:23:49 laptop-gnu kernel: i915 0000:00:02.0: Using 39-bit DMA addresses
May 06 17:49:29 laptop-gnu kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
May 06 17:49:29 laptop-gnu kernel: i915 0000:00:02.0: [drm] chrome[4686] context reset due to GPU hang
May 06 17:49:29 laptop-gnu kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in chrome [4686]
May 07 16:18:03 laptop-gnu kernel: i915 0000:00:02.0: [drm] *ERROR* CPU pipe A FIFO underrun
May 07 17:23:11 laptop-gnu kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
May 07 17:23:11 laptop-gnu kernel: i915 0000:00:02.0: [drm] kwin_wayland[2117] context reset due to GPU hang
May 07 17:23:11 laptop-gnu kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in kwin_wayland [2117]
May 08 03:21:18 laptop-gnu kernel: i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
May 08 03:21:18 laptop-gnu kernel: i915 0000:00:02.0: [drm] kwin_wayland[2117] context reset due to GPU hang
May 08 03:21:18 laptop-gnu kernel: i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85dffffb, in kwin_wayland [2117]
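For anyone comparing logs like the one above across reports, the hang lines can be tallied per process and error code with a short script. This is a rough sketch: the regular expression is written against the "GPU HANG: ecode ..., in NAME [PID]" lines shown in this log only (other kernel versions may print them differently), the ecode is treated as an opaque string, and summarize_hangs is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches lines such as:
#   [drm] GPU HANG: ecode 9:1:85dffffb, in chrome [4686]
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<proc>\S+) \[(?P<pid>\d+)\]"
)


def summarize_hangs(log_text):
    """Count GPU hang lines per (process name, ecode) pair in a kernel log."""
    return Counter((m["proc"], m["ecode"]) for m in HANG_RE.finditer(log_text))
```

Fed the log above, this gives one hang for chrome and two for kwin_wayland, all with ecode 9:1:85dffffb. A matching ecode across different reports may hint at a shared cause, although that alone does not prove it.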
I like intel CPUs, but I hate fucking intel graphics. Not only is the Linux driver shit, the Windows driver has problems too. For example, Vulkan games (running via DXVK) simply crash at loading.
So, never buy a laptop with a shitty integrated intel GPU; better to buy one with the cheapest, but normal, discrete video card from nvidia/amd. Intel GPU is trash. Intel developers don't care about their users, I actually think Intel GPU developers are sadomasochists, they love when users suffer.