[24.05] zfs.latestCompatibleLinuxPackages selects linux-libre; unable to load proprietary firmware #341867

Closed
happyalu opened this issue Sep 14, 2024 · 19 comments
Labels
0.kind: bug Something is broken

Comments

@happyalu
Contributor

Describe the bug

I'm using flakes to configure my NixOS (24.05) hosts. Updating from commit f1bad50 to e65aa83 caused my host to boot with a broken display.

It seems to be related to the kernel change from 6.6 to 6.10. I see this in the boot log:

Sep 14 20:53:34.224280 host1 kernel: [drm] amdgpu kernel modesetting enabled.
Sep 14 20:53:34.231271 host1 kernel: amdgpu: Virtual CRAT table created for CPU
Sep 14 20:53:34.231310 host1 kernel: amdgpu: Topology: Add CPU node
Sep 14 20:53:34.231386 host1 kernel: amdgpu 0000:07:00.0: enabling device (0006 -> 0007)
Sep 14 20:53:34.231620 host1 kernel: [drm] initializing kernel modesetting (RENOIR 0x1002:0x1638 0x1043:0x8809 0xC8).
Sep 14 20:53:34.231650 host1 kernel: [drm] register mmio base: 0xFCB00000
Sep 14 20:53:34.231674 host1 kernel: [drm] register mmio size: 524288
Sep 14 20:53:34.234376 host1 kernel: [drm] add ip block number 0 <soc15_common>
Sep 14 20:53:34.234407 host1 kernel: [drm] add ip block number 1 <gmc_v9_0>
Sep 14 20:53:34.234442 host1 kernel: [drm] add ip block number 2 <vega10_ih>
Sep 14 20:53:34.234461 host1 kernel: [drm] add ip block number 3 <psp>
Sep 14 20:53:34.234479 host1 kernel: [drm] add ip block number 4 <smu>
Sep 14 20:53:34.234499 host1 kernel: [drm] add ip block number 5 <dm>
Sep 14 20:53:34.234542 host1 kernel: [drm] add ip block number 6 <gfx_v9_0>
Sep 14 20:53:34.234561 host1 kernel: [drm] add ip block number 7 <sdma_v4_0>
Sep 14 20:53:34.234580 host1 kernel: [drm] add ip block number 8 <vcn_v2_0>
Sep 14 20:53:34.234599 host1 kernel: [drm] add ip block number 9 <jpeg_v2_0>
Sep 14 20:53:34.234624 host1 kernel: amdgpu 0000:07:00.0: amdgpu: Fetched VBIOS from VFCT
Sep 14 20:53:34.234825 host1 kernel: amdgpu: ATOM BIOS: 113-CEZANNE-018
Sep 14 20:53:34.234848 host1 kernel: 0000:07:00.0: Missing Free firmware (non-Free firmware loading is disabled)
Sep 14 20:53:34.234867 host1 kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <psp> failed -19
Sep 14 20:53:34.235271 host1 kernel: 0000:07:00.0: Missing Free firmware (non-Free firmware loading is disabled)
Sep 14 20:53:34.236270 host1 kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <dm> failed -19
Sep 14 20:53:34.236316 host1 kernel: 0000:07:00.0: Missing Free firmware (non-Free firmware loading is disabled)
Sep 14 20:53:34.237267 host1 kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <gfx_v9_0> failed -19
Sep 14 20:53:34.237337 host1 kernel: 0000:07:00.0: Missing Free firmware (non-Free firmware loading is disabled)
Sep 14 20:53:34.238298 host1 kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <sdma_v4_0> failed -19
Sep 14 20:53:34.238323 host1 kernel: [drm] VCN decode is enabled in VM mode
Sep 14 20:53:34.238349 host1 kernel: [drm] VCN encode is enabled in VM mode
Sep 14 20:53:34.238371 host1 kernel: 0000:07:00.0: Missing Free firmware (non-Free firmware loading is disabled)
Sep 14 20:53:34.239268 host1 kernel: [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <vcn_v2_0> failed -19
Sep 14 20:53:34.239301 host1 kernel: [drm] JPEG decode is enabled in VM mode
Sep 14 20:53:34.239326 host1 kernel: amdgpu 0000:07:00.0: amdgpu: Fatal error during GPU init
Sep 14 20:53:34.239693 host1 kernel: amdgpu 0000:07:00.0: amdgpu: amdgpu: finishing device.

I have reverted this back, but I'm not sure if I need to change anything in my host config to fix this, or just wait for a kernel update.

@happyalu happyalu added the 0.kind: bug Something is broken label Sep 14, 2024
@starkca90
Contributor

I could be misremembering, but I vaguely recall some directory change back in 6.9 or something and I got similar errors.

The root of my problem was that my boot.kernelPackages was using one repo while some other kernel line or module (I don't remember exactly which) was using a more up-to-date repo, because I was working around some other outdated package.

I ended up switching the entry that was using my "non-standard" package repo back to NixOS and was back in business.

@Atemu
Member

Atemu commented Sep 15, 2024

The default kernel was not updated to 6.10, so this must be caused by your config. If AMDGPU is broken, there's not much we can do about that but wait for the kernel devs to fix it.

@Atemu Atemu closed this as not planned Sep 15, 2024
@happyalu
Contributor Author

Thanks.

I had this in the config which was causing trouble.

boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/non-free-firmware-not-loaded-after-update/52183/1

@coreyoconnor
Contributor

This applies to more than just amdgpu. This also occurred on my laptop with Intel. Specifically the wifi module also could not locate the non-free firmware.

@coreyoconnor coreyoconnor reopened this Sep 15, 2024
@coreyoconnor
Contributor

E.g., the i915 module also reports the same "non-Free firmware loading is disabled" message.

@coreyoconnor
Contributor

Indeed. It looks like even if enableAllFirmware and enableRedistributableFirmware are true, the 6.10 kernel does not load any non-free firmware.

Not an amdgpu specific bug then
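
For reference, a minimal sketch of how the two firmware options mentioned above are typically set in a NixOS module. The `allowUnfree` line is an assumption about what `enableAllFirmware` needs on a given setup, not something stated in this thread. These options only control which firmware files get installed; they do not change the kernel's own loading policy, which would be consistent with the "non-Free firmware loading is disabled" message still appearing.

```nix
# Sketch only: firmware-related options, assuming a plain NixOS 24.05 module.
{ ... }:
{
  # Assumption: enableAllFirmware pulls in unfree firmware packages,
  # so unfree packages have to be allowed for the evaluation to succeed.
  nixpkgs.config.allowUnfree = true;

  # Firmware whose license permits redistribution.
  hardware.enableRedistributableFirmware = true;

  # Additionally installs firmware with more restrictive licenses.
  hardware.enableAllFirmware = true;
}
```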

@coreyoconnor
Contributor

Cool! Using the default linux kernel, 6.6.51, worked as expected.

That tells me this is specific to 6.10 and not some general breakage with nixpkgs.

I did look through the 6.9 and 6.10 changelogs and didn't see anything about firmware loading changes. But I didn't read through 6.8 and 6.7.

@jeff84

jeff84 commented Sep 16, 2024

Kernel 6.8.12 worked without a problem until 6.8 reached EOL.
The 6.10 kernel was the first with the -gnu suffix:

Linux zellat2nix 6.10.10-gnu #1-NixOS SMP PREEMPT_DYNAMIC Tue Jan  1 00:00:00 UTC 1980 x86_64 GNU/Linux

Could it be a new patchset which is used in 6.10 kernel?

@Atemu Atemu changed the title from "amdgpu error with kernel 6.10" to "kernel 6.10 cannot locate proprietary firmware" Sep 16, 2024
@Atemu
Member

Atemu commented Sep 16, 2024

@coreyoconnor is that on 24.05 or unstable?

@happyalu
Contributor Author

I had this on 24.05 with boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;

@coreyoconnor
Contributor

@coreyoconnor is that on 24.05 or unstable?

This is on 24.05. I can test on unstable if that is useful.

@jeff84

jeff84 commented Sep 16, 2024

I think it could be a problem with
boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;
For me it seems like this option picks the pkgs.linuxPackages-libre kernel package.

If I select
boot.kernelPackages = pkgs.linuxPackages_6_10;
it's working normally as expected.

~ % uname -a
Linux zellat2nix 6.10.10 #1-NixOS SMP PREEMPT_DYNAMIC Thu Sep 12 09:13:13 UTC 2024 x86_64 GNU/Linux
~ % lsmod| grep iwlwifi
iwlwifi               561152  1 iwlmvm
cfg80211             1347584  3 iwlmvm,iwlwifi,mac80211
firmware_class         57344  19 btrtl,snd_soc_avs,snd_hda_intel,intel_ipu6,xhci_pci_renesas,btmtk,snd_sof,drm_display_helper,intel_ipu6_isys,btintel,snd_soc_hdac_hda,btbcm,iwlwifi,btusb,mei_vsc_hw,xe,i915,cfg80211,intel_ishtp
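
A minimal sketch of that workaround as a complete module, assuming the rest of the host configuration lives elsewhere. The attribute name is the one quoted above; whether the ZFS module builds against that kernel series on a given nixpkgs revision is not verified here and should be checked before switching.

```nix
# Sketch only: pin the kernel package set explicitly instead of relying on
# config.boot.zfs.package.latestCompatibleLinuxPackages.
{ pkgs, ... }:
{
  # Attribute name taken from the comment above. It only exists while
  # nixpkgs still ships the 6.10 series, so this pin needs revisiting later.
  boot.kernelPackages = pkgs.linuxPackages_6_10;
}
```

The trade-off is that the kernel version no longer follows ZFS compatibility automatically, so the pin has to be updated by hand.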

@coreyoconnor
Contributor

coreyoconnor commented Sep 16, 2024 via email

@terrorbyte
Member

This also happened to me with https://releases.nixos.org/nixos/24.05/nixos-24.05.4974.8f7492cce289/nixexprs.tar.xz and with boot.kernelPackages = config.boot.zfs.package.latestCompatibleLinuxPackages;

jeff84's suggestion of manually selecting the package version appears to be a temporary fix: #341867 (comment)

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/non-free-firmware-not-loaded-after-update/52183/3

@flintflump

flintflump commented Sep 17, 2024

As suggested in #341867 (comment), it seems that the new logic for zfs' latestCompatibleLinuxPackages (introduced in 27b52ad) chooses a libre kernel:

Working commit (f4c846a):

nix-repl> pkgs.zfs_2_2.passthru.latestCompatibleLinuxPackages.kernel.isLibre
false

nix-repl> pkgs.zfs_2_2.passthru.latestCompatibleLinuxPackages.kernel.name
"linux-6.6.50"

Non-working commit (8f7492c):

nix-repl> pkgs.zfs_2_2.passthru.latestCompatibleLinuxPackages.kernel.isLibre
true

nix-repl> pkgs.zfs_2_2.passthru.latestCompatibleLinuxPackages.kernel.name
"linux-6.10.10"

It seems 34e1748 might be the culprit.
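
The same `isLibre` check can be turned into an evaluation-time guard. Below is a rough sketch, assuming the kernel derivation exposes `isLibre` as in the repl output above; the `or false` fallback and the message text are my own additions. With this in place, a rebuild should fail early if the selected kernel is linux-libre, rather than booting without firmware.

```nix
# Sketch only: refuse to build a system whose kernel is a linux-libre variant.
{ config, ... }:
{
  assertions = [
    {
      # isLibre is the attribute queried in the repl session above;
      # `or false` covers kernels that do not define it (an assumption).
      assertion = !(config.boot.kernelPackages.kernel.isLibre or false);
      message = ''
        boot.kernelPackages resolved to a linux-libre kernel, which will not
        load non-free firmware. Select a kernel package set explicitly.
      '';
    }
  ];
}
```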

@Atemu Atemu changed the title from "kernel 6.10 cannot locate proprietary firmware" to "[24.05] zfs.latestCompatibleLinuxPackages selects linux-libre; unable to load proprietary firmware" Sep 17, 2024
Atemu added a commit to Atemu/nixpkgs that referenced this issue Sep 17, 2024
This reverts commit 27b52ad.

Fixes NixOS#341867

We could select linux_6_10 now if we wanted but I opted for linux_6_6 instead
because this will be removed anyways.

This is technically a breaking change but latestCompatibleLinuxPackages has no
expectancy of stability in the first place.
@Atemu Atemu closed this as completed Sep 17, 2024
@Atemu
Member

Atemu commented Sep 17, 2024

The fix should be coming to a 24.05 channel near you in the coming days.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/no-more-sound-on-laptop-after-upgrade-tuxedo-infinitybook-pro-16-gen7/52296/6

8 participants