nvidia-settings app won't launch #248119

Closed
Ashvith10 opened this issue Aug 9, 2023 · 9 comments

Comments

@Ashvith10
Contributor

Describe the bug

The nvidia-settings app no longer launches after I recently updated the channel and rebuilt the system.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Open a console
  2. Run nvidia-settings

Expected behavior

Nvidia's GUI app should launch without any issues

Screenshots

-

Additional context

Here's the output when I launch it from a terminal:

$ nvidia-settings 
Aborted (core dumped)

nvidia-smi works just fine:

$ nvidia-smi
Wed Aug  9 15:48:51 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce MX250           Off | 00000000:02:00.0 Off |                  N/A |
| N/A   47C    P8              N/A / ERR! |      1MiB /  2048MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2062      G   ...5g-gnome-shell-44.2/bin/gnome-shell        0MiB |
+---------------------------------------------------------------------------------------+

I've also tried vkcube:

$ vkcube
Selected GPU 1: NVIDIA GeForce MX250, type: DiscreteGpu

However, I've noticed that the driver is no longer listed in the About section of GNOME Settings.
[Screenshot: About section of GNOME Settings]

Notify maintainers

@Kiskae
@getchoo

Metadata

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
- system: `"x86_64-linux"`
 - host os: `Linux 6.1.43, NixOS, 23.05 (Stoat), 23.05.2561.9607b9149c9d`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.13.3`
 - channels(ashvith): `""`
 - channels(root): `"nixos-23.05"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
@Kiskae
Contributor

Kiskae commented Aug 9, 2023

It looks like Wayland is supposed to print a useful message before aborting, but for some reason that isn't happening.

After grabbing a debug-enabled build of the Wayland client library, I tracked the issue down to something similar to swaywm/sway#6717: the app doesn't properly check whether the offered Wayland interface is newer than it can support.

It should be fairly easy to create a patch for nvidia-settings to fix this.
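
The failure mode described above can be sketched in a few lines. This is a minimal model in Python, not the nvidia-settings code and not the eventual patch: `SUPPORTED_VERSION`, `bind_unchecked`, `bind_clamped`, and `has_unhandled_events` are hypothetical names, and the version numbers are assumed purely for illustration. The idea is that a client should bind a global at the lower of the version the compositor advertises and the version the client was written against; binding at the advertised version allows the compositor to send events the client has no handler for.

```python
# Hypothetical model of Wayland global version negotiation.
# Not taken from nvidia-settings; names and numbers are illustrative only.

SUPPORTED_VERSION = 3  # assumed: highest interface version the client has handlers for


def bind_unchecked(advertised: int) -> int:
    """Buggy pattern: bind at whatever version the compositor advertises."""
    return advertised


def bind_clamped(advertised: int, supported: int = SUPPORTED_VERSION) -> int:
    """Safer pattern: never bind above the version the client supports."""
    return min(advertised, supported)


def has_unhandled_events(bound: int, supported: int = SUPPORTED_VERSION) -> bool:
    """True if the compositor may send events the client cannot handle."""
    return bound > supported


if __name__ == "__main__":
    advertised = 4  # assumed: compositor offers a newer version than the client knows
    for bind in (bind_unchecked, bind_clamped):
        bound = bind(advertised)
        print(bind.__name__, bound, has_unhandled_events(bound))
```

In a C client, the clamped value is what would be passed as the version argument of `wl_registry_bind`.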

@doronbehar
Contributor

I personally experience this:

$ nvidia-smi
Unable to determine the device handle for GPU0000:02:00.0: Unknown Error

@Ashvith10
Contributor Author

@Kiskae Here's the GDB output:

gdb nvidia-settings 
GNU gdb (GDB) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from nvidia-settings...
(No debugging symbols found in nvidia-settings)
(gdb) run
Starting program: /nix/store/1k13j8z6wi3wxy4l8q3ik83156bk6gn1-system-path/bin/nvidia-settings 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libthread_db.so.1".
process 25283 is executing new program: /nix/store/rlhaiyy5lwxcbr3s9jvikgai3lg6q5yr-nvidia-settings-535.86.05/bin/.nvidia-settings-wrapped
warning: Loadable section ".dynstr" outside of ELF segments
  in /nix/store/rlhaiyy5lwxcbr3s9jvikgai3lg6q5yr-nvidia-settings-535.86.05/bin/.nvidia-settings-wrapped
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libthread_db.so.1".
[New Thread 0x7ffff57ff6c0 (LWP 25286)]
[New Thread 0x7fffeffff6c0 (LWP 25287)]
[New Thread 0x7ffff4ffe6c0 (LWP 25288)]
[New Thread 0x7fffef7fe6c0 (LWP 25289)]
[New Thread 0x7fffeeffd6c0 (LWP 25290)]

Thread 1 ".nvidia-setting" received signal SIGABRT, Aborted.
0x00007ffff7c08adc in __pthread_kill_implementation ()
   from /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libc.so.6
(gdb) 

@Kiskae
Contributor

Kiskae commented Aug 9, 2023

@Ashvith10 You'll need to run bt to see the stack, but that looks like the output I managed to reproduce.
If it is going through wl_closure_invoke into wl_abort, it should be fixed by #248128.

@doronbehar That is definitely not related to this issue. It looks like an internal error of libnvidia-ml.
I'd recommend checking your dmesg log to see if there were errors in the nvidia kernel module.
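
For anyone following the dmesg suggestion, here is a rough sketch of filtering the kernel log for NVIDIA-related lines. It assumes that messages from the NVIDIA kernel module contain `NVRM` or `nvidia`; the marker list and the function names `nvidia_lines` and `read_dmesg` are illustrative choices, not part of any NVIDIA or NixOS tool. Reading dmesg may require root on some systems, so the helper returns an empty string when the command cannot be run.

```python
import subprocess

# Assumed markers: lines from the NVIDIA kernel module usually mention one of these.
MARKERS = ("NVRM", "nvidia")


def nvidia_lines(log_text, markers=MARKERS):
    """Return the log lines that mention any marker, case-insensitively."""
    lowered = tuple(marker.lower() for marker in markers)
    return [
        line
        for line in log_text.splitlines()
        if any(marker in line.lower() for marker in lowered)
    ]


def read_dmesg():
    """Return the output of `dmesg`, or an empty string if it cannot be run."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in nvidia_lines(read_dmesg()):
        print(line)
```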

@Ashvith10
Contributor Author

Ashvith10 commented Aug 9, 2023

@Kiskae here's the output for bt:

(gdb) bt
#0  0x00007ffff7c08adc in __pthread_kill_implementation ()
   from /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libc.so.6
#1  0x00007ffff7bb9cb6 in raise ()
   from /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libc.so.6
#2  0x00007ffff7ba38ba in abort ()
   from /nix/store/46m4xx889wlhsdj72j38fnlyyvvvvbyb-glibc-2.37-8/lib/libc.so.6
#3  0x00007ffff7b34b5f in wl_abort ()
#4  0x00007ffff7b33948 in wl_closure_invoke ()
   from /nix/store/vcsb019z7yvj20k5gpdxxf2s8qzamy71-wayland-1.22.0/lib/libwayland-client.so.0
#5  0x00007ffff7b2fbc9 in dispatch_event.isra ()
   from /nix/store/vcsb019z7yvj20k5gpdxxf2s8qzamy71-wayland-1.22.0/lib/libwayland-client.so.0
#6  0x00007ffff7b31524 in wl_display_dispatch_queue_pending ()
   from /nix/store/vcsb019z7yvj20k5gpdxxf2s8qzamy71-wayland-1.22.0/lib/libwayland-client.so.0
#7  0x00007ffff7b31adf in wl_display_roundtrip_queue ()
   from /nix/store/vcsb019z7yvj20k5gpdxxf2s8qzamy71-wayland-1.22.0/lib/libwayland-client.so.0
#8  0x00007ffff7b3c43c in get_wayland_output_info ()
   from /nix/store/rlhaiyy5lwxcbr3s9jvikgai3lg6q5yr-nvidia-settings-535.86.05/lib/libnvidia-wayland-client.so.535.86.05
#9  0x000000000040b2f4 in wconn_get_wayland_output_info ()
#10 0x0000000000408d62 in main ()
(gdb) 

Looks like this is the exact same issue you were talking about.
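
The check suggested earlier in the thread (does the abort go through wl_closure_invoke into wl_abort?) can be automated for saved gdb output. The following is a small sketch under assumptions: it expects the frame format gdb printed above, `frame_names` and `aborts_in_wayland_dispatch` are hypothetical helper names, and the sample is an excerpt of frames #2 to #5 from the backtrace above with the `from ...` lines omitted.

```python
import re
from typing import List

# Matches gdb frame lines such as "#3  0x00007ffff7b34b5f in wl_abort ()".
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)\s*\(")


def frame_names(backtrace: str) -> List[str]:
    """Return function names from a gdb `bt` dump, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(2))
    return names


def aborts_in_wayland_dispatch(backtrace: str) -> bool:
    """True if wl_abort is called directly from wl_closure_invoke."""
    names = frame_names(backtrace)
    return any(
        inner == "wl_abort" and outer == "wl_closure_invoke"
        for inner, outer in zip(names, names[1:])
    )


# Excerpt of the backtrace posted above (frames #2 to #5).
SAMPLE = """\
#2  0x00007ffff7ba38ba in abort ()
#3  0x00007ffff7b34b5f in wl_abort ()
#4  0x00007ffff7b33948 in wl_closure_invoke ()
#5  0x00007ffff7b2fbc9 in dispatch_event.isra ()
"""

if __name__ == "__main__":
    print(frame_names(SAMPLE))
    print(aborts_in_wayland_dispatch(SAMPLE))
```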

@Kiskae
Contributor

Kiskae commented Aug 13, 2023

Fixed once this reaches nixos-23.05: https://nixpk.gs/pr-tracker.html?pr=248845

@Ashvith10
Contributor Author

@Kiskae thank you for your contribution 😸

@Kiskae
Contributor

Kiskae commented Aug 14, 2023

Just updated my personal machine and it rebuilt nvidia-settings, so you should be able to confirm whether the fix worked now.

@Ashvith10
Contributor Author

@Kiskae this issue has been resolved, so I'm closing it.
