LinuxQuestions.org


MadMartian 02-14-2024 03:59 AM

Failed to start User Manager - Spurious SystemD / CGroups Issue
 
I have run into a strange issue that makes no sense to me, primarily because I haven't changed anything (AFAIK) that would brick my Ubuntu system. From what I can tell, the root cause is related to CGroups. Unfortunately, I don't understand enough about SystemD, CGroups, or D-Bus to know where to begin troubleshooting, or even what it is I'm trying to fix. All I really know is that virtually every user SystemD service (e.g. systemctl --user start xyz) fails to start, and the error messages are fairly consistent:

Code:

user@1000.service: Failed to enable/disable controllers on cgroup /user.slice/user-1000.slice/user@1000.service, ignoring: Permission denied
user@1000.service: Main process exited, code=exited, status=219/CGROUP

This is on Ubuntu 22.04 and I've tried the following Kernel versions so far:
- v6.5.0-17-generic
- v6.5.0-15-generic

systemctl --failed yields:
Code:

  UNIT                        LOAD  ACTIVE SUB    DESCRIPTION                                                   
● apcupsd.service            loaded failed failed UPS power management daemon
● apply-cgroups.service      loaded failed failed Apply CGroup settings
● dnsmasq.service            loaded failed failed dnsmasq - A lightweight DHCP and caching DNS server
● grub-common.service        loaded failed failed Record successful boot for GRUB
● kerneloops.service          loaded failed failed Tool to automatically collect and submit kernel crash signatures
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● rpc-statd.service          loaded failed failed NFS status monitor for NFSv2/3 locking.
● user@1000.service          loaded failed failed User Manager for UID 1000
● user@1004.service          loaded failed failed User Manager for UID 1004
● user@130.service            loaded failed failed User Manager for UID 130
● vmware-tools.service        loaded failed failed VMWare Tools Service

The apply-cgroups.service unit is one I wrote myself. Its contents are below (I have since disabled the service, but haven't yet tested whether that changes anything):

Code:

[Unit]
Description=Apply CGroup settings

[Service]
Type=oneshot
ExecStart=cgcreate -g cpu,cpuacct:/foo-app
ExecStart=cgset -r cpu.cfs_quota_us=750000 /foo-app
ExecStart=cgset -r memory.limit_in_bytes=8G /foo-app

[Install]
WantedBy=multi-user.target
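
For reference, the CPU figure in this unit can be read as a share of one CPU. A minimal sketch of the arithmetic, assuming the v1 default of 100000 µs for cpu.cfs_period_us (the unit above does not set it); cfs_quota_to_percent is a made-up helper name, not an existing tool:

```shell
# Convert a CFS quota into a percentage of a single CPU.
# cfs_quota_to_percent is a hypothetical helper used only for illustration.
cfs_quota_to_percent() {
    # $1 = cpu.cfs_quota_us, $2 = cpu.cfs_period_us (assumed default: 100000)
    echo "$(( $1 * 100 / ${2:-100000} ))"
}

cfs_quota_to_percent 750000
```

With the value from the unit this prints 750, i.e. 7.5 CPUs' worth of run time per period. systemd expresses the same kind of limit as CPUQuota= (a percentage of one CPU) and a memory cap as MemoryMax=, which might be an alternative to calling cgcreate/cgset directly on a v2 system, though that is untested here.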

Below are the contents of /sys/fs/cgroup. According to my research, files such as tasks, release_agent and notify_on_release indicate that what is mounted there is a CGroups v1 hierarchy, not v2:

Code:

-rw-r--r--  1 root root 0 Feb 14 01:17 cgroup.clone_children
-rw-r--r--  1 root root 0 Feb 14 01:17 cgroup.procs
-r--r--r--  1 root root 0 Feb 14 01:17 cgroup.sane_behavior
drwxr-xr-x  2 root root 0 Feb 14 01:20 dev-hugepages.mount
--w-------  1 root root 0 Feb 14 01:17 devices.allow
--w-------  1 root root 0 Feb 14 01:17 devices.deny
-r--r--r--  1 root root 0 Feb 14 01:17 devices.list
drwxr-xr-x  2 root root 0 Feb 14 01:20 dev-mqueue.mount
-rw-r--r--  1 root root 0 Feb 14 01:17 notify_on_release
drwxr-xr-x  2 root root 0 Feb 14 01:20 proc-fs-nfsd.mount
drwxr-xr-x  2 root root 0 Feb 14 01:20 proc-sys-fs-binfmt_misc.mount
-rw-r--r--  1 root root 0 Feb 14 01:17 release_agent
drwxr-xr-x  2 root root 0 Feb 14 01:20 sys-fs-fuse-connections.mount
drwxr-xr-x  2 root root 0 Feb 14 01:20 sys-kernel-config.mount
drwxr-xr-x  2 root root 0 Feb 14 01:20 sys-kernel-debug.mount
drwxr-xr-x  2 root root 0 Feb 14 01:20 sys-kernel-tracing.mount
drwxr-xr-x 99 root root 0 Feb 14 01:45 system.slice
-rw-r--r--  1 root root 0 Feb 14 01:17 tasks
drwxr-xr-x  5 root root 0 Feb 14 01:29 user.slice

cat /proc/filesystems reveals that both cgroup and cgroup2 are available:

Code:

...
nodev  cgroup
nodev  cgroup2
...
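
/proc/filesystems only shows what the kernel supports, not what is actually mounted. A minimal sketch for listing the cgroup filesystems mounted at a given path, assuming the usual /proc/self/mounts layout (device, mount point, type, ...); cgroup_mounts_at is a made-up helper name:

```shell
# Print the filesystem type of every cgroup mount at a given mount point,
# in the order they appear in the mounts table.
# cgroup_mounts_at is a hypothetical helper, not a standard tool.
cgroup_mounts_at() {
    # $1 = mount point to inspect
    # $2 = mounts table to read (defaults to /proc/self/mounts)
    awk -v mp="$1" '$2 == mp && ($3 == "cgroup" || $3 == "cgroup2") { print $3 }' \
        "${2:-/proc/self/mounts}"
}
```

Calling `cgroup_mounts_at /sys/fs/cgroup` prints one line per matching mount; more than one line would suggest that several cgroup filesystems are stacked on the same directory.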

During the re-installation of packages as a troubleshooting step, I have seen this error message periodically, and I'm afraid I don't even understand what this service is or what it's trying to achieve:

Code:

Failed to reload daemon: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)

And finally, following some troubleshooting advice, I ran systemd --user by hand and got this output, which may provide some clues:

Code:

02:32:20 > systemd --user
Cannot determine cgroup we are running in: No medium found
Failed to allocate manager object: No medium found

I don't know what a medium is, I don't know why it needs one, and I cannot find any documentation on the terminology either, hence why I am flummoxed and don't really know what it is I'm trying to fix. My key trouble here is that I don't really know the tools well enough to diagnose the root cause of this, so I have been spitballing fruitlessly, and that's why I'm reaching out here for answers.

--
UPDATE: Both CGroups v1 and v2 were mounted at `/sys/fs/cgroup`. I unmounted it once, which removed the v1 mount and left v2 in place. This resolves the issue temporarily, but it comes back on the next boot. What is mounting CGroups v1 at `/sys/fs/cgroup`?
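
One place worth ruling out is a static mount entry. A minimal sketch that filters an fstab-format file for entries of type cgroup (v1), assuming the standard field order of device, mount point, type; v1_cgroup_fstab_entries is a made-up helper name, and this only covers one possible source of the mount:

```shell
# Print fstab-style entries whose filesystem type is "cgroup" (v1).
# v1_cgroup_fstab_entries is a hypothetical helper used for illustration.
v1_cgroup_fstab_entries() {
    # $1 = file in fstab format (defaults to /etc/fstab)
    # Skip comment lines, then match on the third field (filesystem type).
    awk '!/^[[:space:]]*#/ && NF >= 3 && $3 == "cgroup"' "${1:-/etc/fstab}"
}
```

If `v1_cgroup_fstab_entries` prints nothing, other candidates to look at might be the kernel command line in /proc/cmdline (systemd honours systemd.unified_cgroup_hierarchy=0 there) and any boot-time script or package that calls mount -t cgroup.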

