Solaris 11.4 is suddenly hanging forever when I boot -- how do I diagnose and fix this?
Esteemed Colleagues:
Solaris 11.4 is giving me grief. It used to work. Now it hangs forever when I try to boot it. I am able to boot single-user, by modifying the GRUB menu either by adding "-s" to the multiboot line, or by adding "-m milestone=single-user". I booted single-user and permanently modified the GRUB menu, adding "-v -m verbose", hoping that this would produce useful information, but it did not. I reboot (with or without "-p" makes no difference) and after a while I type ESC to see the startup messages. The last thing I see is:
There is no error message, and I suspect that the problem is not with /lib/svc/method/fs-local but with whatever runs next. I cannot determine what is making it hang. The only way out is to power-cycle, which of course one tries to avoid. I power-cycle, reboot once again single-user, and once there I type "svcadm milestone -s -T 200 all", hoping that this will show me the point at which it hangs. It does not. It says "svcadm: Unexpected libscf error on line 2380: entity not found." and I remain in single-user mode. If I then type "exit", the behavior seems to depend on how I got into single-user mode. If I got there by way of "-s" rather than "-m milestone=single-user", I see:
logout
svc.startd: Returning to milestone all.
This Solaris instance has UUID 216ca207-d4e1-47b8-85a8-ae7027494944
dump device is /dev/zvol/dsk/rpool/dump size 7 GB (8130 MB)
Reading ZFS config: done.
Mounting ZFS filesystems: (55/55)
and then it hangs, and there is nothing to do but to power-cycle. At this point I must have done a cold restart at least 20 times, which can't be good. I actually have to lean on the power button. If I just touch it gently, it says:
WARNING: Power off requested from power button or SC, powering down the system!
and then, a few minutes later:
WARNING: Failed to shut down the system!
I should just give up on Solaris once and for all -- I have been tempted for years to do that -- but I have been using Solaris since 1983, and it is like giving up on a bad marriage that has already lasted many years. You keep on thinking that if you just invest a little more time, the thing that needed to be fixed will be finally fixed, and you will be happy again. So we beat on, boats against the current, borne back ceaselessly into the past. How do I diagnose and fix this problem? Thank you in advance for any and all replies.
Hmmm … the views are piling up, and nobody's answering, so let me try, although my knowledge of Solaris is pretty near zero.
By booting into single user mode, you're getting in, and it's mounting /, so that's a start. But init hasn't run. You apparently haven't any worthwhile backup, so it's repair or reinstall. Have you done the basics?
Set a basic PATH variable.
Do an fsck of the root drive. You can run it on the mounted filesystem if you add a 'read-only' option; things go badly wrong if you don't. You just need to know that / is not your problem.
You then have to run down the init scripts. I'm presuming you have some clue where they are. In BSD it's /etc/rc.d; in Linux generally, it's /etc/rc.d/init.d/ with symlinks to them from the various runlevels.
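On the init-script point: as far as I understand it, Solaris 10 and later hand most of that job to SMF (the svc.startd seen in the console output above), so the nearest equivalent to reading rc scripts is looking at service states. Here is a minimal Python sketch, assuming you have captured the text of `svcs -a` from a shell on the machine. It lists instances that are in transition (svcs appends an asterisk to the state) or otherwise not settled. The function name, the SETTLED set, and the sample listing are all made up for illustration and are not taken from the poster's system.

```python
# Sketch only: filter text captured from `svcs -a` for instances that
# have not settled. SAMPLE is a made-up listing, not real output.

# States treated as "nothing to see here" (an assumption for this sketch).
SETTLED = {"online", "disabled", "legacy_run"}

SAMPLE = """\
STATE          STIME    FMRI
online         10:01:52 svc:/example/fine:default
disabled       10:01:40 svc:/example/off:default
offline*       10:01:55 svc:/example/stuck:default
offline        10:01:40 svc:/example/waiting:default
maintenance    Jan_05   svc:/example/broken:default
"""


def unsettled_services(svcs_output):
    """Return (state, fmri) pairs for instances that are in transition
    (state ends in '*') or in any state outside SETTLED."""
    found = []
    for line in svcs_output.splitlines():
        fields = line.split()
        # Expect three columns (STATE STIME FMRI); skip header and blanks.
        if len(fields) != 3 or fields[0] == "STATE":
            continue
        state, _stime, fmri = fields
        if state.endswith("*") or state not in SETTLED:
            found.append((state, fmri))
    return found
```

Feeding it real `svcs -a` output instead of SAMPLE would be the actual use. Anything reported as `offline*` seems a reasonable place to start looking, and `svcs -xv` plus the matching log under /var/svc/log should say more about it.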
If / is the problem, you might be able to check it as follows. On the kernel boot line, add
init=/sbin/fsck -options <your root drive>.
The options would be: check it anyhow, even if it looks clean; repair any errors you meet; yes to any prompts. You are giving it complete authority to attack your disk, but do you have a better idea? It will sit there at the end, and you can reboot.
We need a laugh, or a success story. Report back either way!