[SOLVED] Raid problem, 1 of 4 disks dead, replaced with same, now what? Debian
Linux - Hardware. This forum is for Hardware issues. Having trouble installing a piece of hardware? Want to know if that peripheral is compatible with Linux?
Having looked at the Promise controller, and at the mdadm --detail output you posted earlier, I believe that drives 1 & 3 (sda and sdc) were in a hardware-controlled raid0 array, and that drives 2 & 4 (sdb and sdd) were in a software-controlled raid0 array. It could be that all four were then part of a raid10 array, although the data you posted does not indicate that.
I suggest you first attempt to get the Promise FastTrak array working in raid0 as previously configured, with just those 2 drives. If that is successful you should be able to access whatever data is on those 2 drives.
The command I gave creates a new raid6 or raid10 array without attempting to recover the config.
However, to see whether any data is available before you create new arrays, you can run "cat /proc/mdstat" and check whether the system has located any raid array information from the drives themselves.
No hardware RAID controller shows its member disks in lsblk like in your photos. With hardware RAID you just see one big block device in the OS, and if there is a way to do anything with it, it is vendor-specific.
"promise_fasttrak_array_member" and "isw_raid_member" (Intel) don't mean there is a hardware RAID controller behind it. This is fake raid, where you set several SATA ports to "RAID" and expect the OS drivers to do the rest. The best way to find out whether a system uses it, and to see its status, is simply "cat /proc/mdstat"; it should be filled with info about the md raids you have.
When two arrays are configured for the disks in the BIOS, most likely both are necessary. By default, an Intel fake raid under Linux looks like this on my home NAS:
md127 : inactive sdd[2](S) sda[1](S) sdb[0](S)
7944 blocks super external:imsm
The "Personalities" line lists the raid levels the kernel was compiled with and -could- run. After that, every md raid is shown:
md127 is the configuration container created by/for imsm (Intel Matrix Storage Manager); for Promise it will be something else. Do not change anything on that!
md126, in my case, is a normally configured mdadm softraid that uses the disks and references the external config in md127. All three members are up (UUU).
unused devices: <none>
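The mdstat output above can be read mechanically. A minimal sketch that pulls out each array's name and state; the here-string below uses the sample text above as a stand-in, since on the real system you would read /proc/mdstat directly:

```shell
# Minimal sketch: pull each array's name and state out of /proc/mdstat.
# Here a saved sample (the imsm example above) stands in for the real file;
# on the actual system use instead:  mdstat=$(cat /proc/mdstat)
mdstat='md127 : inactive sdd[2](S) sda[1](S) sdb[0](S)
      7944 blocks super external:imsm
unused devices: <none>'

# Array lines start with "mdN"; field 3 is the state (active/inactive).
states=$(printf '%s\n' "$mdstat" | awk '/^md/ { print $1, $3 }')
echo "$states"    # -> md127 inactive
```

An "inactive" array with all members marked (S) (spare), as here, is the container device; the data-carrying array appears as a separate mdN entry when assembled.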
--------------------
I am very nervous about the change in setting in the UEFI bios. In the UEFI under the
advanced mode...
advanced tab
SATA configuration and setup
port 1- port 4
there was a drop-down box with the options AHCI, RAID, and IDE. Originally it was set to AHCI and I changed it to RAID. My question is: should I leave it at RAID or change it back to AHCI?
Just a bit of information. Maybe useful or not. There was another computer set up at the same time by the same guy. This was a Windows 10 box with a raid configuration on an Asus motherboard with the same type of drives. I don't know if he would have set it up the same way but when I look in the UEFI it appears to be...
name: volume 1
raid level: RAID10 (RAID0 + 1 )
strip size: 64 KB
size: 5.4 TB
status: normal
bootable: yes
I'm going to do a bit more research on what a fake array is. Thank you for the input.
This means that one of the disks (sdd) is most likely configured as "RAID" right now and was the fourth disk ([3]) in some array.
Since it is all broken anyway, not working, and the data is already lost, you can set all the disks you want in that raid to "RAID" in the BIOS, pre-set it to the settings you will later apply with mdadm, and try it all out and play around with it. If there are other disks in this system, just unplug them and boot from a live CD for your tests/learning.
I searched for your motherboard (Asus M5A99FX Pro_R2) and found the manual for it here. It even tells how to get into the raid menus with <control> + <F> during boot.
Section 5 talks about the raid and it appears it should be possible to create raid0, raid1, raid5, or raid10 arrays on that controller.
You should not have the system drive (sde) on sata port 5 set to raid so I suggest resetting it to AHCI if possible.
Please read up on that board and the raid management then decide what you want and you should be good to go.
Looking at the original images you posted and the latest it is clear that the original config was raid on the controller for 2 drives and software raid on the other 2 drives. That may be a limitation of the controller in that it might only be able to manage one array; although raid5 or raid10 could use all 4 drives.
I would try to activate the original array using the drives in ports 1 & 3 as raid0 and see what happens. If that does not give you what you expect then within the raid portion of the bios you can still choose whichever option you feel best with. Or turn off raid altogether in the bios and use software raid with mdadm instead. I have used software raid for many years on Linux and am totally satisfied with it.
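If you go the mdadm route, the creation step could be sketched roughly as below. This is not the exact procedure for your box: the device names, the array name /dev/md0, and the mount point are assumptions, and since mdadm --create destroys any existing data on the member disks, the DRY_RUN guard makes the script only print the commands until you have verified the names with lsblk:

```shell
# Sketch of building a 4-disk software raid10 with mdadm.
# CAUTION: mdadm --create wipes the member disks. With DRY_RUN=1 this
# script only prints what it would do; check device names with lsblk first.
DRY_RUN=1
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical member disks -- adjust to match your lsblk output.
run mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd
run mkfs.ext4 /dev/md0         # new filesystem on the array
run mount /dev/md0 /mnt/raid   # hypothetical mount point
```

Setting DRY_RUN=0 would execute the commands for real, so double-check the member list first.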
Last edited by computersavvy; 12-28-2020 at 11:10 PM.
Quote:
Originally Posted by computersavvy
You should not have the system drive (sde) on sata port 5 set to raid so I suggest resetting it to AHCI if possible.
Due to a chipset limitation, when SATA ports are set to RAID mode, all SATA ports run in RAID mode together. Therefore, your suggestion to reset sde (the SSD with the OS) to AHCI is not possible without changing the others.
Quote:
Originally Posted by computersavvy
Please read up on that board and the raid management then decide what you want and you should be good to go.
Quote:
Originally Posted by computersavvy
Looking at the original images you posted and the latest it is clear that the original config was raid on the controller for 2 drives and software raid on the other 2 drives. That may be a limitation of the controller in that it might only be able to manage one array; although raid5 or raid10 could use all 4 drives.
Excellent information.
Quote:
Originally Posted by computersavvy
I would try to activate the original array using the drives in ports 1 & 3 as raid0 and see what happens. If that does not give you what you expect then within the raid portion of the bios you can still choose whichever option you feel best with. Or turn off raid altogether in the bios and use software raid with mdadm instead. I have used software raid for many years on Linux and am totally satisfied with it.
I would like to try your suggestion of connecting one and three set to RAID0 and viewing, if possible, what information is on there. My question is: how do I connect just one and three, and what should I expect at startup? I am willing to use software raid with mdadm instead once I can read the data and decide on the next step.
This is in the manual. Does it apply to me?
'The motherboard does not provide a floppy drive connector. You have to use a USB floppy disk drive when creating a SATA RAID driver disk.'
Quote:
This is in the manual. Does it apply to me?
'The motherboard does not provide a floppy drive connector. You have to use a USB floppy disk drive when creating a SATA RAID driver disk.'
I don't think so. The line just above it says the driver is for Windows. Your OS is different from the original, so that driver would not work anyway.
Those images imply that raid10 was set up on the controller. The new drive that replaced the missing drive will have to be added back so the array can recover. Port 02 ID 01
Follow the instructions on creating an array to add that one back in and hopefully it will just rebuild after you tell it "Y" to activate that device. The array was configured raid10 with 4 drives and clearly shows the missing spot that needs to be filled with the new drive. If I understand the instructions correctly, simply select option 2 from the main menu then at the next menu a <control> + <C> will get you there. Page 5-4 of the manual.
Given such simple instructions, I am a little disappointed that the manual does not tell you how to replace a failed drive, so I am guessing it will happen automatically once the controller has been told to make the new device a member.
I think it really does not matter about the SSD since it was not part of the array anyway and if the system will boot that way you are OK. The raid controller says it can only do raid with 4 devices anyway. Have you been able to check out this deep into the other NAS and see how it was set up?
Quote:
Originally Posted by computersavvy
Those images imply that raid10 was set up on the controller. The new drive that replaced the missing drive will have to be added back so the array can recover. Port 02 ID 01
Follow the instructions on creating an array to add that one back in and hopefully it will just rebuild after you tell it "Y" to activate that device. The array was configured raid10 with 4 drives and clearly shows the missing spot that needs to be filled with the new drive. If I understand the instructions correctly, simply select option 2 from the main menu then at the next menu a <control> + <C> will get you there. Page 5-4 of the manual.
When I pressed <control> + <C>, the `LD define menu` screen came up and changed to LD 2. <control> + <C> seems to be the method for creating a new raid array, not for adding a disk to an existing one. No other information seems to be in the manual.
I cannot figure out how to add a drive in drive assignments to LD 1, the original setup. Everything else seems fine; I'm excited! If someone has an idea of how I can get into the `LD define menu` with it showing logical drive one and add a drive in drive assignments, I would be ever so grateful.
I am thinking this is becoming one of those things that can be done only at the command line and not in a GUI. From what I understand, I need to take port 2 : ID 1 and assign it to LD 1-2. I see no way or option to do that in the screens. If I'm wrong, please tell me so.
Quote:
Originally Posted by computersavvy
Have you been able to check out this deep into the other NAS and see how it was set up?
The other computer is not a NAS setup but a workstation in my studio. I have nothing on those drives and am thinking about resetting them after I get this up and running. I am learning very interesting things about TIMESTAMP and SNAPSHOT; including those seems an important and logical way to set up a system. (Wrong forum, I suppose.)
Note the differences in the LD 1 and LD 2 screens, then read in detail the instructions on the page I referenced. Your screen for LD 1 is a status screen, not the create screen, but the fields at the top are what I am referring to: you have the option to add/change each of them in the create screen. I suspect if you make it LD 1 and Nasty_Array, like the status screen, it will give you all those disks. Just be careful that you do not select port 5, as that is the system disk. The config I see for the new disk is LD 1, port 2 id 1.
In the screen for LD 2 I see port 2 id 1 may already be part of that logical disk, so you will need to change the "Y" to "N" there, then go to the create screen for LD 1 and add it into that one. Do that before you exit or it might automatically be added back there on the next boot.
You have not shown what you see when you go to the create screen for LD 1.
Last edited by computersavvy; 12-30-2020 at 07:02 PM.
Quote:
Originally Posted by computersavvy
Note the differences in LD 1 and LD 2 screens. Then read in detail the instructions on the page I referenced. Your screen for LD 1 is a status screen, not the create screen but the stuff at the top is what I am referring to. You have the option to add/change each of the fields at the top in the create screen.
I have read the manual and it is short on information about adding a drive to an already existing LD; it only covers creating a new one or viewing existing ones.
I cannot change the option from LD 2 to LD 1, and I see no way to edit LD 1, only to create LD 2. The ONLY option I haven't tried is [ctrl + H] SECURE ERASE in the 'View Drive Assignments' screen, and that doesn't sound like anything I need.
Quote:
Originally Posted by computersavvy
I suspect if you make it LD 1 and Nasty_Array like the status screen it will give you all those disks. Just be careful that you do not select port 5 as that is the system disk. The config I see for the new disk is LD 1, port 2 id 1
Quote:
Originally Posted by computersavvy
In the screen for LD 2 I see port 2 id 1 may already be part of that logical disk so you will need to change the "Y" to "N" there then go to the screen for create of LD 1 and add it into that one. Do that before you exit or it might automatically add it back there on the next boot.
The "Y" is what I tried to change it to, but it wouldn't save (see the screenshot with the 4-disk warning). It is set to "N".
Quote:
Originally Posted by computersavvy
You have not shown what you see when you go to the create screen for LD 1.
There is no way to get to the LD 1 create screen (LD DEFINE MENU). The only option is LD 2.
I'm including a screenshot of the only option I can find that I haven't tried. Any thoughts? I am looking for info on it in the meanwhile.
In the UEFI I found a 'Launch EFI Shell from filesystem device'. Would that be a better option for resetting the 'N' to 'Y', or for assigning port 2 : ID 1 to LD 1?
Wow, so they are so restricted that you cannot even replace a failed drive in the controller menus. That really sucks and I see no way forward using the hardware raid.
That array on LD 1 is raid10. Even with 1 drive failed you should be able to mount the array and access the data. Have you tried that? If you can, then I suggest you back up any important data before you do anything else. If you can't, then it seems the data is lost. It may be that the controller requires Windows to operate properly and to rebuild the array, so the replacement of the OS might prevent data recovery.
Try the data recovery and if you decide that is not possible then we will need to step through setting up the software raid in either raid5, 6, or 10 as you choose.
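"Mounting the array" breaks down into two steps on Linux: assemble the md device from its member disks, then mount the filesystem it holds, read-only so nothing gets written during recovery. A minimal sketch with assumed names (/dev/md0, the sdX members, and the mount point are all hypothetical; confirm with lsblk first); the DRY_RUN guard makes it only print the commands:

```shell
# Sketch: assemble a degraded md array read-only and mount it.
# Device names below are assumptions -- confirm with lsblk first.
# DRY_RUN=1 prints the commands instead of running them.
DRY_RUN=1
run() { [ "$DRY_RUN" = 1 ] && echo "would run: $*" || "$@"; }

# --run starts the array even though one member is missing
run mdadm --assemble --run /dev/md0 /dev/sda /dev/sdb /dev/sdc
run mkdir -p /mnt/recover
run mount -o ro /dev/md0 /mnt/recover   # read-only, so nothing is changed
```

Mounting read-only is the safe default while you copy data off; the array can be rebuilt afterwards.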
Quote:
Originally Posted by computersavvy
Even with 1 drive failed you should be able to mount the array and access the data. Have you tried that?
I am sooooo sorry to say this, but I don't know how to mount an array. (More of a newbie than I sometimes sound.) I assume that means disks 1 and 3, as I believe they are a complete set.
Quote:
Originally Posted by computersavvy
Try the data recovery and if you decide that is not possible then we will need to step through setting up the software raid in either raid5, 6, or 10 as you choose.
"We" sounds very comforting. I have been at this on and off for months and I have made more progress than I could have imagined. Thank you.
OK, first post the output of "mount", then post the output of "lsblk", and finally post the output of "ls /dev". With that information we can tell what you have available to work with.
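Collecting those three outputs in one go can be sketched like this (the file name raid-info.txt is just a suggestion):

```shell
# Gather the requested diagnostics into one file to paste back here.
{
    echo '== mount =='
    mount
    echo '== lsblk =='
    command -v lsblk >/dev/null 2>&1 && lsblk || echo 'lsblk not installed'
    echo '== /dev =='
    ls /dev
} > raid-info.txt
echo "wrote raid-info.txt"
```

The guard around lsblk just avoids an error on systems where it is missing; the rest are standard tools.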