Showing posts with label linux. Show all posts

29/09/2013

Recovering from a major RAID5 failure with 2 disks failing out of 4

Yes, the title is a bit unrealistic: if 2 disks out of 4 fail in a RAID5, you are out of business. But I was lucky. One disk was failing on WRITE (at least that is what S.M.A.R.T. was saying). The other one was failing on READ (well, I didn't need S.M.A.R.T. to tell me anything, it just failed!). So, how did I manage? Believe me, this was a long, sleepless night, full of events. First, I took the disk that was failing WRITE commands and put it in another computer where I had some extra, unused disks (well, I just have them hanging there...). There, I copied the disk using dd (ah, you need Linux for this to work! Is there anything else but Linux for this kind of job anyway?):
dd if=/dev/sd${OLD} of=/dev/sd${NEW} # replace OLD and NEW with the drive letters that fit your case
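
A plain dd stops at the first unreadable sector, which is a real risk on a failing disk. The following is a minimal sketch, not the command used that night: it wraps dd with conv=noerror,sync so that the copy continues past read errors. The helper name clone_disk and the 4096-byte block size are assumptions chosen for illustration.

```shell
# clone_disk SRC DST: copy SRC to DST block by block, continuing past read errors.
# Hypothetical helper; works on block devices as well as on plain image files.
clone_disk() {
    src=$1
    dst=$2
    # noerror: keep going after a read error
    # sync:    pad every short or failed read with zeros up to the block size,
    #          so the offsets on the copy stay aligned with the original
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}

# Example (destructive for the target, double-check the device names first):
# clone_disk /dev/sd${OLD} /dev/sd${NEW}
```

With noerror,sync a whole block is zero-filled for every failed read, so a small block size loses less data around each bad sector at the cost of speed. GNU ddrescue is a purpose-built alternative for the same job.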

And here is where I got lucky! I managed to copy the whole drive to the new drive. But I wasn't that lucky: when I put the new drive back in the array, BOOHOOO! My RAID5 now had 2 good drives (one of them the new one), but the third one was marked as a SPARE!!! So, I still couldn't assemble the array. But there is a catch: I could re-create the array without touching the data... Here is how it looked before:
cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]       
md0 : inactive sdb3[4](S) sda3[0] sdc3[2]
      XXXXXXXXXX blocks
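
The (S) flag in the output above is easy to miss on a busy system. As a small sketch, the helper below (the name list_spares is made up for this example) reads /proc/mdstat-style text and prints every member that is flagged as a spare, together with its array.

```shell
# list_spares: read /proc/mdstat-style text on stdin and print "array member"
# for every member flagged (S). Hypothetical helper name.
list_spares() {
    awk '$2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(S\)$/) {
                name = $i
                sub(/\[.*/, "", name)   # drop "[4](S)" and keep the device name
                print $1, name
            }
    }'
}

# Usage on a live system:
# list_spares < /proc/mdstat
```

Before changing anything on the array, it is also worth saving the output of mdadm --examine for each member partition, since it shows the role, chunk size and layout recorded in the old superblocks.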

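As you can see, sdb3 is seen as a SPARE! That is wrong. So, I had to take a risk: re-create the whole array and hope that the SPARE actually held the data needed to get me out of this stalled state. This only works if the level, layout, chunk size and device order match the original array exactly, since --assume-clean writes new superblocks and skips the initial resync. So, here is how I activated the SPARE: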
mdadm --stop /dev/md0 # Stop the md0 array
mdadm -Cv /dev/md0 --assume-clean --level=5 --layout=left-symmetric --chunk=64 --raid-devices=4 /dev/sda3 /dev/sdb3 /dev/sdc3 missing # Remember, sdb was a SPARE and the sdd was GONE :)
[...CUT...]
Continue creating array? yes # I answered here with yes...
mdadm: array /dev/md0 started.

After that, I stopped, took a break (or a deep breath) and then looked into the array status:
cat /proc/mdstat                                                                                                                     
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md0 : active raid5 sdc3[2] sdb3[1] sda3[0]
      4391961792 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

You can see that the last one has gone missing (it is marked as "_"). A better way to look at it is via:
mdadm --detail /dev/md0                                                                                                              
/dev/md0:
        Version : [...CUT...]
  Creation Time : [...CUT...]
     Raid Level : raid5
     Array Size : [...CUT...]
  Used Dev Size : [...CUT...]
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : [...CUT...]
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : [...CUT...]
         Events : [...CUT...]

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       3       0        0        3      removed
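
The [UUU_] map from /proc/mdstat can also be decoded in a script, for example to alert on a degraded array. This is only a sketch with a made-up helper name (missing_slots); it prints the zero-based RaidDevice index of every slot shown as "_".

```shell
# missing_slots MAP: MAP is the status map from /proc/mdstat, e.g. "[UUU_]".
# Prints one zero-based slot index per line for every "_" in the map.
missing_slots() {
    map=${1#\[}      # strip the leading bracket
    map=${map%\]}    # strip the trailing bracket
    i=0
    while [ -n "$map" ]; do
        rest=${map#?}             # everything after the first character
        first=${map%"$rest"}      # the first character itself
        if [ "$first" = "_" ]; then
            echo "$i"
        fi
        map=$rest
        i=$((i + 1))
    done
}

missing_slots "[UUU_]"   # prints 3, the slot listed as "removed" above
```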

Next, all you need to do is add another clean drive (no partition table; you can wipe it with dd if=/dev/zero of=/dev/sd${CLEAN} bs=512 count=8) and then let the array md0 know about it like this:
mdadm -a /dev/md0 /dev/sdd3
mdadm: added /dev/sdd3

Now, if you have problems with the partitioning of the new drive, you can replicate the partition table from one of the active drives (in my case, /dev/sda as the master and /dev/sdd as the target) like this:
sfdisk -d /dev/sda | sfdisk /dev/sdd

At the end of the process, you should see something like this:
mdadm --detail /dev/md0 
/dev/md0:
        Version : [...CUT...]
  Creation Time : [...CUT...]
     Raid Level : raid5
     Array Size : [...CUT...]
  Used Dev Size : [...CUT...]
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : [...CUT...]
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 0% complete

           UUID : [...CUT...]
         Events : 0.8

    Number   Major   Minor   RaidDevice State
       0       8        3        0      active sync   /dev/sda3
       1       8       19        1      active sync   /dev/sdb3
       2       8       35        2      active sync   /dev/sdc3
       4       8       51        3      spare rebuilding   /dev/sdd3

And if you want to get closer to the kernel, you can always enter sysfs like this:
cd /sys/block/md0/md
ls -al
[...CUT...]

From there you can start a consistency check (echo check > sync_action), etc...
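
As a sketch of what the md sysfs directory offers, the helper below prints a few of its attributes. The helper name md_sync_info is an assumption for this example; sync_action, sync_completed, mismatch_cnt and degraded are attributes the kernel md driver exposes under /sys/block/mdX/md.

```shell
# md_sync_info [DIR]: print a few md sysfs attributes as "name: value" lines.
# DIR defaults to the md0 directory used in this post. Hypothetical helper name.
md_sync_info() {
    dir=${1:-/sys/block/md0/md}
    for attr in sync_action sync_completed mismatch_cnt degraded; do
        if [ -r "$dir/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$dir/$attr")"
        fi
    done
}

# Usage:
# md_sync_info
# Start a read-only consistency check (as root):
# echo check > /sys/block/md0/md/sync_action
```

After a check has finished, a non-zero mismatch_cnt is the thing to look at.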

Well, that is all, folks. I hope it gives other lost souls (or the unlucky ones with damaged RAID5 disks) a good idea of how to recover their data. Drop a comment if you have questions or additions...

17/03/2013

OpenSSH key, Amazon EC2 RSA, Android and dropbear (from Terminal IDE)

Here is the thing: I have a quite complicated SSH key (it uses AES-128-CBC), and while there are posts about converting an OpenSSH key to dropbear format, they fail to mention that the OpenSSH key needs to be in a plain PEM (or otherwise more streamlined) format. That might seem easy, but when you try to convert using the common example, you might end up getting an error like this:
/usr/lib/dropbear/dropbearconvert openssh dropbear ~/.ssh/id_rsa /tmp/id_rsa.db
Error: Ciphers other than DES-EDE3-CBC not supported
Error reading key from '~/.ssh/id_rsa'
That happens especially if you have keys generated by Amazon Web Services (EC2). Those generally look like this:
head -3 ~/.ssh/id_rsa_amazon_ec2
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,1234567890ABCSFTDHIJKLMNOPQRSTUV

Now, that sucks! And googling didn't get me very far. So, if you have no idea what to make of that error, fear not, as there is a simple solution, coming from openssl. You will need to convert your key to PEM and then use the dropbearconvert tool, like this:
openssl rsa -in ~/.ssh/id_rsa -outform pem > ~/.ssh/id_rsa.pem
You will need to enter your pass phrase if you are using one:
Enter pass phrase for id_rsa:
writing RSA key

At this point, technically, the key is the same, but you have removed the passphrase (you could do the same with ssh-keygen -p -f ~/.ssh/id_rsa, but then you would have to put it back).
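
To see up front whether a key will trip dropbearconvert, and to confirm afterwards that the PEM copy is no longer encrypted, the DEK-Info header can be read directly. A minimal sketch, with a made-up helper name (key_cipher):

```shell
# key_cipher FILE: print the cipher named in the DEK-Info header of a PEM key.
# Prints nothing when the key has no such header (i.e. it is not encrypted).
key_cipher() {
    sed -n 's/^DEK-Info: \([^,]*\),.*/\1/p' "$1"
}

# Usage:
# key_cipher ~/.ssh/id_rsa        # e.g. AES-128-CBC
# key_cipher ~/.ssh/id_rsa.pem    # empty output after the openssl step
```

This only applies to keys in the traditional PEM layout shown above; keys in the newer "BEGIN OPENSSH PRIVATE KEY" format carry no DEK-Info header.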

And now, using the converter on the PEM:
/usr/lib/dropbear/dropbearconvert openssh dropbear ~/.ssh/id_rsa.pem ~/.ssh/id_rsa.db

And you will get your key:
Key is a RSA key
Wrote key to '~/.ssh/id_rsa.db'

Hope this helps someone who, like me, uses Amazon-generated SSH keys and needs them on the Android side (by the way, I needed it for use with Terminal IDE ssh, which I love)... NOTE: You should remove the id_rsa.pem file from your system, as it is password-less. This could allow someone to use it without knowing your pass phrase!

21/05/2011

Ubuntu, extra repositories (ppa's) and living on the edge...

For those using Launchpad, the term PPA should be quite familiar. This short entry is for them. I have been using PPAs for the last 4-5 years. Before that, I used to have my own repos at packages.antoniac.name. Then something changed and the Ubuntu guys, through launchpad.net, started offering personal package archives (PPAs) and builds. Now, you might think the trouble with them is conflicting packages. However, behind the PPA submission of packages is a clever naming convention, and ultimately you have quite good control over where a package comes from. Now, to add more honey to the sweet thing, you can use several commands to automate the inclusion and removal of PPAs:

sudo aptitude install ppa-purge # To be able to remove it later (if you don't have it)
sudo apt-add-repository ppa:pan1nx/ppa
sudo aptitude update
sudo aptitude safe-upgrade # or install new packages
sudo ppa-purge ppa:pan1nx/ppa # To remove the packages and install the original

Hope this helps some of you handle PPAs in a friendlier way...

20/04/2011

Console rulez (as in linux console)

Yep, once again I find myself dragged into blabbing about the beauty of the console. I had some trouble with the network and the only way to work in the server room was, you guessed it, the console.

Now, I want to briefly share some goodies with you:

wavemon
Awesome tool for debugging the wifi signal at the server (if you have such a thing at the server)
iperf
Another good tool, this one for testing the performance between two points on the network
nload
Comes in very handy if you need a graph showing the bandwidth. There are other tools here, like iptraf, etc. I just like this one...

Heck, I should expand this article later, when I remember other goodies like these...

02/12/2010

x86 64 vs 32, still messy

Just when I thought that the mess between 32 bit and 64 bit was over (I am talking about Linux, of course), I bumped into it again. The problem this time was the Lightning add-on for Thunderbird. Here was the solution:

http://releases.mozilla.org/pub/mozilla.org/calendar/lightning/releases/1.0b2rc3/contrib/linux-x86_64/

Me so sad...

07/01/2010

UPDATE4: Kubuntu/Ubuntu: OCZ SSD, swap, memory, 64 bit, or how to pimp your life with a Panasonic CF-R8...

Here is some overview on how to get a new Panasonic CF-R8 to work with Ubuntu and an OCZ-Vertex SSD...

First, you need to open up your laptop in order to change the drive. I will come back to this with pictures in another article later...

Then, you need to make sure that the OCZ partitions are aligned properly (that is, before you install Ubuntu/Kubuntu). Before you start, make sure that you have backed up all the partitions (e.g., with rsync -ax / /mnt/USB), and then, in a console/shell, start fdisk:
fdisk -H 32 -S 32 /dev/sda

And of course, the creation of the partition:
Command (m for help): o
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-15711, default 1): 2
Be sure to start from cylinder 2 (to have it aligned), as also discussed in some forums.
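
Why cylinder 2 ends up aligned can be checked with a little arithmetic. With the fake geometry of 32 heads and 32 sectors per track, one cylinder is 32 x 32 = 1024 sectors. The sketch below assumes 512-byte logical sectors; the helper name cyl_start_bytes is made up for this example.

```shell
# cyl_start_bytes HEADS SECTORS CYLINDER [SECTOR_SIZE]
# Byte offset at which a 1-based fdisk cylinder starts.
cyl_start_bytes() {
    heads=$1
    sectors=$2
    cylinder=$3
    sector_size=${4:-512}   # assumption: 512-byte logical sectors
    echo $(( (cylinder - 1) * heads * sectors * sector_size ))
}

cyl_start_bytes 32 32 2    # 524288 bytes, i.e. 512 KiB
```

A start offset of 512 KiB is a whole multiple of erase block sizes such as 128 KiB or 512 KiB. The actual erase block size of a given drive is not stated in this post, so treat those figures as examples.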

Another good tip is to create the filesystem with the extended options for a RAID stripe of 64k. Here is the command for it (assuming that sda1 is the partition):
mkfs.ext4 -E stripe-width=128 /dev/sda1
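
One thing to keep in mind with this option: mke2fs counts stride and stripe-width in filesystem blocks, not in KiB. The sketch below converts between the two, assuming the usual 4 KiB ext4 block size (an assumption, since the block size is not given here); the helper name kib_to_fs_blocks is made up.

```shell
# kib_to_fs_blocks KIB [BLOCK_SIZE_BYTES]
# Number of filesystem blocks that cover KIB kibibytes.
kib_to_fs_blocks() {
    kib=$1
    block_size=${2:-4096}   # assumption: default 4 KiB ext4 blocks
    echo $(( kib * 1024 / block_size ))
}

kib_to_fs_blocks 64     # 16
kib_to_fs_blocks 512    # 128
```

Under that assumption, stripe-width=128 spans 512 KiB, while a 64 KiB unit would correspond to 16 blocks, so it is worth checking which of the two matches your drive.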

Since the SSD is fast enough, you can also tweak /etc/fstab for better throughput and make the logs go to memory (volatile):
/dev/sda1       /               ext4 noload,noatime,nodiratime,data=writeback,nobh,commit=100,barrier=0 0       1
tmpfs           /tmp            tmpfs defaults,noatime,mode=1777 0 0
none            /var/tmp        aufs noatime,br:/tmp=rw:/var/tmp=ro 0 0
none            /var/log        aufs noatime,br:/tmp=rw:/var/log=ro 0 0
none            /var/cache      aufs noatime,br:/tmp=rw:/var/cache=ro 0 0
Make sure, though, that you rebuild the cache every now and then...

Other boot-time optimizations can be made via /etc/default/grub, which also speeds up the kernel loading, with some nice tweaks:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash=silent resume=/dev/sda2 fastboot elevator=deadline raid=noautodetect rootflags=noload,data=writeback,nobh,commit=100,barrier=0"
You will need to regenerate the GRUB configuration afterwards with the update-grub command...

Add swappiness to fix some default behavior, and switch back from noop to deadline, which is more CPU/write friendly with the fifo_batch option of the I/O scheduler:
cat << EOF| sudo tee -a /etc/sysctl.conf
# Peter added swappiness...
vm.swappiness = 0
vm.laptop_mode = 5
vm.dirty_writeback_centisecs = 1500
vm.vfs_cache_pressure = 50
EOF
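
Values appended to /etc/sysctl.conf are only read at boot or when loaded explicitly with sudo sysctl -p. To verify what the kernel is actually using, the same knobs can be read back from /proc/sys. A minimal sketch, with a made-up helper name (show_vm_settings):

```shell
# show_vm_settings [ROOT]: print the current values of the four vm.* knobs
# set above. ROOT defaults to /proc/sys. Hypothetical helper name.
show_vm_settings() {
    root=${1:-/proc/sys}
    for key in vm.swappiness vm.laptop_mode vm.dirty_writeback_centisecs vm.vfs_cache_pressure; do
        # vm.swappiness -> vm/swappiness
        path=$root/$(printf '%s' "$key" | tr . /)
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(cat "$path")"
        fi
    done
}

show_vm_settings
```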

And last, we need some optimizations after booting:
cat << EOF| sudo tee /etc/init/ssd-optimizations.conf
# ssd-optimizations - Set the parameters

description     "Set required ssd parameters"

start on starting dm

script
#       echo deadline > /sys/block/sda/queue/scheduler
        echo 1 > /sys/block/sda/queue/iosched/fifo_batch
        echo min_power > /sys/class/scsi_host/host0/link_power_management_policy
end script
EOF

Well, believe it or not, with these settings I got X starting at second 2, so it is pretty fast (no initrd!)...

Some good links:
  1. OCZ Firmware update
  2. Ted Tso guide to partitioning...
  3. http://www.ocztechnologyforum.com/forum/showthread.php?t=54379&page=17
  4. http://www.ocztechnologyforum.com/forum/showpost.php?p=373226&postcount=98
  5. http://ata.wiki.kernel.org/index.php/ATA_Secure_Erase

26/11/2009

My Linux life is completed by the N900 mobile phone...

Well, the N900 is not quite a mobile phone; saying that is an understatement. It is more like a mini laptop with a phone added as a feature. Anyway, as of last weekend, I am using the Maemo Linux based Internet Tablet... It costs around 600€ but is worth it for those who work on and demand a mobile development platform. Highly recommended... and here are the reasons:
  • WiFi
  • 3G
  • SIP
  • Full browser (Fennec)
  • GTK/Open source/Debian
Well, you do the math... meanwhile, I enjoy it...

02/06/2009

Linux games: where are they?

Hmm... today, just after I gave a speech about how Linux is lacking the game industry's attention, I stumbled on this: Savage 2. Conclusion: there are actually games that are built for Linux, and quite good games. It is just me who does not know where to look (a possible excuse is that I am more into PlayStation 3/Wii games than PC games). Well, yet another reason to stick to the Linux world...