Gentoo breakage

I run Gentoo on my home Linux box. Generally, it’s stable enough for what I use it for: webserver, wireless+ipsec gateway, squid test machine.

Short story, it broke yesterday and I fixed it today, after much reading and typing. Long story after the break.

Sometime in the last couple of months (yeah, I try not to reboot), an update broke something in the chain of 2.4 kernel support, devfsd, LVM2 and device-mapper, and my LVM on software RAID stopped working. The machine would boot (boot and root are on /dev/md devices), but it could not “find” the LVM volumes for home, usr, var and tmp. Suck. Off to search the forums, which usually turns up an easy fix, but not this time. Nothing even close.

So, manual troubleshooting time. The LVM devices are getting created in /dev/vg/, but mount does not see valid filesystems on them. Joy, maybe all my data is gone? So, I reboot onto an older 2.4-based Gentoo LiveCD, hand-create my /etc/raidtab for md0, md1 and md2, run raidstart and realize that my RAIDs are still good. But the partition numbering has changed enough that /proc/mdstat shows the second half of every RAID as missing. That’s fixable later with raidhotadd, I hope. So, how do you figure out where your LVM parts are on a disk? vgscan, pvscan, lvscan, vgdisplay, pvdisplay and lvdisplay all quickly become your friends. Also very helpful was the Software RAID + LVM2 Quick Install Guide. Maybe all this will be fixed by updating to a 2.6 kernel and udev?
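For the record, the discovery pass looks roughly like the sketch below. It is a dry run by default and only prints the commands. The volume group name vg is assumed from the /dev/vg/ path above, and the RUN switch is just a convention for this sketch, not anything LVM2 provides.

```shell
#!/bin/sh
# Dry-run sketch of LVM2 discovery once the md arrays are running.
# RUN is a hypothetical switch: it defaults to "echo", so nothing is executed.
# Call with RUN= (empty) as root to run the commands for real.
RUN="${RUN-echo}"
VG="${VG-vg}"   # volume group name, assumed from /dev/vg/

lvm_discover() {
    $RUN pvscan               # which block devices carry LVM physical volumes
    $RUN vgscan               # rebuild the list of known volume groups
    $RUN vgchange -ay "$VG"   # activate the group so its device nodes appear
    $RUN lvscan               # list logical volumes and whether they are active
    $RUN vgdisplay -v "$VG"   # verbose view: the group plus its LVs and PVs
}

lvm_discover
```

If lvscan lists the volumes as ACTIVE but mount still refuses them, the problem is more likely in the layers underneath (device nodes, md assembly) than in the LVM metadata itself.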

I’ve been holding on to the 2.4 kernel because I didn’t want to rewrite my iptables rules to deal with the loss of the ipsecN network interfaces that OpenSWAN creates, but surely I can find another way now.

But, darnit, some of the newer hardware in my Shuttle SB81P isn’t recognized by this older LiveCD, so off to burn a 2006.0 copy. After booting that up, re-creating my /etc/raidtab by hand, re-[vg,pv,lv]scanning my LVMs and getting them mounted into the proper places for the chroot, I proceed to chroot and follow the Gentoo 2.6 Migration Guide. After a few hours of compiling, I’m ready to test my 2.6 kernel.
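A rough sketch of the mount-and-chroot step, again as a dry run that only prints what it would do. The md devices used for / and /boot and the /mnt/gentoo mount point are assumptions for illustration (the post only says boot and root live on /dev/md devices); the four logical volumes are the ones named earlier.

```shell
#!/bin/sh
# Dry-run sketch: assemble the installed system under a mount point and chroot.
# RUN defaults to "echo"; call with RUN= (empty) as root to execute for real.
RUN="${RUN-echo}"
TARGET="${TARGET-/mnt/gentoo}"    # assumed mount point
ROOT_DEV="${ROOT_DEV-/dev/md1}"   # assumption: which md device holds /
BOOT_DEV="${BOOT_DEV-/dev/md0}"   # assumption: which md device holds /boot
VG="${VG-vg}"                     # volume group name, assumed from /dev/vg/

mount_chroot_tree() {
    $RUN mount "$ROOT_DEV" "$TARGET"
    $RUN mount "$BOOT_DEV" "$TARGET/boot"
    # usr and var go on before anything that lives inside them is needed
    for lv in usr var home tmp; do
        $RUN mount "/dev/$VG/$lv" "$TARGET/$lv"
    done
    $RUN mount -t proc none "$TARGET/proc"
    $RUN chroot "$TARGET" /bin/bash
}

mount_chroot_tree
```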

Reboot, and things mostly work. Checking cat /proc/mdstat tells me that my RAID1 volumes are still degraded, so I raidhotadd /dev/md0 /dev/sdb1, and so on for the other arrays, until those are all fixed. Now, on to fixing OpenSWAN and my strange MASQ/Routing network configuration.
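Spotting the degraded arrays can be scripted. The sketch below assumes the usual /proc/mdstat layout, where a healthy two-disk mirror shows [UU] and a missing member shows up as an underscore, e.g. [U_]. The function names are made up for this example, and the raidhotadd call is only printed unless RUN is emptied.

```shell
#!/bin/sh
# RUN defaults to "echo" (dry run); call with RUN= (empty) as root to execute.
RUN="${RUN-echo}"

# Read /proc/mdstat-style text on stdin and print the name of every array
# whose member map contains "_" (a missing member), e.g. [U_].
degraded_arrays() {
    awk '/^md[0-9]+ :/      { name = $1 }
         /\[[U_]*_[U_]*\]/  { print name }'
}

# Hot-add a partition back into an array (raidtools syntax, as used above).
readd_member() {
    $RUN raidhotadd "$1" "$2"
}

if [ -r /proc/mdstat ]; then
    degraded_arrays < /proc/mdstat
fi
readd_member /dev/md0 /dev/sdb1
```

Only the md0/sdb1 pairing comes from my actual setup here; the right partition for each of the other arrays has to be read off the raidtab.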

First up is re-emerging net-misc/openswan to rebuild it for the 2.6 kernel, emerging net-firewall/ipsec-tools and getting the correct kernel options turned on. On to fixing my silly MASQ/Routing design. With the 2.6 kernel’s native IPsec stack there is no ipsec0 interface to match on anymore; decrypted packets simply show up on the physical interface. Luckily, I’m not the first person to be bothered by this little change, and someone else already has a solution: use the mangle table to MARK any packets that come in from the ipsec tunnel, and then, after the packets are decrypted, look for this mark to verify they came from an ipsec client. So, with some complex-looking but actually simple iptables rules, it’s doable:
Before: iptables -A FORWARD -i ipsec0 -o eth0 -j ACCEPT
After:

  • iptables -t mangle -A PREROUTING -i eth1 -p esp -j MARK --set-mark 1
    This uses the esp protocol match to identify ipsec packets and mark them.
  • iptables -A INPUT -i eth1 -m mark --mark 1 -j ACCEPT
    This allows any packet that came in from eth1 into the host, as long as it’s been marked.
  • iptables -A FORWARD -o eth0 -i eth1 -m mark --mark 1 -j ACCEPT
    This allows any packet that came in from eth1, is headed out eth0 and is marked to be forwarded.
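Pulled together, the three rules fit in a small script. This is a sketch rather than my full firewall: the variable names are invented for the example, and by default it only prints the iptables commands so it can be checked before anything touches the live ruleset.

```shell
#!/bin/sh
# Dry-run sketch of the mark-based IPsec rules.
# IPTABLES defaults to "echo iptables" so the rules are only printed;
# call with IPTABLES=iptables as root to apply them.
IPTABLES="${IPTABLES-echo iptables}"
TUNNEL_IF="${TUNNEL_IF-eth1}"   # interface the ESP packets arrive on
OUT_IF="${OUT_IF-eth0}"         # interface forwarded traffic leaves by
IPSEC_MARK="${IPSEC_MARK-1}"    # any otherwise unused mark value

ipsec_mark_rules() {
    # Tag encrypted (ESP) packets as they arrive; the mark is still there
    # when the decrypted packet passes through the filter table.
    $IPTABLES -t mangle -A PREROUTING -i "$TUNNEL_IF" -p esp \
        -j MARK --set-mark "$IPSEC_MARK"
    # Accept marked traffic addressed to the gateway itself.
    $IPTABLES -A INPUT -i "$TUNNEL_IF" -m mark --mark "$IPSEC_MARK" -j ACCEPT
    # Forward marked traffic from the tunnel side out the other interface.
    $IPTABLES -A FORWARD -i "$TUNNEL_IF" -o "$OUT_IF" \
        -m mark --mark "$IPSEC_MARK" -j ACCEPT
}

ipsec_mark_rules
```

Keeping the mark value in one variable means the mangle rule and the two filter rules cannot drift apart if the number ever has to change.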

So, I should have switched to a 2.6 kernel long ago.
