MegaRAID Storage Manager on Ubuntu for LSI cards

Getting MegaRAID Storage Manager and CLI tools installed on Ubuntu is no fun. This is what worked for me:

echo "deb http://hwraid.le-vert.net/debian squeeze main" >> /etc/apt/sources.list
apt-get update
apt-get -y --force-yes install megacli megactl libc6-i386 lib32gcc1 lib32z1 lib32stdc++6 ia32-libs lib32icu42
wget http://djlab.com/stuff/libstdc++5_3.3.6-21ubuntu1_amd64.deb
dpkg -i libstdc++5_3.3.6-21ubuntu1_amd64.deb
wget http://djlab.com/stuff/libstdc++5_3.3.6-21ubuntu1_i386.deb
dpkg-deb -x libstdc++5_3.3.6-21ubuntu1_i386.deb lib32stdc++5
cp ./lib32stdc++5/usr/lib/libstdc++.so.5.0.7 /usr/lib32
ln -s /usr/lib32/libstdc++.so.5.0.7 /usr/lib32/libstdc++.so.5
wget http://djlab.com/stuff/libxerces-c28_2.8.0+deb1-2build1_i386.deb
dpkg-deb -x libxerces-c28_2.8.0+deb1-2build1_i386.deb lib32xerces-c28
mkdir -p /opt/lsi/Apache/
cp ./lib32xerces-c28/usr/lib/libxerces-c.so.28.0 /opt/lsi/Apache/
ln -s /opt/lsi/Apache/libxerces-c.so.28.0 /opt/lsi/Apache/libxerces-c.so.28
ln -s /opt/lsi/Apache/libxerces-c.so.28 /usr/lib/libxerces-c.so.28
ldconfig

wget https://djlab.com/stuff/megaraid-storage-manager_8.10-04_amd64.deb
dpkg -i megaraid-storage-manager_8.10-04_amd64.deb

wget http://djlab.com/stuff/getlibs-all.deb
dpkg -i getlibs-all.deb
getlibs -y -l libxerces-c.so.28 libicuuc.so.42
ldconfig

## Launch GUI (make sure X11 forwarding is enabled on SSH session):
## /usr/local/MegaRAID\ Storage\ Manager/startupui.sh

Now you can monitor and manage many different LSI SAS cards (including Dell Perc and some others).

Windows 7 / Vista DNS Suffix only one level deep

Major annoyance to any organization that has hostnames like foo.bar.domain.com — Windows 7 and Vista appends domain.com to bar, but not foo.bar. After some digging I finally found a registry hack that restores ‘proper’ functionality like any other OS or older version of windows.

Create a DWORD ‘AppendToMultiLabelName’ with a value of 1 in:

HKEY_LOCAL_MACHINE -> SOFTWARE -> Policies -> Microsoft -> Windows NT -> DNSClient

Here’s a .reg file you can run to automatically insert it. Reboot the machine to enable it.

SQL-fu #1

Nested select with an insert to create multiple rows — used this to add admin privileges for a client in Maia for all their domains:

INSERT INTO maia_domain_admins (SELECT t2.domain_id,1013 from postfix_transport t2 WHERE t2.mta_host='1.2.3.4');

Dropped VIF TX Packets on XenServer 5.6

On multiple XenServer 5.x setups I’ve been experiencing dropped packets evident as light to medium packet loss on a busy DomU. It only affects heavy network loads on DomU’s and not Dom0. It also doesn’t seem to care about what OS is running; I’m seeing it in Debian, CentOS, and Windows. Since I manage streaming services, I have some heavy network loads and the packet loss is causing issues for some clients. I also notice the following on the affected VIF as seen on Dom0 (this is just an example):

vif406.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:3793176632 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3083746066 errors:0 dropped:280 overruns:0 carrier:0
          collisions:0 txqueuelen:32
          RX bytes:1680119904 (1.5 GiB)  TX bytes:2415406042 (2.2 GiB)

Notice the dropped TX packets. This is only seen on Dom0 on the VIF. No packets lost on the Dom0 PIF or the DomU itself as shown by ifconfig.

I tried disabling checksum offloading (as suggested for Windows 2003 issues) on both Dom0 and DomU but it had no affect at all. I was almost ready to give up until I started wondering about the really small txqueuelen. 32 is really small — much smaller than the Linux default. I come from a Cisco background and we could never run ports with buffers that small.

On a hunch I tried increasing it on Dom0 to a reasonable value for a busy network:

ifconfig vif406.0 txqueuelen 1500

To my surprise, it completely fixed the packet loss. Single thread speeds went from bouncing all over the place to a steady 30+MB/sec. It was really that simple, and I can’t believe more people haven’t been hit by this. Especially for network based storage-backed DomU.

So I wrote a script and put it into Cron so all VIF will be set at 1500 on a regular basis:

ifconfig | grep -P '^vif\d+\.\d+' |  awk '{system("ifconfig "$1" txqueuelen 1500")}'

If anybody knows how to set the txqueuelen permanently through XE or XenStore I want to hear it. But for now I’ve found nothing in the manual or on the net to suggest how to do this.

I checked some older XenServer 4.X boxes and they don’t have this problem even though the txqueuelen is still only 32. Only 5.X exhibits problems from what I can see. All machines use standard Intel gigabit interfaces (82574L), nothing out of the ordinary.

Hopefully this helps someone else!

Mcelog – Bad memory?

The original hardware did all kinds of strange things, stuff like SSH not working and RPM checksums failing, random rebooting, ect. I ended up swapping the chassis, mainboard and RAM but kept the HDD’s to avoid an OS reinstall. Now, I’ve gotten this twice on the new box followed by a hard freeze about 2 months after the swap:

root@host [~]# mcelog --cpu  nehalem --ascii < mce.txt
CPU 2: Machine Check Exception:     4 Bank 8: fe0000400001009f
TSC 38f57147ce2b8 ADDR 74a0f000 MISC c4e377c200041180

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 8 TSC 38f57147ce2b8
MISC c4e377c200041180 ADDR 74a0f000
MCG status:MCIP
MCi status:
Error overflow
Uncorrected error
Error enabled
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 0
Memory DIMM ID of error: 0
Memory channel ID of error: 1
Memory ECC syndrome: 44e377c2
STATUS fe0000400001009f MCGSTATUS 4

I realized I was running the same kernel as before the chassis swap, so I wonder if the ‘screwyness’ of the old hardware left a corrupted kernel module or two? I know it’s a long shot… but so are two identical, brand new machines having serious memory lapses. So I just upgraded the kernel (and bios for safe measure).

Fingers crossed…

Edit 4/20/11 – Turns out it was hardware issues… did a full chassis swap yesterday. Haven’t cracked open the bad server yet to see what went wrong.

Installing Subversion on Cpanel/WHM

Make and install Subversion:

mkdir -p /usr/local/src/
wget http://subversion.tigris.org/downloads/subversion-1.5.2.tar.bz2
tar xfj subversion-1.5.2.tar.bz2 && cd subversion-1.5.2
./configure --with-apxs=/usr/local/apache/bin/apxs --with-apr=/usr/local/apache/bin/apr-1-config --with-apr-util=/home/cpeasyapache/src/httpd-2.2.17/srclib/apr-util
make && make install

Add to pre-main include in Apache Configuration (WHM):

LoadModule dav_svn_module modules/mod_dav_svn.so
LoadModule authz_svn_module modules/mod_authz_svn.so

Set up EasyApache’s per-user virtualhost config directories (if not present):

mkdir /etc/httpd/conf/userdata
mkdir /etc/httpd/conf/userdata/std
mkdir /etc/httpd/conf/userdata/std/2

Add a virtualhost config file for the user and domain who needs subversion:

mkdir /etc/httpd/conf/userdata/std/2/<username>
mkdir /etc/httpd/conf/userdata/std/2/<username>/<www.domain.com_or_subdomain>
nano -w /etc/httpd/conf/userdata/std/2/<username>/<www.domain.com_or_subdomain>/svn_custom.conf

Paste in the config:

<IfModule mod_dav_svn.c>
 <location /repo>
  DAV svn
  SVNPath /home/<username>/public_html/repo/svn
  AuthType Basic
  AuthName "SVN Repo"
  AuthUserFile /home/<username>/.svn.htpasswd
  Require valid-user
 </location>
</IfModule>

Create the SVN user and filesystem:

su <username>
cd ~/
htpasswd -cm .svn.htpasswd <svnuser>
mkdir public_html/repo
cd public_html/repo
svnadmin create svn
chmod 775 -R svn
exit

Update the apache config:

/scripts/ensure_vhost_includes --all-users

Test by opening http:///repo/ in browser and authenticating.