Guest isolation in XenServer 6.1 / XCP 1.6

Since XenServer 6.1 (XCP 1.6) there is a new feature that lets you lock a VIF to specific MAC and IP addresses. This is nice (and also very buggy!), but it doesn’t provide any security beyond keeping VMs from stealing each other’s IPs. A better solution would also (optionally) isolate traffic between groups of VMs: for example, prevent users from reaching other users’ VMs over a common private/backend network, while still allowing communication between VMs in the same group and with external networks.
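
For reference, the stock feature is driven entirely by per-VIF parameters in xe; something along these lines (UUIDs and addresses are placeholders):

xe vif-param-set uuid=<vif-uuid> locking-mode=locked
xe vif-param-set uuid=<vif-uuid> ipv4-allowed=10.10.0.5
xe vif-param-set uuid=<vif-uuid> ipv6-allowed=fd00::5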

Step 1: Relocate the VIF locking data into the VM’s vm-data store (the xenstore-data map on the VM record), similar to how security groups are managed by Nova (OpenStack). This lets us use more fields and options.
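
Since the vm-data store is just the xenstore-data map on the VM, it can be populated with plain xe commands. The key names below are purely illustrative (the patch defines its own layout):

xe vm-param-set uuid=<vm-uuid> xenstore-data:vm-data/locking-mode=isolated
xe vm-param-set uuid=<vm-uuid> xenstore-data:vm-data/allowed-ips="10.10.0.5 10.10.0.6"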

Step 2: Patch /opt/xensource/libexec/setup-vif-rules to fix *several* bugs and to accept an extra locking mode (isolated). In isolated mode, a VM can only communicate with IPs in its allowed list. Give every VM in the same ‘security group’ the same set of allowed IPs, and your various groups of guests are isolated from each other.
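
Conceptually, isolated mode boils down to Open vSwitch flow rules on the VIF’s port that only pass traffic towards the allowed addresses and drop everything else. A heavily simplified sketch of the idea (not what the patch literally installs; the bridge name, port number and IP are placeholders):

# pass ARP and IP traffic from this port only towards an allowed address
ovs-ofctl add-flow xenbr0 "priority=100,in_port=5,arp,nw_dst=10.10.0.6,actions=normal"
ovs-ofctl add-flow xenbr0 "priority=100,in_port=5,ip,nw_dst=10.10.0.6,actions=normal"
# drop anything else coming from that port
ovs-ofctl add-flow xenbr0 "priority=10,in_port=5,actions=drop"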

Step 3: Patch another bug in /etc/xensource/scripts/vif to prevent orphaned rules from piling up in openvswitch when VMs are restarted.
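
If you have already hit this, you can inspect and clear leftover flows by hand; ovs-ofctl shows what is currently installed on a bridge (xenbr0 and port 5 are placeholders for your bridge and the stale VIF’s OpenFlow port):

ovs-ofctl dump-flows xenbr0
ovs-ofctl del-flows xenbr0 "in_port=5"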

Patches are here:

http://djlab.com/stuff/xs61/setup-vif-rules.patch
http://djlab.com/stuff/xs61/vif.patch

This only works with the openvswitch networking mode and has been tested on XS 6.1 and XCP 1.6. Do not try it in bridged mode (you shouldn’t be using bridged mode anyway).
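
You can check which backend a host is running, and switch it if necessary (switching requires a host reboot):

cat /etc/xensource/network.conf
xe-switch-network-backend openvswitch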

XenServer multipath configuration for LIO targets

A XenServer multipath.conf with a device section for LIO-based iSCSI targets, aimed at maximizing multipath performance while keeping the paths stable. The path_grouping_policy setting doesn’t seem to matter (group_by_prio vs. multibus) in most basic setups. Lines that XenServer 6.1 reports as invalid have also been removed.

http://djlab.com/stuff/xs61/multipath.conf

Specifically:

        device {
                vendor "LIO-ORG"
                product "*"
                path_grouping_policy group_by_prio
                path_checker tur
                getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
                prio_callout "/sbin/mpath_prio_alua /dev/%n"
                path_selector "round-robin 0"
                hardware_handler "1 alua"
                rr_weight uniform
                rr_min_io 2
                failback immediate
        }
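
After dropping the new multipath.conf in place, a quick sanity check (assuming iSCSI sessions are already up) is to reload the maps and confirm that every LIO LUN shows all of its paths active:

multipath -r
multipath -ll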

Migrate a XenServer VM without a Pool or Shared Storage

The source is now on GitHub, please help me make it better! If you just want a 32-bit binary (to run on Dom0), download it from here instead.

With the help of Ben Booth’s Xen::API Perl module, I put together a VM migration script that exports a VM directly to another host with no intermediate file. The transfer occurs over XAPI with no temp files or local disk interaction. The script can run directly on the source or destination host, or on any server in between. Bear in mind, you will get the best speeds and least network overhead running it directly on the destination host.

As of today, MigrateVM has been tested and works fine on XenServer 5.6 through 6.5.

Options:

-sh : source host
-su : source user (usually root)
-sp : source pass
-sv : source VM label or UUID
-dh : destination host
-du : destination user
-dp : destination pass
-ds : destination SR (optional)

If any of the options are omitted, you will be prompted for them.
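
So a fully non-interactive run would look something like this (assuming each option takes its value as the next argument — adjust to whatever the script actually expects; all values here are placeholders):

./migratevm -sh 1.2.3.4 -su root -sp sourcepass -sv my_vm -dh localhost -du root -dp destpass -ds my-local-sr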

Example output:

[root@cl-ash-h1 ~]# ./migratevm
Enter source host name/IP (blank = localhost): 1.2.3.4
Enter username for 1.2.3.4 (blank = root):
Enter password for 1.2.3.4: ************
Enter source vm name or uuid on 1.2.3.4: my_vm
Enter destination host name/IP (blank = localhost):
Enter username for localhost (blank = root):
Enter password for localhost: ******
Destination SR on localhost (blank for default):
Starting transfer
...................    12.0%, 30618.43 (KB/sec)
Done.

Download the script like this:

wget http://djlab.com/stuff/migratevm-1.0.2.tar.gz
tar zxf migratevm-1.0.2.tar.gz && cd migratevm-1.0.2
./migratevm

If you get a ‘bad ELF’ error or something like that on a 64-bit system, install the 32-bit glibc, for example:

Older XenServers: yum install glibc.i686
XenServer 6.5:  yum install glibc.i686 --enablerepo=base --enablerepo=updates --disablerepo=citrix

Binary and source are both included in the tarball.

Version 1.0.2 has an updated binary build which should now run on XenServer 6.5. We had to statically link expat into the binary because it is no longer installed by default on XS 6.5.

X9SCM / X9SCL Network Timeout

Supermicro X9SCM and X9SCL mainboards can lose their network connection under heavy traffic, especially on RHEL/CentOS 6. Updating the BIOS and driver will not always fix this:

Oct 19 18:32:49 zeus kernel: ------------[ cut here ]------------
Oct 19 18:32:49 zeus kernel: WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0x26d/0x280() (Not tainted)
Oct 19 18:32:49 zeus kernel: Hardware name: X9SCL/X9SCM
Oct 19 18:32:49 zeus kernel: NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Oct 19 18:32:49 zeus kernel: Modules linked in: vzethdev vznetdev pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 vzcpt nfs lockd fscache nfs_acl auth_rpcgss sunrpc nf_conntrack vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_length vhost_net xt_hl xt_tcpmss macvtap xt_TCPMSS macvlan iptable_mangle iptable_filter xt_multiport xt_limit tun xt_dscp ipt_REJECT ip_tables kvm_intel kvm vzevent ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse snd_pcsp snd_pcm snd_timer video i2c_i801 tpm_tis tpm tpm_bios serio_raw i2c_core snd shpchp output soundcore snd_page_alloc ext3 jbd mbcache ahci e1000e [last unloaded: scsi_wait_scan]
Oct 19 18:32:49 zeus kernel: Pid: 4, comm: ksoftirqd/0 veid: 0 Not tainted 2.6.32-15-pve #1
Oct 19 18:32:49 zeus kernel: Call Trace:
Oct 19 18:32:49 zeus kernel: <IRQ> [<ffffffff8106c608>] ? warn_slowpath_common+0x88/0xc0
Oct 19 18:32:49 zeus kernel: [<ffffffff8106c6f6>] ? warn_slowpath_fmt+0x46/0x50
Oct 19 18:32:49 zeus kernel: [<ffffffff8147c6fd>] ? dev_watchdog+0x26d/0x280
Oct 19 18:32:49 zeus kernel: [<ffffffff8107fcac>] ? run_timer_softirq+0x1bc/0x380
Oct 19 18:32:49 zeus kernel: [<ffffffff8147c490>] ? dev_watchdog+0x0/0x280
Oct 19 18:32:49 zeus kernel: [<ffffffff81075413>] ? __do_softirq+0x103/0x260
Oct 19 18:32:49 zeus kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30

So far, the only fix that has worked for me is a combination of two things.

1. Patch the NIC firmware (do this for both eth0 and eth1):

Download Here

2. Then add the following kernel parameter to your grub.conf (or menu.lst, depending on OS flavor):

pcie_aspm=off
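
In grub.conf that just means appending it to the existing kernel line; for example (the kernel version and root device below are placeholders for whatever your entry already has):

kernel /vmlinuz-2.6.32-xxx ro root=/dev/mapper/vg-root quiet pcie_aspm=off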

Now reboot. You shouldn’t lose your network again.

Kickstart 4-disk RAID10 Recipe

Here’s a nice recipe for a RAID10 array built from 4x SSD disks. Tested to work on CentOS 6 (RHEL 6). Be sure to add the discard option in fstab for TRIM support (see the example after the recipe).

zerombr yes
bootloader --location=partition --driveorder=sda,sdb,sdc,sdd
clearpart --all --initlabel --drives=sda,sdb,sdc,sdd
part raid.100000 --size=250 --ondisk=sda
part raid.100001 --size=250 --ondisk=sdb
part raid.100002 --size=250 --ondisk=sdc
part raid.100003 --size=250 --ondisk=sdd
part raid.100007 --size=1 --grow --ondisk=sdd
part raid.100006 --size=1 --grow --ondisk=sdc
part raid.100005 --size=1 --grow --ondisk=sdb
part raid.100004 --size=1 --grow --ondisk=sda
raid /boot --fstype ext3 --level=RAID1 --device=md0 raid.100000 raid.100001 raid.100002 raid.100003
raid pv.100008 --fstype "physical volume (LVM)" --level=RAID10 --device=md1 raid.100004 raid.100005 raid.100006 raid.100007
volgroup vg --pesize=65536 pv.100008
logvol swap --fstype swap --name=SystemSwap --vgname=vg --size=4096
logvol / --fstype ext4 --name=SystemRoot --vgname=vg --size=1 --grow
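
For the TRIM support mentioned above, add discard to the ext4 entries in /etc/fstab once the install finishes. A sketch of what the relevant lines might look like, with device names following the volume group and RAID devices in the recipe:

/dev/mapper/vg-SystemRoot  /      ext4  defaults,discard  1 1
/dev/md0                   /boot  ext3  defaults          1 2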