This is my OLD blog. The new one can be found here


LXC on Debian Squeeze

Yet another blog post about setting up LXC (Linux Containers). This article focuses on squeeze, the current Debian testing release. (Because some day it has to be released!)

  • Set up the network with bridging.
  • Install LXC.
  • Create your very first Container.
  • Usage of container templates.
  • Limiting resources for containers (including: getting the memory controller working)

Pre-Setup and assumptions^

The article begins with the network setup required for bridging, continues with installing LXC itself, then creates and configures the first container and shows how to use container templates. The last topic covers limiting hardware resources for containers.

I assume you have already set up a squeeze installation. If not, you can download it from here. Best use the netinst or businesscard image, because we want to keep it clean and simple. Of course you can use VirtualBox or similar instead of a physical machine. Because LXC is container virtualization, it can deal with this easily.

Each shell command provided assumes you are logged in as root. You can prefix them with “sudo” if you work as a regular user and don’t want to switch to root.

If I prefix a command with “#>”, it is meant to be executed on the host machine (under which the LXC containers run). If I use “#vm0>” or similar, the command is meant to be run from within the container. Lines without a prefix show file contents or command output.

Everything described should probably work in lenny as well (if you use the backports kernel 2.6.32), but I did not test it.

Network – Bridging^

Assuming you have a freshly installed box with a network up and running the first thing to do is install bridge-utils:

#> aptitude install bridge-utils

The next thing is to set up the bridging. That’s quite easy. Assume your /etc/network/interfaces looks like this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet dhcp

All you have to do is change it to this:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
#allow-hotplug eth0
#iface eth0 inet dhcp

# Setup bridge
auto br0
iface br0 inet dhcp
   bridge_ports eth0
   bridge_fd 0

Then restart your network:

#> /etc/init.d/networking restart

As you can see, the eth0 entries are commented out and a new interface (br0) is introduced. The line bridge_ports eth0 adds the physical eth0 interface to the bridge; bridge_fd 0 sets the “forward delay” to zero, which reduces the time the bridge spends in the listening and learning states.

If you use a static configuration it should look something like this:

auto br0
iface br0 inet static
  bridge_ports eth0
  bridge_fd 0
  address 10.0.0.100
  netmask 255.255.255.0
  gateway 10.0.0.1
  dns-nameservers 10.20.0.2

More info on that matter can be googled easily.

Install LXC, setup cgroups^

Now it’s time to install lxc. Just install via aptitude:

#> aptitude install lxc

OK, before we can set up the first instance, cgroups have to be mounted. Therefore create a directory to mount them on (it can be anywhere, I prefer the file system root):

#> mkdir /cgroup

And add the following to your /etc/fstab:

cgroup        /cgroup        cgroup        defaults    0    0

Then mount the cgroups:

#> mount cgroup
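
If you run this setup more than once, the fstab step can be made repeatable. The following is only a sketch: the function name ensure_cgroup_fstab is made up for illustration, and it assumes that any non-comment line with the type “cgroup” means the entry is already there.

```shell
# Append the cgroup mount to an fstab-style file unless it is already there.
# $1: path of the file (default /etc/fstab)
# Note: ensure_cgroup_fstab is a hypothetical helper, not part of lxc.
ensure_cgroup_fstab() {
    fstab="${1:-/etc/fstab}"
    # Look for any non-comment line whose third field (the type) is "cgroup".
    if awk '$1 !~ /^#/ && $3 == "cgroup" { found = 1 } END { exit !found }' "$fstab"; then
        echo "cgroup entry already present in $fstab"
    else
        printf 'cgroup\t/cgroup\tcgroup\tdefaults\t0\t0\n' >> "$fstab"
        echo "cgroup entry added to $fstab"
    fi
}
```

Calling ensure_cgroup_fstab without an argument works on /etc/fstab; pass a copy of the file first if you want to try it safely.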

One final thing before we can go on is checking the environment via lxc-checkconfig:

#> lxc-checkconfig
Kernel config /proc/config.gz not found, looking in other places...
Found kernel config file /boot/config-2.6.32-3-686
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled

--- Control groups ---
Cgroup: enabled
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: disabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: enabled

Everything should be enabled except the memory controller, which is sad, but expected. Read more about this in the F.A.Q.
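
To spot missing features without reading the whole list, the output can be filtered. This is a sketch under two assumptions: the output keeps the “Name: status” layout shown above, and any colour codes the script may print are plain ANSI sequences (they are stripped with GNU sed syntax). The function name list_not_enabled is made up.

```shell
# Print the name of every feature that is not reported as "enabled".
# Reads lxc-checkconfig style output on stdin.
# Note: list_not_enabled is a hypothetical helper, not part of lxc.
list_not_enabled() {
    # Strip ANSI colour sequences first (GNU sed), then keep the
    # "Name: status" lines whose status differs from "enabled".
    sed 's/\x1b\[[0-9;]*m//g' |
        awk -F': ' 'NF == 2 && $2 != "enabled" { print $1 }'
}

# Usage on the host:
#   lxc-checkconfig | list_not_enabled
```

With the output shown above this should print only the memory controller line.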

First container^

You can either use the original lxc-debian from /usr/share/doc/lxc/examples/lxc-debian.gz, which will give you a lenny container, or download the modified version from the download section (refer also to the F.A.Q.).

Using original

First, copy the lxc-debian script from the lxc package into /usr/local/sbin and make it executable:

#> cp /usr/share/doc/lxc/examples/lxc-debian.gz /usr/local/sbin/
#> gunzip /usr/local/sbin/lxc-debian.gz
#> chmod +x /usr/local/sbin/lxc-debian

Using modified

Download the modified lxc-debian, put it into /usr/local/sbin/lxc-debian and make it executable. Warning: very experimental, better use the original!

Debootstrapping

Before we can execute this, install debootstrap!

#> aptitude install debootstrap

OK, let’s create the first virtual machine. Containers are deployed below /var/lib/lxc/.. by default (the path is hard-coded). If you use a SAN or some other disk, mount it there; don’t try to change lxc to use another location.

#> mkdir -p /var/lib/lxc/vm0
#> lxc-debian -p /var/lib/lxc/vm0

This might take some time. Some configuration menus (e.g. locales) might pop up. After the installation is finished you could start the virtual machine (alias VM or container) right away, but take some time to customize the configuration a little bit.

Go to /var/lib/lxc/vm0 and edit the file config like so:

lxc.tty = 4
lxc.pts = 1024
lxc.rootfs = /var/lib/lxc/vm0/rootfs
lxc.cgroup.devices.deny = a
# /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# consoles
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
# /dev/{,u}random
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
# rtc
lxc.cgroup.devices.allow = c 254:0 rwm

# <<<< ADD THOSE LINES
lxc.utsname = vm0
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
# lxc.network.name = eth0
lxc.network.hwaddr = 00:FF:12:34:56:78
lxc.network.ipv4 = 10.0.0.110/24

Let’s hold for a minute and see what we did here:

  • lxc.utsname = vm0
    The hostname of the container.
  • lxc.network.type = veth
    There are a couple of network types we can work with (man lxc.conf); you should use veth when working with bridges. There is also vlan, which you can use with Linux VLAN interfaces, and phys, which allows you to hand over a complete physical interface to the container.
  • lxc.network.flags = up
    Brings the network interface up when the container starts.
  • lxc.network.link = br0
    Specifies the network bridge to which the virtual interface will be added.
  • lxc.network.name = eth0
    This is the name within the container! Don’t confuse it with the name that shows up on the host machine! If you don’t set it, it will be eth0 anyway.
  • lxc.network.hwaddr = 00:FF:12:34:56:78
    The hardware address (MAC) the device will use.
  • lxc.network.ipv4 = 10.0.0.110/24
    Network address assigned to the virtual interface. You can provide multiple, one per line (man lxc.conf).
    You have to edit /etc/network/interfaces within the container (/var/lib/lxc/vm0/rootfs/etc/network/interfaces) as well!

This being done, let’s start the machine:

#> lxc-start -n vm0 -d

That should have done the trick. You can check whether it is running with this:

#> lxc-info -n vm0
'vm0' is RUNNING

Now log in to the container using the LXC console:

#> lxc-console -n vm0

You should get a login prompt. Just type in “root”, no password (good time to set one).

That’s all. Your first container is up and running. To stop it, just do this:

#> lxc-stop -n vm0

And check again

#> lxc-info -n vm0
'vm0' is STOPPED

Setting up the next container^

Well, you could simply do the same as above, but that would be no fun at all. One of the good (but by far not the major) reasons for virtualization is of course the fast setup of virtual machines. If you peeked into the rootfs directory of the created vm0 in /var/lib/lxc/vm0/rootfs, you might have noticed it’s “only” a simple Linux root directory, but without proc, sys or any of those nodes.

The idea is simple: use tar to pack the vm0 directory and then create new machines from the archive. Of course this has at least one implication: you can now work with “golden images”, or container templates if you like that better.

#> cd /var/lib/lxc/vm0
#> tar czf ../template.tar.gz *

Now create another container out of this:

#> cd /var/lib/lxc
#> mkdir vm1
#> cd vm1
#> tar xzf ../template.tar.gz

And done. Ok, of course you have to adjust the config file in vm1/config (network.*, utsname, etc) but that’s it.
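
Adjusting the copied config by hand gets boring after the third container, so here is a rough sketch of how it could be scripted. The function name retarget_config is made up, and it assumes the container directory is named after the hostname, that the config uses the keys shown earlier, and that GNU sed (for -i) is available.

```shell
# Rewrite the per-container settings in a config file copied from a template.
# $1: config file, $2: hostname, $3: MAC address, $4: IPv4 address with prefix
# Note: retarget_config is a hypothetical helper, not part of lxc.
retarget_config() {
    cfg="$1"; name="$2"; mac="$3"; ip="$4"
    # "|" is used as the sed delimiter because the IP contains a slash.
    sed -i \
        -e "s|^lxc\.utsname *=.*|lxc.utsname = $name|" \
        -e "s|^lxc\.network\.hwaddr *=.*|lxc.network.hwaddr = $mac|" \
        -e "s|^lxc\.network\.ipv4 *=.*|lxc.network.ipv4 = $ip|" \
        -e "s|^\(lxc\.rootfs *= *.*/\)[^/]*/rootfs\$|\1$name/rootfs|" \
        "$cfg"
}

# Usage on the host (values are examples only):
#   retarget_config /var/lib/lxc/vm1/config vm1 00:FF:12:34:56:79 10.0.0.111/24
```

Everything else in the file (devices, tty, bridge) is left untouched. The /etc/network/interfaces inside the new rootfs still has to be edited separately.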

The rest is up to your imagination. E.g. you can have one master template (which you start on a regular basis and bring up to date), templates for each kind of your regular servers (an HTTP server with Apache installed and configured, or a mail server with Postfix set up, and so on) or whatever you want.

What about limits ?^

You might have some experience with other virtualizations like KVM or Xen. When setting up a machine in Xen, you have to configure the amount of memory, could set amount of virtual CPUs and also set the scheduler ratio. That’s (among other things) what cgroups do for LXC.

Before going into details, keep in mind: all cgroup settings are fully dynamic. You can change them all at runtime, but be careful (especially when withdrawing memory from a running instance!).

How to set a cgroup value

You can set any cgroup value in one of these ways:

  • lxc-cgroup -n vm0 <cgroup-name> <value>
  • echo <value> > /cgroup/vm0/<cgroup-name>
  • in config-file: “lxc.cgroup.<cgroup-name> = <value>”

In the examples I will use the config-file notation, because it’s container independent.

Byte values

For byte values such as memory limits you can use K, M or G

#> echo "400M" > /cgroup/vm0/..
#> echo "1G" > /cgroup/vm0/..
#> echo "500K" > /cgroup/vm0/..
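
When you read the value back, it comes out as a plain byte count. As far as I know the kernel treats the suffixes as powers of 1024, so the conversion can be sketched like this (the function name to_bytes is made up for illustration):

```shell
# Convert a size such as 400M, 1G or 500K into bytes (binary multiples).
# Note: to_bytes is a hypothetical helper; it assumes K = 1024 bytes.
to_bytes() {
    num="${1%[KkMmGg]}"    # numeric part without the suffix
    case "$1" in
        *[Kk]) echo $((num * 1024)) ;;
        *[Mm]) echo $((num * 1024 * 1024)) ;;
        *[Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;
    esac
}

to_bytes 400M   # prints 419430400
to_bytes 1G     # prints 1073741824
to_bytes 500K   # prints 512000
```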

Available parameters

First off, you should have a look at /cgroup/vm0 (if vm0 is not running, start it now). You should see something like this:

#> ls -1 /cgroup/vm0/
cgroup.procs
cpuacct.stat
cpuacct.usage
cpuacct.usage_percpu
cpuset.cpu_exclusive
cpuset.cpus
cpuset.mem_exclusive
cpuset.mem_hardwall
cpuset.memory_migrate
cpuset.memory_pressure
cpuset.memory_spread_page
cpuset.memory_spread_slab
cpuset.mems
cpuset.sched_load_balance
cpuset.sched_relax_domain_level
cpu.shares
devices.allow
devices.deny
devices.list
freezer.state
memory.failcnt
memory.force_empty
memory.limit_in_bytes
memory.max_usage_in_bytes
memory.memsw.failcnt
memory.memsw.limit_in_bytes
memory.memsw.max_usage_in_bytes
memory.memsw.usage_in_bytes
memory.soft_limit_in_bytes
memory.stat
memory.swappiness
memory.usage_in_bytes
memory.use_hierarchy
net_cls.classid
notify_on_release
tasks

The memory.* entries are only there if your kernel supports the memory controller (see below in the F.A.Q.).

I will not go into each of those, for a complete description of each use google or go here: http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6-Beta/html/Resource_Management_Guide/subsystems.html

Limit memory and swap^

First the bad news: you cannot limit memory. At least not with the current Debian kernel. You have to build your own (or get one from somewhere; I could not find any pre-built one for now). How to build your own kernel from the current Debian kernel source with the memory controller enabled is described in the F.A.Q. section.

Assuming you have built the kernel (or got it from somewhere), here is how you limit the memory of a container. Again: don’t reduce memory for a running container unless you are perfectly sure what you are doing.

Set max memory:

lxc.cgroup.memory.limit_in_bytes = 256M

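Set max memory plus swap (the memsw limit counts memory and swap together, so it must not be lower than the memory limit):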

lxc.cgroup.memory.memsw.limit_in_bytes = 1G

More on the memory controller can be found here: http://www.mjmwired.net/kernel/Documentation/cgroups/memory.txt
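
To see how close a container is to its limit, the usage and limit files from the directory listing above can be compared. A small sketch, assuming the memory controller is enabled and the cgroup is mounted as described (the function name mem_usage is made up):

```shell
# Print the memory usage of a container as "used limit percent".
# $1: cgroup directory of the container, e.g. /cgroup/vm0
# Note: mem_usage is a hypothetical helper, not part of lxc.
mem_usage() {
    used=$(cat "$1/memory.usage_in_bytes")
    limit=$(cat "$1/memory.limit_in_bytes")
    # Integer arithmetic, so the percentage is rounded down.
    echo "$used $limit $((used * 100 / limit))%"
}

# Usage on the host:
#   mem_usage /cgroup/vm0
```

Without a limit configured the limit file holds a very large number, so the percentage will simply show 0%.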

Limit CPU^

There are two approaches to limiting CPU. First there is the scheduler, and second you can assign CPUs directly to the cgroup of a container.

Scheduler

The scheduler works like this: you assign vm0 the value 10 and vm1 the value 20. This means that in each CPU second vm1 will get twice as many CPU cycles as vm0. By default all values are set to 1024.

lxc.cgroup.cpu.shares = 512

More on the CPU scheduler: http://www.mjmwired.net/kernel/Documentation/scheduler/sched-design-CFS.txt
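
The shares are relative weights, so what a container really gets depends on what everybody else has. The arithmetic can be sketched as below. It assumes that all listed containers are busy at the same time and that nothing else competes for the CPU, which is a simplification. The function name cpu_share_percent is made up.

```shell
# Print the CPU percentage each container can expect when all of them are busy.
# Arguments: name=shares pairs, e.g. vm0=512 vm1=1024
# Note: cpu_share_percent is a hypothetical helper, not part of lxc.
cpu_share_percent() {
    total=0
    for pair in "$@"; do
        total=$((total + ${pair#*=}))    # sum up all share values
    done
    for pair in "$@"; do
        # Integer arithmetic, so percentages are rounded down.
        echo "${pair%%=*} $(( ${pair#*=} * 100 / total ))%"
    done
}

cpu_share_percent vm0=10 vm1=20
# prints:
# vm0 33%
# vm1 66%
```

If only one container is busy it can still use the whole CPU; the shares only matter when there is contention.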

CPUs

Assign actual CPUs to a container. Assume you have 4 CPUs, then the default value for all containers is 0-3 (all CPUs).

# assign first CPU to this container:
lxc.cgroup.cpuset.cpus = 0

# assign the first, the second and the last CPU
lxc.cgroup.cpuset.cpus = 0-1,3

# assign the first and the last CPU
lxc.cgroup.cpuset.cpus = 0,3

Another interesting value might be lxc.cgroup.cpuset.sched_relax_domain_level, look it up.

More on CPU sets: http://www.mjmwired.net/kernel/Documentation/cgroups/cpusets.txt
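
If the list notation looks cryptic, this little sketch expands it into single CPU numbers. It only handles the plain “a-b” ranges and comma-separated lists used above, and the function name expand_cpuset is made up.

```shell
# Expand a cpuset list such as "0-1,3" into individual CPU numbers.
# Note: expand_cpuset is a hypothetical helper, not part of lxc.
expand_cpuset() {
    out=""
    old_ifs=$IFS; IFS=,                  # split the argument on commas
    for part in $1; do
        case "$part" in
            *-*) i=${part%-*}; end=${part#*-}
                 while [ "$i" -le "$end" ]; do
                     out="$out $i"; i=$((i + 1))
                 done ;;
            *)   out="$out $part" ;;
        esac
    done
    IFS=$old_ifs
    echo "${out# }"                      # drop the leading space
}

expand_cpuset 0-1,3   # prints 0 1 3
```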

Limit hard disk space^

Well, so far this is not possible with cgroups, but easily achieved with LVM or image files. Simply create a logical volume with limited space in LVM or create a size-limited image file and mount it at the container path (before creating the container, or move the config and rootfs away first and back afterwards).

F.A.Q.^

Why do I have a lenny container?^

You might have noticed that you got yourself a Debian 5.0 lenny container. That’s because lenny is hard-coded in the lxc-debian script and there is no squeeze-compatible installer script (yet). I have modified the lxc-debian script (download below, or try the one from the sourceforge archive) but it does not yet work fully out of the box. You can play around with it.

#> DEBIAN_VERSION=squeeze lxc-debian -p /var/lib/lxc/vm0

If you’ve already created a container with lxc-debian, you probably have to wipe the directory /var/cache/lxc/.. before you get squeeze (the cache is not suite-aware).

I want my containers in LVM^

Just set up your physical device for LVM, create a volume group and the logical volume. Then mount it under /var/lib/lxc/<your-container>.

How to setup a private network between Containers^

Just set up an additional bridge on a tap interface. For this you need the uml-utilities package:

#> aptitude install uml-utilities

Then add the tap interface and setup the bridge

#> tunctl -t tap0
#> brctl addbr br1
#> brctl addif br1 tap0

For persistence add the following to /etc/network/interfaces

# Setup private bridge
auto br1
iface br1 inet manual
 pre-up /usr/sbin/tunctl -t tap0
 pre-up /sbin/ifconfig tap0 up
 post-down /sbin/ifconfig tap0 down
 bridge_ports tap0
 bridge_fd 0

Then edit the config files of the lxc containers (/var/lib/lxc/vmX/config) by adding (or replacing) the network config:

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br1
lxc.network.hwaddr = 00:FF:FF:11:22:33
lxc.network.ipv4 = 10.100.0.1

And (re)start your container.

How does LXC work ?^

Well, that’s a big question I cannot answer fully. But pstree gave me a good impression of how it differs from other virtualization techniques:

#> pstree -p
init(1)─┬─acpid(1027)
        ├─cron(1048)
        ├─dhclient3(903)
        ├─exim4(1298)
        ├─getty(1346)
        ├─getty(1347)
        ├─getty(1348)
        ├─getty(1349)
        ├─getty(1350)
        ├─getty(1351)
        ├─lxc-start(4682)───init(4691)─┬─dhclient3(4888)
        │                              ├─getty(4980)
        │                              ├─getty(4981)
        │                              ├─getty(4982)
        │                              ├─login(4979)───bash(4983)
        │                              └─sshd(4966)
        ├─lxc-start(5983)───init(5994)─┬─dhclient3(6200)
        │                              ├─getty(6292)
        │                              ├─getty(6293)
        │                              ├─getty(6294)
        │                              ├─getty(6417)
        │                              ├─login(6291)───bash(6295)
        │                              └─sshd(6278)
        ├─rsyslogd(1013)─┬─{rsyslogd}(1015)
        │                └─{rsyslogd}(1016)
        ├─sshd(1310)─┬─sshd(1352)───bash(1354)───lxc-console(6180)
        │            ├─sshd(1360)───bash(1362)───lxc-console(4877)
        │            └─sshd(5347)───bash(5350)───pstree(6418)
        └─udevd(400)─┬─udevd(512)
                     └─udevd(523)

As you can see, there are 3 init processes running, two of them descending from an lxc-start process. The containers in LXC are simply (but isolated) processes on the host machine. If you are familiar with chroot, you might get the picture.

Building a kernel with memory controller^

A warning first: building your own kernel could break your system. You could lose not only all your data, you could also damage your hardware. To use the swap part of the memory controller you have to activate an experimental feature. It is what it claims to be: experimental. You have been warned.

Aside from that, it’s no big deal.

The reason why the memory controller (besides the experimental swap part) is not activated is, as far as I know, the overhead of the necessary resource counter: the overall CPU usage is higher (about 2%, I remember reading somewhere) and the boot time is somewhat longer, even without using cgroups at all. And because there are no “cgroup modules”, it has to be built into the kernel. They are working on removing the overhead and hopefully it will be in stable kernels some time.

First off: there is a pre-built kernel in the Download section below, but only for i386! If you want to use it, you still have to do everything starting from “Install the kernel” (below). However, starting with your own build: get build-essential, kernel-package (which provides make-kpkg) and the current kernel source:

Get debian packages

#> aptitude install build-essential kernel-package linux-source-2.6.32 \
   libncurses-dev libc6-dev zlib1g-dev

This could take some time: about 100 MB to download, and about 700 MB on disk after extraction and build. Don’t forget libc6-dev and zlib1g-dev; it took me until the end of the first compile run to find out.

Extract kernel

After everything is downloaded, go to the source directory, extract the compressed tarball and create a symlink:

#> cd /usr/src
#> tar jxvf linux-source-2.6.32.tar.bz2
#> ln -s linux-source-2.6.32 linux

Copy config

Now switch to the created directory, clean it up and copy the config of the running kernel as a starting point:

#> cd linux
#> make clean
#> make mrproper
#> cp /boot/config-$( uname -r ) .config

Change the config

#> make menuconfig

Now follow the menu

  • Navigate to “Load Alternate Configuration File”.
  • Type in “.config” (without the quotes) and hit enter.
  • Go to “General Setup —>” (enter)
  • Go to “Control Group support —>” (enter)
  • Go to “Resource Counters”, enable it (space)
  • A subentry arises, enable: “Memory Resource Controller for Control Groups”
  • Another subentry, enable: “Memory Resource Controller Swap” (this is optional and not “so” important; if you are scared off by the experimental remark, don’t activate it).
  • Now go to Exit (three or four times) and say “Yes” when it asks you whether to save the configuration.
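
Before starting a build that takes hours, it may be worth checking that the menu choices really ended up in the saved .config. The sketch below assumes the option names used by 2.6.32 (CONFIG_RESOURCE_COUNTERS, CONFIG_CGROUP_MEM_RES_CTLR and CONFIG_CGROUP_MEM_RES_CTLR_SWAP, as far as I know); other kernel versions may name them differently. The function name check_memcg_config is made up.

```shell
# Report whether the memory controller options are switched on in a kernel config.
# $1: path of the config file (default: .config in the current directory)
# Note: check_memcg_config is a hypothetical helper; the option names are
# assumed to match kernel 2.6.32.
check_memcg_config() {
    cfg="${1:-.config}"
    for opt in CONFIG_RESOURCE_COUNTERS CONFIG_CGROUP_MEM_RES_CTLR \
               CONFIG_CGROUP_MEM_RES_CTLR_SWAP; do
        if grep -q "^$opt=y" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: not set"
        fi
    done
}

# Usage, from within /usr/src/linux:
#   check_memcg_config
```

The swap option may show up as “not set” if you decided to skip the experimental part.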

Build the kernel

That was the hard part. Now begin the build, and grab yourself a book and/or a cup of coffee; this might take some time.

#> make-kpkg clean
#> make-kpkg --append-to-version "-cgroup-memcap" --revision 1 --us --uc --initrd kernel_image kernel_headers

You can leave out the “kernel_headers” if you don’t require them!

Install the kernel

After a long time (depending on your CPU, RAM, HDD and such; it took me about three hours on my Atom box) you can install your newly created kernel image package from /usr/src, create the initrd image in /boot, update grub2 once more and reboot (in lenny, update-grub instead of update-grub2 should probably do it, not tested).

#> dpkg -i linux-image-2.6.32-cgroup-memcap_1_i386.deb
#> mkinitramfs -o /boot/initrd.img-2.6.32-cgroup-memcap 2.6.32-cgroup-memcap
#> update-grub2
#> reboot

Check results

That’s all. The memory controller should work now:

#> lxc-checkconfig
Kernel config /proc/config.gz not found, looking in other places...
Found kernel config file /boot/config-2.6.32-cgroup-memcap
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
Network namespace: enabled
Multiple /dev/pts instances: enabled

--- Control groups ---
Cgroup: enabled
Cgroup namespace: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled
Macvlan: enabled
Vlan: enabled
File capabilities: enabled

Yes, you have to do this each time a new kernel is released and you want to install it.

How to safely stop a container^

As you might have noticed, lxc-stop is kind of rude. It simply kills the parent lxc-start process and thereby everything under it. If you want a “sane” shutdown, for now there is no other solution than to shut down the machine from inside, e.g. via ssh or lxc-console, like so:

#> lxc-console -n vm0
#vm0> init 0

Then you have to wait for it (use pstree to see whether it is still running) and stop it:

#> lxc-stop -n vm0
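
Instead of watching pstree, the waiting could be scripted by counting the entries in the container’s tasks file under /cgroup. This is only a sketch and rests on an untested assumption: that after “init 0” only the container’s init is left in the cgroup, so one remaining task means it is safe to call lxc-stop. The function name wait_until_idle is made up.

```shell
# Wait until at most one task (assumed to be the container's init) is left.
# $1: cgroup directory of the container, $2: timeout in seconds (default 60)
# Returns 0 when the container is idle, 1 when the timeout is reached.
# Note: wait_until_idle is a hypothetical helper, not part of lxc.
wait_until_idle() {
    dir="$1"; timeout="${2:-60}"; waited=0
    while [ -e "$dir/tasks" ] && [ "$(wc -l < "$dir/tasks")" -gt 1 ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage on the host, after "init 0" inside the container:
#   wait_until_idle /cgroup/vm0 120 && lxc-stop -n vm0
```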

How to remove a container permanently^

Stop the container and use the lxc-destroy command

#> lxc-destroy -n vm0

This will delete all files from /var/lib/lxc/vm0

How to freeze and unfreeze a container^

LXC can freeze all the processes in a running container. This is not to be confused with the hibernation a laptop does (saving memory content to hard disk). It simply blocks all processes.

#> lxc-freeze -n vm0

This will write the memory (and swap) into

Downloads^

Kernel^

The latest kernel I’ve built is here (sorry, only i386, I built it on my old Atom box). It is based on the 2.6.32-3-686 Debian squeeze kernel (Saturday, May 01 2010).


Disclaimer^

Of course: no warranties or guarantees of any kind are given. If something terrible happens to you or your hardware, don’t blame me.