Remember to set CONFIG_CFS_BANDWIDTH

I spent a while trying to debug a runc problem where it would always get an EACCES error writing the cpu.cfs_period_us file in a cpu cgroup.

The problem turned out to be that I had not enabled CONFIG_CFS_BANDWIDTH in my kernel build. Presumably, when runc tries to write the file, it passes O_CREAT and cgroupfs doesn’t let it create a new file, which leads to the somewhat surprising error.

So, if you get this error, just turn on CONFIG_CFS_BANDWIDTH :)
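
Before rebuilding, it can be worth checking whether the running kernel actually has the option set. Here is a minimal Python sketch, assuming the kernel config is exposed at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or at /boot/config-<release>. The helper names option_enabled and read_kernel_config are made up for illustration.

```python
import gzip
import os
import platform


def option_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value in ("y", "m")
    return False


def read_kernel_config():
    """Return the running kernel's config text, or None if unavailable."""
    candidates = [
        ("/proc/config.gz", gzip.open),
        ("/boot/config-" + platform.release(), open),
    ]
    for path, opener in candidates:
        try:
            with opener(path, "rt") as f:
                return f.read()
        except OSError:
            # Missing or unreadable; try the next location.
            continue
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found")
    else:
        print("CONFIG_CFS_BANDWIDTH enabled:",
              option_enabled(text, "CONFIG_CFS_BANDWIDTH"))
```

If neither location exists on your system, the same parsing works on the .config file from your kernel build tree.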

Read more...

Buildroot, Systemd, and Getty

I just spent a few hours trying to get systemd to spawn a getty on the vt console for a Linux image that I was building with Buildroot. It turns out that it helps to read the right documentation, which in this case was a blog post about systemd console handling. The money quote for me was:

    In systemd, two template units are responsible for bringing up a login prompt on text consoles:
Read more...

Using alternatives(8) to enable lld

In this post, I remember how to use the alternatives(8) mechanism to make clang’s lld linker the default. First, tell alternatives that lld is available and set it at a high priority:

    $ sudo alternatives --install /usr/bin/ld ld /usr/bin/lld 80
    $ sudo alternatives --auto ld

Then, just verify that it worked:

    $ alternatives --display ld
    ld - status is auto.
     link currently points to /usr/bin/lld
    /usr/bin/ld.bfd - priority 50
    /usr/bin/ld.gold - priority 30
    /usr/bin/lld - priority 80
    Current `best' version is /usr/bin/lld.
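
Since alternatives works by pointing /usr/bin/ld at a symlink under /etc/alternatives, another way to double-check the result is to walk the symlink chain. This is a small Python sketch of that idea; resolve_link_chain is a hypothetical helper name, not part of any tool.

```python
import os


def resolve_link_chain(path):
    """Follow symlinks one hop at a time and return every path visited."""
    chain = [path]
    seen = {path}
    while os.path.islink(path):
        target = os.readlink(path)
        # Relative targets are resolved against the link's own directory;
        # absolute targets replace the path entirely.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        if path in seen:
            # Symlink loop; stop rather than spin forever.
            break
        seen.add(path)
        chain.append(path)
    return chain


if __name__ == "__main__":
    print(" -> ".join(resolve_link_chain("/usr/bin/ld")))
```

On a system set up as above, I would expect the chain to pass through /etc/alternatives/ld and end at /usr/bin/lld, but the exact paths depend on the distribution.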
Read more...

Debugging libstdc++ strings

Writing this down quickly before I forget. When debugging a std::string from GNU libstdc++, the debugger typically won’t show you the actual representation. First, you need to turn off the pretty printer (assuming that it worked in the first place):

    (gdb) p reregisterSlaveMessage.resource_version_uuid_.ptr_
    $13 = (std::string *) 0x7f4d98d5e970
    (gdb) p *reregisterSlaveMessage.resource_version_uuid_.ptr_
    $14 = "\022\020K|\n\225\064\246CE\222\350\275\315t", <incomplete sequence>
    (gdb) disable pretty-printer
    2 printers disabled
    0 of 2 printers enabled
    (gdb) p *reregisterSlaveMessage.resource_version_uuid_.ptr_
    $15 = {
      static npos = <optimized out>,
      _M_dataplus = {
        <std::allocator<char>> = {
          <__gnu_cxx::new_allocator<char>> = {<no data fields>}, <no data fields>},
        members of std::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider:
        _M_p = 0x7f4d98d67068 "\022\020K|\n\225\064\246CE\222\350\275\315t", <incomplete sequence>
      }
    }

Next, you need to know that the internal structure of std::string is prepended to the actual string data, so you need to cast and subtract from the data pointer to find the length and refcount.
Read more...

Tracing rmdir system calls with SystemTap

I wanted to know who was removing the Mesos memory cgroups hierarchy and why, so I turned to SystemTap. Here’s my one-liner:

    sudo stap \
        -d /usr/lib/systemd/libsystemd-shared-233.so \
        -d /usr/lib64/libc-2.25.so \
        -d /usr/lib/systemd/systemd \
        -e 'probe kernel.function("sys_rmdir") { printf("%s(%s): %s\n", execname(), pp(), user_string($pathname)); print_ubacktrace(); }'

Note that you have to feed in the binaries you expect to see in order to get user stack traces. The corresponding systemd stack trace was:
Read more...

What I learned about Linux AIO today

  1. There is no filesystem that implements the AIO fsync operation.
  2. AIO reads on vboxfs (the VirtualBox filesystem) return -EPROTO. No idea why that happens.
  3. Neither vboxfs nor tmpfs supports O_DIRECT.
  4. AIO seems to work as advertised on XFS, with or without O_DIRECT.
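
To check the O_DIRECT point on a given mount, a quick probe is to try opening a file there with the flag set. The following is a rough Python sketch, assuming a Linux system where a filesystem without O_DIRECT support fails the open with EINVAL. The function name supports_o_direct is hypothetical, and results will vary with kernel version.

```python
import errno
import os
import tempfile


def supports_o_direct(directory):
    """Return True if a file in `directory` can be opened with O_DIRECT."""
    if not hasattr(os, "O_DIRECT"):
        # The flag is not defined on this platform at all.
        return False
    # Create a scratch file on the filesystem under test.
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    except OSError as e:
        if e.errno == errno.EINVAL:
            # open(2) reports EINVAL when the filesystem lacks O_DIRECT.
            return False
        raise
    else:
        os.close(fd)
        return True
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for d in ("/tmp", "/dev/shm"):
        if os.path.isdir(d):
            print(d, supports_o_direct(d))
```

This only tests the open; it says nothing about whether AIO itself behaves on that filesystem.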
Read more...

Leak detection with tcmalloc

tcmalloc has a built-in leak detection mechanism. It took me a couple of tries to figure out how to work it, even after reading the documentation. At least on CentOS 7, the trick is to make sure you install the pprof package as well as the gperftools-libs package. You will also need to set the PPROF_PATH environment variable so that the tcmalloc runtime can find pprof. If you don’t do this, then the leaks report will not resolve symbols, so the stack traces will not be that useful.
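
As a rough illustration, here is a Python sketch that assembles the environment for running a program under the tcmalloc heap checker. HEAPCHECK and PPROF_PATH are the gperftools environment variables; preloading the library via LD_PRELOAD is one option if the program is not already linked against tcmalloc. The function name heapcheck_env and the paths in the comment are assumptions, so adjust them to wherever your packages install pprof and libtcmalloc.

```python
import os


def heapcheck_env(pprof_path, tcmalloc_lib, mode="normal", base=None):
    """Return a copy of the environment set up for the tcmalloc heap checker."""
    env = dict(os.environ if base is None else base)
    env["HEAPCHECK"] = mode          # turns on leak checking at exit
    env["PPROF_PATH"] = pprof_path   # lets the runtime find pprof for symbols
    # Put tcmalloc first in LD_PRELOAD, keeping anything already there.
    preload = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = tcmalloc_lib + (":" + preload if preload else "")
    return env


# Example use (paths and program name are assumptions, not verified):
#   import subprocess
#   env = heapcheck_env("/usr/bin/pprof", "/usr/lib64/libtcmalloc.so.4")
#   subprocess.call(["./my_program"], env=env)
```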
Read more...

Debugging iPXE from the iLO console

So I’ve been trying to get iPXE chainloading to work and I’ve been using the iLO virtual serial console over SSH to verify and debug. iPXE has a DHCP debug build option which you can enable by doing make bin/undionly.kpxe DEBUG=dhcp. However, when you do this, you will find that each line of output on the iLO virtual serial console overwrites the previous line, creating a big illegible mess. Fortunately, you can build iPXE with only serial output support, so that you can actually read the debug messages on the iLO virtual serial console.
Read more...