Tuesday, February 28, 2006

Update on zoneadm create with zfs

After contacting Sun about getting my work integrated into OpenSolaris, I found that Sun was already working on this feature. They have since indicated that the code made it into an internal source tree. As such, I am holding off on further development until I can get at that code.

However, if you want to try it out, I have posted the code for others to play with. If you have a working OpenSolaris build environment, you should be able to drop in my modified zoneadm.c, run dmake all, then use the resulting zoneadm command. Alternatively, the sparc version of the zoneadm binary is also available.



Sunday, February 19, 2006

Zone created in 0.922 seconds

I noticed today that "zoneadm clone" exists in the latest OpenSolaris code. Unfortunately, the only cloning method it offers is a copy that is essentially "find | cpio". A bit of hacking later and we have this:

# time ksh -x /var/tmp/clone
+ newzone=fast
+ template=template
+ zoneadm=/ws/usr/src/cmd/zoneadm/zoneadm
+ PATH=/usr/bin:/usr/sbin
+ zonecfg -z fast create -t template
+ zonecfg -z fast set zonepath=/zones/fast
+ /ws/usr/src/cmd/zoneadm/zoneadm -z fast clone -m zfsclone template
Cloning zonepath /zones/template...

real    0m0.922s
user    0m0.128s
sys     0m0.171s
This is achieved by using zfs to create a snapshot of the template zone, then cloning the snapshot to create the zonepath of the new zone. A bit of cleanup is needed, but goodness is on the way.


Wednesday, February 15, 2006

hunting bugs in filebench

I've been using filebench a bit at work and decided that I would like to try a few things out at home. My home machine is not quite as beefy as the V40z's that I have been testing on at work. Getting filebench to compile in the first place is a bit of work. Probably works really well on someone else's system, but mine is obviously different. That's another story though. After compiling filebench, I ran it for the first time and saw this:

$ /opt/filebench/bin/filebench
Segmentation fault (core dumped)
Bummer. Well, let's see where that is at:
$ gdb /opt/filebench/bin/filebench core
GNU gdb 6.4-debian

. . .

(gdb) where
#0  0x37dd84aa in memset () from /lib/tls/i686/cmov/libc.so.6
#1  0x0807b01e in ?? ()
#2  0x080522da in ipc_init () at ipc.c:264
#3  0x08058bc1 in main (argc=1, argv=0x3f8fdcf4) at parser_gram.y:1140
OK, so let's go with the assumption that the bug is in the code listed as alpha on the web site, and not libc. So we go up the stack a couple levels.
(gdb) up 2
#2  0x080522da in ipc_init () at ipc.c:264
264             memset(filebench_shm, 0, c2 - c1);
(gdb) print filebench_shm
$1 = (filebench_shm_t *) 0xffffffff
Hmmm... 0x with a bunch of f's looks like -1. Perhaps some system call on Solaris (presumably where filebench started) returns NULL on error and on Linux it returns -1. Let's go looking for that system call.
(gdb) list
259     #endif /* USE_PROCESS_MODEL */
261             c1 = (caddr_t)filebench_shm;
262             c2 = (caddr_t)&filebench_shm->marker;
263
264             memset(filebench_shm, 0, c2 - c1);
265             filebench_shm->epoch = gethrtime();
266             filebench_shm->debug_level = 2;
267             filebench_shm->string_ptr = &filebench_shm->strings[0];
268             filebench_shm->shm_ptr = (char *)filebench_shm->shm_addr;
Nope, not there. Maybe a bit further up.
(gdb) list 250
245     #endif
247             if ((filebench_shm = (filebench_shm_t *)mmap(0, sizeof(filebench_shm_t),
248                     PROT_READ | PROT_WRITE,
249                     MAP_SHARED, shmfd, 0)) == NULL) {
250                     filebench_log(LOG_FATAL, "Cannot mmap shm");
251                     exit(1);
252             }
254     #else
It looks like mmap may be the culprit. I first asked man, but this is Linux, not Solaris. No man page for mmap! Next try google. Google comes up with this page that looks a lot like a man page. Why isn't that found on my system? Another thing for another day. Anyway, it says:

On success, mmap returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *) -1) is returned, and errno is set appropriately. On success, munmap returns 0, on failure -1, and errno is set (probably to EINVAL).

Ok, so it is returning -1 because it doesn't like something. Let's see what it is trying to mmap:
(gdb) print sizeof(filebench_shm_t)
$2 = 907368000
(gdb) print sizeof(filebench_shm_t) / 1024 / 1024
$3 = 865

That 'splains it. It looks like it is trying to set up a shared memory segment that is 865 MB. My poor little system only has 512 MB of RAM. FWIW, I have created a patch that addresses this one problem, but I haven't had a chance to test it on Solaris yet. Unfortunately, all the patch does is make filebench report that the mmap failed instead of crashing. It doesn't address the fact that filebench is trying to allocate a shared memory segment larger than the RAM on my system.
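For what it's worth, a rough pre-check for the "bigger than RAM" case could look something like the sketch below. This is not part of my patch or of filebench; fits_in_ram is a made-up helper name, and it assumes sysconf supports _SC_PHYS_PAGES, which Linux (glibc) and Solaris provide but POSIX does not require.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not filebench code.
 * Returns 1 if a request of `bytes` is no larger than physical RAM,
 * 0 if it is larger, and -1 if the amount of RAM cannot be determined.
 */
int
fits_in_ram(size_t bytes)
{
	long pages = sysconf(_SC_PHYS_PAGES);	/* non-POSIX extension */
	long pagesize = sysconf(_SC_PAGESIZE);
	size_t needed;

	if (pages <= 0 || pagesize <= 0)
		return (-1);

	/* Count pages needed, rounding up; dividing avoids overflow. */
	needed = bytes / (size_t)pagesize;
	if (bytes % (size_t)pagesize != 0)
		needed++;

	return (needed <= (size_t)pages);
}
```

Calling fits_in_ram(sizeof (filebench_shm_t)) before the mmap would at least allow a more useful message than "Cannot mmap shm". Physical RAM is only a crude proxy for whether the mapping will succeed, so treat it as a hint rather than a guarantee.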

Update 1:

I have posted several patches to the bug tracking system at sourceforge.net. This particular one is 1432638. It turns out that mmap on Solaris also returns MAP_FAILED so the patch is simpler than I originally expected.
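The shape of the fix is roughly the following. This is a minimal sketch rather than the patch itself: map_shared is a hypothetical helper name, and the only point is that the return value of mmap is compared against MAP_FAILED instead of NULL.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not filebench code.
 * Maps `len` bytes of `fd` as shared read/write memory.
 * Returns the mapping, or NULL after reporting why mmap failed.
 */
void *
map_shared(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* mmap signals failure with MAP_FAILED, i.e. (void *)-1, not NULL. */
	if (p == MAP_FAILED) {
		fprintf(stderr, "Cannot mmap shm: %s\n", strerror(errno));
		return (NULL);
	}
	return (p);
}
```

With a check like this, the bogus 0xffffffff pointer never reaches memset, so the failure shows up as an error message instead of a core dump.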