
Monday, January 03, 2011

dtrace PID provider vs. libraries loaded by python

As I was doing some work on beadm, I really wanted to know who was mucking with properties on datasets. It looked as though I could do something like:

# dtrace -qn 'pid$target:libzfs.so.1:zfs_prop_set:entry / copyinstr(arg0 + 8) == $1 /
{
    printf("%s(%s,%s,%s)\n", probefunc, copyinstr(arg0 + 8), copyinstr(arg1), copyinstr(arg2));
    ustack();
}' -c 'beadm create test' $dataset
But that doesn't work:
... probe description pid12942:libzfs.so.1:zfs_prop_set:entry does not match any probes
My guess is that this is because the python executable is not linked against libzfs - a very good thing. To get around this, we need to preload libzfs. For example:
# LD_PRELOAD_32=libzfs.so.1 dtrace -qn 'pid$target:libzfs.so.1:zfs_prop_set:entry / copyinstr(arg0 + 8) == $1 /
{
    printf("%s(%s,%s,%s)\n", probefunc, copyinstr(arg0 + 8), copyinstr(arg1), copyinstr(arg2));
    ustack();
}' -c 'beadm create test' rpool/zones/z1/rpool/ROOT/zbe-1
zfs_prop_set(rpool/zones/z1/rpool/ROOT/zbe-1,mountpoint,legacy)

              libzfs.so.1`zfs_prop_set
              libbe.so.1`be_unmount_zone_root+0xa9
              libbe.so.1`be_update_zone_vfstab+0x106
              libbe.so.1`be_copy_zones+0x9d9
              libbe.so.1`be_copy+0x8f2
              libbe_py.so`beCopy+0x245
...
The only change here was the addition of LD_PRELOAD_32=libzfs.so.1 at the beginning of the command. LD_PRELOAD would have worked just as well here, but it breaks down when the library being traced has no 64-bit variant: dtrace fails to start because a 64-bit copy of the library cannot be found. I ran into this when I needed to poke around at calls within libbe.so.1, which is 32-bit only.
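
To illustrate the point (a sketch rather than a transcript - the be_* wildcard and the choice of 'beadm list' are only for illustration), the same preload trick is what makes the 32-bit-only libbe.so.1 visible to the pid provider:

# LD_PRELOAD_32=libbe.so.1 dtrace -qn 'pid$target:libbe.so.1:be_*:entry
{
    printf("-> %s\n", probefunc);
}' -c 'beadm list'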

Friday, April 20, 2007

Today's dtrace one-liners

Well, I've been spending a bit more time with dtrace lately. I'm now starting to do interesting things with dtrace without looking at the manual or snagging other scripts. Today I found some processes that looked like they weren't doing much of use. In particular, I saw the following in "prstat -mL"

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 
 15274 oracle    46  53 0.0 0.0 0.0 0.0 0.0 0.0   0  67 .22   0 racgmain/1
 15295 oracle    47  53 0.0 0.0 0.0 0.0 0.0 0.0   0  74 .23   0 racgmain/1
 16810 oracle    47  53 0.0 0.0 0.0 0.0 0.0 0.0   0  66 .23   0 racgmain/1
 25029 oracle    47  53 0.0 0.0 0.0 0.0 0.0 0.0   0  70 .23   0 racgmain/1

First, it is confusing that prstat says these processes are making less than one system call per second (.22 or .23 in the SCL column) yet spending more than 50% of their time in system calls. Looks like a bug in prstat. To see which system call was being used...

# dtrace -n 'syscall:::entry/execname == "racgmain"/ { @[probefunc] = count() }'
dtrace: description 'syscall:::entry' matched 232 probes
^C

  read                                                         269039
Ahh, so it is only reads that are causing all of this. Let's see which file descriptors are involved.
# dtrace -n 'syscall::read:entry/execname == "racgmain"/{ @[pid, probefunc,arg0] = count() }'
^C
    22469  read      9                7
    22469  read      4                7
    22469  read      3                9
    22469  read      8                9
    22469  read     10               10
    22469  read      7               21
    22469  read      6               24
    15274  read     10          1026462
    15295  read     10          1044932
    25029  read     10          1051450
    16810  read     11          1060720

It seemed rather odd that there would be so much reading on a single file descriptor in each process (10 or 11, depending on the process). If that much data were really flowing, I would expect to see other processes showing up in prstat that were high in %sys doing the corresponding writes. Using pfiles on each of the pids showed that the file descriptor in question was a FIFO, but gave no indication of what (if anything) was on the other end of the FIFO.
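
As an aside, on bits where DTrace has the fds[] array, the filesystem type behind each descriptor - which should show fifofs for a FIFO - can be seen without reaching for pfiles. A sketch, not something from the original session:

# dtrace -n 'syscall::read:entry/execname == "racgmain"/{ @[pid, arg0, fds[arg0].fi_fs] = count() }'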

To help answer the question of what might be on the other end of the FIFO, I tried this (ok, not a one-liner...)

#!/usr/sbin/dtrace -qs 

syscall::read:return
/execname == "racgmain"/
{
        @c[pid, arg0] = count()
}

tick-1s
{
        printf("%6s %6s %4s\n", " pid  ", " size ", "count");
        printa("%6d %6d %@4d\n", @c);
        clear(@c);
}

Which gave the following output...

 pid    size  count
 15295      0 117524
 15274      0 121065
 25029      0 121308
 16810      0 125358
 pid    size  count
 15274      0 108177
 25029      0 109639
 15295      0 115735
 16810      0 117175
 pid    size  count
 15295      0 119411
 15274      0 119471
 25029      0 120273
 16810      0 120650

Imagine that... these processes are each doing 100,000+ zero-byte reads per second. Time to tell the DBAs that they have some real problems with one of their daemons.
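
Had they wanted more detail, a natural next step (again a sketch, not output from that day) would be to catch the user stacks behind those zero-byte reads:

# dtrace -n 'syscall::read:return/execname == "racgmain" && arg0 == 0/{ @[pid, ustack()] = count() }'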