Mike Gerdts

Saturday, April 03, 2010

Connecting to twist server

Once your development environment is set up, it is useful to know what is required to connect to the twist server.

#! /opt/opsware/smopython2/python2

import sys
sys.path.append("/opt/opsware/smopylibs2")
from pytwist import *

# Establish an unauthenticated connection to twist server.  Note that the
# hostname "twist" must resolve either through /etc/hosts or DNS.
ts = twistserver.TwistServer()

In the event the hostname twist does not resolve or points to the wrong twist server, you can do...

ts = twistserver.TwistServer("twist.mycompany.com")

See the TwistServer Method Syntax section on page 75 of the Developer's Guide for other options that may be important to you.

It is also quite likely that you will need an authenticated session. This is covered in the Error Handling section on page 76 of the Developer's Guide. To prompt for a username and password, I have found something like the following works well...

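import getpass

def authenticate(server):
    # range(1, 3) allows two attempts before giving up.
    for tries in range(1, 3):
        try:
            sys.stdout.write("HPSA login: ")
            sys.stdout.flush()
            user = sys.stdin.readline().strip()
            pw = getpass.getpass("HPSA password: ")
            server.authenticate(user, pw)
            return True
        except KeyboardInterrupt:
            # Let Ctrl-C abort instead of counting as a failed login.
            raise
        except:
            sys.stderr.write("Authentication failed.\n")
    return False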

...
ts = twistserver.TwistServer()
authenticate(ts)

Opsware/HPSA API: Getting Started

The HP Server Automation Platform Developer's Guide (for HPSA 7.50, September, 2008) and any other resources I've been able to find are far from complete and sometimes misleading with respect to using the HPSA API. As I've tried to clear up my confusion with the API, Google has helped me exactly 0% of the time. Maybe my rambling on the subject will lead to someone leading me to a better approach...

In this first post on the subject, let's start out with enough to get a working Python environment. Chapter 3 of the Developer's Guide indicates that pytwist relies on Python 1.5.2. Ugh. However, pytwist can be used with Python 2.4. Rather than following the advice to install the /Opsware/Tools/Opsware API Access software policy, install the following policies:

/Opsware/Tools/Python 2/Python 2 for Server Modules
/Opsware/Tools/Python 2 Opsware API Access for Server Modules/Python 2 Opsware API Access for Server Modules

The permissions are likely such that you need to be root to execute. Fix this with:

# chmod -R a+r /opt/opsware/smopython2
# exit
$ /opt/opsware/smopython2/python -V
Python 2.4.4

Ahhh, much better.

Notice the path to python above. It is actually a shell script that sets PYTHONHOME then tries to execute python. Unfortunately, Solaris doesn't like a shebang line in one script referring to another script. The resulting error looks like:

$ ./twister1 
import: Unable to connect to X server ().
./twister1: line 4: syntax error near unexpected token `"/opt/opsware/smopylibs2"'
./twister1: line 4: `sys.path.append("/opt/opsware/smopylibs2")'

In other words, the shebang (#!) line in Python scripts will be useless as the software policy installs it.

Workaround 1:

$ export PYTHONHOME=/opt/opsware/smopython2/34c.0.0.5.31-1

Then use a shebang line like:

#! /opt/opsware/smopython2/34c.0.0.5.31-1/bin/python2

Workaround 2: Compile this program and put it at /opt/opsware/smopython2/python2

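#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define PYTHONHOME "/opt/opsware/smopython2/34c.0.0.5.31-1"

/* Set PYTHONHOME, then replace this process with the real interpreter. */
int main(int argc, char **argv) {
        char *python = PYTHONHOME "/bin/python2";

        (void)argc;
        argv[0] = python;
        if (setenv("PYTHONHOME", PYTHONHOME, 1) != 0) {
                perror("setenv");
                return 1;
        }

        execv(python, argv);
        /* execv() returns only on failure. */
        perror(python);
        return 1;
}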

Then use a shebang line like:

#! /opt/opsware/smopython2/python2

In any case (even if you used Python 1.5 like the Developer's Guide suggests) the path to python shown in the guide does not match reality. Adjust your scripts accordingly.

Friday, June 19, 2009

Shell Programming and PATH

As most readers of this blog will already know, the PATH environment variable is used to locate commands that are executed. Key things to remember as you read this post are:

Environment variables (including PATH) are inherited by child processes
Child processes are unaffected by the parent process subsequently changing PATH to something else
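The two points above can be seen in a minimal sketch. It assumes a POSIX shell, and /opt/demo/bin is a hypothetical directory used only to make the parent's change visible. The child is started first, the parent changes PATH afterwards, and the child still reports the PATH it inherited.

```shell
#!/bin/sh
# Remember the PATH the child will inherit.
orig_path=$PATH
export PATH

out=$(mktemp "${TMPDIR:-/tmp}/pathdemo.XXXXXX") || exit 1

# Start a child process that reports its own PATH a moment later.
sh -c 'sleep 1; printf "%s\n" "$PATH"' >"$out" &

# Change PATH in the parent while the child is still running.
# /opt/demo/bin is a hypothetical directory.
PATH="/opt/demo/bin:$PATH"
export PATH

wait
child_path=$(cat "$out")
rm -f "$out"

echo "parent PATH: $PATH"
echo "child PATH:  $child_path"
```

The child's PATH was fixed at the moment it was started, so only processes started after the change see the new value.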

So what's the big deal? Suppose you have a shell script that calls ps -fe. It works great for you because you have /usr/bin first in your PATH. However, the guy down the hall that cut his teeth on a BSD system has /usr/ucb first. If your shell script does not set PATH=/usr/bin:... prior to calling ps, your shell script will work for you but give strange errors for the guy down the hall. Of course, your shell script could just specify /usr/bin/ps -fe...
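Here is a rough sketch of that situation that is safe to run anywhere. The two directories and the fake ps scripts are stand-ins I made up for /usr/bin and /usr/ucb; the only thing that differs between the two runs is the order of the directories in PATH.

```shell
#!/bin/sh
# Build two throwaway directories, each holding its own fake "ps".
demo=$(mktemp -d "${TMPDIR:-/tmp}/pathorder.XXXXXX") || exit 1
mkdir "$demo/sysv" "$demo/ucb"
for flavor in sysv ucb; do
    printf '#!/bin/sh\necho "%s ps"\n' "$flavor" >"$demo/$flavor/ps"
    chmod +x "$demo/$flavor/ps"
done

# Same command line, different directory order in PATH.
mine=$(PATH="$demo/sysv:$demo/ucb:$PATH" ps -fe)
theirs=$(PATH="$demo/ucb:$demo/sysv:$PATH" ps -fe)

echo "my PATH finds:    $mine"
echo "their PATH finds: $theirs"
rm -rf "$demo"
```

The script never changes, yet it runs a different program depending on whose PATH it inherits.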

This brings up four different styles that are seen...

Style 1: Just hope for the best

#! /usr/bin/ksh

count=$(ps -fe | wc -l)
echo "There are $count processes running"

Style 2: Specify full path whenever calling a program

#! /usr/bin/ksh

count=$(/usr/bin/ps -fe | wc -l)
/usr/bin/echo "There are $count processes running"

Style 3: Create variables to store full path to all programs

#! /usr/bin/ksh

PS=/usr/bin/ps
WC=/usr/bin/wc
ECHO=/usr/bin/echo

count=$($PS -fe | $WC -l)
$ECHO "There are $count processes running"

Style 4: Set PATH to use the commands you want to use

#! /usr/bin/ksh

export PATH=/usr/bin
count=$(ps -fe | wc -l)
echo "There are $count processes running"

With Style 1, the script is only reliable for the subset of users that have the right version of ps first in their PATH.

A workaround for this is shown in Style 2. However, this example has an intentional problem that is somewhat common when this approach is used. Notice that wc is not specified by its full path. This will work fine until someone with a really messed up (or unset) PATH tries to execute the script.

Style 3 fixes the ps and wc problems, but introduces another small problem: it forces a fork() and exec*() to run /usr/bin/echo, something that could be done more efficiently via the shell's built-in echo. I'll talk more about this in a future post.
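One way to check which commands the shell handles itself is command -v, which reports a built-in by its bare name and an external program by its full path. This is only a sketch: classify is a helper name I made up, and it would also label an alias or shell function as a built-in.

```shell
#!/bin/sh
# Classify a command name using the form of "command -v" output.
classify() {
    case $(command -v "$1") in
        /*) echo external ;;   # full path: needs fork() and exec*()
        "") echo missing ;;    # not found at all
        *)  echo builtin ;;    # bare name: handled inside the shell
    esac
}

echo_kind=$(classify echo)
wc_kind=$(classify wc)
echo "echo is $echo_kind, wc is $wc_kind"
```

Anything reported as a built-in costs a new process when it is called by full path instead.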

Style 4 keeps the simplicity of Style 1, but ensures that each user will get the same version of the commands. The author of the script can tailor PATH to contain the minimum set to find the required commands and test the script to gain a high degree of confidence that the script will work for others.

I have a strong preference for Style 4. Shell programming in this style retains the feel of using a shell interactively, keeps the code understandable, and performs reliably. But this doesn't mean that it is always the right thing to do. Consider the batch command. It doesn't set PATH, and it is correct in not doing so. That is, if the following:

exec /usr/bin/at -qb $*

were replaced with

PATH=/usr/bin; export PATH
...
at -qb $*

then the environment that at(1) attaches to the job would change, potentially breaking the job.
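The effect can be sketched without touching the real at(1). In the following, job_env is a hypothetical stand-in that simply reports the PATH a queued job would inherit, and /opt/myapp/bin is a made-up directory in the submitting user's PATH.

```shell
#!/bin/sh
# job_env is a hypothetical stand-in for at(1): it reports the PATH that a
# queued job would inherit from whoever called it.
job_env() {
    sh -c 'printf "%s\n" "$PATH"'
}

# Hypothetical PATH of the user who submits the job.
user_path="/opt/myapp/bin:$PATH"

# Wrapper in the style of batch: PATH is left alone.
kept=$(PATH=$user_path; export PATH; job_env)

# Wrapper that resets PATH first, as in Style 4.
changed=$(PATH=$user_path; export PATH
          PATH=/usr/bin:/bin; export PATH
          job_env)

echo "job sees (PATH untouched): $kept"
echo "job sees (PATH reset):     $changed"
```

In the second case the job loses /opt/myapp/bin, so a job that relies on a command from that directory would fail when it runs.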

Shell Programming and temporary files

I've been generally impressed with the coding standards that are enforced on C code in OpenSolaris, as checked by cstyle. However, similar automated checks don't exist for shell scripts. This, combined with a history of "/bin/sh is the one true shell", has led to inefficient scripts that very commonly have security vulnerabilities.

I'm hoping to breathe some life into this blog with a series of posts that describe some of the common problems and some potential solutions. I'll start with security problems related to temporary files.

Consider the following bit of code taken from /usr/lib/lp/bin/lpadmin.

# Do the LP configuration for a local printer served by lpsched
if [[ -x ${LPADMIN} && -n "${local}" ]] ; then
        # enumerate LP configured printers before modification
        PRE=/tmp/lpadmin-pre.$$
        (/bin/ls /etc/lp/printers 2>/dev/null ; /bin/ls /etc/lp/classes \
                2>/dev/null) >${PRE}

There are two problems here:

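A symbolic link vulnerability exists. The file name in /tmp is predictable (it varies only by process ID), so another user can create a symbolic link with that name in advance and the redirection will write through it.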
The TMPDIR environment variable is not respected. If a user has a reason to want temporary files stored in a particular file system, utilities should respect the user's wishes.
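The first problem can be reproduced harmlessly inside a private sandbox directory. This is a sketch with made-up file names: important-file plays the role of a file the attacker wants clobbered, and lpadmin-pre.12345 plays the role of the predictable temporary file name.

```shell
#!/bin/sh
# Work in a private sandbox so nothing real is harmed.
sandbox=$(mktemp -d "${TMPDIR:-/tmp}/symlink-demo.XXXXXX") || exit 1

# A file the "attacker" wants to see overwritten (hypothetical name).
victim="$sandbox/important-file"
echo "precious data" >"$victim"

# The attacker guesses the predictable name and plants a symbolic link.
predictable="$sandbox/lpadmin-pre.12345"
ln -s "$victim" "$predictable"

# Later, the script writes to the predictable name with a plain redirect.
echo "printer list" >"$predictable"

# The redirect followed the link and replaced the victim's contents.
result=$(cat "$victim")
echo "victim now contains: $result"
rm -rf "$sandbox"
```

When the script doing the write runs as root and the link points at a system file, the same mechanism lets an unprivileged user destroy that file.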

The solution

For quite some time, Solaris has provided the mktemp command to facilitate the secure creation of files. In the fix for this problem, I used mktemp in a way that both creates the file securely and respects TMPDIR.

MKTEMP="/usr/bin/mktemp -t"
...
        PRE=$(${MKTEMP} lpadmin-pre.XXXXXX)
        if [[ -z "${PRE}" ]] ; then
                gettext "lpadmin: System error; cannot create temporary file\n" 1>&2
                exit 2
        fi

        (/bin/ls /etc/lp/printers 2>/dev/null ; /bin/ls /etc/lp/classes \
                2>/dev/null) >${PRE}
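The same pattern can be sketched for any script. This version is my own variation, not the lpadmin fix: it spells out the TMPDIR fallback in the template instead of relying on -t, because the meaning of -t differs between mktemp implementations, and it adds a trap to clean up. The demo-pre name is made up.

```shell
#!/bin/sh
# Create the file under $TMPDIR if set, otherwise /tmp. mktemp picks an
# unpredictable name and creates the file with mode 0600.
tmpfile=$(mktemp "${TMPDIR:-/tmp}/demo-pre.XXXXXX") || {
    echo "cannot create temporary file" >&2
    exit 2
}

# Remove the file when the script exits.
trap 'rm -f "$tmpfile"' EXIT

ls /etc >"$tmpfile" 2>/dev/null
perms=$(ls -l "$tmpfile" | cut -c1-10)
echo "created $tmpfile with permissions $perms"
```

Because mktemp creates the file itself and refuses to reuse an existing name, a symbolic link planted in advance is never followed.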

Wednesday, July 23, 2008

Installation of Studio 11 on Nevada

The OpenSolaris Sun Studio installation page forgets to mention that the default version of Java is incompatible with the Studio installer. The following commands work for me:

# export JAVA_HOME=/usr/jdk/instances/jdk1.5.0
# export PATH=/usr/jdk/instances/jdk1.5.0/bin:$PATH
# ./installer

Wednesday, June 11, 2008

OpenSolaris build service proposal

With most open source projects the various pieces are relatively small and as a result, the build process for any component is measured in seconds or minutes. Solaris has historically been based upon consolidations - relatively large sets of code. The most common wrapper around the build process is called nightly for good reason - it takes hours to days to perform a full build. Once a full build is complete incrementals can take a lot less time, but if you infrequently build the software, you are pretty much forced into a full build every time. I just stumbled across this post that shows that others have noticed this problem too.

What can be done?

I've been thinking that most people really don't need a full clobber build all the time. If there is a 2 line fix somewhere, just verifying that the code builds and the resulting binaries work is most important - and most encouraging for the casual bug fixer. If there's a big re-work of Makefiles or changes are made to the linking of a program a full build may be more important. A service that can do incremental builds would be a big help.

So here's (very roughly) how I envision it.

Each time a release is tagged by a participating consolidation...

A job is kicked off to build it on each supported architecture.
A zfs snapshot is created of the workspace with the completed build.
The completed build is packaged into an IPS repository

When a developer wants to fix a bug...

Populate a workspace with the original source code
Hack
Create a webrev
Submit the webrev to the build system, saying which consolidation and build to build against and which architectures to build for

The build system then...

Schedules the job via Hudson or Grid Engine.
The job starts by cloning the pre-built release of the consolidation. Quite likely, it also clones then boots the proper boot environment.
The patch included in the webrev is applied to the source
An incremental build is done
The pre-built IPS repository is cloned to a per-developer repository
Packages associated with the changed files are populated into the per-developer IPS repository. The changed packages associated with this build are added as dependencies to a special package (e.g. build-20080611.0)
Results are sent to the developer.

In most cases, the execution time of the above could be just a few minutes. If the change triggers lots of dependencies, it could be many hours. To test the changes, the developer can use...

beadm create testbe
beadm mount testbe /testbe
pkg -R /testbe set-authority -O http://pkg.opensolaris.org/myusername myusername.opensolaris.org
pkg -R /testbe install build-20080611.0
beadm unmount testbe
beadm activate testbe
init 6

There are certainly plenty of details to work out. This is meant to be the start of a discussion to determine whether it is something worth pursuing and if so what those missing details are.

Friday, March 21, 2008

Solaris Wish List: Make it Open Source

I frequently see statements from Sun, and from those who quote people at Sun, saying that Solaris is Open Source.

What is Solaris?

There are a few different flavors of things that people may refer to as Solaris:

Solaris 10 and earlier versions. This is what most references to the word Solaris seem to be referring to. Each release of Solaris is supported for a number of years through commercial and no-cost channels, to varying degrees.
Solaris Express. This is a collection of software that is viewed as a fairly stable distribution based upon the development branch of Solaris. Very limited support ("installation and configuration as well as for developer assistance") is available, and the support period seems to be limited to around 3 months per release.
OpenSolaris. This is best summarized as "The main difference between the OpenSolaris project and the Solaris Operating System is that the OpenSolaris project does not provide an end-user product or complete distribution. Instead it is an open source code base, build tools necessary for developing with the code, and an infrastructure for communicating and sharing related information. Support for the code will be provided by the community; Sun offers no formal support for the OpenSolaris product in either source or binary form."

What is Open Source?

The annotated Open Source Definition covers this quite well.

Is Solaris Open Source?

Let's see if the license used by Solaris aligns with the Open Source definition.

Free Redistribution	No. The Software License Agreement states: You may make a single archival copy of Software, but otherwise may not copy, modify, or distribute Software.
Source Code	No. The source code for Solaris is not available. Note that while a bunch of code is available at src.opensolaris.org, this is not the same source code that is used for building Solaris. If I want or need to modify the behavior of Solaris, there is no straight-forward way to do so.
Derived Works	No. Since I may not copy, modify, or distribute Solaris, this point is moot.
Integrity of Author's Source Code	No. Since the source code is not available, this point is also moot.
No Discrimination Against Persons or Groups	I think so. See section 11 of the license agreement for export restrictions.
No Discrimination Against Fields of Endeavor	I think so. While Sun's lawyers don't want to suggest running your nuclear power plant with Solaris, they don't say that you can't.
Distribution of License	No. Since redistribution is not allowed, this point is moot.
...	...

Since one or more of the requirements to be called Open Source are not met by the license under which Solaris is distributed, Solaris is not open source.

People that need support don't want source code.

That is great in theory, but in practice it falls down a bit. Let's pretend that you need to write a dtrace script to dig into a thorny performance problem. If the stable dtrace providers don't provide probes at the right spot, you need to fall back on fbt or pid probes. The only way to understand what these are tracing is to read the source. Since the OpenSolaris and Solaris branch point is now about 3 years old, this is becoming extremely difficult to do reliably. The code that you get by browsing the latest OpenSolaris code sometimes does not match up with the fbt probes that are available in any release of Solaris. You may have more luck looking at historical versions of the source code, but that is a guessing game at best.

There are also times when a customer's needs just do not align with what Sun is willing to offer. Suppose you need different functionality from a device driver. It is possible that it is a trivial change from the customer's standpoint, so long as they have current source code. However, if the source code is not available, the best the customer can do is grab the same driver from somewhere else (e.g. opensolaris.org) and try to maintain a special version of the driver and provide custom ports of all the bug fixes that they would otherwise get from the Solaris source.

Suppose that hypothetical fix that the customer needed was something that Sun agreed was needed but they did not have the time to develop the fix. If the Solaris source code were as open as the OpenSolaris source code, the customer could work with the OpenSolaris community to get the fix integrated into OpenSolaris, then provide the backport to Solaris. If the customer could do this, the code would see the appropriate code reviews, development branch burn-in, etc. with minimal additional workload on Sun.

My Wish

I know that Sun went through tremendous work to make OpenSolaris happen. They should be commended for that and the other tens (hundreds?) of millions of lines of code they have opened up or written from scratch in the open. This gives them tremendous opportunity with Solaris 11 (assuming that 10+1 = 11). Keep the code that is open, well, open. When patches or other updates are released, be sure that it is clear in the open source code repository which files are used in building that update. To get the full benefit of this, it should be possible for the Solaris customer to set up a build environment to build this source.

It is OK if Sun doesn't want to support the code I modify. However, I would expect that they would support the unmodified parts of Solaris much the same as they do if I install a third-party package that adds some device drivers and mucks with some files in /etc.