Friday, March 21, 2008

Solaris Wish List: Make it Open Source

I frequently see claims from Sun, and from people quoting Sun, that Solaris is Open Source.

What is Solaris?

There are a few different flavors of things that people may refer to as Solaris:

  • Solaris 10 and earlier versions. This is what most references to Solaris seem to mean. Each release of Solaris is supported for a number of years through commercial and no-cost channels, to varying degrees.
  • Solaris Express. This is a collection of software that is viewed as a fairly stable distribution based upon the development branch of Solaris. Very limited support ("installation and configuration as well as for developer assistance") is available, but the support period seems to be limited to around 3 months per release.
  • OpenSolaris. This is best summarized as "The main difference between the OpenSolaris project and the Solaris Operating System is that the OpenSolaris project does not provide an end-user product or complete distribution. Instead it is an open source code base, build tools necessary for developing with the code, and an infrastructure for communicating and sharing related information. Support for the code will be provided by the community; Sun offers no formal support for the OpenSolaris product in either source or binary form."

What is Open Source?

The annotated Open Source Definition covers this quite well.

Is Solaris Open Source?

Let's see if the license used by Solaris aligns with the Open Source definition.

  • Free redistribution: No. The Software License Agreement states: "You may make a single archival copy of Software, but otherwise may not copy, modify, or distribute Software."
  • Source Code: No. The source code for Solaris is not available. Note that while a bunch of code is available at src.opensolaris.org, it is not the same source code that is used for building Solaris. If I want or need to modify the behavior of Solaris, there is no straightforward way to do so.
  • Derived Works: No. Since I may not copy, modify, or distribute Solaris, this point is moot.
  • Integrity of Author's Source Code: No. Since the source code is not available, this point is also moot.
  • No Discrimination Against Persons or Groups: I think so. See section 11 of the license agreement for export restrictions.
  • No Discrimination Against Fields of Endeavor: I think so. While Sun's lawyers don't want to suggest that running your nuclear power plant on Solaris is a good idea, they don't say that you can't.
  • Distribution of License: No. Since redistribution is not allowed, this point is moot.
  • ...

Since one or more of the requirements to be called Open Source are not met by the license under which Solaris is distributed, Solaris is not open source.

"People that need support don't want source code."

That is great in theory, but in practice it falls down a bit. Let's pretend that you need to write a dtrace script to dig into a thorny performance problem. If the stable dtrace providers don't provide probes at the right spot, you need to fall back on fbt or pid probes. The only way to understand what these are tracing is to read the source. Since the OpenSolaris and Solaris branch point is now about 3 years old, this is becoming extremely difficult to do reliably. The code that you get by browsing the latest OpenSolaris code sometimes does not match up with the fbt probes that are available in any release of Solaris. You may have more luck looking at historical versions of the source code, but that is a guessing game at best.
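
To make that concrete, here is the sort of poking around this requires. A minimal sketch; the ufs module and ufs_read function are just illustrative stand-ins for whatever your problem actually touches:

# List the unstable fbt entry probes in the ufs module
dtrace -l -n 'fbt:ufs::entry' | head
# Trace one internal function; what its arguments mean is documented
# only by the kernel source that matches the running build
dtrace -n 'fbt:ufs:ufs_read:entry { @reads[execname] = count(); }'

Without source that matches the running kernel, interpreting the output of the second command is guesswork.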

There are also times when a customer's needs just do not align with what Sun is willing to offer. Suppose you need different functionality from a device driver. It is possible that it is a trivial change from the customer's standpoint, so long as they have current source code. However, if the source code is not available, the best the customer can do is grab the same driver from somewhere else (e.g. opensolaris.org), maintain a special version of the driver, and provide custom ports of all the bug fixes that they would otherwise get from the Solaris source.

Suppose the fix that the customer needed was something Sun agreed was needed but did not have the time to develop. If the Solaris source code were as open as the OpenSolaris source code, the customer could work with the OpenSolaris community to get the fix integrated into OpenSolaris, then provide the backport to Solaris. If the customer could do this, the code would see the appropriate code reviews, development branch burn-in, etc. with minimal additional workload on Sun.

My Wish

I know that Sun went through tremendous work to make OpenSolaris happen. They should be commended for that, and for the other tens (hundreds?) of millions of lines of code they have opened up or written from scratch in the open. This gives them a tremendous opportunity with Solaris 11 (assuming that 10 + 1 = 11). Keep the code that is open, well, open. When patches or other updates are released, make it clear in the open source code repository which files were used to build that update. To get the full benefit of this, it should be possible for a Solaris customer to set up a build environment and build that source.

It is OK if Sun doesn't want to support the code I modify. However, I would expect that they would support the unmodified parts of Solaris much the same as they do if I install a third-party package that adds some device drivers and mucks with some files in /etc.

Saturday, March 15, 2008

OpenSolaris Wish List: Feature-based meta packages

The problem

There are many occasions when an administrator needs to install a particular feature on a system but tracking down the packages, patches, or revisions thereof is next to impossible. Consider the following scenarios:

  • A sysadmin is trying to build minimized systems that have a firewall, bash, python, and can support zones using Jumpstart. What needs to appear in the Jumpstart profile? (A sketch of the guesswork appears after this list.)
  • That same sysadmin then decides that a particular installed system needs to be augmented to support the OSPF routing protocol.
  • Some time passes and an updated version of the feature is released. The sysadmin needs to be able to update systems to support that feature without guessing which packages or patches are required. Note that a subset of the required packages may not have been installed initially, either because they did not exist yet or because they weren't required to meet the dependencies of the earlier feature set.
  • A sysadmin notices a nifty feature mentioned in the "What's New" documentation and wants to determine if a recently patched system supports the feature.
Dealing with any of these situations is very difficult today.
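
To make the first scenario concrete, here is roughly what the Jumpstart profile guessing game looks like today. A sketch only; the metacluster and package names are from memory and may well be wrong for any given release:

install_type  initial_install
cluster       SUNWCrnet         # reduced network support metacluster
package       SUNWbash   add    # bash is easy enough to find
package       SUNWipfr   add    # IP Filter... or do I need SUNWipfu too?
package       SUNWzoner  add    # zones (root)... and what else do zones need?
package       SUNWzoneu  add    # zones (usr)
# python: which of the several SUNWpy* packages does a minimized
# system actually require?

The guessing in the comments is exactly the problem: nothing maps "a firewall, bash, python, and zones" to a package list.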

Most commonly, the best advice offered for what is required to have a particular feature is that the feature is available in Solaris X update Y. While that is true to a certain extent, it is not the complete story. It assumes that the system in question is in no way minimized, and it ignores the fact that the same patches that are integrated in update Y are also available for installation on releases before update Y.
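
Verifying a feature on an existing system is a similar exercise in probing. A sketch; the patch ID and package name here are illustrative:

# Is the patch that delivers the feature installed, and at a high
# enough revision?
showrev -p | grep 118833
# Were the packages the feature depends on installed on this
# minimized system in the first place?
pkginfo -l SUNWzoneu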

Ongoing Work

The Image Packaging System looks to be outstanding work in this direction. The one-pager is likely the best place to look for a quick overview. While this will provide a good foundation, there are numerous posts to mailing lists of the form "I'm trying to install X, but pkg can't find it."

Suggested Approach

Some geeks (power users, whatever you would like to call them) will know the exact name of the packages that they want or will otherwise be able to search for them rather adeptly. Having the ability to say "I want bash" rather than "I want the same shell used by most Linux distros" is important.
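
IPS serves that crowd reasonably well already. A sketch, assuming an early IPS repository that keeps the SUNWbash name:

# If you know the exact package name:
pkg install SUNWbash
# If you only know roughly what you are after:
pkg search -r bash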

I suspect that most people will be quite happy looking at their requirements and matching them up to advertised features. I suggest that the feature/* package namespace be reserved for meta-packages that represent features. A feature meta-package should not deliver any files; rather, it just defines dependencies on the minimal set of packages required to support the feature. See the Package FMRIs and Versions section of pkg(5) for details on package naming.

For instance, if I needed the zones feature I should be able to say something along the lines of "pkg install feature/zones". This would likely correspond to an FMRI of pkg://opensolaris.org/feature/zones@5.11,5.11-7:20080326T164523Z. When a feature is updated, the build number (7 in the example FMRI) should be bumped.
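
The manifest for such a meta-package would contain nothing but dependencies. A sketch in IPS manifest syntax; every package name below is an assumption on my part, not the real dependency list for zones:

set name=pkg.fmri value=pkg://opensolaris.org/feature/zones@5.11,5.11-7
set name=pkg.description value="Minimal set of packages needed to use zones"
# No file actions at all; the meta-package only declares dependencies.
depend fmri=SUNWzoner type=require
depend fmri=SUNWzoneu type=require

Updating a feature then amounts to publishing a new build of the meta-package with an adjusted dependency list.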

While searching the published packages would yield a pretty good list of packages required for supporting zones, someone needing the OSPF routing protocol may have a harder time with a search. However, the feature/network/routing/ospf meta-package would know whether the zebra or quagga package is needed (which varies by Solaris release).

Integration with Documentation

With such an approach, the Solaris What's New documentation could be very helpful to sysadmins needing to update existing systems, installation profiles, etc. Consider the following, based upon a particular entry from the Solaris 10 What's New documentation.

IPsec Tunnel Reform

Solaris now implements IPsec Tunnel Mode per RFC 2401. Inner-packet selectors can be specified on a per-tunnel-interface basis using the new "tunnel" keyword of ipsecconf(1M). IKE and PF_KEY handle Tunnel Mode identities for Phase 2/Quick Mode. Interoperability with other IPsec implementations is greatly increased.

For more information, see Transport and Tunnel Modes in IPsec in System Administration Guide: IP Services. This feature is available via the feature/network/protocol/rfc2401 software collection.

To help someone find documentation of this type in the future, a documentation system (installed on the system or accessible over the internet) should be able to point a person interested in a feature to the appropriate man page(s), online books, and other documentation.

Similarly, if the feature is just an update of an existing feature, having the documentation refer to a particular build (release, whatever works) of the feature package would be most useful.

Tuesday, March 11, 2008

Future of OpenSolaris Boot Environment management

I was quite happy to see this recent post from Ethan Quach proposing an efficient method for sharing the variable parts of /var. It bears a striking resemblance to something that I suggested and clarified in the past.

Correction June 6, 2009: Links to mail archives at opensolaris.org seem not to be stable. The same messages are available at the following: My initial suggestion and clarification.

But why does this matter? When you are making significant changes to the system, such as during a periodic patch cycle or upgrade, it is generally desirable to...

  1. be able to do so without taking the system down for the duration of the process
  2. be able to abort the operation if you have a change of heart
  3. be able to fail back if you realize that newer isn't better
Consider what is in /var:
  • Mail boxes: If the machine is a mail server (using sendmail et al.) there is a pretty good chance that users have their active mail boxes at /var/mail.
  • In-flight mail messages: Most machines process some email. For example, if a cron job generates output, it is sent to the user via email. Many non-web mail clients invoke /bin/mail or /usr/lib/sendmail to cause mail to be sent. Each message spends somewhere between a few milliseconds and a few days in /var/spool/mqueue or /var/spool/clientmqueue.
  • Print jobs: If the machine acts as a print server (even for a directly attached printer), each print job spends a bit of time in /var/spool/lp.
  • Logs: When something goes wrong, it is often useful to look in log messages to figure out why it went wrong. Those are often found under /var/adm.
  • Temporary files that may not be: It is rather common for people to stick stuff in /var/tmp and expect to be able to find it sometime in the future.
  • DHCP: If a machine is a DHCP server, it will store configuration and/or state information in /var/dhcp.
  • ...
All of those things should be of a stable file format and usable before and after you patch/upgrade/whatever. If you take the traditional Live Upgrade approach, you can patch or upgrade to an alternate boot environment. As part of activating the new environment, a bunch of files are copied between boot environments. According to this page the following things are synchronized:
/var/mail                    OVERWRITE
/var/spool/mqueue            OVERWRITE
/var/spool/cron/crontabs     OVERWRITE
/var/dhcp                    OVERWRITE
/etc/passwd                  OVERWRITE
/etc/shadow                  OVERWRITE
/etc/opasswd                 OVERWRITE
/etc/oshadow                 OVERWRITE
/etc/group                   OVERWRITE
/etc/pwhist                  OVERWRITE
/etc/default/passwd          OVERWRITE
/etc/dfs                     OVERWRITE
/var/log/syslog              APPEND
/var/adm/messages            APPEND
Notice that the default configuration loses your in-flight print jobs because /var/spool/lp is not copied. Suppose you have a mail server with a few gigs of mail at /var/mail. Is it a good use of time or disk space to copy /var/mail between boot environments?

A much better solution seems to be to make those directories shared between the boot environments. The way to do this in Live Upgrade, and presumably in the future, is to remove them from (or never add them to) /etc/lu/synclist and allocate separate file systems. However, do you really want a file system for each of /, /var/mail, /var/spool/mqueue, /var/spool/clientmqueue, /var/spool/lp, /var/adm, /var/tmp, /var/dhcp, ...? What if someone told you that you had to monitor every file system on every machine for running out of space? How big would you make all of those file systems so that your monitoring didn't wake you up in the middle of the night?
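
The Live Upgrade half of that is straightforward today, using the synclist format shown above. A sketch; which entries you drop depends on which directories you split out:

# Stop copying mail between boot environments; it will live on a
# shared file system instead.
cp /etc/lu/synclist /etc/lu/synclist.orig
sed -e '/^\/var\/mail/d' -e '/^\/var\/spool\/mqueue/d' \
    /etc/lu/synclist.orig > /etc/lu/synclist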

In the future, it looks as though OpenSolaris will use ZFS to store each boot environment. Among the features of ZFS that make this desirable are snapshots, clones, and rethinking the boundary between disk slices (or volumes) and file systems. If the organization of /var is changed just a bit...

/var/adm -> share/adm
/var/dhcp -> share/dhcp
/var/mail -> share/mail
/var/spool -> share/spool
/var/tmp -> share/tmp
/var/share/adm
/var/share/dhcp
/var/share/mail
/var/share/spool
/var/share/tmp
Then you can get by with having two ZFS file systems: / and /var/share. The Snap Upgrade process would then likely do the following (sketched in ZFS commands after the list):
  1. Take a snapshot of /, clone it, then mount it somewhere usable in subsequent steps (e.g. /mnt/abe)
  2. Do whatever is needed on the alternate boot environment mounted at /mnt/abe.
  3. Unmount the alternate boot environment
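
In ZFS commands, the first step might look like the following. A sketch, assuming boot environments live under an rpool/ROOT hierarchy; all dataset names are illustrative:

# Snapshot the running boot environment and clone it into a new one,
# mounted where the upgrade tools can work on it
zfs snapshot rpool/ROOT/be1@pre-upgrade
zfs clone -o mountpoint=/mnt/abe rpool/ROOT/be1@pre-upgrade rpool/ROOT/be2

The snapshot and clone are nearly instantaneous and initially consume almost no additional space.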
When it comes time to activate the new boot environment, there are some files that likely need to be synchronized using the traditional mechanism. For instance, if someone tried to get into a system by guessing a user's password, there is a reasonable chance that the account was locked via a modification to /etc/shadow. Presumably you don't want to give the bad guy another chance when you activate the new boot environment. Note, however, that the files that may need to be synchronized in /etc are nearly always small and there would not be very many of them. The files in /var/share would not need to be synchronized. However, just in case the new version of sendmail decides to eat mailboxes, it would be very nice to be able to recover.
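
A deliberately naive sketch of that synchronization; a real tool would merge changes rather than blindly copy, and the file list here is only illustrative:

# Carry account changes made since the clone was taken forward into
# the boot environment that is about to be activated
for f in passwd shadow group; do
        cp -p /etc/$f /mnt/abe/etc/$f
done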

This means that activating a boot environment would look like:

  1. Bring the system into single-user mode
  2. Mount the alternate boot environment
  3. Synchronize those files that need to be synchronized
  4. Take a snapshot of /var/share
  5. Set the boot loader to boot from the new boot environment and offer a failback option to the old boot environment
  6. Reboot
The steps specific to boot environment activation (mounting the alternate boot environment, synchronizing files, snapshotting /var/share, and pointing the boot loader at the new environment) should each take a couple of seconds or less, adding far less than thirty seconds to the normal reboot process. Failback would be similarly quick.

Now suppose this system is a bit more complicated and has 20 zones on it. Have you ever patched a system with 20 zones on it? Did you start on Friday and finish on Monday? How happy were the users with the "must install in single-user mode" requirement? This same technique should allow you to have two file systems per non-global zone: one for the zone root and one for /var/share in the zone. Supposing that the reboot processing takes 5 seconds per zone, you are looking at an extra minute to reboot rather than a weekend of down time.

Without Live Upgrade or Snap Upgrade, what would backout look like? After you had the system down for patching for a couple days, you could take it down again for a couple days to back the patches out. Or you could go to tape. Neither is an attractive option. With Snap Upgrade you should be able to fail back with your normal reboot time plus a minute.
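
With the layout above, failback is a couple of commands. A sketch, assuming the activation step snapshotted the shared file system; the dataset and snapshot names are illustrative:

# Boot the old boot environment from the boot loader menu, then roll
# the shared data back to the snapshot taken at activation time
zfs rollback rpool/share@activate-20080311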