An Introduction to the Virtualenv Sandbox

Author: Jeff Rush
Copyright: 2008 Tau Productions Inc.
License: Creative Commons Attribution-ShareAlike 3.0
Date: March 13, 2008
Series: Python Eggs and Buildout Deployment - PyCon 2008 Chicago

A short intro to sandboxing your Python development work using the virtualenv tool and as a prerequisite, the steps for setting up EasyInstall to download it.


Roadmap of Topics

  • Installation Concerns
  • About Virtualenv
  • Steps: Installing EasyInstall then VirtualEnv
  • Creating and Using Sandboxes
  • Directory Layout of a Sandbox
  • Limitations of Virtualenv
  • For More Information


Installation Concerns

  • Prerequisites
    • at least Python 2.3.5
    • access to the Internet
    • system administrator access rights
  • Items of Note
    • installs into system site-packages
    • ... of the instance of Python used
    • named with suffix of Python version

In this class we'll use virtualenv to create sandboxes for our exercises, to avoid disrupting the system installation of Python.

To simplify the installation of virtualenv, we'll first install EasyInstall, bundled with setuptools, which I'll cover much more in-depth in the section on eggs.

You need a relatively modern version of Python, and access to the Internet for retrieving the necessary files. Because we'll be installing this system-wide (i.e. the sandboxing tools cannot themselves be inside the sandbox), you also need administrator privileges on your system.

These two packages, EasyInstall and Virtualenv, are being installed into the system site-packages directory, of the instance of Python with which you invoke it. This is true of a lot of tools we'll use today.

For those with multiple versions of Python on their system, the tools are installed with a suffix matching the Python version used, to distinguish between them.

About Virtualenv

  • written by Ian Bicking
  • a fresh, isolated Python environment
  • except it pre-installs setuptools
  • must run programs from bin/ (Scripts/)
  • may be
    • in front of system Python
    • or totally replace system Python

Written by Ian Bicking, virtualenv allows you to set up an isolated Python environment whose libraries do not affect programs outside it, making it a good choice for experimenting with new packages or to deploy different programs with conflicting library requirements.

To maintain the isolation, Python programs must be run from the "bin/" subdirectory ("Scripts/" under Windows).

Normally your virtualenv sandbox is isolated from your system Python, but requests for modules not found within the sandbox flow through to the system Python. This means if you install extra software into the system Python it will automatically become available in all virtualenv sandboxes.

There is an option, --no-site-packages, that changes this behavior to exclude the system Python from failed module searches, giving stronger isolation at the expense of having to explicitly install more dependencies.

Steps: Installing EasyInstall then Virtualenv

  • Steps to take

    $ cd /tmp
    $ wget
    $ sudo python ez_setup.py
    $ sudo easy_install virtualenv

Using your favorite tool, download ez_setup.py into a temporary directory and run it. This will download and install the appropriate setuptools egg for your Python version, and create a new system command, easy_install.

Note: Windows users, do NOT put ez_setup.py inside your Python installation; use a temporary directory elsewhere.

The "sudo" command is the Unix-way of running the "easy_install" command with administrator privileges.

Creating and Using Sandboxes

  • $ virtualenv pycon2008
    $ virtualenv --no-site-packages pycon2008
    $ cd pycon2008
    $ bin/python
    $ source bin/activate
    $ deactivate

To create a sandbox, run the virtualenv command and pass to it the pathname of a new directory in which you want it to reside.

If you want stronger isolation from the system Python, use the --no-site-packages option to omit the system packages from the search path for your sandbox.

To run within the sandbox, run the Python interpreter in the bin directory. If you prefer to avoid typing the "bin/" prefix, virtualenv provides the "activate" command to rewrite the search paths for your shell session. The command "deactivate" reverses these changes.

Note: the "activate" command for virtualenv has nothing to do with the concept of activating or de-activating a Python egg, which means placing it onto the sys.path using a .pth file.
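The .pth mechanism itself is easy to see in isolation. A small sketch (the directory names are made up for the demo), using the standard library's site module, which processes .pth files the same way site-packages is processed at startup:

```python
import os
import site
import sys
import tempfile

# Hypothetical directories, created only for this demonstration.
base = tempfile.mkdtemp()
extra = os.path.join(base, "eggs")
os.mkdir(extra)

# A .pth file is plain text: each line names a directory to append
# to sys.path when the containing site directory is processed.
with open(os.path.join(base, "demo.pth"), "w") as f:
    f.write(extra + "\n")

site.addsitedir(base)      # scans base for *.pth files and applies them
print(extra in sys.path)   # -> True
```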

Note: On Windows you shouldn't create a virtualenv sandbox in a path with a space in any name.

Directory Layout of a Sandbox


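A typical sandbox tree looks something like this (a sketch; the exact names vary with your Python version):

    pycon2008/
        bin/                    ("Scripts/" on Windows)
            activate
            easy_install
            python
        include/
            python2.5/
        lib/                    ("Lib/" on Windows)
            python2.5/
                site-packages/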
Note: On Windows, the directory names are slightly different, with "Scripts/" being used for "bin/", and "Lib/" being used for "lib/".

Warning: Virtualenv is currently incompatible with a system-wide distutils.cfg and per-user ~/.pydistutils.cfg. If you have either of these files, virtualenv will put the easy_install command into the bin/ directory specified in that config file, rather than into the sandbox where it belongs.

Limitations of Virtualenv

  • easy to experiment, hard to repeat and manage
  • no simple way to list what you've installed
  • no simple way of uninstalling packages
  • the buildout tool takes the opposite approach
  • and has the ability to manage non-Python aspects

Distutils: Packaging, Metadata and Pushups

Author: Jeff Rush
Copyright: 2008 Tau Productions Inc.
License: Creative Commons Attribution-ShareAlike 3.0
Date: March 13, 2008
Series: Python Eggs and Buildout Deployment - PyCon 2008 Chicago

An introduction to the distutils module that provides a standard way of building, distributing and installing one or a group of Python modules, across platforms. distutils has shipped as part of the Python standard library since version 1.6.


Roadmap of Topics

  • What is the Problem?
  • A Bit of History
  • What is a Distribution?
  • distutils: Audience and Usage
  • Examples of Invocation
  • Applying To Your Software
  • Sources and Uses of Metadata
  • MANIFEST for Source Distributions
  • About Package Servers
  • Features Missing from Distutils
  • For More Information


What is the Problem?

  • a standard mechanism for
    • building, packaging (distributing), installing
  • one or more modules of
    • Python source, C/C++ code, data files
  • be cross-platform

  • fit into existing packaging technologies

  • an extensible tool

    • distribution file formats
    • special processing commands

We want a mechanism standardized within the Python community for building, packaging, distributing, and installing one or more modules that may consist of Python source, compiled C source and bundled data files.

A solution should do all this in a manner that works across operating system platforms, plays nicely with existing packaging technologies, and provides for an easily extensible set of distribution file formats and special processing commands.

A Bit of History

  • Developer's Day session (IPC7 Nov 1998)
  • shipped with Python in version 1.6
  • older version at distutils-sig
  • setuptools and buildout leverage it

Scattered work on a solution to meet these requirements had been underway for quite a few years but in 1998 started to come together under Greg Ward at IPC7. This led to a version of distutils shipping with Python 1.6.

There is an old version linked to from the distutils-sig page but it is out-of-date. The official version is that which ships with Python.

The setuptools module that is the basis for eggs and the buildout deployment tool both make heavy use of distutils and leverage its concepts.

What is a Distribution?

  • single downloadable resource
  • made up of ...
    • one or more modules ...
      • a single .py file
      • a package hierarchy
      • a C/C++ extension
      • a set of unrelated modules
    • executable scripts
    • data files
  • meant to be installed together

distutils is a tool for working with distributions, so let's first define what one is.

A "distribution" is a single downloadable resource made up of one or more modules, each of which may be a single .py file, a directory of modules organized into a Python package, or a C/C++ extension. These modules may be independent or unrelated sibling packages - no relationship is assumed.

It may contain executable Python scripts which get installed as additional system commands, depending upon the operating system.

And it may contain data files, such as icons or reference material.

All of which are intended to be installed or uninstalled together.

Note: Collections of unrelated modules are given their own directory by distutils, and a path configuration (.pth) file to add it to sys.path at run-time.

What is a Distribution? (cont'd)

  • has a ...
    • project name, version number, file
  • may internally be ...
    • pure Python (platform-neutral)
    • non-pure (at least one C/C++ extension module)
  • produced in two flavors
    • source distributions
    • binary distributions
      • potential sys.path entry
      • may be activated or not
    • each in many file formats

A distribution always has a project name, version number and a controlling file in the root of the distribution source tree.

Internally a distribution may be pure or platform-neutral, indicating it can be used unchanged across operating systems. If it has at least one C/C++ extension module, it is considered non-pure.

distutils generates distributions in two flavors, source distributions for sharing with other developers, and binary distributions for non-developers.

A binary distribution is an actual or potential sys.path entry, along with its metadata. Distributions can be activated or deactivated, by being placed onto sys.path or not, usually by use of .pth files. We'll cover this mechanism in a lot more detail.

Distribution: Source versus Binary

Type of Distribution   Compiler Required?   Uses Local Pkg Tech?   Unique Identifier
Binary                 No                   Yes                    (projectname, version, platform)
Source                 Maybe                No (archive)           (projectname, version)


Local Package Technology: RPMs, .debs, .msi bundles

Binary distributions are intended for installation into environments that lack development tools such as a C/C++ compiler, and usually come in the form of existing package technologies, such as RPMs, .debs, .msi, .dpg.

Besides being identified by a project name and version, the Python version used in its build is relevant as well, because of differing binary APIs.

Binary distributions by their nature get "installed"; they are past the "build" and "package" stages.

Metadata about its characteristics, its ownership and dependencies is retained past the build process into the installation itself, so that other packages that depend upon it can know of its presence and version.

Source distributions are targeted at developers who will typically have development tools. While pure distributions won't require them, non-pure ones will.

Source distributions are basically some form of compressed archive containing all the necessary source to build and install the project.

Being not yet compiled, they can be identified solely by the project name and version, and are fed into the "build" stages to be either installed or packaged.

They are a source of metadata about distributions, which originates from keywords passed to the setup() function in setup.py and, for RPMs, from entries in the setup.cfg file.

distutils: The Intended Audience

Role               Installer   Packager   Developer
Gives Metadata     No          No         Yes
Programmer?        No          No         Yes
Ctls Install Loc?  Some        Yes        No
Admin Privs?       Yes         No         No
Operation          install     bdist      sdist

distutils is intended to be used by several different audiences.

The non-developer who just wants to install some software, either on an individual desktop or perhaps corporate-wide using automated tools.

A packager who collects and organizes useful software, builds it for specific environments and makes it available in repositories.

The original developer who wants to make his work available in as easy-to-use a form as possible.

An installer can install a binary distribution (with distutils out of the picture for RPMs), or can act as a packager, building and then installing from a source distribution, in which case he controls where it gets installed. He may also require admin privileges for system-wide locations.

A developer writes the setup.py script, which supplies the options the installer cannot know: distribution metadata, which modules and extensions are present in the distribution, and where they go in the space of Python modules.

distutils: Command-Line Invocation

  • uses a custom setup.py in project root

    from distutils.core import setup; setup()

  • is command-line driven

    python setup.py [global_opts]? cmd1 [cmd1_opts]? [cmd2 [cmd2_opts]? ...]

  • python setup.py --help-commands

  • python setup.py build --help

  • export DISTUTILS_DEBUG=yes

distutils makes its appearance in a project via a customized Python source file named setup.py placed in the project root directory.

A minimal such file looks like this. To interactively follow along with me, you may want to create such a file if you don't have one handy from an existing project.

Invocation of distutils is command-line driven as follows: you change into the directory of the project and pass setup.py explicitly to the Python interpreter.

Some of the key global options, their meaning pretty much self-explanatory, are --verbose, --quiet, --dry-run, and --help.

distutils accepts one or more commands per invocation, each with a variable number of arguments. The boundaries are found by matching arguments against the set of known commands.

To obtain a list of defined commands, use the --help-commands option. This list can vary since distutils supports extension through addition of new commands.

Each command has its own help which can be accessed like this.

To get internal processing details while distutils is working, define the DISTUTILS_DEBUG=yes environment variable and invoke setup.py. Any error will now give you a traceback telling you more about the who and why.

distutils: Usage Defaults Configuration

  • usage config versus project config

  • system-wide (within distutils module)
    • distutils.cfg within distutils
  • per-user (in $HOME directory)
    • .pydistutils.cfg or pydistutils.cfg
  • per-project (next to setup.py)
    • setup.cfg
  • in format accepted by ConfigParser


distutils reads configuration related to command invocation from up to three optional locations, in the following order, with the last options read taking precedence.

system-wide: a distutils.cfg file within the distutils module directory; no such file is provided by default

per-user: a .pydistutils.cfg (POSIX) or pydistutils.cfg (non-POSIX) file in the user's $HOME directory

per-project: a setup.cfg file in the same directory as the setup.py file

The format of configuration files is that accepted by the ConfigParser module, with named sections and name = value assignments. Besides default options both globally and for particular commands, there are powerful things you can do such as creating aliases to sets of commands + options.

The command to build an RPM, the bdist_rpm command, needs additional metadata that is specific to RPMs. The setup.cfg file is used in this case to supply that data:

distribution-name=Red Hat Linux

You can provide default arguments in your config file.
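The file format can be explored with the standard library directly. A sketch, echoing the bdist_rpm metadata mentioned above (the option values are made up; Python 3's configparser is the modern spelling of the era's ConfigParser module):

```python
from configparser import ConfigParser
from io import StringIO

# A made-up setup.cfg: default options per command, in ConfigParser format.
cfg_text = """\
[global]
verbose = 1

[bdist_rpm]
distribution-name = Red Hat Linux
release = 1
"""

cfg = ConfigParser()
cfg.read_file(StringIO(cfg_text))
print(cfg.get("bdist_rpm", "distribution-name"))  # -> Red Hat Linux
```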

Invocation: Installing from Source

$ cd /tmp
$ wget
$ tar xvzf BeautifulSoup-3.0.5.tgz
$ cd BeautifulSoup-3.0.5
$ less README
$ sudo python setup.py install
$ /sandbox/bin/python setup.py install

The most common operation is probably installing someone else's distribution from source. The distribution is installed into the site-packages directory of the specific instance of Python invoked. To install into a virtualenv sandbox, give the explicit path to the sandbox's interpreter.

Invocation: Building a Binary Distribution

$ cd /tmp
$ wget
$ tar xvzf BeautifulSoup-3.0.5.tgz
$ cd BeautifulSoup-3.0.5
$ python setup.py bdist_rpm
$ cp dist/BeautifulSoup-3.0.5-1.noarch.rpm /releases
$ python setup.py bdist --formats=tar,zip,rpm

The next most common operation may be building platform-specific packages for distribution. A single binary distribution format may be requested as the command "bdist_<format>", or one or more formats can be specified with the "--formats=" option.

Invocation: Building a Binary (formats)

$ python setup.py bdist --help-formats
  • bdist (gztar, bztar, ztar, tar, zip)
  • bdist_msi (Microsoft Installer binary)
  • bdist_rpm (both source and binary RPMs)
  • planned for Python 2.6
    • bdist_deb
    • bdist_egg
  • 3rd-party add-ons
    • bdist_mpkg (for Mac OSX)
    • bdist_pkgtool (Solaris pkgtool)
    • bdist_sdux (HP-UX swinstall)

Invocation: Building a Source Distribution

$ python setup.py sdist --help-formats
$ cd /tmp
$ svn co foo
$ cd foo
$ python setup.py sdist --formats=zip
$ cp dist/foo-1.0.0.zip /releases

The sdist command extracts metadata and writes it to a file by the name of PKG-INFO in the top directory of the generated zipfile or tarball. This file is a single set of RFC 822 headers parseable by the rfc822 module.

This PKG-INFO file is what gets POST'd to the index server.
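Since PKG-INFO is just RFC 822 headers, it can be read with standard parsing tools. A sketch (the field values are made up; email.parser is the modern equivalent of the era's rfc822 module):

```python
from email.parser import Parser

# A made-up PKG-INFO body: one flat set of RFC 822 headers.
pkg_info = """\
Metadata-Version: 1.0
Name: foo
Version: 1.0.0
Summary: An example project
Author: Sherlock Holmes
License: MIT
"""

msg = Parser().parsestr(pkg_info)
print(msg["Name"], msg["Version"])  # -> foo 1.0.0
```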

Applying distutils to your software

  • possible directory layouts
  • content of setup.py
    • basic metadata
    • source and extension module(s)
    • executable scripts
    • datafiles, package-relative and arbitrary
    • extra metadata
    • expressing dependencies

Project Directory Layout - Single File


Project Directory Layout - Package


Project Directory Layout - Multi-Package

mymodule2/

setup.py - Basic Metadata

from distutils.core import setup

from mymodule import __version__ as VERSION


The setup.py file is just Python source that invokes a single function, setup(), with a variety of keyword arguments. This lets you bring the full power of Python to bear on the configuration problem, without having to learn another configuration-specific language.

The setup.py file should not be marked executable, as it is generally invoked with an explicit reference to a Python interpreter. This is because the binary formats generated and the directory locations written to are specific to that instance of the interpreter.

In general it is best to obtain the version of your distribution from within your source code, to avoid duplication of information.

setup.py - Source Module(s)

  • ...
  • package_dir = {'': 'src/lib'}
  • packages = ['foo', 'foo.command'],
  • NO automatic recursion (until setuptools)

The "py_modules" keyword provides a list of modules, specified NOT by filename but module name.

Such names are relative to the setup.py file itself. To locate modules within non-package subdirectories, use the "package_dir" keyword, which is a mapping of module names to directory paths. These paths are written in the Unix convention, i.e. slash-separated. A module name of empty string is the root package of all Python packages.

All packages must be explicitly listed; distutils will not recursively scan your source tree or package hierarchy looking for any directory with an __init__.py file. The setuptools module, however, provides a find_packages() function that does.

setup.py - Extension Module(s)

  • from distutils.core import setup, Extension
          ext_modules=[Extension('foo', ['foo.c'])],
  • rough support for SWIG

distutils has rough support for SWIG, processing .i files into C/C++ code.

setup.py - Executable Scripts

  • setup(... scripts=['yourscript'] ...)
  • NOT versioned, can collide

  • chmod +x yourscript

  • rewrites #!/usr/bin/env python

Scripts are files containing Python source code, intended to be started from the command line. The "scripts=" keyword specifies a list of paths to the scripts.

These scripts get installed in the system command area and, because they are not renamed by version, may conflict if other distributions use the same name. This makes it difficult to install multiple versions of a distribution.

distutils takes care of marking the scripts executable for POSIX.

If the first line of the script starts with #! and contains the word "python", distutils will adjust that line to refer to the current interpreter location. This value can be overridden with the --executable option on the invocation.

setup.py - Package-Relative Datafiles

  • mypkg/
  • setup(...
          package_data={'mypkg': ['data/*.dat']},
  • files to copy into distribution

  • paths relative to pkg dir, NOT setup.py

  • glob filename patterns ok

  • will create parent dirs

  • runtime access to data files in archives

The package_data keyword is a mapping from package name to a list of pathnames (relative to the package) of files to copy into the package.

The path names may contain directory portions; any necessary directories will be created in the installation.

setup.py - Arbitrary Datafiles

  • syntax: (destdirectory, filelist)

    data_files=[('bitmaps', ['bm/b1.gif', 'bm/b2.gif']),
             ('config', ['cfg/data.cfg']),
             ('/etc/init.d', ['init-script'])]
  • destination relative to install prefix

  • source relative to the setup.py file

  • source paths NOT retained, unlike package_data=

  • can relocate, NOT rename

  • feature removed in setuptools

The data_files keyword is for placement of datafiles unrelated to any specific Python package.

You can specify any destination directory for a file, but no mechanism is provided to rename it.

setup.py - Extra Metadata (ownership)

      author='Sherlock Holmes',
      maintainer='John Watson',
      )

setup.py - Extra Metadata (descriptive)

      description='XYZZY Magic Generator',
      keywords='apple orange farming well',
The long_description keyword is a document, usually written in reStructuredText, that gets displayed on the project page in the Cheeseshop.

setup.py - Extra Metadata (classifiers)

      classifiers=[
        'Development Status :: 4 - Beta',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: MIT License',
        ],
  • python setup.py register --list-classifiers

The classifiers keyword is a list of strings, representing official tags used by the Cheeseshop, derived from the trove concept of discrimination. This list of classification values has been merged from FreshMeat and SourceForge (with their permission).

The official list at any time can be retrieved from the Cheeseshop with the --list-classifiers option to the register command.

setup.py - Expressing Dependencies

  • requires=['magicmod', 'puff.cmd (>=1.3)'],
  • provides=['foomatic', 'washer.log (2.1)'],
  • obsoletes=['foomatic (1.0, 1.1, 1.2)'],
  • replaced in setuptools with more specific qualifiers

Sources and Uses of Metadata

  • sources
    • setup.py keywords
    • setup.cfg for RPMs
  • written to PKG-INFO file
    • set of RFC822 headers
    • gets HTTP POST'd to index server
  • .egg-info file
    • written by "egg-info" cmd
    • encases PKG-INFO
    • more open under setuptools
    • indicates what is installed

MANIFEST for Source Distributions

  • MANIFEST file (created if absent)
    • for extra files into source distro only
    • template (macros) in MANIFEST.in
    • needed much less in setuptools
  • auto-included
    • referenced Python, scripts, C/C++ source
    • recognized test scripts (test/*)
    • README.txt, setup.py, setup.cfg
  • omitted
    • C/C++ header files (flaw)
    • the build/ tree
    • version-control metadirs
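A small MANIFEST.in template (a sketch; the patterns are examples) showing the macro commands used to pull in files the defaults omit, such as C headers:

    include *.h
    recursive-include docs *.txt
    prune build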

About Package Servers in General

  • public distribution repositories
    • in a machine-readable format
    • dependency resolution
  • examples
    • CPAN (Perl) and PEAR (PHP)
    • Package Index (PyPI) or Cheeseshop
  • what does it serve?
    • index server
    • upload server
    • link server

To make the existence of distributions visible to others in an automated form, suitable for dependency resolution, there are package servers.

For Perl and PHP, respectively, there are the CPAN and PEAR package servers. The one for Python is named the Cheeseshop or sometimes the Package Index (PyPI).

A package server may serve one or more kinds of information:

An index server holds records containing metadata about many different distributions, including a URL where the actual distribution files may be found. Index servers are cross-project in nature.

An upload server receives distribution files, both source and binary, along with metadata, from developers and makes them available for download from a centralized place.

A link server holds HTML about a specific or related set of projects, which has links sprinkled on it that point to actual distribution files. Link servers are usually browseable project sites, and are used in connection with buildout.

Package Servers: About The Cheeseshop

  • PyPI is an index server and upload server
  • distutils pushes up, never pulls
  • need an account to write to it
    • visit site and register
    • push up metadata, get prompted
  • obtaining its source

The PyPI server is both an index server and an upload server. It is developer discretion whether to keep actual distributions on it or just rely upon the URL in the metadata to point to a project website.

distutils only knows how to push up to PyPI, both metadata with the "register" command and distributions (source and binary) with the "upload" command. Later we'll see how setuptools and buildout add the capability to pull from PyPI.

User accounts on PyPI can be obtained by visiting the site in a browser, or by pushing up metadata for a project and being prompted by the register command.

The source to PyPI is available for running your own, such as behind a corporate firewall.

Package Servers: Posting Metadata

svn co foo
cd foo
python setup.py register
1. existing login
2. register anew
3. reset/email password
  • entries keyed by (projectname, version)

  • minimum metadata required

    projectname, version, URL, contact info

Here we see use of the register command for pushing metadata for a distribution up to a package server, by default PyPI.

Sending data to a package server requires a username and password. distutils will prompt for an existing one, for permission to create a new one, or to reset your password and have PyPI email a new, random one to you.

Within PyPI, entries are uniquely identified by the (projectname, version).

A package server such as PyPI, besides checking the correctness of metadata, enforces a minimum set of fields.

Package Servers: Stashing Credentials

  • register asks to save auth info

  • NOT in a distutils config file

  • $HOME/.pypirc:

  • setup.cfg:


Upon completion, the register command asks if you want to save the username and password entered into a local configuration file.

This file is NOT one of the distutils configuration files but one specific to PyPI.

Or you can place the information in the file yourself.

Oddly, the "repository" field is only used with the upload command which we cover next, not the register command. To convince register to use a repository other than PyPI, add something like this to one of the distutils configuration files.
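For illustration (the username, password, and URL below are placeholders), the era's $HOME/.pypirc held a single server-login section:

    [server-login]
    username: fred
    password: secret

and a [register] section in a distutils configuration file could redirect the register command to another repository:

    [register]
    repository = http://pypi.example.com/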

Package Servers: Posting Distributions

$ svn co foo
$ cd foo
$ python setup.py sdist upload
$ python setup.py sdist --formats=gztar,zip upload
$ python setup.py sdist upload --sign
$ python setup.py bdist --formats=rpm upload
  • metadata sent as well
  • new releases hide previous releases

The "upload" command is used to upload actual distribution files. Since there are many potential distribution flavors and formats, the choice of what to upload is given earlier in the command-line.

Uploads can be signed with a GnuPG key by adding the --sign option. There is also an --identity option that supplies a user ID to pass to the GnuPG tools.

To upload a binary distribution instead or in addition to a source distribution, use the bdist command.

Uploads can also be performed manually by visiting the PyPI website.

Submitting a distribution file automatically submits the metadata.

A new release of a package hides all previous releases with respect to listings and searches. You can manually override this by visiting the PyPI website, until the next submission of metadata.

Features Missing from Distutils

  • downloading packages
  • automatic dependency satisfaction
  • no way to list -installed- distributions
  • no official way to uninstall a module
  • no dev mode; have to install each time to test
  • no help/documentation bundling yet

For More Information

  • the Distutils-SIG and Mailing List
  • "Distributing Python Modules" (guide for developers)
  • "Installing Python Modules" (guide for sysadmins)
  • Distutils Cookbook - Collection of Recipes
  • Community Wiki for Distutils (links to useful info)
  • Source to Python Package Index
  • "Cleaning Up PyBlosxom Using Cheesecake"
  • Questions?

Buildout: Precision Assembly, Repeatability, Islands

Author: Jeff Rush
Copyright: 2008 Tau Productions Inc.
License: Creative Commons Attribution-ShareAlike 3.0
Date: March 13, 2008
Series: Python Eggs and Buildout Deployment - PyCon 2008 Chicago

A follow-on to the setuptools talk introducing the buildout tool that uses parts specifications to repeatably bring together specific combinations and versions of Python eggs, along with non-Python elements, into controlled islands of development and deployment.


Roadmap to Talk

  • The Benefits
  • What Is It?
  • How Does It Work?
  • The Concepts
  • Getting Started: Dirs, Specs, Args
  • Recipe Functionality
  • Examples of Usage
  • For More Information


The Benefits of Buildout

  • intended for

    • creating applications, not libraries
    • final deployment AND
    • daily development
  • islands of development

    • does NOT install into system Python
  • blueprints lead to repeatability

  • not just for Python software

  • not necessary to eggify software base

  • offline usage

In the prior talks on distutils and then setuptools, we focused on creating distributions of reusable modules, more in the sense of libraries. Buildout takes us in a different direction, using those packaging capabilities to bring together sets of distributions into whole applications in a controlled manner.

These applications can be deployed as self-contained source releases and RPMs in ways that facilitate operation by experienced Unix system administrators. Prior to deployment however, buildout is a useful tool in the development phase as well.

The buildout tool is based on the premise that installing distributions into the system instance of Python is, for a developer, a bad thing that leads to conflicts and unknown interactions with packages not under control of buildout. For this reason, buildout relies upon sandboxes or "islands of development", similar to how virtualenv works. In fact, it can be used alongside virtualenv.

Buildout is based on the idea of engineering blueprints; that an architect can rigorously specify the parts that go into an assembly and construct a product in a repeatable fashion. The word "buildout" comes from the manufacturing industry and refers to a specification of a set of parts and instructions on how to assemble them.

Note that a buildout could still behave differently due to changes in parts of the environment, such as system libraries, not controlled by the buildout.

Unlike the packaging tools covered previously, buildout encompasses not just Python software but non-Python elements such as configuration sets, multiple programs, Apache instances, database servers and so forth.

As a result of this, it is NOT necessary to eggify your software base to use buildout.

And buildout, while relying upon a package repository such as the Cheeseshop, is also able to function offline from the net from collections of parts within a cache directory.

What is Buildout?

  • Jim Fulton of Zope Corporation

  • leverages setuptools and eggs

  • for developers, not end-users

  • coarse-grained build system (config-based)

  • NOT fine-grained (rules-based)

    • Make, scons, distutils
  • explicit, declarative, not tweakable

  • anything built by buildout ...
    • is controlled by it.

buildout was conceived by Jim Fulton of Zope Corporation in 2006 and, while often used with the Zope web framework, is completely independent of it.

It draws from an extensible collection of recipes to drive the assembly process, and leverages setuptools and eggs to manage Python packages.

Because of its architectural focus, the audience for buildout is more toward developers than the end-user.

buildout is a coarse-grained build system, differing from fine-grained approaches such as Make, scons and distutils. Those systems focus on individual files and use rules to determine how to compute one from another. Buildout works with larger elements such as applications, configuration files and databases, and uses configuration instead of rules to fit them together.

Rule systems are better where the sheer number of low-level elements makes it worth exploiting regularities to reduce complexity. Configuration systems are better at specifying the one-off relationships among relatively few high-level elements.

In one sense it is a better Make but works at a higher level than Make, dealing with large components rather than individual files.

buildout is not ideal for informal experimentation, in that it requires explicit specification of the parts used in an application. This is done in a declarative manner, in that the architect says "what to use", not "how to do it". This makes tweaking to control the low-level process difficult.

Part of using buildout is understanding that anything built by buildout is controlled by it. Temporary hacks to generated files will be thrown away on the next build. To make a permanent change, it is necessary to update the buildout configuration.

There are exceptions, such as the recipes that manage checkouts: to avoid losing user data, they do not remove checkouts on uninstall, nor the data directories they create.

How does Buildout Work?

  • reads an assembly specification
  • to install (uninstall) into a deployment tree
  • specific parts and scripts, based on
  • a state file indicating what is in the tree
  • to make the tree match the specification.
  • uses setuptools to find/install eggs
  • and plugins (eggs) called recipes
  • for flexible parts handling, both egg and non-egg
  • allowing parts to reference other parts
  • and scripts to have custom import module-sets.

buildout is a tool that, each time it is run, reads an assembly specification and installs or uninstalls parts and scripts in a deployment tree, consulting a state file that records what is currently in the tree, until the tree matches the specification.

To accomplish this, buildout uses setuptools to find and install eggs, along with plugins (themselves eggs) called recipes, giving flexible handling of both egg and non-egg parts, allowing parts to reference other parts, and letting scripts have custom sets of importable modules.

The Concepts of Buildout

  • specification
  • recipes
  • parts
  • custom interpreters
  • executable scripts
  • deployment tree

Concepts: What is a Specification?

  • one or more (included) text files
  • format of ConfigParser
  • different specs for different modes
  • constrain versions and repositories

A specification is a text file that itemizes the parts that go into an assembly, names the recipes used by the various parts, and provides to those recipes an open-ended form of configuration.

When stored in a software control repository, it can reproduce an exact deployment or development scenario, upon being checked out and having a build operation invoked upon it.

The format of a specification corresponds to that accepted by ConfigParser, a standard Python module. A specification can be given in a single such file or factored into multiple ones.

Most commonly, there will be multiple specifications for a project, say one for development, one for testing and another for field deployment.

Specifications do more than just list the parts involved. They can place constraints on acceptable versions and provide details on where to automatically download them from, whether the Cheeseshop or project-specific websites.

If a part is removed from a specification, it is uninstalled from the deployment tree. If a part's recipe or configuration changes, the part is uninstalled and reinstalled.
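Since a specification is plain ConfigParser syntax, it can be read with the Python standard library. A minimal sketch (using the modern configparser spelling; the part names and eggs here are made up for illustration, and real buildout layers substitution and defaults on top of this):

```python
# Parse a buildout-style specification with the stdlib ConfigParser.
# Part names (server, docs) are hypothetical, for illustration only.
from configparser import ConfigParser

SPEC = """\
[buildout]
parts = server docs

[server]
recipe = zc.recipe.egg
eggs = zope.component

[docs]
recipe = zc.recipe.egg
eggs = docutils
"""

parser = ConfigParser()
parser.read_string(SPEC)

# The [buildout] section lists the parts; each part has its own
# section that names the recipe managing it.
parts = parser.get("buildout", "parts").split()
recipes = {part: parser.get(part, "recipe") for part in parts}
print(parts)
print(recipes)
```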

Concepts: What is a Part?

  • "something to be managed by a buildout"
  • may be an egg, tarball, checkout, config file, etc.
  • has a name and its own data directory
  • is an object with attributes, inside buildout
  • gets installed, updated and uninstalled
  • can reference other parts
  • is defined by one recipe that handles it
    • along with recipe-specific parameters

A part is simply something to be managed by a buildout.

It can be almost anything, such as a Python package, a program, a directory, or even a configuration file.

It has a name unique within the specification and its own directory within which it can scribble anything. These scribbles can be referenced by other parts.

Within buildout, a part is an object with an open-ended set of attributes.

It may be installed, updated and uninstalled, over a series of builds. If a part reference is removed from a specification, upon the next invocation of Buildout, it is uninstalled from the deployment tree. If a part's recipe or configuration changes, the part is uninstalled and reinstalled.

A part can reference other parts within the same specification, accessing their attributes, configuration and private directory.

Each part is defined by a recipe, which contains the logic to manage it, along with some data, specific to that part, used by that recipe.
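In code terms, a recipe is an object constructed with the whole buildout, the part's name, and the part's options; buildout then calls its install and update methods over a series of builds. A hypothetical minimal recipe sketch (the class and its behavior are illustrative, not one of the shipped recipes):

```python
class ExampleRecipe(object):
    """Hypothetical recipe sketch: buildout makes one instance per part."""

    def __init__(self, buildout, name, options):
        self.buildout = buildout   # all sections of the specification
        self.name = name           # the part's unique name
        self.options = options     # this part's "option = value" pairs

    def install(self):
        # Return the paths this part creates, so buildout can remove
        # them when the part is uninstalled or its configuration changes.
        path = self.options["location"] + "/" + self.name + ".cfg"
        return [path]

    def update(self):
        # Called on later runs when the part's configuration is unchanged.
        pass

# Roughly what buildout does for a part named "demo":
recipe = ExampleRecipe(buildout={}, name="demo",
                       options={"location": "/tmp/parts/demo"})
created = recipe.install()
print(created)
```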

Concepts: What is a Recipe?

  • is what buildout is made of
  • are themselves eggs
  • can contain multiple sub-recipes
  • get installed automatically
  • a set of starter recipes for
    • installing eggs
    • generating scripts
    • custom Python interpreters
    • custom egg compilation
  • many 3rd party recipes in the Cheeseshop

buildout itself is constructed out of recipes, which are objects that know how to install, update and uninstall a type of part.

Recipes are themselves eggs; when one is referenced in a specification, buildout will automatically locate the recipe and install it into the buildout environment.

A recipe can contain multiple sub-recipes, accessible as distinct egg entrypoints.

A set of starter recipes ships with buildout, in the egg named zc.recipe.egg.

The Cheeseshop contains many add-on recipes, if you search for "recipe" in the name or keyword field.

Getting Started with Buildout

  • Installing Buildout
  • Project Directory Structure
  • Specification File
  • Controlling Versions
  • About Caches
  • How Distributions Are Found
  • The Command-Line
  • Configuration of Defaults

Getting Started: Installing Buildout

  • globally as an egg
    • easy_install zc.buildout
  • locally under a project
    • python
$ svn co svn:// z3hello
$ cd z3hello
$ python2.4
$ bin/buildout buildout:newest=false
$ bin/instance fg

The buildout software can be installed system-wide as an egg, using easy_install, or locally under a project, by running the "" script that is bundled with most existing projects.

The "" command will:

  • create support directories, like bin, eggs, and work, as needed,
  • download and install the zc.buildout and setuptools eggs,

Here is an example of setting up an existing project that uses buildout. Note that it takes a while to download and build everything it needs.

The full URL for the example is:

svn:// z3hello

Getting Started: Project Directory Structure


A project directory usually contains a "" script to help a new developer set up the tree after checking out a project. The file is optional.

The specification for the entire project defaults to "buildout.cfg" but there are often others, such as "deployment.cfg" and "production.cfg".

In the "bin/" directory are the executable scripts that buildout generates from entrypoints within distributions.

The "develop-eggs/" directory holds egg links for software being developed in the buildout. We separate "develop-eggs/" and "eggs/" to allow egg cache directories to be shared across multiple buildouts. For example, a common developer technique is to define a common eggs directory in their home that all non-develop eggs are stored in. This allows larger buildouts to be set up much more quickly and saves disk space.

And the "parts/" directory contains code and data managed by buildout, or more precisely by the recipes that make it up.

If you look hard, you will also find a hidden file named ".installed.cfg", which is where buildout keeps its state of what is currently installed. Do not tamper with it.

And if you did not change the default locations of the cache directories for eggs and tarballs, there will be an "eggs/" and "downloads/" directory. A difference between the two is that those in "eggs/" will be referenced "in-place" while those in "downloads/" will be unpacked into a subdirectory of "parts/".

Getting Started: Project Directory Structure (cont'd)

var/ (Zope instance data area)
products/ (Zope2 products)
  • svn-ignore:
    • eggs/, downloads/
    • bin/, develop-eggs/, parts/
    • var/
    • .installed.cfg

And of course there are the other files and directories about which buildout is not concerned.

There is usually a "README.txt" file because several tools complain if it is not there. If the build is itself an egg (and not all are), there will also be "" and "setup.cfg" files.

And there is often a "src/" directory under which the source of your own eggs or checkouts reside.

If the build represents a Zope instance, there may also be a "var/" directory to hold the instance data such as a ZODB, and a "products/" directory to contain Zope Products, which are used in Zope 2.

A question that usually arises with a project is which parts to check into a version control system and which are automatically generated and managed by buildout.

Obviously the two distribution cache directories should not be checked in.

Nor should the "bin/" directory, into which buildout places generated scripts; the "develop-eggs/" directory, which is really just a collection of egg-links that point into your "src/" directory for work under development; or the "parts/" directory, under which recipes store somewhat transient data belonging to the parts they manage.

And if you're running Zope, it is not common to check the "var/" directory in, unless your policy is to store frozen ZODB databases.

And last, the ".installed.cfg" file that buildout uses to keep track of the state of parts should not be checked in. buildout will generate it as needed upon the next build operation.

Getting Started: Specification File (syntax)

[buildout]
parts = ODBC_installation ODBC_configuration

[ODBC_installation]
recipe = zc.recipe.cmmi
url =
extra_options = --disable-gui

[ODBC_configuration]
recipe = tau.recipe.odbc:iniwriter
odbc_ini =
    Driver = FreeTDS

Specification files are in the format accepted by the ConfigParser Python module, with variable-definition and substitution extensions. Such a file is broken into [sections], where each part has its own section, named after the part.

Within sections are "option = value" lines. A value can be spread across multiple lines by indenting it.

The "buildout" section is the only required section in the specification file. It is the options in this section that cause other sections to be read and used.

The "parts = <space-delimited names>" option lists the parts that go into an assembly. Parts that depend on other parts not specified here will automatically be identified and pulled in as well.

Each part is then further described under its section. The first option described for every part is "recipe=", which identifies the plugin used to manage it. All other options under a part description are dependent upon what that recipe accepts. For the curious, options are passed as keyword arguments to recipe objects.

The recipe "zc.recipe.cmmi" is one that understands how to download a tarball and perform the common sequence of commands: ./configure; make; make install". That installation occurs into the "parts/" directory, into a subdirectory named after the part. The recipe takes a "url=" option that tells it from where to download the archive.

Notice that installation and configuration are treated as separate operations. This is a good policy to follow for buildouts because, among other things, it enhances specification reusability in different environments (development, testing, deployment).

The recipe "tau.recipe.odbc" accepts a multiline value and writes it into a file whose name is given by the option. The value can contain any text, as long as it is indented in the specification.
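Conceptually, the cmmi recipe just runs the classic build sequence inside the part's directory. A rough sketch of the commands such a recipe assembles (the helper function is hypothetical, not the recipe's actual API):

```python
def cmmi_commands(prefix, extra_options=""):
    """Assemble the configure/make/make-install sequence that
    zc.recipe.cmmi conceptually performs (sketch, not the real code)."""
    configure = "./configure --prefix=" + prefix
    if extra_options:
        configure += " " + extra_options
    return [configure, "make", "make install"]

# The ODBC_installation part above would translate to roughly:
cmds = cmmi_commands("parts/ODBC_installation",
                     extra_options="--disable-gui")
for cmd in cmds:
    print(cmd)
```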

Getting Started: Specification File (references)

  • ${partname:optionname}
  • parts processed in order, beware circular refs
[Zope2_installation]
recipe = plone.recipe.zope2install
url =

[Zope2_instance]
recipe = plone.recipe.zope2instance
zope2-location = ${Zope2_installation:location}

Within a specification file, parts can reference attributes of other parts, such as the "location" of their parts directory. Any "option = value" field can be referenced in this way.

Parts declarations are processed in the order they appear in the specification file, so avoid circular references.

Parts referenced in this manner automatically become dependencies of the reading part. It is the same as putting its name in the buildout parts= option.
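The substitution mechanism itself can be sketched in a few lines (simplified; real buildout also detects circular references and reports missing options; the section names mirror the Zope2 example above):

```python
import re

# Sections as buildout would see them after parsing the spec above.
sections = {
    "Zope2_installation": {"location": "parts/Zope2_installation"},
    "Zope2_instance": {"zope2-location": "${Zope2_installation:location}"},
}

REFERENCE = re.compile(r"\$\{([^:}]+):([^}]+)\}")

def resolve(value, sections):
    """Expand ${partname:optionname} references, recursively."""
    def replace(match):
        part, option = match.group(1), match.group(2)
        return resolve(sections[part][option], sections)
    return REFERENCE.sub(replace, value)

print(resolve(sections["Zope2_instance"]["zope2-location"], sections))
```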

Getting Started: Specification File (includes)

  • base.cfg common specification
  • dev.cfg development specification

    extends = base.cfg
  • rpms.cfg RPM generation specification

    extends = base.cfg

Specification files can include one another, to factor out common options and to provide for distinct deployment target environments.

Getting Started: Controlling Versions

[buildout]
newest = false        # or current acceptable egg?
prefer-final = true   # or under-development releases?
versions = release-1  # section of explicit versions
# error if any version is picked automatically?
allow-picked-versions = false

[release-1]
spam = 1
eggs = 2.2

buildout offers several degrees of control over the versions of parts used for assemblies. These options can be specified either in the per-user $HOME/.buildout/default.cfg or in a per-project buildout specification file. Some policies make more sense in one than the other.

The default mode of operation for buildout is to always try to find the latest distributions that match requirements. Often going over the network, this lookup operation can be very time consuming. The newest option can disable this, so that buildout will use the currently installed eggs as long as they meet the requirements. It also lends a certain stability to the development environment. The -N command-line option also disables it.

When searching for new releases is enabled, the newest available release is used. This isn't usually ideal, as you may get a development release or alpha releases not ready to be widely used. The prefer-final option controls whether to only use the latest final or stable releases.

In buildout version 2, final releases will be preferred by default. You will then need to use a false value for prefer-final to get the newest releases.

In order to give more control over the precise version of distributions used, a versions option can be specified in the buildout section that points to a section that itemizes the versions to be used.

To populate this section, running buildout in verbose mode will print the versions selected of the various distributions.

To ensure no versions slip past and are picked automatically, the allow-picked-versions option can be used to disable the automatic process and generate an error instead, giving absolute control over version selection.
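The pinning behavior can be sketched as follows (simplified logic for illustration; the names and version numbers are hypothetical, and real buildout resolves versions through setuptools requirements):

```python
pins = {"spam": "1.1"}   # a [release-1]-style section of explicit versions

def choose_version(name, available, pins, allow_picked=True):
    """Return the pinned version if one exists; otherwise "pick" the
    newest available release, or fail if picking is disallowed."""
    if name in pins:
        return pins[name]
    if not allow_picked:
        raise ValueError("version for %r would be picked automatically"
                         % name)
    return max(available)   # stand-in for "newest matching release"

print(choose_version("spam", ["1.0", "1.1", "1.2"], pins))  # pinned
print(choose_version("eggs", ["2.1", "2.2"], pins))         # picked
```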

Getting Started: About Caches

  • establishing shared caches:

    eggs-directory = /var/tmp/buildout/eggs
    download-cache = /var/tmp/buildout/downloads
  • only get things from the cache?

    install-from-cache = true
    offline = true # obsolete

Normally, when distributions are installed, if any processing is needed, they are downloaded from the internet to a temporary directory and then installed from there. A download cache can be used to avoid the download step. This can be useful to reduce network access and to create source distributions of an entire buildout.

buildout supports two cache locations: one for eggs, and one for tarball archives. Without specifying these options, the default is to use directories "eggs" and "downloads" within each project directory tree.

A cache can be used as the basis of application source releases. In an application source release, we want to distribute an application that can be built without making any network accesses. In this case, we distribute a buildout with download cache and tell the buildout to install from the download cache only, without making network accesses. The buildout install-from-cache option can be used to signal that packages should be installed _only_ from the download cache.

The offline option is related, in that it tells buildout whether it is allowed to search distribution repositories on the network.

Getting Started: How Distributions Are Found

find-links =
  • buildout: other sites, then Cheeseshop
  • setuptools: Cheeseshop, then other sites
  • find-links also specified per some recipes

To find distributions, buildout uses the search mechanism built into setuptools, and allows specification of places, in addition to the Cheeseshop, in which to look.

To use an index server other than the Cheeseshop, specify its URL with the --index-url (or index-url = URL) configuration option. There is no provision to have multiple index servers.

NOTE: buildout searches those sites given with --find-links after it searches an index server like the Cheeseshop. setuptools searches in the opposite order.

For installing on non-networked machines, a link server can be represented as simply a directory of eggs or source packages, pointed to with the --find-links command-line option.

Getting Started: The Command-Line

  • buildout [options] [assignments] [cmd [cmd args]]

  • options
    • -c deploy.cfg
    • -o (offline)
    • -n (newest)
    • -D (debug)
  • assignments
    • section:option=value
  • commands

    • buildout init
    • buildout install [parts]
    • buildout runsetup sdist register upload

Any option you can set in the configuration file, you can set on the command-line. Option settings specified on the command line override settings read from configuration files.

-c config_file Specify path to the buildout configuration file to be used. This defaults to the file named "buildout.cfg" in the current working directory.
-o Run in off-line mode. This is equivalent to the assignment "buildout:offline=true".
-n Run in newest mode. This is equivalent to the assignment "buildout:newest=true". With this setting, which is the default, buildout will try to find the newest versions of distributions available that satisfy its requirements.
-D Debug errors. If an error occurs, then the post-mortem debugger will be started. This is especially useful for debugging recipe problems.
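The effect of an assignment can be sketched as a simple override applied after the configuration files are read (parsing simplified for illustration; real buildout also accepts bare "option=value" for the buildout section):

```python
# Settings as read from buildout.cfg (values illustrative).
config = {"buildout": {"newest": "true", "parts": "demo"}}

def apply_assignment(config, assignment):
    """Apply a command-line 'section:option=value' override."""
    target, value = assignment.split("=", 1)
    section, option = target.split(":", 1)
    config.setdefault(section, {})[option] = value

# Equivalent of running: buildout buildout:newest=false
apply_assignment(config, "buildout:newest=false")
print(config["buildout"]["newest"])
```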

Getting Started: Configuration of Defaults

  • two-layer configuration
    • $HOME/.buildout/default.cfg
    • project specification
  • no system-wide settings
  • per-project settings in specification

buildout always looks for an initial configuration file under the $HOME directory and loads it before the assembly specification file. The syntax of the two files is identical; anything that can go into a specification file can go into a defaults file.

Notice from this that, unlike setuptools, buildout has no system-wide settings.

Besides parts information, buildout settings can also go into the per-project assembly specification.
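The two-layer lookup amounts to reading the defaults file first and letting the project specification override it option by option; a sketch (paths and values illustrative):

```python
# $HOME/.buildout/default.cfg, as parsed (values illustrative).
defaults = {"buildout": {"eggs-directory": "/home/dev/.buildout/eggs",
                         "newest": "false"}}
# The project's own specification.
project = {"buildout": {"newest": "true"}}

def merge(defaults, project):
    """Overlay project settings on per-user defaults; later layers win."""
    merged = {}
    for layer in (defaults, project):
        for section, options in layer.items():
            merged.setdefault(section, {}).update(options)
    return merged

settings = merge(defaults, project)
print(settings["buildout"]["newest"])          # project overrides
print(settings["buildout"]["eggs-directory"])  # default survives
```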

About Recipes

  • zc.recipe.egg
  • iw.recipe.subversion
  • zc.recipe.cmmi
  • zc.recipe.testrunner
  • zc.sshtunnel
  • z3c.recipe.ldap
  • tl.buildout_apache
  • z3c.recipe.openoffice
  • zc.recipe.zope3checkout


zc.recipe.egg

Installs one or more eggs, along with their dependencies. It installs their console-script entry points with the eggs needed included in their paths.


zc.recipe.testrunner

Generates scripts to run project-specific unit tests over a collection of eggs. The eggs must already be installed (using the zc.recipe.egg recipe).


zc.recipe.zope3checkout

Installs a checkout from the Zope 3 repository.


Sets up a server instance for running Zope 3.


Creates an empty ZODB filestorage instance and generates a ZConfig-style configuration clause for using it.

Recipes: Egg Installation and Scripts

  • zc.recipe.egg
    • :eggs
    • :scripts
    • :custom
    • :develop
recipe = zc.recipe.egg:eggs
eggs = docutils
    ZODB3 <=3.8
find-links =
index =

The zc.recipe.egg recipe installs one or more eggs, with their dependencies. It has four sub-recipes that can be referenced by adding a colon and their name to the recipe= line. The default sub-recipe is "scripts".

The eggs option accepts one or more distribution requirements, one per line. Acceptable versions can be specified. Any dependencies of the named eggs will also be installed.

It is also possible to specify a per-part "find-links=" list of places to look for distributions, as well as the location of a specific index server such as the Cheeseshop.

Recipes: Customizing the Building of Eggs

[spreadtoolkit]
recipe = zc.recipe.cmmi
url =

recipe = zc.recipe.egg:custom
egg = SpreadModule ==1.4
find-links =
include-dirs = ${spreadtoolkit:location}/include
library-dirs = ${spreadtoolkit:location}/lib
  • zc.recipe.egg:develop

The ":custom" sub-recipe of zc.recipe.egg provides for custom building of an egg from its source distribution. Sometimes a distribution has extension modules that need to be compiled with special options, such as the location of include files and libraries.

In this example, we have a part representing a non-Python library that needs to be built using the "./configure; make; make install" dance.

And then a part that uses that library to build a Python extension module. Notice how the second part references the location into which the first part was installed.

There is a ":develop" sub-recipe that is similar to ":custom", except that it operates upon develop-eggs that you may be working on. The resulting eggs are placed in the develop-eggs directory because the eggs are buildout specific.

Recipes: Generating Scripts for Eggs

recipe = zc.recipe.egg:scripts
eggs = zc.rst2
scripts = rst2=s5
extra-paths =
entry-points = ...

The zc.recipe.egg:scripts recipe scans those eggs specified with "eggs=" for entrypoints of the group "console_scripts" and, for each one found that appears in "scripts=", generates a script, usually in the "bin/" directory, that invokes it. If there is no "scripts=" option, all found entrypoints have a script generated for them.

The "eggs=" option also controls the set of distributions that will be "baked into" or activated within those specific scripts.

The "scripts=" option also permits aliasing a script, by providing an alternate name, after the second '=', for the script file itself. In this case the "rst2" entrypoint will be invoked from a script file named "s5".

The "extra-paths=" option provides directories to be added onto the sys.path for the particular scripts.

If a distribution referenced doesn't use setuptools, it may not have declared in its metadata any entry points. In that case, entry points can be specified in the recipe data, using the "entry-points=" option.
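The generation step can be sketched: each "name = module:attribute" entry point becomes a small script that bakes the sys.path entries in and calls the attribute. The template, egg path, and "main" attribute below are simplified stand-ins for what buildout actually writes:

```python
SCRIPT_TEMPLATE = """\
import sys
sys.path[0:0] = %(paths)r

import %(module)s

if __name__ == '__main__':
    %(module)s.%(attr)s()
"""

def generate_script(entry_point, paths):
    """Turn a 'name = module:attribute' declaration into script source."""
    name, target = [piece.strip() for piece in entry_point.split("=", 1)]
    module, attr = target.split(":", 1)
    body = SCRIPT_TEMPLATE % {"paths": paths, "module": module,
                              "attr": attr}
    return name, body

# Hypothetical entry point for the zc.rst2 example above.
name, body = generate_script("rst2 = zc.rst2:main",
                             ["/sandbox/eggs/zc.rst2-0.2-py2.4.egg"])
print(name)
print(body)
```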

Recipes: How Eggs Are Activated Within Scripts


import sys
sys.path[0:0] = [
    '/sandbox/eggs/myegg-1.0-py2.4.egg',  # egg paths baked in (path illustrative)
    ]

import myegg

if __name__ == '__main__':
    myegg.main()  # entry point call (name illustrative)

This is an example of a script generated by buildout, showing how it bakes specific distributions into each script and then invokes code within the egg.

Notice how it differs from a script generated by setuptools, which is more declarative with __requires__ and version constraints, and defers to the entrypoint lookup mechanism.

Recipes: Baking Paths into Python Interpreter Scripts

  • a Python interpreter under a custom filename

  • that maps onto sys.path

    specific eggs and specific versions

[buildout]
parts = zcomponent

[zcomponent]
recipe = zc.recipe.egg:scripts
eggs = zope.component
interpreter = zprompt

buildout provides for the generation of scripts to provide an interactive Python prompt with the specified eggs and their dependencies already activated, which is very useful for debugging specific programming scenarios.

This is similar to a script, but uses the "interpreter=" option instead of the "scripts=" option.

Recipes: Pulling Subversion Files into Buildout

[clipart-svn]
recipe = iw.recipe.subversion
urls =  colorclips  greyclips

recipe = tau.recipe.clipserver
images =

This is an example of how to pull non-egg content stored under version control into a buildout. The "iw.recipe.subversion" recipe accepts a list of URLs from which to check out files, each with a destination directory name. Those directories are placed under the "parts/clipart-svn/" directory.

In the second part we see a server of some type that knows how to deliver those files to a client, and how it manages to reference those checked-out files.

Recipes: Using a Non-Egg Archive of Python Source

[mxODBC_installation]
recipe =
url =

recipe      = zc.recipe.egg
interpreter = zprompt
eggs        = ${Zope2_instance:eggs}
extra-paths =

Sometimes a needed distribution comes as a zipfile of just .pyc files, particularly for a proprietary package such as mxODBC. They're not an actual egg, just a directory tree of files to be used as-is.

The download recipe retrieves archives in a variety of compression formats and unpacks them underneath the "parts/mxODBC_installation/" directory.

The second part can then reference this directory tree explicitly.

Use Case: Starting a New Project

$ mkdir newproj
$ cd newproj
$ buildout init

$ virtualenv --no-site-packages newproj
$ cd newproj
$ bin/easy_install zc.buildout
$ bin/buildout init
$ wget

The first case shows how to start a new buildout, without using virtualenv.

It is suggested that every project that makes use of buildout come bundled with a bootstrap script, to make it easier for the next developer to get started; running it installs the setuptools and buildout distributions into the project directory.

The actual URL for fetching the "" file is:

The second case shows how to start a buildout within a virtualenv sandbox, with complete isolation from the system site-packages.

Use Case: Picking Up a Buildout Project

$ svn co svn:// Adder
$ cd Adder
$ python
$ bin/buildout
$ bin/zopectl fg
develop = .

This is an example of picking up a project that is already packaged for use with buildout.

This particular project already has a "develop=" line in its buildout.cfg that points to the project root. This means that, within the buildout, the package will already be a develop-egg, so that one can begin making changes to the source immediately and have them reflected in the runtime behavior without having to build/install the package each time.

The actual URL for the example project is:

svn:// Adder

Use Case: Buildout Around Other Projects

$ virtualenv --no-site-packages myproj
$ cd myproj
$ bin/easy_install zc.buildout
$ mkdir src
$ svn co src/modulex
[buildout]
develop = src
parts = myproj_eval

[myproj_eval]
recipe = zc.recipe.egg:scripts
eggs = modulex
interpreter = aprompt

Often you run across a package or two that are not buildout-aware but you want to experiment with them inside a buildout sandbox.

This example sets up a sandbox and then brings the outside package into it under the "src/" directory. It may be a plain checkout or, if this buildout is itself going to be stored under version control, a Subversion externals checkout. In this manner, checking out the buildout will pull down all the developmental pieces.

Within our buildout, we tell buildout to treat the outside package as a "develop-egg", and reference its distribution name as "modulex".

Use Case: Making Your Project Buildout-Aware

[buildout]
develop = .
parts = test

[test]
recipe = zc.recipe.testrunner
eggs = zc.ngi

To package your project so that it is buildout-aware, drop a minimal "buildout.cfg" file in the project root.

A common usage of buildout is to support development of a single package along with running tests.

The "develop = ." says to find the packaging file in the current directory and activate the package as a development egg, so that I can make changes to it and re-run the tests as I work.

The value of a "develop=" option can be more than one directory, each of which has its own packaging file.

The location, name and such of this package are provided in that file, and could be any number of Python packages arranged in any directory structure I choose.

To experiment with an example of this pattern of usage:

$ svn co zc.ngi

Use Case: Distributing Project as Egg

$ bin/buildout
$ bin/test
$ bin/buildout runsetup . sdist
$ bin/buildout runsetup . bdist_egg

This is an example of how, after developing and testing your project, buildout is used to push it up to a package index like the Cheeseshop.

Use Case: Production Deployment (non-RPMs)

recipe = zc.recipe.egg:scripts
eggs = zc.sourcerelease
$ bin/buildout-source-release file:///tmp/jtest buildout.cfg
$ tar xvzf jtest.tgz
$ cd jtest
$ bin/python2.5

This example uses the "zc.sourcerelease" recipe to cause an entire buildout, including dependencies, to be bundled into a tarball.

Note that the tarball does NOT include an actual Python interpreter; one must already be installed on the destination system in order to run the included install script.

Use Case: Production Deployment (RPMs)

  • tarball using zc.sourcerelease

  • hand-create a .spec file that has:

    • Source: %{source}.tgz

    • a %prep that unpacks it

    • a %build that:

      • copies it under /opt/MYPROJ/

      • runs

        python /opt/MYPROJ/ bootstrap

      • runs

        python /opt/MYPROJ/ buildout:extensions=

    • a %files that grabs everything under /opt/MYPROJ/

This example shows how to produce an RPM for installation. It uses the "zc.sourcerelease" recipe to first produce a tarball, and then a hand-made RPM .spec file to turn that into an RPM.

A key part of this for an application like Zope or ZODB is separating a build into software parts and configuration parts.

The software parts are assembled when the source release/rpm is built.

The configuration is done post-install, by invoking scripts within the %build section of the RPM .spec file. The ZODB and Zope 3 recipes were specifically designed to support this separation.

For More Information

  • Primary Buildout Home Page
  • EuroPython 2007: Philipp v. Weitershausen
  • Minitutorial: Introduction to zc.buildout
  • Good documentation in the zc.buildout egg itself.
  • Questions?