How To Install PyInstaller
PyInstaller Manual
Version: PyInstaller 2.1
Homepage: http://www.pyinstaller.org
Author: David Cortesi (based on structure by Giovanni Bajo & William Caban (based on Gordon McMillan's manual))
Contact: [email protected]
Revision: $Rev$
Source URL: $HeadURL$
Copyright: This document has been placed in the public domain.
Contents
Requirements
License
How To Contribute
Installing PyInstaller
Installing in Windows
Installed commands
Console or not?
Using PyInstaller
Options
General Options
Using UPX
Build-time Messages
Advanced Topics
Bootloader
Inspecting Archives
ZlibArchive
CArchive
Using pyi-archive_viewer
Inspecting Executables
Multipackage Bundles
MERGE Function
Hooks in Detail
Development tools
Building
ImportTracker
analyze_one()
Module Classes
code scanning
Hooks
Warnings
Cross Reference
Outdated Features
Building Optimized
ImportManager
ImportDirector
PathImportDirector
Owner
Packages
Possibilities
Compatibility
Performance
Limitations
iu Usage
Requirements
Windows
Windows XP or newer.
PyWin32 Python extensions for Windows is needed for users of Python 2.6 and later.
Mac OS X
Mac OS X 10.4 (Tiger) or newer (Leopard, Snow Leopard, Lion, Mountain Lion).
Linux
ldd - Console application to print the shared libraries required by each program or shared library.
objdump - Console application to display information from object files.
Solaris
ldd
objdump
AIX
AIX 6.1 or newer. Python executables created using PyInstaller on AIX 6.1 should work on AIX
5.2/5.3.
ldd
objdump
License
PyInstaller is distributed under the GPL License but it has an exception such that you can use it to compile
commercial products.
In a nutshell, the license is GPL for the source code with the exception that:
1. You may use PyInstaller to compile commercial applications out of your source code.
2. The resulting binaries generated by PyInstaller from your source code can be shipped with
whatever license you want.
3. You may modify PyInstaller for your own needs but changes to the PyInstaller source code fall
under the terms of the GPL license. That is, if you distribute your modifications you must
distribute them under GPL terms.
For updated information or clarification see our FAQ at the PyInstaller home page.
How To Contribute
PyInstaller is an open-source project that is created and maintained by volunteers. At Pyinstaller.org you
will find links to the mailing list, IRC channel, and Git repository, and the important How to Contribute link.
Contributions to code and documentation are welcome, as well as tested hooks for installing other
packages.
Installing PyInstaller
Beginning with version 2.1 PyInstaller is a Python package and is installed like other Python packages.
Installing in Windows
For Windows, PyWin32 is a prerequisite. Follow that link and carefully read the instructions; there is a
different version of PyWin32 for each version of Python. With this done you can continue to install pip
using the MS-DOS command line.
However it is particularly easy to use pip-Win, which sets up both pip and virtualenv and makes it simple
to install packages and to switch between different Python interpreters. (For more on the uses of
virtualenv, see Supporting Multiple Platforms below.)
When pip-Win is working, enter this command in its Command field and click Run:
venv -c -i pyi-env-name
This creates a new virtual environment rooted at C:\Python\pyi-env-name and makes it the current
environment. A new command shell window opens in which you can run commands within this
environment. Enter the command
pip install PyInstaller
Whenever you want to use PyInstaller,
Start pip-Win
In the Command field enter venv pyi-env-name
Click Run
Then you have a command shell window in which commands execute in that environment.
For platforms other than Windows, Linux and Mac OS, you must build a bootloader program for your
platform before installing the Python package.
cd into the distribution folder.
cd bootloader.
Make a bootloader with: python ./waf configure build install.
If this reports an error, read Building the Bootloader below, then ask for technical help. It is of no use to
continue the installation without a bootloader. After the bootloader has been created, you can run
python setup.py install with administrator privileges to complete the installation.
Installed commands
The complete installation places these commands on the execution path:
pyinstaller is the main command to build a bundled application. See Using PyInstaller.
pyi-makespec is used to create a spec file. See Using Spec Files.
pyi-build is used to execute a spec file that already exists. See Using Spec Files.
pyi-archive_viewer is used to inspect a bundled application. See Inspecting Archives.
pyi-bindepend is used to display dependencies of an executable. See Inspecting Executables.
pyi-grab_version is used to extract a version resource from a Windows executable. See
Capturing Version Data.
pyi-make_comserver is used to build a Windows COM server. See Windows COM Server
Support.
If you do not perform the complete installation (setup.py or installing via pip), these commands will not
exist as commands. However you can still execute all the functions documented below by running Python
scripts found in the distribution folder. The equivalent of the pyinstaller command is
pyinstaller-folder/pyinstaller.py. The other commands are found in pyinstaller-folder/cliutils/ with obvious
names (makespec.py, etc.)
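A quick way to confirm that the installation put these commands on the execution path is to look each one up. The sketch below is only an illustration: the helper name find_installed_commands is made up here, and it relies on Python's standard shutil.which (Python 3.3 or later) rather than on anything in PyInstaller.

```python
import shutil

# Command names as listed in this section.
PYI_COMMANDS = [
    "pyinstaller", "pyi-makespec", "pyi-build", "pyi-archive_viewer",
    "pyi-bindepend", "pyi-grab_version", "pyi-make_comserver",
]


def find_installed_commands(names=PYI_COMMANDS, which=shutil.which):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_installed_commands().items():
        print("%-20s %s" % (name, path or "not found"))
```

Any command reported as not found can still be reached through the scripts in the distribution folder, as described above.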
Console or not?
By default the bootloader creates a command-line console (a terminal window in Linux and Mac OS, a
command window in Windows). It gives this window to the Python interpreter for its standard input and
output. Error messages from Python and print statements in your script will appear in the console window.
If your script reads from standard input, the user can enter data in the window.
An option for Windows and Mac OS is to tell PyInstaller to not provide a console window. The bootloader
starts Python with no target for standard output or input. Do this if your script has a graphical interface for
user input and can properly report its own diagnostics.
Using PyInstaller
The syntax of the pyinstaller command is:
pyinstaller [options] script [script ...] | specfile
In the most simple case, set the current directory to the location of your program myscript.py and
execute:
pyinstaller myscript.py
PyInstaller analyzes myscript.py and:
Writes myscript.spec in the same folder as the script.
Creates a folder build in the same folder as the script if it does not exist.
Writes some log files and working files in the build folder.
Creates a folder dist in the same folder as the script if it does not exist.
Writes the myscript executable folder in the dist folder.
In the dist folder you find the bundled app you distribute to your users.
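The files and folders listed above can be summarized as a small path calculation. This is a hedged sketch, not PyInstaller code: expected_outputs is a hypothetical helper, and it assumes the default one-folder mode with none of the --distpath, --workpath or --specpath options given.

```python
import os


def expected_outputs(script_path):
    """Return the default locations PyInstaller is described as writing to."""
    folder = os.path.dirname(os.path.abspath(script_path))
    name = os.path.splitext(os.path.basename(script_path))[0]
    return {
        "spec": os.path.join(folder, name + ".spec"),   # myscript.spec
        "build": os.path.join(folder, "build"),         # logs and working files
        "dist": os.path.join(folder, "dist"),           # bundled output
        "app": os.path.join(folder, "dist", name),      # executable folder
    }


if __name__ == "__main__":
    for label, path in sorted(expected_outputs("myscript.py").items()):
        print("%-6s %s" % (label, path))
```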
Normally you name one script on the command line. If you name more, all are analyzed and included in
the output. However, the first script named supplies the name for the spec file and for the executable
folder or file. Its code is the first to execute at run-time.
For certain uses you may edit the contents of myscript.spec (described under Using Spec Files). After
you do this, you name the spec file to PyInstaller instead of the script:
pyinstaller myscript.spec
You may give a path to the script or spec file, for example
pyinstaller options... ~/myproject/source/myscript.py
or, on Windows,
pyinstaller "C:\Documents and Settings\project\myscript.spec"
Options
General Options
-h, --help
-v, --version
-a, --ascii
--distpath=path_to_executable, -o path_to_executable
--specpath=path_to_spec_file
--workpath=path_to_work_files
--clean
-y, --noconfirm
--log-level=keyword
-p dir_list, --paths=dir_list
--hidden-import=modulename
--additional-hooks-dir=hook-path
--runtime-hook=path-to-hook-file
-n name, --name=name
-D, --onedir
-F, --onefile
-c, --console, --nowindowed
-d, --debug
-s, --strip
--upx-dir=upx_dir
--noupx
--version-file=version_text_file
-r <FILE[,TYPE[,NAME[,LANGUAGE]]]>, --resource=<FILE[,TYPE[,NAME[,LANGUAGE]]]>
-i <FILE.icns>, --icon=<FILE.icns>
pyinstaller --noconfirm --log-level=WARN \
--onefile --nowindow \
--hidden-import=secret1 \
--hidden-import=secret2 \
--upx-dir=/usr/local/share/ \
myscript.spec
Or in Windows, use the little-known BAT file line continuation:
pyinstaller --noconfirm --log-level=WARN ^
--onefile --nowindow ^
--hidden-import=secret1 ^
--hidden-import=secret2 ^
--icon-file=..\MLNMFLCN.ICO ^
myscript.spec
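Another way to avoid platform-specific line continuation is to assemble the command in a short Python script and run it with subprocess. This is only a sketch: build_command is a hypothetical helper, the hidden-import names are the placeholders used in the examples above, and it assumes the pyinstaller command is on the execution path.

```python
import subprocess


def build_command(target, hidden_imports=(), onefile=False, log_level="WARN"):
    """Return the pyinstaller argument list for a script or spec file."""
    command = ["pyinstaller", "--noconfirm", "--log-level=" + log_level]
    if onefile:
        command.append("--onefile")
    for module in hidden_imports:
        command.append("--hidden-import=" + module)
    command.append(target)
    return command


if __name__ == "__main__":
    args = build_command("myscript.spec",
                         hidden_imports=["secret1", "secret2"],
                         onefile=True)
    print(" ".join(args))
    # Uncomment to actually run the build:
    # subprocess.check_call(args)
```

Passing the arguments as a list means no shell quoting or continuation characters are needed on any platform.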
Using UPX
UPX is a free utility available for most operating systems. UPX compresses executable files and libraries,
making them smaller, sometimes much smaller, and it can compress a large number of executable file
formats. See the UPX home page for downloads, and for the list of supported executable formats. As of
May 2013, the only major absence is 64-bit binaries for Windows and Mac OS X. UPX has no effect on
these.
A compressed executable program is wrapped in UPX startup code that dynamically decompresses the
program when the program is launched. After it has been decompressed, the program runs normally. In
the case of a PyInstaller one-file executable that has been UPX-compressed, the full execution sequence
is:
The compressed program starts up in the UPX decompressor code.
After decompression, the program executes the PyInstaller bootloader, which creates a temporary
environment for Python.
The Python interpreter executes your script.
PyInstaller looks for UPX on the execution path or the path specified with the --upx-dir option. If UPX
exists, PyInstaller applies it to the final executable, unless the --noupx option was given. UPX has been
used with PyInstaller output often, usually with no problems.
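The lookup described above can be approximated as follows. This sketch is an assumption about the behaviour, not PyInstaller's actual code: locate_upx is a hypothetical helper, and the order (the --upx-dir folder first, then the execution path) is a guess made for the example.

```python
import os
import shutil


def locate_upx(upx_dir=None, noupx=False):
    """Return the path of a upx executable, or None if UPX should not be used."""
    if noupx:
        # Mirrors the --noupx option: never apply UPX.
        return None
    if upx_dir:
        # Mirrors the --upx-dir option: look in the given folder.
        for candidate in ("upx", "upx.exe"):
            path = os.path.join(upx_dir, candidate)
            if os.path.isfile(path):
                return path
    # Fall back to the execution path.
    return shutil.which("upx")
```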
typecode    description          name              path
'BINARY'    A shared library.    Run-time name.
'DATA'      Arbitrary files.     Run-time name.
'OPTION'                         Option code       ignored.
The prefix argument, if given, is a name for a subfolder within the run-time folder to contain the tree
files. If you omit prefix or give None, the tree files will be at the top level of the run-time folder.
The excludes argument, if given, is a list of one or more strings that match files in the root that should
be omitted from the Tree. An item in the list can be either:
a name, which causes files or folders with this basename to be excluded
*.ext, which causes files with this extension to be excluded
For example:
extra_tree = Tree('../src/extras', prefix='extras', excludes=['tmp'])
This creates extra_tree as a TOC object that lists all files from the relative path ../src/extras,
omitting those that have the basename (or are in a folder named) tmp.
Each tuple in this TOC has:
A typecode of DATA,
A path consisting of a complete, absolute path to one file in the root folder,
A name consisting of the filename of this file, or, if you specify a prefix, the name is prefix/filename.
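To make the shape of the resulting TOC concrete, here is a rough sketch of how such a list could be built by hand. It is not the implementation of Tree: build_tree_toc is a hypothetical function, the exclusion matching is simplified, and the tuples are written in the (name, path, typecode) order used by the OPTION tuples elsewhere in this manual.

```python
import os


def _file_excluded(filename, excludes):
    """True if filename matches a plain name or a '*.ext' item in excludes."""
    for item in excludes:
        if item.startswith("*."):
            if filename.endswith(item[1:]):
                return True
        elif filename == item:
            return True
    return False


def build_tree_toc(root, prefix=None, excludes=()):
    """Collect (name, path, 'DATA') tuples for the files under root."""
    toc = []
    root = os.path.abspath(root)
    for folder, subfolders, files in os.walk(root):
        # Prune excluded folders in place so os.walk does not descend into them.
        subfolders[:] = sorted(d for d in subfolders if d not in excludes)
        for filename in sorted(files):
            if _file_excluded(filename, excludes):
                continue
            path = os.path.join(folder, filename)
            name = os.path.relpath(path, root)
            if prefix:
                name = os.path.join(prefix, name)
            toc.append((name, path, "DATA"))
    return toc
```

Calling build_tree_toc('../src/extras', prefix='extras', excludes=['tmp']) would list the same kind of entries as the Tree example above, under the stated assumptions.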
Option    Description         Example
v         Verbose imports     ('v', None, 'OPTION')
u         Unbuffered stdio    ('u', None, 'OPTION')
W spec    Warning option      ('W ignore', None, 'OPTION')
s         Use site.py         ('s', None, 'OPTION')
For example:
exe = EXE(a.scripts, pyz,
[('v', None, 'OPTION'),('W ignore', None, 'OPTION')],
name="myapp.exe", exclude_binaries=1)
In this example, you have inserted a list of two tuples into the EXE call.
Advanced Topics
The Bootstrap Process in Detail
There are many steps that must take place before the bundled script can begin execution. A summary of
these steps was given in the Overview (How the One-Folder Program Works and How the One-File
Program Works). Here is more detail to help you understand what the bootloader does and how to figure
out problems.
Bootloader
The bootloader prepares everything for running Python code. It begins the setup and then reruns itself in
another process. This approach of using two processes allows a lot of flexibility and is used in all bundles
except one-folder mode in Windows. So do not be surprised if you see your frozen app as two
processes in your system task manager.
What happens during execution of bootloader:
A. First process: bootloader starts.
1. If one-file mode, extract bundled files to temppath/_MEIxxxxxx.
2. Set/unset various environment variables, e.g. override LD_LIBRARY_PATH on Linux or
LIBPATH on AIX; unset DYLD_LIBRARY_PATH on OSX.
3. Set up to handle signals for both processes.
4. Run the child process.
5. Wait for the child process to finish.
6. If one-file mode, delete temppath/_MEIxxxxxx.
B. Second process: bootloader itself started as a child process.
1. On Windows set the activation context.
2. Load the Python dynamic library. The name of the dynamic library is embedded in the
executable file.
3. Initialize Python interpreter: set PYTHONPATH, PYTHONHOME.
4. Run python code.
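The parent side of this sequence can be imitated in a few lines of Python. This is a toy model under stated assumptions, not the bootloader (which is a compiled program): the environment variable DEMO_TEMP_DIR and the function run_two_process are invented for the sketch, signal handling is left out, and the child is simply another Python interpreter.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Child stand-in: succeeds only if it can see the folder prepared by the parent.
CHILD_CODE = (
    "import os, sys; "
    "sys.exit(0 if os.path.isdir(os.environ['DEMO_TEMP_DIR']) else 1)"
)


def run_two_process():
    """Parent side: prepare, run the child, wait for it, then clean up."""
    temp_dir = tempfile.mkdtemp(prefix="_MEI")        # step A1: extraction folder
    try:
        env = dict(os.environ)
        env["DEMO_TEMP_DIR"] = temp_dir               # step A2: adjust environment
        # Steps A4 and A5: start the child process and wait for it to finish.
        returncode = subprocess.call([sys.executable, "-c", CHILD_CODE], env=env)
    finally:
        shutil.rmtree(temp_dir, ignore_errors=True)   # step A6: delete the folder
    return returncode, temp_dir
```

While the child runs, a process listing shows two processes, which is the effect described above for a frozen app.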
Inspecting Archives
An archive is a file that contains other files, for example a .tar file, a .jar file, or a .zip file. Two
kinds of archives are used in PyInstaller. One is a ZlibArchive, which allows Python modules to be stored
efficiently and, with some import hooks, imported directly. The other, a CArchive, is similar to a .zip file,
a general way of packing up (and optionally compressing) arbitrary blobs of data. It gets its name from the
fact that it can be manipulated easily from C as well as from Python. Both of these derive from a common
base class, making it fairly easy to create new kinds of archives.
ZlibArchive
A ZlibArchive contains compressed .pyc or .pyo files. The PYZ class invocation in a spec file creates
a ZlibArchive.
The table of contents in a ZlibArchive is a Python dictionary that associates a key, which is a member's
name as given in an import statement, with a seek position and a length in the ZlibArchive. All parts of a
ZlibArchive are stored in the marshalled format and so are platform-independent.
A ZlibArchive is used at run-time to import bundled python modules. Even with maximum compression
this works faster than the normal import. Instead of searching sys.path, there's a lookup in the
dictionary. There are no directory operations and no file to open (the file is already open). There's just a
seek, a read and a decompress.
A Python error trace will point to the source file from which the archive entry was created (the __file__
attribute from the time the .pyc was compiled, captured and saved in the archive). This will not tell your
user anything useful, but if they send you a Python error trace, you can make sense of it.
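The lookup described above (a dictionary lookup, a seek, a read and a decompress) can be illustrated with a toy archive. This is a sketch under assumptions and does not reproduce the real ZlibArchive file layout: write_archive and ArchiveReader are hypothetical names, and the 8-byte header holding the table-of-contents position is invented for the example.

```python
import marshal
import struct
import zlib


def write_archive(path, sources):
    """Write compiled, compressed modules. sources maps module name -> source."""
    toc = {}
    with open(path, "wb") as f:
        f.write(struct.pack("!Q", 0))            # placeholder for the TOC position
        for name, source in sources.items():
            code = compile(source, name + ".py", "exec")
            blob = zlib.compress(marshal.dumps(code), 9)
            toc[name] = (f.tell(), len(blob))    # seek position and length
            f.write(blob)
        toc_pos = f.tell()
        f.write(marshal.dumps(toc))              # TOC stored in marshalled form
        f.seek(0)
        f.write(struct.pack("!Q", toc_pos))


class ArchiveReader:
    def __init__(self, path):
        self.file = open(path, "rb")             # stays open for later lookups
        (toc_pos,) = struct.unpack("!Q", self.file.read(8))
        self.file.seek(toc_pos)
        self.toc = marshal.loads(self.file.read())

    def load_code(self, name):
        """Return the code object for name, or None if it is not in the archive."""
        entry = self.toc.get(name)               # dictionary lookup, no sys.path search
        if entry is None:
            return None
        pos, length = entry
        self.file.seek(pos)
        return marshal.loads(zlib.decompress(self.file.read(length)))

    def close(self):
        self.file.close()
```

The code object keeps the file name given to compile(), which is why an error trace can still point back to the original source file.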
CArchive
A CArchive can contain any kind of file. It's very much like a .zip file. They are easy to create in Python
and easy to unpack from C code. A CArchive can be appended to another file, such as an ELF or COFF
executable. To allow this, the archive is made with its table of contents at the end of the file, followed only
by a cookie that tells where the table of contents starts and where the archive itself starts.
A CArchive can be embedded within another CArchive. An inner archive can be opened and used in
place, without having to extract it.
Each table of contents entry has variable length. The first field in the entry gives the length of the entry.
The last field is the name of the corresponding packed file. The name is null terminated. Compression is
optional for each member.
There is also a type code associated with each member. The type codes are used by the self-extracting
executables. If you're using a CArchive as a .zip file, you don't need to worry about the code.
The ELF and COFF executable formats (used on Linux, Windows and some other systems) allow arbitrary
data to be concatenated to the end of the executable without disturbing its functionality. For this reason,
a CArchive's Table of Contents is at the end of the archive. The executable can open itself as a binary
file, seek to the end and 'open' the CArchive.
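The idea of appending an archive whose table of contents and cookie sit at the very end can be sketched in Python. The layout below is invented for illustration and is not the real CArchive format: the magic string, the field order, and the names append_archive and read_archive are all assumptions, and compression and type codes are left out.

```python
import os
import struct

MAGIC = b"DEMOARCH"
ENTRY = struct.Struct("!III")     # entry length, data offset, data length
COOKIE = struct.Struct("!8sIII")  # magic, archive length, TOC offset, TOC length


def append_archive(path, members):
    """Append members (name -> bytes) to an existing file such as an executable."""
    data = b""
    toc = b""
    for name, payload in members.items():
        encoded = name.encode("utf-8") + b"\0"            # null-terminated name
        toc += ENTRY.pack(ENTRY.size + len(encoded), len(data), len(payload))
        toc += encoded                                    # name is the last field
        data += payload
    total = len(data) + len(toc) + COOKIE.size
    cookie = COOKIE.pack(MAGIC, total, len(data), len(toc))
    with open(path, "ab") as f:
        f.write(data + toc + cookie)


def read_archive(path):
    """Seek to the end of the file, read the cookie, then unpack every member."""
    with open(path, "rb") as f:
        f.seek(-COOKIE.size, os.SEEK_END)
        magic, total, toc_offset, toc_len = COOKIE.unpack(f.read(COOKIE.size))
        if magic != MAGIC:
            raise ValueError("no archive found at the end of %r" % (path,))
        f.seek(-total, os.SEEK_END)                       # start of the archive
        archive = f.read(total)
    toc = archive[toc_offset:toc_offset + toc_len]
    members = {}
    pos = 0
    while pos < len(toc):
        entry_len, offset, length = ENTRY.unpack_from(toc, pos)
        name = toc[pos + ENTRY.size:pos + entry_len - 1].decode("utf-8")
        members[name] = archive[offset:offset + length]
        pos += entry_len                                  # entries have variable length
    return members
```

Because the reader works backwards from the end of the file, the bytes in front of the archive (the host executable) are never touched.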
Using pyi-archive_viewer
Use the pyi-archive_viewer command to inspect any type of archive:
pyi-archive_viewer archivefile
With this command you can examine the contents of any archive built with PyInstaller (a PYZ or PKG), or
any executable (.exe file or an ELF or COFF binary). The archive can be navigated using these
commands:
O name
Open the embedded archive name (will prompt if omitted). For example when looking in a one-file
executable, you can open the outPYZ.pyz archive inside it.
U
Go up one level (back to viewing the containing archive).
X name
Extract name (will prompt if omitted). Prompts for an output filename. If none given, the member is
extracted to stdout.
Q
Quit.
The pyi-archive_viewer command has these options:
-h, --help
Show help.
-l, --log
-b, --brief
-r, --recursive
Inspecting Executables
You can inspect any executable file with pyi-bindepend:
pyi-bindepend executable_or_dynamic_library
The pyi-bindepend command analyzes the executable or DLL you name and writes to stdout all its
binary dependencies. This is handy to find out which DLLs are required by an executable or by another
DLL.
pyi-bindepend is used by PyInstaller to follow the chain of dependencies of binary extensions during
Analysis.
Multipackage Bundles
Some products are made of several different apps, each of which might depend on a common set of
third-party libraries, or share code in other ways. When packaging such a product it would be a pity to
treat each app in isolation, bundling it with all its dependencies, because that means storing duplicate
copies of code and libraries.
You can use the multipackage feature to bundle a set of executable apps so that they share single copies
of libraries. Each dependency (a DLL, for example) is packaged only once, in one of the apps. Any other
apps in the set that depend on that DLL have an "external reference" to it, telling them to go find that
dependency in the executable file of the app that contains it.
This saves disk space because each dependency is stored only once. However, to follow an external
reference takes extra time when an app is starting up. Some of the apps in the set will have slightly slower
launch times.
MERGE Function
A custom spec file for a multipackage bundle contains one call to the MERGE function:
MERGE(*args)
MERGE is used after the analysis phase and before EXE and COLLECT. Its variable-length list of
arguments consists of a list of tuples, each tuple having three elements:
The first element is an Analysis object, an instance of class Analysis.
The second element is the script name (without the .py extension).
The third element is the name for the executable (usually the same as the script).
MERGE examines the Analysis objects to learn the dependencies of each script. It modifies the total list to
avoid duplication of libraries and modules. As a result the packages generated will be connected.
MERGE( (foo_a, 'foo', 'foo'), (bar_a, 'bar', 'bar'), (zap_a, 'zap', 'zap') )
Following this you can copy the PYZ, EXE and COLLECT statements from the original three spec files,
substituting the unique names of the Analysis objects where the original spec files have a., for example:
foo_pyz = PYZ(foo_a.pure)
foo_exe = EXE(foo_pyz, foo_a.scripts, ... etc.
Save the merged spec file as foobarzap.spec and then build it:
pyi-build foobarzap.spec
There are several multipackage examples in the tests/multipackage folder of the PyInstaller
distribution folder.
Remember that a spec file is executable Python. You can use all the Python facilities (for and with
and the members of sys and io) in creating the Analysis objects and performing the PYZ, EXE and
COLLECT statements.
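The effect of merging can be pictured with plain data structures. The following is a conceptual sketch only and says nothing about how MERGE is implemented: merge_dependencies is a hypothetical function, dependencies are represented as simple strings, and the first app that needs a dependency is assumed to be the one that stores it.

```python
def merge_dependencies(apps):
    """apps is a list of (app_name, [dependency, ...]) in build order.

    Returns a dict mapping each app to the dependencies it bundles itself and
    the ones it reaches through an external reference to another app.
    """
    owner_of = {}
    result = {}
    for app_name, dependencies in apps:
        bundled = []
        external = {}
        for dep in dependencies:
            if dep in owner_of:
                external[dep] = owner_of[dep]   # stored once, in an earlier app
            else:
                owner_of[dep] = app_name        # first user keeps the only copy
                bundled.append(dep)
        result[app_name] = {"bundled": bundled, "external": external}
    return result


if __name__ == "__main__":
    plan = merge_dependencies([
        ("foo", ["libcommon", "libfoo"]),
        ("bar", ["libcommon", "libbar"]),
    ])
    print(plan["bar"])
```

In this model bar would start slightly slower than foo, because it has to follow the external reference to find libcommon.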
Hooks in Detail
A hook is a module named hook-fully.qualified.import.name.py in the hooks folder of the PyInstaller
folder (or in a folder specified with --additional-hooks-dir).
A hook is executable Python code that should define one or more of the following three global names:
hiddenimports
A list of module names (relative or absolute) that the hooked module imports in some opaque way.
These names extend the list of imported modules created by scanning the code. Example:
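As an illustration, a hook file for a hypothetical package named mypkg (saved as hook-mypkg.py) might contain nothing more than the following. The package and module names are invented for this sketch.

```python
# hook-mypkg.py (hypothetical): modules that mypkg imports in ways
# that a scan of its code cannot detect.
hiddenimports = ["mypkg._backend", "mypkg.plugins.default"]
```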
Development tools
On Debian/Ubuntu systems, you can run the following to install everything required:
sudo apt-get install build-essential
On Fedora/RHEL and derivatives, you can run the following:
su
yum groupinstall "Development Tools"
On Mac OS X you can get gcc by installing Xcode, a suite of tools for developing software for Mac OS
X. It can also be installed from your Mac OS X Install DVD. It is not necessary to install version 4 of
Xcode.
On Solaris and AIX the bootloader is tested with gcc.
On Windows you can use the Visual Studio C++ compiler (Visual Studio 2008 is recommended). A free
version you can download is Visual Studio Express.
Note: There is no connection between the Visual Studio version used to compile the bootloader and the
Visual Studio version used to compile Python. The bootloader is a self-contained static executable that
imposes no restrictions on the version of Python being used. So you can use any Visual Studio version
you have around.
You can download and install or unpack a MinGW distribution from one of the following locations:
MinGW - stable and mature, uses gcc 3.4 as its base
MinGW-w64 - more recent, uses gcc 4.4 and up.
TDM-GCC - MinGW and MinGW-w64 installers
Building
On Windows, when using MinGW, you need to add PATH_TO_MINGW\bin to your system PATH
variable. In a command prompt, before building the bootloader, run for example:
set PATH=C:\MinGW\bin;%PATH%
Change to the bootloader subdirectory. Run:
python ./waf configure build install
This will produce
./PyInstaller/bootloader/YOUR_OS/run,
./PyInstaller/bootloader/YOUR_OS/run_d,
./PyInstaller/bootloader/YOUR_OS/runw and
./PyInstaller/bootloader/YOUR_OS/runw_d
which are the bootloaders.
On Windows this will produce in the ./PyInstaller/bootloader/YOUR_OS directory: run*.exe
(bootloader for regular programs), and inprocsrvr*.dll (bootloader for in-process COM servers).
Note: If you have multiple versions of Python, the Python you use to run waf is the one whose
configuration is used.
Note: On AIX the bootloader builds with gcc and is tested with gcc 4.2.0 on AIX 6.1.
ImportTracker
ImportTracker can be called in two ways: analyze_one(name, importername=None) or
analyze_r(name, importername=None). The second method does what modulefinder does - it
recursively finds all the module names that importing name would cause to appear in sys.modules. The
first method is non-recursive. This is useful, because it is the only way of answering the question "Who
imports name?" But since it is somewhat unrealistic (very few real imports do not involve recursion), it
deserves some explanation.
analyze_one()
When a name is imported, there are structural and dynamic effects. The dynamic effects are due to the
execution of the top-level code in the module (or modules) that get imported. The structural effects have to
do with whether the import is relative or absolute, and whether the name is a dotted name (if there are N
dots in the name, then N+1 modules will be imported even without any code running).
The analyze_one method determines the structural effects, and defers the dynamic effects. For example,
analyze_one("B.C", "A") could return ["B", "B.C"] or ["A.B", "A.B.C"] depending on
whether the import turns out to be relative or absolute. In addition, ImportTracker's modules dict will have
Module instances for them.
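The structural part can be worked through in a few lines. This is a hedged illustration rather than ImportTracker's code: structural_candidates is a hypothetical function that only enumerates the names a dotted import could bind, first under the relative interpretation and then under the absolute one.

```python
def structural_candidates(name, importername=None):
    """List the module names implied by importing a dotted name.

    A name with N dots implies N+1 modules. When an importer is given,
    the relative interpretation is listed before the absolute one.
    """
    parts = name.split(".")
    absolute = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    if importername is None:
        return [absolute]
    relative = [importername + "." + item for item in absolute]
    return [relative, absolute]


if __name__ == "__main__":
    for candidate in structural_candidates("B.C", "A"):
        print(candidate)
```

Which of the two candidate lists applies is only known once the analysis has checked whether A.B actually exists.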
Module Classes
There are Module subclasses for builtins, extensions, packages and (normal) modules. Besides the
normal module object attributes, they have an attribute imports. For packages and normal modules,
imports is a list populated by scanning the code object (and therefore, the names in this list may be relative
or absolute names - we don't know until they have been analyzed).
The highly astute will notice that there is a hole in analyze_one() here. The first thing that happens
when B.C is being imported is that B is imported and its top-level code executed. That top-level code
can do various things so that when the import of B.C finally occurs, something completely different
happens (from what a structural analysis would predict). But mf can handle this through its hooks
mechanism.
code scanning
Like modulefinder, ImportTracker scans the byte code of a module, looking for imports. In addition it
will pick out a module's __all__ attribute, if it is built as a list of constant names. This means that if a
package declares an __all__ list as a list of names, ImportTracker will track those names if asked to
analyze package.*. The code scan also notes the occurrence of __import__, exec and eval, and
can issue warnings when they are found.
The code scanning also keeps track (as well as it can) of the context of an import. It recognizes when
imports are found at the top-level, and when they are found inside definitions (deferred imports). Within
that, it also tracks whether the import is inside a condition (conditional imports).
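The distinction between top-level, deferred and conditional imports can be demonstrated with the standard library. Note the swapped-in technique: this sketch walks the source with the ast module instead of scanning byte code as ImportTracker does, and scan_imports is a hypothetical function written for this illustration.

```python
import ast


def scan_imports(source):
    """Return (module name, deferred, conditional) for each import in source."""
    found = []

    def visit(node, deferred, conditional):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.Import):
                for alias in child.names:
                    found.append((alias.name, deferred, conditional))
            elif isinstance(child, ast.ImportFrom):
                # Keep leading dots so relative imports stay recognizable.
                name = "." * child.level + (child.module or "")
                found.append((name, deferred, conditional))
            # Imports inside a function body only run when the function is called.
            is_def = isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
            # Imports under if or try may or may not run.
            is_cond = isinstance(child, (ast.If, ast.Try))
            visit(child, deferred or is_def, conditional or is_cond)

    visit(ast.parse(source), False, False)
    return found


if __name__ == "__main__":
    example = "import os\nif True:\n    import json\ndef f():\n    import re\n"
    for item in scan_imports(example):
        print(item)
```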
Hooks
In modulefinder, scanning the code takes the place of executing the code object. ExtensionModules, of
course, don't get scanned, so there needs to be a way of recording any imports they do.
Please read Listing Hidden Imports for more information.
ImportTracker goes further and allows a module to be hooked (after it has been scanned, but before
analyze_one is done with it).
Warnings
ImportTracker has a getwarnings() method that returns all the warnings accumulated by the
instance, and by the Module instances in its modules dict. Generally, it is ImportTracker who will
accumulate the warnings generated during the structural phase, and Modules that will get the warnings
generated during the code scan.
Note that by using a hook module, you can silence some particularly tiresome warnings, but not all of
them.
Cross Reference
Once a full analysis (that is, an analyze_r call) has been done, you can get a cross reference by using
getxref(). This returns a list of tuples. Each tuple is (modulename, importers), where importers is
a list of the (fully qualified) names of the modules importing modulename. Both the returned list and the
importers list are sorted.
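The shape of that result can be shown with a small stand-in. This is a sketch, not the getxref() implementation: build_xref is a hypothetical function, and its input is assumed to be a plain dictionary mapping each analyzed module to the names it imports.

```python
def build_xref(imports_by_module):
    """Return a sorted list of (modulename, sorted importers) tuples."""
    importers = {}
    for module, imported in imports_by_module.items():
        importers.setdefault(module, set())       # modules nobody imports still appear
        for name in imported:
            importers.setdefault(name, set()).add(module)
    return sorted((name, sorted(who)) for name, who in importers.items())


if __name__ == "__main__":
    for entry in build_xref({"app": ["os", "util"], "util": ["os"]}):
        print(entry)
```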
Outdated Features
The following sections document features of PyInstaller that are still present in the code but are rarely
used and may no longer work.
--out=output_path
If you have the win32dbg package installed, you can use it with the generated COM server. In
drivescript.py, set debug=1 in the registration line.
Caution: the inprocess COM server support will not work when the client process already has Python
loaded. It would be rather tricky to non-obtrusively hook into an already running Python, but the
show-stopper is that the Python/C API won't let us find out which interpreter instance to hook into. (If this
is important to you, you might experiment with using apartment threading, which seems the best
possibility to get this to work). To use a "frozen" COM server from a Python process, you'll have to load it
as an exe:
o = win32com.client.Dispatch(progid,
clsctx=pythoncom.CLSCTX_LOCAL_SERVER)
Building Optimized
There are two facets to running optimized: gathering .pyo's, and setting the Py_OptimizeFlag. Installer
will gather .pyo's if it is run optimized:
python -O pyinstaller.py ...
The Py_OptimizeFlag will be set if you use a ('O','','OPTION') in one of the TOCs building the
EXE:
exe = EXE(pyz,
a.scripts + [('O','','OPTION')],
...
See Using Spec Files for details.
ImportManager
ImportManager formalizes the concept of a metapath. This concept implicitly exists in native Python in
that builtins and frozen modules are searched before sys.path (on Windows there's also a search of
the registry, while on Mac, resources may be searched). This metapath is a list populated with
ImportDirector instances. There are ImportDirector subclasses for builtins, frozen modules, (on
Windows) modules found through the registry and a PathImportDirector for handling sys.path. For
a top-level import (that is, not an import of a module in a package), ImportManager tries each director
on its metapath until one succeeds.
ImportManager hides the semantic complexity of an import from the directors. It is up to the
ImportManager to decide if an import is relative or absolute; to see if the module has already been
imported; to keep sys.modules up to date; to handle the fromlist and return the correct module object.
ImportDirector
An ImportDirector just needs to respond to getmod(name) by returning a module object or None.
As you will see, an ImportDirector can consider name to be atomic - it has no need to examine name
to see if it is dotted.
To see how this works, we need to examine the PathImportDirector.
PathImportDirector
The PathImportDirector subclass manages a list of names, most notably sys.path. To do so, it
maintains a shadowpath, a dictionary mapping the names on its pathlist (eg, sys.path) to their
associated Owners. (It could do this directly, but the assumption that sys.path is occupied solely by
strings seems ineradicable.) Owners of the appropriate kind are created as needed (if all your imports
are satisfied by the first two elements of sys.path, the PathImportDirector's shadowpath will only
have two entries).
Owner
An Owner is much like an ImportDirector but manages a much more concrete piece of turf. For
example, a DirOwner manages one directory. Since there are no other officially recognized
filesystem-like namespaces for importing, that's all that's included in iu, but it's easy to imagine
Owners for zip files (and I have one for my own .pyz archive format) or even
URLs.
As with ImportDirectors, an Owner just needs to respond to getmod(name) by returning a module
object or None, and it can consider name to be atomic.
So structurally, we have a tree, rooted at the ImportManager. At the next level, we have a set of
ImportDirectors. At least one of those directors, the PathImportDirector in charge of
sys.path, has another level beneath it, consisting of Owners. This much of the tree covers the entire
top-level import namespace.
The rest of the import namespace is covered by treelets, each rooted in a package module (an
__init__.py).
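The tree described above can be modelled with a few small classes. This is a toy model, not the iu module: the class bodies are written from the description in this section, BuiltinDirector and DictOwner are invented stand-ins, "modules" are plain strings, and package handling is left out.

```python
class BuiltinDirector:
    """Stand-in director that knows a fixed set of names."""

    def __init__(self, modules):
        self.modules = modules

    def getmod(self, name):
        return self.modules.get(name)


class DictOwner:
    """Stand-in Owner: manages one path entry, here a dict of name -> module."""

    def __init__(self, contents):
        self.contents = contents

    def getmod(self, name):
        return self.contents.get(name)


class PathImportDirector:
    def __init__(self, pathlist, make_owner):
        self.pathlist = pathlist
        self.make_owner = make_owner
        self.shadowpath = {}                  # path entry -> Owner, filled lazily

    def getmod(self, name):
        for entry in self.pathlist:
            owner = self.shadowpath.get(entry)
            if owner is None:
                owner = self.shadowpath[entry] = self.make_owner(entry)
            module = owner.getmod(name)
            if module is not None:
                return module
        return None


class ImportManager:
    def __init__(self, metapath):
        self.metapath = metapath
        self.modules = {}                     # plays the role of sys.modules

    def import_top_level(self, name):
        if name in self.modules:
            return self.modules[name]
        for director in self.metapath:        # try each director in turn
            module = director.getmod(name)
            if module is not None:
                self.modules[name] = module
                return module
        raise ImportError(name)
```

Every level answers the same question, getmod(name), and treats name as atomic; only the manager knows about caching and ordering.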
Packages
To make this work, Owners need to recognize when a module is a package. For a DirOwner, this
means that name is a subdirectory which contains an __init__.py. The __init__ module is loaded
and its __path__ is initialized with the subdirectory. Then, a PathImportDirector is created to
manage this __path__. Finally the new PathImportDirector's getmod is assigned to the package's
__importsub__ function.
Possibilities
Let's say we want to import from zip files. So, we subclass Owner. The __init__ method should take a
filename, and raise a ValueError if the file is not an acceptable .zip file. (When a new name is
encountered on sys.path or a package's __path__, registered Owners are tried until one accepts the
name.) The getmod method would check the zip file's contents and return None if the name is not
found. Otherwise, it would extract the marshalled code object from the zip, create a new module object
and perform a bit of initialization (12 lines of code all told for my own archive format, including initializing a
package with its __importsub__).
Once the new Owner class is registered with iu, you can put a zip file on sys.path. A package could
even put a zip file on its __path__.
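A zip-based Owner along these lines might look like the sketch below. It is an illustration under assumptions, not code from iu: ZipOwner is written as a standalone class rather than a real Owner subclass, it is not registered with iu, it ignores packages, and it stores plain .py source in the zip instead of marshalled code objects to keep the example independent of the Python version.

```python
import types
import zipfile


class ZipOwner:
    """Answers getmod(name) from the top level of one zip file."""

    def __init__(self, path):
        if not zipfile.is_zipfile(path):
            raise ValueError("%r is not an acceptable zip file" % (path,))
        self.path = path

    def getmod(self, name):
        member = name + ".py"
        with zipfile.ZipFile(self.path) as archive:
            if member not in archive.namelist():
                return None                       # name is not ours
            source = archive.read(member)
        module = types.ModuleType(name)
        module.__file__ = self.path + "/" + member
        # Run the module's top-level code in its own namespace.
        exec(compile(source, module.__file__, "exec"), module.__dict__)
        return module
```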
Compatibility
This code has been tested with the PyXML, mxBase and Win32 packages, covering over a dozen import
hacks from manipulations of __path__ to replacing a module in sys.modules with a different one.
Emulation of Python's native import is nearly exact, including the names recorded in sys.modules and
module attributes (packages imported through iu have an extra attribute - __importsub__).
Performance
In most cases, iu is slower than builtin import (by 15 to 20%) but faster than imputil (by 15 to 20%).
By inserting archives at the front of sys.path containing the standard lib and the package being tested,
this can be reduced to 5 to 10% slower (or, on my 1.52 box, 10% faster!) than builtin import. A bit more
can be shaved off by manipulating the ImportManager's metapath.
Limitations
This module makes no attempt to facilitate policy import hacks. It is easy to implement certain kinds of
policies within a particular domain, but fundamentally iu works by dividing up the import namespace into
independent domains.
Quite simply, I think cross-domain import hacks are a very bad idea. As author of the original package on
which PyInstaller is based, McMillan worked with import hacks for many years. Many of them are highly
fragile; they often rely on undocumented (maybe even accidental) features of implementation. A
cross-domain import hack is not likely to work with PyXML, for example.
That rant aside, you can modify ImportManager to implement different policies. For example, a version
that implements three import primitives: absolute import, relative import and recursive-relative import. No
idea what the Python syntax for those should be, but __aimport__, __rimport__ and
__rrimport__ were easy to implement.
iu Usage
Here's a simple example of using iu as a builtin import replacement.
>>> import iu
>>> iu.ImportManager().install()
>>>
>>> import DateTime
>>> DateTime.__importsub__
<method PathImportDirector.getmod of PathImportDirector instance at 825900>
>>>