  • Coreutils, Awk, Findutils, `sed', and `grep' and Guile, GCC, Binutils, and the
    GNU C Library (@pxref{Bootstrapping}).  Usually, these bootstrap binaries are
    ``taken for granted.''
    
    
    Taking the bootstrap binaries for granted means that we consider them to
    be a correct and trustworthy ``seed'' for building the complete system.
    Therein lies a problem: the combined size of these bootstrap binaries is
    about 250MB (@pxref{Bootstrappable Builds,,, mes, GNU Mes}).  Auditing
    or even inspecting these is next to impossible.
    
    For @code{i686-linux} and @code{x86_64-linux}, Guix now features a
    ``Reduced Binary Seed'' bootstrap@footnote{We would like to say: ``Full
    Source Bootstrap'' and while we are working towards that goal it would
    be hyperbole to use that term for what we do now.}.
    
    
    The Reduced Binary Seed bootstrap removes the most critical tools---from a
    trust perspective---from the bootstrap binaries: GCC, Binutils and the GNU C
    Library are replaced by @code{bootstrap-mescc-tools} (a tiny assembler and
    linker) and @code{bootstrap-mes} (a small Scheme interpreter and a C compiler
    written in Scheme and the Mes C Library, built for TinyCC and for GCC).
    
    Using these new binary seeds the ``missing'' Binutils, GCC, and the GNU
    C Library are built from source.  From here on the more traditional
    bootstrap process resumes.  This approach has reduced the bootstrap
    binaries in size to about 145MB in Guix v1.1.
    
    The next step that Guix has taken is to replace the shell and all its
    utilities with implementations in Guile Scheme, the @emph{Scheme-only
    bootstrap}.  Gash (@pxref{Gash,,, gash, The Gash manual}) is a
    POSIX-compatible shell that replaces Bash, and it comes with Gash Utils
    which has minimalist replacements for Awk, the GNU Core Utilities, Grep,
    Gzip, Sed, and Tar.  The rest of the bootstrap binary seeds that were
    removed are now built from source.
    
    Building the GNU System from source is currently only possible by adding
    some historical GNU packages as intermediate steps@footnote{Packages
    such as @code{gcc-2.95.3}, @code{binutils-2.14}, @code{glibc-2.2.5},
    @code{gzip-1.2.4}, @code{tar-1.22}, and some others.  For details, see
    @file{gnu/packages/commencement.scm}.}.  As Gash and Gash Utils mature,
    and GNU packages become more bootstrappable again (e.g., new releases of
    GNU Sed will also ship as gzipped tarballs again, as an alternative to the
    hard-to-bootstrap @code{xz} compression), this set of added packages can
    hopefully be reduced again.
    
    The graph below shows the resulting dependency graph for
    @code{gcc-core-mesboot0}, the bootstrap compiler used for the
    traditional bootstrap of the rest of the Guix System.
    
    @c ./pre-inst-env guix graph -e '(@@ (gnu packages commencement) gcc-core-mesboot0)' | sed -re 's,((bootstrap-mescc-tools|bootstrap-mes|guile-bootstrap).*shape =) box,\1 ellipse,' > doc/images/gcc-core-mesboot0-graph.dot
    @image{images/gcc-core-mesboot0-graph,6in,,Dependency graph of gcc-core-mesboot0}
    
    The only significant binary bootstrap seeds that remain@footnote{
    Ignoring the 68KB @code{mescc-tools}; that will be removed later,
    together with @code{mes}.} are a Scheme interpreter and a Scheme
    compiler: GNU Mes and GNU Guile@footnote{Not shown in this graph are the
    static binaries for @file{bash}, @code{tar}, and @code{xz} that are used
    to get Guile running.}.
    
    This further reduction has brought down the size of the binary seed to
    about 60MB for @code{i686-linux} and @code{x86_64-linux}.
    
    Work is ongoing to remove all binary blobs from our free software
    bootstrap stack, working towards a Full Source Bootstrap.  Also ongoing
    is work to bring these bootstraps to the @code{arm-linux} and
    @code{aarch64-linux} architectures and to the Hurd.
    
    If you are interested, join us on @samp{#bootstrappable} on the Freenode
    IRC network or discuss on @email{bug-mes@@gnu.org} or
    @email{gash-devel@@nongnu.org}.
    
    @node Preparing to Use the Bootstrap Binaries
    @section Preparing to Use the Bootstrap Binaries
    
    @c As of Emacs 24.3, Info-mode displays the image, but since it's a
    @c large image, it's hard to scroll.  Oh well.
    @image{images/bootstrap-graph,6in,,Dependency graph of the early bootstrap derivations}
    
    The figure above shows the very beginning of the dependency graph of the
    distribution, corresponding to the package definitions of the @code{(gnu
    packages bootstrap)} module.  A similar figure can be generated with
    @command{guix graph} (@pxref{Invoking guix graph}), along the lines of:
    
    @example
    guix graph -t derivation \
      -e '(@@@@ (gnu packages bootstrap) %bootstrap-gcc)' \
      | dot -Tps > gcc.ps
    @end example

    or, for the further Reduced Binary Seed bootstrap

    @example
    guix graph -t derivation \
      -e '(@@@@ (gnu packages bootstrap) %bootstrap-mes)' \
      | dot -Tps > mes.ps
    @end example
    
    At this level of detail, things are
    slightly complex.  First, Guile itself consists of an ELF executable,
    along with many source and compiled Scheme files that are dynamically
    loaded when it runs.  This gets stored in the @file{guile-2.0.7.tar.xz}
    tarball shown in this graph.  This tarball is part of Guix's ``source''
    distribution, and gets inserted into the store with @code{add-to-store}
    (@pxref{The Store}).
    
    But how do we write a derivation that unpacks this tarball and adds it
    to the store?  To solve this problem, the @code{guile-bootstrap-2.0.drv}
    derivation---the first one that gets built---uses @code{bash} as its
    builder, which runs @code{build-bootstrap-guile.sh}, which in turn calls
    @code{tar} to unpack the tarball.  Thus, @file{bash}, @file{tar},
    @file{xz}, and @file{mkdir} are statically-linked binaries, also part of
    the Guix source distribution, whose sole purpose is to allow the Guile
    tarball to be unpacked.
    
    Once @code{guile-bootstrap-2.0.drv} is built, we have a functioning
    Guile that can be used to run subsequent build programs.  Its first task
    is to download tarballs containing the other pre-built binaries---this
    is what the @file{.tar.xz.drv} derivations do.  Guix modules such as
    @code{ftp-client.scm} are used for this purpose.  The
    @code{module-import.drv} derivations import those modules in a directory
    in the store, using the original layout.  The
    @code{module-import-compiled.drv} derivations compile those modules, and
    write them in an output directory with the right layout.  This
    corresponds to the @code{#:modules} argument of
    @code{build-expression->derivation} (@pxref{Derivations}).
    
    Finally, the various tarballs are unpacked by the derivations
    @code{gcc-bootstrap-0.drv}, @code{glibc-bootstrap-0.drv}, or
    @code{bootstrap-mes-0.drv} and @code{bootstrap-mescc-tools-0.drv}, at which
    point we have a working C tool chain.
    
    @unnumberedsec Building the Build Tools
    
    Bootstrapping is complete when we have a full tool chain that does not
    depend on the pre-built bootstrap tools discussed above.  This
    no-dependency requirement is verified by checking whether the files of
    the final tool chain contain references to the @file{/gnu/store}
    directories of the bootstrap inputs.  The process that leads to this
    ``final'' tool chain is described by the package definitions found in
    the @code{(gnu packages commencement)} module.
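    
    As a rough illustration of such a check, the shell sketch below lists
    the files under a directory that mention a given store item.  The
    function name @code{find_refs} is made up for this example, and the
    sketch is not necessarily how Guix itself performs the verification.
    
    @example
    # Sketch only: list the files under directory $1 that mention the
    # store item $2.  The exit status is 0 if at least one file matches.
    find_refs() (
      # -r: recurse; -l: print file names only; -F: match a fixed string.
      grep -rlF -- "$2" "$1"
    )
    @end example
    
    Calling @code{find_refs} with the directory of the final tool chain and
    the store file name of a bootstrap input should print nothing and return
    a non-zero status once bootstrapping is complete.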
    
    The @command{guix graph} command allows us to ``zoom out'' compared to
    the graph above, by looking at the level of package objects instead of
    individual derivations---remember that a package may translate to
    several derivations, typically one derivation to download its source,
    one to build the Guile modules it needs, and one to actually build the
    package from source.  The command:
    
    @example
    guix graph -t bag \
      -e '(@@@@ (gnu packages commencement)
              glibc-final-with-bootstrap-bash)' | xdot -
    @end example
    
    @noindent
    displays the dependency graph leading to the ``final'' C
    library@footnote{You may notice the @code{glibc-intermediate} label,
    suggesting that it is not @emph{quite} final, but as a good
    approximation, we will consider it final.}, depicted below.
    
    @image{images/bootstrap-packages,6in,,Dependency graph of the early packages}
    
    
    @c See <https://lists.gnu.org/archive/html/gnu-system-discuss/2012-10/msg00000.html>.
    
    The first tool that gets built with the bootstrap binaries is
    GNU@tie{}Make---noted @code{make-boot0} above---which is a prerequisite
    for all the following packages.  From there Findutils and Diffutils get
    built.
    
    Then come the first-stage Binutils and GCC, built as pseudo cross
    tools---i.e., with @option{--target} equal to @option{--host}.  They are
    used to build libc.  Thanks to this cross-build trick, this libc is
    guaranteed not to hold any reference to the initial tool chain.
    
    From there the final Binutils and GCC (not shown above) are built.  GCC
    uses @command{ld} from the final Binutils, and links programs against
    the just-built libc.  This tool chain is used to build the other
    packages used by Guix and by the GNU Build System: Guile, Bash,
    Coreutils, etc.
    
    And voilà!  At this point we have the complete set of build tools that
    the GNU Build System expects.  These are in the @code{%final-inputs}
    variable of the @code{(gnu packages commencement)} module, and are
    implicitly used by any package that uses @code{gnu-build-system}
    (@pxref{Build Systems, @code{gnu-build-system}}).
    
    @unnumberedsec Building the Bootstrap Binaries
    
    @cindex bootstrap binaries
    
    Because the final tool chain does not depend on the bootstrap binaries,
    those rarely need to be updated.  Nevertheless, it is useful to have an
    automated way to produce them, should an update occur, and this is what
    the @code{(gnu packages make-bootstrap)} module provides.
    
    The following command builds the tarballs containing the bootstrap binaries
    (Binutils, GCC, and glibc for the traditional bootstrap;
    linux-libre-headers, bootstrap-mescc-tools, and bootstrap-mes for the
    Reduced Binary Seed bootstrap; Guile; and a tarball containing a mixture of
    Coreutils and other basic command-line tools):
    
    @example
    guix build bootstrap-tarballs
    @end example
    
    The generated tarballs are those that should be referred to in the
    @code{(gnu packages bootstrap)} module mentioned at the beginning of
    this section.
    
    Still here?  Then perhaps by now you've started to wonder: when do we
    reach a fixed point?  That is an interesting question!  The answer is
    unknown, but if you would like to investigate further (and have
    significant computational and storage resources to do so), then let us
    know.
    
    
    @unnumberedsec Reducing the Set of Bootstrap Binaries
    
    Our traditional bootstrap includes GCC, GNU Libc, Guile, etc.  That's a lot of
    binary code!  Why is that a problem?  It's a problem because these big chunks
    of binary code are practically non-auditable, which makes it hard to establish
    what source code produced them.  Every unauditable binary also leaves us
    vulnerable to compiler backdoors as described by Ken Thompson in the 1984
    paper @emph{Reflections on Trusting Trust}.
    
    
    This is mitigated by the fact that our bootstrap binaries were generated
    from an earlier Guix revision.  Nevertheless it lacks the level of
    transparency that we get in the rest of the package dependency graph,
    where Guix always gives us a source-to-binary mapping.  Thus, our goal
    is to reduce the set of bootstrap binaries to the bare minimum.
    
    
    The @uref{https://bootstrappable.org, Bootstrappable.org web site} lists
    on-going projects to do that.  One of these is about replacing the
    bootstrap GCC with a sequence of assemblers, interpreters, and compilers
    of increasing complexity, which could be built from source starting from
    a simple and auditable assembler.
    
    Our first major achievement is the replacement of GCC, the GNU C Library
    and Binutils by MesCC-Tools (a simple hex linker and macro assembler) and Mes
    (@pxref{Top, GNU Mes Reference Manual,, mes, GNU Mes}, a Scheme interpreter
    and C compiler in Scheme).  Neither MesCC-Tools nor Mes can be fully
    bootstrapped yet and thus we inject them as binary seeds.  We call this the
    Reduced Binary Seed bootstrap, as it has halved the size of our bootstrap
    binaries!  Also, it has eliminated the C compiler binary; i686-linux and
    x86_64-linux Guix packages are now bootstrapped without any binary C compiler.
    
    Work is ongoing to make MesCC-Tools and Mes fully bootstrappable and we are
    also looking at any other bootstrap binaries.  Your help is welcome!
    
    @chapter Porting to a New Platform
    
    
    As discussed above, the GNU distribution is self-contained, and
    self-containment is achieved by relying on pre-built ``bootstrap
    binaries'' (@pxref{Bootstrapping}).  These binaries are specific to an
    operating system kernel, CPU architecture, and application binary
    interface (ABI).  Thus, to port the distribution to a platform that is
    not yet supported, one must build those bootstrap binaries, and update
    the @code{(gnu packages bootstrap)} module to use them on that platform.
    
    Fortunately, Guix can @emph{cross compile} those bootstrap binaries.
    When everything goes well, and assuming the GNU tool chain supports the
    target platform, this can be as simple as running a command like this
    one:
    
    @example
    guix build --target=armv5tel-linux-gnueabi bootstrap-tarballs
    @end example
    
    
    For this to work, the @code{glibc-dynamic-linker} procedure in
    @code{(gnu packages bootstrap)} must be augmented to return the right
    file name for libc's dynamic linker on that platform; likewise,
    @code{system->linux-architecture} in @code{(gnu packages linux)} must be
    taught about the new platform.
    
    
    Once these are built, the @code{(gnu packages bootstrap)} module needs
    to be updated to refer to these binaries on the target platform.  That
    is, the hashes and URLs of the bootstrap tarballs for the new platform
    must be added alongside those of the currently supported platforms.  The
    bootstrap Guile tarball is treated specially: it is expected to be
    available locally, and @file{gnu/local.mk} has rules to download it for
    the supported architectures; a rule for the new platform must be added
    as well.
    
    
    In practice, there may be some complications.  First, it may be that the
    extended GNU triplet that specifies an ABI (like the @code{eabi} suffix
    above) is not recognized by all the GNU tools.  Typically, glibc
    recognizes some of these, whereas GCC uses an extra @option{--with-abi}
    configure flag (see @code{gcc.scm} for examples of how to handle this).
    Second, some of the required packages could fail to build for that
    platform.  Lastly, the generated binaries could be broken for some
    reason.
    
    @c *********************************************************************
    
    @include contributing.texi
    
    @c *********************************************************************
    @node Acknowledgments
    @chapter Acknowledgments
    
    
    Guix is based on the @uref{https://nixos.org/nix/, Nix package manager},
    which was designed and implemented by Eelco Dolstra, with contributions
    from other people (see the @file{nix/AUTHORS} file in Guix).  Nix
    pioneered functional package management, and promoted unprecedented
    features, such as transactional package upgrades and rollbacks, per-user
    profiles, and referentially transparent build processes.  Without this
    work, Guix would not exist.
    
    The Nix-based software distributions, Nixpkgs and NixOS, have also been
    an inspiration for Guix.
    
    
    GNU@tie{}Guix itself is a collective work with contributions from a
    number of people.  See the @file{AUTHORS} file in Guix for more
    information on these fine people.  The @file{THANKS} file lists people
    who have helped by reporting bugs, taking care of the infrastructure,
    providing artwork and themes, making suggestions, and more---thank you!
    
    
    
    @c *********************************************************************
    @node GNU Free Documentation License
    @appendix GNU Free Documentation License
    
    @cindex license, GNU Free Documentation License
    
    @include fdl-1.3.texi
    
    @c *********************************************************************
    @node Concept Index
    @unnumbered Concept Index
    @printindex cp
    
    
    @node Programming Index
    @unnumbered Programming Index
    @syncodeindex tp fn
    @syncodeindex vr fn
    
    @printindex fn
    
    @bye
    
    @c Local Variables:
    @c ispell-local-dictionary: "american";
    @c End: