<html> <head> <title>Precompiled Headers (PCH)</title> <link type="text/css" rel="stylesheet" href="../menu.css" /> <link type="text/css" rel="stylesheet" href="../content.css" /> <style type="text/css"> td { vertical-align: top; } </style> </head> <body> <!--#include virtual="../menu.html.incl"--> <div id="content"> <h1>Precompiled Headers</h1> <p>This document describes the design and implementation of Clang's precompiled headers (PCH). If you are interested in the end-user view, please see the <a href="UsersManual.html#precompiledheaders">User's Manual</a>.</p> <p><b>Table of Contents</b></p> <ul> <li><a href="#usage">Using Precompiled Headers with <tt>clang</tt></a></li> <li><a href="#philosophy">Design Philosophy</a></li> <li><a href="#contents">Precompiled Header Contents</a> <ul> <li><a href="#metadata">Metadata Block</a></li> <li><a href="#sourcemgr">Source Manager Block</a></li> <li><a href="#preprocessor">Preprocessor Block</a></li> <li><a href="#types">Types Block</a></li> <li><a href="#decls">Declarations Block</a></li> <li><a href="#stmt">Statements and Expressions</a></li> <li><a href="#idtable">Identifier Table Block</a></li> <li><a href="#method-pool">Method Pool Block</a></li> </ul> </li> <li><a href="#tendrils">Precompiled Header Integration Points</a></li> </ul> <h2 id="usage">Using Precompiled Headers with <tt>clang</tt></h2> <p>The Clang compiler frontend, <tt>clang -cc1</tt>, supports two command line options for generating and using PCH files.<p> <p>To generate PCH files using <tt>clang -cc1</tt>, use the option <b><tt>-emit-pch</tt></b>: <pre> $ clang -cc1 test.h -emit-pch -o test.h.pch </pre> <p>This option is transparently used by <tt>clang</tt> when generating PCH files. The resulting PCH file contains the serialized form of the compiler's internal representation after it has completed parsing and semantic analysis. The PCH file can then be used as a prefix header with the <b><tt>-include-pch</tt></b> option:</p> <pre> $ clang -cc1 -include-pch test.h.pch test.c -o test.s </pre> <h2 id="philosophy">Design Philosophy</h2> <p>Precompiled headers are meant to improve overall compile times for projects, so the design of precompiled headers is entirely driven by performance concerns. The use case for precompiled headers is relatively simple: when there is a common set of headers that is included in nearly every source file in the project, we <i>precompile</i> that bundle of headers into a single precompiled header (PCH file). Then, when compiling the source files in the project, we load the PCH file first (as a prefix header), which acts as a stand-in for that bundle of headers.</p> <p>A precompiled header implementation improves performance when:</p> <ul> <li>Loading the PCH file is significantly faster than re-parsing the bundle of headers stored within the PCH file. Thus, a precompiled header design attempts to minimize the cost of reading the PCH file. Ideally, this cost should not vary with the size of the precompiled header file.</li> <li>The cost of generating the PCH file initially is not so large that it counters the per-source-file performance improvement due to eliminating the need to parse the bundled headers in the first place. This is particularly important on multi-core systems, because PCH file generation serializes the build when all compilations require the PCH file to be up-to-date.</li> </ul> <p>Clang's precompiled headers are designed with a compact on-disk representation, which minimizes both PCH creation time and the time required to initially load the PCH file. The PCH file itself contains a serialized representation of Clang's abstract syntax trees and supporting data structures, stored using the same compressed bitstream as <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitcode file format</a>.</p> <p>Clang's precompiled headers are loaded "lazily" from disk. When a PCH file is initially loaded, Clang reads only a small amount of data from the PCH file to establish where certain important data structures are stored. The amount of data read in this initial load is independent of the size of the PCH file, such that a larger PCH file does not lead to longer PCH load times. The actual header data in the PCH file--macros, functions, variables, types, etc.--is loaded only when it is referenced from the user's code, at which point only that entity (and those entities it depends on) are deserialized from the PCH file. With this approach, the cost of using a precompiled header for a translation unit is proportional to the amount of code actually used from the header, rather than being proportional to the size of the header itself.</p> <p>When given the <code>-print-stats</code> option, Clang produces statistics describing how much of the precompiled header was actually loaded from disk. For a simple "Hello, World!" program that includes the Apple <code>Cocoa.h</code> header (which is built as a precompiled header), this option illustrates how little of the actual precompiled header is required:</p> <pre> *** PCH Statistics: 933 stat cache hits 4 stat cache misses 895/39981 source location entries read (2.238563%) 19/15315 types read (0.124061%) 20/82685 declarations read (0.024188%) 154/58070 identifiers read (0.265197%) 0/7260 selectors read (0.000000%) 0/30842 statements read (0.000000%) 4/8400 macros read (0.047619%) 1/4995 lexical declcontexts read (0.020020%) 0/4413 visible declcontexts read (0.000000%) 0/7230 method pool entries read (0.000000%) 0 method pool misses </pre> <p>For this small program, only a tiny fraction of the source locations, types, declarations, identifiers, and macros were actually deserialized from the precompiled header. These statistics can be useful to determine whether the precompiled header implementation can be improved by making more of the implementation lazy.</p> <p>Precompiled headers can be chained. When you create a PCH while including an existing PCH, Clang can create the new PCH by referencing the original file and only writing the new data to the new file. For example, you could create a PCH out of all the headers that are very commonly used throughout your project, and then create a PCH for every single source file in the project that includes the code that is specific to that file, so that recompiling the file itself is very fast, without duplicating the data from the common headers for every file.</p> <h2 id="contents">Precompiled Header Contents</h2> <img src="PCHLayout.png" align="right" alt="Precompiled header layout"> <p>Clang's precompiled headers are organized into several different blocks, each of which contains the serialized representation of a part of Clang's internal representation. Each of the blocks corresponds to either a block or a record within <a href="http://llvm.org/docs/BitCodeFormat.html">LLVM's bitstream format</a>. The contents of each of these logical blocks are described below.</p> <p>For a given precompiled header, the <a href="http://llvm.org/cmds/llvm-bcanalyzer.html"><code>llvm-bcanalyzer</code></a> utility can be used to examine the actual structure of the bitstream for the precompiled header. This information can be used both to help understand the structure of the precompiled header and to isolate areas where precompiled headers can still be optimized, e.g., through the introduction of abbreviations.</p> <h3 id="metadata">Metadata Block</h3> <p>The metadata block contains several records that provide information about how the precompiled header was built. This metadata is primarily used to validate the use of a precompiled header. For example, a precompiled header built for a 32-bit x86 target cannot be used when compiling for a 64-bit x86 target. The metadata block contains information about:</p> <dl> <dt>Language options</dt> <dd>Describes the particular language dialect used to compile the PCH file, including major options (e.g., Objective-C support) and more minor options (e.g., support for "//" comments). The contents of this record correspond to the <code>LangOptions</code> class.</dd> <dt>Target architecture</dt> <dd>The target triple that describes the architecture, platform, and ABI for which the PCH file was generated, e.g., <code>i386-apple-darwin9</code>.</dd> <dt>PCH version</dt> <dd>The major and minor version numbers of the precompiled header format. Changes in the minor version number should not affect backward compatibility, while changes in the major version number imply that a newer compiler cannot read an older precompiled header (and vice-versa).</dd> <dt>Original file name</dt> <dd>The full path of the header that was used to generate the precompiled header.</dd> <dt>Predefines buffer</dt> <dd>Although not explicitly stored as part of the metadata, the predefines buffer is used in the validation of the precompiled header. The predefines buffer itself contains code generated by the compiler to initialize the preprocessor state according to the current target, platform, and command-line options. For example, the predefines buffer will contain "<code>#define __STDC__ 1</code>" when we are compiling C without Microsoft extensions. The predefines buffer itself is stored within the <a href="#sourcemgr">source manager block</a>, but its contents are verified along with the rest of the metadata.</dd> </dl> <p>A chained PCH file (that is, one that references another PCH) has a slightly different metadata block, which contains the following information:</p> <dl> <dt>Referenced file</dt> <dd>The name of the referenced PCH file. It is looked up like a file specified using -include-pch.</dd> <dt>PCH version</dt> <dd>This is the same as in normal PCH files.</dd> <dt>Original file name</dt> <dd>The full path of the header that was used to generate this precompiled header.</dd> </dl> <p>The language options, target architecture and predefines buffer data is taken from the end of the chain, since they have to match anyway.</p> <h3 id="sourcemgr">Source Manager Block</h3> <p>The source manager block contains the serialized representation of Clang's <a href="InternalsManual.html#SourceLocation">SourceManager</a> class, which handles the mapping from source locations (as represented in Clang's abstract syntax tree) into actual column/line positions within a source file or macro instantiation. The precompiled header's representation of the source manager also includes information about all of the headers that were (transitively) included when building the precompiled header.</p> <p>The bulk of the source manager block is dedicated to information about the various files, buffers, and macro instantiations into which a source location can refer. Each of these is referenced by a numeric "file ID", which is a unique number (allocated starting at 1) stored in the source location. Clang serializes the information for each kind of file ID, along with an index that maps file IDs to the position within the PCH file where the information about that file ID is stored. The data associated with a file ID is loaded only when required by the front end, e.g., to emit a diagnostic that includes a macro instantiation history inside the header itself.</p> <p>The source manager block also contains information about all of the headers that were included when building the precompiled header. This includes information about the controlling macro for the header (e.g., when the preprocessor identified that the contents of the header dependent on a macro like <code>LLVM_CLANG_SOURCEMANAGER_H</code>) along with a cached version of the results of the <code>stat()</code> system calls performed when building the precompiled header. The latter is particularly useful in reducing system time when searching for include files.</p> <h3 id="preprocessor">Preprocessor Block</h3> <p>The preprocessor block contains the serialized representation of the preprocessor. Specifically, it contains all of the macros that have been defined by the end of the header used to build the precompiled header, along with the token sequences that comprise each macro. The macro definitions are only read from the PCH file when the name of the macro first occurs in the program. This lazy loading of macro definitions is triggered by lookups into the <a href="#idtable">identifier table</a>.</p> <h3 id="types">Types Block</h3> <p>The types block contains the serialized representation of all of the types referenced in the translation unit. Each Clang type node (<code>PointerType</code>, <code>FunctionProtoType</code>, etc.) has a corresponding record type in the PCH file. When types are deserialized from the precompiled header, the data within the record is used to reconstruct the appropriate type node using the AST context.</p> <p>Each type has a unique type ID, which is an integer that uniquely identifies that type. Type ID 0 represents the NULL type, type IDs less than <code>NUM_PREDEF_TYPE_IDS</code> represent predefined types (<code>void</code>, <code>float</code>, etc.), while other "user-defined" type IDs are assigned consecutively from <code>NUM_PREDEF_TYPE_IDS</code> upward as the types are encountered. The PCH file has an associated mapping from the user-defined types block to the location within the types block where the serialized representation of that type resides, enabling lazy deserialization of types. When a type is referenced from within the PCH file, that reference is encoded using the type ID shifted left by 3 bits. The lower three bits are used to represent the <code>const</code>, <code>volatile</code>, and <code>restrict</code> qualifiers, as in Clang's <a href="http://clang.llvm.org/docs/InternalsManual.html#Type">QualType</a> class.</p> <h3 id="decls">Declarations Block</h3> <p>The declarations block contains the serialized representation of all of the declarations referenced in the translation unit. Each Clang declaration node (<code>VarDecl</code>, <code>FunctionDecl</code>, etc.) has a corresponding record type in the PCH file. When declarations are deserialized from the precompiled header, the data within the record is used to build and populate a new instance of the corresponding <code>Decl</code> node. As with types, each declaration node has a numeric ID that is used to refer to that declaration within the PCH file. In addition, a lookup table provides a mapping from that numeric ID to the offset within the precompiled header where that declaration is described.</p> <p>Declarations in Clang's abstract syntax trees are stored hierarchically. At the top of the hierarchy is the translation unit (<code>TranslationUnitDecl</code>), which contains all of the declarations in the translation unit. These declarations (such as functions or struct types) may also contain other declarations inside them, and so on. Within Clang, each declaration is stored within a <a href="http://clang.llvm.org/docs/InternalsManual.html#DeclContext">declaration context</a>, as represented by the <code>DeclContext</code> class. Declaration contexts provide the mechanism to perform name lookup within a given declaration (e.g., find the member named <code>x</code> in a structure) and iterate over the declarations stored within a context (e.g., iterate over all of the fields of a structure for structure layout).</p> <p>In Clang's precompiled header format, deserializing a declaration that is a <code>DeclContext</code> is a separate operation from deserializing all of the declarations stored within that declaration context. Therefore, Clang will deserialize the translation unit declaration without deserializing the declarations within that translation unit. When required, the declarations stored within a declaration context will be deserialized. There are two representations of the declarations within a declaration context, which correspond to the name-lookup and iteration behavior described above:</p> <ul> <li>When the front end performs name lookup to find a name <code>x</code> within a given declaration context (for example, during semantic analysis of the expression <code>p->x</code>, where <code>p</code>'s type is defined in the precompiled header), Clang deserializes a hash table mapping from the names within that declaration context to the declaration IDs that represent each visible declaration with that name. The entire hash table is deserialized at this point (into the <code>llvm::DenseMap</code> stored within each <code>DeclContext</code> object), but the actual declarations are not yet deserialized. In a second step, those declarations with the name <code>x</code> will be deserialized and will be used as the result of name lookup.</li> <li>When the front end performs iteration over all of the declarations within a declaration context, all of those declarations are immediately de-serialized. For large declaration contexts (e.g., the translation unit), this operation is expensive; however, large declaration contexts are not traversed in normal compilation, since such a traversal is unnecessary. However, it is common for the code generator and semantic analysis to traverse declaration contexts for structs, classes, unions, and enumerations, although those contexts contain relatively few declarations in the common case.</li> </ul> <h3 id="stmt">Statements and Expressions</h3> <p>Statements and expressions are stored in the precompiled header in both the <a href="#types">types</a> and the <a href="#decls">declarations</a> blocks, because every statement or expression will be associated with either a type or declaration. The actual statement and expression records are stored immediately following the declaration or type that owns the statement or expression. For example, the statement representing the body of a function will be stored directly following the declaration of the function.</p> <p>As with types and declarations, each statement and expression kind in Clang's abstract syntax tree (<code>ForStmt</code>, <code>CallExpr</code>, etc.) has a corresponding record type in the precompiled header, which contains the serialized representation of that statement or expression. Each substatement or subexpression within an expression is stored as a separate record (which keeps most records to a fixed size). Within the precompiled header, the subexpressions of an expression are stored, in reverse order, prior to the expression that owns those expression, using a form of <a href="http://en.wikipedia.org/wiki/Reverse_Polish_notation">Reverse Polish Notation</a>. For example, an expression <code>3 - 4 + 5</code> would be represented as follows:</p> <table border="1"> <tr><td><code>IntegerLiteral(5)</code></td></tr> <tr><td><code>IntegerLiteral(4)</code></td></tr> <tr><td><code>IntegerLiteral(3)</code></td></tr> <tr><td><code>BinaryOperator(-)</code></td></tr> <tr><td><code>BinaryOperator(+)</code></td></tr> <tr><td>STOP</td></tr> </table> <p>When reading this representation, Clang evaluates each expression record it encounters, builds the appropriate abstract syntax tree node, and then pushes that expression on to a stack. When a record contains <i>N</i> subexpressions--<code>BinaryOperator</code> has two of them--those expressions are popped from the top of the stack. The special STOP code indicates that we have reached the end of a serialized expression or statement; other expression or statement records may follow, but they are part of a different expression.</p> <h3 id="idtable">Identifier Table Block</h3> <p>The identifier table block contains an on-disk hash table that maps each identifier mentioned within the precompiled header to the serialized representation of the identifier's information (e.g, the <code>IdentifierInfo</code> structure). The serialized representation contains:</p> <ul> <li>The actual identifier string.</li> <li>Flags that describe whether this identifier is the name of a built-in, a poisoned identifier, an extension token, or a macro.</li> <li>If the identifier names a macro, the offset of the macro definition within the <a href="#preprocessor">preprocessor block</a>.</li> <li>If the identifier names one or more declarations visible from translation unit scope, the <a href="#decls">declaration IDs</a> of these declarations.</li> </ul> <p>When a precompiled header is loaded, the precompiled header mechanism introduces itself into the identifier table as an external lookup source. Thus, when the user program refers to an identifier that has not yet been seen, Clang will perform a lookup into the identifier table. If an identifier is found, its contents (macro definitions, flags, top-level declarations, etc.) will be deserialized, at which point the corresponding <code>IdentifierInfo</code> structure will have the same contents it would have after parsing the headers in the precompiled header.</p> <p>Within the PCH file, the identifiers used to name declarations are represented with an integral value. A separate table provides a mapping from this integral value (the identifier ID) to the location within the on-disk hash table where that identifier is stored. This mapping is used when deserializing the name of a declaration, the identifier of a token, or any other construct in the PCH file that refers to a name.</p> <h3 id="method-pool">Method Pool Block</h3> <p>The method pool block is represented as an on-disk hash table that serves two purposes: it provides a mapping from the names of Objective-C selectors to the set of Objective-C instance and class methods that have that particular selector (which is required for semantic analysis in Objective-C) and also stores all of the selectors used by entities within the precompiled header. The design of the method pool is similar to that of the <a href="#idtable">identifier table</a>: the first time a particular selector is formed during the compilation of the program, Clang will search in the on-disk hash table of selectors; if found, Clang will read the Objective-C methods associated with that selector into the appropriate front-end data structure (<code>Sema::InstanceMethodPool</code> and <code>Sema::FactoryMethodPool</code> for instance and class methods, respectively).</p> <p>As with identifiers, selectors are represented by numeric values within the PCH file. A separate index maps these numeric selector values to the offset of the selector within the on-disk hash table, and will be used when de-serializing an Objective-C method declaration (or other Objective-C construct) that refers to the selector.</p> <h2 id="tendrils">Precompiled Header Integration Points</h2> <p>The "lazy" deserialization behavior of precompiled headers requires their integration into several completely different submodules of Clang. For example, lazily deserializing the declarations during name lookup requires that the name-lookup routines be able to query the precompiled header to find entities within the PCH file.</p> <p>For each Clang data structure that requires direct interaction with the precompiled header logic, there is an abstract class that provides the interface between the two modules. The <code>PCHReader</code> class, which handles the loading of a precompiled header, inherits from all of these abstract classes to provide lazy deserialization of Clang's data structures. <code>PCHReader</code> implements the following abstract classes:</p> <dl> <dt><code>StatSysCallCache</code></dt> <dd>This abstract interface is associated with the <code>FileManager</code> class, and is used whenever the file manager is going to perform a <code>stat()</code> system call.</dd> <dt><code>ExternalSLocEntrySource</code></dt> <dd>This abstract interface is associated with the <code>SourceManager</code> class, and is used whenever the <a href="#sourcemgr">source manager</a> needs to load the details of a file, buffer, or macro instantiation.</dd> <dt><code>IdentifierInfoLookup</code></dt> <dd>This abstract interface is associated with the <code>IdentifierTable</code> class, and is used whenever the program source refers to an identifier that has not yet been seen. In this case, the precompiled header implementation searches for this identifier within its <a href="#idtable">identifier table</a> to load any top-level declarations or macros associated with that identifier.</dd> <dt><code>ExternalASTSource</code></dt> <dd>This abstract interface is associated with the <code>ASTContext</code> class, and is used whenever the abstract syntax tree nodes need to loaded from the precompiled header. It provides the ability to de-serialize declarations and types identified by their numeric values, read the bodies of functions when required, and read the declarations stored within a declaration context (either for iteration or for name lookup).</dd> <dt><code>ExternalSemaSource</code></dt> <dd>This abstract interface is associated with the <code>Sema</code> class, and is used whenever semantic analysis needs to read information from the <a href="#methodpool">global method pool</a>.</dd> </dl> </div> </body> </html>