This chapter describes the structure of the Mach-O executable file format, which is the standard used to store programs on disk in the Mach-O runtime architecture. To understand how the development tools work with Mach-O files, and to perform low-level debugging tasks, you need to understand this information.
A Mach-O file contains three major regions (as shown in Figure 3-1):
Figure 3-1 Mach-O file format basic structure
Various tables within a Mach-O file refer to sections by number. Section numbering begins at 1 (not zero) and continues across segment boundaries. Thus, the first segment in a file may contain sections 1 and 2 and the second segment may contain sections 3 and 4.
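The numbering rule above can be sketched in a few lines of Python. This is only an illustration: the `number_sections` helper and the sample `layout` are hypothetical and are not part of any Mach-O tool; the sketch simply shows section numbers starting at 1 and continuing across segment boundaries.

```python
def number_sections(segments):
    """Assign 1-based section numbers that run continuously across segments.

    `segments` is a list of (segment_name, [section_name, ...]) pairs in
    file order. Returns a dict mapping section number -> (segment, section).
    """
    numbered = {}
    next_number = 1  # section numbering begins at 1, not 0
    for segment_name, section_names in segments:
        for section_name in section_names:
            numbered[next_number] = (segment_name, section_name)
            next_number += 1  # the count is NOT reset at a segment boundary
    return numbered


# Hypothetical layout: two segments with two sections each.
layout = [
    ("__TEXT", ["__text", "__cstring"]),
    ("__DATA", ["__data", "__bss"]),
]
sections_by_number = number_sections(layout)
```

With this layout, sections 1 and 2 belong to the first segment and sections 3 and 4 to the second, matching the example in the text.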
A Mach-O file contains code and data for one CPU architecture. The header structure of a Mach-O file specifies the target CPU architecture, which allows the kernel to ensure that, for example, binary machine code intended for PowerPC processors is not executed on an x86 processor. You can store Mach-O files for multiple CPU architectures in one file using the format described in “Multi-CPU Architecture Files”.
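As a rough illustration of how a tool might inspect the target architecture, the following Python sketch reads the first two 32-bit fields of a single-architecture Mach-O header (the magic number and the CPU type). The constant values are the ones defined in `<mach-o/loader.h>` and `<mach/machine.h>`; the `read_cpu_type` function and the `CPU_NAMES` table are hypothetical names introduced for this example, and the sketch deliberately does not handle multi-architecture files.

```python
import struct

# Values as defined in <mach-o/loader.h> and <mach/machine.h>.
MH_MAGIC = 0xFEEDFACE        # 32-bit Mach-O header
MH_MAGIC_64 = 0xFEEDFACF     # 64-bit Mach-O header
CPU_ARCH_ABI64 = 0x01000000  # flag OR-ed into the CPU type for 64-bit ABIs
CPU_NAMES = {7: "x86", 18: "PowerPC"}  # illustrative subset only


def read_cpu_type(header_bytes):
    """Return (cpu_name, is_64_bit) for a single-architecture Mach-O header.

    Only the first 8 bytes (magic, cputype) are examined. The header may be
    stored in either byte order, so both are tried.
    """
    if len(header_bytes) < 8:
        raise ValueError("need at least 8 bytes of header")
    for byte_order in (">", "<"):  # big-endian first, then little-endian
        magic, cputype = struct.unpack(byte_order + "II", header_bytes[:8])
        if magic in (MH_MAGIC, MH_MAGIC_64):
            base_type = cputype & ~CPU_ARCH_ABI64
            is_64_bit = bool(cputype & CPU_ARCH_ABI64)
            return CPU_NAMES.get(base_type, "unknown"), is_64_bit
    raise ValueError("not a single-architecture Mach-O header")


# Synthetic header prefix for a big-endian PowerPC file (not a real binary).
sample_header = struct.pack(">II", MH_MAGIC, 18)
cpu_name, is_64_bit = read_cpu_type(sample_header)
```

To try it on a real file, one could pass the first eight bytes read with `open(path, "rb").read(8)`.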
Segments and sections are normally accessed by name. Segments, by convention, are named using all uppercase letters preceded by two underscores (for example, __TEXT); sections should be named using all lowercase letters preceded by two underscores (for example, __text). This naming convention is standard, though not required for the tools to operate correctly.
A segment defines a range of bytes in a Mach-O file and the addresses and memory protection attributes at which those bytes are mapped into virtual memory when the dynamic linker loads the application. As such, segments are always virtual memory page-aligned.
Segments that require more memory at runtime than they do at build time can specify a larger in-memory size than they actually have on disk. For example, the __PAGEZERO segment generated by the linker for PowerPC executable files has a virtual memory size of one page, but an on-disk size of zero. Because __PAGEZERO contains no data, there is no need for it to occupy any space in the executable file.
A segment contains zero or more sections. For performance reasons, sections that are to be filled with zeros should always be placed at the end of the segment.
For compactness, an intermediate object file contains only one segment. This segment has no name; it contains all of the sections destined ultimately for different segments in the final Mach-O file. The data structure that defines a section (section) contains the name of the segment the section is intended for, and the static linker places each section in the final Mach-O file accordingly.
For best performance, segments should be aligned on virtual memory page boundaries, which are 4096 bytes for both PowerPC and x86 processors. To calculate the size of a segment, add up the size of each section, then round up the sum to the next virtual memory page boundary (4096 bytes, or 4 kilobytes). Using this algorithm, the minimum size of a segment is 4 kilobytes, and thereafter it grows in 4-kilobyte increments.
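The size calculation described above can be written out as a short Python sketch. The `segment_size` function and `PAGE_SIZE` constant are hypothetical names used for illustration; the sketch assumes a 4096-byte page and treats one page as the minimum, as the text states.

```python
PAGE_SIZE = 4096  # assumed virtual memory page size, in bytes


def segment_size(section_sizes, page_size=PAGE_SIZE):
    """Sum the section sizes and round up to the next page boundary.

    The result is never smaller than one page, following the rule that
    the minimum segment size is 4 kilobytes.
    """
    total = sum(section_sizes)
    pages = -(-total // page_size)  # ceiling division without floats
    return max(1, pages) * page_size


small = segment_size([100, 200])   # 300 bytes of sections
exact = segment_size([4096])       # exactly one page of sections
spill = segment_size([4096, 1])    # one byte over a page boundary
```

Here `small` and `exact` both come to one page (4096 bytes), while `spill` needs two pages (8192 bytes).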
The header and load commands are considered part of the first segment of the file for paging purposes. In an executable file, this generally means that the headers and load commands live at the start of the __TEXT segment, because that is the first segment that contains data. The __PAGEZERO segment contains no data on disk, and so is ignored for this purpose.
The standard Mac OS X development tools add five segment types to a typical Mac OS X executable:
The __TEXT and __DATA segments may contain a number of standard sections, listed in Table 3-1. The __OBJC segment contains a number of sections which are private to the Objective-C compiler. Note that the static linker and file analysis tools typically use the section type and attributes (instead of the section name) to determine how they should treat the section. The section name, type, and attributes are explained further in the description of the section data type.
|Segment and Section Name||Description|
|__TEXT,__text||Executable machine code. The compiler places only executable code in this section; no tables or data of any sort are stored here.|
|__TEXT,__cstring||Constant C strings. A C string is a sequence of non-null bytes that ends with a null byte (‘\0’). The static linker coalesces constant C string values, removing duplicates, when building the final product.|
|__TEXT,__picsymbol_stub||Position-independent indirect symbol stubs. See “Indirect Addressing” for more information.|
|__TEXT,__symbol_stub||Indirect symbol stubs. See “Indirect Addressing” for more information.|
|__TEXT,__const||Initialized constant variables. The compiler places all data declared const in this section.|
|__TEXT,__literal4||4-byte literal values. The compiler places single-precision floating point constants in this section. The static linker coalesces these values, removing duplicates, when building the final product. With some CPU architectures, it is more efficient for the compiler to use immediate load instructions rather than adding to this section.|
|__TEXT,__literal8||8-byte literal values. The compiler places double-precision floating point constants in this section. The static linker coalesces these values, removing duplicates, when building the final product. With some CPU architectures, it is more efficient for the compiler to use immediate load instructions rather than adding to this section.|
|__DATA,__data||Initialized mutable variables, such as writable C strings and data arrays.|
|__DATA,__la_symbol_ptr||Lazy symbol pointers, which are indirect references to functions imported from a different file. See “Indirect Addressing” for more information.|
|__DATA,__nl_symbol_ptr||Non-lazy symbol pointers, which are indirect references to data items imported from a different file. See “Indirect Addressing” for more information.|
|__DATA,__dyld||Information used by the dynamic linker.|
|__DATA,__const||Initialized constant variables that require relocation.|
|__DATA,__mod_init_func||Module initialization functions. The C++ compiler places static constructors here.|
|__DATA,__mod_term_func||Module termination functions.|
|__DATA,__bss||Data for uninitialized static variables (for example, static int i;).|
|__DATA,__common||Uninitialized imported symbol definitions (for example, int i;) located in the global scope (outside of a function declaration).|
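The table notes that the static linker coalesces constant C strings, removing duplicates. The following Python sketch illustrates the idea only; it is not the linker's actual algorithm. The `coalesce_cstrings` function and the sample inputs are hypothetical, and the sketch assumes each input is a run of null-terminated strings.

```python
def coalesce_cstrings(section_blobs):
    """Merge several runs of null-terminated strings, dropping duplicates.

    Returns (merged_bytes, offsets), where `offsets` maps each distinct
    string (without its terminator) to its offset in the merged output.
    """
    offsets = {}
    merged = bytearray()
    for blob in section_blobs:
        # split() leaves one trailing piece after the final terminator;
        # drop it (this also ignores any unterminated tail).
        for literal in blob.split(b"\0")[:-1]:
            if literal not in offsets:
                offsets[literal] = len(merged)
                merged += literal + b"\0"
    return bytes(merged), offsets


# Two hypothetical object files that both contain the string "world".
merged, offsets = coalesce_cstrings([b"hello\0world\0", b"world\0bye\0"])
```

After the call, `merged` holds a single copy of each string, and every original reference to "world" could be redirected to `offsets[b"world"]`.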
Each section in a Mach-O file has both a type and a set of attribute flags. In intermediate object files, the type and attributes determine how the static linker copies the sections from intermediate object files into the final product. Object file analysis tools (such as otool) use the type and attributes to determine how to read and display the sections. The section type and attributes are not used by the dynamic linker. Descriptions of the important variants of the section type and attributes, as they apply to static linking, follow:
Last updated: 2003-08-07