
Mach-O File Format Reference

This chapter describes the structure of the Mach-O executable file format, which is the standard used to store programs on disk in the Mach-O runtime architecture. To understand how the development tools work with Mach-O files, and to perform low-level debugging tasks, you need to understand this information.

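A Mach-O file contains three major regions (as shown in Figure 3-1): a header structure, a series of load commands, and the data of one or more segments, each of which contains zero or more sections.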


Figure 3-1 Mach-O file format basic structure


Various tables within a Mach-O file refer to sections by number. Section numbering begins at 1 (not zero) and continues across segment boundaries. Thus, the first segment in a file may contain sections 1 and 2 and the second segment may contain sections 3 and 4.
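
The numbering rule above can be sketched as a small lookup. The helper name `segment_for_section` and its parameters are hypothetical and not part of any system header; the function only illustrates how a 1-based section number maps back to a segment when numbering continues across segment boundaries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nsects[i] is the number of sections in segment i,
   in file order. Given a 1-based section number, return the 0-based index
   of the segment that holds it, or -1 if the number is out of range.
   If index_in_segment is non-NULL, it receives the 0-based position of
   the section within that segment. */
int segment_for_section(const unsigned *nsects, size_t nsegs,
                        unsigned ordinal, unsigned *index_in_segment)
{
    assert(nsects != NULL || nsegs == 0);
    if (ordinal == 0)
        return -1;                 /* section numbers start at 1, not 0 */

    unsigned first = 1;            /* number of the first section in segment i */
    for (size_t i = 0; i < nsegs; i++) {
        if (ordinal < first + nsects[i]) {
            if (index_in_segment != NULL)
                *index_in_segment = ordinal - first;
            return (int)i;
        }
        first += nsects[i];        /* numbering continues into the next segment */
    }
    return -1;
}
```

With two segments of two sections each, section 3 resolves to the second segment (index 1) at position 0, matching the example in the text.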

A Mach-O file contains code and data for one CPU architecture. The header structure of a Mach-O file specifies the target CPU architecture, which allows the kernel to ensure that, for example, binary machine code intended for PowerPC processors is not executed on an x86 processor. You can store Mach-O files for multiple CPU architectures in one file using the format described in “Multi-CPU Architecture Files”.

Segments and sections are normally accessed by name. Segments, by convention, are named using all uppercase letters preceded by two underscores (for example, __TEXT); sections should be named using all lowercase letters preceded by two underscores (for example, __text). This naming convention is standard, though not required for the tools to operate correctly.
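
The naming convention can be expressed as a simple check. The function names below are hypothetical. Because the tools do not require the convention, a check like this is only useful as a lint-style warning, not as validation.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns 1 if name starts with two underscores and every letter after
   them has the requested case. Digits and underscores are allowed, so
   names such as __literal4 and __la_symbol_ptr pass as section names. */
static int follows_convention(const char *name, int want_upper)
{
    assert(name != NULL);
    if (strncmp(name, "__", 2) != 0 || name[2] == '\0')
        return 0;
    for (const char *p = name + 2; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalpha(c) && (want_upper ? !isupper(c) : !islower(c)))
            return 0;
    }
    return 1;
}

/* Segment names: two underscores followed by uppercase letters. */
int is_conventional_segment_name(const char *name)
{
    return follows_convention(name, 1);
}

/* Section names: two underscores followed by lowercase letters. */
int is_conventional_section_name(const char *name)
{
    return follows_convention(name, 0);
}
```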

A segment defines a range of bytes in a Mach-O file, together with the virtual memory addresses and memory protection attributes with which those bytes are mapped when the dynamic linker loads the application. As such, segments are always virtual memory page-aligned.

Segments that require more memory at runtime than they do at build time can specify a larger in-memory size than they actually have on disk. For example, the __PAGEZERO segment generated by the linker for PowerPC executable files has a virtual memory size of one page, but an on-disk size of zero. Because __PAGEZERO contains no data, there is no need for it to occupy any space in the executable file.
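
The relationship between the on-disk size and the in-memory size can be modeled as follows. This is a simplified sketch: the function name `load_segment_image` is hypothetical, and the sketch copies bytes into a heap buffer, whereas a real loader maps pages of the file into virtual memory. It only shows that the bytes beyond the on-disk size read as zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Builds an in-memory image of a segment: the first filesize bytes come
   from the file, and the remaining (vmsize - filesize) bytes are zero.
   Returns NULL if vmsize is zero, if filesize exceeds vmsize, or if the
   allocation fails. The caller releases the result with free(). */
unsigned char *load_segment_image(const unsigned char *file_bytes,
                                  size_t filesize, size_t vmsize)
{
    assert(file_bytes != NULL || filesize == 0);
    if (vmsize == 0 || filesize > vmsize)
        return NULL;

    unsigned char *image = calloc(vmsize, 1);   /* zero-filled */
    if (image == NULL)
        return NULL;
    if (filesize > 0)
        memcpy(image, file_bytes, filesize);
    return image;
}
```

A segment such as __PAGEZERO corresponds to the case where `filesize` is 0 and `vmsize` is one page: nothing is read from the file, yet the segment still occupies a page of address space.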

A segment contains zero or more sections. For performance reasons, sections that are to be filled with zeros should always be placed at the end of the segment.

For compactness, an intermediate object file contains only one segment. This segment has no name; it contains all of the sections destined ultimately for different segments in the final Mach-O file. The data structure that defines a section (section) contains the name of the segment the section is intended for, and the static linker will place each section in the final Mach-O file accordingly.
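
To make the role of the segment name concrete, the sketch below counts the sections in an intermediate object file that are destined for a given segment. `struct sketch_section` is a cut-down stand-in that keeps only the two name fields; the 16-byte field width is assumed to match the section structure in `<mach-o/loader.h>`. The function name is hypothetical, and the static linker's actual placement logic is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for the section structure: name fields only. */
struct sketch_section {
    char sectname[16];   /* name of this section, for example __text */
    char segname[16];    /* segment this section is intended for */
};

/* Counts the sections whose segname field equals the given segment name. */
size_t count_sections_for_segment(const struct sketch_section *sects,
                                  size_t n, const char *segname)
{
    assert(segname != NULL);
    assert(sects != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* strncmp stops after 16 bytes, so a name that fills the whole
           field (and therefore has no terminating NUL) is compared safely. */
        if (strncmp(sects[i].segname, segname, sizeof sects[i].segname) == 0)
            count++;
    }
    return count;
}
```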

For best performance, segments should be aligned on virtual memory page boundaries: 4096 bytes for both PowerPC and x86 processors. To calculate the size of a segment, add up the size of each section, then round up the sum to the next virtual memory page boundary (4096 bytes, or 4 kilobytes). Using this algorithm, the minimum size of a segment is 4 kilobytes, and thereafter it is sized at 4 kilobyte increments.
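
The size calculation described above can be sketched in a few lines. The function and macro names are hypothetical, and the sketch ignores any padding that the alignment of individual sections might add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* page size assumed by the text above */

/* Sums the section sizes and rounds the total up to the next page
   boundary. The mask trick relies on the page size being a power of two. */
uint32_t segment_size(const uint32_t *section_sizes, size_t n)
{
    assert(section_sizes != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += section_sizes[i];
    return (sum + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

Sections of 100 and 200 bytes give a 4096-byte segment, and a total of 4097 bytes gives 8192. A segment with no section data yields 0 in this sketch.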

The header and load commands are considered part of the first segment of the file for paging purposes. In an executable file, this generally means that the headers and load commands live at the start of the __TEXT segment, because that is the first segment that contains data. The __PAGEZERO segment contains no data on disk, and so is ignored for this purpose.

The standard Mac OS X development tools add five segment types to a typical Mac OS X executable: __PAGEZERO, __TEXT, __DATA, __OBJC, and __LINKEDIT.

The __TEXT and __DATA segments may contain a number of standard sections, listed in Table 3-1. The __OBJC segment contains a number of sections which are private to the Objective-C compiler. Note that the static linker and file analysis tools typically use the section type and attributes (instead of the section name) to determine how they should treat the section. The section name, type and attributes are explained further in the description of the section data type.

Table 3-1 Typical sections in a Mach-O file

Segment and Section Name
Contents
__TEXT,__text Executable machine code. The compiler places only executable code in this section; no tables or data of any sort are stored here.
__TEXT,__cstring Constant C strings. A C string is a sequence of non-null bytes that ends with a null byte (‘\0’). The static linker coalesces constant C string values, removing duplicates, when building the final product.
__TEXT,__picsymbol_stub Position-independent indirect symbol stubs. See “Indirect Addressing” for more information.
__TEXT,__symbol_stub Indirect symbol stubs. See “Indirect Addressing” for more information.
__TEXT,__const Initialized constant variables. The compiler places all data declared const in this section.
__TEXT,__literal4 4-byte literal values. The compiler places single-precision floating point constants in this section. The static linker coalesces these values, removing duplicates, when building the final product. With some CPU architectures, it is more efficient for the compiler to use immediate load instructions rather than adding to this section.
__TEXT,__literal8 8-byte literal values. The compiler places double-precision floating point constants in this section. The static linker coalesces these values, removing duplicates, when building the final product. With some CPU architectures, it is more efficient for the compiler to use immediate load instructions rather than adding to this section.
__DATA,__data Initialized mutable variables, such as writable C strings and data arrays.
__DATA,__la_symbol_ptr Lazy symbol pointers, which are indirect references to functions imported from a different file. See “Indirect Addressing” for more information.
__DATA,__nl_symbol_ptr Non-lazy symbol pointers, which are indirect references to data items imported from a different file. See “Indirect Addressing” for more information.
__DATA,__dyld Information used by the dynamic linker.
__DATA,__const Initialized relocatable constant variables.
__DATA,__mod_init_func Module initialization functions. The C++ compiler places static constructors here.
__DATA,__mod_term_func Module termination functions.
__DATA,__bss Data for uninitialized static variables (for example, static int i;).
__DATA,__common Uninitialized imported symbol definitions (for example, int i;) located in the global scope (outside of a function declaration).
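
The short C fragment below relates a few declarations to the sections in Table 3-1. The placements in the comments follow the table; actual placement depends on the compiler, its version, and its options, so treat them as expectations to verify with a tool such as otool rather than guarantees. All identifiers are invented for the illustration.

```c
#include <assert.h>

static int counter;              /* uninitialized static: __DATA,__bss */
int shared_total;                /* uninitialized global: __DATA,__common */
int start_value = 42;            /* initialized, mutable: __DATA,__data */
char label[] = "writable";       /* writable C string: __DATA,__data */
const int limit = 100;           /* constant data: __TEXT,__const */
const char *greeting = "hello";  /* string literal: __TEXT,__cstring;
                                    the pointer itself is mutable data */

/* The machine code for this function: __TEXT,__text */
int next_value(void)
{
    assert(start_value <= limit);
    counter++;
    shared_total += start_value;
    return start_value + counter;
}
```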

Each section in a Mach-O file has both a type and a set of attribute flags. In intermediate object files, the type and attributes determine how the static linker copies the sections into the final product. Object file analysis tools (such as otool) use the type and attributes to determine how to read and display the sections. Some section types and attributes are also used by the dynamic linker. Descriptions for important variants of the section type and attributes as they apply to static linking follow:





Last updated: 2003-08-07


Copyright © 2004 Apple Computer, Inc.
All rights reserved.