MCLinker - the final toolchain frontier

Jörg Sonnenberger

joerg@NetBSD.org

Naples, April 06, 2013

BSD Day 2013

Overview

  • Introduction
  • Architecture
  • Performance
  • Implementation status
  • Future work

Introduction

  • Machine Code Linker complements the MC layer of LLVM
  • Created by Luba Tang from MediaTek in 2011
  • Uses the same BSD-style license as LLVM

Architecture: High-level view

  • Build input tree
  • Build fragment reference graph
  • Layout sections, relocate and write output
  • GNU ld: mixes all three steps
  • gold: merges the first two phases

Build the input tree

  • Goal: High-level intermediate representation
  • Based on command line
  • ...and file system content
  • Deals with positional arguments (--start-group, --as-needed)
  • Nesting: linker archives contain objects
  • Typed objects: object files, linker archives, shared libraries
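
As a rough illustration of such an input tree, the following Python sketch turns a command line into typed nodes. The names `InputNode`, `Group`, `classify` and `build_input_tree` are hypothetical and do not reflect MCLinker's actual classes. File types are guessed from the file suffix here, which is an assumption made for brevity; a real linker inspects the file contents. Archive members, which would hang below an archive node, are left out.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class InputNode:
    """A leaf of the input tree: one file named on the command line."""
    path: str
    kind: str          # "object", "archive" or "shared"
    as_needed: bool    # positional state captured when the file was seen


@dataclass
class Group:
    """An inner node created by --start-group ... --end-group."""
    children: List[Union["Group", InputNode]] = field(default_factory=list)


def classify(path: str) -> str:
    # Assumption: guess the type from the suffix instead of reading the file.
    if path.endswith(".a"):
        return "archive"
    if path.endswith(".so") or ".so." in path:
        return "shared"
    return "object"


def build_input_tree(args: List[str]) -> Group:
    root = Group()
    stack = [root]        # the innermost open group is on top
    as_needed = False     # positional flag, applies to the files that follow
    for arg in args:
        if arg == "--start-group":
            group = Group()
            stack[-1].children.append(group)
            stack.append(group)
        elif arg == "--end-group":
            if len(stack) == 1:
                raise ValueError("--end-group without --start-group")
            stack.pop()
        elif arg == "--as-needed":
            as_needed = True
        elif arg == "--no-as-needed":
            as_needed = False
        else:
            stack[-1].children.append(InputNode(arg, classify(arg), as_needed))
    if len(stack) != 1:
        raise ValueError("unterminated --start-group")
    return root
```

Recording `as_needed` on each node keeps the positional meaning of the option after the command line itself has been discarded.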

Build fragment reference graph

  • Goal: symbol resolution
  • Build a graph with sections as nodes, symbol references as edges
  • Traverse input tree and look for files
  • If it is requested OR provides a missing definition
  • ...process sections and symbol table
  • Linker groups: use stack, push when hitting start
  • ...repeat from start as long as new undefined references occur
  • Optimize for cache locality
  • Place symbol attributes and initial part of name in same cache line

Layout sections

  • Goal: decide section order and final positions
  • Merge sections with same name and subsections
  • Drop redundant or unused sections
  • Finalize symbol values
  • Advantage of late layout: avoids recomputations
  • Single pass for ordering and address assignment
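
A minimal sketch of single-pass layout, assuming input sections are given as hypothetical (name, size, alignment) tuples with power-of-two alignments. Dropping empty sections stands in for the more general removal of redundant or unused sections, and subsection merging is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical input section: (name, size in bytes, power-of-two alignment).
InputSection = Tuple[str, int, int]


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def layout(sections: List[InputSection], base: int) -> Dict[str, Tuple[int, int]]:
    """Merge same-named sections and return {name: (address, size)}."""
    merged: Dict[str, List[InputSection]] = {}
    for sec in sections:
        if sec[1] == 0:
            continue  # stand-in for dropping unused sections
        merged.setdefault(sec[0], []).append(sec)

    result: Dict[str, Tuple[int, int]] = {}
    addr = base
    # Output order follows first appearance; addresses are assigned in the
    # same pass, so nothing has to be recomputed later.
    for name, parts in merged.items():
        start = None
        for _, size, alignment in parts:
            addr = align_up(addr, alignment)
            if start is None:
                start = addr
            addr += size
        result[name] = (start, addr - start)
    return result
```

Symbol values can then be finalized by adding each symbol's offset within its input section to the address assigned here.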

Compute relocations

  • Apply finalized symbol values to relocations
  • Decide which relocations are known at link time
  • ...and which are left for the run time linker
  • ...or whether they can be replaced by cheaper versions
  • Constant tables vs limited immediate encoding
  • Global dynamic vs initial exec TLS method
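
To make the first step concrete, here is a sketch of applying two common amd64 relocation types once symbol values are final. The formulas S + A and S + A - P follow the x86-64 psABI; the function name `apply_relocation` and its signature are hypothetical, and only these two types are handled.

```python
import struct


def apply_relocation(buf: bytearray, offset: int, kind: str,
                     S: int, A: int, P: int) -> None:
    """Patch buf at offset. S = symbol value, A = addend, P = place address."""
    if kind == "R_X86_64_64":
        # Absolute 64-bit address: S + A
        struct.pack_into("<Q", buf, offset, (S + A) & 0xFFFFFFFFFFFFFFFF)
    elif kind == "R_X86_64_PC32":
        # PC-relative signed 32-bit displacement: S + A - P
        value = S + A - P
        if not -2**31 <= value < 2**31:
            raise OverflowError("relocation does not fit in 32 bits")
        struct.pack_into("<i", buf, offset, value)
    else:
        raise NotImplementedError(kind)
```

A relocation whose symbol value is not known at link time cannot be patched this way and has to be emitted for the run-time linker instead.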

Write output

  • Goal: write final binary
  • Apply relocations to input sections
  • Write resulting sections/segments
  • Mix in metadata
  • Use memory mapped files if possible
  • ...helps the translation lookaside buffer (TLB)
  • ...improves page locality
  • ...helps filesystem cache

Performance: Time and memory use

Binary       Metric    GNU ld    gold      MCLinker
llvm-tblgen  Run time  0.10s     0.04s     0.05s
             Peak RSS  17,700KB  17,528KB  17,508KB
clang        Run time  1.41s     0.44s     0.69s
             Peak RSS  150MB     182MB     176MB

Output size

Binary       Segment  GNU ld   gold     MCLinker
llvm-tblgen  text     1,828KB  1,786KB  2,124KB
             data     2,664    2,520    2,408
             bss      5,912    2,520    5,360
clang        text     26.9MB   26.7MB   34.3MB
             data     22,112   22,112   21,984
             bss      47,736   47,704   47,624

  • MCLinker behaves like --export-dynamic
  • Text size difference in .rodata and .dynstr

Linking GCC's cc1

           GNU ld    MCLinker
Run time   0.20s     0.16s
Peak RSS   47,888KB  51,752KB
Code size  8,618KB   8,178KB
Data size  1,154KB   1,154KB (+48B)

Implementation status: machine-independent (MI)

  • Most basic ELF functionality works:
    • Static/dynamic linkage
    • Partial linking
    • Visibility and binding rules
    • DT_NEEDED not honoured yet

i386 and amd64

  • build.sh release works
  • ...using a fallback to GNU ld for parts depending on linker scripts
  • TLS support incomplete: relaxation tests fail

ARM

  • build.sh release builds
  • ...using a few more hacks than X86
  • ...parts of libc.so don't work when optimized
  • ...analysis is still running
  • TLS support incomplete
  • ARM ELF header flags problematic
  • Optional system linker for Android
  • No support for AArch64

MIPS

  • Used by Android/MIPS
  • NetBSD untested (yet)
  • No support for N64 or O64

Future work

  • Extensive testsuite
  • Symbol versioning
  • Linker scripts
  • LTO
  • Research: fine-grained layout on a per-function basis
  • EH table optimisations
  • Platform work:
    • To-be-completed: X86 (i386 and amd64), ARM and MIPS support
    • Work-in-progress: X32, MIPS64, Hexagon
    • Not-started-yet: AArch64

See also

Corporate supporters

  • MediaTek
  • Google
  • Intel
  • MIPS
  • Qualcomm

Q&A

Questions?