MCLinker - the final toolchain frontier 
           Jörg Sonnenberger
          
            joerg@NetBSD.org
          
           Naples, April 06, 2013 
           BSD Day 2013 
        
        
           Overview 
          
            -  Introduction 
 
            -  Architecture 
 
            -  Performance 
 
            -  Implementation status 
 
            -  Future work 
 
          
        
        
           Introduction 
          
            -  Machine Code Linker complements the MC layer of LLVM 
 
            -  Created by Luba Tang from MediaTek in 2011 
 
            -  Uses the same BSD-style license as LLVM 
 
          
        
        
           Architecture: High-level view 
          
            -  Build input tree 
 
            -  Build fragment reference graph 
 
            -  Layout sections, relocate and write output 
 
            -  GNU ld: all three steps are mixed together 
 
            -  gold: merges the first two phases 
 
          
        
        
          
             Build the input tree 
            
              -  Goal: High-level intermediate representation 
 
              -  Based on command line 
 
              -  ...and file system content 
 
              -  Deals with positional arguments (--start-group, --as-needed) 
 
              -  Nesting: linker archives contain objects 
 
              -  Typed objects: object files, linker archives, shared libraries 
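
A minimal sketch of how positional options could shape such a tree, written in Python for brevity. The `Node` class, `classify` and `build_input_tree` are illustrative names, not MCLinker's actual data structures, and typing files by suffix is a simplifying assumption. Only the option names (`--start-group`, `--end-group`, `--as-needed`, `--no-as-needed`) are real linker flags.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str                 # "root", "group", "object", "archive" or "shared"
    name: str = ""
    as_needed: bool = False   # positional state captured when the file is seen
    children: list = field(default_factory=list)

def classify(path):
    # Assumption: type by suffix. A real linker inspects the file contents.
    if path.endswith(".a"):
        return "archive"
    if path.endswith(".so"):
        return "shared"
    return "object"

def build_input_tree(argv):
    root = Node("root")
    stack = [root]            # innermost open group is on top
    as_needed = False
    for arg in argv:
        if arg == "--start-group":
            group = Node("group")
            stack[-1].children.append(group)
            stack.append(group)
        elif arg == "--end-group":
            if len(stack) == 1:
                raise ValueError("--end-group without --start-group")
            stack.pop()
        elif arg == "--as-needed":
            as_needed = True
        elif arg == "--no-as-needed":
            as_needed = False
        else:
            stack[-1].children.append(Node(classify(arg), arg, as_needed))
    if len(stack) != 1:
        raise ValueError("unterminated --start-group")
    return root

tree = build_input_tree(
    ["crt0.o", "--start-group", "liba.a", "libb.a", "--end-group",
     "--as-needed", "libm.so", "--no-as-needed", "libc.so"])
```

In this sketch the members of an archive would become children of the archive node once the file is opened, which is where the nesting mentioned above comes from.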
 
            
          
          
             Build fragment reference graph 
            
              -  Goal: symbol resolution 
 
              -  Build a graph with sections as nodes, symbol references as edges 
 
              -  Traverse input tree and look for files 
 
              -  If it is requested or provides a missing definition 
 
              -  ...process sections and symbol table 
 
              -  Linker groups: use a stack, push when hitting --start-group 
 
              -  ...repeat from the start of the group as long as new undefined references occur 
 
              -  Optimize for cache locality 
 
              -  Place symbol attributes and initial part of name in same cache line 
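
The group rescanning rule can be sketched as follows. This is a simplified model under stated assumptions: objects are plain dictionaries with hypothetical `name`, `defines` and `needs` keys, an archive is a list of such objects, and a group is a flat list of archives (no nested groups, so no stack is needed). It is not MCLinker's implementation.

```python
def link_object(obj, defined, undefined):
    """Record an object's definitions and add its still unresolved references."""
    defined |= obj["defines"]
    undefined |= obj["needs"]
    undefined -= defined

def scan_archive(archive, defined, undefined, loaded):
    """Load every member that provides a currently missing definition."""
    changed = True
    while changed:            # a loaded member may need an earlier member
        changed = False
        for member in archive:
            if member["name"] not in loaded and member["defines"] & undefined:
                loaded.add(member["name"])
                link_object(member, defined, undefined)
                changed = True

def resolve_group(group, defined, undefined, loaded):
    """Rescan the group from its start while new undefined references occur."""
    while True:
        before = set(undefined)
        for archive in group:
            scan_archive(archive, defined, undefined, loaded)
        if not (undefined - before):
            break

defined, undefined, loaded = set(), set(), set()
link_object({"name": "main.o", "defines": {"main"}, "needs": {"f"}},
            defined, undefined)
liba = [{"name": "a.o", "defines": {"g"}, "needs": set()}]
libb = [{"name": "b.o", "defines": {"f"}, "needs": {"g"}}]
resolve_group([liba, libb], defined, undefined, loaded)
```

Here `b.o` is pulled in on the first pass and introduces the new reference `g`, so the group is scanned a second time and `a.o` is loaded from the archive that was already passed.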
 
            
          
          
             Layout sections 
            
              -  Goal: decide section order and final positions 
 
              -  Merge sections with same name and subsections 
 
              -  Drop redundant or unused sections 
 
              -  Finalize symbol values 
 
              -  Advantage of late layout: avoids recomputations 
 
              -  Single pass for ordering and address assignment 
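
As an illustration of merging by name and assigning addresses in one pass, consider the sketch below. The tuple layout `(file, section name, size, alignment)`, the function names, the default section order and the base address `0x1000` are all assumptions made for the example.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def layout(inputs, base=0x1000, order=(".text", ".rodata", ".data", ".bss")):
    # Merge: collect input sections that share a name, in input order.
    merged = {}
    for file, name, size, alignment in inputs:
        merged.setdefault(name, []).append((file, size, alignment))
    # Known names follow `order`; unknown names keep first-seen order after them.
    names = sorted(merged,
                   key=lambda n: order.index(n) if n in order else len(order))
    # Single pass: ordering and address assignment happen together.
    address = base
    placement = {}
    for name in names:
        for file, size, alignment in merged[name]:
            address = align_up(address, alignment)
            placement[(file, name)] = address
            address += size
    return placement, address

placement, end = layout([
    ("a.o", ".text", 0x13, 16),
    ("a.o", ".data", 8, 8),
    ("b.o", ".text", 0x20, 16),
    ("b.o", ".data", 4, 4),
])
```

Once every input section has its address, a symbol's final value is the address of its section plus the symbol's offset within that section.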
 
            
          
          
             Compute relocations 
            
              -  Apply finalized symbol values to relocations 
 
              -  Decide which relocations are known at link time 
 
              -  ...and which are left for the run time linker 
 
              -  ...or whether they can be replaced by cheaper versions 
 
              -  Constant tables vs limited immediate encoding 
 
              -  Global dynamic vs initial exec TLS method 
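
A small worked example of the link-time versus run-time decision, assuming a 32-bit PC-relative relocation computed as S + A - P (symbol address plus addend minus the patched place), as used for x86 call displacements. The function names and the tuple format for relocations are hypothetical. Recording an unresolved reference in a list is a simplification: a real linker would typically route such a call through the PLT.

```python
import struct

def apply_pc32(section, offset, symbol_addr, addend, section_addr):
    """Patch a little-endian 32-bit PC-relative field with S + A - P."""
    place = section_addr + offset
    value = symbol_addr + addend - place
    if not -2**31 <= value < 2**31:
        raise OverflowError("relocation does not fit in 32 bits")
    struct.pack_into("<i", section, offset, value)

def process_relocations(relocs, symbols, section, section_addr):
    """Apply relocations whose symbol is known, return the rest."""
    deferred = []
    for offset, name, addend in relocs:
        addr = symbols.get(name)
        if addr is None:
            # Not known at link time: leave it for the run-time linker.
            deferred.append((section_addr + offset, name, addend))
        else:
            apply_pc32(section, offset, addr, addend, section_addr)
    return deferred

# Two x86 "call rel32" instructions (opcode 0xE8) with empty displacements.
section = bytearray(b"\xe8\x00\x00\x00\x00" * 2)
deferred = process_relocations(
    relocs=[(1, "f", -4), (6, "printf", -4)],
    symbols={"f": 0x1020},          # finalized symbol values
    section=section,
    section_addr=0x1000,
)
```

The first call ends at 0x1005 and targets 0x1020, so the stored displacement is 0x1b; the reference to `printf` stays unresolved and is handed on.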
 
            
          
          
             Write output 
            
              -  Goal: write final binary 
 
              -  Apply relocations to input sections 
 
              -  Write resulting sections/segments 
 
              -  Mix in metadata 
 
              -  Use memory-mapped files if possible 
 
              -  ...helps the translation lookaside buffer (TLB) 
 
              -  ...improves page locality 
 
              -  ...helps filesystem cache 
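
Writing through a memory mapping can be sketched with Python's standard `mmap` module. The function name `write_output` and the `(offset, data)` chunk format are assumptions for the example, and the output size is assumed to be known and non-zero, which holds once layout has finished.

```python
import mmap
import os
import tempfile

def write_output(path, size, chunks):
    """Create a file of `size` bytes and copy (offset, data) chunks via mmap."""
    with open(path, "w+b") as f:
        f.truncate(size)                      # reserve the final size up front
        with mmap.mmap(f.fileno(), size) as out:
            for offset, data in chunks:
                out[offset:offset + len(data)] = data
            out.flush()                       # push dirty pages to the file

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.out")
    write_output(path, 16, [(0, b"\x7fELF"), (8, b"text")])
    with open(path, "rb") as f:
        image = f.read()
```

Because the final size is reserved first, sections can be copied to their offsets in any order without extra buffering.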
 
            
          
        
        
          
             Performance: Time and memory use 
            
              
                | Binary      | Metric   | GNU ld   | gold     | MCLinker |
                | llvm-tblgen | Run time | 0.10s    | 0.04s    | 0.05s    |
                |             | Peak RSS | 17,700KB | 17,528KB | 17,508KB |
                | clang       | Run time | 1.41s    | 0.44s    | 0.69s    |
                |             | Peak RSS | 150MB    | 182MB    | 176MB    |
              
            
          
          
             Output size 
            
              
                | Binary      | Segment | GNU ld  | gold    | MCLinker |
                | llvm-tblgen | text    | 1,828KB | 1,786KB | 2,124KB  |
                |             | data    | 2,664   | 2,520   | 2,408    |
                |             | bss     | 5,912   | 2,520   | 5,360    |
                | clang       | text    | 26.9MB  | 26.7MB  | 34.3MB   |
                |             | data    | 22,112  | 22,112  | 21,984   |
                |             | bss     | 47,736  | 47,704  | 47,624   |
              
            
            
            
              -  MCLinker behaves as if --export-dynamic were given 
 
              -  Text size difference is in .rodata and .dynstr 
 
            
          
          
             Linking GCC's cc1 
            
              
                | Metric    | GNU ld   | MCLinker       |
                | Run time  | 0.20s    | 0.16s          |
                | Peak RSS  | 47,888KB | 51,752KB       |
                | Code size | 8,618KB  | 8,178KB        |
                | Data size | 1,154KB  | 1,154KB (+48B) |
              
            
          
        
        
          
             Implementation status: MI 
            
              -  Most basic ELF functionality works:
                 -  Static/dynamic linkage 
                 -  Partial linking 
                 -  Visibility and binding rules 
                 -  DT_NEEDED not honoured yet 
 
   
            
          
          
             i386 and amd64 
            
              -  build.sh release works 
 
              -  ...using a fallback to GNU ld for parts depending on linker scripts 
 
              -  TLS support incomplete: relaxation tests fail 
 
            
          
          
             ARM 
            
              -  build.sh release builds 
 
              -  ...using a few more hacks than X86 
 
              -  ...parts of libc.so don't work when optimized 
 
              -  ...analysis is still ongoing 
 
              -  TLS support incomplete 
 
              -  ARM ELF header flags problematic 
 
              -  Optional system linker for Android 
 
              -  No support for AArch64 
 
            
          
          
             MIPS 
            
              -  Used by Android/MIPS 
 
              -  Not yet tested on NetBSD 
 
              -  No support for N64 or O64 
 
            
          
        
        
           Future work 
          
            -  Extensive testsuite 
 
            -  Symbol versioning 
 
            -  Linker scripts 
 
            -  LTO 
 
            -  Research: fine-grained layout on a per-function basis 
 
            -  EH table optimisations 
 
            -  Platform work:
               -  To-be-completed: X86 (i386 and amd64), ARM and MIPS support 
               -  Work-in-progress: X32, MIPS64, Hexagon 
               -  Not-started-yet: AArch64 
 
   
          
        
        
        
           Corporate supporters 
          
            -  MediaTek 
 
            -  Google 
 
            -  Intel 
 
            -  MIPS 
 
            -  Qualcomm