Beginner's introduction to GCC

About me

maya@NetBSD.org

coypu@sdf.org

NetBSD/pkgsrc for the last 3 years

(Not a GCC expert, just sharing knowledge)

This talk

GNU-centric view of toolchains

(alternatives exist, won't be mentioned)

Might be familiar to many

Let's all get on the same page

Top-level overview of toolchain

preprocessor
compiler
assembler
linker
rtld (ld.so)

Upstreams overview

project     component
GCC         preprocessor
GCC         compiler
binutils    assembler
binutils    linker
OS          rtld (ld.so)

Independent tools

command             component
cpp                 preprocessor
/usr/libexec/cc1    compiler
as                  assembler
ld                  linker
(kernel)            rtld (ld.so)

GCC can pass flags to each component

GCC flag    component
-Wp,        preprocessor
(none)      compiler
-Wa,        assembler
-Wl,        linker

Can stop after component

GCC flag    stops after
-E          preprocessor
-S          compiler
-c          assembler
(none)      linker

Preprocessor

Important for packaging: most of our problems are here

Expands preprocessor directives

  • #include <math.h>
  • #if defined(__NetBSD__) || defined(__linux__)...
  • #error "OS not in long list of supported OSes"

#include <math.h> ?

(why would things "just work"?)

  • GCC has built-in lookup directories
  • Visible with -Wp,--verbose
  • Each tool has its own --verbose

#include <math.h>

~> gcc -Wp,--verbose test.c
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/gcc-6
 /usr/include
End of search list.
I have /usr/include/math.h

#if defined(__NetBSD__) ?

I never defined that...

GCC defines it internally: builtin_define("__NetBSD__")

Visible with gcc -dM -E - < /dev/null

~> gcc -dM -E - < /dev/null
#define __NetBSD__ 1
#define _LP64 1
#define __STDC_VERSION__ 201112L
...

Results can be inspected after preprocessing

-save-temps keeps all the intermediate results

  • Single file with all the used code
  • Standalone test case for compiler problems
  • No need to dig through many include files
  • Quickly debug your Qt5+Boost program with limited prior knowledge
    (the reason for this talk)

Compiler

?? Magical C-to-assembly machine ??

Assembler

  • Human-readable assembly output (from the compiler)
  • Plus some directives
	.file	"test2.c"
	.text
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	main, .-main
	.ident	"GCC: (nb2 20180327) 6.4.0"

Linker/object files

  • Machine-readable
  • Many tools to parse

Parsing binary objects (nm)

~> nm test2
0000000000600a78 d _DYNAMIC
0000000000600bf8 d _GLOBAL_OFFSET_TABLE_
                 w _Jv_RegisterClasses
0000000000600a58 D __CTOR_LIST_END__
0000000000400968 r __GNU_EH_FRAME_HDR
00000000004006ea T ___start
0000000000600c92 B __bss_start
                 w __deregister_frame_info
0000000000600c48 D __dso_handle
0000000000600c40 D __progname
0000000000600c98 B __ps_strings
                 w __register_frame_info
00000000004005e8 r __rela_iplt_end
00000000004005e8 r __rela_iplt_start
0000000000400690 T __start
                 U __syscall
0000000000600c92 D _edata
0000000000600cb0 B _end
                 U _exit
00000000004008d0 T _fini
00000000004005f0 T _init
                 U _libc_init
0000000000400690 T _start
                 U abort
                 U atexit
0000000000600ca8 B environ
                 U exit
00000000004008c1 T main

C runtime stuff

(where did all these extra symbols come from?)

We can see the linker adding them

C runtime stuff

~> gcc -Wl,--verbose test2.c
.. (linker script) ..
attempt to open /usr/lib/crt0.o succeeded
/usr/lib/crt0.o
attempt to open /usr/lib/crti.o succeeded
/usr/lib/crti.o
attempt to open /usr/lib/crtbegin.o succeeded
/usr/lib/crtbegin.o
attempt to open /var/tmp//ccrus4oG.o succeeded
/var/tmp//ccrus4oG.o
attempt to open /usr/lib/libgcc_s.so succeeded
-lgcc_s (/usr/lib/libgcc_s.so)
attempt to open /usr/lib/libgcc.so failed
attempt to open /usr/lib/libgcc.a succeeded
attempt to open /usr/lib/libc.so succeeded
-lc (/usr/lib/libc.so)
attempt to open /usr/lib/libgcc_s.so succeeded
-lgcc_s (/usr/lib/libgcc_s.so)
attempt to open /usr/lib/libgcc.so failed
attempt to open /usr/lib/libgcc.a succeeded
attempt to open /usr/lib/crtend.o succeeded
/usr/lib/crtend.o
attempt to open /usr/lib/crtn.o succeeded
/usr/lib/crtn.o

specfiles

Not very legible; dump them with gcc -dumpspecs

The reason you don't need to specify -lc

attempt to open /usr/lib/libc.so succeeded
-lc (/usr/lib/libc.so)
attempt to open /usr/lib/libgcc_s.so succeeded
-lgcc_s (/usr/lib/libgcc_s.so)

specfiles

*lib:
%{pthread:
  %{!p:
    %{!pg:-lpthread}}
  %{p:-lpthread_p}
  %{pg:-lpthread_p}}
%{posix:
  %{!p:
    %{!pg:-lposix}}
  %{p:-lposix_p}
  %{pg:-lposix_p}}
%{shared:-lc}
%{!shared:
  %{!symbolic:
    %{!p:
      %{!pg:-lc}}
    %{p:-lc_p}
    %{pg:-lc_p}}}

Parsing binary objects (readelf)

readelf -d shows the libraries we need

Dynamic section at offset 0x2adc0 contains 22 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libedit.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libterminfo.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.12]
 0x000000000000000f (RPATH)              Library rpath: [/lib]

Semver

The binary records only libc.so.MAJOR (libc.so.12 above)

libc.so.MAJOR -> libc.so.MAJOR.MINOR (symlink)

The library minor can change without the binary noticing
A new library major won't be picked up by existing binaries

GCC configuration

  • Primary file: gcc/config.gcc
  • per-OS+arch
  • Add headers to override definitions
  • Add extra makefiles for extra code to run, e.g. on init
    (NetBSD has custom code to rename cabs to __c99_cabs)

GCC - overview

front-end     GIMPLE
middle-end    RTL
back-end      assembly

Can dump lots of intermediate results

  • -fdump-rtl-all-all
  • -fdump-tree-all

Backend (RTL)

  • gcc/config/ARCH/arch.md
  • Lisp-like: a set of rules to match

Backend (RTL)

  • constraints must match
  • multiple templates possible
(define_insn "extendqihi2"
  [(set (match_operand:HI 0 "nonimmediate_operand" "=g")
        (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "g")))]
  ""
  "cvtbw %1,%0")

Backend (RTL)

  • Want to reorder assembly
  • Need to specify side-effects (condition codes)
    (x86 FLAGS)
(define_insn_and_split "*cmp<mode>_cc_i387"
  [(set (reg:CCFP FLAGS_REG)
        (compare:CCFP
          (match_operand:MODEF 1 "register_operand" "f")
          (match_operand:MODEF 2 "nonimmediate_operand" "fm")))
   (clobber (match_operand:HI 0 "register_operand" "=a"))]
  "TARGET_80387 && TARGET_SAHF && !TARGET_CMOVE"
  "#"
  "&& reload_completed"
  [(set (match_dup 0)
        (unspec:HI
          [(compare:CCFP (match_dup 1)(match_dup 2))]
        UNSPEC_FNSTSW))
   (set (reg:CC FLAGS_REG)
        (unspec:CC [(match_dup 0)] UNSPEC_SAHF))]
  ""
  [(set_attr "type" "multi")
   (set_attr "unit" "i387")
   (set_attr "mode" "<MODE>")])