...

Text file src/cmd/compile/internal/ssa/README.md

Documentation: cmd/compile/internal/ssa

     1<!---
     2// Copyright 2018 The Go Authors. All rights reserved.
     3// Use of this source code is governed by a BSD-style
     4// license that can be found in the LICENSE file.
     5-->
     6
     7## Introduction to the Go compiler's SSA backend
     8
     9This package contains the compiler's Static Single Assignment form component. If
    10you're not familiar with SSA, its [Wikipedia
    11article](https://en.wikipedia.org/wiki/Static_single_assignment_form) is a good
    12starting point.
    13
    14It is recommended that you first read [cmd/compile/README.md](../../README.md)
    15if you are not familiar with the Go compiler already. That document gives an
    16overview of the compiler, and explains what is SSA's part and purpose in it.
    17
    18### Key concepts
    19
    20The names described below may be loosely related to their Go counterparts, but
    21note that they are not equivalent. For example, a Go block statement has a
    22variable scope, yet SSA has no notion of variables nor variable scopes.
    23
    24It may also be surprising that values and blocks are named after their unique
    25sequential IDs. They rarely correspond to named entities in the original code,
    26such as variables or function parameters. The sequential IDs also allow the
    27compiler to avoid maps, and it is always possible to track back the values to Go
    28code using debug and position information.
    29
    30#### Values
    31
    32Values are the basic building blocks of SSA. Per SSA's very definition, a
    33value is defined exactly once, but it may be used any number of times. A value
    34mainly consists of a unique identifier, an operator, a type, and some arguments.
    35
    36An operator or `Op` describes the operation that computes the value. The
    37semantics of each operator can be found in `_gen/*Ops.go`. For example, `OpAdd8`
    38takes two value arguments holding 8-bit integers and results in their addition.
    39Here is a possible SSA representation of the addition of two `uint8` values:
    40
    41	// var c uint8 = a + b
    42	v4 = Add8 <uint8> v2 v3
    43
    44A value's type will usually be a Go type. For example, the value in the example
    45above has a `uint8` type, and a constant boolean value will have a `bool` type.
    46However, certain types don't come from Go and are special; below we will cover
    47`memory`, the most common of them.
    48
    49Some operators contain an auxiliary field. The aux fields are usually printed as
    50enclosed in `[]` or `{}`, and could be the constant op argument, argument type,
    51etc. For example:
    52
    53	v13 (?) = Const64 <int> [1]
    54
    55Here the aux field is the constant op argument, the op is creating a `Const64`
    56value of 1. One more example:
    57
    58	v17 (361) = Store <mem> {int} v16 v14 v8
    59
    60Here the aux field is the type of the value being `Store`ed, which is int.
    61
    62See [value.go](value.go) and `_gen/*Ops.go` for more information.
    63
    64#### Memory types
    65
    66`memory` represents the global memory state. An `Op` that takes a memory
    67argument depends on that memory state, and an `Op` which has the memory type
    68impacts the state of memory. This ensures that memory operations are kept in the
    69right order. For example:
    70
    71	// *a = 3
    72	// *b = *a
    73	v10 = Store <mem> {int} v6 v8 v1
    74	v14 = Store <mem> {int} v7 v8 v10
    75
    76Here, `Store` stores its second argument (of type `int`) into the first argument
    77(of type `*int`). The last argument is the memory state; since the second store
    78depends on the memory value defined by the first store, the two stores cannot be
    79reordered.
    80
    81See [cmd/compile/internal/types/type.go](../types/type.go) for more information.
    82
    83#### Blocks
    84
    85A block represents a basic block in the control flow graph of a function. It is,
    86essentially, a list of values that define the operation of this block. Besides
    87the list of values, blocks mainly consist of a unique identifier, a kind, and a
    88list of successor blocks.
    89
    90The simplest kind is a `plain` block; it simply hands the control flow to
    91another block, thus its successors list contains one block.
    92
    93Another common block kind is the `exit` block. These have a final value, called
    94control value, which must return a memory state. This is necessary for functions
    95to return some values, for example - the caller needs some memory state to
    96depend on, to ensure that it receives those return values correctly.
    97
    98The last important block kind we will mention is the `if` block. It has a single
    99control value that must be a boolean value, and it has exactly two successor
   100blocks. The control flow is handed to the first successor if the bool is true,
   101and to the second otherwise.
   102
   103Here is a sample if-else control flow represented with basic blocks:
   104
   105	// func(b bool) int {
   106	// 	if b {
   107	// 		return 2
   108	// 	}
   109	// 	return 3
   110	// }
   111	b1:
   112	  v1 = InitMem <mem>
   113	  v2 = SP <uintptr>
   114	  v5 = Addr <*int> {~r1} v2
   115	  v6 = Arg <bool> {b}
   116	  v8 = Const64 <int> [2]
   117	  v12 = Const64 <int> [3]
   118	  If v6 -> b2 b3
   119	b2: <- b1
   120	  v10 = VarDef <mem> {~r1} v1
   121	  v11 = Store <mem> {int} v5 v8 v10
   122	  Ret v11
   123	b3: <- b1
   124	  v14 = VarDef <mem> {~r1} v1
   125	  v15 = Store <mem> {int} v5 v12 v14
   126	  Ret v15
   127
   128<!---
   129TODO: can we come up with a shorter example that still shows the control flow?
   130-->
   131
   132See [block.go](block.go) for more information.
   133
   134#### Functions
   135
   136A function represents a function declaration along with its body. It mainly
   137consists of a name, a type (its signature), a list of blocks that form its body,
   138and the entry block within said list.
   139
   140When a function is called, the control flow is handed to its entry block. If the
   141function terminates, the control flow will eventually reach an exit block, thus
   142ending the function call.
   143
   144Note that a function may have zero or multiple exit blocks, just like a Go
   145function can have any number of return points, but it must have exactly one
   146entry point block.
   147
   148Also note that some SSA functions are autogenerated, such as the hash functions
   149for each type used as a map key.
   150
   151For example, this is what an empty function can look like in SSA, with a single
   152exit block that returns an uninteresting memory state:
   153
   154	foo func()
   155	  b1:
   156	    v1 = InitMem <mem>
   157	    Ret v1
   158
   159See [func.go](func.go) for more information.
   160
   161### Compiler passes
   162
   163Having a program in SSA form is not very useful on its own. Its advantage lies
   164in how easy it is to write optimizations that modify the program to make it
   165better. The way the Go compiler accomplishes this is via a list of passes.
   166
   167Each pass transforms a SSA function in some way. For example, a dead code
   168elimination pass will remove blocks and values that it can prove will never be
   169executed, and a nil check elimination pass will remove nil checks which it can
   170prove to be redundant.
   171
   172Compiler passes work on one function at a time, and by default run sequentially
   173and exactly once.
   174
   175The `lower` pass is special; it converts the SSA representation from being
   176machine-independent to being machine-dependent. That is, some abstract operators
   177are replaced with their non-generic counterparts, potentially reducing or
   178increasing the final number of values.
   179
   180<!---
   181TODO: Probably explain here why the ordering of the passes matters, and why some
   182passes like deadstore have multiple variants at different stages.
   183-->
   184
   185See the `passes` list defined in [compile.go](compile.go) for more information.
   186
   187### Playing with SSA
   188
   189A good way to see and get used to the compiler's SSA in action is via
   190`GOSSAFUNC`. For example, to see func `Foo`'s initial SSA form and final
   191generated assembly, one can run:
   192
   193	GOSSAFUNC=Foo go build
   194
   195The generated `ssa.html` file will also contain the SSA func at each of the
   196compile passes, making it easy to see what each pass does to a particular
   197program. You can also click on values and blocks to highlight them, to help
   198follow the control flow and values.
   199
   200The value specified in GOSSAFUNC can also be a package-qualified function
   201name, e.g.
   202
   203	GOSSAFUNC=blah.Foo go build
   204
   205This will match any function named "Foo" within a package whose final
   206suffix is "blah" (e.g. something/blah.Foo, anotherthing/extra/blah.Foo).
   207
   208The users may also print the Control Flow Graph(CFG) by specifying in
   209`GOSSAFUNC` value in the following format:
   210
   211	GOSSAFUNC="$FunctionName:$PassName1,$PassName2,..." go build
   212
   213For example, the following command will print SSA with CFGs attached to the
   214`sccp` and `generic deadcode` pass columns:
   215
   216	GOSSAFUNC="blah.Foo:sccp,generic deadcode" go build
   217
   218If non-HTML dumps are needed, append a "+" to the GOSSAFUNC value
   219and dumps will be written to stdout:
   220
   221	GOSSAFUNC=Bar+ go build
   222
   223<!---
   224TODO: need more ideas for this section
   225-->
   226
   227### Hacking on SSA
   228
   229While most compiler passes are implemented directly in Go code, some others are
   230code generated. This is currently done via rewrite rules, which have their own
   231syntax and are maintained in `_gen/*.rules`. Simpler optimizations can be written
   232easily and quickly this way, but rewrite rules are not suitable for more complex
   233optimizations.
   234
   235To read more on rewrite rules, have a look at the top comments in
   236[_gen/generic.rules](_gen/generic.rules) and [_gen/rulegen.go](_gen/rulegen.go).
   237
   238Similarly, the code to manage operators is also code generated from
   239`_gen/*Ops.go`, as it is easier to maintain a few tables than a lot of code.
   240After changing the rules or operators, run `go generate cmd/compile/internal/ssa`
   241to generate the Go code again.
   242
   243<!---
   244TODO: more tips and info could likely go here
   245-->

View as plain text