This is a living document and at times it will be out of date. It is
intended to articulate how programming in the Go runtime differs from
writing normal Go. It focuses on pervasive concepts rather than
details of particular interfaces.

Scheduler structures
====================

The scheduler manages three types of resources that pervade the
runtime: Gs, Ms, and Ps. It's important to understand these even if
you're not working on the scheduler.

Gs, Ms, Ps
----------

A "G" is simply a goroutine. It's represented by type `g`. When a
goroutine exits, its `g` object is returned to a pool of free `g`s and
can later be reused for some other goroutine.

An "M" is an OS thread that can be executing user Go code, runtime
code, a system call, or be idle. It's represented by type `m`. There
can be any number of Ms at a time since any number of threads may be
blocked in system calls.

Finally, a "P" represents the resources required to execute user Go
code, such as scheduler and memory allocator state. It's represented
by type `p`. There are exactly `GOMAXPROCS` Ps. A P can be thought of
like a CPU in the OS scheduler and the contents of the `p` type like
per-CPU state. This is a good place to put state that needs to be
sharded for efficiency, but doesn't need to be per-thread or
per-goroutine.

The scheduler's job is to match up a G (the code to execute), an M
(where to execute it), and a P (the rights and resources to execute
it). When an M stops executing user Go code, for example by entering a
system call, it returns its P to the idle P pool. In order to resume
executing user Go code, for example on return from a system call, it
must acquire a P from the idle pool.

All `g`, `m`, and `p` objects are heap allocated, but are never freed,
so their memory remains type stable. As a result, the runtime can
avoid write barriers in the depths of the scheduler.

`getg()` and `getg().m.curg`
----------------------------

To get the current user `g`, use `getg().m.curg`.
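A toy model may help picture how the two expressions relate. The `g`
and `m` types below are hypothetical, stripped-down stand-ins that keep
only the pointers involved, and `cur` stands in for whatever `getg()`
would return; none of this is the runtime's actual code.

```go
package main

import "fmt"

// g and m are hypothetical simplifications of the runtime types of
// the same name. Only the fields needed for this illustration exist.
type g struct {
	name string
	m    *m // the M this g is running on
}

type m struct {
	g0      *g // stub g that owns the M's system stack
	gsignal *g // stub g that owns the M's signal stack
	curg    *g // user goroutine currently running on this M
}

// newM builds one M with its two stub gs and one user goroutine.
func newM() (*m, *g) {
	mp := &m{}
	mp.g0 = &g{name: "g0", m: mp}
	mp.gsignal = &g{name: "gsignal", m: mp}
	user := &g{name: "user", m: mp}
	mp.curg = user
	return mp, user
}

// userG models getg().m.curg: whatever g is current, it yields the
// user goroutine.
func userG(cur *g) *g { return cur.m.curg }

// onUserStack models the check getg() == getg().m.curg.
func onUserStack(cur *g) bool { return cur == cur.m.curg }

func main() {
	mp, user := newM()
	// Each element plays the role of getg() in a different situation.
	for _, cur := range []*g{user, mp.g0, mp.gsignal} {
		fmt.Printf("getg()=%s userG=%s onUserStack=%t\n",
			cur.name, userG(cur).name, onUserStack(cur))
	}
}
```

In this model the user goroutine is reachable from all three starting
points, while the stack check is true only when the current `g` is the
user goroutine itself.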
`getg()` alone returns the current `g`, but when executing on the
system or signal stacks, this will return the current M's "g0" or
"gsignal", respectively. This is usually not what you want.

To determine if you're running on the user stack or the system stack,
use `getg() == getg().m.curg`.

Stacks
======

Every non-dead G has a *user stack* associated with it, which is what
user Go code executes on. User stacks start small (e.g., 2K) and grow
or shrink dynamically.

Every M has a *system stack* associated with it (also known as the M's
"g0" stack because it's implemented as a stub G) and, on Unix
platforms, a *signal stack* (also known as the M's "gsignal" stack).
System and signal stacks cannot grow, but are large enough to execute
runtime and cgo code (8K in a pure Go binary; system-allocated in a
cgo binary).

Runtime code often temporarily switches to the system stack using
`systemstack`, `mcall`, or `asmcgocall` to perform tasks that must not
be preempted, that must not grow the user stack, or that switch user
goroutines. Code running on the system stack is implicitly
non-preemptible and the garbage collector does not scan system stacks.
While running on the system stack, the current user stack is not used
for execution.

nosplit functions
-----------------

Most functions start with a prologue that inspects the stack pointer
and the current G's stack bound and calls `morestack` if the stack
needs to grow.

Functions can be marked `//go:nosplit` (or `NOSPLIT` in assembly) to
indicate that they should not get this prologue. This has several
uses:

- Functions that must run on the user stack, but must not call into
  stack growth, for example because this would cause a deadlock, or
  because they have untyped words on the stack.

- Functions that must not be preempted on entry.

- Functions that may run without a valid G. For example, functions
  that run in early runtime start-up, or that may be entered from C
  code such as cgo callbacks or the signal handler.

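As a minimal sketch of the annotation's form, the program below marks
a small leaf function `//go:nosplit`. The function `clampIndex` and
the justification in its comment are hypothetical and exist only for
illustration; ordinary programs rarely need this directive.

```go
package main

import "fmt"

// clampIndex returns i limited to the range [0, n-1], or 0 if n <= 0.
//
// nosplit: hypothetical justification for this sketch. In runtime
// code this line would state the real reason, for example that the
// function may run without a valid G.
//
//go:nosplit
func clampIndex(i, n int) int {
	if n <= 0 || i < 0 {
		return 0
	}
	if i >= n {
		return n - 1
	}
	return i
}

func main() {
	// The caller is an ordinary splittable function.
	fmt.Println(clampIndex(-3, 10), clampIndex(4, 10), clampIndex(42, 10))
}
```

The function is kept small and makes no calls, so it needs little
stack and never reaches a stack-growth check of its own.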

Splittable functions ensure there's some amount of space on the stack
for nosplit functions to run in and the linker checks that any static
chain of nosplit function calls cannot exceed this bound.

Any function with a `//go:nosplit` annotation should explain why it is
nosplit in its documentation comment.

Error handling and reporting
============================

Errors that can reasonably be recovered from in user code should use
`panic` like usual. However, there are some situations where `panic`
will cause an immediate fatal error, such as when called on the system
stack or when called during `mallocgc`.

Most errors in the runtime are not recoverable. For these, use
`throw`, which dumps the traceback and immediately terminates the
process. In general, `throw` should be passed a string constant to
avoid allocating in perilous situations. By convention, additional
details are printed before `throw` using `print` or `println` and the
messages are prefixed with "runtime:".

For unrecoverable errors where user code is expected to be at fault
for the failure (such as racing map writes), use `fatal`.

For runtime error debugging, it may be useful to run with
`GOTRACEBACK=system` or `GOTRACEBACK=crash`. The output of `panic` and
`fatal` is as described by `GOTRACEBACK`. The output of `throw` always
includes runtime frames, metadata and all goroutines regardless of
`GOTRACEBACK` (i.e., equivalent to `GOTRACEBACK=system`). Whether
`throw` crashes or not is still controlled by `GOTRACEBACK`.

Synchronization
===============

The runtime has multiple synchronization mechanisms. They differ in
semantics and, in particular, in whether they interact with the
goroutine scheduler or the OS scheduler.

The simplest is `mutex`, which is manipulated using `lock` and
`unlock`. This should be used to protect shared structures for short
periods. Blocking on a `mutex` directly blocks the M, without
interacting with the Go scheduler.
This means it is safe to use from the lowest levels of the runtime,
but also prevents any associated G and P from being rescheduled.
`rwmutex` is similar.

For one-shot notifications, use `note`, which provides `notesleep` and
`notewakeup`. Unlike traditional UNIX `sleep`/`wakeup`, `note`s are
race-free, so `notesleep` returns immediately if the `notewakeup` has
already happened. A `note` can be reset after use with `noteclear`,
which must not race with a sleep or wakeup. Like `mutex`, blocking on
a `note` blocks the M. However, there are different ways to sleep on a
`note`: `notesleep` also prevents rescheduling of any associated G and
P, while `notetsleepg` acts like a blocking system call that allows
the P to be reused to run another G. This is still less efficient than
blocking the G directly since it consumes an M.

To interact directly with the goroutine scheduler, use `gopark` and
`goready`. `gopark` parks the current goroutine (putting it in the
"waiting" state and removing it from the scheduler's run queue) and
schedules another goroutine on the current M/P. `goready` puts a
parked goroutine back in the "runnable" state and adds it to the run
queue.

In summary,


| Interface | Blocks G | Blocks M | Blocks P |
|-----------|----------|----------|----------|
| (rw)mutex | Y        | Y        | Y        |
| note      | Y        | Y        | Y/N      |
| park      | Y        | N        | N        |
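
The sticky, one-shot behavior of `note` can be mimicked in ordinary Go
as a rough analogy. The `note` type below is a hypothetical user-level
stand-in built on a channel, not the runtime's implementation, and it
differs in an important way: sleeping on a channel parks the goroutine
(the "park" row above) instead of blocking the M.

```go
package main

import (
	"fmt"
	"time"
)

// note is a hypothetical, channel-based analogy for the runtime's
// note. It only imitates the one-shot wakeup semantics.
type note struct {
	done chan struct{}
}

func newNote() *note { return &note{done: make(chan struct{})} }

// wakeup signals the note. Closing the channel makes the signal
// sticky, so a sleep that starts later still observes it. Calling
// wakeup twice without clear panics in this sketch.
func (n *note) wakeup() { close(n.done) }

// sleep blocks until wakeup has been called and returns immediately
// if it already was.
func (n *note) sleep() { <-n.done }

// clear resets the note for reuse. As with noteclear, it must not
// race with sleep or wakeup.
func (n *note) clear() { n.done = make(chan struct{}) }

func main() {
	n := newNote()

	// Wakeup first: the later sleep does not block.
	n.wakeup()
	n.sleep()
	fmt.Println("sleep after wakeup returned")

	// Reuse the note: sleep first, wakeup arrives from elsewhere.
	n.clear()
	go func() {
		time.Sleep(10 * time.Millisecond)
		n.wakeup()
	}()
	n.sleep()
	fmt.Println("sleep before wakeup returned")
}
```

Because the wakeup is recorded rather than delivered to whoever happens
to be waiting, the order of sleep and wakeup does not matter, which is
the race-free property the text attributes to `note`.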