Source file src/cmd/cgo/doc.go

Documentation: cmd/cgo

     1  // Copyright 2009 The Go Authors. All rights reserved.
     2  // Use of this source code is governed by a BSD-style
     3  // license that can be found in the LICENSE file.
     4  
     5  /*
     6  Cgo enables the creation of Go packages that call C code.
     7  
     8  # Using cgo with the go command
     9  
    10  To use cgo, write normal Go code that imports a pseudo-package "C".
    11  The Go code can then refer to types such as C.size_t, variables such
    12  as C.stdout, or functions such as C.putchar.
    13  
    14  If the import of "C" is immediately preceded by a comment, that
    15  comment, called the preamble, is used as a header when compiling
    16  the C parts of the package. For example:
    17  
    18  	// #include <stdio.h>
    19  	// #include <errno.h>
    20  	import "C"
    21  
    22  The preamble may contain any C code, including function and variable
    23  declarations and definitions. These may then be referred to from Go
    24  code as though they were defined in the package "C". All names
    25  declared in the preamble may be used, even if they start with a
    26  lower-case letter. Exception: static variables in the preamble may
    27  not be referenced from Go code; static functions are permitted.
    28  
    29  See $GOROOT/cmd/cgo/internal/teststdio and $GOROOT/misc/cgo/gmp for examples. See
    30  "C? Go? Cgo!" for an introduction to using cgo:
    31  https://golang.org/doc/articles/c_go_cgo.html.
    32  
    33  CFLAGS, CPPFLAGS, CXXFLAGS, FFLAGS and LDFLAGS may be defined with pseudo
    34  #cgo directives within these comments to tweak the behavior of the C, C++
    35  or Fortran compiler. Values defined in multiple directives are concatenated
    36  together. The directive can include a list of build constraints limiting its
    37  effect to systems satisfying one of the constraints
    38  (see https://golang.org/pkg/go/build/#hdr-Build_Constraints for details about the constraint syntax).
    39  For example:
    40  
    41  	// #cgo CFLAGS: -DPNG_DEBUG=1
    42  	// #cgo amd64 386 CFLAGS: -DX86=1
    43  	// #cgo LDFLAGS: -lpng
    44  	// #include <png.h>
    45  	import "C"
    46  
    47  Alternatively, CPPFLAGS and LDFLAGS may be obtained via the pkg-config tool
    48  using a '#cgo pkg-config:' directive followed by the package names.
    49  For example:
    50  
    51  	// #cgo pkg-config: png cairo
    52  	// #include <png.h>
    53  	import "C"
    54  
    55  The default pkg-config tool may be changed by setting the PKG_CONFIG environment variable.
    56  
    57  For security reasons, only a limited set of flags are allowed, notably -D, -U, -I, and -l.
    58  To allow additional flags, set CGO_CFLAGS_ALLOW to a regular expression
    59  matching the new flags. To disallow flags that would otherwise be allowed,
    60  set CGO_CFLAGS_DISALLOW to a regular expression matching arguments
    61  that must be disallowed. In both cases the regular expression must match
    62  a full argument: to allow -mfoo=bar, use CGO_CFLAGS_ALLOW='-mfoo.*',
    63  not just CGO_CFLAGS_ALLOW='-mfoo'. Similarly named variables control
    64  the allowed CPPFLAGS, CXXFLAGS, FFLAGS, and LDFLAGS.
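
The full-argument rule can be illustrated with a small sketch. This is not the go command's implementation: the helper name allowedBy is hypothetical, and anchoring the pattern with ^(?:...)$ is just one way to model "must match a full argument".

```go
package main

import (
	"fmt"
	"regexp"
)

// allowedBy reports whether pattern matches the whole of arg.
// Hypothetical helper: it models the documented rule by anchoring the
// pattern at both ends; the go command's own code may differ.
func allowedBy(pattern, arg string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	return re.MatchString(arg)
}

func main() {
	fmt.Println(allowedBy(`-mfoo`, "-mfoo=bar"))   // false: only a prefix matches
	fmt.Println(allowedBy(`-mfoo.*`, "-mfoo=bar")) // true: the whole argument matches
}
```

Under this model, a pattern that matches only a prefix of the flag does not allow it, which is why the text recommends '-mfoo.*'.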
    65  
    66  Also for security reasons, only a limited set of characters are
    67  permitted, notably alphanumeric characters and a few symbols, such as
    68  '.', that will not be interpreted in unexpected ways. Attempts to use
    69  forbidden characters will get a "malformed #cgo argument" error.
    70  
    71  When building, the CGO_CFLAGS, CGO_CPPFLAGS, CGO_CXXFLAGS, CGO_FFLAGS and
    72  CGO_LDFLAGS environment variables are added to the flags derived from
    73  these directives. Package-specific flags should be set using the
    74  directives, not the environment variables, so that builds work in
    75  unmodified environments. Flags obtained from environment variables
    76  are not subject to the security limitations described above.
    77  
    78  All the cgo CPPFLAGS and CFLAGS directives in a package are concatenated and
    79  used to compile C files in that package. All the CPPFLAGS and CXXFLAGS
    80  directives in a package are concatenated and used to compile C++ files in that
    81  package. All the CPPFLAGS and FFLAGS directives in a package are concatenated
    82  and used to compile Fortran files in that package. All the LDFLAGS directives
    83  in any package in the program are concatenated and used at link time. All the
    84  pkg-config directives are concatenated and sent to pkg-config simultaneously
    85  to add to each appropriate set of command-line flags.
    86  
    87  When the cgo directives are parsed, any occurrence of the string ${SRCDIR}
    88  will be replaced by the absolute path to the directory containing the source
    89  file. This allows pre-compiled static libraries to be included in the package
    90  directory and linked properly.
    91  For example, if package foo is in the directory /go/src/foo, the directive
    92  
    93  	// #cgo LDFLAGS: -L${SRCDIR}/libs -lfoo
    94  
    95  will be expanded to:
    96  
    97  	// #cgo LDFLAGS: -L/go/src/foo/libs -lfoo
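
The substitution can be modeled as a plain string replacement. The sketch below is illustrative only; expandSrcDir is a hypothetical helper, not a function of the go command.

```go
package main

import (
	"fmt"
	"strings"
)

// expandSrcDir replaces every ${SRCDIR} in a directive with dir.
// Hypothetical helper that models the documented substitution only.
func expandSrcDir(directive, dir string) string {
	return strings.ReplaceAll(directive, "${SRCDIR}", dir)
}

func main() {
	fmt.Println(expandSrcDir("#cgo LDFLAGS: -L${SRCDIR}/libs -lfoo", "/go/src/foo"))
	// Prints: #cgo LDFLAGS: -L/go/src/foo/libs -lfoo
}
```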
    98  
    99  When the Go tool sees that one or more Go files use the special import
   100  "C", it will look for other non-Go files in the directory and compile
   101  them as part of the Go package. Any .c, .s, .S or .sx files will be
   102  compiled with the C compiler. Any .cc, .cpp, or .cxx files will be
   103  compiled with the C++ compiler. Any .f, .F, .for or .f90 files will be
   104  compiled with the Fortran compiler. Any .h, .hh, .hpp, or .hxx files will
   105  not be compiled separately, but, if these header files are changed,
   106  the package (including its non-Go source files) will be recompiled.
   107  Note that changes to files in other directories do not cause the package
   108  to be recompiled, so all non-Go source code for the package should be
   109  stored in the package directory, not in subdirectories.
   110  The default C and C++ compilers may be changed by the CC and CXX
   111  environment variables, respectively; those environment variables
   112  may include command line options.
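
As a reading aid, the extension lists above can be written as a lookup. This is a sketch under the assumption that only the extension matters; compilerFor is a hypothetical helper and does not reflect how the go command is structured internally.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// compilerFor names the compiler that handles a non-Go file, following the
// extension lists in the text. Hypothetical helper; "" means the file is not
// compiled on its own (for example a header) or is not in the lists.
func compilerFor(name string) string {
	switch filepath.Ext(name) {
	case ".c", ".s", ".S", ".sx":
		return "C"
	case ".cc", ".cpp", ".cxx":
		return "C++"
	case ".f", ".F", ".for", ".f90":
		return "Fortran"
	}
	return ""
}

func main() {
	for _, name := range []string{"impl.c", "start.S", "wrap.cxx", "solver.f90", "api.hpp"} {
		fmt.Printf("%-10s -> %q\n", name, compilerFor(name))
	}
}
```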
   113  
   114  The cgo tool will always invoke the C compiler with the source file's
   115  directory in the include path; i.e. -I${SRCDIR} is always implied. This
   116  means that if a header file foo/bar.h exists both in the source
   117  directory and also in the system include directory (or some other place
   118  specified by a -I flag), then "#include <foo/bar.h>" will always find the
   119  local version in preference to any other version.
   120  
   121  The cgo tool is enabled by default for native builds on systems where
   122  it is expected to work. It is disabled by default when cross-compiling
   123  as well as when the CC environment variable is unset and the default
   124  C compiler (typically gcc or clang) cannot be found on the system PATH.
   125  You can override the default by setting the CGO_ENABLED
   126  environment variable when running the go tool: set it to 1 to enable
   127  the use of cgo, and to 0 to disable it. The go tool will set the
   128  build constraint "cgo" if cgo is enabled. The special import "C"
   129  implies the "cgo" build constraint, as though the file also said
   130  "//go:build cgo".  Therefore, if cgo is disabled, files that import
   131  "C" will not be built by the go tool. (For more about build constraints
   132  see https://golang.org/pkg/go/build/#hdr-Build_Constraints).
   133  
   134  When cross-compiling, you must specify a C cross-compiler for cgo to
   135  use. You can do this by setting the generic CC_FOR_TARGET or the
   136  more specific CC_FOR_${GOOS}_${GOARCH} (for example, CC_FOR_linux_arm)
   137  environment variable when building the toolchain using make.bash,
   138  or you can set the CC environment variable any time you run the go tool.
   139  
   140  The CXX_FOR_TARGET, CXX_FOR_${GOOS}_${GOARCH}, and CXX
   141  environment variables work in a similar way for C++ code.
   142  
   143  # Go references to C
   144  
   145  Within the Go file, C's struct field names that are keywords in Go
   146  can be accessed by prefixing them with an underscore: if x points at a C
   147  struct with a field named "type", x._type accesses the field.
   148  C struct fields that cannot be expressed in Go, such as bit fields
   149  or misaligned data, are omitted in the Go struct, replaced by
   150  appropriate padding to reach the next field or the end of the struct.
   151  
   152  The standard C numeric types are available under the names
   153  C.char, C.schar (signed char), C.uchar (unsigned char),
   154  C.short, C.ushort (unsigned short), C.int, C.uint (unsigned int),
   155  C.long, C.ulong (unsigned long), C.longlong (long long),
   156  C.ulonglong (unsigned long long), C.float, C.double,
   157  C.complexfloat (complex float), and C.complexdouble (complex double).
   158  The C type void* is represented by Go's unsafe.Pointer.
   159  The C types __int128_t and __uint128_t are represented by [16]byte.
   160  
   161  A few special C types which would normally be represented by a pointer
   162  type in Go are instead represented by a uintptr.  See the Special
   163  cases section below.
   164  
   165  To access a struct, union, or enum type directly, prefix it with
   166  struct_, union_, or enum_, as in C.struct_stat.
   167  
   168  The size of any C type T is available as C.sizeof_T, as in
   169  C.sizeof_struct_stat.
   170  
   171  A C function may be declared in the Go file with a parameter type of
   172  the special name _GoString_. This function may be called with an
   173  ordinary Go string value. The string length, and a pointer to the
   174  string contents, may be accessed by calling the C functions
   175  
   176  	size_t _GoStringLen(_GoString_ s);
   177  	const char *_GoStringPtr(_GoString_ s);
   178  
   179  These functions are only available in the preamble, not in other C
   180  files. The C code must not modify the contents of the pointer returned
   181  by _GoStringPtr. Note that the string contents may not have a trailing
   182  NUL byte.
   183  
   184  As Go doesn't have support for C's union type in the general case,
   185  C's union types are represented as a Go byte array with the same length.
   186  
   187  Go structs cannot embed fields with C types.
   188  
   189  Go code cannot refer to zero-sized fields that occur at the end of
   190  non-empty C structs. To get the address of such a field (which is the
   191  only operation you can do with a zero-sized field) you must take the
   192  address of the struct and add the size of the struct.
   193  
   194  Cgo translates C types into equivalent unexported Go types.
   195  Because the translations are unexported, a Go package should not
   196  expose C types in its exported API: a C type used in one Go package
   197  is different from the same C type used in another.
   198  
   199  Any C function (even void functions) may be called in a multiple
   200  assignment context to retrieve both the return value (if any) and the
   201  C errno variable as an error (use _ to skip the result value if the
   202  function returns void). For example:
   203  
   204  	n, err = C.sqrt(-1)
   205  	_, err := C.voidFunc()
   206  	var n, err = C.sqrt(1)
   207  
   208  Calling C function pointers is currently not supported; however, you can
   209  declare Go variables which hold C function pointers and pass them
   210  back and forth between Go and C. C code may call function pointers
   211  received from Go. For example:
   212  
   213  	package main
   214  
   215  	// typedef int (*intFunc) ();
   216  	//
   217  	// int
   218  	// bridge_int_func(intFunc f)
   219  	// {
   220  	//	return f();
   221  	// }
   222  	//
   223  	// int fortytwo()
   224  	// {
   225  	//	return 42;
   226  	// }
   227  	import "C"
   228  	import "fmt"
   229  
   230  	func main() {
   231  		f := C.intFunc(C.fortytwo)
   232  		fmt.Println(int(C.bridge_int_func(f)))
   233  		// Output: 42
   234  	}
   235  
   236  In C, a function argument written as a fixed size array
   237  actually requires a pointer to the first element of the array.
   238  C compilers are aware of this calling convention and adjust
   239  the call accordingly, but Go cannot. In Go, you must pass
   240  the pointer to the first element explicitly: C.f(&C.x[0]).
   241  
   242  Calling variadic C functions is not supported. It is possible to
   243  circumvent this by using a C function wrapper. For example:
   244  
   245  	package main
   246  
   247  	// #include <stdio.h>
   248  	// #include <stdlib.h>
   249  	//
   250  	// static void myprint(char* s) {
   251  	//   printf("%s\n", s);
   252  	// }
   253  	import "C"
   254  	import "unsafe"
   255  
   256  	func main() {
   257  		cs := C.CString("Hello from stdio")
   258  		C.myprint(cs)
   259  		C.free(unsafe.Pointer(cs))
   260  	}
   261  
   262  A few special functions convert between Go and C types
   263  by making copies of the data. In pseudo-Go definitions:
   264  
   265  	// Go string to C string
   266  	// The C string is allocated in the C heap using malloc.
   267  	// It is the caller's responsibility to arrange for it to be
   268  	// freed, such as by calling C.free (be sure to include stdlib.h
   269  	// if C.free is needed).
   270  	func C.CString(string) *C.char
   271  
   272  	// Go []byte slice to C array
   273  	// The C array is allocated in the C heap using malloc.
   274  	// It is the caller's responsibility to arrange for it to be
   275  	// freed, such as by calling C.free (be sure to include stdlib.h
   276  	// if C.free is needed).
   277  	func C.CBytes([]byte) unsafe.Pointer
   278  
   279  	// C string to Go string
   280  	func C.GoString(*C.char) string
   281  
   282  	// C data with explicit length to Go string
   283  	func C.GoStringN(*C.char, C.int) string
   284  
   285  	// C data with explicit length to Go []byte
   286  	func C.GoBytes(unsafe.Pointer, C.int) []byte
   287  
   288  As a special case, C.malloc does not call the C library malloc directly
   289  but instead calls a Go helper function that wraps the C library malloc
   290  but guarantees never to return nil. If C's malloc indicates out of memory,
   291  the helper function crashes the program, like when Go itself runs out
   292  of memory. Because C.malloc cannot fail, it has no two-result form
   293  that returns errno.
   294  
   295  # C references to Go
   296  
   297  Go functions can be exported for use by C code in the following way:
   298  
   299  	//export MyFunction
   300  	func MyFunction(arg1, arg2 int, arg3 string) int64 {...}
   301  
   302  	//export MyFunction2
   303  	func MyFunction2(arg1, arg2 int, arg3 string) (int64, *C.char) {...}
   304  
   305  They will be available in the C code as:
   306  
   307  	extern GoInt64 MyFunction(int arg1, int arg2, GoString arg3);
   308  	extern struct MyFunction2_return MyFunction2(int arg1, int arg2, GoString arg3);
   309  
   310  found in the _cgo_export.h generated header, after any preambles
   311  copied from the cgo input files. Functions with multiple
   312  return values are mapped to functions returning a struct.
   313  
   314  Not all Go types can be mapped to C types in a useful way.
   315  Go struct types are not supported; use a C struct type.
   316  Go array types are not supported; use a C pointer.
   317  
   318  Go functions that take arguments of type string may be called with the
   319  C type _GoString_, described above. The _GoString_ type will be
   320  automatically defined in the preamble. Note that there is no way for C
   321  code to create a value of this type; this is only useful for passing
   322  string values from Go to C and back to Go.
   323  
   324  Using //export in a file places a restriction on the preamble:
   325  since it is copied into two different C output files, it must not
   326  contain any definitions, only declarations. If a file contains both
   327  definitions and declarations, then the two output files will produce
   328  duplicate symbols and the linker will fail. To avoid this, definitions
   329  must be placed in preambles in other files, or in C source files.
   330  
   331  # Passing pointers
   332  
   333  Go is a garbage collected language, and the garbage collector needs to
   334  know the location of every pointer to Go memory. Because of this,
   335  there are restrictions on passing pointers between Go and C.
   336  
   337  In this section the term Go pointer means a pointer to memory
   338  allocated by Go (such as by using the & operator or calling the
   339  predefined new function) and the term C pointer means a pointer to
   340  memory allocated by C (such as by a call to C.malloc). Whether a
   341  pointer is a Go pointer or a C pointer is a dynamic property
   342  determined by how the memory was allocated; it has nothing to do with
   343  the type of the pointer.
   344  
   345  Note that values of some Go types, other than the type's zero value,
   346  always include Go pointers. This is true of string, slice, interface,
   347  channel, map, and function types. A pointer type may hold a Go pointer
   348  or a C pointer. Array and struct types may or may not include Go
   349  pointers, depending on the element types. All the discussion below
   350  about Go pointers applies not just to pointer types, but also to other
   351  types that include Go pointers.
   352  
   353  All Go pointers passed to C must point to pinned Go memory. Go pointers
   354  passed as function arguments to C functions have the memory they point to
   355  implicitly pinned for the duration of the call. Go memory reachable from
   356  these function arguments must be pinned as long as the C code has access
   357  to it. Whether Go memory is pinned is a dynamic property of that memory
   358  region; it has nothing to do with the type of the pointer.
   359  
   360  Go values created by calling new, by taking the address of a composite
   361  literal, or by taking the address of a local variable may also have their
   362  memory pinned using [runtime.Pinner]. This type may be used to manage
   363  the duration of the memory's pinned status, potentially beyond the
   364  duration of a C function call. Memory may be pinned more than once and
   365  must be unpinned exactly the same number of times it has been pinned.
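
A minimal sketch of the [runtime.Pinner] pattern is shown below. It is pure Go so that it stands alone; in real use the callback f would be the code that hands the pointer to C. The helper name withPinned is hypothetical, and runtime.Pinner requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"runtime"
)

// withPinned pins p while f runs and unpins it afterwards.
// Hypothetical helper; f stands in for code that passes p to C.
func withPinned(p *int, f func(*int)) {
	var pinner runtime.Pinner
	pinner.Pin(p)
	defer pinner.Unpin() // releases everything pinned through this Pinner
	f(p)
}

func main() {
	v := new(int)
	withPinned(v, func(q *int) { *q = 42 })
	fmt.Println(*v) // 42
}
```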
   366  
   367  Go code may pass a Go pointer to C provided the memory to which it
   368  points does not contain any Go pointers to memory that is unpinned. When
   369  passing a pointer to a field in a struct, the Go memory in question is
   370  the memory occupied by the field, not the entire struct. When passing a
   371  pointer to an element in an array or slice, the Go memory in question is
   372  the entire array or the entire backing array of the slice.
   373  
   374  C code may keep a copy of a Go pointer only as long as the memory it
   375  points to is pinned.
   376  
   377  C code may not keep a copy of a Go pointer after the call returns,
   378  unless the memory it points to is pinned with [runtime.Pinner] and the
   379  Pinner is not unpinned while the Go pointer is stored in C memory.
   380  This implies that C code may not keep a copy of a string, slice,
   381  channel, and so forth, because they cannot be pinned with
   382  [runtime.Pinner].
   383  
   384  The _GoString_ type also may not be pinned with [runtime.Pinner].
   385  Because it includes a Go pointer, the memory it points to is only pinned
   386  for the duration of the call; _GoString_ values may not be retained by C
   387  code.
   388  
   389  A Go function called by C code may return a Go pointer to pinned memory
   390  (which implies that it may not return a string, slice, channel, and so
   391  forth). A Go function called by C code may take C pointers as arguments,
   392  and it may store non-pointer data, C pointers, or Go pointers to pinned
   393  memory through those pointers. It may not store a Go pointer to unpinned
   394  memory in memory pointed to by a C pointer (which, again, implies that it
   395  may not store a string, slice, channel, and so forth). A Go function
   396  called by C code may take a Go pointer but it must preserve the property
   397  that the Go memory to which it points (and the Go memory to which that
   398  memory points, and so on) is pinned.
   399  
   400  These rules are checked dynamically at runtime. The checking is
   401  controlled by the cgocheck setting of the GODEBUG environment
   402  variable. The default setting is GODEBUG=cgocheck=1, which implements
   403  reasonably cheap dynamic checks. These checks may be disabled
   404  entirely using GODEBUG=cgocheck=0. Complete checking of pointer
   405  handling, at some cost in run time, is available by setting
   406  GOEXPERIMENT=cgocheck2 at build time.
   407  
   408  It is possible to defeat this enforcement by using the unsafe package,
   409  and of course there is nothing stopping the C code from doing anything
   410  it likes. However, programs that break these rules are likely to fail
   411  in unexpected and unpredictable ways.
   412  
   413  The runtime/cgo.Handle type can be used to safely pass Go values
   414  between Go and C. See the runtime/cgo package documentation for details.
   415  
   416  Note: the current implementation has a bug. While Go code is permitted
   417  to write nil or a C pointer (but not a Go pointer) to C memory, the
   418  current implementation may sometimes cause a runtime error if the
   419  contents of the C memory appear to be a Go pointer. Therefore, avoid
   420  passing uninitialized C memory to Go code if the Go code is going to
   421  store pointer values in it. Zero out the memory in C before passing it
   422  to Go.
   423  
   424  # Special cases
   425  
   426  A few special C types which would normally be represented by a pointer
   427  type in Go are instead represented by a uintptr. Those include:
   428  
   429  1. The *Ref types on Darwin, rooted at CoreFoundation's CFTypeRef type.
   430  
   431  2. The object types from Java's JNI interface:
   432  
   433  	jobject
   434  	jclass
   435  	jthrowable
   436  	jstring
   437  	jarray
   438  	jbooleanArray
   439  	jbyteArray
   440  	jcharArray
   441  	jshortArray
   442  	jintArray
   443  	jlongArray
   444  	jfloatArray
   445  	jdoubleArray
   446  	jobjectArray
   447  	jweak
   448  
   449  3. The EGLDisplay and EGLConfig types from the EGL API.
   450  
   451  These types are uintptr on the Go side because they would otherwise
   452  confuse the Go garbage collector; they are sometimes not really
   453  pointers but data structures encoded in a pointer type. All operations
   454  on these types must happen in C. The proper constant to initialize an
   455  empty such reference is 0, not nil.
   456  
   457  These special cases were introduced in Go 1.10. For auto-updating code
   458  from Go 1.9 and earlier, use the cftype or jni rewrites in the Go fix tool:
   459  
   460  	go tool fix -r cftype <pkg>
   461  	go tool fix -r jni <pkg>
   462  
   463  It will replace nil with 0 in the appropriate places.
   464  
   465  The EGLDisplay case was introduced in Go 1.12. Use the egl rewrite
   466  to auto-update code from Go 1.11 and earlier:
   467  
   468  	go tool fix -r egl <pkg>
   469  
   470  The EGLConfig case was introduced in Go 1.15. Use the eglconf rewrite
   471  to auto-update code from Go 1.14 and earlier:
   472  
   473  	go tool fix -r eglconf <pkg>
   474  
   475  # Using cgo directly
   476  
   477  Usage:
   478  
   479  	go tool cgo [cgo options] [-- compiler options] gofiles...
   480  
   481  Cgo transforms the specified input Go source files into several output
   482  Go and C source files.
   483  
   484  The compiler options are passed through uninterpreted when
   485  invoking the C compiler to compile the C parts of the package.
   486  
   487  The following options are available when running cgo directly:
   488  
   489  	-V
   490  		Print cgo version and exit.
   491  	-debug-define
   492  		Debugging option. Print #defines.
   493  	-debug-gcc
   494  		Debugging option. Trace C compiler execution and output.
   495  	-dynimport file
   496  		Write list of symbols imported by file. Write to
   497  		-dynout argument or to standard output. Used by go
   498  		build when building a cgo package.
   499  	-dynlinker
   500  		Write dynamic linker as part of -dynimport output.
   501  	-dynout file
   502  		Write -dynimport output to file.
   503  	-dynpackage package
   504  		Set Go package for -dynimport output.
   505  	-exportheader file
   506  		If there are any exported functions, write the
   507  		generated export declarations to file.
   508  		C code can #include this to see the declarations.
   509  	-importpath string
   510  		The import path for the Go package. Optional; used for
   511  		nicer comments in the generated files.
   512  	-import_runtime_cgo
   513  		If set (which it is by default) import runtime/cgo in
   514  		generated output.
   515  	-import_syscall
   516  		If set (which it is by default) import syscall in
   517  		generated output.
   518  	-gccgo
   519  		Generate output for the gccgo compiler rather than the
   520  		gc compiler.
   521  	-gccgoprefix prefix
   522  		The -fgo-prefix option to be used with gccgo.
   523  	-gccgopkgpath path
   524  		The -fgo-pkgpath option to be used with gccgo.
   525  	-gccgo_define_cgoincomplete
   526  		Define cgo.Incomplete locally rather than importing it from
   527  		the "runtime/cgo" package. Used for old gccgo versions.
   528  	-godefs
   529  		Write out input file in Go syntax replacing C package
   530  		names with real values. Used to generate files in the
   531  		syscall package when bootstrapping a new target.
   532  	-ldflags flags
   533  		Flags to pass to the C linker. The cmd/go tool uses
   534  		this to pass in the flags in the CGO_LDFLAGS variable.
   535  	-objdir directory
   536  		Put all generated files in directory.
   537  	-srcdir directory
   538  */
   539  package main
   540  
   541  /*
   542  Implementation details.
   543  
   544  Cgo provides a way for Go programs to call C code linked into the same
   545  address space. This comment explains the operation of cgo.
   546  
   547  Cgo reads a set of Go source files and looks for statements saying
   548  import "C". If the import has a doc comment, that comment is
   549  taken as literal C code to be used as a preamble to any C code
   550  generated by cgo. A typical preamble #includes necessary definitions:
   551  
   552  	// #include <stdio.h>
   553  	import "C"
   554  
   555  For more details about the usage of cgo, see the documentation
   556  comment at the top of this file.
   557  
   558  Understanding C
   559  
   560  Cgo scans the Go source files that import "C" for uses of that
   561  package, such as C.puts. It collects all such identifiers. The next
   562  step is to determine each kind of name. In C.xxx the xxx might refer
   563  to a type, a function, a constant, or a global variable. Cgo must
   564  decide which.
   565  
   566  The obvious thing for cgo to do is to process the preamble, expanding
   567  #includes and processing the corresponding C code. That would require
   568  a full C parser and type checker that was also aware of any extensions
   569  known to the system compiler (for example, all the GNU C extensions) as
   570  well as the system-specific header locations and system-specific
   571  pre-#defined macros. This is certainly possible to do, but it is an
   572  enormous amount of work.
   573  
   574  Cgo takes a different approach. It determines the meaning of C
   575  identifiers not by parsing C code but by feeding carefully constructed
   576  programs into the system C compiler and interpreting the generated
   577  error messages, debug information, and object files. In practice,
   578  parsing these is significantly less work and more robust than parsing
   579  C source.
   580  
   581  Cgo first invokes gcc -E -dM on the preamble, in order to find out
   582  about simple #defines for constants and the like. These are recorded
   583  for later use.
   584  
   585  Next, cgo needs to identify the kind of each identifier. For the
   586  identifier C.foo, cgo generates this C program:
   587  
   588  	<preamble>
   589  	#line 1 "not-declared"
   590  	void __cgo_f_1_1(void) { __typeof__(foo) *__cgo_undefined__1; }
   591  	#line 1 "not-type"
   592  	void __cgo_f_1_2(void) { foo *__cgo_undefined__2; }
   593  	#line 1 "not-int-const"
   594  	void __cgo_f_1_3(void) { enum { __cgo_undefined__3 = (foo)*1 }; }
   595  	#line 1 "not-num-const"
   596  	void __cgo_f_1_4(void) { static const double __cgo_undefined__4 = (foo); }
   597  	#line 1 "not-str-lit"
   598  	void __cgo_f_1_5(void) { static const char __cgo_undefined__5[] = (foo); }
   599  
   600  This program will not compile, but cgo can use the presence or absence
   601  of an error message on a given line to deduce the information it
   602  needs. The program is syntactically valid regardless of whether each
   603  name is a type or an ordinary identifier, so there will be no syntax
   604  errors that might stop parsing early.
   605  
   606  An error on not-declared:1 indicates that foo is undeclared.
   607  An error on not-type:1 indicates that foo is not a type (if declared at all, it is an identifier).
   608  An error on not-int-const:1 indicates that foo is not an integer constant.
   609  An error on not-num-const:1 indicates that foo is not a number constant.
   610  An error on not-str-lit:1 indicates that foo is not a string literal.
   611  An error on not-signed-int-const:1 indicates that foo is not a signed integer constant.
   612  
   613  The line number specifies the name involved. In the example, 1 is foo.
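
One way to read the list above is as a decision table over the set of sections that reported an error. The sketch below is a simplified model derived from that list, not cgo's actual code; the flag names and kindOf are hypothetical.

```go
package main

import "fmt"

// One bit per probe section in the generated program.
const (
	notType = 1 << iota
	notIntConst
	notNumConst
	notStrLit
	notDeclared
)

// kindOf guesses what a name is from the sections that produced errors.
// Hypothetical, simplified model: the first probe that did NOT fail wins.
func kindOf(errs int) string {
	switch {
	case errs&notDeclared != 0:
		return "undeclared"
	case errs&notType == 0:
		return "type"
	case errs&notIntConst == 0:
		return "integer constant"
	case errs&notNumConst == 0:
		return "numeric constant"
	case errs&notStrLit == 0:
		return "string literal"
	}
	return "variable or function"
}

func main() {
	fmt.Println(kindOf(notIntConst | notNumConst | notStrLit))           // type
	fmt.Println(kindOf(notType | notStrLit))                             // integer constant
	fmt.Println(kindOf(notType | notIntConst | notNumConst | notStrLit)) // variable or function
}
```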
   614  
   615  Next, cgo must learn the details of each type, variable, function, or
   616  constant. It can do this by reading object files. If cgo has decided
   617  that t1 is a type, v2 and v3 are variables or functions, i4 and i5
   618  are integer constants, u6 is an unsigned integer constant, f7 and f8
   619  are float constants, and s9 and s10 are string constants, it generates:
   620  
   621  	<preamble>
   622  	__typeof__(t1) *__cgo__1;
   623  	__typeof__(v2) *__cgo__2;
   624  	__typeof__(v3) *__cgo__3;
   625  	__typeof__(i4) *__cgo__4;
   626  	enum { __cgo_enum__4 = i4 };
   627  	__typeof__(i5) *__cgo__5;
   628  	enum { __cgo_enum__5 = i5 };
   629  	__typeof__(u6) *__cgo__6;
   630  	enum { __cgo_enum__6 = u6 };
   631  	__typeof__(f7) *__cgo__7;
   632  	__typeof__(f8) *__cgo__8;
   633  	__typeof__(s9) *__cgo__9;
   634  	__typeof__(s10) *__cgo__10;
   635  
   636  	long long __cgodebug_ints[] = {
   637  		0, // t1
   638  		0, // v2
   639  		0, // v3
   640  		i4,
   641  		i5,
   642  		u6,
   643  		0, // f7
   644  		0, // f8
   645  		0, // s9
   646  		0, // s10
   647  		1
   648  	};
   649  
   650  	double __cgodebug_floats[] = {
   651  		0, // t1
   652  		0, // v2
   653  		0, // v3
   654  		0, // i4
   655  		0, // i5
   656  		0, // u6
   657  		f7,
   658  		f8,
   659  		0, // s9
   660  		0, // s10
   661  		1
   662  	};
   663  
   664  	const char __cgodebug_str__9[] = s9;
   665  	const unsigned long long __cgodebug_strlen__9 = sizeof(s9)-1;
   666  	const char __cgodebug_str__10[] = s10;
   667  	const unsigned long long __cgodebug_strlen__10 = sizeof(s10)-1;
   668  
   669  and again invokes the system C compiler, to produce an object file
   670  containing debug information. Cgo parses the DWARF debug information
   671  for __cgo__N to learn the type of each identifier. (The types also
   672  distinguish functions from global variables.) Cgo reads the constant
   673  values from the __cgodebug_* symbols in the object file's data segment.
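
As a rough illustration of that last step, the sketch below decodes an
array of 64-bit integers from raw data-segment bytes. The names
decodeDebugInts and decodeSample and the hand-built byte slice are
hypothetical; finding the symbol in a real object file is not shown, and
cgo's own code may work differently.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeDebugInts interprets data as consecutive 64-bit signed integers
// in the given byte order. Hypothetical helper, not cgo's own code.
func decodeDebugInts(data []byte, order binary.ByteOrder) []int64 {
	vals := make([]int64, 0, len(data)/8)
	for len(data) >= 8 {
		vals = append(vals, int64(order.Uint64(data)))
		data = data[8:]
	}
	return vals
}

// decodeSample builds little-endian bytes for a made-up array
// {0, 42, -1, 1} and decodes them again. The trailing 1 mimics the
// final element of the generated arrays shown above.
func decodeSample() []int64 {
	want := []int64{0, 42, -1, 1}
	buf := make([]byte, 8*len(want))
	for i, v := range want {
		binary.LittleEndian.PutUint64(buf[8*i:], uint64(v))
	}
	return decodeDebugInts(buf, binary.LittleEndian)
}

func main() {
	fmt.Println(decodeSample())
}
```

The byte order is passed in explicitly because it depends on the target
the object file was compiled for.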
   674  
   675  At this point cgo knows the meaning of each C.xxx well enough to start
   676  the translation process.
   677  
   678  Translating Go
   679  
   680  Given the input Go files x.go and y.go, cgo generates these source
   681  files:
   682  
   683  	x.cgo1.go       # for gc (cmd/compile)
   684  	y.cgo1.go       # for gc
   685  	_cgo_gotypes.go # for gc
   686  	_cgo_import.go  # for gc (if -dynout _cgo_import.go)
   687  	x.cgo2.c        # for gcc
   688  	y.cgo2.c        # for gcc
   689  	_cgo_defun.c    # for gcc (if -gccgo)
   690  	_cgo_export.c   # for gcc
   691  	_cgo_export.h   # for gcc
   692  	_cgo_main.c     # for gcc
   693  	_cgo_flags      # for build tool (if -gccgo)
   694  
   695  The file x.cgo1.go is a copy of x.go with the import "C" removed and
   696  references to C.xxx replaced with names like _Cfunc_xxx or _Ctype_xxx.
   697  The definitions of those identifiers, written as Go functions, types,
   698  or variables, are provided in _cgo_gotypes.go.
   699  
   700  Here is a _cgo_gotypes.go containing definitions for needed C types:
   701  
   702  	type _Ctype_char int8
   703  	type _Ctype_int int32
   704  	type _Ctype_void [0]byte
   705  
   706  The _cgo_gotypes.go file also contains the definitions of the
   707  functions. They all have similar bodies that invoke runtime·cgocall
   708  to make a switch from the Go runtime world to the system C (GCC-based)
   709  world.
   710  
   711  For example, here is the definition of _Cfunc_puts:
   712  
   713  	//go:cgo_import_static _cgo_be59f0f25121_Cfunc_puts
   714  	//go:linkname __cgofn__cgo_be59f0f25121_Cfunc_puts _cgo_be59f0f25121_Cfunc_puts
   715  	var __cgofn__cgo_be59f0f25121_Cfunc_puts byte
   716  	var _cgo_be59f0f25121_Cfunc_puts = unsafe.Pointer(&__cgofn__cgo_be59f0f25121_Cfunc_puts)
   717  
   718  	func _Cfunc_puts(p0 *_Ctype_char) (r1 _Ctype_int) {
   719  		_cgo_runtime_cgocall(_cgo_be59f0f25121_Cfunc_puts, uintptr(unsafe.Pointer(&p0)))
   720  		return
   721  	}
   722  
   723  The hexadecimal number is a hash of cgo's input, chosen to be
   724  deterministic yet unlikely to collide with other uses. The actual
   725  function _cgo_be59f0f25121_Cfunc_puts is implemented in a C source
   726  file compiled by gcc, the file x.cgo2.c:
   727  
   728  	void
   729  	_cgo_be59f0f25121_Cfunc_puts(void *v)
   730  	{
   731  		struct {
   732  			char* p0;
   733  			int r;
   734  			char __pad12[4];
   735  		} __attribute__((__packed__, __gcc_struct__)) *a = v;
   736  		a->r = puts((void*)a->p0);
   737  	}
   738  
   739  It extracts the arguments from the pointer to _Cfunc_puts's argument
   740  frame, invokes the system C function (in this case, puts), stores the
   741  result in the frame, and returns.
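
To make the frame layout concrete, here is a small Go sketch that mirrors
the C struct above and reports its field offsets. The type putsFrame and
the function frameLayout are hypothetical names used only for this
illustration; the real frame is the argument area of _Cfunc_puts, not a
named struct.

```go
package main

import (
	"fmt"
	"unsafe"
)

// putsFrame is a hypothetical Go mirror of the packed C struct used by
// _cgo_be59f0f25121_Cfunc_puts: one pointer argument, a 32-bit result,
// and explicit padding.
type putsFrame struct {
	p0  unsafe.Pointer // char* p0
	r   int32          // int r
	pad [4]byte        // char __pad12[4]
}

// frameLayout returns the offsets of p0 and r and the total frame size.
func frameLayout() (offP0, offR, size uintptr) {
	var f putsFrame
	return unsafe.Offsetof(f.p0), unsafe.Offsetof(f.r), unsafe.Sizeof(f)
}

func main() {
	offP0, offR, size := frameLayout()
	fmt.Printf("p0 at %d, r at %d, frame size %d\n", offP0, offR, size)
}
```

On a typical 64-bit target the result r sits at offset 8, directly after
the pointer, and the explicit padding brings the frame to 16 bytes, so
the C side and the Go side agree on where each value lives.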
   742  
   743  Linking
   744  
   745  Once the _cgo_export.c and *.cgo2.c files have been compiled with gcc,
   746  they need to be linked into the final binary, along with the libraries
   747  they might depend on (in the case of puts, stdio). cmd/link has been
   748  extended to understand basic ELF files, but it does not understand ELF
   749  in the full complexity that modern C libraries embrace, so it cannot
   750  in general generate direct references to the system libraries.
   751  
   752  Instead, the build process generates an object file using dynamic
   753  linkage to the desired libraries. The main function is provided by
   754  _cgo_main.c:
   755  
   756  	int main() { return 0; }
   757  	void crosscall2(void(*fn)(void*), void *a, int c, uintptr_t ctxt) { }
   758  	uintptr_t _cgo_wait_runtime_init_done(void) { return 0; }
   759  	void _cgo_release_context(uintptr_t ctxt) { }
   760  	char* _cgo_topofstack(void) { return (char*)0; }
   761  	void _cgo_allocate(void *a, int c) { }
   762  	void _cgo_panic(void *a, int c) { }
   763  	void _cgo_reginit(void) { }
   764  
   765  The extra functions here are stubs to satisfy the references in the C
   766  code generated for gcc. The build process links this stub, along with
   767  _cgo_export.c and *.cgo2.c, into a dynamic executable and then lets
   768  cgo examine the executable. Cgo records the list of shared library
   769  references and resolved names and writes them into a new file
   770  _cgo_import.go, which looks like:
   771  
   772  	//go:cgo_dynamic_linker "/lib64/ld-linux-x86-64.so.2"
   773  	//go:cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"
   774  	//go:cgo_import_dynamic __libc_start_main __libc_start_main#GLIBC_2.2.5 "libc.so.6"
   775  	//go:cgo_import_dynamic stdout stdout#GLIBC_2.2.5 "libc.so.6"
   776  	//go:cgo_import_dynamic fflush fflush#GLIBC_2.2.5 "libc.so.6"
   777  	//go:cgo_import_dynamic _ _ "libpthread.so.0"
   778  	//go:cgo_import_dynamic _ _ "libc.so.6"
   779  
   780  In the end, the compiled Go package, which will eventually be
   781  presented to cmd/link as part of a larger program, contains:
   782  
   783  	_go_.o        # gc-compiled object for _cgo_gotypes.go, _cgo_import.go, *.cgo1.go
   784  	_all.o        # gcc-compiled object for _cgo_export.c, *.cgo2.c
   785  
   786  If there is an error generating the _cgo_import.go file, then, instead
   787  of adding _cgo_import.go to the package, the go tool adds an empty
   788  file named dynimportfail. The _cgo_import.go file is only needed when
   789  using internal linking mode, which is not the default when linking
   790  programs that use cgo (as described below). If the linker sees a file
   791  named dynimportfail it reports an error if it has been told to use
   792  internal linking mode. This approach is taken because generating
   793  _cgo_import.go requires doing a full C link of the package, which can
   794  fail for reasons that are irrelevant when using external linking mode.
   795  
   796  The final program will be a dynamic executable, so that cmd/link can avoid
   797  needing to process arbitrary .o files. It only needs to process the .o
   798  files generated from C files that cgo writes, and those are much more
   799  limited in the ELF or other features that they use.
   800  
   801  In essence, the _cgo_import.go file includes the extra linking
   802  directives that cmd/link is not sophisticated enough to derive from _all.o
   803  on its own. Similarly, the _all.o uses dynamic references to real
   804  system object code because cmd/link is not sophisticated enough to process
   805  the real code.
   806  
   807  The main benefits of this system are that cmd/link remains relatively simple
   808  (it does not need to implement a complete ELF and Mach-O linker) and
   809  that gcc is not needed after the package is compiled. For example,
   810  package net uses cgo for access to name resolution functions provided
   811  by libc. Although gcc is needed to compile package net, gcc is not
   812  needed to link programs that import package net.
   813  
   814  Runtime
   815  
   816  When using cgo, Go must not assume that it owns all details of the
   817  process. In particular it needs to coordinate with C in the use of
   818  threads and thread-local storage. The runtime package declares a few
   819  variables:
   820  
   821  	var (
   822  		iscgo             bool
   823  		_cgo_init         unsafe.Pointer
   824  		_cgo_thread_start unsafe.Pointer
   825  	)
   826  
   827  Any package using cgo imports "runtime/cgo", which provides
   828  initializations for these variables. It sets iscgo to true, _cgo_init
   829  to a gcc-compiled function that can be called early during program
   830  startup, and _cgo_thread_start to a gcc-compiled function that can be
   831  used to create a new thread, in place of the runtime's usual direct
   832  system calls.
   833  
   834  Internal and External Linking
   835  
   836  The text above describes "internal" linking, in which cmd/link parses and
   837  links host object files (ELF, Mach-O, PE, and so on) into the final
   838  executable itself. Keeping cmd/link simple means we cannot possibly
   839  implement the full semantics of the host linker, so the kinds of
   840  objects that can be linked directly into the binary are limited (other
   841  code can only be used as a dynamic library). On the other hand, when
   842  using internal linking, cmd/link can generate Go binaries by itself.
   843  
   844  In order to allow linking arbitrary object files without requiring
   845  dynamic libraries, cgo supports an "external" linking mode too. In
   846  external linking mode, cmd/link does not process any host object files.
   847  Instead, it collects all the Go code and writes a single go.o object
   848  file containing it. Then it invokes the host linker (usually gcc) to
   849  combine the go.o object file and any supporting non-Go code into a
   850  final executable. External linking avoids the dynamic library
   851  requirement but introduces a requirement that the host linker be
   852  present to create such a binary.
   853  
   854  Most builds both compile source code and invoke the linker to create a
   855  binary. When cgo is involved, the compile step already requires gcc, so
   856  it is not problematic for the link step to require gcc too.
   857  
   858  An important exception is builds using a pre-compiled copy of the
   859  standard library. In particular, package net uses cgo on most systems,
   860  and we want to preserve the ability to compile pure Go code that
   861  imports net without requiring gcc to be present at link time. (In this
   862  case, the dynamic library requirement is less significant, because the
   863  only library involved is libc.so, which can usually be assumed
   864  present.)
   865  
   866  This conflict between functionality and the gcc requirement means we
   867  must support both internal and external linking, depending on the
   868  circumstances: if net is the only cgo-using package, then internal
   869  linking is probably fine, but if other packages are involved, so that there
   870  are dependencies on libraries beyond libc, external linking is likely
   871  to work better. The compilation of a package records the relevant
   872  information to support both linking modes, leaving the decision
   873  to be made when linking the final binary.
   874  
   875  Linking Directives
   876  
   877  In either linking mode, package-specific directives must be passed
   878  through to cmd/link. These are communicated by writing //go: directives in a
   879  Go source file compiled by gc. The directives are copied into the .o
   880  object file and then processed by the linker.
   881  
   882  The directives are:
   883  
   884  //go:cgo_import_dynamic <local> [<remote> ["<library>"]]
   885  
   886  	In internal linking mode, allow an unresolved reference to
   887  	<local>, assuming it will be resolved by a dynamic library
   888  	symbol. The optional <remote> specifies the symbol's name and
   889  	possibly version in the dynamic library, and the optional "<library>"
   890  	names the specific library where the symbol should be found.
   891  
   892  	On AIX, the library pattern is slightly different. It must be
   893  	"lib.a/obj.o" with obj.o the member of this library exporting
   894  	this symbol.
   895  
   896  	In the <remote>, # or @ can be used to introduce a symbol version.
   897  
   898  	Examples:
   899  	//go:cgo_import_dynamic puts
   900  	//go:cgo_import_dynamic puts puts#GLIBC_2.2.5
   901  	//go:cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"
   902  
   903  	A side effect of the cgo_import_dynamic directive with a
   904  	library is to make the final binary depend on that dynamic
   905  	library. To get the dependency without importing any specific
   906  	symbols, use _ for local and remote.
   907  
   908  	Example:
   909  	//go:cgo_import_dynamic _ _ "libc.so.6"
   910  
   911  	For compatibility with current versions of SWIG,
   912  	#pragma dynimport is an alias for //go:cgo_import_dynamic.
   913  
   914  //go:cgo_dynamic_linker "<path>"
   915  
   916  	In internal linking mode, use "<path>" as the dynamic linker
   917  	in the final binary. This directive is only needed from one
   918  	package when constructing a binary; by convention it is
   919  	supplied by runtime/cgo.
   920  
   921  	Example:
   922  	//go:cgo_dynamic_linker "/lib/ld-linux.so.2"
   923  
   924  //go:cgo_export_dynamic <local> <remote>
   925  
   926  	In internal linking mode, put the Go symbol
   927  	named <local> into the program's exported symbol table as
   928  	<remote>, so that C code can refer to it by that name. This
   929  	mechanism makes it possible for C code to call back into Go or
   930  	to share Go's data.
   931  
   932  	For compatibility with current versions of SWIG,
   933  	#pragma dynexport is an alias for //go:cgo_export_dynamic.
   934  
   935  //go:cgo_import_static <local>
   936  
   937  	In external linking mode, allow unresolved references to
   938  	<local> in the go.o object file prepared for the host linker,
   939  	under the assumption that <local> will be supplied by the
   940  	other object files that will be linked with go.o.
   941  
   942  	Example:
   943  	//go:cgo_import_static puts_wrapper
   944  
   945  //go:cgo_export_static <local> <remote>
   946  
   947  	In external linking mode, put the Go symbol
   948  	named <local> into the program's exported symbol table as
   949  	<remote>, so that C code can refer to it by that name. This
   950  	mechanism makes it possible for C code to call back into Go or
   951  	to share Go's data.
   952  
   953  //go:cgo_ldflag "<arg>"
   954  
   955  	In external linking mode, invoke the host linker (usually gcc)
   956  	with "<arg>" as a command-line argument following the .o files.
   957  	Note that the arguments are for "gcc", not "ld".
   958  
   959  	Example:
   960  	//go:cgo_ldflag "-lpthread"
   961  	//go:cgo_ldflag "-L/usr/local/sqlite3/lib"
   962  
   963  A package compiled with cgo will include directives for both
   964  internal and external linking; the linker will select the appropriate
   965  subset for the chosen linking mode.
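
As an illustration of the directive syntax, the following sketch parses a
//go:cgo_import_dynamic line into its parts. The type importDynamic and
the function parseImportDynamic are hypothetical, and the sketch makes
two simplifying assumptions: an omitted <remote> defaults to <local>, and
the quoted library name contains no spaces. It is not the linker's
actual parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// importDynamic holds the parts of one cgo_import_dynamic directive.
type importDynamic struct {
	Local   string // symbol name as referenced by the object files
	Remote  string // symbol name in the dynamic library
	Version string // optional symbol version, introduced by # or @
	Library string // optional library name, unquoted
}

// parseImportDynamic splits a directive line into its fields.
// Assumption: a missing <remote> defaults to <local>.
func parseImportDynamic(line string) (importDynamic, error) {
	const prefix = "//go:cgo_import_dynamic"
	if !strings.HasPrefix(line, prefix) {
		return importDynamic{}, fmt.Errorf("not a cgo_import_dynamic directive: %q", line)
	}
	fields := strings.Fields(line[len(prefix):])
	if len(fields) < 1 || len(fields) > 3 {
		return importDynamic{}, fmt.Errorf("want 1 to 3 arguments, got %d", len(fields))
	}
	d := importDynamic{Local: fields[0], Remote: fields[0]}
	if len(fields) >= 2 {
		d.Remote = fields[1]
		// Either # or @ separates the symbol name from its version.
		if i := strings.IndexAny(d.Remote, "#@"); i >= 0 {
			d.Remote, d.Version = d.Remote[:i], d.Remote[i+1:]
		}
	}
	if len(fields) == 3 {
		lib, err := strconv.Unquote(fields[2])
		if err != nil {
			return importDynamic{}, fmt.Errorf("bad library %s: %v", fields[2], err)
		}
		d.Library = lib
	}
	return d, nil
}

func main() {
	d, err := parseImportDynamic(`//go:cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"`)
	fmt.Printf("%+v %v\n", d, err)
}
```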
   966  
   967  Example
   968  
   969  As a simple example, consider a package that uses cgo to call C.sin.
   970  The following code will be generated by cgo:
   971  
   972  	// compiled by gc
   973  
   974  	//go:cgo_ldflag "-lm"
   975  
   976  	type _Ctype_double float64
   977  
   978  	//go:cgo_import_static _cgo_gcc_Cfunc_sin
   979  	//go:linkname __cgo_gcc_Cfunc_sin _cgo_gcc_Cfunc_sin
   980  	var __cgo_gcc_Cfunc_sin byte
   981  	var _cgo_gcc_Cfunc_sin = unsafe.Pointer(&__cgo_gcc_Cfunc_sin)
   982  
   983  	func _Cfunc_sin(p0 _Ctype_double) (r1 _Ctype_double) {
   984  		_cgo_runtime_cgocall(_cgo_gcc_Cfunc_sin, uintptr(unsafe.Pointer(&p0)))
   985  		return
   986  	}
   987  
   988  	// compiled by gcc, into foo.cgo2.o
   989  
   990  	void
   991  	_cgo_gcc_Cfunc_sin(void *v)
   992  	{
   993  		struct {
   994  			double p0;
   995  			double r;
   996  		} __attribute__((__packed__)) *a = v;
   997  		a->r = sin(a->p0);
   998  	}
   999  
  1000  What happens at link time depends on whether the final binary is linked
  1001  using the internal or external mode. If other packages are compiled in
  1002  "external only" mode, then the final link will be an external one.
  1003  Otherwise the link will be an internal one.
  1004  
  1005  The linking directives are used according to the kind of final link
  1006  used.
  1007  
  1008  In internal mode, cmd/link itself processes all the host object files, in
  1009  particular foo.cgo2.o. To do so, it uses the cgo_import_dynamic and
  1010  cgo_dynamic_linker directives to learn that the otherwise undefined
  1011  reference to sin in foo.cgo2.o should be rewritten to refer to the
  1012  symbol sin with version GLIBC_2.2.5 from the dynamic library
  1013  "libm.so.6", and the binary should request "/lib/ld-linux.so.2" as its
  1014  runtime dynamic linker.
  1015  
  1016  In external mode, cmd/link does not process any host object files, in
  1017  particular foo.cgo2.o. It links together the gc-generated object
  1018  files, along with any other Go code, into a go.o file. While doing
  1019  that, cmd/link will discover that there is no definition for
  1020  _cgo_gcc_Cfunc_sin, referred to by the gc-compiled source file. This
  1021  is okay, because cmd/link also processes the cgo_import_static directive and
  1022  knows that _cgo_gcc_Cfunc_sin is expected to be supplied by a host
  1023  object file, so cmd/link does not treat the missing symbol as an error when
  1024  creating go.o. Indeed, the definition for _cgo_gcc_Cfunc_sin will be
  1025  provided to the host linker by foo.cgo2.o, which in turn will need the
  1026  symbol 'sin'. cmd/link also processes the cgo_ldflag directives, so that it
  1027  knows that the eventual host link command must include the -lm
  1028  argument, so that the host linker will be able to find 'sin' in the
  1029  math library.
  1030  
  1031  cmd/link Command Line Interface
  1032  
  1033  The go command and any other Go-aware build systems invoke cmd/link
  1034  to link a collection of packages into a single binary. By default, cmd/link will
  1035  present the same interface it does today:
  1036  
  1037  	cmd/link main.a
  1038  
  1039  produces a file named a.out, even if cmd/link does so by invoking the host
  1040  linker in external linking mode.
  1041  
  1042  By default, cmd/link will decide the linking mode as follows: if the only
  1043  packages using cgo are those on a list of known standard library
  1044  packages (net, os/user, runtime/cgo), cmd/link will use internal linking
  1045  mode. Otherwise, there are non-standard cgo packages involved, and cmd/link
  1046  will use external linking mode. The first rule means that a build of
  1047  the godoc binary, which uses net but no other cgo, can run without
  1048  needing gcc available. The second rule means that a build of a
  1049  cgo-wrapped library like sqlite3 can generate a standalone executable
  1050  instead of needing to refer to a dynamic library. The specific choice
  1051  can be overridden using a command line flag: cmd/link -linkmode=internal or
  1052  cmd/link -linkmode=external.
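
The default rule and its override can be sketched in Go as follows. The
function chooseLinkMode and the map internalOK are hypothetical names,
and the real linker may take further conditions into account; this only
restates the two rules described above.

```go
package main

import "fmt"

// internalOK lists the standard library packages named in the text
// whose use of cgo still permits internal linking.
var internalOK = map[string]bool{
	"net":         true,
	"os/user":     true,
	"runtime/cgo": true,
}

// chooseLinkMode returns "internal" or "external" for a build whose
// cgo-using packages are cgoPkgs. A non-empty override of "internal"
// or "external" stands in for the -linkmode flag and wins outright.
func chooseLinkMode(cgoPkgs []string, override string) string {
	if override == "internal" || override == "external" {
		return override
	}
	for _, pkg := range cgoPkgs {
		if !internalOK[pkg] {
			// A non-standard cgo package is involved.
			return "external"
		}
	}
	return "internal"
}

func main() {
	fmt.Println(chooseLinkMode([]string{"net", "runtime/cgo"}, ""))
	fmt.Println(chooseLinkMode([]string{"net", "example.com/sqlite3"}, ""))
	fmt.Println(chooseLinkMode([]string{"example.com/sqlite3"}, "internal"))
}
```

The import path example.com/sqlite3 is made up for the example.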
  1053  
  1054  In an external link, cmd/link will create a temporary directory, write any
  1055  host object files found in package archives to that directory (renamed
  1056  to avoid conflicts), write the go.o file to that directory, and invoke
  1057  the host linker. The default value for the host linker is $CC, split
  1058  into fields, or else "gcc". The specific host linker command line can
  1059  be overridden using command line flags: cmd/link -extld=clang
  1060  -extldflags='-ggdb -O3'. If any package in a build includes a .cc or
  1061  other file compiled by the C++ compiler, the go tool will use the
  1062  -extld option to set the host linker to the C++ compiler.
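
The default host linker selection described above amounts to the
following minimal sketch. The function name hostLinker is hypothetical,
and the handling of -extld and -extldflags is left out.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hostLinker returns the host linker command as a list of fields:
// the value of cc split on whitespace, or "gcc" if cc is empty.
func hostLinker(cc string) []string {
	if fields := strings.Fields(cc); len(fields) > 0 {
		return fields
	}
	return []string{"gcc"}
}

func main() {
	// Use the CC environment variable, as the text describes.
	fmt.Println(hostLinker(os.Getenv("CC")))
}
```

Splitting into fields lets a value such as "clang -m32" carry extra
arguments along with the program name.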
  1063  
  1064  These defaults mean that Go-aware build systems can ignore the linking
  1065  changes and keep running plain 'cmd/link' and get reasonable results, but
  1066  they can also control the linking details if desired.
  1067  
  1068  */
  1069  
