Source file src/cmd/internal/obj/arm64/doc.go

     1  // Copyright 2018 The Go Authors. All rights reserved.
     2  // Use of this source code is governed by a BSD-style
     3  // license that can be found in the LICENSE file.
     4  
     5  /*
     6  Package arm64 implements an ARM64 assembler. Go assembly syntax is different from GNU ARM64
     7  syntax, but we can still follow the general rules to map between them.
     8  
     9  # Instruction mnemonics mapping rules
    10  
    11  1. Most instructions use width suffixes of instruction names to indicate operand width rather than
    12  using different register names.
    13  
    14  Examples:
    15  
    16  	ADC R24, R14, R12          <=>     adc x12, x14, x24
    17  	ADDW R26->24, R21, R15     <=>     add w15, w21, w26, asr #24
    18  	FCMPS F2, F3               <=>     fcmp s3, s2
    19  	FCMPD F2, F3               <=>     fcmp d3, d2
    20  	FCVTDH F2, F3              <=>     fcvt h3, d2
    21  
    22  2. Go uses .P and .W suffixes to indicate post-increment and pre-increment.
    23  
    24  Examples:
    25  
    26  	MOVD.P -8(R10), R8         <=>      ldr x8, [x10],#-8
    27  	MOVB.W 16(R16), R10        <=>      ldrsb x10, [x16,#16]!
    28  	MOVBU.W 16(R16), R10       <=>      ldrb x10, [x16,#16]!
    29  
    30  3. Go uses a family of MOV instructions for loads and stores.
    31  
    32  64-bit variant ldr, str, stur => MOVD;
    33  32-bit variant str, stur, ldrsw => MOVW;
    34  32-bit variant ldr => MOVWU;
    35  ldrb => MOVBU; ldrh => MOVHU;
    36  ldrsb, sturb, strb => MOVB;
    37  ldrsh, sturh, strh =>  MOVH.
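
Examples (an illustrative sketch derived from the list above; the registers and offsets are
arbitrary and are not taken from the original examples):

	MOVD (R1), R2          <=>      ldr x2, [x1]
	MOVD R2, 8(R1)         <=>      str x2, [x1,#8]
	MOVW (R1), R2          <=>      ldrsw x2, [x1]
	MOVWU (R1), R2         <=>      ldr w2, [x1]
	MOVW R2, 4(R1)         <=>      str w2, [x1,#4]
	MOVBU (R1), R2         <=>      ldrb w2, [x1]
	MOVH R2, 2(R1)         <=>      strh w2, [x1,#2]

Note that the U suffix only matters for loads: it selects the zero-extending form, while the
unsuffixed load sign-extends. Stores use the unsuffixed mnemonic.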
    38  
    39  4. Go moves the condition into the opcode suffix, as in BLT.
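
Examples (an illustrative sketch; "loop" and "done" are hypothetical labels):

	BLT loop               <=>      b.lt loop
	BEQ done               <=>      b.eq done
	BNE loop               <=>      b.ne loop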
    40  
    41  5. Go adds a V prefix for most floating-point and SIMD instructions, except cryptographic extension
    42  instructions and scalar floating-point instructions.
    43  
    44  Examples:
    45  
    46  	VADD V5.H8, V18.H8, V9.H8         <=>      add v9.8h, v18.8h, v5.8h
    47  	VLD1.P (R6)(R11), [V31.D1]        <=>      ld1 {v31.1d}, [x6], x11
    48  	VFMLA V29.S2, V20.S2, V14.S2      <=>      fmla v14.2s, v20.2s, v29.2s
    49  	AESD V22.B16, V19.B16             <=>      aesd v19.16b, v22.16b
    50  	SCVTFWS R3, F16                   <=>      scvtf s16, w3
    51  
    52  6. Align directive
    53  
    54  Go asm supports the PCALIGN directive, which indicates that the next instruction should be aligned
    55  to a specified boundary by padding with NOOP instructions. The alignment value supported on arm64
    56  must be a power of 2 in the range [8, 2048].
    57  
    58  Examples:
    59  
    60  	PCALIGN $16
    61  	MOVD $2, R0          // This instruction is aligned to a 16-byte boundary.
    62  	PCALIGN $1024
    63  	MOVD $3, R1          // This instruction is aligned to a 1024-byte boundary.
    64  
    65  PCALIGN also changes the function alignment. If a function has one or more PCALIGN directives,
    66  its address will be aligned to the same or coarser boundary, which is the maximum of all the
    67  alignment values.
    68  
    69  In the following example, the function Add is aligned to 128 bytes, the largest of its PCALIGN values.
    70  
    71  Examples:
    72  
    73  	TEXT ·Add(SB),$40-16
    74  	MOVD $2, R0
    75  	PCALIGN $32
    76  	MOVD $4, R1
    77  	PCALIGN $128
    78  	MOVD $8, R2
    79  	RET
    80  
    81  On arm64, functions in Go are aligned to 16 bytes by default. PCALIGN can also be used to set the
    82  function alignment. Functions that need a specific alignment should preferably use NOFRAME and NOSPLIT
    83  to avoid the impact of the prologue inserted by the assembler, so that the function address has
    84  the same alignment as the first hand-written instruction.
    85  
    86  In the following example, PCALIGN at the entry of the function Add will align its address to 2048 bytes.
    87  
    88  Examples:
    89  
    90  	TEXT ·Add(SB),NOSPLIT|NOFRAME,$0
    91  	  PCALIGN $2048
    92  	  MOVD $1, R0
    93  	  MOVD $1, R1
    94  	  RET
    95  
    96  7. Move large constants to vector registers.
    97  
    98  Go asm uses VMOVQ/VMOVD/VMOVS to move 128-bit, 64-bit and 32-bit constants into vector registers, respectively.
    99  For a 128-bit constant, VMOVQ takes two 64-bit operands: the low part first, then the high part.
   100  
   101  Examples:
   102  
   103  	VMOVS $0x11223344, V0
   104  	VMOVD $0x1122334455667788, V1
   105  	VMOVQ $0x1122334455667788, $0x99aabbccddeeff00, V2   // V2=0x99aabbccddeeff001122334455667788
   106  
   107  8. Move an optionally-shifted 16-bit immediate value to a register.
   108  
   109  The instructions are MOVK(W), MOVZ(W) and MOVN(W). The assembly syntax is "op $(uimm16<<shift), <Rd>". The <uimm16>
   110  is a 16-bit unsigned immediate in the range 0 to 65535. For the 32-bit variant, <shift> is 0 or 16; for the
   111  64-bit variant, <shift> is 0, 16, 32 or 48.
   112  
   113  The current Go assembler does not accept a zero immediate, shifted or not: "op $0, Rd" and "op $(0<<(16|32|48)), Rd" are rejected.
   114  
   115  Examples:
   116  
   117  	MOVK $(10<<32), R20     <=>      movk x20, #10, lsl #32
   118  	MOVZW $(20<<16), R8     <=>      movz w8, #20, lsl #16
   119  	MOVK $(0<<16), R10      // reported as an error by the assembler
   120  
   121  Special Cases.
   122  
   123  (1) umov is written as VMOV.
   124  
   125  (2) br is renamed JMP, blr is renamed CALL.
   126  
   127  (3) No need to add "W" suffix: LDARB, LDARH, LDAXRB, LDAXRH, LDTRH, LDXRB, LDXRH.
   128  
   129  (4) In Go assembly syntax, NOP is a zero-width pseudo-instruction that serves a generic purpose and is
   130  unrelated to any real ARM64 instruction. NOOP represents the hardware nop instruction and is an alias of
   131  HINT $0.
   132  
   133  Examples:
   134  
   135  	VMOV V13.B[1], R20      <=>      umov w20, v13.b[1]
   136  	VMOV V13.H[1], R20      <=>      umov w20, v13.h[1]
   137  	JMP (R3)                <=>      br x3
   138  	CALL (R17)              <=>      blr x17
   139  	LDAXRB (R19), R16       <=>      ldaxrb w16, [x19]
   140  	NOOP                    <=>      nop
   141  
   142  # Register mapping rules
   143  
   144  1. All basic register names are written as Rn.
   145  
   146  2. Go uses ZR as the zero register and RSP as the stack pointer.
   147  
   148  3. Bn, Hn, Dn, Sn and Qn registers are written as Fn in floating-point instructions and as Vn
   149  in SIMD instructions.
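
Examples (an illustrative sketch of the three rules above; the operands are arbitrary and are not
taken from the original examples):

	ADD R1, R2, R3                <=>      add x3, x2, x1
	ADDW R1, R2, R3               <=>      add w3, w2, w1
	MOVD ZR, R1                   <=>      mov x1, xzr
	ADD $16, RSP, R0              <=>      add x0, sp, #0x10
	FADDD F1, F2, F3              <=>      fadd d3, d2, d1
	FADDS F1, F2, F3              <=>      fadd s3, s2, s1
	VADD V1.S4, V2.S4, V3.S4      <=>      add v3.4s, v2.4s, v1.4s

The same Rn or Fn name stands for both widths; the opcode suffix (W, S, D) selects which GNU
register name applies.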
   150  
   151  # Argument mapping rules
   152  
   153  1. The operands appear in left-to-right assignment order.
   154  
   155  Go reverses the arguments of most instructions relative to GNU syntax, so the destination is written last.
   156  
   157  Examples:
   158  
   159  	ADD R11.SXTB<<1, RSP, R25      <=>      add x25, sp, w11, sxtb #1
   160  	VADD V16, V19, V14             <=>      add d14, d19, d16
   161  
   162  Special Cases.
   163  
   164  (1) Argument order is the same as in the GNU ARM64 syntax: cbz, cbnz and some store instructions,
   165  such as str, stur, strb, sturb, strh, sturh, stlr, stlrb, stlrh, st1.
   166  
   167  Examples:
   168  
   169  	MOVD R29, 384(R19)    <=>    str x29, [x19,#384]
   170  	MOVB.P R30, 30(R4)    <=>    strb w30, [x4],#30
   171  	STLRH R21, (R19)      <=>    stlrh w21, [x19]
   172  
   173  (2) MADD, MADDW, MSUB, MSUBW, SMADDL, SMSUBL, UMADDL, UMSUBL <Rm>, <Ra>, <Rn>, <Rd>
   174  
   175  Examples:
   176  
   177  	MADD R2, R30, R22, R6       <=>    madd x6, x22, x2, x30
   178  	SMSUBL R10, R3, R17, R27    <=>    smsubl x27, w17, w10, x3
   179  
   180  (3) FMADDD, FMADDS, FMSUBD, FMSUBS, FNMADDD, FNMADDS, FNMSUBD, FNMSUBS <Fm>, <Fa>, <Fn>, <Fd>
   181  
   182  Examples:
   183  
   184  	FMADDD F30, F20, F3, F29    <=>    fmadd d29, d3, d30, d20
   185  	FNMSUBS F7, F25, F7, F22    <=>    fnmsub s22, s7, s7, s25
   186  
   187  (4) BFI, BFXIL, SBFIZ, SBFX, UBFIZ, UBFX $<lsb>, <Rn>, $<width>, <Rd>
   188  
   189  Examples:
   190  
   191  	BFIW $16, R20, $6, R0      <=>    bfi w0, w20, #16, #6
   192  	UBFIZ $34, R26, $5, R20    <=>    ubfiz x20, x26, #34, #5
   193  
   194  (5) FCCMPD, FCCMPS, FCCMPED, FCCMPES <cond>, <Fm>, <Fn>, $<nzcv>
   195  
   196  Examples:
   197  
   198  	FCCMPD AL, F8, F26, $0     <=>    fccmp d26, d8, #0x0, al
   199  	FCCMPS VS, F29, F4, $4     <=>    fccmp s4, s29, #0x4, vs
   200  	FCCMPED LE, F20, F5, $13   <=>    fccmpe d5, d20, #0xd, le
   201  	FCCMPES NE, F26, F10, $0   <=>    fccmpe s10, s26, #0x0, ne
   202  
   203  (6) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, $<imm>, $<nzcv>
   204  
   205  Examples:
   206  
   207  	CCMP MI, R22, $12, $13     <=>    ccmp x22, #0xc, #0xd, mi
   208  	CCMNW AL, R1, $11, $8      <=>    ccmn w1, #0xb, #0x8, al
   209  
   210  (7) CCMN, CCMNW, CCMP, CCMPW <cond>, <Rn>, <Rm>, $<nzcv>
   211  
   212  Examples:
   213  
   214  	CCMN VS, R13, R22, $10     <=>    ccmn x13, x22, #0xa, vs
   215  	CCMPW HS, R19, R14, $11    <=>    ccmp w19, w14, #0xb, cs
   216  
   217  (9) CSEL, CSELW, CSNEG, CSNEGW, CSINC, CSINCW <cond>, <Rn>, <Rm>, <Rd>;
   218  FCSELD, FCSELS <cond>, <Fn>, <Fm>, <Fd>
   219  
   220  Examples:
   221  
   222  	CSEL GT, R0, R19, R1        <=>    csel x1, x0, x19, gt
   223  	CSNEGW GT, R7, R17, R8      <=>    csneg w8, w7, w17, gt
   224  	FCSELD EQ, F15, F18, F16    <=>    fcsel d16, d15, d18, eq
   225  
   226  (10) TBNZ, TBZ $<imm>, <Rt>, <label>
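
Examples (an illustrative sketch; "label" is a hypothetical branch target, and the GNU side assumes
the usual convention of a w register for bit numbers below 32 and an x register otherwise):

	TBZ $3, R1, label       <=>    tbz w1, #3, label
	TBNZ $63, R2, label     <=>    tbnz x2, #63, label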
   227  
   228  (11) STLXR, STLXRW, STXR, STXRW, STLXRB, STLXRH, STXRB, STXRH  <Rf>, (<Rn|RSP>), <Rs>
   229  
   230  Examples:
   231  
   232  	STLXR ZR, (R15), R16    <=>    stlxr w16, xzr, [x15]
   233  	STXRB R9, (R21), R19    <=>    stxrb w19, w9, [x21]
   234  
   235  (12) STLXP, STLXPW, STXP, STXPW (<Rf1>, <Rf2>), (<Rn|RSP>), <Rs>
   236  
   237  Examples:
   238  
   239  	STLXP (R17, R19), (R4), R5      <=>    stlxp w5, x17, x19, [x4]
   240  	STXPW (R30, R25), (R22), R13    <=>    stxp w13, w30, w25, [x22]
   241  
   242  2. Expressions for special arguments.
   243  
   244  #<immediate> is written as $<immediate>.
   245  
   246  Optionally-shifted immediate.
   247  
   248  Examples:
   249  
   250  	ADD $(3151<<12), R14, R20     <=>    add x20, x14, #0xc4f, lsl #12
   251  	ADDW $1864, R25, R6           <=>    add w6, w25, #0x748
   252  
   253  Optionally-shifted registers are written as <Rm>{<shift><amount>}.
   254  The <shift> can be <<(lsl), >>(lsr), ->(asr), @>(ror).
   255  
   256  Examples:
   257  
   258  	ADD R19>>30, R10, R24     <=>    add x24, x10, x19, lsr #30
   259  	ADDW R26->24, R21, R15    <=>    add w15, w21, w26, asr #24
   260  
   261  Extended registers are written as <Rm>{.<extend>{<<<amount>}}.
   262  <extend> can be UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW or SXTX.
   263  
   264  Examples:
   265  
   266  	ADDS R19.UXTB<<4, R9, R26     <=>    adds x26, x9, w19, uxtb #4
   267  	ADDSW R14.SXTX, R14, R6       <=>    adds w6, w14, w14, sxtx
   268  
   269  Memory references: [<Xn|SP>{,#0}] is written as (Rn|RSP); a base register and an immediate
   270  offset is written as imm(Rn|RSP); a base register and an offset register is written as (Rn|RSP)(Rm).
   271  
   272  Examples:
   273  
   274  	LDAR (R22), R9                  <=>    ldar x9, [x22]
   275  	LDP 28(R17), (R15, R23)         <=>    ldp x15, x23, [x17,#28]
   276  	MOVWU (R4)(R12<<2), R8          <=>    ldr w8, [x4, x12, lsl #2]
   277  	MOVD (R7)(R11.UXTW<<3), R25     <=>    ldr x25, [x7,w11,uxtw #3]
   278  	MOVBU (R27)(R23), R14           <=>    ldrb w14, [x27,x23]
   279  
   280  Register pairs are written as (Rt1, Rt2).
   281  
   282  Examples:
   283  
   284  	LDP.P -240(R11), (R12, R26)    <=>    ldp x12, x26, [x11],#-240
   285  
   286  Registers with an arrangement, and registers with an arrangement and an index.
   287  
   288  Examples:
   289  
   290  	VADD V5.H8, V18.H8, V9.H8                     <=>    add v9.8h, v18.8h, v5.8h
   291  	VLD1 (R2), [V21.B16]                          <=>    ld1 {v21.16b}, [x2]
   292  	VST1.P V9.S[1], (R16)(R21)                    <=>    st1 {v9.s}[1], [x16], x21
   293  	VST1.P [V13.H8, V14.H8, V15.H8], (R3)(R14)    <=>    st1 {v13.8h-v15.8h}, [x3], x14
   294  	VST1.P [V14.D1, V15.D1], (R7)(R23)            <=>    st1 {v14.1d, v15.1d}, [x7], x23
   295  */
   296  package arm64
   297  
