| .. SPDX-License-Identifier: GPL-2.0 | 
 |  | 
 | ========================= | 
 | Introduction to LoongArch | 
 | ========================= | 
 |  | 
 | LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V. There are | 
 | currently 3 variants: a reduced 32-bit version (LA32R), a standard 32-bit | 
 | version (LA32S) and a 64-bit version (LA64). There are 4 privilege levels | 
 | (PLVs) defined in LoongArch: PLV0~PLV3, from high to low. Kernel runs at PLV0 | 
 | while applications run at PLV3. This document introduces the registers, basic | 
 | instruction set, virtual memory and some other topics of LoongArch. | 
 |  | 
 | Registers | 
 | ========= | 
 |  | 
 | LoongArch registers include general purpose registers (GPRs), floating point | 
 | registers (FPRs), vector registers (VRs) and control status registers (CSRs) | 
 | used in privileged mode (PLV0). | 
 |  | 
 | GPRs | 
 | ---- | 
 |  | 
 | LoongArch has 32 GPRs ( ``$r0`` ~ ``$r31`` ); each one is 32-bit wide in LA32 | 
 | and 64-bit wide in LA64. ``$r0`` is hard-wired to zero, and the other registers | 
 | are not architecturally special. (Except ``$r1``, which is hard-wired as the | 
 | link register of the BL instruction.) | 
 |  | 
 | The kernel uses a variant of the LoongArch register convention, as described in | 
 | the LoongArch ELF psABI spec, in :ref:`References <loongarch-references>`: | 
 |  | 
 | ================= =============== =================== ============ | 
 | Name              Alias           Usage               Preserved | 
 |                                                       across calls | 
 | ================= =============== =================== ============ | 
 | ``$r0``           ``$zero``       Constant zero       Unused | 
 | ``$r1``           ``$ra``         Return address      No | 
 | ``$r2``           ``$tp``         TLS/Thread pointer  Unused | 
 | ``$r3``           ``$sp``         Stack pointer       Yes | 
 | ``$r4``-``$r11``  ``$a0``-``$a7`` Argument registers  No | 
 | ``$r4``-``$r5``   ``$v0``-``$v1`` Return value        No | 
 | ``$r12``-``$r20`` ``$t0``-``$t8`` Temp registers      No | 
 | ``$r21``          ``$u0``         Percpu base address Unused | 
 | ``$r22``          ``$fp``         Frame pointer       Yes | 
 | ``$r23``-``$r31`` ``$s0``-``$s8`` Static registers    Yes | 
 | ================= =============== =================== ============ | 
 |  | 
 | .. Note:: | 
 |     The register ``$r21`` is reserved in the ELF psABI, but used by the Linux | 
 |     kernel for storing the percpu base address. It normally has no ABI name, | 
 |     but is called ``$u0`` in the kernel. You may also see ``$v0`` or ``$v1`` | 
 |     in some old code,however they are deprecated aliases of ``$a0`` and ``$a1`` | 
 |     respectively. | 
 |  | 
 | FPRs | 
 | ---- | 
 |  | 
 | LoongArch has 32 FPRs ( ``$f0`` ~ ``$f31`` ) when FPU is present. Each one is | 
 | 64-bit wide on the LA64 cores. | 
 |  | 
 | The floating-point register convention is the same as described in the | 
 | LoongArch ELF psABI spec: | 
 |  | 
 | ================= ================== =================== ============ | 
 | Name              Alias              Usage               Preserved | 
 |                                                          across calls | 
 | ================= ================== =================== ============ | 
 | ``$f0``-``$f7``   ``$fa0``-``$fa7``  Argument registers  No | 
 | ``$f0``-``$f1``   ``$fv0``-``$fv1``  Return value        No | 
 | ``$f8``-``$f23``  ``$ft0``-``$ft15`` Temp registers      No | 
 | ``$f24``-``$f31`` ``$fs0``-``$fs7``  Static registers    Yes | 
 | ================= ================== =================== ============ | 
 |  | 
 | .. Note:: | 
 |     You may see ``$fv0`` or ``$fv1`` in some old code, however they are | 
 |     deprecated aliases of ``$fa0`` and ``$fa1`` respectively. | 
 |  | 
 | VRs | 
 | ---- | 
 |  | 
 | There are currently 2 vector extensions to LoongArch: | 
 |  | 
 | - LSX (Loongson SIMD eXtension) with 128-bit vectors, | 
 | - LASX (Loongson Advanced SIMD eXtension) with 256-bit vectors. | 
 |  | 
 | LSX brings ``$v0`` ~ ``$v31`` while LASX brings ``$x0`` ~ ``$x31`` as the vector | 
 | registers. | 
 |  | 
 | The VRs overlap with FPRs: for example, on a core implementing LSX and LASX, | 
 | the lower 128 bits of ``$x0`` is shared with ``$v0``, and the lower 64 bits of | 
 | ``$v0`` is shared with ``$f0``; same with all other VRs. | 
 |  | 
 | CSRs | 
 | ---- | 
 |  | 
 | CSRs can only be accessed from privileged mode (PLV0): | 
 |  | 
 | ================= ===================================== ============== | 
 | Address           Full Name                             Abbrev Name | 
 | ================= ===================================== ============== | 
 | 0x0               Current Mode Information              CRMD | 
 | 0x1               Pre-exception Mode Information        PRMD | 
 | 0x2               Extension Unit Enable                 EUEN | 
 | 0x3               Miscellaneous Control                 MISC | 
 | 0x4               Exception Configuration               ECFG | 
 | 0x5               Exception Status                      ESTAT | 
 | 0x6               Exception Return Address              ERA | 
 | 0x7               Bad (Faulting) Virtual Address        BADV | 
 | 0x8               Bad (Faulting) Instruction Word       BADI | 
 | 0xC               Exception Entrypoint Address          EENTRY | 
 | 0x10              TLB Index                             TLBIDX | 
 | 0x11              TLB Entry High-order Bits             TLBEHI | 
 | 0x12              TLB Entry Low-order Bits 0            TLBELO0 | 
 | 0x13              TLB Entry Low-order Bits 1            TLBELO1 | 
 | 0x18              Address Space Identifier              ASID | 
 | 0x19              Page Global Directory Address for     PGDL | 
 |                   Lower-half Address Space | 
 | 0x1A              Page Global Directory Address for     PGDH | 
 |                   Higher-half Address Space | 
 | 0x1B              Page Global Directory Address         PGD | 
 | 0x1C              Page Walk Control for Lower-          PWCL | 
 |                   half Address Space | 
 | 0x1D              Page Walk Control for Higher-         PWCH | 
 |                   half Address Space | 
 | 0x1E              STLB Page Size                        STLBPS | 
 | 0x1F              Reduced Virtual Address Configuration RVACFG | 
 | 0x20              CPU Identifier                        CPUID | 
 | 0x21              Privileged Resource Configuration 1   PRCFG1 | 
 | 0x22              Privileged Resource Configuration 2   PRCFG2 | 
 | 0x23              Privileged Resource Configuration 3   PRCFG3 | 
 | 0x30+n (0≤n≤15)   Saved Data register                   SAVEn | 
 | 0x40              Timer Identifier                      TID | 
 | 0x41              Timer Configuration                   TCFG | 
 | 0x42              Timer Value                           TVAL | 
 | 0x43              Compensation of Timer Count           CNTC | 
 | 0x44              Timer Interrupt Clearing              TICLR | 
 | 0x60              LLBit Control                         LLBCTL | 
 | 0x80              Implementation-specific Control 1     IMPCTL1 | 
 | 0x81              Implementation-specific Control 2     IMPCTL2 | 
 | 0x88              TLB Refill Exception Entrypoint       TLBRENTRY | 
 |                   Address | 
 | 0x89              TLB Refill Exception BAD (Faulting)   TLBRBADV | 
 |                   Virtual Address | 
 | 0x8A              TLB Refill Exception Return Address   TLBRERA | 
 | 0x8B              TLB Refill Exception Saved Data       TLBRSAVE | 
 |                   Register | 
 | 0x8C              TLB Refill Exception Entry Low-order  TLBRELO0 | 
 |                   Bits 0 | 
 | 0x8D              TLB Refill Exception Entry Low-order  TLBRELO1 | 
 |                   Bits 1 | 
 | 0x8E              TLB Refill Exception Entry High-order TLBEHI | 
 |                   Bits | 
 | 0x8F              TLB Refill Exception Pre-exception    TLBRPRMD | 
 |                   Mode Information | 
 | 0x90              Machine Error Control                 MERRCTL | 
 | 0x91              Machine Error Information 1           MERRINFO1 | 
 | 0x92              Machine Error Information 2           MERRINFO2 | 
 | 0x93              Machine Error Exception Entrypoint    MERRENTRY | 
 |                   Address | 
 | 0x94              Machine Error Exception Return        MERRERA | 
 |                   Address | 
 | 0x95              Machine Error Exception Saved Data    MERRSAVE | 
 |                   Register | 
 | 0x98              Cache TAGs                            CTAG | 
 | 0x180+n (0≤n≤3)   Direct Mapping Configuration Window n DMWn | 
 | 0x200+2n (0≤n≤31) Performance Monitor Configuration n   PMCFGn | 
 | 0x201+2n (0≤n≤31) Performance Monitor Overall Counter n PMCNTn | 
 | 0x300             Memory Load/Store WatchPoint          MWPC | 
 |                   Overall Control | 
 | 0x301             Memory Load/Store WatchPoint          MWPS | 
 |                   Overall Status | 
 | 0x310+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG1 | 
 |                   Configuration 1 | 
 | 0x311+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG2 | 
 |                   Configuration 2 | 
 | 0x312+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG3 | 
 |                   Configuration 3 | 
 | 0x313+8n (0≤n≤7)  Memory Load/Store WatchPoint n        MWPnCFG4 | 
 |                   Configuration 4 | 
 | 0x380             Instruction Fetch WatchPoint          FWPC | 
 |                   Overall Control | 
 | 0x381             Instruction Fetch WatchPoint          FWPS | 
 |                   Overall Status | 
 | 0x390+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG1 | 
 |                   Configuration 1 | 
 | 0x391+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG2 | 
 |                   Configuration 2 | 
 | 0x392+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG3 | 
 |                   Configuration 3 | 
 | 0x393+8n (0≤n≤7)  Instruction Fetch WatchPoint n        FWPnCFG4 | 
 |                   Configuration 4 | 
 | 0x500             Debug Register                        DBG | 
 | 0x501             Debug Exception Return Address        DERA | 
 | 0x502             Debug Exception Saved Data Register   DSAVE | 
 | ================= ===================================== ============== | 
 |  | 
 | ERA, TLBRERA, MERRERA and DERA are sometimes also known as EPC, TLBREPC, MERREPC | 
 | and DEPC respectively. | 
 |  | 
 | Basic Instruction Set | 
 | ===================== | 
 |  | 
 | Instruction formats | 
 | ------------------- | 
 |  | 
 | LoongArch instructions are 32 bits wide, belonging to 9 basic instruction | 
 | formats (and variants of them): | 
 |  | 
 | =========== ========================== | 
 | Format name Composition | 
 | =========== ========================== | 
 | 2R          Opcode + Rj + Rd | 
 | 3R          Opcode + Rk + Rj + Rd | 
 | 4R          Opcode + Ra + Rk + Rj + Rd | 
 | 2RI8        Opcode + I8 + Rj + Rd | 
 | 2RI12       Opcode + I12 + Rj + Rd | 
 | 2RI14       Opcode + I14 + Rj + Rd | 
 | 2RI16       Opcode + I16 + Rj + Rd | 
 | 1RI21       Opcode + I21L + Rj + I21H | 
 | I26         Opcode + I26L + I26H | 
 | =========== ========================== | 
 |  | 
 | Rd is the destination register operand, while Rj, Rk and Ra ("a" stands for | 
 | "additional") are the source register operands. I8/I12/I14/I16/I21/I26 are | 
 | immediate operands of respective width. The longer I21 and I26 are stored | 
 | in separate higher and lower parts in the instruction word, denoted by the "L" | 
 | and "H" suffixes. | 
 |  | 
 | List of Instructions | 
 | -------------------- | 
 |  | 
 | For brevity, only instruction names (mnemonics) are listed here; please see the | 
 | :ref:`References <loongarch-references>` for details. | 
 |  | 
 |  | 
 | 1. Arithmetic Instructions:: | 
 |  | 
 |     ADD.W SUB.W ADDI.W ADD.D SUB.D ADDI.D | 
 |     SLT SLTU SLTI SLTUI | 
 |     AND OR NOR XOR ANDN ORN ANDI ORI XORI | 
 |     MUL.W MULH.W MULH.WU DIV.W DIV.WU MOD.W MOD.WU | 
 |     MUL.D MULH.D MULH.DU DIV.D DIV.DU MOD.D MOD.DU | 
 |     PCADDI PCADDU12I PCADDU18I | 
 |     LU12I.W LU32I.D LU52I.D ADDU16I.D | 
 |  | 
 | 2. Bit-shift Instructions:: | 
 |  | 
 |     SLL.W SRL.W SRA.W ROTR.W SLLI.W SRLI.W SRAI.W ROTRI.W | 
 |     SLL.D SRL.D SRA.D ROTR.D SLLI.D SRLI.D SRAI.D ROTRI.D | 
 |  | 
 | 3. Bit-manipulation Instructions:: | 
 |  | 
 |     EXT.W.B EXT.W.H CLO.W CLO.D SLZ.W CLZ.D CTO.W CTO.D CTZ.W CTZ.D | 
 |     BYTEPICK.W BYTEPICK.D BSTRINS.W BSTRINS.D BSTRPICK.W BSTRPICK.D | 
 |     REVB.2H REVB.4H REVB.2W REVB.D REVH.2W REVH.D BITREV.4B BITREV.8B BITREV.W BITREV.D | 
 |     MASKEQZ MASKNEZ | 
 |  | 
 | 4. Branch Instructions:: | 
 |  | 
 |     BEQ BNE BLT BGE BLTU BGEU BEQZ BNEZ B BL JIRL | 
 |  | 
 | 5. Load/Store Instructions:: | 
 |  | 
 |     LD.B LD.BU LD.H LD.HU LD.W LD.WU LD.D ST.B ST.H ST.W ST.D | 
 |     LDX.B LDX.BU LDX.H LDX.HU LDX.W LDX.WU LDX.D STX.B STX.H STX.W STX.D | 
 |     LDPTR.W LDPTR.D STPTR.W STPTR.D | 
 |     PRELD PRELDX | 
 |  | 
 | 6. Atomic Operation Instructions:: | 
 |  | 
 |     LL.W SC.W LL.D SC.D | 
 |     AMSWAP.W AMSWAP.D AMADD.W AMADD.D AMAND.W AMAND.D AMOR.W AMOR.D AMXOR.W AMXOR.D | 
 |     AMMAX.W AMMAX.D AMMIN.W AMMIN.D | 
 |  | 
 | 7. Barrier Instructions:: | 
 |  | 
 |     IBAR DBAR | 
 |  | 
 | 8. Special Instructions:: | 
 |  | 
 |     SYSCALL BREAK CPUCFG NOP IDLE ERTN(ERET) DBCL(DBGCALL) RDTIMEL.W RDTIMEH.W RDTIME.D | 
 |     ASRTLE.D ASRTGT.D | 
 |  | 
 | 9. Privileged Instructions:: | 
 |  | 
 |     CSRRD CSRWR CSRXCHG | 
 |     IOCSRRD.B IOCSRRD.H IOCSRRD.W IOCSRRD.D IOCSRWR.B IOCSRWR.H IOCSRWR.W IOCSRWR.D | 
 |     CACOP TLBP(TLBSRCH) TLBRD TLBWR TLBFILL TLBCLR TLBFLUSH INVTLB LDDIR LDPTE | 
 |  | 
 | Virtual Memory | 
 | ============== | 
 |  | 
 | LoongArch supports direct-mapped virtual memory and page-mapped virtual memory. | 
 |  | 
 | Direct-mapped virtual memory is configured by CSR.DMWn (n=0~3), it has a simple | 
 | relationship between virtual address (VA) and physical address (PA):: | 
 |  | 
 |  VA = PA + FixedOffset | 
 |  | 
 | Page-mapped virtual memory has arbitrary relationship between VA and PA, which | 
 | is recorded in TLB and page tables. LoongArch's TLB includes a fully-associative | 
 | MTLB (Multiple Page Size TLB) and set-associative STLB (Single Page Size TLB). | 
 |  | 
 | By default, the whole virtual address space of LA32 is configured like this: | 
 |  | 
 | ============ =========================== ============================= | 
 | Name         Address Range               Attributes | 
 | ============ =========================== ============================= | 
 | ``UVRANGE``  ``0x00000000 - 0x7FFFFFFF`` Page-mapped, Cached, PLV0~3 | 
 | ``KPRANGE0`` ``0x80000000 - 0x9FFFFFFF`` Direct-mapped, Uncached, PLV0 | 
 | ``KPRANGE1`` ``0xA0000000 - 0xBFFFFFFF`` Direct-mapped, Cached, PLV0 | 
 | ``KVRANGE``  ``0xC0000000 - 0xFFFFFFFF`` Page-mapped, Cached, PLV0 | 
 | ============ =========================== ============================= | 
 |  | 
 | User mode (PLV3) can only access UVRANGE. For direct-mapped KPRANGE0 and | 
 | KPRANGE1, PA is equal to VA with bit30~31 cleared. For example, the uncached | 
 | direct-mapped VA of 0x00001000 is 0x80001000, and the cached direct-mapped | 
 | VA of 0x00001000 is 0xA0001000. | 
 |  | 
 | By default, the whole virtual address space of LA64 is configured like this: | 
 |  | 
 | ============ ====================== ====================================== | 
 | Name         Address Range          Attributes | 
 | ============ ====================== ====================================== | 
 | ``XUVRANGE`` ``0x0000000000000000 - Page-mapped, Cached, PLV0~3 | 
 |              0x3FFFFFFFFFFFFFFF`` | 
 | ``XSPRANGE`` ``0x4000000000000000 - Direct-mapped, Cached / Uncached, PLV0 | 
 |              0x7FFFFFFFFFFFFFFF`` | 
 | ``XKPRANGE`` ``0x8000000000000000 - Direct-mapped, Cached / Uncached, PLV0 | 
 |              0xBFFFFFFFFFFFFFFF`` | 
 | ``XKVRANGE`` ``0xC000000000000000 - Page-mapped, Cached, PLV0 | 
 |              0xFFFFFFFFFFFFFFFF`` | 
 | ============ ====================== ====================================== | 
 |  | 
 | User mode (PLV3) can only access XUVRANGE. For direct-mapped XSPRANGE and | 
 | XKPRANGE, PA is equal to VA with bits 60~63 cleared, and the cache attribute | 
 | is configured by bits 60~61 in VA: 0 is for strongly-ordered uncached, 1 is | 
 | for coherent cached, and 2 is for weakly-ordered uncached. | 
 |  | 
 | Currently we only use XKPRANGE for direct mapping and XSPRANGE is reserved. | 
 |  | 
 | To put this in action: the strongly-ordered uncached direct-mapped VA (in | 
 | XKPRANGE) of 0x00000000_00001000 is 0x80000000_00001000, the coherent cached | 
 | direct-mapped VA (in XKPRANGE) of 0x00000000_00001000 is 0x90000000_00001000, | 
 | and the weakly-ordered uncached direct-mapped VA (in XKPRANGE) of 0x00000000 | 
 | _00001000 is 0xA0000000_00001000. | 
 |  | 
 | Relationship of Loongson and LoongArch | 
 | ====================================== | 
 |  | 
 | LoongArch is a RISC ISA which is different from any other existing ones, while | 
 | Loongson is a family of processors. Loongson includes 3 series: Loongson-1 is | 
 | the 32-bit processor series, Loongson-2 is the low-end 64-bit processor series, | 
 | and Loongson-3 is the high-end 64-bit processor series. Old Loongson is based on | 
 | MIPS, while New Loongson is based on LoongArch. Take Loongson-3 as an example: | 
 | Loongson-3A1000/3B1500/3A2000/3A3000/3A4000 are MIPS-compatible, while Loongson- | 
 | 3A5000 (and future revisions) are all based on LoongArch. | 
 |  | 
 | .. _loongarch-references: | 
 |  | 
 | References | 
 | ========== | 
 |  | 
 | Official web site of Loongson Technology Corp. Ltd.: | 
 |  | 
 |   http://www.loongson.cn/ | 
 |  | 
 | Developer web site of Loongson and LoongArch (Software and Documentation): | 
 |  | 
 |   http://www.loongnix.cn/ | 
 |  | 
 |   https://github.com/loongson/ | 
 |  | 
 |   https://loongson.github.io/LoongArch-Documentation/ | 
 |  | 
 | Documentation of LoongArch ISA: | 
 |  | 
 |   https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-CN.pdf (in Chinese) | 
 |  | 
 |   https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-Vol1-v1.10-EN.pdf (in English) | 
 |  | 
 | Documentation of LoongArch ELF psABI: | 
 |  | 
 |   https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v2.01-CN.pdf (in Chinese) | 
 |  | 
 |   https://github.com/loongson/LoongArch-Documentation/releases/latest/download/LoongArch-ELF-ABI-v2.01-EN.pdf (in English) | 
 |  | 
 | Linux kernel repository of Loongson and LoongArch: | 
 |  | 
 |   https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git |