merge-upstream/v5.15.157 from branch/tag: upstream/v5.15.157 into branch: main-R105-cos-5.15
Changelog:
-------------------------------------------------------------
Alan Stern (1):
fs: sysfs: Fix reference leak in sysfs_break_active_protection()
Alexander Usyskin (1):
mei: me: disable RPL-S on SPS and IGN firmwares
Arınç ÜNAL (4):
net: dsa: mt7530: fix mirroring frames received on local port
net: dsa: mt7530: set all CPU ports in MT7531_CPU_PMAP
net: dsa: mt7530: fix improper frames on all 25MHz and 40MHz XTAL MT7530
net: dsa: mt7530: fix enabling EEE on MT7531 switch on all boards
Boris Burkov (1):
btrfs: record delayed inode root in transaction
Carlos Llamas (1):
binder: check offset alignment in binder_get_object()
Chuanhong Guo (1):
USB: serial: option: add support for Fibocom FM650/FG650
Chuck Lever (1):
Revert "lockd: introduce safe async lock op"
Claudiu Beznea (1):
clk: remove extra empty line
Coia Prant (1):
USB: serial: option: add Lonsung U8300/U9300 product
Daniel Borkmann (4):
bpf: Generalize check_ctx_reg for reuse with other types
bpf: Generally fix helper register offset check
bpf: Fix out of bounds access for ringbuf helpers
bpf: Fix ringbuf memory type confusion when passing to helpers
Daniele Palmas (1):
USB: serial: option: add Telit FN920C04 rmnet compositions
Dave Airlie (1):
nouveau: fix instmem race condition around ptr stores
Dmitry Baryshkov (1):
drm/panel: visionox-rm69299: don't unregister DSI device
Eric Biggers (1):
x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ
Finn Thain (1):
serial/pmac_zilog: Remove flawed mitigation for rx irq flood
Florian Westphal (1):
netfilter: nft_set_pipapo: do not free live element
Gil Fine (2):
thunderbolt: Avoid notify PM core about runtime PM resume
thunderbolt: Fix wake configurations after device unplug
Greg Kroah-Hartman (2):
Revert "usb: cdc-wdm: close race between read and workqueue"
Linux 5.15.157
Jeongjun Park (1):
nilfs2: fix OOB in nilfs_set_de_type
Jerry Meng (1):
USB: serial: option: support Quectel EM060K sub-models
Josh Poimboeuf (1):
x86/bugs: Fix BHI retpoline check
Kai-Heng Feng (1):
usb: Disable USB3 LPM at shutdown
Konrad Dybcio (1):
clk: Print an info line before disabling unused clocks
Kumar Kartikeya Dwivedi (1):
bpf: Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support
Kuniyuki Iwashima (2):
af_unix: Call manage_oob() for every skb in unix_stream_read_generic().
af_unix: Don't peek OOB data without MSG_OOB.
Lei Chen (1):
tun: limit printing rate when illegal packet received by tun dev
Mark Zhang (1):
RDMA/cm: Print the old state when cm_destroy_id gets timeout
Michael Guralnik (1):
RDMA/mlx5: Fix port number for counter query in multi-port configuration
Mikhail Kobuk (1):
drm: nv04: Fix out of bounds access
Minas Harutyunyan (1):
usb: dwc2: host: Fix dereference issue in DDMA completion flow.
Namjae Jeon (3):
ksmbd: don't send oplock break if rename fails
ksmbd: validate payload size in ipc response
ksmbd: do not set SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1
Nikita Zhandarovich (1):
comedi: vmk80xx: fix incomplete endpoint checking
Norihiko Hama (1):
usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error
Pablo Neira Ayuso (3):
netfilter: br_netfilter: skip conntrack input hook for promisc packets
netfilter: flowtable: validate pppoe header
netfilter: flowtable: incorrect pppoe tuple
Peter Oberparleiter (2):
s390/qdio: handle deferred cc1
s390/cio: fix race condition during online processing
Samuel Thibault (1):
speakup: Avoid crash on very long word
Sandipan Das (1):
KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms
Sean Christopherson (1):
KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible
Siddh Raman Pant (1):
Revert "tracing/trigger: Fix to return error if failed to alloc snapshot"
Siddharth Vadapalli (1):
net: ethernet: ti: am65-cpsw-nuss: cleanup DMA Channels before using them
Stephen Boyd (4):
clk: Remove prepare_lock hold assertion in __clk_release()
clk: Mark 'all_lists' as const
clk: Initialize struct clk_core kref earlier
clk: Get runtime PM before walking tree during disable_unused
Steven Rostedt (Google) (1):
SUNRPC: Fix rpcgss_context trace event acceptor field
Vanillan Wang (1):
USB: serial: option: add Rolling RW101-GL and RW135-GL support
Vlad Buslov (1):
netfilter: nf_flow_table: count pending offload workqueue tasks
Vladimir Oltean (1):
net: dsa: introduce preferred_default_local_cpu_port and use on MT7530
Yanjun.Zhu (1):
RDMA/rxe: Fix the problem "mutex_destroy missing"
Yaxiong Tian (1):
arm64: hibernate: Fix level3 translation fault in swsusp_save()
Yuanhe Shu (1):
selftests/ftrace: Limit length in subsystem-enable tests
Yuntao Wang (1):
init/main.c: Fix potential static_command_line memory overflow
Zack Rusin (1):
drm/vmwgfx: Sort primary plane formats by order of preference
Zheng Yejian (1):
kprobes: Fix possible use-after-free issue on kprobe registration
Ziyang Xuan (2):
netfilter: nf_tables: Fix potential data-race in __nft_expr_type_get()
netfilter: nf_tables: Fix potential data-race in __nft_obj_type_get()
bolan wang (1):
USB: serial: option: add Fibocom FM135-GL variants
xinhui pan (1):
drm/amdgpu: validate the parameters of bo mapping operations more clearly
BUG=b/337599104
TEST=tryjob, validation and K8s e2e
RELEASE_NOTE=Updated the Linux kernel to v5.15.157.
Change-Id: Ie55b6695d593d294602d14e0a162e638f202fd0a
Signed-off-by: COS Kernel Merge Bot <cloud-image-merge-automation@prod.google.com>
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e61f0d0..30f1fc4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4020,6 +4020,12 @@
nomsi [MSI] If the PCI_MSI kernel config parameter is
enabled, this kernel boot option can be used to
disable the use of MSI interrupts system-wide.
+ clearmsi [X86] Clears MSI/MSI-X enable bits early in boot
+ time in order to avoid issues like adapters
+ screaming irqs and preventing boot progress.
+ Also, it enforces the PCI Local Bus spec
+ rule that those bits should be 0 in system reset
+ events (useful for kexec/kdump cases).
noioapicquirk [APIC] Disable all boot interrupt quirks.
Safety option to keep boot IRQs enabled. This
should never be necessary.
@@ -5300,6 +5306,8 @@
serialnumber [BUGS=X86-32]
+ sev=option[,option...] [X86-64] See Documentation/x86/x86_64/boot-options.rst
+
shapers= [NET]
Maximal number of shapers.
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 48b91c4..2b1fdf4 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -442,6 +442,25 @@
``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
+io_uring_disabled
+=================
+
+Prevents all processes from creating new io_uring instances. Enabling this
+shrinks the kernel's attack surface.
+
+= ==================================================================
+0 All processes can create io_uring instances as normal. This is the
+ default setting.
+1 io_uring creation is disabled for unprivileged processes.
+ io_uring_setup fails with -EPERM unless the calling process is
+ privileged (CAP_SYS_ADMIN). Existing io_uring instances can
+ still be used.
+2 io_uring creation is disabled for all processes. io_uring_setup
+ always fails with -EPERM. Existing io_uring instances can still be
+ used.
+= ==================================================================
+
+
kexec_load_disabled
===================
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index aad6db9..71697c3 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -327,7 +327,10 @@
----------
Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence
-of struct cmsghdr structures with appended data.
+of struct cmsghdr structures with appended data. TCP tx zerocopy also uses
+optmem_max as a limit for its internal structures.
+
+Default : 128 KB
fb_tunnels_only_for_init_net
----------------------------
diff --git a/Documentation/networking/device_drivers/ethernet/google/gve.rst b/Documentation/networking/device_drivers/ethernet/google/gve.rst
index 6d73ee7..31d621b 100644
--- a/Documentation/networking/device_drivers/ethernet/google/gve.rst
+++ b/Documentation/networking/device_drivers/ethernet/google/gve.rst
@@ -52,6 +52,15 @@
GVE supports two descriptor formats: GQI and DQO. These two formats have
entirely different descriptors, which will be described below.
+Addressing Mode
+------------------
+GVE supports two addressing modes: QPL and RDA.
+QPL ("queue-page-list") mode communicates data through a set of
+pre-registered pages.
+
+For RDA ("raw DMA addressing") mode, the set of pages is dynamic.
+Therefore, the packet buffers can be anywhere in guest memory.
+
Registers
---------
All registers are MMIO.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 7f75767..76534fa 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -303,6 +303,7 @@
option can harm clients of your server.
tcp_adv_win_scale - INTEGER
+ Obsolete since linux-6.6
Count buffering overhead as bytes/2^tcp_adv_win_scale
(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
if it is <= 0.
@@ -716,6 +717,13 @@
Default : 44
+tcp_backlog_ack_defer - BOOLEAN
+ If set, user thread processing socket backlog tries sending
+ one ACK for the whole queue. This helps to avoid potential
+ long latencies at end of a TCP socket syscall.
+
+ Default : true
+
tcp_slow_start_after_idle - BOOLEAN
If set, provide RFC2861 behavior and time out the congestion
window after an idle period. An idle period is defined at
diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst
new file mode 100644
index 0000000..bf593e8
--- /dev/null
+++ b/Documentation/virt/coco/sevguest.rst
@@ -0,0 +1,155 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+The Definitive SEV Guest API Documentation
+===================================================================
+
+1. General description
+======================
+
+The SEV API is a set of ioctls that are used by the guest or hypervisor
+to get or set a certain aspect of the SEV virtual machine. The ioctls belong
+to the following classes:
+
+ - Hypervisor ioctls: These query and set global attributes which affect the
+ whole SEV firmware. These ioctl are used by platform provisioning tools.
+
+ - Guest ioctls: These query and set attributes of the SEV virtual machine.
+
+2. API description
+==================
+
+This section describes ioctls that is used for querying the SEV guest report
+from the SEV firmware. For each ioctl, the following information is provided
+along with a description:
+
+ Technology:
+ which SEV technology provides this ioctl. SEV, SEV-ES, SEV-SNP or all.
+
+ Type:
+ hypervisor or guest. The ioctl can be used inside the guest or the
+ hypervisor.
+
+ Parameters:
+ what parameters are accepted by the ioctl.
+
+ Returns:
+ the return value. General error numbers (-ENOMEM, -EINVAL)
+ are not detailed, but errors with specific meanings are.
+
+The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device.
+The ioctl accepts struct snp_user_guest_request. The input and output structure is
+specified through the req_data and resp_data field respectively. If the ioctl fails
+to execute due to a firmware error, then fw_err code will be set otherwise the
+fw_err will be set to 0x00000000000000ff.
+
+The firmware checks that the message sequence counter is one greater than
+the guests message sequence counter. If guest driver fails to increment message
+counter (e.g. counter overflow), then -EIO will be returned.
+
+::
+
+ struct snp_guest_request_ioctl {
+ /* Message version number */
+ __u32 msg_version;
+
+ /* Request and response structure address */
+ __u64 req_data;
+ __u64 resp_data;
+
+ /* firmware error code on failure (see psp-sev.h) */
+ __u64 fw_err;
+ };
+
+2.1 SNP_GET_REPORT
+------------------
+
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_REPORT ioctl can be used to query the attestation report from the
+SEV-SNP firmware. The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command
+provided by the SEV-SNP firmware to query the attestation report.
+
+On success, the snp_report_resp.data will contains the report. The report
+contain the format described in the SEV-SNP specification. See the SEV-SNP
+specification for further details.
+
+2.2 SNP_GET_DERIVED_KEY
+-----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in): struct snp_derived_key_req
+:Returns (out): struct snp_derived_key_resp on success, -negative on error
+
+The SNP_GET_DERIVED_KEY ioctl can be used to get a key derive from a root key.
+The derived key can be used by the guest for any purpose, such as sealing keys
+or communicating with external entities.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_KEY_REQ) command provided by the
+SEV-SNP firmware to derive the key. See SEV-SNP specification for further details
+on the various fields passed in the key derivation request.
+
+On success, the snp_derived_key_resp.data contains the derived key value. See
+the SEV-SNP specification for further details.
+
+
+2.3 SNP_GET_EXT_REPORT
+----------------------
+:Technology: sev-snp
+:Type: guest ioctl
+:Parameters (in/out): struct snp_ext_report_req
+:Returns (out): struct snp_report_resp on success, -negative on error
+
+The SNP_GET_EXT_REPORT ioctl is similar to the SNP_GET_REPORT. The difference is
+related to the additional certificate data that is returned with the report.
+The certificate data returned is being provided by the hypervisor through the
+SNP_SET_EXT_CONFIG.
+
+The ioctl uses the SNP_GUEST_REQUEST (MSG_REPORT_REQ) command provided by the SEV-SNP
+firmware to get the attestation report.
+
+On success, the snp_ext_report_resp.data will contain the attestation report
+and snp_ext_report_req.certs_address will contain the certificate blob. If the
+length of the blob is smaller than expected then snp_ext_report_req.certs_len will
+be updated with the expected value.
+
+See GHCB specification for further detail on how to parse the certificate blob.
+
+3. SEV-SNP CPUID Enforcement
+============================
+
+SEV-SNP guests can access a special page that contains a table of CPUID values
+that have been validated by the PSP as part of the SNP_LAUNCH_UPDATE firmware
+command. It provides the following assurances regarding the validity of CPUID
+values:
+
+ - Its address is obtained via bootloader/firmware (via CC blob), and those
+ binaries will be measured as part of the SEV-SNP attestation report.
+ - Its initial state will be encrypted/pvalidated, so attempts to modify
+ it during run-time will result in garbage being written, or #VC exceptions
+ being generated due to changes in validation state if the hypervisor tries
+ to swap the backing page.
+ - Attempts to bypass PSP checks by the hypervisor by using a normal page, or
+ a non-CPUID encrypted page will change the measurement provided by the
+ SEV-SNP attestation report.
+ - The CPUID page contents are *not* measured, but attempts to modify the
+ expected contents of a CPUID page as part of guest initialization will be
+ gated by the PSP CPUID enforcement policy checks performed on the page
+ during SNP_LAUNCH_UPDATE, and noticeable later if the guest owner
+ implements their own checks of the CPUID values.
+
+It is important to note that this last assurance is only useful if the kernel
+has taken care to make use of the SEV-SNP CPUID throughout all stages of boot.
+Otherwise, guest owner attestation provides no assurance that the kernel wasn't
+fed incorrect values at some point during boot.
+
+
+Reference
+---------
+
+SEV-SNP and GHCB specification: developer.amd.com/sev
+
+The driver is based on SEV-SNP firmware spec 0.9 and GHCB spec version 2.0.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index edea7fe..40ad0d2 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -13,6 +13,7 @@
guest-halt-polling
ne_overview
acrn/index
+ coco/sevguest
.. only:: html and subproject
diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
index ccb7e86..eaecb5d 100644
--- a/Documentation/x86/x86_64/boot-options.rst
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -317,3 +317,17 @@
Do not use GB pages for kernel direct mappings.
gbpages
Use GB pages for kernel direct mappings.
+
+
+AMD SEV (Secure Encrypted Virtualization)
+=========================================
+Options relating to AMD SEV, specified via the following format:
+
+::
+
+ sev=option1[,option2]
+
+The available options are:
+
+ debug
+ Enable debug messages.
diff --git a/Documentation/x86/zero-page.rst b/Documentation/x86/zero-page.rst
index f088f58..45aa9cc 100644
--- a/Documentation/x86/zero-page.rst
+++ b/Documentation/x86/zero-page.rst
@@ -19,6 +19,7 @@
058/008 ALL tboot_addr Physical address of tboot shared page
060/010 ALL ist_info Intel SpeedStep (IST) BIOS support information
(struct ist_info)
+070/008 ALL acpi_rsdp_addr Physical address of ACPI RSDP table
080/010 ALL hd0_info hd0 disk parameter, OBSOLETE!!
090/010 ALL hd1_info hd1 disk parameter, OBSOLETE!!
0A0/010 ALL sys_desc_table System description table (struct sys_desc_table),
@@ -27,6 +28,7 @@
0C0/004 ALL ext_ramdisk_image ramdisk_image high 32bits
0C4/004 ALL ext_ramdisk_size ramdisk_size high 32bits
0C8/004 ALL ext_cmd_line_ptr cmd_line_ptr high 32bits
+13C/004 ALL cc_blob_address Physical address of Confidential Computing blob
140/080 ALL edid_info Video mode setup (struct edid_info)
1C0/020 ALL efi_info EFI 32 information (struct efi_info)
1E0/004 ALL alt_mem_k Alternative mem check, in KB
diff --git a/PRESUBMIT.cfg b/PRESUBMIT.cfg
new file mode 100644
index 0000000..e42c329
--- /dev/null
+++ b/PRESUBMIT.cfg
@@ -0,0 +1,14 @@
+[Hook Overrides]
+# Make sure cos_patch trailer is present
+cos_patch_trailer_check: true
+
+aosp_license_check: false
+cros_license_check: false
+long_line_check: false
+stray_whitespace_check: false
+tab_check: false
+tabbed_indent_required_check: false
+signoff_check: true
+
+# Make sure RELEASE_NOTE field is present.
+release_note_field_check: true
diff --git a/arch/arm64/configs/google/xfstest.config b/arch/arm64/configs/google/xfstest.config
new file mode 100644
index 0000000..fb9126e
--- /dev/null
+++ b/arch/arm64/configs/google/xfstest.config
@@ -0,0 +1,25 @@
+# Configurations required to run xfs tests
+CONFIG_MODULE_SIG=n
+CONFIG_MODULE_SIG_ALL=n
+CONFIG_SECURITY_LOADPIN=n
+CONFIG_SECURITY_LOADPIN_ENFORCE=n
+CONFIG_SECURITY_YAMA=n
+CONFIG_SECURITY_LOCKDOWN_LSM=n
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=n
+CONFIG_LSM=""
+CONFIG_SYSTEM_TRUSTED_KEYRING=n
+CONFIG_SECONDARY_TRUSTED_KEYRING=n
+CONFIG_VFAT_FS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
+CONFIG_FAT_DEFAULT_UTF8=y
+CONFIG_GVE=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_NFSD=y
+CONFIG_NFSD_V4=y
diff --git a/arch/arm64/configs/lakitu_defconfig b/arch/arm64/configs/lakitu_defconfig
new file mode 100644
index 0000000..5622253
--- /dev/null
+++ b/arch/arm64/configs/lakitu_defconfig
@@ -0,0 +1,4175 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/arm64 5.15.152 Kernel Configuration
+#
+CONFIG_CC_VERSION_TEXT="Chromium OS 15.0_pre458507_p20220602-r18 clang version 15.0.0 (/var/tmp/portage/sys-devel/llvm-15.0_pre458507_p20220602-r18/work/llvm-15.0_pre458507_p20220602/clang a58d0af058038595c93de961b725f86997cf8d4a)"
+CONFIG_GCC_VERSION=0
+CONFIG_CC_IS_CLANG=y
+CONFIG_CLANG_VERSION=150000
+CONFIG_AS_IS_LLVM=y
+CONFIG_AS_VERSION=150000
+CONFIG_LD_VERSION=0
+CONFIG_LD_IS_LLD=y
+CONFIG_LLD_VERSION=150000
+CONFIG_CC_CAN_LINK=y
+CONFIG_CC_CAN_LINK_STATIC=y
+CONFIG_CC_HAS_ASM_GOTO=y
+CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
+CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
+CONFIG_TOOLS_SUPPORT_RELR=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
+CONFIG_PAHOLE_VERSION=121
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+# CONFIG_WERROR is not set
+CONFIG_LOCALVERSION=""
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_BUILD_SALT=""
+CONFIG_DEFAULT_INIT=""
+CONFIG_DEFAULT_HOSTNAME="localhost"
+CONFIG_SWAP=y
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POSIX_MQUEUE_SYSCTL=y
+# CONFIG_WATCH_QUEUE is not set
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_USELIB=y
+CONFIG_AUDIT=y
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+CONFIG_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
+CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
+CONFIG_GENERIC_IRQ_MIGRATION=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_DOMAIN_HIERARCHY=y
+CONFIG_GENERIC_IRQ_IPI=y
+CONFIG_GENERIC_MSI_IRQ=y
+CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
+CONFIG_HANDLE_DOMAIN_IRQ=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# CONFIG_GENERIC_IRQ_DEBUGFS is not set
+# end of IRQ subsystem
+
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_ARCH_HAS_TICK_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ_COMMON=y
+# CONFIG_HZ_PERIODIC is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+# end of Timers subsystem
+
+CONFIG_BPF=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+
+#
+# BPF subsystem
+#
+CONFIG_BPF_SYSCALL=y
+CONFIG_BPF_JIT=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_JIT_DEFAULT_ON=y
+# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
+# CONFIG_BPF_PRELOAD is not set
+# CONFIG_BPF_LSM is not set
+# end of BPF subsystem
+
+CONFIG_PREEMPT_NONE=y
+# CONFIG_PREEMPT_VOLUNTARY is not set
+# CONFIG_PREEMPT is not set
+CONFIG_SCHED_CORE=y
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+CONFIG_HAVE_SCHED_AVG_IRQ=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
+CONFIG_PSI_DEFAULT_DISABLED=y
+# end of CPU/Task time and stats accounting
+
+CONFIG_CPU_ISOLATION=y
+
+#
+# RCU Subsystem
+#
+CONFIG_TREE_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TREE_SRCU=y
+CONFIG_TASKS_RCU_GENERIC=y
+CONFIG_TASKS_RUDE_RCU=y
+CONFIG_TASKS_TRACE_RCU=y
+CONFIG_RCU_STALL_COMMON=y
+CONFIG_RCU_NEED_SEGCBLIST=y
+# end of RCU Subsystem
+
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_IKHEADERS=m
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
+# CONFIG_PRINTK_INDEX is not set
+CONFIG_GENERIC_SCHED_CLOCK=y
+
+#
+# Scheduler features
+#
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_CC_HAS_INT128=y
+CONFIG_ARCH_SUPPORTS_INT128=y
+# CONFIG_NUMA_BALANCING is not set
+CONFIG_CGROUPS=y
+CONFIG_PAGE_COUNTER=y
+CONFIG_MEMCG=y
+CONFIG_MEMCG_SWAP=y
+CONFIG_MEMCG_KMEM=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CGROUP_WRITEBACK=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_RDMA=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CPUSETS=y
+CONFIG_PROC_PID_CPUSET=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_BPF=y
+# CONFIG_CGROUP_MISC is not set
+# CONFIG_CGROUP_DEBUG is not set
+CONFIG_SOCK_CGROUP_DATA=y
+CONFIG_NAMESPACES=y
+CONFIG_UTS_NS=y
+CONFIG_TIME_NS=y
+CONFIG_IPC_NS=y
+CONFIG_USER_NS=y
+CONFIG_PID_NS=y
+CONFIG_NET_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+# CONFIG_SCHED_AUTOGROUP is not set
+# CONFIG_SYSFS_DEPRECATED is not set
+CONFIG_RELAY=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_INITRAMFS_SOURCE=""
+CONFIG_RD_GZIP=y
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
+CONFIG_RD_XZ=y
+# CONFIG_RD_LZO is not set
+CONFIG_RD_LZ4=y
+CONFIG_RD_ZSTD=y
+# CONFIG_BOOT_CONFIG is not set
+CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
+CONFIG_LD_ORPHAN_WARN=y
+CONFIG_SYSCTL=y
+CONFIG_HAVE_UID16=y
+CONFIG_SYSCTL_EXCEPTION_TRACE=y
+CONFIG_EXPERT=y
+CONFIG_UID16=y
+CONFIG_MULTIUSER=y
+CONFIG_SGETMASK_SYSCALL=y
+CONFIG_SYSFS_SYSCALL=y
+CONFIG_FHANDLE=y
+CONFIG_POSIX_TIMERS=y
+CONFIG_PRINTK=y
+CONFIG_BUG=y
+CONFIG_ELF_CORE=y
+CONFIG_BASE_FULL=y
+CONFIG_FUTEX=y
+CONFIG_FUTEX_PI=y
+CONFIG_HAVE_FUTEX_CMPXCHG=y
+CONFIG_EPOLL=y
+CONFIG_SIGNALFD=y
+CONFIG_TIMERFD=y
+CONFIG_EVENTFD=y
+CONFIG_SHMEM=y
+CONFIG_AIO=y
+CONFIG_IO_URING=y
+CONFIG_ADVISE_SYSCALLS=y
+CONFIG_MEMBARRIER=y
+CONFIG_KALLSYMS=y
+# CONFIG_KALLSYMS_ALL is not set
+CONFIG_KALLSYMS_BASE_RELATIVE=y
+# CONFIG_USERFAULTFD is not set
+CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
+CONFIG_KCMP=y
+CONFIG_RSEQ=y
+# CONFIG_DEBUG_RSEQ is not set
+CONFIG_EMBEDDED=y
+CONFIG_HAVE_PERF_EVENTS=y
+# CONFIG_PC104 is not set
+
+#
+# Kernel Performance Events And Counters
+#
+CONFIG_PERF_EVENTS=y
+# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
+# end of Kernel Performance Events And Counters
+
+CONFIG_VM_EVENT_COUNTERS=y
+CONFIG_SLUB_DEBUG=y
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_SLAB is not set
+CONFIG_SLUB=y
+# CONFIG_SLOB is not set
+CONFIG_SLAB_MERGE_DEFAULT=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
+CONFIG_SLUB_CPU_PARTIAL=y
+CONFIG_SYSTEM_DATA_VERIFICATION=y
+CONFIG_PROFILING=y
+CONFIG_TRACEPOINTS=y
+# end of General setup
+
+CONFIG_ARM64=y
+CONFIG_64BIT=y
+CONFIG_MMU=y
+CONFIG_ARM64_PAGE_SHIFT=12
+CONFIG_ARM64_CONT_PTE_SHIFT=4
+CONFIG_ARM64_CONT_PMD_SHIFT=4
+CONFIG_ARCH_MMAP_RND_BITS_MIN=18
+CONFIG_ARCH_MMAP_RND_BITS_MAX=33
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=11
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
+CONFIG_STACKTRACE_SUPPORT=y
+CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
+CONFIG_LOCKDEP_SUPPORT=y
+CONFIG_GENERIC_BUG=y
+CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
+CONFIG_GENERIC_HWEIGHT=y
+CONFIG_GENERIC_CSUM=y
+CONFIG_GENERIC_CALIBRATE_DELAY=y
+CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
+CONFIG_SMP=y
+CONFIG_KERNEL_MODE_NEON=y
+CONFIG_FIX_EARLYCON_MEM=y
+CONFIG_PGTABLE_LEVELS=4
+CONFIG_ARCH_SUPPORTS_UPROBES=y
+CONFIG_ARCH_PROC_KCORE_TEXT=y
+
+#
+# Platform selection
+#
+# CONFIG_ARCH_ACTIONS is not set
+# CONFIG_ARCH_SUNXI is not set
+# CONFIG_ARCH_ALPINE is not set
+# CONFIG_ARCH_APPLE is not set
+# CONFIG_ARCH_BCM2835 is not set
+# CONFIG_ARCH_BCM4908 is not set
+# CONFIG_ARCH_BCM_IPROC is not set
+# CONFIG_ARCH_BERLIN is not set
+# CONFIG_ARCH_BITMAIN is not set
+# CONFIG_ARCH_BRCMSTB is not set
+# CONFIG_ARCH_EXYNOS is not set
+# CONFIG_ARCH_SPARX5 is not set
+# CONFIG_ARCH_K3 is not set
+# CONFIG_ARCH_LAYERSCAPE is not set
+# CONFIG_ARCH_LG1K is not set
+# CONFIG_ARCH_HISI is not set
+# CONFIG_ARCH_KEEMBAY is not set
+# CONFIG_ARCH_MEDIATEK is not set
+# CONFIG_ARCH_MESON is not set
+# CONFIG_ARCH_MVEBU is not set
+# CONFIG_ARCH_MXC is not set
+# CONFIG_ARCH_QCOM is not set
+# CONFIG_ARCH_REALTEK is not set
+# CONFIG_ARCH_RENESAS is not set
+# CONFIG_ARCH_ROCKCHIP is not set
+# CONFIG_ARCH_S32 is not set
+# CONFIG_ARCH_SEATTLE is not set
+# CONFIG_ARCH_INTEL_SOCFPGA is not set
+# CONFIG_ARCH_SYNQUACER is not set
+# CONFIG_ARCH_TEGRA is not set
+# CONFIG_ARCH_SPRD is not set
+# CONFIG_ARCH_THUNDER is not set
+# CONFIG_ARCH_THUNDER2 is not set
+# CONFIG_ARCH_UNIPHIER is not set
+# CONFIG_ARCH_VEXPRESS is not set
+# CONFIG_ARCH_VISCONTI is not set
+# CONFIG_ARCH_XGENE is not set
+# CONFIG_ARCH_ZYNQMP is not set
+# end of Platform selection
+
+#
+# Kernel Features
+#
+
+#
+# ARM errata workarounds via the alternatives framework
+#
+CONFIG_ARM64_WORKAROUND_CLEAN_CACHE=y
+CONFIG_ARM64_ERRATUM_826319=y
+CONFIG_ARM64_ERRATUM_827319=y
+CONFIG_ARM64_ERRATUM_824069=y
+CONFIG_ARM64_ERRATUM_819472=y
+CONFIG_ARM64_ERRATUM_832075=y
+CONFIG_ARM64_ERRATUM_1742098=y
+CONFIG_ARM64_ERRATUM_845719=y
+CONFIG_ARM64_ERRATUM_843419=y
+CONFIG_ARM64_LD_HAS_FIX_ERRATUM_843419=y
+CONFIG_ARM64_ERRATUM_1024718=y
+CONFIG_ARM64_ERRATUM_1418040=y
+CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT=y
+CONFIG_ARM64_ERRATUM_1165522=y
+CONFIG_ARM64_ERRATUM_1319367=y
+CONFIG_ARM64_ERRATUM_1530923=y
+CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y
+CONFIG_ARM64_ERRATUM_2441007=y
+CONFIG_ARM64_ERRATUM_1286807=y
+CONFIG_ARM64_ERRATUM_1463225=y
+CONFIG_ARM64_ERRATUM_1542419=y
+CONFIG_ARM64_ERRATUM_1508412=y
+CONFIG_ARM64_ERRATUM_2441009=y
+CONFIG_ARM64_ERRATUM_2457168=y
+CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y
+CONFIG_ARM64_ERRATUM_2054223=y
+CONFIG_ARM64_ERRATUM_2067961=y
+CONFIG_CAVIUM_ERRATUM_22375=y
+CONFIG_CAVIUM_ERRATUM_23144=y
+CONFIG_CAVIUM_ERRATUM_23154=y
+CONFIG_CAVIUM_ERRATUM_27456=y
+CONFIG_CAVIUM_ERRATUM_30115=y
+CONFIG_CAVIUM_TX2_ERRATUM_219=y
+CONFIG_FUJITSU_ERRATUM_010001=y
+CONFIG_HISILICON_ERRATUM_161600802=y
+CONFIG_QCOM_FALKOR_ERRATUM_1003=y
+CONFIG_QCOM_FALKOR_ERRATUM_1009=y
+CONFIG_QCOM_QDF2400_ERRATUM_0065=y
+CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
+CONFIG_NVIDIA_CARMEL_CNP_ERRATUM=y
+CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
+# end of ARM errata workarounds via the alternatives framework
+
+CONFIG_ARM64_4K_PAGES=y
+# CONFIG_ARM64_16K_PAGES is not set
+# CONFIG_ARM64_64K_PAGES is not set
+# CONFIG_ARM64_VA_BITS_39 is not set
+CONFIG_ARM64_VA_BITS_48=y
+CONFIG_ARM64_VA_BITS=48
+CONFIG_ARM64_PA_BITS_48=y
+CONFIG_ARM64_PA_BITS=48
+# CONFIG_CPU_BIG_ENDIAN is not set
+CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_SCHED_MC=y
+CONFIG_SCHED_SMT=y
+CONFIG_NR_CPUS=512
+CONFIG_HOTPLUG_CPU=y
+CONFIG_NUMA=y
+CONFIG_NODES_SHIFT=6
+CONFIG_USE_PERCPU_NUMA_NODE_ID=y
+CONFIG_HAVE_SETUP_PER_CPU_AREA=y
+CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
+# CONFIG_HZ_100 is not set
+# CONFIG_HZ_250 is not set
+# CONFIG_HZ_300 is not set
+CONFIG_HZ_1000=y
+CONFIG_HZ=1000
+CONFIG_SCHED_HRTICK=y
+CONFIG_ARCH_SPARSEMEM_ENABLE=y
+CONFIG_HW_PERF_EVENTS=y
+CONFIG_CC_HAVE_SHADOW_CALL_STACK=y
+CONFIG_PARAVIRT=y
+CONFIG_PARAVIRT_TIME_ACCOUNTING=y
+# CONFIG_KEXEC is not set
+CONFIG_KEXEC_FILE=y
+# CONFIG_KEXEC_SIG is not set
+# CONFIG_CRASH_DUMP is not set
+CONFIG_XEN_DOM0=y
+CONFIG_XEN=y
+CONFIG_FORCE_MAX_ZONEORDER=11
+CONFIG_UNMAP_KERNEL_AT_EL0=y
+CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY=y
+CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
+# CONFIG_ARM64_SW_TTBR0_PAN is not set
+CONFIG_ARM64_TAGGED_ADDR_ABI=y
+CONFIG_COMPAT=y
+CONFIG_KUSER_HELPERS=y
+# CONFIG_COMPAT_VDSO is not set
+# CONFIG_ARMV8_DEPRECATED is not set
+
+#
+# ARMv8.1 architectural features
+#
+CONFIG_ARM64_HW_AFDBM=y
+CONFIG_ARM64_PAN=y
+CONFIG_AS_HAS_LDAPR=y
+CONFIG_AS_HAS_LSE_ATOMICS=y
+# end of ARMv8.1 architectural features
+
+#
+# ARMv8.2 architectural features
+#
+# CONFIG_ARM64_PMEM is not set
+CONFIG_ARM64_RAS_EXTN=y
+CONFIG_ARM64_CNP=y
+# end of ARMv8.2 architectural features
+
+#
+# ARMv8.3 architectural features
+#
+CONFIG_ARM64_PTR_AUTH=y
+CONFIG_ARM64_PTR_AUTH_KERNEL=y
+CONFIG_CC_HAS_BRANCH_PROT_PAC_RET=y
+CONFIG_CC_HAS_SIGN_RETURN_ADDRESS=y
+CONFIG_AS_HAS_PAC=y
+CONFIG_AS_HAS_CFI_NEGATE_RA_STATE=y
+# end of ARMv8.3 architectural features
+
+#
+# ARMv8.4 architectural features
+#
+CONFIG_ARM64_AMU_EXTN=y
+CONFIG_AS_HAS_ARMV8_4=y
+CONFIG_ARM64_TLB_RANGE=y
+# end of ARMv8.4 architectural features
+
+#
+# ARMv8.5 architectural features
+#
+CONFIG_AS_HAS_ARMV8_5=y
+CONFIG_ARM64_BTI=y
+CONFIG_ARM64_BTI_KERNEL=y
+CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI=y
+CONFIG_ARM64_E0PD=y
+CONFIG_ARCH_RANDOM=y
+CONFIG_ARM64_AS_HAS_MTE=y
+CONFIG_ARM64_MTE=y
+# end of ARMv8.5 architectural features
+
+#
+# ARMv8.7 architectural features
+#
+CONFIG_ARM64_EPAN=y
+# end of ARMv8.7 architectural features
+
+CONFIG_ARM64_SVE=y
+CONFIG_ARM64_MODULE_PLTS=y
+# CONFIG_ARM64_PSEUDO_NMI is not set
+CONFIG_RELOCATABLE=y
+CONFIG_RANDOMIZE_BASE=y
+CONFIG_RANDOMIZE_MODULE_REGION_FULL=y
+CONFIG_CC_HAVE_STACKPROTECTOR_SYSREG=y
+CONFIG_STACKPROTECTOR_PER_TASK=y
+# end of Kernel Features
+
+#
+# Boot options
+#
+# CONFIG_ARM64_ACPI_PARKING_PROTOCOL is not set
+CONFIG_CMDLINE=""
+CONFIG_EFI_STUB=y
+CONFIG_EFI=y
+CONFIG_DMI=y
+# end of Boot options
+
+CONFIG_SYSVIPC_COMPAT=y
+
+#
+# Power management options
+#
+CONFIG_SUSPEND=y
+CONFIG_SUSPEND_FREEZER=y
+# CONFIG_SUSPEND_SKIP_SYNC is not set
+# CONFIG_HIBERNATION is not set
+CONFIG_PM_SLEEP=y
+CONFIG_PM_SLEEP_SMP=y
+# CONFIG_PM_AUTOSLEEP is not set
+# CONFIG_PM_WAKELOCKS is not set
+CONFIG_PM=y
+CONFIG_PM_DEBUG=y
+# CONFIG_PM_ADVANCED_DEBUG is not set
+# CONFIG_PM_TEST_SUSPEND is not set
+CONFIG_PM_SLEEP_DEBUG=y
+# CONFIG_DPM_WATCHDOG is not set
+CONFIG_PM_CLK=y
+# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
+CONFIG_CPU_PM=y
+# CONFIG_ENERGY_MODEL is not set
+CONFIG_ARCH_HIBERNATION_POSSIBLE=y
+CONFIG_ARCH_SUSPEND_POSSIBLE=y
+# end of Power management options
+
+#
+# CPU Power Management
+#
+
+#
+# CPU Idle
+#
+CONFIG_CPU_IDLE=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPU_IDLE_GOV_MENU=y
+# CONFIG_CPU_IDLE_GOV_TEO is not set
+
+#
+# ARM CPU Idle Drivers
+#
+# CONFIG_ARM_CPUIDLE is not set
+# CONFIG_ARM_PSCI_CPUIDLE is not set
+# end of ARM CPU Idle Drivers
+# end of CPU Idle
+
+#
+# CPU Frequency scaling
+#
+CONFIG_CPU_FREQ=y
+CONFIG_CPU_FREQ_STAT=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
+# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
+
+#
+# CPU frequency scaling drivers
+#
+# CONFIG_CPUFREQ_DT is not set
+# CONFIG_ACPI_CPPC_CPUFREQ is not set
+# end of CPU Frequency scaling
+# end of CPU Power Management
+
+CONFIG_ARCH_SUPPORTS_ACPI=y
+CONFIG_ACPI=y
+CONFIG_ACPI_GENERIC_GSI=y
+CONFIG_ACPI_CCA_REQUIRED=y
+# CONFIG_ACPI_DEBUGGER is not set
+CONFIG_ACPI_SPCR_TABLE=y
+# CONFIG_ACPI_EC_DEBUGFS is not set
+# CONFIG_ACPI_AC is not set
+# CONFIG_ACPI_BATTERY is not set
+CONFIG_ACPI_BUTTON=y
+# CONFIG_ACPI_FAN is not set
+# CONFIG_ACPI_TAD is not set
+# CONFIG_ACPI_DOCK is not set
+CONFIG_ACPI_PROCESSOR_IDLE=y
+CONFIG_ACPI_MCFG=y
+CONFIG_ACPI_PROCESSOR=y
+CONFIG_ACPI_HOTPLUG_CPU=y
+CONFIG_ACPI_THERMAL=y
+CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
+# CONFIG_ACPI_TABLE_UPGRADE is not set
+# CONFIG_ACPI_DEBUG is not set
+# CONFIG_ACPI_PCI_SLOT is not set
+CONFIG_ACPI_CONTAINER=y
+# CONFIG_ACPI_HED is not set
+# CONFIG_ACPI_CUSTOM_METHOD is not set
+# CONFIG_ACPI_BGRT is not set
+CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y
+CONFIG_ACPI_NUMA=y
+# CONFIG_ACPI_HMAT is not set
+CONFIG_HAVE_ACPI_APEI=y
+# CONFIG_ACPI_APEI is not set
+# CONFIG_ACPI_CONFIGFS is not set
+CONFIG_ACPI_IORT=y
+CONFIG_ACPI_GTDT=y
+CONFIG_ACPI_PPTT=y
+# CONFIG_PMIC_OPREGION is not set
+CONFIG_IRQ_BYPASS_MANAGER=m
+CONFIG_VIRTUALIZATION=y
+# CONFIG_KVM is not set
+CONFIG_ARM64_CRYPTO=y
+# CONFIG_CRYPTO_SHA256_ARM64 is not set
+# CONFIG_CRYPTO_SHA512_ARM64 is not set
+# CONFIG_CRYPTO_SHA1_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA2_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA512_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA3_ARM64 is not set
+# CONFIG_CRYPTO_SM3_ARM64_CE is not set
+# CONFIG_CRYPTO_SM4_ARM64_CE is not set
+# CONFIG_CRYPTO_GHASH_ARM64_CE is not set
+# CONFIG_CRYPTO_CRCT10DIF_ARM64_CE is not set
+# CONFIG_CRYPTO_AES_ARM64 is not set
+# CONFIG_CRYPTO_AES_ARM64_CE is not set
+# CONFIG_CRYPTO_AES_ARM64_CE_CCM is not set
+# CONFIG_CRYPTO_AES_ARM64_CE_BLK is not set
+# CONFIG_CRYPTO_AES_ARM64_NEON_BLK is not set
+CONFIG_CRYPTO_CHACHA20_NEON=m
+CONFIG_CRYPTO_POLY1305_NEON=m
+# CONFIG_CRYPTO_NHPOLY1305_NEON is not set
+# CONFIG_CRYPTO_AES_ARM64_BS is not set
+
+#
+# General architecture-dependent options
+#
+CONFIG_CRASH_CORE=y
+CONFIG_KEXEC_CORE=y
+CONFIG_HAVE_IMA_KEXEC=y
+CONFIG_KPROBES=y
+# CONFIG_JUMP_LABEL is not set
+CONFIG_UPROBES=y
+CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
+CONFIG_KRETPROBES=y
+CONFIG_HAVE_KPROBES=y
+CONFIG_HAVE_KRETPROBES=y
+CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
+CONFIG_HAVE_NMI=y
+CONFIG_TRACE_IRQFLAGS_SUPPORT=y
+CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
+CONFIG_HAVE_ARCH_TRACEHOOK=y
+CONFIG_HAVE_DMA_CONTIGUOUS=y
+CONFIG_GENERIC_SMP_IDLE_THREAD=y
+CONFIG_GENERIC_IDLE_POLL_SETUP=y
+CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
+CONFIG_ARCH_HAS_KEEPINITRD=y
+CONFIG_ARCH_HAS_SET_MEMORY=y
+CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
+CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
+CONFIG_ARCH_WANTS_NO_INSTR=y
+CONFIG_HAVE_ASM_MODVERSIONS=y
+CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
+CONFIG_HAVE_RSEQ=y
+CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
+CONFIG_HAVE_HW_BREAKPOINT=y
+CONFIG_HAVE_PERF_REGS=y
+CONFIG_HAVE_PERF_USER_STACK_DUMP=y
+CONFIG_HAVE_ARCH_JUMP_LABEL=y
+CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
+CONFIG_MMU_GATHER_TABLE_FREE=y
+CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
+CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
+CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
+CONFIG_HAVE_CMPXCHG_LOCAL=y
+CONFIG_HAVE_CMPXCHG_DOUBLE=y
+CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
+CONFIG_HAVE_ARCH_SECCOMP=y
+CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
+CONFIG_SECCOMP=y
+CONFIG_SECCOMP_FILTER=y
+# CONFIG_SECCOMP_CACHE_DEBUG is not set
+CONFIG_HAVE_ARCH_STACKLEAK=y
+CONFIG_HAVE_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR_STRONG=y
+CONFIG_ARCH_SUPPORTS_SHADOW_CALL_STACK=y
+# CONFIG_SHADOW_CALL_STACK is not set
+CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
+CONFIG_LTO_NONE=y
+CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
+CONFIG_HAVE_CONTEXT_TRACKING=y
+CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
+CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
+CONFIG_HAVE_MOVE_PUD=y
+CONFIG_HAVE_MOVE_PMD=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
+CONFIG_HAVE_ARCH_HUGE_VMAP=y
+CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
+CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
+CONFIG_MODULES_USE_ELF_RELA=y
+CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
+CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
+CONFIG_ARCH_MMAP_RND_BITS=31
+CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=11
+CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y
+CONFIG_CLONE_BACKWARDS=y
+CONFIG_OLD_SIGSUSPEND3=y
+CONFIG_COMPAT_OLD_SIGACTION=y
+CONFIG_COMPAT_32BIT_TIME=y
+CONFIG_HAVE_ARCH_VMAP_STACK=y
+CONFIG_VMAP_STACK=y
+CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
+# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
+CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
+CONFIG_STRICT_MODULE_RWX=y
+CONFIG_HAVE_ARCH_COMPILER_H=y
+CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
+CONFIG_ARCH_USE_MEMREMAP_PROT=y
+# CONFIG_LOCK_EVENT_COUNTS is not set
+CONFIG_ARCH_HAS_RELR=y
+CONFIG_RELR=y
+CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
+CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
+
+#
+# GCOV-based kernel profiling
+#
+# CONFIG_GCOV_KERNEL is not set
+CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
+# end of GCOV-based kernel profiling
+
+CONFIG_HAVE_GCC_PLUGINS=y
+# end of General architecture-dependent options
+
+CONFIG_RT_MUTEXES=y
+CONFIG_BASE_SMALL=0
+CONFIG_MODULE_SIG_FORMAT=y
+CONFIG_MODULES=y
+# CONFIG_MODULE_FORCE_LOAD is not set
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_FORCE_UNLOAD=y
+# CONFIG_MODVERSIONS is not set
+# CONFIG_MODULE_SRCVERSION_ALL is not set
+CONFIG_MODULE_SIG=y
+# CONFIG_MODULE_SIG_FORCE is not set
+CONFIG_MODULE_SIG_ALL=y
+# CONFIG_MODULE_SIG_SHA1 is not set
+# CONFIG_MODULE_SIG_SHA224 is not set
+CONFIG_MODULE_SIG_SHA256=y
+# CONFIG_MODULE_SIG_SHA384 is not set
+# CONFIG_MODULE_SIG_SHA512 is not set
+CONFIG_MODULE_SIG_HASH="sha256"
+CONFIG_MODULE_COMPRESS_NONE=y
+# CONFIG_MODULE_COMPRESS_GZIP is not set
+# CONFIG_MODULE_COMPRESS_XZ is not set
+# CONFIG_MODULE_COMPRESS_ZSTD is not set
+# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
+CONFIG_MODPROBE_PATH="/sbin/modprobe"
+# CONFIG_TRIM_UNUSED_KSYMS is not set
+CONFIG_MODULES_TREE_LOOKUP=y
+CONFIG_BLOCK=y
+CONFIG_BLK_CGROUP_RWSTAT=y
+CONFIG_BLK_DEV_BSG_COMMON=y
+CONFIG_BLK_DEV_BSGLIB=y
+CONFIG_BLK_DEV_INTEGRITY=y
+CONFIG_BLK_DEV_INTEGRITY_T10=y
+# CONFIG_BLK_DEV_ZONED is not set
+CONFIG_BLK_DEV_THROTTLING=y
+# CONFIG_BLK_DEV_THROTTLING_LOW is not set
+CONFIG_BLK_WBT=y
+CONFIG_BLK_WBT_MQ=y
+# CONFIG_BLK_CGROUP_IOLATENCY is not set
+# CONFIG_BLK_CGROUP_IOCOST is not set
+# CONFIG_BLK_CGROUP_IOPRIO is not set
+# CONFIG_BLK_DEBUG_FS is not set
+# CONFIG_BLK_SED_OPAL is not set
+# CONFIG_BLK_INLINE_ENCRYPTION is not set
+
+#
+# Partition Types
+#
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_ACORN_PARTITION is not set
+# CONFIG_AIX_PARTITION is not set
+# CONFIG_OSF_PARTITION is not set
+# CONFIG_AMIGA_PARTITION is not set
+# CONFIG_ATARI_PARTITION is not set
+# CONFIG_MAC_PARTITION is not set
+CONFIG_MSDOS_PARTITION=y
+# CONFIG_BSD_DISKLABEL is not set
+# CONFIG_MINIX_SUBPARTITION is not set
+# CONFIG_SOLARIS_X86_PARTITION is not set
+# CONFIG_UNIXWARE_DISKLABEL is not set
+# CONFIG_LDM_PARTITION is not set
+# CONFIG_SGI_PARTITION is not set
+# CONFIG_ULTRIX_PARTITION is not set
+# CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
+CONFIG_EFI_PARTITION=y
+# CONFIG_SYSV68_PARTITION is not set
+# CONFIG_CMDLINE_PARTITION is not set
+# end of Partition Types
+
+CONFIG_BLOCK_COMPAT=y
+CONFIG_BLK_MQ_PCI=y
+CONFIG_BLK_MQ_VIRTIO=y
+CONFIG_BLK_MQ_RDMA=y
+CONFIG_BLK_PM=y
+CONFIG_BLOCK_HOLDER_DEPRECATED=y
+
+#
+# IO Schedulers
+#
+CONFIG_MQ_IOSCHED_DEADLINE=y
+CONFIG_MQ_IOSCHED_KYBER=m
+CONFIG_IOSCHED_BFQ=m
+CONFIG_BFQ_GROUP_IOSCHED=y
+# CONFIG_BFQ_CGROUP_DEBUG is not set
+# end of IO Schedulers
+
+CONFIG_ASN1=y
+CONFIG_ARCH_INLINE_SPIN_TRYLOCK=y
+CONFIG_ARCH_INLINE_SPIN_TRYLOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_LOCK=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_INLINE_READ_LOCK=y
+CONFIG_ARCH_INLINE_READ_LOCK_BH=y
+CONFIG_ARCH_INLINE_READ_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_READ_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_READ_UNLOCK=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_INLINE_WRITE_LOCK=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_BH=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_SPIN_TRYLOCK=y
+CONFIG_INLINE_SPIN_TRYLOCK_BH=y
+CONFIG_INLINE_SPIN_LOCK=y
+CONFIG_INLINE_SPIN_LOCK_BH=y
+CONFIG_INLINE_SPIN_LOCK_IRQ=y
+CONFIG_INLINE_SPIN_LOCK_IRQSAVE=y
+CONFIG_INLINE_SPIN_UNLOCK_BH=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_READ_LOCK=y
+CONFIG_INLINE_READ_LOCK_BH=y
+CONFIG_INLINE_READ_LOCK_IRQ=y
+CONFIG_INLINE_READ_LOCK_IRQSAVE=y
+CONFIG_INLINE_READ_UNLOCK=y
+CONFIG_INLINE_READ_UNLOCK_BH=y
+CONFIG_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_INLINE_READ_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_WRITE_LOCK=y
+CONFIG_INLINE_WRITE_LOCK_BH=y
+CONFIG_INLINE_WRITE_LOCK_IRQ=y
+CONFIG_INLINE_WRITE_LOCK_IRQSAVE=y
+CONFIG_INLINE_WRITE_UNLOCK=y
+CONFIG_INLINE_WRITE_UNLOCK_BH=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
+CONFIG_MUTEX_SPIN_ON_OWNER=y
+CONFIG_RWSEM_SPIN_ON_OWNER=y
+CONFIG_LOCK_SPIN_ON_OWNER=y
+CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
+CONFIG_QUEUED_SPINLOCKS=y
+CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
+CONFIG_QUEUED_RWLOCKS=y
+CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
+CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
+CONFIG_FREEZER=y
+
+#
+# Executable file formats
+#
+CONFIG_BINFMT_ELF=y
+CONFIG_COMPAT_BINFMT_ELF=y
+CONFIG_ARCH_BINFMT_ELF_STATE=y
+CONFIG_ARCH_HAVE_ELF_PROT=y
+CONFIG_ARCH_USE_GNU_PROPERTY=y
+CONFIG_ELFCORE=y
+CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
+CONFIG_BINFMT_SCRIPT=y
+CONFIG_BINFMT_MISC=y
+CONFIG_COREDUMP=y
+# end of Executable file formats
+
+#
+# Memory Management options
+#
+CONFIG_SPARSEMEM=y
+CONFIG_SPARSEMEM_EXTREME=y
+CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
+CONFIG_SPARSEMEM_VMEMMAP=y
+CONFIG_HAVE_FAST_GUP=y
+CONFIG_ARCH_KEEP_MEMBLOCK=y
+CONFIG_MEMORY_ISOLATION=y
+CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
+# CONFIG_MEMORY_HOTPLUG is not set
+CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
+CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
+CONFIG_MEMORY_BALLOON=y
+CONFIG_BALLOON_COMPACTION=y
+CONFIG_COMPACTION=y
+CONFIG_PAGE_REPORTING=y
+CONFIG_MIGRATION=y
+CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
+CONFIG_ARCH_ENABLE_THP_MIGRATION=y
+CONFIG_CONTIG_ALLOC=y
+CONFIG_PHYS_ADDR_T_64BIT=y
+CONFIG_MMU_NOTIFIER=y
+# CONFIG_KSM is not set
+CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
+CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
+# CONFIG_MEMORY_FAILURE is not set
+CONFIG_TRANSPARENT_HUGEPAGE=y
+# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
+# CONFIG_CLEANCACHE is not set
+# CONFIG_FRONTSWAP is not set
+CONFIG_CMA=y
+# CONFIG_CMA_DEBUG is not set
+# CONFIG_CMA_DEBUGFS is not set
+# CONFIG_CMA_SYSFS is not set
+CONFIG_CMA_AREAS=7
+# CONFIG_ZPOOL is not set
+CONFIG_ZSMALLOC=m
+# CONFIG_ZSMALLOC_STAT is not set
+CONFIG_GENERIC_EARLY_IOREMAP=y
+# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
+# CONFIG_IDLE_PAGE_TRACKING is not set
+CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
+CONFIG_ARCH_HAS_PTE_DEVMAP=y
+CONFIG_ARCH_HAS_ZONE_DMA_SET=y
+CONFIG_ZONE_DMA=y
+CONFIG_ZONE_DMA32=y
+CONFIG_HMM_MIRROR=y
+CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
+# CONFIG_PERCPU_STATS is not set
+# CONFIG_GUP_TEST is not set
+# CONFIG_READ_ONLY_THP_FOR_FS is not set
+CONFIG_ARCH_HAS_PTE_SPECIAL=y
+
+#
+# Data Access Monitoring
+#
+# CONFIG_DAMON is not set
+# end of Data Access Monitoring
+# end of Memory Management options
+
+CONFIG_NET=y
+CONFIG_NET_INGRESS=y
+CONFIG_NET_EGRESS=y
+CONFIG_NET_REDIRECT=y
+CONFIG_SKB_EXTENSIONS=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+CONFIG_PACKET_DIAG=m
+CONFIG_UNIX=y
+CONFIG_UNIX_SCM=y
+CONFIG_AF_UNIX_OOB=y
+CONFIG_UNIX_DIAG=m
+CONFIG_TLS=y
+CONFIG_TLS_DEVICE=y
+# CONFIG_TLS_TOE is not set
+CONFIG_XFRM=y
+CONFIG_XFRM_ALGO=y
+CONFIG_XFRM_USER=y
+CONFIG_XFRM_INTERFACE=m
+# CONFIG_XFRM_SUB_POLICY is not set
+# CONFIG_XFRM_MIGRATE is not set
+CONFIG_XFRM_STATISTICS=y
+CONFIG_XFRM_AH=m
+CONFIG_XFRM_ESP=m
+CONFIG_XFRM_IPCOMP=m
+CONFIG_NET_KEY=m
+# CONFIG_NET_KEY_MIGRATE is not set
+# CONFIG_SMC is not set
+CONFIG_XDP_SOCKETS=y
+# CONFIG_XDP_SOCKETS_DIAG is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+# CONFIG_IP_FIB_TRIE_STATS is not set
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_ROUTE_CLASSID=y
+# CONFIG_IP_PNP is not set
+CONFIG_NET_IPIP=m
+# CONFIG_NET_IPGRE_DEMUX is not set
+CONFIG_NET_IP_TUNNEL=m
+CONFIG_IP_MROUTE_COMMON=y
+CONFIG_IP_MROUTE=y
+# CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_SYN_COOKIES=y
+# CONFIG_NET_IPVTI is not set
+CONFIG_NET_UDP_TUNNEL=m
+CONFIG_NET_FOU=m
+CONFIG_NET_FOU_IP_TUNNELS=y
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+# CONFIG_INET_ESP_OFFLOAD is not set
+# CONFIG_INET_ESPINTCP is not set
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_TABLE_PERTURB_ORDER=16
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+# CONFIG_INET_RAW_DIAG is not set
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_TCP_CONG_ADVANCED=y
+# CONFIG_TCP_CONG_BIC is not set
+CONFIG_TCP_CONG_CUBIC=y
+# CONFIG_TCP_CONG_WESTWOOD is not set
+# CONFIG_TCP_CONG_HTCP is not set
+# CONFIG_TCP_CONG_HSTCP is not set
+# CONFIG_TCP_CONG_HYBLA is not set
+# CONFIG_TCP_CONG_VEGAS is not set
+# CONFIG_TCP_CONG_NV is not set
+# CONFIG_TCP_CONG_SCALABLE is not set
+CONFIG_TCP_CONG_LP=m
+# CONFIG_TCP_CONG_VENO is not set
+# CONFIG_TCP_CONG_YEAH is not set
+# CONFIG_TCP_CONG_ILLINOIS is not set
+# CONFIG_TCP_CONG_DCTCP is not set
+# CONFIG_TCP_CONG_CDG is not set
+CONFIG_TCP_CONG_BBR=m
+CONFIG_DEFAULT_CUBIC=y
+# CONFIG_DEFAULT_RENO is not set
+CONFIG_DEFAULT_TCP_CONG="cubic"
+CONFIG_TCP_MD5SIG=y
+CONFIG_IPV6=y
+# CONFIG_IPV6_ROUTER_PREF is not set
+# CONFIG_IPV6_OPTIMISTIC_DAD is not set
+# CONFIG_INET6_AH is not set
+CONFIG_INET6_ESP=m
+# CONFIG_INET6_ESP_OFFLOAD is not set
+# CONFIG_INET6_ESPINTCP is not set
+# CONFIG_INET6_IPCOMP is not set
+# CONFIG_IPV6_MIP6 is not set
+# CONFIG_IPV6_ILA is not set
+CONFIG_INET6_TUNNEL=m
+# CONFIG_IPV6_VTI is not set
+# CONFIG_IPV6_SIT is not set
+CONFIG_IPV6_TUNNEL=m
+CONFIG_IPV6_FOU=m
+CONFIG_IPV6_FOU_TUNNEL=m
+CONFIG_IPV6_MULTIPLE_TABLES=y
+# CONFIG_IPV6_SUBTREES is not set
+# CONFIG_IPV6_MROUTE is not set
+# CONFIG_IPV6_SEG6_LWTUNNEL is not set
+# CONFIG_IPV6_SEG6_HMAC is not set
+# CONFIG_IPV6_RPL_LWTUNNEL is not set
+# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
+# CONFIG_NETLABEL is not set
+# CONFIG_MPTCP is not set
+CONFIG_NETWORK_SECMARK=y
+CONFIG_NET_PTP_CLASSIFY=y
+# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
+CONFIG_NETFILTER=y
+CONFIG_NETFILTER_ADVANCED=y
+CONFIG_BRIDGE_NETFILTER=m
+
+#
+# Core Netfilter Configuration
+#
+CONFIG_NETFILTER_INGRESS=y
+CONFIG_NETFILTER_NETLINK=y
+CONFIG_NETFILTER_FAMILY_BRIDGE=y
+CONFIG_NETFILTER_FAMILY_ARP=y
+# CONFIG_NETFILTER_NETLINK_HOOK is not set
+CONFIG_NETFILTER_NETLINK_ACCT=m
+CONFIG_NETFILTER_NETLINK_QUEUE=y
+CONFIG_NETFILTER_NETLINK_LOG=y
+CONFIG_NETFILTER_NETLINK_OSF=m
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_LOG_SYSLOG=m
+CONFIG_NETFILTER_CONNCOUNT=m
+CONFIG_NF_CONNTRACK_MARK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_ZONES=y
+CONFIG_NF_CONNTRACK_PROCFS=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CONNTRACK_TIMEOUT=y
+CONFIG_NF_CONNTRACK_TIMESTAMP=y
+CONFIG_NF_CONNTRACK_LABELS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_GRE=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_H323=m
+CONFIG_NF_CONNTRACK_IRC=m
+CONFIG_NF_CONNTRACK_BROADCAST=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_SNMP=m
+CONFIG_NF_CONNTRACK_PPTP=m
+CONFIG_NF_CONNTRACK_SANE=m
+CONFIG_NF_CONNTRACK_SIP=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_NF_NAT=m
+CONFIG_NF_NAT_AMANDA=m
+CONFIG_NF_NAT_FTP=m
+CONFIG_NF_NAT_IRC=m
+CONFIG_NF_NAT_SIP=m
+CONFIG_NF_NAT_TFTP=m
+CONFIG_NF_NAT_REDIRECT=y
+CONFIG_NF_NAT_MASQUERADE=y
+CONFIG_NETFILTER_SYNPROXY=y
+CONFIG_NF_TABLES=m
+CONFIG_NF_TABLES_INET=y
+# CONFIG_NF_TABLES_NETDEV is not set
+CONFIG_NFT_NUMGEN=m
+CONFIG_NFT_CT=m
+# CONFIG_NFT_FLOW_OFFLOAD is not set
+CONFIG_NFT_COUNTER=m
+CONFIG_NFT_CONNLIMIT=m
+CONFIG_NFT_LOG=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_MASQ=m
+CONFIG_NFT_REDIR=m
+CONFIG_NFT_NAT=m
+CONFIG_NFT_TUNNEL=m
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_QUEUE=m
+CONFIG_NFT_QUOTA=m
+CONFIG_NFT_REJECT=m
+CONFIG_NFT_REJECT_INET=m
+CONFIG_NFT_COMPAT=m
+CONFIG_NFT_HASH=m
+CONFIG_NFT_XFRM=m
+CONFIG_NFT_SOCKET=m
+CONFIG_NFT_OSF=m
+CONFIG_NFT_TPROXY=m
+CONFIG_NFT_SYNPROXY=m
+# CONFIG_NF_FLOW_TABLE_INET is not set
+CONFIG_NF_FLOW_TABLE=m
+CONFIG_NETFILTER_XTABLES=y
+CONFIG_NETFILTER_XTABLES_COMPAT=y
+
+#
+# Xtables combined modules
+#
+CONFIG_NETFILTER_XT_MARK=m
+CONFIG_NETFILTER_XT_CONNMARK=m
+CONFIG_NETFILTER_XT_SET=m
+
+#
+# Xtables targets
+#
+CONFIG_NETFILTER_XT_TARGET_AUDIT=m
+CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
+CONFIG_NETFILTER_XT_TARGET_CT=m
+CONFIG_NETFILTER_XT_TARGET_DSCP=m
+CONFIG_NETFILTER_XT_TARGET_HL=m
+CONFIG_NETFILTER_XT_TARGET_HMARK=m
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
+CONFIG_NETFILTER_XT_TARGET_LOG=m
+CONFIG_NETFILTER_XT_TARGET_MARK=m
+CONFIG_NETFILTER_XT_NAT=m
+CONFIG_NETFILTER_XT_TARGET_NETMAP=m
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
+CONFIG_NETFILTER_XT_TARGET_RATEEST=m
+CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
+CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
+CONFIG_NETFILTER_XT_TARGET_TEE=m
+CONFIG_NETFILTER_XT_TARGET_TPROXY=m
+CONFIG_NETFILTER_XT_TARGET_TRACE=m
+CONFIG_NETFILTER_XT_TARGET_SECMARK=m
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
+CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
+
+#
+# Xtables matches
+#
+CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
+CONFIG_NETFILTER_XT_MATCH_BPF=m
+CONFIG_NETFILTER_XT_MATCH_CGROUP=m
+CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
+CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_CPU=m
+CONFIG_NETFILTER_XT_MATCH_DCCP=m
+CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
+CONFIG_NETFILTER_XT_MATCH_DSCP=m
+CONFIG_NETFILTER_XT_MATCH_ECN=m
+CONFIG_NETFILTER_XT_MATCH_ESP=m
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_HELPER=m
+CONFIG_NETFILTER_XT_MATCH_HL=m
+CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
+CONFIG_NETFILTER_XT_MATCH_IPVS=m
+CONFIG_NETFILTER_XT_MATCH_L2TP=m
+CONFIG_NETFILTER_XT_MATCH_LENGTH=m
+CONFIG_NETFILTER_XT_MATCH_LIMIT=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_MARK=m
+CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
+CONFIG_NETFILTER_XT_MATCH_NFACCT=m
+CONFIG_NETFILTER_XT_MATCH_OSF=m
+CONFIG_NETFILTER_XT_MATCH_OWNER=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_QUOTA=m
+CONFIG_NETFILTER_XT_MATCH_RATEEST=m
+CONFIG_NETFILTER_XT_MATCH_REALM=y
+CONFIG_NETFILTER_XT_MATCH_RECENT=m
+CONFIG_NETFILTER_XT_MATCH_SCTP=m
+CONFIG_NETFILTER_XT_MATCH_SOCKET=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
+CONFIG_NETFILTER_XT_MATCH_STRING=m
+CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
+CONFIG_NETFILTER_XT_MATCH_TIME=m
+CONFIG_NETFILTER_XT_MATCH_U32=m
+# end of Core Netfilter Configuration
+
+CONFIG_IP_SET=m
+CONFIG_IP_SET_MAX=256
+CONFIG_IP_SET_BITMAP_IP=m
+CONFIG_IP_SET_BITMAP_IPMAC=m
+CONFIG_IP_SET_BITMAP_PORT=m
+CONFIG_IP_SET_HASH_IP=m
+CONFIG_IP_SET_HASH_IPMARK=m
+CONFIG_IP_SET_HASH_IPPORT=m
+CONFIG_IP_SET_HASH_IPPORTIP=m
+CONFIG_IP_SET_HASH_IPPORTNET=m
+# CONFIG_IP_SET_HASH_IPMAC is not set
+CONFIG_IP_SET_HASH_MAC=m
+CONFIG_IP_SET_HASH_NETPORTNET=m
+CONFIG_IP_SET_HASH_NET=m
+CONFIG_IP_SET_HASH_NETNET=m
+CONFIG_IP_SET_HASH_NETPORT=m
+CONFIG_IP_SET_HASH_NETIFACE=m
+CONFIG_IP_SET_LIST_SET=m
+CONFIG_IP_VS=m
+# CONFIG_IP_VS_IPV6 is not set
+# CONFIG_IP_VS_DEBUG is not set
+CONFIG_IP_VS_TAB_BITS=12
+
+#
+# IPVS transport protocol load balancing support
+#
+CONFIG_IP_VS_PROTO_TCP=y
+CONFIG_IP_VS_PROTO_UDP=y
+CONFIG_IP_VS_PROTO_AH_ESP=y
+CONFIG_IP_VS_PROTO_ESP=y
+CONFIG_IP_VS_PROTO_AH=y
+CONFIG_IP_VS_PROTO_SCTP=y
+
+#
+# IPVS scheduler
+#
+CONFIG_IP_VS_RR=m
+CONFIG_IP_VS_WRR=m
+CONFIG_IP_VS_LC=m
+CONFIG_IP_VS_WLC=m
+CONFIG_IP_VS_FO=m
+CONFIG_IP_VS_OVF=m
+CONFIG_IP_VS_LBLC=m
+CONFIG_IP_VS_LBLCR=m
+CONFIG_IP_VS_DH=m
+CONFIG_IP_VS_SH=m
+# CONFIG_IP_VS_MH is not set
+CONFIG_IP_VS_SED=m
+CONFIG_IP_VS_NQ=m
+# CONFIG_IP_VS_TWOS is not set
+
+#
+# IPVS SH scheduler
+#
+CONFIG_IP_VS_SH_TAB_BITS=8
+
+#
+# IPVS MH scheduler
+#
+CONFIG_IP_VS_MH_TAB_INDEX=12
+
+#
+# IPVS application helper
+#
+CONFIG_IP_VS_FTP=m
+CONFIG_IP_VS_NFCT=y
+CONFIG_IP_VS_PE_SIP=m
+
+#
+# IP: Netfilter Configuration
+#
+CONFIG_NF_DEFRAG_IPV4=y
+CONFIG_NF_SOCKET_IPV4=m
+CONFIG_NF_TPROXY_IPV4=m
+CONFIG_NF_TABLES_IPV4=y
+CONFIG_NFT_REJECT_IPV4=m
+# CONFIG_NFT_DUP_IPV4 is not set
+# CONFIG_NFT_FIB_IPV4 is not set
+# CONFIG_NF_TABLES_ARP is not set
+# CONFIG_NF_FLOW_TABLE_IPV4 is not set
+CONFIG_NF_DUP_IPV4=m
+CONFIG_NF_LOG_ARP=m
+CONFIG_NF_LOG_IPV4=m
+CONFIG_NF_REJECT_IPV4=y
+CONFIG_NF_NAT_SNMP_BASIC=m
+CONFIG_NF_NAT_PPTP=m
+CONFIG_NF_NAT_H323=m
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_RPFILTER=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_TARGET_SYNPROXY=m
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_TARGET_CLUSTERIP=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_SECURITY=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+# end of IP: Netfilter Configuration
+
+#
+# IPv6: Netfilter Configuration
+#
+CONFIG_NF_SOCKET_IPV6=m
+CONFIG_NF_TPROXY_IPV6=m
+CONFIG_NF_TABLES_IPV6=y
+CONFIG_NFT_REJECT_IPV6=m
+# CONFIG_NFT_DUP_IPV6 is not set
+# CONFIG_NFT_FIB_IPV6 is not set
+# CONFIG_NF_FLOW_TABLE_IPV6 is not set
+CONFIG_NF_DUP_IPV6=m
+CONFIG_NF_REJECT_IPV6=y
+CONFIG_NF_LOG_IPV6=m
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=m
+# CONFIG_IP6_NF_MATCH_EUI64 is not set
+# CONFIG_IP6_NF_MATCH_FRAG is not set
+# CONFIG_IP6_NF_MATCH_OPTS is not set
+# CONFIG_IP6_NF_MATCH_HL is not set
+# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
+# CONFIG_IP6_NF_MATCH_MH is not set
+CONFIG_IP6_NF_MATCH_RPFILTER=m
+# CONFIG_IP6_NF_MATCH_RT is not set
+# CONFIG_IP6_NF_MATCH_SRH is not set
+# CONFIG_IP6_NF_TARGET_HL is not set
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_TARGET_SYNPROXY=y
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_RAW=m
+CONFIG_IP6_NF_SECURITY=m
+CONFIG_IP6_NF_NAT=m
+CONFIG_IP6_NF_TARGET_MASQUERADE=m
+# CONFIG_IP6_NF_TARGET_NPT is not set
+# end of IPv6: Netfilter Configuration
+
+CONFIG_NF_DEFRAG_IPV6=y
+# CONFIG_NF_TABLES_BRIDGE is not set
+# CONFIG_NF_CONNTRACK_BRIDGE is not set
+CONFIG_BRIDGE_NF_EBTABLES=m
+CONFIG_BRIDGE_EBT_BROUTE=m
+CONFIG_BRIDGE_EBT_T_FILTER=m
+CONFIG_BRIDGE_EBT_T_NAT=m
+CONFIG_BRIDGE_EBT_802_3=m
+CONFIG_BRIDGE_EBT_AMONG=m
+CONFIG_BRIDGE_EBT_ARP=m
+CONFIG_BRIDGE_EBT_IP=m
+# CONFIG_BRIDGE_EBT_IP6 is not set
+CONFIG_BRIDGE_EBT_LIMIT=m
+CONFIG_BRIDGE_EBT_MARK=m
+CONFIG_BRIDGE_EBT_PKTTYPE=m
+CONFIG_BRIDGE_EBT_STP=m
+CONFIG_BRIDGE_EBT_VLAN=m
+CONFIG_BRIDGE_EBT_ARPREPLY=m
+CONFIG_BRIDGE_EBT_DNAT=m
+CONFIG_BRIDGE_EBT_MARK_T=m
+CONFIG_BRIDGE_EBT_REDIRECT=m
+CONFIG_BRIDGE_EBT_SNAT=m
+CONFIG_BRIDGE_EBT_LOG=m
+CONFIG_BRIDGE_EBT_NFLOG=m
+# CONFIG_BPFILTER is not set
+# CONFIG_IP_DCCP is not set
+# CONFIG_IP_SCTP is not set
+# CONFIG_RDS is not set
+# CONFIG_TIPC is not set
+# CONFIG_ATM is not set
+# CONFIG_L2TP is not set
+CONFIG_STP=y
+CONFIG_BRIDGE=y
+CONFIG_BRIDGE_IGMP_SNOOPING=y
+CONFIG_BRIDGE_VLAN_FILTERING=y
+# CONFIG_BRIDGE_MRP is not set
+# CONFIG_BRIDGE_CFM is not set
+# CONFIG_NET_DSA is not set
+CONFIG_VLAN_8021Q=m
+# CONFIG_VLAN_8021Q_GVRP is not set
+# CONFIG_VLAN_8021Q_MVRP is not set
+CONFIG_LLC=y
+# CONFIG_LLC2 is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_PHONET is not set
+# CONFIG_6LOWPAN is not set
+# CONFIG_IEEE802154 is not set
+CONFIG_NET_SCHED=y
+
+#
+# Queueing/Scheduling
+#
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_HFSC=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_MULTIQ=m
+CONFIG_NET_SCH_RED=m
+CONFIG_NET_SCH_SFB=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TEQL=m
+CONFIG_NET_SCH_TBF=m
+# CONFIG_NET_SCH_CBS is not set
+# CONFIG_NET_SCH_ETF is not set
+# CONFIG_NET_SCH_TAPRIO is not set
+CONFIG_NET_SCH_GRED=m
+CONFIG_NET_SCH_NETEM=m
+CONFIG_NET_SCH_DRR=m
+CONFIG_NET_SCH_MQPRIO=m
+# CONFIG_NET_SCH_SKBPRIO is not set
+CONFIG_NET_SCH_CHOKE=m
+CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
+# CONFIG_NET_SCH_CAKE is not set
+CONFIG_NET_SCH_FQ=m
+CONFIG_NET_SCH_HHF=m
+CONFIG_NET_SCH_PIE=m
+# CONFIG_NET_SCH_FQ_PIE is not set
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_SCH_PLUG=m
+# CONFIG_NET_SCH_ETS is not set
+# CONFIG_NET_SCH_DEFAULT is not set
+
+#
+# Classification
+#
+CONFIG_NET_CLS=y
+CONFIG_NET_CLS_BASIC=m
+CONFIG_NET_CLS_ROUTE4=m
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+# CONFIG_CLS_U32_PERF is not set
+CONFIG_CLS_U32_MARK=y
+# CONFIG_NET_CLS_FLOW is not set
+CONFIG_NET_CLS_CGROUP=m
+CONFIG_NET_CLS_BPF=m
+# CONFIG_NET_CLS_FLOWER is not set
+# CONFIG_NET_CLS_MATCHALL is not set
+# CONFIG_NET_EMATCH is not set
+CONFIG_NET_CLS_ACT=y
+# CONFIG_NET_ACT_POLICE is not set
+CONFIG_NET_ACT_GACT=m
+# CONFIG_GACT_PROB is not set
+CONFIG_NET_ACT_MIRRED=y
+# CONFIG_NET_ACT_SAMPLE is not set
+# CONFIG_NET_ACT_IPT is not set
+CONFIG_NET_ACT_NAT=m
+CONFIG_NET_ACT_PEDIT=y
+# CONFIG_NET_ACT_SIMP is not set
+# CONFIG_NET_ACT_SKBEDIT is not set
+# CONFIG_NET_ACT_CSUM is not set
+# CONFIG_NET_ACT_MPLS is not set
+# CONFIG_NET_ACT_VLAN is not set
+# CONFIG_NET_ACT_BPF is not set
+# CONFIG_NET_ACT_CONNMARK is not set
+# CONFIG_NET_ACT_CTINFO is not set
+# CONFIG_NET_ACT_SKBMOD is not set
+# CONFIG_NET_ACT_IFE is not set
+# CONFIG_NET_ACT_TUNNEL_KEY is not set
+# CONFIG_NET_ACT_CT is not set
+# CONFIG_NET_ACT_GATE is not set
+# CONFIG_NET_TC_SKB_EXT is not set
+CONFIG_NET_SCH_FIFO=y
+# CONFIG_DCB is not set
+CONFIG_DNS_RESOLVER=m
+# CONFIG_BATMAN_ADV is not set
+# CONFIG_OPENVSWITCH is not set
+CONFIG_VSOCKETS=y
+CONFIG_VSOCKETS_DIAG=y
+CONFIG_VSOCKETS_LOOPBACK=y
+CONFIG_VIRTIO_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS_COMMON=y
+CONFIG_NETLINK_DIAG=m
+# CONFIG_MPLS is not set
+# CONFIG_NET_NSH is not set
+# CONFIG_HSR is not set
+# CONFIG_NET_SWITCHDEV is not set
+CONFIG_NET_L3_MASTER_DEV=y
+# CONFIG_QRTR is not set
+# CONFIG_NET_NCSI is not set
+CONFIG_PCPU_DEV_REFCNT=y
+CONFIG_RPS=y
+CONFIG_RFS_ACCEL=y
+CONFIG_SOCK_RX_QUEUE_MAPPING=y
+CONFIG_XPS=y
+CONFIG_CGROUP_NET_PRIO=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_NET_RX_BUSY_POLL=y
+CONFIG_BQL=y
+CONFIG_BPF_STREAM_PARSER=y
+CONFIG_NET_FLOW_LIMIT=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_NET_DROP_MONITOR is not set
+# end of Network testing
+# end of Networking options
+
+# CONFIG_HAMRADIO is not set
+# CONFIG_CAN is not set
+# CONFIG_BT is not set
+# CONFIG_AF_RXRPC is not set
+# CONFIG_AF_KCM is not set
+CONFIG_STREAM_PARSER=y
+# CONFIG_MCTP is not set
+CONFIG_FIB_RULES=y
+# CONFIG_WIRELESS is not set
+# CONFIG_RFKILL is not set
+CONFIG_NET_9P=y
+CONFIG_NET_9P_VIRTIO=y
+# CONFIG_NET_9P_XEN is not set
+# CONFIG_NET_9P_RDMA is not set
+# CONFIG_NET_9P_DEBUG is not set
+# CONFIG_CAIF is not set
+# CONFIG_CEPH_LIB is not set
+# CONFIG_NFC is not set
+# CONFIG_PSAMPLE is not set
+# CONFIG_NET_IFE is not set
+CONFIG_LWTUNNEL=y
+CONFIG_LWTUNNEL_BPF=y
+CONFIG_DST_CACHE=y
+CONFIG_GRO_CELLS=y
+CONFIG_SOCK_VALIDATE_XMIT=y
+CONFIG_NET_SELFTESTS=y
+CONFIG_NET_SOCK_MSG=y
+CONFIG_NET_DEVLINK=y
+CONFIG_PAGE_POOL=y
+CONFIG_FAILOVER=y
+CONFIG_ETHTOOL_NETLINK=y
+
+#
+# Device Drivers
+#
+CONFIG_ARM_AMBA=y
+CONFIG_HAVE_PCI=y
+CONFIG_PCI=y
+CONFIG_PCI_DOMAINS=y
+CONFIG_PCI_DOMAINS_GENERIC=y
+CONFIG_PCI_SYSCALL=y
+CONFIG_PCIEPORTBUS=y
+CONFIG_HOTPLUG_PCI_PCIE=y
+CONFIG_PCIEAER=y
+# CONFIG_PCIEAER_INJECT is not set
+# CONFIG_PCIE_ECRC is not set
+CONFIG_PCIEASPM=y
+CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
+# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIE_PME=y
+# CONFIG_PCIE_DPC is not set
+# CONFIG_PCIE_PTM is not set
+CONFIG_PCI_MSI=y
+CONFIG_PCI_MSI_IRQ_DOMAIN=y
+CONFIG_PCI_QUIRKS=y
+# CONFIG_PCI_DEBUG is not set
+# CONFIG_PCI_STUB is not set
+CONFIG_PCI_ECAM=y
+# CONFIG_PCI_IOV is not set
+# CONFIG_PCI_PRI is not set
+# CONFIG_PCI_PASID is not set
+CONFIG_PCI_LABEL=y
+# CONFIG_PCIE_BUS_TUNE_OFF is not set
+CONFIG_PCIE_BUS_DEFAULT=y
+# CONFIG_PCIE_BUS_SAFE is not set
+# CONFIG_PCIE_BUS_PERFORMANCE is not set
+# CONFIG_PCIE_BUS_PEER2PEER is not set
+CONFIG_HOTPLUG_PCI=y
+CONFIG_HOTPLUG_PCI_ACPI=y
+# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
+# CONFIG_HOTPLUG_PCI_CPCI is not set
+# CONFIG_HOTPLUG_PCI_SHPC is not set
+
+#
+# PCI controller drivers
+#
+# CONFIG_PCI_FTPCI100 is not set
+# CONFIG_PCI_HOST_GENERIC is not set
+# CONFIG_PCIE_XILINX is not set
+# CONFIG_PCI_XGENE is not set
+# CONFIG_PCIE_ALTERA is not set
+# CONFIG_PCI_HOST_THUNDER_PEM is not set
+# CONFIG_PCI_HOST_THUNDER_ECAM is not set
+# CONFIG_PCIE_MICROCHIP_HOST is not set
+
+#
+# DesignWare PCI Core Support
+#
+# CONFIG_PCIE_DW_PLAT_HOST is not set
+# CONFIG_PCI_HISI is not set
+# CONFIG_PCIE_KIRIN is not set
+# CONFIG_PCI_MESON is not set
+# CONFIG_PCIE_AL is not set
+# end of DesignWare PCI Core Support
+
+#
+# Mobiveil PCIe Core Support
+#
+# end of Mobiveil PCIe Core Support
+
+#
+# Cadence PCIe controllers support
+#
+# CONFIG_PCIE_CADENCE_PLAT_HOST is not set
+# CONFIG_PCI_J721E_HOST is not set
+# end of Cadence PCIe controllers support
+# end of PCI controller drivers
+
+#
+# PCI Endpoint
+#
+# CONFIG_PCI_ENDPOINT is not set
+# end of PCI Endpoint
+
+#
+# PCI switch controller drivers
+#
+# CONFIG_PCI_SW_SWITCHTEC is not set
+# end of PCI switch controller drivers
+
+# CONFIG_CXL_BUS is not set
+# CONFIG_PCCARD is not set
+# CONFIG_RAPIDIO is not set
+
+#
+# Generic Driver Options
+#
+CONFIG_AUXILIARY_BUS=y
+CONFIG_UEVENT_HELPER=y
+CONFIG_UEVENT_HELPER_PATH=""
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DEVTMPFS_SAFE=y
+CONFIG_STANDALONE=y
+CONFIG_PREVENT_FIRMWARE_BUILD=y
+
+#
+# Firmware loader
+#
+CONFIG_FW_LOADER=y
+CONFIG_EXTRA_FIRMWARE=""
+# CONFIG_FW_LOADER_USER_HELPER is not set
+# CONFIG_FW_LOADER_COMPRESS is not set
+CONFIG_FW_CACHE=y
+# end of Firmware loader
+
+CONFIG_ALLOW_DEV_COREDUMP=y
+# CONFIG_DEBUG_DRIVER is not set
+CONFIG_DEBUG_DEVRES=y
+# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
+# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
+CONFIG_SYS_HYPERVISOR=y
+CONFIG_GENERIC_CPU_AUTOPROBE=y
+CONFIG_GENERIC_CPU_VULNERABILITIES=y
+CONFIG_SOC_BUS=y
+CONFIG_DMA_SHARED_BUFFER=y
+# CONFIG_DMA_FENCE_TRACE is not set
+CONFIG_GENERIC_ARCH_TOPOLOGY=y
+CONFIG_GENERIC_ARCH_NUMA=y
+# end of Generic Driver Options
+
+#
+# Bus devices
+#
+# CONFIG_BRCMSTB_GISB_ARB is not set
+# CONFIG_VEXPRESS_CONFIG is not set
+# CONFIG_MHI_BUS is not set
+# end of Bus devices
+
+CONFIG_CONNECTOR=y
+CONFIG_PROC_EVENTS=y
+
+#
+# Firmware Drivers
+#
+
+#
+# ARM System Control and Management Interface Protocol
+#
+# CONFIG_ARM_SCMI_PROTOCOL is not set
+# end of ARM System Control and Management Interface Protocol
+
+# CONFIG_ARM_SCPI_PROTOCOL is not set
+# CONFIG_FIRMWARE_MEMMAP is not set
+CONFIG_DMIID=y
+# CONFIG_DMI_SYSFS is not set
+# CONFIG_ISCSI_IBFT is not set
+# CONFIG_FW_CFG_SYSFS is not set
+CONFIG_SYSFB=y
+# CONFIG_SYSFB_SIMPLEFB is not set
+# CONFIG_ARM_FFA_TRANSPORT is not set
+# CONFIG_GOOGLE_FIRMWARE is not set
+
+#
+# EFI (Extensible Firmware Interface) Support
+#
+CONFIG_EFI_ESRT=y
+CONFIG_EFI_VARS_PSTORE=y
+# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
+CONFIG_EFI_PARAMS_FROM_FDT=y
+CONFIG_EFI_RUNTIME_WRAPPERS=y
+CONFIG_EFI_GENERIC_STUB=y
+CONFIG_EFI_ARMSTUB_DTB_LOADER=y
+CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
+# CONFIG_EFI_BOOTLOADER_CONTROL is not set
+# CONFIG_EFI_CAPSULE_LOADER is not set
+# CONFIG_EFI_TEST is not set
+# CONFIG_RESET_ATTACK_MITIGATION is not set
+# CONFIG_EFI_DISABLE_PCI_DMA is not set
+# end of EFI (Extensible Firmware Interface) Support
+
+CONFIG_EFI_EARLYCON=y
+# CONFIG_EFI_CUSTOM_SSDT_OVERLAYS is not set
+CONFIG_ARM_PSCI_FW=y
+# CONFIG_ARM_PSCI_CHECKER is not set
+CONFIG_HAVE_ARM_SMCCC=y
+CONFIG_HAVE_ARM_SMCCC_DISCOVERY=y
+CONFIG_ARM_SMCCC_SOC_ID=y
+
+#
+# Tegra firmware driver
+#
+# end of Tegra firmware driver
+# end of Firmware Drivers
+
+# CONFIG_GNSS is not set
+# CONFIG_MTD is not set
+CONFIG_DTC=y
+CONFIG_OF=y
+# CONFIG_OF_UNITTEST is not set
+CONFIG_OF_FLATTREE=y
+CONFIG_OF_EARLY_FLATTREE=y
+CONFIG_OF_KOBJ=y
+CONFIG_OF_ADDRESS=y
+CONFIG_OF_IRQ=y
+CONFIG_OF_RESERVED_MEM=y
+# CONFIG_OF_OVERLAY is not set
+CONFIG_OF_NUMA=y
+# CONFIG_PARPORT is not set
+CONFIG_PNP=y
+CONFIG_PNP_DEBUG_MESSAGES=y
+
+#
+# Protocols
+#
+CONFIG_PNPACPI=y
+CONFIG_BLK_DEV=y
+# CONFIG_BLK_DEV_NULL_BLK is not set
+CONFIG_CDROM=y
+# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
+CONFIG_ZRAM=m
+CONFIG_ZRAM_DEF_COMP_LZORLE=y
+# CONFIG_ZRAM_DEF_COMP_LZ4 is not set
+# CONFIG_ZRAM_DEF_COMP_LZO is not set
+CONFIG_ZRAM_DEF_COMP="lzo-rle"
+# CONFIG_ZRAM_WRITEBACK is not set
+# CONFIG_ZRAM_MEMORY_TRACKING is not set
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
+# CONFIG_BLK_DEV_CRYPTOLOOP is not set
+# CONFIG_BLK_DEV_DRBD is not set
+# CONFIG_BLK_DEV_NBD is not set
+# CONFIG_BLK_DEV_RAM is not set
+# CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set
+CONFIG_XEN_BLKDEV_FRONTEND=y
+CONFIG_XEN_BLKDEV_BACKEND=m
+CONFIG_VIRTIO_BLK=m
+# CONFIG_BLK_DEV_RBD is not set
+# CONFIG_BLK_DEV_RSXX is not set
+
+#
+# NVME Support
+#
+CONFIG_NVME_CORE=y
+CONFIG_BLK_DEV_NVME=y
+# CONFIG_NVME_MULTIPATH is not set
+# CONFIG_NVME_RDMA is not set
+# CONFIG_NVME_FC is not set
+# CONFIG_NVME_TCP is not set
+# CONFIG_NVME_TARGET is not set
+# end of NVME Support
+
+#
+# Misc devices
+#
+# CONFIG_DUMMY_IRQ is not set
+# CONFIG_PHANTOM is not set
+# CONFIG_TIFM_CORE is not set
+# CONFIG_ENCLOSURE_SERVICES is not set
+# CONFIG_HP_ILO is not set
+# CONFIG_SRAM is not set
+# CONFIG_DW_XDATA_PCIE is not set
+# CONFIG_PCI_ENDPOINT_TEST is not set
+# CONFIG_XILINX_SDFEC is not set
+# CONFIG_C2PORT is not set
+
+#
+# EEPROM support
+#
+CONFIG_EEPROM_93CX6=m
+# end of EEPROM support
+
+# CONFIG_CB710_CORE is not set
+
+#
+# Texas Instruments shared transport line discipline
+#
+# end of Texas Instruments shared transport line discipline
+
+#
+# Altera FPGA firmware download module (requires I2C)
+#
+# CONFIG_GENWQE is not set
+# CONFIG_ECHO is not set
+# CONFIG_BCM_VK is not set
+# CONFIG_MISC_ALCOR_PCI is not set
+# CONFIG_MISC_RTSX_PCI is not set
+# CONFIG_HABANA_AI is not set
+# CONFIG_UACCE is not set
+CONFIG_PVPANIC=y
+CONFIG_PVPANIC_MMIO=y
+# CONFIG_PVPANIC_PCI is not set
+# end of Misc devices
+
+#
+# SCSI device support
+#
+CONFIG_SCSI_MOD=y
+# CONFIG_RAID_ATTRS is not set
+CONFIG_SCSI_COMMON=y
+CONFIG_SCSI=y
+CONFIG_SCSI_DMA=y
+CONFIG_SCSI_PROC_FS=y
+
+#
+# SCSI support type (disk, tape, CD-ROM)
+#
+CONFIG_BLK_DEV_SD=y
+# CONFIG_CHR_DEV_ST is not set
+CONFIG_BLK_DEV_SR=y
+# CONFIG_CHR_DEV_SG is not set
+CONFIG_BLK_DEV_BSG=y
+# CONFIG_CHR_DEV_SCH is not set
+CONFIG_SCSI_CONSTANTS=y
+# CONFIG_SCSI_LOGGING is not set
+# CONFIG_SCSI_SCAN_ASYNC is not set
+
+#
+# SCSI Transports
+#
+CONFIG_SCSI_SPI_ATTRS=y
+# CONFIG_SCSI_FC_ATTRS is not set
+CONFIG_SCSI_ISCSI_ATTRS=m
+# CONFIG_SCSI_SAS_ATTRS is not set
+# CONFIG_SCSI_SAS_LIBSAS is not set
+# CONFIG_SCSI_SRP_ATTRS is not set
+# end of SCSI Transports
+
+CONFIG_SCSI_LOWLEVEL=y
+CONFIG_ISCSI_TCP=m
+# CONFIG_ISCSI_BOOT_SYSFS is not set
+# CONFIG_SCSI_CXGB3_ISCSI is not set
+# CONFIG_SCSI_CXGB4_ISCSI is not set
+# CONFIG_SCSI_BNX2_ISCSI is not set
+# CONFIG_BE2ISCSI is not set
+# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
+# CONFIG_SCSI_HPSA is not set
+# CONFIG_SCSI_3W_9XXX is not set
+# CONFIG_SCSI_3W_SAS is not set
+# CONFIG_SCSI_ACARD is not set
+# CONFIG_SCSI_AACRAID is not set
+# CONFIG_SCSI_AIC7XXX is not set
+# CONFIG_SCSI_AIC79XX is not set
+# CONFIG_SCSI_AIC94XX is not set
+# CONFIG_SCSI_HISI_SAS is not set
+# CONFIG_SCSI_MVSAS is not set
+# CONFIG_SCSI_MVUMI is not set
+# CONFIG_SCSI_DPT_I2O is not set
+# CONFIG_SCSI_ADVANSYS is not set
+# CONFIG_SCSI_ARCMSR is not set
+# CONFIG_SCSI_ESAS2R is not set
+# CONFIG_MEGARAID_NEWGEN is not set
+# CONFIG_MEGARAID_LEGACY is not set
+# CONFIG_MEGARAID_SAS is not set
+# CONFIG_SCSI_MPT3SAS is not set
+# CONFIG_SCSI_MPT2SAS is not set
+# CONFIG_SCSI_MPI3MR is not set
+# CONFIG_SCSI_SMARTPQI is not set
+# CONFIG_SCSI_UFSHCD is not set
+# CONFIG_SCSI_HPTIOP is not set
+# CONFIG_SCSI_MYRB is not set
+# CONFIG_SCSI_MYRS is not set
+CONFIG_XEN_SCSI_FRONTEND=m
+# CONFIG_SCSI_SNIC is not set
+# CONFIG_SCSI_DMX3191D is not set
+# CONFIG_SCSI_FDOMAIN_PCI is not set
+# CONFIG_SCSI_IPS is not set
+# CONFIG_SCSI_INITIO is not set
+# CONFIG_SCSI_INIA100 is not set
+# CONFIG_SCSI_STEX is not set
+# CONFIG_SCSI_SYM53C8XX_2 is not set
+# CONFIG_SCSI_IPR is not set
+# CONFIG_SCSI_QLOGIC_1280 is not set
+# CONFIG_SCSI_QLA_ISCSI is not set
+# CONFIG_SCSI_DC395x is not set
+# CONFIG_SCSI_AM53C974 is not set
+# CONFIG_SCSI_WD719X is not set
+# CONFIG_SCSI_DEBUG is not set
+# CONFIG_SCSI_PMCRAID is not set
+# CONFIG_SCSI_PM8001 is not set
+CONFIG_SCSI_VIRTIO=y
+# CONFIG_SCSI_DH is not set
+# end of SCSI device support
+
+CONFIG_HAVE_PATA_PLATFORM=y
+CONFIG_ATA=y
+CONFIG_SATA_HOST=y
+CONFIG_PATA_TIMINGS=y
+CONFIG_ATA_VERBOSE_ERROR=y
+CONFIG_ATA_FORCE=y
+CONFIG_ATA_ACPI=y
+# CONFIG_SATA_ZPODD is not set
+# CONFIG_SATA_PMP is not set
+
+#
+# Controllers with non-SFF native interface
+#
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_MOBILE_LPM_POLICY=0
+# CONFIG_SATA_AHCI_PLATFORM is not set
+# CONFIG_AHCI_CEVA is not set
+# CONFIG_AHCI_QORIQ is not set
+# CONFIG_SATA_INIC162X is not set
+# CONFIG_SATA_ACARD_AHCI is not set
+# CONFIG_SATA_SIL24 is not set
+CONFIG_ATA_SFF=y
+
+#
+# SFF controllers with custom DMA interface
+#
+# CONFIG_PDC_ADMA is not set
+# CONFIG_SATA_QSTOR is not set
+# CONFIG_SATA_SX4 is not set
+CONFIG_ATA_BMDMA=y
+
+#
+# SATA SFF controllers with BMDMA
+#
+CONFIG_ATA_PIIX=y
+# CONFIG_SATA_MV is not set
+# CONFIG_SATA_NV is not set
+# CONFIG_SATA_PROMISE is not set
+# CONFIG_SATA_SIL is not set
+# CONFIG_SATA_SIS is not set
+# CONFIG_SATA_SVW is not set
+# CONFIG_SATA_ULI is not set
+# CONFIG_SATA_VIA is not set
+# CONFIG_SATA_VITESSE is not set
+
+#
+# PATA SFF controllers with BMDMA
+#
+# CONFIG_PATA_ALI is not set
+# CONFIG_PATA_AMD is not set
+# CONFIG_PATA_ARTOP is not set
+# CONFIG_PATA_ATIIXP is not set
+# CONFIG_PATA_ATP867X is not set
+# CONFIG_PATA_CMD64X is not set
+# CONFIG_PATA_CYPRESS is not set
+# CONFIG_PATA_EFAR is not set
+# CONFIG_PATA_HPT366 is not set
+# CONFIG_PATA_HPT37X is not set
+# CONFIG_PATA_HPT3X2N is not set
+# CONFIG_PATA_HPT3X3 is not set
+# CONFIG_PATA_IT8213 is not set
+# CONFIG_PATA_IT821X is not set
+# CONFIG_PATA_JMICRON is not set
+# CONFIG_PATA_MARVELL is not set
+# CONFIG_PATA_NETCELL is not set
+# CONFIG_PATA_NINJA32 is not set
+# CONFIG_PATA_NS87415 is not set
+# CONFIG_PATA_OLDPIIX is not set
+# CONFIG_PATA_OPTIDMA is not set
+# CONFIG_PATA_PDC2027X is not set
+# CONFIG_PATA_PDC_OLD is not set
+# CONFIG_PATA_RADISYS is not set
+# CONFIG_PATA_RDC is not set
+# CONFIG_PATA_SCH is not set
+# CONFIG_PATA_SERVERWORKS is not set
+# CONFIG_PATA_SIL680 is not set
+# CONFIG_PATA_SIS is not set
+# CONFIG_PATA_TOSHIBA is not set
+# CONFIG_PATA_TRIFLEX is not set
+# CONFIG_PATA_VIA is not set
+# CONFIG_PATA_WINBOND is not set
+
+#
+# PIO-only SFF controllers
+#
+# CONFIG_PATA_CMD640_PCI is not set
+# CONFIG_PATA_MPIIX is not set
+# CONFIG_PATA_NS87410 is not set
+# CONFIG_PATA_OPTI is not set
+# CONFIG_PATA_PLATFORM is not set
+# CONFIG_PATA_RZ1000 is not set
+
+#
+# Generic fallback / legacy drivers
+#
+# CONFIG_PATA_ACPI is not set
+CONFIG_ATA_GENERIC=y
+# CONFIG_PATA_LEGACY is not set
+CONFIG_MD=y
+CONFIG_BLK_DEV_MD=y
+CONFIG_MD_AUTODETECT=y
+# CONFIG_MD_LINEAR is not set
+CONFIG_MD_RAID0=y
+CONFIG_MD_RAID1=m
+CONFIG_MD_RAID10=m
+CONFIG_MD_RAID456=m
+# CONFIG_MD_MULTIPATH is not set
+# CONFIG_MD_FAULTY is not set
+CONFIG_BCACHE=m
+# CONFIG_BCACHE_DEBUG is not set
+# CONFIG_BCACHE_CLOSURES_DEBUG is not set
+# CONFIG_BCACHE_ASYNC_REGISTRATION is not set
+CONFIG_BLK_DEV_DM_BUILTIN=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_DEBUG=y
+CONFIG_DM_BUFIO=y
+# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
+CONFIG_DM_BIO_PRISON=m
+CONFIG_DM_PERSISTENT_DATA=m
+# CONFIG_DM_UNSTRIPED is not set
+CONFIG_DM_CRYPT=y
+CONFIG_DM_SNAPSHOT=m
+CONFIG_DM_THIN_PROVISIONING=m
+CONFIG_DM_CACHE=m
+CONFIG_DM_CACHE_SMQ=m
+CONFIG_DM_WRITECACHE=m
+# CONFIG_DM_EBS is not set
+# CONFIG_DM_ERA is not set
+# CONFIG_DM_CLONE is not set
+# CONFIG_DM_MIRROR is not set
+CONFIG_DM_RAID=m
+# CONFIG_DM_ZERO is not set
+CONFIG_DM_MULTIPATH=m
+CONFIG_DM_MULTIPATH_QL=m
+CONFIG_DM_MULTIPATH_ST=m
+CONFIG_DM_MULTIPATH_HST=m
+# CONFIG_DM_MULTIPATH_IOA is not set
+# CONFIG_DM_DELAY is not set
+# CONFIG_DM_DUST is not set
+CONFIG_DM_INIT=y
+# CONFIG_DM_UEVENT is not set
+# CONFIG_DM_FLAKEY is not set
+CONFIG_DM_VERITY=y
+# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
+# CONFIG_DM_VERITY_FEC is not set
+# CONFIG_DM_SWITCH is not set
+# CONFIG_DM_LOG_WRITES is not set
+CONFIG_DM_INTEGRITY=y
+CONFIG_TARGET_CORE=m
+CONFIG_TCM_IBLOCK=m
+CONFIG_TCM_FILEIO=m
+# CONFIG_TCM_PSCSI is not set
+CONFIG_TCM_USER2=m
+CONFIG_LOOPBACK_TARGET=m
+# CONFIG_ISCSI_TARGET is not set
+# CONFIG_FUSION is not set
+
+#
+# IEEE 1394 (FireWire) support
+#
+# CONFIG_FIREWIRE is not set
+# CONFIG_FIREWIRE_NOSY is not set
+# end of IEEE 1394 (FireWire) support
+
+CONFIG_NETDEVICES=y
+CONFIG_NET_CORE=y
+# CONFIG_BONDING is not set
+CONFIG_DUMMY=m
+CONFIG_WIREGUARD=m
+# CONFIG_WIREGUARD_DEBUG is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_NET_FC is not set
+CONFIG_IFB=m
+# CONFIG_NET_TEAM is not set
+CONFIG_MACVLAN=y
+# CONFIG_MACVTAP is not set
+CONFIG_IPVLAN_L3S=y
+CONFIG_IPVLAN=m
+# CONFIG_IPVTAP is not set
+CONFIG_VXLAN=m
+CONFIG_GENEVE=m
+# CONFIG_BAREUDP is not set
+# CONFIG_GTP is not set
+# CONFIG_MACSEC is not set
+# CONFIG_NETCONSOLE is not set
+CONFIG_TUN=m
+# CONFIG_TUN_VNET_CROSS_LE is not set
+CONFIG_VETH=m
+CONFIG_VIRTIO_NET=y
+# CONFIG_NLMON is not set
+# CONFIG_NET_VRF is not set
+# CONFIG_ARCNET is not set
+CONFIG_ETHERNET=y
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_ALTEON is not set
+# CONFIG_ALTERA_TSE is not set
+CONFIG_NET_VENDOR_AMAZON=y
+CONFIG_ENA_ETHERNET=y
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+# CONFIG_NET_VENDOR_ATHEROS is not set
+# CONFIG_NET_VENDOR_BROADCOM is not set
+# CONFIG_NET_VENDOR_CADENCE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+# CONFIG_NET_VENDOR_CHELSIO is not set
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_CORTINA is not set
+# CONFIG_DNET is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
+# CONFIG_NET_VENDOR_EMULEX is not set
+# CONFIG_NET_VENDOR_EZCHIP is not set
+CONFIG_NET_VENDOR_GOOGLE=y
+CONFIG_GVE=m
+CONFIG_NET_VENDOR_HISILICON=y
+# CONFIG_HIX5HD2_GMAC is not set
+# CONFIG_HISI_FEMAC is not set
+# CONFIG_HIP04_ETH is not set
+# CONFIG_HNS_DSAF is not set
+# CONFIG_HNS_ENET is not set
+# CONFIG_HNS3 is not set
+# CONFIG_NET_VENDOR_HUAWEI is not set
+CONFIG_NET_VENDOR_I825XX=y
+CONFIG_NET_VENDOR_INTEL=y
+# CONFIG_E100 is not set
+# CONFIG_E1000 is not set
+# CONFIG_E1000E is not set
+# CONFIG_IGB is not set
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
+CONFIG_IXGBEVF=y
+# CONFIG_I40E is not set
+# CONFIG_I40EVF is not set
+# CONFIG_ICE is not set
+# CONFIG_FM10K is not set
+# CONFIG_IGC is not set
+# CONFIG_JME is not set
+CONFIG_NET_VENDOR_LITEX=y
+# CONFIG_LITEX_LITEETH is not set
+# CONFIG_NET_VENDOR_MARVELL is not set
+CONFIG_NET_VENDOR_MELLANOX=y
+CONFIG_MLX4_EN=m
+CONFIG_MLX4_CORE=m
+CONFIG_MLX4_DEBUG=y
+CONFIG_MLX4_CORE_GEN2=y
+CONFIG_MLX5_CORE=m
+CONFIG_MLX5_ACCEL=y
+CONFIG_MLX5_FPGA=y
+CONFIG_MLX5_CORE_EN=y
+CONFIG_MLX5_EN_ARFS=y
+CONFIG_MLX5_EN_RXNFC=y
+CONFIG_MLX5_MPFS=y
+# CONFIG_MLX5_CORE_IPOIB is not set
+CONFIG_MLX5_FPGA_IPSEC=y
+# CONFIG_MLX5_FPGA_TLS is not set
+# CONFIG_MLX5_TLS is not set
+# CONFIG_MLX5_SF is not set
+CONFIG_MLXSW_CORE=m
+CONFIG_MLXSW_CORE_THERMAL=y
+CONFIG_MLXSW_PCI=m
+CONFIG_MLXFW=m
+# CONFIG_MLXBF_GIGE is not set
+# CONFIG_NET_VENDOR_MICREL is not set
+CONFIG_NET_VENDOR_MICROCHIP=y
+# CONFIG_LAN743X is not set
+# CONFIG_NET_VENDOR_MICROSEMI is not set
+CONFIG_NET_VENDOR_MICROSOFT=y
+# CONFIG_NET_VENDOR_MYRI is not set
+# CONFIG_FEALNX is not set
+# CONFIG_NET_VENDOR_NI is not set
+# CONFIG_NET_VENDOR_NATSEMI is not set
+# CONFIG_NET_VENDOR_NETERION is not set
+# CONFIG_NET_VENDOR_NETRONOME is not set
+# CONFIG_NET_VENDOR_NVIDIA is not set
+# CONFIG_NET_VENDOR_OKI is not set
+# CONFIG_ETHOC is not set
+# CONFIG_NET_VENDOR_PACKET_ENGINES is not set
+CONFIG_NET_VENDOR_PENSANDO=y
+# CONFIG_IONIC is not set
+# CONFIG_NET_VENDOR_QLOGIC is not set
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_VENDOR_QUALCOMM is not set
+# CONFIG_NET_VENDOR_RDC is not set
+# CONFIG_NET_VENDOR_REALTEK is not set
+# CONFIG_NET_VENDOR_RENESAS is not set
+# CONFIG_NET_VENDOR_ROCKER is not set
+# CONFIG_NET_VENDOR_SAMSUNG is not set
+# CONFIG_NET_VENDOR_SEEQ is not set
+# CONFIG_NET_VENDOR_SILAN is not set
+# CONFIG_NET_VENDOR_SIS is not set
+# CONFIG_NET_VENDOR_SOLARFLARE is not set
+# CONFIG_NET_VENDOR_SMSC is not set
+# CONFIG_NET_VENDOR_SOCIONEXT is not set
+# CONFIG_NET_VENDOR_STMICRO is not set
+# CONFIG_NET_VENDOR_SUN is not set
+# CONFIG_NET_VENDOR_SYNOPSYS is not set
+# CONFIG_NET_VENDOR_TEHUTI is not set
+# CONFIG_NET_VENDOR_TI is not set
+# CONFIG_NET_VENDOR_VIA is not set
+# CONFIG_NET_VENDOR_WIZNET is not set
+CONFIG_NET_VENDOR_XILINX=y
+# CONFIG_XILINX_EMACLITE is not set
+# CONFIG_XILINX_AXI_EMAC is not set
+# CONFIG_XILINX_LL_TEMAC is not set
+# CONFIG_FDDI is not set
+# CONFIG_HIPPI is not set
+# CONFIG_NET_SB1000 is not set
+CONFIG_PHYLIB=y
+CONFIG_SWPHY=y
+CONFIG_FIXED_PHY=y
+
+#
+# MII PHY device drivers
+#
+# CONFIG_AMD_PHY is not set
+# CONFIG_ADIN_PHY is not set
+# CONFIG_AQUANTIA_PHY is not set
+# CONFIG_AX88796B_PHY is not set
+# CONFIG_BROADCOM_PHY is not set
+# CONFIG_BCM54140_PHY is not set
+# CONFIG_BCM7XXX_PHY is not set
+# CONFIG_BCM84881_PHY is not set
+# CONFIG_BCM87XX_PHY is not set
+# CONFIG_CICADA_PHY is not set
+# CONFIG_CORTINA_PHY is not set
+# CONFIG_DAVICOM_PHY is not set
+# CONFIG_ICPLUS_PHY is not set
+# CONFIG_LXT_PHY is not set
+# CONFIG_INTEL_XWAY_PHY is not set
+# CONFIG_LSI_ET1011C_PHY is not set
+# CONFIG_MARVELL_PHY is not set
+# CONFIG_MARVELL_10G_PHY is not set
+# CONFIG_MARVELL_88X2222_PHY is not set
+# CONFIG_MAXLINEAR_GPHY is not set
+# CONFIG_MEDIATEK_GE_PHY is not set
+# CONFIG_MICREL_PHY is not set
+# CONFIG_MICROCHIP_PHY is not set
+# CONFIG_MICROCHIP_T1_PHY is not set
+# CONFIG_MICROSEMI_PHY is not set
+# CONFIG_MOTORCOMM_PHY is not set
+# CONFIG_NATIONAL_PHY is not set
+# CONFIG_NXP_C45_TJA11XX_PHY is not set
+# CONFIG_QSEMI_PHY is not set
+# CONFIG_REALTEK_PHY is not set
+# CONFIG_RENESAS_PHY is not set
+# CONFIG_ROCKCHIP_PHY is not set
+# CONFIG_SMSC_PHY is not set
+# CONFIG_STE10XP is not set
+# CONFIG_TERANETICS_PHY is not set
+# CONFIG_DP83822_PHY is not set
+# CONFIG_DP83TC811_PHY is not set
+# CONFIG_DP83848_PHY is not set
+# CONFIG_DP83867_PHY is not set
+# CONFIG_DP83869_PHY is not set
+# CONFIG_VITESSE_PHY is not set
+# CONFIG_XILINX_GMII2RGMII is not set
+CONFIG_MDIO_DEVICE=y
+CONFIG_MDIO_BUS=y
+CONFIG_FWNODE_MDIO=y
+CONFIG_OF_MDIO=y
+CONFIG_ACPI_MDIO=y
+CONFIG_MDIO_DEVRES=y
+# CONFIG_MDIO_BITBANG is not set
+# CONFIG_MDIO_BCM_UNIMAC is not set
+# CONFIG_MDIO_HISI_FEMAC is not set
+# CONFIG_MDIO_MSCC_MIIM is not set
+# CONFIG_MDIO_OCTEON is not set
+# CONFIG_MDIO_IPQ4019 is not set
+# CONFIG_MDIO_THUNDER is not set
+
+#
+# MDIO Multiplexers
+#
+# CONFIG_MDIO_BUS_MUX_MULTIPLEXER is not set
+# CONFIG_MDIO_BUS_MUX_MMIOREG is not set
+
+#
+# PCS device drivers
+#
+# CONFIG_PCS_XPCS is not set
+# end of PCS device drivers
+
+CONFIG_PPP=m
+# CONFIG_PPP_BSDCOMP is not set
+# CONFIG_PPP_DEFLATE is not set
+# CONFIG_PPP_FILTER is not set
+# CONFIG_PPP_MPPE is not set
+# CONFIG_PPP_MULTILINK is not set
+# CONFIG_PPPOE is not set
+CONFIG_PPP_ASYNC=m
+# CONFIG_PPP_SYNC_TTY is not set
+# CONFIG_SLIP is not set
+CONFIG_SLHC=m
+
+#
+# Host-side USB support is needed for USB Network Adapter support
+#
+# CONFIG_WLAN is not set
+# CONFIG_WAN is not set
+
+#
+# Wireless WAN
+#
+# CONFIG_WWAN is not set
+# end of Wireless WAN
+
+CONFIG_XEN_NETDEV_FRONTEND=y
+CONFIG_XEN_NETDEV_BACKEND=m
+CONFIG_VMXNET3=y
+# CONFIG_FUJITSU_ES is not set
+# CONFIG_NETDEVSIM is not set
+CONFIG_NET_FAILOVER=y
+# CONFIG_ISDN is not set
+
+#
+# Input device support
+#
+CONFIG_INPUT=y
+CONFIG_INPUT_FF_MEMLESS=y
+CONFIG_INPUT_SPARSEKMAP=m
+# CONFIG_INPUT_MATRIXKMAP is not set
+
+#
+# Userland interfaces
+#
+# CONFIG_INPUT_MOUSEDEV is not set
+# CONFIG_INPUT_JOYDEV is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_INPUT_EVBUG is not set
+
+#
+# Input Device Drivers
+#
+CONFIG_INPUT_KEYBOARD=y
+CONFIG_KEYBOARD_ATKBD=y
+# CONFIG_KEYBOARD_LKKBD is not set
+# CONFIG_KEYBOARD_NEWTON is not set
+# CONFIG_KEYBOARD_OPENCORES is not set
+# CONFIG_KEYBOARD_SAMSUNG is not set
+# CONFIG_KEYBOARD_STOWAWAY is not set
+# CONFIG_KEYBOARD_SUNKBD is not set
+# CONFIG_KEYBOARD_OMAP4 is not set
+# CONFIG_KEYBOARD_XTKBD is not set
+# CONFIG_KEYBOARD_BCM is not set
+# CONFIG_INPUT_MOUSE is not set
+# CONFIG_INPUT_JOYSTICK is not set
+# CONFIG_INPUT_TABLET is not set
+# CONFIG_INPUT_TOUCHSCREEN is not set
+CONFIG_INPUT_MISC=y
+# CONFIG_INPUT_AD714X is not set
+# CONFIG_INPUT_E3X0_BUTTON is not set
+CONFIG_INPUT_UINPUT=m
+# CONFIG_INPUT_ADXL34X is not set
+# CONFIG_INPUT_CMA3000 is not set
+CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
+# CONFIG_RMI4_CORE is not set
+
+#
+# Hardware I/O ports
+#
+CONFIG_SERIO=y
+CONFIG_SERIO_SERPORT=m
+# CONFIG_SERIO_AMBAKMI is not set
+CONFIG_SERIO_PCIPS2=m
+CONFIG_SERIO_LIBPS2=y
+CONFIG_SERIO_RAW=y
+# CONFIG_SERIO_ALTERA_PS2 is not set
+# CONFIG_SERIO_PS2MULT is not set
+# CONFIG_SERIO_ARC_PS2 is not set
+# CONFIG_SERIO_APBPS2 is not set
+# CONFIG_USERIO is not set
+# CONFIG_GAMEPORT is not set
+# end of Hardware I/O ports
+# end of Input device support
+
+#
+# Character devices
+#
+CONFIG_TTY=y
+CONFIG_VT=y
+CONFIG_CONSOLE_TRANSLATIONS=y
+CONFIG_VT_CONSOLE=y
+CONFIG_VT_CONSOLE_SLEEP=y
+CONFIG_HW_CONSOLE=y
+# CONFIG_VT_HW_CONSOLE_BINDING is not set
+CONFIG_UNIX98_PTYS=y
+# CONFIG_LEGACY_PTYS is not set
+CONFIG_LDISC_AUTOLOAD=y
+
+#
+# Serial drivers
+#
+CONFIG_SERIAL_EARLYCON=y
+CONFIG_SERIAL_8250=y
+# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
+CONFIG_SERIAL_8250_PNP=y
+CONFIG_SERIAL_8250_16550A_VARIANTS=y
+# CONFIG_SERIAL_8250_FINTEK is not set
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_PCI=y
+# CONFIG_SERIAL_8250_EXAR is not set
+CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
+# CONFIG_SERIAL_8250_EXTENDED is not set
+CONFIG_SERIAL_8250_FSL=y
+# CONFIG_SERIAL_8250_DW is not set
+# CONFIG_SERIAL_8250_RT288X is not set
+# CONFIG_SERIAL_OF_PLATFORM is not set
+
+#
+# Non-8250 serial port support
+#
+CONFIG_SERIAL_AMBA_PL010=y
+CONFIG_SERIAL_AMBA_PL010_CONSOLE=y
+CONFIG_SERIAL_AMBA_PL011=y
+CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+CONFIG_SERIAL_EARLYCON_ARM_SEMIHOST=y
+# CONFIG_SERIAL_UARTLITE is not set
+CONFIG_SERIAL_CORE=y
+CONFIG_SERIAL_CORE_CONSOLE=y
+# CONFIG_SERIAL_JSM is not set
+# CONFIG_SERIAL_SIFIVE is not set
+# CONFIG_SERIAL_SCCNXP is not set
+# CONFIG_SERIAL_BCM63XX is not set
+# CONFIG_SERIAL_ALTERA_JTAGUART is not set
+# CONFIG_SERIAL_ALTERA_UART is not set
+# CONFIG_SERIAL_XILINX_PS_UART is not set
+# CONFIG_SERIAL_ARC is not set
+# CONFIG_SERIAL_RP2 is not set
+# CONFIG_SERIAL_FSL_LPUART is not set
+# CONFIG_SERIAL_FSL_LINFLEXUART is not set
+# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set
+# CONFIG_SERIAL_SPRD is not set
+# end of Serial drivers
+
+# CONFIG_SERIAL_NONSTANDARD is not set
+# CONFIG_N_GSM is not set
+# CONFIG_NOZOMI is not set
+# CONFIG_NULL_TTY is not set
+CONFIG_HVC_DRIVER=y
+CONFIG_HVC_IRQ=y
+CONFIG_HVC_XEN=y
+CONFIG_HVC_XEN_FRONTEND=y
+# CONFIG_HVC_DCC is not set
+# CONFIG_SERIAL_DEV_BUS is not set
+# CONFIG_TTY_PRINTK is not set
+# CONFIG_VIRTIO_CONSOLE is not set
+# CONFIG_IPMI_HANDLER is not set
+CONFIG_HW_RANDOM=y
+# CONFIG_HW_RANDOM_TIMERIOMEM is not set
+# CONFIG_HW_RANDOM_BA431 is not set
+CONFIG_HW_RANDOM_VIRTIO=y
+# CONFIG_HW_RANDOM_CCTRNG is not set
+# CONFIG_HW_RANDOM_XIPHERA is not set
+CONFIG_HW_RANDOM_ARM_SMCCC_TRNG=y
+# CONFIG_APPLICOM is not set
+CONFIG_DEVMEM=y
+CONFIG_DEVPORT=y
+CONFIG_TCG_TPM=y
+# CONFIG_HW_RANDOM_TPM is not set
+CONFIG_TCG_TIS_CORE=y
+CONFIG_TCG_TIS=y
+# CONFIG_TCG_ATMEL is not set
+# CONFIG_TCG_INFINEON is not set
+CONFIG_TCG_XEN=m
+CONFIG_TCG_CRB=y
+# CONFIG_TCG_VTPM_PROXY is not set
+# CONFIG_XILLYBUS is not set
+CONFIG_RANDOM_TRUST_CPU=y
+# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
+# end of Character devices
+
+#
+# I2C support
+#
+# CONFIG_I2C is not set
+# end of I2C support
+
+# CONFIG_I3C is not set
+# CONFIG_SPI is not set
+# CONFIG_SPMI is not set
+# CONFIG_HSI is not set
+CONFIG_PPS=m
+# CONFIG_PPS_DEBUG is not set
+
+#
+# PPS clients support
+#
+# CONFIG_PPS_CLIENT_KTIMER is not set
+# CONFIG_PPS_CLIENT_LDISC is not set
+# CONFIG_PPS_CLIENT_GPIO is not set
+
+#
+# PPS generators support
+#
+
+#
+# PTP clock support
+#
+CONFIG_PTP_1588_CLOCK=m
+CONFIG_PTP_1588_CLOCK_OPTIONAL=m
+
+#
+# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
+#
+CONFIG_PTP_1588_CLOCK_KVM=m
+# end of PTP clock support
+
+# CONFIG_PINCTRL is not set
+# CONFIG_GPIOLIB is not set
+# CONFIG_W1 is not set
+CONFIG_POWER_RESET=y
+# CONFIG_POWER_RESET_RESTART is not set
+# CONFIG_POWER_RESET_XGENE is not set
+# CONFIG_POWER_RESET_SYSCON is not set
+# CONFIG_POWER_RESET_SYSCON_POWEROFF is not set
+# CONFIG_NVMEM_REBOOT_MODE is not set
+CONFIG_POWER_SUPPLY=y
+# CONFIG_POWER_SUPPLY_DEBUG is not set
+# CONFIG_PDA_POWER is not set
+# CONFIG_TEST_POWER is not set
+# CONFIG_BATTERY_DS2780 is not set
+# CONFIG_BATTERY_DS2781 is not set
+# CONFIG_BATTERY_BQ27XXX is not set
+# CONFIG_CHARGER_MAX8903 is not set
+# CONFIG_BATTERY_GOLDFISH is not set
+# CONFIG_HWMON is not set
+CONFIG_THERMAL=y
+# CONFIG_THERMAL_NETLINK is not set
+# CONFIG_THERMAL_STATISTICS is not set
+CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
+CONFIG_THERMAL_OF=y
+# CONFIG_THERMAL_WRITABLE_TRIPS is not set
+CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
+# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
+# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
+# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
+CONFIG_THERMAL_GOV_STEP_WISE=y
+# CONFIG_THERMAL_GOV_BANG_BANG is not set
+# CONFIG_THERMAL_GOV_USER_SPACE is not set
+# CONFIG_CPU_THERMAL is not set
+# CONFIG_THERMAL_EMULATION is not set
+# CONFIG_THERMAL_MMIO is not set
+CONFIG_WATCHDOG=y
+CONFIG_WATCHDOG_CORE=y
+# CONFIG_WATCHDOG_NOWAYOUT is not set
+CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
+CONFIG_WATCHDOG_OPEN_TIMEOUT=0
+# CONFIG_WATCHDOG_SYSFS is not set
+# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set
+
+#
+# Watchdog Pretimeout Governors
+#
+# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
+
+#
+# Watchdog Device Drivers
+#
+# CONFIG_SOFT_WATCHDOG is not set
+# CONFIG_WDAT_WDT is not set
+# CONFIG_XILINX_WATCHDOG is not set
+# CONFIG_MLX_WDT is not set
+# CONFIG_ARM_SP805_WATCHDOG is not set
+# CONFIG_ARM_SBSA_WATCHDOG is not set
+# CONFIG_CADENCE_WATCHDOG is not set
+# CONFIG_DW_WATCHDOG is not set
+# CONFIG_MAX63XX_WATCHDOG is not set
+# CONFIG_ARM_SMC_WATCHDOG is not set
+# CONFIG_ALIM7101_WDT is not set
+# CONFIG_I6300ESB_WDT is not set
+# CONFIG_XEN_WDT is not set
+
+#
+# PCI-based Watchdog Cards
+#
+# CONFIG_PCIPCWATCHDOG is not set
+# CONFIG_WDTPCI is not set
+CONFIG_SSB_POSSIBLE=y
+# CONFIG_SSB is not set
+CONFIG_BCMA_POSSIBLE=y
+# CONFIG_BCMA is not set
+
+#
+# Multifunction device drivers
+#
+CONFIG_MFD_CORE=y
+# CONFIG_MFD_ATMEL_FLEXCOM is not set
+# CONFIG_MFD_ATMEL_HLCDC is not set
+# CONFIG_MFD_MADERA is not set
+# CONFIG_MFD_HI6421_PMIC is not set
+# CONFIG_HTC_PASIC3 is not set
+CONFIG_LPC_ICH=y
+CONFIG_LPC_SCH=m
+# CONFIG_MFD_INTEL_PMT is not set
+# CONFIG_MFD_JANZ_CMODIO is not set
+# CONFIG_MFD_KEMPLD is not set
+# CONFIG_MFD_MT6397 is not set
+# CONFIG_MFD_RDC321X is not set
+# CONFIG_MFD_SM501 is not set
+# CONFIG_MFD_SYSCON is not set
+# CONFIG_MFD_TQMX86 is not set
+# CONFIG_MFD_VX855 is not set
+# end of Multifunction device drivers
+
+# CONFIG_REGULATOR is not set
+# CONFIG_RC_CORE is not set
+# CONFIG_MEDIA_CEC_SUPPORT is not set
+# CONFIG_MEDIA_SUPPORT is not set
+
+#
+# Graphics support
+#
+# CONFIG_VGA_ARB is not set
+# CONFIG_DRM is not set
+
+#
+# ARM devices
+#
+# end of ARM devices
+
+#
+# Frame buffer Devices
+#
+# CONFIG_FB is not set
+# end of Frame buffer Devices
+
+#
+# Backlight & LCD device support
+#
+# CONFIG_LCD_CLASS_DEVICE is not set
+# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
+# end of Backlight & LCD device support
+
+#
+# Console display driver support
+#
+CONFIG_DUMMY_CONSOLE=y
+CONFIG_DUMMY_CONSOLE_COLUMNS=80
+CONFIG_DUMMY_CONSOLE_ROWS=25
+# end of Console display driver support
+# end of Graphics support
+
+# CONFIG_SOUND is not set
+
+#
+# HID support
+#
+# CONFIG_HID is not set
+# end of HID support
+
+CONFIG_USB_OHCI_LITTLE_ENDIAN=y
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_MMC is not set
+# CONFIG_MEMSTICK is not set
+# CONFIG_NEW_LEDS is not set
+# CONFIG_ACCESSIBILITY is not set
+CONFIG_INFINIBAND=m
+CONFIG_INFINIBAND_USER_MAD=m
+CONFIG_INFINIBAND_USER_ACCESS=m
+CONFIG_INFINIBAND_USER_MEM=y
+CONFIG_INFINIBAND_ON_DEMAND_PAGING=y
+CONFIG_INFINIBAND_ADDR_TRANS=y
+CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y
+CONFIG_INFINIBAND_VIRT_DMA=y
+# CONFIG_INFINIBAND_MTHCA is not set
+# CONFIG_INFINIBAND_EFA is not set
+# CONFIG_MLX4_INFINIBAND is not set
+CONFIG_MLX5_INFINIBAND=m
+# CONFIG_INFINIBAND_OCRDMA is not set
+# CONFIG_INFINIBAND_VMWARE_PVRDMA is not set
+# CONFIG_RDMA_RXE is not set
+# CONFIG_RDMA_SIW is not set
+# CONFIG_INFINIBAND_IPOIB is not set
+# CONFIG_INFINIBAND_SRP is not set
+# CONFIG_INFINIBAND_SRPT is not set
+# CONFIG_INFINIBAND_ISER is not set
+# CONFIG_INFINIBAND_RTRS_CLIENT is not set
+# CONFIG_INFINIBAND_RTRS_SERVER is not set
+CONFIG_EDAC_SUPPORT=y
+# CONFIG_EDAC is not set
+CONFIG_RTC_LIB=y
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_HCTOSYS=y
+CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
+CONFIG_RTC_SYSTOHC=y
+CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
+# CONFIG_RTC_DEBUG is not set
+CONFIG_RTC_NVMEM=y
+
+#
+# RTC interfaces
+#
+CONFIG_RTC_INTF_SYSFS=y
+CONFIG_RTC_INTF_PROC=y
+CONFIG_RTC_INTF_DEV=y
+# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
+# CONFIG_RTC_DRV_TEST is not set
+
+#
+# I2C RTC drivers
+#
+
+#
+# SPI RTC drivers
+#
+
+#
+# SPI and I2C RTC drivers
+#
+
+#
+# Platform RTC drivers
+#
+# CONFIG_RTC_DRV_DS1286 is not set
+# CONFIG_RTC_DRV_DS1511 is not set
+# CONFIG_RTC_DRV_DS1553 is not set
+# CONFIG_RTC_DRV_DS1685_FAMILY is not set
+# CONFIG_RTC_DRV_DS1742 is not set
+# CONFIG_RTC_DRV_DS2404 is not set
+CONFIG_RTC_DRV_EFI=y
+# CONFIG_RTC_DRV_STK17TA8 is not set
+# CONFIG_RTC_DRV_M48T86 is not set
+# CONFIG_RTC_DRV_M48T35 is not set
+# CONFIG_RTC_DRV_M48T59 is not set
+# CONFIG_RTC_DRV_MSM6242 is not set
+# CONFIG_RTC_DRV_BQ4802 is not set
+# CONFIG_RTC_DRV_RP5C01 is not set
+# CONFIG_RTC_DRV_V3020 is not set
+# CONFIG_RTC_DRV_ZYNQMP is not set
+
+#
+# on-CPU RTC drivers
+#
+# CONFIG_RTC_DRV_PL030 is not set
+CONFIG_RTC_DRV_PL031=y
+# CONFIG_RTC_DRV_CADENCE is not set
+# CONFIG_RTC_DRV_FTRTC010 is not set
+# CONFIG_RTC_DRV_R7301 is not set
+
+#
+# HID Sensor RTC drivers
+#
+# CONFIG_RTC_DRV_GOLDFISH is not set
+# CONFIG_DMADEVICES is not set
+
+#
+# DMABUF options
+#
+# CONFIG_SYNC_FILE is not set
+# CONFIG_UDMABUF is not set
+# CONFIG_DMABUF_MOVE_NOTIFY is not set
+# CONFIG_DMABUF_DEBUG is not set
+# CONFIG_DMABUF_SELFTESTS is not set
+# CONFIG_DMABUF_HEAPS is not set
+# CONFIG_DMABUF_SYSFS_STATS is not set
+# end of DMABUF options
+
+# CONFIG_AUXDISPLAY is not set
+CONFIG_UIO=m
+# CONFIG_UIO_CIF is not set
+# CONFIG_UIO_PDRV_GENIRQ is not set
+# CONFIG_UIO_DMEM_GENIRQ is not set
+# CONFIG_UIO_AEC is not set
+# CONFIG_UIO_SERCOS3 is not set
+CONFIG_UIO_PCI_GENERIC=m
+# CONFIG_UIO_NETX is not set
+# CONFIG_UIO_PRUSS is not set
+# CONFIG_UIO_MF624 is not set
+CONFIG_VFIO=m
+CONFIG_VFIO_IOMMU_TYPE1=m
+CONFIG_VFIO_VIRQFD=m
+CONFIG_VFIO_NOIOMMU=y
+CONFIG_VFIO_PCI_CORE=m
+CONFIG_VFIO_PCI_MMAP=y
+CONFIG_VFIO_PCI_INTX=y
+CONFIG_VFIO_PCI=m
+# CONFIG_VFIO_PLATFORM is not set
+# CONFIG_VFIO_MDEV is not set
+# CONFIG_VIRT_DRIVERS is not set
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_PCI_LIB=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_PCI_LEGACY=y
+CONFIG_VIRTIO_BALLOON=m
+# CONFIG_VIRTIO_INPUT is not set
+# CONFIG_VIRTIO_MMIO is not set
+# CONFIG_VDPA is not set
+CONFIG_VHOST_MENU=y
+# CONFIG_VHOST_NET is not set
+# CONFIG_VHOST_SCSI is not set
+# CONFIG_VHOST_VSOCK is not set
+# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
+
+#
+# Microsoft Hyper-V guest support
+#
+# CONFIG_HYPERV is not set
+# end of Microsoft Hyper-V guest support
+
+#
+# Xen driver support
+#
+CONFIG_XEN_BALLOON=y
+CONFIG_XEN_SCRUB_PAGES_DEFAULT=y
+CONFIG_XEN_DEV_EVTCHN=m
+CONFIG_XEN_BACKEND=y
+CONFIG_XENFS=m
+CONFIG_XEN_COMPAT_XENFS=y
+CONFIG_XEN_SYS_HYPERVISOR=y
+CONFIG_XEN_XENBUS_FRONTEND=y
+CONFIG_XEN_GNTDEV=m
+CONFIG_XEN_GRANT_DEV_ALLOC=m
+# CONFIG_XEN_GRANT_DMA_ALLOC is not set
+CONFIG_SWIOTLB_XEN=y
+# CONFIG_XEN_PVCALLS_FRONTEND is not set
+# CONFIG_XEN_PVCALLS_BACKEND is not set
+# CONFIG_XEN_SCSI_BACKEND is not set
+CONFIG_XEN_PRIVCMD=m
+CONFIG_XEN_EFI=y
+CONFIG_XEN_AUTO_XLATE=y
+# end of Xen driver support
+
+# CONFIG_GREYBUS is not set
+# CONFIG_COMEDI is not set
+# CONFIG_STAGING is not set
+# CONFIG_GOLDFISH is not set
+# CONFIG_CHROME_PLATFORMS is not set
+CONFIG_MELLANOX_PLATFORM=y
+# CONFIG_MLXBF_BOOTCTL is not set
+CONFIG_SURFACE_PLATFORMS=y
+# CONFIG_SURFACE_GPE is not set
+# CONFIG_SURFACE_PRO3_BUTTON is not set
+CONFIG_HAVE_CLK=y
+CONFIG_HAVE_CLK_PREPARE=y
+CONFIG_COMMON_CLK=y
+
+#
+# Clock driver for ARM Reference designs
+#
+# CONFIG_ICST is not set
+# CONFIG_CLK_SP810 is not set
+# end of Clock driver for ARM Reference designs
+
+# CONFIG_COMMON_CLK_AXI_CLKGEN is not set
+# CONFIG_COMMON_CLK_XGENE is not set
+# CONFIG_COMMON_CLK_FIXED_MMIO is not set
+# CONFIG_XILINX_VCU is not set
+# CONFIG_HWSPINLOCK is not set
+
+#
+# Clock Source drivers
+#
+CONFIG_TIMER_OF=y
+CONFIG_TIMER_ACPI=y
+CONFIG_TIMER_PROBE=y
+CONFIG_ARM_ARCH_TIMER=y
+CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
+CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND=y
+CONFIG_FSL_ERRATUM_A008585=y
+CONFIG_HISILICON_ERRATUM_161010101=y
+CONFIG_ARM64_ERRATUM_858921=y
+# CONFIG_MICROCHIP_PIT64B is not set
+# end of Clock Source drivers
+
+CONFIG_MAILBOX=y
+# CONFIG_ARM_MHU is not set
+# CONFIG_ARM_MHU_V2 is not set
+# CONFIG_PLATFORM_MHU is not set
+# CONFIG_PL320_MBOX is not set
+CONFIG_PCC=y
+# CONFIG_ALTERA_MBOX is not set
+# CONFIG_MAILBOX_TEST is not set
+CONFIG_IOMMU_API=y
+# CONFIG_IOMMU_SUPPORT is not set
+
+#
+# Remoteproc drivers
+#
+# CONFIG_REMOTEPROC is not set
+# end of Remoteproc drivers
+
+#
+# Rpmsg drivers
+#
+# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
+# CONFIG_RPMSG_VIRTIO is not set
+# end of Rpmsg drivers
+
+# CONFIG_SOUNDWIRE is not set
+
+#
+# SOC (System On Chip) specific Drivers
+#
+
+#
+# Amlogic SoC drivers
+#
+# end of Amlogic SoC drivers
+
+#
+# Broadcom SoC drivers
+#
+# CONFIG_SOC_BRCMSTB is not set
+# end of Broadcom SoC drivers
+
+#
+# NXP/Freescale QorIQ SoC drivers
+#
+# CONFIG_QUICC_ENGINE is not set
+# CONFIG_FSL_RCPM is not set
+# end of NXP/Freescale QorIQ SoC drivers
+
+#
+# i.MX SoC drivers
+#
+# end of i.MX SoC drivers
+
+#
+# Enable LiteX SoC Builder specific drivers
+#
+# CONFIG_LITEX_SOC_CONTROLLER is not set
+# end of Enable LiteX SoC Builder specific drivers
+
+#
+# Qualcomm SoC drivers
+#
+# end of Qualcomm SoC drivers
+
+# CONFIG_SOC_TI is not set
+
+#
+# Xilinx SoC drivers
+#
+# end of Xilinx SoC drivers
+# end of SOC (System On Chip) specific Drivers
+
+# CONFIG_PM_DEVFREQ is not set
+# CONFIG_EXTCON is not set
+# CONFIG_MEMORY is not set
+# CONFIG_IIO is not set
+# CONFIG_NTB is not set
+# CONFIG_VME_BUS is not set
+# CONFIG_PWM is not set
+
+#
+# IRQ chip support
+#
+CONFIG_IRQCHIP=y
+CONFIG_ARM_GIC=y
+CONFIG_ARM_GIC_MAX_NR=1
+CONFIG_ARM_GIC_V2M=y
+CONFIG_ARM_GIC_V3=y
+CONFIG_ARM_GIC_V3_ITS=y
+CONFIG_ARM_GIC_V3_ITS_PCI=y
+# CONFIG_AL_FIC is not set
+CONFIG_PARTITION_PERCPU=y
+# end of IRQ chip support
+
+# CONFIG_IPACK_BUS is not set
+# CONFIG_RESET_CONTROLLER is not set
+
+#
+# PHY Subsystem
+#
+# CONFIG_GENERIC_PHY is not set
+# CONFIG_PHY_XGENE is not set
+# CONFIG_PHY_CAN_TRANSCEIVER is not set
+# CONFIG_BCM_KONA_USB2_PHY is not set
+# CONFIG_PHY_CADENCE_TORRENT is not set
+# CONFIG_PHY_CADENCE_DPHY is not set
+# CONFIG_PHY_CADENCE_SALVO is not set
+# CONFIG_PHY_FSL_IMX8MQ_USB is not set
+# CONFIG_PHY_MIXEL_MIPI_DPHY is not set
+# CONFIG_PHY_PXA_28NM_HSIC is not set
+# CONFIG_PHY_PXA_28NM_USB2 is not set
+# end of PHY Subsystem
+
+# CONFIG_POWERCAP is not set
+# CONFIG_MCB is not set
+
+#
+# Performance monitor support
+#
+# CONFIG_ARM_CCI_PMU is not set
+# CONFIG_ARM_CCN is not set
+# CONFIG_ARM_CMN is not set
+CONFIG_ARM_PMU=y
+CONFIG_ARM_PMU_ACPI=y
+# CONFIG_ARM_SMMU_V3_PMU is not set
+# CONFIG_ARM_DSU_PMU is not set
+# CONFIG_ARM_SPE_PMU is not set
+# CONFIG_ARM_DMC620_PMU is not set
+# CONFIG_HISI_PMU is not set
+# end of Performance monitor support
+
+CONFIG_RAS=y
+# CONFIG_USB4 is not set
+
+#
+# Android
+#
+# CONFIG_ANDROID is not set
+# end of Android
+
+# CONFIG_LIBNVDIMM is not set
+CONFIG_DAX=y
+# CONFIG_DEV_DAX is not set
+CONFIG_NVMEM=y
+CONFIG_NVMEM_SYSFS=y
+# CONFIG_NVMEM_RMEM is not set
+
+#
+# HW tracing support
+#
+# CONFIG_STM is not set
+# CONFIG_INTEL_TH is not set
+# end of HW tracing support
+
+# CONFIG_FPGA is not set
+# CONFIG_FSI is not set
+# CONFIG_TEE is not set
+# CONFIG_SIOX is not set
+# CONFIG_SLIMBUS is not set
+# CONFIG_INTERCONNECT is not set
+# CONFIG_COUNTER is not set
+# CONFIG_MOST is not set
+# end of Device Drivers
+
+#
+# File systems
+#
+CONFIG_DCACHE_WORD_ACCESS=y
+# CONFIG_VALIDATE_FS_PARSER is not set
+CONFIG_FS_IOMAP=y
+# CONFIG_EXT2_FS is not set
+# CONFIG_EXT3_FS is not set
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_USE_FOR_EXT2=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_EXT4_DEBUG is not set
+CONFIG_JBD2=y
+# CONFIG_JBD2_DEBUG is not set
+CONFIG_FS_MBCACHE=y
+# CONFIG_REISERFS_FS is not set
+# CONFIG_JFS_FS is not set
+CONFIG_XFS_FS=y
+CONFIG_XFS_SUPPORT_V4=y
+CONFIG_XFS_QUOTA=y
+CONFIG_XFS_POSIX_ACL=y
+# CONFIG_XFS_RT is not set
+# CONFIG_XFS_ONLINE_SCRUB is not set
+# CONFIG_XFS_WARN is not set
+# CONFIG_XFS_DEBUG is not set
+# CONFIG_GFS2_FS is not set
+# CONFIG_OCFS2_FS is not set
+# CONFIG_BTRFS_FS is not set
+# CONFIG_NILFS2_FS is not set
+# CONFIG_F2FS_FS is not set
+# CONFIG_FS_DAX is not set
+CONFIG_FS_POSIX_ACL=y
+CONFIG_EXPORTFS=y
+# CONFIG_EXPORTFS_BLOCK_OPS is not set
+CONFIG_FILE_LOCKING=y
+# CONFIG_FS_ENCRYPTION is not set
+# CONFIG_FS_VERITY is not set
+CONFIG_FSNOTIFY=y
+CONFIG_DNOTIFY=y
+CONFIG_INOTIFY_USER=y
+CONFIG_FANOTIFY=y
+CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
+CONFIG_QUOTA=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
+# CONFIG_PRINT_QUOTA_WARNING is not set
+# CONFIG_QUOTA_DEBUG is not set
+CONFIG_QUOTA_TREE=y
+# CONFIG_QFMT_V1 is not set
+CONFIG_QFMT_V2=y
+CONFIG_QUOTACTL=y
+CONFIG_AUTOFS4_FS=y
+CONFIG_AUTOFS_FS=y
+CONFIG_FUSE_FS=m
+# CONFIG_CUSE is not set
+# CONFIG_VIRTIO_FS is not set
+CONFIG_OVERLAY_FS=y
+# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set
+CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y
+# CONFIG_OVERLAY_FS_INDEX is not set
+# CONFIG_OVERLAY_FS_XINO_AUTO is not set
+# CONFIG_OVERLAY_FS_METACOPY is not set
+
+#
+# Caches
+#
+CONFIG_NETFS_SUPPORT=m
+# CONFIG_NETFS_STATS is not set
+CONFIG_FSCACHE=m
+# CONFIG_FSCACHE_STATS is not set
+# CONFIG_FSCACHE_DEBUG is not set
+CONFIG_CACHEFILES=m
+# CONFIG_CACHEFILES_DEBUG is not set
+# end of Caches
+
+#
+# CD-ROM/DVD Filesystems
+#
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=m
+# end of CD-ROM/DVD Filesystems
+
+#
+# DOS/FAT/EXFAT/NT Filesystems
+#
+CONFIG_FAT_FS=m
+# CONFIG_MSDOS_FS is not set
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
+# CONFIG_FAT_DEFAULT_UTF8 is not set
+# CONFIG_EXFAT_FS is not set
+CONFIG_NTFS_FS=m
+# CONFIG_NTFS_DEBUG is not set
+# CONFIG_NTFS_RW is not set
+# CONFIG_NTFS3_FS is not set
+# end of DOS/FAT/EXFAT/NT Filesystems
+
+#
+# Pseudo filesystems
+#
+CONFIG_PROC_FS=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROC_SYSCTL=y
+CONFIG_PROC_PAGE_MONITOR=y
+CONFIG_PROC_CHILDREN=y
+CONFIG_PROC_SELF_MEM_READONLY=y
+CONFIG_KERNFS=y
+CONFIG_SYSFS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TMPFS_XATTR=y
+# CONFIG_TMPFS_INODE64 is not set
+CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
+CONFIG_HUGETLBFS=y
+CONFIG_HUGETLB_PAGE=y
+CONFIG_MEMFD_CREATE=y
+CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
+CONFIG_CONFIGFS_FS=m
+CONFIG_EFIVAR_FS=y
+# end of Pseudo filesystems
+
+CONFIG_MISC_FILESYSTEMS=y
+# CONFIG_ORANGEFS_FS is not set
+# CONFIG_ADFS_FS is not set
+# CONFIG_AFFS_FS is not set
+# CONFIG_ECRYPT_FS is not set
+# CONFIG_HFS_FS is not set
+# CONFIG_HFSPLUS_FS is not set
+# CONFIG_BEFS_FS is not set
+# CONFIG_BFS_FS is not set
+# CONFIG_EFS_FS is not set
+# CONFIG_CRAMFS is not set
+CONFIG_SQUASHFS=m
+# CONFIG_SQUASHFS_FILE_CACHE is not set
+CONFIG_SQUASHFS_FILE_DIRECT=y
+# CONFIG_SQUASHFS_DECOMP_SINGLE is not set
+# CONFIG_SQUASHFS_DECOMP_MULTI is not set
+CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_ZLIB=y
+# CONFIG_SQUASHFS_LZ4 is not set
+# CONFIG_SQUASHFS_LZO is not set
+# CONFIG_SQUASHFS_XZ is not set
+CONFIG_SQUASHFS_ZSTD=y
+CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
+# CONFIG_SQUASHFS_EMBEDDED is not set
+CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
+# CONFIG_VXFS_FS is not set
+# CONFIG_MINIX_FS is not set
+# CONFIG_OMFS_FS is not set
+# CONFIG_HPFS_FS is not set
+# CONFIG_QNX4FS_FS is not set
+# CONFIG_QNX6FS_FS is not set
+# CONFIG_ROMFS_FS is not set
+CONFIG_PSTORE=y
+CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_842_COMPRESS is not set
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+# CONFIG_PSTORE_PMSG is not set
+# CONFIG_PSTORE_FTRACE is not set
+# CONFIG_PSTORE_RAM is not set
+# CONFIG_PSTORE_BLK is not set
+# CONFIG_SYSV_FS is not set
+# CONFIG_UFS_FS is not set
+# CONFIG_EROFS_FS is not set
+CONFIG_NETWORK_FILESYSTEMS=y
+CONFIG_NFS_FS=m
+# CONFIG_NFS_V2 is not set
+CONFIG_NFS_V3=m
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=m
+# CONFIG_NFS_SWAP is not set
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_PNFS_FILE_LAYOUT=m
+CONFIG_PNFS_BLOCK=m
+CONFIG_PNFS_FLEXFILE_LAYOUT=m
+CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
+# CONFIG_NFS_V4_1_MIGRATION is not set
+CONFIG_NFS_V4_SECURITY_LABEL=y
+CONFIG_NFS_FSCACHE=y
+# CONFIG_NFS_USE_LEGACY_DNS is not set
+CONFIG_NFS_USE_KERNEL_DNS=y
+CONFIG_NFS_DEBUG=y
+CONFIG_NFS_DISABLE_UDP_SUPPORT=y
+# CONFIG_NFS_V4_2_READ_PLUS is not set
+CONFIG_NFSD=m
+CONFIG_NFSD_V2_ACL=y
+CONFIG_NFSD_V3=y
+CONFIG_NFSD_V3_ACL=y
+CONFIG_NFSD_V4=y
+# CONFIG_NFSD_BLOCKLAYOUT is not set
+# CONFIG_NFSD_SCSILAYOUT is not set
+# CONFIG_NFSD_FLEXFILELAYOUT is not set
+# CONFIG_NFSD_V4_2_INTER_SSC is not set
+CONFIG_NFSD_V4_SECURITY_LABEL=y
+CONFIG_GRACE_PERIOD=m
+CONFIG_LOCKD=m
+CONFIG_LOCKD_V4=y
+CONFIG_NFS_ACL_SUPPORT=m
+CONFIG_NFS_COMMON=y
+CONFIG_NFS_V4_2_SSC_HELPER=y
+CONFIG_SUNRPC=m
+CONFIG_SUNRPC_GSS=m
+CONFIG_SUNRPC_BACKCHANNEL=y
+CONFIG_RPCSEC_GSS_KRB5=m
+# CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES is not set
+CONFIG_SUNRPC_DEBUG=y
+CONFIG_SUNRPC_XPRT_RDMA=m
+# CONFIG_CEPH_FS is not set
+CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
+# CONFIG_CIFS_ALLOW_INSECURE_LEGACY is not set
+CONFIG_CIFS_UPCALL=y
+CONFIG_CIFS_XATTR=y
+CONFIG_CIFS_DEBUG=y
+# CONFIG_CIFS_DEBUG2 is not set
+# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
+CONFIG_CIFS_DFS_UPCALL=y
+# CONFIG_CIFS_SWN_UPCALL is not set
+# CONFIG_CIFS_SMB_DIRECT is not set
+# CONFIG_CIFS_FSCACHE is not set
+# CONFIG_SMB_SERVER is not set
+CONFIG_SMBFS_COMMON=m
+# CONFIG_CODA_FS is not set
+# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
+CONFIG_NLS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=m
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+# CONFIG_NLS_CODEPAGE_932 is not set
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+CONFIG_NLS_ASCII=m
+CONFIG_NLS_ISO8859_1=m
+# CONFIG_NLS_ISO8859_2 is not set
+# CONFIG_NLS_ISO8859_3 is not set
+# CONFIG_NLS_ISO8859_4 is not set
+# CONFIG_NLS_ISO8859_5 is not set
+# CONFIG_NLS_ISO8859_6 is not set
+# CONFIG_NLS_ISO8859_7 is not set
+# CONFIG_NLS_ISO8859_9 is not set
+# CONFIG_NLS_ISO8859_13 is not set
+# CONFIG_NLS_ISO8859_14 is not set
+# CONFIG_NLS_ISO8859_15 is not set
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_MAC_ROMAN is not set
+# CONFIG_NLS_MAC_CELTIC is not set
+# CONFIG_NLS_MAC_CENTEURO is not set
+# CONFIG_NLS_MAC_CROATIAN is not set
+# CONFIG_NLS_MAC_CYRILLIC is not set
+# CONFIG_NLS_MAC_GAELIC is not set
+# CONFIG_NLS_MAC_GREEK is not set
+# CONFIG_NLS_MAC_ICELAND is not set
+# CONFIG_NLS_MAC_INUIT is not set
+# CONFIG_NLS_MAC_ROMANIAN is not set
+# CONFIG_NLS_MAC_TURKISH is not set
+CONFIG_NLS_UTF8=m
+# CONFIG_DLM is not set
+# CONFIG_UNICODE is not set
+CONFIG_IO_WQ=y
+# end of File systems
+
+#
+# Security options
+#
+CONFIG_KEYS=y
+# CONFIG_KEYS_REQUEST_CACHE is not set
+# CONFIG_PERSISTENT_KEYRINGS is not set
+# CONFIG_TRUSTED_KEYS is not set
+# CONFIG_ENCRYPTED_KEYS is not set
+# CONFIG_KEY_DH_OPERATIONS is not set
+CONFIG_SECURITY_DMESG_RESTRICT=y
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
+CONFIG_SECURITY_NETWORK=y
+# CONFIG_SECURITY_INFINIBAND is not set
+# CONFIG_SECURITY_NETWORK_XFRM is not set
+CONFIG_SECURITY_PATH=y
+CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
+# CONFIG_HARDENED_USERCOPY_PAGESPAN is not set
+# CONFIG_STATIC_USERMODEHELPER is not set
+# CONFIG_SECURITY_SELINUX is not set
+# CONFIG_SECURITY_SMACK is not set
+# CONFIG_SECURITY_TOMOYO is not set
+CONFIG_SECURITY_APPARMOR=y
+CONFIG_SECURITY_APPARMOR_HASH=y
+CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y
+# CONFIG_SECURITY_APPARMOR_DEBUG is not set
+CONFIG_SECURITY_LOADPIN=y
+CONFIG_SECURITY_LOADPIN_ENFORCE=y
+CONFIG_SECURITY_YAMA=y
+CONFIG_SECURITY_SAFESETID=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+# CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is not set
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=y
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY is not set
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set
+# CONFIG_SECURITY_LANDLOCK is not set
+CONFIG_INTEGRITY=y
+CONFIG_INTEGRITY_SIGNATURE=y
+CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
+CONFIG_INTEGRITY_TRUSTED_KEYRING=y
+CONFIG_INTEGRITY_PLATFORM_KEYRING=y
+CONFIG_LOAD_UEFI_KEYS=y
+CONFIG_INTEGRITY_AUDIT=y
+CONFIG_IMA=y
+# CONFIG_IMA_KEXEC is not set
+CONFIG_IMA_MEASURE_PCR_IDX=10
+CONFIG_IMA_LSM_RULES=y
+CONFIG_IMA_NG_TEMPLATE=y
+# CONFIG_IMA_SIG_TEMPLATE is not set
+CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
+# CONFIG_IMA_DEFAULT_HASH_SHA1 is not set
+CONFIG_IMA_DEFAULT_HASH_SHA256=y
+CONFIG_IMA_DEFAULT_HASH="sha256"
+CONFIG_IMA_WRITE_POLICY=y
+# CONFIG_IMA_READ_POLICY is not set
+CONFIG_IMA_APPRAISE=y
+# CONFIG_IMA_ARCH_POLICY is not set
+CONFIG_IMA_APPRAISE_BUILD_POLICY=y
+CONFIG_IMA_APPRAISE_REQUIRE_FIRMWARE_SIGS=y
+# CONFIG_IMA_APPRAISE_REQUIRE_KEXEC_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_MODULE_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_POLICY_SIGS is not set
+CONFIG_IMA_APPRAISE_BOOTPARAM=y
+# CONFIG_IMA_APPRAISE_MODSIG is not set
+# CONFIG_IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY is not set
+# CONFIG_IMA_BLACKLIST_KEYRING is not set
+CONFIG_IMA_LOAD_X509=y
+CONFIG_IMA_X509_PATH="/etc/ima/pubkey.x509"
+# CONFIG_IMA_APPRAISE_SIGNED_INIT is not set
+CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
+CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
+# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set
+# CONFIG_IMA_DISABLE_HTABLE is not set
+# CONFIG_EVM is not set
+# CONFIG_DEFAULT_SECURITY_APPARMOR is not set
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity"
+
+#
+# Kernel hardening options
+#
+
+#
+# Memory initialization
+#
+CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_ENABLER=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
+# CONFIG_INIT_STACK_NONE is not set
+# CONFIG_INIT_STACK_ALL_PATTERN is not set
+CONFIG_INIT_STACK_ALL_ZERO=y
+# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
+# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
+# end of Memory initialization
+# end of Kernel hardening options
+# end of Security options
+
+CONFIG_XOR_BLOCKS=y
+CONFIG_ASYNC_CORE=y
+CONFIG_ASYNC_MEMCPY=m
+CONFIG_ASYNC_XOR=y
+CONFIG_ASYNC_PQ=m
+CONFIG_ASYNC_RAID6_RECOV=m
+CONFIG_CRYPTO=y
+
+#
+# Crypto core or helper
+#
+CONFIG_CRYPTO_ALGAPI=y
+CONFIG_CRYPTO_ALGAPI2=y
+CONFIG_CRYPTO_AEAD=y
+CONFIG_CRYPTO_AEAD2=y
+CONFIG_CRYPTO_SKCIPHER=y
+CONFIG_CRYPTO_SKCIPHER2=y
+CONFIG_CRYPTO_HASH=y
+CONFIG_CRYPTO_HASH2=y
+CONFIG_CRYPTO_RNG=m
+CONFIG_CRYPTO_RNG2=y
+CONFIG_CRYPTO_RNG_DEFAULT=m
+CONFIG_CRYPTO_AKCIPHER2=y
+CONFIG_CRYPTO_AKCIPHER=y
+CONFIG_CRYPTO_KPP2=y
+CONFIG_CRYPTO_ACOMP2=y
+CONFIG_CRYPTO_MANAGER=y
+CONFIG_CRYPTO_MANAGER2=y
+# CONFIG_CRYPTO_USER is not set
+CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
+CONFIG_CRYPTO_GF128MUL=y
+CONFIG_CRYPTO_NULL=y
+CONFIG_CRYPTO_NULL2=y
+# CONFIG_CRYPTO_PCRYPT is not set
+CONFIG_CRYPTO_CRYPTD=m
+CONFIG_CRYPTO_AUTHENC=y
+# CONFIG_CRYPTO_TEST is not set
+CONFIG_CRYPTO_ENGINE=m
+
+#
+# Public-key cryptography
+#
+CONFIG_CRYPTO_RSA=y
+# CONFIG_CRYPTO_DH is not set
+# CONFIG_CRYPTO_ECDH is not set
+# CONFIG_CRYPTO_ECDSA is not set
+# CONFIG_CRYPTO_ECRDSA is not set
+# CONFIG_CRYPTO_SM2 is not set
+# CONFIG_CRYPTO_CURVE25519 is not set
+
+#
+# Authenticated Encryption with Associated Data
+#
+CONFIG_CRYPTO_CCM=m
+CONFIG_CRYPTO_GCM=y
+# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
+# CONFIG_CRYPTO_AEGIS128 is not set
+CONFIG_CRYPTO_SEQIV=m
+CONFIG_CRYPTO_ECHAINIV=m
+
+#
+# Block modes
+#
+CONFIG_CRYPTO_CBC=y
+# CONFIG_CRYPTO_CFB is not set
+CONFIG_CRYPTO_CTR=y
+CONFIG_CRYPTO_CTS=y
+CONFIG_CRYPTO_ECB=y
+CONFIG_CRYPTO_LRW=m
+# CONFIG_CRYPTO_OFB is not set
+# CONFIG_CRYPTO_PCBC is not set
+CONFIG_CRYPTO_XTS=m
+# CONFIG_CRYPTO_KEYWRAP is not set
+# CONFIG_CRYPTO_ADIANTUM is not set
+CONFIG_CRYPTO_ESSIV=y
+
+#
+# Hash modes
+#
+CONFIG_CRYPTO_CMAC=m
+CONFIG_CRYPTO_HMAC=y
+# CONFIG_CRYPTO_XCBC is not set
+# CONFIG_CRYPTO_VMAC is not set
+
+#
+# Digest
+#
+CONFIG_CRYPTO_CRC32C=y
+# CONFIG_CRYPTO_CRC32 is not set
+# CONFIG_CRYPTO_XXHASH is not set
+# CONFIG_CRYPTO_BLAKE2B is not set
+CONFIG_CRYPTO_CRCT10DIF=y
+CONFIG_CRYPTO_GHASH=y
+# CONFIG_CRYPTO_POLY1305 is not set
+CONFIG_CRYPTO_MD4=m
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_MICHAEL_MIC=m
+# CONFIG_CRYPTO_RMD160 is not set
+CONFIG_CRYPTO_SHA1=y
+CONFIG_CRYPTO_SHA256=y
+CONFIG_CRYPTO_SHA512=m
+# CONFIG_CRYPTO_SHA3 is not set
+# CONFIG_CRYPTO_SM3 is not set
+# CONFIG_CRYPTO_STREEBOG is not set
+# CONFIG_CRYPTO_WP512 is not set
+
+#
+# Ciphers
+#
+CONFIG_CRYPTO_AES=y
+# CONFIG_CRYPTO_AES_TI is not set
+# CONFIG_CRYPTO_ANUBIS is not set
+CONFIG_CRYPTO_ARC4=y
+# CONFIG_CRYPTO_BLOWFISH is not set
+# CONFIG_CRYPTO_CAMELLIA is not set
+# CONFIG_CRYPTO_CAST5 is not set
+# CONFIG_CRYPTO_CAST6 is not set
+CONFIG_CRYPTO_DES=y
+# CONFIG_CRYPTO_FCRYPT is not set
+# CONFIG_CRYPTO_KHAZAD is not set
+# CONFIG_CRYPTO_CHACHA20 is not set
+# CONFIG_CRYPTO_SEED is not set
+# CONFIG_CRYPTO_SERPENT is not set
+# CONFIG_CRYPTO_SM4 is not set
+# CONFIG_CRYPTO_TEA is not set
+# CONFIG_CRYPTO_TWOFISH is not set
+
+#
+# Compression
+#
+CONFIG_CRYPTO_DEFLATE=y
+CONFIG_CRYPTO_LZO=m
+# CONFIG_CRYPTO_842 is not set
+CONFIG_CRYPTO_LZ4=m
+# CONFIG_CRYPTO_LZ4HC is not set
+# CONFIG_CRYPTO_ZSTD is not set
+
+#
+# Random Number Generation
+#
+# CONFIG_CRYPTO_ANSI_CPRNG is not set
+CONFIG_CRYPTO_DRBG_MENU=m
+CONFIG_CRYPTO_DRBG_HMAC=y
+# CONFIG_CRYPTO_DRBG_HASH is not set
+# CONFIG_CRYPTO_DRBG_CTR is not set
+CONFIG_CRYPTO_DRBG=m
+CONFIG_CRYPTO_JITTERENTROPY=m
+CONFIG_CRYPTO_USER_API=y
+CONFIG_CRYPTO_USER_API_HASH=y
+CONFIG_CRYPTO_USER_API_SKCIPHER=y
+# CONFIG_CRYPTO_USER_API_RNG is not set
+CONFIG_CRYPTO_USER_API_AEAD=y
+CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
+CONFIG_CRYPTO_HASH_INFO=y
+CONFIG_CRYPTO_HW=y
+# CONFIG_CRYPTO_DEV_CCP is not set
+# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
+# CONFIG_CRYPTO_DEV_CAVIUM_ZIP is not set
+CONFIG_CRYPTO_DEV_VIRTIO=m
+# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
+# CONFIG_CRYPTO_DEV_CCREE is not set
+# CONFIG_CRYPTO_DEV_HISI_SEC is not set
+# CONFIG_CRYPTO_DEV_HISI_SEC2 is not set
+# CONFIG_CRYPTO_DEV_HISI_ZIP is not set
+# CONFIG_CRYPTO_DEV_HISI_HPRE is not set
+# CONFIG_CRYPTO_DEV_HISI_TRNG is not set
+# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
+CONFIG_ASYMMETRIC_KEY_TYPE=y
+CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
+CONFIG_X509_CERTIFICATE_PARSER=y
+# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
+CONFIG_PKCS7_MESSAGE_PARSER=y
+# CONFIG_PKCS7_TEST_KEY is not set
+# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
+
+#
+# Certificates for signature checking
+#
+CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
+CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
+# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set
+CONFIG_SYSTEM_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_TRUSTED_KEYS="google/certs/lakitu_root_cert.pem"
+# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
+CONFIG_SECONDARY_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
+# CONFIG_SYSTEM_REVOCATION_LIST is not set
+# end of Certificates for signature checking
+
+CONFIG_BINARY_PRINTF=y
+
+#
+# Library routines
+#
+CONFIG_RAID6_PQ=m
+CONFIG_RAID6_PQ_BENCHMARK=y
+# CONFIG_PACKING is not set
+CONFIG_BITREVERSE=y
+CONFIG_HAVE_ARCH_BITREVERSE=y
+CONFIG_GENERIC_STRNCPY_FROM_USER=y
+CONFIG_GENERIC_STRNLEN_USER=y
+CONFIG_GENERIC_NET_UTILS=y
+CONFIG_GENERIC_FIND_FIRST_BIT=y
+# CONFIG_CORDIC is not set
+# CONFIG_PRIME_NUMBERS is not set
+CONFIG_RATIONAL=y
+CONFIG_GENERIC_PCI_IOMAP=y
+CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
+CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
+CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
+# CONFIG_INDIRECT_PIO is not set
+
+#
+# Crypto library routines
+#
+CONFIG_CRYPTO_LIB_AES=y
+CONFIG_CRYPTO_LIB_ARC4=y
+CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
+CONFIG_CRYPTO_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_DES=y
+CONFIG_CRYPTO_LIB_POLY1305_RSIZE=9
+CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
+CONFIG_CRYPTO_LIB_SHA256=y
+# end of Crypto library routines
+
+CONFIG_LIB_MEMNEQ=y
+CONFIG_CRC_CCITT=y
+CONFIG_CRC16=y
+CONFIG_CRC_T10DIF=y
+CONFIG_CRC_ITU_T=y
+CONFIG_CRC32=y
+# CONFIG_CRC32_SELFTEST is not set
+CONFIG_CRC32_SLICEBY8=y
+# CONFIG_CRC32_SLICEBY4 is not set
+# CONFIG_CRC32_SARWATE is not set
+# CONFIG_CRC32_BIT is not set
+CONFIG_CRC64=m
+# CONFIG_CRC4 is not set
+CONFIG_CRC7=m
+CONFIG_LIBCRC32C=y
+# CONFIG_CRC8 is not set
+CONFIG_XXHASH=y
+CONFIG_AUDIT_GENERIC=y
+CONFIG_AUDIT_ARCH_COMPAT_GENERIC=y
+CONFIG_AUDIT_COMPAT_GENERIC=y
+# CONFIG_RANDOM32_SELFTEST is not set
+CONFIG_ZLIB_INFLATE=y
+CONFIG_ZLIB_DEFLATE=y
+CONFIG_LZO_COMPRESS=m
+CONFIG_LZO_DECOMPRESS=m
+CONFIG_LZ4_COMPRESS=m
+CONFIG_LZ4_DECOMPRESS=y
+CONFIG_ZSTD_DECOMPRESS=y
+CONFIG_XZ_DEC=y
+CONFIG_XZ_DEC_X86=y
+# CONFIG_XZ_DEC_POWERPC is not set
+# CONFIG_XZ_DEC_IA64 is not set
+# CONFIG_XZ_DEC_ARM is not set
+# CONFIG_XZ_DEC_ARMTHUMB is not set
+# CONFIG_XZ_DEC_SPARC is not set
+CONFIG_XZ_DEC_BCJ=y
+# CONFIG_XZ_DEC_TEST is not set
+CONFIG_DECOMPRESS_GZIP=y
+CONFIG_DECOMPRESS_XZ=y
+CONFIG_DECOMPRESS_LZ4=y
+CONFIG_DECOMPRESS_ZSTD=y
+CONFIG_GENERIC_ALLOCATOR=y
+CONFIG_TEXTSEARCH=y
+CONFIG_TEXTSEARCH_KMP=m
+CONFIG_TEXTSEARCH_BM=m
+CONFIG_TEXTSEARCH_FSM=m
+CONFIG_INTERVAL_TREE=y
+CONFIG_XARRAY_MULTI=y
+CONFIG_ASSOCIATIVE_ARRAY=y
+CONFIG_HAS_IOMEM=y
+CONFIG_HAS_IOPORT_MAP=y
+CONFIG_HAS_DMA=y
+CONFIG_DMA_OPS=y
+CONFIG_NEED_SG_DMA_LENGTH=y
+CONFIG_NEED_DMA_MAP_STATE=y
+CONFIG_ARCH_DMA_ADDR_T_64BIT=y
+CONFIG_DMA_DECLARE_COHERENT=y
+CONFIG_ARCH_HAS_SETUP_DMA_OPS=y
+CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y
+CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y
+CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y
+CONFIG_SWIOTLB=y
+# CONFIG_DMA_RESTRICTED_POOL is not set
+CONFIG_DMA_NONCOHERENT_MMAP=y
+CONFIG_DMA_COHERENT_POOL=y
+CONFIG_DMA_REMAP=y
+CONFIG_DMA_DIRECT_REMAP=y
+# CONFIG_DMA_CMA is not set
+# CONFIG_DMA_API_DEBUG is not set
+# CONFIG_DMA_MAP_BENCHMARK is not set
+CONFIG_SGL_ALLOC=y
+CONFIG_CPU_RMAP=y
+CONFIG_DQL=y
+CONFIG_GLOB=y
+# CONFIG_GLOB_SELFTEST is not set
+CONFIG_NLATTR=y
+CONFIG_CLZ_TAB=y
+CONFIG_IRQ_POLL=y
+CONFIG_MPILIB=y
+CONFIG_SIGNATURE=y
+CONFIG_DIMLIB=y
+CONFIG_LIBFDT=y
+CONFIG_OID_REGISTRY=y
+CONFIG_UCS2_STRING=y
+CONFIG_HAVE_GENERIC_VDSO=y
+CONFIG_GENERIC_GETTIMEOFDAY=y
+CONFIG_GENERIC_VDSO_TIME_NS=y
+CONFIG_FONT_SUPPORT=y
+CONFIG_FONT_8x16=y
+CONFIG_FONT_AUTOSELECT=y
+CONFIG_SG_POOL=y
+CONFIG_ARCH_STACKWALK=y
+CONFIG_SBITMAP=y
+# end of Library routines
+
+CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y
+
+#
+# Kernel hacking
+#
+
+#
+# printk and dmesg options
+#
+CONFIG_PRINTK_TIME=y
+# CONFIG_PRINTK_CALLER is not set
+# CONFIG_STACKTRACE_BUILD_ID is not set
+CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
+CONFIG_CONSOLE_LOGLEVEL_QUIET=4
+CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
+# CONFIG_BOOT_PRINTK_DELAY is not set
+# CONFIG_DYNAMIC_DEBUG is not set
+# CONFIG_DYNAMIC_DEBUG_CORE is not set
+CONFIG_SYMBOLIC_ERRNAME=y
+CONFIG_DEBUG_BUGVERBOSE=y
+# end of printk and dmesg options
+
+CONFIG_AS_HAS_NON_CONST_LEB128=y
+
+#
+# Compile-time checks and compiler options
+#
+CONFIG_DEBUG_INFO=y
+# CONFIG_DEBUG_INFO_REDUCED is not set
+# CONFIG_DEBUG_INFO_COMPRESSED is not set
+# CONFIG_DEBUG_INFO_SPLIT is not set
+# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
+CONFIG_DEBUG_INFO_DWARF4=y
+# CONFIG_DEBUG_INFO_DWARF5 is not set
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_PAHOLE_HAS_SPLIT_BTF=y
+CONFIG_DEBUG_INFO_BTF_MODULES=y
+# CONFIG_GDB_SCRIPTS is not set
+CONFIG_FRAME_WARN=2048
+# CONFIG_STRIP_ASM_SYMS is not set
+# CONFIG_HEADERS_INSTALL is not set
+CONFIG_SECTION_MISMATCH_WARN_ONLY=y
+# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
+CONFIG_ARCH_WANT_FRAME_POINTERS=y
+CONFIG_FRAME_POINTER=y
+# CONFIG_VMLINUX_MAP is not set
+# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
+# end of Compile-time checks and compiler options
+
+#
+# Generic Kernel Debugging Instruments
+#
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_SERIAL=y
+CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_FS_ALLOW_ALL=y
+# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
+# CONFIG_DEBUG_FS_ALLOW_NONE is not set
+CONFIG_HAVE_ARCH_KGDB=y
+# CONFIG_KGDB is not set
+CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
+CONFIG_UBSAN=y
+# CONFIG_UBSAN_TRAP is not set
+CONFIG_CC_HAS_UBSAN_BOUNDS=y
+CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y
+CONFIG_UBSAN_BOUNDS=y
+CONFIG_UBSAN_ARRAY_BOUNDS=y
+# CONFIG_UBSAN_SHIFT is not set
+# CONFIG_UBSAN_DIV_ZERO is not set
+# CONFIG_UBSAN_UNREACHABLE is not set
+# CONFIG_UBSAN_BOOL is not set
+# CONFIG_UBSAN_ENUM is not set
+# CONFIG_UBSAN_ALIGNMENT is not set
+# CONFIG_UBSAN_SANITIZE_ALL is not set
+# CONFIG_TEST_UBSAN is not set
+CONFIG_HAVE_KCSAN_COMPILER=y
+# end of Generic Kernel Debugging Instruments
+
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_MISC=y
+
+#
+# Memory Debugging
+#
+# CONFIG_PAGE_EXTENSION is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
+# CONFIG_PAGE_OWNER is not set
+# CONFIG_PAGE_POISONING is not set
+# CONFIG_DEBUG_PAGE_REF is not set
+# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_ARCH_HAS_DEBUG_WX=y
+# CONFIG_DEBUG_WX is not set
+CONFIG_GENERIC_PTDUMP=y
+# CONFIG_PTDUMP_DEBUGFS is not set
+# CONFIG_DEBUG_OBJECTS is not set
+# CONFIG_SLUB_DEBUG_ON is not set
+# CONFIG_SLUB_STATS is not set
+CONFIG_HAVE_DEBUG_KMEMLEAK=y
+# CONFIG_DEBUG_KMEMLEAK is not set
+# CONFIG_DEBUG_STACK_USAGE is not set
+# CONFIG_SCHED_STACK_END_CHECK is not set
+CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
+# CONFIG_DEBUG_VM is not set
+# CONFIG_DEBUG_VM_PGTABLE is not set
+CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
+# CONFIG_DEBUG_VIRTUAL is not set
+# CONFIG_DEBUG_MEMORY_INIT is not set
+# CONFIG_DEBUG_PER_CPU_MAPS is not set
+CONFIG_HAVE_ARCH_KASAN=y
+CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y
+CONFIG_HAVE_ARCH_KASAN_HW_TAGS=y
+CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
+CONFIG_CC_HAS_KASAN_GENERIC=y
+CONFIG_CC_HAS_KASAN_SW_TAGS=y
+CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
+# CONFIG_KASAN is not set
+CONFIG_HAVE_ARCH_KFENCE=y
+# CONFIG_KFENCE is not set
+# end of Memory Debugging
+
+# CONFIG_DEBUG_SHIRQ is not set
+
+#
+# Debug Oops, Lockups and Hangs
+#
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PANIC_ON_OOPS_VALUE=1
+CONFIG_PANIC_TIMEOUT=-1
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_SOFTLOCKUP_DETECTOR=y
+# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
+# CONFIG_WQ_WATCHDOG is not set
+# CONFIG_TEST_LOCKUP is not set
+# end of Debug Oops, Lockups and Hangs
+
+#
+# Scheduler Debugging
+#
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHED_INFO=y
+CONFIG_SCHEDSTATS=y
+# end of Scheduler Debugging
+
+# CONFIG_DEBUG_TIMEKEEPING is not set
+
+#
+# Lock Debugging (spinlocks, mutexes, etc...)
+#
+CONFIG_LOCK_DEBUGGING_SUPPORT=y
+# CONFIG_PROVE_LOCKING is not set
+# CONFIG_LOCK_STAT is not set
+# CONFIG_DEBUG_RT_MUTEXES is not set
+# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_DEBUG_MUTEXES is not set
+# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
+# CONFIG_DEBUG_RWSEMS is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set
+# CONFIG_DEBUG_ATOMIC_SLEEP is not set
+# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
+# CONFIG_LOCK_TORTURE_TEST is not set
+# CONFIG_WW_MUTEX_SELFTEST is not set
+# CONFIG_SCF_TORTURE_TEST is not set
+# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
+# end of Lock Debugging (spinlocks, mutexes, etc...)
+
+# CONFIG_DEBUG_IRQFLAGS is not set
+CONFIG_STACKTRACE=y
+# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
+# CONFIG_DEBUG_KOBJECT is not set
+
+#
+# Debug kernel data structures
+#
+CONFIG_DEBUG_LIST=y
+# CONFIG_DEBUG_PLIST is not set
+# CONFIG_DEBUG_SG is not set
+CONFIG_DEBUG_NOTIFIERS=y
+# CONFIG_BUG_ON_DATA_CORRUPTION is not set
+# end of Debug kernel data structures
+
+# CONFIG_DEBUG_CREDENTIALS is not set
+
+#
+# RCU Debugging
+#
+# CONFIG_RCU_SCALE_TEST is not set
+# CONFIG_RCU_TORTURE_TEST is not set
+# CONFIG_RCU_REF_SCALE_TEST is not set
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+# CONFIG_RCU_TRACE is not set
+# CONFIG_RCU_EQS_DEBUG is not set
+# end of RCU Debugging
+
+# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
+# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
+# CONFIG_LATENCYTOP is not set
+CONFIG_NOP_TRACER=y
+CONFIG_HAVE_FUNCTION_TRACER=y
+CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
+CONFIG_HAVE_DYNAMIC_FTRACE=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
+CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
+CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
+CONFIG_HAVE_C_RECORDMCOUNT=y
+CONFIG_TRACE_CLOCK=y
+CONFIG_RING_BUFFER=y
+CONFIG_EVENT_TRACING=y
+CONFIG_CONTEXT_SWITCH_TRACER=y
+CONFIG_TRACING=y
+CONFIG_GENERIC_TRACER=y
+CONFIG_TRACING_SUPPORT=y
+CONFIG_FTRACE=y
+# CONFIG_BOOTTIME_TRACING is not set
+CONFIG_FUNCTION_TRACER=y
+CONFIG_FUNCTION_GRAPH_TRACER=y
+CONFIG_DYNAMIC_FTRACE=y
+CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
+# CONFIG_FUNCTION_PROFILER is not set
+# CONFIG_STACK_TRACER is not set
+# CONFIG_IRQSOFF_TRACER is not set
+# CONFIG_SCHED_TRACER is not set
+# CONFIG_HWLAT_TRACER is not set
+# CONFIG_OSNOISE_TRACER is not set
+# CONFIG_TIMERLAT_TRACER is not set
+CONFIG_FTRACE_SYSCALLS=y
+# CONFIG_TRACER_SNAPSHOT is not set
+CONFIG_BRANCH_PROFILE_NONE=y
+# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
+# CONFIG_PROFILE_ALL_BRANCHES is not set
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_KPROBE_EVENTS=y
+# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
+CONFIG_UPROBE_EVENTS=y
+CONFIG_BPF_EVENTS=y
+CONFIG_DYNAMIC_EVENTS=y
+CONFIG_PROBE_EVENTS=y
+# CONFIG_BPF_KPROBE_OVERRIDE is not set
+CONFIG_FTRACE_MCOUNT_RECORD=y
+CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y
+# CONFIG_SYNTH_EVENTS is not set
+# CONFIG_HIST_TRIGGERS is not set
+# CONFIG_TRACE_EVENT_INJECT is not set
+# CONFIG_TRACEPOINT_BENCHMARK is not set
+# CONFIG_RING_BUFFER_BENCHMARK is not set
+# CONFIG_TRACE_EVAL_MAP_FILE is not set
+# CONFIG_FTRACE_RECORD_RECURSION is not set
+# CONFIG_FTRACE_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
+# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
+# CONFIG_KPROBE_EVENT_GEN_TEST is not set
+# CONFIG_SAMPLES is not set
+CONFIG_STRICT_DEVMEM=y
+CONFIG_IO_STRICT_DEVMEM=y
+
+#
+# arm64 Debugging
+#
+# CONFIG_PID_IN_CONTEXTIDR is not set
+# CONFIG_DEBUG_EFI is not set
+# CONFIG_ARM64_RELOC_TEST is not set
+# CONFIG_CORESIGHT is not set
+# end of arm64 Debugging
+
+#
+# Kernel Testing and Coverage
+#
+# CONFIG_KUNIT is not set
+# CONFIG_NOTIFIER_ERROR_INJECTION is not set
+CONFIG_FUNCTION_ERROR_INJECTION=y
+# CONFIG_FAULT_INJECTION is not set
+CONFIG_ARCH_HAS_KCOV=y
+CONFIG_CC_HAS_SANCOV_TRACE_PC=y
+# CONFIG_KCOV is not set
+CONFIG_RUNTIME_TESTING_MENU=y
+# CONFIG_LKDTM is not set
+# CONFIG_TEST_MIN_HEAP is not set
+# CONFIG_TEST_DIV64 is not set
+# CONFIG_KPROBES_SANITY_TEST is not set
+# CONFIG_BACKTRACE_SELF_TEST is not set
+# CONFIG_RBTREE_TEST is not set
+# CONFIG_REED_SOLOMON_TEST is not set
+# CONFIG_INTERVAL_TREE_TEST is not set
+# CONFIG_PERCPU_TEST is not set
+# CONFIG_ATOMIC64_SELFTEST is not set
+# CONFIG_ASYNC_RAID6_TEST is not set
+# CONFIG_TEST_HEXDUMP is not set
+# CONFIG_STRING_SELFTEST is not set
+# CONFIG_TEST_STRING_HELPERS is not set
+# CONFIG_TEST_STRSCPY is not set
+# CONFIG_TEST_KSTRTOX is not set
+# CONFIG_TEST_PRINTF is not set
+# CONFIG_TEST_SCANF is not set
+# CONFIG_TEST_BITMAP is not set
+# CONFIG_TEST_UUID is not set
+# CONFIG_TEST_XARRAY is not set
+# CONFIG_TEST_OVERFLOW is not set
+# CONFIG_TEST_RHASHTABLE is not set
+# CONFIG_TEST_HASH is not set
+# CONFIG_TEST_IDA is not set
+# CONFIG_TEST_LKM is not set
+# CONFIG_TEST_BITOPS is not set
+# CONFIG_TEST_VMALLOC is not set
+# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_BPF=m
+# CONFIG_TEST_BLACKHOLE_DEV is not set
+# CONFIG_FIND_BIT_BENCHMARK is not set
+# CONFIG_TEST_FIRMWARE is not set
+# CONFIG_TEST_SYSCTL is not set
+# CONFIG_TEST_UDELAY is not set
+# CONFIG_TEST_STATIC_KEYS is not set
+# CONFIG_TEST_KMOD is not set
+# CONFIG_TEST_MEMCAT_P is not set
+# CONFIG_TEST_STACKINIT is not set
+# CONFIG_TEST_MEMINIT is not set
+# CONFIG_TEST_FREE_PAGES is not set
+CONFIG_ARCH_USE_MEMTEST=y
+# CONFIG_MEMTEST is not set
+# end of Kernel Testing and Coverage
+# end of Kernel hacking
diff --git a/arch/powerpc/include/asm/mem_encrypt.h b/arch/powerpc/include/asm/mem_encrypt.h
index ba9dab0..2f26b8f 100644
--- a/arch/powerpc/include/asm/mem_encrypt.h
+++ b/arch/powerpc/include/asm/mem_encrypt.h
@@ -10,11 +10,6 @@
#include <asm/svm.h>
-static inline bool mem_encrypt_active(void)
-{
- return is_secure_guest();
-}
-
static inline bool force_dma_unencrypted(struct device *dev)
{
return is_secure_guest();
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 87f001b..c083ecb 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -8,6 +8,7 @@
#include <linux/mm.h>
#include <linux/memblock.h>
+#include <linux/cc_platform.h>
#include <asm/machdep.h>
#include <asm/svm.h>
#include <asm/swiotlb.h>
@@ -63,7 +64,7 @@
int set_memory_encrypted(unsigned long addr, int numpages)
{
- if (!mem_encrypt_active())
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
if (!PAGE_ALIGNED(addr))
@@ -76,7 +77,7 @@
int set_memory_decrypted(unsigned long addr, int numpages)
{
- if (!mem_encrypt_active())
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
if (!PAGE_ALIGNED(addr))
diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h
index 2542cbf..08a8b96 100644
--- a/arch/s390/include/asm/mem_encrypt.h
+++ b/arch/s390/include/asm/mem_encrypt.h
@@ -4,8 +4,6 @@
#ifndef __ASSEMBLY__
-static inline bool mem_encrypt_active(void) { return false; }
-
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 15c5ae6..89bfa72 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -107,6 +107,7 @@
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
+vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
efi-obj-$(CONFIG_EFI_STUB) = $(objtree)/drivers/firmware/efi/libstub/lib.a
$(obj)/vmlinux: $(vmlinux-objs-y) $(efi-obj-y) FORCE
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 8bcbcee..64b172d 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -20,153 +20,56 @@
*/
struct mem_vector immovable_mem[MAX_NUMNODES*2];
-/*
- * Search EFI system tables for RSDP. If both ACPI_20_TABLE_GUID and
- * ACPI_TABLE_GUID are found, take the former, which has more features.
- */
static acpi_physical_address
-__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
- bool efi_64)
+__efi_get_rsdp_addr(unsigned long cfg_tbl_pa, unsigned int cfg_tbl_len)
{
- acpi_physical_address rsdp_addr = 0;
-
#ifdef CONFIG_EFI
- int i;
+ unsigned long rsdp_addr;
+ int ret;
- /* Get EFI tables from systab. */
- for (i = 0; i < nr_tables; i++) {
- acpi_physical_address table;
- efi_guid_t guid;
+ /*
+ * Search EFI system tables for RSDP. Preferred is ACPI_20_TABLE_GUID to
+ * ACPI_TABLE_GUID because it has more features.
+ */
+ rsdp_addr = efi_find_vendor_table(boot_params, cfg_tbl_pa, cfg_tbl_len,
+ ACPI_20_TABLE_GUID);
+ if (rsdp_addr)
+ return (acpi_physical_address)rsdp_addr;
- if (efi_64) {
- efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
+ /* No ACPI_20_TABLE_GUID found, fallback to ACPI_TABLE_GUID. */
+ rsdp_addr = efi_find_vendor_table(boot_params, cfg_tbl_pa, cfg_tbl_len,
+ ACPI_TABLE_GUID);
+ if (rsdp_addr)
+ return (acpi_physical_address)rsdp_addr;
- guid = tbl->guid;
- table = tbl->table;
-
- if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
- debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
- return 0;
- }
- } else {
- efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
-
- guid = tbl->guid;
- table = tbl->table;
- }
-
- if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
- rsdp_addr = table;
- else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
- return table;
- }
+ debug_putstr("Error getting RSDP address.\n");
#endif
- return rsdp_addr;
+ return 0;
}
-/* EFI/kexec support is 64-bit only. */
-#ifdef CONFIG_X86_64
-static struct efi_setup_data *get_kexec_setup_data_addr(void)
-{
- struct setup_data *data;
- u64 pa_data;
-
- pa_data = boot_params->hdr.setup_data;
- while (pa_data) {
- data = (struct setup_data *)pa_data;
- if (data->type == SETUP_EFI)
- return (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
-
- pa_data = data->next;
- }
- return NULL;
-}
-
-static acpi_physical_address kexec_get_rsdp_addr(void)
-{
- efi_system_table_64_t *systab;
- struct efi_setup_data *esd;
- struct efi_info *ei;
- char *sig;
-
- esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
- if (!esd)
- return 0;
-
- if (!esd->tables) {
- debug_putstr("Wrong kexec SETUP_EFI data.\n");
- return 0;
- }
-
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
- if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
- debug_putstr("Wrong kexec EFI loader signature.\n");
- return 0;
- }
-
- /* Get systab from boot params. */
- systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
- if (!systab)
- error("EFI system table not found in kexec boot_params.");
-
- return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
-}
-#else
-static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
-#endif /* CONFIG_X86_64 */
-
static acpi_physical_address efi_get_rsdp_addr(void)
{
#ifdef CONFIG_EFI
- unsigned long systab, config_tables;
+ unsigned long cfg_tbl_pa = 0;
+ unsigned int cfg_tbl_len;
+ unsigned long systab_pa;
unsigned int nr_tables;
- struct efi_info *ei;
- bool efi_64;
- char *sig;
+ enum efi_type et;
+ int ret;
- ei = &boot_params->efi_info;
- sig = (char *)&ei->efi_loader_signature;
-
- if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
- efi_64 = true;
- } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
- efi_64 = false;
- } else {
- debug_putstr("Wrong EFI loader signature.\n");
+ et = efi_get_type(boot_params);
+ if (et == EFI_TYPE_NONE)
return 0;
- }
- /* Get systab from boot params. */
-#ifdef CONFIG_X86_64
- systab = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
-#else
- if (ei->efi_systab_hi || ei->efi_memmap_hi) {
- debug_putstr("Error getting RSDP address: EFI system table located above 4GB.\n");
- return 0;
- }
- systab = ei->efi_systab;
-#endif
- if (!systab)
- error("EFI system table not found.");
+ systab_pa = efi_get_system_table(boot_params);
+ if (!systab_pa)
+ error("EFI support advertised, but unable to locate system table.");
- /* Handle EFI bitness properly */
- if (efi_64) {
- efi_system_table_64_t *stbl = (efi_system_table_64_t *)systab;
+ ret = efi_get_conf_table(boot_params, &cfg_tbl_pa, &cfg_tbl_len);
+ if (ret || !cfg_tbl_pa)
+ error("EFI config table not found.");
- config_tables = stbl->tables;
- nr_tables = stbl->nr_tables;
- } else {
- efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
-
- config_tables = stbl->tables;
- nr_tables = stbl->nr_tables;
- }
-
- if (!config_tables)
- error("EFI config tables not found.");
-
- return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
+ return __efi_get_rsdp_addr(cfg_tbl_pa, cfg_tbl_len);
#else
return 0;
#endif
@@ -256,14 +159,6 @@
pa = boot_params->acpi_rsdp_addr;
- /*
- * Try to get EFI data from setup_data. This can happen when we're a
- * kexec'ed kernel and kexec(1) has passed all the required EFI info to
- * us.
- */
- if (!pa)
- pa = kexec_get_rsdp_addr();
-
if (!pa)
pa = efi_get_rsdp_addr();
diff --git a/arch/x86/boot/compressed/efi.c b/arch/x86/boot/compressed/efi.c
new file mode 100644
index 0000000..09fa3b5
--- /dev/null
+++ b/arch/x86/boot/compressed/efi.c
@@ -0,0 +1,236 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Helpers for early access to EFI configuration table.
+ *
+ * Originally derived from arch/x86/boot/compressed/acpi.c
+ */
+
+#include "misc.h"
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+/**
+ * efi_get_type - Given a pointer to boot_params, determine the type of EFI environment.
+ *
+ * @bp: pointer to boot_params
+ *
+ * Return: EFI_TYPE_{32,64} for valid EFI environments, EFI_TYPE_NONE otherwise.
+ */
+enum efi_type efi_get_type(struct boot_params *bp)
+{
+ struct efi_info *ei;
+ enum efi_type et;
+ const char *sig;
+
+ ei = &bp->efi_info;
+ sig = (char *)&ei->efi_loader_signature;
+
+ if (!strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_64;
+ } else if (!strncmp(sig, EFI32_LOADER_SIGNATURE, 4)) {
+ et = EFI_TYPE_32;
+ } else {
+ debug_putstr("No EFI environment detected.\n");
+ et = EFI_TYPE_NONE;
+ }
+
+#ifndef CONFIG_X86_64
+ /*
+ * Existing callers like acpi.c treat this case as an indicator to
+ * fall-through to non-EFI, rather than an error, so maintain that
+ * functionality here as well.
+ */
+ if (ei->efi_systab_hi || ei->efi_memmap_hi) {
+ debug_putstr("EFI system table is located above 4GB and cannot be accessed.\n");
+ et = EFI_TYPE_NONE;
+ }
+#endif
+
+ return et;
+}
+
+/**
+ * efi_get_system_table - Given a pointer to boot_params, retrieve the physical address
+ * of the EFI system table.
+ *
+ * @bp: pointer to boot_params
+ *
+ * Return: EFI system table address on success. On error, return 0.
+ */
+unsigned long efi_get_system_table(struct boot_params *bp)
+{
+ unsigned long sys_tbl_pa;
+ struct efi_info *ei;
+ enum efi_type et;
+
+ /* Get systab from boot params. */
+ ei = &bp->efi_info;
+#ifdef CONFIG_X86_64
+ sys_tbl_pa = ei->efi_systab | ((__u64)ei->efi_systab_hi << 32);
+#else
+ sys_tbl_pa = ei->efi_systab;
+#endif
+ if (!sys_tbl_pa) {
+ debug_putstr("EFI system table not found.");
+ return 0;
+ }
+
+ return sys_tbl_pa;
+}
+
+/*
+ * EFI config table address changes to virtual address after boot, which may
+ * not be accessible for the kexec'd kernel. To address this, kexec provides
+ * the initial physical address via a struct setup_data entry, which is
+ * checked for here, along with some sanity checks.
+ */
+static struct efi_setup_data *get_kexec_setup_data(struct boot_params *bp,
+ enum efi_type et)
+{
+#ifdef CONFIG_X86_64
+ struct efi_setup_data *esd = NULL;
+ struct setup_data *data;
+ u64 pa_data;
+
+ pa_data = bp->hdr.setup_data;
+ while (pa_data) {
+ data = (struct setup_data *)pa_data;
+ if (data->type == SETUP_EFI) {
+ esd = (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
+ break;
+ }
+
+ pa_data = data->next;
+ }
+
+ /*
+ * Original ACPI code falls back to attempting normal EFI boot in these
+ * cases, so maintain existing behavior by indicating non-kexec
+ * environment to the caller, but print them for debugging.
+ */
+ if (esd && !esd->tables) {
+ debug_putstr("kexec EFI environment missing valid configuration table.\n");
+ return NULL;
+ }
+
+ return esd;
+#endif
+ return NULL;
+}
+
+/**
+ * efi_get_conf_table - Given a pointer to boot_params, locate and return the physical
+ * address of EFI configuration table.
+ *
+ * @bp: pointer to boot_params
+ * @cfg_tbl_pa: location to store physical address of config table
+ * @cfg_tbl_len: location to store number of config table entries
+ *
+ * Return: 0 on success. On error, return params are left unchanged.
+ */
+int efi_get_conf_table(struct boot_params *bp, unsigned long *cfg_tbl_pa,
+ unsigned int *cfg_tbl_len)
+{
+ unsigned long sys_tbl_pa;
+ enum efi_type et;
+ int ret;
+
+ if (!cfg_tbl_pa || !cfg_tbl_len)
+ return -EINVAL;
+
+ sys_tbl_pa = efi_get_system_table(bp);
+ if (!sys_tbl_pa)
+ return -EINVAL;
+
+ /* Handle EFI bitness properly */
+ et = efi_get_type(bp);
+ if (et == EFI_TYPE_64) {
+ efi_system_table_64_t *stbl = (efi_system_table_64_t *)sys_tbl_pa;
+ struct efi_setup_data *esd;
+
+ /* kexec provides an alternative EFI conf table, check for it. */
+ esd = get_kexec_setup_data(bp, et);
+
+ *cfg_tbl_pa = esd ? esd->tables : stbl->tables;
+ *cfg_tbl_len = stbl->nr_tables;
+ } else if (et == EFI_TYPE_32) {
+ efi_system_table_32_t *stbl = (efi_system_table_32_t *)sys_tbl_pa;
+
+ *cfg_tbl_pa = stbl->tables;
+ *cfg_tbl_len = stbl->nr_tables;
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/* Get vendor table address/guid from EFI config table at the given index */
+static int get_vendor_table(void *cfg_tbl, unsigned int idx,
+ unsigned long *vendor_tbl_pa,
+ efi_guid_t *vendor_tbl_guid,
+ enum efi_type et)
+{
+ if (et == EFI_TYPE_64) {
+ efi_config_table_64_t *tbl_entry = (efi_config_table_64_t *)cfg_tbl + idx;
+
+ if (!IS_ENABLED(CONFIG_X86_64) && tbl_entry->table >> 32) {
+ debug_putstr("Error: EFI config table entry located above 4GB.\n");
+ return -EINVAL;
+ }
+
+ *vendor_tbl_pa = tbl_entry->table;
+ *vendor_tbl_guid = tbl_entry->guid;
+
+ } else if (et == EFI_TYPE_32) {
+ efi_config_table_32_t *tbl_entry = (efi_config_table_32_t *)cfg_tbl + idx;
+
+ *vendor_tbl_pa = tbl_entry->table;
+ *vendor_tbl_guid = tbl_entry->guid;
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+/**
+ * efi_find_vendor_table - Given EFI config table, search it for the physical
+ * address of the vendor table associated with GUID.
+ *
+ * @bp: pointer to boot_params
+ * @cfg_tbl_pa: pointer to EFI configuration table
+ * @cfg_tbl_len: number of entries in EFI configuration table
+ * @guid: GUID of vendor table
+ *
+ * Return: vendor table address on success. On error, return 0.
+ */
+unsigned long efi_find_vendor_table(struct boot_params *bp,
+ unsigned long cfg_tbl_pa,
+ unsigned int cfg_tbl_len,
+ efi_guid_t guid)
+{
+ enum efi_type et;
+ unsigned int i;
+
+ et = efi_get_type(bp);
+ if (et == EFI_TYPE_NONE)
+ return 0;
+
+ for (i = 0; i < cfg_tbl_len; i++) {
+ unsigned long vendor_tbl_pa;
+ efi_guid_t vendor_tbl_guid;
+ int ret;
+
+ ret = get_vendor_table((void *)cfg_tbl_pa, i,
+ &vendor_tbl_pa,
+ &vendor_tbl_guid, et);
+ if (ret)
+ return 0;
+
+ if (!efi_guidcmp(guid, vendor_tbl_guid))
+ return vendor_tbl_pa;
+ }
+
+ return 0;
+}
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index e189a16..cf071d1 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -197,11 +197,11 @@
subl $32, %eax /* Encryption bit is always above bit 31 */
bts %eax, %edx /* Set encryption mask for page tables */
/*
- * Mark SEV as active in sev_status so that startup32_check_sev_cbit()
- * will do a check. The sev_status memory will be fully initialized
- * with the contents of MSR_AMD_SEV_STATUS later in
- * set_sev_encryption_mask(). For now it is sufficient to know that SEV
- * is active.
+ * Set MSR_AMD64_SEV_ENABLED_BIT in sev_status so that
+ * startup32_check_sev_cbit() will do a check. sev_enable() will
+ * initialize sev_status with all the bits reported by
+ * MSR_AMD_SEV_STATUS later, but only MSR_AMD64_SEV_ENABLED_BIT
+ * needs to be set for now.
*/
movl $1, rva(sev_status)(%ebp)
1:
@@ -455,6 +455,23 @@
call load_stage1_idt
popq %rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ /*
+ * Now that the stage1 interrupt handlers are set up, #VC exceptions from
+ * CPUID instructions can be properly handled for SEV-ES guests.
+ *
+ * For SEV-SNP, the CPUID table also needs to be set up in advance of any
+ * CPUID instructions being issued, so go ahead and do that now via
+ * sev_enable(), which will also handle the rest of the SEV-related
+ * detection/setup to ensure that has been done in advance of any dependent
+ * code.
+ */
+ pushq %rsi
+ movq %rsi, %rdi /* real mode address */
+ call sev_enable
+ popq %rsi
+#endif
+
/*
* paging_prepare() sets up the trampoline and checks if we need to
* enable 5-level paging.
@@ -586,17 +603,7 @@
shrq $3, %rcx
rep stosq
-/*
- * If running as an SEV guest, the encryption mask is required in the
- * page-table setup code below. When the guest also has SEV-ES enabled
- * set_sev_encryption_mask() will cause #VC exceptions, but the stage2
- * handler can't map its GHCB because the page-table is not set up yet.
- * So set up the encryption mask here while still on the stage1 #VC
- * handler. Then load stage2 IDT and switch to the kernel's own
- * page-table.
- */
pushq %rsi
- call set_sev_encryption_mask
call load_stage2_idt
/* Pass boot_params to initialize_identity_maps() */
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index b72dea9..484337c 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -98,7 +98,7 @@
/*
* Adds the specified range to the identity mappings.
*/
-static void add_identity_map(unsigned long start, unsigned long end)
+void kernel_add_identity_map(unsigned long start, unsigned long end)
{
int ret;
@@ -165,14 +165,15 @@
* explicitly here in case the compressed kernel does not touch them,
* or does not touch all the pages covering them.
*/
- add_identity_map((unsigned long)_head, (unsigned long)_end);
+ kernel_add_identity_map((unsigned long)_head, (unsigned long)_end);
boot_params = rmode;
- add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
+ kernel_add_identity_map((unsigned long)boot_params, (unsigned long)(boot_params + 1));
cmdline = get_cmd_line_ptr();
- add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+ kernel_add_identity_map(cmdline, cmdline + COMMAND_LINE_SIZE);
+
+ sev_prep_identity_maps(top_level_pgt);
/* Load the new page-table. */
- sev_verify_cbit(top_level_pgt);
write_cr3(top_level_pgt);
}
@@ -254,10 +255,10 @@
* It should already exist, but keep things generic.
*
* To map the page just read from it and fault it in if there is no
- * mapping yet. add_identity_map() can't be called here because that
- * would unconditionally map the address on PMD level, destroying any
- * PTE-level mappings that might already exist. Use assembly here so
- * the access won't be optimized away.
+ * mapping yet. kernel_add_identity_map() can't be called here because
+ * that would unconditionally map the address on PMD level, destroying
+ * any PTE-level mappings that might already exist. Use assembly here
+ * so the access won't be optimized away.
*/
asm volatile("mov %[address], %%r9"
:: [address] "g" (*(unsigned long *)address)
@@ -283,15 +284,31 @@
* Changing encryption attributes of a page requires to flush it from
* the caches.
*/
- if ((set | clr) & _PAGE_ENC)
+ if ((set | clr) & _PAGE_ENC) {
clflush_page(address);
+ /*
+ * If the encryption attribute is being cleared, change the page state
+ * to shared in the RMP table.
+ */
+ if (clr)
+ snp_set_page_shared(__pa(address & PAGE_MASK));
+ }
+
/* Update PTE */
pte = *ptep;
pte = pte_set_flags(pte, set);
pte = pte_clear_flags(pte, clr);
set_pte(ptep, pte);
+ /*
+ * If the encryption attribute is being set, then change the page state to
+ * private in the RMP entry. The page state change must be done after the PTE
+ * is updated.
+ */
+ if (set & _PAGE_ENC)
+ snp_set_page_private(__pa(address & PAGE_MASK));
+
/* Flush TLB after changing encryption attribute */
write_cr3(top_level_pgt);
@@ -355,7 +372,7 @@
* Error code is sane - now identity map the 2M region around
* the faulting address.
*/
- add_identity_map(address, end);
+ kernel_add_identity_map(address, end);
}
void do_boot_nmi_trap(struct pt_regs *regs, unsigned long error_code)
diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index 9620883..4ad5286 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -39,7 +39,23 @@
load_boot_idt(&boot_idt_desc);
}
-/* Setup IDT after kernel jumping to .Lrelocated */
+/*
+ * Setup IDT after kernel jumping to .Lrelocated.
+ *
+ * initialize_identity_maps() needs a #PF handler to be setup
+ * in order to be able to fault-in identity mapping ranges; see
+ * do_boot_page_fault().
+ *
+ * This #PF handler setup needs to happen in load_stage2_idt() where the
+ * IDT is loaded and there the #VC IDT entry gets setup too.
+ *
+ * In order to be able to handle #VCs, one needs a GHCB which
+ * gets setup with an already set up pagetable, which is done in
+ * initialize_identity_maps(). And there's the catch 22: the boot #VC
+ * handler do_boot_stage2_vc() needs to call early_setup_ghcb() itself
+ * (and, especially set_page_decrypted()) because the SEV-ES setup code
+ * cannot initialize a GHCB as there's no #PF handler yet...
+ */
void load_stage2_idt(void)
{
boot_idt_desc.address = (unsigned long)boot_idt;
diff --git a/arch/x86/boot/compressed/mem_encrypt.S b/arch/x86/boot/compressed/mem_encrypt.S
index a63424d1..a73e4d7 100644
--- a/arch/x86/boot/compressed/mem_encrypt.S
+++ b/arch/x86/boot/compressed/mem_encrypt.S
@@ -187,42 +187,6 @@
.code64
#include "../../kernel/sev_verify_cbit.S"
-SYM_FUNC_START(set_sev_encryption_mask)
-#ifdef CONFIG_AMD_MEM_ENCRYPT
- push %rbp
- push %rdx
-
- movq %rsp, %rbp /* Save current stack pointer */
-
- call get_sev_encryption_bit /* Get the encryption bit position */
- testl %eax, %eax
- jz .Lno_sev_mask
-
- bts %rax, sme_me_mask(%rip) /* Create the encryption mask */
-
- /*
- * Read MSR_AMD64_SEV again and store it to sev_status. Can't do this in
- * get_sev_encryption_bit() because this function is 32-bit code and
- * shared between 64-bit and 32-bit boot path.
- */
- movl $MSR_AMD64_SEV, %ecx /* Read the SEV MSR */
- rdmsr
-
- /* Store MSR value in sev_status */
- shlq $32, %rdx
- orq %rdx, %rax
- movq %rax, sev_status(%rip)
-
-.Lno_sev_mask:
- movq %rbp, %rsp /* Restore original stack pointer */
-
- pop %rdx
- pop %rbp
-#endif
-
- xor %rax, %rax
- RET
-SYM_FUNC_END(set_sev_encryption_mask)
.data
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index cea1b96..3cdada4 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -21,6 +21,7 @@
#include <linux/screen_info.h>
#include <linux/elf.h>
#include <linux/io.h>
+#include <linux/efi.h>
#include <asm/page.h>
#include <asm/boot.h>
#include <asm/bootparam.h>
@@ -116,17 +117,23 @@
{ }
#endif
-void set_sev_encryption_mask(void);
-
#ifdef CONFIG_AMD_MEM_ENCRYPT
+void sev_enable(struct boot_params *bp);
void sev_es_shutdown_ghcb(void);
extern bool sev_es_check_ghcb_fault(unsigned long address);
+void snp_set_page_private(unsigned long paddr);
+void snp_set_page_shared(unsigned long paddr);
+void sev_prep_identity_maps(unsigned long top_level_pgt);
#else
+static inline void sev_enable(struct boot_params *bp) { }
static inline void sev_es_shutdown_ghcb(void) { }
static inline bool sev_es_check_ghcb_fault(unsigned long address)
{
return false;
}
+static inline void snp_set_page_private(unsigned long paddr) { }
+static inline void snp_set_page_shared(unsigned long paddr) { }
+static inline void sev_prep_identity_maps(unsigned long top_level_pgt) { }
#endif
/* acpi.c */
@@ -147,6 +154,7 @@
#ifdef CONFIG_X86_5LEVEL
extern unsigned int __pgtable_l5_enabled, pgdir_shift, ptrs_per_p4d;
#endif
+extern void kernel_add_identity_map(unsigned long start, unsigned long end);
/* Used by PAGE_KERN* macros: */
extern pteval_t __default_kernel_pte_mask;
@@ -169,4 +177,47 @@
unsigned long sev_verify_cbit(unsigned long cr3);
+enum efi_type {
+ EFI_TYPE_64,
+ EFI_TYPE_32,
+ EFI_TYPE_NONE,
+};
+
+#ifdef CONFIG_EFI
+/* helpers for early EFI config table access */
+enum efi_type efi_get_type(struct boot_params *bp);
+unsigned long efi_get_system_table(struct boot_params *bp);
+int efi_get_conf_table(struct boot_params *bp, unsigned long *cfg_tbl_pa,
+ unsigned int *cfg_tbl_len);
+unsigned long efi_find_vendor_table(struct boot_params *bp,
+ unsigned long cfg_tbl_pa,
+ unsigned int cfg_tbl_len,
+ efi_guid_t guid);
+#else
+static inline enum efi_type efi_get_type(struct boot_params *bp)
+{
+ return EFI_TYPE_NONE;
+}
+
+static inline unsigned long efi_get_system_table(struct boot_params *bp)
+{
+ return 0;
+}
+
+static inline int efi_get_conf_table(struct boot_params *bp,
+ unsigned long *cfg_tbl_pa,
+ unsigned int *cfg_tbl_len)
+{
+ return -ENOENT;
+}
+
+static inline unsigned long efi_find_vendor_table(struct boot_params *bp,
+ unsigned long cfg_tbl_pa,
+ unsigned int cfg_tbl_len,
+ efi_guid_t guid)
+{
+ return 0;
+}
+#endif /* CONFIG_EFI */
+
#endif /* BOOT_COMPRESSED_MISC_H */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 6728e56..eee87aa 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -20,8 +20,10 @@
#include <asm/fpu/xcr.h>
#include <asm/ptrace.h>
#include <asm/svm.h>
+#include <asm/cpuid.h>
#include "error.h"
+#include "../msr.h"
struct ghcb boot_ghcb_page __aligned(PAGE_SIZE);
struct ghcb *boot_ghcb;
@@ -56,23 +58,19 @@
static inline u64 sev_es_rd_ghcb_msr(void)
{
- unsigned long low, high;
+ struct msr m;
- asm volatile("rdmsr" : "=a" (low), "=d" (high) :
- "c" (MSR_AMD64_SEV_ES_GHCB));
+ boot_rdmsr(MSR_AMD64_SEV_ES_GHCB, &m);
- return ((high << 32) | low);
+ return m.q;
}
static inline void sev_es_wr_ghcb_msr(u64 val)
{
- u32 low, high;
+ struct msr m;
- low = val & 0xffffffffUL;
- high = val >> 32;
-
- asm volatile("wrmsr" : : "c" (MSR_AMD64_SEV_ES_GHCB),
- "a"(low), "d" (high) : "memory");
+ m.q = val;
+ boot_wrmsr(MSR_AMD64_SEV_ES_GHCB, &m);
}
static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt)
@@ -129,11 +127,54 @@
/* Include code for early handlers */
#include "../../kernel/sev-shared.c"
-static bool early_setup_sev_es(void)
+static inline bool sev_snp_enabled(void)
{
- if (!sev_es_negotiate_protocol())
- sev_es_terminate(GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED);
+ return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+}
+static void __page_state_change(unsigned long paddr, enum psc_op op)
+{
+ u64 val;
+
+ if (!sev_snp_enabled())
+ return;
+
+ /*
+ * If private -> shared then invalidate the page before requesting the
+ * state change in the RMP table.
+ */
+ if (op == SNP_PAGE_STATE_SHARED && pvalidate(paddr, RMP_PG_SIZE_4K, 0))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ /* Issue VMGEXIT to change the page state in RMP table. */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ /* Read the response of the VMGEXIT. */
+ val = sev_es_rd_ghcb_msr();
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+
+ /*
+ * Now that page state is changed in the RMP table, validate it so that it is
+ * consistent with the RMP entry.
+ */
+ if (op == SNP_PAGE_STATE_PRIVATE && pvalidate(paddr, RMP_PG_SIZE_4K, 1))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+}
+
+void snp_set_page_private(unsigned long paddr)
+{
+ __page_state_change(paddr, SNP_PAGE_STATE_PRIVATE);
+}
+
+void snp_set_page_shared(unsigned long paddr)
+{
+ __page_state_change(paddr, SNP_PAGE_STATE_SHARED);
+}
+
+static bool early_setup_ghcb(void)
+{
if (set_page_decrypted((unsigned long)&boot_ghcb_page))
return false;
@@ -145,6 +186,10 @@
/* Initialize lookup tables for the instruction decoder */
inat_init_tables();
+ /* SNP guest requires the GHCB GPA must be registered */
+ if (sev_snp_enabled())
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
+
return true;
}
@@ -184,8 +229,8 @@
struct es_em_ctxt ctxt;
enum es_result result;
- if (!boot_ghcb && !early_setup_sev_es())
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ if (!boot_ghcb && !early_setup_ghcb())
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -212,5 +257,191 @@
if (result == ES_OK)
vc_finish_insn(&ctxt);
else if (result != ES_RETRY)
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+}
+
+static void enforce_vmpl0(void)
+{
+ u64 attrs;
+ int err;
+
+ /*
+ * RMPADJUST modifies RMP permissions of a lesser-privileged (numerically
+ * higher) privilege level. Here, clear the VMPL1 permission mask of the
+ * GHCB page. If the guest is not running at VMPL0, this will fail.
+ *
+ * If the guest is running at VMPL0, it will succeed. Even if that operation
+ * modifies permission bits, it is still ok to do so currently because Linux
+ * SNP guests are supported only on VMPL0 so VMPL1 or higher permission masks
+ * changing is a don't-care.
+ */
+ attrs = 1;
+ if (rmpadjust((unsigned long)&boot_ghcb_page, RMP_PG_SIZE_4K, attrs))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_NOT_VMPL0);
+}
+
+void sev_enable(struct boot_params *bp)
+{
+ unsigned int eax, ebx, ecx, edx;
+ struct msr m;
+ bool snp;
+
+ /*
+ * Setup/preliminary detection of SNP. This will be sanity-checked
+ * against CPUID/MSR values later.
+ */
+ snp = snp_init(bp);
+
+ /* Check for the SME/SEV support leaf */
+ eax = 0x80000000;
+ ecx = 0;
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+ if (eax < 0x8000001f)
+ return;
+
+ /*
+ * Check for the SME/SEV feature:
+ * CPUID Fn8000_001F[EAX]
+ * - Bit 0 - Secure Memory Encryption support
+ * - Bit 1 - Secure Encrypted Virtualization support
+ * CPUID Fn8000_001F[EBX]
+ * - Bits 5:0 - Pagetable bit position used to indicate encryption
+ */
+ eax = 0x8000001f;
+ ecx = 0;
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+ /* Check whether SEV is supported */
+ if (!(eax & BIT(1))) {
+ if (snp)
+ error("SEV-SNP support indicated by CC blob, but not CPUID.");
+ return;
+ }
+
+ /* Set the SME mask if this is an SEV guest. */
+ boot_rdmsr(MSR_AMD64_SEV, &m);
+ sev_status = m.q;
+ if (!(sev_status & MSR_AMD64_SEV_ENABLED))
+ return;
+
+ /* Negotiate the GHCB protocol version. */
+ if (sev_status & MSR_AMD64_SEV_ES_ENABLED) {
+ if (!sev_es_negotiate_protocol())
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_PROT_UNSUPPORTED);
+ }
+
+ /*
+ * SNP is supported in v2 of the GHCB spec which mandates support for HV
+ * features.
+ */
+ if (sev_status & MSR_AMD64_SEV_SNP_ENABLED) {
+ if (!(get_hv_features() & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+ enforce_vmpl0();
+ }
+
+ if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+ error("SEV-SNP supported indicated by CC blob, but not SEV status MSR.");
+
+ sme_me_mask = BIT_ULL(ebx & 0x3f);
+}
+
+/* Search for Confidential Computing blob in the EFI config table. */
+static struct cc_blob_sev_info *find_cc_blob_efi(struct boot_params *bp)
+{
+ unsigned long cfg_table_pa;
+ unsigned int cfg_table_len;
+ int ret;
+
+ ret = efi_get_conf_table(bp, &cfg_table_pa, &cfg_table_len);
+ if (ret)
+ return NULL;
+
+ return (struct cc_blob_sev_info *)efi_find_vendor_table(bp, cfg_table_pa,
+ cfg_table_len,
+ EFI_CC_BLOB_GUID);
+}
+
+/*
+ * Initial set up of SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the boot kernel
+ * by firmware/bootloader in the following ways:
+ *
+ * - via an entry in the EFI config table
+ * - via a setup_data structure, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ cc_info = find_cc_blob_efi(bp);
+ if (cc_info)
+ goto found_cc_info;
+
+ cc_info = find_cc_blob_setup_data(bp);
+ if (!cc_info)
+ return NULL;
+
+found_cc_info:
+ if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+
+ return cc_info;
+}
+
+/*
+ * Indicate SNP based on presence of SNP-specific CC blob. Subsequent checks
+ * will verify the SNP CPUID/MSR bits.
+ */
+bool snp_init(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ if (!bp)
+ return false;
+
+ cc_info = find_cc_blob(bp);
+ if (!cc_info)
+ return false;
+
+ /*
+ * If a SNP-specific Confidential Computing blob is present, then
+ * firmware/bootloader have indicated SNP support. Verifying this
+ * involves CPUID checks which will be more reliable if the SNP
+ * CPUID table is used. See comments over snp_setup_cpuid_table() for
+ * more details.
+ */
+ setup_cpuid_table(cc_info);
+
+ /*
+ * Pass run-time kernel a pointer to CC info via boot_params so EFI
+ * config table doesn't need to be searched again during early startup
+ * phase.
+ */
+ bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+ return true;
+}
+
+void sev_prep_identity_maps(unsigned long top_level_pgt)
+{
+ /*
+ * The Confidential Computing blob is used very early in uncompressed
+ * kernel to find the in-memory CPUID table to handle CPUID
+ * instructions. Make sure an identity-mapping exists so it can be
+ * accessed after switchover.
+ */
+ if (sev_snp_enabled()) {
+ unsigned long cc_info_pa = boot_params->cc_blob_address;
+ struct cc_blob_sev_info *cc_info;
+
+ kernel_add_identity_map(cc_info_pa, cc_info_pa + sizeof(*cc_info));
+
+ cc_info = (struct cc_blob_sev_info *)cc_info_pa;
+ kernel_add_identity_map(cc_info->cpuid_phys, cc_info->cpuid_phys + cc_info->cpuid_len);
+ }
+
+ sev_verify_cbit(top_level_pgt);
}
diff --git a/arch/x86/boot/cpucheck.c b/arch/x86/boot/cpucheck.c
index e1478d3..fed8d13 100644
--- a/arch/x86/boot/cpucheck.c
+++ b/arch/x86/boot/cpucheck.c
@@ -27,6 +27,7 @@
#include <asm/required-features.h>
#include <asm/msr-index.h>
#include "string.h"
+#include "msr.h"
static u32 err_flags[NCAPINTS];
@@ -130,12 +131,11 @@
/* If this is an AMD and we're only missing SSE+SSE2, try to
turn them on */
- u32 ecx = MSR_K7_HWCR;
- u32 eax, edx;
+ struct msr m;
- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax &= ~(1 << 15);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_K7_HWCR, &m);
+ m.l &= ~(1 << 15);
+ boot_wrmsr(MSR_K7_HWCR, &m);
get_cpuflags(); /* Make sure it really did something */
err = check_cpuflags();
@@ -145,28 +145,28 @@
/* If this is a VIA C3, we might have to enable CX8
explicitly */
- u32 ecx = MSR_VIA_FCR;
- u32 eax, edx;
+ struct msr m;
- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- eax |= (1<<1)|(1<<7);
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_rdmsr(MSR_VIA_FCR, &m);
+ m.l |= (1 << 1) | (1 << 7);
+ boot_wrmsr(MSR_VIA_FCR, &m);
set_bit(X86_FEATURE_CX8, cpu.flags);
err = check_cpuflags();
} else if (err == 0x01 && is_transmeta()) {
/* Transmeta might have masked feature bits in word 0 */
- u32 ecx = 0x80860004;
- u32 eax, edx;
+ struct msr m, m_tmp;
u32 level = 1;
- asm("rdmsr" : "=a" (eax), "=d" (edx) : "c" (ecx));
- asm("wrmsr" : : "a" (~0), "d" (edx), "c" (ecx));
+ boot_rdmsr(0x80860004, &m);
+ m_tmp = m;
+ m_tmp.l = ~0;
+ boot_wrmsr(0x80860004, &m_tmp);
asm("cpuid"
: "+a" (level), "=d" (cpu.flags[0])
: : "ecx", "ebx");
- asm("wrmsr" : : "a" (eax), "d" (edx), "c" (ecx));
+ boot_wrmsr(0x80860004, &m);
err = check_cpuflags();
} else if (err == 0x01 &&
diff --git a/arch/x86/boot/msr.h b/arch/x86/boot/msr.h
new file mode 100644
index 0000000..aed66f7
--- /dev/null
+++ b/arch/x86/boot/msr.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Helpers/definitions related to MSR access.
+ */
+
+#ifndef BOOT_MSR_H
+#define BOOT_MSR_H
+
+#include <asm/shared/msr.h>
+
+/*
+ * The kernel proper already defines rdmsr()/wrmsr(), but they are not for the
+ * boot kernel since they rely on tracepoint/exception handling infrastructure
+ * that's not available here.
+ */
+static inline void boot_rdmsr(unsigned int reg, struct msr *m)
+{
+ asm volatile("rdmsr" : "=a" (m->l), "=d" (m->h) : "c" (reg));
+}
+
+static inline void boot_wrmsr(unsigned int reg, const struct msr *m)
+{
+ asm volatile("wrmsr" : : "c" (reg), "a"(m->l), "d" (m->h) : "memory");
+}
+
+#endif /* BOOT_MSR_H */
diff --git a/arch/x86/configs/google/xfstest.config b/arch/x86/configs/google/xfstest.config
new file mode 100644
index 0000000..1b6faaa
--- /dev/null
+++ b/arch/x86/configs/google/xfstest.config
@@ -0,0 +1,24 @@
+#Configurations required to run xfs tests
+CONFIG_MODULE_SIG=n
+CONFIG_MODULE_SIG_ALL=n
+CONFIG_SECURITY_LOADPIN=n
+CONFIG_SECURITY_LOADPIN_ENFORCE=n
+CONFIG_SECURITY_YAMA=n
+CONFIG_SECURITY_LOCKDOWN_LSM=n
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=n
+CONFIG_LSM=""
+CONFIG_SYSTEM_TRUSTED_KEYRING=n
+CONFIG_SECONDARY_TRUSTED_KEYRING=n
+CONFIG_VFAT_FS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
+CONFIG_FAT_DEFAULT_UTF8=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_NFSD=y
+CONFIG_NFSD_V4=y
diff --git a/arch/x86/configs/lakitu_defconfig b/arch/x86/configs/lakitu_defconfig
new file mode 100644
index 0000000..18a5c3d
--- /dev/null
+++ b/arch/x86/configs/lakitu_defconfig
@@ -0,0 +1,4293 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/x86_64 5.15.152 Kernel Configuration
+#
+CONFIG_CC_VERSION_TEXT="Chromium OS 15.0_pre458507_p20220602-r18 clang version 15.0.0 (/var/tmp/portage/sys-devel/llvm-15.0_pre458507_p20220602-r18/work/llvm-15.0_pre458507_p20220602/clang a58d0af058038595c93de961b725f86997cf8d4a)"
+CONFIG_GCC_VERSION=0
+CONFIG_CC_IS_CLANG=y
+CONFIG_CLANG_VERSION=150000
+CONFIG_AS_IS_LLVM=y
+CONFIG_AS_VERSION=150000
+CONFIG_LD_VERSION=0
+CONFIG_LD_IS_LLD=y
+CONFIG_LLD_VERSION=150000
+CONFIG_CC_CAN_LINK=y
+CONFIG_CC_CAN_LINK_STATIC=y
+CONFIG_CC_HAS_ASM_GOTO=y
+CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
+CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
+CONFIG_TOOLS_SUPPORT_RELR=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
+CONFIG_PAHOLE_VERSION=121
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+# CONFIG_WERROR is not set
+CONFIG_LOCALVERSION=""
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_BUILD_SALT=""
+CONFIG_HAVE_KERNEL_GZIP=y
+CONFIG_HAVE_KERNEL_BZIP2=y
+CONFIG_HAVE_KERNEL_LZMA=y
+CONFIG_HAVE_KERNEL_XZ=y
+CONFIG_HAVE_KERNEL_LZO=y
+CONFIG_HAVE_KERNEL_LZ4=y
+CONFIG_HAVE_KERNEL_ZSTD=y
+CONFIG_KERNEL_GZIP=y
+# CONFIG_KERNEL_BZIP2 is not set
+# CONFIG_KERNEL_LZMA is not set
+# CONFIG_KERNEL_XZ is not set
+# CONFIG_KERNEL_LZO is not set
+# CONFIG_KERNEL_LZ4 is not set
+# CONFIG_KERNEL_ZSTD is not set
+CONFIG_DEFAULT_INIT=""
+CONFIG_DEFAULT_HOSTNAME="localhost"
+CONFIG_SWAP=y
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POSIX_MQUEUE_SYSCTL=y
+# CONFIG_WATCH_QUEUE is not set
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_USELIB=y
+CONFIG_AUDIT=y
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+CONFIG_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
+CONFIG_GENERIC_PENDING_IRQ=y
+CONFIG_GENERIC_IRQ_MIGRATION=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_DOMAIN_HIERARCHY=y
+CONFIG_GENERIC_MSI_IRQ=y
+CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
+CONFIG_IRQ_MSI_IOMMU=y
+CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
+CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# CONFIG_GENERIC_IRQ_DEBUGFS is not set
+# end of IRQ subsystem
+
+CONFIG_CLOCKSOURCE_WATCHDOG=y
+CONFIG_ARCH_CLOCKSOURCE_INIT=y
+CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
+CONFIG_GENERIC_CMOS_UPDATE=y
+CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ_COMMON=y
+# CONFIG_HZ_PERIODIC is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+# end of Timers subsystem
+
+CONFIG_BPF=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+
+#
+# BPF subsystem
+#
+CONFIG_BPF_SYSCALL=y
+CONFIG_BPF_JIT=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_JIT_DEFAULT_ON=y
+# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
+# CONFIG_BPF_PRELOAD is not set
+CONFIG_BPF_LSM=y
+# end of BPF subsystem
+
+CONFIG_PREEMPT_NONE=y
+# CONFIG_PREEMPT_VOLUNTARY is not set
+# CONFIG_PREEMPT is not set
+CONFIG_SCHED_CORE=y
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+CONFIG_HAVE_SCHED_AVG_IRQ=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
+CONFIG_PSI_DEFAULT_DISABLED=y
+# end of CPU/Task time and stats accounting
+
+CONFIG_CPU_ISOLATION=y
+
+#
+# RCU Subsystem
+#
+CONFIG_TREE_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TREE_SRCU=y
+CONFIG_TASKS_RCU_GENERIC=y
+CONFIG_TASKS_RUDE_RCU=y
+CONFIG_TASKS_TRACE_RCU=y
+CONFIG_RCU_STALL_COMMON=y
+CONFIG_RCU_NEED_SEGCBLIST=y
+# end of RCU Subsystem
+
+CONFIG_BUILD_BIN2C=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_IKHEADERS=m
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
+# CONFIG_PRINTK_INDEX is not set
+CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
+
+#
+# Scheduler features
+#
+# CONFIG_UCLAMP_TASK is not set
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
+CONFIG_CC_HAS_INT128=y
+CONFIG_ARCH_SUPPORTS_INT128=y
+# CONFIG_NUMA_BALANCING is not set
+CONFIG_CGROUPS=y
+CONFIG_PAGE_COUNTER=y
+CONFIG_MEMCG=y
+CONFIG_MEMCG_SWAP=y
+CONFIG_MEMCG_KMEM=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CGROUP_WRITEBACK=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_RDMA=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CPUSETS=y
+CONFIG_PROC_PID_CPUSET=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_BPF=y
+# CONFIG_CGROUP_MISC is not set
+# CONFIG_CGROUP_DEBUG is not set
+CONFIG_SOCK_CGROUP_DATA=y
+CONFIG_NAMESPACES=y
+CONFIG_UTS_NS=y
+CONFIG_TIME_NS=y
+CONFIG_IPC_NS=y
+CONFIG_USER_NS=y
+CONFIG_PID_NS=y
+CONFIG_NET_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+# CONFIG_SCHED_AUTOGROUP is not set
+# CONFIG_SYSFS_DEPRECATED is not set
+CONFIG_RELAY=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_INITRAMFS_SOURCE=""
+CONFIG_RD_GZIP=y
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
+CONFIG_RD_XZ=y
+# CONFIG_RD_LZO is not set
+CONFIG_RD_LZ4=y
+CONFIG_RD_ZSTD=y
+# CONFIG_BOOT_CONFIG is not set
+CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
+CONFIG_LD_ORPHAN_WARN=y
+CONFIG_SYSCTL=y
+CONFIG_HAVE_UID16=y
+CONFIG_SYSCTL_EXCEPTION_TRACE=y
+CONFIG_HAVE_PCSPKR_PLATFORM=y
+CONFIG_EXPERT=y
+CONFIG_UID16=y
+CONFIG_MULTIUSER=y
+CONFIG_SGETMASK_SYSCALL=y
+CONFIG_SYSFS_SYSCALL=y
+CONFIG_FHANDLE=y
+CONFIG_POSIX_TIMERS=y
+CONFIG_PRINTK=y
+CONFIG_BUG=y
+CONFIG_ELF_CORE=y
+# CONFIG_PCSPKR_PLATFORM is not set
+CONFIG_BASE_FULL=y
+CONFIG_FUTEX=y
+CONFIG_FUTEX_PI=y
+CONFIG_EPOLL=y
+CONFIG_SIGNALFD=y
+CONFIG_TIMERFD=y
+CONFIG_EVENTFD=y
+CONFIG_SHMEM=y
+CONFIG_AIO=y
+CONFIG_IO_URING=y
+CONFIG_ADVISE_SYSCALLS=y
+CONFIG_MEMBARRIER=y
+CONFIG_KALLSYMS=y
+# CONFIG_KALLSYMS_ALL is not set
+CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
+CONFIG_KALLSYMS_BASE_RELATIVE=y
+# CONFIG_USERFAULTFD is not set
+CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
+CONFIG_KCMP=y
+CONFIG_RSEQ=y
+# CONFIG_DEBUG_RSEQ is not set
+CONFIG_EMBEDDED=y
+CONFIG_HAVE_PERF_EVENTS=y
+# CONFIG_PC104 is not set
+
+#
+# Kernel Performance Events And Counters
+#
+CONFIG_PERF_EVENTS=y
+# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
+# end of Kernel Performance Events And Counters
+
+CONFIG_VM_EVENT_COUNTERS=y
+CONFIG_SLUB_DEBUG=y
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_SLAB is not set
+CONFIG_SLUB=y
+# CONFIG_SLOB is not set
+CONFIG_SLAB_MERGE_DEFAULT=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
+CONFIG_SLUB_CPU_PARTIAL=y
+CONFIG_SYSTEM_DATA_VERIFICATION=y
+CONFIG_PROFILING=y
+CONFIG_TRACEPOINTS=y
+# end of General setup
+
+CONFIG_64BIT=y
+CONFIG_X86_64=y
+CONFIG_X86=y
+CONFIG_INSTRUCTION_DECODER=y
+CONFIG_OUTPUT_FORMAT="elf64-x86-64"
+CONFIG_LOCKDEP_SUPPORT=y
+CONFIG_STACKTRACE_SUPPORT=y
+CONFIG_MMU=y
+CONFIG_ARCH_MMAP_RND_BITS_MIN=28
+CONFIG_ARCH_MMAP_RND_BITS_MAX=32
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
+CONFIG_GENERIC_ISA_DMA=y
+CONFIG_GENERIC_BUG=y
+CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
+CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_GENERIC_CALIBRATE_DELAY=y
+CONFIG_ARCH_HAS_CPU_RELAX=y
+CONFIG_ARCH_HAS_FILTER_PGPROT=y
+CONFIG_HAVE_SETUP_PER_CPU_AREA=y
+CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
+CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
+CONFIG_ARCH_HIBERNATION_POSSIBLE=y
+CONFIG_ARCH_NR_GPIO=1024
+CONFIG_ARCH_SUSPEND_POSSIBLE=y
+CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
+CONFIG_AUDIT_ARCH=y
+CONFIG_X86_64_SMP=y
+CONFIG_ARCH_SUPPORTS_UPROBES=y
+CONFIG_FIX_EARLYCON_MEM=y
+CONFIG_DYNAMIC_PHYSICAL_MASK=y
+CONFIG_PGTABLE_LEVELS=4
+CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
+
+#
+# Processor type and features
+#
+CONFIG_SMP=y
+CONFIG_X86_FEATURE_NAMES=y
+CONFIG_X86_X2APIC=y
+CONFIG_X86_MPPARSE=y
+# CONFIG_GOLDFISH is not set
+# CONFIG_X86_CPU_RESCTRL is not set
+# CONFIG_X86_EXTENDED_PLATFORM is not set
+# CONFIG_X86_INTEL_LPSS is not set
+# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
+# CONFIG_IOSF_MBI is not set
+CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
+CONFIG_SCHED_OMIT_FRAME_POINTER=y
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_PARAVIRT=y
+CONFIG_PARAVIRT_XXL=y
+# CONFIG_PARAVIRT_DEBUG is not set
+CONFIG_PARAVIRT_SPINLOCKS=y
+CONFIG_X86_HV_CALLBACK_VECTOR=y
+CONFIG_XEN=y
+CONFIG_XEN_PV=y
+CONFIG_XEN_512GB=y
+CONFIG_XEN_PV_SMP=y
+CONFIG_XEN_PV_DOM0=y
+CONFIG_XEN_PVHVM=y
+CONFIG_XEN_PVHVM_SMP=y
+CONFIG_XEN_PVHVM_GUEST=y
+CONFIG_XEN_SAVE_RESTORE=y
+# CONFIG_XEN_DEBUG_FS is not set
+CONFIG_XEN_PVH=y
+CONFIG_XEN_DOM0=y
+CONFIG_KVM_GUEST=y
+CONFIG_ARCH_CPUIDLE_HALTPOLL=y
+CONFIG_PVH=y
+CONFIG_PARAVIRT_TIME_ACCOUNTING=y
+CONFIG_PARAVIRT_CLOCK=y
+# CONFIG_JAILHOUSE_GUEST is not set
+# CONFIG_ACRN_GUEST is not set
+# CONFIG_MK8 is not set
+# CONFIG_MPSC is not set
+# CONFIG_MCORE2 is not set
+# CONFIG_MATOM is not set
+CONFIG_GENERIC_CPU=y
+CONFIG_X86_INTERNODE_CACHE_SHIFT=6
+CONFIG_X86_L1_CACHE_SHIFT=6
+CONFIG_X86_TSC=y
+CONFIG_X86_CMPXCHG64=y
+CONFIG_X86_CMOV=y
+CONFIG_X86_MINIMUM_CPU_FAMILY=64
+CONFIG_X86_DEBUGCTLMSR=y
+CONFIG_IA32_FEAT_CTL=y
+CONFIG_X86_VMX_FEATURE_NAMES=y
+# CONFIG_PROCESSOR_SELECT is not set
+CONFIG_CPU_SUP_INTEL=y
+CONFIG_CPU_SUP_AMD=y
+CONFIG_CPU_SUP_HYGON=y
+CONFIG_CPU_SUP_CENTAUR=y
+CONFIG_CPU_SUP_ZHAOXIN=y
+CONFIG_HPET_TIMER=y
+CONFIG_HPET_EMULATE_RTC=y
+CONFIG_DMI=y
+CONFIG_GART_IOMMU=y
+# CONFIG_MAXSMP is not set
+CONFIG_NR_CPUS_RANGE_BEGIN=2
+CONFIG_NR_CPUS_RANGE_END=512
+CONFIG_NR_CPUS_DEFAULT=64
+CONFIG_NR_CPUS=512
+CONFIG_SCHED_SMT=y
+CONFIG_SCHED_MC=y
+CONFIG_SCHED_MC_PRIO=y
+CONFIG_X86_LOCAL_APIC=y
+CONFIG_X86_IO_APIC=y
+CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
+CONFIG_X86_MCE=y
+# CONFIG_X86_MCELOG_LEGACY is not set
+CONFIG_X86_MCE_INTEL=y
+CONFIG_X86_MCE_AMD=y
+CONFIG_X86_MCE_THRESHOLD=y
+# CONFIG_X86_MCE_INJECT is not set
+
+#
+# Performance monitoring
+#
+CONFIG_PERF_EVENTS_INTEL_UNCORE=y
+CONFIG_PERF_EVENTS_INTEL_RAPL=y
+CONFIG_PERF_EVENTS_INTEL_CSTATE=y
+# CONFIG_PERF_EVENTS_AMD_POWER is not set
+CONFIG_PERF_EVENTS_AMD_UNCORE=y
+# end of Performance monitoring
+
+CONFIG_X86_16BIT=y
+CONFIG_X86_ESPFIX64=y
+CONFIG_X86_VSYSCALL_EMULATION=y
+CONFIG_X86_IOPL_IOPERM=y
+# CONFIG_MICROCODE is not set
+CONFIG_X86_MSR=y
+CONFIG_X86_CPUID=y
+# CONFIG_X86_5LEVEL is not set
+CONFIG_X86_DIRECT_GBPAGES=y
+# CONFIG_X86_CPA_STATISTICS is not set
+CONFIG_AMD_MEM_ENCRYPT=y
+CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y
+CONFIG_NUMA=y
+# CONFIG_AMD_NUMA is not set
+CONFIG_X86_64_ACPI_NUMA=y
+CONFIG_NUMA_EMU=y
+CONFIG_NODES_SHIFT=6
+CONFIG_ARCH_SPARSEMEM_ENABLE=y
+CONFIG_ARCH_SPARSEMEM_DEFAULT=y
+CONFIG_ARCH_SELECT_MEMORY_MODEL=y
+# CONFIG_ARCH_MEMORY_PROBE is not set
+CONFIG_ARCH_PROC_KCORE_TEXT=y
+CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
+# CONFIG_X86_PMEM_LEGACY is not set
+CONFIG_X86_CHECK_BIOS_CORRUPTION=y
+CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
+CONFIG_MTRR=y
+# CONFIG_MTRR_SANITIZER is not set
+CONFIG_X86_PAT=y
+CONFIG_ARCH_USES_PG_UNCACHED=y
+CONFIG_ARCH_RANDOM=y
+CONFIG_X86_SMAP=y
+CONFIG_X86_UMIP=y
+CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
+CONFIG_X86_INTEL_TSX_MODE_OFF=y
+# CONFIG_X86_INTEL_TSX_MODE_ON is not set
+# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
+# CONFIG_X86_SGX is not set
+CONFIG_EFI=y
+CONFIG_EFI_STUB=y
+# CONFIG_EFI_MIXED is not set
+# CONFIG_HZ_100 is not set
+# CONFIG_HZ_250 is not set
+# CONFIG_HZ_300 is not set
+CONFIG_HZ_1000=y
+CONFIG_HZ=1000
+CONFIG_SCHED_HRTICK=y
+# CONFIG_KEXEC is not set
+CONFIG_KEXEC_FILE=y
+CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
+# CONFIG_KEXEC_SIG is not set
+# CONFIG_CRASH_DUMP is not set
+CONFIG_PHYSICAL_START=0x1000000
+CONFIG_RELOCATABLE=y
+CONFIG_RANDOMIZE_BASE=y
+CONFIG_X86_NEED_RELOCS=y
+CONFIG_PHYSICAL_ALIGN=0x1000000
+CONFIG_DYNAMIC_MEMORY_LAYOUT=y
+CONFIG_RANDOMIZE_MEMORY=y
+CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
+CONFIG_HOTPLUG_CPU=y
+# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
+# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
+# CONFIG_COMPAT_VDSO is not set
+CONFIG_LEGACY_VSYSCALL_EMULATE=y
+# CONFIG_LEGACY_VSYSCALL_XONLY is not set
+# CONFIG_LEGACY_VSYSCALL_NONE is not set
+# CONFIG_CMDLINE_BOOL is not set
+CONFIG_MODIFY_LDT_SYSCALL=y
+CONFIG_HAVE_LIVEPATCH=y
+# end of Processor type and features
+
+CONFIG_CC_HAS_SLS=y
+CONFIG_CC_HAS_RETURN_THUNK=y
+CONFIG_SPECULATION_MITIGATIONS=y
+CONFIG_PAGE_TABLE_ISOLATION=y
+CONFIG_RETPOLINE=y
+CONFIG_RETHUNK=y
+CONFIG_CPU_UNRET_ENTRY=y
+CONFIG_CPU_IBPB_ENTRY=y
+CONFIG_CPU_IBRS_ENTRY=y
+CONFIG_CPU_SRSO=y
+# CONFIG_SLS is not set
+# CONFIG_GDS_FORCE_MITIGATION is not set
+CONFIG_ARCH_HAS_ADD_PAGES=y
+CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
+CONFIG_USE_PERCPU_NUMA_NODE_ID=y
+
+#
+# Power management and ACPI options
+#
+CONFIG_SUSPEND=y
+CONFIG_SUSPEND_FREEZER=y
+# CONFIG_SUSPEND_SKIP_SYNC is not set
+CONFIG_HIBERNATE_CALLBACKS=y
+# CONFIG_HIBERNATION is not set
+CONFIG_PM_SLEEP=y
+CONFIG_PM_SLEEP_SMP=y
+# CONFIG_PM_AUTOSLEEP is not set
+# CONFIG_PM_WAKELOCKS is not set
+CONFIG_PM=y
+CONFIG_PM_DEBUG=y
+# CONFIG_PM_ADVANCED_DEBUG is not set
+# CONFIG_PM_TEST_SUSPEND is not set
+CONFIG_PM_SLEEP_DEBUG=y
+# CONFIG_DPM_WATCHDOG is not set
+CONFIG_PM_TRACE=y
+CONFIG_PM_TRACE_RTC=y
+CONFIG_PM_CLK=y
+# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
+# CONFIG_ENERGY_MODEL is not set
+CONFIG_ARCH_SUPPORTS_ACPI=y
+CONFIG_ACPI=y
+CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
+CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
+CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
+# CONFIG_ACPI_DEBUGGER is not set
+CONFIG_ACPI_SPCR_TABLE=y
+# CONFIG_ACPI_FPDT is not set
+CONFIG_ACPI_LPIT=y
+CONFIG_ACPI_SLEEP=y
+CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
+# CONFIG_ACPI_EC_DEBUGFS is not set
+# CONFIG_ACPI_AC is not set
+# CONFIG_ACPI_BATTERY is not set
+CONFIG_ACPI_BUTTON=y
+# CONFIG_ACPI_FAN is not set
+# CONFIG_ACPI_TAD is not set
+# CONFIG_ACPI_DOCK is not set
+CONFIG_ACPI_CPU_FREQ_PSS=y
+CONFIG_ACPI_PROCESSOR_CSTATE=y
+CONFIG_ACPI_PROCESSOR_IDLE=y
+CONFIG_ACPI_CPPC_LIB=y
+CONFIG_ACPI_PROCESSOR=y
+CONFIG_ACPI_HOTPLUG_CPU=y
+# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
+CONFIG_ACPI_THERMAL=y
+CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
+# CONFIG_ACPI_TABLE_UPGRADE is not set
+# CONFIG_ACPI_DEBUG is not set
+# CONFIG_ACPI_PCI_SLOT is not set
+CONFIG_ACPI_CONTAINER=y
+# CONFIG_ACPI_HOTPLUG_MEMORY is not set
+CONFIG_ACPI_HOTPLUG_IOAPIC=y
+# CONFIG_ACPI_SBS is not set
+# CONFIG_ACPI_HED is not set
+# CONFIG_ACPI_CUSTOM_METHOD is not set
+# CONFIG_ACPI_BGRT is not set
+# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
+# CONFIG_ACPI_NFIT is not set
+CONFIG_ACPI_NUMA=y
+# CONFIG_ACPI_HMAT is not set
+CONFIG_HAVE_ACPI_APEI=y
+CONFIG_HAVE_ACPI_APEI_NMI=y
+# CONFIG_ACPI_APEI is not set
+# CONFIG_ACPI_DPTF is not set
+# CONFIG_ACPI_CONFIGFS is not set
+# CONFIG_PMIC_OPREGION is not set
+# CONFIG_X86_PM_TIMER is not set
+CONFIG_ACPI_PRMT=y
+
+#
+# CPU Frequency scaling
+#
+CONFIG_CPU_FREQ=y
+CONFIG_CPU_FREQ_GOV_ATTR_SET=y
+CONFIG_CPU_FREQ_STAT=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
+CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
+
+#
+# CPU frequency scaling drivers
+#
+CONFIG_X86_INTEL_PSTATE=y
+# CONFIG_X86_PCC_CPUFREQ is not set
+# CONFIG_X86_ACPI_CPUFREQ is not set
+# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
+# CONFIG_X86_P4_CLOCKMOD is not set
+
+#
+# shared options
+#
+# end of CPU Frequency scaling
+
+#
+# CPU Idle
+#
+CONFIG_CPU_IDLE=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPU_IDLE_GOV_MENU=y
+# CONFIG_CPU_IDLE_GOV_TEO is not set
+# CONFIG_CPU_IDLE_GOV_HALTPOLL is not set
+CONFIG_HALTPOLL_CPUIDLE=y
+# end of CPU Idle
+
+CONFIG_INTEL_IDLE=y
+# end of Power management and ACPI options
+
+#
+# Bus options (PCI etc.)
+#
+CONFIG_PCI_DIRECT=y
+CONFIG_PCI_MMCONFIG=y
+CONFIG_PCI_XEN=y
+CONFIG_MMCONF_FAM10H=y
+# CONFIG_PCI_CNB20LE_QUIRK is not set
+# CONFIG_ISA_BUS is not set
+CONFIG_ISA_DMA_API=y
+CONFIG_AMD_NB=y
+# end of Bus options (PCI etc.)
+
+#
+# Binary Emulations
+#
+CONFIG_IA32_EMULATION=y
+CONFIG_COMPAT_32=y
+CONFIG_COMPAT=y
+CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
+CONFIG_SYSVIPC_COMPAT=y
+# end of Binary Emulations
+
+CONFIG_HAVE_KVM=y
+CONFIG_VIRTUALIZATION=y
+# CONFIG_KVM is not set
+CONFIG_AS_AVX512=y
+CONFIG_AS_SHA1_NI=y
+CONFIG_AS_SHA256_NI=y
+CONFIG_AS_TPAUSE=y
+
+#
+# General architecture-dependent options
+#
+CONFIG_CRASH_CORE=y
+CONFIG_KEXEC_CORE=y
+CONFIG_HOTPLUG_SMT=y
+CONFIG_GENERIC_ENTRY=y
+CONFIG_KPROBES=y
+# CONFIG_JUMP_LABEL is not set
+# CONFIG_STATIC_CALL_SELFTEST is not set
+CONFIG_OPTPROBES=y
+CONFIG_KPROBES_ON_FTRACE=y
+CONFIG_UPROBES=y
+CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
+CONFIG_ARCH_USE_BUILTIN_BSWAP=y
+CONFIG_KRETPROBES=y
+CONFIG_HAVE_IOREMAP_PROT=y
+CONFIG_HAVE_KPROBES=y
+CONFIG_HAVE_KRETPROBES=y
+CONFIG_HAVE_OPTPROBES=y
+CONFIG_HAVE_KPROBES_ON_FTRACE=y
+CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
+CONFIG_HAVE_NMI=y
+CONFIG_TRACE_IRQFLAGS_SUPPORT=y
+CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
+CONFIG_HAVE_ARCH_TRACEHOOK=y
+CONFIG_HAVE_DMA_CONTIGUOUS=y
+CONFIG_GENERIC_SMP_IDLE_THREAD=y
+CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
+CONFIG_ARCH_HAS_SET_MEMORY=y
+CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
+CONFIG_ARCH_HAS_CPU_FINALIZE_INIT=y
+CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
+CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
+CONFIG_ARCH_WANTS_NO_INSTR=y
+CONFIG_HAVE_ASM_MODVERSIONS=y
+CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
+CONFIG_HAVE_RSEQ=y
+CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
+CONFIG_HAVE_HW_BREAKPOINT=y
+CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
+CONFIG_HAVE_USER_RETURN_NOTIFIER=y
+CONFIG_HAVE_PERF_EVENTS_NMI=y
+CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
+CONFIG_HAVE_PERF_REGS=y
+CONFIG_HAVE_PERF_USER_STACK_DUMP=y
+CONFIG_HAVE_ARCH_JUMP_LABEL=y
+CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
+CONFIG_MMU_GATHER_TABLE_FREE=y
+CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
+CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
+CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
+CONFIG_HAVE_CMPXCHG_LOCAL=y
+CONFIG_HAVE_CMPXCHG_DOUBLE=y
+CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
+CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
+CONFIG_HAVE_ARCH_SECCOMP=y
+CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
+CONFIG_SECCOMP=y
+CONFIG_SECCOMP_FILTER=y
+# CONFIG_SECCOMP_CACHE_DEBUG is not set
+CONFIG_HAVE_ARCH_STACKLEAK=y
+CONFIG_HAVE_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR_STRONG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
+CONFIG_LTO_NONE=y
+CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
+CONFIG_HAVE_CONTEXT_TRACKING=y
+CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK=y
+CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
+CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
+CONFIG_HAVE_MOVE_PUD=y
+CONFIG_HAVE_MOVE_PMD=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
+CONFIG_HAVE_ARCH_HUGE_VMAP=y
+CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
+CONFIG_HAVE_ARCH_SOFT_DIRTY=y
+CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
+CONFIG_MODULES_USE_ELF_RELA=y
+CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
+CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
+CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
+CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
+CONFIG_HAVE_EXIT_THREAD=y
+CONFIG_ARCH_MMAP_RND_BITS=31
+CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
+CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
+CONFIG_HAVE_STACK_VALIDATION=y
+CONFIG_HAVE_RELIABLE_STACKTRACE=y
+CONFIG_OLD_SIGSUSPEND3=y
+CONFIG_COMPAT_OLD_SIGACTION=y
+CONFIG_COMPAT_32BIT_TIME=y
+CONFIG_HAVE_ARCH_VMAP_STACK=y
+CONFIG_VMAP_STACK=y
+CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
+# CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT is not set
+CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
+CONFIG_STRICT_MODULE_RWX=y
+CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
+CONFIG_ARCH_USE_MEMREMAP_PROT=y
+# CONFIG_LOCK_EVENT_COUNTS is not set
+CONFIG_ARCH_HAS_MEM_ENCRYPT=y
+CONFIG_ARCH_HAS_CC_PLATFORM=y
+CONFIG_HAVE_STATIC_CALL=y
+CONFIG_HAVE_STATIC_CALL_INLINE=y
+CONFIG_HAVE_PREEMPT_DYNAMIC=y
+CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
+CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
+CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
+CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
+
+#
+# GCOV-based kernel profiling
+#
+# CONFIG_GCOV_KERNEL is not set
+CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
+# end of GCOV-based kernel profiling
+
+CONFIG_HAVE_GCC_PLUGINS=y
+# end of General architecture-dependent options
+
+CONFIG_RT_MUTEXES=y
+CONFIG_BASE_SMALL=0
+CONFIG_MODULE_SIG_FORMAT=y
+CONFIG_MODULES=y
+# CONFIG_MODULE_FORCE_LOAD is not set
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_FORCE_UNLOAD=y
+# CONFIG_MODVERSIONS is not set
+# CONFIG_MODULE_SRCVERSION_ALL is not set
+CONFIG_MODULE_SIG=y
+# CONFIG_MODULE_SIG_FORCE is not set
+CONFIG_MODULE_SIG_ALL=y
+# CONFIG_MODULE_SIG_SHA1 is not set
+# CONFIG_MODULE_SIG_SHA224 is not set
+CONFIG_MODULE_SIG_SHA256=y
+# CONFIG_MODULE_SIG_SHA384 is not set
+# CONFIG_MODULE_SIG_SHA512 is not set
+CONFIG_MODULE_SIG_HASH="sha256"
+CONFIG_MODULE_COMPRESS_NONE=y
+# CONFIG_MODULE_COMPRESS_GZIP is not set
+# CONFIG_MODULE_COMPRESS_XZ is not set
+# CONFIG_MODULE_COMPRESS_ZSTD is not set
+# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
+CONFIG_MODPROBE_PATH="/sbin/modprobe"
+# CONFIG_TRIM_UNUSED_KSYMS is not set
+CONFIG_MODULES_TREE_LOOKUP=y
+CONFIG_BLOCK=y
+CONFIG_BLK_CGROUP_RWSTAT=y
+CONFIG_BLK_DEV_BSG_COMMON=y
+CONFIG_BLK_DEV_BSGLIB=y
+CONFIG_BLK_DEV_INTEGRITY=y
+CONFIG_BLK_DEV_INTEGRITY_T10=y
+# CONFIG_BLK_DEV_ZONED is not set
+CONFIG_BLK_DEV_THROTTLING=y
+# CONFIG_BLK_DEV_THROTTLING_LOW is not set
+CONFIG_BLK_WBT=y
+CONFIG_BLK_WBT_MQ=y
+# CONFIG_BLK_CGROUP_IOLATENCY is not set
+# CONFIG_BLK_CGROUP_IOCOST is not set
+# CONFIG_BLK_CGROUP_IOPRIO is not set
+# CONFIG_BLK_DEBUG_FS is not set
+# CONFIG_BLK_SED_OPAL is not set
+# CONFIG_BLK_INLINE_ENCRYPTION is not set
+
+#
+# Partition Types
+#
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_ACORN_PARTITION is not set
+# CONFIG_AIX_PARTITION is not set
+# CONFIG_OSF_PARTITION is not set
+# CONFIG_AMIGA_PARTITION is not set
+# CONFIG_ATARI_PARTITION is not set
+# CONFIG_MAC_PARTITION is not set
+CONFIG_MSDOS_PARTITION=y
+# CONFIG_BSD_DISKLABEL is not set
+# CONFIG_MINIX_SUBPARTITION is not set
+# CONFIG_SOLARIS_X86_PARTITION is not set
+# CONFIG_UNIXWARE_DISKLABEL is not set
+# CONFIG_LDM_PARTITION is not set
+# CONFIG_SGI_PARTITION is not set
+# CONFIG_ULTRIX_PARTITION is not set
+# CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
+CONFIG_EFI_PARTITION=y
+# CONFIG_SYSV68_PARTITION is not set
+# CONFIG_CMDLINE_PARTITION is not set
+# end of Partition Types
+
+CONFIG_BLOCK_COMPAT=y
+CONFIG_BLK_MQ_PCI=y
+CONFIG_BLK_MQ_VIRTIO=y
+CONFIG_BLK_MQ_RDMA=y
+CONFIG_BLK_PM=y
+CONFIG_BLOCK_HOLDER_DEPRECATED=y
+
+#
+# IO Schedulers
+#
+CONFIG_MQ_IOSCHED_DEADLINE=y
+CONFIG_MQ_IOSCHED_KYBER=m
+CONFIG_IOSCHED_BFQ=m
+CONFIG_BFQ_GROUP_IOSCHED=y
+# CONFIG_BFQ_CGROUP_DEBUG is not set
+# end of IO Schedulers
+
+CONFIG_ASN1=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_INLINE_READ_UNLOCK=y
+CONFIG_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_INLINE_WRITE_UNLOCK=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
+CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
+CONFIG_MUTEX_SPIN_ON_OWNER=y
+CONFIG_RWSEM_SPIN_ON_OWNER=y
+CONFIG_LOCK_SPIN_ON_OWNER=y
+CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
+CONFIG_QUEUED_SPINLOCKS=y
+CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
+CONFIG_QUEUED_RWLOCKS=y
+CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
+CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
+CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
+CONFIG_FREEZER=y
+
+#
+# Executable file formats
+#
+CONFIG_BINFMT_ELF=y
+CONFIG_COMPAT_BINFMT_ELF=y
+CONFIG_ELFCORE=y
+CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
+CONFIG_BINFMT_SCRIPT=y
+CONFIG_BINFMT_MISC=y
+CONFIG_COREDUMP=y
+# end of Executable file formats
+
+#
+# Memory Management options
+#
+CONFIG_SELECT_MEMORY_MODEL=y
+CONFIG_SPARSEMEM_MANUAL=y
+CONFIG_SPARSEMEM=y
+CONFIG_SPARSEMEM_EXTREME=y
+CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
+CONFIG_SPARSEMEM_VMEMMAP=y
+CONFIG_HAVE_FAST_GUP=y
+CONFIG_NUMA_KEEP_MEMINFO=y
+CONFIG_MEMORY_ISOLATION=y
+CONFIG_HAVE_BOOTMEM_INFO_NODE=y
+CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG_SPARSE=y
+# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
+CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
+CONFIG_MEMORY_HOTREMOVE=y
+CONFIG_MHP_MEMMAP_ON_MEMORY=y
+CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
+CONFIG_MEMORY_BALLOON=y
+CONFIG_BALLOON_COMPACTION=y
+CONFIG_COMPACTION=y
+CONFIG_PAGE_REPORTING=y
+CONFIG_MIGRATION=y
+CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
+CONFIG_ARCH_ENABLE_THP_MIGRATION=y
+CONFIG_CONTIG_ALLOC=y
+CONFIG_PHYS_ADDR_T_64BIT=y
+CONFIG_VIRT_TO_BUS=y
+CONFIG_MMU_NOTIFIER=y
+# CONFIG_KSM is not set
+CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
+CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
+CONFIG_MEMORY_FAILURE=y
+# CONFIG_HWPOISON_INJECT is not set
+CONFIG_TRANSPARENT_HUGEPAGE=y
+# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
+CONFIG_ARCH_WANTS_THP_SWAP=y
+CONFIG_THP_SWAP=y
+# CONFIG_CLEANCACHE is not set
+# CONFIG_FRONTSWAP is not set
+CONFIG_CMA=y
+# CONFIG_CMA_DEBUG is not set
+# CONFIG_CMA_DEBUGFS is not set
+# CONFIG_CMA_SYSFS is not set
+CONFIG_CMA_AREAS=7
+CONFIG_MEM_SOFT_DIRTY=y
+# CONFIG_ZPOOL is not set
+CONFIG_ZSMALLOC=m
+# CONFIG_ZSMALLOC_STAT is not set
+CONFIG_GENERIC_EARLY_IOREMAP=y
+# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
+# CONFIG_IDLE_PAGE_TRACKING is not set
+CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
+CONFIG_ARCH_HAS_PTE_DEVMAP=y
+CONFIG_ARCH_HAS_ZONE_DMA_SET=y
+CONFIG_ZONE_DMA=y
+CONFIG_ZONE_DMA32=y
+CONFIG_ZONE_DEVICE=y
+CONFIG_HMM_MIRROR=y
+# CONFIG_DEVICE_PRIVATE is not set
+CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
+CONFIG_ARCH_HAS_PKEYS=y
+# CONFIG_PERCPU_STATS is not set
+# CONFIG_GUP_TEST is not set
+# CONFIG_READ_ONLY_THP_FOR_FS is not set
+CONFIG_ARCH_HAS_PTE_SPECIAL=y
+
+#
+# Data Access Monitoring
+#
+# CONFIG_DAMON is not set
+# end of Data Access Monitoring
+# end of Memory Management options
+
+CONFIG_NET=y
+CONFIG_NET_INGRESS=y
+CONFIG_NET_EGRESS=y
+CONFIG_NET_REDIRECT=y
+CONFIG_SKB_EXTENSIONS=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+CONFIG_PACKET_DIAG=m
+CONFIG_UNIX=y
+CONFIG_UNIX_SCM=y
+CONFIG_AF_UNIX_OOB=y
+CONFIG_UNIX_DIAG=m
+CONFIG_TLS=y
+CONFIG_TLS_DEVICE=y
+# CONFIG_TLS_TOE is not set
+CONFIG_XFRM=y
+CONFIG_XFRM_ALGO=y
+CONFIG_XFRM_USER=y
+# CONFIG_XFRM_USER_COMPAT is not set
+CONFIG_XFRM_INTERFACE=m
+# CONFIG_XFRM_SUB_POLICY is not set
+# CONFIG_XFRM_MIGRATE is not set
+CONFIG_XFRM_STATISTICS=y
+CONFIG_XFRM_AH=m
+CONFIG_XFRM_ESP=m
+CONFIG_XFRM_IPCOMP=m
+CONFIG_NET_KEY=m
+# CONFIG_NET_KEY_MIGRATE is not set
+# CONFIG_SMC is not set
+CONFIG_XDP_SOCKETS=y
+# CONFIG_XDP_SOCKETS_DIAG is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+# CONFIG_IP_FIB_TRIE_STATS is not set
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_ROUTE_CLASSID=y
+# CONFIG_IP_PNP is not set
+CONFIG_NET_IPIP=m
+# CONFIG_NET_IPGRE_DEMUX is not set
+CONFIG_NET_IP_TUNNEL=m
+CONFIG_IP_MROUTE_COMMON=y
+CONFIG_IP_MROUTE=y
+# CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_SYN_COOKIES=y
+# CONFIG_NET_IPVTI is not set
+CONFIG_NET_UDP_TUNNEL=m
+CONFIG_NET_FOU=m
+CONFIG_NET_FOU_IP_TUNNELS=y
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+# CONFIG_INET_ESP_OFFLOAD is not set
+# CONFIG_INET_ESPINTCP is not set
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_TABLE_PERTURB_ORDER=16
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+# CONFIG_INET_RAW_DIAG is not set
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_TCP_CONG_ADVANCED=y
+# CONFIG_TCP_CONG_BIC is not set
+CONFIG_TCP_CONG_CUBIC=y
+# CONFIG_TCP_CONG_WESTWOOD is not set
+# CONFIG_TCP_CONG_HTCP is not set
+# CONFIG_TCP_CONG_HSTCP is not set
+# CONFIG_TCP_CONG_HYBLA is not set
+# CONFIG_TCP_CONG_VEGAS is not set
+# CONFIG_TCP_CONG_NV is not set
+# CONFIG_TCP_CONG_SCALABLE is not set
+CONFIG_TCP_CONG_LP=m
+# CONFIG_TCP_CONG_VENO is not set
+# CONFIG_TCP_CONG_YEAH is not set
+# CONFIG_TCP_CONG_ILLINOIS is not set
+# CONFIG_TCP_CONG_DCTCP is not set
+# CONFIG_TCP_CONG_CDG is not set
+CONFIG_TCP_CONG_BBR=m
+CONFIG_DEFAULT_CUBIC=y
+# CONFIG_DEFAULT_RENO is not set
+CONFIG_DEFAULT_TCP_CONG="cubic"
+CONFIG_TCP_MD5SIG=y
+CONFIG_IPV6=y
+# CONFIG_IPV6_ROUTER_PREF is not set
+# CONFIG_IPV6_OPTIMISTIC_DAD is not set
+# CONFIG_INET6_AH is not set
+CONFIG_INET6_ESP=m
+# CONFIG_INET6_ESP_OFFLOAD is not set
+# CONFIG_INET6_ESPINTCP is not set
+# CONFIG_INET6_IPCOMP is not set
+# CONFIG_IPV6_MIP6 is not set
+# CONFIG_IPV6_ILA is not set
+CONFIG_INET6_TUNNEL=m
+# CONFIG_IPV6_VTI is not set
+# CONFIG_IPV6_SIT is not set
+CONFIG_IPV6_TUNNEL=m
+CONFIG_IPV6_FOU=m
+CONFIG_IPV6_FOU_TUNNEL=m
+CONFIG_IPV6_MULTIPLE_TABLES=y
+# CONFIG_IPV6_SUBTREES is not set
+# CONFIG_IPV6_MROUTE is not set
+# CONFIG_IPV6_SEG6_LWTUNNEL is not set
+# CONFIG_IPV6_SEG6_HMAC is not set
+# CONFIG_IPV6_RPL_LWTUNNEL is not set
+# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
+# CONFIG_NETLABEL is not set
+# CONFIG_MPTCP is not set
+CONFIG_NETWORK_SECMARK=y
+CONFIG_NET_PTP_CLASSIFY=y
+# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
+CONFIG_NETFILTER=y
+CONFIG_NETFILTER_ADVANCED=y
+CONFIG_BRIDGE_NETFILTER=m
+
+#
+# Core Netfilter Configuration
+#
+CONFIG_NETFILTER_INGRESS=y
+CONFIG_NETFILTER_NETLINK=y
+CONFIG_NETFILTER_FAMILY_BRIDGE=y
+CONFIG_NETFILTER_FAMILY_ARP=y
+# CONFIG_NETFILTER_NETLINK_HOOK is not set
+CONFIG_NETFILTER_NETLINK_ACCT=m
+CONFIG_NETFILTER_NETLINK_QUEUE=y
+CONFIG_NETFILTER_NETLINK_LOG=y
+CONFIG_NETFILTER_NETLINK_OSF=m
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_LOG_SYSLOG=m
+CONFIG_NETFILTER_CONNCOUNT=m
+CONFIG_NF_CONNTRACK_MARK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_ZONES=y
+CONFIG_NF_CONNTRACK_PROCFS=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CONNTRACK_TIMEOUT=y
+CONFIG_NF_CONNTRACK_TIMESTAMP=y
+CONFIG_NF_CONNTRACK_LABELS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_GRE=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_H323=m
+CONFIG_NF_CONNTRACK_IRC=m
+CONFIG_NF_CONNTRACK_BROADCAST=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_SNMP=m
+CONFIG_NF_CONNTRACK_PPTP=m
+CONFIG_NF_CONNTRACK_SANE=m
+CONFIG_NF_CONNTRACK_SIP=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_NF_NAT=m
+CONFIG_NF_NAT_AMANDA=m
+CONFIG_NF_NAT_FTP=m
+CONFIG_NF_NAT_IRC=m
+CONFIG_NF_NAT_SIP=m
+CONFIG_NF_NAT_TFTP=m
+CONFIG_NF_NAT_REDIRECT=y
+CONFIG_NF_NAT_MASQUERADE=y
+CONFIG_NETFILTER_SYNPROXY=y
+CONFIG_NF_TABLES=m
+CONFIG_NF_TABLES_INET=y
+# CONFIG_NF_TABLES_NETDEV is not set
+CONFIG_NFT_NUMGEN=m
+CONFIG_NFT_CT=m
+# CONFIG_NFT_FLOW_OFFLOAD is not set
+CONFIG_NFT_COUNTER=m
+CONFIG_NFT_CONNLIMIT=m
+CONFIG_NFT_LOG=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_MASQ=m
+CONFIG_NFT_REDIR=m
+CONFIG_NFT_NAT=m
+CONFIG_NFT_TUNNEL=m
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_QUEUE=m
+CONFIG_NFT_QUOTA=m
+CONFIG_NFT_REJECT=m
+CONFIG_NFT_REJECT_INET=m
+CONFIG_NFT_COMPAT=m
+CONFIG_NFT_HASH=m
+CONFIG_NFT_XFRM=m
+CONFIG_NFT_SOCKET=m
+CONFIG_NFT_OSF=m
+CONFIG_NFT_TPROXY=m
+CONFIG_NFT_SYNPROXY=m
+# CONFIG_NF_FLOW_TABLE_INET is not set
+CONFIG_NF_FLOW_TABLE=m
+CONFIG_NETFILTER_XTABLES=y
+CONFIG_NETFILTER_XTABLES_COMPAT=y
+
+#
+# Xtables combined modules
+#
+CONFIG_NETFILTER_XT_MARK=m
+CONFIG_NETFILTER_XT_CONNMARK=m
+CONFIG_NETFILTER_XT_SET=m
+
+#
+# Xtables targets
+#
+CONFIG_NETFILTER_XT_TARGET_AUDIT=m
+CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
+CONFIG_NETFILTER_XT_TARGET_CT=m
+CONFIG_NETFILTER_XT_TARGET_DSCP=m
+CONFIG_NETFILTER_XT_TARGET_HL=m
+CONFIG_NETFILTER_XT_TARGET_HMARK=m
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
+CONFIG_NETFILTER_XT_TARGET_LOG=m
+CONFIG_NETFILTER_XT_TARGET_MARK=m
+CONFIG_NETFILTER_XT_NAT=m
+CONFIG_NETFILTER_XT_TARGET_NETMAP=m
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
+CONFIG_NETFILTER_XT_TARGET_RATEEST=m
+CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
+CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
+CONFIG_NETFILTER_XT_TARGET_TEE=m
+CONFIG_NETFILTER_XT_TARGET_TPROXY=m
+CONFIG_NETFILTER_XT_TARGET_TRACE=m
+CONFIG_NETFILTER_XT_TARGET_SECMARK=m
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
+CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
+
+#
+# Xtables matches
+#
+CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
+CONFIG_NETFILTER_XT_MATCH_BPF=m
+CONFIG_NETFILTER_XT_MATCH_CGROUP=m
+CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
+CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_CPU=m
+CONFIG_NETFILTER_XT_MATCH_DCCP=m
+CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
+CONFIG_NETFILTER_XT_MATCH_DSCP=m
+CONFIG_NETFILTER_XT_MATCH_ECN=m
+CONFIG_NETFILTER_XT_MATCH_ESP=m
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_HELPER=m
+CONFIG_NETFILTER_XT_MATCH_HL=m
+CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
+CONFIG_NETFILTER_XT_MATCH_IPVS=m
+CONFIG_NETFILTER_XT_MATCH_L2TP=m
+CONFIG_NETFILTER_XT_MATCH_LENGTH=m
+CONFIG_NETFILTER_XT_MATCH_LIMIT=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_MARK=m
+CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
+CONFIG_NETFILTER_XT_MATCH_NFACCT=m
+CONFIG_NETFILTER_XT_MATCH_OSF=m
+CONFIG_NETFILTER_XT_MATCH_OWNER=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_QUOTA=m
+CONFIG_NETFILTER_XT_MATCH_RATEEST=m
+CONFIG_NETFILTER_XT_MATCH_REALM=y
+CONFIG_NETFILTER_XT_MATCH_RECENT=m
+CONFIG_NETFILTER_XT_MATCH_SCTP=m
+CONFIG_NETFILTER_XT_MATCH_SOCKET=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
+CONFIG_NETFILTER_XT_MATCH_STRING=m
+CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
+CONFIG_NETFILTER_XT_MATCH_TIME=m
+CONFIG_NETFILTER_XT_MATCH_U32=m
+# end of Core Netfilter Configuration
+
+CONFIG_IP_SET=m
+CONFIG_IP_SET_MAX=256
+CONFIG_IP_SET_BITMAP_IP=m
+CONFIG_IP_SET_BITMAP_IPMAC=m
+CONFIG_IP_SET_BITMAP_PORT=m
+CONFIG_IP_SET_HASH_IP=m
+CONFIG_IP_SET_HASH_IPMARK=m
+CONFIG_IP_SET_HASH_IPPORT=m
+CONFIG_IP_SET_HASH_IPPORTIP=m
+CONFIG_IP_SET_HASH_IPPORTNET=m
+# CONFIG_IP_SET_HASH_IPMAC is not set
+CONFIG_IP_SET_HASH_MAC=m
+CONFIG_IP_SET_HASH_NETPORTNET=m
+CONFIG_IP_SET_HASH_NET=m
+CONFIG_IP_SET_HASH_NETNET=m
+CONFIG_IP_SET_HASH_NETPORT=m
+CONFIG_IP_SET_HASH_NETIFACE=m
+CONFIG_IP_SET_LIST_SET=m
+CONFIG_IP_VS=m
+# CONFIG_IP_VS_IPV6 is not set
+# CONFIG_IP_VS_DEBUG is not set
+CONFIG_IP_VS_TAB_BITS=12
+
+#
+# IPVS transport protocol load balancing support
+#
+CONFIG_IP_VS_PROTO_TCP=y
+CONFIG_IP_VS_PROTO_UDP=y
+CONFIG_IP_VS_PROTO_AH_ESP=y
+CONFIG_IP_VS_PROTO_ESP=y
+CONFIG_IP_VS_PROTO_AH=y
+CONFIG_IP_VS_PROTO_SCTP=y
+
+#
+# IPVS scheduler
+#
+CONFIG_IP_VS_RR=m
+CONFIG_IP_VS_WRR=m
+CONFIG_IP_VS_LC=m
+CONFIG_IP_VS_WLC=m
+CONFIG_IP_VS_FO=m
+CONFIG_IP_VS_OVF=m
+CONFIG_IP_VS_LBLC=m
+CONFIG_IP_VS_LBLCR=m
+CONFIG_IP_VS_DH=m
+CONFIG_IP_VS_SH=m
+# CONFIG_IP_VS_MH is not set
+CONFIG_IP_VS_SED=m
+CONFIG_IP_VS_NQ=m
+# CONFIG_IP_VS_TWOS is not set
+
+#
+# IPVS SH scheduler
+#
+CONFIG_IP_VS_SH_TAB_BITS=8
+
+#
+# IPVS MH scheduler
+#
+CONFIG_IP_VS_MH_TAB_INDEX=12
+
+#
+# IPVS application helper
+#
+CONFIG_IP_VS_FTP=m
+CONFIG_IP_VS_NFCT=y
+CONFIG_IP_VS_PE_SIP=m
+
+#
+# IP: Netfilter Configuration
+#
+CONFIG_NF_DEFRAG_IPV4=y
+CONFIG_NF_SOCKET_IPV4=m
+CONFIG_NF_TPROXY_IPV4=m
+CONFIG_NF_TABLES_IPV4=y
+CONFIG_NFT_REJECT_IPV4=m
+# CONFIG_NFT_DUP_IPV4 is not set
+# CONFIG_NFT_FIB_IPV4 is not set
+# CONFIG_NF_TABLES_ARP is not set
+# CONFIG_NF_FLOW_TABLE_IPV4 is not set
+CONFIG_NF_DUP_IPV4=m
+CONFIG_NF_LOG_ARP=m
+CONFIG_NF_LOG_IPV4=m
+CONFIG_NF_REJECT_IPV4=y
+CONFIG_NF_NAT_SNMP_BASIC=m
+CONFIG_NF_NAT_PPTP=m
+CONFIG_NF_NAT_H323=m
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_RPFILTER=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_TARGET_SYNPROXY=m
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_TARGET_CLUSTERIP=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_SECURITY=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+# end of IP: Netfilter Configuration
+
+#
+# IPv6: Netfilter Configuration
+#
+CONFIG_NF_SOCKET_IPV6=m
+CONFIG_NF_TPROXY_IPV6=m
+CONFIG_NF_TABLES_IPV6=y
+CONFIG_NFT_REJECT_IPV6=m
+# CONFIG_NFT_DUP_IPV6 is not set
+# CONFIG_NFT_FIB_IPV6 is not set
+# CONFIG_NF_FLOW_TABLE_IPV6 is not set
+CONFIG_NF_DUP_IPV6=m
+CONFIG_NF_REJECT_IPV6=y
+CONFIG_NF_LOG_IPV6=m
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=m
+# CONFIG_IP6_NF_MATCH_EUI64 is not set
+# CONFIG_IP6_NF_MATCH_FRAG is not set
+# CONFIG_IP6_NF_MATCH_OPTS is not set
+# CONFIG_IP6_NF_MATCH_HL is not set
+# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
+# CONFIG_IP6_NF_MATCH_MH is not set
+CONFIG_IP6_NF_MATCH_RPFILTER=m
+# CONFIG_IP6_NF_MATCH_RT is not set
+# CONFIG_IP6_NF_MATCH_SRH is not set
+# CONFIG_IP6_NF_TARGET_HL is not set
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_TARGET_SYNPROXY=y
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_RAW=m
+CONFIG_IP6_NF_SECURITY=m
+CONFIG_IP6_NF_NAT=m
+CONFIG_IP6_NF_TARGET_MASQUERADE=m
+# CONFIG_IP6_NF_TARGET_NPT is not set
+# end of IPv6: Netfilter Configuration
+
+CONFIG_NF_DEFRAG_IPV6=y
+# CONFIG_NF_TABLES_BRIDGE is not set
+# CONFIG_NF_CONNTRACK_BRIDGE is not set
+CONFIG_BRIDGE_NF_EBTABLES=m
+CONFIG_BRIDGE_EBT_BROUTE=m
+CONFIG_BRIDGE_EBT_T_FILTER=m
+CONFIG_BRIDGE_EBT_T_NAT=m
+CONFIG_BRIDGE_EBT_802_3=m
+CONFIG_BRIDGE_EBT_AMONG=m
+CONFIG_BRIDGE_EBT_ARP=m
+CONFIG_BRIDGE_EBT_IP=m
+# CONFIG_BRIDGE_EBT_IP6 is not set
+CONFIG_BRIDGE_EBT_LIMIT=m
+CONFIG_BRIDGE_EBT_MARK=m
+CONFIG_BRIDGE_EBT_PKTTYPE=m
+CONFIG_BRIDGE_EBT_STP=m
+CONFIG_BRIDGE_EBT_VLAN=m
+CONFIG_BRIDGE_EBT_ARPREPLY=m
+CONFIG_BRIDGE_EBT_DNAT=m
+CONFIG_BRIDGE_EBT_MARK_T=m
+CONFIG_BRIDGE_EBT_REDIRECT=m
+CONFIG_BRIDGE_EBT_SNAT=m
+CONFIG_BRIDGE_EBT_LOG=m
+CONFIG_BRIDGE_EBT_NFLOG=m
+# CONFIG_BPFILTER is not set
+# CONFIG_IP_DCCP is not set
+# CONFIG_IP_SCTP is not set
+# CONFIG_RDS is not set
+# CONFIG_TIPC is not set
+# CONFIG_ATM is not set
+# CONFIG_L2TP is not set
+CONFIG_STP=y
+CONFIG_BRIDGE=y
+CONFIG_BRIDGE_IGMP_SNOOPING=y
+CONFIG_BRIDGE_VLAN_FILTERING=y
+# CONFIG_BRIDGE_MRP is not set
+# CONFIG_BRIDGE_CFM is not set
+# CONFIG_NET_DSA is not set
+CONFIG_VLAN_8021Q=m
+# CONFIG_VLAN_8021Q_GVRP is not set
+# CONFIG_VLAN_8021Q_MVRP is not set
+CONFIG_LLC=y
+# CONFIG_LLC2 is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_PHONET is not set
+# CONFIG_6LOWPAN is not set
+# CONFIG_IEEE802154 is not set
+CONFIG_NET_SCHED=y
+
+#
+# Queueing/Scheduling
+#
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_HFSC=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_MULTIQ=m
+CONFIG_NET_SCH_RED=m
+CONFIG_NET_SCH_SFB=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TEQL=m
+CONFIG_NET_SCH_TBF=m
+# CONFIG_NET_SCH_CBS is not set
+# CONFIG_NET_SCH_ETF is not set
+# CONFIG_NET_SCH_TAPRIO is not set
+CONFIG_NET_SCH_GRED=m
+CONFIG_NET_SCH_NETEM=m
+CONFIG_NET_SCH_DRR=m
+CONFIG_NET_SCH_MQPRIO=m
+# CONFIG_NET_SCH_SKBPRIO is not set
+CONFIG_NET_SCH_CHOKE=m
+CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
+# CONFIG_NET_SCH_CAKE is not set
+CONFIG_NET_SCH_FQ=m
+CONFIG_NET_SCH_HHF=m
+CONFIG_NET_SCH_PIE=m
+# CONFIG_NET_SCH_FQ_PIE is not set
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_SCH_PLUG=m
+# CONFIG_NET_SCH_ETS is not set
+# CONFIG_NET_SCH_DEFAULT is not set
+
+#
+# Classification
+#
+CONFIG_NET_CLS=y
+CONFIG_NET_CLS_BASIC=m
+CONFIG_NET_CLS_ROUTE4=m
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+# CONFIG_CLS_U32_PERF is not set
+CONFIG_CLS_U32_MARK=y
+# CONFIG_NET_CLS_FLOW is not set
+CONFIG_NET_CLS_CGROUP=m
+CONFIG_NET_CLS_BPF=m
+# CONFIG_NET_CLS_FLOWER is not set
+# CONFIG_NET_CLS_MATCHALL is not set
+# CONFIG_NET_EMATCH is not set
+CONFIG_NET_CLS_ACT=y
+# CONFIG_NET_ACT_POLICE is not set
+CONFIG_NET_ACT_GACT=m
+# CONFIG_GACT_PROB is not set
+CONFIG_NET_ACT_MIRRED=y
+# CONFIG_NET_ACT_SAMPLE is not set
+# CONFIG_NET_ACT_IPT is not set
+CONFIG_NET_ACT_NAT=m
+CONFIG_NET_ACT_PEDIT=y
+# CONFIG_NET_ACT_SIMP is not set
+# CONFIG_NET_ACT_SKBEDIT is not set
+# CONFIG_NET_ACT_CSUM is not set
+# CONFIG_NET_ACT_MPLS is not set
+# CONFIG_NET_ACT_VLAN is not set
+# CONFIG_NET_ACT_BPF is not set
+# CONFIG_NET_ACT_CONNMARK is not set
+# CONFIG_NET_ACT_CTINFO is not set
+# CONFIG_NET_ACT_SKBMOD is not set
+# CONFIG_NET_ACT_IFE is not set
+# CONFIG_NET_ACT_TUNNEL_KEY is not set
+# CONFIG_NET_ACT_CT is not set
+# CONFIG_NET_ACT_GATE is not set
+# CONFIG_NET_TC_SKB_EXT is not set
+CONFIG_NET_SCH_FIFO=y
+# CONFIG_DCB is not set
+CONFIG_DNS_RESOLVER=m
+# CONFIG_BATMAN_ADV is not set
+# CONFIG_OPENVSWITCH is not set
+CONFIG_VSOCKETS=y
+CONFIG_VSOCKETS_DIAG=y
+CONFIG_VSOCKETS_LOOPBACK=y
+CONFIG_VMWARE_VMCI_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS_COMMON=y
+# CONFIG_HYPERV_VSOCKETS is not set
+CONFIG_NETLINK_DIAG=m
+# CONFIG_MPLS is not set
+# CONFIG_NET_NSH is not set
+# CONFIG_HSR is not set
+# CONFIG_NET_SWITCHDEV is not set
+CONFIG_NET_L3_MASTER_DEV=y
+# CONFIG_QRTR is not set
+# CONFIG_NET_NCSI is not set
+CONFIG_PCPU_DEV_REFCNT=y
+CONFIG_RPS=y
+CONFIG_RFS_ACCEL=y
+CONFIG_SOCK_RX_QUEUE_MAPPING=y
+CONFIG_XPS=y
+CONFIG_CGROUP_NET_PRIO=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_NET_RX_BUSY_POLL=y
+CONFIG_BQL=y
+CONFIG_BPF_STREAM_PARSER=y
+CONFIG_NET_FLOW_LIMIT=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_NET_DROP_MONITOR is not set
+# end of Network testing
+# end of Networking options
+
+# CONFIG_HAMRADIO is not set
+# CONFIG_CAN is not set
+# CONFIG_BT is not set
+# CONFIG_AF_RXRPC is not set
+# CONFIG_AF_KCM is not set
+CONFIG_STREAM_PARSER=y
+# CONFIG_MCTP is not set
+CONFIG_FIB_RULES=y
+# CONFIG_WIRELESS is not set
+# CONFIG_RFKILL is not set
+CONFIG_NET_9P=y
+CONFIG_NET_9P_VIRTIO=y
+# CONFIG_NET_9P_XEN is not set
+# CONFIG_NET_9P_RDMA is not set
+# CONFIG_NET_9P_DEBUG is not set
+# CONFIG_CAIF is not set
+# CONFIG_CEPH_LIB is not set
+# CONFIG_NFC is not set
+# CONFIG_PSAMPLE is not set
+# CONFIG_NET_IFE is not set
+CONFIG_LWTUNNEL=y
+CONFIG_LWTUNNEL_BPF=y
+CONFIG_DST_CACHE=y
+CONFIG_GRO_CELLS=y
+CONFIG_SOCK_VALIDATE_XMIT=y
+CONFIG_NET_SELFTESTS=y
+CONFIG_NET_SOCK_MSG=y
+CONFIG_NET_DEVLINK=y
+CONFIG_PAGE_POOL=y
+CONFIG_FAILOVER=y
+CONFIG_ETHTOOL_NETLINK=y
+
+#
+# Device Drivers
+#
+CONFIG_HAVE_EISA=y
+# CONFIG_EISA is not set
+CONFIG_HAVE_PCI=y
+CONFIG_PCI=y
+CONFIG_PCI_DOMAINS=y
+CONFIG_PCIEPORTBUS=y
+CONFIG_HOTPLUG_PCI_PCIE=y
+CONFIG_PCIEAER=y
+# CONFIG_PCIEAER_INJECT is not set
+# CONFIG_PCIE_ECRC is not set
+CONFIG_PCIEASPM=y
+CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
+# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIE_PME=y
+# CONFIG_PCIE_DPC is not set
+# CONFIG_PCIE_PTM is not set
+CONFIG_PCI_MSI=y
+CONFIG_PCI_MSI_IRQ_DOMAIN=y
+CONFIG_PCI_QUIRKS=y
+# CONFIG_PCI_DEBUG is not set
+# CONFIG_PCI_STUB is not set
+CONFIG_XEN_PCIDEV_FRONTEND=y
+CONFIG_PCI_ATS=y
+CONFIG_PCI_LOCKLESS_CONFIG=y
+# CONFIG_PCI_IOV is not set
+CONFIG_PCI_PRI=y
+CONFIG_PCI_PASID=y
+# CONFIG_PCI_P2PDMA is not set
+CONFIG_PCI_LABEL=y
+CONFIG_PCI_HYPERV=y
+# CONFIG_PCIE_BUS_TUNE_OFF is not set
+CONFIG_PCIE_BUS_DEFAULT=y
+# CONFIG_PCIE_BUS_SAFE is not set
+# CONFIG_PCIE_BUS_PERFORMANCE is not set
+# CONFIG_PCIE_BUS_PEER2PEER is not set
+CONFIG_HOTPLUG_PCI=y
+CONFIG_HOTPLUG_PCI_ACPI=y
+# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
+# CONFIG_HOTPLUG_PCI_CPCI is not set
+# CONFIG_HOTPLUG_PCI_SHPC is not set
+
+#
+# PCI controller drivers
+#
+# CONFIG_VMD is not set
+CONFIG_PCI_HYPERV_INTERFACE=y
+
+#
+# DesignWare PCI Core Support
+#
+# CONFIG_PCIE_DW_PLAT_HOST is not set
+# CONFIG_PCI_MESON is not set
+# end of DesignWare PCI Core Support
+
+#
+# Mobiveil PCIe Core Support
+#
+# end of Mobiveil PCIe Core Support
+
+#
+# Cadence PCIe controllers support
+#
+# end of Cadence PCIe controllers support
+# end of PCI controller drivers
+
+#
+# PCI Endpoint
+#
+# CONFIG_PCI_ENDPOINT is not set
+# end of PCI Endpoint
+
+#
+# PCI switch controller drivers
+#
+# CONFIG_PCI_SW_SWITCHTEC is not set
+# end of PCI switch controller drivers
+
+# CONFIG_CXL_BUS is not set
+# CONFIG_PCCARD is not set
+# CONFIG_RAPIDIO is not set
+
+#
+# Generic Driver Options
+#
+CONFIG_AUXILIARY_BUS=y
+CONFIG_UEVENT_HELPER=y
+CONFIG_UEVENT_HELPER_PATH=""
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DEVTMPFS_SAFE=y
+CONFIG_STANDALONE=y
+CONFIG_PREVENT_FIRMWARE_BUILD=y
+
+#
+# Firmware loader
+#
+CONFIG_FW_LOADER=y
+CONFIG_EXTRA_FIRMWARE=""
+# CONFIG_FW_LOADER_USER_HELPER is not set
+# CONFIG_FW_LOADER_COMPRESS is not set
+CONFIG_FW_CACHE=y
+# end of Firmware loader
+
+CONFIG_ALLOW_DEV_COREDUMP=y
+# CONFIG_DEBUG_DRIVER is not set
+CONFIG_DEBUG_DEVRES=y
+# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
+# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
+CONFIG_SYS_HYPERVISOR=y
+CONFIG_GENERIC_CPU_AUTOPROBE=y
+CONFIG_GENERIC_CPU_VULNERABILITIES=y
+CONFIG_DMA_SHARED_BUFFER=y
+# CONFIG_DMA_FENCE_TRACE is not set
+# end of Generic Driver Options
+
+#
+# Bus devices
+#
+# CONFIG_MHI_BUS is not set
+# end of Bus devices
+
+CONFIG_CONNECTOR=y
+CONFIG_PROC_EVENTS=y
+
+#
+# Firmware Drivers
+#
+
+#
+# ARM System Control and Management Interface Protocol
+#
+# end of ARM System Control and Management Interface Protocol
+
+# CONFIG_EDD is not set
+# CONFIG_FIRMWARE_MEMMAP is not set
+CONFIG_DMIID=y
+# CONFIG_DMI_SYSFS is not set
+CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
+# CONFIG_ISCSI_IBFT is not set
+# CONFIG_FW_CFG_SYSFS is not set
+CONFIG_SYSFB=y
+# CONFIG_SYSFB_SIMPLEFB is not set
+# CONFIG_GOOGLE_FIRMWARE is not set
+
+#
+# EFI (Extensible Firmware Interface) Support
+#
+CONFIG_EFI_VARS=y
+CONFIG_EFI_ESRT=y
+CONFIG_EFI_VARS_PSTORE=y
+# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
+CONFIG_EFI_RUNTIME_MAP=y
+# CONFIG_EFI_FAKE_MEMMAP is not set
+CONFIG_EFI_RUNTIME_WRAPPERS=y
+CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
+# CONFIG_EFI_BOOTLOADER_CONTROL is not set
+# CONFIG_EFI_CAPSULE_LOADER is not set
+# CONFIG_EFI_TEST is not set
+# CONFIG_APPLE_PROPERTIES is not set
+# CONFIG_RESET_ATTACK_MITIGATION is not set
+# CONFIG_EFI_RCI2_TABLE is not set
+# CONFIG_EFI_DISABLE_PCI_DMA is not set
+# end of EFI (Extensible Firmware Interface) Support
+
+CONFIG_EFI_EARLYCON=y
+# CONFIG_EFI_CUSTOM_SSDT_OVERLAYS is not set
+
+#
+# Tegra firmware driver
+#
+# end of Tegra firmware driver
+# end of Firmware Drivers
+
+# CONFIG_GNSS is not set
+# CONFIG_MTD is not set
+# CONFIG_OF is not set
+CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
+# CONFIG_PARPORT is not set
+CONFIG_PNP=y
+CONFIG_PNP_DEBUG_MESSAGES=y
+
+#
+# Protocols
+#
+CONFIG_PNPACPI=y
+CONFIG_BLK_DEV=y
+# CONFIG_BLK_DEV_NULL_BLK is not set
+# CONFIG_BLK_DEV_FD is not set
+CONFIG_CDROM=y
+# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
+CONFIG_ZRAM=m
+CONFIG_ZRAM_DEF_COMP_LZORLE=y
+# CONFIG_ZRAM_DEF_COMP_LZ4 is not set
+# CONFIG_ZRAM_DEF_COMP_LZO is not set
+CONFIG_ZRAM_DEF_COMP="lzo-rle"
+# CONFIG_ZRAM_WRITEBACK is not set
+# CONFIG_ZRAM_MEMORY_TRACKING is not set
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
+# CONFIG_BLK_DEV_CRYPTOLOOP is not set
+# CONFIG_BLK_DEV_DRBD is not set
+# CONFIG_BLK_DEV_NBD is not set
+# CONFIG_BLK_DEV_RAM is not set
+# CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set
+CONFIG_XEN_BLKDEV_FRONTEND=y
+CONFIG_XEN_BLKDEV_BACKEND=m
+CONFIG_VIRTIO_BLK=m
+# CONFIG_BLK_DEV_RBD is not set
+# CONFIG_BLK_DEV_RSXX is not set
+
+#
+# NVME Support
+#
+CONFIG_NVME_CORE=y
+CONFIG_BLK_DEV_NVME=y
+# CONFIG_NVME_MULTIPATH is not set
+# CONFIG_NVME_RDMA is not set
+# CONFIG_NVME_FC is not set
+# CONFIG_NVME_TCP is not set
+# CONFIG_NVME_TARGET is not set
+# end of NVME Support
+
+#
+# Misc devices
+#
+# CONFIG_DUMMY_IRQ is not set
+# CONFIG_IBM_ASM is not set
+# CONFIG_PHANTOM is not set
+# CONFIG_TIFM_CORE is not set
+# CONFIG_ENCLOSURE_SERVICES is not set
+# CONFIG_HP_ILO is not set
+CONFIG_VMWARE_BALLOON=y
+# CONFIG_SRAM is not set
+# CONFIG_DW_XDATA_PCIE is not set
+# CONFIG_PCI_ENDPOINT_TEST is not set
+# CONFIG_XILINX_SDFEC is not set
+# CONFIG_C2PORT is not set
+
+#
+# EEPROM support
+#
+CONFIG_EEPROM_93CX6=m
+# end of EEPROM support
+
+# CONFIG_CB710_CORE is not set
+
+#
+# Texas Instruments shared transport line discipline
+#
+# end of Texas Instruments shared transport line discipline
+
+#
+# Altera FPGA firmware download module (requires I2C)
+#
+# CONFIG_INTEL_MEI is not set
+# CONFIG_INTEL_MEI_ME is not set
+# CONFIG_INTEL_MEI_TXE is not set
+CONFIG_VMWARE_VMCI=y
+# CONFIG_GENWQE is not set
+# CONFIG_ECHO is not set
+# CONFIG_BCM_VK is not set
+# CONFIG_MISC_ALCOR_PCI is not set
+# CONFIG_MISC_RTSX_PCI is not set
+# CONFIG_HABANA_AI is not set
+# CONFIG_UACCE is not set
+CONFIG_PVPANIC=y
+CONFIG_PVPANIC_MMIO=y
+# CONFIG_PVPANIC_PCI is not set
+# end of Misc devices
+
+#
+# SCSI device support
+#
+CONFIG_SCSI_MOD=y
+# CONFIG_RAID_ATTRS is not set
+CONFIG_SCSI_COMMON=y
+CONFIG_SCSI=y
+CONFIG_SCSI_DMA=y
+CONFIG_SCSI_PROC_FS=y
+
+#
+# SCSI support type (disk, tape, CD-ROM)
+#
+CONFIG_BLK_DEV_SD=y
+# CONFIG_CHR_DEV_ST is not set
+CONFIG_BLK_DEV_SR=y
+# CONFIG_CHR_DEV_SG is not set
+CONFIG_BLK_DEV_BSG=y
+# CONFIG_CHR_DEV_SCH is not set
+CONFIG_SCSI_CONSTANTS=y
+# CONFIG_SCSI_LOGGING is not set
+# CONFIG_SCSI_SCAN_ASYNC is not set
+
+#
+# SCSI Transports
+#
+CONFIG_SCSI_SPI_ATTRS=y
+# CONFIG_SCSI_FC_ATTRS is not set
+CONFIG_SCSI_ISCSI_ATTRS=m
+# CONFIG_SCSI_SAS_ATTRS is not set
+# CONFIG_SCSI_SAS_LIBSAS is not set
+# CONFIG_SCSI_SRP_ATTRS is not set
+# end of SCSI Transports
+
+CONFIG_SCSI_LOWLEVEL=y
+CONFIG_ISCSI_TCP=m
+# CONFIG_ISCSI_BOOT_SYSFS is not set
+# CONFIG_SCSI_CXGB3_ISCSI is not set
+# CONFIG_SCSI_CXGB4_ISCSI is not set
+# CONFIG_SCSI_BNX2_ISCSI is not set
+# CONFIG_BE2ISCSI is not set
+# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
+# CONFIG_SCSI_HPSA is not set
+# CONFIG_SCSI_3W_9XXX is not set
+# CONFIG_SCSI_3W_SAS is not set
+# CONFIG_SCSI_ACARD is not set
+# CONFIG_SCSI_AACRAID is not set
+# CONFIG_SCSI_AIC7XXX is not set
+# CONFIG_SCSI_AIC79XX is not set
+# CONFIG_SCSI_AIC94XX is not set
+# CONFIG_SCSI_MVSAS is not set
+# CONFIG_SCSI_MVUMI is not set
+# CONFIG_SCSI_DPT_I2O is not set
+# CONFIG_SCSI_ADVANSYS is not set
+# CONFIG_SCSI_ARCMSR is not set
+# CONFIG_SCSI_ESAS2R is not set
+# CONFIG_MEGARAID_NEWGEN is not set
+# CONFIG_MEGARAID_LEGACY is not set
+# CONFIG_MEGARAID_SAS is not set
+# CONFIG_SCSI_MPT3SAS is not set
+# CONFIG_SCSI_MPT2SAS is not set
+# CONFIG_SCSI_MPI3MR is not set
+# CONFIG_SCSI_SMARTPQI is not set
+# CONFIG_SCSI_UFSHCD is not set
+# CONFIG_SCSI_HPTIOP is not set
+# CONFIG_SCSI_BUSLOGIC is not set
+# CONFIG_SCSI_MYRB is not set
+# CONFIG_SCSI_MYRS is not set
+CONFIG_VMWARE_PVSCSI=y
+CONFIG_XEN_SCSI_FRONTEND=m
+CONFIG_HYPERV_STORAGE=y
+# CONFIG_SCSI_SNIC is not set
+# CONFIG_SCSI_DMX3191D is not set
+# CONFIG_SCSI_FDOMAIN_PCI is not set
+# CONFIG_SCSI_ISCI is not set
+# CONFIG_SCSI_IPS is not set
+# CONFIG_SCSI_INITIO is not set
+# CONFIG_SCSI_INIA100 is not set
+# CONFIG_SCSI_STEX is not set
+# CONFIG_SCSI_SYM53C8XX_2 is not set
+# CONFIG_SCSI_IPR is not set
+# CONFIG_SCSI_QLOGIC_1280 is not set
+# CONFIG_SCSI_QLA_ISCSI is not set
+# CONFIG_SCSI_DC395x is not set
+# CONFIG_SCSI_AM53C974 is not set
+# CONFIG_SCSI_WD719X is not set
+# CONFIG_SCSI_DEBUG is not set
+# CONFIG_SCSI_PMCRAID is not set
+# CONFIG_SCSI_PM8001 is not set
+CONFIG_SCSI_VIRTIO=y
+# CONFIG_SCSI_DH is not set
+# end of SCSI device support
+
+CONFIG_ATA=y
+CONFIG_SATA_HOST=y
+CONFIG_PATA_TIMINGS=y
+CONFIG_ATA_VERBOSE_ERROR=y
+CONFIG_ATA_FORCE=y
+CONFIG_ATA_ACPI=y
+# CONFIG_SATA_ZPODD is not set
+# CONFIG_SATA_PMP is not set
+
+#
+# Controllers with non-SFF native interface
+#
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_MOBILE_LPM_POLICY=0
+# CONFIG_SATA_AHCI_PLATFORM is not set
+# CONFIG_SATA_INIC162X is not set
+# CONFIG_SATA_ACARD_AHCI is not set
+# CONFIG_SATA_SIL24 is not set
+CONFIG_ATA_SFF=y
+
+#
+# SFF controllers with custom DMA interface
+#
+# CONFIG_PDC_ADMA is not set
+# CONFIG_SATA_QSTOR is not set
+# CONFIG_SATA_SX4 is not set
+CONFIG_ATA_BMDMA=y
+
+#
+# SATA SFF controllers with BMDMA
+#
+CONFIG_ATA_PIIX=y
+# CONFIG_SATA_MV is not set
+# CONFIG_SATA_NV is not set
+# CONFIG_SATA_PROMISE is not set
+# CONFIG_SATA_SIL is not set
+# CONFIG_SATA_SIS is not set
+# CONFIG_SATA_SVW is not set
+# CONFIG_SATA_ULI is not set
+# CONFIG_SATA_VIA is not set
+# CONFIG_SATA_VITESSE is not set
+
+#
+# PATA SFF controllers with BMDMA
+#
+# CONFIG_PATA_ALI is not set
+# CONFIG_PATA_AMD is not set
+# CONFIG_PATA_ARTOP is not set
+# CONFIG_PATA_ATIIXP is not set
+# CONFIG_PATA_ATP867X is not set
+# CONFIG_PATA_CMD64X is not set
+# CONFIG_PATA_CYPRESS is not set
+# CONFIG_PATA_EFAR is not set
+# CONFIG_PATA_HPT366 is not set
+# CONFIG_PATA_HPT37X is not set
+# CONFIG_PATA_HPT3X2N is not set
+# CONFIG_PATA_HPT3X3 is not set
+# CONFIG_PATA_IT8213 is not set
+# CONFIG_PATA_IT821X is not set
+# CONFIG_PATA_JMICRON is not set
+# CONFIG_PATA_MARVELL is not set
+# CONFIG_PATA_NETCELL is not set
+# CONFIG_PATA_NINJA32 is not set
+# CONFIG_PATA_NS87415 is not set
+# CONFIG_PATA_OLDPIIX is not set
+# CONFIG_PATA_OPTIDMA is not set
+# CONFIG_PATA_PDC2027X is not set
+# CONFIG_PATA_PDC_OLD is not set
+# CONFIG_PATA_RADISYS is not set
+# CONFIG_PATA_RDC is not set
+# CONFIG_PATA_SCH is not set
+# CONFIG_PATA_SERVERWORKS is not set
+# CONFIG_PATA_SIL680 is not set
+# CONFIG_PATA_SIS is not set
+# CONFIG_PATA_TOSHIBA is not set
+# CONFIG_PATA_TRIFLEX is not set
+# CONFIG_PATA_VIA is not set
+# CONFIG_PATA_WINBOND is not set
+
+#
+# PIO-only SFF controllers
+#
+# CONFIG_PATA_CMD640_PCI is not set
+# CONFIG_PATA_MPIIX is not set
+# CONFIG_PATA_NS87410 is not set
+# CONFIG_PATA_OPTI is not set
+# CONFIG_PATA_PLATFORM is not set
+# CONFIG_PATA_RZ1000 is not set
+
+#
+# Generic fallback / legacy drivers
+#
+# CONFIG_PATA_ACPI is not set
+CONFIG_ATA_GENERIC=y
+# CONFIG_PATA_LEGACY is not set
+CONFIG_MD=y
+CONFIG_BLK_DEV_MD=y
+CONFIG_MD_AUTODETECT=y
+# CONFIG_MD_LINEAR is not set
+CONFIG_MD_RAID0=y
+CONFIG_MD_RAID1=m
+CONFIG_MD_RAID10=m
+CONFIG_MD_RAID456=m
+# CONFIG_MD_MULTIPATH is not set
+# CONFIG_MD_FAULTY is not set
+CONFIG_BCACHE=m
+# CONFIG_BCACHE_DEBUG is not set
+# CONFIG_BCACHE_CLOSURES_DEBUG is not set
+# CONFIG_BCACHE_ASYNC_REGISTRATION is not set
+CONFIG_BLK_DEV_DM_BUILTIN=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_DEBUG=y
+CONFIG_DM_BUFIO=y
+# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
+CONFIG_DM_BIO_PRISON=m
+CONFIG_DM_PERSISTENT_DATA=m
+# CONFIG_DM_UNSTRIPED is not set
+CONFIG_DM_CRYPT=y
+CONFIG_DM_SNAPSHOT=m
+CONFIG_DM_THIN_PROVISIONING=m
+CONFIG_DM_CACHE=m
+CONFIG_DM_CACHE_SMQ=m
+CONFIG_DM_WRITECACHE=m
+# CONFIG_DM_EBS is not set
+# CONFIG_DM_ERA is not set
+# CONFIG_DM_CLONE is not set
+# CONFIG_DM_MIRROR is not set
+CONFIG_DM_RAID=m
+# CONFIG_DM_ZERO is not set
+CONFIG_DM_MULTIPATH=m
+CONFIG_DM_MULTIPATH_QL=m
+CONFIG_DM_MULTIPATH_ST=m
+CONFIG_DM_MULTIPATH_HST=m
+# CONFIG_DM_MULTIPATH_IOA is not set
+# CONFIG_DM_DELAY is not set
+# CONFIG_DM_DUST is not set
+CONFIG_DM_INIT=y
+# CONFIG_DM_UEVENT is not set
+# CONFIG_DM_FLAKEY is not set
+CONFIG_DM_VERITY=y
+# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
+# CONFIG_DM_VERITY_FEC is not set
+# CONFIG_DM_SWITCH is not set
+# CONFIG_DM_LOG_WRITES is not set
+CONFIG_DM_INTEGRITY=y
+CONFIG_TARGET_CORE=m
+CONFIG_TCM_IBLOCK=m
+CONFIG_TCM_FILEIO=m
+# CONFIG_TCM_PSCSI is not set
+CONFIG_TCM_USER2=m
+CONFIG_LOOPBACK_TARGET=m
+# CONFIG_ISCSI_TARGET is not set
+# CONFIG_FUSION is not set
+
+#
+# IEEE 1394 (FireWire) support
+#
+# CONFIG_FIREWIRE is not set
+# CONFIG_FIREWIRE_NOSY is not set
+# end of IEEE 1394 (FireWire) support
+
+# CONFIG_MACINTOSH_DRIVERS is not set
+CONFIG_NETDEVICES=y
+CONFIG_NET_CORE=y
+# CONFIG_BONDING is not set
+CONFIG_DUMMY=m
+CONFIG_WIREGUARD=m
+# CONFIG_WIREGUARD_DEBUG is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_NET_FC is not set
+CONFIG_IFB=m
+# CONFIG_NET_TEAM is not set
+CONFIG_MACVLAN=y
+# CONFIG_MACVTAP is not set
+CONFIG_IPVLAN_L3S=y
+CONFIG_IPVLAN=m
+# CONFIG_IPVTAP is not set
+CONFIG_VXLAN=m
+CONFIG_GENEVE=m
+# CONFIG_BAREUDP is not set
+# CONFIG_GTP is not set
+# CONFIG_MACSEC is not set
+# CONFIG_NETCONSOLE is not set
+CONFIG_TUN=m
+# CONFIG_TUN_VNET_CROSS_LE is not set
+CONFIG_VETH=m
+CONFIG_VIRTIO_NET=y
+# CONFIG_NLMON is not set
+# CONFIG_NET_VRF is not set
+# CONFIG_ARCNET is not set
+CONFIG_ETHERNET=y
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_ALTEON is not set
+# CONFIG_ALTERA_TSE is not set
+CONFIG_NET_VENDOR_AMAZON=y
+CONFIG_ENA_ETHERNET=y
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+# CONFIG_NET_VENDOR_ATHEROS is not set
+# CONFIG_CX_ECAT is not set
+# CONFIG_NET_VENDOR_BROADCOM is not set
+# CONFIG_NET_VENDOR_CADENCE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+# CONFIG_NET_VENDOR_CHELSIO is not set
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_CORTINA is not set
+# CONFIG_DNET is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
+# CONFIG_NET_VENDOR_EMULEX is not set
+# CONFIG_NET_VENDOR_EZCHIP is not set
+CONFIG_NET_VENDOR_GOOGLE=y
+CONFIG_GVE=m
+# CONFIG_NET_VENDOR_HUAWEI is not set
+CONFIG_NET_VENDOR_I825XX=y
+CONFIG_NET_VENDOR_INTEL=y
+# CONFIG_E100 is not set
+# CONFIG_E1000 is not set
+# CONFIG_E1000E is not set
+# CONFIG_IGB is not set
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
+CONFIG_IXGBEVF=y
+# CONFIG_I40E is not set
+# CONFIG_I40EVF is not set
+# CONFIG_ICE is not set
+# CONFIG_FM10K is not set
+# CONFIG_IGC is not set
+# CONFIG_JME is not set
+CONFIG_NET_VENDOR_LITEX=y
+# CONFIG_NET_VENDOR_MARVELL is not set
+CONFIG_NET_VENDOR_MELLANOX=y
+CONFIG_MLX4_EN=m
+CONFIG_MLX4_CORE=m
+CONFIG_MLX4_DEBUG=y
+CONFIG_MLX4_CORE_GEN2=y
+CONFIG_MLX5_CORE=m
+CONFIG_MLX5_ACCEL=y
+CONFIG_MLX5_FPGA=y
+CONFIG_MLX5_CORE_EN=y
+CONFIG_MLX5_EN_ARFS=y
+CONFIG_MLX5_EN_RXNFC=y
+CONFIG_MLX5_MPFS=y
+# CONFIG_MLX5_CORE_IPOIB is not set
+CONFIG_MLX5_FPGA_IPSEC=y
+# CONFIG_MLX5_FPGA_TLS is not set
+# CONFIG_MLX5_TLS is not set
+# CONFIG_MLX5_SF is not set
+CONFIG_MLXSW_CORE=m
+CONFIG_MLXSW_CORE_THERMAL=y
+CONFIG_MLXSW_PCI=m
+CONFIG_MLXFW=m
+# CONFIG_NET_VENDOR_MICREL is not set
+CONFIG_NET_VENDOR_MICROCHIP=y
+# CONFIG_LAN743X is not set
+# CONFIG_NET_VENDOR_MICROSEMI is not set
+CONFIG_NET_VENDOR_MICROSOFT=y
+# CONFIG_MICROSOFT_MANA is not set
+# CONFIG_NET_VENDOR_MYRI is not set
+# CONFIG_FEALNX is not set
+# CONFIG_NET_VENDOR_NI is not set
+# CONFIG_NET_VENDOR_NATSEMI is not set
+# CONFIG_NET_VENDOR_NETERION is not set
+# CONFIG_NET_VENDOR_NETRONOME is not set
+# CONFIG_NET_VENDOR_NVIDIA is not set
+# CONFIG_NET_VENDOR_OKI is not set
+# CONFIG_ETHOC is not set
+# CONFIG_NET_VENDOR_PACKET_ENGINES is not set
+CONFIG_NET_VENDOR_PENSANDO=y
+# CONFIG_IONIC is not set
+# CONFIG_NET_VENDOR_QLOGIC is not set
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_VENDOR_QUALCOMM is not set
+# CONFIG_NET_VENDOR_RDC is not set
+# CONFIG_NET_VENDOR_REALTEK is not set
+# CONFIG_NET_VENDOR_RENESAS is not set
+# CONFIG_NET_VENDOR_ROCKER is not set
+# CONFIG_NET_VENDOR_SAMSUNG is not set
+# CONFIG_NET_VENDOR_SEEQ is not set
+# CONFIG_NET_VENDOR_SILAN is not set
+# CONFIG_NET_VENDOR_SIS is not set
+# CONFIG_NET_VENDOR_SOLARFLARE is not set
+# CONFIG_NET_VENDOR_SMSC is not set
+# CONFIG_NET_VENDOR_SOCIONEXT is not set
+# CONFIG_NET_VENDOR_STMICRO is not set
+# CONFIG_NET_VENDOR_SUN is not set
+# CONFIG_NET_VENDOR_SYNOPSYS is not set
+# CONFIG_NET_VENDOR_TEHUTI is not set
+# CONFIG_NET_VENDOR_TI is not set
+# CONFIG_NET_VENDOR_VIA is not set
+# CONFIG_NET_VENDOR_WIZNET is not set
+CONFIG_NET_VENDOR_XILINX=y
+# CONFIG_XILINX_EMACLITE is not set
+# CONFIG_XILINX_AXI_EMAC is not set
+# CONFIG_XILINX_LL_TEMAC is not set
+# CONFIG_FDDI is not set
+# CONFIG_HIPPI is not set
+# CONFIG_NET_SB1000 is not set
+CONFIG_PHYLIB=y
+CONFIG_SWPHY=y
+CONFIG_FIXED_PHY=y
+
+#
+# MII PHY device drivers
+#
+# CONFIG_AMD_PHY is not set
+# CONFIG_ADIN_PHY is not set
+# CONFIG_AQUANTIA_PHY is not set
+# CONFIG_AX88796B_PHY is not set
+# CONFIG_BROADCOM_PHY is not set
+# CONFIG_BCM54140_PHY is not set
+# CONFIG_BCM7XXX_PHY is not set
+# CONFIG_BCM84881_PHY is not set
+# CONFIG_BCM87XX_PHY is not set
+# CONFIG_CICADA_PHY is not set
+# CONFIG_CORTINA_PHY is not set
+# CONFIG_DAVICOM_PHY is not set
+# CONFIG_ICPLUS_PHY is not set
+# CONFIG_LXT_PHY is not set
+# CONFIG_INTEL_XWAY_PHY is not set
+# CONFIG_LSI_ET1011C_PHY is not set
+# CONFIG_MARVELL_PHY is not set
+# CONFIG_MARVELL_10G_PHY is not set
+# CONFIG_MARVELL_88X2222_PHY is not set
+# CONFIG_MAXLINEAR_GPHY is not set
+# CONFIG_MEDIATEK_GE_PHY is not set
+# CONFIG_MICREL_PHY is not set
+# CONFIG_MICROCHIP_PHY is not set
+# CONFIG_MICROCHIP_T1_PHY is not set
+# CONFIG_MICROSEMI_PHY is not set
+# CONFIG_MOTORCOMM_PHY is not set
+# CONFIG_NATIONAL_PHY is not set
+# CONFIG_NXP_C45_TJA11XX_PHY is not set
+# CONFIG_QSEMI_PHY is not set
+# CONFIG_REALTEK_PHY is not set
+# CONFIG_RENESAS_PHY is not set
+# CONFIG_ROCKCHIP_PHY is not set
+# CONFIG_SMSC_PHY is not set
+# CONFIG_STE10XP is not set
+# CONFIG_TERANETICS_PHY is not set
+# CONFIG_DP83822_PHY is not set
+# CONFIG_DP83TC811_PHY is not set
+# CONFIG_DP83848_PHY is not set
+# CONFIG_DP83867_PHY is not set
+# CONFIG_DP83869_PHY is not set
+# CONFIG_VITESSE_PHY is not set
+# CONFIG_XILINX_GMII2RGMII is not set
+CONFIG_MDIO_DEVICE=y
+CONFIG_MDIO_BUS=y
+CONFIG_FWNODE_MDIO=y
+CONFIG_ACPI_MDIO=y
+CONFIG_MDIO_DEVRES=y
+# CONFIG_MDIO_BITBANG is not set
+# CONFIG_MDIO_BCM_UNIMAC is not set
+# CONFIG_MDIO_MSCC_MIIM is not set
+# CONFIG_MDIO_THUNDER is not set
+
+#
+# MDIO Multiplexers
+#
+
+#
+# PCS device drivers
+#
+# CONFIG_PCS_XPCS is not set
+# end of PCS device drivers
+
+CONFIG_PPP=m
+# CONFIG_PPP_BSDCOMP is not set
+# CONFIG_PPP_DEFLATE is not set
+# CONFIG_PPP_FILTER is not set
+# CONFIG_PPP_MPPE is not set
+# CONFIG_PPP_MULTILINK is not set
+# CONFIG_PPPOE is not set
+CONFIG_PPP_ASYNC=m
+# CONFIG_PPP_SYNC_TTY is not set
+# CONFIG_SLIP is not set
+CONFIG_SLHC=m
+
+#
+# Host-side USB support is needed for USB Network Adapter support
+#
+# CONFIG_WLAN is not set
+# CONFIG_WAN is not set
+
+#
+# Wireless WAN
+#
+# CONFIG_WWAN is not set
+# end of Wireless WAN
+
+CONFIG_XEN_NETDEV_FRONTEND=y
+CONFIG_XEN_NETDEV_BACKEND=m
+CONFIG_VMXNET3=y
+# CONFIG_FUJITSU_ES is not set
+CONFIG_HYPERV_NET=y
+# CONFIG_NETDEVSIM is not set
+CONFIG_NET_FAILOVER=y
+# CONFIG_ISDN is not set
+
+#
+# Input device support
+#
+CONFIG_INPUT=y
+CONFIG_INPUT_FF_MEMLESS=y
+CONFIG_INPUT_SPARSEKMAP=m
+# CONFIG_INPUT_MATRIXKMAP is not set
+
+#
+# Userland interfaces
+#
+# CONFIG_INPUT_MOUSEDEV is not set
+# CONFIG_INPUT_JOYDEV is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_INPUT_EVBUG is not set
+
+#
+# Input Device Drivers
+#
+CONFIG_INPUT_KEYBOARD=y
+CONFIG_KEYBOARD_ATKBD=y
+# CONFIG_KEYBOARD_LKKBD is not set
+# CONFIG_KEYBOARD_NEWTON is not set
+# CONFIG_KEYBOARD_OPENCORES is not set
+# CONFIG_KEYBOARD_SAMSUNG is not set
+# CONFIG_KEYBOARD_STOWAWAY is not set
+# CONFIG_KEYBOARD_SUNKBD is not set
+# CONFIG_KEYBOARD_XTKBD is not set
+# CONFIG_INPUT_MOUSE is not set
+# CONFIG_INPUT_JOYSTICK is not set
+# CONFIG_INPUT_TABLET is not set
+# CONFIG_INPUT_TOUCHSCREEN is not set
+CONFIG_INPUT_MISC=y
+# CONFIG_INPUT_AD714X is not set
+# CONFIG_INPUT_E3X0_BUTTON is not set
+# CONFIG_INPUT_ATLAS_BTNS is not set
+CONFIG_INPUT_UINPUT=m
+# CONFIG_INPUT_ADXL34X is not set
+# CONFIG_INPUT_CMA3000 is not set
+CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
+# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
+# CONFIG_RMI4_CORE is not set
+
+#
+# Hardware I/O ports
+#
+CONFIG_SERIO=y
+CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
+CONFIG_SERIO_I8042=y
+CONFIG_SERIO_SERPORT=m
+CONFIG_SERIO_CT82C710=m
+CONFIG_SERIO_PCIPS2=m
+CONFIG_SERIO_LIBPS2=y
+CONFIG_SERIO_RAW=y
+# CONFIG_SERIO_ALTERA_PS2 is not set
+# CONFIG_SERIO_PS2MULT is not set
+# CONFIG_SERIO_ARC_PS2 is not set
+CONFIG_HYPERV_KEYBOARD=y
+# CONFIG_USERIO is not set
+# CONFIG_GAMEPORT is not set
+# end of Hardware I/O ports
+# end of Input device support
+
+#
+# Character devices
+#
+CONFIG_TTY=y
+CONFIG_VT=y
+CONFIG_CONSOLE_TRANSLATIONS=y
+CONFIG_VT_CONSOLE=y
+CONFIG_VT_CONSOLE_SLEEP=y
+CONFIG_HW_CONSOLE=y
+# CONFIG_VT_HW_CONSOLE_BINDING is not set
+CONFIG_UNIX98_PTYS=y
+# CONFIG_LEGACY_PTYS is not set
+CONFIG_LDISC_AUTOLOAD=y
+
+#
+# Serial drivers
+#
+CONFIG_SERIAL_EARLYCON=y
+CONFIG_SERIAL_8250=y
+# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
+CONFIG_SERIAL_8250_PNP=y
+# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
+# CONFIG_SERIAL_8250_FINTEK is not set
+CONFIG_SERIAL_8250_CONSOLE=y
+# CONFIG_SERIAL_8250_PCI is not set
+CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
+# CONFIG_SERIAL_8250_EXTENDED is not set
+# CONFIG_SERIAL_8250_DW is not set
+# CONFIG_SERIAL_8250_RT288X is not set
+# CONFIG_SERIAL_8250_LPSS is not set
+# CONFIG_SERIAL_8250_MID is not set
+
+#
+# Non-8250 serial port support
+#
+# CONFIG_SERIAL_UARTLITE is not set
+CONFIG_SERIAL_CORE=y
+CONFIG_SERIAL_CORE_CONSOLE=y
+# CONFIG_SERIAL_JSM is not set
+# CONFIG_SERIAL_LANTIQ is not set
+# CONFIG_SERIAL_SCCNXP is not set
+# CONFIG_SERIAL_BCM63XX is not set
+# CONFIG_SERIAL_ALTERA_JTAGUART is not set
+# CONFIG_SERIAL_ALTERA_UART is not set
+# CONFIG_SERIAL_ARC is not set
+# CONFIG_SERIAL_RP2 is not set
+# CONFIG_SERIAL_FSL_LPUART is not set
+# CONFIG_SERIAL_FSL_LINFLEXUART is not set
+# CONFIG_SERIAL_SPRD is not set
+# end of Serial drivers
+
+# CONFIG_SERIAL_NONSTANDARD is not set
+# CONFIG_N_GSM is not set
+# CONFIG_NOZOMI is not set
+# CONFIG_NULL_TTY is not set
+CONFIG_HVC_DRIVER=y
+CONFIG_HVC_IRQ=y
+CONFIG_HVC_XEN=y
+CONFIG_HVC_XEN_FRONTEND=y
+# CONFIG_SERIAL_DEV_BUS is not set
+# CONFIG_TTY_PRINTK is not set
+# CONFIG_VIRTIO_CONSOLE is not set
+# CONFIG_IPMI_HANDLER is not set
+CONFIG_HW_RANDOM=y
+# CONFIG_HW_RANDOM_TIMERIOMEM is not set
+# CONFIG_HW_RANDOM_INTEL is not set
+# CONFIG_HW_RANDOM_AMD is not set
+# CONFIG_HW_RANDOM_BA431 is not set
+# CONFIG_HW_RANDOM_VIA is not set
+CONFIG_HW_RANDOM_VIRTIO=y
+# CONFIG_HW_RANDOM_XIPHERA is not set
+# CONFIG_APPLICOM is not set
+# CONFIG_MWAVE is not set
+CONFIG_DEVMEM=y
+CONFIG_NVRAM=y
+CONFIG_DEVPORT=y
+CONFIG_HPET=y
+# CONFIG_HPET_MMAP is not set
+# CONFIG_HANGCHECK_TIMER is not set
+CONFIG_TCG_TPM=y
+# CONFIG_HW_RANDOM_TPM is not set
+CONFIG_TCG_TIS_CORE=y
+CONFIG_TCG_TIS=y
+# CONFIG_TCG_NSC is not set
+# CONFIG_TCG_ATMEL is not set
+# CONFIG_TCG_INFINEON is not set
+CONFIG_TCG_XEN=m
+CONFIG_TCG_CRB=y
+# CONFIG_TCG_VTPM_PROXY is not set
+# CONFIG_TELCLOCK is not set
+# CONFIG_XILLYBUS is not set
+CONFIG_RANDOM_TRUST_CPU=y
+# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
+# end of Character devices
+
+#
+# I2C support
+#
+# CONFIG_I2C is not set
+# end of I2C support
+
+# CONFIG_I3C is not set
+# CONFIG_SPI is not set
+# CONFIG_SPMI is not set
+# CONFIG_HSI is not set
+CONFIG_PPS=y
+# CONFIG_PPS_DEBUG is not set
+
+#
+# PPS clients support
+#
+# CONFIG_PPS_CLIENT_KTIMER is not set
+# CONFIG_PPS_CLIENT_LDISC is not set
+# CONFIG_PPS_CLIENT_GPIO is not set
+
+#
+# PPS generators support
+#
+
+#
+# PTP clock support
+#
+CONFIG_PTP_1588_CLOCK=y
+CONFIG_PTP_1588_CLOCK_OPTIONAL=y
+
+#
+# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
+#
+CONFIG_PTP_1588_CLOCK_KVM=m
+# CONFIG_PTP_1588_CLOCK_VMW is not set
+# end of PTP clock support
+
+# CONFIG_PINCTRL is not set
+# CONFIG_GPIOLIB is not set
+# CONFIG_W1 is not set
+# CONFIG_POWER_RESET is not set
+# CONFIG_POWER_SUPPLY is not set
+# CONFIG_HWMON is not set
+CONFIG_THERMAL=y
+# CONFIG_THERMAL_NETLINK is not set
+# CONFIG_THERMAL_STATISTICS is not set
+CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
+CONFIG_THERMAL_WRITABLE_TRIPS=y
+CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
+# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
+# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
+# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
+CONFIG_THERMAL_GOV_STEP_WISE=y
+# CONFIG_THERMAL_GOV_BANG_BANG is not set
+CONFIG_THERMAL_GOV_USER_SPACE=y
+# CONFIG_THERMAL_EMULATION is not set
+
+#
+# Intel thermal drivers
+#
+# CONFIG_INTEL_POWERCLAMP is not set
+CONFIG_X86_THERMAL_VECTOR=y
+CONFIG_X86_PKG_TEMP_THERMAL=m
+# CONFIG_INTEL_SOC_DTS_THERMAL is not set
+
+#
+# ACPI INT340X thermal drivers
+#
+# CONFIG_INT340X_THERMAL is not set
+# end of ACPI INT340X thermal drivers
+
+# CONFIG_INTEL_PCH_THERMAL is not set
+# CONFIG_INTEL_TCC_COOLING is not set
+# CONFIG_INTEL_MENLOW is not set
+# end of Intel thermal drivers
+
+CONFIG_WATCHDOG=y
+CONFIG_WATCHDOG_CORE=y
+# CONFIG_WATCHDOG_NOWAYOUT is not set
+CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
+CONFIG_WATCHDOG_OPEN_TIMEOUT=0
+# CONFIG_WATCHDOG_SYSFS is not set
+# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set
+
+#
+# Watchdog Pretimeout Governors
+#
+# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
+
+#
+# Watchdog Device Drivers
+#
+# CONFIG_SOFT_WATCHDOG is not set
+# CONFIG_WDAT_WDT is not set
+# CONFIG_XILINX_WATCHDOG is not set
+# CONFIG_MLX_WDT is not set
+# CONFIG_CADENCE_WATCHDOG is not set
+# CONFIG_DW_WATCHDOG is not set
+# CONFIG_MAX63XX_WATCHDOG is not set
+# CONFIG_ACQUIRE_WDT is not set
+# CONFIG_ADVANTECH_WDT is not set
+# CONFIG_ALIM1535_WDT is not set
+# CONFIG_ALIM7101_WDT is not set
+# CONFIG_EBC_C384_WDT is not set
+# CONFIG_F71808E_WDT is not set
+# CONFIG_SP5100_TCO is not set
+# CONFIG_SBC_FITPC2_WATCHDOG is not set
+# CONFIG_EUROTECH_WDT is not set
+# CONFIG_IB700_WDT is not set
+# CONFIG_IBMASR is not set
+# CONFIG_WAFER_WDT is not set
+# CONFIG_I6300ESB_WDT is not set
+# CONFIG_IE6XX_WDT is not set
+CONFIG_ITCO_WDT=y
+CONFIG_ITCO_VENDOR_SUPPORT=y
+# CONFIG_IT8712F_WDT is not set
+# CONFIG_IT87_WDT is not set
+# CONFIG_HP_WATCHDOG is not set
+# CONFIG_SC1200_WDT is not set
+# CONFIG_PC87413_WDT is not set
+# CONFIG_NV_TCO is not set
+# CONFIG_60XX_WDT is not set
+# CONFIG_CPU5_WDT is not set
+# CONFIG_SMSC_SCH311X_WDT is not set
+# CONFIG_SMSC37B787_WDT is not set
+# CONFIG_TQMX86_WDT is not set
+# CONFIG_VIA_WDT is not set
+# CONFIG_W83627HF_WDT is not set
+# CONFIG_W83877F_WDT is not set
+# CONFIG_W83977F_WDT is not set
+# CONFIG_MACHZ_WDT is not set
+# CONFIG_SBC_EPX_C3_WATCHDOG is not set
+# CONFIG_NI903X_WDT is not set
+# CONFIG_NIC7018_WDT is not set
+# CONFIG_XEN_WDT is not set
+
+#
+# PCI-based Watchdog Cards
+#
+# CONFIG_PCIPCWATCHDOG is not set
+# CONFIG_WDTPCI is not set
+CONFIG_SSB_POSSIBLE=y
+# CONFIG_SSB is not set
+CONFIG_BCMA_POSSIBLE=y
+# CONFIG_BCMA is not set
+
+#
+# Multifunction device drivers
+#
+CONFIG_MFD_CORE=y
+# CONFIG_MFD_MADERA is not set
+# CONFIG_HTC_PASIC3 is not set
+# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
+CONFIG_LPC_ICH=y
+CONFIG_LPC_SCH=m
+# CONFIG_MFD_INTEL_LPSS_ACPI is not set
+# CONFIG_MFD_INTEL_LPSS_PCI is not set
+# CONFIG_MFD_INTEL_PMC_BXT is not set
+# CONFIG_MFD_INTEL_PMT is not set
+# CONFIG_MFD_JANZ_CMODIO is not set
+# CONFIG_MFD_KEMPLD is not set
+# CONFIG_MFD_MT6397 is not set
+# CONFIG_MFD_RDC321X is not set
+# CONFIG_MFD_SM501 is not set
+# CONFIG_MFD_SYSCON is not set
+# CONFIG_MFD_TQMX86 is not set
+# CONFIG_MFD_VX855 is not set
+# end of Multifunction device drivers
+
+# CONFIG_REGULATOR is not set
+# CONFIG_RC_CORE is not set
+# CONFIG_MEDIA_CEC_SUPPORT is not set
+# CONFIG_MEDIA_SUPPORT is not set
+
+#
+# Graphics support
+#
+# CONFIG_AGP is not set
+# CONFIG_VGA_ARB is not set
+# CONFIG_VGA_SWITCHEROO is not set
+# CONFIG_DRM is not set
+
+#
+# ARM devices
+#
+# end of ARM devices
+
+#
+# Frame buffer Devices
+#
+# CONFIG_FB is not set
+# end of Frame buffer Devices
+
+#
+# Backlight & LCD device support
+#
+# CONFIG_LCD_CLASS_DEVICE is not set
+# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
+# end of Backlight & LCD device support
+
+#
+# Console display driver support
+#
+CONFIG_VGA_CONSOLE=y
+CONFIG_DUMMY_CONSOLE=y
+CONFIG_DUMMY_CONSOLE_COLUMNS=80
+CONFIG_DUMMY_CONSOLE_ROWS=25
+# end of Console display driver support
+# end of Graphics support
+
+# CONFIG_SOUND is not set
+
+#
+# HID support
+#
+# CONFIG_HID is not set
+
+#
+# Intel ISH HID support
+#
+# CONFIG_INTEL_ISH_HID is not set
+# end of Intel ISH HID support
+# end of HID support
+
+CONFIG_USB_OHCI_LITTLE_ENDIAN=y
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_MMC is not set
+# CONFIG_MEMSTICK is not set
+# CONFIG_NEW_LEDS is not set
+# CONFIG_ACCESSIBILITY is not set
+CONFIG_INFINIBAND=m
+CONFIG_INFINIBAND_USER_MAD=m
+CONFIG_INFINIBAND_USER_ACCESS=m
+CONFIG_INFINIBAND_USER_MEM=y
+CONFIG_INFINIBAND_ON_DEMAND_PAGING=y
+CONFIG_INFINIBAND_ADDR_TRANS=y
+CONFIG_INFINIBAND_ADDR_TRANS_CONFIGFS=y
+CONFIG_INFINIBAND_VIRT_DMA=y
+# CONFIG_INFINIBAND_MTHCA is not set
+# CONFIG_INFINIBAND_EFA is not set
+# CONFIG_MLX4_INFINIBAND is not set
+CONFIG_MLX5_INFINIBAND=m
+# CONFIG_INFINIBAND_OCRDMA is not set
+# CONFIG_INFINIBAND_VMWARE_PVRDMA is not set
+# CONFIG_INFINIBAND_RDMAVT is not set
+# CONFIG_RDMA_RXE is not set
+# CONFIG_RDMA_SIW is not set
+# CONFIG_INFINIBAND_IPOIB is not set
+# CONFIG_INFINIBAND_SRP is not set
+# CONFIG_INFINIBAND_SRPT is not set
+# CONFIG_INFINIBAND_ISER is not set
+# CONFIG_INFINIBAND_RTRS_CLIENT is not set
+# CONFIG_INFINIBAND_RTRS_SERVER is not set
+# CONFIG_INFINIBAND_OPA_VNIC is not set
+CONFIG_EDAC_ATOMIC_SCRUB=y
+CONFIG_EDAC_SUPPORT=y
+# CONFIG_EDAC is not set
+CONFIG_RTC_LIB=y
+CONFIG_RTC_MC146818_LIB=y
+CONFIG_RTC_CLASS=y
+# CONFIG_RTC_HCTOSYS is not set
+CONFIG_RTC_SYSTOHC=y
+CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
+# CONFIG_RTC_DEBUG is not set
+CONFIG_RTC_NVMEM=y
+
+#
+# RTC interfaces
+#
+CONFIG_RTC_INTF_SYSFS=y
+CONFIG_RTC_INTF_PROC=y
+CONFIG_RTC_INTF_DEV=y
+# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
+# CONFIG_RTC_DRV_TEST is not set
+
+#
+# I2C RTC drivers
+#
+
+#
+# SPI RTC drivers
+#
+
+#
+# SPI and I2C RTC drivers
+#
+
+#
+# Platform RTC drivers
+#
+CONFIG_RTC_DRV_CMOS=y
+# CONFIG_RTC_DRV_DS1286 is not set
+# CONFIG_RTC_DRV_DS1511 is not set
+# CONFIG_RTC_DRV_DS1553 is not set
+# CONFIG_RTC_DRV_DS1685_FAMILY is not set
+# CONFIG_RTC_DRV_DS1742 is not set
+# CONFIG_RTC_DRV_DS2404 is not set
+# CONFIG_RTC_DRV_STK17TA8 is not set
+# CONFIG_RTC_DRV_M48T86 is not set
+# CONFIG_RTC_DRV_M48T35 is not set
+# CONFIG_RTC_DRV_M48T59 is not set
+# CONFIG_RTC_DRV_MSM6242 is not set
+# CONFIG_RTC_DRV_BQ4802 is not set
+# CONFIG_RTC_DRV_RP5C01 is not set
+# CONFIG_RTC_DRV_V3020 is not set
+
+#
+# on-CPU RTC drivers
+#
+# CONFIG_RTC_DRV_FTRTC010 is not set
+
+#
+# HID Sensor RTC drivers
+#
+# CONFIG_RTC_DRV_GOLDFISH is not set
+# CONFIG_DMADEVICES is not set
+
+#
+# DMABUF options
+#
+# CONFIG_SYNC_FILE is not set
+# CONFIG_UDMABUF is not set
+# CONFIG_DMABUF_MOVE_NOTIFY is not set
+# CONFIG_DMABUF_DEBUG is not set
+# CONFIG_DMABUF_SELFTESTS is not set
+# CONFIG_DMABUF_HEAPS is not set
+CONFIG_DMABUF_SYSFS_STATS=y
+# end of DMABUF options
+
+# CONFIG_AUXDISPLAY is not set
+CONFIG_UIO=m
+# CONFIG_UIO_CIF is not set
+# CONFIG_UIO_PDRV_GENIRQ is not set
+# CONFIG_UIO_DMEM_GENIRQ is not set
+# CONFIG_UIO_AEC is not set
+# CONFIG_UIO_SERCOS3 is not set
+CONFIG_UIO_PCI_GENERIC=m
+# CONFIG_UIO_NETX is not set
+# CONFIG_UIO_PRUSS is not set
+# CONFIG_UIO_MF624 is not set
+# CONFIG_UIO_HV_GENERIC is not set
+CONFIG_VFIO=m
+CONFIG_VFIO_IOMMU_TYPE1=m
+CONFIG_VFIO_VIRQFD=m
+CONFIG_VFIO_NOIOMMU=y
+CONFIG_VFIO_PCI_CORE=m
+CONFIG_VFIO_PCI_MMAP=y
+CONFIG_VFIO_PCI_INTX=y
+CONFIG_VFIO_PCI=m
+# CONFIG_VFIO_PCI_IGD is not set
+# CONFIG_VFIO_MDEV is not set
+CONFIG_IRQ_BYPASS_MANAGER=m
+CONFIG_VIRT_DRIVERS=y
+# CONFIG_VBOXGUEST is not set
+# CONFIG_NITRO_ENCLAVES is not set
+CONFIG_SEV_GUEST=y
+CONFIG_VIRTIO=y
+CONFIG_ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS=y
+CONFIG_VIRTIO_PCI_LIB=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_PCI_LEGACY=y
+CONFIG_VIRTIO_BALLOON=m
+CONFIG_VIRTIO_MEM=m
+# CONFIG_VIRTIO_INPUT is not set
+# CONFIG_VIRTIO_MMIO is not set
+# CONFIG_VDPA is not set
+CONFIG_VHOST_MENU=y
+# CONFIG_VHOST_NET is not set
+# CONFIG_VHOST_SCSI is not set
+# CONFIG_VHOST_VSOCK is not set
+# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
+
+#
+# Microsoft Hyper-V guest support
+#
+CONFIG_HYPERV=y
+CONFIG_HYPERV_TIMER=y
+CONFIG_HYPERV_UTILS=y
+CONFIG_HYPERV_BALLOON=y
+# end of Microsoft Hyper-V guest support
+
+#
+# Xen driver support
+#
+CONFIG_XEN_BALLOON=y
+CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
+CONFIG_XEN_MEMORY_HOTPLUG_LIMIT=512
+CONFIG_XEN_SCRUB_PAGES_DEFAULT=y
+CONFIG_XEN_DEV_EVTCHN=m
+CONFIG_XEN_BACKEND=y
+CONFIG_XENFS=m
+CONFIG_XEN_COMPAT_XENFS=y
+CONFIG_XEN_SYS_HYPERVISOR=y
+CONFIG_XEN_XENBUS_FRONTEND=y
+CONFIG_XEN_GNTDEV=m
+CONFIG_XEN_GRANT_DEV_ALLOC=m
+# CONFIG_XEN_GRANT_DMA_ALLOC is not set
+CONFIG_SWIOTLB_XEN=y
+CONFIG_XEN_PCIDEV_BACKEND=m
+# CONFIG_XEN_PVCALLS_FRONTEND is not set
+# CONFIG_XEN_PVCALLS_BACKEND is not set
+# CONFIG_XEN_SCSI_BACKEND is not set
+CONFIG_XEN_PRIVCMD=m
+CONFIG_XEN_ACPI_PROCESSOR=m
+# CONFIG_XEN_MCE_LOG is not set
+CONFIG_XEN_HAVE_PVMMU=y
+CONFIG_XEN_EFI=y
+CONFIG_XEN_AUTO_XLATE=y
+CONFIG_XEN_ACPI=y
+CONFIG_XEN_SYMS=y
+CONFIG_XEN_HAVE_VPMU=y
+CONFIG_XEN_UNPOPULATED_ALLOC=y
+# end of Xen driver support
+
+# CONFIG_GREYBUS is not set
+# CONFIG_COMEDI is not set
+# CONFIG_STAGING is not set
+CONFIG_X86_PLATFORM_DEVICES=y
+# CONFIG_ACPI_WMI is not set
+# CONFIG_ACERHDF is not set
+# CONFIG_ACER_WIRELESS is not set
+# CONFIG_AMD_PMC is not set
+# CONFIG_ADV_SWBUTTON is not set
+# CONFIG_ASUS_WIRELESS is not set
+# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set
+# CONFIG_FUJITSU_TABLET is not set
+# CONFIG_GPD_POCKET_FAN is not set
+# CONFIG_X86_PLATFORM_DRIVERS_HP is not set
+# CONFIG_WIRELESS_HOTKEY is not set
+# CONFIG_IBM_RTL is not set
+# CONFIG_SENSORS_HDAPS is not set
+# CONFIG_INTEL_SAR_INT1092 is not set
+# CONFIG_INTEL_PMC_CORE is not set
+
+#
+# Intel Speed Select Technology interface support
+#
+# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set
+# end of Intel Speed Select Technology interface support
+
+# CONFIG_INTEL_PUNIT_IPC is not set
+# CONFIG_INTEL_RST is not set
+# CONFIG_INTEL_SMARTCONNECT is not set
+# CONFIG_INTEL_TURBO_MAX_3 is not set
+# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set
+# CONFIG_SAMSUNG_Q10 is not set
+# CONFIG_TOSHIBA_BT_RFKILL is not set
+# CONFIG_TOSHIBA_HAPS is not set
+# CONFIG_ACPI_CMPC is not set
+# CONFIG_SYSTEM76_ACPI is not set
+# CONFIG_TOPSTAR_LAPTOP is not set
+# CONFIG_INTEL_IPS is not set
+# CONFIG_INTEL_SCU_PCI is not set
+# CONFIG_INTEL_SCU_PLATFORM is not set
+CONFIG_PMC_ATOM=y
+# CONFIG_CHROME_PLATFORMS is not set
+CONFIG_MELLANOX_PLATFORM=y
+CONFIG_SURFACE_PLATFORMS=y
+# CONFIG_SURFACE_GPE is not set
+# CONFIG_SURFACE_PRO3_BUTTON is not set
+CONFIG_HAVE_CLK=y
+CONFIG_HAVE_CLK_PREPARE=y
+CONFIG_COMMON_CLK=y
+
+#
+# Clock driver for ARM Reference designs
+#
+# CONFIG_ICST is not set
+# CONFIG_CLK_SP810 is not set
+# end of Clock driver for ARM Reference designs
+
+# CONFIG_XILINX_VCU is not set
+# CONFIG_HWSPINLOCK is not set
+
+#
+# Clock Source drivers
+#
+CONFIG_CLKEVT_I8253=y
+CONFIG_CLKBLD_I8253=y
+# end of Clock Source drivers
+
+CONFIG_MAILBOX=y
+CONFIG_PCC=y
+# CONFIG_ALTERA_MBOX is not set
+CONFIG_IOMMU_IOVA=y
+CONFIG_IOMMU_API=y
+CONFIG_IOMMU_SUPPORT=y
+
+#
+# Generic IOMMU Pagetable Support
+#
+CONFIG_IOMMU_IO_PGTABLE=y
+# end of Generic IOMMU Pagetable Support
+
+# CONFIG_IOMMU_DEBUGFS is not set
+# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
+CONFIG_IOMMU_DEFAULT_DMA_LAZY=y
+# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
+CONFIG_IOMMU_DMA=y
+CONFIG_AMD_IOMMU=y
+CONFIG_AMD_IOMMU_V2=m
+CONFIG_DMAR_TABLE=y
+# CONFIG_INTEL_IOMMU is not set
+CONFIG_IRQ_REMAP=y
+CONFIG_HYPERV_IOMMU=y
+# CONFIG_VIRTIO_IOMMU is not set
+
+#
+# Remoteproc drivers
+#
+# CONFIG_REMOTEPROC is not set
+# end of Remoteproc drivers
+
+#
+# Rpmsg drivers
+#
+# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
+# CONFIG_RPMSG_VIRTIO is not set
+# end of Rpmsg drivers
+
+# CONFIG_SOUNDWIRE is not set
+
+#
+# SOC (System On Chip) specific Drivers
+#
+
+#
+# Amlogic SoC drivers
+#
+# end of Amlogic SoC drivers
+
+#
+# Broadcom SoC drivers
+#
+# end of Broadcom SoC drivers
+
+#
+# NXP/Freescale QorIQ SoC drivers
+#
+# end of NXP/Freescale QorIQ SoC drivers
+
+#
+# i.MX SoC drivers
+#
+# end of i.MX SoC drivers
+
+#
+# Enable LiteX SoC Builder specific drivers
+#
+# end of Enable LiteX SoC Builder specific drivers
+
+#
+# Qualcomm SoC drivers
+#
+# end of Qualcomm SoC drivers
+
+# CONFIG_SOC_TI is not set
+
+#
+# Xilinx SoC drivers
+#
+# end of Xilinx SoC drivers
+# end of SOC (System On Chip) specific Drivers
+
+# CONFIG_PM_DEVFREQ is not set
+# CONFIG_EXTCON is not set
+# CONFIG_MEMORY is not set
+# CONFIG_IIO is not set
+# CONFIG_NTB is not set
+# CONFIG_VME_BUS is not set
+# CONFIG_PWM is not set
+
+#
+# IRQ chip support
+#
+# end of IRQ chip support
+
+# CONFIG_IPACK_BUS is not set
+# CONFIG_RESET_CONTROLLER is not set
+
+#
+# PHY Subsystem
+#
+# CONFIG_GENERIC_PHY is not set
+# CONFIG_PHY_CAN_TRANSCEIVER is not set
+# CONFIG_BCM_KONA_USB2_PHY is not set
+# CONFIG_PHY_PXA_28NM_HSIC is not set
+# CONFIG_PHY_PXA_28NM_USB2 is not set
+# CONFIG_PHY_INTEL_LGM_EMMC is not set
+# end of PHY Subsystem
+
+# CONFIG_POWERCAP is not set
+# CONFIG_MCB is not set
+
+#
+# Performance monitor support
+#
+# end of Performance monitor support
+
+CONFIG_RAS=y
+# CONFIG_RAS_CEC is not set
+# CONFIG_USB4 is not set
+
+#
+# Android
+#
+# CONFIG_ANDROID is not set
+# end of Android
+
+# CONFIG_LIBNVDIMM is not set
+CONFIG_DAX=y
+# CONFIG_DEV_DAX is not set
+CONFIG_NVMEM=y
+CONFIG_NVMEM_SYSFS=y
+# CONFIG_NVMEM_RMEM is not set
+
+#
+# HW tracing support
+#
+# CONFIG_STM is not set
+# CONFIG_INTEL_TH is not set
+# end of HW tracing support
+
+# CONFIG_FPGA is not set
+# CONFIG_TEE is not set
+# CONFIG_UNISYS_VISORBUS is not set
+# CONFIG_SIOX is not set
+# CONFIG_SLIMBUS is not set
+# CONFIG_INTERCONNECT is not set
+# CONFIG_COUNTER is not set
+# CONFIG_MOST is not set
+# end of Device Drivers
+
+#
+# File systems
+#
+CONFIG_DCACHE_WORD_ACCESS=y
+# CONFIG_VALIDATE_FS_PARSER is not set
+CONFIG_FS_IOMAP=y
+# CONFIG_EXT2_FS is not set
+# CONFIG_EXT3_FS is not set
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_USE_FOR_EXT2=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_EXT4_DEBUG is not set
+CONFIG_JBD2=y
+# CONFIG_JBD2_DEBUG is not set
+CONFIG_FS_MBCACHE=y
+# CONFIG_REISERFS_FS is not set
+# CONFIG_JFS_FS is not set
+CONFIG_XFS_FS=y
+CONFIG_XFS_SUPPORT_V4=y
+CONFIG_XFS_QUOTA=y
+CONFIG_XFS_POSIX_ACL=y
+# CONFIG_XFS_RT is not set
+# CONFIG_XFS_ONLINE_SCRUB is not set
+# CONFIG_XFS_WARN is not set
+# CONFIG_XFS_DEBUG is not set
+# CONFIG_GFS2_FS is not set
+# CONFIG_OCFS2_FS is not set
+# CONFIG_BTRFS_FS is not set
+# CONFIG_NILFS2_FS is not set
+# CONFIG_F2FS_FS is not set
+# CONFIG_FS_DAX is not set
+CONFIG_FS_POSIX_ACL=y
+CONFIG_EXPORTFS=y
+# CONFIG_EXPORTFS_BLOCK_OPS is not set
+CONFIG_FILE_LOCKING=y
+# CONFIG_FS_ENCRYPTION is not set
+# CONFIG_FS_VERITY is not set
+CONFIG_FSNOTIFY=y
+CONFIG_DNOTIFY=y
+CONFIG_INOTIFY_USER=y
+CONFIG_FANOTIFY=y
+CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
+CONFIG_QUOTA=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
+# CONFIG_PRINT_QUOTA_WARNING is not set
+# CONFIG_QUOTA_DEBUG is not set
+CONFIG_QUOTA_TREE=y
+# CONFIG_QFMT_V1 is not set
+CONFIG_QFMT_V2=y
+CONFIG_QUOTACTL=y
+CONFIG_AUTOFS4_FS=y
+CONFIG_AUTOFS_FS=y
+CONFIG_FUSE_FS=m
+# CONFIG_CUSE is not set
+# CONFIG_VIRTIO_FS is not set
+CONFIG_OVERLAY_FS=y
+# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set
+CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y
+# CONFIG_OVERLAY_FS_INDEX is not set
+# CONFIG_OVERLAY_FS_XINO_AUTO is not set
+# CONFIG_OVERLAY_FS_METACOPY is not set
+
+#
+# Caches
+#
+CONFIG_NETFS_SUPPORT=m
+# CONFIG_NETFS_STATS is not set
+CONFIG_FSCACHE=m
+# CONFIG_FSCACHE_STATS is not set
+# CONFIG_FSCACHE_DEBUG is not set
+CONFIG_CACHEFILES=m
+# CONFIG_CACHEFILES_DEBUG is not set
+# end of Caches
+
+#
+# CD-ROM/DVD Filesystems
+#
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=m
+# end of CD-ROM/DVD Filesystems
+
+#
+# DOS/FAT/EXFAT/NT Filesystems
+#
+CONFIG_FAT_FS=m
+# CONFIG_MSDOS_FS is not set
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
+# CONFIG_FAT_DEFAULT_UTF8 is not set
+# CONFIG_EXFAT_FS is not set
+CONFIG_NTFS_FS=m
+# CONFIG_NTFS_DEBUG is not set
+# CONFIG_NTFS_RW is not set
+# CONFIG_NTFS3_FS is not set
+# end of DOS/FAT/EXFAT/NT Filesystems
+
+#
+# Pseudo filesystems
+#
+CONFIG_PROC_FS=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROC_SYSCTL=y
+CONFIG_PROC_PAGE_MONITOR=y
+CONFIG_PROC_CHILDREN=y
+CONFIG_PROC_PID_ARCH_STATUS=y
+CONFIG_PROC_SELF_MEM_READONLY=y
+CONFIG_KERNFS=y
+CONFIG_SYSFS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TMPFS_XATTR=y
+# CONFIG_TMPFS_INODE64 is not set
+CONFIG_HUGETLBFS=y
+CONFIG_HUGETLB_PAGE=y
+CONFIG_HUGETLB_PAGE_FREE_VMEMMAP=y
+# CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON is not set
+CONFIG_MEMFD_CREATE=y
+CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
+CONFIG_CONFIGFS_FS=m
+CONFIG_EFIVAR_FS=y
+# end of Pseudo filesystems
+
+CONFIG_MISC_FILESYSTEMS=y
+# CONFIG_ORANGEFS_FS is not set
+# CONFIG_ADFS_FS is not set
+# CONFIG_AFFS_FS is not set
+# CONFIG_ECRYPT_FS is not set
+# CONFIG_HFS_FS is not set
+# CONFIG_HFSPLUS_FS is not set
+# CONFIG_BEFS_FS is not set
+# CONFIG_BFS_FS is not set
+# CONFIG_EFS_FS is not set
+# CONFIG_CRAMFS is not set
+CONFIG_SQUASHFS=m
+# CONFIG_SQUASHFS_FILE_CACHE is not set
+CONFIG_SQUASHFS_FILE_DIRECT=y
+# CONFIG_SQUASHFS_DECOMP_SINGLE is not set
+# CONFIG_SQUASHFS_DECOMP_MULTI is not set
+CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_ZLIB=y
+# CONFIG_SQUASHFS_LZ4 is not set
+# CONFIG_SQUASHFS_LZO is not set
+# CONFIG_SQUASHFS_XZ is not set
+CONFIG_SQUASHFS_ZSTD=y
+CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
+# CONFIG_SQUASHFS_EMBEDDED is not set
+CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
+# CONFIG_VXFS_FS is not set
+# CONFIG_MINIX_FS is not set
+# CONFIG_OMFS_FS is not set
+# CONFIG_HPFS_FS is not set
+# CONFIG_QNX4FS_FS is not set
+# CONFIG_QNX6FS_FS is not set
+# CONFIG_ROMFS_FS is not set
+CONFIG_PSTORE=y
+CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_842_COMPRESS is not set
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+# CONFIG_PSTORE_PMSG is not set
+# CONFIG_PSTORE_FTRACE is not set
+# CONFIG_PSTORE_RAM is not set
+# CONFIG_PSTORE_BLK is not set
+# CONFIG_SYSV_FS is not set
+# CONFIG_UFS_FS is not set
+# CONFIG_EROFS_FS is not set
+CONFIG_NETWORK_FILESYSTEMS=y
+CONFIG_NFS_FS=m
+# CONFIG_NFS_V2 is not set
+CONFIG_NFS_V3=m
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=m
+# CONFIG_NFS_SWAP is not set
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_PNFS_FILE_LAYOUT=m
+CONFIG_PNFS_BLOCK=m
+CONFIG_PNFS_FLEXFILE_LAYOUT=m
+CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
+# CONFIG_NFS_V4_1_MIGRATION is not set
+CONFIG_NFS_V4_SECURITY_LABEL=y
+CONFIG_NFS_FSCACHE=y
+# CONFIG_NFS_USE_LEGACY_DNS is not set
+CONFIG_NFS_USE_KERNEL_DNS=y
+CONFIG_NFS_DEBUG=y
+CONFIG_NFS_DISABLE_UDP_SUPPORT=y
+# CONFIG_NFS_V4_2_READ_PLUS is not set
+CONFIG_NFSD=m
+CONFIG_NFSD_V2_ACL=y
+CONFIG_NFSD_V3=y
+CONFIG_NFSD_V3_ACL=y
+CONFIG_NFSD_V4=y
+# CONFIG_NFSD_BLOCKLAYOUT is not set
+# CONFIG_NFSD_SCSILAYOUT is not set
+# CONFIG_NFSD_FLEXFILELAYOUT is not set
+# CONFIG_NFSD_V4_2_INTER_SSC is not set
+CONFIG_NFSD_V4_SECURITY_LABEL=y
+CONFIG_GRACE_PERIOD=m
+CONFIG_LOCKD=m
+CONFIG_LOCKD_V4=y
+CONFIG_NFS_ACL_SUPPORT=m
+CONFIG_NFS_COMMON=y
+CONFIG_NFS_V4_2_SSC_HELPER=y
+CONFIG_SUNRPC=m
+CONFIG_SUNRPC_GSS=m
+CONFIG_SUNRPC_BACKCHANNEL=y
+CONFIG_RPCSEC_GSS_KRB5=m
+# CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES is not set
+CONFIG_SUNRPC_DEBUG=y
+CONFIG_SUNRPC_XPRT_RDMA=m
+# CONFIG_CEPH_FS is not set
+CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
+# CONFIG_CIFS_ALLOW_INSECURE_LEGACY is not set
+CONFIG_CIFS_UPCALL=y
+CONFIG_CIFS_XATTR=y
+CONFIG_CIFS_DEBUG=y
+# CONFIG_CIFS_DEBUG2 is not set
+# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
+CONFIG_CIFS_DFS_UPCALL=y
+# CONFIG_CIFS_SWN_UPCALL is not set
+# CONFIG_CIFS_SMB_DIRECT is not set
+# CONFIG_CIFS_FSCACHE is not set
+# CONFIG_SMB_SERVER is not set
+CONFIG_SMBFS_COMMON=m
+# CONFIG_CODA_FS is not set
+# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
+CONFIG_NLS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=m
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+# CONFIG_NLS_CODEPAGE_932 is not set
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+CONFIG_NLS_ASCII=m
+CONFIG_NLS_ISO8859_1=m
+# CONFIG_NLS_ISO8859_2 is not set
+# CONFIG_NLS_ISO8859_3 is not set
+# CONFIG_NLS_ISO8859_4 is not set
+# CONFIG_NLS_ISO8859_5 is not set
+# CONFIG_NLS_ISO8859_6 is not set
+# CONFIG_NLS_ISO8859_7 is not set
+# CONFIG_NLS_ISO8859_9 is not set
+# CONFIG_NLS_ISO8859_13 is not set
+# CONFIG_NLS_ISO8859_14 is not set
+# CONFIG_NLS_ISO8859_15 is not set
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_MAC_ROMAN is not set
+# CONFIG_NLS_MAC_CELTIC is not set
+# CONFIG_NLS_MAC_CENTEURO is not set
+# CONFIG_NLS_MAC_CROATIAN is not set
+# CONFIG_NLS_MAC_CYRILLIC is not set
+# CONFIG_NLS_MAC_GAELIC is not set
+# CONFIG_NLS_MAC_GREEK is not set
+# CONFIG_NLS_MAC_ICELAND is not set
+# CONFIG_NLS_MAC_INUIT is not set
+# CONFIG_NLS_MAC_ROMANIAN is not set
+# CONFIG_NLS_MAC_TURKISH is not set
+CONFIG_NLS_UTF8=m
+# CONFIG_DLM is not set
+# CONFIG_UNICODE is not set
+CONFIG_IO_WQ=y
+# end of File systems
+
+#
+# Security options
+#
+CONFIG_KEYS=y
+# CONFIG_KEYS_REQUEST_CACHE is not set
+# CONFIG_PERSISTENT_KEYRINGS is not set
+# CONFIG_TRUSTED_KEYS is not set
+# CONFIG_ENCRYPTED_KEYS is not set
+# CONFIG_KEY_DH_OPERATIONS is not set
+CONFIG_SECURITY_DMESG_RESTRICT=y
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
+CONFIG_SECURITY_NETWORK=y
+# CONFIG_SECURITY_INFINIBAND is not set
+# CONFIG_SECURITY_NETWORK_XFRM is not set
+CONFIG_SECURITY_PATH=y
+CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_HARDENED_USERCOPY_FALLBACK is not set
+# CONFIG_HARDENED_USERCOPY_PAGESPAN is not set
+# CONFIG_STATIC_USERMODEHELPER is not set
+# CONFIG_SECURITY_SELINUX is not set
+# CONFIG_SECURITY_SMACK is not set
+# CONFIG_SECURITY_TOMOYO is not set
+CONFIG_SECURITY_APPARMOR=y
+CONFIG_SECURITY_APPARMOR_HASH=y
+CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y
+# CONFIG_SECURITY_APPARMOR_DEBUG is not set
+CONFIG_SECURITY_LOADPIN=y
+CONFIG_SECURITY_LOADPIN_ENFORCE=y
+CONFIG_SECURITY_YAMA=y
+CONFIG_SECURITY_CONTAINER_MONITOR=y
+# CONFIG_SECURITY_CONTAINER_MONITOR_DEBUG is not set
+CONFIG_SECURITY_SAFESETID=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+# CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is not set
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=y
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY is not set
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set
+# CONFIG_SECURITY_LANDLOCK is not set
+CONFIG_INTEGRITY=y
+CONFIG_INTEGRITY_SIGNATURE=y
+CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
+CONFIG_INTEGRITY_TRUSTED_KEYRING=y
+CONFIG_INTEGRITY_PLATFORM_KEYRING=y
+CONFIG_LOAD_UEFI_KEYS=y
+CONFIG_INTEGRITY_AUDIT=y
+CONFIG_IMA=y
+CONFIG_IMA_MEASURE_PCR_IDX=10
+CONFIG_IMA_LSM_RULES=y
+CONFIG_IMA_NG_TEMPLATE=y
+# CONFIG_IMA_SIG_TEMPLATE is not set
+CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
+# CONFIG_IMA_DEFAULT_HASH_SHA1 is not set
+CONFIG_IMA_DEFAULT_HASH_SHA256=y
+CONFIG_IMA_DEFAULT_HASH="sha256"
+CONFIG_IMA_WRITE_POLICY=y
+# CONFIG_IMA_READ_POLICY is not set
+CONFIG_IMA_APPRAISE=y
+# CONFIG_IMA_ARCH_POLICY is not set
+CONFIG_IMA_APPRAISE_BUILD_POLICY=y
+CONFIG_IMA_APPRAISE_REQUIRE_FIRMWARE_SIGS=y
+# CONFIG_IMA_APPRAISE_REQUIRE_KEXEC_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_MODULE_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_POLICY_SIGS is not set
+CONFIG_IMA_APPRAISE_BOOTPARAM=y
+# CONFIG_IMA_APPRAISE_MODSIG is not set
+# CONFIG_IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY is not set
+# CONFIG_IMA_BLACKLIST_KEYRING is not set
+CONFIG_IMA_LOAD_X509=y
+CONFIG_IMA_X509_PATH="/etc/ima/pubkey.x509"
+# CONFIG_IMA_APPRAISE_SIGNED_INIT is not set
+CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
+CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
+# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set
+# CONFIG_IMA_DISABLE_HTABLE is not set
+# CONFIG_EVM is not set
+# CONFIG_DEFAULT_SECURITY_APPARMOR is not set
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,apparmor,bpf"
+
+#
+# Kernel hardening options
+#
+
+#
+# Memory initialization
+#
+CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_ENABLER=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
+# CONFIG_INIT_STACK_NONE is not set
+# CONFIG_INIT_STACK_ALL_PATTERN is not set
+CONFIG_INIT_STACK_ALL_ZERO=y
+# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
+# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
+# end of Memory initialization
+# end of Kernel hardening options
+# end of Security options
+
+CONFIG_XOR_BLOCKS=y
+CONFIG_ASYNC_CORE=y
+CONFIG_ASYNC_MEMCPY=m
+CONFIG_ASYNC_XOR=y
+CONFIG_ASYNC_PQ=m
+CONFIG_ASYNC_RAID6_RECOV=m
+CONFIG_CRYPTO=y
+
+#
+# Crypto core or helper
+#
+CONFIG_CRYPTO_ALGAPI=y
+CONFIG_CRYPTO_ALGAPI2=y
+CONFIG_CRYPTO_AEAD=y
+CONFIG_CRYPTO_AEAD2=y
+CONFIG_CRYPTO_SKCIPHER=y
+CONFIG_CRYPTO_SKCIPHER2=y
+CONFIG_CRYPTO_HASH=y
+CONFIG_CRYPTO_HASH2=y
+CONFIG_CRYPTO_RNG=m
+CONFIG_CRYPTO_RNG2=y
+CONFIG_CRYPTO_RNG_DEFAULT=m
+CONFIG_CRYPTO_AKCIPHER2=y
+CONFIG_CRYPTO_AKCIPHER=y
+CONFIG_CRYPTO_KPP2=y
+CONFIG_CRYPTO_ACOMP2=y
+CONFIG_CRYPTO_MANAGER=y
+CONFIG_CRYPTO_MANAGER2=y
+# CONFIG_CRYPTO_USER is not set
+CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
+CONFIG_CRYPTO_GF128MUL=y
+CONFIG_CRYPTO_NULL=y
+CONFIG_CRYPTO_NULL2=y
+# CONFIG_CRYPTO_PCRYPT is not set
+CONFIG_CRYPTO_CRYPTD=m
+CONFIG_CRYPTO_AUTHENC=y
+# CONFIG_CRYPTO_TEST is not set
+CONFIG_CRYPTO_SIMD=m
+CONFIG_CRYPTO_ENGINE=m
+
+#
+# Public-key cryptography
+#
+CONFIG_CRYPTO_RSA=y
+# CONFIG_CRYPTO_DH is not set
+# CONFIG_CRYPTO_ECDH is not set
+# CONFIG_CRYPTO_ECDSA is not set
+# CONFIG_CRYPTO_ECRDSA is not set
+# CONFIG_CRYPTO_SM2 is not set
+# CONFIG_CRYPTO_CURVE25519 is not set
+CONFIG_CRYPTO_CURVE25519_X86=m
+
+#
+# Authenticated Encryption with Associated Data
+#
+CONFIG_CRYPTO_CCM=m
+CONFIG_CRYPTO_GCM=y
+# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
+# CONFIG_CRYPTO_AEGIS128 is not set
+# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set
+CONFIG_CRYPTO_SEQIV=m
+CONFIG_CRYPTO_ECHAINIV=m
+
+#
+# Block modes
+#
+CONFIG_CRYPTO_CBC=y
+# CONFIG_CRYPTO_CFB is not set
+CONFIG_CRYPTO_CTR=y
+CONFIG_CRYPTO_CTS=y
+CONFIG_CRYPTO_ECB=y
+CONFIG_CRYPTO_LRW=m
+# CONFIG_CRYPTO_OFB is not set
+# CONFIG_CRYPTO_PCBC is not set
+CONFIG_CRYPTO_XTS=m
+# CONFIG_CRYPTO_KEYWRAP is not set
+# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set
+# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set
+# CONFIG_CRYPTO_ADIANTUM is not set
+CONFIG_CRYPTO_ESSIV=y
+
+#
+# Hash modes
+#
+CONFIG_CRYPTO_CMAC=m
+CONFIG_CRYPTO_HMAC=y
+# CONFIG_CRYPTO_XCBC is not set
+# CONFIG_CRYPTO_VMAC is not set
+
+#
+# Digest
+#
+CONFIG_CRYPTO_CRC32C=y
+# CONFIG_CRYPTO_CRC32C_INTEL is not set
+# CONFIG_CRYPTO_CRC32 is not set
+# CONFIG_CRYPTO_CRC32_PCLMUL is not set
+# CONFIG_CRYPTO_XXHASH is not set
+# CONFIG_CRYPTO_BLAKE2B is not set
+CONFIG_CRYPTO_BLAKE2S_X86=y
+CONFIG_CRYPTO_CRCT10DIF=y
+# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
+CONFIG_CRYPTO_GHASH=y
+# CONFIG_CRYPTO_POLY1305 is not set
+CONFIG_CRYPTO_POLY1305_X86_64=m
+CONFIG_CRYPTO_MD4=m
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_MICHAEL_MIC=m
+# CONFIG_CRYPTO_RMD160 is not set
+CONFIG_CRYPTO_SHA1=y
+# CONFIG_CRYPTO_SHA1_SSSE3 is not set
+# CONFIG_CRYPTO_SHA256_SSSE3 is not set
+# CONFIG_CRYPTO_SHA512_SSSE3 is not set
+CONFIG_CRYPTO_SHA256=y
+CONFIG_CRYPTO_SHA512=m
+# CONFIG_CRYPTO_SHA3 is not set
+# CONFIG_CRYPTO_SM3 is not set
+# CONFIG_CRYPTO_STREEBOG is not set
+# CONFIG_CRYPTO_WP512 is not set
+# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set
+
+#
+# Ciphers
+#
+CONFIG_CRYPTO_AES=y
+# CONFIG_CRYPTO_AES_TI is not set
+CONFIG_CRYPTO_AES_NI_INTEL=m
+# CONFIG_CRYPTO_ANUBIS is not set
+CONFIG_CRYPTO_ARC4=y
+# CONFIG_CRYPTO_BLOWFISH is not set
+# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA is not set
+# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_CAST5 is not set
+# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
+# CONFIG_CRYPTO_CAST6 is not set
+# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
+CONFIG_CRYPTO_DES=y
+# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
+# CONFIG_CRYPTO_FCRYPT is not set
+# CONFIG_CRYPTO_KHAZAD is not set
+# CONFIG_CRYPTO_CHACHA20 is not set
+CONFIG_CRYPTO_CHACHA20_X86_64=m
+# CONFIG_CRYPTO_SEED is not set
+# CONFIG_CRYPTO_SERPENT is not set
+# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
+# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
+# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_SM4 is not set
+# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set
+# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_TEA is not set
+# CONFIG_CRYPTO_TWOFISH is not set
+# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
+# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
+# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set
+
+#
+# Compression
+#
+CONFIG_CRYPTO_DEFLATE=y
+CONFIG_CRYPTO_LZO=m
+# CONFIG_CRYPTO_842 is not set
+CONFIG_CRYPTO_LZ4=m
+# CONFIG_CRYPTO_LZ4HC is not set
+# CONFIG_CRYPTO_ZSTD is not set
+
+#
+# Random Number Generation
+#
+# CONFIG_CRYPTO_ANSI_CPRNG is not set
+CONFIG_CRYPTO_DRBG_MENU=m
+CONFIG_CRYPTO_DRBG_HMAC=y
+# CONFIG_CRYPTO_DRBG_HASH is not set
+# CONFIG_CRYPTO_DRBG_CTR is not set
+CONFIG_CRYPTO_DRBG=m
+CONFIG_CRYPTO_JITTERENTROPY=m
+CONFIG_CRYPTO_USER_API=y
+CONFIG_CRYPTO_USER_API_HASH=y
+CONFIG_CRYPTO_USER_API_SKCIPHER=y
+# CONFIG_CRYPTO_USER_API_RNG is not set
+CONFIG_CRYPTO_USER_API_AEAD=y
+CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
+CONFIG_CRYPTO_HASH_INFO=y
+CONFIG_CRYPTO_HW=y
+# CONFIG_CRYPTO_DEV_PADLOCK is not set
+# CONFIG_CRYPTO_DEV_CCP is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_C62X is not set
+# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
+# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
+CONFIG_CRYPTO_DEV_VIRTIO=m
+# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
+# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
+CONFIG_ASYMMETRIC_KEY_TYPE=y
+CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
+CONFIG_X509_CERTIFICATE_PARSER=y
+# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
+CONFIG_PKCS7_MESSAGE_PARSER=y
+# CONFIG_PKCS7_TEST_KEY is not set
+# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
+
+#
+# Certificates for signature checking
+#
+CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
+CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
+# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set
+CONFIG_SYSTEM_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_TRUSTED_KEYS="google/certs/lakitu_root_cert.pem"
+# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
+CONFIG_SECONDARY_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
+# CONFIG_SYSTEM_REVOCATION_LIST is not set
+# end of Certificates for signature checking
+
+CONFIG_BINARY_PRINTF=y
+
+#
+# Library routines
+#
+CONFIG_RAID6_PQ=m
+CONFIG_RAID6_PQ_BENCHMARK=y
+# CONFIG_PACKING is not set
+CONFIG_BITREVERSE=y
+CONFIG_GENERIC_STRNCPY_FROM_USER=y
+CONFIG_GENERIC_STRNLEN_USER=y
+CONFIG_GENERIC_NET_UTILS=y
+CONFIG_GENERIC_FIND_FIRST_BIT=y
+# CONFIG_CORDIC is not set
+# CONFIG_PRIME_NUMBERS is not set
+CONFIG_RATIONAL=y
+CONFIG_GENERIC_PCI_IOMAP=y
+CONFIG_GENERIC_IOMAP=y
+CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
+CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
+CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
+
+#
+# Crypto library routines
+#
+CONFIG_CRYPTO_LIB_AES=y
+CONFIG_CRYPTO_LIB_ARC4=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S=y
+CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
+CONFIG_CRYPTO_LIB_CHACHA=m
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_DES=y
+CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
+CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m
+CONFIG_CRYPTO_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
+CONFIG_CRYPTO_LIB_SHA256=y
+# end of Crypto library routines
+
+CONFIG_LIB_MEMNEQ=y
+CONFIG_CRC_CCITT=y
+CONFIG_CRC16=y
+CONFIG_CRC_T10DIF=y
+CONFIG_CRC_ITU_T=y
+CONFIG_CRC32=y
+# CONFIG_CRC32_SELFTEST is not set
+CONFIG_CRC32_SLICEBY8=y
+# CONFIG_CRC32_SLICEBY4 is not set
+# CONFIG_CRC32_SARWATE is not set
+# CONFIG_CRC32_BIT is not set
+CONFIG_CRC64=m
+# CONFIG_CRC4 is not set
+CONFIG_CRC7=m
+CONFIG_LIBCRC32C=y
+# CONFIG_CRC8 is not set
+CONFIG_XXHASH=y
+# CONFIG_RANDOM32_SELFTEST is not set
+CONFIG_ZLIB_INFLATE=y
+CONFIG_ZLIB_DEFLATE=y
+CONFIG_LZO_COMPRESS=m
+CONFIG_LZO_DECOMPRESS=m
+CONFIG_LZ4_COMPRESS=m
+CONFIG_LZ4_DECOMPRESS=y
+CONFIG_ZSTD_DECOMPRESS=y
+CONFIG_XZ_DEC=y
+CONFIG_XZ_DEC_X86=y
+# CONFIG_XZ_DEC_POWERPC is not set
+# CONFIG_XZ_DEC_IA64 is not set
+# CONFIG_XZ_DEC_ARM is not set
+# CONFIG_XZ_DEC_ARMTHUMB is not set
+# CONFIG_XZ_DEC_SPARC is not set
+CONFIG_XZ_DEC_BCJ=y
+# CONFIG_XZ_DEC_TEST is not set
+CONFIG_DECOMPRESS_GZIP=y
+CONFIG_DECOMPRESS_XZ=y
+CONFIG_DECOMPRESS_LZ4=y
+CONFIG_DECOMPRESS_ZSTD=y
+CONFIG_GENERIC_ALLOCATOR=y
+CONFIG_TEXTSEARCH=y
+CONFIG_TEXTSEARCH_KMP=m
+CONFIG_TEXTSEARCH_BM=m
+CONFIG_TEXTSEARCH_FSM=m
+CONFIG_INTERVAL_TREE=y
+CONFIG_XARRAY_MULTI=y
+CONFIG_ASSOCIATIVE_ARRAY=y
+CONFIG_HAS_IOMEM=y
+CONFIG_HAS_IOPORT_MAP=y
+CONFIG_HAS_DMA=y
+CONFIG_DMA_OPS=y
+CONFIG_NEED_SG_DMA_LENGTH=y
+CONFIG_NEED_DMA_MAP_STATE=y
+CONFIG_ARCH_DMA_ADDR_T_64BIT=y
+CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED=y
+CONFIG_SWIOTLB=y
+CONFIG_DMA_COHERENT_POOL=y
+# CONFIG_DMA_CMA is not set
+# CONFIG_DMA_API_DEBUG is not set
+# CONFIG_DMA_MAP_BENCHMARK is not set
+CONFIG_SGL_ALLOC=y
+CONFIG_IOMMU_HELPER=y
+CONFIG_CPU_RMAP=y
+CONFIG_DQL=y
+CONFIG_GLOB=y
+# CONFIG_GLOB_SELFTEST is not set
+CONFIG_NLATTR=y
+CONFIG_CLZ_TAB=y
+CONFIG_IRQ_POLL=y
+CONFIG_MPILIB=y
+CONFIG_SIGNATURE=y
+CONFIG_DIMLIB=y
+CONFIG_OID_REGISTRY=y
+CONFIG_UCS2_STRING=y
+CONFIG_HAVE_GENERIC_VDSO=y
+CONFIG_GENERIC_GETTIMEOFDAY=y
+CONFIG_GENERIC_VDSO_TIME_NS=y
+CONFIG_FONT_SUPPORT=y
+CONFIG_FONT_8x16=y
+CONFIG_FONT_AUTOSELECT=y
+CONFIG_SG_POOL=y
+CONFIG_ARCH_HAS_PMEM_API=y
+CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
+CONFIG_ARCH_HAS_COPY_MC=y
+CONFIG_ARCH_STACKWALK=y
+CONFIG_SBITMAP=y
+# end of Library routines
+
+#
+# Kernel hacking
+#
+
+#
+# printk and dmesg options
+#
+CONFIG_PRINTK_TIME=y
+# CONFIG_PRINTK_CALLER is not set
+# CONFIG_STACKTRACE_BUILD_ID is not set
+CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
+CONFIG_CONSOLE_LOGLEVEL_QUIET=4
+CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
+# CONFIG_BOOT_PRINTK_DELAY is not set
+# CONFIG_DYNAMIC_DEBUG is not set
+# CONFIG_DYNAMIC_DEBUG_CORE is not set
+CONFIG_SYMBOLIC_ERRNAME=y
+CONFIG_DEBUG_BUGVERBOSE=y
+# end of printk and dmesg options
+
+CONFIG_AS_HAS_NON_CONST_LEB128=y
+
+#
+# Compile-time checks and compiler options
+#
+CONFIG_DEBUG_INFO=y
+# CONFIG_DEBUG_INFO_REDUCED is not set
+# CONFIG_DEBUG_INFO_COMPRESSED is not set
+# CONFIG_DEBUG_INFO_SPLIT is not set
+# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
+CONFIG_DEBUG_INFO_DWARF4=y
+# CONFIG_DEBUG_INFO_DWARF5 is not set
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_PAHOLE_HAS_SPLIT_BTF=y
+CONFIG_DEBUG_INFO_BTF_MODULES=y
+# CONFIG_GDB_SCRIPTS is not set
+CONFIG_FRAME_WARN=2048
+# CONFIG_STRIP_ASM_SYMS is not set
+# CONFIG_HEADERS_INSTALL is not set
+CONFIG_SECTION_MISMATCH_WARN_ONLY=y
+# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
+CONFIG_STACK_VALIDATION=y
+# CONFIG_VMLINUX_MAP is not set
+# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
+# end of Compile-time checks and compiler options
+
+#
+# Generic Kernel Debugging Instruments
+#
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_SERIAL=y
+CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_FS_ALLOW_ALL=y
+# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
+# CONFIG_DEBUG_FS_ALLOW_NONE is not set
+CONFIG_HAVE_ARCH_KGDB=y
+# CONFIG_KGDB is not set
+CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
+CONFIG_UBSAN=y
+# CONFIG_UBSAN_TRAP is not set
+CONFIG_CC_HAS_UBSAN_BOUNDS=y
+CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y
+CONFIG_UBSAN_BOUNDS=y
+CONFIG_UBSAN_ARRAY_BOUNDS=y
+# CONFIG_UBSAN_SHIFT is not set
+# CONFIG_UBSAN_DIV_ZERO is not set
+# CONFIG_UBSAN_BOOL is not set
+# CONFIG_UBSAN_ENUM is not set
+# CONFIG_UBSAN_ALIGNMENT is not set
+# CONFIG_UBSAN_SANITIZE_ALL is not set
+# CONFIG_TEST_UBSAN is not set
+CONFIG_HAVE_ARCH_KCSAN=y
+CONFIG_HAVE_KCSAN_COMPILER=y
+# CONFIG_KCSAN is not set
+# end of Generic Kernel Debugging Instruments
+
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_MISC=y
+
+#
+# Memory Debugging
+#
+# CONFIG_PAGE_EXTENSION is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
+# CONFIG_PAGE_OWNER is not set
+# CONFIG_PAGE_POISONING is not set
+# CONFIG_DEBUG_PAGE_REF is not set
+# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_ARCH_HAS_DEBUG_WX=y
+# CONFIG_DEBUG_WX is not set
+CONFIG_GENERIC_PTDUMP=y
+# CONFIG_PTDUMP_DEBUGFS is not set
+# CONFIG_DEBUG_OBJECTS is not set
+# CONFIG_SLUB_DEBUG_ON is not set
+# CONFIG_SLUB_STATS is not set
+CONFIG_HAVE_DEBUG_KMEMLEAK=y
+# CONFIG_DEBUG_KMEMLEAK is not set
+# CONFIG_DEBUG_STACK_USAGE is not set
+# CONFIG_SCHED_STACK_END_CHECK is not set
+CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
+# CONFIG_DEBUG_VM is not set
+# CONFIG_DEBUG_VM_PGTABLE is not set
+CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
+# CONFIG_DEBUG_VIRTUAL is not set
+# CONFIG_DEBUG_MEMORY_INIT is not set
+# CONFIG_DEBUG_PER_CPU_MAPS is not set
+CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y
+# CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is not set
+CONFIG_HAVE_ARCH_KASAN=y
+CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
+CONFIG_CC_HAS_KASAN_GENERIC=y
+CONFIG_CC_HAS_KASAN_SW_TAGS=y
+CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
+# CONFIG_KASAN is not set
+CONFIG_HAVE_ARCH_KFENCE=y
+# CONFIG_KFENCE is not set
+# end of Memory Debugging
+
+# CONFIG_DEBUG_SHIRQ is not set
+
+#
+# Debug Oops, Lockups and Hangs
+#
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PANIC_ON_OOPS_VALUE=1
+CONFIG_PANIC_TIMEOUT=-1
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_SOFTLOCKUP_DETECTOR=y
+# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
+CONFIG_HARDLOCKUP_DETECTOR_PERF=y
+CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
+CONFIG_HARDLOCKUP_DETECTOR=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
+# CONFIG_WQ_WATCHDOG is not set
+# CONFIG_TEST_LOCKUP is not set
+# end of Debug Oops, Lockups and Hangs
+
+#
+# Scheduler Debugging
+#
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHED_INFO=y
+CONFIG_SCHEDSTATS=y
+# end of Scheduler Debugging
+
+# CONFIG_DEBUG_TIMEKEEPING is not set
+
+#
+# Lock Debugging (spinlocks, mutexes, etc...)
+#
+CONFIG_LOCK_DEBUGGING_SUPPORT=y
+# CONFIG_PROVE_LOCKING is not set
+# CONFIG_LOCK_STAT is not set
+# CONFIG_DEBUG_RT_MUTEXES is not set
+# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_DEBUG_MUTEXES is not set
+# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
+# CONFIG_DEBUG_RWSEMS is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set
+# CONFIG_DEBUG_ATOMIC_SLEEP is not set
+# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
+# CONFIG_LOCK_TORTURE_TEST is not set
+# CONFIG_WW_MUTEX_SELFTEST is not set
+# CONFIG_SCF_TORTURE_TEST is not set
+# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
+# end of Lock Debugging (spinlocks, mutexes, etc...)
+
+# CONFIG_DEBUG_IRQFLAGS is not set
+CONFIG_STACKTRACE=y
+# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
+# CONFIG_DEBUG_KOBJECT is not set
+
+#
+# Debug kernel data structures
+#
+CONFIG_DEBUG_LIST=y
+# CONFIG_DEBUG_PLIST is not set
+# CONFIG_DEBUG_SG is not set
+CONFIG_DEBUG_NOTIFIERS=y
+# CONFIG_BUG_ON_DATA_CORRUPTION is not set
+# end of Debug kernel data structures
+
+# CONFIG_DEBUG_CREDENTIALS is not set
+
+#
+# RCU Debugging
+#
+# CONFIG_RCU_SCALE_TEST is not set
+# CONFIG_RCU_TORTURE_TEST is not set
+# CONFIG_RCU_REF_SCALE_TEST is not set
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+# CONFIG_RCU_TRACE is not set
+# CONFIG_RCU_EQS_DEBUG is not set
+# end of RCU Debugging
+
+# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
+# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
+# CONFIG_LATENCYTOP is not set
+CONFIG_USER_STACKTRACE_SUPPORT=y
+CONFIG_NOP_TRACER=y
+CONFIG_HAVE_FUNCTION_TRACER=y
+CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
+CONFIG_HAVE_DYNAMIC_FTRACE=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
+CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
+CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
+CONFIG_HAVE_FENTRY=y
+CONFIG_HAVE_OBJTOOL_MCOUNT=y
+CONFIG_HAVE_C_RECORDMCOUNT=y
+CONFIG_TRACE_CLOCK=y
+CONFIG_RING_BUFFER=y
+CONFIG_EVENT_TRACING=y
+CONFIG_CONTEXT_SWITCH_TRACER=y
+CONFIG_TRACING=y
+CONFIG_GENERIC_TRACER=y
+CONFIG_TRACING_SUPPORT=y
+CONFIG_FTRACE=y
+# CONFIG_BOOTTIME_TRACING is not set
+CONFIG_FUNCTION_TRACER=y
+CONFIG_FUNCTION_GRAPH_TRACER=y
+CONFIG_DYNAMIC_FTRACE=y
+CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
+CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
+# CONFIG_FUNCTION_PROFILER is not set
+# CONFIG_STACK_TRACER is not set
+# CONFIG_IRQSOFF_TRACER is not set
+# CONFIG_SCHED_TRACER is not set
+# CONFIG_HWLAT_TRACER is not set
+# CONFIG_OSNOISE_TRACER is not set
+# CONFIG_TIMERLAT_TRACER is not set
+# CONFIG_MMIOTRACE is not set
+CONFIG_FTRACE_SYSCALLS=y
+# CONFIG_TRACER_SNAPSHOT is not set
+CONFIG_BRANCH_PROFILE_NONE=y
+# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
+# CONFIG_PROFILE_ALL_BRANCHES is not set
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_KPROBE_EVENTS=y
+# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
+CONFIG_UPROBE_EVENTS=y
+CONFIG_BPF_EVENTS=y
+CONFIG_DYNAMIC_EVENTS=y
+CONFIG_PROBE_EVENTS=y
+# CONFIG_BPF_KPROBE_OVERRIDE is not set
+CONFIG_FTRACE_MCOUNT_RECORD=y
+CONFIG_FTRACE_MCOUNT_USE_OBJTOOL=y
+# CONFIG_SYNTH_EVENTS is not set
+# CONFIG_HIST_TRIGGERS is not set
+# CONFIG_TRACE_EVENT_INJECT is not set
+# CONFIG_TRACEPOINT_BENCHMARK is not set
+# CONFIG_RING_BUFFER_BENCHMARK is not set
+# CONFIG_TRACE_EVAL_MAP_FILE is not set
+# CONFIG_FTRACE_RECORD_RECURSION is not set
+# CONFIG_FTRACE_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
+# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
+# CONFIG_KPROBE_EVENT_GEN_TEST is not set
+CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
+# CONFIG_SAMPLES is not set
+CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
+CONFIG_STRICT_DEVMEM=y
+CONFIG_IO_STRICT_DEVMEM=y
+
+#
+# x86 Debugging
+#
+CONFIG_EARLY_PRINTK_USB=y
+CONFIG_X86_VERBOSE_BOOTUP=y
+CONFIG_EARLY_PRINTK=y
+CONFIG_EARLY_PRINTK_DBGP=y
+# CONFIG_EARLY_PRINTK_USB_XDBC is not set
+# CONFIG_EFI_PGT_DUMP is not set
+# CONFIG_DEBUG_TLBFLUSH is not set
+# CONFIG_IOMMU_DEBUG is not set
+CONFIG_HAVE_MMIOTRACE_SUPPORT=y
+# CONFIG_X86_DECODER_SELFTEST is not set
+# CONFIG_IO_DELAY_0X80 is not set
+CONFIG_IO_DELAY_0XED=y
+# CONFIG_IO_DELAY_UDELAY is not set
+# CONFIG_IO_DELAY_NONE is not set
+CONFIG_DEBUG_BOOT_PARAMS=y
+# CONFIG_CPA_DEBUG is not set
+# CONFIG_DEBUG_ENTRY is not set
+# CONFIG_DEBUG_NMI_SELFTEST is not set
+# CONFIG_X86_DEBUG_FPU is not set
+# CONFIG_PUNIT_ATOM_DEBUG is not set
+CONFIG_UNWINDER_ORC=y
+# CONFIG_UNWINDER_FRAME_POINTER is not set
+# CONFIG_UNWINDER_GUESS is not set
+# end of x86 Debugging
+
+#
+# Kernel Testing and Coverage
+#
+# CONFIG_KUNIT is not set
+# CONFIG_NOTIFIER_ERROR_INJECTION is not set
+CONFIG_FUNCTION_ERROR_INJECTION=y
+# CONFIG_FAULT_INJECTION is not set
+CONFIG_ARCH_HAS_KCOV=y
+CONFIG_CC_HAS_SANCOV_TRACE_PC=y
+# CONFIG_KCOV is not set
+CONFIG_RUNTIME_TESTING_MENU=y
+# CONFIG_LKDTM is not set
+# CONFIG_TEST_MIN_HEAP is not set
+# CONFIG_TEST_DIV64 is not set
+# CONFIG_KPROBES_SANITY_TEST is not set
+# CONFIG_BACKTRACE_SELF_TEST is not set
+# CONFIG_RBTREE_TEST is not set
+# CONFIG_REED_SOLOMON_TEST is not set
+# CONFIG_INTERVAL_TREE_TEST is not set
+# CONFIG_PERCPU_TEST is not set
+# CONFIG_ATOMIC64_SELFTEST is not set
+# CONFIG_ASYNC_RAID6_TEST is not set
+# CONFIG_TEST_HEXDUMP is not set
+# CONFIG_STRING_SELFTEST is not set
+# CONFIG_TEST_STRING_HELPERS is not set
+# CONFIG_TEST_STRSCPY is not set
+# CONFIG_TEST_KSTRTOX is not set
+# CONFIG_TEST_PRINTF is not set
+# CONFIG_TEST_SCANF is not set
+# CONFIG_TEST_BITMAP is not set
+# CONFIG_TEST_UUID is not set
+# CONFIG_TEST_XARRAY is not set
+# CONFIG_TEST_OVERFLOW is not set
+# CONFIG_TEST_RHASHTABLE is not set
+# CONFIG_TEST_HASH is not set
+# CONFIG_TEST_IDA is not set
+# CONFIG_TEST_LKM is not set
+# CONFIG_TEST_BITOPS is not set
+# CONFIG_TEST_VMALLOC is not set
+# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_BPF=m
+# CONFIG_TEST_BLACKHOLE_DEV is not set
+# CONFIG_FIND_BIT_BENCHMARK is not set
+# CONFIG_TEST_FIRMWARE is not set
+# CONFIG_TEST_SYSCTL is not set
+# CONFIG_TEST_UDELAY is not set
+# CONFIG_TEST_STATIC_KEYS is not set
+# CONFIG_TEST_KMOD is not set
+# CONFIG_TEST_MEMCAT_P is not set
+# CONFIG_TEST_STACKINIT is not set
+# CONFIG_TEST_MEMINIT is not set
+# CONFIG_TEST_FREE_PAGES is not set
+# CONFIG_TEST_FPU is not set
+# CONFIG_TEST_CLOCKSOURCE_WATCHDOG is not set
+CONFIG_ARCH_USE_MEMTEST=y
+# CONFIG_MEMTEST is not set
+# CONFIG_HYPERV_TESTING is not set
+# end of Kernel Testing and Coverage
+# end of Kernel hacking
diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index c6262b1..4a537be 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -674,6 +674,7 @@
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &icx_cstates),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &icx_cstates),
+ X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &icx_cstates),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &icl_cstates),
diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 840ee43..206c5bf 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -809,6 +809,7 @@
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &model_spr),
+ X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &model_spr),
{},
};
MODULE_DEVICE_TABLE(x86cpu, rapl_model_match);
diff --git a/arch/x86/include/asm/bootparam_utils.h b/arch/x86/include/asm/bootparam_utils.h
index 981fe92..53e9b06 100644
--- a/arch/x86/include/asm/bootparam_utils.h
+++ b/arch/x86/include/asm/bootparam_utils.h
@@ -74,6 +74,7 @@
BOOT_PARAM_PRESERVE(hdr),
BOOT_PARAM_PRESERVE(e820_table),
BOOT_PARAM_PRESERVE(eddbuf),
+ BOOT_PARAM_PRESERVE(cc_blob_address),
};
memset(&scratch, 0, sizeof(scratch));
diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
new file mode 100644
index 0000000..70b2db1
--- /dev/null
+++ b/arch/x86/include/asm/cpuid.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * CPUID-related helpers/definitions
+ *
+ * Derived from arch/x86/kvm/cpuid.c
+ */
+
+#ifndef _ASM_X86_CPUID_H
+#define _ASM_X86_CPUID_H
+
+static __always_inline bool cpuid_function_is_indexed(u32 function)
+{
+ switch (function) {
+ case 4:
+ case 7:
+ case 0xb:
+ case 0xd:
+ case 0xf:
+ case 0x10:
+ case 0x12:
+ case 0x14:
+ case 0x17:
+ case 0x18:
+ case 0x1d:
+ case 0x1e:
+ case 0x1f:
+ case 0x8000001d:
+ return true;
+ }
+
+ return false;
+}
+
+#endif /* _ASM_X86_CPUID_H */
diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index c7c924e..6ad8d94 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -129,7 +129,7 @@
unsigned long page_list,
unsigned long start_address,
unsigned int preserve_context,
- unsigned int sme_active);
+ unsigned int host_mem_enc_active);
#endif
#define ARCH_HAS_KIMAGE_ARCH
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 6929987..56935eb 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -83,6 +83,18 @@
return ret;
}
+static inline long kvm_sev_hypercall3(unsigned int nr, unsigned long p1,
+ unsigned long p2, unsigned long p3)
+{
+ long ret;
+
+ asm volatile("vmmcall"
+ : "=a"(ret)
+ : "a"(nr), "b"(p1), "c"(p2), "d"(p3)
+ : "memory");
+ return ret;
+}
+
#ifdef CONFIG_KVM_GUEST
void kvmclock_init(void);
void kvmclock_disable(void);
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 2356fdd..98a3570 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -44,12 +44,12 @@
int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size);
+void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr,
+ unsigned long size, bool enc);
void __init mem_encrypt_free_decrypted_mem(void);
void __init sev_es_init_vc_handling(void);
-bool sme_active(void);
-bool sev_active(void);
bool sev_es_active(void);
void __init mem_encrypt_init(void);
@@ -75,14 +75,14 @@
static inline void __init sme_enable(struct boot_params *bp) { }
static inline void sev_es_init_vc_handling(void) { }
-static inline bool sme_active(void) { return false; }
-static inline bool sev_active(void) { return false; }
static inline bool sev_es_active(void) { return false; }
static inline int __init
early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
static inline int __init
early_set_memory_encrypted(unsigned long vaddr, unsigned long size) { return 0; }
+static inline void __init
+early_set_mem_enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc) {}
static inline void mem_encrypt_free_decrypted_mem(void) { }
@@ -103,11 +103,6 @@
extern char __start_bss_decrypted[], __end_bss_decrypted[], __start_bss_decrypted_unused[];
-static inline bool mem_encrypt_active(void)
-{
- return sme_me_mask;
-}
-
static inline u64 sme_get_me_mask(void)
{
return sme_me_mask;
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 15939a7..7667ad1 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -560,8 +560,10 @@
#define MSR_AMD64_SEV 0xc0010131
#define MSR_AMD64_SEV_ENABLED_BIT 0
#define MSR_AMD64_SEV_ES_ENABLED_BIT 1
+#define MSR_AMD64_SEV_SNP_ENABLED_BIT 2
#define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
#define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
+#define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index d42e6c6..65ec196 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -10,16 +10,7 @@
#include <asm/errno.h>
#include <asm/cpumask.h>
#include <uapi/asm/msr.h>
-
-struct msr {
- union {
- struct {
- u32 l;
- u32 h;
- };
- u64 q;
- };
-};
+#include <asm/shared/msr.h>
struct msr_info {
u32 msr_no;
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 4d8b273..e658eb6 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -97,6 +97,12 @@
PVOP_VCALL1(mmu.exit_mmap, mm);
}
+static inline void notify_page_enc_status_changed(unsigned long pfn,
+ int npages, bool enc)
+{
+ PVOP_VCALL3(mmu.notify_page_enc_status_changed, pfn, npages, enc);
+}
+
#ifdef CONFIG_PARAVIRT_XXL
static inline void load_sp0(unsigned long sp0)
{
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index d9d6b02..6641998 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -168,6 +168,7 @@
/* Hook for intercepting the destruction of an mm_struct. */
void (*exit_mmap)(struct mm_struct *mm);
+ void (*notify_page_enc_status_changed)(unsigned long pfn, int npages, bool enc);
#ifdef CONFIG_PARAVIRT_XXL
struct paravirt_callee_save read_cr2;
diff --git a/arch/x86/include/asm/pci-direct.h b/arch/x86/include/asm/pci-direct.h
index 94597a3..ebb3db2 100644
--- a/arch/x86/include/asm/pci-direct.h
+++ b/arch/x86/include/asm/pci-direct.h
@@ -10,9 +10,11 @@
extern u32 read_pci_config(u8 bus, u8 slot, u8 func, u8 offset);
extern u8 read_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset);
extern u16 read_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset);
+extern u32 pci_early_find_cap(int bus, int slot, int func, int cap);
extern void write_pci_config(u8 bus, u8 slot, u8 func, u8 offset, u32 val);
extern void write_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset, u8 val);
extern void write_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset, u16 val);
+extern unsigned int pci_early_clear_msi;
extern int early_pci_allowed(void);
#endif /* _ASM_X86_PCI_DIRECT_H */
diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index e43ccc6..b044bae5 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -49,7 +49,6 @@
extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern unsigned long __startup_secondary_64(void);
extern void startup_64_setup_env(unsigned long physbase);
extern void early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);
diff --git a/arch/x86/include/asm/sev-common.h b/arch/x86/include/asm/sev-common.h
index 2cef6c5..5aadc032 100644
--- a/arch/x86/include/asm/sev-common.h
+++ b/arch/x86/include/asm/sev-common.h
@@ -55,9 +55,79 @@
#define GHCB_MSR_AP_RESET_HOLD_REQ 0x006
#define GHCB_MSR_AP_RESET_HOLD_RESP 0x007
+/* GHCB GPA Register */
+#define GHCB_MSR_REG_GPA_REQ 0x012
+#define GHCB_MSR_REG_GPA_REQ_VAL(v) \
+ /* GHCBData[63:12] */ \
+ (((u64)((v) & GENMASK_ULL(51, 0)) << 12) | \
+ /* GHCBData[11:0] */ \
+ GHCB_MSR_REG_GPA_REQ)
+
+#define GHCB_MSR_REG_GPA_RESP 0x013
+#define GHCB_MSR_REG_GPA_RESP_VAL(v) \
+ /* GHCBData[63:12] */ \
+ (((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
+/*
+ * SNP Page State Change Operation
+ *
+ * GHCBData[55:52] - Page operation:
+ * 0x0001 Page assignment, Private
+ * 0x0002 Page assignment, Shared
+ */
+enum psc_op {
+ SNP_PAGE_STATE_PRIVATE = 1,
+ SNP_PAGE_STATE_SHARED,
+};
+
+#define GHCB_MSR_PSC_REQ 0x014
+#define GHCB_MSR_PSC_REQ_GFN(gfn, op) \
+ /* GHCBData[55:52] */ \
+ (((u64)((op) & 0xf) << 52) | \
+ /* GHCBData[51:12] */ \
+ ((u64)((gfn) & GENMASK_ULL(39, 0)) << 12) | \
+ /* GHCBData[11:0] */ \
+ GHCB_MSR_PSC_REQ)
+
+#define GHCB_MSR_PSC_RESP 0x015
+#define GHCB_MSR_PSC_RESP_VAL(val) \
+ /* GHCBData[63:32] */ \
+ (((u64)(val) & GENMASK_ULL(63, 32)) >> 32)
+
/* GHCB Hypervisor Feature Request/Response */
#define GHCB_MSR_HV_FT_REQ 0x080
#define GHCB_MSR_HV_FT_RESP 0x081
+#define GHCB_MSR_HV_FT_RESP_VAL(v) \
+ /* GHCBData[63:12] */ \
+ (((u64)(v) & GENMASK_ULL(63, 12)) >> 12)
+
+#define GHCB_HV_FT_SNP BIT_ULL(0)
+#define GHCB_HV_FT_SNP_AP_CREATION BIT_ULL(1)
+
+/* SNP Page State Change NAE event */
+#define VMGEXIT_PSC_MAX_ENTRY 253
+
+struct psc_hdr {
+ u16 cur_entry;
+ u16 end_entry;
+ u32 reserved;
+} __packed;
+
+struct psc_entry {
+ u64 cur_page : 12,
+ gfn : 40,
+ operation : 4,
+ pagesize : 1,
+ reserved : 7;
+} __packed;
+
+struct snp_psc_desc {
+ struct psc_hdr hdr;
+ struct psc_entry entries[VMGEXIT_PSC_MAX_ENTRY];
+} __packed;
+
+/* Guest message request error code */
+#define SNP_GUEST_REQ_INVALID_LEN BIT_ULL(32)
#define GHCB_MSR_TERM_REQ 0x100
#define GHCB_MSR_TERM_REASON_SET_POS 12
@@ -68,8 +138,20 @@
(((((u64)reason_set) & GHCB_MSR_TERM_REASON_SET_MASK) << GHCB_MSR_TERM_REASON_SET_POS) | \
((((u64)reason_val) & GHCB_MSR_TERM_REASON_MASK) << GHCB_MSR_TERM_REASON_POS))
-#define GHCB_SEV_ES_REASON_GENERAL_REQUEST 0
-#define GHCB_SEV_ES_REASON_PROTOCOL_UNSUPPORTED 1
+/* Error codes from reason set 0 */
+#define SEV_TERM_SET_GEN 0
+#define GHCB_SEV_ES_GEN_REQ 0
+#define GHCB_SEV_ES_PROT_UNSUPPORTED 1
+#define GHCB_SNP_UNSUPPORTED 2
+
+/* Linux-specific reason codes (used with reason set 1) */
+#define SEV_TERM_SET_LINUX 1
+#define GHCB_TERM_REGISTER 0 /* GHCB GPA registration failure */
+#define GHCB_TERM_PSC 1 /* Page State Change failure */
+#define GHCB_TERM_PVALIDATE 2 /* Pvalidate failure */
+#define GHCB_TERM_NOT_VMPL0 3 /* SNP guest is not running at VMPL-0 */
+#define GHCB_TERM_CPUID 4 /* CPUID-validation failure */
+#define GHCB_TERM_CPUID_HV 5 /* CPUID failure during hypervisor fallback */
#define GHCB_RESP_CODE(v) ((v) & GHCB_MSR_INFO_MASK)
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index fa5cd05..aba0cb2 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -11,9 +11,10 @@
#include <linux/types.h>
#include <asm/insn.h>
#include <asm/sev-common.h>
+#include <asm/bootparam.h>
-#define GHCB_PROTO_OUR 0x0001UL
-#define GHCB_PROTOCOL_MAX 1ULL
+#define GHCB_PROTOCOL_MIN 1ULL
+#define GHCB_PROTOCOL_MAX 2ULL
#define GHCB_DEFAULT_USAGE 0ULL
#define VMGEXIT() { asm volatile("rep; vmmcall\n\r"); }
@@ -42,6 +43,24 @@
struct es_fault_info fi;
};
+/*
+ * AMD SEV Confidential computing blob structure. The structure is
+ * defined in OVMF UEFI firmware header:
+ * https://github.com/tianocore/edk2/blob/master/OvmfPkg/Include/Guid/ConfidentialComputingSevSnpBlob.h
+ */
+#define CC_BLOB_SEV_HDR_MAGIC 0x45444d41
+struct cc_blob_sev_info {
+ u32 magic;
+ u16 version;
+ u16 reserved;
+ u64 secrets_phys;
+ u32 secrets_len;
+ u32 rsvd1;
+ u64 cpuid_phys;
+ u32 cpuid_len;
+ u32 rsvd2;
+} __packed;
+
void do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code);
static inline u64 lower_bits(u64 val, unsigned int bits)
@@ -59,6 +78,26 @@
extern void vc_boot_ghcb(void);
extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
+/* Software defined (when rFlags.CF = 1) */
+#define PVALIDATE_FAIL_NOUPDATE 255
+
+/* RMP page size */
+#define RMP_PG_SIZE_4K 0
+
+#define RMPADJUST_VMSA_PAGE_BIT BIT(16)
+
+/* SNP Guest message request */
+struct snp_req_data {
+ unsigned long req_gpa;
+ unsigned long resp_gpa;
+ unsigned long data_gpa;
+ unsigned int data_npages;
+};
+
+struct snp_guest_platform_data {
+ u64 secrets_gpa;
+};
+
#ifdef CONFIG_AMD_MEM_ENCRYPT
extern struct static_key_false sev_es_enable_key;
extern void __sev_es_ist_enter(struct pt_regs *regs);
@@ -81,12 +120,71 @@
__sev_es_nmi_complete();
}
extern int __init sev_es_efi_map_ghcbs(pgd_t *pgd);
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs)
+{
+ int rc;
+
+ /* "rmpadjust" mnemonic support in binutils 2.36 and newer */
+ asm volatile(".byte 0xF3,0x0F,0x01,0xFE\n\t"
+ : "=a"(rc)
+ : "a"(vaddr), "c"(rmp_psize), "d"(attrs)
+ : "memory", "cc");
+
+ return rc;
+}
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
+{
+ bool no_rmpupdate;
+ int rc;
+
+ /* "pvalidate" mnemonic support in binutils 2.36 and newer */
+ asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFF\n\t"
+ CC_SET(c)
+ : CC_OUT(c) (no_rmpupdate), "=a"(rc)
+ : "a"(vaddr), "c"(rmp_psize), "d"(validate)
+ : "memory", "cc");
+
+ if (no_rmpupdate)
+ return PVALIDATE_FAIL_NOUPDATE;
+
+ return rc;
+}
+void setup_ghcb(void);
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages);
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages);
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages);
+void snp_set_wakeup_secondary_cpu(void);
+bool snp_init(struct boot_params *bp);
+void snp_abort(void);
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err);
#else
static inline void sev_es_ist_enter(struct pt_regs *regs) { }
static inline void sev_es_ist_exit(void) { }
static inline int sev_es_setup_ap_jump_table(struct real_mode_header *rmh) { return 0; }
static inline void sev_es_nmi_complete(void) { }
static inline int sev_es_efi_map_ghcbs(pgd_t *pgd) { return 0; }
+static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) { return 0; }
+static inline int rmpadjust(unsigned long vaddr, bool rmp_psize, unsigned long attrs) { return 0; }
+static inline void setup_ghcb(void) { }
+static inline void __init
+early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
+early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
+static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_memory_private(unsigned long vaddr, unsigned int npages) { }
+static inline void snp_set_wakeup_secondary_cpu(void) { }
+static inline bool snp_init(struct boot_params *bp) { return false; }
+static inline void snp_abort(void) { }
+static inline int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input,
+ unsigned long *fw_err)
+{
+ return -ENOTTY;
+}
#endif
#endif
diff --git a/arch/x86/include/asm/shared/msr.h b/arch/x86/include/asm/shared/msr.h
new file mode 100644
index 0000000..1e6ec10
--- /dev/null
+++ b/arch/x86/include/asm/shared/msr.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SHARED_MSR_H
+#define _ASM_X86_SHARED_MSR_H
+
+struct msr {
+ union {
+ struct {
+ u32 l;
+ u32 h;
+ };
+ u64 q;
+ };
+};
+
+#endif /* _ASM_X86_SHARED_MSR_H */
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index b00dbc5..7d90321 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -227,6 +227,7 @@
u64 base;
} __packed;
+/* Save area definition for legacy and SEV-MEM guests */
struct vmcb_save_area {
struct vmcb_seg es;
struct vmcb_seg cs;
@@ -238,12 +239,12 @@
struct vmcb_seg ldtr;
struct vmcb_seg idtr;
struct vmcb_seg tr;
- u8 reserved_1[43];
+ u8 reserved_1[42];
+ u8 vmpl;
u8 cpl;
u8 reserved_2[4];
u64 efer;
- u8 reserved_3[104];
- u64 xss; /* Valid for SEV-ES only */
+ u8 reserved_3[112];
u64 cr4;
u64 cr3;
u64 cr0;
@@ -253,7 +254,9 @@
u64 rip;
u8 reserved_4[88];
u64 rsp;
- u8 reserved_5[24];
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
u64 rax;
u64 star;
u64 lstar;
@@ -264,29 +267,86 @@
u64 sysenter_esp;
u64 sysenter_eip;
u64 cr2;
- u8 reserved_6[32];
+ u8 reserved_5[32];
u64 g_pat;
u64 dbgctl;
u64 br_from;
u64 br_to;
u64 last_excp_from;
u64 last_excp_to;
-
- /*
- * The following part of the save area is valid only for
- * SEV-ES guests when referenced through the GHCB or for
- * saving to the host save area.
- */
- u8 reserved_7[72];
+ u8 reserved_6[72];
u32 spec_ctrl; /* Guest version of SPEC_CTRL at 0x2E0 */
- u8 reserved_7b[4];
+} __packed;
+
+/* Save area definition for SEV-ES and SEV-SNP guests */
+struct sev_es_save_area {
+ struct vmcb_seg es;
+ struct vmcb_seg cs;
+ struct vmcb_seg ss;
+ struct vmcb_seg ds;
+ struct vmcb_seg fs;
+ struct vmcb_seg gs;
+ struct vmcb_seg gdtr;
+ struct vmcb_seg ldtr;
+ struct vmcb_seg idtr;
+ struct vmcb_seg tr;
+ u64 vmpl0_ssp;
+ u64 vmpl1_ssp;
+ u64 vmpl2_ssp;
+ u64 vmpl3_ssp;
+ u64 u_cet;
+ u8 reserved_1[2];
+ u8 vmpl;
+ u8 cpl;
+ u8 reserved_2[4];
+ u64 efer;
+ u8 reserved_3[104];
+ u64 xss;
+ u64 cr4;
+ u64 cr3;
+ u64 cr0;
+ u64 dr7;
+ u64 dr6;
+ u64 rflags;
+ u64 rip;
+ u64 dr0;
+ u64 dr1;
+ u64 dr2;
+ u64 dr3;
+ u64 dr0_addr_mask;
+ u64 dr1_addr_mask;
+ u64 dr2_addr_mask;
+ u64 dr3_addr_mask;
+ u8 reserved_4[24];
+ u64 rsp;
+ u64 s_cet;
+ u64 ssp;
+ u64 isst_addr;
+ u64 rax;
+ u64 star;
+ u64 lstar;
+ u64 cstar;
+ u64 sfmask;
+ u64 kernel_gs_base;
+ u64 sysenter_cs;
+ u64 sysenter_esp;
+ u64 sysenter_eip;
+ u64 cr2;
+ u8 reserved_5[32];
+ u64 g_pat;
+ u64 dbgctl;
+ u64 br_from;
+ u64 br_to;
+ u64 last_excp_from;
+ u64 last_excp_to;
+ u8 reserved_7[80];
u32 pkru;
- u8 reserved_7a[20];
- u64 reserved_8; /* rax already available at 0x01f8 */
+ u8 reserved_8[20];
+ u64 reserved_9; /* rax already available at 0x01f8 */
u64 rcx;
u64 rdx;
u64 rbx;
- u64 reserved_9; /* rsp already available at 0x01d8 */
+ u64 reserved_10; /* rsp already available at 0x01d8 */
u64 rbp;
u64 rsi;
u64 rdi;
@@ -298,22 +358,83 @@
u64 r13;
u64 r14;
u64 r15;
- u8 reserved_10[16];
+ u8 reserved_11[16];
+ u64 guest_exit_info_1;
+ u64 guest_exit_info_2;
+ u64 guest_exit_int_info;
+ u64 guest_nrip;
+ u64 sev_features;
+ u64 vintr_ctrl;
+ u64 guest_exit_code;
+ u64 virtual_tom;
+ u64 tlb_id;
+ u64 pcpu_id;
+ u64 event_inj;
+ u64 xcr0;
+ u8 reserved_12[16];
+
+ /* Floating point area */
+ u64 x87_dp;
+ u32 mxcsr;
+ u16 x87_ftw;
+ u16 x87_fsw;
+ u16 x87_fcw;
+ u16 x87_fop;
+ u16 x87_ds;
+ u16 x87_cs;
+ u64 x87_rip;
+ u8 fpreg_x87[80];
+ u8 fpreg_xmm[256];
+ u8 fpreg_ymm[256];
+} __packed;
+
+struct ghcb_save_area {
+ u8 reserved_1[203];
+ u8 cpl;
+ u8 reserved_2[116];
+ u64 xss;
+ u8 reserved_3[24];
+ u64 dr7;
+ u8 reserved_4[16];
+ u64 rip;
+ u8 reserved_5[88];
+ u64 rsp;
+ u8 reserved_6[24];
+ u64 rax;
+ u8 reserved_7[264];
+ u64 rcx;
+ u64 rdx;
+ u64 rbx;
+ u8 reserved_8[8];
+ u64 rbp;
+ u64 rsi;
+ u64 rdi;
+ u64 r8;
+ u64 r9;
+ u64 r10;
+ u64 r11;
+ u64 r12;
+ u64 r13;
+ u64 r14;
+ u64 r15;
+ u8 reserved_9[16];
u64 sw_exit_code;
u64 sw_exit_info_1;
u64 sw_exit_info_2;
u64 sw_scratch;
- u8 reserved_11[56];
+ u8 reserved_10[56];
u64 xcr0;
u8 valid_bitmap[16];
u64 x87_state_gpa;
} __packed;
-struct ghcb {
- struct vmcb_save_area save;
- u8 reserved_save[2048 - sizeof(struct vmcb_save_area)];
+#define GHCB_SHARED_BUF_SIZE 2032
- u8 shared_buffer[2032];
+struct ghcb {
+ struct ghcb_save_area save;
+ u8 reserved_save[2048 - sizeof(struct ghcb_save_area)];
+
+ u8 shared_buffer[GHCB_SHARED_BUF_SIZE];
u8 reserved_1[10];
u16 protocol_version; /* negotiated SEV-ES/GHCB protocol version */
@@ -321,13 +442,17 @@
} __packed;
-#define EXPECTED_VMCB_SAVE_AREA_SIZE 1032
+#define EXPECTED_VMCB_SAVE_AREA_SIZE 740
+#define EXPECTED_GHCB_SAVE_AREA_SIZE 1032
+#define EXPECTED_SEV_ES_SAVE_AREA_SIZE 1648
#define EXPECTED_VMCB_CONTROL_AREA_SIZE 1024
#define EXPECTED_GHCB_SIZE PAGE_SIZE
static inline void __unused_size_checks(void)
{
BUILD_BUG_ON(sizeof(struct vmcb_save_area) != EXPECTED_VMCB_SAVE_AREA_SIZE);
+ BUILD_BUG_ON(sizeof(struct ghcb_save_area) != EXPECTED_GHCB_SAVE_AREA_SIZE);
+ BUILD_BUG_ON(sizeof(struct sev_es_save_area) != EXPECTED_SEV_ES_SAVE_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct vmcb_control_area) != EXPECTED_VMCB_CONTROL_AREA_SIZE);
BUILD_BUG_ON(sizeof(struct ghcb) != EXPECTED_GHCB_SIZE);
}
@@ -397,7 +522,7 @@
/* GHCB Accessor functions */
#define GHCB_BITMAP_IDX(field) \
- (offsetof(struct vmcb_save_area, field) / sizeof(u64))
+ (offsetof(struct ghcb_save_area, field) / sizeof(u64))
#define DEFINE_GHCB_ACCESSORS(field) \
static inline bool ghcb_##field##_is_valid(const struct ghcb *ghcb) \
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 5c69f7e..453cc5e 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -142,6 +142,21 @@
};
/**
+ * struct x86_guest - Functions used by misc guest incarnations like SEV, TDX, etc.
+ *
+ * @enc_status_change_prepare Notify HV before the encryption status of a range is changed
+ * @enc_status_change_finish Notify HV after the encryption status of a range is changed
+ * @enc_tlb_flush_required Returns true if a TLB flush is needed before changing page encryption status
+ * @enc_cache_flush_required Returns true if a cache flush is needed before changing page encryption status
+ */
+struct x86_guest {
+ void (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
+ bool (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
+ bool (*enc_tlb_flush_required)(bool enc);
+ bool (*enc_cache_flush_required)(void);
+};
+
+/**
* struct x86_init_ops - functions for platform specific setup
*
*/
@@ -287,6 +302,7 @@
struct x86_legacy_features legacy;
void (*set_legacy_features)(void);
struct x86_hyper_runtime hyper;
+ struct x86_guest guest;
};
struct pci_dev;
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index b25d3f8..bea5cdc 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -10,6 +10,7 @@
#define SETUP_EFI 4
#define SETUP_APPLE_PROPERTIES 5
#define SETUP_JAILHOUSE 6
+#define SETUP_CC_BLOB 7
#define SETUP_INDIRECT (1<<31)
@@ -187,7 +188,8 @@
__u32 ext_ramdisk_image; /* 0x0c0 */
__u32 ext_ramdisk_size; /* 0x0c4 */
__u32 ext_cmd_line_ptr; /* 0x0c8 */
- __u8 _pad4[116]; /* 0x0cc */
+ __u8 _pad4[112]; /* 0x0cc */
+ __u32 cc_blob_address; /* 0x13c */
struct edid_info edid_info; /* 0x140 */
struct efi_info efi_info; /* 0x1c0 */
__u32 alt_mem_k; /* 0x1e0 */
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index efa9693..f69c168 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -108,6 +108,14 @@
#define SVM_VMGEXIT_AP_JUMP_TABLE 0x80000005
#define SVM_VMGEXIT_SET_AP_JUMP_TABLE 0
#define SVM_VMGEXIT_GET_AP_JUMP_TABLE 1
+#define SVM_VMGEXIT_PSC 0x80000010
+#define SVM_VMGEXIT_GUEST_REQUEST 0x80000011
+#define SVM_VMGEXIT_EXT_GUEST_REQUEST 0x80000012
+#define SVM_VMGEXIT_AP_CREATION 0x80000013
+#define SVM_VMGEXIT_AP_CREATE_ON_INIT 0
+#define SVM_VMGEXIT_AP_CREATE 1
+#define SVM_VMGEXIT_AP_DESTROY 2
+#define SVM_VMGEXIT_HV_FEATURES 0x8000fffd
#define SVM_VMGEXIT_UNSUPPORTED_EVENT 0x8000ffff
/* Exit code reserved for hypervisor/software use */
@@ -218,6 +226,11 @@
{ SVM_VMGEXIT_NMI_COMPLETE, "vmgexit_nmi_complete" }, \
{ SVM_VMGEXIT_AP_HLT_LOOP, "vmgexit_ap_hlt_loop" }, \
{ SVM_VMGEXIT_AP_JUMP_TABLE, "vmgexit_ap_jump_table" }, \
+ { SVM_VMGEXIT_PSC, "vmgexit_page_state_change" }, \
+ { SVM_VMGEXIT_GUEST_REQUEST, "vmgexit_guest_request" }, \
+ { SVM_VMGEXIT_EXT_GUEST_REQUEST, "vmgexit_ext_guest_request" }, \
+ { SVM_VMGEXIT_AP_CREATION, "vmgexit_ap_creation" }, \
+ { SVM_VMGEXIT_HV_FEATURES, "vmgexit_hypervisor_feature" }, \
{ SVM_EXIT_ERR, "invalid_guest_state" }
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 2ff3e60..4df8c8f 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -48,7 +48,6 @@
# non-deterministic coverage.
KCOV_INSTRUMENT := n
-CFLAGS_head$(BITS).o += -fno-stack-protector
CFLAGS_cc_platform.o += -fno-stack-protector
CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 1056288..4e7386c 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -125,32 +125,6 @@
}
-/* Find a PCI capability */
-static u32 __init find_cap(int bus, int slot, int func, int cap)
-{
- int bytes;
- u8 pos;
-
- if (!(read_pci_config_16(bus, slot, func, PCI_STATUS) &
- PCI_STATUS_CAP_LIST))
- return 0;
-
- pos = read_pci_config_byte(bus, slot, func, PCI_CAPABILITY_LIST);
- for (bytes = 0; bytes < 48 && pos >= 0x40; bytes++) {
- u8 id;
-
- pos &= ~3;
- id = read_pci_config_byte(bus, slot, func, pos+PCI_CAP_LIST_ID);
- if (id == 0xff)
- break;
- if (id == cap)
- return pos;
- pos = read_pci_config_byte(bus, slot, func,
- pos+PCI_CAP_LIST_NEXT);
- }
- return 0;
-}
-
/* Read a standard AGPv3 bridge header */
static u32 __init read_agp(int bus, int slot, int func, int cap, u32 *order)
{
@@ -239,8 +213,8 @@
case PCI_CLASS_BRIDGE_HOST:
case PCI_CLASS_BRIDGE_OTHER: /* needed? */
/* AGP bridge? */
- cap = find_cap(bus, slot, func,
- PCI_CAP_ID_AGP);
+ cap = pci_early_find_cap(bus, slot,
+ func, PCI_CAP_ID_AGP);
if (!cap)
break;
*valid_agp = 1;
diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c
index 03bb2f3..2bca7f6 100644
--- a/arch/x86/kernel/cc_platform.c
+++ b/arch/x86/kernel/cc_platform.c
@@ -50,6 +50,9 @@
case CC_ATTR_GUEST_STATE_ENCRYPT:
return sev_status & MSR_AMD64_SEV_ES_ENABLED;
+ case CC_ATTR_GUEST_SEV_SNP:
+ return sev_status & MSR_AMD64_SEV_SNP_ENABLED;
+
default:
return false;
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 809e12f..c55e883 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -62,6 +62,8 @@
#include <asm/intel-family.h>
#include <asm/cpu_device_id.h>
#include <asm/uv/uv.h>
+#include <asm/sigframe.h>
+#include <asm/sev.h>
#include <asm/set_memory.h>
#include "cpu.h"
@@ -2132,6 +2134,9 @@
load_TR_desc();
+ /* GHCB needs to be setup to handle #VC. */
+ setup_ghcb();
+
/* Finally load the IDT */
load_current_idt();
}
diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c
index 045e82e..a7f617a 100644
--- a/arch/x86/kernel/crash_dump_64.c
+++ b/arch/x86/kernel/crash_dump_64.c
@@ -10,6 +10,7 @@
#include <linux/crash_dump.h>
#include <linux/uaccess.h>
#include <linux/io.h>
+#include <linux/cc_platform.h>
static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize,
unsigned long offset, int userbuf,
@@ -73,5 +74,6 @@
ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0, sev_active());
+ return read_from_oldmem(buf, count, ppos, 0,
+ cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT));
}
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 8690fab..6ae74a6 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -28,6 +28,37 @@
#include <asm/irq_remapping.h>
#include <asm/early_ioremap.h>
+static void __init early_pci_clear_msi(int bus, int slot, int func)
+{
+ int pos;
+ u16 ctrl;
+
+ if (likely(!pci_early_clear_msi))
+ return;
+
+ pr_info_once("Clearing MSI/MSI-X enable bits early in boot (quirk)\n");
+
+ pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSI);
+ if (pos) {
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS);
+ ctrl &= ~PCI_MSI_FLAGS_ENABLE;
+ write_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS, ctrl);
+
+ /* Read again to flush previous write */
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS);
+ }
+
+ pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSIX);
+ if (pos) {
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS);
+ ctrl &= ~PCI_MSIX_FLAGS_ENABLE;
+ write_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS, ctrl);
+
+ /* Read again to flush previous write */
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS);
+ }
+}
+
static void __init fix_hypertransport_config(int num, int slot, int func)
{
u32 htcfg;
@@ -724,6 +755,7 @@
PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet},
{ PCI_VENDOR_ID_BROADCOM, 0x4331,
PCI_CLASS_NETWORK_OTHER, PCI_ANY_ID, 0, apple_airport_reset},
+ { PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, early_pci_clear_msi},
{}
};
@@ -746,7 +778,6 @@
u16 vendor;
u16 device;
u8 type;
- u8 sec;
int i;
class = read_pci_config_16(num, slot, func, PCI_CLASS_DEVICE);
@@ -775,11 +806,8 @@
type = read_pci_config_byte(num, slot, func,
PCI_HEADER_TYPE);
- if ((type & 0x7f) == PCI_HEADER_TYPE_BRIDGE) {
- sec = read_pci_config_byte(num, slot, func, PCI_SECONDARY_BUS);
- if (sec > num)
- early_pci_scan_bus(sec);
- }
+ if ((type & 0x7f) == PCI_HEADER_TYPE_BRIDGE)
+ return -1;
if (!(type & 0x80))
return -1;
@@ -802,8 +830,10 @@
void __init early_quirks(void)
{
+ int bus;
if (!early_pci_allowed())
return;
- early_pci_scan_bus(0);
+ for (bus = 0; bus < 256; bus++)
+ early_pci_scan_bus(bus);
}
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 2375f5f..4d9f963 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -19,7 +19,7 @@
#include <linux/start_kernel.h>
#include <linux/io.h>
#include <linux/memblock.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/pgtable.h>
#include <asm/processor.h>
@@ -126,6 +126,49 @@
}
#endif
+static unsigned long sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+{
+ unsigned long vaddr, vaddr_end;
+ int i;
+
+ /* Encrypt the kernel and related (if SME is active) */
+ sme_encrypt_kernel(bp);
+
+ /*
+ * Clear the memory encryption mask from the .bss..decrypted section.
+ * The bss section will be memset to zero later in the initialization so
+ * there is no need to zero it after changing the memory encryption
+ * attribute.
+ */
+ if (sme_get_me_mask()) {
+ vaddr = (unsigned long)__start_bss_decrypted;
+ vaddr_end = (unsigned long)__end_bss_decrypted;
+
+ for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
+ /*
+ * On SNP, transition the page to shared in the RMP table so that
+ * it is consistent with the page table attribute change.
+ *
+ * __start_bss_decrypted has a virtual address in the high range
+ * mapping (kernel .text). PVALIDATE, by way of
+ * early_snp_set_memory_shared(), requires a valid virtual
+ * address but the kernel is currently running off of the identity
+ * mapping so use __pa() to get a *currently* valid virtual address.
+ */
+ early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+
+ i = pmd_index(vaddr);
+ pmd[i] -= sme_get_me_mask();
+ }
+ }
+
+ /*
+ * Return the SME encryption mask (if SME is active) to be used as a
+ * modifier for the initial pgdir entry programmed into CR3.
+ */
+ return sme_get_me_mask();
+}
+
/* Code in __startup_64() can be relocated during execution, but the compiler
* doesn't have to generate PC-relative relocations when accessing globals from
* that function. Clang actually does not generate them, which leads to
@@ -135,7 +178,6 @@
unsigned long __head __startup_64(unsigned long physaddr,
struct boot_params *bp)
{
- unsigned long vaddr, vaddr_end;
unsigned long load_delta, *p;
unsigned long pgtable_flags;
pgdval_t *pgd;
@@ -163,9 +205,6 @@
if (load_delta & ~PMD_PAGE_MASK)
for (;;);
- /* Activate Secure Memory Encryption (SME) if supported and enabled */
- sme_enable(bp);
-
/* Include the SME encryption mask in the fixup value */
load_delta += sme_get_me_mask();
@@ -276,38 +315,7 @@
*/
*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();
- /* Encrypt the kernel and related (if SME is active) */
- sme_encrypt_kernel(bp);
-
- /*
- * Clear the memory encryption mask from the .bss..decrypted section.
- * The bss section will be memset to zero later in the initialization so
- * there is no need to zero it after changing the memory encryption
- * attribute.
- */
- if (mem_encrypt_active()) {
- vaddr = (unsigned long)__start_bss_decrypted;
- vaddr_end = (unsigned long)__end_bss_decrypted;
- for (; vaddr < vaddr_end; vaddr += PMD_SIZE) {
- i = pmd_index(vaddr);
- pmd[i] -= sme_get_me_mask();
- }
- }
-
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
-}
-
-unsigned long __startup_secondary_64(void)
-{
- /*
- * Return the SME encryption mask (if SME is active) to be used as a
- * modifier for the initial pgdir entry programmed into CR3.
- */
- return sme_get_me_mask();
+ return sme_postprocess_startup(bp, pmd);
}
/* Wipe all early page tables except for the kernel symbol map */
@@ -581,8 +589,10 @@
void early_setup_idt(void)
{
/* VMM Communication Exception */
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
+ if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
+ setup_ghcb();
set_bringup_idt_handler(bringup_idt_table, X86_TRAP_VC, vc_boot_ghcb);
+ }
bringup_idt_descr.address = (unsigned long)bringup_idt_table;
native_load_idt(&bringup_idt_descr);
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 81f1ae2..8a86626 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -65,10 +65,39 @@
leaq (__end_init_task - FRAME_SIZE)(%rip), %rsp
leaq _text(%rip), %rdi
+
+ /*
+ * initial_gs points to initial fixed_percpu_data struct with storage for
+ * the stack protector canary. Global pointer fixups are needed at this
+ * stage, so apply them as is done in fixup_pointer(), and initialize %gs
+ * such that the canary can be accessed at %gs:40 for subsequent C calls.
+ */
+ movl $MSR_GS_BASE, %ecx
+ movq initial_gs(%rip), %rax
+ movq $_text, %rdx
+ subq %rdx, %rax
+ addq %rdi, %rax
+ movq %rax, %rdx
+ shrq $32, %rdx
+ wrmsr
+
pushq %rsi
call startup_64_setup_env
popq %rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ /*
+ * Activate SEV/SME memory encryption if supported/enabled. This needs to
+ * be done now, since this also includes setup of the SEV-SNP CPUID table,
+ * which needs to be done before any CPUID instructions are executed in
+ * subsequent code.
+ */
+ movq %rsi, %rdi
+ pushq %rsi
+ call sme_enable
+ popq %rsi
+#endif
+
/* Now switch to __KERNEL_CS so IRET works reliably */
pushq $__KERNEL_CS
leaq .Lon_kernel_cs(%rip), %rax
@@ -132,9 +161,11 @@
* Retrieve the modifier (SME encryption mask if SME is active) to be
* added to the initial pgdir entry that will be programmed into CR3.
*/
- pushq %rsi
- call __startup_secondary_64
- popq %rsi
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+ movq sme_me_mask, %rax
+#else
+ xorq %rax, %rax
+#endif
/* Form the CR3 value being sure to include the CR3 modifier */
addq $(init_top_pgt - __START_KERNEL_map), %rax
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eba6485..dc3900f 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -27,6 +27,8 @@
#include <linux/nmi.h>
#include <linux/swait.h>
#include <linux/syscore_ops.h>
+#include <linux/cc_platform.h>
+#include <linux/efi.h>
#include <asm/timer.h>
#include <asm/cpu.h>
#include <asm/traps.h>
@@ -40,6 +42,7 @@
#include <asm/ptrace.h>
#include <asm/reboot.h>
#include <asm/svm.h>
+#include <asm/e820/api.h>
DEFINE_STATIC_KEY_FALSE(kvm_async_pf_enabled);
@@ -432,7 +435,7 @@
{
int cpu;
- if (!sev_active())
+ if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return;
for_each_possible_cpu(cpu) {
@@ -447,6 +450,8 @@
kvm_disable_steal_time();
if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
wrmsrl(MSR_KVM_PV_EOI_EN, 0);
+ if (kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL))
+ wrmsrl(MSR_KVM_MIGRATION_CONTROL, 0);
kvm_pv_disable_apf();
if (!shutdown)
apf_task_wake_all();
@@ -564,6 +569,55 @@
__send_ipi_mask(local_mask, vector);
}
+static int __init setup_efi_kvm_sev_migration(void)
+{
+ efi_char16_t efi_sev_live_migration_enabled[] = L"SevLiveMigrationEnabled";
+ efi_guid_t efi_variable_guid = AMD_SEV_MEM_ENCRYPT_GUID;
+ efi_status_t status;
+ unsigned long size;
+ bool enabled;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) ||
+ !kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL))
+ return 0;
+
+ if (!efi_enabled(EFI_BOOT))
+ return 0;
+
+ if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
+ pr_info("%s : EFI runtime services are not enabled\n", __func__);
+ return 0;
+ }
+
+ size = sizeof(enabled);
+
+ /* Get variable contents into buffer */
+ status = efi.get_variable(efi_sev_live_migration_enabled,
+ &efi_variable_guid, NULL, &size, &enabled);
+
+ if (status == EFI_NOT_FOUND) {
+ pr_info("%s : EFI live migration variable not found\n", __func__);
+ return 0;
+ }
+
+ if (status != EFI_SUCCESS) {
+ pr_info("%s : EFI variable retrieval failed\n", __func__);
+ return 0;
+ }
+
+ if (enabled == 0) {
+ pr_info("%s: live migration disabled in EFI\n", __func__);
+ return 0;
+ }
+
+ pr_info("%s : live migration enabled in EFI\n", __func__);
+ wrmsrl(MSR_KVM_MIGRATION_CONTROL, KVM_MIGRATION_READY);
+
+ return 1;
+}
+
+late_initcall(setup_efi_kvm_sev_migration);
+
/*
* Set the IPI entry points
*/
@@ -834,8 +888,60 @@
return kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID);
}
+static void kvm_sev_hc_page_enc_status(unsigned long pfn, int npages, bool enc)
+{
+ kvm_sev_hypercall3(KVM_HC_MAP_GPA_RANGE, pfn << PAGE_SHIFT, npages,
+ KVM_MAP_GPA_RANGE_ENC_STAT(enc) | KVM_MAP_GPA_RANGE_PAGE_SZ_4K);
+}
+
static void __init kvm_init_platform(void)
{
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
+ kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) {
+ unsigned long nr_pages;
+ int i;
+
+ pv_ops.mmu.notify_page_enc_status_changed =
+ kvm_sev_hc_page_enc_status;
+
+ /*
+ * Reset the host's shared pages list related to kernel
+ * specific page encryption status settings before we load a
+ * new kernel by kexec. Reset the page encryption status
+ * during early boot intead of just before kexec to avoid SMP
+ * races during kvm_pv_guest_cpu_reboot().
+ * NOTE: We cannot reset the complete shared pages list
+ * here as we need to retain the UEFI/OVMF firmware
+ * specific settings.
+ */
+
+ for (i = 0; i < e820_table->nr_entries; i++) {
+ struct e820_entry *entry = &e820_table->entries[i];
+
+ if (entry->type != E820_TYPE_RAM)
+ continue;
+
+ nr_pages = DIV_ROUND_UP(entry->size, PAGE_SIZE);
+
+ kvm_sev_hypercall3(KVM_HC_MAP_GPA_RANGE, entry->addr,
+ nr_pages,
+ KVM_MAP_GPA_RANGE_ENCRYPTED | KVM_MAP_GPA_RANGE_PAGE_SZ_4K);
+ }
+
+ /*
+ * Ensure that _bss_decrypted section is marked as decrypted in the
+ * shared pages list.
+ */
+ early_set_mem_enc_dec_hypercall((unsigned long)__start_bss_decrypted,
+ __end_bss_decrypted - __start_bss_decrypted, 0);
+
+ /*
+ * If not booted using EFI, enable Live migration support.
+ */
+ if (!efi_enabled(EFI_BOOT))
+ wrmsrl(MSR_KVM_MIGRATION_CONTROL,
+ KVM_MIGRATION_READY);
+ }
kvmclock_init();
x86_platform.apic_post_init = kvm_apic_init;
}
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1d986b3..8e1cd29 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -16,9 +16,9 @@
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/set_memory.h>
+#include <linux/cc_platform.h>
#include <asm/hypervisor.h>
-#include <asm/mem_encrypt.h>
#include <asm/x86_init.h>
#include <asm/kvmclock.h>
@@ -224,7 +224,7 @@
* hvclock is shared between the guest and the hypervisor, must
* be mapped decrypted.
*/
- if (sev_active()) {
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
r = set_memory_decrypted((unsigned long) hvclock_mem,
1UL << order);
if (r) {
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index dc8b175..1f0eb0e 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -17,6 +17,7 @@
#include <linux/suspend.h>
#include <linux/vmalloc.h>
#include <linux/efi.h>
+#include <linux/cc_platform.h>
#include <asm/init.h>
#include <asm/tlbflush.h>
@@ -166,7 +167,7 @@
}
pte = pte_offset_kernel(pmd, vaddr);
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
prot = PAGE_KERNEL_EXEC;
set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
@@ -206,7 +207,7 @@
level4p = (pgd_t *)__va(start_pgtable);
clear_page(level4p);
- if (sev_active()) {
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
info.page_flag |= _PAGE_ENC;
info.kernpg_flag |= _PAGE_ENC;
}
@@ -358,7 +359,7 @@
(unsigned long)page_list,
image->start,
image->preserve_context,
- sme_active());
+ cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT));
#ifdef CONFIG_KEXEC_JUMP
if (image->preserve_context)
@@ -575,12 +576,12 @@
*/
int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp)
{
- if (sev_active())
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return 0;
/*
- * If SME is active we need to be sure that kexec pages are
- * not encrypted because when we boot to the new kernel the
+ * If host memory encryption is active we need to be sure that kexec
+ * pages are not encrypted because when we boot to the new kernel the
* pages won't be accessed encrypted (initially).
*/
return set_memory_decrypted((unsigned long)vaddr, pages);
@@ -588,12 +589,12 @@
void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages)
{
- if (sev_active())
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return;
/*
- * If SME is active we need to reset the pages back to being
- * an encrypted mapping before freeing them.
+ * If host memory encryption is active we need to reset the pages back
+ * to being an encrypted mapping before freeing them.
*/
set_memory_encrypted((unsigned long)vaddr, pages);
}
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 33d1a61..2765ec4 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -279,6 +279,7 @@
(void (*)(struct mmu_gather *, void *))tlb_remove_page,
.mmu.exit_mmap = paravirt_nop,
+ .mmu.notify_page_enc_status_changed = paravirt_nop,
#ifdef CONFIG_PARAVIRT_XXL
.mmu.read_cr2 = __PV_IS_CALLEE_SAVE(native_read_cr2),
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e..814ab46 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -6,7 +6,7 @@
#include <linux/swiotlb.h>
#include <linux/memblock.h>
#include <linux/dma-direct.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <asm/iommu.h>
#include <asm/swiotlb.h>
@@ -45,11 +45,10 @@
swiotlb = 1;
/*
- * If SME is active then swiotlb will be set to 1 so that bounce
- * buffers are allocated and used for devices that do not support
- * the addressing range required for the encryption mask.
+ * Set swiotlb to 1 so that bounce buffers are allocated and used for
+ * devices that can't support DMA to encrypted memory.
*/
- if (sme_active())
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
swiotlb = 1;
return swiotlb;
diff --git a/arch/x86/kernel/probe_roms.c b/arch/x86/kernel/probe_roms.c
index 9e1def3..e2f2a15 100644
--- a/arch/x86/kernel/probe_roms.c
+++ b/arch/x86/kernel/probe_roms.c
@@ -21,6 +21,7 @@
#include <asm/sections.h>
#include <asm/io.h>
#include <asm/setup_arch.h>
+#include <asm/sev.h>
static struct resource system_rom_resource = {
.name = "System ROM",
@@ -197,11 +198,21 @@
void __init probe_roms(void)
{
- const unsigned char *rom;
unsigned long start, length, upper;
+ const unsigned char *rom;
unsigned char c;
int i;
+ /*
+ * The ROM memory range is not part of the e820 table and is therefore not
+ * pre-validated by BIOS. The kernel page table maps the ROM region as encrypted
+ * memory, and SNP requires encrypted memory to be validated before access.
+ * Do that here.
+ */
+ snp_prep_memory(video_rom_resource.start,
+ ((system_rom_resource.end + 1) - video_rom_resource.start),
+ SNP_PAGE_STATE_PRIVATE);
+
/* video rom */
upper = adapter_rom_resources[0].start;
for (start = video_rom_resource.start; start < upper; start += 2048) {
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 8a9cea9..2dcbc84 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -48,7 +48,7 @@
* %rsi page_list
* %rdx start address
* %rcx preserve_context
- * %r8 sme_active
+ * %r8 host_mem_enc_active
*/
/* Save the CPU context, used for jumping back */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index dcfea7b..3cdabb4 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -504,6 +504,10 @@
/* 0 means: find the address automatically */
if (!crash_base) {
+ unsigned long long max_addr = high ? CRASH_ADDR_HIGH_MAX
+ : CRASH_ADDR_LOW_MAX;
+ unsigned long long base = CRASH_ALIGN;
+
/*
* Set CRASH_ADDR_LOW_MAX upper bound for crash memory,
* crashkernel=x,high reserves memory over 4G, also allocates
@@ -511,15 +515,14 @@
* But the extra memory is not required for all machines.
* So try low memory first and fall back to high memory
* unless "crashkernel=size[KMG],high" is specified.
+ * To conserve memory in crash-capture kernel try
+ * to allocate crash_base at the lowest address possible.
*/
- if (!high)
+ do {
crash_base = memblock_phys_alloc_range(crash_size,
- CRASH_ALIGN, CRASH_ALIGN,
- CRASH_ADDR_LOW_MAX);
- if (!crash_base)
- crash_base = memblock_phys_alloc_range(crash_size,
- CRASH_ALIGN, CRASH_ALIGN,
- CRASH_ADDR_HIGH_MAX);
+ CRASH_ALIGN, base, base + crash_size);
+ base += CRASH_ALIGN;
+ } while (!crash_base && base + crash_size <= max_addr);
if (!crash_base) {
pr_info("crashkernel reservation failed - No suitable area found.\n");
return;
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 05707b7..c94ce93 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -14,6 +14,89 @@
#define has_cpuflag(f) boot_cpu_has(f)
#endif
+/* I/O parameters for CPUID-related helpers */
+struct cpuid_leaf {
+ u32 fn;
+ u32 subfn;
+ u32 eax;
+ u32 ebx;
+ u32 ecx;
+ u32 edx;
+};
+
+/*
+ * Individual entries of the SNP CPUID table, as defined by the SNP
+ * Firmware ABI, Revision 0.9, Section 7.1, Table 14.
+ */
+struct snp_cpuid_fn {
+ u32 eax_in;
+ u32 ecx_in;
+ u64 xcr0_in;
+ u64 xss_in;
+ u32 eax;
+ u32 ebx;
+ u32 ecx;
+ u32 edx;
+ u64 __reserved;
+} __packed;
+
+/*
+ * SNP CPUID table, as defined by the SNP Firmware ABI, Revision 0.9,
+ * Section 8.14.2.6. Also noted there is the SNP firmware-enforced limit
+ * of 64 entries per CPUID table.
+ */
+#define SNP_CPUID_COUNT_MAX 64
+
+struct snp_cpuid_table {
+ u32 count;
+ u32 __reserved1;
+ u64 __reserved2;
+ struct snp_cpuid_fn fn[SNP_CPUID_COUNT_MAX];
+} __packed;
+
+/*
+ * Since feature negotiation related variables are set early in the boot
+ * process they must reside in the .data section so as not to be zeroed
+ * out when the .bss section is later cleared.
+ *
+ * GHCB protocol version negotiated with the hypervisor.
+ */
+static u16 ghcb_version __ro_after_init;
+
+/*
+ * This may be called early while still running on the initial identity
+ * mapping. Use RIP-relative addressing to obtain the correct address
+ * while running with the initial identity mapping as well as the
+ * switch-over to kernel virtual addresses later.
+ */
+static u16 *get_ghcb_version_ptr(void)
+{
+ void *ptr;
+
+ asm ("lea ghcb_version(%%rip), %0"
+ : "=r" (ptr)
+ : "p" (&ghcb_version));
+
+ return (u16 *)ptr;
+}
+
+/* Copy of the SNP firmware's CPUID page. */
+static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
+
+/*
+ * These will be initialized based on CPUID table so that non-present
+ * all-zero leaves (for sparse tables) can be differentiated from
+ * invalid/out-of-range leaves. This is needed since all-zero leaves
+ * still need to be post-processed.
+ */
+struct cpuid_maxes {
+ u32 std_range;
+ u32 hyp_range;
+ u32 ext_range;
+};
+
+static struct cpuid_maxes cpuid_range_maxes __ro_after_init;
+
static bool __init sev_es_check_cpu_features(void)
{
if (!has_cpuflag(X86_FEATURE_RDRAND)) {
@@ -24,15 +107,12 @@
return true;
}
-static void __noreturn sev_es_terminate(unsigned int reason)
+static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason)
{
u64 val = GHCB_MSR_TERM_REQ;
- /*
- * Tell the hypervisor what went wrong - only reason-set 0 is
- * currently supported.
- */
- val |= GHCB_SEV_TERM_REASON(0, reason);
+ /* Tell the hypervisor what went wrong. */
+ val |= GHCB_SEV_TERM_REASON(set, reason);
/* Request Guest Termination from Hypvervisor */
sev_es_wr_ghcb_msr(val);
@@ -42,6 +122,42 @@
asm volatile("hlt\n" : : : "memory");
}
+/*
+ * The hypervisor features are available from GHCB version 2 onward.
+ */
+static u64 get_hv_features(void)
+{
+ u64 val;
+
+ if (*get_ghcb_version_ptr() < 2)
+ return 0;
+
+ sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_HV_FT_RESP)
+ return 0;
+
+ return GHCB_MSR_HV_FT_RESP_VAL(val);
+}
+
+static void snp_register_ghcb_early(unsigned long paddr)
+{
+ unsigned long pfn = paddr >> PAGE_SHIFT;
+ u64 val;
+
+ sev_es_wr_ghcb_msr(GHCB_MSR_REG_GPA_REQ_VAL(pfn));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ /* If the response GPA is not ours then abort the guest */
+ if ((GHCB_RESP_CODE(val) != GHCB_MSR_REG_GPA_RESP) ||
+ (GHCB_MSR_REG_GPA_RESP_VAL(val) != pfn))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_REGISTER);
+}
+
static bool sev_es_negotiate_protocol(void)
{
u64 val;
@@ -54,10 +170,12 @@
if (GHCB_MSR_INFO(val) != GHCB_MSR_SEV_INFO_RESP)
return false;
- if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTO_OUR ||
- GHCB_MSR_PROTO_MIN(val) > GHCB_PROTO_OUR)
+ if (GHCB_MSR_PROTO_MAX(val) < GHCB_PROTOCOL_MIN ||
+ GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
return false;
+ *get_ghcb_version_ptr() = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+
return true;
}
@@ -102,7 +220,7 @@
enum es_result ret;
/* Fill in protocol and format specifiers */
- ghcb->protocol_version = GHCB_PROTOCOL_MAX;
+ ghcb->protocol_version = ghcb_version;
ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
ghcb_set_sw_exit_code(ghcb, exit_code);
@@ -139,6 +257,302 @@
return ret;
}
+static int __sev_cpuid_hv(u32 fn, int reg_idx, u32 *reg)
+{
+ u64 val;
+
+ sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, reg_idx));
+ VMGEXIT();
+ val = sev_es_rd_ghcb_msr();
+ if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ return -EIO;
+
+ *reg = (val >> 32);
+
+ return 0;
+}
+
+static int sev_cpuid_hv(struct cpuid_leaf *leaf)
+{
+ int ret;
+
+ /*
+ * MSR protocol does not support fetching non-zero subfunctions, but is
+ * sufficient to handle current early-boot cases. Should that change,
+ * make sure to report an error rather than ignoring the index and
+ * grabbing random values. If this issue arises in the future, handling
+ * can be added here to use GHCB-page protocol for cases that occur late
+ * enough in boot that GHCB page is available.
+ */
+ if (cpuid_function_is_indexed(leaf->fn) && leaf->subfn)
+ return -EINVAL;
+
+ ret = __sev_cpuid_hv(leaf->fn, GHCB_CPUID_REQ_EAX, &leaf->eax);
+ ret = ret ? : __sev_cpuid_hv(leaf->fn, GHCB_CPUID_REQ_EBX, &leaf->ebx);
+ ret = ret ? : __sev_cpuid_hv(leaf->fn, GHCB_CPUID_REQ_ECX, &leaf->ecx);
+ ret = ret ? : __sev_cpuid_hv(leaf->fn, GHCB_CPUID_REQ_EDX, &leaf->edx);
+
+ return ret;
+}
+
+/*
+ * This may be called early while still running on the initial identity
+ * mapping. Use RIP-relative addressing to obtain the correct address
+ * while running with the initial identity mapping as well as the
+ * switch-over to kernel virtual addresses later.
+ */
+static const struct snp_cpuid_table *snp_cpuid_get_table(void)
+{
+ void *ptr;
+
+ asm ("lea cpuid_table_copy(%%rip), %0"
+ : "=r" (ptr)
+ : "p" (&cpuid_table_copy));
+
+ return ptr;
+}
+
+static const struct cpuid_maxes *snp_cpuid_get_maxes(void)
+{
+ void *ptr;
+
+ asm ("lea cpuid_range_maxes(%%rip), %0"
+ : "=r" (ptr)
+ : "p" (&cpuid_range_maxes));
+
+ return ptr;
+}
+
+/*
+ * The SNP Firmware ABI, Revision 0.9, Section 7.1, details the use of
+ * XCR0_IN and XSS_IN to encode multiple versions of 0xD subfunctions 0
+ * and 1 based on the corresponding features enabled by a particular
+ * combination of XCR0 and XSS registers so that a guest can look up the
+ * version corresponding to the features currently enabled in its XCR0/XSS
+ * registers. The only values that differ between these versions/table
+ * entries is the enabled XSAVE area size advertised via EBX.
+ *
+ * While hypervisors may choose to make use of this support, it is more
+ * robust/secure for a guest to simply find the entry corresponding to the
+ * base/legacy XSAVE area size (XCR0=1 or XCR0=3), and then calculate the
+ * XSAVE area size using subfunctions 2 through 64, as documented in APM
+ * Volume 3, Rev 3.31, Appendix E.3.8, which is what is done here.
+ *
+ * Since base/legacy XSAVE area size is documented as 0x240, use that value
+ * directly rather than relying on the base size in the CPUID table.
+ *
+ * Return: XSAVE area size on success, 0 otherwise.
+ */
+static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted)
+{
+ const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+ u64 xfeatures_found = 0;
+ u32 xsave_size = 0x240;
+ int i;
+
+ for (i = 0; i < cpuid_table->count; i++) {
+ const struct snp_cpuid_fn *e = &cpuid_table->fn[i];
+
+ if (!(e->eax_in == 0xD && e->ecx_in > 1 && e->ecx_in < 64))
+ continue;
+ if (!(xfeatures_en & (BIT_ULL(e->ecx_in))))
+ continue;
+ if (xfeatures_found & (BIT_ULL(e->ecx_in)))
+ continue;
+
+ xfeatures_found |= (BIT_ULL(e->ecx_in));
+
+ if (compacted)
+ xsave_size += e->eax;
+ else
+ xsave_size = max(xsave_size, e->eax + e->ebx);
+ }
+
+ /*
+ * Either the guest set unsupported XCR0/XSS bits, or the corresponding
+ * entries in the CPUID table were not present. This is not a valid
+ * state to be in.
+ */
+ if (xfeatures_found != (xfeatures_en & GENMASK_ULL(63, 2)))
+ return 0;
+
+ return xsave_size;
+}
+
+static bool
+snp_cpuid_get_validated_func(struct cpuid_leaf *leaf)
+{
+ const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+ int i;
+
+ for (i = 0; i < cpuid_table->count; i++) {
+ const struct snp_cpuid_fn *e = &cpuid_table->fn[i];
+
+ if (e->eax_in != leaf->fn)
+ continue;
+
+ if (cpuid_function_is_indexed(leaf->fn) && e->ecx_in != leaf->subfn)
+ continue;
+
+ /*
+ * For 0xD subfunctions 0 and 1, only use the entry corresponding
+ * to the base/legacy XSAVE area size (XCR0=1 or XCR0=3, XSS=0).
+ * See the comments above snp_cpuid_calc_xsave_size() for more
+ * details.
+ */
+ if (e->eax_in == 0xD && (e->ecx_in == 0 || e->ecx_in == 1))
+ if (!(e->xcr0_in == 1 || e->xcr0_in == 3) || e->xss_in)
+ continue;
+
+ leaf->eax = e->eax;
+ leaf->ebx = e->ebx;
+ leaf->ecx = e->ecx;
+ leaf->edx = e->edx;
+
+ return true;
+ }
+
+ return false;
+}
+
+static void snp_cpuid_hv(struct cpuid_leaf *leaf)
+{
+ if (sev_cpuid_hv(leaf))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID_HV);
+}
+
+static int snp_cpuid_postprocess(struct cpuid_leaf *leaf)
+{
+ struct cpuid_leaf leaf_hv = *leaf;
+
+ switch (leaf->fn) {
+ case 0x1:
+ snp_cpuid_hv(&leaf_hv);
+
+ /* initial APIC ID */
+ leaf->ebx = (leaf_hv.ebx & GENMASK(31, 24)) | (leaf->ebx & GENMASK(23, 0));
+ /* APIC enabled bit */
+ leaf->edx = (leaf_hv.edx & BIT(9)) | (leaf->edx & ~BIT(9));
+
+ /* OSXSAVE enabled bit */
+ if (native_read_cr4() & X86_CR4_OSXSAVE)
+ leaf->ecx |= BIT(27);
+ break;
+ case 0x7:
+ /* OSPKE enabled bit */
+ leaf->ecx &= ~BIT(4);
+ if (native_read_cr4() & X86_CR4_PKE)
+ leaf->ecx |= BIT(4);
+ break;
+ case 0xB:
+ leaf_hv.subfn = 0;
+ snp_cpuid_hv(&leaf_hv);
+
+ /* extended APIC ID */
+ leaf->edx = leaf_hv.edx;
+ break;
+ case 0xD: {
+ bool compacted = false;
+ u64 xcr0 = 1, xss = 0;
+ u32 xsave_size;
+
+ if (leaf->subfn != 0 && leaf->subfn != 1)
+ return 0;
+
+ if (native_read_cr4() & X86_CR4_OSXSAVE)
+ xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
+ if (leaf->subfn == 1) {
+ /* Get XSS value if XSAVES is enabled. */
+ if (leaf->eax & BIT(3)) {
+ unsigned long lo, hi;
+
+ asm volatile("rdmsr" : "=a" (lo), "=d" (hi)
+ : "c" (MSR_IA32_XSS));
+ xss = (hi << 32) | lo;
+ }
+
+ /*
+ * The PPR and APM aren't clear on what size should be
+ * encoded in 0xD:0x1:EBX when compaction is not enabled
+ * by either XSAVEC (feature bit 1) or XSAVES (feature
+ * bit 3) since SNP-capable hardware has these feature
+ * bits fixed as 1. KVM sets it to 0 in this case, but
+ * to avoid this becoming an issue it's safer to simply
+ * treat this as unsupported for SNP guests.
+ */
+ if (!(leaf->eax & (BIT(1) | BIT(3))))
+ return -EINVAL;
+
+ compacted = true;
+ }
+
+ xsave_size = snp_cpuid_calc_xsave_size(xcr0 | xss, compacted);
+ if (!xsave_size)
+ return -EINVAL;
+
+ leaf->ebx = xsave_size;
+ }
+ break;
+ case 0x8000001E:
+ snp_cpuid_hv(&leaf_hv);
+
+ /* extended APIC ID */
+ leaf->eax = leaf_hv.eax;
+ /* compute ID */
+ leaf->ebx = (leaf->ebx & GENMASK(31, 8)) | (leaf_hv.ebx & GENMASK(7, 0));
+ /* node ID */
+ leaf->ecx = (leaf->ecx & GENMASK(31, 8)) | (leaf_hv.ecx & GENMASK(7, 0));
+ break;
+ default:
+ /* No fix-ups needed, use values as-is. */
+ break;
+ }
+
+ return 0;
+}
+
+/*
+ * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value
+ * should be treated as fatal by caller.
+ */
+static int snp_cpuid(struct cpuid_leaf *leaf)
+{
+ const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+ const struct cpuid_maxes *maxes = snp_cpuid_get_maxes();
+
+ if (!cpuid_table->count)
+ return -EOPNOTSUPP;
+
+ if (!snp_cpuid_get_validated_func(leaf)) {
+ /*
+ * Some hypervisors will avoid keeping track of CPUID entries
+ * where all values are zero, since they can be handled the
+ * same as out-of-range values (all-zero). This is useful here
+ * as well as it allows virtually all guest configurations to
+ * work using a single SNP CPUID table.
+ *
+ * To allow for this, there is a need to distinguish between
+ * out-of-range entries and in-range zero entries, since the
+ * CPUID table entries are only a template that may need to be
+ * augmented with additional values for things like
+ * CPU-specific information during post-processing. So if it's
+ * not in the table, set the values to zero. Then, if they are
+ * within a valid CPUID range, proceed with post-processing
+ * using zeros as the initial values. Otherwise, skip
+ * post-processing and just return zeros immediately.
+ */
+ leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0;
+
+ /* Skip post-processing for out-of-range zero leafs. */
+ if (!(leaf->fn <= maxes->std_range ||
+ (leaf->fn >= 0x40000000 && leaf->fn <= maxes->hyp_range) ||
+ (leaf->fn >= 0x80000000 && leaf->fn <= maxes->ext_range)))
+ return 0;
+ }
+
+ return snp_cpuid_postprocess(leaf);
+}
+
/*
* Boot VC Handler - This is the first VC handler during boot, there is no GHCB
* page yet, so it only supports the MSR based communication with the
@@ -146,40 +560,33 @@
*/
void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code)
{
+ unsigned int subfn = lower_bits(regs->cx, 32);
unsigned int fn = lower_bits(regs->ax, 32);
- unsigned long val;
+ struct cpuid_leaf leaf;
+ int ret;
/* Only CPUID is supported via MSR protocol */
if (exit_code != SVM_EXIT_CPUID)
goto fail;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EAX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
- goto fail;
- regs->ax = val >> 32;
+ leaf.fn = fn;
+ leaf.subfn = subfn;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EBX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
- goto fail;
- regs->bx = val >> 32;
+ ret = snp_cpuid(&leaf);
+ if (!ret)
+ goto cpuid_done;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_ECX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ if (ret != -EOPNOTSUPP)
goto fail;
- regs->cx = val >> 32;
- sev_es_wr_ghcb_msr(GHCB_CPUID_REQ(fn, GHCB_CPUID_REQ_EDX));
- VMGEXIT();
- val = sev_es_rd_ghcb_msr();
- if (GHCB_RESP_CODE(val) != GHCB_MSR_CPUID_RESP)
+ if (sev_cpuid_hv(&leaf))
goto fail;
- regs->dx = val >> 32;
+
+cpuid_done:
+ regs->ax = leaf.eax;
+ regs->bx = leaf.ebx;
+ regs->cx = leaf.ecx;
+ regs->dx = leaf.edx;
/*
* This is a VC handler and the #VC is only raised when SEV-ES is
@@ -210,7 +617,7 @@
fail:
/* Terminate the guest */
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
}
static enum es_result vc_insn_string_check(struct es_em_ctxt *ctxt,
@@ -504,12 +911,37 @@
return ret;
}
+static int vc_handle_cpuid_snp(struct pt_regs *regs)
+{
+ struct cpuid_leaf leaf;
+ int ret;
+
+ leaf.fn = regs->ax;
+ leaf.subfn = regs->cx;
+ ret = snp_cpuid(&leaf);
+ if (!ret) {
+ regs->ax = leaf.eax;
+ regs->bx = leaf.ebx;
+ regs->cx = leaf.ecx;
+ regs->dx = leaf.edx;
+ }
+
+ return ret;
+}
+
static enum es_result vc_handle_cpuid(struct ghcb *ghcb,
struct es_em_ctxt *ctxt)
{
struct pt_regs *regs = ctxt->regs;
u32 cr4 = native_read_cr4();
enum es_result ret;
+ int snp_cpuid_ret;
+
+ snp_cpuid_ret = vc_handle_cpuid_snp(regs);
+ if (!snp_cpuid_ret)
+ return ES_OK;
+ if (snp_cpuid_ret != -EOPNOTSUPP)
+ return ES_VMM_ERROR;
ghcb_set_rax(ghcb, regs->ax);
ghcb_set_rcx(ghcb, regs->cx);
@@ -561,3 +993,71 @@
return ES_OK;
}
+
+struct cc_setup_data {
+ struct setup_data header;
+ u32 cc_blob_address;
+};
+
+/*
+ * Search for a Confidential Computing blob passed in as a setup_data entry
+ * via the Linux Boot Protocol.
+ */
+static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp)
+{
+ struct cc_setup_data *sd = NULL;
+ struct setup_data *hdr;
+
+ hdr = (struct setup_data *)bp->hdr.setup_data;
+
+ while (hdr) {
+ if (hdr->type == SETUP_CC_BLOB) {
+ sd = (struct cc_setup_data *)hdr;
+ return (struct cc_blob_sev_info *)(unsigned long)sd->cc_blob_address;
+ }
+ hdr = (struct setup_data *)hdr->next;
+ }
+
+ return NULL;
+}
+
+/*
+ * Initialize the kernel's copy of the SNP CPUID table, and set up the
+ * pointer that will be used to access it.
+ *
+ * Maintaining a direct mapping of the SNP CPUID table used by firmware would
+ * be possible as an alternative, but the approach is brittle since the
+ * mapping needs to be updated in sync with all the changes to virtual memory
+ * layout and related mapping facilities throughout the boot process.
+ */
+static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
+{
+ const struct snp_cpuid_table *cpuid_table_fw, *cpuid_table;
+ const struct cpuid_maxes *range_maxes;
+ struct cpuid_maxes local;
+ int i;
+
+ if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+ cpuid_table_fw = (const struct snp_cpuid_table *)cc_info->cpuid_phys;
+ if (!cpuid_table_fw->count || cpuid_table_fw->count > SNP_CPUID_COUNT_MAX)
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
+
+ cpuid_table = snp_cpuid_get_table();
+ range_maxes = snp_cpuid_get_maxes();
+ memcpy((void *)cpuid_table, cpuid_table_fw, sizeof(*cpuid_table));
+
+ /* Initialize CPUID ranges for range-checking. */
+ for (i = 0; i < cpuid_table->count; i++) {
+ const struct snp_cpuid_fn *fn = &cpuid_table->fn[i];
+
+ if (fn->eax_in == 0x0)
+ local.std_range = fn->eax;
+ else if (fn->eax_in == 0x40000000)
+ local.hyp_range = fn->eax;
+ else if (fn->eax_in == 0x80000000)
+ local.ext_range = fn->eax;
+ }
+ memcpy((void *)range_maxes, &local, sizeof(*range_maxes));
+}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 6161b14..be9ffef 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -18,6 +18,10 @@
#include <linux/memblock.h>
#include <linux/kernel.h>
#include <linux/mm.h>
+#include <linux/cpumask.h>
+#include <linux/efi.h>
+#include <linux/platform_device.h>
+#include <linux/io.h>
#include <asm/cpu_entry_area.h>
#include <asm/stacktrace.h>
@@ -26,13 +30,33 @@
#include <asm/fpu/internal.h>
#include <asm/processor.h>
#include <asm/realmode.h>
+#include <asm/setup.h>
#include <asm/traps.h>
#include <asm/svm.h>
#include <asm/smp.h>
#include <asm/cpu.h>
+#include <asm/apic.h>
+#include <asm/cpuid.h>
+#include <asm/cmdline.h>
#define DR7_RESET_VALUE 0x400
+/* AP INIT values as documented in the APM2 section "Processor Initialization State" */
+#define AP_INIT_CS_LIMIT 0xffff
+#define AP_INIT_DS_LIMIT 0xffff
+#define AP_INIT_LDTR_LIMIT 0xffff
+#define AP_INIT_GDTR_LIMIT 0xffff
+#define AP_INIT_IDTR_LIMIT 0xffff
+#define AP_INIT_TR_LIMIT 0xffff
+#define AP_INIT_RFLAGS_DEFAULT 0x2
+#define AP_INIT_DR6_DEFAULT 0xffff0ff0
+#define AP_INIT_GPAT_DEFAULT 0x0007040600070406ULL
+#define AP_INIT_XCR0_DEFAULT 0x1
+#define AP_INIT_X87_FTW_DEFAULT 0x5555
+#define AP_INIT_X87_FCW_DEFAULT 0x0040
+#define AP_INIT_CR0_DEFAULT 0x60000010
+#define AP_INIT_MXCSR_DEFAULT 0x1f80
+
/* For early boot hypervisor communication in SEV-ES enabled guests */
static struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE);
@@ -40,7 +64,10 @@
* Needs to be in the .data section because we need it NULL before bss is
* cleared
*/
-static struct ghcb __initdata *boot_ghcb;
+static struct ghcb *boot_ghcb __section(".data");
+
+/* Bitmap of SEV features supported by the hypervisor */
+static u64 sev_hv_features __ro_after_init;
/* #VC handler runtime per-CPU data */
struct sev_es_runtime_data {
@@ -86,8 +113,14 @@
static DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
DEFINE_STATIC_KEY_FALSE(sev_es_enable_key);
-/* Needed in vc_early_forward_exception */
-void do_early_exception(struct pt_regs *regs, int trapnr);
+static DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
+
+struct sev_config {
+ __u64 debug : 1,
+ __reserved : 63;
+};
+
+static struct sev_config sev_cfg __read_mostly;
static __always_inline bool on_vc_stack(struct pt_regs *regs)
{
@@ -209,9 +242,6 @@
return ghcb;
}
-/* Needed in vc_early_forward_exception */
-void do_early_exception(struct pt_regs *regs, int trapnr);
-
static inline u64 sev_es_rd_ghcb_msr(void)
{
return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
@@ -585,6 +615,495 @@
return ret;
}
+static void pvalidate_pages(unsigned long vaddr, unsigned int npages, bool validate)
+{
+ unsigned long vaddr_end;
+ int rc;
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
+ if (WARN(rc, "Failed to validate address 0x%lx ret %d", vaddr, rc))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE);
+
+ vaddr = vaddr + PAGE_SIZE;
+ }
+}
+
+static void __init early_set_pages_state(unsigned long paddr, unsigned int npages, enum psc_op op)
+{
+ unsigned long paddr_end;
+ u64 val;
+
+ paddr = paddr & PAGE_MASK;
+ paddr_end = paddr + (npages << PAGE_SHIFT);
+
+ while (paddr < paddr_end) {
+ /*
+ * Use the MSR protocol because this function can be called before
+ * the GHCB is established.
+ */
+ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
+ VMGEXIT();
+
+ val = sev_es_rd_ghcb_msr();
+
+ if (WARN(GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP,
+ "Wrong PSC response code: 0x%x\n",
+ (unsigned int)GHCB_RESP_CODE(val)))
+ goto e_term;
+
+ if (WARN(GHCB_MSR_PSC_RESP_VAL(val),
+ "Failed to change page state to '%s' paddr 0x%lx error 0x%llx\n",
+ op == SNP_PAGE_STATE_PRIVATE ? "private" : "shared",
+ paddr, GHCB_MSR_PSC_RESP_VAL(val)))
+ goto e_term;
+
+ paddr = paddr + PAGE_SIZE;
+ }
+
+ return;
+
+e_term:
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /*
+ * Ask the hypervisor to mark the memory pages as private in the RMP
+ * table.
+ */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ /* Validate the memory pages after they've been added in the RMP table. */
+ pvalidate_pages(vaddr, npages, true);
+}
+
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+ unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /* Invalidate the memory pages before they are marked shared in the RMP table. */
+ pvalidate_pages(vaddr, npages, false);
+
+ /* Ask hypervisor to mark the memory pages shared in the RMP table. */
+ early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
+{
+ unsigned long vaddr, npages;
+
+ vaddr = (unsigned long)__va(paddr);
+ npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (op == SNP_PAGE_STATE_PRIVATE)
+ early_snp_set_memory_private(vaddr, paddr, npages);
+ else if (op == SNP_PAGE_STATE_SHARED)
+ early_snp_set_memory_shared(vaddr, paddr, npages);
+ else
+ WARN(1, "invalid memory op %d\n", op);
+}
+
+static int vmgexit_psc(struct snp_psc_desc *desc)
+{
+ int cur_entry, end_entry, ret = 0;
+ struct snp_psc_desc *data;
+ struct ghcb_state state;
+ struct es_em_ctxt ctxt;
+ unsigned long flags;
+ struct ghcb *ghcb;
+
+ /*
+ * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+ * a per-CPU GHCB.
+ */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+ if (!ghcb) {
+ ret = 1;
+ goto out_unlock;
+ }
+
+ /* Copy the input desc into GHCB shared buffer */
+ data = (struct snp_psc_desc *)ghcb->shared_buffer;
+ memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc)));
+
+ /*
+ * As per the GHCB specification, the hypervisor can resume the guest
+ * before processing all the entries. Check whether all the entries
+ * are processed. If not, then keep retrying. Note, the hypervisor
+ * will update the data memory directly to indicate the status, so
+ * reference the data->hdr everywhere.
+ *
+ * The strategy here is to wait for the hypervisor to change the page
+ * state in the RMP table before guest accesses the memory pages. If the
+ * page state change was not successful, then later memory access will
+ * result in a crash.
+ */
+ cur_entry = data->hdr.cur_entry;
+ end_entry = data->hdr.end_entry;
+
+ while (data->hdr.cur_entry <= data->hdr.end_entry) {
+ ghcb_set_sw_scratch(ghcb, (u64)__pa(data));
+
+ /* This will advance the shared buffer data points to. */
+ ret = sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0);
+
+ /*
+ * Page State Change VMGEXIT can pass error code through
+ * exit_info_2.
+ */
+ if (WARN(ret || ghcb->save.sw_exit_info_2,
+ "SNP: PSC failed ret=%d exit_info_2=%llx\n",
+ ret, ghcb->save.sw_exit_info_2)) {
+ ret = 1;
+ goto out;
+ }
+
+ /* Verify that reserved bit is not set */
+ if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) {
+ ret = 1;
+ goto out;
+ }
+
+ /*
+ * Sanity check that entry processing is not going backwards.
+ * This will happen only if hypervisor is tricking us.
+ */
+ if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry,
+"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n",
+ end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) {
+ ret = 1;
+ goto out;
+ }
+ }
+
+out:
+ __sev_put_ghcb(&state);
+
+out_unlock:
+ local_irq_restore(flags);
+
+ return ret;
+}
+
+static void __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr,
+ unsigned long vaddr_end, int op)
+{
+ struct psc_hdr *hdr;
+ struct psc_entry *e;
+ unsigned long pfn;
+ int i;
+
+ hdr = &data->hdr;
+ e = data->entries;
+
+ memset(data, 0, sizeof(*data));
+ i = 0;
+
+ while (vaddr < vaddr_end) {
+ if (is_vmalloc_addr((void *)vaddr))
+ pfn = vmalloc_to_pfn((void *)vaddr);
+ else
+ pfn = __pa(vaddr) >> PAGE_SHIFT;
+
+ e->gfn = pfn;
+ e->operation = op;
+ hdr->end_entry = i;
+
+ /*
+ * Current SNP implementation doesn't keep track of the RMP page
+ * size so use 4K for simplicity.
+ */
+ e->pagesize = RMP_PG_SIZE_4K;
+
+ vaddr = vaddr + PAGE_SIZE;
+ e++;
+ i++;
+ }
+
+ if (vmgexit_psc(data))
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC);
+}
+
+static void set_pages_state(unsigned long vaddr, unsigned int npages, int op)
+{
+ unsigned long vaddr_end, next_vaddr;
+ struct snp_psc_desc *desc;
+
+ desc = kmalloc(sizeof(*desc), GFP_KERNEL_ACCOUNT);
+ if (!desc)
+ panic("SNP: failed to allocate memory for PSC descriptor\n");
+
+ vaddr = vaddr & PAGE_MASK;
+ vaddr_end = vaddr + (npages << PAGE_SHIFT);
+
+ while (vaddr < vaddr_end) {
+ /* Calculate the last vaddr that fits in one struct snp_psc_desc. */
+ next_vaddr = min_t(unsigned long, vaddr_end,
+ (VMGEXIT_PSC_MAX_ENTRY * PAGE_SIZE) + vaddr);
+
+ __set_pages_state(desc, vaddr, next_vaddr, op);
+
+ vaddr = next_vaddr;
+ }
+
+ kfree(desc);
+}
+
+void snp_set_memory_shared(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ pvalidate_pages(vaddr, npages, false);
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_SHARED);
+}
+
+void snp_set_memory_private(unsigned long vaddr, unsigned int npages)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ set_pages_state(vaddr, npages, SNP_PAGE_STATE_PRIVATE);
+
+ pvalidate_pages(vaddr, npages, true);
+}
+
+static int snp_set_vmsa(void *va, bool vmsa)
+{
+ u64 attrs;
+
+ /*
+ * Running at VMPL0 allows the kernel to change the VMSA bit for a page
+ * using the RMPADJUST instruction. However, for the instruction to
+ * succeed it must target the permissions of a lesser privileged
+ * (higher numbered) VMPL level, so use VMPL1 (refer to the RMPADJUST
+ * instruction in the AMD64 APM Volume 3).
+ */
+ attrs = 1;
+ if (vmsa)
+ attrs |= RMPADJUST_VMSA_PAGE_BIT;
+
+ return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, attrs);
+}
+
+#define __ATTR_BASE (SVM_SELECTOR_P_MASK | SVM_SELECTOR_S_MASK)
+#define INIT_CS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_READ_MASK | SVM_SELECTOR_CODE_MASK)
+#define INIT_DS_ATTRIBS (__ATTR_BASE | SVM_SELECTOR_WRITE_MASK)
+
+#define INIT_LDTR_ATTRIBS (SVM_SELECTOR_P_MASK | 2)
+#define INIT_TR_ATTRIBS (SVM_SELECTOR_P_MASK | 3)
+
+static void *snp_alloc_vmsa_page(void)
+{
+ struct page *p;
+
+ /*
+ * Allocate VMSA page to work around the SNP erratum where the CPU will
+ * incorrectly signal an RMP violation #PF if a large page (2MB or 1GB)
+ * collides with the RMP entry of VMSA page. The recommended workaround
+ * is to not use a large page.
+ *
+ * Allocate an 8k page which is also 8k-aligned.
+ */
+ p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+ if (!p)
+ return NULL;
+
+ split_page(p, 1);
+
+ /* Free the first 4k. This page may be 2M/1G aligned and cannot be used. */
+ __free_page(p);
+
+ return page_address(p + 1);
+}
+
+static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa)
+{
+ int err;
+
+ err = snp_set_vmsa(vmsa, false);
+ if (err)
+ pr_err("clear VMSA page failed (%u), leaking page\n", err);
+ else
+ free_page((unsigned long)vmsa);
+}
+
+static int wakeup_cpu_via_vmgexit(int apic_id, unsigned long start_ip)
+{
+ struct sev_es_save_area *cur_vmsa, *vmsa;
+ struct ghcb_state state;
+ unsigned long flags;
+ struct ghcb *ghcb;
+ u8 sipi_vector;
+ int cpu, ret;
+ u64 cr4;
+
+ /*
+ * The hypervisor SNP feature support check has happened earlier, just check
+ * the AP_CREATION one here.
+ */
+ if (!(sev_hv_features & GHCB_HV_FT_SNP_AP_CREATION))
+ return -EOPNOTSUPP;
+
+ /*
+ * Verify the desired start IP against the known trampoline start IP
+ * to catch any future new trampolines that may be introduced that
+ * would require a new protected guest entry point.
+ */
+ if (WARN_ONCE(start_ip != real_mode_header->trampoline_start,
+ "Unsupported SNP start_ip: %lx\n", start_ip))
+ return -EINVAL;
+
+ /* Override start_ip with known protected guest start IP */
+ start_ip = real_mode_header->sev_es_trampoline_start;
+
+ /* Find the logical CPU for the APIC ID */
+ for_each_present_cpu(cpu) {
+ if (arch_match_cpu_phys_id(cpu, apic_id))
+ break;
+ }
+ if (cpu >= nr_cpu_ids)
+ return -EINVAL;
+
+ cur_vmsa = per_cpu(sev_vmsa, cpu);
+
+ /*
+ * A new VMSA is created each time because there is no guarantee that
+ * the current VMSA is the kernels or that the vCPU is not running. If
+ * an attempt was done to use the current VMSA with a running vCPU, a
+ * #VMEXIT of that vCPU would wipe out all of the settings being done
+ * here.
+ */
+ vmsa = (struct sev_es_save_area *)snp_alloc_vmsa_page();
+ if (!vmsa)
+ return -ENOMEM;
+
+ /* CR4 should maintain the MCE value */
+ cr4 = native_read_cr4() & X86_CR4_MCE;
+
+ /* Set the CS value based on the start_ip converted to a SIPI vector */
+ sipi_vector = (start_ip >> 12);
+ vmsa->cs.base = sipi_vector << 12;
+ vmsa->cs.limit = AP_INIT_CS_LIMIT;
+ vmsa->cs.attrib = INIT_CS_ATTRIBS;
+ vmsa->cs.selector = sipi_vector << 8;
+
+ /* Set the RIP value based on start_ip */
+ vmsa->rip = start_ip & 0xfff;
+
+ /* Set AP INIT defaults as documented in the APM */
+ vmsa->ds.limit = AP_INIT_DS_LIMIT;
+ vmsa->ds.attrib = INIT_DS_ATTRIBS;
+ vmsa->es = vmsa->ds;
+ vmsa->fs = vmsa->ds;
+ vmsa->gs = vmsa->ds;
+ vmsa->ss = vmsa->ds;
+
+ vmsa->gdtr.limit = AP_INIT_GDTR_LIMIT;
+ vmsa->ldtr.limit = AP_INIT_LDTR_LIMIT;
+ vmsa->ldtr.attrib = INIT_LDTR_ATTRIBS;
+ vmsa->idtr.limit = AP_INIT_IDTR_LIMIT;
+ vmsa->tr.limit = AP_INIT_TR_LIMIT;
+ vmsa->tr.attrib = INIT_TR_ATTRIBS;
+
+ vmsa->cr4 = cr4;
+ vmsa->cr0 = AP_INIT_CR0_DEFAULT;
+ vmsa->dr7 = DR7_RESET_VALUE;
+ vmsa->dr6 = AP_INIT_DR6_DEFAULT;
+ vmsa->rflags = AP_INIT_RFLAGS_DEFAULT;
+ vmsa->g_pat = AP_INIT_GPAT_DEFAULT;
+ vmsa->xcr0 = AP_INIT_XCR0_DEFAULT;
+ vmsa->mxcsr = AP_INIT_MXCSR_DEFAULT;
+ vmsa->x87_ftw = AP_INIT_X87_FTW_DEFAULT;
+ vmsa->x87_fcw = AP_INIT_X87_FCW_DEFAULT;
+
+ /* SVME must be set. */
+ vmsa->efer = EFER_SVME;
+
+ /*
+ * Set the SNP-specific fields for this VMSA:
+ * VMPL level
+ * SEV_FEATURES (matches the SEV STATUS MSR right shifted 2 bits)
+ */
+ vmsa->vmpl = 0;
+ vmsa->sev_features = sev_status >> 2;
+
+ /* Switch the page over to a VMSA page now that it is initialized */
+ ret = snp_set_vmsa(vmsa, true);
+ if (ret) {
+ pr_err("set VMSA page failed (%u)\n", ret);
+ free_page((unsigned long)vmsa);
+
+ return -EINVAL;
+ }
+
+ /* Issue VMGEXIT AP Creation NAE event */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+
+ vc_ghcb_invalidate(ghcb);
+ ghcb_set_rax(ghcb, vmsa->sev_features);
+ ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_CREATION);
+ ghcb_set_sw_exit_info_1(ghcb, ((u64)apic_id << 32) | SVM_VMGEXIT_AP_CREATE);
+ ghcb_set_sw_exit_info_2(ghcb, __pa(vmsa));
+
+ sev_es_wr_ghcb_msr(__pa(ghcb));
+ VMGEXIT();
+
+ if (!ghcb_sw_exit_info_1_is_valid(ghcb) ||
+ lower_32_bits(ghcb->save.sw_exit_info_1)) {
+ pr_err("SNP AP Creation error\n");
+ ret = -EINVAL;
+ }
+
+ __sev_put_ghcb(&state);
+
+ local_irq_restore(flags);
+
+ /* Perform cleanup if there was an error */
+ if (ret) {
+ snp_cleanup_vmsa(vmsa);
+ vmsa = NULL;
+ }
+
+ /* Free up any previous VMSA page */
+ if (cur_vmsa)
+ snp_cleanup_vmsa(cur_vmsa);
+
+ /* Record the current VMSA page */
+ per_cpu(sev_vmsa, cpu) = vmsa;
+
+ return ret;
+}
+
+void snp_set_wakeup_secondary_cpu(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return;
+
+ /*
+ * Always set this override if SNP is enabled. This makes it the
+ * required method to start APs under SNP. If the hypervisor does
+ * not support AP creation, then no APs will be started.
+ */
+ apic->wakeup_secondary_cpu = wakeup_cpu_via_vmgexit;
+}
+
int sev_es_setup_ap_jump_table(struct real_mode_header *rmh)
{
u16 startup_cs, startup_ip;
@@ -675,15 +1194,39 @@
return ret;
}
-/*
- * This function runs on the first #VC exception after the kernel
- * switched to virtual addresses.
- */
-static bool __init sev_es_setup_ghcb(void)
+static void snp_register_per_cpu_ghcb(void)
{
+ struct sev_es_runtime_data *data;
+ struct ghcb *ghcb;
+
+ data = this_cpu_read(runtime_data);
+ ghcb = &data->ghcb_page;
+
+ snp_register_ghcb_early(__pa(ghcb));
+}
+
+void setup_ghcb(void)
+{
+ if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT))
+ return;
+
/* First make sure the hypervisor talks a supported protocol. */
if (!sev_es_negotiate_protocol())
- return false;
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+ /*
+ * Check whether the runtime #VC exception handler is active. It uses
+ * the per-CPU GHCB page which is set up by sev_es_init_vc_handling().
+ *
+ * If SNP is active, register the per-CPU GHCB page so that the runtime
+ * exception handler can use it.
+ */
+ if (initial_vc_handler == (unsigned long)kernel_exc_vmm_communication) {
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_per_cpu_ghcb();
+
+ return;
+ }
/*
* Clear the boot_ghcb. The first exception comes in before the bss
@@ -694,7 +1237,9 @@
/* Alright - Make the boot-ghcb public */
boot_ghcb = &boot_ghcb_page;
- return true;
+ /* SNP guest requires that GHCB GPA must be registered. */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ snp_register_ghcb_early(__pa(&boot_ghcb_page));
}
#ifdef CONFIG_HOTPLUG_CPU
@@ -797,6 +1342,17 @@
if (!sev_es_check_cpu_features())
panic("SEV-ES CPU Features missing");
+ /*
+ * SNP is supported in v2 of the GHCB spec which mandates support for HV
+ * features.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+ sev_hv_features = get_hv_features();
+
+ if (!(sev_hv_features & GHCB_HV_FT_SNP))
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+ }
+
/* Enable SEV-ES special handling */
static_branch_enable(&sev_es_enable_key);
@@ -1448,7 +2004,7 @@
show_regs(regs);
/* Ask hypervisor to sev_es_terminate */
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
/* If that fails and we get here - just panic */
panic("Returned from Terminate-Request to Hypervisor\n");
@@ -1494,10 +2050,6 @@
struct es_em_ctxt ctxt;
enum es_result result;
- /* Do initial setup or terminate the guest */
- if (unlikely(boot_ghcb == NULL && !sev_es_setup_ghcb()))
- sev_es_terminate(GHCB_SEV_ES_REASON_GENERAL_REQUEST);
-
vc_ghcb_invalidate(boot_ghcb);
result = vc_init_em_ctxt(&ctxt, regs, exit_code);
@@ -1539,3 +2091,237 @@
while (true)
halt();
}
+
+/*
+ * Initial set up of SNP relies on information provided by the
+ * Confidential Computing blob, which can be passed to the kernel
+ * in the following ways, depending on how it is booted:
+ *
+ * - when booted via the boot/decompress kernel:
+ * - via boot_params
+ *
+ * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH):
+ * - via a setup_data entry, as defined by the Linux Boot Protocol
+ *
+ * Scan for the blob in that order.
+ */
+static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ /* Boot kernel would have passed the CC blob via boot_params. */
+ if (bp->cc_blob_address) {
+ cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address;
+ goto found_cc_info;
+ }
+
+ /*
+ * If kernel was booted directly, without the use of the
+ * boot/decompression kernel, the CC blob may have been passed via
+ * setup_data instead.
+ */
+ cc_info = find_cc_blob_setup_data(bp);
+ if (!cc_info)
+ return NULL;
+
+found_cc_info:
+ if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC)
+ snp_abort();
+
+ return cc_info;
+}
+
+bool __init snp_init(struct boot_params *bp)
+{
+ struct cc_blob_sev_info *cc_info;
+
+ if (!bp)
+ return false;
+
+ cc_info = find_cc_blob(bp);
+ if (!cc_info)
+ return false;
+
+ setup_cpuid_table(cc_info);
+
+ /*
+ * The CC blob will be used later to access the secrets page. Cache
+ * it here like the boot kernel does.
+ */
+ bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+ return true;
+}
+
+void __init snp_abort(void)
+{
+ sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}
+
+static void dump_cpuid_table(void)
+{
+ const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+ int i = 0;
+
+ pr_info("count=%d reserved=0x%x reserved2=0x%llx\n",
+ cpuid_table->count, cpuid_table->__reserved1, cpuid_table->__reserved2);
+
+ for (i = 0; i < SNP_CPUID_COUNT_MAX; i++) {
+ const struct snp_cpuid_fn *fn = &cpuid_table->fn[i];
+
+ pr_info("index=%3d fn=0x%08x subfn=0x%08x: eax=0x%08x ebx=0x%08x ecx=0x%08x edx=0x%08x xcr0_in=0x%016llx xss_in=0x%016llx reserved=0x%016llx\n",
+ i, fn->eax_in, fn->ecx_in, fn->eax, fn->ebx, fn->ecx,
+ fn->edx, fn->xcr0_in, fn->xss_in, fn->__reserved);
+ }
+}
+
+/*
+ * It is useful from an auditing/testing perspective to provide an easy way
+ * for the guest owner to know that the CPUID table has been initialized as
+ * expected, but that initialization happens too early in boot to print any
+ * sort of indicator, and there's not really any other good place to do it,
+ * so do it here.
+ */
+static int __init report_cpuid_table(void)
+{
+ const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+
+ if (!cpuid_table->count)
+ return 0;
+
+ pr_info("Using SNP CPUID table, %d entries present.\n",
+ cpuid_table->count);
+
+ if (sev_cfg.debug)
+ dump_cpuid_table();
+
+ return 0;
+}
+arch_initcall(report_cpuid_table);
+
+static int __init init_sev_config(char *str)
+{
+ char *s;
+
+ while ((s = strsep(&str, ","))) {
+ if (!strcmp(s, "debug")) {
+ sev_cfg.debug = true;
+ continue;
+ }
+
+ pr_info("SEV command-line option '%s' was not recognized\n", s);
+ }
+
+ return 1;
+}
+__setup("sev=", init_sev_config);
+
+int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, unsigned long *fw_err)
+{
+ struct ghcb_state state;
+ struct es_em_ctxt ctxt;
+ unsigned long flags;
+ struct ghcb *ghcb;
+ int ret;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return -ENODEV;
+
+ if (!fw_err)
+ return -EINVAL;
+
+ /*
+ * __sev_get_ghcb() needs to run with IRQs disabled because it is using
+ * a per-CPU GHCB.
+ */
+ local_irq_save(flags);
+
+ ghcb = __sev_get_ghcb(&state);
+ if (!ghcb) {
+ ret = -EIO;
+ goto e_restore_irq;
+ }
+
+ vc_ghcb_invalidate(ghcb);
+
+ if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST) {
+ ghcb_set_rax(ghcb, input->data_gpa);
+ ghcb_set_rbx(ghcb, input->data_npages);
+ }
+
+ ret = sev_es_ghcb_hv_call(ghcb, &ctxt, exit_code, input->req_gpa, input->resp_gpa);
+ if (ret)
+ goto e_put;
+
+ if (ghcb->save.sw_exit_info_2) {
+ /* Number of expected pages are returned in RBX */
+ if (exit_code == SVM_VMGEXIT_EXT_GUEST_REQUEST &&
+ ghcb->save.sw_exit_info_2 == SNP_GUEST_REQ_INVALID_LEN)
+ input->data_npages = ghcb_get_rbx(ghcb);
+
+ *fw_err = ghcb->save.sw_exit_info_2;
+
+ ret = -EIO;
+ }
+
+e_put:
+ __sev_put_ghcb(&state);
+e_restore_irq:
+ local_irq_restore(flags);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(snp_issue_guest_request);
+
+static struct platform_device guest_req_device = {
+ .name = "snp-guest",
+ .id = -1,
+};
+
+static u64 get_secrets_page(void)
+{
+ u64 pa_data = boot_params.cc_blob_address;
+ struct cc_blob_sev_info info;
+ void *map;
+
+ /*
+ * The CC blob contains the address of the secrets page, check if the
+ * blob is present.
+ */
+ if (!pa_data)
+ return 0;
+
+ map = early_memremap(pa_data, sizeof(info));
+ memcpy(&info, map, sizeof(info));
+ early_memunmap(map, sizeof(info));
+
+ /* smoke-test the secrets page passed */
+ if (!info.secrets_phys || info.secrets_len != PAGE_SIZE)
+ return 0;
+
+ return info.secrets_phys;
+}
+
+static int __init snp_init_platform_device(void)
+{
+ struct snp_guest_platform_data data;
+ u64 gpa;
+
+ if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ return -ENODEV;
+
+ gpa = get_secrets_page();
+ if (!gpa)
+ return -ENODEV;
+
+ data.secrets_gpa = gpa;
+ if (platform_device_add_data(&guest_req_device, &data, sizeof(data)))
+ return -ENODEV;
+
+ if (platform_device_register(&guest_req_device))
+ return -ENODEV;
+
+ pr_info("SNP guest platform device initialized.\n");
+ return 0;
+}
+device_initcall(snp_init_platform_device);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 714f66a..0446358 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -82,6 +82,7 @@
#include <asm/spec-ctrl.h>
#include <asm/hw_irq.h>
#include <asm/stackprotector.h>
+#include <asm/sev.h>
#ifdef CONFIG_ACPI_CPPC_LIB
#include <acpi/cppc_acpi.h>
@@ -1391,6 +1392,8 @@
smp_quirk_init_udelay();
speculative_store_bypass_ht_init();
+
+ snp_set_wakeup_secondary_cpu();
}
void arch_thaw_secondary_cpus_begin(void)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index d3e3b16e..267ff34 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -129,6 +129,11 @@
static void default_nmi_init(void) { };
+static void enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool enc) { }
+static bool enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return false; }
+static bool enc_tlb_flush_required_noop(bool enc) { return false; }
+static bool enc_cache_flush_required_noop(void) { return false; }
+
struct x86_platform_ops x86_platform __ro_after_init = {
.calibrate_cpu = native_calibrate_cpu_early,
.calibrate_tsc = native_calibrate_tsc,
@@ -138,9 +143,16 @@
.is_untracked_pat_range = is_ISA_range,
.nmi_init = default_nmi_init,
.get_nmi_reason = default_get_nmi_reason,
- .save_sched_clock_state = tsc_save_sched_clock_state,
- .restore_sched_clock_state = tsc_restore_sched_clock_state,
+ .save_sched_clock_state = tsc_save_sched_clock_state,
+ .restore_sched_clock_state = tsc_restore_sched_clock_state,
.hyper.pin_vcpu = x86_op_int_noop,
+
+ .guest = {
+ .enc_status_change_prepare = enc_status_change_prepare_noop,
+ .enc_status_change_finish = enc_status_change_finish_noop,
+ .enc_tlb_flush_required = enc_tlb_flush_required_noop,
+ .enc_cache_flush_required = enc_cache_flush_required_noop,
+ },
};
EXPORT_SYMBOL_GPL(x86_platform);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6222aa3..18456cd 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -19,6 +19,7 @@
#include <asm/user.h>
#include <asm/fpu/xstate.h>
#include <asm/sgx.h>
+#include <asm/cpuid.h>
#include "cpuid.h"
#include "lapic.h"
#include "mmu.h"
@@ -599,22 +600,8 @@
cpuid_count(entry->function, entry->index,
&entry->eax, &entry->ebx, &entry->ecx, &entry->edx);
- switch (function) {
- case 4:
- case 7:
- case 0xb:
- case 0xd:
- case 0xf:
- case 0x10:
- case 0x12:
- case 0x14:
- case 0x17:
- case 0x18:
- case 0x1f:
- case 0x8000001d:
+ if (cpuid_function_is_indexed(function))
entry->flags |= KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
- break;
- }
return entry;
}
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index a1811f5..7dd3463 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -144,45 +144,6 @@
FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte);
}
-static int FNAME(cmpxchg_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
- pt_element_t __user *ptep_user, unsigned index,
- pt_element_t orig_pte, pt_element_t new_pte)
-{
- int r = -EFAULT;
-
- if (!user_access_begin(ptep_user, sizeof(pt_element_t)))
- return -EFAULT;
-
-#ifdef CMPXCHG
- asm volatile("1:" LOCK_PREFIX CMPXCHG " %[new], %[ptr]\n"
- "mov $0, %[r]\n"
- "setnz %b[r]\n"
- "2:"
- _ASM_EXTABLE_UA(1b, 2b)
- : [ptr] "+m" (*ptep_user),
- [old] "+a" (orig_pte),
- [r] "+q" (r)
- : [new] "r" (new_pte)
- : "memory");
-#else
- asm volatile("1:" LOCK_PREFIX "cmpxchg8b %[ptr]\n"
- "movl $0, %[r]\n"
- "jz 2f\n"
- "incl %[r]\n"
- "2:"
- _ASM_EXTABLE_UA(1b, 2b)
- : [ptr] "+m" (*ptep_user),
- [old] "+A" (orig_pte),
- [r] "+rm" (r)
- : [new_lo] "b" ((u32)new_pte),
- [new_hi] "c" ((u32)(new_pte >> 32))
- : "memory");
-#endif
-
- user_access_end();
- return r;
-}
-
static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
struct kvm_mmu_page *sp, u64 *spte,
u64 gpte)
@@ -281,7 +242,7 @@
if (unlikely(!walker->pte_writable[level - 1]))
continue;
- ret = FNAME(cmpxchg_gpte)(vcpu, mmu, ptep_user, index, orig_pte, pte);
+ ret = __try_cmpxchg_user(ptep_user, &orig_pte, pte, fault);
if (ret)
return ret;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 0f3d29f..8e0d483 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -551,12 +551,20 @@
static int sev_es_sync_vmsa(struct vcpu_svm *svm)
{
- struct vmcb_save_area *save = &svm->vmcb->save;
+ struct sev_es_save_area *save = svm->sev_es.vmsa;
/* Check some debug related fields before encrypting the VMSA */
- if (svm->vcpu.guest_debug || (save->dr7 & ~DR7_FIXED_1))
+ if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
return -EINVAL;
+ /*
+ * SEV-ES will use a VMSA that is pointed to by the VMCB, not
+ * the traditional VMSA that is part of the VMCB. Copy the
+ * traditional VMSA as it has been built so far (in prep
+ * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
+ */
+ memcpy(save, &svm->vmcb->save, sizeof(svm->vmcb->save));
+
/* Sync registgers */
save->rax = svm->vcpu.arch.regs[VCPU_REGS_RAX];
save->rbx = svm->vcpu.arch.regs[VCPU_REGS_RBX];
@@ -584,14 +592,6 @@
save->xss = svm->vcpu.arch.ia32_xss;
save->dr6 = svm->vcpu.arch.dr6;
- /*
- * SEV-ES will use a VMSA that is pointed to by the VMCB, not
- * the traditional VMSA that is part of the VMCB. Copy the
- * traditional VMSA as it has been built so far (in prep
- * for LAUNCH_UPDATE_VMSA) to be the initial SEV-ES state.
- */
- memcpy(svm->vmsa, save, sizeof(*save));
-
return 0;
}
@@ -612,11 +612,11 @@
* the VMSA memory content (i.e it will write the same memory region
* with the guest's key), so invalidate it first.
*/
- clflush_cache_range(svm->vmsa, PAGE_SIZE);
+ clflush_cache_range(svm->sev_es.vmsa, PAGE_SIZE);
vmsa.reserved = 0;
vmsa.handle = to_kvm_svm(kvm)->sev_info.handle;
- vmsa.address = __sme_pa(svm->vmsa);
+ vmsa.address = __sme_pa(svm->sev_es.vmsa);
vmsa.len = PAGE_SIZE;
ret = sev_issue_cmd(kvm, SEV_CMD_LAUNCH_UPDATE_VMSA, &vmsa, error);
if (ret)
@@ -2057,16 +2057,16 @@
svm = to_svm(vcpu);
if (vcpu->arch.guest_state_protected)
- sev_flush_guest_memory(svm, svm->vmsa, PAGE_SIZE);
- __free_page(virt_to_page(svm->vmsa));
+ sev_flush_guest_memory(svm, svm->sev_es.vmsa, PAGE_SIZE);
+ __free_page(virt_to_page(svm->sev_es.vmsa));
- if (svm->ghcb_sa_free)
- kfree(svm->ghcb_sa);
+ if (svm->sev_es.ghcb_sa_free)
+ kfree(svm->sev_es.ghcb_sa);
}
static void dump_ghcb(struct vcpu_svm *svm)
{
- struct ghcb *ghcb = svm->ghcb;
+ struct ghcb *ghcb = svm->sev_es.ghcb;
unsigned int nbits;
/* Re-use the dump_invalid_vmcb module parameter */
@@ -2092,7 +2092,7 @@
static void sev_es_sync_to_ghcb(struct vcpu_svm *svm)
{
struct kvm_vcpu *vcpu = &svm->vcpu;
- struct ghcb *ghcb = svm->ghcb;
+ struct ghcb *ghcb = svm->sev_es.ghcb;
/*
* The GHCB protocol so far allows for the following data
@@ -2112,7 +2112,7 @@
{
struct vmcb_control_area *control = &svm->vmcb->control;
struct kvm_vcpu *vcpu = &svm->vcpu;
- struct ghcb *ghcb = svm->ghcb;
+ struct ghcb *ghcb = svm->sev_es.ghcb;
u64 exit_code;
/*
@@ -2159,7 +2159,7 @@
struct ghcb *ghcb;
u64 exit_code = 0;
- ghcb = svm->ghcb;
+ ghcb = svm->sev_es.ghcb;
/* Only GHCB Usage code 0 is supported */
if (ghcb->ghcb_usage)
@@ -2277,33 +2277,34 @@
void sev_es_unmap_ghcb(struct vcpu_svm *svm)
{
- if (!svm->ghcb)
+ if (!svm->sev_es.ghcb)
return;
- if (svm->ghcb_sa_free) {
+ if (svm->sev_es.ghcb_sa_free) {
/*
* The scratch area lives outside the GHCB, so there is a
* buffer that, depending on the operation performed, may
* need to be synced, then freed.
*/
- if (svm->ghcb_sa_sync) {
+ if (svm->sev_es.ghcb_sa_sync) {
kvm_write_guest(svm->vcpu.kvm,
- ghcb_get_sw_scratch(svm->ghcb),
- svm->ghcb_sa, svm->ghcb_sa_len);
- svm->ghcb_sa_sync = false;
+ ghcb_get_sw_scratch(svm->sev_es.ghcb),
+ svm->sev_es.ghcb_sa,
+ svm->sev_es.ghcb_sa_len);
+ svm->sev_es.ghcb_sa_sync = false;
}
- kfree(svm->ghcb_sa);
- svm->ghcb_sa = NULL;
- svm->ghcb_sa_free = false;
+ kfree(svm->sev_es.ghcb_sa);
+ svm->sev_es.ghcb_sa = NULL;
+ svm->sev_es.ghcb_sa_free = false;
}
- trace_kvm_vmgexit_exit(svm->vcpu.vcpu_id, svm->ghcb);
+ trace_kvm_vmgexit_exit(svm->vcpu.vcpu_id, svm->sev_es.ghcb);
sev_es_sync_to_ghcb(svm);
- kvm_vcpu_unmap(&svm->vcpu, &svm->ghcb_map, true);
- svm->ghcb = NULL;
+ kvm_vcpu_unmap(&svm->vcpu, &svm->sev_es.ghcb_map, true);
+ svm->sev_es.ghcb = NULL;
}
void pre_sev_run(struct vcpu_svm *svm, int cpu)
@@ -2333,7 +2334,7 @@
static int setup_vmgexit_scratch(struct vcpu_svm *svm, bool sync, u64 len)
{
struct vmcb_control_area *control = &svm->vmcb->control;
- struct ghcb *ghcb = svm->ghcb;
+ struct ghcb *ghcb = svm->sev_es.ghcb;
u64 ghcb_scratch_beg, ghcb_scratch_end;
u64 scratch_gpa_beg, scratch_gpa_end;
void *scratch_va;
@@ -2369,7 +2370,7 @@
return -EINVAL;
}
- scratch_va = (void *)svm->ghcb;
+ scratch_va = (void *)svm->sev_es.ghcb;
scratch_va += (scratch_gpa_beg - control->ghcb_gpa);
} else {
/*
@@ -2399,12 +2400,12 @@
* the vCPU next time (i.e. a read was requested so the data
* must be written back to the guest memory).
*/
- svm->ghcb_sa_sync = sync;
- svm->ghcb_sa_free = true;
+ svm->sev_es.ghcb_sa_sync = sync;
+ svm->sev_es.ghcb_sa_free = true;
}
- svm->ghcb_sa = scratch_va;
- svm->ghcb_sa_len = len;
+ svm->sev_es.ghcb_sa = scratch_va;
+ svm->sev_es.ghcb_sa_len = len;
return 0;
}
@@ -2523,15 +2524,15 @@
return -EINVAL;
}
- if (kvm_vcpu_map(vcpu, ghcb_gpa >> PAGE_SHIFT, &svm->ghcb_map)) {
+ if (kvm_vcpu_map(vcpu, ghcb_gpa >> PAGE_SHIFT, &svm->sev_es.ghcb_map)) {
/* Unable to map GHCB from guest */
vcpu_unimpl(vcpu, "vmgexit: error mapping GHCB [%#llx] from guest\n",
ghcb_gpa);
return -EINVAL;
}
- svm->ghcb = svm->ghcb_map.hva;
- ghcb = svm->ghcb_map.hva;
+ svm->sev_es.ghcb = svm->sev_es.ghcb_map.hva;
+ ghcb = svm->sev_es.ghcb_map.hva;
trace_kvm_vmgexit_enter(vcpu->vcpu_id, ghcb);
@@ -2554,7 +2555,7 @@
ret = kvm_sev_es_mmio_read(vcpu,
control->exit_info_1,
control->exit_info_2,
- svm->ghcb_sa);
+ svm->sev_es.ghcb_sa);
break;
case SVM_VMGEXIT_MMIO_WRITE:
ret = setup_vmgexit_scratch(svm, false, control->exit_info_2);
@@ -2564,7 +2565,7 @@
ret = kvm_sev_es_mmio_write(vcpu,
control->exit_info_1,
control->exit_info_2,
- svm->ghcb_sa);
+ svm->sev_es.ghcb_sa);
break;
case SVM_VMGEXIT_NMI_COMPLETE:
ret = svm_invoke_exit_handler(vcpu, SVM_EXIT_IRET);
@@ -2627,7 +2628,8 @@
if (r)
return r;
- return kvm_sev_es_string_io(&svm->vcpu, size, port, svm->ghcb_sa, count, in);
+ return kvm_sev_es_string_io(&svm->vcpu, size, port, svm->sev_es.ghcb_sa,
+ count, in);
}
void sev_es_init_vmcb(struct vcpu_svm *svm)
@@ -2642,7 +2644,7 @@
* VMCB page. Do not include the encryption mask on the VMSA physical
* address since hardware will access it using the guest key.
*/
- svm->vmcb->control.vmsa_pa = __pa(svm->vmsa);
+ svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa);
/* Can't intercept CR register access, HV can't modify CR registers */
svm_clr_intercept(svm, INTERCEPT_CR0_READ);
@@ -2689,7 +2691,7 @@
void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu)
{
struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
- struct vmcb_save_area *hostsa;
+ struct sev_es_save_area *hostsa;
/*
* As an SEV-ES guest, hardware will restore the host state on VMEXIT,
@@ -2699,7 +2701,7 @@
vmsave(__sme_page_pa(sd->save_area));
/* XCR0 is restored on VMEXIT, save the current host value */
- hostsa = (struct vmcb_save_area *)(page_address(sd->save_area) + 0x400);
+ hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
hostsa->xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
/* PKRU is restored on VMEXIT, save the current host value */
@@ -2714,8 +2716,8 @@
struct vcpu_svm *svm = to_svm(vcpu);
/* First SIPI: Use the values as initially set by the VMM */
- if (!svm->received_first_sipi) {
- svm->received_first_sipi = true;
+ if (!svm->sev_es.received_first_sipi) {
+ svm->sev_es.received_first_sipi = true;
return;
}
@@ -2724,8 +2726,8 @@
* the guest will set the CS and RIP. Set SW_EXIT_INFO_2 to a
* non-zero value.
*/
- if (!svm->ghcb)
+ if (!svm->sev_es.ghcb)
return;
- ghcb_set_sw_exit_info_2(svm->ghcb, 1);
+ ghcb_set_sw_exit_info_2(svm->sev_es.ghcb, 1);
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 059d9c2..8ec106b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -25,6 +25,7 @@
#include <linux/pagemap.h>
#include <linux/swap.h>
#include <linux/rwsem.h>
+#include <linux/cc_platform.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
@@ -457,7 +458,7 @@
return 0;
}
- if (sev_active()) {
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
pr_info("KVM is unsupported when running as an SEV guest\n");
return 0;
}
@@ -1392,7 +1393,7 @@
svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
if (vmsa_page)
- svm->vmsa = page_address(vmsa_page);
+ svm->sev_es.vmsa = page_address(vmsa_page);
svm->guest_state_loaded = false;
@@ -2807,11 +2808,11 @@
static int svm_complete_emulated_msr(struct kvm_vcpu *vcpu, int err)
{
struct vcpu_svm *svm = to_svm(vcpu);
- if (!err || !sev_es_guest(vcpu->kvm) || WARN_ON_ONCE(!svm->ghcb))
+ if (!err || !sev_es_guest(vcpu->kvm) || WARN_ON_ONCE(!svm->sev_es.ghcb))
return kvm_complete_insn_gp(vcpu, err);
- ghcb_set_sw_exit_info_1(svm->ghcb, 1);
- ghcb_set_sw_exit_info_2(svm->ghcb,
+ ghcb_set_sw_exit_info_1(svm->sev_es.ghcb, 1);
+ ghcb_set_sw_exit_info_2(svm->sev_es.ghcb,
X86_TRAP_GP |
SVM_EVTINJ_TYPE_EXEPT |
SVM_EVTINJ_VALID);
@@ -3254,8 +3255,8 @@
"tr:",
save01->tr.selector, save01->tr.attrib,
save01->tr.limit, save01->tr.base);
- pr_err("cpl: %d efer: %016llx\n",
- save->cpl, save->efer);
+ pr_err("vmpl: %d cpl: %d efer: %016llx\n",
+ save->vmpl, save->cpl, save->efer);
pr_err("%-15s %016llx %-13s %016llx\n",
"cr0:", save->cr0, "cr2:", save->cr2);
pr_err("%-15s %016llx %-13s %016llx\n",
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1d9b1a9..4737d50 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -125,6 +125,20 @@
bool initialized;
};
+struct vcpu_sev_es_state {
+ /* SEV-ES support */
+ struct sev_es_save_area *vmsa;
+ struct ghcb *ghcb;
+ struct kvm_host_map ghcb_map;
+ bool received_first_sipi;
+
+ /* SEV-ES scratch area support */
+ void *ghcb_sa;
+ u32 ghcb_sa_len;
+ bool ghcb_sa_sync;
+ bool ghcb_sa_free;
+};
+
struct vcpu_svm {
struct kvm_vcpu vcpu;
/* vmcb always points at current_vmcb->ptr, it's purely a shorthand. */
@@ -185,17 +199,7 @@
DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
} shadow_msr_intercept;
- /* SEV-ES support */
- struct vmcb_save_area *vmsa;
- struct ghcb *ghcb;
- struct kvm_host_map ghcb_map;
- bool received_first_sipi;
-
- /* SEV-ES scratch area support */
- void *ghcb_sa;
- u32 ghcb_sa_len;
- bool ghcb_sa_sync;
- bool ghcb_sa_free;
+ struct vcpu_sev_es_state sev_es;
bool guest_state_loaded;
};
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 5864219..c9c4806 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -1,10 +1,10 @@
# SPDX-License-Identifier: GPL-2.0
# Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
KCOV_INSTRUMENT_tlb.o := n
-KCOV_INSTRUMENT_mem_encrypt.o := n
+KCOV_INSTRUMENT_mem_encrypt_amd.o := n
KCOV_INSTRUMENT_mem_encrypt_identity.o := n
-KASAN_SANITIZE_mem_encrypt.o := n
+KASAN_SANITIZE_mem_encrypt_amd.o := n
KASAN_SANITIZE_mem_encrypt_identity.o := n
# Disable KCSAN entirely, because otherwise we get warnings that some functions
@@ -12,7 +12,7 @@
KCSAN_SANITIZE := n
ifdef CONFIG_FUNCTION_TRACER
-CFLAGS_REMOVE_mem_encrypt.o = -pg
+CFLAGS_REMOVE_mem_encrypt_amd.o = -pg
CFLAGS_REMOVE_mem_encrypt_identity.o = -pg
endif
@@ -52,6 +52,6 @@
obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
obj-$(CONFIG_PAGE_TABLE_ISOLATION) += pti.o
-obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt.o
+obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 5dfa402..98258ac 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -14,7 +14,7 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mmiotrace.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/efi.h>
#include <linux/pgtable.h>
@@ -92,7 +92,7 @@
*/
static unsigned int __ioremap_check_encrypted(struct resource *res)
{
- if (!sev_active())
+ if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return 0;
switch (res->desc) {
@@ -112,7 +112,7 @@
*/
static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *desc)
{
- if (!sev_active())
+ if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return;
if (!IS_ENABLED(CONFIG_EFI))
@@ -561,7 +561,7 @@
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
/* For SEV, these areas are encrypted */
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
break;
fallthrough;
@@ -744,7 +744,7 @@
bool arch_memremap_can_ram_remap(resource_size_t phys_addr, unsigned long size,
unsigned long flags)
{
- if (!mem_encrypt_active())
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return true;
if (flags & MEMREMAP_ENC)
@@ -753,7 +753,7 @@
if (flags & MEMREMAP_DEC)
return false;
- if (sme_active()) {
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
if (memremap_is_setup_data(phys_addr, size) ||
memremap_is_efi_data(phys_addr, size))
return false;
@@ -774,12 +774,12 @@
{
bool encrypted_prot;
- if (!mem_encrypt_active())
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return prot;
encrypted_prot = true;
- if (sme_active()) {
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
if (early_memremap_is_setup_data(phys_addr, size) ||
memremap_is_efi_data(phys_addr, size))
encrypted_prot = false;
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt_amd.c
similarity index 70%
rename from arch/x86/mm/mem_encrypt.c
rename to arch/x86/mm/mem_encrypt_amd.c
index e29b141..53ff5e8 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -31,6 +31,7 @@
#include <asm/processor-flags.h>
#include <asm/msr.h>
#include <asm/cmdline.h>
+#include <asm/sev.h>
#include "mm_internal.h"
@@ -50,6 +51,36 @@
static char sme_early_buffer[PAGE_SIZE] __initdata __aligned(PAGE_SIZE);
/*
+ * SNP-specific routine which needs to additionally change the page state from
+ * private to shared before copying the data from the source to destination and
+ * restore after the copy.
+ */
+static inline void __init snp_memcpy(void *dst, void *src, size_t sz,
+ unsigned long paddr, bool decrypt)
+{
+ unsigned long npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+
+ if (decrypt) {
+ /*
+ * @paddr needs to be accessed decrypted, mark the page shared in
+ * the RMP table before copying it.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(paddr), paddr, npages);
+
+ memcpy(dst, src, sz);
+
+ /* Restore the page state after the memcpy. */
+ early_snp_set_memory_private((unsigned long)__va(paddr), paddr, npages);
+ } else {
+ /*
+ * @paddr need to be accessed encrypted, no need for the page state
+ * change.
+ */
+ memcpy(dst, src, sz);
+ }
+}
+
+/*
* This routine does not change the underlying encryption setting of the
* page(s) that map this memory. It assumes that eventually the memory is
* meant to be accessed as either encrypted or decrypted but the contents
@@ -97,8 +128,13 @@
* Use a temporary buffer, of cache-line multiple size, to
* avoid data corruption as documented in the APM.
*/
- memcpy(sme_early_buffer, src, len);
- memcpy(dst, sme_early_buffer, len);
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+ snp_memcpy(sme_early_buffer, src, len, paddr, enc);
+ snp_memcpy(dst, sme_early_buffer, len, paddr, !enc);
+ } else {
+ memcpy(sme_early_buffer, src, len);
+ memcpy(dst, sme_early_buffer, len);
+ }
early_memunmap(dst, len);
early_memunmap(src, len);
@@ -144,7 +180,7 @@
struct boot_params *boot_data;
unsigned long cmdline_paddr;
- if (!sme_active())
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return;
/* Get the command line address before unmapping the real_mode_data */
@@ -164,7 +200,7 @@
struct boot_params *boot_data;
unsigned long cmdline_paddr;
- if (!sme_active())
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return;
__sme_early_map_unmap_mem(real_mode_data, sizeof(boot_params), true);
@@ -179,31 +215,12 @@
__sme_early_map_unmap_mem(__va(cmdline_paddr), COMMAND_LINE_SIZE, true);
}
-void __init sme_early_init(void)
-{
- unsigned int i;
-
- if (!sme_me_mask)
- return;
-
- early_pmd_flags = __sme_set(early_pmd_flags);
-
- __supported_pte_mask = __sme_set(__supported_pte_mask);
-
- /* Update the protection map with memory encryption mask */
- for (i = 0; i < ARRAY_SIZE(protection_map); i++)
- protection_map[i] = pgprot_encrypted(protection_map[i]);
-
- if (sev_active())
- swiotlb_force = SWIOTLB_FORCE;
-}
-
void __init sev_setup_arch(void)
{
phys_addr_t total_mem = memblock_phys_mem_size();
unsigned long size;
- if (!sev_active())
+ if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return;
/*
@@ -229,28 +246,110 @@
swiotlb_adjust_size(size);
}
+static unsigned long pg_level_to_pfn(int level, pte_t *kpte, pgprot_t *ret_prot)
+{
+ unsigned long pfn = 0;
+ pgprot_t prot;
+
+ switch (level) {
+ case PG_LEVEL_4K:
+ pfn = pte_pfn(*kpte);
+ prot = pte_pgprot(*kpte);
+ break;
+ case PG_LEVEL_2M:
+ pfn = pmd_pfn(*(pmd_t *)kpte);
+ prot = pmd_pgprot(*(pmd_t *)kpte);
+ break;
+ case PG_LEVEL_1G:
+ pfn = pud_pfn(*(pud_t *)kpte);
+ prot = pud_pgprot(*(pud_t *)kpte);
+ break;
+ default:
+ WARN_ONCE(1, "Invalid level for kpte\n");
+ return 0;
+ }
+
+ if (ret_prot)
+ *ret_prot = prot;
+
+ return pfn;
+}
+
+static bool amd_enc_tlb_flush_required(bool enc)
+{
+ return true;
+}
+
+static bool amd_enc_cache_flush_required(void)
+{
+ return !cpu_feature_enabled(X86_FEATURE_SME_COHERENT);
+}
+
+static void enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc)
+{
+#ifdef CONFIG_PARAVIRT
+ unsigned long vaddr_end = vaddr + size;
+
+ while (vaddr < vaddr_end) {
+ int psize, pmask, level;
+ unsigned long pfn;
+ pte_t *kpte;
+
+ kpte = lookup_address(vaddr, &level);
+ if (!kpte || pte_none(*kpte)) {
+ WARN_ONCE(1, "kpte lookup for vaddr\n");
+ return;
+ }
+
+ pfn = pg_level_to_pfn(level, kpte, NULL);
+ if (!pfn)
+ continue;
+
+ psize = page_level_size(level);
+ pmask = page_level_mask(level);
+
+ notify_page_enc_status_changed(pfn, psize >> PAGE_SHIFT, enc);
+
+ vaddr = (vaddr & pmask) + psize;
+ }
+#endif
+}
+
+static void amd_enc_status_change_prepare(unsigned long vaddr, int npages, bool enc)
+{
+ /*
+ * To maintain the security guarantees of SEV-SNP guests, make sure
+ * to invalidate the memory before encryption attribute is cleared.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && !enc)
+ snp_set_memory_shared(vaddr, npages);
+}
+
+/* Return true unconditionally: return value doesn't matter for the SEV side */
+static bool amd_enc_status_change_finish(unsigned long vaddr, int npages, bool enc)
+{
+ /*
+ * After memory is mapped encrypted in the page table, validate it
+ * so that it is consistent with the page table updates.
+ */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP) && enc)
+ snp_set_memory_private(vaddr, npages);
+
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
+ enc_dec_hypercall(vaddr, npages << PAGE_SHIFT, enc);
+
+ return true;
+}
+
static void __init __set_clr_pte_enc(pte_t *kpte, int level, bool enc)
{
pgprot_t old_prot, new_prot;
unsigned long pfn, pa, size;
pte_t new_pte;
- switch (level) {
- case PG_LEVEL_4K:
- pfn = pte_pfn(*kpte);
- old_prot = pte_pgprot(*kpte);
- break;
- case PG_LEVEL_2M:
- pfn = pmd_pfn(*(pmd_t *)kpte);
- old_prot = pmd_pgprot(*(pmd_t *)kpte);
- break;
- case PG_LEVEL_1G:
- pfn = pud_pfn(*(pud_t *)kpte);
- old_prot = pud_pgprot(*(pud_t *)kpte);
- break;
- default:
+ pfn = pg_level_to_pfn(level, kpte, &old_prot);
+ if (!pfn)
return;
- }
new_prot = old_prot;
if (enc)
@@ -273,25 +372,40 @@
clflush_cache_range(__va(pa), size);
/* Encrypt/decrypt the contents in-place */
- if (enc)
+ if (enc) {
sme_early_encrypt(pa, size);
- else
+ } else {
sme_early_decrypt(pa, size);
+ /*
+ * ON SNP, the page state in the RMP table must happen
+ * before the page table updates.
+ */
+ early_snp_set_memory_shared((unsigned long)__va(pa), pa, 1);
+ }
+
/* Change the page encryption mask. */
new_pte = pfn_pte(pfn, new_prot);
set_pte_atomic(kpte, new_pte);
+
+ /*
+ * If page is set encrypted in the page table, then update the RMP table to
+ * add this page as private.
+ */
+ if (enc)
+ early_snp_set_memory_private((unsigned long)__va(pa), pa, 1);
}
static int __init early_set_memory_enc_dec(unsigned long vaddr,
unsigned long size, bool enc)
{
- unsigned long vaddr_end, vaddr_next;
+ unsigned long vaddr_end, vaddr_next, start;
unsigned long psize, pmask;
int split_page_size_mask;
int level, ret;
pte_t *kpte;
+ start = vaddr;
vaddr_next = vaddr;
vaddr_end = vaddr + size;
@@ -346,6 +460,7 @@
ret = 0;
+ early_set_mem_enc_dec_hypercall(start, size, enc);
out:
__flush_tlb_all();
return ret;
@@ -361,11 +476,40 @@
return early_set_memory_enc_dec(vaddr, size, true);
}
+void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, unsigned long size, bool enc)
+{
+ enc_dec_hypercall(vaddr, size, enc);
+}
+
+void __init sme_early_init(void)
+{
+ unsigned int i;
+
+ if (!sme_me_mask)
+ return;
+
+ early_pmd_flags = __sme_set(early_pmd_flags);
+
+ __supported_pte_mask = __sme_set(__supported_pte_mask);
+
+ /* Update the protection map with memory encryption mask */
+ for (i = 0; i < ARRAY_SIZE(protection_map); i++)
+ protection_map[i] = pgprot_encrypted(protection_map[i]);
+
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
+ swiotlb_force = SWIOTLB_FORCE;
+
+ x86_platform.guest.enc_status_change_prepare = amd_enc_status_change_prepare;
+ x86_platform.guest.enc_status_change_finish = amd_enc_status_change_finish;
+ x86_platform.guest.enc_tlb_flush_required = amd_enc_tlb_flush_required;
+ x86_platform.guest.enc_cache_flush_required = amd_enc_cache_flush_required;
+}
+
/*
* SME and SEV are very similar but they are not the same, so there are
* times that the kernel will need to distinguish between SME and SEV. The
- * sme_active() and sev_active() functions are used for this. When a
- * distinction isn't needed, the mem_encrypt_active() function can be used.
+ * cc_platform_has() function is used for this. When a distinction isn't
+ * needed, the CC_ATTR_MEM_ENCRYPT attribute can be used.
*
* The trampoline code is a good example for this requirement. Before
* paging is activated, SME will access all memory as decrypted, but SEV
@@ -373,16 +517,6 @@
* up under SME the trampoline area cannot be encrypted, whereas under SEV
* the trampoline area must be encrypted.
*/
-bool sev_active(void)
-{
- return sev_status & MSR_AMD64_SEV_ENABLED;
-}
-
-bool sme_active(void)
-{
- return sme_me_mask && !sev_active();
-}
-EXPORT_SYMBOL_GPL(sev_active);
/* Needs to be called from non-instrumentable code */
bool noinstr sev_es_active(void)
@@ -396,7 +530,7 @@
/*
* For SEV, all DMA must be to unencrypted addresses.
*/
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return true;
/*
@@ -404,7 +538,7 @@
* device does not support DMA to addresses that include the
* encryption mask.
*/
- if (sme_active()) {
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
u64 dma_enc_mask = DMA_BIT_MASK(__ffs64(sme_me_mask));
u64 dma_dev_mask = min_not_zero(dev->coherent_dma_mask,
dev->bus_dma_limit);
@@ -429,7 +563,7 @@
* The unused memory range was mapped decrypted, change the encryption
* attribute from decrypted to encrypted before freeing it.
*/
- if (mem_encrypt_active()) {
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
r = set_memory_encrypted(vaddr, npages);
if (r) {
pr_warn("failed to free unused decrypted pages\n");
@@ -445,7 +579,7 @@
pr_info("AMD Memory Encryption Features active:");
/* Secure Memory Encryption */
- if (sme_active()) {
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) {
/*
* SME is mutually exclusive with any of the SEV
* features below.
@@ -455,13 +589,17 @@
}
/* Secure Encrypted Virtualization */
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
pr_cont(" SEV");
/* Encrypted Register State */
if (sev_es_active())
pr_cont(" SEV-ES");
+ /* Secure Nested Paging */
+ if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+ pr_cont(" SEV-SNP");
+
pr_cont("\n");
}
@@ -478,7 +616,7 @@
* With SEV, we need to unroll the rep string I/O instructions,
* but SEV-ES supports them through the #VC handler.
*/
- if (sev_active() && !sev_es_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && !sev_es_active())
static_branch_enable(&sev_enable_key);
print_mem_encrypt_feature_info();
@@ -486,6 +624,6 @@
int arch_has_restricted_virtio_memory_access(void)
{
- return sev_active();
+ return cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT);
}
EXPORT_SYMBOL_GPL(arch_has_restricted_virtio_memory_access);
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index c7e9fb1..2e05707 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -39,10 +39,12 @@
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <asm/setup.h>
#include <asm/sections.h>
#include <asm/cmdline.h>
+#include <asm/sev.h>
#include "mm_internal.h"
@@ -296,7 +298,13 @@
unsigned long pgtable_area_len;
unsigned long decrypted_base;
- if (!sme_active())
+ /*
+ * This is early code, use an open coded check for SME instead of
+ * using cc_platform_has(). This eliminates worries about removing
+ * instrumentation or checking boot_cpu_data in the cc_platform_has()
+ * function.
+ */
+ if (!sme_get_me_mask() || sev_status & MSR_AMD64_SEV_ENABLED)
return;
/*
@@ -501,8 +509,11 @@
bool active_by_default;
unsigned long me_mask;
char buffer[16];
+ bool snp;
u64 msr;
+ snp = snp_init(bp);
+
/* Check for the SME/SEV support leaf */
eax = 0x80000000;
ecx = 0;
@@ -534,6 +545,10 @@
sev_status = __rdmsr(MSR_AMD64_SEV);
feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
+ /* The SEV-SNP CC blob should never be present unless SEV-SNP is enabled. */
+ if (snp && !(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
+ snp_abort();
+
/* Check if memory encryption is enabled */
if (feature_mask == AMD_SME_BIT) {
/*
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index ad8a5c5..836800e 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -18,6 +18,7 @@
#include <linux/libnvdimm.h>
#include <linux/vmstat.h>
#include <linux/kernel.h>
+#include <linux/cc_platform.h>
#include <asm/e820/api.h>
#include <asm/processor.h>
@@ -1986,7 +1987,7 @@
int ret;
/* Nothing to do if memory encryption is not active */
- if (!mem_encrypt_active())
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return 0;
/* Should not be working on unaligned addresses */
@@ -2004,10 +2005,12 @@
kmap_flush_unused();
vm_unmap_aliases();
- /*
- * Before changing the encryption attribute, we need to flush caches.
- */
- cpa_flush(&cpa, !this_cpu_has(X86_FEATURE_SME_COHERENT));
+ /* Flush the caches as needed before changing the encryption attribute. */
+ if (x86_platform.guest.enc_tlb_flush_required(enc))
+ cpa_flush(&cpa, x86_platform.guest.enc_cache_flush_required());
+
+ /* Notify hypervisor that we are about to set/clr encryption attribute. */
+ x86_platform.guest.enc_status_change_prepare(addr, numpages, enc);
ret = __change_page_attr_set_clr(&cpa, 1);
@@ -2020,6 +2023,12 @@
*/
cpa_flush(&cpa, 0);
+ /* Notify hypervisor that we have successfully set/clr encryption attribute. */
+ if (!ret) {
+ if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
+ ret = -EIO;
+ }
+
return ret;
}
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 3507f45..13b0dd3 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -34,6 +34,7 @@
#endif
int pcibios_last_bus = -1;
unsigned long pirq_table_addr;
+unsigned int pci_early_clear_msi;
const struct pci_raw_ops *__read_mostly raw_pci_ops;
const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
@@ -606,6 +607,9 @@
} else if (!strcmp(str, "skip_isa_align")) {
pci_probe |= PCI_CAN_SKIP_ISA_ALIGN;
return NULL;
+ } else if (!strcmp(str, "clearmsi")) {
+ pci_early_clear_msi = 1;
+ return NULL;
} else if (!strcmp(str, "noioapicquirk")) {
noioapicquirk = 1;
return NULL;
diff --git a/arch/x86/pci/early.c b/arch/x86/pci/early.c
index f5fc953..f1ba9d7 100644
--- a/arch/x86/pci/early.c
+++ b/arch/x86/pci/early.c
@@ -51,6 +51,31 @@
outw(val, 0xcfc + (offset&2));
}
+u32 pci_early_find_cap(int bus, int slot, int func, int cap)
+{
+ int bytes;
+ u8 pos;
+
+ if (!(read_pci_config_16(bus, slot, func, PCI_STATUS) &
+ PCI_STATUS_CAP_LIST))
+ return 0;
+
+ pos = read_pci_config_byte(bus, slot, func, PCI_CAPABILITY_LIST);
+ for (bytes = 0; bytes < 48 && pos >= 0x40; bytes++) {
+ u8 id;
+
+ pos &= ~3;
+ id = read_pci_config_byte(bus, slot, func, pos+PCI_CAP_LIST_ID);
+ if (id == 0xff)
+ break;
+ if (id == cap)
+ return pos;
+ pos = read_pci_config_byte(bus, slot, func,
+ pos+PCI_CAP_LIST_NEXT);
+ }
+ return 0;
+}
+
int early_pci_allowed(void)
{
return (pci_probe & (PCI_PROBE_CONF1|PCI_PROBE_NOEARLY)) ==
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 7515e78..1f36754 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -33,7 +33,7 @@
#include <linux/reboot.h>
#include <linux/slab.h>
#include <linux/ucs2_string.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/sched/task.h>
#include <asm/setup.h>
@@ -284,7 +284,8 @@
if (!(md->attribute & EFI_MEMORY_WB))
flags |= _PAGE_PCD;
- if (sev_active() && md->type != EFI_MEMORY_MAPPED_IO)
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
+ md->type != EFI_MEMORY_MAPPED_IO)
flags |= _PAGE_ENC;
pfn = md->phys_addr >> PAGE_SHIFT;
@@ -390,7 +391,7 @@
if (!(md->attribute & EFI_MEMORY_RO))
pf |= _PAGE_RW;
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
pf |= _PAGE_ENC;
return efi_update_mappings(md, pf);
@@ -438,7 +439,7 @@
(md->type != EFI_RUNTIME_SERVICES_CODE))
pf |= _PAGE_RW;
- if (sev_active())
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
pf |= _PAGE_ENC;
efi_update_mappings(md, pf);
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 1d20ed4..e739ea3 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -3,6 +3,7 @@
#include <linux/slab.h>
#include <linux/memblock.h>
#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/pgtable.h>
#include <asm/set_memory.h>
@@ -70,7 +71,7 @@
static void sme_sev_setup_real_mode(struct trampoline_header *th)
{
#ifdef CONFIG_AMD_MEM_ENCRYPT
- if (sme_active())
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
th->flags |= TH_FLAGS_SME_ACTIVE;
if (sev_es_active()) {
@@ -108,7 +109,7 @@
* decrypted memory in order to bring up other processors
* successfully. This is not needed for SEV.
*/
- if (sme_active())
+ if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
set_memory_decrypted((unsigned long)base, size >> PAGE_SHIFT);
memcpy(base, real_mode_blob, size);
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index ffcbe2b..d530bde 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -62,6 +62,15 @@
rescue mode with init=/bin/sh, even when the /dev directory
on the rootfs is completely empty.
+config DEVTMPFS_SAFE
+ bool "Automount devtmpfs with nosuid/noexec"
+ depends on DEVTMPFS_MOUNT
+ default y
+ help
+ This instructs the kernel to automount devtmpfs with the
+ MS_NOEXEC and MS_NOSUID mount flags, which can prevent
+ certain kinds of code-execution attack on embedded platforms.
+
config STANDALONE
bool "Select only drivers that don't need compile-time external firmware"
default y
@@ -187,7 +196,7 @@
source "drivers/base/regmap/Kconfig"
config DMA_SHARED_BUFFER
- bool
+ bool "Buffer framework to be shared between drivers"
default n
select IRQ_WORK
help
diff --git a/drivers/base/devtmpfs.c b/drivers/base/devtmpfs.c
index fa13ad4..6c2b736 100644
--- a/drivers/base/devtmpfs.c
+++ b/drivers/base/devtmpfs.c
@@ -363,6 +363,7 @@
int __init devtmpfs_mount(void)
{
int err;
+ int mflags = MS_SILENT;
if (!mount_dev)
return 0;
@@ -370,7 +371,10 @@
if (!thread)
return 0;
- err = init_mount("devtmpfs", "dev", "devtmpfs", MS_SILENT, NULL);
+#ifdef CONFIG_DEVTMPFS_SAFE
+ mflags |= MS_NOEXEC | MS_NOSUID;
+#endif
+ err = init_mount("devtmpfs", "dev", "devtmpfs", mflags, NULL);
if (err)
printk(KERN_INFO "devtmpfs: error mounting %i\n", err);
else
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 968c3df..bbd55d3 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -20,10 +20,12 @@
#include <linux/debugfs.h>
#include <linux/module.h>
#include <linux/seq_file.h>
+#include <linux/pci.h>
#include <linux/poll.h>
#include <linux/dma-resv.h>
#include <linux/mm.h>
#include <linux/mount.h>
+#include <linux/netdevice.h>
#include <linux/pseudo_fs.h>
#include <uapi/linux/dma-buf.h>
@@ -361,12 +363,16 @@
return ret;
}
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info);
+
static long dma_buf_ioctl(struct file *file,
unsigned int cmd, unsigned long arg)
{
struct dma_buf *dmabuf;
struct dma_buf_sync sync;
enum dma_data_direction direction;
+ struct dma_buf_create_pages_info create_info;
int ret;
dmabuf = file->private_data;
@@ -403,6 +409,12 @@
case DMA_BUF_SET_NAME_A:
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);
+ case DMA_BUF_CREATE_PAGES:
+ if (copy_from_user(&create_info, (void __user *)arg,
+ sizeof(create_info))) {
+ return -EFAULT;
+ }
+ return dma_buf_create_pages(file, &create_info);
default:
return -ENOTTY;
@@ -1366,6 +1378,374 @@
}
EXPORT_SYMBOL_GPL(dma_buf_vunmap);
+static DEFINE_MUTEX(bind_rx_queue_mutex);
+
+static int dma_buf_pages_release(struct inode *inode, struct file *file)
+{
+ struct dma_buf_pages_file_priv *priv = file->private_data;
+ struct netdev_rx_queue *rxq;
+ struct file *old_pages;
+ unsigned long xa_idx;
+ int i;
+
+ xa_for_each(&priv->bound_rxq_list, xa_idx, rxq) {
+ mutex_lock(&bind_rx_queue_mutex);
+ old_pages = rcu_dereference_protected(rxq->dmabuf_pages,
+ mutex_is_locked(&bind_rx_queue_mutex));
+ if (old_pages == file)
+ rcu_assign_pointer(rxq->dmabuf_pages, NULL);
+ mutex_unlock(&bind_rx_queue_mutex);
+ dev_put(rxq->dev);
+ }
+
+ if (priv->tx_bv)
+ for (i = 0; i < priv->num_pages; i++)
+ put_page(&priv->pages[i]);
+
+ dma_buf_unmap_attachment(priv->attachment, priv->sgt, priv->direction);
+ dma_buf_detach(priv->dmabuf, priv->attachment);
+ dma_buf_put(priv->dmabuf);
+ pci_dev_put(priv->pci_dev);
+
+ xa_destroy(&priv->bound_rxq_list);
+
+ percpu_ref_kill(&priv->pgmap.ref);
+ /* Drop initial ref after percpu_ref_kill(). */
+ percpu_ref_put(&priv->pgmap.ref);
+
+ return 0;
+}
+
+static int
+dma_buf_pages_bind_rx_queue(struct file *file,
+ struct dma_buf_pages_bind_rx_queue *bind_rx_queue)
+{
+ struct dma_buf_pages_file_priv *priv = file->private_data;
+ struct netdev_rx_queue *rxq;
+ struct net_device *netdev;
+ int xa_id;
+ int err;
+
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
+ if (!priv->page_pool)
+ return -ENOTTY;
+
+ bind_rx_queue->ifname[IFNAMSIZ - 1] = '\0';
+
+ netdev = dev_get_by_name(current->nsproxy->net_ns,
+ bind_rx_queue->ifname);
+ if (!netdev)
+ return -ENODEV;
+
+ if (!dev_is_pci(netdev->dev.parent)) {
+ err = -ENOTBLK;
+ goto out_put_dev;
+ }
+
+ if (to_pci_dev(netdev->dev.parent) != priv->pci_dev) {
+ err = -EXDEV;
+ goto out_put_dev;
+ }
+
+ if (bind_rx_queue->rxq_idx >= netdev->num_rx_queues) {
+ err = -ERANGE;
+ goto out_put_dev;
+ }
+
+ rxq = __netif_get_rx_queue(netdev, bind_rx_queue->rxq_idx);
+
+ err = xa_alloc(&priv->bound_rxq_list, &xa_id, rxq, xa_limit_32b,
+ GFP_KERNEL);
+ if (err)
+ goto out_put_dev;
+ mutex_lock(&bind_rx_queue_mutex);
+
+ /* The DMA_BUF_CREATE_PAGES ioctl that creates the input file does a
+ * dma_buf_attach(), which validates that the net_device we're trying to
+ * attach to can reach the dmabuf, so we don't need to check here as
+ * well.
+ */
+ rcu_assign_pointer(rxq->dmabuf_pages, file);
+
+ mutex_unlock(&bind_rx_queue_mutex);
+
+ return 0;
+out_put_dev:
+ dev_put(netdev);
+ return err;
+}
+
+static long dma_buf_pages_ioctl(struct file *file, unsigned int op,
+ unsigned long arg)
+{
+ struct dma_buf_pages_bind_rx_queue bind_rx_queue;
+ void *input_ptr = (void *)arg;
+
+ switch (op) {
+ case DMA_BUF_PAGES_BIND_RX:
+ if (copy_from_user(&bind_rx_queue, input_ptr,
+ sizeof(bind_rx_queue)))
+ return -EFAULT;
+ return dma_buf_pages_bind_rx_queue(file, &bind_rx_queue);
+ default:
+ return -EINVAL;
+ }
+}
+
+static void dma_buf_page_free(struct page *page)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct dev_pagemap *pgmap;
+ unsigned long addr;
+ ssize_t offset;
+
+ pgmap = page->pgmap;
+ priv = container_of(pgmap, struct dma_buf_pages_file_priv, pgmap);
+ offset = page - priv->pages;
+
+ if (WARN_ON_ONCE(offset < 0 || offset > priv->num_pages))
+ return;
+
+ /* Offset + 1 is due to the fact that we want to avoid 0 virt address
+ * returned from the gen_pool. The genpool returns 0 on error, and virt
+ * address 0 is indistinguishable from an error.
+ */
+ addr = (offset + 1) << PAGE_SHIFT;
+
+ if (priv->page_pool) {
+ /* page->private containers the order for dma buf pages. */
+ if (!WARN_ON_ONCE(!gen_pool_has_addr(priv->page_pool, addr,
+ PAGE_SIZE * (1 << page->private)))) {
+ gen_pool_free(priv->page_pool, addr,
+ PAGE_SIZE * (1 << page->private));
+ }
+
+ }
+ percpu_ref_put(&pgmap->ref);
+}
+
+const struct dev_pagemap_ops dma_buf_pgmap_ops = {
+ .page_free = dma_buf_page_free,
+};
+EXPORT_SYMBOL_GPL(dma_buf_pgmap_ops);
+
+const struct file_operations dma_buf_pages_fops = {
+ .unlocked_ioctl = dma_buf_pages_ioctl,
+ .release = dma_buf_pages_release,
+};
+EXPORT_SYMBOL_GPL(dma_buf_pages_fops);
+
+#ifdef CONFIG_ZONE_DEVICE
+static void dma_buf_pages_percpu_release(struct percpu_ref *ref)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct dev_pagemap *pgmap;
+
+ pgmap = container_of(ref, struct dev_pagemap, ref);
+ priv = container_of(pgmap, struct dma_buf_pages_file_priv, pgmap);
+
+ if (priv->tx_bv) {
+ kvfree(priv->tx_bv);
+ } else {
+ /* This can be a racy check, if another thread is releasing
+ * memory to the gen_pool. However, that should not happen, as
+ * the dma_buf_pages_percpu_release() being called indicates
+ * the there are no lingering refs to pages anymore
+ */
+ if (!WARN_ON_ONCE(gen_pool_size(priv->page_pool) !=
+ gen_pool_avail(priv->page_pool))) {
+ gen_pool_destroy(priv->page_pool);
+ }
+ }
+
+ kvfree(priv->pages);
+ kfree(priv);
+}
+
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info)
+{
+ int err, fd, i, pg_idx;
+ struct scatterlist *sg;
+ struct dma_buf_pages_file_priv *priv;
+ struct file *new_file;
+
+ fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
+ if (fd < 0) {
+ err = fd;
+ goto out_err;
+ }
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv) {
+ err = -ENOMEM;
+ goto out_put_fd;
+ }
+
+ priv->pgmap.type = MEMORY_DEVICE_PRIVATE;
+ priv->pgmap.ops = &dma_buf_pgmap_ops;
+ init_completion(&priv->pgmap.done);
+
+ /* This refcount is incremented everytime a page in priv->pages is
+ * allocated, and decremented everytime a page is freed. When
+ * it drops to 0, the priv struct can be freed. The priv struct
+ * is not freed until the initial reference acquired below is dropped.
+ */
+ err = percpu_ref_init(&priv->pgmap.ref, dma_buf_pages_percpu_release, 0,
+ GFP_KERNEL);
+ if (err)
+ goto out_free_priv;
+
+ /* Initial ref to be dropped after percpu_ref_kill(). */
+ percpu_ref_get(&priv->pgmap.ref);
+
+ priv->pci_dev = pci_get_domain_bus_and_slot(
+ 0, create_info->pci_bdf[0],
+ PCI_DEVFN(create_info->pci_bdf[1], create_info->pci_bdf[2]));
+ if (!priv->pci_dev) {
+ err = -ENODEV;
+ goto out_exit_percpu_ref;
+ }
+
+ priv->dmabuf = dma_buf_get(create_info->dma_buf_fd);
+ if (IS_ERR(priv->dmabuf)) {
+ err = PTR_ERR(priv->dmabuf);
+ goto out_put_pci_dev;
+ }
+
+ if (priv->dmabuf->size % PAGE_SIZE != 0) {
+ err = -EINVAL;
+ goto out_put_dma_buf;
+ }
+
+ priv->attachment = dma_buf_attach(priv->dmabuf, &priv->pci_dev->dev);
+ if (IS_ERR(priv->attachment)) {
+ err = PTR_ERR(priv->attachment);
+ goto out_put_dma_buf;
+ }
+
+ priv->num_pages = priv->dmabuf->size / PAGE_SIZE;
+ priv->pages = kvmalloc_array(priv->num_pages, sizeof(struct page),
+ GFP_KERNEL);
+ if (!priv->pages) {
+ err = -ENOMEM;
+ goto out_detach_dma_buf;
+ }
+
+ for (i = 0; i < priv->num_pages; i++) {
+ struct page *page = &priv->pages[i];
+
+ mm_zero_struct_page(page);
+ set_page_zone(page, ZONE_DEVICE);
+ set_page_count(page, 0);
+ page->pgmap = &priv->pgmap;
+ }
+
+ priv->direction = DMA_BIDIRECTIONAL;
+ priv->sgt = dma_buf_map_attachment(priv->attachment, priv->direction);
+ if (IS_ERR(priv->sgt)) {
+ err = PTR_ERR(priv->sgt);
+ goto out_free_pages;
+ }
+
+ /* Now write each dma address to each page */
+ pg_idx = 0;
+ for_each_sgtable_dma_sg(priv->sgt, sg, i) {
+ size_t len = sg_dma_len(sg);
+ dma_addr_t dma_addr = sg_dma_address(sg);
+
+ BUG_ON(!PAGE_ALIGNED(len));
+ while (len > 0) {
+ priv->pages[pg_idx].zone_device_data = (void *)dma_addr;
+ pg_idx++;
+ dma_addr += PAGE_SIZE;
+ len -= PAGE_SIZE;
+ }
+ }
+
+ if (create_info->create_page_pool != 0) {
+ priv->page_pool = gen_pool_create(
+ PAGE_SHIFT, dev_to_node(&priv->pci_dev->dev));
+ if (!priv->page_pool) {
+ err = -ENOMEM;
+ goto out_unmap_dma_buf;
+ }
+ /*
+ * We start with PAGE_SIZE instead of 0 since
+ * gen_pool_alloc_*() returns NULL when error
+ */
+ err = gen_pool_add_virt(priv->page_pool, PAGE_SIZE, 0,
+ PAGE_SIZE * priv->num_pages,
+ dev_to_node(&priv->pci_dev->dev));
+ if (err)
+ goto out_destroy_genpool;
+ xa_init_flags(&priv->bound_rxq_list, XA_FLAGS_ALLOC);
+ priv->tx_bv = NULL;
+ } else {
+ priv->page_pool = NULL;
+ priv->tx_bv = kvmalloc_array(priv->num_pages, sizeof(struct bio_vec),
+ GFP_KERNEL);
+ if (!priv->tx_bv) {
+ err = -ENOMEM;
+ goto out_unmap_dma_buf;
+ }
+ for (i = 0; i < priv->num_pages; i++) {
+ priv->tx_bv[i].bv_page = &priv->pages[i];
+ priv->tx_bv[i].bv_offset = 0;
+ priv->tx_bv[i].bv_len = PAGE_SIZE;
+ get_page(&priv->pages[i]);
+ }
+ percpu_ref_get_many(&priv->pgmap.ref, priv->num_pages);
+ iov_iter_bvec(&priv->tx_iter, WRITE, priv->tx_bv,
+ priv->num_pages, priv->dmabuf->size);
+ }
+
+ new_file = anon_inode_getfile("[dma_buf_pages]", &dma_buf_pages_fops,
+ (void *)priv, O_RDWR | O_CLOEXEC);
+ if (IS_ERR(new_file)) {
+ err = PTR_ERR(new_file);
+ goto out_destroy_genpool;
+ }
+
+ fd_install(fd, new_file);
+ return fd;
+
+out_destroy_genpool:
+ if (priv->page_pool) {
+ gen_pool_destroy(priv->page_pool);
+ } else {
+ kvfree(priv->tx_bv);
+ percpu_ref_put_many(&priv->pgmap.ref, priv->num_pages);
+ }
+out_unmap_dma_buf:
+ dma_buf_unmap_attachment(priv->attachment, priv->sgt, priv->direction);
+out_free_pages:
+ kvfree(priv->pages);
+out_detach_dma_buf:
+ dma_buf_detach(priv->dmabuf, priv->attachment);
+out_put_dma_buf:
+ dma_buf_put(priv->dmabuf);
+out_put_pci_dev:
+ pci_dev_put(priv->pci_dev);
+out_exit_percpu_ref:
+ percpu_ref_exit(&priv->pgmap.ref);
+out_free_priv:
+ kfree(priv);
+out_put_fd:
+ put_unused_fd(fd);
+out_err:
+ return err;
+}
+#else
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info)
+{
+ return -ENOTSUPP;
+}
+#endif
+
#ifdef CONFIG_DEBUG_FS
static int dma_buf_debug_show(struct seq_file *s, void *unused)
{
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 57943e9..4a7f0f5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -38,6 +38,7 @@
#include <linux/mmu_notifier.h>
#include <linux/suspend.h>
#include <linux/fb.h>
+#include <linux/cc_platform.h>
#include "amdgpu.h"
#include "amdgpu_irq.h"
@@ -2022,7 +2023,8 @@
* however, SME requires an indirect IOMMU mapping because the encryption
* bit is beyond the DMA mask of the chip.
*/
- if (mem_encrypt_active() && ((flags & AMD_ASIC_MASK) == CHIP_RAVEN)) {
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) &&
+ ((flags & AMD_ASIC_MASK) == CHIP_RAVEN)) {
dev_info(&pdev->dev,
"SME is not compatible with RAVEN\n");
return -ENOTSUPP;
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 30cc59f..f19d9ac 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -31,7 +31,7 @@
#include <linux/dma-buf-map.h>
#include <linux/export.h>
#include <linux/highmem.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <xen/xen.h>
#include <drm/drm_cache.h>
@@ -204,7 +204,7 @@
* Enforce dma_alloc_coherent when memory encryption is active as well
* for the same reasons as for Xen paravirtual hosts.
*/
- if (mem_encrypt_active())
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return true;
for (tmp = iomem_resource.child; tmp; tmp = tmp->sibling)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 8449d09..ab246cf 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -29,7 +29,7 @@
#include <linux/dma-mapping.h>
#include <linux/module.h>
#include <linux/pci.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <drm/drm_aperture.h>
#include <drm/drm_drv.h>
@@ -666,7 +666,7 @@
[vmw_dma_map_bind] = "Giving up DMA mappings early."};
/* TTM currently doesn't fully support SEV encryption. */
- if (mem_encrypt_active())
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return -EINVAL;
if (vmw_force_coherent)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
index 8d2437f..50fa3df 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_msg.c
@@ -28,7 +28,7 @@
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/slab.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <asm/hypervisor.h>
#include <drm/drm_ioctl.h>
@@ -160,7 +160,7 @@
unsigned long msg_len = strlen(msg);
/* HB port can't access encrypted memory. */
- if (hb && !mem_encrypt_active()) {
+ if (hb && !cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
unsigned long bp = channel->cookie_high;
u32 channel_id = (channel->channel_id << 16);
@@ -216,7 +216,7 @@
unsigned long si, di, eax, ebx, ecx, edx;
/* HB port can't access encrypted memory */
- if (hb && !mem_encrypt_active()) {
+ if (hb && !cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
unsigned long bp = channel->cookie_low;
u32 channel_id = (channel->channel_id << 16);
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 376e631..57a676c 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -66,6 +66,7 @@
/* intel_idle.max_cstate=0 disables driver */
static int max_cstate = CPUIDLE_STATE_MAX - 1;
static unsigned int disabled_states_mask;
+static unsigned int preferred_states_mask;
static struct cpuidle_device __percpu *intel_idle_cpuidle_devices;
@@ -778,6 +779,46 @@
.enter = NULL }
};
+/*
+ * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
+ * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
+ * MSR_IA32_POWER_CTL. But in this case there effectively no C1, because C1
+ * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
+ * both C1 and C1E requests end up with C1, so there is effectively no C1E.
+ *
+ * By default we enable C1 and disable C1E by marking it with
+ * 'CPUIDLE_FLAG_UNUSABLE'.
+ */
+static struct cpuidle_state spr_cstates[] __initdata = {
+ {
+ .name = "C1",
+ .desc = "MWAIT 0x00",
+ .flags = MWAIT2flg(0x00),
+ .exit_latency = 1,
+ .target_residency = 1,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C1E",
+ .desc = "MWAIT 0x01",
+ .flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE | \
+ CPUIDLE_FLAG_UNUSABLE,
+ .exit_latency = 2,
+ .target_residency = 4,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .name = "C6",
+ .desc = "MWAIT 0x20",
+ .flags = MWAIT2flg(0x20) | CPUIDLE_FLAG_TLB_FLUSHED,
+ .exit_latency = 290,
+ .target_residency = 800,
+ .enter = &intel_idle,
+ .enter_s2idle = intel_idle_s2idle, },
+ {
+ .enter = NULL }
+};
+
static struct cpuidle_state atom_cstates[] __initdata = {
{
.name = "C1E",
@@ -1121,6 +1162,12 @@
.use_acpi = true,
};
+static const struct idle_cpu idle_cpu_spr __initconst = {
+ .state_table = spr_cstates,
+ .disable_promotion_to_c1e = true,
+ .use_acpi = true,
+};
+
static const struct idle_cpu idle_cpu_avn __initconst = {
.state_table = avn_cstates,
.disable_promotion_to_c1e = true,
@@ -1183,6 +1230,8 @@
X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &idle_cpu_skx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &idle_cpu_icx),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &idle_cpu_icx),
+ X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &idle_cpu_spr),
+ X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &idle_cpu_spr),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &idle_cpu_knl),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT, &idle_cpu_bxt),
@@ -1370,6 +1419,8 @@
static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
#endif /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+static void c1e_promotion_enable(void);
+
/**
* ivt_idle_state_table_update - Tune the idle states table for Ivy Town.
*
@@ -1540,6 +1591,26 @@
}
}
+/**
+ * spr_idle_state_table_update - Adjust Sapphire Rapids idle states table.
+ */
+static void __init spr_idle_state_table_update(void)
+{
+ /* Check if user prefers C1E over C1. */
+ if (preferred_states_mask & BIT(2)) {
+ if (preferred_states_mask & BIT(1))
+ /* Both can't be enabled, stick to the defaults. */
+ return;
+
+ spr_cstates[0].flags |= CPUIDLE_FLAG_UNUSABLE;
+ spr_cstates[1].flags &= ~CPUIDLE_FLAG_UNUSABLE;
+
+ /* Enable C1E using the "C1E promotion" bit. */
+ c1e_promotion_enable();
+ disable_promotion_to_c1e = false;
+ }
+}
+
static bool __init intel_idle_verify_cstate(unsigned int mwait_hint)
{
unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1;
@@ -1574,6 +1645,10 @@
case INTEL_FAM6_SKYLAKE_X:
skx_idle_state_table_update();
break;
+ case INTEL_FAM6_SAPPHIRERAPIDS_X:
+ case INTEL_FAM6_EMERALDRAPIDS_X:
+ spr_idle_state_table_update();
+ break;
}
for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
@@ -1651,6 +1726,15 @@
wrmsrl(MSR_PKG_CST_CONFIG_CONTROL, msr_bits);
}
+static void c1e_promotion_enable(void)
+{
+ unsigned long long msr_bits;
+
+ rdmsrl(MSR_IA32_POWER_CTL, msr_bits);
+ msr_bits |= 0x2;
+ wrmsrl(MSR_IA32_POWER_CTL, msr_bits);
+}
+
static void c1e_promotion_disable(void)
{
unsigned long long msr_bits;
@@ -1820,3 +1904,14 @@
*/
module_param_named(states_off, disabled_states_mask, uint, 0444);
MODULE_PARM_DESC(states_off, "Mask of disabled idle states");
+/*
+ * Some platforms come with mutually exclusive C-states, so that if one is
+ * enabled, the other C-states must not be used. Example: C1 and C1E on
+ * Sapphire Rapids platform. This parameter allows for selecting the
+ * preferred C-states among the groups of mutually exclusive C-states - the
+ * selected C-states will be registered, the other C-states from the mutually
+ * exclusive group won't be registered. If the platform has no mutually
+ * exclusive C-states, this parameter has no effect.
+ */
+module_param_named(preferred_cstates, preferred_states_mask, uint, 0444);
+MODULE_PARM_DESC(preferred_cstates, "Mask of preferred idle states");
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index ef85549..7c0e168 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -20,7 +20,7 @@
#include <linux/amd-iommu.h>
#include <linux/export.h>
#include <linux/kmemleak.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/iopoll.h>
#include <asm/pci-direct.h>
#include <asm/iommu.h>
@@ -979,7 +979,7 @@
pr_err("The address of old device table is above 4G, not trustworthy!\n");
return false;
}
- old_devtb = (sme_active() && is_kdump_kernel())
+ old_devtb = (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT) && is_kdump_kernel())
? (__force void *)ioremap_encrypted(old_devtb_phys,
dev_table_size)
: memremap(old_devtb_phys, dev_table_size, MEMREMAP_WB);
@@ -3060,7 +3060,8 @@
static bool amd_iommu_sme_check(void)
{
- if (!sme_active() || (boot_cpu_data.x86 != 0x17))
+ if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT) ||
+ (boot_cpu_data.x86 != 0x17))
return true;
/* For Fam17h, a specific level of support is required */
diff --git a/drivers/iommu/amd/io_pgtable.c b/drivers/iommu/amd/io_pgtable.c
index 1eddf55..4ab0c7f 100644
--- a/drivers/iommu/amd/io_pgtable.c
+++ b/drivers/iommu/amd/io_pgtable.c
@@ -74,87 +74,61 @@
*
****************************************************************************/
-static void free_page_list(struct page *freelist)
+static void free_pt_page(u64 *pt, struct list_head *freelist)
{
- while (freelist != NULL) {
- unsigned long p = (unsigned long)page_address(freelist);
+ struct page *p = virt_to_page(pt);
- freelist = freelist->freelist;
- free_page(p);
+ list_add_tail(&p->lru, freelist);
+}
+
+static void free_pt_lvl(u64 *pt, struct list_head *freelist, int lvl)
+{
+ u64 *p;
+ int i;
+
+ for (i = 0; i < 512; ++i) {
+ /* PTE present? */
+ if (!IOMMU_PTE_PRESENT(pt[i]))
+ continue;
+
+ /* Large PTE? */
+ if (PM_PTE_LEVEL(pt[i]) == 0 ||
+ PM_PTE_LEVEL(pt[i]) == 7)
+ continue;
+
+ /*
+ * Free the next level. No need to look at l1 tables here since
+ * they can only contain leaf PTEs; just free them directly.
+ */
+ p = IOMMU_PTE_PAGE(pt[i]);
+ if (lvl > 2)
+ free_pt_lvl(p, freelist, lvl - 1);
+ else
+ free_pt_page(p, freelist);
}
+
+ free_pt_page(pt, freelist);
}
-static struct page *free_pt_page(unsigned long pt, struct page *freelist)
-{
- struct page *p = virt_to_page((void *)pt);
-
- p->freelist = freelist;
-
- return p;
-}
-
-#define DEFINE_FREE_PT_FN(LVL, FN) \
-static struct page *free_pt_##LVL (unsigned long __pt, struct page *freelist) \
-{ \
- unsigned long p; \
- u64 *pt; \
- int i; \
- \
- pt = (u64 *)__pt; \
- \
- for (i = 0; i < 512; ++i) { \
- /* PTE present? */ \
- if (!IOMMU_PTE_PRESENT(pt[i])) \
- continue; \
- \
- /* Large PTE? */ \
- if (PM_PTE_LEVEL(pt[i]) == 0 || \
- PM_PTE_LEVEL(pt[i]) == 7) \
- continue; \
- \
- p = (unsigned long)IOMMU_PTE_PAGE(pt[i]); \
- freelist = FN(p, freelist); \
- } \
- \
- return free_pt_page((unsigned long)pt, freelist); \
-}
-
-DEFINE_FREE_PT_FN(l2, free_pt_page)
-DEFINE_FREE_PT_FN(l3, free_pt_l2)
-DEFINE_FREE_PT_FN(l4, free_pt_l3)
-DEFINE_FREE_PT_FN(l5, free_pt_l4)
-DEFINE_FREE_PT_FN(l6, free_pt_l5)
-
-static struct page *free_sub_pt(unsigned long root, int mode,
- struct page *freelist)
+static void free_sub_pt(u64 *root, int mode, struct list_head *freelist)
{
switch (mode) {
case PAGE_MODE_NONE:
case PAGE_MODE_7_LEVEL:
break;
case PAGE_MODE_1_LEVEL:
- freelist = free_pt_page(root, freelist);
+ free_pt_page(root, freelist);
break;
case PAGE_MODE_2_LEVEL:
- freelist = free_pt_l2(root, freelist);
- break;
case PAGE_MODE_3_LEVEL:
- freelist = free_pt_l3(root, freelist);
- break;
case PAGE_MODE_4_LEVEL:
- freelist = free_pt_l4(root, freelist);
- break;
case PAGE_MODE_5_LEVEL:
- freelist = free_pt_l5(root, freelist);
- break;
case PAGE_MODE_6_LEVEL:
- freelist = free_pt_l6(root, freelist);
+ free_pt_lvl(root, freelist, mode);
break;
default:
BUG();
}
-
- return freelist;
}
void amd_iommu_domain_set_pgtable(struct protection_domain *domain,
@@ -362,9 +336,9 @@
return pte;
}
-static struct page *free_clear_pte(u64 *pte, u64 pteval, struct page *freelist)
+static void free_clear_pte(u64 *pte, u64 pteval, struct list_head *freelist)
{
- unsigned long pt;
+ u64 *pt;
int mode;
while (cmpxchg64(pte, pteval, 0) != pteval) {
@@ -373,12 +347,12 @@
}
if (!IOMMU_PTE_PRESENT(pteval))
- return freelist;
+ return;
- pt = (unsigned long)IOMMU_PTE_PAGE(pteval);
+ pt = IOMMU_PTE_PAGE(pteval);
mode = IOMMU_PTE_MODE(pteval);
- return free_sub_pt(pt, mode, freelist);
+ free_sub_pt(pt, mode, freelist);
}
/*
@@ -388,48 +362,57 @@
* supporting all features of AMD IOMMU page tables like level skipping
* and full 64 bit address spaces.
*/
-static int iommu_v1_map_page(struct io_pgtable_ops *ops, unsigned long iova,
- phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+static int iommu_v1_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t pgsize, size_t pgcount,
+ int prot, gfp_t gfp, size_t *mapped)
{
struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
- struct page *freelist = NULL;
+ LIST_HEAD(freelist);
bool updated = false;
u64 __pte, *pte;
int ret, i, count;
- BUG_ON(!IS_ALIGNED(iova, size));
- BUG_ON(!IS_ALIGNED(paddr, size));
+ BUG_ON(!IS_ALIGNED(iova, pgsize));
+ BUG_ON(!IS_ALIGNED(paddr, pgsize));
ret = -EINVAL;
if (!(prot & IOMMU_PROT_MASK))
goto out;
- count = PAGE_SIZE_PTE_COUNT(size);
- pte = alloc_pte(dom, iova, size, NULL, gfp, &updated);
+ while (pgcount > 0) {
+ count = PAGE_SIZE_PTE_COUNT(pgsize);
+ pte = alloc_pte(dom, iova, pgsize, NULL, gfp, &updated);
- ret = -ENOMEM;
- if (!pte)
- goto out;
+ ret = -ENOMEM;
+ if (!pte)
+ goto out;
- for (i = 0; i < count; ++i)
- freelist = free_clear_pte(&pte[i], pte[i], freelist);
+ for (i = 0; i < count; ++i)
+ free_clear_pte(&pte[i], pte[i], &freelist);
- if (freelist != NULL)
- updated = true;
+ if (!list_empty(&freelist))
+ updated = true;
- if (count > 1) {
- __pte = PAGE_SIZE_PTE(__sme_set(paddr), size);
- __pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
- } else
- __pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+ if (count > 1) {
+ __pte = PAGE_SIZE_PTE(__sme_set(paddr), pgsize);
+ __pte |= PM_LEVEL_ENC(7) | IOMMU_PTE_PR | IOMMU_PTE_FC;
+ } else
+ __pte = __sme_set(paddr) | IOMMU_PTE_PR | IOMMU_PTE_FC;
- if (prot & IOMMU_PROT_IR)
- __pte |= IOMMU_PTE_IR;
- if (prot & IOMMU_PROT_IW)
- __pte |= IOMMU_PTE_IW;
+ if (prot & IOMMU_PROT_IR)
+ __pte |= IOMMU_PTE_IR;
+ if (prot & IOMMU_PROT_IW)
+ __pte |= IOMMU_PTE_IW;
- for (i = 0; i < count; ++i)
- pte[i] = __pte;
+ for (i = 0; i < count; ++i)
+ pte[i] = __pte;
+
+ iova += pgsize;
+ paddr += pgsize;
+ pgcount--;
+ if (mapped)
+ *mapped += pgsize;
+ }
ret = 0;
@@ -449,22 +432,23 @@
}
/* Everything flushed out, free pages now */
- free_page_list(freelist);
+ put_pages_list(&freelist);
return ret;
}
-static unsigned long iommu_v1_unmap_page(struct io_pgtable_ops *ops,
- unsigned long iova,
- size_t size,
- struct iommu_iotlb_gather *gather)
+static unsigned long iommu_v1_unmap_pages(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t pgsize, size_t pgcount,
+ struct iommu_iotlb_gather *gather)
{
struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
unsigned long long unmapped;
unsigned long unmap_size;
u64 *pte;
+ size_t size = pgcount << __ffs(pgsize);
- BUG_ON(!is_power_of_2(size));
+ BUG_ON(!is_power_of_2(pgsize));
unmapped = 0;
@@ -476,14 +460,14 @@
count = PAGE_SIZE_PTE_COUNT(unmap_size);
for (i = 0; i < count; i++)
pte[i] = 0ULL;
+ } else {
+ return unmapped;
}
iova = (iova & ~(unmap_size - 1)) + unmap_size;
unmapped += unmap_size;
}
- BUG_ON(unmapped && !is_power_of_2(unmapped));
-
return unmapped;
}
@@ -511,8 +495,7 @@
{
struct amd_io_pgtable *pgtable = container_of(iop, struct amd_io_pgtable, iop);
struct protection_domain *dom;
- struct page *freelist = NULL;
- unsigned long root;
+ LIST_HEAD(freelist);
if (pgtable->mode == PAGE_MODE_NONE)
return;
@@ -523,8 +506,7 @@
BUG_ON(pgtable->mode < PAGE_MODE_NONE ||
pgtable->mode > PAGE_MODE_6_LEVEL);
- root = (unsigned long)pgtable->root;
- freelist = free_sub_pt(root, pgtable->mode, freelist);
+ free_sub_pt(pgtable->root, pgtable->mode, &freelist);
/* Update data structure */
amd_iommu_domain_clr_pt_root(dom);
@@ -532,7 +514,7 @@
/* Make changes visible to IOMMUs */
amd_iommu_domain_update(dom);
- free_page_list(freelist);
+ put_pages_list(&freelist);
}
static struct io_pgtable *v1_alloc_pgtable(struct io_pgtable_cfg *cfg, void *cookie)
@@ -544,8 +526,8 @@
cfg->oas = IOMMU_OUT_ADDR_BIT_SIZE,
cfg->tlb = &v1_flush_ops;
- pgtable->iop.ops.map = iommu_v1_map_page;
- pgtable->iop.ops.unmap = iommu_v1_unmap_page;
+ pgtable->iop.ops.map_pages = iommu_v1_map_pages;
+ pgtable->iop.ops.unmap_pages = iommu_v1_unmap_pages;
pgtable->iop.ops.iova_to_phys = iommu_v1_iova_to_phys;
return &pgtable->iop;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index d9251af..955878e 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -31,6 +31,7 @@
#include <linux/irqdomain.h>
#include <linux/percpu.h>
#include <linux/io-pgtable.h>
+#include <linux/cc_platform.h>
#include <asm/irq_remapping.h>
#include <asm/io_apic.h>
#include <asm/apic.h>
@@ -2050,13 +2051,13 @@
struct protection_domain *domain = to_pdomain(dom);
struct io_pgtable_ops *ops = &domain->iop.iop.ops;
- if (ops->map)
+ if (ops->map_pages)
domain_flush_np_cache(domain, iova, size);
}
-static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
- phys_addr_t paddr, size_t page_size, int iommu_prot,
- gfp_t gfp)
+static int amd_iommu_map_pages(struct iommu_domain *dom, unsigned long iova,
+ phys_addr_t paddr, size_t pgsize, size_t pgcount,
+ int iommu_prot, gfp_t gfp, size_t *mapped)
{
struct protection_domain *domain = to_pdomain(dom);
struct io_pgtable_ops *ops = &domain->iop.iop.ops;
@@ -2072,8 +2073,10 @@
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
- if (ops->map)
- ret = ops->map(ops, iova, paddr, page_size, prot, gfp);
+ if (ops->map_pages) {
+ ret = ops->map_pages(ops, iova, paddr, pgsize,
+ pgcount, prot, gfp, mapped);
+ }
return ret;
}
@@ -2099,9 +2102,9 @@
iommu_iotlb_gather_add_range(gather, iova, size);
}
-static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
- size_t page_size,
- struct iommu_iotlb_gather *gather)
+static size_t amd_iommu_unmap_pages(struct iommu_domain *dom, unsigned long iova,
+ size_t pgsize, size_t pgcount,
+ struct iommu_iotlb_gather *gather)
{
struct protection_domain *domain = to_pdomain(dom);
struct io_pgtable_ops *ops = &domain->iop.iop.ops;
@@ -2111,9 +2114,10 @@
(domain->iop.mode == PAGE_MODE_NONE))
return 0;
- r = (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
+ r = (ops->unmap_pages) ? ops->unmap_pages(ops, iova, pgsize, pgcount, NULL) : 0;
- amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
+ if (r)
+ amd_iommu_iotlb_gather_add_page(dom, gather, iova, r);
return r;
}
@@ -2240,7 +2244,7 @@
* active, because some of those devices (AMD GPUs) don't have the
* encryption bit in their DMA-mask and require remapping.
*/
- if (!mem_encrypt_active() && dev_data->iommu_v2)
+ if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT) && dev_data->iommu_v2)
return IOMMU_DOMAIN_IDENTITY;
return 0;
@@ -2252,9 +2256,9 @@
.domain_free = amd_iommu_domain_free,
.attach_dev = amd_iommu_attach_device,
.detach_dev = amd_iommu_detach_device,
- .map = amd_iommu_map,
+ .map_pages = amd_iommu_map_pages,
.iotlb_sync_map = amd_iommu_iotlb_sync_map,
- .unmap = amd_iommu_unmap,
+ .unmap_pages = amd_iommu_unmap_pages,
.iova_to_phys = amd_iommu_iova_to_phys,
.probe_device = amd_iommu_probe_device,
.release_device = amd_iommu_release_device,
diff --git a/drivers/iommu/amd/iommu_v2.c b/drivers/iommu/amd/iommu_v2.c
index 29a3a62..cb263252 100644
--- a/drivers/iommu/amd/iommu_v2.c
+++ b/drivers/iommu/amd/iommu_v2.c
@@ -17,6 +17,7 @@
#include <linux/wait.h>
#include <linux/pci.h>
#include <linux/gfp.h>
+#include <linux/cc_platform.h>
#include "amd_iommu.h"
@@ -743,7 +744,7 @@
* When memory encryption is active the device is likely not in a
* direct-mapped domain. Forbid using IOMMUv2 functionality for now.
*/
- if (mem_encrypt_active())
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
return -ENODEV;
if (!amd_iommu_v2_supported())
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d06dbf0..d672620 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -25,6 +25,7 @@
#include <linux/property.h>
#include <linux/fsl/mc.h>
#include <linux/module.h>
+#include <linux/cc_platform.h>
#include <trace/events/iommu.h>
static struct kset *iommu_group_kset;
@@ -130,7 +131,7 @@
else
iommu_set_default_translated(false);
- if (iommu_default_passthrough() && mem_encrypt_active()) {
+ if (iommu_default_passthrough() && cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
pr_info("Memory encryption detected - Disabling default IOMMU Passthrough\n");
iommu_set_default_translated(false);
}
diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index 08f4c05..b39bca5 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -14,6 +14,7 @@
#include "gve_desc.h"
#include "gve_desc_dqo.h"
+#include "gve_register.h"
#ifndef PCI_VENDOR_ID_GOOGLE
#define PCI_VENDOR_ID_GOOGLE 0x1ae0
@@ -42,13 +43,45 @@
#define GVE_DATA_SLOT_ADDR_PAGE_MASK (~(PAGE_SIZE - 1))
+// TX timeout period to check the miss path
+#define GVE_TX_TIMEOUT_PERIOD 1 * HZ
+
/* PTYPEs are always 10 bits. */
#define GVE_NUM_PTYPES 1024
#define GVE_RX_BUFFER_SIZE_DQO 2048
+#define GVE_MIN_RX_BUFFER_SIZE 2048
+#define GVE_MAX_RX_BUFFER_SIZE 4096
+
+#define GVE_RSS_KEY_SIZE 40
+#define GVE_RSS_INDIR_SIZE 128
+
+#define GVE_HEADER_BUFFER_SIZE_MIN 64
+#define GVE_HEADER_BUFFER_SIZE_MAX 256
+#define GVE_HEADER_BUFFER_SIZE_DEFAULT 128
#define GVE_GQ_TX_MIN_PKT_DESC_BYTES 182
+#define DQO_QPL_DEFAULT_TX_PAGES 512
+#define DQO_QPL_DEFAULT_RX_PAGES 2048
+
+/* Maximum TSO size supported on DQO */
+#define GVE_DQO_TX_MAX 0x3FFFF
+
+#define GVE_TX_BUF_SHIFT_DQO 11
+
+/* 2K buffers for DQO-QPL */
+#define GVE_TX_BUF_SIZE_DQO BIT(GVE_TX_BUF_SHIFT_DQO)
+#define GVE_TX_BUFS_PER_PAGE_DQO (PAGE_SIZE >> GVE_TX_BUF_SHIFT_DQO)
+#define GVE_MAX_TX_BUFS_PER_PKT (DIV_ROUND_UP(GVE_DQO_TX_MAX, GVE_TX_BUF_SIZE_DQO))
+
+/* If number of free/recyclable buffers are less than this threshold; driver
+ * allocs and uses a non-qpl page on the receive path of DQO QPL to free
+ * up buffers.
+ * Value is set big enough to post at least 3 64K LRO packet via 2K buffer to NIC.
+ */
+#define GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD 96
+
/* Each slot in the desc ring has a 1:1 mapping to a slot in the data ring */
struct gve_rx_desc_queue {
struct gve_rx_desc *desc_ring; /* the descriptor ring */
@@ -62,7 +95,8 @@
void *page_address;
u32 page_offset; /* offset to write to in page */
int pagecnt_bias; /* expected pagecnt if only the driver has a ref */
- u8 can_flip;
+ u16 pad; /* adjustment for rx padding */
+ u8 can_flip; /* tracks if the networking stack is using the page */
};
/* A list of pages registered with the device during setup and used by a queue
@@ -121,6 +155,11 @@
u32 mask; /* Mask for indices to the size of the ring */
};
+struct gve_header_buf {
+ u8 *data;
+ dma_addr_t addr;
+};
+
/* Stores state for tracking buffers posted to HW */
struct gve_rx_buf_state_dqo {
/* The page posted to HW. */
@@ -134,6 +173,9 @@
*/
u32 last_single_ref_offset;
+ /* Pointer to the header buffer when header-split is active */
+ struct gve_header_buf *hdr_buf;
+
/* Linked list index to next element in the list, or -1 if none */
s16 next;
};
@@ -144,6 +186,26 @@
s16 tail;
};
+/* A single received packet split across multiple buffers may be
+ * reconstructed using the information in this structure.
+ */
+struct gve_rx_ctx {
+ /* head and tail of skb chain for the current packet or NULL if none */
+ struct sk_buff *skb_head;
+ struct sk_buff *skb_tail;
+ u32 total_size;
+ u8 frag_cnt;
+ bool drop_pkt;
+};
+
+struct gve_rx_cnts {
+ u32 ok_pkt_bytes;
+ u16 ok_pkt_cnt;
+ u16 total_pkt_cnt;
+ u16 cont_pkt_cnt;
+ u16 desc_err_pkt_cnt;
+};
+
/* Contains datapath state used to represent an RX queue. */
struct gve_rx_ring {
struct gve_priv *gve;
@@ -155,6 +217,11 @@
/* threshold for posting new buffs and descs */
u32 db_threshold;
+ u16 packet_buffer_size;
+
+ u32 qpl_copy_pool_mask;
+ u32 qpl_copy_pool_head;
+ struct gve_rx_slot_page_info *qpl_copy_pool;
};
/* DQO fields. */
@@ -189,33 +256,57 @@
* which cannot be reused yet.
*/
struct gve_index_list used_buf_states;
+
+ /* Array of buffers for header-split */
+ struct gve_header_buf *hdr_bufs;
+
+ /* qpl assigned to this queue */
+ struct gve_queue_page_list *qpl;
+
+ /* index into queue page list */
+ u32 next_qpl_page_idx;
+
+ /* track number of used buffers */
+ u16 used_buf_states_cnt;
} dqo;
};
u64 rbytes; /* free-running bytes received */
+ u64 rheader_bytes; /* free-running header bytes received */
u64 rpackets; /* free-running packets received */
u32 cnt; /* free-running total number of completed packets */
u32 fill_cnt; /* free-running total number of descs and buffs posted */
u32 mask; /* masks the cnt and fill_cnt to the size of the ring */
+ u32 rx_dmabuf_bound; /* rx queue bound to dmabuf */
+ u64 rx_hsplit_pkt; /* free-running packets with headers split */
+ u64 rx_hsplit_hbo_pkt; /* free-running packets with header buffer overflow */
+ u64 rx_devmem_pkt; /* devmem packets processed */
+ u64 rx_devmem_dropped; /* devmem pkts dropped */
u64 rx_copybreak_pkt; /* free-running count of copybreak packets */
u64 rx_copied_pkt; /* free-running total number of copied packets */
u64 rx_skb_alloc_fail; /* free-running count of skb alloc fails */
u64 rx_buf_alloc_fail; /* free-running count of buffer alloc fails */
u64 rx_desc_err_dropped_pkt; /* free-running count of packets dropped by descriptor error */
+ /* free-running count of packets dropped by header-split overflow */
+ u64 rx_hsplit_err_dropped_pkt;
+ u64 rx_cont_packet_cnt; /* free-running multi-fragment packets received */
+ u64 rx_frag_flip_cnt; /* free-running count of rx segments where page_flip was used */
+ u64 rx_frag_copy_cnt; /* free-running count of rx segments copied */
+ u64 rx_frag_alloc_cnt; /* free-running count of rx page allocations */
+
u32 q_num; /* queue index */
u32 ntfy_id; /* notification block index */
struct gve_queue_resources *q_resources; /* head and tail pointer idx */
dma_addr_t q_resources_bus; /* dma address for the queue resources */
struct u64_stats_sync statss; /* sync stats for 32bit archs */
- /* head and tail of skb chain for the current packet or NULL if none */
- struct sk_buff *skb_head;
- struct sk_buff *skb_tail;
+ struct gve_rx_ctx ctx; /* Info for packet currently being processed in this ring. */
};
/* A TX desc ring entry */
union gve_tx_desc {
struct gve_tx_pkt_desc pkt; /* first desc for a packet */
+ struct gve_tx_mtd_desc mtd; /* optional metadata descriptor */
struct gve_tx_seg_desc seg; /* subsequent descs for a packet */
};
@@ -280,8 +371,14 @@
* All others correspond to `skb`'s frags and should be unmapped with
* `dma_unmap_page`.
*/
- DEFINE_DMA_UNMAP_ADDR(dma[MAX_SKB_FRAGS + 1]);
- DEFINE_DMA_UNMAP_LEN(len[MAX_SKB_FRAGS + 1]);
+ union {
+ struct {
+ DEFINE_DMA_UNMAP_ADDR(dma[MAX_SKB_FRAGS + 1]);
+ DEFINE_DMA_UNMAP_LEN(len[MAX_SKB_FRAGS + 1]);
+ };
+ s16 tx_qpl_buf_ids[GVE_MAX_TX_BUFS_PER_PKT];
+ };
+
u16 num_bufs;
/* Linked list index to next element in the list, or -1 if none */
@@ -336,6 +433,32 @@
* set.
*/
u32 last_re_idx;
+
+ /* free running number of packet buf descriptors posted */
+ u16 posted_packet_desc_cnt;
+ /* free running number of packet buf descriptors completed */
+ u16 completed_packet_desc_cnt;
+
+ /* QPL fields */
+ struct {
+ /* Linked list of gve_tx_buf_dqo. Index into
+ * tx_qpl_buf_next, or -1 if empty.
+ *
+ * This is a consumer list owned by the TX path. When it
+ * runs out, the producer list is stolen from the
+ * completion handling path
+ * (dqo_compl.free_tx_qpl_buf_head).
+ */
+ s16 free_tx_qpl_buf_head;
+
+ /* Free running count of the number of QPL tx buffers
+ * allocated
+ */
+ u32 alloc_tx_qpl_buf_cnt;
+
+ /* Cached value of `dqo_compl.free_tx_qpl_buf_cnt` */
+ u32 free_tx_qpl_buf_cnt;
+ };
} dqo_tx;
};
@@ -343,8 +466,8 @@
union {
/* GQI fields */
struct {
- /* NIC tail pointer */
- __be32 last_nic_done;
+ /* Spinlock for when cleanup in progress */
+ spinlock_t clean_lock;
};
/* DQO fields. */
@@ -354,6 +477,10 @@
/* Tracks the current gen bit of compl_q */
u8 cur_gen_bit;
+ /* the jiffies when last TX completion was processed*/
+ unsigned long last_processed;
+ bool kicked;
+
/* Linked list of gve_tx_pending_packet_dqo. Index into
* pending_packets, or -1 if empty.
*
@@ -377,6 +504,24 @@
* reached a specified timeout.
*/
struct gve_index_list timed_out_completions;
+
+ /* QPL fields */
+ struct {
+ /* Linked list of gve_tx_buf_dqo. Index into
+ * tx_qpl_buf_next, or -1 if empty.
+ *
+ * This is the producer list, owned by the completion
+ * handling path. When the consumer list
+ * (dqo_tx.free_tx_qpl_buf_head) is runs out, this list
+ * will be stolen.
+ */
+ atomic_t free_tx_qpl_buf_head;
+
+ /* Free running count of the number of tx buffers
+ * freed
+ */
+ atomic_t free_tx_qpl_buf_cnt;
+ };
} dqo_compl;
} ____cacheline_aligned;
u64 pkt_done; /* free-running - total packets completed */
@@ -403,6 +548,21 @@
s16 num_pending_packets;
u32 complq_mask; /* complq size is complq_mask + 1 */
+
+ /* QPL fields */
+ struct {
+ /* qpl assigned to this queue */
+ struct gve_queue_page_list *qpl;
+
+ /* Each QPL page is divided into TX bounce buffers
+ * of size GVE_TX_BUF_SIZE_DQO. tx_qpl_buf_next is
+ * an array to manage linked lists of TX buffers.
+ * An entry j at index i implies that j'th buffer
+ * is next on the list after i
+ */
+ s16 *tx_qpl_buf_next;
+ u32 num_tx_qpl_bufs;
+ };
} dqo;
} ____cacheline_aligned;
struct netdev_queue *netdev_txq;
@@ -428,13 +588,13 @@
* associated with that irq.
*/
struct gve_notify_block {
- __be32 irq_db_index; /* idx into Bar2 - set by device, must be 1st */
+ __be32 *irq_db_index; /* pointer to idx into Bar2 */
char name[IFNAMSIZ + 16]; /* name registered with the kernel */
struct napi_struct napi; /* kernel napi struct for this block */
struct gve_priv *priv;
struct gve_tx_ring *tx; /* tx rings on this block */
struct gve_rx_ring *rx; /* rx rings on this block */
-} ____cacheline_aligned;
+};
/* Tracks allowed and current queue settings */
struct gve_queue_config {
@@ -453,6 +613,10 @@
u16 rx_buff_ring_entries; /* number of rx_buff descriptors */
};
+struct gve_irq_db {
+ __be32 index;
+} ____cacheline_aligned;
+
struct gve_ptype {
u8 l3_type; /* `gve_l3_type` in gve_adminq.h */
u8 l4_type; /* `gve_l4_type` in gve_adminq.h */
@@ -462,6 +626,19 @@
struct gve_ptype ptypes[GVE_NUM_PTYPES];
};
+enum gve_rss_hash_alg {
+ GVE_RSS_HASH_UNDEFINED = 0,
+ GVE_RSS_HASH_TOEPLITZ = 1,
+};
+
+struct gve_rss_config {
+ enum gve_rss_hash_alg alg;
+ u16 key_size;
+ u16 indir_size;
+ u8 *key;
+ u32 *indir;
+};
+
/* GVE_QUEUE_FORMAT_UNSPECIFIED must be zero since 0 is the default value
* when the entire configure_device_resources command is zeroed out and the
* queue_format is not specified.
@@ -471,6 +648,32 @@
GVE_GQI_RDA_FORMAT = 0x1,
GVE_GQI_QPL_FORMAT = 0x2,
GVE_DQO_RDA_FORMAT = 0x3,
+ GVE_DQO_QPL_FORMAT = 0x4,
+};
+
+struct gve_flow_spec {
+ __be32 src_ip[4];
+ __be32 dst_ip[4];
+ union {
+ struct {
+ __be16 src_port;
+ __be16 dst_port;
+ };
+ __be32 spi;
+ };
+ union {
+ u8 tos;
+ u8 tclass;
+ };
+};
+
+struct gve_flow_rule {
+ struct list_head list;
+ u16 loc;
+ u16 flow_type;
+ u16 action;
+ struct gve_flow_spec key;
+ struct gve_flow_spec mask;
};
struct gve_priv {
@@ -479,7 +682,8 @@
struct gve_rx_ring *rx; /* array of rx_cfg.num_queues */
struct gve_queue_page_list *qpls; /* array of num qpls */
struct gve_notify_block *ntfy_blocks; /* array of num_ntfy_blks */
- dma_addr_t ntfy_block_bus;
+ struct gve_irq_db *irq_db_indices; /* array of num_ntfy_blks */
+ dma_addr_t irq_db_indices_bus;
struct msix_entry *msix_vectors; /* array of num_ntfy_blks + 1 */
char mgmt_msix_name[IFNAMSIZ + 16];
u32 mgmt_msix_idx;
@@ -489,7 +693,8 @@
u16 num_event_counters;
u16 tx_desc_cnt; /* num desc per ring */
u16 rx_desc_cnt; /* num desc per ring */
- u16 tx_pages_per_qpl; /* tx buffer length */
+ u16 tx_pages_per_qpl; /* Suggested number of pages per qpl for TX queues by NIC */
+ u16 rx_pages_per_qpl; /* Suggested number of pages per qpl for RX queues by NIC */
u16 rx_data_slot_cnt; /* rx buffer length */
u64 max_registered_pages;
u64 num_registered_pages; /* num pages registered with NIC */
@@ -530,6 +735,9 @@
u32 adminq_report_stats_cnt;
u32 adminq_report_link_speed_cnt;
u32 adminq_get_ptype_map_cnt;
+ u32 adminq_verify_driver_compatibility_cnt;
+ u32 adminq_cfg_flow_rule_cnt;
+ u32 adminq_cfg_rss_cnt;
/* Global stats */
u32 interface_up_cnt; /* count of times interface turned up since last reset */
@@ -538,6 +746,8 @@
u32 page_alloc_fail; /* count of page alloc fails */
u32 dma_mapping_error; /* count of dma mapping errors */
u32 stats_report_trigger_cnt; /* count of device-requested stats-reports since last reset */
+ u32 suspend_cnt; /* count of times suspended */
+ u32 resume_cnt; /* count of times resumed */
struct workqueue_struct *gve_wq;
struct work_struct service_task;
struct work_struct stats_report_task;
@@ -548,20 +758,51 @@
u64 stats_report_len;
dma_addr_t stats_report_bus; /* dma address for the stats report */
unsigned long ethtool_flags;
+ unsigned long ethtool_defaults; /* default flags */
unsigned long stats_report_timer_period;
struct timer_list stats_report_timer;
+ unsigned long tx_timeout_period;
+ /* tx timeout timer for the miss path */
+ struct timer_list tx_timeout_timer;
+
/* Gvnic device link speed from hypervisor. */
u64 link_speed;
+ bool up_before_suspend; /* True if dev was up before suspend */
struct gve_options_dqo_rda options_dqo_rda;
struct gve_ptype_lut *ptype_lut_dqo;
/* Must be a power of two. */
int data_buffer_size_dqo;
+ int dev_max_rx_buffer_size; /* The max rx buffer size that device support*/
enum gve_queue_format queue_format;
+
+ /* Interrupt coalescing settings */
+ u32 tx_coalesce_usecs;
+ u32 rx_coalesce_usecs;
+
+ /* The size of buffers to allocate for the headers.
+ * A non-zero value enables header-split.
+ */
+ u16 header_buf_size;
+ u8 header_split_strict;
+ struct dma_pool *header_buf_pool;
+
+ /* The maximum number of rules for flow-steering.
+ * A non-zero value enables flow-steering.
+ */
+ u16 flow_rules_max;
+ u16 flow_rules_cnt;
+ struct list_head flow_rules;
+ struct mutex flow_rules_lock;
+
+ /* RSS configuration */
+ struct gve_rss_config rss_config;
+
+ enum gve_reset_reason scheduled_reset_reason;
};
enum gve_service_task_flags_bit {
@@ -580,8 +821,17 @@
enum gve_ethtool_flags_bit {
GVE_PRIV_FLAGS_REPORT_STATS = 0,
+ GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT = 1,
+ GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT = 2,
+ GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE = 3,
};
+#define GVE_PRIV_FLAGS_MASK \
+ (BIT(GVE_PRIV_FLAGS_REPORT_STATS) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE))
+
static inline bool gve_get_do_reset(struct gve_priv *priv)
{
return test_bit(GVE_PRIV_FLAGS_DO_RESET, &priv->service_task_flags);
@@ -715,12 +965,22 @@
clear_bit(GVE_PRIV_FLAGS_REPORT_STATS, &priv->ethtool_flags);
}
+static inline bool gve_get_enable_header_split(struct gve_priv *priv)
+{
+ return test_bit(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT, &priv->ethtool_flags);
+}
+
+static inline bool gve_get_enable_max_rx_buffer_size(struct gve_priv *priv)
+{
+ return test_bit(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE, &priv->ethtool_flags);
+}
+
/* Returns the address of the ntfy_blocks irq doorbell
*/
static inline __be32 __iomem *gve_irq_doorbell(struct gve_priv *priv,
struct gve_notify_block *block)
{
- return &priv->db_bar2[be32_to_cpu(block->irq_db_index)];
+ return &priv->db_bar2[be32_to_cpu(*block->irq_db_index)];
}
/* Returns the index into ntfy_blocks of the given tx ring's block
@@ -737,11 +997,17 @@
return (priv->num_ntfy_blks / 2) + queue_idx;
}
+static inline bool gve_is_qpl(struct gve_priv *priv)
+{
+ return priv->queue_format == GVE_GQI_QPL_FORMAT ||
+ priv->queue_format == GVE_DQO_QPL_FORMAT;
+}
+
/* Returns the number of tx queue page lists
*/
static inline u32 gve_num_tx_qpls(struct gve_priv *priv)
{
- if (priv->queue_format != GVE_GQI_QPL_FORMAT)
+ if (!gve_is_qpl(priv))
return 0;
return priv->tx_cfg.num_queues;
@@ -751,7 +1017,7 @@
*/
static inline u32 gve_num_rx_qpls(struct gve_priv *priv)
{
- if (priv->queue_format != GVE_GQI_QPL_FORMAT)
+ if (!gve_is_qpl(priv))
return 0;
return priv->rx_cfg.num_queues;
@@ -814,6 +1080,11 @@
priv->queue_format == GVE_GQI_QPL_FORMAT;
}
+static inline int gve_num_tx_queues(struct gve_priv *priv)
+{
+ return priv->tx_cfg.num_queues;
+}
+
/* buffers */
int gve_alloc_page(struct gve_priv *priv, struct device *dev,
struct page **page, dma_addr_t *dma,
@@ -825,23 +1096,37 @@
bool gve_tx_poll(struct gve_notify_block *block, int budget);
int gve_tx_alloc_rings(struct gve_priv *priv);
void gve_tx_free_rings_gqi(struct gve_priv *priv);
-__be32 gve_tx_load_event_counter(struct gve_priv *priv,
- struct gve_tx_ring *tx);
+u32 gve_tx_load_event_counter(struct gve_priv *priv,
+ struct gve_tx_ring *tx);
+bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx);
/* rx handling */
void gve_rx_write_doorbell(struct gve_priv *priv, struct gve_rx_ring *rx);
-bool gve_rx_poll(struct gve_notify_block *block, int budget);
+int gve_rx_poll(struct gve_notify_block *block, int budget);
+bool gve_rx_work_pending(struct gve_rx_ring *rx);
int gve_rx_alloc_rings(struct gve_priv *priv);
void gve_rx_free_rings_gqi(struct gve_priv *priv);
-bool gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
- netdev_features_t feat);
+int gve_recreate_rx_rings(struct gve_priv *priv);
+int gve_reconfigure_rx_rings(struct gve_priv *priv,
+ bool enable_hdr_split,
+ int packet_buffer_size);
/* Reset */
void gve_schedule_reset(struct gve_priv *priv);
-int gve_reset(struct gve_priv *priv, bool attempt_teardown);
+int gve_reset(struct gve_priv *priv, bool attempt_teardown,
+ enum gve_reset_reason reason);
int gve_adjust_queues(struct gve_priv *priv,
struct gve_queue_config new_rx_config,
struct gve_queue_config new_tx_config);
+int gve_flow_rules_reset(struct gve_priv *priv);
+void gve_flow_rules_release(struct gve_priv *priv);
+
/* report stats handling */
void gve_handle_report_stats(struct gve_priv *priv);
+
+/* RSS support */
+int gve_rss_config_init(struct gve_priv *priv);
+void gve_rss_set_default_indir(struct gve_priv *priv);
+void gve_rss_config_release(struct gve_rss_config *rss_config);
+
/* exported by ethtool.c */
extern const struct ethtool_ops gve_ethtool_ops;
/* needed by ethtool */
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index 54d649e..4438e4e 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -12,7 +12,7 @@
#define GVE_MAX_ADMINQ_RELEASE_CHECK 500
#define GVE_ADMINQ_SLEEP_LEN 20
-#define GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK 100
+#define GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK 1000
#define GVE_DEVICE_OPTION_ERROR_FMT "%s option error:\n" \
"Expected: length=%d, feature_mask=%x.\n" \
@@ -38,7 +38,11 @@
struct gve_device_option *option,
struct gve_device_option_gqi_rda **dev_op_gqi_rda,
struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
- struct gve_device_option_dqo_rda **dev_op_dqo_rda)
+ struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+ struct gve_device_option_jumbo_frames **dev_op_jumbo_frames,
+ struct gve_device_option_buffer_sizes **dev_op_buffer_sizes,
+ struct gve_device_option_flow_steering **dev_op_flow_steering,
+ struct gve_device_option_dqo_qpl **dev_op_dqo_qpl)
{
u32 req_feat_mask = be32_to_cpu(option->required_features_mask);
u16 option_length = be16_to_cpu(option->option_length);
@@ -111,6 +115,78 @@
}
*dev_op_dqo_rda = (void *)(option + 1);
break;
+ case GVE_DEV_OPT_ID_DQO_QPL:
+ if (option_length < sizeof(**dev_op_dqo_qpl) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "DQO QPL", (int)sizeof(**dev_op_dqo_qpl),
+ GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_dqo_qpl)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT, "DQO QPL");
+ }
+ *dev_op_dqo_qpl = (void *)(option + 1);
+ break;
+ case GVE_DEV_OPT_ID_JUMBO_FRAMES:
+ if (option_length < sizeof(**dev_op_jumbo_frames) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "Jumbo Frames",
+ (int)sizeof(**dev_op_jumbo_frames),
+ GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_jumbo_frames)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT,
+ "Jumbo Frames");
+ }
+ *dev_op_jumbo_frames = (void *)(option + 1);
+ break;
+ case GVE_DEV_OPT_ID_BUFFER_SIZES:
+ if (option_length < sizeof(**dev_op_buffer_sizes) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "Buffer Sizes",
+ (int)sizeof(**dev_op_buffer_sizes),
+ GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_buffer_sizes)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT,
+ "Buffer Sizes");
+ }
+ *dev_op_buffer_sizes = (void *)(option + 1);
+ if ((*dev_op_buffer_sizes)->header_buffer_size)
+ priv->ethtool_defaults |= BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ break;
+ case GVE_DEV_OPT_ID_FLOW_STEERING:
+ if (option_length < sizeof(**dev_op_flow_steering) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "Flow Steering",
+ (int)sizeof(**dev_op_flow_steering),
+ GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_flow_steering)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT,
+ "Flow Steering");
+ }
+ *dev_op_flow_steering = (void *)(option + 1);
+ break;
default:
/* If we don't recognize the option just continue
* without doing anything.
@@ -126,7 +202,11 @@
struct gve_device_descriptor *descriptor,
struct gve_device_option_gqi_rda **dev_op_gqi_rda,
struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
- struct gve_device_option_dqo_rda **dev_op_dqo_rda)
+ struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+ struct gve_device_option_jumbo_frames **dev_op_jumbo_frames,
+ struct gve_device_option_buffer_sizes **dev_op_buffer_sizes,
+ struct gve_device_option_flow_steering **dev_op_flow_steering,
+ struct gve_device_option_dqo_qpl **dev_op_dqo_qpl)
{
const int num_options = be16_to_cpu(descriptor->num_device_options);
struct gve_device_option *dev_opt;
@@ -146,7 +226,9 @@
gve_parse_device_option(priv, descriptor, dev_opt,
dev_op_gqi_rda, dev_op_gqi_qpl,
- dev_op_dqo_rda);
+ dev_op_dqo_rda, dev_op_jumbo_frames,
+ dev_op_buffer_sizes, dev_op_flow_steering,
+ dev_op_dqo_qpl);
dev_opt = next_opt;
}
@@ -177,6 +259,7 @@
priv->adminq_report_stats_cnt = 0;
priv->adminq_report_link_speed_cnt = 0;
priv->adminq_get_ptype_map_cnt = 0;
+ priv->adminq_cfg_flow_rule_cnt = 0;
/* Setup Admin queue with the device */
iowrite32be(priv->adminq_bus_addr / PAGE_SIZE,
@@ -269,7 +352,7 @@
case GVE_ADMINQ_COMMAND_ERROR_RESOURCE_EXHAUSTED:
return -ENOMEM;
case GVE_ADMINQ_COMMAND_ERROR_UNIMPLEMENTED:
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
default:
dev_err(&priv->pdev->dev, "parse_aq_err: unknown status code %d\n", status);
return -EINVAL;
@@ -279,14 +362,17 @@
/* Flushes all AQ commands currently queued and waits for them to complete.
* If there are failures, it will return the first error.
*/
-static int gve_adminq_kick_and_wait(struct gve_priv *priv)
+static int gve_adminq_kick_and_wait(struct gve_priv *priv, int ret_cnt, int *ret_codes)
{
int tail, head;
- int i;
+ int i, j;
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
head = priv->adminq_prod_cnt;
+ if ((head - tail) > ret_cnt)
+ return -EINVAL;
+
gve_adminq_kick_cmd(priv, head);
if (!gve_adminq_wait_for_cmd(priv, head)) {
dev_err(&priv->pdev->dev, "AQ commands timed out, need to reset AQ\n");
@@ -294,16 +380,13 @@
return -ENOTRECOVERABLE;
}
- for (i = tail; i < head; i++) {
+ for (i = tail, j = 0; i < head; i++, j++) {
union gve_adminq_command *cmd;
u32 status, err;
cmd = &priv->adminq[i & priv->adminq_mask];
status = be32_to_cpu(READ_ONCE(cmd->status));
- err = gve_adminq_parse_err(priv, status);
- if (err)
- // Return the first error if we failed.
- return err;
+ ret_codes[j] = gve_adminq_parse_err(priv, status);
}
return 0;
@@ -322,30 +405,16 @@
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
// Check if next command will overflow the buffer.
- if (((priv->adminq_prod_cnt + 1) & priv->adminq_mask) ==
- (tail & priv->adminq_mask)) {
- int err;
-
- // Flush existing commands to make room.
- err = gve_adminq_kick_and_wait(priv);
- if (err)
- return err;
-
- // Retry.
- tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
- if (((priv->adminq_prod_cnt + 1) & priv->adminq_mask) ==
- (tail & priv->adminq_mask)) {
- // This should never happen. We just flushed the
- // command queue so there should be enough space.
- return -ENOMEM;
- }
- }
+ if ((priv->adminq_prod_cnt - tail) > priv->adminq_mask)
+ return -ENOMEM;
cmd = &priv->adminq[priv->adminq_prod_cnt & priv->adminq_mask];
priv->adminq_prod_cnt++;
memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
opcode = be32_to_cpu(READ_ONCE(cmd->opcode));
+ if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+ opcode = be32_to_cpu(cmd->extended_command.inner_opcode);
switch (opcode) {
case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -375,6 +444,9 @@
case GVE_ADMINQ_DECONFIGURE_DEVICE_RESOURCES:
priv->adminq_dcfg_device_resources_cnt++;
break;
+ case GVE_ADMINQ_CONFIGURE_RSS:
+ priv->adminq_cfg_rss_cnt++;
+ break;
case GVE_ADMINQ_SET_DRIVER_PARAMETER:
priv->adminq_set_driver_parameter_cnt++;
break;
@@ -387,6 +459,12 @@
case GVE_ADMINQ_GET_PTYPE_MAP:
priv->adminq_get_ptype_map_cnt++;
break;
+ case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
+ priv->adminq_verify_driver_compatibility_cnt++;
+ break;
+ case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+ priv->adminq_cfg_flow_rule_cnt++;
+ break;
default:
dev_err(&priv->pdev->dev, "unknown AQ command opcode %d\n", opcode);
}
@@ -402,8 +480,9 @@
static int gve_adminq_execute_cmd(struct gve_priv *priv,
union gve_adminq_command *cmd_orig)
{
+ int retry_cnt = GVE_ADMINQ_RETRY_COUNT;
u32 tail, head;
- int err;
+ int err, ret;
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
head = priv->adminq_prod_cnt;
@@ -411,13 +490,52 @@
// This is not a valid path
return -EINVAL;
- err = gve_adminq_issue_cmd(priv, cmd_orig);
- if (err)
- return err;
+ do {
+ err = gve_adminq_issue_cmd(priv, cmd_orig);
+ if (err)
+ return err;
- return gve_adminq_kick_and_wait(priv);
+ err = gve_adminq_kick_and_wait(priv, 1, &ret);
+ if (err)
+ return err;
+ } while (ret == -ETIME && retry_cnt-- > 0);
+
+ return ret;
}
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv,
+ uint32_t opcode, size_t cmd_size,
+ void *cmd_orig)
+{
+ union gve_adminq_command cmd;
+ dma_addr_t inner_cmd_bus;
+ void *inner_cmd;
+ int err;
+
+ inner_cmd = dma_alloc_coherent(&priv->pdev->dev, cmd_size,
+ &inner_cmd_bus, GFP_KERNEL);
+ if (!inner_cmd)
+ return -ENOMEM;
+
+ memcpy(inner_cmd, cmd_orig, cmd_size);
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+ cmd.extended_command = (struct gve_adminq_extended_command) {
+ .inner_opcode = cpu_to_be32(opcode),
+ .inner_length = cpu_to_be32(cmd_size),
+ .inner_command_addr = cpu_to_be64(inner_cmd_bus),
+ };
+
+ err = gve_adminq_execute_cmd(priv, &cmd);
+
+ dma_free_coherent(&priv->pdev->dev,
+ cmd_size,
+ inner_cmd, inner_cmd_bus);
+ return err;
+}
+
+
/* The device specifies that the management vector can either be the first irq
* or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
* the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
@@ -442,7 +560,7 @@
.num_counters = cpu_to_be32(num_counters),
.irq_db_addr = cpu_to_be64(db_array_bus_addr),
.num_irq_dbs = cpu_to_be32(num_ntfy_blks),
- .irq_db_stride = cpu_to_be32(sizeof(priv->ntfy_blocks[0])),
+ .irq_db_stride = cpu_to_be32(sizeof(*priv->irq_db_indices)),
.ntfy_blk_msix_base_idx =
cpu_to_be32(GVE_NTFY_BLK_BASE_MSIX_IDX),
.queue_format = priv->queue_format,
@@ -461,6 +579,69 @@
return gve_adminq_execute_cmd(priv, &cmd);
}
+typedef int (gve_adminq_queue_cmd) (struct gve_priv *priv, u32 queue_index);
+
+static int gve_adminq_manage_queues(struct gve_priv *priv,
+ gve_adminq_queue_cmd *cmd,
+ u32 start_id, u32 num_queues) {
+#define QUEUE_DONE -1
+ int retry_cnt = GVE_ADMINQ_RETRY_COUNT;
+ int cmd_idx, queue_idx, code_idx;
+ int queues_waiting[num_queues];
+ int commands[num_queues];
+ int codes[num_queues];
+ int retry_needed;
+ int err;
+
+ for (queue_idx = 0; queue_idx < num_queues; queue_idx++)
+ queues_waiting[queue_idx] = start_id + queue_idx;
+
+ do {
+ retry_needed = 0;
+ queue_idx = 0;
+ while (queue_idx < num_queues) {
+ cmd_idx = 0;
+ while (queue_idx < num_queues) {
+ if (queues_waiting[queue_idx] != QUEUE_DONE) {
+ err = cmd(priv, queues_waiting[queue_idx]);
+ if (err == -ENOMEM)
+ break;
+ if (err)
+ return err;
+ commands[cmd_idx++] = queue_idx;
+ }
+ queue_idx++;
+ }
+
+ if (queue_idx < num_queues)
+ dev_dbg(&priv->pdev->dev,
+ "Issued %d of %d batched commands\n",
+ queue_idx, num_queues);
+
+ err = gve_adminq_kick_and_wait(priv, cmd_idx, codes);
+ if (err)
+ return err;
+
+ for (code_idx = 0; code_idx < cmd_idx; code_idx++) {
+ if (codes[code_idx] == 0)
+ queues_waiting[commands[code_idx]] = QUEUE_DONE;
+ else if (codes[code_idx] != -ETIME)
+ return codes[code_idx];
+ else
+ retry_needed++;
+ }
+
+ if (retry_needed)
+ dev_dbg(&priv->pdev->dev,
+ "Issued %d batched commands, %d needed a retry\n",
+ cmd_idx, retry_needed);
+ }
+ } while (retry_needed && retry_cnt-- > 0);
+
+ return retry_needed ? -ETIME : 0;
+#undef QUEUE_DONE
+}
+
static int gve_adminq_create_tx_queue(struct gve_priv *priv, u32 queue_index)
{
struct gve_tx_ring *tx = &priv->tx[queue_index];
@@ -482,29 +663,33 @@
cmd.create_tx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
} else {
+ u16 comp_ring_size;
+ u32 qpl_id = 0;
+
+ if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ qpl_id = GVE_RAW_ADDRESSING_QPL_ID;
+ comp_ring_size =
+ priv->options_dqo_rda.tx_comp_ring_entries;
+ } else {
+ qpl_id = tx->dqo.qpl->id;
+ comp_ring_size = priv->tx_desc_cnt;
+ }
+ cmd.create_tx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
cmd.create_tx_queue.tx_ring_size =
cpu_to_be16(priv->tx_desc_cnt);
cmd.create_tx_queue.tx_comp_ring_addr =
cpu_to_be64(tx->complq_bus_dqo);
cmd.create_tx_queue.tx_comp_ring_size =
- cpu_to_be16(priv->options_dqo_rda.tx_comp_ring_entries);
+ cpu_to_be16(comp_ring_size);
}
return gve_adminq_issue_cmd(priv, &cmd);
}
-int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues)
+int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_create_tx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_create_tx_queue,
+ start_id, num_queues);
}
static int gve_adminq_create_rx_queue(struct gve_priv *priv, u32 queue_index)
@@ -530,7 +715,20 @@
cpu_to_be64(rx->data.data_bus),
cmd.create_rx_queue.index = cpu_to_be32(queue_index);
cmd.create_rx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
+ cmd.create_rx_queue.packet_buffer_size = cpu_to_be16(rx->packet_buffer_size);
} else {
+ u16 rx_buff_ring_entries;
+ u32 qpl_id = 0;
+
+ if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ qpl_id = GVE_RAW_ADDRESSING_QPL_ID;
+ rx_buff_ring_entries =
+ priv->options_dqo_rda.rx_buff_ring_entries;
+ } else {
+ qpl_id = rx->dqo.qpl->id;
+ rx_buff_ring_entries = priv->rx_desc_cnt;
+ }
+ cmd.create_rx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
cmd.create_rx_queue.rx_ring_size =
cpu_to_be16(priv->rx_desc_cnt);
cmd.create_rx_queue.rx_desc_ring_addr =
@@ -540,9 +738,12 @@
cmd.create_rx_queue.packet_buffer_size =
cpu_to_be16(priv->data_buffer_size_dqo);
cmd.create_rx_queue.rx_buff_ring_size =
- cpu_to_be16(priv->options_dqo_rda.rx_buff_ring_entries);
+ cpu_to_be16(rx_buff_ring_entries);
cmd.create_rx_queue.enable_rsc =
!!(priv->dev->features & NETIF_F_LRO);
+ if (rx->dqo.hdr_bufs)
+ cmd.create_rx_queue.header_buffer_size =
+ cpu_to_be16(priv->header_buf_size);
}
return gve_adminq_issue_cmd(priv, &cmd);
@@ -550,16 +751,8 @@
int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_create_rx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_create_rx_queue,
+ 0, num_queues);
}
static int gve_adminq_destroy_tx_queue(struct gve_priv *priv, u32 queue_index)
@@ -580,18 +773,10 @@
return 0;
}
-int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 num_queues)
+int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_destroy_tx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_destroy_tx_queue,
+ start_id, num_queues);
}
static int gve_adminq_destroy_rx_queue(struct gve_priv *priv, u32 queue_index)
@@ -614,16 +799,8 @@
int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_destroy_rx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_destroy_rx_queue,
+ 0, num_queues);
}
static int gve_set_desc_cnt(struct gve_priv *priv,
@@ -651,21 +828,105 @@
const struct gve_device_option_dqo_rda *dev_op_dqo_rda)
{
priv->tx_desc_cnt = be16_to_cpu(descriptor->tx_queue_entries);
+ priv->rx_desc_cnt = be16_to_cpu(descriptor->rx_queue_entries);
+
+ if (priv->queue_format == GVE_DQO_QPL_FORMAT)
+ return 0;
+
priv->options_dqo_rda.tx_comp_ring_entries =
be16_to_cpu(dev_op_dqo_rda->tx_comp_ring_entries);
- priv->rx_desc_cnt = be16_to_cpu(descriptor->rx_queue_entries);
priv->options_dqo_rda.rx_buff_ring_entries =
be16_to_cpu(dev_op_dqo_rda->rx_buff_ring_entries);
return 0;
}
+static void gve_enable_supported_features(
+ struct gve_priv *priv,
+ u32 supported_features_mask,
+ const struct gve_device_option_jumbo_frames *dev_op_jumbo_frames,
+ const struct gve_device_option_buffer_sizes *dev_op_buffer_sizes,
+ const struct gve_device_option_flow_steering *dev_op_flow_steering,
+ const struct gve_device_option_dqo_qpl *dev_op_dqo_qpl)
+{
+ int buf_size;
+
+ /* Before control reaches this point, the page-size-capped max MTU in
+ * the gve_device_descriptor field has already been stored in
+ * priv->dev->max_mtu. We overwrite it with the true max MTU below.
+ */
+ if (dev_op_jumbo_frames &&
+ (supported_features_mask & GVE_SUP_JUMBO_FRAMES_MASK)) {
+ dev_info(&priv->pdev->dev,
+ "JUMBO FRAMES device option enabled.\n");
+ priv->dev->max_mtu = be16_to_cpu(dev_op_jumbo_frames->max_mtu);
+ }
+
+ /* Override pages for qpl for DQO-QPL */
+ if (dev_op_dqo_qpl) {
+ priv->tx_pages_per_qpl =
+ be16_to_cpu(dev_op_dqo_qpl->tx_pages_per_qpl);
+ priv->rx_pages_per_qpl =
+ be16_to_cpu(dev_op_dqo_qpl->rx_pages_per_qpl);
+ if (priv->tx_pages_per_qpl == 0)
+ priv->tx_pages_per_qpl = DQO_QPL_DEFAULT_TX_PAGES;
+ if (priv->rx_pages_per_qpl == 0)
+ priv->rx_pages_per_qpl = DQO_QPL_DEFAULT_RX_PAGES;
+ }
+
+ priv->data_buffer_size_dqo = GVE_RX_BUFFER_SIZE_DQO;
+ priv->dev_max_rx_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+ priv->header_buf_size = 0;
+
+ if (dev_op_buffer_sizes &&
+ (supported_features_mask & GVE_SUP_BUFFER_SIZES_MASK)) {
+ dev_info(&priv->pdev->dev,
+ "BUFFER SIZES device option enabled.\n");
+ buf_size = be16_to_cpu(dev_op_buffer_sizes->packet_buffer_size);
+ if (buf_size) {
+ priv->dev_max_rx_buffer_size = buf_size;
+ if (priv->dev_max_rx_buffer_size &
+ (priv->dev_max_rx_buffer_size - 1))
+ priv->dev_max_rx_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+ if (priv->dev_max_rx_buffer_size < GVE_MIN_RX_BUFFER_SIZE)
+ priv->dev_max_rx_buffer_size = GVE_MIN_RX_BUFFER_SIZE;
+ if (priv->dev_max_rx_buffer_size > GVE_MAX_RX_BUFFER_SIZE)
+ priv->dev_max_rx_buffer_size = GVE_MAX_RX_BUFFER_SIZE;
+ }
+ buf_size = be16_to_cpu(dev_op_buffer_sizes->header_buffer_size);
+ if (buf_size) {
+ priv->header_buf_size = buf_size;
+ if (priv->header_buf_size & (priv->header_buf_size - 1))
+ priv->header_buf_size =
+ GVE_HEADER_BUFFER_SIZE_DEFAULT;
+ if (priv->header_buf_size < GVE_HEADER_BUFFER_SIZE_MIN)
+ priv->header_buf_size = GVE_HEADER_BUFFER_SIZE_MIN;
+ if (priv->header_buf_size > GVE_HEADER_BUFFER_SIZE_MAX)
+ priv->header_buf_size = GVE_HEADER_BUFFER_SIZE_MAX;
+ }
+ }
+
+ if (dev_op_flow_steering &&
+ (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK)) {
+ dev_info(&priv->pdev->dev,
+ "FLOW STEERING device option enabled.\n");
+ priv->flow_rules_max =
+ be16_to_cpu(dev_op_flow_steering->max_num_rules);
+ }
+
+}
+
int gve_adminq_describe_device(struct gve_priv *priv)
{
+ struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
+ struct gve_device_option_buffer_sizes *dev_op_buffer_sizes = NULL;
+ struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
+ struct gve_device_option_dqo_qpl *dev_op_dqo_qpl = NULL;
struct gve_device_descriptor *descriptor;
+ u32 supported_features_mask = 0;
union gve_adminq_command cmd;
dma_addr_t descriptor_bus;
int err = 0;
@@ -689,7 +950,11 @@
goto free_device_descriptor;
err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
- &dev_op_gqi_qpl, &dev_op_dqo_rda);
+ &dev_op_gqi_qpl, &dev_op_dqo_rda,
+ &dev_op_jumbo_frames,
+ &dev_op_buffer_sizes,
+ &dev_op_flow_steering,
+ &dev_op_dqo_qpl);
if (err)
goto free_device_descriptor;
@@ -697,27 +962,39 @@
* is not set to GqiRda, choose the queue format in a priority order:
* DqoRda, GqiRda, GqiQpl. Use GqiQpl as default.
*/
- if (priv->queue_format == GVE_GQI_RDA_FORMAT) {
- dev_info(&priv->pdev->dev,
- "Driver is running with GQI RDA queue format.\n");
- } else if (dev_op_dqo_rda) {
+ if (dev_op_dqo_rda) {
priv->queue_format = GVE_DQO_RDA_FORMAT;
dev_info(&priv->pdev->dev,
"Driver is running with DQO RDA queue format.\n");
- } else if (dev_op_gqi_rda) {
+ supported_features_mask =
+ be32_to_cpu(dev_op_dqo_rda->supported_features_mask);
+ } else if (dev_op_dqo_qpl) {
+ priv->queue_format = GVE_DQO_QPL_FORMAT;
+ supported_features_mask =
+ be32_to_cpu(dev_op_dqo_qpl->supported_features_mask);
+ } else if (dev_op_gqi_rda) {
priv->queue_format = GVE_GQI_RDA_FORMAT;
dev_info(&priv->pdev->dev,
"Driver is running with GQI RDA queue format.\n");
+ supported_features_mask =
+ be32_to_cpu(dev_op_gqi_rda->supported_features_mask);
+ } else if (priv->queue_format == GVE_GQI_RDA_FORMAT) {
+ dev_info(&priv->pdev->dev,
+ "Driver is running with GQI RDA queue format.\n");
} else {
priv->queue_format = GVE_GQI_QPL_FORMAT;
+ if (dev_op_gqi_qpl)
+ supported_features_mask =
+ be32_to_cpu(dev_op_gqi_qpl->supported_features_mask);
dev_info(&priv->pdev->dev,
"Driver is running with GQI QPL queue format.\n");
}
if (gve_is_gqi(priv)) {
err = gve_set_desc_cnt(priv, descriptor);
} else {
- /* DQO supports LRO. */
+ /* DQO supports LRO and flow-steering */
priv->dev->hw_features |= NETIF_F_LRO;
+ priv->dev->hw_features |= NETIF_F_NTUPLE;
err = gve_set_desc_cnt_dqo(priv, descriptor, dev_op_dqo_rda);
}
if (err)
@@ -746,6 +1023,12 @@
}
priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
+ gve_enable_supported_features(priv, supported_features_mask,
+ dev_op_jumbo_frames,
+ dev_op_buffer_sizes,
+ dev_op_flow_steering,
+ dev_op_dqo_qpl);
+
free_device_descriptor:
dma_free_coherent(&priv->pdev->dev, PAGE_SIZE, descriptor,
descriptor_bus);
@@ -827,6 +1110,22 @@
return gve_adminq_execute_cmd(priv, &cmd);
}
+int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
+ u64 driver_info_len,
+ dma_addr_t driver_info_addr)
+{
+ union gve_adminq_command cmd;
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY);
+ cmd.verify_driver_compatibility = (struct gve_adminq_verify_driver_compatibility) {
+ .driver_info_len = cpu_to_be64(driver_info_len),
+ .driver_info_addr = cpu_to_be64(driver_info_addr),
+ };
+
+ return gve_adminq_execute_cmd(priv, &cmd);
+}
+
int gve_adminq_report_link_speed(struct gve_priv *priv)
{
union gve_adminq_command gvnic_cmd;
@@ -891,3 +1190,166 @@
ptype_map_bus);
return err;
}
+
+static int gve_adminq_configure_flow_rule(struct gve_priv *priv,
+ struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+ return gve_adminq_execute_extended_cmd(priv,
+ GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+ sizeof(struct gve_adminq_configure_flow_rule),
+ flow_rule_cmd);
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_ADD),
+ .loc = cpu_to_be16(rule->loc),
+ .rule = {
+ .flow_type = cpu_to_be16(rule->flow_type),
+ .action = cpu_to_be16(rule->action),
+ .key = {
+ .src_ip = { rule->key.src_ip[0],
+ rule->key.src_ip[1],
+ rule->key.src_ip[2],
+ rule->key.src_ip[3] },
+ .dst_ip = { rule->key.dst_ip[0],
+ rule->key.dst_ip[1],
+ rule->key.dst_ip[2],
+ rule->key.dst_ip[3] },
+ },
+ .mask = {
+ .src_ip = { rule->mask.src_ip[0],
+ rule->mask.src_ip[1],
+ rule->mask.src_ip[2],
+ rule->mask.src_ip[3] },
+ .dst_ip = { rule->mask.dst_ip[0],
+ rule->mask.dst_ip[1],
+ rule->mask.dst_ip[2],
+ rule->mask.dst_ip[3] },
+ },
+ },
+ };
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_SCTPV4:
+ flow_rule_cmd.rule.key.src_port = rule->key.src_port;
+ flow_rule_cmd.rule.key.dst_port = rule->key.dst_port;
+ flow_rule_cmd.rule.key.tos = rule->key.tos;
+ flow_rule_cmd.rule.mask.src_port = rule->mask.src_port;
+ flow_rule_cmd.rule.mask.dst_port = rule->mask.dst_port;
+ flow_rule_cmd.rule.mask.tos = rule->mask.tos;
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_ESPV4:
+ flow_rule_cmd.rule.key.spi = rule->key.spi;
+ flow_rule_cmd.rule.key.tos = rule->key.tos;
+ flow_rule_cmd.rule.mask.spi = rule->mask.spi;
+ flow_rule_cmd.rule.mask.tos = rule->mask.tos;
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ case GVE_FLOW_TYPE_UDPV6:
+ case GVE_FLOW_TYPE_SCTPV6:
+ flow_rule_cmd.rule.key.src_port = rule->key.src_port;
+ flow_rule_cmd.rule.key.dst_port = rule->key.dst_port;
+ flow_rule_cmd.rule.key.tclass = rule->key.tclass;
+ flow_rule_cmd.rule.mask.src_port = rule->mask.src_port;
+ flow_rule_cmd.rule.mask.dst_port = rule->mask.dst_port;
+ flow_rule_cmd.rule.mask.tclass = rule->mask.tclass;
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ case GVE_FLOW_TYPE_ESPV6:
+ flow_rule_cmd.rule.key.spi = rule->key.spi;
+ flow_rule_cmd.rule.key.tclass = rule->key.tclass;
+ flow_rule_cmd.rule.mask.spi = rule->mask.spi;
+ flow_rule_cmd.rule.mask.tclass = rule->mask.tclass;
+ break;
+ }
+
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, int loc)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_DEL),
+ .loc = cpu_to_be16(loc),
+ };
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_RESET),
+ };
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_configure_rss(struct gve_priv *priv,
+ struct gve_rss_config *rss_config)
+{
+ dma_addr_t indir_bus = 0, key_bus = 0;
+ union gve_adminq_command cmd;
+ __be32 *indir = NULL;
+ u8 *key = NULL;
+ int err = 0;
+ int i;
+
+ if (rss_config->indir_size) {
+ indir = dma_alloc_coherent(&priv->pdev->dev,
+ rss_config->indir_size *
+ sizeof(*rss_config->indir),
+ &indir_bus, GFP_KERNEL);
+ if (!indir) {
+ err = -ENOMEM;
+ goto out;
+ }
+ for (i = 0; i < rss_config->indir_size; i++)
+ indir[i] = cpu_to_be32(rss_config->indir[i]);
+ }
+
+ if (rss_config->key_size) {
+ key = dma_alloc_coherent(&priv->pdev->dev,
+ rss_config->key_size *
+ sizeof(*rss_config->key),
+ &key_bus, GFP_KERNEL);
+ if (!key) {
+ err = -ENOMEM;
+ goto out;
+ }
+ memcpy(key, rss_config->key, rss_config->key_size);
+ }
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_CONFIGURE_RSS);
+ cmd.configure_rss = (struct gve_adminq_configure_rss) {
+ .hash_types = cpu_to_be16(GVE_RSS_HASH_TCPV4 |
+ GVE_RSS_HASH_UDPV4 |
+ GVE_RSS_HASH_TCPV6 |
+ GVE_RSS_HASH_UDPV6),
+ .halg = rss_config->alg,
+ .hkey_len = cpu_to_be16(rss_config->key_size),
+ .indir_len = cpu_to_be16(rss_config->indir_size),
+ .hkey_addr = cpu_to_be64(key_bus),
+ .indir_addr = cpu_to_be64(indir_bus),
+ };
+
+ err = gve_adminq_execute_cmd(priv, &cmd);
+
+out:
+ if (indir)
+ dma_free_coherent(&priv->pdev->dev,
+ rss_config->indir_size *
+ sizeof(*rss_config->indir),
+ indir, indir_bus);
+ if (key)
+ dma_free_coherent(&priv->pdev->dev,
+ rss_config->key_size *
+ sizeof(*rss_config->key),
+ key, key_bus);
+ return err;
+}
+
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.h b/drivers/net/ethernet/google/gve/gve_adminq.h
index 3953f6f..0b4acdd 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.h
+++ b/drivers/net/ethernet/google/gve/gve_adminq.h
@@ -20,10 +20,17 @@
GVE_ADMINQ_DESTROY_TX_QUEUE = 0x7,
GVE_ADMINQ_DESTROY_RX_QUEUE = 0x8,
GVE_ADMINQ_DECONFIGURE_DEVICE_RESOURCES = 0x9,
+ GVE_ADMINQ_CONFIGURE_RSS = 0xA,
GVE_ADMINQ_SET_DRIVER_PARAMETER = 0xB,
GVE_ADMINQ_REPORT_STATS = 0xC,
GVE_ADMINQ_REPORT_LINK_SPEED = 0xD,
GVE_ADMINQ_GET_PTYPE_MAP = 0xE,
+ GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY = 0xF,
+
+ /* For commands that are larger than 56 bytes */
+ GVE_ADMINQ_EXTENDED_COMMAND = 0xFF,
+
+ GVE_ADMINQ_CONFIGURE_FLOW_RULE = 0x101,
};
/* Admin queue status codes */
@@ -48,6 +55,11 @@
GVE_ADMINQ_COMMAND_ERROR_UNKNOWN_ERROR = 0xFFFFFFFF,
};
+/* AdminQ commands (that aren't batched) will be retried if they encounter
+ * an recoverable error.
+ */
+#define GVE_ADMINQ_RETRY_COUNT 3
+
#define GVE_ADMINQ_DEVICE_DESCRIPTOR_VERSION 1
/* All AdminQ command structs should be naturally packed. The static_assert
@@ -108,6 +120,38 @@
static_assert(sizeof(struct gve_device_option_dqo_rda) == 8);
+struct gve_device_option_dqo_qpl {
+ __be32 supported_features_mask;
+ __be16 tx_pages_per_qpl;
+ __be16 rx_pages_per_qpl;
+};
+
+static_assert(sizeof(struct gve_device_option_dqo_qpl) == 8);
+
+struct gve_device_option_jumbo_frames {
+ __be32 supported_features_mask;
+ __be16 max_mtu;
+ u8 padding[2];
+};
+
+static_assert(sizeof(struct gve_device_option_jumbo_frames) == 8);
+
+struct gve_device_option_buffer_sizes {
+ __be32 supported_features_mask;
+ __be16 packet_buffer_size;
+ __be16 header_buffer_size;
+};
+
+static_assert(sizeof(struct gve_device_option_buffer_sizes) == 8);
+
+struct gve_device_option_flow_steering {
+ __be32 supported_features_mask;
+ __be16 max_num_rules;
+ u8 padding[2];
+};
+
+static_assert(sizeof(struct gve_device_option_flow_steering) == 8);
+
/* Terminology:
*
* RDA - Raw DMA Addressing - Buffers associated with SKBs are directly DMA
@@ -121,6 +165,10 @@
GVE_DEV_OPT_ID_GQI_RDA = 0x2,
GVE_DEV_OPT_ID_GQI_QPL = 0x3,
GVE_DEV_OPT_ID_DQO_RDA = 0x4,
+ GVE_DEV_OPT_ID_DQO_QPL = 0x7,
+ GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+ GVE_DEV_OPT_ID_BUFFER_SIZES = 0xa,
+ GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
};
enum gve_dev_opt_req_feat_mask {
@@ -128,10 +176,74 @@
GVE_DEV_OPT_REQ_FEAT_MASK_GQI_RDA = 0x0,
GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
+};
+
+enum gve_sup_feature_mask {
+ GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+ GVE_SUP_BUFFER_SIZES_MASK = 1 << 4,
+ GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
};
#define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
+#define GVE_VERSION_STR_LEN 128
+
+enum gve_driver_capbility {
+ gve_driver_capability_gqi_qpl = 0,
+ gve_driver_capability_gqi_rda = 1,
+ gve_driver_capability_dqo_qpl = 2, /* reserved for future use */
+ gve_driver_capability_dqo_rda = 3,
+ gve_driver_capability_alt_miss_compl = 4,
+ gve_driver_capability_flexible_buffer_size = 5,
+};
+
+#define GVE_CAP1(a) BIT((int)a)
+#define GVE_CAP2(a) BIT(((int)a) - 64)
+#define GVE_CAP3(a) BIT(((int)a) - 128)
+#define GVE_CAP4(a) BIT(((int)a) - 192)
+
+#define GVE_DRIVER_CAPABILITY_FLAGS1 \
+ (GVE_CAP1(gve_driver_capability_gqi_qpl) | \
+ GVE_CAP1(gve_driver_capability_gqi_rda) | \
+ GVE_CAP1(gve_driver_capability_dqo_rda) | \
+ GVE_CAP1(gve_driver_capability_alt_miss_compl) | \
+ GVE_CAP1(gve_driver_capability_flexible_buffer_size))
+
+#define GVE_DRIVER_CAPABILITY_FLAGS2 0x0
+#define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
+#define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
+
+struct gve_adminq_extended_command {
+ __be32 inner_opcode;
+ __be32 inner_length;
+ __be64 inner_command_addr;
+};
+static_assert(sizeof(struct gve_adminq_extended_command) == 16);
+
+struct gve_driver_info {
+ u8 os_type; /* 0x01 = Linux */
+ u8 driver_major;
+ u8 driver_minor;
+ u8 driver_sub;
+ __be32 os_version_major;
+ __be32 os_version_minor;
+ __be32 os_version_sub;
+ __be64 driver_capability_flags[4];
+ u8 os_version_str1[GVE_VERSION_STR_LEN];
+ u8 os_version_str2[GVE_VERSION_STR_LEN];
+};
+
+struct gve_adminq_verify_driver_compatibility {
+ __be64 driver_info_len;
+ __be64 driver_info_addr;
+};
+
+static_assert(sizeof(struct gve_adminq_verify_driver_compatibility) == 16);
+
struct gve_adminq_configure_device_resources {
__be64 counter_array;
__be64 irq_db_addr;
@@ -189,7 +301,9 @@
__be16 packet_buffer_size;
__be16 rx_buff_ring_size;
u8 enable_rsc;
- u8 padding[5];
+ u8 padding1;
+ __be16 header_buffer_size;
+ u8 padding2[2];
};
static_assert(sizeof(struct gve_adminq_create_rx_queue) == 56);
@@ -313,6 +427,81 @@
__be64 ptype_map_addr;
};
+/* Flow-steering related definitions */
+enum gve_adminq_flow_rule_cmd {
+ GVE_RULE_ADD = 0,
+ GVE_RULE_DEL = 1,
+ GVE_RULE_RESET = 2,
+};
+
+enum gve_adminq_flow_type {
+ GVE_FLOW_TYPE_TCPV4 = 0,
+ GVE_FLOW_TYPE_UDPV4 = 1,
+ GVE_FLOW_TYPE_SCTPV4 = 2,
+ GVE_FLOW_TYPE_AHV4 = 3,
+ GVE_FLOW_TYPE_ESPV4 = 4,
+ GVE_FLOW_TYPE_TCPV6 = 5,
+ GVE_FLOW_TYPE_UDPV6 = 6,
+ GVE_FLOW_TYPE_SCTPV6 = 7,
+ GVE_FLOW_TYPE_AHV6 = 8,
+ GVE_FLOW_TYPE_ESPV6 = 9,
+};
+
+struct gve_adminq_flow_spec {
+ __be32 src_ip[4];
+ __be32 dst_ip[4];
+ union {
+ struct {
+ __be16 src_port;
+ __be16 dst_port;
+ };
+ __be32 spi;
+ };
+ union {
+ u8 tos;
+ u8 tclass;
+ };
+};
+static_assert(sizeof(struct gve_adminq_flow_spec) == 40);
+
+/* Flow-steering command */
+struct gve_adminq_flow_rule {
+ __be16 flow_type;
+ __be16 action; /* Queue */
+ struct gve_adminq_flow_spec key;
+ struct gve_adminq_flow_spec mask; /* ports can be 0 or 0xffff */
+};
+
+struct gve_adminq_configure_flow_rule {
+ __be16 cmd;
+ __be16 loc;
+ struct gve_adminq_flow_rule rule;
+};
+static_assert(sizeof(struct gve_adminq_configure_flow_rule) == 88);
+
+#define GVE_RSS_HASH_IPV4 BIT(0)
+#define GVE_RSS_HASH_TCPV4 BIT(1)
+#define GVE_RSS_HASH_IPV6 BIT(2)
+#define GVE_RSS_HASH_IPV6_EX BIT(3)
+#define GVE_RSS_HASH_TCPV6 BIT(4)
+#define GVE_RSS_HASH_TCPV6_EX BIT(5)
+#define GVE_RSS_HASH_UDPV4 BIT(6)
+#define GVE_RSS_HASH_UDPV6 BIT(7)
+#define GVE_RSS_HASH_UDPV6_EX BIT(8)
+
+/* RSS configuration command */
+struct gve_adminq_configure_rss {
+ __be16 hash_types;
+ u8 halg; /* hash algorithm */
+ u8 reserved;
+ __be16 hkey_len;
+ __be16 indir_len;
+ __be64 hkey_addr;
+ __be64 indir_addr;
+};
+
+static_assert(sizeof(struct gve_adminq_configure_rss) == 24);
+
union gve_adminq_command {
struct {
__be32 opcode;
@@ -327,10 +516,14 @@
struct gve_adminq_describe_device describe_device;
struct gve_adminq_register_page_list reg_page_list;
struct gve_adminq_unregister_page_list unreg_page_list;
+ struct gve_adminq_configure_rss configure_rss;
struct gve_adminq_set_driver_parameter set_driver_param;
struct gve_adminq_report_stats report_stats;
struct gve_adminq_report_link_speed report_link_speed;
struct gve_adminq_get_ptype_map get_ptype_map;
+ struct gve_adminq_verify_driver_compatibility
+ verify_driver_compatibility;
+ struct gve_adminq_extended_command extended_command;
};
};
u8 reserved[64];
@@ -348,8 +541,8 @@
dma_addr_t db_array_bus_addr,
u32 num_ntfy_blks);
int gve_adminq_deconfigure_device_resources(struct gve_priv *priv);
-int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues);
-int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 queue_id);
+int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
+int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues);
int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 queue_id);
int gve_adminq_register_page_list(struct gve_priv *priv,
@@ -358,7 +551,16 @@
int gve_adminq_set_mtu(struct gve_priv *priv, u64 mtu);
int gve_adminq_report_stats(struct gve_priv *priv, u64 stats_report_len,
dma_addr_t stats_report_addr, u64 interval);
+int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
+ u64 driver_info_len,
+ dma_addr_t driver_info_addr);
+int gve_adminq_configure_rss(struct gve_priv *priv,
+ struct gve_rss_config *config);
int gve_adminq_report_link_speed(struct gve_priv *priv);
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, int loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
struct gve_ptype_lut;
int gve_adminq_get_ptype_map_dqo(struct gve_priv *priv,
diff --git a/drivers/net/ethernet/google/gve/gve_desc.h b/drivers/net/ethernet/google/gve/gve_desc.h
index 05ae630..f4ae9e1 100644
--- a/drivers/net/ethernet/google/gve/gve_desc.h
+++ b/drivers/net/ethernet/google/gve/gve_desc.h
@@ -33,6 +33,14 @@
__be64 seg_addr; /* Base address (see note) of this segment */
} __packed;
+struct gve_tx_mtd_desc {
+ u8 type_flags; /* type is lower 4 bits, subtype upper */
+ u8 path_state; /* state is lower 4 bits, hash type upper */
+ __be16 reserved0;
+ __be32 path_hash;
+ __be64 reserved1;
+} __packed;
+
struct gve_tx_seg_desc {
u8 type_flags; /* type is lower 4 bits, flags upper */
u8 l3_offset; /* TSO: 2 byte units to start of IPH */
@@ -46,6 +54,7 @@
#define GVE_TXD_STD (0x0 << 4) /* Std with Host Address */
#define GVE_TXD_TSO (0x1 << 4) /* TSO with Host Address */
#define GVE_TXD_SEG (0x2 << 4) /* Seg with Host Address */
+#define GVE_TXD_MTD (0x3 << 4) /* Metadata */
/* GVE Transmit Descriptor Flags for Std Pkts */
#define GVE_TXF_L4CSUM BIT(0) /* Need csum offload */
@@ -54,6 +63,17 @@
/* GVE Transmit Descriptor Flags for TSO Segs */
#define GVE_TXSF_IPV6 BIT(1) /* IPv6 TSO */
+/* GVE Transmit Descriptor Options for MTD Segs */
+#define GVE_MTD_SUBTYPE_PATH 0
+
+#define GVE_MTD_PATH_STATE_DEFAULT 0
+#define GVE_MTD_PATH_STATE_TIMEOUT 1
+#define GVE_MTD_PATH_STATE_CONGESTION 2
+#define GVE_MTD_PATH_STATE_RETRANSMIT 3
+
+#define GVE_MTD_PATH_HASH_NONE (0x0 << 4)
+#define GVE_MTD_PATH_HASH_L4 (0x1 << 4)
+
/* GVE Receive Packet Descriptor */
/* The start of an ethernet packet comes 2 bytes into the rx buffer.
* gVNIC adds this padding so that both the DMA and the L3/4 protocol header
@@ -90,12 +110,13 @@
/* GVE Recive Packet Descriptor Flags */
#define GVE_RXFLG(x) cpu_to_be16(1 << (3 + (x)))
-#define GVE_RXF_FRAG GVE_RXFLG(3) /* IP Fragment */
-#define GVE_RXF_IPV4 GVE_RXFLG(4) /* IPv4 */
-#define GVE_RXF_IPV6 GVE_RXFLG(5) /* IPv6 */
-#define GVE_RXF_TCP GVE_RXFLG(6) /* TCP Packet */
-#define GVE_RXF_UDP GVE_RXFLG(7) /* UDP Packet */
-#define GVE_RXF_ERR GVE_RXFLG(8) /* Packet Error Detected */
+#define GVE_RXF_FRAG GVE_RXFLG(3) /* IP Fragment */
+#define GVE_RXF_IPV4 GVE_RXFLG(4) /* IPv4 */
+#define GVE_RXF_IPV6 GVE_RXFLG(5) /* IPv6 */
+#define GVE_RXF_TCP GVE_RXFLG(6) /* TCP Packet */
+#define GVE_RXF_UDP GVE_RXFLG(7) /* UDP Packet */
+#define GVE_RXF_ERR GVE_RXFLG(8) /* Packet Error Detected */
+#define GVE_RXF_PKT_CONT GVE_RXFLG(10) /* Multi Fragment RX packet */
/* GVE IRQ */
#define GVE_IRQ_ACK BIT(31)
diff --git a/drivers/net/ethernet/google/gve/gve_desc_dqo.h b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
index e8fe9ad..f79cd05 100644
--- a/drivers/net/ethernet/google/gve/gve_desc_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
@@ -176,6 +176,11 @@
#define GVE_COMPL_TYPE_DQO_MISS 0x1 /* Miss path completion */
#define GVE_COMPL_TYPE_DQO_REINJECTION 0x3 /* Re-injection completion */
+/* The most significant bit in the completion tag can change the completion
+ * type from packet completion to miss path completion.
+ */
+#define GVE_ALT_MISS_COMPL_BIT BIT(15)
+
/* Descriptor to post buffers to HW on buffer queue. */
struct gve_rx_desc_dqo {
__le16 buf_id; /* ID returned in Rx completion descriptor */
diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
index 8360423..82c2c73 100644
--- a/drivers/net/ethernet/google/gve/gve_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_dqo.h
@@ -18,6 +18,7 @@
#define GVE_TX_IRQ_RATELIMIT_US_DQO 50
#define GVE_RX_IRQ_RATELIMIT_US_DQO 20
+#define GVE_MAX_ITR_INTERVAL_DQO (GVE_ITR_INTERVAL_DQO_MASK * 2)
/* Timeout in seconds to wait for a reinjection completion after receiving
* its corresponding miss completion.
@@ -32,16 +33,23 @@
#define GVE_DEALLOCATE_COMPL_TIMEOUT 60
netdev_tx_t gve_tx_dqo(struct sk_buff *skb, struct net_device *dev);
+netdev_features_t gve_features_check_dqo(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features);
+
bool gve_tx_poll_dqo(struct gve_notify_block *block, bool do_clean);
int gve_rx_poll_dqo(struct gve_notify_block *block, int budget);
+bool gve_tx_work_pending_dqo(struct gve_tx_ring *tx);
int gve_tx_alloc_rings_dqo(struct gve_priv *priv);
void gve_tx_free_rings_dqo(struct gve_priv *priv);
int gve_rx_alloc_rings_dqo(struct gve_priv *priv);
void gve_rx_free_rings_dqo(struct gve_priv *priv);
+void gve_rx_reset_rings_dqo(struct gve_priv *priv);
int gve_clean_tx_done_dqo(struct gve_priv *priv, struct gve_tx_ring *tx,
struct napi_struct *napi);
void gve_rx_post_buffers_dqo(struct gve_rx_ring *rx);
void gve_rx_write_doorbell_dqo(const struct gve_priv *priv, int queue_idx);
+int gve_rx_handle_hdr_resources_dqo(struct gve_priv *priv, bool enable_hdr_split);
static inline void
gve_tx_put_doorbell_dqo(const struct gve_priv *priv,
@@ -50,21 +58,25 @@
u64 index;
index = be32_to_cpu(q_resources->db_index);
+ /* Ensure that all the TX descriptor writes are flushed */
+ dma_wmb();
iowrite32(val, &priv->db_bar2[index]);
+ /* Make sure that the doorbell write is sent to NIC */
+ wmb();
}
/* Builds register value to write to DQO IRQ doorbell to enable with specified
- * ratelimit.
+ * ITR interval.
*/
-static inline u32 gve_set_itr_ratelimit_dqo(u32 ratelimit_us)
+static inline u32 gve_setup_itr_interval_dqo(u32 interval_us)
{
u32 result = GVE_ITR_ENABLE_BIT_DQO;
/* Interval has 2us granularity. */
- ratelimit_us >>= 1;
+ interval_us >>= 1;
- ratelimit_us &= GVE_ITR_INTERVAL_DQO_MASK;
- result |= (ratelimit_us << GVE_ITR_INTERVAL_DQO_SHIFT);
+ interval_us &= GVE_ITR_INTERVAL_DQO_MASK;
+ result |= (interval_us << GVE_ITR_INTERVAL_DQO_SHIFT);
return result;
}
@@ -73,9 +85,20 @@
gve_write_irq_doorbell_dqo(const struct gve_priv *priv,
const struct gve_notify_block *block, u32 val)
{
- u32 index = be32_to_cpu(block->irq_db_index);
+ u32 index = be32_to_cpu(*block->irq_db_index);
iowrite32(val, &priv->db_bar2[index]);
}
+/* Sets interrupt throttling interval and enables interrupt
+ * by writing to IRQ doorbell.
+ */
+static inline void
+gve_set_itr_coalesce_usecs_dqo(struct gve_priv *priv,
+ struct gve_notify_block *block,
+ u32 usecs)
+{
+ gve_write_irq_doorbell_dqo(priv, block,
+ gve_setup_itr_interval_dqo(usecs));
+}
#endif /* _GVE_DQO_H_ */
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
index 1f8cc72..fbb30bd 100644
--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
+++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
@@ -8,6 +8,7 @@
#include <linux/rtnetlink.h>
#include "gve.h"
#include "gve_adminq.h"
+#include "gve_dqo.h"
static void gve_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *info)
@@ -34,22 +35,26 @@
}
static const char gve_gstrings_main_stats[][ETH_GSTRING_LEN] = {
- "rx_packets", "tx_packets", "rx_bytes", "tx_bytes",
- "rx_dropped", "tx_dropped", "tx_timeouts",
- "rx_skb_alloc_fail", "rx_buf_alloc_fail", "rx_desc_err_dropped_pkt",
+ "rx_packets", "rx_packets_sph", "rx_packets_hbo", "rx_devmem_pkts",
+ "rx_devmem_dropped", "tx_packets", "rx_bytes", "tx_bytes", "rx_dropped",
+ "tx_dropped", "tx_timeouts", "rx_skb_alloc_fail", "rx_buf_alloc_fail",
+ "rx_desc_err_dropped_pkt", "rx_hsplit_err_dropped_pkt",
"interface_up_cnt", "interface_down_cnt", "reset_cnt",
"page_alloc_fail", "dma_mapping_error", "stats_report_trigger_cnt",
};
static const char gve_gstrings_rx_stats[][ETH_GSTRING_LEN] = {
- "rx_posted_desc[%u]", "rx_completed_desc[%u]", "rx_bytes[%u]",
+ "rx_posted_desc[%u]", "rx_completed_desc[%u]", "rx_consumed_desc[%u]",
+ "rx_bytes[%u]", "rx_dmabuf_bound[%u]", "rx_header_bytes[%u]",
+ "rx_cont_packet_cnt[%u]", "rx_frag_flip_cnt[%u]", "rx_frag_copy_cnt[%u]",
+ "rx_frag_alloc_cnt[%u]",
"rx_dropped_pkt[%u]", "rx_copybreak_pkt[%u]", "rx_copied_pkt[%u]",
"rx_queue_drop_cnt[%u]", "rx_no_buffers_posted[%u]",
"rx_drops_packet_over_mru[%u]", "rx_drops_invalid_checksum[%u]",
};
static const char gve_gstrings_tx_stats[][ETH_GSTRING_LEN] = {
- "tx_posted_desc[%u]", "tx_completed_desc[%u]", "tx_bytes[%u]",
+ "tx_posted_desc[%u]", "tx_completed_desc[%u]", "tx_consumed_desc[%u]", "tx_bytes[%u]",
"tx_wake[%u]", "tx_stop[%u]", "tx_event_counter[%u]",
"tx_dma_mapping_error[%u]",
};
@@ -61,11 +66,13 @@
"adminq_create_tx_queue_cnt", "adminq_create_rx_queue_cnt",
"adminq_destroy_tx_queue_cnt", "adminq_destroy_rx_queue_cnt",
"adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt",
- "adminq_report_stats_cnt", "adminq_report_link_speed_cnt"
+ "adminq_report_stats_cnt", "adminq_report_link_speed_cnt",
+ "adminq_cfg_flow_rule", "adminq_cfg_rss_cnt"
};
static const char gve_gstrings_priv_flags[][ETH_GSTRING_LEN] = {
- "report-stats",
+ "report-stats", "enable-header-split", "enable-strict-header-split",
+ "enable-max-rx-buffer-size"
};
#define GVE_MAIN_STATS_LEN ARRAY_SIZE(gve_gstrings_main_stats)
@@ -138,10 +145,16 @@
gve_get_ethtool_stats(struct net_device *netdev,
struct ethtool_stats *stats, u64 *data)
{
- u64 tmp_rx_pkts, tmp_rx_bytes, tmp_rx_skb_alloc_fail, tmp_rx_buf_alloc_fail,
- tmp_rx_desc_err_dropped_pkt, tmp_tx_pkts, tmp_tx_bytes;
- u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_pkts,
- rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes;
+ u64 tmp_rx_pkts, tmp_rx_pkts_sph, tmp_rx_pkts_hbo, tmp_rx_devmem_pkt,
+ tmp_rx_devmem_dropped, tmp_rx_bytes,
+ tmp_rx_hbytes, tmp_rx_skb_alloc_fail, tmp_rx_buf_alloc_fail,
+ tmp_rx_desc_err_dropped_pkt, tmp_rx_hsplit_err_dropped_pkt,
+ tmp_tx_pkts, tmp_tx_bytes;
+
+ u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_err_dropped_pkt,
+ rx_pkts, rx_pkts_sph, rx_pkts_hbo, rx_devmem_pkt, rx_devmem_dropped,
+ rx_skb_alloc_fail, rx_bytes,
+ tx_pkts, tx_bytes, tx_dropped;
int stats_idx, base_stats_idx, max_stats_idx;
struct stats *report_stats;
int *rx_qid_to_stats_idx;
@@ -166,60 +179,78 @@
kfree(rx_qid_to_stats_idx);
return;
}
- for (rx_pkts = 0, rx_bytes = 0, rx_skb_alloc_fail = 0,
- rx_buf_alloc_fail = 0, rx_desc_err_dropped_pkt = 0, ring = 0;
+ for (rx_pkts = 0, rx_bytes = 0, rx_pkts_sph = 0, rx_pkts_hbo = 0,
+ rx_devmem_pkt = 0, rx_devmem_dropped = 0,
+ rx_skb_alloc_fail = 0, rx_buf_alloc_fail = 0,
+ rx_desc_err_dropped_pkt = 0, rx_hsplit_err_dropped_pkt = 0,
+ ring = 0;
ring < priv->rx_cfg.num_queues; ring++) {
if (priv->rx) {
do {
struct gve_rx_ring *rx = &priv->rx[ring];
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
tmp_rx_pkts = rx->rpackets;
+ tmp_rx_pkts_sph = rx->rx_hsplit_pkt;
+ tmp_rx_pkts_hbo = rx->rx_hsplit_hbo_pkt;
+ tmp_rx_devmem_pkt = rx->rx_devmem_pkt;
+ tmp_rx_devmem_dropped = rx->rx_devmem_dropped;
tmp_rx_bytes = rx->rbytes;
tmp_rx_skb_alloc_fail = rx->rx_skb_alloc_fail;
tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail;
tmp_rx_desc_err_dropped_pkt =
rx->rx_desc_err_dropped_pkt;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ tmp_rx_hsplit_err_dropped_pkt =
+ rx->rx_hsplit_err_dropped_pkt;
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
rx_pkts += tmp_rx_pkts;
+ rx_pkts_sph += tmp_rx_pkts_sph;
+ rx_pkts_hbo += tmp_rx_pkts_hbo;
+ rx_devmem_pkt += tmp_rx_devmem_pkt;
+ rx_devmem_dropped += tmp_rx_devmem_dropped;
rx_bytes += tmp_rx_bytes;
rx_skb_alloc_fail += tmp_rx_skb_alloc_fail;
rx_buf_alloc_fail += tmp_rx_buf_alloc_fail;
rx_desc_err_dropped_pkt += tmp_rx_desc_err_dropped_pkt;
+ rx_hsplit_err_dropped_pkt += tmp_rx_hsplit_err_dropped_pkt;
}
}
- for (tx_pkts = 0, tx_bytes = 0, ring = 0;
+ for (tx_pkts = 0, tx_bytes = 0, tx_dropped = 0, ring = 0;
ring < priv->tx_cfg.num_queues; ring++) {
if (priv->tx) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
tmp_tx_pkts = priv->tx[ring].pkt_done;
tmp_tx_bytes = priv->tx[ring].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
tx_pkts += tmp_tx_pkts;
tx_bytes += tmp_tx_bytes;
+ tx_dropped += priv->tx[ring].dropped_pkt;
}
}
i = 0;
data[i++] = rx_pkts;
+ data[i++] = rx_pkts_sph;
+ data[i++] = rx_pkts_hbo;
+ data[i++] = rx_devmem_pkt;
+ data[i++] = rx_devmem_dropped;
data[i++] = tx_pkts;
data[i++] = rx_bytes;
data[i++] = tx_bytes;
/* total rx dropped packets */
data[i++] = rx_skb_alloc_fail + rx_buf_alloc_fail +
rx_desc_err_dropped_pkt;
- /* Skip tx_dropped */
- i++;
-
+ data[i++] = tx_dropped;
data[i++] = priv->tx_timeo_cnt;
data[i++] = rx_skb_alloc_fail;
data[i++] = rx_buf_alloc_fail;
data[i++] = rx_desc_err_dropped_pkt;
+ data[i++] = rx_hsplit_err_dropped_pkt;
data[i++] = priv->interface_up_cnt;
data[i++] = priv->interface_down_cnt;
data[i++] = priv->reset_cnt;
@@ -254,17 +285,25 @@
data[i++] = rx->fill_cnt;
data[i++] = rx->cnt;
+ data[i++] = rx->fill_cnt - rx->cnt;
do {
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
tmp_rx_bytes = rx->rbytes;
+ tmp_rx_hbytes = rx->rheader_bytes;
tmp_rx_skb_alloc_fail = rx->rx_skb_alloc_fail;
tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail;
tmp_rx_desc_err_dropped_pkt =
rx->rx_desc_err_dropped_pkt;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
data[i++] = tmp_rx_bytes;
+ data[i++] = !! __netif_get_rx_queue(priv->dev, rx->q_num)->dmabuf_pages;
+ data[i++] = tmp_rx_hbytes;
+ data[i++] = rx->rx_cont_packet_cnt;
+ data[i++] = rx->rx_frag_flip_cnt;
+ data[i++] = rx->rx_frag_copy_cnt;
+ data[i++] = rx->rx_frag_alloc_cnt;
/* rx dropped packets */
data[i++] = tmp_rx_skb_alloc_fail +
tmp_rx_buf_alloc_fail +
@@ -314,24 +353,25 @@
if (gve_is_gqi(priv)) {
data[i++] = tx->req;
data[i++] = tx->done;
+ data[i++] = tx->req - tx->done;
} else {
/* DQO doesn't currently support
* posted/completed descriptor counts;
*/
data[i++] = 0;
data[i++] = 0;
+ data[i++] = tx->dqo_tx.tail - tx->dqo_tx.head;
}
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
tmp_tx_bytes = tx->bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
data[i++] = tmp_tx_bytes;
data[i++] = tx->wake_queue;
data[i++] = tx->stop_queue;
- data[i++] = be32_to_cpu(gve_tx_load_event_counter(priv,
- tx));
+ data[i++] = gve_tx_load_event_counter(priv, tx);
data[i++] = tx->dma_mapping_error;
/* stats from NIC */
if (skip_nic_stats) {
@@ -367,6 +407,8 @@
data[i++] = priv->adminq_set_driver_parameter_cnt;
data[i++] = priv->adminq_report_stats_cnt;
data[i++] = priv->adminq_report_link_speed_cnt;
+ data[i++] = priv->adminq_cfg_flow_rule_cnt;
+ data[i++] = priv->adminq_cfg_rss_cnt;
}
static void gve_get_channels(struct net_device *netdev,
@@ -403,7 +445,7 @@
if (!new_rx || !new_tx)
return -EINVAL;
- if (!netif_carrier_ok(netdev)) {
+ if (!netif_running(netdev)) {
priv->tx_cfg.num_queues = new_tx;
priv->rx_cfg.num_queues = new_rx;
return 0;
@@ -432,7 +474,7 @@
if (*flags == ETH_RESET_ALL) {
*flags = 0;
- return gve_reset(priv, true);
+ return gve_reset(priv, true, GVE_RESET_REASON_RESET_BY_USER);
}
return -EOPNOTSUPP;
@@ -479,28 +521,69 @@
static u32 gve_get_priv_flags(struct net_device *netdev)
{
struct gve_priv *priv = netdev_priv(netdev);
- u32 ret_flags = 0;
-
- /* Only 1 flag exists currently: report-stats (BIT(O)), so set that flag. */
- if (priv->ethtool_flags & BIT(0))
- ret_flags |= BIT(0);
- return ret_flags;
+ return priv->ethtool_flags & GVE_PRIV_FLAGS_MASK;
}
static int gve_set_priv_flags(struct net_device *netdev, u32 flags)
{
struct gve_priv *priv = netdev_priv(netdev);
- u64 ori_flags, new_flags;
+ u64 ori_flags, new_flags, flag_diff;
+ int new_packet_buffer_size;
+
+ /* If turning off header split, strict header split will be turned off too*/
+ if (gve_get_enable_header_split(priv) &&
+ !(flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT))) {
+ flags &= ~BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ flags &= ~BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT);
+ }
+
+ /* If strict header-split is requested, turn on regular header-split */
+ if (flags & BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT))
+ flags |= BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+
+ /* Make sure header-split is available */
+ if ((flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT)) &&
+ !(priv->ethtool_defaults & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT))) {
+ dev_err(&priv->pdev->dev,
+ "Header-split not available\n");
+ return -EINVAL;
+ }
+
+ if ((flags & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE)) &&
+ priv->dev_max_rx_buffer_size <= GVE_MIN_RX_BUFFER_SIZE) {
+ dev_err(&priv->pdev->dev,
+ "Max-rx-buffer-size not available\n");
+ return -EINVAL;
+ }
ori_flags = READ_ONCE(priv->ethtool_flags);
- new_flags = ori_flags;
- /* Only one priv flag exists: report-stats (BIT(0))*/
- if (flags & BIT(0))
- new_flags |= BIT(0);
- else
- new_flags &= ~(BIT(0));
+ new_flags = flags & GVE_PRIV_FLAGS_MASK;
+
+ flag_diff = new_flags ^ ori_flags;
+
+ if ((flag_diff & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT)) ||
+ (flag_diff & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE))) {
+ bool enable_hdr_split =
+ new_flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ bool enable_max_buffer_size =
+ new_flags & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE);
+ int err;
+
+ if (enable_max_buffer_size)
+ new_packet_buffer_size = priv->dev_max_rx_buffer_size;
+ else
+ new_packet_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+
+ err = gve_reconfigure_rx_rings(priv,
+ enable_hdr_split,
+ new_packet_buffer_size);
+ if (err)
+ return err;
+ }
+
priv->ethtool_flags = new_flags;
+
/* start report-stats timer when user turns report stats on. */
if (flags & BIT(0)) {
mod_timer(&priv->stats_report_timer,
@@ -519,6 +602,10 @@
sizeof(struct stats));
del_timer_sync(&priv->stats_report_timer);
}
+ priv->header_split_strict =
+ (priv->ethtool_flags &
+ BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT)) ? true : false;
+
return 0;
}
@@ -538,7 +625,689 @@
return err;
}
+static int gve_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec,
+ struct kernel_ethtool_coalesce *kernel_ec,
+ struct netlink_ext_ack *extack)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+
+ if (gve_is_gqi(priv))
+ return -EOPNOTSUPP;
+ ec->tx_coalesce_usecs = priv->tx_coalesce_usecs;
+ ec->rx_coalesce_usecs = priv->rx_coalesce_usecs;
+
+ return 0;
+}
+
+static int gve_set_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *ec,
+ struct kernel_ethtool_coalesce *kernel_ec,
+ struct netlink_ext_ack *extack)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ u32 tx_usecs_orig = priv->tx_coalesce_usecs;
+ u32 rx_usecs_orig = priv->rx_coalesce_usecs;
+ int idx;
+
+ if (gve_is_gqi(priv))
+ return -EOPNOTSUPP;
+
+ if (ec->tx_coalesce_usecs > GVE_MAX_ITR_INTERVAL_DQO ||
+ ec->rx_coalesce_usecs > GVE_MAX_ITR_INTERVAL_DQO)
+ return -EINVAL;
+ priv->tx_coalesce_usecs = ec->tx_coalesce_usecs;
+ priv->rx_coalesce_usecs = ec->rx_coalesce_usecs;
+
+ if (tx_usecs_orig != priv->tx_coalesce_usecs) {
+ for (idx = 0; idx < priv->tx_cfg.num_queues; idx++) {
+ int ntfy_idx = gve_tx_idx_to_ntfy(priv, idx);
+ struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
+
+ gve_set_itr_coalesce_usecs_dqo(priv, block,
+ priv->tx_coalesce_usecs);
+ }
+ }
+
+ if (rx_usecs_orig != priv->rx_coalesce_usecs) {
+ for (idx = 0; idx < priv->rx_cfg.num_queues; idx++) {
+ int ntfy_idx = gve_rx_idx_to_ntfy(priv, idx);
+ struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
+
+ gve_set_itr_coalesce_usecs_dqo(priv, block,
+ priv->rx_coalesce_usecs);
+ }
+ }
+
+ return 0;
+}
+
+static u32 gve_get_rxfh_key_size(struct net_device *netdev)
+{
+ return GVE_RSS_KEY_SIZE;
+}
+
+static u32 gve_get_rxfh_indir_size(struct net_device *netdev)
+{
+ return GVE_RSS_INDIR_SIZE;
+}
+
+static int gve_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+ u8 *hfunc)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ u16 i;
+
+ if (hfunc) {
+ switch (rss_config->alg) {
+ case GVE_RSS_HASH_TOEPLITZ:
+ *hfunc = ETH_RSS_HASH_TOP;
+ break;
+ case GVE_RSS_HASH_UNDEFINED:
+ default:
+ return -EOPNOTSUPP;
+ }
+ }
+ if (key)
+ memcpy(key, rss_config->key, rss_config->key_size);
+
+ if (indir)
+ /* Each 32 bits pointed by 'indir' is stored with a lut entry */
+ for (i = 0; i < rss_config->indir_size; i++)
+ indir[i] = (u32)rss_config->indir[i];
+
+ return 0;
+}
+
+static int gve_set_rxfh(struct net_device *netdev, const u32 *indir,
+ const u8 *key, const u8 hfunc)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ bool init = false;
+ u16 i;
+ int err = 0;
+
+ /* Initialize RSS if not configured before */
+ if (rss_config->alg == GVE_RSS_HASH_UNDEFINED) {
+ err = gve_rss_config_init(priv);
+ if (err)
+ return err;
+ init = true;
+ }
+
+ switch (hfunc) {
+ case ETH_RSS_HASH_NO_CHANGE:
+ break;
+ case ETH_RSS_HASH_TOP:
+ rss_config->alg = GVE_RSS_HASH_TOEPLITZ;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ if (!key && !indir && !init)
+ return 0;
+
+ if (key)
+ memcpy(rss_config->key, key, rss_config->key_size);
+
+ if (indir) {
+ /* Each 32 bits pointed by 'indir' is stored with a lut entry */
+ for (i = 0; i < rss_config->indir_size; i++)
+ rss_config->indir[i] = indir[i];
+ }
+
+ return gve_adminq_configure_rss(priv, rss_config);
+}
+
+static const char *gve_flow_type_name(enum gve_adminq_flow_type flow_type)
+{
+ switch (flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_TCPV6:
+ return "TCP";
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_UDPV6:
+ return "UDP";
+ case GVE_FLOW_TYPE_SCTPV4:
+ case GVE_FLOW_TYPE_SCTPV6:
+ return "SCTP";
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_AHV6:
+ return "AH";
+ case GVE_FLOW_TYPE_ESPV4:
+ case GVE_FLOW_TYPE_ESPV6:
+ return "ESP";
+ }
+ return NULL;
+}
+
+static void gve_print_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule)
+{
+ const char *proto = gve_flow_type_name(rule->flow_type);
+
+ if (!proto)
+ return;
+
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_SCTPV4:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI4 src_ip %pI4 %s: dst_port %hu src_port %hu\n",
+ rule->loc,
+ &rule->key.dst_ip[0],
+ &rule->key.src_ip[0],
+ proto,
+ ntohs(rule->key.dst_port),
+ ntohs(rule->key.src_port));
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_ESPV4:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI4 src_ip %pI4 %s: spi %hu\n",
+ rule->loc,
+ &rule->key.dst_ip[0],
+ &rule->key.src_ip[0],
+ proto,
+ ntohl(rule->key.spi));
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ case GVE_FLOW_TYPE_UDPV6:
+ case GVE_FLOW_TYPE_SCTPV6:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI6 src_ip %pI6 %s: dst_port %hu src_port %hu\n",
+ rule->loc,
+ &rule->key.dst_ip,
+ &rule->key.src_ip,
+ proto,
+ ntohs(rule->key.dst_port),
+ ntohs(rule->key.src_port));
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ case GVE_FLOW_TYPE_ESPV6:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI6 src_ip %pI6 %s: spi %hu\n",
+ rule->loc,
+ &rule->key.dst_ip,
+ &rule->key.src_ip,
+ proto,
+ ntohl(rule->key.spi));
+ break;
+ default:
+ break;
+ }
+}
+
+static bool gve_flow_rule_is_dup_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ struct gve_flow_rule *tmp;
+
+ list_for_each_entry(tmp, &priv->flow_rules, list) {
+ if (tmp->flow_type != rule->flow_type)
+ continue;
+
+ if (!memcmp(&tmp->key, &rule->key,
+ sizeof(struct gve_flow_spec)) &&
+ !memcmp(&tmp->mask, &rule->mask,
+ sizeof(struct gve_flow_spec)))
+ return true;
+ }
+ return false;
+}
+
+static struct gve_flow_rule *gve_find_flow_rule_by_loc(struct gve_priv *priv, u16 loc)
+{
+ struct gve_flow_rule *rule;
+
+ list_for_each_entry(rule, &priv->flow_rules, list)
+ if (rule->loc == loc)
+ return rule;
+
+ return NULL;
+}
+
+static void gve_flow_rules_add_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ struct gve_flow_rule *tmp, *parent = NULL;
+
+ list_for_each_entry(tmp, &priv->flow_rules, list) {
+ if (tmp->loc >= rule->loc)
+ break;
+ parent = tmp;
+ }
+
+ if (parent)
+ list_add(&rule->list, &parent->list);
+ else
+ list_add(&rule->list, &priv->flow_rules);
+
+ priv->flow_rules_cnt++;
+}
+
+static void gve_flow_rules_del_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ list_del(&rule->list);
+ kvfree(rule);
+ priv->flow_rules_cnt--;
+}
+
+static int
+gve_get_flow_rule_entry(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ mutex_lock(&priv->flow_rules_lock);
+ rule = gve_find_flow_rule_by_loc(priv, fsp->location);
+ if (!rule) {
+ err = -EINVAL;
+ goto ret;
+ }
+
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ fsp->flow_type = TCP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_UDPV4:
+ fsp->flow_type = UDP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_SCTPV4:
+ fsp->flow_type = SCTP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ fsp->flow_type = AH_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_ESPV4:
+ fsp->flow_type = ESP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ fsp->flow_type = TCP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_UDPV6:
+ fsp->flow_type = UDP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_SCTPV6:
+ fsp->flow_type = SCTP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ fsp->flow_type = AH_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_ESPV6:
+ fsp->flow_type = ESP_V6_FLOW;
+ break;
+ default:
+ err = -EINVAL;
+ goto ret;
+ }
+
+ memset(&fsp->h_u, 0, sizeof(fsp->h_u));
+ memset(&fsp->h_ext, 0, sizeof(fsp->h_ext));
+ memset(&fsp->m_u, 0, sizeof(fsp->m_u));
+ memset(&fsp->m_ext, 0, sizeof(fsp->m_ext));
+
+ switch (fsp->flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ case SCTP_V4_FLOW:
+ fsp->h_u.tcp_ip4_spec.ip4src = rule->key.src_ip[0];
+ fsp->h_u.tcp_ip4_spec.ip4dst = rule->key.dst_ip[0];
+ fsp->h_u.tcp_ip4_spec.psrc = rule->key.src_port;
+ fsp->h_u.tcp_ip4_spec.pdst = rule->key.dst_port;
+ fsp->h_u.tcp_ip4_spec.tos = rule->key.tos;
+ fsp->m_u.tcp_ip4_spec.ip4src = rule->mask.src_ip[0];
+ fsp->m_u.tcp_ip4_spec.ip4dst = rule->mask.dst_ip[0];
+ fsp->m_u.tcp_ip4_spec.psrc = rule->mask.src_port;
+ fsp->m_u.tcp_ip4_spec.pdst = rule->mask.dst_port;
+ fsp->m_u.tcp_ip4_spec.tos = rule->mask.tos;
+ break;
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ fsp->h_u.ah_ip4_spec.ip4src = rule->key.src_ip[0];
+ fsp->h_u.ah_ip4_spec.ip4dst = rule->key.dst_ip[0];
+ fsp->h_u.ah_ip4_spec.spi = rule->key.spi;
+ fsp->h_u.ah_ip4_spec.tos = rule->key.tos;
+ fsp->m_u.ah_ip4_spec.ip4src = rule->mask.src_ip[0];
+ fsp->m_u.ah_ip4_spec.ip4dst = rule->mask.dst_ip[0];
+ fsp->m_u.ah_ip4_spec.spi = rule->mask.spi;
+ fsp->m_u.ah_ip4_spec.tos = rule->mask.tos;
+ break;
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ case SCTP_V6_FLOW:
+ memcpy(fsp->h_u.tcp_ip6_spec.ip6src, &rule->key.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->h_u.tcp_ip6_spec.ip6dst, &rule->key.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->h_u.tcp_ip6_spec.psrc = rule->key.src_port;
+ fsp->h_u.tcp_ip6_spec.pdst = rule->key.dst_port;
+ fsp->h_u.tcp_ip6_spec.tclass = rule->key.tclass;
+ memcpy(fsp->m_u.tcp_ip6_spec.ip6src, &rule->mask.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->m_u.tcp_ip6_spec.ip6dst, &rule->mask.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->m_u.tcp_ip6_spec.psrc = rule->mask.src_port;
+ fsp->m_u.tcp_ip6_spec.pdst = rule->mask.dst_port;
+ fsp->m_u.tcp_ip6_spec.tclass = rule->mask.tclass;
+ break;
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ memcpy(fsp->h_u.ah_ip6_spec.ip6src, &rule->key.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->h_u.ah_ip6_spec.ip6dst, &rule->key.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->h_u.ah_ip6_spec.spi = rule->key.spi;
+ fsp->h_u.ah_ip6_spec.tclass = rule->key.tclass;
+ memcpy(fsp->m_u.ah_ip6_spec.ip6src, &rule->mask.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->m_u.ah_ip6_spec.ip6dst, &rule->mask.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->m_u.ah_ip6_spec.spi = rule->mask.spi;
+ fsp->m_u.ah_ip6_spec.tclass = rule->mask.tclass;
+ break;
+ default:
+ err = -EINVAL;
+ goto ret;
+ }
+
+ fsp->ring_cookie = rule->action;
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int
+gve_get_flow_rule_ids(struct gve_priv *priv, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct gve_flow_rule *rule;
+ unsigned int cnt = 0;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ cmd->data = priv->flow_rules_max;
+
+ mutex_lock(&priv->flow_rules_lock);
+ list_for_each_entry(rule, &priv->flow_rules, list) {
+ if (cnt == cmd->rule_cnt) {
+ err = -EMSGSIZE;
+ goto ret;
+ }
+ rule_locs[cnt] = rule->loc;
+ cnt++;
+ }
+ cmd->rule_cnt = cnt;
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int
+gve_add_flow_rule_info(struct gve_priv *priv, struct ethtool_rx_flow_spec *fsp,
+ struct gve_flow_rule *rule)
+{
+ u32 flow_type, q_index = 0;
+
+ if (fsp->ring_cookie == RX_CLS_FLOW_DISC)
+ return -EOPNOTSUPP;
+
+ q_index = fsp->ring_cookie;
+ if (q_index >= priv->rx_cfg.num_queues)
+ return -EINVAL;
+
+ rule->action = q_index;
+ rule->loc = fsp->location;
+
+ flow_type = fsp->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+ switch (flow_type) {
+ case TCP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+ break;
+ case UDP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+ break;
+ case SCTP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+ break;
+ case AH_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_AHV4;
+ break;
+ case ESP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+ break;
+ case TCP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+ break;
+ case UDP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+ break;
+ case SCTP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+ break;
+ case AH_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_AHV6;
+ break;
+ case ESP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ switch (flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ case SCTP_V4_FLOW:
+ rule->key.src_ip[0] = fsp->h_u.tcp_ip4_spec.ip4src;
+ rule->key.dst_ip[0] = fsp->h_u.tcp_ip4_spec.ip4dst;
+ rule->key.src_port = fsp->h_u.tcp_ip4_spec.psrc;
+ rule->key.dst_port = fsp->h_u.tcp_ip4_spec.pdst;
+ rule->mask.src_ip[0] = fsp->m_u.tcp_ip4_spec.ip4src;
+ rule->mask.dst_ip[0] = fsp->m_u.tcp_ip4_spec.ip4dst;
+ rule->mask.src_port = fsp->m_u.tcp_ip4_spec.psrc;
+ rule->mask.dst_port = fsp->m_u.tcp_ip4_spec.pdst;
+ break;
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ rule->key.src_ip[0] = fsp->h_u.tcp_ip4_spec.ip4src;
+ rule->key.dst_ip[0] = fsp->h_u.tcp_ip4_spec.ip4dst;
+ rule->key.spi = fsp->h_u.ah_ip4_spec.spi;
+ rule->mask.src_ip[0] = fsp->m_u.tcp_ip4_spec.ip4src;
+ rule->mask.dst_ip[0] = fsp->m_u.tcp_ip4_spec.ip4dst;
+ rule->mask.spi = fsp->m_u.ah_ip4_spec.spi;
+ break;
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ case SCTP_V6_FLOW:
+ memcpy(&rule->key.src_ip, fsp->h_u.tcp_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->key.dst_ip, fsp->h_u.tcp_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.src_port = fsp->h_u.tcp_ip6_spec.psrc;
+ rule->key.dst_port = fsp->h_u.tcp_ip6_spec.pdst;
+ memcpy(&rule->mask.src_ip, fsp->m_u.tcp_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->mask.dst_ip, fsp->m_u.tcp_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->mask.src_port = fsp->m_u.tcp_ip6_spec.psrc;
+ rule->mask.dst_port = fsp->m_u.tcp_ip6_spec.pdst;
+ break;
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ memcpy(&rule->key.src_ip, fsp->h_u.usr_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->key.dst_ip, fsp->h_u.usr_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.spi = fsp->h_u.ah_ip6_spec.spi;
+ memcpy(&rule->mask.src_ip, fsp->m_u.usr_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->mask.dst_ip, fsp->m_u.usr_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.spi = fsp->h_u.ah_ip6_spec.spi;
+ break;
+ default:
+ /* not doing un-parsed flow types */
+ return -EINVAL;
+ }
+
+ if (gve_flow_rule_is_dup_rule(priv, rule))
+ return -EEXIST;
+
+ return 0;
+}
+
+static int gve_add_flow_rule(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = &cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ if (priv->flow_rules_cnt >= priv->flow_rules_max) {
+ dev_err(&priv->pdev->dev,
+ "Reached the limit of max allowed flow rules (%u)\n",
+ priv->flow_rules_max);
+ return -ENOSPC;
+ }
+
+ mutex_lock(&priv->flow_rules_lock);
+ if (gve_find_flow_rule_by_loc(priv, fsp->location)) {
+ dev_err(&priv->pdev->dev, "Flow rule %d already exists\n",
+ fsp->location);
+ err = -EEXIST;
+ goto ret;
+ }
+
+ rule = kvzalloc(sizeof(*rule), GFP_KERNEL);
+ if (!rule) {
+ err = -ENOMEM;
+ goto ret;
+ }
+
+ err = gve_add_flow_rule_info(priv, fsp, rule);
+ if (err)
+ goto ret;
+
+ err = gve_adminq_add_flow_rule(priv, rule);
+ if (err)
+ goto ret;
+
+ gve_flow_rules_add_rule(priv, rule);
+ gve_print_flow_rule(priv, rule);
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ if (err && rule)
+ kfree(rule);
+ return err;
+}
+
+static int gve_del_flow_rule(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ mutex_lock(&priv->flow_rules_lock);
+ rule = gve_find_flow_rule_by_loc(priv, fsp->location);
+ if (!rule) {
+ err = -EINVAL;
+ goto ret;
+ }
+
+ err = gve_adminq_del_flow_rule(priv, fsp->location);
+ if (err)
+ goto ret;
+
+ gve_flow_rules_del_rule(priv, rule);
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int gve_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ int err = -EOPNOTSUPP;
+
+ dev_hold(netdev);
+ rtnl_unlock();
+ if (!(netdev->features & NETIF_F_NTUPLE))
+ goto ret;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ err = gve_add_flow_rule(priv, cmd);
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ err = gve_del_flow_rule(priv, cmd);
+ break;
+ case ETHTOOL_SRXFH:
+ /* not supported */
+ break;
+ default:
+ break;
+ }
+
+ret:
+ rtnl_lock();
+ dev_put(netdev);
+ return err;
+}
+
+static int gve_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ int err = -EOPNOTSUPP;
+
+ dev_hold(netdev);
+ rtnl_unlock();
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = priv->rx_cfg.num_queues;
+ err = 0;
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ if (priv->flow_rules_max == 0)
+ break;
+ cmd->rule_cnt = priv->flow_rules_cnt;
+ cmd->data = priv->flow_rules_max;
+ err = 0;
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ err = gve_get_flow_rule_entry(priv, cmd);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ err = gve_get_flow_rule_ids(priv, cmd, (u32 *)rule_locs);
+ break;
+ case ETHTOOL_GRXFH:
+ /* not supported */
+ break;
+ default:
+ break;
+ }
+
+ rtnl_lock();
+ dev_put(netdev);
+ return err;
+}
+
const struct ethtool_ops gve_ethtool_ops = {
+ .supported_coalesce_params = ETHTOOL_COALESCE_USECS,
.get_drvinfo = gve_get_drvinfo,
.get_strings = gve_get_strings,
.get_sset_count = gve_get_sset_count,
@@ -547,7 +1316,15 @@
.get_msglevel = gve_get_msglevel,
.set_channels = gve_set_channels,
.get_channels = gve_get_channels,
+ .set_rxnfc = gve_set_rxnfc,
+ .get_rxnfc = gve_get_rxnfc,
+ .get_rxfh_indir_size = gve_get_rxfh_indir_size,
+ .get_rxfh_key_size = gve_get_rxfh_key_size,
+ .get_rxfh = gve_get_rxfh,
+ .set_rxfh = gve_set_rxfh,
.get_link = ethtool_op_get_link,
+ .get_coalesce = gve_get_coalesce,
+ .set_coalesce = gve_set_coalesce,
.get_ringparam = gve_get_ringparam,
.reset = gve_user_reset,
.get_tunable = gve_get_tunable,
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 4327d66..655a812 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -12,6 +12,9 @@
#include <linux/sched.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
+#include <linux/utsname.h>
+#include <linux/version.h>
+#include <linux/dma-buf.h>
#include <net/sch_generic.h>
#include "gve.h"
#include "gve_dqo.h"
@@ -30,6 +33,61 @@
const char gve_version_str[] = GVE_VERSION;
static const char gve_version_prefix[] = GVE_VERSION_PREFIX;
+static int gve_verify_driver_compatibility(struct gve_priv *priv)
+{
+ int err;
+ struct gve_driver_info *driver_info;
+ dma_addr_t driver_info_bus;
+
+ driver_info = dma_alloc_coherent(&priv->pdev->dev,
+ sizeof(struct gve_driver_info),
+ &driver_info_bus, GFP_KERNEL);
+ if (!driver_info)
+ return -ENOMEM;
+
+ *driver_info = (struct gve_driver_info) {
+ .os_type = 1, /* Linux */
+ .os_version_major = cpu_to_be32(LINUX_VERSION_MAJOR),
+ .os_version_minor = cpu_to_be32(LINUX_VERSION_SUBLEVEL),
+ .os_version_sub = cpu_to_be32(LINUX_VERSION_PATCHLEVEL),
+ .driver_capability_flags = {
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS1),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS2),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS3),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS4),
+ },
+ };
+ strscpy(driver_info->os_version_str1, utsname()->release,
+ sizeof(driver_info->os_version_str1));
+ strscpy(driver_info->os_version_str2, utsname()->version,
+ sizeof(driver_info->os_version_str2));
+
+ err = gve_adminq_verify_driver_compatibility(priv,
+ sizeof(struct gve_driver_info),
+ driver_info_bus);
+
+ /* It's ok if the device doesn't support this */
+ if (err == -EOPNOTSUPP)
+ err = 0;
+
+ dma_free_coherent(&priv->pdev->dev,
+ sizeof(struct gve_driver_info),
+ driver_info, driver_info_bus);
+ return err;
+}
+
+static netdev_features_t gve_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ struct gve_priv *priv = netdev_priv(dev);
+
+ if (!gve_is_gqi(priv))
+ return gve_features_check_dqo(skb, dev, features);
+
+ return features;
+}
+
static netdev_tx_t gve_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct gve_priv *priv = netdev_priv(dev);
@@ -51,10 +109,10 @@
for (ring = 0; ring < priv->rx_cfg.num_queues; ring++) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
packets = priv->rx[ring].rpackets;
bytes = priv->rx[ring].rbytes;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
s->rx_packets += packets;
s->rx_bytes += bytes;
@@ -64,10 +122,10 @@
for (ring = 0; ring < priv->tx_cfg.num_queues; ring++) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
packets = priv->tx[ring].pkt_done;
bytes = priv->tx[ring].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
s->tx_packets += packets;
s->tx_bytes += bytes;
@@ -162,6 +220,77 @@
priv->stats_report = NULL;
}
+static void gve_tx_timeout_for_miss_path(struct net_device *dev, unsigned int txqueue)
+{
+ struct gve_notify_block *block;
+ struct gve_tx_ring *tx = NULL;
+ bool has_work = false;
+ struct gve_priv *priv;
+ u32 ntfy_idx;
+
+ priv = netdev_priv(dev);
+
+ ntfy_idx = gve_tx_idx_to_ntfy(priv, txqueue);
+ if (ntfy_idx > priv->num_ntfy_blks)
+ return;
+
+ block = &priv->ntfy_blocks[ntfy_idx];
+ tx = block->tx;
+ if (!tx)
+ return;
+
+ /* Check to see if there is pending work */
+ has_work = gve_tx_work_pending_dqo(tx);
+ if (!has_work)
+ return;
+
+ if (READ_ONCE(tx->dqo_compl.kicked)) {
+ netdev_warn(dev,
+ "TX timeout on queue %d. Scheduling reset.",
+ txqueue);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_TX_MISS_PATH_TIMEOUT);
+ gve_schedule_reset(priv);
+ }
+
+ gve_write_irq_doorbell_dqo(priv, block, GVE_ITR_NO_UPDATE_DQO);
+
+ netdev_info(dev, "Kicking tx queue %d for miss path", txqueue);
+ napi_schedule(&block->napi);
+ WRITE_ONCE(tx->dqo_compl.kicked, true);
+}
+
+static void gve_tx_timeout_timer(struct timer_list *t)
+{
+ struct gve_priv *priv = from_timer(priv, t, tx_timeout_timer);
+ int i;
+
+ for(i = 0; i < priv->tx_cfg.num_queues; i++) {
+ if (time_after(jiffies, READ_ONCE(priv->tx[i].dqo_compl.last_processed)
+ + priv->tx_timeout_period)) {
+ gve_tx_timeout_for_miss_path(priv->dev, i);
+ }
+ }
+ mod_timer(&priv->tx_timeout_timer,
+ jiffies + priv->tx_timeout_period);
+}
+
+static int gve_setup_tx_timeout_timer(struct gve_priv *priv)
+{
+ /* Set up 1 sec timer to check no reinjection on miss path */
+ if (gve_is_gqi(priv))
+ return 0;
+ priv->tx_timeout_period = GVE_TX_TIMEOUT_PERIOD;
+ timer_setup(&priv->tx_timeout_timer, gve_tx_timeout_timer, 0);
+ return 0;
+}
+
+static void gve_free_tx_timeout_timer(struct gve_priv *priv)
+{
+ if (gve_is_gqi(priv))
+ return;
+ del_timer_sync(&priv->tx_timeout_timer);
+}
+
static irqreturn_t gve_mgmnt_intr(int irq, void *arg)
{
struct gve_priv *priv = arg;
@@ -195,34 +324,40 @@
__be32 __iomem *irq_doorbell;
bool reschedule = false;
struct gve_priv *priv;
+ int work_done = 0;
block = container_of(napi, struct gve_notify_block, napi);
priv = block->priv;
if (block->tx)
reschedule |= gve_tx_poll(block, budget);
- if (block->rx)
- reschedule |= gve_rx_poll(block, budget);
+ if (block->rx) {
+ work_done = gve_rx_poll(block, budget);
+ reschedule |= work_done == budget;
+ }
if (reschedule)
return budget;
- napi_complete(napi);
- irq_doorbell = gve_irq_doorbell(priv, block);
- iowrite32be(GVE_IRQ_ACK | GVE_IRQ_EVENT, irq_doorbell);
+ /* Complete processing - don't unmask irq if busy polling is enabled */
+ if (likely(napi_complete_done(napi, work_done))) {
+ irq_doorbell = gve_irq_doorbell(priv, block);
+ iowrite32be(GVE_IRQ_ACK | GVE_IRQ_EVENT, irq_doorbell);
- /* Double check we have no extra work.
- * Ensure unmask synchronizes with checking for work.
- */
- mb();
- if (block->tx)
- reschedule |= gve_tx_poll(block, -1);
- if (block->rx)
- reschedule |= gve_rx_poll(block, -1);
- if (reschedule && napi_reschedule(napi))
- iowrite32be(GVE_IRQ_MASK, irq_doorbell);
+ /* Ensure IRQ ACK is visible before we check pending work.
+ * If queue had issued updates, it would be truly visible.
+ */
+ mb();
- return 0;
+ if (block->tx)
+ reschedule |= gve_tx_clean_pending(priv, block->tx);
+ if (block->rx)
+ reschedule |= gve_rx_work_pending(block->rx);
+
+ if (reschedule && napi_reschedule(napi))
+ iowrite32be(GVE_IRQ_MASK, irq_doorbell);
+ }
+ return work_done;
}
static int gve_napi_poll_dqo(struct napi_struct *napi, int budget)
@@ -263,13 +398,12 @@
static int gve_alloc_notify_blocks(struct gve_priv *priv)
{
int num_vecs_requested = priv->num_ntfy_blks + 1;
- char *name = priv->dev->name;
unsigned int active_cpus;
int vecs_enabled;
int i, j;
int err;
- priv->msix_vectors = kvzalloc(num_vecs_requested *
+ priv->msix_vectors = kvcalloc(num_vecs_requested,
sizeof(*priv->msix_vectors), GFP_KERNEL);
if (!priv->msix_vectors)
return -ENOMEM;
@@ -307,30 +441,38 @@
active_cpus = min_t(int, priv->num_ntfy_blks / 2, num_online_cpus());
/* Setup Management Vector - the last vector */
- snprintf(priv->mgmt_msix_name, sizeof(priv->mgmt_msix_name), "%s-mgmnt",
- name);
+ snprintf(priv->mgmt_msix_name, sizeof(priv->mgmt_msix_name), "gve-mgmnt@pci:%s",
+ pci_name(priv->pdev));
err = request_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector,
gve_mgmnt_intr, 0, priv->mgmt_msix_name, priv);
if (err) {
dev_err(&priv->pdev->dev, "Did not receive management vector.\n");
goto abort_with_msix_enabled;
}
- priv->ntfy_blocks =
+ priv->irq_db_indices =
dma_alloc_coherent(&priv->pdev->dev,
priv->num_ntfy_blks *
- sizeof(*priv->ntfy_blocks),
- &priv->ntfy_block_bus, GFP_KERNEL);
- if (!priv->ntfy_blocks) {
+ sizeof(*priv->irq_db_indices),
+ &priv->irq_db_indices_bus, GFP_KERNEL);
+ if (!priv->irq_db_indices) {
err = -ENOMEM;
goto abort_with_mgmt_vector;
}
+
+ priv->ntfy_blocks = kvzalloc(priv->num_ntfy_blks *
+ sizeof(*priv->ntfy_blocks), GFP_KERNEL);
+ if (!priv->ntfy_blocks) {
+ err = -ENOMEM;
+ goto abort_with_irq_db_indices;
+ }
+
/* Setup the other blocks - the first n-1 vectors */
for (i = 0; i < priv->num_ntfy_blks; i++) {
struct gve_notify_block *block = &priv->ntfy_blocks[i];
int msix_idx = i;
- snprintf(block->name, sizeof(block->name), "%s-ntfy-block.%d",
- name, i);
+ snprintf(block->name, sizeof(block->name), "gve-ntfy-blk%d@pci:%s",
+ i, pci_name(priv->pdev));
block->priv = priv;
err = request_irq(priv->msix_vectors[msix_idx].vector,
gve_is_gqi(priv) ? gve_intr : gve_intr_dqo,
@@ -342,6 +484,7 @@
}
irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
get_cpu_mask(i % active_cpus));
+ block->irq_db_index = &priv->irq_db_indices[i].index;
}
return 0;
abort_with_some_ntfy_blocks:
@@ -353,10 +496,13 @@
NULL);
free_irq(priv->msix_vectors[msix_idx].vector, block);
}
- dma_free_coherent(&priv->pdev->dev, priv->num_ntfy_blks *
- sizeof(*priv->ntfy_blocks),
- priv->ntfy_blocks, priv->ntfy_block_bus);
+ kvfree(priv->ntfy_blocks);
priv->ntfy_blocks = NULL;
+abort_with_irq_db_indices:
+ dma_free_coherent(&priv->pdev->dev, priv->num_ntfy_blks *
+ sizeof(*priv->irq_db_indices),
+ priv->irq_db_indices, priv->irq_db_indices_bus);
+ priv->irq_db_indices = NULL;
abort_with_mgmt_vector:
free_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector, priv);
abort_with_msix_enabled:
@@ -384,10 +530,12 @@
free_irq(priv->msix_vectors[msix_idx].vector, block);
}
free_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector, priv);
- dma_free_coherent(&priv->pdev->dev,
- priv->num_ntfy_blks * sizeof(*priv->ntfy_blocks),
- priv->ntfy_blocks, priv->ntfy_block_bus);
+ kvfree(priv->ntfy_blocks);
priv->ntfy_blocks = NULL;
+ dma_free_coherent(&priv->pdev->dev, priv->num_ntfy_blks *
+ sizeof(*priv->irq_db_indices),
+ priv->irq_db_indices, priv->irq_db_indices_bus);
+ priv->irq_db_indices = NULL;
pci_disable_msix(priv->pdev);
kvfree(priv->msix_vectors);
priv->msix_vectors = NULL;
@@ -403,13 +551,16 @@
err = gve_alloc_notify_blocks(priv);
if (err)
goto abort_with_counter;
+ err = gve_setup_tx_timeout_timer(priv);
+ if(err)
+ goto abort_with_ntfy_blocks;
err = gve_alloc_stats_report(priv);
if (err)
- goto abort_with_ntfy_blocks;
+ goto abort_with_tx_timeout;
err = gve_adminq_configure_device_resources(priv,
priv->counter_array_bus,
priv->num_event_counters,
- priv->ntfy_block_bus,
+ priv->irq_db_indices_bus,
priv->num_ntfy_blks);
if (unlikely(err)) {
dev_err(&priv->pdev->dev,
@@ -418,7 +569,7 @@
goto abort_with_stats_report;
}
- if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ if (!gve_is_gqi(priv)) {
priv->ptype_lut_dqo = kvzalloc(sizeof(*priv->ptype_lut_dqo),
GFP_KERNEL);
if (!priv->ptype_lut_dqo) {
@@ -447,6 +598,8 @@
priv->ptype_lut_dqo = NULL;
abort_with_stats_report:
gve_free_stats_report(priv);
+abort_with_tx_timeout:
+ gve_free_tx_timeout_timer(priv);
abort_with_ntfy_blocks:
gve_free_notify_blocks(priv);
abort_with_counter:
@@ -455,7 +608,8 @@
return err;
}
-static void gve_trigger_reset(struct gve_priv *priv);
+static void gve_trigger_reset(struct gve_priv *priv,
+ enum gve_reset_reason reason);
static void gve_teardown_device_resources(struct gve_priv *priv)
{
@@ -463,28 +617,39 @@
/* Tell device its resources are being freed */
if (gve_get_device_resources_ok(priv)) {
+ if (priv->flow_rules_cnt != 0) {
+ err = gve_adminq_reset_flow_rules(priv);
+ if (err) {
+ dev_err(&priv->pdev->dev,
+ "Failed to reset flow rules: err=%d\n", err);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
+ }
+ }
/* detach the stats report */
err = gve_adminq_report_stats(priv, 0, 0x0, GVE_STATS_REPORT_TIMER_PERIOD);
if (err) {
dev_err(&priv->pdev->dev,
"Failed to detach stats report: err=%d\n", err);
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
}
err = gve_adminq_deconfigure_device_resources(priv);
if (err) {
dev_err(&priv->pdev->dev,
"Could not deconfigure device resources: err=%d\n",
err);
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
}
}
kvfree(priv->ptype_lut_dqo);
priv->ptype_lut_dqo = NULL;
+ gve_flow_rules_release(priv);
+ gve_rss_config_release(&priv->rss_config);
gve_free_counter_array(priv);
gve_free_notify_blocks(priv);
gve_free_stats_report(priv);
+ gve_free_tx_timeout_timer(priv);
gve_clear_device_resources_ok(priv);
}
@@ -544,23 +709,11 @@
return 0;
}
-static int gve_create_rings(struct gve_priv *priv)
+static int gve_create_rx_rings(struct gve_priv *priv)
{
int err;
int i;
- err = gve_adminq_create_tx_queues(priv, priv->tx_cfg.num_queues);
- if (err) {
- netif_err(priv, drv, priv->dev, "failed to create %d tx queues\n",
- priv->tx_cfg.num_queues);
- /* This failure will trigger a reset - no need to clean
- * up
- */
- return err;
- }
- netif_dbg(priv, drv, priv->dev, "created %d tx queues\n",
- priv->tx_cfg.num_queues);
-
err = gve_adminq_create_rx_queues(priv, priv->rx_cfg.num_queues);
if (err) {
netif_err(priv, drv, priv->dev, "failed to create %d rx queues\n",
@@ -592,6 +745,39 @@
return 0;
}
+static int gve_create_tx_rings(struct gve_priv *priv, int start_id, u32 num_tx_queues)
+{
+ int err;
+
+ err = gve_adminq_create_tx_queues(priv, start_id, num_tx_queues);
+ if (err) {
+ netif_err(priv, drv, priv->dev, "failed to create %d tx queues\n",
+ num_tx_queues);
+ /* This failure will trigger a reset - no need to clean
+ * up
+ */
+ return err;
+ }
+ netif_dbg(priv, drv, priv->dev, "created %d tx queues\n",
+ num_tx_queues);
+
+ return 0;
+}
+
+static int gve_create_rings(struct gve_priv *priv)
+{
+ int num_tx_queues = gve_num_tx_queues(priv);
+ int err;
+
+ err = gve_create_tx_rings(priv, 0, num_tx_queues);
+ if (err)
+ return err;
+
+ err = gve_create_rx_rings(priv);
+
+ return err;
+}
+
static void add_napi_init_sync_stats(struct gve_priv *priv,
int (*napi_poll)(struct napi_struct *napi,
int budget))
@@ -630,7 +816,7 @@
int err;
/* Setup tx rings */
- priv->tx = kvzalloc(priv->tx_cfg.num_queues * sizeof(*priv->tx),
+ priv->tx = kvcalloc(priv->tx_cfg.num_queues, sizeof(*priv->tx),
GFP_KERNEL);
if (!priv->tx)
return -ENOMEM;
@@ -643,7 +829,7 @@
goto free_tx;
/* Setup rx rings */
- priv->rx = kvzalloc(priv->rx_cfg.num_queues * sizeof(*priv->rx),
+ priv->rx = kvcalloc(priv->rx_cfg.num_queues, sizeof(*priv->rx),
GFP_KERNEL);
if (!priv->rx) {
err = -ENOMEM;
@@ -675,18 +861,10 @@
return err;
}
-static int gve_destroy_rings(struct gve_priv *priv)
+static int gve_destroy_rx_rings(struct gve_priv *priv)
{
int err;
- err = gve_adminq_destroy_tx_queues(priv, priv->tx_cfg.num_queues);
- if (err) {
- netif_err(priv, drv, priv->dev,
- "failed to destroy tx queues\n");
- /* This failure will trigger a reset - no need to clean up */
- return err;
- }
- netif_dbg(priv, drv, priv->dev, "destroyed tx queues\n");
err = gve_adminq_destroy_rx_queues(priv, priv->rx_cfg.num_queues);
if (err) {
netif_err(priv, drv, priv->dev,
@@ -698,6 +876,38 @@
return 0;
}
+static int gve_destroy_tx_rings(struct gve_priv *priv, int start_id, u32 num_queues)
+{
+ int err;
+
+ err = gve_adminq_destroy_tx_queues(priv, start_id, num_queues);
+ if (err) {
+ netif_err(priv, drv, priv->dev,
+ "failed to destroy tx queues\n");
+ /* This failure will trigger a reset - no need to clean up */
+ return err;
+ }
+ netif_dbg(priv, drv, priv->dev, "destroyed tx queues\n");
+
+ return 0;
+}
+
+static int gve_destroy_rings(struct gve_priv *priv)
+{
+ int num_tx_queues = gve_num_tx_queues(priv);
+ int err;
+
+ err = gve_destroy_tx_rings(priv, 0, num_tx_queues);
+ if (err)
+ return err;
+
+ err = gve_destroy_rx_rings(priv);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static void gve_rx_free_rings(struct gve_priv *priv)
{
if (gve_is_gqi(priv))
@@ -766,12 +976,11 @@
qpl->id = id;
qpl->num_entries = 0;
- qpl->pages = kvzalloc(pages * sizeof(*qpl->pages), GFP_KERNEL);
+ qpl->pages = kvcalloc(pages, sizeof(*qpl->pages), GFP_KERNEL);
/* caller handles clean up */
if (!qpl->pages)
return -ENOMEM;
- qpl->page_buses = kvzalloc(pages * sizeof(*qpl->page_buses),
- GFP_KERNEL);
+ qpl->page_buses = kvcalloc(pages, sizeof(*qpl->page_buses), GFP_KERNEL);
/* caller handles clean up */
if (!qpl->page_buses)
return -ENOMEM;
@@ -793,14 +1002,13 @@
void gve_free_page(struct device *dev, struct page *page, dma_addr_t dma,
enum dma_data_direction dir)
{
- if (!dma_mapping_error(dev, dma))
+ if (!is_dma_buf_page(page) && !dma_mapping_error(dev, dma))
dma_unmap_page(dev, dma, PAGE_SIZE, dir);
if (page)
put_page(page);
}
-static void gve_free_queue_page_list(struct gve_priv *priv,
- int id)
+static void gve_free_queue_page_list(struct gve_priv *priv, u32 id)
{
struct gve_queue_page_list *qpl = &priv->qpls[id];
int i;
@@ -823,33 +1031,41 @@
static int gve_alloc_qpls(struct gve_priv *priv)
{
int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv);
+ int page_count;
int i, j;
int err;
- /* Raw addressing means no QPLs */
- if (priv->queue_format == GVE_GQI_RDA_FORMAT)
+ if (num_qpls == 0)
return 0;
- priv->qpls = kvzalloc(num_qpls * sizeof(*priv->qpls), GFP_KERNEL);
+ priv->qpls = kvcalloc(num_qpls, sizeof(*priv->qpls), GFP_KERNEL);
if (!priv->qpls)
return -ENOMEM;
+ page_count = priv->tx_pages_per_qpl;
for (i = 0; i < gve_num_tx_qpls(priv); i++) {
err = gve_alloc_queue_page_list(priv, i,
- priv->tx_pages_per_qpl);
+ page_count);
if (err)
goto free_qpls;
}
+
+ /* For GQI_QPL number of pages allocated have 1:1 relationship with
+ * number of descriptors. For DQO, number of pages required are
+ * more than descriptors (because of out of order completions).
+ */
+ page_count = priv->queue_format == GVE_GQI_QPL_FORMAT ?
+ priv->rx_data_slot_cnt : priv->rx_pages_per_qpl;
for (; i < num_qpls; i++) {
err = gve_alloc_queue_page_list(priv, i,
- priv->rx_data_slot_cnt);
+ page_count);
if (err)
goto free_qpls;
}
priv->qpl_cfg.qpl_map_size = BITS_TO_LONGS(num_qpls) *
sizeof(unsigned long) * BITS_PER_BYTE;
- priv->qpl_cfg.qpl_id_map = kvzalloc(BITS_TO_LONGS(num_qpls) *
+ priv->qpl_cfg.qpl_id_map = kvcalloc(BITS_TO_LONGS(num_qpls),
sizeof(unsigned long), GFP_KERNEL);
if (!priv->qpl_cfg.qpl_id_map) {
err = -ENOMEM;
@@ -870,8 +1086,7 @@
int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv);
int i;
- /* Raw addressing means no QPLs */
- if (priv->queue_format == GVE_GQI_RDA_FORMAT)
+ if (num_qpls == 0)
return;
kvfree(priv->qpl_cfg.qpl_id_map);
@@ -892,7 +1107,8 @@
queue_work(priv->gve_wq, &priv->service_task);
}
-static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up);
+static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up,
+ enum gve_reset_reason reason);
static int gve_reset_recovery(struct gve_priv *priv, bool was_up);
static void gve_turndown(struct gve_priv *priv);
static void gve_turnup(struct gve_priv *priv);
@@ -900,6 +1116,7 @@
static int gve_open(struct net_device *dev)
{
struct gve_priv *priv = netdev_priv(dev);
+ enum gve_reset_reason reset_reason;
int err;
err = gve_alloc_qpls(priv);
@@ -918,18 +1135,16 @@
goto free_rings;
err = gve_register_qpls(priv);
- if (err)
+ if (err) {
+ reset_reason = GVE_RESET_REASON_REGISTER_QPLS_FAILED;
goto reset;
-
- if (!gve_is_gqi(priv)) {
- /* Hard code this for now. This may be tuned in the future for
- * performance.
- */
- priv->data_buffer_size_dqo = GVE_RX_BUFFER_SIZE_DQO;
}
+
err = gve_create_rings(priv);
- if (err)
+ if (err) {
+ reset_reason = GVE_RESET_REASON_CREATE_RINGS_FAILED;
goto reset;
+ }
gve_set_device_rings_ok(priv);
@@ -938,6 +1153,10 @@
round_jiffies(jiffies +
msecs_to_jiffies(priv->stats_report_timer_period)));
+ if (!gve_is_gqi(priv))
+ mod_timer(&priv->tx_timeout_timer,
+ jiffies + priv->tx_timeout_period);
+
gve_turnup(priv);
queue_work(priv->gve_wq, &priv->service_task);
priv->interface_up_cnt++;
@@ -956,7 +1175,7 @@
if (gve_get_reset_in_progress(priv))
return err;
/* Otherwise reset before returning */
- gve_reset_and_teardown(priv, true);
+ gve_reset_and_teardown(priv, true, reset_reason);
/* if this fails there is nothing we can do so just ignore the return */
gve_reset_recovery(priv, false);
/* return the original error */
@@ -980,6 +1199,7 @@
gve_clear_device_rings_ok(priv);
}
del_timer_sync(&priv->stats_report_timer);
+ del_timer_sync(&priv->tx_timeout_timer);
gve_free_rings(priv);
gve_free_qpls(priv);
@@ -993,40 +1213,75 @@
if (gve_get_reset_in_progress(priv))
return err;
/* Otherwise reset before returning */
- gve_reset_and_teardown(priv, true);
+ gve_reset_and_teardown(priv, true, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
return gve_reset_recovery(priv, false);
}
+int gve_flow_rules_reset(struct gve_priv *priv)
+{
+ int err;
+
+ if (priv->flow_rules_cnt == 0)
+ return 0;
+
+ err = gve_adminq_reset_flow_rules(priv);
+ if (err)
+ return err;
+
+ gve_flow_rules_release(priv);
+ return 0;
+}
+
+static int gve_adjust_queue_count(struct gve_priv *priv,
+ struct gve_queue_config new_rx_config,
+ struct gve_queue_config new_tx_config)
+{
+ struct gve_queue_config old_rx_config = priv->rx_cfg;
+ int err = 0;
+
+ priv->rx_cfg = new_rx_config;
+ priv->tx_cfg = new_tx_config;
+
+ if (old_rx_config.num_queues != new_rx_config.num_queues) {
+ err = gve_flow_rules_reset(priv);
+ if (err)
+ return err;
+
+ if (priv->rss_config.alg != GVE_RSS_HASH_UNDEFINED)
+ err = gve_rss_config_init(priv);
+ }
+
+ return err;
+}
+
int gve_adjust_queues(struct gve_priv *priv,
struct gve_queue_config new_rx_config,
struct gve_queue_config new_tx_config)
{
int err;
-
- if (netif_carrier_ok(priv->dev)) {
+ if (netif_running(priv->dev)) {
/* To make this process as simple as possible we teardown the
* device, set the new configuration, and then bring the device
* up again.
*/
err = gve_close(priv->dev);
- /* we have already tried to reset in close,
- * just fail at this point
+ /* We have already tried to reset in close, just fail at this
+ * point.
*/
if (err)
return err;
- priv->tx_cfg = new_tx_config;
- priv->rx_cfg = new_rx_config;
-
+ err = gve_adjust_queue_count(priv, new_rx_config, new_tx_config);
+ if (err)
+ goto err;
err = gve_open(priv->dev);
if (err)
goto err;
-
return 0;
}
/* Set the config for the next up. */
- priv->tx_cfg = new_tx_config;
- priv->rx_cfg = new_rx_config;
-
+ err = gve_adjust_queue_count(priv, new_rx_config, new_tx_config);
+ if (err)
+ goto err;
return 0;
err:
netif_err(priv, drv, priv->dev,
@@ -1082,9 +1337,8 @@
if (gve_is_gqi(priv)) {
iowrite32be(0, gve_irq_doorbell(priv, block));
} else {
- u32 val = gve_set_itr_ratelimit_dqo(GVE_TX_IRQ_RATELIMIT_US_DQO);
-
- gve_write_irq_doorbell_dqo(priv, block, val);
+ gve_set_itr_coalesce_usecs_dqo(priv, block,
+ priv->tx_coalesce_usecs);
}
}
for (idx = 0; idx < priv->rx_cfg.num_queues; idx++) {
@@ -1095,21 +1349,53 @@
if (gve_is_gqi(priv)) {
iowrite32be(0, gve_irq_doorbell(priv, block));
} else {
- u32 val = gve_set_itr_ratelimit_dqo(GVE_RX_IRQ_RATELIMIT_US_DQO);
-
- gve_write_irq_doorbell_dqo(priv, block, val);
+ gve_set_itr_coalesce_usecs_dqo(priv, block,
+ priv->rx_coalesce_usecs);
}
}
gve_set_napi_enabled(priv);
}
+static void gve_tx_rings_dump(struct gve_priv *priv, struct gve_tx_ring *tx)
+{
+ char prefix[64];
+ int i;
+
+ if (gve_is_gqi(priv) || !tx)
+ return;
+
+ netdev_info(priv->dev, "TX q_num %d, TX tail %u, TX head %d, TX compl head %u, TX compl cur_gen_bit %u",
+ tx->q_num, tx->dqo_tx.tail,
+ atomic_read_acquire(&tx->dqo_compl.hw_tx_head),
+ tx->dqo_compl.head, tx->dqo_compl.cur_gen_bit);
+
+ netdev_info(priv->dev, "TX descriptor ring dump for queue %d\n", tx->q_num);
+ for (i = 0; i <= tx->mask; i++) {
+ snprintf(prefix, sizeof(prefix),
+ "desc index %4d, dtype %2u: ", i, tx->dqo.tx_ring[i].pkt.dtype);
+ print_hex_dump(KERN_INFO, prefix, DUMP_PREFIX_NONE,
+ 16, 1,
+ &tx->dqo.tx_ring[i],
+ sizeof(tx->dqo.tx_ring[0]), false);
+ }
+
+ netdev_info(priv->dev, "TX completion ring dump for queue %d\n", tx->q_num);
+ for (i = 0; i <= tx->dqo.complq_mask; i++) {
+ snprintf(prefix, sizeof(prefix),
+ "desc index %4d, type %u: ", i, tx->dqo.compl_ring[i].type);
+ print_hex_dump(KERN_INFO, prefix, DUMP_PREFIX_NONE,
+ 16, 1,
+ &tx->dqo.compl_ring[i],
+ sizeof(tx->dqo.compl_ring[0]), false);
+ }
+}
+
static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
{
struct gve_notify_block *block;
struct gve_tx_ring *tx = NULL;
struct gve_priv *priv;
- u32 last_nic_done;
u32 current_time;
u32 ntfy_idx;
@@ -1132,22 +1418,57 @@
/* Check to see if there are missed completions, which will allow us to
* kick the queue.
*/
- last_nic_done = gve_tx_load_event_counter(priv, tx);
- if (last_nic_done - tx->done) {
+ if (gve_is_gqi(priv)) {
+ u32 last_nic_done = gve_tx_load_event_counter(priv, tx);
+ bool has_work = (last_nic_done - tx->done) != 0;
+
+ if (!has_work) {
+ netdev_err(dev, "Tx queue %d stuck with no work. "
+ "Posted descriptors %d, Completed descriptors %d, "
+ "Driver stopped %d, Transmit stopped %d",
+ txqueue, tx->req,
+ tx->done,
+ netif_tx_queue_stopped(tx->netdev_txq),
+ netif_xmit_stopped(tx->netdev_txq));
+ goto reset;
+ }
+
netdev_info(dev, "Kicking queue %d", txqueue);
iowrite32be(GVE_IRQ_MASK, gve_irq_doorbell(priv, block));
- napi_schedule(&block->napi);
- tx->last_kick_msec = current_time;
- goto out;
- } // Else reset.
+ } else {
+ struct gve_tx_compl_desc *compl_desc =
+ &tx->dqo.compl_ring[tx->dqo_compl.head];
-reset:
- gve_schedule_reset(priv);
+ bool has_work =
+ compl_desc->generation != tx->dqo_compl.cur_gen_bit;
+
+ if (!has_work) {
+ netdev_err(dev, "Tx queue %d stuck with no work. "
+ "Driver stopped %d, Transmit stopped %d",
+ txqueue,
+ netif_tx_queue_stopped(tx->netdev_txq),
+ netif_xmit_stopped(tx->netdev_txq));
+ goto reset;
+ }
+
+ netdev_info(dev, "Kicking queue %d", txqueue);
+ gve_write_irq_doorbell_dqo(priv, block, GVE_ITR_NO_UPDATE_DQO);
+ }
+
+ napi_schedule(&block->napi);
+ tx->last_kick_msec = current_time;
out:
if (tx)
tx->queue_timeout++;
priv->tx_timeo_cnt++;
+ return;
+
+reset:
+ gve_tx_rings_dump(priv, tx);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_TX_TIMEOUT);
+ gve_schedule_reset(priv);
+ goto out;
}
static int gve_set_features(struct net_device *netdev,
@@ -1159,7 +1480,7 @@
if ((netdev->features & NETIF_F_LRO) != (features & NETIF_F_LRO)) {
netdev->features ^= NETIF_F_LRO;
- if (netif_carrier_ok(netdev)) {
+ if (netif_running(netdev)) {
/* To make this process as simple as possible we
* teardown the device, set the new configuration,
* and then bring the device up again.
@@ -1177,6 +1498,12 @@
}
}
+ if ((netdev->features & NETIF_F_NTUPLE) && !(features & NETIF_F_NTUPLE)) {
+ err = gve_flow_rules_reset(priv);
+ if (err)
+ goto err;
+ }
+
return 0;
err:
/* Reverts the change on error. */
@@ -1188,6 +1515,7 @@
static const struct net_device_ops gve_netdev_ops = {
.ndo_start_xmit = gve_start_xmit,
+ .ndo_features_check = gve_features_check,
.ndo_open = gve_open,
.ndo_stop = gve_close,
.ndo_get_stats64 = gve_get_stats,
@@ -1199,6 +1527,7 @@
{
if (GVE_DEVICE_STATUS_RESET_MASK & status) {
dev_info(&priv->pdev->dev, "Device requested reset.\n");
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_DEVICE_REQUESTED);
gve_set_do_reset(priv);
}
if (GVE_DEVICE_STATUS_REPORT_STATS_MASK & status) {
@@ -1218,7 +1547,7 @@
if (gve_get_do_reset(priv)) {
rtnl_lock();
- gve_reset(priv, false);
+ gve_reset(priv, false, READ_ONCE(priv->scheduled_reset_reason));
rtnl_unlock();
}
}
@@ -1247,9 +1576,9 @@
}
do {
- start = u64_stats_fetch_begin_irq(&priv->tx[idx].statss);
+ start = u64_stats_fetch_begin(&priv->tx[idx].statss);
tx_bytes = priv->tx[idx].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[idx].statss, start));
+ } while (u64_stats_fetch_retry(&priv->tx[idx].statss, start));
stats[stats_idx++] = (struct stats) {
.stat_name = cpu_to_be32(TX_WAKE_CNT),
.value = cpu_to_be64(priv->tx[idx].wake_queue),
@@ -1316,6 +1645,66 @@
}
}
+static void gve_turnup_and_check_status(struct gve_priv *priv)
+{
+ u32 status;
+
+ gve_turnup(priv);
+ status = ioread32be(&priv->reg_bar0->device_status);
+ gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
+}
+
+int gve_recreate_rx_rings(struct gve_priv *priv)
+{
+ int err;
+
+ /* Unregister queues with the device*/
+ err = gve_destroy_rx_rings(priv);
+ if (err)
+ return err;
+
+ /* Reset the RX state */
+ gve_rx_reset_rings_dqo(priv);
+
+ /* Register queues with the device */
+ return gve_create_rx_rings(priv);
+}
+
+int gve_reconfigure_rx_rings(struct gve_priv *priv,
+ bool enable_hdr_split,
+ int packet_buffer_size)
+{
+ int err = 0;
+
+ if (priv->queue_format != GVE_DQO_RDA_FORMAT)
+ return -EOPNOTSUPP;
+
+ gve_turndown(priv);
+
+ /* Allocate/free hdr resources */
+ if (enable_hdr_split != !!priv->header_buf_pool) {
+ err = gve_rx_handle_hdr_resources_dqo(priv, enable_hdr_split);
+ if (err)
+ goto out;
+ }
+
+ /* Apply new RX configuration changes */
+ priv->data_buffer_size_dqo = packet_buffer_size;
+
+ /* Reset RX state and re-register with the device */
+ err = gve_recreate_rx_rings(priv);
+ if (err)
+ goto reset;
+out:
+ gve_turnup_and_check_status(priv);
+ return err;
+
+reset:
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_DEVICE_FAILURE);
+ gve_schedule_reset(priv);
+ return err;
+}
+
/* Handle NIC status register changes, reset requests and report stats */
static void gve_service_task(struct work_struct *work)
{
@@ -1342,6 +1731,13 @@
return err;
}
+ err = gve_verify_driver_compatibility(priv);
+ if (err) {
+ dev_err(&priv->pdev->dev,
+ "Could not verify driver compatibility: err=%d\n", err);
+ goto err;
+ }
+
if (skip_describe_device)
goto setup_device;
@@ -1353,14 +1749,6 @@
"Could not get device information: err=%d\n", err);
goto err;
}
- if (gve_is_gqi(priv) && priv->dev->max_mtu > PAGE_SIZE) {
- priv->dev->max_mtu = PAGE_SIZE;
- err = gve_adminq_set_mtu(priv, priv->dev->mtu);
- if (err) {
- dev_err(&priv->pdev->dev, "Could not set mtu");
- goto err;
- }
- }
priv->dev->mtu = priv->dev->max_mtu;
num_ntfy = pci_msix_vec_count(priv->pdev);
if (num_ntfy <= 0) {
@@ -1383,6 +1771,9 @@
priv->num_ntfy_blks = (num_ntfy - 1) & ~0x1;
priv->mgmt_msix_idx = priv->num_ntfy_blks;
+ mutex_init(&priv->flow_rules_lock);
+ INIT_LIST_HEAD(&priv->flow_rules);
+
priv->tx_cfg.max_queues =
min_t(int, priv->tx_cfg.max_queues, priv->num_ntfy_blks / 2);
priv->rx_cfg.max_queues =
@@ -1402,6 +1793,11 @@
dev_info(&priv->pdev->dev, "Max TX queues %d, Max RX queues %d\n",
priv->tx_cfg.max_queues, priv->rx_cfg.max_queues);
+ if (!gve_is_gqi(priv)) {
+ priv->tx_coalesce_usecs = GVE_TX_IRQ_RATELIMIT_US_DQO;
+ priv->rx_coalesce_usecs = GVE_RX_IRQ_RATELIMIT_US_DQO;
+ }
+
setup_device:
err = gve_setup_device_resources(priv);
if (!err)
@@ -1417,15 +1813,27 @@
gve_adminq_free(&priv->pdev->dev, priv);
}
-static void gve_trigger_reset(struct gve_priv *priv)
+static void gve_write_reset_reason(struct gve_priv *priv,
+ enum gve_reset_reason reason)
{
+ u32 driver_status = reason;
+ driver_status <<= 32 - GVE_RESET_REASON_SIZE;
+
+ iowrite32be(driver_status, &priv->reg_bar0->driver_status);
+}
+
+static void gve_trigger_reset(struct gve_priv *priv,
+ enum gve_reset_reason reason)
+{
+ gve_write_reset_reason(priv, reason);
/* Reset the device by releasing the AQ */
gve_adminq_release(priv);
}
-static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up)
+static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up,
+ enum gve_reset_reason reason)
{
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, reason);
/* With the reset having already happened, close cannot fail */
if (was_up)
gve_close(priv->dev);
@@ -1451,12 +1859,13 @@
return err;
}
-int gve_reset(struct gve_priv *priv, bool attempt_teardown)
+int gve_reset(struct gve_priv *priv, bool attempt_teardown,
+ enum gve_reset_reason reason)
{
- bool was_up = netif_carrier_ok(priv->dev);
+ bool was_up = netif_running(priv->dev);
int err;
- dev_info(&priv->pdev->dev, "Performing reset\n");
+ dev_info(&priv->pdev->dev, "Performing reset with reason %d\n", reason);
gve_clear_do_reset(priv);
gve_set_reset_in_progress(priv);
/* If we aren't attempting to teardown normally, just go turndown and
@@ -1464,19 +1873,23 @@
*/
if (!attempt_teardown) {
gve_turndown(priv);
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, reason);
} else {
/* Otherwise attempt to close normally */
if (was_up) {
err = gve_close(priv->dev);
/* If that fails reset as we did above */
if (err)
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, reason);
}
+ gve_write_reset_reason(priv, reason);
/* Clean up any remaining resources */
gve_teardown_priv_resources(priv);
}
+ /* The reset reason preceding this reset should be cleared. */
+ WRITE_ONCE(priv->scheduled_reset_reason, 0);
+
/* Set it all back up */
err = gve_reset_recovery(priv, was_up);
gve_clear_reset_in_progress(priv);
@@ -1586,6 +1999,7 @@
priv->service_task_flags = 0x0;
priv->state_flags = 0x0;
priv->ethtool_flags = 0x0;
+ priv->ethtool_defaults = 0x0;
gve_set_probe_in_progress(priv);
priv->gve_wq = alloc_ordered_workqueue("gve", 0);
@@ -1644,6 +2058,7 @@
void __iomem *reg_bar = priv->reg_bar0;
unregister_netdev(netdev);
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_REMOVED);
gve_teardown_priv_resources(priv);
destroy_workqueue(priv->gve_wq);
free_netdev(netdev);
@@ -1653,6 +2068,121 @@
pci_disable_device(pdev);
}
+static void gve_shutdown(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct gve_priv *priv = netdev_priv(netdev);
+ bool was_up = netif_running(priv->dev);
+
+ rtnl_lock();
+ if (was_up && gve_close(priv->dev)) {
+ /* If the dev was up, attempt to close, if close fails, reset */
+ gve_reset_and_teardown(priv, was_up, GVE_RESET_REASON_DRIVER_SUSPENDED);
+ } else {
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_SUSPENDED);
+ /* If the dev wasn't up or close worked, finish tearing down */
+ gve_teardown_priv_resources(priv);
+ }
+ rtnl_unlock();
+}
+
+#ifdef CONFIG_PM
+static int gve_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct gve_priv *priv = netdev_priv(netdev);
+ bool was_up = netif_running(priv->dev);
+
+ priv->suspend_cnt++;
+ rtnl_lock();
+ if (was_up && gve_close(priv->dev)) {
+ /* If the dev was up, attempt to close, if close fails, reset */
+ gve_reset_and_teardown(priv, was_up, GVE_RESET_REASON_DRIVER_SHUTDOWN);
+ } else {
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_SHUTDOWN);
+ /* If the dev wasn't up or close worked, finish tearing down */
+ gve_teardown_priv_resources(priv);
+ }
+ priv->up_before_suspend = was_up;
+ rtnl_unlock();
+ return 0;
+}
+
+static int gve_resume(struct pci_dev *pdev)
+{
+ struct net_device *netdev = pci_get_drvdata(pdev);
+ struct gve_priv *priv = netdev_priv(netdev);
+ int err;
+
+ priv->resume_cnt++;
+ rtnl_lock();
+ err = gve_reset_recovery(priv, priv->up_before_suspend);
+ rtnl_unlock();
+ return err;
+}
+#endif /* CONFIG_PM */
+
+void gve_rss_set_default_indir(struct gve_priv *priv)
+{
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ int i;
+
+ for (i = 0; i < GVE_RSS_INDIR_SIZE; i++)
+ rss_config->indir[i] = i % priv->rx_cfg.num_queues;
+}
+
+void gve_flow_rules_release(struct gve_priv *priv)
+{
+ struct gve_flow_rule *cur, *next;
+
+ if (priv->flow_rules_cnt == 0)
+ return;
+
+ list_for_each_entry_safe(cur, next, &priv->flow_rules, list) {
+ list_del(&cur->list);
+ kvfree(cur);
+ priv->flow_rules_cnt--;
+ }
+}
+
+void gve_rss_config_release(struct gve_rss_config *rss_config)
+{
+ kvfree(rss_config->key);
+ kvfree(rss_config->indir);
+ memset(rss_config, 0, sizeof(*rss_config));
+}
+
+int gve_rss_config_init(struct gve_priv *priv)
+{
+ struct gve_rss_config *rss_config = &priv->rss_config;
+
+ gve_rss_config_release(rss_config);
+
+ rss_config->key = kvzalloc(GVE_RSS_KEY_SIZE, GFP_KERNEL);
+ if (!rss_config->key)
+ goto err;
+
+ netdev_rss_key_fill(rss_config->key, GVE_RSS_KEY_SIZE);
+
+ rss_config->indir = kvcalloc(GVE_RSS_INDIR_SIZE,
+ sizeof(*rss_config->indir),
+ GFP_KERNEL);
+ if (!rss_config->indir)
+ goto err;
+
+ rss_config->alg = GVE_RSS_HASH_TOEPLITZ;
+ rss_config->key_size = GVE_RSS_KEY_SIZE;
+ rss_config->indir_size = GVE_RSS_INDIR_SIZE;
+ gve_rss_set_default_indir(priv);
+
+ return gve_adminq_configure_rss(priv, rss_config);
+
+err:
+ kvfree(rss_config->key);
+ rss_config->key = NULL;
+ return -ENOMEM;
+}
+
static const struct pci_device_id gve_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_GOOGLE, PCI_DEV_ID_GVNIC) },
{ }
@@ -1663,6 +2193,11 @@
.id_table = gve_id_table,
.probe = gve_probe,
.remove = gve_remove,
+ .shutdown = gve_shutdown,
+#ifdef CONFIG_PM
+ .suspend = gve_suspend,
+ .resume = gve_resume,
+#endif
};
module_pci_driver(gvnic_driver);
diff --git a/drivers/net/ethernet/google/gve/gve_register.h b/drivers/net/ethernet/google/gve/gve_register.h
index fb65546..2511352 100644
--- a/drivers/net/ethernet/google/gve/gve_register.h
+++ b/drivers/net/ethernet/google/gve/gve_register.h
@@ -25,4 +25,25 @@
GVE_DEVICE_STATUS_LINK_STATUS_MASK = BIT(2),
GVE_DEVICE_STATUS_REPORT_STATS_MASK = BIT(3),
};
+
+#define GVE_RESET_REASON_SIZE 5
+
+enum gve_reset_reason {
+ GVE_RESET_REASON_UNKNOWN,
+ GVE_RESET_REASON_RESET_BY_USER,
+ GVE_RESET_REASON_DRIVER_SHUTDOWN,
+ GVE_RESET_REASON_DRIVER_SUSPENDED,
+ GVE_RESET_REASON_CREATE_RINGS_FAILED,
+ GVE_RESET_REASON_REGISTER_QPLS_FAILED,
+ GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED,
+ GVE_RESET_REASON_TX_TIMEOUT,
+ GVE_RESET_REASON_TX_MISS_PATH_TIMEOUT,
+ GVE_RESET_REASON_RX_ERROR,
+ GVE_RESET_REASON_DEVICE_REQUESTED,
+ GVE_RESET_REASON_DRIVER_REMOVED,
+ GVE_RESET_REASON_DEVICE_FAILURE,
+ GVE_NUM_RESET_REASONS, /* Not a reset reason */
+};
+
+static_assert(GVE_NUM_RESET_REASONS <= (1 << GVE_RESET_REASON_SIZE));
#endif /* _GVE_REGISTER_H_ */
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 9743196..9f60bb5 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -6,6 +6,7 @@
#include "gve.h"
#include "gve_adminq.h"
+#include "gve_register.h"
#include "gve_utils.h"
#include <linux/etherdevice.h>
@@ -16,21 +17,31 @@
dma_addr_t dma = (dma_addr_t)(be64_to_cpu(data_slot->addr) &
GVE_DATA_SLOT_ADDR_PAGE_MASK);
+ page_ref_sub(page_info->page, page_info->pagecnt_bias - 1);
gve_free_page(dev, page_info->page, dma, DMA_FROM_DEVICE);
}
static void gve_rx_unfill_pages(struct gve_priv *priv, struct gve_rx_ring *rx)
{
- if (rx->data.raw_addressing) {
- u32 slots = rx->mask + 1;
- int i;
+ u32 slots = rx->mask + 1;
+ int i;
+ if (rx->data.raw_addressing) {
for (i = 0; i < slots; i++)
gve_rx_free_buffer(&priv->pdev->dev, &rx->data.page_info[i],
&rx->data.data_ring[i]);
} else {
+ for (i = 0; i < slots; i++)
+ page_ref_sub(rx->data.page_info[i].page,
+ rx->data.page_info[i].pagecnt_bias - 1);
gve_unassign_qpl(priv, rx->data.qpl->id);
rx->data.qpl = NULL;
+
+ for (i = 0; i < rx->qpl_copy_pool_mask + 1; i++) {
+ page_ref_sub(rx->qpl_copy_pool[i].page,
+ rx->qpl_copy_pool[i].pagecnt_bias - 1);
+ put_page(rx->qpl_copy_pool[i].page);
+ }
}
kvfree(rx->data.page_info);
rx->data.page_info = NULL;
@@ -59,6 +70,10 @@
dma_free_coherent(dev, bytes, rx->data.data_ring,
rx->data.data_bus);
rx->data.data_ring = NULL;
+
+ kvfree(rx->qpl_copy_pool);
+ rx->qpl_copy_pool = NULL;
+
netif_dbg(priv, drv, priv->dev, "freed rx ring %d\n", idx);
}
@@ -69,6 +84,9 @@
page_info->page_offset = 0;
page_info->page_address = page_address(page);
*slot_addr = cpu_to_be64(addr);
+ /* The page already has 1 ref */
+ page_ref_add(page, INT_MAX - 1);
+ page_info->pagecnt_bias = INT_MAX;
}
static int gve_rx_alloc_buffer(struct gve_priv *priv, struct device *dev,
@@ -94,6 +112,7 @@
u32 slots;
int err;
int i;
+ int j;
/* Allocate one page per Rx queue slot. Each page is split into two
* packet buffers, when possible we "page flip" between the two.
@@ -128,7 +147,33 @@
goto alloc_err;
}
+ if (!rx->data.raw_addressing) {
+ for (j = 0; j < rx->qpl_copy_pool_mask + 1; j++) {
+ struct page *page = alloc_page(GFP_KERNEL);
+
+ if (!page) {
+ err = -ENOMEM;
+ goto alloc_err_qpl;
+ }
+
+ rx->qpl_copy_pool[j].page = page;
+ rx->qpl_copy_pool[j].page_offset = 0;
+ rx->qpl_copy_pool[j].page_address = page_address(page);
+
+ /* The page already has 1 ref. */
+ page_ref_add(page, INT_MAX - 1);
+ rx->qpl_copy_pool[j].pagecnt_bias = INT_MAX;
+ }
+ }
+
return slots;
+
+alloc_err_qpl:
+ while (j--) {
+ page_ref_sub(rx->qpl_copy_pool[j].page,
+ rx->qpl_copy_pool[j].pagecnt_bias - 1);
+ put_page(rx->qpl_copy_pool[j].page);
+ }
alloc_err:
while (i--)
gve_rx_free_buffer(&priv->pdev->dev,
@@ -137,6 +182,15 @@
return err;
}
+static void gve_rx_ctx_clear(struct gve_rx_ctx *ctx)
+{
+ ctx->skb_head = NULL;
+ ctx->skb_tail = NULL;
+ ctx->total_size = 0;
+ ctx->frag_cnt = 0;
+ ctx->drop_pkt = false;
+}
+
static int gve_rx_alloc_ring(struct gve_priv *priv, int idx)
{
struct gve_rx_ring *rx = &priv->rx[idx];
@@ -164,10 +218,22 @@
GFP_KERNEL);
if (!rx->data.data_ring)
return -ENOMEM;
+
+ rx->qpl_copy_pool_mask = min_t(u32, U32_MAX, slots * 2) - 1;
+ rx->qpl_copy_pool_head = 0;
+ rx->qpl_copy_pool = kvcalloc(rx->qpl_copy_pool_mask + 1,
+ sizeof(rx->qpl_copy_pool[0]),
+ GFP_KERNEL);
+
+ if (!rx->qpl_copy_pool) {
+ err = -ENOMEM;
+ goto abort_with_slots;
+ }
+
filled_pages = gve_prefill_rx_pages(rx);
if (filled_pages < 0) {
err = -ENOMEM;
- goto abort_with_slots;
+ goto abort_with_copy_pool;
}
rx->fill_cnt = filled_pages;
/* Ensure data ring slots (packet buffers) are visible. */
@@ -203,6 +269,12 @@
rx->cnt = 0;
rx->db_threshold = priv->rx_desc_cnt / 2;
rx->desc.seqno = 1;
+
+ /* Allocating half-page buffers allows page-flipping which is faster
+ * than copying or allocating new pages.
+ */
+ rx->packet_buffer_size = PAGE_SIZE / 2;
+ gve_rx_ctx_clear(&rx->ctx);
gve_rx_add_to_block(priv, idx);
return 0;
@@ -213,6 +285,9 @@
rx->q_resources = NULL;
abort_filled:
gve_rx_unfill_pages(priv, rx);
+abort_with_copy_pool:
+ kvfree(rx->qpl_copy_pool);
+ rx->qpl_copy_pool = NULL;
abort_with_slots:
bytes = sizeof(*rx->data.data_ring) * slots;
dma_free_coherent(hdev, bytes, rx->data.data_ring, rx->data.data_bus);
@@ -271,18 +346,45 @@
static struct sk_buff *gve_rx_add_frags(struct napi_struct *napi,
struct gve_rx_slot_page_info *page_info,
- u16 len)
+ u16 packet_buffer_size, u16 len,
+ struct gve_rx_ctx *ctx)
{
- struct sk_buff *skb = napi_get_frags(napi);
+ u32 offset = page_info->page_offset + page_info->pad;
+ struct sk_buff *skb = ctx->skb_tail;
+ int num_frags = 0;
- if (unlikely(!skb))
- return NULL;
+ if (!skb) {
+ skb = napi_get_frags(napi);
+ if (unlikely(!skb))
+ return NULL;
- skb_add_rx_frag(skb, 0, page_info->page,
- page_info->page_offset +
- GVE_RX_PAD, len, PAGE_SIZE / 2);
+ ctx->skb_head = skb;
+ ctx->skb_tail = skb;
+ } else {
+ num_frags = skb_shinfo(ctx->skb_tail)->nr_frags;
+ if (num_frags == MAX_SKB_FRAGS) {
+ skb = napi_alloc_skb(napi, 0);
+ if (!skb)
+ return NULL;
- return skb;
+ // We will never chain more than two SKBs: 2 * 16 * 2k > 64k
+ // which is why we do not need to chain by using skb->next
+ skb_shinfo(ctx->skb_tail)->frag_list = skb;
+
+ ctx->skb_tail = skb;
+ num_frags = 0;
+ }
+ }
+
+ if (skb != ctx->skb_head) {
+ ctx->skb_head->len += len;
+ ctx->skb_head->data_len += len;
+ ctx->skb_head->truesize += packet_buffer_size;
+ }
+ skb_add_rx_frag(skb, num_frags, page_info->page,
+ offset, len, packet_buffer_size);
+
+ return ctx->skb_head;
}
static void gve_rx_flip_buff(struct gve_rx_slot_page_info *page_info, __be64 *slot_addr)
@@ -294,23 +396,18 @@
*(slot_addr) ^= offset;
}
-static bool gve_rx_can_flip_buffers(struct net_device *netdev)
+static int gve_rx_can_recycle_buffer(struct gve_rx_slot_page_info *page_info)
{
- return PAGE_SIZE == 4096
- ? netdev->mtu + GVE_RX_PAD + ETH_HLEN <= PAGE_SIZE / 2 : false;
-}
-
-static int gve_rx_can_recycle_buffer(struct page *page)
-{
- int pagecount = page_count(page);
+ int pagecount = page_count(page_info->page);
/* This page is not being used by any SKBs - reuse */
- if (pagecount == 1)
+ if (pagecount == page_info->pagecnt_bias)
return 1;
/* This page is still being used by an SKB - we can't reuse */
- else if (pagecount >= 2)
+ else if (pagecount > page_info->pagecnt_bias)
return 0;
- WARN(pagecount < 1, "Pagecount should never be < 1");
+ WARN(pagecount < page_info->pagecnt_bias,
+ "Pagecount should never be less than the bias.");
return -1;
}
@@ -318,19 +415,106 @@
gve_rx_raw_addressing(struct device *dev, struct net_device *netdev,
struct gve_rx_slot_page_info *page_info, u16 len,
struct napi_struct *napi,
- union gve_rx_data_slot *data_slot)
+ union gve_rx_data_slot *data_slot,
+ u16 packet_buffer_size, struct gve_rx_ctx *ctx)
{
- struct sk_buff *skb;
+ struct sk_buff *skb = gve_rx_add_frags(napi, page_info, packet_buffer_size, len, ctx);
- skb = gve_rx_add_frags(napi, page_info, len);
if (!skb)
return NULL;
- /* Optimistically stop the kernel from freeing the page by increasing
- * the page bias. We will check the refcount in refill to determine if
- * we need to alloc a new page.
+ /* Optimistically stop the kernel from freeing the page.
+ * We will check again in refill to determine if we need to alloc a
+ * new page.
*/
- get_page(page_info->page);
+ gve_dec_pagecnt_bias(page_info);
+
+ return skb;
+}
+
+static struct sk_buff *gve_rx_copy_to_pool(struct gve_rx_ring *rx,
+ struct gve_rx_slot_page_info *page_info,
+ u16 len, struct napi_struct *napi)
+{
+ u32 pool_idx = rx->qpl_copy_pool_head & rx->qpl_copy_pool_mask;
+ void *src = page_info->page_address + page_info->page_offset;
+ struct gve_rx_slot_page_info *copy_page_info;
+ struct gve_rx_ctx *ctx = &rx->ctx;
+ bool alloc_page = false;
+ struct sk_buff *skb;
+ void *dst;
+
+ copy_page_info = &rx->qpl_copy_pool[pool_idx];
+ if (!copy_page_info->can_flip) {
+ int recycle = gve_rx_can_recycle_buffer(copy_page_info);
+
+ if (unlikely(recycle < 0)) {
+ WRITE_ONCE(rx->gve->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
+ return NULL;
+ }
+ alloc_page = !recycle;
+ }
+
+ if (alloc_page) {
+ struct gve_rx_slot_page_info alloc_page_info;
+ struct page *page;
+
+ /* The least recently used page turned out to be
+ * still in use by the kernel. Ignoring it and moving
+ * on alleviates head-of-line blocking.
+ */
+ rx->qpl_copy_pool_head++;
+
+ page = alloc_page(GFP_ATOMIC);
+ if (!page)
+ return NULL;
+
+ alloc_page_info.page = page;
+ alloc_page_info.page_offset = 0;
+ alloc_page_info.page_address = page_address(page);
+ alloc_page_info.pad = page_info->pad;
+
+ memcpy(alloc_page_info.page_address, src, page_info->pad + len);
+ skb = gve_rx_add_frags(napi, &alloc_page_info,
+ rx->packet_buffer_size,
+ len, ctx);
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_copy_cnt++;
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+
+ return skb;
+ }
+
+ dst = copy_page_info->page_address + copy_page_info->page_offset;
+ memcpy(dst, src, page_info->pad + len);
+ copy_page_info->pad = page_info->pad;
+
+ skb = gve_rx_add_frags(napi, copy_page_info,
+ rx->packet_buffer_size, len, ctx);
+ if (unlikely(!skb))
+ return NULL;
+
+ gve_dec_pagecnt_bias(copy_page_info);
+ copy_page_info->page_offset += rx->packet_buffer_size;
+ copy_page_info->page_offset &= (PAGE_SIZE - 1);
+
+ if (copy_page_info->can_flip) {
+ /* We have used both halves of this copy page, it
+ * is time for it to go to the back of the queue.
+ */
+ copy_page_info->can_flip = false;
+ rx->qpl_copy_pool_head++;
+ prefetch(rx->qpl_copy_pool[rx->qpl_copy_pool_head & rx->qpl_copy_pool_mask].page);
+ } else {
+ copy_page_info->can_flip = true;
+ }
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_copy_cnt++;
+ u64_stats_update_end(&rx->statss);
return skb;
}
@@ -341,6 +525,7 @@
u16 len, struct napi_struct *napi,
union gve_rx_data_slot *data_slot)
{
+ struct gve_rx_ctx *ctx = &rx->ctx;
struct sk_buff *skb;
/* if raw_addressing mode is not enabled gvnic can only receive into
@@ -349,116 +534,177 @@
* device.
*/
if (page_info->can_flip) {
- skb = gve_rx_add_frags(napi, page_info, len);
+ skb = gve_rx_add_frags(napi, page_info, rx->packet_buffer_size, len, ctx);
/* No point in recycling if we didn't get the skb */
if (skb) {
/* Make sure that the page isn't freed. */
- get_page(page_info->page);
+ gve_dec_pagecnt_bias(page_info);
gve_rx_flip_buff(page_info, &data_slot->qpl_offset);
}
} else {
+ skb = gve_rx_copy_to_pool(rx, page_info, len, napi);
+ }
+ return skb;
+}
+
+static struct sk_buff *gve_rx_skb(struct gve_priv *priv, struct gve_rx_ring *rx,
+ struct gve_rx_slot_page_info *page_info, struct napi_struct *napi,
+ u16 len, union gve_rx_data_slot *data_slot,
+ bool is_only_frag)
+{
+ struct net_device *netdev = priv->dev;
+ struct gve_rx_ctx *ctx = &rx->ctx;
+ struct sk_buff *skb = NULL;
+
+ if (len <= priv->rx_copybreak && is_only_frag) {
+ /* Just copy small packets */
skb = gve_rx_copy(netdev, napi, page_info, len, GVE_RX_PAD);
if (skb) {
u64_stats_update_begin(&rx->statss);
rx->rx_copied_pkt++;
+ rx->rx_frag_copy_cnt++;
+ rx->rx_copybreak_pkt++;
u64_stats_update_end(&rx->statss);
}
+ } else {
+ int recycle = gve_rx_can_recycle_buffer(page_info);
+
+ if (unlikely(recycle < 0)) {
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(priv);
+ return NULL;
+ }
+ page_info->can_flip = recycle;
+ if (page_info->can_flip) {
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_flip_cnt++;
+ u64_stats_update_end(&rx->statss);
+ }
+
+ if (rx->data.raw_addressing) {
+ skb = gve_rx_raw_addressing(&priv->pdev->dev, netdev,
+ page_info, len, napi,
+ data_slot,
+ rx->packet_buffer_size, ctx);
+ } else {
+ skb = gve_rx_qpl(&priv->pdev->dev, netdev, rx,
+ page_info, len, napi, data_slot);
+ }
}
return skb;
}
-static bool gve_rx(struct gve_rx_ring *rx, struct gve_rx_desc *rx_desc,
- netdev_features_t feat, u32 idx)
+#define GVE_PKTCONT_BIT_IS_SET(x) (GVE_RXF_PKT_CONT & (x))
+static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat,
+ struct gve_rx_desc *desc, u32 idx,
+ struct gve_rx_cnts *cnts)
{
+ bool is_last_frag = !GVE_PKTCONT_BIT_IS_SET(desc->flags_seq);
struct gve_rx_slot_page_info *page_info;
- struct gve_priv *priv = rx->gve;
- struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
- struct net_device *dev = priv->dev;
+ u16 frag_size = be16_to_cpu(desc->len);
+ struct gve_rx_ctx *ctx = &rx->ctx;
union gve_rx_data_slot *data_slot;
+ struct gve_priv *priv = rx->gve;
struct sk_buff *skb = NULL;
dma_addr_t page_bus;
- u16 len;
+ void *va;
- /* drop this packet */
- if (unlikely(rx_desc->flags_seq & GVE_RXF_ERR)) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_desc_err_dropped_pkt++;
- u64_stats_update_end(&rx->statss);
- return false;
+ struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
+ bool is_first_frag = ctx->frag_cnt == 0;
+
+ bool is_only_frag = is_first_frag && is_last_frag;
+
+ if (unlikely(ctx->drop_pkt))
+ goto finish_frag;
+
+ if (desc->flags_seq & GVE_RXF_ERR) {
+ ctx->drop_pkt = true;
+ cnts->desc_err_pkt_cnt++;
+ napi_free_frags(napi);
+ goto finish_frag;
}
- len = be16_to_cpu(rx_desc->len) - GVE_RX_PAD;
- page_info = &rx->data.page_info[idx];
+ if (unlikely(frag_size > rx->packet_buffer_size)) {
+ netdev_warn(priv->dev, "Unexpected frag size %d, can't exceed %d, scheduling reset",
+ frag_size, rx->packet_buffer_size);
+ ctx->drop_pkt = true;
+ napi_free_frags(napi);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
+ goto finish_frag;
+ }
+ /* Prefetch two packet buffers ahead, we will need it soon. */
+ page_info = &rx->data.page_info[(idx + 2) & rx->mask];
+ va = page_info->page_address + page_info->page_offset;
+ prefetch(page_info->page); /* Kernel page struct. */
+ prefetch(va); /* Packet header. */
+ prefetch(va + 64); /* Next cacheline too. */
+
+ page_info = &rx->data.page_info[idx];
data_slot = &rx->data.data_ring[idx];
page_bus = (rx->data.raw_addressing) ?
- be64_to_cpu(data_slot->addr) & GVE_DATA_SLOT_ADDR_PAGE_MASK :
- rx->data.qpl->page_buses[idx];
+ be64_to_cpu(data_slot->addr) - page_info->page_offset :
+ rx->data.qpl->page_buses[idx];
dma_sync_single_for_cpu(&priv->pdev->dev, page_bus,
PAGE_SIZE, DMA_FROM_DEVICE);
+ page_info->pad = is_first_frag ? GVE_RX_PAD : 0;
+ frag_size -= page_info->pad;
- if (len <= priv->rx_copybreak) {
- /* Just copy small packets */
- skb = gve_rx_copy(dev, napi, page_info, len, GVE_RX_PAD);
- u64_stats_update_begin(&rx->statss);
- rx->rx_copied_pkt++;
- rx->rx_copybreak_pkt++;
- u64_stats_update_end(&rx->statss);
- } else {
- u8 can_flip = gve_rx_can_flip_buffers(dev);
- int recycle = 0;
-
- if (can_flip) {
- recycle = gve_rx_can_recycle_buffer(page_info->page);
- if (recycle < 0) {
- if (!rx->data.raw_addressing)
- gve_schedule_reset(priv);
- return false;
- }
- }
-
- page_info->can_flip = can_flip && recycle;
- if (rx->data.raw_addressing) {
- skb = gve_rx_raw_addressing(&priv->pdev->dev, dev,
- page_info, len, napi,
- data_slot);
- } else {
- skb = gve_rx_qpl(&priv->pdev->dev, dev, rx,
- page_info, len, napi, data_slot);
- }
- }
-
+ skb = gve_rx_skb(priv, rx, page_info, napi, frag_size,
+ data_slot, is_only_frag);
if (!skb) {
u64_stats_update_begin(&rx->statss);
rx->rx_skb_alloc_fail++;
u64_stats_update_end(&rx->statss);
- return false;
+
+ napi_free_frags(napi);
+ ctx->drop_pkt = true;
+ goto finish_frag;
+ }
+ ctx->total_size += frag_size;
+
+ if (is_first_frag) {
+ if (likely(feat & NETIF_F_RXCSUM)) {
+ /* NIC passes up the partial sum */
+ if (desc->csum)
+ skb->ip_summed = CHECKSUM_COMPLETE;
+ else
+ skb->ip_summed = CHECKSUM_NONE;
+ skb->csum = csum_unfold(desc->csum);
+ }
+
+ /* parse flags & pass relevant info up */
+ if (likely(feat & NETIF_F_RXHASH) &&
+ gve_needs_rss(desc->flags_seq))
+ skb_set_hash(skb, be32_to_cpu(desc->rss_hash),
+ gve_rss_type(desc->flags_seq));
}
- if (likely(feat & NETIF_F_RXCSUM)) {
- /* NIC passes up the partial sum */
- if (rx_desc->csum)
- skb->ip_summed = CHECKSUM_COMPLETE;
+ if (is_last_frag) {
+ skb_record_rx_queue(skb, rx->q_num);
+ if (skb_is_nonlinear(skb))
+ napi_gro_frags(napi);
else
- skb->ip_summed = CHECKSUM_NONE;
- skb->csum = csum_unfold(rx_desc->csum);
+ napi_gro_receive(napi, skb);
+ goto finish_ok_pkt;
}
- /* parse flags & pass relevant info up */
- if (likely(feat & NETIF_F_RXHASH) &&
- gve_needs_rss(rx_desc->flags_seq))
- skb_set_hash(skb, be32_to_cpu(rx_desc->rss_hash),
- gve_rss_type(rx_desc->flags_seq));
+ goto finish_frag;
- skb_record_rx_queue(skb, rx->q_num);
- if (skb_is_nonlinear(skb))
- napi_gro_frags(napi);
- else
- napi_gro_receive(napi, skb);
- return true;
+finish_ok_pkt:
+ cnts->ok_pkt_bytes += ctx->total_size;
+ cnts->ok_pkt_cnt++;
+finish_frag:
+ ctx->frag_cnt++;
+ if (is_last_frag) {
+ cnts->total_pkt_cnt++;
+ cnts->cont_pkt_cnt += (ctx->frag_cnt > 1);
+ gve_rx_ctx_clear(ctx);
+ }
}
-static bool gve_rx_work_pending(struct gve_rx_ring *rx)
+bool gve_rx_work_pending(struct gve_rx_ring *rx)
{
struct gve_rx_desc *desc;
__be16 flags_seq;
@@ -468,8 +714,6 @@
desc = rx->desc.desc_ring + next_idx;
flags_seq = desc->flags_seq;
- /* Make sure we have synchronized the seq no with the device */
- smp_rmb();
return (GVE_SEQNO(flags_seq) == rx->desc.seqno);
}
@@ -501,11 +745,13 @@
* owns half the page it is impossible to tell which half. Either
* the whole page is free or it needs to be replaced.
*/
- int recycle = gve_rx_can_recycle_buffer(page_info->page);
+ int recycle = gve_rx_can_recycle_buffer(page_info);
if (recycle < 0) {
- if (!rx->data.raw_addressing)
+ if (!rx->data.raw_addressing) {
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
gve_schedule_reset(priv);
+ }
return false;
}
if (!recycle) {
@@ -513,7 +759,6 @@
union gve_rx_data_slot *data_slot =
&rx->data.data_ring[idx];
struct device *dev = &priv->pdev->dev;
-
gve_rx_free_buffer(dev, page_info, data_slot);
page_info->page = NULL;
if (gve_rx_alloc_buffer(priv, dev, page_info,
@@ -531,78 +776,86 @@
return true;
}
-bool gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
- netdev_features_t feat)
+static int gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
+ netdev_features_t feat)
{
+ struct gve_rx_ctx *ctx = &rx->ctx;
struct gve_priv *priv = rx->gve;
- u32 work_done = 0, packets = 0;
- struct gve_rx_desc *desc;
- u32 cnt = rx->cnt;
- u32 idx = cnt & rx->mask;
- u64 bytes = 0;
+ struct gve_rx_cnts cnts = {0};
+ struct gve_rx_desc *next_desc;
+ u32 idx = rx->cnt & rx->mask;
+ u32 work_done = 0;
- desc = rx->desc.desc_ring + idx;
+ struct gve_rx_desc *desc = &rx->desc.desc_ring[idx];
+
+ // Exceed budget only if (and till) the inflight packet is consumed.
while ((GVE_SEQNO(desc->flags_seq) == rx->desc.seqno) &&
- work_done < budget) {
- bool dropped;
+ (work_done < budget || ctx->frag_cnt)) {
+ next_desc = &rx->desc.desc_ring[(idx + 1) & rx->mask];
+ prefetch(next_desc);
- netif_info(priv, rx_status, priv->dev,
- "[%d] idx=%d desc=%p desc->flags_seq=0x%x\n",
- rx->q_num, idx, desc, desc->flags_seq);
- netif_info(priv, rx_status, priv->dev,
- "[%d] seqno=%d rx->desc.seqno=%d\n",
- rx->q_num, GVE_SEQNO(desc->flags_seq),
- rx->desc.seqno);
- dropped = !gve_rx(rx, desc, feat, idx);
- if (!dropped) {
- bytes += be16_to_cpu(desc->len) - GVE_RX_PAD;
- packets++;
- }
- cnt++;
- idx = cnt & rx->mask;
- desc = rx->desc.desc_ring + idx;
+ gve_rx(rx, feat, desc, idx, &cnts);
+
+ rx->cnt++;
+ idx = rx->cnt & rx->mask;
+ desc = &rx->desc.desc_ring[idx];
rx->desc.seqno = gve_next_seqno(rx->desc.seqno);
work_done++;
}
- if (!work_done && rx->fill_cnt - cnt > rx->db_threshold)
- return false;
+ // The device will only send whole packets.
+ if (unlikely(ctx->frag_cnt)) {
+ struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
- u64_stats_update_begin(&rx->statss);
- rx->rpackets += packets;
- rx->rbytes += bytes;
- u64_stats_update_end(&rx->statss);
- rx->cnt = cnt;
+ napi_free_frags(napi);
+ gve_rx_ctx_clear(&rx->ctx);
+ netdev_warn(priv->dev, "Unexpected seq number %d with incomplete packet, expected %d, scheduling reset",
+ GVE_SEQNO(desc->flags_seq), rx->desc.seqno);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
+ }
+
+ if (!work_done && rx->fill_cnt - rx->cnt > rx->db_threshold)
+ return 0;
+
+ if (work_done) {
+ u64_stats_update_begin(&rx->statss);
+ rx->rpackets += cnts.ok_pkt_cnt;
+ rx->rbytes += cnts.ok_pkt_bytes;
+ rx->rx_cont_packet_cnt += cnts.cont_pkt_cnt;
+ rx->rx_desc_err_dropped_pkt += cnts.desc_err_pkt_cnt;
+ u64_stats_update_end(&rx->statss);
+ }
/* restock ring slots */
if (!rx->data.raw_addressing) {
/* In QPL mode buffs are refilled as the desc are processed */
rx->fill_cnt += work_done;
- } else if (rx->fill_cnt - cnt <= rx->db_threshold) {
+ } else if (rx->fill_cnt - rx->cnt <= rx->db_threshold) {
/* In raw addressing mode buffs are only refilled if the avail
* falls below a threshold.
*/
if (!gve_rx_refill_buffers(priv, rx))
- return false;
+ return 0;
/* If we were not able to completely refill buffers, we'll want
* to schedule this queue for work again to refill buffers.
*/
- if (rx->fill_cnt - cnt <= rx->db_threshold) {
+ if (rx->fill_cnt - rx->cnt <= rx->db_threshold) {
gve_rx_write_doorbell(priv, rx);
- return true;
+ return budget;
}
}
gve_rx_write_doorbell(priv, rx);
- return gve_rx_work_pending(rx);
+ return cnts.total_pkt_cnt;
}
-bool gve_rx_poll(struct gve_notify_block *block, int budget)
+int gve_rx_poll(struct gve_notify_block *block, int budget)
{
struct gve_rx_ring *rx = block->rx;
netdev_features_t feat;
- bool repoll = false;
+ int work_done = 0;
feat = block->napi.dev->features;
@@ -611,8 +864,7 @@
budget = INT_MAX;
if (budget > 0)
- repoll |= gve_clean_rx_done(rx, budget, feat);
- else
- repoll |= gve_rx_work_pending(rx);
- return repoll;
+ work_done = gve_clean_rx_done(rx, budget, feat);
+
+ return work_done;
}
diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index d947c2c..6c82b66 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -12,6 +12,7 @@
#include <linux/ipv6.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
+#include <linux/dma-buf.h>
#include <net/ip6_checksum.h>
#include <net/ipv6.h>
#include <net/tcp.h>
@@ -22,11 +23,13 @@
}
static void gve_free_page_dqo(struct gve_priv *priv,
- struct gve_rx_buf_state_dqo *bs)
+ struct gve_rx_buf_state_dqo *bs,
+ bool free_page)
{
page_ref_sub(bs->page_info.page, bs->page_info.pagecnt_bias - 1);
- gve_free_page(&priv->pdev->dev, bs->page_info.page, bs->addr,
- DMA_FROM_DEVICE);
+ if (free_page)
+ gve_free_page(&priv->pdev->dev, bs->page_info.page, bs->addr,
+ DMA_FROM_DEVICE);
bs->page_info.page = NULL;
}
@@ -109,6 +112,13 @@
}
}
+static void gve_recycle_buf(struct gve_rx_ring *rx,
+ struct gve_rx_buf_state_dqo *buf_state)
+{
+ buf_state->hdr_buf = NULL;
+ gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+}
+
static struct gve_rx_buf_state_dqo *
gve_get_recycled_buf_state(struct gve_rx_ring *rx)
{
@@ -130,12 +140,20 @@
*/
for (i = 0; i < 5; i++) {
buf_state = gve_dequeue_buf_state(rx, &rx->dqo.used_buf_states);
- if (gve_buf_ref_cnt(buf_state) == 0)
+ if (gve_buf_ref_cnt(buf_state) == 0) {
+ rx->dqo.used_buf_states_cnt--;
return buf_state;
+ }
gve_enqueue_buf_state(rx, &rx->dqo.used_buf_states, buf_state);
}
+ /* For QPL, we cannot allocate any new buffers and must
+ * wait for the existing ones to be available.
+ */
+ if (rx->dqo.qpl)
+ return NULL;
+
/* If there are no free buf states discard an entry from
* `used_buf_states` so it can be used.
*/
@@ -144,7 +162,7 @@
if (gve_buf_ref_cnt(buf_state) == 0)
return buf_state;
- gve_free_page_dqo(rx->gve, buf_state);
+ gve_free_page_dqo(rx->gve, buf_state, true);
gve_free_buf_state(rx, buf_state);
}
@@ -152,14 +170,62 @@
}
static int gve_alloc_page_dqo(struct gve_priv *priv,
- struct gve_rx_buf_state_dqo *buf_state)
+ struct gve_rx_buf_state_dqo *buf_state,
+ struct gve_rx_ring *rx)
{
+ struct netdev_rx_queue *rxq = NULL;
+ struct scatterlist sgl;
+ int num_pages_mapped;
int err;
+ u32 idx;
- err = gve_alloc_page(priv, &priv->pdev->dev, &buf_state->page_info.page,
- &buf_state->addr, DMA_FROM_DEVICE, GFP_ATOMIC);
- if (err)
- return err;
+ if (rx)
+ rxq = __netif_get_rx_queue(priv->dev, rx->q_num);
+
+ if (rxq && unlikely(rcu_access_pointer(rxq->dmabuf_pages))) {
+ buf_state->page_info.page =
+ netdev_rxq_alloc_dma_buf_page(rxq, 0);
+
+ if (!buf_state->page_info.page) {
+ priv->page_alloc_fail++;
+ return -ENOMEM;
+ }
+
+ BUG_ON(!is_dma_buf_page(buf_state->page_info.page));
+
+ sgl.offset = 0;
+ sgl.length = PAGE_SIZE;
+ sgl.page_link = (unsigned long)buf_state->page_info.page;
+ num_pages_mapped = dma_buf_map_sg(&priv->pdev->dev, &sgl, 1,
+ DMA_FROM_DEVICE);
+ if (!num_pages_mapped) {
+ net_err_ratelimited(
+ "dma_buf_map_sg failed (num_mapped (%d) <= 0)\n",
+ num_pages_mapped);
+ netdev_rxq_free_page(buf_state->page_info.page);
+ return -ENOMEM;
+ }
+ buf_state->addr = sgl.dma_address;
+ } else {
+ if (!rx->dqo.qpl) {
+ err = gve_alloc_page(priv, &priv->pdev->dev,
+ &buf_state->page_info.page,
+ &buf_state->addr,
+ DMA_FROM_DEVICE, GFP_ATOMIC);
+ if (err)
+ return err;
+ } else {
+ idx = rx->dqo.next_qpl_page_idx;
+ if (idx >= priv->rx_pages_per_qpl) {
+ net_err_ratelimited("%s: Out of QPL pages\n",
+ priv->dev->name);
+ return -ENOMEM;
+ }
+ buf_state->page_info.page = rx->dqo.qpl->pages[idx];
+ buf_state->addr = rx->dqo.qpl->page_buses[idx];
+ rx->dqo.next_qpl_page_idx++;
+ }
+ }
buf_state->page_info.page_offset = 0;
buf_state->page_info.page_address =
@@ -170,9 +236,33 @@
page_ref_add(buf_state->page_info.page, INT_MAX - 1);
buf_state->page_info.pagecnt_bias = INT_MAX;
+ /* Update stats for RDA. */
+ if (!rx->dqo.qpl) {
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+ }
+
return 0;
}
+static void gve_rx_free_hdr_bufs(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ int buffer_queue_slots = rx->dqo.bufq.mask + 1;
+ int i;
+
+ if (rx->dqo.hdr_bufs) {
+ for (i = 0; i < buffer_queue_slots; i++)
+ if (rx->dqo.hdr_bufs[i].data)
+ dma_pool_free(priv->header_buf_pool,
+ rx->dqo.hdr_bufs[i].data,
+ rx->dqo.hdr_bufs[i].addr);
+ kvfree(rx->dqo.hdr_bufs);
+ rx->dqo.hdr_bufs = NULL;
+ }
+}
+
static void gve_rx_free_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_rx_ring *rx = &priv->rx[idx];
@@ -195,9 +285,13 @@
for (i = 0; i < rx->dqo.num_buf_states; i++) {
struct gve_rx_buf_state_dqo *bs = &rx->dqo.buf_states[i];
-
+ /* Only free page for RDA. QPL pages are freed in gve_main. */
if (bs->page_info.page)
- gve_free_page_dqo(priv, bs);
+ gve_free_page_dqo(priv, bs, !rx->dqo.qpl);
+ }
+ if (rx->dqo.qpl) {
+ gve_unassign_qpl(priv, rx->dqo.qpl->id);
+ rx->dqo.qpl = NULL;
}
if (rx->dqo.bufq.desc_ring) {
@@ -218,18 +312,116 @@
kvfree(rx->dqo.buf_states);
rx->dqo.buf_states = NULL;
+ gve_rx_free_hdr_bufs(priv, idx);
+
netif_dbg(priv, drv, priv->dev, "freed rx ring %d\n", idx);
}
+static int gve_rx_alloc_hdr_bufs(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ int buffer_queue_slots = rx->dqo.bufq.mask + 1;
+ int i;
+
+ rx->dqo.hdr_bufs = kvcalloc(buffer_queue_slots,
+ sizeof(rx->dqo.hdr_bufs[0]),
+ GFP_KERNEL);
+ if (!rx->dqo.hdr_bufs)
+ return -ENOMEM;
+
+ for (i = 0; i < buffer_queue_slots; i++) {
+ rx->dqo.hdr_bufs[i].data =
+ dma_pool_alloc(priv->header_buf_pool,
+ GFP_KERNEL,
+ &rx->dqo.hdr_bufs[i].addr);
+ if (!rx->dqo.hdr_bufs[i].data)
+ goto err;
+ }
+
+ return 0;
+err:
+ gve_rx_free_hdr_bufs(priv, idx);
+ return -ENOMEM;
+}
+
+static void gve_rx_init_ring_state_dqo(struct gve_rx_ring *rx,
+ const u32 buffer_queue_slots,
+ const u32 completion_queue_slots)
+{
+ int i;
+
+ /* Set buffer queue state */
+ rx->dqo.bufq.mask = buffer_queue_slots - 1;
+ rx->dqo.bufq.head = 0;
+ rx->dqo.bufq.tail = 0;
+
+ /* Set completion queue state */
+ rx->dqo.complq.num_free_slots = completion_queue_slots;
+ rx->dqo.complq.mask = completion_queue_slots - 1;
+ rx->dqo.complq.cur_gen_bit = 0;
+ rx->dqo.complq.head = 0;
+
+ /* Set RX SKB context */
+ rx->ctx.skb_head = NULL;
+ rx->ctx.skb_tail = NULL;
+
+ /* Set up linked list of buffer IDs */
+ for (i = 0; i < rx->dqo.num_buf_states - 1; i++)
+ rx->dqo.buf_states[i].next = i + 1;
+ rx->dqo.buf_states[rx->dqo.num_buf_states - 1].next = -1;
+
+ rx->dqo.free_buf_states = 0;
+ rx->dqo.recycled_buf_states.head = -1;
+ rx->dqo.recycled_buf_states.tail = -1;
+ rx->dqo.used_buf_states.head = -1;
+ rx->dqo.used_buf_states.tail = -1;
+}
+
+static void gve_rx_reset_ring_dqo(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ size_t size;
+ int i;
+
+ const u32 buffer_queue_slots = priv->rx_desc_cnt;
+ const u32 completion_queue_slots = priv->rx_desc_cnt;
+
+ netif_dbg(priv, drv, priv->dev, "Resetting rx ring \n");
+
+ /* Reset buffer queue */
+ size = sizeof(rx->dqo.bufq.desc_ring[0]) *
+ buffer_queue_slots;
+ memset(rx->dqo.bufq.desc_ring, 0 , size);
+
+ /* Reset completion queue */
+ size = sizeof(rx->dqo.complq.desc_ring[0]) *
+ completion_queue_slots;
+ memset(rx->dqo.complq.desc_ring, 0, size);
+
+ /* Reset q_resources */
+ memset(rx->q_resources, 0, sizeof(*rx->q_resources));
+
+ /* Reset buf states */
+ for (i = 0; i < rx->dqo.num_buf_states; i++) {
+ struct gve_rx_buf_state_dqo *bs = &rx->dqo.buf_states[i];
+
+ if (bs->page_info.page)
+ gve_free_page_dqo(priv, bs, !rx->dqo.qpl);
+ }
+
+ gve_rx_init_ring_state_dqo(rx, buffer_queue_slots,
+ completion_queue_slots);
+}
+
static int gve_rx_alloc_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_rx_ring *rx = &priv->rx[idx];
struct device *hdev = &priv->pdev->dev;
size_t size;
- int i;
const u32 buffer_queue_slots =
- priv->options_dqo_rda.rx_buff_ring_entries;
+ priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ priv->options_dqo_rda.rx_buff_ring_entries : priv->rx_desc_cnt;
const u32 completion_queue_slots = priv->rx_desc_cnt;
netif_dbg(priv, drv, priv->dev, "allocating rx ring DQO\n");
@@ -237,29 +429,17 @@
memset(rx, 0, sizeof(*rx));
rx->gve = priv;
rx->q_num = idx;
- rx->dqo.bufq.mask = buffer_queue_slots - 1;
- rx->dqo.complq.num_free_slots = completion_queue_slots;
- rx->dqo.complq.mask = completion_queue_slots - 1;
- rx->skb_head = NULL;
- rx->skb_tail = NULL;
- rx->dqo.num_buf_states = min_t(s16, S16_MAX, buffer_queue_slots * 4);
+ /* Allocate buf states */
+ rx->dqo.num_buf_states = priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ min_t(s16, S16_MAX, buffer_queue_slots * 8) :
+ priv->rx_pages_per_qpl;
rx->dqo.buf_states = kvcalloc(rx->dqo.num_buf_states,
sizeof(rx->dqo.buf_states[0]),
GFP_KERNEL);
if (!rx->dqo.buf_states)
return -ENOMEM;
- /* Set up linked list of buffer IDs */
- for (i = 0; i < rx->dqo.num_buf_states - 1; i++)
- rx->dqo.buf_states[i].next = i + 1;
-
- rx->dqo.buf_states[rx->dqo.num_buf_states - 1].next = -1;
- rx->dqo.recycled_buf_states.head = -1;
- rx->dqo.recycled_buf_states.tail = -1;
- rx->dqo.used_buf_states.head = -1;
- rx->dqo.used_buf_states.tail = -1;
-
/* Allocate RX completion queue */
size = sizeof(rx->dqo.complq.desc_ring[0]) *
completion_queue_slots;
@@ -275,11 +455,26 @@
if (!rx->dqo.bufq.desc_ring)
goto err;
+ if (priv->queue_format != GVE_DQO_RDA_FORMAT) {
+ rx->dqo.qpl = gve_assign_rx_qpl(priv);
+ if (!rx->dqo.qpl)
+ goto err;
+ rx->dqo.next_qpl_page_idx = 0;
+ }
+
rx->q_resources = dma_alloc_coherent(hdev, sizeof(*rx->q_resources),
&rx->q_resources_bus, GFP_KERNEL);
if (!rx->q_resources)
goto err;
+ gve_rx_init_ring_state_dqo(rx, buffer_queue_slots,
+ completion_queue_slots);
+
+ /* Allocate header buffers for header-split */
+ if (priv->header_buf_pool)
+ if (gve_rx_alloc_hdr_bufs(priv, idx))
+ goto err;
+
gve_rx_add_to_block(priv, idx);
return 0;
@@ -297,10 +492,28 @@
iowrite32(rx->dqo.bufq.tail, &priv->db_bar2[index]);
}
+static int gve_rx_alloc_hdr_buf_pool(struct gve_priv *priv)
+{
+ priv->header_buf_pool = dma_pool_create("header_bufs",
+ &priv->pdev->dev,
+ priv->header_buf_size,
+ 64, 0);
+ if (!priv->header_buf_pool)
+ return -ENOMEM;
+
+ return 0;
+}
+
int gve_rx_alloc_rings_dqo(struct gve_priv *priv)
{
int err = 0;
- int i;
+ int i = 0;
+
+ if (gve_get_enable_header_split(priv)) {
+ err = gve_rx_alloc_hdr_buf_pool(priv);
+ if (err)
+ goto err;
+ }
for (i = 0; i < priv->rx_cfg.num_queues; i++) {
err = gve_rx_alloc_ring_dqo(priv, i);
@@ -321,12 +534,23 @@
return err;
}
+void gve_rx_reset_rings_dqo(struct gve_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->rx_cfg.num_queues; i++)
+ gve_rx_reset_ring_dqo(priv, i);
+}
+
void gve_rx_free_rings_dqo(struct gve_priv *priv)
{
int i;
for (i = 0; i < priv->rx_cfg.num_queues; i++)
gve_rx_free_ring_dqo(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
}
void gve_rx_post_buffers_dqo(struct gve_rx_ring *rx)
@@ -352,7 +576,7 @@
if (unlikely(!buf_state))
break;
- if (unlikely(gve_alloc_page_dqo(priv, buf_state))) {
+ if (unlikely(gve_alloc_page_dqo(priv, buf_state, rx))) {
u64_stats_update_begin(&rx->statss);
rx->rx_buf_alloc_fail++;
u64_stats_update_end(&rx->statss);
@@ -364,6 +588,12 @@
desc->buf_id = cpu_to_le16(buf_state - rx->dqo.buf_states);
desc->buf_addr = cpu_to_le64(buf_state->addr +
buf_state->page_info.page_offset);
+ if (rx->dqo.hdr_bufs) {
+ struct gve_header_buf *hdr_buf =
+ &rx->dqo.hdr_bufs[bufq->tail];
+ buf_state->hdr_buf = hdr_buf;
+ desc->header_buf_addr = cpu_to_le64(hdr_buf->addr);
+ }
bufq->tail = (bufq->tail + 1) & bufq->mask;
complq->num_free_slots--;
@@ -410,11 +640,12 @@
goto mark_used;
}
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ gve_recycle_buf(rx, buf_state);
return;
mark_used:
gve_enqueue_buf_state(rx, &rx->dqo.used_buf_states, buf_state);
+ rx->dqo.used_buf_states_cnt++;
}
static void gve_rx_skb_csum(struct sk_buff *skb,
@@ -465,14 +696,53 @@
skb_set_hash(skb, le32_to_cpu(compl_desc->hash), hash_type);
}
-static void gve_rx_free_skb(struct gve_rx_ring *rx)
+static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx)
{
- if (!rx->skb_head)
+ if (!rx->ctx.skb_head)
return;
- dev_kfree_skb_any(rx->skb_head);
- rx->skb_head = NULL;
- rx->skb_tail = NULL;
+ if (rx->ctx.skb_head == napi->skb)
+ napi->skb = NULL;
+ dev_kfree_skb_any(rx->ctx.skb_head);
+ rx->ctx.skb_head = NULL;
+ rx->ctx.skb_tail = NULL;
+}
+
+static bool gve_rx_should_trigger_copy_ondemand(struct gve_rx_ring *rx)
+{
+ if (!rx->dqo.qpl)
+ return false;
+ if (rx->dqo.used_buf_states_cnt <
+ (rx->dqo.num_buf_states -
+ GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD))
+ return false;
+ return true;
+}
+
+static int gve_rx_copy_ondemand(struct gve_rx_ring *rx,
+ struct gve_rx_buf_state_dqo *buf_state,
+ u16 buf_len)
+{
+ struct page *page = alloc_page(GFP_ATOMIC);
+ int num_frags;
+
+ if (!page)
+ return -ENOMEM;
+
+ memcpy(page_address(page),
+ buf_state->page_info.page_address +
+ buf_state->page_info.page_offset,
+ buf_len);
+ num_frags = skb_shinfo(rx->ctx.skb_tail)->nr_frags;
+ skb_add_rx_frag(rx->ctx.skb_tail, num_frags, page,
+ 0, buf_len, PAGE_SIZE);
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+ /* Return unused buffer. */
+ gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ return 0;
}
/* Chains multi skbs for single rx packet.
@@ -483,7 +753,7 @@
u16 buf_len, struct gve_rx_ring *rx,
struct gve_priv *priv)
{
- int num_frags = skb_shinfo(rx->skb_tail)->nr_frags;
+ int num_frags = skb_shinfo(rx->ctx.skb_tail)->nr_frags;
if (unlikely(num_frags == MAX_SKB_FRAGS)) {
struct sk_buff *skb;
@@ -492,22 +762,36 @@
if (!skb)
return -1;
- skb_shinfo(rx->skb_tail)->frag_list = skb;
- rx->skb_tail = skb;
+ if (rx->ctx.skb_tail == rx->ctx.skb_head)
+ skb_shinfo(rx->ctx.skb_tail)->frag_list = skb;
+ else
+ rx->ctx.skb_tail->next = skb;
+ rx->ctx.skb_tail = skb;
num_frags = 0;
}
- if (rx->skb_tail != rx->skb_head) {
- rx->skb_head->len += buf_len;
- rx->skb_head->data_len += buf_len;
- rx->skb_head->truesize += priv->data_buffer_size_dqo;
+ if (rx->ctx.skb_tail != rx->ctx.skb_head) {
+ rx->ctx.skb_head->len += buf_len;
+ rx->ctx.skb_head->data_len += buf_len;
+ rx->ctx.skb_head->truesize += priv->data_buffer_size_dqo;
}
- skb_add_rx_frag(rx->skb_tail, num_frags,
+ /* Trigger ondemand page allocation if we are running low on buffers */
+ if (gve_rx_should_trigger_copy_ondemand(rx))
+ return gve_rx_copy_ondemand(rx, buf_state, buf_len);
+
+ skb_add_rx_frag(rx->ctx.skb_tail, num_frags,
buf_state->page_info.page,
buf_state->page_info.page_offset,
buf_len, priv->data_buffer_size_dqo);
gve_dec_pagecnt_bias(&buf_state->page_info);
+ if (is_dma_buf_page(buf_state->page_info.page))
+ rx->ctx.skb_tail->devmem = 1;
+
+ /* Advances buffer page-offset if page is partially used.
+ * Marks buffer as used if page is full.
+ */
+ gve_try_recycle_buf(priv, rx, buf_state);
return 0;
}
@@ -520,10 +804,13 @@
int queue_idx)
{
const u16 buffer_id = le16_to_cpu(compl_desc->buf_id);
+ const bool hbo = compl_desc->header_buffer_overflow != 0;
const bool eop = compl_desc->end_of_packet != 0;
+ const bool sph = compl_desc->split_header != 0;
struct gve_rx_buf_state_dqo *buf_state;
struct gve_priv *priv = rx->gve;
u16 buf_len;
+ u16 hdr_len;
if (unlikely(buffer_id >= rx->dqo.num_buf_states)) {
net_err_ratelimited("%s: Invalid RX buffer_id=%u\n",
@@ -538,66 +825,130 @@
}
if (unlikely(compl_desc->rx_error)) {
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states,
- buf_state);
+ net_err_ratelimited("%s: Descriptor error=%u\n",
+ priv->dev->name, compl_desc->rx_error);
+ gve_recycle_buf(rx, buf_state);
return -EINVAL;
}
buf_len = compl_desc->packet_len;
+ hdr_len = compl_desc->header_len;
+
+ if (unlikely(sph && !hdr_len)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EINVAL;
+ }
+
+ if (unlikely(hdr_len && buf_state->hdr_buf == NULL)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EINVAL;
+ }
+
+ if (unlikely(hbo && priv->header_split_strict)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EFAULT;
+ }
/* Page might have not been used for awhile and was likely last written
* by a different thread.
*/
prefetch(buf_state->page_info.page);
+ if (!sph && !rx->ctx.skb_head &&
+ is_dma_buf_page(buf_state->page_info.page)) {
+ /* !sph indicates the packet is not split, and the header went
+ * to the packet buffer. If the packet buffer is a dma_buf
+ * page, those can't be easily mapped into the kernel space to
+ * access the header required to process the packet.
+ *
+ * In the future we may be able to map the dma_buf page to
+ * kernel space to access the header for dma_buf providers that
+ * support that, but for now, simply drop the packet. We expect
+ * the TCP packets that we care about to be header split
+ * anyway.
+ */
+ rx->rx_devmem_dropped++;
+ gve_recycle_buf(rx, buf_state);
+ return -EFAULT;
+ }
+
+ /* Copy the header into the skb in the case of header split */
+ if (sph) {
+ dma_sync_single_for_cpu(&priv->pdev->dev,
+ buf_state->hdr_buf->addr,
+ hdr_len, DMA_FROM_DEVICE);
+
+ rx->ctx.skb_head = gve_rx_copy_data(priv->dev, napi,
+ buf_state->hdr_buf->data,
+ hdr_len);
+ if (unlikely(!rx->ctx.skb_head))
+ goto error;
+
+ rx->ctx.skb_tail = rx->ctx.skb_head;
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_hsplit_pkt++;
+ rx->rx_hsplit_hbo_pkt += hbo;
+ rx->rheader_bytes += hdr_len;
+ u64_stats_update_end(&rx->statss);
+ }
+
/* Sync the portion of dma buffer for CPU to read. */
dma_sync_single_range_for_cpu(&priv->pdev->dev, buf_state->addr,
buf_state->page_info.page_offset,
buf_len, DMA_FROM_DEVICE);
/* Append to current skb if one exists. */
- if (rx->skb_head) {
+ if (rx->ctx.skb_head) {
if (unlikely(gve_rx_append_frags(napi, buf_state, buf_len, rx,
- priv)) != 0) {
+ priv)) != 0)
goto error;
- }
-
- gve_try_recycle_buf(priv, rx, buf_state);
return 0;
}
- if (eop && buf_len <= priv->rx_copybreak) {
- rx->skb_head = gve_rx_copy(priv->dev, napi,
+ /* We can't copy dma-buf pages. Ignore any copybreak setting. */
+ if (eop && buf_len <= priv->rx_copybreak &&
+ (!is_dma_buf_page(buf_state->page_info.page) || !buf_len)) {
+ rx->ctx.skb_head = gve_rx_copy(priv->dev, napi,
&buf_state->page_info, buf_len, 0);
- if (unlikely(!rx->skb_head))
+ if (unlikely(!rx->ctx.skb_head))
goto error;
- rx->skb_tail = rx->skb_head;
+ rx->ctx.skb_tail = rx->ctx.skb_head;
u64_stats_update_begin(&rx->statss);
rx->rx_copied_pkt++;
rx->rx_copybreak_pkt++;
u64_stats_update_end(&rx->statss);
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states,
- buf_state);
+ gve_recycle_buf(rx, buf_state);
return 0;
}
- rx->skb_head = napi_get_frags(napi);
- if (unlikely(!rx->skb_head))
+ rx->ctx.skb_head = napi_get_frags(napi);
+ if (unlikely(!rx->ctx.skb_head))
goto error;
- rx->skb_tail = rx->skb_head;
+ rx->ctx.skb_tail = rx->ctx.skb_head;
- skb_add_rx_frag(rx->skb_head, 0, buf_state->page_info.page,
+ if (gve_rx_should_trigger_copy_ondemand(rx)) {
+ if (gve_rx_copy_ondemand(rx, buf_state, buf_len) < 0)
+ goto error;
+ return 0;
+ }
+
+ skb_add_rx_frag(rx->ctx.skb_head, 0, buf_state->page_info.page,
buf_state->page_info.page_offset, buf_len,
priv->data_buffer_size_dqo);
gve_dec_pagecnt_bias(&buf_state->page_info);
+ if (is_dma_buf_page(buf_state->page_info.page))
+ rx->ctx.skb_head->devmem = 1;
+
gve_try_recycle_buf(priv, rx, buf_state);
return 0;
error:
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ dev_err(&priv->pdev->dev, "%s: Error return", priv->dev->name);
+ gve_recycle_buf(rx, buf_state);
return -ENOMEM;
}
@@ -635,27 +986,32 @@
rx->gve->ptype_lut_dqo->ptypes[desc->packet_type];
int err;
- skb_record_rx_queue(rx->skb_head, rx->q_num);
+ skb_record_rx_queue(rx->ctx.skb_head, rx->q_num);
if (feat & NETIF_F_RXHASH)
- gve_rx_skb_hash(rx->skb_head, desc, ptype);
+ gve_rx_skb_hash(rx->ctx.skb_head, desc, ptype);
if (feat & NETIF_F_RXCSUM)
- gve_rx_skb_csum(rx->skb_head, desc, ptype);
+ gve_rx_skb_csum(rx->ctx.skb_head, desc, ptype);
/* RSC packets must set gso_size otherwise the TCP stack will complain
* that packets are larger than MTU.
*/
if (desc->rsc) {
- err = gve_rx_complete_rsc(rx->skb_head, desc, ptype);
+ err = gve_rx_complete_rsc(rx->ctx.skb_head, desc, ptype);
if (err < 0)
return err;
}
- if (skb_headlen(rx->skb_head) == 0)
+ if (skb_headlen(rx->ctx.skb_head) == 0) {
+ if (napi_get_frags(napi)->devmem)
+ rx->rx_devmem_pkt++;
napi_gro_frags(napi);
- else
- napi_gro_receive(napi, rx->skb_head);
+ } else {
+ if (rx->ctx.skb_head->devmem)
+ rx->rx_devmem_pkt++;
+ napi_gro_receive(napi, rx->ctx.skb_head);
+ }
return 0;
}
@@ -690,12 +1046,14 @@
err = gve_rx_dqo(napi, rx, compl_desc, rx->q_num);
if (err < 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
if (err == -ENOMEM)
rx->rx_skb_alloc_fail++;
else if (err == -EINVAL)
rx->rx_desc_err_dropped_pkt++;
+ else if (err == -EFAULT)
+ rx->rx_hsplit_err_dropped_pkt++;
u64_stats_update_end(&rx->statss);
}
@@ -717,23 +1075,23 @@
/* Free running counter of completed descriptors */
rx->cnt++;
- if (!rx->skb_head)
+ if (!rx->ctx.skb_head)
continue;
if (!compl_desc->end_of_packet)
continue;
work_done++;
- pkt_bytes = rx->skb_head->len;
+ pkt_bytes = rx->ctx.skb_head->len;
/* The ethernet header (first ETH_HLEN bytes) is snipped off
* by eth_type_trans.
*/
- if (skb_headlen(rx->skb_head))
+ if (skb_headlen(rx->ctx.skb_head))
pkt_bytes += ETH_HLEN;
/* gve_rx_complete_skb() will consume skb if successful */
if (gve_rx_complete_skb(rx, napi, compl_desc, feat) != 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
rx->rx_desc_err_dropped_pkt++;
u64_stats_update_end(&rx->statss);
@@ -741,8 +1099,8 @@
}
bytes += pkt_bytes;
- rx->skb_head = NULL;
- rx->skb_tail = NULL;
+ rx->ctx.skb_head = NULL;
+ rx->ctx.skb_tail = NULL;
}
gve_rx_post_buffers_dqo(rx);
@@ -754,3 +1112,39 @@
return work_done;
}
+
+int gve_rx_handle_hdr_resources_dqo(struct gve_priv *priv,
+ bool enable_hdr_split)
+{
+ int err = 0;
+ int i;
+
+ if (enable_hdr_split) {
+ err = gve_rx_alloc_hdr_buf_pool(priv);
+ if (err)
+ goto err;
+
+ for (i = 0; i < priv->rx_cfg.num_queues; i++) {
+ err = gve_rx_alloc_hdr_bufs(priv, i);
+ if (err)
+ goto free_buf_pool;
+ }
+ } else {
+ for (i = 0; i < priv->rx_cfg.num_queues; i++)
+ gve_rx_free_hdr_bufs(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
+ }
+
+ return 0;
+
+free_buf_pool:
+ for (i--; i >= 0; i--)
+ gve_rx_free_hdr_bufs(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
+err:
+ return err;
+}
diff --git a/drivers/net/ethernet/google/gve/gve_tx.c b/drivers/net/ethernet/google/gve/gve_tx.c
index 43e7b74..5e11b82 100644
--- a/drivers/net/ethernet/google/gve/gve_tx.c
+++ b/drivers/net/ethernet/google/gve/gve_tx.c
@@ -144,7 +144,7 @@
gve_tx_remove_from_block(priv, idx);
slots = tx->mask + 1;
- gve_clean_tx_done(priv, tx, tx->req, false);
+ gve_clean_tx_done(priv, tx, priv->tx_desc_cnt, false);
netdev_tx_reset_queue(tx->netdev_txq);
dma_free_coherent(hdev, sizeof(*tx->q_resources),
@@ -176,6 +176,7 @@
/* Make sure everything is zeroed to start */
memset(tx, 0, sizeof(*tx));
+ spin_lock_init(&tx->clean_lock);
tx->q_num = idx;
tx->mask = slots - 1;
@@ -295,11 +296,14 @@
return bytes;
}
-/* The most descriptors we could need is MAX_SKB_FRAGS + 3 : 1 for each skb frag,
- * +1 for the skb linear portion, +1 for when tcp hdr needs to be in separate descriptor,
- * and +1 if the payload wraps to the beginning of the FIFO.
+/* The most descriptors we could need is MAX_SKB_FRAGS + 4 :
+ * 1 for each skb frag
+ * 1 for the skb linear portion
+ * 1 for when tcp hdr needs to be in separate descriptor
+ * 1 if the payload wraps to the beginning of the FIFO
+ * 1 for metadata descriptor
*/
-#define MAX_TX_DESC_NEEDED (MAX_SKB_FRAGS + 3)
+#define MAX_TX_DESC_NEEDED (MAX_SKB_FRAGS + 4)
static void gve_tx_unmap_buf(struct device *dev, struct gve_tx_buffer_state *info)
{
if (info->skb) {
@@ -328,10 +332,16 @@
return (gve_tx_avail(tx) >= MAX_TX_DESC_NEEDED && can_alloc);
}
+static_assert(NAPI_POLL_WEIGHT >= MAX_TX_DESC_NEEDED);
+
/* Stops the queue if the skb cannot be transmitted. */
-static int gve_maybe_stop_tx(struct gve_tx_ring *tx, struct sk_buff *skb)
+static int gve_maybe_stop_tx(struct gve_priv *priv, struct gve_tx_ring *tx,
+ struct sk_buff *skb)
{
int bytes_required = 0;
+ u32 nic_done;
+ u32 to_do;
+ int ret;
if (!tx->raw_addressing)
bytes_required = gve_skb_fifo_bytes_required(tx, skb);
@@ -339,29 +349,28 @@
if (likely(gve_can_tx(tx, bytes_required)))
return 0;
- /* No space, so stop the queue */
- tx->stop_queue++;
- netif_tx_stop_queue(tx->netdev_txq);
- smp_mb(); /* sync with restarting queue in gve_clean_tx_done() */
+ ret = -EBUSY;
+ spin_lock(&tx->clean_lock);
+ nic_done = gve_tx_load_event_counter(priv, tx);
+ to_do = nic_done - tx->done;
- /* Now check for resources again, in case gve_clean_tx_done() freed
- * resources after we checked and we stopped the queue after
- * gve_clean_tx_done() checked.
- *
- * gve_maybe_stop_tx() gve_clean_tx_done()
- * nsegs/can_alloc test failed
- * gve_tx_free_fifo()
- * if (tx queue stopped)
- * netif_tx_queue_wake()
- * netif_tx_stop_queue()
- * Need to check again for space here!
- */
- if (likely(!gve_can_tx(tx, bytes_required)))
- return -EBUSY;
+ /* Only try to clean if there is hope for TX */
+ if (to_do + gve_tx_avail(tx) >= MAX_TX_DESC_NEEDED) {
+ if (to_do > 0) {
+ to_do = min_t(u32, to_do, NAPI_POLL_WEIGHT);
+ gve_clean_tx_done(priv, tx, to_do, false);
+ }
+ if (likely(gve_can_tx(tx, bytes_required)))
+ ret = 0;
+ }
+ if (ret) {
+ /* No space, so stop the queue */
+ tx->stop_queue++;
+ netif_tx_stop_queue(tx->netdev_txq);
+ }
+ spin_unlock(&tx->clean_lock);
- netif_tx_start_queue(tx->netdev_txq);
- tx->wake_queue++;
- return 0;
+ return ret;
}
static void gve_tx_fill_pkt_desc(union gve_tx_desc *pkt_desc,
@@ -389,6 +398,19 @@
pkt_desc->pkt.seg_addr = cpu_to_be64(addr);
}
+static void gve_tx_fill_mtd_desc(union gve_tx_desc *mtd_desc,
+ struct sk_buff *skb)
+{
+ BUILD_BUG_ON(sizeof(mtd_desc->mtd) != sizeof(mtd_desc->pkt));
+
+ mtd_desc->mtd.type_flags = GVE_TXD_MTD | GVE_MTD_SUBTYPE_PATH;
+ mtd_desc->mtd.path_state = GVE_MTD_PATH_STATE_DEFAULT |
+ GVE_MTD_PATH_HASH_L4;
+ mtd_desc->mtd.path_hash = cpu_to_be32(skb->hash);
+ mtd_desc->mtd.reserved0 = 0;
+ mtd_desc->mtd.reserved1 = 0;
+}
+
static void gve_tx_fill_seg_desc(union gve_tx_desc *seg_desc,
struct sk_buff *skb, bool is_gso,
u16 len, u64 addr)
@@ -420,6 +442,7 @@
int pad_bytes, hlen, hdr_nfrags, payload_nfrags, l4_hdr_offset;
union gve_tx_desc *pkt_desc, *seg_desc;
struct gve_tx_buffer_state *info;
+ int mtd_desc_nr = !!skb->l4_hash;
bool is_gso = skb_is_gso(skb);
u32 idx = tx->req & tx->mask;
int payload_iov = 2;
@@ -449,7 +472,7 @@
&info->iov[payload_iov]);
gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
- 1 + payload_nfrags, hlen,
+ 1 + mtd_desc_nr + payload_nfrags, hlen,
info->iov[hdr_nfrags - 1].iov_offset);
skb_copy_bits(skb, 0,
@@ -460,8 +483,13 @@
info->iov[hdr_nfrags - 1].iov_len);
copy_offset = hlen;
+ if (mtd_desc_nr) {
+ next_idx = (tx->req + 1) & tx->mask;
+ gve_tx_fill_mtd_desc(&tx->desc[next_idx], skb);
+ }
+
for (i = payload_iov; i < payload_nfrags + payload_iov; i++) {
- next_idx = (tx->req + 1 + i - payload_iov) & tx->mask;
+ next_idx = (tx->req + 1 + mtd_desc_nr + i - payload_iov) & tx->mask;
seg_desc = &tx->desc[next_idx];
gve_tx_fill_seg_desc(seg_desc, skb, is_gso,
@@ -477,16 +505,17 @@
copy_offset += info->iov[i].iov_len;
}
- return 1 + payload_nfrags;
+ return 1 + mtd_desc_nr + payload_nfrags;
}
static int gve_tx_add_skb_no_copy(struct gve_priv *priv, struct gve_tx_ring *tx,
struct sk_buff *skb)
{
const struct skb_shared_info *shinfo = skb_shinfo(skb);
- int hlen, payload_nfrags, l4_hdr_offset;
- union gve_tx_desc *pkt_desc, *seg_desc;
+ int hlen, num_descriptors, l4_hdr_offset;
+ union gve_tx_desc *pkt_desc, *mtd_desc, *seg_desc;
struct gve_tx_buffer_state *info;
+ int mtd_desc_nr = !!skb->l4_hash;
bool is_gso = skb_is_gso(skb);
u32 idx = tx->req & tx->mask;
u64 addr;
@@ -515,23 +544,30 @@
dma_unmap_len_set(info, len, len);
dma_unmap_addr_set(info, dma, addr);
- payload_nfrags = shinfo->nr_frags;
+ num_descriptors = 1 + shinfo->nr_frags;
+ if (hlen < len)
+ num_descriptors++;
+ if (mtd_desc_nr)
+ num_descriptors++;
+
+ gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
+ num_descriptors, hlen, addr);
+
+ if (mtd_desc_nr) {
+ idx = (idx + 1) & tx->mask;
+ mtd_desc = &tx->desc[idx];
+ gve_tx_fill_mtd_desc(mtd_desc, skb);
+ }
+
if (hlen < len) {
/* For gso the rest of the linear portion of the skb needs to
* be in its own descriptor.
*/
- payload_nfrags++;
- gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
- 1 + payload_nfrags, hlen, addr);
-
len -= hlen;
addr += hlen;
- idx = (tx->req + 1) & tx->mask;
+ idx = (idx + 1) & tx->mask;
seg_desc = &tx->desc[idx];
gve_tx_fill_seg_desc(seg_desc, skb, is_gso, len, addr);
- } else {
- gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
- 1 + payload_nfrags, hlen, addr);
}
for (i = 0; i < shinfo->nr_frags; i++) {
@@ -552,11 +588,14 @@
gve_tx_fill_seg_desc(seg_desc, skb, is_gso, len, addr);
}
- return 1 + payload_nfrags;
+ return num_descriptors;
unmap_drop:
- i += (payload_nfrags == shinfo->nr_frags ? 1 : 2);
+ i += num_descriptors - shinfo->nr_frags;
while (i--) {
+ /* Skip metadata descriptor, if set */
+ if (i == 1 && mtd_desc_nr == 1)
+ continue;
idx--;
gve_tx_unmap_buf(tx->dev, &tx->info[idx & tx->mask]);
}
@@ -574,7 +613,7 @@
WARN(skb_get_queue_mapping(skb) >= priv->tx_cfg.num_queues,
"skb queue index out of range");
tx = &priv->tx[skb_get_queue_mapping(skb)];
- if (unlikely(gve_maybe_stop_tx(tx, skb))) {
+ if (unlikely(gve_maybe_stop_tx(priv, tx, skb))) {
/* We need to ring the txq doorbell -- we have stopped the Tx
* queue for want of resources, but prior calls to gve_tx()
* may have added descriptors without ringing the doorbell.
@@ -670,19 +709,19 @@
return pkts;
}
-__be32 gve_tx_load_event_counter(struct gve_priv *priv,
- struct gve_tx_ring *tx)
+u32 gve_tx_load_event_counter(struct gve_priv *priv,
+ struct gve_tx_ring *tx)
{
- u32 counter_index = be32_to_cpu((tx->q_resources->counter_index));
+ u32 counter_index = be32_to_cpu(tx->q_resources->counter_index);
+ __be32 counter = READ_ONCE(priv->counter_array[counter_index]);
- return READ_ONCE(priv->counter_array[counter_index]);
+ return be32_to_cpu(counter);
}
bool gve_tx_poll(struct gve_notify_block *block, int budget)
{
struct gve_priv *priv = block->priv;
struct gve_tx_ring *tx = block->tx;
- bool repoll = false;
u32 nic_done;
u32 to_do;
@@ -690,17 +729,23 @@
if (budget == 0)
budget = INT_MAX;
+ /* In TX path, it may try to clean completed pkts in order to xmit,
+ * to avoid cleaning conflict, use spin_lock(), it yields better
+ * concurrency between xmit/clean than netif's lock.
+ */
+ spin_lock(&tx->clean_lock);
/* Find out how much work there is to be done */
- tx->last_nic_done = gve_tx_load_event_counter(priv, tx);
- nic_done = be32_to_cpu(tx->last_nic_done);
- if (budget > 0) {
- /* Do as much work as we have that the budget will
- * allow
- */
- to_do = min_t(u32, (nic_done - tx->done), budget);
- gve_clean_tx_done(priv, tx, to_do, true);
- }
+ nic_done = gve_tx_load_event_counter(priv, tx);
+ to_do = min_t(u32, (nic_done - tx->done), budget);
+ gve_clean_tx_done(priv, tx, to_do, true);
+ spin_unlock(&tx->clean_lock);
/* If we still have work we want to repoll */
- repoll |= (nic_done != tx->done);
- return repoll;
+ return nic_done != tx->done;
+}
+
+bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx)
+{
+ u32 nic_done = gve_tx_load_event_counter(priv, tx);
+
+ return nic_done != tx->done;
}
diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
index 94e3b74..0d06354 100644
--- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
@@ -12,6 +12,89 @@
#include <linux/slab.h>
#include <linux/skbuff.h>
+/* Returns true if tx_bufs are available. */
+static bool gve_has_free_tx_qpl_bufs(struct gve_tx_ring *tx, int count)
+{
+ int num_avail;
+
+ if (!tx->dqo.qpl)
+ return true;
+
+ num_avail = tx->dqo.num_tx_qpl_bufs -
+ (tx->dqo_tx.alloc_tx_qpl_buf_cnt -
+ tx->dqo_tx.free_tx_qpl_buf_cnt);
+
+ if (count <= num_avail)
+ return true;
+
+ /* Update cached value from dqo_compl. */
+ tx->dqo_tx.free_tx_qpl_buf_cnt =
+ atomic_read_acquire(&tx->dqo_compl.free_tx_qpl_buf_cnt);
+
+ num_avail = tx->dqo.num_tx_qpl_bufs -
+ (tx->dqo_tx.alloc_tx_qpl_buf_cnt -
+ tx->dqo_tx.free_tx_qpl_buf_cnt);
+
+ return count <= num_avail;
+}
+
+static s16
+gve_alloc_tx_qpl_buf(struct gve_tx_ring *tx)
+{
+ s16 index;
+
+ index = tx->dqo_tx.free_tx_qpl_buf_head;
+
+ /* No TX buffers available, try to steal the list from the
+ * completion handler.
+ */
+ if (unlikely(index == -1)) {
+ tx->dqo_tx.free_tx_qpl_buf_head =
+ atomic_xchg(&tx->dqo_compl.free_tx_qpl_buf_head, -1);
+ index = tx->dqo_tx.free_tx_qpl_buf_head;
+
+ if (unlikely(index == -1))
+ return index;
+ }
+
+ /* Remove TX buf from free list */
+ tx->dqo_tx.free_tx_qpl_buf_head = tx->dqo.tx_qpl_buf_next[index];
+
+ return index;
+}
+
+static void
+gve_free_tx_qpl_bufs(struct gve_tx_ring *tx,
+ struct gve_tx_pending_packet_dqo *pkt)
+{
+ s16 index;
+ int i;
+
+ if (!pkt->num_bufs)
+ return;
+
+ index = pkt->tx_qpl_buf_ids[0];
+ /* Create a linked list of buffers to be added to the free list */
+ for (i = 1; i < pkt->num_bufs; i++) {
+ tx->dqo.tx_qpl_buf_next[index] = pkt->tx_qpl_buf_ids[i];
+ index = pkt->tx_qpl_buf_ids[i];
+ }
+
+ while (true) {
+ s16 old_head = atomic_read_acquire(&tx->dqo_compl.free_tx_qpl_buf_head);
+
+ tx->dqo.tx_qpl_buf_next[index] = old_head;
+ if (atomic_cmpxchg(&tx->dqo_compl.free_tx_qpl_buf_head,
+ old_head,
+ pkt->tx_qpl_buf_ids[0]) == old_head) {
+ break;
+ }
+ }
+
+ atomic_add(pkt->num_bufs, &tx->dqo_compl.free_tx_qpl_buf_cnt);
+ pkt->num_bufs = 0;
+}
+
/* Returns true if a gve_tx_pending_packet_dqo object is available. */
static bool gve_has_pending_packet(struct gve_tx_ring *tx)
{
@@ -135,9 +218,40 @@
kvfree(tx->dqo.pending_packets);
tx->dqo.pending_packets = NULL;
+ kvfree(tx->dqo.tx_qpl_buf_next);
+ tx->dqo.tx_qpl_buf_next = NULL;
+
+ if (tx->dqo.qpl) {
+ gve_unassign_qpl(priv, tx->dqo.qpl->id);
+ tx->dqo.qpl = NULL;
+ }
+
netif_dbg(priv, drv, priv->dev, "freed tx queue %d\n", idx);
}
+static int gve_tx_qpl_buf_init(struct gve_tx_ring *tx)
+{
+ int num_tx_qpl_bufs = GVE_TX_BUFS_PER_PAGE_DQO *
+ tx->dqo.qpl->num_entries;
+ int i;
+
+ tx->dqo.tx_qpl_buf_next = kvcalloc(num_tx_qpl_bufs,
+ sizeof(tx->dqo.tx_qpl_buf_next[0]),
+ GFP_KERNEL);
+ if (!tx->dqo.tx_qpl_buf_next)
+ return -ENOMEM;
+
+ tx->dqo.num_tx_qpl_bufs = num_tx_qpl_bufs;
+
+ /* Generate free TX buf list */
+ for (i = 0; i < num_tx_qpl_bufs - 1; i++)
+ tx->dqo.tx_qpl_buf_next[i] = i + 1;
+ tx->dqo.tx_qpl_buf_next[num_tx_qpl_bufs - 1] = -1;
+
+ atomic_set_release(&tx->dqo_compl.free_tx_qpl_buf_head, -1);
+ return 0;
+}
+
static int gve_tx_alloc_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_tx_ring *tx = &priv->tx[idx];
@@ -154,7 +268,9 @@
/* Queue sizes must be a power of 2 */
tx->mask = priv->tx_desc_cnt - 1;
- tx->dqo.complq_mask = priv->options_dqo_rda.tx_comp_ring_entries - 1;
+ tx->dqo.complq_mask = priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ priv->options_dqo_rda.tx_comp_ring_entries - 1 :
+ tx->mask;
/* The max number of pending packets determines the maximum number of
* descriptors which maybe written to the completion queue.
@@ -210,6 +326,15 @@
if (!tx->q_resources)
goto err;
+ if (gve_is_qpl(priv)) {
+ tx->dqo.qpl = gve_assign_tx_qpl(priv);
+ if (!tx->dqo.qpl)
+ goto err;
+
+ if (gve_tx_qpl_buf_init(tx))
+ goto err;
+ }
+
gve_tx_add_to_block(priv, idx);
return 0;
@@ -266,20 +391,27 @@
return tx->mask - num_used;
}
+static bool gve_has_avail_slots_tx_dqo(struct gve_tx_ring *tx,
+ int desc_count, int buf_count)
+{
+ return gve_has_pending_packet(tx) &&
+ num_avail_tx_slots(tx) >= desc_count &&
+ gve_has_free_tx_qpl_bufs(tx, buf_count);
+}
+
/* Stops the queue if available descriptors is less than 'count'.
* Return: 0 if stop is not required.
*/
-static int gve_maybe_stop_tx_dqo(struct gve_tx_ring *tx, int count)
+static int gve_maybe_stop_tx_dqo(struct gve_tx_ring *tx,
+ int desc_count, int buf_count)
{
- if (likely(gve_has_pending_packet(tx) &&
- num_avail_tx_slots(tx) >= count))
+ if (likely(gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return 0;
/* Update cached TX head pointer */
tx->dqo_tx.head = atomic_read_acquire(&tx->dqo_compl.hw_tx_head);
- if (likely(gve_has_pending_packet(tx) &&
- num_avail_tx_slots(tx) >= count))
+ if (likely(gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return 0;
/* No space, so stop the queue */
@@ -294,8 +426,7 @@
*/
tx->dqo_tx.head = atomic_read_acquire(&tx->dqo_compl.hw_tx_head);
- if (likely(!gve_has_pending_packet(tx) ||
- num_avail_tx_slots(tx) < count))
+ if (likely(!gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return -EBUSY;
netif_tx_start_queue(tx->netdev_txq);
@@ -450,20 +581,159 @@
* gve_has_pending_packet(tx) returns true.
*/
static int gve_tx_add_skb_no_copy_dqo(struct gve_tx_ring *tx,
- struct sk_buff *skb)
+ struct sk_buff *skb,
+ struct gve_tx_pending_packet_dqo *pkt,
+ s16 completion_tag,
+ u32 *desc_idx,
+ bool is_gso)
+{
+ const struct skb_shared_info *shinfo = skb_shinfo(skb);
+ int i;
+
+ /* Note: HW requires that the size of a non-TSO packet be within the
+ * range of [17, 9728].
+ *
+ * We don't double check because
+ * - We limited `netdev->min_mtu` to ETH_MIN_MTU.
+ * - Hypervisor won't allow MTU larger than 9216.
+ */
+
+ pkt->num_bufs = 0;
+ /* Map the linear portion of skb */
+ {
+ u32 len = skb_headlen(skb);
+ dma_addr_t addr;
+
+ addr = dma_map_single(tx->dev, skb->data, len, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dev, addr)))
+ goto err;
+
+ dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
+ dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
+ ++pkt->num_bufs;
+
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb, len, addr,
+ completion_tag,
+ /*eop=*/shinfo->nr_frags == 0, is_gso);
+ }
+
+ for (i = 0; i < shinfo->nr_frags; i++) {
+ const skb_frag_t *frag = &shinfo->frags[i];
+ bool is_eop = i == (shinfo->nr_frags - 1);
+ u32 len = skb_frag_size(frag);
+ dma_addr_t addr;
+
+ addr = skb_devmem_frag_dma_map(tx->dev, skb, frag, 0, len,
+ DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dev, addr)))
+ goto err;
+
+ dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
+ dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
+ ++pkt->num_bufs;
+
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb, len, addr,
+ completion_tag, is_eop, is_gso);
+ }
+
+ return 0;
+err:
+ for (i = 0; i < pkt->num_bufs; i++) {
+ if (i == 0) {
+ dma_unmap_single(tx->dev,
+ dma_unmap_addr(pkt, dma[i]),
+ dma_unmap_len(pkt, len[i]),
+ DMA_TO_DEVICE);
+ } else {
+ dma_unmap_page(tx->dev,
+ dma_unmap_addr(pkt, dma[i]),
+ dma_unmap_len(pkt, len[i]),
+ DMA_TO_DEVICE);
+ }
+ }
+ pkt->num_bufs = 0;
+ return -1;
+}
+
+/* Tx buffer i corresponds to
+ * qpl_page_id = i / GVE_TX_BUFS_PER_PAGE_DQO
+ * qpl_page_offset = (i % GVE_TX_BUFS_PER_PAGE_DQO) * GVE_TX_BUF_SIZE_DQO
+ */
+static void gve_tx_buf_get_addr(struct gve_tx_ring *tx,
+ s16 index,
+ void **va, dma_addr_t *dma_addr)
+{
+ int page_id = index >> (PAGE_SHIFT - GVE_TX_BUF_SHIFT_DQO);
+ int offset = (index & (GVE_TX_BUFS_PER_PAGE_DQO - 1)) << GVE_TX_BUF_SHIFT_DQO;
+
+ *va = page_address(tx->dqo.qpl->pages[page_id]) + offset;
+ *dma_addr = tx->dqo.qpl->page_buses[page_id] + offset;
+}
+
+static int gve_tx_add_skb_copy_dqo(struct gve_tx_ring *tx,
+ struct sk_buff *skb,
+ struct gve_tx_pending_packet_dqo *pkt,
+ s16 completion_tag,
+ u32 *desc_idx,
+ bool is_gso)
+{
+ u32 copy_offset = 0;
+ dma_addr_t dma_addr;
+ u32 copy_len;
+ s16 index;
+ void *va;
+
+ /* Break the packet into buffer size chunks */
+ pkt->num_bufs = 0;
+ while (copy_offset < skb->len) {
+ index = gve_alloc_tx_qpl_buf(tx);
+ if (unlikely(index == -1))
+ goto err;
+
+ gve_tx_buf_get_addr(tx, index, &va, &dma_addr);
+ copy_len = min_t(u32, GVE_TX_BUF_SIZE_DQO,
+ skb->len - copy_offset);
+ skb_copy_bits(skb, copy_offset, va, copy_len);
+
+ copy_offset += copy_len;
+ dma_sync_single_for_device(tx->dev, dma_addr,
+ copy_len, DMA_TO_DEVICE);
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb,
+ copy_len,
+ dma_addr,
+ completion_tag,
+ copy_offset == skb->len,
+ is_gso);
+
+ pkt->tx_qpl_buf_ids[pkt->num_bufs] = index;
+ ++tx->dqo_tx.alloc_tx_qpl_buf_cnt;
+ ++pkt->num_bufs;
+ }
+
+ return 0;
+err:
+ /* Should not be here if gve_has_free_tx_qpl_bufs() check is correct */
+ gve_free_tx_qpl_bufs(tx, pkt);
+ return -ENOMEM;
+}
+
+/* Returns 0 on success, or < 0 on error.
+ *
+ * Before this function is called, the caller must ensure
+ * gve_has_pending_packet(tx) returns true.
+ */
+static int gve_tx_add_skb_dqo(struct gve_tx_ring *tx,
+ struct sk_buff *skb)
{
const bool is_gso = skb_is_gso(skb);
struct skb_shared_info *shinfo;
u32 desc_idx = tx->dqo_tx.tail;
-
struct gve_tx_pending_packet_dqo *pkt;
struct gve_tx_metadata_dqo metadata;
s16 completion_tag;
- int i;
pkt = gve_alloc_pending_packet(tx);
pkt->skb = skb;
- pkt->num_bufs = 0;
completion_tag = pkt - tx->dqo.pending_packets;
gve_extract_tx_metadata_dqo(skb, &metadata);
@@ -484,49 +754,19 @@
&metadata);
desc_idx = (desc_idx + 1) & tx->mask;
- /* Note: HW requires that the size of a non-TSO packet be within the
- * range of [17, 9728].
- *
- * We don't double check because
- * - We limited `netdev->min_mtu` to ETH_MIN_MTU.
- * - Hypervisor won't allow MTU larger than 9216.
- */
-
- /* Map the linear portion of skb */
- {
- u32 len = skb_headlen(skb);
- dma_addr_t addr;
-
- addr = dma_map_single(tx->dev, skb->data, len, DMA_TO_DEVICE);
- if (unlikely(dma_mapping_error(tx->dev, addr)))
+ if (tx->dqo.qpl) {
+ if (gve_tx_add_skb_copy_dqo(tx, skb, pkt,
+ completion_tag,
+ &desc_idx, is_gso))
goto err;
-
- dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
- dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
- ++pkt->num_bufs;
-
- gve_tx_fill_pkt_desc_dqo(tx, &desc_idx, skb, len, addr,
- completion_tag,
- /*eop=*/shinfo->nr_frags == 0, is_gso);
+ } else {
+ if (gve_tx_add_skb_no_copy_dqo(tx, skb, pkt,
+ completion_tag,
+ &desc_idx, is_gso))
+ goto err;
}
- for (i = 0; i < shinfo->nr_frags; i++) {
- const skb_frag_t *frag = &shinfo->frags[i];
- bool is_eop = i == (shinfo->nr_frags - 1);
- u32 len = skb_frag_size(frag);
- dma_addr_t addr;
-
- addr = skb_frag_dma_map(tx->dev, frag, 0, len, DMA_TO_DEVICE);
- if (unlikely(dma_mapping_error(tx->dev, addr)))
- goto err;
-
- dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
- dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
- ++pkt->num_bufs;
-
- gve_tx_fill_pkt_desc_dqo(tx, &desc_idx, skb, len, addr,
- completion_tag, is_eop, is_gso);
- }
+ tx->dqo_tx.posted_packet_desc_cnt += pkt->num_bufs;
/* Commit the changes to our state */
tx->dqo_tx.tail = desc_idx;
@@ -549,22 +789,7 @@
return 0;
err:
- for (i = 0; i < pkt->num_bufs; i++) {
- if (i == 0) {
- dma_unmap_single(tx->dev,
- dma_unmap_addr(pkt, dma[i]),
- dma_unmap_len(pkt, len[i]),
- DMA_TO_DEVICE);
- } else {
- dma_unmap_page(tx->dev,
- dma_unmap_addr(pkt, dma[i]),
- dma_unmap_len(pkt, len[i]),
- DMA_TO_DEVICE);
- }
- }
-
pkt->skb = NULL;
- pkt->num_bufs = 0;
gve_free_pending_packet(tx, pkt);
return -1;
@@ -606,27 +831,57 @@
const struct skb_shared_info *shinfo = skb_shinfo(skb);
const int gso_size = shinfo->gso_size;
int cur_seg_num_bufs;
+ int last_frag_size;
int cur_seg_size;
int i;
cur_seg_size = skb_headlen(skb) - header_len;
+ last_frag_size = skb_headlen(skb);
cur_seg_num_bufs = cur_seg_size > 0;
for (i = 0; i < shinfo->nr_frags; i++) {
if (cur_seg_size >= gso_size) {
cur_seg_size %= gso_size;
cur_seg_num_bufs = cur_seg_size > 0;
+
+ /* If the last buffer is split in the middle of a TSO
+ * segment, then it will count as two descriptors.
+ */
+ if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
+ int last_frag_remain = last_frag_size %
+ GVE_TX_MAX_BUF_SIZE_DQO;
+
+ /* If the last frag was evenly divisible by
+ * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
+ * split in the current segment.
+ */
+ if (last_frag_remain &&
+ cur_seg_size > last_frag_remain) {
+ cur_seg_num_bufs++;
+ }
+ }
}
if (unlikely(++cur_seg_num_bufs > max_bufs_per_seg))
return false;
- cur_seg_size += skb_frag_size(&shinfo->frags[i]);
+ last_frag_size = skb_frag_size(&shinfo->frags[i]);
+ cur_seg_size += last_frag_size;
}
return true;
}
+netdev_features_t gve_features_check_dqo(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ if (skb_is_gso(skb) && !gve_can_send_tso(skb))
+ return features & ~NETIF_F_GSO_MASK;
+
+ return features;
+}
+
/* Attempt to transmit specified SKB.
*
* Returns 0 if the SKB was transmitted or dropped.
@@ -638,37 +893,49 @@
int num_buffer_descs;
int total_num_descs;
- if (skb_is_gso(skb)) {
- /* If TSO doesn't meet HW requirements, attempt to linearize the
- * packet.
+ if (tx->dqo.qpl) {
+ /* We do not need to verify the number of buffers used per
+ * packet or per segment in case of TSO as with 2K size buffers
+ * none of the TX packet rules would be violated.
+ *
+ * gve_can_send_tso() checks that each TCP segment of gso_size is
+ * not distributed over more than 9 SKB frags..
*/
- if (unlikely(!gve_can_send_tso(skb) &&
- skb_linearize(skb) < 0)) {
- net_err_ratelimited("%s: Failed to transmit TSO packet\n",
- priv->dev->name);
- goto drop;
- }
-
- num_buffer_descs = gve_num_buffer_descs_needed(skb);
+ num_buffer_descs = DIV_ROUND_UP(skb->len, GVE_TX_BUF_SIZE_DQO);
} else {
- num_buffer_descs = gve_num_buffer_descs_needed(skb);
-
- if (unlikely(num_buffer_descs > GVE_TX_MAX_DATA_DESCS)) {
- if (unlikely(skb_linearize(skb) < 0))
+ if (skb_is_gso(skb)) {
+ /* If TSO doesn't meet HW requirements, attempt to linearize the
+ * packet.
+ */
+ if (unlikely(!gve_can_send_tso(skb) &&
+ skb_linearize(skb) < 0)) {
+ net_err_ratelimited("%s: Failed to transmit TSO packet\n",
+ priv->dev->name);
goto drop;
+ }
- num_buffer_descs = 1;
+ num_buffer_descs = gve_num_buffer_descs_needed(skb);
+ } else {
+ num_buffer_descs = gve_num_buffer_descs_needed(skb);
+
+ if (unlikely(num_buffer_descs > GVE_TX_MAX_DATA_DESCS)) {
+ if (unlikely(skb_linearize(skb) < 0))
+ goto drop;
+
+ num_buffer_descs = 1;
+ }
}
}
/* Metadata + (optional TSO) + data descriptors. */
total_num_descs = 1 + skb_is_gso(skb) + num_buffer_descs;
if (unlikely(gve_maybe_stop_tx_dqo(tx, total_num_descs +
- GVE_TX_MIN_DESC_PREVENT_CACHE_OVERLAP))) {
+ GVE_TX_MIN_DESC_PREVENT_CACHE_OVERLAP,
+ num_buffer_descs))) {
return -1;
}
- if (unlikely(gve_tx_add_skb_no_copy_dqo(tx, skb) < 0))
+ if (unlikely(gve_tx_add_skb_dqo(tx, skb) < 0))
goto drop;
netdev_tx_sent_queue(tx->netdev_txq, skb->len);
@@ -798,7 +1065,7 @@
GVE_PACKET_STATE_PENDING_REINJECT_COMPL)) {
/* No outstanding miss completion but packet allocated
* implies packet receives a re-injection completion
- * without a a prior miss completion. Return without
+ * without a prior miss completion. Return without
* completing the packet.
*/
net_err_ratelimited("%s: Re-injection completion received without corresponding miss completion: %d\n",
@@ -816,7 +1083,11 @@
return;
}
}
- gve_unmap_packet(tx->dev, pending_packet);
+ tx->dqo_tx.completed_packet_desc_cnt += pending_packet->num_bufs;
+ if (tx->dqo.qpl)
+ gve_free_tx_qpl_bufs(tx, pending_packet);
+ else
+ gve_unmap_packet(tx->dev, pending_packet);
*bytes += pending_packet->skb->len;
(*pkts)++;
@@ -874,12 +1145,16 @@
remove_from_list(tx, &tx->dqo_compl.miss_completions,
pending_packet);
- /* Unmap buffers and free skb but do not unallocate packet i.e.
+ /* Unmap/free TX buffers and free skb but do not unallocate packet i.e.
* the completion tag is not freed to ensure that the driver
* can take appropriate action if a corresponding valid
* completion is received later.
*/
- gve_unmap_packet(tx->dev, pending_packet);
+ if (tx->dqo.qpl)
+ gve_free_tx_qpl_bufs(tx, pending_packet);
+ else
+ gve_unmap_packet(tx->dev, pending_packet);
+
/* This indicates the packet was dropped. */
dev_kfree_skb_any(pending_packet->skb);
pending_packet->skb = NULL;
@@ -956,12 +1231,18 @@
atomic_set_release(&tx->dqo_compl.hw_tx_head, tx_head);
} else if (type == GVE_COMPL_TYPE_DQO_PKT) {
u16 compl_tag = le16_to_cpu(compl_desc->completion_tag);
-
- gve_handle_packet_completion(priv, tx, !!napi,
- compl_tag,
- &pkt_compl_bytes,
- &pkt_compl_pkts,
- /*is_reinjection=*/false);
+ if (compl_tag & GVE_ALT_MISS_COMPL_BIT) {
+ compl_tag &= ~GVE_ALT_MISS_COMPL_BIT;
+ gve_handle_miss_completion(priv, tx, compl_tag,
+ &miss_compl_bytes,
+ &miss_compl_pkts);
+ } else {
+ gve_handle_packet_completion(priv, tx, !!napi,
+ compl_tag,
+ &pkt_compl_bytes,
+ &pkt_compl_pkts,
+ false);
+ }
} else if (type == GVE_COMPL_TYPE_DQO_MISS) {
u16 compl_tag = le16_to_cpu(compl_desc->completion_tag);
@@ -975,7 +1256,7 @@
compl_tag,
&reinject_compl_bytes,
&reinject_compl_pkts,
- /*is_reinjection=*/true);
+ true);
}
tx->dqo_compl.head =
@@ -992,6 +1273,9 @@
remove_miss_completions(priv, tx);
remove_timed_out_completions(priv, tx);
+ WRITE_ONCE(tx->dqo_compl.last_processed, jiffies);
+ WRITE_ONCE(tx->dqo_compl.kicked, false);
+
u64_stats_update_begin(&tx->statss);
tx->bytes_done += pkt_compl_bytes + reinject_compl_bytes;
tx->pkt_done += pkt_compl_pkts + reinject_compl_pkts;
@@ -1023,3 +1307,9 @@
compl_desc = &tx->dqo.compl_ring[tx->dqo_compl.head];
return compl_desc->generation != tx->dqo_compl.cur_gen_bit;
}
+
+bool gve_tx_work_pending_dqo(struct gve_tx_ring *tx)
+{
+ struct gve_index_list *miss_comp_list = &tx->dqo_compl.miss_completions;
+ return READ_ONCE(miss_comp_list->head) != -1;
+}
diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c
index 93f3dcb..3341193 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.c
+++ b/drivers/net/ethernet/google/gve/gve_utils.c
@@ -18,12 +18,16 @@
void gve_tx_add_to_block(struct gve_priv *priv, int queue_idx)
{
+ unsigned int active_cpus = min_t(int, priv->num_ntfy_blks / 2,
+ num_online_cpus());
int ntfy_idx = gve_tx_idx_to_ntfy(priv, queue_idx);
struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
struct gve_tx_ring *tx = &priv->tx[queue_idx];
block->tx = tx;
tx->ntfy_id = ntfy_idx;
+ netif_set_xps_queue(priv->dev, get_cpu_mask(ntfy_idx % active_cpus),
+ queue_idx);
}
void gve_rx_remove_from_block(struct gve_priv *priv, int queue_idx)
@@ -44,26 +48,30 @@
rx->ntfy_id = ntfy_idx;
}
-struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
- struct gve_rx_slot_page_info *page_info, u16 len,
- u16 pad)
+struct sk_buff *gve_rx_copy_data(struct net_device *dev, struct napi_struct *napi,
+ u8 *data, u16 len)
{
- struct sk_buff *skb = napi_alloc_skb(napi, len);
- void *va = page_info->page_address + pad +
- page_info->page_offset;
+ struct sk_buff *skb;
+ skb = napi_alloc_skb(napi, len);
if (unlikely(!skb))
return NULL;
__skb_put(skb, len);
-
- skb_copy_to_linear_data(skb, va, len);
-
+ skb_copy_to_linear_data_offset(skb, 0, data, len);
skb->protocol = eth_type_trans(skb, dev);
return skb;
}
+struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
+ struct gve_rx_slot_page_info *page_info, u16 len,
+ u16 padding)
+{
+ u8 *va = page_info->page_address + padding + page_info->page_offset;
+ return gve_rx_copy_data(dev, napi, va, len);
+}
+
void gve_dec_pagecnt_bias(struct gve_rx_slot_page_info *page_info)
{
page_info->pagecnt_bias--;
diff --git a/drivers/net/ethernet/google/gve/gve_utils.h b/drivers/net/ethernet/google/gve/gve_utils.h
index 79595940..421cc11 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.h
+++ b/drivers/net/ethernet/google/gve/gve_utils.h
@@ -17,6 +17,9 @@
void gve_rx_remove_from_block(struct gve_priv *priv, int queue_idx);
void gve_rx_add_to_block(struct gve_priv *priv, int queue_idx);
+struct sk_buff *gve_rx_copy_data(struct net_device *dev, struct napi_struct *napi,
+ u8 *data, u16 len);
+
struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
struct gve_rx_slot_page_info *page_info, u16 len,
u16 pad);
diff --git a/drivers/platform/x86/intel/uncore-frequency.c b/drivers/platform/x86/intel/uncore-frequency.c
index 3ee4c5c..a72fbc7 100644
--- a/drivers/platform/x86/intel/uncore-frequency.c
+++ b/drivers/platform/x86/intel/uncore-frequency.c
@@ -378,6 +378,7 @@
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
+ X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, NULL),
{}
};
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 8061e8e..e457e47 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -36,4 +36,7 @@
source "drivers/virt/nitro_enclaves/Kconfig"
source "drivers/virt/acrn/Kconfig"
+
+source "drivers/virt/coco/sevguest/Kconfig"
+
endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index 3e272ea..9c704a6 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -8,3 +8,4 @@
obj-$(CONFIG_NITRO_ENCLAVES) += nitro_enclaves/
obj-$(CONFIG_ACRN_HSM) += acrn/
+obj-$(CONFIG_SEV_GUEST) += coco/sevguest/
diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig
new file mode 100644
index 0000000..74ca1fe
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Kconfig
@@ -0,0 +1,14 @@
+config SEV_GUEST
+ tristate "AMD SEV Guest driver"
+ default m
+ depends on AMD_MEM_ENCRYPT
+ select CRYPTO_AEAD2
+ select CRYPTO_GCM
+ help
+ SEV-SNP firmware provides the guest a mechanism to communicate with
+ the PSP without risk from a malicious hypervisor who wishes to read,
+ alter, drop or replay the messages sent. The driver provides
+ userspace interface to communicate with the PSP to request the
+ attestation report and more.
+
+ If you choose 'M' here, this module will be called sevguest.
diff --git a/drivers/virt/coco/sevguest/Makefile b/drivers/virt/coco/sevguest/Makefile
new file mode 100644
index 0000000..b1ffb2b
--- /dev/null
+++ b/drivers/virt/coco/sevguest/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_SEV_GUEST) += sevguest.o
diff --git a/drivers/virt/coco/sevguest/sevguest.c b/drivers/virt/coco/sevguest/sevguest.c
new file mode 100644
index 0000000..15afb6c
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.c
@@ -0,0 +1,740 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD Secure Encrypted Virtualization Nested Paging (SEV-SNP) guest request interface
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ */
+
+#define pr_fmt(fmt) "SNP: GUEST: " fmt
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/mutex.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/miscdevice.h>
+#include <linux/set_memory.h>
+#include <linux/fs.h>
+#include <crypto/aead.h>
+#include <linux/scatterlist.h>
+#include <linux/psp-sev.h>
+#include <uapi/linux/sev-guest.h>
+#include <uapi/linux/psp-sev.h>
+
+#include <asm/svm.h>
+#include <asm/sev.h>
+
+#include "sevguest.h"
+
+#define DEVICE_NAME "sev-guest"
+#define AAD_LEN 48
+#define MSG_HDR_VER 1
+
+struct snp_guest_crypto {
+ struct crypto_aead *tfm;
+ u8 *iv, *authtag;
+ int iv_len, a_len;
+};
+
+struct snp_guest_dev {
+ struct device *dev;
+ struct miscdevice misc;
+
+ void *certs_data;
+ struct snp_guest_crypto *crypto;
+ struct snp_guest_msg *request, *response;
+ struct snp_secrets_page_layout *layout;
+ struct snp_req_data input;
+ u32 *os_area_msg_seqno;
+ u8 *vmpck;
+};
+
+static u32 vmpck_id;
+module_param(vmpck_id, uint, 0444);
+MODULE_PARM_DESC(vmpck_id, "The VMPCK ID to use when communicating with the PSP.");
+
+/* Mutex to serialize the shared buffer access and command handling. */
+static DEFINE_MUTEX(snp_cmd_mutex);
+
+static bool is_vmpck_empty(struct snp_guest_dev *snp_dev)
+{
+ char zero_key[VMPCK_KEY_LEN] = {0};
+
+ if (snp_dev->vmpck)
+ return !memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN);
+
+ return true;
+}
+
+static void snp_disable_vmpck(struct snp_guest_dev *snp_dev)
+{
+ memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN);
+ snp_dev->vmpck = NULL;
+}
+
+static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ u64 count;
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ /* Read the current message sequence counter from secrets pages */
+ count = *snp_dev->os_area_msg_seqno;
+
+ return count + 1;
+}
+
+/* Return a non-zero on success */
+static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ u64 count = __snp_get_msg_seqno(snp_dev);
+
+ /*
+ * The message sequence counter for the SNP guest request is a 64-bit
+ * value but the version 2 of GHCB specification defines a 32-bit storage
+ * for it. If the counter exceeds the 32-bit value then return zero.
+ * The caller should check the return value, but if the caller happens to
+ * not check the value and use it, then the firmware treats zero as an
+ * invalid number and will fail the message request.
+ */
+ if (count >= UINT_MAX) {
+ dev_err(snp_dev->dev, "request message sequence counter overflow\n");
+ return 0;
+ }
+
+ return count;
+}
+
+static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev)
+{
+ /*
+ * The counter is also incremented by the PSP, so increment it by 2
+ * and save in secrets page.
+ */
+ *snp_dev->os_area_msg_seqno += 2;
+}
+
+static inline struct snp_guest_dev *to_snp_dev(struct file *file)
+{
+ struct miscdevice *dev = file->private_data;
+
+ return container_of(dev, struct snp_guest_dev, misc);
+}
+
+static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen)
+{
+ struct snp_guest_crypto *crypto;
+
+ crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT);
+ if (!crypto)
+ return NULL;
+
+ crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0);
+ if (IS_ERR(crypto->tfm))
+ goto e_free;
+
+ if (crypto_aead_setkey(crypto->tfm, key, keylen))
+ goto e_free_crypto;
+
+ crypto->iv_len = crypto_aead_ivsize(crypto->tfm);
+ crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT);
+ if (!crypto->iv)
+ goto e_free_crypto;
+
+ if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) {
+ if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) {
+ dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN);
+ goto e_free_iv;
+ }
+ }
+
+ crypto->a_len = crypto_aead_authsize(crypto->tfm);
+ crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT);
+ if (!crypto->authtag)
+ goto e_free_auth;
+
+ return crypto;
+
+e_free_auth:
+ kfree(crypto->authtag);
+e_free_iv:
+ kfree(crypto->iv);
+e_free_crypto:
+ crypto_free_aead(crypto->tfm);
+e_free:
+ kfree(crypto);
+
+ return NULL;
+}
+
+static void deinit_crypto(struct snp_guest_crypto *crypto)
+{
+ crypto_free_aead(crypto->tfm);
+ kfree(crypto->iv);
+ kfree(crypto->authtag);
+ kfree(crypto);
+}
+
+static int enc_dec_message(struct snp_guest_crypto *crypto, struct snp_guest_msg *msg,
+ u8 *src_buf, u8 *dst_buf, size_t len, bool enc)
+{
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
+ struct scatterlist src[3], dst[3];
+ DECLARE_CRYPTO_WAIT(wait);
+ struct aead_request *req;
+ int ret;
+
+ req = aead_request_alloc(crypto->tfm, GFP_KERNEL);
+ if (!req)
+ return -ENOMEM;
+
+ /*
+ * AEAD memory operations:
+ * +------ AAD -------+------- DATA -----+---- AUTHTAG----+
+ * | msg header | plaintext | hdr->authtag |
+ * | bytes 30h - 5Fh | or | |
+ * | | cipher | |
+ * +------------------+------------------+----------------+
+ */
+ sg_init_table(src, 3);
+ sg_set_buf(&src[0], &hdr->algo, AAD_LEN);
+ sg_set_buf(&src[1], src_buf, hdr->msg_sz);
+ sg_set_buf(&src[2], hdr->authtag, crypto->a_len);
+
+ sg_init_table(dst, 3);
+ sg_set_buf(&dst[0], &hdr->algo, AAD_LEN);
+ sg_set_buf(&dst[1], dst_buf, hdr->msg_sz);
+ sg_set_buf(&dst[2], hdr->authtag, crypto->a_len);
+
+ aead_request_set_ad(req, AAD_LEN);
+ aead_request_set_tfm(req, crypto->tfm);
+ aead_request_set_callback(req, 0, crypto_req_done, &wait);
+
+ aead_request_set_crypt(req, src, dst, len, crypto->iv);
+ ret = crypto_wait_req(enc ? crypto_aead_encrypt(req) : crypto_aead_decrypt(req), &wait);
+
+ aead_request_free(req);
+ return ret;
+}
+
+static int __enc_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+ void *plaintext, size_t len)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+ memset(crypto->iv, 0, crypto->iv_len);
+ memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+ return enc_dec_message(crypto, msg, plaintext, msg->payload, len, true);
+}
+
+static int dec_payload(struct snp_guest_dev *snp_dev, struct snp_guest_msg *msg,
+ void *plaintext, size_t len)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_guest_msg_hdr *hdr = &msg->hdr;
+
+ /* Build IV with response buffer sequence number */
+ memset(crypto->iv, 0, crypto->iv_len);
+ memcpy(crypto->iv, &hdr->msg_seqno, sizeof(hdr->msg_seqno));
+
+ return enc_dec_message(crypto, msg, msg->payload, plaintext, len, false);
+}
+
+static int verify_and_dec_payload(struct snp_guest_dev *snp_dev, void *payload, u32 sz)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_guest_msg *resp = snp_dev->response;
+ struct snp_guest_msg *req = snp_dev->request;
+ struct snp_guest_msg_hdr *req_hdr = &req->hdr;
+ struct snp_guest_msg_hdr *resp_hdr = &resp->hdr;
+
+ dev_dbg(snp_dev->dev, "response [seqno %lld type %d version %d sz %d]\n",
+ resp_hdr->msg_seqno, resp_hdr->msg_type, resp_hdr->msg_version, resp_hdr->msg_sz);
+
+ /* Verify that the sequence counter is incremented by 1 */
+ if (unlikely(resp_hdr->msg_seqno != (req_hdr->msg_seqno + 1)))
+ return -EBADMSG;
+
+ /* Verify response message type and version number. */
+ if (resp_hdr->msg_type != (req_hdr->msg_type + 1) ||
+ resp_hdr->msg_version != req_hdr->msg_version)
+ return -EBADMSG;
+
+ /*
+ * If the message size is greater than our buffer length then return
+ * an error.
+ */
+ if (unlikely((resp_hdr->msg_sz + crypto->a_len) > sz))
+ return -EBADMSG;
+
+ /* Decrypt the payload */
+ return dec_payload(snp_dev, resp, payload, resp_hdr->msg_sz + crypto->a_len);
+}
+
+static bool enc_payload(struct snp_guest_dev *snp_dev, u64 seqno, int version, u8 type,
+ void *payload, size_t sz)
+{
+ struct snp_guest_msg *req = snp_dev->request;
+ struct snp_guest_msg_hdr *hdr = &req->hdr;
+
+ memset(req, 0, sizeof(*req));
+
+ hdr->algo = SNP_AEAD_AES_256_GCM;
+ hdr->hdr_version = MSG_HDR_VER;
+ hdr->hdr_sz = sizeof(*hdr);
+ hdr->msg_type = type;
+ hdr->msg_version = version;
+ hdr->msg_seqno = seqno;
+ hdr->msg_vmpck = vmpck_id;
+ hdr->msg_sz = sz;
+
+ /* Verify the sequence number is non-zero */
+ if (!hdr->msg_seqno)
+ return -ENOSR;
+
+ dev_dbg(snp_dev->dev, "request [seqno %lld type %d version %d sz %d]\n",
+ hdr->msg_seqno, hdr->msg_type, hdr->msg_version, hdr->msg_sz);
+
+ return __enc_payload(snp_dev, req, payload, sz);
+}
+
+static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver,
+ u8 type, void *req_buf, size_t req_sz, void *resp_buf,
+ u32 resp_sz, __u64 *fw_err)
+{
+ unsigned long err;
+ u64 seqno;
+ int rc;
+
+ /* Get message sequence and verify that its a non-zero */
+ seqno = snp_get_msg_seqno(snp_dev);
+ if (!seqno)
+ return -EIO;
+
+ memset(snp_dev->response, 0, sizeof(struct snp_guest_msg));
+
+ /* Encrypt the userspace provided payload */
+ rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz);
+ if (rc)
+ return rc;
+
+ /* Call firmware to process the request */
+ rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err);
+ if (fw_err)
+ *fw_err = err;
+
+ if (rc)
+ return rc;
+
+ /*
+ * The verify_and_dec_payload() will fail only if the hypervisor is
+ * actively modifying the message header or corrupting the encrypted payload.
+ * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that
+ * the key cannot be used for any communication. The key is disabled to ensure
+ * that AES-GCM does not use the same IV while encrypting the request payload.
+ */
+ rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz);
+ if (rc) {
+ dev_alert(snp_dev->dev,
+ "Detected unexpected decode failure, disabling the vmpck_id %d\n",
+ vmpck_id);
+ snp_disable_vmpck(snp_dev);
+ return rc;
+ }
+
+ /* Increment to new message sequence after payload decryption was successful. */
+ snp_inc_msg_seqno(snp_dev);
+
+ return 0;
+}
+
+static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_report_resp *resp;
+ struct snp_report_req req;
+ int rc, resp_len;
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp->data) + crypto->a_len;
+ resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!resp)
+ return -ENOMEM;
+
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+ SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data,
+ resp_len, &arg->fw_err);
+ if (rc)
+ goto e_free;
+
+ if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ rc = -EFAULT;
+
+e_free:
+ kfree(resp);
+ return rc;
+}
+
+static int get_derived_key(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_derived_key_resp resp = {0};
+ struct snp_derived_key_req req;
+ int rc, resp_len;
+ /* Response data is 64 bytes and max authsize for GCM is 16 bytes. */
+ u8 buf[64 + 16];
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp.data) + crypto->a_len;
+ if (sizeof(buf) < resp_len)
+ return -ENOMEM;
+
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version,
+ SNP_MSG_KEY_REQ, &req, sizeof(req), buf, resp_len,
+ &arg->fw_err);
+ if (rc)
+ return rc;
+
+ memcpy(resp.data, buf, sizeof(resp.data));
+ if (copy_to_user((void __user *)arg->resp_data, &resp, sizeof(resp)))
+ rc = -EFAULT;
+
+ /* The response buffer contains the sensitive data, explicitly clear it. */
+ memzero_explicit(buf, sizeof(buf));
+ memzero_explicit(&resp, sizeof(resp));
+ return rc;
+}
+
+static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg)
+{
+ struct snp_guest_crypto *crypto = snp_dev->crypto;
+ struct snp_ext_report_req req;
+ struct snp_report_resp *resp;
+ int ret, npages = 0, resp_len;
+
+ lockdep_assert_held(&snp_cmd_mutex);
+
+ if (!arg->req_data || !arg->resp_data)
+ return -EINVAL;
+
+ if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req)))
+ return -EFAULT;
+
+ /* userspace does not want certificate data */
+ if (!req.certs_len || !req.certs_address)
+ goto cmd;
+
+ if (req.certs_len > SEV_FW_BLOB_MAX_SIZE ||
+ !IS_ALIGNED(req.certs_len, PAGE_SIZE))
+ return -EINVAL;
+
+ if (!access_ok((const void __user *)req.certs_address, req.certs_len))
+ return -EFAULT;
+
+ /*
+ * Initialize the intermediate buffer with all zeros. This buffer
+ * is used in the guest request message to get the certs blob from
+ * the host. If host does not supply any certs in it, then copy
+ * zeros to indicate that certificate data was not provided.
+ */
+ memset(snp_dev->certs_data, 0, req.certs_len);
+ npages = req.certs_len >> PAGE_SHIFT;
+cmd:
+ /*
+ * The intermediate response buffer is used while decrypting the
+ * response payload. Make sure that it has enough space to cover the
+ * authtag.
+ */
+ resp_len = sizeof(resp->data) + crypto->a_len;
+ resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT);
+ if (!resp)
+ return -ENOMEM;
+
+ snp_dev->input.data_npages = npages;
+ ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
+ SNP_MSG_REPORT_REQ, &req.data,
+ sizeof(req.data), resp->data, resp_len, &arg->fw_err);
+
+ /* If certs length is invalid then copy the returned length */
+ if (arg->fw_err == SNP_GUEST_REQ_INVALID_LEN) {
+ req.certs_len = snp_dev->input.data_npages << PAGE_SHIFT;
+
+ if (copy_to_user((void __user *)arg->req_data, &req, sizeof(req)))
+ ret = -EFAULT;
+ }
+
+ if (ret)
+ goto e_free;
+
+ if (npages &&
+ copy_to_user((void __user *)req.certs_address, snp_dev->certs_data,
+ req.certs_len)) {
+ ret = -EFAULT;
+ goto e_free;
+ }
+
+ if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp)))
+ ret = -EFAULT;
+
+e_free:
+ kfree(resp);
+ return ret;
+}
+
+static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg)
+{
+ struct snp_guest_dev *snp_dev = to_snp_dev(file);
+ void __user *argp = (void __user *)arg;
+ struct snp_guest_request_ioctl input;
+ int ret = -ENOTTY;
+
+ if (copy_from_user(&input, argp, sizeof(input)))
+ return -EFAULT;
+
+ input.fw_err = 0xff;
+
+ /* Message version must be non-zero */
+ if (!input.msg_version)
+ return -EINVAL;
+
+ mutex_lock(&snp_cmd_mutex);
+
+ /* Check if the VMPCK is not empty */
+ if (is_vmpck_empty(snp_dev)) {
+ dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n");
+ mutex_unlock(&snp_cmd_mutex);
+ return -ENOTTY;
+ }
+
+ switch (ioctl) {
+ case SNP_GET_REPORT:
+ ret = get_report(snp_dev, &input);
+ break;
+ case SNP_GET_DERIVED_KEY:
+ ret = get_derived_key(snp_dev, &input);
+ break;
+ case SNP_GET_EXT_REPORT:
+ ret = get_ext_report(snp_dev, &input);
+ break;
+ default:
+ break;
+ }
+
+ mutex_unlock(&snp_cmd_mutex);
+
+ if (input.fw_err && copy_to_user(argp, &input, sizeof(input)))
+ return -EFAULT;
+
+ return ret;
+}
+
+static void free_shared_pages(void *buf, size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+ int ret;
+
+ if (!buf)
+ return;
+
+ ret = set_memory_encrypted((unsigned long)buf, npages);
+ if (ret) {
+ WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n");
+ return;
+ }
+
+ __free_pages(virt_to_page(buf), get_order(sz));
+}
+
+static void *alloc_shared_pages(size_t sz)
+{
+ unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT;
+ struct page *page;
+ int ret;
+
+ page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz));
+ if (IS_ERR(page))
+ return NULL;
+
+ ret = set_memory_decrypted((unsigned long)page_address(page), npages);
+ if (ret) {
+ pr_err("failed to mark page shared, ret=%d\n", ret);
+ __free_pages(page, get_order(sz));
+ return NULL;
+ }
+
+ return page_address(page);
+}
+
+static const struct file_operations snp_guest_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = snp_guest_ioctl,
+};
+
+static u8 *get_vmpck(int id, struct snp_secrets_page_layout *layout, u32 **seqno)
+{
+ u8 *key = NULL;
+
+ switch (id) {
+ case 0:
+ *seqno = &layout->os_area.msg_seqno_0;
+ key = layout->vmpck0;
+ break;
+ case 1:
+ *seqno = &layout->os_area.msg_seqno_1;
+ key = layout->vmpck1;
+ break;
+ case 2:
+ *seqno = &layout->os_area.msg_seqno_2;
+ key = layout->vmpck2;
+ break;
+ case 3:
+ *seqno = &layout->os_area.msg_seqno_3;
+ key = layout->vmpck3;
+ break;
+ default:
+ break;
+ }
+
+ return key;
+}
+
+static int __init snp_guest_probe(struct platform_device *pdev)
+{
+ struct snp_secrets_page_layout *layout;
+ struct snp_guest_platform_data *data;
+ struct device *dev = &pdev->dev;
+ struct snp_guest_dev *snp_dev;
+ struct miscdevice *misc;
+ int ret;
+
+ if (!dev->platform_data)
+ return -ENODEV;
+
+ data = (struct snp_guest_platform_data *)dev->platform_data;
+ layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE);
+ if (!layout)
+ return -ENODEV;
+
+ ret = -ENOMEM;
+ snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL);
+ if (!snp_dev)
+ goto e_unmap;
+
+ ret = -EINVAL;
+ snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno);
+ if (!snp_dev->vmpck) {
+ dev_err(dev, "invalid vmpck id %d\n", vmpck_id);
+ goto e_unmap;
+ }
+
+ /* Verify that VMPCK is not zero. */
+ if (is_vmpck_empty(snp_dev)) {
+ dev_err(dev, "vmpck id %d is null\n", vmpck_id);
+ goto e_unmap;
+ }
+
+ platform_set_drvdata(pdev, snp_dev);
+ snp_dev->dev = dev;
+ snp_dev->layout = layout;
+
+ /* Allocate the shared page used for the request and response message. */
+ snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!snp_dev->request)
+ goto e_unmap;
+
+ snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg));
+ if (!snp_dev->response)
+ goto e_free_request;
+
+ snp_dev->certs_data = alloc_shared_pages(SEV_FW_BLOB_MAX_SIZE);
+ if (!snp_dev->certs_data)
+ goto e_free_response;
+
+ ret = -EIO;
+ snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN);
+ if (!snp_dev->crypto)
+ goto e_free_cert_data;
+
+ misc = &snp_dev->misc;
+ misc->minor = MISC_DYNAMIC_MINOR;
+ misc->name = DEVICE_NAME;
+ misc->fops = &snp_guest_fops;
+
+ /* initial the input address for guest request */
+ snp_dev->input.req_gpa = __pa(snp_dev->request);
+ snp_dev->input.resp_gpa = __pa(snp_dev->response);
+ snp_dev->input.data_gpa = __pa(snp_dev->certs_data);
+
+ ret = misc_register(misc);
+ if (ret)
+ goto e_free_cert_data;
+
+ dev_info(dev, "Initialized SNP guest driver (using vmpck_id %d)\n", vmpck_id);
+ return 0;
+
+e_free_cert_data:
+ free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
+e_free_response:
+ free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+e_free_request:
+ free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+e_unmap:
+ iounmap(layout);
+ return ret;
+}
+
+static int __exit snp_guest_remove(struct platform_device *pdev)
+{
+ struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev);
+
+ free_shared_pages(snp_dev->certs_data, SEV_FW_BLOB_MAX_SIZE);
+ free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg));
+ free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg));
+ deinit_crypto(snp_dev->crypto);
+ misc_deregister(&snp_dev->misc);
+
+ return 0;
+}
+
+static struct platform_driver snp_guest_driver = {
+ .remove = __exit_p(snp_guest_remove),
+ .driver = {
+ .name = "snp-guest",
+ },
+};
+
+module_platform_driver_probe(snp_guest_driver, snp_guest_probe);
+
+MODULE_AUTHOR("Brijesh Singh <brijesh.singh@amd.com>");
+MODULE_LICENSE("GPL");
+MODULE_VERSION("1.0.0");
+MODULE_DESCRIPTION("AMD SNP Guest Driver");
diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h
new file mode 100644
index 0000000..d39bdd0
--- /dev/null
+++ b/drivers/virt/coco/sevguest/sevguest.h
@@ -0,0 +1,98 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV-SNP API spec is available at https://developer.amd.com/sev
+ */
+
+#ifndef __VIRT_SEVGUEST_H__
+#define __VIRT_SEVGUEST_H__
+
+#include <linux/types.h>
+
+#define MAX_AUTHTAG_LEN 32
+
+/* See SNP spec SNP_GUEST_REQUEST section for the structure */
+enum msg_type {
+ SNP_MSG_TYPE_INVALID = 0,
+ SNP_MSG_CPUID_REQ,
+ SNP_MSG_CPUID_RSP,
+ SNP_MSG_KEY_REQ,
+ SNP_MSG_KEY_RSP,
+ SNP_MSG_REPORT_REQ,
+ SNP_MSG_REPORT_RSP,
+ SNP_MSG_EXPORT_REQ,
+ SNP_MSG_EXPORT_RSP,
+ SNP_MSG_IMPORT_REQ,
+ SNP_MSG_IMPORT_RSP,
+ SNP_MSG_ABSORB_REQ,
+ SNP_MSG_ABSORB_RSP,
+ SNP_MSG_VMRK_REQ,
+ SNP_MSG_VMRK_RSP,
+
+ SNP_MSG_TYPE_MAX
+};
+
+enum aead_algo {
+ SNP_AEAD_INVALID,
+ SNP_AEAD_AES_256_GCM,
+};
+
+struct snp_guest_msg_hdr {
+ u8 authtag[MAX_AUTHTAG_LEN];
+ u64 msg_seqno;
+ u8 rsvd1[8];
+ u8 algo;
+ u8 hdr_version;
+ u16 hdr_sz;
+ u8 msg_type;
+ u8 msg_version;
+ u16 msg_sz;
+ u32 rsvd2;
+ u8 msg_vmpck;
+ u8 rsvd3[35];
+} __packed;
+
+struct snp_guest_msg {
+ struct snp_guest_msg_hdr hdr;
+ u8 payload[4000];
+} __packed;
+
+/*
+ * The secrets page contains 96-bytes of reserved field that can be used by
+ * the guest OS. The guest OS uses the area to save the message sequence
+ * number for each VMPCK.
+ *
+ * See the GHCB spec section Secret page layout for the format for this area.
+ */
+struct secrets_os_area {
+ u32 msg_seqno_0;
+ u32 msg_seqno_1;
+ u32 msg_seqno_2;
+ u32 msg_seqno_3;
+ u64 ap_jump_table_pa;
+ u8 rsvd[40];
+ u8 guest_usage[32];
+} __packed;
+
+#define VMPCK_KEY_LEN 32
+
+/* See the SNP spec version 0.9 for secrets page format */
+struct snp_secrets_page_layout {
+ u32 version;
+ u32 imien : 1,
+ rsvd1 : 31;
+ u32 fms;
+ u32 rsvd2;
+ u8 gosvw[16];
+ u8 vmpck0[VMPCK_KEY_LEN];
+ u8 vmpck1[VMPCK_KEY_LEN];
+ u8 vmpck2[VMPCK_KEY_LEN];
+ u8 vmpck3[VMPCK_KEY_LEN];
+ struct secrets_os_area os_area;
+ u8 rsvd3[3840];
+} __packed;
+
+#endif /* __VIRT_SEVGUEST_H__ */
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0f3c22c..e586f0c 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -860,8 +860,22 @@
struct virtio_balloon *vb = container_of(nb,
struct virtio_balloon, oom_nb);
unsigned long *freed = parm;
-
- *freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
+ s64 pages_to_release = VIRTIO_BALLOON_OOM_NR_PAGES;
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_MIN_BALLOON_SIZE)) {
+ /* Minimin balloon size - VIRTIO_BALLOON_F_MIN_BALLOON_SIZE */
+ u32 min_balloon_pages;
+ // get the latest dynamic_limit from config space.
+ virtio_cread_le(vb->vdev, struct virtio_balloon_config,
+ min_balloon_pages, &min_balloon_pages);
+ s64 min_pages = min_balloon_pages;
+ if (min_pages >= vb->num_pages) {
+ printk("num_pages: %d <= min_balloon_pages %u \n",
+ vb->num_pages, min_balloon_pages);
+ return NOTIFY_STOP;
+ }
+ pages_to_release = min(vb->num_pages - min_pages, pages_to_release);
+ }
+ *freed += leak_balloon(vb, pages_to_release) /
VIRTIO_BALLOON_PAGES_PER_PAGE;
update_balloon_size(vb);
@@ -1151,6 +1165,7 @@
VIRTIO_BALLOON_F_FREE_PAGE_HINT,
VIRTIO_BALLOON_F_PAGE_POISON,
VIRTIO_BALLOON_F_REPORTING,
+ VIRTIO_BALLOON_F_MIN_BALLOON_SIZE,
};
static struct virtio_driver virtio_balloon_driver = {
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index bea4c2b..c5d36e6 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -1004,6 +1004,14 @@
return rc;
}
+static int ecryptfs_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
static int ecryptfs_getattr(struct user_namespace *mnt_userns,
const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int flags)
@@ -1012,8 +1020,8 @@
struct kstat lower_stat;
int rc;
- rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
- request_mask, flags);
+ rc = ecryptfs_do_getattr(ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
if (!rc) {
fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index e1a5ec7..fcd8e9a 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -721,6 +721,8 @@
#define EXT4_IOC_GETSTATE _IOW('f', 41, __u32)
#define EXT4_IOC_GET_ES_CACHE _IOWR('f', 42, struct fiemap)
#define EXT4_IOC_CHECKPOINT _IOW('f', 43, __u32)
+#define EXT4_IOC_GETFSUUID _IOR('f', 44, struct fsuuid)
+#define EXT4_IOC_SETFSUUID _IOW('f', 44, struct fsuuid)
#define EXT4_IOC_SHUTDOWN _IOR ('X', 125, __u32)
@@ -750,6 +752,15 @@
EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT | \
EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN)
+/*
+ * Structure for EXT4_IOC_GETFSUUID/EXT4_IOC_SETFSUUID
+ */
+struct fsuuid {
+ __u32 fsu_len;
+ __u32 fsu_flags;
+ __u8 fsu_uuid[];
+};
+
#if defined(__KERNEL__) && defined(CONFIG_COMPAT)
/*
* ioctl commands in 32 bit emulation
@@ -1299,6 +1310,8 @@
/* Metadata checksum algorithm codes */
#define EXT4_CRC32C_CHKSUM 1
+#define EXT4_LABEL_MAX 16
+
/*
* Structure of the super block
*/
@@ -1348,7 +1361,7 @@
/*60*/ __le32 s_feature_incompat; /* incompatible feature set */
__le32 s_feature_ro_compat; /* readonly-compatible feature set */
/*68*/ __u8 s_uuid[16]; /* 128-bit uuid for volume */
-/*78*/ char s_volume_name[16]; /* volume name */
+/*78*/ char s_volume_name[EXT4_LABEL_MAX]; /* volume name */
/*88*/ char s_last_mounted[64] __nonstring; /* directory where last mounted */
/*C8*/ __le32 s_algorithm_usage_bitmap; /* For compression */
/*
@@ -3098,6 +3111,9 @@
struct ext4_super_block *es,
ext4_fsblk_t n_blocks_count);
extern int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count);
+extern unsigned int ext4_list_backups(struct super_block *sb,
+ unsigned int *three, unsigned int *five,
+ unsigned int *seven);
/* super.c */
extern struct buffer_head *ext4_sb_bread(struct super_block *sb,
@@ -3112,6 +3128,8 @@
extern void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block);
extern int ext4_seq_options_show(struct seq_file *seq, void *offset);
extern int ext4_calculate_overhead(struct super_block *sb);
+extern __le32 ext4_superblock_csum(struct super_block *sb,
+ struct ext4_super_block *es);
extern void ext4_superblock_csum_set(struct super_block *sb);
extern int ext4_alloc_flex_bg_array(struct super_block *sb,
ext4_group_t ngroup);
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 18002b0..84881b5 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -21,12 +21,264 @@
#include <linux/delay.h>
#include <linux/iversion.h>
#include <linux/fileattr.h>
+#include <linux/uuid.h>
#include "ext4_jbd2.h"
#include "ext4.h"
#include <linux/fsmap.h>
#include "fsmap.h"
#include <trace/events/ext4.h>
+typedef void ext4_update_sb_callback(struct ext4_super_block *es,
+ const void *arg);
+
+/*
+ * Superblock modification callback function for changing file system
+ * label
+ */
+static void ext4_sb_setlabel(struct ext4_super_block *es, const void *arg)
+{
+ /* Sanity check, this should never happen */
+ BUILD_BUG_ON(sizeof(es->s_volume_name) < EXT4_LABEL_MAX);
+
+ memcpy(es->s_volume_name, (char *)arg, EXT4_LABEL_MAX);
+}
+
+/*
+ * Superblock modification callback function for changing file system
+ * UUID.
+ */
+static void ext4_sb_setuuid(struct ext4_super_block *es, const void *arg)
+{
+ memcpy(es->s_uuid, (__u8 *)arg, UUID_SIZE);
+}
+
+static
+int ext4_update_primary_sb(struct super_block *sb, handle_t *handle,
+ ext4_update_sb_callback func,
+ const void *arg)
+{
+ int err = 0;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ struct buffer_head *bh = sbi->s_sbh;
+ struct ext4_super_block *es = sbi->s_es;
+
+ trace_ext4_update_sb(sb, bh->b_blocknr, 1);
+
+ BUFFER_TRACE(bh, "get_write_access");
+ err = ext4_journal_get_write_access(handle, sb,
+ bh,
+ EXT4_JTR_NONE);
+ if (err)
+ goto out_err;
+
+ lock_buffer(bh);
+ func(es, arg);
+ ext4_superblock_csum_set(sb);
+ unlock_buffer(bh);
+
+ if (buffer_write_io_error(bh) || !buffer_uptodate(bh)) {
+ ext4_msg(sbi->s_sb, KERN_ERR, "previous I/O error to "
+ "superblock detected");
+ clear_buffer_write_io_error(bh);
+ set_buffer_uptodate(bh);
+ }
+
+ err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ if (err)
+ goto out_err;
+ err = sync_dirty_buffer(bh);
+out_err:
+ ext4_std_error(sb, err);
+ return err;
+}
+
+/*
+ * Update one backup superblock in the group 'grp' using the callback
+ * function 'func' and argument 'arg'. If the handle is NULL the
+ * modification is not journalled.
+ *
+ * Returns: 0 when no modification was done (no superblock in the group)
+ * 1 when the modification was successful
+ * <0 on error
+ */
+static int ext4_update_backup_sb(struct super_block *sb,
+ handle_t *handle, ext4_group_t grp,
+ ext4_update_sb_callback func, const void *arg)
+{
+ int err = 0;
+ ext4_fsblk_t sb_block;
+ struct buffer_head *bh;
+ unsigned long offset = 0;
+ struct ext4_super_block *es;
+
+ if (!ext4_bg_has_super(sb, grp))
+ return 0;
+
+ /*
+ * For the group 0 there is always 1k padding, so we have
+ * either adjust offset, or sb_block depending on blocksize
+ */
+ if (grp == 0) {
+ sb_block = 1 * EXT4_MIN_BLOCK_SIZE;
+ offset = do_div(sb_block, sb->s_blocksize);
+ } else {
+ sb_block = ext4_group_first_block_no(sb, grp);
+ offset = 0;
+ }
+
+ trace_ext4_update_sb(sb, sb_block, handle ? 1 : 0);
+
+ bh = ext4_sb_bread(sb, sb_block, 0);
+ if (IS_ERR(bh))
+ return PTR_ERR(bh);
+
+ if (handle) {
+ BUFFER_TRACE(bh, "get_write_access");
+ err = ext4_journal_get_write_access(handle, sb,
+ bh,
+ EXT4_JTR_NONE);
+ if (err)
+ goto out_bh;
+ }
+
+ es = (struct ext4_super_block *) (bh->b_data + offset);
+ lock_buffer(bh);
+ if (ext4_has_metadata_csum(sb) &&
+ es->s_checksum != ext4_superblock_csum(sb, es)) {
+ ext4_msg(sb, KERN_ERR, "Invalid checksum for backup "
+ "superblock %llu\n", sb_block);
+ unlock_buffer(bh);
+ err = -EFSBADCRC;
+ goto out_bh;
+ }
+ func(es, arg);
+ if (ext4_has_metadata_csum(sb))
+ es->s_checksum = ext4_superblock_csum(sb, es);
+ set_buffer_uptodate(bh);
+ unlock_buffer(bh);
+
+ if (err)
+ goto out_bh;
+
+ if (handle) {
+ err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ if (err)
+ goto out_bh;
+ } else {
+ BUFFER_TRACE(bh, "marking dirty");
+ mark_buffer_dirty(bh);
+ }
+ err = sync_dirty_buffer(bh);
+
+out_bh:
+ brelse(bh);
+ ext4_std_error(sb, err);
+ return (err) ? err : 1;
+}
+
+/*
+ * Update primary and backup superblocks using the provided function
+ * func and argument arg.
+ *
+ * Only the primary superblock and at most two backup superblock
+ * modifications are journalled; the rest is modified without journal.
+ * This is safe because e2fsck will re-write them if there is a problem,
+ * and we're very unlikely to ever need more than two backups.
+ */
+static
+int ext4_update_superblocks_fn(struct super_block *sb,
+ ext4_update_sb_callback func,
+ const void *arg)
+{
+ handle_t *handle;
+ ext4_group_t ngroups;
+ unsigned int three = 1;
+ unsigned int five = 5;
+ unsigned int seven = 7;
+ int err = 0, ret, i;
+ ext4_group_t grp, primary_grp;
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+
+ /*
+ * We can't update superblocks while the online resize is running
+ */
+ if (test_and_set_bit_lock(EXT4_FLAGS_RESIZING,
+ &sbi->s_ext4_flags)) {
+ ext4_msg(sb, KERN_ERR, "Can't modify superblock while"
+ "performing online resize");
+ return -EBUSY;
+ }
+
+ /*
+ * We're only going to update primary superblock and two
+ * backup superblocks in this transaction.
+ */
+ handle = ext4_journal_start_sb(sb, EXT4_HT_MISC, 3);
+ if (IS_ERR(handle)) {
+ err = PTR_ERR(handle);
+ goto out;
+ }
+
+ /* Update primary superblock */
+ err = ext4_update_primary_sb(sb, handle, func, arg);
+ if (err) {
+ ext4_msg(sb, KERN_ERR, "Failed to update primary "
+ "superblock");
+ goto out_journal;
+ }
+
+ primary_grp = ext4_get_group_number(sb, sbi->s_sbh->b_blocknr);
+ ngroups = ext4_get_groups_count(sb);
+
+ /*
+ * Update backup superblocks. We have to start from group 0
+ * because it might not be where the primary superblock is
+ * if the fs is mounted with -o sb=<backup_sb_block>
+ */
+ i = 0;
+ grp = 0;
+ while (grp < ngroups) {
+ /* Skip primary superblock */
+ if (grp == primary_grp)
+ goto next_grp;
+
+ ret = ext4_update_backup_sb(sb, handle, grp, func, arg);
+ if (ret < 0) {
+ /* Ignore bad checksum; try to update next sb */
+ if (ret == -EFSBADCRC)
+ goto next_grp;
+ err = ret;
+ goto out_journal;
+ }
+
+ i += ret;
+ if (handle && i > 1) {
+ /*
+ * We're only journalling primary superblock and
+ * two backup superblocks; the rest is not
+ * journalled.
+ */
+ err = ext4_journal_stop(handle);
+ if (err)
+ goto out;
+ handle = NULL;
+ }
+next_grp:
+ grp = ext4_list_backups(sb, &three, &five, &seven);
+ }
+
+out_journal:
+ if (handle) {
+ ret = ext4_journal_stop(handle);
+ if (ret && !err)
+ err = ret;
+ }
+out:
+ clear_bit_unlock(EXT4_FLAGS_RESIZING, &sbi->s_ext4_flags);
+ smp_mb__after_atomic();
+ return err ? err : 0;
+}
+
/**
* Swap memory between @a and @b for @len bytes.
*
@@ -852,6 +1104,131 @@
return err;
}
+static int ext4_ioctl_setlabel(struct file *filp, const char __user *user_label)
+{
+ size_t len;
+ int ret = 0;
+ char new_label[EXT4_LABEL_MAX + 1];
+ struct super_block *sb = file_inode(filp)->i_sb;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ /*
+ * Copy the maximum length allowed for ext4 label with one more to
+ * find the required terminating null byte in order to test the
+ * label length. The on disk label doesn't need to be null terminated.
+ */
+ if (copy_from_user(new_label, user_label, EXT4_LABEL_MAX + 1))
+ return -EFAULT;
+
+ len = strnlen(new_label, EXT4_LABEL_MAX + 1);
+ if (len > EXT4_LABEL_MAX)
+ return -EINVAL;
+
+ /*
+ * Clear the buffer after the new label
+ */
+ memset(new_label + len, 0, EXT4_LABEL_MAX - len);
+
+ ret = mnt_want_write_file(filp);
+ if (ret)
+ return ret;
+
+ ret = ext4_update_superblocks_fn(sb, ext4_sb_setlabel, new_label);
+
+ mnt_drop_write_file(filp);
+ return ret;
+}
+
+static int ext4_ioctl_getlabel(struct ext4_sb_info *sbi, char __user *user_label)
+{
+ char label[EXT4_LABEL_MAX + 1];
+
+ /*
+ * EXT4_LABEL_MAX must always be smaller than FSLABEL_MAX because
+ * FSLABEL_MAX must include terminating null byte, while s_volume_name
+ * does not have to.
+ */
+ BUILD_BUG_ON(EXT4_LABEL_MAX >= FSLABEL_MAX);
+
+ memset(label, 0, sizeof(label));
+ lock_buffer(sbi->s_sbh);
+ strncpy(label, sbi->s_es->s_volume_name, EXT4_LABEL_MAX);
+ unlock_buffer(sbi->s_sbh);
+
+ if (copy_to_user(user_label, label, sizeof(label)))
+ return -EFAULT;
+ return 0;
+}
+
+static int ext4_ioctl_getuuid(struct ext4_sb_info *sbi,
+ struct fsuuid __user *ufsuuid)
+{
+ struct fsuuid fsuuid;
+ __u8 uuid[UUID_SIZE];
+
+ if (copy_from_user(&fsuuid, ufsuuid, sizeof(fsuuid)))
+ return -EFAULT;
+
+ if (fsuuid.fsu_len == 0) {
+ fsuuid.fsu_len = UUID_SIZE;
+ if (copy_to_user(ufsuuid, &fsuuid, sizeof(fsuuid.fsu_len)))
+ return -EFAULT;
+ return -EINVAL;
+ }
+
+ if (fsuuid.fsu_len != UUID_SIZE || fsuuid.fsu_flags != 0)
+ return -EINVAL;
+
+ lock_buffer(sbi->s_sbh);
+ memcpy(uuid, sbi->s_es->s_uuid, UUID_SIZE);
+ unlock_buffer(sbi->s_sbh);
+
+ if (copy_to_user(&ufsuuid->fsu_uuid[0], uuid, UUID_SIZE))
+ return -EFAULT;
+ return 0;
+}
+
+static int ext4_ioctl_setuuid(struct file *filp,
+ const struct fsuuid __user *ufsuuid)
+{
+ int ret = 0;
+ struct super_block *sb = file_inode(filp)->i_sb;
+ struct fsuuid fsuuid;
+ __u8 uuid[UUID_SIZE];
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ /*
+ * If any checksums (group descriptors or metadata) are being used
+ * then the checksum seed feature is required to change the UUID.
+ */
+ if (((ext4_has_feature_gdt_csum(sb) || ext4_has_metadata_csum(sb))
+ && !ext4_has_feature_csum_seed(sb))
+ || ext4_has_feature_stable_inodes(sb))
+ return -EOPNOTSUPP;
+
+ if (copy_from_user(&fsuuid, ufsuuid, sizeof(fsuuid)))
+ return -EFAULT;
+
+ if (fsuuid.fsu_len != UUID_SIZE || fsuuid.fsu_flags != 0)
+ return -EINVAL;
+
+ if (copy_from_user(uuid, &ufsuuid->fsu_uuid[0], UUID_SIZE))
+ return -EFAULT;
+
+ ret = mnt_want_write_file(filp);
+ if (ret)
+ return ret;
+
+ ret = ext4_update_superblocks_fn(sb, ext4_sb_setuuid, &uuid);
+ mnt_drop_write_file(filp);
+
+ return ret;
+}
+
static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
struct inode *inode = file_inode(filp);
@@ -1266,6 +1643,17 @@
case EXT4_IOC_CHECKPOINT:
return ext4_ioctl_checkpoint(filp, arg);
+ case FS_IOC_GETFSLABEL:
+ return ext4_ioctl_getlabel(EXT4_SB(sb), (void __user *)arg);
+
+ case FS_IOC_SETFSLABEL:
+ return ext4_ioctl_setlabel(filp,
+ (const void __user *)arg);
+
+ case EXT4_IOC_GETFSUUID:
+ return ext4_ioctl_getuuid(EXT4_SB(sb), (void __user *)arg);
+ case EXT4_IOC_SETFSUUID:
+ return ext4_ioctl_setuuid(filp, (const void __user *)arg);
default:
return -ENOTTY;
}
@@ -1341,6 +1729,10 @@
case EXT4_IOC_GETSTATE:
case EXT4_IOC_GET_ES_CACHE:
case EXT4_IOC_CHECKPOINT:
+ case FS_IOC_GETFSLABEL:
+ case FS_IOC_SETFSLABEL:
+ case EXT4_IOC_GETFSUUID:
+ case EXT4_IOC_SETFSUUID:
break;
default:
return -ENOIOCTLCMD;
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 4687d59..61b913d 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -730,12 +730,23 @@
* sequence of powers of 3, 5, and 7: 1, 3, 5, 7, 9, 25, 27, 49, 81, ...
* For a non-sparse filesystem it will be every group: 1, 2, 3, 4, ...
*/
-static unsigned ext4_list_backups(struct super_block *sb, unsigned *three,
- unsigned *five, unsigned *seven)
+unsigned int ext4_list_backups(struct super_block *sb, unsigned int *three,
+ unsigned int *five, unsigned int *seven)
{
- unsigned *min = three;
+ struct ext4_super_block *es = EXT4_SB(sb)->s_es;
+ unsigned int *min = three;
int mult = 3;
- unsigned ret;
+ unsigned int ret;
+
+ if (ext4_has_feature_sparse_super2(sb)) {
+ do {
+ if (*min > 2)
+ return UINT_MAX;
+ ret = le32_to_cpu(es->s_backup_bgs[*min - 1]);
+ *min += 1;
+ } while (!ret);
+ return ret;
+ }
if (!ext4_has_feature_sparse_super(sb)) {
ret = *min;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index b09b7a6..78e8cca 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -254,8 +254,8 @@
return es->s_checksum_type == EXT4_CRC32C_CHKSUM;
}
-static __le32 ext4_superblock_csum(struct super_block *sb,
- struct ext4_super_block *es)
+__le32 ext4_superblock_csum(struct super_block *sb,
+ struct ext4_super_block *es)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
int offset = offsetof(struct ext4_super_block, s_checksum);
@@ -4093,9 +4093,6 @@
blocksize = EXT4_MIN_BLOCK_SIZE << le32_to_cpu(es->s_log_block_size);
- if (blocksize == PAGE_SIZE)
- set_opt(sb, DIOREAD_NOLOCK);
-
if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV) {
sbi->s_inode_size = EXT4_GOOD_OLD_INODE_SIZE;
sbi->s_first_ino = EXT4_GOOD_OLD_FIRST_INO;
diff --git a/fs/file_table.c b/fs/file_table.c
index 6f297f97..75c3932 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -278,6 +278,7 @@
}
if (file->f_op->release)
file->f_op->release(inode, file);
+ security_file_pre_free(file);
if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
!(mode & FMODE_PATH))) {
cdev_put(inode->i_cdev);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 65e5e6e..8206eb4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -167,7 +167,7 @@
type = ovl_path_real(dentry, &realpath);
old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getattr(&realpath, stat, request_mask, flags);
+ err = ovl_do_getattr(&realpath, stat, request_mask, flags);
if (err)
goto out;
@@ -192,8 +192,8 @@
(!is_dir ? STATX_NLINK : 0);
ovl_path_lower(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerstat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerstat, lowermask,
+ flags);
if (err)
goto out;
@@ -242,8 +242,8 @@
u32 lowermask = STATX_BLOCKS;
ovl_path_lowerdata(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerdatastat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerdatastat,
+ lowermask, flags);
if (err)
goto out;
stat->blocks = lowerdatastat.blocks;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index a96b675..d06a664 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -274,6 +274,13 @@
!ofs->config.redirect_dir && ofs->config.xino != OVL_XINO_ON);
}
+static inline int ovl_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
/* util.c */
int ovl_want_write(struct dentry *dentry);
diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index c930001..c54bc40 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -107,3 +107,11 @@
config PROC_CPU_RESCTRL
def_bool n
depends on PROC_FS
+
+config PROC_SELF_MEM_READONLY
+ bool "Force /proc/<pid>/mem paths to be read-only"
+ default y
+ help
+ When enabled, attempts to open /proc/self/mem for write access
+ will always fail. Write access to this file allows bypassing
+ of memory map permissions (such as modifying read-only code).
diff --git a/fs/proc/base.c b/fs/proc/base.c
index e5d7a5a..1250472 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -151,6 +151,12 @@
NULL, &proc_pid_attr_operations, \
{ .lsm = LSM })
+#ifdef CONFIG_PROC_SELF_MEM_READONLY
+# define PROC_PID_MEM_MODE S_IRUSR
+#else
+# define PROC_PID_MEM_MODE S_IRUSR|S_IWUSR
+#endif
+
/*
* Count the number of hardlinks for the pid_entry table, excluding the .
* and .. links.
@@ -899,7 +905,11 @@
static ssize_t mem_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
+#ifdef CONFIG_PROC_SELF_MEM_READONLY
+ return -EACCES;
+#else
return mem_rw(file, (char __user*)buf, count, ppos, 1);
+#endif
}
loff_t mem_lseek(struct file *file, loff_t offset, int orig)
@@ -3236,7 +3246,7 @@
#ifdef CONFIG_NUMA
REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
#endif
- REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
+ REG("mem", PROC_PID_MEM_MODE, proc_mem_operations),
LNK("cwd", proc_cwd_link),
LNK("root", proc_root_link),
LNK("exe", proc_exe_link),
@@ -3582,7 +3592,7 @@
#ifdef CONFIG_NUMA
REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
#endif
- REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
+ REG("mem", PROC_PID_MEM_MODE, proc_mem_operations),
LNK("cwd", proc_cwd_link),
LNK("root", proc_root_link),
LNK("exe", proc_exe_link),
diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index e573098..d217811 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -26,7 +26,7 @@
#include <linux/vmalloc.h>
#include <linux/pagemap.h>
#include <linux/uaccess.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <asm/io.h>
#include "internal.h"
@@ -181,7 +181,7 @@
*/
ssize_t __weak elfcorehdr_read_notes(char *buf, size_t count, u64 *ppos)
{
- return read_from_oldmem(buf, count, ppos, 0, mem_encrypt_active());
+ return read_from_oldmem(buf, count, ppos, 0, cc_platform_has(CC_ATTR_MEM_ENCRYPT));
}
/*
@@ -382,7 +382,7 @@
buflen);
start = m->paddr + *fpos - m->offset;
tmp = read_from_oldmem(buffer, tsz, &start,
- userbuf, mem_encrypt_active());
+ userbuf, cc_platform_has(CC_ATTR_MEM_ENCRYPT));
if (tmp < 0)
return tmp;
buflen -= tsz;
diff --git a/fs/stat.c b/fs/stat.c
index 246d138..216dc64 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -17,6 +17,7 @@
#include <linux/syscalls.h>
#include <linux/pagemap.h>
#include <linux/compat.h>
+#include <linux/iversion.h>
#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -118,10 +119,16 @@
stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT |
STATX_ATTR_DAX);
+ if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) {
+ stat->result_mask |= STATX_CHANGE_COOKIE;
+ stat->change_cookie = inode_query_iversion(inode);
+ }
+
mnt_userns = mnt_user_ns(path->mnt);
if (inode->i_op->getattr)
return inode->i_op->getattr(mnt_userns, path, stat,
- request_mask, query_flags);
+ request_mask,
+ query_flags | AT_GETATTR_NOSEC);
generic_fillattr(mnt_userns, inode, stat);
return 0;
@@ -154,6 +161,9 @@
{
int retval;
+ if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC))
+ return -EPERM;
+
retval = security_inode_getattr(path);
if (retval)
return retval;
@@ -573,9 +583,11 @@
memset(&tmp, 0, sizeof(tmp));
- tmp.stx_mask = stat->result_mask;
+ /* STATX_CHANGE_COOKIE is kernel-only for now */
+ tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE;
tmp.stx_blksize = stat->blksize;
- tmp.stx_attributes = stat->attributes;
+ /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */
+ tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC;
tmp.stx_nlink = stat->nlink;
tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
@@ -612,6 +624,11 @@
if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE)
return -EINVAL;
+ /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests
+ * from userland.
+ */
+ mask &= ~STATX_CHANGE_COOKIE;
+
error = vfs_statx(dfd, filename, flags, &stat, mask);
if (error)
return error;
diff --git a/google/certs/lakitu_root_cert.pem b/google/certs/lakitu_root_cert.pem
new file mode 100644
index 0000000..c7c11b7
--- /dev/null
+++ b/google/certs/lakitu_root_cert.pem
@@ -0,0 +1,26 @@
+-----BEGIN CERTIFICATE-----
+MIIEdzCCAt+gAwIBAgIBATANBgkqhkiG9w0BAQsFADA/MQswCQYDVQQGEwJVUzEP
+MA0GA1UEChMGR29vZ2xlMR8wHQYDVQQDExZDb250YWluZXItT3B0aW1pemVkIE9T
+MB4XDTE5MDExNTAwMzA0MloXDTI5MDExNjAwMzA0MlowPzELMAkGA1UEBhMCVVMx
+DzANBgNVBAoTBkdvb2dsZTEfMB0GA1UEAxMWQ29udGFpbmVyLU9wdGltaXplZCBP
+UzCCAaIwDQYJKoZIhvcNAQEBBQADggGPADCCAYoCggGBALrwED7U5BVwSxRBaaaT
+anJ1V/NrEOL0SRVO6GWPLEK4eNEpIlVG5wvvNVXHZo01pBjpjaU+E7bc7buEzSKa
+62jqIiIe80HbDZ9qmPqsWVtaDiT/bROWpGmJ+oYLqBB20z1vPAhur5KOVOvMZqNG
+MR2Hxg3EvoS2PwcXtsJ3rca3SR7oF7BvGGRaDoeO/1cxi1OLlfe8qIaw5VQquXvm
+NqQchtp/8AN6UOEBCql7fTVf/DINpDa/QhWqnIvGfgcT+84caX+2RIys3JIojZ9j
+sWTLnk/BtPOdRRN4SuJXqxOcdig8RHZPP4K7evdn83IQdOzdL6I5P395QhCNU1hT
+rzkJ6DKIqwzoC9fPlav1zec14xSO8nS8Obwr0af3dYNDZFKc8VZdgCaVckSg6E6w
+3R+6dn2Vdto+r3mSYydpDT6obeMZOwNhQfTFltZ7O+KC98AhnwLu2+ReSPXLHazT
+HYLOrONHmfD/mHbPdzMHrftFUqtmbdIt/45gdEBYIjt51wIDAQABo34wfDAOBgNV
+HQ8BAf8EBAMCAoQwEgYDVR0TAQH/BAgwBgEB/wIBADApBgNVHQ4EIgQgLv6kk/wU
+izWFnQvIw91xoTn8erQRKD4G9sELWmsX41UwKwYDVR0jBCQwIoAgLv6kk/wUizWF
+nQvIw91xoTn8erQRKD4G9sELWmsX41UwDQYJKoZIhvcNAQELBQADggGBAAs4IGt0
+qmpKxfzsM3iN0zZCMpH0/0M7O8DkEXnPFDyY55IuH64hkcPjeA1VNHoCiPAsDlWy
+u4lgJ33aEpWJx/0F5V3DSgszu/JT4/DByi9EP0jli9AT0iV9RCRDTfOWkIoOb6EC
+7OLUmR+gYRUpMfzQEmo5sdrl0Ju9K2LOj3lqyenrWOOXEJ3yR3Z922mxh6SWC7Hw
+ho8CMsaq4PWMycsSOvlr/j+SWCZCxgLdKfm/PmSziD7lexnFytDD52c6vYKXf58e
+v41LzO6s/0DTT6cLE7rHXmRPwBBWjyFpJ8GfRF7o8aHKyzdJtLVlG02PU/uJ/UCT
+DGWVSCqbd3+AkP8IfZ2CZHRpJ15DkkprS9f5nw8o/nHIdjPdbyhTELCsF4kfn8/o
+6jS8HVDv5BvfNpoiIh5VCMnAcaQ9aoCEi3oyPm05/ApFeIRDg7Wu8PpCyNEM6bTp
+I1DxUItYhLYHeunme6EK0aLnDsrtMapPPNyBcWa92GgmPQnfByETYuoO6w==
+-----END CERTIFICATE-----
diff --git a/google/kokoro/build_flavors/default.cfg b/google/kokoro/build_flavors/default.cfg
new file mode 100644
index 0000000..59f085b
--- /dev/null
+++ b/google/kokoro/build_flavors/default.cfg
@@ -0,0 +1,2 @@
+KERNEL_CONFIGS="lakitu_defconfig"
+BUILD_COMPONENTS="-k -H -d"
diff --git a/google/kokoro/build_flavors/xfstest.cfg b/google/kokoro/build_flavors/xfstest.cfg
new file mode 100644
index 0000000..533dd32
--- /dev/null
+++ b/google/kokoro/build_flavors/xfstest.cfg
@@ -0,0 +1,2 @@
+KERNEL_CONFIGS="lakitu_defconfig,google/xfstest.config"
+BUILD_COMPONENTS="-k"
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 82b7c11..b2fe82c 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -95,6 +95,17 @@
struct audit_ntp_data {};
#endif
+struct audit_task_info {
+ kuid_t loginuid;
+ unsigned int sessionid;
+ u64 contid;
+#ifdef CONFIG_AUDITSYSCALL
+ struct audit_context *ctx;
+#endif
+};
+
+extern struct audit_task_info init_struct_audit;
+
enum audit_nfcfgop {
AUDIT_XT_OP_REGISTER,
AUDIT_XT_OP_REPLACE,
@@ -154,6 +165,9 @@
#ifdef CONFIG_AUDIT
/* These are defined in audit.c */
/* Public API */
+extern int audit_alloc(struct task_struct *task);
+extern void audit_free(struct task_struct *task);
+extern void __init audit_task_init(void);
extern __printf(4, 5)
void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
const char *fmt, ...);
@@ -197,12 +211,25 @@
static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
{
- return tsk->loginuid;
+ if (!tsk->audit)
+ return INVALID_UID;
+ return tsk->audit->loginuid;
}
static inline unsigned int audit_get_sessionid(struct task_struct *tsk)
{
- return tsk->sessionid;
+ if (!tsk->audit)
+ return AUDIT_SID_UNSET;
+ return tsk->audit->sessionid;
+}
+
+extern int audit_set_contid(struct task_struct *tsk, u64 contid);
+
+static inline u64 audit_get_contid(struct task_struct *tsk)
+{
+ if (!tsk->audit)
+ return AUDIT_CID_UNSET;
+ return tsk->audit->contid;
}
extern u32 audit_enabled;
@@ -210,6 +237,14 @@
extern int audit_signal_info(int sig, struct task_struct *t);
#else /* CONFIG_AUDIT */
+static inline int audit_alloc(struct task_struct *task)
+{
+ return 0;
+}
+static inline void audit_free(struct task_struct *task)
+{ }
+static inline void __init audit_task_init(void)
+{ }
static inline __printf(4, 5)
void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
const char *fmt, ...)
@@ -261,6 +296,11 @@
return AUDIT_SID_UNSET;
}
+static inline u64 audit_get_contid(struct task_struct *tsk)
+{
+ return AUDIT_CID_UNSET;
+}
+
#define audit_enabled AUDIT_OFF
static inline int audit_signal_info(int sig, struct task_struct *t)
@@ -285,8 +325,6 @@
/* These are defined in auditsc.c */
/* Public API */
-extern int audit_alloc(struct task_struct *task);
-extern void __audit_free(struct task_struct *task);
extern void __audit_syscall_entry(int major, unsigned long a0, unsigned long a1,
unsigned long a2, unsigned long a3);
extern void __audit_syscall_exit(int ret_success, long ret_value);
@@ -305,12 +343,14 @@
static inline void audit_set_context(struct task_struct *task, struct audit_context *ctx)
{
- task->audit_context = ctx;
+ task->audit->ctx = ctx;
}
static inline struct audit_context *audit_context(void)
{
- return current->audit_context;
+ if (!current->audit)
+ return NULL;
+ return current->audit->ctx;
}
static inline bool audit_dummy_context(void)
@@ -318,11 +358,7 @@
void *p = audit_context();
return !p || *(int *)p;
}
-static inline void audit_free(struct task_struct *task)
-{
- if (unlikely(task->audit_context))
- __audit_free(task);
-}
+
static inline void audit_syscall_entry(int major, unsigned long a0,
unsigned long a1, unsigned long a2,
unsigned long a3)
@@ -550,12 +586,6 @@
extern int audit_n_rules;
extern int audit_signals;
#else /* CONFIG_AUDITSYSCALL */
-static inline int audit_alloc(struct task_struct *task)
-{
- return 0;
-}
-static inline void audit_free(struct task_struct *task)
-{ }
static inline void audit_syscall_entry(int major, unsigned long a0,
unsigned long a1, unsigned long a2,
unsigned long a3)
@@ -686,4 +716,19 @@
return uid_valid(audit_get_loginuid(tsk));
}
+static inline bool audit_contid_valid(u64 contid)
+{
+ return contid != AUDIT_CID_UNSET;
+}
+
+static inline bool audit_contid_set(struct task_struct *tsk)
+{
+ return audit_contid_valid(audit_get_contid(tsk));
+}
+
+static inline void audit_log_string(struct audit_buffer *ab, const char *buf)
+{
+ audit_log_n_string(ab, buf, strlen(buf));
+}
+
#endif
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index a075b70..3a341f6 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -61,6 +61,14 @@
* Examples include SEV-ES.
*/
CC_ATTR_GUEST_STATE_ENCRYPT,
+
+ /**
+ * @CC_ATTR_SEV_SNP: Guest SNP is active.
+ *
+ * The platform/OS is running as a guest/virtual machine and actively
+ * using AMD SEV-SNP features.
+ */
+ CC_ATTR_GUEST_SEV_SNP,
};
#ifdef CONFIG_ARCH_HAS_CC_PLATFORM
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 3ad636a..1e8981e 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -22,6 +22,9 @@
#include <linux/fs.h>
#include <linux/dma-fence.h>
#include <linux/wait.h>
+#include <linux/uio.h>
+#include <linux/genalloc.h>
+#include <linux/xarray.h>
struct device;
struct dma_buf;
@@ -540,6 +543,30 @@
void *priv;
};
+struct dma_buf_pages_file_priv {
+ /* fields for dmabuf */
+ struct dma_buf *dmabuf;
+ struct dma_buf_attachment *attachment;
+ struct sg_table *sgt;
+ struct pci_dev *pci_dev;
+ enum dma_data_direction direction;
+
+ /* fields for dma-buf page */
+ size_t num_pages;
+ struct page *pages;
+ struct dev_pagemap pgmap;
+
+ int has_page_pool;
+
+ /* fields for Tx */
+ struct iov_iter tx_iter;
+ struct bio_vec *tx_bv;
+
+ /* fields for Rx */
+ struct gen_pool *page_pool;
+ struct xarray bound_rxq_list;
+};
+
/**
* DEFINE_DMA_BUF_EXPORT_INFO - helper macro for exporters
* @name: export-info name
@@ -623,4 +650,47 @@
unsigned long);
int dma_buf_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
+
+#ifdef CONFIG_DMA_SHARED_BUFFER
+extern const struct file_operations dma_buf_pages_fops;
+extern const struct dev_pagemap_ops dma_buf_pgmap_ops;
+
+static inline bool is_dma_buf_pages_file(struct file *file)
+{
+ return file->f_op == &dma_buf_pages_fops;
+}
+
+static inline bool is_dma_buf_page(struct page *page)
+{
+ return (is_zone_device_page(page) && page->pgmap &&
+ page->pgmap->ops == &dma_buf_pgmap_ops);
+}
+#else
+static bool is_dma_buf_page(struct page *page)
+{
+ return false;
+}
+
+static bool is_dma_buf_pages_file(struct file *file)
+{
+ return false;
+}
+#endif
+
+static inline int dma_buf_map_sg(struct device *dev, struct scatterlist *sg,
+ int nents, enum dma_data_direction dir)
+{
+ struct scatterlist *s;
+ int i;
+
+ for_each_sg(sg, s, nents, i) {
+ struct page *pg = sg_page(s);
+
+ s->dma_address = (dma_addr_t)pg->zone_device_data;
+ sg_dma_len(s) = s->length;
+ }
+
+ return nents;
+}
+
#endif /* __DMA_BUF_H__ */
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 5598fc3..d1ea991 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -346,6 +346,7 @@
#define EFI_CERT_SHA256_GUID EFI_GUID(0xc1c41626, 0x504c, 0x4092, 0xac, 0xa9, 0x41, 0xf9, 0x36, 0x93, 0x43, 0x28)
#define EFI_CERT_X509_GUID EFI_GUID(0xa5c059a1, 0x94e4, 0x4aa7, 0x87, 0xb5, 0xab, 0x15, 0x5c, 0x2b, 0xf0, 0x72)
#define EFI_CERT_X509_SHA256_GUID EFI_GUID(0x3bd2a492, 0x96c0, 0x4079, 0xb4, 0x20, 0xfc, 0xf9, 0x8e, 0xf1, 0x03, 0xed)
+#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
/*
* This GUID is used to pass to the kernel proper the struct screen_info
@@ -364,6 +365,7 @@
/* OEM GUIDs */
#define DELLEMC_EFI_RCI2_TABLE_GUID EFI_GUID(0x2d9f28a2, 0xa886, 0x456a, 0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
+#define AMD_SEV_MEM_ENCRYPT_GUID EFI_GUID(0x0cf29b71, 0x9e51, 0x433a, 0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
typedef struct {
efi_guid_t guid;
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index af2fa11..f88730b 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -163,6 +163,7 @@
LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
LSM_HOOK(int, 0, file_alloc_security, struct file *file)
LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
+LSM_HOOK(void, LSM_RET_VOID, file_pre_free_security, struct file *file)
LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
unsigned long arg)
LSM_HOOK(int, 0, file_ioctl_compat, struct file *file, unsigned int cmd,
@@ -182,6 +183,7 @@
LSM_HOOK(int, 0, file_open, struct file *file)
LSM_HOOK(int, 0, task_alloc, struct task_struct *task,
unsigned long clone_flags)
+LSM_HOOK(void, LSM_RET_VOID, task_post_alloc, struct task_struct *task)
LSM_HOOK(void, LSM_RET_VOID, task_free, struct task_struct *task)
LSM_HOOK(int, 0, cred_alloc_blank, struct cred *cred, gfp_t gfp)
LSM_HOOK(void, LSM_RET_VOID, cred_free, struct cred *cred)
@@ -223,6 +225,7 @@
LSM_HOOK(int, 0, task_movememory, struct task_struct *p)
LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
int sig, const struct cred *cred)
+LSM_HOOK(void, LSM_RET_VOID, task_exit, struct task_struct *p)
LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
unsigned long arg3, unsigned long arg4, unsigned long arg5)
LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 5902461..c37b56e 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -532,6 +532,10 @@
* @file_free_security:
* Deallocate and free any security structures stored in file->f_security.
* @file contains the file structure being modified.
+ * @file_pre_free_security:
+ * Perform any logging or LSM state updates for a file being deleted
+ * using fields of the file before they have been cleared.
+ * @file contains the file structure being freed
* @file_ioctl:
* @file contains the file structure.
* @cmd contains the operation to perform.
@@ -608,6 +612,10 @@
* @clone_flags contains the flags indicating what should be shared.
* Handle allocation of task-related resources.
* Returns a zero on success, negative values on failure.
+ * @task_post_alloc:
+ * @task task being allocated.
+ * Handle allocation of task-related resources after all task fields are
+ * filled in.
* @task_free:
* @task task about to be freed.
* Handle release of task-related resources. (Note that this can be called
@@ -784,6 +792,9 @@
* @cred contains the cred of the process where the signal originated, or
* NULL if the current task is the originator.
* Return 0 if permission is granted.
+ * @task_exit:
+ * Called early when a task is exiting before all state is lost.
+ * @p contains the task_struct for process.
* @task_prctl:
* Check permission before performing a process control operation on the
* current process.
diff --git a/include/linux/mem_encrypt.h b/include/linux/mem_encrypt.h
index 5c4a18a..ae45263 100644
--- a/include/linux/mem_encrypt.h
+++ b/include/linux/mem_encrypt.h
@@ -16,10 +16,6 @@
#include <asm/mem_encrypt.h>
-#else /* !CONFIG_ARCH_HAS_MEM_ENCRYPT */
-
-static inline bool mem_encrypt_active(void) { return false; }
-
#endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
#ifdef CONFIG_AMD_MEM_ENCRYPT
diff --git a/include/linux/mmu_context.h b/include/linux/mmu_context.h
index b9b970f..01c77ba 100644
--- a/include/linux/mmu_context.h
+++ b/include/linux/mmu_context.h
@@ -5,6 +5,11 @@
#include <asm/mmu_context.h>
#include <asm/mmu.h>
+struct mm_struct;
+
+void use_mm(struct mm_struct *mm);
+void unuse_mm(struct mm_struct *mm);
+
/* Architectures that care about IRQ state in switch_mm can override this. */
#ifndef switch_mm_irqs_off
# define switch_mm_irqs_off switch_mm
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 829ebde..7afcc17 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -48,6 +48,7 @@
#include <uapi/linux/pkt_cls.h>
#include <linux/hashtable.h>
#include <linux/rbtree.h>
+#include <linux/dma-buf.h>
struct netpoll_info;
struct device;
@@ -763,8 +764,41 @@
#ifdef CONFIG_XDP_SOCKETS
struct xsk_buff_pool *pool;
#endif
+ struct file __rcu *dmabuf_pages;
} ____cacheline_aligned_in_smp;
+struct page *
+__netdev_rxq_alloc_page_from_dmabuf_pool(struct netdev_rx_queue *rxq,
+ unsigned int order);
+
+static inline struct page *netdev_rxq_alloc_dma_buf_page(struct netdev_rx_queue *rxq,
+ unsigned int order)
+{
+
+ /* Return NULL if we can't allocate a dma_buf page, instead of trying
+ * to fallback to alloc_page(). The reason we do this is because we
+ * don't want to confuse the caller with respect to whether the page
+ * they are getting is a dma_buf page or otherwise. dma_buf pages and
+ * devmem skbs require special handling, so the distinction is
+ * important. Let the caller fall back to allocating a non-dma_buf page
+ * if they know what they're doing.
+ */
+ if (unlikely(!rcu_access_pointer(rxq->dmabuf_pages)))
+ return NULL;
+
+ return __netdev_rxq_alloc_page_from_dmabuf_pool(rxq, order);
+}
+
+static inline void netdev_rxq_free_page(struct page *pg)
+{
+ if (is_dma_buf_page(pg)) {
+ put_page(pg);
+ return;
+ }
+
+ __free_page(pg);
+}
+
/*
* RX queue sysfs structures and functions.
*/
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 7c7e627..3a6622f 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -31,6 +31,9 @@
struct ucounts *ucounts;
int reboot; /* group exit code if this pidns was rebooted */
struct ns_common ns;
+#ifdef CONFIG_SECURITY_CONTAINER_MONITOR
+ u64 cid; /* Main container identifier, zero if not assigned. */
+#endif
} __randomize_layout;
extern struct pid_namespace init_pid_ns;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9b3cfe6..38d1001 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -37,7 +37,6 @@
#include <asm/kmap_size.h>
/* task_struct member predeclarations (sorted alphabetically): */
-struct audit_context;
struct backing_dev_info;
struct bio_list;
struct blk_plug;
@@ -1087,11 +1086,7 @@
struct callback_head *task_works;
#ifdef CONFIG_AUDIT
-#ifdef CONFIG_AUDITSYSCALL
- struct audit_context *audit_context;
-#endif
- kuid_t loginuid;
- unsigned int sessionid;
+ struct audit_task_info *audit;
#endif
struct seccomp seccomp;
struct syscall_user_dispatch syscall_dispatch;
@@ -1923,6 +1918,12 @@
extern int wake_up_process(struct task_struct *tsk);
extern void wake_up_new_task(struct task_struct *tsk);
+/*
+ * Wake up tsk and try to swap it into the current tasks place, which
+ * initially means just trying to migrate it to the current CPU.
+ */
+extern int wake_up_swap(struct task_struct *tsk);
+
#ifdef CONFIG_SMP
extern void kick_process(struct task_struct *tsk);
#else
diff --git a/include/linux/security.h b/include/linux/security.h
index e844834..242b596 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -381,6 +381,7 @@
int security_file_permission(struct file *file, int mask);
int security_file_alloc(struct file *file);
void security_file_free(struct file *file);
+void security_file_pre_free(struct file *file);
int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
int security_file_ioctl_compat(struct file *file, unsigned int cmd,
unsigned long arg);
@@ -397,6 +398,7 @@
int security_file_receive(struct file *file);
int security_file_open(struct file *file);
int security_task_alloc(struct task_struct *task, unsigned long clone_flags);
+void security_task_post_alloc(struct task_struct *task);
void security_task_free(struct task_struct *task);
int security_cred_alloc_blank(struct cred *cred, gfp_t gfp);
void security_cred_free(struct cred *cred);
@@ -435,6 +437,7 @@
int security_task_movememory(struct task_struct *p);
int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
int sig, const struct cred *cred);
+void security_task_exit(struct task_struct *p);
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5);
void security_task_to_inode(struct task_struct *p, struct inode *inode);
@@ -959,6 +962,9 @@
static inline void security_file_free(struct file *file)
{ }
+static inline void security_file_pre_free(struct file *file)
+{ }
+
static inline int security_file_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -1029,6 +1035,9 @@
return 0;
}
+static inline void security_task_post_alloc(struct task_struct *task)
+{ }
+
static inline void security_task_free(struct task_struct *task)
{ }
@@ -1189,6 +1198,9 @@
return 0;
}
+static inline void security_task_exit(struct task_struct *p)
+{ }
+
static inline int security_task_prctl(int option, unsigned long arg2,
unsigned long arg3,
unsigned long arg4,
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7ed1d44..0f40d02 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -41,6 +41,7 @@
#if IS_ENABLED(CONFIG_NF_CONNTRACK)
#include <linux/netfilter/nf_conntrack_common.h>
#endif
+#include <linux/dma-buf.h>
/* The interface for checksum offload between the stack and networking drivers
* is as follows...
@@ -547,6 +548,9 @@
int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
struct msghdr *msg, int len,
struct ubuf_info *uarg);
+int skb_devmem_iter_stream(struct sock *sk, struct sk_buff *skb,
+ struct iov_iter *iov_iter, int len,
+ struct ubuf_info *uarg);
/* This data is invariant across clones and lives at
* the end of the header data, ie. at skb->end.
@@ -727,6 +731,8 @@
* the packet minus one that have been verified as
* CHECKSUM_UNNECESSARY (max 3)
* @scm_io_uring: SKB holds io_uring registered files
+ * @devmem: indicates that all the fragments in this skb is backed by
+ * device memory.
* @dst_pending_confirm: need to confirm neighbour
* @decrypted: Decrypted SKB
* @slow_gro: state present at GRO time, slower prepare step required
@@ -913,7 +919,7 @@
#endif
__u8 slow_gro:1;
__u8 scm_io_uring:1;
-
+ __u8 devmem:1;
#ifdef CONFIG_NET_SCHED
__u16 tc_index; /* traffic control index */
#endif
@@ -1589,6 +1595,12 @@
}
}
+/* Return true if frags in this skb are not readable by the host. */
+static inline bool skb_frags_not_readable(const struct sk_buff *skb)
+{
+ return skb->devmem;
+}
+
static inline void skb_mark_not_on_list(struct sk_buff *skb)
{
skb->next = NULL;
@@ -2914,6 +2926,7 @@
{
if (likely(!skb_zcopy(skb)))
return 0;
+
if (!skb_zcopy_is_nouarg(skb) &&
skb_uarg(skb)->callback == msg_zerocopy_callback)
return 0;
@@ -3318,6 +3331,22 @@
skb_frag_off(frag) + offset, size, dir);
}
+/* Similar to skb_frag_dma_map, but handles devmem skbs correctly. */
+static inline dma_addr_t skb_devmem_frag_dma_map(struct device *dev,
+ const struct sk_buff *skb,
+ const skb_frag_t *frag,
+ size_t offset, size_t size,
+ enum dma_data_direction dir)
+{
+ if (unlikely(skb->devmem && is_dma_buf_page(skb_frag_page(frag)))) {
+ struct page *page = skb_frag_page(frag);
+ dma_addr_t dma_addr = (dma_addr_t)page->zone_device_data;
+
+ return dma_addr + skb_frag_off(frag) + offset;
+ }
+ return skb_frag_dma_map(dev, frag, offset, size, dir);
+}
+
static inline struct sk_buff *pskb_copy(struct sk_buff *skb,
gfp_t gfp_mask)
{
diff --git a/include/linux/socket.h b/include/linux/socket.h
index 4c5ce81..1f19f84 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -317,6 +317,8 @@
* plain text and require encryption
*/
+#define MSG_SOCK_DEVMEM 0x2000000 /* don't copy devmem pages but return
+ * them as cmsg instead */
#define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */
#define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */
#define MSG_CMSG_CLOEXEC 0x40000000 /* Set close_on_exec for file
diff --git a/include/linux/stat.h b/include/linux/stat.h
index 7df0693..c295fc03 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -50,6 +50,15 @@
struct timespec64 btime; /* File creation time */
u64 blocks;
u64 mnt_id;
+ u64 change_cookie;
};
+/* These definitions are internal to the kernel for now. Mainly used by nfsd. */
+
+/* mask values */
+#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */
+
+/* file attribute values */
+#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */
+
#endif
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 07a84ae..512fb30 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -142,6 +142,8 @@
return (struct tcp_request_sock *)req;
}
+#define TCP_RMEM_TO_WIN_SCALE 8
+
struct tcp_sock {
/* inet_connection_sock has to be the first member of tcp_sock */
struct inet_connection_sock inet_conn;
@@ -208,7 +210,7 @@
u32 window_clamp; /* Maximal window to advertise */
u32 rcv_ssthresh; /* Current window clamp */
-
+ u8 scaling_ratio; /* see tcp_win_from_space() */
/* Information of the most recently (s)acked skb */
struct tcp_rack {
u64 mstamp; /* (Re)sent time of the skb */
@@ -423,15 +425,17 @@
TCP_MTU_REDUCED_DEFERRED, /* tcp_v{4|6}_err() could not call
* tcp_v{4|6}_mtu_reduced()
*/
+ TCP_ACK_DEFERRED, /* TX pure ack is deferred */
};
enum tsq_flags {
- TSQF_THROTTLED = (1UL << TSQ_THROTTLED),
- TSQF_QUEUED = (1UL << TSQ_QUEUED),
- TCPF_TSQ_DEFERRED = (1UL << TCP_TSQ_DEFERRED),
- TCPF_WRITE_TIMER_DEFERRED = (1UL << TCP_WRITE_TIMER_DEFERRED),
- TCPF_DELACK_TIMER_DEFERRED = (1UL << TCP_DELACK_TIMER_DEFERRED),
- TCPF_MTU_REDUCED_DEFERRED = (1UL << TCP_MTU_REDUCED_DEFERRED),
+ TSQF_THROTTLED = BIT(TSQ_THROTTLED),
+ TSQF_QUEUED = BIT(TSQ_QUEUED),
+ TCPF_TSQ_DEFERRED = BIT(TCP_TSQ_DEFERRED),
+ TCPF_WRITE_TIMER_DEFERRED = BIT(TCP_WRITE_TIMER_DEFERRED),
+ TCPF_DELACK_TIMER_DEFERRED = BIT(TCP_DELACK_TIMER_DEFERRED),
+ TCPF_MTU_REDUCED_DEFERRED = BIT(TCP_MTU_REDUCED_DEFERRED),
+ TCPF_ACK_DEFERRED = BIT(TCP_ACK_DEFERRED),
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 36c2d99..381c279 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -10,6 +10,7 @@
struct ctl_table_header *sysctl_hdr;
int sysctl_somaxconn;
+ int sysctl_optmem_max;
#ifdef CONFIG_PROC_FS
int __percpu *sock_inuse;
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index d60a10c..5b5cd83 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -146,7 +146,7 @@
u8 sysctl_tcp_abort_on_overflow;
u8 sysctl_tcp_fack; /* obsolete */
int sysctl_tcp_max_reordering;
- int sysctl_tcp_adv_win_scale;
+ int sysctl_tcp_adv_win_scale; /* obsolete */
u8 sysctl_tcp_dsack;
u8 sysctl_tcp_app_win;
u8 sysctl_tcp_frto;
@@ -162,6 +162,7 @@
u8 sysctl_tcp_autocorking;
u8 sysctl_tcp_reflect_tos;
u8 sysctl_tcp_comp_sack_nr;
+ u8 sysctl_tcp_backlog_ack_defer;
int sysctl_tcp_invalid_ratelimit;
int sysctl_tcp_pacing_ss_ratio;
int sysctl_tcp_pacing_ca_ratio;
diff --git a/include/net/sock.h b/include/net/sock.h
index 44ebec3..eed5550 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -349,6 +349,7 @@
* @sk_txtime_deadline_mode: set deadline mode for SO_TXTIME
* @sk_txtime_report_errors: set report errors mode for SO_TXTIME
* @sk_txtime_unused: unused txtime flags
+ * @sk_pagepool: page pool associated with this socket.
*/
struct sock {
/*
@@ -537,6 +538,7 @@
struct bpf_local_storage __rcu *sk_bpf_storage;
#endif
struct rcu_head sk_rcu;
+ struct xarray sk_pagepool;
};
enum sk_pacing {
@@ -1628,12 +1630,10 @@
static inline void sock_release_ownership(struct sock *sk)
{
- if (sk->sk_lock.owned) {
- sk->sk_lock.owned = 0;
+ sk->sk_lock.owned = 0;
- /* The sk_lock has mutex_unlock() semantics: */
- mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
- }
+ /* The sk_lock has mutex_unlock() semantics: */
+ mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
}
/*
@@ -1821,6 +1821,8 @@
u64 transmit_time;
u32 mark;
u16 tsflags;
+ u32 devmem_fd;
+ u32 devmem_offset;
};
static inline void sockcm_init(struct sockcm_cookie *sockc,
@@ -2839,7 +2841,6 @@
extern __u32 sysctl_rmem_max;
extern int sysctl_tstamp_allow_data;
-extern int sysctl_optmem_max;
extern __u32 sysctl_wmem_default;
extern __u32 sysctl_rmem_default;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 08923ed..3c45b7c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -708,6 +708,8 @@
tcp_fast_path_on(tp);
}
+u32 tcp_delack_max(struct sock *sk);
+
/* Compute the actual rto_min value */
static inline u32 tcp_rto_min(struct sock *sk)
{
@@ -965,14 +967,16 @@
static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
{
- return likely(!TCP_SKB_CB(skb)->eor);
+ return likely(!TCP_SKB_CB(skb)->eor && !skb_frags_not_readable(skb));
}
static inline bool tcp_skb_can_collapse(const struct sk_buff *to,
const struct sk_buff *from)
{
return likely(tcp_skb_can_collapse_to(to) &&
- mptcp_skb_can_collapse(to, from));
+ mptcp_skb_can_collapse(to, from) &&
+ skb_frags_not_readable(to) ==
+ skb_frags_not_readable(from));
}
/* Events passed to congestion control interface */
@@ -1415,11 +1419,27 @@
static inline int tcp_win_from_space(const struct sock *sk, int space)
{
- int tcp_adv_win_scale = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale);
+ s64 scaled_space = (s64)space * tcp_sk(sk)->scaling_ratio;
- return tcp_adv_win_scale <= 0 ?
- (space>>(-tcp_adv_win_scale)) :
- space - (space>>tcp_adv_win_scale);
+ return scaled_space >> TCP_RMEM_TO_WIN_SCALE;
+}
+
+/* inverse of tcp_win_from_space() */
+static inline int tcp_space_from_win(const struct sock *sk, int win)
+{
+ u64 val = (u64)win << TCP_RMEM_TO_WIN_SCALE;
+
+ do_div(val, tcp_sk(sk)->scaling_ratio);
+ return val;
+}
+
+static inline void tcp_scaling_ratio_init(struct sock *sk)
+{
+ /* Assume a conservative default of 1200 bytes of payload per 4K page.
+ * This may be adjusted later in tcp_measure_rcv_mss().
+ */
+ tcp_sk(sk)->scaling_ratio = (1200 << TCP_RMEM_TO_WIN_SCALE) /
+ SKB_TRUESIZE(4096);
}
/* Note: caller must be prepared to deal with negative returns */
diff --git a/include/net/tls.h b/include/net/tls.h
index ea0aeae..fe95397 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -422,7 +422,7 @@
{
struct tls_rec *rec;
- rec = list_first_entry(&ctx->tx_list, struct tls_rec, list);
+ rec = list_first_entry_or_null(&ctx->tx_list, struct tls_rec, list);
if (!rec)
return false;
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index c649c7f..bd7e102 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2860,6 +2860,29 @@
__entry->end)
);
+TRACE_EVENT(ext4_update_sb,
+ TP_PROTO(struct super_block *sb, ext4_fsblk_t fsblk,
+ unsigned int flags),
+
+ TP_ARGS(sb, fsblk, flags),
+
+ TP_STRUCT__entry(
+ __field(dev_t, dev)
+ __field(ext4_fsblk_t, fsblk)
+ __field(unsigned int, flags)
+ ),
+
+ TP_fast_assign(
+ __entry->dev = sb->s_dev;
+ __entry->fsblk = fsblk;
+ __entry->flags = flags;
+ ),
+
+ TP_printk("dev %d,%d fsblk %llu flags %u",
+ MAJOR(__entry->dev), MINOR(__entry->dev),
+ __entry->fsblk, __entry->flags)
+);
+
#endif /* _TRACE_EXT4_H */
/* This part must be outside protection */
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 1f0a2b4..3930413 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -126,6 +126,13 @@
#define SO_BUF_LOCK 72
+
+#define SO_DEVMEM_DONTNEED 97
+#define SO_DEVMEM_HEADER 98
+#define SCM_DEVMEM_HEADER SO_DEVMEM_HEADER
+#define SO_DEVMEM_OFFSET 99
+#define SCM_DEVMEM_OFFSET SO_DEVMEM_OFFSET
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 27799ac..976d845 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -71,6 +71,7 @@
#define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
#define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
#define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
+#define AUDIT_CONTAINER_OP 1020 /* Define the container id and info */
#define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly uninteresting to kernel */
#define AUDIT_USER_AVC 1107 /* We filter this differently */
@@ -495,6 +496,7 @@
#define AUDIT_UID_UNSET (unsigned int)-1
#define AUDIT_SID_UNSET ((unsigned int)-1)
+#define AUDIT_CID_UNSET ((u64)-1)
/* audit_rule_data supports filter rules with both integer and string
* fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index b1523cb..0440f03 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -75,6 +75,17 @@
__u64 flags;
};
+struct dma_buf_create_pages_info {
+ __u64 pci_bdf[3];
+ __s32 dma_buf_fd;
+ __s32 create_page_pool;
+};
+
+struct dma_buf_pages_bind_rx_queue {
+ char ifname[IFNAMSIZ];
+ __u32 rxq_idx;
+};
+
#define DMA_BUF_SYNC_READ (1 << 0)
#define DMA_BUF_SYNC_WRITE (2 << 0)
#define DMA_BUF_SYNC_RW (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
@@ -95,4 +106,7 @@
#define DMA_BUF_SET_NAME_A _IOW(DMA_BUF_BASE, 1, __u32)
#define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64)
+#define DMA_BUF_CREATE_PAGES _IOW(DMA_BUF_BASE, 2, struct dma_buf_create_pages_info)
+#define DMA_BUF_PAGES_BIND_RX _IOW(DMA_BUF_BASE, 3, struct dma_buf_pages_bind_rx_queue)
+
#endif
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 2f86b2a..6b82ad2 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -110,5 +110,8 @@
#define AT_STATX_DONT_SYNC 0x4000 /* - Don't sync attributes with the server */
#define AT_RECURSIVE 0x8000 /* Apply to the entire subtree */
+#if defined(__KERNEL__)
+#define AT_GETATTR_NOSEC 0x80000000
+#endif
#endif /* _UAPI_LINUX_FCNTL_H */
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index 235e5b2..1e4eb77 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -23,6 +23,8 @@
#define FUTEX_CMP_REQUEUE_PI 12
#define FUTEX_LOCK_PI2 13
+#define GFUTEX_SWAP 60
+
#define FUTEX_PRIVATE_FLAG 128
#define FUTEX_CLOCK_REALTIME 256
#define FUTEX_CMD_MASK ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)
@@ -43,6 +45,8 @@
#define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
+#define GFUTEX_SWAP_PRIVATE (GFUTEX_SWAP | FUTEX_PRIVATE_FLAG)
+
/*
* Support for robust futexes: the kernel cleans up held futexes at
* thread exit time.
diff --git a/include/uapi/linux/sev-guest.h b/include/uapi/linux/sev-guest.h
new file mode 100644
index 0000000..256aaef
--- /dev/null
+++ b/include/uapi/linux/sev-guest.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Userspace interface for AMD SEV and SNP guest driver.
+ *
+ * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ *
+ * Author: Brijesh Singh <brijesh.singh@amd.com>
+ *
+ * SEV API specification is available at: https://developer.amd.com/sev/
+ */
+
+#ifndef __UAPI_LINUX_SEV_GUEST_H_
+#define __UAPI_LINUX_SEV_GUEST_H_
+
+#include <linux/types.h>
+
+struct snp_report_req {
+ /* user data that should be included in the report */
+ __u8 user_data[64];
+
+ /* The vmpl level to be included in the report */
+ __u32 vmpl;
+
+ /* Must be zero filled */
+ __u8 rsvd[28];
+};
+
+struct snp_report_resp {
+ /* response data, see SEV-SNP spec for the format */
+ __u8 data[4000];
+};
+
+struct snp_derived_key_req {
+ __u32 root_key_select;
+ __u32 rsvd;
+ __u64 guest_field_select;
+ __u32 vmpl;
+ __u32 guest_svn;
+ __u64 tcb_version;
+};
+
+struct snp_derived_key_resp {
+ /* response data, see SEV-SNP spec for the format */
+ __u8 data[64];
+};
+
+struct snp_guest_request_ioctl {
+ /* message version number (must be non-zero) */
+ __u8 msg_version;
+
+ /* Request and response structure address */
+ __u64 req_data;
+ __u64 resp_data;
+
+ /* firmware error code on failure (see psp-sev.h) */
+ __u64 fw_err;
+};
+
+struct snp_ext_report_req {
+ struct snp_report_req data;
+
+ /* where to copy the certificate blob */
+ __u64 certs_address;
+
+ /* length of the certificate blob */
+ __u32 certs_len;
+};
+
+#define SNP_GUEST_REQ_IOC_TYPE 'S'
+
+/* Get SNP attestation report */
+#define SNP_GET_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x0, struct snp_guest_request_ioctl)
+
+/* Get a derived key from the root */
+#define SNP_GET_DERIVED_KEY _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x1, struct snp_guest_request_ioctl)
+
+/* Get SNP extended report as defined in the GHCB specification version 2. */
+#define SNP_GET_EXT_REPORT _IOWR(SNP_GUEST_REQ_IOC_TYPE, 0x2, struct snp_guest_request_ioctl)
+
+#endif /* __UAPI_LINUX_SEV_GUEST_H_ */
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index 059b1a9..9e96fec 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -20,6 +20,17 @@
__kernel_size_t iov_len; /* Must be size_t (1003.1g) */
};
+struct devmemvec {
+ __u32 frag_offset;
+ __u32 frag_size;
+ __u32 frag_token; /* The token is only 31 bits long. The last bit is
+ reserved to indicate the end of pagelist. */
+};
+
+struct devmemtoken {
+ __u32 token_start;
+ __u32 token_count;
+};
/*
* UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1)
*/
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index ddaa45e..f69db4d 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -31,12 +31,13 @@
#include <linux/virtio_config.h>
/* The feature bitmap for virtio balloon */
-#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
-#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
-#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
-#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
-#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
-#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
+#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
+#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
+#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_MIN_BALLOON_SIZE 6 /* Min balloon size to remain even with OOM event*/
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -59,6 +60,8 @@
};
/* Stores PAGE_POISON if page poisoning is in use */
__le32 poison_val;
+ /* Number of Pages host wants Guest to give up even with delfate_on_oom event. */
+ __le32 min_balloon_pages;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
diff --git a/init/init_task.c b/init/init_task.c
index 2d02406..ba46dad 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -135,8 +135,7 @@
.thread_group = LIST_HEAD_INIT(init_task.thread_group),
.thread_node = LIST_HEAD_INIT(init_signals.thread_head),
#ifdef CONFIG_AUDIT
- .loginuid = INVALID_UID,
- .sessionid = AUDIT_SID_UNSET,
+ .audit = &init_struct_audit,
#endif
#ifdef CONFIG_PERF_EVENTS
.perf_event_mutex = __MUTEX_INITIALIZER(init_task.perf_event_mutex),
diff --git a/init/main.c b/init/main.c
index 3f3dc2a..af1793b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -97,6 +97,8 @@
#include <linux/cache.h>
#include <linux/rodata_test.h>
#include <linux/jump_label.h>
+#include <linux/mem_encrypt.h>
+#include <linux/audit.h>
#include <linux/kcsan.h>
#include <linux/init_syscalls.h>
#include <linux/stackdepot.h>
@@ -1129,6 +1131,7 @@
nsfs_init();
cpuset_init();
cgroup_init();
+ audit_task_init();
taskstats_init_early();
delayacct_init();
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index ff6c36ae..9a6bbca 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1111,6 +1111,21 @@
static struct kmem_cache *req_cachep;
static const struct file_operations io_uring_fops;
+static int __read_mostly sysctl_io_uring_disabled;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kernel_io_uring_disabled_table[] = {
+ {
+ .procname = "io_uring_disabled",
+ .data = &sysctl_io_uring_disabled,
+ .maxlen = sizeof(sysctl_io_uring_disabled),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_TWO,
+ },
+ {},
+};
+#endif
static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
{
@@ -10474,9 +10489,19 @@
return io_uring_create(entries, &p, params);
}
+static inline bool io_uring_allowed(void)
+{
+ int disabled = READ_ONCE(sysctl_io_uring_disabled);
+
+ return disabled == 0 || (disabled == 1 && capable(CAP_SYS_ADMIN));
+}
+
SYSCALL_DEFINE2(io_uring_setup, u32, entries,
struct io_uring_params __user *, params)
{
+ if (!io_uring_allowed())
+ return -EPERM;
+
return io_uring_setup(entries, params);
}
@@ -11106,6 +11131,11 @@
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
SLAB_ACCOUNT);
+
+#ifdef CONFIG_SYSCTL
+ register_sysctl_init("kernel", kernel_io_uring_disabled_table);
+#endif
+
return 0;
};
__initcall(io_uring_init);
diff --git a/kernel/audit.c b/kernel/audit.c
index 82b6fea..ffeb4fc 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -208,6 +208,75 @@
struct sk_buff *skb;
};
+static struct kmem_cache *audit_task_cache;
+
+void __init audit_task_init(void)
+{
+ audit_task_cache = kmem_cache_create("audit_task",
+ sizeof(struct audit_task_info),
+ 0, SLAB_PANIC, NULL);
+}
+
+/**
+ * audit_alloc - allocate an audit info block for a task
+ * @tsk: task
+ *
+ * Call audit_alloc_syscall to filter on the task information and
+ * allocate a per-task audit context if necessary. This is called from
+ * copy_process, so no lock is needed.
+ */
+int audit_alloc(struct task_struct *tsk)
+{
+ int ret = 0;
+ struct audit_task_info *info;
+
+ info = kmem_cache_alloc(audit_task_cache, GFP_KERNEL);
+ if (!info) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ info->loginuid = audit_get_loginuid(current);
+ info->sessionid = audit_get_sessionid(current);
+ info->contid = audit_get_contid(current);
+ tsk->audit = info;
+
+ ret = audit_alloc_syscall(tsk);
+ if (ret) {
+ tsk->audit = NULL;
+ kmem_cache_free(audit_task_cache, info);
+ }
+out:
+ return ret;
+}
+
+struct audit_task_info init_struct_audit = {
+ .loginuid = INVALID_UID,
+ .sessionid = AUDIT_SID_UNSET,
+ .contid = AUDIT_CID_UNSET,
+#ifdef CONFIG_AUDITSYSCALL
+ .ctx = NULL,
+#endif
+};
+
+/**
+ * audit_free - free per-task audit info
+ * @tsk: task whose audit info block to free
+ *
+ * Called from copy_process and do_exit
+ */
+void audit_free(struct task_struct *tsk)
+{
+ struct audit_task_info *info = tsk->audit;
+
+ audit_free_syscall(tsk);
+ /* Freeing the audit_task_info struct must be performed after
+ * audit_log_exit() due to need for loginuid and sessionid.
+ */
+ info = tsk->audit;
+ tsk->audit = NULL;
+ kmem_cache_free(audit_task_cache, info);
+}
+
/**
* auditd_test_task - Check to see if a given task is an audit daemon
* @task: the task to check
@@ -2381,8 +2450,8 @@
sessionid = (unsigned int)atomic_inc_return(&session_id);
}
- current->sessionid = sessionid;
- current->loginuid = loginuid;
+ current->audit->sessionid = sessionid;
+ current->audit->loginuid = loginuid;
out:
audit_log_set_loginuid(oldloginuid, loginuid, oldsessionid, sessionid, rc);
return rc;
@@ -2415,6 +2484,62 @@
return audit_signal_info_syscall(t);
}
+/*
+ * audit_set_contid - set current task's audit contid
+ * @task: target task
+ * @contid: contid value
+ *
+ * Returns 0 on success, -EPERM on permission failure.
+ *
+ * Called (set) from fs/proc/base.c::proc_contid_write().
+ */
+int audit_set_contid(struct task_struct *task, u64 contid)
+{
+ u64 oldcontid;
+ int rc = 0;
+ struct audit_buffer *ab;
+
+ task_lock(task);
+ /* Can't set if audit disabled */
+ if (!task->audit) {
+ task_unlock(task);
+ return -ENOPROTOOPT;
+ }
+ oldcontid = audit_get_contid(task);
+ read_lock(&tasklist_lock);
+ /* Don't allow the audit containerid to be unset */
+ if (!audit_contid_valid(contid))
+ rc = -EINVAL;
+ /* if we don't have caps, reject */
+ else if (!capable(CAP_AUDIT_CONTROL))
+ rc = -EPERM;
+ /* if task has children or is not single-threaded, deny */
+ else if (!list_empty(&task->children))
+ rc = -EBUSY;
+ else if (!(thread_group_leader(task) && thread_group_empty(task)))
+ rc = -EALREADY;
+ /* if contid is already set, deny */
+ else if (audit_contid_set(task))
+ rc = -ECHILD;
+ read_unlock(&tasklist_lock);
+ if (!rc)
+ task->audit->contid = contid;
+ task_unlock(task);
+
+ if (!audit_enabled)
+ return rc;
+
+ ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_OP);
+ if (!ab)
+ return rc;
+
+ audit_log_format(ab,
+ "op=set opid=%d contid=%llu old-contid=%llu",
+ task_tgid_nr(task), contid, oldcontid);
+ audit_log_end(ab);
+ return rc;
+}
+
/**
* audit_log_end - end one audit record
* @ab: the audit_buffer
diff --git a/kernel/audit.h b/kernel/audit.h
index b2ef4c0..6d6f6a3 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -138,6 +138,7 @@
kuid_t target_uid;
unsigned int target_sessionid;
u32 target_sid;
+ u64 target_cid;
char target_comm[TASK_COMM_LEN];
struct audit_tree_refs *trees, *first_trees;
@@ -258,6 +259,8 @@
extern unsigned int audit_serial(void);
extern int auditsc_get_stamp(struct audit_context *ctx,
struct timespec64 *t, unsigned int *serial);
+extern int audit_alloc_syscall(struct task_struct *tsk);
+extern void audit_free_syscall(struct task_struct *tsk);
extern void audit_put_watch(struct audit_watch *watch);
extern void audit_get_watch(struct audit_watch *watch);
@@ -299,6 +302,9 @@
extern struct list_head *audit_killed_trees(void);
#else /* CONFIG_AUDITSYSCALL */
#define auditsc_get_stamp(c, t, s) 0
+#define audit_alloc_syscall(t) 0
+#define audit_free_syscall(t) do { } while (0)
+
#define audit_put_watch(w) do { } while (0)
#define audit_get_watch(w) do { } while (0)
#define audit_to_watch(k, p, l, o) (-EINVAL)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index e7fedf5..72968a9 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -112,6 +112,7 @@
kuid_t target_uid[AUDIT_AUX_PIDS];
unsigned int target_sessionid[AUDIT_AUX_PIDS];
u32 target_sid[AUDIT_AUX_PIDS];
+ u64 target_cid[AUDIT_AUX_PIDS];
char target_comm[AUDIT_AUX_PIDS][TASK_COMM_LEN];
int pid_count;
};
@@ -937,23 +938,25 @@
return context;
}
-/**
- * audit_alloc - allocate an audit context block for a task
+/*
+ * audit_alloc_syscall - allocate an audit context block for a task
* @tsk: task
*
* Filter on the task information and allocate a per-task audit context
* if necessary. Doing so turns on system call auditing for the
- * specified task. This is called from copy_process, so no lock is
- * needed.
+ * specified task. This is called from copy_process via audit_alloc, so
+ * no lock is needed.
*/
-int audit_alloc(struct task_struct *tsk)
+int audit_alloc_syscall(struct task_struct *tsk)
{
struct audit_context *context;
enum audit_state state;
char *key = NULL;
- if (likely(!audit_ever_enabled))
+ if (likely(!audit_ever_enabled)) {
+ audit_set_context(tsk, NULL);
return 0; /* Return if not auditing. */
+ }
state = audit_filter_task(tsk, &key);
if (state == AUDIT_STATE_DISABLED) {
@@ -963,7 +966,7 @@
if (!(context = audit_alloc_context(state))) {
kfree(key);
- audit_log_lost("out of memory in audit_alloc");
+ audit_log_lost("out of memory in audit_alloc_syscall");
return -ENOMEM;
}
context->filterkey = key;
@@ -1665,14 +1668,15 @@
}
/**
- * __audit_free - free a per-task audit context
+ * audit_free_syscall - free per-task audit context info
* @tsk: task whose audit context block to free
*
- * Called from copy_process and do_exit
+ * Called from audit_free
*/
-void __audit_free(struct task_struct *tsk)
+void audit_free_syscall(struct task_struct *tsk)
{
- struct audit_context *context = tsk->audit_context;
+ struct audit_task_info *info = tsk->audit;
+ struct audit_context *context = info->ctx;
if (!context)
return;
@@ -1694,7 +1698,6 @@
if (context->current_state == AUDIT_STATE_RECORD)
audit_log_exit();
}
-
audit_set_context(tsk, NULL);
audit_free_context(context);
}
@@ -2470,6 +2473,7 @@
context->target_uid = task_uid(t);
context->target_sessionid = audit_get_sessionid(t);
security_task_getsecid_obj(t, &context->target_sid);
+ context->target_cid = audit_get_contid(t);
memcpy(context->target_comm, t->comm, TASK_COMM_LEN);
}
@@ -2497,6 +2501,7 @@
ctx->target_uid = t_uid;
ctx->target_sessionid = audit_get_sessionid(t);
security_task_getsecid_obj(t, &ctx->target_sid);
+ ctx->target_cid = audit_get_contid(t);
memcpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
return 0;
}
@@ -2518,6 +2523,7 @@
axp->target_uid[axp->pid_count] = t_uid;
axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
security_task_getsecid_obj(t, &axp->target_sid[axp->pid_count]);
+ axp->target_cid[axp->pid_count] = audit_get_contid(t);
memcpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
axp->pid_count++;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 5c7ed5d..48882c3 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -34,7 +34,7 @@
#include <linux/highmem.h>
#include <linux/gfp.h>
#include <linux/scatterlist.h>
-#include <linux/mem_encrypt.h>
+#include <linux/cc_platform.h>
#include <linux/set_memory.h>
#ifdef CONFIG_DEBUG_FS
#include <linux/debugfs.h>
@@ -564,7 +564,7 @@
if (!mem || !mem->nslabs)
panic("Can not allocate SWIOTLB buffer earlier and can't now provide you with the DMA bounce buffer");
- if (mem_encrypt_active())
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
pr_warn_once("Memory encryption is active and system is using DMA bounce buffers\n");
if (mapping_size > alloc_size) {
diff --git a/kernel/exit.c b/kernel/exit.c
index 80efdfd..e5895cb 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -64,6 +64,7 @@
#include <linux/rcuwait.h>
#include <linux/compat.h>
#include <linux/io_uring.h>
+#include <linux/security.h>
#include <linux/sysfs.h>
#include <linux/uaccess.h>
@@ -847,6 +848,8 @@
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
+
+ security_task_exit(tsk);
}
acct_collect(code, group_dead);
if (group_dead)
diff --git a/kernel/fork.c b/kernel/fork.c
index 753e641..d41e486 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2131,7 +2131,6 @@
posix_cputimers_init(&p->posix_cputimers);
p->io_context = NULL;
- audit_set_context(p, NULL);
cgroup_fork(p);
#ifdef CONFIG_NUMA
p->mempolicy = mpol_dup(p->mempolicy);
@@ -2438,6 +2437,7 @@
uprobe_copy_process(p, clone_flags);
copy_oom_score_adj(clone_flags, p);
+ security_task_post_alloc(p);
return p;
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index d422451..34f34a0 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -1655,16 +1655,16 @@
}
/*
- * Wake up waiters matching bitset queued on this futex (uaddr).
+ * Prepare wake queue matching bitset queued on this futex (uaddr).
*/
static int
-futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
+prepare_wake_q(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset,
+ struct wake_q_head *wake_q)
{
struct futex_hash_bucket *hb;
struct futex_q *this, *next;
union futex_key key = FUTEX_KEY_INIT;
int ret;
- DEFINE_WAKE_Q(wake_q);
if (!bitset)
return -EINVAL;
@@ -1692,14 +1692,29 @@
if (!(this->bitset & bitset))
continue;
- mark_wake_futex(&wake_q, this);
+ mark_wake_futex(wake_q, this);
if (++ret >= nr_wake)
break;
}
}
spin_unlock(&hb->lock);
- wake_up_q(&wake_q);
+
+ return ret;
+}
+
+/*
+ * Wake up waiters matching bitset queued on this futex (uaddr).
+ */
+static int
+futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
+{
+ int ret;
+ DEFINE_WAKE_Q(wake_q);
+
+ ret = prepare_wake_q(uaddr, flags, nr_wake, bitset, &wake_q);
+ if (ret > 0)
+ wake_up_q(&wake_q);
return ret;
}
@@ -2832,9 +2847,12 @@
* @hb: the futex hash bucket, must be locked by the caller
* @q: the futex_q to queue up on
* @timeout: the prepared hrtimer_sleeper, or null for no timeout
+ * @next: if present, wake next and hint to the scheduler that we'd
+ * prefer to execute it locally.
*/
static void futex_wait_queue_me(struct futex_hash_bucket *hb, struct futex_q *q,
- struct hrtimer_sleeper *timeout)
+ struct hrtimer_sleeper *timeout,
+ struct task_struct *next)
{
/*
* The task state is guaranteed to be set before another task can
@@ -2859,10 +2877,25 @@
* flagged for rescheduling. Only call schedule if there
* is no timeout, or if it has yet to expire.
*/
- if (!timeout || timeout->task)
+ if (!timeout || timeout->task) {
+ if (next) {
+#ifdef CONFIG_SMP
+ wake_up_swap(next);
+#else
+ wake_up_process(next);
+#endif
+ put_task_struct(next);
+ next = NULL;
+ }
freezable_schedule();
+ }
}
__set_current_state(TASK_RUNNING);
+
+ if (next) {
+ wake_up_process(next);
+ put_task_struct(next);
+ }
}
/**
@@ -2937,7 +2970,7 @@
}
static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
- ktime_t *abs_time, u32 bitset)
+ ktime_t *abs_time, u32 bitset, struct task_struct *next)
{
struct hrtimer_sleeper timeout, *to;
struct restart_block *restart;
@@ -2961,7 +2994,8 @@
goto out;
/* queue_me and wait for wakeup, timeout, or a signal. */
- futex_wait_queue_me(hb, &q, to);
+ futex_wait_queue_me(hb, &q, to, next);
+ next = NULL;
/* If we were woken (and unqueued), we succeeded, whatever. */
ret = 0;
@@ -2992,6 +3026,10 @@
ret = set_restart_fn(restart, futex_wait_restart);
out:
+ if (next) {
+ wake_up_process(next);
+ put_task_struct(next);
+ }
if (to) {
hrtimer_cancel(&to->timer);
destroy_hrtimer_on_stack(&to->timer);
@@ -3011,10 +3049,30 @@
}
restart->fn = do_no_restart_syscall;
- return (long)futex_wait(uaddr, restart->futex.flags,
- restart->futex.val, tp, restart->futex.bitset);
+ return (long)futex_wait(uaddr, restart->futex.flags, restart->futex.val,
+ tp, restart->futex.bitset, NULL);
}
+static int futex_swap(u32 __user *uaddr, unsigned int flags, u32 val,
+ ktime_t *abs_time, u32 __user *uaddr2)
+{
+ u32 bitset = FUTEX_BITSET_MATCH_ANY;
+ struct task_struct *next = NULL;
+ DEFINE_WAKE_Q(wake_q);
+ int ret;
+
+ ret = prepare_wake_q(uaddr2, flags, 1, bitset, &wake_q);
+ if (ret < 0)
+ return ret;
+ if (wake_q.first != WAKE_Q_TAIL) {
+ WARN_ON(ret != 1);
+ /* At most one wakee can be present. Pull it out. */
+ next = container_of(wake_q.first, struct task_struct, wake_q);
+ next->wake_q.next = NULL;
+ }
+
+ return futex_wait(uaddr, flags, val, abs_time, bitset, next);
+}
/*
* Userspace tried a 0 -> TID atomic transition of the futex value
@@ -3470,7 +3528,7 @@
}
/* Queue the futex_q, drop the hb lock, wait for wakeup. */
- futex_wait_queue_me(hb, &q, to);
+ futex_wait_queue_me(hb, &q, to, NULL);
switch (futex_requeue_pi_wakeup_sync(&q)) {
case Q_REQUEUE_PI_IGNORE:
@@ -3979,7 +4037,7 @@
val3 = FUTEX_BITSET_MATCH_ANY;
fallthrough;
case FUTEX_WAIT_BITSET:
- return futex_wait(uaddr, flags, val, timeout, val3);
+ return futex_wait(uaddr, flags, val, timeout, val3, NULL);
case FUTEX_WAKE:
val3 = FUTEX_BITSET_MATCH_ANY;
fallthrough;
@@ -4006,6 +4064,8 @@
uaddr2);
case FUTEX_CMP_REQUEUE_PI:
return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 1);
+ case GFUTEX_SWAP:
+ return futex_swap(uaddr, flags, val, timeout, uaddr2);
}
return -ENOSYS;
}
@@ -4018,6 +4078,7 @@
case FUTEX_LOCK_PI2:
case FUTEX_WAIT_BITSET:
case FUTEX_WAIT_REQUEUE_PI:
+ case GFUTEX_SWAP:
return true;
}
return false;
@@ -4030,7 +4091,7 @@
return -EINVAL;
*t = timespec64_to_ktime(*ts);
- if (cmd == FUTEX_WAIT)
+ if (cmd == FUTEX_WAIT || cmd == GFUTEX_SWAP)
*t = ktime_add_safe(ktime_get(), *t);
else if (cmd != FUTEX_LOCK_PI && !(op & FUTEX_CLOCK_REALTIME))
*t = timens_ktime_to_host(CLOCK_MONOTONIC, *t);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4cc6897..9d6f730 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1358,12 +1358,11 @@
* kthread_use_mm - make the calling kthread operate on an address space
* @mm: address space to operate on
*/
-void kthread_use_mm(struct mm_struct *mm)
+void use_mm(struct mm_struct *mm)
{
struct mm_struct *active_mm;
struct task_struct *tsk = current;
- WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
WARN_ON_ONCE(tsk->mm);
task_lock(tsk);
@@ -1396,6 +1395,16 @@
mmdrop(active_mm);
else
smp_mb();
+}
+EXPORT_SYMBOL_GPL(use_mm);
+
+void kthread_use_mm(struct mm_struct *mm)
+{
+ struct task_struct *tsk = current;
+
+ WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
+
+ use_mm(mm);
to_kthread(tsk)->oldfs = force_uaccess_begin();
}
@@ -1410,10 +1419,18 @@
struct task_struct *tsk = current;
WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
- WARN_ON_ONCE(!tsk->mm);
-
force_uaccess_end(to_kthread(tsk)->oldfs);
+ unuse_mm(mm);
+}
+EXPORT_SYMBOL_GPL(kthread_unuse_mm);
+
+void unuse_mm(struct mm_struct *mm)
+{
+ struct task_struct *tsk = current;
+
+ WARN_ON_ONCE(!tsk->mm);
+
task_lock(tsk);
/*
* When a kthread stops operating on an address space, the loop
@@ -1432,7 +1449,7 @@
local_irq_enable();
task_unlock(tsk);
}
-EXPORT_SYMBOL_GPL(kthread_unuse_mm);
+EXPORT_SYMBOL_GPL(unuse_mm);
#ifdef CONFIG_BLK_CGROUP
/**
diff --git a/kernel/module_signing.c b/kernel/module_signing.c
index 8723ae7..9b08058 100644
--- a/kernel/module_signing.c
+++ b/kernel/module_signing.c
@@ -38,8 +38,15 @@
modlen -= sig_len + sizeof(ms);
info->len = modlen;
- return verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
- VERIFY_USE_SECONDARY_KEYRING,
- VERIFYING_MODULE_SIGNATURE,
- NULL, NULL);
+ ret = verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
+ VERIFY_USE_SECONDARY_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) {
+ ret = verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
+ VERIFY_USE_PLATFORM_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ }
+ return ret;
}
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 25b8ea9..9219d03 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4216,6 +4216,11 @@
}
EXPORT_SYMBOL(wake_up_process);
+int wake_up_swap(struct task_struct *tsk)
+{
+ return try_to_wake_up(tsk, TASK_NORMAL, WF_CURRENT_CPU);
+}
+
int wake_up_state(struct task_struct *p, unsigned int state)
{
return try_to_wake_up(p, state, 0);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4a13934..2e21067 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7147,6 +7147,8 @@
/* SD_flags and WF_flags share the first nibble */
int sd_flag = wake_flags & 0xF;
+ if ((wake_flags & WF_CURRENT_CPU) && cpumask_test_cpu(cpu, p->cpus_ptr))
+ return cpu;
/*
* required for stable ->cpus_allowed
*/
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 5061093..b1500ef 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2058,6 +2058,8 @@
#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
+#define WF_ON_CPU 0x40 /* Wakee is on_cpu */
+#define WF_CURRENT_CPU 0x200 /* Prefer to move wakee to the current CPU */
#ifdef CONFIG_SMP
static_assert(WF_EXEC == SD_BALANCE_EXEC);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 41f4709..3989514 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -80,21 +80,6 @@
wake_up_process(tsk);
}
-/*
- * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness,
- * unless we're doing some of the synchronous softirqs.
- */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
-static bool ksoftirqd_running(unsigned long pending)
-{
- struct task_struct *tsk = __this_cpu_read(ksoftirqd);
-
- if (pending & SOFTIRQ_NOW_MASK)
- return false;
- return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
-}
-
#ifdef CONFIG_TRACE_IRQFLAGS
DEFINE_PER_CPU(int, hardirqs_enabled);
DEFINE_PER_CPU(int, hardirq_context);
@@ -236,7 +221,7 @@
goto out;
pending = local_softirq_pending();
- if (!pending || ksoftirqd_running(pending))
+ if (!pending)
goto out;
/*
@@ -419,9 +404,6 @@
static inline void invoke_softirq(void)
{
- if (ksoftirqd_running(local_softirq_pending()))
- return;
-
if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
/*
@@ -455,7 +437,7 @@
pending = local_softirq_pending();
- if (pending && !ksoftirqd_running(pending))
+ if (pending)
do_softirq_own_stack();
local_irq_restore(flags);
diff --git a/mm/swap.c b/mm/swap.c
index af3cad4..45bbb27 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -37,6 +37,7 @@
#include <linux/page_idle.h>
#include <linux/local_lock.h>
#include <linux/buffer_head.h>
+#include <linux/dma-buf.h>
#include "internal.h"
@@ -114,6 +115,10 @@
void __put_page(struct page *page)
{
if (is_zone_device_page(page)) {
+ if (is_dma_buf_page(page)) {
+ page->pgmap->ops->page_free(page);
+ return;
+ }
put_dev_pagemap(page->pgmap);
/*
@@ -1142,6 +1147,11 @@
if (WARN_ON_ONCE(!page_is_devmap_managed(page)))
return;
+ if (WARN_ON_ONCE(is_dma_buf_page(page))) {
+ if (put_page_testzero(page))
+ __put_page(page);
+ return;
+ }
count = page_ref_dec_return(page);
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 4953abe..e21b006 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -305,9 +305,10 @@
static int bpf_sk_storage_charge(struct bpf_local_storage_map *smap,
void *owner, u32 size)
{
- int optmem_max = READ_ONCE(sysctl_optmem_max);
struct sock *sk = (struct sock *)owner;
+ int optmem_max;
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
/* same check as in sock_kmalloc() */
if (size <= optmem_max &&
atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 1ff8241..335c8c5 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -431,6 +431,9 @@
return 0;
}
+ if (skb_frags_not_readable(skb))
+ goto short_copy;
+
/* Copy paged appendix. Hmm... why does this look so complicated? */
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
diff --git a/net/core/dev.c b/net/core/dev.c
index f80bc2c..6bad91b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2234,6 +2234,16 @@
return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
}
+static inline int deliver_skb_tx(struct sk_buff *skb,
+ struct packet_type *pt_prev,
+ struct net_device *orig_dev)
+{
+ if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
+ return -ENOMEM;
+ refcount_inc(&skb->users);
+ return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
+}
+
static inline void deliver_ptype_list_skb(struct sk_buff *skb,
struct packet_type **pt,
struct net_device *orig_dev,
@@ -2301,7 +2311,7 @@
continue;
if (pt_prev) {
- deliver_skb(skb2, pt_prev, skb->dev);
+ deliver_skb_tx(skb2, pt_prev, skb->dev);
pt_prev = ptype;
continue;
}
@@ -2338,7 +2348,7 @@
}
out_unlock:
if (pt_prev) {
- if (!skb_orphan_frags_rx(skb2, GFP_ATOMIC))
+ if (!skb_orphan_frags(skb2, GFP_ATOMIC))
pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
else
kfree_skb(skb2);
@@ -4721,6 +4731,56 @@
return rxqueue;
}
+struct page *
+__netdev_rxq_alloc_page_from_dmabuf_pool(struct netdev_rx_queue *rxq,
+ unsigned int order)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct file *dmabuf_pages_file;
+ unsigned long kvirt;
+ struct page *pg;
+ size_t offset;
+
+ rcu_read_lock();
+ dmabuf_pages_file = rcu_dereference(rxq->dmabuf_pages);
+ if (!dmabuf_pages_file || !get_file_rcu(dmabuf_pages_file)) {
+ rcu_read_unlock();
+ return NULL;
+ }
+ rcu_read_unlock();
+
+ priv = dmabuf_pages_file->private_data;
+ kvirt = gen_pool_alloc(priv->page_pool, PAGE_SIZE * (1 << order));
+ if (!kvirt)
+ goto out_err_put;
+
+ if (!PAGE_ALIGNED(kvirt)) {
+ net_err_ratelimited("dmabuf page pool allocation not aligned");
+ gen_pool_free(priv->page_pool, kvirt, PAGE_SIZE * (1 << order));
+ goto out_err_put;
+ }
+
+ /* - 1 is due to the fact that we want to avoid 0 virt address
+ * returned from the gen_pool. See comment in dma_buf_create_pages()
+ * for details.
+ */
+ offset = (kvirt >> PAGE_SHIFT) - 1;
+ pg = &priv->pages[offset];
+
+ /* pg->private holds the order of the page for freeing. */
+ pg->private = order;
+ percpu_ref_get(&pg->pgmap->ref);
+ fput(dmabuf_pages_file);
+ get_page(pg);
+ return pg;
+
+out_err_put:
+ fput(dmabuf_pages_file);
+
+ return NULL;
+}
+EXPORT_SYMBOL(__netdev_rxq_alloc_page_from_dmabuf_pool);
+
u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
struct bpf_prog *xdp_prog)
{
@@ -6040,6 +6100,9 @@
{
struct skb_shared_info *pinfo = skb_shinfo(skb);
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return;
+
BUG_ON(skb->end - skb->tail < grow);
memcpy(skb_tail_pointer(skb), NAPI_GRO_CB(skb)->frag0, grow);
@@ -6169,7 +6232,7 @@
pull:
grow = skb_gro_offset(skb) - skb_headlen(skb);
- if (grow > 0)
+ if (grow > 0 && !skb_frags_not_readable(skb))
gro_pull_from_frag0(skb, grow);
ok:
if (gro_list->count) {
diff --git a/net/core/filter.c b/net/core/filter.c
index 457d1a1..3dd6a21 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1212,8 +1212,8 @@
*/
static bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
{
+ int optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
u32 filter_size = bpf_prog_size(fp->prog->len);
- int optmem_max = READ_ONCE(sysctl_optmem_max);
/* same check as in sock_kmalloc() */
if (filter_size <= optmem_max &&
@@ -1544,12 +1544,13 @@
int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
{
struct bpf_prog *prog = __get_filter(fprog, sk);
- int err;
+ int err, optmem_max;
if (IS_ERR(prog))
return PTR_ERR(prog);
- if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max))
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
+ if (bpf_prog_size(prog->len) > optmem_max)
err = -ENOMEM;
else
err = reuseport_attach_prog(sk, prog);
@@ -1588,7 +1589,7 @@
int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
{
struct bpf_prog *prog;
- int err;
+ int err, optmem_max;
if (sock_flag(sk, SOCK_FILTER_LOCKED))
return -EPERM;
@@ -1616,7 +1617,8 @@
}
} else {
/* BPF_PROG_TYPE_SOCKET_FILTER */
- if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max)) {
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
+ if (bpf_prog_size(prog->len) > optmem_max) {
err = -ENOMEM;
goto err_prog_put;
}
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index dcddc54..62eb90a 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -362,6 +362,11 @@
static int __net_init net_defaults_init_net(struct net *net)
{
net->core.sysctl_somaxconn = SOMAXCONN;
+ /* Limits per socket sk_omem_alloc usage.
+ * TCP zerocopy regular usage needs 128 KB.
+ */
+ net->core.sysctl_optmem_max = 128 * 1024;
+
return 0;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index a424318..105fea4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -864,11 +864,16 @@
skb_frag_size(frag), p, p_off, p_len,
copied) {
seg_len = min_t(int, p_len, len);
- vaddr = kmap_atomic(p);
- print_hex_dump(level, "skb frag: ",
- DUMP_PREFIX_OFFSET,
- 16, 1, vaddr + p_off, seg_len, false);
- kunmap_atomic(vaddr);
+ if (!is_dma_buf_page(p)) {
+ vaddr = kmap_atomic(p);
+ print_hex_dump(level, "skb frag: ",
+ DUMP_PREFIX_OFFSET, 16, 1,
+ vaddr + p_off, seg_len, false);
+ kunmap_atomic(vaddr);
+ } else {
+ printk("%sskb frag: devmem", level);
+ }
+
len -= seg_len;
if (!len)
break;
@@ -1348,6 +1353,37 @@
}
EXPORT_SYMBOL_GPL(skb_zerocopy_iter_dgram);
+int skb_devmem_iter_stream(struct sock *sk, struct sk_buff *skb,
+ struct iov_iter *iov_iter, int len,
+ struct ubuf_info *uarg)
+{
+ struct ubuf_info *orig_uarg = skb_zcopy(skb);
+ struct iov_iter orig_iter = *iov_iter;
+ int err, orig_len = skb->len;
+
+ /* An skb can only point to one uarg. This edge case happens when
+ * TCP appends to an skb, but zerocopy_realloc triggered a new alloc.
+ */
+ if (orig_uarg && uarg != orig_uarg)
+ return -EEXIST;
+
+ err = __zerocopy_sg_from_iter(sk, skb, iov_iter, len);
+ if (err == -EFAULT || (err == -EMSGSIZE && skb->len == orig_len)) {
+ struct sock *save_sk = skb->sk;
+
+ /* Streams do not free skb on error. Reset to prev state. */
+ *iov_iter = orig_iter;
+ skb->sk = sk;
+ ___pskb_trim(skb, orig_len);
+ skb->sk = save_sk;
+ return err;
+ }
+
+ skb_zcopy_set(skb, uarg, NULL);
+ return skb->len - orig_len;
+}
+EXPORT_SYMBOL_GPL(skb_devmem_iter_stream);
+
int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
struct msghdr *msg, int len,
struct ubuf_info *uarg)
@@ -1424,6 +1460,9 @@
if (skb_shared(skb) || skb_unclone(skb, gfp_mask))
return -EINVAL;
+ if (skb_frags_not_readable(skb))
+ return -EFAULT;
+
if (!num_frags)
goto release;
@@ -1585,8 +1624,10 @@
{
int headerlen = skb_headroom(skb);
unsigned int size = skb_end_offset(skb) + skb->data_len;
- struct sk_buff *n = __alloc_skb(size, gfp_mask,
- skb_alloc_rx_flag(skb), NUMA_NO_NODE);
+ struct sk_buff *n = skb_frags_not_readable(skb) ? NULL :
+ __alloc_skb(size, gfp_mask,
+ skb_alloc_rx_flag(skb),
+ NUMA_NO_NODE);
if (!n)
return NULL;
@@ -1899,9 +1940,10 @@
/*
* Allocate the copy buffer
*/
- struct sk_buff *n = __alloc_skb(newheadroom + skb->len + newtailroom,
- gfp_mask, skb_alloc_rx_flag(skb),
- NUMA_NO_NODE);
+ struct sk_buff *n = skb_frags_not_readable(skb) ? NULL :
+ __alloc_skb(newheadroom + skb->len + newtailroom,
+ gfp_mask, skb_alloc_rx_flag(skb),
+ NUMA_NO_NODE);
int oldheadroom = skb_headroom(skb);
int head_copy_len, head_copy_off;
@@ -2218,6 +2260,9 @@
*/
int i, k, eat = (skb->tail + delta) - skb->end;
+ if (skb_frags_not_readable(skb))
+ return NULL;
+
if (eat > 0 || skb_cloned(skb)) {
if (pskb_expand_head(skb, 0, eat > 0 ? eat + 128 : 0,
GFP_ATOMIC))
@@ -2371,6 +2416,9 @@
to += copy;
}
+ if (skb_frags_not_readable(skb))
+ goto fault;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
skb_frag_t *f = &skb_shinfo(skb)->frags[i];
@@ -2444,7 +2492,7 @@
{
struct page_frag *pfrag = sk_page_frag(sk);
- if (!sk_page_frag_refill(sk, pfrag))
+ if (!sk_page_frag_refill(sk, pfrag) || is_dma_buf_page(pfrag->page))
return NULL;
*len = min_t(unsigned int, *len, pfrag->size - pfrag->offset);
@@ -2773,6 +2821,9 @@
from += copy;
}
+ if (skb_frags_not_readable(skb))
+ goto fault;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
int end;
@@ -2852,6 +2903,9 @@
pos = copy;
}
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return -EFAULT;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
@@ -2952,6 +3006,9 @@
pos = copy;
}
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return -EFAULT;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
@@ -3411,7 +3468,9 @@
skb_shinfo(skb1)->frags[i] = skb_shinfo(skb)->frags[i];
skb_shinfo(skb1)->nr_frags = skb_shinfo(skb)->nr_frags;
+ skb1->devmem = skb->devmem;
skb_shinfo(skb)->nr_frags = 0;
+ skb->devmem = 0;
skb1->data_len = skb->data_len;
skb1->len += skb1->data_len;
skb->data_len = 0;
@@ -3425,11 +3484,13 @@
{
int i, k = 0;
const int nfrags = skb_shinfo(skb)->nr_frags;
+ const bool devmem = skb->devmem;
skb_shinfo(skb)->nr_frags = 0;
skb1->len = skb1->data_len = skb->len - len;
skb->len = len;
skb->data_len = len - pos;
+ skb->devmem = skb1->devmem = 0;
for (i = 0; i < nfrags; i++) {
int size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
@@ -3458,6 +3519,12 @@
pos += size;
}
skb_shinfo(skb1)->nr_frags = k;
+
+ if (skb_shinfo(skb)->nr_frags)
+ skb->devmem = devmem;
+
+ if (skb_shinfo(skb1)->nr_frags)
+ skb1->devmem = devmem;
}
/**
@@ -3695,6 +3762,9 @@
return block_limit - abs_offset;
}
+ if (skb_frags_not_readable(st->cur_skb))
+ return 0;
+
if (st->frag_idx == 0 && !st->frag_data)
st->stepped_offset += skb_headlen(st->cur_skb);
@@ -4822,6 +4892,9 @@
bool icmp_next = false;
unsigned long flags;
+ if (skb_queue_empty_lockless(q))
+ return NULL;
+
spin_lock_irqsave(&q->lock, flags);
skb = __skb_dequeue(q);
if (skb && (skb_next = skb_peek(q))) {
@@ -5430,7 +5503,10 @@
(from->pp_recycle && skb_cloned(from)))
return false;
- if (len <= skb_tailroom(to)) {
+ if (skb_frags_not_readable(from) != skb_frags_not_readable(to))
+ return false;
+
+ if (len <= skb_tailroom(to) && !skb_frags_not_readable(from)) {
if (len)
BUG_ON(skb_copy_bits(from, 0, skb_put(to, len), len));
*delta_truesize = 0;
@@ -5746,6 +5822,9 @@
if (!pskb_may_pull(skb, write_len))
return -ENOMEM;
+ if (skb_frags_not_readable(skb))
+ return -EFAULT;
+
if (!skb_cloned(skb) || skb_clone_writable(skb, write_len))
return 0;
@@ -6419,8 +6498,8 @@
void skb_condense(struct sk_buff *skb)
{
if (skb->data_len) {
- if (skb->data_len > skb->end - skb->tail ||
- skb_cloned(skb))
+ if (skb->data_len > skb->end - skb->tail || skb_cloned(skb) ||
+ skb_frags_not_readable(skb))
return;
/* Nice, we can free page frag(s) right now */
diff --git a/net/core/sock.c b/net/core/sock.c
index 6f761f3..4e597ba 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -276,10 +276,6 @@
__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
-/* Maximal space eaten by iovec or ancillary data plus some space */
-int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
-EXPORT_SYMBOL(sysctl_optmem_max);
-
int sysctl_tstamp_allow_data __read_mostly = 1;
DEFINE_STATIC_KEY_FALSE(memalloc_socks_key);
@@ -981,6 +977,76 @@
valbool = val ? 1 : 0;
+ /* handle options that do not need to lock the socket */
+ switch (optname) {
+ case SO_DEVMEM_DONTNEED:
+ {
+ struct devmemtoken singleton_token, *tokens;
+ unsigned int num_tokens, i, j, k, pg_num = 0;
+ struct page *pgs[16];
+
+ if (sk->sk_type != SOCK_STREAM || sk->sk_protocol != IPPROTO_TCP)
+ return -EBADF;
+
+ if (optlen < sizeof(*tokens) || optlen % sizeof(*tokens))
+ return -EINVAL;
+
+ if (optlen == sizeof(*tokens)) {
+ if (copy_from_sockptr(&singleton_token, optval,
+ sizeof(*tokens))) {
+ return -EFAULT;
+ }
+ num_tokens = 1;
+ tokens = &singleton_token;
+ } else {
+ if (optlen > 4096)
+ return -EINVAL;
+ num_tokens = optlen / sizeof(*tokens);
+ tokens = kmalloc(optlen, GFP_KERNEL);
+ if (!tokens)
+ return -ENOMEM;
+ if (copy_from_sockptr(tokens, optval, optlen)) {
+ kfree(tokens);
+ return -EFAULT;
+ }
+ }
+
+ ret = 0;
+
+ xa_lock_bh(&sk->sk_pagepool);
+ for (i = 0; i < num_tokens; i++) {
+ for (j = 0; j < tokens[i].token_count; j++) {
+ struct page *pg = __xa_erase(&sk->sk_pagepool,
+ tokens[i].token_start + j);
+
+ if (unlikely(!pg)) {
+ /* -EINTR here notifies the userspace
+ * that not all tokens passed to it have
+ * been freed.
+ */
+ ret = -EINTR;
+ goto unlock_pagepool;
+ }
+ pgs[pg_num++] = pg;
+ if (pg_num == ARRAY_SIZE(pgs)) {
+ xa_unlock_bh(&sk->sk_pagepool);
+ for (k = 0; k < pg_num; k++)
+ put_page(pgs[k]);
+ pg_num = 0;
+ xa_lock_bh(&sk->sk_pagepool);
+ }
+ }
+ }
+unlock_pagepool:
+ xa_unlock_bh(&sk->sk_pagepool);
+ for (k = 0; k < pg_num; k++)
+ put_page(pgs[k]);
+ if (num_tokens > 1)
+ kfree(tokens);
+
+ return ret;
+ }
+ }
lock_sock(sk);
switch (optname) {
@@ -1371,6 +1437,7 @@
~SOCK_BUF_LOCK_MASK);
break;
+
default:
ret = -ENOPROTOOPT;
break;
@@ -2401,7 +2468,7 @@
/* small safe race: SKB_TRUESIZE may differ from final skb->truesize */
if (atomic_read(&sk->sk_omem_alloc) + SKB_TRUESIZE(size) >
- READ_ONCE(sysctl_optmem_max))
+ READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return NULL;
skb = alloc_skb(size, priority);
@@ -2419,7 +2486,7 @@
*/
void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
{
- int optmem_max = READ_ONCE(sysctl_optmem_max);
+ int optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
if ((unsigned int)size <= optmem_max &&
atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
@@ -2580,6 +2647,12 @@
return -EINVAL;
sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
break;
+ case SCM_DEVMEM_OFFSET:
+ if (cmsg->cmsg_len != CMSG_LEN(2 * sizeof(u32)))
+ return -EINVAL;
+ sockc->devmem_fd = ((u32 *)CMSG_DATA(cmsg))[0];
+ sockc->devmem_offset = ((u32 *)CMSG_DATA(cmsg))[1];
+ break;
/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
case SCM_RIGHTS:
case SCM_CREDENTIALS:
@@ -2742,6 +2815,9 @@
{
spin_lock_bh(&sk->sk_lock.slock);
__release_sock(sk);
+
+ if (sk->sk_prot->release_cb)
+ sk->sk_prot->release_cb(sk);
spin_unlock_bh(&sk->sk_lock.slock);
}
@@ -3264,9 +3340,6 @@
if (sk->sk_backlog.tail)
__release_sock(sk);
- /* Warning : release_cb() might need to release sk ownership,
- * ie call sock_release_ownership(sk) before us.
- */
if (sk->sk_prot->release_cb)
sk->sk_prot->release_cb(sk);
@@ -3413,7 +3486,8 @@
{
struct sock *sk = sock->sk;
- return sk->sk_prot->getsockopt(sk, level, optname, optval, optlen);
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ return READ_ONCE(sk->sk_prot)->getsockopt(sk, level, optname, optval, optlen);
}
EXPORT_SYMBOL(sock_common_getsockopt);
@@ -3440,7 +3514,8 @@
{
struct sock *sk = sock->sk;
- return sk->sk_prot->setsockopt(sk, level, optname, optval, optlen);
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ return READ_ONCE(sk->sk_prot)->setsockopt(sk, level, optname, optval, optlen);
}
EXPORT_SYMBOL(sock_common_setsockopt);
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index ed20cbdd..ebddbd5 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -447,13 +447,6 @@
.proc_handler = proc_dointvec,
},
{
- .procname = "optmem_max",
- .data = &sysctl_optmem_max,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec
- },
- {
.procname = "tstamp_allow_data",
.data = &sysctl_tstamp_allow_data,
.maxlen = sizeof(int),
@@ -595,6 +588,14 @@
.extra1 = SYSCTL_ZERO,
.proc_handler = proc_dointvec_minmax
},
+ {
+ .procname = "optmem_max",
+ .data = &init_net.core.sysctl_optmem_max,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .extra1 = SYSCTL_ZERO,
+ .proc_handler = proc_dointvec_minmax
+ },
{ }
};
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 487f759..110ac22d 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -565,22 +565,27 @@
int addr_len, int flags)
{
struct sock *sk = sock->sk;
+ const struct proto *prot;
int err;
if (addr_len < sizeof(uaddr->sa_family))
return -EINVAL;
+
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
+
if (uaddr->sa_family == AF_UNSPEC)
- return sk->sk_prot->disconnect(sk, flags);
+ return prot->disconnect(sk, flags);
if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
- err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
+ err = prot->pre_connect(sk, uaddr, addr_len);
if (err)
return err;
}
if (data_race(!inet_sk(sk)->inet_num) && inet_autobind(sk))
return -EAGAIN;
- return sk->sk_prot->connect(sk, uaddr, addr_len);
+ return prot->connect(sk, uaddr, addr_len);
}
EXPORT_SYMBOL(inet_dgram_connect);
@@ -743,10 +748,11 @@
int inet_accept(struct socket *sock, struct socket *newsock, int flags,
bool kern)
{
- struct sock *sk1 = sock->sk;
+ struct sock *sk1 = sock->sk, *sk2;
int err = -EINVAL;
- struct sock *sk2 = sk1->sk_prot->accept(sk1, flags, &err, kern);
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ sk2 = READ_ONCE(sk1->sk_prot)->accept(sk1, flags, &err, kern);
if (!sk2)
goto do_err;
@@ -834,12 +840,15 @@
size_t size, int flags)
{
struct sock *sk = sock->sk;
+ const struct proto *prot;
if (unlikely(inet_send_prepare(sk)))
return -EAGAIN;
- if (sk->sk_prot->sendpage)
- return sk->sk_prot->sendpage(sk, page, offset, size, flags);
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
+ if (prot->sendpage)
+ return prot->sendpage(sk, page, offset, size, flags);
return sock_no_sendpage(sock, page, offset, size, flags);
}
EXPORT_SYMBOL(inet_sendpage);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 3620dc7..ec4c5d6 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -779,7 +779,7 @@
if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max))
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return -ENOBUFS;
gsf = memdup_sockptr(optval, optlen);
@@ -815,7 +815,7 @@
if (optlen < size0)
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max) - 4)
return -ENOBUFS;
p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -1238,7 +1238,7 @@
if (optlen < IP_MSFILTER_SIZE(0))
goto e_inval;
- if (optlen > READ_ONCE(sysctl_optmem_max)) {
+ if (optlen > READ_ONCE(net->core.sysctl_optmem_max)) {
err = -ENOBUFS;
break;
}
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 1f22e72..15cec39 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1329,6 +1329,15 @@
.extra1 = SYSCTL_ZERO,
},
{
+ .procname = "tcp_backlog_ack_defer",
+ .data = &init_net.ipv4.sysctl_tcp_backlog_ack_defer,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
.procname = "tcp_reflect_tos",
.data = &init_net.ipv4.sysctl_tcp_reflect_tos,
.maxlen = sizeof(u8),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 16fd3da68..ae4b0c2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -280,6 +280,7 @@
#include <linux/uaccess.h>
#include <asm/ioctls.h>
#include <net/busy_poll.h>
+#include <linux/dma-buf.h>
/* Track pending CMSGs. */
enum {
@@ -461,9 +462,11 @@
WRITE_ONCE(sk->sk_sndbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_wmem[1]));
WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[1]));
+ tcp_scaling_ratio_init(sk);
sk_sockets_allocated_inc(sk);
sk->sk_route_forced_caps = NETIF_F_GSO;
+ xa_init_flags(&sk->sk_pagepool, XA_FLAGS_ALLOC1);
}
EXPORT_SYMBOL(tcp_init_sock);
@@ -1204,6 +1207,49 @@
return err;
}
+static int tcp_prepare_devmem_data(struct msghdr *msg, int devmem_fd,
+ unsigned int devmem_offset,
+ struct file **devmem_file,
+ struct iov_iter *devmem_tx_iter, size_t size)
+{
+ struct dma_buf_pages_file_priv *priv;
+ int err = 0;
+
+ *devmem_file = fget_raw(devmem_fd);
+ if (!*devmem_file) {
+ err = -EINVAL;
+ goto err;
+ }
+
+ if (!is_dma_buf_pages_file(*devmem_file)) {
+ err = -EBADF;
+ goto err_fput;
+ }
+
+ priv = (*devmem_file)->private_data;
+ if (!priv) {
+ WARN_ONCE(!priv, "dma_buf_pages_file has no private_data");
+ err = -EINTR;
+ goto err_fput;
+ }
+
+ if (devmem_offset + size > priv->dmabuf->size) {
+ err = -ENOSPC;
+ goto err_fput;
+ }
+
+ *devmem_tx_iter = priv->tx_iter;
+ iov_iter_advance(devmem_tx_iter, devmem_offset);
+
+ return 0;
+
+err_fput:
+ fput(*devmem_file);
+ *devmem_file = NULL;
+err:
+ return err;
+}
+
int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
{
struct tcp_sock *tp = tcp_sk(sk);
@@ -1215,6 +1261,8 @@
int process_backlog = 0;
bool zc = false;
long timeo;
+ struct file *devmem_file = NULL;
+ struct iov_iter devmem_tx_iter;
flags = msg->msg_flags;
@@ -1277,6 +1325,14 @@
}
}
+ if (sockc.devmem_fd) {
+ err = tcp_prepare_devmem_data(msg, sockc.devmem_fd,
+ sockc.devmem_offset, &devmem_file,
+ &devmem_tx_iter, size);
+ if (err)
+ goto out_err;
+ }
+
/* This should be in poll */
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -1375,7 +1431,17 @@
if (!sk_wmem_schedule(sk, copy))
goto wait_for_space;
- err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
+ if (devmem_file) {
+ err = skb_devmem_iter_stream(sk, skb,
+ &devmem_tx_iter,
+ copy, uarg);
+ skb->devmem = 1;
+ if (err > 0)
+ iov_iter_advance(&msg->msg_iter, err);
+ } else {
+ err = skb_zerocopy_iter_stream(sk, skb, msg,
+ copy, uarg);
+ }
if (err == -EMSGSIZE || err == -EEXIST) {
tcp_mark_push(tp, skb);
goto new_segment;
@@ -1429,6 +1495,8 @@
}
out_nopush:
net_zcopy_put(uarg);
+ if (devmem_file)
+ fput(devmem_file);
return copied + copied_syn;
do_error:
@@ -1437,6 +1505,8 @@
if (copied + copied_syn)
goto out;
out_err:
+ if (devmem_file)
+ fput(devmem_file);
net_zcopy_put_abort(uarg, true);
err = sk_stream_error(sk, flags, err);
/* make sure we wake any epoll edge trigger waiter */
@@ -1714,7 +1784,7 @@
/* Make sure sk_rcvbuf is big enough to satisfy SO_RCVLOWAT hint */
int tcp_set_rcvlowat(struct sock *sk, int val)
{
- int cap;
+ int space, cap;
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
cap = sk->sk_rcvbuf >> 1;
@@ -1729,10 +1799,10 @@
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
return 0;
- val <<= 1;
- if (val > sk->sk_rcvbuf) {
- WRITE_ONCE(sk->sk_rcvbuf, val);
- tcp_sk(sk)->window_clamp = tcp_win_from_space(sk, val);
+ space = tcp_space_from_win(sk, val);
+ if (space > sk->sk_rcvbuf) {
+ WRITE_ONCE(sk->sk_rcvbuf, space);
+ tcp_sk(sk)->window_clamp = val;
}
return 0;
}
@@ -2294,6 +2364,209 @@
return inq;
}
+/* batch __xa_alloc() calls and reduce xa_lock()/xa_unlock() overhead. */
+struct tcp_xa_pool {
+ u8 max; /* max <= MAX_SKB_FRAGS */
+ u8 idx; /* idx <= max */
+ unsigned int tokens[MAX_SKB_FRAGS];
+ struct page *pgs[MAX_SKB_FRAGS];
+};
+
+static void tcp_xa_pool_commit(struct sock *sk, struct tcp_xa_pool *p,
+ bool lock)
+{
+ int i;
+
+ if (!p->max)
+ return;
+ if (lock)
+ xa_lock_bh(&sk->sk_pagepool);
+ /* Commit part that has been copied to user space. */
+ for (i = 0; i < p->idx; i++)
+ __xa_cmpxchg(&sk->sk_pagepool,
+ p->tokens[i],
+ XA_ZERO_ENTRY,
+ p->pgs[i],
+ GFP_KERNEL);
+ /* Rollback what has been pre-allocated and is no longer needed. */
+ for (; i < p->max; i++)
+ __xa_erase(&sk->sk_pagepool, p->tokens[i]);
+ if (lock)
+ xa_unlock_bh(&sk->sk_pagepool);
+ p->max = 0;
+ p->idx = 0;
+}
+
+static int tcp_xa_pool_refill(struct sock *sk, struct tcp_xa_pool *p,
+ unsigned int max_frags)
+{
+ int err, k;
+
+ if (p->idx < p->max)
+ return 0;
+
+ xa_lock_bh(&sk->sk_pagepool);
+
+ tcp_xa_pool_commit(sk, p, false);
+ for (k = 0; k < max_frags; k++) {
+ err = __xa_alloc(&sk->sk_pagepool, &p->tokens[k],
+ XA_ZERO_ENTRY, xa_limit_31b, GFP_KERNEL);
+ if (err)
+ break;
+ }
+
+ xa_unlock_bh(&sk->sk_pagepool);
+
+ p->max = k;
+ p->idx = 0;
+ return k ? 0 : err;
+}
+/* On error, returns the -errno. On success, returns number of bytes sent to the
+ * user. May not consume all of @len.
+ */
+static int tcp_recvmsg_devmem(struct sock *sk, const struct sk_buff *skb,
+ int offset, struct msghdr *msg, int len)
+{
+ struct devmemvec devmemvec = { 0 };
+ struct tcp_xa_pool tcp_xa_pool;
+ unsigned int start;
+ int i, copy, n;
+ int sent = 0;
+ int err = 0;
+
+ tcp_xa_pool.max = 0;
+ tcp_xa_pool.idx = 0;
+ do {
+ start = skb_headlen(skb);
+
+ if (!skb->devmem) {
+ err = -ENODEV;
+ goto out;
+ }
+
+ /* Copy header. */
+ copy = start - offset;
+ if (copy > 0) {
+ copy = min(copy, len);
+
+ n = copy_to_iter(skb->data + offset, copy, &msg->msg_iter);
+ if (n != copy) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ offset += copy;
+ len -= copy;
+
+ /* First a devmemvec for # bytes copied to user buffer. */
+ memset(&devmemvec, 0, sizeof(devmemvec));
+ devmemvec.frag_size = copy;
+ err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_HEADER,
+ sizeof(devmemvec), &devmemvec);
+ if (err || msg->msg_flags & MSG_CTRUNC) {
+ msg->msg_flags &= ~MSG_CTRUNC;
+ if (!err)
+ err = -ETOOSMALL;
+ goto out;
+ }
+
+ sent += copy;
+
+ if (len == 0)
+ goto out;
+ }
+
+ /* after that, send information of devmem pages through a
+ * sequence of cmsg
+ */
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+ struct page *page = skb_frag_page(frag);
+ struct dma_buf_pages_file_priv *priv;
+ struct page *dmabuf_pages;
+ u32 frag_offset;
+ int end;
+
+ /* skb->devmem should indicate that ALL the
+ * frags in this skb are unreadable pages.
+ * We're checking for that flag above, but also check
+ * individual pages here. If the tcp stack is not
+ * setting skb->devmem correctly, we still don't want to
+ * crash here when accessing pgmap or priv below.
+ */
+ if (!is_dma_buf_page(page)) {
+ err = -ENODEV;
+ goto out;
+ }
+
+ end = start + skb_frag_size(frag);
+ copy = end - offset;
+
+ if (copy > 0) {
+ copy = min(copy, len);
+
+ priv = container_of(page->pgmap,
+ struct dma_buf_pages_file_priv,
+ pgmap);
+
+ dmabuf_pages = priv->pages;
+ frag_offset = ((page - dmabuf_pages) << PAGE_SHIFT) +
+ skb_frag_off(frag) + offset - start;
+ devmemvec.frag_offset = frag_offset;
+ devmemvec.frag_size = copy;
+ err = tcp_xa_pool_refill(sk, &tcp_xa_pool,
+ skb_shinfo(skb)->nr_frags - i);
+ if (err)
+ goto out;
+ /* Will perform the exchange later */
+ devmemvec.frag_token = tcp_xa_pool.tokens[tcp_xa_pool.idx];
+ offset += copy;
+ len -= copy;
+
+ err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_OFFSET,
+ sizeof(devmemvec), &devmemvec);
+ if (err || msg->msg_flags & MSG_CTRUNC) {
+ msg->msg_flags &= ~MSG_CTRUNC;
+ if (!err)
+ err = -ETOOSMALL;
+ goto out;
+ }
+
+ tcp_xa_pool.pgs[tcp_xa_pool.idx++] = page;
+ get_page(page);
+
+ sent += copy;
+
+ if (len == 0)
+ goto out;
+ }
+ start = end;
+ }
+ tcp_xa_pool_commit(sk, &tcp_xa_pool, true);
+ if (!len)
+ goto out;
+
+ /* if len is not satisfied yet, we need to go to the next frag
+ * in the frag_list to satisfy len.
+ */
+ skb = skb_shinfo(skb)->frag_list ?: skb->next;
+
+ offset = offset - start;
+ } while (skb);
+
+ if (len) {
+ err = -EFAULT;
+ goto out;
+ }
+
+out:
+ tcp_xa_pool_commit(sk, &tcp_xa_pool, true);
+ if (!sent)
+ sent = err;
+
+ return sent;
+}
+
/*
* This routine copies from a sock struct into the user buffer.
*
@@ -2307,6 +2580,7 @@
struct scm_timestamping_internal *tss,
int *cmsg_flags)
{
+ bool last_copied_devmem, last_copied_init = false;
struct tcp_sock *tp = tcp_sk(sk);
int copied = 0;
u32 peek_seq;
@@ -2481,13 +2755,42 @@
}
if (!(flags & MSG_TRUNC)) {
- err = skb_copy_datagram_msg(skb, offset, msg, used);
- if (err) {
- /* Exception. Bailout! */
- if (!copied)
- copied = -EFAULT;
+ if (last_copied_init &&
+ last_copied_devmem != skb->devmem)
break;
+
+ if (!skb->devmem) {
+ err = skb_copy_datagram_msg(skb, offset, msg,
+ used);
+ if (err) {
+ /* Exception. Bailout! */
+ if (!copied)
+ copied = -EFAULT;
+ break;
+ }
+ } else {
+ if (!(flags & MSG_SOCK_DEVMEM)) {
+ /* skb->devmem skbs can only be received
+ * with the MSG_SOCK_DEVMEM flag.
+ */
+ if (!copied)
+ copied = -EFAULT;
+
+ break;
+ }
+
+ err = tcp_recvmsg_devmem(sk, skb, offset, msg,
+ used);
+ if (err <= 0) {
+ if (!copied)
+ copied = -EFAULT;
+
+ break;
+ }
+ used = err;
}
+ last_copied_devmem = skb->devmem;
+ last_copied_init = true;
}
WRITE_ONCE(*seq, *seq + used);
@@ -3704,8 +4007,9 @@
const struct inet_connection_sock *icsk = inet_csk(sk);
if (level != SOL_TCP)
- return icsk->icsk_af_ops->setsockopt(sk, level, optname,
- optval, optlen);
+ /* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+ return READ_ONCE(icsk->icsk_af_ops)->setsockopt(sk, level, optname,
+ optval, optlen);
return do_tcp_setsockopt(sk, level, optname, optval, optlen);
}
EXPORT_SYMBOL(tcp_setsockopt);
@@ -3792,7 +4096,8 @@
info->tcpi_options |= TCPI_OPT_SYN_DATA;
info->tcpi_rto = jiffies_to_usecs(icsk->icsk_rto);
- info->tcpi_ato = jiffies_to_usecs(icsk->icsk_ack.ato);
+ info->tcpi_ato = jiffies_to_usecs(min(icsk->icsk_ack.ato,
+ tcp_delack_max(sk)));
info->tcpi_snd_mss = tp->mss_cache;
info->tcpi_rcv_mss = icsk->icsk_ack.rcv_mss;
@@ -4303,8 +4608,9 @@
struct inet_connection_sock *icsk = inet_csk(sk);
if (level != SOL_TCP)
- return icsk->icsk_af_ops->getsockopt(sk, level, optname,
- optval, optlen);
+ /* Paired with WRITE_ONCE() in do_ipv6_setsockopt() and tcp_v6_connect() */
+ return READ_ONCE(icsk->icsk_af_ops)->getsockopt(sk, level, optname,
+ optval, optlen);
return do_tcp_getsockopt(sk, level, optname, optval, optlen);
}
EXPORT_SYMBOL(tcp_getsockopt);
@@ -4418,6 +4724,9 @@
if (crypto_ahash_update(req))
return 1;
+ if (skb_frags_not_readable(skb))
+ return 1;
+
for (i = 0; i < shi->nr_frags; ++i) {
const skb_frag_t *f = &shi->frags[i];
unsigned int offset = skb_frag_off(f);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index e51b5d8..77b2152 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -237,6 +237,16 @@
*/
len = skb_shinfo(skb)->gso_size ? : skb->len;
if (len >= icsk->icsk_ack.rcv_mss) {
+ /* Note: divides are still a bit expensive.
+ * For the moment, only adjust scaling_ratio
+ * when we update icsk_ack.rcv_mss.
+ */
+ if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
+ u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
+
+ do_div(val, skb->truesize);
+ tcp_sk(sk)->scaling_ratio = val ? val : 1;
+ }
icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
tcp_sk(sk)->advmss);
/* Account for possibly-removed options */
@@ -731,8 +741,8 @@
if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf) &&
!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
- int rcvmem, rcvbuf;
u64 rcvwin, grow;
+ int rcvbuf;
/* minimal window to cope with packet losses, assuming
* steady state. Add some cushion because of small variations.
@@ -744,12 +754,7 @@
do_div(grow, tp->rcvq_space.space);
rcvwin += (grow << 1);
- rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
- while (tcp_win_from_space(sk, rcvmem) < tp->advmss)
- rcvmem += 128;
-
- do_div(rcvwin, tp->advmss);
- rcvbuf = min_t(u64, rcvwin * rcvmem,
+ rcvbuf = min_t(u64, tcp_space_from_win(sk, rcvwin),
READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
if (rcvbuf > sk->sk_rcvbuf) {
WRITE_ONCE(sk->sk_rcvbuf, rcvbuf);
@@ -5191,6 +5196,9 @@
for (end_of_skbs = true; skb != NULL && skb != tail; skb = n) {
n = tcp_skb_next(skb, list);
+ if (skb_frags_not_readable(skb))
+ goto skip_this;
+
/* No new bits? It is possible on ofo queue. */
if (!before(start, TCP_SKB_CB(skb)->end_seq)) {
skb = tcp_collapse_one(sk, skb, list, root);
@@ -5211,17 +5219,20 @@
break;
}
- if (n && n != tail && mptcp_skb_can_collapse(skb, n) &&
+ if (n && n != tail && !skb_frags_not_readable(n) &&
+ mptcp_skb_can_collapse(skb, n) &&
TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(n)->seq) {
end_of_skbs = false;
break;
}
+skip_this:
/* Decided to skip this, advance start seq. */
start = TCP_SKB_CB(skb)->end_seq;
}
if (end_of_skbs ||
- (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
+ (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)) ||
+ skb_frags_not_readable(skb))
return;
__skb_queue_head_init(&tmp);
@@ -5265,7 +5276,8 @@
if (!skb ||
skb == tail ||
!mptcp_skb_can_collapse(nskb, skb) ||
- (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
+ (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)) ||
+ skb_frags_not_readable(skb))
goto end;
#ifdef CONFIG_TLS_DEVICE
if (skb->decrypted != nskb->decrypted)
@@ -5515,6 +5527,14 @@
tcp_in_quickack_mode(sk) ||
/* Protocol state mandates a one-time immediate ACK */
inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOW) {
+ /* If we are running from __release_sock() in user context,
+ * Defer the ack until tcp_release_cb().
+ */
+ if (sock_owned_by_user_nocheck(sk) &&
+ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_backlog_ack_defer)) {
+ set_bit(TCP_ACK_DEFERRED, &sk->sk_tsq_flags);
+ return;
+ }
send_now:
tcp_send_ack(sk);
return;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 0666be6..f56147b 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2278,6 +2278,14 @@
{
struct tcp_sock *tp = tcp_sk(sk);
+ unsigned long index;
+ struct page *page;
+
+ xa_for_each(&sk->sk_pagepool, index, page)
+ put_page(page);
+
+ xa_destroy(&sk->sk_pagepool);
+
trace_tcp_destroy_sock(sk);
tcp_clear_xmit_timers(sk);
@@ -3199,6 +3207,7 @@
net->ipv4.sysctl_tcp_comp_sack_delay_ns = NSEC_PER_MSEC;
net->ipv4.sysctl_tcp_comp_sack_slack_ns = 100 * NSEC_PER_USEC;
net->ipv4.sysctl_tcp_comp_sack_nr = 44;
+ net->ipv4.sysctl_tcp_backlog_ack_defer = 1;
net->ipv4.sysctl_tcp_fastopen = TFO_CLIENT_ENABLE;
net->ipv4.sysctl_tcp_fastopen_blackhole_timeout = 0;
atomic_set(&net->ipv4.tfo_active_disable_times, 0);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 2606a55..3bd4e10 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -552,6 +552,8 @@
__TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS);
+ xa_init_flags(&newsk->sk_pagepool, XA_FLAGS_ALLOC1);
+
return newsk;
}
EXPORT_SYMBOL(tcp_create_openreq_child);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d8817d6..8686c5e 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -242,6 +242,23 @@
space = min_t(u32, space, *window_clamp);
*rcv_wscale = clamp_t(int, ilog2(space) - 15,
0, TCP_MAX_WSCALE);
+ /* b/144469234 : We force WSCALE >= 12 for 4K MTU
+ * so that senders do not feel the need to send
+ * too small packets. We prefer full size (4K) packets.
+ * We also special-case MSS == 8192 for similar reason.
+ *
+ * 4K MTU : SYN packets get mss = 4108 (4096 + 12),
+ * while SYNACK packets get mss = 4096,
+ * (assuming TCP TS), courtesy of tcp_openreq_init_rwin()
+ * calling us with "mss - (ireq->tstamp_ok ? TCPOLEN_TSTAMP_ALIGNED : 0)"
+ */
+ if (mss == 4096 ||
+ mss == 4096 + TCPOLEN_TSTAMP_ALIGNED)
+ *rcv_wscale = max_t(int, *rcv_wscale, 12);
+
+ if (mss == 8192 ||
+ mss == 8192 + TCPOLEN_TSTAMP_ALIGNED)
+ *rcv_wscale = max_t(int, *rcv_wscale, 13);
}
/* Set the clamp no higher than max representable value */
(*window_clamp) = min_t(__u32, U16_MAX << (*rcv_wscale), *window_clamp);
@@ -1065,7 +1082,8 @@
#define TCP_DEFERRED_ALL (TCPF_TSQ_DEFERRED | \
TCPF_WRITE_TIMER_DEFERRED | \
TCPF_DELACK_TIMER_DEFERRED | \
- TCPF_MTU_REDUCED_DEFERRED)
+ TCPF_MTU_REDUCED_DEFERRED | \
+ TCPF_ACK_DEFERRED)
/**
* tcp_release_cb - tcp release_sock() callback
* @sk: socket
@@ -1089,16 +1107,6 @@
tcp_tsq_write(sk);
__sock_put(sk);
}
- /* Here begins the tricky part :
- * We are called from release_sock() with :
- * 1) BH disabled
- * 2) sk_lock.slock spinlock held
- * 3) socket owned by us (sk->sk_lock.owned == 1)
- *
- * But following code is meant to be called from BH handlers,
- * so we should keep BH disabled, but early release socket ownership
- */
- sock_release_ownership(sk);
if (flags & TCPF_WRITE_TIMER_DEFERRED) {
tcp_write_timer_handler(sk);
@@ -1112,6 +1120,8 @@
inet_csk(sk)->icsk_af_ops->mtu_reduced(sk);
__sock_put(sk);
}
+ if ((flags & TCPF_ACK_DEFERRED) && inet_csk_ack_scheduled(sk))
+ tcp_send_ack(sk);
}
EXPORT_SYMBOL(tcp_release_cb);
@@ -2308,7 +2318,8 @@
if (len <= skb->len)
break;
- if (unlikely(TCP_SKB_CB(skb)->eor) || tcp_has_tx_tstamp(skb))
+ if (unlikely(TCP_SKB_CB(skb)->eor) || tcp_has_tx_tstamp(skb) ||
+ skb->devmem != next->devmem)
return false;
len -= skb->len;
@@ -3105,6 +3116,8 @@
return false;
if (skb_cloned(skb))
return false;
+ if (skb_frags_not_readable(skb))
+ return false;
/* Some heuristics for collapsing over SACK'd could be invented */
if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)
return false;
@@ -3909,6 +3922,20 @@
}
EXPORT_SYMBOL(tcp_connect);
+u32 tcp_delack_max(struct sock *sk)
+{
+ const struct dst_entry *dst = __sk_dst_get(sk);
+ u32 delack_max = inet_csk(sk)->icsk_delack_max;
+
+ if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
+ u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
+ u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
+
+ delack_max = min_t(u32, delack_max, delack_from_rto_min);
+ }
+ return delack_max;
+}
+
/* Send out a delayed ack, the caller does the policy checking
* to see if we should even be here. See tcp_input.c:tcp_ack_snd_check()
* for details.
@@ -3944,7 +3971,7 @@
ato = min(ato, max_ato);
}
- ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
+ ato = min_t(u32, ato, tcp_delack_max(sk));
/* Stay within the limit we were given */
timeout = jiffies + ato;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 8e0c33b..bbfa5d3 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -452,11 +452,14 @@
{
struct sock *sk = sock->sk;
u32 flags = BIND_WITH_LOCK;
+ const struct proto *prot;
int err = 0;
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
/* If the socket has its own bind function then use it. */
- if (sk->sk_prot->bind)
- return sk->sk_prot->bind(sk, uaddr, addr_len);
+ if (prot->bind)
+ return prot->bind(sk, uaddr, addr_len);
if (addr_len < SIN6_LEN_RFC2133)
return -EINVAL;
@@ -572,6 +575,7 @@
void __user *argp = (void __user *)arg;
struct sock *sk = sock->sk;
struct net *net = sock_net(sk);
+ const struct proto *prot;
switch (cmd) {
case SIOCADDRT:
@@ -589,9 +593,11 @@
case SIOCSIFDSTADDR:
return addrconf_set_dstaddr(net, argp);
default:
- if (!sk->sk_prot->ioctl)
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
+ if (!prot->ioctl)
return -ENOIOCTLCMD;
- return sk->sk_prot->ioctl(sk, cmd, arg);
+ return prot->ioctl(sk, cmd, arg);
}
/*NOTREACHED*/
return 0;
@@ -653,11 +659,14 @@
int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
{
struct sock *sk = sock->sk;
+ const struct proto *prot;
if (unlikely(inet_send_prepare(sk)))
return -EAGAIN;
- return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udpv6_sendmsg,
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
+ return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udpv6_sendmsg,
sk, msg, size);
}
@@ -667,13 +676,16 @@
int flags)
{
struct sock *sk = sock->sk;
+ const struct proto *prot;
int addr_len = 0;
int err;
if (likely(!(flags & MSG_ERRQUEUE)))
sock_rps_record_flow(sk);
- err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udpv6_recvmsg,
+ /* IPV6_ADDRFORM can change sk->sk_prot under us. */
+ prot = READ_ONCE(sk->sk_prot);
+ err = INDIRECT_CALL_2(prot->recvmsg, tcp_recvmsg, udpv6_recvmsg,
sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
if (err >= 0)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 197e12d..70cc0ed 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -208,7 +208,7 @@
if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max))
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return -ENOBUFS;
gsf = memdup_sockptr(optval, optlen);
@@ -242,7 +242,7 @@
if (optlen < size0)
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max) - 4)
return -ENOBUFS;
p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -475,8 +475,11 @@
sock_prot_inuse_add(net, sk->sk_prot, -1);
sock_prot_inuse_add(net, &tcp_prot, 1);
local_bh_enable();
- sk->sk_prot = &tcp_prot;
- icsk->icsk_af_ops = &ipv4_specific;
+
+ /* Paired with READ_ONCE(sk->sk_prot) in inet6_stream_ops */
+ WRITE_ONCE(sk->sk_prot, &tcp_prot);
+ /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+ WRITE_ONCE(icsk->icsk_af_ops, &ipv4_specific);
sk->sk_socket->ops = &inet_stream_ops;
sk->sk_family = PF_INET;
tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
@@ -489,7 +492,9 @@
sock_prot_inuse_add(net, sk->sk_prot, -1);
sock_prot_inuse_add(net, prot, 1);
local_bh_enable();
- sk->sk_prot = prot;
+
+ /* Paired with READ_ONCE(sk->sk_prot) in inet6_dgram_ops */
+ WRITE_ONCE(sk->sk_prot, prot);
sk->sk_socket->ops = &inet_dgram_ops;
sk->sk_family = PF_INET;
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c18fddd..29c8d38 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -237,7 +237,8 @@
sin.sin_port = usin->sin6_port;
sin.sin_addr.s_addr = usin->sin6_addr.s6_addr32[3];
- icsk->icsk_af_ops = &ipv6_mapped;
+ /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+ WRITE_ONCE(icsk->icsk_af_ops, &ipv6_mapped);
if (sk_is_mptcp(sk))
mptcpv6_handle_mapped(sk, true);
sk->sk_backlog_rcv = tcp_v4_do_rcv;
@@ -249,7 +250,8 @@
if (err) {
icsk->icsk_ext_hdr_len = exthdrlen;
- icsk->icsk_af_ops = &ipv6_specific;
+ /* Paired with READ_ONCE() in tcp_(get|set)sockopt() */
+ WRITE_ONCE(icsk->icsk_af_ops, &ipv6_specific);
if (sk_is_mptcp(sk))
mptcpv6_handle_mapped(sk, false);
sk->sk_backlog_rcv = tcp_v6_do_rcv;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index cffa217..4b31f1f 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2122,7 +2122,7 @@
}
}
- snaplen = skb->len;
+ snaplen = skb_frags_not_readable(skb) ? skb_headlen(skb) : skb->len;
res = run_filter(skb, sk, snaplen);
if (!res)
@@ -2244,7 +2244,7 @@
}
}
- snaplen = skb->len;
+ snaplen = skb_frags_not_readable(skb) ? skb_headlen(skb) : skb->len;
res = run_filter(skb, sk, snaplen);
if (!res)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index fc55b65..e51dc9d 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -448,7 +448,6 @@
struct scatterlist *sge;
struct sk_msg *msg_en;
struct tls_rec *rec;
- bool ready = false;
int pending;
rec = container_of(aead_req, struct tls_rec, aead_req);
@@ -480,8 +479,12 @@
/* If received record is at head of tx_list, schedule tx */
first_rec = list_first_entry(&ctx->tx_list,
struct tls_rec, list);
- if (rec == first_rec)
- ready = true;
+ if (rec == first_rec) {
+ /* Schedule the transmission */
+ if (!test_and_set_bit(BIT_TX_SCHEDULED,
+ &ctx->tx_bitmask))
+ schedule_delayed_work(&ctx->tx_work.work, 1);
+ }
}
spin_lock_bh(&ctx->encrypt_compl_lock);
@@ -490,13 +493,6 @@
if (!pending && ctx->async_notify)
complete(&ctx->async_wait.completion);
spin_unlock_bh(&ctx->encrypt_compl_lock);
-
- if (!ready)
- return;
-
- /* Schedule the transmission */
- if (!test_and_set_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
- schedule_delayed_work(&ctx->tx_work.work, 1);
}
static int tls_do_encryption(struct sock *sk,
diff --git a/security/Kconfig b/security/Kconfig
index 5d412b3..1342869 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -228,6 +228,7 @@
source "security/apparmor/Kconfig"
source "security/loadpin/Kconfig"
source "security/yama/Kconfig"
+source "security/container/Kconfig"
source "security/safesetid/Kconfig"
source "security/lockdown/Kconfig"
source "security/landlock/Kconfig"
diff --git a/security/Makefile b/security/Makefile
index 18121f8..98c6ad1 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -21,6 +21,7 @@
obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
obj-$(CONFIG_SECURITY_SAFESETID) += safesetid/
obj-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += container/
obj-$(CONFIG_CGROUPS) += device_cgroup.o
obj-$(CONFIG_BPF_LSM) += bpf/
obj-$(CONFIG_SECURITY_LANDLOCK) += landlock/
diff --git a/security/container/Kconfig b/security/container/Kconfig
new file mode 100644
index 0000000..72a51eb
--- /dev/null
+++ b/security/container/Kconfig
@@ -0,0 +1,17 @@
+config SECURITY_CONTAINER_MONITOR
+ bool "Monitor containerized processes"
+ depends on SECURITY
+ depends on MMU
+ depends on X86_64
+ select SECURITYFS
+ help
+ Instrument the Linux kernel to collect more information about containers
+ and identify security threats.
+
+config SECURITY_CONTAINER_MONITOR_DEBUG
+ bool "Enable debug pr_devel logs"
+ depends on SECURITY_CONTAINER_MONITOR
+ help
+ Define DEBUG for CSM files to compile verbose debugging messages.
+
+ Only for debugging/testing do not enable for production.
diff --git a/security/container/Makefile b/security/container/Makefile
new file mode 100644
index 0000000..9be2528
--- /dev/null
+++ b/security/container/Makefile
@@ -0,0 +1,16 @@
+PB_CCFLAGS := -DPB_SYSTEM_HEADER="<pbsystem.h>" \
+ -DPB_NO_ERRMSG \
+ -DPB_FIELD_16BIT \
+ -DPB_BUFFER_ONLY
+export PB_CCFLAGS
+
+subdir-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos
+
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += monitor.o pb.o process.o pipe.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ -I$(srctree)/fs \
+ $(PB_CCFLAGS)
+ccflags-$(CONFIG_SECURITY_CONTAINER_MONITOR_DEBUG) += -DDEBUG
diff --git a/security/container/monitor.c b/security/container/monitor.c
new file mode 100644
index 0000000..bc1e132
--- /dev/null
+++ b/security/container/monitor.c
@@ -0,0 +1,769 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+#include "process.h"
+
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/pipe_fs_i.h>
+#include <linux/poll.h>
+#include <linux/rwsem.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
+#include <linux/sysctl.h>
+#include <overlayfs/overlayfs.h>
+#include <uapi/linux/magic.h>
+
+/* protects csm_*_enabled and configurations. */
+DECLARE_RWSEM(csm_rwsem_config);
+
+/* queue used for poll wait on config changes. */
+static DECLARE_WAIT_QUEUE_HEAD(config_wait);
+
+/* increase each time a new configuration is applied. */
+static unsigned long config_version;
+
+/* Stats gathered from the LSM. */
+struct container_stats csm_stats;
+
+struct container_stats_mapping {
+ const char *key;
+ size_t *value;
+};
+
+/* Key value pair mapping for the sysfs entry. */
+struct container_stats_mapping csm_stats_mapping[] = {
+ { "ProtoEncodingFailed", &csm_stats.proto_encoding_failed },
+ { "WorkQueueFailed", &csm_stats.workqueue_failed },
+ { "EventWritingFailed", &csm_stats.event_writing_failed },
+ { "SizePickingFailed", &csm_stats.size_picking_failed },
+ { "PipeAlreadyOpened", &csm_stats.pipe_already_opened },
+ { "CsmSetxattr", &csm_stats.csm_setxattr },
+};
+
+/*
+ * Is monitoring enabled? Defaults to disabled.
+ * These variables might be used without locking csm_rwsem_config to check if an
+ * LSM hook can bail quickly. The semaphore is taken later to ensure CSM is
+ * still enabled.
+ *
+ * csm_enabled is true if any collector is enabled.
+ */
+bool csm_enabled;
+static bool csm_container_enabled;
+bool csm_execute_enabled;
+bool csm_memexec_enabled;
+
+/* securityfs control files */
+static struct dentry *csm_dir;
+static struct dentry *csm_enabled_file;
+static struct dentry *csm_container_file;
+static struct dentry *csm_config_file;
+static struct dentry *csm_config_vers_file;
+static struct dentry *csm_pipe_file;
+static struct dentry *csm_stats_file;
+
+/* Pipes to forward data to user-mode. */
+DECLARE_RWSEM(csm_rwsem_pipe);
+static struct file *csm_user_read_pipe;
+struct file *csm_user_write_pipe;
+
+/* Option to disable the CSM features at boot. */
+static bool cmdline_boot_disabled;
+bool cmdline_boot_vsock_enabled;
+
+/* Options disabled by default. */
+static bool cmdline_boot_pipe_enabled;
+static bool cmdline_boot_config_enabled;
+
+/* Option to fully enabled the LSM at boot for automated testing. */
+static bool cmdline_default_enabled;
+
+static int csm_boot_disabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_disabled);
+}
+early_param("csm.disabled", csm_boot_disabled_setup);
+
+static int csm_default_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_default_enabled);
+}
+early_param("csm.default.enabled", csm_default_enabled_setup);
+
+static int csm_boot_vsock_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_vsock_enabled);
+}
+early_param("csm.vsock.enabled", csm_boot_vsock_enabled_setup);
+
+static int csm_boot_pipe_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_pipe_enabled);
+}
+early_param("csm.pipe.enabled", csm_boot_pipe_enabled_setup);
+
+static int csm_boot_config_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_config_enabled);
+}
+early_param("csm.config.enabled", csm_boot_config_enabled_setup);
+
+static bool pipe_in_use(void)
+{
+ struct pipe_inode_info *pipe;
+
+ lockdep_assert_held_write(&csm_rwsem_config);
+ if (csm_user_read_pipe) {
+ pipe = get_pipe_info(csm_user_read_pipe, false);
+ if (pipe)
+ return READ_ONCE(pipe->readers) > 1;
+ }
+ return false;
+}
+
+/* Close pipe, force has to be true to close pipe if it is still being used. */
+int close_pipe_files(bool force)
+{
+ if (csm_user_read_pipe) {
+ /* Pipe is still used. */
+ if (pipe_in_use()) {
+ if (!force)
+ return -EBUSY;
+ pr_warn("pipe is closed while it is still being used.\n");
+ }
+
+ fput(csm_user_read_pipe);
+ fput(csm_user_write_pipe);
+ csm_user_read_pipe = NULL;
+ csm_user_write_pipe = NULL;
+ }
+ return 0;
+}
+
+static void csm_update_config(schema_ConfigurationRequest *req)
+{
+ schema_ExecuteCollectorConfig *econf;
+ size_t i;
+ bool enumerate_processes = false;
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* This covers the scenario where a client is connected and the config
+ * transitions the execute collector from disabled to enabled. In that
+ * case there may have been execute events not sent. So they are
+ * enumerated.
+ */
+ if (!csm_execute_enabled && req->execute_config.enabled &&
+ pipe_in_use())
+ enumerate_processes = true;
+
+ csm_container_enabled = req->container_config.enabled;
+ csm_execute_enabled = req->execute_config.enabled;
+ csm_memexec_enabled = req->memexec_config.enabled;
+
+ /* csm_enabled is true if any collector is enabled. */
+ csm_enabled = csm_container_enabled || csm_execute_enabled ||
+ csm_memexec_enabled;
+
+ /* Clean-up existing configurations. */
+ kfree(csm_execute_config.envp_allowlist);
+ memset(&csm_execute_config, 0, sizeof(csm_execute_config));
+
+ if (csm_execute_enabled) {
+ econf = &req->execute_config;
+ csm_execute_config.argv_limit = econf->argv_limit;
+ csm_execute_config.envp_limit = econf->envp_limit;
+
+ /* Swap the allowlist so it is not freed on return. */
+ csm_execute_config.envp_allowlist = econf->envp_allowlist.arg;
+ econf->envp_allowlist.arg = NULL;
+ }
+
+ /* Reset all stats and close pipe if disabled. */
+ if (!csm_enabled) {
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++)
+ *csm_stats_mapping[i].value = 0;
+
+ close_pipe_files(true);
+ }
+
+ config_version++;
+ if (enumerate_processes)
+ csm_enumerate_processes();
+ wake_up(&config_wait);
+}
+
+int csm_update_config_from_buffer(void *data, size_t size)
+{
+ schema_ConfigurationRequest c = {};
+ pb_istream_t istream;
+
+ c.execute_config.envp_allowlist.funcs.decode = pb_decode_string_array;
+
+ istream = pb_istream_from_buffer(data, size);
+ if (!pb_decode(&istream, schema_ConfigurationRequest_fields, &c)) {
+ kfree(c.execute_config.envp_allowlist.arg);
+ return -EINVAL;
+ }
+
+ down_write(&csm_rwsem_config);
+ csm_update_config(&c);
+ up_write(&csm_rwsem_config);
+
+ return 0;
+}
+
+static ssize_t csm_config_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t err = 0;
+ void *mem;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ /* No partial writes. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ /* Duplicate user memory to safely parse protobuf. */
+ mem = memdup_user(buf, count);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+ err = csm_update_config_from_buffer(mem, count);
+ if (!err)
+ err = count;
+
+ kfree(mem);
+ return err;
+}
+
+static const struct file_operations csm_config_fops = {
+ .write = csm_config_write,
+};
+
+static void csm_enable(void)
+{
+ schema_ConfigurationRequest req = {};
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* Default configuration */
+ req.container_config.enabled = true;
+ req.execute_config.enabled = true;
+ req.execute_config.argv_limit = UINT_MAX;
+ req.execute_config.envp_limit = UINT_MAX;
+ req.memexec_config.enabled = true;
+ csm_update_config(&req);
+}
+
+static void csm_disable(void)
+{
+ schema_ConfigurationRequest req = {};
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* Zero configuration disable all collectors. */
+ csm_update_config(&req);
+ pr_info("disabled\n");
+}
+
+static ssize_t csm_enabled_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ const char *str = csm_enabled ? "1\n" : "0\n";
+
+ return simple_read_from_buffer(buf, count, ppos, str, 2);
+}
+
+static ssize_t csm_enabled_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ bool enabled;
+ int err;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (count <= 0 || count > PAGE_SIZE || *ppos)
+ return -EINVAL;
+
+ err = kstrtobool_from_user(buf, count, &enabled);
+ if (err)
+ return err;
+
+ down_write(&csm_rwsem_config);
+
+ if (enabled)
+ csm_enable();
+ else
+ csm_disable();
+
+ up_write(&csm_rwsem_config);
+
+ return count;
+}
+
+static const struct file_operations csm_enabled_fops = {
+ .read = csm_enabled_read,
+ .write = csm_enabled_write,
+};
+
+static int csm_config_version_open(struct inode *inode, struct file *file)
+{
+ /* private_data is used to keep the latest config version read. */
+ file->private_data = (void*)-1;
+ return 0;
+}
+
+static ssize_t csm_config_version_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ unsigned long version = config_version;
+ file->private_data = (void*)version;
+ return simple_read_from_buffer(buf, count, ppos, &version,
+ sizeof(version));
+}
+
+static __poll_t csm_config_version_poll(struct file *file,
+ struct poll_table_struct *poll_tab)
+{
+ if ((unsigned long)file->private_data != config_version)
+ return EPOLLIN;
+ poll_wait(file, &config_wait, poll_tab);
+ if ((unsigned long)file->private_data != config_version)
+ return EPOLLIN;
+ return 0;
+}
+
+static const struct file_operations csm_config_version_fops = {
+ .open = csm_config_version_open,
+ .read = csm_config_version_read,
+ .poll = csm_config_version_poll,
+};
+
+static int csm_pipe_open(struct inode *inode, struct file *file)
+{
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ if (!csm_enabled)
+ return -EAGAIN;
+ return 0;
+}
+
+/* Similar to file_clone_open that is available only in 4.19 and up. */
+static inline struct file *pipe_clone_open(struct file *file)
+{
+ return dentry_open(&file->f_path, file->f_flags, file->f_cred);
+}
+
+/* Check if the pipe is still used, else recreate and dup it. */
+static struct file *csm_dup_pipe(void)
+{
+ long pipe_size = 1024 * PAGE_SIZE;
+ long actual_size;
+ struct file *pipes[2] = {NULL, NULL};
+ struct file *ret;
+ int err;
+
+ down_write(&csm_rwsem_pipe);
+
+ err = close_pipe_files(false);
+ if (err) {
+ ret = ERR_PTR(err);
+ csm_stats.pipe_already_opened++;
+ goto out;
+ }
+
+ err = create_pipe_files(pipes, O_NONBLOCK);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto out;
+ }
+
+ /*
+ * Try to increase the pipe size to 1024 pages, if there is not
+ * enough memory, pipes will stay unchanged.
+ */
+ actual_size = pipe_fcntl(pipes[0], F_SETPIPE_SZ, pipe_size);
+ if (actual_size != pipe_size)
+ pr_err("failed to resize pipe to 1024 pages, error: %ld, fallback to the default value\n",
+ actual_size);
+
+ csm_user_read_pipe = pipes[0];
+ csm_user_write_pipe = pipes[1];
+
+ /* Clone the file so we can track if the reader is still used. */
+ ret = pipe_clone_open(csm_user_read_pipe);
+
+out:
+ up_write(&csm_rwsem_pipe);
+ return ret;
+}
+
+static ssize_t csm_pipe_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ int fd;
+ ssize_t err;
+ struct file *local_pipe;
+
+ /* No partial reads. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ fd = get_unused_fd_flags(0);
+ if (fd < 0)
+ return fd;
+
+ local_pipe = csm_dup_pipe();
+ if (IS_ERR(local_pipe)) {
+ err = PTR_ERR(local_pipe);
+ local_pipe = NULL;
+ goto error;
+ }
+
+ err = simple_read_from_buffer(buf, count, ppos, &fd, sizeof(fd));
+ if (err < 0)
+ goto error;
+
+ if (err < sizeof(fd)) {
+ err = -EINVAL;
+ goto error;
+ }
+
+ /* Install the file descriptor when we know everything succeeded. */
+ fd_install(fd, local_pipe);
+
+ csm_enumerate_processes();
+
+ return err;
+
+error:
+ if (local_pipe)
+ fput(local_pipe);
+ put_unused_fd(fd);
+ return err;
+}
+
+
+static const struct file_operations csm_pipe_fops = {
+ .open = csm_pipe_open,
+ .read = csm_pipe_read,
+};
+
+static void set_container_decode_callbacks(schema_Container *container)
+{
+ container->pod_namespace.funcs.decode = pb_decode_string_field;
+ container->pod_name.funcs.decode = pb_decode_string_field;
+ container->container_name.funcs.decode = pb_decode_string_field;
+ container->container_image_uri.funcs.decode = pb_decode_string_field;
+ container->labels.funcs.decode = pb_decode_string_array;
+}
+
+static void set_container_encode_callbacks(schema_Container *container)
+{
+ container->pod_namespace.funcs.encode = pb_encode_string_field;
+ container->pod_name.funcs.encode = pb_encode_string_field;
+ container->container_name.funcs.encode = pb_encode_string_field;
+ container->container_image_uri.funcs.encode = pb_encode_string_field;
+ container->labels.funcs.encode = pb_encode_string_array;
+}
+
+static void free_container_callbacks_args(schema_Container *container)
+{
+ kfree(container->pod_namespace.arg);
+ kfree(container->pod_name.arg);
+ kfree(container->container_name.arg);
+ kfree(container->container_image_uri.arg);
+ kfree(container->labels.arg);
+}
+
+static ssize_t csm_container_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t err = 0;
+ void *mem;
+ u64 cid;
+ pb_istream_t istream;
+ struct task_struct *task;
+ schema_ContainerReport report = {};
+ schema_Event event = {};
+ schema_Container *container;
+ char *uuid = NULL;
+
+ /* Notify that this collector is not yet enabled. */
+ if (!csm_container_enabled)
+ return -EAGAIN;
+
+ /* No partial writes. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ /* Duplicate user memory to safely parse protobuf. */
+ mem = memdup_user(buf, count);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+ /* Callback to decode string in protobuf. */
+ set_container_decode_callbacks(&report.container);
+
+ istream = pb_istream_from_buffer(mem, count);
+ if (!pb_decode(&istream, schema_ContainerReport_fields, &report)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Check protobuf is as expected */
+ if (report.pid == 0 ||
+ report.container.container_id != 0) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Find if the process id is linked to an existing container-id. */
+ rcu_read_lock();
+ task = find_task_by_pid_ns(report.pid, &init_pid_ns);
+ if (task) {
+ cid = audit_get_contid(task);
+ if (cid == AUDIT_CID_UNSET)
+ err = -ENOENT;
+ } else {
+ err = -ENOENT;
+ }
+ rcu_read_unlock();
+
+ if (err)
+ goto out;
+
+ uuid = kzalloc(PROCESS_UUID_SIZE, GFP_KERNEL);
+ if (!uuid)
+ goto out;
+
+ /* Provide the uuid for the top process of the container. */
+ err = get_process_uuid_by_pid(report.pid, uuid, PROCESS_UUID_SIZE);
+ if (err)
+ goto out;
+
+ /* Correct the container-id and feed the event to pipe */
+ report.has_container = true;
+ report.container.container_id = cid;
+ report.container.init_uuid.funcs.encode = pb_encode_uuid_field;
+ report.container.init_uuid.arg = uuid;
+ container = &event.event.container.container;
+ *container = report.container;
+
+ /* Use encode callback to generate the final proto. */
+ set_container_encode_callbacks(container);
+
+ event.which_event = schema_Event_container_tag;
+ event.event.container.has_container = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (!err)
+ err = count;
+
+out:
+ /* Free any allocated nanopb callback arguments. */
+ free_container_callbacks_args(&report.container);
+ kfree(uuid);
+ kfree(mem);
+ return err;
+}
+
+static const struct file_operations csm_container_fops = {
+ .write = csm_container_write,
+};
+
+static int csm_show_stats(struct seq_file *p, void *v)
+{
+ size_t i;
+
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++) {
+ seq_printf(p, "%s:\t%zu\n",
+ csm_stats_mapping[i].key,
+ *csm_stats_mapping[i].value);
+ }
+
+ return 0;
+}
+
+static int csm_stats_open(struct inode *inode, struct file *file)
+{
+ size_t i, size = 1; /* Start at one for the null byte. */
+
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++) {
+ /*
+ * Calculate the maximum length:
+ * - Length of the key
+ * - 3 additional chars :\t\n
+ * - longest unsigned 64-bit integer.
+ */
+ size += strlen(csm_stats_mapping[i].key)
+ + 3 + sizeof("18446744073709551615");
+ }
+
+ return single_open_size(file, csm_show_stats, NULL, size);
+}
+
+static const struct file_operations csm_stats_fops = {
+ .open = csm_stats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static bool is_d_overlayfs_mounted(struct dentry *dentry)
+{
+ struct super_block *mnt_sb;
+
+ if (dentry == NULL || dentry->d_inode == NULL)
+ return false;
+
+ mnt_sb = dentry->d_inode->i_sb;
+ if (mnt_sb == NULL || mnt_sb->s_magic != OVERLAYFS_SUPER_MAGIC)
+ return false;
+
+ return true;
+}
+
+static int csm_setxattr(struct user_namespace *mnt_userns,
+ struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags)
+{
+ if (csm_enabled &&
+ (audit_get_contid(current) != AUDIT_CID_UNSET) &&
+ is_d_overlayfs_mounted(dentry) &&
+ (strcmp(name, XATTR_SECURITY_CSM) == 0))
+ csm_stats.csm_setxattr++;
+
+ return 0;
+}
+
+static struct security_hook_list csm_hooks[] __lsm_ro_after_init = {
+ /* Track process execution. */
+ LSM_HOOK_INIT(bprm_check_security, csm_bprm_check_security),
+ LSM_HOOK_INIT(task_post_alloc, csm_task_post_alloc),
+ LSM_HOOK_INIT(task_exit, csm_task_exit),
+
+ /* Track memory execution */
+ LSM_HOOK_INIT(file_mprotect, csm_mprotect),
+ LSM_HOOK_INIT(mmap_file, csm_mmap_file),
+
+ /* Track file modification provenance. */
+ LSM_HOOK_INIT(file_pre_free_security, csm_file_pre_free),
+
+ /* Block modyfing csm xattr. */
+ LSM_HOOK_INIT(inode_setxattr, csm_setxattr),
+};
+
+static int __init csm_init(void)
+{
+ int err;
+
+ if (cmdline_boot_disabled)
+ return 0;
+
+ if (cmdline_boot_vsock_enabled)
+ pr_debug("vsock is deprecated, but was enabled at boot\n");
+
+ csm_dir = securityfs_create_dir("container_monitor", NULL);
+ if (IS_ERR(csm_dir)) {
+ err = PTR_ERR(csm_dir);
+ goto error;
+ }
+
+ csm_enabled_file = securityfs_create_file("enabled", 0644, csm_dir,
+ NULL, &csm_enabled_fops);
+ if (IS_ERR(csm_enabled_file)) {
+ err = PTR_ERR(csm_enabled_file);
+ goto error_rmdir;
+ }
+
+ csm_container_file = securityfs_create_file("container", 0200, csm_dir,
+ NULL, &csm_container_fops);
+ if (IS_ERR(csm_container_file)) {
+ err = PTR_ERR(csm_container_file);
+ goto error_rm_enabled;
+ }
+
+ csm_config_vers_file = securityfs_create_file("config_version", 0400,
+ csm_dir, NULL,
+ &csm_config_version_fops);
+ if (IS_ERR(csm_config_vers_file)) {
+ err = PTR_ERR(csm_config_vers_file);
+ goto error_rm_container;
+ }
+
+ if (cmdline_boot_config_enabled) {
+ csm_config_file = securityfs_create_file("config", 0200,
+ csm_dir, NULL,
+ &csm_config_fops);
+ if (IS_ERR(csm_config_file)) {
+ err = PTR_ERR(csm_config_file);
+ goto error_rm_config_vers;
+ }
+ }
+
+ if (cmdline_boot_pipe_enabled) {
+ csm_pipe_file = securityfs_create_file("pipe", 0400, csm_dir,
+ NULL, &csm_pipe_fops);
+ if (IS_ERR(csm_pipe_file)) {
+ err = PTR_ERR(csm_pipe_file);
+ goto error_rm_config;
+ }
+ }
+
+ csm_stats_file = securityfs_create_file("stats", 0400, csm_dir,
+ NULL, &csm_stats_fops);
+ if (IS_ERR(csm_stats_file)) {
+ err = PTR_ERR(csm_stats_file);
+ goto error_rm_pipe;
+ }
+
+ pr_debug("created securityfs control files\n");
+
+ security_add_hooks(csm_hooks, ARRAY_SIZE(csm_hooks), "csm");
+ pr_debug("registered hooks\n");
+
+ /* Off-by-default, only used for testing images. */
+ if (cmdline_default_enabled) {
+ down_write(&csm_rwsem_config);
+ csm_enable();
+ up_write(&csm_rwsem_config);
+ }
+
+ return 0;
+
+error_rm_pipe:
+ if (cmdline_boot_pipe_enabled)
+ securityfs_remove(csm_pipe_file);
+error_rm_config:
+ if (cmdline_boot_config_enabled)
+ securityfs_remove(csm_config_file);
+error_rm_config_vers:
+ securityfs_remove(csm_config_vers_file);
+error_rm_container:
+ securityfs_remove(csm_container_file);
+error_rm_enabled:
+ securityfs_remove(csm_enabled_file);
+error_rmdir:
+ securityfs_remove(csm_dir);
+error:
+ pr_warn("fs initialization error: %d", err);
+ return err;
+}
+
+late_initcall(csm_init);
diff --git a/security/container/monitor.h b/security/container/monitor.h
new file mode 100644
index 0000000..654a808
--- /dev/null
+++ b/security/container/monitor.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#define pr_fmt(fmt) "container-security-monitor: " fmt
+
+#include <linux/kernel.h>
+#include <linux/security.h>
+#include <linux/fs.h>
+#include <linux/rwsem.h>
+#include <linux/binfmts.h>
+#include <linux/xattr.h>
+#include <config.pb.h>
+#include <event.pb.h>
+#include <pb_encode.h>
+#include <pb_decode.h>
+
+#include "monitoring_protocol.h"
+
+/* Part of the CSM configuration response. */
+#define CSM_VERSION 1
+
+/* protects csm_*_enabled and configurations. */
+extern struct rw_semaphore csm_rwsem_config;
+
+/*
+ * Is monitoring enabled? Defaults to disabled.
+ * These variables might be used as gates without locking (as processor ensures
+ * valid proper access for native scalar values) so it can bail quickly.
+ */
+extern bool csm_enabled;
+extern bool csm_execute_enabled;
+extern bool csm_memexec_enabled;
+
+/* Configuration options for execute collector. */
+struct execute_config {
+ size_t argv_limit;
+ size_t envp_limit;
+ char *envp_allowlist;
+};
+
+extern struct execute_config csm_execute_config;
+
+/* pipe to forward events to user-mode. */
+extern struct rw_semaphore csm_rwsem_pipe;
+extern struct file *csm_user_write_pipe;
+
+/* Stats on LSM events. */
+struct container_stats {
+ size_t proto_encoding_failed;
+ size_t event_writing_failed;
+ size_t workqueue_failed;
+ size_t size_picking_failed;
+ size_t pipe_already_opened;
+ size_t csm_setxattr;
+};
+
+extern struct container_stats csm_stats;
+
+/* Streams file numbers are unknown from the kernel */
+#define STDIN_FILENO 0
+#define STDOUT_FILENO 1
+#define STDERR_FILENO 2
+
+/* security attribute for file provenance. */
+#define XATTR_SECURITY_CSM XATTR_SECURITY_PREFIX "csm"
+
+/* monitor functions */
+int csm_update_config_from_buffer(void *data, size_t size);
+
+/* send event to userland */
+int csm_sendeventproto(const pb_msgdesc_t *fields, schema_Event *event);
+
+/* process events functions */
+int csm_bprm_check_security(struct linux_binprm *bprm);
+void csm_task_exit(struct task_struct *task);
+void csm_task_post_alloc(struct task_struct *task);
+int get_process_uuid_by_pid(pid_t pid_nr, char *buffer, size_t size);
+
+/* memory execution events functions */
+int csm_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+ unsigned long prot);
+int csm_mmap_file(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags);
+
+/* Tracking of file modification provenance. */
+void csm_file_pre_free(struct file *file);
+
+/* nano functions */
+bool pb_encode_string_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_decode_string_field(pb_istream_t *stream, const pb_field_t *field,
+ void **arg);
+ssize_t pb_encode_string_field_limit(pb_ostream_t *stream,
+ const pb_field_t *field,
+ void * const *arg, size_t limit);
+bool pb_encode_string_array(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_decode_string_array(pb_istream_t *stream, const pb_field_t *field,
+ void **arg);
+bool pb_encode_uuid_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_encode_ip4(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_encode_ip6(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+
diff --git a/security/container/monitoring_protocol.h b/security/container/monitoring_protocol.h
new file mode 100644
index 0000000..dbdfc9c
--- /dev/null
+++ b/security/container/monitoring_protocol.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/* Container security monitoring protocol definitions */
+
+#include <linux/types.h>
+
+enum csm_msgtype {
+ CSM_MSG_TYPE_HEARTBEAT = 1,
+ CSM_MSG_EVENT_PROTO = 2,
+ CSM_MSG_CONFIG_REQUEST_PROTO = 3,
+ CSM_MSG_CONFIG_RESPONSE_PROTO = 4,
+};
+
+struct csm_msg_hdr {
+ __le32 msg_type;
+ __le32 msg_length;
+};
+
+/* The process uuid is a 128-bits identifier */
+#define PROCESS_UUID_SIZE 16
+
+/* The entire structure forms the collision domain. */
+union process_uuid {
+ struct {
+ __u32 machineid;
+ __u64 start_time;
+ __u32 tgid;
+ } __attribute__((packed));
+ __u8 data[PROCESS_UUID_SIZE];
+};
diff --git a/security/container/pb.c b/security/container/pb.c
new file mode 100644
index 0000000..1cc7ecf
--- /dev/null
+++ b/security/container/pb.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/string.h>
+#include <net/tcp.h>
+#include <net/ipv6.h>
+
+bool pb_encode_string_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ const uint8_t *str = (const uint8_t *)*arg;
+
+ /* If the string is not set, skip this string. */
+ if (!str)
+ return true;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ return pb_encode_string(stream, str, strlen(str));
+}
+
+bool pb_decode_string_field(pb_istream_t *stream, const pb_field_t *field,
+ void **arg)
+{
+ size_t size;
+ void *data;
+
+ *arg = NULL;
+
+ size = stream->bytes_left;
+
+ /* Ensure a null-byte at the end */
+ if (size + 1 < size)
+ return false;
+
+ data = kzalloc(size + 1, GFP_KERNEL);
+ if (!data)
+ return false;
+
+ if (!pb_read(stream, data, size)) {
+ kfree(data);
+ return false;
+ }
+
+ *arg = data;
+
+ return true;
+}
+
+bool pb_encode_string_array(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ char *strs = (char *)*arg;
+
+ /* If the string array is not set, skip this string array. */
+ if (!strs)
+ return true;
+
+ do {
+ if (!pb_encode_string_field(stream, field,
+ (void * const *) &strs))
+ return false;
+
+ strs += strlen(strs) + 1;
+ } while (*strs != 0);
+
+ return true;
+}
+
+/* Limit the encoded string size and return how many characters were added. */
+ssize_t pb_encode_string_field_limit(pb_ostream_t *stream,
+ const pb_field_t *field,
+ void * const *arg, size_t limit)
+{
+ char *str = (char *)*arg;
+ size_t length;
+
+ /* If the string is not set, skip this string. */
+ if (!str)
+ return 0;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return -EINVAL;
+
+ length = strlen(str);
+ if (length > limit)
+ length = limit;
+
+ if (!pb_encode_string(stream, (uint8_t *)str, length))
+ return -EINVAL;
+
+ return length;
+}
+
+bool pb_decode_string_array(pb_istream_t *stream, const pb_field_t *field,
+ void **arg)
+{
+ size_t needed, used = 0;
+ char *data, *strs;
+
+ /* String length, and two null-bytes for the end of the list. */
+ needed = stream->bytes_left + 2;
+ if (needed < stream->bytes_left)
+ return false;
+
+ if (*arg) {
+ /* Calculate used space from the current list. */
+ strs = (char *)*arg;
+ do {
+ used += strlen(strs + used) + 1;
+ } while (strs[used] != 0);
+
+ if (used + needed < needed)
+ return false;
+ }
+
+ data = krealloc(*arg, used + needed, GFP_KERNEL);
+ if (!data)
+ return false;
+
+ /* Will always be freed by the caller */
+ *arg = data;
+
+ /* Reset the new part of the buffer. */
+ memset(data + used, 0, needed);
+
+ /* Read what's in the stream buffer only. */
+ if (!pb_read(stream, data + used, stream->bytes_left))
+ return false;
+
+ return true;
+}
+
+bool pb_encode_fixed_string(pb_ostream_t *stream, const pb_field_t *field,
+ const uint8_t *data, size_t length)
+{
+ /* If the data is not set, skip this string. */
+ if (!data)
+ return true;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ return pb_encode_string(stream, data, length);
+}
+
+
+bool pb_encode_uuid_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ PROCESS_UUID_SIZE);
+}
+
+bool pb_encode_ip4(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ sizeof(struct in_addr));
+}
+
+bool pb_encode_ip6(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ sizeof(struct in6_addr));
+}
diff --git a/security/container/pipe.c b/security/container/pipe.c
new file mode 100644
index 0000000..e78ddde
--- /dev/null
+++ b/security/container/pipe.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/pipe_fs_i.h>
+#include <linux/printk.h>
+#include <linux/ratelimit.h>
+#include <linux/uio.h>
+#include <linux/workqueue.h>
+
+/* csm protobuf work */
+static void csm_sendmsg_pipe_handler(struct work_struct *work);
+
+/* csm message work container */
+struct msg_work_data {
+ struct work_struct msg_work;
+ size_t pos_bytes_written;
+ char msg[];
+};
+
+/* Mutex to ensure sequential dumping of protos */
+static DEFINE_MUTEX(protodump);
+
+static ssize_t csm_user_pipe_write(struct kvec *vecs, size_t vecs_size,
+ size_t total_length)
+{
+ ssize_t perr = 0;
+ struct iov_iter io = { };
+ loff_t pos = 0;
+ struct pipe_inode_info *pipe;
+ unsigned int readers;
+
+ if (!csm_user_write_pipe)
+ return 0;
+
+ down_read(&csm_rwsem_pipe);
+
+ if (csm_user_write_pipe == NULL)
+ goto end;
+
+ /* The pipe info is the same for reader and write files. */
+ pipe = get_pipe_info(csm_user_write_pipe, false);
+
+ /* If nobody is listening, don't write events. */
+ readers = READ_ONCE(pipe->readers);
+ if (readers <= 1) {
+ WARN_ON(readers == 0);
+ goto end;
+ }
+
+
+ iov_iter_kvec(&io, WRITE, vecs, vecs_size, total_length);
+
+ file_start_write(csm_user_write_pipe);
+ perr = vfs_iter_write(csm_user_write_pipe, &io, &pos, 0);
+ file_end_write(csm_user_write_pipe);
+
+end:
+ up_read(&csm_rwsem_pipe);
+ return perr;
+}
+
+static int csm_sendmsg(int type, const void *buf, size_t len)
+{
+ struct csm_msg_hdr hdr = {
+ .msg_type = cpu_to_le32(type),
+ .msg_length = cpu_to_le32(sizeof(hdr) + len),
+ };
+ struct kvec vecs[] = {
+ {
+ .iov_base = &hdr,
+ .iov_len = sizeof(hdr),
+ }, {
+ .iov_base = (void *)buf,
+ .iov_len = len,
+ }
+ };
+ ssize_t perr;
+
+ perr = csm_user_pipe_write(vecs, ARRAY_SIZE(vecs),
+ le32_to_cpu(hdr.msg_length));
+ if (perr < 0) {
+ pr_warn_ratelimited("vfs_iter_write error (msg_type=%d, msg_length=%u): %zd\n",
+ type, le32_to_cpu(hdr.msg_length), perr);
+ csm_stats.event_writing_failed++;
+ }
+
+ return perr;
+}
+
+static bool csm_get_expected_size(size_t *size, const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ schema_Event *event;
+
+ if (fields != schema_Event_fields)
+ goto other;
+
+ /* Size above 99% of the 100 containers tested running k8s. */
+ event = (schema_Event *)src_struct;
+ switch (event->which_event) {
+ case schema_Event_execute_tag:
+ *size = 3344;
+ return true;
+ case schema_Event_memexec_tag:
+ *size = 176;
+ return true;
+ case schema_Event_clone_tag:
+ *size = 50;
+ return true;
+ case schema_Event_exit_tag:
+ *size = 30;
+ return true;
+ }
+
+other:
+ /* If unknown, do the pre-computation. */
+ return pb_get_encoded_size(size, fields, src_struct);
+}
+
+static struct msg_work_data *csm_encodeproto(size_t size,
+ const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ pb_ostream_t pos;
+ struct msg_work_data *wd;
+ size_t total;
+
+ total = size + sizeof(*wd);
+ if (total < size)
+ return ERR_PTR(-EINVAL);
+
+ wd = kmalloc(total, GFP_KERNEL);
+ if (!wd)
+ return ERR_PTR(-ENOMEM);
+
+ pos = pb_ostream_from_buffer(wd->msg, size);
+ if (!pb_encode(&pos, fields, src_struct)) {
+ kfree(wd);
+ return ERR_PTR(-EINVAL);
+ }
+
+ INIT_WORK(&wd->msg_work, csm_sendmsg_pipe_handler);
+ wd->pos_bytes_written = pos.bytes_written;
+ return wd;
+}
+
+static int csm_sendproto(int type, const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ int err = 0;
+ size_t size, previous_size;
+ struct msg_work_data *wd;
+
+ /* Use the expected size first. */
+ if (!csm_get_expected_size(&size, fields, src_struct))
+ return -EINVAL;
+
+ wd = csm_encodeproto(size, fields, src_struct);
+ if (IS_ERR(wd)) {
+ /* If it failed, retry with the exact size. */
+ csm_stats.size_picking_failed++;
+ previous_size = size;
+
+ if (!pb_get_encoded_size(&size, fields, src_struct))
+ return -EINVAL;
+
+ wd = csm_encodeproto(size, fields, src_struct);
+ if (IS_ERR(wd)) {
+ csm_stats.proto_encoding_failed++;
+ return PTR_ERR(wd);
+ }
+
+ pr_debug("size picking failed %lu vs %lu\n", previous_size,
+ size);
+ }
+
+ /* The work handler takes care of cleanup, if successfully scheduled. */
+ if (likely(schedule_work(&wd->msg_work)))
+ return 0;
+
+ csm_stats.workqueue_failed++;
+ pr_err_ratelimited("Sent msg to workqueue unsuccessfully (assume dropped).\n");
+
+ kfree(wd);
+ return err;
+}
+
+static void csm_sendmsg_pipe_handler(struct work_struct *work)
+{
+ int err;
+ int type = CSM_MSG_EVENT_PROTO;
+ struct msg_work_data *wd = container_of(work, struct msg_work_data,
+ msg_work);
+
+ err = csm_sendmsg(type, wd->msg, wd->pos_bytes_written);
+ if (err < 0)
+ pr_err_ratelimited("csm_sendmsg failed in work handler %s\n",
+ __func__);
+
+ kfree(wd);
+}
+
+int csm_sendeventproto(const pb_msgdesc_t *fields, schema_Event *event)
+{
+ /* Last check before generating and sending an event. */
+ if (!csm_enabled)
+ return -ENOTSUPP;
+
+ event->timestamp = ktime_get_real_ns();
+
+ return csm_sendproto(CSM_MSG_EVENT_PROTO, fields, event);
+}
diff --git a/security/container/process.c b/security/container/process.c
new file mode 100644
index 0000000..a9a0d47
--- /dev/null
+++ b/security/container/process.c
@@ -0,0 +1,1167 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/atomic.h>
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/highmem.h>
+#include <linux/mempool.h>
+#include <linux/mm.h>
+#include <linux/mount.h>
+#include <linux/notifier.h>
+#include <linux/net.h>
+#include <linux/path.h>
+#include <linux/pid.h>
+#include <linux/pid_namespace.h>
+#include <linux/random.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/sched/signal.h>
+#include <linux/sched/task.h>
+#include <linux/slab.h>
+#include <linux/socket.h>
+#include <linux/timekeeping.h>
+#include <linux/vmalloc.h>
+#include <linux/workqueue.h>
+#include <linux/xattr.h>
+#include <net/ipv6.h>
+#include <net/sock.h>
+#include <net/tcp.h>
+#include <overlayfs/overlayfs.h>
+#include <uapi/linux/magic.h>
+#include <uapi/asm/mman.h>
+
+/* Configuration options for execute collector. */
+struct execute_config csm_execute_config;
+
+/* unique atomic value for the machine boot instance */
+static atomic_t machine_rand = ATOMIC_INIT(0);
+
+/* sequential container identifier */
+static atomic_t contid = ATOMIC_INIT(0);
+
+/* Generation id for each enumeration invocation. */
+static atomic_t enumeration_count = ATOMIC_INIT(0);
+
+struct file_provenance {
+ /* pid of the process doing the first write. */
+ pid_t tgid;
+ /* start_time of the process to uniquely identify it. */
+ u64 start_time;
+};
+
+struct csm_enumerate_processes_work_data {
+ struct work_struct work;
+ int enumeration_count;
+};
+
+static void *kmap_argument_stack(struct linux_binprm *bprm, void **ctx)
+{
+ char *argv;
+ int err;
+ unsigned long i, pos, count;
+ void *map;
+ struct page *page;
+
+ /* vma_pages() returns the number of pages reserved for the stack */
+ count = vma_pages(bprm->vma);
+
+ if (likely(count == 1)) {
+ err = get_user_pages_remote(bprm->mm, bprm->p, 1,
+ FOLL_FORCE, &page, NULL, NULL);
+ if (err != 1)
+ return NULL;
+
+ argv = kmap(page);
+ *ctx = page;
+ } else {
+ /*
+ * If more than one pages is needed, copy all of them to a set
+ * of pages. Parsing the argument across kmap pages in different
+ * addresses would make it impractical.
+ */
+ argv = vmalloc(count * PAGE_SIZE);
+ if (!argv)
+ return NULL;
+
+ for (i = 0; i < count; i++) {
+ pos = ALIGN_DOWN(bprm->p, PAGE_SIZE) + i * PAGE_SIZE;
+ err = get_user_pages_remote(bprm->mm, pos, 1,
+ FOLL_FORCE, &page, NULL,
+ NULL);
+ if (err <= 0) {
+ vfree(argv);
+ return NULL;
+ }
+
+ map = kmap(page);
+ memcpy(argv + i * PAGE_SIZE, map, PAGE_SIZE);
+ kunmap(page);
+ put_page(page);
+ }
+ *ctx = bprm;
+ }
+
+ return argv;
+}
+
+static void kunmap_argument_stack(struct linux_binprm *bprm, void *addr,
+ void *ctx)
+{
+ struct page *page;
+
+ if (!addr)
+ return;
+
+ if (likely(vma_pages(bprm->vma) == 1)) {
+ page = (struct page *)ctx;
+ kunmap(page);
+ put_page(ctx);
+ } else {
+ vfree(addr);
+ }
+}
+
+static char *find_array_next_entry(char *array, unsigned long *offset,
+ unsigned long end)
+{
+ char *entry;
+ unsigned long off = *offset;
+
+ if (off >= end)
+ return NULL;
+
+ /* Check the entry is null terminated and in bound */
+ entry = array + off;
+ while (array[off]) {
+ if (++off >= end)
+ return NULL;
+ }
+
+ /* Pass the null byte for the next iteration */
+ *offset = off + 1;
+
+ return entry;
+}
+
+struct string_arr_ctx {
+ struct linux_binprm *bprm;
+ void *stack;
+};
+
+static size_t get_config_limit(size_t *config_ptr)
+{
+ lockdep_assert_held_read(&csm_rwsem_config);
+
+ /*
+ * If execute is not enabled, do not capture arguments.
+ * The event proto won't be sent anyway.
+ */
+ if (!csm_execute_enabled)
+ return 0;
+
+ return *config_ptr;
+}
+
+static bool encode_current_argv(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ struct string_arr_ctx *ctx = (struct string_arr_ctx *)*arg;
+ int i;
+ struct linux_binprm *bprm = ctx->bprm;
+ unsigned long offset = bprm->p % PAGE_SIZE;
+ unsigned long end = vma_pages(bprm->vma) * PAGE_SIZE;
+ char *argv = ctx->stack;
+ char *entry;
+ size_t limit, used = 0;
+ ssize_t ret;
+
+ limit = get_config_limit(&csm_execute_config.argv_limit);
+ if (!limit)
+ return true;
+
+ for (i = 0; i < bprm->argc; i++) {
+ entry = find_array_next_entry(argv, &offset, end);
+ if (!entry)
+ return false;
+
+ ret = pb_encode_string_field_limit(stream, field,
+ (void * const *)&entry,
+ limit - used);
+ if (ret < 0)
+ return false;
+
+ used += ret;
+
+ if (used >= limit)
+ break;
+ }
+
+ return true;
+}
+
+static bool check_envp_allowlist(char *envp)
+{
+ bool ret = false;
+ char *strs, *equal;
+ size_t str_size, equal_pos;
+
+ /* If execute is not enabled, skip all. */
+ if (!csm_execute_enabled)
+ goto out;
+
+ /* No filter, allow all. */
+ strs = csm_execute_config.envp_allowlist;
+ if (!strs) {
+ ret = true;
+ goto out;
+ }
+
+ /*
+ * Identify the key=value separation.
+ * If none exists use the whole string as a key.
+ */
+ equal = strchr(envp, '=');
+ equal_pos = equal ? (equal - envp) : strlen(envp);
+
+ /* Default to skip if no match found. */
+ ret = false;
+
+ do {
+ str_size = strlen(strs);
+
+ /*
+ * If the filter length align with the key value equal sign,
+ * it might be a match, check the key value.
+ */
+ if (str_size == equal_pos &&
+ !strncmp(strs, envp, str_size)) {
+ ret = true;
+ goto out;
+ }
+
+ strs += str_size + 1;
+ } while (*strs != 0);
+
+out:
+ return ret;
+}
+
+static bool encode_current_envp(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ struct string_arr_ctx *ctx = (struct string_arr_ctx *)*arg;
+ int i;
+ struct linux_binprm *bprm = ctx->bprm;
+ unsigned long offset = bprm->p % PAGE_SIZE;
+ unsigned long end = vma_pages(bprm->vma) * PAGE_SIZE;
+ char *argv = ctx->stack;
+ char *entry;
+ size_t limit, used = 0;
+ ssize_t ret;
+
+ limit = get_config_limit(&csm_execute_config.envp_limit);
+ if (!limit)
+ return true;
+
+ /* Skip arguments */
+ for (i = 0; i < bprm->argc; i++) {
+ if (!find_array_next_entry(argv, &offset, end))
+ return false;
+ }
+
+ for (i = 0; i < bprm->envc; i++) {
+ entry = find_array_next_entry(argv, &offset, end);
+ if (!entry)
+ return false;
+
+ if (!check_envp_allowlist(entry))
+ continue;
+
+ ret = pb_encode_string_field_limit(stream, field,
+ (void * const *)&entry,
+ limit - used);
+ if (ret < 0)
+ return false;
+
+ used += ret;
+
+ if (used >= limit)
+ break;
+ }
+
+ return true;
+}
+
+static bool is_overlayfs_mounted(struct file *file)
+{
+ struct vfsmount *mnt;
+ struct super_block *mnt_sb;
+
+ mnt = file->f_path.mnt;
+ if (mnt == NULL)
+ return false;
+
+ mnt_sb = mnt->mnt_sb;
+ if (mnt_sb == NULL || mnt_sb->s_magic != OVERLAYFS_SUPER_MAGIC)
+ return false;
+
+ return true;
+}
+
+/*
+ * Before the process starts, identify a possible container by checking if the
+ * task is on a pid namespace and the target file is using an overlayfs mounting
+ * point. This check is valid for COS and GKE but not all existing containers.
+ */
+static bool is_possible_container(struct task_struct *task,
+ struct file *file)
+{
+ if (task_active_pid_ns(task) == &init_pid_ns)
+ return false;
+
+ return is_overlayfs_mounted(file);
+}
+
+/*
+ * Generates a random identifier for this boot instance.
+ * This identifier is generated only when needed to increase the entropy
+ * available compared to doing it at early boot.
+ */
+static u32 get_machine_id(void)
+{
+ int machineid, old;
+
+ machineid = atomic_read(&machine_rand);
+
+ if (unlikely(machineid == 0)) {
+ machineid = (int)get_random_int();
+ if (machineid == 0)
+ machineid = 1;
+ old = atomic_cmpxchg(&machine_rand, 0, machineid);
+
+ /* If someone beat us, use their value. */
+ if (old != 0)
+ machineid = old;
+ }
+
+ return (u32)machineid;
+}
+
+/*
+ * Generate a 128-bit unique identifier for the process by appending:
+ * - A machine identifier unique per boot.
+ * - The start time of the process in nanoseconds.
+ * - The tgid for the set of threads in a process.
+ */
+static int get_process_uuid(struct task_struct *task, char *buffer, size_t size)
+{
+ union process_uuid *id = (union process_uuid *)buffer;
+
+ memset(buffer, 0, size);
+
+ if (WARN_ON(size < PROCESS_UUID_SIZE))
+ return -EINVAL;
+
+ id->machineid = get_machine_id();
+ id->start_time = ktime_mono_to_real(task->group_leader->start_time);
+ id->tgid = task_tgid_nr(task);
+
+ return 0;
+}
+
+int get_process_uuid_by_pid(pid_t pid_nr, char *buffer, size_t size)
+{
+ int err;
+ struct task_struct *task = NULL;
+
+ rcu_read_lock();
+ task = find_task_by_pid_ns(pid_nr, &init_pid_ns);
+ if (!task) {
+ err = -ENOENT;
+ goto out;
+ }
+ err = get_process_uuid(task, buffer, size);
+out:
+ rcu_read_unlock();
+ return err;
+}
+
+static int get_process_uuid_from_xattr(struct file *file, char *buffer,
+ size_t size)
+{
+ struct dentry *dentry;
+ int err;
+ struct file_provenance prov;
+ union process_uuid *id = (union process_uuid *)buffer;
+
+ memset(buffer, 0, size);
+
+ if (WARN_ON(size < PROCESS_UUID_SIZE))
+ return -EINVAL;
+
+ /* The file is part of overlayfs on the upper layer. */
+ if (!is_overlayfs_mounted(file))
+ return -ENODATA;
+
+ dentry = ovl_dentry_upper(file->f_path.dentry);
+ if (!dentry)
+ return -ENODATA;
+
+ err = __vfs_getxattr(dentry, dentry->d_inode,
+ XATTR_SECURITY_CSM, &prov, sizeof(prov));
+ /* returns -ENODATA if the xattr does not exist. */
+ if (err < 0)
+ return err;
+ if (err != sizeof(prov)) {
+ pr_err("unexpected size for xattr: %zu -> %d\n",
+ size, err);
+ return -ENODATA;
+ }
+
+ id->machineid = get_machine_id();
+ id->start_time = prov.start_time;
+ id->tgid = prov.tgid;
+ return 0;
+}
+
+u64 csm_set_contid(struct task_struct *task)
+{
+ u64 cid;
+ struct pid_namespace *ns;
+
+ ns = task_active_pid_ns(task);
+ if (WARN_ON(!task->audit) || WARN_ON(!ns))
+ return AUDIT_CID_UNSET;
+
+ cid = atomic_inc_return(&contid);
+ task->audit->contid = cid;
+
+ /*
+ * If the namespace container-id is not set, use the one assigned
+ * to the first process created.
+ */
+ cmpxchg(&ns->cid, 0, cid);
+ return cid;
+}
+
+u64 csm_get_ns_contid(struct pid_namespace *ns)
+{
+ if (!ns || !ns->cid)
+ return AUDIT_CID_UNSET;
+
+ return ns->cid;
+}
+
+union ip_data {
+ struct in_addr ip4;
+ struct in6_addr ip6;
+};
+
+struct file_data {
+ void *allocated;
+ union ip_data local;
+ union ip_data remote;
+ char modified_uuid[PROCESS_UUID_SIZE];
+};
+
+static void free_file_data(struct file_data *fdata)
+{
+ free_page((unsigned long)fdata->allocated);
+ fdata->allocated = NULL;
+}
+
+static void fill_socket_description(struct sockaddr_storage *saddr,
+ union ip_data *idata,
+ schema_SocketIp *schema_socketip)
+{
+ struct sockaddr_in *sin4 = (struct sockaddr_in *)saddr;
+ struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)saddr;
+
+ schema_socketip->family = saddr->ss_family;
+
+ switch (saddr->ss_family) {
+ case AF_INET:
+ schema_socketip->port = ntohs(sin4->sin_port);
+ idata->ip4 = sin4->sin_addr;
+ schema_socketip->ip.funcs.encode = pb_encode_ip4;
+ schema_socketip->ip.arg = &idata->ip4;
+ break;
+ case AF_INET6:
+ schema_socketip->port = ntohs(sin6->sin6_port);
+ idata->ip6 = sin6->sin6_addr;
+ schema_socketip->ip.funcs.encode = pb_encode_ip6;
+ schema_socketip->ip.arg = &idata->ip6;
+ break;
+ }
+}
+
+static int fill_file_overlayfs(struct file *file, schema_File *schema_file,
+ struct file_data *fdata)
+{
+ struct dentry *dentry;
+ int err;
+ schema_Overlay *overlayfs;
+
+ /* If not an overlayfs superblock, done. */
+ if (!is_overlayfs_mounted(file))
+ return 0;
+
+ dentry = file->f_path.dentry;
+ schema_file->which_filesystem = schema_File_overlayfs_tag;
+ overlayfs = &schema_file->filesystem.overlayfs;
+ overlayfs->lower_layer = ovl_dentry_lower(dentry);
+ overlayfs->upper_layer = ovl_dentry_upper(dentry);
+
+ err = get_process_uuid_from_xattr(file, fdata->modified_uuid,
+ sizeof(fdata->modified_uuid));
+ /* If there is no xattr, just skip the modified_uuid field. */
+ if (err == -ENODATA)
+ return 0;
+ if (err < 0)
+ return err;
+
+ overlayfs->modified_uuid.funcs.encode = pb_encode_uuid_field;
+ overlayfs->modified_uuid.arg = fdata->modified_uuid;
+ return 0;
+}
+
+static int fill_file_description(struct file *file, schema_File *schema_file,
+ struct file_data *fdata)
+{
+ char *buf;
+ int err;
+ u32 mode;
+ char *path;
+ struct socket *socket;
+ schema_Socket *socketfs;
+ struct sockaddr_storage saddr;
+
+ memset(fdata, 0, sizeof(*fdata));
+
+ if (file == NULL)
+ return 0;
+
+ schema_file->ino = file_inode(file)->i_ino;
+ mode = file_inode(file)->i_mode;
+
+ /* For pipes, no need to resolve the path. */
+ if (S_ISFIFO(mode))
+ return 0;
+
+ if (S_ISSOCK(mode)) {
+ socket = (struct socket *)file->private_data;
+ socketfs = &schema_file->filesystem.socket;
+
+ /* Local socket */
+ err = kernel_getsockname(socket, (struct sockaddr *)&saddr);
+ if (err >= 0) {
+ socketfs->has_local = true;
+ fill_socket_description(&saddr, &fdata->local,
+ &socketfs->local);
+ }
+
+ /* Remote socket, might not be connected. */
+ err = kernel_getpeername(socket, (struct sockaddr *)&saddr);
+ if (err >= 0) {
+ socketfs->has_remote = true;
+ fill_socket_description(&saddr, &fdata->remote,
+ &socketfs->remote);
+ }
+
+ schema_file->which_filesystem = schema_File_socket_tag;
+ return 0;
+ }
+
+ /*
+ * From this point, we care about all the other types of files as their
+ * path provides interesting insight.
+ */
+ buf = (char *)__get_free_page(GFP_KERNEL);
+ if (buf == NULL)
+ return -ENOMEM;
+
+ fdata->allocated = buf;
+
+ path = d_path(&file->f_path, buf, PAGE_SIZE);
+ if (IS_ERR(path)) {
+ free_file_data(fdata);
+ return PTR_ERR(path);
+ }
+
+ schema_file->fullpath.funcs.encode = pb_encode_string_field;
+ schema_file->fullpath.arg = path; /* buf is freed in free_file_data. */
+
+ err = fill_file_overlayfs(file, schema_file, fdata);
+ if (err) {
+ free_file_data(fdata);
+ return err;
+ }
+
+ return 0;
+}
+
+static int fill_stream_description(schema_Descriptor *desc, int fd,
+ struct file_data *fdata)
+{
+ struct fd sfd;
+ struct file *file;
+ int err = 0;
+
+ sfd = fdget(fd);
+ file = sfd.file;
+
+ if (file == NULL) {
+ memset(fdata, 0, sizeof(*fdata));
+ goto end;
+ }
+
+ desc->mode = file_inode(file)->i_mode;
+ desc->has_file = true;
+ err = fill_file_description(file, &desc->file, fdata);
+
+end:
+ fdput(sfd);
+ return err;
+}
+
+static int populate_proc_uuid_common(schema_Process *proc, char *uuid,
+ size_t uuid_size, char *parent_uuid,
+ size_t parent_uuid_size,
+ struct task_struct *task)
+{
+ int err;
+ struct task_struct *parent;
+ /* Generate unique identifier for the process and its parent */
+ err = get_process_uuid(task, uuid, uuid_size);
+ if (err)
+ return err;
+
+ proc->uuid.funcs.encode = pb_encode_uuid_field;
+ proc->uuid.arg = uuid;
+
+ rcu_read_lock();
+
+ if (!pid_alive(task))
+ goto out;
+ /*
+ * I don't think this needs to be task_rcu_dereference because
+ * real_parent is only supposed to be accessed using RCU.
+ */
+ parent = rcu_dereference(task->real_parent);
+
+ if (parent) {
+ err = get_process_uuid(parent, parent_uuid, parent_uuid_size);
+ if (!err) {
+ proc->parent_uuid.funcs.encode = pb_encode_uuid_field;
+ proc->parent_uuid.arg = parent_uuid;
+ }
+ }
+
+out:
+ rcu_read_unlock();
+
+ return err;
+}
+
+/* Populate the fields that we always want to set in Process messages. */
+static int populate_proc_common(schema_Process *proc, char *uuid,
+ size_t uuid_size, char *parent_uuid,
+ size_t parent_uuid_size,
+ struct task_struct *task)
+{
+ u64 cid;
+ struct pid_namespace *ns = task_active_pid_ns(task);
+
+ /* Container identifier for the current namespace. */
+ proc->container_id = csm_get_ns_contid(ns);
+
+ /*
+ * If the process container-id is different, the process tree is part of
+ * a different session within the namespace (kubectl/docker exec,
+ * liveness probe or others).
+ */
+ cid = audit_get_contid(task);
+ if (proc->container_id != cid)
+ proc->exec_session_id = cid;
+
+ /* Add information about pid in different namespaces */
+ proc->pid = task_tgid_nr(task);
+ proc->parent_pid = task_ppid_nr(task);
+ proc->container_pid = task_tgid_nr_ns(task, ns);
+ proc->container_parent_pid = task_ppid_nr_ns(task, ns);
+
+ return populate_proc_uuid_common(proc, uuid, uuid_size, parent_uuid,
+ parent_uuid_size, task);
+}
+
+int csm_bprm_check_security(struct linux_binprm *bprm)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_Process *proc;
+ struct string_arr_ctx argv_ctx;
+ void *stack = NULL, *ctx = NULL;
+ u64 cid;
+ struct file_data path_data = {};
+ struct file_data stdin_data = {};
+ struct file_data stdout_data = {};
+ struct file_data stderr_data = {};
+
+ /*
+ * Always create a container-id for containerized processes.
+ * If the LSM is enabled later, we can track existing containers.
+ */
+ cid = audit_get_contid(current);
+
+ if (cid == AUDIT_CID_UNSET) {
+ if (!is_possible_container(current, bprm->file))
+ return 0;
+
+ cid = csm_set_contid(current);
+
+ if (cid == AUDIT_CID_UNSET)
+ return 0;
+ }
+
+ if (!csm_execute_enabled)
+ return 0;
+
+ /* The interpreter will call us again with more context. */
+ if (bprm->buf[0] == '#' && bprm->buf[1] == '!')
+ return 0;
+
+ proc = &event.event.execute.proc;
+ err = populate_proc_common(proc, uuid, sizeof(uuid), parent_uuid,
+ sizeof(parent_uuid), current);
+ if (err)
+ goto out_free_buf;
+
+ proc->creation_timestamp = ktime_get_real_ns();
+
+ /* Provide information about the launched binary. */
+ proc->has_binary = true;
+ err = fill_file_description(bprm->file, &proc->binary, &path_data);
+ if (err)
+ goto out_free_buf;
+
+ /* Information about streams */
+ proc->has_streams = true;
+
+ proc->streams.has_stdin = true;
+ err = fill_stream_description(&proc->streams.stdin, STDIN_FILENO,
+ &stdin_data);
+ if (err)
+ goto out_free_buf;
+
+ proc->streams.has_stdout = true;
+ err = fill_stream_description(&proc->streams.stdout, STDOUT_FILENO,
+ &stdout_data);
+ if (err)
+ goto out_free_buf;
+
+ proc->streams.has_stderr = true;
+ err = fill_stream_description(&proc->streams.stderr, STDERR_FILENO,
+ &stderr_data);
+ if (err)
+ goto out_free_buf;
+
+ stack = kmap_argument_stack(bprm, &ctx);
+ if (!stack) {
+ err = -EFAULT;
+ goto out_free_buf;
+ }
+
+ /* Capture process argument */
+ argv_ctx.bprm = bprm;
+ argv_ctx.stack = stack;
+ proc->args.argv.funcs.encode = encode_current_argv;
+ proc->args.argv.arg = &argv_ctx;
+
+ /* Capture process environment variables */
+ proc->args.envp.funcs.encode = encode_current_envp;
+ proc->args.envp.arg = &argv_ctx;
+
+ event.which_event = schema_Event_execute_tag;
+ event.event.execute.has_proc = true;
+ proc->has_args = true;
+
+ /*
+ * Configurations options are checked when computing the serialized
+ * protobufs.
+ */
+ down_read(&csm_rwsem_config);
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ up_read(&csm_rwsem_config);
+
+ if (err)
+ pr_err("csm_sendeventproto returned %d on execve\n", err);
+ err = 0;
+
+out_free_buf:
+ kunmap_argument_stack(bprm, stack, ctx);
+ free_file_data(&path_data);
+ free_file_data(&stdin_data);
+ free_file_data(&stdout_data);
+ free_file_data(&stderr_data);
+
+ /*
+ * On failure, enforce it only if the execute config is enabled.
+ * If the collector was disabled, prefer to succeed to not impact the
+ * system.
+ */
+ if (unlikely(err < 0 && !csm_execute_enabled))
+ err = 0;
+
+ return err;
+}
+
+/* Create a clone event when a new task leader is created. */
+void csm_task_post_alloc(struct task_struct *task)
+{
+ int err;
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ schema_Event event = {};
+ schema_Process *proc;
+
+ if (!csm_execute_enabled ||
+ audit_get_contid(task) == AUDIT_CID_UNSET ||
+ !thread_group_leader(task))
+ return;
+
+ proc = &event.event.clone.proc;
+
+ err = populate_proc_uuid_common(proc, uuid, sizeof(uuid), parent_uuid,
+ sizeof(parent_uuid), task);
+
+ event.which_event = schema_Event_clone_tag;
+ event.event.clone.has_proc = true;
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on exit\n", err);
+}
+
+/*
+ * This LSM hook callback doesn't exist upstream and is called only when the
+ * last thread of a thread group exit.
+ */
+void csm_task_exit(struct task_struct *task)
+{
+ int err;
+ schema_Event event = {};
+ schema_ExitEvent *exit;
+ char uuid[PROCESS_UUID_SIZE];
+
+ if (!csm_execute_enabled ||
+ audit_get_contid(task) == AUDIT_CID_UNSET)
+ return;
+
+ exit = &event.event.exit;
+
+ /* Fetch the unique identifier for this process */
+ err = get_process_uuid(task, uuid, sizeof(uuid));
+ if (err) {
+ pr_err("failed to get process uuid on exit\n");
+ return;
+ }
+
+ exit->process_uuid.funcs.encode = pb_encode_uuid_field;
+ exit->process_uuid.arg = uuid;
+
+ event.which_event = schema_Event_exit_tag;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on exit\n", err);
+}
+
+int csm_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+ unsigned long prot)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_MemoryExecEvent *memexec;
+ u64 cid;
+ struct file_data path_data = {};
+
+ cid = audit_get_contid(current);
+
+ if (!csm_memexec_enabled ||
+ !(prot & PROT_EXEC) ||
+ vma->vm_file == NULL ||
+ cid == AUDIT_CID_UNSET)
+ return 0;
+
+ memexec = &event.event.memexec;
+
+ err = fill_file_description(vma->vm_file, &memexec->mapped_file,
+ &path_data);
+ if (err)
+ return err;
+
+ err = populate_proc_common(&memexec->proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid), current);
+ if (err)
+ goto out;
+
+ memexec->prot_exec_timestamp = ktime_get_real_ns();
+ memexec->new_flags = prot;
+ memexec->req_flags = reqprot;
+ memexec->old_vm_flags = vma->vm_flags;
+
+ memexec->action = schema_MemoryExecEvent_Action_MPROTECT;
+ memexec->start_addr = vma->vm_start;
+ memexec->end_addr = vma->vm_end;
+
+ event.which_event = schema_Event_memexec_tag;
+ event.event.memexec.has_proc = true;
+ event.event.memexec.has_mapped_file = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on mprotect\n", err);
+ err = 0;
+
+ if (unlikely(err < 0 && !csm_memexec_enabled))
+ err = 0;
+
+out:
+ free_file_data(&path_data);
+ return err;
+}
+
+int csm_mmap_file(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_MemoryExecEvent *memexec;
+ struct file *exe_file;
+ u64 cid;
+ struct file_data path_data = {};
+
+ cid = audit_get_contid(current);
+
+ if (!csm_memexec_enabled ||
+ !(prot & PROT_EXEC) ||
+ file == NULL ||
+ cid == AUDIT_CID_UNSET)
+ return 0;
+
+ memexec = &event.event.memexec;
+ err = fill_file_description(file, &memexec->mapped_file,
+ &path_data);
+ if (err)
+ return err;
+
+ err = populate_proc_common(&memexec->proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid), current);
+ if (err)
+ goto out;
+
+ /* get_mm_exe_file does its own locking on mm_sem. */
+ exe_file = get_mm_exe_file(current->mm);
+ if (exe_file) {
+ if (path_equal(&file->f_path, &exe_file->f_path))
+ memexec->is_initial_mmap = 1;
+ fput(exe_file);
+ }
+
+ memexec->prot_exec_timestamp = ktime_get_real_ns();
+ memexec->new_flags = prot;
+ memexec->req_flags = reqprot;
+ memexec->mmap_flags = flags;
+ memexec->action = schema_MemoryExecEvent_Action_MMAP_FILE;
+ event.which_event = schema_Event_memexec_tag;
+ event.event.memexec.has_proc = true;
+ event.event.memexec.has_mapped_file = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on mmap_file\n", err);
+ err = 0;
+
+ if (unlikely(err < 0 && !csm_memexec_enabled))
+ err = 0;
+
+out:
+ free_file_data(&path_data);
+ return err;
+}
+
+void csm_file_pre_free(struct file *file)
+{
+ struct dentry *dentry;
+ int err;
+ struct file_provenance prov;
+
+ /* The file was opened to be modified and the LSM is enabled */
+ if (!(file->f_mode & FMODE_WRITE) ||
+ !csm_enabled)
+ return;
+
+ /* The current process is containerized. */
+ if (audit_get_contid(current) == AUDIT_CID_UNSET)
+ return;
+
+ /* The file is part of overlayfs on the upper layer. */
+ if (!is_overlayfs_mounted(file))
+ return;
+
+ dentry = ovl_dentry_upper(file->f_path.dentry);
+ if (!dentry)
+ return;
+
+ err = __vfs_getxattr(dentry, dentry->d_inode, XATTR_SECURITY_CSM,
+ NULL, 0);
+ if (err != -ENODATA) {
+ if (err < 0)
+ pr_err("failed to get security attribute: %d\n", err);
+ return;
+ }
+
+ prov.tgid = task_tgid_nr(current);
+ prov.start_time = ktime_mono_to_real(current->group_leader->start_time);
+
+ err = __vfs_setxattr(&init_user_ns, dentry, dentry->d_inode,
+ XATTR_SECURITY_CSM, &prov, sizeof(prov), 0);
+ if (err < 0)
+ pr_err("failed to set security attribute: %d\n", err);
+}
+
+/*
+ * Based off of fs/proc/base.c:next_tgid
+ *
+ * next_thread_group_leader returns the task_struct of the next task with a pid
+ * greater than or equal to tgid. The reference count is increased so that
+ * rcu_read_unlock may be called, and preemption reenabled.
+ */
+static struct task_struct *next_thread_group_leader(pid_t *tgid)
+{
+ struct pid *pid;
+ struct task_struct *task;
+
+ cond_resched();
+ rcu_read_lock();
+retry:
+ task = NULL;
+ pid = find_ge_pid(*tgid, &init_pid_ns);
+ if (pid) {
+ *tgid = pid_nr_ns(pid, &init_pid_ns);
+ task = pid_task(pid, PIDTYPE_PID);
+ if (!task || !thread_group_leader(task) ||
+ audit_get_contid(task) == AUDIT_CID_UNSET) {
+ (*tgid) += 1;
+ goto retry;
+ }
+
+ /*
+ * Increment the reference count on the task before leaving
+ * the RCU grace period.
+ */
+ get_task_struct(task);
+ (*tgid) += 1;
+ }
+
+ rcu_read_unlock();
+ return task;
+}
+
+void delayed_enumerate_processes(struct work_struct *work)
+{
+ pid_t tgid = 0;
+ struct task_struct *task;
+ struct csm_enumerate_processes_work_data *wd = container_of(
+ work, struct csm_enumerate_processes_work_data, work);
+ int wd_enumeration_count = wd->enumeration_count;
+
+ kfree(wd);
+ wd = NULL;
+ work = NULL;
+
+ /*
+ * Try for only a single enumeration routine at a time, as long as the
+ * execute collector is enabled.
+ */
+ while ((wd_enumeration_count == atomic_read(&enumeration_count)) &&
+ READ_ONCE(csm_execute_enabled) &&
+ (task = next_thread_group_leader(&tgid))) {
+ int err;
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ struct file *exe_file = NULL;
+ struct file_data path_data = {};
+ schema_Event event = {};
+ schema_Process *proc = &event.event.enumproc.proc;
+
+ exe_file = get_task_exe_file(task);
+ if (!exe_file) {
+ pr_err("failed to get enumerated process executable, pid: %u\n",
+ task_pid_nr(task));
+ goto next;
+ }
+
+ proc->has_binary = true;
+ err = fill_file_description(exe_file, &proc->binary,
+ &path_data);
+ if (err) {
+ pr_err("failed to fill enumerated process %u executable description: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+
+ err = populate_proc_common(proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid),
+ task);
+ if (err) {
+ pr_err("failed to set pid %u common fields: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+
+ if (task->flags & PF_EXITING)
+ goto next;
+
+ event.which_event = schema_Event_enumproc_tag;
+ event.event.execute.has_proc = true;
+ err = csm_sendeventproto(schema_Event_fields,
+ &event);
+ if (err) {
+ pr_err("failed to send pid %u enumerated process: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+next:
+ free_file_data(&path_data);
+ if (exe_file)
+ fput(exe_file);
+
+ put_task_struct(task);
+ }
+}
+
+void csm_enumerate_processes(unsigned long const config_version)
+{
+ struct csm_enumerate_processes_work_data *wd;
+
+ wd = kmalloc(sizeof(*wd), GFP_KERNEL);
+ if (!wd)
+ return;
+
+ INIT_WORK(&wd->work, delayed_enumerate_processes);
+ wd->enumeration_count = atomic_add_return(1, &enumeration_count);
+ schedule_work(&wd->work);
+}
diff --git a/security/container/process.h b/security/container/process.h
new file mode 100644
index 0000000..1c98134
--- /dev/null
+++ b/security/container/process.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2019 Google, Inc
+ */
+
+void csm_enumerate_processes(void);
diff --git a/security/container/protos/Makefile b/security/container/protos/Makefile
new file mode 100644
index 0000000..a88068b
--- /dev/null
+++ b/security/container/protos/Makefile
@@ -0,0 +1,10 @@
+subdir-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb
+
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos.o
+
+protos-y := config.pb.o event.pb.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ $(PB_CCFLAGS)
diff --git a/security/container/protos/README b/security/container/protos/README
new file mode 100644
index 0000000..1b0628a
--- /dev/null
+++ b/security/container/protos/README
@@ -0,0 +1,18 @@
+This document provides guidance on how to change the protos used in this directory.
+
+Any change made to a proto file require to reformat it and regenerate nanopb
+sources. It also requires the proto files to be compatible to previously released versions.
+
+To reformat any proto file run: "clang-format -style=Google -i <file.proto>"
+
+To regenerate nanopb files:
+ - Install protoc
+ - apt-get install protobuf-compiler
+ - Clone/setup nanopb for version 0.3.9.1 (or clone the internal depot)
+ - git clone --depth=1 https://github.com/nanopb/nanopb.git
+ - cd nanopb
+ - git fetch --tags
+ - git checkout tags/0.3.9.1
+ - make -C generator/proto
+ - Run protoc with the nanopb definition
+ - protoc --plugin=<path_to_nanopb>/generator/protoc-gen-nanopb --nanopb_out=<path_to_linux>/security/container/protos/ <path_to_linux>/security/container/protos/<file.proto> --proto_path=<path_to_linux>/security/container/protos
diff --git a/security/container/protos/config.pb.c b/security/container/protos/config.pb.c
new file mode 100644
index 0000000..08436ee
--- /dev/null
+++ b/security/container/protos/config.pb.c
@@ -0,0 +1,25 @@
+/* Automatically generated nanopb constant definitions */
+/* Generated by nanopb-0.4.5 */
+
+#include "config.pb.h"
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+PB_BIND(schema_ContainerCollectorConfig, schema_ContainerCollectorConfig, AUTO)
+
+
+PB_BIND(schema_ExecuteCollectorConfig, schema_ExecuteCollectorConfig, AUTO)
+
+
+PB_BIND(schema_MemExecCollectorConfig, schema_MemExecCollectorConfig, AUTO)
+
+
+PB_BIND(schema_ConfigurationRequest, schema_ConfigurationRequest, AUTO)
+
+
+PB_BIND(schema_ConfigurationResponse, schema_ConfigurationResponse, AUTO)
+
+
+
+
diff --git a/security/container/protos/config.pb.h b/security/container/protos/config.pb.h
new file mode 100644
index 0000000..893961e
--- /dev/null
+++ b/security/container/protos/config.pb.h
@@ -0,0 +1,157 @@
+/* Automatically generated nanopb header */
+/* Generated by nanopb-0.4.5 */
+
+#ifndef PB_SCHEMA_CONFIG_PB_H_INCLUDED
+#define PB_SCHEMA_CONFIG_PB_H_INCLUDED
+#include <pb.h>
+
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+/* Enum definitions */
+typedef enum _schema_ConfigurationResponse_ErrorCode {
+ schema_ConfigurationResponse_ErrorCode_NO_ERROR = 0,
+ schema_ConfigurationResponse_ErrorCode_UNKNOWN = 2
+} schema_ConfigurationResponse_ErrorCode;
+
+/* Struct definitions */
+/* Report success or failure of previous ConfigurationRequest */
+typedef struct _schema_ConfigurationResponse {
+ schema_ConfigurationResponse_ErrorCode error;
+ pb_callback_t msg;
+ uint64_t version; /* Version of the LSM */
+ uint32_t kernel_version; /* LINUX_VERSION_CODE */
+} schema_ConfigurationResponse;
+
+/* Collect information about running containers */
+typedef struct _schema_ContainerCollectorConfig {
+ bool enabled;
+} schema_ContainerCollectorConfig;
+
+typedef struct _schema_ExecuteCollectorConfig {
+ bool enabled;
+ /* truncate argv/envp if cumulative length exceeds limit */
+ uint32_t argv_limit;
+ uint32_t envp_limit;
+ /* If specified, only report the named environment variables. An
+ empty envp_allowlist indicates that all environment variables
+ should be reported up to a cumulative total of envp_limit bytes. */
+ pb_callback_t envp_allowlist;
+} schema_ExecuteCollectorConfig;
+
+/* Collect information about executable memory mappings. */
+typedef struct _schema_MemExecCollectorConfig {
+ bool enabled;
+} schema_MemExecCollectorConfig;
+
+/* Convey configuration information to Guest LSM */
+typedef struct _schema_ConfigurationRequest {
+ bool has_container_config;
+ schema_ContainerCollectorConfig container_config;
+ bool has_execute_config;
+ schema_ExecuteCollectorConfig execute_config;
+ bool has_memexec_config;
+ schema_MemExecCollectorConfig memexec_config;
+} schema_ConfigurationRequest;
+
+
+/* Helper constants for enums */
+#define _schema_ConfigurationResponse_ErrorCode_MIN schema_ConfigurationResponse_ErrorCode_NO_ERROR
+#define _schema_ConfigurationResponse_ErrorCode_MAX schema_ConfigurationResponse_ErrorCode_UNKNOWN
+#define _schema_ConfigurationResponse_ErrorCode_ARRAYSIZE ((schema_ConfigurationResponse_ErrorCode)(schema_ConfigurationResponse_ErrorCode_UNKNOWN+1))
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initializer values for message structs */
+#define schema_ContainerCollectorConfig_init_default {0}
+#define schema_ExecuteCollectorConfig_init_default {0, 0, 0, {{NULL}, NULL}}
+#define schema_MemExecCollectorConfig_init_default {0}
+#define schema_ConfigurationRequest_init_default {false, schema_ContainerCollectorConfig_init_default, false, schema_ExecuteCollectorConfig_init_default, false, schema_MemExecCollectorConfig_init_default}
+#define schema_ConfigurationResponse_init_default {_schema_ConfigurationResponse_ErrorCode_MIN, {{NULL}, NULL}, 0, 0}
+#define schema_ContainerCollectorConfig_init_zero {0}
+#define schema_ExecuteCollectorConfig_init_zero {0, 0, 0, {{NULL}, NULL}}
+#define schema_MemExecCollectorConfig_init_zero {0}
+#define schema_ConfigurationRequest_init_zero {false, schema_ContainerCollectorConfig_init_zero, false, schema_ExecuteCollectorConfig_init_zero, false, schema_MemExecCollectorConfig_init_zero}
+#define schema_ConfigurationResponse_init_zero {_schema_ConfigurationResponse_ErrorCode_MIN, {{NULL}, NULL}, 0, 0}
+
+/* Field tags (for use in manual encoding/decoding) */
+#define schema_ConfigurationResponse_error_tag 1
+#define schema_ConfigurationResponse_msg_tag 2
+#define schema_ConfigurationResponse_version_tag 3
+#define schema_ConfigurationResponse_kernel_version_tag 4
+#define schema_ContainerCollectorConfig_enabled_tag 1
+#define schema_ExecuteCollectorConfig_enabled_tag 1
+#define schema_ExecuteCollectorConfig_argv_limit_tag 2
+#define schema_ExecuteCollectorConfig_envp_limit_tag 3
+#define schema_ExecuteCollectorConfig_envp_allowlist_tag 4
+#define schema_MemExecCollectorConfig_enabled_tag 1
+#define schema_ConfigurationRequest_container_config_tag 1
+#define schema_ConfigurationRequest_execute_config_tag 2
+#define schema_ConfigurationRequest_memexec_config_tag 3
+
+/* Struct field encoding specification for nanopb */
+#define schema_ContainerCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1)
+#define schema_ContainerCollectorConfig_CALLBACK NULL
+#define schema_ContainerCollectorConfig_DEFAULT NULL
+
+#define schema_ExecuteCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1) \
+X(a, STATIC, SINGULAR, UINT32, argv_limit, 2) \
+X(a, STATIC, SINGULAR, UINT32, envp_limit, 3) \
+X(a, CALLBACK, REPEATED, STRING, envp_allowlist, 4)
+#define schema_ExecuteCollectorConfig_CALLBACK pb_default_field_callback
+#define schema_ExecuteCollectorConfig_DEFAULT NULL
+
+#define schema_MemExecCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1)
+#define schema_MemExecCollectorConfig_CALLBACK NULL
+#define schema_MemExecCollectorConfig_DEFAULT NULL
+
+#define schema_ConfigurationRequest_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, container_config, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, execute_config, 2) \
+X(a, STATIC, OPTIONAL, MESSAGE, memexec_config, 3)
+#define schema_ConfigurationRequest_CALLBACK NULL
+#define schema_ConfigurationRequest_DEFAULT NULL
+#define schema_ConfigurationRequest_container_config_MSGTYPE schema_ContainerCollectorConfig
+#define schema_ConfigurationRequest_execute_config_MSGTYPE schema_ExecuteCollectorConfig
+#define schema_ConfigurationRequest_memexec_config_MSGTYPE schema_MemExecCollectorConfig
+
+#define schema_ConfigurationResponse_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UENUM, error, 1) \
+X(a, CALLBACK, SINGULAR, STRING, msg, 2) \
+X(a, STATIC, SINGULAR, UINT64, version, 3) \
+X(a, STATIC, SINGULAR, UINT32, kernel_version, 4)
+#define schema_ConfigurationResponse_CALLBACK pb_default_field_callback
+#define schema_ConfigurationResponse_DEFAULT NULL
+
+extern const pb_msgdesc_t schema_ContainerCollectorConfig_msg;
+extern const pb_msgdesc_t schema_ExecuteCollectorConfig_msg;
+extern const pb_msgdesc_t schema_MemExecCollectorConfig_msg;
+extern const pb_msgdesc_t schema_ConfigurationRequest_msg;
+extern const pb_msgdesc_t schema_ConfigurationResponse_msg;
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define schema_ContainerCollectorConfig_fields &schema_ContainerCollectorConfig_msg
+#define schema_ExecuteCollectorConfig_fields &schema_ExecuteCollectorConfig_msg
+#define schema_MemExecCollectorConfig_fields &schema_MemExecCollectorConfig_msg
+#define schema_ConfigurationRequest_fields &schema_ConfigurationRequest_msg
+#define schema_ConfigurationResponse_fields &schema_ConfigurationResponse_msg
+
+/* Maximum encoded size of messages (where known) */
+/* schema_ExecuteCollectorConfig_size depends on runtime parameters */
+/* schema_ConfigurationRequest_size depends on runtime parameters */
+/* schema_ConfigurationResponse_size depends on runtime parameters */
+#define schema_ContainerCollectorConfig_size 2
+#define schema_MemExecCollectorConfig_size 2
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/config.proto b/security/container/protos/config.proto
new file mode 100644
index 0000000..e32a517
--- /dev/null
+++ b/security/container/protos/config.proto
@@ -0,0 +1,51 @@
+syntax = "proto3";
+
+package schema;
+
+// Collect information about running containers
+message ContainerCollectorConfig {
+ bool enabled = 1;
+}
+
+message ExecuteCollectorConfig {
+ bool enabled = 1;
+
+ // truncate argv/envp if cumulative length exceeds limit
+ uint32 argv_limit = 2;
+ uint32 envp_limit = 3;
+
+ // If specified, only report the named environment variables. An
+ // empty envp_allowlist indicates that all environment variables
+ // should be reported up to a cumulative total of envp_limit bytes.
+ repeated string envp_allowlist = 4;
+}
+
+// Collect information about executable memory mappings.
+message MemExecCollectorConfig {
+ bool enabled = 1;
+}
+
+// Convey configuration information to Guest LSM
+message ConfigurationRequest {
+ ContainerCollectorConfig container_config = 1;
+ ExecuteCollectorConfig execute_config = 2;
+ MemExecCollectorConfig memexec_config = 3;
+
+ // Additional configuration messages will be added as new collectors
+ // are implemented
+}
+
+// Report success or failure of previous ConfigurationRequest
+message ConfigurationResponse {
+ enum ErrorCode {
+ // Keep values in sync with
+ // https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
+ NO_ERROR = 0;
+ UNKNOWN = 2;
+ }
+
+ ErrorCode error = 1;
+ string msg = 2;
+ uint64 version = 3; // Version of the LSM
+ uint32 kernel_version = 4; // LINUX_VERSION_CODE
+}
diff --git a/security/container/protos/event.pb.c b/security/container/protos/event.pb.c
new file mode 100644
index 0000000..2293566
--- /dev/null
+++ b/security/container/protos/event.pb.c
@@ -0,0 +1,61 @@
+/* Automatically generated nanopb constant definitions */
+/* Generated by nanopb-0.4.5 */
+
+#include "event.pb.h"
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+PB_BIND(schema_SocketIp, schema_SocketIp, AUTO)
+
+
+PB_BIND(schema_Socket, schema_Socket, AUTO)
+
+
+PB_BIND(schema_Overlay, schema_Overlay, AUTO)
+
+
+PB_BIND(schema_File, schema_File, AUTO)
+
+
+PB_BIND(schema_ProcessArguments, schema_ProcessArguments, AUTO)
+
+
+PB_BIND(schema_Descriptor, schema_Descriptor, AUTO)
+
+
+PB_BIND(schema_Streams, schema_Streams, 2)
+
+
+PB_BIND(schema_Process, schema_Process, 2)
+
+
+PB_BIND(schema_Container, schema_Container, AUTO)
+
+
+PB_BIND(schema_ExecuteEvent, schema_ExecuteEvent, 2)
+
+
+PB_BIND(schema_CloneEvent, schema_CloneEvent, 2)
+
+
+PB_BIND(schema_EnumerateProcessEvent, schema_EnumerateProcessEvent, 2)
+
+
+PB_BIND(schema_MemoryExecEvent, schema_MemoryExecEvent, 2)
+
+
+PB_BIND(schema_ContainerInfoEvent, schema_ContainerInfoEvent, AUTO)
+
+
+PB_BIND(schema_ExitEvent, schema_ExitEvent, AUTO)
+
+
+PB_BIND(schema_Event, schema_Event, 2)
+
+
+PB_BIND(schema_ContainerReport, schema_ContainerReport, AUTO)
+
+
+
+
diff --git a/security/container/protos/event.pb.h b/security/container/protos/event.pb.h
new file mode 100644
index 0000000..9535068
--- /dev/null
+++ b/security/container/protos/event.pb.h
@@ -0,0 +1,518 @@
+/* Automatically generated nanopb header */
+/* Generated by nanopb-0.4.5 */
+
+#ifndef PB_SCHEMA_EVENT_PB_H_INCLUDED
+#define PB_SCHEMA_EVENT_PB_H_INCLUDED
+#include <pb.h>
+
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+/* Enum definitions */
+typedef enum _schema_MemoryExecEvent_Action {
+ schema_MemoryExecEvent_Action_UNDEFINED = 0,
+ schema_MemoryExecEvent_Action_MPROTECT = 1,
+ schema_MemoryExecEvent_Action_MMAP_FILE = 2
+} schema_MemoryExecEvent_Action;
+
+/* Struct definitions */
+/* The process with the indicated pid has exited. */
+typedef struct _schema_ExitEvent {
+ pb_callback_t process_uuid;
+} schema_ExitEvent;
+
+typedef struct _schema_Container {
+ uint64_t creation_timestamp; /* container create time in ns */
+ pb_callback_t pod_namespace;
+ pb_callback_t pod_name;
+ uint64_t container_id; /* unique across lifetime of Node */
+ pb_callback_t container_name;
+ pb_callback_t container_image_uri;
+ pb_callback_t labels;
+ pb_callback_t init_uuid;
+ pb_callback_t container_image_id;
+} schema_Container;
+
+typedef struct _schema_Overlay {
+ bool lower_layer;
+ bool upper_layer;
+ pb_callback_t modified_uuid; /* The process who first modified the file. */
+} schema_Overlay;
+
+typedef struct _schema_ProcessArguments {
+ pb_callback_t argv; /* process arguments */
+ uint32_t argv_truncated; /* number of characters truncated from argv */
+ pb_callback_t envp; /* process environment variables */
+ uint32_t envp_truncated; /* number of characters truncated from envp */
+} schema_ProcessArguments;
+
+typedef struct _schema_SocketIp {
+ uint32_t family; /* AF_* for socket type. */
+ pb_callback_t ip; /* ip4 or ip6 address. */
+ uint32_t port; /* port bind or connected. */
+} schema_SocketIp;
+
+/* Associate the following container information with all processes
+ that have the indicated container_id. */
+typedef struct _schema_ContainerInfoEvent {
+ bool has_container;
+ schema_Container container;
+} schema_ContainerInfoEvent;
+
+/* Message sent by the daemonset to the LSM for container enlightenment. */
+typedef struct _schema_ContainerReport {
+ uint32_t pid; /* Top pid of the running container. */
+ bool has_container;
+ schema_Container container; /* Information collected about the container. */
+} schema_ContainerReport;
+
+typedef struct _schema_Socket {
+ bool has_local;
+ schema_SocketIp local;
+ bool has_remote;
+ schema_SocketIp remote; /* unset if not connected. */
+} schema_Socket;
+
+typedef struct _schema_File {
+ pb_callback_t fullpath;
+ pb_size_t which_filesystem;
+ union {
+ schema_Overlay overlayfs;
+ schema_Socket socket;
+ } filesystem; /* inode number. */
+ uint32_t ino;
+ uint64_t ctime;
+} schema_File;
+
+typedef struct _schema_Descriptor {
+ uint32_t mode; /* file mode (stat st_mode) */
+ bool has_file;
+ schema_File file;
+} schema_Descriptor;
+
+typedef struct _schema_Streams {
+ bool has_stdin;
+ schema_Descriptor stdin;
+ bool has_stdout;
+ schema_Descriptor stdout;
+ bool has_stderr;
+ schema_Descriptor stderr;
+} schema_Streams;
+
+typedef struct _schema_Process {
+ uint64_t creation_timestamp; /* Only populated in ExecuteEvent, in ns. */
+ pb_callback_t uuid;
+ uint32_t pid;
+ bool has_binary;
+ schema_File binary; /* Only populated in ExecuteEvent. */
+ uint32_t parent_pid;
+ pb_callback_t parent_uuid;
+ uint64_t container_id; /* unique id of process's container */
+ uint32_t container_pid; /* pid inside the container namespace pid */
+ uint32_t container_parent_pid; /* optional */
+ bool has_args;
+ schema_ProcessArguments args; /* Only populated in ExecuteEvent. */
+ bool has_streams;
+ schema_Streams streams; /* Only populated in ExecuteEvent. */
+ uint64_t exec_session_id; /* identifier set for kubectl exec sessions. */
+} schema_Process;
+
+/* A process clone is being created. This message means that a cloning operation
+ is being attempted. It may be sent even if fork fails. */
+typedef struct _schema_CloneEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_CloneEvent;
+
+/* Processes that are enumerated at startup will be sent with this event. There
+ is no distinction from events we would have seen from fork or exec. */
+typedef struct _schema_EnumerateProcessEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_EnumerateProcessEvent;
+
+/* A binary being executed.
+ e.g., execve() */
+typedef struct _schema_ExecuteEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_ExecuteEvent;
+
+/* Collect information about mmap/mprotect calls with the PROT_EXEC flag set. */
+typedef struct _schema_MemoryExecEvent {
+ bool has_proc;
+ schema_Process proc; /* The origin process */
+ /* The timestamp in ns when the memory was set executable */
+ uint64_t prot_exec_timestamp;
+ /* The prot flags granted by the kernel for the operation */
+ uint64_t new_flags;
+ /* The prot flags requested for the mprotect/mmap operation */
+ uint64_t req_flags;
+ /* The vm_flags prior to the mprotect operation, if relevant */
+ uint64_t old_vm_flags;
+ /* The operational flags for the mmap operation, if relevant */
+ uint64_t mmap_flags;
+ /* Derived from the file struct describing the fd being mapped */
+ bool has_mapped_file;
+ schema_File mapped_file;
+ schema_MemoryExecEvent_Action action;
+ uint64_t start_addr; /* The executable memory region start addr */
+ uint64_t end_addr; /* The executable memory region end addr */
+ /* True if this event is a mmap of the process' binary */
+ bool is_initial_mmap;
+} schema_MemoryExecEvent;
+
+/* Next ID: 8 */
+typedef struct _schema_Event {
+ pb_size_t which_event;
+ union {
+ schema_ExecuteEvent execute;
+ schema_ContainerInfoEvent container;
+ schema_ExitEvent exit;
+ schema_MemoryExecEvent memexec;
+ schema_CloneEvent clone;
+ schema_EnumerateProcessEvent enumproc;
+ } event;
+ uint64_t timestamp;
+} schema_Event;
+
+
+/* Helper constants for enums */
+#define _schema_MemoryExecEvent_Action_MIN schema_MemoryExecEvent_Action_UNDEFINED
+#define _schema_MemoryExecEvent_Action_MAX schema_MemoryExecEvent_Action_MMAP_FILE
+#define _schema_MemoryExecEvent_Action_ARRAYSIZE ((schema_MemoryExecEvent_Action)(schema_MemoryExecEvent_Action_MMAP_FILE+1))
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initializer values for message structs */
+#define schema_SocketIp_init_default {0, {{NULL}, NULL}, 0}
+#define schema_Socket_init_default {false, schema_SocketIp_init_default, false, schema_SocketIp_init_default}
+#define schema_Overlay_init_default {0, 0, {{NULL}, NULL}}
+#define schema_File_init_default {{{NULL}, NULL}, 0, {schema_Overlay_init_default}, 0, 0}
+#define schema_ProcessArguments_init_default {{{NULL}, NULL}, 0, {{NULL}, NULL}, 0}
+#define schema_Descriptor_init_default {0, false, schema_File_init_default}
+#define schema_Streams_init_default {false, schema_Descriptor_init_default, false, schema_Descriptor_init_default, false, schema_Descriptor_init_default}
+#define schema_Process_init_default {0, {{NULL}, NULL}, 0, false, schema_File_init_default, 0, {{NULL}, NULL}, 0, 0, 0, false, schema_ProcessArguments_init_default, false, schema_Streams_init_default, 0}
+#define schema_Container_init_default {0, {{NULL}, NULL}, {{NULL}, NULL}, 0, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}}
+#define schema_ExecuteEvent_init_default {false, schema_Process_init_default}
+#define schema_CloneEvent_init_default {false, schema_Process_init_default}
+#define schema_EnumerateProcessEvent_init_default {false, schema_Process_init_default}
+#define schema_MemoryExecEvent_init_default {false, schema_Process_init_default, 0, 0, 0, 0, 0, false, schema_File_init_default, _schema_MemoryExecEvent_Action_MIN, 0, 0, 0}
+#define schema_ContainerInfoEvent_init_default {false, schema_Container_init_default}
+#define schema_ExitEvent_init_default {{{NULL}, NULL}}
+#define schema_Event_init_default {0, {schema_ExecuteEvent_init_default}, 0}
+#define schema_ContainerReport_init_default {0, false, schema_Container_init_default}
+#define schema_SocketIp_init_zero {0, {{NULL}, NULL}, 0}
+#define schema_Socket_init_zero {false, schema_SocketIp_init_zero, false, schema_SocketIp_init_zero}
+#define schema_Overlay_init_zero {0, 0, {{NULL}, NULL}}
+#define schema_File_init_zero {{{NULL}, NULL}, 0, {schema_Overlay_init_zero}, 0, 0}
+#define schema_ProcessArguments_init_zero {{{NULL}, NULL}, 0, {{NULL}, NULL}, 0}
+#define schema_Descriptor_init_zero {0, false, schema_File_init_zero}
+#define schema_Streams_init_zero {false, schema_Descriptor_init_zero, false, schema_Descriptor_init_zero, false, schema_Descriptor_init_zero}
+#define schema_Process_init_zero {0, {{NULL}, NULL}, 0, false, schema_File_init_zero, 0, {{NULL}, NULL}, 0, 0, 0, false, schema_ProcessArguments_init_zero, false, schema_Streams_init_zero, 0}
+#define schema_Container_init_zero {0, {{NULL}, NULL}, {{NULL}, NULL}, 0, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}}
+#define schema_ExecuteEvent_init_zero {false, schema_Process_init_zero}
+#define schema_CloneEvent_init_zero {false, schema_Process_init_zero}
+#define schema_EnumerateProcessEvent_init_zero {false, schema_Process_init_zero}
+#define schema_MemoryExecEvent_init_zero {false, schema_Process_init_zero, 0, 0, 0, 0, 0, false, schema_File_init_zero, _schema_MemoryExecEvent_Action_MIN, 0, 0, 0}
+#define schema_ContainerInfoEvent_init_zero {false, schema_Container_init_zero}
+#define schema_ExitEvent_init_zero {{{NULL}, NULL}}
+#define schema_Event_init_zero {0, {schema_ExecuteEvent_init_zero}, 0}
+#define schema_ContainerReport_init_zero {0, false, schema_Container_init_zero}
+
+/* Field tags (for use in manual encoding/decoding) */
+#define schema_ExitEvent_process_uuid_tag 1
+#define schema_Container_creation_timestamp_tag 1
+#define schema_Container_pod_namespace_tag 2
+#define schema_Container_pod_name_tag 3
+#define schema_Container_container_id_tag 4
+#define schema_Container_container_name_tag 5
+#define schema_Container_container_image_uri_tag 6
+#define schema_Container_labels_tag 7
+#define schema_Container_init_uuid_tag 8
+#define schema_Container_container_image_id_tag 9
+#define schema_Overlay_lower_layer_tag 1
+#define schema_Overlay_upper_layer_tag 2
+#define schema_Overlay_modified_uuid_tag 3
+#define schema_ProcessArguments_argv_tag 1
+#define schema_ProcessArguments_argv_truncated_tag 2
+#define schema_ProcessArguments_envp_tag 3
+#define schema_ProcessArguments_envp_truncated_tag 4
+#define schema_SocketIp_family_tag 1
+#define schema_SocketIp_ip_tag 2
+#define schema_SocketIp_port_tag 3
+#define schema_ContainerInfoEvent_container_tag 1
+#define schema_ContainerReport_pid_tag 1
+#define schema_ContainerReport_container_tag 2
+#define schema_Socket_local_tag 1
+#define schema_Socket_remote_tag 2
+#define schema_File_fullpath_tag 1
+#define schema_File_overlayfs_tag 2
+#define schema_File_socket_tag 4
+#define schema_File_ino_tag 3
+#define schema_File_ctime_tag 5
+#define schema_Descriptor_mode_tag 1
+#define schema_Descriptor_file_tag 2
+#define schema_Streams_stdin_tag 1
+#define schema_Streams_stdout_tag 2
+#define schema_Streams_stderr_tag 3
+#define schema_Process_creation_timestamp_tag 1
+#define schema_Process_uuid_tag 2
+#define schema_Process_pid_tag 3
+#define schema_Process_binary_tag 4
+#define schema_Process_parent_pid_tag 5
+#define schema_Process_parent_uuid_tag 6
+#define schema_Process_container_id_tag 7
+#define schema_Process_container_pid_tag 8
+#define schema_Process_container_parent_pid_tag 9
+#define schema_Process_args_tag 10
+#define schema_Process_streams_tag 11
+#define schema_Process_exec_session_id_tag 12
+#define schema_CloneEvent_proc_tag 1
+#define schema_EnumerateProcessEvent_proc_tag 1
+#define schema_ExecuteEvent_proc_tag 1
+#define schema_MemoryExecEvent_proc_tag 1
+#define schema_MemoryExecEvent_prot_exec_timestamp_tag 2
+#define schema_MemoryExecEvent_new_flags_tag 3
+#define schema_MemoryExecEvent_req_flags_tag 4
+#define schema_MemoryExecEvent_old_vm_flags_tag 5
+#define schema_MemoryExecEvent_mmap_flags_tag 6
+#define schema_MemoryExecEvent_mapped_file_tag 7
+#define schema_MemoryExecEvent_action_tag 8
+#define schema_MemoryExecEvent_start_addr_tag 9
+#define schema_MemoryExecEvent_end_addr_tag 10
+#define schema_MemoryExecEvent_is_initial_mmap_tag 11
+#define schema_Event_execute_tag 1
+#define schema_Event_container_tag 2
+#define schema_Event_exit_tag 3
+#define schema_Event_memexec_tag 4
+#define schema_Event_clone_tag 5
+#define schema_Event_enumproc_tag 7
+#define schema_Event_timestamp_tag 6
+
+/* Struct field encoding specification for nanopb */
+#define schema_SocketIp_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, family, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, ip, 2) \
+X(a, STATIC, SINGULAR, UINT32, port, 3)
+#define schema_SocketIp_CALLBACK pb_default_field_callback
+#define schema_SocketIp_DEFAULT NULL
+
+#define schema_Socket_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, local, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, remote, 2)
+#define schema_Socket_CALLBACK NULL
+#define schema_Socket_DEFAULT NULL
+#define schema_Socket_local_MSGTYPE schema_SocketIp
+#define schema_Socket_remote_MSGTYPE schema_SocketIp
+
+#define schema_Overlay_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, lower_layer, 1) \
+X(a, STATIC, SINGULAR, BOOL, upper_layer, 2) \
+X(a, CALLBACK, SINGULAR, BYTES, modified_uuid, 3)
+#define schema_Overlay_CALLBACK pb_default_field_callback
+#define schema_Overlay_DEFAULT NULL
+
+#define schema_File_FIELDLIST(X, a) \
+X(a, CALLBACK, SINGULAR, BYTES, fullpath, 1) \
+X(a, STATIC, ONEOF, MESSAGE, (filesystem,overlayfs,filesystem.overlayfs), 2) \
+X(a, STATIC, SINGULAR, UINT32, ino, 3) \
+X(a, STATIC, ONEOF, MESSAGE, (filesystem,socket,filesystem.socket), 4) \
+X(a, STATIC, SINGULAR, UINT64, ctime, 5)
+#define schema_File_CALLBACK pb_default_field_callback
+#define schema_File_DEFAULT NULL
+#define schema_File_filesystem_overlayfs_MSGTYPE schema_Overlay
+#define schema_File_filesystem_socket_MSGTYPE schema_Socket
+
+#define schema_ProcessArguments_FIELDLIST(X, a) \
+X(a, CALLBACK, REPEATED, BYTES, argv, 1) \
+X(a, STATIC, SINGULAR, UINT32, argv_truncated, 2) \
+X(a, CALLBACK, REPEATED, BYTES, envp, 3) \
+X(a, STATIC, SINGULAR, UINT32, envp_truncated, 4)
+#define schema_ProcessArguments_CALLBACK pb_default_field_callback
+#define schema_ProcessArguments_DEFAULT NULL
+
+#define schema_Descriptor_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, mode, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, file, 2)
+#define schema_Descriptor_CALLBACK NULL
+#define schema_Descriptor_DEFAULT NULL
+#define schema_Descriptor_file_MSGTYPE schema_File
+
+#define schema_Streams_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, stdin, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, stdout, 2) \
+X(a, STATIC, OPTIONAL, MESSAGE, stderr, 3)
+#define schema_Streams_CALLBACK NULL
+#define schema_Streams_DEFAULT NULL
+#define schema_Streams_stdin_MSGTYPE schema_Descriptor
+#define schema_Streams_stdout_MSGTYPE schema_Descriptor
+#define schema_Streams_stderr_MSGTYPE schema_Descriptor
+
+#define schema_Process_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT64, creation_timestamp, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, uuid, 2) \
+X(a, STATIC, SINGULAR, UINT32, pid, 3) \
+X(a, STATIC, OPTIONAL, MESSAGE, binary, 4) \
+X(a, STATIC, SINGULAR, UINT32, parent_pid, 5) \
+X(a, CALLBACK, SINGULAR, BYTES, parent_uuid, 6) \
+X(a, STATIC, SINGULAR, UINT64, container_id, 7) \
+X(a, STATIC, SINGULAR, UINT32, container_pid, 8) \
+X(a, STATIC, SINGULAR, UINT32, container_parent_pid, 9) \
+X(a, STATIC, OPTIONAL, MESSAGE, args, 10) \
+X(a, STATIC, OPTIONAL, MESSAGE, streams, 11) \
+X(a, STATIC, SINGULAR, UINT64, exec_session_id, 12)
+#define schema_Process_CALLBACK pb_default_field_callback
+#define schema_Process_DEFAULT NULL
+#define schema_Process_binary_MSGTYPE schema_File
+#define schema_Process_args_MSGTYPE schema_ProcessArguments
+#define schema_Process_streams_MSGTYPE schema_Streams
+
+#define schema_Container_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT64, creation_timestamp, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, pod_namespace, 2) \
+X(a, CALLBACK, SINGULAR, BYTES, pod_name, 3) \
+X(a, STATIC, SINGULAR, UINT64, container_id, 4) \
+X(a, CALLBACK, SINGULAR, BYTES, container_name, 5) \
+X(a, CALLBACK, SINGULAR, BYTES, container_image_uri, 6) \
+X(a, CALLBACK, REPEATED, BYTES, labels, 7) \
+X(a, CALLBACK, SINGULAR, BYTES, init_uuid, 8) \
+X(a, CALLBACK, SINGULAR, BYTES, container_image_id, 9)
+#define schema_Container_CALLBACK pb_default_field_callback
+#define schema_Container_DEFAULT NULL
+
+#define schema_ExecuteEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_ExecuteEvent_CALLBACK NULL
+#define schema_ExecuteEvent_DEFAULT NULL
+#define schema_ExecuteEvent_proc_MSGTYPE schema_Process
+
+#define schema_CloneEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_CloneEvent_CALLBACK NULL
+#define schema_CloneEvent_DEFAULT NULL
+#define schema_CloneEvent_proc_MSGTYPE schema_Process
+
+#define schema_EnumerateProcessEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_EnumerateProcessEvent_CALLBACK NULL
+#define schema_EnumerateProcessEvent_DEFAULT NULL
+#define schema_EnumerateProcessEvent_proc_MSGTYPE schema_Process
+
+#define schema_MemoryExecEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1) \
+X(a, STATIC, SINGULAR, UINT64, prot_exec_timestamp, 2) \
+X(a, STATIC, SINGULAR, UINT64, new_flags, 3) \
+X(a, STATIC, SINGULAR, UINT64, req_flags, 4) \
+X(a, STATIC, SINGULAR, UINT64, old_vm_flags, 5) \
+X(a, STATIC, SINGULAR, UINT64, mmap_flags, 6) \
+X(a, STATIC, OPTIONAL, MESSAGE, mapped_file, 7) \
+X(a, STATIC, SINGULAR, UENUM, action, 8) \
+X(a, STATIC, SINGULAR, UINT64, start_addr, 9) \
+X(a, STATIC, SINGULAR, UINT64, end_addr, 10) \
+X(a, STATIC, SINGULAR, BOOL, is_initial_mmap, 11)
+#define schema_MemoryExecEvent_CALLBACK NULL
+#define schema_MemoryExecEvent_DEFAULT NULL
+#define schema_MemoryExecEvent_proc_MSGTYPE schema_Process
+#define schema_MemoryExecEvent_mapped_file_MSGTYPE schema_File
+
+#define schema_ContainerInfoEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, container, 1)
+#define schema_ContainerInfoEvent_CALLBACK NULL
+#define schema_ContainerInfoEvent_DEFAULT NULL
+#define schema_ContainerInfoEvent_container_MSGTYPE schema_Container
+
+#define schema_ExitEvent_FIELDLIST(X, a) \
+X(a, CALLBACK, SINGULAR, BYTES, process_uuid, 1)
+#define schema_ExitEvent_CALLBACK pb_default_field_callback
+#define schema_ExitEvent_DEFAULT NULL
+
+#define schema_Event_FIELDLIST(X, a) \
+X(a, STATIC, ONEOF, MESSAGE, (event,execute,event.execute), 1) \
+X(a, STATIC, ONEOF, MESSAGE, (event,container,event.container), 2) \
+X(a, STATIC, ONEOF, MESSAGE, (event,exit,event.exit), 3) \
+X(a, STATIC, ONEOF, MESSAGE, (event,memexec,event.memexec), 4) \
+X(a, STATIC, ONEOF, MESSAGE, (event,clone,event.clone), 5) \
+X(a, STATIC, SINGULAR, UINT64, timestamp, 6) \
+X(a, STATIC, ONEOF, MESSAGE, (event,enumproc,event.enumproc), 7)
+#define schema_Event_CALLBACK NULL
+#define schema_Event_DEFAULT NULL
+#define schema_Event_event_execute_MSGTYPE schema_ExecuteEvent
+#define schema_Event_event_container_MSGTYPE schema_ContainerInfoEvent
+#define schema_Event_event_exit_MSGTYPE schema_ExitEvent
+#define schema_Event_event_memexec_MSGTYPE schema_MemoryExecEvent
+#define schema_Event_event_clone_MSGTYPE schema_CloneEvent
+#define schema_Event_event_enumproc_MSGTYPE schema_EnumerateProcessEvent
+
+#define schema_ContainerReport_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, pid, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, container, 2)
+#define schema_ContainerReport_CALLBACK NULL
+#define schema_ContainerReport_DEFAULT NULL
+#define schema_ContainerReport_container_MSGTYPE schema_Container
+
+extern const pb_msgdesc_t schema_SocketIp_msg;
+extern const pb_msgdesc_t schema_Socket_msg;
+extern const pb_msgdesc_t schema_Overlay_msg;
+extern const pb_msgdesc_t schema_File_msg;
+extern const pb_msgdesc_t schema_ProcessArguments_msg;
+extern const pb_msgdesc_t schema_Descriptor_msg;
+extern const pb_msgdesc_t schema_Streams_msg;
+extern const pb_msgdesc_t schema_Process_msg;
+extern const pb_msgdesc_t schema_Container_msg;
+extern const pb_msgdesc_t schema_ExecuteEvent_msg;
+extern const pb_msgdesc_t schema_CloneEvent_msg;
+extern const pb_msgdesc_t schema_EnumerateProcessEvent_msg;
+extern const pb_msgdesc_t schema_MemoryExecEvent_msg;
+extern const pb_msgdesc_t schema_ContainerInfoEvent_msg;
+extern const pb_msgdesc_t schema_ExitEvent_msg;
+extern const pb_msgdesc_t schema_Event_msg;
+extern const pb_msgdesc_t schema_ContainerReport_msg;
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define schema_SocketIp_fields &schema_SocketIp_msg
+#define schema_Socket_fields &schema_Socket_msg
+#define schema_Overlay_fields &schema_Overlay_msg
+#define schema_File_fields &schema_File_msg
+#define schema_ProcessArguments_fields &schema_ProcessArguments_msg
+#define schema_Descriptor_fields &schema_Descriptor_msg
+#define schema_Streams_fields &schema_Streams_msg
+#define schema_Process_fields &schema_Process_msg
+#define schema_Container_fields &schema_Container_msg
+#define schema_ExecuteEvent_fields &schema_ExecuteEvent_msg
+#define schema_CloneEvent_fields &schema_CloneEvent_msg
+#define schema_EnumerateProcessEvent_fields &schema_EnumerateProcessEvent_msg
+#define schema_MemoryExecEvent_fields &schema_MemoryExecEvent_msg
+#define schema_ContainerInfoEvent_fields &schema_ContainerInfoEvent_msg
+#define schema_ExitEvent_fields &schema_ExitEvent_msg
+#define schema_Event_fields &schema_Event_msg
+#define schema_ContainerReport_fields &schema_ContainerReport_msg
+
+/* Maximum encoded size of messages (where known) */
+/* schema_SocketIp_size depends on runtime parameters */
+/* schema_Socket_size depends on runtime parameters */
+/* schema_Overlay_size depends on runtime parameters */
+/* schema_File_size depends on runtime parameters */
+/* schema_ProcessArguments_size depends on runtime parameters */
+/* schema_Descriptor_size depends on runtime parameters */
+/* schema_Streams_size depends on runtime parameters */
+/* schema_Process_size depends on runtime parameters */
+/* schema_Container_size depends on runtime parameters */
+/* schema_ExecuteEvent_size depends on runtime parameters */
+/* schema_CloneEvent_size depends on runtime parameters */
+/* schema_EnumerateProcessEvent_size depends on runtime parameters */
+/* schema_MemoryExecEvent_size depends on runtime parameters */
+/* schema_ContainerInfoEvent_size depends on runtime parameters */
+/* schema_ExitEvent_size depends on runtime parameters */
+/* schema_Event_size depends on runtime parameters */
+/* schema_ContainerReport_size depends on runtime parameters */
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/event.proto b/security/container/protos/event.proto
new file mode 100644
index 0000000..79a604d
--- /dev/null
+++ b/security/container/protos/event.proto
@@ -0,0 +1,152 @@
+syntax = "proto3";
+
+package schema;
+
+message SocketIp {
+ uint32 family = 1; // AF_* for socket type.
+ bytes ip = 2; // ip4 or ip6 address.
+ uint32 port = 3; // port bind or connected.
+}
+
+message Socket {
+ SocketIp local = 1;
+ SocketIp remote = 2; // unset if not connected.
+}
+
+message Overlay {
+ bool lower_layer = 1;
+ bool upper_layer = 2;
+ bytes modified_uuid = 3; // The process who first modified the file.
+}
+
+message File {
+ bytes fullpath = 1;
+ uint32 ino = 3; // inode number.
+ oneof filesystem {
+ Overlay overlayfs = 2;
+ Socket socket = 4;
+ }
+ uint64 ctime = 5;
+}
+
+message ProcessArguments {
+ repeated bytes argv = 1; // process arguments
+ uint32 argv_truncated = 2; // number of characters truncated from argv
+ repeated bytes envp = 3; // process environment variables
+ uint32 envp_truncated = 4; // number of characters truncated from envp
+}
+
+message Descriptor {
+ uint32 mode = 1; // file mode (stat st_mode)
+ File file = 2;
+}
+
+message Streams {
+ Descriptor stdin = 1;
+ Descriptor stdout = 2;
+ Descriptor stderr = 3;
+}
+
+message Process {
+ uint64 creation_timestamp = 1; // Only populated in ExecuteEvent, in ns.
+ bytes uuid = 2;
+ uint32 pid = 3;
+ File binary = 4; // Only populated in ExecuteEvent.
+ uint32 parent_pid = 5;
+ bytes parent_uuid = 6;
+ uint64 container_id = 7; // unique id of process's container
+ uint32 container_pid = 8; // pid inside the container namespace pid
+ uint32 container_parent_pid = 9; // optional
+ ProcessArguments args = 10; // Only populated in ExecuteEvent.
+ Streams streams = 11; // Only populated in ExecuteEvent.
+ uint64 exec_session_id = 12; // identifier set for kubectl exec sessions.
+}
+
+message Container {
+ uint64 creation_timestamp = 1; // container create time in ns
+ bytes pod_namespace = 2;
+ bytes pod_name = 3;
+ uint64 container_id = 4; // unique across lifetime of Node
+ bytes container_name = 5;
+ bytes container_image_uri = 6;
+ repeated bytes labels = 7;
+ bytes init_uuid = 8;
+ bytes container_image_id = 9;
+}
+
+// A binary being executed.
+// e.g., execve()
+message ExecuteEvent {
+ Process proc = 1;
+}
+
+// A process clone is being created. This message means that a cloning operation
+// is being attempted. It may be sent even if fork fails.
+message CloneEvent {
+ Process proc = 1;
+}
+
+// Processes that are enumerated at startup will be sent with this event. There
+// is no distinction from events we would have seen from fork or exec.
+message EnumerateProcessEvent {
+ Process proc = 1;
+}
+
+// Collect information about mmap/mprotect calls with the PROT_EXEC flag set.
+message MemoryExecEvent {
+ Process proc = 1; // The origin process
+ // The timestamp in ns when the memory was set executable
+ uint64 prot_exec_timestamp = 2;
+ // The prot flags granted by the kernel for the operation
+ uint64 new_flags = 3;
+ // The prot flags requested for the mprotect/mmap operation
+ uint64 req_flags = 4;
+ // The vm_flags prior to the mprotect operation, if relevant
+ uint64 old_vm_flags = 5;
+ // The operational flags for the mmap operation, if relevant
+ uint64 mmap_flags = 6;
+ // Derived from the file struct describing the fd being mapped
+ File mapped_file = 7;
+ enum Action {
+ UNDEFINED = 0;
+ MPROTECT = 1;
+ MMAP_FILE = 2;
+ }
+ Action action = 8;
+
+ uint64 start_addr = 9; // The executable memory region start addr
+ uint64 end_addr = 10; // The executable memory region end addr
+ // True if this event is a mmap of the process' binary
+ bool is_initial_mmap = 11;
+}
+
+// Associate the following container information with all processes
+// that have the indicated container_id.
+message ContainerInfoEvent {
+ Container container = 1;
+}
+
+// The process with the indicated pid has exited.
+message ExitEvent {
+ bytes process_uuid = 1;
+}
+
+// Next ID: 8
+message Event {
+ oneof event {
+ ExecuteEvent execute = 1;
+ ContainerInfoEvent container = 2;
+ ExitEvent exit = 3;
+ MemoryExecEvent memexec = 4;
+ CloneEvent clone = 5;
+ EnumerateProcessEvent enumproc = 7;
+ }
+
+ uint64 timestamp = 6; // In nanoseconds
+}
+
+// Message sent by the daemonset to the LSM for container enlightenment.
+message ContainerReport {
+ uint32 pid = 1; // Top pid of the running container.
+ Container container = 2; // Information collected about the container.
+}
diff --git a/security/container/protos/nanopb/LICENSE b/security/container/protos/nanopb/LICENSE
new file mode 100644
index 0000000..a83630a
--- /dev/null
+++ b/security/container/protos/nanopb/LICENSE
@@ -0,0 +1,20 @@
+Copyright (c) 2011 Petteri Aimonen <jpa at nanopb.mail.kapsi.fi>
+
+This software is provided 'as-is', without any express or
+implied warranty. In no event will the authors be held liable
+for any damages arising from the use of this software.
+
+Permission is granted to anyone to use this software for any
+purpose, including commercial applications, and to alter it and
+redistribute it freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you
+ must not claim that you wrote the original software. If you use
+ this software in a product, an acknowledgment in the product
+ documentation would be appreciated but is not required.
+
+2. Altered source versions must be plainly marked as such, and
+ must not be misrepresented as being the original software.
+
+3. This notice may not be removed or altered from any source
+ distribution.
diff --git a/security/container/protos/nanopb/METADATA b/security/container/protos/nanopb/METADATA
new file mode 100644
index 0000000..6b85630
--- /dev/null
+++ b/security/container/protos/nanopb/METADATA
@@ -0,0 +1,23 @@
+name: "nanopb"
+description: "Nanopb is a C library for encoding and decoding protocol buffers."
+
+third_party {
+ url {
+ type: GIT
+ value: "https://github.com/nanopb/nanopb/"
+ }
+ version: "0.4.5"
+ last_upgrade_date: {
+ year: 2021
+ month: 8
+ day: 12
+ }
+ license_type: NOTICE
+ security {
+ category: REVIEWED_AND_SECURE
+ note: "https://buganizer.corp.google.com/u/0/issues/19409596, https://buganizer.corp.google.com/u/0/issues/120506242"
+ tag: "NVD-CPE2.3:cpe:/a:nanopb_project:nanopb"
+ tag: "vuln_reporting:buganizer_component:588910"
+ tag: "vuln_reporting:contact_emails:" # Blunderbuss will assign bugs.
+ }
+}
diff --git a/security/container/protos/nanopb/Makefile b/security/container/protos/nanopb/Makefile
new file mode 100644
index 0000000..b7e15f8
--- /dev/null
+++ b/security/container/protos/nanopb/Makefile
@@ -0,0 +1,7 @@
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb.o
+
+nanopb-y := pb_encode.o pb_decode.o pb_common.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ $(PB_CCFLAGS)
diff --git a/security/container/protos/nanopb/pb.h b/security/container/protos/nanopb/pb.h
new file mode 100644
index 0000000..be7c067
--- /dev/null
+++ b/security/container/protos/nanopb/pb.h
@@ -0,0 +1,875 @@
+/* Common parts of the nanopb library. Most of these are quite low-level
+ * stuff. For the high-level interface, see pb_encode.h and pb_decode.h.
+ */
+
+#ifndef PB_H_INCLUDED
+#define PB_H_INCLUDED
+
+/*****************************************************************
+ * Nanopb compilation time options. You can change these here by *
+ * uncommenting the lines, or on the compiler command line. *
+ *****************************************************************/
+
+/* Enable support for dynamically allocated fields */
+/* #define PB_ENABLE_MALLOC 1 */
+
+/* Define this if your CPU / compiler combination does not support
+ * unaligned memory access to packed structures. */
+/* #define PB_NO_PACKED_STRUCTS 1 */
+
+/* Increase the number of required fields that are tracked.
+ * A compiler warning will tell if you need this. */
+/* #define PB_MAX_REQUIRED_FIELDS 256 */
+
+/* Add support for tag numbers > 65536 and fields larger than 65536 bytes. */
+/* #define PB_FIELD_32BIT 1 */
+
+/* Disable support for error messages in order to save some code space. */
+/* #define PB_NO_ERRMSG 1 */
+
+/* Disable support for custom streams (support only memory buffers). */
+/* #define PB_BUFFER_ONLY 1 */
+
+/* Disable support for 64-bit datatypes, for compilers without int64_t
+ or to save some code space. */
+/* #define PB_WITHOUT_64BIT 1 */
+
+/* Don't encode scalar arrays as packed. This is only to be used when
+ * the decoder on the receiving side cannot process packed scalar arrays.
+ * Such example is older protobuf.js. */
+/* #define PB_ENCODE_ARRAYS_UNPACKED 1 */
+
+/* Enable conversion of doubles to floats for platforms that do not
+ * support 64-bit doubles. Most commonly AVR. */
+/* #define PB_CONVERT_DOUBLE_FLOAT 1 */
+
+/* Check whether incoming strings are valid UTF-8 sequences. Slows down
+ * the string processing slightly and slightly increases code size. */
+/* #define PB_VALIDATE_UTF8 1 */
+
+/******************************************************************
+ * You usually don't need to change anything below this line. *
+ * Feel free to look around and use the defined macros, though. *
+ ******************************************************************/
+
+
+/* Version of the nanopb library. Just in case you want to check it in
+ * your own program. */
+#define NANOPB_VERSION nanopb-0.4.5
+
+/* Include all the system headers needed by nanopb. You will need the
+ * definitions of the following:
+ * - strlen, memcpy, memset functions
+ * - [u]int_least8_t, uint_fast8_t, [u]int_least16_t, [u]int32_t, [u]int64_t
+ * - size_t
+ * - bool
+ *
+ * If you don't have the standard header files, you can instead provide
+ * a custom header that defines or includes all this. In that case,
+ * define PB_SYSTEM_HEADER to the path of this file.
+ */
+#ifdef PB_SYSTEM_HEADER
+#include PB_SYSTEM_HEADER
+#else
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <limits.h>
+
+#ifdef PB_ENABLE_MALLOC
+#include <stdlib.h>
+#endif
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Macro for defining packed structures (compiler dependent).
+ * This just reduces memory requirements, but is not required.
+ */
+#if defined(PB_NO_PACKED_STRUCTS)
+ /* Disable struct packing */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed
+#elif defined(__GNUC__) || defined(__clang__)
+ /* For GCC and clang */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed __attribute__((packed))
+#elif defined(__ICCARM__) || defined(__CC_ARM)
+ /* For IAR ARM and Keil MDK-ARM compilers */
+# define PB_PACKED_STRUCT_START _Pragma("pack(push, 1)")
+# define PB_PACKED_STRUCT_END _Pragma("pack(pop)")
+# define pb_packed
+#elif defined(_MSC_VER) && (_MSC_VER >= 1500)
+ /* For Microsoft Visual C++ */
+# define PB_PACKED_STRUCT_START __pragma(pack(push, 1))
+# define PB_PACKED_STRUCT_END __pragma(pack(pop))
+# define pb_packed
+#else
+ /* Unknown compiler */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed
+#endif
+
+/* Handly macro for suppressing unreferenced-parameter compiler warnings. */
+#ifndef PB_UNUSED
+#define PB_UNUSED(x) (void)(x)
+#endif
+
+/* Harvard-architecture processors may need special attributes for storing
+ * field information in program memory. */
+#ifndef PB_PROGMEM
+#ifdef __AVR__
+#include <avr/pgmspace.h>
+#define PB_PROGMEM PROGMEM
+#define PB_PROGMEM_READU32(x) pgm_read_dword(&x)
+#else
+#define PB_PROGMEM
+#define PB_PROGMEM_READU32(x) (x)
+#endif
+#endif
+
+/* Compile-time assertion, used for checking compatible compilation options.
+ * If this does not work properly on your compiler, use
+ * #define PB_NO_STATIC_ASSERT to disable it.
+ *
+ * But before doing that, check carefully the error message / place where it
+ * comes from to see if the error has a real cause. Unfortunately the error
+ * message is not always very clear to read, but you can see the reason better
+ * in the place where the PB_STATIC_ASSERT macro was called.
+ */
+#ifndef PB_NO_STATIC_ASSERT
+# ifndef PB_STATIC_ASSERT
+# if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
+ /* C11 standard _Static_assert mechanism */
+# define PB_STATIC_ASSERT(COND,MSG) _Static_assert(COND,#MSG);
+# else
+ /* Classic negative-size-array static assert mechanism */
+# define PB_STATIC_ASSERT(COND,MSG) typedef char PB_STATIC_ASSERT_MSG(MSG, __LINE__, __COUNTER__)[(COND)?1:-1];
+# define PB_STATIC_ASSERT_MSG(MSG, LINE, COUNTER) PB_STATIC_ASSERT_MSG_(MSG, LINE, COUNTER)
+# define PB_STATIC_ASSERT_MSG_(MSG, LINE, COUNTER) pb_static_assertion_##MSG##_##LINE##_##COUNTER
+# endif
+# endif
+#else
+ /* Static asserts disabled by PB_NO_STATIC_ASSERT */
+# define PB_STATIC_ASSERT(COND,MSG)
+#endif
+
+/* Number of required fields to keep track of. */
+#ifndef PB_MAX_REQUIRED_FIELDS
+#define PB_MAX_REQUIRED_FIELDS 64
+#endif
+
+#if PB_MAX_REQUIRED_FIELDS < 64
+#error You should not lower PB_MAX_REQUIRED_FIELDS from the default value (64).
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Cannot use doubles without 64-bit types */
+#undef PB_CONVERT_DOUBLE_FLOAT
+#endif
+#endif
+
+/* List of possible field types. These are used in the autogenerated code.
+ * Least-significant 4 bits tell the scalar type
+ * Most-significant 4 bits specify repeated/required/packed etc.
+ */
+
+typedef uint_least8_t pb_type_t;
+
+/**** Field data types ****/
+
+/* Numeric types */
+#define PB_LTYPE_BOOL 0x00U /* bool */
+#define PB_LTYPE_VARINT 0x01U /* int32, int64, enum, bool */
+#define PB_LTYPE_UVARINT 0x02U /* uint32, uint64 */
+#define PB_LTYPE_SVARINT 0x03U /* sint32, sint64 */
+#define PB_LTYPE_FIXED32 0x04U /* fixed32, sfixed32, float */
+#define PB_LTYPE_FIXED64 0x05U /* fixed64, sfixed64, double */
+
+/* Marker for last packable field type. */
+#define PB_LTYPE_LAST_PACKABLE 0x05U
+
+/* Byte array with pre-allocated buffer.
+ * data_size is the length of the allocated PB_BYTES_ARRAY structure. */
+#define PB_LTYPE_BYTES 0x06U
+
+/* String with pre-allocated buffer.
+ * data_size is the maximum length. */
+#define PB_LTYPE_STRING 0x07U
+
+/* Submessage
+ * submsg_fields is pointer to field descriptions */
+#define PB_LTYPE_SUBMESSAGE 0x08U
+
+/* Submessage with pre-decoding callback
+ * The pre-decoding callback is stored as pb_callback_t right before pSize.
+ * submsg_fields is pointer to field descriptions */
+#define PB_LTYPE_SUBMSG_W_CB 0x09U
+
+/* Extension pseudo-field
+ * The field contains a pointer to pb_extension_t */
+#define PB_LTYPE_EXTENSION 0x0AU
+
+/* Byte array with inline, pre-allocated byffer.
+ * data_size is the length of the inline, allocated buffer.
+ * This differs from PB_LTYPE_BYTES by defining the element as
+ * pb_byte_t[data_size] rather than pb_bytes_array_t. */
+#define PB_LTYPE_FIXED_LENGTH_BYTES 0x0BU
+
+/* Number of declared LTYPES */
+#define PB_LTYPES_COUNT 0x0CU
+#define PB_LTYPE_MASK 0x0FU
+
+/**** Field repetition rules ****/
+
+#define PB_HTYPE_REQUIRED 0x00U
+#define PB_HTYPE_OPTIONAL 0x10U
+#define PB_HTYPE_SINGULAR 0x10U
+#define PB_HTYPE_REPEATED 0x20U
+#define PB_HTYPE_FIXARRAY 0x20U
+#define PB_HTYPE_ONEOF 0x30U
+#define PB_HTYPE_MASK 0x30U
+
+/**** Field allocation types ****/
+
+#define PB_ATYPE_STATIC 0x00U
+#define PB_ATYPE_POINTER 0x80U
+#define PB_ATYPE_CALLBACK 0x40U
+#define PB_ATYPE_MASK 0xC0U
+
+#define PB_ATYPE(x) ((x) & PB_ATYPE_MASK)
+#define PB_HTYPE(x) ((x) & PB_HTYPE_MASK)
+#define PB_LTYPE(x) ((x) & PB_LTYPE_MASK)
+#define PB_LTYPE_IS_SUBMSG(x) (PB_LTYPE(x) == PB_LTYPE_SUBMESSAGE || \
+ PB_LTYPE(x) == PB_LTYPE_SUBMSG_W_CB)
+
+/* Data type used for storing sizes of struct fields
+ * and array counts.
+ */
+#if defined(PB_FIELD_32BIT)
+ typedef uint32_t pb_size_t;
+ typedef int32_t pb_ssize_t;
+#else
+ typedef uint_least16_t pb_size_t;
+ typedef int_least16_t pb_ssize_t;
+#endif
+#define PB_SIZE_MAX ((pb_size_t)-1)
+
+/* Data type for storing encoded data and other byte streams.
+ * This typedef exists to support platforms where uint8_t does not exist.
+ * You can regard it as equivalent on uint8_t on other platforms.
+ */
+typedef uint_least8_t pb_byte_t;
+
+/* Forward declaration of struct types */
+typedef struct pb_istream_s pb_istream_t;
+typedef struct pb_ostream_s pb_ostream_t;
+typedef struct pb_field_iter_s pb_field_iter_t;
+
+/* This structure is used in auto-generated constants
+ * to specify struct fields.
+ */
+typedef struct pb_msgdesc_s pb_msgdesc_t;
+struct pb_msgdesc_s {
+ const uint32_t *field_info;
+ const pb_msgdesc_t * const * submsg_info;
+ const pb_byte_t *default_value;
+
+ bool (*field_callback)(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_iter_t *field);
+
+ pb_size_t field_count;
+ pb_size_t required_field_count;
+ pb_size_t largest_tag;
+};
+
+/* Iterator for message descriptor */
+struct pb_field_iter_s {
+ const pb_msgdesc_t *descriptor; /* Pointer to message descriptor constant */
+ void *message; /* Pointer to start of the structure */
+
+ pb_size_t index; /* Index of the field */
+ pb_size_t field_info_index; /* Index to descriptor->field_info array */
+ pb_size_t required_field_index; /* Index that counts only the required fields */
+ pb_size_t submessage_index; /* Index that counts only submessages */
+
+ pb_size_t tag; /* Tag of current field */
+ pb_size_t data_size; /* sizeof() of a single item */
+ pb_size_t array_size; /* Number of array entries */
+ pb_type_t type; /* Type of current field */
+
+ void *pField; /* Pointer to current field in struct */
+ void *pData; /* Pointer to current data contents. Different than pField for arrays and pointers. */
+ void *pSize; /* Pointer to count/has field */
+
+ const pb_msgdesc_t *submsg_desc; /* For submessage fields, pointer to field descriptor for the submessage. */
+};
+
+/* For compatibility with legacy code */
+typedef pb_field_iter_t pb_field_t;
+
+/* Make sure that the standard integer types are of the expected sizes.
+ * Otherwise fixed32/fixed64 fields can break.
+ *
+ * If you get errors here, it probably means that your stdint.h is not
+ * correct for your platform.
+ */
+#ifndef PB_WITHOUT_64BIT
+PB_STATIC_ASSERT(sizeof(int64_t) == 2 * sizeof(int32_t), INT64_T_WRONG_SIZE)
+PB_STATIC_ASSERT(sizeof(uint64_t) == 2 * sizeof(uint32_t), UINT64_T_WRONG_SIZE)
+#endif
+
+/* This structure is used for 'bytes' arrays.
+ * It has the number of bytes in the beginning, and after that an array.
+ * Note that actual structs used will have a different length of bytes array.
+ */
+#define PB_BYTES_ARRAY_T(n) struct { pb_size_t size; pb_byte_t bytes[n]; }
+#define PB_BYTES_ARRAY_T_ALLOCSIZE(n) ((size_t)n + offsetof(pb_bytes_array_t, bytes))
+
+struct pb_bytes_array_s {
+ pb_size_t size;
+ pb_byte_t bytes[1];
+};
+typedef struct pb_bytes_array_s pb_bytes_array_t;
+
+/* This structure is used for giving the callback function.
+ * It is stored in the message structure and filled in by the method that
+ * calls pb_decode.
+ *
+ * The decoding callback will be given a limited-length stream
+ * If the wire type was string, the length is the length of the string.
+ * If the wire type was a varint/fixed32/fixed64, the length is the length
+ * of the actual value.
+ * The function may be called multiple times (especially for repeated types,
+ * but also otherwise if the message happens to contain the field multiple
+ * times.)
+ *
+ * The encoding callback will receive the actual output stream.
+ * It should write all the data in one call, including the field tag and
+ * wire type. It can write multiple fields.
+ *
+ * The callback can be null if you want to skip a field.
+ */
+typedef struct pb_callback_s pb_callback_t;
+struct pb_callback_s {
+ /* Callback functions receive a pointer to the arg field.
+ * You can access the value of the field as *arg, and modify it if needed.
+ */
+ union {
+ bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);
+ bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);
+ } funcs;
+
+ /* Free arg for use by callback */
+ void *arg;
+};
+
+extern bool pb_default_field_callback(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_t *field);
+
+/* Wire types. Library user needs these only in encoder callbacks. */
+typedef enum {
+ PB_WT_VARINT = 0,
+ PB_WT_64BIT = 1,
+ PB_WT_STRING = 2,
+ PB_WT_32BIT = 5
+} pb_wire_type_t;
+
+/* Structure for defining the handling of unknown/extension fields.
+ * Usually the pb_extension_type_t structure is automatically generated,
+ * while the pb_extension_t structure is created by the user. However,
+ * if you want to catch all unknown fields, you can also create a custom
+ * pb_extension_type_t with your own callback.
+ */
+typedef struct pb_extension_type_s pb_extension_type_t;
+typedef struct pb_extension_s pb_extension_t;
+struct pb_extension_type_s {
+ /* Called for each unknown field in the message.
+ * If you handle the field, read off all of its data and return true.
+ * If you do not handle the field, do not read anything and return true.
+ * If you run into an error, return false.
+ * Set to NULL for default handler.
+ */
+ bool (*decode)(pb_istream_t *stream, pb_extension_t *extension,
+ uint32_t tag, pb_wire_type_t wire_type);
+
+ /* Called once after all regular fields have been encoded.
+ * If you have something to write, do so and return true.
+ * If you do not have anything to write, just return true.
+ * If you run into an error, return false.
+ * Set to NULL for default handler.
+ */
+ bool (*encode)(pb_ostream_t *stream, const pb_extension_t *extension);
+
+ /* Free field for use by the callback. */
+ const void *arg;
+};
+
+struct pb_extension_s {
+ /* Type describing the extension field. Usually you'll initialize
+ * this to a pointer to the automatically generated structure. */
+ const pb_extension_type_t *type;
+
+ /* Destination for the decoded data. This must match the datatype
+ * of the extension field. */
+ void *dest;
+
+ /* Pointer to the next extension handler, or NULL.
+ * If this extension does not match a field, the next handler is
+ * automatically called. */
+ pb_extension_t *next;
+
+ /* The decoder sets this to true if the extension was found.
+ * Ignored for encoding. */
+ bool found;
+};
+
+#define pb_extension_init_zero {NULL,NULL,NULL,false}
+
+/* Memory allocation functions to use. You can define pb_realloc and
+ * pb_free to custom functions if you want. */
+#ifdef PB_ENABLE_MALLOC
+# ifndef pb_realloc
+# define pb_realloc(ptr, size) realloc(ptr, size)
+# endif
+# ifndef pb_free
+# define pb_free(ptr) free(ptr)
+# endif
+#endif
+
+/* This is used to inform about need to regenerate .pb.h/.pb.c files. */
+#define PB_PROTO_HEADER_VERSION 40
+
+/* These macros are used to declare pb_field_t's in the constant array. */
+/* Size of a structure member, in bytes. */
+#define pb_membersize(st, m) (sizeof ((st*)0)->m)
+/* Number of entries in an array. */
+#define pb_arraysize(st, m) (pb_membersize(st, m) / pb_membersize(st, m[0]))
+/* Delta from start of one member to the start of another member. */
+#define pb_delta(st, m1, m2) ((int)offsetof(st, m1) - (int)offsetof(st, m2))
+
+/* Force expansion of macro value */
+#define PB_EXPAND(x) x
+
+/* Binding of a message field set into a specific structure */
+#define PB_BIND(msgname, structname, width) \
+ const uint32_t structname ## _field_info[] PB_PROGMEM = \
+ { \
+ msgname ## _FIELDLIST(PB_GEN_FIELD_INFO_ ## width, structname) \
+ 0 \
+ }; \
+ const pb_msgdesc_t* const structname ## _submsg_info[] = \
+ { \
+ msgname ## _FIELDLIST(PB_GEN_SUBMSG_INFO, structname) \
+ NULL \
+ }; \
+ const pb_msgdesc_t structname ## _msg = \
+ { \
+ structname ## _field_info, \
+ structname ## _submsg_info, \
+ msgname ## _DEFAULT, \
+ msgname ## _CALLBACK, \
+ 0 msgname ## _FIELDLIST(PB_GEN_FIELD_COUNT, structname), \
+ 0 msgname ## _FIELDLIST(PB_GEN_REQ_FIELD_COUNT, structname), \
+ 0 msgname ## _FIELDLIST(PB_GEN_LARGEST_TAG, structname), \
+ }; \
+ msgname ## _FIELDLIST(PB_GEN_FIELD_INFO_ASSERT_ ## width, structname)
+
+#define PB_GEN_FIELD_COUNT(structname, atype, htype, ltype, fieldname, tag) +1
+#define PB_GEN_REQ_FIELD_COUNT(structname, atype, htype, ltype, fieldname, tag) \
+ + (PB_HTYPE_ ## htype == PB_HTYPE_REQUIRED)
+#define PB_GEN_LARGEST_TAG(structname, atype, htype, ltype, fieldname, tag) \
+ * 0 + tag
+
+/* X-macro for generating the entries in struct_field_info[] array. */
+#define PB_GEN_FIELD_INFO_1(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_1(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_2(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_2(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_4(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_4(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_8(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_8(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_AUTO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_AUTO2(PB_FIELDINFO_WIDTH_AUTO(_PB_ATYPE_ ## atype, _PB_HTYPE_ ## htype, _PB_LTYPE_ ## ltype), \
+ tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_FIELDINFO_AUTO2(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_FIELDINFO_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ ## width(tag, type, data_offset, data_size, size_offset, array_size)
+
+/* X-macro for generating asserts that entries fit in struct_field_info[] array.
+ * The structure of macros here must match the structure above in PB_GEN_FIELD_INFO_x(),
+ * but it is not easily reused because of how macro substitutions work. */
+#define PB_GEN_FIELD_INFO_ASSERT_1(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_1(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_2(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_2(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_4(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_4(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_8(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_8(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_AUTO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_AUTO2(PB_FIELDINFO_WIDTH_AUTO(_PB_ATYPE_ ## atype, _PB_HTYPE_ ## htype, _PB_LTYPE_ ## ltype), \
+ tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_FIELDINFO_ASSERT_AUTO2(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ASSERT_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_FIELDINFO_ASSERT_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ASSERT_ ## width(tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_DATA_OFFSET_STATIC(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DATA_OFFSET_POINTER(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DATA_OFFSET_CALLBACK(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DO_PB_HTYPE_REQUIRED(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_SINGULAR(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_ONEOF(structname, fieldname) offsetof(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DO_PB_HTYPE_OPTIONAL(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_REPEATED(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_FIXARRAY(structname, fieldname) offsetof(structname, fieldname)
+
+#define PB_SIZE_OFFSET_STATIC(htype, structname, fieldname) PB_SO ## htype(structname, fieldname)
+#define PB_SIZE_OFFSET_POINTER(htype, structname, fieldname) PB_SO_PTR ## htype(structname, fieldname)
+#define PB_SIZE_OFFSET_CALLBACK(htype, structname, fieldname) PB_SO_CB ## htype(structname, fieldname)
+#define PB_SO_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF2(structname, PB_ONEOF_NAME(FULL, fieldname), PB_ONEOF_NAME(UNION, fieldname))
+#define PB_SO_PB_HTYPE_ONEOF2(structname, fullname, unionname) PB_SO_PB_HTYPE_ONEOF3(structname, fullname, unionname)
+#define PB_SO_PB_HTYPE_ONEOF3(structname, fullname, unionname) pb_delta(structname, fullname, which_ ## unionname)
+#define PB_SO_PB_HTYPE_OPTIONAL(structname, fieldname) pb_delta(structname, fieldname, has_ ## fieldname)
+#define PB_SO_PB_HTYPE_REPEATED(structname, fieldname) pb_delta(structname, fieldname, fieldname ## _count)
+#define PB_SO_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF(structname, fieldname)
+#define PB_SO_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_REPEATED(structname, fieldname) PB_SO_PB_HTYPE_REPEATED(structname, fieldname)
+#define PB_SO_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF(structname, fieldname)
+#define PB_SO_CB_PB_HTYPE_OPTIONAL(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_REPEATED(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+
+#define PB_ARRAY_SIZE_STATIC(htype, structname, fieldname) PB_AS ## htype(structname, fieldname)
+#define PB_ARRAY_SIZE_POINTER(htype, structname, fieldname) PB_AS_PTR ## htype(structname, fieldname)
+#define PB_ARRAY_SIZE_CALLBACK(htype, structname, fieldname) 1
+#define PB_AS_PB_HTYPE_REQUIRED(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_SINGULAR(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_OPTIONAL(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_ONEOF(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_REPEATED(structname, fieldname) pb_arraysize(structname, fieldname)
+#define PB_AS_PB_HTYPE_FIXARRAY(structname, fieldname) pb_arraysize(structname, fieldname)
+#define PB_AS_PTR_PB_HTYPE_REQUIRED(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_SINGULAR(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_ONEOF(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_REPEATED(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) pb_arraysize(structname, fieldname[0])
+
+#define PB_DATA_SIZE_STATIC(htype, structname, fieldname) PB_DS ## htype(structname, fieldname)
+#define PB_DATA_SIZE_POINTER(htype, structname, fieldname) PB_DS_PTR ## htype(structname, fieldname)
+#define PB_DATA_SIZE_CALLBACK(htype, structname, fieldname) PB_DS_CB ## htype(structname, fieldname)
+#define PB_DS_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DS_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname)[0])
+#define PB_DS_PTR_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname[0][0])
+#define PB_DS_CB_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DS_CB_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname)
+
+#define PB_ONEOF_NAME(type, tuple) PB_EXPAND(PB_ONEOF_NAME_ ## type tuple)
+#define PB_ONEOF_NAME_UNION(unionname,membername,fullname) unionname
+#define PB_ONEOF_NAME_MEMBER(unionname,membername,fullname) membername
+#define PB_ONEOF_NAME_FULL(unionname,membername,fullname) fullname
+
+#define PB_GEN_SUBMSG_INFO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_SUBMSG_INFO_ ## htype(_PB_LTYPE_ ## ltype, structname, fieldname)
+
+#define PB_SUBMSG_INFO_REQUIRED(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_SINGULAR(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_OPTIONAL(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_ONEOF(ltype, structname, fieldname) PB_SUBMSG_INFO_ONEOF2(ltype, structname, PB_ONEOF_NAME(UNION, fieldname), PB_ONEOF_NAME(MEMBER, fieldname))
+#define PB_SUBMSG_INFO_ONEOF2(ltype, structname, unionname, membername) PB_SUBMSG_INFO_ONEOF3(ltype, structname, unionname, membername)
+#define PB_SUBMSG_INFO_ONEOF3(ltype, structname, unionname, membername) PB_SI ## ltype(structname ## _ ## unionname ## _ ## membername ## _MSGTYPE)
+#define PB_SUBMSG_INFO_REPEATED(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_FIXARRAY(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SI_PB_LTYPE_BOOL(t)
+#define PB_SI_PB_LTYPE_BYTES(t)
+#define PB_SI_PB_LTYPE_DOUBLE(t)
+#define PB_SI_PB_LTYPE_ENUM(t)
+#define PB_SI_PB_LTYPE_UENUM(t)
+#define PB_SI_PB_LTYPE_FIXED32(t)
+#define PB_SI_PB_LTYPE_FIXED64(t)
+#define PB_SI_PB_LTYPE_FLOAT(t)
+#define PB_SI_PB_LTYPE_INT32(t)
+#define PB_SI_PB_LTYPE_INT64(t)
+#define PB_SI_PB_LTYPE_MESSAGE(t) PB_SUBMSG_DESCRIPTOR(t)
+#define PB_SI_PB_LTYPE_MSG_W_CB(t) PB_SUBMSG_DESCRIPTOR(t)
+#define PB_SI_PB_LTYPE_SFIXED32(t)
+#define PB_SI_PB_LTYPE_SFIXED64(t)
+#define PB_SI_PB_LTYPE_SINT32(t)
+#define PB_SI_PB_LTYPE_SINT64(t)
+#define PB_SI_PB_LTYPE_STRING(t)
+#define PB_SI_PB_LTYPE_UINT32(t)
+#define PB_SI_PB_LTYPE_UINT64(t)
+#define PB_SI_PB_LTYPE_EXTENSION(t)
+#define PB_SI_PB_LTYPE_FIXED_LENGTH_BYTES(t)
+#define PB_SUBMSG_DESCRIPTOR(t) &(t ## _msg),
+
+/* The field descriptors use a variable width format, with width of either
+ * 1, 2, 4 or 8 of 32-bit words. The two lowest bytes of the first byte always
+ * encode the descriptor size, 6 lowest bits of field tag number, and 8 bits
+ * of the field type.
+ *
+ * Descriptor size is encoded as 0 = 1 word, 1 = 2 words, 2 = 4 words, 3 = 8 words.
+ *
+ * Formats, listed starting with the least significant bit of the first word.
+ * 1 word: [2-bit len] [6-bit tag] [8-bit type] [8-bit data_offset] [4-bit size_offset] [4-bit data_size]
+ *
+ * 2 words: [2-bit len] [6-bit tag] [8-bit type] [12-bit array_size] [4-bit size_offset]
+ * [16-bit data_offset] [12-bit data_size] [4-bit tag>>6]
+ *
+ * 4 words: [2-bit len] [6-bit tag] [8-bit type] [16-bit array_size]
+ * [8-bit size_offset] [24-bit tag>>6]
+ * [32-bit data_offset]
+ * [32-bit data_size]
+ *
+ * 8 words: [2-bit len] [6-bit tag] [8-bit type] [16-bit reserved]
+ * [8-bit size_offset] [24-bit tag>>6]
+ * [32-bit data_offset]
+ * [32-bit data_size]
+ * [32-bit array_size]
+ * [32-bit reserved]
+ * [32-bit reserved]
+ * [32-bit reserved]
+ */
+
+#define PB_FIELDINFO_1(tag, type, data_offset, data_size, size_offset, array_size) \
+ (0 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(data_offset) & 0xFF) << 16) | \
+ (((uint32_t)(size_offset) & 0x0F) << 24) | (((uint32_t)(data_size) & 0x0F) << 28)),
+
+#define PB_FIELDINFO_2(tag, type, data_offset, data_size, size_offset, array_size) \
+ (1 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(array_size) & 0xFFF) << 16) | (((uint32_t)(size_offset) & 0x0F) << 28)), \
+ (((uint32_t)(data_offset) & 0xFFFF) | (((uint32_t)(data_size) & 0xFFF) << 16) | (((uint32_t)(tag) & 0x3c0) << 22)),
+
+#define PB_FIELDINFO_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ (2 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(array_size) & 0xFFFF) << 16)), \
+ ((uint32_t)(int_least8_t)(size_offset) | (((uint32_t)(tag) << 2) & 0xFFFFFF00)), \
+ (data_offset), (data_size),
+
+#define PB_FIELDINFO_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ (3 | (((tag) << 2) & 0xFF) | ((type) << 8)), \
+ ((uint32_t)(int_least8_t)(size_offset) | (((uint32_t)(tag) << 2) & 0xFFFFFF00)), \
+ (data_offset), (data_size), (array_size), 0, 0, 0,
+
+/* These assertions verify that the field information fits in the allocated space.
+ * The generator tries to automatically determine the correct width that can fit all
+ * data associated with a message. These asserts will fail only if there has been a
+ * problem in the automatic logic - this may be worth reporting as a bug. As a workaround,
+ * you can increase the descriptor width by defining PB_FIELDINFO_WIDTH or by setting
+ * descriptorsize option in .options file.
+ */
+#define PB_FITS(value,bits) ((uint32_t)(value) < ((uint32_t)1<<bits))
+#define PB_FIELDINFO_ASSERT_1(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,6) && PB_FITS(data_offset,8) && PB_FITS(size_offset,4) && PB_FITS(data_size,4) && PB_FITS(array_size,1), FIELDINFO_DOES_NOT_FIT_width1_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_2(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,10) && PB_FITS(data_offset,16) && PB_FITS(size_offset,4) && PB_FITS(data_size,12) && PB_FITS(array_size,12), FIELDINFO_DOES_NOT_FIT_width2_field ## tag)
+
+#ifndef PB_FIELD_32BIT
+/* Maximum field sizes are still 16-bit if pb_size_t is 16-bit */
+#define PB_FIELDINFO_ASSERT_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,16) && PB_FITS(data_offset,16) && PB_FITS((int_least8_t)size_offset,8) && PB_FITS(data_size,16) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width4_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,16) && PB_FITS(data_offset,16) && PB_FITS((int_least8_t)size_offset,8) && PB_FITS(data_size,16) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width8_field ## tag)
+#else
+/* Up to 32-bit fields supported.
+ * Note that the checks are against 31 bits to avoid compiler warnings about shift wider than type in the test.
+ * I expect that there is no reasonable use for >2GB messages with nanopb anyway.
+ */
+#define PB_FIELDINFO_ASSERT_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,30) && PB_FITS(data_offset,31) && PB_FITS(size_offset,8) && PB_FITS(data_size,31) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width4_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,30) && PB_FITS(data_offset,31) && PB_FITS(size_offset,8) && PB_FITS(data_size,31) && PB_FITS(array_size,31), FIELDINFO_DOES_NOT_FIT_width8_field ## tag)
+#endif
+
+
+/* Automatic picking of FIELDINFO width:
+ * Uses width 1 when possible, otherwise resorts to width 2.
+ * This is used when PB_BIND() is called with "AUTO" as the argument.
+ * The generator will give explicit size argument when it knows that a message
+ * structure grows beyond 1-word format limits.
+ */
+#define PB_FIELDINFO_WIDTH_AUTO(atype, htype, ltype) PB_FI_WIDTH ## atype(htype, ltype)
+#define PB_FI_WIDTH_PB_ATYPE_STATIC(htype, ltype) PB_FI_WIDTH ## htype(ltype)
+#define PB_FI_WIDTH_PB_ATYPE_POINTER(htype, ltype) PB_FI_WIDTH ## htype(ltype)
+#define PB_FI_WIDTH_PB_ATYPE_CALLBACK(htype, ltype) 2
+#define PB_FI_WIDTH_PB_HTYPE_REQUIRED(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_SINGULAR(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_OPTIONAL(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_ONEOF(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_REPEATED(ltype) 2
+#define PB_FI_WIDTH_PB_HTYPE_FIXARRAY(ltype) 2
+#define PB_FI_WIDTH_PB_LTYPE_BOOL 1
+#define PB_FI_WIDTH_PB_LTYPE_BYTES 2
+#define PB_FI_WIDTH_PB_LTYPE_DOUBLE 1
+#define PB_FI_WIDTH_PB_LTYPE_ENUM 1
+#define PB_FI_WIDTH_PB_LTYPE_UENUM 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED32 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED64 1
+#define PB_FI_WIDTH_PB_LTYPE_FLOAT 1
+#define PB_FI_WIDTH_PB_LTYPE_INT32 1
+#define PB_FI_WIDTH_PB_LTYPE_INT64 1
+#define PB_FI_WIDTH_PB_LTYPE_MESSAGE 2
+#define PB_FI_WIDTH_PB_LTYPE_MSG_W_CB 2
+#define PB_FI_WIDTH_PB_LTYPE_SFIXED32 1
+#define PB_FI_WIDTH_PB_LTYPE_SFIXED64 1
+#define PB_FI_WIDTH_PB_LTYPE_SINT32 1
+#define PB_FI_WIDTH_PB_LTYPE_SINT64 1
+#define PB_FI_WIDTH_PB_LTYPE_STRING 2
+#define PB_FI_WIDTH_PB_LTYPE_UINT32 1
+#define PB_FI_WIDTH_PB_LTYPE_UINT64 1
+#define PB_FI_WIDTH_PB_LTYPE_EXTENSION 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED_LENGTH_BYTES 2
+
+/* The mapping from protobuf types to LTYPEs is done using these macros. */
+#define PB_LTYPE_MAP_BOOL PB_LTYPE_BOOL
+#define PB_LTYPE_MAP_BYTES PB_LTYPE_BYTES
+#define PB_LTYPE_MAP_DOUBLE PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_ENUM PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_UENUM PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_FIXED32 PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_FIXED64 PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_FLOAT PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_INT32 PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_INT64 PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_MESSAGE PB_LTYPE_SUBMESSAGE
+#define PB_LTYPE_MAP_MSG_W_CB PB_LTYPE_SUBMSG_W_CB
+#define PB_LTYPE_MAP_SFIXED32 PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_SFIXED64 PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_SINT32 PB_LTYPE_SVARINT
+#define PB_LTYPE_MAP_SINT64 PB_LTYPE_SVARINT
+#define PB_LTYPE_MAP_STRING PB_LTYPE_STRING
+#define PB_LTYPE_MAP_UINT32 PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_UINT64 PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_EXTENSION PB_LTYPE_EXTENSION
+#define PB_LTYPE_MAP_FIXED_LENGTH_BYTES PB_LTYPE_FIXED_LENGTH_BYTES
+
+/* These macros are used for giving out error messages.
+ * They are mostly a debugging aid; the main error information
+ * is the true/false return value from functions.
+ * Some code space can be saved by disabling the error
+ * messages if not used.
+ *
+ * PB_SET_ERROR() sets the error message if none has been set yet.
+ * msg must be a constant string literal.
+ * PB_GET_ERROR() always returns a pointer to a string.
+ * PB_RETURN_ERROR() sets the error and returns false from current
+ * function.
+ */
+#ifdef PB_NO_ERRMSG
+#define PB_SET_ERROR(stream, msg) PB_UNUSED(stream)
+#define PB_GET_ERROR(stream) "(errmsg disabled)"
+#else
+#define PB_SET_ERROR(stream, msg) (stream->errmsg = (stream)->errmsg ? (stream)->errmsg : (msg))
+#define PB_GET_ERROR(stream) ((stream)->errmsg ? (stream)->errmsg : "(none)")
+#endif
+
+#define PB_RETURN_ERROR(stream, msg) return PB_SET_ERROR(stream, msg), false
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#ifdef __cplusplus
+#if __cplusplus >= 201103L
+#define PB_CONSTEXPR constexpr
+#else // __cplusplus >= 201103L
+#define PB_CONSTEXPR
+#endif // __cplusplus >= 201103L
+
+#if __cplusplus >= 201703L
+#define PB_INLINE_CONSTEXPR inline constexpr
+#else // __cplusplus >= 201703L
+#define PB_INLINE_CONSTEXPR PB_CONSTEXPR
+#endif // __cplusplus >= 201703L
+
+namespace nanopb {
+// Each type will be partially specialized by the generator.
+template <typename GenMessageT> struct MessageDescriptor;
+} // namespace nanopb
+#endif /* __cplusplus */
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_common.c b/security/container/protos/nanopb/pb_common.c
new file mode 100644
index 0000000..6aee76b
--- /dev/null
+++ b/security/container/protos/nanopb/pb_common.c
@@ -0,0 +1,388 @@
+/* pb_common.c: Common support functions for pb_encode.c and pb_decode.c.
+ *
+ * 2014 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+#include "pb_common.h"
+
+static bool load_descriptor_values(pb_field_iter_t *iter)
+{
+ uint32_t word0;
+ uint32_t data_offset;
+ int_least8_t size_offset;
+
+ if (iter->index >= iter->descriptor->field_count)
+ return false;
+
+ word0 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+ iter->type = (pb_type_t)((word0 >> 8) & 0xFF);
+
+ switch(word0 & 3)
+ {
+ case 0: {
+ /* 1-word format */
+ iter->array_size = 1;
+ iter->tag = (pb_size_t)((word0 >> 2) & 0x3F);
+ size_offset = (int_least8_t)((word0 >> 24) & 0x0F);
+ data_offset = (word0 >> 16) & 0xFF;
+ iter->data_size = (pb_size_t)((word0 >> 28) & 0x0F);
+ break;
+ }
+
+ case 1: {
+ /* 2-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+
+ iter->array_size = (pb_size_t)((word0 >> 16) & 0x0FFF);
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 28) << 6));
+ size_offset = (int_least8_t)((word0 >> 28) & 0x0F);
+ data_offset = word1 & 0xFFFF;
+ iter->data_size = (pb_size_t)((word1 >> 16) & 0x0FFF);
+ break;
+ }
+
+ case 2: {
+ /* 4-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+ uint32_t word2 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 2]);
+ uint32_t word3 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 3]);
+
+ iter->array_size = (pb_size_t)(word0 >> 16);
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 8) << 6));
+ size_offset = (int_least8_t)(word1 & 0xFF);
+ data_offset = word2;
+ iter->data_size = (pb_size_t)word3;
+ break;
+ }
+
+ default: {
+ /* 8-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+ uint32_t word2 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 2]);
+ uint32_t word3 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 3]);
+ uint32_t word4 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 4]);
+
+ iter->array_size = (pb_size_t)word4;
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 8) << 6));
+ size_offset = (int_least8_t)(word1 & 0xFF);
+ data_offset = word2;
+ iter->data_size = (pb_size_t)word3;
+ break;
+ }
+ }
+
+ if (!iter->message)
+ {
+ /* Avoid doing arithmetic on null pointers, it is undefined */
+ iter->pField = NULL;
+ iter->pSize = NULL;
+ }
+ else
+ {
+ iter->pField = (char*)iter->message + data_offset;
+
+ if (size_offset)
+ {
+ iter->pSize = (char*)iter->pField - size_offset;
+ }
+ else if (PB_HTYPE(iter->type) == PB_HTYPE_REPEATED &&
+ (PB_ATYPE(iter->type) == PB_ATYPE_STATIC ||
+ PB_ATYPE(iter->type) == PB_ATYPE_POINTER))
+ {
+ /* Fixed count array */
+ iter->pSize = &iter->array_size;
+ }
+ else
+ {
+ iter->pSize = NULL;
+ }
+
+ if (PB_ATYPE(iter->type) == PB_ATYPE_POINTER && iter->pField != NULL)
+ {
+ iter->pData = *(void**)iter->pField;
+ }
+ else
+ {
+ iter->pData = iter->pField;
+ }
+ }
+
+ if (PB_LTYPE_IS_SUBMSG(iter->type))
+ {
+ iter->submsg_desc = iter->descriptor->submsg_info[iter->submessage_index];
+ }
+ else
+ {
+ iter->submsg_desc = NULL;
+ }
+
+ return true;
+}
+
+static void advance_iterator(pb_field_iter_t *iter)
+{
+ iter->index++;
+
+ if (iter->index >= iter->descriptor->field_count)
+ {
+ /* Restart */
+ iter->index = 0;
+ iter->field_info_index = 0;
+ iter->submessage_index = 0;
+ iter->required_field_index = 0;
+ }
+ else
+ {
+ /* Increment indexes based on previous field type.
+ * All field info formats have the following fields:
+ * - lowest 2 bits tell the amount of words in the descriptor (2^n words)
+ * - bits 2..7 give the lowest bits of tag number.
+ * - bits 8..15 give the field type.
+ */
+ uint32_t prev_descriptor = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+ pb_type_t prev_type = (prev_descriptor >> 8) & 0xFF;
+ pb_size_t descriptor_len = (pb_size_t)(1 << (prev_descriptor & 3));
+
+ /* Add to fields.
+ * The cast to pb_size_t is needed to avoid -Wconversion warning.
+ * Because the data is is constants from generator, there is no danger of overflow.
+ */
+ iter->field_info_index = (pb_size_t)(iter->field_info_index + descriptor_len);
+ iter->required_field_index = (pb_size_t)(iter->required_field_index + (PB_HTYPE(prev_type) == PB_HTYPE_REQUIRED));
+ iter->submessage_index = (pb_size_t)(iter->submessage_index + PB_LTYPE_IS_SUBMSG(prev_type));
+ }
+}
+
+bool pb_field_iter_begin(pb_field_iter_t *iter, const pb_msgdesc_t *desc, void *message)
+{
+ memset(iter, 0, sizeof(*iter));
+
+ iter->descriptor = desc;
+ iter->message = message;
+
+ return load_descriptor_values(iter);
+}
+
+bool pb_field_iter_begin_extension(pb_field_iter_t *iter, pb_extension_t *extension)
+{
+ const pb_msgdesc_t *msg = (const pb_msgdesc_t*)extension->type->arg;
+ bool status;
+
+ uint32_t word0 = PB_PROGMEM_READU32(msg->field_info[0]);
+ if (PB_ATYPE(word0 >> 8) == PB_ATYPE_POINTER)
+ {
+ /* For pointer extensions, the pointer is stored directly
+ * in the extension structure. This avoids having an extra
+ * indirection. */
+ status = pb_field_iter_begin(iter, msg, &extension->dest);
+ }
+ else
+ {
+ status = pb_field_iter_begin(iter, msg, extension->dest);
+ }
+
+ iter->pSize = &extension->found;
+ return status;
+}
+
+bool pb_field_iter_next(pb_field_iter_t *iter)
+{
+ advance_iterator(iter);
+ (void)load_descriptor_values(iter);
+ return iter->index != 0;
+}
+
+bool pb_field_iter_find(pb_field_iter_t *iter, uint32_t tag)
+{
+ if (iter->tag == tag)
+ {
+ return true; /* Nothing to do, correct field already. */
+ }
+ else if (tag > iter->descriptor->largest_tag)
+ {
+ return false;
+ }
+ else
+ {
+ pb_size_t start = iter->index;
+ uint32_t fieldinfo;
+
+ if (tag < iter->tag)
+ {
+ /* Fields are in tag number order, so we know that tag is between
+ * 0 and our start position. Setting index to end forces
+ * advance_iterator() call below to restart from beginning. */
+ iter->index = iter->descriptor->field_count;
+ }
+
+ do
+ {
+ /* Advance iterator but don't load values yet */
+ advance_iterator(iter);
+
+ /* Do fast check for tag number match */
+ fieldinfo = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+
+ if (((fieldinfo >> 2) & 0x3F) == (tag & 0x3F))
+ {
+ /* Good candidate, check further */
+ (void)load_descriptor_values(iter);
+
+ if (iter->tag == tag &&
+ PB_LTYPE(iter->type) != PB_LTYPE_EXTENSION)
+ {
+ /* Found it */
+ return true;
+ }
+ }
+ } while (iter->index != start);
+
+ /* Searched all the way back to start, and found nothing. */
+ (void)load_descriptor_values(iter);
+ return false;
+ }
+}
+
+bool pb_field_iter_find_extension(pb_field_iter_t *iter)
+{
+ if (PB_LTYPE(iter->type) == PB_LTYPE_EXTENSION)
+ {
+ return true;
+ }
+ else
+ {
+ pb_size_t start = iter->index;
+ uint32_t fieldinfo;
+
+ do
+ {
+ /* Advance iterator but don't load values yet */
+ advance_iterator(iter);
+
+ /* Do fast check for field type */
+ fieldinfo = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+
+ if (PB_LTYPE((fieldinfo >> 8) & 0xFF) == PB_LTYPE_EXTENSION)
+ {
+ return load_descriptor_values(iter);
+ }
+ } while (iter->index != start);
+
+ /* Searched all the way back to start, and found nothing. */
+ (void)load_descriptor_values(iter);
+ return false;
+ }
+}
+
+static void *pb_const_cast(const void *p)
+{
+ /* Note: this casts away const, in order to use the common field iterator
+ * logic for both encoding and decoding. The cast is done using union
+ * to avoid spurious compiler warnings. */
+ union {
+ void *p1;
+ const void *p2;
+ } t;
+ t.p2 = p;
+ return t.p1;
+}
+
+bool pb_field_iter_begin_const(pb_field_iter_t *iter, const pb_msgdesc_t *desc, const void *message)
+{
+ return pb_field_iter_begin(iter, desc, pb_const_cast(message));
+}
+
+bool pb_field_iter_begin_extension_const(pb_field_iter_t *iter, const pb_extension_t *extension)
+{
+ return pb_field_iter_begin_extension(iter, (pb_extension_t*)pb_const_cast(extension));
+}
+
+bool pb_default_field_callback(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_t *field)
+{
+ if (field->data_size == sizeof(pb_callback_t))
+ {
+ pb_callback_t *pCallback = (pb_callback_t*)field->pData;
+
+ if (pCallback != NULL)
+ {
+ if (istream != NULL && pCallback->funcs.decode != NULL)
+ {
+ return pCallback->funcs.decode(istream, field, &pCallback->arg);
+ }
+
+ if (ostream != NULL && pCallback->funcs.encode != NULL)
+ {
+ return pCallback->funcs.encode(ostream, field, &pCallback->arg);
+ }
+ }
+ }
+
+ return true; /* Success, but didn't do anything */
+
+}
+
+#ifdef PB_VALIDATE_UTF8
+
+/* This function checks whether a string is valid UTF-8 text.
+ *
+ * Algorithm is adapted from https://www.cl.cam.ac.uk/~mgk25/ucs/utf8_check.c
+ * Original copyright: Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> 2005-03-30
+ * Licensed under "Short code license", which allows use under MIT license or
+ * any compatible with it.
+ */
+
+bool pb_validate_utf8(const char *str)
+{
+ const pb_byte_t *s = (const pb_byte_t*)str;
+ while (*s)
+ {
+ if (*s < 0x80)
+ {
+ /* 0xxxxxxx */
+ s++;
+ }
+ else if ((s[0] & 0xe0) == 0xc0)
+ {
+ /* 110XXXXx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[0] & 0xfe) == 0xc0) /* overlong? */
+ return false;
+ else
+ s += 2;
+ }
+ else if ((s[0] & 0xf0) == 0xe0)
+ {
+ /* 1110XXXX 10Xxxxxx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[2] & 0xc0) != 0x80 ||
+ (s[0] == 0xe0 && (s[1] & 0xe0) == 0x80) || /* overlong? */
+ (s[0] == 0xed && (s[1] & 0xe0) == 0xa0) || /* surrogate? */
+ (s[0] == 0xef && s[1] == 0xbf &&
+ (s[2] & 0xfe) == 0xbe)) /* U+FFFE or U+FFFF? */
+ return false;
+ else
+ s += 3;
+ }
+ else if ((s[0] & 0xf8) == 0xf0)
+ {
+ /* 11110XXX 10XXxxxx 10xxxxxx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[2] & 0xc0) != 0x80 ||
+ (s[3] & 0xc0) != 0x80 ||
+ (s[0] == 0xf0 && (s[1] & 0xf0) == 0x80) || /* overlong? */
+ (s[0] == 0xf4 && s[1] > 0x8f) || s[0] > 0xf4) /* > U+10FFFF? */
+ return false;
+ else
+ s += 4;
+ }
+ else
+ {
+ return false;
+ }
+ }
+
+ return true;
+}
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_common.h b/security/container/protos/nanopb/pb_common.h
new file mode 100644
index 0000000..58aa90f
--- /dev/null
+++ b/security/container/protos/nanopb/pb_common.h
@@ -0,0 +1,49 @@
+/* pb_common.h: Common support functions for pb_encode.c and pb_decode.c.
+ * These functions are rarely needed by applications directly.
+ */
+
+#ifndef PB_COMMON_H_INCLUDED
+#define PB_COMMON_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initialize the field iterator structure to beginning.
+ * Returns false if the message type is empty. */
+bool pb_field_iter_begin(pb_field_iter_t *iter, const pb_msgdesc_t *desc, void *message);
+
+/* Get a field iterator for extension field. */
+bool pb_field_iter_begin_extension(pb_field_iter_t *iter, pb_extension_t *extension);
+
+/* Same as pb_field_iter_begin(), but for const message pointer.
+ * Note that the pointers in pb_field_iter_t will be non-const but shouldn't
+ * be written to when using these functions. */
+bool pb_field_iter_begin_const(pb_field_iter_t *iter, const pb_msgdesc_t *desc, const void *message);
+bool pb_field_iter_begin_extension_const(pb_field_iter_t *iter, const pb_extension_t *extension);
+
+/* Advance the iterator to the next field.
+ * Returns false when the iterator wraps back to the first field. */
+bool pb_field_iter_next(pb_field_iter_t *iter);
+
+/* Advance the iterator until it points at a field with the given tag.
+ * Returns false if no such field exists. */
+bool pb_field_iter_find(pb_field_iter_t *iter, uint32_t tag);
+
+/* Find a field with type PB_LTYPE_EXTENSION, or return false if not found.
+ * There can be only one extension range field per message. */
+bool pb_field_iter_find_extension(pb_field_iter_t *iter);
+
+#ifdef PB_VALIDATE_UTF8
+/* Validate UTF-8 text string */
+bool pb_validate_utf8(const char *s);
+#endif
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_decode.c b/security/container/protos/nanopb/pb_decode.c
new file mode 100644
index 0000000..b194825
--- /dev/null
+++ b/security/container/protos/nanopb/pb_decode.c
@@ -0,0 +1,1709 @@
+/* pb_decode.c -- decode a protobuf using minimal resources
+ *
+ * 2011 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+/* Use the GCC warn_unused_result attribute to check that all return values
+ * are propagated correctly. On other compilers and gcc before 3.4.0 just
+ * ignore the annotation.
+ */
+#if !defined(__GNUC__) || ( __GNUC__ < 3) || (__GNUC__ == 3 && __GNUC_MINOR__ < 4)
+ #define checkreturn
+#else
+ #define checkreturn __attribute__((warn_unused_result))
+#endif
+
+#include "pb.h"
+#include "pb_decode.h"
+#include "pb_common.h"
+
+/**************************************
+ * Declarations internal to this file *
+ **************************************/
+
+static bool checkreturn buf_read(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+static bool checkreturn pb_decode_varint32_eof(pb_istream_t *stream, uint32_t *dest, bool *eof);
+static bool checkreturn read_raw_value(pb_istream_t *stream, pb_wire_type_t wire_type, pb_byte_t *buf, size_t *size);
+static bool checkreturn decode_basic_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_static_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_pointer_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_callback_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn default_extension_decoder(pb_istream_t *stream, pb_extension_t *extension, uint32_t tag, pb_wire_type_t wire_type);
+static bool checkreturn decode_extension(pb_istream_t *stream, uint32_t tag, pb_wire_type_t wire_type, pb_extension_t *extension);
+static bool pb_field_set_to_default(pb_field_iter_t *field);
+static bool pb_message_set_to_defaults(pb_field_iter_t *iter);
+static bool checkreturn pb_dec_bool(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_varint(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_bytes(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_string(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_submessage(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_fixed_length_bytes(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_skip_varint(pb_istream_t *stream);
+static bool checkreturn pb_skip_string(pb_istream_t *stream);
+
+#ifdef PB_ENABLE_MALLOC
+static bool checkreturn allocate_field(pb_istream_t *stream, void *pData, size_t data_size, size_t array_size);
+static void initialize_pointer_field(void *pItem, pb_field_iter_t *field);
+static bool checkreturn pb_release_union_field(pb_istream_t *stream, pb_field_iter_t *field);
+static void pb_release_single_field(pb_field_iter_t *field);
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+#define pb_int64_t int32_t
+#define pb_uint64_t uint32_t
+#else
+#define pb_int64_t int64_t
+#define pb_uint64_t uint64_t
+#endif
+
+#define PB_WT_PACKED ((pb_wire_type_t)0xFF)
+
+typedef struct {
+ uint32_t bitfield[(PB_MAX_REQUIRED_FIELDS + 31) / 32];
+} pb_fields_seen_t;
+
+/*******************************
+ * pb_istream_t implementation *
+ *******************************/
+
+static bool checkreturn buf_read(pb_istream_t *stream, pb_byte_t *buf, size_t count)
+{
+ size_t i;
+ const pb_byte_t *source = (const pb_byte_t*)stream->state;
+ stream->state = (pb_byte_t*)stream->state + count;
+
+ if (buf != NULL)
+ {
+ for (i = 0; i < count; i++)
+ buf[i] = source[i];
+ }
+
+ return true;
+}
+
+bool checkreturn pb_read(pb_istream_t *stream, pb_byte_t *buf, size_t count)
+{
+ if (count == 0)
+ return true;
+
+#ifndef PB_BUFFER_ONLY
+ if (buf == NULL && stream->callback != buf_read)
+ {
+ /* Skip input bytes */
+ pb_byte_t tmp[16];
+ while (count > 16)
+ {
+ if (!pb_read(stream, tmp, 16))
+ return false;
+
+ count -= 16;
+ }
+
+ return pb_read(stream, tmp, count);
+ }
+#endif
+
+ if (stream->bytes_left < count)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+#ifndef PB_BUFFER_ONLY
+ if (!stream->callback(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ if (!buf_read(stream, buf, count))
+ return false;
+#endif
+
+ stream->bytes_left -= count;
+ return true;
+}
+
+/* Read a single byte from input stream. buf may not be NULL.
+ * This is an optimization for the varint decoding. */
+static bool checkreturn pb_readbyte(pb_istream_t *stream, pb_byte_t *buf)
+{
+ if (stream->bytes_left == 0)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+#ifndef PB_BUFFER_ONLY
+ if (!stream->callback(stream, buf, 1))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ *buf = *(const pb_byte_t*)stream->state;
+ stream->state = (pb_byte_t*)stream->state + 1;
+#endif
+
+ stream->bytes_left--;
+
+ return true;
+}
+
+pb_istream_t pb_istream_from_buffer(const pb_byte_t *buf, size_t msglen)
+{
+ pb_istream_t stream;
+ /* Cast away the const from buf without a compiler error. We are
+ * careful to use it only in a const manner in the callbacks.
+ */
+ union {
+ void *state;
+ const void *c_state;
+ } state;
+#ifdef PB_BUFFER_ONLY
+ stream.callback = NULL;
+#else
+ stream.callback = &buf_read;
+#endif
+ state.c_state = buf;
+ stream.state = state.state;
+ stream.bytes_left = msglen;
+#ifndef PB_NO_ERRMSG
+ stream.errmsg = NULL;
+#endif
+ return stream;
+}
+
+/********************
+ * Helper functions *
+ ********************/
+
+static bool checkreturn pb_decode_varint32_eof(pb_istream_t *stream, uint32_t *dest, bool *eof)
+{
+ pb_byte_t byte;
+ uint32_t result;
+
+ if (!pb_readbyte(stream, &byte))
+ {
+ if (stream->bytes_left == 0)
+ {
+ if (eof)
+ {
+ *eof = true;
+ }
+ }
+
+ return false;
+ }
+
+ if ((byte & 0x80) == 0)
+ {
+ /* Quick case, 1 byte value */
+ result = byte;
+ }
+ else
+ {
+ /* Multibyte case */
+ uint_fast8_t bitpos = 7;
+ result = byte & 0x7F;
+
+ do
+ {
+ if (!pb_readbyte(stream, &byte))
+ return false;
+
+ if (bitpos >= 32)
+ {
+ /* Note: The varint could have trailing 0x80 bytes, or 0xFF for negative. */
+ pb_byte_t sign_extension = (bitpos < 63) ? 0xFF : 0x01;
+ bool valid_extension = ((byte & 0x7F) == 0x00 ||
+ ((result >> 31) != 0 && byte == sign_extension));
+
+ if (bitpos >= 64 || !valid_extension)
+ {
+ PB_RETURN_ERROR(stream, "varint overflow");
+ }
+ }
+ else
+ {
+ result |= (uint32_t)(byte & 0x7F) << bitpos;
+ }
+ bitpos = (uint_fast8_t)(bitpos + 7);
+ } while (byte & 0x80);
+
+ if (bitpos == 35 && (byte & 0x70) != 0)
+ {
+ /* The last byte was at bitpos=28, so only bottom 4 bits fit. */
+ PB_RETURN_ERROR(stream, "varint overflow");
+ }
+ }
+
+ *dest = result;
+ return true;
+}
+
+bool checkreturn pb_decode_varint32(pb_istream_t *stream, uint32_t *dest)
+{
+ return pb_decode_varint32_eof(stream, dest, NULL);
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool checkreturn pb_decode_varint(pb_istream_t *stream, uint64_t *dest)
+{
+ pb_byte_t byte;
+ uint_fast8_t bitpos = 0;
+ uint64_t result = 0;
+
+ do
+ {
+ if (bitpos >= 64)
+ PB_RETURN_ERROR(stream, "varint overflow");
+
+ if (!pb_readbyte(stream, &byte))
+ return false;
+
+ result |= (uint64_t)(byte & 0x7F) << bitpos;
+ bitpos = (uint_fast8_t)(bitpos + 7);
+ } while (byte & 0x80);
+
+ *dest = result;
+ return true;
+}
+#endif
+
+bool checkreturn pb_skip_varint(pb_istream_t *stream)
+{
+ pb_byte_t byte;
+ do
+ {
+ if (!pb_read(stream, &byte, 1))
+ return false;
+ } while (byte & 0x80);
+ return true;
+}
+
+bool checkreturn pb_skip_string(pb_istream_t *stream)
+{
+ uint32_t length;
+ if (!pb_decode_varint32(stream, &length))
+ return false;
+
+ if ((size_t)length != length)
+ {
+ PB_RETURN_ERROR(stream, "size too large");
+ }
+
+ return pb_read(stream, NULL, (size_t)length);
+}
+
+bool checkreturn pb_decode_tag(pb_istream_t *stream, pb_wire_type_t *wire_type, uint32_t *tag, bool *eof)
+{
+ uint32_t temp;
+ *eof = false;
+ *wire_type = (pb_wire_type_t) 0;
+ *tag = 0;
+
+ if (!pb_decode_varint32_eof(stream, &temp, eof))
+ {
+ return false;
+ }
+
+ *tag = temp >> 3;
+ *wire_type = (pb_wire_type_t)(temp & 7);
+ return true;
+}
+
+bool checkreturn pb_skip_field(pb_istream_t *stream, pb_wire_type_t wire_type)
+{
+ switch (wire_type)
+ {
+ case PB_WT_VARINT: return pb_skip_varint(stream);
+ case PB_WT_64BIT: return pb_read(stream, NULL, 8);
+ case PB_WT_STRING: return pb_skip_string(stream);
+ case PB_WT_32BIT: return pb_read(stream, NULL, 4);
+ default: PB_RETURN_ERROR(stream, "invalid wire_type");
+ }
+}
+
+/* Read a raw value to buffer, for the purpose of passing it to callback as
+ * a substream. Size is maximum size on call, and actual size on return.
+ */
+static bool checkreturn read_raw_value(pb_istream_t *stream, pb_wire_type_t wire_type, pb_byte_t *buf, size_t *size)
+{
+ size_t max_size = *size;
+ switch (wire_type)
+ {
+ case PB_WT_VARINT:
+ *size = 0;
+ do
+ {
+ (*size)++;
+ if (*size > max_size)
+ PB_RETURN_ERROR(stream, "varint overflow");
+
+ if (!pb_read(stream, buf, 1))
+ return false;
+ } while (*buf++ & 0x80);
+ return true;
+
+ case PB_WT_64BIT:
+ *size = 8;
+ return pb_read(stream, buf, 8);
+
+ case PB_WT_32BIT:
+ *size = 4;
+ return pb_read(stream, buf, 4);
+
+ case PB_WT_STRING:
+ /* Calling read_raw_value with a PB_WT_STRING is an error.
+ * Explicitly handle this case and fallthrough to default to avoid
+ * compiler warnings.
+ */
+
+ default: PB_RETURN_ERROR(stream, "invalid wire_type");
+ }
+}
+
+/* Decode string length from stream and return a substream with limited length.
+ * Remember to close the substream using pb_close_string_substream().
+ */
+bool checkreturn pb_make_string_substream(pb_istream_t *stream, pb_istream_t *substream)
+{
+ uint32_t size;
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ *substream = *stream;
+ if (substream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "parent stream too short");
+
+ substream->bytes_left = (size_t)size;
+ stream->bytes_left -= (size_t)size;
+ return true;
+}
+
+bool checkreturn pb_close_string_substream(pb_istream_t *stream, pb_istream_t *substream)
+{
+ if (substream->bytes_left) {
+ if (!pb_read(substream, NULL, substream->bytes_left))
+ return false;
+ }
+
+ stream->state = substream->state;
+
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream->errmsg;
+#endif
+ return true;
+}
+
+/*************************
+ * Decode a single field *
+ *************************/
+
+static bool checkreturn decode_basic_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ if (wire_type != PB_WT_VARINT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_bool(stream, field);
+
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ if (wire_type != PB_WT_VARINT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_varint(stream, field);
+
+ case PB_LTYPE_FIXED32:
+ if (wire_type != PB_WT_32BIT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_decode_fixed32(stream, field->pData);
+
+ case PB_LTYPE_FIXED64:
+ if (wire_type != PB_WT_64BIT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+ if (field->data_size == sizeof(float))
+ {
+ return pb_decode_double_as_float(stream, (float*)field->pData);
+ }
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+ PB_RETURN_ERROR(stream, "invalid data_size");
+#else
+ return pb_decode_fixed64(stream, field->pData);
+#endif
+
+ case PB_LTYPE_BYTES:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_bytes(stream, field);
+
+ case PB_LTYPE_STRING:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_string(stream, field);
+
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_submessage(stream, field);
+
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_fixed_length_bytes(stream, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+static bool checkreturn decode_static_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ switch (PB_HTYPE(field->type))
+ {
+ case PB_HTYPE_REQUIRED:
+ return decode_basic_field(stream, wire_type, field);
+
+ case PB_HTYPE_OPTIONAL:
+ if (field->pSize != NULL)
+ *(bool*)field->pSize = true;
+ return decode_basic_field(stream, wire_type, field);
+
+ case PB_HTYPE_REPEATED:
+ if (wire_type == PB_WT_STRING
+ && PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Packed array */
+ bool status = true;
+ pb_istream_t substream;
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ field->pData = (char*)field->pField + field->data_size * (*size);
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ while (substream.bytes_left > 0 && *size < field->array_size)
+ {
+ if (!decode_basic_field(&substream, PB_WT_PACKED, field))
+ {
+ status = false;
+ break;
+ }
+ (*size)++;
+ field->pData = (char*)field->pData + field->data_size;
+ }
+
+ if (substream.bytes_left != 0)
+ PB_RETURN_ERROR(stream, "array overflow");
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+ }
+ else
+ {
+ /* Repeated field */
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ field->pData = (char*)field->pField + field->data_size * (*size);
+
+ if ((*size)++ >= field->array_size)
+ PB_RETURN_ERROR(stream, "array overflow");
+
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ case PB_HTYPE_ONEOF:
+ if (PB_LTYPE_IS_SUBMSG(field->type) &&
+ *(pb_size_t*)field->pSize != field->tag)
+ {
+ /* We memset to zero so that any callbacks are set to NULL.
+ * This is because the callbacks might otherwise have values
+ * from some other union field.
+ * If callbacks are needed inside oneof field, use .proto
+ * option submsg_callback to have a separate callback function
+ * that can set the fields before submessage is decoded.
+ * pb_dec_submessage() will set any default values. */
+ memset(field->pData, 0, (size_t)field->data_size);
+
+ /* Set default values for the submessage fields. */
+ if (field->submsg_desc->default_value != NULL ||
+ field->submsg_desc->field_callback != NULL ||
+ field->submsg_desc->submsg_info[0] != NULL)
+ {
+ pb_field_iter_t submsg_iter;
+ if (pb_field_iter_begin(&submsg_iter, field->submsg_desc, field->pData))
+ {
+ if (!pb_message_set_to_defaults(&submsg_iter))
+ PB_RETURN_ERROR(stream, "failed to set defaults");
+ }
+ }
+ }
+ *(pb_size_t*)field->pSize = field->tag;
+
+ return decode_basic_field(stream, wire_type, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+#ifdef PB_ENABLE_MALLOC
+/* Allocate storage for the field and store the pointer at iter->pData.
+ * array_size is the number of entries to reserve in an array.
+ * Zero size is not allowed, use pb_free() for releasing.
+ */
+static bool checkreturn allocate_field(pb_istream_t *stream, void *pData, size_t data_size, size_t array_size)
+{
+ void *ptr = *(void**)pData;
+
+ if (data_size == 0 || array_size == 0)
+ PB_RETURN_ERROR(stream, "invalid size");
+
+#ifdef __AVR__
+ /* Workaround for AVR libc bug 53284: http://savannah.nongnu.org/bugs/?53284
+ * Realloc to size of 1 byte can cause corruption of the malloc structures.
+ */
+ if (data_size == 1 && array_size == 1)
+ {
+ data_size = 2;
+ }
+#endif
+
+ /* Check for multiplication overflows.
+ * This code avoids the costly division if the sizes are small enough.
+ * Multiplication is safe as long as only half of bits are set
+ * in either multiplicand.
+ */
+ {
+ const size_t check_limit = (size_t)1 << (sizeof(size_t) * 4);
+ if (data_size >= check_limit || array_size >= check_limit)
+ {
+ const size_t size_max = (size_t)-1;
+ if (size_max / array_size < data_size)
+ {
+ PB_RETURN_ERROR(stream, "size too large");
+ }
+ }
+ }
+
+ /* Allocate new or expand previous allocation */
+ /* Note: on failure the old pointer will remain in the structure,
+ * the message must be freed by caller also on error return. */
+ ptr = pb_realloc(ptr, array_size * data_size);
+ if (ptr == NULL)
+ PB_RETURN_ERROR(stream, "realloc failed");
+
+ *(void**)pData = ptr;
+ return true;
+}
+
+/* Clear a newly allocated item in case it contains a pointer, or is a submessage. */
+static void initialize_pointer_field(void *pItem, pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES)
+ {
+ *(void**)pItem = NULL;
+ }
+ else if (PB_LTYPE_IS_SUBMSG(field->type))
+ {
+ /* We memset to zero so that any callbacks are set to NULL.
+ * Default values will be set by pb_dec_submessage(). */
+ memset(pItem, 0, field->data_size);
+ }
+}
+#endif
+
+static bool checkreturn decode_pointer_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+#ifndef PB_ENABLE_MALLOC
+ PB_UNUSED(wire_type);
+ PB_UNUSED(field);
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ switch (PB_HTYPE(field->type))
+ {
+ case PB_HTYPE_REQUIRED:
+ case PB_HTYPE_OPTIONAL:
+ case PB_HTYPE_ONEOF:
+ if (PB_LTYPE_IS_SUBMSG(field->type) && *(void**)field->pField != NULL)
+ {
+ /* Duplicate field, have to release the old allocation first. */
+ /* FIXME: Does this work correctly for oneofs? */
+ pb_release_single_field(field);
+ }
+
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ *(pb_size_t*)field->pSize = field->tag;
+ }
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES)
+ {
+ /* pb_dec_string and pb_dec_bytes handle allocation themselves */
+ field->pData = field->pField;
+ return decode_basic_field(stream, wire_type, field);
+ }
+ else
+ {
+ if (!allocate_field(stream, field->pField, field->data_size, 1))
+ return false;
+
+ field->pData = *(void**)field->pField;
+ initialize_pointer_field(field->pData, field);
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ case PB_HTYPE_REPEATED:
+ if (wire_type == PB_WT_STRING
+ && PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Packed array, multiple items come in at once. */
+ bool status = true;
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ size_t allocated_size = *size;
+ pb_istream_t substream;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ while (substream.bytes_left)
+ {
+ if (*size == PB_SIZE_MAX)
+ {
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = "too many array entries";
+#endif
+ status = false;
+ break;
+ }
+
+ if ((size_t)*size + 1 > allocated_size)
+ {
+ /* Allocate more storage. This tries to guess the
+ * number of remaining entries. Round the division
+ * upwards. */
+ size_t remain = (substream.bytes_left - 1) / field->data_size + 1;
+ if (remain < PB_SIZE_MAX - allocated_size)
+ allocated_size += remain;
+ else
+ allocated_size += 1;
+
+ if (!allocate_field(&substream, field->pField, field->data_size, allocated_size))
+ {
+ status = false;
+ break;
+ }
+ }
+
+ /* Decode the array entry */
+ field->pData = *(char**)field->pField + field->data_size * (*size);
+ initialize_pointer_field(field->pData, field);
+ if (!decode_basic_field(&substream, PB_WT_PACKED, field))
+ {
+ status = false;
+ break;
+ }
+
+ (*size)++;
+ }
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+ }
+ else
+ {
+ /* Normal repeated field, i.e. only one item at a time. */
+ pb_size_t *size = (pb_size_t*)field->pSize;
+
+ if (*size == PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "too many array entries");
+
+ if (!allocate_field(stream, field->pField, field->data_size, (size_t)(*size + 1)))
+ return false;
+
+ field->pData = *(char**)field->pField + field->data_size * (*size);
+ (*size)++;
+ initialize_pointer_field(field->pData, field);
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+#endif
+}
+
+static bool checkreturn decode_callback_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ if (!field->descriptor->field_callback)
+ return pb_skip_field(stream, wire_type);
+
+ if (wire_type == PB_WT_STRING)
+ {
+ pb_istream_t substream;
+ size_t prev_bytes_left;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ do
+ {
+ prev_bytes_left = substream.bytes_left;
+ if (!field->descriptor->field_callback(&substream, NULL, field))
+ PB_RETURN_ERROR(stream, "callback failed");
+ } while (substream.bytes_left > 0 && substream.bytes_left < prev_bytes_left);
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return true;
+ }
+ else
+ {
+ /* Copy the single scalar value to stack.
+ * This is required so that we can limit the stream length,
+ * which in turn allows to use same callback for packed and
+ * not-packed fields. */
+ pb_istream_t substream;
+ pb_byte_t buffer[10];
+ size_t size = sizeof(buffer);
+
+ if (!read_raw_value(stream, wire_type, buffer, &size))
+ return false;
+ substream = pb_istream_from_buffer(buffer, size);
+
+ return field->descriptor->field_callback(&substream, NULL, field);
+ }
+}
+
+static bool checkreturn decode_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+#ifdef PB_ENABLE_MALLOC
+ /* When decoding an oneof field, check if there is old data that must be
+ * released first. */
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ if (!pb_release_union_field(stream, field))
+ return false;
+ }
+#endif
+
+ switch (PB_ATYPE(field->type))
+ {
+ case PB_ATYPE_STATIC:
+ return decode_static_field(stream, wire_type, field);
+
+ case PB_ATYPE_POINTER:
+ return decode_pointer_field(stream, wire_type, field);
+
+ case PB_ATYPE_CALLBACK:
+ return decode_callback_field(stream, wire_type, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+/* Default handler for extension fields. Expects to have a pb_msgdesc_t
+ * pointer in the extension->type->arg field, pointing to a message with
+ * only one field in it. */
+static bool checkreturn default_extension_decoder(pb_istream_t *stream,
+ pb_extension_t *extension, uint32_t tag, pb_wire_type_t wire_type)
+{
+ pb_field_iter_t iter;
+
+ if (!pb_field_iter_begin_extension(&iter, extension))
+ PB_RETURN_ERROR(stream, "invalid extension");
+
+ if (iter.tag != tag || !iter.message)
+ return true;
+
+ extension->found = true;
+ return decode_field(stream, wire_type, &iter);
+}
+
+/* Try to decode an unknown field as an extension field. Tries each extension
+ * decoder in turn, until one of them handles the field or loop ends. */
+static bool checkreturn decode_extension(pb_istream_t *stream,
+ uint32_t tag, pb_wire_type_t wire_type, pb_extension_t *extension)
+{
+ size_t pos = stream->bytes_left;
+
+ while (extension != NULL && pos == stream->bytes_left)
+ {
+ bool status;
+ if (extension->type->decode)
+ status = extension->type->decode(stream, extension, tag, wire_type);
+ else
+ status = default_extension_decoder(stream, extension, tag, wire_type);
+
+ if (!status)
+ return false;
+
+ extension = extension->next;
+ }
+
+ return true;
+}
+
+/* Initialize message fields to default values, recursively */
+static bool pb_field_set_to_default(pb_field_iter_t *field)
+{
+ pb_type_t type;
+ type = field->type;
+
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ pb_extension_t *ext = *(pb_extension_t* const *)field->pData;
+ while (ext != NULL)
+ {
+ pb_field_iter_t ext_iter;
+ if (pb_field_iter_begin_extension(&ext_iter, ext))
+ {
+ ext->found = false;
+ if (!pb_message_set_to_defaults(&ext_iter))
+ return false;
+ }
+ ext = ext->next;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_STATIC)
+ {
+ bool init_data = true;
+ if (PB_HTYPE(type) == PB_HTYPE_OPTIONAL && field->pSize != NULL)
+ {
+ /* Set has_field to false. Still initialize the optional field
+ * itself also. */
+ *(bool*)field->pSize = false;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_REPEATED ||
+ PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ /* REPEATED: Set array count to 0, no need to initialize contents.
+ ONEOF: Set which_field to 0. */
+ *(pb_size_t*)field->pSize = 0;
+ init_data = false;
+ }
+
+ if (init_data)
+ {
+ if (PB_LTYPE_IS_SUBMSG(field->type) &&
+ (field->submsg_desc->default_value != NULL ||
+ field->submsg_desc->field_callback != NULL ||
+ field->submsg_desc->submsg_info[0] != NULL))
+ {
+ /* Initialize submessage to defaults.
+ * Only needed if it has default values
+ * or callback/submessage fields. */
+ pb_field_iter_t submsg_iter;
+ if (pb_field_iter_begin(&submsg_iter, field->submsg_desc, field->pData))
+ {
+ if (!pb_message_set_to_defaults(&submsg_iter))
+ return false;
+ }
+ }
+ else
+ {
+ /* Initialize to zeros */
+ memset(field->pData, 0, (size_t)field->data_size);
+ }
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ /* Initialize the pointer to NULL. */
+ *(void**)field->pField = NULL;
+
+ /* Initialize array count to 0. */
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED ||
+ PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ *(pb_size_t*)field->pSize = 0;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_CALLBACK)
+ {
+ /* Don't overwrite callback */
+ }
+
+ return true;
+}
+
+static bool pb_message_set_to_defaults(pb_field_iter_t *iter)
+{
+ pb_istream_t defstream = PB_ISTREAM_EMPTY;
+ uint32_t tag = 0;
+ pb_wire_type_t wire_type = PB_WT_VARINT;
+ bool eof;
+
+ if (iter->descriptor->default_value)
+ {
+ defstream = pb_istream_from_buffer(iter->descriptor->default_value, (size_t)-1);
+ if (!pb_decode_tag(&defstream, &wire_type, &tag, &eof))
+ return false;
+ }
+
+ do
+ {
+ if (!pb_field_set_to_default(iter))
+ return false;
+
+ if (tag != 0 && iter->tag == tag)
+ {
+ /* We have a default value for this field in the defstream */
+ if (!decode_field(&defstream, wire_type, iter))
+ return false;
+ if (!pb_decode_tag(&defstream, &wire_type, &tag, &eof))
+ return false;
+
+ if (iter->pSize)
+ *(bool*)iter->pSize = false;
+ }
+ } while (pb_field_iter_next(iter));
+
+ return true;
+}
+
+/*********************
+ * Decode all fields *
+ *********************/
+
+static bool checkreturn pb_decode_inner(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags)
+{
+ uint32_t extension_range_start = 0;
+ pb_extension_t *extensions = NULL;
+
+ /* 'fixed_count_field' and 'fixed_count_size' track position of a repeated fixed
+ * count field. This can only handle _one_ repeated fixed count field that
+ * is unpacked and unordered among other (non repeated fixed count) fields.
+ */
+ pb_size_t fixed_count_field = PB_SIZE_MAX;
+ pb_size_t fixed_count_size = 0;
+ pb_size_t fixed_count_total_size = 0;
+
+ pb_fields_seen_t fields_seen = {{0, 0}};
+ const uint32_t allbits = ~(uint32_t)0;
+ pb_field_iter_t iter;
+
+ if (pb_field_iter_begin(&iter, fields, dest_struct))
+ {
+ if ((flags & PB_DECODE_NOINIT) == 0)
+ {
+ if (!pb_message_set_to_defaults(&iter))
+ PB_RETURN_ERROR(stream, "failed to set defaults");
+ }
+ }
+
+ while (stream->bytes_left)
+ {
+ uint32_t tag;
+ pb_wire_type_t wire_type;
+ bool eof;
+
+ if (!pb_decode_tag(stream, &wire_type, &tag, &eof))
+ {
+ if (eof)
+ break;
+ else
+ return false;
+ }
+
+ if (tag == 0)
+ {
+ if (flags & PB_DECODE_NULLTERMINATED)
+ {
+ break;
+ }
+ else
+ {
+ PB_RETURN_ERROR(stream, "zero tag");
+ }
+ }
+
+ if (!pb_field_iter_find(&iter, tag) || PB_LTYPE(iter.type) == PB_LTYPE_EXTENSION)
+ {
+ /* No match found, check if it matches an extension. */
+ if (extension_range_start == 0)
+ {
+ if (pb_field_iter_find_extension(&iter))
+ {
+ extensions = *(pb_extension_t* const *)iter.pData;
+ extension_range_start = iter.tag;
+ }
+
+ if (!extensions)
+ {
+ extension_range_start = (uint32_t)-1;
+ }
+ }
+
+ if (tag >= extension_range_start)
+ {
+ size_t pos = stream->bytes_left;
+
+ if (!decode_extension(stream, tag, wire_type, extensions))
+ return false;
+
+ if (pos != stream->bytes_left)
+ {
+ /* The field was handled */
+ continue;
+ }
+ }
+
+ /* No match found, skip data */
+ if (!pb_skip_field(stream, wire_type))
+ return false;
+ continue;
+ }
+
+ /* If a repeated fixed count field was found, get size from
+ * 'fixed_count_field' as there is no counter contained in the struct.
+ */
+ if (PB_HTYPE(iter.type) == PB_HTYPE_REPEATED && iter.pSize == &iter.array_size)
+ {
+ if (fixed_count_field != iter.index) {
+ /* If the new fixed count field does not match the previous one,
+ * check that the previous one is NULL or that it finished
+ * receiving all the expected data.
+ */
+ if (fixed_count_field != PB_SIZE_MAX &&
+ fixed_count_size != fixed_count_total_size)
+ {
+ PB_RETURN_ERROR(stream, "wrong size for fixed count field");
+ }
+
+ fixed_count_field = iter.index;
+ fixed_count_size = 0;
+ fixed_count_total_size = iter.array_size;
+ }
+
+ iter.pSize = &fixed_count_size;
+ }
+
+ if (PB_HTYPE(iter.type) == PB_HTYPE_REQUIRED
+ && iter.required_field_index < PB_MAX_REQUIRED_FIELDS)
+ {
+ uint32_t tmp = ((uint32_t)1 << (iter.required_field_index & 31));
+ fields_seen.bitfield[iter.required_field_index >> 5] |= tmp;
+ }
+
+ if (!decode_field(stream, wire_type, &iter))
+ return false;
+ }
+
+ /* Check that all elements of the last decoded fixed count field were present. */
+ if (fixed_count_field != PB_SIZE_MAX &&
+ fixed_count_size != fixed_count_total_size)
+ {
+ PB_RETURN_ERROR(stream, "wrong size for fixed count field");
+ }
+
+ /* Check that all required fields were present. */
+ {
+ pb_size_t req_field_count = iter.descriptor->required_field_count;
+
+ if (req_field_count > 0)
+ {
+ pb_size_t i;
+
+ if (req_field_count > PB_MAX_REQUIRED_FIELDS)
+ req_field_count = PB_MAX_REQUIRED_FIELDS;
+
+ /* Check the whole words */
+ for (i = 0; i < (req_field_count >> 5); i++)
+ {
+ if (fields_seen.bitfield[i] != allbits)
+ PB_RETURN_ERROR(stream, "missing required field");
+ }
+
+ /* Check the remaining bits (if any) */
+ if ((req_field_count & 31) != 0)
+ {
+ if (fields_seen.bitfield[req_field_count >> 5] !=
+ (allbits >> (uint_least8_t)(32 - (req_field_count & 31))))
+ {
+ PB_RETURN_ERROR(stream, "missing required field");
+ }
+ }
+ }
+ }
+
+ return true;
+}
+
+bool checkreturn pb_decode_ex(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags)
+{
+ bool status;
+
+ if ((flags & PB_DECODE_DELIMITED) == 0)
+ {
+ status = pb_decode_inner(stream, fields, dest_struct, flags);
+ }
+ else
+ {
+ pb_istream_t substream;
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ status = pb_decode_inner(&substream, fields, dest_struct, flags);
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+ }
+
+#ifdef PB_ENABLE_MALLOC
+ if (!status)
+ pb_release(fields, dest_struct);
+#endif
+
+ return status;
+}
+
+bool checkreturn pb_decode(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct)
+{
+ bool status;
+
+ status = pb_decode_inner(stream, fields, dest_struct, 0);
+
+#ifdef PB_ENABLE_MALLOC
+ if (!status)
+ pb_release(fields, dest_struct);
+#endif
+
+ return status;
+}
+
+#ifdef PB_ENABLE_MALLOC
+/* Given an oneof field, if there has already been a field inside this oneof,
+ * release it before overwriting with a different one. */
+static bool pb_release_union_field(pb_istream_t *stream, pb_field_iter_t *field)
+{
+ pb_field_iter_t old_field = *field;
+ pb_size_t old_tag = *(pb_size_t*)field->pSize; /* Previous which_ value */
+ pb_size_t new_tag = field->tag; /* New which_ value */
+
+ if (old_tag == 0)
+ return true; /* Ok, no old data in union */
+
+ if (old_tag == new_tag)
+ return true; /* Ok, old data is of same type => merge */
+
+ /* Release old data. The find can fail if the message struct contains
+ * invalid data. */
+ if (!pb_field_iter_find(&old_field, old_tag))
+ PB_RETURN_ERROR(stream, "invalid union tag");
+
+ pb_release_single_field(&old_field);
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+ /* Initialize the pointer to NULL to make sure it is valid
+ * even in case of error return. */
+ *(void**)field->pField = NULL;
+ field->pData = NULL;
+ }
+
+ return true;
+}
+
+static void pb_release_single_field(pb_field_iter_t *field)
+{
+ pb_type_t type;
+ type = field->type;
+
+ if (PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ if (*(pb_size_t*)field->pSize != field->tag)
+ return; /* This is not the current field in the union */
+ }
+
+ /* Release anything contained inside an extension or submsg.
+ * This has to be done even if the submsg itself is statically
+ * allocated. */
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ /* Release fields from all extensions in the linked list */
+ pb_extension_t *ext = *(pb_extension_t**)field->pData;
+ while (ext != NULL)
+ {
+ pb_field_iter_t ext_iter;
+ if (pb_field_iter_begin_extension(&ext_iter, ext))
+ {
+ pb_release_single_field(&ext_iter);
+ }
+ ext = ext->next;
+ }
+ }
+ else if (PB_LTYPE_IS_SUBMSG(type) && PB_ATYPE(type) != PB_ATYPE_CALLBACK)
+ {
+ /* Release fields in submessage or submsg array */
+ pb_size_t count = 1;
+
+ if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ field->pData = *(void**)field->pField;
+ }
+ else
+ {
+ field->pData = field->pField;
+ }
+
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ count = *(pb_size_t*)field->pSize;
+
+ if (PB_ATYPE(type) == PB_ATYPE_STATIC && count > field->array_size)
+ {
+ /* Protect against corrupted _count fields */
+ count = field->array_size;
+ }
+ }
+
+ if (field->pData)
+ {
+ for (; count > 0; count--)
+ {
+ pb_release(field->submsg_desc, field->pData);
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+ }
+
+ if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED &&
+ (PB_LTYPE(type) == PB_LTYPE_STRING ||
+ PB_LTYPE(type) == PB_LTYPE_BYTES))
+ {
+ /* Release entries in repeated string or bytes array */
+ void **pItem = *(void***)field->pField;
+ pb_size_t count = *(pb_size_t*)field->pSize;
+ for (; count > 0; count--)
+ {
+ pb_free(*pItem);
+ *pItem++ = NULL;
+ }
+ }
+
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ /* We are going to release the array, so set the size to 0 */
+ *(pb_size_t*)field->pSize = 0;
+ }
+
+ /* Release main pointer */
+ pb_free(*(void**)field->pField);
+ *(void**)field->pField = NULL;
+ }
+}
+
+void pb_release(const pb_msgdesc_t *fields, void *dest_struct)
+{
+ pb_field_iter_t iter;
+
+ if (!dest_struct)
+ return; /* Ignore NULL pointers, similar to free() */
+
+ if (!pb_field_iter_begin(&iter, fields, dest_struct))
+ return; /* Empty message type */
+
+ do
+ {
+ pb_release_single_field(&iter);
+ } while (pb_field_iter_next(&iter));
+}
+#endif
+
+/* Field decoders */
+
+bool pb_decode_bool(pb_istream_t *stream, bool *dest)
+{
+ uint32_t value;
+ if (!pb_decode_varint32(stream, &value))
+ return false;
+
+ *(bool*)dest = (value != 0);
+ return true;
+}
+
+bool pb_decode_svarint(pb_istream_t *stream, pb_int64_t *dest)
+{
+ pb_uint64_t value;
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ if (value & 1)
+ *dest = (pb_int64_t)(~(value >> 1));
+ else
+ *dest = (pb_int64_t)(value >> 1);
+
+ return true;
+}
+
+bool pb_decode_fixed32(pb_istream_t *stream, void *dest)
+{
+ union {
+ uint32_t fixed32;
+ pb_byte_t bytes[4];
+ } u;
+
+ if (!pb_read(stream, u.bytes, 4))
+ return false;
+
+#if defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN && CHAR_BIT == 8
+ /* fast path - if we know that we're on little endian, assign directly */
+ *(uint32_t*)dest = u.fixed32;
+#else
+ *(uint32_t*)dest = ((uint32_t)u.bytes[0] << 0) |
+ ((uint32_t)u.bytes[1] << 8) |
+ ((uint32_t)u.bytes[2] << 16) |
+ ((uint32_t)u.bytes[3] << 24);
+#endif
+ return true;
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_fixed64(pb_istream_t *stream, void *dest)
+{
+ union {
+ uint64_t fixed64;
+ pb_byte_t bytes[8];
+ } u;
+
+ if (!pb_read(stream, u.bytes, 8))
+ return false;
+
+#if defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN && CHAR_BIT == 8
+ /* fast path - if we know that we're on little endian, assign directly */
+ *(uint64_t*)dest = u.fixed64;
+#else
+ *(uint64_t*)dest = ((uint64_t)u.bytes[0] << 0) |
+ ((uint64_t)u.bytes[1] << 8) |
+ ((uint64_t)u.bytes[2] << 16) |
+ ((uint64_t)u.bytes[3] << 24) |
+ ((uint64_t)u.bytes[4] << 32) |
+ ((uint64_t)u.bytes[5] << 40) |
+ ((uint64_t)u.bytes[6] << 48) |
+ ((uint64_t)u.bytes[7] << 56);
+#endif
+ return true;
+}
+#endif
+
+static bool checkreturn pb_dec_bool(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ return pb_decode_bool(stream, (bool*)field->pData);
+}
+
+static bool checkreturn pb_dec_varint(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_UVARINT)
+ {
+ pb_uint64_t value, clamped;
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ /* Cast to the proper field size, while checking for overflows */
+ if (field->data_size == sizeof(pb_uint64_t))
+ clamped = *(pb_uint64_t*)field->pData = value;
+ else if (field->data_size == sizeof(uint32_t))
+ clamped = *(uint32_t*)field->pData = (uint32_t)value;
+ else if (field->data_size == sizeof(uint_least16_t))
+ clamped = *(uint_least16_t*)field->pData = (uint_least16_t)value;
+ else if (field->data_size == sizeof(uint_least8_t))
+ clamped = *(uint_least8_t*)field->pData = (uint_least8_t)value;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (clamped != value)
+ PB_RETURN_ERROR(stream, "integer too large");
+
+ return true;
+ }
+ else
+ {
+ pb_uint64_t value;
+ pb_int64_t svalue;
+ pb_int64_t clamped;
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SVARINT)
+ {
+ if (!pb_decode_svarint(stream, &svalue))
+ return false;
+ }
+ else
+ {
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ /* See issue 97: Google's C++ protobuf allows negative varint values to
+ * be cast as int32_t, instead of the int64_t that should be used when
+ * encoding. Nanopb versions before 0.2.5 had a bug in encoding. In order to
+ * not break decoding of such messages, we cast <=32 bit fields to
+ * int32_t first to get the sign correct.
+ */
+ if (field->data_size == sizeof(pb_int64_t))
+ svalue = (pb_int64_t)value;
+ else
+ svalue = (int32_t)value;
+ }
+
+ /* Cast to the proper field size, while checking for overflows */
+ if (field->data_size == sizeof(pb_int64_t))
+ clamped = *(pb_int64_t*)field->pData = svalue;
+ else if (field->data_size == sizeof(int32_t))
+ clamped = *(int32_t*)field->pData = (int32_t)svalue;
+ else if (field->data_size == sizeof(int_least16_t))
+ clamped = *(int_least16_t*)field->pData = (int_least16_t)svalue;
+ else if (field->data_size == sizeof(int_least8_t))
+ clamped = *(int_least8_t*)field->pData = (int_least8_t)svalue;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (clamped != svalue)
+ PB_RETURN_ERROR(stream, "integer too large");
+
+ return true;
+ }
+}
+
+static bool checkreturn pb_dec_bytes(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+ size_t alloc_size;
+ pb_bytes_array_t *dest;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size > PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+
+ alloc_size = PB_BYTES_ARRAY_T_ALLOCSIZE(size);
+ if (size > alloc_size)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+#ifndef PB_ENABLE_MALLOC
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ if (stream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+ if (!allocate_field(stream, field->pData, alloc_size, 1))
+ return false;
+ dest = *(pb_bytes_array_t**)field->pData;
+#endif
+ }
+ else
+ {
+ if (alloc_size > field->data_size)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+ dest = (pb_bytes_array_t*)field->pData;
+ }
+
+ dest->size = (pb_size_t)size;
+ return pb_read(stream, dest->bytes, (size_t)size);
+}
+
+static bool checkreturn pb_dec_string(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+ size_t alloc_size;
+ pb_byte_t *dest = (pb_byte_t*)field->pData;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size == (uint32_t)-1)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ /* Space for null terminator */
+ alloc_size = (size_t)(size + 1);
+
+ if (alloc_size < size)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+#ifndef PB_ENABLE_MALLOC
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ if (stream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+ if (!allocate_field(stream, field->pData, alloc_size, 1))
+ return false;
+ dest = *(pb_byte_t**)field->pData;
+#endif
+ }
+ else
+ {
+ if (alloc_size > field->data_size)
+ PB_RETURN_ERROR(stream, "string overflow");
+ }
+
+ dest[size] = 0;
+
+ if (!pb_read(stream, dest, (size_t)size))
+ return false;
+
+#ifdef PB_VALIDATE_UTF8
+ if (!pb_validate_utf8((const char*)dest))
+ PB_RETURN_ERROR(stream, "invalid utf8");
+#endif
+
+ return true;
+}
+
+static bool checkreturn pb_dec_submessage(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ bool status = true;
+ bool submsg_consumed = false;
+ pb_istream_t substream;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ if (field->submsg_desc == NULL)
+ PB_RETURN_ERROR(stream, "invalid field descriptor");
+
+ /* Submessages can have a separate message-level callback that is called
+ * before decoding the message. Typically it is used to set callback fields
+ * inside oneofs. */
+ if (PB_LTYPE(field->type) == PB_LTYPE_SUBMSG_W_CB && field->pSize != NULL)
+ {
+ /* Message callback is stored right before pSize. */
+ pb_callback_t *callback = (pb_callback_t*)field->pSize - 1;
+ if (callback->funcs.decode)
+ {
+ status = callback->funcs.decode(&substream, field, &callback->arg);
+
+ if (substream.bytes_left == 0)
+ {
+ submsg_consumed = true;
+ }
+ }
+ }
+
+ /* Now decode the submessage contents */
+ if (status && !submsg_consumed)
+ {
+ unsigned int flags = 0;
+
+ /* Static required/optional fields are already initialized by top-level
+ * pb_decode(), no need to initialize them again. */
+ if (PB_ATYPE(field->type) == PB_ATYPE_STATIC &&
+ PB_HTYPE(field->type) != PB_HTYPE_REPEATED)
+ {
+ flags = PB_DECODE_NOINIT;
+ }
+
+ status = pb_decode_inner(&substream, field->submsg_desc, field->pData, flags);
+ }
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+}
+
+static bool checkreturn pb_dec_fixed_length_bytes(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size > PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+
+ if (size == 0)
+ {
+ /* As a special case, treat empty bytes string as all zeros for fixed_length_bytes. */
+ memset(field->pData, 0, (size_t)field->data_size);
+ return true;
+ }
+
+ if (size != field->data_size)
+ PB_RETURN_ERROR(stream, "incorrect fixed length bytes size");
+
+ return pb_read(stream, (pb_byte_t*)field->pData, (size_t)field->data_size);
+}
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+bool pb_decode_double_as_float(pb_istream_t *stream, float *dest)
+{
+ uint_least8_t sign;
+ int exponent;
+ uint32_t mantissa;
+ uint64_t value;
+ union { float f; uint32_t i; } out;
+
+ if (!pb_decode_fixed64(stream, &value))
+ return false;
+
+ /* Decompose input value */
+ sign = (uint_least8_t)((value >> 63) & 1);
+ exponent = (int)((value >> 52) & 0x7FF) - 1023;
+ mantissa = (value >> 28) & 0xFFFFFF; /* Highest 24 bits */
+
+ /* Figure if value is in range representable by floats. */
+ if (exponent == 1024)
+ {
+ /* Special value */
+ exponent = 128;
+ mantissa >>= 1;
+ }
+ else
+ {
+ if (exponent > 127)
+ {
+ /* Too large, convert to infinity */
+ exponent = 128;
+ mantissa = 0;
+ }
+ else if (exponent < -150)
+ {
+ /* Too small, convert to zero */
+ exponent = -127;
+ mantissa = 0;
+ }
+ else if (exponent < -126)
+ {
+ /* Denormalized */
+ mantissa |= 0x1000000;
+ mantissa >>= (-126 - exponent);
+ exponent = -127;
+ }
+
+ /* Round off mantissa */
+ mantissa = (mantissa + 1) >> 1;
+
+ /* Check if mantissa went over 2.0 */
+ if (mantissa & 0x800000)
+ {
+ exponent += 1;
+ mantissa &= 0x7FFFFF;
+ mantissa >>= 1;
+ }
+ }
+
+ /* Combine fields */
+ out.i = mantissa;
+ out.i |= (uint32_t)(exponent + 127) << 23;
+ out.i |= (uint32_t)sign << 31;
+
+ *dest = out.f;
+ return true;
+}
+#endif
diff --git a/security/container/protos/nanopb/pb_decode.h b/security/container/protos/nanopb/pb_decode.h
new file mode 100644
index 0000000..824acd4
--- /dev/null
+++ b/security/container/protos/nanopb/pb_decode.h
@@ -0,0 +1,199 @@
+/* pb_decode.h: Functions to decode protocol buffers. Depends on pb_decode.c.
+ * The main function is pb_decode. You also need an input stream, and the
+ * field descriptions created by nanopb_generator.py.
+ */
+
+#ifndef PB_DECODE_H_INCLUDED
+#define PB_DECODE_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Structure for defining custom input streams. You will need to provide
+ * a callback function to read the bytes from your storage, which can be
+ * for example a file or a network socket.
+ *
+ * The callback must conform to these rules:
+ *
+ * 1) Return false on IO errors. This will cause decoding to abort.
+ * 2) You can use state to store your own data (e.g. buffer pointer),
+ * and rely on pb_read to verify that no-body reads past bytes_left.
+ * 3) Your callback may be used with substreams, in which case bytes_left
+ * is different than from the main stream. Don't use bytes_left to compute
+ * any pointers.
+ */
+struct pb_istream_s
+{
+#ifdef PB_BUFFER_ONLY
+ /* Callback pointer is not used in buffer-only configuration.
+ * Having an int pointer here allows binary compatibility but
+ * gives an error if someone tries to assign callback function.
+ */
+ int *callback;
+#else
+ bool (*callback)(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+#endif
+
+ void *state; /* Free field for use by callback implementation */
+ size_t bytes_left;
+
+#ifndef PB_NO_ERRMSG
+ const char *errmsg;
+#endif
+};
+
+#ifndef PB_NO_ERRMSG
+#define PB_ISTREAM_EMPTY {0,0,0,0}
+#else
+#define PB_ISTREAM_EMPTY {0,0,0}
+#endif
+
+/***************************
+ * Main decoding functions *
+ ***************************/
+
+/* Decode a single protocol buffers message from input stream into a C structure.
+ * Returns true on success, false on any failure.
+ * The actual struct pointed to by dest must match the description in fields.
+ * Callback fields of the destination structure must be initialized by caller.
+ * All other fields will be initialized by this function.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * uint8_t buffer[64];
+ * pb_istream_t stream;
+ *
+ * // ... read some data into buffer ...
+ *
+ * stream = pb_istream_from_buffer(buffer, count);
+ * pb_decode(&stream, MyMessage_fields, &msg);
+ */
+bool pb_decode(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct);
+
+/* Extended version of pb_decode, with several options to control
+ * the decoding process:
+ *
+ * PB_DECODE_NOINIT: Do not initialize the fields to default values.
+ * This is slightly faster if you do not need the default
+ * values and instead initialize the structure to 0 using
+ * e.g. memset(). This can also be used for merging two
+ * messages, i.e. combine already existing data with new
+ * values.
+ *
+ * PB_DECODE_DELIMITED: Input message starts with the message size as varint.
+ * Corresponds to parseDelimitedFrom() in Google's
+ * protobuf API.
+ *
+ * PB_DECODE_NULLTERMINATED: Stop reading when field tag is read as 0. This allows
+ * reading null terminated messages.
+ * NOTE: Until nanopb-0.4.0, pb_decode() also allows
+ * null-termination. This behaviour is not supported in
+ * most other protobuf implementations, so PB_DECODE_DELIMITED
+ * is a better option for compatibility.
+ *
+ * Multiple flags can be combined with bitwise or (| operator)
+ */
+#define PB_DECODE_NOINIT 0x01U
+#define PB_DECODE_DELIMITED 0x02U
+#define PB_DECODE_NULLTERMINATED 0x04U
+bool pb_decode_ex(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags);
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define pb_decode_noinit(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_NOINIT)
+#define pb_decode_delimited(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_DELIMITED)
+#define pb_decode_delimited_noinit(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_DELIMITED | PB_DECODE_NOINIT)
+#define pb_decode_nullterminated(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_NULLTERMINATED)
+
+#ifdef PB_ENABLE_MALLOC
+/* Release any allocated pointer fields. If you use dynamic allocation, you should
+ * call this for any successfully decoded message when you are done with it. If
+ * pb_decode() returns with an error, the message is already released.
+ */
+void pb_release(const pb_msgdesc_t *fields, void *dest_struct);
+#else
+/* Allocation is not supported, so release is no-op */
+#define pb_release(fields, dest_struct) PB_UNUSED(fields); PB_UNUSED(dest_struct);
+#endif
+
+
+/**************************************
+ * Functions for manipulating streams *
+ **************************************/
+
+/* Create an input stream for reading from a memory buffer.
+ *
+ * msglen should be the actual length of the message, not the full size of
+ * allocated buffer.
+ *
+ * Alternatively, you can use a custom stream that reads directly from e.g.
+ * a file or a network socket.
+ */
+pb_istream_t pb_istream_from_buffer(const pb_byte_t *buf, size_t msglen);
+
+/* Function to read from a pb_istream_t. You can use this if you need to
+ * read some custom header data, or to read data in field callbacks.
+ */
+bool pb_read(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+
+
+/************************************************
+ * Helper functions for writing field callbacks *
+ ************************************************/
+
+/* Decode the tag for the next field in the stream. Gives the wire type and
+ * field tag. At end of the message, returns false and sets eof to true. */
+bool pb_decode_tag(pb_istream_t *stream, pb_wire_type_t *wire_type, uint32_t *tag, bool *eof);
+
+/* Skip the field payload data, given the wire type. */
+bool pb_skip_field(pb_istream_t *stream, pb_wire_type_t wire_type);
+
+/* Decode an integer in the varint format. This works for enum, int32,
+ * int64, uint32 and uint64 field types. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_varint(pb_istream_t *stream, uint64_t *dest);
+#else
+#define pb_decode_varint pb_decode_varint32
+#endif
+
+/* Decode an integer in the varint format. This works for enum, int32,
+ * and uint32 field types. */
+bool pb_decode_varint32(pb_istream_t *stream, uint32_t *dest);
+
+/* Decode a bool value in varint format. */
+bool pb_decode_bool(pb_istream_t *stream, bool *dest);
+
+/* Decode an integer in the zig-zagged svarint format. This works for sint32
+ * and sint64. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_svarint(pb_istream_t *stream, int64_t *dest);
+#else
+bool pb_decode_svarint(pb_istream_t *stream, int32_t *dest);
+#endif
+
+/* Decode a fixed32, sfixed32 or float value. You need to pass a pointer to
+ * a 4-byte wide C variable. */
+bool pb_decode_fixed32(pb_istream_t *stream, void *dest);
+
+#ifndef PB_WITHOUT_64BIT
+/* Decode a fixed64, sfixed64 or double value. You need to pass a pointer to
+ * a 8-byte wide C variable. */
+bool pb_decode_fixed64(pb_istream_t *stream, void *dest);
+#endif
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Decode a double value into float variable. */
+bool pb_decode_double_as_float(pb_istream_t *stream, float *dest);
+#endif
+
+/* Make a limited-length substream for reading a PB_WT_STRING field. */
+bool pb_make_string_substream(pb_istream_t *stream, pb_istream_t *substream);
+bool pb_close_string_substream(pb_istream_t *stream, pb_istream_t *substream);
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/nanopb/pb_encode.c b/security/container/protos/nanopb/pb_encode.c
new file mode 100644
index 0000000..de716f7
--- /dev/null
+++ b/security/container/protos/nanopb/pb_encode.c
@@ -0,0 +1,987 @@
+/* pb_encode.c -- encode a protobuf using minimal resources
+ *
+ * 2011 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+#include "pb.h"
+#include "pb_encode.h"
+#include "pb_common.h"
+
+/* Use the GCC warn_unused_result attribute to check that all return values
+ * are propagated correctly. On other compilers and gcc before 3.4.0 just
+ * ignore the annotation.
+ */
+#if !defined(__GNUC__) || ( __GNUC__ < 3) || (__GNUC__ == 3 && __GNUC_MINOR__ < 4)
+ #define checkreturn
+#else
+ #define checkreturn __attribute__((warn_unused_result))
+#endif
+
+/**************************************
+ * Declarations internal to this file *
+ **************************************/
+static bool checkreturn buf_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+static bool checkreturn encode_array(pb_ostream_t *stream, pb_field_iter_t *field);
+static bool checkreturn pb_check_proto3_default_value(const pb_field_iter_t *field);
+static bool checkreturn encode_basic_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn encode_callback_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn encode_field(pb_ostream_t *stream, pb_field_iter_t *field);
+static bool checkreturn encode_extension_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn default_extension_encoder(pb_ostream_t *stream, const pb_extension_t *extension);
+static bool checkreturn pb_encode_varint_32(pb_ostream_t *stream, uint32_t low, uint32_t high);
+static bool checkreturn pb_enc_bool(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_varint(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_fixed(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_bytes(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_string(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_submessage(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_fixed_length_bytes(pb_ostream_t *stream, const pb_field_iter_t *field);
+
+#ifdef PB_WITHOUT_64BIT
+#define pb_int64_t int32_t
+#define pb_uint64_t uint32_t
+#else
+#define pb_int64_t int64_t
+#define pb_uint64_t uint64_t
+#endif
+
+/*******************************
+ * pb_ostream_t implementation *
+ *******************************/
+
+static bool checkreturn buf_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count)
+{
+ size_t i;
+ pb_byte_t *dest = (pb_byte_t*)stream->state;
+ stream->state = dest + count;
+
+ for (i = 0; i < count; i++)
+ dest[i] = buf[i];
+
+ return true;
+}
+
+pb_ostream_t pb_ostream_from_buffer(pb_byte_t *buf, size_t bufsize)
+{
+ pb_ostream_t stream;
+#ifdef PB_BUFFER_ONLY
+ stream.callback = (void*)1; /* Just a marker value */
+#else
+ stream.callback = &buf_write;
+#endif
+ stream.state = buf;
+ stream.max_size = bufsize;
+ stream.bytes_written = 0;
+#ifndef PB_NO_ERRMSG
+ stream.errmsg = NULL;
+#endif
+ return stream;
+}
+
+bool checkreturn pb_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count)
+{
+ if (count > 0 && stream->callback != NULL)
+ {
+ if (stream->bytes_written + count < stream->bytes_written ||
+ stream->bytes_written + count > stream->max_size)
+ {
+ PB_RETURN_ERROR(stream, "stream full");
+ }
+
+#ifdef PB_BUFFER_ONLY
+ if (!buf_write(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ if (!stream->callback(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#endif
+ }
+
+ stream->bytes_written += count;
+ return true;
+}
+
+/*************************
+ * Encode a single field *
+ *************************/
+
+/* Read a bool value without causing undefined behavior even if the value
+ * is invalid. See issue #434 and
+ * https://stackoverflow.com/questions/27661768/weird-results-for-conditional
+ */
+static bool safe_read_bool(const void *pSize)
+{
+ const char *p = (const char *)pSize;
+ size_t i;
+ for (i = 0; i < sizeof(bool); i++)
+ {
+ if (p[i] != 0)
+ return true;
+ }
+ return false;
+}
+
+/* Encode a static array. Handles the size calculations and possible packing. */
+static bool checkreturn encode_array(pb_ostream_t *stream, pb_field_iter_t *field)
+{
+ pb_size_t i;
+ pb_size_t count;
+#ifndef PB_ENCODE_ARRAYS_UNPACKED
+ size_t size;
+#endif
+
+ count = *(pb_size_t*)field->pSize;
+
+ if (count == 0)
+ return true;
+
+ if (PB_ATYPE(field->type) != PB_ATYPE_POINTER && count > field->array_size)
+ PB_RETURN_ERROR(stream, "array max size exceeded");
+
+#ifndef PB_ENCODE_ARRAYS_UNPACKED
+ /* We always pack arrays if the datatype allows it. */
+ if (PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ if (!pb_encode_tag(stream, PB_WT_STRING, field->tag))
+ return false;
+
+ /* Determine the total size of packed array. */
+ if (PB_LTYPE(field->type) == PB_LTYPE_FIXED32)
+ {
+ size = 4 * (size_t)count;
+ }
+ else if (PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ size = 8 * (size_t)count;
+ }
+ else
+ {
+ pb_ostream_t sizestream = PB_OSTREAM_SIZING;
+ void *pData_orig = field->pData;
+ for (i = 0; i < count; i++)
+ {
+ if (!pb_enc_varint(&sizestream, field))
+ PB_RETURN_ERROR(stream, PB_GET_ERROR(&sizestream));
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ field->pData = pData_orig;
+ size = sizestream.bytes_written;
+ }
+
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ if (stream->callback == NULL)
+ return pb_write(stream, NULL, size); /* Just sizing.. */
+
+ /* Write the data */
+ for (i = 0; i < count; i++)
+ {
+ if (PB_LTYPE(field->type) == PB_LTYPE_FIXED32 || PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ if (!pb_enc_fixed(stream, field))
+ return false;
+ }
+ else
+ {
+ if (!pb_enc_varint(stream, field))
+ return false;
+ }
+
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+ else /* Unpacked fields */
+#endif
+ {
+ for (i = 0; i < count; i++)
+ {
+ /* Normally the data is stored directly in the array entries, but
+ * for pointer-type string and bytes fields, the array entries are
+ * actually pointers themselves also. So we have to dereference once
+ * more to get to the actual data. */
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER &&
+ (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES))
+ {
+ bool status;
+ void *pData_orig = field->pData;
+ field->pData = *(void* const*)field->pData;
+
+ if (!field->pData)
+ {
+ /* Null pointer in array is treated as empty string / bytes */
+ status = pb_encode_tag_for_field(stream, field) &&
+ pb_encode_varint(stream, 0);
+ }
+ else
+ {
+ status = encode_basic_field(stream, field);
+ }
+
+ field->pData = pData_orig;
+
+ if (!status)
+ return false;
+ }
+ else
+ {
+ if (!encode_basic_field(stream, field))
+ return false;
+ }
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+
+ return true;
+}
+
+/* In proto3, all fields are optional and are only encoded if their value is "non-zero".
+ * This function implements the check for the zero value. */
+static bool checkreturn pb_check_proto3_default_value(const pb_field_iter_t *field)
+{
+ pb_type_t type = field->type;
+
+ if (PB_ATYPE(type) == PB_ATYPE_STATIC)
+ {
+ if (PB_HTYPE(type) == PB_HTYPE_REQUIRED)
+ {
+ /* Required proto2 fields inside proto3 submessage, pretty rare case */
+ return false;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ /* Repeated fields inside proto3 submessage: present if count != 0 */
+ return *(const pb_size_t*)field->pSize == 0;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ /* Oneof fields */
+ return *(const pb_size_t*)field->pSize == 0;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_OPTIONAL && field->pSize != NULL)
+ {
+ /* Proto2 optional fields inside proto3 message, or proto3
+ * submessage fields. */
+ return safe_read_bool(field->pSize) == false;
+ }
+ else if (field->descriptor->default_value)
+ {
+ /* Proto3 messages do not have default values, but proto2 messages
+ * can contain optional fields without has_fields (generator option 'proto3').
+ * In this case they must always be encoded, to make sure that the
+ * non-zero default value is overwritten.
+ */
+ return false;
+ }
+
+ /* Rest is proto3 singular fields */
+ if (PB_LTYPE(type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Simple integer / float fields */
+ pb_size_t i;
+ const char *p = (const char*)field->pData;
+ for (i = 0; i < field->data_size; i++)
+ {
+ if (p[i] != 0)
+ {
+ return false;
+ }
+ }
+
+ return true;
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_BYTES)
+ {
+ const pb_bytes_array_t *bytes = (const pb_bytes_array_t*)field->pData;
+ return bytes->size == 0;
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_STRING)
+ {
+ return *(const char*)field->pData == '\0';
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_FIXED_LENGTH_BYTES)
+ {
+ /* Fixed length bytes is only empty if its length is fixed
+ * as 0. Which would be pretty strange, but we can check
+ * it anyway. */
+ return field->data_size == 0;
+ }
+ else if (PB_LTYPE_IS_SUBMSG(type))
+ {
+ /* Check all fields in the submessage to find if any of them
+ * are non-zero. The comparison cannot be done byte-per-byte
+ * because the C struct may contain padding bytes that must
+ * be skipped. Note that usually proto3 submessages have
+ * a separate has_field that is checked earlier in this if.
+ */
+ pb_field_iter_t iter;
+ if (pb_field_iter_begin(&iter, field->submsg_desc, field->pData))
+ {
+ do
+ {
+ if (!pb_check_proto3_default_value(&iter))
+ {
+ return false;
+ }
+ } while (pb_field_iter_next(&iter));
+ }
+ return true;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ return field->pData == NULL;
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_CALLBACK)
+ {
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ const pb_extension_t *extension = *(const pb_extension_t* const *)field->pData;
+ return extension == NULL;
+ }
+ else if (field->descriptor->field_callback == pb_default_field_callback)
+ {
+ pb_callback_t *pCallback = (pb_callback_t*)field->pData;
+ return pCallback->funcs.encode == NULL;
+ }
+ else
+ {
+ return field->descriptor->field_callback == NULL;
+ }
+ }
+
+ return false; /* Not typically reached, safe default for weird special cases. */
+}
+
+/* Encode a field with static or pointer allocation, i.e. one whose data
+ * is available to the encoder directly. */
+static bool checkreturn encode_basic_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (!field->pData)
+ {
+ /* Missing pointer field */
+ return true;
+ }
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ return pb_enc_bool(stream, field);
+
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ return pb_enc_varint(stream, field);
+
+ case PB_LTYPE_FIXED32:
+ case PB_LTYPE_FIXED64:
+ return pb_enc_fixed(stream, field);
+
+ case PB_LTYPE_BYTES:
+ return pb_enc_bytes(stream, field);
+
+ case PB_LTYPE_STRING:
+ return pb_enc_string(stream, field);
+
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ return pb_enc_submessage(stream, field);
+
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ return pb_enc_fixed_length_bytes(stream, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+/* Encode a field with callback semantics. This means that a user function is
+ * called to provide and encode the actual data. */
+static bool checkreturn encode_callback_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (field->descriptor->field_callback != NULL)
+ {
+ if (!field->descriptor->field_callback(NULL, stream, field))
+ PB_RETURN_ERROR(stream, "callback error");
+ }
+ return true;
+}
+
+/* Encode a single field of any callback, pointer or static type. */
+static bool checkreturn encode_field(pb_ostream_t *stream, pb_field_iter_t *field)
+{
+ /* Check field presence */
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ if (*(const pb_size_t*)field->pSize != field->tag)
+ {
+ /* Different type oneof field */
+ return true;
+ }
+ }
+ else if (PB_HTYPE(field->type) == PB_HTYPE_OPTIONAL)
+ {
+ if (field->pSize)
+ {
+ if (safe_read_bool(field->pSize) == false)
+ {
+ /* Missing optional field */
+ return true;
+ }
+ }
+ else if (PB_ATYPE(field->type) == PB_ATYPE_STATIC)
+ {
+ /* Proto3 singular field */
+ if (pb_check_proto3_default_value(field))
+ return true;
+ }
+ }
+
+ if (!field->pData)
+ {
+ if (PB_HTYPE(field->type) == PB_HTYPE_REQUIRED)
+ PB_RETURN_ERROR(stream, "missing required field");
+
+ /* Pointer field set to NULL */
+ return true;
+ }
+
+ /* Then encode field contents */
+ if (PB_ATYPE(field->type) == PB_ATYPE_CALLBACK)
+ {
+ return encode_callback_field(stream, field);
+ }
+ else if (PB_HTYPE(field->type) == PB_HTYPE_REPEATED)
+ {
+ return encode_array(stream, field);
+ }
+ else
+ {
+ return encode_basic_field(stream, field);
+ }
+}
+
+/* Default handler for extension fields. Expects to have a pb_msgdesc_t
+ * pointer in the extension->type->arg field, pointing to a message with
+ * only one field in it. */
+static bool checkreturn default_extension_encoder(pb_ostream_t *stream, const pb_extension_t *extension)
+{
+ pb_field_iter_t iter;
+
+ if (!pb_field_iter_begin_extension_const(&iter, extension))
+ PB_RETURN_ERROR(stream, "invalid extension");
+
+ return encode_field(stream, &iter);
+}
+
+
+/* Walk through all the registered extensions and give them a chance
+ * to encode themselves. */
+static bool checkreturn encode_extension_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ const pb_extension_t *extension = *(const pb_extension_t* const *)field->pData;
+
+ while (extension)
+ {
+ bool status;
+ if (extension->type->encode)
+ status = extension->type->encode(stream, extension);
+ else
+ status = default_extension_encoder(stream, extension);
+
+ if (!status)
+ return false;
+
+ extension = extension->next;
+ }
+
+ return true;
+}
+
+/*********************
+ * Encode all fields *
+ *********************/
+
+bool checkreturn pb_encode(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ pb_field_iter_t iter;
+ if (!pb_field_iter_begin_const(&iter, fields, src_struct))
+ return true; /* Empty message type */
+
+ do {
+ if (PB_LTYPE(iter.type) == PB_LTYPE_EXTENSION)
+ {
+ /* Special case for the extension field placeholder */
+ if (!encode_extension_field(stream, &iter))
+ return false;
+ }
+ else
+ {
+ /* Regular field */
+ if (!encode_field(stream, &iter))
+ return false;
+ }
+ } while (pb_field_iter_next(&iter));
+
+ return true;
+}
+
+bool checkreturn pb_encode_ex(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct, unsigned int flags)
+{
+ if ((flags & PB_ENCODE_DELIMITED) != 0)
+ {
+ return pb_encode_submessage(stream, fields, src_struct);
+ }
+ else if ((flags & PB_ENCODE_NULLTERMINATED) != 0)
+ {
+ const pb_byte_t zero = 0;
+
+ if (!pb_encode(stream, fields, src_struct))
+ return false;
+
+ return pb_write(stream, &zero, 1);
+ }
+ else
+ {
+ return pb_encode(stream, fields, src_struct);
+ }
+}
+
+bool pb_get_encoded_size(size_t *size, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ pb_ostream_t stream = PB_OSTREAM_SIZING;
+
+ if (!pb_encode(&stream, fields, src_struct))
+ return false;
+
+ *size = stream.bytes_written;
+ return true;
+}
+
+/********************
+ * Helper functions *
+ ********************/
+
+/* This function avoids 64-bit shifts as they are quite slow on many platforms. */
+static bool checkreturn pb_encode_varint_32(pb_ostream_t *stream, uint32_t low, uint32_t high)
+{
+ size_t i = 0;
+ pb_byte_t buffer[10];
+ pb_byte_t byte = (pb_byte_t)(low & 0x7F);
+ low >>= 7;
+
+ while (i < 4 && (low != 0 || high != 0))
+ {
+ byte |= 0x80;
+ buffer[i++] = byte;
+ byte = (pb_byte_t)(low & 0x7F);
+ low >>= 7;
+ }
+
+ if (high)
+ {
+ byte = (pb_byte_t)(byte | ((high & 0x07) << 4));
+ high >>= 3;
+
+ while (high)
+ {
+ byte |= 0x80;
+ buffer[i++] = byte;
+ byte = (pb_byte_t)(high & 0x7F);
+ high >>= 7;
+ }
+ }
+
+ buffer[i++] = byte;
+
+ return pb_write(stream, buffer, i);
+}
+
+bool checkreturn pb_encode_varint(pb_ostream_t *stream, pb_uint64_t value)
+{
+ if (value <= 0x7F)
+ {
+ /* Fast path: single byte */
+ pb_byte_t byte = (pb_byte_t)value;
+ return pb_write(stream, &byte, 1);
+ }
+ else
+ {
+#ifdef PB_WITHOUT_64BIT
+ return pb_encode_varint_32(stream, value, 0);
+#else
+ return pb_encode_varint_32(stream, (uint32_t)value, (uint32_t)(value >> 32));
+#endif
+ }
+}
+
+bool checkreturn pb_encode_svarint(pb_ostream_t *stream, pb_int64_t value)
+{
+ pb_uint64_t zigzagged;
+ if (value < 0)
+ zigzagged = ~((pb_uint64_t)value << 1);
+ else
+ zigzagged = (pb_uint64_t)value << 1;
+
+ return pb_encode_varint(stream, zigzagged);
+}
+
+bool checkreturn pb_encode_fixed32(pb_ostream_t *stream, const void *value)
+{
+ uint32_t val = *(const uint32_t*)value;
+ pb_byte_t bytes[4];
+ bytes[0] = (pb_byte_t)(val & 0xFF);
+ bytes[1] = (pb_byte_t)((val >> 8) & 0xFF);
+ bytes[2] = (pb_byte_t)((val >> 16) & 0xFF);
+ bytes[3] = (pb_byte_t)((val >> 24) & 0xFF);
+ return pb_write(stream, bytes, 4);
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool checkreturn pb_encode_fixed64(pb_ostream_t *stream, const void *value)
+{
+ uint64_t val = *(const uint64_t*)value;
+ pb_byte_t bytes[8];
+ bytes[0] = (pb_byte_t)(val & 0xFF);
+ bytes[1] = (pb_byte_t)((val >> 8) & 0xFF);
+ bytes[2] = (pb_byte_t)((val >> 16) & 0xFF);
+ bytes[3] = (pb_byte_t)((val >> 24) & 0xFF);
+ bytes[4] = (pb_byte_t)((val >> 32) & 0xFF);
+ bytes[5] = (pb_byte_t)((val >> 40) & 0xFF);
+ bytes[6] = (pb_byte_t)((val >> 48) & 0xFF);
+ bytes[7] = (pb_byte_t)((val >> 56) & 0xFF);
+ return pb_write(stream, bytes, 8);
+}
+#endif
+
+bool checkreturn pb_encode_tag(pb_ostream_t *stream, pb_wire_type_t wiretype, uint32_t field_number)
+{
+ pb_uint64_t tag = ((pb_uint64_t)field_number << 3) | wiretype;
+ return pb_encode_varint(stream, tag);
+}
+
+bool pb_encode_tag_for_field ( pb_ostream_t* stream, const pb_field_iter_t* field )
+{
+ pb_wire_type_t wiretype;
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ wiretype = PB_WT_VARINT;
+ break;
+
+ case PB_LTYPE_FIXED32:
+ wiretype = PB_WT_32BIT;
+ break;
+
+ case PB_LTYPE_FIXED64:
+ wiretype = PB_WT_64BIT;
+ break;
+
+ case PB_LTYPE_BYTES:
+ case PB_LTYPE_STRING:
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ wiretype = PB_WT_STRING;
+ break;
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+
+ return pb_encode_tag(stream, wiretype, field->tag);
+}
+
+bool checkreturn pb_encode_string(pb_ostream_t *stream, const pb_byte_t *buffer, size_t size)
+{
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ return pb_write(stream, buffer, size);
+}
+
+bool checkreturn pb_encode_submessage(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ /* First calculate the message size using a non-writing substream. */
+ pb_ostream_t substream = PB_OSTREAM_SIZING;
+ size_t size;
+ bool status;
+
+ if (!pb_encode(&substream, fields, src_struct))
+ {
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream.errmsg;
+#endif
+ return false;
+ }
+
+ size = substream.bytes_written;
+
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ if (stream->callback == NULL)
+ return pb_write(stream, NULL, size); /* Just sizing */
+
+ if (stream->bytes_written + size > stream->max_size)
+ PB_RETURN_ERROR(stream, "stream full");
+
+ /* Use a substream to verify that a callback doesn't write more than
+ * what it did the first time. */
+ substream.callback = stream->callback;
+ substream.state = stream->state;
+ substream.max_size = size;
+ substream.bytes_written = 0;
+#ifndef PB_NO_ERRMSG
+ substream.errmsg = NULL;
+#endif
+
+ status = pb_encode(&substream, fields, src_struct);
+
+ stream->bytes_written += substream.bytes_written;
+ stream->state = substream.state;
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream.errmsg;
+#endif
+
+ if (substream.bytes_written != size)
+ PB_RETURN_ERROR(stream, "submsg size changed");
+
+ return status;
+}
+
+/* Field encoders */
+
+static bool checkreturn pb_enc_bool(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t value = safe_read_bool(field->pData) ? 1 : 0;
+ PB_UNUSED(field);
+ return pb_encode_varint(stream, value);
+}
+
+static bool checkreturn pb_enc_varint(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_UVARINT)
+ {
+ /* Perform unsigned integer extension */
+ pb_uint64_t value = 0;
+
+ if (field->data_size == sizeof(uint_least8_t))
+ value = *(const uint_least8_t*)field->pData;
+ else if (field->data_size == sizeof(uint_least16_t))
+ value = *(const uint_least16_t*)field->pData;
+ else if (field->data_size == sizeof(uint32_t))
+ value = *(const uint32_t*)field->pData;
+ else if (field->data_size == sizeof(pb_uint64_t))
+ value = *(const pb_uint64_t*)field->pData;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ return pb_encode_varint(stream, value);
+ }
+ else
+ {
+ /* Perform signed integer extension */
+ pb_int64_t value = 0;
+
+ if (field->data_size == sizeof(int_least8_t))
+ value = *(const int_least8_t*)field->pData;
+ else if (field->data_size == sizeof(int_least16_t))
+ value = *(const int_least16_t*)field->pData;
+ else if (field->data_size == sizeof(int32_t))
+ value = *(const int32_t*)field->pData;
+ else if (field->data_size == sizeof(pb_int64_t))
+ value = *(const pb_int64_t*)field->pData;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SVARINT)
+ return pb_encode_svarint(stream, value);
+#ifdef PB_WITHOUT_64BIT
+ else if (value < 0)
+ return pb_encode_varint_32(stream, (uint32_t)value, (uint32_t)-1);
+#endif
+ else
+ return pb_encode_varint(stream, (pb_uint64_t)value);
+
+ }
+}
+
+static bool checkreturn pb_enc_fixed(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+ if (field->data_size == sizeof(float) && PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ return pb_encode_float_as_double(stream, *(float*)field->pData);
+ }
+#endif
+
+ if (field->data_size == sizeof(uint32_t))
+ {
+ return pb_encode_fixed32(stream, field->pData);
+ }
+#ifndef PB_WITHOUT_64BIT
+ else if (field->data_size == sizeof(uint64_t))
+ {
+ return pb_encode_fixed64(stream, field->pData);
+ }
+#endif
+ else
+ {
+ PB_RETURN_ERROR(stream, "invalid data_size");
+ }
+}
+
+static bool checkreturn pb_enc_bytes(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ const pb_bytes_array_t *bytes = NULL;
+
+ bytes = (const pb_bytes_array_t*)field->pData;
+
+ if (bytes == NULL)
+ {
+ /* Treat null pointer as an empty bytes field */
+ return pb_encode_string(stream, NULL, 0);
+ }
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_STATIC &&
+ bytes->size > field->data_size - offsetof(pb_bytes_array_t, bytes))
+ {
+ PB_RETURN_ERROR(stream, "bytes size exceeded");
+ }
+
+ return pb_encode_string(stream, bytes->bytes, (size_t)bytes->size);
+}
+
+static bool checkreturn pb_enc_string(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ size_t size = 0;
+ size_t max_size = (size_t)field->data_size;
+ const char *str = (const char*)field->pData;
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+ max_size = (size_t)-1;
+ }
+ else
+ {
+ /* pb_dec_string() assumes string fields end with a null
+ * terminator when the type isn't PB_ATYPE_POINTER, so we
+ * shouldn't allow more than max-1 bytes to be written to
+ * allow space for the null terminator.
+ */
+ if (max_size == 0)
+ PB_RETURN_ERROR(stream, "zero-length string");
+
+ max_size -= 1;
+ }
+
+
+ if (str == NULL)
+ {
+ size = 0; /* Treat null pointer as an empty string */
+ }
+ else
+ {
+ const char *p = str;
+
+ /* strnlen() is not always available, so just use a loop */
+ while (size < max_size && *p != '\0')
+ {
+ size++;
+ p++;
+ }
+
+ if (*p != '\0')
+ {
+ PB_RETURN_ERROR(stream, "unterminated string");
+ }
+ }
+
+#ifdef PB_VALIDATE_UTF8
+ if (!pb_validate_utf8(str))
+ PB_RETURN_ERROR(stream, "invalid utf8");
+#endif
+
+ return pb_encode_string(stream, (const pb_byte_t*)str, size);
+}
+
+static bool checkreturn pb_enc_submessage(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (field->submsg_desc == NULL)
+ PB_RETURN_ERROR(stream, "invalid field descriptor");
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SUBMSG_W_CB && field->pSize != NULL)
+ {
+ /* Message callback is stored right before pSize. */
+ pb_callback_t *callback = (pb_callback_t*)field->pSize - 1;
+ if (callback->funcs.encode)
+ {
+ if (!callback->funcs.encode(stream, field, &callback->arg))
+ return false;
+ }
+ }
+
+ return pb_encode_submessage(stream, field->submsg_desc, field->pData);
+}
+
+static bool checkreturn pb_enc_fixed_length_bytes(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ return pb_encode_string(stream, (const pb_byte_t*)field->pData, (size_t)field->data_size);
+}
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+bool pb_encode_float_as_double(pb_ostream_t *stream, float value)
+{
+ union { float f; uint32_t i; } in;
+ uint_least8_t sign;
+ int exponent;
+ uint64_t mantissa;
+
+ in.f = value;
+
+ /* Decompose input value */
+ sign = (uint_least8_t)((in.i >> 31) & 1);
+ exponent = (int)((in.i >> 23) & 0xFF) - 127;
+ mantissa = in.i & 0x7FFFFF;
+
+ if (exponent == 128)
+ {
+ /* Special value (NaN etc.) */
+ exponent = 1024;
+ }
+ else if (exponent == -127)
+ {
+ if (!mantissa)
+ {
+ /* Zero */
+ exponent = -1023;
+ }
+ else
+ {
+ /* Denormalized */
+ mantissa <<= 1;
+ while (!(mantissa & 0x800000))
+ {
+ mantissa <<= 1;
+ exponent--;
+ }
+ mantissa &= 0x7FFFFF;
+ }
+ }
+
+ /* Combine fields */
+ mantissa <<= 29;
+ mantissa |= (uint64_t)(exponent + 1023) << 52;
+ mantissa |= (uint64_t)sign << 63;
+
+ return pb_encode_fixed64(stream, &mantissa);
+}
+#endif
diff --git a/security/container/protos/nanopb/pb_encode.h b/security/container/protos/nanopb/pb_encode.h
new file mode 100644
index 0000000..9cff22a
--- /dev/null
+++ b/security/container/protos/nanopb/pb_encode.h
@@ -0,0 +1,185 @@
+/* pb_encode.h: Functions to encode protocol buffers. Depends on pb_encode.c.
+ * The main function is pb_encode. You also need an output stream, and the
+ * field descriptions created by nanopb_generator.py.
+ */
+
+#ifndef PB_ENCODE_H_INCLUDED
+#define PB_ENCODE_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Structure for defining custom output streams. You will need to provide
+ * a callback function to write the bytes to your storage, which can be
+ * for example a file or a network socket.
+ *
+ * The callback must conform to these rules:
+ *
+ * 1) Return false on IO errors. This will cause encoding to abort.
+ * 2) You can use state to store your own data (e.g. buffer pointer).
+ * 3) pb_write will update bytes_written after your callback runs.
+ * 4) Substreams will modify max_size and bytes_written. Don't use them
+ * to calculate any pointers.
+ */
+struct pb_ostream_s
+{
+#ifdef PB_BUFFER_ONLY
+ /* Callback pointer is not used in buffer-only configuration.
+ * Having an int pointer here allows binary compatibility but
+ * gives an error if someone tries to assign callback function.
+ * Also, NULL pointer marks a 'sizing stream' that does not
+ * write anything.
+ */
+ int *callback;
+#else
+ bool (*callback)(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+#endif
+ void *state; /* Free field for use by callback implementation. */
+ size_t max_size; /* Limit number of output bytes written (or use SIZE_MAX). */
+ size_t bytes_written; /* Number of bytes written so far. */
+
+#ifndef PB_NO_ERRMSG
+ const char *errmsg;
+#endif
+};
+
+/***************************
+ * Main encoding functions *
+ ***************************/
+
+/* Encode a single protocol buffers message from C structure into a stream.
+ * Returns true on success, false on any failure.
+ * The actual struct pointed to by src_struct must match the description in fields.
+ * All required fields in the struct are assumed to have been filled in.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * uint8_t buffer[64];
+ * pb_ostream_t stream;
+ *
+ * msg.field1 = 42;
+ * stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
+ * pb_encode(&stream, MyMessage_fields, &msg);
+ */
+bool pb_encode(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
+
+/* Extended version of pb_encode, with several options to control the
+ * encoding process:
+ *
+ * PB_ENCODE_DELIMITED: Prepend the length of message as a varint.
+ * Corresponds to writeDelimitedTo() in Google's
+ * protobuf API.
+ *
+ * PB_ENCODE_NULLTERMINATED: Append a null byte to the message for termination.
+ * NOTE: This behaviour is not supported in most other
+ * protobuf implementations, so PB_ENCODE_DELIMITED
+ * is a better option for compatibility.
+ */
+#define PB_ENCODE_DELIMITED 0x02U
+#define PB_ENCODE_NULLTERMINATED 0x04U
+bool pb_encode_ex(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct, unsigned int flags);
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define pb_encode_delimited(s,f,d) pb_encode_ex(s,f,d, PB_ENCODE_DELIMITED)
+#define pb_encode_nullterminated(s,f,d) pb_encode_ex(s,f,d, PB_ENCODE_NULLTERMINATED)
+
+/* Encode the message to get the size of the encoded data, but do not store
+ * the data. */
+bool pb_get_encoded_size(size_t *size, const pb_msgdesc_t *fields, const void *src_struct);
+
+/**************************************
+ * Functions for manipulating streams *
+ **************************************/
+
+/* Create an output stream for writing into a memory buffer.
+ * The number of bytes written can be found in stream.bytes_written after
+ * encoding the message.
+ *
+ * Alternatively, you can use a custom stream that writes directly to e.g.
+ * a file or a network socket.
+ */
+pb_ostream_t pb_ostream_from_buffer(pb_byte_t *buf, size_t bufsize);
+
+/* Pseudo-stream for measuring the size of a message without actually storing
+ * the encoded data.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * pb_ostream_t stream = PB_OSTREAM_SIZING;
+ * pb_encode(&stream, MyMessage_fields, &msg);
+ * printf("Message size is %d\n", stream.bytes_written);
+ */
+#ifndef PB_NO_ERRMSG
+#define PB_OSTREAM_SIZING {0,0,0,0,0}
+#else
+#define PB_OSTREAM_SIZING {0,0,0,0}
+#endif
+
+/* Function to write into a pb_ostream_t stream. You can use this if you need
+ * to append or prepend some custom headers to the message.
+ */
+bool pb_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+
+
+/************************************************
+ * Helper functions for writing field callbacks *
+ ************************************************/
+
+/* Encode field header based on type and field number defined in the field
+ * structure. Call this from the callback before writing out field contents. */
+bool pb_encode_tag_for_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+
+/* Encode field header by manually specifying wire type. You need to use this
+ * if you want to write out packed arrays from a callback field. */
+bool pb_encode_tag(pb_ostream_t *stream, pb_wire_type_t wiretype, uint32_t field_number);
+
+/* Encode an integer in the varint format.
+ * This works for bool, enum, int32, int64, uint32 and uint64 field types. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_encode_varint(pb_ostream_t *stream, uint64_t value);
+#else
+bool pb_encode_varint(pb_ostream_t *stream, uint32_t value);
+#endif
+
+/* Encode an integer in the zig-zagged svarint format.
+ * This works for sint32 and sint64. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_encode_svarint(pb_ostream_t *stream, int64_t value);
+#else
+bool pb_encode_svarint(pb_ostream_t *stream, int32_t value);
+#endif
+
+/* Encode a string or bytes type field. For strings, pass strlen(s) as size. */
+bool pb_encode_string(pb_ostream_t *stream, const pb_byte_t *buffer, size_t size);
+
+/* Encode a fixed32, sfixed32 or float value.
+ * You need to pass a pointer to a 4-byte wide C variable. */
+bool pb_encode_fixed32(pb_ostream_t *stream, const void *value);
+
+#ifndef PB_WITHOUT_64BIT
+/* Encode a fixed64, sfixed64 or double value.
+ * You need to pass a pointer to a 8-byte wide C variable. */
+bool pb_encode_fixed64(pb_ostream_t *stream, const void *value);
+#endif
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Encode a float value so that it appears like a double in the encoded
+ * message. */
+bool pb_encode_float_as_double(pb_ostream_t *stream, float value);
+#endif
+
+/* Encode a submessage field.
+ * You need to pass the pb_field_t array and pointer to struct, just like
+ * with pb_encode(). This internally encodes the submessage twice, first to
+ * calculate message size and then to actually write it out.
+ */
+bool pb_encode_submessage(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/pbsystem.h b/security/container/protos/pbsystem.h
new file mode 100644
index 0000000..f2308f8
--- /dev/null
+++ b/security/container/protos/pbsystem.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Header and types for nanopb to work with the Linux kernel */
+#include <linux/kernel.h>
+#include <linux/string.h>
+
+/* Small types. */
+
+/* Signed. */
+typedef signed char int_least8_t;
+typedef short int int_least16_t;
+typedef int int_least32_t;
+typedef long int int_least64_t;
+
+/* Unsigned. */
+typedef unsigned char uint_least8_t;
+typedef unsigned short int uint_least16_t;
+typedef unsigned int uint_least32_t;
+typedef unsigned long int uint_least64_t;
+
+/* Fast types. */
+
+/* Signed. */
+typedef signed char int_fast8_t;
+typedef long int int_fast16_t;
+typedef long int int_fast32_t;
+typedef long int int_fast64_t;
+
+/* Unsigned. */
+typedef unsigned char uint_fast8_t;
+typedef unsigned long int uint_fast16_t;
+typedef unsigned long int uint_fast32_t;
+typedef unsigned long int uint_fast64_t;
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 04b9e46..5d47313 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -13,7 +13,7 @@
#include <linux/fs.h>
#include <linux/xattr.h>
#include <linux/evm.h>
-#include <linux/iversion.h>
+#include <linux/fsverity.h>
#include "ima.h"
@@ -218,10 +218,11 @@
struct inode *inode = file_inode(file);
struct inode *real_inode = d_real_inode(file_dentry(file));
const char *filename = file->f_path.dentry->d_name.name;
+ struct kstat stat;
int result = 0;
int length;
void *tmpbuf;
- u64 i_version;
+ u64 i_version = 0;
struct {
struct ima_digest_data hdr;
char digest[IMA_MAX_DIGEST_SIZE];
@@ -243,7 +244,10 @@
* which do not support i_version, support is limited to an initial
* measurement/appraisal/audit.
*/
- i_version = inode_query_iversion(inode);
+ result = vfs_getattr_nosec(&file->f_path, &stat, STATX_CHANGE_COOKIE,
+ AT_STATX_SYNC_AS_STAT);
+ if (!result && (stat.result_mask & STATX_CHANGE_COOKIE))
+ i_version = stat.change_cookie;
hash.hdr.algo = algo;
/* Initialize hash digest to 0's in case of failure */
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 7cd9df8..f64d86d 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -24,7 +24,6 @@
#include <linux/slab.h>
#include <linux/xattr.h>
#include <linux/ima.h>
-#include <linux/iversion.h>
#include <linux/fs.h>
#include <linux/iversion.h>
@@ -164,11 +163,16 @@
mutex_lock(&iint->mutex);
if (atomic_read(&inode->i_writecount) == 1) {
+ struct kstat stat;
+
update = test_and_clear_bit(IMA_UPDATE_XATTR,
&iint->atomic_flags);
- if (!IS_I_VERSION(inode) ||
- !inode_eq_iversion(inode, iint->version) ||
- (iint->flags & IMA_NEW_FILE)) {
+ if ((iint->flags & IMA_NEW_FILE) ||
+ vfs_getattr_nosec(&file->f_path, &stat,
+ STATX_CHANGE_COOKIE,
+ AT_STATX_SYNC_AS_STAT) ||
+ !(stat.result_mask & STATX_CHANGE_COOKIE) ||
+ stat.change_cookie != iint->version) {
iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
iint->measured_pcrs = 0;
if (update)
diff --git a/security/security.c b/security/security.c
index 33864d0..099989d 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1550,6 +1550,11 @@
}
}
+void security_file_pre_free(struct file *file)
+{
+ call_void_hook(file_pre_free_security, file);
+}
+
int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
return call_int_hook(file_ioctl, 0, file, cmd, arg);
@@ -1684,6 +1689,11 @@
return rc;
}
+void security_task_post_alloc(struct task_struct *task)
+{
+ call_void_hook(task_post_alloc, task);
+}
+
void security_task_free(struct task_struct *task)
{
call_void_hook(task_free, task);
@@ -1903,6 +1913,11 @@
return call_int_hook(task_kill, 0, p, info, sig, cred);
}
+void security_task_exit(struct task_struct *p)
+{
+ call_void_hook(task_exit, p);
+}
+
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5)
{
diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore
index 0e78b49..6fd0c01 100644
--- a/tools/testing/selftests/futex/functional/.gitignore
+++ b/tools/testing/selftests/futex/functional/.gitignore
@@ -2,6 +2,7 @@
futex_requeue_pi
futex_requeue_pi_mismatched_ops
futex_requeue_pi_signal_restart
+futex_swap
futex_wait_private_mapped_file
futex_wait_timeout
futex_wait_uninitialized_heap
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index ece2e38..80a4333 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -14,6 +14,7 @@
futex_requeue_pi \
futex_requeue_pi_signal_restart \
futex_requeue_pi_mismatched_ops \
+ futex_swap \
futex_wait_uninitialized_heap \
futex_wait_private_mapped_file \
futex_wait \
diff --git a/tools/testing/selftests/futex/functional/futex_swap.c b/tools/testing/selftests/futex/functional/futex_swap.c
new file mode 100644
index 0000000..9034d04
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/futex_swap.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <errno.h>
+#include <getopt.h>
+#include <pthread.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <time.h>
+#include "atomic.h"
+#include "futextest.h"
+
+/* The futex the main thread waits on. */
+futex_t futex_main = FUTEX_INITIALIZER;
+/* The futex the other thread wats on. */
+futex_t futex_other = FUTEX_INITIALIZER;
+
+/* The number of iterations to run (>1 => run benchmarks. */
+static int cfg_iterations = 1;
+
+/* If != 0, print diagnostic messages. */
+static int cfg_verbose;
+
+/* If == 0, do not use validation_counter. Useful for benchmarking. */
+static int cfg_validate = 1;
+
+/* How to swap threads. */
+#define SWAP_WAKE_WAIT 1
+#define SWAP_SWAP 2
+
+/* Futex values. */
+#define FUTEX_WAITING 0
+#define FUTEX_WAKEUP 1
+
+/* An atomic counter used to validate proper swapping. */
+static atomic_t validation_counter;
+
+void futex_swap_op(int mode, futex_t *futex_this, futex_t *futex_that)
+{
+ int ret;
+
+ switch (mode) {
+ case SWAP_WAKE_WAIT:
+ futex_set(futex_this, FUTEX_WAITING);
+ futex_set(futex_that, FUTEX_WAKEUP);
+ futex_wake(futex_that, 1, FUTEX_PRIVATE_FLAG);
+ futex_wait(futex_this, FUTEX_WAITING, NULL, FUTEX_PRIVATE_FLAG);
+ if (*futex_this != FUTEX_WAKEUP) {
+ fprintf(stderr, "unexpected futex_this value on wakeup\n");
+ exit(1);
+ }
+ break;
+
+ case SWAP_SWAP:
+ futex_set(futex_this, FUTEX_WAITING);
+ futex_set(futex_that, FUTEX_WAKEUP);
+ ret = futex_swap(futex_this, FUTEX_WAITING, NULL,
+ futex_that, FUTEX_PRIVATE_FLAG);
+ if (ret < 0 && errno == ENOSYS) {
+ /* futex_swap not implemented */
+ perror("futex_swap");
+ exit(1);
+ }
+ if (*futex_this != FUTEX_WAKEUP) {
+ fprintf(stderr, "unexpected futex_this value on wakeup\n");
+ exit(1);
+ }
+ break;
+
+ default:
+ fprintf(stderr, "unknown mode in %s\n", __func__);
+ exit(1);
+ }
+}
+
+void *other_thread(void *arg)
+{
+ int mode = *((int *)arg);
+ int counter;
+
+ if (cfg_verbose)
+ printf("%s started\n", __func__);
+
+ futex_wait(&futex_other, 0, NULL, FUTEX_PRIVATE_FLAG);
+
+ for (counter = 0; counter < cfg_iterations; ++counter) {
+ if (cfg_validate) {
+ int prev = 2 * counter + 1;
+
+ if (prev != atomic_cmpxchg(&validation_counter, prev,
+ prev + 1)) {
+ fprintf(stderr, "swap validation failed\n");
+ exit(1);
+ }
+ }
+ futex_swap_op(mode, &futex_other, &futex_main);
+ }
+
+ if (cfg_verbose)
+ printf("%s finished: %d iteration(s)\n", __func__, counter);
+
+ return NULL;
+}
+
+void run_test(int mode)
+{
+ struct timespec start, stop;
+ int ret, counter;
+ pthread_t thread;
+ uint64_t duration;
+
+ futex_set(&futex_other, FUTEX_WAITING);
+ atomic_set(&validation_counter, 0);
+ ret = pthread_create(&thread, NULL, &other_thread, &mode);
+ if (ret) {
+ perror("pthread_create");
+ exit(1);
+ }
+
+ ret = clock_gettime(CLOCK_MONOTONIC, &start);
+ if (ret) {
+ perror("clock_gettime");
+ exit(1);
+ }
+
+ for (counter = 0; counter < cfg_iterations; ++counter) {
+ if (cfg_validate) {
+ int prev = 2 * counter;
+
+ if (prev != atomic_cmpxchg(&validation_counter, prev,
+ prev + 1)) {
+ fprintf(stderr, "swap validation failed\n");
+ exit(1);
+ }
+ }
+ futex_swap_op(mode, &futex_main, &futex_other);
+ }
+ if (cfg_validate && validation_counter.val != 2 * cfg_iterations) {
+ fprintf(stderr, "final swap validation failed\n");
+ exit(1);
+ }
+
+ ret = clock_gettime(CLOCK_MONOTONIC, &stop);
+ if (ret) {
+ perror("clock_gettime");
+ exit(1);
+ }
+
+ duration = (stop.tv_sec - start.tv_sec) * 1000000000LL +
+ stop.tv_nsec - start.tv_nsec;
+ if (cfg_verbose || cfg_iterations > 1) {
+ printf("completed %d swap and back iterations in %lu ns: %lu ns per swap\n",
+ cfg_iterations, duration,
+ duration / (cfg_iterations * 2));
+ }
+
+ /* The remote thread is blocked; send it the final wake. */
+ futex_set(&futex_other, FUTEX_WAKEUP);
+ futex_wake(&futex_other, 1, FUTEX_PRIVATE_FLAG);
+ if (pthread_join(thread, NULL)) {
+ perror("pthread_join");
+ exit(1);
+ }
+}
+
+void usage(char *prog)
+{
+ printf("Usage: %s\n", prog);
+ printf(" -h Display this help message\n");
+ printf(" -i N Use N iterations to benchmark\n");
+ printf(" -n Do not validate swapping correctness\n");
+ printf(" -v Print diagnostic messages\n");
+}
+
+int main(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "hi:nv")) != -1) {
+ switch (c) {
+ case 'h':
+ usage(basename(argv[0]));
+ exit(0);
+ case 'i':
+ cfg_iterations = atoi(optarg);
+ break;
+ case 'n':
+ cfg_validate = 0;
+ break;
+ case 'v':
+ cfg_verbose = 1;
+ break;
+ default:
+ usage(basename(argv[0]));
+ exit(1);
+ }
+ }
+
+ printf("\n\n------- running SWAP_WAKE_WAIT -----------\n\n");
+ run_test(SWAP_WAKE_WAIT);
+ printf("PASS\n");
+
+ printf("\n\n------- running SWAP_SWAP -----------\n\n");
+ run_test(SWAP_SWAP);
+ printf("PASS\n");
+
+ return 0;
+}
diff --git a/tools/testing/selftests/futex/include/futextest.h b/tools/testing/selftests/futex/include/futextest.h
index ddbcfc9..d2861fd 100644
--- a/tools/testing/selftests/futex/include/futextest.h
+++ b/tools/testing/selftests/futex/include/futextest.h
@@ -38,6 +38,9 @@
#ifndef FUTEX_CMP_REQUEUE_PI
#define FUTEX_CMP_REQUEUE_PI 12
#endif
+#ifndef GFUTEX_SWAP
+#define GFUTEX_SWAP 60
+#endif
#ifndef FUTEX_WAIT_REQUEUE_PI_PRIVATE
#define FUTEX_WAIT_REQUEUE_PI_PRIVATE (FUTEX_WAIT_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
@@ -205,6 +208,19 @@
}
/**
+ * futex_swap() - block on uaddr and wake one task blocked on uaddr2.
+ * @uaddr: futex to block the current task on
+ * @timeout: relative timeout for the current task block
+ * @uaddr2: futex to wake tasks at (can be the same as uaddr)
+ */
+static inline int
+futex_swap(futex_t *uaddr, futex_t val, struct timespec *timeout,
+ futex_t *uaddr2, int opflags)
+{
+ return futex(uaddr, GFUTEX_SWAP, val, timeout, uaddr2, 0, opflags);
+}
+
+/**
* futex_cmpxchg() - atomic compare and exchange
* @uaddr: The address of the futex to be modified
* @oldval: The expected value of the futex
diff --git a/tools/testing/selftests/net/msg_zerocopy.sh b/tools/testing/selftests/net/msg_zerocopy.sh
index 825ffec..89c22f5 100755
--- a/tools/testing/selftests/net/msg_zerocopy.sh
+++ b/tools/testing/selftests/net/msg_zerocopy.sh
@@ -70,23 +70,22 @@
esac
# Start of state changes: install cleanup handler
-save_sysctl_mem="$(sysctl -n ${path_sysctl_mem})"
cleanup() {
ip netns del "${NS2}"
ip netns del "${NS1}"
- sysctl -w -q "${path_sysctl_mem}=${save_sysctl_mem}"
}
trap cleanup EXIT
-# Configure system settings
-sysctl -w -q "${path_sysctl_mem}=1000000"
-
# Create virtual ethernet pair between network namespaces
ip netns add "${NS1}"
ip netns add "${NS2}"
+# Configure system settings
+ip netns exec "${NS1}" sysctl -w -q "${path_sysctl_mem}=1000000"
+ip netns exec "${NS2}" sysctl -w -q "${path_sysctl_mem}=1000000"
+
ip link add "${DEV}" mtu "${DEV_MTU}" netns "${NS1}" type veth \
peer name "${DEV}" mtu "${DEV_MTU}" netns "${NS2}"