merge-upstream/v6.1.75 from branch/tag: upstream/v6.1.75 into branch: main-R109-cos-6.1
Changelog:
-------------------------------------------------------------
Abhinav Singh (1):
drm/nouveau/fence:: fix warning directly dereferencing a rcu pointer
Adam Ford (1):
arm64: dts: imx8mm: Reduce GPU to nominal speed
Ahmad Fatoum (1):
ARM: dts: stm32: don't mix SCMI and non-SCMI board compatibles
Alex Deucher (1):
drm/amdgpu/debugfs: fix error code when smc register accessors are NULL
Alexandra Diupina (1):
cpufreq: scmi: process the result of devm_of_clk_add_hw_provider()
Alexandre Ghiti (5):
riscv: Check if the code to patch lies in the exit section
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
Amir Goldstein (1):
scsi: target: core: add missing file_{start,end}_write()
Amit Cohen (2):
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
selftests: mlxsw: qos_pfc: Adjust the test to support 8 lanes
Amit Kumar Mahapatra (1):
spi: spi-zynqmp-gqspi: fix driver kconfig dependencies
Andrei Matei (1):
bpf: Fix verification of indirect var-off stack access
Andrii Nakryiko (2):
bpf: enforce precision of R0 on callback return
bpf: fix check for attempt to corrupt spilled pointer
Andy Shevchenko (2):
ACPI: LPSS: Fix the fractional clock divider flags
mfd: intel-lpss: Fix the fractional clock divider flags
AngeloGioacchino Del Regno (3):
drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()
drm/panfrost: Ignore core_mask for poweroff and disable PWRTRANS irq
ASoC: mediatek: sof-common: Add NULL check for normal_link string
Anton Protopopov (1):
bpf: add percpu stats for bpf_map elements insertions/deletions
Ard Biesheuvel (1):
efivarfs: Free s_fs_info on unmount
Arnaldo Carvalho de Melo (1):
libapi: Add missing linux/types.h header to get the __u64 type on io.h
Arnd Bergmann (6):
EDAC/thunderx: Fix possible out-of-bounds string access
csky: fix arch_jump_label_transform_static override
wifi: libertas: stop selecting wext
ARM: davinci: always select CONFIG_CPU_ARM926T
nvmet: re-fix tracing strncpy() warning
nvme: trace: avoid memcpy overflow warning
Arseniy Krasnov (1):
virtio/vsock: fix logic which reduces credit update messages
Artem Chernyshev (1):
scsi: fnic: Return error if vmalloc() failed
Asmaa Mnebhi (2):
mlxbf_gige: Fix intermittent no ip issue
mlxbf_gige: Enable the GigE port in mlxbf_gige_open
Atul Dhudase (1):
soc: qcom: llcc: Fix dis_cap_alloc and retain_on_pc configuration
Bart Van Assche (2):
scsi: ufs: core: Simplify power management during async scan
md/raid1: Use blk_opf_t for read and write operations
Benjamin Coddington (1):
blocklayoutdriver: Fix reference leak of pnfs_device_node
Bin Li (1):
ALSA: hda/realtek: Enable headset mic on Lenovo M70 Gen5
Brent Lu (1):
ASoC: Intel: glk_rt5682_max98357a: fix board id mismatch
Carlos Llamas (3):
binder: fix async space check for 0-sized buffers
binder: fix unused alloc->free_async_space
binder: fix race between mmput() and do_exit()
Chandrakanth patil (2):
scsi: mpi3mr: Refresh sdev queue depth after controller reset
scsi: mpi3mr: Block PEL Enable Command on Controller Reset and Unrecoverable State
Chao Yu (4):
f2fs: fix to avoid dirent corruption
f2fs: fix to wait on block writeback for post_read case
f2fs: fix to check compress file in f2fs_move_file_range()
f2fs: fix to update iostat correctly in f2fs_filemap_fault()
Chen Ni (2):
KEYS: encrypted: Add check for strsep
crypto: sa2ul - Return crypto_aead_setkey to transfer the error
Chengchang Tang (1):
RDMA/hns: Fix memory leak in free_mr_init()
Chenghai Huang (1):
crypto: hisilicon/zip - add zip comp high perf mode configuration
Chengming Zhou (1):
crypto: scomp - fix req->dst buffer overflow
Chih-Kang Chang (1):
wifi: rtw88: fix RX filter in FIF_ALLMULTI flag
Chris Morgan (2):
drm/panel-elida-kd35t133: hold panel in reset for unprepare
drm/panel: st7701: Fix AVCL calculation
Christian A. Ehrhardt (1):
of: Fix double free in of_parse_phandle_with_args_map
Christian Brauner (1):
fs: indicate request originates from old mount API
Christian Marangi (2):
arm64: dts: qcom: ipq6018: improve pcie phy pcs reg table
wifi: mt76: fix broken precal loading from MTD for mt7915
Christoph Hellwig (2):
null_blk: don't cap max_hw_sectors to BLK_DEF_MAX_SECTORS
loop: fix the the direct I/O support check when used on top of block devices
Christoph Niedermaier (2):
serial: imx: Ensure that imx_uart_rs485_config() is called with enabled clock
serial: imx: Correct clock error message in function probe()
Christophe JAILLET (6):
firmware: ti_sci: Fix an off-by-one in ti_sci_debugfs_create()
media: dvb-frontends: m88ds3103: Fix a memory leak in an error handling path of m88ds3103_probe()
MIPS: Alchemy: Fix an out-of-bound access in db1200_dev_setup()
MIPS: Alchemy: Fix an out-of-bound access in db1550_dev_setup()
vdpa: Fix an error handling path in eni_vdpa_probe()
kdb: Fix a potential buffer overflow in kdb_local()
Chukun Pan (1):
arm64: dts: qcom: ipq6018: fix clock rates for GCC_USB0_MOCK_UTMI_CLK
Chunfeng Yun (1):
usb: xhci-mtk: fix a short packet issue of gen1 isoc-in transfer
Claudiu Beznea (3):
clk: renesas: rzg2l-cpg: Reuse code in rzg2l_cpg_reset()
clk: renesas: rzg2l: Check reset monitor registers
net: phy: micrel: populate .soft_reset for KSZ9131
Colin Ian King (1):
x86/lib: Fix overflow when counting digits
Curtis Klein (1):
watchdog: set cdev owner before adding
Dafna Hirschfeld (1):
drm/amdkfd: fixes for HMM mem allocation
Dan Carpenter (2):
wifi: plfxlc: check for allocation failure in plfxlc_usb_wreq_async()
media: dvbdev: drop refcount on error path in dvb_device_open()
Dang Huynh (1):
leds: aw2013: Select missing dependency REGMAP_I2C
Dario Binacchi (2):
drm/bridge: Fix typo in post_disable() description
fbdev: imxfb: fix left margin setting
David E. Box (1):
platform/x86/intel/vsec: Fix xa_alloc memory leak
David Howells (1):
keys, dns: Fix size check of V1 server-list header
David Lin (1):
wifi: mwifiex: configure BSSID consistently when starting AP
Deepak R Varma (1):
drm/amdkfd: Use resource_size() helper function
Dinghao Liu (1):
crypto: ccp - fix memleak in ccp_init_dm_workarea
Dmitry Baryshkov (4):
ARM: dts: qcom: apq8064: correct XOADC register address
arm64: dts: qcom: sm8150-hdk: fix SS USB regulators
drm/msm/mdp4: flush vblank event on disable
drm/drv: propagate errors from drm_modeset_register_all()
Dmitry Rokosov (1):
firmware: meson_sm: populate platform devices from sm device tree data
Douglas Anderson (7):
arm64: dts: qcom: sc7180: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sc7280: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sdm845: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm8150: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm8250: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sc8280xp: Make watchdog bark interrupt edge triggered
arm64: dts: qcom: sm6350: Make watchdog bark interrupt edge triggered
Eric Dumazet (9):
sctp: support MSG_ERRQUEUE flag in recvmsg()
sctp: fix busy polling
ip6_tunnel: fix NEXTHDR_FRAGMENT handling in ip6_tnl_parse_tlv_enc_lim()
mptcp: mptcp_parse_option() fix for MPTCPOPT_MP_JOIN
mptcp: strict validation before using mp_opt->hmac
mptcp: use OPTION_MPTCP_MPJ_SYNACK in subflow_finish_connect()
mptcp: use OPTION_MPTCP_MPJ_SYN in subflow_check_req()
mptcp: refine opt_mp_capable determination
udp: annotate data-races around up->pending
Fedor Pchelkin (2):
apparmor: avoid crash when parsed profile name is empty
ipvs: avoid stat macros calls from preemptible context
Florian Lehner (1):
bpf, lpm: Fix check prefixlen before walking trie
Florian Westphal (1):
netfilter: nf_tables: mark newset as dead on transaction abort
Francesco Dolcini (1):
Bluetooth: btmtkuart: fix recv_buf() return value
Frank Li (3):
usb: cdns3: fix uvc failure work since sg support enabled
usb: cdns3: fix iso transfer error when mult is not zero
usb: cdns3: Fix uvc fail when DMA cross 4k boundery since sg enabled
Frederik Haxel (1):
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
Gao Xiang (1):
erofs: fix memory leak on short-lived bounced pages
Gavrilov Ilia (1):
calipso: fix memory leak in netlbl_calipso_add_pass()
Geert Uytterhoeven (2):
arm64: dts: renesas: white-hawk-cpu: Fix missing serial console pin control
of: unittest: Fix of_count_phandle_with_args() expected value message
Geoffrey D. Bennett (5):
ALSA: scarlett2: Add missing error check to scarlett2_config_save()
ALSA: scarlett2: Add missing error check to scarlett2_usb_set_config()
ALSA: scarlett2: Allow passing any output to line_out_remap()
ALSA: scarlett2: Add missing error checks to *_ctl_get()
ALSA: scarlett2: Add clamp() in scarlett2_mixer_ctl_put()
Gonglei (Arei) (1):
crypto: virtio - Handle dataq logic with tasklet
Greg Kroah-Hartman (1):
Linux 6.1.75
Gregory Price (1):
base/node.c: initialize the accessor list before registering
Gui-Dong Han (2):
usb: mon: Fix atomicity violation in mon_bin_vma_fault
Bluetooth: Fix atomicity violation in {min,max}_key_size_set
Hangbin Liu (2):
selftests/net: specify the interface when do arping
selftests/net: fix grep checking for fib_nexthop_multiprefix
Hans de Goede (2):
ASoC: rt5645: Drop double EF20 entry from dmi_platform_data[]
Input: atkbd - use ab83 as id when skipping the getid command
Hao Sun (1):
bpf: Reject variable offset alu on PTR_TO_FLOW_KEYS
Heikki Krogerus (1):
Revert "usb: typec: class: fix typec_altmode_put_partner to put plugs"
Heiko Carstens (1):
tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug
Hengqi Chen (1):
LoongArch: BPF: Prevent out-of-bounds memory access
Herbert Xu (1):
crypto: af_alg - Disallow multiple in-flight AIO requests
Hou Tao (2):
bpf: Add map and need_defer parameters to .map_fd_put_ptr()
bpf: Defer the free of inner map when necessary
Hsiao Chien Sung (2):
drm/mediatek: Return error if MDP RDMA failed to enable the clock
drm/mediatek: Fix underrun in VDO1 when switches off the layer
Huang Ying (1):
cxl/port: Fix decoder initialization when nr_targets > interleave_ways
Hugo Villeneuve (2):
serial: sc16is7xx: add check for unsupported SPI modes during probe
serial: sc16is7xx: set safe default SPI clock frequency
Ian Rogers (1):
perf env: Avoid recursively taking env->bpf_progs.lock
Ilias Apalodimas (1):
efivarfs: force RO when remounting if SetVariable is not supported
Ilpo Järvinen (2):
wifi: rtlwifi: Remove bogus and dangerous ASPM disable/enable code
wifi: rtlwifi: Convert LNKCTL change to PCIe cap RMW accessors
Isaac J. Manjarres (1):
iommu/dma: Trace bounce buffer usage when mapping buffers
Jan Beulich (1):
xen-netback: don't produce zero-size SKB frags
Jan Palus (1):
power: supply: cw2015: correct time_to_empty units in sysfs
Jason Gerecke (1):
HID: wacom: Correct behavior when processing some confidence == false touches
Jay Buddhabhatti (2):
drivers: clk: zynqmp: calculate closest mux rate
drivers: clk: zynqmp: update divider round rate logic
Jens Axboe (2):
io_uring/rw: ensure io->bytes_done is always initialized
block: ensure we hold a queue reference when using queue limits
Jeroen van Ingen Schenau (1):
selftests/bpf: Fix erroneous bitmask operation
Jerry Hoemann (1):
watchdog/hpwdt: Only claim UNKNOWN NMI if from iLO
Jessica Zhang (2):
drm/msm/dpu: Set input_sel bit for INTF
drm/msm/dpu: Drop enable and frame_count parameters from dpu_hw_setup_misr()
Jianjun Wang (1):
PCI: mediatek-gen3: Fix translation window size calculation
Jim Harris (1):
cxl/region: fix x9 interleave typo
Jiri Olsa (1):
bpf: Fix re-attachment branch in bpf_tracing_prog_attach
Jiri Slaby (SUSE) (4):
tty: change tty_write_lock()'s ndelay parameter to bool
tty: early return from send_break() on TTY_DRIVER_HARDWARE_BREAK
tty: don't check for signal_pending() in send_break()
tty: use 'if' in send_break() instead of 'goto'
Jo Van Bulck (3):
selftests/sgx: Fix uninitialized pointer dereference in error path
selftests/sgx: Fix uninitialized pointer dereferences in encl_get_entry
selftests/sgx: Include memory clobber for inline asm in test enclave
Joakim Zhang (1):
dma-mapping: clear dev->dma_mem to NULL after freeing it
Johan Hovold (2):
arm64: dts: qcom: sc7280: fix usb_2 wakeup interrupt types
arm64: dts: hisilicon: hikey970-pmic: fix regulator cells properties
Johannes Berg (2):
wifi: iwlwifi: mvm: set siso/mimo chains to 1 in FW SMPS request
wifi: iwlwifi: mvm: send TX path flush in rfkill
John Fastabend (1):
bpf: sockmap, fix proto update hook to avoid dup calls
Jordan Rome (2):
bpf: Add crosstask check to __bpf_get_stack
selftests/bpf: Add assert for user stacks in test_task_stack
Junxian Huang (1):
RDMA/hns: Fix inappropriate err code for unsupported operations
Keith Busch (1):
block: make BLK_DEF_MAX_SECTORS unsigned
Kirill A. Shutemov (1):
x86/kvm: Do not try to disable kvmclock if it was not enabled
Konrad Dybcio (7):
arm64: dts: qcom: sc7280: Fix up GPU SIDs
arm64: dts: qcom: sc7280: Mark Adreno SMMU as DMA coherent
arm64: dts: qcom: sc7280: Mark SDHCI hosts as cache-coherent
arm64: dts: qcom: ipq6018: Use lowercase hex
arm64: dts: qcom: ipq6018: Pad addresses to 8 hex digits
arm64: dts: qcom: ipq6018: Fix up indentation
drm/msm/dsi: Use pm_runtime_resume_and_get to prevent refcnt leaks
Krzysztof Kozlowski (3):
ARM: dts: qcom: sdx65: correct SPMI node name
arm64: dts: qcom: qrb5165-rb5: correct LED panic indicator
arm64: dts: qcom: sdm845-db845c: correct LED panic indicator
Kunwu Chan (6):
powerpc/powernv: Add a null pointer check to scom_debug_init_one()
powerpc/powernv: Add a null pointer check in opal_event_init()
powerpc/powernv: Add a null pointer check in opal_powercap_init()
powerpc/imc-pmu: Add a null pointer check in update_events_in_group()
mfd: syscon: Fix null pointer dereference in of_syscon_register()
net: dsa: vsc73xx: Add null pointer check to vsc73xx_gpio_probe
Leon Romanovsky (1):
RDMA/usnic: Silence uninitialized symbol smatch warnings
Leone Fernando (1):
ipmr: support IP_PKTINFO on cache report IGMP msg
Li Nan (3):
block: Set memalloc_noio to false on device_add_disk() error path
block: add check of 'minors' and 'first_minor' in device_add_disk()
ksmbd: validate the zero field of packet header
Lin Ma (1):
net: qualcomm: rmnet: fix global oob in rmnet_policy
Lino Sanfilippo (5):
serial: core: fix sanitizing check for RTS settings
serial: core: make sure RS485 cannot be enabled when it is not supported
serial: core, imx: do not set RS485 enabled if it is not supported
serial: 8250_exar: Set missing rs485_supported flag
serial: omap: do not override settings for RS485 support
Linus Walleij (2):
ASoC: cs35l33: Fix GPIO name and drop legacy include
ASoC: cs35l34: Fix GPIO name and drop legacy include
Luca Weiss (2):
wifi: ath11k: Defer on rproc_get failure
arm64: dts: qcom: sc7280: Mark some nodes as 'reserved'
Ludvig Pärsson (1):
ethtool: netlink: Add missing ethnl_ops_begin/complete
Luiz Augusto von Dentz (1):
Bluetooth: Fix bogus check for re-auth no supported with non-ssp
Marc Zyngier (1):
KVM: arm64: vgic-v4: Restore pending state on host userspace write
Marcelo Schmitt (1):
iio: adc: ad7091r: Pass iio_dev to event handler
Marek Szyprowski (2):
i2c: s3c24xx: fix read transfers in polling mode
i2c: s3c24xx: fix transferring more than one message in polling mode
Mario Limonciello (1):
drm/amd: Enable PCIe PME from D3
Masahiro Yamada (2):
powerpc: remove checks for binutils older than 2.25
powerpc: add crtsavres.o to always-y instead of extra-y
Matthew Wilcox (Oracle) (2):
block: Fix iterating over an empty bio with bio_for_each_folio_all
block: Remove special-casing of compound pages
Maurizio Lombardi (3):
nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length
nvmet-tcp: fix a crash in nvmet_req_complete()
nvmet-tcp: Fix the H2C expected PDU len calculation
Mehdi Djait (1):
media: dt-bindings: media: rkisp1: Fix the port description for the parallel interface
Michael Ellerman (2):
selftests/powerpc: Fix error handling in FPU/VMX preemption tests
powerpc/64s: Increase default stack size to 32KB
Michal Simek (1):
dt-bindings: gpio: xilinx: Fix node address in gpio
Mickaël Salaün (1):
selinux: Fix error priority for bind with AF_UNSPEC on PF_INET6 socket
Mimi Zohar (1):
Revert "KEYS: encrypted: Add check for strsep"
Min Li (1):
block: add check that partition length needs to be aligned with block size
Ming Yen Hsieh (1):
wifi: mt76: mt7921: fix country count limitation for CLC
Mirsad Todorovac (2):
kselftest/alsa - mixer-test: fix the number of parameters to ksft_exit_fail_msg()
kselftest/alsa - mixer-test: Fix the print format specifier warning
Moudy Ho (2):
dt-bindings: media: mediatek: mdp3: correct RDMA and WROT node with generic names
arm64: dts: mediatek: mt8183: correct MDP3 DMA-related nodes
Nam Cao (2):
fbdev: flush deferred work in fb_deferred_io_fsync()
fbdev: flush deferred IO before closing
Namhyung Kim (1):
perf genelf: Set ELF program header addresses properly
Namjae Jeon (3):
ksmbd: validate mech token in session setup
ksmbd: fix UAF issue in ksmbd_tcp_new_connection()
ksmbd: only v2 leases handle the directory
Nathan Lynch (1):
powerpc/pseries/memhp: Fix access beyond end of drmem array
Nia Espera (1):
arm64: dts: qcom: sm8350: Fix DMA0 address
Nicolas Dichtel (1):
Revert "net: rtnetlink: Enslave device before bringing it up"
Nikita Kiryushin (2):
ACPI: video: check for error while searching for backlight device parent
ACPI: LPIT: Avoid u32 multiplication overflow
Nikita Yushchenko (1):
net: ravb: Fix dma_addr_t truncation in error case
Nikita Zhandarovich (5):
crypto: safexcel - Add error handling for dma_map_sg() calls
drm/radeon/r600_cs: Fix possible int overflows in r600_cs_check_reg()
drm/radeon/r100: Fix integer overflow issues in r100_cs_track_check()
drm/radeon: check return value of radeon_ring_lock()
ipv6: mcast: fix data-race in ipv6_mc_down / mld_ifc_work
Niklas Cassel (1):
PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support
Niklas Schnelle (1):
s390/pci: fix max size calculation in zpci_memcpy_toio()
Nitin Yadav (1):
arm64: dts: ti: k3-am62a-main: Fix GPIO pin count in DT nodes
Nuno Sa (3):
iio: adc: ad9467: fix reset gpio handling
iio: adc: ad9467: don't ignore error codes
iio: adc: ad9467: fix scale setting
Nícolas F. R. A. Prado (2):
drm/mediatek: dp: Add phy_mtk_dp module as pre-dependency
spmi: mtk-pmif: Serialize PMIF status check and command submission
Olga Kornievskaia (1):
SUNRPC: fix _xprt_switch_find_current_entry logic
Oliver Neukum (1):
usb: cdc-acm: return correct error code on unsupported break
Oliver Upton (1):
KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache
Osama Muhammad (1):
gfs2: Fix kernel NULL pointer dereference in gfs2_rgrp_dump
Ovidiu Panait (12):
crypto: sahara - remove FLAGS_NEW_KEY logic
crypto: sahara - fix cbc selftest failure
crypto: sahara - fix ahash selftest failure
crypto: sahara - fix processing requests with cryptlen < sg->length
crypto: sahara - fix error handling in sahara_hw_descriptor_create()
crypto: sahara - avoid skcipher fallback code duplication
crypto: sahara - handle zero-length aes requests
crypto: sahara - fix ahash reqsize
crypto: sahara - fix wait_for_completion_timeout() error handling
crypto: sahara - improve error handling in sahara_sha_process()
crypto: sahara - fix processing hash requests with req->nbytes < sg->length
crypto: sahara - do not resize req->src when doing hash operations
Pablo Neira Ayuso (6):
netfilter: nf_tables: check if catch-all set element is active in next generation
netfilter: nf_tables: reject invalid set policy
netfilter: nft_limit: do not ignore unsupported flags
netfilter: nf_tables: do not allow mismatch field size and set key length
netfilter: nf_tables: skip dead set elements in netlink dump
netfilter: nf_tables: reject NFT_SET_CONCAT with not field length description
Paolo Abeni (1):
mptcp: relax check on MPC passive fallback
Paul E. McKenney (1):
rcu-tasks: Provide rcu_trace_implies_rcu_gp()
Paul Geurts (1):
serial: imx: fix tx statemachine deadlock
Paul Kocialkowski (2):
media: verisilicon: Hook the (TRY_)DECODER_CMD stateless ioctls
media: rkvdec: Hook the (TRY_)DECODER_CMD stateless ioctls
Pavel Tikhomirov (4):
netfilter: nfnetlink_log: use proper helper for fetching physinif
netfilter: nf_queue: remove excess nf_bridge variable
netfilter: propagate net to nf_bridge_get_physindev
netfilter: bridge: replace physindev with physinif in nf_bridge_info
Peter Delevoryas (1):
net/ncsi: Fix netlink major/minor version numbers
Peter Robinson (2):
mmc: sdhci_am654: Fix TI SoC dependencies
mmc: sdhci_omap: Fix TI SoC dependencies
Philipp Zabel (2):
pwm: stm32: Use hweight32 in stm32_pwm_detect_channels
pwm: stm32: Fix enable count for clk in .probe()
Qiang Ma (1):
net: stmmac: ethtool: Fixed calltrace caused by unbalanced disable_irq_wake calls
RD Babiera (1):
usb: typec: class: fix typec_altmode_put_partner to put plugs
Randy Dunlap (2):
powerpc/44x: select I2C for CURRITUCK
ARM: 9330/1: davinci: also select PINCTRL
Ricardo B. Marliere (1):
media: pvrusb2: fix use after free on context disconnection
Richard Fitzgerald (1):
kunit: debugfs: Fix unchecked dereference in debugfs_print_results()
Rob Clark (1):
iommu/arm-smmu-qcom: Add missing GMU entry to match table
Rob Herring (2):
of: Add of_property_present() helper
cpufreq: Use of_property_present() for testing DT property presence
Ronald Monthero (1):
mtd: rawnand: Increment IFC_TIMEOUT_MSECS for nand controller response
Sakari Ailus (2):
acpi: property: Let args be NULL in __acpi_node_get_property_reference
software node: Let args be NULL in software_node_get_reference_args
Sanjuán García, Jorge (1):
net: ethernet: ti: am65-cpsw: Fix max mtu to fit ethernet frames
Satya Priya Kakitapalli (4):
clk: qcom: gpucc-sm8150: Update the gpu_cc_pll1 config
dt-bindings: clock: Update the videocc resets for sm8150
clk: qcom: videocc-sm8150: Update the videocc resets
clk: qcom: videocc-sm8150: Add missing PLL config property
Serge Semin (2):
mips: dmi: Fix early remap on MIPS32
mips: Fix incorrect max_low_pfn adjustment
Sergey Gorenko (1):
IB/iser: Prevent invalidating wrong MR
Sergey Shtylyov (1):
pstore: ram_core: fix possible overflow in persistent_ram_init_ecc()
Siddharth Vadapalli (1):
PCI: keystone: Fix race condition when initializing PHYs
Sjoerd Simons (1):
arm64: dts: armada-3720-turris-mox: set irq type for RTC
Song Liu (1):
Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
Srinivas Pandruvada (3):
platform/x86/intel/vsec: Enhance and Export intel_vsec_add_aux()
platform/x86/intel/vsec: Support private data
platform/x86/intel/vsec: Use mutex for ida_alloc() and ida_free()
Srinivasan Shanmugam (1):
drm/amdkfd: Confirm list is non-empty before utilizing list_first_entry in kfd_topology.c
Stefan Berger (1):
rootfs: Fix support for rootfstype= when root= is given
Stefan Wahren (2):
watchdog: bcm2835_wdt: Fix WDIOC_SETTIMEOUT handling
serial: 8250_bcm2835aux: Restore clock error handling
Stefan Wiehler (1):
mips/smp: Call rcutree_report_cpu_starting() earlier
Su Hui (11):
wifi: rtlwifi: rtl8821ae: phy: fix an undefined bitwise shift behavior
wifi: rtlwifi: add calculate_bit_shift()
wifi: rtlwifi: rtl8188ee: phy: using calculate_bit_shift()
wifi: rtlwifi: rtl8192c: using calculate_bit_shift()
wifi: rtlwifi: rtl8192cu: using calculate_bit_shift()
wifi: rtlwifi: rtl8192ce: using calculate_bit_shift()
wifi: rtlwifi: rtl8192de: using calculate_bit_shift()
wifi: rtlwifi: rtl8192ee: using calculate_bit_shift()
wifi: rtlwifi: rtl8192se: using calculate_bit_shift()
clk: si5341: fix an error code problem in si5341_output_clk_set_rate
power: supply: bq256xx: fix some problem in bq256xx_hw_init
Tadeusz Struk (1):
PCI/P2PDMA: Remove reference to pci_p2pdma_map_sg()
Taehee Yoo (1):
amt: do not use overwrapped cb area
Takashi Iwai (1):
ALSA: oxygen: Fix right channel of capture volume mixer
Tao Liu (1):
net/sched: act_ct: fix skb leak and crash on ooo frags
Thinh Nguyen (2):
Revert "usb: dwc3: Soft reset phy on probe for host"
Revert "usb: dwc3: don't reset device side if dwc3 was configured as host-only"
Théo Lebrun (1):
clk: fixed-rate: fix clk_hw_register_fixed_rate_with_accuracy_parent_hw
Tomi Valkeinen (12):
arm64: dts: ti: k3-am65-main: Fix DSS irq trigger type
Revert "drm/tidss: Annotate dma-fence critical section in commit path"
Revert "drm/omapdrm: Annotate dma-fence critical section in commit path"
drm/tilcdc: Fix irq free on unload
drm/tidss: Move reset to the end of dispc_init()
drm/tidss: Return error value from from softreset
drm/tidss: Check for K2G in in dispc_softreset()
drm/tidss: Fix dss reset
drm/bridge: cdns-mhdp8546: Fix use of uninitialized variable
drm/bridge: tc358767: Fix return value on error case
media: imx-mipi-csis: Fix clock handling in remove()
media: rkisp1: Fix media device memory leak
Tony Lindgren (1):
clocksource/drivers/timer-ti-dm: Fix make W=n kerneldoc warnings
Tony Luck (1):
ACPI: extlog: Clear Extended Error Log status when RAS_CEC handled the error
Trond Myklebust (2):
NFSv4.1/pnfs: Ensure we handle the error NFS4ERR_RETURNCONFLICT
pNFS: Fix the pnfs block driver's calculation of layoutget size
Uttkarsh Aggarwal (1):
usb: dwc: ep0: Update request status in dwc3_ep0_stall_restart
Uwe Kleine-König (5):
drm/bridge: tpd12s015: Drop buggy __exit annotation for remove function
pwm: stm32: Use regmap_clear_bits and regmap_set_bits where applicable
pwm: jz4740: Don't use dev_err_probe() in .request()
pwm: Fix out-of-bounds access in of_pwm_single_xlate()
serial: 8250: omap: Don't skip resource freeing if pm_runtime_resume_and_get() failed
Vignesh Raghavendra (1):
watchdog: rti_wdt: Drop runtime pm reference count when watchdog is unused
Wang Zhao (1):
wifi: mt76: mt7921s: fix workqueue problem causes STA association fail
Wenkai Lin (1):
crypto: hisilicon/qm - add a function to set qm algs
Wolfram Sang (1):
spi: sh-msiof: Enforce fixed DTDL for R-Car H3
Xi Ruoyao (1):
LoongArch: Fix and simplify fcsr initialization on execve()
Xingyuan Mo (1):
accel/habanalabs: fix information leak in sec_attest_info()
Xu Yang (2):
usb: phy: mxs: remove CONFIG_USB_OTG condition for mxs_phy_is_otg_host()
usb: chipidea: wait controller resume finished for wakeup irq
Yang Yingliang (1):
drm/radeon: check the alloc_workqueue return value in radeon_crtc_init()
Yazen Ghannam (1):
x86/mce/inject: Clear test status value
YiFei Zhu (1):
selftests/bpf: Relax time_tai test for equal timestamps in tai_forward
Yicong Yang (2):
perf header: Fix one memory leakage in perf_event__fprintf_event_update()
perf hisi-ptt: Fix one memory leakage in hisi_ptt_process_auxtrace_event()
Yihang Li (3):
scsi: hisi_sas: Replace with standard error code return value
scsi: hisi_sas: Rollback some operations if FLR failed
scsi: hisi_sas: Correct the number of global debugfs registers
Yo-Jung Lin (1):
ALSA: hda/realtek: Enable mute/micmute LEDs and limit mic boost on HP ZBook
Yu Kuai (1):
md: synchronize flush io with array reconfiguration
Zack Rusin (2):
drm/vmwgfx: Fix possible invalid drm gem put calls
drm/vmwgfx: Keep a gem reference to user bos in surfaces
Zhao Mengmeng (1):
selftests/sgx: Skip non X86_64 platform
ZhaoLong Wang (1):
mtd: Fix gluebi NULL pointer dereference caused by ftl notifier
Zheng Wang (1):
media: mtk-jpeg: Remove cancel worker in mtk_jpeg_remove to avoid the crash of multi-core JPEG devices
Zhiguo Niu (1):
f2fs: fix to check return value of f2fs_recover_xattr_data
Zhipeng Lu (8):
drm/radeon/dpm: fix a memleak in sumo_parse_power_table
drm/radeon/trinity_dpm: fix a memleak in trinity_parse_power_table
media: cx231xx: fix a memleak in cx231xx_init_isoc
drm/amd/pm: fix a double-free in si_dpm_init
drivers/amd/pm: fix a use-after-free in kv_parse_power_table
gpu/drm/radeon: fix two memleaks in radeon_vm_init
drm/amd/pm: fix a double-free in amdgpu_parse_extended_power_table
drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init
Zhiqi Song (4):
crypto: hisilicon/qm - save capability registers in qm init process
crypto: hisilicon/hpre - save capability registers in probe process
crypto: hisilicon/sec2 - save capability registers in probe process
crypto: hisilicon/zip - save capability registers in probe process
kyrie wu (1):
media: mtk-jpegdec: export jpeg decoder functions
qizhong cheng (1):
PCI: mediatek: Clear interrupt status before dispatching handler
wangyangxin (1):
crypto: virtio - Wait for tasklet to complete on device remove
Çağhan Demir (1):
ALSA: hda/relatek: Enable Mute LED on HP Laptop 15s-fq2xxx
BUG=b/322459544
TEST=tryjob, validation and K8s e2e
RELEASE_NOTE=Updated the Linux kernel to v6.1.75.
Change-Id: I6edda4f143bf108d6ac889a98e58a72b1436432a
Signed-off-by: COS Kernel Merge Bot <cloud-image-merge-automation@prod.google.com>
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 4ad60e1..847ca62 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4184,6 +4184,12 @@
nomsi [MSI] If the PCI_MSI kernel config parameter is
enabled, this kernel boot option can be used to
disable the use of MSI interrupts system-wide.
+ clearmsi [X86] Clears MSI/MSI-X enable bits early in boot
+ time in order to avoid issues like adapters
+ screaming irqs and preventing boot progress.
+ Also, it enforces the PCI Local Bus spec
+ rule that those bits should be 0 in system reset
+ events (useful for kexec/kdump cases).
noioapicquirk [APIC] Disable all boot interrupt quirks.
Safety option to keep boot IRQs enabled. This
should never be necessary.
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index b3588ff..cbcc773 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -447,6 +447,25 @@
``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
+io_uring_disabled
+=================
+
+Prevents all processes from creating new io_uring instances. Enabling this
+shrinks the kernel's attack surface.
+
+= ==================================================================
+0 All processes can create io_uring instances as normal. This is the
+ default setting.
+1 io_uring creation is disabled for unprivileged processes.
+ io_uring_setup fails with -EPERM unless the calling process is
+ privileged (CAP_SYS_ADMIN). Existing io_uring instances can
+ still be used.
+2 io_uring creation is disabled for all processes. io_uring_setup
+ always fails with -EPERM. Existing io_uring instances can still be
+ used.
+= ==================================================================
+
+
kexec_load_disabled
===================
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index 6394f5d..ab7801d 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -338,7 +338,10 @@
----------
Maximum ancillary buffer size allowed per socket. Ancillary data is a sequence
-of struct cmsghdr structures with appended data.
+of struct cmsghdr structures with appended data. TCP tx zerocopy also uses
+optmem_max as a limit for its internal structures.
+
+Default : 128 KB
fb_tunnels_only_for_init_net
----------------------------
diff --git a/Documentation/networking/device_drivers/ethernet/google/gve.rst b/Documentation/networking/device_drivers/ethernet/google/gve.rst
index 6d73ee7..31d621b 100644
--- a/Documentation/networking/device_drivers/ethernet/google/gve.rst
+++ b/Documentation/networking/device_drivers/ethernet/google/gve.rst
@@ -52,6 +52,15 @@
GVE supports two descriptor formats: GQI and DQO. These two formats have
entirely different descriptors, which will be described below.
+Addressing Mode
+------------------
+GVE supports two addressing modes: QPL and RDA.
+QPL ("queue-page-list") mode communicates data through a set of
+pre-registered pages.
+
+For RDA ("raw DMA addressing") mode, the set of pages is dynamic.
+Therefore, the packet buffers can be anywhere in guest memory.
+
Registers
---------
All registers are MMIO.
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index b47b3d0..f539f17 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -318,6 +318,7 @@
option can harm clients of your server.
tcp_adv_win_scale - INTEGER
+ Obsolete since linux-6.6
Count buffering overhead as bytes/2^tcp_adv_win_scale
(if tcp_adv_win_scale > 0) or bytes-bytes/2^(-tcp_adv_win_scale),
if it is <= 0.
@@ -741,6 +742,13 @@
Default : 44
+tcp_backlog_ack_defer - BOOLEAN
+ If set, user thread processing socket backlog tries sending
+ one ACK for the whole queue. This helps to avoid potential
+ long latencies at end of a TCP socket syscall.
+
+ Default : true
+
tcp_slow_start_after_idle - BOOLEAN
If set, provide RFC2861 behavior and time out the congestion
window after an idle period. An idle period is defined at
diff --git a/Documentation/virt/coco/tdx-guest.rst b/Documentation/virt/coco/tdx-guest.rst
new file mode 100644
index 0000000..46e316d
--- /dev/null
+++ b/Documentation/virt/coco/tdx-guest.rst
@@ -0,0 +1,52 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
+TDX Guest API Documentation
+===================================================================
+
+1. General description
+======================
+
+The TDX guest driver exposes IOCTL interfaces via the /dev/tdx-guest misc
+device to allow userspace to get certain TDX guest-specific details.
+
+2. API description
+==================
+
+In this section, for each supported IOCTL, the following information is
+provided along with a generic description.
+
+:Input parameters: Parameters passed to the IOCTL and related details.
+:Output: Details about output data and return value (with details about
+ the non common error values).
+
+2.1 TDX_CMD_GET_REPORT0
+-----------------------
+
+:Input parameters: struct tdx_report_req
+:Output: Upon successful execution, TDREPORT data is copied to
+ tdx_report_req.tdreport and return 0. Return -EINVAL for invalid
+ operands, -EIO on TDCALL failure or standard error number on other
+ common failures.
+
+The TDX_CMD_GET_REPORT0 IOCTL can be used by the attestation software to get
+the TDREPORT0 (a.k.a. TDREPORT subtype 0) from the TDX module using
+TDCALL[TDG.MR.REPORT].
+
+A subtype index is added at the end of this IOCTL CMD to uniquely identify the
+subtype-specific TDREPORT request. Although the subtype option is mentioned in
+the TDX Module v1.0 specification, section titled "TDG.MR.REPORT", it is not
+currently used, and it expects this value to be 0. So to keep the IOCTL
+implementation simple, the subtype option was not included as part of the input
+ABI. However, in the future, if the TDX Module supports more than one subtype,
+a new IOCTL CMD will be created to handle it. To keep the IOCTL naming
+consistent, a subtype index is added as part of the IOCTL CMD.
+
+Reference
+---------
+
+TDX reference material is collected here:
+
+https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
+
+The driver is based on TDX module specification v1.0 and TDX GHCI specification v1.0.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index 2f1cffa..56e003f 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -14,6 +14,7 @@
ne_overview
acrn/index
coco/sev-guest
+ coco/tdx-guest
hyperv/index
.. only:: html and subproject
diff --git a/Documentation/x86/tdx.rst b/Documentation/x86/tdx.rst
index b8fa4329..dc8d9fd 100644
--- a/Documentation/x86/tdx.rst
+++ b/Documentation/x86/tdx.rst
@@ -210,6 +210,49 @@
For coherent DMA allocation, the DMA buffer gets converted on the
allocation. Check force_dma_unencrypted() for details.
+Attestation
+===========
+
+Attestation is used to verify the TDX guest trustworthiness to other
+entities before provisioning secrets to the guest. For example, a key
+server may want to use attestation to verify that the guest is the
+desired one before releasing the encryption keys to mount the encrypted
+rootfs or a secondary drive.
+
+The TDX module records the state of the TDX guest in various stages of
+the guest boot process using the build time measurement register (MRTD)
+and runtime measurement registers (RTMR). Measurements related to the
+guest initial configuration and firmware image are recorded in the MRTD
+register. Measurements related to initial state, kernel image, firmware
+image, command line options, initrd, ACPI tables, etc are recorded in
+RTMR registers. For more details, as an example, please refer to TDX
+Virtual Firmware design specification, section titled "TD Measurement".
+At TDX guest runtime, the attestation process is used to attest to these
+measurements.
+
+The attestation process consists of two steps: TDREPORT generation and
+Quote generation.
+
+TDX guest uses TDCALL[TDG.MR.REPORT] to get the TDREPORT (TDREPORT_STRUCT)
+from the TDX module. TDREPORT is a fixed-size data structure generated by
+the TDX module which contains guest-specific information (such as build
+and boot measurements), platform security version, and the MAC to protect
+the integrity of the TDREPORT. A user-provided 64-Byte REPORTDATA is used
+as input and included in the TDREPORT. Typically it can be some nonce
+provided by attestation service so the TDREPORT can be verified uniquely.
+More details about the TDREPORT can be found in Intel TDX Module
+specification, section titled "TDG.MR.REPORT Leaf".
+
+After getting the TDREPORT, the second step of the attestation process
+is to send it to the Quoting Enclave (QE) to generate the Quote. TDREPORT
+by design can only be verified on the local platform as the MAC key is
+bound to the platform. To support remote verification of the TDREPORT,
+TDX leverages Intel SGX Quoting Enclave to verify the TDREPORT locally
+and convert it to a remotely verifiable Quote. Method of sending TDREPORT
+to QE is implementation specific. Attestation software can choose
+whatever communication channel available (i.e. vsock or TCP/IP) to
+send the TDREPORT to QE and receive the Quote.
+
References
==========
diff --git a/Makefile b/Makefile
index 7cd49d9..d00f818 100644
--- a/Makefile
+++ b/Makefile
@@ -1011,8 +1011,8 @@
export CC_FLAGS_CFI
endif
-ifdef CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B
-KBUILD_CFLAGS += -falign-functions=64
+ifneq ($(CONFIG_FUNCTION_ALIGNMENT),0)
+KBUILD_CFLAGS += -falign-functions=$(CONFIG_FUNCTION_ALIGNMENT)
endif
# arch Makefile may override CC so keep this after arch Makefile is included
diff --git a/PRESUBMIT.cfg b/PRESUBMIT.cfg
new file mode 100644
index 0000000..e42c329
--- /dev/null
+++ b/PRESUBMIT.cfg
@@ -0,0 +1,14 @@
+[Hook Overrides]
+# Make sure cos_patch trailer is present
+cos_patch_trailer_check: true
+
+aosp_license_check: false
+cros_license_check: false
+long_line_check: false
+stray_whitespace_check: false
+tab_check: false
+tabbed_indent_required_check: false
+signoff_check: true
+
+# Make sure RELEASE_NOTE field is present.
+release_note_field_check: true
diff --git a/arch/Kconfig b/arch/Kconfig
index 14273a6..d7000cc 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1434,4 +1434,28 @@
source "scripts/gcc-plugins/Kconfig"
+config FUNCTION_ALIGNMENT_4B
+ bool
+
+config FUNCTION_ALIGNMENT_8B
+ bool
+
+config FUNCTION_ALIGNMENT_16B
+ bool
+
+config FUNCTION_ALIGNMENT_32B
+ bool
+
+config FUNCTION_ALIGNMENT_64B
+ bool
+
+config FUNCTION_ALIGNMENT
+ int
+ default 64 if FUNCTION_ALIGNMENT_64B
+ default 32 if FUNCTION_ALIGNMENT_32B
+ default 16 if FUNCTION_ALIGNMENT_16B
+ default 8 if FUNCTION_ALIGNMENT_8B
+ default 4 if FUNCTION_ALIGNMENT_4B
+ default 0
+
endmenu
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ea70eb9..42a8b8d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -119,6 +119,8 @@
select DMA_DIRECT_REMAP
select EDAC_SUPPORT
select FRAME_POINTER
+ select FUNCTION_ALIGNMENT_4B
+ select FUNCTION_ALIGNMENT_8B if DYNAMIC_FTRACE_WITH_CALL_OPS
select GENERIC_ALLOCATOR
select GENERIC_ARCH_TOPOLOGY
select GENERIC_CLOCKEVENTS_BROADCAST
@@ -180,8 +182,14 @@
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
+ if $(cc-option,-fpatchable-function-entry=2)
+ select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+ if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
+ select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
+ if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG)
select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
- if DYNAMIC_FTRACE_WITH_REGS
+ if DYNAMIC_FTRACE_WITH_ARGS
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
@@ -233,16 +241,16 @@
help
ARM 64-bit (AArch64) Linux support.
-config CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_REGS
+config CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS
def_bool CC_IS_CLANG
# https://github.com/ClangBuiltLinux/linux/issues/1507
depends on AS_IS_GNU || (AS_IS_LLVM && (LD_IS_LLD || LD_VERSION >= 23600))
- select HAVE_DYNAMIC_FTRACE_WITH_REGS
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS
-config GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_REGS
+config GCC_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS
def_bool CC_IS_GCC
depends on $(cc-option,-fpatchable-function-entry=2)
- select HAVE_DYNAMIC_FTRACE_WITH_REGS
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS
config 64BIT
def_bool y
@@ -1850,7 +1858,7 @@
# which is only understood by binutils starting with version 2.33.1.
depends on LD_IS_LLD || LD_VERSION >= 23301 || (CC_IS_GCC && GCC_VERSION < 90100)
depends on !CC_IS_CLANG || AS_HAS_CFI_NEGATE_RA_STATE
- depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS)
+ depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_ARGS)
help
If the compiler supports the -mbranch-protection or
-msign-return-address flag (e.g. GCC 7 or later), then this option
@@ -1860,7 +1868,7 @@
disabled with minimal loss of protection.
This feature works with FUNCTION_GRAPH_TRACER option only if
- DYNAMIC_FTRACE_WITH_REGS is enabled.
+ DYNAMIC_FTRACE_WITH_ARGS is enabled.
config CC_HAS_BRANCH_PROT_PAC_RET
# GCC 9 or later, clang 8 or later
@@ -1958,7 +1966,7 @@
depends on !CC_IS_GCC
# https://github.com/llvm/llvm-project/commit/a88c722e687e6780dcd6a58718350dc76fcc4cc9
depends on !CC_IS_CLANG || CLANG_VERSION >= 120000
- depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_REGS)
+ depends on (!FUNCTION_GRAPH_TRACER || DYNAMIC_FTRACE_WITH_ARGS)
help
Build the kernel with Branch Target Identification annotations
and enable enforcement of this for kernel code. When this option
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index c949653..63c0b8d 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -128,7 +128,10 @@
CHECKFLAGS += -D__aarch64__
-ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS),y)
+ KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
+ CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
+else ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_ARGS),y)
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
CC_FLAGS_FTRACE := -fpatchable-function-entry=2
endif
diff --git a/arch/arm64/configs/google/xfstest.config b/arch/arm64/configs/google/xfstest.config
new file mode 100644
index 0000000..fb9126e
--- /dev/null
+++ b/arch/arm64/configs/google/xfstest.config
@@ -0,0 +1,25 @@
+# Configurations required to run xfs tests
+CONFIG_MODULE_SIG=n
+CONFIG_MODULE_SIG_ALL=n
+CONFIG_SECURITY_LOADPIN=n
+CONFIG_SECURITY_LOADPIN_ENFORCE=n
+CONFIG_SECURITY_YAMA=n
+CONFIG_SECURITY_LOCKDOWN_LSM=n
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=n
+CONFIG_LSM=""
+CONFIG_SYSTEM_TRUSTED_KEYRING=n
+CONFIG_SECONDARY_TRUSTED_KEYRING=n
+CONFIG_VFAT_FS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
+CONFIG_FAT_DEFAULT_UTF8=y
+CONFIG_GVE=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_NFSD=y
+CONFIG_NFSD_V4=y
diff --git a/arch/arm64/configs/lakitu_defconfig b/arch/arm64/configs/lakitu_defconfig
new file mode 100644
index 0000000..c353099
--- /dev/null
+++ b/arch/arm64/configs/lakitu_defconfig
@@ -0,0 +1,4335 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/arm64 6.1.38 Kernel Configuration
+#
+CONFIG_CC_VERSION_TEXT="Chromium OS 15.0_pre458507_p20220602-r18 clang version 15.0.0 (/var/tmp/portage/sys-devel/llvm-15.0_pre458507_p20220602-r18/work/llvm-15.0_pre458507_p20220602/clang a58d0af058038595c93de961b725f86997cf8d4a)"
+CONFIG_GCC_VERSION=0
+CONFIG_CC_IS_CLANG=y
+CONFIG_CLANG_VERSION=150000
+CONFIG_AS_IS_LLVM=y
+CONFIG_AS_VERSION=150000
+CONFIG_LD_VERSION=0
+CONFIG_LD_IS_LLD=y
+CONFIG_LLD_VERSION=150000
+CONFIG_CC_CAN_LINK=y
+CONFIG_CC_CAN_LINK_STATIC=y
+CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
+CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
+CONFIG_TOOLS_SUPPORT_RELR=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
+CONFIG_PAHOLE_VERSION=121
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+# CONFIG_WERROR is not set
+CONFIG_LOCALVERSION=""
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_BUILD_SALT=""
+CONFIG_DEFAULT_INIT=""
+CONFIG_DEFAULT_HOSTNAME="localhost"
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_SYSVIPC_COMPAT=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POSIX_MQUEUE_SYSCTL=y
+# CONFIG_WATCH_QUEUE is not set
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_USELIB=y
+CONFIG_AUDIT=y
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+CONFIG_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
+CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
+CONFIG_GENERIC_IRQ_MIGRATION=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_DOMAIN_HIERARCHY=y
+CONFIG_GENERIC_IRQ_IPI=y
+CONFIG_GENERIC_MSI_IRQ=y
+CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# CONFIG_GENERIC_IRQ_DEBUGFS is not set
+# end of IRQ subsystem
+
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_ARCH_HAS_TICK_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_CONTEXT_TRACKING=y
+CONFIG_CONTEXT_TRACKING_IDLE=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ_COMMON=y
+# CONFIG_HZ_PERIODIC is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+# end of Timers subsystem
+
+CONFIG_BPF=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+
+#
+# BPF subsystem
+#
+CONFIG_BPF_SYSCALL=y
+CONFIG_BPF_JIT=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_JIT_DEFAULT_ON=y
+# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
+# CONFIG_BPF_PRELOAD is not set
+CONFIG_BPF_LSM=y
+# end of BPF subsystem
+
+CONFIG_PREEMPT_NONE_BUILD=y
+CONFIG_PREEMPT_NONE=y
+# CONFIG_PREEMPT_VOLUNTARY is not set
+# CONFIG_PREEMPT is not set
+# CONFIG_PREEMPT_DYNAMIC is not set
+CONFIG_SCHED_CORE=y
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+CONFIG_HAVE_SCHED_AVG_IRQ=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
+# CONFIG_PSI_DEFAULT_DISABLED is not set
+# end of CPU/Task time and stats accounting
+
+CONFIG_CPU_ISOLATION=y
+
+#
+# RCU Subsystem
+#
+CONFIG_TREE_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TREE_SRCU=y
+CONFIG_TASKS_RCU_GENERIC=y
+CONFIG_TASKS_RUDE_RCU=y
+CONFIG_TASKS_TRACE_RCU=y
+CONFIG_RCU_STALL_COMMON=y
+CONFIG_RCU_NEED_SEGCBLIST=y
+# end of RCU Subsystem
+
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_IKHEADERS=m
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
+# CONFIG_PRINTK_INDEX is not set
+CONFIG_GENERIC_SCHED_CLOCK=y
+
+#
+# Scheduler features
+#
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_CC_HAS_INT128=y
+CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough"
+CONFIG_GCC11_NO_ARRAY_BOUNDS=y
+CONFIG_ARCH_SUPPORTS_INT128=y
+# CONFIG_NUMA_BALANCING is not set
+CONFIG_CGROUPS=y
+CONFIG_PAGE_COUNTER=y
+# CONFIG_CGROUP_FAVOR_DYNMODS is not set
+CONFIG_MEMCG=y
+CONFIG_MEMCG_KMEM=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CGROUP_WRITEBACK=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_RDMA=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CPUSETS=y
+CONFIG_PROC_PID_CPUSET=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_BPF=y
+# CONFIG_CGROUP_MISC is not set
+# CONFIG_CGROUP_DEBUG is not set
+CONFIG_SOCK_CGROUP_DATA=y
+CONFIG_NAMESPACES=y
+CONFIG_UTS_NS=y
+CONFIG_TIME_NS=y
+CONFIG_IPC_NS=y
+CONFIG_USER_NS=y
+CONFIG_PID_NS=y
+CONFIG_NET_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+# CONFIG_SCHED_AUTOGROUP is not set
+# CONFIG_SYSFS_DEPRECATED is not set
+CONFIG_RELAY=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_INITRAMFS_SOURCE=""
+CONFIG_RD_GZIP=y
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
+CONFIG_RD_XZ=y
+# CONFIG_RD_LZO is not set
+CONFIG_RD_LZ4=y
+CONFIG_RD_ZSTD=y
+# CONFIG_BOOT_CONFIG is not set
+CONFIG_INITRAMFS_PRESERVE_MTIME=y
+CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
+CONFIG_LD_ORPHAN_WARN=y
+CONFIG_SYSCTL=y
+CONFIG_HAVE_UID16=y
+CONFIG_SYSCTL_EXCEPTION_TRACE=y
+CONFIG_EXPERT=y
+CONFIG_UID16=y
+CONFIG_MULTIUSER=y
+CONFIG_SGETMASK_SYSCALL=y
+CONFIG_SYSFS_SYSCALL=y
+CONFIG_FHANDLE=y
+CONFIG_POSIX_TIMERS=y
+CONFIG_PRINTK=y
+CONFIG_BUG=y
+CONFIG_ELF_CORE=y
+CONFIG_BASE_FULL=y
+CONFIG_FUTEX=y
+CONFIG_FUTEX_PI=y
+CONFIG_EPOLL=y
+CONFIG_SIGNALFD=y
+CONFIG_TIMERFD=y
+CONFIG_EVENTFD=y
+CONFIG_SHMEM=y
+CONFIG_AIO=y
+CONFIG_IO_URING=y
+CONFIG_ADVISE_SYSCALLS=y
+CONFIG_MEMBARRIER=y
+CONFIG_KALLSYMS=y
+# CONFIG_KALLSYMS_ALL is not set
+CONFIG_KALLSYMS_BASE_RELATIVE=y
+CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
+CONFIG_KCMP=y
+CONFIG_RSEQ=y
+# CONFIG_DEBUG_RSEQ is not set
+CONFIG_EMBEDDED=y
+CONFIG_HAVE_PERF_EVENTS=y
+# CONFIG_PC104 is not set
+
+#
+# Kernel Performance Events And Counters
+#
+CONFIG_PERF_EVENTS=y
+# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
+# end of Kernel Performance Events And Counters
+
+CONFIG_SYSTEM_DATA_VERIFICATION=y
+CONFIG_PROFILING=y
+CONFIG_TRACEPOINTS=y
+# end of General setup
+
+CONFIG_ARM64=y
+CONFIG_CLANG_SUPPORTS_DYNAMIC_FTRACE_WITH_ARGS=y
+CONFIG_64BIT=y
+CONFIG_MMU=y
+CONFIG_ARM64_PAGE_SHIFT=12
+CONFIG_ARM64_CONT_PTE_SHIFT=4
+CONFIG_ARM64_CONT_PMD_SHIFT=4
+CONFIG_ARCH_MMAP_RND_BITS_MIN=18
+CONFIG_ARCH_MMAP_RND_BITS_MAX=33
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=11
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
+CONFIG_STACKTRACE_SUPPORT=y
+CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
+CONFIG_LOCKDEP_SUPPORT=y
+CONFIG_GENERIC_BUG=y
+CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
+CONFIG_GENERIC_HWEIGHT=y
+CONFIG_GENERIC_CSUM=y
+CONFIG_GENERIC_CALIBRATE_DELAY=y
+CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
+CONFIG_SMP=y
+CONFIG_KERNEL_MODE_NEON=y
+CONFIG_FIX_EARLYCON_MEM=y
+CONFIG_PGTABLE_LEVELS=4
+CONFIG_ARCH_SUPPORTS_UPROBES=y
+CONFIG_ARCH_PROC_KCORE_TEXT=y
+
+#
+# Platform selection
+#
+# CONFIG_ARCH_ACTIONS is not set
+# CONFIG_ARCH_SUNXI is not set
+# CONFIG_ARCH_ALPINE is not set
+# CONFIG_ARCH_APPLE is not set
+# CONFIG_ARCH_BCM is not set
+# CONFIG_ARCH_BERLIN is not set
+# CONFIG_ARCH_BITMAIN is not set
+# CONFIG_ARCH_EXYNOS is not set
+# CONFIG_ARCH_SPARX5 is not set
+# CONFIG_ARCH_K3 is not set
+# CONFIG_ARCH_LG1K is not set
+# CONFIG_ARCH_HISI is not set
+# CONFIG_ARCH_KEEMBAY is not set
+# CONFIG_ARCH_MEDIATEK is not set
+# CONFIG_ARCH_MESON is not set
+# CONFIG_ARCH_MVEBU is not set
+# CONFIG_ARCH_NXP is not set
+# CONFIG_ARCH_NPCM is not set
+# CONFIG_ARCH_QCOM is not set
+# CONFIG_ARCH_REALTEK is not set
+# CONFIG_ARCH_RENESAS is not set
+# CONFIG_ARCH_ROCKCHIP is not set
+# CONFIG_ARCH_SEATTLE is not set
+# CONFIG_ARCH_INTEL_SOCFPGA is not set
+# CONFIG_ARCH_SYNQUACER is not set
+# CONFIG_ARCH_TEGRA is not set
+# CONFIG_ARCH_SPRD is not set
+# CONFIG_ARCH_THUNDER is not set
+# CONFIG_ARCH_THUNDER2 is not set
+# CONFIG_ARCH_UNIPHIER is not set
+# CONFIG_ARCH_VEXPRESS is not set
+# CONFIG_ARCH_VISCONTI is not set
+# CONFIG_ARCH_XGENE is not set
+# CONFIG_ARCH_ZYNQMP is not set
+# end of Platform selection
+
+#
+# Kernel Features
+#
+
+#
+# ARM errata workarounds via the alternatives framework
+#
+CONFIG_ARM64_WORKAROUND_CLEAN_CACHE=y
+CONFIG_ARM64_ERRATUM_826319=y
+CONFIG_ARM64_ERRATUM_827319=y
+CONFIG_ARM64_ERRATUM_824069=y
+CONFIG_ARM64_ERRATUM_819472=y
+CONFIG_ARM64_ERRATUM_832075=y
+CONFIG_ARM64_ERRATUM_1742098=y
+CONFIG_ARM64_ERRATUM_845719=y
+CONFIG_ARM64_ERRATUM_843419=y
+CONFIG_ARM64_LD_HAS_FIX_ERRATUM_843419=y
+CONFIG_ARM64_ERRATUM_1024718=y
+CONFIG_ARM64_ERRATUM_1418040=y
+CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT=y
+CONFIG_ARM64_ERRATUM_1165522=y
+CONFIG_ARM64_ERRATUM_1319367=y
+CONFIG_ARM64_ERRATUM_1530923=y
+CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y
+CONFIG_ARM64_ERRATUM_2441007=y
+CONFIG_ARM64_ERRATUM_1286807=y
+CONFIG_ARM64_ERRATUM_1463225=y
+CONFIG_ARM64_ERRATUM_1542419=y
+CONFIG_ARM64_ERRATUM_1508412=y
+CONFIG_ARM64_ERRATUM_2051678=y
+CONFIG_ARM64_ERRATUM_2077057=y
+CONFIG_ARM64_ERRATUM_2658417=y
+CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y
+CONFIG_ARM64_ERRATUM_2054223=y
+CONFIG_ARM64_ERRATUM_2067961=y
+CONFIG_ARM64_ERRATUM_2441009=y
+CONFIG_ARM64_ERRATUM_2457168=y
+CONFIG_ARM64_ERRATUM_2966298=y
+CONFIG_CAVIUM_ERRATUM_22375=y
+CONFIG_CAVIUM_ERRATUM_23144=y
+CONFIG_CAVIUM_ERRATUM_23154=y
+CONFIG_CAVIUM_ERRATUM_27456=y
+CONFIG_CAVIUM_ERRATUM_30115=y
+CONFIG_CAVIUM_TX2_ERRATUM_219=y
+CONFIG_FUJITSU_ERRATUM_010001=y
+CONFIG_HISILICON_ERRATUM_161600802=y
+CONFIG_QCOM_FALKOR_ERRATUM_1003=y
+CONFIG_QCOM_FALKOR_ERRATUM_1009=y
+CONFIG_QCOM_QDF2400_ERRATUM_0065=y
+CONFIG_QCOM_FALKOR_ERRATUM_E1041=y
+CONFIG_NVIDIA_CARMEL_CNP_ERRATUM=y
+CONFIG_SOCIONEXT_SYNQUACER_PREITS=y
+# end of ARM errata workarounds via the alternatives framework
+
+CONFIG_ARM64_4K_PAGES=y
+# CONFIG_ARM64_16K_PAGES is not set
+# CONFIG_ARM64_64K_PAGES is not set
+# CONFIG_ARM64_VA_BITS_39 is not set
+CONFIG_ARM64_VA_BITS_48=y
+CONFIG_ARM64_VA_BITS=48
+CONFIG_ARM64_PA_BITS_48=y
+CONFIG_ARM64_PA_BITS=48
+# CONFIG_CPU_BIG_ENDIAN is not set
+CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_SCHED_MC=y
+# CONFIG_SCHED_CLUSTER is not set
+CONFIG_SCHED_SMT=y
+CONFIG_NR_CPUS=512
+CONFIG_HOTPLUG_CPU=y
+CONFIG_NUMA=y
+CONFIG_NODES_SHIFT=6
+# CONFIG_HZ_100 is not set
+# CONFIG_HZ_250 is not set
+# CONFIG_HZ_300 is not set
+CONFIG_HZ_1000=y
+CONFIG_HZ=1000
+CONFIG_SCHED_HRTICK=y
+CONFIG_ARCH_SPARSEMEM_ENABLE=y
+CONFIG_HW_PERF_EVENTS=y
+CONFIG_CC_HAVE_SHADOW_CALL_STACK=y
+CONFIG_PARAVIRT=y
+CONFIG_PARAVIRT_TIME_ACCOUNTING=y
+# CONFIG_KEXEC is not set
+CONFIG_KEXEC_FILE=y
+# CONFIG_KEXEC_SIG is not set
+# CONFIG_CRASH_DUMP is not set
+CONFIG_TRANS_TABLE=y
+CONFIG_XEN_DOM0=y
+CONFIG_XEN=y
+CONFIG_ARCH_FORCE_MAX_ORDER=11
+CONFIG_UNMAP_KERNEL_AT_EL0=y
+CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY=y
+CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
+# CONFIG_ARM64_SW_TTBR0_PAN is not set
+CONFIG_ARM64_TAGGED_ADDR_ABI=y
+CONFIG_COMPAT=y
+CONFIG_KUSER_HELPERS=y
+# CONFIG_COMPAT_VDSO is not set
+# CONFIG_COMPAT_ALIGNMENT_FIXUPS is not set
+# CONFIG_ARMV8_DEPRECATED is not set
+
+#
+# ARMv8.1 architectural features
+#
+CONFIG_ARM64_HW_AFDBM=y
+CONFIG_ARM64_PAN=y
+CONFIG_AS_HAS_LDAPR=y
+CONFIG_AS_HAS_LSE_ATOMICS=y
+# end of ARMv8.1 architectural features
+
+#
+# ARMv8.2 architectural features
+#
+CONFIG_AS_HAS_ARMV8_2=y
+CONFIG_AS_HAS_SHA3=y
+# CONFIG_ARM64_PMEM is not set
+CONFIG_ARM64_RAS_EXTN=y
+CONFIG_ARM64_CNP=y
+# end of ARMv8.2 architectural features
+
+#
+# ARMv8.3 architectural features
+#
+CONFIG_ARM64_PTR_AUTH=y
+CONFIG_ARM64_PTR_AUTH_KERNEL=y
+CONFIG_CC_HAS_BRANCH_PROT_PAC_RET=y
+CONFIG_CC_HAS_SIGN_RETURN_ADDRESS=y
+CONFIG_AS_HAS_PAC=y
+CONFIG_AS_HAS_CFI_NEGATE_RA_STATE=y
+# end of ARMv8.3 architectural features
+
+#
+# ARMv8.4 architectural features
+#
+CONFIG_ARM64_AMU_EXTN=y
+CONFIG_AS_HAS_ARMV8_4=y
+CONFIG_ARM64_TLB_RANGE=y
+# end of ARMv8.4 architectural features
+
+#
+# ARMv8.5 architectural features
+#
+CONFIG_AS_HAS_ARMV8_5=y
+CONFIG_ARM64_BTI=y
+CONFIG_ARM64_BTI_KERNEL=y
+CONFIG_CC_HAS_BRANCH_PROT_PAC_RET_BTI=y
+CONFIG_ARM64_E0PD=y
+CONFIG_ARM64_AS_HAS_MTE=y
+CONFIG_ARM64_MTE=y
+# end of ARMv8.5 architectural features
+
+#
+# ARMv8.7 architectural features
+#
+CONFIG_ARM64_EPAN=y
+# end of ARMv8.7 architectural features
+
+CONFIG_ARM64_SVE=y
+CONFIG_ARM64_SME=y
+CONFIG_ARM64_MODULE_PLTS=y
+# CONFIG_ARM64_PSEUDO_NMI is not set
+CONFIG_RELOCATABLE=y
+CONFIG_RANDOMIZE_BASE=y
+CONFIG_RANDOMIZE_MODULE_REGION_FULL=y
+CONFIG_CC_HAVE_STACKPROTECTOR_SYSREG=y
+CONFIG_STACKPROTECTOR_PER_TASK=y
+CONFIG_ARCH_NR_GPIO=0
+# end of Kernel Features
+
+#
+# Boot options
+#
+# CONFIG_ARM64_ACPI_PARKING_PROTOCOL is not set
+CONFIG_CMDLINE=""
+CONFIG_EFI_STUB=y
+CONFIG_EFI=y
+CONFIG_DMI=y
+# end of Boot options
+
+#
+# Power management options
+#
+CONFIG_SUSPEND=y
+CONFIG_SUSPEND_FREEZER=y
+# CONFIG_SUSPEND_SKIP_SYNC is not set
+# CONFIG_HIBERNATION is not set
+CONFIG_PM_SLEEP=y
+CONFIG_PM_SLEEP_SMP=y
+# CONFIG_PM_AUTOSLEEP is not set
+# CONFIG_PM_USERSPACE_AUTOSLEEP is not set
+# CONFIG_PM_WAKELOCKS is not set
+CONFIG_PM=y
+CONFIG_PM_DEBUG=y
+# CONFIG_PM_ADVANCED_DEBUG is not set
+# CONFIG_PM_TEST_SUSPEND is not set
+CONFIG_PM_SLEEP_DEBUG=y
+# CONFIG_DPM_WATCHDOG is not set
+CONFIG_PM_CLK=y
+# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
+CONFIG_CPU_PM=y
+# CONFIG_ENERGY_MODEL is not set
+CONFIG_ARCH_HIBERNATION_POSSIBLE=y
+CONFIG_ARCH_SUSPEND_POSSIBLE=y
+# end of Power management options
+
+#
+# CPU Power Management
+#
+
+#
+# CPU Idle
+#
+CONFIG_CPU_IDLE=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPU_IDLE_GOV_MENU=y
+# CONFIG_CPU_IDLE_GOV_TEO is not set
+
+#
+# ARM CPU Idle Drivers
+#
+# CONFIG_ARM_PSCI_CPUIDLE is not set
+# end of ARM CPU Idle Drivers
+# end of CPU Idle
+
+#
+# CPU Frequency scaling
+#
+CONFIG_CPU_FREQ=y
+CONFIG_CPU_FREQ_STAT=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
+# CONFIG_CPU_FREQ_GOV_SCHEDUTIL is not set
+
+#
+# CPU frequency scaling drivers
+#
+# CONFIG_CPUFREQ_DT is not set
+# CONFIG_ACPI_CPPC_CPUFREQ is not set
+# end of CPU Frequency scaling
+# end of CPU Power Management
+
+CONFIG_ARCH_SUPPORTS_ACPI=y
+CONFIG_ACPI=y
+CONFIG_ACPI_GENERIC_GSI=y
+CONFIG_ACPI_CCA_REQUIRED=y
+# CONFIG_ACPI_DEBUGGER is not set
+CONFIG_ACPI_SPCR_TABLE=y
+# CONFIG_ACPI_EC_DEBUGFS is not set
+# CONFIG_ACPI_AC is not set
+# CONFIG_ACPI_BATTERY is not set
+CONFIG_ACPI_BUTTON=y
+# CONFIG_ACPI_FAN is not set
+# CONFIG_ACPI_TAD is not set
+# CONFIG_ACPI_DOCK is not set
+CONFIG_ACPI_PROCESSOR_IDLE=y
+CONFIG_ACPI_MCFG=y
+CONFIG_ACPI_PROCESSOR=y
+CONFIG_ACPI_HOTPLUG_CPU=y
+CONFIG_ACPI_THERMAL=y
+CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
+# CONFIG_ACPI_TABLE_UPGRADE is not set
+# CONFIG_ACPI_DEBUG is not set
+# CONFIG_ACPI_PCI_SLOT is not set
+CONFIG_ACPI_CONTAINER=y
+# CONFIG_ACPI_HED is not set
+# CONFIG_ACPI_CUSTOM_METHOD is not set
+# CONFIG_ACPI_BGRT is not set
+CONFIG_ACPI_REDUCED_HARDWARE_ONLY=y
+CONFIG_ACPI_NUMA=y
+# CONFIG_ACPI_HMAT is not set
+CONFIG_HAVE_ACPI_APEI=y
+# CONFIG_ACPI_APEI is not set
+# CONFIG_ACPI_CONFIGFS is not set
+# CONFIG_ACPI_PFRUT is not set
+CONFIG_ACPI_IORT=y
+CONFIG_ACPI_GTDT=y
+CONFIG_ACPI_PPTT=y
+CONFIG_ACPI_PCC=y
+# CONFIG_PMIC_OPREGION is not set
+CONFIG_ACPI_PRMT=y
+CONFIG_IRQ_BYPASS_MANAGER=m
+CONFIG_HAVE_KVM=y
+CONFIG_VIRTUALIZATION=y
+# CONFIG_KVM is not set
+
+#
+# General architecture-dependent options
+#
+CONFIG_CRASH_CORE=y
+CONFIG_KEXEC_CORE=y
+CONFIG_HAVE_IMA_KEXEC=y
+CONFIG_ARCH_HAS_SUBPAGE_FAULTS=y
+CONFIG_KPROBES=y
+# CONFIG_JUMP_LABEL is not set
+CONFIG_UPROBES=y
+CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
+CONFIG_KRETPROBES=y
+CONFIG_HAVE_IOREMAP_PROT=y
+CONFIG_HAVE_KPROBES=y
+CONFIG_HAVE_KRETPROBES=y
+CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
+CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
+CONFIG_HAVE_NMI=y
+CONFIG_TRACE_IRQFLAGS_SUPPORT=y
+CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
+CONFIG_HAVE_ARCH_TRACEHOOK=y
+CONFIG_HAVE_DMA_CONTIGUOUS=y
+CONFIG_GENERIC_SMP_IDLE_THREAD=y
+CONFIG_GENERIC_IDLE_POLL_SETUP=y
+CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
+CONFIG_ARCH_HAS_KEEPINITRD=y
+CONFIG_ARCH_HAS_SET_MEMORY=y
+CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
+CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
+CONFIG_ARCH_WANTS_NO_INSTR=y
+CONFIG_HAVE_ASM_MODVERSIONS=y
+CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
+CONFIG_HAVE_RSEQ=y
+CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
+CONFIG_HAVE_HW_BREAKPOINT=y
+CONFIG_HAVE_PERF_REGS=y
+CONFIG_HAVE_PERF_USER_STACK_DUMP=y
+CONFIG_HAVE_ARCH_JUMP_LABEL=y
+CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
+CONFIG_MMU_GATHER_TABLE_FREE=y
+CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
+CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
+CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
+CONFIG_HAVE_CMPXCHG_LOCAL=y
+CONFIG_HAVE_CMPXCHG_DOUBLE=y
+CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
+CONFIG_HAVE_ARCH_SECCOMP=y
+CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
+CONFIG_SECCOMP=y
+CONFIG_SECCOMP_FILTER=y
+# CONFIG_SECCOMP_CACHE_DEBUG is not set
+CONFIG_HAVE_ARCH_STACKLEAK=y
+CONFIG_HAVE_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR_STRONG=y
+CONFIG_ARCH_SUPPORTS_SHADOW_CALL_STACK=y
+CONFIG_SHADOW_CALL_STACK=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
+CONFIG_LTO_NONE=y
+CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
+CONFIG_HAVE_CONTEXT_TRACKING_USER=y
+CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
+CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
+CONFIG_HAVE_MOVE_PUD=y
+CONFIG_HAVE_MOVE_PMD=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
+CONFIG_HAVE_ARCH_HUGE_VMAP=y
+CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
+CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
+CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
+CONFIG_MODULES_USE_ELF_RELA=y
+CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
+CONFIG_SOFTIRQ_ON_OWN_STACK=y
+CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
+CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
+CONFIG_ARCH_MMAP_RND_BITS=31
+CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=11
+CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
+CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
+CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT=y
+CONFIG_CLONE_BACKWARDS=y
+CONFIG_OLD_SIGSUSPEND3=y
+CONFIG_COMPAT_OLD_SIGACTION=y
+CONFIG_COMPAT_32BIT_TIME=y
+CONFIG_HAVE_ARCH_VMAP_STACK=y
+CONFIG_VMAP_STACK=y
+CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
+CONFIG_RANDOMIZE_KSTACK_OFFSET=y
+CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
+CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
+CONFIG_STRICT_MODULE_RWX=y
+CONFIG_HAVE_ARCH_COMPILER_H=y
+CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
+CONFIG_ARCH_USE_MEMREMAP_PROT=y
+# CONFIG_LOCK_EVENT_COUNTS is not set
+CONFIG_ARCH_HAS_RELR=y
+CONFIG_RELR=y
+CONFIG_HAVE_PREEMPT_DYNAMIC=y
+CONFIG_HAVE_PREEMPT_DYNAMIC_KEY=y
+CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
+CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
+CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
+CONFIG_ARCH_HAVE_TRACE_MMIO_ACCESS=y
+
+#
+# GCOV-based kernel profiling
+#
+# CONFIG_GCOV_KERNEL is not set
+CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
+# end of GCOV-based kernel profiling
+
+CONFIG_HAVE_GCC_PLUGINS=y
+CONFIG_FUNCTION_ALIGNMENT_4B=y
+CONFIG_FUNCTION_ALIGNMENT_8B=y
+CONFIG_FUNCTION_ALIGNMENT=8
+# end of General architecture-dependent options
+
+CONFIG_RT_MUTEXES=y
+CONFIG_BASE_SMALL=0
+CONFIG_MODULE_SIG_FORMAT=y
+CONFIG_MODULES=y
+# CONFIG_MODULE_FORCE_LOAD is not set
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_FORCE_UNLOAD=y
+# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set
+# CONFIG_MODVERSIONS is not set
+# CONFIG_MODULE_SRCVERSION_ALL is not set
+CONFIG_MODULE_SIG=y
+# CONFIG_MODULE_SIG_FORCE is not set
+CONFIG_MODULE_SIG_ALL=y
+# CONFIG_MODULE_SIG_SHA1 is not set
+# CONFIG_MODULE_SIG_SHA224 is not set
+CONFIG_MODULE_SIG_SHA256=y
+# CONFIG_MODULE_SIG_SHA384 is not set
+# CONFIG_MODULE_SIG_SHA512 is not set
+CONFIG_MODULE_SIG_HASH="sha256"
+CONFIG_MODULE_COMPRESS_NONE=y
+# CONFIG_MODULE_COMPRESS_GZIP is not set
+# CONFIG_MODULE_COMPRESS_XZ is not set
+# CONFIG_MODULE_COMPRESS_ZSTD is not set
+# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
+CONFIG_MODPROBE_PATH="/sbin/modprobe"
+# CONFIG_TRIM_UNUSED_KSYMS is not set
+CONFIG_MODULES_TREE_LOOKUP=y
+CONFIG_BLOCK=y
+CONFIG_BLOCK_LEGACY_AUTOLOAD=y
+CONFIG_BLK_CGROUP_RWSTAT=y
+CONFIG_BLK_DEV_BSG_COMMON=y
+CONFIG_BLK_ICQ=y
+CONFIG_BLK_DEV_BSGLIB=y
+CONFIG_BLK_DEV_INTEGRITY=y
+CONFIG_BLK_DEV_INTEGRITY_T10=y
+# CONFIG_BLK_DEV_ZONED is not set
+CONFIG_BLK_DEV_THROTTLING=y
+# CONFIG_BLK_DEV_THROTTLING_LOW is not set
+CONFIG_BLK_WBT=y
+CONFIG_BLK_WBT_MQ=y
+# CONFIG_BLK_CGROUP_IOLATENCY is not set
+# CONFIG_BLK_CGROUP_IOCOST is not set
+# CONFIG_BLK_CGROUP_IOPRIO is not set
+# CONFIG_BLK_DEBUG_FS is not set
+# CONFIG_BLK_SED_OPAL is not set
+# CONFIG_BLK_INLINE_ENCRYPTION is not set
+
+#
+# Partition Types
+#
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_ACORN_PARTITION is not set
+# CONFIG_AIX_PARTITION is not set
+# CONFIG_OSF_PARTITION is not set
+# CONFIG_AMIGA_PARTITION is not set
+# CONFIG_ATARI_PARTITION is not set
+# CONFIG_MAC_PARTITION is not set
+CONFIG_MSDOS_PARTITION=y
+# CONFIG_BSD_DISKLABEL is not set
+# CONFIG_MINIX_SUBPARTITION is not set
+# CONFIG_SOLARIS_X86_PARTITION is not set
+# CONFIG_UNIXWARE_DISKLABEL is not set
+# CONFIG_LDM_PARTITION is not set
+# CONFIG_SGI_PARTITION is not set
+# CONFIG_ULTRIX_PARTITION is not set
+# CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
+CONFIG_EFI_PARTITION=y
+# CONFIG_SYSV68_PARTITION is not set
+# CONFIG_CMDLINE_PARTITION is not set
+# end of Partition Types
+
+CONFIG_BLOCK_COMPAT=y
+CONFIG_BLK_MQ_PCI=y
+CONFIG_BLK_MQ_VIRTIO=y
+CONFIG_BLK_PM=y
+CONFIG_BLOCK_HOLDER_DEPRECATED=y
+CONFIG_BLK_MQ_STACKING=y
+
+#
+# IO Schedulers
+#
+CONFIG_MQ_IOSCHED_DEADLINE=y
+CONFIG_MQ_IOSCHED_KYBER=m
+CONFIG_IOSCHED_BFQ=m
+CONFIG_BFQ_GROUP_IOSCHED=y
+# CONFIG_BFQ_CGROUP_DEBUG is not set
+# end of IO Schedulers
+
+CONFIG_ASN1=y
+CONFIG_ARCH_INLINE_SPIN_TRYLOCK=y
+CONFIG_ARCH_INLINE_SPIN_TRYLOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_LOCK=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_SPIN_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_INLINE_READ_LOCK=y
+CONFIG_ARCH_INLINE_READ_LOCK_BH=y
+CONFIG_ARCH_INLINE_READ_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_READ_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_READ_UNLOCK=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_READ_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_INLINE_WRITE_LOCK=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_BH=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_IRQ=y
+CONFIG_ARCH_INLINE_WRITE_LOCK_IRQSAVE=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_BH=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQ=y
+CONFIG_ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_SPIN_TRYLOCK=y
+CONFIG_INLINE_SPIN_TRYLOCK_BH=y
+CONFIG_INLINE_SPIN_LOCK=y
+CONFIG_INLINE_SPIN_LOCK_BH=y
+CONFIG_INLINE_SPIN_LOCK_IRQ=y
+CONFIG_INLINE_SPIN_LOCK_IRQSAVE=y
+CONFIG_INLINE_SPIN_UNLOCK_BH=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
+CONFIG_INLINE_SPIN_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_READ_LOCK=y
+CONFIG_INLINE_READ_LOCK_BH=y
+CONFIG_INLINE_READ_LOCK_IRQ=y
+CONFIG_INLINE_READ_LOCK_IRQSAVE=y
+CONFIG_INLINE_READ_UNLOCK=y
+CONFIG_INLINE_READ_UNLOCK_BH=y
+CONFIG_INLINE_READ_UNLOCK_IRQ=y
+CONFIG_INLINE_READ_UNLOCK_IRQRESTORE=y
+CONFIG_INLINE_WRITE_LOCK=y
+CONFIG_INLINE_WRITE_LOCK_BH=y
+CONFIG_INLINE_WRITE_LOCK_IRQ=y
+CONFIG_INLINE_WRITE_LOCK_IRQSAVE=y
+CONFIG_INLINE_WRITE_UNLOCK=y
+CONFIG_INLINE_WRITE_UNLOCK_BH=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
+CONFIG_INLINE_WRITE_UNLOCK_IRQRESTORE=y
+CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
+CONFIG_MUTEX_SPIN_ON_OWNER=y
+CONFIG_RWSEM_SPIN_ON_OWNER=y
+CONFIG_LOCK_SPIN_ON_OWNER=y
+CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
+CONFIG_QUEUED_SPINLOCKS=y
+CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
+CONFIG_QUEUED_RWLOCKS=y
+CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
+CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
+CONFIG_FREEZER=y
+
+#
+# Executable file formats
+#
+CONFIG_BINFMT_ELF=y
+CONFIG_COMPAT_BINFMT_ELF=y
+CONFIG_ARCH_BINFMT_ELF_STATE=y
+CONFIG_ARCH_BINFMT_ELF_EXTRA_PHDRS=y
+CONFIG_ARCH_HAVE_ELF_PROT=y
+CONFIG_ARCH_USE_GNU_PROPERTY=y
+CONFIG_ELFCORE=y
+CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
+CONFIG_BINFMT_SCRIPT=y
+CONFIG_BINFMT_MISC=y
+CONFIG_COREDUMP=y
+# end of Executable file formats
+
+#
+# Memory Management options
+#
+CONFIG_SWAP=y
+# CONFIG_ZSWAP is not set
+CONFIG_ZSMALLOC=m
+# CONFIG_ZSMALLOC_STAT is not set
+
+#
+# SLAB allocator options
+#
+# CONFIG_SLAB is not set
+CONFIG_SLUB=y
+# CONFIG_SLOB is not set
+CONFIG_SLAB_MERGE_DEFAULT=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+# CONFIG_SLUB_STATS is not set
+CONFIG_SLUB_CPU_PARTIAL=y
+# end of SLAB allocator options
+
+CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_SPARSEMEM=y
+CONFIG_SPARSEMEM_EXTREME=y
+CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
+CONFIG_SPARSEMEM_VMEMMAP=y
+CONFIG_HAVE_FAST_GUP=y
+CONFIG_ARCH_KEEP_MEMBLOCK=y
+CONFIG_MEMORY_ISOLATION=y
+CONFIG_EXCLUSIVE_SYSTEM_RAM=y
+CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
+CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
+# CONFIG_MEMORY_HOTPLUG is not set
+CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
+CONFIG_MEMORY_BALLOON=y
+CONFIG_BALLOON_COMPACTION=y
+CONFIG_COMPACTION=y
+CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
+CONFIG_PAGE_REPORTING=y
+CONFIG_MIGRATION=y
+CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
+CONFIG_ARCH_ENABLE_THP_MIGRATION=y
+CONFIG_CONTIG_ALLOC=y
+CONFIG_PHYS_ADDR_T_64BIT=y
+CONFIG_MMU_NOTIFIER=y
+# CONFIG_KSM is not set
+CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
+CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
+# CONFIG_MEMORY_FAILURE is not set
+CONFIG_ARCH_WANTS_THP_SWAP=y
+CONFIG_TRANSPARENT_HUGEPAGE=y
+# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
+CONFIG_THP_SWAP=y
+# CONFIG_READ_ONLY_THP_FOR_FS is not set
+CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
+CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
+CONFIG_USE_PERCPU_NUMA_NODE_ID=y
+CONFIG_HAVE_SETUP_PER_CPU_AREA=y
+CONFIG_CMA=y
+# CONFIG_CMA_DEBUG is not set
+# CONFIG_CMA_DEBUGFS is not set
+# CONFIG_CMA_SYSFS is not set
+CONFIG_CMA_AREAS=7
+CONFIG_GENERIC_EARLY_IOREMAP=y
+# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
+# CONFIG_IDLE_PAGE_TRACKING is not set
+CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
+CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
+CONFIG_ARCH_HAS_PTE_DEVMAP=y
+CONFIG_ARCH_HAS_ZONE_DMA_SET=y
+CONFIG_ZONE_DMA=y
+CONFIG_ZONE_DMA32=y
+CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
+CONFIG_VM_EVENT_COUNTERS=y
+# CONFIG_PERCPU_STATS is not set
+# CONFIG_GUP_TEST is not set
+CONFIG_ARCH_HAS_PTE_SPECIAL=y
+# CONFIG_ANON_VMA_NAME is not set
+# CONFIG_USERFAULTFD is not set
+CONFIG_LRU_GEN=y
+# CONFIG_LRU_GEN_ENABLED is not set
+# CONFIG_LRU_GEN_STATS is not set
+CONFIG_LOCK_MM_AND_FIND_VMA=y
+
+#
+# Data Access Monitoring
+#
+# CONFIG_DAMON is not set
+# end of Data Access Monitoring
+# end of Memory Management options
+
+CONFIG_NET=y
+CONFIG_NET_INGRESS=y
+CONFIG_NET_EGRESS=y
+CONFIG_NET_REDIRECT=y
+CONFIG_SKB_EXTENSIONS=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+CONFIG_PACKET_DIAG=m
+CONFIG_UNIX=y
+CONFIG_UNIX_SCM=y
+CONFIG_AF_UNIX_OOB=y
+CONFIG_UNIX_DIAG=m
+CONFIG_TLS=y
+CONFIG_TLS_DEVICE=y
+# CONFIG_TLS_TOE is not set
+CONFIG_XFRM=y
+CONFIG_XFRM_ALGO=y
+CONFIG_XFRM_USER=y
+CONFIG_XFRM_INTERFACE=m
+# CONFIG_XFRM_SUB_POLICY is not set
+# CONFIG_XFRM_MIGRATE is not set
+CONFIG_XFRM_STATISTICS=y
+CONFIG_XFRM_AH=m
+CONFIG_XFRM_ESP=m
+CONFIG_XFRM_IPCOMP=m
+CONFIG_NET_KEY=m
+# CONFIG_NET_KEY_MIGRATE is not set
+CONFIG_XDP_SOCKETS=y
+# CONFIG_XDP_SOCKETS_DIAG is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+# CONFIG_IP_FIB_TRIE_STATS is not set
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_ROUTE_CLASSID=y
+# CONFIG_IP_PNP is not set
+CONFIG_NET_IPIP=m
+CONFIG_NET_IPGRE_DEMUX=m
+CONFIG_NET_IP_TUNNEL=m
+CONFIG_NET_IPGRE=m
+# CONFIG_NET_IPGRE_BROADCAST is not set
+CONFIG_IP_MROUTE_COMMON=y
+CONFIG_IP_MROUTE=y
+# CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_SYN_COOKIES=y
+# CONFIG_NET_IPVTI is not set
+CONFIG_NET_UDP_TUNNEL=m
+CONFIG_NET_FOU=m
+CONFIG_NET_FOU_IP_TUNNELS=y
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+# CONFIG_INET_ESP_OFFLOAD is not set
+# CONFIG_INET_ESPINTCP is not set
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_TABLE_PERTURB_ORDER=16
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+# CONFIG_INET_RAW_DIAG is not set
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_TCP_CONG_ADVANCED=y
+# CONFIG_TCP_CONG_BIC is not set
+CONFIG_TCP_CONG_CUBIC=y
+# CONFIG_TCP_CONG_WESTWOOD is not set
+# CONFIG_TCP_CONG_HTCP is not set
+# CONFIG_TCP_CONG_HSTCP is not set
+# CONFIG_TCP_CONG_HYBLA is not set
+# CONFIG_TCP_CONG_VEGAS is not set
+# CONFIG_TCP_CONG_NV is not set
+# CONFIG_TCP_CONG_SCALABLE is not set
+CONFIG_TCP_CONG_LP=m
+# CONFIG_TCP_CONG_VENO is not set
+# CONFIG_TCP_CONG_YEAH is not set
+# CONFIG_TCP_CONG_ILLINOIS is not set
+# CONFIG_TCP_CONG_DCTCP is not set
+# CONFIG_TCP_CONG_CDG is not set
+CONFIG_TCP_CONG_BBR=m
+CONFIG_DEFAULT_CUBIC=y
+# CONFIG_DEFAULT_RENO is not set
+CONFIG_DEFAULT_TCP_CONG="cubic"
+CONFIG_TCP_MD5SIG=y
+CONFIG_IPV6=y
+# CONFIG_IPV6_ROUTER_PREF is not set
+# CONFIG_IPV6_OPTIMISTIC_DAD is not set
+# CONFIG_INET6_AH is not set
+CONFIG_INET6_ESP=m
+# CONFIG_INET6_ESP_OFFLOAD is not set
+# CONFIG_INET6_ESPINTCP is not set
+# CONFIG_INET6_IPCOMP is not set
+# CONFIG_IPV6_MIP6 is not set
+# CONFIG_IPV6_ILA is not set
+CONFIG_INET6_TUNNEL=m
+# CONFIG_IPV6_VTI is not set
+# CONFIG_IPV6_SIT is not set
+CONFIG_IPV6_TUNNEL=m
+CONFIG_IPV6_GRE=m
+CONFIG_IPV6_FOU=m
+CONFIG_IPV6_FOU_TUNNEL=m
+CONFIG_IPV6_MULTIPLE_TABLES=y
+# CONFIG_IPV6_SUBTREES is not set
+# CONFIG_IPV6_MROUTE is not set
+# CONFIG_IPV6_SEG6_LWTUNNEL is not set
+# CONFIG_IPV6_SEG6_HMAC is not set
+# CONFIG_IPV6_RPL_LWTUNNEL is not set
+# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
+# CONFIG_NETLABEL is not set
+# CONFIG_MPTCP is not set
+CONFIG_NETWORK_SECMARK=y
+CONFIG_NET_PTP_CLASSIFY=y
+# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
+CONFIG_NETFILTER=y
+CONFIG_NETFILTER_ADVANCED=y
+CONFIG_BRIDGE_NETFILTER=m
+
+#
+# Core Netfilter Configuration
+#
+CONFIG_NETFILTER_INGRESS=y
+CONFIG_NETFILTER_EGRESS=y
+CONFIG_NETFILTER_SKIP_EGRESS=y
+CONFIG_NETFILTER_NETLINK=y
+CONFIG_NETFILTER_FAMILY_BRIDGE=y
+CONFIG_NETFILTER_FAMILY_ARP=y
+# CONFIG_NETFILTER_NETLINK_HOOK is not set
+CONFIG_NETFILTER_NETLINK_ACCT=m
+CONFIG_NETFILTER_NETLINK_QUEUE=y
+CONFIG_NETFILTER_NETLINK_LOG=y
+CONFIG_NETFILTER_NETLINK_OSF=m
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_LOG_SYSLOG=m
+CONFIG_NETFILTER_CONNCOUNT=m
+CONFIG_NF_CONNTRACK_MARK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_ZONES=y
+CONFIG_NF_CONNTRACK_PROCFS=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CONNTRACK_TIMEOUT=y
+CONFIG_NF_CONNTRACK_TIMESTAMP=y
+CONFIG_NF_CONNTRACK_LABELS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_GRE=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_H323=m
+CONFIG_NF_CONNTRACK_IRC=m
+CONFIG_NF_CONNTRACK_BROADCAST=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_SNMP=m
+CONFIG_NF_CONNTRACK_PPTP=m
+CONFIG_NF_CONNTRACK_SANE=m
+CONFIG_NF_CONNTRACK_SIP=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_NF_NAT=m
+CONFIG_NF_NAT_AMANDA=m
+CONFIG_NF_NAT_FTP=m
+CONFIG_NF_NAT_IRC=m
+CONFIG_NF_NAT_SIP=m
+CONFIG_NF_NAT_TFTP=m
+CONFIG_NF_NAT_REDIRECT=y
+CONFIG_NF_NAT_MASQUERADE=y
+CONFIG_NETFILTER_SYNPROXY=y
+CONFIG_NF_TABLES=m
+CONFIG_NF_TABLES_INET=y
+# CONFIG_NF_TABLES_NETDEV is not set
+CONFIG_NFT_NUMGEN=m
+CONFIG_NFT_CT=m
+# CONFIG_NFT_FLOW_OFFLOAD is not set
+CONFIG_NFT_CONNLIMIT=m
+CONFIG_NFT_LOG=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_MASQ=m
+CONFIG_NFT_REDIR=m
+CONFIG_NFT_NAT=m
+CONFIG_NFT_TUNNEL=m
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_QUEUE=m
+CONFIG_NFT_QUOTA=m
+CONFIG_NFT_REJECT=m
+CONFIG_NFT_REJECT_INET=m
+CONFIG_NFT_COMPAT=m
+CONFIG_NFT_HASH=m
+CONFIG_NFT_XFRM=m
+CONFIG_NFT_SOCKET=m
+CONFIG_NFT_OSF=m
+CONFIG_NFT_TPROXY=m
+CONFIG_NFT_SYNPROXY=m
+# CONFIG_NF_FLOW_TABLE_INET is not set
+CONFIG_NF_FLOW_TABLE=m
+# CONFIG_NF_FLOW_TABLE_PROCFS is not set
+CONFIG_NETFILTER_XTABLES=y
+CONFIG_NETFILTER_XTABLES_COMPAT=y
+
+#
+# Xtables combined modules
+#
+CONFIG_NETFILTER_XT_MARK=m
+CONFIG_NETFILTER_XT_CONNMARK=m
+CONFIG_NETFILTER_XT_SET=m
+
+#
+# Xtables targets
+#
+CONFIG_NETFILTER_XT_TARGET_AUDIT=m
+CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
+CONFIG_NETFILTER_XT_TARGET_CT=m
+CONFIG_NETFILTER_XT_TARGET_DSCP=m
+CONFIG_NETFILTER_XT_TARGET_HL=m
+CONFIG_NETFILTER_XT_TARGET_HMARK=m
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
+CONFIG_NETFILTER_XT_TARGET_LOG=m
+CONFIG_NETFILTER_XT_TARGET_MARK=m
+CONFIG_NETFILTER_XT_NAT=m
+CONFIG_NETFILTER_XT_TARGET_NETMAP=m
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
+CONFIG_NETFILTER_XT_TARGET_RATEEST=m
+CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
+CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
+CONFIG_NETFILTER_XT_TARGET_TEE=m
+CONFIG_NETFILTER_XT_TARGET_TPROXY=m
+CONFIG_NETFILTER_XT_TARGET_TRACE=m
+CONFIG_NETFILTER_XT_TARGET_SECMARK=m
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
+CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
+
+#
+# Xtables matches
+#
+CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
+CONFIG_NETFILTER_XT_MATCH_BPF=m
+CONFIG_NETFILTER_XT_MATCH_CGROUP=m
+CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
+CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_CPU=m
+CONFIG_NETFILTER_XT_MATCH_DCCP=m
+CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
+CONFIG_NETFILTER_XT_MATCH_DSCP=m
+CONFIG_NETFILTER_XT_MATCH_ECN=m
+CONFIG_NETFILTER_XT_MATCH_ESP=m
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_HELPER=m
+CONFIG_NETFILTER_XT_MATCH_HL=m
+CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
+CONFIG_NETFILTER_XT_MATCH_IPVS=m
+CONFIG_NETFILTER_XT_MATCH_L2TP=m
+CONFIG_NETFILTER_XT_MATCH_LENGTH=m
+CONFIG_NETFILTER_XT_MATCH_LIMIT=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_MARK=m
+CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
+CONFIG_NETFILTER_XT_MATCH_NFACCT=m
+CONFIG_NETFILTER_XT_MATCH_OSF=m
+CONFIG_NETFILTER_XT_MATCH_OWNER=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_QUOTA=m
+CONFIG_NETFILTER_XT_MATCH_RATEEST=m
+CONFIG_NETFILTER_XT_MATCH_REALM=y
+CONFIG_NETFILTER_XT_MATCH_RECENT=m
+CONFIG_NETFILTER_XT_MATCH_SCTP=m
+CONFIG_NETFILTER_XT_MATCH_SOCKET=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
+CONFIG_NETFILTER_XT_MATCH_STRING=m
+CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
+CONFIG_NETFILTER_XT_MATCH_TIME=m
+CONFIG_NETFILTER_XT_MATCH_U32=m
+# end of Core Netfilter Configuration
+
+CONFIG_IP_SET=m
+CONFIG_IP_SET_MAX=256
+CONFIG_IP_SET_BITMAP_IP=m
+CONFIG_IP_SET_BITMAP_IPMAC=m
+CONFIG_IP_SET_BITMAP_PORT=m
+CONFIG_IP_SET_HASH_IP=m
+CONFIG_IP_SET_HASH_IPMARK=m
+CONFIG_IP_SET_HASH_IPPORT=m
+CONFIG_IP_SET_HASH_IPPORTIP=m
+CONFIG_IP_SET_HASH_IPPORTNET=m
+# CONFIG_IP_SET_HASH_IPMAC is not set
+CONFIG_IP_SET_HASH_MAC=m
+CONFIG_IP_SET_HASH_NETPORTNET=m
+CONFIG_IP_SET_HASH_NET=m
+CONFIG_IP_SET_HASH_NETNET=m
+CONFIG_IP_SET_HASH_NETPORT=m
+CONFIG_IP_SET_HASH_NETIFACE=m
+CONFIG_IP_SET_LIST_SET=m
+CONFIG_IP_VS=m
+# CONFIG_IP_VS_IPV6 is not set
+# CONFIG_IP_VS_DEBUG is not set
+CONFIG_IP_VS_TAB_BITS=12
+
+#
+# IPVS transport protocol load balancing support
+#
+CONFIG_IP_VS_PROTO_TCP=y
+CONFIG_IP_VS_PROTO_UDP=y
+CONFIG_IP_VS_PROTO_AH_ESP=y
+CONFIG_IP_VS_PROTO_ESP=y
+CONFIG_IP_VS_PROTO_AH=y
+CONFIG_IP_VS_PROTO_SCTP=y
+
+#
+# IPVS scheduler
+#
+CONFIG_IP_VS_RR=m
+CONFIG_IP_VS_WRR=m
+CONFIG_IP_VS_LC=m
+CONFIG_IP_VS_WLC=m
+CONFIG_IP_VS_FO=m
+CONFIG_IP_VS_OVF=m
+CONFIG_IP_VS_LBLC=m
+CONFIG_IP_VS_LBLCR=m
+CONFIG_IP_VS_DH=m
+CONFIG_IP_VS_SH=m
+# CONFIG_IP_VS_MH is not set
+CONFIG_IP_VS_SED=m
+CONFIG_IP_VS_NQ=m
+# CONFIG_IP_VS_TWOS is not set
+
+#
+# IPVS SH scheduler
+#
+CONFIG_IP_VS_SH_TAB_BITS=8
+
+#
+# IPVS MH scheduler
+#
+CONFIG_IP_VS_MH_TAB_INDEX=12
+
+#
+# IPVS application helper
+#
+CONFIG_IP_VS_FTP=m
+CONFIG_IP_VS_NFCT=y
+CONFIG_IP_VS_PE_SIP=m
+
+#
+# IP: Netfilter Configuration
+#
+CONFIG_NF_DEFRAG_IPV4=y
+CONFIG_NF_SOCKET_IPV4=m
+CONFIG_NF_TPROXY_IPV4=m
+CONFIG_NF_TABLES_IPV4=y
+CONFIG_NFT_REJECT_IPV4=m
+# CONFIG_NFT_DUP_IPV4 is not set
+# CONFIG_NFT_FIB_IPV4 is not set
+# CONFIG_NF_TABLES_ARP is not set
+CONFIG_NF_DUP_IPV4=m
+CONFIG_NF_LOG_ARP=m
+CONFIG_NF_LOG_IPV4=m
+CONFIG_NF_REJECT_IPV4=y
+CONFIG_NF_NAT_SNMP_BASIC=m
+CONFIG_NF_NAT_PPTP=m
+CONFIG_NF_NAT_H323=m
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_RPFILTER=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_TARGET_SYNPROXY=m
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_TARGET_CLUSTERIP=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_SECURITY=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+# end of IP: Netfilter Configuration
+
+#
+# IPv6: Netfilter Configuration
+#
+CONFIG_NF_SOCKET_IPV6=m
+CONFIG_NF_TPROXY_IPV6=m
+CONFIG_NF_TABLES_IPV6=y
+CONFIG_NFT_REJECT_IPV6=m
+# CONFIG_NFT_DUP_IPV6 is not set
+# CONFIG_NFT_FIB_IPV6 is not set
+CONFIG_NF_DUP_IPV6=m
+CONFIG_NF_REJECT_IPV6=y
+CONFIG_NF_LOG_IPV6=m
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=m
+# CONFIG_IP6_NF_MATCH_EUI64 is not set
+# CONFIG_IP6_NF_MATCH_FRAG is not set
+# CONFIG_IP6_NF_MATCH_OPTS is not set
+# CONFIG_IP6_NF_MATCH_HL is not set
+# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
+# CONFIG_IP6_NF_MATCH_MH is not set
+CONFIG_IP6_NF_MATCH_RPFILTER=m
+# CONFIG_IP6_NF_MATCH_RT is not set
+# CONFIG_IP6_NF_MATCH_SRH is not set
+# CONFIG_IP6_NF_TARGET_HL is not set
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_TARGET_SYNPROXY=y
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_RAW=m
+CONFIG_IP6_NF_SECURITY=m
+CONFIG_IP6_NF_NAT=m
+CONFIG_IP6_NF_TARGET_MASQUERADE=m
+# CONFIG_IP6_NF_TARGET_NPT is not set
+# end of IPv6: Netfilter Configuration
+
+CONFIG_NF_DEFRAG_IPV6=y
+# CONFIG_NF_TABLES_BRIDGE is not set
+# CONFIG_NF_CONNTRACK_BRIDGE is not set
+CONFIG_BRIDGE_NF_EBTABLES=m
+CONFIG_BRIDGE_EBT_BROUTE=m
+CONFIG_BRIDGE_EBT_T_FILTER=m
+CONFIG_BRIDGE_EBT_T_NAT=m
+CONFIG_BRIDGE_EBT_802_3=m
+CONFIG_BRIDGE_EBT_AMONG=m
+CONFIG_BRIDGE_EBT_ARP=m
+CONFIG_BRIDGE_EBT_IP=m
+# CONFIG_BRIDGE_EBT_IP6 is not set
+CONFIG_BRIDGE_EBT_LIMIT=m
+CONFIG_BRIDGE_EBT_MARK=m
+CONFIG_BRIDGE_EBT_PKTTYPE=m
+CONFIG_BRIDGE_EBT_STP=m
+CONFIG_BRIDGE_EBT_VLAN=m
+CONFIG_BRIDGE_EBT_ARPREPLY=m
+CONFIG_BRIDGE_EBT_DNAT=m
+CONFIG_BRIDGE_EBT_MARK_T=m
+CONFIG_BRIDGE_EBT_REDIRECT=m
+CONFIG_BRIDGE_EBT_SNAT=m
+CONFIG_BRIDGE_EBT_LOG=m
+CONFIG_BRIDGE_EBT_NFLOG=m
+# CONFIG_BPFILTER is not set
+# CONFIG_IP_DCCP is not set
+# CONFIG_IP_SCTP is not set
+# CONFIG_RDS is not set
+# CONFIG_TIPC is not set
+# CONFIG_ATM is not set
+# CONFIG_L2TP is not set
+CONFIG_STP=y
+CONFIG_BRIDGE=y
+CONFIG_BRIDGE_IGMP_SNOOPING=y
+CONFIG_BRIDGE_VLAN_FILTERING=y
+# CONFIG_BRIDGE_MRP is not set
+# CONFIG_BRIDGE_CFM is not set
+# CONFIG_NET_DSA is not set
+CONFIG_VLAN_8021Q=m
+# CONFIG_VLAN_8021Q_GVRP is not set
+# CONFIG_VLAN_8021Q_MVRP is not set
+CONFIG_LLC=y
+# CONFIG_LLC2 is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_PHONET is not set
+# CONFIG_6LOWPAN is not set
+# CONFIG_IEEE802154 is not set
+CONFIG_NET_SCHED=y
+
+#
+# Queueing/Scheduling
+#
+CONFIG_NET_SCH_CBQ=m
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_HFSC=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_MULTIQ=m
+CONFIG_NET_SCH_RED=m
+CONFIG_NET_SCH_SFB=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TEQL=m
+CONFIG_NET_SCH_TBF=m
+# CONFIG_NET_SCH_CBS is not set
+# CONFIG_NET_SCH_ETF is not set
+# CONFIG_NET_SCH_TAPRIO is not set
+CONFIG_NET_SCH_GRED=m
+CONFIG_NET_SCH_DSMARK=m
+CONFIG_NET_SCH_NETEM=m
+CONFIG_NET_SCH_DRR=m
+CONFIG_NET_SCH_MQPRIO=m
+# CONFIG_NET_SCH_SKBPRIO is not set
+CONFIG_NET_SCH_CHOKE=m
+CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
+# CONFIG_NET_SCH_CAKE is not set
+CONFIG_NET_SCH_FQ=m
+CONFIG_NET_SCH_HHF=m
+CONFIG_NET_SCH_PIE=m
+# CONFIG_NET_SCH_FQ_PIE is not set
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_SCH_PLUG=m
+# CONFIG_NET_SCH_ETS is not set
+# CONFIG_NET_SCH_DEFAULT is not set
+
+#
+# Classification
+#
+CONFIG_NET_CLS=y
+CONFIG_NET_CLS_BASIC=m
+CONFIG_NET_CLS_ROUTE4=m
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+# CONFIG_CLS_U32_PERF is not set
+CONFIG_CLS_U32_MARK=y
+CONFIG_NET_CLS_RSVP=m
+CONFIG_NET_CLS_RSVP6=m
+# CONFIG_NET_CLS_FLOW is not set
+CONFIG_NET_CLS_CGROUP=m
+CONFIG_NET_CLS_BPF=m
+# CONFIG_NET_CLS_FLOWER is not set
+# CONFIG_NET_CLS_MATCHALL is not set
+# CONFIG_NET_EMATCH is not set
+CONFIG_NET_CLS_ACT=y
+# CONFIG_NET_ACT_POLICE is not set
+CONFIG_NET_ACT_GACT=m
+# CONFIG_GACT_PROB is not set
+CONFIG_NET_ACT_MIRRED=y
+# CONFIG_NET_ACT_SAMPLE is not set
+# CONFIG_NET_ACT_IPT is not set
+CONFIG_NET_ACT_NAT=m
+CONFIG_NET_ACT_PEDIT=y
+# CONFIG_NET_ACT_SIMP is not set
+# CONFIG_NET_ACT_SKBEDIT is not set
+# CONFIG_NET_ACT_CSUM is not set
+# CONFIG_NET_ACT_MPLS is not set
+# CONFIG_NET_ACT_VLAN is not set
+# CONFIG_NET_ACT_BPF is not set
+# CONFIG_NET_ACT_CONNMARK is not set
+# CONFIG_NET_ACT_CTINFO is not set
+# CONFIG_NET_ACT_SKBMOD is not set
+# CONFIG_NET_ACT_IFE is not set
+# CONFIG_NET_ACT_TUNNEL_KEY is not set
+# CONFIG_NET_ACT_CT is not set
+# CONFIG_NET_ACT_GATE is not set
+# CONFIG_NET_TC_SKB_EXT is not set
+CONFIG_NET_SCH_FIFO=y
+# CONFIG_DCB is not set
+CONFIG_DNS_RESOLVER=m
+# CONFIG_BATMAN_ADV is not set
+# CONFIG_OPENVSWITCH is not set
+CONFIG_VSOCKETS=y
+CONFIG_VSOCKETS_DIAG=y
+CONFIG_VSOCKETS_LOOPBACK=y
+CONFIG_VIRTIO_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS_COMMON=y
+CONFIG_NETLINK_DIAG=m
+# CONFIG_MPLS is not set
+# CONFIG_NET_NSH is not set
+# CONFIG_HSR is not set
+# CONFIG_NET_SWITCHDEV is not set
+CONFIG_NET_L3_MASTER_DEV=y
+# CONFIG_QRTR is not set
+# CONFIG_NET_NCSI is not set
+CONFIG_PCPU_DEV_REFCNT=y
+CONFIG_MAX_SKB_FRAGS=17
+CONFIG_RPS=y
+CONFIG_RFS_ACCEL=y
+CONFIG_SOCK_RX_QUEUE_MAPPING=y
+CONFIG_XPS=y
+CONFIG_CGROUP_NET_PRIO=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_NET_RX_BUSY_POLL=y
+CONFIG_BQL=y
+CONFIG_BPF_STREAM_PARSER=y
+CONFIG_NET_FLOW_LIMIT=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_NET_DROP_MONITOR is not set
+# end of Network testing
+# end of Networking options
+
+# CONFIG_HAMRADIO is not set
+# CONFIG_CAN is not set
+# CONFIG_BT is not set
+# CONFIG_AF_RXRPC is not set
+# CONFIG_AF_KCM is not set
+CONFIG_STREAM_PARSER=y
+# CONFIG_MCTP is not set
+CONFIG_FIB_RULES=y
+# CONFIG_WIRELESS is not set
+# CONFIG_RFKILL is not set
+CONFIG_NET_9P=y
+CONFIG_NET_9P_FD=y
+CONFIG_NET_9P_VIRTIO=y
+# CONFIG_NET_9P_XEN is not set
+# CONFIG_NET_9P_DEBUG is not set
+# CONFIG_CAIF is not set
+# CONFIG_CEPH_LIB is not set
+# CONFIG_NFC is not set
+# CONFIG_PSAMPLE is not set
+# CONFIG_NET_IFE is not set
+CONFIG_LWTUNNEL=y
+CONFIG_LWTUNNEL_BPF=y
+CONFIG_DST_CACHE=y
+CONFIG_GRO_CELLS=y
+CONFIG_SOCK_VALIDATE_XMIT=y
+CONFIG_NET_SELFTESTS=y
+CONFIG_NET_SOCK_MSG=y
+CONFIG_NET_DEVLINK=y
+CONFIG_PAGE_POOL=y
+# CONFIG_PAGE_POOL_STATS is not set
+CONFIG_FAILOVER=y
+CONFIG_ETHTOOL_NETLINK=y
+
+#
+# Device Drivers
+#
+CONFIG_ARM_AMBA=y
+CONFIG_HAVE_PCI=y
+CONFIG_PCI=y
+CONFIG_PCI_DOMAINS=y
+CONFIG_PCI_DOMAINS_GENERIC=y
+CONFIG_PCI_SYSCALL=y
+CONFIG_PCIEPORTBUS=y
+CONFIG_HOTPLUG_PCI_PCIE=y
+CONFIG_PCIEAER=y
+# CONFIG_PCIEAER_INJECT is not set
+# CONFIG_PCIE_ECRC is not set
+CONFIG_PCIEASPM=y
+CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
+# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIE_PME=y
+# CONFIG_PCIE_DPC is not set
+# CONFIG_PCIE_PTM is not set
+CONFIG_PCI_MSI=y
+CONFIG_PCI_MSI_IRQ_DOMAIN=y
+CONFIG_PCI_QUIRKS=y
+# CONFIG_PCI_DEBUG is not set
+# CONFIG_PCI_STUB is not set
+CONFIG_PCI_ECAM=y
+# CONFIG_PCI_IOV is not set
+# CONFIG_PCI_PRI is not set
+# CONFIG_PCI_PASID is not set
+CONFIG_PCI_LABEL=y
+# CONFIG_PCIE_BUS_TUNE_OFF is not set
+CONFIG_PCIE_BUS_DEFAULT=y
+# CONFIG_PCIE_BUS_SAFE is not set
+# CONFIG_PCIE_BUS_PERFORMANCE is not set
+# CONFIG_PCIE_BUS_PEER2PEER is not set
+# CONFIG_VGA_ARB is not set
+CONFIG_HOTPLUG_PCI=y
+CONFIG_HOTPLUG_PCI_ACPI=y
+# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
+# CONFIG_HOTPLUG_PCI_CPCI is not set
+# CONFIG_HOTPLUG_PCI_SHPC is not set
+
+#
+# PCI controller drivers
+#
+# CONFIG_PCI_FTPCI100 is not set
+# CONFIG_PCI_HOST_GENERIC is not set
+# CONFIG_PCIE_XILINX is not set
+# CONFIG_PCI_XGENE is not set
+# CONFIG_PCIE_ALTERA is not set
+# CONFIG_PCI_HOST_THUNDER_PEM is not set
+# CONFIG_PCI_HOST_THUNDER_ECAM is not set
+# CONFIG_PCIE_MICROCHIP_HOST is not set
+
+#
+# DesignWare PCI Core Support
+#
+# CONFIG_PCIE_DW_PLAT_HOST is not set
+# CONFIG_PCI_HISI is not set
+# CONFIG_PCIE_KIRIN is not set
+# CONFIG_PCI_MESON is not set
+# CONFIG_PCIE_AL is not set
+# end of DesignWare PCI Core Support
+
+#
+# Mobiveil PCIe Core Support
+#
+# end of Mobiveil PCIe Core Support
+
+#
+# Cadence PCIe controllers support
+#
+# CONFIG_PCIE_CADENCE_PLAT_HOST is not set
+# CONFIG_PCI_J721E_HOST is not set
+# end of Cadence PCIe controllers support
+# end of PCI controller drivers
+
+#
+# PCI Endpoint
+#
+# CONFIG_PCI_ENDPOINT is not set
+# end of PCI Endpoint
+
+#
+# PCI switch controller drivers
+#
+# CONFIG_PCI_SW_SWITCHTEC is not set
+# end of PCI switch controller drivers
+
+# CONFIG_CXL_BUS is not set
+# CONFIG_PCCARD is not set
+# CONFIG_RAPIDIO is not set
+
+#
+# Generic Driver Options
+#
+CONFIG_AUXILIARY_BUS=y
+CONFIG_UEVENT_HELPER=y
+CONFIG_UEVENT_HELPER_PATH=""
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DEVTMPFS_SAFE=y
+CONFIG_STANDALONE=y
+CONFIG_PREVENT_FIRMWARE_BUILD=y
+
+#
+# Firmware loader
+#
+CONFIG_FW_LOADER=y
+CONFIG_EXTRA_FIRMWARE=""
+# CONFIG_FW_LOADER_USER_HELPER is not set
+# CONFIG_FW_LOADER_COMPRESS is not set
+CONFIG_FW_CACHE=y
+# CONFIG_FW_UPLOAD is not set
+# end of Firmware loader
+
+CONFIG_ALLOW_DEV_COREDUMP=y
+# CONFIG_DEBUG_DRIVER is not set
+CONFIG_DEBUG_DEVRES=y
+# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
+# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
+CONFIG_SYS_HYPERVISOR=y
+CONFIG_GENERIC_CPU_AUTOPROBE=y
+CONFIG_GENERIC_CPU_VULNERABILITIES=y
+CONFIG_SOC_BUS=y
+CONFIG_DMA_SHARED_BUFFER=y
+# CONFIG_DMA_FENCE_TRACE is not set
+CONFIG_GENERIC_ARCH_TOPOLOGY=y
+CONFIG_GENERIC_ARCH_NUMA=y
+# end of Generic Driver Options
+
+#
+# Bus devices
+#
+# CONFIG_BRCMSTB_GISB_ARB is not set
+# CONFIG_VEXPRESS_CONFIG is not set
+# CONFIG_MHI_BUS is not set
+# CONFIG_MHI_BUS_EP is not set
+# end of Bus devices
+
+CONFIG_CONNECTOR=y
+CONFIG_PROC_EVENTS=y
+
+#
+# Firmware Drivers
+#
+
+#
+# ARM System Control and Management Interface Protocol
+#
+# CONFIG_ARM_SCMI_PROTOCOL is not set
+# end of ARM System Control and Management Interface Protocol
+
+# CONFIG_ARM_SCPI_PROTOCOL is not set
+# CONFIG_FIRMWARE_MEMMAP is not set
+CONFIG_DMIID=y
+# CONFIG_DMI_SYSFS is not set
+# CONFIG_ISCSI_IBFT is not set
+# CONFIG_FW_CFG_SYSFS is not set
+# CONFIG_SYSFB_SIMPLEFB is not set
+# CONFIG_ARM_FFA_TRANSPORT is not set
+# CONFIG_GOOGLE_FIRMWARE is not set
+
+#
+# EFI (Extensible Firmware Interface) Support
+#
+CONFIG_EFI_ESRT=y
+CONFIG_EFI_VARS_PSTORE=y
+# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
+CONFIG_EFI_PARAMS_FROM_FDT=y
+CONFIG_EFI_RUNTIME_WRAPPERS=y
+CONFIG_EFI_GENERIC_STUB=y
+# CONFIG_EFI_ZBOOT is not set
+CONFIG_EFI_ARMSTUB_DTB_LOADER=y
+CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
+# CONFIG_EFI_BOOTLOADER_CONTROL is not set
+# CONFIG_EFI_CAPSULE_LOADER is not set
+# CONFIG_EFI_TEST is not set
+# CONFIG_RESET_ATTACK_MITIGATION is not set
+# CONFIG_EFI_DISABLE_PCI_DMA is not set
+CONFIG_EFI_EARLYCON=y
+# CONFIG_EFI_CUSTOM_SSDT_OVERLAYS is not set
+# CONFIG_EFI_DISABLE_RUNTIME is not set
+# CONFIG_EFI_COCO_SECRET is not set
+# end of EFI (Extensible Firmware Interface) Support
+
+CONFIG_ARM_PSCI_FW=y
+# CONFIG_ARM_PSCI_CHECKER is not set
+CONFIG_HAVE_ARM_SMCCC=y
+CONFIG_HAVE_ARM_SMCCC_DISCOVERY=y
+CONFIG_ARM_SMCCC_SOC_ID=y
+
+#
+# Tegra firmware driver
+#
+# end of Tegra firmware driver
+# end of Firmware Drivers
+
+# CONFIG_GNSS is not set
+# CONFIG_MTD is not set
+CONFIG_DTC=y
+CONFIG_OF=y
+# CONFIG_OF_UNITTEST is not set
+CONFIG_OF_FLATTREE=y
+CONFIG_OF_EARLY_FLATTREE=y
+CONFIG_OF_KOBJ=y
+CONFIG_OF_ADDRESS=y
+CONFIG_OF_IRQ=y
+CONFIG_OF_RESERVED_MEM=y
+# CONFIG_OF_OVERLAY is not set
+CONFIG_OF_NUMA=y
+# CONFIG_PARPORT is not set
+CONFIG_PNP=y
+CONFIG_PNP_DEBUG_MESSAGES=y
+
+#
+# Protocols
+#
+CONFIG_PNPACPI=y
+CONFIG_BLK_DEV=y
+# CONFIG_BLK_DEV_NULL_BLK is not set
+CONFIG_CDROM=y
+# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
+CONFIG_ZRAM=m
+CONFIG_ZRAM_DEF_COMP_LZORLE=y
+# CONFIG_ZRAM_DEF_COMP_LZ4 is not set
+# CONFIG_ZRAM_DEF_COMP_LZO is not set
+CONFIG_ZRAM_DEF_COMP="lzo-rle"
+# CONFIG_ZRAM_WRITEBACK is not set
+# CONFIG_ZRAM_MEMORY_TRACKING is not set
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
+# CONFIG_BLK_DEV_DRBD is not set
+# CONFIG_BLK_DEV_NBD is not set
+# CONFIG_BLK_DEV_RAM is not set
+# CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set
+CONFIG_XEN_BLKDEV_FRONTEND=y
+CONFIG_XEN_BLKDEV_BACKEND=m
+CONFIG_VIRTIO_BLK=m
+# CONFIG_BLK_DEV_RBD is not set
+# CONFIG_BLK_DEV_UBLK is not set
+
+#
+# NVME Support
+#
+CONFIG_NVME_CORE=y
+CONFIG_BLK_DEV_NVME=y
+# CONFIG_NVME_MULTIPATH is not set
+# CONFIG_NVME_VERBOSE_ERRORS is not set
+# CONFIG_NVME_FC is not set
+# CONFIG_NVME_TCP is not set
+# CONFIG_NVME_AUTH is not set
+# CONFIG_NVME_TARGET is not set
+# end of NVME Support
+
+#
+# Misc devices
+#
+# CONFIG_DUMMY_IRQ is not set
+# CONFIG_PHANTOM is not set
+# CONFIG_TIFM_CORE is not set
+# CONFIG_ENCLOSURE_SERVICES is not set
+# CONFIG_HP_ILO is not set
+# CONFIG_SRAM is not set
+# CONFIG_DW_XDATA_PCIE is not set
+# CONFIG_PCI_ENDPOINT_TEST is not set
+# CONFIG_XILINX_SDFEC is not set
+# CONFIG_OPEN_DICE is not set
+# CONFIG_VCPU_STALL_DETECTOR is not set
+# CONFIG_C2PORT is not set
+
+#
+# EEPROM support
+#
+CONFIG_EEPROM_93CX6=m
+# end of EEPROM support
+
+# CONFIG_CB710_CORE is not set
+
+#
+# Texas Instruments shared transport line discipline
+#
+# end of Texas Instruments shared transport line discipline
+
+#
+# Altera FPGA firmware download module (requires I2C)
+#
+# CONFIG_VMWARE_VMCI is not set
+# CONFIG_GENWQE is not set
+# CONFIG_ECHO is not set
+# CONFIG_BCM_VK is not set
+# CONFIG_MISC_ALCOR_PCI is not set
+# CONFIG_MISC_RTSX_PCI is not set
+# CONFIG_HABANA_AI is not set
+# CONFIG_UACCE is not set
+CONFIG_PVPANIC=y
+CONFIG_PVPANIC_MMIO=y
+# CONFIG_PVPANIC_PCI is not set
+# end of Misc devices
+
+#
+# SCSI device support
+#
+CONFIG_SCSI_MOD=y
+# CONFIG_RAID_ATTRS is not set
+CONFIG_SCSI_COMMON=y
+CONFIG_SCSI=y
+CONFIG_SCSI_DMA=y
+CONFIG_SCSI_PROC_FS=y
+
+#
+# SCSI support type (disk, tape, CD-ROM)
+#
+CONFIG_BLK_DEV_SD=y
+# CONFIG_CHR_DEV_ST is not set
+CONFIG_BLK_DEV_SR=y
+# CONFIG_CHR_DEV_SG is not set
+CONFIG_BLK_DEV_BSG=y
+# CONFIG_CHR_DEV_SCH is not set
+CONFIG_SCSI_CONSTANTS=y
+# CONFIG_SCSI_LOGGING is not set
+# CONFIG_SCSI_SCAN_ASYNC is not set
+
+#
+# SCSI Transports
+#
+CONFIG_SCSI_SPI_ATTRS=y
+# CONFIG_SCSI_FC_ATTRS is not set
+CONFIG_SCSI_ISCSI_ATTRS=m
+# CONFIG_SCSI_SAS_ATTRS is not set
+# CONFIG_SCSI_SAS_LIBSAS is not set
+# CONFIG_SCSI_SRP_ATTRS is not set
+# end of SCSI Transports
+
+CONFIG_SCSI_LOWLEVEL=y
+CONFIG_ISCSI_TCP=m
+# CONFIG_ISCSI_BOOT_SYSFS is not set
+# CONFIG_SCSI_CXGB3_ISCSI is not set
+# CONFIG_SCSI_CXGB4_ISCSI is not set
+# CONFIG_SCSI_BNX2_ISCSI is not set
+# CONFIG_BE2ISCSI is not set
+# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
+# CONFIG_SCSI_HPSA is not set
+# CONFIG_SCSI_3W_9XXX is not set
+# CONFIG_SCSI_3W_SAS is not set
+# CONFIG_SCSI_ACARD is not set
+# CONFIG_SCSI_AACRAID is not set
+# CONFIG_SCSI_AIC7XXX is not set
+# CONFIG_SCSI_AIC79XX is not set
+# CONFIG_SCSI_AIC94XX is not set
+# CONFIG_SCSI_HISI_SAS is not set
+# CONFIG_SCSI_MVSAS is not set
+# CONFIG_SCSI_MVUMI is not set
+# CONFIG_SCSI_ADVANSYS is not set
+# CONFIG_SCSI_ARCMSR is not set
+# CONFIG_SCSI_ESAS2R is not set
+# CONFIG_MEGARAID_NEWGEN is not set
+# CONFIG_MEGARAID_LEGACY is not set
+# CONFIG_MEGARAID_SAS is not set
+# CONFIG_SCSI_MPT3SAS is not set
+# CONFIG_SCSI_MPT2SAS is not set
+# CONFIG_SCSI_MPI3MR is not set
+# CONFIG_SCSI_SMARTPQI is not set
+# CONFIG_SCSI_HPTIOP is not set
+# CONFIG_SCSI_BUSLOGIC is not set
+# CONFIG_SCSI_MYRB is not set
+# CONFIG_SCSI_MYRS is not set
+CONFIG_XEN_SCSI_FRONTEND=m
+# CONFIG_SCSI_SNIC is not set
+# CONFIG_SCSI_DMX3191D is not set
+# CONFIG_SCSI_FDOMAIN_PCI is not set
+# CONFIG_SCSI_IPS is not set
+# CONFIG_SCSI_INITIO is not set
+# CONFIG_SCSI_INIA100 is not set
+# CONFIG_SCSI_STEX is not set
+# CONFIG_SCSI_SYM53C8XX_2 is not set
+# CONFIG_SCSI_IPR is not set
+# CONFIG_SCSI_QLOGIC_1280 is not set
+# CONFIG_SCSI_QLA_ISCSI is not set
+# CONFIG_SCSI_DC395x is not set
+# CONFIG_SCSI_AM53C974 is not set
+# CONFIG_SCSI_WD719X is not set
+# CONFIG_SCSI_DEBUG is not set
+# CONFIG_SCSI_PMCRAID is not set
+# CONFIG_SCSI_PM8001 is not set
+CONFIG_SCSI_VIRTIO=y
+# CONFIG_SCSI_DH is not set
+# end of SCSI device support
+
+CONFIG_ATA=y
+CONFIG_SATA_HOST=y
+CONFIG_PATA_TIMINGS=y
+CONFIG_ATA_VERBOSE_ERROR=y
+CONFIG_ATA_FORCE=y
+CONFIG_ATA_ACPI=y
+# CONFIG_SATA_ZPODD is not set
+# CONFIG_SATA_PMP is not set
+
+#
+# Controllers with non-SFF native interface
+#
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_MOBILE_LPM_POLICY=0
+# CONFIG_SATA_AHCI_PLATFORM is not set
+# CONFIG_AHCI_DWC is not set
+# CONFIG_AHCI_CEVA is not set
+# CONFIG_AHCI_QORIQ is not set
+# CONFIG_SATA_INIC162X is not set
+# CONFIG_SATA_ACARD_AHCI is not set
+# CONFIG_SATA_SIL24 is not set
+CONFIG_ATA_SFF=y
+
+#
+# SFF controllers with custom DMA interface
+#
+# CONFIG_PDC_ADMA is not set
+# CONFIG_SATA_QSTOR is not set
+# CONFIG_SATA_SX4 is not set
+CONFIG_ATA_BMDMA=y
+
+#
+# SATA SFF controllers with BMDMA
+#
+CONFIG_ATA_PIIX=y
+# CONFIG_SATA_MV is not set
+# CONFIG_SATA_NV is not set
+# CONFIG_SATA_PROMISE is not set
+# CONFIG_SATA_SIL is not set
+# CONFIG_SATA_SIS is not set
+# CONFIG_SATA_SVW is not set
+# CONFIG_SATA_ULI is not set
+# CONFIG_SATA_VIA is not set
+# CONFIG_SATA_VITESSE is not set
+
+#
+# PATA SFF controllers with BMDMA
+#
+# CONFIG_PATA_ALI is not set
+# CONFIG_PATA_AMD is not set
+# CONFIG_PATA_ARTOP is not set
+# CONFIG_PATA_ATIIXP is not set
+# CONFIG_PATA_ATP867X is not set
+# CONFIG_PATA_CMD64X is not set
+# CONFIG_PATA_CYPRESS is not set
+# CONFIG_PATA_EFAR is not set
+# CONFIG_PATA_HPT366 is not set
+# CONFIG_PATA_HPT37X is not set
+# CONFIG_PATA_HPT3X2N is not set
+# CONFIG_PATA_HPT3X3 is not set
+# CONFIG_PATA_IT8213 is not set
+# CONFIG_PATA_IT821X is not set
+# CONFIG_PATA_JMICRON is not set
+# CONFIG_PATA_MARVELL is not set
+# CONFIG_PATA_NETCELL is not set
+# CONFIG_PATA_NINJA32 is not set
+# CONFIG_PATA_NS87415 is not set
+# CONFIG_PATA_OLDPIIX is not set
+# CONFIG_PATA_OPTIDMA is not set
+# CONFIG_PATA_PDC2027X is not set
+# CONFIG_PATA_PDC_OLD is not set
+# CONFIG_PATA_RADISYS is not set
+# CONFIG_PATA_RDC is not set
+# CONFIG_PATA_SCH is not set
+# CONFIG_PATA_SERVERWORKS is not set
+# CONFIG_PATA_SIL680 is not set
+# CONFIG_PATA_SIS is not set
+# CONFIG_PATA_TOSHIBA is not set
+# CONFIG_PATA_TRIFLEX is not set
+# CONFIG_PATA_VIA is not set
+# CONFIG_PATA_WINBOND is not set
+
+#
+# PIO-only SFF controllers
+#
+# CONFIG_PATA_CMD640_PCI is not set
+# CONFIG_PATA_MPIIX is not set
+# CONFIG_PATA_NS87410 is not set
+# CONFIG_PATA_OPTI is not set
+# CONFIG_PATA_OF_PLATFORM is not set
+# CONFIG_PATA_RZ1000 is not set
+
+#
+# Generic fallback / legacy drivers
+#
+# CONFIG_PATA_ACPI is not set
+CONFIG_ATA_GENERIC=y
+# CONFIG_PATA_LEGACY is not set
+CONFIG_MD=y
+CONFIG_BLK_DEV_MD=y
+CONFIG_MD_AUTODETECT=y
+# CONFIG_MD_LINEAR is not set
+CONFIG_MD_RAID0=y
+CONFIG_MD_RAID1=m
+CONFIG_MD_RAID10=m
+CONFIG_MD_RAID456=m
+# CONFIG_MD_MULTIPATH is not set
+# CONFIG_MD_FAULTY is not set
+CONFIG_BCACHE=m
+# CONFIG_BCACHE_DEBUG is not set
+# CONFIG_BCACHE_CLOSURES_DEBUG is not set
+# CONFIG_BCACHE_ASYNC_REGISTRATION is not set
+CONFIG_BLK_DEV_DM_BUILTIN=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_DEBUG=y
+CONFIG_DM_BUFIO=y
+# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
+CONFIG_DM_BIO_PRISON=m
+CONFIG_DM_PERSISTENT_DATA=m
+# CONFIG_DM_UNSTRIPED is not set
+CONFIG_DM_CRYPT=y
+CONFIG_DM_SNAPSHOT=m
+CONFIG_DM_THIN_PROVISIONING=m
+CONFIG_DM_CACHE=m
+CONFIG_DM_CACHE_SMQ=m
+CONFIG_DM_WRITECACHE=m
+# CONFIG_DM_EBS is not set
+# CONFIG_DM_ERA is not set
+# CONFIG_DM_CLONE is not set
+# CONFIG_DM_MIRROR is not set
+CONFIG_DM_RAID=m
+# CONFIG_DM_ZERO is not set
+CONFIG_DM_MULTIPATH=m
+CONFIG_DM_MULTIPATH_QL=m
+CONFIG_DM_MULTIPATH_ST=m
+CONFIG_DM_MULTIPATH_HST=m
+# CONFIG_DM_MULTIPATH_IOA is not set
+# CONFIG_DM_DELAY is not set
+# CONFIG_DM_DUST is not set
+CONFIG_DM_INIT=y
+# CONFIG_DM_UEVENT is not set
+# CONFIG_DM_FLAKEY is not set
+CONFIG_DM_VERITY=y
+# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
+# CONFIG_DM_VERITY_FEC is not set
+# CONFIG_DM_SWITCH is not set
+# CONFIG_DM_LOG_WRITES is not set
+CONFIG_DM_INTEGRITY=y
+CONFIG_DM_AUDIT=y
+CONFIG_TARGET_CORE=m
+CONFIG_TCM_IBLOCK=m
+CONFIG_TCM_FILEIO=m
+# CONFIG_TCM_PSCSI is not set
+CONFIG_TCM_USER2=m
+CONFIG_LOOPBACK_TARGET=m
+# CONFIG_ISCSI_TARGET is not set
+# CONFIG_FUSION is not set
+
+#
+# IEEE 1394 (FireWire) support
+#
+# CONFIG_FIREWIRE is not set
+# CONFIG_FIREWIRE_NOSY is not set
+# end of IEEE 1394 (FireWire) support
+
+CONFIG_NETDEVICES=y
+CONFIG_NET_CORE=y
+CONFIG_BONDING=m
+CONFIG_DUMMY=m
+CONFIG_WIREGUARD=m
+# CONFIG_WIREGUARD_DEBUG is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_NET_FC is not set
+CONFIG_IFB=m
+# CONFIG_NET_TEAM is not set
+CONFIG_MACVLAN=y
+# CONFIG_MACVTAP is not set
+CONFIG_IPVLAN_L3S=y
+CONFIG_IPVLAN=m
+# CONFIG_IPVTAP is not set
+CONFIG_VXLAN=m
+CONFIG_GENEVE=m
+# CONFIG_BAREUDP is not set
+# CONFIG_GTP is not set
+# CONFIG_AMT is not set
+# CONFIG_MACSEC is not set
+# CONFIG_NETCONSOLE is not set
+CONFIG_TUN=m
+# CONFIG_TUN_VNET_CROSS_LE is not set
+CONFIG_VETH=m
+CONFIG_VIRTIO_NET=y
+# CONFIG_NLMON is not set
+CONFIG_NET_VRF=m
+CONFIG_VSOCKMON=m
+# CONFIG_ARCNET is not set
+CONFIG_ETHERNET=y
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_ALTEON is not set
+# CONFIG_ALTERA_TSE is not set
+CONFIG_NET_VENDOR_AMAZON=y
+CONFIG_ENA_ETHERNET=y
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+CONFIG_NET_VENDOR_ASIX=y
+# CONFIG_NET_VENDOR_ATHEROS is not set
+# CONFIG_NET_VENDOR_BROADCOM is not set
+# CONFIG_NET_VENDOR_CADENCE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+# CONFIG_NET_VENDOR_CHELSIO is not set
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_CORTINA is not set
+CONFIG_NET_VENDOR_DAVICOM=y
+# CONFIG_DNET is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
+# CONFIG_NET_VENDOR_EMULEX is not set
+CONFIG_NET_VENDOR_ENGLEDER=y
+# CONFIG_TSNEP is not set
+# CONFIG_NET_VENDOR_EZCHIP is not set
+CONFIG_NET_VENDOR_FUNGIBLE=y
+# CONFIG_FUN_ETH is not set
+CONFIG_NET_VENDOR_GOOGLE=y
+CONFIG_GVE=m
+CONFIG_NET_VENDOR_HISILICON=y
+# CONFIG_HIX5HD2_GMAC is not set
+# CONFIG_HISI_FEMAC is not set
+# CONFIG_HIP04_ETH is not set
+# CONFIG_HNS_DSAF is not set
+# CONFIG_HNS_ENET is not set
+# CONFIG_HNS3 is not set
+# CONFIG_NET_VENDOR_HUAWEI is not set
+CONFIG_NET_VENDOR_I825XX=y
+CONFIG_NET_VENDOR_INTEL=y
+# CONFIG_E100 is not set
+# CONFIG_E1000 is not set
+# CONFIG_E1000E is not set
+# CONFIG_IGB is not set
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
+CONFIG_IXGBEVF=y
+# CONFIG_I40E is not set
+# CONFIG_I40EVF is not set
+# CONFIG_ICE is not set
+# CONFIG_FM10K is not set
+# CONFIG_IGC is not set
+CONFIG_NET_VENDOR_WANGXUN=y
+# CONFIG_NGBE is not set
+# CONFIG_TXGBE is not set
+# CONFIG_JME is not set
+CONFIG_NET_VENDOR_LITEX=y
+# CONFIG_LITEX_LITEETH is not set
+# CONFIG_NET_VENDOR_MARVELL is not set
+CONFIG_NET_VENDOR_MELLANOX=y
+CONFIG_MLX4_EN=m
+CONFIG_MLX4_CORE=m
+CONFIG_MLX4_DEBUG=y
+CONFIG_MLX4_CORE_GEN2=y
+CONFIG_MLX5_CORE=m
+CONFIG_MLX5_FPGA=y
+CONFIG_MLX5_CORE_EN=y
+CONFIG_MLX5_EN_ARFS=y
+CONFIG_MLX5_EN_RXNFC=y
+CONFIG_MLX5_MPFS=y
+# CONFIG_MLX5_CORE_IPOIB is not set
+# CONFIG_MLX5_EN_TLS is not set
+# CONFIG_MLX5_SF is not set
+CONFIG_MLXSW_CORE=m
+CONFIG_MLXSW_CORE_THERMAL=y
+CONFIG_MLXSW_PCI=m
+CONFIG_MLXFW=m
+# CONFIG_MLXBF_GIGE is not set
+# CONFIG_NET_VENDOR_MICREL is not set
+CONFIG_NET_VENDOR_MICROCHIP=y
+# CONFIG_LAN743X is not set
+# CONFIG_NET_VENDOR_MICROSEMI is not set
+CONFIG_NET_VENDOR_MICROSOFT=y
+# CONFIG_NET_VENDOR_MYRI is not set
+# CONFIG_FEALNX is not set
+# CONFIG_NET_VENDOR_NI is not set
+# CONFIG_NET_VENDOR_NATSEMI is not set
+# CONFIG_NET_VENDOR_NETERION is not set
+# CONFIG_NET_VENDOR_NETRONOME is not set
+# CONFIG_NET_VENDOR_NVIDIA is not set
+# CONFIG_NET_VENDOR_OKI is not set
+# CONFIG_ETHOC is not set
+# CONFIG_NET_VENDOR_PACKET_ENGINES is not set
+CONFIG_NET_VENDOR_PENSANDO=y
+# CONFIG_IONIC is not set
+# CONFIG_NET_VENDOR_QLOGIC is not set
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_VENDOR_QUALCOMM is not set
+# CONFIG_NET_VENDOR_RDC is not set
+# CONFIG_NET_VENDOR_REALTEK is not set
+# CONFIG_NET_VENDOR_RENESAS is not set
+# CONFIG_NET_VENDOR_ROCKER is not set
+# CONFIG_NET_VENDOR_SAMSUNG is not set
+# CONFIG_NET_VENDOR_SEEQ is not set
+# CONFIG_NET_VENDOR_SILAN is not set
+# CONFIG_NET_VENDOR_SIS is not set
+# CONFIG_NET_VENDOR_SOLARFLARE is not set
+# CONFIG_NET_VENDOR_SMSC is not set
+# CONFIG_NET_VENDOR_SOCIONEXT is not set
+# CONFIG_NET_VENDOR_STMICRO is not set
+# CONFIG_NET_VENDOR_SUN is not set
+# CONFIG_NET_VENDOR_SYNOPSYS is not set
+# CONFIG_NET_VENDOR_TEHUTI is not set
+# CONFIG_NET_VENDOR_TI is not set
+CONFIG_NET_VENDOR_VERTEXCOM=y
+# CONFIG_NET_VENDOR_VIA is not set
+# CONFIG_NET_VENDOR_WIZNET is not set
+CONFIG_NET_VENDOR_XILINX=y
+# CONFIG_XILINX_EMACLITE is not set
+# CONFIG_XILINX_AXI_EMAC is not set
+# CONFIG_XILINX_LL_TEMAC is not set
+# CONFIG_FDDI is not set
+# CONFIG_HIPPI is not set
+# CONFIG_NET_SB1000 is not set
+CONFIG_PHYLIB=y
+CONFIG_SWPHY=y
+CONFIG_FIXED_PHY=y
+
+#
+# MII PHY device drivers
+#
+# CONFIG_AMD_PHY is not set
+# CONFIG_ADIN_PHY is not set
+# CONFIG_ADIN1100_PHY is not set
+# CONFIG_AQUANTIA_PHY is not set
+# CONFIG_AX88796B_PHY is not set
+# CONFIG_BROADCOM_PHY is not set
+# CONFIG_BCM54140_PHY is not set
+# CONFIG_BCM7XXX_PHY is not set
+# CONFIG_BCM84881_PHY is not set
+# CONFIG_BCM87XX_PHY is not set
+# CONFIG_CICADA_PHY is not set
+# CONFIG_CORTINA_PHY is not set
+# CONFIG_DAVICOM_PHY is not set
+# CONFIG_ICPLUS_PHY is not set
+# CONFIG_LXT_PHY is not set
+# CONFIG_INTEL_XWAY_PHY is not set
+# CONFIG_LSI_ET1011C_PHY is not set
+# CONFIG_MARVELL_PHY is not set
+# CONFIG_MARVELL_10G_PHY is not set
+# CONFIG_MARVELL_88X2222_PHY is not set
+# CONFIG_MAXLINEAR_GPHY is not set
+# CONFIG_MEDIATEK_GE_PHY is not set
+# CONFIG_MICREL_PHY is not set
+# CONFIG_MICROCHIP_PHY is not set
+# CONFIG_MICROCHIP_T1_PHY is not set
+# CONFIG_MICROSEMI_PHY is not set
+# CONFIG_MOTORCOMM_PHY is not set
+# CONFIG_NATIONAL_PHY is not set
+# CONFIG_NXP_C45_TJA11XX_PHY is not set
+# CONFIG_QSEMI_PHY is not set
+# CONFIG_REALTEK_PHY is not set
+# CONFIG_RENESAS_PHY is not set
+# CONFIG_ROCKCHIP_PHY is not set
+# CONFIG_SMSC_PHY is not set
+# CONFIG_STE10XP is not set
+# CONFIG_TERANETICS_PHY is not set
+# CONFIG_DP83822_PHY is not set
+# CONFIG_DP83TC811_PHY is not set
+# CONFIG_DP83848_PHY is not set
+# CONFIG_DP83867_PHY is not set
+# CONFIG_DP83869_PHY is not set
+# CONFIG_DP83TD510_PHY is not set
+# CONFIG_VITESSE_PHY is not set
+# CONFIG_XILINX_GMII2RGMII is not set
+# CONFIG_PSE_CONTROLLER is not set
+CONFIG_MDIO_DEVICE=y
+CONFIG_MDIO_BUS=y
+CONFIG_FWNODE_MDIO=y
+CONFIG_OF_MDIO=y
+CONFIG_ACPI_MDIO=y
+CONFIG_MDIO_DEVRES=y
+# CONFIG_MDIO_BITBANG is not set
+# CONFIG_MDIO_BCM_UNIMAC is not set
+# CONFIG_MDIO_HISI_FEMAC is not set
+# CONFIG_MDIO_OCTEON is not set
+# CONFIG_MDIO_IPQ4019 is not set
+# CONFIG_MDIO_THUNDER is not set
+
+#
+# MDIO Multiplexers
+#
+# CONFIG_MDIO_BUS_MUX_MULTIPLEXER is not set
+# CONFIG_MDIO_BUS_MUX_MMIOREG is not set
+
+#
+# PCS device drivers
+#
+# end of PCS device drivers
+
+CONFIG_PPP=m
+# CONFIG_PPP_BSDCOMP is not set
+# CONFIG_PPP_DEFLATE is not set
+# CONFIG_PPP_FILTER is not set
+# CONFIG_PPP_MPPE is not set
+# CONFIG_PPP_MULTILINK is not set
+# CONFIG_PPPOE is not set
+# CONFIG_PPTP is not set
+CONFIG_PPP_ASYNC=m
+# CONFIG_PPP_SYNC_TTY is not set
+# CONFIG_SLIP is not set
+CONFIG_SLHC=m
+
+#
+# Host-side USB support is needed for USB Network Adapter support
+#
+# CONFIG_WLAN is not set
+# CONFIG_WAN is not set
+
+#
+# Wireless WAN
+#
+# CONFIG_WWAN is not set
+# end of Wireless WAN
+
+CONFIG_XEN_NETDEV_FRONTEND=y
+CONFIG_XEN_NETDEV_BACKEND=m
+CONFIG_VMXNET3=y
+# CONFIG_FUJITSU_ES is not set
+# CONFIG_NETDEVSIM is not set
+CONFIG_NET_FAILOVER=y
+# CONFIG_ISDN is not set
+
+#
+# Input device support
+#
+CONFIG_INPUT=y
+CONFIG_INPUT_FF_MEMLESS=y
+CONFIG_INPUT_SPARSEKMAP=m
+# CONFIG_INPUT_MATRIXKMAP is not set
+CONFIG_INPUT_VIVALDIFMAP=y
+
+#
+# Userland interfaces
+#
+# CONFIG_INPUT_MOUSEDEV is not set
+# CONFIG_INPUT_JOYDEV is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_INPUT_EVBUG is not set
+
+#
+# Input Device Drivers
+#
+CONFIG_INPUT_KEYBOARD=y
+CONFIG_KEYBOARD_ATKBD=y
+# CONFIG_KEYBOARD_LKKBD is not set
+# CONFIG_KEYBOARD_NEWTON is not set
+# CONFIG_KEYBOARD_OPENCORES is not set
+# CONFIG_KEYBOARD_SAMSUNG is not set
+# CONFIG_KEYBOARD_STOWAWAY is not set
+# CONFIG_KEYBOARD_SUNKBD is not set
+# CONFIG_KEYBOARD_OMAP4 is not set
+# CONFIG_KEYBOARD_XTKBD is not set
+# CONFIG_KEYBOARD_BCM is not set
+# CONFIG_INPUT_MOUSE is not set
+# CONFIG_INPUT_JOYSTICK is not set
+# CONFIG_INPUT_TABLET is not set
+# CONFIG_INPUT_TOUCHSCREEN is not set
+CONFIG_INPUT_MISC=y
+# CONFIG_INPUT_AD714X is not set
+# CONFIG_INPUT_E3X0_BUTTON is not set
+CONFIG_INPUT_UINPUT=m
+# CONFIG_INPUT_ADXL34X is not set
+# CONFIG_INPUT_CMA3000 is not set
+CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
+# CONFIG_RMI4_CORE is not set
+
+#
+# Hardware I/O ports
+#
+CONFIG_SERIO=y
+CONFIG_SERIO_SERPORT=m
+# CONFIG_SERIO_AMBAKMI is not set
+CONFIG_SERIO_PCIPS2=m
+CONFIG_SERIO_LIBPS2=y
+CONFIG_SERIO_RAW=y
+# CONFIG_SERIO_ALTERA_PS2 is not set
+# CONFIG_SERIO_PS2MULT is not set
+# CONFIG_SERIO_ARC_PS2 is not set
+# CONFIG_SERIO_APBPS2 is not set
+# CONFIG_USERIO is not set
+# CONFIG_GAMEPORT is not set
+# end of Hardware I/O ports
+# end of Input device support
+
+#
+# Character devices
+#
+CONFIG_TTY=y
+CONFIG_VT=y
+CONFIG_CONSOLE_TRANSLATIONS=y
+CONFIG_VT_CONSOLE=y
+CONFIG_VT_CONSOLE_SLEEP=y
+CONFIG_HW_CONSOLE=y
+# CONFIG_VT_HW_CONSOLE_BINDING is not set
+CONFIG_UNIX98_PTYS=y
+# CONFIG_LEGACY_PTYS is not set
+CONFIG_LDISC_AUTOLOAD=y
+
+#
+# Serial drivers
+#
+CONFIG_SERIAL_EARLYCON=y
+CONFIG_SERIAL_8250=y
+# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
+CONFIG_SERIAL_8250_PNP=y
+CONFIG_SERIAL_8250_16550A_VARIANTS=y
+# CONFIG_SERIAL_8250_FINTEK is not set
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_PCI=y
+# CONFIG_SERIAL_8250_EXAR is not set
+CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
+# CONFIG_SERIAL_8250_EXTENDED is not set
+CONFIG_SERIAL_8250_FSL=y
+# CONFIG_SERIAL_8250_DW is not set
+# CONFIG_SERIAL_8250_RT288X is not set
+CONFIG_SERIAL_8250_PERICOM=y
+# CONFIG_SERIAL_OF_PLATFORM is not set
+
+#
+# Non-8250 serial port support
+#
+CONFIG_SERIAL_AMBA_PL010=y
+CONFIG_SERIAL_AMBA_PL010_CONSOLE=y
+CONFIG_SERIAL_AMBA_PL011=y
+CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
+CONFIG_SERIAL_EARLYCON_ARM_SEMIHOST=y
+# CONFIG_SERIAL_UARTLITE is not set
+CONFIG_SERIAL_CORE=y
+CONFIG_SERIAL_CORE_CONSOLE=y
+# CONFIG_SERIAL_JSM is not set
+# CONFIG_SERIAL_SIFIVE is not set
+# CONFIG_SERIAL_SCCNXP is not set
+# CONFIG_SERIAL_ALTERA_JTAGUART is not set
+# CONFIG_SERIAL_ALTERA_UART is not set
+# CONFIG_SERIAL_XILINX_PS_UART is not set
+# CONFIG_SERIAL_ARC is not set
+# CONFIG_SERIAL_RP2 is not set
+# CONFIG_SERIAL_FSL_LPUART is not set
+# CONFIG_SERIAL_FSL_LINFLEXUART is not set
+# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set
+# CONFIG_SERIAL_SPRD is not set
+# end of Serial drivers
+
+# CONFIG_SERIAL_NONSTANDARD is not set
+# CONFIG_N_GSM is not set
+# CONFIG_NOZOMI is not set
+# CONFIG_NULL_TTY is not set
+CONFIG_HVC_DRIVER=y
+CONFIG_HVC_IRQ=y
+CONFIG_HVC_XEN=y
+CONFIG_HVC_XEN_FRONTEND=y
+# CONFIG_HVC_DCC is not set
+# CONFIG_SERIAL_DEV_BUS is not set
+# CONFIG_TTY_PRINTK is not set
+# CONFIG_VIRTIO_CONSOLE is not set
+# CONFIG_IPMI_HANDLER is not set
+CONFIG_HW_RANDOM=y
+# CONFIG_HW_RANDOM_TIMERIOMEM is not set
+# CONFIG_HW_RANDOM_BA431 is not set
+CONFIG_HW_RANDOM_VIRTIO=y
+# CONFIG_HW_RANDOM_CCTRNG is not set
+# CONFIG_HW_RANDOM_XIPHERA is not set
+CONFIG_HW_RANDOM_ARM_SMCCC_TRNG=y
+CONFIG_HW_RANDOM_CN10K=y
+# CONFIG_APPLICOM is not set
+CONFIG_DEVMEM=y
+CONFIG_DEVPORT=y
+CONFIG_TCG_TPM=y
+# CONFIG_HW_RANDOM_TPM is not set
+CONFIG_TCG_TIS_CORE=y
+CONFIG_TCG_TIS=y
+# CONFIG_TCG_ATMEL is not set
+# CONFIG_TCG_INFINEON is not set
+CONFIG_TCG_XEN=m
+CONFIG_TCG_CRB=y
+# CONFIG_TCG_VTPM_PROXY is not set
+# CONFIG_XILLYBUS is not set
+CONFIG_RANDOM_TRUST_CPU=y
+# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
+# end of Character devices
+
+#
+# I2C support
+#
+# CONFIG_I2C is not set
+# end of I2C support
+
+# CONFIG_I3C is not set
+# CONFIG_SPI is not set
+# CONFIG_SPMI is not set
+# CONFIG_HSI is not set
+CONFIG_PPS=m
+# CONFIG_PPS_DEBUG is not set
+
+#
+# PPS clients support
+#
+# CONFIG_PPS_CLIENT_KTIMER is not set
+# CONFIG_PPS_CLIENT_LDISC is not set
+# CONFIG_PPS_CLIENT_GPIO is not set
+
+#
+# PPS generators support
+#
+
+#
+# PTP clock support
+#
+CONFIG_PTP_1588_CLOCK=m
+CONFIG_PTP_1588_CLOCK_OPTIONAL=m
+
+#
+# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
+#
+CONFIG_PTP_1588_CLOCK_KVM=m
+# end of PTP clock support
+
+# CONFIG_PINCTRL is not set
+# CONFIG_GPIOLIB is not set
+# CONFIG_W1 is not set
+CONFIG_POWER_RESET=y
+# CONFIG_POWER_RESET_RESTART is not set
+# CONFIG_POWER_RESET_XGENE is not set
+# CONFIG_POWER_RESET_SYSCON is not set
+# CONFIG_POWER_RESET_SYSCON_POWEROFF is not set
+# CONFIG_NVMEM_REBOOT_MODE is not set
+CONFIG_POWER_SUPPLY=y
+# CONFIG_POWER_SUPPLY_DEBUG is not set
+# CONFIG_PDA_POWER is not set
+# CONFIG_TEST_POWER is not set
+# CONFIG_BATTERY_DS2780 is not set
+# CONFIG_BATTERY_DS2781 is not set
+# CONFIG_BATTERY_SAMSUNG_SDI is not set
+# CONFIG_BATTERY_BQ27XXX is not set
+# CONFIG_CHARGER_MAX8903 is not set
+# CONFIG_BATTERY_GOLDFISH is not set
+# CONFIG_HWMON is not set
+CONFIG_THERMAL=y
+# CONFIG_THERMAL_NETLINK is not set
+# CONFIG_THERMAL_STATISTICS is not set
+CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
+CONFIG_THERMAL_OF=y
+# CONFIG_THERMAL_WRITABLE_TRIPS is not set
+CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
+# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
+# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
+# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
+CONFIG_THERMAL_GOV_STEP_WISE=y
+# CONFIG_THERMAL_GOV_BANG_BANG is not set
+# CONFIG_THERMAL_GOV_USER_SPACE is not set
+# CONFIG_CPU_THERMAL is not set
+# CONFIG_THERMAL_EMULATION is not set
+# CONFIG_THERMAL_MMIO is not set
+CONFIG_WATCHDOG=y
+CONFIG_WATCHDOG_CORE=y
+# CONFIG_WATCHDOG_NOWAYOUT is not set
+CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
+CONFIG_WATCHDOG_OPEN_TIMEOUT=0
+# CONFIG_WATCHDOG_SYSFS is not set
+# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set
+
+#
+# Watchdog Pretimeout Governors
+#
+# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
+
+#
+# Watchdog Device Drivers
+#
+# CONFIG_SOFT_WATCHDOG is not set
+# CONFIG_WDAT_WDT is not set
+# CONFIG_XILINX_WATCHDOG is not set
+# CONFIG_MLX_WDT is not set
+# CONFIG_ARM_SP805_WATCHDOG is not set
+# CONFIG_ARM_SBSA_WATCHDOG is not set
+# CONFIG_CADENCE_WATCHDOG is not set
+# CONFIG_DW_WATCHDOG is not set
+# CONFIG_MAX63XX_WATCHDOG is not set
+# CONFIG_ARM_SMC_WATCHDOG is not set
+# CONFIG_ALIM7101_WDT is not set
+# CONFIG_I6300ESB_WDT is not set
+# CONFIG_HP_WATCHDOG is not set
+# CONFIG_XEN_WDT is not set
+
+#
+# PCI-based Watchdog Cards
+#
+# CONFIG_PCIPCWATCHDOG is not set
+# CONFIG_WDTPCI is not set
+CONFIG_SSB_POSSIBLE=y
+# CONFIG_SSB is not set
+CONFIG_BCMA_POSSIBLE=y
+# CONFIG_BCMA is not set
+
+#
+# Multifunction device drivers
+#
+CONFIG_MFD_CORE=y
+# CONFIG_MFD_ATMEL_FLEXCOM is not set
+# CONFIG_MFD_ATMEL_HLCDC is not set
+# CONFIG_MFD_MADERA is not set
+# CONFIG_MFD_HI6421_PMIC is not set
+# CONFIG_HTC_PASIC3 is not set
+CONFIG_LPC_ICH=y
+CONFIG_LPC_SCH=m
+# CONFIG_MFD_JANZ_CMODIO is not set
+# CONFIG_MFD_KEMPLD is not set
+# CONFIG_MFD_MT6397 is not set
+# CONFIG_MFD_RDC321X is not set
+# CONFIG_MFD_SM501 is not set
+# CONFIG_MFD_SYSCON is not set
+# CONFIG_MFD_TI_AM335X_TSCADC is not set
+# CONFIG_MFD_TQMX86 is not set
+# CONFIG_MFD_VX855 is not set
+# end of Multifunction device drivers
+
+# CONFIG_REGULATOR is not set
+# CONFIG_RC_CORE is not set
+
+#
+# CEC support
+#
+# CONFIG_MEDIA_CEC_SUPPORT is not set
+# end of CEC support
+
+# CONFIG_MEDIA_SUPPORT is not set
+
+#
+# Graphics support
+#
+# CONFIG_DRM is not set
+# CONFIG_DRM_DEBUG_MODESET_LOCK is not set
+
+#
+# ARM devices
+#
+# end of ARM devices
+
+#
+# Frame buffer Devices
+#
+# CONFIG_FB is not set
+# end of Frame buffer Devices
+
+#
+# Backlight & LCD device support
+#
+# CONFIG_LCD_CLASS_DEVICE is not set
+# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
+# end of Backlight & LCD device support
+
+#
+# Console display driver support
+#
+CONFIG_DUMMY_CONSOLE=y
+CONFIG_DUMMY_CONSOLE_COLUMNS=80
+CONFIG_DUMMY_CONSOLE_ROWS=25
+# end of Console display driver support
+# end of Graphics support
+
+# CONFIG_SOUND is not set
+
+#
+# HID support
+#
+# CONFIG_HID is not set
+# end of HID support
+
+CONFIG_USB_OHCI_LITTLE_ENDIAN=y
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_MMC is not set
+# CONFIG_SCSI_UFSHCD is not set
+# CONFIG_MEMSTICK is not set
+# CONFIG_NEW_LEDS is not set
+# CONFIG_ACCESSIBILITY is not set
+# CONFIG_INFINIBAND is not set
+CONFIG_EDAC_SUPPORT=y
+# CONFIG_EDAC is not set
+CONFIG_RTC_LIB=y
+CONFIG_RTC_CLASS=y
+CONFIG_RTC_HCTOSYS=y
+CONFIG_RTC_HCTOSYS_DEVICE="rtc0"
+CONFIG_RTC_SYSTOHC=y
+CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
+# CONFIG_RTC_DEBUG is not set
+CONFIG_RTC_NVMEM=y
+
+#
+# RTC interfaces
+#
+CONFIG_RTC_INTF_SYSFS=y
+CONFIG_RTC_INTF_PROC=y
+CONFIG_RTC_INTF_DEV=y
+# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
+# CONFIG_RTC_DRV_TEST is not set
+
+#
+# I2C RTC drivers
+#
+
+#
+# SPI RTC drivers
+#
+
+#
+# SPI and I2C RTC drivers
+#
+
+#
+# Platform RTC drivers
+#
+# CONFIG_RTC_DRV_DS1286 is not set
+# CONFIG_RTC_DRV_DS1511 is not set
+# CONFIG_RTC_DRV_DS1553 is not set
+# CONFIG_RTC_DRV_DS1685_FAMILY is not set
+# CONFIG_RTC_DRV_DS1742 is not set
+# CONFIG_RTC_DRV_DS2404 is not set
+CONFIG_RTC_DRV_EFI=y
+# CONFIG_RTC_DRV_STK17TA8 is not set
+# CONFIG_RTC_DRV_M48T86 is not set
+# CONFIG_RTC_DRV_M48T35 is not set
+# CONFIG_RTC_DRV_M48T59 is not set
+# CONFIG_RTC_DRV_MSM6242 is not set
+# CONFIG_RTC_DRV_BQ4802 is not set
+# CONFIG_RTC_DRV_RP5C01 is not set
+# CONFIG_RTC_DRV_V3020 is not set
+# CONFIG_RTC_DRV_ZYNQMP is not set
+
+#
+# on-CPU RTC drivers
+#
+# CONFIG_RTC_DRV_PL030 is not set
+CONFIG_RTC_DRV_PL031=y
+# CONFIG_RTC_DRV_CADENCE is not set
+# CONFIG_RTC_DRV_FTRTC010 is not set
+# CONFIG_RTC_DRV_R7301 is not set
+
+#
+# HID Sensor RTC drivers
+#
+# CONFIG_RTC_DRV_GOLDFISH is not set
+# CONFIG_DMADEVICES is not set
+
+#
+# DMABUF options
+#
+# CONFIG_SYNC_FILE is not set
+# CONFIG_UDMABUF is not set
+# CONFIG_DMABUF_MOVE_NOTIFY is not set
+# CONFIG_DMABUF_DEBUG is not set
+# CONFIG_DMABUF_SELFTESTS is not set
+# CONFIG_DMABUF_HEAPS is not set
+# CONFIG_DMABUF_SYSFS_STATS is not set
+# end of DMABUF options
+
+# CONFIG_AUXDISPLAY is not set
+CONFIG_UIO=m
+# CONFIG_UIO_CIF is not set
+# CONFIG_UIO_PDRV_GENIRQ is not set
+# CONFIG_UIO_DMEM_GENIRQ is not set
+# CONFIG_UIO_AEC is not set
+# CONFIG_UIO_SERCOS3 is not set
+CONFIG_UIO_PCI_GENERIC=m
+# CONFIG_UIO_NETX is not set
+# CONFIG_UIO_PRUSS is not set
+# CONFIG_UIO_MF624 is not set
+CONFIG_VFIO=m
+CONFIG_VFIO_IOMMU_TYPE1=m
+CONFIG_VFIO_VIRQFD=m
+CONFIG_VFIO_NOIOMMU=y
+CONFIG_VFIO_PCI_CORE=m
+CONFIG_VFIO_PCI_MMAP=y
+CONFIG_VFIO_PCI_INTX=y
+CONFIG_VFIO_PCI=m
+# CONFIG_MLX5_VFIO_PCI is not set
+# CONFIG_VFIO_PLATFORM is not set
+# CONFIG_VFIO_MDEV is not set
+# CONFIG_VIRT_DRIVERS is not set
+CONFIG_VIRTIO_ANCHOR=y
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_PCI_LIB=y
+CONFIG_VIRTIO_PCI_LIB_LEGACY=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_PCI_LEGACY=y
+CONFIG_VIRTIO_BALLOON=m
+# CONFIG_VIRTIO_INPUT is not set
+# CONFIG_VIRTIO_MMIO is not set
+# CONFIG_VDPA is not set
+CONFIG_VHOST_IOTLB=m
+CONFIG_VHOST=m
+CONFIG_VHOST_MENU=y
+CONFIG_VHOST_NET=m
+# CONFIG_VHOST_SCSI is not set
+CONFIG_VHOST_VSOCK=m
+# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
+
+#
+# Microsoft Hyper-V guest support
+#
+# CONFIG_HYPERV is not set
+# end of Microsoft Hyper-V guest support
+
+#
+# Xen driver support
+#
+CONFIG_XEN_BALLOON=y
+CONFIG_XEN_SCRUB_PAGES_DEFAULT=y
+CONFIG_XEN_DEV_EVTCHN=m
+CONFIG_XEN_BACKEND=y
+CONFIG_XENFS=m
+CONFIG_XEN_COMPAT_XENFS=y
+CONFIG_XEN_SYS_HYPERVISOR=y
+CONFIG_XEN_XENBUS_FRONTEND=y
+CONFIG_XEN_GNTDEV=m
+CONFIG_XEN_GRANT_DEV_ALLOC=m
+# CONFIG_XEN_GRANT_DMA_ALLOC is not set
+CONFIG_SWIOTLB_XEN=y
+CONFIG_XEN_PCI_STUB=y
+CONFIG_XEN_PCIDEV_STUB=m
+# CONFIG_XEN_PVCALLS_FRONTEND is not set
+# CONFIG_XEN_PVCALLS_BACKEND is not set
+# CONFIG_XEN_SCSI_BACKEND is not set
+CONFIG_XEN_PRIVCMD=m
+CONFIG_XEN_EFI=y
+CONFIG_XEN_AUTO_XLATE=y
+# CONFIG_XEN_VIRTIO is not set
+# end of Xen driver support
+
+# CONFIG_GREYBUS is not set
+# CONFIG_COMEDI is not set
+# CONFIG_STAGING is not set
+# CONFIG_GOLDFISH is not set
+# CONFIG_CHROME_PLATFORMS is not set
+CONFIG_MELLANOX_PLATFORM=y
+# CONFIG_MLXBF_BOOTCTL is not set
+CONFIG_SURFACE_PLATFORMS=y
+# CONFIG_SURFACE_GPE is not set
+# CONFIG_SURFACE_PRO3_BUTTON is not set
+CONFIG_HAVE_CLK=y
+CONFIG_HAVE_CLK_PREPARE=y
+CONFIG_COMMON_CLK=y
+
+#
+# Clock driver for ARM Reference designs
+#
+# CONFIG_CLK_ICST is not set
+# CONFIG_CLK_SP810 is not set
+# end of Clock driver for ARM Reference designs
+
+# CONFIG_COMMON_CLK_AXI_CLKGEN is not set
+# CONFIG_COMMON_CLK_XGENE is not set
+# CONFIG_COMMON_CLK_FIXED_MMIO is not set
+# CONFIG_XILINX_VCU is not set
+# CONFIG_COMMON_CLK_XLNX_CLKWZRD is not set
+# CONFIG_HWSPINLOCK is not set
+
+#
+# Clock Source drivers
+#
+CONFIG_TIMER_OF=y
+CONFIG_TIMER_ACPI=y
+CONFIG_TIMER_PROBE=y
+CONFIG_ARM_ARCH_TIMER=y
+CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
+CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND=y
+CONFIG_FSL_ERRATUM_A008585=y
+CONFIG_HISILICON_ERRATUM_161010101=y
+CONFIG_ARM64_ERRATUM_858921=y
+# CONFIG_MICROCHIP_PIT64B is not set
+# end of Clock Source drivers
+
+CONFIG_MAILBOX=y
+# CONFIG_ARM_MHU is not set
+# CONFIG_ARM_MHU_V2 is not set
+# CONFIG_PLATFORM_MHU is not set
+# CONFIG_PL320_MBOX is not set
+CONFIG_PCC=y
+# CONFIG_ALTERA_MBOX is not set
+# CONFIG_MAILBOX_TEST is not set
+CONFIG_IOMMU_API=y
+# CONFIG_IOMMU_SUPPORT is not set
+
+#
+# Remoteproc drivers
+#
+# CONFIG_REMOTEPROC is not set
+# end of Remoteproc drivers
+
+#
+# Rpmsg drivers
+#
+# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
+# CONFIG_RPMSG_VIRTIO is not set
+# end of Rpmsg drivers
+
+# CONFIG_SOUNDWIRE is not set
+
+#
+# SOC (System On Chip) specific Drivers
+#
+
+#
+# Amlogic SoC drivers
+#
+# end of Amlogic SoC drivers
+
+#
+# Broadcom SoC drivers
+#
+# CONFIG_SOC_BRCMSTB is not set
+# end of Broadcom SoC drivers
+
+#
+# NXP/Freescale QorIQ SoC drivers
+#
+# CONFIG_QUICC_ENGINE is not set
+# CONFIG_FSL_RCPM is not set
+# end of NXP/Freescale QorIQ SoC drivers
+
+#
+# fujitsu SoC drivers
+#
+# CONFIG_A64FX_DIAG is not set
+# end of fujitsu SoC drivers
+
+#
+# i.MX SoC drivers
+#
+# end of i.MX SoC drivers
+
+#
+# Enable LiteX SoC Builder specific drivers
+#
+# CONFIG_LITEX_SOC_CONTROLLER is not set
+# end of Enable LiteX SoC Builder specific drivers
+
+#
+# Qualcomm SoC drivers
+#
+# end of Qualcomm SoC drivers
+
+# CONFIG_SOC_TI is not set
+
+#
+# Xilinx SoC drivers
+#
+# end of Xilinx SoC drivers
+# end of SOC (System On Chip) specific Drivers
+
+# CONFIG_PM_DEVFREQ is not set
+# CONFIG_EXTCON is not set
+# CONFIG_MEMORY is not set
+# CONFIG_IIO is not set
+# CONFIG_NTB is not set
+# CONFIG_PWM is not set
+
+#
+# IRQ chip support
+#
+CONFIG_IRQCHIP=y
+CONFIG_ARM_GIC=y
+CONFIG_ARM_GIC_MAX_NR=1
+CONFIG_ARM_GIC_V2M=y
+CONFIG_ARM_GIC_V3=y
+CONFIG_ARM_GIC_V3_ITS=y
+CONFIG_ARM_GIC_V3_ITS_PCI=y
+# CONFIG_AL_FIC is not set
+# CONFIG_XILINX_INTC is not set
+CONFIG_PARTITION_PERCPU=y
+# end of IRQ chip support
+
+# CONFIG_IPACK_BUS is not set
+# CONFIG_RESET_CONTROLLER is not set
+
+#
+# PHY Subsystem
+#
+# CONFIG_GENERIC_PHY is not set
+# CONFIG_PHY_XGENE is not set
+# CONFIG_PHY_CAN_TRANSCEIVER is not set
+
+#
+# PHY drivers for Broadcom platforms
+#
+# CONFIG_BCM_KONA_USB2_PHY is not set
+# end of PHY drivers for Broadcom platforms
+
+# CONFIG_PHY_CADENCE_TORRENT is not set
+# CONFIG_PHY_CADENCE_DPHY is not set
+# CONFIG_PHY_CADENCE_DPHY_RX is not set
+# CONFIG_PHY_CADENCE_SALVO is not set
+# CONFIG_PHY_PXA_28NM_HSIC is not set
+# CONFIG_PHY_PXA_28NM_USB2 is not set
+# end of PHY Subsystem
+
+# CONFIG_POWERCAP is not set
+# CONFIG_MCB is not set
+
+#
+# Performance monitor support
+#
+# CONFIG_ARM_CCI_PMU is not set
+# CONFIG_ARM_CCN is not set
+# CONFIG_ARM_CMN is not set
+CONFIG_ARM_PMU=y
+CONFIG_ARM_PMU_ACPI=y
+# CONFIG_ARM_SMMU_V3_PMU is not set
+# CONFIG_ARM_DSU_PMU is not set
+# CONFIG_ARM_SPE_PMU is not set
+# CONFIG_ARM_DMC620_PMU is not set
+# CONFIG_ALIBABA_UNCORE_DRW_PMU is not set
+# CONFIG_HISI_PMU is not set
+# CONFIG_HISI_PCIE_PMU is not set
+# CONFIG_HNS3_PMU is not set
+# end of Performance monitor support
+
+CONFIG_RAS=y
+# CONFIG_USB4 is not set
+
+#
+# Android
+#
+# CONFIG_ANDROID_BINDER_IPC is not set
+# end of Android
+
+# CONFIG_LIBNVDIMM is not set
+CONFIG_DAX=y
+# CONFIG_DEV_DAX is not set
+CONFIG_NVMEM=y
+CONFIG_NVMEM_SYSFS=y
+# CONFIG_NVMEM_RMEM is not set
+
+#
+# HW tracing support
+#
+# CONFIG_STM is not set
+# CONFIG_INTEL_TH is not set
+# CONFIG_HISI_PTT is not set
+# end of HW tracing support
+
+# CONFIG_FPGA is not set
+# CONFIG_FSI is not set
+# CONFIG_TEE is not set
+# CONFIG_SIOX is not set
+# CONFIG_SLIMBUS is not set
+# CONFIG_INTERCONNECT is not set
+# CONFIG_COUNTER is not set
+# CONFIG_MOST is not set
+# CONFIG_PECI is not set
+# CONFIG_HTE is not set
+# end of Device Drivers
+
+#
+# File systems
+#
+CONFIG_DCACHE_WORD_ACCESS=y
+# CONFIG_VALIDATE_FS_PARSER is not set
+CONFIG_FS_IOMAP=y
+# CONFIG_EXT2_FS is not set
+# CONFIG_EXT3_FS is not set
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_USE_FOR_EXT2=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_EXT4_DEBUG is not set
+CONFIG_JBD2=y
+# CONFIG_JBD2_DEBUG is not set
+CONFIG_FS_MBCACHE=y
+# CONFIG_REISERFS_FS is not set
+# CONFIG_JFS_FS is not set
+CONFIG_XFS_FS=y
+CONFIG_XFS_SUPPORT_V4=y
+CONFIG_XFS_QUOTA=y
+CONFIG_XFS_POSIX_ACL=y
+# CONFIG_XFS_RT is not set
+# CONFIG_XFS_ONLINE_SCRUB is not set
+# CONFIG_XFS_WARN is not set
+# CONFIG_XFS_DEBUG is not set
+# CONFIG_GFS2_FS is not set
+# CONFIG_OCFS2_FS is not set
+# CONFIG_BTRFS_FS is not set
+# CONFIG_NILFS2_FS is not set
+# CONFIG_F2FS_FS is not set
+CONFIG_FS_POSIX_ACL=y
+CONFIG_EXPORTFS=y
+# CONFIG_EXPORTFS_BLOCK_OPS is not set
+CONFIG_FILE_LOCKING=y
+# CONFIG_FS_ENCRYPTION is not set
+# CONFIG_FS_VERITY is not set
+CONFIG_FSNOTIFY=y
+CONFIG_DNOTIFY=y
+CONFIG_INOTIFY_USER=y
+CONFIG_FANOTIFY=y
+CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
+CONFIG_QUOTA=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
+# CONFIG_PRINT_QUOTA_WARNING is not set
+# CONFIG_QUOTA_DEBUG is not set
+CONFIG_QUOTA_TREE=y
+# CONFIG_QFMT_V1 is not set
+CONFIG_QFMT_V2=y
+CONFIG_QUOTACTL=y
+CONFIG_AUTOFS4_FS=y
+CONFIG_AUTOFS_FS=y
+CONFIG_FUSE_FS=m
+# CONFIG_CUSE is not set
+# CONFIG_VIRTIO_FS is not set
+CONFIG_OVERLAY_FS=y
+# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set
+CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y
+# CONFIG_OVERLAY_FS_INDEX is not set
+# CONFIG_OVERLAY_FS_XINO_AUTO is not set
+# CONFIG_OVERLAY_FS_METACOPY is not set
+
+#
+# Caches
+#
+CONFIG_NETFS_SUPPORT=y
+# CONFIG_NETFS_STATS is not set
+CONFIG_FSCACHE=m
+# CONFIG_FSCACHE_STATS is not set
+# CONFIG_FSCACHE_DEBUG is not set
+CONFIG_CACHEFILES=m
+# CONFIG_CACHEFILES_DEBUG is not set
+# CONFIG_CACHEFILES_ERROR_INJECTION is not set
+# CONFIG_CACHEFILES_ONDEMAND is not set
+# end of Caches
+
+#
+# CD-ROM/DVD Filesystems
+#
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=m
+# end of CD-ROM/DVD Filesystems
+
+#
+# DOS/FAT/EXFAT/NT Filesystems
+#
+CONFIG_FAT_FS=m
+# CONFIG_MSDOS_FS is not set
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
+# CONFIG_FAT_DEFAULT_UTF8 is not set
+# CONFIG_EXFAT_FS is not set
+CONFIG_NTFS_FS=m
+# CONFIG_NTFS_DEBUG is not set
+# CONFIG_NTFS_RW is not set
+# CONFIG_NTFS3_FS is not set
+# end of DOS/FAT/EXFAT/NT Filesystems
+
+#
+# Pseudo filesystems
+#
+CONFIG_PROC_FS=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROC_SYSCTL=y
+CONFIG_PROC_PAGE_MONITOR=y
+CONFIG_PROC_CHILDREN=y
+# CONFIG_PROC_SELF_MEM_READONLY is not set
+CONFIG_KERNFS=y
+CONFIG_SYSFS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TMPFS_XATTR=y
+# CONFIG_TMPFS_INODE64 is not set
+CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
+CONFIG_HUGETLBFS=y
+CONFIG_HUGETLB_PAGE=y
+CONFIG_MEMFD_CREATE=y
+CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
+CONFIG_CONFIGFS_FS=m
+CONFIG_EFIVAR_FS=y
+# end of Pseudo filesystems
+
+CONFIG_MISC_FILESYSTEMS=y
+# CONFIG_ORANGEFS_FS is not set
+# CONFIG_ADFS_FS is not set
+# CONFIG_AFFS_FS is not set
+# CONFIG_ECRYPT_FS is not set
+# CONFIG_HFS_FS is not set
+# CONFIG_HFSPLUS_FS is not set
+# CONFIG_BEFS_FS is not set
+# CONFIG_BFS_FS is not set
+# CONFIG_EFS_FS is not set
+# CONFIG_CRAMFS is not set
+CONFIG_SQUASHFS=m
+# CONFIG_SQUASHFS_FILE_CACHE is not set
+CONFIG_SQUASHFS_FILE_DIRECT=y
+# CONFIG_SQUASHFS_DECOMP_SINGLE is not set
+# CONFIG_SQUASHFS_DECOMP_MULTI is not set
+CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_ZLIB=y
+# CONFIG_SQUASHFS_LZ4 is not set
+# CONFIG_SQUASHFS_LZO is not set
+# CONFIG_SQUASHFS_XZ is not set
+CONFIG_SQUASHFS_ZSTD=y
+CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
+# CONFIG_SQUASHFS_EMBEDDED is not set
+CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
+# CONFIG_VXFS_FS is not set
+# CONFIG_MINIX_FS is not set
+# CONFIG_OMFS_FS is not set
+# CONFIG_HPFS_FS is not set
+# CONFIG_QNX4FS_FS is not set
+# CONFIG_QNX6FS_FS is not set
+# CONFIG_ROMFS_FS is not set
+CONFIG_PSTORE=y
+CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_842_COMPRESS is not set
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+# CONFIG_PSTORE_PMSG is not set
+# CONFIG_PSTORE_FTRACE is not set
+# CONFIG_PSTORE_RAM is not set
+# CONFIG_PSTORE_BLK is not set
+# CONFIG_SYSV_FS is not set
+# CONFIG_UFS_FS is not set
+# CONFIG_EROFS_FS is not set
+CONFIG_NETWORK_FILESYSTEMS=y
+CONFIG_NFS_FS=m
+# CONFIG_NFS_V2 is not set
+CONFIG_NFS_V3=m
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=m
+# CONFIG_NFS_SWAP is not set
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_PNFS_FILE_LAYOUT=m
+CONFIG_PNFS_BLOCK=m
+CONFIG_PNFS_FLEXFILE_LAYOUT=m
+CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
+# CONFIG_NFS_V4_1_MIGRATION is not set
+CONFIG_NFS_V4_SECURITY_LABEL=y
+CONFIG_NFS_FSCACHE=y
+# CONFIG_NFS_USE_LEGACY_DNS is not set
+CONFIG_NFS_USE_KERNEL_DNS=y
+CONFIG_NFS_DEBUG=y
+CONFIG_NFS_DISABLE_UDP_SUPPORT=y
+# CONFIG_NFS_V4_2_READ_PLUS is not set
+CONFIG_NFSD=m
+CONFIG_NFSD_V2_ACL=y
+CONFIG_NFSD_V3_ACL=y
+CONFIG_NFSD_V4=y
+# CONFIG_NFSD_BLOCKLAYOUT is not set
+# CONFIG_NFSD_SCSILAYOUT is not set
+# CONFIG_NFSD_FLEXFILELAYOUT is not set
+# CONFIG_NFSD_V4_2_INTER_SSC is not set
+CONFIG_NFSD_V4_SECURITY_LABEL=y
+CONFIG_GRACE_PERIOD=m
+CONFIG_LOCKD=m
+CONFIG_LOCKD_V4=y
+CONFIG_NFS_ACL_SUPPORT=m
+CONFIG_NFS_COMMON=y
+CONFIG_NFS_V4_2_SSC_HELPER=y
+CONFIG_SUNRPC=m
+CONFIG_SUNRPC_GSS=m
+CONFIG_SUNRPC_BACKCHANNEL=y
+CONFIG_RPCSEC_GSS_KRB5=m
+# CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES is not set
+CONFIG_SUNRPC_DEBUG=y
+# CONFIG_CEPH_FS is not set
+CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
+# CONFIG_CIFS_ALLOW_INSECURE_LEGACY is not set
+CONFIG_CIFS_UPCALL=y
+CONFIG_CIFS_XATTR=y
+CONFIG_CIFS_DEBUG=y
+# CONFIG_CIFS_DEBUG2 is not set
+# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
+CONFIG_CIFS_DFS_UPCALL=y
+# CONFIG_CIFS_SWN_UPCALL is not set
+# CONFIG_CIFS_FSCACHE is not set
+# CONFIG_SMB_SERVER is not set
+CONFIG_SMBFS=m
+# CONFIG_CODA_FS is not set
+# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
+CONFIG_NLS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=m
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+# CONFIG_NLS_CODEPAGE_932 is not set
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+CONFIG_NLS_ASCII=m
+CONFIG_NLS_ISO8859_1=m
+# CONFIG_NLS_ISO8859_2 is not set
+# CONFIG_NLS_ISO8859_3 is not set
+# CONFIG_NLS_ISO8859_4 is not set
+# CONFIG_NLS_ISO8859_5 is not set
+# CONFIG_NLS_ISO8859_6 is not set
+# CONFIG_NLS_ISO8859_7 is not set
+# CONFIG_NLS_ISO8859_9 is not set
+# CONFIG_NLS_ISO8859_13 is not set
+# CONFIG_NLS_ISO8859_14 is not set
+# CONFIG_NLS_ISO8859_15 is not set
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_MAC_ROMAN is not set
+# CONFIG_NLS_MAC_CELTIC is not set
+# CONFIG_NLS_MAC_CENTEURO is not set
+# CONFIG_NLS_MAC_CROATIAN is not set
+# CONFIG_NLS_MAC_CYRILLIC is not set
+# CONFIG_NLS_MAC_GAELIC is not set
+# CONFIG_NLS_MAC_GREEK is not set
+# CONFIG_NLS_MAC_ICELAND is not set
+# CONFIG_NLS_MAC_INUIT is not set
+# CONFIG_NLS_MAC_ROMANIAN is not set
+# CONFIG_NLS_MAC_TURKISH is not set
+CONFIG_NLS_UTF8=m
+# CONFIG_DLM is not set
+# CONFIG_UNICODE is not set
+CONFIG_IO_WQ=y
+# end of File systems
+
+#
+# Security options
+#
+CONFIG_KEYS=y
+# CONFIG_KEYS_REQUEST_CACHE is not set
+# CONFIG_PERSISTENT_KEYRINGS is not set
+# CONFIG_TRUSTED_KEYS is not set
+# CONFIG_ENCRYPTED_KEYS is not set
+# CONFIG_KEY_DH_OPERATIONS is not set
+CONFIG_SECURITY_DMESG_RESTRICT=y
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
+CONFIG_SECURITY_NETWORK=y
+# CONFIG_SECURITY_NETWORK_XFRM is not set
+CONFIG_SECURITY_PATH=y
+CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_FORTIFY_SOURCE is not set
+# CONFIG_STATIC_USERMODEHELPER is not set
+# CONFIG_SECURITY_SELINUX is not set
+# CONFIG_SECURITY_SMACK is not set
+# CONFIG_SECURITY_TOMOYO is not set
+CONFIG_SECURITY_APPARMOR=y
+# CONFIG_SECURITY_APPARMOR_DEBUG is not set
+CONFIG_SECURITY_APPARMOR_INTROSPECT_POLICY=y
+CONFIG_SECURITY_APPARMOR_HASH=y
+CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y
+CONFIG_SECURITY_APPARMOR_EXPORT_BINARY=y
+CONFIG_SECURITY_APPARMOR_PARANOID_LOAD=y
+CONFIG_SECURITY_LOADPIN=y
+CONFIG_SECURITY_LOADPIN_ENFORCE=y
+# CONFIG_SECURITY_LOADPIN_VERITY is not set
+CONFIG_SECURITY_YAMA=y
+CONFIG_SECURITY_SAFESETID=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+# CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is not set
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=y
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY is not set
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set
+# CONFIG_SECURITY_LANDLOCK is not set
+CONFIG_INTEGRITY=y
+CONFIG_INTEGRITY_SIGNATURE=y
+CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
+CONFIG_INTEGRITY_TRUSTED_KEYRING=y
+CONFIG_INTEGRITY_PLATFORM_KEYRING=y
+# CONFIG_INTEGRITY_MACHINE_KEYRING is not set
+CONFIG_LOAD_UEFI_KEYS=y
+CONFIG_INTEGRITY_AUDIT=y
+CONFIG_IMA=y
+# CONFIG_IMA_KEXEC is not set
+CONFIG_IMA_MEASURE_PCR_IDX=10
+CONFIG_IMA_LSM_RULES=y
+CONFIG_IMA_NG_TEMPLATE=y
+# CONFIG_IMA_SIG_TEMPLATE is not set
+CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
+# CONFIG_IMA_DEFAULT_HASH_SHA1 is not set
+CONFIG_IMA_DEFAULT_HASH_SHA256=y
+CONFIG_IMA_DEFAULT_HASH="sha256"
+CONFIG_IMA_WRITE_POLICY=y
+# CONFIG_IMA_READ_POLICY is not set
+CONFIG_IMA_APPRAISE=y
+# CONFIG_IMA_ARCH_POLICY is not set
+CONFIG_IMA_APPRAISE_BUILD_POLICY=y
+CONFIG_IMA_APPRAISE_REQUIRE_FIRMWARE_SIGS=y
+# CONFIG_IMA_APPRAISE_REQUIRE_KEXEC_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_MODULE_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_POLICY_SIGS is not set
+CONFIG_IMA_APPRAISE_BOOTPARAM=y
+# CONFIG_IMA_APPRAISE_MODSIG is not set
+CONFIG_IMA_TRUSTED_KEYRING=y
+# CONFIG_IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY is not set
+# CONFIG_IMA_BLACKLIST_KEYRING is not set
+CONFIG_IMA_LOAD_X509=y
+CONFIG_IMA_X509_PATH="/etc/ima/pubkey.x509"
+# CONFIG_IMA_APPRAISE_SIGNED_INIT is not set
+CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
+CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
+# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set
+# CONFIG_IMA_DISABLE_HTABLE is not set
+# CONFIG_EVM is not set
+# CONFIG_DEFAULT_SECURITY_APPARMOR is not set
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,bpf"
+
+#
+# Kernel hardening options
+#
+
+#
+# Memory initialization
+#
+CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_ENABLER=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
+# CONFIG_INIT_STACK_NONE is not set
+# CONFIG_INIT_STACK_ALL_PATTERN is not set
+CONFIG_INIT_STACK_ALL_ZERO=y
+# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
+# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
+# end of Memory initialization
+
+CONFIG_RANDSTRUCT_NONE=y
+# end of Kernel hardening options
+# end of Security options
+
+CONFIG_XOR_BLOCKS=y
+CONFIG_ASYNC_CORE=y
+CONFIG_ASYNC_MEMCPY=m
+CONFIG_ASYNC_XOR=y
+CONFIG_ASYNC_PQ=m
+CONFIG_ASYNC_RAID6_RECOV=m
+CONFIG_CRYPTO=y
+
+#
+# Crypto core or helper
+#
+CONFIG_CRYPTO_ALGAPI=y
+CONFIG_CRYPTO_ALGAPI2=y
+CONFIG_CRYPTO_AEAD=y
+CONFIG_CRYPTO_AEAD2=y
+CONFIG_CRYPTO_SKCIPHER=y
+CONFIG_CRYPTO_SKCIPHER2=y
+CONFIG_CRYPTO_HASH=y
+CONFIG_CRYPTO_HASH2=y
+CONFIG_CRYPTO_RNG=m
+CONFIG_CRYPTO_RNG2=y
+CONFIG_CRYPTO_RNG_DEFAULT=m
+CONFIG_CRYPTO_AKCIPHER2=y
+CONFIG_CRYPTO_AKCIPHER=y
+CONFIG_CRYPTO_KPP2=y
+CONFIG_CRYPTO_ACOMP2=y
+CONFIG_CRYPTO_MANAGER=y
+CONFIG_CRYPTO_MANAGER2=y
+# CONFIG_CRYPTO_USER is not set
+CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
+CONFIG_CRYPTO_GF128MUL=y
+CONFIG_CRYPTO_NULL=y
+CONFIG_CRYPTO_NULL2=y
+# CONFIG_CRYPTO_PCRYPT is not set
+CONFIG_CRYPTO_CRYPTD=m
+CONFIG_CRYPTO_AUTHENC=y
+# CONFIG_CRYPTO_TEST is not set
+CONFIG_CRYPTO_ENGINE=m
+# end of Crypto core or helper
+
+#
+# Public-key cryptography
+#
+CONFIG_CRYPTO_RSA=y
+# CONFIG_CRYPTO_DH is not set
+# CONFIG_CRYPTO_ECDH is not set
+# CONFIG_CRYPTO_ECDSA is not set
+# CONFIG_CRYPTO_ECRDSA is not set
+# CONFIG_CRYPTO_SM2 is not set
+# CONFIG_CRYPTO_CURVE25519 is not set
+# end of Public-key cryptography
+
+#
+# Block ciphers
+#
+CONFIG_CRYPTO_AES=y
+# CONFIG_CRYPTO_AES_TI is not set
+# CONFIG_CRYPTO_ANUBIS is not set
+# CONFIG_CRYPTO_ARIA is not set
+# CONFIG_CRYPTO_BLOWFISH is not set
+# CONFIG_CRYPTO_CAMELLIA is not set
+# CONFIG_CRYPTO_CAST5 is not set
+# CONFIG_CRYPTO_CAST6 is not set
+CONFIG_CRYPTO_DES=y
+# CONFIG_CRYPTO_FCRYPT is not set
+# CONFIG_CRYPTO_KHAZAD is not set
+# CONFIG_CRYPTO_SEED is not set
+# CONFIG_CRYPTO_SERPENT is not set
+# CONFIG_CRYPTO_SM4_GENERIC is not set
+# CONFIG_CRYPTO_TEA is not set
+# CONFIG_CRYPTO_TWOFISH is not set
+# end of Block ciphers
+
+#
+# Length-preserving ciphers and modes
+#
+# CONFIG_CRYPTO_ADIANTUM is not set
+CONFIG_CRYPTO_ARC4=y
+# CONFIG_CRYPTO_CHACHA20 is not set
+CONFIG_CRYPTO_CBC=y
+# CONFIG_CRYPTO_CFB is not set
+CONFIG_CRYPTO_CTR=y
+CONFIG_CRYPTO_CTS=y
+CONFIG_CRYPTO_ECB=y
+# CONFIG_CRYPTO_HCTR2 is not set
+# CONFIG_CRYPTO_KEYWRAP is not set
+CONFIG_CRYPTO_LRW=m
+# CONFIG_CRYPTO_OFB is not set
+# CONFIG_CRYPTO_PCBC is not set
+CONFIG_CRYPTO_XTS=m
+# end of Length-preserving ciphers and modes
+
+#
+# AEAD (authenticated encryption with associated data) ciphers
+#
+# CONFIG_CRYPTO_AEGIS128 is not set
+# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
+CONFIG_CRYPTO_CCM=m
+CONFIG_CRYPTO_GCM=y
+CONFIG_CRYPTO_SEQIV=m
+CONFIG_CRYPTO_ECHAINIV=m
+CONFIG_CRYPTO_ESSIV=y
+# end of AEAD (authenticated encryption with associated data) ciphers
+
+#
+# Hashes, digests, and MACs
+#
+# CONFIG_CRYPTO_BLAKE2B is not set
+CONFIG_CRYPTO_CMAC=m
+CONFIG_CRYPTO_GHASH=y
+CONFIG_CRYPTO_HMAC=y
+CONFIG_CRYPTO_MD4=m
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_MICHAEL_MIC=m
+# CONFIG_CRYPTO_POLY1305 is not set
+# CONFIG_CRYPTO_RMD160 is not set
+CONFIG_CRYPTO_SHA1=y
+CONFIG_CRYPTO_SHA256=y
+CONFIG_CRYPTO_SHA512=m
+# CONFIG_CRYPTO_SHA3 is not set
+# CONFIG_CRYPTO_SM3_GENERIC is not set
+# CONFIG_CRYPTO_STREEBOG is not set
+# CONFIG_CRYPTO_VMAC is not set
+# CONFIG_CRYPTO_WP512 is not set
+# CONFIG_CRYPTO_XCBC is not set
+# CONFIG_CRYPTO_XXHASH is not set
+# end of Hashes, digests, and MACs
+
+#
+# CRCs (cyclic redundancy checks)
+#
+CONFIG_CRYPTO_CRC32C=y
+# CONFIG_CRYPTO_CRC32 is not set
+CONFIG_CRYPTO_CRCT10DIF=y
+CONFIG_CRYPTO_CRC64_ROCKSOFT=y
+# end of CRCs (cyclic redundancy checks)
+
+#
+# Compression
+#
+CONFIG_CRYPTO_DEFLATE=y
+CONFIG_CRYPTO_LZO=m
+# CONFIG_CRYPTO_842 is not set
+CONFIG_CRYPTO_LZ4=m
+# CONFIG_CRYPTO_LZ4HC is not set
+# CONFIG_CRYPTO_ZSTD is not set
+# end of Compression
+
+#
+# Random number generation
+#
+# CONFIG_CRYPTO_ANSI_CPRNG is not set
+CONFIG_CRYPTO_DRBG_MENU=m
+CONFIG_CRYPTO_DRBG_HMAC=y
+# CONFIG_CRYPTO_DRBG_HASH is not set
+# CONFIG_CRYPTO_DRBG_CTR is not set
+CONFIG_CRYPTO_DRBG=m
+CONFIG_CRYPTO_JITTERENTROPY=m
+# end of Random number generation
+
+#
+# Userspace interface
+#
+CONFIG_CRYPTO_USER_API=y
+CONFIG_CRYPTO_USER_API_HASH=y
+CONFIG_CRYPTO_USER_API_SKCIPHER=y
+# CONFIG_CRYPTO_USER_API_RNG is not set
+CONFIG_CRYPTO_USER_API_AEAD=y
+CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
+# end of Userspace interface
+
+CONFIG_CRYPTO_HASH_INFO=y
+# CONFIG_CRYPTO_NHPOLY1305_NEON is not set
+CONFIG_CRYPTO_CHACHA20_NEON=m
+
+#
+# Accelerated Cryptographic Algorithms for CPU (arm64)
+#
+# CONFIG_CRYPTO_GHASH_ARM64_CE is not set
+CONFIG_CRYPTO_POLY1305_NEON=m
+# CONFIG_CRYPTO_SHA1_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA256_ARM64 is not set
+# CONFIG_CRYPTO_SHA2_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA512_ARM64 is not set
+# CONFIG_CRYPTO_SHA512_ARM64_CE is not set
+# CONFIG_CRYPTO_SHA3_ARM64 is not set
+# CONFIG_CRYPTO_SM3_NEON is not set
+# CONFIG_CRYPTO_SM3_ARM64_CE is not set
+# CONFIG_CRYPTO_POLYVAL_ARM64_CE is not set
+# CONFIG_CRYPTO_AES_ARM64 is not set
+# CONFIG_CRYPTO_AES_ARM64_CE is not set
+# CONFIG_CRYPTO_AES_ARM64_CE_BLK is not set
+# CONFIG_CRYPTO_AES_ARM64_NEON_BLK is not set
+# CONFIG_CRYPTO_AES_ARM64_BS is not set
+# CONFIG_CRYPTO_SM4_ARM64_CE is not set
+# CONFIG_CRYPTO_SM4_ARM64_CE_BLK is not set
+# CONFIG_CRYPTO_SM4_ARM64_NEON_BLK is not set
+# CONFIG_CRYPTO_AES_ARM64_CE_CCM is not set
+# CONFIG_CRYPTO_CRCT10DIF_ARM64_CE is not set
+# end of Accelerated Cryptographic Algorithms for CPU (arm64)
+
+CONFIG_CRYPTO_HW=y
+# CONFIG_CRYPTO_DEV_CCP is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_C62X is not set
+# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
+# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
+# CONFIG_CRYPTO_DEV_CAVIUM_ZIP is not set
+CONFIG_CRYPTO_DEV_VIRTIO=m
+# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
+# CONFIG_CRYPTO_DEV_CCREE is not set
+# CONFIG_CRYPTO_DEV_HISI_SEC is not set
+# CONFIG_CRYPTO_DEV_HISI_SEC2 is not set
+# CONFIG_CRYPTO_DEV_HISI_ZIP is not set
+# CONFIG_CRYPTO_DEV_HISI_HPRE is not set
+# CONFIG_CRYPTO_DEV_HISI_TRNG is not set
+# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
+CONFIG_ASYMMETRIC_KEY_TYPE=y
+CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
+CONFIG_X509_CERTIFICATE_PARSER=y
+# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
+CONFIG_PKCS7_MESSAGE_PARSER=y
+# CONFIG_PKCS7_TEST_KEY is not set
+# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
+# CONFIG_FIPS_SIGNATURE_SELFTEST is not set
+
+#
+# Certificates for signature checking
+#
+CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
+CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
+# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set
+CONFIG_SYSTEM_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_TRUSTED_KEYS="google/certs/lakitu_root_cert.pem"
+# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
+CONFIG_SECONDARY_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
+# CONFIG_SYSTEM_REVOCATION_LIST is not set
+# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set
+# end of Certificates for signature checking
+
+CONFIG_BINARY_PRINTF=y
+
+#
+# Library routines
+#
+CONFIG_RAID6_PQ=m
+CONFIG_RAID6_PQ_BENCHMARK=y
+# CONFIG_PACKING is not set
+CONFIG_BITREVERSE=y
+CONFIG_HAVE_ARCH_BITREVERSE=y
+CONFIG_GENERIC_STRNCPY_FROM_USER=y
+CONFIG_GENERIC_STRNLEN_USER=y
+CONFIG_GENERIC_NET_UTILS=y
+# CONFIG_CORDIC is not set
+# CONFIG_PRIME_NUMBERS is not set
+CONFIG_RATIONAL=y
+CONFIG_GENERIC_PCI_IOMAP=y
+CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
+CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
+CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
+# CONFIG_INDIRECT_PIO is not set
+# CONFIG_TRACE_MMIO_ACCESS is not set
+
+#
+# Crypto library routines
+#
+CONFIG_CRYPTO_LIB_UTILS=y
+CONFIG_CRYPTO_LIB_AES=y
+CONFIG_CRYPTO_LIB_ARC4=y
+CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
+CONFIG_CRYPTO_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_DES=y
+CONFIG_CRYPTO_LIB_POLY1305_RSIZE=9
+CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
+CONFIG_CRYPTO_LIB_SHA1=y
+CONFIG_CRYPTO_LIB_SHA256=y
+# end of Crypto library routines
+
+CONFIG_CRC_CCITT=y
+CONFIG_CRC16=y
+CONFIG_CRC_T10DIF=y
+CONFIG_CRC64_ROCKSOFT=y
+CONFIG_CRC_ITU_T=y
+CONFIG_CRC32=y
+# CONFIG_CRC32_SELFTEST is not set
+CONFIG_CRC32_SLICEBY8=y
+# CONFIG_CRC32_SLICEBY4 is not set
+# CONFIG_CRC32_SARWATE is not set
+# CONFIG_CRC32_BIT is not set
+CONFIG_CRC64=y
+# CONFIG_CRC4 is not set
+CONFIG_CRC7=m
+CONFIG_LIBCRC32C=y
+# CONFIG_CRC8 is not set
+CONFIG_XXHASH=y
+CONFIG_AUDIT_GENERIC=y
+CONFIG_AUDIT_ARCH_COMPAT_GENERIC=y
+CONFIG_AUDIT_COMPAT_GENERIC=y
+# CONFIG_RANDOM32_SELFTEST is not set
+CONFIG_ZLIB_INFLATE=y
+CONFIG_ZLIB_DEFLATE=y
+CONFIG_LZO_COMPRESS=m
+CONFIG_LZO_DECOMPRESS=m
+CONFIG_LZ4_COMPRESS=m
+CONFIG_LZ4_DECOMPRESS=y
+CONFIG_ZSTD_COMMON=y
+CONFIG_ZSTD_DECOMPRESS=y
+CONFIG_XZ_DEC=y
+CONFIG_XZ_DEC_X86=y
+# CONFIG_XZ_DEC_POWERPC is not set
+# CONFIG_XZ_DEC_IA64 is not set
+# CONFIG_XZ_DEC_ARM is not set
+# CONFIG_XZ_DEC_ARMTHUMB is not set
+# CONFIG_XZ_DEC_SPARC is not set
+# CONFIG_XZ_DEC_MICROLZMA is not set
+CONFIG_XZ_DEC_BCJ=y
+# CONFIG_XZ_DEC_TEST is not set
+CONFIG_DECOMPRESS_GZIP=y
+CONFIG_DECOMPRESS_XZ=y
+CONFIG_DECOMPRESS_LZ4=y
+CONFIG_DECOMPRESS_ZSTD=y
+CONFIG_GENERIC_ALLOCATOR=y
+CONFIG_TEXTSEARCH=y
+CONFIG_TEXTSEARCH_KMP=m
+CONFIG_TEXTSEARCH_BM=m
+CONFIG_TEXTSEARCH_FSM=m
+CONFIG_INTERVAL_TREE=y
+CONFIG_XARRAY_MULTI=y
+CONFIG_ASSOCIATIVE_ARRAY=y
+CONFIG_HAS_IOMEM=y
+CONFIG_HAS_IOPORT_MAP=y
+CONFIG_HAS_DMA=y
+CONFIG_DMA_OPS=y
+CONFIG_NEED_SG_DMA_LENGTH=y
+CONFIG_NEED_DMA_MAP_STATE=y
+CONFIG_ARCH_DMA_ADDR_T_64BIT=y
+CONFIG_DMA_DECLARE_COHERENT=y
+CONFIG_ARCH_HAS_SETUP_DMA_OPS=y
+CONFIG_ARCH_HAS_SYNC_DMA_FOR_DEVICE=y
+CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU=y
+CONFIG_ARCH_HAS_DMA_PREP_COHERENT=y
+CONFIG_SWIOTLB=y
+# CONFIG_DMA_RESTRICTED_POOL is not set
+CONFIG_DMA_NONCOHERENT_MMAP=y
+CONFIG_DMA_COHERENT_POOL=y
+CONFIG_DMA_DIRECT_REMAP=y
+# CONFIG_DMA_CMA is not set
+# CONFIG_DMA_API_DEBUG is not set
+# CONFIG_DMA_MAP_BENCHMARK is not set
+CONFIG_SGL_ALLOC=y
+# CONFIG_FORCE_NR_CPUS is not set
+CONFIG_CPU_RMAP=y
+CONFIG_DQL=y
+CONFIG_GLOB=y
+# CONFIG_GLOB_SELFTEST is not set
+CONFIG_NLATTR=y
+CONFIG_CLZ_TAB=y
+# CONFIG_IRQ_POLL is not set
+CONFIG_MPILIB=y
+CONFIG_SIGNATURE=y
+CONFIG_DIMLIB=y
+CONFIG_LIBFDT=y
+CONFIG_OID_REGISTRY=y
+CONFIG_UCS2_STRING=y
+CONFIG_HAVE_GENERIC_VDSO=y
+CONFIG_GENERIC_GETTIMEOFDAY=y
+CONFIG_GENERIC_VDSO_TIME_NS=y
+CONFIG_FONT_SUPPORT=y
+CONFIG_FONT_8x16=y
+CONFIG_FONT_AUTOSELECT=y
+CONFIG_SG_POOL=y
+CONFIG_ARCH_STACKWALK=y
+CONFIG_STACKDEPOT=y
+CONFIG_SBITMAP=y
+# end of Library routines
+
+CONFIG_GENERIC_IOREMAP=y
+CONFIG_GENERIC_LIB_DEVMEM_IS_ALLOWED=y
+
+#
+# Kernel hacking
+#
+
+#
+# printk and dmesg options
+#
+CONFIG_PRINTK_TIME=y
+# CONFIG_PRINTK_CALLER is not set
+# CONFIG_STACKTRACE_BUILD_ID is not set
+CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
+CONFIG_CONSOLE_LOGLEVEL_QUIET=4
+CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
+# CONFIG_BOOT_PRINTK_DELAY is not set
+# CONFIG_DYNAMIC_DEBUG is not set
+# CONFIG_DYNAMIC_DEBUG_CORE is not set
+CONFIG_SYMBOLIC_ERRNAME=y
+CONFIG_DEBUG_BUGVERBOSE=y
+# end of printk and dmesg options
+
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_MISC=y
+
+#
+# Compile-time checks and compiler options
+#
+CONFIG_DEBUG_INFO=y
+CONFIG_AS_HAS_NON_CONST_LEB128=y
+# CONFIG_DEBUG_INFO_NONE is not set
+# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
+CONFIG_DEBUG_INFO_DWARF4=y
+# CONFIG_DEBUG_INFO_DWARF5 is not set
+# CONFIG_DEBUG_INFO_REDUCED is not set
+# CONFIG_DEBUG_INFO_COMPRESSED is not set
+# CONFIG_DEBUG_INFO_SPLIT is not set
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_PAHOLE_HAS_SPLIT_BTF=y
+CONFIG_DEBUG_INFO_BTF_MODULES=y
+# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set
+# CONFIG_GDB_SCRIPTS is not set
+CONFIG_FRAME_WARN=2048
+# CONFIG_STRIP_ASM_SYMS is not set
+# CONFIG_HEADERS_INSTALL is not set
+CONFIG_SECTION_MISMATCH_WARN_ONLY=y
+# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
+CONFIG_ARCH_WANT_FRAME_POINTERS=y
+CONFIG_FRAME_POINTER=y
+# CONFIG_VMLINUX_MAP is not set
+# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
+# end of Compile-time checks and compiler options
+
+#
+# Generic Kernel Debugging Instruments
+#
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_SERIAL=y
+CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_FS_ALLOW_ALL=y
+# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
+# CONFIG_DEBUG_FS_ALLOW_NONE is not set
+CONFIG_HAVE_ARCH_KGDB=y
+# CONFIG_KGDB is not set
+CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
+CONFIG_UBSAN=y
+# CONFIG_UBSAN_TRAP is not set
+CONFIG_CC_HAS_UBSAN_BOUNDS=y
+CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y
+CONFIG_UBSAN_BOUNDS=y
+CONFIG_UBSAN_ARRAY_BOUNDS=y
+# CONFIG_UBSAN_SHIFT is not set
+# CONFIG_UBSAN_UNREACHABLE is not set
+# CONFIG_UBSAN_BOOL is not set
+# CONFIG_UBSAN_ENUM is not set
+# CONFIG_UBSAN_ALIGNMENT is not set
+# CONFIG_UBSAN_SANITIZE_ALL is not set
+# CONFIG_TEST_UBSAN is not set
+CONFIG_HAVE_ARCH_KCSAN=y
+CONFIG_HAVE_KCSAN_COMPILER=y
+# CONFIG_KCSAN is not set
+# end of Generic Kernel Debugging Instruments
+
+#
+# Networking Debugging
+#
+# CONFIG_NET_DEV_REFCNT_TRACKER is not set
+# CONFIG_NET_NS_REFCNT_TRACKER is not set
+# CONFIG_DEBUG_NET is not set
+# end of Networking Debugging
+
+#
+# Memory Debugging
+#
+# CONFIG_PAGE_EXTENSION is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
+CONFIG_SLUB_DEBUG=y
+# CONFIG_SLUB_DEBUG_ON is not set
+# CONFIG_PAGE_OWNER is not set
+# CONFIG_PAGE_TABLE_CHECK is not set
+# CONFIG_PAGE_POISONING is not set
+# CONFIG_DEBUG_PAGE_REF is not set
+# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_ARCH_HAS_DEBUG_WX=y
+# CONFIG_DEBUG_WX is not set
+CONFIG_GENERIC_PTDUMP=y
+# CONFIG_PTDUMP_DEBUGFS is not set
+# CONFIG_DEBUG_OBJECTS is not set
+# CONFIG_SHRINKER_DEBUG is not set
+CONFIG_HAVE_DEBUG_KMEMLEAK=y
+# CONFIG_DEBUG_KMEMLEAK is not set
+# CONFIG_DEBUG_STACK_USAGE is not set
+# CONFIG_SCHED_STACK_END_CHECK is not set
+CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
+# CONFIG_DEBUG_VM is not set
+# CONFIG_DEBUG_VM_PGTABLE is not set
+CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
+# CONFIG_DEBUG_VIRTUAL is not set
+# CONFIG_DEBUG_MEMORY_INIT is not set
+# CONFIG_DEBUG_PER_CPU_MAPS is not set
+CONFIG_HAVE_ARCH_KASAN=y
+CONFIG_HAVE_ARCH_KASAN_SW_TAGS=y
+CONFIG_HAVE_ARCH_KASAN_HW_TAGS=y
+CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
+CONFIG_CC_HAS_KASAN_GENERIC=y
+CONFIG_CC_HAS_KASAN_SW_TAGS=y
+CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
+# CONFIG_KASAN is not set
+CONFIG_HAVE_ARCH_KFENCE=y
+# CONFIG_KFENCE is not set
+# end of Memory Debugging
+
+# CONFIG_DEBUG_SHIRQ is not set
+
+#
+# Debug Oops, Lockups and Hangs
+#
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PANIC_ON_OOPS_VALUE=1
+CONFIG_PANIC_TIMEOUT=-1
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_SOFTLOCKUP_DETECTOR=y
+# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
+# CONFIG_WQ_WATCHDOG is not set
+# CONFIG_TEST_LOCKUP is not set
+# end of Debug Oops, Lockups and Hangs
+
+#
+# Scheduler Debugging
+#
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHED_INFO=y
+CONFIG_SCHEDSTATS=y
+# end of Scheduler Debugging
+
+# CONFIG_DEBUG_TIMEKEEPING is not set
+
+#
+# Lock Debugging (spinlocks, mutexes, etc...)
+#
+CONFIG_LOCK_DEBUGGING_SUPPORT=y
+# CONFIG_PROVE_LOCKING is not set
+# CONFIG_LOCK_STAT is not set
+# CONFIG_DEBUG_RT_MUTEXES is not set
+# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_DEBUG_MUTEXES is not set
+# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
+# CONFIG_DEBUG_RWSEMS is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set
+# CONFIG_DEBUG_ATOMIC_SLEEP is not set
+# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
+# CONFIG_LOCK_TORTURE_TEST is not set
+# CONFIG_WW_MUTEX_SELFTEST is not set
+# CONFIG_SCF_TORTURE_TEST is not set
+# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
+# end of Lock Debugging (spinlocks, mutexes, etc...)
+
+# CONFIG_DEBUG_IRQFLAGS is not set
+CONFIG_STACKTRACE=y
+# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
+# CONFIG_DEBUG_KOBJECT is not set
+
+#
+# Debug kernel data structures
+#
+CONFIG_DEBUG_LIST=y
+# CONFIG_DEBUG_PLIST is not set
+# CONFIG_DEBUG_SG is not set
+CONFIG_DEBUG_NOTIFIERS=y
+# CONFIG_BUG_ON_DATA_CORRUPTION is not set
+# CONFIG_DEBUG_MAPLE_TREE is not set
+# end of Debug kernel data structures
+
+# CONFIG_DEBUG_CREDENTIALS is not set
+
+#
+# RCU Debugging
+#
+# CONFIG_RCU_SCALE_TEST is not set
+# CONFIG_RCU_TORTURE_TEST is not set
+# CONFIG_RCU_REF_SCALE_TEST is not set
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
+# CONFIG_RCU_TRACE is not set
+# CONFIG_RCU_EQS_DEBUG is not set
+# end of RCU Debugging
+
+# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
+# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
+# CONFIG_LATENCYTOP is not set
+CONFIG_NOP_TRACER=y
+CONFIG_HAVE_FUNCTION_TRACER=y
+CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
+CONFIG_HAVE_DYNAMIC_FTRACE=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
+CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
+CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
+CONFIG_HAVE_C_RECORDMCOUNT=y
+CONFIG_TRACE_CLOCK=y
+CONFIG_RING_BUFFER=y
+CONFIG_EVENT_TRACING=y
+CONFIG_CONTEXT_SWITCH_TRACER=y
+CONFIG_TRACING=y
+CONFIG_GENERIC_TRACER=y
+CONFIG_TRACING_SUPPORT=y
+CONFIG_FTRACE=y
+# CONFIG_BOOTTIME_TRACING is not set
+CONFIG_FUNCTION_TRACER=y
+CONFIG_FUNCTION_GRAPH_TRACER=y
+CONFIG_DYNAMIC_FTRACE=y
+CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y
+CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
+# CONFIG_FUNCTION_PROFILER is not set
+# CONFIG_STACK_TRACER is not set
+# CONFIG_IRQSOFF_TRACER is not set
+# CONFIG_SCHED_TRACER is not set
+# CONFIG_HWLAT_TRACER is not set
+# CONFIG_OSNOISE_TRACER is not set
+# CONFIG_TIMERLAT_TRACER is not set
+CONFIG_FTRACE_SYSCALLS=y
+# CONFIG_TRACER_SNAPSHOT is not set
+CONFIG_BRANCH_PROFILE_NONE=y
+# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
+# CONFIG_PROFILE_ALL_BRANCHES is not set
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_KPROBE_EVENTS=y
+# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
+CONFIG_UPROBE_EVENTS=y
+CONFIG_BPF_EVENTS=y
+CONFIG_DYNAMIC_EVENTS=y
+CONFIG_PROBE_EVENTS=y
+# CONFIG_BPF_KPROBE_OVERRIDE is not set
+CONFIG_FTRACE_MCOUNT_RECORD=y
+CONFIG_FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY=y
+# CONFIG_SYNTH_EVENTS is not set
+# CONFIG_HIST_TRIGGERS is not set
+# CONFIG_TRACE_EVENT_INJECT is not set
+# CONFIG_TRACEPOINT_BENCHMARK is not set
+# CONFIG_RING_BUFFER_BENCHMARK is not set
+# CONFIG_TRACE_EVAL_MAP_FILE is not set
+# CONFIG_FTRACE_RECORD_RECURSION is not set
+# CONFIG_FTRACE_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
+# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
+# CONFIG_KPROBE_EVENT_GEN_TEST is not set
+# CONFIG_RV is not set
+# CONFIG_SAMPLES is not set
+CONFIG_STRICT_DEVMEM=y
+CONFIG_IO_STRICT_DEVMEM=y
+
+#
+# arm64 Debugging
+#
+# CONFIG_PID_IN_CONTEXTIDR is not set
+# CONFIG_DEBUG_EFI is not set
+# CONFIG_ARM64_RELOC_TEST is not set
+# CONFIG_CORESIGHT is not set
+# end of arm64 Debugging
+
+#
+# Kernel Testing and Coverage
+#
+# CONFIG_KUNIT is not set
+# CONFIG_NOTIFIER_ERROR_INJECTION is not set
+CONFIG_FUNCTION_ERROR_INJECTION=y
+# CONFIG_FAULT_INJECTION is not set
+CONFIG_ARCH_HAS_KCOV=y
+CONFIG_CC_HAS_SANCOV_TRACE_PC=y
+# CONFIG_KCOV is not set
+CONFIG_RUNTIME_TESTING_MENU=y
+# CONFIG_LKDTM is not set
+# CONFIG_TEST_MIN_HEAP is not set
+# CONFIG_TEST_DIV64 is not set
+# CONFIG_BACKTRACE_SELF_TEST is not set
+# CONFIG_TEST_REF_TRACKER is not set
+# CONFIG_RBTREE_TEST is not set
+# CONFIG_REED_SOLOMON_TEST is not set
+# CONFIG_INTERVAL_TREE_TEST is not set
+# CONFIG_PERCPU_TEST is not set
+# CONFIG_ATOMIC64_SELFTEST is not set
+# CONFIG_ASYNC_RAID6_TEST is not set
+# CONFIG_TEST_HEXDUMP is not set
+# CONFIG_STRING_SELFTEST is not set
+# CONFIG_TEST_STRING_HELPERS is not set
+# CONFIG_TEST_STRSCPY is not set
+# CONFIG_TEST_KSTRTOX is not set
+# CONFIG_TEST_PRINTF is not set
+# CONFIG_TEST_SCANF is not set
+# CONFIG_TEST_BITMAP is not set
+# CONFIG_TEST_UUID is not set
+# CONFIG_TEST_XARRAY is not set
+# CONFIG_TEST_MAPLE_TREE is not set
+# CONFIG_TEST_RHASHTABLE is not set
+# CONFIG_TEST_SIPHASH is not set
+# CONFIG_TEST_IDA is not set
+# CONFIG_TEST_LKM is not set
+# CONFIG_TEST_BITOPS is not set
+# CONFIG_TEST_VMALLOC is not set
+# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_BPF=m
+# CONFIG_TEST_BLACKHOLE_DEV is not set
+# CONFIG_FIND_BIT_BENCHMARK is not set
+# CONFIG_TEST_FIRMWARE is not set
+# CONFIG_TEST_SYSCTL is not set
+# CONFIG_TEST_UDELAY is not set
+# CONFIG_TEST_STATIC_KEYS is not set
+# CONFIG_TEST_KMOD is not set
+# CONFIG_TEST_MEMCAT_P is not set
+# CONFIG_TEST_MEMINIT is not set
+# CONFIG_TEST_FREE_PAGES is not set
+CONFIG_ARCH_USE_MEMTEST=y
+# CONFIG_MEMTEST is not set
+# end of Kernel Testing and Coverage
+
+#
+# Rust hacking
+#
+# end of Rust hacking
+# end of Kernel hacking
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 329dbbd..b87d70b 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -23,7 +23,7 @@
*/
#define HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
#define ARCH_SUPPORTS_FTRACE_OPS 1
#else
#define MCOUNT_ADDR ((unsigned long)_mcount)
@@ -33,8 +33,7 @@
#define MCOUNT_INSN_SIZE AARCH64_INSN_SIZE
#define FTRACE_PLT_IDX 0
-#define FTRACE_REGS_PLT_IDX 1
-#define NR_FTRACE_PLTS 2
+#define NR_FTRACE_PLTS 1
/*
* Currently, gcc tends to save the link register after the local variables
@@ -63,25 +62,82 @@
extern void return_to_handler(void);
-static inline unsigned long ftrace_call_adjust(unsigned long addr)
-{
- /*
- * Adjust addr to point at the BL in the callsite.
- * See ftrace_init_nop() for the callsite sequence.
- */
- if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
- return addr + AARCH64_INSN_SIZE;
- /*
- * addr is the address of the mcount call instruction.
- * recordmcount does the necessary offset calculation.
- */
- return addr;
-}
+unsigned long ftrace_call_adjust(unsigned long addr);
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
struct dyn_ftrace;
struct ftrace_ops;
-struct ftrace_regs;
+
+#define arch_ftrace_get_regs(regs) NULL
+
+/*
+ * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
+ * stack alignment
+ */
+struct ftrace_regs {
+ /* x0 - x8 */
+ unsigned long regs[9];
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ unsigned long direct_tramp;
+#else
+ unsigned long __unused;
+#endif
+
+ unsigned long fp;
+ unsigned long lr;
+
+ unsigned long sp;
+ unsigned long pc;
+};
+
+static __always_inline unsigned long
+ftrace_regs_get_instruction_pointer(const struct ftrace_regs *fregs)
+{
+ return fregs->pc;
+}
+
+static __always_inline void
+ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
+ unsigned long pc)
+{
+ fregs->pc = pc;
+}
+
+static __always_inline unsigned long
+ftrace_regs_get_stack_pointer(const struct ftrace_regs *fregs)
+{
+ return fregs->sp;
+}
+
+static __always_inline unsigned long
+ftrace_regs_get_argument(struct ftrace_regs *fregs, unsigned int n)
+{
+ if (n < 8)
+ return fregs->regs[n];
+ return 0;
+}
+
+static __always_inline unsigned long
+ftrace_regs_get_return_value(const struct ftrace_regs *fregs)
+{
+ return fregs->regs[0];
+}
+
+static __always_inline void
+ftrace_regs_set_return_value(struct ftrace_regs *fregs,
+ unsigned long ret)
+{
+ fregs->regs[0] = ret;
+}
+
+static __always_inline void
+ftrace_override_function_with_return(struct ftrace_regs *fregs)
+{
+ fregs->pc = fregs->lr;
+}
+
+int ftrace_regs_query_register_offset(const char *name);
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
#define ftrace_init_nop ftrace_init_nop
@@ -89,6 +145,19 @@
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
#define ftrace_graph_func ftrace_graph_func
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
+ unsigned long addr)
+{
+ /*
+ * The ftrace trampoline will return to this address instead of the
+ * instrumented function.
+ */
+ fregs->direct_tramp = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
#endif
#define ftrace_return_address(n) return_address(n)
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 834bff7..0cc3048 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -428,6 +428,7 @@
__AARCH64_INSN_FUNCS(clrex, 0xFFFFF0FF, 0xD503305F)
__AARCH64_INSN_FUNCS(ssbb, 0xFFFFFFFF, 0xD503309F)
__AARCH64_INSN_FUNCS(pssbb, 0xFFFFFFFF, 0xD503349F)
+__AARCH64_INSN_FUNCS(bti, 0xFFFFFF3F, 0xD503241f)
#undef __AARCH64_INSN_FUNCS
diff --git a/arch/arm64/include/asm/linkage.h b/arch/arm64/include/asm/linkage.h
index 1436fa1..d3acd9c 100644
--- a/arch/arm64/include/asm/linkage.h
+++ b/arch/arm64/include/asm/linkage.h
@@ -5,8 +5,8 @@
#include <asm/assembler.h>
#endif
-#define __ALIGN .align 2
-#define __ALIGN_STR ".align 2"
+#define __ALIGN .balign CONFIG_FUNCTION_ALIGNMENT
+#define __ALIGN_STR ".balign " #CONFIG_FUNCTION_ALIGNMENT
/*
* When using in-kernel BTI we need to ensure that PCS-conformant
diff --git a/arch/arm64/include/asm/patching.h b/arch/arm64/include/asm/patching.h
index 6bf5adc..68908b8 100644
--- a/arch/arm64/include/asm/patching.h
+++ b/arch/arm64/include/asm/patching.h
@@ -7,6 +7,8 @@
int aarch64_insn_read(void *addr, u32 *insnp);
int aarch64_insn_write(void *addr, u32 insn);
+int aarch64_insn_write_literal_u64(void *addr, u64 val);
+
int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 1197e76..0996094 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -9,6 +9,7 @@
#include <linux/arm_sdei.h>
#include <linux/sched.h>
+#include <linux/ftrace.h>
#include <linux/kexec.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
@@ -82,6 +83,22 @@
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs));
BLANK();
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+ DEFINE(FREGS_X0, offsetof(struct ftrace_regs, regs[0]));
+ DEFINE(FREGS_X2, offsetof(struct ftrace_regs, regs[2]));
+ DEFINE(FREGS_X4, offsetof(struct ftrace_regs, regs[4]));
+ DEFINE(FREGS_X6, offsetof(struct ftrace_regs, regs[6]));
+ DEFINE(FREGS_X8, offsetof(struct ftrace_regs, regs[8]));
+ DEFINE(FREGS_FP, offsetof(struct ftrace_regs, fp));
+ DEFINE(FREGS_LR, offsetof(struct ftrace_regs, lr));
+ DEFINE(FREGS_SP, offsetof(struct ftrace_regs, sp));
+ DEFINE(FREGS_PC, offsetof(struct ftrace_regs, pc));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ DEFINE(FREGS_DIRECT_TRAMP, offsetof(struct ftrace_regs, direct_tramp));
+#endif
+ DEFINE(FREGS_SIZE, sizeof(struct ftrace_regs));
+ BLANK();
+#endif
#ifdef CONFIG_COMPAT
DEFINE(COMPAT_SIGFRAME_REGS_OFFSET, offsetof(struct compat_sigframe, uc.uc_mcontext.arm_r0));
DEFINE(COMPAT_RT_SIGFRAME_REGS_OFFSET, offsetof(struct compat_rt_sigframe, sig.uc.uc_mcontext.arm_r0));
@@ -181,5 +198,11 @@
DEFINE(KIMAGE_START, offsetof(struct kimage, start));
BLANK();
#endif
+#ifdef CONFIG_FUNCTION_TRACER
+ DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
+#endif
+#endif
return 0;
}
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 322a831..079d98c 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -13,90 +13,103 @@
#include <asm/ftrace.h>
#include <asm/insn.h>
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
/*
* Due to -fpatchable-function-entry=2, the compiler has placed two NOPs before
* the regular function prologue. For an enabled callsite, ftrace_init_nop() and
* ftrace_make_call() have patched those NOPs to:
*
* MOV X9, LR
- * BL <entry>
- *
- * ... where <entry> is either ftrace_caller or ftrace_regs_caller.
+ * BL ftrace_caller
*
* Each instrumented function follows the AAPCS, so here x0-x8 and x18-x30 are
* live (x18 holds the Shadow Call Stack pointer), and x9-x17 are safe to
* clobber.
*
- * We save the callsite's context into a pt_regs before invoking any ftrace
- * callbacks. So that we can get a sensible backtrace, we create a stack record
- * for the callsite and the ftrace entry assembly. This is not sufficient for
- * reliable stacktrace: until we create the callsite stack record, its caller
- * is missing from the LR and existing chain of frame records.
+ * We save the callsite's context into a struct ftrace_regs before invoking any
+ * ftrace callbacks. So that we can get a sensible backtrace, we create frame
+ * records for the callsite and the ftrace entry assembly. This is not
+ * sufficient for reliable stacktrace: until we create the callsite stack
+ * record, its caller is missing from the LR and existing chain of frame
+ * records.
*/
- .macro ftrace_regs_entry, allregs=0
- /* Make room for pt_regs, plus a callee frame */
- sub sp, sp, #(PT_REGS_SIZE + 16)
-
- /* Save function arguments (and x9 for simplicity) */
- stp x0, x1, [sp, #S_X0]
- stp x2, x3, [sp, #S_X2]
- stp x4, x5, [sp, #S_X4]
- stp x6, x7, [sp, #S_X6]
- stp x8, x9, [sp, #S_X8]
-
- /* Optionally save the callee-saved registers, always save the FP */
- .if \allregs == 1
- stp x10, x11, [sp, #S_X10]
- stp x12, x13, [sp, #S_X12]
- stp x14, x15, [sp, #S_X14]
- stp x16, x17, [sp, #S_X16]
- stp x18, x19, [sp, #S_X18]
- stp x20, x21, [sp, #S_X20]
- stp x22, x23, [sp, #S_X22]
- stp x24, x25, [sp, #S_X24]
- stp x26, x27, [sp, #S_X26]
- stp x28, x29, [sp, #S_X28]
- .else
- str x29, [sp, #S_FP]
- .endif
-
- /* Save the callsite's SP and LR */
- add x10, sp, #(PT_REGS_SIZE + 16)
- stp x9, x10, [sp, #S_LR]
-
- /* Save the PC after the ftrace callsite */
- str x30, [sp, #S_PC]
-
- /* Create a frame record for the callsite above pt_regs */
- stp x29, x9, [sp, #PT_REGS_SIZE]
- add x29, sp, #PT_REGS_SIZE
-
- /* Create our frame record within pt_regs. */
- stp x29, x30, [sp, #S_STACKFRAME]
- add x29, sp, #S_STACKFRAME
- .endm
-
-SYM_CODE_START(ftrace_regs_caller)
- bti c
- ftrace_regs_entry 1
- b ftrace_common
-SYM_CODE_END(ftrace_regs_caller)
-
SYM_CODE_START(ftrace_caller)
bti c
- ftrace_regs_entry 0
- b ftrace_common
-SYM_CODE_END(ftrace_caller)
-SYM_CODE_START(ftrace_common)
- sub x0, x30, #AARCH64_INSN_SIZE // ip (callsite's BL insn)
- mov x1, x9 // parent_ip (callsite's LR)
- ldr_l x2, function_trace_op // op
- mov x3, sp // regs
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+ /*
+ * The literal pointer to the ops is at an 8-byte aligned boundary
+ * which is either 12 or 16 bytes before the BL instruction in the call
+ * site. See ftrace_call_adjust() for details.
+ *
+ * Therefore here the LR points at `literal + 16` or `literal + 20`,
+ * and we can find the address of the literal in either case by
+ * aligning to an 8-byte boundary and subtracting 16. We do the
+ * alignment first as this allows us to fold the subtraction into the
+ * LDR.
+ */
+ bic x11, x30, 0x7
+ ldr x11, [x11, #-(4 * AARCH64_INSN_SIZE)] // op
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ /*
+ * If the op has a direct call, handle it immediately without
+ * saving/restoring registers.
+ */
+ ldr x17, [x11, #FTRACE_OPS_DIRECT_CALL] // op->direct_call
+ cbnz x17, ftrace_caller_direct
+#endif
+#endif
+
+ /* Save original SP */
+ mov x10, sp
+
+ /* Make room for ftrace regs, plus two frame records */
+ sub sp, sp, #(FREGS_SIZE + 32)
+
+ /* Save function arguments */
+ stp x0, x1, [sp, #FREGS_X0]
+ stp x2, x3, [sp, #FREGS_X2]
+ stp x4, x5, [sp, #FREGS_X4]
+ stp x6, x7, [sp, #FREGS_X6]
+ str x8, [sp, #FREGS_X8]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ str xzr, [sp, #FREGS_DIRECT_TRAMP]
+#endif
+
+ /* Save the callsite's FP, LR, SP */
+ str x29, [sp, #FREGS_FP]
+ str x9, [sp, #FREGS_LR]
+ str x10, [sp, #FREGS_SP]
+
+ /* Save the PC after the ftrace callsite */
+ str x30, [sp, #FREGS_PC]
+
+ /* Create a frame record for the callsite above the ftrace regs */
+ stp x29, x9, [sp, #FREGS_SIZE + 16]
+ add x29, sp, #FREGS_SIZE + 16
+
+ /* Create our frame record above the ftrace regs */
+ stp x29, x30, [sp, #FREGS_SIZE]
+ add x29, sp, #FREGS_SIZE
+
+ /* Prepare arguments for the the tracer func */
+ sub x0, x30, #AARCH64_INSN_SIZE // ip (callsite's BL insn)
+ mov x1, x9 // parent_ip (callsite's LR)
+ mov x3, sp // regs
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+ mov x2, x11 // op
+ ldr x4, [x2, #FTRACE_OPS_FUNC] // op->func
+ blr x4 // op->func(ip, parent_ip, op, regs)
+
+#else
+ ldr_l x2, function_trace_op // op
SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
- bl ftrace_stub
+ bl ftrace_stub // func(ip, parent_ip, op, regs)
+#endif
/*
* At the callsite x0-x8 and x19-x30 were live. Any C code will have preserved
@@ -104,24 +117,68 @@
* to restore x0-x8, x29, and x30.
*/
/* Restore function arguments */
- ldp x0, x1, [sp]
- ldp x2, x3, [sp, #S_X2]
- ldp x4, x5, [sp, #S_X4]
- ldp x6, x7, [sp, #S_X6]
- ldr x8, [sp, #S_X8]
+ ldp x0, x1, [sp, #FREGS_X0]
+ ldp x2, x3, [sp, #FREGS_X2]
+ ldp x4, x5, [sp, #FREGS_X4]
+ ldp x6, x7, [sp, #FREGS_X6]
+ ldr x8, [sp, #FREGS_X8]
- /* Restore the callsite's FP, LR, PC */
- ldr x29, [sp, #S_FP]
- ldr x30, [sp, #S_LR]
- ldr x9, [sp, #S_PC]
+ /* Restore the callsite's FP */
+ ldr x29, [sp, #FREGS_FP]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ ldr x17, [sp, #FREGS_DIRECT_TRAMP]
+ cbnz x17, ftrace_caller_direct_late
+#endif
+
+ /* Restore the callsite's LR and PC */
+ ldr x30, [sp, #FREGS_LR]
+ ldr x9, [sp, #FREGS_PC]
/* Restore the callsite's SP */
- add sp, sp, #PT_REGS_SIZE + 16
+ add sp, sp, #FREGS_SIZE + 32
ret x9
-SYM_CODE_END(ftrace_common)
-#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
+ /*
+ * Head to a direct trampoline in x17 after having run other tracers.
+ * The ftrace_regs are live, and x0-x8 and FP have been restored. The
+ * LR, PC, and SP have not been restored.
+ */
+
+ /*
+ * Restore the callsite's LR and PC matching the trampoline calling
+ * convention.
+ */
+ ldr x9, [sp, #FREGS_LR]
+ ldr x30, [sp, #FREGS_PC]
+
+ /* Restore the callsite's SP */
+ add sp, sp, #FREGS_SIZE + 32
+
+SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
+ /*
+ * Head to a direct trampoline in x17.
+ *
+ * We use `BR X17` as this can safely land on a `BTI C` or `PACIASP` in
+ * the trampoline, and will not unbalance any return stack.
+ */
+ br x17
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+SYM_CODE_END(ftrace_caller)
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_CODE_START(ftrace_stub_direct_tramp)
+ bti c
+ mov x10, x30
+ mov x30, x9
+ ret x10
+SYM_CODE_END(ftrace_stub_direct_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
+#else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
/*
* Gcc with -pg will put the following code in the beginning of each function:
@@ -293,7 +350,7 @@
mcount_exit
SYM_FUNC_END(ftrace_graph_caller)
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
-#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
SYM_TYPED_FUNC_START(ftrace_stub)
ret
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 8745175..890190c 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -17,7 +17,133 @@
#include <asm/insn.h>
#include <asm/patching.h>
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+struct fregs_offset {
+ const char *name;
+ int offset;
+};
+
+#define FREGS_OFFSET(n, field) \
+{ \
+ .name = n, \
+ .offset = offsetof(struct ftrace_regs, field), \
+}
+
+static const struct fregs_offset fregs_offsets[] = {
+ FREGS_OFFSET("x0", regs[0]),
+ FREGS_OFFSET("x1", regs[1]),
+ FREGS_OFFSET("x2", regs[2]),
+ FREGS_OFFSET("x3", regs[3]),
+ FREGS_OFFSET("x4", regs[4]),
+ FREGS_OFFSET("x5", regs[5]),
+ FREGS_OFFSET("x6", regs[6]),
+ FREGS_OFFSET("x7", regs[7]),
+ FREGS_OFFSET("x8", regs[8]),
+
+ FREGS_OFFSET("x29", fp),
+ FREGS_OFFSET("x30", lr),
+ FREGS_OFFSET("lr", lr),
+
+ FREGS_OFFSET("sp", sp),
+ FREGS_OFFSET("pc", pc),
+};
+
+int ftrace_regs_query_register_offset(const char *name)
+{
+ for (int i = 0; i < ARRAY_SIZE(fregs_offsets); i++) {
+ const struct fregs_offset *roff = &fregs_offsets[i];
+ if (!strcmp(roff->name, name))
+ return roff->offset;
+ }
+
+ return -EINVAL;
+}
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE
+unsigned long ftrace_call_adjust(unsigned long addr)
+{
+ /*
+ * When using mcount, addr is the address of the mcount call
+ * instruction, and no adjustment is necessary.
+ */
+ if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_ARGS))
+ return addr;
+
+ /*
+ * When using patchable-function-entry without pre-function NOPS, addr
+ * is the address of the first NOP after the function entry point.
+ *
+ * The compiler has either generated:
+ *
+ * addr+00: func: NOP // To be patched to MOV X9, LR
+ * addr+04: NOP // To be patched to BL <caller>
+ *
+ * Or:
+ *
+ * addr-04: BTI C
+ * addr+00: func: NOP // To be patched to MOV X9, LR
+ * addr+04: NOP // To be patched to BL <caller>
+ *
+ * We must adjust addr to the address of the NOP which will be patched
+ * to `BL <caller>`, which is at `addr + 4` bytes in either case.
+ *
+ */
+ if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
+ return addr + AARCH64_INSN_SIZE;
+
+ /*
+ * When using patchable-function-entry with pre-function NOPs, addr is
+ * the address of the first pre-function NOP.
+ *
+ * Starting from an 8-byte aligned base, the compiler has either
+ * generated:
+ *
+ * addr+00: NOP // Literal (first 32 bits)
+ * addr+04: NOP // Literal (last 32 bits)
+ * addr+08: func: NOP // To be patched to MOV X9, LR
+ * addr+12: NOP // To be patched to BL <caller>
+ *
+ * Or:
+ *
+ * addr+00: NOP // Literal (first 32 bits)
+ * addr+04: NOP // Literal (last 32 bits)
+ * addr+08: func: BTI C
+ * addr+12: NOP // To be patched to MOV X9, LR
+ * addr+16: NOP // To be patched to BL <caller>
+ *
+ * We must adjust addr to the address of the NOP which will be patched
+ * to `BL <caller>`, which is at either addr+12 or addr+16 depending on
+ * whether there is a BTI.
+ */
+
+ if (!IS_ALIGNED(addr, sizeof(unsigned long))) {
+ WARN_RATELIMIT(1, "Misaligned patch-site %pS\n",
+ (void *)(addr + 8));
+ return 0;
+ }
+
+ /* Skip the NOPs placed before the function entry point */
+ addr += 2 * AARCH64_INSN_SIZE;
+
+ /* Skip any BTI */
+ if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) {
+ u32 insn = le32_to_cpu(*(__le32 *)addr);
+
+ if (aarch64_insn_is_bti(insn)) {
+ addr += AARCH64_INSN_SIZE;
+ } else if (insn != aarch64_insn_gen_nop()) {
+ WARN_RATELIMIT(1, "unexpected insn in patch-site %pS: 0x%08x\n",
+ (void *)addr, insn);
+ }
+ }
+
+ /* Skip the first NOP after function entry */
+ addr += AARCH64_INSN_SIZE;
+
+ return addr;
+}
+
/*
* Replace a single instruction, which may be a branch or NOP.
* If @validate == true, a replaced instruction is checked against 'old'.
@@ -56,6 +182,13 @@
unsigned long pc;
u32 new;
+ /*
+ * When using CALL_OPS, the function to call is associated with the
+ * call site, and we don't have a global function pointer to update.
+ */
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
+ return 0;
+
pc = (unsigned long)ftrace_call;
new = aarch64_insn_gen_branch_imm(pc, (unsigned long)func,
AARCH64_INSN_BRANCH_LINK);
@@ -70,13 +203,17 @@
if (addr == FTRACE_ADDR)
return &plt[FTRACE_PLT_IDX];
- if (addr == FTRACE_REGS_ADDR &&
- IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
- return &plt[FTRACE_REGS_PLT_IDX];
#endif
return NULL;
}
+static bool reachable_by_bl(unsigned long addr, unsigned long pc)
+{
+ long offset = (long)addr - (long)pc;
+
+ return offset >= -SZ_128M && offset < SZ_128M;
+}
+
/*
* Find the address the callsite must branch to in order to reach '*addr'.
*
@@ -91,14 +228,21 @@
unsigned long *addr)
{
unsigned long pc = rec->ip;
- long offset = (long)*addr - (long)pc;
struct plt_entry *plt;
/*
+ * If a custom trampoline is unreachable, rely on the ftrace_caller
+ * trampoline which knows how to indirectly reach that trampoline
+ * through ops->direct_call.
+ */
+ if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
+ *addr = FTRACE_ADDR;
+
+ /*
* When the target is within range of the 'BL' instruction, use 'addr'
* as-is and branch to that directly.
*/
- if (offset >= -SZ_128M && offset < SZ_128M)
+ if (reachable_by_bl(*addr, pc))
return true;
/*
@@ -137,6 +281,44 @@
return true;
}
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+static const struct ftrace_ops *arm64_rec_get_ops(struct dyn_ftrace *rec)
+{
+ const struct ftrace_ops *ops = NULL;
+
+ if (rec->flags & FTRACE_FL_CALL_OPS_EN) {
+ ops = ftrace_find_unique_ops(rec);
+ WARN_ON_ONCE(!ops);
+ }
+
+ if (!ops)
+ ops = &ftrace_list_ops;
+
+ return ops;
+}
+
+static int ftrace_rec_set_ops(const struct dyn_ftrace *rec,
+ const struct ftrace_ops *ops)
+{
+ unsigned long literal = ALIGN_DOWN(rec->ip - 12, 8);
+ return aarch64_insn_write_literal_u64((void *)literal,
+ (unsigned long)ops);
+}
+
+static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec)
+{
+ return ftrace_rec_set_ops(rec, &ftrace_nop_ops);
+}
+
+static int ftrace_rec_update_ops(struct dyn_ftrace *rec)
+{
+ return ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+}
+#else
+static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec) { return 0; }
+static int ftrace_rec_update_ops(struct dyn_ftrace *rec) { return 0; }
+#endif
+
/*
* Turn on the call to ftrace_caller() in instrumented function
*/
@@ -144,6 +326,11 @@
{
unsigned long pc = rec->ip;
u32 old, new;
+ int ret;
+
+ ret = ftrace_rec_update_ops(rec);
+ if (ret)
+ return ret;
if (!ftrace_find_callable_addr(rec, NULL, &addr))
return -EINVAL;
@@ -154,12 +341,17 @@
return ftrace_modify_code(pc, old, new, true);
}
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
- unsigned long addr)
+ unsigned long addr)
{
unsigned long pc = rec->ip;
u32 old, new;
+ int ret;
+
+ ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+ if (ret)
+ return ret;
if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
return -EINVAL;
@@ -172,7 +364,9 @@
return ftrace_modify_code(pc, old, new, true);
}
+#endif
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
/*
* The compiler has inserted two NOPs before the regular function prologue.
* All instrumented functions follow the AAPCS, so x0-x8 and x19-x30 are live,
@@ -199,6 +393,11 @@
{
unsigned long pc = rec->ip - AARCH64_INSN_SIZE;
u32 old, new;
+ int ret;
+
+ ret = ftrace_rec_set_nop_ops(rec);
+ if (ret)
+ return ret;
old = aarch64_insn_gen_nop();
new = aarch64_insn_gen_move_reg(AARCH64_INSN_REG_9,
@@ -216,9 +415,14 @@
{
unsigned long pc = rec->ip;
u32 old = 0, new;
+ int ret;
new = aarch64_insn_gen_nop();
+ ret = ftrace_rec_set_nop_ops(rec);
+ if (ret)
+ return ret;
+
/*
* When using mcount, callsites in modules may have been initalized to
* call an arbitrary module PLT (which redirects to the _mcount stub)
@@ -228,7 +432,7 @@
*
* Note: 'mod' is only set at module load time.
*/
- if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) &&
+ if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_ARGS) &&
IS_ENABLED(CONFIG_ARM64_MODULE_PLTS) && mod) {
return aarch64_insn_patch_text_nosync((void *)pc, new);
}
@@ -279,19 +483,11 @@
#ifdef CONFIG_DYNAMIC_FTRACE
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
- /*
- * When DYNAMIC_FTRACE_WITH_REGS is selected, `fregs` can never be NULL
- * and arch_ftrace_get_regs(fregs) will always give a non-NULL pt_regs
- * in which we can safely modify the LR.
- */
- struct pt_regs *regs = arch_ftrace_get_regs(fregs);
- unsigned long *parent = (unsigned long *)&procedure_link_pointer(regs);
-
- prepare_ftrace_return(ip, parent, frame_pointer(regs));
+ prepare_ftrace_return(ip, &fregs->lr, fregs->fp);
}
#else
/*
@@ -323,6 +519,6 @@
{
return ftrace_modify_graph_caller(false);
}
-#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 76b41e4..acd0d88 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -497,9 +497,6 @@
__init_plt(&plts[FTRACE_PLT_IDX], FTRACE_ADDR);
- if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS))
- __init_plt(&plts[FTRACE_REGS_PLT_IDX], FTRACE_REGS_ADDR);
-
mod->arch.ftrace_trampolines = plts;
#endif
return 0;
diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
index 33e0fab..b4835f6 100644
--- a/arch/arm64/kernel/patching.c
+++ b/arch/arm64/kernel/patching.c
@@ -88,6 +88,23 @@
return __aarch64_insn_write(addr, cpu_to_le32(insn));
}
+noinstr int aarch64_insn_write_literal_u64(void *addr, u64 val)
+{
+ u64 *waddr;
+ unsigned long flags;
+ int ret;
+
+ raw_spin_lock_irqsave(&patch_lock, flags);
+ waddr = patch_map(addr, FIX_TEXT_POKE0);
+
+ ret = copy_to_kernel_nofault(waddr, &val, sizeof(val));
+
+ patch_unmap(FIX_TEXT_POKE0);
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+
+ return ret;
+}
+
int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
{
u32 *tp = addr;
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 45131e3..d32b191 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -93,6 +93,7 @@
#ifdef CONFIG_HIBERNATION
#define HIBERNATE_TEXT \
+ ALIGN_FUNCTION(); \
__hibernate_exit_text_start = .; \
*(.hibernate_exit.text) \
__hibernate_exit_text_end = .;
@@ -102,6 +103,7 @@
#ifdef CONFIG_KEXEC_CORE
#define KEXEC_TEXT \
+ ALIGN_FUNCTION(); \
__relocate_new_kernel_start = .; \
*(.kexec_relocate.text) \
__relocate_new_kernel_end = .;
@@ -342,6 +344,8 @@
#ifdef CONFIG_HIBERNATION
ASSERT(__hibernate_exit_text_end - __hibernate_exit_text_start <= SZ_4K,
"Hibernate exit text is bigger than 4 KiB")
+ASSERT(__hibernate_exit_text_start == swsusp_arch_suspend_exit,
+ "Hibernate exit text does not start with swsusp_arch_suspend_exit")
#endif
#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) <= 3*PAGE_SIZE,
@@ -368,4 +372,6 @@
ASSERT(__relocate_new_kernel_end - __relocate_new_kernel_start <= SZ_4K,
"kexec relocation code is bigger than 4 KiB")
ASSERT(KEXEC_CONTROL_PAGE_SIZE >= SZ_4K, "KEXEC_CONTROL_PAGE_SIZE is broken")
+ASSERT(__relocate_new_kernel_start == arm64_relocate_new_kernel,
+ "kexec control page does not start with arm64_relocate_new_kernel")
#endif
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 283b751..25ebc90 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -64,6 +64,7 @@
select NUMA if !FLATMEM
select PCI_MSI_ARCH_FALLBACKS if PCI_MSI
select ZONE_DMA32
+ select FUNCTION_ALIGNMENT_32B
default y
help
The Itanium Processor Family is Intel's 64-bit successor to
diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index 56c4bb2..d553ab7 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -23,7 +23,7 @@
EXTRA :=
cflags-y := -pipe $(EXTRA) -ffixed-r13 -mfixed-range=f12-f15,f32-f127 \
- -falign-functions=32 -frename-registers -fno-optimize-sibling-calls
+ -frename-registers -fno-optimize-sibling-calls
KBUILD_CFLAGS_KERNEL := -mconstant-gp
GAS_STATUS = $(shell $(srctree)/arch/ia64/scripts/check-gas "$(CC)" "$(OBJDUMP)")
diff --git a/arch/powerpc/include/asm/ftrace.h b/arch/powerpc/include/asm/ftrace.h
index e3d1f37..f6c1060 100644
--- a/arch/powerpc/include/asm/ftrace.h
+++ b/arch/powerpc/include/asm/ftrace.h
@@ -37,12 +37,32 @@
return fregs->regs.msr ? &fregs->regs : NULL;
}
-static __always_inline void ftrace_instruction_pointer_set(struct ftrace_regs *fregs,
- unsigned long ip)
+static __always_inline void
+ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
+ unsigned long ip)
{
regs_set_return_ip(&fregs->regs, ip);
}
+static __always_inline unsigned long
+ftrace_regs_get_instruction_pointer(struct ftrace_regs *fregs)
+{
+ return instruction_pointer(&fregs->regs);
+}
+
+#define ftrace_regs_get_argument(fregs, n) \
+ regs_get_kernel_argument(&(fregs)->regs, n)
+#define ftrace_regs_get_stack_pointer(fregs) \
+ kernel_stack_pointer(&(fregs)->regs)
+#define ftrace_regs_return_value(fregs) \
+ regs_return_value(&(fregs)->regs)
+#define ftrace_regs_set_return_value(fregs, ret) \
+ regs_set_return_value(&(fregs)->regs, ret)
+#define ftrace_override_function_with_return(fregs) \
+ override_function_with_return(&(fregs)->regs)
+#define ftrace_regs_query_register_offset(name) \
+ regs_query_register_offset(name)
+
struct ftrace_ops;
#define ftrace_graph_func ftrace_graph_func
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 6f80ec9..e5c5cb1 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -54,12 +54,33 @@
return NULL;
}
-static __always_inline void ftrace_instruction_pointer_set(struct ftrace_regs *fregs,
- unsigned long ip)
+static __always_inline unsigned long
+ftrace_regs_get_instruction_pointer(const struct ftrace_regs *fregs)
+{
+ return fregs->regs.psw.addr;
+}
+
+static __always_inline void
+ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
+ unsigned long ip)
{
fregs->regs.psw.addr = ip;
}
+#define ftrace_regs_get_argument(fregs, n) \
+ regs_get_kernel_argument(&(fregs)->regs, n)
+#define ftrace_regs_get_stack_pointer(fregs) \
+ kernel_stack_pointer(&(fregs)->regs)
+#define ftrace_regs_return_value(fregs) \
+ regs_return_value(&(fregs)->regs)
+#define ftrace_regs_set_return_value(fregs, ret) \
+ regs_set_return_value(&(fregs)->regs, ret)
+#define ftrace_override_function_with_return(fregs) \
+ override_function_with_return(&(fregs)->regs)
+#define ftrace_regs_query_register_offset(name) \
+ regs_query_register_offset(name)
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
/*
* When an ftrace registered caller is tracing a function that is
* also set by a register_ftrace_direct() call, it needs to be
@@ -67,10 +88,12 @@
* place the direct caller in the ORIG_GPR2 part of pt_regs. This
* tells the ftrace_caller that there's a direct caller.
*/
-static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs, unsigned long addr)
{
+ struct pt_regs *regs = &fregs->regs;
regs->orig_gpr2 = addr;
}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
/*
* Even though the system call numbers are identical for s390/s390x a
diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
index 4786bfe..ad13a0e 100644
--- a/arch/s390/kernel/mcount.S
+++ b/arch/s390/kernel/mcount.S
@@ -32,6 +32,11 @@
BR_EX %r14
ENDPROC(ftrace_stub)
+SYM_CODE_START(ftrace_stub_direct_tramp)
+ lgr %r1, %r0
+ BR_EX %r1
+SYM_CODE_END(ftrace_stub_direct_tramp)
+
.macro ftrace_regs_entry, allregs=0
stg %r14,(__SF_GPRS+8*8)(%r15) # save traced function caller
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4c9bfc4..3969623 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -292,6 +292,8 @@
select X86_FEATURE_NAMES if PROC_FS
select PROC_PID_ARCH_STATUS if PROC_FS
select HAVE_ARCH_NODE_DEV_GROUP if X86_SGX
+ select FUNCTION_ALIGNMENT_16B if X86_64 || X86_ALIGNMENT_16
+ select FUNCTION_ALIGNMENT_4B
imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
@@ -885,9 +887,11 @@
bool "Intel TDX (Trust Domain Extensions) - Guest Support"
depends on X86_64 && CPU_SUP_INTEL
depends on X86_X2APIC
+ depends on EFI_STUB
select ARCH_HAS_CC_PLATFORM
select X86_MEM_ENCRYPT
select X86_MCE
+ select UNACCEPTED_MEMORY
help
Support running as a guest under Intel TDX. Without this support,
the guest kernel can not boot or run under TDX.
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index 15b7b40..fbfcc19 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -106,7 +106,8 @@
endif
vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o
-vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o
+vmlinux-objs-$(CONFIG_INTEL_TDX_GUEST) += $(obj)/tdx.o $(obj)/tdcall.o $(obj)/tdx-shared.o
+vmlinux-objs-$(CONFIG_UNACCEPTED_MEMORY) += $(obj)/mem.o
vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
vmlinux-objs-$(CONFIG_EFI) += $(obj)/efi.o
diff --git a/arch/x86/boot/compressed/efi.h b/arch/x86/boot/compressed/efi.h
index 7db2f41b..866c0af 100644
--- a/arch/x86/boot/compressed/efi.h
+++ b/arch/x86/boot/compressed/efi.h
@@ -16,6 +16,7 @@
#define ACPI_TABLE_GUID EFI_GUID(0xeb9d2d30, 0x2d88, 0x11d3, 0x9a, 0x16, 0x00, 0x90, 0x27, 0x3f, 0xc1, 0x4d)
#define ACPI_20_TABLE_GUID EFI_GUID(0x8868e871, 0xe4f1, 0x11d3, 0xbc, 0x22, 0x00, 0x80, 0xc7, 0x3c, 0x88, 0x81)
#define EFI_CC_BLOB_GUID EFI_GUID(0x067b1f5f, 0xcf26, 0x44c5, 0x85, 0x54, 0x93, 0xd7, 0x77, 0x91, 0x2d, 0x42)
+#define LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID EFI_GUID(0xd5d1de3c, 0x105c, 0x44f9, 0x9e, 0xa9, 0xbc, 0xef, 0x98, 0x12, 0x00, 0x31)
#define EFI32_LOADER_SIGNATURE "EL32"
#define EFI64_LOADER_SIGNATURE "EL64"
@@ -32,6 +33,7 @@
} efi_table_hdr_t;
#define EFI_CONVENTIONAL_MEMORY 7
+#define EFI_UNACCEPTED_MEMORY 15
#define EFI_MEMORY_MORE_RELIABLE \
((u64)0x0000000000010000ULL) /* higher reliability */
@@ -104,6 +106,14 @@
u64 reserved[8];
};
+struct efi_unaccepted_memory {
+ u32 version;
+ u32 unit_size;
+ u64 phys_base;
+ u64 size;
+ unsigned long bitmap[];
+};
+
static inline int efi_guidcmp (efi_guid_t left, efi_guid_t right)
{
return memcmp(&left, &right, sizeof (efi_guid_t));
diff --git a/arch/x86/boot/compressed/error.c b/arch/x86/boot/compressed/error.c
index c881878..5313c5c 100644
--- a/arch/x86/boot/compressed/error.c
+++ b/arch/x86/boot/compressed/error.c
@@ -22,3 +22,22 @@
while (1)
asm("hlt");
}
+
+/* EFI libstub provides vsnprintf() */
+#ifdef CONFIG_EFI_STUB
+void panic(const char *fmt, ...)
+{
+ static char buf[1024];
+ va_list args;
+ int len;
+
+ va_start(args, fmt);
+ len = vsnprintf(buf, sizeof(buf), fmt, args);
+ va_end(args);
+
+ if (len && buf[len - 1] == '\n')
+ buf[len - 1] = '\0';
+
+ error(buf);
+}
+#endif
diff --git a/arch/x86/boot/compressed/error.h b/arch/x86/boot/compressed/error.h
index 1de5821..86fe33b9 100644
--- a/arch/x86/boot/compressed/error.h
+++ b/arch/x86/boot/compressed/error.h
@@ -6,5 +6,6 @@
void warn(char *m);
void error(char *m) __noreturn;
+void panic(const char *fmt, ...) __noreturn __cold;
#endif /* BOOT_COMPRESSED_ERROR_H */
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index b4bd6df..4a532c5 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -38,6 +38,14 @@
#include "pgtable.h"
/*
+ * Fix alignment at 16 bytes. Following CONFIG_FUNCTION_ALIGNMENT will result
+ * in assembly errors due to trying to move .org backward due to the excessive
+ * alignment.
+ */
+#undef __ALIGN
+#define __ALIGN .balign 16, 0x90
+
+/*
* Locally defined symbols should be marked hidden:
*/
.hidden _bss
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index b415527..08f93b0 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -8,14 +8,6 @@
* Copyright (C) 2016 Kees Cook
*/
-/*
- * Since we're dealing with identity mappings, physical and virtual
- * addresses are the same, so override these defines which are ultimately
- * used by the headers in misc.h.
- */
-#define __pa(x) ((unsigned long)(x))
-#define __va(x) ((void *)((unsigned long)(x)))
-
/* No PAGE_TABLE_ISOLATION support needed either: */
#undef CONFIG_PAGE_TABLE_ISOLATION
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index e476bcb..1e72a63 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -672,6 +672,33 @@
}
#ifdef CONFIG_EFI
+
+/*
+ * Only EFI_CONVENTIONAL_MEMORY and EFI_UNACCEPTED_MEMORY (if supported) are
+ * guaranteed to be free.
+ *
+ * Pick free memory more conservatively than the EFI spec allows: according to
+ * the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also free memory and thus
+ * available to place the kernel image into, but in practice there's firmware
+ * where using that memory leads to crashes. Buggy vendor EFI code registers
+ * for an event that triggers on SetVirtualAddressMap(). The handler assumes
+ * that EFI_BOOT_SERVICES_DATA memory has not been touched by loader yet, which
+ * is probably true for Windows.
+ *
+ * Preserve EFI_BOOT_SERVICES_* regions until after SetVirtualAddressMap().
+ */
+static inline bool memory_type_is_free(efi_memory_desc_t *md)
+{
+ if (md->type == EFI_CONVENTIONAL_MEMORY)
+ return true;
+
+ if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) &&
+ md->type == EFI_UNACCEPTED_MEMORY)
+ return true;
+
+ return false;
+}
+
/*
* Returns true if we processed the EFI memmap, which we prefer over the E820
* table if it is available.
@@ -716,18 +743,7 @@
for (i = 0; i < nr_desc; i++) {
md = efi_early_memdesc_ptr(pmap, e->efi_memdesc_size, i);
- /*
- * Here we are more conservative in picking free memory than
- * the EFI spec allows:
- *
- * According to the spec, EFI_BOOT_SERVICES_{CODE|DATA} are also
- * free memory and thus available to place the kernel image into,
- * but in practice there's firmware where using that memory leads
- * to crashes.
- *
- * Only EFI_CONVENTIONAL_MEMORY is guaranteed to be free.
- */
- if (md->type != EFI_CONVENTIONAL_MEMORY)
+ if (!memory_type_is_free(md))
continue;
if (efi_soft_reserve_enabled() &&
diff --git a/arch/x86/boot/compressed/mem.c b/arch/x86/boot/compressed/mem.c
new file mode 100644
index 0000000..f04b29f
--- /dev/null
+++ b/arch/x86/boot/compressed/mem.c
@@ -0,0 +1,83 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "error.h"
+#include "misc.h"
+#include "tdx.h"
+#include <asm/shared/tdx.h>
+
+/*
+ * accept_memory() and process_unaccepted_memory() called from EFI stub which
+ * runs before decompresser and its early_tdx_detect().
+ *
+ * Enumerate TDX directly from the early users.
+ */
+static bool early_is_tdx_guest(void)
+{
+ static bool once;
+ static bool is_tdx;
+
+ if (!IS_ENABLED(CONFIG_INTEL_TDX_GUEST))
+ return false;
+
+ if (!once) {
+ u32 eax, sig[3];
+
+ cpuid_count(TDX_CPUID_LEAF_ID, 0, &eax,
+ &sig[0], &sig[2], &sig[1]);
+ is_tdx = !memcmp(TDX_IDENT, sig, sizeof(sig));
+ once = true;
+ }
+
+ return is_tdx;
+}
+
+void arch_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ /* Platform-specific memory-acceptance call goes here */
+ if (early_is_tdx_guest()) {
+ if (!tdx_accept_memory(start, end))
+ panic("TDX: Failed to accept memory\n");
+ } else {
+ error("Cannot accept memory: unknown platform\n");
+ }
+}
+
+bool init_unaccepted_memory(void)
+{
+ guid_t guid = LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID;
+ struct efi_unaccepted_memory *table;
+ unsigned long cfg_table_pa;
+ unsigned int cfg_table_len;
+ enum efi_type et;
+ int ret;
+
+ et = efi_get_type(boot_params);
+ if (et == EFI_TYPE_NONE)
+ return false;
+
+ ret = efi_get_conf_table(boot_params, &cfg_table_pa, &cfg_table_len);
+ if (ret) {
+ warn("EFI config table not found.");
+ return false;
+ }
+
+ table = (void *)efi_find_vendor_table(boot_params, cfg_table_pa,
+ cfg_table_len, guid);
+ if (!table)
+ return false;
+
+ if (table->version != 1)
+ error("Unknown version of unaccepted memory table\n");
+
+ /*
+ * In many cases unaccepted_table is already set by EFI stub, but it
+ * has to be initialized again to cover cases when the table is not
+ * allocated by EFI stub or EFI stub copied the kernel image with
+ * efi_relocate_kernel() before the variable is set.
+ *
+ * It must be initialized before the first usage of accept_memory().
+ */
+ unaccepted_table = table;
+
+ return true;
+}
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index cf690d8..d99d7c9 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -454,6 +454,12 @@
#endif
debug_putstr("\nDecompressing Linux... ");
+
+ if (init_unaccepted_memory()) {
+ debug_putstr("Accepting memory... ");
+ accept_memory(__pa(output), __pa(output) + needed_size);
+ }
+
__decompress(input_data, input_len, NULL, NULL, output, output_len,
NULL, error);
parse_elf(output);
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 20118fb..964fe90 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -19,6 +19,15 @@
/* cpu_feature_enabled() cannot be used this early */
#define USE_EARLY_PGTABLE_L5
+/*
+ * Boot stub deals with identity mappings, physical and virtual addresses are
+ * the same, so override these defines.
+ *
+ * <asm/page.h> will not define them if they are already defined.
+ */
+#define __pa(x) ((unsigned long)(x))
+#define __va(x) ((void *)((unsigned long)(x)))
+
#include <linux/linkage.h>
#include <linux/screen_info.h>
#include <linux/elf.h>
@@ -238,4 +247,14 @@
}
#endif /* CONFIG_EFI */
+#ifdef CONFIG_UNACCEPTED_MEMORY
+bool init_unaccepted_memory(void);
+#else
+static inline bool init_unaccepted_memory(void) { return false; }
+#endif
+
+/* Defined in EFI stub */
+extern struct efi_unaccepted_memory *unaccepted_table;
+void accept_memory(phys_addr_t start, phys_addr_t end);
+
#endif /* BOOT_COMPRESSED_MISC_H */
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index 9c91cc4..1b1b554 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -114,9 +114,7 @@
}
#undef __init
-#undef __pa
#define __init
-#define __pa(x) ((unsigned long)(x))
#define __BOOT_COMPRESSED
diff --git a/arch/x86/boot/compressed/tdx-shared.c b/arch/x86/boot/compressed/tdx-shared.c
new file mode 100644
index 0000000..5ac4376
--- /dev/null
+++ b/arch/x86/boot/compressed/tdx-shared.c
@@ -0,0 +1,2 @@
+#include "error.h"
+#include "../../coco/tdx/tdx-shared.c"
diff --git a/arch/x86/boot/compressed/tdx.c b/arch/x86/boot/compressed/tdx.c
index 918a760..2d81d3c 100644
--- a/arch/x86/boot/compressed/tdx.c
+++ b/arch/x86/boot/compressed/tdx.c
@@ -26,7 +26,7 @@
.r14 = port,
};
- if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
+ if (__tdx_hypercall_ret(&args))
return UINT_MAX;
return args.r11;
@@ -43,7 +43,7 @@
.r15 = value,
};
- __tdx_hypercall(&args, 0);
+ __tdx_hypercall(&args);
}
static inline u8 tdx_inb(u16 port)
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S
index 112b237..b22f34b 100644
--- a/arch/x86/boot/compressed/vmlinux.lds.S
+++ b/arch/x86/boot/compressed/vmlinux.lds.S
@@ -34,6 +34,7 @@
_text = .; /* Text */
*(.text)
*(.text.*)
+ *(.noinstr.text)
_etext = . ;
}
.rodata : {
diff --git a/arch/x86/coco/tdx/Makefile b/arch/x86/coco/tdx/Makefile
index 46c5599..2c7dcbf 100644
--- a/arch/x86/coco/tdx/Makefile
+++ b/arch/x86/coco/tdx/Makefile
@@ -1,3 +1,3 @@
# SPDX-License-Identifier: GPL-2.0
-obj-y += tdx.o tdcall.o
+obj-y += tdx.o tdx-shared.o tdcall.o
diff --git a/arch/x86/coco/tdx/tdcall.S b/arch/x86/coco/tdx/tdcall.S
index f9eb113..b193c0a 100644
--- a/arch/x86/coco/tdx/tdcall.S
+++ b/arch/x86/coco/tdx/tdcall.S
@@ -13,6 +13,12 @@
/*
* Bitmasks of exposed registers (with VMM).
*/
+#define TDX_RDX BIT(2)
+#define TDX_RBX BIT(3)
+#define TDX_RSI BIT(6)
+#define TDX_RDI BIT(7)
+#define TDX_R8 BIT(8)
+#define TDX_R9 BIT(9)
#define TDX_R10 BIT(10)
#define TDX_R11 BIT(11)
#define TDX_R12 BIT(12)
@@ -27,9 +33,11 @@
* details can be found in TDX GHCI specification, section
* titled "TDCALL [TDG.VP.VMCALL] leaf".
*/
-#define TDVMCALL_EXPOSE_REGS_MASK ( TDX_R10 | TDX_R11 | \
- TDX_R12 | TDX_R13 | \
- TDX_R14 | TDX_R15 )
+#define TDVMCALL_EXPOSE_REGS_MASK \
+ ( TDX_RDX | TDX_RBX | TDX_RSI | TDX_RDI | TDX_R8 | TDX_R9 | \
+ TDX_R10 | TDX_R11 | TDX_R12 | TDX_R13 | TDX_R14 | TDX_R15 )
+
+.section .noinstr.text, "ax"
/*
* __tdx_module_call() - Used by TDX guests to request services from
@@ -77,12 +85,12 @@
SYM_FUNC_END(__tdx_module_call)
/*
- * __tdx_hypercall() - Make hypercalls to a TDX VMM using TDVMCALL leaf
- * of TDCALL instruction
+ * TDX_HYPERCALL - Make hypercalls to a TDX VMM using TDVMCALL leaf of TDCALL
+ * instruction
*
* Transforms values in function call argument struct tdx_hypercall_args @args
* into the TDCALL register ABI. After TDCALL operation, VMM output is saved
- * back in @args.
+ * back in @args, if \ret is 1.
*
*-------------------------------------------------------------------------
* TD VMCALL ABI:
@@ -97,26 +105,18 @@
* specification. Non zero value indicates vendor
* specific ABI.
* R11 - VMCALL sub function number
- * RBX, RBP, RDI, RSI - Used to pass VMCALL sub function specific arguments.
+ * RBX, RDX, RDI, RSI - Used to pass VMCALL sub function specific arguments.
* R8-R9, R12-R15 - Same as above.
*
* Output Registers:
*
* RAX - TDCALL instruction status (Not related to hypercall
* output).
- * R10 - Hypercall output error code.
- * R11-R15 - Hypercall sub function specific output values.
+ * RBX, RDX, RDI, RSI - Hypercall sub function specific output values.
+ * R8-R15 - Same as above.
*
- *-------------------------------------------------------------------------
- *
- * __tdx_hypercall() function ABI:
- *
- * @args (RDI) - struct tdx_hypercall_args for input and output
- * @flags (RSI) - TDX_HCALL_* flags
- *
- * On successful completion, return the hypercall error code.
*/
-SYM_FUNC_START(__tdx_hypercall)
+.macro TDX_HYPERCALL ret:req
FRAME_BEGIN
/* Save callee-saved GPRs as mandated by the x86_64 ABI */
@@ -124,38 +124,37 @@
push %r14
push %r13
push %r12
+ push %rbx
+
+ /* Free RDI to be used as TDVMCALL arguments */
+ movq %rdi, %rax
+
+ /* Copy hypercall registers from arg struct: */
+ movq TDX_HYPERCALL_r8(%rax), %r8
+ movq TDX_HYPERCALL_r9(%rax), %r9
+ movq TDX_HYPERCALL_r10(%rax), %r10
+ movq TDX_HYPERCALL_r11(%rax), %r11
+ movq TDX_HYPERCALL_r12(%rax), %r12
+ movq TDX_HYPERCALL_r13(%rax), %r13
+ movq TDX_HYPERCALL_r14(%rax), %r14
+ movq TDX_HYPERCALL_r15(%rax), %r15
+ movq TDX_HYPERCALL_rdi(%rax), %rdi
+ movq TDX_HYPERCALL_rsi(%rax), %rsi
+ movq TDX_HYPERCALL_rbx(%rax), %rbx
+ movq TDX_HYPERCALL_rdx(%rax), %rdx
+
+ push %rax
/* Mangle function call ABI into TDCALL ABI: */
/* Set TDCALL leaf ID (TDVMCALL (0)) in RAX */
xor %eax, %eax
- /* Copy hypercall registers from arg struct: */
- movq TDX_HYPERCALL_r10(%rdi), %r10
- movq TDX_HYPERCALL_r11(%rdi), %r11
- movq TDX_HYPERCALL_r12(%rdi), %r12
- movq TDX_HYPERCALL_r13(%rdi), %r13
- movq TDX_HYPERCALL_r14(%rdi), %r14
- movq TDX_HYPERCALL_r15(%rdi), %r15
-
movl $TDVMCALL_EXPOSE_REGS_MASK, %ecx
- /*
- * For the idle loop STI needs to be called directly before the TDCALL
- * that enters idle (EXIT_REASON_HLT case). STI instruction enables
- * interrupts only one instruction later. If there is a window between
- * STI and the instruction that emulates the HALT state, there is a
- * chance for interrupts to happen in this window, which can delay the
- * HLT operation indefinitely. Since this is the not the desired
- * result, conditionally call STI before TDCALL.
- */
- testq $TDX_HCALL_ISSUE_STI, %rsi
- jz .Lskip_sti
- sti
-.Lskip_sti:
tdcall
/*
- * RAX==0 indicates a failure of the TDVMCALL mechanism itself and that
+ * RAX!=0 indicates a failure of the TDVMCALL mechanism itself and that
* something has gone horribly wrong with the TDX module.
*
* The return status of the hypercall operation is in a separate
@@ -163,32 +162,43 @@
* and are handled by callers.
*/
testq %rax, %rax
- jne .Lpanic
+ jne .Lpanic\@
+
+ pop %rax
+
+ .if \ret
+ movq %r8, TDX_HYPERCALL_r8(%rax)
+ movq %r9, TDX_HYPERCALL_r9(%rax)
+ movq %r10, TDX_HYPERCALL_r10(%rax)
+ movq %r11, TDX_HYPERCALL_r11(%rax)
+ movq %r12, TDX_HYPERCALL_r12(%rax)
+ movq %r13, TDX_HYPERCALL_r13(%rax)
+ movq %r14, TDX_HYPERCALL_r14(%rax)
+ movq %r15, TDX_HYPERCALL_r15(%rax)
+ movq %rdi, TDX_HYPERCALL_rdi(%rax)
+ movq %rsi, TDX_HYPERCALL_rsi(%rax)
+ movq %rbx, TDX_HYPERCALL_rbx(%rax)
+ movq %rdx, TDX_HYPERCALL_rdx(%rax)
+ .endif
/* TDVMCALL leaf return code is in R10 */
movq %r10, %rax
- /* Copy hypercall result registers to arg struct if needed */
- testq $TDX_HCALL_HAS_OUTPUT, %rsi
- jz .Lout
-
- movq %r10, TDX_HYPERCALL_r10(%rdi)
- movq %r11, TDX_HYPERCALL_r11(%rdi)
- movq %r12, TDX_HYPERCALL_r12(%rdi)
- movq %r13, TDX_HYPERCALL_r13(%rdi)
- movq %r14, TDX_HYPERCALL_r14(%rdi)
- movq %r15, TDX_HYPERCALL_r15(%rdi)
-.Lout:
/*
* Zero out registers exposed to the VMM to avoid speculative execution
* with VMM-controlled values. This needs to include all registers
- * present in TDVMCALL_EXPOSE_REGS_MASK (except R12-R15). R12-R15
- * context will be restored.
+ * present in TDVMCALL_EXPOSE_REGS_MASK, except RBX, and R12-R15 which
+ * will be restored.
*/
+ xor %r8d, %r8d
+ xor %r9d, %r9d
xor %r10d, %r10d
xor %r11d, %r11d
+ xor %rdi, %rdi
+ xor %rdx, %rdx
/* Restore callee-saved GPRs as mandated by the x86_64 ABI */
+ pop %rbx
pop %r12
pop %r13
pop %r14
@@ -197,9 +207,33 @@
FRAME_END
RET
-.Lpanic:
+.Lpanic\@:
call __tdx_hypercall_failed
/* __tdx_hypercall_failed never returns */
REACHABLE
- jmp .Lpanic
+ jmp .Lpanic\@
+.endm
+
+/*
+ *
+ * __tdx_hypercall() function ABI:
+ *
+ * @args (RDI) - struct tdx_hypercall_args for input
+ *
+ * On successful completion, return the hypercall error code.
+ */
+SYM_FUNC_START(__tdx_hypercall)
+ TDX_HYPERCALL ret=0
SYM_FUNC_END(__tdx_hypercall)
+
+/*
+ *
+ * __tdx_hypercall_ret() function ABI:
+ *
+ * @args (RDI) - struct tdx_hypercall_args for input and output
+ *
+ * On successful completion, return the hypercall error code.
+ */
+SYM_FUNC_START(__tdx_hypercall_ret)
+ TDX_HYPERCALL ret=1
+SYM_FUNC_END(__tdx_hypercall_ret)
diff --git a/arch/x86/coco/tdx/tdx-shared.c b/arch/x86/coco/tdx/tdx-shared.c
new file mode 100644
index 0000000..ef20ddc
--- /dev/null
+++ b/arch/x86/coco/tdx/tdx-shared.c
@@ -0,0 +1,71 @@
+#include <asm/tdx.h>
+#include <asm/pgtable.h>
+
+static unsigned long try_accept_one(phys_addr_t start, unsigned long len,
+ enum pg_level pg_level)
+{
+ unsigned long accept_size = page_level_size(pg_level);
+ u64 tdcall_rcx;
+ u8 page_size;
+
+ if (!IS_ALIGNED(start, accept_size))
+ return 0;
+
+ if (len < accept_size)
+ return 0;
+
+ /*
+ * Pass the page physical address to the TDX module to accept the
+ * pending, private page.
+ *
+ * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
+ */
+ switch (pg_level) {
+ case PG_LEVEL_4K:
+ page_size = 0;
+ break;
+ case PG_LEVEL_2M:
+ page_size = 1;
+ break;
+ case PG_LEVEL_1G:
+ page_size = 2;
+ break;
+ default:
+ return 0;
+ }
+
+ tdcall_rcx = start | page_size;
+ if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
+ return 0;
+
+ return accept_size;
+}
+
+bool tdx_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ /*
+ * For shared->private conversion, accept the page using
+ * TDX_ACCEPT_PAGE TDX module call.
+ */
+ while (start < end) {
+ unsigned long len = end - start;
+ unsigned long accept_size;
+
+ /*
+ * Try larger accepts first. It gives chance to VMM to keep
+ * 1G/2M Secure EPT entries where possible and speeds up
+ * process by cutting number of hypercalls (if successful).
+ */
+
+ accept_size = try_accept_one(start, len, PG_LEVEL_1G);
+ if (!accept_size)
+ accept_size = try_accept_one(start, len, PG_LEVEL_2M);
+ if (!accept_size)
+ accept_size = try_accept_one(start, len, PG_LEVEL_4K);
+ if (!accept_size)
+ return false;
+ start += accept_size;
+ }
+
+ return true;
+}
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index d0565a9..a7edf50 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -5,6 +5,8 @@
#define pr_fmt(fmt) "tdx: " fmt
#include <linux/cpufeature.h>
+#include <linux/export.h>
+#include <linux/io.h>
#include <asm/coco.h>
#include <asm/tdx.h>
#include <asm/vmx.h>
@@ -13,14 +15,6 @@
#include <asm/insn-eval.h>
#include <asm/pgtable.h>
-/* TDX module Call Leaf IDs */
-#define TDX_GET_INFO 1
-#define TDX_GET_VEINFO 3
-#define TDX_ACCEPT_PAGE 6
-
-/* TDX hypercall Leaf IDs */
-#define TDVMCALL_MAP_GPA 0x10001
-
/* MMIO direction */
#define EPT_READ 0
#define EPT_WRITE 1
@@ -35,29 +29,19 @@
#define VE_GET_PORT_NUM(e) ((e) >> 16)
#define VE_IS_IO_STRING(e) ((e) & BIT(4))
+#define ATTR_DEBUG BIT(0)
#define ATTR_SEPT_VE_DISABLE BIT(28)
-/*
- * Wrapper for standard use of __tdx_hypercall with no output aside from
- * return code.
- */
-static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
-{
- struct tdx_hypercall_args args = {
- .r10 = TDX_HYPERCALL_STANDARD,
- .r11 = fn,
- .r12 = r12,
- .r13 = r13,
- .r14 = r14,
- .r15 = r15,
- };
+/* TDX Module call error codes */
+#define TDCALL_RETURN_CODE(a) ((a) >> 32)
+#define TDCALL_INVALID_OPERAND 0xc0000100
- return __tdx_hypercall(&args, 0);
-}
+#define TDREPORT_SUBTYPE_0 0
/* Called from __tdx_hypercall() for unrecoverable failure */
-void __tdx_hypercall_failed(void)
+noinstr void __tdx_hypercall_failed(void)
{
+ instrumentation_begin();
panic("TDVMCALL failed. TDX module bug?");
}
@@ -67,7 +51,7 @@
* Reusing the KVM EXIT_REASON macros makes it easier to connect the host and
* guest sides of these calls.
*/
-static u64 hcall_func(u64 exit_reason)
+static __always_inline u64 hcall_func(u64 exit_reason)
{
return exit_reason;
}
@@ -84,7 +68,7 @@
.r14 = p4,
};
- return __tdx_hypercall(&args, 0);
+ return __tdx_hypercall(&args);
}
EXPORT_SYMBOL_GPL(tdx_kvm_hypercall);
#endif
@@ -101,6 +85,72 @@
panic("TDCALL %lld failed (Buggy TDX module!)\n", fn);
}
+/**
+ * tdx_mcall_get_report0() - Wrapper to get TDREPORT0 (a.k.a. TDREPORT
+ * subtype 0) using TDG.MR.REPORT TDCALL.
+ * @reportdata: Address of the input buffer which contains user-defined
+ * REPORTDATA to be included into TDREPORT.
+ * @tdreport: Address of the output buffer to store TDREPORT.
+ *
+ * Refer to section titled "TDG.MR.REPORT leaf" in the TDX Module
+ * v1.0 specification for more information on TDG.MR.REPORT TDCALL.
+ * It is used in the TDX guest driver module to get the TDREPORT0.
+ *
+ * Return 0 on success, -EINVAL for invalid operands, or -EIO on
+ * other TDCALL failures.
+ */
+int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport)
+{
+ u64 ret;
+
+ ret = __tdx_module_call(TDX_GET_REPORT, virt_to_phys(tdreport),
+ virt_to_phys(reportdata), TDREPORT_SUBTYPE_0,
+ 0, NULL);
+ if (ret) {
+ if (TDCALL_RETURN_CODE(ret) == TDCALL_INVALID_OPERAND)
+ return -EINVAL;
+ return -EIO;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(tdx_mcall_get_report0);
+
+static void __noreturn tdx_panic(const char *msg)
+{
+ struct tdx_hypercall_args args = {
+ .r10 = TDX_HYPERCALL_STANDARD,
+ .r11 = TDVMCALL_REPORT_FATAL_ERROR,
+ .r12 = 0, /* Error code: 0 is Panic */
+ };
+ union {
+ /* Define register order according to the GHCI */
+ struct { u64 r14, r15, rbx, rdi, rsi, r8, r9, rdx; };
+
+ char str[64];
+ } message;
+
+ /* VMM assumes '\0' in byte 65, if the message took all 64 bytes */
+ strncpy(message.str, msg, 64);
+
+ args.r8 = message.r8;
+ args.r9 = message.r9;
+ args.r14 = message.r14;
+ args.r15 = message.r15;
+ args.rdi = message.rdi;
+ args.rsi = message.rsi;
+ args.rbx = message.rbx;
+ args.rdx = message.rdx;
+
+ /*
+ * This hypercall should never return and it is not safe
+ * to keep the guest running. Call it forever if it
+ * happens to return.
+ */
+ while (1)
+ __tdx_hypercall(&args);
+}
+
static void tdx_parse_tdinfo(u64 *cc_mask)
{
struct tdx_module_output out;
@@ -132,8 +182,15 @@
* TD-private memory. Only VMM-shared memory (MMIO) will #VE.
*/
td_attr = out.rdx;
- if (!(td_attr & ATTR_SEPT_VE_DISABLE))
- panic("TD misconfiguration: SEPT_VE_DISABLE attibute must be set.\n");
+ if (!(td_attr & ATTR_SEPT_VE_DISABLE)) {
+ const char *msg = "TD misconfiguration: SEPT_VE_DISABLE attribute must be set.";
+
+ /* Relax SEPT_VE_DISABLE check for debug TD. */
+ if (td_attr & ATTR_DEBUG)
+ pr_warn("%s\n", msg);
+ else
+ tdx_panic(msg);
+ }
}
/*
@@ -181,7 +238,7 @@
}
}
-static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
+static u64 __cpuidle __halt(const bool irq_disabled)
{
struct tdx_hypercall_args args = {
.r10 = TDX_HYPERCALL_STANDARD,
@@ -201,20 +258,14 @@
* can keep the vCPU in virtual HLT, even if an IRQ is
* pending, without hanging/breaking the guest.
*/
- return __tdx_hypercall(&args, do_sti ? TDX_HCALL_ISSUE_STI : 0);
+ return __tdx_hypercall(&args);
}
static int handle_halt(struct ve_info *ve)
{
- /*
- * Since non safe halt is mainly used in CPU offlining
- * and the guest will always stay in the halt state, don't
- * call the STI instruction (set do_sti as false).
- */
const bool irq_disabled = irqs_disabled();
- const bool do_sti = false;
- if (__halt(irq_disabled, do_sti))
+ if (__halt(irq_disabled))
return -EIO;
return ve_instr_len(ve);
@@ -222,18 +273,12 @@
void __cpuidle tdx_safe_halt(void)
{
- /*
- * For do_sti=true case, __tdx_hypercall() function enables
- * interrupts using the STI instruction before the TDCALL. So
- * set irq_disabled as false.
- */
const bool irq_disabled = false;
- const bool do_sti = true;
/*
* Use WARN_ONCE() to report the failure.
*/
- if (__halt(irq_disabled, do_sti))
+ if (__halt(irq_disabled))
WARN_ONCE(1, "HLT instruction emulation failed\n");
}
@@ -250,7 +295,7 @@
* can be found in TDX Guest-Host-Communication Interface
* (GHCI), section titled "TDG.VP.VMCALL<Instruction.RDMSR>".
*/
- if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
+ if (__tdx_hypercall_ret(&args))
return -EIO;
regs->ax = lower_32_bits(args.r11);
@@ -272,7 +317,7 @@
* can be found in TDX Guest-Host-Communication Interface
* (GHCI) section titled "TDG.VP.VMCALL<Instruction.WRMSR>".
*/
- if (__tdx_hypercall(&args, 0))
+ if (__tdx_hypercall(&args))
return -EIO;
return ve_instr_len(ve);
@@ -304,7 +349,7 @@
* ABI can be found in TDX Guest-Host-Communication Interface
* (GHCI), section titled "VP.VMCALL<Instruction.CPUID>".
*/
- if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
+ if (__tdx_hypercall_ret(&args))
return -EIO;
/*
@@ -331,7 +376,7 @@
.r15 = *val,
};
- if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
+ if (__tdx_hypercall_ret(&args))
return false;
*val = args.r11;
return true;
@@ -347,8 +392,8 @@
{
unsigned long *reg, val, vaddr;
char buffer[MAX_INSN_SIZE];
+ enum insn_mmio_type mmio;
struct insn insn = {};
- enum mmio_type mmio;
int size, extend_size;
u8 extend_val = 0;
@@ -363,10 +408,10 @@
return -EINVAL;
mmio = insn_decode_mmio(&insn, &size);
- if (WARN_ON_ONCE(mmio == MMIO_DECODE_FAILED))
+ if (WARN_ON_ONCE(mmio == INSN_MMIO_DECODE_FAILED))
return -EINVAL;
- if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
+ if (mmio != INSN_MMIO_WRITE_IMM && mmio != INSN_MMIO_MOVS) {
reg = insn_get_modrm_reg_ptr(&insn, regs);
if (!reg)
return -EINVAL;
@@ -387,23 +432,23 @@
/* Handle writes first */
switch (mmio) {
- case MMIO_WRITE:
+ case INSN_MMIO_WRITE:
memcpy(&val, reg, size);
if (!mmio_write(size, ve->gpa, val))
return -EIO;
return insn.length;
- case MMIO_WRITE_IMM:
+ case INSN_MMIO_WRITE_IMM:
val = insn.immediate.value;
if (!mmio_write(size, ve->gpa, val))
return -EIO;
return insn.length;
- case MMIO_READ:
- case MMIO_READ_ZERO_EXTEND:
- case MMIO_READ_SIGN_EXTEND:
+ case INSN_MMIO_READ:
+ case INSN_MMIO_READ_ZERO_EXTEND:
+ case INSN_MMIO_READ_SIGN_EXTEND:
/* Reads are handled below */
break;
- case MMIO_MOVS:
- case MMIO_DECODE_FAILED:
+ case INSN_MMIO_MOVS:
+ case INSN_MMIO_DECODE_FAILED:
/*
* MMIO was accessed with an instruction that could not be
* decoded or handled properly. It was likely not using io.h
@@ -420,15 +465,15 @@
return -EIO;
switch (mmio) {
- case MMIO_READ:
+ case INSN_MMIO_READ:
/* Zero-extend for 32-bit operation */
extend_size = size == 4 ? sizeof(*reg) : 0;
break;
- case MMIO_READ_ZERO_EXTEND:
+ case INSN_MMIO_READ_ZERO_EXTEND:
/* Zero extend based on operand size */
extend_size = insn.opnd_bytes;
break;
- case MMIO_READ_SIGN_EXTEND:
+ case INSN_MMIO_READ_SIGN_EXTEND:
/* Sign extend based on operand size */
extend_size = insn.opnd_bytes;
if (size == 1 && val & BIT(7))
@@ -465,7 +510,7 @@
* in TDX Guest-Host-Communication Interface (GHCI) section titled
* "TDG.VP.VMCALL<Instruction.IO>".
*/
- success = !__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT);
+ success = !__tdx_hypercall_ret(&args);
/* Update part of the register affected by the emulated instruction */
regs->ax &= ~mask;
@@ -589,6 +634,11 @@
}
}
+static inline bool is_private_gpa(u64 gpa)
+{
+ return gpa == cc_mkenc(gpa);
+}
+
/*
* Handle the kernel #VE.
*
@@ -607,6 +657,8 @@
case EXIT_REASON_CPUID:
return handle_cpuid(regs, ve);
case EXIT_REASON_EPT_VIOLATION:
+ if (is_private_gpa(ve->gpa))
+ panic("Unexpected EPT-violation on private memory.");
return handle_mmio(regs, ve);
case EXIT_REASON_IO_INSTRUCTION:
return handle_io(regs, ve);
@@ -662,47 +714,6 @@
return true;
}
-static bool try_accept_one(phys_addr_t *start, unsigned long len,
- enum pg_level pg_level)
-{
- unsigned long accept_size = page_level_size(pg_level);
- u64 tdcall_rcx;
- u8 page_size;
-
- if (!IS_ALIGNED(*start, accept_size))
- return false;
-
- if (len < accept_size)
- return false;
-
- /*
- * Pass the page physical address to the TDX module to accept the
- * pending, private page.
- *
- * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
- */
- switch (pg_level) {
- case PG_LEVEL_4K:
- page_size = 0;
- break;
- case PG_LEVEL_2M:
- page_size = 1;
- break;
- case PG_LEVEL_1G:
- page_size = 2;
- break;
- default:
- return false;
- }
-
- tdcall_rcx = *start | page_size;
- if (__tdx_module_call(TDX_ACCEPT_PAGE, tdcall_rcx, 0, 0, 0, NULL))
- return false;
-
- *start += accept_size;
- return true;
-}
-
/*
* Inform the VMM of the guest's intent for this physical page: shared with
* the VMM or private to the guest. The VMM is expected to change its mapping
@@ -727,32 +738,9 @@
if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0))
return false;
- /* private->shared conversion requires only MapGPA call */
- if (!enc)
- return true;
-
- /*
- * For shared->private conversion, accept the page using
- * TDX_ACCEPT_PAGE TDX module call.
- */
- while (start < end) {
- unsigned long len = end - start;
-
- /*
- * Try larger accepts first. It gives chance to VMM to keep
- * 1G/2M SEPT entries where possible and speeds up process by
- * cutting number of hypercalls (if successful).
- */
-
- if (try_accept_one(&start, len, PG_LEVEL_1G))
- continue;
-
- if (try_accept_one(&start, len, PG_LEVEL_2M))
- continue;
-
- if (!try_accept_one(&start, len, PG_LEVEL_4K))
- return false;
- }
+ /* shared->private conversion requires memory to be accepted before use */
+ if (enc)
+ return tdx_accept_memory(start, end);
return true;
}
@@ -797,6 +785,9 @@
tdx_parse_tdinfo(&cc_mask);
cc_set_mask(cc_mask);
+ /* Kernel does not use NOTIFY_ENABLES and does not need random #VEs */
+ tdx_module_call(TDX_WR, 0, TDCS_NOTIFY_ENABLES, 0, -1ULL, NULL);
+
/*
* All bits above GPA width are reserved and kernel treats shared bit
* as flag, not as part of physical address.
diff --git a/arch/x86/configs/google/xfstest.config b/arch/x86/configs/google/xfstest.config
new file mode 100644
index 0000000..1b6faaa
--- /dev/null
+++ b/arch/x86/configs/google/xfstest.config
@@ -0,0 +1,24 @@
+#Configurations required to run xfs tests
+CONFIG_MODULE_SIG=n
+CONFIG_MODULE_SIG_ALL=n
+CONFIG_SECURITY_LOADPIN=n
+CONFIG_SECURITY_LOADPIN_ENFORCE=n
+CONFIG_SECURITY_YAMA=n
+CONFIG_SECURITY_LOCKDOWN_LSM=n
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=n
+CONFIG_LSM=""
+CONFIG_SYSTEM_TRUSTED_KEYRING=n
+CONFIG_SECONDARY_TRUSTED_KEYRING=n
+CONFIG_VFAT_FS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ASCII=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_NLS_UTF8=y
+CONFIG_FAT_DEFAULT_UTF8=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V3=y
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=y
+CONFIG_NFSD=y
+CONFIG_NFSD_V4=y
diff --git a/arch/x86/configs/lakitu_defconfig b/arch/x86/configs/lakitu_defconfig
new file mode 100644
index 0000000..826869f
--- /dev/null
+++ b/arch/x86/configs/lakitu_defconfig
@@ -0,0 +1,4469 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/x86_64 6.1.38 Kernel Configuration
+#
+CONFIG_CC_VERSION_TEXT="Chromium OS 15.0_pre458507_p20220602-r18 clang version 15.0.0 (/var/tmp/portage/sys-devel/llvm-15.0_pre458507_p20220602-r18/work/llvm-15.0_pre458507_p20220602/clang a58d0af058038595c93de961b725f86997cf8d4a)"
+CONFIG_GCC_VERSION=0
+CONFIG_CC_IS_CLANG=y
+CONFIG_CLANG_VERSION=150000
+CONFIG_AS_IS_LLVM=y
+CONFIG_AS_VERSION=150000
+CONFIG_LD_VERSION=0
+CONFIG_LD_IS_LLD=y
+CONFIG_LLD_VERSION=150000
+CONFIG_CC_CAN_LINK=y
+CONFIG_CC_CAN_LINK_STATIC=y
+CONFIG_CC_HAS_ASM_GOTO_OUTPUT=y
+CONFIG_CC_HAS_ASM_GOTO_TIED_OUTPUT=y
+CONFIG_TOOLS_SUPPORT_RELR=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
+CONFIG_PAHOLE_VERSION=121
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+# CONFIG_WERROR is not set
+CONFIG_LOCALVERSION=""
+# CONFIG_LOCALVERSION_AUTO is not set
+CONFIG_BUILD_SALT=""
+CONFIG_HAVE_KERNEL_GZIP=y
+CONFIG_HAVE_KERNEL_BZIP2=y
+CONFIG_HAVE_KERNEL_LZMA=y
+CONFIG_HAVE_KERNEL_XZ=y
+CONFIG_HAVE_KERNEL_LZO=y
+CONFIG_HAVE_KERNEL_LZ4=y
+CONFIG_HAVE_KERNEL_ZSTD=y
+CONFIG_KERNEL_GZIP=y
+# CONFIG_KERNEL_BZIP2 is not set
+# CONFIG_KERNEL_LZMA is not set
+# CONFIG_KERNEL_XZ is not set
+# CONFIG_KERNEL_LZO is not set
+# CONFIG_KERNEL_LZ4 is not set
+# CONFIG_KERNEL_ZSTD is not set
+CONFIG_DEFAULT_INIT=""
+CONFIG_DEFAULT_HOSTNAME="localhost"
+CONFIG_SYSVIPC=y
+CONFIG_SYSVIPC_SYSCTL=y
+CONFIG_SYSVIPC_COMPAT=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_POSIX_MQUEUE_SYSCTL=y
+# CONFIG_WATCH_QUEUE is not set
+CONFIG_CROSS_MEMORY_ATTACH=y
+CONFIG_USELIB=y
+CONFIG_AUDIT=y
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+CONFIG_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_PROBE=y
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
+CONFIG_GENERIC_PENDING_IRQ=y
+CONFIG_GENERIC_IRQ_MIGRATION=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_DOMAIN_HIERARCHY=y
+CONFIG_GENERIC_MSI_IRQ=y
+CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
+CONFIG_IRQ_MSI_IOMMU=y
+CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
+CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# CONFIG_GENERIC_IRQ_DEBUGFS is not set
+# end of IRQ subsystem
+
+CONFIG_CLOCKSOURCE_WATCHDOG=y
+CONFIG_ARCH_CLOCKSOURCE_INIT=y
+CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
+CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
+CONFIG_GENERIC_CMOS_UPDATE=y
+CONFIG_HAVE_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y
+CONFIG_CONTEXT_TRACKING=y
+CONFIG_CONTEXT_TRACKING_IDLE=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_NO_HZ_COMMON=y
+# CONFIG_HZ_PERIODIC is not set
+CONFIG_NO_HZ_IDLE=y
+# CONFIG_NO_HZ_FULL is not set
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US=100
+# end of Timers subsystem
+
+CONFIG_BPF=y
+CONFIG_HAVE_EBPF_JIT=y
+CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y
+
+#
+# BPF subsystem
+#
+CONFIG_BPF_SYSCALL=y
+CONFIG_BPF_JIT=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_JIT_DEFAULT_ON=y
+# CONFIG_BPF_UNPRIV_DEFAULT_OFF is not set
+# CONFIG_BPF_PRELOAD is not set
+CONFIG_BPF_LSM=y
+# end of BPF subsystem
+
+CONFIG_PREEMPT_BUILD=y
+CONFIG_PREEMPT_NONE=y
+# CONFIG_PREEMPT_VOLUNTARY is not set
+# CONFIG_PREEMPT is not set
+CONFIG_PREEMPT_COUNT=y
+CONFIG_PREEMPTION=y
+CONFIG_PREEMPT_DYNAMIC=y
+CONFIG_SCHED_CORE=y
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+CONFIG_HAVE_SCHED_AVG_IRQ=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_TASKSTATS=y
+CONFIG_TASK_DELAY_ACCT=y
+CONFIG_TASK_XACCT=y
+CONFIG_TASK_IO_ACCOUNTING=y
+CONFIG_PSI=y
+# CONFIG_PSI_DEFAULT_DISABLED is not set
+# end of CPU/Task time and stats accounting
+
+CONFIG_CPU_ISOLATION=y
+
+#
+# RCU Subsystem
+#
+CONFIG_TREE_RCU=y
+CONFIG_PREEMPT_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TREE_SRCU=y
+CONFIG_TASKS_RCU_GENERIC=y
+CONFIG_TASKS_RCU=y
+CONFIG_TASKS_RUDE_RCU=y
+CONFIG_TASKS_TRACE_RCU=y
+CONFIG_RCU_STALL_COMMON=y
+CONFIG_RCU_NEED_SEGCBLIST=y
+# end of RCU Subsystem
+
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_IKHEADERS=m
+CONFIG_LOG_BUF_SHIFT=18
+CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
+# CONFIG_PRINTK_INDEX is not set
+CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
+
+#
+# Scheduler features
+#
+# CONFIG_UCLAMP_TASK is not set
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
+CONFIG_CC_HAS_INT128=y
+CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough"
+CONFIG_GCC11_NO_ARRAY_BOUNDS=y
+CONFIG_ARCH_SUPPORTS_INT128=y
+# CONFIG_NUMA_BALANCING is not set
+CONFIG_CGROUPS=y
+CONFIG_PAGE_COUNTER=y
+# CONFIG_CGROUP_FAVOR_DYNMODS is not set
+CONFIG_MEMCG=y
+CONFIG_MEMCG_KMEM=y
+CONFIG_BLK_CGROUP=y
+CONFIG_CGROUP_WRITEBACK=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_FAIR_GROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_RDMA=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CPUSETS=y
+CONFIG_PROC_PID_CPUSET=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_BPF=y
+# CONFIG_CGROUP_MISC is not set
+# CONFIG_CGROUP_DEBUG is not set
+CONFIG_SOCK_CGROUP_DATA=y
+CONFIG_NAMESPACES=y
+CONFIG_UTS_NS=y
+CONFIG_TIME_NS=y
+CONFIG_IPC_NS=y
+CONFIG_USER_NS=y
+CONFIG_PID_NS=y
+CONFIG_NET_NS=y
+CONFIG_CHECKPOINT_RESTORE=y
+# CONFIG_SCHED_AUTOGROUP is not set
+# CONFIG_SYSFS_DEPRECATED is not set
+CONFIG_RELAY=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_INITRAMFS_SOURCE=""
+CONFIG_RD_GZIP=y
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
+CONFIG_RD_XZ=y
+# CONFIG_RD_LZO is not set
+CONFIG_RD_LZ4=y
+CONFIG_RD_ZSTD=y
+# CONFIG_BOOT_CONFIG is not set
+CONFIG_INITRAMFS_PRESERVE_MTIME=y
+CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
+# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
+CONFIG_LD_ORPHAN_WARN=y
+CONFIG_SYSCTL=y
+CONFIG_HAVE_UID16=y
+CONFIG_SYSCTL_EXCEPTION_TRACE=y
+CONFIG_HAVE_PCSPKR_PLATFORM=y
+CONFIG_EXPERT=y
+CONFIG_UID16=y
+CONFIG_MULTIUSER=y
+CONFIG_SGETMASK_SYSCALL=y
+CONFIG_SYSFS_SYSCALL=y
+CONFIG_FHANDLE=y
+CONFIG_POSIX_TIMERS=y
+CONFIG_PRINTK=y
+CONFIG_BUG=y
+CONFIG_ELF_CORE=y
+# CONFIG_PCSPKR_PLATFORM is not set
+CONFIG_BASE_FULL=y
+CONFIG_FUTEX=y
+CONFIG_FUTEX_PI=y
+CONFIG_EPOLL=y
+CONFIG_SIGNALFD=y
+CONFIG_TIMERFD=y
+CONFIG_EVENTFD=y
+CONFIG_SHMEM=y
+CONFIG_AIO=y
+CONFIG_IO_URING=y
+CONFIG_ADVISE_SYSCALLS=y
+CONFIG_MEMBARRIER=y
+CONFIG_KALLSYMS=y
+# CONFIG_KALLSYMS_ALL is not set
+CONFIG_KALLSYMS_ABSOLUTE_PERCPU=y
+CONFIG_KALLSYMS_BASE_RELATIVE=y
+CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
+CONFIG_KCMP=y
+CONFIG_RSEQ=y
+# CONFIG_DEBUG_RSEQ is not set
+CONFIG_EMBEDDED=y
+CONFIG_HAVE_PERF_EVENTS=y
+CONFIG_GUEST_PERF_EVENTS=y
+# CONFIG_PC104 is not set
+
+#
+# Kernel Performance Events And Counters
+#
+CONFIG_PERF_EVENTS=y
+# CONFIG_DEBUG_PERF_USE_VMALLOC is not set
+# end of Kernel Performance Events And Counters
+
+CONFIG_SYSTEM_DATA_VERIFICATION=y
+CONFIG_PROFILING=y
+CONFIG_TRACEPOINTS=y
+# end of General setup
+
+CONFIG_64BIT=y
+CONFIG_X86_64=y
+CONFIG_X86=y
+CONFIG_INSTRUCTION_DECODER=y
+CONFIG_OUTPUT_FORMAT="elf64-x86-64"
+CONFIG_LOCKDEP_SUPPORT=y
+CONFIG_STACKTRACE_SUPPORT=y
+CONFIG_MMU=y
+CONFIG_ARCH_MMAP_RND_BITS_MIN=28
+CONFIG_ARCH_MMAP_RND_BITS_MAX=32
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
+CONFIG_GENERIC_ISA_DMA=y
+CONFIG_GENERIC_BUG=y
+CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
+CONFIG_ARCH_MAY_HAVE_PC_FDC=y
+CONFIG_GENERIC_CALIBRATE_DELAY=y
+CONFIG_ARCH_HAS_CPU_RELAX=y
+CONFIG_ARCH_HIBERNATION_POSSIBLE=y
+CONFIG_ARCH_NR_GPIO=1024
+CONFIG_ARCH_SUSPEND_POSSIBLE=y
+CONFIG_AUDIT_ARCH=y
+CONFIG_HAVE_INTEL_TXT=y
+CONFIG_X86_64_SMP=y
+CONFIG_ARCH_SUPPORTS_UPROBES=y
+CONFIG_FIX_EARLYCON_MEM=y
+CONFIG_DYNAMIC_PHYSICAL_MASK=y
+CONFIG_PGTABLE_LEVELS=4
+CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
+
+#
+# Processor type and features
+#
+CONFIG_SMP=y
+CONFIG_X86_FEATURE_NAMES=y
+CONFIG_X86_X2APIC=y
+CONFIG_X86_MPPARSE=y
+# CONFIG_GOLDFISH is not set
+# CONFIG_X86_CPU_RESCTRL is not set
+# CONFIG_X86_EXTENDED_PLATFORM is not set
+# CONFIG_X86_INTEL_LPSS is not set
+# CONFIG_X86_AMD_PLATFORM_DEVICE is not set
+# CONFIG_IOSF_MBI is not set
+CONFIG_X86_SUPPORTS_MEMORY_FAILURE=y
+CONFIG_SCHED_OMIT_FRAME_POINTER=y
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_PARAVIRT=y
+CONFIG_PARAVIRT_XXL=y
+# CONFIG_PARAVIRT_DEBUG is not set
+CONFIG_PARAVIRT_SPINLOCKS=y
+CONFIG_X86_HV_CALLBACK_VECTOR=y
+CONFIG_XEN=y
+CONFIG_XEN_PV=y
+CONFIG_XEN_512GB=y
+CONFIG_XEN_PV_SMP=y
+CONFIG_XEN_PV_DOM0=y
+CONFIG_XEN_PVHVM=y
+CONFIG_XEN_PVHVM_SMP=y
+CONFIG_XEN_PVHVM_GUEST=y
+CONFIG_XEN_SAVE_RESTORE=y
+# CONFIG_XEN_DEBUG_FS is not set
+CONFIG_XEN_PVH=y
+CONFIG_XEN_DOM0=y
+CONFIG_XEN_PV_MSR_SAFE=y
+CONFIG_KVM_GUEST=y
+CONFIG_ARCH_CPUIDLE_HALTPOLL=y
+CONFIG_PVH=y
+CONFIG_PARAVIRT_TIME_ACCOUNTING=y
+CONFIG_PARAVIRT_CLOCK=y
+# CONFIG_JAILHOUSE_GUEST is not set
+# CONFIG_ACRN_GUEST is not set
+CONFIG_INTEL_TDX_GUEST=y
+# CONFIG_MK8 is not set
+# CONFIG_MPSC is not set
+# CONFIG_MCORE2 is not set
+# CONFIG_MATOM is not set
+CONFIG_GENERIC_CPU=y
+CONFIG_X86_INTERNODE_CACHE_SHIFT=6
+CONFIG_X86_L1_CACHE_SHIFT=6
+CONFIG_X86_TSC=y
+CONFIG_X86_CMPXCHG64=y
+CONFIG_X86_CMOV=y
+CONFIG_X86_MINIMUM_CPU_FAMILY=64
+CONFIG_X86_DEBUGCTLMSR=y
+CONFIG_IA32_FEAT_CTL=y
+CONFIG_X86_VMX_FEATURE_NAMES=y
+# CONFIG_PROCESSOR_SELECT is not set
+CONFIG_CPU_SUP_INTEL=y
+CONFIG_CPU_SUP_AMD=y
+CONFIG_CPU_SUP_HYGON=y
+CONFIG_CPU_SUP_CENTAUR=y
+CONFIG_CPU_SUP_ZHAOXIN=y
+CONFIG_HPET_TIMER=y
+CONFIG_HPET_EMULATE_RTC=y
+CONFIG_DMI=y
+CONFIG_GART_IOMMU=y
+# CONFIG_MAXSMP is not set
+CONFIG_NR_CPUS_RANGE_BEGIN=2
+CONFIG_NR_CPUS_RANGE_END=512
+CONFIG_NR_CPUS_DEFAULT=64
+CONFIG_NR_CPUS=512
+CONFIG_SCHED_CLUSTER=y
+CONFIG_SCHED_SMT=y
+CONFIG_SCHED_MC=y
+CONFIG_SCHED_MC_PRIO=y
+CONFIG_X86_LOCAL_APIC=y
+CONFIG_X86_IO_APIC=y
+CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS=y
+CONFIG_X86_MCE=y
+# CONFIG_X86_MCELOG_LEGACY is not set
+CONFIG_X86_MCE_INTEL=y
+CONFIG_X86_MCE_AMD=y
+CONFIG_X86_MCE_THRESHOLD=y
+# CONFIG_X86_MCE_INJECT is not set
+
+#
+# Performance monitoring
+#
+CONFIG_PERF_EVENTS_INTEL_UNCORE=y
+CONFIG_PERF_EVENTS_INTEL_RAPL=y
+CONFIG_PERF_EVENTS_INTEL_CSTATE=y
+# CONFIG_PERF_EVENTS_AMD_POWER is not set
+CONFIG_PERF_EVENTS_AMD_UNCORE=y
+# CONFIG_PERF_EVENTS_AMD_BRS is not set
+# end of Performance monitoring
+
+CONFIG_X86_16BIT=y
+CONFIG_X86_ESPFIX64=y
+CONFIG_X86_VSYSCALL_EMULATION=y
+CONFIG_X86_IOPL_IOPERM=y
+# CONFIG_MICROCODE is not set
+CONFIG_X86_MSR=y
+CONFIG_X86_CPUID=y
+# CONFIG_X86_5LEVEL is not set
+CONFIG_X86_DIRECT_GBPAGES=y
+# CONFIG_X86_CPA_STATISTICS is not set
+CONFIG_X86_MEM_ENCRYPT=y
+CONFIG_AMD_MEM_ENCRYPT=y
+CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y
+CONFIG_NUMA=y
+# CONFIG_AMD_NUMA is not set
+CONFIG_X86_64_ACPI_NUMA=y
+CONFIG_NUMA_EMU=y
+CONFIG_NODES_SHIFT=6
+CONFIG_ARCH_SPARSEMEM_ENABLE=y
+CONFIG_ARCH_SPARSEMEM_DEFAULT=y
+CONFIG_ARCH_SELECT_MEMORY_MODEL=y
+# CONFIG_ARCH_MEMORY_PROBE is not set
+CONFIG_ARCH_PROC_KCORE_TEXT=y
+CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
+# CONFIG_X86_PMEM_LEGACY is not set
+CONFIG_X86_CHECK_BIOS_CORRUPTION=y
+CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
+CONFIG_MTRR=y
+# CONFIG_MTRR_SANITIZER is not set
+CONFIG_X86_PAT=y
+CONFIG_ARCH_USES_PG_UNCACHED=y
+CONFIG_X86_UMIP=y
+CONFIG_CC_HAS_IBT=y
+# CONFIG_X86_KERNEL_IBT is not set
+CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
+CONFIG_X86_INTEL_TSX_MODE_OFF=y
+# CONFIG_X86_INTEL_TSX_MODE_ON is not set
+# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
+# CONFIG_X86_SGX is not set
+CONFIG_EFI=y
+CONFIG_EFI_STUB=y
+# CONFIG_EFI_MIXED is not set
+# CONFIG_HZ_100 is not set
+# CONFIG_HZ_250 is not set
+# CONFIG_HZ_300 is not set
+CONFIG_HZ_1000=y
+CONFIG_HZ=1000
+CONFIG_SCHED_HRTICK=y
+# CONFIG_KEXEC is not set
+CONFIG_KEXEC_FILE=y
+CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
+# CONFIG_KEXEC_SIG is not set
+# CONFIG_CRASH_DUMP is not set
+CONFIG_PHYSICAL_START=0x1000000
+CONFIG_RELOCATABLE=y
+CONFIG_RANDOMIZE_BASE=y
+CONFIG_X86_NEED_RELOCS=y
+CONFIG_PHYSICAL_ALIGN=0x1000000
+CONFIG_DYNAMIC_MEMORY_LAYOUT=y
+CONFIG_RANDOMIZE_MEMORY=y
+CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0xa
+CONFIG_HOTPLUG_CPU=y
+# CONFIG_BOOTPARAM_HOTPLUG_CPU0 is not set
+# CONFIG_DEBUG_HOTPLUG_CPU0 is not set
+# CONFIG_COMPAT_VDSO is not set
+CONFIG_LEGACY_VSYSCALL_XONLY=y
+# CONFIG_LEGACY_VSYSCALL_NONE is not set
+# CONFIG_CMDLINE_BOOL is not set
+CONFIG_MODIFY_LDT_SYSCALL=y
+# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
+CONFIG_HAVE_LIVEPATCH=y
+# end of Processor type and features
+
+CONFIG_CC_HAS_SLS=y
+CONFIG_CC_HAS_RETURN_THUNK=y
+CONFIG_SPECULATION_MITIGATIONS=y
+CONFIG_PAGE_TABLE_ISOLATION=y
+CONFIG_RETPOLINE=y
+CONFIG_RETHUNK=y
+CONFIG_CPU_UNRET_ENTRY=y
+CONFIG_CPU_IBPB_ENTRY=y
+CONFIG_CPU_IBRS_ENTRY=y
+# CONFIG_SLS is not set
+CONFIG_ARCH_HAS_ADD_PAGES=y
+CONFIG_ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE=y
+
+#
+# Power management and ACPI options
+#
+CONFIG_SUSPEND=y
+CONFIG_SUSPEND_FREEZER=y
+# CONFIG_SUSPEND_SKIP_SYNC is not set
+CONFIG_HIBERNATE_CALLBACKS=y
+# CONFIG_HIBERNATION is not set
+CONFIG_PM_SLEEP=y
+CONFIG_PM_SLEEP_SMP=y
+# CONFIG_PM_AUTOSLEEP is not set
+# CONFIG_PM_USERSPACE_AUTOSLEEP is not set
+# CONFIG_PM_WAKELOCKS is not set
+CONFIG_PM=y
+CONFIG_PM_DEBUG=y
+# CONFIG_PM_ADVANCED_DEBUG is not set
+# CONFIG_PM_TEST_SUSPEND is not set
+CONFIG_PM_SLEEP_DEBUG=y
+# CONFIG_DPM_WATCHDOG is not set
+CONFIG_PM_TRACE=y
+CONFIG_PM_TRACE_RTC=y
+CONFIG_PM_CLK=y
+# CONFIG_WQ_POWER_EFFICIENT_DEFAULT is not set
+# CONFIG_ENERGY_MODEL is not set
+CONFIG_ARCH_SUPPORTS_ACPI=y
+CONFIG_ACPI=y
+CONFIG_ACPI_LEGACY_TABLES_LOOKUP=y
+CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC=y
+CONFIG_ACPI_SYSTEM_POWER_STATES_SUPPORT=y
+# CONFIG_ACPI_DEBUGGER is not set
+CONFIG_ACPI_SPCR_TABLE=y
+# CONFIG_ACPI_FPDT is not set
+CONFIG_ACPI_LPIT=y
+CONFIG_ACPI_SLEEP=y
+CONFIG_ACPI_REV_OVERRIDE_POSSIBLE=y
+# CONFIG_ACPI_EC_DEBUGFS is not set
+# CONFIG_ACPI_AC is not set
+# CONFIG_ACPI_BATTERY is not set
+CONFIG_ACPI_BUTTON=y
+# CONFIG_ACPI_FAN is not set
+# CONFIG_ACPI_TAD is not set
+# CONFIG_ACPI_DOCK is not set
+CONFIG_ACPI_CPU_FREQ_PSS=y
+CONFIG_ACPI_PROCESSOR_CSTATE=y
+CONFIG_ACPI_PROCESSOR_IDLE=y
+CONFIG_ACPI_CPPC_LIB=y
+CONFIG_ACPI_PROCESSOR=y
+CONFIG_ACPI_HOTPLUG_CPU=y
+# CONFIG_ACPI_PROCESSOR_AGGREGATOR is not set
+CONFIG_ACPI_THERMAL=y
+CONFIG_ARCH_HAS_ACPI_TABLE_UPGRADE=y
+# CONFIG_ACPI_TABLE_UPGRADE is not set
+# CONFIG_ACPI_DEBUG is not set
+# CONFIG_ACPI_PCI_SLOT is not set
+CONFIG_ACPI_CONTAINER=y
+# CONFIG_ACPI_HOTPLUG_MEMORY is not set
+CONFIG_ACPI_HOTPLUG_IOAPIC=y
+# CONFIG_ACPI_SBS is not set
+# CONFIG_ACPI_HED is not set
+# CONFIG_ACPI_CUSTOM_METHOD is not set
+# CONFIG_ACPI_BGRT is not set
+# CONFIG_ACPI_REDUCED_HARDWARE_ONLY is not set
+# CONFIG_ACPI_NFIT is not set
+CONFIG_ACPI_NUMA=y
+# CONFIG_ACPI_HMAT is not set
+CONFIG_HAVE_ACPI_APEI=y
+CONFIG_HAVE_ACPI_APEI_NMI=y
+# CONFIG_ACPI_APEI is not set
+# CONFIG_ACPI_DPTF is not set
+# CONFIG_ACPI_CONFIGFS is not set
+# CONFIG_ACPI_PFRUT is not set
+CONFIG_ACPI_PCC=y
+# CONFIG_PMIC_OPREGION is not set
+CONFIG_ACPI_PRMT=y
+# CONFIG_X86_PM_TIMER is not set
+
+#
+# CPU Frequency scaling
+#
+CONFIG_CPU_FREQ=y
+CONFIG_CPU_FREQ_GOV_ATTR_SET=y
+CONFIG_CPU_FREQ_STAT=y
+CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
+CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
+# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
+# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
+# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
+# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set
+CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
+
+#
+# CPU frequency scaling drivers
+#
+CONFIG_X86_INTEL_PSTATE=y
+# CONFIG_X86_PCC_CPUFREQ is not set
+# CONFIG_X86_AMD_PSTATE is not set
+# CONFIG_X86_AMD_PSTATE_UT is not set
+# CONFIG_X86_ACPI_CPUFREQ is not set
+# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
+# CONFIG_X86_P4_CLOCKMOD is not set
+
+#
+# shared options
+#
+# end of CPU Frequency scaling
+
+#
+# CPU Idle
+#
+CONFIG_CPU_IDLE=y
+CONFIG_CPU_IDLE_GOV_LADDER=y
+CONFIG_CPU_IDLE_GOV_MENU=y
+# CONFIG_CPU_IDLE_GOV_TEO is not set
+# CONFIG_CPU_IDLE_GOV_HALTPOLL is not set
+CONFIG_HALTPOLL_CPUIDLE=y
+# end of CPU Idle
+
+CONFIG_INTEL_IDLE=y
+# end of Power management and ACPI options
+
+#
+# Bus options (PCI etc.)
+#
+CONFIG_PCI_DIRECT=y
+CONFIG_PCI_MMCONFIG=y
+CONFIG_PCI_XEN=y
+CONFIG_MMCONF_FAM10H=y
+# CONFIG_PCI_CNB20LE_QUIRK is not set
+# CONFIG_ISA_BUS is not set
+CONFIG_ISA_DMA_API=y
+CONFIG_AMD_NB=y
+# end of Bus options (PCI etc.)
+
+#
+# Binary Emulations
+#
+CONFIG_IA32_EMULATION=y
+CONFIG_COMPAT_32=y
+CONFIG_COMPAT=y
+CONFIG_COMPAT_FOR_U64_ALIGNMENT=y
+# end of Binary Emulations
+
+CONFIG_HAVE_KVM=y
+CONFIG_HAVE_KVM_PFNCACHE=y
+CONFIG_HAVE_KVM_IRQCHIP=y
+CONFIG_HAVE_KVM_IRQFD=y
+CONFIG_HAVE_KVM_IRQ_ROUTING=y
+CONFIG_HAVE_KVM_DIRTY_RING=y
+CONFIG_HAVE_KVM_DIRTY_RING_TSO=y
+CONFIG_HAVE_KVM_DIRTY_RING_ACQ_REL=y
+CONFIG_HAVE_KVM_EVENTFD=y
+CONFIG_KVM_MMIO=y
+CONFIG_KVM_ASYNC_PF=y
+CONFIG_HAVE_KVM_MSI=y
+CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
+CONFIG_KVM_VFIO=y
+CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT=y
+CONFIG_KVM_COMPAT=y
+CONFIG_HAVE_KVM_IRQ_BYPASS=y
+CONFIG_HAVE_KVM_NO_POLL=y
+CONFIG_KVM_XFER_TO_GUEST_WORK=y
+CONFIG_HAVE_KVM_PM_NOTIFIER=y
+CONFIG_VIRTUALIZATION=y
+CONFIG_KVM=m
+CONFIG_KVM_WERROR=y
+CONFIG_KVM_INTEL=m
+CONFIG_KVM_AMD=m
+# CONFIG_KVM_XEN is not set
+CONFIG_AS_AVX512=y
+CONFIG_AS_SHA1_NI=y
+CONFIG_AS_SHA256_NI=y
+CONFIG_AS_TPAUSE=y
+
+#
+# General architecture-dependent options
+#
+CONFIG_CRASH_CORE=y
+CONFIG_KEXEC_CORE=y
+CONFIG_HAVE_IMA_KEXEC=y
+CONFIG_HOTPLUG_SMT=y
+CONFIG_GENERIC_ENTRY=y
+CONFIG_KPROBES=y
+# CONFIG_JUMP_LABEL is not set
+# CONFIG_STATIC_CALL_SELFTEST is not set
+CONFIG_OPTPROBES=y
+CONFIG_KPROBES_ON_FTRACE=y
+CONFIG_UPROBES=y
+CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
+CONFIG_ARCH_USE_BUILTIN_BSWAP=y
+CONFIG_KRETPROBES=y
+CONFIG_KRETPROBE_ON_RETHOOK=y
+CONFIG_USER_RETURN_NOTIFIER=y
+CONFIG_HAVE_IOREMAP_PROT=y
+CONFIG_HAVE_KPROBES=y
+CONFIG_HAVE_KRETPROBES=y
+CONFIG_HAVE_OPTPROBES=y
+CONFIG_HAVE_KPROBES_ON_FTRACE=y
+CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE=y
+CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
+CONFIG_HAVE_NMI=y
+CONFIG_TRACE_IRQFLAGS_SUPPORT=y
+CONFIG_TRACE_IRQFLAGS_NMI_SUPPORT=y
+CONFIG_HAVE_ARCH_TRACEHOOK=y
+CONFIG_HAVE_DMA_CONTIGUOUS=y
+CONFIG_GENERIC_SMP_IDLE_THREAD=y
+CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
+CONFIG_ARCH_HAS_SET_MEMORY=y
+CONFIG_ARCH_HAS_SET_DIRECT_MAP=y
+CONFIG_HAVE_ARCH_THREAD_STRUCT_WHITELIST=y
+CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT=y
+CONFIG_ARCH_WANTS_NO_INSTR=y
+CONFIG_HAVE_ASM_MODVERSIONS=y
+CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
+CONFIG_HAVE_RSEQ=y
+CONFIG_HAVE_RUST=y
+CONFIG_HAVE_FUNCTION_ARG_ACCESS_API=y
+CONFIG_HAVE_HW_BREAKPOINT=y
+CONFIG_HAVE_MIXED_BREAKPOINTS_REGS=y
+CONFIG_HAVE_USER_RETURN_NOTIFIER=y
+CONFIG_HAVE_PERF_EVENTS_NMI=y
+CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
+CONFIG_HAVE_PERF_REGS=y
+CONFIG_HAVE_PERF_USER_STACK_DUMP=y
+CONFIG_HAVE_ARCH_JUMP_LABEL=y
+CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
+CONFIG_MMU_GATHER_TABLE_FREE=y
+CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
+CONFIG_MMU_GATHER_MERGE_VMAS=y
+CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
+CONFIG_HAVE_ALIGNED_STRUCT_PAGE=y
+CONFIG_HAVE_CMPXCHG_LOCAL=y
+CONFIG_HAVE_CMPXCHG_DOUBLE=y
+CONFIG_ARCH_WANT_COMPAT_IPC_PARSE_VERSION=y
+CONFIG_ARCH_WANT_OLD_COMPAT_IPC=y
+CONFIG_HAVE_ARCH_SECCOMP=y
+CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
+CONFIG_SECCOMP=y
+CONFIG_SECCOMP_FILTER=y
+# CONFIG_SECCOMP_CACHE_DEBUG is not set
+CONFIG_HAVE_ARCH_STACKLEAK=y
+CONFIG_HAVE_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR=y
+CONFIG_STACKPROTECTOR_STRONG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG=y
+CONFIG_ARCH_SUPPORTS_LTO_CLANG_THIN=y
+CONFIG_LTO_NONE=y
+CONFIG_ARCH_SUPPORTS_CFI_CLANG=y
+CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES=y
+CONFIG_HAVE_CONTEXT_TRACKING_USER=y
+CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK=y
+CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
+CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
+CONFIG_HAVE_MOVE_PUD=y
+CONFIG_HAVE_MOVE_PMD=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
+CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
+CONFIG_HAVE_ARCH_HUGE_VMAP=y
+CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
+CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
+CONFIG_HAVE_ARCH_SOFT_DIRTY=y
+CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
+CONFIG_MODULES_USE_ELF_RELA=y
+CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
+CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
+CONFIG_SOFTIRQ_ON_OWN_STACK=y
+CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
+CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
+CONFIG_HAVE_EXIT_THREAD=y
+CONFIG_ARCH_MMAP_RND_BITS=31
+CONFIG_HAVE_ARCH_MMAP_RND_COMPAT_BITS=y
+CONFIG_ARCH_MMAP_RND_COMPAT_BITS=8
+CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES=y
+CONFIG_PAGE_SIZE_LESS_THAN_64KB=y
+CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
+CONFIG_HAVE_OBJTOOL=y
+CONFIG_HAVE_JUMP_LABEL_HACK=y
+CONFIG_HAVE_NOINSTR_HACK=y
+CONFIG_HAVE_NOINSTR_VALIDATION=y
+CONFIG_HAVE_UACCESS_VALIDATION=y
+CONFIG_HAVE_STACK_VALIDATION=y
+CONFIG_HAVE_RELIABLE_STACKTRACE=y
+CONFIG_OLD_SIGSUSPEND3=y
+CONFIG_COMPAT_OLD_SIGACTION=y
+CONFIG_COMPAT_32BIT_TIME=y
+CONFIG_HAVE_ARCH_VMAP_STACK=y
+CONFIG_VMAP_STACK=y
+CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET=y
+CONFIG_RANDOMIZE_KSTACK_OFFSET=y
+CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
+CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
+CONFIG_STRICT_KERNEL_RWX=y
+CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
+CONFIG_STRICT_MODULE_RWX=y
+CONFIG_HAVE_ARCH_PREL32_RELOCATIONS=y
+CONFIG_ARCH_USE_MEMREMAP_PROT=y
+# CONFIG_LOCK_EVENT_COUNTS is not set
+CONFIG_ARCH_HAS_MEM_ENCRYPT=y
+CONFIG_ARCH_HAS_CC_PLATFORM=y
+CONFIG_HAVE_STATIC_CALL=y
+CONFIG_HAVE_STATIC_CALL_INLINE=y
+CONFIG_HAVE_PREEMPT_DYNAMIC=y
+CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
+CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
+CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
+CONFIG_ARCH_SUPPORTS_PAGE_TABLE_CHECK=y
+CONFIG_ARCH_HAS_ELFCORE_COMPAT=y
+CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH=y
+CONFIG_DYNAMIC_SIGFRAME=y
+CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG=y
+
+#
+# GCOV-based kernel profiling
+#
+# CONFIG_GCOV_KERNEL is not set
+CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
+# end of GCOV-based kernel profiling
+
+CONFIG_HAVE_GCC_PLUGINS=y
+CONFIG_FUNCTION_ALIGNMENT_4B=y
+CONFIG_FUNCTION_ALIGNMENT_16B=y
+CONFIG_FUNCTION_ALIGNMENT=16
+# end of General architecture-dependent options
+
+CONFIG_RT_MUTEXES=y
+CONFIG_BASE_SMALL=0
+CONFIG_MODULE_SIG_FORMAT=y
+CONFIG_MODULES=y
+# CONFIG_MODULE_FORCE_LOAD is not set
+CONFIG_MODULE_UNLOAD=y
+CONFIG_MODULE_FORCE_UNLOAD=y
+# CONFIG_MODULE_UNLOAD_TAINT_TRACKING is not set
+# CONFIG_MODVERSIONS is not set
+# CONFIG_MODULE_SRCVERSION_ALL is not set
+CONFIG_MODULE_SIG=y
+# CONFIG_MODULE_SIG_FORCE is not set
+CONFIG_MODULE_SIG_ALL=y
+# CONFIG_MODULE_SIG_SHA1 is not set
+# CONFIG_MODULE_SIG_SHA224 is not set
+CONFIG_MODULE_SIG_SHA256=y
+# CONFIG_MODULE_SIG_SHA384 is not set
+# CONFIG_MODULE_SIG_SHA512 is not set
+CONFIG_MODULE_SIG_HASH="sha256"
+CONFIG_MODULE_COMPRESS_NONE=y
+# CONFIG_MODULE_COMPRESS_GZIP is not set
+# CONFIG_MODULE_COMPRESS_XZ is not set
+# CONFIG_MODULE_COMPRESS_ZSTD is not set
+# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
+CONFIG_MODPROBE_PATH="/sbin/modprobe"
+# CONFIG_TRIM_UNUSED_KSYMS is not set
+CONFIG_MODULES_TREE_LOOKUP=y
+CONFIG_BLOCK=y
+CONFIG_BLOCK_LEGACY_AUTOLOAD=y
+CONFIG_BLK_CGROUP_RWSTAT=y
+CONFIG_BLK_DEV_BSG_COMMON=y
+CONFIG_BLK_ICQ=y
+CONFIG_BLK_DEV_BSGLIB=y
+CONFIG_BLK_DEV_INTEGRITY=y
+CONFIG_BLK_DEV_INTEGRITY_T10=y
+# CONFIG_BLK_DEV_ZONED is not set
+CONFIG_BLK_DEV_THROTTLING=y
+# CONFIG_BLK_DEV_THROTTLING_LOW is not set
+CONFIG_BLK_WBT=y
+CONFIG_BLK_WBT_MQ=y
+# CONFIG_BLK_CGROUP_IOLATENCY is not set
+# CONFIG_BLK_CGROUP_IOCOST is not set
+# CONFIG_BLK_CGROUP_IOPRIO is not set
+# CONFIG_BLK_DEBUG_FS is not set
+# CONFIG_BLK_SED_OPAL is not set
+# CONFIG_BLK_INLINE_ENCRYPTION is not set
+
+#
+# Partition Types
+#
+CONFIG_PARTITION_ADVANCED=y
+# CONFIG_ACORN_PARTITION is not set
+# CONFIG_AIX_PARTITION is not set
+# CONFIG_OSF_PARTITION is not set
+# CONFIG_AMIGA_PARTITION is not set
+# CONFIG_ATARI_PARTITION is not set
+# CONFIG_MAC_PARTITION is not set
+CONFIG_MSDOS_PARTITION=y
+# CONFIG_BSD_DISKLABEL is not set
+# CONFIG_MINIX_SUBPARTITION is not set
+# CONFIG_SOLARIS_X86_PARTITION is not set
+# CONFIG_UNIXWARE_DISKLABEL is not set
+# CONFIG_LDM_PARTITION is not set
+# CONFIG_SGI_PARTITION is not set
+# CONFIG_ULTRIX_PARTITION is not set
+# CONFIG_SUN_PARTITION is not set
+# CONFIG_KARMA_PARTITION is not set
+CONFIG_EFI_PARTITION=y
+# CONFIG_SYSV68_PARTITION is not set
+# CONFIG_CMDLINE_PARTITION is not set
+# end of Partition Types
+
+CONFIG_BLOCK_COMPAT=y
+CONFIG_BLK_MQ_PCI=y
+CONFIG_BLK_MQ_VIRTIO=y
+CONFIG_BLK_PM=y
+CONFIG_BLOCK_HOLDER_DEPRECATED=y
+CONFIG_BLK_MQ_STACKING=y
+
+#
+# IO Schedulers
+#
+CONFIG_MQ_IOSCHED_DEADLINE=y
+CONFIG_MQ_IOSCHED_KYBER=m
+CONFIG_IOSCHED_BFQ=m
+CONFIG_BFQ_GROUP_IOSCHED=y
+# CONFIG_BFQ_CGROUP_DEBUG is not set
+# end of IO Schedulers
+
+CONFIG_PREEMPT_NOTIFIERS=y
+CONFIG_ASN1=y
+CONFIG_UNINLINE_SPIN_UNLOCK=y
+CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
+CONFIG_MUTEX_SPIN_ON_OWNER=y
+CONFIG_RWSEM_SPIN_ON_OWNER=y
+CONFIG_LOCK_SPIN_ON_OWNER=y
+CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
+CONFIG_QUEUED_SPINLOCKS=y
+CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
+CONFIG_QUEUED_RWLOCKS=y
+CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
+CONFIG_ARCH_HAS_SYNC_CORE_BEFORE_USERMODE=y
+CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
+CONFIG_FREEZER=y
+
+#
+# Executable file formats
+#
+CONFIG_BINFMT_ELF=y
+CONFIG_COMPAT_BINFMT_ELF=y
+CONFIG_ELFCORE=y
+CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
+CONFIG_BINFMT_SCRIPT=y
+CONFIG_BINFMT_MISC=y
+CONFIG_COREDUMP=y
+# end of Executable file formats
+
+#
+# Memory Management options
+#
+CONFIG_SWAP=y
+# CONFIG_ZSWAP is not set
+CONFIG_ZSMALLOC=m
+# CONFIG_ZSMALLOC_STAT is not set
+
+#
+# SLAB allocator options
+#
+# CONFIG_SLAB is not set
+CONFIG_SLUB=y
+# CONFIG_SLOB is not set
+CONFIG_SLAB_MERGE_DEFAULT=y
+CONFIG_SLAB_FREELIST_RANDOM=y
+CONFIG_SLAB_FREELIST_HARDENED=y
+# CONFIG_SLUB_STATS is not set
+CONFIG_SLUB_CPU_PARTIAL=y
+# end of SLAB allocator options
+
+CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_SPARSEMEM=y
+CONFIG_SPARSEMEM_EXTREME=y
+CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
+CONFIG_SPARSEMEM_VMEMMAP=y
+CONFIG_HAVE_FAST_GUP=y
+CONFIG_NUMA_KEEP_MEMINFO=y
+CONFIG_MEMORY_ISOLATION=y
+CONFIG_HAVE_BOOTMEM_INFO_NODE=y
+CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG_SPARSE=y
+# CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is not set
+CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
+CONFIG_MEMORY_HOTREMOVE=y
+CONFIG_MHP_MEMMAP_ON_MEMORY=y
+CONFIG_SPLIT_PTLOCK_CPUS=4
+CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
+CONFIG_MEMORY_BALLOON=y
+CONFIG_BALLOON_COMPACTION=y
+CONFIG_COMPACTION=y
+CONFIG_COMPACT_UNEVICTABLE_DEFAULT=1
+CONFIG_PAGE_REPORTING=y
+CONFIG_MIGRATION=y
+CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
+CONFIG_ARCH_ENABLE_THP_MIGRATION=y
+CONFIG_CONTIG_ALLOC=y
+CONFIG_PHYS_ADDR_T_64BIT=y
+CONFIG_MMU_NOTIFIER=y
+# CONFIG_KSM is not set
+CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
+CONFIG_ARCH_SUPPORTS_MEMORY_FAILURE=y
+CONFIG_MEMORY_FAILURE=y
+# CONFIG_HWPOISON_INJECT is not set
+CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
+CONFIG_ARCH_WANTS_THP_SWAP=y
+CONFIG_TRANSPARENT_HUGEPAGE=y
+# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
+CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
+CONFIG_THP_SWAP=y
+# CONFIG_READ_ONLY_THP_FOR_FS is not set
+CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
+CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
+CONFIG_USE_PERCPU_NUMA_NODE_ID=y
+CONFIG_HAVE_SETUP_PER_CPU_AREA=y
+CONFIG_CMA=y
+# CONFIG_CMA_DEBUG is not set
+# CONFIG_CMA_DEBUGFS is not set
+# CONFIG_CMA_SYSFS is not set
+CONFIG_CMA_AREAS=7
+CONFIG_MEM_SOFT_DIRTY=y
+CONFIG_GENERIC_EARLY_IOREMAP=y
+# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
+# CONFIG_IDLE_PAGE_TRACKING is not set
+CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
+CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
+CONFIG_ARCH_HAS_PTE_DEVMAP=y
+CONFIG_ARCH_HAS_ZONE_DMA_SET=y
+CONFIG_ZONE_DMA=y
+CONFIG_ZONE_DMA32=y
+CONFIG_VMAP_PFN=y
+CONFIG_ZONE_DEVICE=y
+# CONFIG_DEVICE_PRIVATE is not set
+CONFIG_ARCH_USES_HIGH_VMA_FLAGS=y
+CONFIG_ARCH_HAS_PKEYS=y
+CONFIG_VM_EVENT_COUNTERS=y
+# CONFIG_PERCPU_STATS is not set
+# CONFIG_GUP_TEST is not set
+CONFIG_ARCH_HAS_PTE_SPECIAL=y
+# CONFIG_ANON_VMA_NAME is not set
+# CONFIG_USERFAULTFD is not set
+CONFIG_LRU_GEN=y
+# CONFIG_LRU_GEN_ENABLED is not set
+# CONFIG_LRU_GEN_STATS is not set
+CONFIG_LOCK_MM_AND_FIND_VMA=y
+
+#
+# Data Access Monitoring
+#
+# CONFIG_DAMON is not set
+# end of Data Access Monitoring
+# end of Memory Management options
+
+CONFIG_NET=y
+CONFIG_NET_INGRESS=y
+CONFIG_NET_EGRESS=y
+CONFIG_NET_REDIRECT=y
+CONFIG_SKB_EXTENSIONS=y
+
+#
+# Networking options
+#
+CONFIG_PACKET=y
+CONFIG_PACKET_DIAG=m
+CONFIG_UNIX=y
+CONFIG_UNIX_SCM=y
+CONFIG_AF_UNIX_OOB=y
+CONFIG_UNIX_DIAG=m
+CONFIG_TLS=y
+CONFIG_TLS_DEVICE=y
+# CONFIG_TLS_TOE is not set
+CONFIG_XFRM=y
+CONFIG_XFRM_ALGO=y
+CONFIG_XFRM_USER=y
+# CONFIG_XFRM_USER_COMPAT is not set
+CONFIG_XFRM_INTERFACE=m
+# CONFIG_XFRM_SUB_POLICY is not set
+# CONFIG_XFRM_MIGRATE is not set
+CONFIG_XFRM_STATISTICS=y
+CONFIG_XFRM_AH=m
+CONFIG_XFRM_ESP=m
+CONFIG_XFRM_IPCOMP=m
+CONFIG_NET_KEY=m
+# CONFIG_NET_KEY_MIGRATE is not set
+CONFIG_XDP_SOCKETS=y
+# CONFIG_XDP_SOCKETS_DIAG is not set
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+# CONFIG_IP_FIB_TRIE_STATS is not set
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_IP_ROUTE_MULTIPATH=y
+CONFIG_IP_ROUTE_VERBOSE=y
+CONFIG_IP_ROUTE_CLASSID=y
+# CONFIG_IP_PNP is not set
+CONFIG_NET_IPIP=m
+CONFIG_NET_IPGRE_DEMUX=m
+CONFIG_NET_IP_TUNNEL=m
+CONFIG_NET_IPGRE=m
+# CONFIG_NET_IPGRE_BROADCAST is not set
+CONFIG_IP_MROUTE_COMMON=y
+CONFIG_IP_MROUTE=y
+# CONFIG_IP_MROUTE_MULTIPLE_TABLES is not set
+CONFIG_IP_PIMSM_V1=y
+CONFIG_IP_PIMSM_V2=y
+CONFIG_SYN_COOKIES=y
+# CONFIG_NET_IPVTI is not set
+CONFIG_NET_UDP_TUNNEL=m
+CONFIG_NET_FOU=m
+CONFIG_NET_FOU_IP_TUNNELS=y
+CONFIG_INET_AH=m
+CONFIG_INET_ESP=m
+# CONFIG_INET_ESP_OFFLOAD is not set
+# CONFIG_INET_ESPINTCP is not set
+CONFIG_INET_IPCOMP=m
+CONFIG_INET_TABLE_PERTURB_ORDER=16
+CONFIG_INET_XFRM_TUNNEL=m
+CONFIG_INET_TUNNEL=m
+CONFIG_INET_DIAG=m
+CONFIG_INET_TCP_DIAG=m
+CONFIG_INET_UDP_DIAG=m
+# CONFIG_INET_RAW_DIAG is not set
+CONFIG_INET_DIAG_DESTROY=y
+CONFIG_TCP_CONG_ADVANCED=y
+# CONFIG_TCP_CONG_BIC is not set
+CONFIG_TCP_CONG_CUBIC=y
+# CONFIG_TCP_CONG_WESTWOOD is not set
+# CONFIG_TCP_CONG_HTCP is not set
+# CONFIG_TCP_CONG_HSTCP is not set
+# CONFIG_TCP_CONG_HYBLA is not set
+# CONFIG_TCP_CONG_VEGAS is not set
+# CONFIG_TCP_CONG_NV is not set
+# CONFIG_TCP_CONG_SCALABLE is not set
+CONFIG_TCP_CONG_LP=m
+# CONFIG_TCP_CONG_VENO is not set
+# CONFIG_TCP_CONG_YEAH is not set
+# CONFIG_TCP_CONG_ILLINOIS is not set
+# CONFIG_TCP_CONG_DCTCP is not set
+# CONFIG_TCP_CONG_CDG is not set
+CONFIG_TCP_CONG_BBR=m
+CONFIG_DEFAULT_CUBIC=y
+# CONFIG_DEFAULT_RENO is not set
+CONFIG_DEFAULT_TCP_CONG="cubic"
+CONFIG_TCP_MD5SIG=y
+CONFIG_IPV6=y
+# CONFIG_IPV6_ROUTER_PREF is not set
+# CONFIG_IPV6_OPTIMISTIC_DAD is not set
+# CONFIG_INET6_AH is not set
+CONFIG_INET6_ESP=m
+# CONFIG_INET6_ESP_OFFLOAD is not set
+# CONFIG_INET6_ESPINTCP is not set
+# CONFIG_INET6_IPCOMP is not set
+# CONFIG_IPV6_MIP6 is not set
+# CONFIG_IPV6_ILA is not set
+CONFIG_INET6_TUNNEL=m
+# CONFIG_IPV6_VTI is not set
+# CONFIG_IPV6_SIT is not set
+CONFIG_IPV6_TUNNEL=m
+CONFIG_IPV6_GRE=m
+CONFIG_IPV6_FOU=m
+CONFIG_IPV6_FOU_TUNNEL=m
+CONFIG_IPV6_MULTIPLE_TABLES=y
+# CONFIG_IPV6_SUBTREES is not set
+# CONFIG_IPV6_MROUTE is not set
+# CONFIG_IPV6_SEG6_LWTUNNEL is not set
+# CONFIG_IPV6_SEG6_HMAC is not set
+# CONFIG_IPV6_RPL_LWTUNNEL is not set
+# CONFIG_IPV6_IOAM6_LWTUNNEL is not set
+# CONFIG_NETLABEL is not set
+# CONFIG_MPTCP is not set
+CONFIG_NETWORK_SECMARK=y
+CONFIG_NET_PTP_CLASSIFY=y
+# CONFIG_NETWORK_PHY_TIMESTAMPING is not set
+CONFIG_NETFILTER=y
+CONFIG_NETFILTER_ADVANCED=y
+CONFIG_BRIDGE_NETFILTER=m
+
+#
+# Core Netfilter Configuration
+#
+CONFIG_NETFILTER_INGRESS=y
+CONFIG_NETFILTER_EGRESS=y
+CONFIG_NETFILTER_SKIP_EGRESS=y
+CONFIG_NETFILTER_NETLINK=y
+CONFIG_NETFILTER_FAMILY_BRIDGE=y
+CONFIG_NETFILTER_FAMILY_ARP=y
+# CONFIG_NETFILTER_NETLINK_HOOK is not set
+CONFIG_NETFILTER_NETLINK_ACCT=m
+CONFIG_NETFILTER_NETLINK_QUEUE=y
+CONFIG_NETFILTER_NETLINK_LOG=y
+CONFIG_NETFILTER_NETLINK_OSF=m
+CONFIG_NF_CONNTRACK=y
+CONFIG_NF_LOG_SYSLOG=m
+CONFIG_NETFILTER_CONNCOUNT=m
+CONFIG_NF_CONNTRACK_MARK=y
+CONFIG_NF_CONNTRACK_SECMARK=y
+CONFIG_NF_CONNTRACK_ZONES=y
+CONFIG_NF_CONNTRACK_PROCFS=y
+CONFIG_NF_CONNTRACK_EVENTS=y
+CONFIG_NF_CONNTRACK_TIMEOUT=y
+CONFIG_NF_CONNTRACK_TIMESTAMP=y
+CONFIG_NF_CONNTRACK_LABELS=y
+CONFIG_NF_CT_PROTO_DCCP=y
+CONFIG_NF_CT_PROTO_GRE=y
+CONFIG_NF_CT_PROTO_SCTP=y
+CONFIG_NF_CT_PROTO_UDPLITE=y
+CONFIG_NF_CONNTRACK_AMANDA=m
+CONFIG_NF_CONNTRACK_FTP=m
+CONFIG_NF_CONNTRACK_H323=m
+CONFIG_NF_CONNTRACK_IRC=m
+CONFIG_NF_CONNTRACK_BROADCAST=m
+CONFIG_NF_CONNTRACK_NETBIOS_NS=m
+CONFIG_NF_CONNTRACK_SNMP=m
+CONFIG_NF_CONNTRACK_PPTP=m
+CONFIG_NF_CONNTRACK_SANE=m
+CONFIG_NF_CONNTRACK_SIP=m
+CONFIG_NF_CONNTRACK_TFTP=m
+CONFIG_NF_CT_NETLINK=y
+CONFIG_NF_CT_NETLINK_TIMEOUT=m
+CONFIG_NF_CT_NETLINK_HELPER=m
+CONFIG_NETFILTER_NETLINK_GLUE_CT=y
+CONFIG_NF_NAT=m
+CONFIG_NF_NAT_AMANDA=m
+CONFIG_NF_NAT_FTP=m
+CONFIG_NF_NAT_IRC=m
+CONFIG_NF_NAT_SIP=m
+CONFIG_NF_NAT_TFTP=m
+CONFIG_NF_NAT_REDIRECT=y
+CONFIG_NF_NAT_MASQUERADE=y
+CONFIG_NETFILTER_SYNPROXY=y
+CONFIG_NF_TABLES=m
+CONFIG_NF_TABLES_INET=y
+# CONFIG_NF_TABLES_NETDEV is not set
+CONFIG_NFT_NUMGEN=m
+CONFIG_NFT_CT=m
+# CONFIG_NFT_FLOW_OFFLOAD is not set
+CONFIG_NFT_CONNLIMIT=m
+CONFIG_NFT_LOG=m
+CONFIG_NFT_LIMIT=m
+CONFIG_NFT_MASQ=m
+CONFIG_NFT_REDIR=m
+CONFIG_NFT_NAT=m
+CONFIG_NFT_TUNNEL=m
+CONFIG_NFT_OBJREF=m
+CONFIG_NFT_QUEUE=m
+CONFIG_NFT_QUOTA=m
+CONFIG_NFT_REJECT=m
+CONFIG_NFT_REJECT_INET=m
+CONFIG_NFT_COMPAT=m
+CONFIG_NFT_HASH=m
+CONFIG_NFT_XFRM=m
+CONFIG_NFT_SOCKET=m
+CONFIG_NFT_OSF=m
+CONFIG_NFT_TPROXY=m
+CONFIG_NFT_SYNPROXY=m
+# CONFIG_NF_FLOW_TABLE_INET is not set
+CONFIG_NF_FLOW_TABLE=m
+# CONFIG_NF_FLOW_TABLE_PROCFS is not set
+CONFIG_NETFILTER_XTABLES=y
+CONFIG_NETFILTER_XTABLES_COMPAT=y
+
+#
+# Xtables combined modules
+#
+CONFIG_NETFILTER_XT_MARK=m
+CONFIG_NETFILTER_XT_CONNMARK=m
+CONFIG_NETFILTER_XT_SET=m
+
+#
+# Xtables targets
+#
+CONFIG_NETFILTER_XT_TARGET_AUDIT=m
+CONFIG_NETFILTER_XT_TARGET_CHECKSUM=m
+CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
+CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
+CONFIG_NETFILTER_XT_TARGET_CONNSECMARK=m
+CONFIG_NETFILTER_XT_TARGET_CT=m
+CONFIG_NETFILTER_XT_TARGET_DSCP=m
+CONFIG_NETFILTER_XT_TARGET_HL=m
+CONFIG_NETFILTER_XT_TARGET_HMARK=m
+CONFIG_NETFILTER_XT_TARGET_IDLETIMER=m
+CONFIG_NETFILTER_XT_TARGET_LOG=m
+CONFIG_NETFILTER_XT_TARGET_MARK=m
+CONFIG_NETFILTER_XT_NAT=m
+CONFIG_NETFILTER_XT_TARGET_NETMAP=m
+CONFIG_NETFILTER_XT_TARGET_NFLOG=y
+CONFIG_NETFILTER_XT_TARGET_NFQUEUE=y
+# CONFIG_NETFILTER_XT_TARGET_NOTRACK is not set
+CONFIG_NETFILTER_XT_TARGET_RATEEST=m
+CONFIG_NETFILTER_XT_TARGET_REDIRECT=m
+CONFIG_NETFILTER_XT_TARGET_MASQUERADE=m
+CONFIG_NETFILTER_XT_TARGET_TEE=m
+CONFIG_NETFILTER_XT_TARGET_TPROXY=m
+CONFIG_NETFILTER_XT_TARGET_TRACE=m
+CONFIG_NETFILTER_XT_TARGET_SECMARK=m
+CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
+CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
+
+#
+# Xtables matches
+#
+CONFIG_NETFILTER_XT_MATCH_ADDRTYPE=m
+CONFIG_NETFILTER_XT_MATCH_BPF=m
+CONFIG_NETFILTER_XT_MATCH_CGROUP=m
+CONFIG_NETFILTER_XT_MATCH_CLUSTER=m
+CONFIG_NETFILTER_XT_MATCH_COMMENT=y
+CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
+CONFIG_NETFILTER_XT_MATCH_CONNLABEL=m
+CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
+CONFIG_NETFILTER_XT_MATCH_CONNTRACK=y
+CONFIG_NETFILTER_XT_MATCH_CPU=m
+CONFIG_NETFILTER_XT_MATCH_DCCP=m
+CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
+CONFIG_NETFILTER_XT_MATCH_DSCP=m
+CONFIG_NETFILTER_XT_MATCH_ECN=m
+CONFIG_NETFILTER_XT_MATCH_ESP=m
+CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
+CONFIG_NETFILTER_XT_MATCH_HELPER=m
+CONFIG_NETFILTER_XT_MATCH_HL=m
+CONFIG_NETFILTER_XT_MATCH_IPCOMP=m
+CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
+CONFIG_NETFILTER_XT_MATCH_IPVS=m
+CONFIG_NETFILTER_XT_MATCH_L2TP=m
+CONFIG_NETFILTER_XT_MATCH_LENGTH=m
+CONFIG_NETFILTER_XT_MATCH_LIMIT=m
+CONFIG_NETFILTER_XT_MATCH_MAC=m
+CONFIG_NETFILTER_XT_MATCH_MARK=m
+CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
+CONFIG_NETFILTER_XT_MATCH_NFACCT=m
+CONFIG_NETFILTER_XT_MATCH_OSF=m
+CONFIG_NETFILTER_XT_MATCH_OWNER=y
+CONFIG_NETFILTER_XT_MATCH_POLICY=y
+CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
+CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
+CONFIG_NETFILTER_XT_MATCH_QUOTA=m
+CONFIG_NETFILTER_XT_MATCH_RATEEST=m
+CONFIG_NETFILTER_XT_MATCH_REALM=y
+CONFIG_NETFILTER_XT_MATCH_RECENT=m
+CONFIG_NETFILTER_XT_MATCH_SCTP=m
+CONFIG_NETFILTER_XT_MATCH_SOCKET=m
+CONFIG_NETFILTER_XT_MATCH_STATE=m
+CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
+CONFIG_NETFILTER_XT_MATCH_STRING=m
+CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
+CONFIG_NETFILTER_XT_MATCH_TIME=m
+CONFIG_NETFILTER_XT_MATCH_U32=m
+# end of Core Netfilter Configuration
+
+CONFIG_IP_SET=m
+CONFIG_IP_SET_MAX=256
+CONFIG_IP_SET_BITMAP_IP=m
+CONFIG_IP_SET_BITMAP_IPMAC=m
+CONFIG_IP_SET_BITMAP_PORT=m
+CONFIG_IP_SET_HASH_IP=m
+CONFIG_IP_SET_HASH_IPMARK=m
+CONFIG_IP_SET_HASH_IPPORT=m
+CONFIG_IP_SET_HASH_IPPORTIP=m
+CONFIG_IP_SET_HASH_IPPORTNET=m
+# CONFIG_IP_SET_HASH_IPMAC is not set
+CONFIG_IP_SET_HASH_MAC=m
+CONFIG_IP_SET_HASH_NETPORTNET=m
+CONFIG_IP_SET_HASH_NET=m
+CONFIG_IP_SET_HASH_NETNET=m
+CONFIG_IP_SET_HASH_NETPORT=m
+CONFIG_IP_SET_HASH_NETIFACE=m
+CONFIG_IP_SET_LIST_SET=m
+CONFIG_IP_VS=m
+# CONFIG_IP_VS_IPV6 is not set
+# CONFIG_IP_VS_DEBUG is not set
+CONFIG_IP_VS_TAB_BITS=12
+
+#
+# IPVS transport protocol load balancing support
+#
+CONFIG_IP_VS_PROTO_TCP=y
+CONFIG_IP_VS_PROTO_UDP=y
+CONFIG_IP_VS_PROTO_AH_ESP=y
+CONFIG_IP_VS_PROTO_ESP=y
+CONFIG_IP_VS_PROTO_AH=y
+CONFIG_IP_VS_PROTO_SCTP=y
+
+#
+# IPVS scheduler
+#
+CONFIG_IP_VS_RR=m
+CONFIG_IP_VS_WRR=m
+CONFIG_IP_VS_LC=m
+CONFIG_IP_VS_WLC=m
+CONFIG_IP_VS_FO=m
+CONFIG_IP_VS_OVF=m
+CONFIG_IP_VS_LBLC=m
+CONFIG_IP_VS_LBLCR=m
+CONFIG_IP_VS_DH=m
+CONFIG_IP_VS_SH=m
+# CONFIG_IP_VS_MH is not set
+CONFIG_IP_VS_SED=m
+CONFIG_IP_VS_NQ=m
+# CONFIG_IP_VS_TWOS is not set
+
+#
+# IPVS SH scheduler
+#
+CONFIG_IP_VS_SH_TAB_BITS=8
+
+#
+# IPVS MH scheduler
+#
+CONFIG_IP_VS_MH_TAB_INDEX=12
+
+#
+# IPVS application helper
+#
+CONFIG_IP_VS_FTP=m
+CONFIG_IP_VS_NFCT=y
+CONFIG_IP_VS_PE_SIP=m
+
+#
+# IP: Netfilter Configuration
+#
+CONFIG_NF_DEFRAG_IPV4=y
+CONFIG_NF_SOCKET_IPV4=m
+CONFIG_NF_TPROXY_IPV4=m
+CONFIG_NF_TABLES_IPV4=y
+CONFIG_NFT_REJECT_IPV4=m
+# CONFIG_NFT_DUP_IPV4 is not set
+# CONFIG_NFT_FIB_IPV4 is not set
+# CONFIG_NF_TABLES_ARP is not set
+CONFIG_NF_DUP_IPV4=m
+CONFIG_NF_LOG_ARP=m
+CONFIG_NF_LOG_IPV4=m
+CONFIG_NF_REJECT_IPV4=y
+CONFIG_NF_NAT_SNMP_BASIC=m
+CONFIG_NF_NAT_PPTP=m
+CONFIG_NF_NAT_H323=m
+CONFIG_IP_NF_IPTABLES=y
+CONFIG_IP_NF_MATCH_AH=m
+CONFIG_IP_NF_MATCH_ECN=m
+CONFIG_IP_NF_MATCH_RPFILTER=m
+CONFIG_IP_NF_MATCH_TTL=m
+CONFIG_IP_NF_FILTER=y
+CONFIG_IP_NF_TARGET_REJECT=y
+CONFIG_IP_NF_TARGET_SYNPROXY=m
+CONFIG_IP_NF_NAT=m
+CONFIG_IP_NF_TARGET_MASQUERADE=m
+CONFIG_IP_NF_TARGET_NETMAP=m
+CONFIG_IP_NF_TARGET_REDIRECT=m
+CONFIG_IP_NF_MANGLE=y
+CONFIG_IP_NF_TARGET_CLUSTERIP=m
+CONFIG_IP_NF_TARGET_ECN=m
+CONFIG_IP_NF_TARGET_TTL=m
+CONFIG_IP_NF_RAW=m
+CONFIG_IP_NF_SECURITY=m
+CONFIG_IP_NF_ARPTABLES=m
+CONFIG_IP_NF_ARPFILTER=m
+CONFIG_IP_NF_ARP_MANGLE=m
+# end of IP: Netfilter Configuration
+
+#
+# IPv6: Netfilter Configuration
+#
+CONFIG_NF_SOCKET_IPV6=m
+CONFIG_NF_TPROXY_IPV6=m
+CONFIG_NF_TABLES_IPV6=y
+CONFIG_NFT_REJECT_IPV6=m
+# CONFIG_NFT_DUP_IPV6 is not set
+# CONFIG_NFT_FIB_IPV6 is not set
+CONFIG_NF_DUP_IPV6=m
+CONFIG_NF_REJECT_IPV6=y
+CONFIG_NF_LOG_IPV6=m
+CONFIG_IP6_NF_IPTABLES=y
+CONFIG_IP6_NF_MATCH_AH=m
+# CONFIG_IP6_NF_MATCH_EUI64 is not set
+# CONFIG_IP6_NF_MATCH_FRAG is not set
+# CONFIG_IP6_NF_MATCH_OPTS is not set
+# CONFIG_IP6_NF_MATCH_HL is not set
+# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
+# CONFIG_IP6_NF_MATCH_MH is not set
+CONFIG_IP6_NF_MATCH_RPFILTER=m
+# CONFIG_IP6_NF_MATCH_RT is not set
+# CONFIG_IP6_NF_MATCH_SRH is not set
+# CONFIG_IP6_NF_TARGET_HL is not set
+CONFIG_IP6_NF_FILTER=y
+CONFIG_IP6_NF_TARGET_REJECT=y
+CONFIG_IP6_NF_TARGET_SYNPROXY=y
+CONFIG_IP6_NF_MANGLE=m
+CONFIG_IP6_NF_RAW=m
+CONFIG_IP6_NF_SECURITY=m
+CONFIG_IP6_NF_NAT=m
+CONFIG_IP6_NF_TARGET_MASQUERADE=m
+# CONFIG_IP6_NF_TARGET_NPT is not set
+# end of IPv6: Netfilter Configuration
+
+CONFIG_NF_DEFRAG_IPV6=y
+# CONFIG_NF_TABLES_BRIDGE is not set
+# CONFIG_NF_CONNTRACK_BRIDGE is not set
+CONFIG_BRIDGE_NF_EBTABLES=m
+CONFIG_BRIDGE_EBT_BROUTE=m
+CONFIG_BRIDGE_EBT_T_FILTER=m
+CONFIG_BRIDGE_EBT_T_NAT=m
+CONFIG_BRIDGE_EBT_802_3=m
+CONFIG_BRIDGE_EBT_AMONG=m
+CONFIG_BRIDGE_EBT_ARP=m
+CONFIG_BRIDGE_EBT_IP=m
+# CONFIG_BRIDGE_EBT_IP6 is not set
+CONFIG_BRIDGE_EBT_LIMIT=m
+CONFIG_BRIDGE_EBT_MARK=m
+CONFIG_BRIDGE_EBT_PKTTYPE=m
+CONFIG_BRIDGE_EBT_STP=m
+CONFIG_BRIDGE_EBT_VLAN=m
+CONFIG_BRIDGE_EBT_ARPREPLY=m
+CONFIG_BRIDGE_EBT_DNAT=m
+CONFIG_BRIDGE_EBT_MARK_T=m
+CONFIG_BRIDGE_EBT_REDIRECT=m
+CONFIG_BRIDGE_EBT_SNAT=m
+CONFIG_BRIDGE_EBT_LOG=m
+CONFIG_BRIDGE_EBT_NFLOG=m
+# CONFIG_BPFILTER is not set
+# CONFIG_IP_DCCP is not set
+# CONFIG_IP_SCTP is not set
+# CONFIG_RDS is not set
+# CONFIG_TIPC is not set
+# CONFIG_ATM is not set
+# CONFIG_L2TP is not set
+CONFIG_STP=y
+CONFIG_BRIDGE=y
+CONFIG_BRIDGE_IGMP_SNOOPING=y
+CONFIG_BRIDGE_VLAN_FILTERING=y
+# CONFIG_BRIDGE_MRP is not set
+# CONFIG_BRIDGE_CFM is not set
+# CONFIG_NET_DSA is not set
+CONFIG_VLAN_8021Q=m
+# CONFIG_VLAN_8021Q_GVRP is not set
+# CONFIG_VLAN_8021Q_MVRP is not set
+CONFIG_LLC=y
+# CONFIG_LLC2 is not set
+# CONFIG_ATALK is not set
+# CONFIG_X25 is not set
+# CONFIG_LAPB is not set
+# CONFIG_PHONET is not set
+# CONFIG_6LOWPAN is not set
+# CONFIG_IEEE802154 is not set
+CONFIG_NET_SCHED=y
+
+#
+# Queueing/Scheduling
+#
+CONFIG_NET_SCH_CBQ=m
+CONFIG_NET_SCH_HTB=m
+CONFIG_NET_SCH_HFSC=m
+CONFIG_NET_SCH_PRIO=m
+CONFIG_NET_SCH_MULTIQ=m
+CONFIG_NET_SCH_RED=m
+CONFIG_NET_SCH_SFB=m
+CONFIG_NET_SCH_SFQ=m
+CONFIG_NET_SCH_TEQL=m
+CONFIG_NET_SCH_TBF=m
+# CONFIG_NET_SCH_CBS is not set
+# CONFIG_NET_SCH_ETF is not set
+# CONFIG_NET_SCH_TAPRIO is not set
+CONFIG_NET_SCH_GRED=m
+CONFIG_NET_SCH_DSMARK=m
+CONFIG_NET_SCH_NETEM=m
+CONFIG_NET_SCH_DRR=m
+CONFIG_NET_SCH_MQPRIO=m
+# CONFIG_NET_SCH_SKBPRIO is not set
+CONFIG_NET_SCH_CHOKE=m
+CONFIG_NET_SCH_QFQ=m
+CONFIG_NET_SCH_CODEL=m
+CONFIG_NET_SCH_FQ_CODEL=m
+# CONFIG_NET_SCH_CAKE is not set
+CONFIG_NET_SCH_FQ=m
+CONFIG_NET_SCH_HHF=m
+CONFIG_NET_SCH_PIE=m
+# CONFIG_NET_SCH_FQ_PIE is not set
+CONFIG_NET_SCH_INGRESS=m
+CONFIG_NET_SCH_PLUG=m
+# CONFIG_NET_SCH_ETS is not set
+# CONFIG_NET_SCH_DEFAULT is not set
+
+#
+# Classification
+#
+CONFIG_NET_CLS=y
+CONFIG_NET_CLS_BASIC=m
+CONFIG_NET_CLS_ROUTE4=m
+CONFIG_NET_CLS_FW=m
+CONFIG_NET_CLS_U32=m
+# CONFIG_CLS_U32_PERF is not set
+CONFIG_CLS_U32_MARK=y
+CONFIG_NET_CLS_RSVP=m
+CONFIG_NET_CLS_RSVP6=m
+# CONFIG_NET_CLS_FLOW is not set
+CONFIG_NET_CLS_CGROUP=m
+CONFIG_NET_CLS_BPF=m
+# CONFIG_NET_CLS_FLOWER is not set
+# CONFIG_NET_CLS_MATCHALL is not set
+# CONFIG_NET_EMATCH is not set
+CONFIG_NET_CLS_ACT=y
+# CONFIG_NET_ACT_POLICE is not set
+CONFIG_NET_ACT_GACT=m
+# CONFIG_GACT_PROB is not set
+CONFIG_NET_ACT_MIRRED=y
+# CONFIG_NET_ACT_SAMPLE is not set
+# CONFIG_NET_ACT_IPT is not set
+CONFIG_NET_ACT_NAT=m
+CONFIG_NET_ACT_PEDIT=y
+# CONFIG_NET_ACT_SIMP is not set
+# CONFIG_NET_ACT_SKBEDIT is not set
+# CONFIG_NET_ACT_CSUM is not set
+# CONFIG_NET_ACT_MPLS is not set
+# CONFIG_NET_ACT_VLAN is not set
+# CONFIG_NET_ACT_BPF is not set
+# CONFIG_NET_ACT_CONNMARK is not set
+# CONFIG_NET_ACT_CTINFO is not set
+# CONFIG_NET_ACT_SKBMOD is not set
+# CONFIG_NET_ACT_IFE is not set
+# CONFIG_NET_ACT_TUNNEL_KEY is not set
+# CONFIG_NET_ACT_CT is not set
+# CONFIG_NET_ACT_GATE is not set
+# CONFIG_NET_TC_SKB_EXT is not set
+CONFIG_NET_SCH_FIFO=y
+# CONFIG_DCB is not set
+CONFIG_DNS_RESOLVER=m
+# CONFIG_BATMAN_ADV is not set
+# CONFIG_OPENVSWITCH is not set
+CONFIG_VSOCKETS=y
+CONFIG_VSOCKETS_DIAG=y
+CONFIG_VSOCKETS_LOOPBACK=y
+CONFIG_VMWARE_VMCI_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS=y
+CONFIG_VIRTIO_VSOCKETS_COMMON=y
+# CONFIG_HYPERV_VSOCKETS is not set
+CONFIG_NETLINK_DIAG=m
+# CONFIG_MPLS is not set
+# CONFIG_NET_NSH is not set
+# CONFIG_HSR is not set
+# CONFIG_NET_SWITCHDEV is not set
+CONFIG_NET_L3_MASTER_DEV=y
+# CONFIG_QRTR is not set
+# CONFIG_NET_NCSI is not set
+CONFIG_PCPU_DEV_REFCNT=y
+CONFIG_MAX_SKB_FRAGS=17
+CONFIG_RPS=y
+CONFIG_RFS_ACCEL=y
+CONFIG_SOCK_RX_QUEUE_MAPPING=y
+CONFIG_XPS=y
+CONFIG_CGROUP_NET_PRIO=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_NET_RX_BUSY_POLL=y
+CONFIG_BQL=y
+CONFIG_BPF_STREAM_PARSER=y
+CONFIG_NET_FLOW_LIMIT=y
+
+#
+# Network testing
+#
+# CONFIG_NET_PKTGEN is not set
+# CONFIG_NET_DROP_MONITOR is not set
+# end of Network testing
+# end of Networking options
+
+# CONFIG_HAMRADIO is not set
+# CONFIG_CAN is not set
+# CONFIG_BT is not set
+# CONFIG_AF_RXRPC is not set
+# CONFIG_AF_KCM is not set
+CONFIG_STREAM_PARSER=y
+# CONFIG_MCTP is not set
+CONFIG_FIB_RULES=y
+# CONFIG_WIRELESS is not set
+# CONFIG_RFKILL is not set
+CONFIG_NET_9P=y
+CONFIG_NET_9P_FD=y
+CONFIG_NET_9P_VIRTIO=y
+# CONFIG_NET_9P_XEN is not set
+# CONFIG_NET_9P_DEBUG is not set
+# CONFIG_CAIF is not set
+# CONFIG_CEPH_LIB is not set
+# CONFIG_NFC is not set
+# CONFIG_PSAMPLE is not set
+# CONFIG_NET_IFE is not set
+CONFIG_LWTUNNEL=y
+CONFIG_LWTUNNEL_BPF=y
+CONFIG_DST_CACHE=y
+CONFIG_GRO_CELLS=y
+CONFIG_SOCK_VALIDATE_XMIT=y
+CONFIG_NET_SELFTESTS=y
+CONFIG_NET_SOCK_MSG=y
+CONFIG_NET_DEVLINK=y
+CONFIG_PAGE_POOL=y
+# CONFIG_PAGE_POOL_STATS is not set
+CONFIG_FAILOVER=y
+CONFIG_ETHTOOL_NETLINK=y
+
+#
+# Device Drivers
+#
+CONFIG_HAVE_EISA=y
+# CONFIG_EISA is not set
+CONFIG_HAVE_PCI=y
+CONFIG_PCI=y
+CONFIG_PCI_DOMAINS=y
+CONFIG_PCIEPORTBUS=y
+CONFIG_HOTPLUG_PCI_PCIE=y
+CONFIG_PCIEAER=y
+# CONFIG_PCIEAER_INJECT is not set
+# CONFIG_PCIE_ECRC is not set
+CONFIG_PCIEASPM=y
+CONFIG_PCIEASPM_DEFAULT=y
+# CONFIG_PCIEASPM_POWERSAVE is not set
+# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
+# CONFIG_PCIEASPM_PERFORMANCE is not set
+CONFIG_PCIE_PME=y
+# CONFIG_PCIE_DPC is not set
+# CONFIG_PCIE_PTM is not set
+CONFIG_PCI_MSI=y
+CONFIG_PCI_MSI_IRQ_DOMAIN=y
+CONFIG_PCI_QUIRKS=y
+# CONFIG_PCI_DEBUG is not set
+# CONFIG_PCI_STUB is not set
+CONFIG_XEN_PCIDEV_FRONTEND=y
+CONFIG_PCI_ATS=y
+CONFIG_PCI_LOCKLESS_CONFIG=y
+# CONFIG_PCI_IOV is not set
+CONFIG_PCI_PRI=y
+CONFIG_PCI_PASID=y
+# CONFIG_PCI_P2PDMA is not set
+CONFIG_PCI_LABEL=y
+CONFIG_PCI_HYPERV=y
+# CONFIG_PCIE_BUS_TUNE_OFF is not set
+CONFIG_PCIE_BUS_DEFAULT=y
+# CONFIG_PCIE_BUS_SAFE is not set
+# CONFIG_PCIE_BUS_PERFORMANCE is not set
+# CONFIG_PCIE_BUS_PEER2PEER is not set
+# CONFIG_VGA_ARB is not set
+CONFIG_HOTPLUG_PCI=y
+CONFIG_HOTPLUG_PCI_ACPI=y
+# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
+# CONFIG_HOTPLUG_PCI_CPCI is not set
+# CONFIG_HOTPLUG_PCI_SHPC is not set
+
+#
+# PCI controller drivers
+#
+# CONFIG_VMD is not set
+CONFIG_PCI_HYPERV_INTERFACE=y
+
+#
+# DesignWare PCI Core Support
+#
+# CONFIG_PCIE_DW_PLAT_HOST is not set
+# CONFIG_PCI_MESON is not set
+# end of DesignWare PCI Core Support
+
+#
+# Mobiveil PCIe Core Support
+#
+# end of Mobiveil PCIe Core Support
+
+#
+# Cadence PCIe controllers support
+#
+# end of Cadence PCIe controllers support
+# end of PCI controller drivers
+
+#
+# PCI Endpoint
+#
+# CONFIG_PCI_ENDPOINT is not set
+# end of PCI Endpoint
+
+#
+# PCI switch controller drivers
+#
+# CONFIG_PCI_SW_SWITCHTEC is not set
+# end of PCI switch controller drivers
+
+# CONFIG_CXL_BUS is not set
+# CONFIG_PCCARD is not set
+# CONFIG_RAPIDIO is not set
+
+#
+# Generic Driver Options
+#
+CONFIG_AUXILIARY_BUS=y
+CONFIG_UEVENT_HELPER=y
+CONFIG_UEVENT_HELPER_PATH=""
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DEVTMPFS_SAFE=y
+CONFIG_STANDALONE=y
+CONFIG_PREVENT_FIRMWARE_BUILD=y
+
+#
+# Firmware loader
+#
+CONFIG_FW_LOADER=y
+CONFIG_EXTRA_FIRMWARE=""
+# CONFIG_FW_LOADER_USER_HELPER is not set
+# CONFIG_FW_LOADER_COMPRESS is not set
+CONFIG_FW_CACHE=y
+# CONFIG_FW_UPLOAD is not set
+# end of Firmware loader
+
+CONFIG_ALLOW_DEV_COREDUMP=y
+# CONFIG_DEBUG_DRIVER is not set
+CONFIG_DEBUG_DEVRES=y
+# CONFIG_DEBUG_TEST_DRIVER_REMOVE is not set
+# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
+CONFIG_SYS_HYPERVISOR=y
+CONFIG_GENERIC_CPU_AUTOPROBE=y
+CONFIG_GENERIC_CPU_VULNERABILITIES=y
+CONFIG_DMA_SHARED_BUFFER=y
+# CONFIG_DMA_FENCE_TRACE is not set
+# end of Generic Driver Options
+
+#
+# Bus devices
+#
+# CONFIG_MHI_BUS is not set
+# CONFIG_MHI_BUS_EP is not set
+# end of Bus devices
+
+CONFIG_CONNECTOR=y
+CONFIG_PROC_EVENTS=y
+
+#
+# Firmware Drivers
+#
+
+#
+# ARM System Control and Management Interface Protocol
+#
+# end of ARM System Control and Management Interface Protocol
+
+# CONFIG_EDD is not set
+# CONFIG_FIRMWARE_MEMMAP is not set
+CONFIG_DMIID=y
+# CONFIG_DMI_SYSFS is not set
+CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK=y
+# CONFIG_ISCSI_IBFT is not set
+# CONFIG_FW_CFG_SYSFS is not set
+# CONFIG_SYSFB_SIMPLEFB is not set
+# CONFIG_GOOGLE_FIRMWARE is not set
+
+#
+# EFI (Extensible Firmware Interface) Support
+#
+CONFIG_EFI_ESRT=y
+CONFIG_EFI_VARS_PSTORE=y
+# CONFIG_EFI_VARS_PSTORE_DEFAULT_DISABLE is not set
+CONFIG_EFI_RUNTIME_MAP=y
+# CONFIG_EFI_FAKE_MEMMAP is not set
+CONFIG_EFI_DXE_MEM_ATTRIBUTES=y
+CONFIG_EFI_RUNTIME_WRAPPERS=y
+CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER=y
+# CONFIG_EFI_BOOTLOADER_CONTROL is not set
+# CONFIG_EFI_CAPSULE_LOADER is not set
+# CONFIG_EFI_TEST is not set
+# CONFIG_APPLE_PROPERTIES is not set
+# CONFIG_RESET_ATTACK_MITIGATION is not set
+# CONFIG_EFI_RCI2_TABLE is not set
+# CONFIG_EFI_DISABLE_PCI_DMA is not set
+CONFIG_EFI_EARLYCON=y
+# CONFIG_EFI_CUSTOM_SSDT_OVERLAYS is not set
+# CONFIG_EFI_DISABLE_RUNTIME is not set
+# CONFIG_EFI_COCO_SECRET is not set
+CONFIG_UNACCEPTED_MEMORY=y
+# end of EFI (Extensible Firmware Interface) Support
+
+#
+# Tegra firmware driver
+#
+# end of Tegra firmware driver
+# end of Firmware Drivers
+
+# CONFIG_GNSS is not set
+# CONFIG_MTD is not set
+# CONFIG_OF is not set
+CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
+# CONFIG_PARPORT is not set
+CONFIG_PNP=y
+CONFIG_PNP_DEBUG_MESSAGES=y
+
+#
+# Protocols
+#
+CONFIG_PNPACPI=y
+CONFIG_BLK_DEV=y
+# CONFIG_BLK_DEV_NULL_BLK is not set
+# CONFIG_BLK_DEV_FD is not set
+CONFIG_CDROM=y
+# CONFIG_BLK_DEV_PCIESSD_MTIP32XX is not set
+CONFIG_ZRAM=m
+CONFIG_ZRAM_DEF_COMP_LZORLE=y
+# CONFIG_ZRAM_DEF_COMP_LZ4 is not set
+# CONFIG_ZRAM_DEF_COMP_LZO is not set
+CONFIG_ZRAM_DEF_COMP="lzo-rle"
+# CONFIG_ZRAM_WRITEBACK is not set
+# CONFIG_ZRAM_MEMORY_TRACKING is not set
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
+# CONFIG_BLK_DEV_DRBD is not set
+# CONFIG_BLK_DEV_NBD is not set
+# CONFIG_BLK_DEV_RAM is not set
+# CONFIG_CDROM_PKTCDVD is not set
+# CONFIG_ATA_OVER_ETH is not set
+CONFIG_XEN_BLKDEV_FRONTEND=y
+CONFIG_XEN_BLKDEV_BACKEND=m
+CONFIG_VIRTIO_BLK=m
+# CONFIG_BLK_DEV_RBD is not set
+# CONFIG_BLK_DEV_UBLK is not set
+
+#
+# NVME Support
+#
+CONFIG_NVME_CORE=y
+CONFIG_BLK_DEV_NVME=y
+# CONFIG_NVME_MULTIPATH is not set
+# CONFIG_NVME_VERBOSE_ERRORS is not set
+# CONFIG_NVME_FC is not set
+# CONFIG_NVME_TCP is not set
+# CONFIG_NVME_AUTH is not set
+# CONFIG_NVME_TARGET is not set
+# end of NVME Support
+
+#
+# Misc devices
+#
+# CONFIG_DUMMY_IRQ is not set
+# CONFIG_IBM_ASM is not set
+# CONFIG_PHANTOM is not set
+# CONFIG_TIFM_CORE is not set
+# CONFIG_ENCLOSURE_SERVICES is not set
+# CONFIG_HP_ILO is not set
+CONFIG_VMWARE_BALLOON=y
+# CONFIG_SRAM is not set
+# CONFIG_DW_XDATA_PCIE is not set
+# CONFIG_PCI_ENDPOINT_TEST is not set
+# CONFIG_XILINX_SDFEC is not set
+# CONFIG_C2PORT is not set
+
+#
+# EEPROM support
+#
+CONFIG_EEPROM_93CX6=m
+# end of EEPROM support
+
+# CONFIG_CB710_CORE is not set
+
+#
+# Texas Instruments shared transport line discipline
+#
+# end of Texas Instruments shared transport line discipline
+
+#
+# Altera FPGA firmware download module (requires I2C)
+#
+# CONFIG_INTEL_MEI is not set
+# CONFIG_INTEL_MEI_ME is not set
+# CONFIG_INTEL_MEI_TXE is not set
+CONFIG_VMWARE_VMCI=y
+# CONFIG_GENWQE is not set
+# CONFIG_ECHO is not set
+# CONFIG_BCM_VK is not set
+# CONFIG_MISC_ALCOR_PCI is not set
+# CONFIG_MISC_RTSX_PCI is not set
+# CONFIG_HABANA_AI is not set
+# CONFIG_UACCE is not set
+CONFIG_PVPANIC=y
+CONFIG_PVPANIC_MMIO=y
+# CONFIG_PVPANIC_PCI is not set
+# end of Misc devices
+
+#
+# SCSI device support
+#
+CONFIG_SCSI_MOD=y
+# CONFIG_RAID_ATTRS is not set
+CONFIG_SCSI_COMMON=y
+CONFIG_SCSI=y
+CONFIG_SCSI_DMA=y
+CONFIG_SCSI_PROC_FS=y
+
+#
+# SCSI support type (disk, tape, CD-ROM)
+#
+CONFIG_BLK_DEV_SD=y
+# CONFIG_CHR_DEV_ST is not set
+CONFIG_BLK_DEV_SR=y
+# CONFIG_CHR_DEV_SG is not set
+CONFIG_BLK_DEV_BSG=y
+# CONFIG_CHR_DEV_SCH is not set
+CONFIG_SCSI_CONSTANTS=y
+# CONFIG_SCSI_LOGGING is not set
+# CONFIG_SCSI_SCAN_ASYNC is not set
+
+#
+# SCSI Transports
+#
+CONFIG_SCSI_SPI_ATTRS=y
+# CONFIG_SCSI_FC_ATTRS is not set
+CONFIG_SCSI_ISCSI_ATTRS=m
+# CONFIG_SCSI_SAS_ATTRS is not set
+# CONFIG_SCSI_SAS_LIBSAS is not set
+# CONFIG_SCSI_SRP_ATTRS is not set
+# end of SCSI Transports
+
+CONFIG_SCSI_LOWLEVEL=y
+CONFIG_ISCSI_TCP=m
+# CONFIG_ISCSI_BOOT_SYSFS is not set
+# CONFIG_SCSI_CXGB3_ISCSI is not set
+# CONFIG_SCSI_CXGB4_ISCSI is not set
+# CONFIG_SCSI_BNX2_ISCSI is not set
+# CONFIG_BE2ISCSI is not set
+# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
+# CONFIG_SCSI_HPSA is not set
+# CONFIG_SCSI_3W_9XXX is not set
+# CONFIG_SCSI_3W_SAS is not set
+# CONFIG_SCSI_ACARD is not set
+# CONFIG_SCSI_AACRAID is not set
+# CONFIG_SCSI_AIC7XXX is not set
+# CONFIG_SCSI_AIC79XX is not set
+# CONFIG_SCSI_AIC94XX is not set
+# CONFIG_SCSI_MVSAS is not set
+# CONFIG_SCSI_MVUMI is not set
+# CONFIG_SCSI_ADVANSYS is not set
+# CONFIG_SCSI_ARCMSR is not set
+# CONFIG_SCSI_ESAS2R is not set
+# CONFIG_MEGARAID_NEWGEN is not set
+# CONFIG_MEGARAID_LEGACY is not set
+# CONFIG_MEGARAID_SAS is not set
+# CONFIG_SCSI_MPT3SAS is not set
+# CONFIG_SCSI_MPT2SAS is not set
+# CONFIG_SCSI_MPI3MR is not set
+# CONFIG_SCSI_SMARTPQI is not set
+# CONFIG_SCSI_HPTIOP is not set
+# CONFIG_SCSI_BUSLOGIC is not set
+# CONFIG_SCSI_MYRB is not set
+# CONFIG_SCSI_MYRS is not set
+CONFIG_VMWARE_PVSCSI=y
+CONFIG_XEN_SCSI_FRONTEND=m
+CONFIG_HYPERV_STORAGE=y
+# CONFIG_SCSI_SNIC is not set
+# CONFIG_SCSI_DMX3191D is not set
+# CONFIG_SCSI_FDOMAIN_PCI is not set
+# CONFIG_SCSI_ISCI is not set
+# CONFIG_SCSI_IPS is not set
+# CONFIG_SCSI_INITIO is not set
+# CONFIG_SCSI_INIA100 is not set
+# CONFIG_SCSI_STEX is not set
+# CONFIG_SCSI_SYM53C8XX_2 is not set
+# CONFIG_SCSI_IPR is not set
+# CONFIG_SCSI_QLOGIC_1280 is not set
+# CONFIG_SCSI_QLA_ISCSI is not set
+# CONFIG_SCSI_DC395x is not set
+# CONFIG_SCSI_AM53C974 is not set
+# CONFIG_SCSI_WD719X is not set
+# CONFIG_SCSI_DEBUG is not set
+# CONFIG_SCSI_PMCRAID is not set
+# CONFIG_SCSI_PM8001 is not set
+CONFIG_SCSI_VIRTIO=y
+# CONFIG_SCSI_DH is not set
+# end of SCSI device support
+
+CONFIG_ATA=y
+CONFIG_SATA_HOST=y
+CONFIG_PATA_TIMINGS=y
+CONFIG_ATA_VERBOSE_ERROR=y
+CONFIG_ATA_FORCE=y
+CONFIG_ATA_ACPI=y
+# CONFIG_SATA_ZPODD is not set
+# CONFIG_SATA_PMP is not set
+
+#
+# Controllers with non-SFF native interface
+#
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_MOBILE_LPM_POLICY=0
+# CONFIG_SATA_AHCI_PLATFORM is not set
+# CONFIG_AHCI_DWC is not set
+# CONFIG_SATA_INIC162X is not set
+# CONFIG_SATA_ACARD_AHCI is not set
+# CONFIG_SATA_SIL24 is not set
+CONFIG_ATA_SFF=y
+
+#
+# SFF controllers with custom DMA interface
+#
+# CONFIG_PDC_ADMA is not set
+# CONFIG_SATA_QSTOR is not set
+# CONFIG_SATA_SX4 is not set
+CONFIG_ATA_BMDMA=y
+
+#
+# SATA SFF controllers with BMDMA
+#
+CONFIG_ATA_PIIX=y
+# CONFIG_SATA_MV is not set
+# CONFIG_SATA_NV is not set
+# CONFIG_SATA_PROMISE is not set
+# CONFIG_SATA_SIL is not set
+# CONFIG_SATA_SIS is not set
+# CONFIG_SATA_SVW is not set
+# CONFIG_SATA_ULI is not set
+# CONFIG_SATA_VIA is not set
+# CONFIG_SATA_VITESSE is not set
+
+#
+# PATA SFF controllers with BMDMA
+#
+# CONFIG_PATA_ALI is not set
+# CONFIG_PATA_AMD is not set
+# CONFIG_PATA_ARTOP is not set
+# CONFIG_PATA_ATIIXP is not set
+# CONFIG_PATA_ATP867X is not set
+# CONFIG_PATA_CMD64X is not set
+# CONFIG_PATA_CYPRESS is not set
+# CONFIG_PATA_EFAR is not set
+# CONFIG_PATA_HPT366 is not set
+# CONFIG_PATA_HPT37X is not set
+# CONFIG_PATA_HPT3X2N is not set
+# CONFIG_PATA_HPT3X3 is not set
+# CONFIG_PATA_IT8213 is not set
+# CONFIG_PATA_IT821X is not set
+# CONFIG_PATA_JMICRON is not set
+# CONFIG_PATA_MARVELL is not set
+# CONFIG_PATA_NETCELL is not set
+# CONFIG_PATA_NINJA32 is not set
+# CONFIG_PATA_NS87415 is not set
+# CONFIG_PATA_OLDPIIX is not set
+# CONFIG_PATA_OPTIDMA is not set
+# CONFIG_PATA_PDC2027X is not set
+# CONFIG_PATA_PDC_OLD is not set
+# CONFIG_PATA_RADISYS is not set
+# CONFIG_PATA_RDC is not set
+# CONFIG_PATA_SCH is not set
+# CONFIG_PATA_SERVERWORKS is not set
+# CONFIG_PATA_SIL680 is not set
+# CONFIG_PATA_SIS is not set
+# CONFIG_PATA_TOSHIBA is not set
+# CONFIG_PATA_TRIFLEX is not set
+# CONFIG_PATA_VIA is not set
+# CONFIG_PATA_WINBOND is not set
+
+#
+# PIO-only SFF controllers
+#
+# CONFIG_PATA_CMD640_PCI is not set
+# CONFIG_PATA_MPIIX is not set
+# CONFIG_PATA_NS87410 is not set
+# CONFIG_PATA_OPTI is not set
+# CONFIG_PATA_RZ1000 is not set
+
+#
+# Generic fallback / legacy drivers
+#
+# CONFIG_PATA_ACPI is not set
+CONFIG_ATA_GENERIC=y
+# CONFIG_PATA_LEGACY is not set
+CONFIG_MD=y
+CONFIG_BLK_DEV_MD=y
+CONFIG_MD_AUTODETECT=y
+# CONFIG_MD_LINEAR is not set
+CONFIG_MD_RAID0=y
+CONFIG_MD_RAID1=m
+CONFIG_MD_RAID10=m
+CONFIG_MD_RAID456=m
+# CONFIG_MD_MULTIPATH is not set
+# CONFIG_MD_FAULTY is not set
+CONFIG_BCACHE=m
+# CONFIG_BCACHE_DEBUG is not set
+# CONFIG_BCACHE_CLOSURES_DEBUG is not set
+# CONFIG_BCACHE_ASYNC_REGISTRATION is not set
+CONFIG_BLK_DEV_DM_BUILTIN=y
+CONFIG_BLK_DEV_DM=y
+CONFIG_DM_DEBUG=y
+CONFIG_DM_BUFIO=y
+# CONFIG_DM_DEBUG_BLOCK_MANAGER_LOCKING is not set
+CONFIG_DM_BIO_PRISON=m
+CONFIG_DM_PERSISTENT_DATA=m
+# CONFIG_DM_UNSTRIPED is not set
+CONFIG_DM_CRYPT=y
+CONFIG_DM_SNAPSHOT=m
+CONFIG_DM_THIN_PROVISIONING=m
+CONFIG_DM_CACHE=m
+CONFIG_DM_CACHE_SMQ=m
+CONFIG_DM_WRITECACHE=m
+# CONFIG_DM_EBS is not set
+# CONFIG_DM_ERA is not set
+# CONFIG_DM_CLONE is not set
+# CONFIG_DM_MIRROR is not set
+CONFIG_DM_RAID=m
+# CONFIG_DM_ZERO is not set
+CONFIG_DM_MULTIPATH=m
+CONFIG_DM_MULTIPATH_QL=m
+CONFIG_DM_MULTIPATH_ST=m
+CONFIG_DM_MULTIPATH_HST=m
+# CONFIG_DM_MULTIPATH_IOA is not set
+# CONFIG_DM_DELAY is not set
+# CONFIG_DM_DUST is not set
+CONFIG_DM_INIT=y
+# CONFIG_DM_UEVENT is not set
+# CONFIG_DM_FLAKEY is not set
+CONFIG_DM_VERITY=y
+# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG is not set
+# CONFIG_DM_VERITY_FEC is not set
+# CONFIG_DM_SWITCH is not set
+# CONFIG_DM_LOG_WRITES is not set
+CONFIG_DM_INTEGRITY=y
+CONFIG_DM_AUDIT=y
+CONFIG_TARGET_CORE=m
+CONFIG_TCM_IBLOCK=m
+CONFIG_TCM_FILEIO=m
+# CONFIG_TCM_PSCSI is not set
+CONFIG_TCM_USER2=m
+CONFIG_LOOPBACK_TARGET=m
+# CONFIG_ISCSI_TARGET is not set
+# CONFIG_FUSION is not set
+
+#
+# IEEE 1394 (FireWire) support
+#
+# CONFIG_FIREWIRE is not set
+# CONFIG_FIREWIRE_NOSY is not set
+# end of IEEE 1394 (FireWire) support
+
+# CONFIG_MACINTOSH_DRIVERS is not set
+CONFIG_NETDEVICES=y
+CONFIG_NET_CORE=y
+CONFIG_BONDING=m
+CONFIG_DUMMY=m
+CONFIG_WIREGUARD=m
+# CONFIG_WIREGUARD_DEBUG is not set
+# CONFIG_EQUALIZER is not set
+# CONFIG_NET_FC is not set
+CONFIG_IFB=m
+# CONFIG_NET_TEAM is not set
+CONFIG_MACVLAN=y
+# CONFIG_MACVTAP is not set
+CONFIG_IPVLAN_L3S=y
+CONFIG_IPVLAN=m
+# CONFIG_IPVTAP is not set
+CONFIG_VXLAN=m
+CONFIG_GENEVE=m
+# CONFIG_BAREUDP is not set
+# CONFIG_GTP is not set
+# CONFIG_AMT is not set
+# CONFIG_MACSEC is not set
+# CONFIG_NETCONSOLE is not set
+CONFIG_TUN=m
+# CONFIG_TUN_VNET_CROSS_LE is not set
+CONFIG_VETH=m
+CONFIG_VIRTIO_NET=y
+# CONFIG_NLMON is not set
+CONFIG_NET_VRF=m
+CONFIG_VSOCKMON=m
+# CONFIG_ARCNET is not set
+CONFIG_ETHERNET=y
+# CONFIG_NET_VENDOR_3COM is not set
+# CONFIG_NET_VENDOR_ADAPTEC is not set
+# CONFIG_NET_VENDOR_AGERE is not set
+# CONFIG_NET_VENDOR_ALACRITECH is not set
+# CONFIG_NET_VENDOR_ALTEON is not set
+# CONFIG_ALTERA_TSE is not set
+CONFIG_NET_VENDOR_AMAZON=y
+CONFIG_ENA_ETHERNET=y
+# CONFIG_NET_VENDOR_AMD is not set
+# CONFIG_NET_VENDOR_AQUANTIA is not set
+# CONFIG_NET_VENDOR_ARC is not set
+CONFIG_NET_VENDOR_ASIX=y
+# CONFIG_NET_VENDOR_ATHEROS is not set
+# CONFIG_CX_ECAT is not set
+# CONFIG_NET_VENDOR_BROADCOM is not set
+# CONFIG_NET_VENDOR_CADENCE is not set
+# CONFIG_NET_VENDOR_CAVIUM is not set
+# CONFIG_NET_VENDOR_CHELSIO is not set
+# CONFIG_NET_VENDOR_CISCO is not set
+# CONFIG_NET_VENDOR_CORTINA is not set
+CONFIG_NET_VENDOR_DAVICOM=y
+# CONFIG_DNET is not set
+# CONFIG_NET_VENDOR_DEC is not set
+# CONFIG_NET_VENDOR_DLINK is not set
+# CONFIG_NET_VENDOR_EMULEX is not set
+CONFIG_NET_VENDOR_ENGLEDER=y
+# CONFIG_TSNEP is not set
+# CONFIG_NET_VENDOR_EZCHIP is not set
+CONFIG_NET_VENDOR_FUNGIBLE=y
+# CONFIG_FUN_ETH is not set
+CONFIG_NET_VENDOR_GOOGLE=y
+CONFIG_GVE=m
+# CONFIG_NET_VENDOR_HUAWEI is not set
+CONFIG_NET_VENDOR_I825XX=y
+CONFIG_NET_VENDOR_INTEL=y
+# CONFIG_E100 is not set
+# CONFIG_E1000 is not set
+# CONFIG_E1000E is not set
+# CONFIG_IGB is not set
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
+CONFIG_IXGBEVF=y
+# CONFIG_I40E is not set
+# CONFIG_I40EVF is not set
+# CONFIG_ICE is not set
+# CONFIG_FM10K is not set
+# CONFIG_IGC is not set
+CONFIG_NET_VENDOR_WANGXUN=y
+# CONFIG_NGBE is not set
+# CONFIG_TXGBE is not set
+# CONFIG_JME is not set
+CONFIG_NET_VENDOR_LITEX=y
+# CONFIG_NET_VENDOR_MARVELL is not set
+CONFIG_NET_VENDOR_MELLANOX=y
+CONFIG_MLX4_EN=m
+CONFIG_MLX4_CORE=m
+CONFIG_MLX4_DEBUG=y
+CONFIG_MLX4_CORE_GEN2=y
+CONFIG_MLX5_CORE=m
+CONFIG_MLX5_FPGA=y
+CONFIG_MLX5_CORE_EN=y
+CONFIG_MLX5_EN_ARFS=y
+CONFIG_MLX5_EN_RXNFC=y
+CONFIG_MLX5_MPFS=y
+# CONFIG_MLX5_CORE_IPOIB is not set
+# CONFIG_MLX5_EN_TLS is not set
+# CONFIG_MLX5_SF is not set
+CONFIG_MLXSW_CORE=m
+CONFIG_MLXSW_CORE_THERMAL=y
+CONFIG_MLXSW_PCI=m
+CONFIG_MLXFW=m
+# CONFIG_NET_VENDOR_MICREL is not set
+CONFIG_NET_VENDOR_MICROCHIP=y
+# CONFIG_LAN743X is not set
+# CONFIG_NET_VENDOR_MICROSEMI is not set
+CONFIG_NET_VENDOR_MICROSOFT=y
+# CONFIG_MICROSOFT_MANA is not set
+# CONFIG_NET_VENDOR_MYRI is not set
+# CONFIG_FEALNX is not set
+# CONFIG_NET_VENDOR_NI is not set
+# CONFIG_NET_VENDOR_NATSEMI is not set
+# CONFIG_NET_VENDOR_NETERION is not set
+# CONFIG_NET_VENDOR_NETRONOME is not set
+# CONFIG_NET_VENDOR_NVIDIA is not set
+# CONFIG_NET_VENDOR_OKI is not set
+# CONFIG_ETHOC is not set
+# CONFIG_NET_VENDOR_PACKET_ENGINES is not set
+CONFIG_NET_VENDOR_PENSANDO=y
+# CONFIG_IONIC is not set
+# CONFIG_NET_VENDOR_QLOGIC is not set
+# CONFIG_NET_VENDOR_BROCADE is not set
+# CONFIG_NET_VENDOR_QUALCOMM is not set
+# CONFIG_NET_VENDOR_RDC is not set
+# CONFIG_NET_VENDOR_REALTEK is not set
+# CONFIG_NET_VENDOR_RENESAS is not set
+# CONFIG_NET_VENDOR_ROCKER is not set
+# CONFIG_NET_VENDOR_SAMSUNG is not set
+# CONFIG_NET_VENDOR_SEEQ is not set
+# CONFIG_NET_VENDOR_SILAN is not set
+# CONFIG_NET_VENDOR_SIS is not set
+# CONFIG_NET_VENDOR_SOLARFLARE is not set
+# CONFIG_NET_VENDOR_SMSC is not set
+# CONFIG_NET_VENDOR_SOCIONEXT is not set
+# CONFIG_NET_VENDOR_STMICRO is not set
+# CONFIG_NET_VENDOR_SUN is not set
+# CONFIG_NET_VENDOR_SYNOPSYS is not set
+# CONFIG_NET_VENDOR_TEHUTI is not set
+# CONFIG_NET_VENDOR_TI is not set
+CONFIG_NET_VENDOR_VERTEXCOM=y
+# CONFIG_NET_VENDOR_VIA is not set
+# CONFIG_NET_VENDOR_WIZNET is not set
+CONFIG_NET_VENDOR_XILINX=y
+# CONFIG_XILINX_EMACLITE is not set
+# CONFIG_XILINX_AXI_EMAC is not set
+# CONFIG_XILINX_LL_TEMAC is not set
+# CONFIG_FDDI is not set
+# CONFIG_HIPPI is not set
+# CONFIG_NET_SB1000 is not set
+CONFIG_PHYLIB=y
+CONFIG_SWPHY=y
+CONFIG_FIXED_PHY=y
+
+#
+# MII PHY device drivers
+#
+# CONFIG_AMD_PHY is not set
+# CONFIG_ADIN_PHY is not set
+# CONFIG_ADIN1100_PHY is not set
+# CONFIG_AQUANTIA_PHY is not set
+# CONFIG_AX88796B_PHY is not set
+# CONFIG_BROADCOM_PHY is not set
+# CONFIG_BCM54140_PHY is not set
+# CONFIG_BCM7XXX_PHY is not set
+# CONFIG_BCM84881_PHY is not set
+# CONFIG_BCM87XX_PHY is not set
+# CONFIG_CICADA_PHY is not set
+# CONFIG_CORTINA_PHY is not set
+# CONFIG_DAVICOM_PHY is not set
+# CONFIG_ICPLUS_PHY is not set
+# CONFIG_LXT_PHY is not set
+# CONFIG_INTEL_XWAY_PHY is not set
+# CONFIG_LSI_ET1011C_PHY is not set
+# CONFIG_MARVELL_PHY is not set
+# CONFIG_MARVELL_10G_PHY is not set
+# CONFIG_MARVELL_88X2222_PHY is not set
+# CONFIG_MAXLINEAR_GPHY is not set
+# CONFIG_MEDIATEK_GE_PHY is not set
+# CONFIG_MICREL_PHY is not set
+# CONFIG_MICROCHIP_PHY is not set
+# CONFIG_MICROCHIP_T1_PHY is not set
+# CONFIG_MICROSEMI_PHY is not set
+# CONFIG_MOTORCOMM_PHY is not set
+# CONFIG_NATIONAL_PHY is not set
+# CONFIG_NXP_C45_TJA11XX_PHY is not set
+# CONFIG_QSEMI_PHY is not set
+# CONFIG_REALTEK_PHY is not set
+# CONFIG_RENESAS_PHY is not set
+# CONFIG_ROCKCHIP_PHY is not set
+# CONFIG_SMSC_PHY is not set
+# CONFIG_STE10XP is not set
+# CONFIG_TERANETICS_PHY is not set
+# CONFIG_DP83822_PHY is not set
+# CONFIG_DP83TC811_PHY is not set
+# CONFIG_DP83848_PHY is not set
+# CONFIG_DP83867_PHY is not set
+# CONFIG_DP83869_PHY is not set
+# CONFIG_DP83TD510_PHY is not set
+# CONFIG_VITESSE_PHY is not set
+# CONFIG_XILINX_GMII2RGMII is not set
+# CONFIG_PSE_CONTROLLER is not set
+CONFIG_MDIO_DEVICE=y
+CONFIG_MDIO_BUS=y
+CONFIG_FWNODE_MDIO=y
+CONFIG_ACPI_MDIO=y
+CONFIG_MDIO_DEVRES=y
+# CONFIG_MDIO_BITBANG is not set
+# CONFIG_MDIO_BCM_UNIMAC is not set
+# CONFIG_MDIO_THUNDER is not set
+
+#
+# MDIO Multiplexers
+#
+
+#
+# PCS device drivers
+#
+# end of PCS device drivers
+
+CONFIG_PPP=m
+# CONFIG_PPP_BSDCOMP is not set
+# CONFIG_PPP_DEFLATE is not set
+# CONFIG_PPP_FILTER is not set
+# CONFIG_PPP_MPPE is not set
+# CONFIG_PPP_MULTILINK is not set
+# CONFIG_PPPOE is not set
+# CONFIG_PPTP is not set
+CONFIG_PPP_ASYNC=m
+# CONFIG_PPP_SYNC_TTY is not set
+# CONFIG_SLIP is not set
+CONFIG_SLHC=m
+
+#
+# Host-side USB support is needed for USB Network Adapter support
+#
+# CONFIG_WLAN is not set
+# CONFIG_WAN is not set
+
+#
+# Wireless WAN
+#
+# CONFIG_WWAN is not set
+# end of Wireless WAN
+
+CONFIG_XEN_NETDEV_FRONTEND=y
+CONFIG_XEN_NETDEV_BACKEND=m
+CONFIG_VMXNET3=y
+# CONFIG_FUJITSU_ES is not set
+CONFIG_HYPERV_NET=y
+# CONFIG_NETDEVSIM is not set
+CONFIG_NET_FAILOVER=y
+# CONFIG_ISDN is not set
+
+#
+# Input device support
+#
+CONFIG_INPUT=y
+CONFIG_INPUT_FF_MEMLESS=y
+CONFIG_INPUT_SPARSEKMAP=m
+# CONFIG_INPUT_MATRIXKMAP is not set
+CONFIG_INPUT_VIVALDIFMAP=y
+
+#
+# Userland interfaces
+#
+# CONFIG_INPUT_MOUSEDEV is not set
+# CONFIG_INPUT_JOYDEV is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_INPUT_EVBUG is not set
+
+#
+# Input Device Drivers
+#
+CONFIG_INPUT_KEYBOARD=y
+CONFIG_KEYBOARD_ATKBD=y
+# CONFIG_KEYBOARD_LKKBD is not set
+# CONFIG_KEYBOARD_NEWTON is not set
+# CONFIG_KEYBOARD_OPENCORES is not set
+# CONFIG_KEYBOARD_SAMSUNG is not set
+# CONFIG_KEYBOARD_STOWAWAY is not set
+# CONFIG_KEYBOARD_SUNKBD is not set
+# CONFIG_KEYBOARD_XTKBD is not set
+# CONFIG_INPUT_MOUSE is not set
+# CONFIG_INPUT_JOYSTICK is not set
+# CONFIG_INPUT_TABLET is not set
+# CONFIG_INPUT_TOUCHSCREEN is not set
+CONFIG_INPUT_MISC=y
+# CONFIG_INPUT_AD714X is not set
+# CONFIG_INPUT_E3X0_BUTTON is not set
+# CONFIG_INPUT_ATLAS_BTNS is not set
+CONFIG_INPUT_UINPUT=m
+# CONFIG_INPUT_ADXL34X is not set
+# CONFIG_INPUT_CMA3000 is not set
+CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
+# CONFIG_INPUT_IDEAPAD_SLIDEBAR is not set
+# CONFIG_RMI4_CORE is not set
+
+#
+# Hardware I/O ports
+#
+CONFIG_SERIO=y
+CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
+CONFIG_SERIO_I8042=y
+CONFIG_SERIO_SERPORT=m
+CONFIG_SERIO_CT82C710=m
+CONFIG_SERIO_PCIPS2=m
+CONFIG_SERIO_LIBPS2=y
+CONFIG_SERIO_RAW=y
+# CONFIG_SERIO_ALTERA_PS2 is not set
+# CONFIG_SERIO_PS2MULT is not set
+# CONFIG_SERIO_ARC_PS2 is not set
+CONFIG_HYPERV_KEYBOARD=y
+# CONFIG_USERIO is not set
+# CONFIG_GAMEPORT is not set
+# end of Hardware I/O ports
+# end of Input device support
+
+#
+# Character devices
+#
+CONFIG_TTY=y
+CONFIG_VT=y
+CONFIG_CONSOLE_TRANSLATIONS=y
+CONFIG_VT_CONSOLE=y
+CONFIG_VT_CONSOLE_SLEEP=y
+CONFIG_HW_CONSOLE=y
+# CONFIG_VT_HW_CONSOLE_BINDING is not set
+CONFIG_UNIX98_PTYS=y
+# CONFIG_LEGACY_PTYS is not set
+CONFIG_LDISC_AUTOLOAD=y
+
+#
+# Serial drivers
+#
+CONFIG_SERIAL_EARLYCON=y
+CONFIG_SERIAL_8250=y
+# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
+CONFIG_SERIAL_8250_PNP=y
+# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
+# CONFIG_SERIAL_8250_FINTEK is not set
+CONFIG_SERIAL_8250_CONSOLE=y
+# CONFIG_SERIAL_8250_PCI is not set
+CONFIG_SERIAL_8250_NR_UARTS=4
+CONFIG_SERIAL_8250_RUNTIME_UARTS=4
+# CONFIG_SERIAL_8250_EXTENDED is not set
+# CONFIG_SERIAL_8250_DW is not set
+# CONFIG_SERIAL_8250_RT288X is not set
+# CONFIG_SERIAL_8250_LPSS is not set
+# CONFIG_SERIAL_8250_MID is not set
+CONFIG_SERIAL_8250_PERICOM=y
+
+#
+# Non-8250 serial port support
+#
+# CONFIG_SERIAL_UARTLITE is not set
+CONFIG_SERIAL_CORE=y
+CONFIG_SERIAL_CORE_CONSOLE=y
+# CONFIG_SERIAL_JSM is not set
+# CONFIG_SERIAL_LANTIQ is not set
+# CONFIG_SERIAL_SCCNXP is not set
+# CONFIG_SERIAL_ALTERA_JTAGUART is not set
+# CONFIG_SERIAL_ALTERA_UART is not set
+# CONFIG_SERIAL_ARC is not set
+# CONFIG_SERIAL_RP2 is not set
+# CONFIG_SERIAL_FSL_LPUART is not set
+# CONFIG_SERIAL_FSL_LINFLEXUART is not set
+# CONFIG_SERIAL_SPRD is not set
+# end of Serial drivers
+
+# CONFIG_SERIAL_NONSTANDARD is not set
+# CONFIG_N_GSM is not set
+# CONFIG_NOZOMI is not set
+# CONFIG_NULL_TTY is not set
+CONFIG_HVC_DRIVER=y
+CONFIG_HVC_IRQ=y
+CONFIG_HVC_XEN=y
+CONFIG_HVC_XEN_FRONTEND=y
+# CONFIG_SERIAL_DEV_BUS is not set
+# CONFIG_TTY_PRINTK is not set
+# CONFIG_VIRTIO_CONSOLE is not set
+# CONFIG_IPMI_HANDLER is not set
+CONFIG_HW_RANDOM=y
+# CONFIG_HW_RANDOM_TIMERIOMEM is not set
+# CONFIG_HW_RANDOM_INTEL is not set
+# CONFIG_HW_RANDOM_AMD is not set
+# CONFIG_HW_RANDOM_BA431 is not set
+# CONFIG_HW_RANDOM_VIA is not set
+CONFIG_HW_RANDOM_VIRTIO=y
+# CONFIG_HW_RANDOM_XIPHERA is not set
+# CONFIG_APPLICOM is not set
+# CONFIG_MWAVE is not set
+CONFIG_DEVMEM=y
+CONFIG_NVRAM=y
+CONFIG_DEVPORT=y
+CONFIG_HPET=y
+# CONFIG_HPET_MMAP is not set
+# CONFIG_HANGCHECK_TIMER is not set
+CONFIG_TCG_TPM=y
+# CONFIG_HW_RANDOM_TPM is not set
+CONFIG_TCG_TIS_CORE=y
+CONFIG_TCG_TIS=y
+# CONFIG_TCG_NSC is not set
+# CONFIG_TCG_ATMEL is not set
+# CONFIG_TCG_INFINEON is not set
+CONFIG_TCG_XEN=m
+CONFIG_TCG_CRB=y
+# CONFIG_TCG_VTPM_PROXY is not set
+# CONFIG_TELCLOCK is not set
+# CONFIG_XILLYBUS is not set
+CONFIG_RANDOM_TRUST_CPU=y
+# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
+# end of Character devices
+
+#
+# I2C support
+#
+# CONFIG_I2C is not set
+# end of I2C support
+
+# CONFIG_I3C is not set
+# CONFIG_SPI is not set
+# CONFIG_SPMI is not set
+# CONFIG_HSI is not set
+CONFIG_PPS=y
+# CONFIG_PPS_DEBUG is not set
+
+#
+# PPS clients support
+#
+# CONFIG_PPS_CLIENT_KTIMER is not set
+# CONFIG_PPS_CLIENT_LDISC is not set
+# CONFIG_PPS_CLIENT_GPIO is not set
+
+#
+# PPS generators support
+#
+
+#
+# PTP clock support
+#
+CONFIG_PTP_1588_CLOCK=y
+CONFIG_PTP_1588_CLOCK_OPTIONAL=y
+
+#
+# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
+#
+CONFIG_PTP_1588_CLOCK_KVM=m
+# CONFIG_PTP_1588_CLOCK_VMW is not set
+# end of PTP clock support
+
+# CONFIG_PINCTRL is not set
+# CONFIG_GPIOLIB is not set
+# CONFIG_W1 is not set
+# CONFIG_POWER_RESET is not set
+# CONFIG_POWER_SUPPLY is not set
+# CONFIG_HWMON is not set
+CONFIG_THERMAL=y
+# CONFIG_THERMAL_NETLINK is not set
+# CONFIG_THERMAL_STATISTICS is not set
+CONFIG_THERMAL_EMERGENCY_POWEROFF_DELAY_MS=0
+CONFIG_THERMAL_WRITABLE_TRIPS=y
+CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE=y
+# CONFIG_THERMAL_DEFAULT_GOV_FAIR_SHARE is not set
+# CONFIG_THERMAL_DEFAULT_GOV_USER_SPACE is not set
+# CONFIG_THERMAL_GOV_FAIR_SHARE is not set
+CONFIG_THERMAL_GOV_STEP_WISE=y
+# CONFIG_THERMAL_GOV_BANG_BANG is not set
+CONFIG_THERMAL_GOV_USER_SPACE=y
+# CONFIG_THERMAL_EMULATION is not set
+
+#
+# Intel thermal drivers
+#
+# CONFIG_INTEL_POWERCLAMP is not set
+CONFIG_X86_THERMAL_VECTOR=y
+CONFIG_X86_PKG_TEMP_THERMAL=m
+# CONFIG_INTEL_SOC_DTS_THERMAL is not set
+
+#
+# ACPI INT340X thermal drivers
+#
+# CONFIG_INT340X_THERMAL is not set
+# end of ACPI INT340X thermal drivers
+
+# CONFIG_INTEL_PCH_THERMAL is not set
+# CONFIG_INTEL_TCC_COOLING is not set
+# CONFIG_INTEL_MENLOW is not set
+# CONFIG_INTEL_HFI_THERMAL is not set
+# end of Intel thermal drivers
+
+CONFIG_WATCHDOG=y
+CONFIG_WATCHDOG_CORE=y
+# CONFIG_WATCHDOG_NOWAYOUT is not set
+CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED=y
+CONFIG_WATCHDOG_OPEN_TIMEOUT=0
+# CONFIG_WATCHDOG_SYSFS is not set
+# CONFIG_WATCHDOG_HRTIMER_PRETIMEOUT is not set
+
+#
+# Watchdog Pretimeout Governors
+#
+# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
+
+#
+# Watchdog Device Drivers
+#
+# CONFIG_SOFT_WATCHDOG is not set
+# CONFIG_WDAT_WDT is not set
+# CONFIG_XILINX_WATCHDOG is not set
+# CONFIG_MLX_WDT is not set
+# CONFIG_CADENCE_WATCHDOG is not set
+# CONFIG_DW_WATCHDOG is not set
+# CONFIG_MAX63XX_WATCHDOG is not set
+# CONFIG_ACQUIRE_WDT is not set
+# CONFIG_ADVANTECH_WDT is not set
+# CONFIG_ALIM1535_WDT is not set
+# CONFIG_ALIM7101_WDT is not set
+# CONFIG_EBC_C384_WDT is not set
+# CONFIG_EXAR_WDT is not set
+# CONFIG_F71808E_WDT is not set
+# CONFIG_SP5100_TCO is not set
+# CONFIG_SBC_FITPC2_WATCHDOG is not set
+# CONFIG_EUROTECH_WDT is not set
+# CONFIG_IB700_WDT is not set
+# CONFIG_IBMASR is not set
+# CONFIG_WAFER_WDT is not set
+# CONFIG_I6300ESB_WDT is not set
+# CONFIG_IE6XX_WDT is not set
+CONFIG_ITCO_WDT=y
+CONFIG_ITCO_VENDOR_SUPPORT=y
+# CONFIG_IT8712F_WDT is not set
+# CONFIG_IT87_WDT is not set
+# CONFIG_HP_WATCHDOG is not set
+# CONFIG_SC1200_WDT is not set
+# CONFIG_PC87413_WDT is not set
+# CONFIG_NV_TCO is not set
+# CONFIG_60XX_WDT is not set
+# CONFIG_CPU5_WDT is not set
+# CONFIG_SMSC_SCH311X_WDT is not set
+# CONFIG_SMSC37B787_WDT is not set
+# CONFIG_TQMX86_WDT is not set
+# CONFIG_VIA_WDT is not set
+# CONFIG_W83627HF_WDT is not set
+# CONFIG_W83877F_WDT is not set
+# CONFIG_W83977F_WDT is not set
+# CONFIG_MACHZ_WDT is not set
+# CONFIG_SBC_EPX_C3_WATCHDOG is not set
+# CONFIG_NI903X_WDT is not set
+# CONFIG_NIC7018_WDT is not set
+# CONFIG_XEN_WDT is not set
+
+#
+# PCI-based Watchdog Cards
+#
+# CONFIG_PCIPCWATCHDOG is not set
+# CONFIG_WDTPCI is not set
+CONFIG_SSB_POSSIBLE=y
+# CONFIG_SSB is not set
+CONFIG_BCMA_POSSIBLE=y
+# CONFIG_BCMA is not set
+
+#
+# Multifunction device drivers
+#
+CONFIG_MFD_CORE=y
+# CONFIG_MFD_MADERA is not set
+# CONFIG_HTC_PASIC3 is not set
+# CONFIG_MFD_INTEL_QUARK_I2C_GPIO is not set
+CONFIG_LPC_ICH=y
+CONFIG_LPC_SCH=m
+# CONFIG_MFD_INTEL_LPSS_ACPI is not set
+# CONFIG_MFD_INTEL_LPSS_PCI is not set
+# CONFIG_MFD_INTEL_PMC_BXT is not set
+# CONFIG_MFD_JANZ_CMODIO is not set
+# CONFIG_MFD_KEMPLD is not set
+# CONFIG_MFD_MT6397 is not set
+# CONFIG_MFD_RDC321X is not set
+# CONFIG_MFD_SM501 is not set
+# CONFIG_MFD_SYSCON is not set
+# CONFIG_MFD_TI_AM335X_TSCADC is not set
+# CONFIG_MFD_TQMX86 is not set
+# CONFIG_MFD_VX855 is not set
+# end of Multifunction device drivers
+
+# CONFIG_REGULATOR is not set
+# CONFIG_RC_CORE is not set
+
+#
+# CEC support
+#
+# CONFIG_MEDIA_CEC_SUPPORT is not set
+# end of CEC support
+
+# CONFIG_MEDIA_SUPPORT is not set
+
+#
+# Graphics support
+#
+CONFIG_APERTURE_HELPERS=y
+# CONFIG_AGP is not set
+# CONFIG_VGA_SWITCHEROO is not set
+# CONFIG_DRM is not set
+# CONFIG_DRM_DEBUG_MODESET_LOCK is not set
+
+#
+# ARM devices
+#
+# end of ARM devices
+
+#
+# Frame buffer Devices
+#
+# CONFIG_FB is not set
+# end of Frame buffer Devices
+
+#
+# Backlight & LCD device support
+#
+# CONFIG_LCD_CLASS_DEVICE is not set
+# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
+# end of Backlight & LCD device support
+
+#
+# Console display driver support
+#
+CONFIG_VGA_CONSOLE=y
+CONFIG_DUMMY_CONSOLE=y
+CONFIG_DUMMY_CONSOLE_COLUMNS=80
+CONFIG_DUMMY_CONSOLE_ROWS=25
+# end of Console display driver support
+# end of Graphics support
+
+# CONFIG_SOUND is not set
+
+#
+# HID support
+#
+# CONFIG_HID is not set
+
+#
+# Intel ISH HID support
+#
+# CONFIG_INTEL_ISH_HID is not set
+# end of Intel ISH HID support
+# end of HID support
+
+CONFIG_USB_OHCI_LITTLE_ENDIAN=y
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_MMC is not set
+# CONFIG_SCSI_UFSHCD is not set
+# CONFIG_MEMSTICK is not set
+# CONFIG_NEW_LEDS is not set
+# CONFIG_ACCESSIBILITY is not set
+# CONFIG_INFINIBAND is not set
+CONFIG_EDAC_ATOMIC_SCRUB=y
+CONFIG_EDAC_SUPPORT=y
+# CONFIG_EDAC is not set
+CONFIG_RTC_LIB=y
+CONFIG_RTC_MC146818_LIB=y
+CONFIG_RTC_CLASS=y
+# CONFIG_RTC_HCTOSYS is not set
+CONFIG_RTC_SYSTOHC=y
+CONFIG_RTC_SYSTOHC_DEVICE="rtc0"
+# CONFIG_RTC_DEBUG is not set
+CONFIG_RTC_NVMEM=y
+
+#
+# RTC interfaces
+#
+CONFIG_RTC_INTF_SYSFS=y
+CONFIG_RTC_INTF_PROC=y
+CONFIG_RTC_INTF_DEV=y
+# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
+# CONFIG_RTC_DRV_TEST is not set
+
+#
+# I2C RTC drivers
+#
+
+#
+# SPI RTC drivers
+#
+
+#
+# SPI and I2C RTC drivers
+#
+
+#
+# Platform RTC drivers
+#
+CONFIG_RTC_DRV_CMOS=y
+# CONFIG_RTC_DRV_DS1286 is not set
+# CONFIG_RTC_DRV_DS1511 is not set
+# CONFIG_RTC_DRV_DS1553 is not set
+# CONFIG_RTC_DRV_DS1685_FAMILY is not set
+# CONFIG_RTC_DRV_DS1742 is not set
+# CONFIG_RTC_DRV_DS2404 is not set
+# CONFIG_RTC_DRV_STK17TA8 is not set
+# CONFIG_RTC_DRV_M48T86 is not set
+# CONFIG_RTC_DRV_M48T35 is not set
+# CONFIG_RTC_DRV_M48T59 is not set
+# CONFIG_RTC_DRV_MSM6242 is not set
+# CONFIG_RTC_DRV_BQ4802 is not set
+# CONFIG_RTC_DRV_RP5C01 is not set
+# CONFIG_RTC_DRV_V3020 is not set
+
+#
+# on-CPU RTC drivers
+#
+# CONFIG_RTC_DRV_FTRTC010 is not set
+
+#
+# HID Sensor RTC drivers
+#
+# CONFIG_RTC_DRV_GOLDFISH is not set
+# CONFIG_DMADEVICES is not set
+
+#
+# DMABUF options
+#
+# CONFIG_SYNC_FILE is not set
+# CONFIG_UDMABUF is not set
+# CONFIG_DMABUF_MOVE_NOTIFY is not set
+# CONFIG_DMABUF_DEBUG is not set
+# CONFIG_DMABUF_SELFTESTS is not set
+# CONFIG_DMABUF_HEAPS is not set
+CONFIG_DMABUF_SYSFS_STATS=y
+# end of DMABUF options
+
+# CONFIG_AUXDISPLAY is not set
+CONFIG_UIO=m
+# CONFIG_UIO_CIF is not set
+# CONFIG_UIO_PDRV_GENIRQ is not set
+# CONFIG_UIO_DMEM_GENIRQ is not set
+# CONFIG_UIO_AEC is not set
+# CONFIG_UIO_SERCOS3 is not set
+CONFIG_UIO_PCI_GENERIC=m
+# CONFIG_UIO_NETX is not set
+# CONFIG_UIO_PRUSS is not set
+# CONFIG_UIO_MF624 is not set
+# CONFIG_UIO_HV_GENERIC is not set
+CONFIG_VFIO=m
+CONFIG_VFIO_IOMMU_TYPE1=m
+CONFIG_VFIO_VIRQFD=m
+CONFIG_VFIO_NOIOMMU=y
+CONFIG_VFIO_PCI_CORE=m
+CONFIG_VFIO_PCI_MMAP=y
+CONFIG_VFIO_PCI_INTX=y
+CONFIG_VFIO_PCI=m
+# CONFIG_VFIO_PCI_IGD is not set
+# CONFIG_MLX5_VFIO_PCI is not set
+# CONFIG_VFIO_MDEV is not set
+CONFIG_IRQ_BYPASS_MANAGER=m
+CONFIG_VIRT_DRIVERS=y
+CONFIG_VMGENID=y
+# CONFIG_VBOXGUEST is not set
+# CONFIG_NITRO_ENCLAVES is not set
+# CONFIG_EFI_SECRET is not set
+CONFIG_SEV_GUEST=y
+# CONFIG_TDX_GUEST_DRIVER is not set
+CONFIG_VIRTIO_ANCHOR=y
+CONFIG_VIRTIO=y
+CONFIG_VIRTIO_PCI_LIB=y
+CONFIG_VIRTIO_PCI_LIB_LEGACY=y
+CONFIG_VIRTIO_MENU=y
+CONFIG_VIRTIO_PCI=y
+CONFIG_VIRTIO_PCI_LEGACY=y
+CONFIG_VIRTIO_BALLOON=m
+CONFIG_VIRTIO_MEM=m
+# CONFIG_VIRTIO_INPUT is not set
+# CONFIG_VIRTIO_MMIO is not set
+# CONFIG_VDPA is not set
+CONFIG_VHOST_IOTLB=m
+CONFIG_VHOST=m
+CONFIG_VHOST_MENU=y
+CONFIG_VHOST_NET=m
+# CONFIG_VHOST_SCSI is not set
+CONFIG_VHOST_VSOCK=m
+# CONFIG_VHOST_CROSS_ENDIAN_LEGACY is not set
+
+#
+# Microsoft Hyper-V guest support
+#
+CONFIG_HYPERV=y
+CONFIG_HYPERV_TIMER=y
+CONFIG_HYPERV_UTILS=y
+CONFIG_HYPERV_BALLOON=y
+# end of Microsoft Hyper-V guest support
+
+#
+# Xen driver support
+#
+CONFIG_XEN_BALLOON=y
+CONFIG_XEN_BALLOON_MEMORY_HOTPLUG=y
+CONFIG_XEN_MEMORY_HOTPLUG_LIMIT=512
+CONFIG_XEN_SCRUB_PAGES_DEFAULT=y
+CONFIG_XEN_DEV_EVTCHN=m
+CONFIG_XEN_BACKEND=y
+CONFIG_XENFS=m
+CONFIG_XEN_COMPAT_XENFS=y
+CONFIG_XEN_SYS_HYPERVISOR=y
+CONFIG_XEN_XENBUS_FRONTEND=y
+CONFIG_XEN_GNTDEV=m
+CONFIG_XEN_GRANT_DEV_ALLOC=m
+# CONFIG_XEN_GRANT_DMA_ALLOC is not set
+CONFIG_SWIOTLB_XEN=y
+CONFIG_XEN_PCI_STUB=y
+CONFIG_XEN_PCIDEV_BACKEND=m
+# CONFIG_XEN_PVCALLS_FRONTEND is not set
+# CONFIG_XEN_PVCALLS_BACKEND is not set
+# CONFIG_XEN_SCSI_BACKEND is not set
+CONFIG_XEN_PRIVCMD=m
+CONFIG_XEN_ACPI_PROCESSOR=m
+# CONFIG_XEN_MCE_LOG is not set
+CONFIG_XEN_HAVE_PVMMU=y
+CONFIG_XEN_EFI=y
+CONFIG_XEN_AUTO_XLATE=y
+CONFIG_XEN_ACPI=y
+CONFIG_XEN_SYMS=y
+CONFIG_XEN_HAVE_VPMU=y
+# CONFIG_XEN_VIRTIO is not set
+CONFIG_XEN_UNPOPULATED_ALLOC=y
+# end of Xen driver support
+
+# CONFIG_GREYBUS is not set
+# CONFIG_COMEDI is not set
+# CONFIG_STAGING is not set
+# CONFIG_CHROME_PLATFORMS is not set
+CONFIG_MELLANOX_PLATFORM=y
+CONFIG_SURFACE_PLATFORMS=y
+# CONFIG_SURFACE_GPE is not set
+# CONFIG_SURFACE_PRO3_BUTTON is not set
+CONFIG_X86_PLATFORM_DEVICES=y
+# CONFIG_ACPI_WMI is not set
+# CONFIG_ACERHDF is not set
+# CONFIG_ACER_WIRELESS is not set
+# CONFIG_AMD_PMC is not set
+# CONFIG_AMD_HSMP is not set
+# CONFIG_ADV_SWBUTTON is not set
+# CONFIG_ASUS_WIRELESS is not set
+# CONFIG_X86_PLATFORM_DRIVERS_DELL is not set
+# CONFIG_FUJITSU_TABLET is not set
+# CONFIG_GPD_POCKET_FAN is not set
+# CONFIG_X86_PLATFORM_DRIVERS_HP is not set
+# CONFIG_WIRELESS_HOTKEY is not set
+# CONFIG_IBM_RTL is not set
+# CONFIG_SENSORS_HDAPS is not set
+# CONFIG_INTEL_SAR_INT1092 is not set
+# CONFIG_INTEL_PMC_CORE is not set
+
+#
+# Intel Speed Select Technology interface support
+#
+# CONFIG_INTEL_SPEED_SELECT_INTERFACE is not set
+# end of Intel Speed Select Technology interface support
+
+#
+# Intel Uncore Frequency Control
+#
+# CONFIG_INTEL_UNCORE_FREQ_CONTROL is not set
+# end of Intel Uncore Frequency Control
+
+# CONFIG_INTEL_PUNIT_IPC is not set
+# CONFIG_INTEL_RST is not set
+# CONFIG_INTEL_SMARTCONNECT is not set
+# CONFIG_INTEL_TURBO_MAX_3 is not set
+# CONFIG_INTEL_VSEC is not set
+# CONFIG_SAMSUNG_Q10 is not set
+# CONFIG_TOSHIBA_BT_RFKILL is not set
+# CONFIG_TOSHIBA_HAPS is not set
+# CONFIG_ACPI_CMPC is not set
+# CONFIG_TOPSTAR_LAPTOP is not set
+# CONFIG_INTEL_IPS is not set
+# CONFIG_INTEL_SCU_PCI is not set
+# CONFIG_INTEL_SCU_PLATFORM is not set
+# CONFIG_SIEMENS_SIMATIC_IPC is not set
+# CONFIG_WINMATE_FM07_KEYS is not set
+CONFIG_P2SB=y
+CONFIG_HAVE_CLK=y
+CONFIG_HAVE_CLK_PREPARE=y
+CONFIG_COMMON_CLK=y
+# CONFIG_XILINX_VCU is not set
+# CONFIG_HWSPINLOCK is not set
+
+#
+# Clock Source drivers
+#
+CONFIG_CLKEVT_I8253=y
+CONFIG_CLKBLD_I8253=y
+# end of Clock Source drivers
+
+CONFIG_MAILBOX=y
+CONFIG_PCC=y
+# CONFIG_ALTERA_MBOX is not set
+CONFIG_IOMMU_IOVA=y
+CONFIG_IOASID=y
+CONFIG_IOMMU_API=y
+CONFIG_IOMMU_SUPPORT=y
+
+#
+# Generic IOMMU Pagetable Support
+#
+CONFIG_IOMMU_IO_PGTABLE=y
+# end of Generic IOMMU Pagetable Support
+
+# CONFIG_IOMMU_DEBUGFS is not set
+# CONFIG_IOMMU_DEFAULT_DMA_STRICT is not set
+CONFIG_IOMMU_DEFAULT_DMA_LAZY=y
+# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
+CONFIG_IOMMU_DMA=y
+CONFIG_AMD_IOMMU=y
+CONFIG_AMD_IOMMU_V2=m
+CONFIG_DMAR_TABLE=y
+CONFIG_INTEL_IOMMU=y
+# CONFIG_INTEL_IOMMU_SVM is not set
+# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
+CONFIG_INTEL_IOMMU_FLOPPY_WA=y
+# CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON is not set
+CONFIG_IRQ_REMAP=y
+CONFIG_HYPERV_IOMMU=y
+# CONFIG_VIRTIO_IOMMU is not set
+
+#
+# Remoteproc drivers
+#
+# CONFIG_REMOTEPROC is not set
+# end of Remoteproc drivers
+
+#
+# Rpmsg drivers
+#
+# CONFIG_RPMSG_QCOM_GLINK_RPM is not set
+# CONFIG_RPMSG_VIRTIO is not set
+# end of Rpmsg drivers
+
+# CONFIG_SOUNDWIRE is not set
+
+#
+# SOC (System On Chip) specific Drivers
+#
+
+#
+# Amlogic SoC drivers
+#
+# end of Amlogic SoC drivers
+
+#
+# Broadcom SoC drivers
+#
+# end of Broadcom SoC drivers
+
+#
+# NXP/Freescale QorIQ SoC drivers
+#
+# end of NXP/Freescale QorIQ SoC drivers
+
+#
+# fujitsu SoC drivers
+#
+# end of fujitsu SoC drivers
+
+#
+# i.MX SoC drivers
+#
+# end of i.MX SoC drivers
+
+#
+# Enable LiteX SoC Builder specific drivers
+#
+# end of Enable LiteX SoC Builder specific drivers
+
+#
+# Qualcomm SoC drivers
+#
+# end of Qualcomm SoC drivers
+
+# CONFIG_SOC_TI is not set
+
+#
+# Xilinx SoC drivers
+#
+# end of Xilinx SoC drivers
+# end of SOC (System On Chip) specific Drivers
+
+# CONFIG_PM_DEVFREQ is not set
+# CONFIG_EXTCON is not set
+# CONFIG_MEMORY is not set
+# CONFIG_IIO is not set
+# CONFIG_NTB is not set
+# CONFIG_PWM is not set
+
+#
+# IRQ chip support
+#
+# end of IRQ chip support
+
+# CONFIG_IPACK_BUS is not set
+# CONFIG_RESET_CONTROLLER is not set
+
+#
+# PHY Subsystem
+#
+# CONFIG_GENERIC_PHY is not set
+# CONFIG_PHY_CAN_TRANSCEIVER is not set
+
+#
+# PHY drivers for Broadcom platforms
+#
+# CONFIG_BCM_KONA_USB2_PHY is not set
+# end of PHY drivers for Broadcom platforms
+
+# CONFIG_PHY_PXA_28NM_HSIC is not set
+# CONFIG_PHY_PXA_28NM_USB2 is not set
+# CONFIG_PHY_INTEL_LGM_EMMC is not set
+# end of PHY Subsystem
+
+# CONFIG_POWERCAP is not set
+# CONFIG_MCB is not set
+
+#
+# Performance monitor support
+#
+# end of Performance monitor support
+
+CONFIG_RAS=y
+# CONFIG_RAS_CEC is not set
+# CONFIG_USB4 is not set
+
+#
+# Android
+#
+# CONFIG_ANDROID_BINDER_IPC is not set
+# end of Android
+
+# CONFIG_LIBNVDIMM is not set
+CONFIG_DAX=y
+# CONFIG_DEV_DAX is not set
+CONFIG_NVMEM=y
+CONFIG_NVMEM_SYSFS=y
+# CONFIG_NVMEM_RMEM is not set
+
+#
+# HW tracing support
+#
+# CONFIG_STM is not set
+# CONFIG_INTEL_TH is not set
+# end of HW tracing support
+
+# CONFIG_FPGA is not set
+# CONFIG_TEE is not set
+# CONFIG_SIOX is not set
+# CONFIG_SLIMBUS is not set
+# CONFIG_INTERCONNECT is not set
+# CONFIG_COUNTER is not set
+# CONFIG_MOST is not set
+# CONFIG_PECI is not set
+# CONFIG_HTE is not set
+# end of Device Drivers
+
+#
+# File systems
+#
+CONFIG_DCACHE_WORD_ACCESS=y
+# CONFIG_VALIDATE_FS_PARSER is not set
+CONFIG_FS_IOMAP=y
+# CONFIG_EXT2_FS is not set
+# CONFIG_EXT3_FS is not set
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_USE_FOR_EXT2=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+# CONFIG_EXT4_DEBUG is not set
+CONFIG_JBD2=y
+# CONFIG_JBD2_DEBUG is not set
+CONFIG_FS_MBCACHE=y
+# CONFIG_REISERFS_FS is not set
+# CONFIG_JFS_FS is not set
+CONFIG_XFS_FS=y
+CONFIG_XFS_SUPPORT_V4=y
+CONFIG_XFS_QUOTA=y
+CONFIG_XFS_POSIX_ACL=y
+# CONFIG_XFS_RT is not set
+# CONFIG_XFS_ONLINE_SCRUB is not set
+# CONFIG_XFS_WARN is not set
+# CONFIG_XFS_DEBUG is not set
+# CONFIG_GFS2_FS is not set
+# CONFIG_OCFS2_FS is not set
+# CONFIG_BTRFS_FS is not set
+# CONFIG_NILFS2_FS is not set
+# CONFIG_F2FS_FS is not set
+CONFIG_FS_POSIX_ACL=y
+CONFIG_EXPORTFS=y
+# CONFIG_EXPORTFS_BLOCK_OPS is not set
+CONFIG_FILE_LOCKING=y
+# CONFIG_FS_ENCRYPTION is not set
+# CONFIG_FS_VERITY is not set
+CONFIG_FSNOTIFY=y
+CONFIG_DNOTIFY=y
+CONFIG_INOTIFY_USER=y
+CONFIG_FANOTIFY=y
+CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y
+CONFIG_QUOTA=y
+CONFIG_QUOTA_NETLINK_INTERFACE=y
+# CONFIG_PRINT_QUOTA_WARNING is not set
+# CONFIG_QUOTA_DEBUG is not set
+CONFIG_QUOTA_TREE=y
+# CONFIG_QFMT_V1 is not set
+CONFIG_QFMT_V2=y
+CONFIG_QUOTACTL=y
+CONFIG_AUTOFS4_FS=y
+CONFIG_AUTOFS_FS=y
+CONFIG_FUSE_FS=m
+# CONFIG_CUSE is not set
+# CONFIG_VIRTIO_FS is not set
+CONFIG_OVERLAY_FS=y
+# CONFIG_OVERLAY_FS_REDIRECT_DIR is not set
+CONFIG_OVERLAY_FS_REDIRECT_ALWAYS_FOLLOW=y
+# CONFIG_OVERLAY_FS_INDEX is not set
+# CONFIG_OVERLAY_FS_XINO_AUTO is not set
+# CONFIG_OVERLAY_FS_METACOPY is not set
+
+#
+# Caches
+#
+CONFIG_NETFS_SUPPORT=y
+# CONFIG_NETFS_STATS is not set
+CONFIG_FSCACHE=m
+# CONFIG_FSCACHE_STATS is not set
+# CONFIG_FSCACHE_DEBUG is not set
+CONFIG_CACHEFILES=m
+# CONFIG_CACHEFILES_DEBUG is not set
+# CONFIG_CACHEFILES_ERROR_INJECTION is not set
+# CONFIG_CACHEFILES_ONDEMAND is not set
+# end of Caches
+
+#
+# CD-ROM/DVD Filesystems
+#
+CONFIG_ISO9660_FS=m
+CONFIG_JOLIET=y
+CONFIG_ZISOFS=y
+CONFIG_UDF_FS=m
+# end of CD-ROM/DVD Filesystems
+
+#
+# DOS/FAT/EXFAT/NT Filesystems
+#
+CONFIG_FAT_FS=m
+# CONFIG_MSDOS_FS is not set
+CONFIG_VFAT_FS=m
+CONFIG_FAT_DEFAULT_CODEPAGE=437
+CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
+# CONFIG_FAT_DEFAULT_UTF8 is not set
+# CONFIG_EXFAT_FS is not set
+CONFIG_NTFS_FS=m
+# CONFIG_NTFS_DEBUG is not set
+# CONFIG_NTFS_RW is not set
+# CONFIG_NTFS3_FS is not set
+# end of DOS/FAT/EXFAT/NT Filesystems
+
+#
+# Pseudo filesystems
+#
+CONFIG_PROC_FS=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROC_SYSCTL=y
+CONFIG_PROC_PAGE_MONITOR=y
+CONFIG_PROC_CHILDREN=y
+CONFIG_PROC_PID_ARCH_STATUS=y
+# CONFIG_PROC_SELF_MEM_READONLY is not set
+CONFIG_KERNFS=y
+CONFIG_SYSFS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TMPFS_XATTR=y
+# CONFIG_TMPFS_INODE64 is not set
+CONFIG_HUGETLBFS=y
+CONFIG_HUGETLB_PAGE=y
+CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
+CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y
+# CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON is not set
+CONFIG_MEMFD_CREATE=y
+CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
+CONFIG_CONFIGFS_FS=m
+CONFIG_EFIVAR_FS=y
+# end of Pseudo filesystems
+
+CONFIG_MISC_FILESYSTEMS=y
+# CONFIG_ORANGEFS_FS is not set
+# CONFIG_ADFS_FS is not set
+# CONFIG_AFFS_FS is not set
+# CONFIG_ECRYPT_FS is not set
+# CONFIG_HFS_FS is not set
+# CONFIG_HFSPLUS_FS is not set
+# CONFIG_BEFS_FS is not set
+# CONFIG_BFS_FS is not set
+# CONFIG_EFS_FS is not set
+# CONFIG_CRAMFS is not set
+CONFIG_SQUASHFS=m
+# CONFIG_SQUASHFS_FILE_CACHE is not set
+CONFIG_SQUASHFS_FILE_DIRECT=y
+# CONFIG_SQUASHFS_DECOMP_SINGLE is not set
+# CONFIG_SQUASHFS_DECOMP_MULTI is not set
+CONFIG_SQUASHFS_DECOMP_MULTI_PERCPU=y
+CONFIG_SQUASHFS_XATTR=y
+CONFIG_SQUASHFS_ZLIB=y
+# CONFIG_SQUASHFS_LZ4 is not set
+# CONFIG_SQUASHFS_LZO is not set
+# CONFIG_SQUASHFS_XZ is not set
+CONFIG_SQUASHFS_ZSTD=y
+CONFIG_SQUASHFS_4K_DEVBLK_SIZE=y
+# CONFIG_SQUASHFS_EMBEDDED is not set
+CONFIG_SQUASHFS_FRAGMENT_CACHE_SIZE=3
+# CONFIG_VXFS_FS is not set
+# CONFIG_MINIX_FS is not set
+# CONFIG_OMFS_FS is not set
+# CONFIG_HPFS_FS is not set
+# CONFIG_QNX4FS_FS is not set
+# CONFIG_QNX6FS_FS is not set
+# CONFIG_ROMFS_FS is not set
+CONFIG_PSTORE=y
+CONFIG_PSTORE_DEFAULT_KMSG_BYTES=10240
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_842_COMPRESS is not set
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+# CONFIG_PSTORE_PMSG is not set
+# CONFIG_PSTORE_FTRACE is not set
+# CONFIG_PSTORE_RAM is not set
+# CONFIG_PSTORE_BLK is not set
+# CONFIG_SYSV_FS is not set
+# CONFIG_UFS_FS is not set
+# CONFIG_EROFS_FS is not set
+CONFIG_NETWORK_FILESYSTEMS=y
+CONFIG_NFS_FS=m
+# CONFIG_NFS_V2 is not set
+CONFIG_NFS_V3=m
+CONFIG_NFS_V3_ACL=y
+CONFIG_NFS_V4=m
+# CONFIG_NFS_SWAP is not set
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_PNFS_FILE_LAYOUT=m
+CONFIG_PNFS_BLOCK=m
+CONFIG_PNFS_FLEXFILE_LAYOUT=m
+CONFIG_NFS_V4_1_IMPLEMENTATION_ID_DOMAIN="kernel.org"
+# CONFIG_NFS_V4_1_MIGRATION is not set
+CONFIG_NFS_V4_SECURITY_LABEL=y
+CONFIG_NFS_FSCACHE=y
+# CONFIG_NFS_USE_LEGACY_DNS is not set
+CONFIG_NFS_USE_KERNEL_DNS=y
+CONFIG_NFS_DEBUG=y
+CONFIG_NFS_DISABLE_UDP_SUPPORT=y
+# CONFIG_NFS_V4_2_READ_PLUS is not set
+CONFIG_NFSD=m
+CONFIG_NFSD_V2_ACL=y
+CONFIG_NFSD_V3_ACL=y
+CONFIG_NFSD_V4=y
+# CONFIG_NFSD_BLOCKLAYOUT is not set
+# CONFIG_NFSD_SCSILAYOUT is not set
+# CONFIG_NFSD_FLEXFILELAYOUT is not set
+# CONFIG_NFSD_V4_2_INTER_SSC is not set
+CONFIG_NFSD_V4_SECURITY_LABEL=y
+CONFIG_GRACE_PERIOD=m
+CONFIG_LOCKD=m
+CONFIG_LOCKD_V4=y
+CONFIG_NFS_ACL_SUPPORT=m
+CONFIG_NFS_COMMON=y
+CONFIG_NFS_V4_2_SSC_HELPER=y
+CONFIG_SUNRPC=m
+CONFIG_SUNRPC_GSS=m
+CONFIG_SUNRPC_BACKCHANNEL=y
+CONFIG_RPCSEC_GSS_KRB5=m
+# CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES is not set
+CONFIG_SUNRPC_DEBUG=y
+# CONFIG_CEPH_FS is not set
+CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
+# CONFIG_CIFS_ALLOW_INSECURE_LEGACY is not set
+CONFIG_CIFS_UPCALL=y
+CONFIG_CIFS_XATTR=y
+CONFIG_CIFS_DEBUG=y
+# CONFIG_CIFS_DEBUG2 is not set
+# CONFIG_CIFS_DEBUG_DUMP_KEYS is not set
+CONFIG_CIFS_DFS_UPCALL=y
+# CONFIG_CIFS_SWN_UPCALL is not set
+# CONFIG_CIFS_FSCACHE is not set
+# CONFIG_SMB_SERVER is not set
+CONFIG_SMBFS=m
+# CONFIG_CODA_FS is not set
+# CONFIG_AFS_FS is not set
+CONFIG_9P_FS=y
+CONFIG_9P_FS_POSIX_ACL=y
+CONFIG_9P_FS_SECURITY=y
+CONFIG_NLS=y
+CONFIG_NLS_DEFAULT="utf8"
+CONFIG_NLS_CODEPAGE_437=m
+# CONFIG_NLS_CODEPAGE_737 is not set
+# CONFIG_NLS_CODEPAGE_775 is not set
+# CONFIG_NLS_CODEPAGE_850 is not set
+# CONFIG_NLS_CODEPAGE_852 is not set
+# CONFIG_NLS_CODEPAGE_855 is not set
+# CONFIG_NLS_CODEPAGE_857 is not set
+# CONFIG_NLS_CODEPAGE_860 is not set
+# CONFIG_NLS_CODEPAGE_861 is not set
+# CONFIG_NLS_CODEPAGE_862 is not set
+# CONFIG_NLS_CODEPAGE_863 is not set
+# CONFIG_NLS_CODEPAGE_864 is not set
+# CONFIG_NLS_CODEPAGE_865 is not set
+# CONFIG_NLS_CODEPAGE_866 is not set
+# CONFIG_NLS_CODEPAGE_869 is not set
+# CONFIG_NLS_CODEPAGE_936 is not set
+# CONFIG_NLS_CODEPAGE_950 is not set
+# CONFIG_NLS_CODEPAGE_932 is not set
+# CONFIG_NLS_CODEPAGE_949 is not set
+# CONFIG_NLS_CODEPAGE_874 is not set
+# CONFIG_NLS_ISO8859_8 is not set
+# CONFIG_NLS_CODEPAGE_1250 is not set
+# CONFIG_NLS_CODEPAGE_1251 is not set
+CONFIG_NLS_ASCII=m
+CONFIG_NLS_ISO8859_1=m
+# CONFIG_NLS_ISO8859_2 is not set
+# CONFIG_NLS_ISO8859_3 is not set
+# CONFIG_NLS_ISO8859_4 is not set
+# CONFIG_NLS_ISO8859_5 is not set
+# CONFIG_NLS_ISO8859_6 is not set
+# CONFIG_NLS_ISO8859_7 is not set
+# CONFIG_NLS_ISO8859_9 is not set
+# CONFIG_NLS_ISO8859_13 is not set
+# CONFIG_NLS_ISO8859_14 is not set
+# CONFIG_NLS_ISO8859_15 is not set
+# CONFIG_NLS_KOI8_R is not set
+# CONFIG_NLS_KOI8_U is not set
+# CONFIG_NLS_MAC_ROMAN is not set
+# CONFIG_NLS_MAC_CELTIC is not set
+# CONFIG_NLS_MAC_CENTEURO is not set
+# CONFIG_NLS_MAC_CROATIAN is not set
+# CONFIG_NLS_MAC_CYRILLIC is not set
+# CONFIG_NLS_MAC_GAELIC is not set
+# CONFIG_NLS_MAC_GREEK is not set
+# CONFIG_NLS_MAC_ICELAND is not set
+# CONFIG_NLS_MAC_INUIT is not set
+# CONFIG_NLS_MAC_ROMANIAN is not set
+# CONFIG_NLS_MAC_TURKISH is not set
+CONFIG_NLS_UTF8=m
+# CONFIG_DLM is not set
+# CONFIG_UNICODE is not set
+CONFIG_IO_WQ=y
+# end of File systems
+
+#
+# Security options
+#
+CONFIG_KEYS=y
+# CONFIG_KEYS_REQUEST_CACHE is not set
+# CONFIG_PERSISTENT_KEYRINGS is not set
+# CONFIG_TRUSTED_KEYS is not set
+# CONFIG_ENCRYPTED_KEYS is not set
+# CONFIG_KEY_DH_OPERATIONS is not set
+CONFIG_SECURITY_DMESG_RESTRICT=y
+CONFIG_SECURITY=y
+CONFIG_SECURITYFS=y
+CONFIG_SECURITY_NETWORK=y
+# CONFIG_SECURITY_NETWORK_XFRM is not set
+CONFIG_SECURITY_PATH=y
+# CONFIG_INTEL_TXT is not set
+CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
+CONFIG_HARDENED_USERCOPY=y
+# CONFIG_FORTIFY_SOURCE is not set
+# CONFIG_STATIC_USERMODEHELPER is not set
+# CONFIG_SECURITY_SELINUX is not set
+# CONFIG_SECURITY_SMACK is not set
+# CONFIG_SECURITY_TOMOYO is not set
+CONFIG_SECURITY_APPARMOR=y
+# CONFIG_SECURITY_APPARMOR_DEBUG is not set
+CONFIG_SECURITY_APPARMOR_INTROSPECT_POLICY=y
+CONFIG_SECURITY_APPARMOR_HASH=y
+CONFIG_SECURITY_APPARMOR_HASH_DEFAULT=y
+CONFIG_SECURITY_APPARMOR_EXPORT_BINARY=y
+CONFIG_SECURITY_APPARMOR_PARANOID_LOAD=y
+CONFIG_SECURITY_LOADPIN=y
+CONFIG_SECURITY_LOADPIN_ENFORCE=y
+# CONFIG_SECURITY_LOADPIN_VERITY is not set
+CONFIG_SECURITY_YAMA=y
+CONFIG_SECURITY_CONTAINER_MONITOR=y
+# CONFIG_SECURITY_CONTAINER_MONITOR_DEBUG is not set
+CONFIG_SECURITY_SAFESETID=y
+CONFIG_SECURITY_LOCKDOWN_LSM=y
+# CONFIG_SECURITY_LOCKDOWN_LSM_EARLY is not set
+CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE=y
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY is not set
+# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set
+# CONFIG_SECURITY_LANDLOCK is not set
+CONFIG_INTEGRITY=y
+CONFIG_INTEGRITY_SIGNATURE=y
+CONFIG_INTEGRITY_ASYMMETRIC_KEYS=y
+CONFIG_INTEGRITY_TRUSTED_KEYRING=y
+CONFIG_INTEGRITY_PLATFORM_KEYRING=y
+# CONFIG_INTEGRITY_MACHINE_KEYRING is not set
+CONFIG_LOAD_UEFI_KEYS=y
+CONFIG_INTEGRITY_AUDIT=y
+CONFIG_IMA=y
+# CONFIG_IMA_KEXEC is not set
+CONFIG_IMA_MEASURE_PCR_IDX=10
+CONFIG_IMA_LSM_RULES=y
+CONFIG_IMA_NG_TEMPLATE=y
+# CONFIG_IMA_SIG_TEMPLATE is not set
+CONFIG_IMA_DEFAULT_TEMPLATE="ima-ng"
+# CONFIG_IMA_DEFAULT_HASH_SHA1 is not set
+CONFIG_IMA_DEFAULT_HASH_SHA256=y
+CONFIG_IMA_DEFAULT_HASH="sha256"
+CONFIG_IMA_WRITE_POLICY=y
+# CONFIG_IMA_READ_POLICY is not set
+CONFIG_IMA_APPRAISE=y
+# CONFIG_IMA_ARCH_POLICY is not set
+CONFIG_IMA_APPRAISE_BUILD_POLICY=y
+CONFIG_IMA_APPRAISE_REQUIRE_FIRMWARE_SIGS=y
+# CONFIG_IMA_APPRAISE_REQUIRE_KEXEC_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_MODULE_SIGS is not set
+# CONFIG_IMA_APPRAISE_REQUIRE_POLICY_SIGS is not set
+CONFIG_IMA_APPRAISE_BOOTPARAM=y
+# CONFIG_IMA_APPRAISE_MODSIG is not set
+CONFIG_IMA_TRUSTED_KEYRING=y
+# CONFIG_IMA_KEYRINGS_PERMIT_SIGNED_BY_BUILTIN_OR_SECONDARY is not set
+# CONFIG_IMA_BLACKLIST_KEYRING is not set
+CONFIG_IMA_LOAD_X509=y
+CONFIG_IMA_X509_PATH="/etc/ima/pubkey.x509"
+# CONFIG_IMA_APPRAISE_SIGNED_INIT is not set
+CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS=y
+CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS=y
+# CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT is not set
+# CONFIG_IMA_DISABLE_HTABLE is not set
+# CONFIG_EVM is not set
+# CONFIG_DEFAULT_SECURITY_APPARMOR is not set
+CONFIG_DEFAULT_SECURITY_DAC=y
+CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,apparmor,bpf"
+
+#
+# Kernel hardening options
+#
+
+#
+# Memory initialization
+#
+CONFIG_CC_HAS_AUTO_VAR_INIT_PATTERN=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO_ENABLER=y
+CONFIG_CC_HAS_AUTO_VAR_INIT_ZERO=y
+# CONFIG_INIT_STACK_NONE is not set
+# CONFIG_INIT_STACK_ALL_PATTERN is not set
+CONFIG_INIT_STACK_ALL_ZERO=y
+# CONFIG_INIT_ON_ALLOC_DEFAULT_ON is not set
+# CONFIG_INIT_ON_FREE_DEFAULT_ON is not set
+# end of Memory initialization
+
+CONFIG_RANDSTRUCT_NONE=y
+# end of Kernel hardening options
+# end of Security options
+
+CONFIG_XOR_BLOCKS=y
+CONFIG_ASYNC_CORE=y
+CONFIG_ASYNC_MEMCPY=m
+CONFIG_ASYNC_XOR=y
+CONFIG_ASYNC_PQ=m
+CONFIG_ASYNC_RAID6_RECOV=m
+CONFIG_CRYPTO=y
+
+#
+# Crypto core or helper
+#
+CONFIG_CRYPTO_ALGAPI=y
+CONFIG_CRYPTO_ALGAPI2=y
+CONFIG_CRYPTO_AEAD=y
+CONFIG_CRYPTO_AEAD2=y
+CONFIG_CRYPTO_SKCIPHER=y
+CONFIG_CRYPTO_SKCIPHER2=y
+CONFIG_CRYPTO_HASH=y
+CONFIG_CRYPTO_HASH2=y
+CONFIG_CRYPTO_RNG=m
+CONFIG_CRYPTO_RNG2=y
+CONFIG_CRYPTO_RNG_DEFAULT=m
+CONFIG_CRYPTO_AKCIPHER2=y
+CONFIG_CRYPTO_AKCIPHER=y
+CONFIG_CRYPTO_KPP2=y
+CONFIG_CRYPTO_ACOMP2=y
+CONFIG_CRYPTO_MANAGER=y
+CONFIG_CRYPTO_MANAGER2=y
+# CONFIG_CRYPTO_USER is not set
+CONFIG_CRYPTO_MANAGER_DISABLE_TESTS=y
+CONFIG_CRYPTO_GF128MUL=y
+CONFIG_CRYPTO_NULL=y
+CONFIG_CRYPTO_NULL2=y
+# CONFIG_CRYPTO_PCRYPT is not set
+CONFIG_CRYPTO_CRYPTD=m
+CONFIG_CRYPTO_AUTHENC=y
+# CONFIG_CRYPTO_TEST is not set
+CONFIG_CRYPTO_SIMD=m
+CONFIG_CRYPTO_ENGINE=m
+# end of Crypto core or helper
+
+#
+# Public-key cryptography
+#
+CONFIG_CRYPTO_RSA=y
+# CONFIG_CRYPTO_DH is not set
+# CONFIG_CRYPTO_ECDH is not set
+# CONFIG_CRYPTO_ECDSA is not set
+# CONFIG_CRYPTO_ECRDSA is not set
+# CONFIG_CRYPTO_SM2 is not set
+# CONFIG_CRYPTO_CURVE25519 is not set
+# end of Public-key cryptography
+
+#
+# Block ciphers
+#
+CONFIG_CRYPTO_AES=y
+# CONFIG_CRYPTO_AES_TI is not set
+# CONFIG_CRYPTO_ANUBIS is not set
+# CONFIG_CRYPTO_ARIA is not set
+# CONFIG_CRYPTO_BLOWFISH is not set
+# CONFIG_CRYPTO_CAMELLIA is not set
+# CONFIG_CRYPTO_CAST5 is not set
+# CONFIG_CRYPTO_CAST6 is not set
+CONFIG_CRYPTO_DES=y
+# CONFIG_CRYPTO_FCRYPT is not set
+# CONFIG_CRYPTO_KHAZAD is not set
+# CONFIG_CRYPTO_SEED is not set
+# CONFIG_CRYPTO_SERPENT is not set
+# CONFIG_CRYPTO_SM4_GENERIC is not set
+# CONFIG_CRYPTO_TEA is not set
+# CONFIG_CRYPTO_TWOFISH is not set
+# end of Block ciphers
+
+#
+# Length-preserving ciphers and modes
+#
+# CONFIG_CRYPTO_ADIANTUM is not set
+CONFIG_CRYPTO_ARC4=y
+# CONFIG_CRYPTO_CHACHA20 is not set
+CONFIG_CRYPTO_CBC=y
+# CONFIG_CRYPTO_CFB is not set
+CONFIG_CRYPTO_CTR=y
+CONFIG_CRYPTO_CTS=y
+CONFIG_CRYPTO_ECB=y
+# CONFIG_CRYPTO_HCTR2 is not set
+# CONFIG_CRYPTO_KEYWRAP is not set
+CONFIG_CRYPTO_LRW=m
+# CONFIG_CRYPTO_OFB is not set
+# CONFIG_CRYPTO_PCBC is not set
+CONFIG_CRYPTO_XTS=m
+# end of Length-preserving ciphers and modes
+
+#
+# AEAD (authenticated encryption with associated data) ciphers
+#
+# CONFIG_CRYPTO_AEGIS128 is not set
+# CONFIG_CRYPTO_CHACHA20POLY1305 is not set
+CONFIG_CRYPTO_CCM=m
+CONFIG_CRYPTO_GCM=y
+CONFIG_CRYPTO_SEQIV=m
+CONFIG_CRYPTO_ECHAINIV=m
+CONFIG_CRYPTO_ESSIV=y
+# end of AEAD (authenticated encryption with associated data) ciphers
+
+#
+# Hashes, digests, and MACs
+#
+# CONFIG_CRYPTO_BLAKE2B is not set
+CONFIG_CRYPTO_CMAC=m
+CONFIG_CRYPTO_GHASH=y
+CONFIG_CRYPTO_HMAC=y
+CONFIG_CRYPTO_MD4=m
+CONFIG_CRYPTO_MD5=y
+CONFIG_CRYPTO_MICHAEL_MIC=m
+# CONFIG_CRYPTO_POLY1305 is not set
+# CONFIG_CRYPTO_RMD160 is not set
+CONFIG_CRYPTO_SHA1=y
+CONFIG_CRYPTO_SHA256=y
+CONFIG_CRYPTO_SHA512=m
+# CONFIG_CRYPTO_SHA3 is not set
+# CONFIG_CRYPTO_SM3_GENERIC is not set
+# CONFIG_CRYPTO_STREEBOG is not set
+# CONFIG_CRYPTO_VMAC is not set
+# CONFIG_CRYPTO_WP512 is not set
+# CONFIG_CRYPTO_XCBC is not set
+# CONFIG_CRYPTO_XXHASH is not set
+# end of Hashes, digests, and MACs
+
+#
+# CRCs (cyclic redundancy checks)
+#
+CONFIG_CRYPTO_CRC32C=y
+# CONFIG_CRYPTO_CRC32 is not set
+CONFIG_CRYPTO_CRCT10DIF=y
+CONFIG_CRYPTO_CRC64_ROCKSOFT=y
+# end of CRCs (cyclic redundancy checks)
+
+#
+# Compression
+#
+CONFIG_CRYPTO_DEFLATE=y
+CONFIG_CRYPTO_LZO=m
+# CONFIG_CRYPTO_842 is not set
+CONFIG_CRYPTO_LZ4=m
+# CONFIG_CRYPTO_LZ4HC is not set
+# CONFIG_CRYPTO_ZSTD is not set
+# end of Compression
+
+#
+# Random number generation
+#
+# CONFIG_CRYPTO_ANSI_CPRNG is not set
+CONFIG_CRYPTO_DRBG_MENU=m
+CONFIG_CRYPTO_DRBG_HMAC=y
+# CONFIG_CRYPTO_DRBG_HASH is not set
+# CONFIG_CRYPTO_DRBG_CTR is not set
+CONFIG_CRYPTO_DRBG=m
+CONFIG_CRYPTO_JITTERENTROPY=m
+# end of Random number generation
+
+#
+# Userspace interface
+#
+CONFIG_CRYPTO_USER_API=y
+CONFIG_CRYPTO_USER_API_HASH=y
+CONFIG_CRYPTO_USER_API_SKCIPHER=y
+# CONFIG_CRYPTO_USER_API_RNG is not set
+CONFIG_CRYPTO_USER_API_AEAD=y
+CONFIG_CRYPTO_USER_API_ENABLE_OBSOLETE=y
+# end of Userspace interface
+
+CONFIG_CRYPTO_HASH_INFO=y
+
+#
+# Accelerated Cryptographic Algorithms for CPU (x86)
+#
+CONFIG_CRYPTO_CURVE25519_X86=m
+CONFIG_CRYPTO_AES_NI_INTEL=m
+# CONFIG_CRYPTO_BLOWFISH_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX_X86_64 is not set
+# CONFIG_CRYPTO_CAMELLIA_AESNI_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_CAST5_AVX_X86_64 is not set
+# CONFIG_CRYPTO_CAST6_AVX_X86_64 is not set
+# CONFIG_CRYPTO_DES3_EDE_X86_64 is not set
+# CONFIG_CRYPTO_SERPENT_SSE2_X86_64 is not set
+# CONFIG_CRYPTO_SERPENT_AVX_X86_64 is not set
+# CONFIG_CRYPTO_SERPENT_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64 is not set
+# CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64 is not set
+# CONFIG_CRYPTO_TWOFISH_X86_64 is not set
+# CONFIG_CRYPTO_TWOFISH_X86_64_3WAY is not set
+# CONFIG_CRYPTO_TWOFISH_AVX_X86_64 is not set
+# CONFIG_CRYPTO_ARIA_AESNI_AVX_X86_64 is not set
+CONFIG_CRYPTO_CHACHA20_X86_64=m
+# CONFIG_CRYPTO_AEGIS128_AESNI_SSE2 is not set
+# CONFIG_CRYPTO_NHPOLY1305_SSE2 is not set
+# CONFIG_CRYPTO_NHPOLY1305_AVX2 is not set
+CONFIG_CRYPTO_BLAKE2S_X86=y
+# CONFIG_CRYPTO_POLYVAL_CLMUL_NI is not set
+CONFIG_CRYPTO_POLY1305_X86_64=m
+# CONFIG_CRYPTO_SHA1_SSSE3 is not set
+# CONFIG_CRYPTO_SHA256_SSSE3 is not set
+# CONFIG_CRYPTO_SHA512_SSSE3 is not set
+# CONFIG_CRYPTO_SM3_AVX_X86_64 is not set
+# CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL is not set
+# CONFIG_CRYPTO_CRC32C_INTEL is not set
+# CONFIG_CRYPTO_CRC32_PCLMUL is not set
+# CONFIG_CRYPTO_CRCT10DIF_PCLMUL is not set
+# end of Accelerated Cryptographic Algorithms for CPU (x86)
+
+CONFIG_CRYPTO_HW=y
+# CONFIG_CRYPTO_DEV_PADLOCK is not set
+# CONFIG_CRYPTO_DEV_CCP is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCC is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_C62X is not set
+# CONFIG_CRYPTO_DEV_QAT_4XXX is not set
+# CONFIG_CRYPTO_DEV_QAT_DH895xCCVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C3XXXVF is not set
+# CONFIG_CRYPTO_DEV_QAT_C62XVF is not set
+# CONFIG_CRYPTO_DEV_NITROX_CNN55XX is not set
+CONFIG_CRYPTO_DEV_VIRTIO=m
+# CONFIG_CRYPTO_DEV_SAFEXCEL is not set
+# CONFIG_CRYPTO_DEV_AMLOGIC_GXL is not set
+CONFIG_ASYMMETRIC_KEY_TYPE=y
+CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE=y
+CONFIG_X509_CERTIFICATE_PARSER=y
+# CONFIG_PKCS8_PRIVATE_KEY_PARSER is not set
+CONFIG_PKCS7_MESSAGE_PARSER=y
+# CONFIG_PKCS7_TEST_KEY is not set
+# CONFIG_SIGNED_PE_FILE_VERIFICATION is not set
+# CONFIG_FIPS_SIGNATURE_SELFTEST is not set
+
+#
+# Certificates for signature checking
+#
+CONFIG_MODULE_SIG_KEY="certs/signing_key.pem"
+CONFIG_MODULE_SIG_KEY_TYPE_RSA=y
+# CONFIG_MODULE_SIG_KEY_TYPE_ECDSA is not set
+CONFIG_SYSTEM_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_TRUSTED_KEYS="google/certs/lakitu_root_cert.pem"
+# CONFIG_SYSTEM_EXTRA_CERTIFICATE is not set
+CONFIG_SECONDARY_TRUSTED_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_KEYRING=y
+CONFIG_SYSTEM_BLACKLIST_HASH_LIST=""
+# CONFIG_SYSTEM_REVOCATION_LIST is not set
+# CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE is not set
+# end of Certificates for signature checking
+
+CONFIG_BINARY_PRINTF=y
+
+#
+# Library routines
+#
+CONFIG_RAID6_PQ=m
+CONFIG_RAID6_PQ_BENCHMARK=y
+# CONFIG_PACKING is not set
+CONFIG_BITREVERSE=y
+CONFIG_GENERIC_STRNCPY_FROM_USER=y
+CONFIG_GENERIC_STRNLEN_USER=y
+CONFIG_GENERIC_NET_UTILS=y
+# CONFIG_CORDIC is not set
+# CONFIG_PRIME_NUMBERS is not set
+CONFIG_RATIONAL=y
+CONFIG_GENERIC_PCI_IOMAP=y
+CONFIG_GENERIC_IOMAP=y
+CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
+CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
+CONFIG_ARCH_USE_SYM_ANNOTATIONS=y
+
+#
+# Crypto library routines
+#
+CONFIG_CRYPTO_LIB_UTILS=y
+CONFIG_CRYPTO_LIB_AES=y
+CONFIG_CRYPTO_LIB_ARC4=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_BLAKE2S=y
+CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CHACHA=m
+CONFIG_CRYPTO_LIB_CHACHA_GENERIC=m
+CONFIG_CRYPTO_LIB_CHACHA=m
+CONFIG_CRYPTO_ARCH_HAVE_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_CURVE25519_GENERIC=m
+CONFIG_CRYPTO_LIB_CURVE25519=m
+CONFIG_CRYPTO_LIB_DES=y
+CONFIG_CRYPTO_LIB_POLY1305_RSIZE=11
+CONFIG_CRYPTO_ARCH_HAVE_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_POLY1305_GENERIC=m
+CONFIG_CRYPTO_LIB_POLY1305=m
+CONFIG_CRYPTO_LIB_CHACHA20POLY1305=m
+CONFIG_CRYPTO_LIB_SHA1=y
+CONFIG_CRYPTO_LIB_SHA256=y
+# end of Crypto library routines
+
+CONFIG_CRC_CCITT=y
+CONFIG_CRC16=y
+CONFIG_CRC_T10DIF=y
+CONFIG_CRC64_ROCKSOFT=y
+CONFIG_CRC_ITU_T=y
+CONFIG_CRC32=y
+# CONFIG_CRC32_SELFTEST is not set
+CONFIG_CRC32_SLICEBY8=y
+# CONFIG_CRC32_SLICEBY4 is not set
+# CONFIG_CRC32_SARWATE is not set
+# CONFIG_CRC32_BIT is not set
+CONFIG_CRC64=y
+# CONFIG_CRC4 is not set
+CONFIG_CRC7=m
+CONFIG_LIBCRC32C=y
+# CONFIG_CRC8 is not set
+CONFIG_XXHASH=y
+# CONFIG_RANDOM32_SELFTEST is not set
+CONFIG_ZLIB_INFLATE=y
+CONFIG_ZLIB_DEFLATE=y
+CONFIG_LZO_COMPRESS=m
+CONFIG_LZO_DECOMPRESS=m
+CONFIG_LZ4_COMPRESS=m
+CONFIG_LZ4_DECOMPRESS=y
+CONFIG_ZSTD_COMMON=y
+CONFIG_ZSTD_DECOMPRESS=y
+CONFIG_XZ_DEC=y
+CONFIG_XZ_DEC_X86=y
+# CONFIG_XZ_DEC_POWERPC is not set
+# CONFIG_XZ_DEC_IA64 is not set
+# CONFIG_XZ_DEC_ARM is not set
+# CONFIG_XZ_DEC_ARMTHUMB is not set
+# CONFIG_XZ_DEC_SPARC is not set
+# CONFIG_XZ_DEC_MICROLZMA is not set
+CONFIG_XZ_DEC_BCJ=y
+# CONFIG_XZ_DEC_TEST is not set
+CONFIG_DECOMPRESS_GZIP=y
+CONFIG_DECOMPRESS_XZ=y
+CONFIG_DECOMPRESS_LZ4=y
+CONFIG_DECOMPRESS_ZSTD=y
+CONFIG_GENERIC_ALLOCATOR=y
+CONFIG_TEXTSEARCH=y
+CONFIG_TEXTSEARCH_KMP=m
+CONFIG_TEXTSEARCH_BM=m
+CONFIG_TEXTSEARCH_FSM=m
+CONFIG_INTERVAL_TREE=y
+CONFIG_XARRAY_MULTI=y
+CONFIG_ASSOCIATIVE_ARRAY=y
+CONFIG_HAS_IOMEM=y
+CONFIG_HAS_IOPORT_MAP=y
+CONFIG_HAS_DMA=y
+CONFIG_DMA_OPS=y
+CONFIG_NEED_SG_DMA_LENGTH=y
+CONFIG_NEED_DMA_MAP_STATE=y
+CONFIG_ARCH_DMA_ADDR_T_64BIT=y
+CONFIG_ARCH_HAS_FORCE_DMA_UNENCRYPTED=y
+CONFIG_SWIOTLB=y
+CONFIG_DMA_COHERENT_POOL=y
+# CONFIG_DMA_CMA is not set
+# CONFIG_DMA_API_DEBUG is not set
+# CONFIG_DMA_MAP_BENCHMARK is not set
+CONFIG_SGL_ALLOC=y
+CONFIG_IOMMU_HELPER=y
+# CONFIG_FORCE_NR_CPUS is not set
+CONFIG_CPU_RMAP=y
+CONFIG_DQL=y
+CONFIG_GLOB=y
+# CONFIG_GLOB_SELFTEST is not set
+CONFIG_NLATTR=y
+CONFIG_CLZ_TAB=y
+# CONFIG_IRQ_POLL is not set
+CONFIG_MPILIB=y
+CONFIG_SIGNATURE=y
+CONFIG_DIMLIB=y
+CONFIG_OID_REGISTRY=y
+CONFIG_UCS2_STRING=y
+CONFIG_HAVE_GENERIC_VDSO=y
+CONFIG_GENERIC_GETTIMEOFDAY=y
+CONFIG_GENERIC_VDSO_TIME_NS=y
+CONFIG_FONT_SUPPORT=y
+CONFIG_FONT_8x16=y
+CONFIG_FONT_AUTOSELECT=y
+CONFIG_SG_POOL=y
+CONFIG_ARCH_HAS_PMEM_API=y
+CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
+CONFIG_ARCH_HAS_COPY_MC=y
+CONFIG_ARCH_STACKWALK=y
+CONFIG_STACKDEPOT=y
+CONFIG_SBITMAP=y
+# end of Library routines
+
+#
+# Kernel hacking
+#
+
+#
+# printk and dmesg options
+#
+CONFIG_PRINTK_TIME=y
+# CONFIG_PRINTK_CALLER is not set
+# CONFIG_STACKTRACE_BUILD_ID is not set
+CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
+CONFIG_CONSOLE_LOGLEVEL_QUIET=4
+CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
+# CONFIG_BOOT_PRINTK_DELAY is not set
+# CONFIG_DYNAMIC_DEBUG is not set
+# CONFIG_DYNAMIC_DEBUG_CORE is not set
+CONFIG_SYMBOLIC_ERRNAME=y
+CONFIG_DEBUG_BUGVERBOSE=y
+# end of printk and dmesg options
+
+CONFIG_DEBUG_KERNEL=y
+CONFIG_DEBUG_MISC=y
+
+#
+# Compile-time checks and compiler options
+#
+CONFIG_DEBUG_INFO=y
+CONFIG_AS_HAS_NON_CONST_LEB128=y
+# CONFIG_DEBUG_INFO_NONE is not set
+# CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT is not set
+CONFIG_DEBUG_INFO_DWARF4=y
+# CONFIG_DEBUG_INFO_DWARF5 is not set
+# CONFIG_DEBUG_INFO_REDUCED is not set
+# CONFIG_DEBUG_INFO_COMPRESSED is not set
+# CONFIG_DEBUG_INFO_SPLIT is not set
+CONFIG_DEBUG_INFO_BTF=y
+CONFIG_PAHOLE_HAS_SPLIT_BTF=y
+CONFIG_DEBUG_INFO_BTF_MODULES=y
+# CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set
+# CONFIG_GDB_SCRIPTS is not set
+CONFIG_FRAME_WARN=2048
+# CONFIG_STRIP_ASM_SYMS is not set
+# CONFIG_HEADERS_INSTALL is not set
+CONFIG_SECTION_MISMATCH_WARN_ONLY=y
+# CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not set
+CONFIG_OBJTOOL=y
+# CONFIG_VMLINUX_MAP is not set
+# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
+# end of Compile-time checks and compiler options
+
+#
+# Generic Kernel Debugging Instruments
+#
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_SERIAL=y
+CONFIG_MAGIC_SYSRQ_SERIAL_SEQUENCE=""
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_FS_ALLOW_ALL=y
+# CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
+# CONFIG_DEBUG_FS_ALLOW_NONE is not set
+CONFIG_HAVE_ARCH_KGDB=y
+# CONFIG_KGDB is not set
+CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
+CONFIG_UBSAN=y
+# CONFIG_UBSAN_TRAP is not set
+CONFIG_CC_HAS_UBSAN_BOUNDS=y
+CONFIG_CC_HAS_UBSAN_ARRAY_BOUNDS=y
+CONFIG_UBSAN_BOUNDS=y
+CONFIG_UBSAN_ARRAY_BOUNDS=y
+# CONFIG_UBSAN_SHIFT is not set
+# CONFIG_UBSAN_BOOL is not set
+# CONFIG_UBSAN_ENUM is not set
+# CONFIG_UBSAN_ALIGNMENT is not set
+# CONFIG_UBSAN_SANITIZE_ALL is not set
+# CONFIG_TEST_UBSAN is not set
+CONFIG_HAVE_ARCH_KCSAN=y
+CONFIG_HAVE_KCSAN_COMPILER=y
+# CONFIG_KCSAN is not set
+# end of Generic Kernel Debugging Instruments
+
+#
+# Networking Debugging
+#
+# CONFIG_NET_DEV_REFCNT_TRACKER is not set
+# CONFIG_NET_NS_REFCNT_TRACKER is not set
+# CONFIG_DEBUG_NET is not set
+# end of Networking Debugging
+
+#
+# Memory Debugging
+#
+# CONFIG_PAGE_EXTENSION is not set
+# CONFIG_DEBUG_PAGEALLOC is not set
+CONFIG_SLUB_DEBUG=y
+# CONFIG_SLUB_DEBUG_ON is not set
+# CONFIG_PAGE_OWNER is not set
+# CONFIG_PAGE_TABLE_CHECK is not set
+# CONFIG_PAGE_POISONING is not set
+# CONFIG_DEBUG_PAGE_REF is not set
+# CONFIG_DEBUG_RODATA_TEST is not set
+CONFIG_ARCH_HAS_DEBUG_WX=y
+# CONFIG_DEBUG_WX is not set
+CONFIG_GENERIC_PTDUMP=y
+# CONFIG_PTDUMP_DEBUGFS is not set
+# CONFIG_DEBUG_OBJECTS is not set
+# CONFIG_SHRINKER_DEBUG is not set
+CONFIG_HAVE_DEBUG_KMEMLEAK=y
+# CONFIG_DEBUG_KMEMLEAK is not set
+# CONFIG_DEBUG_STACK_USAGE is not set
+# CONFIG_SCHED_STACK_END_CHECK is not set
+CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
+# CONFIG_DEBUG_VM is not set
+# CONFIG_DEBUG_VM_PGTABLE is not set
+CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
+# CONFIG_DEBUG_VIRTUAL is not set
+# CONFIG_DEBUG_MEMORY_INIT is not set
+# CONFIG_DEBUG_PER_CPU_MAPS is not set
+CONFIG_ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP=y
+# CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is not set
+CONFIG_HAVE_ARCH_KASAN=y
+CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
+CONFIG_CC_HAS_KASAN_GENERIC=y
+CONFIG_CC_HAS_KASAN_SW_TAGS=y
+CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
+# CONFIG_KASAN is not set
+CONFIG_HAVE_ARCH_KFENCE=y
+# CONFIG_KFENCE is not set
+CONFIG_HAVE_ARCH_KMSAN=y
+CONFIG_HAVE_KMSAN_COMPILER=y
+# CONFIG_KMSAN is not set
+# end of Memory Debugging
+
+# CONFIG_DEBUG_SHIRQ is not set
+
+#
+# Debug Oops, Lockups and Hangs
+#
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PANIC_ON_OOPS_VALUE=1
+CONFIG_PANIC_TIMEOUT=-1
+CONFIG_LOCKUP_DETECTOR=y
+CONFIG_SOFTLOCKUP_DETECTOR=y
+# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
+CONFIG_HARDLOCKUP_DETECTOR_PERF=y
+CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
+CONFIG_HARDLOCKUP_DETECTOR=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120
+CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y
+# CONFIG_WQ_WATCHDOG is not set
+# CONFIG_TEST_LOCKUP is not set
+# end of Debug Oops, Lockups and Hangs
+
+#
+# Scheduler Debugging
+#
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_SCHED_INFO=y
+CONFIG_SCHEDSTATS=y
+# end of Scheduler Debugging
+
+# CONFIG_DEBUG_TIMEKEEPING is not set
+CONFIG_DEBUG_PREEMPT=y
+
+#
+# Lock Debugging (spinlocks, mutexes, etc...)
+#
+CONFIG_LOCK_DEBUGGING_SUPPORT=y
+# CONFIG_PROVE_LOCKING is not set
+# CONFIG_LOCK_STAT is not set
+# CONFIG_DEBUG_RT_MUTEXES is not set
+# CONFIG_DEBUG_SPINLOCK is not set
+# CONFIG_DEBUG_MUTEXES is not set
+# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
+# CONFIG_DEBUG_RWSEMS is not set
+# CONFIG_DEBUG_LOCK_ALLOC is not set
+# CONFIG_DEBUG_ATOMIC_SLEEP is not set
+# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
+# CONFIG_LOCK_TORTURE_TEST is not set
+# CONFIG_WW_MUTEX_SELFTEST is not set
+# CONFIG_SCF_TORTURE_TEST is not set
+# CONFIG_CSD_LOCK_WAIT_DEBUG is not set
+# end of Lock Debugging (spinlocks, mutexes, etc...)
+
+# CONFIG_DEBUG_IRQFLAGS is not set
+CONFIG_STACKTRACE=y
+# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
+# CONFIG_DEBUG_KOBJECT is not set
+
+#
+# Debug kernel data structures
+#
+CONFIG_DEBUG_LIST=y
+# CONFIG_DEBUG_PLIST is not set
+# CONFIG_DEBUG_SG is not set
+CONFIG_DEBUG_NOTIFIERS=y
+# CONFIG_BUG_ON_DATA_CORRUPTION is not set
+# CONFIG_DEBUG_MAPLE_TREE is not set
+# end of Debug kernel data structures
+
+# CONFIG_DEBUG_CREDENTIALS is not set
+
+#
+# RCU Debugging
+#
+# CONFIG_RCU_SCALE_TEST is not set
+# CONFIG_RCU_TORTURE_TEST is not set
+# CONFIG_RCU_REF_SCALE_TEST is not set
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
+# CONFIG_RCU_TRACE is not set
+# CONFIG_RCU_EQS_DEBUG is not set
+# end of RCU Debugging
+
+# CONFIG_DEBUG_WQ_FORCE_RR_CPU is not set
+# CONFIG_CPU_HOTPLUG_STATE_CONTROL is not set
+# CONFIG_LATENCYTOP is not set
+CONFIG_USER_STACKTRACE_SUPPORT=y
+CONFIG_NOP_TRACER=y
+CONFIG_HAVE_RETHOOK=y
+CONFIG_RETHOOK=y
+CONFIG_HAVE_FUNCTION_TRACER=y
+CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
+CONFIG_HAVE_DYNAMIC_FTRACE=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
+CONFIG_HAVE_DYNAMIC_FTRACE_NO_PATCHABLE=y
+CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
+CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
+CONFIG_HAVE_FENTRY=y
+CONFIG_HAVE_OBJTOOL_MCOUNT=y
+CONFIG_HAVE_C_RECORDMCOUNT=y
+CONFIG_HAVE_BUILDTIME_MCOUNT_SORT=y
+CONFIG_BUILDTIME_MCOUNT_SORT=y
+CONFIG_TRACE_CLOCK=y
+CONFIG_RING_BUFFER=y
+CONFIG_EVENT_TRACING=y
+CONFIG_CONTEXT_SWITCH_TRACER=y
+CONFIG_TRACING=y
+CONFIG_GENERIC_TRACER=y
+CONFIG_TRACING_SUPPORT=y
+CONFIG_FTRACE=y
+# CONFIG_BOOTTIME_TRACING is not set
+CONFIG_FUNCTION_TRACER=y
+CONFIG_FUNCTION_GRAPH_TRACER=y
+CONFIG_DYNAMIC_FTRACE=y
+CONFIG_DYNAMIC_FTRACE_WITH_REGS=y
+CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
+CONFIG_DYNAMIC_FTRACE_WITH_ARGS=y
+# CONFIG_FPROBE is not set
+# CONFIG_FUNCTION_PROFILER is not set
+# CONFIG_STACK_TRACER is not set
+# CONFIG_IRQSOFF_TRACER is not set
+# CONFIG_PREEMPT_TRACER is not set
+# CONFIG_SCHED_TRACER is not set
+# CONFIG_HWLAT_TRACER is not set
+# CONFIG_OSNOISE_TRACER is not set
+# CONFIG_TIMERLAT_TRACER is not set
+# CONFIG_MMIOTRACE is not set
+CONFIG_FTRACE_SYSCALLS=y
+# CONFIG_TRACER_SNAPSHOT is not set
+CONFIG_BRANCH_PROFILE_NONE=y
+# CONFIG_PROFILE_ANNOTATED_BRANCHES is not set
+# CONFIG_PROFILE_ALL_BRANCHES is not set
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_KPROBE_EVENTS=y
+# CONFIG_KPROBE_EVENTS_ON_NOTRACE is not set
+CONFIG_UPROBE_EVENTS=y
+CONFIG_BPF_EVENTS=y
+CONFIG_DYNAMIC_EVENTS=y
+CONFIG_PROBE_EVENTS=y
+# CONFIG_BPF_KPROBE_OVERRIDE is not set
+CONFIG_FTRACE_MCOUNT_RECORD=y
+CONFIG_FTRACE_MCOUNT_USE_OBJTOOL=y
+# CONFIG_SYNTH_EVENTS is not set
+# CONFIG_HIST_TRIGGERS is not set
+# CONFIG_TRACE_EVENT_INJECT is not set
+# CONFIG_TRACEPOINT_BENCHMARK is not set
+# CONFIG_RING_BUFFER_BENCHMARK is not set
+# CONFIG_TRACE_EVAL_MAP_FILE is not set
+# CONFIG_FTRACE_RECORD_RECURSION is not set
+# CONFIG_FTRACE_STARTUP_TEST is not set
+# CONFIG_FTRACE_SORT_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_STARTUP_TEST is not set
+# CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is not set
+# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
+# CONFIG_KPROBE_EVENT_GEN_TEST is not set
+# CONFIG_RV is not set
+CONFIG_PROVIDE_OHCI1394_DMA_INIT=y
+# CONFIG_SAMPLES is not set
+CONFIG_HAVE_SAMPLE_FTRACE_DIRECT=y
+CONFIG_HAVE_SAMPLE_FTRACE_DIRECT_MULTI=y
+CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
+CONFIG_STRICT_DEVMEM=y
+CONFIG_IO_STRICT_DEVMEM=y
+
+#
+# x86 Debugging
+#
+CONFIG_EARLY_PRINTK_USB=y
+CONFIG_X86_VERBOSE_BOOTUP=y
+CONFIG_EARLY_PRINTK=y
+CONFIG_EARLY_PRINTK_DBGP=y
+# CONFIG_EARLY_PRINTK_USB_XDBC is not set
+# CONFIG_EFI_PGT_DUMP is not set
+# CONFIG_DEBUG_TLBFLUSH is not set
+# CONFIG_IOMMU_DEBUG is not set
+CONFIG_HAVE_MMIOTRACE_SUPPORT=y
+# CONFIG_X86_DECODER_SELFTEST is not set
+# CONFIG_IO_DELAY_0X80 is not set
+CONFIG_IO_DELAY_0XED=y
+# CONFIG_IO_DELAY_UDELAY is not set
+# CONFIG_IO_DELAY_NONE is not set
+CONFIG_DEBUG_BOOT_PARAMS=y
+# CONFIG_CPA_DEBUG is not set
+# CONFIG_DEBUG_ENTRY is not set
+# CONFIG_DEBUG_NMI_SELFTEST is not set
+# CONFIG_X86_DEBUG_FPU is not set
+# CONFIG_PUNIT_ATOM_DEBUG is not set
+CONFIG_UNWINDER_ORC=y
+# CONFIG_UNWINDER_FRAME_POINTER is not set
+# end of x86 Debugging
+
+#
+# Kernel Testing and Coverage
+#
+# CONFIG_KUNIT is not set
+# CONFIG_NOTIFIER_ERROR_INJECTION is not set
+CONFIG_FUNCTION_ERROR_INJECTION=y
+# CONFIG_FAULT_INJECTION is not set
+CONFIG_ARCH_HAS_KCOV=y
+CONFIG_CC_HAS_SANCOV_TRACE_PC=y
+# CONFIG_KCOV is not set
+CONFIG_RUNTIME_TESTING_MENU=y
+# CONFIG_LKDTM is not set
+# CONFIG_TEST_MIN_HEAP is not set
+# CONFIG_TEST_DIV64 is not set
+# CONFIG_BACKTRACE_SELF_TEST is not set
+# CONFIG_TEST_REF_TRACKER is not set
+# CONFIG_RBTREE_TEST is not set
+# CONFIG_REED_SOLOMON_TEST is not set
+# CONFIG_INTERVAL_TREE_TEST is not set
+# CONFIG_PERCPU_TEST is not set
+# CONFIG_ATOMIC64_SELFTEST is not set
+# CONFIG_ASYNC_RAID6_TEST is not set
+# CONFIG_TEST_HEXDUMP is not set
+# CONFIG_STRING_SELFTEST is not set
+# CONFIG_TEST_STRING_HELPERS is not set
+# CONFIG_TEST_STRSCPY is not set
+# CONFIG_TEST_KSTRTOX is not set
+# CONFIG_TEST_PRINTF is not set
+# CONFIG_TEST_SCANF is not set
+# CONFIG_TEST_BITMAP is not set
+# CONFIG_TEST_UUID is not set
+# CONFIG_TEST_XARRAY is not set
+# CONFIG_TEST_MAPLE_TREE is not set
+# CONFIG_TEST_RHASHTABLE is not set
+# CONFIG_TEST_SIPHASH is not set
+# CONFIG_TEST_IDA is not set
+# CONFIG_TEST_LKM is not set
+# CONFIG_TEST_BITOPS is not set
+# CONFIG_TEST_VMALLOC is not set
+# CONFIG_TEST_USER_COPY is not set
+CONFIG_TEST_BPF=m
+# CONFIG_TEST_BLACKHOLE_DEV is not set
+# CONFIG_FIND_BIT_BENCHMARK is not set
+# CONFIG_TEST_FIRMWARE is not set
+# CONFIG_TEST_SYSCTL is not set
+# CONFIG_TEST_UDELAY is not set
+# CONFIG_TEST_STATIC_KEYS is not set
+# CONFIG_TEST_KMOD is not set
+# CONFIG_TEST_MEMCAT_P is not set
+# CONFIG_TEST_MEMINIT is not set
+# CONFIG_TEST_FREE_PAGES is not set
+# CONFIG_TEST_FPU is not set
+# CONFIG_TEST_CLOCKSOURCE_WATCHDOG is not set
+CONFIG_ARCH_USE_MEMTEST=y
+# CONFIG_MEMTEST is not set
+# CONFIG_HYPERV_TESTING is not set
+# end of Kernel Testing and Coverage
+
+#
+# Rust hacking
+#
+# end of Rust hacking
+# end of Kernel hacking
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 3216da7..b069d08 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -109,7 +109,16 @@
static inline u32 native_apic_mem_read(u32 reg)
{
- return *((volatile u32 *)(APIC_BASE + reg));
+ volatile u32 *addr = (volatile u32 *)(APIC_BASE + reg);
+ u32 out;
+ /*
+ * This access an MMIO address, which may need to be emulated in some
+ * cases. The emulator doesn't necessarily support all instructions, so
+ * we force the read from addr to use a mov instruction.
+ */
+ asm_inline("movl %1, %0" : "=r"(out): "m"(*addr));
+
+ return out;
}
extern void native_apic_wait_icr_idle(void);
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 233ae69..813213b 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -31,6 +31,8 @@
#define ARCH_EFI_IRQ_FLAGS_MASK X86_EFLAGS_IF
+#define EFI_UNACCEPTED_UNIT_SIZE PMD_SIZE
+
/*
* The EFI services are called through variadic functions in many cases. These
* functions are implemented in assembler and support only a fixed number of
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 908d99b..5061ac9 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -34,19 +34,6 @@
return addr;
}
-/*
- * When a ftrace registered caller is tracing a function that is
- * also set by a register_ftrace_direct() call, it needs to be
- * differentiated in the ftrace_caller trampoline. To do this, we
- * place the direct caller in the ORIG_AX part of pt_regs. This
- * tells the ftrace_caller that there's a direct caller.
- */
-static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
-{
- /* Emulate a call */
- regs->orig_ax = addr;
-}
-
#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
struct ftrace_regs {
struct pt_regs regs;
@@ -61,9 +48,25 @@
return &fregs->regs;
}
-#define ftrace_instruction_pointer_set(fregs, _ip) \
+#define ftrace_regs_set_instruction_pointer(fregs, _ip) \
do { (fregs)->regs.ip = (_ip); } while (0)
+#define ftrace_regs_get_instruction_pointer(fregs) \
+ ((fregs)->regs.ip)
+
+#define ftrace_regs_get_argument(fregs, n) \
+ regs_get_kernel_argument(&(fregs)->regs, n)
+#define ftrace_regs_get_stack_pointer(fregs) \
+ kernel_stack_pointer(&(fregs)->regs)
+#define ftrace_regs_return_value(fregs) \
+ regs_return_value(&(fregs)->regs)
+#define ftrace_regs_set_return_value(fregs, ret) \
+ regs_set_return_value(&(fregs)->regs, ret)
+#define ftrace_override_function_with_return(fregs) \
+ override_function_with_return(&(fregs)->regs)
+#define ftrace_regs_query_register_offset(name) \
+ regs_query_register_offset(name)
+
struct ftrace_ops;
#define ftrace_graph_func ftrace_graph_func
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
@@ -72,6 +75,24 @@
#define FTRACE_GRAPH_TRAMP_ADDR FTRACE_GRAPH_ADDR
#endif
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+/*
+ * When a ftrace registered caller is tracing a function that is
+ * also set by a register_ftrace_direct() call, it needs to be
+ * differentiated in the ftrace_caller trampoline. To do this, we
+ * place the direct caller in the ORIG_AX part of pt_regs. This
+ * tells the ftrace_caller that there's a direct caller.
+ */
+static inline void
+__arch_ftrace_set_direct_caller(struct pt_regs *regs, unsigned long addr)
+{
+ /* Emulate a call */
+ regs->orig_ax = addr;
+}
+#define arch_ftrace_set_direct_caller(fregs, addr) \
+ __arch_ftrace_set_direct_caller(&(fregs)->regs, addr)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
#ifdef CONFIG_DYNAMIC_FTRACE
struct dyn_arch_ftrace {
diff --git a/arch/x86/include/asm/insn-eval.h b/arch/x86/include/asm/insn-eval.h
index f07faa6..54368a4 100644
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -32,16 +32,16 @@
bool insn_decode_from_regs(struct insn *insn, struct pt_regs *regs,
unsigned char buf[MAX_INSN_SIZE], int buf_size);
-enum mmio_type {
- MMIO_DECODE_FAILED,
- MMIO_WRITE,
- MMIO_WRITE_IMM,
- MMIO_READ,
- MMIO_READ_ZERO_EXTEND,
- MMIO_READ_SIGN_EXTEND,
- MMIO_MOVS,
+enum insn_mmio_type {
+ INSN_MMIO_DECODE_FAILED,
+ INSN_MMIO_WRITE,
+ INSN_MMIO_WRITE_IMM,
+ INSN_MMIO_READ,
+ INSN_MMIO_READ_ZERO_EXTEND,
+ INSN_MMIO_READ_SIGN_EXTEND,
+ INSN_MMIO_MOVS,
};
-enum mmio_type insn_decode_mmio(struct insn *insn, int *bytes);
+enum insn_mmio_type insn_decode_mmio(struct insn *insn, int *bytes);
#endif /* _ASM_X86_INSN_EVAL_H */
diff --git a/arch/x86/include/asm/linkage.h b/arch/x86/include/asm/linkage.h
index 3a0282a..de507d9 100644
--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -22,10 +22,8 @@
#ifdef __ASSEMBLY__
-#if defined(CONFIG_X86_64) || defined(CONFIG_X86_ALIGNMENT_16)
-#define __ALIGN .p2align 4, 0x90
+#define __ALIGN .balign CONFIG_FUNCTION_ALIGNMENT, 0x90;
#define __ALIGN_STR __stringify(__ALIGN)
-#endif
#if defined(CONFIG_RETHUNK) && !defined(__DISABLE_EXPORTS) && !defined(BUILD_VDSO)
#define RET jmp __x86_return_thunk
diff --git a/arch/x86/include/asm/pci-direct.h b/arch/x86/include/asm/pci-direct.h
index 94597a3..ebb3db2 100644
--- a/arch/x86/include/asm/pci-direct.h
+++ b/arch/x86/include/asm/pci-direct.h
@@ -10,9 +10,11 @@
extern u32 read_pci_config(u8 bus, u8 slot, u8 func, u8 offset);
extern u8 read_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset);
extern u16 read_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset);
+extern u32 pci_early_find_cap(int bus, int slot, int func, int cap);
extern void write_pci_config(u8 bus, u8 slot, u8 func, u8 offset, u32 val);
extern void write_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset, u8 val);
extern void write_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset, u16 val);
+extern unsigned int pci_early_clear_msi;
extern int early_pci_allowed(void);
#endif /* _ASM_X86_PCI_DIRECT_H */
diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index e53f262..19228be 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -7,12 +7,23 @@
#define TDX_HYPERCALL_STANDARD 0
-#define TDX_HCALL_HAS_OUTPUT BIT(0)
-#define TDX_HCALL_ISSUE_STI BIT(1)
-
#define TDX_CPUID_LEAF_ID 0x21
#define TDX_IDENT "IntelTDX "
+/* TDX module Call Leaf IDs */
+#define TDX_GET_INFO 1
+#define TDX_GET_VEINFO 3
+#define TDX_GET_REPORT 4
+#define TDX_ACCEPT_PAGE 6
+#define TDX_WR 8
+
+/* TDCS fields. To be used by TDG.VM.WR and TDG.VM.RD module calls */
+#define TDCS_NOTIFY_ENABLES 0x9100000000000010
+
+/* TDX hypercall Leaf IDs */
+#define TDVMCALL_MAP_GPA 0x10001
+#define TDVMCALL_REPORT_FATAL_ERROR 0x10003
+
#ifndef __ASSEMBLY__
/*
@@ -22,19 +33,65 @@
* This is a software only structure and not part of the TDX module/VMM ABI.
*/
struct tdx_hypercall_args {
+ u64 r8;
+ u64 r9;
u64 r10;
u64 r11;
u64 r12;
u64 r13;
u64 r14;
u64 r15;
+ u64 rdi;
+ u64 rsi;
+ u64 rbx;
+ u64 rdx;
};
/* Used to request services from the VMM */
-u64 __tdx_hypercall(struct tdx_hypercall_args *args, unsigned long flags);
+u64 __tdx_hypercall(struct tdx_hypercall_args *args);
+u64 __tdx_hypercall_ret(struct tdx_hypercall_args *args);
+
+/*
+ * Wrapper for standard use of __tdx_hypercall with no output aside from
+ * return code.
+ */
+static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15)
+{
+ struct tdx_hypercall_args args = {
+ .r10 = TDX_HYPERCALL_STANDARD,
+ .r11 = fn,
+ .r12 = r12,
+ .r13 = r13,
+ .r14 = r14,
+ .r15 = r15,
+ };
+
+ return __tdx_hypercall(&args);
+}
+
/* Called from __tdx_hypercall() for unrecoverable failure */
void __tdx_hypercall_failed(void);
+/*
+ * Used in __tdx_module_call() to gather the output registers' values of the
+ * TDCALL instruction when requesting services from the TDX module. This is a
+ * software only structure and not part of the TDX module/VMM ABI
+ */
+struct tdx_module_output {
+ u64 rcx;
+ u64 rdx;
+ u64 r8;
+ u64 r9;
+ u64 r10;
+ u64 r11;
+};
+
+/* Used to communicate with the TDX module */
+u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
+ struct tdx_module_output *out);
+
+bool tdx_accept_memory(phys_addr_t start, phys_addr_t end);
+
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_X86_SHARED_TDX_H */
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 020c81a..603e6d1 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -5,6 +5,8 @@
#include <linux/init.h>
#include <linux/bits.h>
+
+#include <asm/errno.h>
#include <asm/ptrace.h>
#include <asm/shared/tdx.h>
@@ -21,21 +23,6 @@
#ifndef __ASSEMBLY__
/*
- * Used to gather the output registers values of the TDCALL and SEAMCALL
- * instructions when requesting services from the TDX module.
- *
- * This is a software only structure and not part of the TDX module/VMM ABI.
- */
-struct tdx_module_output {
- u64 rcx;
- u64 rdx;
- u64 r8;
- u64 r9;
- u64 r10;
- u64 r11;
-};
-
-/*
* Used by the #VE exception handler to gather the #VE exception
* info from the TDX module. This is a software only structure
* and not part of the TDX module/VMM ABI.
@@ -55,10 +42,6 @@
void __init tdx_early_init(void);
-/* Used to communicate with the TDX module */
-u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
- struct tdx_module_output *out);
-
void tdx_get_ve_info(struct ve_info *ve);
bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve);
@@ -67,6 +50,8 @@
bool tdx_early_handle_ve(struct pt_regs *regs);
+int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
+
#else
static inline void tdx_early_init(void) { };
diff --git a/arch/x86/include/asm/unaccepted_memory.h b/arch/x86/include/asm/unaccepted_memory.h
new file mode 100644
index 0000000..572514e
--- /dev/null
+++ b/arch/x86/include/asm/unaccepted_memory.h
@@ -0,0 +1,24 @@
+#ifndef _ASM_X86_UNACCEPTED_MEMORY_H
+#define _ASM_X86_UNACCEPTED_MEMORY_H
+
+#include <linux/efi.h>
+#include <asm/tdx.h>
+
+static inline void arch_accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ /* Platform-specific memory-acceptance call goes here */
+ if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
+ if (!tdx_accept_memory(start, end))
+ panic("TDX: Failed to accept memory\n");
+ } else {
+ panic("Cannot accept memory: unknown platform\n");
+ }
+}
+
+static inline struct efi_unaccepted_memory *efi_get_unaccepted_table(void)
+{
+ if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
+ return NULL;
+ return __va(efi.unaccepted);
+}
+#endif
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 4feaa67..dd7c987 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -136,32 +136,6 @@
}
-/* Find a PCI capability */
-static u32 __init find_cap(int bus, int slot, int func, int cap)
-{
- int bytes;
- u8 pos;
-
- if (!(read_pci_config_16(bus, slot, func, PCI_STATUS) &
- PCI_STATUS_CAP_LIST))
- return 0;
-
- pos = read_pci_config_byte(bus, slot, func, PCI_CAPABILITY_LIST);
- for (bytes = 0; bytes < 48 && pos >= 0x40; bytes++) {
- u8 id;
-
- pos &= ~3;
- id = read_pci_config_byte(bus, slot, func, pos+PCI_CAP_LIST_ID);
- if (id == 0xff)
- break;
- if (id == cap)
- return pos;
- pos = read_pci_config_byte(bus, slot, func,
- pos+PCI_CAP_LIST_NEXT);
- }
- return 0;
-}
-
/* Read a standard AGPv3 bridge header */
static u32 __init read_agp(int bus, int slot, int func, int cap, u32 *order)
{
@@ -250,8 +224,8 @@
case PCI_CLASS_BRIDGE_HOST:
case PCI_CLASS_BRIDGE_OTHER: /* needed? */
/* AGP bridge? */
- cap = find_cap(bus, slot, func,
- PCI_CAP_ID_AGP);
+ cap = pci_early_find_cap(bus, slot,
+ func, PCI_CAP_ID_AGP);
if (!cap)
break;
*valid_agp = 1;
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 4373080..9f09947 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -75,12 +75,18 @@
OFFSET(TDX_MODULE_r11, tdx_module_output, r11);
BLANK();
+ OFFSET(TDX_HYPERCALL_r8, tdx_hypercall_args, r8);
+ OFFSET(TDX_HYPERCALL_r9, tdx_hypercall_args, r9);
OFFSET(TDX_HYPERCALL_r10, tdx_hypercall_args, r10);
OFFSET(TDX_HYPERCALL_r11, tdx_hypercall_args, r11);
OFFSET(TDX_HYPERCALL_r12, tdx_hypercall_args, r12);
OFFSET(TDX_HYPERCALL_r13, tdx_hypercall_args, r13);
OFFSET(TDX_HYPERCALL_r14, tdx_hypercall_args, r14);
OFFSET(TDX_HYPERCALL_r15, tdx_hypercall_args, r15);
+ OFFSET(TDX_HYPERCALL_rdi, tdx_hypercall_args, rdi);
+ OFFSET(TDX_HYPERCALL_rsi, tdx_hypercall_args, rsi);
+ OFFSET(TDX_HYPERCALL_rbx, tdx_hypercall_args, rbx);
+ OFFSET(TDX_HYPERCALL_rdx, tdx_hypercall_args, rdx);
BLANK();
OFFSET(BP_scratch, boot_params, scratch);
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index a6c1867..c749acd 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -29,6 +29,37 @@
#include <asm/irq_remapping.h>
#include <asm/early_ioremap.h>
+static void __init early_pci_clear_msi(int bus, int slot, int func)
+{
+ int pos;
+ u16 ctrl;
+
+ if (likely(!pci_early_clear_msi))
+ return;
+
+ pr_info_once("Clearing MSI/MSI-X enable bits early in boot (quirk)\n");
+
+ pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSI);
+ if (pos) {
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS);
+ ctrl &= ~PCI_MSI_FLAGS_ENABLE;
+ write_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS, ctrl);
+
+ /* Read again to flush previous write */
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSI_FLAGS);
+ }
+
+ pos = pci_early_find_cap(bus, slot, func, PCI_CAP_ID_MSIX);
+ if (pos) {
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS);
+ ctrl &= ~PCI_MSIX_FLAGS_ENABLE;
+ write_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS, ctrl);
+
+ /* Read again to flush previous write */
+ ctrl = read_pci_config_16(bus, slot, func, pos + PCI_MSIX_FLAGS);
+ }
+}
+
static void __init fix_hypertransport_config(int num, int slot, int func)
{
u32 htcfg;
@@ -728,6 +759,7 @@
PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, 0, force_disable_hpet},
{ PCI_VENDOR_ID_BROADCOM, 0x4331,
PCI_CLASS_NETWORK_OTHER, PCI_ANY_ID, 0, apple_airport_reset},
+ { PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0, early_pci_clear_msi},
{}
};
@@ -750,7 +782,6 @@
u16 vendor;
u16 device;
u8 type;
- u8 sec;
int i;
class = read_pci_config_16(num, slot, func, PCI_CLASS_DEVICE);
@@ -779,11 +810,8 @@
type = read_pci_config_byte(num, slot, func,
PCI_HEADER_TYPE);
- if ((type & 0x7f) == PCI_HEADER_TYPE_BRIDGE) {
- sec = read_pci_config_byte(num, slot, func, PCI_SECONDARY_BUS);
- if (sec > num)
- early_pci_scan_bus(sec);
- }
+ if ((type & 0x7f) == PCI_HEADER_TYPE_BRIDGE)
+ return -1;
if (!(type & 0x80))
return -1;
@@ -806,8 +834,10 @@
void __init early_quirks(void)
{
+ int bus;
if (!early_pci_allowed())
return;
- early_pci_scan_bus(0);
+ for (bus = 0; bus < 256; bus++)
+ early_pci_scan_bus(bus);
}
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index a0ed0e4..1803698 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -163,6 +163,10 @@
jmp .Lftrace_ret
SYM_CODE_END(ftrace_regs_caller)
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+ RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_CODE_START(ftrace_graph_caller)
pushl %eax
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 6233c5b..244d97d 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -282,6 +282,9 @@
SYM_FUNC_END(ftrace_regs_caller)
STACK_FRAME_NON_STANDARD_FP(ftrace_regs_caller)
+SYM_FUNC_START(ftrace_stub_direct_tramp)
+ RET
+SYM_FUNC_END(ftrace_stub_direct_tramp)
#else /* ! CONFIG_DYNAMIC_FTRACE */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 804a252..933c5b7 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -575,6 +575,10 @@
/* 0 means: find the address automatically */
if (!crash_base) {
+ unsigned long long max_addr = high ? CRASH_ADDR_HIGH_MAX
+ : CRASH_ADDR_LOW_MAX;
+ unsigned long long base = CRASH_ALIGN;
+
/*
* Set CRASH_ADDR_LOW_MAX upper bound for crash memory,
* crashkernel=x,high reserves memory over 4G, also allocates
@@ -582,15 +586,14 @@
* But the extra memory is not required for all machines.
* So try low memory first and fall back to high memory
* unless "crashkernel=size[KMG],high" is specified.
+ * To conserve memory in crash-capture kernel try
+ * to allocate crash_base at the lowest address possible.
*/
- if (!high)
+ do {
crash_base = memblock_phys_alloc_range(crash_size,
- CRASH_ALIGN, CRASH_ALIGN,
- CRASH_ADDR_LOW_MAX);
- if (!crash_base)
- crash_base = memblock_phys_alloc_range(crash_size,
- CRASH_ALIGN, CRASH_ALIGN,
- CRASH_ADDR_HIGH_MAX);
+ CRASH_ALIGN, base, base + crash_size);
+ base += CRASH_ALIGN;
+ } while (!crash_base && base + crash_size <= max_addr);
if (!crash_base) {
pr_info("crashkernel reservation failed - No suitable area found.\n");
return;
diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c
index 71d8698..bd40fae 100644
--- a/arch/x86/kernel/sev-shared.c
+++ b/arch/x86/kernel/sev-shared.c
@@ -63,6 +63,23 @@
*/
static u16 ghcb_version __ro_after_init;
+/*
+ * This may be called early while still running on the initial identity
+ * mapping. Use RIP-relative addressing to obtain the correct address
+ * while running with the initial identity mapping as well as the
+ * switch-over to kernel virtual addresses later.
+ */
+static u16 *get_ghcb_version_ptr(void)
+{
+ void *ptr;
+
+ asm ("lea ghcb_version(%%rip), %0"
+ : "=r" (ptr)
+ : "p" (&ghcb_version));
+
+ return (u16 *)ptr;
+}
+
/* Copy of the SNP firmware's CPUID page. */
static struct snp_cpuid_table cpuid_table_copy __ro_after_init;
@@ -72,9 +89,13 @@
* invalid/out-of-range leaves. This is needed since all-zero leaves
* still need to be post-processed.
*/
-static u32 cpuid_std_range_max __ro_after_init;
-static u32 cpuid_hyp_range_max __ro_after_init;
-static u32 cpuid_ext_range_max __ro_after_init;
+struct cpuid_maxes {
+ u32 std_range;
+ u32 hyp_range;
+ u32 ext_range;
+};
+
+static struct cpuid_maxes cpuid_range_maxes __ro_after_init;
static bool __init sev_es_check_cpu_features(void)
{
@@ -108,7 +129,7 @@
{
u64 val;
- if (ghcb_version < 2)
+ if (*get_ghcb_version_ptr() < 2)
return 0;
sev_es_wr_ghcb_msr(GHCB_MSR_HV_FT_REQ);
@@ -153,7 +174,7 @@
GHCB_MSR_PROTO_MIN(val) > GHCB_PROTOCOL_MAX)
return false;
- ghcb_version = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
+ *get_ghcb_version_ptr() = min_t(size_t, GHCB_MSR_PROTO_MAX(val), GHCB_PROTOCOL_MAX);
return true;
}
@@ -332,6 +353,17 @@
return ptr;
}
+static const struct cpuid_maxes *snp_cpuid_get_maxes(void)
+{
+ void *ptr;
+
+ asm ("lea cpuid_range_maxes(%%rip), %0"
+ : "=r" (ptr)
+ : "p" (&cpuid_range_maxes));
+
+ return ptr;
+}
+
/*
* The SNP Firmware ABI, Revision 0.9, Section 7.1, details the use of
* XCR0_IN and XSS_IN to encode multiple versions of 0xD subfunctions 0
@@ -528,6 +560,7 @@
static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
{
const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
+ const struct cpuid_maxes *maxes = snp_cpuid_get_maxes();
if (!cpuid_table->count)
return -EOPNOTSUPP;
@@ -553,9 +586,9 @@
leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0;
/* Skip post-processing for out-of-range zero leafs. */
- if (!(leaf->fn <= cpuid_std_range_max ||
- (leaf->fn >= 0x40000000 && leaf->fn <= cpuid_hyp_range_max) ||
- (leaf->fn >= 0x80000000 && leaf->fn <= cpuid_ext_range_max)))
+ if (!(leaf->fn <= maxes->std_range ||
+ (leaf->fn >= 0x40000000 && leaf->fn <= maxes->hyp_range) ||
+ (leaf->fn >= 0x80000000 && leaf->fn <= maxes->ext_range)))
return 0;
}
@@ -1043,6 +1076,8 @@
static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info)
{
const struct snp_cpuid_table *cpuid_table_fw, *cpuid_table;
+ const struct cpuid_maxes *range_maxes;
+ struct cpuid_maxes local;
int i;
if (!cc_info || !cc_info->cpuid_phys || cc_info->cpuid_len < PAGE_SIZE)
@@ -1053,6 +1088,7 @@
sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_CPUID);
cpuid_table = snp_cpuid_get_table();
+ range_maxes = snp_cpuid_get_maxes();
memcpy((void *)cpuid_table, cpuid_table_fw, sizeof(*cpuid_table));
/* Initialize CPUID ranges for range-checking. */
@@ -1060,10 +1096,11 @@
const struct snp_cpuid_fn *fn = &cpuid_table->fn[i];
if (fn->eax_in == 0x0)
- cpuid_std_range_max = fn->eax;
+ local.std_range = fn->eax;
else if (fn->eax_in == 0x40000000)
- cpuid_hyp_range_max = fn->eax;
+ local.hyp_range = fn->eax;
else if (fn->eax_in == 0x80000000)
- cpuid_ext_range_max = fn->eax;
+ local.ext_range = fn->eax;
}
+ memcpy((void *)range_maxes, &local, sizeof(*range_maxes));
}
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c8dfb0f..f87d478 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1568,17 +1568,17 @@
static enum es_result vc_handle_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt)
{
struct insn *insn = &ctxt->insn;
+ enum insn_mmio_type mmio;
unsigned int bytes = 0;
- enum mmio_type mmio;
enum es_result ret;
u8 sign_byte;
long *reg_data;
mmio = insn_decode_mmio(insn, &bytes);
- if (mmio == MMIO_DECODE_FAILED)
+ if (mmio == INSN_MMIO_DECODE_FAILED)
return ES_DECODE_FAILED;
- if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
+ if (mmio != INSN_MMIO_WRITE_IMM && mmio != INSN_MMIO_MOVS) {
reg_data = insn_get_modrm_reg_ptr(insn, ctxt->regs);
if (!reg_data)
return ES_DECODE_FAILED;
@@ -1588,15 +1588,15 @@
return ES_UNSUPPORTED;
switch (mmio) {
- case MMIO_WRITE:
+ case INSN_MMIO_WRITE:
memcpy(ghcb->shared_buffer, reg_data, bytes);
ret = vc_do_mmio(ghcb, ctxt, bytes, false);
break;
- case MMIO_WRITE_IMM:
+ case INSN_MMIO_WRITE_IMM:
memcpy(ghcb->shared_buffer, insn->immediate1.bytes, bytes);
ret = vc_do_mmio(ghcb, ctxt, bytes, false);
break;
- case MMIO_READ:
+ case INSN_MMIO_READ:
ret = vc_do_mmio(ghcb, ctxt, bytes, true);
if (ret)
break;
@@ -1607,7 +1607,7 @@
memcpy(reg_data, ghcb->shared_buffer, bytes);
break;
- case MMIO_READ_ZERO_EXTEND:
+ case INSN_MMIO_READ_ZERO_EXTEND:
ret = vc_do_mmio(ghcb, ctxt, bytes, true);
if (ret)
break;
@@ -1616,7 +1616,7 @@
memset(reg_data, 0, insn->opnd_bytes);
memcpy(reg_data, ghcb->shared_buffer, bytes);
break;
- case MMIO_READ_SIGN_EXTEND:
+ case INSN_MMIO_READ_SIGN_EXTEND:
ret = vc_do_mmio(ghcb, ctxt, bytes, true);
if (ret)
break;
@@ -1635,7 +1635,7 @@
memset(reg_data, sign_byte, insn->opnd_bytes);
memcpy(reg_data, ghcb->shared_buffer, bytes);
break;
- case MMIO_MOVS:
+ case INSN_MMIO_MOVS:
ret = vc_handle_mmio_movs(ctxt, bytes);
break;
default:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 98d732b..d4189c8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1866,12 +1866,42 @@
return nested && guest_cpuid_has(vcpu, X86_FEATURE_VMX);
}
-static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu,
- uint64_t val)
-{
- uint64_t valid_bits = to_vmx(vcpu)->msr_ia32_feature_control_valid_bits;
+/*
+ * Userspace is allowed to set any supported IA32_FEATURE_CONTROL regardless of
+ * guest CPUID. Note, KVM allows userspace to set "VMX in SMX" to maintain
+ * backwards compatibility even though KVM doesn't support emulating SMX. And
+ * because userspace set "VMX in SMX", the guest must also be allowed to set it,
+ * e.g. if the MSR is left unlocked and the guest does a RMW operation.
+ */
+#define KVM_SUPPORTED_FEATURE_CONTROL (FEAT_CTL_LOCKED | \
+ FEAT_CTL_VMX_ENABLED_INSIDE_SMX | \
+ FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX | \
+ FEAT_CTL_SGX_LC_ENABLED | \
+ FEAT_CTL_SGX_ENABLED | \
+ FEAT_CTL_LMCE_ENABLED)
- return !(val & ~valid_bits);
+static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
+ struct msr_data *msr)
+{
+ uint64_t valid_bits;
+
+ /*
+ * Ensure KVM_SUPPORTED_FEATURE_CONTROL is updated when new bits are
+ * exposed to the guest.
+ */
+ WARN_ON_ONCE(vmx->msr_ia32_feature_control_valid_bits &
+ ~KVM_SUPPORTED_FEATURE_CONTROL);
+
+ if (!msr->host_initiated &&
+ (vmx->msr_ia32_feature_control & FEAT_CTL_LOCKED))
+ return false;
+
+ if (msr->host_initiated)
+ valid_bits = KVM_SUPPORTED_FEATURE_CONTROL;
+ else
+ valid_bits = vmx->msr_ia32_feature_control_valid_bits;
+
+ return !(msr->data & ~valid_bits);
}
static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
@@ -2273,10 +2303,9 @@
vcpu->arch.mcg_ext_ctl = data;
break;
case MSR_IA32_FEAT_CTL:
- if (!vmx_feature_control_msr_valid(vcpu, data) ||
- (to_vmx(vcpu)->msr_ia32_feature_control &
- FEAT_CTL_LOCKED && !msr_info->host_initiated))
+ if (!is_vmx_feature_control_msr_valid(vmx, msr_info))
return 1;
+
vmx->msr_ia32_feature_control = data;
if (msr_info->host_initiated && data == 0)
vmx_leave_nested(vcpu);
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 21104c4..558a605 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1595,16 +1595,16 @@
* Returns:
*
* Type of the instruction. Size of the memory operand is stored in
- * @bytes. If decode failed, MMIO_DECODE_FAILED returned.
+ * @bytes. If decode failed, INSN_MMIO_DECODE_FAILED returned.
*/
-enum mmio_type insn_decode_mmio(struct insn *insn, int *bytes)
+enum insn_mmio_type insn_decode_mmio(struct insn *insn, int *bytes)
{
- enum mmio_type type = MMIO_DECODE_FAILED;
+ enum insn_mmio_type type = INSN_MMIO_DECODE_FAILED;
*bytes = 0;
if (insn_get_opcode(insn))
- return MMIO_DECODE_FAILED;
+ return INSN_MMIO_DECODE_FAILED;
switch (insn->opcode.bytes[0]) {
case 0x88: /* MOV m8,r8 */
@@ -1613,7 +1613,7 @@
case 0x89: /* MOV m16/m32/m64, r16/m32/m64 */
if (!*bytes)
*bytes = insn->opnd_bytes;
- type = MMIO_WRITE;
+ type = INSN_MMIO_WRITE;
break;
case 0xc6: /* MOV m8, imm8 */
@@ -1622,7 +1622,7 @@
case 0xc7: /* MOV m16/m32/m64, imm16/imm32/imm64 */
if (!*bytes)
*bytes = insn->opnd_bytes;
- type = MMIO_WRITE_IMM;
+ type = INSN_MMIO_WRITE_IMM;
break;
case 0x8a: /* MOV r8, m8 */
@@ -1631,7 +1631,7 @@
case 0x8b: /* MOV r16/r32/r64, m16/m32/m64 */
if (!*bytes)
*bytes = insn->opnd_bytes;
- type = MMIO_READ;
+ type = INSN_MMIO_READ;
break;
case 0xa4: /* MOVS m8, m8 */
@@ -1640,7 +1640,7 @@
case 0xa5: /* MOVS m16/m32/m64, m16/m32/m64 */
if (!*bytes)
*bytes = insn->opnd_bytes;
- type = MMIO_MOVS;
+ type = INSN_MMIO_MOVS;
break;
case 0x0f: /* Two-byte instruction */
@@ -1651,7 +1651,7 @@
case 0xb7: /* MOVZX r32/r64, m16 */
if (!*bytes)
*bytes = 2;
- type = MMIO_READ_ZERO_EXTEND;
+ type = INSN_MMIO_READ_ZERO_EXTEND;
break;
case 0xbe: /* MOVSX r16/r32/r64, m8 */
@@ -1660,7 +1660,7 @@
case 0xbf: /* MOVSX r32/r64, m16 */
if (!*bytes)
*bytes = 2;
- type = MMIO_READ_SIGN_EXTEND;
+ type = INSN_MMIO_READ_SIGN_EXTEND;
break;
}
break;
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index ddb7986..618b01d 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -34,6 +34,7 @@
#endif
int pcibios_last_bus = -1;
unsigned long pirq_table_addr;
+unsigned int pci_early_clear_msi;
const struct pci_raw_ops *__read_mostly raw_pci_ops;
const struct pci_raw_ops *__read_mostly raw_pci_ext_ops;
@@ -614,6 +615,9 @@
} else if (!strcmp(str, "skip_isa_align")) {
pci_probe |= PCI_CAN_SKIP_ISA_ALIGN;
return NULL;
+ } else if (!strcmp(str, "clearmsi")) {
+ pci_early_clear_msi = 1;
+ return NULL;
} else if (!strcmp(str, "noioapicquirk")) {
noioapicquirk = 1;
return NULL;
diff --git a/arch/x86/pci/early.c b/arch/x86/pci/early.c
index f5fc953..f1ba9d7 100644
--- a/arch/x86/pci/early.c
+++ b/arch/x86/pci/early.c
@@ -51,6 +51,31 @@
outw(val, 0xcfc + (offset&2));
}
+u32 pci_early_find_cap(int bus, int slot, int func, int cap)
+{
+ int bytes;
+ u8 pos;
+
+ if (!(read_pci_config_16(bus, slot, func, PCI_STATUS) &
+ PCI_STATUS_CAP_LIST))
+ return 0;
+
+ pos = read_pci_config_byte(bus, slot, func, PCI_CAPABILITY_LIST);
+ for (bytes = 0; bytes < 48 && pos >= 0x40; bytes++) {
+ u8 id;
+
+ pos &= ~3;
+ id = read_pci_config_byte(bus, slot, func, pos+PCI_CAP_LIST_ID);
+ if (id == 0xff)
+ break;
+ if (id == cap)
+ return pos;
+ pos = read_pci_config_byte(bus, slot, func,
+ pos+PCI_CAP_LIST_NEXT);
+ }
+ return 0;
+}
+
int early_pci_allowed(void)
{
return (pci_probe & (PCI_PROBE_CONF1|PCI_PROBE_NOEARLY)) ==
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ebc98a6..b486392 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -96,6 +96,9 @@
#ifdef CONFIG_EFI_COCO_SECRET
&efi.coco_secret,
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ &efi.unaccepted,
+#endif
};
u64 efi_setup; /* efi setup_data physical address */
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 6f04b83..108bf9b 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -198,7 +198,7 @@
source "drivers/base/regmap/Kconfig"
config DMA_SHARED_BUFFER
- bool
+ bool "Buffer framework to be shared between drivers"
default n
select IRQ_WORK
help
diff --git a/drivers/base/node.c b/drivers/base/node.c
index a4141b5..5841aaf 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -449,6 +449,9 @@
"Node %d FileHugePages: %8lu kB\n"
"Node %d FilePmdMapped: %8lu kB\n"
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ "Node %d Unaccepted: %8lu kB\n"
+#endif
,
nid, K(node_page_state(pgdat, NR_FILE_DIRTY)),
nid, K(node_page_state(pgdat, NR_WRITEBACK)),
@@ -478,6 +481,10 @@
nid, K(node_page_state(pgdat, NR_FILE_THPS)),
nid, K(node_page_state(pgdat, NR_FILE_PMDMAPPED))
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ ,
+ nid, K(sum_zone_node_page_state(nid, NR_UNACCEPTED))
+#endif
);
len += hugetlb_report_node_meminfo(buf, len, nid);
return len;
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 3d58514..cd15d5c 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -22,10 +22,12 @@
#include <linux/module.h>
#include <linux/seq_file.h>
#include <linux/sync_file.h>
+#include <linux/pci.h>
#include <linux/poll.h>
#include <linux/dma-resv.h>
#include <linux/mm.h>
#include <linux/mount.h>
+#include <linux/netdevice.h>
#include <linux/pseudo_fs.h>
#include <uapi/linux/dma-buf.h>
@@ -437,12 +439,16 @@
}
#endif
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info);
+
static long dma_buf_ioctl(struct file *file,
unsigned int cmd, unsigned long arg)
{
struct dma_buf *dmabuf;
struct dma_buf_sync sync;
enum dma_data_direction direction;
+ struct dma_buf_create_pages_info create_info;
int ret;
dmabuf = file->private_data;
@@ -479,6 +485,12 @@
case DMA_BUF_SET_NAME_A:
case DMA_BUF_SET_NAME_B:
return dma_buf_set_name(dmabuf, (const char __user *)arg);
+ case DMA_BUF_CREATE_PAGES:
+ if (copy_from_user(&create_info, (void __user *)arg,
+ sizeof(create_info))) {
+ return -EFAULT;
+ }
+ return dma_buf_create_pages(file, &create_info);
#if IS_ENABLED(CONFIG_SYNC_FILE)
case DMA_BUF_IOCTL_EXPORT_SYNC_FILE:
@@ -1507,6 +1519,374 @@
}
EXPORT_SYMBOL_NS_GPL(dma_buf_vunmap, DMA_BUF);
+static DEFINE_MUTEX(bind_rx_queue_mutex);
+
+static int dma_buf_pages_release(struct inode *inode, struct file *file)
+{
+ struct dma_buf_pages_file_priv *priv = file->private_data;
+ struct netdev_rx_queue *rxq;
+ struct file *old_pages;
+ unsigned long xa_idx;
+ int i;
+
+ xa_for_each(&priv->bound_rxq_list, xa_idx, rxq) {
+ mutex_lock(&bind_rx_queue_mutex);
+ old_pages = rcu_dereference_protected(rxq->dmabuf_pages,
+ mutex_is_locked(&bind_rx_queue_mutex));
+ if (old_pages == file)
+ rcu_assign_pointer(rxq->dmabuf_pages, NULL);
+ mutex_unlock(&bind_rx_queue_mutex);
+ dev_put(rxq->dev);
+ }
+
+ if (priv->tx_bv)
+ for (i = 0; i < priv->num_pages; i++)
+ put_page(&priv->pages[i]);
+
+ dma_buf_unmap_attachment(priv->attachment, priv->sgt, priv->direction);
+ dma_buf_detach(priv->dmabuf, priv->attachment);
+ dma_buf_put(priv->dmabuf);
+ pci_dev_put(priv->pci_dev);
+
+ xa_destroy(&priv->bound_rxq_list);
+
+ percpu_ref_kill(&priv->pgmap.ref);
+ /* Drop initial ref after percpu_ref_kill(). */
+ percpu_ref_put(&priv->pgmap.ref);
+
+ return 0;
+}
+
+static int
+dma_buf_pages_bind_rx_queue(struct file *file,
+ struct dma_buf_pages_bind_rx_queue *bind_rx_queue)
+{
+ struct dma_buf_pages_file_priv *priv = file->private_data;
+ struct netdev_rx_queue *rxq;
+ struct net_device *netdev;
+ int xa_id;
+ int err;
+
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
+ if (!priv->page_pool)
+ return -ENOTTY;
+
+ bind_rx_queue->ifname[IFNAMSIZ - 1] = '\0';
+
+ netdev = dev_get_by_name(current->nsproxy->net_ns,
+ bind_rx_queue->ifname);
+ if (!netdev)
+ return -ENODEV;
+
+ if (!dev_is_pci(netdev->dev.parent)) {
+ err = -ENOTBLK;
+ goto out_put_dev;
+ }
+
+ if (to_pci_dev(netdev->dev.parent) != priv->pci_dev) {
+ err = -EXDEV;
+ goto out_put_dev;
+ }
+
+ if (bind_rx_queue->rxq_idx >= netdev->num_rx_queues) {
+ err = -ERANGE;
+ goto out_put_dev;
+ }
+
+ rxq = __netif_get_rx_queue(netdev, bind_rx_queue->rxq_idx);
+
+ err = xa_alloc(&priv->bound_rxq_list, &xa_id, rxq, xa_limit_32b,
+ GFP_KERNEL);
+ if (err)
+ goto out_put_dev;
+ mutex_lock(&bind_rx_queue_mutex);
+
+ /* The DMA_BUF_CREATE_PAGES ioctl that creates the input file does a
+ * dma_buf_attach(), which validates that the net_device we're trying to
+ * attach to can reach the dmabuf, so we don't need to check here as
+ * well.
+ */
+ rcu_assign_pointer(rxq->dmabuf_pages, file);
+
+ mutex_unlock(&bind_rx_queue_mutex);
+
+ return 0;
+out_put_dev:
+ dev_put(netdev);
+ return err;
+}
+
+static long dma_buf_pages_ioctl(struct file *file, unsigned int op,
+ unsigned long arg)
+{
+ struct dma_buf_pages_bind_rx_queue bind_rx_queue;
+ void *input_ptr = (void *)arg;
+
+ switch (op) {
+ case DMA_BUF_PAGES_BIND_RX:
+ if (copy_from_user(&bind_rx_queue, input_ptr,
+ sizeof(bind_rx_queue)))
+ return -EFAULT;
+ return dma_buf_pages_bind_rx_queue(file, &bind_rx_queue);
+ default:
+ return -EINVAL;
+ }
+}
+
+static void dma_buf_page_free(struct page *page)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct dev_pagemap *pgmap;
+ unsigned long addr;
+ ssize_t offset;
+
+ pgmap = page->pgmap;
+ priv = container_of(pgmap, struct dma_buf_pages_file_priv, pgmap);
+ offset = page - priv->pages;
+
+ if (WARN_ON_ONCE(offset < 0 || offset > priv->num_pages))
+ return;
+
+ /* Offset + 1 is due to the fact that we want to avoid 0 virt address
+ * returned from the gen_pool. The genpool returns 0 on error, and virt
+ * address 0 is indistinguishable from an error.
+ */
+ addr = (offset + 1) << PAGE_SHIFT;
+
+ if (priv->page_pool) {
+ /* page->private containers the order for dma buf pages. */
+ if (!WARN_ON_ONCE(!gen_pool_has_addr(priv->page_pool, addr,
+ PAGE_SIZE * (1 << page->private)))) {
+ gen_pool_free(priv->page_pool, addr,
+ PAGE_SIZE * (1 << page->private));
+ }
+
+ }
+ percpu_ref_put(&pgmap->ref);
+}
+
+const struct dev_pagemap_ops dma_buf_pgmap_ops = {
+ .page_free = dma_buf_page_free,
+};
+EXPORT_SYMBOL_GPL(dma_buf_pgmap_ops);
+
+const struct file_operations dma_buf_pages_fops = {
+ .unlocked_ioctl = dma_buf_pages_ioctl,
+ .release = dma_buf_pages_release,
+};
+EXPORT_SYMBOL_GPL(dma_buf_pages_fops);
+
+#ifdef CONFIG_ZONE_DEVICE
+static void dma_buf_pages_percpu_release(struct percpu_ref *ref)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct dev_pagemap *pgmap;
+
+ pgmap = container_of(ref, struct dev_pagemap, ref);
+ priv = container_of(pgmap, struct dma_buf_pages_file_priv, pgmap);
+
+ if (priv->tx_bv) {
+ kvfree(priv->tx_bv);
+ } else {
+ /* This can be a racy check, if another thread is releasing
+ * memory to the gen_pool. However, that should not happen, as
+ * the dma_buf_pages_percpu_release() being called indicates
+ * the there are no lingering refs to pages anymore
+ */
+ if (!WARN_ON_ONCE(gen_pool_size(priv->page_pool) !=
+ gen_pool_avail(priv->page_pool))) {
+ gen_pool_destroy(priv->page_pool);
+ }
+ }
+
+ kvfree(priv->pages);
+ kfree(priv);
+}
+
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info)
+{
+ int err, fd, i, pg_idx;
+ struct scatterlist *sg;
+ struct dma_buf_pages_file_priv *priv;
+ struct file *new_file;
+
+ fd = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
+ if (fd < 0) {
+ err = fd;
+ goto out_err;
+ }
+
+ priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+ if (!priv) {
+ err = -ENOMEM;
+ goto out_put_fd;
+ }
+
+ priv->pgmap.type = MEMORY_DEVICE_PRIVATE;
+ priv->pgmap.ops = &dma_buf_pgmap_ops;
+ init_completion(&priv->pgmap.done);
+
+ /* This refcount is incremented everytime a page in priv->pages is
+ * allocated, and decremented everytime a page is freed. When
+ * it drops to 0, the priv struct can be freed. The priv struct
+ * is not freed until the initial reference acquired below is dropped.
+ */
+ err = percpu_ref_init(&priv->pgmap.ref, dma_buf_pages_percpu_release, 0,
+ GFP_KERNEL);
+ if (err)
+ goto out_free_priv;
+
+ /* Initial ref to be dropped after percpu_ref_kill(). */
+ percpu_ref_get(&priv->pgmap.ref);
+
+ priv->pci_dev = pci_get_domain_bus_and_slot(
+ 0, create_info->pci_bdf[0],
+ PCI_DEVFN(create_info->pci_bdf[1], create_info->pci_bdf[2]));
+ if (!priv->pci_dev) {
+ err = -ENODEV;
+ goto out_exit_percpu_ref;
+ }
+
+ priv->dmabuf = dma_buf_get(create_info->dma_buf_fd);
+ if (IS_ERR(priv->dmabuf)) {
+ err = PTR_ERR(priv->dmabuf);
+ goto out_put_pci_dev;
+ }
+
+ if (priv->dmabuf->size % PAGE_SIZE != 0) {
+ err = -EINVAL;
+ goto out_put_dma_buf;
+ }
+
+ priv->attachment = dma_buf_attach(priv->dmabuf, &priv->pci_dev->dev);
+ if (IS_ERR(priv->attachment)) {
+ err = PTR_ERR(priv->attachment);
+ goto out_put_dma_buf;
+ }
+
+ priv->num_pages = priv->dmabuf->size / PAGE_SIZE;
+ priv->pages = kvmalloc_array(priv->num_pages, sizeof(struct page),
+ GFP_KERNEL);
+ if (!priv->pages) {
+ err = -ENOMEM;
+ goto out_detach_dma_buf;
+ }
+
+ for (i = 0; i < priv->num_pages; i++) {
+ struct page *page = &priv->pages[i];
+
+ mm_zero_struct_page(page);
+ set_page_zone(page, ZONE_DEVICE);
+ set_page_count(page, 0);
+ page->pgmap = &priv->pgmap;
+ }
+
+ priv->direction = DMA_BIDIRECTIONAL;
+ priv->sgt = dma_buf_map_attachment(priv->attachment, priv->direction);
+ if (IS_ERR(priv->sgt)) {
+ err = PTR_ERR(priv->sgt);
+ goto out_free_pages;
+ }
+
+ /* Now write each dma address to each page */
+ pg_idx = 0;
+ for_each_sgtable_dma_sg(priv->sgt, sg, i) {
+ size_t len = sg_dma_len(sg);
+ dma_addr_t dma_addr = sg_dma_address(sg);
+
+ BUG_ON(!PAGE_ALIGNED(len));
+ while (len > 0) {
+ priv->pages[pg_idx].zone_device_data = (void *)dma_addr;
+ pg_idx++;
+ dma_addr += PAGE_SIZE;
+ len -= PAGE_SIZE;
+ }
+ }
+
+ if (create_info->create_page_pool != 0) {
+ priv->page_pool = gen_pool_create(
+ PAGE_SHIFT, dev_to_node(&priv->pci_dev->dev));
+ if (!priv->page_pool) {
+ err = -ENOMEM;
+ goto out_unmap_dma_buf;
+ }
+ /*
+ * We start with PAGE_SIZE instead of 0 since
+ * gen_pool_alloc_*() returns NULL when error
+ */
+ err = gen_pool_add_virt(priv->page_pool, PAGE_SIZE, 0,
+ PAGE_SIZE * priv->num_pages,
+ dev_to_node(&priv->pci_dev->dev));
+ if (err)
+ goto out_destroy_genpool;
+ xa_init_flags(&priv->bound_rxq_list, XA_FLAGS_ALLOC);
+ priv->tx_bv = NULL;
+ } else {
+ priv->page_pool = NULL;
+ priv->tx_bv = kvmalloc_array(priv->num_pages, sizeof(struct bio_vec),
+ GFP_KERNEL);
+ if (!priv->tx_bv) {
+ err = -ENOMEM;
+ goto out_unmap_dma_buf;
+ }
+ for (i = 0; i < priv->num_pages; i++) {
+ priv->tx_bv[i].bv_page = &priv->pages[i];
+ priv->tx_bv[i].bv_offset = 0;
+ priv->tx_bv[i].bv_len = PAGE_SIZE;
+ get_page(&priv->pages[i]);
+ }
+ percpu_ref_get_many(&priv->pgmap.ref, priv->num_pages);
+ iov_iter_bvec(&priv->tx_iter, WRITE, priv->tx_bv,
+ priv->num_pages, priv->dmabuf->size);
+ }
+
+ new_file = anon_inode_getfile("[dma_buf_pages]", &dma_buf_pages_fops,
+ (void *)priv, O_RDWR | O_CLOEXEC);
+ if (IS_ERR(new_file)) {
+ err = PTR_ERR(new_file);
+ goto out_destroy_genpool;
+ }
+
+ fd_install(fd, new_file);
+ return fd;
+
+out_destroy_genpool:
+ if (priv->page_pool) {
+ gen_pool_destroy(priv->page_pool);
+ } else {
+ kvfree(priv->tx_bv);
+ percpu_ref_put_many(&priv->pgmap.ref, priv->num_pages);
+ }
+out_unmap_dma_buf:
+ dma_buf_unmap_attachment(priv->attachment, priv->sgt, priv->direction);
+out_free_pages:
+ kvfree(priv->pages);
+out_detach_dma_buf:
+ dma_buf_detach(priv->dmabuf, priv->attachment);
+out_put_dma_buf:
+ dma_buf_put(priv->dmabuf);
+out_put_pci_dev:
+ pci_dev_put(priv->pci_dev);
+out_exit_percpu_ref:
+ percpu_ref_exit(&priv->pgmap.ref);
+out_free_priv:
+ kfree(priv);
+out_put_fd:
+ put_unused_fd(fd);
+out_err:
+ return err;
+}
+#else
+static long dma_buf_create_pages(struct file *file,
+ struct dma_buf_create_pages_info *create_info)
+{
+ return -ENOTSUPP;
+}
+#endif
+
#ifdef CONFIG_DEBUG_FS
static int dma_buf_debug_show(struct seq_file *s, void *unused)
{
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 6787ed8..8aa8adf 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -314,6 +314,20 @@
virt/coco/efi_secret module to access the secrets, which in turn
allows userspace programs to access the injected secrets.
+config UNACCEPTED_MEMORY
+ bool
+ depends on EFI_STUB
+ help
+ Some Virtual Machine platforms, such as Intel TDX, require
+ some memory to be "accepted" by the guest before it can be used.
+ This mechanism helps prevent malicious hosts from making changes
+ to guest memory.
+
+ UEFI specification v2.9 introduced EFI_UNACCEPTED_MEMORY memory type.
+
+ This option adds support for unaccepted memory and makes such memory
+ usable by the kernel.
+
config EFI_EMBEDDED_FIRMWARE
bool
select CRYPTO_LIB_SHA256
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 8d151e3..dd57d92 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -46,3 +46,4 @@
obj-$(CONFIG_EFI_EARLYCON) += earlycon.o
obj-$(CONFIG_UEFI_CPER_ARM) += cper-arm.o
obj-$(CONFIG_UEFI_CPER_X86) += cper-x86.o
+obj-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index b7c0e8c..8e90125 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -50,6 +50,9 @@
#ifdef CONFIG_EFI_COCO_SECRET
.coco_secret = EFI_INVALID_TABLE_ADDR,
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ .unaccepted = EFI_INVALID_TABLE_ADDR,
+#endif
};
EXPORT_SYMBOL(efi);
@@ -554,6 +557,9 @@
#ifdef CONFIG_EFI_COCO_SECRET
{LINUX_EFI_COCO_SECRET_AREA_GUID, &efi.coco_secret, "CocoSecret" },
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ {LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID, &efi.unaccepted, "Unaccepted" },
+#endif
{},
};
@@ -698,6 +704,25 @@
}
}
+ if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) &&
+ efi.unaccepted != EFI_INVALID_TABLE_ADDR) {
+ struct efi_unaccepted_memory *unaccepted;
+
+ unaccepted = early_memremap(efi.unaccepted, sizeof(*unaccepted));
+ if (unaccepted) {
+ unsigned long size;
+
+ if (unaccepted->version == 1) {
+ size = sizeof(*unaccepted) + unaccepted->size;
+ memblock_reserve(efi.unaccepted, size);
+ } else {
+ efi.unaccepted = EFI_INVALID_TABLE_ADDR;
+ }
+
+ early_memunmap(unaccepted, sizeof(*unaccepted));
+ }
+ }
+
return 0;
}
@@ -784,6 +809,7 @@
"MMIO Port",
"PAL Code",
"Persistent",
+ "Unaccepted",
};
char * __init efi_md_typeattr_format(char *buf, size_t size,
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index ef5045a..35e8e53 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -92,6 +92,8 @@
zboot-obj-$(CONFIG_RISCV) := lib-clz_ctz.o lib-ashldi3.o
lib-$(CONFIG_EFI_ZBOOT) += zboot.o $(zboot-obj-y)
+lib-$(CONFIG_UNACCEPTED_MEMORY) += unaccepted_memory.o bitmap.o find.o
+
extra-y := $(lib-y)
lib-y := $(patsubst %.o,%.stub.o,$(lib-y))
diff --git a/drivers/firmware/efi/libstub/bitmap.c b/drivers/firmware/efi/libstub/bitmap.c
new file mode 100644
index 0000000..5c9bba0
--- /dev/null
+++ b/drivers/firmware/efi/libstub/bitmap.c
@@ -0,0 +1,41 @@
+#include <linux/bitmap.h>
+
+void __bitmap_set(unsigned long *map, unsigned int start, int len)
+{
+ unsigned long *p = map + BIT_WORD(start);
+ const unsigned int size = start + len;
+ int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
+ unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+
+ while (len - bits_to_set >= 0) {
+ *p |= mask_to_set;
+ len -= bits_to_set;
+ bits_to_set = BITS_PER_LONG;
+ mask_to_set = ~0UL;
+ p++;
+ }
+ if (len) {
+ mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+ *p |= mask_to_set;
+ }
+}
+
+void __bitmap_clear(unsigned long *map, unsigned int start, int len)
+{
+ unsigned long *p = map + BIT_WORD(start);
+ const unsigned int size = start + len;
+ int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+ unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+
+ while (len - bits_to_clear >= 0) {
+ *p &= ~mask_to_clear;
+ len -= bits_to_clear;
+ bits_to_clear = BITS_PER_LONG;
+ mask_to_clear = ~0UL;
+ p++;
+ }
+ if (len) {
+ mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+ *p &= ~mask_to_clear;
+ }
+}
diff --git a/drivers/firmware/efi/libstub/efistub.h b/drivers/firmware/efi/libstub/efistub.h
index 970e86e..a3758c8 100644
--- a/drivers/firmware/efi/libstub/efistub.h
+++ b/drivers/firmware/efi/libstub/efistub.h
@@ -1040,4 +1040,10 @@
const u8 *__efi_get_smbios_string(const struct efi_smbios_record *record,
u8 type, int offset, int recsize);
+efi_status_t allocate_unaccepted_bitmap(__u32 nr_desc,
+ struct efi_boot_memmap *map);
+void process_unaccepted_memory(u64 start, u64 end);
+void accept_memory(phys_addr_t start, phys_addr_t end);
+void arch_accept_memory(phys_addr_t start, phys_addr_t end);
+
#endif
diff --git a/drivers/firmware/efi/libstub/find.c b/drivers/firmware/efi/libstub/find.c
new file mode 100644
index 0000000..4e7740d
--- /dev/null
+++ b/drivers/firmware/efi/libstub/find.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/bitmap.h>
+#include <linux/math.h>
+#include <linux/minmax.h>
+
+/*
+ * Common helper for find_next_bit() function family
+ * @FETCH: The expression that fetches and pre-processes each word of bitmap(s)
+ * @MUNGE: The expression that post-processes a word containing found bit (may be empty)
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ */
+#define FIND_NEXT_BIT(FETCH, MUNGE, size, start) \
+({ \
+ unsigned long mask, idx, tmp, sz = (size), __start = (start); \
+ \
+ if (unlikely(__start >= sz)) \
+ goto out; \
+ \
+ mask = MUNGE(BITMAP_FIRST_WORD_MASK(__start)); \
+ idx = __start / BITS_PER_LONG; \
+ \
+ for (tmp = (FETCH) & mask; !tmp; tmp = (FETCH)) { \
+ if ((idx + 1) * BITS_PER_LONG >= sz) \
+ goto out; \
+ idx++; \
+ } \
+ \
+ sz = min(idx * BITS_PER_LONG + __ffs(MUNGE(tmp)), sz); \
+out: \
+ sz; \
+})
+
+unsigned long _find_next_bit(const unsigned long *addr, unsigned long nbits, unsigned long start)
+{
+ return FIND_NEXT_BIT(addr[idx], /* nop */, nbits, start);
+}
+
+unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long nbits,
+ unsigned long start)
+{
+ return FIND_NEXT_BIT(~addr[idx], /* nop */, nbits, start);
+}
diff --git a/drivers/firmware/efi/libstub/unaccepted_memory.c b/drivers/firmware/efi/libstub/unaccepted_memory.c
new file mode 100644
index 0000000..ca61f47
--- /dev/null
+++ b/drivers/firmware/efi/libstub/unaccepted_memory.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+#include "efistub.h"
+
+struct efi_unaccepted_memory *unaccepted_table;
+
+efi_status_t allocate_unaccepted_bitmap(__u32 nr_desc,
+ struct efi_boot_memmap *map)
+{
+ efi_guid_t unaccepted_table_guid = LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID;
+ u64 unaccepted_start = ULLONG_MAX, unaccepted_end = 0, bitmap_size;
+ efi_status_t status;
+ int i;
+
+ /* Check if the table is already installed */
+ unaccepted_table = get_efi_config_table(unaccepted_table_guid);
+ if (unaccepted_table) {
+ if (unaccepted_table->version != 1) {
+ efi_err("Unknown version of unaccepted memory table\n");
+ return EFI_UNSUPPORTED;
+ }
+ return EFI_SUCCESS;
+ }
+
+ /* Check if there's any unaccepted memory and find the max address */
+ for (i = 0; i < nr_desc; i++) {
+ efi_memory_desc_t *d;
+ unsigned long m = (unsigned long)map->map;
+
+ d = efi_early_memdesc_ptr(m, map->desc_size, i);
+ if (d->type != EFI_UNACCEPTED_MEMORY)
+ continue;
+
+ unaccepted_start = min(unaccepted_start, d->phys_addr);
+ unaccepted_end = max(unaccepted_end,
+ d->phys_addr + d->num_pages * PAGE_SIZE);
+ }
+
+ if (unaccepted_start == ULLONG_MAX)
+ return EFI_SUCCESS;
+
+ unaccepted_start = round_down(unaccepted_start,
+ EFI_UNACCEPTED_UNIT_SIZE);
+ unaccepted_end = round_up(unaccepted_end, EFI_UNACCEPTED_UNIT_SIZE);
+
+ /*
+ * If unaccepted memory is present, allocate a bitmap to track what
+ * memory has to be accepted before access.
+ *
+ * One bit in the bitmap represents 2MiB in the address space:
+ * A 4k bitmap can track 64GiB of physical address space.
+ *
+ * In the worst case scenario -- a huge hole in the middle of the
+ * address space -- It needs 256MiB to handle 4PiB of the address
+ * space.
+ *
+ * The bitmap will be populated in setup_e820() according to the memory
+ * map after efi_exit_boot_services().
+ */
+ bitmap_size = DIV_ROUND_UP(unaccepted_end - unaccepted_start,
+ EFI_UNACCEPTED_UNIT_SIZE * BITS_PER_BYTE);
+
+ status = efi_bs_call(allocate_pool, EFI_LOADER_DATA,
+ sizeof(*unaccepted_table) + bitmap_size,
+ (void **)&unaccepted_table);
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to allocate unaccepted memory config table\n");
+ return status;
+ }
+
+ unaccepted_table->version = 1;
+ unaccepted_table->unit_size = EFI_UNACCEPTED_UNIT_SIZE;
+ unaccepted_table->phys_base = unaccepted_start;
+ unaccepted_table->size = bitmap_size;
+ memset(unaccepted_table->bitmap, 0, bitmap_size);
+
+ status = efi_bs_call(install_configuration_table,
+ &unaccepted_table_guid, unaccepted_table);
+ if (status != EFI_SUCCESS) {
+ efi_bs_call(free_pool, unaccepted_table);
+ efi_err("Failed to install unaccepted memory config table!\n");
+ }
+
+ return status;
+}
+
+/*
+ * The accepted memory bitmap only works at unit_size granularity. Take
+ * unaligned start/end addresses and either:
+ * 1. Accepts the memory immediately and in its entirety
+ * 2. Accepts unaligned parts, and marks *some* aligned part unaccepted
+ *
+ * The function will never reach the bitmap_set() with zero bits to set.
+ */
+void process_unaccepted_memory(u64 start, u64 end)
+{
+ u64 unit_size = unaccepted_table->unit_size;
+ u64 unit_mask = unaccepted_table->unit_size - 1;
+ u64 bitmap_size = unaccepted_table->size;
+
+ /*
+ * Ensure that at least one bit will be set in the bitmap by
+ * immediately accepting all regions under 2*unit_size. This is
+ * imprecise and may immediately accept some areas that could
+ * have been represented in the bitmap. But, results in simpler
+ * code below
+ *
+ * Consider case like this (assuming unit_size == 2MB):
+ *
+ * | 4k | 2044k | 2048k |
+ * ^ 0x0 ^ 2MB ^ 4MB
+ *
+ * Only the first 4k has been accepted. The 0MB->2MB region can not be
+ * represented in the bitmap. The 2MB->4MB region can be represented in
+ * the bitmap. But, the 0MB->4MB region is <2*unit_size and will be
+ * immediately accepted in its entirety.
+ */
+ if (end - start < 2 * unit_size) {
+ arch_accept_memory(start, end);
+ return;
+ }
+
+ /*
+ * No matter how the start and end are aligned, at least one unaccepted
+ * unit_size area will remain to be marked in the bitmap.
+ */
+
+ /* Immediately accept a <unit_size piece at the start: */
+ if (start & unit_mask) {
+ arch_accept_memory(start, round_up(start, unit_size));
+ start = round_up(start, unit_size);
+ }
+
+ /* Immediately accept a <unit_size piece at the end: */
+ if (end & unit_mask) {
+ arch_accept_memory(round_down(end, unit_size), end);
+ end = round_down(end, unit_size);
+ }
+
+ /*
+ * Accept part of the range that before phys_base and cannot be recorded
+ * into the bitmap.
+ */
+ if (start < unaccepted_table->phys_base) {
+ arch_accept_memory(start,
+ min(unaccepted_table->phys_base, end));
+ start = unaccepted_table->phys_base;
+ }
+
+ /* Nothing to record */
+ if (end < unaccepted_table->phys_base)
+ return;
+
+ /* Translate to offsets from the beginning of the bitmap */
+ start -= unaccepted_table->phys_base;
+ end -= unaccepted_table->phys_base;
+
+ /* Accept memory that doesn't fit into bitmap */
+ if (end > bitmap_size * unit_size * BITS_PER_BYTE) {
+ unsigned long phys_start, phys_end;
+
+ phys_start = bitmap_size * unit_size * BITS_PER_BYTE +
+ unaccepted_table->phys_base;
+ phys_end = end + unaccepted_table->phys_base;
+
+ arch_accept_memory(phys_start, phys_end);
+ end = bitmap_size * unit_size * BITS_PER_BYTE;
+ }
+
+ /*
+ * 'start' and 'end' are now both unit_size-aligned.
+ * Record the range as being unaccepted:
+ */
+ bitmap_set(unaccepted_table->bitmap,
+ start / unit_size, (end - start) / unit_size);
+}
+
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ unsigned long range_start, range_end;
+ unsigned long bitmap_size;
+ u64 unit_size;
+
+ if (!unaccepted_table)
+ return;
+
+ unit_size = unaccepted_table->unit_size;
+
+ /*
+ * Only care for the part of the range that is represented
+ * in the bitmap.
+ */
+ if (start < unaccepted_table->phys_base)
+ start = unaccepted_table->phys_base;
+ if (end < unaccepted_table->phys_base)
+ return;
+
+ /* Translate to offsets from the beginning of the bitmap */
+ start -= unaccepted_table->phys_base;
+ end -= unaccepted_table->phys_base;
+
+ /* Make sure not to overrun the bitmap */
+ if (end > unaccepted_table->size * unit_size * BITS_PER_BYTE)
+ end = unaccepted_table->size * unit_size * BITS_PER_BYTE;
+
+ range_start = start / unit_size;
+ bitmap_size = DIV_ROUND_UP(end, unit_size);
+
+ for_each_set_bitrange_from(range_start, range_end,
+ unaccepted_table->bitmap, bitmap_size) {
+ unsigned long phys_start, phys_end;
+
+ phys_start = range_start * unit_size + unaccepted_table->phys_base;
+ phys_end = range_end * unit_size + unaccepted_table->phys_base;
+
+ arch_accept_memory(phys_start, phys_end);
+ bitmap_clear(unaccepted_table->bitmap,
+ range_start, range_end - range_start);
+ }
+}
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 4f0152b..bfc7059 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -613,6 +613,16 @@
e820_type = E820_TYPE_PMEM;
break;
+ case EFI_UNACCEPTED_MEMORY:
+ if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY)) {
+ efi_warn_once(
+"The system has unaccepted memory, but kernel does not support it\nConsider enabling CONFIG_UNACCEPTED_MEMORY\n");
+ continue;
+ }
+ e820_type = E820_TYPE_RAM;
+ process_unaccepted_memory(d->phys_addr,
+ d->phys_addr + PAGE_SIZE * d->num_pages);
+ break;
default:
continue;
}
@@ -681,28 +691,27 @@
struct setup_data **e820ext,
u32 *e820ext_size)
{
- unsigned long map_size, desc_size, map_key;
+ struct efi_boot_memmap *map;
efi_status_t status;
- __u32 nr_desc, desc_version;
+ __u32 nr_desc;
- /* Only need the size of the mem map and size of each mem descriptor */
- map_size = 0;
- status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key,
- &desc_size, &desc_version);
- if (status != EFI_BUFFER_TOO_SMALL)
- return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED;
+ status = efi_get_memory_map(&map, false);
+ if (status != EFI_SUCCESS)
+ return status;
- nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS;
-
- if (nr_desc > ARRAY_SIZE(params->e820_table)) {
- u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
+ nr_desc = map->map_size / map->desc_size;
+ if (nr_desc > ARRAY_SIZE(params->e820_table) - EFI_MMAP_NR_SLACK_SLOTS) {
+ u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table) +
+ EFI_MMAP_NR_SLACK_SLOTS;
status = alloc_e820ext(nr_e820ext, e820ext, e820ext_size);
- if (status != EFI_SUCCESS)
- return status;
}
- return EFI_SUCCESS;
+ if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) && status == EFI_SUCCESS)
+ status = allocate_unaccepted_bitmap(nr_desc, map);
+
+ efi_bs_call(free_pool, map);
+ return status;
}
struct exit_boot_struct {
diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
new file mode 100644
index 0000000..853f7dc
--- /dev/null
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/efi.h>
+#include <linux/memblock.h>
+#include <linux/spinlock.h>
+#include <asm/unaccepted_memory.h>
+
+/* Protects unaccepted memory bitmap */
+static DEFINE_SPINLOCK(unaccepted_memory_lock);
+
+/*
+ * accept_memory() -- Consult bitmap and accept the memory if needed.
+ *
+ * Only memory that is explicitly marked as unaccepted in the bitmap requires
+ * an action. All the remaining memory is implicitly accepted and doesn't need
+ * acceptance.
+ *
+ * No need to accept:
+ * - anything if the system has no unaccepted table;
+ * - memory that is below phys_base;
+ * - memory that is above the memory that addressable by the bitmap;
+ */
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+ struct efi_unaccepted_memory *unaccepted;
+ unsigned long range_start, range_end;
+ unsigned long flags;
+ u64 unit_size;
+
+ unaccepted = efi_get_unaccepted_table();
+ if (!unaccepted)
+ return;
+
+ unit_size = unaccepted->unit_size;
+
+ /*
+ * Only care for the part of the range that is represented
+ * in the bitmap.
+ */
+ if (start < unaccepted->phys_base)
+ start = unaccepted->phys_base;
+ if (end < unaccepted->phys_base)
+ return;
+
+ /* Translate to offsets from the beginning of the bitmap */
+ start -= unaccepted->phys_base;
+ end -= unaccepted->phys_base;
+
+ /*
+ * load_unaligned_zeropad() can lead to unwanted loads across page
+ * boundaries. The unwanted loads are typically harmless. But, they
+ * might be made to totally unrelated or even unmapped memory.
+ * load_unaligned_zeropad() relies on exception fixup (#PF, #GP and now
+ * #VE) to recover from these unwanted loads.
+ *
+ * But, this approach does not work for unaccepted memory. For TDX, a
+ * load from unaccepted memory will not lead to a recoverable exception
+ * within the guest. The guest will exit to the VMM where the only
+ * recourse is to terminate the guest.
+ *
+ * There are two parts to fix this issue and comprehensively avoid
+ * access to unaccepted memory. Together these ensure that an extra
+ * "guard" page is accepted in addition to the memory that needs to be
+ * used:
+ *
+ * 1. Implicitly extend the range_contains_unaccepted_memory(start, end)
+ * checks up to end+unit_size if 'end' is aligned on a unit_size
+ * boundary.
+ *
+ * 2. Implicitly extend accept_memory(start, end) to end+unit_size if
+ * 'end' is aligned on a unit_size boundary. (immediately following
+ * this comment)
+ */
+ if (!(end % unit_size))
+ end += unit_size;
+
+ /* Make sure not to overrun the bitmap */
+ if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+ end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+ range_start = start / unit_size;
+
+ spin_lock_irqsave(&unaccepted_memory_lock, flags);
+ for_each_set_bitrange_from(range_start, range_end, unaccepted->bitmap,
+ DIV_ROUND_UP(end, unit_size)) {
+ unsigned long phys_start, phys_end;
+ unsigned long len = range_end - range_start;
+
+ phys_start = range_start * unit_size + unaccepted->phys_base;
+ phys_end = range_end * unit_size + unaccepted->phys_base;
+
+ arch_accept_memory(phys_start, phys_end);
+ bitmap_clear(unaccepted->bitmap, range_start, len);
+ }
+ spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+}
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
+{
+ struct efi_unaccepted_memory *unaccepted;
+ unsigned long flags;
+ bool ret = false;
+ u64 unit_size;
+
+ unaccepted = efi_get_unaccepted_table();
+ if (!unaccepted)
+ return false;
+
+ unit_size = unaccepted->unit_size;
+
+ /*
+ * Only care for the part of the range that is represented
+ * in the bitmap.
+ */
+ if (start < unaccepted->phys_base)
+ start = unaccepted->phys_base;
+ if (end < unaccepted->phys_base)
+ return false;
+
+ /* Translate to offsets from the beginning of the bitmap */
+ start -= unaccepted->phys_base;
+ end -= unaccepted->phys_base;
+
+ /*
+ * Also consider the unaccepted state of the *next* page. See fix #1 in
+ * the comment on load_unaligned_zeropad() in accept_memory().
+ */
+ if (!(end % unit_size))
+ end += unit_size;
+
+ /* Make sure not to overrun the bitmap */
+ if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+ end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+ spin_lock_irqsave(&unaccepted_memory_lock, flags);
+ while (start < end) {
+ if (test_bit(start / unit_size, unaccepted->bitmap)) {
+ ret = true;
+ break;
+ }
+
+ start += unit_size;
+ }
+ spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+
+ return ret;
+}
diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index 458149a..4394bbd 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -14,6 +14,7 @@
#include "gve_desc.h"
#include "gve_desc_dqo.h"
+#include "gve_register.h"
#ifndef PCI_VENDOR_ID_GOOGLE
#define PCI_VENDOR_ID_GOOGLE 0x1ae0
@@ -42,13 +43,45 @@
#define GVE_DATA_SLOT_ADDR_PAGE_MASK (~(PAGE_SIZE - 1))
+// TX timeout period to check the miss path
+#define GVE_TX_TIMEOUT_PERIOD 1 * HZ
+
/* PTYPEs are always 10 bits. */
#define GVE_NUM_PTYPES 1024
#define GVE_RX_BUFFER_SIZE_DQO 2048
+#define GVE_MIN_RX_BUFFER_SIZE 2048
+#define GVE_MAX_RX_BUFFER_SIZE 4096
+
+#define GVE_RSS_KEY_SIZE 40
+#define GVE_RSS_INDIR_SIZE 128
+
+#define GVE_HEADER_BUFFER_SIZE_MIN 64
+#define GVE_HEADER_BUFFER_SIZE_MAX 256
+#define GVE_HEADER_BUFFER_SIZE_DEFAULT 128
#define GVE_GQ_TX_MIN_PKT_DESC_BYTES 182
+#define DQO_QPL_DEFAULT_TX_PAGES 512
+#define DQO_QPL_DEFAULT_RX_PAGES 2048
+
+/* Maximum TSO size supported on DQO */
+#define GVE_DQO_TX_MAX 0x3FFFF
+
+#define GVE_TX_BUF_SHIFT_DQO 11
+
+/* 2K buffers for DQO-QPL */
+#define GVE_TX_BUF_SIZE_DQO BIT(GVE_TX_BUF_SHIFT_DQO)
+#define GVE_TX_BUFS_PER_PAGE_DQO (PAGE_SIZE >> GVE_TX_BUF_SHIFT_DQO)
+#define GVE_MAX_TX_BUFS_PER_PKT (DIV_ROUND_UP(GVE_DQO_TX_MAX, GVE_TX_BUF_SIZE_DQO))
+
+/* If number of free/recyclable buffers are less than this threshold; driver
+ * allocs and uses a non-qpl page on the receive path of DQO QPL to free
+ * up buffers.
+ * Value is set big enough to post at least 3 64K LRO packet via 2K buffer to NIC.
+ */
+#define GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD 96
+
/* Each slot in the desc ring has a 1:1 mapping to a slot in the data ring */
struct gve_rx_desc_queue {
struct gve_rx_desc *desc_ring; /* the descriptor ring */
@@ -62,7 +95,8 @@
void *page_address;
u32 page_offset; /* offset to write to in page */
int pagecnt_bias; /* expected pagecnt if only the driver has a ref */
- u8 can_flip;
+ u16 pad; /* adjustment for rx padding */
+ u8 can_flip; /* tracks if the networking stack is using the page */
};
/* A list of pages registered with the device during setup and used by a queue
@@ -121,6 +155,11 @@
u32 mask; /* Mask for indices to the size of the ring */
};
+struct gve_header_buf {
+ u8 *data;
+ dma_addr_t addr;
+};
+
/* Stores state for tracking buffers posted to HW */
struct gve_rx_buf_state_dqo {
/* The page posted to HW. */
@@ -134,6 +173,9 @@
*/
u32 last_single_ref_offset;
+ /* Pointer to the header buffer when header-split is active */
+ struct gve_header_buf *hdr_buf;
+
/* Linked list index to next element in the list, or -1 if none */
s16 next;
};
@@ -151,10 +193,17 @@
/* head and tail of skb chain for the current packet or NULL if none */
struct sk_buff *skb_head;
struct sk_buff *skb_tail;
- u16 total_expected_size;
- u8 expected_frag_cnt;
- u8 curr_frag_cnt;
- u8 reuse_frags;
+ u32 total_size;
+ u8 frag_cnt;
+ bool drop_pkt;
+};
+
+struct gve_rx_cnts {
+ u32 ok_pkt_bytes;
+ u16 ok_pkt_cnt;
+ u16 total_pkt_cnt;
+ u16 cont_pkt_cnt;
+ u16 desc_err_pkt_cnt;
};
/* Contains datapath state used to represent an RX queue. */
@@ -169,6 +218,10 @@
/* threshold for posting new buffs and descs */
u32 db_threshold;
u16 packet_buffer_size;
+
+ u32 qpl_copy_pool_mask;
+ u32 qpl_copy_pool_head;
+ struct gve_rx_slot_page_info *qpl_copy_pool;
};
/* DQO fields. */
@@ -203,22 +256,43 @@
* which cannot be reused yet.
*/
struct gve_index_list used_buf_states;
+
+ /* qpl assigned to this queue */
+ struct gve_queue_page_list *qpl;
+
+ /* index into queue page list */
+ u32 next_qpl_page_idx;
+
+ /* track number of used buffers */
+ u16 used_buf_states_cnt;
+
+ /* Array of buffers for header-split */
+ struct gve_header_buf *hdr_bufs;
} dqo;
};
u64 rbytes; /* free-running bytes received */
+ u64 rheader_bytes; /* free-running header bytes received */
u64 rpackets; /* free-running packets received */
u32 cnt; /* free-running total number of completed packets */
u32 fill_cnt; /* free-running total number of descs and buffs posted */
u32 mask; /* masks the cnt and fill_cnt to the size of the ring */
+ u64 rx_hsplit_pkt; /* free-running packets with headers split */
+ u64 rx_hsplit_hbo_pkt; /* free-running packets with header buffer overflow */
+ u64 rx_devmem_pkt; /* devmem packets processed */
+ u64 rx_devmem_dropped; /* devmem pkts dropped */
u64 rx_copybreak_pkt; /* free-running count of copybreak packets */
u64 rx_copied_pkt; /* free-running total number of copied packets */
u64 rx_skb_alloc_fail; /* free-running count of skb alloc fails */
u64 rx_buf_alloc_fail; /* free-running count of buffer alloc fails */
u64 rx_desc_err_dropped_pkt; /* free-running count of packets dropped by descriptor error */
+ /* free-running count of packets dropped by header-split overflow */
+ u64 rx_hsplit_err_dropped_pkt;
u64 rx_cont_packet_cnt; /* free-running multi-fragment packets received */
u64 rx_frag_flip_cnt; /* free-running count of rx segments where page_flip was used */
- u64 rx_frag_copy_cnt; /* free-running count of rx segments copied into skb linear portion */
+ u64 rx_frag_copy_cnt; /* free-running count of rx segments copied */
+ u64 rx_frag_alloc_cnt; /* free-running count of rx page allocations */
+
u32 q_num; /* queue index */
u32 ntfy_id; /* notification block index */
struct gve_queue_resources *q_resources; /* head and tail pointer idx */
@@ -296,8 +370,14 @@
* All others correspond to `skb`'s frags and should be unmapped with
* `dma_unmap_page`.
*/
- DEFINE_DMA_UNMAP_ADDR(dma[MAX_SKB_FRAGS + 1]);
- DEFINE_DMA_UNMAP_LEN(len[MAX_SKB_FRAGS + 1]);
+ union {
+ struct {
+ DEFINE_DMA_UNMAP_ADDR(dma[MAX_SKB_FRAGS + 1]);
+ DEFINE_DMA_UNMAP_LEN(len[MAX_SKB_FRAGS + 1]);
+ };
+ s16 tx_qpl_buf_ids[GVE_MAX_TX_BUFS_PER_PKT];
+ };
+
u16 num_bufs;
/* Linked list index to next element in the list, or -1 if none */
@@ -352,6 +432,32 @@
* set.
*/
u32 last_re_idx;
+
+ /* free running number of packet buf descriptors posted */
+ u16 posted_packet_desc_cnt;
+ /* free running number of packet buf descriptors completed */
+ u16 completed_packet_desc_cnt;
+
+ /* QPL fields */
+ struct {
+ /* Linked list of gve_tx_buf_dqo. Index into
+ * tx_qpl_buf_next, or -1 if empty.
+ *
+ * This is a consumer list owned by the TX path. When it
+ * runs out, the producer list is stolen from the
+ * completion handling path
+ * (dqo_compl.free_tx_qpl_buf_head).
+ */
+ s16 free_tx_qpl_buf_head;
+
+ /* Free running count of the number of QPL tx buffers
+ * allocated
+ */
+ u32 alloc_tx_qpl_buf_cnt;
+
+ /* Cached value of `dqo_compl.free_tx_qpl_buf_cnt` */
+ u32 free_tx_qpl_buf_cnt;
+ };
} dqo_tx;
};
@@ -370,6 +476,10 @@
/* Tracks the current gen bit of compl_q */
u8 cur_gen_bit;
+ /* the jiffies when last TX completion was processed*/
+ unsigned long last_processed;
+ bool kicked;
+
/* Linked list of gve_tx_pending_packet_dqo. Index into
* pending_packets, or -1 if empty.
*
@@ -393,6 +503,24 @@
* reached a specified timeout.
*/
struct gve_index_list timed_out_completions;
+
+ /* QPL fields */
+ struct {
+ /* Linked list of gve_tx_buf_dqo. Index into
+ * tx_qpl_buf_next, or -1 if empty.
+ *
+ * This is the producer list, owned by the completion
+ * handling path. When the consumer list
+ * (dqo_tx.free_tx_qpl_buf_head) is runs out, this list
+ * will be stolen.
+ */
+ atomic_t free_tx_qpl_buf_head;
+
+ /* Free running count of the number of tx buffers
+ * freed
+ */
+ atomic_t free_tx_qpl_buf_cnt;
+ };
} dqo_compl;
} ____cacheline_aligned;
u64 pkt_done; /* free-running - total packets completed */
@@ -419,6 +547,21 @@
s16 num_pending_packets;
u32 complq_mask; /* complq size is complq_mask + 1 */
+
+ /* QPL fields */
+ struct {
+ /* qpl assigned to this queue */
+ struct gve_queue_page_list *qpl;
+
+ /* Each QPL page is divided into TX bounce buffers
+ * of size GVE_TX_BUF_SIZE_DQO. tx_qpl_buf_next is
+ * an array to manage linked lists of TX buffers.
+ * An entry j at index i implies that j'th buffer
+ * is next on the list after i
+ */
+ s16 *tx_qpl_buf_next;
+ u32 num_tx_qpl_bufs;
+ };
} dqo;
} ____cacheline_aligned;
struct netdev_queue *netdev_txq;
@@ -482,6 +625,19 @@
struct gve_ptype ptypes[GVE_NUM_PTYPES];
};
+enum gve_rss_hash_alg {
+ GVE_RSS_HASH_UNDEFINED = 0,
+ GVE_RSS_HASH_TOEPLITZ = 1,
+};
+
+struct gve_rss_config {
+ enum gve_rss_hash_alg alg;
+ u16 key_size;
+ u16 indir_size;
+ u8 *key;
+ u32 *indir;
+};
+
/* GVE_QUEUE_FORMAT_UNSPECIFIED must be zero since 0 is the default value
* when the entire configure_device_resources command is zeroed out and the
* queue_format is not specified.
@@ -491,6 +647,32 @@
GVE_GQI_RDA_FORMAT = 0x1,
GVE_GQI_QPL_FORMAT = 0x2,
GVE_DQO_RDA_FORMAT = 0x3,
+ GVE_DQO_QPL_FORMAT = 0x4,
+};
+
+struct gve_flow_spec {
+ __be32 src_ip[4];
+ __be32 dst_ip[4];
+ union {
+ struct {
+ __be16 src_port;
+ __be16 dst_port;
+ };
+ __be32 spi;
+ };
+ union {
+ u8 tos;
+ u8 tclass;
+ };
+};
+
+struct gve_flow_rule {
+ struct list_head list;
+ u16 loc;
+ u16 flow_type;
+ u16 action;
+ struct gve_flow_spec key;
+ struct gve_flow_spec mask;
};
struct gve_priv {
@@ -510,7 +692,8 @@
u16 num_event_counters;
u16 tx_desc_cnt; /* num desc per ring */
u16 rx_desc_cnt; /* num desc per ring */
- u16 tx_pages_per_qpl; /* tx buffer length */
+ u16 tx_pages_per_qpl; /* Suggested number of pages per qpl for TX queues by NIC */
+ u16 rx_pages_per_qpl; /* Suggested number of pages per qpl for RX queues by NIC */
u16 rx_data_slot_cnt; /* rx buffer length */
u64 max_registered_pages;
u64 num_registered_pages; /* num pages registered with NIC */
@@ -551,6 +734,9 @@
u32 adminq_report_stats_cnt;
u32 adminq_report_link_speed_cnt;
u32 adminq_get_ptype_map_cnt;
+ u32 adminq_verify_driver_compatibility_cnt;
+ u32 adminq_cfg_flow_rule_cnt;
+ u32 adminq_cfg_rss_cnt;
/* Global stats */
u32 interface_up_cnt; /* count of times interface turned up since last reset */
@@ -571,10 +757,15 @@
u64 stats_report_len;
dma_addr_t stats_report_bus; /* dma address for the stats report */
unsigned long ethtool_flags;
+ unsigned long ethtool_defaults; /* default flags */
unsigned long stats_report_timer_period;
struct timer_list stats_report_timer;
+ unsigned long tx_timeout_period;
+ /* tx timeout timer for the miss path */
+ struct timer_list tx_timeout_timer;
+
/* Gvnic device link speed from hypervisor. */
u64 link_speed;
bool up_before_suspend; /* True if dev was up before suspend */
@@ -584,12 +775,33 @@
/* Must be a power of two. */
int data_buffer_size_dqo;
+ int dev_max_rx_buffer_size; /* The max rx buffer size that device support*/
enum gve_queue_format queue_format;
/* Interrupt coalescing settings */
u32 tx_coalesce_usecs;
u32 rx_coalesce_usecs;
+
+ /* The size of buffers to allocate for the headers.
+ * A non-zero value enables header-split.
+ */
+ u16 header_buf_size;
+ u8 header_split_strict;
+ struct dma_pool *header_buf_pool;
+
+ /* The maximum number of rules for flow-steering.
+ * A non-zero value enables flow-steering.
+ */
+ u16 flow_rules_max;
+ u16 flow_rules_cnt;
+ struct list_head flow_rules;
+ struct mutex flow_rules_lock;
+
+ /* RSS configuration */
+ struct gve_rss_config rss_config;
+
+ enum gve_reset_reason scheduled_reset_reason;
};
enum gve_service_task_flags_bit {
@@ -608,8 +820,17 @@
enum gve_ethtool_flags_bit {
GVE_PRIV_FLAGS_REPORT_STATS = 0,
+ GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT = 1,
+ GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT = 2,
+ GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE = 3,
};
+#define GVE_PRIV_FLAGS_MASK \
+ (BIT(GVE_PRIV_FLAGS_REPORT_STATS) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT) | \
+ BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE))
+
static inline bool gve_get_do_reset(struct gve_priv *priv)
{
return test_bit(GVE_PRIV_FLAGS_DO_RESET, &priv->service_task_flags);
@@ -743,6 +964,16 @@
clear_bit(GVE_PRIV_FLAGS_REPORT_STATS, &priv->ethtool_flags);
}
+static inline bool gve_get_enable_header_split(struct gve_priv *priv)
+{
+ return test_bit(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT, &priv->ethtool_flags);
+}
+
+static inline bool gve_get_enable_max_rx_buffer_size(struct gve_priv *priv)
+{
+ return test_bit(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE, &priv->ethtool_flags);
+}
+
/* Returns the address of the ntfy_blocks irq doorbell
*/
static inline __be32 __iomem *gve_irq_doorbell(struct gve_priv *priv,
@@ -765,11 +996,17 @@
return (priv->num_ntfy_blks / 2) + queue_idx;
}
+static inline bool gve_is_qpl(struct gve_priv *priv)
+{
+ return priv->queue_format == GVE_GQI_QPL_FORMAT ||
+ priv->queue_format == GVE_DQO_QPL_FORMAT;
+}
+
/* Returns the number of tx queue page lists
*/
static inline u32 gve_num_tx_qpls(struct gve_priv *priv)
{
- if (priv->queue_format != GVE_GQI_QPL_FORMAT)
+ if (!gve_is_qpl(priv))
return 0;
return priv->tx_cfg.num_queues;
@@ -779,7 +1016,7 @@
*/
static inline u32 gve_num_rx_qpls(struct gve_priv *priv)
{
- if (priv->queue_format != GVE_GQI_QPL_FORMAT)
+ if (!gve_is_qpl(priv))
return 0;
return priv->rx_cfg.num_queues;
@@ -842,6 +1079,11 @@
priv->queue_format == GVE_GQI_QPL_FORMAT;
}
+static inline int gve_num_tx_queues(struct gve_priv *priv)
+{
+ return priv->tx_cfg.num_queues;
+}
+
/* buffers */
int gve_alloc_page(struct gve_priv *priv, struct device *dev,
struct page **page, dma_addr_t *dma,
@@ -862,14 +1104,28 @@
bool gve_rx_work_pending(struct gve_rx_ring *rx);
int gve_rx_alloc_rings(struct gve_priv *priv);
void gve_rx_free_rings_gqi(struct gve_priv *priv);
+int gve_recreate_rx_rings(struct gve_priv *priv);
+int gve_reconfigure_rx_rings(struct gve_priv *priv,
+ bool enable_hdr_split,
+ int packet_buffer_size);
/* Reset */
void gve_schedule_reset(struct gve_priv *priv);
-int gve_reset(struct gve_priv *priv, bool attempt_teardown);
+int gve_reset(struct gve_priv *priv, bool attempt_teardown,
+ enum gve_reset_reason reason);
int gve_adjust_queues(struct gve_priv *priv,
struct gve_queue_config new_rx_config,
struct gve_queue_config new_tx_config);
+int gve_flow_rules_reset(struct gve_priv *priv);
+void gve_flow_rules_release(struct gve_priv *priv);
+
/* report stats handling */
void gve_handle_report_stats(struct gve_priv *priv);
+
+/* RSS support */
+int gve_rss_config_init(struct gve_priv *priv);
+void gve_rss_set_default_indir(struct gve_priv *priv);
+void gve_rss_config_release(struct gve_rss_config *rss_config);
+
/* exported by ethtool.c */
extern const struct ethtool_ops gve_ethtool_ops;
/* needed by ethtool */
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c
index f7621ab..793372c 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.c
+++ b/drivers/net/ethernet/google/gve/gve_adminq.c
@@ -12,7 +12,7 @@
#define GVE_MAX_ADMINQ_RELEASE_CHECK 500
#define GVE_ADMINQ_SLEEP_LEN 20
-#define GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK 100
+#define GVE_MAX_ADMINQ_EVENT_COUNTER_CHECK 1000
#define GVE_DEVICE_OPTION_ERROR_FMT "%s option error:\n" \
"Expected: length=%d, feature_mask=%x.\n" \
@@ -39,7 +39,10 @@
struct gve_device_option_gqi_rda **dev_op_gqi_rda,
struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
struct gve_device_option_dqo_rda **dev_op_dqo_rda,
- struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
+ struct gve_device_option_jumbo_frames **dev_op_jumbo_frames,
+ struct gve_device_option_dqo_qpl **dev_op_dqo_qpl,
+ struct gve_device_option_buffer_sizes **dev_op_buffer_sizes,
+ struct gve_device_option_flow_steering **dev_op_flow_steering)
{
u32 req_feat_mask = be32_to_cpu(option->required_features_mask);
u16 option_length = be16_to_cpu(option->option_length);
@@ -112,6 +115,22 @@
}
*dev_op_dqo_rda = (void *)(option + 1);
break;
+ case GVE_DEV_OPT_ID_DQO_QPL:
+ if (option_length < sizeof(**dev_op_dqo_qpl) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "DQO QPL", (int)sizeof(**dev_op_dqo_qpl),
+ GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_dqo_qpl)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT, "DQO QPL");
+ }
+ *dev_op_dqo_qpl = (void *)(option + 1);
+ break;
case GVE_DEV_OPT_ID_JUMBO_FRAMES:
if (option_length < sizeof(**dev_op_jumbo_frames) ||
req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES) {
@@ -130,6 +149,44 @@
}
*dev_op_jumbo_frames = (void *)(option + 1);
break;
+ case GVE_DEV_OPT_ID_BUFFER_SIZES:
+ if (option_length < sizeof(**dev_op_buffer_sizes) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "Buffer Sizes",
+ (int)sizeof(**dev_op_buffer_sizes),
+ GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_buffer_sizes)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT,
+ "Buffer Sizes");
+ }
+ *dev_op_buffer_sizes = (void *)(option + 1);
+ if ((*dev_op_buffer_sizes)->header_buffer_size)
+ priv->ethtool_defaults |= BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ break;
+ case GVE_DEV_OPT_ID_FLOW_STEERING:
+ if (option_length < sizeof(**dev_op_flow_steering) ||
+ req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+ dev_warn(&priv->pdev->dev, GVE_DEVICE_OPTION_ERROR_FMT,
+ "Flow Steering",
+ (int)sizeof(**dev_op_flow_steering),
+ GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+ option_length, req_feat_mask);
+ break;
+ }
+
+ if (option_length > sizeof(**dev_op_flow_steering)) {
+ dev_warn(&priv->pdev->dev,
+ GVE_DEVICE_OPTION_TOO_BIG_FMT,
+ "Flow Steering");
+ }
+ *dev_op_flow_steering = (void *)(option + 1);
+ break;
default:
/* If we don't recognize the option just continue
* without doing anything.
@@ -146,7 +203,10 @@
struct gve_device_option_gqi_rda **dev_op_gqi_rda,
struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
struct gve_device_option_dqo_rda **dev_op_dqo_rda,
- struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
+ struct gve_device_option_jumbo_frames **dev_op_jumbo_frames,
+ struct gve_device_option_dqo_qpl **dev_op_dqo_qpl,
+ struct gve_device_option_buffer_sizes **dev_op_buffer_sizes,
+ struct gve_device_option_flow_steering **dev_op_flow_steering)
{
const int num_options = be16_to_cpu(descriptor->num_device_options);
struct gve_device_option *dev_opt;
@@ -166,7 +226,10 @@
gve_parse_device_option(priv, descriptor, dev_opt,
dev_op_gqi_rda, dev_op_gqi_qpl,
- dev_op_dqo_rda, dev_op_jumbo_frames);
+ dev_op_dqo_rda, dev_op_jumbo_frames,
+ dev_op_dqo_qpl,
+ dev_op_buffer_sizes,
+ dev_op_flow_steering);
dev_opt = next_opt;
}
@@ -197,6 +260,7 @@
priv->adminq_report_stats_cnt = 0;
priv->adminq_report_link_speed_cnt = 0;
priv->adminq_get_ptype_map_cnt = 0;
+ priv->adminq_cfg_flow_rule_cnt = 0;
/* Setup Admin queue with the device */
iowrite32be(priv->adminq_bus_addr / PAGE_SIZE,
@@ -289,7 +353,7 @@
case GVE_ADMINQ_COMMAND_ERROR_RESOURCE_EXHAUSTED:
return -ENOMEM;
case GVE_ADMINQ_COMMAND_ERROR_UNIMPLEMENTED:
- return -ENOTSUPP;
+ return -EOPNOTSUPP;
default:
dev_err(&priv->pdev->dev, "parse_aq_err: unknown status code %d\n", status);
return -EINVAL;
@@ -299,14 +363,17 @@
/* Flushes all AQ commands currently queued and waits for them to complete.
* If there are failures, it will return the first error.
*/
-static int gve_adminq_kick_and_wait(struct gve_priv *priv)
+static int gve_adminq_kick_and_wait(struct gve_priv *priv, int ret_cnt, int *ret_codes)
{
int tail, head;
- int i;
+ int i, j;
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
head = priv->adminq_prod_cnt;
+ if ((head - tail) > ret_cnt)
+ return -EINVAL;
+
gve_adminq_kick_cmd(priv, head);
if (!gve_adminq_wait_for_cmd(priv, head)) {
dev_err(&priv->pdev->dev, "AQ commands timed out, need to reset AQ\n");
@@ -314,16 +381,13 @@
return -ENOTRECOVERABLE;
}
- for (i = tail; i < head; i++) {
+ for (i = tail, j = 0; i < head; i++, j++) {
union gve_adminq_command *cmd;
u32 status, err;
cmd = &priv->adminq[i & priv->adminq_mask];
status = be32_to_cpu(READ_ONCE(cmd->status));
- err = gve_adminq_parse_err(priv, status);
- if (err)
- // Return the first error if we failed.
- return err;
+ ret_codes[j] = gve_adminq_parse_err(priv, status);
}
return 0;
@@ -342,30 +406,16 @@
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
// Check if next command will overflow the buffer.
- if (((priv->adminq_prod_cnt + 1) & priv->adminq_mask) ==
- (tail & priv->adminq_mask)) {
- int err;
-
- // Flush existing commands to make room.
- err = gve_adminq_kick_and_wait(priv);
- if (err)
- return err;
-
- // Retry.
- tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
- if (((priv->adminq_prod_cnt + 1) & priv->adminq_mask) ==
- (tail & priv->adminq_mask)) {
- // This should never happen. We just flushed the
- // command queue so there should be enough space.
- return -ENOMEM;
- }
- }
+ if ((priv->adminq_prod_cnt - tail) > priv->adminq_mask)
+ return -ENOMEM;
cmd = &priv->adminq[priv->adminq_prod_cnt & priv->adminq_mask];
priv->adminq_prod_cnt++;
memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
opcode = be32_to_cpu(READ_ONCE(cmd->opcode));
+ if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+ opcode = be32_to_cpu(cmd->extended_command.inner_opcode);
switch (opcode) {
case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -395,6 +445,9 @@
case GVE_ADMINQ_DECONFIGURE_DEVICE_RESOURCES:
priv->adminq_dcfg_device_resources_cnt++;
break;
+ case GVE_ADMINQ_CONFIGURE_RSS:
+ priv->adminq_cfg_rss_cnt++;
+ break;
case GVE_ADMINQ_SET_DRIVER_PARAMETER:
priv->adminq_set_driver_parameter_cnt++;
break;
@@ -407,6 +460,12 @@
case GVE_ADMINQ_GET_PTYPE_MAP:
priv->adminq_get_ptype_map_cnt++;
break;
+ case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
+ priv->adminq_verify_driver_compatibility_cnt++;
+ break;
+ case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+ priv->adminq_cfg_flow_rule_cnt++;
+ break;
default:
dev_err(&priv->pdev->dev, "unknown AQ command opcode %d\n", opcode);
}
@@ -422,8 +481,9 @@
static int gve_adminq_execute_cmd(struct gve_priv *priv,
union gve_adminq_command *cmd_orig)
{
+ int retry_cnt = GVE_ADMINQ_RETRY_COUNT;
u32 tail, head;
- int err;
+ int err, ret;
tail = ioread32be(&priv->reg_bar0->adminq_event_counter);
head = priv->adminq_prod_cnt;
@@ -431,13 +491,52 @@
// This is not a valid path
return -EINVAL;
- err = gve_adminq_issue_cmd(priv, cmd_orig);
- if (err)
- return err;
+ do {
+ err = gve_adminq_issue_cmd(priv, cmd_orig);
+ if (err)
+ return err;
- return gve_adminq_kick_and_wait(priv);
+ err = gve_adminq_kick_and_wait(priv, 1, &ret);
+ if (err)
+ return err;
+ } while (ret == -ETIME && retry_cnt-- > 0);
+
+ return ret;
}
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv,
+ uint32_t opcode, size_t cmd_size,
+ void *cmd_orig)
+{
+ union gve_adminq_command cmd;
+ dma_addr_t inner_cmd_bus;
+ void *inner_cmd;
+ int err;
+
+ inner_cmd = dma_alloc_coherent(&priv->pdev->dev, cmd_size,
+ &inner_cmd_bus, GFP_KERNEL);
+ if (!inner_cmd)
+ return -ENOMEM;
+
+ memcpy(inner_cmd, cmd_orig, cmd_size);
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+ cmd.extended_command = (struct gve_adminq_extended_command) {
+ .inner_opcode = cpu_to_be32(opcode),
+ .inner_length = cpu_to_be32(cmd_size),
+ .inner_command_addr = cpu_to_be64(inner_cmd_bus),
+ };
+
+ err = gve_adminq_execute_cmd(priv, &cmd);
+
+ dma_free_coherent(&priv->pdev->dev,
+ cmd_size,
+ inner_cmd, inner_cmd_bus);
+ return err;
+}
+
+
/* The device specifies that the management vector can either be the first irq
* or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
* the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
@@ -481,6 +580,69 @@
return gve_adminq_execute_cmd(priv, &cmd);
}
+typedef int (gve_adminq_queue_cmd) (struct gve_priv *priv, u32 queue_index);
+
+static int gve_adminq_manage_queues(struct gve_priv *priv,
+ gve_adminq_queue_cmd *cmd,
+ u32 start_id, u32 num_queues) {
+#define QUEUE_DONE -1
+ int retry_cnt = GVE_ADMINQ_RETRY_COUNT;
+ int cmd_idx, queue_idx, code_idx;
+ int queues_waiting[num_queues];
+ int commands[num_queues];
+ int codes[num_queues];
+ int retry_needed;
+ int err;
+
+ for (queue_idx = 0; queue_idx < num_queues; queue_idx++)
+ queues_waiting[queue_idx] = start_id + queue_idx;
+
+ do {
+ retry_needed = 0;
+ queue_idx = 0;
+ while (queue_idx < num_queues) {
+ cmd_idx = 0;
+ while (queue_idx < num_queues) {
+ if (queues_waiting[queue_idx] != QUEUE_DONE) {
+ err = cmd(priv, queues_waiting[queue_idx]);
+ if (err == -ENOMEM)
+ break;
+ if (err)
+ return err;
+ commands[cmd_idx++] = queue_idx;
+ }
+ queue_idx++;
+ }
+
+ if (queue_idx < num_queues)
+ dev_dbg(&priv->pdev->dev,
+ "Issued %d of %d batched commands\n",
+ queue_idx, num_queues);
+
+ err = gve_adminq_kick_and_wait(priv, cmd_idx, codes);
+ if (err)
+ return err;
+
+ for (code_idx = 0; code_idx < cmd_idx; code_idx++) {
+ if (codes[code_idx] == 0)
+ queues_waiting[commands[code_idx]] = QUEUE_DONE;
+ else if (codes[code_idx] != -ETIME)
+ return codes[code_idx];
+ else
+ retry_needed++;
+ }
+
+ if (retry_needed)
+ dev_dbg(&priv->pdev->dev,
+ "Issued %d batched commands, %d needed a retry\n",
+ cmd_idx, retry_needed);
+ }
+ } while (retry_needed && retry_cnt-- > 0);
+
+ return retry_needed ? -ETIME : 0;
+#undef QUEUE_DONE
+}
+
static int gve_adminq_create_tx_queue(struct gve_priv *priv, u32 queue_index)
{
struct gve_tx_ring *tx = &priv->tx[queue_index];
@@ -502,29 +664,33 @@
cmd.create_tx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
} else {
+ u16 comp_ring_size;
+ u32 qpl_id = 0;
+
+ if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ qpl_id = GVE_RAW_ADDRESSING_QPL_ID;
+ comp_ring_size =
+ priv->options_dqo_rda.tx_comp_ring_entries;
+ } else {
+ qpl_id = tx->dqo.qpl->id;
+ comp_ring_size = priv->tx_desc_cnt;
+ }
+ cmd.create_tx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
cmd.create_tx_queue.tx_ring_size =
cpu_to_be16(priv->tx_desc_cnt);
cmd.create_tx_queue.tx_comp_ring_addr =
cpu_to_be64(tx->complq_bus_dqo);
cmd.create_tx_queue.tx_comp_ring_size =
- cpu_to_be16(priv->options_dqo_rda.tx_comp_ring_entries);
+ cpu_to_be16(comp_ring_size);
}
return gve_adminq_issue_cmd(priv, &cmd);
}
-int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues)
+int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_create_tx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_create_tx_queue,
+ start_id, num_queues);
}
static int gve_adminq_create_rx_queue(struct gve_priv *priv, u32 queue_index)
@@ -552,6 +718,18 @@
cmd.create_rx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
cmd.create_rx_queue.packet_buffer_size = cpu_to_be16(rx->packet_buffer_size);
} else {
+ u16 rx_buff_ring_entries;
+ u32 qpl_id = 0;
+
+ if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ qpl_id = GVE_RAW_ADDRESSING_QPL_ID;
+ rx_buff_ring_entries =
+ priv->options_dqo_rda.rx_buff_ring_entries;
+ } else {
+ qpl_id = rx->dqo.qpl->id;
+ rx_buff_ring_entries = priv->rx_desc_cnt;
+ }
+ cmd.create_rx_queue.queue_page_list_id = cpu_to_be32(qpl_id);
cmd.create_rx_queue.rx_ring_size =
cpu_to_be16(priv->rx_desc_cnt);
cmd.create_rx_queue.rx_desc_ring_addr =
@@ -561,9 +739,12 @@
cmd.create_rx_queue.packet_buffer_size =
cpu_to_be16(priv->data_buffer_size_dqo);
cmd.create_rx_queue.rx_buff_ring_size =
- cpu_to_be16(priv->options_dqo_rda.rx_buff_ring_entries);
+ cpu_to_be16(rx_buff_ring_entries);
cmd.create_rx_queue.enable_rsc =
!!(priv->dev->features & NETIF_F_LRO);
+ if (rx->dqo.hdr_bufs)
+ cmd.create_rx_queue.header_buffer_size =
+ cpu_to_be16(priv->header_buf_size);
}
return gve_adminq_issue_cmd(priv, &cmd);
@@ -571,16 +752,8 @@
int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_create_rx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_create_rx_queue,
+ 0, num_queues);
}
static int gve_adminq_destroy_tx_queue(struct gve_priv *priv, u32 queue_index)
@@ -601,18 +774,10 @@
return 0;
}
-int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 num_queues)
+int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_destroy_tx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_destroy_tx_queue,
+ start_id, num_queues);
}
static int gve_adminq_destroy_rx_queue(struct gve_priv *priv, u32 queue_index)
@@ -635,16 +800,8 @@
int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 num_queues)
{
- int err;
- int i;
-
- for (i = 0; i < num_queues; i++) {
- err = gve_adminq_destroy_rx_queue(priv, i);
- if (err)
- return err;
- }
-
- return gve_adminq_kick_and_wait(priv);
+ return gve_adminq_manage_queues(priv, &gve_adminq_destroy_rx_queue,
+ 0, num_queues);
}
static int gve_set_desc_cnt(struct gve_priv *priv,
@@ -672,9 +829,13 @@
const struct gve_device_option_dqo_rda *dev_op_dqo_rda)
{
priv->tx_desc_cnt = be16_to_cpu(descriptor->tx_queue_entries);
+ priv->rx_desc_cnt = be16_to_cpu(descriptor->rx_queue_entries);
+
+ if (priv->queue_format == GVE_DQO_QPL_FORMAT)
+ return 0;
+
priv->options_dqo_rda.tx_comp_ring_entries =
be16_to_cpu(dev_op_dqo_rda->tx_comp_ring_entries);
- priv->rx_desc_cnt = be16_to_cpu(descriptor->rx_queue_entries);
priv->options_dqo_rda.rx_buff_ring_entries =
be16_to_cpu(dev_op_dqo_rda->rx_buff_ring_entries);
@@ -684,9 +845,17 @@
static void gve_enable_supported_features(struct gve_priv *priv,
u32 supported_features_mask,
const struct gve_device_option_jumbo_frames
- *dev_op_jumbo_frames)
+ *dev_op_jumbo_frames,
+ const struct gve_device_option_dqo_qpl
+ *dev_op_dqo_qpl,
+ const struct gve_device_option_buffer_sizes
+ *dev_op_buffer_sizes,
+ const struct gve_device_option_flow_steering
+ *dev_op_flow_steering)
{
- /* Before control reaches this point, the page-size-capped max MTU from
+ int buf_size;
+
+ /* Before control reaches this point, the page-size-capped max MTU in
* the gve_device_descriptor field has already been stored in
* priv->dev->max_mtu. We overwrite it with the true max MTU below.
*/
@@ -696,14 +865,69 @@
"JUMBO FRAMES device option enabled.\n");
priv->dev->max_mtu = be16_to_cpu(dev_op_jumbo_frames->max_mtu);
}
+
+ /* Override pages for qpl for DQO-QPL */
+ if (dev_op_dqo_qpl) {
+ priv->tx_pages_per_qpl =
+ be16_to_cpu(dev_op_dqo_qpl->tx_pages_per_qpl);
+ priv->rx_pages_per_qpl =
+ be16_to_cpu(dev_op_dqo_qpl->rx_pages_per_qpl);
+ if (priv->tx_pages_per_qpl == 0)
+ priv->tx_pages_per_qpl = DQO_QPL_DEFAULT_TX_PAGES;
+ if (priv->rx_pages_per_qpl == 0)
+ priv->rx_pages_per_qpl = DQO_QPL_DEFAULT_RX_PAGES;
+ }
+
+ priv->data_buffer_size_dqo = GVE_RX_BUFFER_SIZE_DQO;
+ priv->dev_max_rx_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+ priv->header_buf_size = 0;
+
+ if (dev_op_buffer_sizes &&
+ (supported_features_mask & GVE_SUP_BUFFER_SIZES_MASK)) {
+ dev_info(&priv->pdev->dev,
+ "BUFFER SIZES device option enabled.\n");
+ buf_size = be16_to_cpu(dev_op_buffer_sizes->packet_buffer_size);
+ if (buf_size) {
+ priv->dev_max_rx_buffer_size = buf_size;
+ if (priv->dev_max_rx_buffer_size &
+ (priv->dev_max_rx_buffer_size - 1))
+ priv->dev_max_rx_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+ if (priv->dev_max_rx_buffer_size < GVE_MIN_RX_BUFFER_SIZE)
+ priv->dev_max_rx_buffer_size = GVE_MIN_RX_BUFFER_SIZE;
+ if (priv->dev_max_rx_buffer_size > GVE_MAX_RX_BUFFER_SIZE)
+ priv->dev_max_rx_buffer_size = GVE_MAX_RX_BUFFER_SIZE;
+ }
+ buf_size = be16_to_cpu(dev_op_buffer_sizes->header_buffer_size);
+ if (buf_size) {
+ priv->header_buf_size = buf_size;
+ if (priv->header_buf_size & (priv->header_buf_size - 1))
+ priv->header_buf_size =
+ GVE_HEADER_BUFFER_SIZE_DEFAULT;
+ if (priv->header_buf_size < GVE_HEADER_BUFFER_SIZE_MIN)
+ priv->header_buf_size = GVE_HEADER_BUFFER_SIZE_MIN;
+ if (priv->header_buf_size > GVE_HEADER_BUFFER_SIZE_MAX)
+ priv->header_buf_size = GVE_HEADER_BUFFER_SIZE_MAX;
+ }
+ }
+
+ if (dev_op_flow_steering &&
+ (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK)) {
+ dev_info(&priv->pdev->dev,
+ "FLOW STEERING device option enabled.\n");
+ priv->flow_rules_max =
+ be16_to_cpu(dev_op_flow_steering->max_num_rules);
+ }
}
int gve_adminq_describe_device(struct gve_priv *priv)
{
+ struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
+ struct gve_device_option_buffer_sizes *dev_op_buffer_sizes = NULL;
struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
+ struct gve_device_option_dqo_qpl *dev_op_dqo_qpl = NULL;
struct gve_device_descriptor *descriptor;
u32 supported_features_mask = 0;
union gve_adminq_command cmd;
@@ -730,13 +954,16 @@
err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
&dev_op_gqi_qpl, &dev_op_dqo_rda,
- &dev_op_jumbo_frames);
+ &dev_op_jumbo_frames,
+ &dev_op_dqo_qpl,
+ &dev_op_buffer_sizes,
+ &dev_op_flow_steering);
if (err)
goto free_device_descriptor;
/* If the GQI_RAW_ADDRESSING option is not enabled and the queue format
* is not set to GqiRda, choose the queue format in a priority order:
- * DqoRda, GqiRda, GqiQpl. Use GqiQpl as default.
+ * DqoRda, DqoQpl, GqiRda, GqiQpl. Use GqiQpl as default.
*/
if (dev_op_dqo_rda) {
priv->queue_format = GVE_DQO_RDA_FORMAT;
@@ -744,7 +971,11 @@
"Driver is running with DQO RDA queue format.\n");
supported_features_mask =
be32_to_cpu(dev_op_dqo_rda->supported_features_mask);
- } else if (dev_op_gqi_rda) {
+ } else if (dev_op_dqo_qpl) {
+ priv->queue_format = GVE_DQO_QPL_FORMAT;
+ supported_features_mask =
+ be32_to_cpu(dev_op_dqo_qpl->supported_features_mask);
+ } else if (dev_op_gqi_rda) {
priv->queue_format = GVE_GQI_RDA_FORMAT;
dev_info(&priv->pdev->dev,
"Driver is running with GQI RDA queue format.\n");
@@ -764,8 +995,9 @@
if (gve_is_gqi(priv)) {
err = gve_set_desc_cnt(priv, descriptor);
} else {
- /* DQO supports LRO. */
+ /* DQO supports LRO and flow-steering */
priv->dev->hw_features |= NETIF_F_LRO;
+ priv->dev->hw_features |= NETIF_F_NTUPLE;
err = gve_set_desc_cnt_dqo(priv, descriptor, dev_op_dqo_rda);
}
if (err)
@@ -795,7 +1027,9 @@
priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
gve_enable_supported_features(priv, supported_features_mask,
- dev_op_jumbo_frames);
+ dev_op_jumbo_frames, dev_op_dqo_qpl,
+ dev_op_buffer_sizes,
+ dev_op_flow_steering);
free_device_descriptor:
dma_free_coherent(&priv->pdev->dev, PAGE_SIZE, descriptor,
@@ -878,6 +1112,22 @@
return gve_adminq_execute_cmd(priv, &cmd);
}
+int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
+ u64 driver_info_len,
+ dma_addr_t driver_info_addr)
+{
+ union gve_adminq_command cmd;
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY);
+ cmd.verify_driver_compatibility = (struct gve_adminq_verify_driver_compatibility) {
+ .driver_info_len = cpu_to_be64(driver_info_len),
+ .driver_info_addr = cpu_to_be64(driver_info_addr),
+ };
+
+ return gve_adminq_execute_cmd(priv, &cmd);
+}
+
int gve_adminq_report_link_speed(struct gve_priv *priv)
{
union gve_adminq_command gvnic_cmd;
@@ -942,3 +1192,166 @@
ptype_map_bus);
return err;
}
+
+static int gve_adminq_configure_flow_rule(struct gve_priv *priv,
+ struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+ return gve_adminq_execute_extended_cmd(priv,
+ GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+ sizeof(struct gve_adminq_configure_flow_rule),
+ flow_rule_cmd);
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_ADD),
+ .loc = cpu_to_be16(rule->loc),
+ .rule = {
+ .flow_type = cpu_to_be16(rule->flow_type),
+ .action = cpu_to_be16(rule->action),
+ .key = {
+ .src_ip = { rule->key.src_ip[0],
+ rule->key.src_ip[1],
+ rule->key.src_ip[2],
+ rule->key.src_ip[3] },
+ .dst_ip = { rule->key.dst_ip[0],
+ rule->key.dst_ip[1],
+ rule->key.dst_ip[2],
+ rule->key.dst_ip[3] },
+ },
+ .mask = {
+ .src_ip = { rule->mask.src_ip[0],
+ rule->mask.src_ip[1],
+ rule->mask.src_ip[2],
+ rule->mask.src_ip[3] },
+ .dst_ip = { rule->mask.dst_ip[0],
+ rule->mask.dst_ip[1],
+ rule->mask.dst_ip[2],
+ rule->mask.dst_ip[3] },
+ },
+ },
+ };
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_SCTPV4:
+ flow_rule_cmd.rule.key.src_port = rule->key.src_port;
+ flow_rule_cmd.rule.key.dst_port = rule->key.dst_port;
+ flow_rule_cmd.rule.key.tos = rule->key.tos;
+ flow_rule_cmd.rule.mask.src_port = rule->mask.src_port;
+ flow_rule_cmd.rule.mask.dst_port = rule->mask.dst_port;
+ flow_rule_cmd.rule.mask.tos = rule->mask.tos;
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_ESPV4:
+ flow_rule_cmd.rule.key.spi = rule->key.spi;
+ flow_rule_cmd.rule.key.tos = rule->key.tos;
+ flow_rule_cmd.rule.mask.spi = rule->mask.spi;
+ flow_rule_cmd.rule.mask.tos = rule->mask.tos;
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ case GVE_FLOW_TYPE_UDPV6:
+ case GVE_FLOW_TYPE_SCTPV6:
+ flow_rule_cmd.rule.key.src_port = rule->key.src_port;
+ flow_rule_cmd.rule.key.dst_port = rule->key.dst_port;
+ flow_rule_cmd.rule.key.tclass = rule->key.tclass;
+ flow_rule_cmd.rule.mask.src_port = rule->mask.src_port;
+ flow_rule_cmd.rule.mask.dst_port = rule->mask.dst_port;
+ flow_rule_cmd.rule.mask.tclass = rule->mask.tclass;
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ case GVE_FLOW_TYPE_ESPV6:
+ flow_rule_cmd.rule.key.spi = rule->key.spi;
+ flow_rule_cmd.rule.key.tclass = rule->key.tclass;
+ flow_rule_cmd.rule.mask.spi = rule->mask.spi;
+ flow_rule_cmd.rule.mask.tclass = rule->mask.tclass;
+ break;
+ }
+
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, int loc)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_DEL),
+ .loc = cpu_to_be16(loc),
+ };
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+ struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+ .cmd = cpu_to_be16(GVE_RULE_RESET),
+ };
+ return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_configure_rss(struct gve_priv *priv,
+ struct gve_rss_config *rss_config)
+{
+ dma_addr_t indir_bus = 0, key_bus = 0;
+ union gve_adminq_command cmd;
+ __be32 *indir = NULL;
+ u8 *key = NULL;
+ int err = 0;
+ int i;
+
+ if (rss_config->indir_size) {
+ indir = dma_alloc_coherent(&priv->pdev->dev,
+ rss_config->indir_size *
+ sizeof(*rss_config->indir),
+ &indir_bus, GFP_KERNEL);
+ if (!indir) {
+ err = -ENOMEM;
+ goto out;
+ }
+ for (i = 0; i < rss_config->indir_size; i++)
+ indir[i] = cpu_to_be32(rss_config->indir[i]);
+ }
+
+ if (rss_config->key_size) {
+ key = dma_alloc_coherent(&priv->pdev->dev,
+ rss_config->key_size *
+ sizeof(*rss_config->key),
+ &key_bus, GFP_KERNEL);
+ if (!key) {
+ err = -ENOMEM;
+ goto out;
+ }
+ memcpy(key, rss_config->key, rss_config->key_size);
+ }
+
+ memset(&cmd, 0, sizeof(cmd));
+ cmd.opcode = cpu_to_be32(GVE_ADMINQ_CONFIGURE_RSS);
+ cmd.configure_rss = (struct gve_adminq_configure_rss) {
+ .hash_types = cpu_to_be16(GVE_RSS_HASH_TCPV4 |
+ GVE_RSS_HASH_UDPV4 |
+ GVE_RSS_HASH_TCPV6 |
+ GVE_RSS_HASH_UDPV6),
+ .halg = rss_config->alg,
+ .hkey_len = cpu_to_be16(rss_config->key_size),
+ .indir_len = cpu_to_be16(rss_config->indir_size),
+ .hkey_addr = cpu_to_be64(key_bus),
+ .indir_addr = cpu_to_be64(indir_bus),
+ };
+
+ err = gve_adminq_execute_cmd(priv, &cmd);
+
+out:
+ if (indir)
+ dma_free_coherent(&priv->pdev->dev,
+ rss_config->indir_size *
+ sizeof(*rss_config->indir),
+ indir, indir_bus);
+ if (key)
+ dma_free_coherent(&priv->pdev->dev,
+ rss_config->key_size *
+ sizeof(*rss_config->key),
+ key, key_bus);
+ return err;
+}
+
diff --git a/drivers/net/ethernet/google/gve/gve_adminq.h b/drivers/net/ethernet/google/gve/gve_adminq.h
index 83c0b40..0b4acdd 100644
--- a/drivers/net/ethernet/google/gve/gve_adminq.h
+++ b/drivers/net/ethernet/google/gve/gve_adminq.h
@@ -20,10 +20,17 @@
GVE_ADMINQ_DESTROY_TX_QUEUE = 0x7,
GVE_ADMINQ_DESTROY_RX_QUEUE = 0x8,
GVE_ADMINQ_DECONFIGURE_DEVICE_RESOURCES = 0x9,
+ GVE_ADMINQ_CONFIGURE_RSS = 0xA,
GVE_ADMINQ_SET_DRIVER_PARAMETER = 0xB,
GVE_ADMINQ_REPORT_STATS = 0xC,
GVE_ADMINQ_REPORT_LINK_SPEED = 0xD,
GVE_ADMINQ_GET_PTYPE_MAP = 0xE,
+ GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY = 0xF,
+
+ /* For commands that are larger than 56 bytes */
+ GVE_ADMINQ_EXTENDED_COMMAND = 0xFF,
+
+ GVE_ADMINQ_CONFIGURE_FLOW_RULE = 0x101,
};
/* Admin queue status codes */
@@ -48,6 +55,11 @@
GVE_ADMINQ_COMMAND_ERROR_UNKNOWN_ERROR = 0xFFFFFFFF,
};
+/* AdminQ commands (that aren't batched) will be retried if they encounter
+ * an recoverable error.
+ */
+#define GVE_ADMINQ_RETRY_COUNT 3
+
#define GVE_ADMINQ_DEVICE_DESCRIPTOR_VERSION 1
/* All AdminQ command structs should be naturally packed. The static_assert
@@ -108,6 +120,14 @@
static_assert(sizeof(struct gve_device_option_dqo_rda) == 8);
+struct gve_device_option_dqo_qpl {
+ __be32 supported_features_mask;
+ __be16 tx_pages_per_qpl;
+ __be16 rx_pages_per_qpl;
+};
+
+static_assert(sizeof(struct gve_device_option_dqo_qpl) == 8);
+
struct gve_device_option_jumbo_frames {
__be32 supported_features_mask;
__be16 max_mtu;
@@ -116,6 +136,22 @@
static_assert(sizeof(struct gve_device_option_jumbo_frames) == 8);
+struct gve_device_option_buffer_sizes {
+ __be32 supported_features_mask;
+ __be16 packet_buffer_size;
+ __be16 header_buffer_size;
+};
+
+static_assert(sizeof(struct gve_device_option_buffer_sizes) == 8);
+
+struct gve_device_option_flow_steering {
+ __be32 supported_features_mask;
+ __be16 max_num_rules;
+ u8 padding[2];
+};
+
+static_assert(sizeof(struct gve_device_option_flow_steering) == 8);
+
/* Terminology:
*
* RDA - Raw DMA Addressing - Buffers associated with SKBs are directly DMA
@@ -129,7 +165,10 @@
GVE_DEV_OPT_ID_GQI_RDA = 0x2,
GVE_DEV_OPT_ID_GQI_QPL = 0x3,
GVE_DEV_OPT_ID_DQO_RDA = 0x4,
+ GVE_DEV_OPT_ID_DQO_QPL = 0x7,
GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+ GVE_DEV_OPT_ID_BUFFER_SIZES = 0xa,
+ GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
};
enum gve_dev_opt_req_feat_mask {
@@ -138,14 +177,73 @@
GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_DQO_QPL = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_BUFFER_SIZES = 0x0,
+ GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
};
enum gve_sup_feature_mask {
GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+ GVE_SUP_BUFFER_SIZES_MASK = 1 << 4,
+ GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
};
#define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
+#define GVE_VERSION_STR_LEN 128
+
+enum gve_driver_capbility {
+ gve_driver_capability_gqi_qpl = 0,
+ gve_driver_capability_gqi_rda = 1,
+ gve_driver_capability_dqo_qpl = 2, /* reserved for future use */
+ gve_driver_capability_dqo_rda = 3,
+ gve_driver_capability_alt_miss_compl = 4,
+ gve_driver_capability_flexible_buffer_size = 5,
+};
+
+#define GVE_CAP1(a) BIT((int)a)
+#define GVE_CAP2(a) BIT(((int)a) - 64)
+#define GVE_CAP3(a) BIT(((int)a) - 128)
+#define GVE_CAP4(a) BIT(((int)a) - 192)
+
+#define GVE_DRIVER_CAPABILITY_FLAGS1 \
+ (GVE_CAP1(gve_driver_capability_gqi_qpl) | \
+ GVE_CAP1(gve_driver_capability_gqi_rda) | \
+ GVE_CAP1(gve_driver_capability_dqo_rda) | \
+ GVE_CAP1(gve_driver_capability_alt_miss_compl) | \
+ GVE_CAP1(gve_driver_capability_flexible_buffer_size))
+
+#define GVE_DRIVER_CAPABILITY_FLAGS2 0x0
+#define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
+#define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
+
+struct gve_adminq_extended_command {
+ __be32 inner_opcode;
+ __be32 inner_length;
+ __be64 inner_command_addr;
+};
+static_assert(sizeof(struct gve_adminq_extended_command) == 16);
+
+struct gve_driver_info {
+ u8 os_type; /* 0x01 = Linux */
+ u8 driver_major;
+ u8 driver_minor;
+ u8 driver_sub;
+ __be32 os_version_major;
+ __be32 os_version_minor;
+ __be32 os_version_sub;
+ __be64 driver_capability_flags[4];
+ u8 os_version_str1[GVE_VERSION_STR_LEN];
+ u8 os_version_str2[GVE_VERSION_STR_LEN];
+};
+
+struct gve_adminq_verify_driver_compatibility {
+ __be64 driver_info_len;
+ __be64 driver_info_addr;
+};
+
+static_assert(sizeof(struct gve_adminq_verify_driver_compatibility) == 16);
+
struct gve_adminq_configure_device_resources {
__be64 counter_array;
__be64 irq_db_addr;
@@ -203,7 +301,9 @@
__be16 packet_buffer_size;
__be16 rx_buff_ring_size;
u8 enable_rsc;
- u8 padding[5];
+ u8 padding1;
+ __be16 header_buffer_size;
+ u8 padding2[2];
};
static_assert(sizeof(struct gve_adminq_create_rx_queue) == 56);
@@ -327,6 +427,81 @@
__be64 ptype_map_addr;
};
+/* Flow-steering related definitions */
+enum gve_adminq_flow_rule_cmd {
+ GVE_RULE_ADD = 0,
+ GVE_RULE_DEL = 1,
+ GVE_RULE_RESET = 2,
+};
+
+enum gve_adminq_flow_type {
+ GVE_FLOW_TYPE_TCPV4 = 0,
+ GVE_FLOW_TYPE_UDPV4 = 1,
+ GVE_FLOW_TYPE_SCTPV4 = 2,
+ GVE_FLOW_TYPE_AHV4 = 3,
+ GVE_FLOW_TYPE_ESPV4 = 4,
+ GVE_FLOW_TYPE_TCPV6 = 5,
+ GVE_FLOW_TYPE_UDPV6 = 6,
+ GVE_FLOW_TYPE_SCTPV6 = 7,
+ GVE_FLOW_TYPE_AHV6 = 8,
+ GVE_FLOW_TYPE_ESPV6 = 9,
+};
+
+struct gve_adminq_flow_spec {
+ __be32 src_ip[4];
+ __be32 dst_ip[4];
+ union {
+ struct {
+ __be16 src_port;
+ __be16 dst_port;
+ };
+ __be32 spi;
+ };
+ union {
+ u8 tos;
+ u8 tclass;
+ };
+};
+static_assert(sizeof(struct gve_adminq_flow_spec) == 40);
+
+/* Flow-steering command */
+struct gve_adminq_flow_rule {
+ __be16 flow_type;
+ __be16 action; /* Queue */
+ struct gve_adminq_flow_spec key;
+ struct gve_adminq_flow_spec mask; /* ports can be 0 or 0xffff */
+};
+
+struct gve_adminq_configure_flow_rule {
+ __be16 cmd;
+ __be16 loc;
+ struct gve_adminq_flow_rule rule;
+};
+static_assert(sizeof(struct gve_adminq_configure_flow_rule) == 88);
+
+#define GVE_RSS_HASH_IPV4 BIT(0)
+#define GVE_RSS_HASH_TCPV4 BIT(1)
+#define GVE_RSS_HASH_IPV6 BIT(2)
+#define GVE_RSS_HASH_IPV6_EX BIT(3)
+#define GVE_RSS_HASH_TCPV6 BIT(4)
+#define GVE_RSS_HASH_TCPV6_EX BIT(5)
+#define GVE_RSS_HASH_UDPV4 BIT(6)
+#define GVE_RSS_HASH_UDPV6 BIT(7)
+#define GVE_RSS_HASH_UDPV6_EX BIT(8)
+
+/* RSS configuration command */
+struct gve_adminq_configure_rss {
+ __be16 hash_types;
+ u8 halg; /* hash algorithm */
+ u8 reserved;
+ __be16 hkey_len;
+ __be16 indir_len;
+ __be64 hkey_addr;
+ __be64 indir_addr;
+};
+
+static_assert(sizeof(struct gve_adminq_configure_rss) == 24);
+
union gve_adminq_command {
struct {
__be32 opcode;
@@ -341,10 +516,14 @@
struct gve_adminq_describe_device describe_device;
struct gve_adminq_register_page_list reg_page_list;
struct gve_adminq_unregister_page_list unreg_page_list;
+ struct gve_adminq_configure_rss configure_rss;
struct gve_adminq_set_driver_parameter set_driver_param;
struct gve_adminq_report_stats report_stats;
struct gve_adminq_report_link_speed report_link_speed;
struct gve_adminq_get_ptype_map get_ptype_map;
+ struct gve_adminq_verify_driver_compatibility
+ verify_driver_compatibility;
+ struct gve_adminq_extended_command extended_command;
};
};
u8 reserved[64];
@@ -362,8 +541,8 @@
dma_addr_t db_array_bus_addr,
u32 num_ntfy_blks);
int gve_adminq_deconfigure_device_resources(struct gve_priv *priv);
-int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues);
-int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 queue_id);
+int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
+int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues);
int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues);
int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 queue_id);
int gve_adminq_register_page_list(struct gve_priv *priv,
@@ -372,7 +551,16 @@
int gve_adminq_set_mtu(struct gve_priv *priv, u64 mtu);
int gve_adminq_report_stats(struct gve_priv *priv, u64 stats_report_len,
dma_addr_t stats_report_addr, u64 interval);
+int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
+ u64 driver_info_len,
+ dma_addr_t driver_info_addr);
+int gve_adminq_configure_rss(struct gve_priv *priv,
+ struct gve_rss_config *config);
int gve_adminq_report_link_speed(struct gve_priv *priv);
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, int loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
struct gve_ptype_lut;
int gve_adminq_get_ptype_map_dqo(struct gve_priv *priv,
diff --git a/drivers/net/ethernet/google/gve/gve_desc_dqo.h b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
index e8fe9ad..f79cd05 100644
--- a/drivers/net/ethernet/google/gve/gve_desc_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_desc_dqo.h
@@ -176,6 +176,11 @@
#define GVE_COMPL_TYPE_DQO_MISS 0x1 /* Miss path completion */
#define GVE_COMPL_TYPE_DQO_REINJECTION 0x3 /* Re-injection completion */
+/* The most significant bit in the completion tag can change the completion
+ * type from packet completion to miss path completion.
+ */
+#define GVE_ALT_MISS_COMPL_BIT BIT(15)
+
/* Descriptor to post buffers to HW on buffer queue. */
struct gve_rx_desc_dqo {
__le16 buf_id; /* ID returned in Rx completion descriptor */
diff --git a/drivers/net/ethernet/google/gve/gve_dqo.h b/drivers/net/ethernet/google/gve/gve_dqo.h
index 1eb4d5f..59f52b3 100644
--- a/drivers/net/ethernet/google/gve/gve_dqo.h
+++ b/drivers/net/ethernet/google/gve/gve_dqo.h
@@ -33,16 +33,22 @@
#define GVE_DEALLOCATE_COMPL_TIMEOUT 60
netdev_tx_t gve_tx_dqo(struct sk_buff *skb, struct net_device *dev);
+netdev_features_t gve_features_check_dqo(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features);
bool gve_tx_poll_dqo(struct gve_notify_block *block, bool do_clean);
int gve_rx_poll_dqo(struct gve_notify_block *block, int budget);
+bool gve_tx_work_pending_dqo(struct gve_tx_ring *tx);
int gve_tx_alloc_rings_dqo(struct gve_priv *priv);
void gve_tx_free_rings_dqo(struct gve_priv *priv);
int gve_rx_alloc_rings_dqo(struct gve_priv *priv);
void gve_rx_free_rings_dqo(struct gve_priv *priv);
+void gve_rx_reset_rings_dqo(struct gve_priv *priv);
int gve_clean_tx_done_dqo(struct gve_priv *priv, struct gve_tx_ring *tx,
struct napi_struct *napi);
void gve_rx_post_buffers_dqo(struct gve_rx_ring *rx);
void gve_rx_write_doorbell_dqo(const struct gve_priv *priv, int queue_idx);
+int gve_rx_handle_hdr_resources_dqo(struct gve_priv *priv, bool enable_hdr_split);
static inline void
gve_tx_put_doorbell_dqo(const struct gve_priv *priv,
@@ -51,7 +57,11 @@
u64 index;
index = be32_to_cpu(q_resources->db_index);
+ /* Ensure that all the TX descriptor writes are flushed */
+ dma_wmb();
iowrite32(val, &priv->db_bar2[index]);
+ /* Make sure that the doorbell write is sent to NIC */
+ wmb();
}
/* Builds register value to write to DQO IRQ doorbell to enable with specified
diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c
index 033f17c..9bb7161 100644
--- a/drivers/net/ethernet/google/gve/gve_ethtool.c
+++ b/drivers/net/ethernet/google/gve/gve_ethtool.c
@@ -35,16 +35,19 @@
}
static const char gve_gstrings_main_stats[][ETH_GSTRING_LEN] = {
- "rx_packets", "tx_packets", "rx_bytes", "tx_bytes",
- "rx_dropped", "tx_dropped", "tx_timeouts",
- "rx_skb_alloc_fail", "rx_buf_alloc_fail", "rx_desc_err_dropped_pkt",
+ "rx_packets", "rx_packets_sph", "rx_packets_hbo", "rx_devmem_pkts",
+ "rx_devmem_dropped", "tx_packets", "rx_bytes", "tx_bytes", "rx_dropped",
+ "tx_dropped", "tx_timeouts", "rx_skb_alloc_fail", "rx_buf_alloc_fail",
+ "rx_desc_err_dropped_pkt", "rx_hsplit_err_dropped_pkt",
"interface_up_cnt", "interface_down_cnt", "reset_cnt",
"page_alloc_fail", "dma_mapping_error", "stats_report_trigger_cnt",
};
static const char gve_gstrings_rx_stats[][ETH_GSTRING_LEN] = {
- "rx_posted_desc[%u]", "rx_completed_desc[%u]", "rx_consumed_desc[%u]", "rx_bytes[%u]",
+ "rx_posted_desc[%u]", "rx_completed_desc[%u]", "rx_consumed_desc[%u]",
+ "rx_bytes[%u]", "rx_header_bytes[%u]",
"rx_cont_packet_cnt[%u]", "rx_frag_flip_cnt[%u]", "rx_frag_copy_cnt[%u]",
+ "rx_frag_alloc_cnt[%u]",
"rx_dropped_pkt[%u]", "rx_copybreak_pkt[%u]", "rx_copied_pkt[%u]",
"rx_queue_drop_cnt[%u]", "rx_no_buffers_posted[%u]",
"rx_drops_packet_over_mru[%u]", "rx_drops_invalid_checksum[%u]",
@@ -63,11 +66,13 @@
"adminq_create_tx_queue_cnt", "adminq_create_rx_queue_cnt",
"adminq_destroy_tx_queue_cnt", "adminq_destroy_rx_queue_cnt",
"adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt",
- "adminq_report_stats_cnt", "adminq_report_link_speed_cnt"
+ "adminq_report_stats_cnt", "adminq_report_link_speed_cnt",
+ "adminq_cfg_flow_rule", "adminq_cfg_rss_cnt"
};
static const char gve_gstrings_priv_flags[][ETH_GSTRING_LEN] = {
- "report-stats",
+ "report-stats", "enable-header-split", "enable-strict-header-split",
+ "enable-max-rx-buffer-size"
};
#define GVE_MAIN_STATS_LEN ARRAY_SIZE(gve_gstrings_main_stats)
@@ -140,11 +145,16 @@
gve_get_ethtool_stats(struct net_device *netdev,
struct ethtool_stats *stats, u64 *data)
{
- u64 tmp_rx_pkts, tmp_rx_bytes, tmp_rx_skb_alloc_fail,
- tmp_rx_buf_alloc_fail, tmp_rx_desc_err_dropped_pkt,
+ u64 tmp_rx_pkts, tmp_rx_pkts_sph, tmp_rx_pkts_hbo, tmp_rx_devmem_pkt,
+ tmp_rx_devmem_dropped, tmp_rx_bytes,
+ tmp_rx_hbytes, tmp_rx_skb_alloc_fail, tmp_rx_buf_alloc_fail,
+ tmp_rx_desc_err_dropped_pkt, tmp_rx_hsplit_err_dropped_pkt,
tmp_tx_pkts, tmp_tx_bytes;
- u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_pkts,
- rx_skb_alloc_fail, rx_bytes, tx_pkts, tx_bytes, tx_dropped;
+
+ u64 rx_buf_alloc_fail, rx_desc_err_dropped_pkt, rx_hsplit_err_dropped_pkt,
+ rx_pkts, rx_pkts_sph, rx_pkts_hbo, rx_devmem_pkt, rx_devmem_dropped,
+ rx_skb_alloc_fail, rx_bytes,
+ tx_pkts, tx_bytes, tx_dropped;
int stats_idx, base_stats_idx, max_stats_idx;
struct stats *report_stats;
int *rx_qid_to_stats_idx;
@@ -169,28 +179,42 @@
kfree(rx_qid_to_stats_idx);
return;
}
- for (rx_pkts = 0, rx_bytes = 0, rx_skb_alloc_fail = 0,
- rx_buf_alloc_fail = 0, rx_desc_err_dropped_pkt = 0, ring = 0;
+ for (rx_pkts = 0, rx_bytes = 0, rx_pkts_sph = 0, rx_pkts_hbo = 0,
+ rx_devmem_pkt = 0, rx_devmem_dropped = 0,
+ rx_skb_alloc_fail = 0, rx_buf_alloc_fail = 0,
+ rx_desc_err_dropped_pkt = 0, rx_hsplit_err_dropped_pkt = 0,
+ ring = 0;
ring < priv->rx_cfg.num_queues; ring++) {
if (priv->rx) {
do {
struct gve_rx_ring *rx = &priv->rx[ring];
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
tmp_rx_pkts = rx->rpackets;
+ tmp_rx_pkts_sph = rx->rx_hsplit_pkt;
+ tmp_rx_pkts_hbo = rx->rx_hsplit_hbo_pkt;
+ tmp_rx_devmem_pkt = rx->rx_devmem_pkt;
+ tmp_rx_devmem_dropped = rx->rx_devmem_dropped;
tmp_rx_bytes = rx->rbytes;
tmp_rx_skb_alloc_fail = rx->rx_skb_alloc_fail;
tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail;
tmp_rx_desc_err_dropped_pkt =
rx->rx_desc_err_dropped_pkt;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ tmp_rx_hsplit_err_dropped_pkt =
+ rx->rx_hsplit_err_dropped_pkt;
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
rx_pkts += tmp_rx_pkts;
+ rx_pkts_sph += tmp_rx_pkts_sph;
+ rx_pkts_hbo += tmp_rx_pkts_hbo;
+ rx_devmem_pkt += tmp_rx_devmem_pkt;
+ rx_devmem_dropped += tmp_rx_devmem_dropped;
rx_bytes += tmp_rx_bytes;
rx_skb_alloc_fail += tmp_rx_skb_alloc_fail;
rx_buf_alloc_fail += tmp_rx_buf_alloc_fail;
rx_desc_err_dropped_pkt += tmp_rx_desc_err_dropped_pkt;
+ rx_hsplit_err_dropped_pkt += tmp_rx_hsplit_err_dropped_pkt;
}
}
for (tx_pkts = 0, tx_bytes = 0, tx_dropped = 0, ring = 0;
@@ -198,10 +222,10 @@
if (priv->tx) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
tmp_tx_pkts = priv->tx[ring].pkt_done;
tmp_tx_bytes = priv->tx[ring].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
tx_pkts += tmp_tx_pkts;
tx_bytes += tmp_tx_bytes;
@@ -211,6 +235,10 @@
i = 0;
data[i++] = rx_pkts;
+ data[i++] = rx_pkts_sph;
+ data[i++] = rx_pkts_hbo;
+ data[i++] = rx_devmem_pkt;
+ data[i++] = rx_devmem_dropped;
data[i++] = tx_pkts;
data[i++] = rx_bytes;
data[i++] = tx_bytes;
@@ -222,6 +250,7 @@
data[i++] = rx_skb_alloc_fail;
data[i++] = rx_buf_alloc_fail;
data[i++] = rx_desc_err_dropped_pkt;
+ data[i++] = rx_hsplit_err_dropped_pkt;
data[i++] = priv->interface_up_cnt;
data[i++] = priv->interface_down_cnt;
data[i++] = priv->reset_cnt;
@@ -259,18 +288,21 @@
data[i++] = rx->fill_cnt - rx->cnt;
do {
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
tmp_rx_bytes = rx->rbytes;
+ tmp_rx_hbytes = rx->rheader_bytes;
tmp_rx_skb_alloc_fail = rx->rx_skb_alloc_fail;
tmp_rx_buf_alloc_fail = rx->rx_buf_alloc_fail;
tmp_rx_desc_err_dropped_pkt =
rx->rx_desc_err_dropped_pkt;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
data[i++] = tmp_rx_bytes;
+ data[i++] = tmp_rx_hbytes;
data[i++] = rx->rx_cont_packet_cnt;
data[i++] = rx->rx_frag_flip_cnt;
data[i++] = rx->rx_frag_copy_cnt;
+ data[i++] = rx->rx_frag_alloc_cnt;
/* rx dropped packets */
data[i++] = tmp_rx_skb_alloc_fail +
tmp_rx_buf_alloc_fail +
@@ -331,9 +363,9 @@
}
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
tmp_tx_bytes = tx->bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
data[i++] = tmp_tx_bytes;
data[i++] = tx->wake_queue;
@@ -374,6 +406,8 @@
data[i++] = priv->adminq_set_driver_parameter_cnt;
data[i++] = priv->adminq_report_stats_cnt;
data[i++] = priv->adminq_report_link_speed_cnt;
+ data[i++] = priv->adminq_cfg_flow_rule_cnt;
+ data[i++] = priv->adminq_cfg_rss_cnt;
}
static void gve_get_channels(struct net_device *netdev,
@@ -410,7 +444,7 @@
if (!new_rx || !new_tx)
return -EINVAL;
- if (!netif_carrier_ok(netdev)) {
+ if (!netif_running(netdev)) {
priv->tx_cfg.num_queues = new_tx;
priv->rx_cfg.num_queues = new_rx;
return 0;
@@ -441,7 +475,7 @@
if (*flags == ETH_RESET_ALL) {
*flags = 0;
- return gve_reset(priv, true);
+ return gve_reset(priv, true, GVE_RESET_REASON_RESET_BY_USER);
}
return -EOPNOTSUPP;
@@ -488,28 +522,69 @@
static u32 gve_get_priv_flags(struct net_device *netdev)
{
struct gve_priv *priv = netdev_priv(netdev);
- u32 ret_flags = 0;
-
- /* Only 1 flag exists currently: report-stats (BIT(O)), so set that flag. */
- if (priv->ethtool_flags & BIT(0))
- ret_flags |= BIT(0);
- return ret_flags;
+ return priv->ethtool_flags & GVE_PRIV_FLAGS_MASK;
}
static int gve_set_priv_flags(struct net_device *netdev, u32 flags)
{
struct gve_priv *priv = netdev_priv(netdev);
- u64 ori_flags, new_flags;
+ u64 ori_flags, new_flags, flag_diff;
+ int new_packet_buffer_size;
+
+ /* If turning off header split, strict header split will be turned off too*/
+ if (gve_get_enable_header_split(priv) &&
+ !(flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT))) {
+ flags &= ~BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ flags &= ~BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT);
+ }
+
+ /* If strict header-split is requested, turn on regular header-split */
+ if (flags & BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT))
+ flags |= BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+
+ /* Make sure header-split is available */
+ if ((flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT)) &&
+ !(priv->ethtool_defaults & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT))) {
+ dev_err(&priv->pdev->dev,
+ "Header-split not available\n");
+ return -EINVAL;
+ }
+
+ if ((flags & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE)) &&
+ priv->dev_max_rx_buffer_size <= GVE_MIN_RX_BUFFER_SIZE) {
+ dev_err(&priv->pdev->dev,
+ "Max-rx-buffer-size not available\n");
+ return -EINVAL;
+ }
ori_flags = READ_ONCE(priv->ethtool_flags);
- new_flags = ori_flags;
- /* Only one priv flag exists: report-stats (BIT(0))*/
- if (flags & BIT(0))
- new_flags |= BIT(0);
- else
- new_flags &= ~(BIT(0));
+ new_flags = flags & GVE_PRIV_FLAGS_MASK;
+
+ flag_diff = new_flags ^ ori_flags;
+
+ if ((flag_diff & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT)) ||
+ (flag_diff & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE))) {
+ bool enable_hdr_split =
+ new_flags & BIT(GVE_PRIV_FLAGS_ENABLE_HEADER_SPLIT);
+ bool enable_max_buffer_size =
+ new_flags & BIT(GVE_PRIV_FLAGS_ENABLE_MAX_RX_BUFFER_SIZE);
+ int err;
+
+ if (enable_max_buffer_size)
+ new_packet_buffer_size = priv->dev_max_rx_buffer_size;
+ else
+ new_packet_buffer_size = GVE_RX_BUFFER_SIZE_DQO;
+
+ err = gve_reconfigure_rx_rings(priv,
+ enable_hdr_split,
+ new_packet_buffer_size);
+ if (err)
+ return err;
+ }
+
priv->ethtool_flags = new_flags;
+
/* start report-stats timer when user turns report stats on. */
if (flags & BIT(0)) {
mod_timer(&priv->stats_report_timer,
@@ -528,6 +603,10 @@
sizeof(struct stats));
del_timer_sync(&priv->stats_report_timer);
}
+ priv->header_split_strict =
+ (priv->ethtool_flags &
+ BIT(GVE_PRIV_FLAGS_ENABLE_STRICT_HEADER_SPLIT)) ? true : false;
+
return 0;
}
@@ -604,6 +683,630 @@
return 0;
}
+static u32 gve_get_rxfh_key_size(struct net_device *netdev)
+{
+ return GVE_RSS_KEY_SIZE;
+}
+
+static u32 gve_get_rxfh_indir_size(struct net_device *netdev)
+{
+ return GVE_RSS_INDIR_SIZE;
+}
+
+static int gve_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
+ u8 *hfunc)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ u16 i;
+
+ if (hfunc) {
+ switch (rss_config->alg) {
+ case GVE_RSS_HASH_TOEPLITZ:
+ *hfunc = ETH_RSS_HASH_TOP;
+ break;
+ case GVE_RSS_HASH_UNDEFINED:
+ default:
+ return -EOPNOTSUPP;
+ }
+ }
+ if (key)
+ memcpy(key, rss_config->key, rss_config->key_size);
+
+ if (indir)
+ /* Each 32 bits pointed by 'indir' is stored with a lut entry */
+ for (i = 0; i < rss_config->indir_size; i++)
+ indir[i] = (u32)rss_config->indir[i];
+
+ return 0;
+}
+
+static int gve_set_rxfh(struct net_device *netdev, const u32 *indir,
+ const u8 *key, const u8 hfunc)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ bool init = false;
+ u16 i;
+ int err = 0;
+
+ /* Initialize RSS if not configured before */
+ if (rss_config->alg == GVE_RSS_HASH_UNDEFINED) {
+ err = gve_rss_config_init(priv);
+ if (err)
+ return err;
+ init = true;
+ }
+
+ switch (hfunc) {
+ case ETH_RSS_HASH_NO_CHANGE:
+ break;
+ case ETH_RSS_HASH_TOP:
+ rss_config->alg = GVE_RSS_HASH_TOEPLITZ;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ if (!key && !indir && !init)
+ return 0;
+
+ if (key)
+ memcpy(rss_config->key, key, rss_config->key_size);
+
+ if (indir) {
+ /* Each 32 bits pointed by 'indir' is stored with a lut entry */
+ for (i = 0; i < rss_config->indir_size; i++)
+ rss_config->indir[i] = indir[i];
+ }
+
+ return gve_adminq_configure_rss(priv, rss_config);
+}
+
+static const char *gve_flow_type_name(enum gve_adminq_flow_type flow_type)
+{
+ switch (flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_TCPV6:
+ return "TCP";
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_UDPV6:
+ return "UDP";
+ case GVE_FLOW_TYPE_SCTPV4:
+ case GVE_FLOW_TYPE_SCTPV6:
+ return "SCTP";
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_AHV6:
+ return "AH";
+ case GVE_FLOW_TYPE_ESPV4:
+ case GVE_FLOW_TYPE_ESPV6:
+ return "ESP";
+ }
+ return NULL;
+}
+
+static void gve_print_flow_rule(struct gve_priv *priv,
+ struct gve_flow_rule *rule)
+{
+ const char *proto = gve_flow_type_name(rule->flow_type);
+
+ if (!proto)
+ return;
+
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ case GVE_FLOW_TYPE_UDPV4:
+ case GVE_FLOW_TYPE_SCTPV4:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI4 src_ip %pI4 %s: dst_port %hu src_port %hu\n",
+ rule->loc,
+ &rule->key.dst_ip[0],
+ &rule->key.src_ip[0],
+ proto,
+ ntohs(rule->key.dst_port),
+ ntohs(rule->key.src_port));
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ case GVE_FLOW_TYPE_ESPV4:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI4 src_ip %pI4 %s: spi %hu\n",
+ rule->loc,
+ &rule->key.dst_ip[0],
+ &rule->key.src_ip[0],
+ proto,
+ ntohl(rule->key.spi));
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ case GVE_FLOW_TYPE_UDPV6:
+ case GVE_FLOW_TYPE_SCTPV6:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI6 src_ip %pI6 %s: dst_port %hu src_port %hu\n",
+ rule->loc,
+ &rule->key.dst_ip,
+ &rule->key.src_ip,
+ proto,
+ ntohs(rule->key.dst_port),
+ ntohs(rule->key.src_port));
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ case GVE_FLOW_TYPE_ESPV6:
+ dev_info_ratelimited(&priv->pdev->dev, "Rule ID: %u dst_ip: %pI6 src_ip %pI6 %s: spi %hu\n",
+ rule->loc,
+ &rule->key.dst_ip,
+ &rule->key.src_ip,
+ proto,
+ ntohl(rule->key.spi));
+ break;
+ default:
+ break;
+ }
+}
+
+static bool gve_flow_rule_is_dup_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ struct gve_flow_rule *tmp;
+
+ list_for_each_entry(tmp, &priv->flow_rules, list) {
+ if (tmp->flow_type != rule->flow_type)
+ continue;
+
+ if (!memcmp(&tmp->key, &rule->key,
+ sizeof(struct gve_flow_spec)) &&
+ !memcmp(&tmp->mask, &rule->mask,
+ sizeof(struct gve_flow_spec)))
+ return true;
+ }
+ return false;
+}
+
+static struct gve_flow_rule *gve_find_flow_rule_by_loc(struct gve_priv *priv, u16 loc)
+{
+ struct gve_flow_rule *rule;
+
+ list_for_each_entry(rule, &priv->flow_rules, list)
+ if (rule->loc == loc)
+ return rule;
+
+ return NULL;
+}
+
+static void gve_flow_rules_add_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ struct gve_flow_rule *tmp, *parent = NULL;
+
+ list_for_each_entry(tmp, &priv->flow_rules, list) {
+ if (tmp->loc >= rule->loc)
+ break;
+ parent = tmp;
+ }
+
+ if (parent)
+ list_add(&rule->list, &parent->list);
+ else
+ list_add(&rule->list, &priv->flow_rules);
+
+ priv->flow_rules_cnt++;
+}
+
+static void gve_flow_rules_del_rule(struct gve_priv *priv, struct gve_flow_rule *rule)
+{
+ list_del(&rule->list);
+ kvfree(rule);
+ priv->flow_rules_cnt--;
+}
+
+static int
+gve_get_flow_rule_entry(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ mutex_lock(&priv->flow_rules_lock);
+ rule = gve_find_flow_rule_by_loc(priv, fsp->location);
+ if (!rule) {
+ err = -EINVAL;
+ goto ret;
+ }
+
+ switch (rule->flow_type) {
+ case GVE_FLOW_TYPE_TCPV4:
+ fsp->flow_type = TCP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_UDPV4:
+ fsp->flow_type = UDP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_SCTPV4:
+ fsp->flow_type = SCTP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_AHV4:
+ fsp->flow_type = AH_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_ESPV4:
+ fsp->flow_type = ESP_V4_FLOW;
+ break;
+ case GVE_FLOW_TYPE_TCPV6:
+ fsp->flow_type = TCP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_UDPV6:
+ fsp->flow_type = UDP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_SCTPV6:
+ fsp->flow_type = SCTP_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_AHV6:
+ fsp->flow_type = AH_V6_FLOW;
+ break;
+ case GVE_FLOW_TYPE_ESPV6:
+ fsp->flow_type = ESP_V6_FLOW;
+ break;
+ default:
+ err = -EINVAL;
+ goto ret;
+ }
+
+ memset(&fsp->h_u, 0, sizeof(fsp->h_u));
+ memset(&fsp->h_ext, 0, sizeof(fsp->h_ext));
+ memset(&fsp->m_u, 0, sizeof(fsp->m_u));
+ memset(&fsp->m_ext, 0, sizeof(fsp->m_ext));
+
+ switch (fsp->flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ case SCTP_V4_FLOW:
+ fsp->h_u.tcp_ip4_spec.ip4src = rule->key.src_ip[0];
+ fsp->h_u.tcp_ip4_spec.ip4dst = rule->key.dst_ip[0];
+ fsp->h_u.tcp_ip4_spec.psrc = rule->key.src_port;
+ fsp->h_u.tcp_ip4_spec.pdst = rule->key.dst_port;
+ fsp->h_u.tcp_ip4_spec.tos = rule->key.tos;
+ fsp->m_u.tcp_ip4_spec.ip4src = rule->mask.src_ip[0];
+ fsp->m_u.tcp_ip4_spec.ip4dst = rule->mask.dst_ip[0];
+ fsp->m_u.tcp_ip4_spec.psrc = rule->mask.src_port;
+ fsp->m_u.tcp_ip4_spec.pdst = rule->mask.dst_port;
+ fsp->m_u.tcp_ip4_spec.tos = rule->mask.tos;
+ break;
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ fsp->h_u.ah_ip4_spec.ip4src = rule->key.src_ip[0];
+ fsp->h_u.ah_ip4_spec.ip4dst = rule->key.dst_ip[0];
+ fsp->h_u.ah_ip4_spec.spi = rule->key.spi;
+ fsp->h_u.ah_ip4_spec.tos = rule->key.tos;
+ fsp->m_u.ah_ip4_spec.ip4src = rule->mask.src_ip[0];
+ fsp->m_u.ah_ip4_spec.ip4dst = rule->mask.dst_ip[0];
+ fsp->m_u.ah_ip4_spec.spi = rule->mask.spi;
+ fsp->m_u.ah_ip4_spec.tos = rule->mask.tos;
+ break;
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ case SCTP_V6_FLOW:
+ memcpy(fsp->h_u.tcp_ip6_spec.ip6src, &rule->key.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->h_u.tcp_ip6_spec.ip6dst, &rule->key.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->h_u.tcp_ip6_spec.psrc = rule->key.src_port;
+ fsp->h_u.tcp_ip6_spec.pdst = rule->key.dst_port;
+ fsp->h_u.tcp_ip6_spec.tclass = rule->key.tclass;
+ memcpy(fsp->m_u.tcp_ip6_spec.ip6src, &rule->mask.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->m_u.tcp_ip6_spec.ip6dst, &rule->mask.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->m_u.tcp_ip6_spec.psrc = rule->mask.src_port;
+ fsp->m_u.tcp_ip6_spec.pdst = rule->mask.dst_port;
+ fsp->m_u.tcp_ip6_spec.tclass = rule->mask.tclass;
+ break;
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ memcpy(fsp->h_u.ah_ip6_spec.ip6src, &rule->key.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->h_u.ah_ip6_spec.ip6dst, &rule->key.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->h_u.ah_ip6_spec.spi = rule->key.spi;
+ fsp->h_u.ah_ip6_spec.tclass = rule->key.tclass;
+ memcpy(fsp->m_u.ah_ip6_spec.ip6src, &rule->mask.src_ip,
+ sizeof(struct in6_addr));
+ memcpy(fsp->m_u.ah_ip6_spec.ip6dst, &rule->mask.dst_ip,
+ sizeof(struct in6_addr));
+ fsp->m_u.ah_ip6_spec.spi = rule->mask.spi;
+ fsp->m_u.ah_ip6_spec.tclass = rule->mask.tclass;
+ break;
+ default:
+ err = -EINVAL;
+ goto ret;
+ }
+
+ fsp->ring_cookie = rule->action;
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int
+gve_get_flow_rule_ids(struct gve_priv *priv, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct gve_flow_rule *rule;
+ unsigned int cnt = 0;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ cmd->data = priv->flow_rules_max;
+
+ mutex_lock(&priv->flow_rules_lock);
+ list_for_each_entry(rule, &priv->flow_rules, list) {
+ if (cnt == cmd->rule_cnt) {
+ err = -EMSGSIZE;
+ goto ret;
+ }
+ rule_locs[cnt] = rule->loc;
+ cnt++;
+ }
+ cmd->rule_cnt = cnt;
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int
+gve_add_flow_rule_info(struct gve_priv *priv, struct ethtool_rx_flow_spec *fsp,
+ struct gve_flow_rule *rule)
+{
+ u32 flow_type, q_index = 0;
+
+ if (fsp->ring_cookie == RX_CLS_FLOW_DISC)
+ return -EOPNOTSUPP;
+
+ q_index = fsp->ring_cookie;
+ if (q_index >= priv->rx_cfg.num_queues)
+ return -EINVAL;
+
+ rule->action = q_index;
+ rule->loc = fsp->location;
+
+ flow_type = fsp->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+ switch (flow_type) {
+ case TCP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+ break;
+ case UDP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+ break;
+ case SCTP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+ break;
+ case AH_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_AHV4;
+ break;
+ case ESP_V4_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+ break;
+ case TCP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+ break;
+ case UDP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+ break;
+ case SCTP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+ break;
+ case AH_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_AHV6;
+ break;
+ case ESP_V6_FLOW:
+ rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ switch (flow_type) {
+ case TCP_V4_FLOW:
+ case UDP_V4_FLOW:
+ case SCTP_V4_FLOW:
+ rule->key.src_ip[0] = fsp->h_u.tcp_ip4_spec.ip4src;
+ rule->key.dst_ip[0] = fsp->h_u.tcp_ip4_spec.ip4dst;
+ rule->key.src_port = fsp->h_u.tcp_ip4_spec.psrc;
+ rule->key.dst_port = fsp->h_u.tcp_ip4_spec.pdst;
+ rule->mask.src_ip[0] = fsp->m_u.tcp_ip4_spec.ip4src;
+ rule->mask.dst_ip[0] = fsp->m_u.tcp_ip4_spec.ip4dst;
+ rule->mask.src_port = fsp->m_u.tcp_ip4_spec.psrc;
+ rule->mask.dst_port = fsp->m_u.tcp_ip4_spec.pdst;
+ break;
+ case AH_V4_FLOW:
+ case ESP_V4_FLOW:
+ rule->key.src_ip[0] = fsp->h_u.tcp_ip4_spec.ip4src;
+ rule->key.dst_ip[0] = fsp->h_u.tcp_ip4_spec.ip4dst;
+ rule->key.spi = fsp->h_u.ah_ip4_spec.spi;
+ rule->mask.src_ip[0] = fsp->m_u.tcp_ip4_spec.ip4src;
+ rule->mask.dst_ip[0] = fsp->m_u.tcp_ip4_spec.ip4dst;
+ rule->mask.spi = fsp->m_u.ah_ip4_spec.spi;
+ break;
+ case TCP_V6_FLOW:
+ case UDP_V6_FLOW:
+ case SCTP_V6_FLOW:
+ memcpy(&rule->key.src_ip, fsp->h_u.tcp_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->key.dst_ip, fsp->h_u.tcp_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.src_port = fsp->h_u.tcp_ip6_spec.psrc;
+ rule->key.dst_port = fsp->h_u.tcp_ip6_spec.pdst;
+ memcpy(&rule->mask.src_ip, fsp->m_u.tcp_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->mask.dst_ip, fsp->m_u.tcp_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->mask.src_port = fsp->m_u.tcp_ip6_spec.psrc;
+ rule->mask.dst_port = fsp->m_u.tcp_ip6_spec.pdst;
+ break;
+ case AH_V6_FLOW:
+ case ESP_V6_FLOW:
+ memcpy(&rule->key.src_ip, fsp->h_u.usr_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->key.dst_ip, fsp->h_u.usr_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.spi = fsp->h_u.ah_ip6_spec.spi;
+ memcpy(&rule->mask.src_ip, fsp->m_u.usr_ip6_spec.ip6src,
+ sizeof(struct in6_addr));
+ memcpy(&rule->mask.dst_ip, fsp->m_u.usr_ip6_spec.ip6dst,
+ sizeof(struct in6_addr));
+ rule->key.spi = fsp->h_u.ah_ip6_spec.spi;
+ break;
+ default:
+ /* not doing un-parsed flow types */
+ return -EINVAL;
+ }
+
+ if (gve_flow_rule_is_dup_rule(priv, rule))
+ return -EEXIST;
+
+ return 0;
+}
+
+static int gve_add_flow_rule(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = &cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ if (priv->flow_rules_cnt >= priv->flow_rules_max) {
+ dev_err(&priv->pdev->dev,
+ "Reached the limit of max allowed flow rules (%u)\n",
+ priv->flow_rules_max);
+ return -ENOSPC;
+ }
+
+ mutex_lock(&priv->flow_rules_lock);
+ if (gve_find_flow_rule_by_loc(priv, fsp->location)) {
+ dev_err(&priv->pdev->dev, "Flow rule %d already exists\n",
+ fsp->location);
+ err = -EEXIST;
+ goto ret;
+ }
+
+ rule = kvzalloc(sizeof(*rule), GFP_KERNEL);
+ if (!rule) {
+ err = -ENOMEM;
+ goto ret;
+ }
+
+ err = gve_add_flow_rule_info(priv, fsp, rule);
+ if (err)
+ goto ret;
+
+ err = gve_adminq_add_flow_rule(priv, rule);
+ if (err)
+ goto ret;
+
+ gve_flow_rules_add_rule(priv, rule);
+ gve_print_flow_rule(priv, rule);
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ if (err && rule)
+ kfree(rule);
+ return err;
+}
+
+static int gve_del_flow_rule(struct gve_priv *priv, struct ethtool_rxnfc *cmd)
+{
+ struct ethtool_rx_flow_spec *fsp = (struct ethtool_rx_flow_spec *)&cmd->fs;
+ struct gve_flow_rule *rule = NULL;
+ int err = 0;
+
+ if (priv->flow_rules_max == 0)
+ return -EOPNOTSUPP;
+
+ mutex_lock(&priv->flow_rules_lock);
+ rule = gve_find_flow_rule_by_loc(priv, fsp->location);
+ if (!rule) {
+ err = -EINVAL;
+ goto ret;
+ }
+
+ err = gve_adminq_del_flow_rule(priv, fsp->location);
+ if (err)
+ goto ret;
+
+ gve_flow_rules_del_rule(priv, rule);
+
+ret:
+ mutex_unlock(&priv->flow_rules_lock);
+ return err;
+}
+
+static int gve_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ int err = -EOPNOTSUPP;
+
+ dev_hold(netdev);
+ rtnl_unlock();
+ if (!(netdev->features & NETIF_F_NTUPLE))
+ goto ret;
+
+ switch (cmd->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ err = gve_add_flow_rule(priv, cmd);
+ break;
+ case ETHTOOL_SRXCLSRLDEL:
+ err = gve_del_flow_rule(priv, cmd);
+ break;
+ case ETHTOOL_SRXFH:
+ /* not supported */
+ break;
+ default:
+ break;
+ }
+
+ret:
+ rtnl_lock();
+ dev_put(netdev);
+ return err;
+}
+
+static int gve_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
+ u32 *rule_locs)
+{
+ struct gve_priv *priv = netdev_priv(netdev);
+ int err = -EOPNOTSUPP;
+
+ dev_hold(netdev);
+ rtnl_unlock();
+ switch (cmd->cmd) {
+ case ETHTOOL_GRXRINGS:
+ cmd->data = priv->rx_cfg.num_queues;
+ err = 0;
+ break;
+ case ETHTOOL_GRXCLSRLCNT:
+ if (priv->flow_rules_max == 0)
+ break;
+ cmd->rule_cnt = priv->flow_rules_cnt;
+ cmd->data = priv->flow_rules_max;
+ err = 0;
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ err = gve_get_flow_rule_entry(priv, cmd);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ err = gve_get_flow_rule_ids(priv, cmd, (u32 *)rule_locs);
+ break;
+ case ETHTOOL_GRXFH:
+ /* not supported */
+ break;
+ default:
+ break;
+ }
+
+ rtnl_lock();
+ dev_put(netdev);
+ return err;
+}
+
const struct ethtool_ops gve_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS,
.get_drvinfo = gve_get_drvinfo,
@@ -614,6 +1317,12 @@
.get_msglevel = gve_get_msglevel,
.set_channels = gve_set_channels,
.get_channels = gve_get_channels,
+ .set_rxnfc = gve_set_rxnfc,
+ .get_rxnfc = gve_get_rxnfc,
+ .get_rxfh_indir_size = gve_get_rxfh_indir_size,
+ .get_rxfh_key_size = gve_get_rxfh_key_size,
+ .get_rxfh = gve_get_rxfh,
+ .set_rxfh = gve_set_rxfh,
.get_link = ethtool_op_get_link,
.get_coalesce = gve_get_coalesce,
.set_coalesce = gve_set_coalesce,
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index d3f6ad5..77190d4 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -12,6 +12,9 @@
#include <linux/sched.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
+#include <linux/utsname.h>
+#include <linux/version.h>
+#include <linux/dma-buf.h>
#include <net/sch_generic.h>
#include "gve.h"
#include "gve_dqo.h"
@@ -30,6 +33,61 @@
const char gve_version_str[] = GVE_VERSION;
static const char gve_version_prefix[] = GVE_VERSION_PREFIX;
+static int gve_verify_driver_compatibility(struct gve_priv *priv)
+{
+ int err;
+ struct gve_driver_info *driver_info;
+ dma_addr_t driver_info_bus;
+
+ driver_info = dma_alloc_coherent(&priv->pdev->dev,
+ sizeof(struct gve_driver_info),
+ &driver_info_bus, GFP_KERNEL);
+ if (!driver_info)
+ return -ENOMEM;
+
+ *driver_info = (struct gve_driver_info) {
+ .os_type = 1, /* Linux */
+ .os_version_major = cpu_to_be32(LINUX_VERSION_MAJOR),
+ .os_version_minor = cpu_to_be32(LINUX_VERSION_SUBLEVEL),
+ .os_version_sub = cpu_to_be32(LINUX_VERSION_PATCHLEVEL),
+ .driver_capability_flags = {
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS1),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS2),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS3),
+ cpu_to_be64(GVE_DRIVER_CAPABILITY_FLAGS4),
+ },
+ };
+ strscpy(driver_info->os_version_str1, utsname()->release,
+ sizeof(driver_info->os_version_str1));
+ strscpy(driver_info->os_version_str2, utsname()->version,
+ sizeof(driver_info->os_version_str2));
+
+ err = gve_adminq_verify_driver_compatibility(priv,
+ sizeof(struct gve_driver_info),
+ driver_info_bus);
+
+ /* It's ok if the device doesn't support this */
+ if (err == -EOPNOTSUPP)
+ err = 0;
+
+ dma_free_coherent(&priv->pdev->dev,
+ sizeof(struct gve_driver_info),
+ driver_info, driver_info_bus);
+ return err;
+}
+
+static netdev_features_t gve_features_check(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ struct gve_priv *priv = netdev_priv(dev);
+
+ if (!gve_is_gqi(priv))
+ return gve_features_check_dqo(skb, dev, features);
+
+ return features;
+}
+
static netdev_tx_t gve_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct gve_priv *priv = netdev_priv(dev);
@@ -51,10 +109,10 @@
for (ring = 0; ring < priv->rx_cfg.num_queues; ring++) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->rx[ring].statss);
+ u64_stats_fetch_begin(&priv->rx[ring].statss);
packets = priv->rx[ring].rpackets;
bytes = priv->rx[ring].rbytes;
- } while (u64_stats_fetch_retry_irq(&priv->rx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->rx[ring].statss,
start));
s->rx_packets += packets;
s->rx_bytes += bytes;
@@ -64,10 +122,10 @@
for (ring = 0; ring < priv->tx_cfg.num_queues; ring++) {
do {
start =
- u64_stats_fetch_begin_irq(&priv->tx[ring].statss);
+ u64_stats_fetch_begin(&priv->tx[ring].statss);
packets = priv->tx[ring].pkt_done;
bytes = priv->tx[ring].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[ring].statss,
+ } while (u64_stats_fetch_retry(&priv->tx[ring].statss,
start));
s->tx_packets += packets;
s->tx_bytes += bytes;
@@ -162,6 +220,77 @@
priv->stats_report = NULL;
}
+static void gve_tx_timeout_for_miss_path(struct net_device *dev, unsigned int txqueue)
+{
+ struct gve_notify_block *block;
+ struct gve_tx_ring *tx = NULL;
+ bool has_work = false;
+ struct gve_priv *priv;
+ u32 ntfy_idx;
+
+ priv = netdev_priv(dev);
+
+ ntfy_idx = gve_tx_idx_to_ntfy(priv, txqueue);
+ if (ntfy_idx > priv->num_ntfy_blks)
+ return;
+
+ block = &priv->ntfy_blocks[ntfy_idx];
+ tx = block->tx;
+ if (!tx)
+ return;
+
+ /* Check to see if there is pending work */
+ has_work = gve_tx_work_pending_dqo(tx);
+ if (!has_work)
+ return;
+
+ if (READ_ONCE(tx->dqo_compl.kicked)) {
+ netdev_warn(dev,
+ "TX timeout on queue %d. Scheduling reset.",
+ txqueue);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_TX_MISS_PATH_TIMEOUT);
+ gve_schedule_reset(priv);
+ }
+
+ gve_write_irq_doorbell_dqo(priv, block, GVE_ITR_NO_UPDATE_DQO);
+
+ netdev_info(dev, "Kicking tx queue %d for miss path", txqueue);
+ napi_schedule(&block->napi);
+ WRITE_ONCE(tx->dqo_compl.kicked, true);
+}
+
+static void gve_tx_timeout_timer(struct timer_list *t)
+{
+ struct gve_priv *priv = from_timer(priv, t, tx_timeout_timer);
+ int i;
+
+ for(i = 0; i < priv->tx_cfg.num_queues; i++) {
+ if (time_after(jiffies, READ_ONCE(priv->tx[i].dqo_compl.last_processed)
+ + priv->tx_timeout_period)) {
+ gve_tx_timeout_for_miss_path(priv->dev, i);
+ }
+ }
+ mod_timer(&priv->tx_timeout_timer,
+ jiffies + priv->tx_timeout_period);
+}
+
+static int gve_setup_tx_timeout_timer(struct gve_priv *priv)
+{
+ /* Set up 1 sec timer to check no reinjection on miss path */
+ if (gve_is_gqi(priv))
+ return 0;
+ priv->tx_timeout_period = GVE_TX_TIMEOUT_PERIOD;
+ timer_setup(&priv->tx_timeout_timer, gve_tx_timeout_timer, 0);
+ return 0;
+}
+
+static void gve_free_tx_timeout_timer(struct gve_priv *priv)
+{
+ if (gve_is_gqi(priv))
+ return;
+ del_timer_sync(&priv->tx_timeout_timer);
+}
+
static irqreturn_t gve_mgmnt_intr(int irq, void *arg)
{
struct gve_priv *priv = arg;
@@ -269,7 +398,6 @@
static int gve_alloc_notify_blocks(struct gve_priv *priv)
{
int num_vecs_requested = priv->num_ntfy_blks + 1;
- char *name = priv->dev->name;
unsigned int active_cpus;
int vecs_enabled;
int i, j;
@@ -313,8 +441,8 @@
active_cpus = min_t(int, priv->num_ntfy_blks / 2, num_online_cpus());
/* Setup Management Vector - the last vector */
- snprintf(priv->mgmt_msix_name, sizeof(priv->mgmt_msix_name), "%s-mgmnt",
- name);
+ snprintf(priv->mgmt_msix_name, sizeof(priv->mgmt_msix_name), "gve-mgmnt@pci:%s",
+ pci_name(priv->pdev));
err = request_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector,
gve_mgmnt_intr, 0, priv->mgmt_msix_name, priv);
if (err) {
@@ -343,8 +471,8 @@
struct gve_notify_block *block = &priv->ntfy_blocks[i];
int msix_idx = i;
- snprintf(block->name, sizeof(block->name), "%s-ntfy-block.%d",
- name, i);
+ snprintf(block->name, sizeof(block->name), "gve-ntfy-blk%d@pci:%s",
+ i, pci_name(priv->pdev));
block->priv = priv;
err = request_irq(priv->msix_vectors[msix_idx].vector,
gve_is_gqi(priv) ? gve_intr : gve_intr_dqo,
@@ -423,9 +551,12 @@
err = gve_alloc_notify_blocks(priv);
if (err)
goto abort_with_counter;
+ err = gve_setup_tx_timeout_timer(priv);
+ if(err)
+ goto abort_with_ntfy_blocks;
err = gve_alloc_stats_report(priv);
if (err)
- goto abort_with_ntfy_blocks;
+ goto abort_with_tx_timeout;
err = gve_adminq_configure_device_resources(priv,
priv->counter_array_bus,
priv->num_event_counters,
@@ -438,7 +569,7 @@
goto abort_with_stats_report;
}
- if (priv->queue_format == GVE_DQO_RDA_FORMAT) {
+ if (!gve_is_gqi(priv)) {
priv->ptype_lut_dqo = kvzalloc(sizeof(*priv->ptype_lut_dqo),
GFP_KERNEL);
if (!priv->ptype_lut_dqo) {
@@ -467,6 +598,8 @@
priv->ptype_lut_dqo = NULL;
abort_with_stats_report:
gve_free_stats_report(priv);
+abort_with_tx_timeout:
+ gve_free_tx_timeout_timer(priv);
abort_with_ntfy_blocks:
gve_free_notify_blocks(priv);
abort_with_counter:
@@ -475,7 +608,8 @@
return err;
}
-static void gve_trigger_reset(struct gve_priv *priv);
+static void gve_trigger_reset(struct gve_priv *priv,
+ enum gve_reset_reason reason);
static void gve_teardown_device_resources(struct gve_priv *priv)
{
@@ -483,28 +617,39 @@
/* Tell device its resources are being freed */
if (gve_get_device_resources_ok(priv)) {
+ if (priv->flow_rules_cnt != 0) {
+ err = gve_adminq_reset_flow_rules(priv);
+ if (err) {
+ dev_err(&priv->pdev->dev,
+ "Failed to reset flow rules: err=%d\n", err);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
+ }
+ }
/* detach the stats report */
err = gve_adminq_report_stats(priv, 0, 0x0, GVE_STATS_REPORT_TIMER_PERIOD);
if (err) {
dev_err(&priv->pdev->dev,
"Failed to detach stats report: err=%d\n", err);
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
}
err = gve_adminq_deconfigure_device_resources(priv);
if (err) {
dev_err(&priv->pdev->dev,
"Could not deconfigure device resources: err=%d\n",
err);
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
}
}
kvfree(priv->ptype_lut_dqo);
priv->ptype_lut_dqo = NULL;
+ gve_flow_rules_release(priv);
+ gve_rss_config_release(&priv->rss_config);
gve_free_counter_array(priv);
gve_free_notify_blocks(priv);
gve_free_stats_report(priv);
+ gve_free_tx_timeout_timer(priv);
gve_clear_device_resources_ok(priv);
}
@@ -563,23 +708,11 @@
return 0;
}
-static int gve_create_rings(struct gve_priv *priv)
+static int gve_create_rx_rings(struct gve_priv *priv)
{
int err;
int i;
- err = gve_adminq_create_tx_queues(priv, priv->tx_cfg.num_queues);
- if (err) {
- netif_err(priv, drv, priv->dev, "failed to create %d tx queues\n",
- priv->tx_cfg.num_queues);
- /* This failure will trigger a reset - no need to clean
- * up
- */
- return err;
- }
- netif_dbg(priv, drv, priv->dev, "created %d tx queues\n",
- priv->tx_cfg.num_queues);
-
err = gve_adminq_create_rx_queues(priv, priv->rx_cfg.num_queues);
if (err) {
netif_err(priv, drv, priv->dev, "failed to create %d rx queues\n",
@@ -611,6 +744,39 @@
return 0;
}
+static int gve_create_tx_rings(struct gve_priv *priv, int start_id, u32 num_tx_queues)
+{
+ int err;
+
+ err = gve_adminq_create_tx_queues(priv, start_id, num_tx_queues);
+ if (err) {
+ netif_err(priv, drv, priv->dev, "failed to create %d tx queues\n",
+ num_tx_queues);
+ /* This failure will trigger a reset - no need to clean
+ * up
+ */
+ return err;
+ }
+ netif_dbg(priv, drv, priv->dev, "created %d tx queues\n",
+ num_tx_queues);
+
+ return 0;
+}
+
+static int gve_create_rings(struct gve_priv *priv)
+{
+ int num_tx_queues = gve_num_tx_queues(priv);
+ int err;
+
+ err = gve_create_tx_rings(priv, 0, num_tx_queues);
+ if (err)
+ return err;
+
+ err = gve_create_rx_rings(priv);
+
+ return err;
+}
+
static void add_napi_init_sync_stats(struct gve_priv *priv,
int (*napi_poll)(struct napi_struct *napi,
int budget))
@@ -694,18 +860,10 @@
return err;
}
-static int gve_destroy_rings(struct gve_priv *priv)
+static int gve_destroy_rx_rings(struct gve_priv *priv)
{
int err;
- err = gve_adminq_destroy_tx_queues(priv, priv->tx_cfg.num_queues);
- if (err) {
- netif_err(priv, drv, priv->dev,
- "failed to destroy tx queues\n");
- /* This failure will trigger a reset - no need to clean up */
- return err;
- }
- netif_dbg(priv, drv, priv->dev, "destroyed tx queues\n");
err = gve_adminq_destroy_rx_queues(priv, priv->rx_cfg.num_queues);
if (err) {
netif_err(priv, drv, priv->dev,
@@ -717,6 +875,38 @@
return 0;
}
+static int gve_destroy_tx_rings(struct gve_priv *priv, int start_id, u32 num_queues)
+{
+ int err;
+
+ err = gve_adminq_destroy_tx_queues(priv, start_id, num_queues);
+ if (err) {
+ netif_err(priv, drv, priv->dev,
+ "failed to destroy tx queues\n");
+ /* This failure will trigger a reset - no need to clean up */
+ return err;
+ }
+ netif_dbg(priv, drv, priv->dev, "destroyed tx queues\n");
+
+ return 0;
+}
+
+static int gve_destroy_rings(struct gve_priv *priv)
+{
+ int num_tx_queues = gve_num_tx_queues(priv);
+ int err;
+
+ err = gve_destroy_tx_rings(priv, 0, num_tx_queues);
+ if (err)
+ return err;
+
+ err = gve_destroy_rx_rings(priv);
+ if (err)
+ return err;
+
+ return 0;
+}
+
static void gve_rx_free_rings(struct gve_priv *priv)
{
if (gve_is_gqi(priv))
@@ -811,7 +1001,7 @@
void gve_free_page(struct device *dev, struct page *page, dma_addr_t dma,
enum dma_data_direction dir)
{
- if (!dma_mapping_error(dev, dma))
+ if (!is_dma_buf_page(page) && !dma_mapping_error(dev, dma))
dma_unmap_page(dev, dma, PAGE_SIZE, dir);
if (page)
put_page(page);
@@ -840,6 +1030,7 @@
static int gve_alloc_qpls(struct gve_priv *priv)
{
int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv);
+ int page_count;
int i, j;
int err;
@@ -850,15 +1041,23 @@
if (!priv->qpls)
return -ENOMEM;
+ page_count = priv->tx_pages_per_qpl;
for (i = 0; i < gve_num_tx_qpls(priv); i++) {
err = gve_alloc_queue_page_list(priv, i,
- priv->tx_pages_per_qpl);
+ page_count);
if (err)
goto free_qpls;
}
+
+ /* For GQI_QPL number of pages allocated have 1:1 relationship with
+ * number of descriptors. For DQO, number of pages required are
+ * more than descriptors (because of out of order completions).
+ */
+ page_count = priv->queue_format == GVE_GQI_QPL_FORMAT ?
+ priv->rx_data_slot_cnt : priv->rx_pages_per_qpl;
for (; i < num_qpls; i++) {
err = gve_alloc_queue_page_list(priv, i,
- priv->rx_data_slot_cnt);
+ page_count);
if (err)
goto free_qpls;
}
@@ -907,7 +1106,8 @@
queue_work(priv->gve_wq, &priv->service_task);
}
-static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up);
+static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up,
+ enum gve_reset_reason reason);
static int gve_reset_recovery(struct gve_priv *priv, bool was_up);
static void gve_turndown(struct gve_priv *priv);
static void gve_turnup(struct gve_priv *priv);
@@ -915,6 +1115,7 @@
static int gve_open(struct net_device *dev)
{
struct gve_priv *priv = netdev_priv(dev);
+ enum gve_reset_reason reset_reason;
int err;
err = gve_alloc_qpls(priv);
@@ -933,18 +1134,16 @@
goto free_rings;
err = gve_register_qpls(priv);
- if (err)
+ if (err) {
+ reset_reason = GVE_RESET_REASON_REGISTER_QPLS_FAILED;
goto reset;
-
- if (!gve_is_gqi(priv)) {
- /* Hard code this for now. This may be tuned in the future for
- * performance.
- */
- priv->data_buffer_size_dqo = GVE_RX_BUFFER_SIZE_DQO;
}
+
err = gve_create_rings(priv);
- if (err)
+ if (err) {
+ reset_reason = GVE_RESET_REASON_CREATE_RINGS_FAILED;
goto reset;
+ }
gve_set_device_rings_ok(priv);
@@ -953,6 +1152,10 @@
round_jiffies(jiffies +
msecs_to_jiffies(priv->stats_report_timer_period)));
+ if (!gve_is_gqi(priv))
+ mod_timer(&priv->tx_timeout_timer,
+ jiffies + priv->tx_timeout_period);
+
gve_turnup(priv);
queue_work(priv->gve_wq, &priv->service_task);
priv->interface_up_cnt++;
@@ -971,7 +1174,7 @@
if (gve_get_reset_in_progress(priv))
return err;
/* Otherwise reset before returning */
- gve_reset_and_teardown(priv, true);
+ gve_reset_and_teardown(priv, true, reset_reason);
/* if this fails there is nothing we can do so just ignore the return */
gve_reset_recovery(priv, false);
/* return the original error */
@@ -995,6 +1198,7 @@
gve_clear_device_rings_ok(priv);
}
del_timer_sync(&priv->stats_report_timer);
+ del_timer_sync(&priv->tx_timeout_timer);
gve_free_rings(priv);
gve_free_qpls(priv);
@@ -1008,40 +1212,75 @@
if (gve_get_reset_in_progress(priv))
return err;
/* Otherwise reset before returning */
- gve_reset_and_teardown(priv, true);
+ gve_reset_and_teardown(priv, true, GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED);
return gve_reset_recovery(priv, false);
}
+int gve_flow_rules_reset(struct gve_priv *priv)
+{
+ int err;
+
+ if (priv->flow_rules_cnt == 0)
+ return 0;
+
+ err = gve_adminq_reset_flow_rules(priv);
+ if (err)
+ return err;
+
+ gve_flow_rules_release(priv);
+ return 0;
+}
+
+static int gve_adjust_queue_count(struct gve_priv *priv,
+ struct gve_queue_config new_rx_config,
+ struct gve_queue_config new_tx_config)
+{
+ struct gve_queue_config old_rx_config = priv->rx_cfg;
+ int err = 0;
+
+ priv->rx_cfg = new_rx_config;
+ priv->tx_cfg = new_tx_config;
+
+ if (old_rx_config.num_queues != new_rx_config.num_queues) {
+ err = gve_flow_rules_reset(priv);
+ if (err)
+ return err;
+
+ if (priv->rss_config.alg != GVE_RSS_HASH_UNDEFINED)
+ err = gve_rss_config_init(priv);
+ }
+
+ return err;
+}
+
int gve_adjust_queues(struct gve_priv *priv,
struct gve_queue_config new_rx_config,
struct gve_queue_config new_tx_config)
{
int err;
-
- if (netif_carrier_ok(priv->dev)) {
+ if (netif_running(priv->dev)) {
/* To make this process as simple as possible we teardown the
* device, set the new configuration, and then bring the device
* up again.
*/
err = gve_close(priv->dev);
- /* we have already tried to reset in close,
- * just fail at this point
+ /* We have already tried to reset in close, just fail at this
+ * point.
*/
if (err)
return err;
- priv->tx_cfg = new_tx_config;
- priv->rx_cfg = new_rx_config;
-
+ err = gve_adjust_queue_count(priv, new_rx_config, new_tx_config);
+ if (err)
+ goto err;
err = gve_open(priv->dev);
if (err)
goto err;
-
return 0;
}
/* Set the config for the next up. */
- priv->tx_cfg = new_tx_config;
- priv->rx_cfg = new_rx_config;
-
+ err = gve_adjust_queue_count(priv, new_rx_config, new_tx_config);
+ if (err)
+ goto err;
return 0;
err:
netif_err(priv, drv, priv->dev,
@@ -1122,7 +1361,6 @@
struct gve_notify_block *block;
struct gve_tx_ring *tx = NULL;
struct gve_priv *priv;
- u32 last_nic_done;
u32 current_time;
u32 ntfy_idx;
@@ -1145,22 +1383,56 @@
/* Check to see if there are missed completions, which will allow us to
* kick the queue.
*/
- last_nic_done = gve_tx_load_event_counter(priv, tx);
- if (last_nic_done - tx->done) {
+ if (gve_is_gqi(priv)) {
+ u32 last_nic_done = gve_tx_load_event_counter(priv, tx);
+ bool has_work = (last_nic_done - tx->done) != 0;
+
+ if (!has_work) {
+ netdev_err(dev, "Tx queue %d stuck with no work. "
+ "Posted descriptors %d, Completed descriptors %d, "
+ "Driver stopped %d, Transmit stopped %d",
+ txqueue, tx->req,
+ tx->done,
+ netif_tx_queue_stopped(tx->netdev_txq),
+ netif_xmit_stopped(tx->netdev_txq));
+ goto reset;
+ }
+
netdev_info(dev, "Kicking queue %d", txqueue);
iowrite32be(GVE_IRQ_MASK, gve_irq_doorbell(priv, block));
- napi_schedule(&block->napi);
- tx->last_kick_msec = current_time;
- goto out;
- } // Else reset.
+ } else {
+ struct gve_tx_compl_desc *compl_desc =
+ &tx->dqo.compl_ring[tx->dqo_compl.head];
-reset:
- gve_schedule_reset(priv);
+ bool has_work =
+ compl_desc->generation != tx->dqo_compl.cur_gen_bit;
+
+ if (!has_work) {
+ netdev_err(dev, "Tx queue %d stuck with no work. "
+ "Driver stopped %d, Transmit stopped %d",
+ txqueue,
+ netif_tx_queue_stopped(tx->netdev_txq),
+ netif_xmit_stopped(tx->netdev_txq));
+ goto reset;
+ }
+
+ netdev_info(dev, "Kicking queue %d", txqueue);
+ gve_write_irq_doorbell_dqo(priv, block, GVE_ITR_NO_UPDATE_DQO);
+ }
+
+ napi_schedule(&block->napi);
+ tx->last_kick_msec = current_time;
out:
if (tx)
tx->queue_timeout++;
priv->tx_timeo_cnt++;
+ return;
+
+reset:
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_TX_TIMEOUT);
+ gve_schedule_reset(priv);
+ goto out;
}
static int gve_set_features(struct net_device *netdev,
@@ -1172,7 +1444,7 @@
if ((netdev->features & NETIF_F_LRO) != (features & NETIF_F_LRO)) {
netdev->features ^= NETIF_F_LRO;
- if (netif_carrier_ok(netdev)) {
+ if (netif_running(netdev)) {
/* To make this process as simple as possible we
* teardown the device, set the new configuration,
* and then bring the device up again.
@@ -1190,6 +1462,12 @@
}
}
+ if ((netdev->features & NETIF_F_NTUPLE) && !(features & NETIF_F_NTUPLE)) {
+ err = gve_flow_rules_reset(priv);
+ if (err)
+ goto err;
+ }
+
return 0;
err:
/* Reverts the change on error. */
@@ -1201,6 +1479,7 @@
static const struct net_device_ops gve_netdev_ops = {
.ndo_start_xmit = gve_start_xmit,
+ .ndo_features_check = gve_features_check,
.ndo_open = gve_open,
.ndo_stop = gve_close,
.ndo_get_stats64 = gve_get_stats,
@@ -1212,6 +1491,7 @@
{
if (GVE_DEVICE_STATUS_RESET_MASK & status) {
dev_info(&priv->pdev->dev, "Device requested reset.\n");
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_DEVICE_REQUESTED);
gve_set_do_reset(priv);
}
if (GVE_DEVICE_STATUS_REPORT_STATS_MASK & status) {
@@ -1231,7 +1511,7 @@
if (gve_get_do_reset(priv)) {
rtnl_lock();
- gve_reset(priv, false);
+ gve_reset(priv, false, READ_ONCE(priv->scheduled_reset_reason));
rtnl_unlock();
}
}
@@ -1260,9 +1540,9 @@
}
do {
- start = u64_stats_fetch_begin_irq(&priv->tx[idx].statss);
+ start = u64_stats_fetch_begin(&priv->tx[idx].statss);
tx_bytes = priv->tx[idx].bytes_done;
- } while (u64_stats_fetch_retry_irq(&priv->tx[idx].statss, start));
+ } while (u64_stats_fetch_retry(&priv->tx[idx].statss, start));
stats[stats_idx++] = (struct stats) {
.stat_name = cpu_to_be32(TX_WAKE_CNT),
.value = cpu_to_be64(priv->tx[idx].wake_queue),
@@ -1329,6 +1609,66 @@
}
}
+static void gve_turnup_and_check_status(struct gve_priv *priv)
+{
+ u32 status;
+
+ gve_turnup(priv);
+ status = ioread32be(&priv->reg_bar0->device_status);
+ gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
+}
+
+int gve_recreate_rx_rings(struct gve_priv *priv)
+{
+ int err;
+
+ /* Unregister queues with the device*/
+ err = gve_destroy_rx_rings(priv);
+ if (err)
+ return err;
+
+ /* Reset the RX state */
+ gve_rx_reset_rings_dqo(priv);
+
+ /* Register queues with the device */
+ return gve_create_rx_rings(priv);
+}
+
+int gve_reconfigure_rx_rings(struct gve_priv *priv,
+ bool enable_hdr_split,
+ int packet_buffer_size)
+{
+ int err = 0;
+
+ if (priv->queue_format != GVE_DQO_RDA_FORMAT)
+ return -EOPNOTSUPP;
+
+ gve_turndown(priv);
+
+ /* Allocate/free hdr resources */
+ if (enable_hdr_split != !!priv->header_buf_pool) {
+ err = gve_rx_handle_hdr_resources_dqo(priv, enable_hdr_split);
+ if (err)
+ goto out;
+ }
+
+ /* Apply new RX configuration changes */
+ priv->data_buffer_size_dqo = packet_buffer_size;
+
+ /* Reset RX state and re-register with the device */
+ err = gve_recreate_rx_rings(priv);
+ if (err)
+ goto reset;
+out:
+ gve_turnup_and_check_status(priv);
+ return err;
+
+reset:
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_DEVICE_FAILURE);
+ gve_schedule_reset(priv);
+ return err;
+}
+
/* Handle NIC status register changes, reset requests and report stats */
static void gve_service_task(struct work_struct *work)
{
@@ -1355,6 +1695,13 @@
return err;
}
+ err = gve_verify_driver_compatibility(priv);
+ if (err) {
+ dev_err(&priv->pdev->dev,
+ "Could not verify driver compatibility: err=%d\n", err);
+ goto err;
+ }
+
if (skip_describe_device)
goto setup_device;
@@ -1380,6 +1727,10 @@
goto err;
}
+ /* Big TCP is only supported on DQ*/
+ if (!gve_is_gqi(priv))
+ netif_set_tso_max_size(priv->dev, GVE_DQO_TX_MAX);
+
priv->num_registered_pages = 0;
priv->rx_copybreak = GVE_DEFAULT_RX_COPYBREAK;
/* gvnic has one Notification Block per MSI-x vector, except for the
@@ -1388,6 +1739,9 @@
priv->num_ntfy_blks = (num_ntfy - 1) & ~0x1;
priv->mgmt_msix_idx = priv->num_ntfy_blks;
+ mutex_init(&priv->flow_rules_lock);
+ INIT_LIST_HEAD(&priv->flow_rules);
+
priv->tx_cfg.max_queues =
min_t(int, priv->tx_cfg.max_queues, priv->num_ntfy_blks / 2);
priv->rx_cfg.max_queues =
@@ -1427,15 +1781,27 @@
gve_adminq_free(&priv->pdev->dev, priv);
}
-static void gve_trigger_reset(struct gve_priv *priv)
+static void gve_write_reset_reason(struct gve_priv *priv,
+ enum gve_reset_reason reason)
{
+ u32 driver_status = reason;
+ driver_status <<= 32 - GVE_RESET_REASON_SIZE;
+
+ iowrite32be(driver_status, &priv->reg_bar0->driver_status);
+}
+
+static void gve_trigger_reset(struct gve_priv *priv,
+ enum gve_reset_reason reason)
+{
+ gve_write_reset_reason(priv, reason);
/* Reset the device by releasing the AQ */
gve_adminq_release(priv);
}
-static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up)
+static void gve_reset_and_teardown(struct gve_priv *priv, bool was_up,
+ enum gve_reset_reason reason)
{
- gve_trigger_reset(priv);
+ gve_trigger_reset(priv, reason);
/* With the reset having already happened, close cannot fail */
if (was_up)
gve_close(priv->dev);
@@ -1461,12 +1827,13 @@
return err;
}
-int gve_reset(struct gve_priv *priv, bool attempt_teardown)
+int gve_reset(struct gve_priv *priv, bool attempt_teardown,
+ enum gve_reset_reason reason)
{
- bool was_up = netif_carrier_ok(priv->dev);
+ bool was_up = netif_running(priv->dev);
int err;
- dev_info(&priv->pdev->dev, "Performing reset\n");
+ dev_info(&priv->pdev->dev, "Performing reset with reason %d\n", reason);
gve_clear_do_reset(priv);
gve_set_reset_in_progress(priv);
/* If we aren't attempting to teardown normally, just go turndown and
@@ -1474,19 +1841,23 @@
*/
if (!attempt_teardown) {
gve_turndown(priv);
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, reason);
} else {
/* Otherwise attempt to close normally */
if (was_up) {
err = gve_close(priv->dev);
/* If that fails reset as we did above */
if (err)
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, reason);
}
+ gve_write_reset_reason(priv, reason);
/* Clean up any remaining resources */
gve_teardown_priv_resources(priv);
}
+ /* The reset reason preceding this reset should be cleared. */
+ WRITE_ONCE(priv->scheduled_reset_reason, 0);
+
/* Set it all back up */
err = gve_reset_recovery(priv, was_up);
gve_clear_reset_in_progress(priv);
@@ -1596,6 +1967,7 @@
priv->service_task_flags = 0x0;
priv->state_flags = 0x0;
priv->ethtool_flags = 0x0;
+ priv->ethtool_defaults = 0x0;
gve_set_probe_in_progress(priv);
priv->gve_wq = alloc_ordered_workqueue("gve", 0);
@@ -1654,6 +2026,7 @@
void __iomem *reg_bar = priv->reg_bar0;
unregister_netdev(netdev);
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_REMOVED);
gve_teardown_priv_resources(priv);
destroy_workqueue(priv->gve_wq);
free_netdev(netdev);
@@ -1667,13 +2040,14 @@
{
struct net_device *netdev = pci_get_drvdata(pdev);
struct gve_priv *priv = netdev_priv(netdev);
- bool was_up = netif_carrier_ok(priv->dev);
+ bool was_up = netif_running(priv->dev);
rtnl_lock();
if (was_up && gve_close(priv->dev)) {
/* If the dev was up, attempt to close, if close fails, reset */
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, GVE_RESET_REASON_DRIVER_SHUTDOWN);
} else {
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_SHUTDOWN);
/* If the dev wasn't up or close worked, finish tearing down */
gve_teardown_priv_resources(priv);
}
@@ -1685,14 +2059,15 @@
{
struct net_device *netdev = pci_get_drvdata(pdev);
struct gve_priv *priv = netdev_priv(netdev);
- bool was_up = netif_carrier_ok(priv->dev);
+ bool was_up = netif_running(priv->dev);
priv->suspend_cnt++;
rtnl_lock();
if (was_up && gve_close(priv->dev)) {
/* If the dev was up, attempt to close, if close fails, reset */
- gve_reset_and_teardown(priv, was_up);
+ gve_reset_and_teardown(priv, was_up, GVE_RESET_REASON_DRIVER_SUSPENDED);
} else {
+ gve_write_reset_reason(priv, GVE_RESET_REASON_DRIVER_SUSPENDED);
/* If the dev wasn't up or close worked, finish tearing down */
gve_teardown_priv_resources(priv);
}
@@ -1715,6 +2090,67 @@
}
#endif /* CONFIG_PM */
+void gve_rss_set_default_indir(struct gve_priv *priv)
+{
+ struct gve_rss_config *rss_config = &priv->rss_config;
+ int i;
+
+ for (i = 0; i < GVE_RSS_INDIR_SIZE; i++)
+ rss_config->indir[i] = i % priv->rx_cfg.num_queues;
+}
+
+void gve_flow_rules_release(struct gve_priv *priv)
+{
+ struct gve_flow_rule *cur, *next;
+
+ if (priv->flow_rules_cnt == 0)
+ return;
+
+ list_for_each_entry_safe(cur, next, &priv->flow_rules, list) {
+ list_del(&cur->list);
+ kvfree(cur);
+ priv->flow_rules_cnt--;
+ }
+}
+
+void gve_rss_config_release(struct gve_rss_config *rss_config)
+{
+ kvfree(rss_config->key);
+ kvfree(rss_config->indir);
+ memset(rss_config, 0, sizeof(*rss_config));
+}
+
+int gve_rss_config_init(struct gve_priv *priv)
+{
+ struct gve_rss_config *rss_config = &priv->rss_config;
+
+ gve_rss_config_release(rss_config);
+
+ rss_config->key = kvzalloc(GVE_RSS_KEY_SIZE, GFP_KERNEL);
+ if (!rss_config->key)
+ goto err;
+
+ netdev_rss_key_fill(rss_config->key, GVE_RSS_KEY_SIZE);
+
+ rss_config->indir = kvcalloc(GVE_RSS_INDIR_SIZE,
+ sizeof(*rss_config->indir),
+ GFP_KERNEL);
+ if (!rss_config->indir)
+ goto err;
+
+ rss_config->alg = GVE_RSS_HASH_TOEPLITZ;
+ rss_config->key_size = GVE_RSS_KEY_SIZE;
+ rss_config->indir_size = GVE_RSS_INDIR_SIZE;
+ gve_rss_set_default_indir(priv);
+
+ return gve_adminq_configure_rss(priv, rss_config);
+
+err:
+ kvfree(rss_config->key);
+ rss_config->key = NULL;
+ return -ENOMEM;
+}
+
static const struct pci_device_id gve_id_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_GOOGLE, PCI_DEV_ID_GVNIC) },
{ }
diff --git a/drivers/net/ethernet/google/gve/gve_register.h b/drivers/net/ethernet/google/gve/gve_register.h
index fb65546..2511352 100644
--- a/drivers/net/ethernet/google/gve/gve_register.h
+++ b/drivers/net/ethernet/google/gve/gve_register.h
@@ -25,4 +25,25 @@
GVE_DEVICE_STATUS_LINK_STATUS_MASK = BIT(2),
GVE_DEVICE_STATUS_REPORT_STATS_MASK = BIT(3),
};
+
+#define GVE_RESET_REASON_SIZE 5
+
+enum gve_reset_reason {
+ GVE_RESET_REASON_UNKNOWN,
+ GVE_RESET_REASON_RESET_BY_USER,
+ GVE_RESET_REASON_DRIVER_SHUTDOWN,
+ GVE_RESET_REASON_DRIVER_SUSPENDED,
+ GVE_RESET_REASON_CREATE_RINGS_FAILED,
+ GVE_RESET_REASON_REGISTER_QPLS_FAILED,
+ GVE_RESET_REASON_DRIVER_TEARDOWN_FAILED,
+ GVE_RESET_REASON_TX_TIMEOUT,
+ GVE_RESET_REASON_TX_MISS_PATH_TIMEOUT,
+ GVE_RESET_REASON_RX_ERROR,
+ GVE_RESET_REASON_DEVICE_REQUESTED,
+ GVE_RESET_REASON_DRIVER_REMOVED,
+ GVE_RESET_REASON_DEVICE_FAILURE,
+ GVE_NUM_RESET_REASONS, /* Not a reset reason */
+};
+
+static_assert(GVE_NUM_RESET_REASONS <= (1 << GVE_RESET_REASON_SIZE));
#endif /* _GVE_REGISTER_H_ */
diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c
index 021bbf3..9f60bb5 100644
--- a/drivers/net/ethernet/google/gve/gve_rx.c
+++ b/drivers/net/ethernet/google/gve/gve_rx.c
@@ -6,6 +6,7 @@
#include "gve.h"
#include "gve_adminq.h"
+#include "gve_register.h"
#include "gve_utils.h"
#include <linux/etherdevice.h>
@@ -35,6 +36,12 @@
rx->data.page_info[i].pagecnt_bias - 1);
gve_unassign_qpl(priv, rx->data.qpl->id);
rx->data.qpl = NULL;
+
+ for (i = 0; i < rx->qpl_copy_pool_mask + 1; i++) {
+ page_ref_sub(rx->qpl_copy_pool[i].page,
+ rx->qpl_copy_pool[i].pagecnt_bias - 1);
+ put_page(rx->qpl_copy_pool[i].page);
+ }
}
kvfree(rx->data.page_info);
rx->data.page_info = NULL;
@@ -63,6 +70,10 @@
dma_free_coherent(dev, bytes, rx->data.data_ring,
rx->data.data_bus);
rx->data.data_ring = NULL;
+
+ kvfree(rx->qpl_copy_pool);
+ rx->qpl_copy_pool = NULL;
+
netif_dbg(priv, drv, priv->dev, "freed rx ring %d\n", idx);
}
@@ -101,6 +112,7 @@
u32 slots;
int err;
int i;
+ int j;
/* Allocate one page per Rx queue slot. Each page is split into two
* packet buffers, when possible we "page flip" between the two.
@@ -135,7 +147,33 @@
goto alloc_err;
}
+ if (!rx->data.raw_addressing) {
+ for (j = 0; j < rx->qpl_copy_pool_mask + 1; j++) {
+ struct page *page = alloc_page(GFP_KERNEL);
+
+ if (!page) {
+ err = -ENOMEM;
+ goto alloc_err_qpl;
+ }
+
+ rx->qpl_copy_pool[j].page = page;
+ rx->qpl_copy_pool[j].page_offset = 0;
+ rx->qpl_copy_pool[j].page_address = page_address(page);
+
+ /* The page already has 1 ref. */
+ page_ref_add(page, INT_MAX - 1);
+ rx->qpl_copy_pool[j].pagecnt_bias = INT_MAX;
+ }
+ }
+
return slots;
+
+alloc_err_qpl:
+ while (j--) {
+ page_ref_sub(rx->qpl_copy_pool[j].page,
+ rx->qpl_copy_pool[j].pagecnt_bias - 1);
+ put_page(rx->qpl_copy_pool[j].page);
+ }
alloc_err:
while (i--)
gve_rx_free_buffer(&priv->pdev->dev,
@@ -146,12 +184,11 @@
static void gve_rx_ctx_clear(struct gve_rx_ctx *ctx)
{
- ctx->curr_frag_cnt = 0;
- ctx->total_expected_size = 0;
- ctx->expected_frag_cnt = 0;
ctx->skb_head = NULL;
ctx->skb_tail = NULL;
- ctx->reuse_frags = false;
+ ctx->total_size = 0;
+ ctx->frag_cnt = 0;
+ ctx->drop_pkt = false;
}
static int gve_rx_alloc_ring(struct gve_priv *priv, int idx)
@@ -181,10 +218,22 @@
GFP_KERNEL);
if (!rx->data.data_ring)
return -ENOMEM;
+
+ rx->qpl_copy_pool_mask = min_t(u32, U32_MAX, slots * 2) - 1;
+ rx->qpl_copy_pool_head = 0;
+ rx->qpl_copy_pool = kvcalloc(rx->qpl_copy_pool_mask + 1,
+ sizeof(rx->qpl_copy_pool[0]),
+ GFP_KERNEL);
+
+ if (!rx->qpl_copy_pool) {
+ err = -ENOMEM;
+ goto abort_with_slots;
+ }
+
filled_pages = gve_prefill_rx_pages(rx);
if (filled_pages < 0) {
err = -ENOMEM;
- goto abort_with_slots;
+ goto abort_with_copy_pool;
}
rx->fill_cnt = filled_pages;
/* Ensure data ring slots (packet buffers) are visible. */
@@ -236,6 +285,9 @@
rx->q_resources = NULL;
abort_filled:
gve_rx_unfill_pages(priv, rx);
+abort_with_copy_pool:
+ kvfree(rx->qpl_copy_pool);
+ rx->qpl_copy_pool = NULL;
abort_with_slots:
bytes = sizeof(*rx->data.data_ring) * slots;
dma_free_coherent(hdev, bytes, rx->data.data_ring, rx->data.data_bus);
@@ -292,30 +344,47 @@
return PKT_HASH_TYPE_L2;
}
-static u16 gve_rx_ctx_padding(struct gve_rx_ctx *ctx)
-{
- return (ctx->curr_frag_cnt == 0) ? GVE_RX_PAD : 0;
-}
-
static struct sk_buff *gve_rx_add_frags(struct napi_struct *napi,
struct gve_rx_slot_page_info *page_info,
u16 packet_buffer_size, u16 len,
struct gve_rx_ctx *ctx)
{
- u32 offset = page_info->page_offset + gve_rx_ctx_padding(ctx);
- struct sk_buff *skb;
+ u32 offset = page_info->page_offset + page_info->pad;
+ struct sk_buff *skb = ctx->skb_tail;
+ int num_frags = 0;
- if (!ctx->skb_head)
- ctx->skb_head = napi_get_frags(napi);
+ if (!skb) {
+ skb = napi_get_frags(napi);
+ if (unlikely(!skb))
+ return NULL;
- if (unlikely(!ctx->skb_head))
- return NULL;
+ ctx->skb_head = skb;
+ ctx->skb_tail = skb;
+ } else {
+ num_frags = skb_shinfo(ctx->skb_tail)->nr_frags;
+ if (num_frags == MAX_SKB_FRAGS) {
+ skb = napi_alloc_skb(napi, 0);
+ if (!skb)
+ return NULL;
- skb = ctx->skb_head;
- skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page_info->page,
+ // We will never chain more than two SKBs: 2 * 16 * 2k > 64k
+ // which is why we do not need to chain by using skb->next
+ skb_shinfo(ctx->skb_tail)->frag_list = skb;
+
+ ctx->skb_tail = skb;
+ num_frags = 0;
+ }
+ }
+
+ if (skb != ctx->skb_head) {
+ ctx->skb_head->len += len;
+ ctx->skb_head->data_len += len;
+ ctx->skb_head->truesize += packet_buffer_size;
+ }
+ skb_add_rx_frag(skb, num_frags, page_info->page,
offset, len, packet_buffer_size);
- return skb;
+ return ctx->skb_head;
}
static void gve_rx_flip_buff(struct gve_rx_slot_page_info *page_info, __be64 *slot_addr)
@@ -363,6 +432,93 @@
return skb;
}
+static struct sk_buff *gve_rx_copy_to_pool(struct gve_rx_ring *rx,
+ struct gve_rx_slot_page_info *page_info,
+ u16 len, struct napi_struct *napi)
+{
+ u32 pool_idx = rx->qpl_copy_pool_head & rx->qpl_copy_pool_mask;
+ void *src = page_info->page_address + page_info->page_offset;
+ struct gve_rx_slot_page_info *copy_page_info;
+ struct gve_rx_ctx *ctx = &rx->ctx;
+ bool alloc_page = false;
+ struct sk_buff *skb;
+ void *dst;
+
+ copy_page_info = &rx->qpl_copy_pool[pool_idx];
+ if (!copy_page_info->can_flip) {
+ int recycle = gve_rx_can_recycle_buffer(copy_page_info);
+
+ if (unlikely(recycle < 0)) {
+ WRITE_ONCE(rx->gve->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
+ return NULL;
+ }
+ alloc_page = !recycle;
+ }
+
+ if (alloc_page) {
+ struct gve_rx_slot_page_info alloc_page_info;
+ struct page *page;
+
+ /* The least recently used page turned out to be
+ * still in use by the kernel. Ignoring it and moving
+ * on alleviates head-of-line blocking.
+ */
+ rx->qpl_copy_pool_head++;
+
+ page = alloc_page(GFP_ATOMIC);
+ if (!page)
+ return NULL;
+
+ alloc_page_info.page = page;
+ alloc_page_info.page_offset = 0;
+ alloc_page_info.page_address = page_address(page);
+ alloc_page_info.pad = page_info->pad;
+
+ memcpy(alloc_page_info.page_address, src, page_info->pad + len);
+ skb = gve_rx_add_frags(napi, &alloc_page_info,
+ rx->packet_buffer_size,
+ len, ctx);
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_copy_cnt++;
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+
+ return skb;
+ }
+
+ dst = copy_page_info->page_address + copy_page_info->page_offset;
+ memcpy(dst, src, page_info->pad + len);
+ copy_page_info->pad = page_info->pad;
+
+ skb = gve_rx_add_frags(napi, copy_page_info,
+ rx->packet_buffer_size, len, ctx);
+ if (unlikely(!skb))
+ return NULL;
+
+ gve_dec_pagecnt_bias(copy_page_info);
+ copy_page_info->page_offset += rx->packet_buffer_size;
+ copy_page_info->page_offset &= (PAGE_SIZE - 1);
+
+ if (copy_page_info->can_flip) {
+ /* We have used both halves of this copy page, it
+ * is time for it to go to the back of the queue.
+ */
+ copy_page_info->can_flip = false;
+ rx->qpl_copy_pool_head++;
+ prefetch(rx->qpl_copy_pool[rx->qpl_copy_pool_head & rx->qpl_copy_pool_mask].page);
+ } else {
+ copy_page_info->can_flip = true;
+ }
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_copy_cnt++;
+ u64_stats_update_end(&rx->statss);
+
+ return skb;
+}
+
static struct sk_buff *
gve_rx_qpl(struct device *dev, struct net_device *netdev,
struct gve_rx_ring *rx, struct gve_rx_slot_page_info *page_info,
@@ -377,7 +533,7 @@
* choice is to copy the data out of it so that we can return it to the
* device.
*/
- if (ctx->reuse_frags) {
+ if (page_info->can_flip) {
skb = gve_rx_add_frags(napi, page_info, rx->packet_buffer_size, len, ctx);
/* No point in recycling if we didn't get the skb */
if (skb) {
@@ -386,116 +542,23 @@
gve_rx_flip_buff(page_info, &data_slot->qpl_offset);
}
} else {
- const u16 padding = gve_rx_ctx_padding(ctx);
-
- skb = gve_rx_copy(netdev, napi, page_info, len, padding, ctx);
- if (skb) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_frag_copy_cnt++;
- u64_stats_update_end(&rx->statss);
- }
+ skb = gve_rx_copy_to_pool(rx, page_info, len, napi);
}
return skb;
}
-#define GVE_PKTCONT_BIT_IS_SET(x) (GVE_RXF_PKT_CONT & (x))
-static u16 gve_rx_get_fragment_size(struct gve_rx_ctx *ctx, struct gve_rx_desc *desc)
-{
- return be16_to_cpu(desc->len) - gve_rx_ctx_padding(ctx);
-}
-
-static bool gve_rx_ctx_init(struct gve_rx_ctx *ctx, struct gve_rx_ring *rx)
-{
- bool qpl_mode = !rx->data.raw_addressing, packet_size_error = false;
- bool buffer_error = false, desc_error = false, seqno_error = false;
- struct gve_rx_slot_page_info *page_info;
- struct gve_priv *priv = rx->gve;
- u32 idx = rx->cnt & rx->mask;
- bool reuse_frags, can_flip;
- struct gve_rx_desc *desc;
- u16 packet_size = 0;
- u16 n_frags = 0;
- int recycle;
-
- /** In QPL mode, we only flip buffers when all buffers containing the packet
- * can be flipped. RDA can_flip decisions will be made later, per frag.
- */
- can_flip = qpl_mode;
- reuse_frags = can_flip;
- do {
- u16 frag_size;
-
- n_frags++;
- desc = &rx->desc.desc_ring[idx];
- desc_error = unlikely(desc->flags_seq & GVE_RXF_ERR) || desc_error;
- if (GVE_SEQNO(desc->flags_seq) != rx->desc.seqno) {
- seqno_error = true;
- netdev_warn(priv->dev,
- "RX seqno error: want=%d, got=%d, dropping packet and scheduling reset.",
- rx->desc.seqno, GVE_SEQNO(desc->flags_seq));
- }
- frag_size = be16_to_cpu(desc->len);
- packet_size += frag_size;
- if (frag_size > rx->packet_buffer_size) {
- packet_size_error = true;
- netdev_warn(priv->dev,
- "RX fragment error: packet_buffer_size=%d, frag_size=%d, dropping packet.",
- rx->packet_buffer_size, be16_to_cpu(desc->len));
- }
- page_info = &rx->data.page_info[idx];
- if (can_flip) {
- recycle = gve_rx_can_recycle_buffer(page_info);
- reuse_frags = reuse_frags && recycle > 0;
- buffer_error = buffer_error || unlikely(recycle < 0);
- }
- idx = (idx + 1) & rx->mask;
- rx->desc.seqno = gve_next_seqno(rx->desc.seqno);
- } while (GVE_PKTCONT_BIT_IS_SET(desc->flags_seq));
-
- prefetch(rx->desc.desc_ring + idx);
-
- ctx->curr_frag_cnt = 0;
- ctx->total_expected_size = packet_size - GVE_RX_PAD;
- ctx->expected_frag_cnt = n_frags;
- ctx->skb_head = NULL;
- ctx->reuse_frags = reuse_frags;
-
- if (ctx->expected_frag_cnt > 1) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_cont_packet_cnt++;
- u64_stats_update_end(&rx->statss);
- }
- if (ctx->total_expected_size > priv->rx_copybreak && !ctx->reuse_frags && qpl_mode) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_copied_pkt++;
- u64_stats_update_end(&rx->statss);
- }
-
- if (unlikely(buffer_error || seqno_error || packet_size_error)) {
- gve_schedule_reset(priv);
- return false;
- }
-
- if (unlikely(desc_error)) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_desc_err_dropped_pkt++;
- u64_stats_update_end(&rx->statss);
- return false;
- }
- return true;
-}
-
static struct sk_buff *gve_rx_skb(struct gve_priv *priv, struct gve_rx_ring *rx,
struct gve_rx_slot_page_info *page_info, struct napi_struct *napi,
- u16 len, union gve_rx_data_slot *data_slot)
+ u16 len, union gve_rx_data_slot *data_slot,
+ bool is_only_frag)
{
struct net_device *netdev = priv->dev;
struct gve_rx_ctx *ctx = &rx->ctx;
struct sk_buff *skb = NULL;
- if (len <= priv->rx_copybreak && ctx->expected_frag_cnt == 1) {
+ if (len <= priv->rx_copybreak && is_only_frag) {
/* Just copy small packets */
- skb = gve_rx_copy(netdev, napi, page_info, len, GVE_RX_PAD, ctx);
+ skb = gve_rx_copy(netdev, napi, page_info, len, GVE_RX_PAD);
if (skb) {
u64_stats_update_begin(&rx->statss);
rx->rx_copied_pkt++;
@@ -504,29 +567,26 @@
u64_stats_update_end(&rx->statss);
}
} else {
- if (rx->data.raw_addressing) {
- int recycle = gve_rx_can_recycle_buffer(page_info);
+ int recycle = gve_rx_can_recycle_buffer(page_info);
- if (unlikely(recycle < 0)) {
- gve_schedule_reset(priv);
- return NULL;
- }
- page_info->can_flip = recycle;
- if (page_info->can_flip) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_frag_flip_cnt++;
- u64_stats_update_end(&rx->statss);
- }
+ if (unlikely(recycle < 0)) {
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(priv);
+ return NULL;
+ }
+ page_info->can_flip = recycle;
+ if (page_info->can_flip) {
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_flip_cnt++;
+ u64_stats_update_end(&rx->statss);
+ }
+
+ if (rx->data.raw_addressing) {
skb = gve_rx_raw_addressing(&priv->pdev->dev, netdev,
page_info, len, napi,
data_slot,
rx->packet_buffer_size, ctx);
} else {
- if (ctx->reuse_frags) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_frag_flip_cnt++;
- u64_stats_update_end(&rx->statss);
- }
skb = gve_rx_qpl(&priv->pdev->dev, netdev, rx,
page_info, len, napi, data_slot);
}
@@ -534,101 +594,114 @@
return skb;
}
-static bool gve_rx(struct gve_rx_ring *rx, netdev_features_t feat,
- u64 *packet_size_bytes, u32 *work_done)
+#define GVE_PKTCONT_BIT_IS_SET(x) (GVE_RXF_PKT_CONT & (x))
+static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat,
+ struct gve_rx_desc *desc, u32 idx,
+ struct gve_rx_cnts *cnts)
{
+ bool is_last_frag = !GVE_PKTCONT_BIT_IS_SET(desc->flags_seq);
struct gve_rx_slot_page_info *page_info;
+ u16 frag_size = be16_to_cpu(desc->len);
struct gve_rx_ctx *ctx = &rx->ctx;
union gve_rx_data_slot *data_slot;
struct gve_priv *priv = rx->gve;
- struct gve_rx_desc *first_desc;
struct sk_buff *skb = NULL;
- struct gve_rx_desc *desc;
- struct napi_struct *napi;
dma_addr_t page_bus;
- u32 work_cnt = 0;
void *va;
- u32 idx;
- u16 len;
- idx = rx->cnt & rx->mask;
- first_desc = &rx->desc.desc_ring[idx];
- desc = first_desc;
- napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
+ struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
+ bool is_first_frag = ctx->frag_cnt == 0;
- if (unlikely(!gve_rx_ctx_init(ctx, rx)))
- goto skb_alloc_fail;
+ bool is_only_frag = is_first_frag && is_last_frag;
- while (ctx->curr_frag_cnt < ctx->expected_frag_cnt) {
- /* Prefetch two packet buffers ahead, we will need it soon. */
- page_info = &rx->data.page_info[(idx + 2) & rx->mask];
- va = page_info->page_address + page_info->page_offset;
+ if (unlikely(ctx->drop_pkt))
+ goto finish_frag;
- prefetch(page_info->page); /* Kernel page struct. */
- prefetch(va); /* Packet header. */
- prefetch(va + 64); /* Next cacheline too. */
+ if (desc->flags_seq & GVE_RXF_ERR) {
+ ctx->drop_pkt = true;
+ cnts->desc_err_pkt_cnt++;
+ napi_free_frags(napi);
+ goto finish_frag;
+ }
- len = gve_rx_get_fragment_size(ctx, desc);
+ if (unlikely(frag_size > rx->packet_buffer_size)) {
+ netdev_warn(priv->dev, "Unexpected frag size %d, can't exceed %d, scheduling reset",
+ frag_size, rx->packet_buffer_size);
+ ctx->drop_pkt = true;
+ napi_free_frags(napi);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
+ goto finish_frag;
+ }
- page_info = &rx->data.page_info[idx];
- data_slot = &rx->data.data_ring[idx];
- page_bus = rx->data.raw_addressing ?
- be64_to_cpu(data_slot->addr) - page_info->page_offset :
- rx->data.qpl->page_buses[idx];
- dma_sync_single_for_cpu(&priv->pdev->dev, page_bus, PAGE_SIZE, DMA_FROM_DEVICE);
+ /* Prefetch two packet buffers ahead, we will need it soon. */
+ page_info = &rx->data.page_info[(idx + 2) & rx->mask];
+ va = page_info->page_address + page_info->page_offset;
+ prefetch(page_info->page); /* Kernel page struct. */
+ prefetch(va); /* Packet header. */
+ prefetch(va + 64); /* Next cacheline too. */
- skb = gve_rx_skb(priv, rx, page_info, napi, len, data_slot);
- if (!skb) {
- u64_stats_update_begin(&rx->statss);
- rx->rx_skb_alloc_fail++;
- u64_stats_update_end(&rx->statss);
- goto skb_alloc_fail;
+ page_info = &rx->data.page_info[idx];
+ data_slot = &rx->data.data_ring[idx];
+ page_bus = (rx->data.raw_addressing) ?
+ be64_to_cpu(data_slot->addr) - page_info->page_offset :
+ rx->data.qpl->page_buses[idx];
+ dma_sync_single_for_cpu(&priv->pdev->dev, page_bus,
+ PAGE_SIZE, DMA_FROM_DEVICE);
+ page_info->pad = is_first_frag ? GVE_RX_PAD : 0;
+ frag_size -= page_info->pad;
+
+ skb = gve_rx_skb(priv, rx, page_info, napi, frag_size,
+ data_slot, is_only_frag);
+ if (!skb) {
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_skb_alloc_fail++;
+ u64_stats_update_end(&rx->statss);
+
+ napi_free_frags(napi);
+ ctx->drop_pkt = true;
+ goto finish_frag;
+ }
+ ctx->total_size += frag_size;
+
+ if (is_first_frag) {
+ if (likely(feat & NETIF_F_RXCSUM)) {
+ /* NIC passes up the partial sum */
+ if (desc->csum)
+ skb->ip_summed = CHECKSUM_COMPLETE;
+ else
+ skb->ip_summed = CHECKSUM_NONE;
+ skb->csum = csum_unfold(desc->csum);
}
- ctx->curr_frag_cnt++;
- rx->cnt++;
- idx = rx->cnt & rx->mask;
- work_cnt++;
- desc = &rx->desc.desc_ring[idx];
+ /* parse flags & pass relevant info up */
+ if (likely(feat & NETIF_F_RXHASH) &&
+ gve_needs_rss(desc->flags_seq))
+ skb_set_hash(skb, be32_to_cpu(desc->rss_hash),
+ gve_rss_type(desc->flags_seq));
}
- if (likely(feat & NETIF_F_RXCSUM)) {
- /* NIC passes up the partial sum */
- if (first_desc->csum)
- skb->ip_summed = CHECKSUM_COMPLETE;
+ if (is_last_frag) {
+ skb_record_rx_queue(skb, rx->q_num);
+ if (skb_is_nonlinear(skb))
+ napi_gro_frags(napi);
else
- skb->ip_summed = CHECKSUM_NONE;
- skb->csum = csum_unfold(first_desc->csum);
+ napi_gro_receive(napi, skb);
+ goto finish_ok_pkt;
}
- /* parse flags & pass relevant info up */
- if (likely(feat & NETIF_F_RXHASH) &&
- gve_needs_rss(first_desc->flags_seq))
- skb_set_hash(skb, be32_to_cpu(first_desc->rss_hash),
- gve_rss_type(first_desc->flags_seq));
+ goto finish_frag;
- *packet_size_bytes = skb->len + (skb->protocol ? ETH_HLEN : 0);
- *work_done = work_cnt;
- skb_record_rx_queue(skb, rx->q_num);
- if (skb_is_nonlinear(skb))
- napi_gro_frags(napi);
- else
- napi_gro_receive(napi, skb);
-
- gve_rx_ctx_clear(ctx);
- return true;
-
-skb_alloc_fail:
- if (napi->skb)
- napi_free_frags(napi);
- *packet_size_bytes = 0;
- *work_done = ctx->expected_frag_cnt;
- while (ctx->curr_frag_cnt < ctx->expected_frag_cnt) {
- rx->cnt++;
- ctx->curr_frag_cnt++;
+finish_ok_pkt:
+ cnts->ok_pkt_bytes += ctx->total_size;
+ cnts->ok_pkt_cnt++;
+finish_frag:
+ ctx->frag_cnt++;
+ if (is_last_frag) {
+ cnts->total_pkt_cnt++;
+ cnts->cont_pkt_cnt += (ctx->frag_cnt > 1);
+ gve_rx_ctx_clear(ctx);
}
- gve_rx_ctx_clear(ctx);
- return false;
}
bool gve_rx_work_pending(struct gve_rx_ring *rx)
@@ -675,8 +748,10 @@
int recycle = gve_rx_can_recycle_buffer(page_info);
if (recycle < 0) {
- if (!rx->data.raw_addressing)
+ if (!rx->data.raw_addressing) {
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
gve_schedule_reset(priv);
+ }
return false;
}
if (!recycle) {
@@ -704,36 +779,40 @@
static int gve_clean_rx_done(struct gve_rx_ring *rx, int budget,
netdev_features_t feat)
{
- u32 work_done = 0, total_packet_cnt = 0, ok_packet_cnt = 0;
+ struct gve_rx_ctx *ctx = &rx->ctx;
struct gve_priv *priv = rx->gve;
+ struct gve_rx_cnts cnts = {0};
+ struct gve_rx_desc *next_desc;
u32 idx = rx->cnt & rx->mask;
- struct gve_rx_desc *desc;
- u64 bytes = 0;
+ u32 work_done = 0;
- desc = &rx->desc.desc_ring[idx];
+ struct gve_rx_desc *desc = &rx->desc.desc_ring[idx];
+
+ // Exceed budget only if (and till) the inflight packet is consumed.
while ((GVE_SEQNO(desc->flags_seq) == rx->desc.seqno) &&
- work_done < budget) {
- u64 packet_size_bytes = 0;
- u32 work_cnt = 0;
- bool dropped;
+ (work_done < budget || ctx->frag_cnt)) {
+ next_desc = &rx->desc.desc_ring[(idx + 1) & rx->mask];
+ prefetch(next_desc);
- netif_info(priv, rx_status, priv->dev,
- "[%d] idx=%d desc=%p desc->flags_seq=0x%x\n",
- rx->q_num, idx, desc, desc->flags_seq);
- netif_info(priv, rx_status, priv->dev,
- "[%d] seqno=%d rx->desc.seqno=%d\n",
- rx->q_num, GVE_SEQNO(desc->flags_seq),
- rx->desc.seqno);
+ gve_rx(rx, feat, desc, idx, &cnts);
- dropped = !gve_rx(rx, feat, &packet_size_bytes, &work_cnt);
- if (!dropped) {
- bytes += packet_size_bytes;
- ok_packet_cnt++;
- }
- total_packet_cnt++;
+ rx->cnt++;
idx = rx->cnt & rx->mask;
desc = &rx->desc.desc_ring[idx];
- work_done += work_cnt;
+ rx->desc.seqno = gve_next_seqno(rx->desc.seqno);
+ work_done++;
+ }
+
+ // The device will only send whole packets.
+ if (unlikely(ctx->frag_cnt)) {
+ struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi;
+
+ napi_free_frags(napi);
+ gve_rx_ctx_clear(&rx->ctx);
+ netdev_warn(priv->dev, "Unexpected seq number %d with incomplete packet, expected %d, scheduling reset",
+ GVE_SEQNO(desc->flags_seq), rx->desc.seqno);
+ WRITE_ONCE(priv->scheduled_reset_reason, GVE_RESET_REASON_RX_ERROR);
+ gve_schedule_reset(rx->gve);
}
if (!work_done && rx->fill_cnt - rx->cnt > rx->db_threshold)
@@ -741,8 +820,10 @@
if (work_done) {
u64_stats_update_begin(&rx->statss);
- rx->rpackets += ok_packet_cnt;
- rx->rbytes += bytes;
+ rx->rpackets += cnts.ok_pkt_cnt;
+ rx->rbytes += cnts.ok_pkt_bytes;
+ rx->rx_cont_packet_cnt += cnts.cont_pkt_cnt;
+ rx->rx_desc_err_dropped_pkt += cnts.desc_err_pkt_cnt;
u64_stats_update_end(&rx->statss);
}
@@ -767,7 +848,7 @@
}
gve_rx_write_doorbell(priv, rx);
- return total_packet_cnt;
+ return cnts.total_pkt_cnt;
}
int gve_rx_poll(struct gve_notify_block *block, int budget)
diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
index a9409e3..1736d5b 100644
--- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c
@@ -12,6 +12,7 @@
#include <linux/ipv6.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
+#include <linux/dma-buf.h>
#include <net/ip6_checksum.h>
#include <net/ipv6.h>
#include <net/tcp.h>
@@ -22,11 +23,13 @@
}
static void gve_free_page_dqo(struct gve_priv *priv,
- struct gve_rx_buf_state_dqo *bs)
+ struct gve_rx_buf_state_dqo *bs,
+ bool free_page)
{
page_ref_sub(bs->page_info.page, bs->page_info.pagecnt_bias - 1);
- gve_free_page(&priv->pdev->dev, bs->page_info.page, bs->addr,
- DMA_FROM_DEVICE);
+ if (free_page)
+ gve_free_page(&priv->pdev->dev, bs->page_info.page, bs->addr,
+ DMA_FROM_DEVICE);
bs->page_info.page = NULL;
}
@@ -109,6 +112,13 @@
}
}
+static void gve_recycle_buf(struct gve_rx_ring *rx,
+ struct gve_rx_buf_state_dqo *buf_state)
+{
+ buf_state->hdr_buf = NULL;
+ gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+}
+
static struct gve_rx_buf_state_dqo *
gve_get_recycled_buf_state(struct gve_rx_ring *rx)
{
@@ -130,12 +140,20 @@
*/
for (i = 0; i < 5; i++) {
buf_state = gve_dequeue_buf_state(rx, &rx->dqo.used_buf_states);
- if (gve_buf_ref_cnt(buf_state) == 0)
+ if (gve_buf_ref_cnt(buf_state) == 0) {
+ rx->dqo.used_buf_states_cnt--;
return buf_state;
+ }
gve_enqueue_buf_state(rx, &rx->dqo.used_buf_states, buf_state);
}
+ /* For QPL, we cannot allocate any new buffers and must
+ * wait for the existing ones to be available.
+ */
+ if (rx->dqo.qpl)
+ return NULL;
+
/* If there are no free buf states discard an entry from
* `used_buf_states` so it can be used.
*/
@@ -144,23 +162,74 @@
if (gve_buf_ref_cnt(buf_state) == 0)
return buf_state;
- gve_free_page_dqo(rx->gve, buf_state);
+ gve_free_page_dqo(rx->gve, buf_state, true);
gve_free_buf_state(rx, buf_state);
}
return NULL;
}
-static int gve_alloc_page_dqo(struct gve_priv *priv,
+static int gve_alloc_page_dqo(struct gve_rx_ring *rx,
struct gve_rx_buf_state_dqo *buf_state)
{
+ struct netdev_rx_queue *rxq = NULL;
+ struct gve_priv *priv = rx->gve;
+ struct scatterlist sgl;
+ int num_pages_mapped;
+ u32 idx;
int err;
- err = gve_alloc_page(priv, &priv->pdev->dev, &buf_state->page_info.page,
- &buf_state->addr, DMA_FROM_DEVICE, GFP_ATOMIC);
- if (err)
- return err;
+ if (rx)
+ rxq = __netif_get_rx_queue(priv->dev, rx->q_num);
+ if (!rx->dqo.qpl) {
+ if (rxq && unlikely(rcu_access_pointer(rxq->dmabuf_pages))) {
+ buf_state->page_info.page =
+ netdev_rxq_alloc_dma_buf_page(rxq, 0);
+
+ if (!buf_state->page_info.page) {
+ priv->page_alloc_fail++;
+ return -ENOMEM;
+ }
+
+ BUG_ON(!is_dma_buf_page(buf_state->page_info.page));
+
+ sgl.offset = 0;
+ sgl.length = PAGE_SIZE;
+ sgl.page_link = (unsigned long)buf_state->page_info.page;
+ num_pages_mapped = dma_buf_map_sg(&priv->pdev->dev, &sgl, 1,
+ DMA_FROM_DEVICE);
+ if (!num_pages_mapped) {
+ net_err_ratelimited(
+ "dma_buf_map_sg failed (num_mapped (%d) <= 0)\n",
+ num_pages_mapped);
+ netdev_rxq_free_page(buf_state->page_info.page);
+ return -ENOMEM;
+ }
+ buf_state->addr = sgl.dma_address;
+ } else {
+ err = gve_alloc_page(priv, &priv->pdev->dev,
+ &buf_state->page_info.page,
+ &buf_state->addr,
+ DMA_FROM_DEVICE, GFP_ATOMIC);
+ if (err)
+ return err;
+ }
+ /* Update stats */
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+ } else {
+ idx = rx->dqo.next_qpl_page_idx;
+ if (idx >= priv->rx_pages_per_qpl) {
+ net_err_ratelimited("%s: Out of QPL pages\n",
+ priv->dev->name);
+ return -ENOMEM;
+ }
+ buf_state->page_info.page = rx->dqo.qpl->pages[idx];
+ buf_state->addr = rx->dqo.qpl->page_buses[idx];
+ rx->dqo.next_qpl_page_idx++;
+ }
buf_state->page_info.page_offset = 0;
buf_state->page_info.page_address =
page_address(buf_state->page_info.page);
@@ -173,6 +242,23 @@
return 0;
}
+static void gve_rx_free_hdr_bufs(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ int buffer_queue_slots = rx->dqo.bufq.mask + 1;
+ int i;
+
+ if (rx->dqo.hdr_bufs) {
+ for (i = 0; i < buffer_queue_slots; i++)
+ if (rx->dqo.hdr_bufs[i].data)
+ dma_pool_free(priv->header_buf_pool,
+ rx->dqo.hdr_bufs[i].data,
+ rx->dqo.hdr_bufs[i].addr);
+ kvfree(rx->dqo.hdr_bufs);
+ rx->dqo.hdr_bufs = NULL;
+ }
+}
+
static void gve_rx_free_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_rx_ring *rx = &priv->rx[idx];
@@ -195,9 +281,13 @@
for (i = 0; i < rx->dqo.num_buf_states; i++) {
struct gve_rx_buf_state_dqo *bs = &rx->dqo.buf_states[i];
-
+ /* Only free page for RDA. QPL pages are freed in gve_main. */
if (bs->page_info.page)
- gve_free_page_dqo(priv, bs);
+ gve_free_page_dqo(priv, bs, !rx->dqo.qpl);
+ }
+ if (rx->dqo.qpl) {
+ gve_unassign_qpl(priv, rx->dqo.qpl->id);
+ rx->dqo.qpl = NULL;
}
if (rx->dqo.bufq.desc_ring) {
@@ -218,18 +308,116 @@
kvfree(rx->dqo.buf_states);
rx->dqo.buf_states = NULL;
+ gve_rx_free_hdr_bufs(priv, idx);
+
netif_dbg(priv, drv, priv->dev, "freed rx ring %d\n", idx);
}
+static int gve_rx_alloc_hdr_bufs(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ int buffer_queue_slots = rx->dqo.bufq.mask + 1;
+ int i;
+
+ rx->dqo.hdr_bufs = kvcalloc(buffer_queue_slots,
+ sizeof(rx->dqo.hdr_bufs[0]),
+ GFP_KERNEL);
+ if (!rx->dqo.hdr_bufs)
+ return -ENOMEM;
+
+ for (i = 0; i < buffer_queue_slots; i++) {
+ rx->dqo.hdr_bufs[i].data =
+ dma_pool_alloc(priv->header_buf_pool,
+ GFP_KERNEL,
+ &rx->dqo.hdr_bufs[i].addr);
+ if (!rx->dqo.hdr_bufs[i].data)
+ goto err;
+ }
+
+ return 0;
+err:
+ gve_rx_free_hdr_bufs(priv, idx);
+ return -ENOMEM;
+}
+
+static void gve_rx_init_ring_state_dqo(struct gve_rx_ring *rx,
+ const u32 buffer_queue_slots,
+ const u32 completion_queue_slots)
+{
+ int i;
+
+ /* Set buffer queue state */
+ rx->dqo.bufq.mask = buffer_queue_slots - 1;
+ rx->dqo.bufq.head = 0;
+ rx->dqo.bufq.tail = 0;
+
+ /* Set completion queue state */
+ rx->dqo.complq.num_free_slots = completion_queue_slots;
+ rx->dqo.complq.mask = completion_queue_slots - 1;
+ rx->dqo.complq.cur_gen_bit = 0;
+ rx->dqo.complq.head = 0;
+
+ /* Set RX SKB context */
+ rx->ctx.skb_head = NULL;
+ rx->ctx.skb_tail = NULL;
+
+ /* Set up linked list of buffer IDs */
+ for (i = 0; i < rx->dqo.num_buf_states - 1; i++)
+ rx->dqo.buf_states[i].next = i + 1;
+ rx->dqo.buf_states[rx->dqo.num_buf_states - 1].next = -1;
+
+ rx->dqo.free_buf_states = 0;
+ rx->dqo.recycled_buf_states.head = -1;
+ rx->dqo.recycled_buf_states.tail = -1;
+ rx->dqo.used_buf_states.head = -1;
+ rx->dqo.used_buf_states.tail = -1;
+}
+
+static void gve_rx_reset_ring_dqo(struct gve_priv *priv, int idx)
+{
+ struct gve_rx_ring *rx = &priv->rx[idx];
+ size_t size;
+ int i;
+
+ const u32 buffer_queue_slots = priv->rx_desc_cnt;
+ const u32 completion_queue_slots = priv->rx_desc_cnt;
+
+ netif_dbg(priv, drv, priv->dev, "Resetting rx ring \n");
+
+ /* Reset buffer queue */
+ size = sizeof(rx->dqo.bufq.desc_ring[0]) *
+ buffer_queue_slots;
+ memset(rx->dqo.bufq.desc_ring, 0 , size);
+
+ /* Reset completion queue */
+ size = sizeof(rx->dqo.complq.desc_ring[0]) *
+ completion_queue_slots;
+ memset(rx->dqo.complq.desc_ring, 0, size);
+
+ /* Reset q_resources */
+ memset(rx->q_resources, 0, sizeof(*rx->q_resources));
+
+ /* Reset buf states */
+ for (i = 0; i < rx->dqo.num_buf_states; i++) {
+ struct gve_rx_buf_state_dqo *bs = &rx->dqo.buf_states[i];
+
+ if (bs->page_info.page)
+ gve_free_page_dqo(priv, bs, !rx->dqo.qpl);
+ }
+
+ gve_rx_init_ring_state_dqo(rx, buffer_queue_slots,
+ completion_queue_slots);
+}
+
static int gve_rx_alloc_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_rx_ring *rx = &priv->rx[idx];
struct device *hdev = &priv->pdev->dev;
size_t size;
- int i;
const u32 buffer_queue_slots =
- priv->options_dqo_rda.rx_buff_ring_entries;
+ priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ priv->options_dqo_rda.rx_buff_ring_entries : priv->rx_desc_cnt;
const u32 completion_queue_slots = priv->rx_desc_cnt;
netif_dbg(priv, drv, priv->dev, "allocating rx ring DQO\n");
@@ -237,29 +425,17 @@
memset(rx, 0, sizeof(*rx));
rx->gve = priv;
rx->q_num = idx;
- rx->dqo.bufq.mask = buffer_queue_slots - 1;
- rx->dqo.complq.num_free_slots = completion_queue_slots;
- rx->dqo.complq.mask = completion_queue_slots - 1;
- rx->ctx.skb_head = NULL;
- rx->ctx.skb_tail = NULL;
- rx->dqo.num_buf_states = min_t(s16, S16_MAX, buffer_queue_slots * 4);
+ /* Allocate buf states */
+ rx->dqo.num_buf_states = priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ min_t(s16, S16_MAX, buffer_queue_slots * 8) :
+ priv->rx_pages_per_qpl;
rx->dqo.buf_states = kvcalloc(rx->dqo.num_buf_states,
sizeof(rx->dqo.buf_states[0]),
GFP_KERNEL);
if (!rx->dqo.buf_states)
return -ENOMEM;
- /* Set up linked list of buffer IDs */
- for (i = 0; i < rx->dqo.num_buf_states - 1; i++)
- rx->dqo.buf_states[i].next = i + 1;
-
- rx->dqo.buf_states[rx->dqo.num_buf_states - 1].next = -1;
- rx->dqo.recycled_buf_states.head = -1;
- rx->dqo.recycled_buf_states.tail = -1;
- rx->dqo.used_buf_states.head = -1;
- rx->dqo.used_buf_states.tail = -1;
-
/* Allocate RX completion queue */
size = sizeof(rx->dqo.complq.desc_ring[0]) *
completion_queue_slots;
@@ -275,11 +451,26 @@
if (!rx->dqo.bufq.desc_ring)
goto err;
+ if (priv->queue_format != GVE_DQO_RDA_FORMAT) {
+ rx->dqo.qpl = gve_assign_rx_qpl(priv);
+ if (!rx->dqo.qpl)
+ goto err;
+ rx->dqo.next_qpl_page_idx = 0;
+ }
+
rx->q_resources = dma_alloc_coherent(hdev, sizeof(*rx->q_resources),
&rx->q_resources_bus, GFP_KERNEL);
if (!rx->q_resources)
goto err;
+ gve_rx_init_ring_state_dqo(rx, buffer_queue_slots,
+ completion_queue_slots);
+
+ /* Allocate header buffers for header-split */
+ if (priv->header_buf_pool)
+ if (gve_rx_alloc_hdr_bufs(priv, idx))
+ goto err;
+
gve_rx_add_to_block(priv, idx);
return 0;
@@ -297,10 +488,28 @@
iowrite32(rx->dqo.bufq.tail, &priv->db_bar2[index]);
}
+static int gve_rx_alloc_hdr_buf_pool(struct gve_priv *priv)
+{
+ priv->header_buf_pool = dma_pool_create("header_bufs",
+ &priv->pdev->dev,
+ priv->header_buf_size,
+ 64, 0);
+ if (!priv->header_buf_pool)
+ return -ENOMEM;
+
+ return 0;
+}
+
int gve_rx_alloc_rings_dqo(struct gve_priv *priv)
{
int err = 0;
- int i;
+ int i = 0;
+
+ if (gve_get_enable_header_split(priv)) {
+ err = gve_rx_alloc_hdr_buf_pool(priv);
+ if (err)
+ goto err;
+ }
for (i = 0; i < priv->rx_cfg.num_queues; i++) {
err = gve_rx_alloc_ring_dqo(priv, i);
@@ -321,12 +530,23 @@
return err;
}
+void gve_rx_reset_rings_dqo(struct gve_priv *priv)
+{
+ int i;
+
+ for (i = 0; i < priv->rx_cfg.num_queues; i++)
+ gve_rx_reset_ring_dqo(priv, i);
+}
+
void gve_rx_free_rings_dqo(struct gve_priv *priv)
{
int i;
for (i = 0; i < priv->rx_cfg.num_queues; i++)
gve_rx_free_ring_dqo(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
}
void gve_rx_post_buffers_dqo(struct gve_rx_ring *rx)
@@ -352,7 +572,7 @@
if (unlikely(!buf_state))
break;
- if (unlikely(gve_alloc_page_dqo(priv, buf_state))) {
+ if (unlikely(gve_alloc_page_dqo(rx, buf_state))) {
u64_stats_update_begin(&rx->statss);
rx->rx_buf_alloc_fail++;
u64_stats_update_end(&rx->statss);
@@ -364,6 +584,12 @@
desc->buf_id = cpu_to_le16(buf_state - rx->dqo.buf_states);
desc->buf_addr = cpu_to_le64(buf_state->addr +
buf_state->page_info.page_offset);
+ if (rx->dqo.hdr_bufs) {
+ struct gve_header_buf *hdr_buf =
+ &rx->dqo.hdr_bufs[bufq->tail];
+ buf_state->hdr_buf = hdr_buf;
+ desc->header_buf_addr = cpu_to_le64(hdr_buf->addr);
+ }
bufq->tail = (bufq->tail + 1) & bufq->mask;
complq->num_free_slots--;
@@ -410,11 +636,12 @@
goto mark_used;
}
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ gve_recycle_buf(rx, buf_state);
return;
mark_used:
gve_enqueue_buf_state(rx, &rx->dqo.used_buf_states, buf_state);
+ rx->dqo.used_buf_states_cnt++;
}
static void gve_rx_skb_csum(struct sk_buff *skb,
@@ -465,16 +692,55 @@
skb_set_hash(skb, le32_to_cpu(compl_desc->hash), hash_type);
}
-static void gve_rx_free_skb(struct gve_rx_ring *rx)
+static void gve_rx_free_skb(struct napi_struct *napi, struct gve_rx_ring *rx)
{
if (!rx->ctx.skb_head)
return;
+ if (rx->ctx.skb_head == napi->skb)
+ napi->skb = NULL;
dev_kfree_skb_any(rx->ctx.skb_head);
rx->ctx.skb_head = NULL;
rx->ctx.skb_tail = NULL;
}
+static bool gve_rx_should_trigger_copy_ondemand(struct gve_rx_ring *rx)
+{
+ if (!rx->dqo.qpl)
+ return false;
+ if (rx->dqo.used_buf_states_cnt <
+ (rx->dqo.num_buf_states -
+ GVE_DQO_QPL_ONDEMAND_ALLOC_THRESHOLD))
+ return false;
+ return true;
+}
+
+static int gve_rx_copy_ondemand(struct gve_rx_ring *rx,
+ struct gve_rx_buf_state_dqo *buf_state,
+ u16 buf_len)
+{
+ struct page *page = alloc_page(GFP_ATOMIC);
+ int num_frags;
+
+ if (!page)
+ return -ENOMEM;
+
+ memcpy(page_address(page),
+ buf_state->page_info.page_address +
+ buf_state->page_info.page_offset,
+ buf_len);
+ num_frags = skb_shinfo(rx->ctx.skb_tail)->nr_frags;
+ skb_add_rx_frag(rx->ctx.skb_tail, num_frags, page,
+ 0, buf_len, PAGE_SIZE);
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_frag_alloc_cnt++;
+ u64_stats_update_end(&rx->statss);
+ /* Return unused buffer. */
+ gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ return 0;
+}
+
/* Chains multi skbs for single rx packet.
* Returns 0 if buffer is appended, -1 otherwise.
*/
@@ -505,12 +771,23 @@
rx->ctx.skb_head->truesize += priv->data_buffer_size_dqo;
}
+ /* Trigger ondemand page allocation if we are running low on buffers */
+ if (gve_rx_should_trigger_copy_ondemand(rx))
+ return gve_rx_copy_ondemand(rx, buf_state, buf_len);
+
skb_add_rx_frag(rx->ctx.skb_tail, num_frags,
buf_state->page_info.page,
buf_state->page_info.page_offset,
buf_len, priv->data_buffer_size_dqo);
gve_dec_pagecnt_bias(&buf_state->page_info);
+ if (is_dma_buf_page(buf_state->page_info.page))
+ rx->ctx.skb_tail->devmem = 1;
+
+ /* Advances buffer page-offset if page is partially used.
+ * Marks buffer as used if page is full.
+ */
+ gve_try_recycle_buf(priv, rx, buf_state);
return 0;
}
@@ -523,10 +800,13 @@
int queue_idx)
{
const u16 buffer_id = le16_to_cpu(compl_desc->buf_id);
+ const bool hbo = compl_desc->header_buffer_overflow != 0;
const bool eop = compl_desc->end_of_packet != 0;
+ const bool sph = compl_desc->split_header != 0;
struct gve_rx_buf_state_dqo *buf_state;
struct gve_priv *priv = rx->gve;
u16 buf_len;
+ u16 hdr_len;
if (unlikely(buffer_id >= rx->dqo.num_buf_states)) {
net_err_ratelimited("%s: Invalid RX buffer_id=%u\n",
@@ -541,18 +821,74 @@
}
if (unlikely(compl_desc->rx_error)) {
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states,
- buf_state);
+ net_err_ratelimited("%s: Descriptor error=%u\n",
+ priv->dev->name, compl_desc->rx_error);
+ gve_recycle_buf(rx, buf_state);
return -EINVAL;
}
buf_len = compl_desc->packet_len;
+ hdr_len = compl_desc->header_len;
+
+ if (unlikely(sph && !hdr_len)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EINVAL;
+ }
+
+ if (unlikely(hdr_len && buf_state->hdr_buf == NULL)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EINVAL;
+ }
+
+ if (unlikely(hbo && priv->header_split_strict)) {
+ gve_recycle_buf(rx, buf_state);
+ return -EFAULT;
+ }
/* Page might have not been used for awhile and was likely last written
* by a different thread.
*/
prefetch(buf_state->page_info.page);
+ if (!sph && !rx->ctx.skb_head &&
+ is_dma_buf_page(buf_state->page_info.page)) {
+ /* !sph indicates the packet is not split, and the header went
+ * to the packet buffer. If the packet buffer is a dma_buf
+ * page, those can't be easily mapped into the kernel space to
+ * access the header required to process the packet.
+ *
+ * In the future we may be able to map the dma_buf page to
+ * kernel space to access the header for dma_buf providers that
+ * support that, but for now, simply drop the packet. We expect
+ * the TCP packets that we care about to be header split
+ * anyway.
+ */
+ rx->rx_devmem_dropped++;
+ gve_recycle_buf(rx, buf_state);
+ return -EFAULT;
+ }
+
+ /* Copy the header into the skb in the case of header split */
+ if (sph) {
+ dma_sync_single_for_cpu(&priv->pdev->dev,
+ buf_state->hdr_buf->addr,
+ hdr_len, DMA_FROM_DEVICE);
+
+ rx->ctx.skb_head = gve_rx_copy_data(priv->dev, napi,
+ buf_state->hdr_buf->data,
+ hdr_len);
+ if (unlikely(!rx->ctx.skb_head))
+ goto error;
+
+ rx->ctx.skb_tail = rx->ctx.skb_head;
+
+ u64_stats_update_begin(&rx->statss);
+ rx->rx_hsplit_pkt++;
+ rx->rx_hsplit_hbo_pkt += hbo;
+ rx->rheader_bytes += hdr_len;
+ u64_stats_update_end(&rx->statss);
+ }
+
/* Sync the portion of dma buffer for CPU to read. */
dma_sync_single_range_for_cpu(&priv->pdev->dev, buf_state->addr,
buf_state->page_info.page_offset,
@@ -564,14 +900,14 @@
priv)) != 0) {
goto error;
}
-
- gve_try_recycle_buf(priv, rx, buf_state);
return 0;
}
- if (eop && buf_len <= priv->rx_copybreak) {
+ /* We can't copy dma-buf pages. Ignore any copybreak setting. */
+ if (eop && buf_len <= priv->rx_copybreak &&
+ (!is_dma_buf_page(buf_state->page_info.page) || !buf_len)) {
rx->ctx.skb_head = gve_rx_copy(priv->dev, napi,
- &buf_state->page_info, buf_len, 0, NULL);
+ &buf_state->page_info, buf_len, 0);
if (unlikely(!rx->ctx.skb_head))
goto error;
rx->ctx.skb_tail = rx->ctx.skb_head;
@@ -581,8 +917,7 @@
rx->rx_copybreak_pkt++;
u64_stats_update_end(&rx->statss);
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states,
- buf_state);
+ gve_recycle_buf(rx, buf_state);
return 0;
}
@@ -591,16 +926,26 @@
goto error;
rx->ctx.skb_tail = rx->ctx.skb_head;
+ if (gve_rx_should_trigger_copy_ondemand(rx)) {
+ if (gve_rx_copy_ondemand(rx, buf_state, buf_len) < 0)
+ goto error;
+ return 0;
+ }
+
skb_add_rx_frag(rx->ctx.skb_head, 0, buf_state->page_info.page,
buf_state->page_info.page_offset, buf_len,
priv->data_buffer_size_dqo);
gve_dec_pagecnt_bias(&buf_state->page_info);
+ if (is_dma_buf_page(buf_state->page_info.page))
+ rx->ctx.skb_head->devmem = 1;
+
gve_try_recycle_buf(priv, rx, buf_state);
return 0;
error:
- gve_enqueue_buf_state(rx, &rx->dqo.recycled_buf_states, buf_state);
+ dev_err(&priv->pdev->dev, "%s: Error return", priv->dev->name);
+ gve_recycle_buf(rx, buf_state);
return -ENOMEM;
}
@@ -655,10 +1000,15 @@
return err;
}
- if (skb_headlen(rx->ctx.skb_head) == 0)
+ if (skb_headlen(rx->ctx.skb_head) == 0) {
+ if (napi_get_frags(napi)->devmem)
+ rx->rx_devmem_pkt++;
napi_gro_frags(napi);
- else
+ } else {
+ if (rx->ctx.skb_head->devmem)
+ rx->rx_devmem_pkt++;
napi_gro_receive(napi, rx->ctx.skb_head);
+ }
return 0;
}
@@ -693,12 +1043,14 @@
err = gve_rx_dqo(napi, rx, compl_desc, rx->q_num);
if (err < 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
if (err == -ENOMEM)
rx->rx_skb_alloc_fail++;
else if (err == -EINVAL)
rx->rx_desc_err_dropped_pkt++;
+ else if (err == -EFAULT)
+ rx->rx_hsplit_err_dropped_pkt++;
u64_stats_update_end(&rx->statss);
}
@@ -736,7 +1088,7 @@
/* gve_rx_complete_skb() will consume skb if successful */
if (gve_rx_complete_skb(rx, napi, compl_desc, feat) != 0) {
- gve_rx_free_skb(rx);
+ gve_rx_free_skb(napi, rx);
u64_stats_update_begin(&rx->statss);
rx->rx_desc_err_dropped_pkt++;
u64_stats_update_end(&rx->statss);
@@ -757,3 +1109,39 @@
return work_done;
}
+
+int gve_rx_handle_hdr_resources_dqo(struct gve_priv *priv,
+ bool enable_hdr_split)
+{
+ int err = 0;
+ int i;
+
+ if (enable_hdr_split) {
+ err = gve_rx_alloc_hdr_buf_pool(priv);
+ if (err)
+ goto err;
+
+ for (i = 0; i < priv->rx_cfg.num_queues; i++) {
+ err = gve_rx_alloc_hdr_bufs(priv, i);
+ if (err)
+ goto free_buf_pool;
+ }
+ } else {
+ for (i = 0; i < priv->rx_cfg.num_queues; i++)
+ gve_rx_free_hdr_bufs(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
+ }
+
+ return 0;
+
+free_buf_pool:
+ for (i--; i >= 0; i--)
+ gve_rx_free_hdr_bufs(priv, i);
+
+ dma_pool_destroy(priv->header_buf_pool);
+ priv->header_buf_pool = NULL;
+err:
+ return err;
+}
diff --git a/drivers/net/ethernet/google/gve/gve_tx_dqo.c b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
index 588d648..7f7c354 100644
--- a/drivers/net/ethernet/google/gve/gve_tx_dqo.c
+++ b/drivers/net/ethernet/google/gve/gve_tx_dqo.c
@@ -8,10 +8,94 @@
#include "gve_adminq.h"
#include "gve_utils.h"
#include "gve_dqo.h"
+#include <net/ip.h>
#include <linux/tcp.h>
#include <linux/slab.h>
#include <linux/skbuff.h>
+/* Returns true if tx_bufs are available. */
+static bool gve_has_free_tx_qpl_bufs(struct gve_tx_ring *tx, int count)
+{
+ int num_avail;
+
+ if (!tx->dqo.qpl)
+ return true;
+
+ num_avail = tx->dqo.num_tx_qpl_bufs -
+ (tx->dqo_tx.alloc_tx_qpl_buf_cnt -
+ tx->dqo_tx.free_tx_qpl_buf_cnt);
+
+ if (count <= num_avail)
+ return true;
+
+ /* Update cached value from dqo_compl. */
+ tx->dqo_tx.free_tx_qpl_buf_cnt =
+ atomic_read_acquire(&tx->dqo_compl.free_tx_qpl_buf_cnt);
+
+ num_avail = tx->dqo.num_tx_qpl_bufs -
+ (tx->dqo_tx.alloc_tx_qpl_buf_cnt -
+ tx->dqo_tx.free_tx_qpl_buf_cnt);
+
+ return count <= num_avail;
+}
+
+static s16
+gve_alloc_tx_qpl_buf(struct gve_tx_ring *tx)
+{
+ s16 index;
+
+ index = tx->dqo_tx.free_tx_qpl_buf_head;
+
+ /* No TX buffers available, try to steal the list from the
+ * completion handler.
+ */
+ if (unlikely(index == -1)) {
+ tx->dqo_tx.free_tx_qpl_buf_head =
+ atomic_xchg(&tx->dqo_compl.free_tx_qpl_buf_head, -1);
+ index = tx->dqo_tx.free_tx_qpl_buf_head;
+
+ if (unlikely(index == -1))
+ return index;
+ }
+
+ /* Remove TX buf from free list */
+ tx->dqo_tx.free_tx_qpl_buf_head = tx->dqo.tx_qpl_buf_next[index];
+
+ return index;
+}
+
+static void
+gve_free_tx_qpl_bufs(struct gve_tx_ring *tx,
+ struct gve_tx_pending_packet_dqo *pkt)
+{
+ s16 index;
+ int i;
+
+ if (!pkt->num_bufs)
+ return;
+
+ index = pkt->tx_qpl_buf_ids[0];
+ /* Create a linked list of buffers to be added to the free list */
+ for (i = 1; i < pkt->num_bufs; i++) {
+ tx->dqo.tx_qpl_buf_next[index] = pkt->tx_qpl_buf_ids[i];
+ index = pkt->tx_qpl_buf_ids[i];
+ }
+
+ while (true) {
+ s16 old_head = atomic_read_acquire(&tx->dqo_compl.free_tx_qpl_buf_head);
+
+ tx->dqo.tx_qpl_buf_next[index] = old_head;
+ if (atomic_cmpxchg(&tx->dqo_compl.free_tx_qpl_buf_head,
+ old_head,
+ pkt->tx_qpl_buf_ids[0]) == old_head) {
+ break;
+ }
+ }
+
+ atomic_add(pkt->num_bufs, &tx->dqo_compl.free_tx_qpl_buf_cnt);
+ pkt->num_bufs = 0;
+}
+
/* Returns true if a gve_tx_pending_packet_dqo object is available. */
static bool gve_has_pending_packet(struct gve_tx_ring *tx)
{
@@ -135,9 +219,40 @@
kvfree(tx->dqo.pending_packets);
tx->dqo.pending_packets = NULL;
+ kvfree(tx->dqo.tx_qpl_buf_next);
+ tx->dqo.tx_qpl_buf_next = NULL;
+
+ if (tx->dqo.qpl) {
+ gve_unassign_qpl(priv, tx->dqo.qpl->id);
+ tx->dqo.qpl = NULL;
+ }
+
netif_dbg(priv, drv, priv->dev, "freed tx queue %d\n", idx);
}
+static int gve_tx_qpl_buf_init(struct gve_tx_ring *tx)
+{
+ int num_tx_qpl_bufs = GVE_TX_BUFS_PER_PAGE_DQO *
+ tx->dqo.qpl->num_entries;
+ int i;
+
+ tx->dqo.tx_qpl_buf_next = kvcalloc(num_tx_qpl_bufs,
+ sizeof(tx->dqo.tx_qpl_buf_next[0]),
+ GFP_KERNEL);
+ if (!tx->dqo.tx_qpl_buf_next)
+ return -ENOMEM;
+
+ tx->dqo.num_tx_qpl_bufs = num_tx_qpl_bufs;
+
+ /* Generate free TX buf list */
+ for (i = 0; i < num_tx_qpl_bufs - 1; i++)
+ tx->dqo.tx_qpl_buf_next[i] = i + 1;
+ tx->dqo.tx_qpl_buf_next[num_tx_qpl_bufs - 1] = -1;
+
+ atomic_set_release(&tx->dqo_compl.free_tx_qpl_buf_head, -1);
+ return 0;
+}
+
static int gve_tx_alloc_ring_dqo(struct gve_priv *priv, int idx)
{
struct gve_tx_ring *tx = &priv->tx[idx];
@@ -154,7 +269,9 @@
/* Queue sizes must be a power of 2 */
tx->mask = priv->tx_desc_cnt - 1;
- tx->dqo.complq_mask = priv->options_dqo_rda.tx_comp_ring_entries - 1;
+ tx->dqo.complq_mask = priv->queue_format == GVE_DQO_RDA_FORMAT ?
+ priv->options_dqo_rda.tx_comp_ring_entries - 1 :
+ tx->mask;
/* The max number of pending packets determines the maximum number of
* descriptors which maybe written to the completion queue.
@@ -210,6 +327,15 @@
if (!tx->q_resources)
goto err;
+ if (gve_is_qpl(priv)) {
+ tx->dqo.qpl = gve_assign_tx_qpl(priv);
+ if (!tx->dqo.qpl)
+ goto err;
+
+ if (gve_tx_qpl_buf_init(tx))
+ goto err;
+ }
+
gve_tx_add_to_block(priv, idx);
return 0;
@@ -266,20 +392,27 @@
return tx->mask - num_used;
}
+static bool gve_has_avail_slots_tx_dqo(struct gve_tx_ring *tx,
+ int desc_count, int buf_count)
+{
+ return gve_has_pending_packet(tx) &&
+ num_avail_tx_slots(tx) >= desc_count &&
+ gve_has_free_tx_qpl_bufs(tx, buf_count);
+}
+
/* Stops the queue if available descriptors is less than 'count'.
* Return: 0 if stop is not required.
*/
-static int gve_maybe_stop_tx_dqo(struct gve_tx_ring *tx, int count)
+static int gve_maybe_stop_tx_dqo(struct gve_tx_ring *tx,
+ int desc_count, int buf_count)
{
- if (likely(gve_has_pending_packet(tx) &&
- num_avail_tx_slots(tx) >= count))
+ if (likely(gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return 0;
/* Update cached TX head pointer */
tx->dqo_tx.head = atomic_read_acquire(&tx->dqo_compl.hw_tx_head);
- if (likely(gve_has_pending_packet(tx) &&
- num_avail_tx_slots(tx) >= count))
+ if (likely(gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return 0;
/* No space, so stop the queue */
@@ -294,8 +427,7 @@
*/
tx->dqo_tx.head = atomic_read_acquire(&tx->dqo_compl.hw_tx_head);
- if (likely(!gve_has_pending_packet(tx) ||
- num_avail_tx_slots(tx) < count))
+ if (likely(!gve_has_avail_slots_tx_dqo(tx, desc_count, buf_count)))
return -EBUSY;
netif_tx_start_queue(tx->netdev_txq);
@@ -443,26 +575,159 @@
};
}
+static int gve_tx_add_skb_no_copy_dqo(struct gve_tx_ring *tx,
+ struct sk_buff *skb,
+ struct gve_tx_pending_packet_dqo *pkt,
+ s16 completion_tag,
+ u32 *desc_idx,
+ bool is_gso)
+{
+ const struct skb_shared_info *shinfo = skb_shinfo(skb);
+ int i;
+
+ /* Note: HW requires that the size of a non-TSO packet be within the
+ * range of [17, 9728].
+ *
+ * We don't double check because
+ * - We limited `netdev->min_mtu` to ETH_MIN_MTU.
+ * - Hypervisor won't allow MTU larger than 9216.
+ */
+
+ pkt->num_bufs = 0;
+ /* Map the linear portion of skb */
+ {
+ u32 len = skb_headlen(skb);
+ dma_addr_t addr;
+
+ addr = dma_map_single(tx->dev, skb->data, len, DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dev, addr)))
+ goto err;
+
+ dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
+ dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
+ ++pkt->num_bufs;
+
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb, len, addr,
+ completion_tag,
+ /*eop=*/shinfo->nr_frags == 0, is_gso);
+ }
+
+ for (i = 0; i < shinfo->nr_frags; i++) {
+ const skb_frag_t *frag = &shinfo->frags[i];
+ bool is_eop = i == (shinfo->nr_frags - 1);
+ u32 len = skb_frag_size(frag);
+ dma_addr_t addr;
+
+ addr = skb_devmem_frag_dma_map(tx->dev, skb, frag, 0, len,
+ DMA_TO_DEVICE);
+ if (unlikely(dma_mapping_error(tx->dev, addr)))
+ goto err;
+
+ dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
+ dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
+ ++pkt->num_bufs;
+
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb, len, addr,
+ completion_tag, is_eop, is_gso);
+ }
+
+ return 0;
+err:
+ for (i = 0; i < pkt->num_bufs; i++) {
+ if (i == 0) {
+ dma_unmap_single(tx->dev,
+ dma_unmap_addr(pkt, dma[i]),
+ dma_unmap_len(pkt, len[i]),
+ DMA_TO_DEVICE);
+ } else {
+ dma_unmap_page(tx->dev,
+ dma_unmap_addr(pkt, dma[i]),
+ dma_unmap_len(pkt, len[i]),
+ DMA_TO_DEVICE);
+ }
+ }
+ pkt->num_bufs = 0;
+ return -1;
+}
+
+/* Tx buffer i corresponds to
+ * qpl_page_id = i / GVE_TX_BUFS_PER_PAGE_DQO
+ * qpl_page_offset = (i % GVE_TX_BUFS_PER_PAGE_DQO) * GVE_TX_BUF_SIZE_DQO
+ */
+static void gve_tx_buf_get_addr(struct gve_tx_ring *tx,
+ s16 index,
+ void **va, dma_addr_t *dma_addr)
+{
+ int page_id = index >> (PAGE_SHIFT - GVE_TX_BUF_SHIFT_DQO);
+ int offset = (index & (GVE_TX_BUFS_PER_PAGE_DQO - 1)) << GVE_TX_BUF_SHIFT_DQO;
+
+ *va = page_address(tx->dqo.qpl->pages[page_id]) + offset;
+ *dma_addr = tx->dqo.qpl->page_buses[page_id] + offset;
+}
+
+static int gve_tx_add_skb_copy_dqo(struct gve_tx_ring *tx,
+ struct sk_buff *skb,
+ struct gve_tx_pending_packet_dqo *pkt,
+ s16 completion_tag,
+ u32 *desc_idx,
+ bool is_gso)
+{
+ u32 copy_offset = 0;
+ dma_addr_t dma_addr;
+ u32 copy_len;
+ s16 index;
+ void *va;
+
+ /* Break the packet into buffer size chunks */
+ pkt->num_bufs = 0;
+ while (copy_offset < skb->len) {
+ index = gve_alloc_tx_qpl_buf(tx);
+ if (unlikely(index == -1))
+ goto err;
+
+ gve_tx_buf_get_addr(tx, index, &va, &dma_addr);
+ copy_len = min_t(u32, GVE_TX_BUF_SIZE_DQO,
+ skb->len - copy_offset);
+ skb_copy_bits(skb, copy_offset, va, copy_len);
+
+ copy_offset += copy_len;
+ dma_sync_single_for_device(tx->dev, dma_addr,
+ copy_len, DMA_TO_DEVICE);
+ gve_tx_fill_pkt_desc_dqo(tx, desc_idx, skb,
+ copy_len,
+ dma_addr,
+ completion_tag,
+ copy_offset == skb->len,
+ is_gso);
+
+ pkt->tx_qpl_buf_ids[pkt->num_bufs] = index;
+ ++tx->dqo_tx.alloc_tx_qpl_buf_cnt;
+ ++pkt->num_bufs;
+ }
+
+ return 0;
+err:
+ /* Should not be here if gve_has_free_tx_qpl_bufs() check is correct */
+ gve_free_tx_qpl_bufs(tx, pkt);
+ return -ENOMEM;
+}
+
/* Returns 0 on success, or < 0 on error.
*
* Before this function is called, the caller must ensure
* gve_has_pending_packet(tx) returns true.
*/
-static int gve_tx_add_skb_no_copy_dqo(struct gve_tx_ring *tx,
- struct sk_buff *skb)
+static int gve_tx_add_skb_dqo(struct gve_tx_ring *tx,
+ struct sk_buff *skb)
{
- const struct skb_shared_info *shinfo = skb_shinfo(skb);
const bool is_gso = skb_is_gso(skb);
u32 desc_idx = tx->dqo_tx.tail;
-
struct gve_tx_pending_packet_dqo *pkt;
struct gve_tx_metadata_dqo metadata;
s16 completion_tag;
- int i;
pkt = gve_alloc_pending_packet(tx);
pkt->skb = skb;
- pkt->num_bufs = 0;
completion_tag = pkt - tx->dqo.pending_packets;
gve_extract_tx_metadata_dqo(skb, &metadata);
@@ -481,49 +746,19 @@
&metadata);
desc_idx = (desc_idx + 1) & tx->mask;
- /* Note: HW requires that the size of a non-TSO packet be within the
- * range of [17, 9728].
- *
- * We don't double check because
- * - We limited `netdev->min_mtu` to ETH_MIN_MTU.
- * - Hypervisor won't allow MTU larger than 9216.
- */
-
- /* Map the linear portion of skb */
- {
- u32 len = skb_headlen(skb);
- dma_addr_t addr;
-
- addr = dma_map_single(tx->dev, skb->data, len, DMA_TO_DEVICE);
- if (unlikely(dma_mapping_error(tx->dev, addr)))
+ if (tx->dqo.qpl) {
+ if (gve_tx_add_skb_copy_dqo(tx, skb, pkt,
+ completion_tag,
+ &desc_idx, is_gso))
goto err;
-
- dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
- dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
- ++pkt->num_bufs;
-
- gve_tx_fill_pkt_desc_dqo(tx, &desc_idx, skb, len, addr,
- completion_tag,
- /*eop=*/shinfo->nr_frags == 0, is_gso);
+ } else {
+ if (gve_tx_add_skb_no_copy_dqo(tx, skb, pkt,
+ completion_tag,
+ &desc_idx, is_gso))
+ goto err;
}
- for (i = 0; i < shinfo->nr_frags; i++) {
- const skb_frag_t *frag = &shinfo->frags[i];
- bool is_eop = i == (shinfo->nr_frags - 1);
- u32 len = skb_frag_size(frag);
- dma_addr_t addr;
-
- addr = skb_frag_dma_map(tx->dev, frag, 0, len, DMA_TO_DEVICE);
- if (unlikely(dma_mapping_error(tx->dev, addr)))
- goto err;
-
- dma_unmap_len_set(pkt, len[pkt->num_bufs], len);
- dma_unmap_addr_set(pkt, dma[pkt->num_bufs], addr);
- ++pkt->num_bufs;
-
- gve_tx_fill_pkt_desc_dqo(tx, &desc_idx, skb, len, addr,
- completion_tag, is_eop, is_gso);
- }
+ tx->dqo_tx.posted_packet_desc_cnt += pkt->num_bufs;
/* Commit the changes to our state */
tx->dqo_tx.tail = desc_idx;
@@ -546,22 +781,7 @@
return 0;
err:
- for (i = 0; i < pkt->num_bufs; i++) {
- if (i == 0) {
- dma_unmap_single(tx->dev,
- dma_unmap_addr(pkt, dma[i]),
- dma_unmap_len(pkt, len[i]),
- DMA_TO_DEVICE);
- } else {
- dma_unmap_page(tx->dev,
- dma_unmap_addr(pkt, dma[i]),
- dma_unmap_len(pkt, len[i]),
- DMA_TO_DEVICE);
- }
- }
-
pkt->skb = NULL;
- pkt->num_bufs = 0;
gve_free_pending_packet(tx, pkt);
return -1;
@@ -603,27 +823,57 @@
const int header_len = skb_tcp_all_headers(skb);
const int gso_size = shinfo->gso_size;
int cur_seg_num_bufs;
+ int last_frag_size;
int cur_seg_size;
int i;
cur_seg_size = skb_headlen(skb) - header_len;
+ last_frag_size = skb_headlen(skb);
cur_seg_num_bufs = cur_seg_size > 0;
for (i = 0; i < shinfo->nr_frags; i++) {
if (cur_seg_size >= gso_size) {
cur_seg_size %= gso_size;
cur_seg_num_bufs = cur_seg_size > 0;
+
+ /* If the last buffer is split in the middle of a TSO
+ * segment, then it will count as two descriptors.
+ */
+ if (last_frag_size > GVE_TX_MAX_BUF_SIZE_DQO) {
+ int last_frag_remain = last_frag_size %
+ GVE_TX_MAX_BUF_SIZE_DQO;
+
+ /* If the last frag was evenly divisible by
+ * GVE_TX_MAX_BUF_SIZE_DQO, then it will not be
+ * split in the current segment.
+ */
+ if (last_frag_remain &&
+ cur_seg_size > last_frag_remain) {
+ cur_seg_num_bufs++;
+ }
+ }
}
if (unlikely(++cur_seg_num_bufs > max_bufs_per_seg))
return false;
- cur_seg_size += skb_frag_size(&shinfo->frags[i]);
+ last_frag_size = skb_frag_size(&shinfo->frags[i]);
+ cur_seg_size += last_frag_size;
}
return true;
}
+netdev_features_t gve_features_check_dqo(struct sk_buff *skb,
+ struct net_device *dev,
+ netdev_features_t features)
+{
+ if (skb_is_gso(skb) && !gve_can_send_tso(skb))
+ return features & ~NETIF_F_GSO_MASK;
+
+ return features;
+}
+
/* Attempt to transmit specified SKB.
*
* Returns 0 if the SKB was transmitted or dropped.
@@ -635,37 +885,39 @@
int num_buffer_descs;
int total_num_descs;
- if (skb_is_gso(skb)) {
- /* If TSO doesn't meet HW requirements, attempt to linearize the
- * packet.
- */
- if (unlikely(!gve_can_send_tso(skb) &&
- skb_linearize(skb) < 0)) {
- net_err_ratelimited("%s: Failed to transmit TSO packet\n",
- priv->dev->name);
- goto drop;
- }
+ if (skb_is_gso(skb) && unlikely(ipv6_hopopt_jumbo_remove(skb)))
+ goto drop;
- num_buffer_descs = gve_num_buffer_descs_needed(skb);
+ if (tx->dqo.qpl) {
+ /* We do not need to verify the number of buffers used per
+ * packet or per segment in case of TSO as with 2K size buffers
+ * none of the TX packet rules would be violated.
+ *
+ * gve_can_send_tso() checks that each TCP segment of gso_size is
+ * not distributed over more than 9 SKB frags..
+ */
+ num_buffer_descs = DIV_ROUND_UP(skb->len, GVE_TX_BUF_SIZE_DQO);
} else {
num_buffer_descs = gve_num_buffer_descs_needed(skb);
+ if (!skb_is_gso(skb)) {
+ if (unlikely(num_buffer_descs > GVE_TX_MAX_DATA_DESCS)) {
+ if (unlikely(skb_linearize(skb) < 0))
+ goto drop;
- if (unlikely(num_buffer_descs > GVE_TX_MAX_DATA_DESCS)) {
- if (unlikely(skb_linearize(skb) < 0))
- goto drop;
-
- num_buffer_descs = 1;
+ num_buffer_descs = 1;
+ }
}
}
/* Metadata + (optional TSO) + data descriptors. */
total_num_descs = 1 + skb_is_gso(skb) + num_buffer_descs;
if (unlikely(gve_maybe_stop_tx_dqo(tx, total_num_descs +
- GVE_TX_MIN_DESC_PREVENT_CACHE_OVERLAP))) {
+ GVE_TX_MIN_DESC_PREVENT_CACHE_OVERLAP,
+ num_buffer_descs))) {
return -1;
}
- if (unlikely(gve_tx_add_skb_no_copy_dqo(tx, skb) < 0))
+ if (unlikely(gve_tx_add_skb_dqo(tx, skb) < 0))
goto drop;
netdev_tx_sent_queue(tx->netdev_txq, skb->len);
@@ -813,7 +1065,11 @@
return;
}
}
- gve_unmap_packet(tx->dev, pending_packet);
+ tx->dqo_tx.completed_packet_desc_cnt += pending_packet->num_bufs;
+ if (tx->dqo.qpl)
+ gve_free_tx_qpl_bufs(tx, pending_packet);
+ else
+ gve_unmap_packet(tx->dev, pending_packet);
*bytes += pending_packet->skb->len;
(*pkts)++;
@@ -871,12 +1127,16 @@
remove_from_list(tx, &tx->dqo_compl.miss_completions,
pending_packet);
- /* Unmap buffers and free skb but do not unallocate packet i.e.
+ /* Unmap/free TX buffers and free skb but do not unallocate packet i.e.
* the completion tag is not freed to ensure that the driver
* can take appropriate action if a corresponding valid
* completion is received later.
*/
- gve_unmap_packet(tx->dev, pending_packet);
+ if (tx->dqo.qpl)
+ gve_free_tx_qpl_bufs(tx, pending_packet);
+ else
+ gve_unmap_packet(tx->dev, pending_packet);
+
/* This indicates the packet was dropped. */
dev_kfree_skb_any(pending_packet->skb);
pending_packet->skb = NULL;
@@ -953,12 +1213,18 @@
atomic_set_release(&tx->dqo_compl.hw_tx_head, tx_head);
} else if (type == GVE_COMPL_TYPE_DQO_PKT) {
u16 compl_tag = le16_to_cpu(compl_desc->completion_tag);
-
- gve_handle_packet_completion(priv, tx, !!napi,
- compl_tag,
- &pkt_compl_bytes,
- &pkt_compl_pkts,
- /*is_reinjection=*/false);
+ if (compl_tag & GVE_ALT_MISS_COMPL_BIT) {
+ compl_tag &= ~GVE_ALT_MISS_COMPL_BIT;
+ gve_handle_miss_completion(priv, tx, compl_tag,
+ &miss_compl_bytes,
+ &miss_compl_pkts);
+ } else {
+ gve_handle_packet_completion(priv, tx, !!napi,
+ compl_tag,
+ &pkt_compl_bytes,
+ &pkt_compl_pkts,
+ false);
+ }
} else if (type == GVE_COMPL_TYPE_DQO_MISS) {
u16 compl_tag = le16_to_cpu(compl_desc->completion_tag);
@@ -972,7 +1238,7 @@
compl_tag,
&reinject_compl_bytes,
&reinject_compl_pkts,
- /*is_reinjection=*/true);
+ true);
}
tx->dqo_compl.head =
@@ -989,6 +1255,9 @@
remove_miss_completions(priv, tx);
remove_timed_out_completions(priv, tx);
+ WRITE_ONCE(tx->dqo_compl.last_processed, jiffies);
+ WRITE_ONCE(tx->dqo_compl.kicked, false);
+
u64_stats_update_begin(&tx->statss);
tx->bytes_done += pkt_compl_bytes + reinject_compl_bytes;
tx->pkt_done += pkt_compl_pkts + reinject_compl_pkts;
@@ -1020,3 +1289,9 @@
compl_desc = &tx->dqo.compl_ring[tx->dqo_compl.head];
return compl_desc->generation != tx->dqo_compl.cur_gen_bit;
}
+
+bool gve_tx_work_pending_dqo(struct gve_tx_ring *tx)
+{
+ struct gve_index_list *miss_comp_list = &tx->dqo_compl.miss_completions;
+ return READ_ONCE(miss_comp_list->head) != -1;
+}
diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c
index d57508b..3341193 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.c
+++ b/drivers/net/ethernet/google/gve/gve_utils.c
@@ -48,40 +48,30 @@
rx->ntfy_id = ntfy_idx;
}
-struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
- struct gve_rx_slot_page_info *page_info, u16 len,
- u16 padding, struct gve_rx_ctx *ctx)
+struct sk_buff *gve_rx_copy_data(struct net_device *dev, struct napi_struct *napi,
+ u8 *data, u16 len)
{
- void *va = page_info->page_address + padding + page_info->page_offset;
- int skb_linear_offset = 0;
- bool set_protocol = false;
struct sk_buff *skb;
- if (ctx) {
- if (!ctx->skb_head)
- ctx->skb_head = napi_alloc_skb(napi, ctx->total_expected_size);
+ skb = napi_alloc_skb(napi, len);
+ if (unlikely(!skb))
+ return NULL;
- if (unlikely(!ctx->skb_head))
- return NULL;
- skb = ctx->skb_head;
- skb_linear_offset = skb->len;
- set_protocol = ctx->curr_frag_cnt == ctx->expected_frag_cnt - 1;
- } else {
- skb = napi_alloc_skb(napi, len);
-
- if (unlikely(!skb))
- return NULL;
- set_protocol = true;
- }
__skb_put(skb, len);
- skb_copy_to_linear_data_offset(skb, skb_linear_offset, va, len);
-
- if (set_protocol)
- skb->protocol = eth_type_trans(skb, dev);
+ skb_copy_to_linear_data_offset(skb, 0, data, len);
+ skb->protocol = eth_type_trans(skb, dev);
return skb;
}
+struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
+ struct gve_rx_slot_page_info *page_info, u16 len,
+ u16 padding)
+{
+ u8 *va = page_info->page_address + padding + page_info->page_offset;
+ return gve_rx_copy_data(dev, napi, va, len);
+}
+
void gve_dec_pagecnt_bias(struct gve_rx_slot_page_info *page_info)
{
page_info->pagecnt_bias--;
diff --git a/drivers/net/ethernet/google/gve/gve_utils.h b/drivers/net/ethernet/google/gve/gve_utils.h
index 6d98e69..421cc11 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.h
+++ b/drivers/net/ethernet/google/gve/gve_utils.h
@@ -17,9 +17,12 @@
void gve_rx_remove_from_block(struct gve_priv *priv, int queue_idx);
void gve_rx_add_to_block(struct gve_priv *priv, int queue_idx);
+struct sk_buff *gve_rx_copy_data(struct net_device *dev, struct napi_struct *napi,
+ u8 *data, u16 len);
+
struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
struct gve_rx_slot_page_info *page_info, u16 len,
- u16 pad, struct gve_rx_ctx *ctx);
+ u16 pad);
/* Decrement pagecnt_bias. Set it back to INT_MAX if it reached zero. */
void gve_dec_pagecnt_bias(struct gve_rx_slot_page_info *page_info);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index af281e2..3e1de4c 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -2314,9 +2314,9 @@
frags++;
if (frags >= SKB_WR_LIST_SIZE) {
- pr_err("csk 0x%p, frags %u, %u,%u >%lu.\n",
+ pr_err("csk 0x%p, frags %u, %u,%u >%u.\n",
csk, skb_shinfo(skb)->nr_frags, skb->len,
- skb->data_len, SKB_WR_LIST_SIZE);
+ skb->data_len, (unsigned int)SKB_WR_LIST_SIZE);
return -EINVAL;
}
diff --git a/drivers/virt/Kconfig b/drivers/virt/Kconfig
index 87ef258..f79ab13 100644
--- a/drivers/virt/Kconfig
+++ b/drivers/virt/Kconfig
@@ -52,4 +52,6 @@
source "drivers/virt/coco/sev-guest/Kconfig"
+source "drivers/virt/coco/tdx-guest/Kconfig"
+
endif
diff --git a/drivers/virt/Makefile b/drivers/virt/Makefile
index 093674e..e9aa6fc 100644
--- a/drivers/virt/Makefile
+++ b/drivers/virt/Makefile
@@ -11,3 +11,4 @@
obj-$(CONFIG_ACRN_HSM) += acrn/
obj-$(CONFIG_EFI_SECRET) += coco/efi_secret/
obj-$(CONFIG_SEV_GUEST) += coco/sev-guest/
+obj-$(CONFIG_INTEL_TDX_GUEST) += coco/tdx-guest/
diff --git a/drivers/virt/coco/tdx-guest/Kconfig b/drivers/virt/coco/tdx-guest/Kconfig
new file mode 100644
index 0000000..14246fc
--- /dev/null
+++ b/drivers/virt/coco/tdx-guest/Kconfig
@@ -0,0 +1,10 @@
+config TDX_GUEST_DRIVER
+ tristate "TDX Guest driver"
+ depends on INTEL_TDX_GUEST
+ help
+ The driver provides userspace interface to communicate with
+ the TDX module to request the TDX guest details like attestation
+ report.
+
+ To compile this driver as module, choose M here. The module will
+ be called tdx-guest.
diff --git a/drivers/virt/coco/tdx-guest/Makefile b/drivers/virt/coco/tdx-guest/Makefile
new file mode 100644
index 0000000..775cb46
--- /dev/null
+++ b/drivers/virt/coco/tdx-guest/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_TDX_GUEST_DRIVER) += tdx-guest.o
diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
new file mode 100644
index 0000000..5e44a0f
--- /dev/null
+++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
@@ -0,0 +1,102 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TDX guest user interface driver
+ *
+ * Copyright (C) 2022 Intel Corporation
+ */
+
+#include <linux/kernel.h>
+#include <linux/miscdevice.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/mod_devicetable.h>
+#include <linux/string.h>
+#include <linux/uaccess.h>
+
+#include <uapi/linux/tdx-guest.h>
+
+#include <asm/cpu_device_id.h>
+#include <asm/tdx.h>
+
+static long tdx_get_report0(struct tdx_report_req __user *req)
+{
+ u8 *reportdata, *tdreport;
+ long ret;
+
+ reportdata = kmalloc(TDX_REPORTDATA_LEN, GFP_KERNEL);
+ if (!reportdata)
+ return -ENOMEM;
+
+ tdreport = kzalloc(TDX_REPORT_LEN, GFP_KERNEL);
+ if (!tdreport) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ if (copy_from_user(reportdata, req->reportdata, TDX_REPORTDATA_LEN)) {
+ ret = -EFAULT;
+ goto out;
+ }
+
+ /* Generate TDREPORT0 using "TDG.MR.REPORT" TDCALL */
+ ret = tdx_mcall_get_report0(reportdata, tdreport);
+ if (ret)
+ goto out;
+
+ if (copy_to_user(req->tdreport, tdreport, TDX_REPORT_LEN))
+ ret = -EFAULT;
+
+out:
+ kfree(reportdata);
+ kfree(tdreport);
+
+ return ret;
+}
+
+static long tdx_guest_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+ switch (cmd) {
+ case TDX_CMD_GET_REPORT0:
+ return tdx_get_report0((struct tdx_report_req __user *)arg);
+ default:
+ return -ENOTTY;
+ }
+}
+
+static const struct file_operations tdx_guest_fops = {
+ .owner = THIS_MODULE,
+ .unlocked_ioctl = tdx_guest_ioctl,
+ .llseek = no_llseek,
+};
+
+static struct miscdevice tdx_misc_dev = {
+ .name = KBUILD_MODNAME,
+ .minor = MISC_DYNAMIC_MINOR,
+ .fops = &tdx_guest_fops,
+};
+
+static const struct x86_cpu_id tdx_guest_ids[] = {
+ X86_MATCH_FEATURE(X86_FEATURE_TDX_GUEST, NULL),
+ {}
+};
+MODULE_DEVICE_TABLE(x86cpu, tdx_guest_ids);
+
+static int __init tdx_guest_init(void)
+{
+ if (!x86_match_cpu(tdx_guest_ids))
+ return -ENODEV;
+
+ return misc_register(&tdx_misc_dev);
+}
+module_init(tdx_guest_init);
+
+static void __exit tdx_guest_exit(void)
+{
+ misc_deregister(&tdx_misc_dev);
+}
+module_exit(tdx_guest_exit);
+
+MODULE_AUTHOR("Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>");
+MODULE_DESCRIPTION("TDX Guest Driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index aa90bd0..69a30ce 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -841,8 +841,22 @@
struct virtio_balloon *vb = container_of(nb,
struct virtio_balloon, oom_nb);
unsigned long *freed = parm;
-
- *freed += leak_balloon(vb, VIRTIO_BALLOON_OOM_NR_PAGES) /
+ s64 pages_to_release = VIRTIO_BALLOON_OOM_NR_PAGES;
+ if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_MIN_BALLOON_SIZE)) {
+ /* Minimin balloon size - VIRTIO_BALLOON_F_MIN_BALLOON_SIZE */
+ u32 min_balloon_pages;
+ // get the latest dynamic_limit from config space.
+ virtio_cread_le(vb->vdev, struct virtio_balloon_config,
+ min_balloon_pages, &min_balloon_pages);
+ s64 min_pages = min_balloon_pages;
+ if (min_pages >= vb->num_pages) {
+ printk("num_pages: %d <= min_balloon_pages %u \n",
+ vb->num_pages, min_balloon_pages);
+ return NOTIFY_STOP;
+ }
+ pages_to_release = min(vb->num_pages - min_pages, pages_to_release);
+ }
+ *freed += leak_balloon(vb, pages_to_release) /
VIRTIO_BALLOON_PAGES_PER_PAGE;
update_balloon_size(vb);
@@ -1108,6 +1122,7 @@
VIRTIO_BALLOON_F_FREE_PAGE_HINT,
VIRTIO_BALLOON_F_PAGE_POISON,
VIRTIO_BALLOON_F_REPORTING,
+ VIRTIO_BALLOON_F_MIN_BALLOON_SIZE,
};
static struct virtio_driver virtio_balloon_driver = {
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index c214fe0..95ccfa2 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -996,6 +996,14 @@
return rc;
}
+static int ecryptfs_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
static int ecryptfs_getattr(struct user_namespace *mnt_userns,
const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int flags)
@@ -1004,8 +1012,8 @@
struct kstat lower_stat;
int rc;
- rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
- request_mask, flags);
+ rc = ecryptfs_do_getattr(ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
if (!rc) {
fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 601e097..188bd91 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4344,9 +4344,6 @@
if (!IS_EXT3_SB(sb) && !IS_EXT2_SB(sb) &&
((def_mount_opts & EXT4_DEFM_NODELALLOC) == 0))
set_opt(sb, DELALLOC);
-
- if (sb->s_blocksize == PAGE_SIZE)
- set_opt(sb, DIOREAD_NOLOCK);
}
static int ext4_handle_clustersize(struct super_block *sb)
diff --git a/fs/file_table.c b/fs/file_table.c
index dd88701..f8232d6 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -318,6 +318,7 @@
}
if (file->f_op->release)
file->f_op->release(inode, file);
+ security_file_pre_free(file);
if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
!(mode & FMODE_PATH))) {
cdev_put(inode->i_cdev);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index c062f7e..4e979fc 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -168,7 +168,7 @@
type = ovl_path_real(dentry, &realpath);
old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getattr(&realpath, stat, request_mask, flags);
+ err = ovl_do_getattr(&realpath, stat, request_mask, flags);
if (err)
goto out;
@@ -193,8 +193,8 @@
(!is_dir ? STATX_NLINK : 0);
ovl_path_lower(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerstat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerstat, lowermask,
+ flags);
if (err)
goto out;
@@ -243,8 +243,8 @@
u32 lowermask = STATX_BLOCKS;
ovl_path_lowerdata(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerdatastat,
- lowermask, flags);
+ err = ovl_do_getattr(&realpath, &lowerdatastat,
+ lowermask, flags);
if (err)
goto out;
stat->blocks = lowerdatastat.blocks;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index a3c59ac..073f5e9 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -348,6 +348,13 @@
!ofs->config.redirect_dir && ofs->config.xino != OVL_XINO_ON);
}
+static inline int ovl_do_getattr(const struct path *path, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ if (flags & AT_GETATTR_NOSEC)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
/* util.c */
int ovl_want_write(struct dentry *dentry);
diff --git a/fs/proc/Kconfig b/fs/proc/Kconfig
index 32b1116..65a7a80 100644
--- a/fs/proc/Kconfig
+++ b/fs/proc/Kconfig
@@ -108,3 +108,11 @@
config PROC_CPU_RESCTRL
def_bool n
depends on PROC_FS
+
+config PROC_SELF_MEM_READONLY
+ bool "Force /proc/<pid>/mem paths to be read-only"
+ default y
+ help
+ When enabled, attempts to open /proc/self/mem for write access
+ will always fail. Write access to this file allows bypassing
+ of memory map permissions (such as modifying read-only code).
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 74442e0..413d5f4 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -150,6 +150,12 @@
NULL, &proc_pid_attr_operations, \
{ .lsm = LSM })
+#ifdef CONFIG_PROC_SELF_MEM_READONLY
+# define PROC_PID_MEM_MODE S_IRUSR
+#else
+# define PROC_PID_MEM_MODE S_IRUSR|S_IWUSR
+#endif
+
/*
* Count the number of hardlinks for the pid_entry table, excluding the .
* and .. links.
@@ -898,7 +904,11 @@
static ssize_t mem_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
+#ifdef CONFIG_PROC_SELF_MEM_READONLY
+ return -EACCES;
+#else
return mem_rw(file, (char __user*)buf, count, ppos, 1);
+#endif
}
loff_t mem_lseek(struct file *file, loff_t offset, int orig)
@@ -3269,7 +3279,7 @@
#ifdef CONFIG_NUMA
REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
#endif
- REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
+ REG("mem", PROC_PID_MEM_MODE, proc_mem_operations),
LNK("cwd", proc_cwd_link),
LNK("root", proc_root_link),
LNK("exe", proc_exe_link),
@@ -3619,7 +3629,7 @@
#ifdef CONFIG_NUMA
REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations),
#endif
- REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations),
+ REG("mem", PROC_PID_MEM_MODE, proc_mem_operations),
LNK("cwd", proc_cwd_link),
LNK("root", proc_root_link),
LNK("exe", proc_exe_link),
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index 4409601..e9c05b4 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -155,6 +155,11 @@
global_zone_page_state(NR_FREE_CMA_PAGES));
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ show_val_kb(m, "Unaccepted: ",
+ global_zone_page_state(NR_UNACCEPTED));
+#endif
+
hugetlb_report_meminfo(m);
arch_report_meminfo(m);
diff --git a/fs/stat.c b/fs/stat.c
index ef50573..db5d306 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -18,6 +18,7 @@
#include <linux/syscalls.h>
#include <linux/pagemap.h>
#include <linux/compat.h>
+#include <linux/iversion.h>
#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -119,10 +120,16 @@
stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT |
STATX_ATTR_DAX);
+ if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) {
+ stat->result_mask |= STATX_CHANGE_COOKIE;
+ stat->change_cookie = inode_query_iversion(inode);
+ }
+
mnt_userns = mnt_user_ns(path->mnt);
if (inode->i_op->getattr)
return inode->i_op->getattr(mnt_userns, path, stat,
- request_mask, query_flags);
+ request_mask,
+ query_flags | AT_GETATTR_NOSEC);
generic_fillattr(mnt_userns, inode, stat);
return 0;
@@ -155,6 +162,9 @@
{
int retval;
+ if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC))
+ return -EPERM;
+
retval = security_inode_getattr(path);
if (retval)
return retval;
@@ -599,9 +609,11 @@
memset(&tmp, 0, sizeof(tmp));
- tmp.stx_mask = stat->result_mask;
+ /* STATX_CHANGE_COOKIE is kernel-only for now */
+ tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE;
tmp.stx_blksize = stat->blksize;
- tmp.stx_attributes = stat->attributes;
+ /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */
+ tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC;
tmp.stx_nlink = stat->nlink;
tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid);
tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid);
@@ -640,6 +652,11 @@
if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE)
return -EINVAL;
+ /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests
+ * from userland.
+ */
+ mask &= ~STATX_CHANGE_COOKIE;
+
error = vfs_statx(dfd, filename, flags, &stat, mask);
if (error)
return error;
diff --git a/google/certs/lakitu_root_cert.pem b/google/certs/lakitu_root_cert.pem
new file mode 100644
index 0000000..c7c11b7
--- /dev/null
+++ b/google/certs/lakitu_root_cert.pem
@@ -0,0 +1,26 @@
+-----BEGIN CERTIFICATE-----
+MIIEdzCCAt+gAwIBAgIBATANBgkqhkiG9w0BAQsFADA/MQswCQYDVQQGEwJVUzEP
+MA0GA1UEChMGR29vZ2xlMR8wHQYDVQQDExZDb250YWluZXItT3B0aW1pemVkIE9T
+MB4XDTE5MDExNTAwMzA0MloXDTI5MDExNjAwMzA0MlowPzELMAkGA1UEBhMCVVMx
+DzANBgNVBAoTBkdvb2dsZTEfMB0GA1UEAxMWQ29udGFpbmVyLU9wdGltaXplZCBP
+UzCCAaIwDQYJKoZIhvcNAQEBBQADggGPADCCAYoCggGBALrwED7U5BVwSxRBaaaT
+anJ1V/NrEOL0SRVO6GWPLEK4eNEpIlVG5wvvNVXHZo01pBjpjaU+E7bc7buEzSKa
+62jqIiIe80HbDZ9qmPqsWVtaDiT/bROWpGmJ+oYLqBB20z1vPAhur5KOVOvMZqNG
+MR2Hxg3EvoS2PwcXtsJ3rca3SR7oF7BvGGRaDoeO/1cxi1OLlfe8qIaw5VQquXvm
+NqQchtp/8AN6UOEBCql7fTVf/DINpDa/QhWqnIvGfgcT+84caX+2RIys3JIojZ9j
+sWTLnk/BtPOdRRN4SuJXqxOcdig8RHZPP4K7evdn83IQdOzdL6I5P395QhCNU1hT
+rzkJ6DKIqwzoC9fPlav1zec14xSO8nS8Obwr0af3dYNDZFKc8VZdgCaVckSg6E6w
+3R+6dn2Vdto+r3mSYydpDT6obeMZOwNhQfTFltZ7O+KC98AhnwLu2+ReSPXLHazT
+HYLOrONHmfD/mHbPdzMHrftFUqtmbdIt/45gdEBYIjt51wIDAQABo34wfDAOBgNV
+HQ8BAf8EBAMCAoQwEgYDVR0TAQH/BAgwBgEB/wIBADApBgNVHQ4EIgQgLv6kk/wU
+izWFnQvIw91xoTn8erQRKD4G9sELWmsX41UwKwYDVR0jBCQwIoAgLv6kk/wUizWF
+nQvIw91xoTn8erQRKD4G9sELWmsX41UwDQYJKoZIhvcNAQELBQADggGBAAs4IGt0
+qmpKxfzsM3iN0zZCMpH0/0M7O8DkEXnPFDyY55IuH64hkcPjeA1VNHoCiPAsDlWy
+u4lgJ33aEpWJx/0F5V3DSgszu/JT4/DByi9EP0jli9AT0iV9RCRDTfOWkIoOb6EC
+7OLUmR+gYRUpMfzQEmo5sdrl0Ju9K2LOj3lqyenrWOOXEJ3yR3Z922mxh6SWC7Hw
+ho8CMsaq4PWMycsSOvlr/j+SWCZCxgLdKfm/PmSziD7lexnFytDD52c6vYKXf58e
+v41LzO6s/0DTT6cLE7rHXmRPwBBWjyFpJ8GfRF7o8aHKyzdJtLVlG02PU/uJ/UCT
+DGWVSCqbd3+AkP8IfZ2CZHRpJ15DkkprS9f5nw8o/nHIdjPdbyhTELCsF4kfn8/o
+6jS8HVDv5BvfNpoiIh5VCMnAcaQ9aoCEi3oyPm05/ApFeIRDg7Wu8PpCyNEM6bTp
+I1DxUItYhLYHeunme6EK0aLnDsrtMapPPNyBcWa92GgmPQnfByETYuoO6w==
+-----END CERTIFICATE-----
diff --git a/google/kokoro/build_flavors/default.cfg b/google/kokoro/build_flavors/default.cfg
new file mode 100644
index 0000000..59f085b
--- /dev/null
+++ b/google/kokoro/build_flavors/default.cfg
@@ -0,0 +1,2 @@
+KERNEL_CONFIGS="lakitu_defconfig"
+BUILD_COMPONENTS="-k -H -d"
diff --git a/google/kokoro/build_flavors/xfstest.cfg b/google/kokoro/build_flavors/xfstest.cfg
new file mode 100644
index 0000000..533dd32
--- /dev/null
+++ b/google/kokoro/build_flavors/xfstest.cfg
@@ -0,0 +1,2 @@
+KERNEL_CONFIGS="lakitu_defconfig,google/xfstest.config"
+BUILD_COMPONENTS="-k"
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 7ad6f51..a025b19 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -81,8 +81,8 @@
#define RO_EXCEPTION_TABLE
#endif
-/* Align . to a 8 byte boundary equals to maximum function alignment. */
-#define ALIGN_FUNCTION() . = ALIGN(8)
+/* Align . function alignment. */
+#define ALIGN_FUNCTION() . = ALIGN(CONFIG_FUNCTION_ALIGNMENT)
/*
* LD_DEAD_CODE_DATA_ELIMINATION option enables -fdata-sections, which
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 3608992..505acdd3b 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -96,6 +96,17 @@
struct audit_ntp_data {};
#endif
+struct audit_task_info {
+ kuid_t loginuid;
+ unsigned int sessionid;
+ u64 contid;
+#ifdef CONFIG_AUDITSYSCALL
+ struct audit_context *ctx;
+#endif
+};
+
+extern struct audit_task_info init_struct_audit;
+
enum audit_nfcfgop {
AUDIT_XT_OP_REGISTER,
AUDIT_XT_OP_REPLACE,
@@ -153,6 +164,9 @@
#ifdef CONFIG_AUDIT
/* These are defined in audit.c */
/* Public API */
+extern int audit_alloc(struct task_struct *task);
+extern void audit_free(struct task_struct *task);
+extern void __init audit_task_init(void);
extern __printf(4, 5)
void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
const char *fmt, ...);
@@ -196,12 +210,25 @@
static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
{
- return tsk->loginuid;
+ if (!tsk->audit)
+ return INVALID_UID;
+ return tsk->audit->loginuid;
}
static inline unsigned int audit_get_sessionid(struct task_struct *tsk)
{
- return tsk->sessionid;
+ if (!tsk->audit)
+ return AUDIT_SID_UNSET;
+ return tsk->audit->sessionid;
+}
+
+extern int audit_set_contid(struct task_struct *tsk, u64 contid);
+
+static inline u64 audit_get_contid(struct task_struct *tsk)
+{
+ if (!tsk->audit)
+ return AUDIT_CID_UNSET;
+ return tsk->audit->contid;
}
extern u32 audit_enabled;
@@ -209,6 +236,14 @@
extern int audit_signal_info(int sig, struct task_struct *t);
#else /* CONFIG_AUDIT */
+static inline int audit_alloc(struct task_struct *task)
+{
+ return 0;
+}
+static inline void audit_free(struct task_struct *task)
+{ }
+static inline void __init audit_task_init(void)
+{ }
static inline __printf(4, 5)
void audit_log(struct audit_context *ctx, gfp_t gfp_mask, int type,
const char *fmt, ...)
@@ -260,6 +295,11 @@
return AUDIT_SID_UNSET;
}
+static inline u64 audit_get_contid(struct task_struct *tsk)
+{
+ return AUDIT_CID_UNSET;
+}
+
#define audit_enabled AUDIT_OFF
static inline int audit_signal_info(int sig, struct task_struct *t)
@@ -306,12 +346,14 @@
static inline void audit_set_context(struct task_struct *task, struct audit_context *ctx)
{
- task->audit_context = ctx;
+ task->audit->ctx = ctx;
}
static inline struct audit_context *audit_context(void)
{
- return current->audit_context;
+ if (!current->audit)
+ return NULL;
+ return current->audit->ctx;
}
static inline bool audit_dummy_context(void)
@@ -319,11 +361,6 @@
void *p = audit_context();
return !p || *(int *)p;
}
-static inline void audit_free(struct task_struct *task)
-{
- if (unlikely(task->audit_context))
- __audit_free(task);
-}
static inline void audit_uring_entry(u8 op)
{
/*
@@ -716,4 +753,19 @@
return uid_valid(audit_get_loginuid(tsk));
}
+static inline bool audit_contid_valid(u64 contid)
+{
+ return contid != AUDIT_CID_UNSET;
+}
+
+static inline bool audit_contid_set(struct task_struct *tsk)
+{
+ return audit_contid_valid(audit_get_contid(tsk));
+}
+
+static inline void audit_log_string(struct audit_buffer *ab, const char *buf)
+{
+ audit_log_n_string(ab, buf, strlen(buf));
+}
+
#endif
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 898b3458..b831264 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -76,12 +76,6 @@
#endif
/*
- * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute
- * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute
- */
-#define __cold __attribute__((__cold__))
-
-/*
* Note the long name.
*
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index eb04662..3df3764 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -78,6 +78,33 @@
/* Attributes */
#include <linux/compiler_attributes.h>
+#if CONFIG_FUNCTION_ALIGNMENT > 0
+#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT)
+#else
+#define __function_aligned
+#endif
+
+/*
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute
+ *
+ * When -falign-functions=N is in use, we must avoid the cold attribute as
+ * contemporary versions of GCC drop the alignment for cold functions. Worse,
+ * GCC can implicitly mark callees of cold functions as cold themselves, so
+ * it's not sufficient to add __function_aligned here as that will not ensure
+ * that callees are correctly aligned.
+ *
+ * See:
+ *
+ * https://lore.kernel.org/lkml/Y77%2FqVgvaJidFpYt@FVFF77S0Q05N
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345#c9
+ */
+#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0)
+#define __cold __attribute__((__cold__))
+#else
+#define __cold
+#endif
+
/* Builtins */
/*
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 9c31f1f..bdfc1a1 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -22,6 +22,9 @@
#include <linux/fs.h>
#include <linux/dma-fence.h>
#include <linux/wait.h>
+#include <linux/uio.h>
+#include <linux/genalloc.h>
+#include <linux/xarray.h>
struct device;
struct dma_buf;
@@ -549,6 +552,30 @@
void *priv;
};
+struct dma_buf_pages_file_priv {
+ /* fields for dmabuf */
+ struct dma_buf *dmabuf;
+ struct dma_buf_attachment *attachment;
+ struct sg_table *sgt;
+ struct pci_dev *pci_dev;
+ enum dma_data_direction direction;
+
+ /* fields for dma-buf page */
+ size_t num_pages;
+ struct page *pages;
+ struct dev_pagemap pgmap;
+
+ int has_page_pool;
+
+ /* fields for Tx */
+ struct iov_iter tx_iter;
+ struct bio_vec *tx_bv;
+
+ /* fields for Rx */
+ struct gen_pool *page_pool;
+ struct xarray bound_rxq_list;
+};
+
/**
* DEFINE_DMA_BUF_EXPORT_INFO - helper macro for exporters
* @name: export-info name
@@ -638,4 +665,47 @@
unsigned long);
int dma_buf_vmap(struct dma_buf *dmabuf, struct iosys_map *map);
void dma_buf_vunmap(struct dma_buf *dmabuf, struct iosys_map *map);
+
+#ifdef CONFIG_DMA_SHARED_BUFFER
+extern const struct file_operations dma_buf_pages_fops;
+extern const struct dev_pagemap_ops dma_buf_pgmap_ops;
+
+static inline bool is_dma_buf_pages_file(struct file *file)
+{
+ return file->f_op == &dma_buf_pages_fops;
+}
+
+static inline bool is_dma_buf_page(struct page *page)
+{
+ return (is_zone_device_page(page) && page->pgmap &&
+ page->pgmap->ops == &dma_buf_pgmap_ops);
+}
+#else
+static bool is_dma_buf_page(struct page *page)
+{
+ return false;
+}
+
+static bool is_dma_buf_pages_file(struct file *file)
+{
+ return false;
+}
+#endif
+
+static inline int dma_buf_map_sg(struct device *dev, struct scatterlist *sg,
+ int nents, enum dma_data_direction dir)
+{
+ struct scatterlist *s;
+ int i;
+
+ for_each_sg(sg, s, nents, i) {
+ struct page *pg = sg_page(s);
+
+ s->dma_address = (dma_addr_t)pg->zone_device_data;
+ sg_dma_len(s) = s->length;
+ }
+
+ return nents;
+}
+
#endif /* __DMA_BUF_H__ */
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 4e1bfee..6a5c15e 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -108,7 +108,8 @@
#define EFI_MEMORY_MAPPED_IO_PORT_SPACE 12
#define EFI_PAL_CODE 13
#define EFI_PERSISTENT_MEMORY 14
-#define EFI_MAX_MEMORY_TYPE 15
+#define EFI_UNACCEPTED_MEMORY 15
+#define EFI_MAX_MEMORY_TYPE 16
/* Attribute values: */
#define EFI_MEMORY_UC ((u64)0x0000000000000001ULL) /* uncached */
@@ -416,6 +417,7 @@
#define LINUX_EFI_MOK_VARIABLE_TABLE_GUID EFI_GUID(0xc451ed2b, 0x9694, 0x45d3, 0xba, 0xba, 0xed, 0x9f, 0x89, 0x88, 0xa3, 0x89)
#define LINUX_EFI_COCO_SECRET_AREA_GUID EFI_GUID(0xadf956ad, 0xe98c, 0x484c, 0xae, 0x11, 0xb5, 0x1c, 0x7d, 0x33, 0x64, 0x47)
#define LINUX_EFI_BOOT_MEMMAP_GUID EFI_GUID(0x800f683f, 0xd08b, 0x423a, 0xa2, 0x93, 0x96, 0x5c, 0x3c, 0x6f, 0xe2, 0xb4)
+#define LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID EFI_GUID(0xd5d1de3c, 0x105c, 0x44f9, 0x9e, 0xa9, 0xbc, 0xef, 0x98, 0x12, 0x00, 0x31)
#define RISCV_EFI_BOOT_PROTOCOL_GUID EFI_GUID(0xccd15fec, 0x6f73, 0x4eec, 0x83, 0x95, 0x3e, 0x69, 0xe4, 0xb9, 0x40, 0xbf)
@@ -533,6 +535,14 @@
efi_memory_desc_t map[];
};
+struct efi_unaccepted_memory {
+ u32 version;
+ u32 unit_size;
+ u64 phys_base;
+ u64 size;
+ unsigned long bitmap[];
+};
+
/*
* Architecture independent structure for describing a memory map for the
* benefit of efi_memmap_init_early(), and for passing context between
@@ -631,6 +641,7 @@
unsigned long tpm_final_log; /* TPM2 Final Events Log table */
unsigned long mokvar_table; /* MOK variable config table */
unsigned long coco_secret; /* Confidential computing secret table */
+ unsigned long unaccepted; /* Unaccepted memory table */
efi_get_time_t *get_time;
efi_set_time_t *set_time;
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 8128059..3e56cb6 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -37,9 +37,11 @@
static inline void ftrace_boot_snapshot(void) { }
#endif
-#ifdef CONFIG_FUNCTION_TRACER
struct ftrace_ops;
struct ftrace_regs;
+struct dyn_ftrace;
+
+#ifdef CONFIG_FUNCTION_TRACER
/*
* If the arch's mcount caller does not support all of ftrace's
* features, then it must call an indirect function that
@@ -56,6 +58,9 @@
void arch_ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
#endif
+extern const struct ftrace_ops ftrace_nop_ops;
+extern const struct ftrace_ops ftrace_list_ops;
+struct ftrace_ops *ftrace_find_unique_ops(struct dyn_ftrace *rec);
#endif /* CONFIG_FUNCTION_TRACER */
/* Main tracing buffer and events set up */
@@ -110,12 +115,11 @@
#define arch_ftrace_get_regs(fregs) (&(fregs)->regs)
/*
- * ftrace_instruction_pointer_set() is to be defined by the architecture
- * if to allow setting of the instruction pointer from the ftrace_regs
- * when HAVE_DYNAMIC_FTRACE_WITH_ARGS is set and it supports
- * live kernel patching.
+ * ftrace_regs_set_instruction_pointer() is to be defined by the architecture
+ * if to allow setting of the instruction pointer from the ftrace_regs when
+ * HAVE_DYNAMIC_FTRACE_WITH_ARGS is set and it supports live kernel patching.
*/
-#define ftrace_instruction_pointer_set(fregs, ip) do { } while (0)
+#define ftrace_regs_set_instruction_pointer(fregs, ip) do { } while (0)
#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs)
@@ -126,6 +130,35 @@
return arch_ftrace_get_regs(fregs);
}
+/*
+ * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
+ * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
+ */
+static __always_inline bool ftrace_regs_has_args(struct ftrace_regs *fregs)
+{
+ if (IS_ENABLED(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS))
+ return true;
+
+ return ftrace_get_regs(fregs) != NULL;
+}
+
+#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+#define ftrace_regs_get_instruction_pointer(fregs) \
+ instruction_pointer(ftrace_get_regs(fregs))
+#define ftrace_regs_get_argument(fregs, n) \
+ regs_get_kernel_argument(ftrace_get_regs(fregs), n)
+#define ftrace_regs_get_stack_pointer(fregs) \
+ kernel_stack_pointer(ftrace_get_regs(fregs))
+#define ftrace_regs_return_value(fregs) \
+ regs_return_value(ftrace_get_regs(fregs))
+#define ftrace_regs_set_return_value(fregs, ret) \
+ regs_set_return_value(ftrace_get_regs(fregs), ret)
+#define ftrace_override_function_with_return(fregs) \
+ override_function_with_return(ftrace_get_regs(fregs))
+#define ftrace_regs_query_register_offset(name) \
+ regs_query_register_offset(name)
+#endif
+
typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
@@ -208,6 +241,12 @@
FTRACE_OPS_FL_DIRECT = BIT(17),
};
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
+#define FTRACE_OPS_FL_SAVE_ARGS FTRACE_OPS_FL_SAVE_REGS
+#else
+#define FTRACE_OPS_FL_SAVE_ARGS 0
+#endif
+
/*
* FTRACE_OPS_CMD_* commands allow the ftrace core logic to request changes
* to a ftrace_ops. Note, the requests may fail.
@@ -288,6 +327,9 @@
unsigned long trampoline_size;
struct list_head list;
ftrace_ops_func_t ops_func;
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ unsigned long direct_call;
+#endif
#endif
};
@@ -362,74 +404,42 @@
unsigned long direct; /* for direct lookup only */
};
-struct dyn_ftrace;
-
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
extern int ftrace_direct_func_count;
-int register_ftrace_direct(unsigned long ip, unsigned long addr);
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr);
-int modify_ftrace_direct(unsigned long ip, unsigned long old_addr, unsigned long new_addr);
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr);
-int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
- struct dyn_ftrace *rec,
- unsigned long old_addr,
- unsigned long new_addr);
unsigned long ftrace_find_rec_direct(unsigned long ip);
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr);
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr,
+ bool free_filters);
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr);
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr);
+
+void ftrace_stub_direct_tramp(void);
#else
struct ftrace_ops;
# define ftrace_direct_func_count 0
-static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
- return -ENOTSUPP;
-}
-static inline int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
- return -ENOTSUPP;
-}
-static inline int modify_ftrace_direct(unsigned long ip,
- unsigned long old_addr, unsigned long new_addr)
-{
- return -ENOTSUPP;
-}
-static inline struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
- return NULL;
-}
-static inline int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
- struct dyn_ftrace *rec,
- unsigned long old_addr,
- unsigned long new_addr)
-{
- return -ENODEV;
-}
static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
{
return 0;
}
-static inline int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
{
return -ENODEV;
}
-static inline int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr,
+ bool free_filters)
{
return -ENODEV;
}
-static inline int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
{
return -ENODEV;
}
-static inline int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+static inline int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
{
return -ENODEV;
}
-#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
-#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
/*
* This must be implemented by the architecture.
* It is the way the ftrace direct_ops helper, when called
@@ -443,9 +453,9 @@
* the return from the trampoline jump to the direct caller
* instead of going back to the function it just traced.
*/
-static inline void arch_ftrace_set_direct_caller(struct pt_regs *regs,
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
unsigned long addr) { }
-#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
#ifdef CONFIG_STACK_TRACER
@@ -536,6 +546,8 @@
* IPMODIFY - the record allows for the IP address to be changed.
* DISABLED - the record is not ready to be touched yet
* DIRECT - there is a direct function to call
+ * CALL_OPS - the record can use callsite-specific ops
+ * CALL_OPS_EN - the function is set up to use callsite-specific ops
*
* When a new ftrace_ops is registered and wants a function to save
* pt_regs, the rec->flags REGS is set. When the function has been
@@ -553,9 +565,11 @@
FTRACE_FL_DISABLED = (1UL << 25),
FTRACE_FL_DIRECT = (1UL << 24),
FTRACE_FL_DIRECT_EN = (1UL << 23),
+ FTRACE_FL_CALL_OPS = (1UL << 22),
+ FTRACE_FL_CALL_OPS_EN = (1UL << 21),
};
-#define FTRACE_REF_MAX_SHIFT 23
+#define FTRACE_REF_MAX_SHIFT 21
#define FTRACE_REF_MAX ((1UL << FTRACE_REF_MAX_SHIFT) - 1)
#define ftrace_rec_count(rec) ((rec)->flags & FTRACE_REF_MAX)
@@ -793,7 +807,8 @@
*/
extern int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr);
-#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#if defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS) || \
+ defined(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)
/**
* ftrace_modify_call - convert from one addr to another (no nop)
* @rec: the call site record (e.g. mcount/fentry)
@@ -806,6 +821,9 @@
* what we expect it to be, and then on success of the compare,
* it should write to the location.
*
+ * When using call ops, this is called when the associated ops change, even
+ * when (addr == old_addr).
+ *
* The code segment at @rec->ip should be a caller to @old_addr
*
* Return must be:
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 1feab61..5c8865b 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -69,8 +69,8 @@
#endif
#ifndef __ALIGN
-#define __ALIGN .align 4,0x90
-#define __ALIGN_STR ".align 4,0x90"
+#define __ALIGN .balign CONFIG_FUNCTION_ALIGNMENT
+#define __ALIGN_STR __stringify(__ALIGN)
#endif
#ifdef __ASSEMBLY__
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 02b19c5..71725d8 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -163,6 +163,7 @@
LSM_HOOK(int, 0, file_permission, struct file *file, int mask)
LSM_HOOK(int, 0, file_alloc_security, struct file *file)
LSM_HOOK(void, LSM_RET_VOID, file_free_security, struct file *file)
+LSM_HOOK(void, LSM_RET_VOID, file_pre_free_security, struct file *file)
LSM_HOOK(int, 0, file_ioctl, struct file *file, unsigned int cmd,
unsigned long arg)
LSM_HOOK(int, 0, mmap_addr, unsigned long addr)
@@ -180,6 +181,7 @@
LSM_HOOK(int, 0, file_open, struct file *file)
LSM_HOOK(int, 0, task_alloc, struct task_struct *task,
unsigned long clone_flags)
+LSM_HOOK(void, LSM_RET_VOID, task_post_alloc, struct task_struct *task)
LSM_HOOK(void, LSM_RET_VOID, task_free, struct task_struct *task)
LSM_HOOK(int, 0, cred_alloc_blank, struct cred *cred, gfp_t gfp)
LSM_HOOK(void, LSM_RET_VOID, cred_free, struct cred *cred)
@@ -221,6 +223,7 @@
LSM_HOOK(int, 0, task_movememory, struct task_struct *p)
LSM_HOOK(int, 0, task_kill, struct task_struct *p, struct kernel_siginfo *info,
int sig, const struct cred *cred)
+LSM_HOOK(void, LSM_RET_VOID, task_exit, struct task_struct *p)
LSM_HOOK(int, -ENOSYS, task_prctl, int option, unsigned long arg2,
unsigned long arg3, unsigned long arg4, unsigned long arg5)
LSM_HOOK(void, LSM_RET_VOID, task_to_inode, struct task_struct *p,
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 4ec80b9..abac12c 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -534,6 +534,10 @@
* @file_free_security:
* Deallocate and free any security structures stored in file->f_security.
* @file contains the file structure being modified.
+ * @file_pre_free_security:
+ * Perform any logging or LSM state updates for a file being deleted
+ * using fields of the file before they have been cleared.
+ * @file contains the file structure being freed
* @file_ioctl:
* @file contains the file structure.
* @cmd contains the operation to perform.
@@ -610,6 +614,10 @@
* @clone_flags contains the flags indicating what should be shared.
* Handle allocation of task-related resources.
* Returns a zero on success, negative values on failure.
+ * @task_post_alloc:
+ * @task task being allocated.
+ * Handle allocation of task-related resources after all task fields are
+ * filled in.
* @task_free:
* @task task about to be freed.
* Handle release of task-related resources. (Note that this can be called
@@ -791,6 +799,9 @@
* @cred contains the cred of the process where the signal originated, or
* NULL if the current task is the originator.
* Return 0 if permission is granted.
+ * @task_exit:
+ * Called early when a task is exiting before all state is lost.
+ * @p contains the task_struct for process.
* @task_prctl:
* Check permission before performing a process control operation on the
* current process.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index eefb094..b65cce93 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3529,4 +3529,23 @@
}
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end);
+void accept_memory(phys_addr_t start, phys_addr_t end);
+
+#else
+
+static inline bool range_contains_unaccepted_memory(phys_addr_t start,
+ phys_addr_t end)
+{
+ return false;
+}
+
+static inline void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+}
+
+#endif
+
#endif /* _LINUX_MM_H */
diff --git a/include/linux/mmu_context.h b/include/linux/mmu_context.h
index b9b970f..01c77ba 100644
--- a/include/linux/mmu_context.h
+++ b/include/linux/mmu_context.h
@@ -5,6 +5,11 @@
#include <asm/mmu_context.h>
#include <asm/mmu.h>
+struct mm_struct;
+
+void use_mm(struct mm_struct *mm);
+void unuse_mm(struct mm_struct *mm);
+
/* Architectures that care about IRQ state in switch_mm can override this. */
#ifndef switch_mm_irqs_off
# define switch_mm_irqs_off switch_mm
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 323ee36..341316c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -152,6 +152,9 @@
NR_ZSPAGES, /* allocated in zsmalloc */
#endif
NR_FREE_CMA_PAGES,
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ NR_UNACCEPTED,
+#endif
NR_VM_ZONE_STAT_ITEMS };
enum node_stat_item {
@@ -822,6 +825,11 @@
/* free areas of different sizes */
struct free_area free_area[MAX_ORDER];
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ /* Pages to be accepted. All pages on the list are MAX_ORDER */
+ struct list_head unaccepted_pages;
+#endif
+
/* zone flags, see below */
unsigned long flags;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0373e09..d6e8103 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -51,6 +51,7 @@
#include <linux/rbtree.h>
#include <net/net_trackers.h>
#include <net/net_debug.h>
+#include <linux/dma-buf.h>
struct netpoll_info;
struct device;
@@ -785,8 +786,41 @@
#ifdef CONFIG_XDP_SOCKETS
struct xsk_buff_pool *pool;
#endif
+ struct file __rcu *dmabuf_pages;
} ____cacheline_aligned_in_smp;
+struct page *
+__netdev_rxq_alloc_page_from_dmabuf_pool(struct netdev_rx_queue *rxq,
+ unsigned int order);
+
+static inline struct page *netdev_rxq_alloc_dma_buf_page(struct netdev_rx_queue *rxq,
+ unsigned int order)
+{
+
+ /* Return NULL if we can't allocate a dma_buf page, instead of trying
+ * to fallback to alloc_page(). The reason we do this is because we
+ * don't want to confuse the caller with respect to whether the page
+ * they are getting is a dma_buf page or otherwise. dma_buf pages and
+ * devmem skbs require special handling, so the distinction is
+ * important. Let the caller fall back to allocating a non-dma_buf page
+ * if they know what they're doing.
+ */
+ if (unlikely(!rcu_access_pointer(rxq->dmabuf_pages)))
+ return NULL;
+
+ return __netdev_rxq_alloc_page_from_dmabuf_pool(rxq, order);
+}
+
+static inline void netdev_rxq_free_page(struct page *pg)
+{
+ if (is_dma_buf_page(pg)) {
+ put_page(pg);
+ return;
+ }
+
+ __free_page(pg);
+}
+
/*
* RX queue sysfs structures and functions.
*/
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 07481bb..2181f3c 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -31,6 +31,9 @@
struct ucounts *ucounts;
int reboot; /* group exit code if this pidns was rebooted */
struct ns_common ns;
+#ifdef CONFIG_SECURITY_CONTAINER_MONITOR
+ u64 cid; /* Main container identifier, zero if not assigned. */
+#endif
} __randomize_layout;
extern struct pid_namespace init_pid_ns;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0cac699..4083a72 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -39,7 +39,6 @@
#include <asm/kmap_size.h>
/* task_struct member predeclarations (sorted alphabetically): */
-struct audit_context;
struct backing_dev_info;
struct bio_list;
struct blk_plug;
@@ -1115,11 +1114,7 @@
struct callback_head *task_works;
#ifdef CONFIG_AUDIT
-#ifdef CONFIG_AUDITSYSCALL
- struct audit_context *audit_context;
-#endif
- kuid_t loginuid;
- unsigned int sessionid;
+ struct audit_task_info *audit;
#endif
struct seccomp seccomp;
struct syscall_user_dispatch syscall_dispatch;
@@ -1973,6 +1968,12 @@
extern int wake_up_process(struct task_struct *tsk);
extern void wake_up_new_task(struct task_struct *tsk);
+/*
+ * Wake up tsk and try to swap it into the current tasks place, which
+ * initially means just trying to migrate it to the current CPU.
+ */
+extern int wake_up_swap(struct task_struct *tsk);
+
#ifdef CONFIG_SMP
extern void kick_process(struct task_struct *tsk);
#else
diff --git a/include/linux/security.h b/include/linux/security.h
index a6c97cc..b909dbc 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -384,6 +384,7 @@
int security_file_permission(struct file *file, int mask);
int security_file_alloc(struct file *file);
void security_file_free(struct file *file);
+void security_file_pre_free(struct file *file);
int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
int security_mmap_file(struct file *file, unsigned long prot,
unsigned long flags);
@@ -398,6 +399,7 @@
int security_file_receive(struct file *file);
int security_file_open(struct file *file);
int security_task_alloc(struct task_struct *task, unsigned long clone_flags);
+void security_task_post_alloc(struct task_struct *task);
void security_task_free(struct task_struct *task);
int security_cred_alloc_blank(struct cred *cred, gfp_t gfp);
void security_cred_free(struct cred *cred);
@@ -437,6 +439,7 @@
int security_task_movememory(struct task_struct *p);
int security_task_kill(struct task_struct *p, struct kernel_siginfo *info,
int sig, const struct cred *cred);
+void security_task_exit(struct task_struct *p);
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5);
void security_task_to_inode(struct task_struct *p, struct inode *inode);
@@ -963,6 +966,9 @@
static inline void security_file_free(struct file *file)
{ }
+static inline void security_file_pre_free(struct file *file)
+{ }
+
static inline int security_file_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -1026,6 +1032,9 @@
return 0;
}
+static inline void security_task_post_alloc(struct task_struct *task)
+{ }
+
static inline void security_task_free(struct task_struct *task)
{ }
@@ -1192,6 +1201,9 @@
return 0;
}
+static inline void security_task_exit(struct task_struct *p)
+{ }
+
static inline int security_task_prctl(int option, unsigned long arg2,
unsigned long arg3,
unsigned long arg4,
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index d500ea9..7917e80 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -51,6 +51,7 @@
spinlock_t shrinklist_lock; /* Protects shrinklist */
struct list_head shrinklist; /* List of shinkable inodes */
unsigned long shrinklist_len; /* Length of shrinklist */
+ bool user_xattr; /* user.* xattrs are allowed */
};
static inline struct shmem_inode_info *SHMEM_I(struct inode *inode)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c30d419..40ac808 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -44,6 +44,7 @@
#endif
#include <net/net_debug.h>
#include <net/dropreason.h>
+#include <linux/dma-buf.h>
/**
* DOC: skb checksums
@@ -347,18 +348,12 @@
struct sk_buff;
-/* To allow 64K frame to be packed as single skb without frag_list we
- * require 64K/PAGE_SIZE pages plus 1 additional page to allow for
- * buffers which do not start on a page boundary.
- *
- * Since GRO uses frags we allocate at least 16 regardless of page
- * size.
- */
-#if (65536/PAGE_SIZE + 1) < 16
-#define MAX_SKB_FRAGS 16UL
-#else
-#define MAX_SKB_FRAGS (65536/PAGE_SIZE + 1)
+#ifndef CONFIG_MAX_SKB_FRAGS
+# define CONFIG_MAX_SKB_FRAGS 17
#endif
+
+#define MAX_SKB_FRAGS CONFIG_MAX_SKB_FRAGS
+
extern int sysctl_max_skb_frags;
/* Set skb_shinfo(skb)->gso_size to this in case you want skb_segment to
@@ -813,6 +808,8 @@
* the packet minus one that have been verified as
* CHECKSUM_UNNECESSARY (max 3)
* @scm_io_uring: SKB holds io_uring registered files
+ * @devmem: indicates that all the fragments in this skb is backed by
+ * device memory.
* @dst_pending_confirm: need to confirm neighbour
* @decrypted: Decrypted SKB
* @slow_gro: state present at GRO time, slower prepare step required
@@ -993,7 +990,7 @@
__u8 slow_gro:1;
__u8 csum_not_inet:1;
__u8 scm_io_uring:1;
-
+ __u8 devmem:1;
#ifdef CONFIG_NET_SCHED
__u16 tc_index; /* traffic control index */
#endif
@@ -1647,6 +1644,10 @@
struct msghdr *msg, int len,
struct ubuf_info *uarg);
+int skb_devmem_iter_stream(struct sock *sk, struct sk_buff *skb,
+ struct msghdr *msg, struct iov_iter *iov_iter, int len,
+ struct ubuf_info *uarg);
+
/* Internal */
#define skb_shinfo(SKB) ((struct skb_shared_info *)(skb_end_pointer(SKB)))
@@ -1754,6 +1755,12 @@
__skb_zcopy_downgrade_managed(skb);
}
+/* Return true if frags in this skb are not readable by the host. */
+static inline bool skb_frags_not_readable(const struct sk_buff *skb)
+{
+ return skb->devmem;
+}
+
static inline void skb_mark_not_on_list(struct sk_buff *skb)
{
skb->next = NULL;
@@ -3506,6 +3513,22 @@
skb_frag_off(frag) + offset, size, dir);
}
+/* Similar to skb_frag_dma_map, but handles devmem skbs correctly. */
+static inline dma_addr_t skb_devmem_frag_dma_map(struct device *dev,
+ const struct sk_buff *skb,
+ const skb_frag_t *frag,
+ size_t offset, size_t size,
+ enum dma_data_direction dir)
+{
+ if (unlikely(skb->devmem && is_dma_buf_page(skb_frag_page(frag)))) {
+ struct page *page = skb_frag_page(frag);
+ dma_addr_t dma_addr = (dma_addr_t)page->zone_device_data;
+
+ return dma_addr + skb_frag_off(frag) + offset;
+ }
+ return skb_frag_dma_map(dev, frag, offset, size, dir);
+}
+
static inline struct sk_buff *pskb_copy(struct sk_buff *skb,
gfp_t gfp_mask)
{
diff --git a/include/linux/socket.h b/include/linux/socket.h
index b3c5804..70ae187 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -323,6 +323,8 @@
* plain text and require encryption
*/
+#define MSG_SOCK_DEVMEM 0x2000000 /* don't copy devmem pages but return
+ * them as cmsg instead */
#define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */
#define MSG_SPLICE_PAGES 0x8000000 /* Splice the pages from the iterator in sendmsg() */
#define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */
diff --git a/include/linux/stat.h b/include/linux/stat.h
index ff277ce..5215057 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -52,6 +52,15 @@
u64 mnt_id;
u32 dio_mem_align;
u32 dio_offset_align;
+ u64 change_cookie;
};
+/* These definitions are internal to the kernel for now. Mainly used by nfsd. */
+
+/* mask values */
+#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */
+
+/* file attribute values */
+#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */
+
#endif
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 9cd289a..32c9559 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -172,6 +172,8 @@
return (struct tcp_request_sock *)req;
}
+#define TCP_RMEM_TO_WIN_SCALE 8
+
struct tcp_sock {
/* inet_connection_sock has to be the first member of tcp_sock */
struct inet_connection_sock inet_conn;
@@ -238,7 +240,7 @@
u32 window_clamp; /* Maximal window to advertise */
u32 rcv_ssthresh; /* Current window clamp */
-
+ u8 scaling_ratio; /* see tcp_win_from_space() */
/* Information of the most recently (s)acked skb */
struct tcp_rack {
u64 mstamp; /* (Re)sent time of the skb */
@@ -460,15 +462,17 @@
TCP_MTU_REDUCED_DEFERRED, /* tcp_v{4|6}_err() could not call
* tcp_v{4|6}_mtu_reduced()
*/
+ TCP_ACK_DEFERRED, /* TX pure ack is deferred */
};
enum tsq_flags {
- TSQF_THROTTLED = (1UL << TSQ_THROTTLED),
- TSQF_QUEUED = (1UL << TSQ_QUEUED),
- TCPF_TSQ_DEFERRED = (1UL << TCP_TSQ_DEFERRED),
- TCPF_WRITE_TIMER_DEFERRED = (1UL << TCP_WRITE_TIMER_DEFERRED),
- TCPF_DELACK_TIMER_DEFERRED = (1UL << TCP_DELACK_TIMER_DEFERRED),
- TCPF_MTU_REDUCED_DEFERRED = (1UL << TCP_MTU_REDUCED_DEFERRED),
+ TSQF_THROTTLED = BIT(TSQ_THROTTLED),
+ TSQF_QUEUED = BIT(TSQ_QUEUED),
+ TCPF_TSQ_DEFERRED = BIT(TCP_TSQ_DEFERRED),
+ TCPF_WRITE_TIMER_DEFERRED = BIT(TCP_WRITE_TIMER_DEFERRED),
+ TCPF_DELACK_TIMER_DEFERRED = BIT(TCP_DELACK_TIMER_DEFERRED),
+ TCPF_MTU_REDUCED_DEFERRED = BIT(TCP_MTU_REDUCED_DEFERRED),
+ TCPF_ACK_DEFERRED = BIT(TCP_ACK_DEFERRED),
};
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 517bdae..52f9b18 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -500,6 +500,39 @@
return jhdr->nexthdr;
}
+/* Return 0 if HBH header is successfully removed
+ * Or if HBH removal is unnecessary (packet is not big TCP)
+ * Return error to indicate dropping the packet
+ */
+static inline int ipv6_hopopt_jumbo_remove(struct sk_buff *skb)
+{
+ const int hophdr_len = sizeof(struct hop_jumbo_hdr);
+ int nexthdr = ipv6_has_hopopt_jumbo(skb);
+ struct ipv6hdr *h6;
+
+ if (!nexthdr)
+ return 0;
+
+ if (skb_cow_head(skb, 0))
+ return -1;
+
+ /* Remove the HBH header.
+ * Layout: [Ethernet header][IPv6 header][HBH][L4 Header]
+ */
+ memmove(skb_mac_header(skb) + hophdr_len, skb_mac_header(skb),
+ skb_network_header(skb) - skb_mac_header(skb) +
+ sizeof(struct ipv6hdr));
+
+ __skb_pull(skb, hophdr_len);
+ skb->network_header += hophdr_len;
+ skb->mac_header += hophdr_len;
+
+ h6 = ipv6_hdr(skb);
+ h6->nexthdr = nexthdr;
+
+ return 0;
+}
+
static inline bool ipv6_accept_ra(struct inet6_dev *idev)
{
/* If forwarding is enabled, RA are not accepted unless the special
diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 8249060..55d8723 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -13,6 +13,7 @@
int sysctl_somaxconn;
u8 sysctl_txrehash;
+ int sysctl_optmem_max;
#ifdef CONFIG_PROC_FS
struct prot_inuse __percpu *prot_inuse;
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index ede2ff1..ad6b02f 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -151,7 +151,7 @@
u8 sysctl_tcp_abort_on_overflow;
u8 sysctl_tcp_fack; /* obsolete */
int sysctl_tcp_max_reordering;
- int sysctl_tcp_adv_win_scale;
+ int sysctl_tcp_adv_win_scale; /* obsolete */
u8 sysctl_tcp_dsack;
u8 sysctl_tcp_app_win;
u8 sysctl_tcp_frto;
@@ -167,6 +167,7 @@
u8 sysctl_tcp_tso_rtt_log;
u8 sysctl_tcp_autocorking;
u8 sysctl_tcp_reflect_tos;
+ u8 sysctl_tcp_backlog_ack_defer;
int sysctl_tcp_invalid_ratelimit;
int sysctl_tcp_pacing_ss_ratio;
int sysctl_tcp_pacing_ca_ratio;
diff --git a/include/net/sock.h b/include/net/sock.h
index 6b51e85..84709e1 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -350,6 +350,7 @@
* @sk_txtime_unused: unused txtime flags
* @ns_tracker: tracker for netns reference
* @sk_bind2_node: bind node in the bhash2 table
+ * @sk_pagepool: page pool associated with this socket.
*/
struct sock {
/*
@@ -541,6 +542,7 @@
struct rcu_head sk_rcu;
netns_tracker ns_tracker;
struct hlist_node sk_bind2_node;
+ struct xarray sk_pagepool;
};
enum sk_pacing {
@@ -1846,12 +1848,10 @@
static inline void sock_release_ownership(struct sock *sk)
{
- if (sock_owned_by_user_nocheck(sk)) {
- sk->sk_lock.owned = 0;
+ sk->sk_lock.owned = 0;
- /* The sk_lock has mutex_unlock() semantics: */
- mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
- }
+ /* The sk_lock has mutex_unlock() semantics: */
+ mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
}
/* no reclassification while locks are held */
@@ -1924,6 +1924,8 @@
u64 transmit_time;
u32 mark;
u16 tsflags;
+ u32 devmem_fd;
+ u32 devmem_offset;
};
static inline void sockcm_init(struct sockcm_cookie *sockc,
@@ -2952,7 +2954,6 @@
extern __u32 sysctl_rmem_max;
extern int sysctl_tstamp_allow_data;
-extern int sysctl_optmem_max;
extern __u32 sysctl_wmem_default;
extern __u32 sysctl_rmem_default;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4c838f7..c99b90c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -733,6 +733,8 @@
tcp_fast_path_on(tp);
}
+u32 tcp_delack_max(struct sock *sk);
+
/* Compute the actual rto_min value */
static inline u32 tcp_rto_min(struct sock *sk)
{
@@ -991,7 +993,7 @@
static inline bool tcp_skb_can_collapse_to(const struct sk_buff *skb)
{
- return likely(!TCP_SKB_CB(skb)->eor);
+ return likely(!TCP_SKB_CB(skb)->eor && !skb_frags_not_readable(skb));
}
static inline bool tcp_skb_can_collapse(const struct sk_buff *to,
@@ -999,7 +1001,9 @@
{
return likely(tcp_skb_can_collapse_to(to) &&
mptcp_skb_can_collapse(to, from) &&
- skb_pure_zcopy_same(to, from));
+ skb_pure_zcopy_same(to, from) &&
+ skb_frags_not_readable(to) ==
+ skb_frags_not_readable(from));
}
/* Events passed to congestion control interface */
@@ -1441,11 +1445,27 @@
static inline int tcp_win_from_space(const struct sock *sk, int space)
{
- int tcp_adv_win_scale = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_adv_win_scale);
+ s64 scaled_space = (s64)space * tcp_sk(sk)->scaling_ratio;
- return tcp_adv_win_scale <= 0 ?
- (space>>(-tcp_adv_win_scale)) :
- space - (space>>tcp_adv_win_scale);
+ return scaled_space >> TCP_RMEM_TO_WIN_SCALE;
+}
+
+/* inverse of tcp_win_from_space() */
+static inline int tcp_space_from_win(const struct sock *sk, int win)
+{
+ u64 val = (u64)win << TCP_RMEM_TO_WIN_SCALE;
+
+ do_div(val, tcp_sk(sk)->scaling_ratio);
+ return val;
+}
+
+static inline void tcp_scaling_ratio_init(struct sock *sk)
+{
+ /* Assume a conservative default of 1200 bytes of payload per 4K page.
+ * This may be adjusted later in tcp_measure_rcv_mss().
+ */
+ tcp_sk(sk)->scaling_ratio = (1200 << TCP_RMEM_TO_WIN_SCALE) /
+ SKB_TRUESIZE(4096);
}
/* Note: caller must be prepared to deal with negative returns */
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 6382308..2a5a7f5 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -132,6 +132,12 @@
#define SO_RCVMARK 75
+#define SO_DEVMEM_DONTNEED 97
+#define SO_DEVMEM_HEADER 98
+#define SCM_DEVMEM_HEADER SO_DEVMEM_HEADER
+#define SO_DEVMEM_OFFSET 99
+#define SCM_DEVMEM_OFFSET SO_DEVMEM_OFFSET
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index d676ed2..64dea3e 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -71,6 +71,7 @@
#define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
#define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
#define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
+#define AUDIT_CONTAINER_OP 1020 /* Define the container id and info */
#define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly uninteresting to kernel */
#define AUDIT_USER_AVC 1107 /* We filter this differently */
@@ -502,6 +503,7 @@
#define AUDIT_UID_UNSET (unsigned int)-1
#define AUDIT_SID_UNSET ((unsigned int)-1)
+#define AUDIT_CID_UNSET ((u64)-1)
/* audit_rule_data supports filter rules with both integer and string
* fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma-buf.h
index 5a6fda6..6c7dd86 100644
--- a/include/uapi/linux/dma-buf.h
+++ b/include/uapi/linux/dma-buf.h
@@ -75,6 +75,17 @@
__u64 flags;
};
+struct dma_buf_create_pages_info {
+ __u64 pci_bdf[3];
+ __s32 dma_buf_fd;
+ __s32 create_page_pool;
+};
+
+struct dma_buf_pages_bind_rx_queue {
+ char ifname[IFNAMSIZ];
+ __u32 rxq_idx;
+};
+
#define DMA_BUF_SYNC_READ (1 << 0)
#define DMA_BUF_SYNC_WRITE (2 << 0)
#define DMA_BUF_SYNC_RW (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
@@ -179,4 +190,7 @@
#define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, struct dma_buf_export_sync_file)
#define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct dma_buf_import_sync_file)
+#define DMA_BUF_CREATE_PAGES _IOW(DMA_BUF_BASE, 2, struct dma_buf_create_pages_info)
+#define DMA_BUF_PAGES_BIND_RX _IOW(DMA_BUF_BASE, 3, struct dma_buf_pages_bind_rx_queue)
+
#endif
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 2f86b2a..6b82ad2 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -110,5 +110,8 @@
#define AT_STATX_DONT_SYNC 0x4000 /* - Don't sync attributes with the server */
#define AT_RECURSIVE 0x8000 /* Apply to the entire subtree */
+#if defined(__KERNEL__)
+#define AT_GETATTR_NOSEC 0x80000000
+#endif
#endif /* _UAPI_LINUX_FCNTL_H */
diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h
index 71a5df8d..701b355 100644
--- a/include/uapi/linux/futex.h
+++ b/include/uapi/linux/futex.h
@@ -23,6 +23,8 @@
#define FUTEX_CMP_REQUEUE_PI 12
#define FUTEX_LOCK_PI2 13
+#define GFUTEX_SWAP 60
+
#define FUTEX_PRIVATE_FLAG 128
#define FUTEX_CLOCK_REALTIME 256
#define FUTEX_CMD_MASK ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)
@@ -43,6 +45,8 @@
#define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
+#define GFUTEX_SWAP_PRIVATE (GFUTEX_SWAP | FUTEX_PRIVATE_FLAG)
+
/*
* Flags to specify the bit length of the futex word for futex2 syscalls.
* Currently, only 32 is supported.
diff --git a/include/uapi/linux/tdx-guest.h b/include/uapi/linux/tdx-guest.h
new file mode 100644
index 0000000..a6a2098
--- /dev/null
+++ b/include/uapi/linux/tdx-guest.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Userspace interface for TDX guest driver
+ *
+ * Copyright (C) 2022 Intel Corporation
+ */
+
+#ifndef _UAPI_LINUX_TDX_GUEST_H_
+#define _UAPI_LINUX_TDX_GUEST_H_
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/* Length of the REPORTDATA used in TDG.MR.REPORT TDCALL */
+#define TDX_REPORTDATA_LEN 64
+
+/* Length of TDREPORT used in TDG.MR.REPORT TDCALL */
+#define TDX_REPORT_LEN 1024
+
+/**
+ * struct tdx_report_req - Request struct for TDX_CMD_GET_REPORT0 IOCTL.
+ *
+ * @reportdata: User buffer with REPORTDATA to be included into TDREPORT.
+ * Typically it can be some nonce provided by attestation
+ * service, so the generated TDREPORT can be uniquely verified.
+ * @tdreport: User buffer to store TDREPORT output from TDCALL[TDG.MR.REPORT].
+ */
+struct tdx_report_req {
+ __u8 reportdata[TDX_REPORTDATA_LEN];
+ __u8 tdreport[TDX_REPORT_LEN];
+};
+
+/*
+ * TDX_CMD_GET_REPORT0 - Get TDREPORT0 (a.k.a. TDREPORT subtype 0) using
+ * TDCALL[TDG.MR.REPORT]
+ *
+ * Return 0 on success, -EIO on TDCALL execution failure, and
+ * standard errno on other general error cases.
+ */
+#define TDX_CMD_GET_REPORT0 _IOWR('T', 1, struct tdx_report_req)
+
+#endif /* _UAPI_LINUX_TDX_GUEST_H_ */
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index 059b1a9..9e96fec 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -20,6 +20,17 @@
__kernel_size_t iov_len; /* Must be size_t (1003.1g) */
};
+struct devmemvec {
+ __u32 frag_offset;
+ __u32 frag_size;
+ __u32 frag_token; /* The token is only 31 bits long. The last bit is
+ reserved to indicate the end of pagelist. */
+};
+
+struct devmemtoken {
+ __u32 token_start;
+ __u32 token_count;
+};
/*
* UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1)
*/
diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
index ddaa45e..f69db4d 100644
--- a/include/uapi/linux/virtio_balloon.h
+++ b/include/uapi/linux/virtio_balloon.h
@@ -31,12 +31,13 @@
#include <linux/virtio_config.h>
/* The feature bitmap for virtio balloon */
-#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
-#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
-#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
-#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
-#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
-#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
+#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
+#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
+#define VIRTIO_BALLOON_F_PAGE_POISON 4 /* Guest is using page poisoning */
+#define VIRTIO_BALLOON_F_REPORTING 5 /* Page reporting virtqueue */
+#define VIRTIO_BALLOON_F_MIN_BALLOON_SIZE 6 /* Min balloon size to remain even with OOM event*/
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
@@ -59,6 +60,8 @@
};
/* Stores PAGE_POISON if page poisoning is in use */
__le32 poison_val;
+ /* Number of Pages host wants Guest to give up even with delfate_on_oom event. */
+ __le32 min_balloon_pages;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
diff --git a/init/init_task.c b/init/init_task.c
index ff6c4b9..54bd366 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -135,8 +135,7 @@
.thread_group = LIST_HEAD_INIT(init_task.thread_group),
.thread_node = LIST_HEAD_INIT(init_signals.thread_head),
#ifdef CONFIG_AUDIT
- .loginuid = INVALID_UID,
- .sessionid = AUDIT_SID_UNSET,
+ .audit = &init_struct_audit,
#endif
#ifdef CONFIG_PERF_EVENTS
.perf_event_mutex = __MUTEX_INITIALIZER(init_task.perf_event_mutex),
diff --git a/init/main.c b/init/main.c
index 87a52bd..a25818e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -96,6 +96,8 @@
#include <linux/cache.h>
#include <linux/rodata_test.h>
#include <linux/jump_label.h>
+#include <linux/mem_encrypt.h>
+#include <linux/audit.h>
#include <linux/kcsan.h>
#include <linux/init_syscalls.h>
#include <linux/stackdepot.h>
@@ -1127,6 +1129,7 @@
nsfs_init();
cpuset_init();
cgroup_init();
+ audit_task_init();
taskstats_init_early();
delayacct_init();
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 3589495..93e5574 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -153,6 +153,22 @@
static struct kmem_cache *req_cachep;
+static int __read_mostly sysctl_io_uring_disabled;
+#ifdef CONFIG_SYSCTL
+static struct ctl_table kernel_io_uring_disabled_table[] = {
+ {
+ .procname = "io_uring_disabled",
+ .data = &sysctl_io_uring_disabled,
+ .maxlen = sizeof(sysctl_io_uring_disabled),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_TWO,
+ },
+ {},
+};
+#endif
+
struct sock *io_uring_get_socket(struct file *file)
{
#if defined(CONFIG_UNIX)
@@ -3682,9 +3698,19 @@
return io_uring_create(entries, &p, params);
}
+static inline bool io_uring_allowed(void)
+{
+ int disabled = READ_ONCE(sysctl_io_uring_disabled);
+
+ return disabled == 0 || (disabled == 1 && capable(CAP_SYS_ADMIN));
+}
+
SYSCALL_DEFINE2(io_uring_setup, u32, entries,
struct io_uring_params __user *, params)
{
+ if (!io_uring_allowed())
+ return -EPERM;
+
return io_uring_setup(entries, params);
}
@@ -4242,6 +4268,11 @@
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
SLAB_ACCOUNT);
+
+#ifdef CONFIG_SYSCTL
+ register_sysctl_init("kernel", kernel_io_uring_disabled_table);
+#endif
+
return 0;
};
__initcall(io_uring_init);
diff --git a/kernel/audit.c b/kernel/audit.c
index 9bc0b03..c5dde3a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -208,6 +208,75 @@
struct sk_buff *skb;
};
+static struct kmem_cache *audit_task_cache;
+
+void __init audit_task_init(void)
+{
+ audit_task_cache = kmem_cache_create("audit_task",
+ sizeof(struct audit_task_info),
+ 0, SLAB_PANIC, NULL);
+}
+
+/**
+ * audit_alloc - allocate an audit info block for a task
+ * @tsk: task
+ *
+ * Call audit_alloc_syscall to filter on the task information and
+ * allocate a per-task audit context if necessary. This is called from
+ * copy_process, so no lock is needed.
+ */
+int audit_alloc(struct task_struct *tsk)
+{
+ int ret = 0;
+ struct audit_task_info *info;
+
+ info = kmem_cache_alloc(audit_task_cache, GFP_KERNEL);
+ if (!info) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ info->loginuid = audit_get_loginuid(current);
+ info->sessionid = audit_get_sessionid(current);
+ info->contid = audit_get_contid(current);
+ tsk->audit = info;
+
+ ret = audit_alloc_syscall(tsk);
+ if (ret) {
+ tsk->audit = NULL;
+ kmem_cache_free(audit_task_cache, info);
+ }
+out:
+ return ret;
+}
+
+struct audit_task_info init_struct_audit = {
+ .loginuid = INVALID_UID,
+ .sessionid = AUDIT_SID_UNSET,
+ .contid = AUDIT_CID_UNSET,
+#ifdef CONFIG_AUDITSYSCALL
+ .ctx = NULL,
+#endif
+};
+
+/**
+ * audit_free - free per-task audit info
+ * @tsk: task whose audit info block to free
+ *
+ * Called from copy_process and do_exit
+ */
+void audit_free(struct task_struct *tsk)
+{
+ struct audit_task_info *info = tsk->audit;
+
+ audit_free_syscall(tsk);
+ /* Freeing the audit_task_info struct must be performed after
+ * audit_log_exit() due to need for loginuid and sessionid.
+ */
+ info = tsk->audit;
+ tsk->audit = NULL;
+ kmem_cache_free(audit_task_cache, info);
+}
+
/**
* auditd_test_task - Check to see if a given task is an audit daemon
* @task: the task to check
@@ -2362,8 +2431,8 @@
sessionid = (unsigned int)atomic_inc_return(&session_id);
}
- current->sessionid = sessionid;
- current->loginuid = loginuid;
+ current->audit->sessionid = sessionid;
+ current->audit->loginuid = loginuid;
out:
audit_log_set_loginuid(oldloginuid, loginuid, oldsessionid, sessionid, rc);
return rc;
@@ -2396,6 +2465,62 @@
return audit_signal_info_syscall(t);
}
+/*
+ * audit_set_contid - set current task's audit contid
+ * @task: target task
+ * @contid: contid value
+ *
+ * Returns 0 on success, -EPERM on permission failure.
+ *
+ * Called (set) from fs/proc/base.c::proc_contid_write().
+ */
+int audit_set_contid(struct task_struct *task, u64 contid)
+{
+ u64 oldcontid;
+ int rc = 0;
+ struct audit_buffer *ab;
+
+ task_lock(task);
+ /* Can't set if audit disabled */
+ if (!task->audit) {
+ task_unlock(task);
+ return -ENOPROTOOPT;
+ }
+ oldcontid = audit_get_contid(task);
+ read_lock(&tasklist_lock);
+ /* Don't allow the audit containerid to be unset */
+ if (!audit_contid_valid(contid))
+ rc = -EINVAL;
+ /* if we don't have caps, reject */
+ else if (!capable(CAP_AUDIT_CONTROL))
+ rc = -EPERM;
+ /* if task has children or is not single-threaded, deny */
+ else if (!list_empty(&task->children))
+ rc = -EBUSY;
+ else if (!(thread_group_leader(task) && thread_group_empty(task)))
+ rc = -EALREADY;
+ /* if contid is already set, deny */
+ else if (audit_contid_set(task))
+ rc = -ECHILD;
+ read_unlock(&tasklist_lock);
+ if (!rc)
+ task->audit->contid = contid;
+ task_unlock(task);
+
+ if (!audit_enabled)
+ return rc;
+
+ ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_OP);
+ if (!ab)
+ return rc;
+
+ audit_log_format(ab,
+ "op=set opid=%d contid=%llu old-contid=%llu",
+ task_tgid_nr(task), contid, oldcontid);
+ audit_log_end(ab);
+ return rc;
+}
+
/**
* audit_log_end - end one audit record
* @ab: the audit_buffer
diff --git a/kernel/audit.h b/kernel/audit.h
index c57b008..2d7634e 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -144,6 +144,7 @@
kuid_t target_uid;
unsigned int target_sessionid;
u32 target_sid;
+ u64 target_cid;
char target_comm[TASK_COMM_LEN];
struct audit_tree_refs *trees, *first_trees;
@@ -263,6 +264,8 @@
extern unsigned int audit_serial(void);
extern int auditsc_get_stamp(struct audit_context *ctx,
struct timespec64 *t, unsigned int *serial);
+extern int audit_alloc_syscall(struct task_struct *tsk);
+extern void audit_free_syscall(struct task_struct *tsk);
extern void audit_put_watch(struct audit_watch *watch);
extern void audit_get_watch(struct audit_watch *watch);
@@ -304,6 +307,9 @@
extern struct list_head *audit_killed_trees(void);
#else /* CONFIG_AUDITSYSCALL */
#define auditsc_get_stamp(c, t, s) 0
+#define audit_alloc_syscall(t) 0
+#define audit_free_syscall(t) do { } while (0)
+
#define audit_put_watch(w) do { } while (0)
#define audit_get_watch(w) do { } while (0)
#define audit_to_watch(k, p, l, o) (-EINVAL)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index c5f41fc..c880f70 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -100,6 +100,7 @@
kuid_t target_uid[AUDIT_AUX_PIDS];
unsigned int target_sessionid[AUDIT_AUX_PIDS];
u32 target_sid[AUDIT_AUX_PIDS];
+ u64 target_cid[AUDIT_AUX_PIDS];
char target_comm[AUDIT_AUX_PIDS][TASK_COMM_LEN];
int pid_count;
};
@@ -1036,23 +1037,25 @@
return context;
}
-/**
- * audit_alloc - allocate an audit context block for a task
+/*
+ * audit_alloc_syscall - allocate an audit context block for a task
* @tsk: task
*
* Filter on the task information and allocate a per-task audit context
* if necessary. Doing so turns on system call auditing for the
- * specified task. This is called from copy_process, so no lock is
- * needed.
+ * specified task. This is called from copy_process via audit_alloc, so
+ * no lock is needed.
*/
-int audit_alloc(struct task_struct *tsk)
+int audit_alloc_syscall(struct task_struct *tsk)
{
struct audit_context *context;
enum audit_state state;
char *key = NULL;
- if (likely(!audit_ever_enabled))
- return 0;
+ if (likely(!audit_ever_enabled)) {
+ audit_set_context(tsk, NULL);
+ return 0; /* Return if not auditing. */
+ }
state = audit_filter_task(tsk, &key);
if (state == AUDIT_STATE_DISABLED) {
@@ -1062,7 +1065,7 @@
if (!(context = audit_alloc_context(state))) {
kfree(key);
- audit_log_lost("out of memory in audit_alloc");
+ audit_log_lost("out of memory in audit_alloc_syscall");
return -ENOMEM;
}
context->filterkey = key;
@@ -1815,14 +1818,15 @@
}
/**
- * __audit_free - free a per-task audit context
+ * audit_free_syscall - free per-task audit context info
* @tsk: task whose audit context block to free
*
- * Called from copy_process, do_exit, and the io_uring code
+ * Called from audit_free
*/
-void __audit_free(struct task_struct *tsk)
+void audit_free_syscall(struct task_struct *tsk)
{
- struct audit_context *context = tsk->audit_context;
+ struct audit_task_info *info = tsk->audit;
+ struct audit_context *context = info->ctx;
if (!context)
return;
@@ -1852,7 +1856,6 @@
audit_log_uring(context);
}
}
-
audit_set_context(tsk, NULL);
audit_free_context(context);
}
@@ -2726,6 +2729,7 @@
context->target_uid = task_uid(t);
context->target_sessionid = audit_get_sessionid(t);
security_task_getsecid_obj(t, &context->target_sid);
+ context->target_cid = audit_get_contid(t);
memcpy(context->target_comm, t->comm, TASK_COMM_LEN);
}
@@ -2753,6 +2757,7 @@
ctx->target_uid = t_uid;
ctx->target_sessionid = audit_get_sessionid(t);
security_task_getsecid_obj(t, &ctx->target_sid);
+ ctx->target_cid = audit_get_contid(t);
memcpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
return 0;
}
@@ -2774,6 +2779,7 @@
axp->target_uid[axp->pid_count] = t_uid;
axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
security_task_getsecid_obj(t, &axp->target_sid[axp->pid_count]);
+ axp->target_cid[axp->pid_count] = audit_get_contid(t);
memcpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
axp->pid_count++;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 748ac86..beeb89e 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -45,8 +45,8 @@
lockdep_assert_held_once(&tr->mutex);
/* Instead of updating the trampoline here, we propagate
- * -EAGAIN to register_ftrace_direct_multi(). Then we can
- * retry register_ftrace_direct_multi() after updating the
+ * -EAGAIN to register_ftrace_direct(). Then we can
+ * retry register_ftrace_direct() after updating the
* trampoline.
*/
if ((tr->flags & BPF_TRAMP_F_CALL_ORIG) &&
@@ -198,7 +198,7 @@
int ret;
if (tr->func.ftrace_managed)
- ret = unregister_ftrace_direct_multi(tr->fops, (long)old_addr);
+ ret = unregister_ftrace_direct(tr->fops, (long)old_addr, false);
else
ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, NULL);
@@ -215,9 +215,9 @@
if (tr->func.ftrace_managed) {
if (lock_direct_mutex)
- ret = modify_ftrace_direct_multi(tr->fops, (long)new_addr);
+ ret = modify_ftrace_direct(tr->fops, (long)new_addr);
else
- ret = modify_ftrace_direct_multi_nolock(tr->fops, (long)new_addr);
+ ret = modify_ftrace_direct_nolock(tr->fops, (long)new_addr);
} else {
ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, old_addr, new_addr);
}
@@ -243,7 +243,7 @@
if (tr->func.ftrace_managed) {
ftrace_set_filter_ip(tr->fops, (unsigned long)ip, 0, 1);
- ret = register_ftrace_direct_multi(tr->fops, (long)new_addr);
+ ret = register_ftrace_direct(tr->fops, (long)new_addr);
} else {
ret = bpf_arch_text_poke(ip, BPF_MOD_CALL, NULL, new_addr);
}
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index be61332c..f462a9d 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -79,6 +79,8 @@
if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT))
trace_sys_enter(regs, syscall);
+ /* trace_sys_enter might have changed the syscall number */
+ syscall = syscall_get_nr(current, regs);
syscall_enter_audit(regs, syscall);
diff --git a/kernel/exit.c b/kernel/exit.c
index bccfa42..3212f3f 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -68,6 +68,7 @@
#include <linux/kprobes.h>
#include <linux/rethook.h>
#include <linux/sysfs.h>
+#include <linux/security.h>
#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -844,6 +845,8 @@
#endif
if (tsk->mm)
setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
+
+ security_task_exit(tsk);
}
acct_collect(code, group_dead);
if (group_dead)
@@ -1905,7 +1908,14 @@
}
EXPORT_SYMBOL(thread_group_exited);
-__weak void abort(void)
+/*
+ * This needs to be __function_aligned as GCC implicitly makes any
+ * implementation of abort() cold and drops alignment specified by
+ * -falign-functions=N.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345#c11
+ */
+__weak __function_aligned void abort(void)
{
BUG();
diff --git a/kernel/fork.c b/kernel/fork.c
index 8561792..bdba9e2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2188,7 +2188,6 @@
posix_cputimers_init(&p->posix_cputimers);
p->io_context = NULL;
- audit_set_context(p, NULL);
cgroup_fork(p);
if (args->kthread) {
if (!set_kthread_struct(p))
@@ -2505,6 +2504,7 @@
uprobe_copy_process(p, clone_flags);
copy_oom_score_adj(clone_flags, p);
+ security_task_post_alloc(p);
return p;
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h
index b5379c0..3928f7b 100644
--- a/kernel/futex/futex.h
+++ b/kernel/futex/futex.h
@@ -143,7 +143,7 @@
extern int futex_wait_setup(u32 __user *uaddr, u32 val, unsigned int flags,
struct futex_q *q, struct futex_hash_bucket **hb);
extern void futex_wait_queue(struct futex_hash_bucket *hb, struct futex_q *q,
- struct hrtimer_sleeper *timeout);
+ struct hrtimer_sleeper *timeout, struct task_struct *next);
extern void futex_wake_mark(struct wake_q_head *wake_q, struct futex_q *q);
extern int fault_in_user_writeable(u32 __user *uaddr);
@@ -265,7 +265,10 @@
u32 *cmpval, int requeue_pi);
extern int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
- ktime_t *abs_time, u32 bitset);
+ ktime_t *abs_time, u32 bitset, struct task_struct *next);
+
+extern int futex_swap(u32 __user *uaddr, unsigned int flags, u32 val,
+ ktime_t *abs_time, u32 __user *uaddr2);
/**
* struct futex_vector - Auxiliary struct for futex_waitv()
diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c
index cba8b1a..7f8fcf9 100644
--- a/kernel/futex/requeue.c
+++ b/kernel/futex/requeue.c
@@ -816,7 +816,7 @@
}
/* Queue the futex_q, drop the hb lock, wait for wakeup. */
- futex_wait_queue(hb, &q, to);
+ futex_wait_queue(hb, &q, to, NULL);
switch (futex_requeue_pi_wakeup_sync(&q)) {
case Q_REQUEUE_PI_IGNORE:
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index a807407..6af241b 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -103,7 +103,7 @@
val3 = FUTEX_BITSET_MATCH_ANY;
fallthrough;
case FUTEX_WAIT_BITSET:
- return futex_wait(uaddr, flags, val, timeout, val3);
+ return futex_wait(uaddr, flags, val, timeout, val3, NULL);
case FUTEX_WAKE:
val3 = FUTEX_BITSET_MATCH_ANY;
fallthrough;
@@ -130,6 +130,8 @@
uaddr2);
case FUTEX_CMP_REQUEUE_PI:
return futex_requeue(uaddr, flags, uaddr2, val, val2, &val3, 1);
+ case GFUTEX_SWAP:
+ return futex_swap(uaddr, flags, val, timeout, uaddr2);
}
return -ENOSYS;
}
@@ -142,6 +144,7 @@
case FUTEX_LOCK_PI2:
case FUTEX_WAIT_BITSET:
case FUTEX_WAIT_REQUEUE_PI:
+ case GFUTEX_SWAP:
return true;
}
return false;
@@ -154,7 +157,7 @@
return -EINVAL;
*t = timespec64_to_ktime(*ts);
- if (cmd == FUTEX_WAIT)
+ if (cmd == FUTEX_WAIT || cmd == GFUTEX_SWAP)
*t = ktime_add_safe(ktime_get(), *t);
else if (cmd != FUTEX_LOCK_PI && !(op & FUTEX_CLOCK_REALTIME))
*t = timens_ktime_to_host(CLOCK_MONOTONIC, *t);
diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c
index ba01b94..82a2029 100644
--- a/kernel/futex/waitwake.c
+++ b/kernel/futex/waitwake.c
@@ -138,15 +138,15 @@
}
/*
- * Wake up waiters matching bitset queued on this futex (uaddr).
+ * Prepare wake queue matching bitset queued on this futex (uaddr).
*/
-int futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset)
+int prepare_wake_q(u32 __user *uaddr, unsigned int flags,
+ int nr_wake, u32 bitset, struct wake_q_head *wake_q)
{
struct futex_hash_bucket *hb;
struct futex_q *this, *next;
union futex_key key = FUTEX_KEY_INIT;
int ret;
- DEFINE_WAKE_Q(wake_q);
if (!bitset)
return -EINVAL;
@@ -174,14 +174,23 @@
if (!(this->bitset & bitset))
continue;
- futex_wake_mark(&wake_q, this);
+ futex_wake_mark(wake_q, this);
if (++ret >= nr_wake)
break;
}
}
spin_unlock(&hb->lock);
- wake_up_q(&wake_q);
+ return ret;
+}
+
+int futex_wake(u32 __user *uaddr, unsigned int flags, int nr_wake, u32 bitset) {
+ int ret;
+ DEFINE_WAKE_Q(wake_q);
+
+ ret = prepare_wake_q(uaddr, flags, nr_wake, bitset, &wake_q);
+ if (ret > 0)
+ wake_up_q(&wake_q);
return ret;
}
@@ -324,9 +333,11 @@
* @hb: the futex hash bucket, must be locked by the caller
* @q: the futex_q to queue up on
* @timeout: the prepared hrtimer_sleeper, or null for no timeout
+ * @next: if present, wake next and hint to the scheduler that we'd
+ * prefer to execute it locally
*/
void futex_wait_queue(struct futex_hash_bucket *hb, struct futex_q *q,
- struct hrtimer_sleeper *timeout)
+ struct hrtimer_sleeper *timeout, struct task_struct *next)
{
/*
* The task state is guaranteed to be set before another task can
@@ -351,10 +362,24 @@
* flagged for rescheduling. Only call schedule if there
* is no timeout, or if it has yet to expire.
*/
- if (!timeout || timeout->task)
+ if (!timeout || timeout->task) {
+ if (next) {
+#ifdef CONFIG_SMP
+ wake_up_swap(next);
+#else
+ wake_up_process(next);
+#endif
+ put_task_struct(next);
+ next = NULL;
+ }
schedule();
+ }
}
__set_current_state(TASK_RUNNING);
+ if (next) {
+ wake_up_process(next);
+ put_task_struct(next);
+ }
}
/**
@@ -629,7 +654,8 @@
return ret;
}
-int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset)
+int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val,
+ ktime_t *abs_time, u32 bitset, struct task_struct *next)
{
struct hrtimer_sleeper timeout, *to;
struct restart_block *restart;
@@ -653,7 +679,8 @@
goto out;
/* futex_queue and wait for wakeup, timeout, or a signal. */
- futex_wait_queue(hb, &q, to);
+ futex_wait_queue(hb, &q, to, next);
+ next = NULL;
/* If we were woken (and unqueued), we succeeded, whatever. */
ret = 0;
@@ -684,6 +711,10 @@
ret = set_restart_fn(restart, futex_wait_restart);
out:
+ if (next) {
+ wake_up_process(next);
+ put_task_struct(next);
+ }
if (to) {
hrtimer_cancel(&to->timer);
destroy_hrtimer_on_stack(&to->timer);
@@ -703,6 +734,26 @@
restart->fn = do_no_restart_syscall;
return (long)futex_wait(uaddr, restart->futex.flags,
- restart->futex.val, tp, restart->futex.bitset);
+ restart->futex.val, tp, restart->futex.bitset, NULL);
}
+int futex_swap(u32 __user *uaddr, unsigned int flags, u32 val,
+ ktime_t *abs_time, u32 __user *uaddr2)
+{
+ u32 bitset = FUTEX_BITSET_MATCH_ANY;
+ struct task_struct *next = NULL;
+ DEFINE_WAKE_Q(wake_q);
+ int ret;
+
+ ret = prepare_wake_q(uaddr2, flags, 1, bitset, &wake_q);
+ if (ret < 0)
+ return ret;
+ if (wake_q.first != WAKE_Q_TAIL) {
+ WARN_ON(ret != 1);
+ /* At most one wakee can be present. Pull it out. */
+ next = container_of(wake_q.first, struct task_struct, wake_q);
+ next->wake_q.next = NULL;
+ }
+
+ return futex_wait(uaddr, flags, val, abs_time, bitset, next);
+}
\ No newline at end of file
diff --git a/kernel/kthread.c b/kernel/kthread.c
index f97fd01..e6b84b7 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1402,12 +1402,11 @@
* kthread_use_mm - make the calling kthread operate on an address space
* @mm: address space to operate on
*/
-void kthread_use_mm(struct mm_struct *mm)
+void use_mm(struct mm_struct *mm)
{
struct mm_struct *active_mm;
struct task_struct *tsk = current;
- WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
WARN_ON_ONCE(tsk->mm);
task_lock(tsk);
@@ -1441,6 +1440,16 @@
else
smp_mb();
}
+EXPORT_SYMBOL_GPL(use_mm);
+
+void kthread_use_mm(struct mm_struct *mm)
+{
+ struct task_struct *tsk = current;
+
+ WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
+
+ use_mm(mm);
+}
EXPORT_SYMBOL_GPL(kthread_use_mm);
/**
@@ -1452,6 +1461,16 @@
struct task_struct *tsk = current;
WARN_ON_ONCE(!(tsk->flags & PF_KTHREAD));
+
+ unuse_mm(mm);
+}
+
+EXPORT_SYMBOL_GPL(kthread_unuse_mm);
+
+void unuse_mm(struct mm_struct *mm)
+{
+ struct task_struct *tsk = current;
+
WARN_ON_ONCE(!tsk->mm);
task_lock(tsk);
@@ -1472,7 +1491,7 @@
local_irq_enable();
task_unlock(tsk);
}
-EXPORT_SYMBOL_GPL(kthread_unuse_mm);
+EXPORT_SYMBOL_GPL(unuse_mm);
#ifdef CONFIG_BLK_CGROUP
/**
diff --git a/kernel/livepatch/patch.c b/kernel/livepatch/patch.c
index 4c4f5a7..4152c71 100644
--- a/kernel/livepatch/patch.c
+++ b/kernel/livepatch/patch.c
@@ -118,7 +118,7 @@
if (func->nop)
goto unlock;
- ftrace_instruction_pointer_set(fregs, (unsigned long)func->new_func);
+ ftrace_regs_set_instruction_pointer(fregs, (unsigned long)func->new_func);
unlock:
ftrace_test_recursion_unlock(bit);
diff --git a/kernel/module/signing.c b/kernel/module/signing.c
index a2ff4242..a2923e9 100644
--- a/kernel/module/signing.c
+++ b/kernel/module/signing.c
@@ -61,10 +61,17 @@
modlen -= sig_len + sizeof(ms);
info->len = modlen;
- return verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
- VERIFY_USE_SECONDARY_KEYRING,
- VERIFYING_MODULE_SIGNATURE,
- NULL, NULL);
+ ret = verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
+ VERIFY_USE_SECONDARY_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ if (ret == -ENOKEY && IS_ENABLED(CONFIG_INTEGRITY_PLATFORM_KEYRING)) {
+ ret = verify_pkcs7_signature(mod, modlen, mod + modlen, sig_len,
+ VERIFY_USE_PLATFORM_KEYRING,
+ VERIFYING_MODULE_SIGNATURE,
+ NULL, NULL);
+ }
+ return ret;
}
int module_sig_check(struct load_info *info, int flags)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 18a4f8f..1d0cfa9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4363,6 +4363,11 @@
}
EXPORT_SYMBOL(wake_up_process);
+int wake_up_swap(struct task_struct *tsk)
+{
+ return try_to_wake_up(tsk, TASK_NORMAL, WF_CURRENT_CPU);
+}
+
int wake_up_state(struct task_struct *p, unsigned int state)
{
return try_to_wake_up(p, state, 0);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2558ab9..e8ce9d8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7439,6 +7439,8 @@
/* SD_flags and WF_flags share the first nibble */
int sd_flag = wake_flags & 0xF;
+ if ((wake_flags & WF_CURRENT_CPU) && cpumask_test_cpu(cpu, p->cpus_ptr))
+ return cpu;
/*
* required for stable ->cpus_allowed
*/
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b62d53d..ac02495 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2083,6 +2083,7 @@
#define WF_SYNC 0x10 /* Waker goes to sleep after wakeup */
#define WF_MIGRATED 0x20 /* Internal use, task got migrated */
+#define WF_CURRENT_CPU 0x200 /* Prefer to move wakee to the current CPU */
#ifdef CONFIG_SMP
static_assert(WF_EXEC == SD_BALANCE_EXEC);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913..6768729 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -80,21 +80,6 @@
wake_up_process(tsk);
}
-/*
- * If ksoftirqd is scheduled, we do not want to process pending softirqs
- * right now. Let ksoftirqd handle this at its own rate, to get fairness,
- * unless we're doing some of the synchronous softirqs.
- */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
-static bool ksoftirqd_running(unsigned long pending)
-{
- struct task_struct *tsk = __this_cpu_read(ksoftirqd);
-
- if (pending & SOFTIRQ_NOW_MASK)
- return false;
- return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
-}
-
#ifdef CONFIG_TRACE_IRQFLAGS
DEFINE_PER_CPU(int, hardirqs_enabled);
DEFINE_PER_CPU(int, hardirq_context);
@@ -236,7 +221,7 @@
goto out;
pending = local_softirq_pending();
- if (!pending || ksoftirqd_running(pending))
+ if (!pending)
goto out;
/*
@@ -432,9 +417,6 @@
static inline void invoke_softirq(void)
{
- if (ksoftirqd_running(local_softirq_pending()))
- return;
-
if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
/*
@@ -468,7 +450,7 @@
pending = local_softirq_pending();
- if (pending && !ksoftirqd_running(pending))
+ if (pending)
do_softirq_own_stack();
local_irq_restore(flags);
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 93d7249..b6600ae 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -42,14 +42,17 @@
config HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
bool
+config HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
+ bool
+
config HAVE_DYNAMIC_FTRACE_WITH_ARGS
bool
help
If this is set, then arguments and stack can be found from
- the pt_regs passed into the function callback regs parameter
+ the ftrace_regs passed into the function callback regs parameter
by default, even without setting the REGS flag in the ftrace_ops.
- This allows for use of regs_get_kernel_argument() and
- kernel_stack_pointer().
+ This allows for use of ftrace_regs_get_argument() and
+ ftrace_regs_get_stack_pointer().
config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
bool
@@ -247,9 +250,13 @@
config DYNAMIC_FTRACE_WITH_DIRECT_CALLS
def_bool y
- depends on DYNAMIC_FTRACE_WITH_REGS
+ depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
depends on HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+config DYNAMIC_FTRACE_WITH_CALL_OPS
+ def_bool y
+ depends on HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
+
config DYNAMIC_FTRACE_WITH_ARGS
def_bool y
depends on DYNAMIC_FTRACE
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 552956c..d6b0e16 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -125,6 +125,33 @@
void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+/*
+ * Stub used to invoke the list ops without requiring a separate trampoline.
+ */
+const struct ftrace_ops ftrace_list_ops = {
+ .func = ftrace_ops_list_func,
+ .flags = FTRACE_OPS_FL_STUB,
+};
+
+static void ftrace_ops_nop_func(unsigned long ip, unsigned long parent_ip,
+ struct ftrace_ops *op,
+ struct ftrace_regs *fregs)
+{
+ /* do nothing */
+}
+
+/*
+ * Stub used when a call site is disabled. May be called transiently by threads
+ * which have made it into ftrace_caller but haven't yet recovered the ops at
+ * the point the call site is disabled.
+ */
+const struct ftrace_ops ftrace_nop_ops = {
+ .func = ftrace_ops_nop_func,
+ .flags = FTRACE_OPS_FL_STUB,
+};
+#endif
+
static inline void ftrace_ops_init(struct ftrace_ops *ops)
{
#ifdef CONFIG_DYNAMIC_FTRACE
@@ -1820,6 +1847,18 @@
* if rec count is zero.
*/
}
+
+ /*
+ * If the rec has a single associated ops, and ops->func can be
+ * called directly, allow the call site to call via the ops.
+ */
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS) &&
+ ftrace_rec_count(rec) == 1 &&
+ ftrace_ops_get_func(ops) == ops->func)
+ rec->flags |= FTRACE_FL_CALL_OPS;
+ else
+ rec->flags &= ~FTRACE_FL_CALL_OPS;
+
count++;
/* Must match FTRACE_UPDATE_CALLS in ftrace_modify_all_code() */
@@ -2114,8 +2153,9 @@
struct ftrace_ops *ops = NULL;
pr_info("ftrace record flags: %lx\n", rec->flags);
- pr_cont(" (%ld)%s", ftrace_rec_count(rec),
- rec->flags & FTRACE_FL_REGS ? " R" : " ");
+ pr_cont(" (%ld)%s%s", ftrace_rec_count(rec),
+ rec->flags & FTRACE_FL_REGS ? " R" : " ",
+ rec->flags & FTRACE_FL_CALL_OPS ? " O" : " ");
if (rec->flags & FTRACE_FL_TRAMP_EN) {
ops = ftrace_find_tramp_ops_any(rec);
if (ops) {
@@ -2183,6 +2223,7 @@
* want the direct enabled (it will be done via the
* direct helper). But if DIRECT_EN is set, and
* the count is not one, we need to clear it.
+ *
*/
if (ftrace_rec_count(rec) == 1) {
if (!(rec->flags & FTRACE_FL_DIRECT) !=
@@ -2191,6 +2232,19 @@
} else if (rec->flags & FTRACE_FL_DIRECT_EN) {
flag |= FTRACE_FL_DIRECT;
}
+
+ /*
+ * Ops calls are special, as count matters.
+ * As with direct calls, they must only be enabled when count
+ * is one, otherwise they'll be handled via the list ops.
+ */
+ if (ftrace_rec_count(rec) == 1) {
+ if (!(rec->flags & FTRACE_FL_CALL_OPS) !=
+ !(rec->flags & FTRACE_FL_CALL_OPS_EN))
+ flag |= FTRACE_FL_CALL_OPS;
+ } else if (rec->flags & FTRACE_FL_CALL_OPS_EN) {
+ flag |= FTRACE_FL_CALL_OPS;
+ }
}
/* If the state of this record hasn't changed, then do nothing */
@@ -2235,6 +2289,21 @@
rec->flags &= ~FTRACE_FL_DIRECT_EN;
}
}
+
+ if (flag & FTRACE_FL_CALL_OPS) {
+ if (ftrace_rec_count(rec) == 1) {
+ if (rec->flags & FTRACE_FL_CALL_OPS)
+ rec->flags |= FTRACE_FL_CALL_OPS_EN;
+ else
+ rec->flags &= ~FTRACE_FL_CALL_OPS_EN;
+ } else {
+ /*
+ * Can only call directly if there's
+ * only one set of associated ops.
+ */
+ rec->flags &= ~FTRACE_FL_CALL_OPS_EN;
+ }
+ }
}
/*
@@ -2264,7 +2333,8 @@
* and REGS states. The _EN flags must be disabled though.
*/
rec->flags &= ~(FTRACE_FL_ENABLED | FTRACE_FL_TRAMP_EN |
- FTRACE_FL_REGS_EN | FTRACE_FL_DIRECT_EN);
+ FTRACE_FL_REGS_EN | FTRACE_FL_DIRECT_EN |
+ FTRACE_FL_CALL_OPS_EN);
}
ftrace_bug_type = FTRACE_BUG_NOP;
@@ -2437,6 +2507,25 @@
return NULL;
}
+struct ftrace_ops *
+ftrace_find_unique_ops(struct dyn_ftrace *rec)
+{
+ struct ftrace_ops *op, *found = NULL;
+ unsigned long ip = rec->ip;
+
+ do_for_each_ftrace_op(op, ftrace_ops_list) {
+
+ if (hash_contains_ip(ip, op->func_hash)) {
+ if (found)
+ return NULL;
+ found = op;
+ }
+
+ } while_for_each_ftrace_op(op);
+
+ return found;
+}
+
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
/* Protected by rcu_tasks for reading, and direct_mutex for writing */
static struct ftrace_hash *direct_functions = EMPTY_HASH;
@@ -2494,29 +2583,13 @@
static void call_direct_funcs(unsigned long ip, unsigned long pip,
struct ftrace_ops *ops, struct ftrace_regs *fregs)
{
- struct pt_regs *regs = ftrace_get_regs(fregs);
- unsigned long addr;
+ unsigned long addr = READ_ONCE(ops->direct_call);
- addr = ftrace_find_rec_direct(ip);
if (!addr)
return;
- arch_ftrace_set_direct_caller(regs, addr);
+ arch_ftrace_set_direct_caller(fregs, addr);
}
-
-struct ftrace_ops direct_ops = {
- .func = call_direct_funcs,
- .flags = FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS
- | FTRACE_OPS_FL_PERMANENT,
- /*
- * By declaring the main trampoline as this trampoline
- * it will never have one allocated for it. Allocated
- * trampolines should not call direct functions.
- * The direct_ops should only be called by the builtin
- * ftrace_regs_caller trampoline.
- */
- .trampoline = FTRACE_REGS_ADDR,
-};
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
/**
@@ -3782,11 +3855,12 @@
if (iter->flags & FTRACE_ITER_ENABLED) {
struct ftrace_ops *ops;
- seq_printf(m, " (%ld)%s%s%s",
+ seq_printf(m, " (%ld)%s%s%s%s",
ftrace_rec_count(rec),
rec->flags & FTRACE_FL_REGS ? " R" : " ",
rec->flags & FTRACE_FL_IPMODIFY ? " I" : " ",
- rec->flags & FTRACE_FL_DIRECT ? " D" : " ");
+ rec->flags & FTRACE_FL_DIRECT ? " D" : " ",
+ rec->flags & FTRACE_FL_CALL_OPS ? " O" : " ");
if (rec->flags & FTRACE_FL_TRAMP_EN) {
ops = ftrace_find_tramp_ops_any(rec);
if (ops) {
@@ -3802,6 +3876,15 @@
} else {
add_trampoline_func(m, NULL, rec);
}
+ if (rec->flags & FTRACE_FL_CALL_OPS_EN) {
+ ops = ftrace_find_unique_ops(rec);
+ if (ops) {
+ seq_printf(m, "\tops: %pS (%pS)",
+ ops, ops->func);
+ } else {
+ seq_puts(m, "\tops: ERROR!");
+ }
+ }
if (rec->flags & FTRACE_FL_DIRECT) {
unsigned long direct;
@@ -5197,391 +5280,9 @@
static LIST_HEAD(ftrace_direct_funcs);
-/**
- * ftrace_find_direct_func - test an address if it is a registered direct caller
- * @addr: The address of a registered direct caller
- *
- * This searches to see if a ftrace direct caller has been registered
- * at a specific address, and if so, it returns a descriptor for it.
- *
- * This can be used by architecture code to see if an address is
- * a direct caller (trampoline) attached to a fentry/mcount location.
- * This is useful for the function_graph tracer, as it may need to
- * do adjustments if it traced a location that also has a direct
- * trampoline attached to it.
- */
-struct ftrace_direct_func *ftrace_find_direct_func(unsigned long addr)
-{
- struct ftrace_direct_func *entry;
- bool found = false;
-
- /* May be called by fgraph trampoline (protected by rcu tasks) */
- list_for_each_entry_rcu(entry, &ftrace_direct_funcs, next) {
- if (entry->addr == addr) {
- found = true;
- break;
- }
- }
- if (found)
- return entry;
-
- return NULL;
-}
-
-static struct ftrace_direct_func *ftrace_alloc_direct_func(unsigned long addr)
-{
- struct ftrace_direct_func *direct;
-
- direct = kmalloc(sizeof(*direct), GFP_KERNEL);
- if (!direct)
- return NULL;
- direct->addr = addr;
- direct->count = 0;
- list_add_rcu(&direct->next, &ftrace_direct_funcs);
- ftrace_direct_func_count++;
- return direct;
-}
-
static int register_ftrace_function_nolock(struct ftrace_ops *ops);
-/**
- * register_ftrace_direct - Call a custom trampoline directly
- * @ip: The address of the nop at the beginning of a function
- * @addr: The address of the trampoline to call at @ip
- *
- * This is used to connect a direct call from the nop location (@ip)
- * at the start of ftrace traced functions. The location that it calls
- * (@addr) must be able to handle a direct call, and save the parameters
- * of the function being traced, and restore them (or inject new ones
- * if needed), before returning.
- *
- * Returns:
- * 0 on success
- * -EBUSY - Another direct function is already attached (there can be only one)
- * -ENODEV - @ip does not point to a ftrace nop location (or not supported)
- * -ENOMEM - There was an allocation failure.
- */
-int register_ftrace_direct(unsigned long ip, unsigned long addr)
-{
- struct ftrace_direct_func *direct;
- struct ftrace_func_entry *entry;
- struct ftrace_hash *free_hash = NULL;
- struct dyn_ftrace *rec;
- int ret = -ENODEV;
-
- mutex_lock(&direct_mutex);
-
- ip = ftrace_location(ip);
- if (!ip)
- goto out_unlock;
-
- /* See if there's a direct function at @ip already */
- ret = -EBUSY;
- if (ftrace_find_rec_direct(ip))
- goto out_unlock;
-
- ret = -ENODEV;
- rec = lookup_rec(ip, ip);
- if (!rec)
- goto out_unlock;
-
- /*
- * Check if the rec says it has a direct call but we didn't
- * find one earlier?
- */
- if (WARN_ON(rec->flags & FTRACE_FL_DIRECT))
- goto out_unlock;
-
- /* Make sure the ip points to the exact record */
- if (ip != rec->ip) {
- ip = rec->ip;
- /* Need to check this ip for a direct. */
- if (ftrace_find_rec_direct(ip))
- goto out_unlock;
- }
-
- ret = -ENOMEM;
- direct = ftrace_find_direct_func(addr);
- if (!direct) {
- direct = ftrace_alloc_direct_func(addr);
- if (!direct)
- goto out_unlock;
- }
-
- entry = ftrace_add_rec_direct(ip, addr, &free_hash);
- if (!entry)
- goto out_unlock;
-
- ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
-
- if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
- ret = register_ftrace_function_nolock(&direct_ops);
- if (ret)
- ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
- }
-
- if (ret) {
- remove_hash_entry(direct_functions, entry);
- kfree(entry);
- if (!direct->count) {
- list_del_rcu(&direct->next);
- synchronize_rcu_tasks();
- kfree(direct);
- if (free_hash)
- free_ftrace_hash(free_hash);
- free_hash = NULL;
- ftrace_direct_func_count--;
- }
- } else {
- direct->count++;
- }
- out_unlock:
- mutex_unlock(&direct_mutex);
-
- if (free_hash) {
- synchronize_rcu_tasks();
- free_ftrace_hash(free_hash);
- }
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(register_ftrace_direct);
-
-static struct ftrace_func_entry *find_direct_entry(unsigned long *ip,
- struct dyn_ftrace **recp)
-{
- struct ftrace_func_entry *entry;
- struct dyn_ftrace *rec;
-
- rec = lookup_rec(*ip, *ip);
- if (!rec)
- return NULL;
-
- entry = __ftrace_lookup_ip(direct_functions, rec->ip);
- if (!entry) {
- WARN_ON(rec->flags & FTRACE_FL_DIRECT);
- return NULL;
- }
-
- WARN_ON(!(rec->flags & FTRACE_FL_DIRECT));
-
- /* Passed in ip just needs to be on the call site */
- *ip = rec->ip;
-
- if (recp)
- *recp = rec;
-
- return entry;
-}
-
-int unregister_ftrace_direct(unsigned long ip, unsigned long addr)
-{
- struct ftrace_direct_func *direct;
- struct ftrace_func_entry *entry;
- struct ftrace_hash *hash;
- int ret = -ENODEV;
-
- mutex_lock(&direct_mutex);
-
- ip = ftrace_location(ip);
- if (!ip)
- goto out_unlock;
-
- entry = find_direct_entry(&ip, NULL);
- if (!entry)
- goto out_unlock;
-
- hash = direct_ops.func_hash->filter_hash;
- if (hash->count == 1)
- unregister_ftrace_function(&direct_ops);
-
- ret = ftrace_set_filter_ip(&direct_ops, ip, 1, 0);
-
- WARN_ON(ret);
-
- remove_hash_entry(direct_functions, entry);
-
- direct = ftrace_find_direct_func(addr);
- if (!WARN_ON(!direct)) {
- /* This is the good path (see the ! before WARN) */
- direct->count--;
- WARN_ON(direct->count < 0);
- if (!direct->count) {
- list_del_rcu(&direct->next);
- synchronize_rcu_tasks();
- kfree(direct);
- kfree(entry);
- ftrace_direct_func_count--;
- }
- }
- out_unlock:
- mutex_unlock(&direct_mutex);
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
-
-static struct ftrace_ops stub_ops = {
- .func = ftrace_stub,
-};
-
-/**
- * ftrace_modify_direct_caller - modify ftrace nop directly
- * @entry: The ftrace hash entry of the direct helper for @rec
- * @rec: The record representing the function site to patch
- * @old_addr: The location that the site at @rec->ip currently calls
- * @new_addr: The location that the site at @rec->ip should call
- *
- * An architecture may overwrite this function to optimize the
- * changing of the direct callback on an ftrace nop location.
- * This is called with the ftrace_lock mutex held, and no other
- * ftrace callbacks are on the associated record (@rec). Thus,
- * it is safe to modify the ftrace record, where it should be
- * currently calling @old_addr directly, to call @new_addr.
- *
- * This is called with direct_mutex locked.
- *
- * Safety checks should be made to make sure that the code at
- * @rec->ip is currently calling @old_addr. And this must
- * also update entry->direct to @new_addr.
- */
-int __weak ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
- struct dyn_ftrace *rec,
- unsigned long old_addr,
- unsigned long new_addr)
-{
- unsigned long ip = rec->ip;
- int ret;
-
- lockdep_assert_held(&direct_mutex);
-
- /*
- * The ftrace_lock was used to determine if the record
- * had more than one registered user to it. If it did,
- * we needed to prevent that from changing to do the quick
- * switch. But if it did not (only a direct caller was attached)
- * then this function is called. But this function can deal
- * with attached callers to the rec that we care about, and
- * since this function uses standard ftrace calls that take
- * the ftrace_lock mutex, we need to release it.
- */
- mutex_unlock(&ftrace_lock);
-
- /*
- * By setting a stub function at the same address, we force
- * the code to call the iterator and the direct_ops helper.
- * This means that @ip does not call the direct call, and
- * we can simply modify it.
- */
- ret = ftrace_set_filter_ip(&stub_ops, ip, 0, 0);
- if (ret)
- goto out_lock;
-
- ret = register_ftrace_function_nolock(&stub_ops);
- if (ret) {
- ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
- goto out_lock;
- }
-
- entry->direct = new_addr;
-
- /*
- * By removing the stub, we put back the direct call, calling
- * the @new_addr.
- */
- unregister_ftrace_function(&stub_ops);
- ftrace_set_filter_ip(&stub_ops, ip, 1, 0);
-
- out_lock:
- mutex_lock(&ftrace_lock);
-
- return ret;
-}
-
-/**
- * modify_ftrace_direct - Modify an existing direct call to call something else
- * @ip: The instruction pointer to modify
- * @old_addr: The address that the current @ip calls directly
- * @new_addr: The address that the @ip should call
- *
- * This modifies a ftrace direct caller at an instruction pointer without
- * having to disable it first. The direct call will switch over to the
- * @new_addr without missing anything.
- *
- * Returns: zero on success. Non zero on error, which includes:
- * -ENODEV : the @ip given has no direct caller attached
- * -EINVAL : the @old_addr does not match the current direct caller
- */
-int modify_ftrace_direct(unsigned long ip,
- unsigned long old_addr, unsigned long new_addr)
-{
- struct ftrace_direct_func *direct, *new_direct = NULL;
- struct ftrace_func_entry *entry;
- struct dyn_ftrace *rec;
- int ret = -ENODEV;
-
- mutex_lock(&direct_mutex);
-
- mutex_lock(&ftrace_lock);
-
- ip = ftrace_location(ip);
- if (!ip)
- goto out_unlock;
-
- entry = find_direct_entry(&ip, &rec);
- if (!entry)
- goto out_unlock;
-
- ret = -EINVAL;
- if (entry->direct != old_addr)
- goto out_unlock;
-
- direct = ftrace_find_direct_func(old_addr);
- if (WARN_ON(!direct))
- goto out_unlock;
- if (direct->count > 1) {
- ret = -ENOMEM;
- new_direct = ftrace_alloc_direct_func(new_addr);
- if (!new_direct)
- goto out_unlock;
- direct->count--;
- new_direct->count++;
- } else {
- direct->addr = new_addr;
- }
-
- /*
- * If there's no other ftrace callback on the rec->ip location,
- * then it can be changed directly by the architecture.
- * If there is another caller, then we just need to change the
- * direct caller helper to point to @new_addr.
- */
- if (ftrace_rec_count(rec) == 1) {
- ret = ftrace_modify_direct_caller(entry, rec, old_addr, new_addr);
- } else {
- entry->direct = new_addr;
- ret = 0;
- }
-
- if (ret) {
- direct->addr = old_addr;
- if (unlikely(new_direct)) {
- direct->count++;
- list_del_rcu(&new_direct->next);
- synchronize_rcu_tasks();
- kfree(new_direct);
- ftrace_direct_func_count--;
- }
- }
-
- out_unlock:
- mutex_unlock(&ftrace_lock);
- mutex_unlock(&direct_mutex);
- return ret;
-}
-EXPORT_SYMBOL_GPL(modify_ftrace_direct);
-
-#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_REGS)
+#define MULTI_FLAGS (FTRACE_OPS_FL_DIRECT | FTRACE_OPS_FL_SAVE_ARGS)
static int check_direct_multi(struct ftrace_ops *ops)
{
@@ -5610,7 +5311,7 @@
}
/**
- * register_ftrace_direct_multi - Call a custom trampoline directly
+ * register_ftrace_direct - Call a custom trampoline directly
* for multiple functions registered in @ops
* @ops: The address of the struct ftrace_ops object
* @addr: The address of the trampoline to call at @ops functions
@@ -5631,7 +5332,7 @@
* -ENODEV - @ip does not point to a ftrace nop location (or not supported)
* -ENOMEM - There was an allocation failure.
*/
-int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int register_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
{
struct ftrace_hash *hash, *free_hash = NULL;
struct ftrace_func_entry *entry, *new;
@@ -5673,6 +5374,7 @@
ops->func = call_direct_funcs;
ops->flags = MULTI_FLAGS;
ops->trampoline = FTRACE_REGS_ADDR;
+ ops->direct_call = addr;
err = register_ftrace_function_nolock(ops);
@@ -5689,11 +5391,11 @@
}
return err;
}
-EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(register_ftrace_direct);
/**
- * unregister_ftrace_direct_multi - Remove calls to custom trampoline
- * previously registered by register_ftrace_direct_multi for @ops object.
+ * unregister_ftrace_direct - Remove calls to custom trampoline
+ * previously registered by register_ftrace_direct for @ops object.
* @ops: The address of the struct ftrace_ops object
*
* This is used to remove a direct calls to @addr from the nop locations
@@ -5704,7 +5406,8 @@
* 0 on success
* -EINVAL - The @ops object was not properly registered.
*/
-int unregister_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int unregister_ftrace_direct(struct ftrace_ops *ops, unsigned long addr,
+ bool free_filters)
{
struct ftrace_hash *hash = ops->func_hash->filter_hash;
int err;
@@ -5722,12 +5425,15 @@
/* cleanup for possible another register call */
ops->func = NULL;
ops->trampoline = 0;
+
+ if (free_filters)
+ ftrace_free_filter(ops);
return err;
}
-EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(unregister_ftrace_direct);
static int
-__modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+__modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
{
struct ftrace_hash *hash;
struct ftrace_func_entry *entry, *iter;
@@ -5743,6 +5449,7 @@
/* Enable the tmp_ops to have the same functions as the direct ops */
ftrace_ops_init(&tmp_ops);
tmp_ops.func_hash = ops->func_hash;
+ tmp_ops.direct_call = addr;
err = register_ftrace_function_nolock(&tmp_ops);
if (err)
@@ -5764,6 +5471,8 @@
entry->direct = addr;
}
}
+ /* Prevent store tearing if a trampoline concurrently accesses the value */
+ WRITE_ONCE(ops->direct_call, addr);
mutex_unlock(&ftrace_lock);
@@ -5774,7 +5483,7 @@
}
/**
- * modify_ftrace_direct_multi_nolock - Modify an existing direct 'multi' call
+ * modify_ftrace_direct_nolock - Modify an existing direct 'multi' call
* to call something else
* @ops: The address of the struct ftrace_ops object
* @addr: The address of the new trampoline to call at @ops functions
@@ -5791,19 +5500,19 @@
* Returns: zero on success. Non zero on error, which includes:
* -EINVAL - The @ops object was not properly registered.
*/
-int modify_ftrace_direct_multi_nolock(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct_nolock(struct ftrace_ops *ops, unsigned long addr)
{
if (check_direct_multi(ops))
return -EINVAL;
if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
return -EINVAL;
- return __modify_ftrace_direct_multi(ops, addr);
+ return __modify_ftrace_direct(ops, addr);
}
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi_nolock);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct_nolock);
/**
- * modify_ftrace_direct_multi - Modify an existing direct 'multi' call
+ * modify_ftrace_direct - Modify an existing direct 'multi' call
* to call something else
* @ops: The address of the struct ftrace_ops object
* @addr: The address of the new trampoline to call at @ops functions
@@ -5817,7 +5526,7 @@
* Returns: zero on success. Non zero on error, which includes:
* -EINVAL - The @ops object was not properly registered.
*/
-int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+int modify_ftrace_direct(struct ftrace_ops *ops, unsigned long addr)
{
int err;
@@ -5827,11 +5536,11 @@
return -EINVAL;
mutex_lock(&direct_mutex);
- err = __modify_ftrace_direct_multi(ops, addr);
+ err = __modify_ftrace_direct(ops, addr);
mutex_unlock(&direct_mutex);
return err;
}
-EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi);
+EXPORT_SYMBOL_GPL(modify_ftrace_direct);
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
/**
diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index a2d301f..a931d9a 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -785,7 +785,7 @@
};
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
-noinline __noclone static void trace_direct_tramp(void) { }
+static struct ftrace_ops direct;
#endif
/*
@@ -863,8 +863,9 @@
* Register direct function together with graph tracer
* and make sure we get graph trace.
*/
- ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
- (unsigned long) trace_direct_tramp);
+ ftrace_set_filter_ip(&direct, (unsigned long)DYN_FTRACE_TEST_NAME, 0, 0);
+ ret = register_ftrace_direct(&direct,
+ (unsigned long)ftrace_stub_direct_tramp);
if (ret)
goto out;
@@ -884,8 +885,9 @@
unregister_ftrace_graph(&fgraph_ops);
- ret = unregister_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
- (unsigned long) trace_direct_tramp);
+ ret = unregister_ftrace_direct(&direct,
+ (unsigned long)ftrace_stub_direct_tramp,
+ true);
if (ret)
goto out;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 95541b9..399473c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -478,6 +478,7 @@
config DEBUG_FORCE_FUNCTION_ALIGN_64B
bool "Force all function address 64B aligned"
depends on EXPERT && (X86_64 || ARM64 || PPC32 || PPC64 || ARC)
+ select FUNCTION_ALIGNMENT_64B
help
There are cases that a commit from one domain changes the function
address alignment of other domains, and cause magic performance
diff --git a/mm/memblock.c b/mm/memblock.c
index 511d478..3bc404a 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1423,6 +1423,15 @@
*/
kmemleak_alloc_phys(found, size, 0);
+ /*
+ * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP,
+ * require memory to be accepted before it can be used by the
+ * guest.
+ *
+ * Accept the memory of the allocated buffer.
+ */
+ accept_memory(found, found + size);
+
return found;
}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9da98e3..aeae943 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4580,7 +4580,7 @@
struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
struct mem_cgroup *parent;
- mem_cgroup_flush_stats();
+ mem_cgroup_flush_stats_delayed();
*pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
*pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4583f8a..50bd885 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -428,6 +428,12 @@
EXPORT_SYMBOL(nr_online_nodes);
#endif
+static bool page_contains_unaccepted(struct page *page, unsigned int order);
+static void accept_page(struct page *page, unsigned int order);
+static bool try_to_accept_memory(struct zone *zone, unsigned int order);
+static inline bool has_unaccepted_memory(void);
+static bool __free_unaccepted(struct page *page);
+
int page_group_by_mobility_disabled __read_mostly;
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
@@ -1731,6 +1737,13 @@
atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
+ if (page_contains_unaccepted(page, order)) {
+ if (order == MAX_ORDER && __free_unaccepted(page))
+ return;
+
+ accept_page(page, order);
+ }
+
/*
* Bypass PCP and place fresh pages right to the tail, primarily
* relevant for memory onlining.
@@ -1891,6 +1904,9 @@
return;
}
+ /* Accept chunks smaller than MAX_ORDER upfront */
+ accept_memory(PFN_PHYS(pfn), PFN_PHYS(pfn + nr_pages));
+
for (i = 0; i < nr_pages; i++, page++, pfn++) {
if (pageblock_aligned(pfn))
set_pageblock_migratetype(page, MIGRATE_MOVABLE);
@@ -3951,6 +3967,9 @@
if (!(alloc_flags & ALLOC_CMA))
unusable_free += zone_page_state(z, NR_FREE_CMA_PAGES);
#endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ unusable_free += zone_page_state(z, NR_UNACCEPTED);
+#endif
return unusable_free;
}
@@ -4235,6 +4254,11 @@
gfp_mask)) {
int ret;
+ if (has_unaccepted_memory()) {
+ if (try_to_accept_memory(zone, order))
+ goto try_this_zone;
+ }
+
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
/*
* Watermark failed for this zone, but see if we can
@@ -4287,6 +4311,11 @@
return page;
} else {
+ if (has_unaccepted_memory()) {
+ if (try_to_accept_memory(zone, order))
+ goto try_this_zone;
+ }
+
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
/* Try again if zone has deferred pages */
if (static_branch_unlikely(&deferred_pages)) {
@@ -6941,6 +6970,9 @@
INIT_LIST_HEAD(&zone->free_area[order].free_list[t]);
zone->free_area[order].nr_free = 0;
}
+ #ifdef CONFIG_UNACCEPTED_MEMORY
+ INIT_LIST_HEAD(&zone->unaccepted_pages);
+ #endif
}
/*
@@ -9726,3 +9758,150 @@
return false;
}
#endif /* CONFIG_ZONE_DMA */
+
+#ifdef CONFIG_UNACCEPTED_MEMORY
+
+/* Counts number of zones with unaccepted pages. */
+static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
+
+static bool lazy_accept = true;
+
+static int __init accept_memory_parse(char *p)
+{
+ if (!strcmp(p, "lazy")) {
+ lazy_accept = true;
+ return 0;
+ } else if (!strcmp(p, "eager")) {
+ lazy_accept = false;
+ return 0;
+ } else {
+ return -EINVAL;
+ }
+}
+early_param("accept_memory", accept_memory_parse);
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+ phys_addr_t start = page_to_phys(page);
+ phys_addr_t end = start + (PAGE_SIZE << order);
+
+ return range_contains_unaccepted_memory(start, end);
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+ phys_addr_t start = page_to_phys(page);
+
+ accept_memory(start, start + (PAGE_SIZE << order));
+}
+
+static bool try_to_accept_memory_one(struct zone *zone)
+{
+ unsigned long flags;
+ struct page *page;
+ bool last;
+
+ if (list_empty(&zone->unaccepted_pages))
+ return false;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ page = list_first_entry_or_null(&zone->unaccepted_pages,
+ struct page, lru);
+ if (!page) {
+ spin_unlock_irqrestore(&zone->lock, flags);
+ return false;
+ }
+
+ list_del(&page->lru);
+ last = list_empty(&zone->unaccepted_pages);
+
+ __mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+ __mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
+ spin_unlock_irqrestore(&zone->lock, flags);
+
+ accept_page(page, MAX_ORDER);
+
+ __free_pages_ok(page, MAX_ORDER, FPI_TO_TAIL);
+
+ if (last)
+ static_branch_dec(&zones_with_unaccepted_pages);
+
+ return true;
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+ long to_accept;
+ int ret = false;
+
+ /* How much to accept to get to high watermark? */
+ to_accept = high_wmark_pages(zone) -
+ (zone_page_state(zone, NR_FREE_PAGES) -
+ __zone_watermark_unusable_free(zone, order, 0));
+
+ /* Accept at least one page */
+ do {
+ if (!try_to_accept_memory_one(zone))
+ break;
+ ret = true;
+ to_accept -= MAX_ORDER_NR_PAGES;
+ } while (to_accept > 0);
+
+ return ret;
+}
+
+static inline bool has_unaccepted_memory(void)
+{
+ return static_branch_unlikely(&zones_with_unaccepted_pages);
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+ struct zone *zone = page_zone(page);
+ unsigned long flags;
+ bool first = false;
+
+ if (!lazy_accept)
+ return false;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ first = list_empty(&zone->unaccepted_pages);
+ list_add_tail(&page->lru, &zone->unaccepted_pages);
+ __mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+ __mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES);
+ spin_unlock_irqrestore(&zone->lock, flags);
+
+ if (first)
+ static_branch_inc(&zones_with_unaccepted_pages);
+
+ return true;
+}
+
+#else
+
+static bool page_contains_unaccepted(struct page *page, unsigned int order)
+{
+ return false;
+}
+
+static void accept_page(struct page *page, unsigned int order)
+{
+}
+
+static bool try_to_accept_memory(struct zone *zone, unsigned int order)
+{
+ return false;
+}
+
+static inline bool has_unaccepted_memory(void)
+{
+ return false;
+}
+
+static bool __free_unaccepted(struct page *page)
+{
+ BUILD_BUG();
+ return false;
+}
+
+#endif /* CONFIG_UNACCEPTED_MEMORY */
diff --git a/mm/shmem.c b/mm/shmem.c
index f7c08e1..f87101d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -86,6 +86,7 @@
#define BLOCKS_PER_PAGE (PAGE_SIZE/512)
#define VM_ACCT(size) (PAGE_ALIGN(size) >> PAGE_SHIFT)
+#define TMPFS_USER_XATTR_INDEX 1
/* Pretend that each entry is of this size in directory's i_size */
#define BOGO_DIRENT_SIZE 20
@@ -116,10 +117,12 @@
bool full_inums;
int huge;
int seen;
+ bool user_xattr;
#define SHMEM_SEEN_BLOCKS 1
#define SHMEM_SEEN_INODES 2
#define SHMEM_SEEN_HUGE 4
#define SHMEM_SEEN_INUMS 8
+#define SHMEM_SEEN_USER_XATTR 16
};
#ifdef CONFIG_TMPFS
@@ -3311,6 +3314,16 @@
const char *name, void *buffer, size_t size)
{
struct shmem_inode_info *info = SHMEM_I(inode);
+ struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
+
+ switch (handler->flags) {
+ case TMPFS_USER_XATTR_INDEX:
+ if (!sbinfo->user_xattr)
+ return -EOPNOTSUPP;
+ break;
+ default:
+ break;
+ }
name = xattr_full_name(handler, name);
return simple_xattr_get(&info->xattrs, name, buffer, size);
@@ -3323,8 +3336,18 @@
size_t size, int flags)
{
struct shmem_inode_info *info = SHMEM_I(inode);
+ struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
int err;
+ switch (handler->flags) {
+ case TMPFS_USER_XATTR_INDEX:
+ if (!sbinfo->user_xattr)
+ return -EOPNOTSUPP;
+ break;
+ default:
+ break;
+ }
+
name = xattr_full_name(handler, name);
err = simple_xattr_set(&info->xattrs, name, value, size, flags, NULL);
if (!err) {
@@ -3346,6 +3369,13 @@
.set = shmem_xattr_handler_set,
};
+static const struct xattr_handler shmem_user_xattr_handler = {
+ .prefix = XATTR_USER_PREFIX,
+ .flags = TMPFS_USER_XATTR_INDEX,
+ .get = shmem_xattr_handler_get,
+ .set = shmem_xattr_handler_set,
+};
+
static const struct xattr_handler *shmem_xattr_handlers[] = {
#ifdef CONFIG_TMPFS_POSIX_ACL
&posix_acl_access_xattr_handler,
@@ -3353,6 +3383,7 @@
#endif
&shmem_security_xattr_handler,
&shmem_trusted_xattr_handler,
+ &shmem_user_xattr_handler,
NULL
};
@@ -3471,6 +3502,8 @@
Opt_uid,
Opt_inode32,
Opt_inode64,
+ Opt_user_xattr,
+ Opt_nouser_xattr,
};
static const struct constant_table shmem_param_enums_huge[] = {
@@ -3492,6 +3525,8 @@
fsparam_u32 ("uid", Opt_uid),
fsparam_flag ("inode32", Opt_inode32),
fsparam_flag ("inode64", Opt_inode64),
+ fsparam_flag ("user_xattr", Opt_user_xattr),
+ fsparam_flag ("nouser_xattr", Opt_nouser_xattr),
{}
};
@@ -3595,6 +3630,14 @@
ctx->full_inums = true;
ctx->seen |= SHMEM_SEEN_INUMS;
break;
+ case Opt_user_xattr:
+ ctx->user_xattr = true;
+ ctx->seen |= SHMEM_SEEN_USER_XATTR;
+ break;
+ case Opt_nouser_xattr:
+ ctx->user_xattr = false;
+ ctx->seen |= SHMEM_SEEN_USER_XATTR;
+ break;
}
return 0;
@@ -3704,6 +3747,8 @@
sbinfo->max_inodes = ctx->inodes;
sbinfo->free_inodes = ctx->inodes - inodes;
}
+ if (ctx->seen & SHMEM_SEEN_USER_XATTR)
+ sbinfo->user_xattr = ctx->user_xattr;
/*
* Preserve previous mempolicy unless mpol remount option was specified.
diff --git a/mm/swap.c b/mm/swap.c
index 955930f..b52cf8a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -37,6 +37,7 @@
#include <linux/page_idle.h>
#include <linux/local_lock.h>
#include <linux/buffer_head.h>
+#include <linux/dma-buf.h>
#include "internal.h"
@@ -120,8 +121,13 @@
void __folio_put(struct folio *folio)
{
- if (unlikely(folio_is_zone_device(folio)))
+ if (unlikely(folio_is_zone_device(folio))) {
+ if (is_dma_buf_page(&folio->page)) {
+ folio->page.pgmap->ops->page_free(&folio->page);
+ return;
+ }
free_zone_device_page(&folio->page);
+ }
else if (unlikely(folio_test_large(folio)))
__folio_put_large(folio);
else
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9f3cfb7..15e7ccd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2810,7 +2810,7 @@
* Flush the memory cgroup stats, so that we read accurate per-memcg
* lruvec stats for heuristics.
*/
- mem_cgroup_flush_stats();
+ mem_cgroup_flush_stats_delayed();
/*
* Determine the scan balance between anon and file LRUs.
diff --git a/mm/vmstat.c b/mm/vmstat.c
index b2371d7..8d2cdb7 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1180,6 +1180,9 @@
"nr_zspages",
#endif
"nr_free_cma",
+#ifdef CONFIG_UNACCEPTED_MEMORY
+ "nr_unaccepted",
+#endif
/* enum numa_stat_item counters */
#ifdef CONFIG_NUMA
diff --git a/net/Kconfig b/net/Kconfig
index 48c33c2..f806722 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -251,6 +251,18 @@
network device refcount are using per cpu variables if this option is set.
This can be forced to N to detect underflows (with a performance drop).
+config MAX_SKB_FRAGS
+ int "Maximum number of fragments per skb_shared_info"
+ range 17 45
+ default 17
+ help
+ Having more fragments per skb_shared_info can help GRO efficiency.
+ This helps BIG TCP workloads, but might expose bugs in some
+ legacy drivers.
+ This also increases memory overhead of small packets,
+ and in drivers using build_skb().
+ If unsure, say 17.
+
config RPS
bool
depends on SMP && SYSFS
diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index ad01b1b..513f24a 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -310,9 +310,10 @@
static int bpf_sk_storage_charge(struct bpf_local_storage_map *smap,
void *owner, u32 size)
{
- int optmem_max = READ_ONCE(sysctl_optmem_max);
struct sock *sk = (struct sock *)owner;
+ int optmem_max;
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
/* same check as in sock_kmalloc() */
if (size <= optmem_max &&
atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 8dabb9a..c1c6ec1 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -425,6 +425,9 @@
return 0;
}
+ if (skb_frags_not_readable(skb))
+ goto short_copy;
+
/* Copy paged appendix. Hmm... why does this look so complicated? */
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
diff --git a/net/core/dev.c b/net/core/dev.c
index 0a5566b..6dde8d4 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2214,6 +2214,16 @@
return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
}
+static inline int deliver_skb_tx(struct sk_buff *skb,
+ struct packet_type *pt_prev,
+ struct net_device *orig_dev)
+{
+ if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
+ return -ENOMEM;
+ refcount_inc(&skb->users);
+ return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
+}
+
static inline void deliver_ptype_list_skb(struct sk_buff *skb,
struct packet_type **pt,
struct net_device *orig_dev,
@@ -2281,7 +2291,7 @@
continue;
if (pt_prev) {
- deliver_skb(skb2, pt_prev, skb->dev);
+ deliver_skb_tx(skb2, pt_prev, skb->dev);
pt_prev = ptype;
continue;
}
@@ -2318,7 +2328,7 @@
}
out_unlock:
if (pt_prev) {
- if (!skb_orphan_frags_rx(skb2, GFP_ATOMIC))
+ if (!skb_orphan_frags(skb2, GFP_ATOMIC))
pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
else
kfree_skb(skb2);
@@ -4761,6 +4771,56 @@
return rxqueue;
}
+struct page *
+__netdev_rxq_alloc_page_from_dmabuf_pool(struct netdev_rx_queue *rxq,
+ unsigned int order)
+{
+ struct dma_buf_pages_file_priv *priv;
+ struct file *dmabuf_pages_file;
+ unsigned long kvirt;
+ struct page *pg;
+ size_t offset;
+
+ rcu_read_lock();
+ dmabuf_pages_file = rcu_dereference(rxq->dmabuf_pages);
+ if (!dmabuf_pages_file || !get_file_rcu(dmabuf_pages_file)) {
+ rcu_read_unlock();
+ return NULL;
+ }
+ rcu_read_unlock();
+
+ priv = dmabuf_pages_file->private_data;
+ kvirt = gen_pool_alloc(priv->page_pool, PAGE_SIZE * (1 << order));
+ if (!kvirt)
+ goto out_err_put;
+
+ if (!PAGE_ALIGNED(kvirt)) {
+ net_err_ratelimited("dmabuf page pool allocation not aligned");
+ gen_pool_free(priv->page_pool, kvirt, PAGE_SIZE * (1 << order));
+ goto out_err_put;
+ }
+
+ /* - 1 is due to the fact that we want to avoid 0 virt address
+ * returned from the gen_pool. See comment in dma_buf_create_pages()
+ * for details.
+ */
+ offset = (kvirt >> PAGE_SHIFT) - 1;
+ pg = &priv->pages[offset];
+
+ /* pg->private holds the order of the page for freeing. */
+ pg->private = order;
+ percpu_ref_get(&pg->pgmap->ref);
+ fput(dmabuf_pages_file);
+ get_page(pg);
+ return pg;
+
+out_err_put:
+ fput(dmabuf_pages_file);
+
+ return NULL;
+}
+EXPORT_SYMBOL(__netdev_rxq_alloc_page_from_dmabuf_pool);
+
u32 bpf_prog_run_generic_xdp(struct sk_buff *skb, struct xdp_buff *xdp,
struct bpf_prog *xdp_prog)
{
diff --git a/net/core/filter.c b/net/core/filter.c
index 3a6110e..63a7051 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1215,8 +1215,8 @@
*/
static bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
{
+ int optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
u32 filter_size = bpf_prog_size(fp->prog->len);
- int optmem_max = READ_ONCE(sysctl_optmem_max);
/* same check as in sock_kmalloc() */
if (filter_size <= optmem_max &&
@@ -1546,12 +1546,13 @@
int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
{
struct bpf_prog *prog = __get_filter(fprog, sk);
- int err;
+ int err, optmem_max;
if (IS_ERR(prog))
return PTR_ERR(prog);
- if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max))
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
+ if (bpf_prog_size(prog->len) > optmem_max)
err = -ENOMEM;
else
err = reuseport_attach_prog(sk, prog);
@@ -1590,7 +1591,7 @@
int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
{
struct bpf_prog *prog;
- int err;
+ int err, optmem_max;
if (sock_flag(sk, SOCK_FILTER_LOCKED))
return -EPERM;
@@ -1618,7 +1619,8 @@
}
} else {
/* BPF_PROG_TYPE_SOCKET_FILTER */
- if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max)) {
+ optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
+ if (bpf_prog_size(prog->len) > optmem_max) {
err = -ENOMEM;
goto err_prog_put;
}
diff --git a/net/core/gro.c b/net/core/gro.c
index 352f966c..639519b 100644
--- a/net/core/gro.c
+++ b/net/core/gro.c
@@ -443,6 +443,9 @@
{
struct skb_shared_info *pinfo = skb_shinfo(skb);
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return;
+
BUG_ON(skb->end - skb->tail < grow);
memcpy(skb_tail_pointer(skb), NAPI_GRO_CB(skb)->frag0, grow);
@@ -573,7 +576,7 @@
pull:
grow = skb_gro_offset(skb) - skb_headlen(skb);
- if (grow > 0)
+ if (grow > 0 && !skb_frags_not_readable(skb))
gro_pull_from_frag0(skb, grow);
ok:
if (gro_list->count) {
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 4c1707d..63b385e 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -366,6 +366,10 @@
{
net->core.sysctl_somaxconn = SOMAXCONN;
net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
+ /* Limits per socket sk_omem_alloc usage.
+ * TCP zerocopy regular usage needs 128 KB.
+ */
+ net->core.sysctl_optmem_max = 128 * 1024;
return 0;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 8a819d0..0dc0773 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -982,11 +982,16 @@
skb_frag_size(frag), p, p_off, p_len,
copied) {
seg_len = min_t(int, p_len, len);
- vaddr = kmap_atomic(p);
- print_hex_dump(level, "skb frag: ",
- DUMP_PREFIX_OFFSET,
- 16, 1, vaddr + p_off, seg_len, false);
- kunmap_atomic(vaddr);
+ if (!is_dma_buf_page(p)) {
+ vaddr = kmap_atomic(p);
+ print_hex_dump(level, "skb frag: ",
+ DUMP_PREFIX_OFFSET, 16, 1,
+ vaddr + p_off, seg_len, false);
+ kunmap_atomic(vaddr);
+ } else {
+ printk("%sskb frag: devmem", level);
+ }
+
len -= seg_len;
if (!len)
break;
@@ -1467,8 +1472,8 @@
}
EXPORT_SYMBOL_GPL(msg_zerocopy_put_abort);
-int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
- struct msghdr *msg, int len,
+int skb_devmem_iter_stream(struct sock *sk, struct sk_buff *skb,
+ struct msghdr *msg, struct iov_iter *iov_iter, int len,
struct ubuf_info *uarg)
{
struct ubuf_info *orig_uarg = skb_zcopy(skb);
@@ -1480,12 +1485,43 @@
if (orig_uarg && uarg != orig_uarg)
return -EEXIST;
- err = __zerocopy_sg_from_iter(msg, sk, skb, &msg->msg_iter, len);
+ err = __zerocopy_sg_from_iter(msg, sk, skb, iov_iter, len);
if (err == -EFAULT || (err == -EMSGSIZE && skb->len == orig_len)) {
struct sock *save_sk = skb->sk;
/* Streams do not free skb on error. Reset to prev state. */
- iov_iter_revert(&msg->msg_iter, skb->len - orig_len);
+ iov_iter_revert(iov_iter, skb->len - orig_len);
+ skb->sk = sk;
+ ___pskb_trim(skb, orig_len);
+ skb->sk = save_sk;
+ return err;
+ }
+
+ skb_zcopy_set(skb, uarg, NULL);
+ return skb->len - orig_len;
+}
+EXPORT_SYMBOL_GPL(skb_devmem_iter_stream);
+
+int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
+ struct msghdr *msg, int len,
+ struct ubuf_info *uarg)
+{
+ struct ubuf_info *orig_uarg = skb_zcopy(skb);
+ struct iov_iter orig_iter = msg->msg_iter;
+ int err, orig_len = skb->len;
+
+ /* An skb can only point to one uarg. This edge case happens when
+ * TCP appends to an skb, but zerocopy_realloc triggered a new alloc.
+ */
+ if (orig_uarg && uarg != orig_uarg)
+ return -EEXIST;
+
+ err = __zerocopy_sg_from_iter(NULL, sk, skb, &msg->msg_iter, len);
+ if (err == -EFAULT || (err == -EMSGSIZE && skb->len == orig_len)) {
+ struct sock *save_sk = skb->sk;
+
+ /* Streams do not free skb on error. Reset to prev state. */
+ msg->msg_iter = orig_iter;
skb->sk = sk;
___pskb_trim(skb, orig_len);
skb->sk = save_sk;
@@ -1552,6 +1588,9 @@
if (skb_shared(skb) || skb_unclone(skb, gfp_mask))
return -EINVAL;
+ if (skb_frags_not_readable(skb))
+ return -EFAULT;
+
if (!num_frags)
goto release;
@@ -1722,8 +1761,10 @@
{
int headerlen = skb_headroom(skb);
unsigned int size = skb_end_offset(skb) + skb->data_len;
- struct sk_buff *n = __alloc_skb(size, gfp_mask,
- skb_alloc_rx_flag(skb), NUMA_NO_NODE);
+ struct sk_buff *n = skb_frags_not_readable(skb) ? NULL :
+ __alloc_skb(size, gfp_mask,
+ skb_alloc_rx_flag(skb),
+ NUMA_NO_NODE);
if (!n)
return NULL;
@@ -2037,9 +2078,10 @@
/*
* Allocate the copy buffer
*/
- struct sk_buff *n = __alloc_skb(newheadroom + skb->len + newtailroom,
- gfp_mask, skb_alloc_rx_flag(skb),
- NUMA_NO_NODE);
+ struct sk_buff *n = skb_frags_not_readable(skb) ? NULL :
+ __alloc_skb(newheadroom + skb->len + newtailroom,
+ gfp_mask, skb_alloc_rx_flag(skb),
+ NUMA_NO_NODE);
int oldheadroom = skb_headroom(skb);
int head_copy_len, head_copy_off;
@@ -2380,6 +2422,9 @@
*/
int i, k, eat = (skb->tail + delta) - skb->end;
+ if (skb_frags_not_readable(skb))
+ return NULL;
+
if (eat > 0 || skb_cloned(skb)) {
if (pskb_expand_head(skb, 0, eat > 0 ? eat + 128 : 0,
GFP_ATOMIC))
@@ -2533,6 +2578,9 @@
to += copy;
}
+ if (skb_frags_not_readable(skb))
+ goto fault;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
skb_frag_t *f = &skb_shinfo(skb)->frags[i];
@@ -2606,7 +2654,7 @@
{
struct page_frag *pfrag = sk_page_frag(sk);
- if (!sk_page_frag_refill(sk, pfrag))
+ if (!sk_page_frag_refill(sk, pfrag) || is_dma_buf_page(pfrag->page))
return NULL;
*len = min_t(unsigned int, *len, pfrag->size - pfrag->offset);
@@ -2935,6 +2983,9 @@
from += copy;
}
+ if (skb_frags_not_readable(skb))
+ goto fault;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
int end;
@@ -3014,6 +3065,9 @@
pos = copy;
}
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return -EFAULT;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
@@ -3114,6 +3168,9 @@
pos = copy;
}
+ if (WARN_ON_ONCE(skb_frags_not_readable(skb)))
+ return -EFAULT;
+
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
int end;
@@ -3571,7 +3628,9 @@
skb_shinfo(skb1)->frags[i] = skb_shinfo(skb)->frags[i];
skb_shinfo(skb1)->nr_frags = skb_shinfo(skb)->nr_frags;
+ skb1->devmem = skb->devmem;
skb_shinfo(skb)->nr_frags = 0;
+ skb->devmem = 0;
skb1->data_len = skb->data_len;
skb1->len += skb1->data_len;
skb->data_len = 0;
@@ -3585,11 +3644,13 @@
{
int i, k = 0;
const int nfrags = skb_shinfo(skb)->nr_frags;
+ const bool devmem = skb->devmem;
skb_shinfo(skb)->nr_frags = 0;
skb1->len = skb1->data_len = skb->len - len;
skb->len = len;
skb->data_len = len - pos;
+ skb->devmem = skb1->devmem = 0;
for (i = 0; i < nfrags; i++) {
int size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
@@ -3618,6 +3679,12 @@
pos += size;
}
skb_shinfo(skb1)->nr_frags = k;
+
+ if (skb_shinfo(skb)->nr_frags)
+ skb->devmem = devmem;
+
+ if (skb_shinfo(skb1)->nr_frags)
+ skb1->devmem = devmem;
}
/**
@@ -3853,6 +3920,9 @@
return block_limit - abs_offset;
}
+ if (skb_frags_not_readable(st->cur_skb))
+ return 0;
+
if (st->frag_idx == 0 && !st->frag_data)
st->stepped_offset += skb_headlen(st->cur_skb);
@@ -4843,6 +4913,9 @@
bool icmp_next = false;
unsigned long flags;
+ if (skb_queue_empty_lockless(q))
+ return NULL;
+
spin_lock_irqsave(&q->lock, flags);
skb = __skb_dequeue(q);
if (skb && (skb_next = skb_peek(q))) {
@@ -5451,7 +5524,10 @@
(from->pp_recycle && skb_cloned(from)))
return false;
- if (len <= skb_tailroom(to)) {
+ if (skb_frags_not_readable(from) != skb_frags_not_readable(to))
+ return false;
+
+ if (len <= skb_tailroom(to) && !skb_frags_not_readable(from)) {
if (len)
BUG_ON(skb_copy_bits(from, 0, skb_put(to, len), len));
*delta_truesize = 0;
@@ -5767,6 +5843,9 @@
if (!pskb_may_pull(skb, write_len))
return -ENOMEM;
+ if (skb_frags_not_readable(skb))
+ return -EFAULT;
+
if (!skb_cloned(skb) || skb_clone_writable(skb, write_len))
return 0;
@@ -6433,7 +6512,7 @@
{
if (skb->data_len) {
if (skb->data_len > skb->end - skb->tail ||
- skb_cloned(skb))
+ skb_cloned(skb) || skb_frags_not_readable(skb))
return;
/* Nice, we can free page frag(s) right now */
diff --git a/net/core/sock.c b/net/core/sock.c
index c50a14a..b515ab4 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -279,10 +279,6 @@
__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
-/* Maximal space eaten by iovec or ancillary data plus some space */
-int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
-EXPORT_SYMBOL(sysctl_optmem_max);
-
int sysctl_tstamp_allow_data __read_mostly = 1;
DEFINE_STATIC_KEY_FALSE(memalloc_socks_key);
@@ -1107,6 +1103,75 @@
valbool = val ? 1 : 0;
+ /* handle options that do not need to lock the socket */
+ switch (optname) {
+ case SO_DEVMEM_DONTNEED:
+ {
+ struct devmemtoken singleton_token, *tokens;
+ unsigned int num_tokens, i, j, k, pg_num = 0;
+ struct page *pgs[16];
+
+ if (sk->sk_type != SOCK_STREAM || sk->sk_protocol != IPPROTO_TCP)
+ return -EBADF;
+
+ if (optlen < sizeof(*tokens) || optlen % sizeof(*tokens))
+ return -EINVAL;
+
+ if (optlen == sizeof(*tokens)) {
+ if (copy_from_sockptr(&singleton_token, optval,
+ sizeof(*tokens))) {
+ return -EFAULT;
+ }
+ num_tokens = 1;
+ tokens = &singleton_token;
+ } else {
+ if (optlen > 4096)
+ return -EINVAL;
+ num_tokens = optlen / sizeof(*tokens);
+ tokens = kmalloc(optlen, GFP_KERNEL);
+ if (!tokens)
+ return -ENOMEM;
+ if (copy_from_sockptr(tokens, optval, optlen)) {
+ kfree(tokens);
+ return -EFAULT;
+ }
+ }
+
+ ret = 0;
+
+ xa_lock_bh(&sk->sk_pagepool);
+ for (i = 0; i < num_tokens; i++) {
+ for (j = 0; j < tokens[i].token_count; j++) {
+ struct page *pg = __xa_erase(&sk->sk_pagepool,
+ tokens[i].token_start + j);
+
+ if (pg) {
+ pgs[pg_num++] = pg;
+ if (pg_num == ARRAY_SIZE(pgs)) {
+ xa_unlock_bh(&sk->sk_pagepool);
+ for (k = 0; k < pg_num; k++)
+ put_page(pgs[k]);
+ pg_num = 0;
+ xa_lock_bh(&sk->sk_pagepool);
+ }
+ } else
+ /* -EINTR here notifies the userspace
+ * that not all tokens passed to it have
+ * been freed.
+ */
+ ret = -EINTR;
+ }
+ }
+ xa_unlock_bh(&sk->sk_pagepool);
+ for (k = 0; k < pg_num; k++)
+ put_page(pgs[k]);
+ if (num_tokens > 1)
+ kfree(tokens);
+
+ return ret;
+ }
+ }
+
sockopt_lock_sock(sk);
switch (optname) {
@@ -2616,7 +2681,7 @@
/* small safe race: SKB_TRUESIZE may differ from final skb->truesize */
if (atomic_read(&sk->sk_omem_alloc) + SKB_TRUESIZE(size) >
- READ_ONCE(sysctl_optmem_max))
+ READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return NULL;
skb = alloc_skb(size, priority);
@@ -2634,7 +2699,7 @@
*/
void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
{
- int optmem_max = READ_ONCE(sysctl_optmem_max);
+ int optmem_max = READ_ONCE(sock_net(sk)->core.sysctl_optmem_max);
if ((unsigned int)size <= optmem_max &&
atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
@@ -2789,6 +2854,12 @@
return -EINVAL;
sockc->transmit_time = get_unaligned((u64 *)CMSG_DATA(cmsg));
break;
+ case SCM_DEVMEM_OFFSET:
+ if (cmsg->cmsg_len != CMSG_LEN(2 * sizeof(u32)))
+ return -EINVAL;
+ sockc->devmem_fd = ((u32 *)CMSG_DATA(cmsg))[0];
+ sockc->devmem_offset = ((u32 *)CMSG_DATA(cmsg))[1];
+ break;
/* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */
case SCM_RIGHTS:
case SCM_CREDENTIALS:
@@ -2952,6 +3023,9 @@
{
spin_lock_bh(&sk->sk_lock.slock);
__release_sock(sk);
+
+ if (sk->sk_prot->release_cb)
+ sk->sk_prot->release_cb(sk);
spin_unlock_bh(&sk->sk_lock.slock);
}
EXPORT_SYMBOL_GPL(__sk_flush_backlog);
@@ -3497,9 +3571,6 @@
if (sk->sk_backlog.tail)
__release_sock(sk);
- /* Warning : release_cb() might need to release sk ownership,
- * ie call sock_release_ownership(sk) before us.
- */
if (sk->sk_prot->release_cb)
sk->sk_prot->release_cb(sk);
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 5b1ce65..e516d6a5 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -450,13 +450,6 @@
.proc_handler = proc_dointvec,
},
{
- .procname = "optmem_max",
- .data = &sysctl_optmem_max,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec
- },
- {
.procname = "tstamp_allow_data",
.data = &sysctl_tstamp_allow_data,
.maxlen = sizeof(int),
@@ -613,7 +606,15 @@
.mode = 0644,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
- .proc_handler = proc_dou8vec_minmax,
+ .proc_handler = proc_dou8vec_minmax
+ },
+ {
+ .procname = "optmem_max",
+ .data = &init_net.core.sysctl_optmem_max,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .extra1 = SYSCTL_ZERO,
+ .proc_handler = proc_dointvec_minmax
},
{ }
};
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index c1fb758..9fa1424 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -779,7 +779,7 @@
if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max))
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return -ENOBUFS;
gsf = memdup_sockptr(optval, optlen);
@@ -815,7 +815,7 @@
if (optlen < size0)
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max) - 4)
return -ENOBUFS;
p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -1241,7 +1241,7 @@
if (optlen < IP_MSFILTER_SIZE(0))
goto e_inval;
- if (optlen > READ_ONCE(sysctl_optmem_max)) {
+ if (optlen > READ_ONCE(net->core.sysctl_optmem_max)) {
err = -ENOBUFS;
break;
}
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 73e5821..f4d8396 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1339,6 +1339,15 @@
.extra1 = SYSCTL_ZERO,
},
{
+ .procname = "tcp_backlog_ack_defer",
+ .data = &init_net.ipv4.sysctl_tcp_backlog_ack_defer,
+ .maxlen = sizeof(u8),
+ .mode = 0644,
+ .proc_handler = proc_dou8vec_minmax,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = SYSCTL_ONE,
+ },
+ {
.procname = "tcp_reflect_tos",
.data = &init_net.ipv4.sysctl_tcp_reflect_tos,
.maxlen = sizeof(u8),
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0b7844a..440ed37 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -279,6 +279,7 @@
#include <linux/uaccess.h>
#include <asm/ioctls.h>
#include <net/busy_poll.h>
+#include <linux/dma-buf.h>
/* Track pending CMSGs. */
enum {
@@ -457,9 +458,11 @@
WRITE_ONCE(sk->sk_sndbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_wmem[1]));
WRITE_ONCE(sk->sk_rcvbuf, READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[1]));
+ tcp_scaling_ratio_init(sk);
set_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
sk_sockets_allocated_inc(sk);
+ xa_init_flags(&sk->sk_pagepool, XA_FLAGS_ALLOC1);
}
EXPORT_SYMBOL(tcp_init_sock);
@@ -1215,6 +1218,49 @@
return err;
}
+static int tcp_prepare_devmem_data(struct msghdr *msg, int devmem_fd,
+ unsigned int devmem_offset,
+ struct file **devmem_file,
+ struct iov_iter *devmem_tx_iter, size_t size)
+{
+ struct dma_buf_pages_file_priv *priv;
+ int err = 0;
+
+ *devmem_file = fget_raw(devmem_fd);
+ if (!*devmem_file) {
+ err = -EINVAL;
+ goto err;
+ }
+
+ if (!is_dma_buf_pages_file(*devmem_file)) {
+ err = -EBADF;
+ goto err_fput;
+ }
+
+ priv = (*devmem_file)->private_data;
+ if (!priv) {
+ WARN_ONCE(!priv, "dma_buf_pages_file has no private_data");
+ err = -EINTR;
+ goto err_fput;
+ }
+
+ if (devmem_offset + size > priv->dmabuf->size) {
+ err = -ENOSPC;
+ goto err_fput;
+ }
+
+ *devmem_tx_iter = priv->tx_iter;
+ iov_iter_advance(devmem_tx_iter, devmem_offset);
+
+ return 0;
+
+err_fput:
+ fput(*devmem_file);
+ *devmem_file = NULL;
+err:
+ return err;
+}
+
int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
{
struct tcp_sock *tp = tcp_sk(sk);
@@ -1226,6 +1272,8 @@
int process_backlog = 0;
bool zc = false;
long timeo;
+ struct file *devmem_file = NULL;
+ struct iov_iter devmem_tx_iter;
flags = msg->msg_flags;
@@ -1294,6 +1342,14 @@
}
}
+ if (sockc.devmem_fd) {
+ err = tcp_prepare_devmem_data(msg, sockc.devmem_fd,
+ sockc.devmem_offset, &devmem_file,
+ &devmem_tx_iter, size);
+ if (err)
+ goto out_err;
+ }
+
/* This should be in poll */
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
@@ -1407,7 +1463,17 @@
goto wait_for_space;
}
- err = skb_zerocopy_iter_stream(sk, skb, msg, copy, uarg);
+ if (devmem_file) {
+ err = skb_devmem_iter_stream(sk, skb, msg,
+ &devmem_tx_iter,
+ copy, uarg);
+ skb->devmem = 1;
+ if (err > 0)
+ iov_iter_advance(&msg->msg_iter, err);
+ } else {
+ err = skb_zerocopy_iter_stream(sk, skb, msg,
+ copy, uarg);
+ }
if (err == -EMSGSIZE || err == -EEXIST) {
tcp_mark_push(tp, skb);
goto new_segment;
@@ -1461,6 +1527,8 @@
}
out_nopush:
net_zcopy_put(uarg);
+ if (devmem_file)
+ fput(devmem_file);
return copied + copied_syn;
do_error:
@@ -1469,6 +1537,8 @@
if (copied + copied_syn)
goto out;
out_err:
+ if (devmem_file)
+ fput(devmem_file);
net_zcopy_put_abort(uarg, true);
err = sk_stream_error(sk, flags, err);
/* make sure we wake any epoll edge trigger waiter */
@@ -1849,7 +1919,7 @@
/* Make sure sk_rcvbuf is big enough to satisfy SO_RCVLOWAT hint */
int tcp_set_rcvlowat(struct sock *sk, int val)
{
- int cap;
+ int space, cap;
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
cap = sk->sk_rcvbuf >> 1;
@@ -1864,10 +1934,10 @@
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
return 0;
- val <<= 1;
- if (val > sk->sk_rcvbuf) {
- WRITE_ONCE(sk->sk_rcvbuf, val);
- tcp_sk(sk)->window_clamp = tcp_win_from_space(sk, val);
+ space = tcp_space_from_win(sk, val);
+ if (space > sk->sk_rcvbuf) {
+ WRITE_ONCE(sk->sk_rcvbuf, space);
+ tcp_sk(sk)->window_clamp = val;
}
return 0;
}
@@ -2418,6 +2488,210 @@
return inq;
}
+/* batch __xa_alloc() calls and reduce xa_lock()/xa_unlock() overhead. */
+struct tcp_xa_pool {
+ u8 max; /* max <= MAX_SKB_FRAGS */
+ u8 idx; /* idx <= max */
+ unsigned int tokens[MAX_SKB_FRAGS];
+ struct page *pgs[MAX_SKB_FRAGS];
+};
+
+static void tcp_xa_pool_commit(struct sock *sk, struct tcp_xa_pool *p,
+ bool lock)
+{
+ int i;
+
+ if (!p->max)
+ return;
+ if (lock)
+ xa_lock_bh(&sk->sk_pagepool);
+ /* Commit part that has been copied to user space. */
+ for (i = 0; i < p->idx; i++)
+ __xa_cmpxchg(&sk->sk_pagepool,
+ p->tokens[i],
+ XA_ZERO_ENTRY,
+ p->pgs[i],
+ GFP_KERNEL);
+ /* Rollback what has been pre-allocated and is no longer needed. */
+ for (; i < p->max; i++)
+ __xa_erase(&sk->sk_pagepool, p->tokens[i]);
+ if (lock)
+ xa_unlock_bh(&sk->sk_pagepool);
+ p->max = 0;
+ p->idx = 0;
+}
+
+static int tcp_xa_pool_refill(struct sock *sk, struct tcp_xa_pool *p,
+ unsigned int max_frags)
+{
+ int err, k;
+
+ if (p->idx < p->max)
+ return 0;
+
+ xa_lock_bh(&sk->sk_pagepool);
+
+ tcp_xa_pool_commit(sk, p, false);
+ for (k = 0; k < max_frags; k++) {
+ err = __xa_alloc(&sk->sk_pagepool, &p->tokens[k],
+ XA_ZERO_ENTRY, xa_limit_31b, GFP_KERNEL);
+ if (err)
+ break;
+ }
+
+ xa_unlock_bh(&sk->sk_pagepool);
+
+ p->max = k;
+ p->idx = 0;
+ return k ? 0 : err;
+}
+
+/* On error, returns the -errno. On success, returns number of bytes sent to the
+ * user. May not consume all of @len.
+ */
+static int tcp_recvmsg_devmem(struct sock *sk, const struct sk_buff *skb,
+ int offset, struct msghdr *msg, int len)
+{
+ struct devmemvec devmemvec = { 0 };
+ struct tcp_xa_pool tcp_xa_pool;
+ unsigned int start;
+ int i, copy, n;
+ int sent = 0;
+ int err = 0;
+
+ tcp_xa_pool.max = 0;
+ tcp_xa_pool.idx = 0;
+ do {
+ start = skb_headlen(skb);
+
+ if (!skb->devmem) {
+ err = -ENODEV;
+ goto out;
+ }
+
+ /* Copy header. */
+ copy = start - offset;
+ if (copy > 0) {
+ copy = min(copy, len);
+
+ n = copy_to_iter(skb->data + offset, copy, &msg->msg_iter);
+ if (n != copy) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ offset += copy;
+ len -= copy;
+
+ /* First a devmemvec for # bytes copied to user buffer. */
+ memset(&devmemvec, 0, sizeof(devmemvec));
+ devmemvec.frag_size = copy;
+ err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_HEADER,
+ sizeof(devmemvec), &devmemvec);
+ if (err || msg->msg_flags & MSG_CTRUNC) {
+ msg->msg_flags &= ~MSG_CTRUNC;
+ if (!err)
+ err = -ETOOSMALL;
+ goto out;
+ }
+
+ sent += copy;
+
+ if (len == 0)
+ goto out;
+ }
+
+ /* after that, send information of devmem pages through a
+ * sequence of cmsg
+ */
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+ struct page *page = skb_frag_page(frag);
+ struct dma_buf_pages_file_priv *priv;
+ struct page *dmabuf_pages;
+ u32 frag_offset;
+ int end;
+
+ /* skb->devmem should indicate that ALL the
+ * frags in this skb are unreadable pages.
+ * We're checking for that flag above, but also check
+ * individual pages here. If the tcp stack is not
+ * setting skb->devmem correctly, we still don't want to
+ * crash here when accessing pgmap or priv below.
+ */
+ if (!is_dma_buf_page(page)) {
+ err = -ENODEV;
+ goto out;
+ }
+
+ end = start + skb_frag_size(frag);
+ copy = end - offset;
+
+ if (copy > 0) {
+ copy = min(copy, len);
+
+ priv = container_of(page->pgmap,
+ struct dma_buf_pages_file_priv,
+ pgmap);
+
+ dmabuf_pages = priv->pages;
+ frag_offset = ((page - dmabuf_pages) << PAGE_SHIFT) +
+ skb_frag_off(frag) + offset - start;
+ devmemvec.frag_offset = frag_offset;
+ devmemvec.frag_size = copy;
+ err = tcp_xa_pool_refill(sk, &tcp_xa_pool,
+ skb_shinfo(skb)->nr_frags - i);
+ if (err)
+ goto out;
+ /* Will perform the exchange later */
+ devmemvec.frag_token = tcp_xa_pool.tokens[tcp_xa_pool.idx];
+ offset += copy;
+ len -= copy;
+
+ err = put_cmsg(msg, SOL_SOCKET, SO_DEVMEM_OFFSET,
+ sizeof(devmemvec), &devmemvec);
+ if (err || msg->msg_flags & MSG_CTRUNC) {
+ msg->msg_flags &= ~MSG_CTRUNC;
+ if (!err)
+ err = -ETOOSMALL;
+ goto out;
+ }
+
+ tcp_xa_pool.pgs[tcp_xa_pool.idx++] = page;
+ get_page(page);
+
+ sent += copy;
+
+ if (len == 0)
+ goto out;
+ }
+ start = end;
+ }
+ tcp_xa_pool_commit(sk, &tcp_xa_pool, true);
+ if (!len)
+ goto out;
+
+ /* if len is not satisfied yet, we need to go to the next frag
+ * in the frag_list to satisfy len.
+ */
+ skb = skb_shinfo(skb)->frag_list ?: skb->next;
+
+ offset = offset - start;
+ } while (skb);
+
+ if (len) {
+ err = -EFAULT;
+ goto out;
+ }
+
+out:
+ tcp_xa_pool_commit(sk, &tcp_xa_pool, true);
+ if (!sent)
+ sent = err;
+
+ return sent;
+}
+
/*
* This routine copies from a sock struct into the user buffer.
*
@@ -2430,6 +2704,7 @@
int flags, struct scm_timestamping_internal *tss,
int *cmsg_flags)
{
+ bool last_copied_devmem, last_copied_init = false;
struct tcp_sock *tp = tcp_sk(sk);
int copied = 0;
u32 peek_seq;
@@ -2608,13 +2883,42 @@
}
if (!(flags & MSG_TRUNC)) {
- err = skb_copy_datagram_msg(skb, offset, msg, used);
- if (err) {
- /* Exception. Bailout! */
- if (!copied)
- copied = -EFAULT;
+ if (last_copied_init &&
+ last_copied_devmem != skb->devmem)
break;
+
+ if (!skb->devmem) {
+ err = skb_copy_datagram_msg(skb, offset, msg,
+ used);
+ if (err) {
+ /* Exception. Bailout! */
+ if (!copied)
+ copied = -EFAULT;
+ break;
+ }
+ } else {
+ if (!(flags & MSG_SOCK_DEVMEM)) {
+ /* skb->devmem skbs can only be received
+ * with the MSG_SOCK_DEVMEM flag.
+ */
+ if (!copied)
+ copied = -EFAULT;
+
+ break;
+ }
+
+ err = tcp_recvmsg_devmem(sk, skb, offset, msg,
+ used);
+ if (err <= 0) {
+ if (!copied)
+ copied = -EFAULT;
+
+ break;
+ }
+ used = err;
}
+ last_copied_devmem = skb->devmem;
+ last_copied_init = true;
}
WRITE_ONCE(*seq, *seq + used);
@@ -3914,7 +4218,8 @@
info->tcpi_options |= TCPI_OPT_SYN_DATA;
info->tcpi_rto = jiffies_to_usecs(icsk->icsk_rto);
- info->tcpi_ato = jiffies_to_usecs(icsk->icsk_ack.ato);
+ info->tcpi_ato = jiffies_to_usecs(min(icsk->icsk_ack.ato,
+ tcp_delack_max(sk)));
info->tcpi_snd_mss = tp->mss_cache;
info->tcpi_rcv_mss = icsk->icsk_ack.rcv_mss;
@@ -4545,6 +4850,9 @@
if (crypto_ahash_update(req))
return 1;
+ if (skb_frags_not_readable(skb))
+ return 1;
+
for (i = 0; i < shi->nr_frags; ++i) {
const skb_frag_t *f = &shi->frags[i];
unsigned int offset = skb_frag_off(f);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 34460c9..a864e22 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -237,6 +237,16 @@
*/
len = skb_shinfo(skb)->gso_size ? : skb->len;
if (len >= icsk->icsk_ack.rcv_mss) {
+ /* Note: divides are still a bit expensive.
+ * For the moment, only adjust scaling_ratio
+ * when we update icsk_ack.rcv_mss.
+ */
+ if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
+ u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
+
+ do_div(val, skb->truesize);
+ tcp_sk(sk)->scaling_ratio = val ? val : 1;
+ }
icsk->icsk_ack.rcv_mss = min_t(unsigned int, len,
tcp_sk(sk)->advmss);
/* Account for possibly-removed options */
@@ -739,8 +749,8 @@
if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_moderate_rcvbuf) &&
!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) {
- int rcvmem, rcvbuf;
u64 rcvwin, grow;
+ int rcvbuf;
/* minimal window to cope with packet losses, assuming
* steady state. Add some cushion because of small variations.
@@ -752,12 +762,7 @@
do_div(grow, tp->rcvq_space.space);
rcvwin += (grow << 1);
- rcvmem = SKB_TRUESIZE(tp->advmss + MAX_TCP_HEADER);
- while (tcp_win_from_space(sk, rcvmem) < tp->advmss)
- rcvmem += 128;
-
- do_div(rcvwin, tp->advmss);
- rcvbuf = min_t(u64, rcvwin * rcvmem,
+ rcvbuf = min_t(u64, tcp_space_from_win(sk, rcvwin),
READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rmem[2]));
if (rcvbuf > sk->sk_rcvbuf) {
WRITE_ONCE(sk->sk_rcvbuf, rcvbuf);
@@ -5210,6 +5215,9 @@
for (end_of_skbs = true; skb != NULL && skb != tail; skb = n) {
n = tcp_skb_next(skb, list);
+ if (skb_frags_not_readable(skb))
+ goto skip_this;
+
/* No new bits? It is possible on ofo queue. */
if (!before(start, TCP_SKB_CB(skb)->end_seq)) {
skb = tcp_collapse_one(sk, skb, list, root);
@@ -5230,17 +5238,20 @@
break;
}
- if (n && n != tail && mptcp_skb_can_collapse(skb, n) &&
+ if (n && n != tail && !skb_frags_not_readable(n) &&
+ mptcp_skb_can_collapse(skb, n) &&
TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(n)->seq) {
end_of_skbs = false;
break;
}
+skip_this:
/* Decided to skip this, advance start seq. */
start = TCP_SKB_CB(skb)->end_seq;
}
if (end_of_skbs ||
- (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
+ (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)) ||
+ skb_frags_not_readable(skb))
return;
__skb_queue_head_init(&tmp);
@@ -5284,7 +5295,8 @@
if (!skb ||
skb == tail ||
!mptcp_skb_can_collapse(nskb, skb) ||
- (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)))
+ (TCP_SKB_CB(skb)->tcp_flags & (TCPHDR_SYN | TCPHDR_FIN)) ||
+ skb_frags_not_readable(skb))
goto end;
#ifdef CONFIG_TLS_DEVICE
if (skb->decrypted != nskb->decrypted)
@@ -5543,6 +5555,14 @@
tcp_in_quickack_mode(sk) ||
/* Protocol state mandates a one-time immediate ACK */
inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOW) {
+ /* If we are running from __release_sock() in user context,
+ * Defer the ack until tcp_release_cb().
+ */
+ if (sock_owned_by_user_nocheck(sk) &&
+ READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_backlog_ack_defer)) {
+ set_bit(TCP_ACK_DEFERRED, &sk->sk_tsq_flags);
+ return;
+ }
send_now:
tcp_send_ack(sk);
return;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index be2c807..8d40e06 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2246,6 +2246,14 @@
{
struct tcp_sock *tp = tcp_sk(sk);
+ unsigned long index;
+ struct page *page;
+
+ xa_for_each(&sk->sk_pagepool, index, page)
+ put_page(page);
+
+ xa_destroy(&sk->sk_pagepool);
+
trace_tcp_destroy_sock(sk);
tcp_clear_xmit_timers(sk);
@@ -3211,6 +3219,7 @@
net->ipv4.sysctl_tcp_comp_sack_delay_ns = NSEC_PER_MSEC;
net->ipv4.sysctl_tcp_comp_sack_slack_ns = 100 * NSEC_PER_USEC;
net->ipv4.sysctl_tcp_comp_sack_nr = 44;
+ net->ipv4.sysctl_tcp_backlog_ack_defer = 1;
net->ipv4.sysctl_tcp_fastopen = TFO_CLIENT_ENABLE;
net->ipv4.sysctl_tcp_fastopen_blackhole_timeout = 0;
atomic_set(&net->ipv4.tfo_active_disable_times, 0);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 67087da..85b6ac3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -242,6 +242,23 @@
space = min_t(u32, space, *window_clamp);
*rcv_wscale = clamp_t(int, ilog2(space) - 15,
0, TCP_MAX_WSCALE);
+ /* b/144469234 : We force WSCALE >= 12 for 4K MTU
+ * so that senders do not feel the need to send
+ * too small packets. We prefer full size (4K) packets.
+ * We also special-case MSS == 8192 for similar reason.
+ *
+ * 4K MTU : SYN packets get mss = 4108 (4096 + 12),
+ * while SYNACK packets get mss = 4096,
+ * (assuming TCP TS), courtesy of tcp_openreq_init_rwin()
+ * calling us with "mss - (ireq->tstamp_ok ? TCPOLEN_TSTAMP_ALIGNED : 0)"
+ */
+ if (mss == 4096 ||
+ mss == 4096 + TCPOLEN_TSTAMP_ALIGNED)
+ *rcv_wscale = max_t(int, *rcv_wscale, 12);
+
+ if (mss == 8192 ||
+ mss == 8192 + TCPOLEN_TSTAMP_ALIGNED)
+ *rcv_wscale = max_t(int, *rcv_wscale, 13);
}
/* Set the clamp no higher than max representable value */
(*window_clamp) = min_t(__u32, U16_MAX << (*rcv_wscale), *window_clamp);
@@ -1068,7 +1085,8 @@
#define TCP_DEFERRED_ALL (TCPF_TSQ_DEFERRED | \
TCPF_WRITE_TIMER_DEFERRED | \
TCPF_DELACK_TIMER_DEFERRED | \
- TCPF_MTU_REDUCED_DEFERRED)
+ TCPF_MTU_REDUCED_DEFERRED | \
+ TCPF_ACK_DEFERRED)
/**
* tcp_release_cb - tcp release_sock() callback
* @sk: socket
@@ -1092,16 +1110,6 @@
tcp_tsq_write(sk);
__sock_put(sk);
}
- /* Here begins the tricky part :
- * We are called from release_sock() with :
- * 1) BH disabled
- * 2) sk_lock.slock spinlock held
- * 3) socket owned by us (sk->sk_lock.owned == 1)
- *
- * But following code is meant to be called from BH handlers,
- * so we should keep BH disabled, but early release socket ownership
- */
- sock_release_ownership(sk);
if (flags & TCPF_WRITE_TIMER_DEFERRED) {
tcp_write_timer_handler(sk);
@@ -1115,6 +1123,8 @@
inet_csk(sk)->icsk_af_ops->mtu_reduced(sk);
__sock_put(sk);
}
+ if ((flags & TCPF_ACK_DEFERRED) && inet_csk_ack_scheduled(sk))
+ tcp_send_ack(sk);
}
EXPORT_SYMBOL(tcp_release_cb);
@@ -2314,7 +2324,8 @@
if (unlikely(TCP_SKB_CB(skb)->eor) ||
tcp_has_tx_tstamp(skb) ||
- !skb_pure_zcopy_same(skb, next))
+ !skb_pure_zcopy_same(skb, next) ||
+ skb->devmem != next->devmem)
return false;
len -= skb->len;
@@ -3144,6 +3155,8 @@
return false;
if (skb_cloned(skb))
return false;
+ if (skb_frags_not_readable(skb))
+ return false;
/* Some heuristics for collapsing over SACK'd could be invented */
if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED)
return false;
@@ -3948,6 +3961,20 @@
}
EXPORT_SYMBOL(tcp_connect);
+u32 tcp_delack_max(struct sock *sk)
+{
+ const struct dst_entry *dst = __sk_dst_get(sk);
+ u32 delack_max = inet_csk(sk)->icsk_delack_max;
+
+ if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
+ u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
+ u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
+
+ delack_max = min_t(u32, delack_max, delack_from_rto_min);
+ }
+ return delack_max;
+}
+
/* Send out a delayed ack, the caller does the policy checking
* to see if we should even be here. See tcp_input.c:tcp_ack_snd_check()
* for details.
@@ -3983,7 +4010,7 @@
ato = min(ato, max_ato);
}
- ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
+ ato = min_t(u32, ato, tcp_delack_max(sk));
/* Stay within the limit we were given */
timeout = jiffies + ato;
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 3ee3456..00dc2e3 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -77,7 +77,7 @@
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct ipv6hdr *ipv6h;
const struct net_offload *ops;
- int proto, nexthdr;
+ int proto, err;
struct frag_hdr *fptr;
unsigned int payload_len;
u8 *prevhdr;
@@ -87,28 +87,9 @@
bool gso_partial;
skb_reset_network_header(skb);
- nexthdr = ipv6_has_hopopt_jumbo(skb);
- if (nexthdr) {
- const int hophdr_len = sizeof(struct hop_jumbo_hdr);
- int err;
-
- err = skb_cow_head(skb, 0);
- if (err < 0)
- return ERR_PTR(err);
-
- /* remove the HBH header.
- * Layout: [Ethernet header][IPv6 header][HBH][TCP header]
- */
- memmove(skb_mac_header(skb) + hophdr_len,
- skb_mac_header(skb),
- ETH_HLEN + sizeof(struct ipv6hdr));
- skb->data += hophdr_len;
- skb->len -= hophdr_len;
- skb->network_header += hophdr_len;
- skb->mac_header += hophdr_len;
- ipv6h = (struct ipv6hdr *)skb->data;
- ipv6h->nexthdr = nexthdr;
- }
+ err = ipv6_hopopt_jumbo_remove(skb);
+ if (err)
+ return ERR_PTR(err);
nhoff = skb_network_header(skb) - skb_mac_header(skb);
if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h))))
goto out;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 532f447..509979f 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -210,7 +210,7 @@
if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max))
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max))
return -ENOBUFS;
gsf = memdup_sockptr(optval, optlen);
@@ -244,7 +244,7 @@
if (optlen < size0)
return -EINVAL;
- if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
+ if (optlen > READ_ONCE(sock_net(sk)->core.sysctl_optmem_max) - 4)
return -ENOBUFS;
p = kmalloc(optlen + 4, GFP_KERNEL);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 51882f0..3f6f5df 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2158,7 +2158,7 @@
}
}
- snaplen = skb->len;
+ snaplen = skb_frags_not_readable(skb) ? skb_headlen(skb) : skb->len;
res = run_filter(skb, sk, snaplen);
if (!res)
@@ -2281,7 +2281,7 @@
}
}
- snaplen = skb->len;
+ snaplen = skb_frags_not_readable(skb) ? skb_headlen(skb) : skb->len;
res = run_filter(skb, sk, snaplen);
if (!res)
@@ -2623,8 +2623,8 @@
nr_frags = skb_shinfo(skb)->nr_frags;
if (unlikely(nr_frags >= MAX_SKB_FRAGS)) {
- pr_err("Packet exceed the number of skb frags(%lu)\n",
- MAX_SKB_FRAGS);
+ pr_err("Packet exceed the number of skb frags(%u)\n",
+ (unsigned int)MAX_SKB_FRAGS);
return -EFAULT;
}
diff --git a/samples/Kconfig b/samples/Kconfig
index 0d81c00..e85998c 100644
--- a/samples/Kconfig
+++ b/samples/Kconfig
@@ -38,7 +38,7 @@
that hooks to wake_up_process and prints the parameters.
config SAMPLE_FTRACE_DIRECT_MULTI
- tristate "Build register_ftrace_direct_multi() example"
+ tristate "Build register_ftrace_direct() on multiple ips example"
depends on DYNAMIC_FTRACE_WITH_DIRECT_CALLS && m
depends on HAVE_SAMPLE_FTRACE_DIRECT_MULTI
help
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 39146fa..512b96d 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -93,6 +93,8 @@
#endif /* CONFIG_S390 */
+static struct ftrace_ops direct;
+
static unsigned long my_tramp = (unsigned long)my_tramp1;
static unsigned long tramps[2] = {
(unsigned long)my_tramp1,
@@ -111,7 +113,7 @@
if (ret)
continue;
t ^= 1;
- ret = modify_ftrace_direct(my_ip, my_tramp, tramps[t]);
+ ret = modify_ftrace_direct(&direct, tramps[t]);
if (!ret)
my_tramp = tramps[t];
WARN_ON_ONCE(ret);
@@ -126,7 +128,9 @@
{
int ret;
- ret = register_ftrace_direct(my_ip, my_tramp);
+ ftrace_set_filter_ip(&direct, (unsigned long) my_ip, 0, 0);
+ ret = register_ftrace_direct(&direct, my_tramp);
+
if (!ret)
simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
return ret;
@@ -135,7 +139,7 @@
static void __exit ftrace_direct_exit(void)
{
kthread_stop(simple_tsk);
- unregister_ftrace_direct(my_ip, my_tramp);
+ unregister_ftrace_direct(&direct, my_tramp, true);
}
module_init(ftrace_direct_init);
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index 6f23b14..f84af46 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -120,7 +120,7 @@
if (ret)
continue;
t ^= 1;
- ret = modify_ftrace_direct_multi(&direct, tramps[t]);
+ ret = modify_ftrace_direct(&direct, tramps[t]);
if (!ret)
my_tramp = tramps[t];
WARN_ON_ONCE(ret);
@@ -138,7 +138,7 @@
ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
- ret = register_ftrace_direct_multi(&direct, my_tramp);
+ ret = register_ftrace_direct(&direct, my_tramp);
if (!ret)
simple_tsk = kthread_run(simple_thread, NULL, "event-sample-fn");
@@ -148,13 +148,12 @@
static void __exit ftrace_direct_multi_exit(void)
{
kthread_stop(simple_tsk);
- unregister_ftrace_direct_multi(&direct, my_tramp);
- ftrace_free_filter(&direct);
+ unregister_ftrace_direct(&direct, my_tramp, true);
}
module_init(ftrace_direct_multi_init);
module_exit(ftrace_direct_multi_exit);
MODULE_AUTHOR("Jiri Olsa");
-MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct_multi()");
+MODULE_DESCRIPTION("Example use case of using modify_ftrace_direct()");
MODULE_LICENSE("GPL");
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index a9a5c90..5dcd291 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -71,13 +71,12 @@
ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
- return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+ return register_ftrace_direct(&direct, (unsigned long) my_tramp);
}
static void __exit ftrace_direct_multi_exit(void)
{
- unregister_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
- ftrace_free_filter(&direct);
+ unregister_ftrace_direct(&direct, (unsigned long) my_tramp, true);
}
module_init(ftrace_direct_multi_init);
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 14c6859..5d07683 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -70,16 +70,18 @@
#endif /* CONFIG_S390 */
+static struct ftrace_ops direct;
+
static int __init ftrace_direct_init(void)
{
- return register_ftrace_direct((unsigned long)handle_mm_fault,
- (unsigned long)my_tramp);
+ ftrace_set_filter_ip(&direct, (unsigned long) handle_mm_fault, 0, 0);
+
+ return register_ftrace_direct(&direct, (unsigned long) my_tramp);
}
static void __exit ftrace_direct_exit(void)
{
- unregister_ftrace_direct((unsigned long)handle_mm_fault,
- (unsigned long)my_tramp);
+ unregister_ftrace_direct(&direct, (unsigned long)my_tramp, true);
}
module_init(ftrace_direct_init);
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index e8f1e44..4999ab1 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -61,16 +61,18 @@
#endif /* CONFIG_S390 */
+static struct ftrace_ops direct;
+
static int __init ftrace_direct_init(void)
{
- return register_ftrace_direct((unsigned long)wake_up_process,
- (unsigned long)my_tramp);
+ ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
+
+ return register_ftrace_direct(&direct, (unsigned long) my_tramp);
}
static void __exit ftrace_direct_exit(void)
{
- unregister_ftrace_direct((unsigned long)wake_up_process,
- (unsigned long)my_tramp);
+ unregister_ftrace_direct(&direct, (unsigned long)my_tramp, true);
}
module_init(ftrace_direct_init);
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 3af5e58..81c3abc 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -43,6 +43,8 @@
printf "Skipping BTF generation for %s due to unavailability of vmlinux\n" $@ 1>&2; \
elif [ -n "$(CONFIG_RUST)" ] && $(srctree)/scripts/is_rust_module.sh $@; then \
printf "Skipping BTF generation for %s because it's a Rust module\n" $@ 1>&2; \
+ elif echo $@ | grep -q "nvidia"; then \
+ printf "Skipping BTF generation for %s because it's an Nvidia module (C++)\n" $@ 1>&2; \
else \
LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J $(PAHOLE_FLAGS) --btf_base vmlinux $@; \
$(RESOLVE_BTFIDS) -b vmlinux $@; \
diff --git a/security/Kconfig b/security/Kconfig
index e6db09a..db65667 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -204,6 +204,7 @@
source "security/apparmor/Kconfig"
source "security/loadpin/Kconfig"
source "security/yama/Kconfig"
+source "security/container/Kconfig"
source "security/safesetid/Kconfig"
source "security/lockdown/Kconfig"
source "security/landlock/Kconfig"
diff --git a/security/Makefile b/security/Makefile
index 18121f8..98c6ad1 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -21,6 +21,7 @@
obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
obj-$(CONFIG_SECURITY_SAFESETID) += safesetid/
obj-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += container/
obj-$(CONFIG_CGROUPS) += device_cgroup.o
obj-$(CONFIG_BPF_LSM) += bpf/
obj-$(CONFIG_SECURITY_LANDLOCK) += landlock/
diff --git a/security/container/Kconfig b/security/container/Kconfig
new file mode 100644
index 0000000..72a51eb
--- /dev/null
+++ b/security/container/Kconfig
@@ -0,0 +1,17 @@
+config SECURITY_CONTAINER_MONITOR
+ bool "Monitor containerized processes"
+ depends on SECURITY
+ depends on MMU
+ depends on X86_64
+ select SECURITYFS
+ help
+ Instrument the Linux kernel to collect more information about containers
+ and identify security threats.
+
+config SECURITY_CONTAINER_MONITOR_DEBUG
+ bool "Enable debug pr_devel logs"
+ depends on SECURITY_CONTAINER_MONITOR
+ help
+ Define DEBUG for CSM files to compile verbose debugging messages.
+
+ Only for debugging/testing do not enable for production.
diff --git a/security/container/Makefile b/security/container/Makefile
new file mode 100644
index 0000000..9be2528
--- /dev/null
+++ b/security/container/Makefile
@@ -0,0 +1,16 @@
+PB_CCFLAGS := -DPB_SYSTEM_HEADER="<pbsystem.h>" \
+ -DPB_NO_ERRMSG \
+ -DPB_FIELD_16BIT \
+ -DPB_BUFFER_ONLY
+export PB_CCFLAGS
+
+subdir-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos
+
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += monitor.o pb.o process.o pipe.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ -I$(srctree)/fs \
+ $(PB_CCFLAGS)
+ccflags-$(CONFIG_SECURITY_CONTAINER_MONITOR_DEBUG) += -DDEBUG
diff --git a/security/container/monitor.c b/security/container/monitor.c
new file mode 100644
index 0000000..bc1e132
--- /dev/null
+++ b/security/container/monitor.c
@@ -0,0 +1,769 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+#include "process.h"
+
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/lsm_hooks.h>
+#include <linux/module.h>
+#include <linux/pipe_fs_i.h>
+#include <linux/poll.h>
+#include <linux/rwsem.h>
+#include <linux/seq_file.h>
+#include <linux/string.h>
+#include <linux/sysctl.h>
+#include <overlayfs/overlayfs.h>
+#include <uapi/linux/magic.h>
+
+/* protects csm_*_enabled and configurations. */
+DECLARE_RWSEM(csm_rwsem_config);
+
+/* queue used for poll wait on config changes. */
+static DECLARE_WAIT_QUEUE_HEAD(config_wait);
+
+/* increase each time a new configuration is applied. */
+static unsigned long config_version;
+
+/* Stats gathered from the LSM. */
+struct container_stats csm_stats;
+
+struct container_stats_mapping {
+ const char *key;
+ size_t *value;
+};
+
+/* Key value pair mapping for the sysfs entry. */
+struct container_stats_mapping csm_stats_mapping[] = {
+ { "ProtoEncodingFailed", &csm_stats.proto_encoding_failed },
+ { "WorkQueueFailed", &csm_stats.workqueue_failed },
+ { "EventWritingFailed", &csm_stats.event_writing_failed },
+ { "SizePickingFailed", &csm_stats.size_picking_failed },
+ { "PipeAlreadyOpened", &csm_stats.pipe_already_opened },
+ { "CsmSetxattr", &csm_stats.csm_setxattr },
+};
+
+/*
+ * Is monitoring enabled? Defaults to disabled.
+ * These variables might be used without locking csm_rwsem_config to check if an
+ * LSM hook can bail quickly. The semaphore is taken later to ensure CSM is
+ * still enabled.
+ *
+ * csm_enabled is true if any collector is enabled.
+ */
+bool csm_enabled;
+static bool csm_container_enabled;
+bool csm_execute_enabled;
+bool csm_memexec_enabled;
+
+/* securityfs control files */
+static struct dentry *csm_dir;
+static struct dentry *csm_enabled_file;
+static struct dentry *csm_container_file;
+static struct dentry *csm_config_file;
+static struct dentry *csm_config_vers_file;
+static struct dentry *csm_pipe_file;
+static struct dentry *csm_stats_file;
+
+/* Pipes to forward data to user-mode. */
+DECLARE_RWSEM(csm_rwsem_pipe);
+static struct file *csm_user_read_pipe;
+struct file *csm_user_write_pipe;
+
+/* Option to disable the CSM features at boot. */
+static bool cmdline_boot_disabled;
+bool cmdline_boot_vsock_enabled;
+
+/* Options disabled by default. */
+static bool cmdline_boot_pipe_enabled;
+static bool cmdline_boot_config_enabled;
+
+/* Option to fully enabled the LSM at boot for automated testing. */
+static bool cmdline_default_enabled;
+
+static int csm_boot_disabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_disabled);
+}
+early_param("csm.disabled", csm_boot_disabled_setup);
+
+static int csm_default_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_default_enabled);
+}
+early_param("csm.default.enabled", csm_default_enabled_setup);
+
+static int csm_boot_vsock_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_vsock_enabled);
+}
+early_param("csm.vsock.enabled", csm_boot_vsock_enabled_setup);
+
+static int csm_boot_pipe_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_pipe_enabled);
+}
+early_param("csm.pipe.enabled", csm_boot_pipe_enabled_setup);
+
+static int csm_boot_config_enabled_setup(char *str)
+{
+ return kstrtobool(str, &cmdline_boot_config_enabled);
+}
+early_param("csm.config.enabled", csm_boot_config_enabled_setup);
+
+static bool pipe_in_use(void)
+{
+ struct pipe_inode_info *pipe;
+
+ lockdep_assert_held_write(&csm_rwsem_config);
+ if (csm_user_read_pipe) {
+ pipe = get_pipe_info(csm_user_read_pipe, false);
+ if (pipe)
+ return READ_ONCE(pipe->readers) > 1;
+ }
+ return false;
+}
+
+/* Close pipe, force has to be true to close pipe if it is still being used. */
+int close_pipe_files(bool force)
+{
+ if (csm_user_read_pipe) {
+ /* Pipe is still used. */
+ if (pipe_in_use()) {
+ if (!force)
+ return -EBUSY;
+ pr_warn("pipe is closed while it is still being used.\n");
+ }
+
+ fput(csm_user_read_pipe);
+ fput(csm_user_write_pipe);
+ csm_user_read_pipe = NULL;
+ csm_user_write_pipe = NULL;
+ }
+ return 0;
+}
+
+static void csm_update_config(schema_ConfigurationRequest *req)
+{
+ schema_ExecuteCollectorConfig *econf;
+ size_t i;
+ bool enumerate_processes = false;
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* This covers the scenario where a client is connected and the config
+ * transitions the execute collector from disabled to enabled. In that
+ * case there may have been execute events not sent. So they are
+ * enumerated.
+ */
+ if (!csm_execute_enabled && req->execute_config.enabled &&
+ pipe_in_use())
+ enumerate_processes = true;
+
+ csm_container_enabled = req->container_config.enabled;
+ csm_execute_enabled = req->execute_config.enabled;
+ csm_memexec_enabled = req->memexec_config.enabled;
+
+ /* csm_enabled is true if any collector is enabled. */
+ csm_enabled = csm_container_enabled || csm_execute_enabled ||
+ csm_memexec_enabled;
+
+ /* Clean-up existing configurations. */
+ kfree(csm_execute_config.envp_allowlist);
+ memset(&csm_execute_config, 0, sizeof(csm_execute_config));
+
+ if (csm_execute_enabled) {
+ econf = &req->execute_config;
+ csm_execute_config.argv_limit = econf->argv_limit;
+ csm_execute_config.envp_limit = econf->envp_limit;
+
+ /* Swap the allowlist so it is not freed on return. */
+ csm_execute_config.envp_allowlist = econf->envp_allowlist.arg;
+ econf->envp_allowlist.arg = NULL;
+ }
+
+ /* Reset all stats and close pipe if disabled. */
+ if (!csm_enabled) {
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++)
+ *csm_stats_mapping[i].value = 0;
+
+ close_pipe_files(true);
+ }
+
+ config_version++;
+ if (enumerate_processes)
+ csm_enumerate_processes();
+ wake_up(&config_wait);
+}
+
+int csm_update_config_from_buffer(void *data, size_t size)
+{
+ schema_ConfigurationRequest c = {};
+ pb_istream_t istream;
+
+ c.execute_config.envp_allowlist.funcs.decode = pb_decode_string_array;
+
+ istream = pb_istream_from_buffer(data, size);
+ if (!pb_decode(&istream, schema_ConfigurationRequest_fields, &c)) {
+ kfree(c.execute_config.envp_allowlist.arg);
+ return -EINVAL;
+ }
+
+ down_write(&csm_rwsem_config);
+ csm_update_config(&c);
+ up_write(&csm_rwsem_config);
+
+ return 0;
+}
+
+static ssize_t csm_config_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t err = 0;
+ void *mem;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ /* No partial writes. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ /* Duplicate user memory to safely parse protobuf. */
+ mem = memdup_user(buf, count);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+ err = csm_update_config_from_buffer(mem, count);
+ if (!err)
+ err = count;
+
+ kfree(mem);
+ return err;
+}
+
+static const struct file_operations csm_config_fops = {
+ .write = csm_config_write,
+};
+
+static void csm_enable(void)
+{
+ schema_ConfigurationRequest req = {};
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* Default configuration */
+ req.container_config.enabled = true;
+ req.execute_config.enabled = true;
+ req.execute_config.argv_limit = UINT_MAX;
+ req.execute_config.envp_limit = UINT_MAX;
+ req.memexec_config.enabled = true;
+ csm_update_config(&req);
+}
+
+static void csm_disable(void)
+{
+ schema_ConfigurationRequest req = {};
+
+ /* Expect the lock to be held for write before this call. */
+ lockdep_assert_held_write(&csm_rwsem_config);
+
+ /* Zero configuration disable all collectors. */
+ csm_update_config(&req);
+ pr_info("disabled\n");
+}
+
+static ssize_t csm_enabled_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ const char *str = csm_enabled ? "1\n" : "0\n";
+
+ return simple_read_from_buffer(buf, count, ppos, str, 2);
+}
+
+static ssize_t csm_enabled_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ bool enabled;
+ int err;
+
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ if (count <= 0 || count > PAGE_SIZE || *ppos)
+ return -EINVAL;
+
+ err = kstrtobool_from_user(buf, count, &enabled);
+ if (err)
+ return err;
+
+ down_write(&csm_rwsem_config);
+
+ if (enabled)
+ csm_enable();
+ else
+ csm_disable();
+
+ up_write(&csm_rwsem_config);
+
+ return count;
+}
+
+static const struct file_operations csm_enabled_fops = {
+ .read = csm_enabled_read,
+ .write = csm_enabled_write,
+};
+
+static int csm_config_version_open(struct inode *inode, struct file *file)
+{
+ /* private_data is used to keep the latest config version read. */
+ file->private_data = (void*)-1;
+ return 0;
+}
+
+static ssize_t csm_config_version_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ unsigned long version = config_version;
+ file->private_data = (void*)version;
+ return simple_read_from_buffer(buf, count, ppos, &version,
+ sizeof(version));
+}
+
+static __poll_t csm_config_version_poll(struct file *file,
+ struct poll_table_struct *poll_tab)
+{
+ if ((unsigned long)file->private_data != config_version)
+ return EPOLLIN;
+ poll_wait(file, &config_wait, poll_tab);
+ if ((unsigned long)file->private_data != config_version)
+ return EPOLLIN;
+ return 0;
+}
+
+static const struct file_operations csm_config_version_fops = {
+ .open = csm_config_version_open,
+ .read = csm_config_version_read,
+ .poll = csm_config_version_poll,
+};
+
+static int csm_pipe_open(struct inode *inode, struct file *file)
+{
+ if (!capable(CAP_SYS_ADMIN))
+ return -EPERM;
+ if (!csm_enabled)
+ return -EAGAIN;
+ return 0;
+}
+
+/* Similar to file_clone_open that is available only in 4.19 and up. */
+static inline struct file *pipe_clone_open(struct file *file)
+{
+ return dentry_open(&file->f_path, file->f_flags, file->f_cred);
+}
+
+/* Check if the pipe is still used, else recreate and dup it. */
+static struct file *csm_dup_pipe(void)
+{
+ long pipe_size = 1024 * PAGE_SIZE;
+ long actual_size;
+ struct file *pipes[2] = {NULL, NULL};
+ struct file *ret;
+ int err;
+
+ down_write(&csm_rwsem_pipe);
+
+ err = close_pipe_files(false);
+ if (err) {
+ ret = ERR_PTR(err);
+ csm_stats.pipe_already_opened++;
+ goto out;
+ }
+
+ err = create_pipe_files(pipes, O_NONBLOCK);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto out;
+ }
+
+ /*
+ * Try to increase the pipe size to 1024 pages, if there is not
+ * enough memory, pipes will stay unchanged.
+ */
+ actual_size = pipe_fcntl(pipes[0], F_SETPIPE_SZ, pipe_size);
+ if (actual_size != pipe_size)
+ pr_err("failed to resize pipe to 1024 pages, error: %ld, fallback to the default value\n",
+ actual_size);
+
+ csm_user_read_pipe = pipes[0];
+ csm_user_write_pipe = pipes[1];
+
+ /* Clone the file so we can track if the reader is still used. */
+ ret = pipe_clone_open(csm_user_read_pipe);
+
+out:
+ up_write(&csm_rwsem_pipe);
+ return ret;
+}
+
+static ssize_t csm_pipe_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ int fd;
+ ssize_t err;
+ struct file *local_pipe;
+
+ /* No partial reads. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ fd = get_unused_fd_flags(0);
+ if (fd < 0)
+ return fd;
+
+ local_pipe = csm_dup_pipe();
+ if (IS_ERR(local_pipe)) {
+ err = PTR_ERR(local_pipe);
+ local_pipe = NULL;
+ goto error;
+ }
+
+ err = simple_read_from_buffer(buf, count, ppos, &fd, sizeof(fd));
+ if (err < 0)
+ goto error;
+
+ if (err < sizeof(fd)) {
+ err = -EINVAL;
+ goto error;
+ }
+
+ /* Install the file descriptor when we know everything succeeded. */
+ fd_install(fd, local_pipe);
+
+ csm_enumerate_processes();
+
+ return err;
+
+error:
+ if (local_pipe)
+ fput(local_pipe);
+ put_unused_fd(fd);
+ return err;
+}
+
+
+static const struct file_operations csm_pipe_fops = {
+ .open = csm_pipe_open,
+ .read = csm_pipe_read,
+};
+
+static void set_container_decode_callbacks(schema_Container *container)
+{
+ container->pod_namespace.funcs.decode = pb_decode_string_field;
+ container->pod_name.funcs.decode = pb_decode_string_field;
+ container->container_name.funcs.decode = pb_decode_string_field;
+ container->container_image_uri.funcs.decode = pb_decode_string_field;
+ container->labels.funcs.decode = pb_decode_string_array;
+}
+
+static void set_container_encode_callbacks(schema_Container *container)
+{
+ container->pod_namespace.funcs.encode = pb_encode_string_field;
+ container->pod_name.funcs.encode = pb_encode_string_field;
+ container->container_name.funcs.encode = pb_encode_string_field;
+ container->container_image_uri.funcs.encode = pb_encode_string_field;
+ container->labels.funcs.encode = pb_encode_string_array;
+}
+
+static void free_container_callbacks_args(schema_Container *container)
+{
+ kfree(container->pod_namespace.arg);
+ kfree(container->pod_name.arg);
+ kfree(container->container_name.arg);
+ kfree(container->container_image_uri.arg);
+ kfree(container->labels.arg);
+}
+
+static ssize_t csm_container_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t err = 0;
+ void *mem;
+ u64 cid;
+ pb_istream_t istream;
+ struct task_struct *task;
+ schema_ContainerReport report = {};
+ schema_Event event = {};
+ schema_Container *container;
+ char *uuid = NULL;
+
+ /* Notify that this collector is not yet enabled. */
+ if (!csm_container_enabled)
+ return -EAGAIN;
+
+ /* No partial writes. */
+ if (*ppos != 0)
+ return -EINVAL;
+
+ /* Duplicate user memory to safely parse protobuf. */
+ mem = memdup_user(buf, count);
+ if (IS_ERR(mem))
+ return PTR_ERR(mem);
+
+ /* Callback to decode string in protobuf. */
+ set_container_decode_callbacks(&report.container);
+
+ istream = pb_istream_from_buffer(mem, count);
+ if (!pb_decode(&istream, schema_ContainerReport_fields, &report)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Check protobuf is as expected */
+ if (report.pid == 0 ||
+ report.container.container_id != 0) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ /* Find if the process id is linked to an existing container-id. */
+ rcu_read_lock();
+ task = find_task_by_pid_ns(report.pid, &init_pid_ns);
+ if (task) {
+ cid = audit_get_contid(task);
+ if (cid == AUDIT_CID_UNSET)
+ err = -ENOENT;
+ } else {
+ err = -ENOENT;
+ }
+ rcu_read_unlock();
+
+ if (err)
+ goto out;
+
+ uuid = kzalloc(PROCESS_UUID_SIZE, GFP_KERNEL);
+ if (!uuid)
+ goto out;
+
+ /* Provide the uuid for the top process of the container. */
+ err = get_process_uuid_by_pid(report.pid, uuid, PROCESS_UUID_SIZE);
+ if (err)
+ goto out;
+
+ /* Correct the container-id and feed the event to pipe */
+ report.has_container = true;
+ report.container.container_id = cid;
+ report.container.init_uuid.funcs.encode = pb_encode_uuid_field;
+ report.container.init_uuid.arg = uuid;
+ container = &event.event.container.container;
+ *container = report.container;
+
+ /* Use encode callback to generate the final proto. */
+ set_container_encode_callbacks(container);
+
+ event.which_event = schema_Event_container_tag;
+ event.event.container.has_container = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (!err)
+ err = count;
+
+out:
+ /* Free any allocated nanopb callback arguments. */
+ free_container_callbacks_args(&report.container);
+ kfree(uuid);
+ kfree(mem);
+ return err;
+}
+
+static const struct file_operations csm_container_fops = {
+ .write = csm_container_write,
+};
+
+static int csm_show_stats(struct seq_file *p, void *v)
+{
+ size_t i;
+
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++) {
+ seq_printf(p, "%s:\t%zu\n",
+ csm_stats_mapping[i].key,
+ *csm_stats_mapping[i].value);
+ }
+
+ return 0;
+}
+
+static int csm_stats_open(struct inode *inode, struct file *file)
+{
+ size_t i, size = 1; /* Start at one for the null byte. */
+
+ for (i = 0; i < ARRAY_SIZE(csm_stats_mapping); i++) {
+ /*
+ * Calculate the maximum length:
+ * - Length of the key
+ * - 3 additional chars :\t\n
+ * - longest unsigned 64-bit integer.
+ */
+ size += strlen(csm_stats_mapping[i].key)
+ + 3 + sizeof("18446744073709551615");
+ }
+
+ return single_open_size(file, csm_show_stats, NULL, size);
+}
+
+static const struct file_operations csm_stats_fops = {
+ .open = csm_stats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static bool is_d_overlayfs_mounted(struct dentry *dentry)
+{
+ struct super_block *mnt_sb;
+
+ if (dentry == NULL || dentry->d_inode == NULL)
+ return false;
+
+ mnt_sb = dentry->d_inode->i_sb;
+ if (mnt_sb == NULL || mnt_sb->s_magic != OVERLAYFS_SUPER_MAGIC)
+ return false;
+
+ return true;
+}
+
+static int csm_setxattr(struct user_namespace *mnt_userns,
+ struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags)
+{
+ if (csm_enabled &&
+ (audit_get_contid(current) != AUDIT_CID_UNSET) &&
+ is_d_overlayfs_mounted(dentry) &&
+ (strcmp(name, XATTR_SECURITY_CSM) == 0))
+ csm_stats.csm_setxattr++;
+
+ return 0;
+}
+
+static struct security_hook_list csm_hooks[] __lsm_ro_after_init = {
+ /* Track process execution. */
+ LSM_HOOK_INIT(bprm_check_security, csm_bprm_check_security),
+ LSM_HOOK_INIT(task_post_alloc, csm_task_post_alloc),
+ LSM_HOOK_INIT(task_exit, csm_task_exit),
+
+ /* Track memory execution */
+ LSM_HOOK_INIT(file_mprotect, csm_mprotect),
+ LSM_HOOK_INIT(mmap_file, csm_mmap_file),
+
+ /* Track file modification provenance. */
+ LSM_HOOK_INIT(file_pre_free_security, csm_file_pre_free),
+
+ /* Block modyfing csm xattr. */
+ LSM_HOOK_INIT(inode_setxattr, csm_setxattr),
+};
+
+static int __init csm_init(void)
+{
+ int err;
+
+ if (cmdline_boot_disabled)
+ return 0;
+
+ if (cmdline_boot_vsock_enabled)
+ pr_debug("vsock is deprecated, but was enabled at boot\n");
+
+ csm_dir = securityfs_create_dir("container_monitor", NULL);
+ if (IS_ERR(csm_dir)) {
+ err = PTR_ERR(csm_dir);
+ goto error;
+ }
+
+ csm_enabled_file = securityfs_create_file("enabled", 0644, csm_dir,
+ NULL, &csm_enabled_fops);
+ if (IS_ERR(csm_enabled_file)) {
+ err = PTR_ERR(csm_enabled_file);
+ goto error_rmdir;
+ }
+
+ csm_container_file = securityfs_create_file("container", 0200, csm_dir,
+ NULL, &csm_container_fops);
+ if (IS_ERR(csm_container_file)) {
+ err = PTR_ERR(csm_container_file);
+ goto error_rm_enabled;
+ }
+
+ csm_config_vers_file = securityfs_create_file("config_version", 0400,
+ csm_dir, NULL,
+ &csm_config_version_fops);
+ if (IS_ERR(csm_config_vers_file)) {
+ err = PTR_ERR(csm_config_vers_file);
+ goto error_rm_container;
+ }
+
+ if (cmdline_boot_config_enabled) {
+ csm_config_file = securityfs_create_file("config", 0200,
+ csm_dir, NULL,
+ &csm_config_fops);
+ if (IS_ERR(csm_config_file)) {
+ err = PTR_ERR(csm_config_file);
+ goto error_rm_config_vers;
+ }
+ }
+
+ if (cmdline_boot_pipe_enabled) {
+ csm_pipe_file = securityfs_create_file("pipe", 0400, csm_dir,
+ NULL, &csm_pipe_fops);
+ if (IS_ERR(csm_pipe_file)) {
+ err = PTR_ERR(csm_pipe_file);
+ goto error_rm_config;
+ }
+ }
+
+ csm_stats_file = securityfs_create_file("stats", 0400, csm_dir,
+ NULL, &csm_stats_fops);
+ if (IS_ERR(csm_stats_file)) {
+ err = PTR_ERR(csm_stats_file);
+ goto error_rm_pipe;
+ }
+
+ pr_debug("created securityfs control files\n");
+
+ security_add_hooks(csm_hooks, ARRAY_SIZE(csm_hooks), "csm");
+ pr_debug("registered hooks\n");
+
+ /* Off-by-default, only used for testing images. */
+ if (cmdline_default_enabled) {
+ down_write(&csm_rwsem_config);
+ csm_enable();
+ up_write(&csm_rwsem_config);
+ }
+
+ return 0;
+
+error_rm_pipe:
+ if (cmdline_boot_pipe_enabled)
+ securityfs_remove(csm_pipe_file);
+error_rm_config:
+ if (cmdline_boot_config_enabled)
+ securityfs_remove(csm_config_file);
+error_rm_config_vers:
+ securityfs_remove(csm_config_vers_file);
+error_rm_container:
+ securityfs_remove(csm_container_file);
+error_rm_enabled:
+ securityfs_remove(csm_enabled_file);
+error_rmdir:
+ securityfs_remove(csm_dir);
+error:
+ pr_warn("fs initialization error: %d", err);
+ return err;
+}
+
+late_initcall(csm_init);
diff --git a/security/container/monitor.h b/security/container/monitor.h
new file mode 100644
index 0000000..654a808
--- /dev/null
+++ b/security/container/monitor.h
@@ -0,0 +1,110 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#define pr_fmt(fmt) "container-security-monitor: " fmt
+
+#include <linux/kernel.h>
+#include <linux/security.h>
+#include <linux/fs.h>
+#include <linux/rwsem.h>
+#include <linux/binfmts.h>
+#include <linux/xattr.h>
+#include <config.pb.h>
+#include <event.pb.h>
+#include <pb_encode.h>
+#include <pb_decode.h>
+
+#include "monitoring_protocol.h"
+
+/* Part of the CSM configuration response. */
+#define CSM_VERSION 1
+
+/* protects csm_*_enabled and configurations. */
+extern struct rw_semaphore csm_rwsem_config;
+
+/*
+ * Is monitoring enabled? Defaults to disabled.
+ * These variables might be used as gates without locking (as processor ensures
+ * valid proper access for native scalar values) so it can bail quickly.
+ */
+extern bool csm_enabled;
+extern bool csm_execute_enabled;
+extern bool csm_memexec_enabled;
+
+/* Configuration options for execute collector. */
+struct execute_config {
+ size_t argv_limit;
+ size_t envp_limit;
+ char *envp_allowlist;
+};
+
+extern struct execute_config csm_execute_config;
+
+/* pipe to forward events to user-mode. */
+extern struct rw_semaphore csm_rwsem_pipe;
+extern struct file *csm_user_write_pipe;
+
+/* Stats on LSM events. */
+struct container_stats {
+ size_t proto_encoding_failed;
+ size_t event_writing_failed;
+ size_t workqueue_failed;
+ size_t size_picking_failed;
+ size_t pipe_already_opened;
+ size_t csm_setxattr;
+};
+
+extern struct container_stats csm_stats;
+
+/* Streams file numbers are unknown from the kernel */
+#define STDIN_FILENO 0
+#define STDOUT_FILENO 1
+#define STDERR_FILENO 2
+
+/* security attribute for file provenance. */
+#define XATTR_SECURITY_CSM XATTR_SECURITY_PREFIX "csm"
+
+/* monitor functions */
+int csm_update_config_from_buffer(void *data, size_t size);
+
+/* send event to userland */
+int csm_sendeventproto(const pb_msgdesc_t *fields, schema_Event *event);
+
+/* process events functions */
+int csm_bprm_check_security(struct linux_binprm *bprm);
+void csm_task_exit(struct task_struct *task);
+void csm_task_post_alloc(struct task_struct *task);
+int get_process_uuid_by_pid(pid_t pid_nr, char *buffer, size_t size);
+
+/* memory execution events functions */
+int csm_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+ unsigned long prot);
+int csm_mmap_file(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags);
+
+/* Tracking of file modification provenance. */
+void csm_file_pre_free(struct file *file);
+
+/* nano functions */
+bool pb_encode_string_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_decode_string_field(pb_istream_t *stream, const pb_field_t *field,
+ void **arg);
+ssize_t pb_encode_string_field_limit(pb_ostream_t *stream,
+ const pb_field_t *field,
+ void * const *arg, size_t limit);
+bool pb_encode_string_array(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_decode_string_array(pb_istream_t *stream, const pb_field_t *field,
+ void **arg);
+bool pb_encode_uuid_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_encode_ip4(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+bool pb_encode_ip6(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg);
+
diff --git a/security/container/monitoring_protocol.h b/security/container/monitoring_protocol.h
new file mode 100644
index 0000000..dbdfc9c
--- /dev/null
+++ b/security/container/monitoring_protocol.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/* Container security monitoring protocol definitions */
+
+#include <linux/types.h>
+
+enum csm_msgtype {
+ CSM_MSG_TYPE_HEARTBEAT = 1,
+ CSM_MSG_EVENT_PROTO = 2,
+ CSM_MSG_CONFIG_REQUEST_PROTO = 3,
+ CSM_MSG_CONFIG_RESPONSE_PROTO = 4,
+};
+
+struct csm_msg_hdr {
+ __le32 msg_type;
+ __le32 msg_length;
+};
+
+/* The process uuid is a 128-bits identifier */
+#define PROCESS_UUID_SIZE 16
+
+/* The entire structure forms the collision domain. */
+union process_uuid {
+ struct {
+ __u32 machineid;
+ __u64 start_time;
+ __u32 tgid;
+ } __attribute__((packed));
+ __u8 data[PROCESS_UUID_SIZE];
+};
diff --git a/security/container/pb.c b/security/container/pb.c
new file mode 100644
index 0000000..1cc7ecf
--- /dev/null
+++ b/security/container/pb.c
@@ -0,0 +1,174 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/string.h>
+#include <net/tcp.h>
+#include <net/ipv6.h>
+
+bool pb_encode_string_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ const uint8_t *str = (const uint8_t *)*arg;
+
+ /* If the string is not set, skip this string. */
+ if (!str)
+ return true;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ return pb_encode_string(stream, str, strlen(str));
+}
+
+bool pb_decode_string_field(pb_istream_t *stream, const pb_field_t *field,
+ void **arg)
+{
+ size_t size;
+ void *data;
+
+ *arg = NULL;
+
+ size = stream->bytes_left;
+
+ /* Ensure a null-byte at the end */
+ if (size + 1 < size)
+ return false;
+
+ data = kzalloc(size + 1, GFP_KERNEL);
+ if (!data)
+ return false;
+
+ if (!pb_read(stream, data, size)) {
+ kfree(data);
+ return false;
+ }
+
+ *arg = data;
+
+ return true;
+}
+
+bool pb_encode_string_array(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ char *strs = (char *)*arg;
+
+ /* If the string array is not set, skip this string array. */
+ if (!strs)
+ return true;
+
+ do {
+ if (!pb_encode_string_field(stream, field,
+ (void * const *) &strs))
+ return false;
+
+ strs += strlen(strs) + 1;
+ } while (*strs != 0);
+
+ return true;
+}
+
+/* Limit the encoded string size and return how many characters were added. */
+ssize_t pb_encode_string_field_limit(pb_ostream_t *stream,
+ const pb_field_t *field,
+ void * const *arg, size_t limit)
+{
+ char *str = (char *)*arg;
+ size_t length;
+
+ /* If the string is not set, skip this string. */
+ if (!str)
+ return 0;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return -EINVAL;
+
+ length = strlen(str);
+ if (length > limit)
+ length = limit;
+
+ if (!pb_encode_string(stream, (uint8_t *)str, length))
+ return -EINVAL;
+
+ return length;
+}
+
+bool pb_decode_string_array(pb_istream_t *stream, const pb_field_t *field,
+ void **arg)
+{
+ size_t needed, used = 0;
+ char *data, *strs;
+
+ /* String length, and two null-bytes for the end of the list. */
+ needed = stream->bytes_left + 2;
+ if (needed < stream->bytes_left)
+ return false;
+
+ if (*arg) {
+ /* Calculate used space from the current list. */
+ strs = (char *)*arg;
+ do {
+ used += strlen(strs + used) + 1;
+ } while (strs[used] != 0);
+
+ if (used + needed < needed)
+ return false;
+ }
+
+ data = krealloc(*arg, used + needed, GFP_KERNEL);
+ if (!data)
+ return false;
+
+ /* Will always be freed by the caller */
+ *arg = data;
+
+ /* Reset the new part of the buffer. */
+ memset(data + used, 0, needed);
+
+ /* Read what's in the stream buffer only. */
+ if (!pb_read(stream, data + used, stream->bytes_left))
+ return false;
+
+ return true;
+}
+
+bool pb_encode_fixed_string(pb_ostream_t *stream, const pb_field_t *field,
+ const uint8_t *data, size_t length)
+{
+ /* If the data is not set, skip this string. */
+ if (!data)
+ return true;
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ return pb_encode_string(stream, data, length);
+}
+
+
+bool pb_encode_uuid_field(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ PROCESS_UUID_SIZE);
+}
+
+bool pb_encode_ip4(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ sizeof(struct in_addr));
+}
+
+bool pb_encode_ip6(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ return pb_encode_fixed_string(stream, field, (const uint8_t *)*arg,
+ sizeof(struct in6_addr));
+}
diff --git a/security/container/pipe.c b/security/container/pipe.c
new file mode 100644
index 0000000..e78ddde
--- /dev/null
+++ b/security/container/pipe.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/pipe_fs_i.h>
+#include <linux/printk.h>
+#include <linux/ratelimit.h>
+#include <linux/uio.h>
+#include <linux/workqueue.h>
+
+/* csm protobuf work */
+static void csm_sendmsg_pipe_handler(struct work_struct *work);
+
+/* csm message work container */
+struct msg_work_data {
+ struct work_struct msg_work;
+ size_t pos_bytes_written;
+ char msg[];
+};
+
+/* Mutex to ensure sequential dumping of protos */
+static DEFINE_MUTEX(protodump);
+
+static ssize_t csm_user_pipe_write(struct kvec *vecs, size_t vecs_size,
+ size_t total_length)
+{
+ ssize_t perr = 0;
+ struct iov_iter io = { };
+ loff_t pos = 0;
+ struct pipe_inode_info *pipe;
+ unsigned int readers;
+
+ if (!csm_user_write_pipe)
+ return 0;
+
+ down_read(&csm_rwsem_pipe);
+
+ if (csm_user_write_pipe == NULL)
+ goto end;
+
+ /* The pipe info is the same for reader and write files. */
+ pipe = get_pipe_info(csm_user_write_pipe, false);
+
+ /* If nobody is listening, don't write events. */
+ readers = READ_ONCE(pipe->readers);
+ if (readers <= 1) {
+ WARN_ON(readers == 0);
+ goto end;
+ }
+
+
+ iov_iter_kvec(&io, WRITE, vecs, vecs_size, total_length);
+
+ file_start_write(csm_user_write_pipe);
+ perr = vfs_iter_write(csm_user_write_pipe, &io, &pos, 0);
+ file_end_write(csm_user_write_pipe);
+
+end:
+ up_read(&csm_rwsem_pipe);
+ return perr;
+}
+
+static int csm_sendmsg(int type, const void *buf, size_t len)
+{
+ struct csm_msg_hdr hdr = {
+ .msg_type = cpu_to_le32(type),
+ .msg_length = cpu_to_le32(sizeof(hdr) + len),
+ };
+ struct kvec vecs[] = {
+ {
+ .iov_base = &hdr,
+ .iov_len = sizeof(hdr),
+ }, {
+ .iov_base = (void *)buf,
+ .iov_len = len,
+ }
+ };
+ ssize_t perr;
+
+ perr = csm_user_pipe_write(vecs, ARRAY_SIZE(vecs),
+ le32_to_cpu(hdr.msg_length));
+ if (perr < 0) {
+ pr_warn_ratelimited("vfs_iter_write error (msg_type=%d, msg_length=%u): %zd\n",
+ type, le32_to_cpu(hdr.msg_length), perr);
+ csm_stats.event_writing_failed++;
+ }
+
+ return perr;
+}
+
+static bool csm_get_expected_size(size_t *size, const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ schema_Event *event;
+
+ if (fields != schema_Event_fields)
+ goto other;
+
+ /* Size above 99% of the 100 containers tested running k8s. */
+ event = (schema_Event *)src_struct;
+ switch (event->which_event) {
+ case schema_Event_execute_tag:
+ *size = 3344;
+ return true;
+ case schema_Event_memexec_tag:
+ *size = 176;
+ return true;
+ case schema_Event_clone_tag:
+ *size = 50;
+ return true;
+ case schema_Event_exit_tag:
+ *size = 30;
+ return true;
+ }
+
+other:
+ /* If unknown, do the pre-computation. */
+ return pb_get_encoded_size(size, fields, src_struct);
+}
+
+static struct msg_work_data *csm_encodeproto(size_t size,
+ const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ pb_ostream_t pos;
+ struct msg_work_data *wd;
+ size_t total;
+
+ total = size + sizeof(*wd);
+ if (total < size)
+ return ERR_PTR(-EINVAL);
+
+ wd = kmalloc(total, GFP_KERNEL);
+ if (!wd)
+ return ERR_PTR(-ENOMEM);
+
+ pos = pb_ostream_from_buffer(wd->msg, size);
+ if (!pb_encode(&pos, fields, src_struct)) {
+ kfree(wd);
+ return ERR_PTR(-EINVAL);
+ }
+
+ INIT_WORK(&wd->msg_work, csm_sendmsg_pipe_handler);
+ wd->pos_bytes_written = pos.bytes_written;
+ return wd;
+}
+
+static int csm_sendproto(int type, const pb_msgdesc_t *fields,
+ const void *src_struct)
+{
+ int err = 0;
+ size_t size, previous_size;
+ struct msg_work_data *wd;
+
+ /* Use the expected size first. */
+ if (!csm_get_expected_size(&size, fields, src_struct))
+ return -EINVAL;
+
+ wd = csm_encodeproto(size, fields, src_struct);
+ if (IS_ERR(wd)) {
+ /* If it failed, retry with the exact size. */
+ csm_stats.size_picking_failed++;
+ previous_size = size;
+
+ if (!pb_get_encoded_size(&size, fields, src_struct))
+ return -EINVAL;
+
+ wd = csm_encodeproto(size, fields, src_struct);
+ if (IS_ERR(wd)) {
+ csm_stats.proto_encoding_failed++;
+ return PTR_ERR(wd);
+ }
+
+ pr_debug("size picking failed %lu vs %lu\n", previous_size,
+ size);
+ }
+
+ /* The work handler takes care of cleanup, if successfully scheduled. */
+ if (likely(schedule_work(&wd->msg_work)))
+ return 0;
+
+ csm_stats.workqueue_failed++;
+ pr_err_ratelimited("Sent msg to workqueue unsuccessfully (assume dropped).\n");
+
+ kfree(wd);
+ return err;
+}
+
+static void csm_sendmsg_pipe_handler(struct work_struct *work)
+{
+ int err;
+ int type = CSM_MSG_EVENT_PROTO;
+ struct msg_work_data *wd = container_of(work, struct msg_work_data,
+ msg_work);
+
+ err = csm_sendmsg(type, wd->msg, wd->pos_bytes_written);
+ if (err < 0)
+ pr_err_ratelimited("csm_sendmsg failed in work handler %s\n",
+ __func__);
+
+ kfree(wd);
+}
+
+int csm_sendeventproto(const pb_msgdesc_t *fields, schema_Event *event)
+{
+ /* Last check before generating and sending an event. */
+ if (!csm_enabled)
+ return -ENOTSUPP;
+
+ event->timestamp = ktime_get_real_ns();
+
+ return csm_sendproto(CSM_MSG_EVENT_PROTO, fields, event);
+}
diff --git a/security/container/process.c b/security/container/process.c
new file mode 100644
index 0000000..22a516f
--- /dev/null
+++ b/security/container/process.c
@@ -0,0 +1,1167 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2018 Google, Inc
+ */
+
+#include "monitor.h"
+
+#include <linux/atomic.h>
+#include <linux/audit.h>
+#include <linux/file.h>
+#include <linux/highmem.h>
+#include <linux/mempool.h>
+#include <linux/mm.h>
+#include <linux/mount.h>
+#include <linux/notifier.h>
+#include <linux/net.h>
+#include <linux/path.h>
+#include <linux/pid.h>
+#include <linux/pid_namespace.h>
+#include <linux/random.h>
+#include <linux/rcupdate.h>
+#include <linux/sched.h>
+#include <linux/sched/signal.h>
+#include <linux/sched/task.h>
+#include <linux/slab.h>
+#include <linux/socket.h>
+#include <linux/timekeeping.h>
+#include <linux/vmalloc.h>
+#include <linux/workqueue.h>
+#include <linux/xattr.h>
+#include <net/ipv6.h>
+#include <net/sock.h>
+#include <net/tcp.h>
+#include <overlayfs/overlayfs.h>
+#include <uapi/linux/magic.h>
+#include <uapi/asm/mman.h>
+
+/* Configuration options for execute collector. */
+struct execute_config csm_execute_config;
+
+/* unique atomic value for the machine boot instance */
+static atomic_t machine_rand = ATOMIC_INIT(0);
+
+/* sequential container identifier */
+static atomic_t contid = ATOMIC_INIT(0);
+
+/* Generation id for each enumeration invocation. */
+static atomic_t enumeration_count = ATOMIC_INIT(0);
+
+struct file_provenance {
+ /* pid of the process doing the first write. */
+ pid_t tgid;
+ /* start_time of the process to uniquely identify it. */
+ u64 start_time;
+};
+
+struct csm_enumerate_processes_work_data {
+ struct work_struct work;
+ int enumeration_count;
+};
+
+static void *kmap_argument_stack(struct linux_binprm *bprm, void **ctx)
+{
+ char *argv;
+ int err;
+ unsigned long i, pos, count;
+ void *map;
+ struct page *page;
+
+ /* vma_pages() returns the number of pages reserved for the stack */
+ count = vma_pages(bprm->vma);
+
+ if (likely(count == 1)) {
+ err = get_user_pages_remote(bprm->mm, bprm->p, 1,
+ FOLL_FORCE, &page, NULL, NULL);
+ if (err != 1)
+ return NULL;
+
+ argv = kmap(page);
+ *ctx = page;
+ } else {
+ /*
+ * If more than one pages is needed, copy all of them to a set
+ * of pages. Parsing the argument across kmap pages in different
+ * addresses would make it impractical.
+ */
+ argv = vmalloc(count * PAGE_SIZE);
+ if (!argv)
+ return NULL;
+
+ for (i = 0; i < count; i++) {
+ pos = ALIGN_DOWN(bprm->p, PAGE_SIZE) + i * PAGE_SIZE;
+ err = get_user_pages_remote(bprm->mm, pos, 1,
+ FOLL_FORCE, &page, NULL,
+ NULL);
+ if (err <= 0) {
+ vfree(argv);
+ return NULL;
+ }
+
+ map = kmap(page);
+ memcpy(argv + i * PAGE_SIZE, map, PAGE_SIZE);
+ kunmap(page);
+ put_page(page);
+ }
+ *ctx = bprm;
+ }
+
+ return argv;
+}
+
+static void kunmap_argument_stack(struct linux_binprm *bprm, void *addr,
+ void *ctx)
+{
+ struct page *page;
+
+ if (!addr)
+ return;
+
+ if (likely(vma_pages(bprm->vma) == 1)) {
+ page = (struct page *)ctx;
+ kunmap(page);
+ put_page(ctx);
+ } else {
+ vfree(addr);
+ }
+}
+
+static char *find_array_next_entry(char *array, unsigned long *offset,
+ unsigned long end)
+{
+ char *entry;
+ unsigned long off = *offset;
+
+ if (off >= end)
+ return NULL;
+
+ /* Check the entry is null terminated and in bound */
+ entry = array + off;
+ while (array[off]) {
+ if (++off >= end)
+ return NULL;
+ }
+
+ /* Pass the null byte for the next iteration */
+ *offset = off + 1;
+
+ return entry;
+}
+
+struct string_arr_ctx {
+ struct linux_binprm *bprm;
+ void *stack;
+};
+
+static size_t get_config_limit(size_t *config_ptr)
+{
+ lockdep_assert_held_read(&csm_rwsem_config);
+
+ /*
+ * If execute is not enabled, do not capture arguments.
+ * The event proto won't be sent anyway.
+ */
+ if (!csm_execute_enabled)
+ return 0;
+
+ return *config_ptr;
+}
+
+static bool encode_current_argv(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ struct string_arr_ctx *ctx = (struct string_arr_ctx *)*arg;
+ int i;
+ struct linux_binprm *bprm = ctx->bprm;
+ unsigned long offset = bprm->p % PAGE_SIZE;
+ unsigned long end = vma_pages(bprm->vma) * PAGE_SIZE;
+ char *argv = ctx->stack;
+ char *entry;
+ size_t limit, used = 0;
+ ssize_t ret;
+
+ limit = get_config_limit(&csm_execute_config.argv_limit);
+ if (!limit)
+ return true;
+
+ for (i = 0; i < bprm->argc; i++) {
+ entry = find_array_next_entry(argv, &offset, end);
+ if (!entry)
+ return false;
+
+ ret = pb_encode_string_field_limit(stream, field,
+ (void * const *)&entry,
+ limit - used);
+ if (ret < 0)
+ return false;
+
+ used += ret;
+
+ if (used >= limit)
+ break;
+ }
+
+ return true;
+}
+
+static bool check_envp_allowlist(char *envp)
+{
+ bool ret = false;
+ char *strs, *equal;
+ size_t str_size, equal_pos;
+
+ /* If execute is not enabled, skip all. */
+ if (!csm_execute_enabled)
+ goto out;
+
+ /* No filter, allow all. */
+ strs = csm_execute_config.envp_allowlist;
+ if (!strs) {
+ ret = true;
+ goto out;
+ }
+
+ /*
+ * Identify the key=value separation.
+ * If none exists use the whole string as a key.
+ */
+ equal = strchr(envp, '=');
+ equal_pos = equal ? (equal - envp) : strlen(envp);
+
+ /* Default to skip if no match found. */
+ ret = false;
+
+ do {
+ str_size = strlen(strs);
+
+ /*
+ * If the filter length align with the key value equal sign,
+ * it might be a match, check the key value.
+ */
+ if (str_size == equal_pos &&
+ !strncmp(strs, envp, str_size)) {
+ ret = true;
+ goto out;
+ }
+
+ strs += str_size + 1;
+ } while (*strs != 0);
+
+out:
+ return ret;
+}
+
+static bool encode_current_envp(pb_ostream_t *stream, const pb_field_t *field,
+ void * const *arg)
+{
+ struct string_arr_ctx *ctx = (struct string_arr_ctx *)*arg;
+ int i;
+ struct linux_binprm *bprm = ctx->bprm;
+ unsigned long offset = bprm->p % PAGE_SIZE;
+ unsigned long end = vma_pages(bprm->vma) * PAGE_SIZE;
+ char *argv = ctx->stack;
+ char *entry;
+ size_t limit, used = 0;
+ ssize_t ret;
+
+ limit = get_config_limit(&csm_execute_config.envp_limit);
+ if (!limit)
+ return true;
+
+ /* Skip arguments */
+ for (i = 0; i < bprm->argc; i++) {
+ if (!find_array_next_entry(argv, &offset, end))
+ return false;
+ }
+
+ for (i = 0; i < bprm->envc; i++) {
+ entry = find_array_next_entry(argv, &offset, end);
+ if (!entry)
+ return false;
+
+ if (!check_envp_allowlist(entry))
+ continue;
+
+ ret = pb_encode_string_field_limit(stream, field,
+ (void * const *)&entry,
+ limit - used);
+ if (ret < 0)
+ return false;
+
+ used += ret;
+
+ if (used >= limit)
+ break;
+ }
+
+ return true;
+}
+
+static bool is_overlayfs_mounted(struct file *file)
+{
+ struct vfsmount *mnt;
+ struct super_block *mnt_sb;
+
+ mnt = file->f_path.mnt;
+ if (mnt == NULL)
+ return false;
+
+ mnt_sb = mnt->mnt_sb;
+ if (mnt_sb == NULL || mnt_sb->s_magic != OVERLAYFS_SUPER_MAGIC)
+ return false;
+
+ return true;
+}
+
+/*
+ * Before the process starts, identify a possible container by checking if the
+ * task is on a pid namespace and the target file is using an overlayfs mounting
+ * point. This check is valid for COS and GKE but not all existing containers.
+ */
+static bool is_possible_container(struct task_struct *task,
+ struct file *file)
+{
+ if (task_active_pid_ns(task) == &init_pid_ns)
+ return false;
+
+ return is_overlayfs_mounted(file);
+}
+
+/*
+ * Generates a random identifier for this boot instance.
+ * This identifier is generated only when needed to increase the entropy
+ * available compared to doing it at early boot.
+ */
+static u32 get_machine_id(void)
+{
+ u32 machineid, old;
+
+ machineid = atomic_read(&machine_rand);
+
+ if (unlikely(machineid == 0)) {
+ machineid = (int)get_random_u32();
+ if (machineid == 0)
+ machineid = 1;
+ old = atomic_cmpxchg(&machine_rand, 0, machineid);
+
+ /* If someone beat us, use their value. */
+ if (old != 0)
+ machineid = old;
+ }
+
+ return (u32)machineid;
+}
+
+/*
+ * Generate a 128-bit unique identifier for the process by appending:
+ * - A machine identifier unique per boot.
+ * - The start time of the process in nanoseconds.
+ * - The tgid for the set of threads in a process.
+ */
+static int get_process_uuid(struct task_struct *task, char *buffer, size_t size)
+{
+ union process_uuid *id = (union process_uuid *)buffer;
+
+ memset(buffer, 0, size);
+
+ if (WARN_ON(size < PROCESS_UUID_SIZE))
+ return -EINVAL;
+
+ id->machineid = get_machine_id();
+ id->start_time = ktime_mono_to_real(task->group_leader->start_time);
+ id->tgid = task_tgid_nr(task);
+
+ return 0;
+}
+
+int get_process_uuid_by_pid(pid_t pid_nr, char *buffer, size_t size)
+{
+ int err;
+ struct task_struct *task = NULL;
+
+ rcu_read_lock();
+ task = find_task_by_pid_ns(pid_nr, &init_pid_ns);
+ if (!task) {
+ err = -ENOENT;
+ goto out;
+ }
+ err = get_process_uuid(task, buffer, size);
+out:
+ rcu_read_unlock();
+ return err;
+}
+
+static int get_process_uuid_from_xattr(struct file *file, char *buffer,
+ size_t size)
+{
+ struct dentry *dentry;
+ int err;
+ struct file_provenance prov;
+ union process_uuid *id = (union process_uuid *)buffer;
+
+ memset(buffer, 0, size);
+
+ if (WARN_ON(size < PROCESS_UUID_SIZE))
+ return -EINVAL;
+
+ /* The file is part of overlayfs on the upper layer. */
+ if (!is_overlayfs_mounted(file))
+ return -ENODATA;
+
+ dentry = ovl_dentry_upper(file->f_path.dentry);
+ if (!dentry)
+ return -ENODATA;
+
+ err = __vfs_getxattr(dentry, dentry->d_inode,
+ XATTR_SECURITY_CSM, &prov, sizeof(prov));
+ /* returns -ENODATA if the xattr does not exist. */
+ if (err < 0)
+ return err;
+ if (err != sizeof(prov)) {
+ pr_err("unexpected size for xattr: %zu -> %d\n",
+ size, err);
+ return -ENODATA;
+ }
+
+ id->machineid = get_machine_id();
+ id->start_time = prov.start_time;
+ id->tgid = prov.tgid;
+ return 0;
+}
+
+u64 csm_set_contid(struct task_struct *task)
+{
+ u64 cid;
+ struct pid_namespace *ns;
+
+ ns = task_active_pid_ns(task);
+ if (WARN_ON(!task->audit) || WARN_ON(!ns))
+ return AUDIT_CID_UNSET;
+
+ cid = atomic_inc_return(&contid);
+ task->audit->contid = cid;
+
+ /*
+ * If the namespace container-id is not set, use the one assigned
+ * to the first process created.
+ */
+ cmpxchg(&ns->cid, 0, cid);
+ return cid;
+}
+
+u64 csm_get_ns_contid(struct pid_namespace *ns)
+{
+ if (!ns || !ns->cid)
+ return AUDIT_CID_UNSET;
+
+ return ns->cid;
+}
+
+union ip_data {
+ struct in_addr ip4;
+ struct in6_addr ip6;
+};
+
+struct file_data {
+ void *allocated;
+ union ip_data local;
+ union ip_data remote;
+ char modified_uuid[PROCESS_UUID_SIZE];
+};
+
+static void free_file_data(struct file_data *fdata)
+{
+ free_page((unsigned long)fdata->allocated);
+ fdata->allocated = NULL;
+}
+
+static void fill_socket_description(struct sockaddr_storage *saddr,
+ union ip_data *idata,
+ schema_SocketIp *schema_socketip)
+{
+ struct sockaddr_in *sin4 = (struct sockaddr_in *)saddr;
+ struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)saddr;
+
+ schema_socketip->family = saddr->ss_family;
+
+ switch (saddr->ss_family) {
+ case AF_INET:
+ schema_socketip->port = ntohs(sin4->sin_port);
+ idata->ip4 = sin4->sin_addr;
+ schema_socketip->ip.funcs.encode = pb_encode_ip4;
+ schema_socketip->ip.arg = &idata->ip4;
+ break;
+ case AF_INET6:
+ schema_socketip->port = ntohs(sin6->sin6_port);
+ idata->ip6 = sin6->sin6_addr;
+ schema_socketip->ip.funcs.encode = pb_encode_ip6;
+ schema_socketip->ip.arg = &idata->ip6;
+ break;
+ }
+}
+
+static int fill_file_overlayfs(struct file *file, schema_File *schema_file,
+ struct file_data *fdata)
+{
+ struct dentry *dentry;
+ int err;
+ schema_Overlay *overlayfs;
+
+ /* If not an overlayfs superblock, done. */
+ if (!is_overlayfs_mounted(file))
+ return 0;
+
+ dentry = file->f_path.dentry;
+ schema_file->which_filesystem = schema_File_overlayfs_tag;
+ overlayfs = &schema_file->filesystem.overlayfs;
+ overlayfs->lower_layer = ovl_dentry_lower(dentry);
+ overlayfs->upper_layer = ovl_dentry_upper(dentry);
+
+ err = get_process_uuid_from_xattr(file, fdata->modified_uuid,
+ sizeof(fdata->modified_uuid));
+ /* If there is no xattr, just skip the modified_uuid field. */
+ if (err == -ENODATA)
+ return 0;
+ if (err < 0)
+ return err;
+
+ overlayfs->modified_uuid.funcs.encode = pb_encode_uuid_field;
+ overlayfs->modified_uuid.arg = fdata->modified_uuid;
+ return 0;
+}
+
+static int fill_file_description(struct file *file, schema_File *schema_file,
+ struct file_data *fdata)
+{
+ char *buf;
+ int err;
+ u32 mode;
+ char *path;
+ struct socket *socket;
+ schema_Socket *socketfs;
+ struct sockaddr_storage saddr;
+
+ memset(fdata, 0, sizeof(*fdata));
+
+ if (file == NULL)
+ return 0;
+
+ schema_file->ino = file_inode(file)->i_ino;
+ mode = file_inode(file)->i_mode;
+
+ /* For pipes, no need to resolve the path. */
+ if (S_ISFIFO(mode))
+ return 0;
+
+ if (S_ISSOCK(mode)) {
+ socket = (struct socket *)file->private_data;
+ socketfs = &schema_file->filesystem.socket;
+
+ /* Local socket */
+ err = kernel_getsockname(socket, (struct sockaddr *)&saddr);
+ if (err >= 0) {
+ socketfs->has_local = true;
+ fill_socket_description(&saddr, &fdata->local,
+ &socketfs->local);
+ }
+
+ /* Remote socket, might not be connected. */
+ err = kernel_getpeername(socket, (struct sockaddr *)&saddr);
+ if (err >= 0) {
+ socketfs->has_remote = true;
+ fill_socket_description(&saddr, &fdata->remote,
+ &socketfs->remote);
+ }
+
+ schema_file->which_filesystem = schema_File_socket_tag;
+ return 0;
+ }
+
+ /*
+ * From this point, we care about all the other types of files as their
+ * path provides interesting insight.
+ */
+ buf = (char *)__get_free_page(GFP_KERNEL);
+ if (buf == NULL)
+ return -ENOMEM;
+
+ fdata->allocated = buf;
+
+ path = d_path(&file->f_path, buf, PAGE_SIZE);
+ if (IS_ERR(path)) {
+ free_file_data(fdata);
+ return PTR_ERR(path);
+ }
+
+ schema_file->fullpath.funcs.encode = pb_encode_string_field;
+ schema_file->fullpath.arg = path; /* buf is freed in free_file_data. */
+
+ err = fill_file_overlayfs(file, schema_file, fdata);
+ if (err) {
+ free_file_data(fdata);
+ return err;
+ }
+
+ return 0;
+}
+
+static int fill_stream_description(schema_Descriptor *desc, int fd,
+ struct file_data *fdata)
+{
+ struct fd sfd;
+ struct file *file;
+ int err = 0;
+
+ sfd = fdget(fd);
+ file = sfd.file;
+
+ if (file == NULL) {
+ memset(fdata, 0, sizeof(*fdata));
+ goto end;
+ }
+
+ desc->mode = file_inode(file)->i_mode;
+ desc->has_file = true;
+ err = fill_file_description(file, &desc->file, fdata);
+
+end:
+ fdput(sfd);
+ return err;
+}
+
+static int populate_proc_uuid_common(schema_Process *proc, char *uuid,
+ size_t uuid_size, char *parent_uuid,
+ size_t parent_uuid_size,
+ struct task_struct *task)
+{
+ int err;
+ struct task_struct *parent;
+ /* Generate unique identifier for the process and its parent */
+ err = get_process_uuid(task, uuid, uuid_size);
+ if (err)
+ return err;
+
+ proc->uuid.funcs.encode = pb_encode_uuid_field;
+ proc->uuid.arg = uuid;
+
+ rcu_read_lock();
+
+ if (!pid_alive(task))
+ goto out;
+ /*
+ * I don't think this needs to be task_rcu_dereference because
+ * real_parent is only supposed to be accessed using RCU.
+ */
+ parent = rcu_dereference(task->real_parent);
+
+ if (parent) {
+ err = get_process_uuid(parent, parent_uuid, parent_uuid_size);
+ if (!err) {
+ proc->parent_uuid.funcs.encode = pb_encode_uuid_field;
+ proc->parent_uuid.arg = parent_uuid;
+ }
+ }
+
+out:
+ rcu_read_unlock();
+
+ return err;
+}
+
+/* Populate the fields that we always want to set in Process messages. */
+static int populate_proc_common(schema_Process *proc, char *uuid,
+ size_t uuid_size, char *parent_uuid,
+ size_t parent_uuid_size,
+ struct task_struct *task)
+{
+ u64 cid;
+ struct pid_namespace *ns = task_active_pid_ns(task);
+
+ /* Container identifier for the current namespace. */
+ proc->container_id = csm_get_ns_contid(ns);
+
+ /*
+ * If the process container-id is different, the process tree is part of
+ * a different session within the namespace (kubectl/docker exec,
+ * liveness probe or others).
+ */
+ cid = audit_get_contid(task);
+ if (proc->container_id != cid)
+ proc->exec_session_id = cid;
+
+ /* Add information about pid in different namespaces */
+ proc->pid = task_tgid_nr(task);
+ proc->parent_pid = task_ppid_nr(task);
+ proc->container_pid = task_tgid_nr_ns(task, ns);
+ proc->container_parent_pid = task_ppid_nr_ns(task, ns);
+
+ return populate_proc_uuid_common(proc, uuid, uuid_size, parent_uuid,
+ parent_uuid_size, task);
+}
+
+int csm_bprm_check_security(struct linux_binprm *bprm)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_Process *proc;
+ struct string_arr_ctx argv_ctx;
+ void *stack = NULL, *ctx = NULL;
+ u64 cid;
+ struct file_data path_data = {};
+ struct file_data stdin_data = {};
+ struct file_data stdout_data = {};
+ struct file_data stderr_data = {};
+
+ /*
+ * Always create a container-id for containerized processes.
+ * If the LSM is enabled later, we can track existing containers.
+ */
+ cid = audit_get_contid(current);
+
+ if (cid == AUDIT_CID_UNSET) {
+ if (!is_possible_container(current, bprm->file))
+ return 0;
+
+ cid = csm_set_contid(current);
+
+ if (cid == AUDIT_CID_UNSET)
+ return 0;
+ }
+
+ if (!csm_execute_enabled)
+ return 0;
+
+ /* The interpreter will call us again with more context. */
+ if (bprm->buf[0] == '#' && bprm->buf[1] == '!')
+ return 0;
+
+ proc = &event.event.execute.proc;
+ err = populate_proc_common(proc, uuid, sizeof(uuid), parent_uuid,
+ sizeof(parent_uuid), current);
+ if (err)
+ goto out_free_buf;
+
+ proc->creation_timestamp = ktime_get_real_ns();
+
+ /* Provide information about the launched binary. */
+ proc->has_binary = true;
+ err = fill_file_description(bprm->file, &proc->binary, &path_data);
+ if (err)
+ goto out_free_buf;
+
+ /* Information about streams */
+ proc->has_streams = true;
+
+ proc->streams.has_stdin = true;
+ err = fill_stream_description(&proc->streams.stdin, STDIN_FILENO,
+ &stdin_data);
+ if (err)
+ goto out_free_buf;
+
+ proc->streams.has_stdout = true;
+ err = fill_stream_description(&proc->streams.stdout, STDOUT_FILENO,
+ &stdout_data);
+ if (err)
+ goto out_free_buf;
+
+ proc->streams.has_stderr = true;
+ err = fill_stream_description(&proc->streams.stderr, STDERR_FILENO,
+ &stderr_data);
+ if (err)
+ goto out_free_buf;
+
+ stack = kmap_argument_stack(bprm, &ctx);
+ if (!stack) {
+ err = -EFAULT;
+ goto out_free_buf;
+ }
+
+ /* Capture process argument */
+ argv_ctx.bprm = bprm;
+ argv_ctx.stack = stack;
+ proc->args.argv.funcs.encode = encode_current_argv;
+ proc->args.argv.arg = &argv_ctx;
+
+ /* Capture process environment variables */
+ proc->args.envp.funcs.encode = encode_current_envp;
+ proc->args.envp.arg = &argv_ctx;
+
+ event.which_event = schema_Event_execute_tag;
+ event.event.execute.has_proc = true;
+ proc->has_args = true;
+
+ /*
+ * Configurations options are checked when computing the serialized
+ * protobufs.
+ */
+ down_read(&csm_rwsem_config);
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ up_read(&csm_rwsem_config);
+
+ if (err)
+ pr_err("csm_sendeventproto returned %d on execve\n", err);
+ err = 0;
+
+out_free_buf:
+ kunmap_argument_stack(bprm, stack, ctx);
+ free_file_data(&path_data);
+ free_file_data(&stdin_data);
+ free_file_data(&stdout_data);
+ free_file_data(&stderr_data);
+
+ /*
+ * On failure, enforce it only if the execute config is enabled.
+ * If the collector was disabled, prefer to succeed to not impact the
+ * system.
+ */
+ if (unlikely(err < 0 && !csm_execute_enabled))
+ err = 0;
+
+ return err;
+}
+
+/* Create a clone event when a new task leader is created. */
+void csm_task_post_alloc(struct task_struct *task)
+{
+ int err;
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ schema_Event event = {};
+ schema_Process *proc;
+
+ if (!csm_execute_enabled ||
+ audit_get_contid(task) == AUDIT_CID_UNSET ||
+ !thread_group_leader(task))
+ return;
+
+ proc = &event.event.clone.proc;
+
+ err = populate_proc_uuid_common(proc, uuid, sizeof(uuid), parent_uuid,
+ sizeof(parent_uuid), task);
+
+ event.which_event = schema_Event_clone_tag;
+ event.event.clone.has_proc = true;
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on exit\n", err);
+}
+
+/*
+ * This LSM hook callback doesn't exist upstream and is called only when the
+ * last thread of a thread group exit.
+ */
+void csm_task_exit(struct task_struct *task)
+{
+ int err;
+ schema_Event event = {};
+ schema_ExitEvent *exit;
+ char uuid[PROCESS_UUID_SIZE];
+
+ if (!csm_execute_enabled ||
+ audit_get_contid(task) == AUDIT_CID_UNSET)
+ return;
+
+ exit = &event.event.exit;
+
+ /* Fetch the unique identifier for this process */
+ err = get_process_uuid(task, uuid, sizeof(uuid));
+ if (err) {
+ pr_err("failed to get process uuid on exit\n");
+ return;
+ }
+
+ exit->process_uuid.funcs.encode = pb_encode_uuid_field;
+ exit->process_uuid.arg = uuid;
+
+ event.which_event = schema_Event_exit_tag;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on exit\n", err);
+}
+
+int csm_mprotect(struct vm_area_struct *vma, unsigned long reqprot,
+ unsigned long prot)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_MemoryExecEvent *memexec;
+ u64 cid;
+ struct file_data path_data = {};
+
+ cid = audit_get_contid(current);
+
+ if (!csm_memexec_enabled ||
+ !(prot & PROT_EXEC) ||
+ vma->vm_file == NULL ||
+ cid == AUDIT_CID_UNSET)
+ return 0;
+
+ memexec = &event.event.memexec;
+
+ err = fill_file_description(vma->vm_file, &memexec->mapped_file,
+ &path_data);
+ if (err)
+ return err;
+
+ err = populate_proc_common(&memexec->proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid), current);
+ if (err)
+ goto out;
+
+ memexec->prot_exec_timestamp = ktime_get_real_ns();
+ memexec->new_flags = prot;
+ memexec->req_flags = reqprot;
+ memexec->old_vm_flags = vma->vm_flags;
+
+ memexec->action = schema_MemoryExecEvent_Action_MPROTECT;
+ memexec->start_addr = vma->vm_start;
+ memexec->end_addr = vma->vm_end;
+
+ event.which_event = schema_Event_memexec_tag;
+ event.event.memexec.has_proc = true;
+ event.event.memexec.has_mapped_file = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on mprotect\n", err);
+ err = 0;
+
+ if (unlikely(err < 0 && !csm_memexec_enabled))
+ err = 0;
+
+out:
+ free_file_data(&path_data);
+ return err;
+}
+
+int csm_mmap_file(struct file *file, unsigned long reqprot,
+ unsigned long prot, unsigned long flags)
+{
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ int err;
+ schema_Event event = {};
+ schema_MemoryExecEvent *memexec;
+ struct file *exe_file;
+ u64 cid;
+ struct file_data path_data = {};
+
+ cid = audit_get_contid(current);
+
+ if (!csm_memexec_enabled ||
+ !(prot & PROT_EXEC) ||
+ file == NULL ||
+ cid == AUDIT_CID_UNSET)
+ return 0;
+
+ memexec = &event.event.memexec;
+ err = fill_file_description(file, &memexec->mapped_file,
+ &path_data);
+ if (err)
+ return err;
+
+ err = populate_proc_common(&memexec->proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid), current);
+ if (err)
+ goto out;
+
+ /* get_mm_exe_file does its own locking on mm_sem. */
+ exe_file = get_mm_exe_file(current->mm);
+ if (exe_file) {
+ if (path_equal(&file->f_path, &exe_file->f_path))
+ memexec->is_initial_mmap = 1;
+ fput(exe_file);
+ }
+
+ memexec->prot_exec_timestamp = ktime_get_real_ns();
+ memexec->new_flags = prot;
+ memexec->req_flags = reqprot;
+ memexec->mmap_flags = flags;
+ memexec->action = schema_MemoryExecEvent_Action_MMAP_FILE;
+ event.which_event = schema_Event_memexec_tag;
+ event.event.memexec.has_proc = true;
+ event.event.memexec.has_mapped_file = true;
+
+ err = csm_sendeventproto(schema_Event_fields, &event);
+ if (err)
+ pr_err("csm_sendeventproto returned %d on mmap_file\n", err);
+ err = 0;
+
+ if (unlikely(err < 0 && !csm_memexec_enabled))
+ err = 0;
+
+out:
+ free_file_data(&path_data);
+ return err;
+}
+
+void csm_file_pre_free(struct file *file)
+{
+ struct dentry *dentry;
+ int err;
+ struct file_provenance prov;
+
+ /* The file was opened to be modified and the LSM is enabled */
+ if (!(file->f_mode & FMODE_WRITE) ||
+ !csm_enabled)
+ return;
+
+ /* The current process is containerized. */
+ if (audit_get_contid(current) == AUDIT_CID_UNSET)
+ return;
+
+ /* The file is part of overlayfs on the upper layer. */
+ if (!is_overlayfs_mounted(file))
+ return;
+
+ dentry = ovl_dentry_upper(file->f_path.dentry);
+ if (!dentry)
+ return;
+
+ err = __vfs_getxattr(dentry, dentry->d_inode, XATTR_SECURITY_CSM,
+ NULL, 0);
+ if (err != -ENODATA) {
+ if (err < 0)
+ pr_err("failed to get security attribute: %d\n", err);
+ return;
+ }
+
+ prov.tgid = task_tgid_nr(current);
+ prov.start_time = ktime_mono_to_real(current->group_leader->start_time);
+
+ err = __vfs_setxattr(&init_user_ns, dentry, dentry->d_inode,
+ XATTR_SECURITY_CSM, &prov, sizeof(prov), 0);
+ if (err < 0)
+ pr_err("failed to set security attribute: %d\n", err);
+}
+
+/*
+ * Based off of fs/proc/base.c:next_tgid
+ *
+ * next_thread_group_leader returns the task_struct of the next task with a pid
+ * greater than or equal to tgid. The reference count is increased so that
+ * rcu_read_unlock may be called, and preemption reenabled.
+ */
+static struct task_struct *next_thread_group_leader(pid_t *tgid)
+{
+ struct pid *pid;
+ struct task_struct *task;
+
+ cond_resched();
+ rcu_read_lock();
+retry:
+ task = NULL;
+ pid = find_ge_pid(*tgid, &init_pid_ns);
+ if (pid) {
+ *tgid = pid_nr_ns(pid, &init_pid_ns);
+ task = pid_task(pid, PIDTYPE_PID);
+ if (!task || !thread_group_leader(task) ||
+ audit_get_contid(task) == AUDIT_CID_UNSET) {
+ (*tgid) += 1;
+ goto retry;
+ }
+
+ /*
+ * Increment the reference count on the task before leaving
+ * the RCU grace period.
+ */
+ get_task_struct(task);
+ (*tgid) += 1;
+ }
+
+ rcu_read_unlock();
+ return task;
+}
+
+void delayed_enumerate_processes(struct work_struct *work)
+{
+ pid_t tgid = 0;
+ struct task_struct *task;
+ struct csm_enumerate_processes_work_data *wd = container_of(
+ work, struct csm_enumerate_processes_work_data, work);
+ int wd_enumeration_count = wd->enumeration_count;
+
+ kfree(wd);
+ wd = NULL;
+ work = NULL;
+
+ /*
+ * Try for only a single enumeration routine at a time, as long as the
+ * execute collector is enabled.
+ */
+ while ((wd_enumeration_count == atomic_read(&enumeration_count)) &&
+ READ_ONCE(csm_execute_enabled) &&
+ (task = next_thread_group_leader(&tgid))) {
+ int err;
+ char uuid[PROCESS_UUID_SIZE];
+ char parent_uuid[PROCESS_UUID_SIZE];
+ struct file *exe_file = NULL;
+ struct file_data path_data = {};
+ schema_Event event = {};
+ schema_Process *proc = &event.event.enumproc.proc;
+
+ exe_file = get_task_exe_file(task);
+ if (!exe_file) {
+ pr_err("failed to get enumerated process executable, pid: %u\n",
+ task_pid_nr(task));
+ goto next;
+ }
+
+ proc->has_binary = true;
+ err = fill_file_description(exe_file, &proc->binary,
+ &path_data);
+ if (err) {
+ pr_err("failed to fill enumerated process %u executable description: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+
+ err = populate_proc_common(proc, uuid, sizeof(uuid),
+ parent_uuid, sizeof(parent_uuid),
+ task);
+ if (err) {
+ pr_err("failed to set pid %u common fields: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+
+ if (task->flags & PF_EXITING)
+ goto next;
+
+ event.which_event = schema_Event_enumproc_tag;
+ event.event.execute.has_proc = true;
+ err = csm_sendeventproto(schema_Event_fields,
+ &event);
+ if (err) {
+ pr_err("failed to send pid %u enumerated process: %d\n",
+ task_pid_nr(task), err);
+ goto next;
+ }
+next:
+ free_file_data(&path_data);
+ if (exe_file)
+ fput(exe_file);
+
+ put_task_struct(task);
+ }
+}
+
+void csm_enumerate_processes(unsigned long const config_version)
+{
+ struct csm_enumerate_processes_work_data *wd;
+
+ wd = kmalloc(sizeof(*wd), GFP_KERNEL);
+ if (!wd)
+ return;
+
+ INIT_WORK(&wd->work, delayed_enumerate_processes);
+ wd->enumeration_count = atomic_add_return(1, &enumeration_count);
+ schedule_work(&wd->work);
+}
diff --git a/security/container/process.h b/security/container/process.h
new file mode 100644
index 0000000..1c98134
--- /dev/null
+++ b/security/container/process.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Container Security Monitor module
+ *
+ * Copyright (c) 2019 Google, Inc
+ */
+
+void csm_enumerate_processes(void);
diff --git a/security/container/protos/Makefile b/security/container/protos/Makefile
new file mode 100644
index 0000000..a88068b
--- /dev/null
+++ b/security/container/protos/Makefile
@@ -0,0 +1,10 @@
+subdir-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb
+
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb/
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += protos.o
+
+protos-y := config.pb.o event.pb.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ $(PB_CCFLAGS)
diff --git a/security/container/protos/README b/security/container/protos/README
new file mode 100644
index 0000000..1b0628a
--- /dev/null
+++ b/security/container/protos/README
@@ -0,0 +1,18 @@
+This document provides guidance on how to change the protos used in this directory.
+
+Any change made to a proto file require to reformat it and regenerate nanopb
+sources. It also requires the proto files to be compatible to previously released versions.
+
+To reformat any proto file run: "clang-format -style=Google -i <file.proto>"
+
+To regenerate nanopb files:
+ - Install protoc
+ - apt-get install protobuf-compiler
+ - Clone/setup nanopb for version 0.3.9.1 (or clone the internal depot)
+ - git clone --depth=1 https://github.com/nanopb/nanopb.git
+ - cd nanopb
+ - git fetch --tags
+ - git checkout tags/0.3.9.1
+ - make -C generator/proto
+ - Run protoc with the nanopb definition
+ - protoc --plugin=<path_to_nanopb>/generator/protoc-gen-nanopb --nanopb_out=<path_to_linux>/security/container/protos/ <path_to_linux>/security/container/protos/<file.proto> --proto_path=<path_to_linux>/security/container/protos
diff --git a/security/container/protos/config.pb.c b/security/container/protos/config.pb.c
new file mode 100644
index 0000000..08436ee
--- /dev/null
+++ b/security/container/protos/config.pb.c
@@ -0,0 +1,25 @@
+/* Automatically generated nanopb constant definitions */
+/* Generated by nanopb-0.4.5 */
+
+#include "config.pb.h"
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+PB_BIND(schema_ContainerCollectorConfig, schema_ContainerCollectorConfig, AUTO)
+
+
+PB_BIND(schema_ExecuteCollectorConfig, schema_ExecuteCollectorConfig, AUTO)
+
+
+PB_BIND(schema_MemExecCollectorConfig, schema_MemExecCollectorConfig, AUTO)
+
+
+PB_BIND(schema_ConfigurationRequest, schema_ConfigurationRequest, AUTO)
+
+
+PB_BIND(schema_ConfigurationResponse, schema_ConfigurationResponse, AUTO)
+
+
+
+
diff --git a/security/container/protos/config.pb.h b/security/container/protos/config.pb.h
new file mode 100644
index 0000000..893961e
--- /dev/null
+++ b/security/container/protos/config.pb.h
@@ -0,0 +1,157 @@
+/* Automatically generated nanopb header */
+/* Generated by nanopb-0.4.5 */
+
+#ifndef PB_SCHEMA_CONFIG_PB_H_INCLUDED
+#define PB_SCHEMA_CONFIG_PB_H_INCLUDED
+#include <pb.h>
+
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+/* Enum definitions */
+typedef enum _schema_ConfigurationResponse_ErrorCode {
+ schema_ConfigurationResponse_ErrorCode_NO_ERROR = 0,
+ schema_ConfigurationResponse_ErrorCode_UNKNOWN = 2
+} schema_ConfigurationResponse_ErrorCode;
+
+/* Struct definitions */
+/* Report success or failure of previous ConfigurationRequest */
+typedef struct _schema_ConfigurationResponse {
+ schema_ConfigurationResponse_ErrorCode error;
+ pb_callback_t msg;
+ uint64_t version; /* Version of the LSM */
+ uint32_t kernel_version; /* LINUX_VERSION_CODE */
+} schema_ConfigurationResponse;
+
+/* Collect information about running containers */
+typedef struct _schema_ContainerCollectorConfig {
+ bool enabled;
+} schema_ContainerCollectorConfig;
+
+typedef struct _schema_ExecuteCollectorConfig {
+ bool enabled;
+ /* truncate argv/envp if cumulative length exceeds limit */
+ uint32_t argv_limit;
+ uint32_t envp_limit;
+ /* If specified, only report the named environment variables. An
+ empty envp_allowlist indicates that all environment variables
+ should be reported up to a cumulative total of envp_limit bytes. */
+ pb_callback_t envp_allowlist;
+} schema_ExecuteCollectorConfig;
+
+/* Collect information about executable memory mappings. */
+typedef struct _schema_MemExecCollectorConfig {
+ bool enabled;
+} schema_MemExecCollectorConfig;
+
+/* Convey configuration information to Guest LSM */
+typedef struct _schema_ConfigurationRequest {
+ bool has_container_config;
+ schema_ContainerCollectorConfig container_config;
+ bool has_execute_config;
+ schema_ExecuteCollectorConfig execute_config;
+ bool has_memexec_config;
+ schema_MemExecCollectorConfig memexec_config;
+} schema_ConfigurationRequest;
+
+
+/* Helper constants for enums */
+#define _schema_ConfigurationResponse_ErrorCode_MIN schema_ConfigurationResponse_ErrorCode_NO_ERROR
+#define _schema_ConfigurationResponse_ErrorCode_MAX schema_ConfigurationResponse_ErrorCode_UNKNOWN
+#define _schema_ConfigurationResponse_ErrorCode_ARRAYSIZE ((schema_ConfigurationResponse_ErrorCode)(schema_ConfigurationResponse_ErrorCode_UNKNOWN+1))
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initializer values for message structs */
+#define schema_ContainerCollectorConfig_init_default {0}
+#define schema_ExecuteCollectorConfig_init_default {0, 0, 0, {{NULL}, NULL}}
+#define schema_MemExecCollectorConfig_init_default {0}
+#define schema_ConfigurationRequest_init_default {false, schema_ContainerCollectorConfig_init_default, false, schema_ExecuteCollectorConfig_init_default, false, schema_MemExecCollectorConfig_init_default}
+#define schema_ConfigurationResponse_init_default {_schema_ConfigurationResponse_ErrorCode_MIN, {{NULL}, NULL}, 0, 0}
+#define schema_ContainerCollectorConfig_init_zero {0}
+#define schema_ExecuteCollectorConfig_init_zero {0, 0, 0, {{NULL}, NULL}}
+#define schema_MemExecCollectorConfig_init_zero {0}
+#define schema_ConfigurationRequest_init_zero {false, schema_ContainerCollectorConfig_init_zero, false, schema_ExecuteCollectorConfig_init_zero, false, schema_MemExecCollectorConfig_init_zero}
+#define schema_ConfigurationResponse_init_zero {_schema_ConfigurationResponse_ErrorCode_MIN, {{NULL}, NULL}, 0, 0}
+
+/* Field tags (for use in manual encoding/decoding) */
+#define schema_ConfigurationResponse_error_tag 1
+#define schema_ConfigurationResponse_msg_tag 2
+#define schema_ConfigurationResponse_version_tag 3
+#define schema_ConfigurationResponse_kernel_version_tag 4
+#define schema_ContainerCollectorConfig_enabled_tag 1
+#define schema_ExecuteCollectorConfig_enabled_tag 1
+#define schema_ExecuteCollectorConfig_argv_limit_tag 2
+#define schema_ExecuteCollectorConfig_envp_limit_tag 3
+#define schema_ExecuteCollectorConfig_envp_allowlist_tag 4
+#define schema_MemExecCollectorConfig_enabled_tag 1
+#define schema_ConfigurationRequest_container_config_tag 1
+#define schema_ConfigurationRequest_execute_config_tag 2
+#define schema_ConfigurationRequest_memexec_config_tag 3
+
+/* Struct field encoding specification for nanopb */
+#define schema_ContainerCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1)
+#define schema_ContainerCollectorConfig_CALLBACK NULL
+#define schema_ContainerCollectorConfig_DEFAULT NULL
+
+#define schema_ExecuteCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1) \
+X(a, STATIC, SINGULAR, UINT32, argv_limit, 2) \
+X(a, STATIC, SINGULAR, UINT32, envp_limit, 3) \
+X(a, CALLBACK, REPEATED, STRING, envp_allowlist, 4)
+#define schema_ExecuteCollectorConfig_CALLBACK pb_default_field_callback
+#define schema_ExecuteCollectorConfig_DEFAULT NULL
+
+#define schema_MemExecCollectorConfig_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, enabled, 1)
+#define schema_MemExecCollectorConfig_CALLBACK NULL
+#define schema_MemExecCollectorConfig_DEFAULT NULL
+
+#define schema_ConfigurationRequest_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, container_config, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, execute_config, 2) \
+X(a, STATIC, OPTIONAL, MESSAGE, memexec_config, 3)
+#define schema_ConfigurationRequest_CALLBACK NULL
+#define schema_ConfigurationRequest_DEFAULT NULL
+#define schema_ConfigurationRequest_container_config_MSGTYPE schema_ContainerCollectorConfig
+#define schema_ConfigurationRequest_execute_config_MSGTYPE schema_ExecuteCollectorConfig
+#define schema_ConfigurationRequest_memexec_config_MSGTYPE schema_MemExecCollectorConfig
+
+#define schema_ConfigurationResponse_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UENUM, error, 1) \
+X(a, CALLBACK, SINGULAR, STRING, msg, 2) \
+X(a, STATIC, SINGULAR, UINT64, version, 3) \
+X(a, STATIC, SINGULAR, UINT32, kernel_version, 4)
+#define schema_ConfigurationResponse_CALLBACK pb_default_field_callback
+#define schema_ConfigurationResponse_DEFAULT NULL
+
+extern const pb_msgdesc_t schema_ContainerCollectorConfig_msg;
+extern const pb_msgdesc_t schema_ExecuteCollectorConfig_msg;
+extern const pb_msgdesc_t schema_MemExecCollectorConfig_msg;
+extern const pb_msgdesc_t schema_ConfigurationRequest_msg;
+extern const pb_msgdesc_t schema_ConfigurationResponse_msg;
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define schema_ContainerCollectorConfig_fields &schema_ContainerCollectorConfig_msg
+#define schema_ExecuteCollectorConfig_fields &schema_ExecuteCollectorConfig_msg
+#define schema_MemExecCollectorConfig_fields &schema_MemExecCollectorConfig_msg
+#define schema_ConfigurationRequest_fields &schema_ConfigurationRequest_msg
+#define schema_ConfigurationResponse_fields &schema_ConfigurationResponse_msg
+
+/* Maximum encoded size of messages (where known) */
+/* schema_ExecuteCollectorConfig_size depends on runtime parameters */
+/* schema_ConfigurationRequest_size depends on runtime parameters */
+/* schema_ConfigurationResponse_size depends on runtime parameters */
+#define schema_ContainerCollectorConfig_size 2
+#define schema_MemExecCollectorConfig_size 2
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/config.proto b/security/container/protos/config.proto
new file mode 100644
index 0000000..e32a517
--- /dev/null
+++ b/security/container/protos/config.proto
@@ -0,0 +1,51 @@
+syntax = "proto3";
+
+package schema;
+
+// Collect information about running containers
+message ContainerCollectorConfig {
+ bool enabled = 1;
+}
+
+message ExecuteCollectorConfig {
+ bool enabled = 1;
+
+ // truncate argv/envp if cumulative length exceeds limit
+ uint32 argv_limit = 2;
+ uint32 envp_limit = 3;
+
+ // If specified, only report the named environment variables. An
+ // empty envp_allowlist indicates that all environment variables
+ // should be reported up to a cumulative total of envp_limit bytes.
+ repeated string envp_allowlist = 4;
+}
+
+// Collect information about executable memory mappings.
+message MemExecCollectorConfig {
+ bool enabled = 1;
+}
+
+// Convey configuration information to Guest LSM
+message ConfigurationRequest {
+ ContainerCollectorConfig container_config = 1;
+ ExecuteCollectorConfig execute_config = 2;
+ MemExecCollectorConfig memexec_config = 3;
+
+ // Additional configuration messages will be added as new collectors
+ // are implemented
+}
+
+// Report success or failure of previous ConfigurationRequest
+message ConfigurationResponse {
+ enum ErrorCode {
+ // Keep values in sync with
+ // https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
+ NO_ERROR = 0;
+ UNKNOWN = 2;
+ }
+
+ ErrorCode error = 1;
+ string msg = 2;
+ uint64 version = 3; // Version of the LSM
+ uint32 kernel_version = 4; // LINUX_VERSION_CODE
+}
diff --git a/security/container/protos/event.pb.c b/security/container/protos/event.pb.c
new file mode 100644
index 0000000..2293566
--- /dev/null
+++ b/security/container/protos/event.pb.c
@@ -0,0 +1,61 @@
+/* Automatically generated nanopb constant definitions */
+/* Generated by nanopb-0.4.5 */
+
+#include "event.pb.h"
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+PB_BIND(schema_SocketIp, schema_SocketIp, AUTO)
+
+
+PB_BIND(schema_Socket, schema_Socket, AUTO)
+
+
+PB_BIND(schema_Overlay, schema_Overlay, AUTO)
+
+
+PB_BIND(schema_File, schema_File, AUTO)
+
+
+PB_BIND(schema_ProcessArguments, schema_ProcessArguments, AUTO)
+
+
+PB_BIND(schema_Descriptor, schema_Descriptor, AUTO)
+
+
+PB_BIND(schema_Streams, schema_Streams, 2)
+
+
+PB_BIND(schema_Process, schema_Process, 2)
+
+
+PB_BIND(schema_Container, schema_Container, AUTO)
+
+
+PB_BIND(schema_ExecuteEvent, schema_ExecuteEvent, 2)
+
+
+PB_BIND(schema_CloneEvent, schema_CloneEvent, 2)
+
+
+PB_BIND(schema_EnumerateProcessEvent, schema_EnumerateProcessEvent, 2)
+
+
+PB_BIND(schema_MemoryExecEvent, schema_MemoryExecEvent, 2)
+
+
+PB_BIND(schema_ContainerInfoEvent, schema_ContainerInfoEvent, AUTO)
+
+
+PB_BIND(schema_ExitEvent, schema_ExitEvent, AUTO)
+
+
+PB_BIND(schema_Event, schema_Event, 2)
+
+
+PB_BIND(schema_ContainerReport, schema_ContainerReport, AUTO)
+
+
+
+
diff --git a/security/container/protos/event.pb.h b/security/container/protos/event.pb.h
new file mode 100644
index 0000000..9535068
--- /dev/null
+++ b/security/container/protos/event.pb.h
@@ -0,0 +1,518 @@
+/* Automatically generated nanopb header */
+/* Generated by nanopb-0.4.5 */
+
+#ifndef PB_SCHEMA_EVENT_PB_H_INCLUDED
+#define PB_SCHEMA_EVENT_PB_H_INCLUDED
+#include <pb.h>
+
+#if PB_PROTO_HEADER_VERSION != 40
+#error Regenerate this file with the current version of nanopb generator.
+#endif
+
+/* Enum definitions */
+typedef enum _schema_MemoryExecEvent_Action {
+ schema_MemoryExecEvent_Action_UNDEFINED = 0,
+ schema_MemoryExecEvent_Action_MPROTECT = 1,
+ schema_MemoryExecEvent_Action_MMAP_FILE = 2
+} schema_MemoryExecEvent_Action;
+
+/* Struct definitions */
+/* The process with the indicated pid has exited. */
+typedef struct _schema_ExitEvent {
+ pb_callback_t process_uuid;
+} schema_ExitEvent;
+
+typedef struct _schema_Container {
+ uint64_t creation_timestamp; /* container create time in ns */
+ pb_callback_t pod_namespace;
+ pb_callback_t pod_name;
+ uint64_t container_id; /* unique across lifetime of Node */
+ pb_callback_t container_name;
+ pb_callback_t container_image_uri;
+ pb_callback_t labels;
+ pb_callback_t init_uuid;
+ pb_callback_t container_image_id;
+} schema_Container;
+
+typedef struct _schema_Overlay {
+ bool lower_layer;
+ bool upper_layer;
+ pb_callback_t modified_uuid; /* The process who first modified the file. */
+} schema_Overlay;
+
+typedef struct _schema_ProcessArguments {
+ pb_callback_t argv; /* process arguments */
+ uint32_t argv_truncated; /* number of characters truncated from argv */
+ pb_callback_t envp; /* process environment variables */
+ uint32_t envp_truncated; /* number of characters truncated from envp */
+} schema_ProcessArguments;
+
+typedef struct _schema_SocketIp {
+ uint32_t family; /* AF_* for socket type. */
+ pb_callback_t ip; /* ip4 or ip6 address. */
+ uint32_t port; /* port bind or connected. */
+} schema_SocketIp;
+
+/* Associate the following container information with all processes
+ that have the indicated container_id. */
+typedef struct _schema_ContainerInfoEvent {
+ bool has_container;
+ schema_Container container;
+} schema_ContainerInfoEvent;
+
+/* Message sent by the daemonset to the LSM for container enlightenment. */
+typedef struct _schema_ContainerReport {
+ uint32_t pid; /* Top pid of the running container. */
+ bool has_container;
+ schema_Container container; /* Information collected about the container. */
+} schema_ContainerReport;
+
+typedef struct _schema_Socket {
+ bool has_local;
+ schema_SocketIp local;
+ bool has_remote;
+ schema_SocketIp remote; /* unset if not connected. */
+} schema_Socket;
+
+typedef struct _schema_File {
+ pb_callback_t fullpath;
+ pb_size_t which_filesystem;
+ union {
+ schema_Overlay overlayfs;
+ schema_Socket socket;
+ } filesystem; /* inode number. */
+ uint32_t ino;
+ uint64_t ctime;
+} schema_File;
+
+typedef struct _schema_Descriptor {
+ uint32_t mode; /* file mode (stat st_mode) */
+ bool has_file;
+ schema_File file;
+} schema_Descriptor;
+
+typedef struct _schema_Streams {
+ bool has_stdin;
+ schema_Descriptor stdin;
+ bool has_stdout;
+ schema_Descriptor stdout;
+ bool has_stderr;
+ schema_Descriptor stderr;
+} schema_Streams;
+
+typedef struct _schema_Process {
+ uint64_t creation_timestamp; /* Only populated in ExecuteEvent, in ns. */
+ pb_callback_t uuid;
+ uint32_t pid;
+ bool has_binary;
+ schema_File binary; /* Only populated in ExecuteEvent. */
+ uint32_t parent_pid;
+ pb_callback_t parent_uuid;
+ uint64_t container_id; /* unique id of process's container */
+ uint32_t container_pid; /* pid inside the container namespace pid */
+ uint32_t container_parent_pid; /* optional */
+ bool has_args;
+ schema_ProcessArguments args; /* Only populated in ExecuteEvent. */
+ bool has_streams;
+ schema_Streams streams; /* Only populated in ExecuteEvent. */
+ uint64_t exec_session_id; /* identifier set for kubectl exec sessions. */
+} schema_Process;
+
+/* A process clone is being created. This message means that a cloning operation
+ is being attempted. It may be sent even if fork fails. */
+typedef struct _schema_CloneEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_CloneEvent;
+
+/* Processes that are enumerated at startup will be sent with this event. There
+ is no distinction from events we would have seen from fork or exec. */
+typedef struct _schema_EnumerateProcessEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_EnumerateProcessEvent;
+
+/* A binary being executed.
+ e.g., execve() */
+typedef struct _schema_ExecuteEvent {
+ bool has_proc;
+ schema_Process proc;
+} schema_ExecuteEvent;
+
+/* Collect information about mmap/mprotect calls with the PROT_EXEC flag set. */
+typedef struct _schema_MemoryExecEvent {
+ bool has_proc;
+ schema_Process proc; /* The origin process */
+ /* The timestamp in ns when the memory was set executable */
+ uint64_t prot_exec_timestamp;
+ /* The prot flags granted by the kernel for the operation */
+ uint64_t new_flags;
+ /* The prot flags requested for the mprotect/mmap operation */
+ uint64_t req_flags;
+ /* The vm_flags prior to the mprotect operation, if relevant */
+ uint64_t old_vm_flags;
+ /* The operational flags for the mmap operation, if relevant */
+ uint64_t mmap_flags;
+ /* Derived from the file struct describing the fd being mapped */
+ bool has_mapped_file;
+ schema_File mapped_file;
+ schema_MemoryExecEvent_Action action;
+ uint64_t start_addr; /* The executable memory region start addr */
+ uint64_t end_addr; /* The executable memory region end addr */
+ /* True if this event is a mmap of the process' binary */
+ bool is_initial_mmap;
+} schema_MemoryExecEvent;
+
+/* Next ID: 8 */
+typedef struct _schema_Event {
+ pb_size_t which_event;
+ union {
+ schema_ExecuteEvent execute;
+ schema_ContainerInfoEvent container;
+ schema_ExitEvent exit;
+ schema_MemoryExecEvent memexec;
+ schema_CloneEvent clone;
+ schema_EnumerateProcessEvent enumproc;
+ } event;
+ uint64_t timestamp;
+} schema_Event;
+
+
+/* Helper constants for enums */
+#define _schema_MemoryExecEvent_Action_MIN schema_MemoryExecEvent_Action_UNDEFINED
+#define _schema_MemoryExecEvent_Action_MAX schema_MemoryExecEvent_Action_MMAP_FILE
+#define _schema_MemoryExecEvent_Action_ARRAYSIZE ((schema_MemoryExecEvent_Action)(schema_MemoryExecEvent_Action_MMAP_FILE+1))
+
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initializer values for message structs */
+#define schema_SocketIp_init_default {0, {{NULL}, NULL}, 0}
+#define schema_Socket_init_default {false, schema_SocketIp_init_default, false, schema_SocketIp_init_default}
+#define schema_Overlay_init_default {0, 0, {{NULL}, NULL}}
+#define schema_File_init_default {{{NULL}, NULL}, 0, {schema_Overlay_init_default}, 0, 0}
+#define schema_ProcessArguments_init_default {{{NULL}, NULL}, 0, {{NULL}, NULL}, 0}
+#define schema_Descriptor_init_default {0, false, schema_File_init_default}
+#define schema_Streams_init_default {false, schema_Descriptor_init_default, false, schema_Descriptor_init_default, false, schema_Descriptor_init_default}
+#define schema_Process_init_default {0, {{NULL}, NULL}, 0, false, schema_File_init_default, 0, {{NULL}, NULL}, 0, 0, 0, false, schema_ProcessArguments_init_default, false, schema_Streams_init_default, 0}
+#define schema_Container_init_default {0, {{NULL}, NULL}, {{NULL}, NULL}, 0, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}}
+#define schema_ExecuteEvent_init_default {false, schema_Process_init_default}
+#define schema_CloneEvent_init_default {false, schema_Process_init_default}
+#define schema_EnumerateProcessEvent_init_default {false, schema_Process_init_default}
+#define schema_MemoryExecEvent_init_default {false, schema_Process_init_default, 0, 0, 0, 0, 0, false, schema_File_init_default, _schema_MemoryExecEvent_Action_MIN, 0, 0, 0}
+#define schema_ContainerInfoEvent_init_default {false, schema_Container_init_default}
+#define schema_ExitEvent_init_default {{{NULL}, NULL}}
+#define schema_Event_init_default {0, {schema_ExecuteEvent_init_default}, 0}
+#define schema_ContainerReport_init_default {0, false, schema_Container_init_default}
+#define schema_SocketIp_init_zero {0, {{NULL}, NULL}, 0}
+#define schema_Socket_init_zero {false, schema_SocketIp_init_zero, false, schema_SocketIp_init_zero}
+#define schema_Overlay_init_zero {0, 0, {{NULL}, NULL}}
+#define schema_File_init_zero {{{NULL}, NULL}, 0, {schema_Overlay_init_zero}, 0, 0}
+#define schema_ProcessArguments_init_zero {{{NULL}, NULL}, 0, {{NULL}, NULL}, 0}
+#define schema_Descriptor_init_zero {0, false, schema_File_init_zero}
+#define schema_Streams_init_zero {false, schema_Descriptor_init_zero, false, schema_Descriptor_init_zero, false, schema_Descriptor_init_zero}
+#define schema_Process_init_zero {0, {{NULL}, NULL}, 0, false, schema_File_init_zero, 0, {{NULL}, NULL}, 0, 0, 0, false, schema_ProcessArguments_init_zero, false, schema_Streams_init_zero, 0}
+#define schema_Container_init_zero {0, {{NULL}, NULL}, {{NULL}, NULL}, 0, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}, {{NULL}, NULL}}
+#define schema_ExecuteEvent_init_zero {false, schema_Process_init_zero}
+#define schema_CloneEvent_init_zero {false, schema_Process_init_zero}
+#define schema_EnumerateProcessEvent_init_zero {false, schema_Process_init_zero}
+#define schema_MemoryExecEvent_init_zero {false, schema_Process_init_zero, 0, 0, 0, 0, 0, false, schema_File_init_zero, _schema_MemoryExecEvent_Action_MIN, 0, 0, 0}
+#define schema_ContainerInfoEvent_init_zero {false, schema_Container_init_zero}
+#define schema_ExitEvent_init_zero {{{NULL}, NULL}}
+#define schema_Event_init_zero {0, {schema_ExecuteEvent_init_zero}, 0}
+#define schema_ContainerReport_init_zero {0, false, schema_Container_init_zero}
+
+/* Field tags (for use in manual encoding/decoding) */
+#define schema_ExitEvent_process_uuid_tag 1
+#define schema_Container_creation_timestamp_tag 1
+#define schema_Container_pod_namespace_tag 2
+#define schema_Container_pod_name_tag 3
+#define schema_Container_container_id_tag 4
+#define schema_Container_container_name_tag 5
+#define schema_Container_container_image_uri_tag 6
+#define schema_Container_labels_tag 7
+#define schema_Container_init_uuid_tag 8
+#define schema_Container_container_image_id_tag 9
+#define schema_Overlay_lower_layer_tag 1
+#define schema_Overlay_upper_layer_tag 2
+#define schema_Overlay_modified_uuid_tag 3
+#define schema_ProcessArguments_argv_tag 1
+#define schema_ProcessArguments_argv_truncated_tag 2
+#define schema_ProcessArguments_envp_tag 3
+#define schema_ProcessArguments_envp_truncated_tag 4
+#define schema_SocketIp_family_tag 1
+#define schema_SocketIp_ip_tag 2
+#define schema_SocketIp_port_tag 3
+#define schema_ContainerInfoEvent_container_tag 1
+#define schema_ContainerReport_pid_tag 1
+#define schema_ContainerReport_container_tag 2
+#define schema_Socket_local_tag 1
+#define schema_Socket_remote_tag 2
+#define schema_File_fullpath_tag 1
+#define schema_File_overlayfs_tag 2
+#define schema_File_socket_tag 4
+#define schema_File_ino_tag 3
+#define schema_File_ctime_tag 5
+#define schema_Descriptor_mode_tag 1
+#define schema_Descriptor_file_tag 2
+#define schema_Streams_stdin_tag 1
+#define schema_Streams_stdout_tag 2
+#define schema_Streams_stderr_tag 3
+#define schema_Process_creation_timestamp_tag 1
+#define schema_Process_uuid_tag 2
+#define schema_Process_pid_tag 3
+#define schema_Process_binary_tag 4
+#define schema_Process_parent_pid_tag 5
+#define schema_Process_parent_uuid_tag 6
+#define schema_Process_container_id_tag 7
+#define schema_Process_container_pid_tag 8
+#define schema_Process_container_parent_pid_tag 9
+#define schema_Process_args_tag 10
+#define schema_Process_streams_tag 11
+#define schema_Process_exec_session_id_tag 12
+#define schema_CloneEvent_proc_tag 1
+#define schema_EnumerateProcessEvent_proc_tag 1
+#define schema_ExecuteEvent_proc_tag 1
+#define schema_MemoryExecEvent_proc_tag 1
+#define schema_MemoryExecEvent_prot_exec_timestamp_tag 2
+#define schema_MemoryExecEvent_new_flags_tag 3
+#define schema_MemoryExecEvent_req_flags_tag 4
+#define schema_MemoryExecEvent_old_vm_flags_tag 5
+#define schema_MemoryExecEvent_mmap_flags_tag 6
+#define schema_MemoryExecEvent_mapped_file_tag 7
+#define schema_MemoryExecEvent_action_tag 8
+#define schema_MemoryExecEvent_start_addr_tag 9
+#define schema_MemoryExecEvent_end_addr_tag 10
+#define schema_MemoryExecEvent_is_initial_mmap_tag 11
+#define schema_Event_execute_tag 1
+#define schema_Event_container_tag 2
+#define schema_Event_exit_tag 3
+#define schema_Event_memexec_tag 4
+#define schema_Event_clone_tag 5
+#define schema_Event_enumproc_tag 7
+#define schema_Event_timestamp_tag 6
+
+/* Struct field encoding specification for nanopb */
+#define schema_SocketIp_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, family, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, ip, 2) \
+X(a, STATIC, SINGULAR, UINT32, port, 3)
+#define schema_SocketIp_CALLBACK pb_default_field_callback
+#define schema_SocketIp_DEFAULT NULL
+
+#define schema_Socket_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, local, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, remote, 2)
+#define schema_Socket_CALLBACK NULL
+#define schema_Socket_DEFAULT NULL
+#define schema_Socket_local_MSGTYPE schema_SocketIp
+#define schema_Socket_remote_MSGTYPE schema_SocketIp
+
+#define schema_Overlay_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, BOOL, lower_layer, 1) \
+X(a, STATIC, SINGULAR, BOOL, upper_layer, 2) \
+X(a, CALLBACK, SINGULAR, BYTES, modified_uuid, 3)
+#define schema_Overlay_CALLBACK pb_default_field_callback
+#define schema_Overlay_DEFAULT NULL
+
+#define schema_File_FIELDLIST(X, a) \
+X(a, CALLBACK, SINGULAR, BYTES, fullpath, 1) \
+X(a, STATIC, ONEOF, MESSAGE, (filesystem,overlayfs,filesystem.overlayfs), 2) \
+X(a, STATIC, SINGULAR, UINT32, ino, 3) \
+X(a, STATIC, ONEOF, MESSAGE, (filesystem,socket,filesystem.socket), 4) \
+X(a, STATIC, SINGULAR, UINT64, ctime, 5)
+#define schema_File_CALLBACK pb_default_field_callback
+#define schema_File_DEFAULT NULL
+#define schema_File_filesystem_overlayfs_MSGTYPE schema_Overlay
+#define schema_File_filesystem_socket_MSGTYPE schema_Socket
+
+#define schema_ProcessArguments_FIELDLIST(X, a) \
+X(a, CALLBACK, REPEATED, BYTES, argv, 1) \
+X(a, STATIC, SINGULAR, UINT32, argv_truncated, 2) \
+X(a, CALLBACK, REPEATED, BYTES, envp, 3) \
+X(a, STATIC, SINGULAR, UINT32, envp_truncated, 4)
+#define schema_ProcessArguments_CALLBACK pb_default_field_callback
+#define schema_ProcessArguments_DEFAULT NULL
+
+#define schema_Descriptor_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, mode, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, file, 2)
+#define schema_Descriptor_CALLBACK NULL
+#define schema_Descriptor_DEFAULT NULL
+#define schema_Descriptor_file_MSGTYPE schema_File
+
+#define schema_Streams_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, stdin, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, stdout, 2) \
+X(a, STATIC, OPTIONAL, MESSAGE, stderr, 3)
+#define schema_Streams_CALLBACK NULL
+#define schema_Streams_DEFAULT NULL
+#define schema_Streams_stdin_MSGTYPE schema_Descriptor
+#define schema_Streams_stdout_MSGTYPE schema_Descriptor
+#define schema_Streams_stderr_MSGTYPE schema_Descriptor
+
+#define schema_Process_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT64, creation_timestamp, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, uuid, 2) \
+X(a, STATIC, SINGULAR, UINT32, pid, 3) \
+X(a, STATIC, OPTIONAL, MESSAGE, binary, 4) \
+X(a, STATIC, SINGULAR, UINT32, parent_pid, 5) \
+X(a, CALLBACK, SINGULAR, BYTES, parent_uuid, 6) \
+X(a, STATIC, SINGULAR, UINT64, container_id, 7) \
+X(a, STATIC, SINGULAR, UINT32, container_pid, 8) \
+X(a, STATIC, SINGULAR, UINT32, container_parent_pid, 9) \
+X(a, STATIC, OPTIONAL, MESSAGE, args, 10) \
+X(a, STATIC, OPTIONAL, MESSAGE, streams, 11) \
+X(a, STATIC, SINGULAR, UINT64, exec_session_id, 12)
+#define schema_Process_CALLBACK pb_default_field_callback
+#define schema_Process_DEFAULT NULL
+#define schema_Process_binary_MSGTYPE schema_File
+#define schema_Process_args_MSGTYPE schema_ProcessArguments
+#define schema_Process_streams_MSGTYPE schema_Streams
+
+#define schema_Container_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT64, creation_timestamp, 1) \
+X(a, CALLBACK, SINGULAR, BYTES, pod_namespace, 2) \
+X(a, CALLBACK, SINGULAR, BYTES, pod_name, 3) \
+X(a, STATIC, SINGULAR, UINT64, container_id, 4) \
+X(a, CALLBACK, SINGULAR, BYTES, container_name, 5) \
+X(a, CALLBACK, SINGULAR, BYTES, container_image_uri, 6) \
+X(a, CALLBACK, REPEATED, BYTES, labels, 7) \
+X(a, CALLBACK, SINGULAR, BYTES, init_uuid, 8) \
+X(a, CALLBACK, SINGULAR, BYTES, container_image_id, 9)
+#define schema_Container_CALLBACK pb_default_field_callback
+#define schema_Container_DEFAULT NULL
+
+#define schema_ExecuteEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_ExecuteEvent_CALLBACK NULL
+#define schema_ExecuteEvent_DEFAULT NULL
+#define schema_ExecuteEvent_proc_MSGTYPE schema_Process
+
+#define schema_CloneEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_CloneEvent_CALLBACK NULL
+#define schema_CloneEvent_DEFAULT NULL
+#define schema_CloneEvent_proc_MSGTYPE schema_Process
+
+#define schema_EnumerateProcessEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1)
+#define schema_EnumerateProcessEvent_CALLBACK NULL
+#define schema_EnumerateProcessEvent_DEFAULT NULL
+#define schema_EnumerateProcessEvent_proc_MSGTYPE schema_Process
+
+#define schema_MemoryExecEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, proc, 1) \
+X(a, STATIC, SINGULAR, UINT64, prot_exec_timestamp, 2) \
+X(a, STATIC, SINGULAR, UINT64, new_flags, 3) \
+X(a, STATIC, SINGULAR, UINT64, req_flags, 4) \
+X(a, STATIC, SINGULAR, UINT64, old_vm_flags, 5) \
+X(a, STATIC, SINGULAR, UINT64, mmap_flags, 6) \
+X(a, STATIC, OPTIONAL, MESSAGE, mapped_file, 7) \
+X(a, STATIC, SINGULAR, UENUM, action, 8) \
+X(a, STATIC, SINGULAR, UINT64, start_addr, 9) \
+X(a, STATIC, SINGULAR, UINT64, end_addr, 10) \
+X(a, STATIC, SINGULAR, BOOL, is_initial_mmap, 11)
+#define schema_MemoryExecEvent_CALLBACK NULL
+#define schema_MemoryExecEvent_DEFAULT NULL
+#define schema_MemoryExecEvent_proc_MSGTYPE schema_Process
+#define schema_MemoryExecEvent_mapped_file_MSGTYPE schema_File
+
+#define schema_ContainerInfoEvent_FIELDLIST(X, a) \
+X(a, STATIC, OPTIONAL, MESSAGE, container, 1)
+#define schema_ContainerInfoEvent_CALLBACK NULL
+#define schema_ContainerInfoEvent_DEFAULT NULL
+#define schema_ContainerInfoEvent_container_MSGTYPE schema_Container
+
+#define schema_ExitEvent_FIELDLIST(X, a) \
+X(a, CALLBACK, SINGULAR, BYTES, process_uuid, 1)
+#define schema_ExitEvent_CALLBACK pb_default_field_callback
+#define schema_ExitEvent_DEFAULT NULL
+
+#define schema_Event_FIELDLIST(X, a) \
+X(a, STATIC, ONEOF, MESSAGE, (event,execute,event.execute), 1) \
+X(a, STATIC, ONEOF, MESSAGE, (event,container,event.container), 2) \
+X(a, STATIC, ONEOF, MESSAGE, (event,exit,event.exit), 3) \
+X(a, STATIC, ONEOF, MESSAGE, (event,memexec,event.memexec), 4) \
+X(a, STATIC, ONEOF, MESSAGE, (event,clone,event.clone), 5) \
+X(a, STATIC, SINGULAR, UINT64, timestamp, 6) \
+X(a, STATIC, ONEOF, MESSAGE, (event,enumproc,event.enumproc), 7)
+#define schema_Event_CALLBACK NULL
+#define schema_Event_DEFAULT NULL
+#define schema_Event_event_execute_MSGTYPE schema_ExecuteEvent
+#define schema_Event_event_container_MSGTYPE schema_ContainerInfoEvent
+#define schema_Event_event_exit_MSGTYPE schema_ExitEvent
+#define schema_Event_event_memexec_MSGTYPE schema_MemoryExecEvent
+#define schema_Event_event_clone_MSGTYPE schema_CloneEvent
+#define schema_Event_event_enumproc_MSGTYPE schema_EnumerateProcessEvent
+
+#define schema_ContainerReport_FIELDLIST(X, a) \
+X(a, STATIC, SINGULAR, UINT32, pid, 1) \
+X(a, STATIC, OPTIONAL, MESSAGE, container, 2)
+#define schema_ContainerReport_CALLBACK NULL
+#define schema_ContainerReport_DEFAULT NULL
+#define schema_ContainerReport_container_MSGTYPE schema_Container
+
+extern const pb_msgdesc_t schema_SocketIp_msg;
+extern const pb_msgdesc_t schema_Socket_msg;
+extern const pb_msgdesc_t schema_Overlay_msg;
+extern const pb_msgdesc_t schema_File_msg;
+extern const pb_msgdesc_t schema_ProcessArguments_msg;
+extern const pb_msgdesc_t schema_Descriptor_msg;
+extern const pb_msgdesc_t schema_Streams_msg;
+extern const pb_msgdesc_t schema_Process_msg;
+extern const pb_msgdesc_t schema_Container_msg;
+extern const pb_msgdesc_t schema_ExecuteEvent_msg;
+extern const pb_msgdesc_t schema_CloneEvent_msg;
+extern const pb_msgdesc_t schema_EnumerateProcessEvent_msg;
+extern const pb_msgdesc_t schema_MemoryExecEvent_msg;
+extern const pb_msgdesc_t schema_ContainerInfoEvent_msg;
+extern const pb_msgdesc_t schema_ExitEvent_msg;
+extern const pb_msgdesc_t schema_Event_msg;
+extern const pb_msgdesc_t schema_ContainerReport_msg;
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define schema_SocketIp_fields &schema_SocketIp_msg
+#define schema_Socket_fields &schema_Socket_msg
+#define schema_Overlay_fields &schema_Overlay_msg
+#define schema_File_fields &schema_File_msg
+#define schema_ProcessArguments_fields &schema_ProcessArguments_msg
+#define schema_Descriptor_fields &schema_Descriptor_msg
+#define schema_Streams_fields &schema_Streams_msg
+#define schema_Process_fields &schema_Process_msg
+#define schema_Container_fields &schema_Container_msg
+#define schema_ExecuteEvent_fields &schema_ExecuteEvent_msg
+#define schema_CloneEvent_fields &schema_CloneEvent_msg
+#define schema_EnumerateProcessEvent_fields &schema_EnumerateProcessEvent_msg
+#define schema_MemoryExecEvent_fields &schema_MemoryExecEvent_msg
+#define schema_ContainerInfoEvent_fields &schema_ContainerInfoEvent_msg
+#define schema_ExitEvent_fields &schema_ExitEvent_msg
+#define schema_Event_fields &schema_Event_msg
+#define schema_ContainerReport_fields &schema_ContainerReport_msg
+
+/* Maximum encoded size of messages (where known) */
+/* schema_SocketIp_size depends on runtime parameters */
+/* schema_Socket_size depends on runtime parameters */
+/* schema_Overlay_size depends on runtime parameters */
+/* schema_File_size depends on runtime parameters */
+/* schema_ProcessArguments_size depends on runtime parameters */
+/* schema_Descriptor_size depends on runtime parameters */
+/* schema_Streams_size depends on runtime parameters */
+/* schema_Process_size depends on runtime parameters */
+/* schema_Container_size depends on runtime parameters */
+/* schema_ExecuteEvent_size depends on runtime parameters */
+/* schema_CloneEvent_size depends on runtime parameters */
+/* schema_EnumerateProcessEvent_size depends on runtime parameters */
+/* schema_MemoryExecEvent_size depends on runtime parameters */
+/* schema_ContainerInfoEvent_size depends on runtime parameters */
+/* schema_ExitEvent_size depends on runtime parameters */
+/* schema_Event_size depends on runtime parameters */
+/* schema_ContainerReport_size depends on runtime parameters */
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/event.proto b/security/container/protos/event.proto
new file mode 100644
index 0000000..79a604d
--- /dev/null
+++ b/security/container/protos/event.proto
@@ -0,0 +1,152 @@
+syntax = "proto3";
+
+package schema;
+
+message SocketIp {
+ uint32 family = 1; // AF_* for socket type.
+ bytes ip = 2; // ip4 or ip6 address.
+ uint32 port = 3; // port bind or connected.
+}
+
+message Socket {
+ SocketIp local = 1;
+ SocketIp remote = 2; // unset if not connected.
+}
+
+message Overlay {
+ bool lower_layer = 1;
+ bool upper_layer = 2;
+ bytes modified_uuid = 3; // The process who first modified the file.
+}
+
+message File {
+ bytes fullpath = 1;
+ uint32 ino = 3; // inode number.
+ oneof filesystem {
+ Overlay overlayfs = 2;
+ Socket socket = 4;
+ }
+ uint64 ctime = 5;
+}
+
+message ProcessArguments {
+ repeated bytes argv = 1; // process arguments
+ uint32 argv_truncated = 2; // number of characters truncated from argv
+ repeated bytes envp = 3; // process environment variables
+ uint32 envp_truncated = 4; // number of characters truncated from envp
+}
+
+message Descriptor {
+ uint32 mode = 1; // file mode (stat st_mode)
+ File file = 2;
+}
+
+message Streams {
+ Descriptor stdin = 1;
+ Descriptor stdout = 2;
+ Descriptor stderr = 3;
+}
+
+message Process {
+ uint64 creation_timestamp = 1; // Only populated in ExecuteEvent, in ns.
+ bytes uuid = 2;
+ uint32 pid = 3;
+ File binary = 4; // Only populated in ExecuteEvent.
+ uint32 parent_pid = 5;
+ bytes parent_uuid = 6;
+ uint64 container_id = 7; // unique id of process's container
+ uint32 container_pid = 8; // pid inside the container namespace pid
+ uint32 container_parent_pid = 9; // optional
+ ProcessArguments args = 10; // Only populated in ExecuteEvent.
+ Streams streams = 11; // Only populated in ExecuteEvent.
+ uint64 exec_session_id = 12; // identifier set for kubectl exec sessions.
+}
+
+message Container {
+ uint64 creation_timestamp = 1; // container create time in ns
+ bytes pod_namespace = 2;
+ bytes pod_name = 3;
+ uint64 container_id = 4; // unique across lifetime of Node
+ bytes container_name = 5;
+ bytes container_image_uri = 6;
+ repeated bytes labels = 7;
+ bytes init_uuid = 8;
+ bytes container_image_id = 9;
+}
+
+// A binary being executed.
+// e.g., execve()
+message ExecuteEvent {
+ Process proc = 1;
+}
+
+// A process clone is being created. This message means that a cloning operation
+// is being attempted. It may be sent even if fork fails.
+message CloneEvent {
+ Process proc = 1;
+}
+
+// Processes that are enumerated at startup will be sent with this event. There
+// is no distinction from events we would have seen from fork or exec.
+message EnumerateProcessEvent {
+ Process proc = 1;
+}
+
+// Collect information about mmap/mprotect calls with the PROT_EXEC flag set.
+message MemoryExecEvent {
+ Process proc = 1; // The origin process
+ // The timestamp in ns when the memory was set executable
+ uint64 prot_exec_timestamp = 2;
+ // The prot flags granted by the kernel for the operation
+ uint64 new_flags = 3;
+ // The prot flags requested for the mprotect/mmap operation
+ uint64 req_flags = 4;
+ // The vm_flags prior to the mprotect operation, if relevant
+ uint64 old_vm_flags = 5;
+ // The operational flags for the mmap operation, if relevant
+ uint64 mmap_flags = 6;
+ // Derived from the file struct describing the fd being mapped
+ File mapped_file = 7;
+ enum Action {
+ UNDEFINED = 0;
+ MPROTECT = 1;
+ MMAP_FILE = 2;
+ }
+ Action action = 8;
+
+ uint64 start_addr = 9; // The executable memory region start addr
+ uint64 end_addr = 10; // The executable memory region end addr
+ // True if this event is a mmap of the process' binary
+ bool is_initial_mmap = 11;
+}
+
+// Associate the following container information with all processes
+// that have the indicated container_id.
+message ContainerInfoEvent {
+ Container container = 1;
+}
+
+// The process with the indicated pid has exited.
+message ExitEvent {
+ bytes process_uuid = 1;
+}
+
+// Next ID: 8
+message Event {
+ oneof event {
+ ExecuteEvent execute = 1;
+ ContainerInfoEvent container = 2;
+ ExitEvent exit = 3;
+ MemoryExecEvent memexec = 4;
+ CloneEvent clone = 5;
+ EnumerateProcessEvent enumproc = 7;
+ }
+
+ uint64 timestamp = 6; // In nanoseconds
+}
+
+// Message sent by the daemonset to the LSM for container enlightenment.
+message ContainerReport {
+ uint32 pid = 1; // Top pid of the running container.
+ Container container = 2; // Information collected about the container.
+}
diff --git a/security/container/protos/nanopb/LICENSE b/security/container/protos/nanopb/LICENSE
new file mode 100644
index 0000000..a83630a
--- /dev/null
+++ b/security/container/protos/nanopb/LICENSE
@@ -0,0 +1,20 @@
+Copyright (c) 2011 Petteri Aimonen <jpa at nanopb.mail.kapsi.fi>
+
+This software is provided 'as-is', without any express or
+implied warranty. In no event will the authors be held liable
+for any damages arising from the use of this software.
+
+Permission is granted to anyone to use this software for any
+purpose, including commercial applications, and to alter it and
+redistribute it freely, subject to the following restrictions:
+
+1. The origin of this software must not be misrepresented; you
+ must not claim that you wrote the original software. If you use
+ this software in a product, an acknowledgment in the product
+ documentation would be appreciated but is not required.
+
+2. Altered source versions must be plainly marked as such, and
+ must not be misrepresented as being the original software.
+
+3. This notice may not be removed or altered from any source
+ distribution.
diff --git a/security/container/protos/nanopb/METADATA b/security/container/protos/nanopb/METADATA
new file mode 100644
index 0000000..6b85630
--- /dev/null
+++ b/security/container/protos/nanopb/METADATA
@@ -0,0 +1,23 @@
+name: "nanopb"
+description: "Nanopb is a C library for encoding and decoding protocol buffers."
+
+third_party {
+ url {
+ type: GIT
+ value: "https://github.com/nanopb/nanopb/"
+ }
+ version: "0.4.5"
+ last_upgrade_date: {
+ year: 2021
+ month: 8
+ day: 12
+ }
+ license_type: NOTICE
+ security {
+ category: REVIEWED_AND_SECURE
+ note: "https://buganizer.corp.google.com/u/0/issues/19409596, https://buganizer.corp.google.com/u/0/issues/120506242"
+ tag: "NVD-CPE2.3:cpe:/a:nanopb_project:nanopb"
+ tag: "vuln_reporting:buganizer_component:588910"
+ tag: "vuln_reporting:contact_emails:" # Blunderbuss will assign bugs.
+ }
+}
diff --git a/security/container/protos/nanopb/Makefile b/security/container/protos/nanopb/Makefile
new file mode 100644
index 0000000..b7e15f8
--- /dev/null
+++ b/security/container/protos/nanopb/Makefile
@@ -0,0 +1,7 @@
+obj-$(CONFIG_SECURITY_CONTAINER_MONITOR) += nanopb.o
+
+nanopb-y := pb_encode.o pb_decode.o pb_common.o
+
+ccflags-y := -I$(srctree)/security/container/protos \
+ -I$(srctree)/security/container/protos/nanopb \
+ $(PB_CCFLAGS)
diff --git a/security/container/protos/nanopb/pb.h b/security/container/protos/nanopb/pb.h
new file mode 100644
index 0000000..be7c067
--- /dev/null
+++ b/security/container/protos/nanopb/pb.h
@@ -0,0 +1,875 @@
+/* Common parts of the nanopb library. Most of these are quite low-level
+ * stuff. For the high-level interface, see pb_encode.h and pb_decode.h.
+ */
+
+#ifndef PB_H_INCLUDED
+#define PB_H_INCLUDED
+
+/*****************************************************************
+ * Nanopb compilation time options. You can change these here by *
+ * uncommenting the lines, or on the compiler command line. *
+ *****************************************************************/
+
+/* Enable support for dynamically allocated fields */
+/* #define PB_ENABLE_MALLOC 1 */
+
+/* Define this if your CPU / compiler combination does not support
+ * unaligned memory access to packed structures. */
+/* #define PB_NO_PACKED_STRUCTS 1 */
+
+/* Increase the number of required fields that are tracked.
+ * A compiler warning will tell if you need this. */
+/* #define PB_MAX_REQUIRED_FIELDS 256 */
+
+/* Add support for tag numbers > 65536 and fields larger than 65536 bytes. */
+/* #define PB_FIELD_32BIT 1 */
+
+/* Disable support for error messages in order to save some code space. */
+/* #define PB_NO_ERRMSG 1 */
+
+/* Disable support for custom streams (support only memory buffers). */
+/* #define PB_BUFFER_ONLY 1 */
+
+/* Disable support for 64-bit datatypes, for compilers without int64_t
+ or to save some code space. */
+/* #define PB_WITHOUT_64BIT 1 */
+
+/* Don't encode scalar arrays as packed. This is only to be used when
+ * the decoder on the receiving side cannot process packed scalar arrays.
+ * Such example is older protobuf.js. */
+/* #define PB_ENCODE_ARRAYS_UNPACKED 1 */
+
+/* Enable conversion of doubles to floats for platforms that do not
+ * support 64-bit doubles. Most commonly AVR. */
+/* #define PB_CONVERT_DOUBLE_FLOAT 1 */
+
+/* Check whether incoming strings are valid UTF-8 sequences. Slows down
+ * the string processing slightly and slightly increases code size. */
+/* #define PB_VALIDATE_UTF8 1 */
+
+/******************************************************************
+ * You usually don't need to change anything below this line. *
+ * Feel free to look around and use the defined macros, though. *
+ ******************************************************************/
+
+
+/* Version of the nanopb library. Just in case you want to check it in
+ * your own program. */
+#define NANOPB_VERSION nanopb-0.4.5
+
+/* Include all the system headers needed by nanopb. You will need the
+ * definitions of the following:
+ * - strlen, memcpy, memset functions
+ * - [u]int_least8_t, uint_fast8_t, [u]int_least16_t, [u]int32_t, [u]int64_t
+ * - size_t
+ * - bool
+ *
+ * If you don't have the standard header files, you can instead provide
+ * a custom header that defines or includes all this. In that case,
+ * define PB_SYSTEM_HEADER to the path of this file.
+ */
+#ifdef PB_SYSTEM_HEADER
+#include PB_SYSTEM_HEADER
+#else
+#include <stdint.h>
+#include <stddef.h>
+#include <stdbool.h>
+#include <string.h>
+#include <limits.h>
+
+#ifdef PB_ENABLE_MALLOC
+#include <stdlib.h>
+#endif
+#endif
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Macro for defining packed structures (compiler dependent).
+ * This just reduces memory requirements, but is not required.
+ */
+#if defined(PB_NO_PACKED_STRUCTS)
+ /* Disable struct packing */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed
+#elif defined(__GNUC__) || defined(__clang__)
+ /* For GCC and clang */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed __attribute__((packed))
+#elif defined(__ICCARM__) || defined(__CC_ARM)
+ /* For IAR ARM and Keil MDK-ARM compilers */
+# define PB_PACKED_STRUCT_START _Pragma("pack(push, 1)")
+# define PB_PACKED_STRUCT_END _Pragma("pack(pop)")
+# define pb_packed
+#elif defined(_MSC_VER) && (_MSC_VER >= 1500)
+ /* For Microsoft Visual C++ */
+# define PB_PACKED_STRUCT_START __pragma(pack(push, 1))
+# define PB_PACKED_STRUCT_END __pragma(pack(pop))
+# define pb_packed
+#else
+ /* Unknown compiler */
+# define PB_PACKED_STRUCT_START
+# define PB_PACKED_STRUCT_END
+# define pb_packed
+#endif
+
+/* Handly macro for suppressing unreferenced-parameter compiler warnings. */
+#ifndef PB_UNUSED
+#define PB_UNUSED(x) (void)(x)
+#endif
+
+/* Harvard-architecture processors may need special attributes for storing
+ * field information in program memory. */
+#ifndef PB_PROGMEM
+#ifdef __AVR__
+#include <avr/pgmspace.h>
+#define PB_PROGMEM PROGMEM
+#define PB_PROGMEM_READU32(x) pgm_read_dword(&x)
+#else
+#define PB_PROGMEM
+#define PB_PROGMEM_READU32(x) (x)
+#endif
+#endif
+
+/* Compile-time assertion, used for checking compatible compilation options.
+ * If this does not work properly on your compiler, use
+ * #define PB_NO_STATIC_ASSERT to disable it.
+ *
+ * But before doing that, check carefully the error message / place where it
+ * comes from to see if the error has a real cause. Unfortunately the error
+ * message is not always very clear to read, but you can see the reason better
+ * in the place where the PB_STATIC_ASSERT macro was called.
+ */
+#ifndef PB_NO_STATIC_ASSERT
+# ifndef PB_STATIC_ASSERT
+# if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
+ /* C11 standard _Static_assert mechanism */
+# define PB_STATIC_ASSERT(COND,MSG) _Static_assert(COND,#MSG);
+# else
+ /* Classic negative-size-array static assert mechanism */
+# define PB_STATIC_ASSERT(COND,MSG) typedef char PB_STATIC_ASSERT_MSG(MSG, __LINE__, __COUNTER__)[(COND)?1:-1];
+# define PB_STATIC_ASSERT_MSG(MSG, LINE, COUNTER) PB_STATIC_ASSERT_MSG_(MSG, LINE, COUNTER)
+# define PB_STATIC_ASSERT_MSG_(MSG, LINE, COUNTER) pb_static_assertion_##MSG##_##LINE##_##COUNTER
+# endif
+# endif
+#else
+ /* Static asserts disabled by PB_NO_STATIC_ASSERT */
+# define PB_STATIC_ASSERT(COND,MSG)
+#endif
+
+/* Number of required fields to keep track of. */
+#ifndef PB_MAX_REQUIRED_FIELDS
+#define PB_MAX_REQUIRED_FIELDS 64
+#endif
+
+#if PB_MAX_REQUIRED_FIELDS < 64
+#error You should not lower PB_MAX_REQUIRED_FIELDS from the default value (64).
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Cannot use doubles without 64-bit types */
+#undef PB_CONVERT_DOUBLE_FLOAT
+#endif
+#endif
+
+/* List of possible field types. These are used in the autogenerated code.
+ * Least-significant 4 bits tell the scalar type
+ * Most-significant 4 bits specify repeated/required/packed etc.
+ */
+
+typedef uint_least8_t pb_type_t;
+
+/**** Field data types ****/
+
+/* Numeric types */
+#define PB_LTYPE_BOOL 0x00U /* bool */
+#define PB_LTYPE_VARINT 0x01U /* int32, int64, enum, bool */
+#define PB_LTYPE_UVARINT 0x02U /* uint32, uint64 */
+#define PB_LTYPE_SVARINT 0x03U /* sint32, sint64 */
+#define PB_LTYPE_FIXED32 0x04U /* fixed32, sfixed32, float */
+#define PB_LTYPE_FIXED64 0x05U /* fixed64, sfixed64, double */
+
+/* Marker for last packable field type. */
+#define PB_LTYPE_LAST_PACKABLE 0x05U
+
+/* Byte array with pre-allocated buffer.
+ * data_size is the length of the allocated PB_BYTES_ARRAY structure. */
+#define PB_LTYPE_BYTES 0x06U
+
+/* String with pre-allocated buffer.
+ * data_size is the maximum length. */
+#define PB_LTYPE_STRING 0x07U
+
+/* Submessage
+ * submsg_fields is pointer to field descriptions */
+#define PB_LTYPE_SUBMESSAGE 0x08U
+
+/* Submessage with pre-decoding callback
+ * The pre-decoding callback is stored as pb_callback_t right before pSize.
+ * submsg_fields is pointer to field descriptions */
+#define PB_LTYPE_SUBMSG_W_CB 0x09U
+
+/* Extension pseudo-field
+ * The field contains a pointer to pb_extension_t */
+#define PB_LTYPE_EXTENSION 0x0AU
+
+/* Byte array with inline, pre-allocated byffer.
+ * data_size is the length of the inline, allocated buffer.
+ * This differs from PB_LTYPE_BYTES by defining the element as
+ * pb_byte_t[data_size] rather than pb_bytes_array_t. */
+#define PB_LTYPE_FIXED_LENGTH_BYTES 0x0BU
+
+/* Number of declared LTYPES */
+#define PB_LTYPES_COUNT 0x0CU
+#define PB_LTYPE_MASK 0x0FU
+
+/**** Field repetition rules ****/
+
+#define PB_HTYPE_REQUIRED 0x00U
+#define PB_HTYPE_OPTIONAL 0x10U
+#define PB_HTYPE_SINGULAR 0x10U
+#define PB_HTYPE_REPEATED 0x20U
+#define PB_HTYPE_FIXARRAY 0x20U
+#define PB_HTYPE_ONEOF 0x30U
+#define PB_HTYPE_MASK 0x30U
+
+/**** Field allocation types ****/
+
+#define PB_ATYPE_STATIC 0x00U
+#define PB_ATYPE_POINTER 0x80U
+#define PB_ATYPE_CALLBACK 0x40U
+#define PB_ATYPE_MASK 0xC0U
+
+#define PB_ATYPE(x) ((x) & PB_ATYPE_MASK)
+#define PB_HTYPE(x) ((x) & PB_HTYPE_MASK)
+#define PB_LTYPE(x) ((x) & PB_LTYPE_MASK)
+#define PB_LTYPE_IS_SUBMSG(x) (PB_LTYPE(x) == PB_LTYPE_SUBMESSAGE || \
+ PB_LTYPE(x) == PB_LTYPE_SUBMSG_W_CB)
+
+/* Data type used for storing sizes of struct fields
+ * and array counts.
+ */
+#if defined(PB_FIELD_32BIT)
+ typedef uint32_t pb_size_t;
+ typedef int32_t pb_ssize_t;
+#else
+ typedef uint_least16_t pb_size_t;
+ typedef int_least16_t pb_ssize_t;
+#endif
+#define PB_SIZE_MAX ((pb_size_t)-1)
+
+/* Data type for storing encoded data and other byte streams.
+ * This typedef exists to support platforms where uint8_t does not exist.
+ * You can regard it as equivalent on uint8_t on other platforms.
+ */
+typedef uint_least8_t pb_byte_t;
+
+/* Forward declaration of struct types */
+typedef struct pb_istream_s pb_istream_t;
+typedef struct pb_ostream_s pb_ostream_t;
+typedef struct pb_field_iter_s pb_field_iter_t;
+
+/* This structure is used in auto-generated constants
+ * to specify struct fields.
+ */
+typedef struct pb_msgdesc_s pb_msgdesc_t;
+struct pb_msgdesc_s {
+ const uint32_t *field_info;
+ const pb_msgdesc_t * const * submsg_info;
+ const pb_byte_t *default_value;
+
+ bool (*field_callback)(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_iter_t *field);
+
+ pb_size_t field_count;
+ pb_size_t required_field_count;
+ pb_size_t largest_tag;
+};
+
+/* Iterator for message descriptor */
+struct pb_field_iter_s {
+ const pb_msgdesc_t *descriptor; /* Pointer to message descriptor constant */
+ void *message; /* Pointer to start of the structure */
+
+ pb_size_t index; /* Index of the field */
+ pb_size_t field_info_index; /* Index to descriptor->field_info array */
+ pb_size_t required_field_index; /* Index that counts only the required fields */
+ pb_size_t submessage_index; /* Index that counts only submessages */
+
+ pb_size_t tag; /* Tag of current field */
+ pb_size_t data_size; /* sizeof() of a single item */
+ pb_size_t array_size; /* Number of array entries */
+ pb_type_t type; /* Type of current field */
+
+ void *pField; /* Pointer to current field in struct */
+ void *pData; /* Pointer to current data contents. Different than pField for arrays and pointers. */
+ void *pSize; /* Pointer to count/has field */
+
+ const pb_msgdesc_t *submsg_desc; /* For submessage fields, pointer to field descriptor for the submessage. */
+};
+
+/* For compatibility with legacy code */
+typedef pb_field_iter_t pb_field_t;
+
+/* Make sure that the standard integer types are of the expected sizes.
+ * Otherwise fixed32/fixed64 fields can break.
+ *
+ * If you get errors here, it probably means that your stdint.h is not
+ * correct for your platform.
+ */
+#ifndef PB_WITHOUT_64BIT
+PB_STATIC_ASSERT(sizeof(int64_t) == 2 * sizeof(int32_t), INT64_T_WRONG_SIZE)
+PB_STATIC_ASSERT(sizeof(uint64_t) == 2 * sizeof(uint32_t), UINT64_T_WRONG_SIZE)
+#endif
+
+/* This structure is used for 'bytes' arrays.
+ * It has the number of bytes in the beginning, and after that an array.
+ * Note that actual structs used will have a different length of bytes array.
+ */
+#define PB_BYTES_ARRAY_T(n) struct { pb_size_t size; pb_byte_t bytes[n]; }
+#define PB_BYTES_ARRAY_T_ALLOCSIZE(n) ((size_t)n + offsetof(pb_bytes_array_t, bytes))
+
+struct pb_bytes_array_s {
+ pb_size_t size;
+ pb_byte_t bytes[1];
+};
+typedef struct pb_bytes_array_s pb_bytes_array_t;
+
+/* This structure is used for giving the callback function.
+ * It is stored in the message structure and filled in by the method that
+ * calls pb_decode.
+ *
+ * The decoding callback will be given a limited-length stream
+ * If the wire type was string, the length is the length of the string.
+ * If the wire type was a varint/fixed32/fixed64, the length is the length
+ * of the actual value.
+ * The function may be called multiple times (especially for repeated types,
+ * but also otherwise if the message happens to contain the field multiple
+ * times.)
+ *
+ * The encoding callback will receive the actual output stream.
+ * It should write all the data in one call, including the field tag and
+ * wire type. It can write multiple fields.
+ *
+ * The callback can be null if you want to skip a field.
+ */
+typedef struct pb_callback_s pb_callback_t;
+struct pb_callback_s {
+ /* Callback functions receive a pointer to the arg field.
+ * You can access the value of the field as *arg, and modify it if needed.
+ */
+ union {
+ bool (*decode)(pb_istream_t *stream, const pb_field_t *field, void **arg);
+ bool (*encode)(pb_ostream_t *stream, const pb_field_t *field, void * const *arg);
+ } funcs;
+
+ /* Free arg for use by callback */
+ void *arg;
+};
+
+extern bool pb_default_field_callback(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_t *field);
+
+/* Wire types. Library user needs these only in encoder callbacks. */
+typedef enum {
+ PB_WT_VARINT = 0,
+ PB_WT_64BIT = 1,
+ PB_WT_STRING = 2,
+ PB_WT_32BIT = 5
+} pb_wire_type_t;
+
+/* Structure for defining the handling of unknown/extension fields.
+ * Usually the pb_extension_type_t structure is automatically generated,
+ * while the pb_extension_t structure is created by the user. However,
+ * if you want to catch all unknown fields, you can also create a custom
+ * pb_extension_type_t with your own callback.
+ */
+typedef struct pb_extension_type_s pb_extension_type_t;
+typedef struct pb_extension_s pb_extension_t;
+struct pb_extension_type_s {
+ /* Called for each unknown field in the message.
+ * If you handle the field, read off all of its data and return true.
+ * If you do not handle the field, do not read anything and return true.
+ * If you run into an error, return false.
+ * Set to NULL for default handler.
+ */
+ bool (*decode)(pb_istream_t *stream, pb_extension_t *extension,
+ uint32_t tag, pb_wire_type_t wire_type);
+
+ /* Called once after all regular fields have been encoded.
+ * If you have something to write, do so and return true.
+ * If you do not have anything to write, just return true.
+ * If you run into an error, return false.
+ * Set to NULL for default handler.
+ */
+ bool (*encode)(pb_ostream_t *stream, const pb_extension_t *extension);
+
+ /* Free field for use by the callback. */
+ const void *arg;
+};
+
+struct pb_extension_s {
+ /* Type describing the extension field. Usually you'll initialize
+ * this to a pointer to the automatically generated structure. */
+ const pb_extension_type_t *type;
+
+ /* Destination for the decoded data. This must match the datatype
+ * of the extension field. */
+ void *dest;
+
+ /* Pointer to the next extension handler, or NULL.
+ * If this extension does not match a field, the next handler is
+ * automatically called. */
+ pb_extension_t *next;
+
+ /* The decoder sets this to true if the extension was found.
+ * Ignored for encoding. */
+ bool found;
+};
+
+#define pb_extension_init_zero {NULL,NULL,NULL,false}
+
+/* Memory allocation functions to use. You can define pb_realloc and
+ * pb_free to custom functions if you want. */
+#ifdef PB_ENABLE_MALLOC
+# ifndef pb_realloc
+# define pb_realloc(ptr, size) realloc(ptr, size)
+# endif
+# ifndef pb_free
+# define pb_free(ptr) free(ptr)
+# endif
+#endif
+
+/* This is used to inform about need to regenerate .pb.h/.pb.c files. */
+#define PB_PROTO_HEADER_VERSION 40
+
+/* These macros are used to declare pb_field_t's in the constant array. */
+/* Size of a structure member, in bytes. */
+#define pb_membersize(st, m) (sizeof ((st*)0)->m)
+/* Number of entries in an array. */
+#define pb_arraysize(st, m) (pb_membersize(st, m) / pb_membersize(st, m[0]))
+/* Delta from start of one member to the start of another member. */
+#define pb_delta(st, m1, m2) ((int)offsetof(st, m1) - (int)offsetof(st, m2))
+
+/* Force expansion of macro value */
+#define PB_EXPAND(x) x
+
+/* Binding of a message field set into a specific structure */
+#define PB_BIND(msgname, structname, width) \
+ const uint32_t structname ## _field_info[] PB_PROGMEM = \
+ { \
+ msgname ## _FIELDLIST(PB_GEN_FIELD_INFO_ ## width, structname) \
+ 0 \
+ }; \
+ const pb_msgdesc_t* const structname ## _submsg_info[] = \
+ { \
+ msgname ## _FIELDLIST(PB_GEN_SUBMSG_INFO, structname) \
+ NULL \
+ }; \
+ const pb_msgdesc_t structname ## _msg = \
+ { \
+ structname ## _field_info, \
+ structname ## _submsg_info, \
+ msgname ## _DEFAULT, \
+ msgname ## _CALLBACK, \
+ 0 msgname ## _FIELDLIST(PB_GEN_FIELD_COUNT, structname), \
+ 0 msgname ## _FIELDLIST(PB_GEN_REQ_FIELD_COUNT, structname), \
+ 0 msgname ## _FIELDLIST(PB_GEN_LARGEST_TAG, structname), \
+ }; \
+ msgname ## _FIELDLIST(PB_GEN_FIELD_INFO_ASSERT_ ## width, structname)
+
+#define PB_GEN_FIELD_COUNT(structname, atype, htype, ltype, fieldname, tag) +1
+#define PB_GEN_REQ_FIELD_COUNT(structname, atype, htype, ltype, fieldname, tag) \
+ + (PB_HTYPE_ ## htype == PB_HTYPE_REQUIRED)
+#define PB_GEN_LARGEST_TAG(structname, atype, htype, ltype, fieldname, tag) \
+ * 0 + tag
+
+/* X-macro for generating the entries in struct_field_info[] array. */
+#define PB_GEN_FIELD_INFO_1(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_1(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_2(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_2(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_4(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_4(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_8(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_8(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_AUTO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_AUTO2(PB_FIELDINFO_WIDTH_AUTO(_PB_ATYPE_ ## atype, _PB_HTYPE_ ## htype, _PB_LTYPE_ ## ltype), \
+ tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_FIELDINFO_AUTO2(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_FIELDINFO_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ ## width(tag, type, data_offset, data_size, size_offset, array_size)
+
+/* X-macro for generating asserts that entries fit in struct_field_info[] array.
+ * The structure of macros here must match the structure above in PB_GEN_FIELD_INFO_x(),
+ * but it is not easily reused because of how macro substitutions work. */
+#define PB_GEN_FIELD_INFO_ASSERT_1(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_1(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_2(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_2(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_4(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_4(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_8(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_8(tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_GEN_FIELD_INFO_ASSERT_AUTO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_FIELDINFO_ASSERT_AUTO2(PB_FIELDINFO_WIDTH_AUTO(_PB_ATYPE_ ## atype, _PB_HTYPE_ ## htype, _PB_LTYPE_ ## ltype), \
+ tag, PB_ATYPE_ ## atype | PB_HTYPE_ ## htype | PB_LTYPE_MAP_ ## ltype, \
+ PB_DATA_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_DATA_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_SIZE_OFFSET_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname), \
+ PB_ARRAY_SIZE_ ## atype(_PB_HTYPE_ ## htype, structname, fieldname))
+
+#define PB_FIELDINFO_ASSERT_AUTO2(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ASSERT_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_FIELDINFO_ASSERT_AUTO3(width, tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_FIELDINFO_ASSERT_ ## width(tag, type, data_offset, data_size, size_offset, array_size)
+
+#define PB_DATA_OFFSET_STATIC(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DATA_OFFSET_POINTER(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DATA_OFFSET_CALLBACK(htype, structname, fieldname) PB_DO ## htype(structname, fieldname)
+#define PB_DO_PB_HTYPE_REQUIRED(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_SINGULAR(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_ONEOF(structname, fieldname) offsetof(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DO_PB_HTYPE_OPTIONAL(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_REPEATED(structname, fieldname) offsetof(structname, fieldname)
+#define PB_DO_PB_HTYPE_FIXARRAY(structname, fieldname) offsetof(structname, fieldname)
+
+#define PB_SIZE_OFFSET_STATIC(htype, structname, fieldname) PB_SO ## htype(structname, fieldname)
+#define PB_SIZE_OFFSET_POINTER(htype, structname, fieldname) PB_SO_PTR ## htype(structname, fieldname)
+#define PB_SIZE_OFFSET_CALLBACK(htype, structname, fieldname) PB_SO_CB ## htype(structname, fieldname)
+#define PB_SO_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF2(structname, PB_ONEOF_NAME(FULL, fieldname), PB_ONEOF_NAME(UNION, fieldname))
+#define PB_SO_PB_HTYPE_ONEOF2(structname, fullname, unionname) PB_SO_PB_HTYPE_ONEOF3(structname, fullname, unionname)
+#define PB_SO_PB_HTYPE_ONEOF3(structname, fullname, unionname) pb_delta(structname, fullname, which_ ## unionname)
+#define PB_SO_PB_HTYPE_OPTIONAL(structname, fieldname) pb_delta(structname, fieldname, has_ ## fieldname)
+#define PB_SO_PB_HTYPE_REPEATED(structname, fieldname) pb_delta(structname, fieldname, fieldname ## _count)
+#define PB_SO_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF(structname, fieldname)
+#define PB_SO_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) 0
+#define PB_SO_PTR_PB_HTYPE_REPEATED(structname, fieldname) PB_SO_PB_HTYPE_REPEATED(structname, fieldname)
+#define PB_SO_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_REQUIRED(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_SINGULAR(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_ONEOF(structname, fieldname) PB_SO_PB_HTYPE_ONEOF(structname, fieldname)
+#define PB_SO_CB_PB_HTYPE_OPTIONAL(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_REPEATED(structname, fieldname) 0
+#define PB_SO_CB_PB_HTYPE_FIXARRAY(structname, fieldname) 0
+
+#define PB_ARRAY_SIZE_STATIC(htype, structname, fieldname) PB_AS ## htype(structname, fieldname)
+#define PB_ARRAY_SIZE_POINTER(htype, structname, fieldname) PB_AS_PTR ## htype(structname, fieldname)
+#define PB_ARRAY_SIZE_CALLBACK(htype, structname, fieldname) 1
+#define PB_AS_PB_HTYPE_REQUIRED(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_SINGULAR(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_OPTIONAL(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_ONEOF(structname, fieldname) 1
+#define PB_AS_PB_HTYPE_REPEATED(structname, fieldname) pb_arraysize(structname, fieldname)
+#define PB_AS_PB_HTYPE_FIXARRAY(structname, fieldname) pb_arraysize(structname, fieldname)
+#define PB_AS_PTR_PB_HTYPE_REQUIRED(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_SINGULAR(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_ONEOF(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_REPEATED(structname, fieldname) 1
+#define PB_AS_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) pb_arraysize(structname, fieldname[0])
+
+#define PB_DATA_SIZE_STATIC(htype, structname, fieldname) PB_DS ## htype(structname, fieldname)
+#define PB_DATA_SIZE_POINTER(htype, structname, fieldname) PB_DS_PTR ## htype(structname, fieldname)
+#define PB_DATA_SIZE_CALLBACK(htype, structname, fieldname) PB_DS_CB ## htype(structname, fieldname)
+#define PB_DS_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DS_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname)[0])
+#define PB_DS_PTR_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname[0])
+#define PB_DS_PTR_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname[0][0])
+#define PB_DS_CB_PB_HTYPE_REQUIRED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_SINGULAR(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_OPTIONAL(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_ONEOF(structname, fieldname) pb_membersize(structname, PB_ONEOF_NAME(FULL, fieldname))
+#define PB_DS_CB_PB_HTYPE_REPEATED(structname, fieldname) pb_membersize(structname, fieldname)
+#define PB_DS_CB_PB_HTYPE_FIXARRAY(structname, fieldname) pb_membersize(structname, fieldname)
+
+#define PB_ONEOF_NAME(type, tuple) PB_EXPAND(PB_ONEOF_NAME_ ## type tuple)
+#define PB_ONEOF_NAME_UNION(unionname,membername,fullname) unionname
+#define PB_ONEOF_NAME_MEMBER(unionname,membername,fullname) membername
+#define PB_ONEOF_NAME_FULL(unionname,membername,fullname) fullname
+
+#define PB_GEN_SUBMSG_INFO(structname, atype, htype, ltype, fieldname, tag) \
+ PB_SUBMSG_INFO_ ## htype(_PB_LTYPE_ ## ltype, structname, fieldname)
+
+#define PB_SUBMSG_INFO_REQUIRED(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_SINGULAR(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_OPTIONAL(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_ONEOF(ltype, structname, fieldname) PB_SUBMSG_INFO_ONEOF2(ltype, structname, PB_ONEOF_NAME(UNION, fieldname), PB_ONEOF_NAME(MEMBER, fieldname))
+#define PB_SUBMSG_INFO_ONEOF2(ltype, structname, unionname, membername) PB_SUBMSG_INFO_ONEOF3(ltype, structname, unionname, membername)
+#define PB_SUBMSG_INFO_ONEOF3(ltype, structname, unionname, membername) PB_SI ## ltype(structname ## _ ## unionname ## _ ## membername ## _MSGTYPE)
+#define PB_SUBMSG_INFO_REPEATED(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SUBMSG_INFO_FIXARRAY(ltype, structname, fieldname) PB_SI ## ltype(structname ## _ ## fieldname ## _MSGTYPE)
+#define PB_SI_PB_LTYPE_BOOL(t)
+#define PB_SI_PB_LTYPE_BYTES(t)
+#define PB_SI_PB_LTYPE_DOUBLE(t)
+#define PB_SI_PB_LTYPE_ENUM(t)
+#define PB_SI_PB_LTYPE_UENUM(t)
+#define PB_SI_PB_LTYPE_FIXED32(t)
+#define PB_SI_PB_LTYPE_FIXED64(t)
+#define PB_SI_PB_LTYPE_FLOAT(t)
+#define PB_SI_PB_LTYPE_INT32(t)
+#define PB_SI_PB_LTYPE_INT64(t)
+#define PB_SI_PB_LTYPE_MESSAGE(t) PB_SUBMSG_DESCRIPTOR(t)
+#define PB_SI_PB_LTYPE_MSG_W_CB(t) PB_SUBMSG_DESCRIPTOR(t)
+#define PB_SI_PB_LTYPE_SFIXED32(t)
+#define PB_SI_PB_LTYPE_SFIXED64(t)
+#define PB_SI_PB_LTYPE_SINT32(t)
+#define PB_SI_PB_LTYPE_SINT64(t)
+#define PB_SI_PB_LTYPE_STRING(t)
+#define PB_SI_PB_LTYPE_UINT32(t)
+#define PB_SI_PB_LTYPE_UINT64(t)
+#define PB_SI_PB_LTYPE_EXTENSION(t)
+#define PB_SI_PB_LTYPE_FIXED_LENGTH_BYTES(t)
+#define PB_SUBMSG_DESCRIPTOR(t) &(t ## _msg),
+
+/* The field descriptors use a variable width format, with width of either
+ * 1, 2, 4 or 8 of 32-bit words. The two lowest bytes of the first byte always
+ * encode the descriptor size, 6 lowest bits of field tag number, and 8 bits
+ * of the field type.
+ *
+ * Descriptor size is encoded as 0 = 1 word, 1 = 2 words, 2 = 4 words, 3 = 8 words.
+ *
+ * Formats, listed starting with the least significant bit of the first word.
+ * 1 word: [2-bit len] [6-bit tag] [8-bit type] [8-bit data_offset] [4-bit size_offset] [4-bit data_size]
+ *
+ * 2 words: [2-bit len] [6-bit tag] [8-bit type] [12-bit array_size] [4-bit size_offset]
+ * [16-bit data_offset] [12-bit data_size] [4-bit tag>>6]
+ *
+ * 4 words: [2-bit len] [6-bit tag] [8-bit type] [16-bit array_size]
+ * [8-bit size_offset] [24-bit tag>>6]
+ * [32-bit data_offset]
+ * [32-bit data_size]
+ *
+ * 8 words: [2-bit len] [6-bit tag] [8-bit type] [16-bit reserved]
+ * [8-bit size_offset] [24-bit tag>>6]
+ * [32-bit data_offset]
+ * [32-bit data_size]
+ * [32-bit array_size]
+ * [32-bit reserved]
+ * [32-bit reserved]
+ * [32-bit reserved]
+ */
+
+#define PB_FIELDINFO_1(tag, type, data_offset, data_size, size_offset, array_size) \
+ (0 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(data_offset) & 0xFF) << 16) | \
+ (((uint32_t)(size_offset) & 0x0F) << 24) | (((uint32_t)(data_size) & 0x0F) << 28)),
+
+#define PB_FIELDINFO_2(tag, type, data_offset, data_size, size_offset, array_size) \
+ (1 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(array_size) & 0xFFF) << 16) | (((uint32_t)(size_offset) & 0x0F) << 28)), \
+ (((uint32_t)(data_offset) & 0xFFFF) | (((uint32_t)(data_size) & 0xFFF) << 16) | (((uint32_t)(tag) & 0x3c0) << 22)),
+
+#define PB_FIELDINFO_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ (2 | (((tag) << 2) & 0xFF) | ((type) << 8) | (((uint32_t)(array_size) & 0xFFFF) << 16)), \
+ ((uint32_t)(int_least8_t)(size_offset) | (((uint32_t)(tag) << 2) & 0xFFFFFF00)), \
+ (data_offset), (data_size),
+
+#define PB_FIELDINFO_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ (3 | (((tag) << 2) & 0xFF) | ((type) << 8)), \
+ ((uint32_t)(int_least8_t)(size_offset) | (((uint32_t)(tag) << 2) & 0xFFFFFF00)), \
+ (data_offset), (data_size), (array_size), 0, 0, 0,
+
+/* These assertions verify that the field information fits in the allocated space.
+ * The generator tries to automatically determine the correct width that can fit all
+ * data associated with a message. These asserts will fail only if there has been a
+ * problem in the automatic logic - this may be worth reporting as a bug. As a workaround,
+ * you can increase the descriptor width by defining PB_FIELDINFO_WIDTH or by setting
+ * descriptorsize option in .options file.
+ */
+#define PB_FITS(value,bits) ((uint32_t)(value) < ((uint32_t)1<<bits))
+#define PB_FIELDINFO_ASSERT_1(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,6) && PB_FITS(data_offset,8) && PB_FITS(size_offset,4) && PB_FITS(data_size,4) && PB_FITS(array_size,1), FIELDINFO_DOES_NOT_FIT_width1_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_2(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,10) && PB_FITS(data_offset,16) && PB_FITS(size_offset,4) && PB_FITS(data_size,12) && PB_FITS(array_size,12), FIELDINFO_DOES_NOT_FIT_width2_field ## tag)
+
+#ifndef PB_FIELD_32BIT
+/* Maximum field sizes are still 16-bit if pb_size_t is 16-bit */
+#define PB_FIELDINFO_ASSERT_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,16) && PB_FITS(data_offset,16) && PB_FITS((int_least8_t)size_offset,8) && PB_FITS(data_size,16) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width4_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,16) && PB_FITS(data_offset,16) && PB_FITS((int_least8_t)size_offset,8) && PB_FITS(data_size,16) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width8_field ## tag)
+#else
+/* Up to 32-bit fields supported.
+ * Note that the checks are against 31 bits to avoid compiler warnings about shift wider than type in the test.
+ * I expect that there is no reasonable use for >2GB messages with nanopb anyway.
+ */
+#define PB_FIELDINFO_ASSERT_4(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,30) && PB_FITS(data_offset,31) && PB_FITS(size_offset,8) && PB_FITS(data_size,31) && PB_FITS(array_size,16), FIELDINFO_DOES_NOT_FIT_width4_field ## tag)
+
+#define PB_FIELDINFO_ASSERT_8(tag, type, data_offset, data_size, size_offset, array_size) \
+ PB_STATIC_ASSERT(PB_FITS(tag,30) && PB_FITS(data_offset,31) && PB_FITS(size_offset,8) && PB_FITS(data_size,31) && PB_FITS(array_size,31), FIELDINFO_DOES_NOT_FIT_width8_field ## tag)
+#endif
+
+
+/* Automatic picking of FIELDINFO width:
+ * Uses width 1 when possible, otherwise resorts to width 2.
+ * This is used when PB_BIND() is called with "AUTO" as the argument.
+ * The generator will give explicit size argument when it knows that a message
+ * structure grows beyond 1-word format limits.
+ */
+#define PB_FIELDINFO_WIDTH_AUTO(atype, htype, ltype) PB_FI_WIDTH ## atype(htype, ltype)
+#define PB_FI_WIDTH_PB_ATYPE_STATIC(htype, ltype) PB_FI_WIDTH ## htype(ltype)
+#define PB_FI_WIDTH_PB_ATYPE_POINTER(htype, ltype) PB_FI_WIDTH ## htype(ltype)
+#define PB_FI_WIDTH_PB_ATYPE_CALLBACK(htype, ltype) 2
+#define PB_FI_WIDTH_PB_HTYPE_REQUIRED(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_SINGULAR(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_OPTIONAL(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_ONEOF(ltype) PB_FI_WIDTH ## ltype
+#define PB_FI_WIDTH_PB_HTYPE_REPEATED(ltype) 2
+#define PB_FI_WIDTH_PB_HTYPE_FIXARRAY(ltype) 2
+#define PB_FI_WIDTH_PB_LTYPE_BOOL 1
+#define PB_FI_WIDTH_PB_LTYPE_BYTES 2
+#define PB_FI_WIDTH_PB_LTYPE_DOUBLE 1
+#define PB_FI_WIDTH_PB_LTYPE_ENUM 1
+#define PB_FI_WIDTH_PB_LTYPE_UENUM 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED32 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED64 1
+#define PB_FI_WIDTH_PB_LTYPE_FLOAT 1
+#define PB_FI_WIDTH_PB_LTYPE_INT32 1
+#define PB_FI_WIDTH_PB_LTYPE_INT64 1
+#define PB_FI_WIDTH_PB_LTYPE_MESSAGE 2
+#define PB_FI_WIDTH_PB_LTYPE_MSG_W_CB 2
+#define PB_FI_WIDTH_PB_LTYPE_SFIXED32 1
+#define PB_FI_WIDTH_PB_LTYPE_SFIXED64 1
+#define PB_FI_WIDTH_PB_LTYPE_SINT32 1
+#define PB_FI_WIDTH_PB_LTYPE_SINT64 1
+#define PB_FI_WIDTH_PB_LTYPE_STRING 2
+#define PB_FI_WIDTH_PB_LTYPE_UINT32 1
+#define PB_FI_WIDTH_PB_LTYPE_UINT64 1
+#define PB_FI_WIDTH_PB_LTYPE_EXTENSION 1
+#define PB_FI_WIDTH_PB_LTYPE_FIXED_LENGTH_BYTES 2
+
+/* The mapping from protobuf types to LTYPEs is done using these macros. */
+#define PB_LTYPE_MAP_BOOL PB_LTYPE_BOOL
+#define PB_LTYPE_MAP_BYTES PB_LTYPE_BYTES
+#define PB_LTYPE_MAP_DOUBLE PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_ENUM PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_UENUM PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_FIXED32 PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_FIXED64 PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_FLOAT PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_INT32 PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_INT64 PB_LTYPE_VARINT
+#define PB_LTYPE_MAP_MESSAGE PB_LTYPE_SUBMESSAGE
+#define PB_LTYPE_MAP_MSG_W_CB PB_LTYPE_SUBMSG_W_CB
+#define PB_LTYPE_MAP_SFIXED32 PB_LTYPE_FIXED32
+#define PB_LTYPE_MAP_SFIXED64 PB_LTYPE_FIXED64
+#define PB_LTYPE_MAP_SINT32 PB_LTYPE_SVARINT
+#define PB_LTYPE_MAP_SINT64 PB_LTYPE_SVARINT
+#define PB_LTYPE_MAP_STRING PB_LTYPE_STRING
+#define PB_LTYPE_MAP_UINT32 PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_UINT64 PB_LTYPE_UVARINT
+#define PB_LTYPE_MAP_EXTENSION PB_LTYPE_EXTENSION
+#define PB_LTYPE_MAP_FIXED_LENGTH_BYTES PB_LTYPE_FIXED_LENGTH_BYTES
+
+/* These macros are used for giving out error messages.
+ * They are mostly a debugging aid; the main error information
+ * is the true/false return value from functions.
+ * Some code space can be saved by disabling the error
+ * messages if not used.
+ *
+ * PB_SET_ERROR() sets the error message if none has been set yet.
+ * msg must be a constant string literal.
+ * PB_GET_ERROR() always returns a pointer to a string.
+ * PB_RETURN_ERROR() sets the error and returns false from current
+ * function.
+ */
+#ifdef PB_NO_ERRMSG
+#define PB_SET_ERROR(stream, msg) PB_UNUSED(stream)
+#define PB_GET_ERROR(stream) "(errmsg disabled)"
+#else
+#define PB_SET_ERROR(stream, msg) (stream->errmsg = (stream)->errmsg ? (stream)->errmsg : (msg))
+#define PB_GET_ERROR(stream) ((stream)->errmsg ? (stream)->errmsg : "(none)")
+#endif
+
+#define PB_RETURN_ERROR(stream, msg) return PB_SET_ERROR(stream, msg), false
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#ifdef __cplusplus
+#if __cplusplus >= 201103L
+#define PB_CONSTEXPR constexpr
+#else // __cplusplus >= 201103L
+#define PB_CONSTEXPR
+#endif // __cplusplus >= 201103L
+
+#if __cplusplus >= 201703L
+#define PB_INLINE_CONSTEXPR inline constexpr
+#else // __cplusplus >= 201703L
+#define PB_INLINE_CONSTEXPR PB_CONSTEXPR
+#endif // __cplusplus >= 201703L
+
+namespace nanopb {
+// Each type will be partially specialized by the generator.
+template <typename GenMessageT> struct MessageDescriptor;
+} // namespace nanopb
+#endif /* __cplusplus */
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_common.c b/security/container/protos/nanopb/pb_common.c
new file mode 100644
index 0000000..6aee76b
--- /dev/null
+++ b/security/container/protos/nanopb/pb_common.c
@@ -0,0 +1,388 @@
+/* pb_common.c: Common support functions for pb_encode.c and pb_decode.c.
+ *
+ * 2014 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+#include "pb_common.h"
+
+static bool load_descriptor_values(pb_field_iter_t *iter)
+{
+ uint32_t word0;
+ uint32_t data_offset;
+ int_least8_t size_offset;
+
+ if (iter->index >= iter->descriptor->field_count)
+ return false;
+
+ word0 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+ iter->type = (pb_type_t)((word0 >> 8) & 0xFF);
+
+ switch(word0 & 3)
+ {
+ case 0: {
+ /* 1-word format */
+ iter->array_size = 1;
+ iter->tag = (pb_size_t)((word0 >> 2) & 0x3F);
+ size_offset = (int_least8_t)((word0 >> 24) & 0x0F);
+ data_offset = (word0 >> 16) & 0xFF;
+ iter->data_size = (pb_size_t)((word0 >> 28) & 0x0F);
+ break;
+ }
+
+ case 1: {
+ /* 2-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+
+ iter->array_size = (pb_size_t)((word0 >> 16) & 0x0FFF);
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 28) << 6));
+ size_offset = (int_least8_t)((word0 >> 28) & 0x0F);
+ data_offset = word1 & 0xFFFF;
+ iter->data_size = (pb_size_t)((word1 >> 16) & 0x0FFF);
+ break;
+ }
+
+ case 2: {
+ /* 4-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+ uint32_t word2 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 2]);
+ uint32_t word3 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 3]);
+
+ iter->array_size = (pb_size_t)(word0 >> 16);
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 8) << 6));
+ size_offset = (int_least8_t)(word1 & 0xFF);
+ data_offset = word2;
+ iter->data_size = (pb_size_t)word3;
+ break;
+ }
+
+ default: {
+ /* 8-word format */
+ uint32_t word1 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 1]);
+ uint32_t word2 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 2]);
+ uint32_t word3 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 3]);
+ uint32_t word4 = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index + 4]);
+
+ iter->array_size = (pb_size_t)word4;
+ iter->tag = (pb_size_t)(((word0 >> 2) & 0x3F) | ((word1 >> 8) << 6));
+ size_offset = (int_least8_t)(word1 & 0xFF);
+ data_offset = word2;
+ iter->data_size = (pb_size_t)word3;
+ break;
+ }
+ }
+
+ if (!iter->message)
+ {
+ /* Avoid doing arithmetic on null pointers, it is undefined */
+ iter->pField = NULL;
+ iter->pSize = NULL;
+ }
+ else
+ {
+ iter->pField = (char*)iter->message + data_offset;
+
+ if (size_offset)
+ {
+ iter->pSize = (char*)iter->pField - size_offset;
+ }
+ else if (PB_HTYPE(iter->type) == PB_HTYPE_REPEATED &&
+ (PB_ATYPE(iter->type) == PB_ATYPE_STATIC ||
+ PB_ATYPE(iter->type) == PB_ATYPE_POINTER))
+ {
+ /* Fixed count array */
+ iter->pSize = &iter->array_size;
+ }
+ else
+ {
+ iter->pSize = NULL;
+ }
+
+ if (PB_ATYPE(iter->type) == PB_ATYPE_POINTER && iter->pField != NULL)
+ {
+ iter->pData = *(void**)iter->pField;
+ }
+ else
+ {
+ iter->pData = iter->pField;
+ }
+ }
+
+ if (PB_LTYPE_IS_SUBMSG(iter->type))
+ {
+ iter->submsg_desc = iter->descriptor->submsg_info[iter->submessage_index];
+ }
+ else
+ {
+ iter->submsg_desc = NULL;
+ }
+
+ return true;
+}
+
+static void advance_iterator(pb_field_iter_t *iter)
+{
+ iter->index++;
+
+ if (iter->index >= iter->descriptor->field_count)
+ {
+ /* Restart */
+ iter->index = 0;
+ iter->field_info_index = 0;
+ iter->submessage_index = 0;
+ iter->required_field_index = 0;
+ }
+ else
+ {
+ /* Increment indexes based on previous field type.
+ * All field info formats have the following fields:
+ * - lowest 2 bits tell the amount of words in the descriptor (2^n words)
+ * - bits 2..7 give the lowest bits of tag number.
+ * - bits 8..15 give the field type.
+ */
+ uint32_t prev_descriptor = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+ pb_type_t prev_type = (prev_descriptor >> 8) & 0xFF;
+ pb_size_t descriptor_len = (pb_size_t)(1 << (prev_descriptor & 3));
+
+ /* Add to fields.
+ * The cast to pb_size_t is needed to avoid -Wconversion warning.
+ * Because the data is is constants from generator, there is no danger of overflow.
+ */
+ iter->field_info_index = (pb_size_t)(iter->field_info_index + descriptor_len);
+ iter->required_field_index = (pb_size_t)(iter->required_field_index + (PB_HTYPE(prev_type) == PB_HTYPE_REQUIRED));
+ iter->submessage_index = (pb_size_t)(iter->submessage_index + PB_LTYPE_IS_SUBMSG(prev_type));
+ }
+}
+
+bool pb_field_iter_begin(pb_field_iter_t *iter, const pb_msgdesc_t *desc, void *message)
+{
+ memset(iter, 0, sizeof(*iter));
+
+ iter->descriptor = desc;
+ iter->message = message;
+
+ return load_descriptor_values(iter);
+}
+
+bool pb_field_iter_begin_extension(pb_field_iter_t *iter, pb_extension_t *extension)
+{
+ const pb_msgdesc_t *msg = (const pb_msgdesc_t*)extension->type->arg;
+ bool status;
+
+ uint32_t word0 = PB_PROGMEM_READU32(msg->field_info[0]);
+ if (PB_ATYPE(word0 >> 8) == PB_ATYPE_POINTER)
+ {
+ /* For pointer extensions, the pointer is stored directly
+ * in the extension structure. This avoids having an extra
+ * indirection. */
+ status = pb_field_iter_begin(iter, msg, &extension->dest);
+ }
+ else
+ {
+ status = pb_field_iter_begin(iter, msg, extension->dest);
+ }
+
+ iter->pSize = &extension->found;
+ return status;
+}
+
+bool pb_field_iter_next(pb_field_iter_t *iter)
+{
+ advance_iterator(iter);
+ (void)load_descriptor_values(iter);
+ return iter->index != 0;
+}
+
+bool pb_field_iter_find(pb_field_iter_t *iter, uint32_t tag)
+{
+ if (iter->tag == tag)
+ {
+ return true; /* Nothing to do, correct field already. */
+ }
+ else if (tag > iter->descriptor->largest_tag)
+ {
+ return false;
+ }
+ else
+ {
+ pb_size_t start = iter->index;
+ uint32_t fieldinfo;
+
+ if (tag < iter->tag)
+ {
+ /* Fields are in tag number order, so we know that tag is between
+ * 0 and our start position. Setting index to end forces
+ * advance_iterator() call below to restart from beginning. */
+ iter->index = iter->descriptor->field_count;
+ }
+
+ do
+ {
+ /* Advance iterator but don't load values yet */
+ advance_iterator(iter);
+
+ /* Do fast check for tag number match */
+ fieldinfo = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+
+ if (((fieldinfo >> 2) & 0x3F) == (tag & 0x3F))
+ {
+ /* Good candidate, check further */
+ (void)load_descriptor_values(iter);
+
+ if (iter->tag == tag &&
+ PB_LTYPE(iter->type) != PB_LTYPE_EXTENSION)
+ {
+ /* Found it */
+ return true;
+ }
+ }
+ } while (iter->index != start);
+
+ /* Searched all the way back to start, and found nothing. */
+ (void)load_descriptor_values(iter);
+ return false;
+ }
+}
+
+bool pb_field_iter_find_extension(pb_field_iter_t *iter)
+{
+ if (PB_LTYPE(iter->type) == PB_LTYPE_EXTENSION)
+ {
+ return true;
+ }
+ else
+ {
+ pb_size_t start = iter->index;
+ uint32_t fieldinfo;
+
+ do
+ {
+ /* Advance iterator but don't load values yet */
+ advance_iterator(iter);
+
+ /* Do fast check for field type */
+ fieldinfo = PB_PROGMEM_READU32(iter->descriptor->field_info[iter->field_info_index]);
+
+ if (PB_LTYPE((fieldinfo >> 8) & 0xFF) == PB_LTYPE_EXTENSION)
+ {
+ return load_descriptor_values(iter);
+ }
+ } while (iter->index != start);
+
+ /* Searched all the way back to start, and found nothing. */
+ (void)load_descriptor_values(iter);
+ return false;
+ }
+}
+
+static void *pb_const_cast(const void *p)
+{
+ /* Note: this casts away const, in order to use the common field iterator
+ * logic for both encoding and decoding. The cast is done using union
+ * to avoid spurious compiler warnings. */
+ union {
+ void *p1;
+ const void *p2;
+ } t;
+ t.p2 = p;
+ return t.p1;
+}
+
+bool pb_field_iter_begin_const(pb_field_iter_t *iter, const pb_msgdesc_t *desc, const void *message)
+{
+ return pb_field_iter_begin(iter, desc, pb_const_cast(message));
+}
+
+bool pb_field_iter_begin_extension_const(pb_field_iter_t *iter, const pb_extension_t *extension)
+{
+ return pb_field_iter_begin_extension(iter, (pb_extension_t*)pb_const_cast(extension));
+}
+
+bool pb_default_field_callback(pb_istream_t *istream, pb_ostream_t *ostream, const pb_field_t *field)
+{
+ if (field->data_size == sizeof(pb_callback_t))
+ {
+ pb_callback_t *pCallback = (pb_callback_t*)field->pData;
+
+ if (pCallback != NULL)
+ {
+ if (istream != NULL && pCallback->funcs.decode != NULL)
+ {
+ return pCallback->funcs.decode(istream, field, &pCallback->arg);
+ }
+
+ if (ostream != NULL && pCallback->funcs.encode != NULL)
+ {
+ return pCallback->funcs.encode(ostream, field, &pCallback->arg);
+ }
+ }
+ }
+
+ return true; /* Success, but didn't do anything */
+
+}
+
+#ifdef PB_VALIDATE_UTF8
+
+/* This function checks whether a string is valid UTF-8 text.
+ *
+ * Algorithm is adapted from https://www.cl.cam.ac.uk/~mgk25/ucs/utf8_check.c
+ * Original copyright: Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> 2005-03-30
+ * Licensed under "Short code license", which allows use under MIT license or
+ * any compatible with it.
+ */
+
+bool pb_validate_utf8(const char *str)
+{
+ const pb_byte_t *s = (const pb_byte_t*)str;
+ while (*s)
+ {
+ if (*s < 0x80)
+ {
+ /* 0xxxxxxx */
+ s++;
+ }
+ else if ((s[0] & 0xe0) == 0xc0)
+ {
+ /* 110XXXXx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[0] & 0xfe) == 0xc0) /* overlong? */
+ return false;
+ else
+ s += 2;
+ }
+ else if ((s[0] & 0xf0) == 0xe0)
+ {
+ /* 1110XXXX 10Xxxxxx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[2] & 0xc0) != 0x80 ||
+ (s[0] == 0xe0 && (s[1] & 0xe0) == 0x80) || /* overlong? */
+ (s[0] == 0xed && (s[1] & 0xe0) == 0xa0) || /* surrogate? */
+ (s[0] == 0xef && s[1] == 0xbf &&
+ (s[2] & 0xfe) == 0xbe)) /* U+FFFE or U+FFFF? */
+ return false;
+ else
+ s += 3;
+ }
+ else if ((s[0] & 0xf8) == 0xf0)
+ {
+ /* 11110XXX 10XXxxxx 10xxxxxx 10xxxxxx */
+ if ((s[1] & 0xc0) != 0x80 ||
+ (s[2] & 0xc0) != 0x80 ||
+ (s[3] & 0xc0) != 0x80 ||
+ (s[0] == 0xf0 && (s[1] & 0xf0) == 0x80) || /* overlong? */
+ (s[0] == 0xf4 && s[1] > 0x8f) || s[0] > 0xf4) /* > U+10FFFF? */
+ return false;
+ else
+ s += 4;
+ }
+ else
+ {
+ return false;
+ }
+ }
+
+ return true;
+}
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_common.h b/security/container/protos/nanopb/pb_common.h
new file mode 100644
index 0000000..58aa90f
--- /dev/null
+++ b/security/container/protos/nanopb/pb_common.h
@@ -0,0 +1,49 @@
+/* pb_common.h: Common support functions for pb_encode.c and pb_decode.c.
+ * These functions are rarely needed by applications directly.
+ */
+
+#ifndef PB_COMMON_H_INCLUDED
+#define PB_COMMON_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Initialize the field iterator structure to beginning.
+ * Returns false if the message type is empty. */
+bool pb_field_iter_begin(pb_field_iter_t *iter, const pb_msgdesc_t *desc, void *message);
+
+/* Get a field iterator for extension field. */
+bool pb_field_iter_begin_extension(pb_field_iter_t *iter, pb_extension_t *extension);
+
+/* Same as pb_field_iter_begin(), but for const message pointer.
+ * Note that the pointers in pb_field_iter_t will be non-const but shouldn't
+ * be written to when using these functions. */
+bool pb_field_iter_begin_const(pb_field_iter_t *iter, const pb_msgdesc_t *desc, const void *message);
+bool pb_field_iter_begin_extension_const(pb_field_iter_t *iter, const pb_extension_t *extension);
+
+/* Advance the iterator to the next field.
+ * Returns false when the iterator wraps back to the first field. */
+bool pb_field_iter_next(pb_field_iter_t *iter);
+
+/* Advance the iterator until it points at a field with the given tag.
+ * Returns false if no such field exists. */
+bool pb_field_iter_find(pb_field_iter_t *iter, uint32_t tag);
+
+/* Find a field with type PB_LTYPE_EXTENSION, or return false if not found.
+ * There can be only one extension range field per message. */
+bool pb_field_iter_find_extension(pb_field_iter_t *iter);
+
+#ifdef PB_VALIDATE_UTF8
+/* Validate UTF-8 text string */
+bool pb_validate_utf8(const char *s);
+#endif
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
+
diff --git a/security/container/protos/nanopb/pb_decode.c b/security/container/protos/nanopb/pb_decode.c
new file mode 100644
index 0000000..b194825
--- /dev/null
+++ b/security/container/protos/nanopb/pb_decode.c
@@ -0,0 +1,1709 @@
+/* pb_decode.c -- decode a protobuf using minimal resources
+ *
+ * 2011 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+/* Use the GCC warn_unused_result attribute to check that all return values
+ * are propagated correctly. On other compilers and gcc before 3.4.0 just
+ * ignore the annotation.
+ */
+#if !defined(__GNUC__) || ( __GNUC__ < 3) || (__GNUC__ == 3 && __GNUC_MINOR__ < 4)
+ #define checkreturn
+#else
+ #define checkreturn __attribute__((warn_unused_result))
+#endif
+
+#include "pb.h"
+#include "pb_decode.h"
+#include "pb_common.h"
+
+/**************************************
+ * Declarations internal to this file *
+ **************************************/
+
+static bool checkreturn buf_read(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+static bool checkreturn pb_decode_varint32_eof(pb_istream_t *stream, uint32_t *dest, bool *eof);
+static bool checkreturn read_raw_value(pb_istream_t *stream, pb_wire_type_t wire_type, pb_byte_t *buf, size_t *size);
+static bool checkreturn decode_basic_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_static_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_pointer_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_callback_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn decode_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field);
+static bool checkreturn default_extension_decoder(pb_istream_t *stream, pb_extension_t *extension, uint32_t tag, pb_wire_type_t wire_type);
+static bool checkreturn decode_extension(pb_istream_t *stream, uint32_t tag, pb_wire_type_t wire_type, pb_extension_t *extension);
+static bool pb_field_set_to_default(pb_field_iter_t *field);
+static bool pb_message_set_to_defaults(pb_field_iter_t *iter);
+static bool checkreturn pb_dec_bool(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_varint(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_bytes(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_string(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_submessage(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_dec_fixed_length_bytes(pb_istream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_skip_varint(pb_istream_t *stream);
+static bool checkreturn pb_skip_string(pb_istream_t *stream);
+
+#ifdef PB_ENABLE_MALLOC
+static bool checkreturn allocate_field(pb_istream_t *stream, void *pData, size_t data_size, size_t array_size);
+static void initialize_pointer_field(void *pItem, pb_field_iter_t *field);
+static bool checkreturn pb_release_union_field(pb_istream_t *stream, pb_field_iter_t *field);
+static void pb_release_single_field(pb_field_iter_t *field);
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+#define pb_int64_t int32_t
+#define pb_uint64_t uint32_t
+#else
+#define pb_int64_t int64_t
+#define pb_uint64_t uint64_t
+#endif
+
+#define PB_WT_PACKED ((pb_wire_type_t)0xFF)
+
+typedef struct {
+ uint32_t bitfield[(PB_MAX_REQUIRED_FIELDS + 31) / 32];
+} pb_fields_seen_t;
+
+/*******************************
+ * pb_istream_t implementation *
+ *******************************/
+
+static bool checkreturn buf_read(pb_istream_t *stream, pb_byte_t *buf, size_t count)
+{
+ size_t i;
+ const pb_byte_t *source = (const pb_byte_t*)stream->state;
+ stream->state = (pb_byte_t*)stream->state + count;
+
+ if (buf != NULL)
+ {
+ for (i = 0; i < count; i++)
+ buf[i] = source[i];
+ }
+
+ return true;
+}
+
+bool checkreturn pb_read(pb_istream_t *stream, pb_byte_t *buf, size_t count)
+{
+ if (count == 0)
+ return true;
+
+#ifndef PB_BUFFER_ONLY
+ if (buf == NULL && stream->callback != buf_read)
+ {
+ /* Skip input bytes */
+ pb_byte_t tmp[16];
+ while (count > 16)
+ {
+ if (!pb_read(stream, tmp, 16))
+ return false;
+
+ count -= 16;
+ }
+
+ return pb_read(stream, tmp, count);
+ }
+#endif
+
+ if (stream->bytes_left < count)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+#ifndef PB_BUFFER_ONLY
+ if (!stream->callback(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ if (!buf_read(stream, buf, count))
+ return false;
+#endif
+
+ stream->bytes_left -= count;
+ return true;
+}
+
+/* Read a single byte from input stream. buf may not be NULL.
+ * This is an optimization for the varint decoding. */
+static bool checkreturn pb_readbyte(pb_istream_t *stream, pb_byte_t *buf)
+{
+ if (stream->bytes_left == 0)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+#ifndef PB_BUFFER_ONLY
+ if (!stream->callback(stream, buf, 1))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ *buf = *(const pb_byte_t*)stream->state;
+ stream->state = (pb_byte_t*)stream->state + 1;
+#endif
+
+ stream->bytes_left--;
+
+ return true;
+}
+
+pb_istream_t pb_istream_from_buffer(const pb_byte_t *buf, size_t msglen)
+{
+ pb_istream_t stream;
+ /* Cast away the const from buf without a compiler error. We are
+ * careful to use it only in a const manner in the callbacks.
+ */
+ union {
+ void *state;
+ const void *c_state;
+ } state;
+#ifdef PB_BUFFER_ONLY
+ stream.callback = NULL;
+#else
+ stream.callback = &buf_read;
+#endif
+ state.c_state = buf;
+ stream.state = state.state;
+ stream.bytes_left = msglen;
+#ifndef PB_NO_ERRMSG
+ stream.errmsg = NULL;
+#endif
+ return stream;
+}
+
+/********************
+ * Helper functions *
+ ********************/
+
+static bool checkreturn pb_decode_varint32_eof(pb_istream_t *stream, uint32_t *dest, bool *eof)
+{
+ pb_byte_t byte;
+ uint32_t result;
+
+ if (!pb_readbyte(stream, &byte))
+ {
+ if (stream->bytes_left == 0)
+ {
+ if (eof)
+ {
+ *eof = true;
+ }
+ }
+
+ return false;
+ }
+
+ if ((byte & 0x80) == 0)
+ {
+ /* Quick case, 1 byte value */
+ result = byte;
+ }
+ else
+ {
+ /* Multibyte case */
+ uint_fast8_t bitpos = 7;
+ result = byte & 0x7F;
+
+ do
+ {
+ if (!pb_readbyte(stream, &byte))
+ return false;
+
+ if (bitpos >= 32)
+ {
+ /* Note: The varint could have trailing 0x80 bytes, or 0xFF for negative. */
+ pb_byte_t sign_extension = (bitpos < 63) ? 0xFF : 0x01;
+ bool valid_extension = ((byte & 0x7F) == 0x00 ||
+ ((result >> 31) != 0 && byte == sign_extension));
+
+ if (bitpos >= 64 || !valid_extension)
+ {
+ PB_RETURN_ERROR(stream, "varint overflow");
+ }
+ }
+ else
+ {
+ result |= (uint32_t)(byte & 0x7F) << bitpos;
+ }
+ bitpos = (uint_fast8_t)(bitpos + 7);
+ } while (byte & 0x80);
+
+ if (bitpos == 35 && (byte & 0x70) != 0)
+ {
+ /* The last byte was at bitpos=28, so only bottom 4 bits fit. */
+ PB_RETURN_ERROR(stream, "varint overflow");
+ }
+ }
+
+ *dest = result;
+ return true;
+}
+
+bool checkreturn pb_decode_varint32(pb_istream_t *stream, uint32_t *dest)
+{
+ return pb_decode_varint32_eof(stream, dest, NULL);
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool checkreturn pb_decode_varint(pb_istream_t *stream, uint64_t *dest)
+{
+ pb_byte_t byte;
+ uint_fast8_t bitpos = 0;
+ uint64_t result = 0;
+
+ do
+ {
+ if (bitpos >= 64)
+ PB_RETURN_ERROR(stream, "varint overflow");
+
+ if (!pb_readbyte(stream, &byte))
+ return false;
+
+ result |= (uint64_t)(byte & 0x7F) << bitpos;
+ bitpos = (uint_fast8_t)(bitpos + 7);
+ } while (byte & 0x80);
+
+ *dest = result;
+ return true;
+}
+#endif
+
+bool checkreturn pb_skip_varint(pb_istream_t *stream)
+{
+ pb_byte_t byte;
+ do
+ {
+ if (!pb_read(stream, &byte, 1))
+ return false;
+ } while (byte & 0x80);
+ return true;
+}
+
+bool checkreturn pb_skip_string(pb_istream_t *stream)
+{
+ uint32_t length;
+ if (!pb_decode_varint32(stream, &length))
+ return false;
+
+ if ((size_t)length != length)
+ {
+ PB_RETURN_ERROR(stream, "size too large");
+ }
+
+ return pb_read(stream, NULL, (size_t)length);
+}
+
+bool checkreturn pb_decode_tag(pb_istream_t *stream, pb_wire_type_t *wire_type, uint32_t *tag, bool *eof)
+{
+ uint32_t temp;
+ *eof = false;
+ *wire_type = (pb_wire_type_t) 0;
+ *tag = 0;
+
+ if (!pb_decode_varint32_eof(stream, &temp, eof))
+ {
+ return false;
+ }
+
+ *tag = temp >> 3;
+ *wire_type = (pb_wire_type_t)(temp & 7);
+ return true;
+}
+
+bool checkreturn pb_skip_field(pb_istream_t *stream, pb_wire_type_t wire_type)
+{
+ switch (wire_type)
+ {
+ case PB_WT_VARINT: return pb_skip_varint(stream);
+ case PB_WT_64BIT: return pb_read(stream, NULL, 8);
+ case PB_WT_STRING: return pb_skip_string(stream);
+ case PB_WT_32BIT: return pb_read(stream, NULL, 4);
+ default: PB_RETURN_ERROR(stream, "invalid wire_type");
+ }
+}
+
+/* Read a raw value to buffer, for the purpose of passing it to callback as
+ * a substream. Size is maximum size on call, and actual size on return.
+ */
+static bool checkreturn read_raw_value(pb_istream_t *stream, pb_wire_type_t wire_type, pb_byte_t *buf, size_t *size)
+{
+ size_t max_size = *size;
+ switch (wire_type)
+ {
+ case PB_WT_VARINT:
+ *size = 0;
+ do
+ {
+ (*size)++;
+ if (*size > max_size)
+ PB_RETURN_ERROR(stream, "varint overflow");
+
+ if (!pb_read(stream, buf, 1))
+ return false;
+ } while (*buf++ & 0x80);
+ return true;
+
+ case PB_WT_64BIT:
+ *size = 8;
+ return pb_read(stream, buf, 8);
+
+ case PB_WT_32BIT:
+ *size = 4;
+ return pb_read(stream, buf, 4);
+
+ case PB_WT_STRING:
+ /* Calling read_raw_value with a PB_WT_STRING is an error.
+ * Explicitly handle this case and fallthrough to default to avoid
+ * compiler warnings.
+ */
+
+ default: PB_RETURN_ERROR(stream, "invalid wire_type");
+ }
+}
+
+/* Decode string length from stream and return a substream with limited length.
+ * Remember to close the substream using pb_close_string_substream().
+ */
+bool checkreturn pb_make_string_substream(pb_istream_t *stream, pb_istream_t *substream)
+{
+ uint32_t size;
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ *substream = *stream;
+ if (substream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "parent stream too short");
+
+ substream->bytes_left = (size_t)size;
+ stream->bytes_left -= (size_t)size;
+ return true;
+}
+
+bool checkreturn pb_close_string_substream(pb_istream_t *stream, pb_istream_t *substream)
+{
+ if (substream->bytes_left) {
+ if (!pb_read(substream, NULL, substream->bytes_left))
+ return false;
+ }
+
+ stream->state = substream->state;
+
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream->errmsg;
+#endif
+ return true;
+}
+
+/*************************
+ * Decode a single field *
+ *************************/
+
+static bool checkreturn decode_basic_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ if (wire_type != PB_WT_VARINT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_bool(stream, field);
+
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ if (wire_type != PB_WT_VARINT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_varint(stream, field);
+
+ case PB_LTYPE_FIXED32:
+ if (wire_type != PB_WT_32BIT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_decode_fixed32(stream, field->pData);
+
+ case PB_LTYPE_FIXED64:
+ if (wire_type != PB_WT_64BIT && wire_type != PB_WT_PACKED)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+ if (field->data_size == sizeof(float))
+ {
+ return pb_decode_double_as_float(stream, (float*)field->pData);
+ }
+#endif
+
+#ifdef PB_WITHOUT_64BIT
+ PB_RETURN_ERROR(stream, "invalid data_size");
+#else
+ return pb_decode_fixed64(stream, field->pData);
+#endif
+
+ case PB_LTYPE_BYTES:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_bytes(stream, field);
+
+ case PB_LTYPE_STRING:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_string(stream, field);
+
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_submessage(stream, field);
+
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ if (wire_type != PB_WT_STRING)
+ PB_RETURN_ERROR(stream, "wrong wire type");
+
+ return pb_dec_fixed_length_bytes(stream, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+static bool checkreturn decode_static_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ switch (PB_HTYPE(field->type))
+ {
+ case PB_HTYPE_REQUIRED:
+ return decode_basic_field(stream, wire_type, field);
+
+ case PB_HTYPE_OPTIONAL:
+ if (field->pSize != NULL)
+ *(bool*)field->pSize = true;
+ return decode_basic_field(stream, wire_type, field);
+
+ case PB_HTYPE_REPEATED:
+ if (wire_type == PB_WT_STRING
+ && PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Packed array */
+ bool status = true;
+ pb_istream_t substream;
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ field->pData = (char*)field->pField + field->data_size * (*size);
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ while (substream.bytes_left > 0 && *size < field->array_size)
+ {
+ if (!decode_basic_field(&substream, PB_WT_PACKED, field))
+ {
+ status = false;
+ break;
+ }
+ (*size)++;
+ field->pData = (char*)field->pData + field->data_size;
+ }
+
+ if (substream.bytes_left != 0)
+ PB_RETURN_ERROR(stream, "array overflow");
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+ }
+ else
+ {
+ /* Repeated field */
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ field->pData = (char*)field->pField + field->data_size * (*size);
+
+ if ((*size)++ >= field->array_size)
+ PB_RETURN_ERROR(stream, "array overflow");
+
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ case PB_HTYPE_ONEOF:
+ if (PB_LTYPE_IS_SUBMSG(field->type) &&
+ *(pb_size_t*)field->pSize != field->tag)
+ {
+ /* We memset to zero so that any callbacks are set to NULL.
+ * This is because the callbacks might otherwise have values
+ * from some other union field.
+ * If callbacks are needed inside oneof field, use .proto
+ * option submsg_callback to have a separate callback function
+ * that can set the fields before submessage is decoded.
+ * pb_dec_submessage() will set any default values. */
+ memset(field->pData, 0, (size_t)field->data_size);
+
+ /* Set default values for the submessage fields. */
+ if (field->submsg_desc->default_value != NULL ||
+ field->submsg_desc->field_callback != NULL ||
+ field->submsg_desc->submsg_info[0] != NULL)
+ {
+ pb_field_iter_t submsg_iter;
+ if (pb_field_iter_begin(&submsg_iter, field->submsg_desc, field->pData))
+ {
+ if (!pb_message_set_to_defaults(&submsg_iter))
+ PB_RETURN_ERROR(stream, "failed to set defaults");
+ }
+ }
+ }
+ *(pb_size_t*)field->pSize = field->tag;
+
+ return decode_basic_field(stream, wire_type, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+#ifdef PB_ENABLE_MALLOC
+/* Allocate storage for the field and store the pointer at iter->pData.
+ * array_size is the number of entries to reserve in an array.
+ * Zero size is not allowed, use pb_free() for releasing.
+ */
+static bool checkreturn allocate_field(pb_istream_t *stream, void *pData, size_t data_size, size_t array_size)
+{
+ void *ptr = *(void**)pData;
+
+ if (data_size == 0 || array_size == 0)
+ PB_RETURN_ERROR(stream, "invalid size");
+
+#ifdef __AVR__
+ /* Workaround for AVR libc bug 53284: http://savannah.nongnu.org/bugs/?53284
+ * Realloc to size of 1 byte can cause corruption of the malloc structures.
+ */
+ if (data_size == 1 && array_size == 1)
+ {
+ data_size = 2;
+ }
+#endif
+
+ /* Check for multiplication overflows.
+ * This code avoids the costly division if the sizes are small enough.
+ * Multiplication is safe as long as only half of bits are set
+ * in either multiplicand.
+ */
+ {
+ const size_t check_limit = (size_t)1 << (sizeof(size_t) * 4);
+ if (data_size >= check_limit || array_size >= check_limit)
+ {
+ const size_t size_max = (size_t)-1;
+ if (size_max / array_size < data_size)
+ {
+ PB_RETURN_ERROR(stream, "size too large");
+ }
+ }
+ }
+
+ /* Allocate new or expand previous allocation */
+ /* Note: on failure the old pointer will remain in the structure,
+ * the message must be freed by caller also on error return. */
+ ptr = pb_realloc(ptr, array_size * data_size);
+ if (ptr == NULL)
+ PB_RETURN_ERROR(stream, "realloc failed");
+
+ *(void**)pData = ptr;
+ return true;
+}
+
+/* Clear a newly allocated item in case it contains a pointer, or is a submessage. */
+static void initialize_pointer_field(void *pItem, pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES)
+ {
+ *(void**)pItem = NULL;
+ }
+ else if (PB_LTYPE_IS_SUBMSG(field->type))
+ {
+ /* We memset to zero so that any callbacks are set to NULL.
+ * Default values will be set by pb_dec_submessage(). */
+ memset(pItem, 0, field->data_size);
+ }
+}
+#endif
+
+static bool checkreturn decode_pointer_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+#ifndef PB_ENABLE_MALLOC
+ PB_UNUSED(wire_type);
+ PB_UNUSED(field);
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ switch (PB_HTYPE(field->type))
+ {
+ case PB_HTYPE_REQUIRED:
+ case PB_HTYPE_OPTIONAL:
+ case PB_HTYPE_ONEOF:
+ if (PB_LTYPE_IS_SUBMSG(field->type) && *(void**)field->pField != NULL)
+ {
+ /* Duplicate field, have to release the old allocation first. */
+ /* FIXME: Does this work correctly for oneofs? */
+ pb_release_single_field(field);
+ }
+
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ *(pb_size_t*)field->pSize = field->tag;
+ }
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES)
+ {
+ /* pb_dec_string and pb_dec_bytes handle allocation themselves */
+ field->pData = field->pField;
+ return decode_basic_field(stream, wire_type, field);
+ }
+ else
+ {
+ if (!allocate_field(stream, field->pField, field->data_size, 1))
+ return false;
+
+ field->pData = *(void**)field->pField;
+ initialize_pointer_field(field->pData, field);
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ case PB_HTYPE_REPEATED:
+ if (wire_type == PB_WT_STRING
+ && PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Packed array, multiple items come in at once. */
+ bool status = true;
+ pb_size_t *size = (pb_size_t*)field->pSize;
+ size_t allocated_size = *size;
+ pb_istream_t substream;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ while (substream.bytes_left)
+ {
+ if (*size == PB_SIZE_MAX)
+ {
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = "too many array entries";
+#endif
+ status = false;
+ break;
+ }
+
+ if ((size_t)*size + 1 > allocated_size)
+ {
+ /* Allocate more storage. This tries to guess the
+ * number of remaining entries. Round the division
+ * upwards. */
+ size_t remain = (substream.bytes_left - 1) / field->data_size + 1;
+ if (remain < PB_SIZE_MAX - allocated_size)
+ allocated_size += remain;
+ else
+ allocated_size += 1;
+
+ if (!allocate_field(&substream, field->pField, field->data_size, allocated_size))
+ {
+ status = false;
+ break;
+ }
+ }
+
+ /* Decode the array entry */
+ field->pData = *(char**)field->pField + field->data_size * (*size);
+ initialize_pointer_field(field->pData, field);
+ if (!decode_basic_field(&substream, PB_WT_PACKED, field))
+ {
+ status = false;
+ break;
+ }
+
+ (*size)++;
+ }
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+ }
+ else
+ {
+ /* Normal repeated field, i.e. only one item at a time. */
+ pb_size_t *size = (pb_size_t*)field->pSize;
+
+ if (*size == PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "too many array entries");
+
+ if (!allocate_field(stream, field->pField, field->data_size, (size_t)(*size + 1)))
+ return false;
+
+ field->pData = *(char**)field->pField + field->data_size * (*size);
+ (*size)++;
+ initialize_pointer_field(field->pData, field);
+ return decode_basic_field(stream, wire_type, field);
+ }
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+#endif
+}
+
+static bool checkreturn decode_callback_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+ if (!field->descriptor->field_callback)
+ return pb_skip_field(stream, wire_type);
+
+ if (wire_type == PB_WT_STRING)
+ {
+ pb_istream_t substream;
+ size_t prev_bytes_left;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ do
+ {
+ prev_bytes_left = substream.bytes_left;
+ if (!field->descriptor->field_callback(&substream, NULL, field))
+ PB_RETURN_ERROR(stream, "callback failed");
+ } while (substream.bytes_left > 0 && substream.bytes_left < prev_bytes_left);
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return true;
+ }
+ else
+ {
+ /* Copy the single scalar value to stack.
+ * This is required so that we can limit the stream length,
+ * which in turn allows to use same callback for packed and
+ * not-packed fields. */
+ pb_istream_t substream;
+ pb_byte_t buffer[10];
+ size_t size = sizeof(buffer);
+
+ if (!read_raw_value(stream, wire_type, buffer, &size))
+ return false;
+ substream = pb_istream_from_buffer(buffer, size);
+
+ return field->descriptor->field_callback(&substream, NULL, field);
+ }
+}
+
+static bool checkreturn decode_field(pb_istream_t *stream, pb_wire_type_t wire_type, pb_field_iter_t *field)
+{
+#ifdef PB_ENABLE_MALLOC
+ /* When decoding an oneof field, check if there is old data that must be
+ * released first. */
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ if (!pb_release_union_field(stream, field))
+ return false;
+ }
+#endif
+
+ switch (PB_ATYPE(field->type))
+ {
+ case PB_ATYPE_STATIC:
+ return decode_static_field(stream, wire_type, field);
+
+ case PB_ATYPE_POINTER:
+ return decode_pointer_field(stream, wire_type, field);
+
+ case PB_ATYPE_CALLBACK:
+ return decode_callback_field(stream, wire_type, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+/* Default handler for extension fields. Expects to have a pb_msgdesc_t
+ * pointer in the extension->type->arg field, pointing to a message with
+ * only one field in it. */
+static bool checkreturn default_extension_decoder(pb_istream_t *stream,
+ pb_extension_t *extension, uint32_t tag, pb_wire_type_t wire_type)
+{
+ pb_field_iter_t iter;
+
+ if (!pb_field_iter_begin_extension(&iter, extension))
+ PB_RETURN_ERROR(stream, "invalid extension");
+
+ if (iter.tag != tag || !iter.message)
+ return true;
+
+ extension->found = true;
+ return decode_field(stream, wire_type, &iter);
+}
+
+/* Try to decode an unknown field as an extension field. Tries each extension
+ * decoder in turn, until one of them handles the field or loop ends. */
+static bool checkreturn decode_extension(pb_istream_t *stream,
+ uint32_t tag, pb_wire_type_t wire_type, pb_extension_t *extension)
+{
+ size_t pos = stream->bytes_left;
+
+ while (extension != NULL && pos == stream->bytes_left)
+ {
+ bool status;
+ if (extension->type->decode)
+ status = extension->type->decode(stream, extension, tag, wire_type);
+ else
+ status = default_extension_decoder(stream, extension, tag, wire_type);
+
+ if (!status)
+ return false;
+
+ extension = extension->next;
+ }
+
+ return true;
+}
+
+/* Initialize message fields to default values, recursively */
+static bool pb_field_set_to_default(pb_field_iter_t *field)
+{
+ pb_type_t type;
+ type = field->type;
+
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ pb_extension_t *ext = *(pb_extension_t* const *)field->pData;
+ while (ext != NULL)
+ {
+ pb_field_iter_t ext_iter;
+ if (pb_field_iter_begin_extension(&ext_iter, ext))
+ {
+ ext->found = false;
+ if (!pb_message_set_to_defaults(&ext_iter))
+ return false;
+ }
+ ext = ext->next;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_STATIC)
+ {
+ bool init_data = true;
+ if (PB_HTYPE(type) == PB_HTYPE_OPTIONAL && field->pSize != NULL)
+ {
+ /* Set has_field to false. Still initialize the optional field
+ * itself also. */
+ *(bool*)field->pSize = false;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_REPEATED ||
+ PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ /* REPEATED: Set array count to 0, no need to initialize contents.
+ ONEOF: Set which_field to 0. */
+ *(pb_size_t*)field->pSize = 0;
+ init_data = false;
+ }
+
+ if (init_data)
+ {
+ if (PB_LTYPE_IS_SUBMSG(field->type) &&
+ (field->submsg_desc->default_value != NULL ||
+ field->submsg_desc->field_callback != NULL ||
+ field->submsg_desc->submsg_info[0] != NULL))
+ {
+ /* Initialize submessage to defaults.
+ * Only needed if it has default values
+ * or callback/submessage fields. */
+ pb_field_iter_t submsg_iter;
+ if (pb_field_iter_begin(&submsg_iter, field->submsg_desc, field->pData))
+ {
+ if (!pb_message_set_to_defaults(&submsg_iter))
+ return false;
+ }
+ }
+ else
+ {
+ /* Initialize to zeros */
+ memset(field->pData, 0, (size_t)field->data_size);
+ }
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ /* Initialize the pointer to NULL. */
+ *(void**)field->pField = NULL;
+
+ /* Initialize array count to 0. */
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED ||
+ PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ *(pb_size_t*)field->pSize = 0;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_CALLBACK)
+ {
+ /* Don't overwrite callback */
+ }
+
+ return true;
+}
+
+static bool pb_message_set_to_defaults(pb_field_iter_t *iter)
+{
+ pb_istream_t defstream = PB_ISTREAM_EMPTY;
+ uint32_t tag = 0;
+ pb_wire_type_t wire_type = PB_WT_VARINT;
+ bool eof;
+
+ if (iter->descriptor->default_value)
+ {
+ defstream = pb_istream_from_buffer(iter->descriptor->default_value, (size_t)-1);
+ if (!pb_decode_tag(&defstream, &wire_type, &tag, &eof))
+ return false;
+ }
+
+ do
+ {
+ if (!pb_field_set_to_default(iter))
+ return false;
+
+ if (tag != 0 && iter->tag == tag)
+ {
+ /* We have a default value for this field in the defstream */
+ if (!decode_field(&defstream, wire_type, iter))
+ return false;
+ if (!pb_decode_tag(&defstream, &wire_type, &tag, &eof))
+ return false;
+
+ if (iter->pSize)
+ *(bool*)iter->pSize = false;
+ }
+ } while (pb_field_iter_next(iter));
+
+ return true;
+}
+
+/*********************
+ * Decode all fields *
+ *********************/
+
+static bool checkreturn pb_decode_inner(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags)
+{
+ uint32_t extension_range_start = 0;
+ pb_extension_t *extensions = NULL;
+
+ /* 'fixed_count_field' and 'fixed_count_size' track position of a repeated fixed
+ * count field. This can only handle _one_ repeated fixed count field that
+ * is unpacked and unordered among other (non repeated fixed count) fields.
+ */
+ pb_size_t fixed_count_field = PB_SIZE_MAX;
+ pb_size_t fixed_count_size = 0;
+ pb_size_t fixed_count_total_size = 0;
+
+ pb_fields_seen_t fields_seen = {{0, 0}};
+ const uint32_t allbits = ~(uint32_t)0;
+ pb_field_iter_t iter;
+
+ if (pb_field_iter_begin(&iter, fields, dest_struct))
+ {
+ if ((flags & PB_DECODE_NOINIT) == 0)
+ {
+ if (!pb_message_set_to_defaults(&iter))
+ PB_RETURN_ERROR(stream, "failed to set defaults");
+ }
+ }
+
+ while (stream->bytes_left)
+ {
+ uint32_t tag;
+ pb_wire_type_t wire_type;
+ bool eof;
+
+ if (!pb_decode_tag(stream, &wire_type, &tag, &eof))
+ {
+ if (eof)
+ break;
+ else
+ return false;
+ }
+
+ if (tag == 0)
+ {
+ if (flags & PB_DECODE_NULLTERMINATED)
+ {
+ break;
+ }
+ else
+ {
+ PB_RETURN_ERROR(stream, "zero tag");
+ }
+ }
+
+ if (!pb_field_iter_find(&iter, tag) || PB_LTYPE(iter.type) == PB_LTYPE_EXTENSION)
+ {
+ /* No match found, check if it matches an extension. */
+ if (extension_range_start == 0)
+ {
+ if (pb_field_iter_find_extension(&iter))
+ {
+ extensions = *(pb_extension_t* const *)iter.pData;
+ extension_range_start = iter.tag;
+ }
+
+ if (!extensions)
+ {
+ extension_range_start = (uint32_t)-1;
+ }
+ }
+
+ if (tag >= extension_range_start)
+ {
+ size_t pos = stream->bytes_left;
+
+ if (!decode_extension(stream, tag, wire_type, extensions))
+ return false;
+
+ if (pos != stream->bytes_left)
+ {
+ /* The field was handled */
+ continue;
+ }
+ }
+
+ /* No match found, skip data */
+ if (!pb_skip_field(stream, wire_type))
+ return false;
+ continue;
+ }
+
+ /* If a repeated fixed count field was found, get size from
+ * 'fixed_count_field' as there is no counter contained in the struct.
+ */
+ if (PB_HTYPE(iter.type) == PB_HTYPE_REPEATED && iter.pSize == &iter.array_size)
+ {
+ if (fixed_count_field != iter.index) {
+ /* If the new fixed count field does not match the previous one,
+ * check that the previous one is NULL or that it finished
+ * receiving all the expected data.
+ */
+ if (fixed_count_field != PB_SIZE_MAX &&
+ fixed_count_size != fixed_count_total_size)
+ {
+ PB_RETURN_ERROR(stream, "wrong size for fixed count field");
+ }
+
+ fixed_count_field = iter.index;
+ fixed_count_size = 0;
+ fixed_count_total_size = iter.array_size;
+ }
+
+ iter.pSize = &fixed_count_size;
+ }
+
+ if (PB_HTYPE(iter.type) == PB_HTYPE_REQUIRED
+ && iter.required_field_index < PB_MAX_REQUIRED_FIELDS)
+ {
+ uint32_t tmp = ((uint32_t)1 << (iter.required_field_index & 31));
+ fields_seen.bitfield[iter.required_field_index >> 5] |= tmp;
+ }
+
+ if (!decode_field(stream, wire_type, &iter))
+ return false;
+ }
+
+ /* Check that all elements of the last decoded fixed count field were present. */
+ if (fixed_count_field != PB_SIZE_MAX &&
+ fixed_count_size != fixed_count_total_size)
+ {
+ PB_RETURN_ERROR(stream, "wrong size for fixed count field");
+ }
+
+ /* Check that all required fields were present. */
+ {
+ pb_size_t req_field_count = iter.descriptor->required_field_count;
+
+ if (req_field_count > 0)
+ {
+ pb_size_t i;
+
+ if (req_field_count > PB_MAX_REQUIRED_FIELDS)
+ req_field_count = PB_MAX_REQUIRED_FIELDS;
+
+ /* Check the whole words */
+ for (i = 0; i < (req_field_count >> 5); i++)
+ {
+ if (fields_seen.bitfield[i] != allbits)
+ PB_RETURN_ERROR(stream, "missing required field");
+ }
+
+ /* Check the remaining bits (if any) */
+ if ((req_field_count & 31) != 0)
+ {
+ if (fields_seen.bitfield[req_field_count >> 5] !=
+ (allbits >> (uint_least8_t)(32 - (req_field_count & 31))))
+ {
+ PB_RETURN_ERROR(stream, "missing required field");
+ }
+ }
+ }
+ }
+
+ return true;
+}
+
+bool checkreturn pb_decode_ex(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags)
+{
+ bool status;
+
+ if ((flags & PB_DECODE_DELIMITED) == 0)
+ {
+ status = pb_decode_inner(stream, fields, dest_struct, flags);
+ }
+ else
+ {
+ pb_istream_t substream;
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ status = pb_decode_inner(&substream, fields, dest_struct, flags);
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+ }
+
+#ifdef PB_ENABLE_MALLOC
+ if (!status)
+ pb_release(fields, dest_struct);
+#endif
+
+ return status;
+}
+
+bool checkreturn pb_decode(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct)
+{
+ bool status;
+
+ status = pb_decode_inner(stream, fields, dest_struct, 0);
+
+#ifdef PB_ENABLE_MALLOC
+ if (!status)
+ pb_release(fields, dest_struct);
+#endif
+
+ return status;
+}
+
+#ifdef PB_ENABLE_MALLOC
+/* Given an oneof field, if there has already been a field inside this oneof,
+ * release it before overwriting with a different one. */
+static bool pb_release_union_field(pb_istream_t *stream, pb_field_iter_t *field)
+{
+ pb_field_iter_t old_field = *field;
+ pb_size_t old_tag = *(pb_size_t*)field->pSize; /* Previous which_ value */
+ pb_size_t new_tag = field->tag; /* New which_ value */
+
+ if (old_tag == 0)
+ return true; /* Ok, no old data in union */
+
+ if (old_tag == new_tag)
+ return true; /* Ok, old data is of same type => merge */
+
+ /* Release old data. The find can fail if the message struct contains
+ * invalid data. */
+ if (!pb_field_iter_find(&old_field, old_tag))
+ PB_RETURN_ERROR(stream, "invalid union tag");
+
+ pb_release_single_field(&old_field);
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+ /* Initialize the pointer to NULL to make sure it is valid
+ * even in case of error return. */
+ *(void**)field->pField = NULL;
+ field->pData = NULL;
+ }
+
+ return true;
+}
+
+static void pb_release_single_field(pb_field_iter_t *field)
+{
+ pb_type_t type;
+ type = field->type;
+
+ if (PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ if (*(pb_size_t*)field->pSize != field->tag)
+ return; /* This is not the current field in the union */
+ }
+
+ /* Release anything contained inside an extension or submsg.
+ * This has to be done even if the submsg itself is statically
+ * allocated. */
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ /* Release fields from all extensions in the linked list */
+ pb_extension_t *ext = *(pb_extension_t**)field->pData;
+ while (ext != NULL)
+ {
+ pb_field_iter_t ext_iter;
+ if (pb_field_iter_begin_extension(&ext_iter, ext))
+ {
+ pb_release_single_field(&ext_iter);
+ }
+ ext = ext->next;
+ }
+ }
+ else if (PB_LTYPE_IS_SUBMSG(type) && PB_ATYPE(type) != PB_ATYPE_CALLBACK)
+ {
+ /* Release fields in submessage or submsg array */
+ pb_size_t count = 1;
+
+ if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ field->pData = *(void**)field->pField;
+ }
+ else
+ {
+ field->pData = field->pField;
+ }
+
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ count = *(pb_size_t*)field->pSize;
+
+ if (PB_ATYPE(type) == PB_ATYPE_STATIC && count > field->array_size)
+ {
+ /* Protect against corrupted _count fields */
+ count = field->array_size;
+ }
+ }
+
+ if (field->pData)
+ {
+ for (; count > 0; count--)
+ {
+ pb_release(field->submsg_desc, field->pData);
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+ }
+
+ if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED &&
+ (PB_LTYPE(type) == PB_LTYPE_STRING ||
+ PB_LTYPE(type) == PB_LTYPE_BYTES))
+ {
+ /* Release entries in repeated string or bytes array */
+ void **pItem = *(void***)field->pField;
+ pb_size_t count = *(pb_size_t*)field->pSize;
+ for (; count > 0; count--)
+ {
+ pb_free(*pItem);
+ *pItem++ = NULL;
+ }
+ }
+
+ if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ /* We are going to release the array, so set the size to 0 */
+ *(pb_size_t*)field->pSize = 0;
+ }
+
+ /* Release main pointer */
+ pb_free(*(void**)field->pField);
+ *(void**)field->pField = NULL;
+ }
+}
+
+void pb_release(const pb_msgdesc_t *fields, void *dest_struct)
+{
+ pb_field_iter_t iter;
+
+ if (!dest_struct)
+ return; /* Ignore NULL pointers, similar to free() */
+
+ if (!pb_field_iter_begin(&iter, fields, dest_struct))
+ return; /* Empty message type */
+
+ do
+ {
+ pb_release_single_field(&iter);
+ } while (pb_field_iter_next(&iter));
+}
+#endif
+
+/* Field decoders */
+
+bool pb_decode_bool(pb_istream_t *stream, bool *dest)
+{
+ uint32_t value;
+ if (!pb_decode_varint32(stream, &value))
+ return false;
+
+ *(bool*)dest = (value != 0);
+ return true;
+}
+
+bool pb_decode_svarint(pb_istream_t *stream, pb_int64_t *dest)
+{
+ pb_uint64_t value;
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ if (value & 1)
+ *dest = (pb_int64_t)(~(value >> 1));
+ else
+ *dest = (pb_int64_t)(value >> 1);
+
+ return true;
+}
+
+bool pb_decode_fixed32(pb_istream_t *stream, void *dest)
+{
+ union {
+ uint32_t fixed32;
+ pb_byte_t bytes[4];
+ } u;
+
+ if (!pb_read(stream, u.bytes, 4))
+ return false;
+
+#if defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN && CHAR_BIT == 8
+ /* fast path - if we know that we're on little endian, assign directly */
+ *(uint32_t*)dest = u.fixed32;
+#else
+ *(uint32_t*)dest = ((uint32_t)u.bytes[0] << 0) |
+ ((uint32_t)u.bytes[1] << 8) |
+ ((uint32_t)u.bytes[2] << 16) |
+ ((uint32_t)u.bytes[3] << 24);
+#endif
+ return true;
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_fixed64(pb_istream_t *stream, void *dest)
+{
+ union {
+ uint64_t fixed64;
+ pb_byte_t bytes[8];
+ } u;
+
+ if (!pb_read(stream, u.bytes, 8))
+ return false;
+
+#if defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN && CHAR_BIT == 8
+ /* fast path - if we know that we're on little endian, assign directly */
+ *(uint64_t*)dest = u.fixed64;
+#else
+ *(uint64_t*)dest = ((uint64_t)u.bytes[0] << 0) |
+ ((uint64_t)u.bytes[1] << 8) |
+ ((uint64_t)u.bytes[2] << 16) |
+ ((uint64_t)u.bytes[3] << 24) |
+ ((uint64_t)u.bytes[4] << 32) |
+ ((uint64_t)u.bytes[5] << 40) |
+ ((uint64_t)u.bytes[6] << 48) |
+ ((uint64_t)u.bytes[7] << 56);
+#endif
+ return true;
+}
+#endif
+
+static bool checkreturn pb_dec_bool(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ return pb_decode_bool(stream, (bool*)field->pData);
+}
+
+static bool checkreturn pb_dec_varint(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_UVARINT)
+ {
+ pb_uint64_t value, clamped;
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ /* Cast to the proper field size, while checking for overflows */
+ if (field->data_size == sizeof(pb_uint64_t))
+ clamped = *(pb_uint64_t*)field->pData = value;
+ else if (field->data_size == sizeof(uint32_t))
+ clamped = *(uint32_t*)field->pData = (uint32_t)value;
+ else if (field->data_size == sizeof(uint_least16_t))
+ clamped = *(uint_least16_t*)field->pData = (uint_least16_t)value;
+ else if (field->data_size == sizeof(uint_least8_t))
+ clamped = *(uint_least8_t*)field->pData = (uint_least8_t)value;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (clamped != value)
+ PB_RETURN_ERROR(stream, "integer too large");
+
+ return true;
+ }
+ else
+ {
+ pb_uint64_t value;
+ pb_int64_t svalue;
+ pb_int64_t clamped;
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SVARINT)
+ {
+ if (!pb_decode_svarint(stream, &svalue))
+ return false;
+ }
+ else
+ {
+ if (!pb_decode_varint(stream, &value))
+ return false;
+
+ /* See issue 97: Google's C++ protobuf allows negative varint values to
+ * be cast as int32_t, instead of the int64_t that should be used when
+ * encoding. Nanopb versions before 0.2.5 had a bug in encoding. In order to
+ * not break decoding of such messages, we cast <=32 bit fields to
+ * int32_t first to get the sign correct.
+ */
+ if (field->data_size == sizeof(pb_int64_t))
+ svalue = (pb_int64_t)value;
+ else
+ svalue = (int32_t)value;
+ }
+
+ /* Cast to the proper field size, while checking for overflows */
+ if (field->data_size == sizeof(pb_int64_t))
+ clamped = *(pb_int64_t*)field->pData = svalue;
+ else if (field->data_size == sizeof(int32_t))
+ clamped = *(int32_t*)field->pData = (int32_t)svalue;
+ else if (field->data_size == sizeof(int_least16_t))
+ clamped = *(int_least16_t*)field->pData = (int_least16_t)svalue;
+ else if (field->data_size == sizeof(int_least8_t))
+ clamped = *(int_least8_t*)field->pData = (int_least8_t)svalue;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (clamped != svalue)
+ PB_RETURN_ERROR(stream, "integer too large");
+
+ return true;
+ }
+}
+
+static bool checkreturn pb_dec_bytes(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+ size_t alloc_size;
+ pb_bytes_array_t *dest;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size > PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+
+ alloc_size = PB_BYTES_ARRAY_T_ALLOCSIZE(size);
+ if (size > alloc_size)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+#ifndef PB_ENABLE_MALLOC
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ if (stream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+ if (!allocate_field(stream, field->pData, alloc_size, 1))
+ return false;
+ dest = *(pb_bytes_array_t**)field->pData;
+#endif
+ }
+ else
+ {
+ if (alloc_size > field->data_size)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+ dest = (pb_bytes_array_t*)field->pData;
+ }
+
+ dest->size = (pb_size_t)size;
+ return pb_read(stream, dest->bytes, (size_t)size);
+}
+
+static bool checkreturn pb_dec_string(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+ size_t alloc_size;
+ pb_byte_t *dest = (pb_byte_t*)field->pData;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size == (uint32_t)-1)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ /* Space for null terminator */
+ alloc_size = (size_t)(size + 1);
+
+ if (alloc_size < size)
+ PB_RETURN_ERROR(stream, "size too large");
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+#ifndef PB_ENABLE_MALLOC
+ PB_RETURN_ERROR(stream, "no malloc support");
+#else
+ if (stream->bytes_left < size)
+ PB_RETURN_ERROR(stream, "end-of-stream");
+
+ if (!allocate_field(stream, field->pData, alloc_size, 1))
+ return false;
+ dest = *(pb_byte_t**)field->pData;
+#endif
+ }
+ else
+ {
+ if (alloc_size > field->data_size)
+ PB_RETURN_ERROR(stream, "string overflow");
+ }
+
+ dest[size] = 0;
+
+ if (!pb_read(stream, dest, (size_t)size))
+ return false;
+
+#ifdef PB_VALIDATE_UTF8
+ if (!pb_validate_utf8((const char*)dest))
+ PB_RETURN_ERROR(stream, "invalid utf8");
+#endif
+
+ return true;
+}
+
+static bool checkreturn pb_dec_submessage(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ bool status = true;
+ bool submsg_consumed = false;
+ pb_istream_t substream;
+
+ if (!pb_make_string_substream(stream, &substream))
+ return false;
+
+ if (field->submsg_desc == NULL)
+ PB_RETURN_ERROR(stream, "invalid field descriptor");
+
+ /* Submessages can have a separate message-level callback that is called
+ * before decoding the message. Typically it is used to set callback fields
+ * inside oneofs. */
+ if (PB_LTYPE(field->type) == PB_LTYPE_SUBMSG_W_CB && field->pSize != NULL)
+ {
+ /* Message callback is stored right before pSize. */
+ pb_callback_t *callback = (pb_callback_t*)field->pSize - 1;
+ if (callback->funcs.decode)
+ {
+ status = callback->funcs.decode(&substream, field, &callback->arg);
+
+ if (substream.bytes_left == 0)
+ {
+ submsg_consumed = true;
+ }
+ }
+ }
+
+ /* Now decode the submessage contents */
+ if (status && !submsg_consumed)
+ {
+ unsigned int flags = 0;
+
+ /* Static required/optional fields are already initialized by top-level
+ * pb_decode(), no need to initialize them again. */
+ if (PB_ATYPE(field->type) == PB_ATYPE_STATIC &&
+ PB_HTYPE(field->type) != PB_HTYPE_REPEATED)
+ {
+ flags = PB_DECODE_NOINIT;
+ }
+
+ status = pb_decode_inner(&substream, field->submsg_desc, field->pData, flags);
+ }
+
+ if (!pb_close_string_substream(stream, &substream))
+ return false;
+
+ return status;
+}
+
+static bool checkreturn pb_dec_fixed_length_bytes(pb_istream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t size;
+
+ if (!pb_decode_varint32(stream, &size))
+ return false;
+
+ if (size > PB_SIZE_MAX)
+ PB_RETURN_ERROR(stream, "bytes overflow");
+
+ if (size == 0)
+ {
+ /* As a special case, treat empty bytes string as all zeros for fixed_length_bytes. */
+ memset(field->pData, 0, (size_t)field->data_size);
+ return true;
+ }
+
+ if (size != field->data_size)
+ PB_RETURN_ERROR(stream, "incorrect fixed length bytes size");
+
+ return pb_read(stream, (pb_byte_t*)field->pData, (size_t)field->data_size);
+}
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+bool pb_decode_double_as_float(pb_istream_t *stream, float *dest)
+{
+ uint_least8_t sign;
+ int exponent;
+ uint32_t mantissa;
+ uint64_t value;
+ union { float f; uint32_t i; } out;
+
+ if (!pb_decode_fixed64(stream, &value))
+ return false;
+
+ /* Decompose input value */
+ sign = (uint_least8_t)((value >> 63) & 1);
+ exponent = (int)((value >> 52) & 0x7FF) - 1023;
+ mantissa = (value >> 28) & 0xFFFFFF; /* Highest 24 bits */
+
+ /* Figure if value is in range representable by floats. */
+ if (exponent == 1024)
+ {
+ /* Special value */
+ exponent = 128;
+ mantissa >>= 1;
+ }
+ else
+ {
+ if (exponent > 127)
+ {
+ /* Too large, convert to infinity */
+ exponent = 128;
+ mantissa = 0;
+ }
+ else if (exponent < -150)
+ {
+ /* Too small, convert to zero */
+ exponent = -127;
+ mantissa = 0;
+ }
+ else if (exponent < -126)
+ {
+ /* Denormalized */
+ mantissa |= 0x1000000;
+ mantissa >>= (-126 - exponent);
+ exponent = -127;
+ }
+
+ /* Round off mantissa */
+ mantissa = (mantissa + 1) >> 1;
+
+ /* Check if mantissa went over 2.0 */
+ if (mantissa & 0x800000)
+ {
+ exponent += 1;
+ mantissa &= 0x7FFFFF;
+ mantissa >>= 1;
+ }
+ }
+
+ /* Combine fields */
+ out.i = mantissa;
+ out.i |= (uint32_t)(exponent + 127) << 23;
+ out.i |= (uint32_t)sign << 31;
+
+ *dest = out.f;
+ return true;
+}
+#endif
diff --git a/security/container/protos/nanopb/pb_decode.h b/security/container/protos/nanopb/pb_decode.h
new file mode 100644
index 0000000..824acd4
--- /dev/null
+++ b/security/container/protos/nanopb/pb_decode.h
@@ -0,0 +1,199 @@
+/* pb_decode.h: Functions to decode protocol buffers. Depends on pb_decode.c.
+ * The main function is pb_decode. You also need an input stream, and the
+ * field descriptions created by nanopb_generator.py.
+ */
+
+#ifndef PB_DECODE_H_INCLUDED
+#define PB_DECODE_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Structure for defining custom input streams. You will need to provide
+ * a callback function to read the bytes from your storage, which can be
+ * for example a file or a network socket.
+ *
+ * The callback must conform to these rules:
+ *
+ * 1) Return false on IO errors. This will cause decoding to abort.
+ * 2) You can use state to store your own data (e.g. buffer pointer),
+ * and rely on pb_read to verify that no-body reads past bytes_left.
+ * 3) Your callback may be used with substreams, in which case bytes_left
+ * is different than from the main stream. Don't use bytes_left to compute
+ * any pointers.
+ */
+struct pb_istream_s
+{
+#ifdef PB_BUFFER_ONLY
+ /* Callback pointer is not used in buffer-only configuration.
+ * Having an int pointer here allows binary compatibility but
+ * gives an error if someone tries to assign callback function.
+ */
+ int *callback;
+#else
+ bool (*callback)(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+#endif
+
+ void *state; /* Free field for use by callback implementation */
+ size_t bytes_left;
+
+#ifndef PB_NO_ERRMSG
+ const char *errmsg;
+#endif
+};
+
+#ifndef PB_NO_ERRMSG
+#define PB_ISTREAM_EMPTY {0,0,0,0}
+#else
+#define PB_ISTREAM_EMPTY {0,0,0}
+#endif
+
+/***************************
+ * Main decoding functions *
+ ***************************/
+
+/* Decode a single protocol buffers message from input stream into a C structure.
+ * Returns true on success, false on any failure.
+ * The actual struct pointed to by dest must match the description in fields.
+ * Callback fields of the destination structure must be initialized by caller.
+ * All other fields will be initialized by this function.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * uint8_t buffer[64];
+ * pb_istream_t stream;
+ *
+ * // ... read some data into buffer ...
+ *
+ * stream = pb_istream_from_buffer(buffer, count);
+ * pb_decode(&stream, MyMessage_fields, &msg);
+ */
+bool pb_decode(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct);
+
+/* Extended version of pb_decode, with several options to control
+ * the decoding process:
+ *
+ * PB_DECODE_NOINIT: Do not initialize the fields to default values.
+ * This is slightly faster if you do not need the default
+ * values and instead initialize the structure to 0 using
+ * e.g. memset(). This can also be used for merging two
+ * messages, i.e. combine already existing data with new
+ * values.
+ *
+ * PB_DECODE_DELIMITED: Input message starts with the message size as varint.
+ * Corresponds to parseDelimitedFrom() in Google's
+ * protobuf API.
+ *
+ * PB_DECODE_NULLTERMINATED: Stop reading when field tag is read as 0. This allows
+ * reading null terminated messages.
+ * NOTE: Until nanopb-0.4.0, pb_decode() also allows
+ * null-termination. This behaviour is not supported in
+ * most other protobuf implementations, so PB_DECODE_DELIMITED
+ * is a better option for compatibility.
+ *
+ * Multiple flags can be combined with bitwise or (| operator)
+ */
+#define PB_DECODE_NOINIT 0x01U
+#define PB_DECODE_DELIMITED 0x02U
+#define PB_DECODE_NULLTERMINATED 0x04U
+bool pb_decode_ex(pb_istream_t *stream, const pb_msgdesc_t *fields, void *dest_struct, unsigned int flags);
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define pb_decode_noinit(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_NOINIT)
+#define pb_decode_delimited(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_DELIMITED)
+#define pb_decode_delimited_noinit(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_DELIMITED | PB_DECODE_NOINIT)
+#define pb_decode_nullterminated(s,f,d) pb_decode_ex(s,f,d, PB_DECODE_NULLTERMINATED)
+
+#ifdef PB_ENABLE_MALLOC
+/* Release any allocated pointer fields. If you use dynamic allocation, you should
+ * call this for any successfully decoded message when you are done with it. If
+ * pb_decode() returns with an error, the message is already released.
+ */
+void pb_release(const pb_msgdesc_t *fields, void *dest_struct);
+#else
+/* Allocation is not supported, so release is no-op */
+#define pb_release(fields, dest_struct) PB_UNUSED(fields); PB_UNUSED(dest_struct);
+#endif
+
+
+/**************************************
+ * Functions for manipulating streams *
+ **************************************/
+
+/* Create an input stream for reading from a memory buffer.
+ *
+ * msglen should be the actual length of the message, not the full size of
+ * allocated buffer.
+ *
+ * Alternatively, you can use a custom stream that reads directly from e.g.
+ * a file or a network socket.
+ */
+pb_istream_t pb_istream_from_buffer(const pb_byte_t *buf, size_t msglen);
+
+/* Function to read from a pb_istream_t. You can use this if you need to
+ * read some custom header data, or to read data in field callbacks.
+ */
+bool pb_read(pb_istream_t *stream, pb_byte_t *buf, size_t count);
+
+
+/************************************************
+ * Helper functions for writing field callbacks *
+ ************************************************/
+
+/* Decode the tag for the next field in the stream. Gives the wire type and
+ * field tag. At end of the message, returns false and sets eof to true. */
+bool pb_decode_tag(pb_istream_t *stream, pb_wire_type_t *wire_type, uint32_t *tag, bool *eof);
+
+/* Skip the field payload data, given the wire type. */
+bool pb_skip_field(pb_istream_t *stream, pb_wire_type_t wire_type);
+
+/* Decode an integer in the varint format. This works for enum, int32,
+ * int64, uint32 and uint64 field types. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_varint(pb_istream_t *stream, uint64_t *dest);
+#else
+#define pb_decode_varint pb_decode_varint32
+#endif
+
+/* Decode an integer in the varint format. This works for enum, int32,
+ * and uint32 field types. */
+bool pb_decode_varint32(pb_istream_t *stream, uint32_t *dest);
+
+/* Decode a bool value in varint format. */
+bool pb_decode_bool(pb_istream_t *stream, bool *dest);
+
+/* Decode an integer in the zig-zagged svarint format. This works for sint32
+ * and sint64. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_decode_svarint(pb_istream_t *stream, int64_t *dest);
+#else
+bool pb_decode_svarint(pb_istream_t *stream, int32_t *dest);
+#endif
+
+/* Decode a fixed32, sfixed32 or float value. You need to pass a pointer to
+ * a 4-byte wide C variable. */
+bool pb_decode_fixed32(pb_istream_t *stream, void *dest);
+
+#ifndef PB_WITHOUT_64BIT
+/* Decode a fixed64, sfixed64 or double value. You need to pass a pointer to
+ * a 8-byte wide C variable. */
+bool pb_decode_fixed64(pb_istream_t *stream, void *dest);
+#endif
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Decode a double value into float variable. */
+bool pb_decode_double_as_float(pb_istream_t *stream, float *dest);
+#endif
+
+/* Make a limited-length substream for reading a PB_WT_STRING field. */
+bool pb_make_string_substream(pb_istream_t *stream, pb_istream_t *substream);
+bool pb_close_string_substream(pb_istream_t *stream, pb_istream_t *substream);
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/nanopb/pb_encode.c b/security/container/protos/nanopb/pb_encode.c
new file mode 100644
index 0000000..de716f7
--- /dev/null
+++ b/security/container/protos/nanopb/pb_encode.c
@@ -0,0 +1,987 @@
+/* pb_encode.c -- encode a protobuf using minimal resources
+ *
+ * 2011 Petteri Aimonen <jpa@kapsi.fi>
+ */
+
+#include "pb.h"
+#include "pb_encode.h"
+#include "pb_common.h"
+
+/* Use the GCC warn_unused_result attribute to check that all return values
+ * are propagated correctly. On other compilers and gcc before 3.4.0 just
+ * ignore the annotation.
+ */
+#if !defined(__GNUC__) || ( __GNUC__ < 3) || (__GNUC__ == 3 && __GNUC_MINOR__ < 4)
+ #define checkreturn
+#else
+ #define checkreturn __attribute__((warn_unused_result))
+#endif
+
+/**************************************
+ * Declarations internal to this file *
+ **************************************/
+static bool checkreturn buf_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+static bool checkreturn encode_array(pb_ostream_t *stream, pb_field_iter_t *field);
+static bool checkreturn pb_check_proto3_default_value(const pb_field_iter_t *field);
+static bool checkreturn encode_basic_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn encode_callback_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn encode_field(pb_ostream_t *stream, pb_field_iter_t *field);
+static bool checkreturn encode_extension_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn default_extension_encoder(pb_ostream_t *stream, const pb_extension_t *extension);
+static bool checkreturn pb_encode_varint_32(pb_ostream_t *stream, uint32_t low, uint32_t high);
+static bool checkreturn pb_enc_bool(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_varint(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_fixed(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_bytes(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_string(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_submessage(pb_ostream_t *stream, const pb_field_iter_t *field);
+static bool checkreturn pb_enc_fixed_length_bytes(pb_ostream_t *stream, const pb_field_iter_t *field);
+
+#ifdef PB_WITHOUT_64BIT
+#define pb_int64_t int32_t
+#define pb_uint64_t uint32_t
+#else
+#define pb_int64_t int64_t
+#define pb_uint64_t uint64_t
+#endif
+
+/*******************************
+ * pb_ostream_t implementation *
+ *******************************/
+
+static bool checkreturn buf_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count)
+{
+ size_t i;
+ pb_byte_t *dest = (pb_byte_t*)stream->state;
+ stream->state = dest + count;
+
+ for (i = 0; i < count; i++)
+ dest[i] = buf[i];
+
+ return true;
+}
+
+pb_ostream_t pb_ostream_from_buffer(pb_byte_t *buf, size_t bufsize)
+{
+ pb_ostream_t stream;
+#ifdef PB_BUFFER_ONLY
+ stream.callback = (void*)1; /* Just a marker value */
+#else
+ stream.callback = &buf_write;
+#endif
+ stream.state = buf;
+ stream.max_size = bufsize;
+ stream.bytes_written = 0;
+#ifndef PB_NO_ERRMSG
+ stream.errmsg = NULL;
+#endif
+ return stream;
+}
+
+bool checkreturn pb_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count)
+{
+ if (count > 0 && stream->callback != NULL)
+ {
+ if (stream->bytes_written + count < stream->bytes_written ||
+ stream->bytes_written + count > stream->max_size)
+ {
+ PB_RETURN_ERROR(stream, "stream full");
+ }
+
+#ifdef PB_BUFFER_ONLY
+ if (!buf_write(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#else
+ if (!stream->callback(stream, buf, count))
+ PB_RETURN_ERROR(stream, "io error");
+#endif
+ }
+
+ stream->bytes_written += count;
+ return true;
+}
+
+/*************************
+ * Encode a single field *
+ *************************/
+
+/* Read a bool value without causing undefined behavior even if the value
+ * is invalid. See issue #434 and
+ * https://stackoverflow.com/questions/27661768/weird-results-for-conditional
+ */
+static bool safe_read_bool(const void *pSize)
+{
+ const char *p = (const char *)pSize;
+ size_t i;
+ for (i = 0; i < sizeof(bool); i++)
+ {
+ if (p[i] != 0)
+ return true;
+ }
+ return false;
+}
+
+/* Encode a static array. Handles the size calculations and possible packing. */
+static bool checkreturn encode_array(pb_ostream_t *stream, pb_field_iter_t *field)
+{
+ pb_size_t i;
+ pb_size_t count;
+#ifndef PB_ENCODE_ARRAYS_UNPACKED
+ size_t size;
+#endif
+
+ count = *(pb_size_t*)field->pSize;
+
+ if (count == 0)
+ return true;
+
+ if (PB_ATYPE(field->type) != PB_ATYPE_POINTER && count > field->array_size)
+ PB_RETURN_ERROR(stream, "array max size exceeded");
+
+#ifndef PB_ENCODE_ARRAYS_UNPACKED
+ /* We always pack arrays if the datatype allows it. */
+ if (PB_LTYPE(field->type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ if (!pb_encode_tag(stream, PB_WT_STRING, field->tag))
+ return false;
+
+ /* Determine the total size of packed array. */
+ if (PB_LTYPE(field->type) == PB_LTYPE_FIXED32)
+ {
+ size = 4 * (size_t)count;
+ }
+ else if (PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ size = 8 * (size_t)count;
+ }
+ else
+ {
+ pb_ostream_t sizestream = PB_OSTREAM_SIZING;
+ void *pData_orig = field->pData;
+ for (i = 0; i < count; i++)
+ {
+ if (!pb_enc_varint(&sizestream, field))
+ PB_RETURN_ERROR(stream, PB_GET_ERROR(&sizestream));
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ field->pData = pData_orig;
+ size = sizestream.bytes_written;
+ }
+
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ if (stream->callback == NULL)
+ return pb_write(stream, NULL, size); /* Just sizing.. */
+
+ /* Write the data */
+ for (i = 0; i < count; i++)
+ {
+ if (PB_LTYPE(field->type) == PB_LTYPE_FIXED32 || PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ if (!pb_enc_fixed(stream, field))
+ return false;
+ }
+ else
+ {
+ if (!pb_enc_varint(stream, field))
+ return false;
+ }
+
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+ else /* Unpacked fields */
+#endif
+ {
+ for (i = 0; i < count; i++)
+ {
+ /* Normally the data is stored directly in the array entries, but
+ * for pointer-type string and bytes fields, the array entries are
+ * actually pointers themselves also. So we have to dereference once
+ * more to get to the actual data. */
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER &&
+ (PB_LTYPE(field->type) == PB_LTYPE_STRING ||
+ PB_LTYPE(field->type) == PB_LTYPE_BYTES))
+ {
+ bool status;
+ void *pData_orig = field->pData;
+ field->pData = *(void* const*)field->pData;
+
+ if (!field->pData)
+ {
+ /* Null pointer in array is treated as empty string / bytes */
+ status = pb_encode_tag_for_field(stream, field) &&
+ pb_encode_varint(stream, 0);
+ }
+ else
+ {
+ status = encode_basic_field(stream, field);
+ }
+
+ field->pData = pData_orig;
+
+ if (!status)
+ return false;
+ }
+ else
+ {
+ if (!encode_basic_field(stream, field))
+ return false;
+ }
+ field->pData = (char*)field->pData + field->data_size;
+ }
+ }
+
+ return true;
+}
+
+/* In proto3, all fields are optional and are only encoded if their value is "non-zero".
+ * This function implements the check for the zero value. */
+static bool checkreturn pb_check_proto3_default_value(const pb_field_iter_t *field)
+{
+ pb_type_t type = field->type;
+
+ if (PB_ATYPE(type) == PB_ATYPE_STATIC)
+ {
+ if (PB_HTYPE(type) == PB_HTYPE_REQUIRED)
+ {
+ /* Required proto2 fields inside proto3 submessage, pretty rare case */
+ return false;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_REPEATED)
+ {
+ /* Repeated fields inside proto3 submessage: present if count != 0 */
+ return *(const pb_size_t*)field->pSize == 0;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_ONEOF)
+ {
+ /* Oneof fields */
+ return *(const pb_size_t*)field->pSize == 0;
+ }
+ else if (PB_HTYPE(type) == PB_HTYPE_OPTIONAL && field->pSize != NULL)
+ {
+ /* Proto2 optional fields inside proto3 message, or proto3
+ * submessage fields. */
+ return safe_read_bool(field->pSize) == false;
+ }
+ else if (field->descriptor->default_value)
+ {
+ /* Proto3 messages do not have default values, but proto2 messages
+ * can contain optional fields without has_fields (generator option 'proto3').
+ * In this case they must always be encoded, to make sure that the
+ * non-zero default value is overwritten.
+ */
+ return false;
+ }
+
+ /* Rest is proto3 singular fields */
+ if (PB_LTYPE(type) <= PB_LTYPE_LAST_PACKABLE)
+ {
+ /* Simple integer / float fields */
+ pb_size_t i;
+ const char *p = (const char*)field->pData;
+ for (i = 0; i < field->data_size; i++)
+ {
+ if (p[i] != 0)
+ {
+ return false;
+ }
+ }
+
+ return true;
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_BYTES)
+ {
+ const pb_bytes_array_t *bytes = (const pb_bytes_array_t*)field->pData;
+ return bytes->size == 0;
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_STRING)
+ {
+ return *(const char*)field->pData == '\0';
+ }
+ else if (PB_LTYPE(type) == PB_LTYPE_FIXED_LENGTH_BYTES)
+ {
+ /* Fixed length bytes is only empty if its length is fixed
+ * as 0. Which would be pretty strange, but we can check
+ * it anyway. */
+ return field->data_size == 0;
+ }
+ else if (PB_LTYPE_IS_SUBMSG(type))
+ {
+ /* Check all fields in the submessage to find if any of them
+ * are non-zero. The comparison cannot be done byte-per-byte
+ * because the C struct may contain padding bytes that must
+ * be skipped. Note that usually proto3 submessages have
+ * a separate has_field that is checked earlier in this if.
+ */
+ pb_field_iter_t iter;
+ if (pb_field_iter_begin(&iter, field->submsg_desc, field->pData))
+ {
+ do
+ {
+ if (!pb_check_proto3_default_value(&iter))
+ {
+ return false;
+ }
+ } while (pb_field_iter_next(&iter));
+ }
+ return true;
+ }
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_POINTER)
+ {
+ return field->pData == NULL;
+ }
+ else if (PB_ATYPE(type) == PB_ATYPE_CALLBACK)
+ {
+ if (PB_LTYPE(type) == PB_LTYPE_EXTENSION)
+ {
+ const pb_extension_t *extension = *(const pb_extension_t* const *)field->pData;
+ return extension == NULL;
+ }
+ else if (field->descriptor->field_callback == pb_default_field_callback)
+ {
+ pb_callback_t *pCallback = (pb_callback_t*)field->pData;
+ return pCallback->funcs.encode == NULL;
+ }
+ else
+ {
+ return field->descriptor->field_callback == NULL;
+ }
+ }
+
+ return false; /* Not typically reached, safe default for weird special cases. */
+}
+
+/* Encode a field with static or pointer allocation, i.e. one whose data
+ * is available to the encoder directly. */
+static bool checkreturn encode_basic_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (!field->pData)
+ {
+ /* Missing pointer field */
+ return true;
+ }
+
+ if (!pb_encode_tag_for_field(stream, field))
+ return false;
+
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ return pb_enc_bool(stream, field);
+
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ return pb_enc_varint(stream, field);
+
+ case PB_LTYPE_FIXED32:
+ case PB_LTYPE_FIXED64:
+ return pb_enc_fixed(stream, field);
+
+ case PB_LTYPE_BYTES:
+ return pb_enc_bytes(stream, field);
+
+ case PB_LTYPE_STRING:
+ return pb_enc_string(stream, field);
+
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ return pb_enc_submessage(stream, field);
+
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ return pb_enc_fixed_length_bytes(stream, field);
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+}
+
+/* Encode a field with callback semantics. This means that a user function is
+ * called to provide and encode the actual data. */
+static bool checkreturn encode_callback_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (field->descriptor->field_callback != NULL)
+ {
+ if (!field->descriptor->field_callback(NULL, stream, field))
+ PB_RETURN_ERROR(stream, "callback error");
+ }
+ return true;
+}
+
+/* Encode a single field of any callback, pointer or static type. */
+static bool checkreturn encode_field(pb_ostream_t *stream, pb_field_iter_t *field)
+{
+ /* Check field presence */
+ if (PB_HTYPE(field->type) == PB_HTYPE_ONEOF)
+ {
+ if (*(const pb_size_t*)field->pSize != field->tag)
+ {
+ /* Different type oneof field */
+ return true;
+ }
+ }
+ else if (PB_HTYPE(field->type) == PB_HTYPE_OPTIONAL)
+ {
+ if (field->pSize)
+ {
+ if (safe_read_bool(field->pSize) == false)
+ {
+ /* Missing optional field */
+ return true;
+ }
+ }
+ else if (PB_ATYPE(field->type) == PB_ATYPE_STATIC)
+ {
+ /* Proto3 singular field */
+ if (pb_check_proto3_default_value(field))
+ return true;
+ }
+ }
+
+ if (!field->pData)
+ {
+ if (PB_HTYPE(field->type) == PB_HTYPE_REQUIRED)
+ PB_RETURN_ERROR(stream, "missing required field");
+
+ /* Pointer field set to NULL */
+ return true;
+ }
+
+ /* Then encode field contents */
+ if (PB_ATYPE(field->type) == PB_ATYPE_CALLBACK)
+ {
+ return encode_callback_field(stream, field);
+ }
+ else if (PB_HTYPE(field->type) == PB_HTYPE_REPEATED)
+ {
+ return encode_array(stream, field);
+ }
+ else
+ {
+ return encode_basic_field(stream, field);
+ }
+}
+
+/* Default handler for extension fields. Expects to have a pb_msgdesc_t
+ * pointer in the extension->type->arg field, pointing to a message with
+ * only one field in it. */
+static bool checkreturn default_extension_encoder(pb_ostream_t *stream, const pb_extension_t *extension)
+{
+ pb_field_iter_t iter;
+
+ if (!pb_field_iter_begin_extension_const(&iter, extension))
+ PB_RETURN_ERROR(stream, "invalid extension");
+
+ return encode_field(stream, &iter);
+}
+
+
+/* Walk through all the registered extensions and give them a chance
+ * to encode themselves. */
+static bool checkreturn encode_extension_field(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ const pb_extension_t *extension = *(const pb_extension_t* const *)field->pData;
+
+ while (extension)
+ {
+ bool status;
+ if (extension->type->encode)
+ status = extension->type->encode(stream, extension);
+ else
+ status = default_extension_encoder(stream, extension);
+
+ if (!status)
+ return false;
+
+ extension = extension->next;
+ }
+
+ return true;
+}
+
+/*********************
+ * Encode all fields *
+ *********************/
+
+bool checkreturn pb_encode(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ pb_field_iter_t iter;
+ if (!pb_field_iter_begin_const(&iter, fields, src_struct))
+ return true; /* Empty message type */
+
+ do {
+ if (PB_LTYPE(iter.type) == PB_LTYPE_EXTENSION)
+ {
+ /* Special case for the extension field placeholder */
+ if (!encode_extension_field(stream, &iter))
+ return false;
+ }
+ else
+ {
+ /* Regular field */
+ if (!encode_field(stream, &iter))
+ return false;
+ }
+ } while (pb_field_iter_next(&iter));
+
+ return true;
+}
+
+bool checkreturn pb_encode_ex(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct, unsigned int flags)
+{
+ if ((flags & PB_ENCODE_DELIMITED) != 0)
+ {
+ return pb_encode_submessage(stream, fields, src_struct);
+ }
+ else if ((flags & PB_ENCODE_NULLTERMINATED) != 0)
+ {
+ const pb_byte_t zero = 0;
+
+ if (!pb_encode(stream, fields, src_struct))
+ return false;
+
+ return pb_write(stream, &zero, 1);
+ }
+ else
+ {
+ return pb_encode(stream, fields, src_struct);
+ }
+}
+
+bool pb_get_encoded_size(size_t *size, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ pb_ostream_t stream = PB_OSTREAM_SIZING;
+
+ if (!pb_encode(&stream, fields, src_struct))
+ return false;
+
+ *size = stream.bytes_written;
+ return true;
+}
+
+/********************
+ * Helper functions *
+ ********************/
+
+/* This function avoids 64-bit shifts as they are quite slow on many platforms. */
+static bool checkreturn pb_encode_varint_32(pb_ostream_t *stream, uint32_t low, uint32_t high)
+{
+ size_t i = 0;
+ pb_byte_t buffer[10];
+ pb_byte_t byte = (pb_byte_t)(low & 0x7F);
+ low >>= 7;
+
+ while (i < 4 && (low != 0 || high != 0))
+ {
+ byte |= 0x80;
+ buffer[i++] = byte;
+ byte = (pb_byte_t)(low & 0x7F);
+ low >>= 7;
+ }
+
+ if (high)
+ {
+ byte = (pb_byte_t)(byte | ((high & 0x07) << 4));
+ high >>= 3;
+
+ while (high)
+ {
+ byte |= 0x80;
+ buffer[i++] = byte;
+ byte = (pb_byte_t)(high & 0x7F);
+ high >>= 7;
+ }
+ }
+
+ buffer[i++] = byte;
+
+ return pb_write(stream, buffer, i);
+}
+
+bool checkreturn pb_encode_varint(pb_ostream_t *stream, pb_uint64_t value)
+{
+ if (value <= 0x7F)
+ {
+ /* Fast path: single byte */
+ pb_byte_t byte = (pb_byte_t)value;
+ return pb_write(stream, &byte, 1);
+ }
+ else
+ {
+#ifdef PB_WITHOUT_64BIT
+ return pb_encode_varint_32(stream, value, 0);
+#else
+ return pb_encode_varint_32(stream, (uint32_t)value, (uint32_t)(value >> 32));
+#endif
+ }
+}
+
+bool checkreturn pb_encode_svarint(pb_ostream_t *stream, pb_int64_t value)
+{
+ pb_uint64_t zigzagged;
+ if (value < 0)
+ zigzagged = ~((pb_uint64_t)value << 1);
+ else
+ zigzagged = (pb_uint64_t)value << 1;
+
+ return pb_encode_varint(stream, zigzagged);
+}
+
+bool checkreturn pb_encode_fixed32(pb_ostream_t *stream, const void *value)
+{
+ uint32_t val = *(const uint32_t*)value;
+ pb_byte_t bytes[4];
+ bytes[0] = (pb_byte_t)(val & 0xFF);
+ bytes[1] = (pb_byte_t)((val >> 8) & 0xFF);
+ bytes[2] = (pb_byte_t)((val >> 16) & 0xFF);
+ bytes[3] = (pb_byte_t)((val >> 24) & 0xFF);
+ return pb_write(stream, bytes, 4);
+}
+
+#ifndef PB_WITHOUT_64BIT
+bool checkreturn pb_encode_fixed64(pb_ostream_t *stream, const void *value)
+{
+ uint64_t val = *(const uint64_t*)value;
+ pb_byte_t bytes[8];
+ bytes[0] = (pb_byte_t)(val & 0xFF);
+ bytes[1] = (pb_byte_t)((val >> 8) & 0xFF);
+ bytes[2] = (pb_byte_t)((val >> 16) & 0xFF);
+ bytes[3] = (pb_byte_t)((val >> 24) & 0xFF);
+ bytes[4] = (pb_byte_t)((val >> 32) & 0xFF);
+ bytes[5] = (pb_byte_t)((val >> 40) & 0xFF);
+ bytes[6] = (pb_byte_t)((val >> 48) & 0xFF);
+ bytes[7] = (pb_byte_t)((val >> 56) & 0xFF);
+ return pb_write(stream, bytes, 8);
+}
+#endif
+
+bool checkreturn pb_encode_tag(pb_ostream_t *stream, pb_wire_type_t wiretype, uint32_t field_number)
+{
+ pb_uint64_t tag = ((pb_uint64_t)field_number << 3) | wiretype;
+ return pb_encode_varint(stream, tag);
+}
+
+bool pb_encode_tag_for_field ( pb_ostream_t* stream, const pb_field_iter_t* field )
+{
+ pb_wire_type_t wiretype;
+ switch (PB_LTYPE(field->type))
+ {
+ case PB_LTYPE_BOOL:
+ case PB_LTYPE_VARINT:
+ case PB_LTYPE_UVARINT:
+ case PB_LTYPE_SVARINT:
+ wiretype = PB_WT_VARINT;
+ break;
+
+ case PB_LTYPE_FIXED32:
+ wiretype = PB_WT_32BIT;
+ break;
+
+ case PB_LTYPE_FIXED64:
+ wiretype = PB_WT_64BIT;
+ break;
+
+ case PB_LTYPE_BYTES:
+ case PB_LTYPE_STRING:
+ case PB_LTYPE_SUBMESSAGE:
+ case PB_LTYPE_SUBMSG_W_CB:
+ case PB_LTYPE_FIXED_LENGTH_BYTES:
+ wiretype = PB_WT_STRING;
+ break;
+
+ default:
+ PB_RETURN_ERROR(stream, "invalid field type");
+ }
+
+ return pb_encode_tag(stream, wiretype, field->tag);
+}
+
+bool checkreturn pb_encode_string(pb_ostream_t *stream, const pb_byte_t *buffer, size_t size)
+{
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ return pb_write(stream, buffer, size);
+}
+
+bool checkreturn pb_encode_submessage(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct)
+{
+ /* First calculate the message size using a non-writing substream. */
+ pb_ostream_t substream = PB_OSTREAM_SIZING;
+ size_t size;
+ bool status;
+
+ if (!pb_encode(&substream, fields, src_struct))
+ {
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream.errmsg;
+#endif
+ return false;
+ }
+
+ size = substream.bytes_written;
+
+ if (!pb_encode_varint(stream, (pb_uint64_t)size))
+ return false;
+
+ if (stream->callback == NULL)
+ return pb_write(stream, NULL, size); /* Just sizing */
+
+ if (stream->bytes_written + size > stream->max_size)
+ PB_RETURN_ERROR(stream, "stream full");
+
+ /* Use a substream to verify that a callback doesn't write more than
+ * what it did the first time. */
+ substream.callback = stream->callback;
+ substream.state = stream->state;
+ substream.max_size = size;
+ substream.bytes_written = 0;
+#ifndef PB_NO_ERRMSG
+ substream.errmsg = NULL;
+#endif
+
+ status = pb_encode(&substream, fields, src_struct);
+
+ stream->bytes_written += substream.bytes_written;
+ stream->state = substream.state;
+#ifndef PB_NO_ERRMSG
+ stream->errmsg = substream.errmsg;
+#endif
+
+ if (substream.bytes_written != size)
+ PB_RETURN_ERROR(stream, "submsg size changed");
+
+ return status;
+}
+
+/* Field encoders */
+
+static bool checkreturn pb_enc_bool(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ uint32_t value = safe_read_bool(field->pData) ? 1 : 0;
+ PB_UNUSED(field);
+ return pb_encode_varint(stream, value);
+}
+
+static bool checkreturn pb_enc_varint(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (PB_LTYPE(field->type) == PB_LTYPE_UVARINT)
+ {
+ /* Perform unsigned integer extension */
+ pb_uint64_t value = 0;
+
+ if (field->data_size == sizeof(uint_least8_t))
+ value = *(const uint_least8_t*)field->pData;
+ else if (field->data_size == sizeof(uint_least16_t))
+ value = *(const uint_least16_t*)field->pData;
+ else if (field->data_size == sizeof(uint32_t))
+ value = *(const uint32_t*)field->pData;
+ else if (field->data_size == sizeof(pb_uint64_t))
+ value = *(const pb_uint64_t*)field->pData;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ return pb_encode_varint(stream, value);
+ }
+ else
+ {
+ /* Perform signed integer extension */
+ pb_int64_t value = 0;
+
+ if (field->data_size == sizeof(int_least8_t))
+ value = *(const int_least8_t*)field->pData;
+ else if (field->data_size == sizeof(int_least16_t))
+ value = *(const int_least16_t*)field->pData;
+ else if (field->data_size == sizeof(int32_t))
+ value = *(const int32_t*)field->pData;
+ else if (field->data_size == sizeof(pb_int64_t))
+ value = *(const pb_int64_t*)field->pData;
+ else
+ PB_RETURN_ERROR(stream, "invalid data_size");
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SVARINT)
+ return pb_encode_svarint(stream, value);
+#ifdef PB_WITHOUT_64BIT
+ else if (value < 0)
+ return pb_encode_varint_32(stream, (uint32_t)value, (uint32_t)-1);
+#endif
+ else
+ return pb_encode_varint(stream, (pb_uint64_t)value);
+
+ }
+}
+
+static bool checkreturn pb_enc_fixed(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+ if (field->data_size == sizeof(float) && PB_LTYPE(field->type) == PB_LTYPE_FIXED64)
+ {
+ return pb_encode_float_as_double(stream, *(float*)field->pData);
+ }
+#endif
+
+ if (field->data_size == sizeof(uint32_t))
+ {
+ return pb_encode_fixed32(stream, field->pData);
+ }
+#ifndef PB_WITHOUT_64BIT
+ else if (field->data_size == sizeof(uint64_t))
+ {
+ return pb_encode_fixed64(stream, field->pData);
+ }
+#endif
+ else
+ {
+ PB_RETURN_ERROR(stream, "invalid data_size");
+ }
+}
+
+static bool checkreturn pb_enc_bytes(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ const pb_bytes_array_t *bytes = NULL;
+
+ bytes = (const pb_bytes_array_t*)field->pData;
+
+ if (bytes == NULL)
+ {
+ /* Treat null pointer as an empty bytes field */
+ return pb_encode_string(stream, NULL, 0);
+ }
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_STATIC &&
+ bytes->size > field->data_size - offsetof(pb_bytes_array_t, bytes))
+ {
+ PB_RETURN_ERROR(stream, "bytes size exceeded");
+ }
+
+ return pb_encode_string(stream, bytes->bytes, (size_t)bytes->size);
+}
+
+static bool checkreturn pb_enc_string(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ size_t size = 0;
+ size_t max_size = (size_t)field->data_size;
+ const char *str = (const char*)field->pData;
+
+ if (PB_ATYPE(field->type) == PB_ATYPE_POINTER)
+ {
+ max_size = (size_t)-1;
+ }
+ else
+ {
+ /* pb_dec_string() assumes string fields end with a null
+ * terminator when the type isn't PB_ATYPE_POINTER, so we
+ * shouldn't allow more than max-1 bytes to be written to
+ * allow space for the null terminator.
+ */
+ if (max_size == 0)
+ PB_RETURN_ERROR(stream, "zero-length string");
+
+ max_size -= 1;
+ }
+
+
+ if (str == NULL)
+ {
+ size = 0; /* Treat null pointer as an empty string */
+ }
+ else
+ {
+ const char *p = str;
+
+ /* strnlen() is not always available, so just use a loop */
+ while (size < max_size && *p != '\0')
+ {
+ size++;
+ p++;
+ }
+
+ if (*p != '\0')
+ {
+ PB_RETURN_ERROR(stream, "unterminated string");
+ }
+ }
+
+#ifdef PB_VALIDATE_UTF8
+ if (!pb_validate_utf8(str))
+ PB_RETURN_ERROR(stream, "invalid utf8");
+#endif
+
+ return pb_encode_string(stream, (const pb_byte_t*)str, size);
+}
+
+static bool checkreturn pb_enc_submessage(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ if (field->submsg_desc == NULL)
+ PB_RETURN_ERROR(stream, "invalid field descriptor");
+
+ if (PB_LTYPE(field->type) == PB_LTYPE_SUBMSG_W_CB && field->pSize != NULL)
+ {
+ /* Message callback is stored right before pSize. */
+ pb_callback_t *callback = (pb_callback_t*)field->pSize - 1;
+ if (callback->funcs.encode)
+ {
+ if (!callback->funcs.encode(stream, field, &callback->arg))
+ return false;
+ }
+ }
+
+ return pb_encode_submessage(stream, field->submsg_desc, field->pData);
+}
+
+static bool checkreturn pb_enc_fixed_length_bytes(pb_ostream_t *stream, const pb_field_iter_t *field)
+{
+ return pb_encode_string(stream, (const pb_byte_t*)field->pData, (size_t)field->data_size);
+}
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+bool pb_encode_float_as_double(pb_ostream_t *stream, float value)
+{
+ union { float f; uint32_t i; } in;
+ uint_least8_t sign;
+ int exponent;
+ uint64_t mantissa;
+
+ in.f = value;
+
+ /* Decompose input value */
+ sign = (uint_least8_t)((in.i >> 31) & 1);
+ exponent = (int)((in.i >> 23) & 0xFF) - 127;
+ mantissa = in.i & 0x7FFFFF;
+
+ if (exponent == 128)
+ {
+ /* Special value (NaN etc.) */
+ exponent = 1024;
+ }
+ else if (exponent == -127)
+ {
+ if (!mantissa)
+ {
+ /* Zero */
+ exponent = -1023;
+ }
+ else
+ {
+ /* Denormalized */
+ mantissa <<= 1;
+ while (!(mantissa & 0x800000))
+ {
+ mantissa <<= 1;
+ exponent--;
+ }
+ mantissa &= 0x7FFFFF;
+ }
+ }
+
+ /* Combine fields */
+ mantissa <<= 29;
+ mantissa |= (uint64_t)(exponent + 1023) << 52;
+ mantissa |= (uint64_t)sign << 63;
+
+ return pb_encode_fixed64(stream, &mantissa);
+}
+#endif
diff --git a/security/container/protos/nanopb/pb_encode.h b/security/container/protos/nanopb/pb_encode.h
new file mode 100644
index 0000000..9cff22a
--- /dev/null
+++ b/security/container/protos/nanopb/pb_encode.h
@@ -0,0 +1,185 @@
+/* pb_encode.h: Functions to encode protocol buffers. Depends on pb_encode.c.
+ * The main function is pb_encode. You also need an output stream, and the
+ * field descriptions created by nanopb_generator.py.
+ */
+
+#ifndef PB_ENCODE_H_INCLUDED
+#define PB_ENCODE_H_INCLUDED
+
+#include "pb.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Structure for defining custom output streams. You will need to provide
+ * a callback function to write the bytes to your storage, which can be
+ * for example a file or a network socket.
+ *
+ * The callback must conform to these rules:
+ *
+ * 1) Return false on IO errors. This will cause encoding to abort.
+ * 2) You can use state to store your own data (e.g. buffer pointer).
+ * 3) pb_write will update bytes_written after your callback runs.
+ * 4) Substreams will modify max_size and bytes_written. Don't use them
+ * to calculate any pointers.
+ */
+struct pb_ostream_s
+{
+#ifdef PB_BUFFER_ONLY
+ /* Callback pointer is not used in buffer-only configuration.
+ * Having an int pointer here allows binary compatibility but
+ * gives an error if someone tries to assign callback function.
+ * Also, NULL pointer marks a 'sizing stream' that does not
+ * write anything.
+ */
+ int *callback;
+#else
+ bool (*callback)(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+#endif
+ void *state; /* Free field for use by callback implementation. */
+ size_t max_size; /* Limit number of output bytes written (or use SIZE_MAX). */
+ size_t bytes_written; /* Number of bytes written so far. */
+
+#ifndef PB_NO_ERRMSG
+ const char *errmsg;
+#endif
+};
+
+/***************************
+ * Main encoding functions *
+ ***************************/
+
+/* Encode a single protocol buffers message from C structure into a stream.
+ * Returns true on success, false on any failure.
+ * The actual struct pointed to by src_struct must match the description in fields.
+ * All required fields in the struct are assumed to have been filled in.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * uint8_t buffer[64];
+ * pb_ostream_t stream;
+ *
+ * msg.field1 = 42;
+ * stream = pb_ostream_from_buffer(buffer, sizeof(buffer));
+ * pb_encode(&stream, MyMessage_fields, &msg);
+ */
+bool pb_encode(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
+
+/* Extended version of pb_encode, with several options to control the
+ * encoding process:
+ *
+ * PB_ENCODE_DELIMITED: Prepend the length of message as a varint.
+ * Corresponds to writeDelimitedTo() in Google's
+ * protobuf API.
+ *
+ * PB_ENCODE_NULLTERMINATED: Append a null byte to the message for termination.
+ * NOTE: This behaviour is not supported in most other
+ * protobuf implementations, so PB_ENCODE_DELIMITED
+ * is a better option for compatibility.
+ */
+#define PB_ENCODE_DELIMITED 0x02U
+#define PB_ENCODE_NULLTERMINATED 0x04U
+bool pb_encode_ex(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct, unsigned int flags);
+
+/* Defines for backwards compatibility with code written before nanopb-0.4.0 */
+#define pb_encode_delimited(s,f,d) pb_encode_ex(s,f,d, PB_ENCODE_DELIMITED)
+#define pb_encode_nullterminated(s,f,d) pb_encode_ex(s,f,d, PB_ENCODE_NULLTERMINATED)
+
+/* Encode the message to get the size of the encoded data, but do not store
+ * the data. */
+bool pb_get_encoded_size(size_t *size, const pb_msgdesc_t *fields, const void *src_struct);
+
+/**************************************
+ * Functions for manipulating streams *
+ **************************************/
+
+/* Create an output stream for writing into a memory buffer.
+ * The number of bytes written can be found in stream.bytes_written after
+ * encoding the message.
+ *
+ * Alternatively, you can use a custom stream that writes directly to e.g.
+ * a file or a network socket.
+ */
+pb_ostream_t pb_ostream_from_buffer(pb_byte_t *buf, size_t bufsize);
+
+/* Pseudo-stream for measuring the size of a message without actually storing
+ * the encoded data.
+ *
+ * Example usage:
+ * MyMessage msg = {};
+ * pb_ostream_t stream = PB_OSTREAM_SIZING;
+ * pb_encode(&stream, MyMessage_fields, &msg);
+ * printf("Message size is %d\n", stream.bytes_written);
+ */
+#ifndef PB_NO_ERRMSG
+#define PB_OSTREAM_SIZING {0,0,0,0,0}
+#else
+#define PB_OSTREAM_SIZING {0,0,0,0}
+#endif
+
+/* Function to write into a pb_ostream_t stream. You can use this if you need
+ * to append or prepend some custom headers to the message.
+ */
+bool pb_write(pb_ostream_t *stream, const pb_byte_t *buf, size_t count);
+
+
+/************************************************
+ * Helper functions for writing field callbacks *
+ ************************************************/
+
+/* Encode field header based on type and field number defined in the field
+ * structure. Call this from the callback before writing out field contents. */
+bool pb_encode_tag_for_field(pb_ostream_t *stream, const pb_field_iter_t *field);
+
+/* Encode field header by manually specifying wire type. You need to use this
+ * if you want to write out packed arrays from a callback field. */
+bool pb_encode_tag(pb_ostream_t *stream, pb_wire_type_t wiretype, uint32_t field_number);
+
+/* Encode an integer in the varint format.
+ * This works for bool, enum, int32, int64, uint32 and uint64 field types. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_encode_varint(pb_ostream_t *stream, uint64_t value);
+#else
+bool pb_encode_varint(pb_ostream_t *stream, uint32_t value);
+#endif
+
+/* Encode an integer in the zig-zagged svarint format.
+ * This works for sint32 and sint64. */
+#ifndef PB_WITHOUT_64BIT
+bool pb_encode_svarint(pb_ostream_t *stream, int64_t value);
+#else
+bool pb_encode_svarint(pb_ostream_t *stream, int32_t value);
+#endif
+
+/* Encode a string or bytes type field. For strings, pass strlen(s) as size. */
+bool pb_encode_string(pb_ostream_t *stream, const pb_byte_t *buffer, size_t size);
+
+/* Encode a fixed32, sfixed32 or float value.
+ * You need to pass a pointer to a 4-byte wide C variable. */
+bool pb_encode_fixed32(pb_ostream_t *stream, const void *value);
+
+#ifndef PB_WITHOUT_64BIT
+/* Encode a fixed64, sfixed64 or double value.
+ * You need to pass a pointer to a 8-byte wide C variable. */
+bool pb_encode_fixed64(pb_ostream_t *stream, const void *value);
+#endif
+
+#ifdef PB_CONVERT_DOUBLE_FLOAT
+/* Encode a float value so that it appears like a double in the encoded
+ * message. */
+bool pb_encode_float_as_double(pb_ostream_t *stream, float value);
+#endif
+
+/* Encode a submessage field.
+ * You need to pass the pb_field_t array and pointer to struct, just like
+ * with pb_encode(). This internally encodes the submessage twice, first to
+ * calculate message size and then to actually write it out.
+ */
+bool pb_encode_submessage(pb_ostream_t *stream, const pb_msgdesc_t *fields, const void *src_struct);
+
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif
diff --git a/security/container/protos/pbsystem.h b/security/container/protos/pbsystem.h
new file mode 100644
index 0000000..f2308f8
--- /dev/null
+++ b/security/container/protos/pbsystem.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Header and types for nanopb to work with the Linux kernel */
+#include <linux/kernel.h>
+#include <linux/string.h>
+
+/* Small types. */
+
+/* Signed. */
+typedef signed char int_least8_t;
+typedef short int int_least16_t;
+typedef int int_least32_t;
+typedef long int int_least64_t;
+
+/* Unsigned. */
+typedef unsigned char uint_least8_t;
+typedef unsigned short int uint_least16_t;
+typedef unsigned int uint_least32_t;
+typedef unsigned long int uint_least64_t;
+
+/* Fast types. */
+
+/* Signed. */
+typedef signed char int_fast8_t;
+typedef long int int_fast16_t;
+typedef long int int_fast32_t;
+typedef long int int_fast64_t;
+
+/* Unsigned. */
+typedef unsigned char uint_fast8_t;
+typedef unsigned long int uint_fast16_t;
+typedef unsigned long int uint_fast32_t;
+typedef unsigned long int uint_fast64_t;
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 026c8c9..7a244e8 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -13,7 +13,6 @@
#include <linux/fs.h>
#include <linux/xattr.h>
#include <linux/evm.h>
-#include <linux/iversion.h>
#include <linux/fsverity.h>
#include "ima.h"
@@ -246,10 +245,11 @@
struct inode *real_inode = d_real_inode(file_dentry(file));
const char *filename = file->f_path.dentry->d_name.name;
struct ima_max_digest_data hash;
+ struct kstat stat;
int result = 0;
int length;
void *tmpbuf;
- u64 i_version;
+ u64 i_version = 0;
/*
* Always collect the modsig, because IMA might have already collected
@@ -268,7 +268,10 @@
* to an initial measurement/appraisal/audit, but was modified to
* assume the file changed.
*/
- i_version = inode_query_iversion(inode);
+ result = vfs_getattr_nosec(&file->f_path, &stat, STATX_CHANGE_COOKIE,
+ AT_STATX_SYNC_AS_STAT);
+ if (!result && (stat.result_mask & STATX_CHANGE_COOKIE))
+ i_version = stat.change_cookie;
hash.hdr.algo = algo;
hash.hdr.length = hash_digest_size[algo];
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 185666d..bba421f 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -24,7 +24,6 @@
#include <linux/slab.h>
#include <linux/xattr.h>
#include <linux/ima.h>
-#include <linux/iversion.h>
#include <linux/fs.h>
#include <linux/iversion.h>
@@ -164,11 +163,16 @@
mutex_lock(&iint->mutex);
if (atomic_read(&inode->i_writecount) == 1) {
+ struct kstat stat;
+
update = test_and_clear_bit(IMA_UPDATE_XATTR,
&iint->atomic_flags);
- if (!IS_I_VERSION(inode) ||
- !inode_eq_iversion(inode, iint->version) ||
- (iint->flags & IMA_NEW_FILE)) {
+ if ((iint->flags & IMA_NEW_FILE) ||
+ vfs_getattr_nosec(&file->f_path, &stat,
+ STATX_CHANGE_COOKIE,
+ AT_STATX_SYNC_AS_STAT) ||
+ !(stat.result_mask & STATX_CHANGE_COOKIE) ||
+ stat.change_cookie != iint->version) {
iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
iint->measured_pcrs = 0;
if (update)
diff --git a/security/security.c b/security/security.c
index 5fa286a..0a94f1c 100644
--- a/security/security.c
+++ b/security/security.c
@@ -1563,6 +1563,11 @@
}
}
+void security_file_pre_free(struct file *file)
+{
+ call_void_hook(file_pre_free_security, file);
+}
+
int security_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
return call_int_hook(file_ioctl, 0, file, cmd, arg);
@@ -1679,6 +1684,11 @@
return rc;
}
+void security_task_post_alloc(struct task_struct *task)
+{
+ call_void_hook(task_post_alloc, task);
+}
+
void security_task_free(struct task_struct *task)
{
call_void_hook(task_free, task);
@@ -1903,6 +1913,11 @@
return call_int_hook(task_kill, 0, p, info, sig, cred);
}
+void security_task_exit(struct task_struct *p)
+{
+ call_void_hook(task_exit, p);
+}
+
int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
unsigned long arg4, unsigned long arg5)
{
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index aae64eb..7541dce 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -74,6 +74,7 @@
TARGETS += syscall_user_dispatch
TARGETS += sysctl
TARGETS += tc-testing
+TARGETS += tdx
TARGETS += timens
ifneq (1, $(quicktest))
TARGETS += timers
diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore
index fbcbdb6..ffcf3d9 100644
--- a/tools/testing/selftests/futex/functional/.gitignore
+++ b/tools/testing/selftests/futex/functional/.gitignore
@@ -2,6 +2,7 @@
futex_requeue_pi
futex_requeue_pi_mismatched_ops
futex_requeue_pi_signal_restart
+futex_swap
futex_wait_private_mapped_file
futex_wait_timeout
futex_wait_uninitialized_heap
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index a392d09..3381dd1 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -13,6 +13,7 @@
futex_requeue_pi \
futex_requeue_pi_signal_restart \
futex_requeue_pi_mismatched_ops \
+ futex_swap \
futex_wait_uninitialized_heap \
futex_wait_private_mapped_file \
futex_wait \
diff --git a/tools/testing/selftests/futex/functional/futex_swap.c b/tools/testing/selftests/futex/functional/futex_swap.c
new file mode 100644
index 0000000..9034d04
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/futex_swap.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <errno.h>
+#include <getopt.h>
+#include <pthread.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <time.h>
+#include "atomic.h"
+#include "futextest.h"
+
+/* The futex the main thread waits on. */
+futex_t futex_main = FUTEX_INITIALIZER;
+/* The futex the other thread wats on. */
+futex_t futex_other = FUTEX_INITIALIZER;
+
+/* The number of iterations to run (>1 => run benchmarks. */
+static int cfg_iterations = 1;
+
+/* If != 0, print diagnostic messages. */
+static int cfg_verbose;
+
+/* If == 0, do not use validation_counter. Useful for benchmarking. */
+static int cfg_validate = 1;
+
+/* How to swap threads. */
+#define SWAP_WAKE_WAIT 1
+#define SWAP_SWAP 2
+
+/* Futex values. */
+#define FUTEX_WAITING 0
+#define FUTEX_WAKEUP 1
+
+/* An atomic counter used to validate proper swapping. */
+static atomic_t validation_counter;
+
+void futex_swap_op(int mode, futex_t *futex_this, futex_t *futex_that)
+{
+ int ret;
+
+ switch (mode) {
+ case SWAP_WAKE_WAIT:
+ futex_set(futex_this, FUTEX_WAITING);
+ futex_set(futex_that, FUTEX_WAKEUP);
+ futex_wake(futex_that, 1, FUTEX_PRIVATE_FLAG);
+ futex_wait(futex_this, FUTEX_WAITING, NULL, FUTEX_PRIVATE_FLAG);
+ if (*futex_this != FUTEX_WAKEUP) {
+ fprintf(stderr, "unexpected futex_this value on wakeup\n");
+ exit(1);
+ }
+ break;
+
+ case SWAP_SWAP:
+ futex_set(futex_this, FUTEX_WAITING);
+ futex_set(futex_that, FUTEX_WAKEUP);
+ ret = futex_swap(futex_this, FUTEX_WAITING, NULL,
+ futex_that, FUTEX_PRIVATE_FLAG);
+ if (ret < 0 && errno == ENOSYS) {
+ /* futex_swap not implemented */
+ perror("futex_swap");
+ exit(1);
+ }
+ if (*futex_this != FUTEX_WAKEUP) {
+ fprintf(stderr, "unexpected futex_this value on wakeup\n");
+ exit(1);
+ }
+ break;
+
+ default:
+ fprintf(stderr, "unknown mode in %s\n", __func__);
+ exit(1);
+ }
+}
+
+void *other_thread(void *arg)
+{
+ int mode = *((int *)arg);
+ int counter;
+
+ if (cfg_verbose)
+ printf("%s started\n", __func__);
+
+ futex_wait(&futex_other, 0, NULL, FUTEX_PRIVATE_FLAG);
+
+ for (counter = 0; counter < cfg_iterations; ++counter) {
+ if (cfg_validate) {
+ int prev = 2 * counter + 1;
+
+ if (prev != atomic_cmpxchg(&validation_counter, prev,
+ prev + 1)) {
+ fprintf(stderr, "swap validation failed\n");
+ exit(1);
+ }
+ }
+ futex_swap_op(mode, &futex_other, &futex_main);
+ }
+
+ if (cfg_verbose)
+ printf("%s finished: %d iteration(s)\n", __func__, counter);
+
+ return NULL;
+}
+
+void run_test(int mode)
+{
+ struct timespec start, stop;
+ int ret, counter;
+ pthread_t thread;
+ uint64_t duration;
+
+ futex_set(&futex_other, FUTEX_WAITING);
+ atomic_set(&validation_counter, 0);
+ ret = pthread_create(&thread, NULL, &other_thread, &mode);
+ if (ret) {
+ perror("pthread_create");
+ exit(1);
+ }
+
+ ret = clock_gettime(CLOCK_MONOTONIC, &start);
+ if (ret) {
+ perror("clock_gettime");
+ exit(1);
+ }
+
+ for (counter = 0; counter < cfg_iterations; ++counter) {
+ if (cfg_validate) {
+ int prev = 2 * counter;
+
+ if (prev != atomic_cmpxchg(&validation_counter, prev,
+ prev + 1)) {
+ fprintf(stderr, "swap validation failed\n");
+ exit(1);
+ }
+ }
+ futex_swap_op(mode, &futex_main, &futex_other);
+ }
+ if (cfg_validate && validation_counter.val != 2 * cfg_iterations) {
+ fprintf(stderr, "final swap validation failed\n");
+ exit(1);
+ }
+
+ ret = clock_gettime(CLOCK_MONOTONIC, &stop);
+ if (ret) {
+ perror("clock_gettime");
+ exit(1);
+ }
+
+ duration = (stop.tv_sec - start.tv_sec) * 1000000000LL +
+ stop.tv_nsec - start.tv_nsec;
+ if (cfg_verbose || cfg_iterations > 1) {
+ printf("completed %d swap and back iterations in %lu ns: %lu ns per swap\n",
+ cfg_iterations, duration,
+ duration / (cfg_iterations * 2));
+ }
+
+ /* The remote thread is blocked; send it the final wake. */
+ futex_set(&futex_other, FUTEX_WAKEUP);
+ futex_wake(&futex_other, 1, FUTEX_PRIVATE_FLAG);
+ if (pthread_join(thread, NULL)) {
+ perror("pthread_join");
+ exit(1);
+ }
+}
+
+void usage(char *prog)
+{
+ printf("Usage: %s\n", prog);
+ printf(" -h Display this help message\n");
+ printf(" -i N Use N iterations to benchmark\n");
+ printf(" -n Do not validate swapping correctness\n");
+ printf(" -v Print diagnostic messages\n");
+}
+
+int main(int argc, char *argv[])
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "hi:nv")) != -1) {
+ switch (c) {
+ case 'h':
+ usage(basename(argv[0]));
+ exit(0);
+ case 'i':
+ cfg_iterations = atoi(optarg);
+ break;
+ case 'n':
+ cfg_validate = 0;
+ break;
+ case 'v':
+ cfg_verbose = 1;
+ break;
+ default:
+ usage(basename(argv[0]));
+ exit(1);
+ }
+ }
+
+ printf("\n\n------- running SWAP_WAKE_WAIT -----------\n\n");
+ run_test(SWAP_WAKE_WAIT);
+ printf("PASS\n");
+
+ printf("\n\n------- running SWAP_SWAP -----------\n\n");
+ run_test(SWAP_SWAP);
+ printf("PASS\n");
+
+ return 0;
+}
diff --git a/tools/testing/selftests/futex/include/futextest.h b/tools/testing/selftests/futex/include/futextest.h
index ddbcfc9..d2861fd 100644
--- a/tools/testing/selftests/futex/include/futextest.h
+++ b/tools/testing/selftests/futex/include/futextest.h
@@ -38,6 +38,9 @@
#ifndef FUTEX_CMP_REQUEUE_PI
#define FUTEX_CMP_REQUEUE_PI 12
#endif
+#ifndef GFUTEX_SWAP
+#define GFUTEX_SWAP 60
+#endif
#ifndef FUTEX_WAIT_REQUEUE_PI_PRIVATE
#define FUTEX_WAIT_REQUEUE_PI_PRIVATE (FUTEX_WAIT_REQUEUE_PI | \
FUTEX_PRIVATE_FLAG)
@@ -205,6 +208,19 @@
}
/**
+ * futex_swap() - block on uaddr and wake one task blocked on uaddr2.
+ * @uaddr: futex to block the current task on
+ * @timeout: relative timeout for the current task block
+ * @uaddr2: futex to wake tasks at (can be the same as uaddr)
+ */
+static inline int
+futex_swap(futex_t *uaddr, futex_t val, struct timespec *timeout,
+ futex_t *uaddr2, int opflags)
+{
+ return futex(uaddr, GFUTEX_SWAP, val, timeout, uaddr2, 0, opflags);
+}
+
+/**
* futex_cmpxchg() - atomic compare and exchange
* @uaddr: The address of the futex to be modified
* @oldval: The expected value of the futex
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 5da0c5e..dde9973 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -96,6 +96,7 @@
#define X86_FEATURE_XMM2 KVM_X86_CPU_FEATURE(0x1, 0, EDX, 26)
#define X86_FEATURE_FSGSBASE KVM_X86_CPU_FEATURE(0x7, 0, EBX, 0)
#define X86_FEATURE_TSC_ADJUST KVM_X86_CPU_FEATURE(0x7, 0, EBX, 1)
+#define X86_FEATURE_SGX KVM_X86_CPU_FEATURE(0x7, 0, EBX, 2)
#define X86_FEATURE_HLE KVM_X86_CPU_FEATURE(0x7, 0, EBX, 4)
#define X86_FEATURE_SMEP KVM_X86_CPU_FEATURE(0x7, 0, EBX, 7)
#define X86_FEATURE_INVPCID KVM_X86_CPU_FEATURE(0x7, 0, EBX, 10)
@@ -109,6 +110,7 @@
#define X86_FEATURE_PKU KVM_X86_CPU_FEATURE(0x7, 0, ECX, 3)
#define X86_FEATURE_LA57 KVM_X86_CPU_FEATURE(0x7, 0, ECX, 16)
#define X86_FEATURE_RDPID KVM_X86_CPU_FEATURE(0x7, 0, ECX, 22)
+#define X86_FEATURE_SGX_LC KVM_X86_CPU_FEATURE(0x7, 0, ECX, 30)
#define X86_FEATURE_SHSTK KVM_X86_CPU_FEATURE(0x7, 0, ECX, 7)
#define X86_FEATURE_IBT KVM_X86_CPU_FEATURE(0x7, 0, EDX, 20)
#define X86_FEATURE_AMX_TILE KVM_X86_CPU_FEATURE(0x7, 0, EDX, 24)
diff --git a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
index 322d561..90720b6 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_msrs_test.c
@@ -67,6 +67,52 @@
vmx_fixed1_msr_test(vcpu, MSR_IA32_VMX_VMFUNC, -1ull);
}
+static void __ia32_feature_control_msr_test(struct kvm_vcpu *vcpu,
+ uint64_t msr_bit,
+ struct kvm_x86_cpu_feature feature)
+{
+ uint64_t val;
+
+ vcpu_clear_cpuid_feature(vcpu, feature);
+
+ val = vcpu_get_msr(vcpu, MSR_IA32_FEAT_CTL);
+ vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val | msr_bit | FEAT_CTL_LOCKED);
+ vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, (val & ~msr_bit) | FEAT_CTL_LOCKED);
+ vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val | msr_bit | FEAT_CTL_LOCKED);
+ vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, (val & ~msr_bit) | FEAT_CTL_LOCKED);
+ vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, val);
+
+ if (!kvm_cpu_has(feature))
+ return;
+
+ vcpu_set_cpuid_feature(vcpu, feature);
+}
+
+static void ia32_feature_control_msr_test(struct kvm_vcpu *vcpu)
+{
+ uint64_t supported_bits = FEAT_CTL_LOCKED |
+ FEAT_CTL_VMX_ENABLED_INSIDE_SMX |
+ FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX |
+ FEAT_CTL_SGX_LC_ENABLED |
+ FEAT_CTL_SGX_ENABLED |
+ FEAT_CTL_LMCE_ENABLED;
+ int bit, r;
+
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_INSIDE_SMX, X86_FEATURE_SMX);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_INSIDE_SMX, X86_FEATURE_VMX);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX, X86_FEATURE_VMX);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_LC_ENABLED, X86_FEATURE_SGX_LC);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_LC_ENABLED, X86_FEATURE_SGX);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_SGX_ENABLED, X86_FEATURE_SGX);
+ __ia32_feature_control_msr_test(vcpu, FEAT_CTL_LMCE_ENABLED, X86_FEATURE_MCE);
+
+ for_each_clear_bit(bit, &supported_bits, 64) {
+ r = _vcpu_set_msr(vcpu, MSR_IA32_FEAT_CTL, BIT(bit));
+ TEST_ASSERT(r == 0,
+ "Setting reserved bit %d in IA32_FEATURE_CONTROL should fail", bit);
+ }
+}
+
int main(void)
{
struct kvm_vcpu *vcpu;
@@ -79,6 +125,7 @@
vm = vm_create_with_one_vcpu(&vcpu, NULL);
vmx_save_restore_msrs_test(vcpu);
+ ia32_feature_control_msr_test(vcpu);
kvm_vm_free(vm);
}
diff --git a/tools/testing/selftests/net/msg_zerocopy.sh b/tools/testing/selftests/net/msg_zerocopy.sh
index 825ffec..89c22f5 100755
--- a/tools/testing/selftests/net/msg_zerocopy.sh
+++ b/tools/testing/selftests/net/msg_zerocopy.sh
@@ -70,23 +70,22 @@
esac
# Start of state changes: install cleanup handler
-save_sysctl_mem="$(sysctl -n ${path_sysctl_mem})"
cleanup() {
ip netns del "${NS2}"
ip netns del "${NS1}"
- sysctl -w -q "${path_sysctl_mem}=${save_sysctl_mem}"
}
trap cleanup EXIT
-# Configure system settings
-sysctl -w -q "${path_sysctl_mem}=1000000"
-
# Create virtual ethernet pair between network namespaces
ip netns add "${NS1}"
ip netns add "${NS2}"
+# Configure system settings
+ip netns exec "${NS1}" sysctl -w -q "${path_sysctl_mem}=1000000"
+ip netns exec "${NS2}" sysctl -w -q "${path_sysctl_mem}=1000000"
+
ip link add "${DEV}" mtu "${DEV_MTU}" netns "${NS1}" type veth \
peer name "${DEV}" mtu "${DEV_MTU}" netns "${NS2}"
diff --git a/tools/testing/selftests/tdx/Makefile b/tools/testing/selftests/tdx/Makefile
new file mode 100644
index 0000000..8dd4351
--- /dev/null
+++ b/tools/testing/selftests/tdx/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+
+CFLAGS += -O3 -Wl,-no-as-needed -Wall -static
+
+TEST_GEN_PROGS := tdx_guest_test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/tdx/config b/tools/testing/selftests/tdx/config
new file mode 100644
index 0000000..aa1edc8
--- /dev/null
+++ b/tools/testing/selftests/tdx/config
@@ -0,0 +1 @@
+CONFIG_TDX_GUEST_DRIVER=y
diff --git a/tools/testing/selftests/tdx/tdx_guest_test.c b/tools/testing/selftests/tdx/tdx_guest_test.c
new file mode 100644
index 0000000..2a2afd8
--- /dev/null
+++ b/tools/testing/selftests/tdx/tdx_guest_test.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test TDX guest features
+ *
+ * Copyright (C) 2022 Intel Corporation.
+ *
+ * Author: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
+ */
+
+#include <sys/ioctl.h>
+
+#include <errno.h>
+#include <fcntl.h>
+
+#include "../kselftest_harness.h"
+#include "../../../../include/uapi/linux/tdx-guest.h"
+
+#define TDX_GUEST_DEVNAME "/dev/tdx_guest"
+#define HEX_DUMP_SIZE 8
+#define DEBUG 0
+
+/**
+ * struct tdreport_type - Type header of TDREPORT_STRUCT.
+ * @type: Type of the TDREPORT (0 - SGX, 81 - TDX, rest are reserved)
+ * @sub_type: Subtype of the TDREPORT (Default value is 0).
+ * @version: TDREPORT version (Default value is 0).
+ * @reserved: Added for future extension.
+ *
+ * More details can be found in TDX v1.0 module specification, sec
+ * titled "REPORTTYPE".
+ */
+struct tdreport_type {
+ __u8 type;
+ __u8 sub_type;
+ __u8 version;
+ __u8 reserved;
+};
+
+/**
+ * struct reportmac - TDX guest report data, MAC and TEE hashes.
+ * @type: TDREPORT type header.
+ * @reserved1: Reserved for future extension.
+ * @cpu_svn: CPU security version.
+ * @tee_tcb_info_hash: SHA384 hash of TEE TCB INFO.
+ * @tee_td_info_hash: SHA384 hash of TDINFO_STRUCT.
+ * @reportdata: User defined unique data passed in TDG.MR.REPORT request.
+ * @reserved2: Reserved for future extension.
+ * @mac: CPU MAC ID.
+ *
+ * It is MAC-protected and contains hashes of the remainder of the
+ * report structure along with user provided report data. More details can
+ * be found in TDX v1.0 Module specification, sec titled "REPORTMACSTRUCT"
+ */
+struct reportmac {
+ struct tdreport_type type;
+ __u8 reserved1[12];
+ __u8 cpu_svn[16];
+ __u8 tee_tcb_info_hash[48];
+ __u8 tee_td_info_hash[48];
+ __u8 reportdata[64];
+ __u8 reserved2[32];
+ __u8 mac[32];
+};
+
+/**
+ * struct td_info - TDX guest measurements and configuration.
+ * @attr: TDX Guest attributes (like debug, spet_disable, etc).
+ * @xfam: Extended features allowed mask.
+ * @mrtd: Build time measurement register.
+ * @mrconfigid: Software-defined ID for non-owner-defined configuration
+ * of the guest - e.g., run-time or OS configuration.
+ * @mrowner: Software-defined ID for the guest owner.
+ * @mrownerconfig: Software-defined ID for owner-defined configuration of
+ * the guest - e.g., specific to the workload.
+ * @rtmr: Run time measurement registers.
+ * @reserved: Added for future extension.
+ *
+ * It contains the measurements and initial configuration of the TDX guest
+ * that was locked at initialization and a set of measurement registers
+ * that are run-time extendable. More details can be found in TDX v1.0
+ * Module specification, sec titled "TDINFO_STRUCT".
+ */
+struct td_info {
+ __u8 attr[8];
+ __u64 xfam;
+ __u64 mrtd[6];
+ __u64 mrconfigid[6];
+ __u64 mrowner[6];
+ __u64 mrownerconfig[6];
+ __u64 rtmr[24];
+ __u64 reserved[14];
+};
+
+/*
+ * struct tdreport - Output of TDCALL[TDG.MR.REPORT].
+ * @reportmac: Mac protected header of size 256 bytes.
+ * @tee_tcb_info: Additional attestable elements in the TCB are not
+ * reflected in the reportmac.
+ * @reserved: Added for future extension.
+ * @tdinfo: Measurements and configuration data of size 512 bytes.
+ *
+ * More details can be found in TDX v1.0 Module specification, sec
+ * titled "TDREPORT_STRUCT".
+ */
+struct tdreport {
+ struct reportmac reportmac;
+ __u8 tee_tcb_info[239];
+ __u8 reserved[17];
+ struct td_info tdinfo;
+};
+
+static void print_array_hex(const char *title, const char *prefix_str,
+ const void *buf, int len)
+{
+ int i, j, line_len, rowsize = HEX_DUMP_SIZE;
+ const __u8 *ptr = buf;
+
+ printf("\t\t%s", title);
+
+ for (j = 0; j < len; j += rowsize) {
+ line_len = rowsize < (len - j) ? rowsize : (len - j);
+ printf("%s%.8x:", prefix_str, j);
+ for (i = 0; i < line_len; i++)
+ printf(" %.2x", ptr[j + i]);
+ printf("\n");
+ }
+
+ printf("\n");
+}
+
+TEST(verify_report)
+{
+ struct tdx_report_req req;
+ struct tdreport *tdreport;
+ int devfd, i;
+
+ devfd = open(TDX_GUEST_DEVNAME, O_RDWR | O_SYNC);
+ ASSERT_LT(0, devfd);
+
+ /* Generate sample report data */
+ for (i = 0; i < TDX_REPORTDATA_LEN; i++)
+ req.reportdata[i] = i;
+
+ /* Get TDREPORT */
+ ASSERT_EQ(0, ioctl(devfd, TDX_CMD_GET_REPORT0, &req));
+
+ if (DEBUG) {
+ print_array_hex("\n\t\tTDX report data\n", "",
+ req.reportdata, sizeof(req.reportdata));
+
+ print_array_hex("\n\t\tTDX tdreport data\n", "",
+ req.tdreport, sizeof(req.tdreport));
+ }
+
+ /* Make sure TDREPORT data includes the REPORTDATA passed */
+ tdreport = (struct tdreport *)req.tdreport;
+ ASSERT_EQ(0, memcmp(&tdreport->reportmac.reportdata[0],
+ req.reportdata, sizeof(req.reportdata)));
+
+ ASSERT_EQ(0, close(devfd));
+}
+
+TEST_HARNESS_MAIN