)]}'
{
  "commit": "413862f701596dfab09c84e125397b3b845ce69a",
  "tree": "53cec080eeda61e0467655c1f9cf949c07fc0e89",
  "parents": [
    "5ce171b75299f282dfee7f7083584c5c5cbe4f13"
  ],
  "author": {
    "name": "Mina Almasry",
    "email": "almasrymina@google.com",
    "time": "Tue Nov 04 00:05:51 2025 +0000"
  },
  "committer": {
    "name": "Kevin Berry",
    "email": "kpberry@google.com",
    "time": "Fri Nov 14 12:27:36 2025 -0800"
  },
  "message": "gve: use max allowed ring size for ZC page_pools\n\nNCCL workloads with NCCL_P2P_PXN_LEVEL\u003d2 or 1 are very slow with the\ncurrent gve devmem tcp configuration.\n\nRoot causing showed that this particular workload results in a very\nbursty pattern of devmem allocations and frees, exhausting the page_pool\nring buffer. This results in sock_devmem_dontneed taking up to 5ms to\nfree a batch of 128 netmems, as each free does not find an available\nentry in the pp-\u003ering, and goes all the way down to the (slow) gen_pool,\nand gve_alloc_buffer running into a burst of successive allocations\nwhich also don\u0027t find entries in the pp-\u003ering (not dontneed\u0027d yet,\npresumably), each allocation taking up to 100us, slowing down the napi\npoll loop.\n\nFrom there, the slowness of the napi poll loop results, I suspect,\nin the rx buffers not being processed in time, and packet drops\ndetected by tcpdump. The total sum of all this badness results in this\nworkload running at around 0.5 GB/s, when expected perf is around 12\nGB/s.\n\nThis entire behavior can be avoided by increasing the pp-\u003ering size to the\nmax allowed 32768. This makes the pp able to handle the bursty\nalloc/frees of this particular workload. AFAICT there should be no\nnegative side effect of arbitrarily increasing the pp-\u003ering size in this\nmanner for ZC configs - the memory is prealloced and pinned by the\nmemory provider anyway.\n\nTested by running AllToAll PXN\u003d2 workload. 
Before:\n\nAvg bus bandwidth    : 0.434191\n\nAfter:\n\nAvg bus bandwidth    : 12.5494\n\nNote that there is more we can do to optimize this path, such as bulk\nnetmem dontneeds, bulk netmem pp refills, and possibly taking a page\nfrom the io_uring zcrx playbook and replacing the gen_pool with a simpler\nfixed-size array based allocator, but this seems sufficient to fix these\ncritical workloads.\n\nWith thanks to Willem and Eric for helping root cause this,\n\nFixes: commit 62d7f40503bc (\"gve: support unreadable netmem\")\nReported-by: Vedant Mathur \u003cvedantmathur@google.com\u003e\nChange-Id: Ic597a356d347a2e61da214b1702090306057c44e\nSigned-off-by: Mina Almasry \u003calmasrymina@google.com\u003e\nReviewed-on: https://cos-review.googlesource.com/c/third_party/kernel/+/116948\nReviewed-by: Eric Dumazet \u003cedumazet@google.com\u003e\nMain-Branch-Verified: Cusky Presubmit Bot \u003cpresubmit@cos-infra-prod.iam.gserviceaccount.com\u003e\nTested-by: Cusky Presubmit Bot \u003cpresubmit@cos-infra-prod.iam.gserviceaccount.com\u003e\nReviewed-by: Robert Kolchmeyer \u003crkolchmeyer@google.com\u003e\nReviewed-on: https://cos-review.googlesource.com/c/third_party/kernel/+/118026\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "af5319e69293aca5b34c128ab919b76a3e2b2693",
      "old_mode": 33188,
      "old_path": "drivers/net/ethernet/google/gve/gve_buffer_mgmt_dqo.c",
      "new_id": "24f0c1a2727329270660e8771e029aa54f568134",
      "new_mode": 33188,
      "new_path": "drivers/net/ethernet/google/gve/gve_buffer_mgmt_dqo.c"
    }
  ]
}
