)]}'
{
  "commit": "0a062883b07e91150ed9a2b606e9063643e31e86",
  "tree": "cafe62411e5fc5a3a610a4562a64d920c6c7eaf0",
  "parents": [
    "2fe7065782e903d27b8e1451964dfd7cbc465d34"
  ],
  "author": {
    "name": "Neal Cardwell",
    "email": "ncardwell@google.com",
    "time": "Fri Sep 08 13:48:53 2023 +0000"
  },
  "committer": {
    "name": "Mina Almasry",
    "email": "almasrymina@google.com",
    "time": "Wed Oct 04 17:24:59 2023 -0700"
  },
  "message": "UPSTREAM: tcp: fix delayed ACKs for MSS boundary condition\n\nThis commit fixes poor delayed ACK behavior that can cause poor TCP\nlatency in a particular boundary condition: when an application makes\na TCP socket write that is an exact multiple of the MSS size.\n\nThe problem is that there is painful boundary discontinuity in the\ncurrent delayed ACK behavior. With the current delayed ACK behavior,\nwe have:\n\n(1) If an app reads \u003e 1*MSS data, tcp_cleanup_rbuf() ACKs immediately\n    because of:\n\n     tp-\u003ercv_nxt - tp-\u003ercv_wup \u003e icsk-\u003eicsk_ack.rcv_mss ||\n\n(2) If an app reads \u003c 1*MSS data and either (a) app is not ping-pong or\n    (b) we received two packets \u003c1*MSS, then tcp_cleanup_rbuf() ACKs\n    immediately beecause of:\n\n     ((icsk-\u003eicsk_ack.pending \u0026 ICSK_ACK_PUSHED2) ||\n      ((icsk-\u003eicsk_ack.pending \u0026 ICSK_ACK_PUSHED) \u0026\u0026\n       !inet_csk_in_pingpong_mode(sk))) \u0026\u0026\n\n(3) *However*: if an app reads exactly 1*MSS of data,\n    tcp_cleanup_rbuf() does not send an immediate ACK. This is true\n    even if the app is not ping-pong and the 1*MSS of data had the PSH\n    bit set, suggesting the sending application completed an\n    application write.\n\nThus if the app is not ping-pong, we have this painful case where\n\u003e1*MSS gets an immediate ACK, and \u003c1*MSS gets an immediate ACK, but a\nwrite whose last skb is an exact multiple of 1*MSS can get a 40ms\ndelayed ACK. This means that any app that transfers data in one\ndirection and takes care to align write size or packet size with MSS\ncan suffer this problem. 
With receive zero copy making 4KB MSS values\nmore common, it is becoming more common to have application writes\nnaturally align with MSS, and more applications are likely to\nencounter this delayed ACK problem.\n\nThe fix in this commit is to refine the delayed ACK heuristics with a\nsimple check: immediately ACK a received 1*MSS skb with PSH bit set if\nthe app reads all data. Why? If an skb has a len of exactly 1*MSS and\nhas the PSH bit set then it is likely the end of an application\nwrite. So more data may not be arriving soon, and yet the data sender\nmay be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we\nset ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send\nan ACK immediately if the app reads all of the data and is not\nping-pong. Note that this logic is also executed for the case where\nlen \u003e MSS, but in that case this logic does not matter (and does not\nhurt) because tcp_cleanup_rbuf() will always ACK immediately if the\napp reads data and there is more than an MSS of unACKed data.\n\nTested: http://sponge2/97c207f1-afe5-43df-a51e-1827684fd5e9\n  Passes all tests, including a new test in this\n  accompanying new commit:\n    net-test: add delack-policy-if-not-ping-pong\nEffort: net-backports\nGoogle-Bug-Id: 296716762\nChange-Id: I66d2ab6c87120972845bbaafbe0a634c900fa522\nReviewed-on: https://cos-review.googlesource.com/c/third_party/kernel/+/57748\nReviewed-by: Brian Vazquez \u003cbrianvv@google.com\u003e\nTested-by: Mina Almasry \u003calmasrymina@google.com\u003e\nReviewed-by: Eric Dumazet \u003cedumazet@google.com\u003e\nReviewed-by: Neal Cardwell \u003cncardwell@google.com\u003e\n(cherry picked from commit aa916d99fbe0890c45f20132157226d655f1dcac)\n",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "36e15276961e2c237a277a5ea780ffdd7786d7a1",
      "old_mode": 33188,
      "old_path": "net/ipv4/tcp_input.c",
      "new_id": "58b179c1e0b69b3d514c0fada29d3eb238e4f2a8",
      "new_mode": 33188,
      "new_path": "net/ipv4/tcp_input.c"
    }
  ]
}
