block: change blk_mq_add_to_batch() third argument type to bool

[ Upstream commit 9bce6b5f8987678b9c6c1fe433af6b5fe41feadc ]

Conflicts:
  drivers/block/null_blk/main.c: The end_cmd line in the patch context
  was replaced with blk_mq_end_request in 8b631f9cf0b8 (null_blk: remove
  the bio based I/O path). We don't have that refactor and cherry-picking
  it would result in many conflicts, so we leave that line as it was.
  The changed lines in the patch are still identical to the upstream
  commit. The risk here is also low because we don't compile null_blk
  into our kernel (CONFIG_BLK_DEV_NULL_BLK is not set).

Commit 1f47ed294a2b ("block: cleanup and fix batch completion adding
conditions") changed how blk_mq_add_to_batch() evaluates its third
argument, 'ioerror'. Previously, the function checked whether 'ioerror'
was zero. After that commit, it checks for negative error values, on
the presumption that values such as -EIO are passed in.

However, blk_mq_add_to_batch() callers do not pass negative error
values. Instead, they pass status codes defined in various ways:

- NVMe PCI and Apple drivers pass the NVMe status code
- virtio_blk driver passes the virtblk request header status byte
- null_blk driver passes a blk_status_t

These codes are either zero or positive, so the revised check never
treats them as errors. With the NVMe PCI driver, this caused the
blktests test case nvme/039 to fail. That test injects errors into the
NVMe driver, so positive NVMe status codes are passed to
blk_mq_add_to_batch(), which then wrongly adds the failed I/O to a
completion batch.
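
As an illustration only, the mismatch can be modelled in a few lines of
standalone C. This is a sketch, not the kernel code: the sketch_*
names and SKETCH_STATUS_FAILED are hypothetical, and only the error
check of blk_mq_add_to_batch() is modelled, not its other conditions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical positive status code; not a real NVMe constant. */
#define SKETCH_STATUS_FAILED 0x2

/* Models the check before 1f47ed294a2b: any nonzero value is an error. */
static bool sketch_zero_check_allows_batch(int ioerror)
{
	return ioerror == 0;
}

/* Models the check after 1f47ed294a2b: only negative values are errors. */
static bool sketch_signed_check_allows_batch(int ioerror)
{
	return !(ioerror < 0);
}

static void sketch_show_mismatch(void)
{
	/* Success (zero) is batched by both checks. */
	assert(sketch_zero_check_allows_batch(0));
	assert(sketch_signed_check_allows_batch(0));

	/* A positive failure code is rejected by the old check ... */
	assert(!sketch_zero_check_allows_batch(SKETCH_STATUS_FAILED));
	/* ... but slips through the sign-based check and gets batched. */
	assert(sketch_signed_check_allows_batch(SKETCH_STATUS_FAILED));

	/* Only a negative value (here -5) would have been caught. */
	assert(!sketch_signed_check_allows_batch(-5));
}
```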

To correct the ioerror check within blk_mq_add_to_batch(), make all
callers uniformly pass the argument as a boolean. Modify the callers to
check their specific status codes and pass the boolean value 'is_error'.
Also describe the arguments of blk_mq_add_to_batch() in kerneldoc.
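
The shape of the fix can be sketched in standalone C as follows. This
is a hedged model rather than the actual patch: the sketch_* functions
are hypothetical, and it assumes that a driver status of zero denotes
success, so each caller reduces its own code to a boolean before
calling the batch helper.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the helper after the change: it takes a
 * boolean and no longer interprets driver-specific status values.
 */
static bool sketch_add_to_batch(bool is_error)
{
	return !is_error;
}

/*
 * Hypothetical caller: compares its own status code against its own
 * success value (assumed to be zero here) and passes the result.
 */
static bool sketch_driver_complete(unsigned int driver_status)
{
	bool is_error = driver_status != 0;

	return sketch_add_to_batch(is_error);
}
```

With this shape, any nonzero status keeps the request out of the batch
regardless of its sign or type.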

BUG=b/431080964
TEST=validation, compiled kernel with revert and ran:
`ITERATIONS=300 KERNEL_TARBALL=$HOME/cos-117/src/third_party/kernel/v6.6/cos-kernel-6.6.97-x86_64.txz bash experimental/users/kpberry/b/431080964/repro.sh`
RELEASE_NOTE=Fixed a kernel bug which caused some NVMe disk I/O errors to be ignored, potentially resulting in dropped writes.

cos-patch: bug
Fixes: 1f47ed294a2b ("block: cleanup and fix batch completion adding conditions")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20250311104359.1767728-3-shinichiro.kawasaki@wdc.com
[axboe: fold in documentation update]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Kevin Berry <kpberry@google.com>
Change-Id: I5fb6eeece7b383c281255e554946f5ac310baeb7
Reviewed-on: https://cos-review.googlesource.com/c/third_party/kernel/+/107424
Tested-by: Cusky Presubmit Bot <presubmit@cos-infra-prod.iam.gserviceaccount.com>
Reviewed-by: Robert Kolchmeyer <rkolchmeyer@google.com>
Main-Branch-Verified: Cusky Presubmit Bot <presubmit@cos-infra-prod.iam.gserviceaccount.com>
Reviewed-by: Miri Amarilio <mirilio@google.com>
Reviewed-on: https://cos-review.googlesource.com/c/third_party/kernel/+/107737
5 files changed