check_ethernet: try each recovery method only once
Reboot has proven to be the most effective "recovery" method.
Waiting 15 minutes before getting to the reboot was too long.
Restructured the main() recovery loop with that in mind:
1) Execute each method only once before rebooting.
2) Drop methods which have not proven effective.
3) Look for PAUSE_FILE before trying any method.
4) Stop avoiding the cdc_ether driver. This is the
   NIC driver for Linksys USB3GIGV1 on older kernels.
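
For illustration only, the restructured loop can be sketched roughly as
below. This is not the code from the script: run_recovery() and its four
callable parameters are hypothetical stand-ins for the real helpers, and
the sketch assumes the link state is re-checked before each method.

```python
def run_recovery(methods, link_is_up, paused, reboot):
    """Try each recovery method at most once, then fall back to reboot.

    Every parameter is a hypothetical callable, not a name taken from
    check_ethernet itself. Returns a string describing how the loop ended.
    """
    for method in methods:
        # Look for the pause request before trying any method.
        if paused():
            return 'paused'
        # Stop as soon as the link is back; no need to try more methods.
        if link_is_up():
            return 'recovered'
        method()
    # The same two checks apply before the last resort.
    if paused():
        return 'paused'
    if link_is_up():
        return 'recovered'
    reboot()
    return 'rebooted'
```

Because each method appears once in the list and the loop never repeats,
the time to reach reboot is bounded by the sum of the individual methods.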
For (2), provision job logs from the bvt/cq pools were used to determine
which methods were effective for recovery. "Survivorship bias" was
considered and was in fact a primary reason some methods were dropped.
I believe the cdc_ether avoidance removed in (4) was originally added to
avoid bouncing 3G/4G/LTE modems.
Tests that depend on LTE and bring down the link to the lab network will
need to use the open(PAUSE_ETHERNET_HOOK_FILE)/flock() mechanism, which
sys_power.py also uses, to disable these checks.
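
As a rough sketch of that open()/flock() pattern (not the actual code in
sys_power.py or in this script): a test holds a lock on the pause file
while it has the link down, and the checker probes the lock without
blocking. The helper names, the exclusive lock type, and the pause_file
argument are all assumptions made for illustration.

```python
import contextlib
import fcntl


@contextlib.contextmanager
def pause_ethernet_checks(pause_file):
    """Hold an exclusive flock() on pause_file while the body runs.

    Hypothetical helper; the lock type is an assumption.
    """
    with open(pause_file, 'w') as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def checks_are_paused(pause_file):
    """Return True if some other open file currently holds the lock."""
    try:
        f = open(pause_file, 'r')
    except IOError:
        # No pause file at all means nobody asked for a pause.
        return False
    with f:
        try:
            # Non-blocking probe: fails at once if the lock is held.
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except (IOError, OSError):
            return True
        fcntl.flock(f, fcntl.LOCK_UN)
        return False
```

A test would wrap its link-down section in
"with pause_ethernet_checks(path):" where path is whatever
PAUSE_ETHERNET_HOOK_FILE resolves to. The kernel drops the lock when the
holder exits, so a crashed test cannot leave the checks paused forever.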
TEST=copy script to /tmp on DUT
$ stop recover_duts
Disconnect LAN cable. Then:
And verify the DUT tries each method and then reboots.
Commit-Ready: Grant Grundler <email@example.com>
Tested-by: Grant Grundler <firstname.lastname@example.org>
Reviewed-by: Richard Barnette <email@example.com>
Reviewed-by: Ben Chan <firstname.lastname@example.org>
Reviewed-by: Grant Grundler <email@example.com>
1 file changed