check_ethernet: try each recovery method only once

Reboot has proven to be the most effective "recovery" method.
Taking 15 minutes to reach the reboot step was too long.

Restructured the main() recovery loop with that in mind:
1) Execute each method only once before rebooting.
2) Drop methods which have not proven effective.
3) Look for PAUSE_FILE before trying any method.
4) Stop avoiding the cdc_ether driver. This is the
   NIC driver for Linksys USB3GIGV1 on older kernels.
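
A minimal sketch of the loop shape described in (1) and (3), not the actual
check_ethernet.hook code. The function names, the two placeholder methods,
the LINK_MARKER probe and the default paths are all hypothetical. Treating
"paused" as "the pause file exists" and re-checking it before the reboot are
also assumptions made only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch; none of these names come from check_ethernet.hook.

# Assumed pause-file path, for illustration only.
PAUSE_FILE="${PAUSE_FILE:-/tmp/example_pause_ethernet_hook}"

# Placeholder connectivity probe: "up" means the marker file exists.
# A real hook would test reachability of the lab network instead.
LINK_MARKER="${LINK_MARKER:-/tmp/example_link_up}"
link_is_up() { [ -e "$LINK_MARKER" ]; }

# Assumption: the checks are paused whenever the pause file exists.
is_paused() { [ -e "$PAUSE_FILE" ]; }

# Placeholder recovery methods; real ones would reset the NIC somehow.
recover_one() { echo "method: one"; }
recover_two() { echo "method: two"; }

# Placeholder last resort; a real hook would call reboot here.
do_reboot() { echo "reboot"; }

recover() {
  # Each method appears once in the list, so each runs at most once.
  for method in recover_one recover_two; do
    # Look for the pause file before trying any method.
    if is_paused; then
      echo "paused"
      return 0
    fi
    "$method"
    if link_is_up; then
      echo "recovered"
      return 0
    fi
  done
  # Assumption: the pause file is honored before the reboot as well.
  if is_paused; then
    echo "paused"
    return 0
  fi
  do_reboot
}
```

Calling recover with no pause file and no link marker prints each placeholder
method once and then "reboot". Because nothing loops back to an earlier
method, the time to reach the reboot is bounded by the length of the list.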

For (2), provision job logs from the bvt/cq pools were used to determine which
methods were effective for recovery. "Survivorship bias" was considered
and was in fact a primary reason some methods were dropped.

I believe the cdc_ether avoidance removed in (4) was originally added to avoid
bouncing 3G/4G/LTE modems.
Tests that depend on LTE and bring down the link to the lab network will need
to use the open(PAUSE_ETHERNET_HOOK_FILE)/flock() also used by
to disable these checks.

TEST=copy script to /tmp on DUT
       $ stop recover_duts
    Disconnect LAN cable. Then:
       $ /tmp/check_ethernet.hook
    And verify the DUT tries each method and then reboots.

Change-Id: I296d5565d264442a9033d6416a8262b1865961af
Commit-Ready: Grant Grundler <>
Tested-by: Grant Grundler <>
Reviewed-by: Richard Barnette <>
Reviewed-by: Ben Chan <>
Reviewed-by: Grant Grundler <>