When optimizing code, you probably want to work on the performance problems that can actually be fixed by editing the code.
Most of the things you can directly affect are things that happen in every test run, so best-case will include them.
Slower test runs also include events that don't happen on every run (the computer being busy with something else, for instance), so editing the code has less effect on them, and possibly no effect at all if the slowdown is completely unrelated to your code.
Maybe those other events causing slowdown should be investigated too? But usually you want to look for a way to make them happen every time before working on them.
If the thing you're timing is expected to have a constant running time, the only thing that can slow it down is external factors (e.g. OS background tasks).
Best-case over a large number of runs is the correct way to approach the ideal running time of the task in this case, as you can eventually hit a run that didn't get impeded by anything.
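As a minimal sketch of that approach (assuming Python and the standard-library timeit module; the snippet being timed is just a placeholder), you repeat the measurement many times and report the minimum:

```python
import timeit

# Placeholder snippet with a nominally constant running time.
snippet = "sum(range(10_000))"

# repeat() returns one total time per repetition; each repetition
# runs the snippet `number` times back to back.
times = timeit.repeat(snippet, repeat=20, number=1_000)

# The minimum is the repetition least disturbed by external factors
# (OS background tasks, other processes), so it is the closest
# measurement to the ideal running time of the snippet itself.
print(f"best case per call: {min(times) / 1_000:.3e} s")
```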
You should never do this. Best-case favors outliers and does not represent expected performance, which is what we care about. Just because the stars happen to align one time doesn't mean you report that run.
In practice you will hardly ever see outliers like the one you described in system A, where one run is significantly faster. What you will often see is one run being significantly slower. The cause can be things like cache misses, code being paged out, an occasional bad code path being taken, and so on (all on very different timescales). These events tend to happen only occasionally, so the reverse of your example A (seven five-second runs and one ten-second run) is far more plausible. Because such factors are usually outside your control, taking the minimum is a good approximation when optimizing a code snippet, as opposed to the whole program.
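A hedged sketch of that asymmetry (Python again; the sleep stands in for an occasional cache miss, page-in, or bad code path, and the probability is made up purely for illustration): the occasional slow run drags the mean up, while the minimum stays close to the time the code itself needs.

```python
import random
import statistics
import time

def measure_once() -> float:
    """Time one run of a snippet that is occasionally slowed down
    by an external-looking event (simulated here with a sleep)."""
    start = time.perf_counter()
    sum(range(100_000))          # the code we actually want to optimize
    if random.random() < 0.125:  # roughly one run in eight hits a slow path
        time.sleep(0.01)         # simulated cache miss / page-in / slow path
    return time.perf_counter() - start

runs = [measure_once() for _ in range(8)]
print(f"min : {min(runs):.4f} s")              # robust to the slow outlier
print(f"mean: {statistics.mean(runs):.4f} s")  # dragged up by it
```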