Don't they use different hardware for inference and training? AIUI the former is usually done on cheaper GDDR cards and the latter is done on expensive HBM cards.