What resources are required to train a GPT model with 10 billion parameters on 6 petabytes of data, assuming no hyperparameter tuning is performed? Specifically, how many GPUs would be needed, and which GPU types would you recommend?
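A rough answer can be sketched with the common C ≈ 6·N·D rule of thumb for transformer training compute. Everything below is an assumption for illustration: ~4 bytes of raw text per token, an NVIDIA A100's ~312 TFLOPS bf16 peak, 40% model FLOPs utilization, and a hypothetical 1024-GPU cluster. This is a back-of-envelope sketch, not a definitive answer.

```python
# Back-of-envelope training-cost estimate using the C ≈ 6·N·D rule of thumb.
# All constants below are assumptions, not measured values.

def training_gpu_estimate(
    params: float = 10e9,            # model size N: 10 billion parameters
    data_bytes: float = 6e15,        # 6 PB of raw text
    bytes_per_token: float = 4.0,    # assumed: ~4 bytes of text per token
    gpu_peak_flops: float = 312e12,  # assumed A100 bf16 dense peak
    utilization: float = 0.4,        # assumed model FLOPs utilization (MFU)
    num_gpus: int = 1024,            # hypothetical cluster size
):
    tokens = data_bytes / bytes_per_token
    total_flops = 6 * params * tokens                 # C ≈ 6·N·D
    effective_rate = gpu_peak_flops * utilization * num_gpus
    seconds = total_flops / effective_rate
    days = seconds / 86400
    return tokens, total_flops, days

tokens, flops, days = training_gpu_estimate()
print(f"tokens ≈ {tokens:.2e}, compute ≈ {flops:.2e} FLOPs, "
      f"≈ {days:.0f} days on 1024 GPUs")
```

Under these assumptions, 6 PB is about 1.5e15 tokens and ~9e25 FLOPs of compute, which works out to thousands of days even on 1024 GPUs; in practice, one would subsample or deduplicate the corpus rather than train on all 6 PB.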