What resources are required to train a GPT model with 10 billion parameters on 6 petabytes of data, assuming no hyperparameter tuning? Specifically, how many GPUs would be needed, and which GPU types would you recommend?
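A rough answer can be sketched with the commonly used C ≈ 6·N·D approximation for dense transformer training FLOPs. The sketch below is a back-of-envelope estimate, not a definitive answer: the token count derived from 6 PB (assuming roughly 4 bytes per token), the A100 peak throughput, and the 40% utilization figure are all assumptions.

```python
# Back-of-envelope compute estimate using the common C ~= 6 * N * D
# approximation for dense transformer training FLOPs.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * n_params * n_tokens

def gpu_hours(total_flops: float, peak_flops_per_gpu: float, mfu: float) -> float:
    """GPU-hours needed at a given peak throughput and utilization (MFU)."""
    return total_flops / (peak_flops_per_gpu * mfu) / 3600

N = 10e9                 # 10 billion parameters
D = 6e15 / 4             # assumption: 6 PB of text at ~4 bytes/token
C = training_flops(N, D)

# Assumption: A100 peak BF16 ~312 TFLOPS, ~40% model FLOPs utilization.
hours = gpu_hours(C, 312e12, 0.40)
print(f"total training FLOPs: {C:.2e}")
print(f"estimated A100 GPU-hours: {hours:.2e}")
```

Under these assumptions the total comes to about 9e25 FLOPs, i.e. on the order of 2e8 A100 GPU-hours; dividing by a target wall-clock time gives the cluster size (e.g. a 90-day run would need roughly `hours / (90 * 24)` GPUs). The dominant factor is the token count, so the answer is very sensitive to the bytes-per-token assumption.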