Based on the example here,
https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-triton/ensemble/sentence-transformer-trt/examples/ensemble_hf/ensemble/config.pbtxt, I am working on a configuration file for a multi-model endpoint serving a BERT-based model that takes a string as input and outputs a string. The max_batch_size and dims: [ 1 ] parameters below are not very clear to me. Is there any more information on them? From what I saw, the Triton server documentation is not very clear on this either.
name: "ensemble"
platform: "ensemble"
max_batch_size: 16
input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "finaloutput"
    data_type: TYPE_FP32
    dims: [ 384 ]
  }
]
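For reference, a minimal sketch of how these two settings are usually read. It assumes Triton's documented convention that when max_batch_size is greater than 0 the batch dimension is implicit and is not listed in dims, so the full tensor shape is [batch] + dims. The helper full_shape and the sample sentences are hypothetical and only illustrate the arithmetic; this is not code from the linked example.

```python
import numpy as np

# Values copied from the config above.
MAX_BATCH_SIZE = 16
INPUT_DIMS = [1]      # INPUT0: one string element per request
OUTPUT_DIMS = [384]   # finaloutput: 384 floats per request


def full_shape(batch, dims, max_batch_size):
    """Hypothetical helper: full tensor shape for `batch` batched requests.

    Assumes the convention that max_batch_size > 0 adds an implicit
    leading batch dimension that is not written in `dims`.
    """
    if max_batch_size == 0:
        # Batching disabled: dims is already the full shape.
        return list(dims)
    if not 1 <= batch <= max_batch_size:
        raise ValueError(f"batch must be between 1 and {max_batch_size}")
    return [batch] + list(dims)


# Two sentences sent together: string tensors are object arrays on the
# client side, reshaped so each row holds one string (dims: [ 1 ]).
sentences = ["hello world", "triton ensembles"]
input0 = np.array(sentences, dtype=np.object_).reshape(-1, 1)

print(list(input0.shape))                                        # shape of INPUT0
print(full_shape(len(sentences), OUTPUT_DIMS, MAX_BATCH_SIZE))   # shape of finaloutput
```

Under that assumption, max_batch_size: 16 caps how many requests can be grouped into one inference call, while dims: [ 1 ] describes a single request, here one string per item.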