I want to use an Amazon SageMaker Ground Truth custom UI template and AWS Lambda functions in a labeling job.
Resolution
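1. Create a custom UI template for the labeling job, as shown in the following example. For a semantic segmentation job, set the name attribute to crowd-semantic-segmentation. For a bounding box job, set the name attribute to boundingBox. For the full list of enhanced HTML elements for custom templates, see Crowd HTML Elements Reference.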
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <crowd-semantic-segmentation name="crowd-semantic-segmentation" src="{{ task.input.taskObject | grant_read_access }}" header="{{ task.input.header }}" labels="{{ task.input.labels | to_json | escape }}">
    <full-instructions header="Segmentation Instructions">
      <ol>
        <li>Read the task carefully and inspect the image.</li>
        <li>Read the options and review the examples provided to understand more about the labels.</li>
        <li>Choose the appropriate label that best suits the image.</li>
      </ol>
    </full-instructions>
    <short-instructions>
      <p>Use the tools to label the requested items in the image</p>
    </short-instructions>
  </crowd-semantic-segmentation>
</crowd-form>
2. Create a JSON file for the labels. Example:
{
  "labels": [
    {
      "label": "Chair"
    },
    ...
    {
      "label": "Oven"
    }
  ]
}
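The label file can also be generated from a Python list, which avoids hand-editing mistakes. The following is a minimal sketch, assuming two example label names and the local file name testLabels.json that is uploaded in step 4. build_label_config is a hypothetical helper, not part of any AWS SDK.

```python
import json


def build_label_config(label_names):
    """Return the label structure shown above for a list of label names."""
    return {"labels": [{"label": name} for name in label_names]}


# Example label names (assumption); replace with your own categories.
label_config = build_label_config(["Chair", "Oven"])

# Write the file that step 4 uploads to Amazon S3.
with open("testLabels.json", "w", encoding="utf-8") as f:
    json.dump(label_config, f, indent=2)
```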
3. Create an input manifest file for the images. Example:
{"source-ref":"s3://awsdoc-example-bucket/input_manifest/apartment-chair.jpg"}
{"source-ref":"s3://awsdoc-example-bucket/input_manifest/apartment-carpet.jpg"}
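The manifest uses one JSON object per line, each with a source-ref key. As a rough sketch, it could be produced from a list of image URIs as follows. build_manifest_lines is a hypothetical helper, and the local file name input.manifest is taken from step 4.

```python
import json


def build_manifest_lines(s3_uris):
    """Return manifest text: one JSON object per line with a "source-ref" key."""
    return "\n".join(json.dumps({"source-ref": uri}) for uri in s3_uris) + "\n"


image_uris = [
    "s3://awsdoc-example-bucket/input_manifest/apartment-chair.jpg",
    "s3://awsdoc-example-bucket/input_manifest/apartment-carpet.jpg",
]
manifest_text = build_manifest_lines(image_uris)

# Write the file that step 4 uploads to Amazon S3.
with open("input.manifest", "w", encoding="utf-8") as f:
    f.write(manifest_text)
```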
4. Upload the HTML file, the manifest file, and the JSON file to Amazon Simple Storage Service (Amazon S3). Example:
import boto3

bucket = 'awsdoc-example-bucket'
prefix = 'GroundTruthCustomUI'

# Reuse one S3 bucket resource for all uploads.
s3_bucket = boto3.resource('s3').Bucket(bucket)

# Build keys with "/" because S3 keys use forward slashes; os.path.join would use "\" on Windows.
for filename in ('customUI.html', 'input.manifest', 'testLabels.json'):
    s3_bucket.Object(f'{prefix}/{filename}').upload_file(filename)
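The labeling job in step 6 needs the S3 URIs of the uploaded files (the HTML_TEMPLATE_IN_S3, INPUT_MANIFEST_IN_S3, and LABELS_JSON_FILE_IN_S3 values). One way to derive them from the same bucket and prefix is sketched below. s3_uri is a hypothetical helper introduced only for illustration.

```python
def s3_uri(bucket, prefix, filename):
    """Build an s3:// URI; S3 keys always use forward slashes."""
    return f"s3://{bucket}/{prefix}/{filename}"


bucket = "awsdoc-example-bucket"
prefix = "GroundTruthCustomUI"

html_template_uri = s3_uri(bucket, prefix, "customUI.html")
input_manifest_uri = s3_uri(bucket, prefix, "input.manifest")
labels_json_uri = s3_uri(bucket, prefix, "testLabels.json")

print(html_template_uri)
# prints s3://awsdoc-example-bucket/GroundTruthCustomUI/customUI.html
```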
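5. Retrieve the Amazon Resource Names (ARNs) of the pre-processing and annotation consolidation Lambda functions. For example, the following are the ARNs for semantic segmentation: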
arn:aws:lambda:eu-west-1:111122223333:function:PRE-SemanticSegmentation
arn:aws:lambda:eu-west-1:111122223333:function:ACS-SemanticSegmentation
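The template in step 1 reads task.input.taskObject, task.input.header, and task.input.labels. These values come from the response of the pre-processing Lambda function. If you write your own pre-processing function instead of using the ARNs above, a minimal sketch might look like the following. The event key dataObject and the response keys taskInput and isHumanAnnotationRequired follow the Ground Truth custom workflow contract. The header text and label list are illustrative assumptions, and this is not the code of the PRE-SemanticSegmentation function.

```python
def lambda_handler(event, context):
    """Hypothetical pre-processing handler feeding the template in step 1."""
    # Ground Truth passes one manifest line per invocation under "dataObject".
    data_object = event["dataObject"]
    return {
        "taskInput": {
            # Each key becomes task.input.<key> in the Liquid template.
            "taskObject": data_object["source-ref"],
            "header": "Draw around the specified labels using the tools",  # assumption
            "labels": ["Chair", "Oven"],  # assumption: example labels
        },
        "isHumanAnnotationRequired": "true",
    }


# Local check with a sample event shaped like a manifest line.
sample_event = {
    "dataObject": {
        "source-ref": "s3://awsdoc-example-bucket/input_manifest/apartment-chair.jpg"
    }
}
response = lambda_handler(sample_event, None)
```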
6. Use an AWS SDK, such as boto3, to create the labeling job. Replace the following values in the example:
INPUT_MANIFEST_IN_S3, S3_OUTPUT_PATH, IAM_ROLE_ARN, LABELS_JSON_FILE_IN_S3, WORKTEAM_ARN, HTML_TEMPLATE_IN_S3
import boto3

client = boto3.client('sagemaker')
client.create_labeling_job(
    LabelingJobName='SemanticSeg-CustomUI',
    LabelAttributeName='output-ref',
    InputConfig={
        'DataSource': {
            'S3DataSource': {
                'ManifestS3Uri': 'INPUT_MANIFEST_IN_S3'
            }
        },
        'DataAttributes': {
            'ContentClassifiers': [
                'FreeOfPersonallyIdentifiableInformation',
            ]
        }
    },
    OutputConfig={
        'S3OutputPath': 'S3_OUTPUT_PATH'
    },
    RoleArn='IAM_ROLE_ARN',
    LabelCategoryConfigS3Uri='LABELS_JSON_FILE_IN_S3',
    StoppingConditions={
        'MaxPercentageOfInputDatasetLabeled': 100
    },
    HumanTaskConfig={
        'WorkteamArn': 'WORKTEAM_ARN',
        'UiConfig': {
            'UiTemplateS3Uri': 'HTML_TEMPLATE_IN_S3'
        },
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:eu-west-1:111122223333:function:PRE-SemanticSegmentation',
        'TaskKeywords': [
            'SemanticSegmentation',
        ],
        'TaskTitle': 'Semantic Segmentation',
        'TaskDescription': 'Draw around the specified labels using the tools',
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 3600,
        'TaskAvailabilityLifetimeInSeconds': 1800,
        'MaxConcurrentTaskCount': 1,
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:eu-west-1:111122223333:function:ACS-SemanticSegmentation'
        }
    },
    Tags=[
        {
            'Key': 'reason',
            'Value': 'CustomUI'
        }
    ]
)
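After the job is created, you might want to confirm that it started and track its progress. The following sketch wraps the SageMaker describe_labeling_job call. summarize_labeling_job is a hypothetical helper, and the job name is the one used in the example above.

```python
def summarize_labeling_job(client, job_name):
    """Return the status and label counters reported for a labeling job."""
    response = client.describe_labeling_job(LabelingJobName=job_name)
    return {
        "status": response["LabelingJobStatus"],
        "counters": response.get("LabelCounters", {}),
    }


# Usage (requires AWS credentials and an existing labeling job):
# import boto3
# client = boto3.client('sagemaker')
# print(summarize_labeling_job(client, 'SemanticSeg-CustomUI'))
```

Passing the client as a parameter keeps the helper easy to exercise with a stub object before you run it against your account.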
Related information
Semantic segmentation algorithm