Questions tagged with ML Ops with Amazon SageMaker and Kubernetes

Kubernetes is an open source system used to automate the deployment, scaling, and management of containerized applications. Kubeflow Pipelines is a workflow manager that offers an interface to manage and schedule machine learning (ML) workflows on a Kubernetes cluster. Using open source tools offers flexibility and standardization, but requires time and effort to set up infrastructure, provision notebook environments for data scientists, and stay up-to-date with the latest deep learning framework versions.


Hello AWS team! I am trying to run a suite of inference recommendation jobs leveraging NVIDIA Triton Inference Server on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well...
1 answer, 0 votes, 606 views; asked by Adrian 3 months ago
Hello, I am trying to run a suite of inference recommendation jobs on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well as AWS Inferentia machines (ml.inf2.2xlarge,...
1 answer, 0 votes, 272 views; asked by Adrian 3 months ago
Hi, how would one go about designing a serverless ML application in AWS? Currently, our project is using the [Serverless Framework](https://www.serverless.com/) and Lambda functions to accomplish...
1 answer, 0 votes, 415 views; asked by JT a year ago
I want to create a training step in a SageMaker pipeline and use a custom processor like the one below. But instead of Python code, I want to use Java code in place of [code = 'src/processing.py' ]. Is it...
1 answer, 0 votes, 389 views; asked by sashi a year ago
I am trying to build an architecture for custom anomaly AI on AWS for my startup. Please let me know whether my way of thinking is correct. 1. Data Ingestion: Ingesting the data into AWS S3 in JSON...
1 answer, 0 votes, 350 views; asked a year ago
I am calling the SageMaker model endpoint with contentType `application/octet-stream`, which is also being captured in the Data Capture logs. What would be the ideal way to transform the data such that model...
1 answer, 0 votes, 596 views; asked 2 years ago
Based on the AWS docs/examples (https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-version.html), one can create/register a model that is generated by your training pipeline. First we need to...
1 answer, 0 votes, 377 views; asked 2 years ago
Hi, I'm working on an end-to-end ML project which, for the moment, goes from training (it takes already processed train/val/test data from an S3 bucket) to deployment, passing through hyperparameter...
1 answer, 0 votes, 310 views; asked 2 years ago
I can't save the Neuron model after compiling the model into an AWS Neuron optimized TorchScript. My code: ``` import tensorflow # to workaround a protobuf version conflict issue import torch import...
1 answer, 0 votes, 427 views; asked 2 years ago
Are there any step-by-step guides/tutorials on how to implement Kubeflow with custom OIDC providers? I want to install Kubeflow in the Jakarta region with EKS, but Cognito is not available in region JKT...
2 answers, 0 votes, 559 views; asked by riza 2 years ago
Hi MLOps Gurus, I'd like to seek guidance on the situation below. This is regarding SageMaker Project creation in AWS. The use case is to take the final model (built by the DS team) from S3 and do all...
1 answer, 0 votes, 569 views; asked by Nikhil 3 years ago
I am experimenting with a SageMaker serverless endpoint (sample code below, from the AWS documentation, to create an endpoint), but I keep getting an error when the endpoint is invoked. Has anyone run into...
1 answer, 0 votes, 938 views; asked 3 years ago