All Content tagged with AWS Inferentia

AWS Inferentia is designed to provide high-performance inference in the cloud, to drive down the total cost of inference, and to make it easy for developers to integrate machine learning into their business applications.

48 results
AWS | published 15 days ago | 0 votes | 853 views
Are you heading to **AWS re:Invent 2024** and looking for AWS Inferentia and Trainium sessions to take your machine learning skills to the next level?
AWS | published 4 months ago | 1 vote | 297 views
See which regions have Inferentia and Trainium instances, and find out how to generate your own list with a Python script.
Hello AWS team! I am trying to run a suite of inference recommendation jobs leveraging NVIDIA Triton Inference Server on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well...
1 answer | 0 votes | 676 views | asked 5 months ago
AWS | published 6 months ago | 3 votes | 1.2K views
Quick first steps to find out if Inferentia or Trainium is an option for you.
AWS | published 7 months ago | 1 vote | 2.1K views
Understand what service quotas are, how they apply to Inferentia and Trainium instances and endpoints, and see an example of what quotas would be appropriate for a POC.
Hi, is there more documentation/examples for *TensorFlow* on Trn1/Trn1n instances? The documentation at https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/index.html ha...
3 answers | 0 votes | 520 views | asked 7 months ago
We are using `tensorflow.neuron` to compile a TensorFlow 1.x SavedModel to run on AWS Inferentia machines on EC2. We do this by calling: `tensorflow.neuron.saved_model.compile(model_dir, compiled_model_d`...
3 answers | 0 votes | 525 views | asked 8 months ago
Currently, I host my model with `tensorflow_model_server`. Here is how I export my model:
```
model = tf.keras.models.load_model("model.hdf5")
def __decode_images(images, nch):
    o = tf.vectorized...
```
1 answer | 0 votes | 649 views | asked a year ago
I am new to the AWS Neuron SDK, and the documentation seems confusing to me. There is no direct guide on how to install the SDK and use it to compile models. The examples are outdated and the installation ...
1 answer | 0 votes | 796 views | asked a year ago
Currently, we are using Elastic Inference for inferencing on AWS ECS. We use `inference_accelerators` in `ecs.Ec2TaskDefinition` to set up elastic inference. For scaling, we are monitoring `Accelerato...
1 answer | 0 votes | 683 views | asked a year ago
I have a project where I would like to send inference requests. For this I need an API, such as AWS Lambda or a SageMaker endpoint, so that the customer can send their requests there. The inference performed ...
1 answer | 0 votes | 841 views | asked a year ago