All Content tagged with AWS Inferentia
AWS Inferentia is designed to provide high-performance inference in the cloud, to drive down the total cost of inference, and to make it easy for developers to integrate machine learning into their business applications.
48 results
EXPERT
published 15 days ago · 0 votes · 853 views
Are you heading to **AWS re:Invent 2024** and looking for AWS Inferentia and Trainium sessions to take your machine learning skills to the next level?
EXPERT
published 4 months ago · 1 vote · 297 views
See which Regions have these instances, and find out how to generate your own list with a Python script.
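If you want to sketch such a script yourself, a minimal approach with boto3 might look like the following. It assumes default AWS credentials are configured, and the wildcard patterns `inf*` and `trn*` are an illustrative filter rather than an official list.

```
import boto3

# List every Region where Inferentia (inf*) or Trainium (trn*)
# instance types are offered.
ec2 = boto3.client("ec2", region_name="us-east-1")
regions = [r["RegionName"] for r in ec2.describe_regions()["Regions"]]

for region in regions:
    regional = boto3.client("ec2", region_name=region)
    pages = regional.get_paginator("describe_instance_type_offerings").paginate(
        LocationType="region",
        Filters=[{"Name": "instance-type", "Values": ["inf*", "trn*"]}],
    )
    found = sorted(
        o["InstanceType"] for page in pages for o in page["InstanceTypeOfferings"]
    )
    if found:
        print(region, found)
```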
Hello AWS team!
I am trying to run a suite of inference recommendation jobs leveraging NVIDIA Triton Inference Server on a set of GPU instances (ml.g5.12xlarge, ml.g5.8xlarge, ml.g5.16xlarge) as well...
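For reference, a hedged sketch of how such a recommendation job can be started with boto3's `create_inference_recommendations_job`; the job name, role ARN, and model package ARN below are placeholders, and an "Advanced" job is assumed so the exact instance types can be pinned.

```
import boto3

sm = boto3.client("sagemaker")

# All names and ARNs are placeholders; substitute your own.
sm.create_inference_recommendations_job(
    JobName="triton-gpu-recommendation",  # hypothetical job name
    JobType="Advanced",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    InputConfig={
        "ModelPackageVersionArn": (
            "arn:aws:sagemaker:us-east-1:111122223333:model-package/triton-model/1"
        ),
        "JobDurationInSeconds": 7200,
        # Benchmark across the GPU instance types in question.
        "EndpointConfigurations": [
            {"InstanceType": "ml.g5.8xlarge"},
            {"InstanceType": "ml.g5.12xlarge"},
            {"InstanceType": "ml.g5.16xlarge"},
        ],
    },
)
```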
EXPERT
published 6 months ago · 3 votes · 1.2K views
Quick first steps to find out if Inferentia or Trainium is an option for you.
EXPERT
published 7 months ago · 1 vote · 2.1K views
Understand what service quotas are, how they apply to Inferentia and Trainium instances and endpoints, and see an example of quotas that would be appropriate for a POC.
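As a companion to that, a minimal sketch that prints the account's current EC2 quotas whose names mention Inf or Trn instances; the substring match on quota names is an assumption, so inspect the full list if nothing turns up.

```
import boto3

# Quota values are per-Region; point the client at the Region you plan to use.
sq = boto3.client("service-quotas", region_name="us-east-1")
paginator = sq.get_paginator("list_service_quotas")

for page in paginator.paginate(ServiceCode="ec2"):
    for quota in page["Quotas"]:
        name = quota["QuotaName"]
        if "Inf" in name or "Trn" in name:
            print(f'{quota["QuotaCode"]}: {name} = {quota["Value"]}')
```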
Hi,
Is there more documentation/examples for *TensorFlow* on Trn1/Trn1n instances?
Documentation at:
https://awsdocs-neuron.readthedocs-hosted.com/en/latest/frameworks/tensorflow/index.html ha...
We are using tensorflow.neuron to compile a tensorflow 1.x SavedModel to run on AWS Inferentia machines on EC2. We do this by calling:
tensorflow.neuron.saved_model.compile(model_dir, compiled_model_d...
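For context, the complete call typically looks like the sketch below; both directory names are placeholders, and the `batch_size` keyword is an assumption that should match how requests are batched in your workload.

```
import tensorflow.neuron as tfn

# Compile a TensorFlow 1.x SavedModel for Inferentia.
tfn.saved_model.compile(
    "saved_model_dir",           # input SavedModel directory (placeholder)
    "compiled_saved_model_dir",  # output directory for the compiled model (placeholder)
    batch_size=1,                # assumption; set to your serving batch size
)
```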
EXPERT
published 10 months ago · 0 votes · 2.2K views
Announcement for pre-built AWS solutions
Currently, I host my model with `tensorflow_model_server`. Here is how I export my model:
```
import tensorflow as tf

model = tf.keras.models.load_model("model.hdf5")

def __decode_images(images, nch):
    o = tf.vectorized...
```
I am new to the AWS Neuron SDK, and the documentation seems confusing to me.
There is no direct guide on how to install the SDK and use it to compile models. The examples are outdated and the installation ...
Currently, we are using Elastic Inference for inferencing on AWS ECS. We use `inference_accelerators` in `ecs.Ec2TaskDefinition` to set up elastic inference. For scaling, we are monitoring `Accelerato...
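For reference, a minimal CDK (Python) sketch of a task definition with an Elastic Inference accelerator attached, matching the `inference_accelerators` setup mentioned above; the device name and the `eia2.medium` accelerator size are illustrative.

```
from aws_cdk import App, Stack
from aws_cdk import aws_ecs as ecs

# Stack and construct names are placeholders.
app = App()
stack = Stack(app, "InferenceStack")

task_def = ecs.Ec2TaskDefinition(
    stack,
    "InferenceTask",
    inference_accelerators=[
        ecs.InferenceAccelerator(
            device_name="device_1",     # illustrative device name
            device_type="eia2.medium",  # illustrative accelerator size
        )
    ],
)
```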
I have a project where I would like to send inference requests. For this I need an API, such as AWS Lambda or a SageMaker endpoint, so that the customer can send their requests there.
The inference performed ...
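For the SageMaker endpoint route, invoking an existing real-time endpoint takes only a few lines with boto3; the endpoint name and payload shape below are placeholders for whatever your model expects.

```
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Endpoint name and input format are hypothetical; match them to your model.
response = runtime.invoke_endpoint(
    EndpointName="my-inference-endpoint",
    ContentType="application/json",
    Body=json.dumps({"inputs": [1.0, 2.0, 3.0]}),
)
print(json.loads(response["Body"].read()))
```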