Amazon EMR launches all nodes for a given cluster in the same Amazon EC2 Availability Zone. Running a cluster in a single zone improves the performance of job flows because it provides a higher data access rate. By default, Amazon EMR chooses the Availability Zone with the most available resources in which to run your cluster. However, you can specify another Availability Zone if required.
For Configure instance fleets: Select the VPC and one or more subnets where you would like to deploy your Amazon EMR cluster. We recommend choosing more than one Availability Zone. Your cluster will still be deployed in a single Availability Zone, but selecting multiple Availability Zones allows Amazon EMR to look across all of them and deploy your cluster in the one with the most EC2 Spot capacity.
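If you create the cluster through the API rather than the console, the multi-subnet selection described above might look roughly like the sketch below. It only builds the `Instances` argument that boto3's `run_job_flow` accepts; the subnet IDs, instance types, fleet names, and the helper name `build_instances_config` are placeholders I made up for illustration, so adjust them to your account.

```python
def build_instances_config(subnet_ids, primary_type="m5.xlarge",
                           core_type="m5.xlarge", core_spot_units=2):
    """Build the Instances argument for an instance-fleet EMR cluster.

    Hypothetical helper: instance types and capacities are illustrative
    defaults, not recommendations.
    """
    subnet_ids = list(subnet_ids)
    if not subnet_ids:
        raise ValueError("at least one subnet ID is required")
    return {
        # Several subnets (ideally in different Availability Zones) let
        # EMR choose one zone for the whole cluster at launch time.
        "Ec2SubnetIds": subnet_ids,
        "InstanceFleets": [
            {
                "Name": "primary",
                "InstanceFleetType": "MASTER",
                "TargetOnDemandCapacity": 1,
                "InstanceTypeConfigs": [{"InstanceType": primary_type}],
            },
            {
                "Name": "core",
                "InstanceFleetType": "CORE",
                "TargetSpotCapacity": core_spot_units,
                "InstanceTypeConfigs": [
                    {"InstanceType": core_type, "WeightedCapacity": 1}
                ],
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    }


# Placeholder subnet IDs, one per Availability Zone.
instances = build_instances_config(["subnet-aaaa1111", "subnet-bbbb2222"])
```

You would then pass it along the lines of `boto3.client("emr").run_job_flow(Name=..., ReleaseLabel=..., Instances=instances, ...)` together with the roles and applications your cluster needs.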
To answer your question on 'uptime with HBase on EMR':
Apache HBase on Amazon S3 can be recommended if your application does not require high availability for writes and can tolerate failures during writes/updates. To mitigate the impact of Amazon EMR Master node failures, of Availability Zone failures that can terminate the Apache HBase on Amazon S3 cluster, or of temporary service degradation caused by operational or transient Apache HBase RegionServer issues, we recommend that your pipeline architecture rely on a stream/messaging platform upstream of the Apache HBase on Amazon S3 cluster. We also recommend that you always use the latest Amazon EMR release so you benefit from the changes and features continuously added to Apache HBase.
https://d1.awsstatic.com/whitepapers/Migrating_to_Apache_Hbase_on_Amazon_S3_on_Amazon_EMR.pdf
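For reference, enabling HBase on S3 comes down to two configuration classifications at cluster creation. The sketch below builds that list in the shape boto3's `run_job_flow` expects for `Configurations`; the bucket path and the helper name `hbase_on_s3_configurations` are placeholders of mine, and you should check the property names against the EMR release you actually run.

```python
def hbase_on_s3_configurations(root_dir):
    """Return EMR configuration classifications for HBase on S3.

    Hypothetical helper; root_dir is the S3 location of the HBase
    root directory.
    """
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")
    return [
        {
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": root_dir},
        },
        {
            "Classification": "hbase",
            "Properties": {"hbase.emr.storageMode": "s3"},
        },
    ]


# Placeholder bucket and prefix.
configurations = hbase_on_s3_configurations("s3://my-bucket/my-hbase-rootdir")
```

Because the table data lives in S3 rather than on the cluster, a replacement cluster in another Availability Zone can be pointed at the same root directory after a failure. Writes that were in flight still depend on the upstream stream/messaging layer to be replayed.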