Our problem was that failed Cassandra sessions persisted between invocations, which caused cascading failures. Adding an exception handler that clears the sessions on failure has helped with this.
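For anyone wanting to try the same approach, here is a minimal sketch of what clearing a cached session on failure could look like. The `SessionCache` class and the `connect` factory are hypothetical names, not part of the Cassandra driver; the only assumption about the session object is that it has `execute()` and `shutdown()` methods, as the Python driver's `Session` does.

```python
from typing import Any, Callable, Optional


class SessionCache:
    """Keeps one session across invocations and drops it when a call fails."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        # `connect` is a hypothetical factory, e.g. lambda: cluster.connect()
        self._connect = connect
        self._session: Optional[Any] = None

    def get(self) -> Any:
        # Create the session lazily and reuse it afterwards.
        if self._session is None:
            self._session = self._connect()
        return self._session

    def reset(self) -> None:
        # Forget the cached session so the next call reconnects.
        session, self._session = self._session, None
        if session is not None:
            try:
                session.shutdown()
            except Exception:
                pass  # the session is already broken; nothing more to do

    def execute(self, statement: str) -> Any:
        try:
            return self.get().execute(statement)
        except Exception:
            # Do not let a failed session leak into the next invocation.
            self.reset()
            raise
```

The error is re-raised after the reset, so the caller still sees the failure; only the next invocation starts with a fresh session.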
In Amazon Keyspaces, getting a ServerError usually indicates a transient service error.
https://docs.aws.amazon.com/keyspaces/latest/devguide/metrics-dimensions.html
You can set up keyspace and table metrics for Amazon Keyspaces using https://github.com/aws-samples/amazon-keyspaces-cloudwatch-cloudformation-templates
Some of the metrics include consumed and provisioned capacity per second, number of CQL requests per second, average latency per second, user errors, system errors, and current account quotas. These statistics are kept for 15 months, so you can access historical information and gain a better perspective on how your web application or service is performing.
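If you prefer to pull the system error counts directly rather than through a dashboard, the following sketch builds the parameters for a CloudWatch `get_metric_statistics` call. It assumes the `AWS/Cassandra` namespace, the `SystemErrors` metric, and the `Keyspace`, `TableName` and `Operation` dimensions; please confirm those names against the metrics page linked above. The function name and the keyspace and table values are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def system_errors_query(keyspace: str, table: str, operation: str,
                        hours: int = 1) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Cassandra",      # assumed namespace, see lead-in
        "MetricName": "SystemErrors",      # assumed metric name, see lead-in
        "Dimensions": [
            {"Name": "Keyspace", "Value": keyspace},
            {"Name": "TableName", "Value": table},
            {"Name": "Operation", "Value": operation},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 60,                      # one data point per minute
        "Statistics": ["Sum"],
    }


# Hypothetical keyspace and table names, for illustration only.
params = system_errors_query("my_keyspace", "my_table", "SELECT")
```

With boto3 installed and credentials configured, the result can be passed as `boto3.client("cloudwatch").get_metric_statistics(**params)`.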
In distributed systems it's common to see transient failures. The default policy will try the "next host"; with Amazon Keyspaces it's best to retry on the same host. Here is a sample retry policy that should help:
import logging
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED

from cassandra import ConsistencyLevel
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import RetryPolicy

logging.basicConfig(format='%(asctime)s - %(message)s',
                    datefmt='%d-%b-%y %H:%M:%S', level=logging.INFO)


class KeyspacesRetryPolicy(RetryPolicy):
    """Retry on the same host instead of moving on to the next one."""

    def __init__(self, RETRY_MAX_ATTEMPTS=3):
        self.RETRY_MAX_ATTEMPTS = RETRY_MAX_ATTEMPTS

    def _retry_or_rethrow(self, consistency, retry_num):
        # retry_num is 0 the first time a handler is called, so "<=" allows
        # up to RETRY_MAX_ATTEMPTS + 1 retries before the error is rethrown.
        if retry_num <= self.RETRY_MAX_ATTEMPTS:
            return self.RETRY, consistency
        return self.RETHROW, None

    def on_read_timeout(self, query, consistency, required_responses,
                        received_responses, data_retrieved, retry_num):
        return self._retry_or_rethrow(consistency, retry_num)

    def on_write_timeout(self, query, consistency, write_type,
                         required_responses, received_responses, retry_num):
        return self._retry_or_rethrow(consistency, retry_num)

    def on_unavailable(self, query, consistency, required_replicas,
                       alive_replicas, retry_num):
        return self._retry_or_rethrow(consistency, retry_num)

    def on_request_error(self, query, consistency, error, retry_num):
        return self._retry_or_rethrow(consistency, retry_num)


# TLS is required; sf-class2-root.crt is the root certificate file used for verification.
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations('sf-class2-root.crt')
ssl_context.verify_mode = CERT_REQUIRED

auth_provider = PlainTextAuthProvider(username='keyspace_user+', password='xxxxx')
hosts = ['cassandra.us-east-2.amazonaws.com']

profile = ExecutionProfile(
    # load_balancing_policy=WhiteListRoundRobinPolicy(['cassandra.us-east-2.amazonaws.com']),
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    retry_policy=KeyspacesRetryPolicy(RETRY_MAX_ATTEMPTS=5)
)

cluster = Cluster(
    hosts,
    ssl_context=ssl_context,
    auth_provider=auth_provider,
    port=9142,
    execution_profiles={EXEC_PROFILE_DEFAULT: profile}
)

session = cluster.connect()
r = session.execute('select * from system_schema.keyspaces')
print(r.current_rows)