Questions tagged with Amazon Redshift
The same code works in one account but not in another.
In a Step Functions workflow, a Lambda function submits a Redshift query and returns; the next step calls a Lambda that gets the query status.
The first Lambda returns successfully, but the status call errors with the message "Query does not exist".
The query in the previous step runs successfully and a statement ID is returned.
It's only the step with the status call that fails.
Other Lambdas in the same step function using the same pattern work correctly, and those steps use the same get-status Lambda as the steps that fail.
The common factor is the call to the Redshift query in the previous step: whenever that is used, the status call fails.
However:
The query executes correctly, and the same Lambda works in the same step function in another account.
Other points:
The Redshift clusters were built with a CloudFormation template and should be identical.
The entire step function (unchanged other than the account number) works correctly in another account.
The Lambdas have been manually copied and pasted from the account that works to ensure they are identical.
Does anyone have any suggestions? All the obvious checks have been done.
For completeness, the code for the Redshift query call is below; as stated above, it works in another account.
```
import boto3
import botocore.session as bc

# Client created outside the handler so it is reused across warm invocations.
ssm_client = boto3.client('ssm')

def lambda_handler(event, context):
    environment = event['environment']
    source_bucket = event['source_bucket']
    processed_bucket = event['processed_bucket']
    role = event['role']
    region = event['region']
    database_name = event['database_name']
    secret_name = event['secret_name']
    secret_arn = event['secret_arn']
    cluster_id = event['cluster_id']
    proc_name = event['proc_name']

    # Resolve the stored procedure name from SSM Parameter Store.
    ssm_redshift_proc_name = ssm_client.get_parameter(Name=proc_name, WithDecryption=True)
    redshift_proc_name = ssm_redshift_proc_name['Parameter']['Value']
    query_str = "call " + redshift_proc_name + "();"

    # Build a Redshift Data API client in the target region.
    bc_session = bc.get_session()
    session = boto3.Session(
        botocore_session=bc_session,
        region_name=region,
    )
    client_redshift = session.client("redshift-data")

    # Submit the query asynchronously; the statement id is passed on so the
    # next step can poll for its status.
    res = client_redshift.execute_statement(
        Database=database_name,
        SecretArn=secret_arn,
        Sql=query_str,
        ClusterIdentifier=cluster_id
    )

    return {
        'environment': environment,
        'source_bucket': source_bucket,
        'processed_bucket': processed_bucket,
        'role': role,
        'region': region,
        'database_name': database_name,
        'secret_name': secret_name,
        'secret_arn': secret_arn,
        'cluster_id': cluster_id,
        'statementid': res['Id']
    }
```
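For reference, a minimal sketch of what the get-status step described above might look like; it assumes the `statementid` key returned by the query Lambda and uses the Data API's `describe_statement` call. It is a sketch, not the actual Lambda from the failing account.
```
import boto3

client_redshift = boto3.client("redshift-data")

def lambda_handler(event, context):
    # Statement id produced by execute_statement in the previous step.
    statement_id = event['statementid']

    # describe_statement can only see statements submitted in the same account
    # and region, and in general by the same caller that ran them, which is one
    # possible reason a valid-looking id still raises "Query does not exist".
    res = client_redshift.describe_statement(Id=statement_id)

    return {
        'statementid': statement_id,
        # One of SUBMITTED | PICKED | STARTED | FINISHED | ABORTED | FAILED.
        'status': res['Status'],
        'error': res.get('Error', '')
    }
```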
Using the AWS documentation for granting roles on Redshift Serverless (https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-monitoring.html), it says I should run this command to grant a role:
```
grant "sys:operator" to "IAM:<correct IAM user name here>";
```
This command always results in this error.
```
ERROR: syntax error at or near ""sys:monitor"" Position: 7 [ErrorId: 1-6372a5f7-38f7a17f6443bdfc4f6b348f]
```
What am I missing? Is the documentation incorrect?
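One variant worth trying, sketched here via the Data API so it can be run without a SQL client: adding the ROLE keyword to the documented statement. This is a guess rather than a confirmed fix, and the workgroup, database, and user names are placeholders.
```
import boto3

client = boto3.client("redshift-data")

# Guess at the expected syntax: GRANT ROLE <role> TO <user>; role and user
# names below are placeholders.
sql = 'GRANT ROLE "sys:operator" TO "IAM:my_iam_user";'

res = client.execute_statement(
    WorkgroupName="my-workgroup",  # placeholder serverless workgroup
    Database="dev",
    Sql=sql,
)
print(res["Id"])
```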
Hello,
I have a very simple query of the pattern `SELECT timestamp_column FROM a_table WHERE columnID = 'column_value' ORDER BY timestamp_column DESC`.
The result set does not come back in descending order.
I've rewritten this in many ways.
Any ideas?
Thanks!
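One possibility, if `timestamp_column` is actually stored as a character type rather than a timestamp: string comparison is lexicographic, so the rows come back in character order rather than chronological order. A small illustration of the difference; the cast in the final comment is an assumption, not a confirmed diagnosis.
```
from datetime import datetime

# Why a character-typed timestamp column can defeat ORDER BY ... DESC:
# string comparison is lexicographic, not chronological.
values = ["9/1/2022", "10/5/2022", "2/14/2022"]

print(sorted(values, reverse=True))
# lexicographic: ['9/1/2022', '2/14/2022', '10/5/2022']

print(sorted(values, key=lambda v: datetime.strptime(v, "%m/%d/%Y"), reverse=True))
# chronological: ['10/5/2022', '9/1/2022', '2/14/2022']

# If that is the situation here, an explicit cast in the query is one option:
#   ORDER BY CAST(timestamp_column AS timestamp) DESC
```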
When I execute a `copy table from 's3://...' credentials 'aws_iam_role=<role_arn>' csv` in my serverless Redshift db, the operation seems to take a long time and finally fails with a connection timeout trying to reach S3.
See the error below:
```
ERROR: S3CurlException: Connection timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0 Detail: ----------------------------------------------- error: S3CurlException: Connection timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0 code: 9002 context: Listing bucket=my-bucket prefix=data/2022-11-01T00:00:00.000Z/test.csv query: 2602701 location: s3_utility.cpp:688 process: padbmaster [pid=24336] ----------------------------------------------- [ErrorId: 1-636d5c5a-0102430e2a36b8567a19097c]
```
Where should I look for more details to help me debug?
Are there additional networking settings or security group rules I should review for my Redshift workgroup?
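One place to start, sketched below with boto3 (the workgroup name is a placeholder): check whether the workgroup has enhanced VPC routing enabled and which subnets and security groups it uses, since COPY traffic reaching S3 through the VPC typically needs an S3 gateway endpoint and HTTPS egress.
```
import boto3

# Hypothetical workgroup name; substitute your own.
serverless = boto3.client("redshift-serverless")
wg = serverless.get_workgroup(workgroupName="my-workgroup")["workgroup"]

# With enhanced VPC routing enabled, COPY traffic to S3 stays inside the VPC,
# so a missing S3 gateway endpoint, or a security group / NACL that blocks
# HTTPS egress, is a common cause of the curl timeout above. This is a place
# to start looking, not a definitive diagnosis.
print("enhancedVpcRouting:", wg["enhancedVpcRouting"])
print("subnetIds:", wg["subnetIds"])
print("securityGroupIds:", wg["securityGroupIds"])
```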
Hi friends,
We need to access executed queries in our Redshift Serverless database. There is the `sys_query_history` view (https://docs.aws.amazon.com/redshift/latest/dg/SYS_QUERY_HISTORY.html), but its query_text column has a 4,000-character limit; anything longer is truncated.
Is there another way to get the full query text in Redshift Serverless?
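One option, assuming the `SYS_QUERY_TEXT` view is available in your environment: it stores query text split across multiple rows ordered by a sequence column, so concatenating the chunks rebuilds the full statement. A sketch via the Data API, with placeholder workgroup, database, and query id:
```
import boto3

client = boto3.client("redshift-data")

# Assumes the SYS_QUERY_TEXT view is available; it splits query text across
# rows keyed by (query_id, sequence). query_id 12345 is a placeholder.
sql = """
    SELECT query_id,
           LISTAGG(text) WITHIN GROUP (ORDER BY sequence) AS full_text
    FROM sys_query_text
    WHERE query_id = 12345
    GROUP BY query_id;
"""

res = client.execute_statement(
    WorkgroupName="my-workgroup",  # placeholder serverless workgroup
    Database="dev",
    Sql=sql,
)
print(res["Id"])
```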
I need to convert a timestamp into MM/DD/YY format:
2020-01-31 01:14:00.000 must be converted to 01/31/20.
The timestamp is stored in a table, and I need to display it as MM/DD/YY.
Here is what I am trying to use:
`to_date(cast(timestamp_column as varchar), 'DDMMYY')` - it returns a completely different date.
Can anyone please help me out here ASAP?
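Since the goal is a formatted display string rather than a date value, `TO_CHAR` with an 'MM/DD/YY' pattern is the usual approach. The quick local check below only confirms what that pattern produces; the column and table names in the commented SQL are taken from the question and are placeholders.
```
from datetime import datetime

# Local check of the intended output format, MM/DD/YY.
ts = datetime(2020, 1, 31, 1, 14)
print(ts.strftime("%m/%d/%y"))   # 01/31/20

# The equivalent display conversion in Redshift would be along these lines
# (TO_CHAR rather than TO_DATE, since the target is a formatted string):
#   SELECT TO_CHAR(timestamp_column, 'MM/DD/YY') FROM my_table;
```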
Hi,
I've already created a QuickSight account with a VPC connection setting to Redshift and got connected successfully.
I removed the QuickSight account and re-created an account with the Managed Microsoft Active Directory authentication method. Now I cannot connect to Redshift when creating a data source; it fails with a timeout error.
The VPC setting is the same as when without Active Directory.
- the same subnet id
- the same security group id
- DNS resolver is blank
How should I change the VPC setting (or anything else) to connect successfully to Redshift?
Thanks,
Hi Team,
Automated snapshots are only for cluster recovery; you can use them to restore your domain in the event of red cluster status or data loss. What if my OpenSearch cluster crashed or was deleted by mistake? How do I restore the automated backup of my OpenSearch domain?
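While the domain still exists, the hourly automated snapshots sit in a pre-registered repository (`cs-automated`, or `cs-automated-enc` for domains encrypted at rest) and can be restored through the snapshot API. A sketch assuming fine-grained access control with a master user (domains using IAM auth need SigV4-signed requests instead); the host, credentials, and snapshot id are placeholders.
```
import requests

# Placeholder endpoint and master-user credentials; assumes fine-grained
# access control so HTTPS basic auth works.
host = "https://my-domain.us-east-1.es.amazonaws.com"
auth = ("master_user", "master_password")

# List the automated snapshots in the pre-registered cs-automated repository.
snapshots = requests.get(f"{host}/_snapshot/cs-automated/_all", auth=auth)
print(snapshots.json())

# Restore one snapshot, excluding system indices that usually already exist.
restore = requests.post(
    f"{host}/_snapshot/cs-automated/my-snapshot-id/_restore",
    json={"indices": "-.kibana*,-.opendistro*"},
    auth=auth,
)
print(restore.json())
```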
Does the Redshift Data API support Redshift Serverless? If so, please share a few examples of CLI or API commands that use the Data API to access Redshift Serverless.
I seem to have trouble accessing Redshift Serverless resources from the Data API. The Data API does not respect the WorkgroupName attribute. Thoughts?
```
aws redshift-data list-databases --database dev3 --workgroup-name wg6 --region us-east-1
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
aws help
aws <command> help
aws <command> <subcommand> help
Unknown options: --workgroup-name, wg6
```
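Recent SDK and CLI releases do accept a workgroup name for serverless, so an `Unknown options: --workgroup-name` error usually points at an older installed version. The same calls through boto3 look like the sketch below, using the database and workgroup names from the question:
```
import boto3

# Requires a boto3/CLI version that includes the WorkgroupName parameter for
# the redshift-data service; older versions reject it as above.
client = boto3.client("redshift-data", region_name="us-east-1")

dbs = client.list_databases(
    Database="dev3",
    WorkgroupName="wg6",
)
print(dbs["Databases"])

stmt = client.execute_statement(
    Database="dev3",
    WorkgroupName="wg6",
    Sql="SELECT 1;",
)
print(stmt["Id"])
```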
The Redshift docs say that when creating a new user, you can specify your password as cleartext, an MD5 hash, or a SHA-256 hash with a salt. Two valid formats they give for SHA-256 are `sha256|<cleartext password>` and `sha256|<digest>|<salt>`. I tried both formats when creating a user and got the same error: `error: authentication method 13 not supported`.
I tried the psql that came with Postgres version 14 and the psql that came with version 13; both reported the same error. I also tried connecting with Navicat, and it reported the same error.
Do we need to run `SET password_encryption TO sha256` or something similar?
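Not an answer to why authentication method 13 is rejected, but one workaround that avoids the sha256 path entirely: Redshift also accepts an MD5 password built by hashing the concatenation of the password and the user name and prefixing the result with `md5`. A sketch with placeholder credentials:
```
import hashlib

# Placeholder credentials; the hash input is password followed by username.
username = "newuser"
password = "Examplepassw0rd"

digest = hashlib.md5((password + username).encode()).hexdigest()
create_user_sql = f"CREATE USER {username} PASSWORD 'md5{digest}';"
print(create_user_sql)
```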
Hello. I'm attempting to build an ad-tracking application that can attribute, store, and then query and analyze website visitor information in real or near-real time. Unfortunately, I'm having difficulty designing the application architecture, as I am new to AWS overall. So far, I expected my application to look like this:
1. API Gateway to serve as a secure endpoint for websites and ad servers to send website visitor information (think UTM parameters, device resolution, internal IDs, etc.)
2. Lambda/Node.js to route and attribute session information
3. DynamoDB for its ability to handle high-volume write rates in a cost-efficient way.
4. S3 to create frequent/on-demand backups of DynamoDB which can then be analyzed by
5. ? Considering passing all S3 data back for client-side processing in my dashboard.
**However:** I just found [this case study with Nasdaq](https://aws.amazon.com/solutions/case-studies/nasdaq-case-study/?pg=ln&sec=c) utilizing [redshift and other services shown here](https://aws.amazon.com/redshift/?p=ft&c=aa&z=3). Judging from the 'Data' label featured in the first illustration of the latter link (clickstreams, transactions, etc) it appears to be exactly what I need.
So, I suppose my question would be, from a cost, simplicity, and efficiency standpoint: would it just be easier to eliminate DynamoDB and S3 and instead configure my Lambda functions to send their data directly into Redshift?
Any guidance would be greatly appreciated, thank you!
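For what the "Lambda writes straight to Redshift" option might look like, here is a rough sketch using the Data API so the function needs no persistent database connection; the workgroup, database, table, and event fields are all assumptions.
```
import boto3

client = boto3.client("redshift-data")

def lambda_handler(event, context):
    # Insert one visitor event into a hypothetical page_visits table.
    res = client.execute_statement(
        WorkgroupName="analytics-wg",   # placeholder serverless workgroup
        Database="dev",
        Sql="""
            INSERT INTO page_visits (session_id, utm_source, device_resolution, visited_at)
            VALUES (:session_id, :utm_source, :device_resolution, GETDATE());
        """,
        Parameters=[
            {"name": "session_id", "value": event["session_id"]},
            {"name": "utm_source", "value": event["utm_source"]},
            {"name": "device_resolution", "value": event["device_resolution"]},
        ],
    )
    return {"statementid": res["Id"]}
```
For a high-volume clickstream, per-event INSERTs like this tend to become the limiting factor, which is why buffered ingestion (for example, batching events through S3 or a streaming service before loading) is the more common pattern.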