Questions tagged with Amazon Redshift
We have two Redshift clusters. One cluster updated to version `1.0.44126` over the weekend. After that, we started experiencing failures in some data extract processes, with the error: `ERROR: schema "dbo" does not exist`. Sure enough, after reviewing the queries, we had inadvertently used `dbo` instead of `public` as the schema name. However, this worked for months prior to the latest Redshift version. I cannot find any information on version `1.0.44126`, nor any feature that might translate these schema names transparently. **Why is Redshift doing this?** I can reproduce the behavior on cluster version `1.0.43931`:

```sql
create table public.test_table (
    test integer
);

select * from dbo.test_table;

drop table dbo.test_table;
```

The above all succeeds on cluster version `1.0.43931` and fails on `1.0.44126`.
I am trying to load a Redshift table from a .csv file in an S3 bucket. When I created the table, three columns were set to have default values:

```sql
,"current_row_ind" CHAR(1) DEFAULT 'Y'
,"EFFECTIVE_DATE" DATE DEFAULT TO_DATE(CURRENT_DATE, 'YYYY-MM-DD')
,"EXPIRATION_DATE" DATE DEFAULT TO_DATE('9999-12-31', 'YYYY-MM-DD')
```

Some of the rows in my file have values for these fields, and some do not. When I use the COPY command to load the table, Redshift seems to ignore the defaults for the fields that do not have values filled in:

- If I leave those fields out of the column list but keep them in my file, I get the error "Extra column(s) found".
- If I include those fields in the column list, the rows in my file that lack values are loaded with NULL rather than the defaults.
- If I leave those fields off the COPY column list and remove them from my file, they load with the defaults. Obviously this won't work, as some of the rows have legitimate values for them.

These are the parameters I'm using in my COPY command:

```
FORMAT AS CSV
DELIMITER ','
QUOTE '"'
ACCEPTINVCHARS '?'
TIMEFORMAT 'auto'
IGNOREHEADER 1
ACCEPTANYDATE
BLANKSASNULL
DATEFORMAT 'auto'
EMPTYASNULL
EXPLICIT_IDS
FILLRECORD
IGNOREBLANKLINES
TRIMBLANKS
REGION AS 'us-west-2'
```

My question: how can I use the COPY command to load this table so that it picks up the values for these fields when they are present in the file, but uses the defaults when they are not? I have researched this here on re:Post as well as in other forums such as Stack Overflow, but have not found anything that addresses this specific scenario. Any help would be appreciated.
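One possible workaround (an assumption, not something from the post): since COPY inserts NULL for empty cells rather than applying column defaults, the blanks can be pre-filled in the CSV before it is uploaded to S3. A minimal sketch; the column names mirror the DDL above, and the in-memory file is purely illustrative:

```python
import csv
import io
from datetime import date

# Defaults mirroring the table DDL above; EFFECTIVE_DATE's default is
# CURRENT_DATE, approximated here with today's date.
DEFAULTS = {
    "current_row_ind": "Y",
    "effective_date": date.today().isoformat(),
    "expiration_date": "9999-12-31",
}

def fill_defaults(src, dst):
    """Copy a CSV, replacing empty cells in the default-able columns."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        for col, default in DEFAULTS.items():
            if not row.get(col):
                row[col] = default
        writer.writerow(row)

src = io.StringIO(
    "id,current_row_ind,effective_date,expiration_date\n"
    "1,,2022-05-01,\n"
    "2,N,,9999-12-31\n"
)
dst = io.StringIO()
fill_defaults(src, dst)
```

The cleaned file can then be loaded with the full column list, so rows with legitimate values keep them and blanks get the defaults.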
Hi Team, I'm implementing pagination using AWS DynamoDB. My UI has four buttons: first, next, previous, and last. I'm fetching data in pages of 25 rows to avoid doing a full table scan. When the user clicks the "last" button to go directly to the last page, do I need to do a full table scan for that, or is there a way to get the last 25 rows instead?
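If the pages come from a Query against a partition key with a sort key (an assumption — this does not apply to Scan, which has no reverse option), DynamoDB's `ScanIndexForward=False` lets you read from the end without walking the earlier pages. A pure-Python simulation of that idea, not actual boto3 code:

```python
# Simulation of Query with ScanIndexForward=False: read items in
# descending sort-key order, take one page's worth, then flip it
# back to ascending order for display.

PAGE_SIZE = 25

def last_page(items, page_size=PAGE_SIZE):
    """Return the final `page_size` rows in ascending order,
    touching only one page's worth of data."""
    descending = sorted(items, key=lambda it: it["sk"], reverse=True)
    page = descending[:page_size]        # first page of the reversed read
    return list(reversed(page))          # ascending again for the UI

rows = [{"sk": i} for i in range(103)]   # 103 fake rows
tail = last_page(rows)                   # sort keys 78 .. 102
```

With real boto3 this corresponds to one `query(..., ScanIndexForward=False, Limit=25)` call, which costs a single page of read capacity rather than a full scan.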
```
HIVE_UNSUPPORTED_FORMAT: Output format org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat with SerDe org.openx.data.jsonserde.JsonSerDe is not supported.
```

If a data manifest file was generated at 's3://athena-one-output-bucket/Unsaved/2022/11/24/0a5467bf-8b9a-4119-bc89-c891d1e26744-manifest.csv', you may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account. This query ran against the "covid_dataset" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 0a5467bf-8b9a-4119-bc89-c891d1e26744
Hello, I am reading a specific table from a Redshift database using redshift_connector. When I view the dataframe, it does not show the column headers; it shows only numbers as the column headers. Can anyone help with what's wrong here? We need to view the table headers.
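One likely cause (an assumption about the setup): building the DataFrame from `fetchall()` alone discards the column names, so pandas falls back to integer headers. Per DB-API 2.0, which redshift_connector follows, the names are the first element of each `cursor.description` entry. A sketch with a faked description and rows:

```python
# cursor.description entries start with the column name (DB-API 2.0).
# The description and rows below are faked for illustration; a real
# cursor would supply them.

fake_description = [
    ("user_id", None, None, None, None, None, None),
    ("email",   None, None, None, None, None, None),
]
fake_rows = [(1, "a@example.com"), (2, "b@example.com")]

columns = [col[0] for col in fake_description]          # recover the headers
records = [dict(zip(columns, row)) for row in fake_rows]
```

With a real cursor, `pd.DataFrame(cursor.fetchall(), columns=[c[0] for c in cursor.description])` restores the headers; redshift_connector also provides `cursor.fetch_dataframe()`, which keeps them automatically.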
Hi, If I want to connect a Quicksight instance in Account A, to a private Redshift cluster (i.e. located in a private VPC subnet) in Account B, what reasons would I have to use a (more expensive from the looks of it) Redshift Managed VPC Endpoint to provide this cross-account, cross-VPC connectivity, over using VPC peering? Is this simply a case of "less management overhead", or is there a technical reason why VPC peering would not be suitable in this case?
Hi, I am new to Redshift Serverless and working on a Lambda function that connects to a serverless database using the Python Redshift Data API and executes a bunch of SQL statements and stored procedures stored in a Nexus repository artifact. I am seeing errors when I try to execute SQL statements read from a file as a string. Here is an example of the DDL from one of the scripts:

```sql
-- Table Definitions
-- ----------------------------------
-- test1 -----------------------
-- ----------------------------------
DROP TABLE IF EXISTS test1;
CREATE TABLE test1 (
    id varchar(32),
    name varchar(64) DEFAULT NULL,
    grade varchar(64) DEFAULT NULL,
    zip varchar(5) DEFAULT NULL
);

-- test2 --------------------------
-- ----------------------------------
DROP TABLE IF EXISTS test2;
CREATE TABLE test2 (
    id varchar(32),
    user_id varchar(32) DEFAULT NULL,
    hnum varchar(6),
    ts_created timestamp DEFAULT NULL,
    ts_updated timestamp DEFAULT NULL
);

-- and a few other tables in the same script
```

The function runs fine if I hard-code the SQL query, and I don't see any syntax or other errors in the SQL file contents, since I can run those DDLs in the Redshift query editor by manually copying and pasting them. Am I missing anything, or is the Data API not the right approach for this use case?
Error and traceback from the Lambda function execution:

```
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/runtime/bootstrap.py", line 60, in <module>
    main()
  File "/var/runtime/bootstrap.py", line 57, in main
    awslambdaricmain.main([os.environ["LAMBDA_TASK_ROOT"], os.environ["_HANDLER"]])
  File "/var/runtime/awslambdaric/__main__.py", line 21, in main
    bootstrap.run(app_root, handler, lambda_runtime_api_addr)
  File "/var/runtime/awslambdaric/bootstrap.py", line 405, in run
    handle_event_request(
  File "/var/runtime/awslambdaric/bootstrap.py", line 165, in handle_event_request
    xray_fault = make_xray_fault(etype.__name__, str(value), os.getcwd(), tb_tuples)
FileNotFoundError: [Errno 2] No such file or directory
```
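One thing worth checking (an assumption about the cause, not a confirmed diagnosis): the Data API's `execute_statement` runs a single SQL statement per call, while `batch_execute_statement` accepts a list, so a multi-statement script read as one string usually needs splitting first. A naive splitter sketch — it does not handle semicolons inside string literals or stored-procedure bodies:

```python
# Split a .sql script into individual statements for the Redshift
# Data API.  Strips "--" line comments, then splits on ";".  Naive:
# a ";" or "--" inside a quoted string would break it.

def split_sql_script(script: str) -> list:
    lines = []
    for line in script.splitlines():
        code = line.split("--", 1)[0]   # drop line comments
        lines.append(code)
    statements = "\n".join(lines).split(";")
    return [s.strip() for s in statements if s.strip()]

script = """
-- test1 ---------------------------
DROP TABLE IF EXISTS test1;
CREATE TABLE test1 (
    id varchar(32),
    name varchar(64) DEFAULT NULL
);
"""
statements = split_sql_script(script)
```

Each resulting statement can then be sent individually, or the whole list passed to `batch_execute_statement`.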
Hi, we have a large Parquet file stored in S3. It contains the data for six months of logs, or even more. As a service, we need to run a query to fetch user data from the Parquet file, end-to-end using AWS Data APIs. I have explored Redshift, but the challenge is that loading data from S3 into Redshift needs additional effort, such as provisioning a cluster and copying the data from S3 into the Redshift database. I have the following questions:

1. Can Redshift provisioning be done using AWS Data APIs?
2. How do I run the query to fetch user data from the Redshift database? Are there Data APIs available for this?
3. Is there a better AWS service available, other than Redshift?

Regards, Ashok
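On the API side of question 2: the Redshift Data API (boto3 client `redshift-data`) can run SQL without managing connections. A sketch of the request an `execute_statement` call would take — the workgroup, database, and table names here are hypothetical placeholders:

```python
# Sketch of a Redshift Data API ExecuteStatement request.  The
# WorkgroupName, Database, and table name are hypothetical; the
# actual call would be:
#   boto3.client("redshift-data").execute_statement(**request)
request = {
    "WorkgroupName": "my-workgroup",   # serverless; use ClusterIdentifier for provisioned
    "Database": "dev",
    "Sql": "SELECT * FROM user_logs WHERE user_id = :uid",
    "Parameters": [{"name": "uid", "value": "12345"}],
}
```

The call is asynchronous: it returns a statement Id, and results are fetched afterwards with `describe_statement` / `get_statement_result`.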
In 2020 and 2021 there was a lot of news about AQUA (Advanced Query Accelerator) for Redshift RA3 instances. However, I was just looking to enable it on an RA3.XLPLUS cluster and the option no longer appears; the "AQUA enabled" flag no longer shows either. When searching the Redshift docs, the pages referencing AQUA no longer seem to be present, despite still appearing in search. E.g., this page is still listed if I search for "aqua", but when I click it, it's gone: https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-aqua.html What's happened to AQUA?
We have a BI product which we provisioned on EC2 instances. The only way we can connect to AWS data sources from these EC2 instances is by granting a cross-account role trust policy. The EC2 instances sit in one VPC and the data sources in different VPCs. We have a use case to connect to data sources in multiple accounts (VPCs), in which case, if the EC2 role is compromised, it will be able to connect to all the data sources that trust it. How do we add more access-control layers to this?
Any help please? The log is as below:

```
: Failed to upload file '/rdsdbdata/data/tasks/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/cloud/1/LOAD00000001.gzip' to bucket 'dms-in2tox7r7fgrwfjrshecwnp6s4y2uvsre63afzy' as 'redshift-FGKQA6KH2TIZ5J6G4T3NMVU7REZGRBNUYSGQI3Y/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/1/LOAD00000001.csv', status = 4 (FAILED)  (transfer_client.cpp:514)
2022-11-17T14:59:38 [FILE_FACTORY ]E: Failed to upload </rdsdbdata/data/tasks/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/cloud/1/LOAD00000001.gzip> to <dms-in2tox7r7fgrwfjrshecwnp6s4y2uvsre63afzy/redshift-FGKQA6KH2TIZ5J6G4T3NMVU7REZGRBNUYSGQI3Y/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/1/LOAD00000001.csv>  (at_s3_ff.c:429)
2022-11-17T14:59:38 [FILE_FACTORY ]E: Failed to write entire file (second trial)  (at_universal_fs_object.c:631)
2022-11-17T14:59:38 [FILE_FACTORY ]E: Write entire file failed: source = '/rdsdbdata/data/tasks/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/cloud/1/LOAD00000001.gzip' target = 'dms-in2tox7r7fgrwfjrshecwnp6s4y2uvsre63afzy/redshift-FGKQA6KH2TIZ5J6G4T3NMVU7REZGRBNUYSGQI3Y/KXHGVEZM7WRYI636BOJOQYTCOIGFGS6AOGE3MNY/1/LOAD00000001.csv' open type = 3  (at_universal_fs_object.c:316)
```
I have successfully created a user in Redshift with a sha256-hashed password. When trying to authenticate through my app, which uses libpq's PQconnectdb to make a new connection to Redshift, I get "authentication method 13 not supported". The same code/workflow works fine with an md5 user. I have also verified that the problem is not related to the user or the Redshift configuration by establishing a DB connection through the ODBC and JDBC drivers (which support sha256) with the same user. The libpq versions I used for testing are from PostgreSQL 14 and 15. Can you please tell me if I am doing something wrong, or confirm that libpq does not support Redshift sha256-hashed passwords?