Questions tagged with Amazon Redshift
Hi,
I was testing Redshift's new features: auto-copy from Amazon S3 and streaming ingestion.
I have a few questions regarding these features.
1. I understand that Redshift automatically decides how many files to load in a batch. How often does Redshift detect new files and attempt to load them? Does it decide the load timing based on a specific file size or a specific time interval?
2. When streaming data is transferred to Redshift from Kinesis Data Streams with the streaming ingestion feature, where is the data stored? Is it kept in the Kinesis stream for 24 hours, or not stored anywhere?
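For context, this is roughly how I set the two features up, as I understand the syntax; the table, bucket, job, stream, and role names below are placeholders I made up:
```sql
-- Auto-copy: a COPY JOB that watches an S3 prefix and loads new files
COPY my_table
FROM 's3://my-bucket/incoming/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT CSV
JOB CREATE my_copy_job
AUTO ON;

-- Streaming ingestion: an external schema over Kinesis plus a materialized view
CREATE EXTERNAL SCHEMA kds
FROM KINESIS
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole';

CREATE MATERIALIZED VIEW my_stream_mv AUTO REFRESH YES AS
SELECT approximate_arrival_timestamp,
       JSON_PARSE(kinesis_data) AS payload
FROM kds.my_stream;
```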
Thanks.
Is there any feature in Redshift that allows dynamically adding metadata to database objects?
I mean something similar to object tagging in Snowflake.
I'm trying to copy a CSV file from an S3 bucket to Redshift, and these are the options I'm using:
IGNOREHEADER AS 1
DELIMITER ','
REGION AS 'us-east-1'
CSV;
Even though the table schema and the CSV file hold the same data, without nulls, I'm still getting errors such as "Invalid digit, Value 'B', Pos 0, Type: Integer", "Invalid dateformat", and "Char length exceeded".
These are the columns present in the data: Uploaddate, Batchid, Inventoryid, Itemtype, Model, Supplierid, Quantity, Unit, Costcenter, Inventorydate, Embodiedco2, Usageco2, Riskadjustment.
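For reference, the full command I am running looks roughly like this (bucket, table, and role names changed, and I am assuming the column list matches the file's column order):
```sql
COPY inventory (uploaddate, batchid, inventoryid, itemtype, model,
                supplierid, quantity, unit, costcenter, inventorydate,
                embodiedco2, usageco2, riskadjustment)
FROM 's3://my-bucket/inventory.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
REGION AS 'us-east-1'
IGNOREHEADER AS 1
DELIMITER ','
CSV;
```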
I am trying to create a new table based on a simple query that contains boolean type columns.
When I run the query it gives me this error: `cannot cast type character varying to boolean`.
_Why would I want to do this?_
I'm trying to define the data type of a column in my new table as a BOOLEAN, but without having a value to put in it (yet). Basically, the outcome I'm looking for is the equivalent of running the following DDL: `create table test_table (test_bool boolean);`
A simple example to reproduce this is:
```sql
CREATE TABLE test_table AS (
    SELECT CAST(NULL AS BOOLEAN) AS test_bool
);
```
**Note: I am not using a string or casting to a varchar in this code!!**
Conversely, the following works as expected, i.e. it has the correct column types:
```sql
CREATE TABLE test_table AS (
    SELECT
        CAST(NULL AS VARCHAR(256)) AS test_varchar,
        CAST(NULL AS TIMESTAMP) AS test_ts,
        CAST(NULL AS INT) AS test_int,
        CAST(NULL AS NUMERIC(10,2)) AS test_numeric
);
```
Hopefully I'm missing something fairly basic about implicit data conversion or about how a result set is converted into a table definition.
Also, FWIW, this was on Redshift Serverless; I did not try it on a provisioned cluster, so it could be specific to Serverless, but I cannot say one way or the other.
**---CLARIFICATION BY EXAMPLE---**
Here's a more concrete example:
I have source data that has people with eye_color and hair_color but nothing else. I want my target table to have the following schema: `person(eye_color VARCHAR(256), hair_color VARCHAR(256), is_left_handed BOOLEAN, salary DECIMAL(10,2))`.
I am creating this table fresh on each run of my pipeline, and I'm using dbt to create a table from a source table (containing the source data mentioned above). This means I don't plan on running DDL to create the table and then fill it; rather, I will `CREATE TABLE AS`.
Since I want the destination table (person) to have all the columns (i.e. the correct schema), I need placeholders for them in my SELECT statement. Right now it would look something like this:
```sql
CREATE TABLE person AS (
    SELECT
        eye_color,
        hair_color,
        CAST(NULL AS BOOLEAN) AS is_left_handed,
        CAST(NULL AS DECIMAL(10,2)) AS salary
    FROM source_data
);
```
This command fails with the error about converting to varchar. My question is about when the boolean value gets converted to a varchar and how to prevent that from happening.
If I remove the offending column `CAST(NULL AS BOOLEAN) AS is_left_handed`, it works as expected, which can be verified with this query:
```sql
SELECT column_name, is_nullable, data_type, character_maximum_length, numeric_precision, numeric_scale, datetime_precision
FROM information_schema.columns
WHERE table_schema = 'my_schema' AND table_name = 'person';
```
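In case it helps anyone hitting the same thing, the fallback I am considering is splitting the DDL from the load instead of relying on `CREATE TABLE AS`:
```sql
-- Declare the full target schema up front, then fill only the known columns;
-- the remaining columns default to NULL
CREATE TABLE person (
    eye_color      VARCHAR(256),
    hair_color     VARCHAR(256),
    is_left_handed BOOLEAN,
    salary         DECIMAL(10,2)
);

INSERT INTO person (eye_color, hair_color)
SELECT eye_color, hair_color
FROM source_data;
```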
Hi All,
Greetings for the day.
I am curious to understand the significance and accuracy of the field `vacuum_sort_benefit` in the system view `svv_table_info`.
In the cluster I am currently working on, I see tables where the `unsorted` column is 100 but `vacuum_sort_benefit` is 0.0, which makes me wonder whether it is really worth running VACUUM and ANALYZE on these tables.
Any input in this regard is greatly appreciated.
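For reference, this is roughly the query I am using to spot those tables (the `unsorted > 0` threshold is arbitrary):
```sql
-- Compare how unsorted each table is against Redshift's own estimate
-- of how much a vacuum sort would help
SELECT "table", unsorted, vacuum_sort_benefit
FROM svv_table_info
WHERE unsorted > 0
ORDER BY unsorted DESC;
```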

Does anybody know why trailing zeros in timestamps (e.g. 2022-11-08 07:50:08.100 becoming 2022-11-08 07:50:08.10) are truncated by Redshift automatically? Please refer to my screenshot for details.
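If it helps to narrow things down, the workaround I have been testing is formatting the value as text with a fixed millisecond width (table and column names here are made up):
```sql
-- Render the timestamp with all three millisecond digits preserved
SELECT TO_CHAR(event_ts, 'YYYY-MM-DD HH24:MI:SS.MS') AS event_ts_text
FROM my_events;
```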
Greetings,
I have a really simple ETL job that should take a CSV from S3 and insert it into Redshift. However, I can't configure Redshift as a target because, for some reason, the dropdown in the target properties only shows Glue Data Catalog databases and not Redshift ones. I have tried different browsers thinking it was a caching issue, but am now convinced it's an AWS error.

I only see a maximum of 4 in the Nodes dropdown.
Does a Redshift cluster allow more than 4 compute nodes?

Hello everyone!
I am trying to import data from an RDS SQL Server instance to Redshift, but I have a query with a recursive CTE like this:
```sql
WITH RECURSIVE table(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM table WHERE n < 100
)
SELECT n FROM table;
```
On Redshift there is no error, but if I execute the query in an AWS Glue job (Transform - SQL Query), I get this error:
*ParseException: no viable alternative at input 'WITH RECURSIVE table*
What am I missing?
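In case it is relevant: my assumption (not confirmed) is that the Glue SQL transform runs Spark SQL, which does not support `WITH RECURSIVE`. A number series like the one above can be generated in Spark SQL without recursion, for example:
```sql
-- Spark SQL: generate 1..100 with sequence() + explode() instead of a recursive CTE
SELECT explode(sequence(1, 100)) AS n;
```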
Hello AWS Experts,
We have two different AWS accounts in the same region, for instance Account_1 and Account_2. An EC2 instance in Account_1 has Talend installed, and Account_2 has Redshift. We need to fetch Redshift (Account_2) data directly from Account_1. Is a cross-account connection recommended for fetching data?
Can anyone please help me here? Kindly share a URL as well if you have one.
Appreciate your help!
Thanks,
RN
Hi,
I am using the Golang SDK to query data from a Redshift database.
The function GetStatementResult() returns a GetStatementResultOutput{}, which contains the list of records for the SQL query. How do I fetch the data from the records, which are of type `Records [][]*Field`? I want the result in JSON format so that I can unmarshal the data into a result struct. Each record returns the value field for a column, and it's difficult to determine which type's value field is which. Is there any utility function available to decode the records struct?
Trying to set up DMS between RDS MySQL and Redshift. If I try to edit the Table Mappings from the GUI, I always get a weird error (that has zero Google results, except one spam page):
`Replication Task Settings document error: Array element's unique identifier was not found, please provide Id for each and every element`.
No idea what that means. Every rule has a unique name and ID.
Not the biggest issue in the world, as it's usually okay if I make a new one, but it does make it hard to iterate. Any idea what the issue here could be? Thanks!
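For comparison, this is the minimal shape of a table-mappings document where every element carries an explicit `rule-id` and `rule-name` (schema and table names here are placeholders):
```json
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-all-tables",
      "object-locator": {
        "schema-name": "mydb",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}
```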