Timing Out Redshift
Backstory: We have a datset provided by an outside consultant which is an export from SPSS.
11.3K rows of 1174 columns.
The columns contain replies to a survey.
The results are in categorical values, for example the answers to How much do you agree with.... are coded as 1 strongly disagree-5 strongly agree.
The end user of the dataset wants the values recoded to store the text rather than the variable, and the columns renamed from the question number to a plaintext of the question (which breaks my relational database thinking a bit, but I understand the reasoning).
We have a datamap file which give the plaintext meaning of the different values, and I have to recode the columns
I had to split the dataset into 6 subsets to get an SQL data set to run.
I've managed to get 4 of the 6 working as MySQL dataflows, but 2 of them will not run.
my coding is 250-odd select statements of select the appropriate value from the datamap based on the possible responses
eg of a few lines with identifying information removed:
(select "Selection" from "dataset" where "Source Value"="Q50_21" and "Check"='Yes/No') as "Value Q50 From the list below, plea",
(select "Selection" from "dataset" where "Source Value"="Q51" and "Check"='Q51') as "Now please think about",
(select "Selection" from "dataset" where "Source Value"="Q52_1" and "Check"='Appeal') as "Value Q52 X- How appealing do",
When I run as MySQL I get Row size is wider than MySQL supports. Please reduce the row size or number of columns.
When I run in Redshift, it runs for an hour and makes 74% progress and then: The database reported a syntax error. [Amazon](600001) The server closed the connection.
I'm pretty sure I'm not doing it in the most efficient way possible, but this was a once off survey which won't be repeated, so the dataset will never be changed.
I literally need it to run correctly once, and to be honest after two days I don't care about efficiency anymore, I just want it done.
Any help anyone can suggest would be great.
- All Categories
- 1.2K Product Ideas
- 1.2K Ideas Exchange
- 1.4K Connect
- 1.1K Connectors
- 273 Workbench
- 2 Cloud Amplifier
- 3 Federated
- 2.7K Transform
- 78 SQL DataFlows
- 526 Datasets
- 2.1K Magic ETL
- 2.9K Visualize
- 2.2K Charting
- 440 Beast Mode
- 23 Variables
- 256 Cards, Dashboards, Stories
- 514 Automate
- 115 Apps
- 391 APIs & Domo Developer
- 8 Workflows
- 26 Predict
- 10 Jupyter Workspaces
- 16 R & Python Tiles
- 332 Distribute
- 77 Domo Everywhere
- 255 Scheduled Reports
- 67 Manage
- 67 Governance & Security
- 1 Product Release Questions
- Community Forums
- 40 Getting Started
- 26 Community Member Introductions
- 68 Community Announcements
- 4.8K Archive