Redshift order by
My first try at using Redshift. Trying to use ORDER BY and not getting the expected result. The output table should be sorted by column "a".
Here are the steps:
1. Create tbl_data
SELECT 3 as "a", 30 as "b"
UNION ALL
SELECT 3 as "a", 30 as "b"
UNION ALL
SELECT 2 as "a", 20 as "b"
UNION ALL
SELECT 2 as "a", 20 as "b"
UNION ALL
SELECT 1 as "a", 10 as "b"
UNION ALL
SELECT 1 as "a", 10 as "b"
2. Order and output as pd_a1
SELECT *
FROM "tbl_data"
ORDER BY "a"
Here is the result:
https://drive.google.com/open?id=1s6L4LDPijFJY6Hjrp9F6Nsh9JBeo6yH3
Thanks for your attention.
Comments
-
I have the same problem, but it seems to only be a problem in the preview as the end result comes out correct.
0 -
For me the preview and the output are the same.
1 -
This has to be a bug. I am having the same issue that @BruceP is having with the order by clause.
Have you submitted a ticket yet?
0 -
I wrote a simple query to order by closedate and got some strange but consistent results. I tried to dig into the database details of how this is being sorted, like sortkey and distkey, but didn't really know what i was doing, and creating a temp table and applying the sortkey/distkey using custom sql didn't seem to make a difference. Any ideas?
0 -
I am having issues with the order_by too,
is it a bug, eventually? any solution?0 -
Recently I have had the same problem!
My objective was to perform a cumulative sum of an ordered dataset. I tried to run Order By with different kinds of data type (date, string of date, row number column ecc.) but for all the tests the same result = no order.
I choose to switch to the ETL dataflow and perform a window & ranking function.
Maybe there is any window functions that coul help to force the ordering in redshift (?)
0 -
Hi Kevin,
I'm not sure why this works. But, I was getting the same odd sorting that you were:
However, if I added a rank function to it, the sort worked:
... or it did when I included one extra column???
Very odd. In conclusion... I would say to cautiously use the ORDER BY in redshift. If you do, you will want to thoroughly check your results. This has to be a bug. I was not able to find any documentation from redshift stating that the order by clause was broken, but it seems to be in whichever version Domo is using for these dataflows.
0 -
Hey everyone - here's the explanation provided by the support team at Domo - "The ORDER BY does apply for the purpose of the query that you use it in, but Redshift returns the data back to the preview in whatever order it wants. Unfortunately the ORDER BY only applies to the query itself and not the return data."
0 -
Hi everyone,
I am a Tier II Technical Advisor on the Domo Support team, and I actually field this question about Redshift a lot.
What you observe about the behavior with ORDER BY in Redshift is actually expected:
When Redshift processes a query that contains an ORDER BY clause, it does honor the specified order while processing the logic of that query (i.e. RANK() functions and the like do work properly, and as expected). However, when it has finished running the query, it re-orders the data to optimize the actual storage of the data (Redshift stores data tables distributed across many nodes, and splits the data up according to it's own storage optimization methods).
When Domo pulls up a preview of the data, we load it out of the table after it has been re-ordered and stored by Redshift.
When working with Redshift, it's important to include an ORDER BY clause in every individual query that relies on that specific ordering of the data, and never to rely on the order persisting from a prior transform.
In fact, this is best-practice when working with MySQL as well, in case somebody alters your dataflow without understanding that the order of a transform's output is important.
EDIT: I'm also getting this information added to our Dataflow Troubleshooting Knowledge Base article here: https://knowledge.domo.com/Prepare/DataFlow_Tips_and_Tricks/DataFlow_and_DataFusion_Troubleshooting_and_FAQ
Thanks for bringing it up for discussion, and happy data processing!
Starlord
**Say “Thanks” by clicking the “heart” in the post that helped you.
**Please mark the post that solves your problem by clicking on "Accept as Solution"4
Categories
- All Categories
- 1.8K Product Ideas
- 1.8K Ideas Exchange
- 1.6K Connect
- 1.2K Connectors
- 300 Workbench
- 6 Cloud Amplifier
- 9 Federated
- 2.9K Transform
- 103 SQL DataFlows
- 627 Datasets
- 2.2K Magic ETL
- 3.9K Visualize
- 2.5K Charting
- 755 Beast Mode
- 61 App Studio
- 41 Variables
- 693 Automate
- 178 Apps
- 456 APIs & Domo Developer
- 49 Workflows
- 10 DomoAI
- 38 Predict
- 16 Jupyter Workspaces
- 22 R & Python Tiles
- 398 Distribute
- 115 Domo Everywhere
- 276 Scheduled Reports
- 7 Software Integrations
- 130 Manage
- 127 Governance & Security
- 8 Domo Community Gallery
- 38 Product Releases
- 11 Domo University
- 5.4K Community Forums
- 40 Getting Started
- 30 Community Member Introductions
- 110 Community Announcements
- 4.8K Archive