Best way to get parquet data from AWS S3 bucket?
We have some parquet files being replicated to an AWS S3 bucket.
I've started looking at whether I can use AWS Glue to crawl the bucket, Athena to query the Glue table, and then Domo to pull data from Athena. I'm running into a few issues (for example, Glue picks up the initial load file as a table and reports that it has rows, but Athena can't query any data from it), but I think I can get there.
However, before I go too far down the road, is there another approach that works?
Unfortunately the S3 connector doesn't read parquet files.
I could convert them to CSV and upload directly to Domo using something like https://stackoverflow.com/questions/62275672/converting-parquet-files-in-s3-to-csv-and-store-back-in-s3 but that seems ... kludgy?
If anyone has any suggestions, I'm all ears.
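For reference, the CSV conversion route mentioned above could be sketched roughly as follows. This is only a minimal sketch, assuming `boto3` and `pandas` (with `pyarrow` or `fastparquet`) are installed and AWS credentials are already configured. The function names, the bucket name, and the `csv/` destination prefix are hypothetical placeholders and do not come from the linked Stack Overflow post.

```python
import io
import posixpath


def csv_key_for(parquet_key: str, dest_prefix: str = "csv/") -> str:
    """Map an S3 key like 'raw/part-0.parquet' to 'csv/part-0.csv'.

    Only the file name is kept, so two parquet files with the same name
    in different folders would collide under dest_prefix.
    """
    stem, _ext = posixpath.splitext(posixpath.basename(parquet_key))
    return f"{dest_prefix}{stem}.csv"


def convert_bucket(bucket: str, src_prefix: str, dest_prefix: str = "csv/") -> list:
    """Read every .parquet object under src_prefix and write a CSV copy.

    Hypothetical helper: each file is loaded fully into memory, so this
    only suits files that fit comfortably in RAM.
    """
    # Third-party imports live here so csv_key_for works without them.
    import boto3
    import pandas as pd  # read_parquet needs pyarrow or fastparquet

    s3 = boto3.client("s3")
    written = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=src_prefix):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith(".parquet"):
                continue
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            frame = pd.read_parquet(io.BytesIO(body))
            out_key = csv_key_for(key, dest_prefix)
            s3.put_object(
                Bucket=bucket,
                Key=out_key,
                Body=frame.to_csv(index=False).encode("utf-8"),
            )
            written.append(out_key)
    return written
```

Calling something like `convert_bucket("my-replication-bucket", "raw/")` (placeholder names) would leave CSV copies under `csv/`, which the existing S3 connector could then read.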
Answers
I think that would be your best bet currently. If you haven't already, I would also upvote the Ideas Exchange request to get parquet files supported in Domo: https://dojo.domo.com/main/discussion/51684/ingesting-parquet-files
If I have answered your question, please click "Yes" on my comment option.
Yep - I found that and upvoted it.
It looks like there is a parquet reader built into the Domo CLI tool.