robsmith Member

Comments

  • The upload itself takes about 3.5 hours, and one day I'll distribute the uploader further to see if I can improve that. After committing the execution, though, Domo's "processing" stage takes 6+ hours. I'm hoping that increasing my part sizes to ~100 MB gzipped will cut down the Domo processing time (see the upload sketch after this list). I'm not pulling from a…
  • I've used the Update DataSet MetaData method, but if I remember correctly I had to delete all my rows first (that may have changed, since I haven't done this in a while). Losing the data rows was a pain, but at least I could keep streaming to the same dataset without creating a new one. There's a rough sketch of that call after this list.
  • Thanks @Medinacus, your comments are really helpful. I frequently upload a (new) 400 million row (and growing) dataset with 200 columns, and I'm always looking for ways to save time. Have you found a part size that works well? I think the latest documentation recommends 20-100 MB (compressed size) per part, but I'm…
  • Nice work. Did you try gzip compression, or find any advantage to it? Or is that just uncompressed CSV?
  • I too am trying to upload rows as quickly as possible. I'm not an expert, but it seems that if the bottleneck is the upload bandwidth of one machine/network, you could use the Streams API to distribute part uploads across multiple machines/networks (a rough sketch of splitting part numbers across workers is below). Do you agree? I think Workbench only operates from one machine.
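
For anyone following along, here's a rough sketch of the gzipped part upload and commit discussed in these comments. It assumes the documented Streams API endpoints (PUT .../streams/{id}/executions/{id}/part/{n} and .../commit), that an OAuth access token is already in hand, and that the server accepts a gzip-compressed body for a text/csv part; the function names and the Content-Encoding header are illustrative, so check the API reference before relying on them.

```python
# Sketch: push gzip-compressed CSV parts into an open Domo Stream execution,
# then commit. Endpoint paths follow the public Streams API docs; everything
# else (token handling, headers, part numbering) is an assumption to verify.
import gzip
import requests

API = "https://api.domo.com/v1"

def upload_gzipped_part(token, stream_id, execution_id, part_number, csv_text):
    """PUT one gzip-compressed CSV part (aim for ~20-100 MB compressed)."""
    body = gzip.compress(csv_text.encode("utf-8"))
    resp = requests.put(
        f"{API}/streams/{stream_id}/executions/{execution_id}/part/{part_number}",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "text/csv",
            "Content-Encoding": "gzip",  # assumption: verify against the current docs
        },
        data=body,
    )
    resp.raise_for_status()
    return len(body)  # compressed bytes actually sent, useful for tuning part size

def commit_execution(token, stream_id, execution_id):
    """Commit the execution; Domo's own 'processing' stage starts after this."""
    resp = requests.put(
        f"{API}/streams/{stream_id}/executions/{execution_id}/commit",
        headers={"Authorization": f"Bearer {token}"},
    )
    resp.raise_for_status()
    return resp.json()
```

Because compression ratios vary by data, one practical way to hit a ~100 MB compressed target is to gzip a sample of rows first and estimate rows-per-part from that ratio.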
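
And a minimal sketch of the Update DataSet MetaData call mentioned above, assuming the PUT /v1/datasets/{id} endpoint from the public API reference; the payload shape (name, description, schema.columns) is taken from that reference but should be double-checked, and the column list here is only an example.

```python
# Sketch: update a dataset's name, description, and schema metadata.
import requests

def update_dataset_metadata(token, dataset_id, name, description, columns):
    """columns: list of (name, type) pairs, e.g. [("amount", "DOUBLE")]."""
    payload = {
        "name": name,
        "description": description,
        "schema": {"columns": [{"name": n, "type": t} for n, t in columns]},
    }
    resp = requests.put(
        f"https://api.domo.com/v1/datasets/{dataset_id}",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        json=payload,
    )
    resp.raise_for_status()
    return resp.json()
```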
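
On the multi-machine idea: since every part in a Stream execution just needs a unique part number, one way to fan the work out is to give each machine the same stream/execution IDs but a disjoint slice of part numbers, and have a single coordinator commit once everyone is done. The worker count and round-robin split below are purely illustrative.

```python
# Sketch: assign disjoint part numbers to each worker machine.
def parts_for_worker(total_parts, worker_index, worker_count):
    """Round-robin assignment of part numbers 1..total_parts to one worker."""
    return [p for p in range(1, total_parts + 1)
            if (p - 1) % worker_count == worker_index]

# e.g. machine 2 of 4, 400 parts total:
# my_parts = parts_for_worker(400, worker_index=1, worker_count=4)
# for p in my_parts:
#     upload_gzipped_part(token, stream_id, execution_id, p, read_part_csv(p))
# ...and the coordinator calls commit_execution(...) once all workers report done.
```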