Amazon S3 Assumerole Advanced v2 Issues

jimsteph
jimsteph Contributor

We pull data from S3 using the S3 Assumerole Advanced v2 connector, and I'm having problems with it. The connector offers three update methods: Replace, Append, and Merge. We're bringing in employee data. Here are the problems with each method:

  • I would prefer to use Merge, but when I select it, the connector tries to load values into Merge Key Location and times out every time.
  • Our current connector uses Append, but that causes the dataset to balloon: we've had fewer than 100K employees in total over the years, but the dataset now holds almost 4 million records.
  • Replace looked good at first: in a test connector, the first run pulled in the correct number of records. However, every subsequent run pulled in only new or changed records, which overwrote the full set from the first run.
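
To make the three behaviors concrete, here is a minimal sketch of how each update method treats an incoming batch. It is a conceptual model only, not the connector's actual implementation: the function names and sample rows are hypothetical, and `EmployeeID` is assumed to be the merge key.

```python
# Conceptual model only: hypothetical helpers, not the connector's implementation.

def replace_update(existing, batch):
    """Replace: the dataset becomes exactly what the latest run delivered."""
    return list(batch)


def append_update(existing, batch):
    """Append: every run is stacked on top of the rows already stored."""
    return list(existing) + list(batch)


def merge_update(existing, batch, key):
    """Merge (upsert): matching keys are overwritten, new keys are added."""
    by_key = {row[key]: row for row in existing}
    for row in batch:
        by_key[row[key]] = row
    return list(by_key.values())


# Rows already in the dataset after the first full load (sample data).
existing = [
    {"EmployeeID": 1, "Title": "Analyst"},
    {"EmployeeID": 2, "Title": "Manager"},
]
# A later run that delivers only new or changed records (sample data).
batch = [
    {"EmployeeID": 2, "Title": "Director"},
    {"EmployeeID": 3, "Title": "Engineer"},
]

replaced = replace_update(existing, batch)
appended = append_update(existing, batch)
merged = merge_update(existing, batch, key="EmployeeID")

print(len(replaced), len(appended), len(merged))  # prints: 2 4 3
```

In this model, Replace with a changes-only batch drops employee 1, Append keeps both versions of employee 2, and only Merge ends up with one current row per employee.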

If anyone has experience with this connector, do you think it's buggy, or might we have incorrect settings in the S3 account? Any suggestions for alternative connectors to get the data out of S3?

Best Answer

  • GrantSmith
    GrantSmith Coach
    Answer ✓

    You could use the Replace update method on your S3 dataset and then feed it into a MagicETL dataflow that outputs to a second dataset, with the output method set to partition and a partition key that you define. This would get around the merge issue you're running into.

    As for the merge issue itself, how many merge keys are you attempting to load? You might need to reach out to Domo Support about it.

Answers

  • jimsteph
    jimsteph Contributor

    Thank you for your response. I will either write a recursive ETL or see if the MySQL engine supports upsert (INSERT INTO with ON DUPLICATE KEY UPDATE), just so I can have a working dataset. I won't be able to use partitioning, however: I'd want to partition on EmployeeID, and there are far more employees than the maximum number of partitions (1500).

    As for the merge keys, I assume it's trying to load the column names, and there are only 54 of them. That shouldn't be a problem, so I'll put in a ticket. Thanks again!
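
For reference, the upsert idea mentioned above can be sketched as follows. This is a stand-in, not a Domo dataflow: it uses Python's built-in `sqlite3` module, where the equivalent clause is spelled `ON CONFLICT ... DO UPDATE` (SQLite 3.24 or later) rather than MySQL's `ON DUPLICATE KEY UPDATE`. The `employees` table, its columns, and the `upsert_employees` helper are hypothetical, and whether the MySQL dataflow engine accepts the MySQL form is still the open question.

```python
import sqlite3


def upsert_employees(conn, rows):
    """Insert each row, or update the title when the employee_id already exists."""
    conn.executemany(
        """
        INSERT INTO employees (employee_id, title)
        VALUES (?, ?)
        ON CONFLICT(employee_id) DO UPDATE SET title = excluded.title
        """,
        rows,
    )
    conn.commit()


# In-memory database with a hypothetical table keyed on employee_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, title TEXT)"
)

# First run: full load. Second run: only new or changed records.
upsert_employees(conn, [(1, "Analyst"), (2, "Manager")])
upsert_employees(conn, [(2, "Director"), (3, "Engineer")])

result = conn.execute(
    "SELECT employee_id, title FROM employees ORDER BY employee_id"
).fetchall()
print(result)
```

After both runs the table holds one row per employee, with employee 2 carrying the updated title, which is the behavior the Merge update method is meant to provide.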