S3 Connector Files in Subdirectories

pstrauss Member
edited February 21 in Connectors

We're working on a pipeline that moves custom JSON log data from CloudWatch to an S3 bucket using Kinesis Data Streams and Firehose (as explained here). This process writes new log exports into a nested directory structure organized by year/month/day/hour, for example:

/2024/02/20/01/file1.json
/2024/02/20/02/file2.json
/2024/02/20/03/file3.json

etc…

Is there a way to make any of the Domo S3 connectors process newly added files within these subdirectories automatically? In my initial testing, it seems that the standard S3 connector won't drill into subdirectories, and will only look in the specific path you've set.

Is our best solution to create a Lambda to copy/move all of these files out of the subdirectories into a processing directory?

P.S. We originally tried streaming from CloudWatch to the JSON Webhook connector via a custom Lambda, as recommended, but it's extremely fragile: it gives no feedback on successes or failures (it just returns a 200 no matter what, even if the data didn't process), so it doesn't seem like a scalable or predictable solution.

Answers

  • MattTheGuru
    MattTheGuru Contributor
    edited February 21

    I have pulled apart some of my own code from a recent project to try to build you something that might solve your automation issue:

    I have included the code below as "example.txt".

    You will likely need to modify it slightly, but if you replace the secret variables I think it might just work as long as you hit the function.

  • pstrauss

    @MattTheGuru appreciate your response. That code might be useful for the AWS side of things, but I'm not sure how to solve the issue on the Domo side: getting the S3 connector to pull files from subdirectories automatically. Do you have any thoughts on that?

  • MattTheGuru
    MattTheGuru Contributor
    Answer ✓

    Yeah, the code above could be used to create your own connector to AWS S3 and pull the files. The issue is that the default one doesn't seem to have the capabilities for your folder setup.

    If you have the Python/JavaScript knowledge, or you have some internal devs, you can show them the connector builder; the above code is most of the bones needed to pull that data from S3 into Domo for you.

    That's likely the only path to take unfortunately.

    Feel free to reach out over email if the above bit of code doesn't cut it/you would just rather someone else do it for ya.
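
For anyone reading along, here is a minimal sketch of the S3 side of a custom connector like that. It is not the attached example.txt, just an illustration under a few assumptions: Python, a boto3 S3 client passed in by the caller, and hypothetical names (`iter_keys`, `unseen_keys`, the bucket `my-log-bucket`). S3 keys are flat strings, so listing by `Prefix` without a `Delimiter` returns objects from every nested "subdirectory" under that prefix.

```python
from typing import Iterable, Iterator, List, Set


def iter_keys(s3_client, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key under `prefix`, however deeply nested."""
    paginator = s3_client.get_paginator("list_objects_v2")
    # No Delimiter is passed, so keys from all nested "folders" come back.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects have no "Contents" entry.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.endswith("/"):  # skip zero-byte folder placeholders
                yield key


def unseen_keys(keys: Iterable[str], already_loaded: Set[str]) -> List[str]:
    """Return the keys that have not been loaded yet, in listing order."""
    return [k for k in keys if k not in already_loaded]


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   todo = unseen_keys(iter_keys(s3, "my-log-bucket", "2024/02/"), loaded)
```

How `already_loaded` is tracked (a state file, a Domo dataset, object tags) is left open here; it depends on how the connector is built.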

  • pstrauss

    Got it. We might just write a Lambda that moves the files out of the S3 subdirectories into a "to_be_processed" directory, and then point the S3 connector at that one directory.
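
A rough sketch of what that Lambda could look like, assuming Python with boto3, an S3 "object created" event notification as the trigger, and source and destination in the same bucket. The helper names (`flat_key`, `handle_event`) and the flattened naming scheme are hypothetical choices, not anything Domo or AWS prescribes.

```python
from urllib.parse import unquote_plus

DEST_PREFIX = "to_be_processed/"  # the single directory the S3 connector reads


def flat_key(key: str, dest_prefix: str = DEST_PREFIX) -> str:
    """Map '2024/02/20/01/file1.json' to 'to_be_processed/2024-02-20-01-file1.json'."""
    # Keeping the date parts in the name avoids collisions between hours.
    return dest_prefix + key.strip("/").replace("/", "-")


def handle_event(event: dict, s3_client) -> list:
    """Copy each object named in an S3 event into DEST_PREFIX; return new keys."""
    copied = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(DEST_PREFIX):
            continue  # ignore events caused by our own copies
        target = flat_key(key)
        s3_client.copy_object(
            Bucket=bucket,
            Key=target,
            CopySource={"Bucket": bucket, "Key": key},
        )
        copied.append(target)
    return copied


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without it

    return {"copied": handle_event(event, boto3.client("s3"))}
```

This only copies; to move the files instead, a `delete_object` call on the source key after a successful copy would be needed. Scoping the event notification to exclude `to_be_processed/` is also worth doing, so the Lambda is not invoked for its own output.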