S3 Connector Files in Subdirectories

pstrauss
pstrauss Member
edited February 2024 in Connectors

We're working on a pipeline that moves custom JSON log data from CloudWatch to an S3 bucket using Kinesis Data Streams and Firehose (as explained here). This process copies new log exports into a nested directory structure that represents the year/month/day/hour, for example:

/2024/02/20/01/file1.json
/2024/02/20/02/file2.json
/2024/02/20/03/file3.json

etc…
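
For anyone scripting against this layout, here is a minimal Python sketch of how the hour-bucket prefixes could be computed for a time window. It assumes the folders are written in UTC (the Firehose default, unless a custom prefix is configured) and that the actual S3 keys have no leading slash. `hour_prefix` and `recent_prefixes` are hypothetical helper names, not part of any AWS or Domo API.

```python
from datetime import datetime, timedelta, timezone


def hour_prefix(ts: datetime) -> str:
    """Return the year/month/day/hour prefix for a timezone-aware timestamp."""
    # Assumption: the directory names are in UTC.
    return ts.astimezone(timezone.utc).strftime("%Y/%m/%d/%H/")


def recent_prefixes(now: datetime, hours: int) -> list[str]:
    """Prefixes for the last `hours` hour-buckets, oldest first, ending at `now`."""
    return [hour_prefix(now - timedelta(hours=h)) for h in range(hours - 1, -1, -1)]
```

Calling `recent_prefixes(datetime.now(timezone.utc), 3)` would give the three most recent hour folders, which a scheduled job could then scan one at a time.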

Is there a way to make any of the Domo S3 connectors process newly added files within these subdirectories automatically? In my initial testing, it seems that the standard S3 connector won't drill into subdirectories, and will only look in the specific path you've set.

Is our best solution to create a Lambda to copy/move all of these files out of the subdirectories into a processing directory?

P.S. We originally tried streaming from CloudWatch to the JSON Webhook connector via a custom Lambda, as recommended, but it's extremely fragile: it gives no feedback on successes or failures (it just returns a 200 no matter what, even if the data didn't process), so it doesn't seem like a scalable or predictable solution.

Answers

  • MattTheGuru
    MattTheGuru Contributor
    edited February 2024

    I've stripped down some of my own code from a recent project to try to build you something that might solve your automation issues:

    I have included the code below as "example.txt".

    You will likely need to modify it slightly, but if you replace the secret variables I think it should work once you invoke the function.

  • @MattTheGuru appreciate your response. That code might be useful for the AWS side of things, but I'm not sure how to solve the issue on the Domo side: getting the S3 connector to pull files from subdirectories automatically. Do you have any thoughts on that?

  • MattTheGuru
    MattTheGuru Contributor
    Answer ✓

    Yeah, the code above could be used to create your own connector to AWS S3 and pull the files. The issue is that the default one doesn't seem to have the capabilities for your folder setup.

    If you have the Python/JavaScript knowledge, or you have some internal devs, you can show them the connector builder; the code above is most of the bones needed to pull that data from S3 into Domo.

    That's likely the only path to take, unfortunately.

    Feel free to reach out over email if the above bit of code doesn't cut it/you would just rather someone else do it for ya.
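
Since the attached example isn't reproduced in this thread, here is a rough Python sketch of the part a custom connector or script would need: listing every object under a root prefix, including the nested year/month/day/hour folders, and keeping only the files added since the last run. It assumes boto3 and `.json` file names. `list_new_keys`, the bucket name, and the checkpoint handling are hypothetical, and it is not the code from example.txt.

```python
from datetime import datetime, timezone


def list_new_keys(s3, bucket, root_prefix, since):
    """Yield .json keys under root_prefix (at any depth) modified after `since`.

    `s3` is a boto3 S3 client; `since` must be a timezone-aware datetime,
    because boto3 returns LastModified as a timezone-aware value.
    """
    paginator = s3.get_paginator("list_objects_v2")
    # No Delimiter is passed, so S3 returns every key that starts with the
    # prefix; the "subdirectories" are just part of the key name.
    for page in paginator.paginate(Bucket=bucket, Prefix=root_prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".json") and obj["LastModified"] > since:
                yield obj["Key"]


# Example use (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   last_run = datetime(2024, 2, 20, tzinfo=timezone.utc)
#   for key in list_new_keys(boto3.client("s3"), "my-log-bucket", "2024/", last_run):
#       print(key)
```

The caller would still have to store the timestamp of the last successful run somewhere and push the file contents to Domo; neither is shown here.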

  • Got it. We might just write a Lambda that moves the files out of the S3 subdirectories into a "to_be_processed" directory, and then point the S3 connector at that one directory.
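
A minimal sketch of what such a Lambda might look like, assuming it is triggered by S3 "object created" event notifications on the same bucket and uses boto3. The flattened naming scheme, the `flattened_key` helper, and the optional `s3` parameter (there only to make the function testable) are illustrative choices, not anything prescribed by Domo or AWS.

```python
from urllib.parse import unquote_plus

DEST_PREFIX = "to_be_processed/"  # the single directory the S3 connector points at


def flattened_key(key, dest_prefix=DEST_PREFIX):
    """Map 2024/02/20/01/file1.json to to_be_processed/2024-02-20-01_file1.json."""
    *dirs, name = key.strip("/").split("/")
    # Keep the date parts in the file name so files from different hours
    # with the same name do not overwrite each other.
    return f"{dest_prefix}{'-'.join(dirs)}_{name}" if dirs else f"{dest_prefix}{name}"


def handler(event, context, s3=None):
    if s3 is None:
        import boto3  # imported lazily so the helper above can be tested without it
        s3 = boto3.client("s3")
    moved = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(DEST_PREFIX):
            continue  # ignore our own copies so the function does not re-trigger itself
        dest = flattened_key(key)
        s3.copy_object(Bucket=bucket, Key=dest, CopySource={"Bucket": bucket, "Key": key})
        # Drop this call to copy instead of move and keep the original files.
        s3.delete_object(Bucket=bucket, Key=key)
        moved.append(dest)
    return {"moved": moved}
```

Restricting the event notification itself to exclude the `to_be_processed/` prefix, where the bucket layout allows it, would be a second safeguard against the function triggering on its own output.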