What should a consultant consider when ingesting this data stream?

Every day, Northern Trail Outfitters (NTO) uploads a summary of the last 24 hours of store transactions to a new file in an Amazon S3 bucket, and files older than 7 days are automatically deleted. Each file contains a timestamp in a standardized naming convention.

What should a consultant consider when ingesting this data stream?
A. Ensure the refresh mode is set to "Upsert" and "Refresh only new files" is selected
B. Ensure the refresh mode is set to "Full Refresh" and the filename contains a wildcard to accommodate the timestamp
C. Ensure the refresh mode is set to "Full Refresh" and "Refresh only new files" is selected
D. Advise NTO to change their processes: this configuration is not supported

Answer: B

Explanation:

Each daily file is a new, self-contained summary of the previous 24 hours, so "Full Refresh" mode helps ensure that the ingestion process captures all the relevant data from the new file.

Including a wildcard in the filename lets the system match the timestamp portion of the standardized naming convention, which changes every day, so the correct file is processed each day.
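To make the wildcard idea concrete, here is a minimal Python sketch of how a wildcard pattern can match filenames that differ only by timestamp. The pattern `nto_transactions_*.csv` and the sample filenames are hypothetical, since the scenario does not state NTO's actual naming convention, and the code only illustrates wildcard matching in general, not how the connector implements it.

```python
from fnmatch import fnmatch

# Hypothetical naming convention; the scenario does not give the real one.
PATTERN = "nto_transactions_*.csv"


def matching_files(keys, pattern=PATTERN):
    """Return the object keys whose names match the wildcard pattern."""
    return [key for key in keys if fnmatch(key, pattern)]


# Hypothetical bucket listing: two daily summaries and one unrelated file.
bucket_keys = [
    "nto_transactions_2024-05-01.csv",
    "nto_transactions_2024-05-02.csv",
    "readme.txt",
]

# The "*" absorbs the changing timestamp, so both daily files are selected
# and the unrelated file is ignored.
selected = matching_files(bucket_keys)
print(selected)
```

The `*` stands in for whatever timestamp appears in a given day's filename, which is why a fixed, literal filename would stop matching after the first day.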

Option A is a weaker fit because "Upsert" assumes that you are updating existing records and inserting new ones, whereas in this scenario the data is summarized and placed in new files. Option C might not handle the timestamped filenames appropriately, and option D is not suitable because nothing in the scenario indicates that this configuration is unsupported.

Option B is therefore the best choice for ingesting this data stream, given the daily uploads, the timestamped file naming convention, and the 7-day deletion policy.
