Which of the approaches below improve performance through linear scaling of a data ingestion workload?

Exam: ARA-C01

A. Split large files into the recommended range of 10 MB to 100 MB
B. Organize data by granular path
C. Resize virtual warehouse
D. All of the above

Answer: D

Explanation:

When loading your staged data, narrow the path to the most granular level that includes your data for improved data load performance.

Use any of the following options to further confine the list of files to load:

If the file names match except for a suffix or extension, include the matching part of the file names in the path, e.g.:

copy into t1 from @%t1/united_states/california/los_angeles/2016/06/01/11/mydata;

Add the FILES or PATTERN options (see Options for Selecting Staged Data Files), e.g.:

copy into t1 from @%t1/united_states/california/los_angeles/2016/06/01/11/ files=('mydata1.csv', 'mydata1.csv');

copy into t1 from @%t1/united_states/california/los_angeles/2016/06/01/11/ pattern='.*mydata[^[0-9]{1,3}$$].csv';

https://docs.snowflake.com/en/user-guide/data-load-considerations-stage.html#organizing-data-by-path

Now, also understand why splitting large files helps:

Each node in a virtual warehouse has 8 cores. If you split your files, the load can be parallelized, because each file is handled by a separate core.
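
To make the splitting step concrete, here is a minimal Python sketch, not an official Snowflake tool. The names split_by_lines and load_waves are hypothetical. It assumes newline-delimited rows with no header line and no quoted newlines inside fields. The default of 8 cores per node is taken from the explanation above and is only a rough model of how a warehouse schedules loads.

```python
import math
import os
import tempfile


def split_by_lines(src_path, out_dir, max_bytes):
    """Split src_path into numbered chunk files, cutting only at line boundaries.

    Each chunk stays at or below max_bytes unless a single line is longer
    than that. Returns the list of chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(src_path)
    paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new chunk when this line would push the current one over the limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"{base}.part{len(paths):04d}")
                out = open(path, "wb")
                paths.append(path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return paths


def load_waves(num_files, nodes, cores_per_node=8):
    """Rounds of loading needed if every core handles one file at a time.

    cores_per_node=8 is the assumption stated in the text above.
    """
    return math.ceil(num_files / (nodes * cores_per_node))
```

For example, calling split_by_lines("mydata.csv", "chunks", 100 * 1024 * 1024) would produce chunks of at most about 100 MB each, and load_waves(16, nodes=1) returns 2, meaning 16 files on a single 8-core node need two rounds. A single unsplit file would occupy one core and leave the rest idle, which is why splitting matters before resizing the warehouse can help.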