Which configuration setting should you specify in the ParallelRunConfig object for the ParallelRunStep step?


You register a model that you plan to use in a batch inference pipeline.

The batch inference pipeline must use a ParallelRunStep step to process files in a file dataset. The script that the ParallelRunStep step runs must process six input files each time the inferencing function is called.

You need to configure the pipeline.

Which configuration setting should you specify in the ParallelRunConfig object for the ParallelRunStep step?
A. process_count_per_node="6"
B. node_count="6"
C. mini_batch_size="6"
D. error_threshold="6"

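Answer: C

Explanation:

mini_batch_size controls how much input is passed to the script in a single run() call. For FileDataset input, this field is the number of files the user script can process in one run() call, so setting it to 6 makes the inferencing function receive six files per call. For TabularDataset input, this field is the approximate size of data the user script can process in one run() call. Example values are 1024, 1024KB, 10MB, and 1GB.

Incorrect Answers:

A: process_count_per_node

The number of processes executed on each node. (Optional; the default value is the number of cores on the node.) It affects parallelism, not the number of files passed to each run() call.

B: node_count

The number of nodes in the compute target used for running the ParallelRunStep. It affects how many machines share the work, not the number of files passed to each run() call.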

D: error_threshold

The number of record failures for TabularDataset and file failures for FileDataset that should be ignored during processing. If the error count goes above this value, then the job will be aborted.
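To make the effect of mini_batch_size concrete, here is a minimal sketch in plain Python that mimics how a file list could be handed to run() six files at a time. It does not use the Azure ML SDK. The helper split_into_mini_batches, the file names, and the body of run() are hypothetical stand-ins, and the real service may schedule and order mini-batches differently.

```python
from typing import Iterator, List


def split_into_mini_batches(files: List[str], mini_batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most mini_batch_size file paths (illustrative helper)."""
    for start in range(0, len(files), mini_batch_size):
        yield files[start:start + mini_batch_size]


def run(mini_batch: List[str]) -> List[str]:
    """Stand-in for an entry script's run(): returns one result per input file."""
    return [f"{path}: scored" for path in mini_batch]


# Hypothetical input: 14 files in a file dataset.
files = [f"file_{i:02d}.csv" for i in range(14)]

# With a mini-batch size of 6, each run() call receives up to six files.
batch_sizes = [len(batch) for batch in split_into_mini_batches(files, 6)]
results = [row for batch in split_into_mini_batches(files, 6) for row in run(batch)]
```

In this sketch the 14 files are split into groups of 6, 6, and 2, so the final call receives fewer than six files because the total is not a multiple of the batch size.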

Reference: https://docs.microsoft.com/en-us/python/api/azureml-contrib-pipeline-steps/azureml.contrib.pipeline.steps.parallelrunconfig?view=azure-ml-py