During mini-batch training of a neural network for a classification problem, a Data Scientist notices that training accuracy oscillates.
What is the MOST likely cause of this issue?
A. The class distribution in the dataset is imbalanced
B. Dataset shuffling is disabled
C. The batch size is too big
D. The learning rate is very high
Answer: D
Explanation:
Mini-batch gradient descent is a variant of gradient descent that updates the model parameters at each iteration using a gradient computed on a subset of the training data (a mini-batch). The learning rate is the hyperparameter that controls how much the parameters change in response to that gradient. If the learning rate is very high, each update can overshoot the optimal values, so the parameters bounce back and forth around the minimum of the cost function instead of settling into it. This causes the training accuracy to fluctuate and prevents the model from converging to a stable solution. To avoid this, the learning rate should be chosen carefully, for example by using a learning rate decay schedule or an adaptive learning rate algorithm. Increasing the batch size can also reduce the variance of the gradient estimates, although an excessively large batch slows down training and can reduce the model's generalization ability.
Dataset shuffling and class distribution are unlikely to cause oscillations in training accuracy, because neither affects the gradient updates directly. Shuffling helps mini-batch gradient descent avoid getting stuck in local minima and improves convergence speed. An imbalanced class distribution affects the performance and fairness of the model, but it does not by itself make the training accuracy fluctuate.
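For illustration only, the minimal sketch below (not part of the exam material) runs plain gradient descent on the toy function f(w) = w^2 with a small and a very high learning rate; the function, step counts, and learning-rate values are assumptions chosen to make the overshooting behavior visible, not a statement about any specific model.

```python
# Sketch: how the learning rate drives convergence vs. oscillation.
# f(w) = w^2 has its minimum at w = 0, and its gradient is 2*w.

def gradient_descent(lr, steps=10, w=5.0):
    """Return the trajectory of w under plain gradient descent on f(w) = w^2."""
    trajectory = [w]
    for _ in range(steps):
        grad = 2 * w       # gradient of f(w) = w^2
        w = w - lr * grad  # parameter update
        trajectory.append(w)
    return trajectory

# Small learning rate: w shrinks smoothly toward the minimum at 0.
print("lr=0.10:", [round(v, 3) for v in gradient_descent(lr=0.10)])

# Very high learning rate: each step overshoots the minimum, so w flips sign
# and oscillates around 0 instead of converging steadily.
print("lr=0.95:", [round(v, 3) for v in gradient_descent(lr=0.95)])
```

The same overshooting effect in a real network shows up as loss and accuracy that jump around from epoch to epoch, which is why lowering the learning rate or applying a decay schedule is the usual first fix.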