A Machine Learning Specialist is building a model to predict future employment rates based on a wide range of economic factors. While exploring the data, the Specialist notices that the magnitudes of the input features vary greatly. The Specialist does not want variables with a larger magnitude to dominate the model.
What should the Specialist do to prepare the data for model training?
A. Apply quantile binning to group the data into categorical bins, keeping any relationships in the data by replacing the magnitude with the distribution.
B. Apply the Cartesian product transformation to create new combinations of fields that are independent of the magnitude.
C. Apply normalization to ensure each field will have a mean of 0 and a variance of 1 to remove any significant magnitude.
D. Apply the orthogonal sparse bigram (OSB) transformation to apply a fixed-size sliding window to generate new features of a similar magnitude.
Answer: C
Explanation:
Normalization is a data preprocessing technique that scales the input features to a common range, such as [-1, 1] or [0, 1]. Scaling the features improves the convergence of gradient-based algorithms and prevents variables with a larger magnitude from dominating the model. One common form of scaling is standardization, which transforms each feature to have a mean of 0 and a variance of 1 by subtracting the feature's mean and dividing by its standard deviation. Unlike min-max scaling, standardization does not bound values to a fixed range, which makes it less sensitive to outliers. Standardization is useful for models that assume the input features are normally distributed, such as linear regression, logistic regression, and support vector machines.
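For illustration, here is a minimal sketch of standardization using scikit-learn's StandardScaler; the economic indicator columns and values below are hypothetical, chosen only to show features with very different magnitudes:

# A minimal sketch, assuming scikit-learn is available. StandardScaler
# rescales each column to mean 0 and variance 1. All data is hypothetical.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Columns: GDP (billions), unemployment rate (%), interest rate (%)
X = np.array([
    [21_000.0, 3.5, 1.75],
    [22_500.0, 4.1, 2.00],
    [20_800.0, 6.2, 0.25],
    [23_100.0, 3.9, 2.25],
])

scaler = StandardScaler()            # subtracts each column's mean, divides by its std
X_scaled = scaler.fit_transform(X)

print(X_scaled.mean(axis=0))         # approximately [0, 0, 0]
print(X_scaled.std(axis=0))          # approximately [1, 1, 1]

After this transformation, no single feature (such as GDP, which is orders of magnitude larger than the rates) can dominate the model purely because of its scale.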