Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?
A . Delete entire rows that contain any missing features.
B . Fill in missing features with random values for that feature in the training set.
C . Fill in missing features with the average of observed values for that feature in the entire dataset.
D . Delete entire columns that contain any missing features.
Answer: C
Explanation:
Missing values are a common problem in data analysis and machine learning, as they can affect the quality and reliability of the data and the model. There are various methods to deal with missing values, such as deleting, imputing, or ignoring them. One of the most common methods is imputing, which means replacing the missing values with some estimated values based on some criteria. For continuous variables, one of the simplest and most widely used imputation methods is to fill in the missing values with the mean (average) of the observed values for that variable in the entire dataset. This method can preserve the overall distribution and variance of the data, as well as avoid introducing bias or noise.
Latest AIP-210 Dumps Valid Version with 90 Q&As
Latest And Valid Q&A | Instant Download | Once Fail, Full Refund