A new diagnostic test for a disease is evaluated on 1152 people: 106 have the disease and 1046 do not.
The test results are summarized below:
In this sample, how many cases are false positives and false negatives?
- A . 33 false positives and 81 false negatives
- B . 81 false positives and 73 false negatives
- C . 73 false positives and 81 false negatives
- D . 81 false positives and 33 false negatives
What is the goal of the backpropagation algorithm?
- A . to randomize the trajectory of the neural network parameters during training
- B . to smooth the gradient of the loss function in order to avoid getting trapped in small local minima
- C . to scale the gradient descent step in proportion to the gradient magnitude
- D . to compute the gradient of the loss function with respect to the neural network parameters
D
Explanation:
Reference: https://www.sciencedirect.com/topics/computer-science/backpropagation
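As a minimal sketch of what computing the gradient of the loss with respect to the parameters looks like, the snippet below applies the chain rule by hand to a one-neuron model. The model, the squared-error loss, and all numeric values are hypothetical and chosen only for illustration.

```python
# Hypothetical one-neuron model: prediction = w * x + b,
# with loss = (prediction - target) ** 2.
def loss(w, b, x, target):
    return (w * x + b - target) ** 2


def backprop(w, b, x, target):
    # Forward pass: keep the intermediate value needed by the backward pass.
    prediction = w * x + b
    # Backward pass: chain rule from the loss back to each parameter.
    dloss_dprediction = 2.0 * (prediction - target)
    grad_w = dloss_dprediction * x    # d(prediction)/dw = x
    grad_b = dloss_dprediction * 1.0  # d(prediction)/db = 1
    return grad_w, grad_b


w, b, x, target = 0.5, -1.0, 3.0, 2.0
grad_w, grad_b = backprop(w, b, x, target)

# Sanity check against a central finite-difference approximation.
eps = 1e-6
numeric_grad_w = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
print(grad_w, grad_b, numeric_grad_w)
```

Backpropagation only produces these gradients; how they are used (step size, momentum, and so on) is the job of the optimizer.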
With the help of AI algorithms, which type of analytics can help organizations make decisions based on facts and probability-weighted projections?
- A . prescriptive analytics
- B . cognitive analytics
- C . predictive analytics
- D . descriptive analytics
A
Explanation:
Reference: https://www.investopedia.com/terms/p/prescriptive-analytics.asp
What is the technique called for vectorizing text data which matches the words in different sentences to determine if the sentences are similar?
- A . Cup of Vectors
- B . Box of Lexicon
- C . Sack of Sentences
- D . Bag of Words
D
Explanation:
Reference: https://medium.com/@adriensieg/text-similarities-da019229c894
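A minimal sketch of the idea, assuming whitespace tokenization and cosine similarity as the comparison measure (both are illustrative choices, and the sentences are made up):

```python
import math
from collections import Counter


def bag_of_words(sentence):
    # Word order is discarded; only per-word counts are kept.
    return Counter(sentence.lower().split())


def cosine_similarity(bag_a, bag_b):
    shared = set(bag_a) & set(bag_b)
    dot = sum(bag_a[word] * bag_b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in bag_a.values()))
    norm_b = math.sqrt(sum(count * count for count in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


a = bag_of_words("the cat sat on the mat")
b = bag_of_words("the cat lay on the rug")
c = bag_of_words("stock prices fell sharply")

similar = cosine_similarity(a, b)    # many shared words
unrelated = cosine_similarity(a, c)  # no shared words
print(similar, unrelated)
```

Sentences that share many words score close to 1, while sentences with no words in common score 0.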
Which statement is true in the context of evaluating metrics for machine learning algorithms?
- A . A random classifier has an AUC (area under the ROC curve) of 0.5
- B . Using only one evaluation metric is sufficient
- C . The F-score is always equal to precision
- D . Recall of 1 (100%) is always a good result
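The claim about a random classifier can be checked empirically. The sketch below computes AUC from its pairwise definition (the probability that a randomly chosen positive is scored above a randomly chosen negative); the labels, scores, and seed are synthetic assumptions.

```python
import random


def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


rng = random.Random(0)  # fixed seed so the run is repeatable
labels = [1] * 500 + [0] * 500

random_scores = [rng.random() for _ in labels]  # scores unrelated to the labels
perfect_scores = [float(y) for y in labels]     # scores that separate the classes

random_auc = auc(labels, random_scores)
perfect_auc = auc(labels, perfect_scores)
print(random_auc, perfect_auc)
```

Scores that carry no information about the label land near 0.5, while a perfectly separating scorer reaches 1.0.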
When should the median be used instead of the mean for imputing missing data?
- A . for skewed data
- B . for real numbers
- C . for normally distributed data
- D . for large data sets
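A small illustration of why skew matters, using a made-up income column (in thousands) with one extreme value and two missing entries:

```python
import statistics

# Hypothetical right-skewed data; None marks a missing entry.
incomes = [32, 35, None, 38, 40, 41, None, 45, 900]

observed = [v for v in incomes if v is not None]
mean_value = statistics.mean(observed)
median_value = statistics.median(observed)


def impute(values, fill):
    # Replace every missing entry with the chosen fill value.
    return [fill if v is None else v for v in values]


imputed_with_median = impute(incomes, median_value)
print(mean_value, median_value, imputed_with_median)
```

The single outlier drags the mean far above the typical value, so mean imputation would insert unrepresentative numbers; the median stays at a typical value.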
Given the following matrix multiplication:
What is the value of P?
- A . -9
- B . 17
- C . 12
- D . -7
C
Explanation:
Reference: https://www.mathsisfun.com/algebra/matrix-multiplying.html
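The matrices from the question are not reproduced here, so the sketch below uses hypothetical 2x2 matrices purely to show the row-by-column rule: each entry of the product is the dot product of a row of the left matrix with a column of the right matrix.

```python
def matmul(a, b):
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of the left matrix must match rows of the right matrix")
    cols = len(b[0])
    # Entry (i, j) is the dot product of row i of a with column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(a))
    ]


# Hypothetical matrices, not the ones from the question.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
product = matmul(A, B)
print(product)
```

For example, the top-left entry is 1*5 + 2*7 = 19.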
A neural network is composed of a first affine transformation (affine1) followed by a ReLU non-linearity, followed by a second affine transformation (affine2).
Which two explicit functions are implemented by this neural network? (Choose two.)
- A . y = affine1(ReLU(affine2(x)))
- B . y = max(affine1(x), affine2(x))
- C . y = affine2(ReLU(affine1(x)))
- D . y = affine2(max(affine1(x), 0))
- E . y = ReLU(affine1(x), affine2(x))
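Since ReLU(z) is defined as max(z, 0), the composition can be written in two equivalent ways. The sketch below uses scalar inputs and hypothetical weights to keep it short:

```python
# Hypothetical scalar affine layers; a real network would use matrices and vectors.
def affine1(x):
    return 2.0 * x - 1.0


def affine2(h):
    return -3.0 * h + 0.5


def relu(z):
    return max(z, 0.0)


def network(x):
    # Layers are applied in order: affine1, then ReLU, then affine2.
    return affine2(relu(affine1(x)))


def network_with_max(x):
    # Same function, with ReLU written out as max(., 0).
    return affine2(max(affine1(x), 0.0))


for x in (-2.0, 0.0, 2.0):
    print(x, network(x), network_with_max(x))
```

Note that the innermost function is the one applied first, which is why affine1 appears inside and affine2 outside.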
The formula for recall is given by (True Positives) / (True Positives + False Negatives).
What is the recall for this example?
- A . 0.2
- B . 0.25
- C . 0.5
- D . 0.33
B
Explanation:
Reference: https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-classification/
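The formula translates directly into code. The counts below are hypothetical, since the example's data is not reproduced here:

```python
def recall(true_positives, false_negatives):
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positives / actual_positives


# Hypothetical counts: 30 positives found, 10 positives missed.
example_recall = recall(true_positives=30, false_negatives=10)
print(example_recall)
```

Recall looks only at the actual positives, so false positives do not affect it.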
After importing a Jupyter notebook and CSV data file into IBM Watson Studio in the IBM Public Cloud project, it is discovered that the notebook code can no longer access the CSV file.
What is the most likely reason for this problem?
- A . CSV files cannot be used as data sources in Watson Studio.
- B . The CSV file was converted to a binary blob and must be converted in the notebook code.
- C . The CSV file is stored in a Cloud Object Storage.
- D . The CSV file is stored in a Watson Machine Learning instance and is only accessible via REST API.
C
Explanation:
Reference: https://github.com/IBM/watson-stock-market-predictor/blob/master/README.md
Determine the number of bigrams and trigrams in the sentence.
"Data is the new oil".
- A . 3 bigrams, 3 trigrams
- B . 4 bigrams, 4 trigrams
- C . 3 bigrams, 4 trigrams
- D . 4 bigrams, 3 trigrams
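A sentence with k tokens has k - n + 1 n-grams. A short sketch that counts them for the sentence above, assuming whitespace tokenization:

```python
def ngrams(tokens, n):
    # Slide a window of length n across the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "Data is the new oil".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(len(bigrams), bigrams)
print(len(trigrams), trigrams)
```

With five tokens this gives 5 - 2 + 1 = 4 bigrams and 5 - 3 + 1 = 3 trigrams.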
Which is a preferred approach for simplifying the data transformation steps in machine learning model management and maintenance?
- A . Implement data transformation, feature extraction, feature engineering, and imputation algorithms in one single pipeline.
- B . Do not apply any data transformation or feature extraction or feature engineering steps.
- C . Leverage only deep learning algorithms.
- D . Apply a limited number of data transformation steps from a pre-defined catalog of possible operations independent of the machine learning use case.
Which is a technique that automates the handling of categorical variables?
- A . binary encoding
- B . decoding
- C . autoencoding
- D . one-hot encoding
D
Explanation:
Reference: https://hub.packtpub.com/how-to-handle-categorical-data-for-machine-learning-algorithms/
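A minimal hand-rolled sketch of one-hot encoding; the helper name and the color values are hypothetical, and in practice a library encoder would normally be used instead:

```python
def one_hot_encode(values):
    # One column per distinct category, in sorted order.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one "hot" position per row
        encoded.append(row)
    return categories, encoded


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)
print(categories)
print(encoded)
```

Each row contains a single 1, so no artificial ordering is imposed on the categories.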
Which two statements are correct about deploying machine learning models? (Choose two.)
- A . It allows integration within business applications.
- B . It makes it possible to create reports for management dynamically using specific parameters from executives.
- C . It is critical for achieving high accuracy in training.
- D . It is a necessary step in training and evaluating the performance of the models.
- E . It is only possible on the cloud because they require a large amount of compute resources.
Which of the following entity extraction techniques would be best for the extraction of telephone numbers from a text document?
- A . complex pattern-based
- B . regex
- C . statistical
- D . dictionary
B
Explanation:
Reference: https://www.researchgate.net/publication/318093829_Developing_an_innovative_entity_extraction_method_for_unstructured_data
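Telephone numbers follow a fixed surface pattern, which is what makes a regular expression a natural fit. The sketch below assumes a ten-digit, North American style format; the pattern and the sample text are illustrative and would need adapting to other formats.

```python
import re

# Assumed format: optional parentheses around a 3-digit area code, then
# 3 digits and 4 digits, optionally separated by a space, dot, or hyphen.
PHONE_PATTERN = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")

text = "Call (555) 123-4567 or 555-987-6543 before 5 pm on 2024-01-15."
phone_numbers = PHONE_PATTERN.findall(text)
print(phone_numbers)
```

The date and the time in the sample text are not matched because they do not fit the digit grouping.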
Which statement is true about UTF-8?
- A . It is an encoding for Latin script.
- B . It is rarely used today.
- C . It is an encoding for Unicode characters.
- D . It is equal to ASCII.
C
Explanation:
Reference: https://www.w3.org/International/questions/qa-what-is-encoding
Which test is applied to determine the relationship between two categorical variables?
- A . paired t-test
- B . chi-squared test
- C . z test
- D . t-test
B
Explanation:
Reference: https://www.pluralsight.com/guides/testing-for-relationships-between-categorical-variables-using-the-chi-square-test
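As a sketch of how the test statistic is formed, the snippet below compares observed counts in a contingency table with the counts expected under independence. The 2x2 table is hypothetical, and only the statistic and degrees of freedom are computed (no p-value).

```python
def chi_squared_statistic(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows are one categorical variable, columns the other.
observed_counts = [[30, 10], [20, 40]]
statistic, degrees_of_freedom = chi_squared_statistic(observed_counts)
print(statistic, degrees_of_freedom)
```

With one degree of freedom, the critical value at the 5% level is about 3.84, so a statistic well above that suggests the two variables are related.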
With only limited labeled data available, how might a neural network use case be realized?
- A . by assigning random labels
- B . by increasing the depth of the neural network
- C . by creating random data
- D . by using a customized pre-trained model
What is the first step in creating a custom model in Watson Visual Recognition service?
- A . Test the newly trained model.
- B . Document the errors from the built-in models.
- C . Obtain image files containing objects to be classified and organize them into classes.
- D . Use IBM SPSS to create new machine learning classifiers.
What is used to scale large positive values during data cleaning?
- A . division by random numbers
- B . square
- C . logarithm
- D . subtract median
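A short sketch of how a logarithm compresses large positive values; the sample values are made up, and the choice of base 10 is only for readability:

```python
import math

# Hypothetical values spanning six orders of magnitude.
values = [1, 10, 100, 1000, 1_000_000]
log_scaled = [math.log10(v) for v in values]
print(log_scaled)

# log1p(v) = log(1 + v) is a common variant when zeros can occur,
# because log(0) is undefined.
with_zero = [0, 9, 99]
log1p_scaled = [math.log1p(v) for v in with_zero]
print(log1p_scaled)
```

A range of one to one million shrinks to roughly zero to six, which keeps a few huge values from dominating the feature.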