Exam4Training

IBM C1000-059 IBM AI Enterprise Workflow V1 Data Science Specialist Online Training

Question #1

A new test to diagnose a disease is evaluated on 1152 people, of whom 106 have the disease and 1046 do not.

The test results are summarized below:

In this sample, how many cases are false positives and false negatives?

  • A . 33 false positives and 81 false negatives
  • B . 81 false positives and 73 false negatives
  • C . 73 false positives and 81 false negatives
  • D . 81 false positives and 33 false negatives

Correct Answer: A

Question #2

What is the goal of the backpropagation algorithm?

  • A . to randomize the trajectory of the neural network parameters during training
  • B . to smooth the gradient of the loss function in order to avoid getting trapped in small local minima
  • C . to scale the gradient descent step in proportion to the gradient magnitude
  • D . to compute the gradient of the loss function with respect to the neural network parameters

Correct Answer: D

Explanation:

Reference: https://www.sciencedirect.com/topics/computer-science/backpropagation
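
As a rough illustration of what backpropagation computes, the sketch below applies the chain rule by hand to a tiny two-weight network and compares the result with a finite-difference estimate. The network shape, the squared-error loss, and the names `forward`, `backward`, and `numeric_gradient` are assumptions made for this example only.

```python
def forward(w1, w2, x, target):
    """Tiny network: y = w2 * ReLU(w1 * x), with squared-error loss."""
    h = w1 * x
    a = max(h, 0.0)
    y = w2 * a
    loss = (y - target) ** 2
    return h, a, y, loss


def backward(w1, w2, x, target):
    """Gradient of the loss with respect to (w1, w2) via the chain rule."""
    h, a, y, _ = forward(w1, w2, x, target)
    dloss_dy = 2.0 * (y - target)
    dloss_dw2 = dloss_dy * a
    dloss_da = dloss_dy * w2
    dloss_dh = dloss_da * (1.0 if h > 0 else 0.0)  # derivative of ReLU
    dloss_dw1 = dloss_dh * x
    return dloss_dw1, dloss_dw2


def numeric_gradient(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to sanity-check backward()."""
    def loss(a, b):
        return forward(a, b, x, target)[3]
    g1 = (loss(w1 + eps, w2) - loss(w1 - eps, w2)) / (2 * eps)
    g2 = (loss(w1, w2 + eps) - loss(w1, w2 - eps)) / (2 * eps)
    return g1, g2


analytic = backward(0.5, -1.5, 2.0, 1.0)
numeric = numeric_gradient(0.5, -1.5, 2.0, 1.0)
print(analytic, numeric)
```

The gradient is the only thing computed here; how the weights are then updated (step size, momentum, and so on) is the optimizer's job, not backpropagation's.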

Question #3

With the help of AI algorithms, which type of analytics can help organizations make decisions based on facts and probability-weighted projections?

  • A . prescriptive analytics
  • B . cognitive analytics
  • C . predictive analytics
  • D . descriptive analytics

Correct Answer: A

Explanation:

Reference: https://www.investopedia.com/terms/p/prescriptive-analytics.asp

Question #4

What is the name of the technique for vectorizing text data that matches the words in different sentences to determine whether the sentences are similar?

  • A . Cup of Vectors
  • B . Box of Lexicon
  • C . Sack of Sentences
  • D . Bag of Words

Correct Answer: D

Explanation:

Reference: https://medium.com/@adriensieg/text-similarities-da019229c894
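
A minimal sketch of the bag-of-words idea, assuming lowercase whitespace tokenization and cosine similarity as the comparison measure (both are choices made for this example; the helper names are hypothetical):

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Map a sentence to word counts, ignoring word order."""
    return Counter(sentence.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[word] * b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


s1 = bag_of_words("the cat sat on the mat")
s2 = bag_of_words("the cat lay on the mat")
s3 = bag_of_words("stock prices fell sharply")
print(cosine_similarity(s1, s2), cosine_similarity(s1, s3))
```

Sentences that share most of their words score close to 1, while sentences with no words in common score 0.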

Question #5

Which statement is true in the context of evaluating metrics for machine learning algorithms?

  • A . A random classifier has AUC (the area under ROC curve) of 0.5
  • B . Using only one evaluation metric is sufficient
  • C . The F-score is always equal to precision
  • D . Recall of 1 (100%) is always a good result

Correct Answer: A
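
The claim in option A about a random classifier can be checked empirically. The sketch below computes AUC as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as half; the synthetic labels, the fixed seed, and the helper name `auc` are assumptions for illustration.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(1000)]

# Scores drawn independently of the labels: a random classifier.
random_auc = auc([rng.random() for _ in labels], labels)
# Scores equal to the labels: a perfect classifier.
perfect_auc = auc([float(y) for y in labels], labels)
print(random_auc, perfect_auc)
```

On a finite sample the random classifier lands near 0.5 rather than exactly on it; a classifier that gives every example the same score gets exactly 0.5 under this tie rule.
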
Question #6

When should median value be used instead of mean value for imputing missing data?

  • A . for skewed data
  • B . for real numbers
  • C . for normally distributed data
  • D . for large data sets

Correct Answer: A
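
To see why skew matters for imputation, the sketch below fills a missing value with the mean and with the median of a right-skewed sample. The numbers are invented purely for illustration.

```python
from statistics import mean, median

# Hypothetical right-skewed sample with one missing entry (None).
incomes = [30, 32, 35, 38, 40, None, 41, 45, 1000]
observed = [value for value in incomes if value is not None]

mean_fill = mean(observed)      # pulled upward by the single outlier
median_fill = median(observed)  # stays near the bulk of the data

imputed_with_median = [median_fill if v is None else v for v in incomes]
print(mean_fill, median_fill)
```

Here the mean (157.625) is larger than every observed value except the outlier, while the median (39) sits among the typical values.
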
Question #7

Given the following matrix multiplication:

What is the value of P?

  • A . C9
  • B . 17
  • C . 12
  • D . C7

Correct Answer: C

Explanation:

Reference: https://www.mathsisfun.com/algebra/matrix-multiplying.html

Question #8

A neural network is composed of a first affine transformation (affine1) followed by a ReLU non-linearity, followed by a second affine transformation (affine2).

Which two explicit functions are implemented by this neural network? (Choose two.)

  • A . y = affine1(ReLU(affine2(x)))
  • B . y = max(affine1(x), affine2(x))
  • C . y = affine2(ReLU(affine1(x)))
  • D . y = affine2(max(affine1(x), 0))
  • E . y = ReLU(affine1(x), affine2(x))

Correct Answer: CD
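
The composition described in the question can be written out directly. In the sketch below the weights and biases are arbitrary values chosen for illustration, and `affine` is a hypothetical helper; the point is only that applying ReLU and taking an elementwise `max(..., 0)` give the same result.

```python
def affine(weights, bias):
    """Return a function computing W x + b for a list-of-rows matrix W."""
    def apply(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def relu(vector):
    """Elementwise max(value, 0)."""
    return [max(value, 0.0) for value in vector]


# Arbitrary example parameters: 2 inputs -> 2 hidden units -> 1 output.
affine1 = affine([[1.0, -1.0], [0.5, 2.0]], [0.0, -1.0])
affine2 = affine([[1.0, 1.0]], [0.5])


def network(x):
    return affine2(relu(affine1(x)))


def network_with_max(x):
    return affine2([max(value, 0.0) for value in affine1(x)])


print(network([1.0, 2.0]), network_with_max([1.0, 2.0]))
```
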
Question #9

The formula for recall is given by (True Positives) / (True Positives + False Negatives).

What is the recall for this example?

  • A . 0.2
  • B . 0.25
  • C . 0.5
  • D . 0.33

Correct Answer: B

Explanation:

Reference: https://machinelearningmastery.com/precision-recall-and-f-measure-for-imbalanced-classification/

Question #10

After importing a Jupyter notebook and CSV data file into IBM Watson Studio in the IBM Public Cloud project, it is discovered that the notebook code can no longer access the CSV file.

What is the most likely reason for this problem?

  • A . CSV files cannot be used as data sources in Watson Studio.
  • B . The CSV file was converted to a binary blob and must be converted in the notebook code.
  • C . The CSV file is stored in a Cloud Object Storage.
  • D . The CSV file is stored in a Watson Machine Learning instance and is only accessible via REST API.

Correct Answer: C

Explanation:

Reference: https://github.com/IBM/watson-stock-market-predictor/blob/master/README.md

Question #11

Determine the number of bigrams and trigrams in the sentence.

"Data is the new oil".

  • A . 3 bigrams, 3 trigrams
  • B . 4 bigrams, 4 trigrams
  • C . 3 bigrams, 4 trigrams
  • D . 4 bigrams, 3 trigrams

Correct Answer: D
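
The counts can be verified by enumerating word-level n-grams. This sketch assumes whitespace tokenization, so a sentence of k words has k - n + 1 n-grams; the helper name `ngrams` is hypothetical.

```python
def ngrams(sentence, n):
    """Return all runs of n consecutive words as tuples."""
    tokens = sentence.split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "Data is the new oil"
bigrams = ngrams(sentence, 2)
trigrams = ngrams(sentence, 3)
print(len(bigrams), bigrams)
print(len(trigrams), trigrams)
```

With five words this gives 4 bigrams and 3 trigrams.
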
Question #12

Which is a preferred approach for simplifying the data transformation steps in machine learning model management and maintenance?

  • A . Implement data transformation, feature extraction, feature engineering, and imputation algorithms in one single pipeline.
  • B . Do not apply any data transformation or feature extraction or feature engineering steps.
  • C . Leverage only deep learning algorithms.
  • D . Apply a limited number of data transformation steps from a pre-defined catalog of possible operations independent of the machine learning use case.

Correct Answer: A

Question #13

Which is a technique that automates the handling of categorical variables?

  • A . binary encoding
  • B . decoding
  • C . autoencoding
  • D . one-hot encoding

Correct Answer: D

Explanation:

Reference: https://hub.packtpub.com/how-to-handle-categorical-data-for-machine-learning-algorithms/
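
A minimal sketch of one-hot encoding in plain Python, assuming the category list is derived from the data itself and sorted alphabetically (libraries differ on ordering and on how unseen categories are handled; `one_hot_encode` is a hypothetical helper):

```python
def one_hot_encode(values):
    """Return the sorted categories and one 0/1 row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one position is set per row
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot_encode(["red", "green", "blue", "green"])
print(categories)
print(encoded)
```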

Question #14

Which two statements are correct about deploying machine learning models? (Choose two.)

  • A . It allows integration within business applications.
  • B . It makes it possible to create reports for management dynamically using specific parameters from executives.
  • C . It is critical for achieving high accuracy in training.
  • D . It is a necessary step in training and evaluating the performance of the models.
  • E . It is only possible on the cloud because they require a large amount of compute resources.

Correct Answer: AB

Question #15

Which of the following entity extraction techniques would be best for the extraction of telephone numbers from a text document?

  • A . complex pattern-based
  • B . regex
  • C . statistical
  • D . dictionary

Correct Answer: B

Explanation:

Reference: https://www.researchgate.net/publication/318093829_Developing_an_innovative_entity_extraction_method_for_unstructured_data
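
Telephone numbers follow a fixed surface pattern, which is what makes a regular expression a natural fit. The sketch below assumes North-American-style numbers such as `(555) 123-4567` or `555-987-6543`; the pattern and the sample text are illustrative and would need adjusting for other national formats.

```python
import re

# Area code either in parentheses or followed by a separator,
# then three digits, a separator, and four digits.
PHONE_PATTERN = re.compile(
    r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"
)


def extract_phone_numbers(text):
    """Return every substring that matches the assumed phone format."""
    return PHONE_PATTERN.findall(text)


sample = "Call (555) 123-4567 or 555-987-6543 before 5 pm."
numbers = extract_phone_numbers(sample)
print(numbers)
```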

Question #16

Which statement is true about UTF-8?

  • A . It is an encoding for Latin script.
  • B . It is rarely used today.
  • C . It is an encoding for Unicode characters.
  • D . It is equal to ASCII.

Correct Answer: C

Explanation:

Reference: https://www.w3.org/International/questions/qa-what-is-encoding
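
A short sketch showing that UTF-8 encodes any Unicode character using one to four bytes, and that it matches ASCII only for the ASCII range. The sample characters are chosen for illustration.

```python
samples = {
    "A": "A",                  # U+0041, ASCII letter
    "e-acute": "\u00e9",       # U+00E9, Latin small letter e with acute
    "euro": "\u20ac",          # U+20AC, euro sign
    "emoji": "\U0001F600",     # U+1F600, grinning face
}

# Number of bytes each character occupies when encoded as UTF-8.
byte_lengths = {name: len(char.encode("utf-8"))
                for name, char in samples.items()}
print(byte_lengths)

# Encoding and decoding round-trips without loss.
text = "".join(samples.values())
round_trip = text.encode("utf-8").decode("utf-8")
```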

Question #17

Which test is applied to determine the relationship between two categorical variables?

  • A . paired t-test
  • B . chi squared test
  • C . z test
  • D . t-test

Correct Answer: B

Explanation:

Reference: https://www.pluralsight.com/guides/testing-for-relationships-between-categorical-variables-using-the-chi-square-test
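
For reference, the chi-squared statistic for a contingency table of two categorical variables can be computed by hand as the sum of (observed - expected)^2 / expected over all cells. The sketch below does this in plain Python; the tables are invented for illustration, and turning the statistic into a p-value would additionally need the chi-squared distribution (for example from a statistics library).

```python
def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts: rows are one variable, columns the other.
associated = [[30, 10], [20, 40]]
independent = [[10, 20], [20, 40]]
print(chi_squared_statistic(associated))
print(chi_squared_statistic(independent))
```

A statistic near zero means the observed counts look like what independence would predict; larger values indicate a stronger departure from independence.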

Question #18

With only limited labeled data available, how might a neural network use case be realized?

  • A . by assigning random labels
  • B . by increasing the depth of the neural network
  • C . by creating random data
  • D . by using a customized pre-trained model

Correct Answer: D

Question #19

What is the first step in creating a custom model in Watson Visual Recognition service?

  • A . Test the newly trained model.
  • B . Document the errors from the built-in models.
  • C . Obtain image files containing objects to be classified and organize them into classes.
  • D . Use IBM SPSS to create new machine learning classifiers.

Correct Answer: C

Question #20

What is used to scale large positive values during data cleaning?

  • A . division by random numbers
  • B . square
  • C . logarithm
  • D . subtract median

Correct Answer: C
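
A small sketch of how a logarithm compresses large positive values. The sample values are invented for illustration; the transform is only defined for strictly positive inputs.

```python
import math

# Hypothetical feature spanning six orders of magnitude.
values = [1, 10, 100, 1000, 1_000_000]

# Base-10 logarithm: each factor of 10 becomes a step of 1.
log_scaled = [math.log10(value) for value in values]

raw_range = max(values) - min(values)
scaled_range = max(log_scaled) - min(log_scaled)
print(log_scaled, raw_range, scaled_range)
```

The raw values span a range of 999999, while the log-scaled values span a range of about 6.
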