DELL EMC D-DS-FN-23 Dell Data Science Foundations 2023 Online Training
The questions for D-DS-FN-23 were last updated on Jan 31, 2025.
- Exam Code: D-DS-FN-23
- Exam Name: Dell Data Science Foundations 2023
- Certification Provider: DELL EMC
- Latest update: Jan 31, 2025
You have been assigned to run a logistic regression model for each of 100 countries, and all the data is currently stored in a PostgreSQL database.
Which tool/library would you use to produce these models with the least effort?
- A . MADlib
- B . Mahout
- C . RStudio
- D . HBase
A data scientist plans to classify the sentiment polarity of 10,000 product reviews collected from the Internet.
Assuming labeled training data is available, what is the most appropriate model to use?
- A . Naïve Bayesian classifier
- B . Linear regression
- C . Logistic regression
- D . K-means clustering
What does the R code nv <- v[v < 1000] do?
- A . Selects the values in vector v that are less than 1000 and assigns them to the vector nv
- B . Sets nv to TRUE or FALSE depending on whether all elements of vector v are less than 1000
- C . Removes elements of vector v less than 1000 and assigns the elements >= 1000 to nv
- D . Selects values of vector v less than 1000, modifies v, and makes a copy to nv
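The logical-indexing pattern in this question can be traced step by step. The following is only a rough Python analogue of the R expression, not R itself; the sample values in v are hypothetical and chosen purely for illustration.

```python
# Hypothetical sample data; the name v mirrors the R snippet in the question.
v = [250, 1000, 1500, 999, 42]

# Step 1: evaluate the comparison element by element, as R does for v < 1000.
mask = [x < 1000 for x in v]

# Step 2: keep only the elements whose mask entry is True, as v[mask] does in R.
nv = [x for x, keep in zip(v, mask) if keep]

print(mask)  # the element-wise comparison result
print(nv)    # the selected values
print(v)     # the original sequence, unchanged by the selection
```

In this sketch the selection builds a new list and leaves v as it was, which matches how subsetting followed by assignment to a different name behaves in R.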
You have run a Linear Regression model on the data shown in the graphic.
Which value is a reasonable guess for R-squared?
- A . -0.8
- B . 0.8
- C . 0.25
- D . 1.25
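The graphic for this question is not reproduced here, but the way R-squared is computed for a simple least-squares fit can still be sketched. The function name r_squared and the sample data below are hypothetical; the code assumes a single predictor and a model fitted with an intercept, in which case the value falls between 0 and 1.

```python
def r_squared(xs, ys):
    """R-squared of a simple least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Closed-form least-squares estimates for one predictor.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical, roughly linear data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]
r2 = r_squared(xs, ys)

# Perfectly linear data leaves no residual variance.
perfect = r_squared([1, 2, 3], [2, 4, 6])

print(r2, perfect)
```

A value near 1 indicates that the line explains most of the variance in y, while a value near 0 indicates that it explains very little.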
You have created a scatterplot of two continuous variables for 2000 records. You want to add a line to the scatterplot to check linearity of the data.
Which function would best address this need?
- A . abline()
- B . glm()
- C . hist()
- D . lm()
Why do the Naïve Bayesian classifier implementations use the log of probability value rather than the pure probability value?
- A . To ensure the conditional independence of attribute values
- B . To avoid numerical underflow errors in high dimensional problems
- C . To obtain a more accurate estimate of the probabilities without the need for Laplace smoothing
- D . To invalidate the variables that are continuous
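The difference between multiplying raw probabilities and summing their logarithms can be seen with a small numerical experiment. This is a minimal sketch under assumed values: 1,000 attributes, each contributing a conditional probability of 0.01, both of which are hypothetical.

```python
import math

# Hypothetical setup: 1,000 attributes, each with conditional probability 0.01.
probs = [0.01] * 1000

# Multiplying the raw probabilities: the true value (1e-2000) is far smaller
# than the smallest positive double, so the running product collapses to 0.0.
product = 1.0
for p in probs:
    product *= p

# Summing the logs instead keeps the score as an ordinary finite number.
log_sum = sum(math.log(p) for p in probs)

print(product)
print(log_sum)
```

Because the logarithm is monotonic, comparing classes by their summed log-probabilities picks the same class as comparing the raw products would, if those products could be represented.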
Consider the following SQL query:
SELECT product_id FROM supplier_A
UNION
SELECT product_id FROM supplier_B;
What is the expected result?
- A . All product_id values from both tables with duplicates or repeating rows
- B . All product_id values from supplier_A table but not from supplier_B table
- C . All product_id values from supplier_B table but not from supplier_A table
- D . All product_id values from both tables with no duplicates or repeating rows
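The behavior of UNION can be checked against a throwaway database. The sketch below uses Python's built-in sqlite3 module with an in-memory database rather than the PostgreSQL setup mentioned elsewhere on this page; the rows inserted are hypothetical, and UNION ALL is included only for contrast.

```python
import sqlite3

# In-memory database with hypothetical rows; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier_A (product_id INTEGER);
CREATE TABLE supplier_B (product_id INTEGER);
INSERT INTO supplier_A VALUES (1), (2), (2), (3);
INSERT INTO supplier_B VALUES (3), (4);
""")

# UNION combines both result sets and removes duplicate rows.
union_rows = sorted(
    row[0]
    for row in conn.execute(
        "SELECT product_id FROM supplier_A "
        "UNION "
        "SELECT product_id FROM supplier_B"
    )
)

# UNION ALL, shown for contrast, keeps every row, including repeats.
union_all_rows = sorted(
    row[0]
    for row in conn.execute(
        "SELECT product_id FROM supplier_A "
        "UNION ALL "
        "SELECT product_id FROM supplier_B"
    )
)
conn.close()

print(union_rows)
print(union_all_rows)
```

The results are sorted in Python only to make them easy to compare; neither UNION nor UNION ALL guarantees a row order without an ORDER BY clause.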
In data visualization, which type of chart is recommended to represent frequency data?
- A . Line chart
- B . Histogram
- C . Q-Q chart
- D . Scatterplot
Which word or phrase completes the statement: “Excessive emphasis color is to bar chart as __________________”?
- A . Multicollinearity is to OLS
- B . Multicollinearity is to serial correlation
- C . Confidence is to leverage
- D . Confidence interval is to regression
You submit a MapReduce job to a Hadoop cluster. Although the job was successfully submitted, you notice that it is not completing.
What should be done?
- A . Ensure that a DataNode is running
- B . Ensure that the TaskTracker is running
- C . Ensure that the NameNode is running
- D . Ensure that the JobTracker is running