What is a concern the data scientist should have about the data?
A data scientist is assigned to build a model from a reporting data warehouse. The warehouse contains data collected from many sources and transformed through a complex, multi-stage ETL process. What is a concern the data scientist should have about the data?
A. It is too processed
B. It is...
Which analytical method is considered unsupervised?
A. K-means clustering
B. Naïve Bayesian classifier
C. Decision tree
D. Linear regression
Answer: A
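Why K-means counts as unsupervised can be seen in a minimal sketch: the algorithm receives only the data points and starting centroids, never a label column. The function name `kmeans_1d`, the sample points, and the starting centroids are illustrative assumptions, not part of the question.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data. No labels are passed in."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

# Hypothetical data with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers = kmeans_1d(points, centroids=[0.0, 5.0])
print(centers)
```

The other three options (naïve Bayes, decision tree, linear regression) all need a known target value for each training row, which is what makes them supervised.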
What requests resources from YARN during a MapReduce job?
A. Map and reduce tasks
B. ApplicationMaster
C. ApplicationsManager
D. DataNodes
Answer: B
What type of data is represented in the exhibit?
A. Structured
B. Unstructured
C. Quasi-structured
D. Semi-structured
Answer: A
In data visualization, which type of chart is recommended to represent frequency data?
A. Line chart
B. Histogram
C. Q-Q chart
D. Scatterplot
Answer: B
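A histogram is, at its core, a count of how many values fall into each bin. The following is a rough text-only sketch of that idea; the helper `histogram_counts`, the bin width, and the sample ages are made up for illustration.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Map each bin's lower edge to the number of values falling in it."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical sample: ages of ten survey respondents.
ages = [21, 23, 25, 31, 34, 35, 38, 42, 47, 58]
bin_width = 10
counts = histogram_counts(ages, bin_width)

# Draw one row per bin; bar length is the frequency.
for start, n in counts.items():
    print(f"{start}-{start + bin_width - 1}: {'#' * n}")
```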
Which type of numeric value does a logistic regression model estimate?
A. Probability
B. A p-value
C. Any integer
D. Any real number
Answer: A
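The reason the output is a probability is that logistic regression passes a linear combination of the inputs through the logistic (sigmoid) function, which maps any real number into the interval (0, 1). A minimal sketch, assuming hypothetical weights and a hypothetical helper name `predict_probability`:

```python
import math

def predict_probability(features, weights, bias):
    """Logistic model: sigmoid of the linear score, always between 0 and 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Plain sigmoid; fine for moderate z, but math.exp can overflow
    # for very large negative scores.
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted coefficients, for illustration only.
weights = [0.8, -1.5]
bias = 0.2

p = predict_probability([2.0, 1.0], weights, bias)
print(p)
```

The linear score `z` itself can be any real number; only after the sigmoid is applied does the value become a probability.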
What is the next step?
You have just completed the Discovery phase of a project and finished interviewing the main stakeholders. You have identified the necessary data feeds and are now beginning to set up the analytic sandbox. What is the next step?
A. Assess data quality
B. Perform ELT / ETL
C. Create data...
How is dimensionality defined in a "bag of words" document representation?
A. Average number of words per sentence in the document
B. Total number of words in the document
C. Number of unique terms in the document
D. Frequency of repeated words in the document
Answer: C
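The difference between the total word count and the dimensionality is easy to see in a small sketch. The sample sentence and the whitespace tokenizer are simplifying assumptions; real pipelines usually also strip punctuation and stop words.

```python
from collections import Counter

def bag_of_words(text):
    """Count occurrences of each lowercased, whitespace-separated term."""
    return Counter(text.lower().split())

doc = "the cat sat on the mat the end"
bag = bag_of_words(doc)

total_words = sum(bag.values())   # every token counts, repeats included
dimensionality = len(bag)         # one dimension per unique term

print(total_words, dimensionality)
```

Each unique term becomes one axis of the document vector, and the count for that term is the coordinate along that axis.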
What would be the assigned probability, p(good), of a single male with no known savings?
Refer to the exhibit, which provides the decision tree for predicting whether someone is a good or bad credit risk. What would be the assigned probability, p(good), of a single male with no known savings?
A. 0.83
B. 0
C. 0.498
D. 0.6
Answer: A
Which value is a reasonable guess for R-squared?
You have run a linear regression model on the data shown in the graphic. Which value is a reasonable guess for R-squared?
A. -0.8
B. 0.8
C. 0.25
D. 1.25
Answer: B
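Two of the options can be ruled out without seeing the graphic: for an ordinary least-squares line with an intercept, R-squared on the fitted data lies between 0 and 1, so -0.8 and 1.25 are not possible. The sketch below computes R-squared as 1 - SS_res / SS_tot; the data points are invented for illustration and are not the data from the graphic.

```python
def r_squared(xs, ys):
    """Fit y = intercept + slope * x by least squares and return R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares versus total sum of squares around the mean.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical noisy upward trend.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 2.5, 4.5, 4.0, 6.5, 6.0]
r2 = r_squared(xs, ys)
print(r2)
```

Choosing between the two remaining options depends on how tightly the points in the graphic cluster around the fitted line.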