If a dataset contains rows with individual people and columns for year of birth, country, and income, how many of the columns are continuous and how many are categorical?
If a dataset contains rows with individual people and columns for year of birth, country, and income, how many of the columns are continuous and how many are categorical?A . 1 continuous and 2 categoricalB . 3 categoricalC . 3 continuousD . 2 continuous and 1 categoricalView AnswerAnswer: D Explanation:...
What should you do?
You have spent a few days loading data from comma-separated values (CSV) files into the Google BigQuery table CLICK_STREAM. The column DT stores the epoch time of click events. For convenience, you chose a simple schema where every field is treated as the STRING type. Now, you want to compute...
Which of the following is NOT one of the three main types of triggers that Dataflow supports?
Which of the following is NOT one of the three main types of triggers that Dataflow supports?A . Trigger based on element size in bytesB . Trigger that is a combination of other triggersC . Trigger based on element countD . Trigger based on timeView AnswerAnswer: A Explanation: There are...
Which approach should you take?
Flowlogistic is rolling out their real-time inventory tracking system. The tracking devices will all send package-tracking messages, which will now go to a single Google Cloud Pub/Sub topic instead of the Apache Kafka cluster. A subscriber application will then process the messages for real-time reporting and store them in Google...
Which two characteristic support this method?
You want to use a database of information about tissue samples to classify future tissue samples as either normal or mutated. You are evaluating an unsupervised anomaly detection method for classifying the tissue samples. Which two characteristic support this method? (Choose two.)A . There are very few occurrences of mutations...
How should you adjust the database design?
You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then,...
Which approach meets the requirements?
You need to compose visualizations for operations teams with the following requirements: Which approach meets the requirements?A . Load the data into Google Sheets, use formulas to calculate a metric, and use filters/sorting to show only suboptimal links in a table.B . Load the data into Google BigQuery tables, write...
Why do you need to split a machine learning dataset into training data and test data?
Why do you need to split a machine learning dataset into training data and test data?A . So you can try two different sets of featuresB . To make sure your model is generalized for more than just the training dataC . To allow you to create unit tests in...
What are all of the BigQuery operations that Google charges for?
What are all of the BigQuery operations that Google charges for?A . Storage, queries, and streaming insertsB . Storage, queries, and loading data from a fileC . Storage, queries, and exporting dataD . Queries and streaming insertsView AnswerAnswer: A Explanation: Google charges for storage, queries, and streaming inserts. Loading data...
What should you recommend they do?
Topic 4, Main Questions Set B Your company has recently grown rapidly and now ingesting data at a significantly higher rate than it was previously. You manage the daily batch MapReduce analytics jobs in Apache Hadoop. However, the recent increase in data has meant the batch jobs are falling behind....