Google Certified Professional – Data Engineer: Online Training
The questions for Professional Data Engineer were last updated on Nov 27, 2024.
- Exam Code: Professional Data Engineer
- Exam Name: Google Certified Professional – Data Engineer
- Certification Provider: Google
- Latest update: Nov 27, 2024
The _________ for Cloud Bigtable makes it possible to use Cloud Bigtable in a Cloud Dataflow pipeline.
- A . Cloud Dataflow connector
- B . Dataflow SDK
- C . BigQuery API
- D . BigQuery Data Transfer Service
Does Dataflow process batch data pipelines or streaming data pipelines?
- A . Only Batch Data Pipelines
- B . Both Batch and Streaming Data Pipelines
- C . Only Streaming Data Pipelines
- D . None of the above
You are planning to use Google’s Dataflow SDK to analyze customer data such as displayed below. Your project requirement is to extract only the customer name from the data source and then write to an output PCollection.
Tom,555 X street
Tim,553 Y street
Sam, 111 Z street
Which operation is best suited for the above data processing requirement?
- A . ParDo
- B . Sink API
- C . Source API
- D . Data extraction
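To make the ParDo option concrete: a ParDo applies a per-element function to every record in a PCollection. The sketch below uses plain Python in place of the Dataflow SDK (the function body is what would live inside a DoFn; the record list and function name are illustrative, not from the SDK):

```python
# Plain-Python sketch of the per-element logic a ParDo would apply.
# In the Dataflow/Beam SDK this logic would sit inside a DoFn's process method.

records = [
    "Tom,555 X street",
    "Tim,553 Y street",
    "Sam, 111 Z street",
]

def extract_name(record):
    """Emit only the customer name (the field before the first comma)."""
    return record.split(",", 1)[0].strip()

# Analogous to the output PCollection produced by the ParDo.
names = [extract_name(r) for r in records]
print(names)  # ['Tom', 'Tim', 'Sam']
```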
Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?
- A . An hourly watermark
- B . An event time trigger
- C . The withAllowedLateness method
- D . A processing time trigger
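The distinction being tested is processing time (when an element arrived at the pipeline) versus event time (when it actually occurred). A rough plain-Python sketch of a processing-time aggregation, with made-up arrival timestamps:

```python
# Sketch: grouping elements by the hour they *arrived* at the pipeline
# (processing time), not the hour they were generated (event time).
from collections import defaultdict

# Hypothetical (arrival_hour, value) pairs, one per element.
arrivals = [(9, 10), (9, 5), (10, 7), (10, 3), (11, 1)]

hourly_totals = defaultdict(int)
for hour, value in arrivals:
    # Each hourly processing-time window accumulates whatever showed up in it.
    hourly_totals[hour] += value

print(dict(hourly_totals))  # {9: 15, 10: 10, 11: 1}
```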
Which of the following is NOT true about Dataflow pipelines?
- A . Dataflow pipelines are tied to Dataflow, and cannot be run on any other runner
- B . Dataflow pipelines can consume data from other Google Cloud services
- C . Dataflow pipelines can be programmed in Java
- D . Dataflow pipelines use a unified programming model, so can work both with streaming and batch data sources
You are developing a software application using Google’s Dataflow SDK, and want to use conditionals, for loops, and other complex programming structures to create a branching pipeline.
Which component will be used for the data processing operation?
- A . PCollection
- B . Transform
- C . Pipeline
- D . Sink API
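When you build a branching pipeline, ordinary host-language control flow decides which transforms are wired into the graph. A plain-Python sketch of that idea (the predicate and the two transforms are invented for illustration):

```python
# Sketch: ordinary control flow (a conditional inside a loop) routing each
# element into one of two branches, each with its own transform applied.

values = [1, 2, 3, 4, 5, 6]

branches = {"evens": [], "odds": []}
for v in values:                           # control flow picks the branch
    if v % 2 == 0:
        branches["evens"].append(v * v)    # one branch's transform: square
    else:
        branches["odds"].append(v + 100)   # other branch's transform: offset

print(branches)  # {'evens': [4, 16, 36], 'odds': [101, 103, 105]}
```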
Which of the following IAM roles does your Compute Engine service account require to be able to run pipeline jobs?
- A . dataflow.worker
- B . dataflow.compute
- C . dataflow.developer
- D . dataflow.viewer
Which of the following is not true about Dataflow pipelines?
- A . Pipelines are a set of operations
- B . Pipelines represent a data processing job
- C . Pipelines represent a directed graph of steps
- D . Pipelines can share data between instances
By default, which of the following windowing behavior does Dataflow apply to unbounded data sets?
- A . Windows at every 100 MB of data
- B . Single, Global Window
- C . Windows at every 1 minute
- D . Windows at every 10 minutes
Which of the following job types are supported by Cloud Dataproc (select 3 answers)?
- A . Hive
- B . Pig
- C . YARN
- D . Spark