A business flow in DataWorks integrates different node task types by business type; such a structure makes business code development easier.
Which of the following descriptions about the node type is INCORRECT? Score 2
- A . A zero-load node is a control node that does not generate any data. The virtual node is generally used as the root node for planning the overall node workflow.
- B . An ODPS SQL task allows you to edit and maintain SQL code on the Web and easily run, debug, and collaborate on code.
- C . The PyODPS node in DataWorks can be integrated with MaxCompute Python SDK. You can edit the Python code to operate MaxCompute on a PyODPS node in DataWorks.
- D . The SHELL node supports standard SHELL syntax and the interactive syntax. The SHELL task can run on the default resource group
DataV is a powerful yet accessible data visualization tool, which features geographic information systems allowing for rapid interpretation of data to understand relationships, patterns, and trends.
When a DataV screen is ready, it can be embedded into the enterprise's existing portal through ______.
- A . URL after the release
- B . URL in the preview
- C . MD5 code obtained after the release
- D . Jar package imported after the release
DataWorks can be used to develop and configure data sync tasks.
Which of the following statements are correct? (Number of correct answers: 3) Score 2
- A . The data source configuration in project management is required to add a data source
- B . Some of the columns in source tables can be extracted to create a mapping relationship between fields, and constants or variables cannot be added
- C . For the extraction of source data, a "where" filtering clause can be used as the criterion for incremental synchronization
- D . Clean-up rules can be set to clear or preserve existing data before data is written
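The idea of a "where" filter as an incremental synchronization criterion can be sketched as follows. This is only an illustration under stated assumptions: the column name gmt_modified and the helper incremental_where are hypothetical, and the snippet builds a filter string rather than calling any DataWorks API.

```python
from datetime import date, timedelta


def incremental_where(column: str, bizdate: date) -> str:
    """Build a filter that selects only rows changed on the given business date.

    `column` is a hypothetical last-modified column in the source table.
    """
    next_day = bizdate + timedelta(days=1)
    return (
        f"{column} >= '{bizdate.isoformat()}' "
        f"AND {column} < '{next_day.isoformat()}'"
    )


# One day's slice of the source table, suitable for a daily scheduled run.
clause = incremental_where("gmt_modified", date(2018, 1, 1))
```

Using a half-open date range keeps consecutive daily runs from overlapping or leaving gaps.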
You are working on a project where you need to chain together MapReduce and Hive jobs. You also need the ability to use forks, decision points, and path joins.
Which ecosystem project should you use to perform these actions? Score 2
- A . Spark
- B . HUE
- C . Zookeeper
- D . Oozie
MaxCompute supports two charging methods: Pay-As-You-Go and Subscription (CU cost). Pay-As-You-Go means each task is billed per job according to its input size. With this charging method, the billing items do not include charges for ______. Score 2
- A . Data upload
- B . Data download
- C . Computing
- D . Storage
In MaxCompute, if an error occurs during Tunnel transmission due to the network or the Tunnel service, the user can resume the last upload operation with the command tunnel resume. Score 1
- A . True
- B . False
You are working on a project where you need to chain together MapReduce and Hive jobs. You also need the ability to use forks, decision points, and path joins.
Which ecosystem project should you use to perform these actions?
- A . Apache HUE
- B . Apache Zookeeper
- C . Apache Oozie
- D . Apache Spark
To ensure smooth processing of tasks in the DataWorks data development kit, you must create an AccessKey. An AccessKey is primarily used for access permission verification between various Alibaba Cloud products. An AccessKey has two parts: ____. (Number of correct answers: 2) Score 2
- A . Access Username
- B . Access Key ID
- C . Access Key Secret
- D . Access Password
Scenario: Jack is the administrator of project prj1. The project involves a large volume of sensitive data such as bank accounts and medical records. Jack wants to properly protect the data.
Which of the following statements is necessary?
- A . set ProjectACL=true;
- B . add accountprovider ram;
- C . set ProjectProtection=true;
- D . use prj1;
A resource is a concept specific to MaxCompute. If you want to use a user-defined function (UDF) or MapReduce, a resource is needed. For example, after you have prepared a UDF, you must upload the compiled jar package to MaxCompute as a resource.
Which of the following objects are MaxCompute resources? (Number of correct answers: 4) Score 2
- A . Files
- B . Tables: Tables in MaxCompute
- C . Jar: Compiled Java jar package
- D . Archive: Recognize the compression type according to the postfix in the resource name
- E . ACL Policy
Which of the following is NOT a proper step for granting permission on an L4 MaxCompute table to a user? (L4 is a level in MaxCompute label-based security (LabelSecurity), a mandatory access control (MAC) policy at the project space level. It allows project administrators to control user access to column-level sensitive data with improved flexibility.) Score 2
- A . If no permissions have been granted to the user and the user does not belong to the project, add the user to the project. The user does not have any permissions before they are added to the project.
- B . Grant a specific operation permission to the user.
- C . If the user manages resources that have labels, such as datasheets and packages with datasheets, grant label permissions to the user.
- D . The user needs to create a project in simple mode
Data synchronization development in DataWorks provides both wizard and script modes. Score 1
- A . True
- B . False
Alibaba Cloud Quick BI reporting tools support a variety of data sources, helping users analyze and present data from different sources. ______ is not supported as a data source yet. Score 2
- A . Results returned from the API
- B . MaxCompute
- C . Local Excel files
- D . MySQL RDS
To improve processing efficiency when using MaxCompute, you can specify partitions when creating a table. That is, several fields in the table are specified as partition columns.
Which of the following descriptions about MaxCompute partitioned tables are correct? (Number of correct answers: 4)
- A . In most cases, users can consider a partition to be a directory in a file system
- B . User can specify multiple partitions, that is, multiple fields of the table are considered as the partitions of the table, and the relationship among partitions is similar to that of multiple directories
- C . If the partition columns to be accessed are specified when using data, then only corresponding partitions are read and full table scan is avoided, which can improve the processing efficiency and save costs
- D . MaxCompute partition only supports string type and the conversion of any other types is not allowed
- E . The partition value cannot contain double-byte characters (such as Chinese).
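The directory analogy and the effect of specifying partition columns can be sketched with a toy in-memory model. This is not how MaxCompute stores data; the partition columns dt and region, the sample rows, and the helper read_partitions are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Toy model: each partition behaves like a "directory" keyed by its values.
# The partition columns (dt, region) and rows are illustrative assumptions.
PartitionKey = Tuple[str, str]

table: Dict[PartitionKey, List[dict]] = {
    ("20180101", "hangzhou"): [{"id": 1}, {"id": 2}],
    ("20180101", "beijing"): [{"id": 3}],
    ("20180102", "hangzhou"): [{"id": 4}],
}


def read_partitions(
    data: Dict[PartitionKey, List[dict]],
    dt: Optional[str] = None,
    region: Optional[str] = None,
) -> Tuple[List[dict], int]:
    """Return (rows, partitions_scanned), reading only matching partitions."""
    rows: List[dict] = []
    scanned = 0
    for (p_dt, p_region), part_rows in data.items():
        if dt is not None and p_dt != dt:
            continue  # partition skipped without reading its rows
        if region is not None and p_region != region:
            continue
        scanned += 1
        rows.extend(part_rows)
    return rows, scanned


pruned_rows, pruned_scanned = read_partitions(table, dt="20180102")
all_rows, all_scanned = read_partitions(table)  # no filter: full scan
```

With a partition filter only one "directory" is read; without it every partition is scanned, which is the cost difference the question describes.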
MaxCompute uses the project as the billing unit. The bill is based on three aspects: storage usage, computing resources, and data download. You pay for compute and storage resources by the day with no long-term commitments. Score 1
- A . True
- B . False
Machine Learning Platform for Artificial Intelligence (PAI) node is one of the node types in DataWorks business flow. It is used to call tasks created on PAI and schedule production activities based on the node configuration. PAI nodes can be added to DataWorks only _________. Score 2
- A . after PAI experiments are created on PAI
- B . after PAI service is activated
- C . after MaxCompute service is activated
- D . after a Spark on MaxCompute Machine Learning project is created
DataService Studio in DataWorks aims to build a data service bus to help enterprises centrally manage private and public APIs. DataService Studio allows you to quickly create APIs based on data tables and register existing APIs with the DataService Studio platform for centralized management and release.
Which of the following descriptions about DataService Studio in DataWorks is INCORRECT? Score 2
- A . DataService Studio is connected to API Gateway. Users can deploy APIs to API Gateway with one click.
- B . DataService Studio adopts the serverless architecture. All you need to care about is the query logic of APIs, instead of the infrastructure such as the running environment.
- C . To meet the personalized query requirements of advanced users, DataService Studio provides the custom Python script mode to allow you to write the API query yourself. It also supports multi-table association, complex query conditions, and aggregate functions.
- D . Users can deploy any APIs created and registered in DataService Studio to API Gateway for management, such as API authorization and authentication, traffic control, and metering
A log table named log in MaxCompute is a partitioned table, and the partition key is dt. A new partition is created daily to store the new data of that day. Now we have one month's data, from dt='20180101' to dt='20180131', and we may use ________ to delete the data on 20180101.
- A . delete from log where dt='20180101'
- B . truncate table where dt='20180101'
- C . drop partition log (dt='20180101')
- D . alter table log drop partition(dt='20180101')
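The daily partition layout described in this question can be sketched in Python. The helper daily_partitions is hypothetical and only enumerates the dt values; it does not talk to MaxCompute.

```python
from datetime import date, timedelta
from typing import List


def daily_partitions(start: date, end: date) -> List[str]:
    """Return dt values in yyyymmdd form for each day from start to end inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days + 1)]


# One partition value per day of January 2018, matching the scenario above.
partitions = daily_partitions(date(2018, 1, 1), date(2018, 1, 31))
```

Each value in the list corresponds to one partition of the log table, so removing a single day's data affects exactly one of these entries.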
There are multiple connection clients for MaxCompute. Which of the following is the easiest way to configure workflow and scheduling for MaxCompute tasks? Score 2
- A . Use DataWorks
- B . Use IntelliJ IDEA
- C . Use MaxCompute Console
- D . No supported tool yet
There are three types of node instances in an E-MapReduce cluster: master, core, and _____. Score 2
- A . task
- B . zero-load
- C . gateway
- D . agent
There are various methods for accessing MaxCompute, for example, through the management console, the client command line, and the Java API. The command-line tool odpscmd can be used to create, operate, or delete a table in a project. Score 1
- A . True
- B . False
We use the MaxCompute tunnel command to upload the log.txt file to the t_log table. t_log is a partitioned table and the partition columns are (p1 string, p2 string).
Which of the following commands is correct?
- A . tunnel upload log.txt t_log/p1="b1", p2="b2"
- B . tunnel upload log.txt t_log/(p1="b1", p2="b2")
- C . tunnel upload log.txt t_log/p1="b1"/p2="b2"
In MaxCompute, you can use the Tunnel command line for data upload and download.
Which of the following descriptions of Tunnel commands is NOT correct? Score 2
- A . Upload: Supports file or directory (level-one) uploading. Data can only be uploaded to a single table or table partition each time.
- B . Download: You can only download data to a single file. Only data in one table or partition can be downloaded to one file each time. For partitioned tables, the source partition must be specified.
- C . Resume: If an error occurs due to the network or the Tunnel service, you can resume transmission of the file or directory after interruption.
- D . Purge: Clears the table directory. By default, use this command to clear information of the last three days.
If a task node of DataWorks is deleted from the recycle bin, it can still be restored.
- A . True
- B . False