IBM C2090-101 IBM Big Data Engineer Online Training
The questions for C2090-101 were last updated on Dec 24, 2024.
- Exam Code: C2090-101
- Exam Name: IBM Big Data Engineer
- Certification Provider: IBM
- Latest update: Dec 24, 2024
Which statement is TRUE concerning optimizing load performance?
- A . You can improve the performance by increasing the number of map tasks assigned to the load
- B . When loading large files, the number of files that you load does not impact the performance of the LOAD HADOOP statement
- C . You can improve the performance by decreasing the number of map tasks that are assigned to the load and adjusting the heap size
- D . It is advantageous to run the LOAD HADOOP statement directly pointing to large files located in the host file system as opposed to copying the files to the DFS prior to load
Which of the following statements are TRUE regarding the use of Data Click to load data into BigInsights? (Choose two.)
- A . Big SQL cannot be used to access the data moved in by Data Click because the data is in Hive
- B . You must import metadata for all sources and targets that you want to make available for Data Click activities
- C . Connections from the relational database source to HDFS are discovered automatically from within Data Click
- D . Hive tables are automatically created every time you run an activity that moves data from a relational database into HDFS
- E . HBase tables are automatically created every time you run an activity that moves data from a relational database into HDFS
Which of the following statements regarding importing streaming data from InfoSphere Streams into Hadoop is TRUE?
- A . InfoSphere Streams can both read from and write data to HDFS
- B . The Streams Big Data toolkit operators that interface with HDFS use Apache Flume to integrate with Hadoop
- C . Streams applications never need to be concerned with making the data schemas consistent with those on Hadoop
- D . Big SQL can be used to preprocess the data as it flows through InfoSphere Streams before the data lands in HDFS
Which of the following is TRUE about storing an Apache Spark object in serialized form?
- A . It is advised to use Java serialization over Kryo serialization
- B . Storing the object in serialized form will lead to faster access times
- C . Storing the object in serialized form will lead to slower access times
- D . All of the above
Which ONE of the following statements regarding Sqoop is TRUE?
- A . HBase is not supported as an import target
- B . Data imported using Sqoop is always written to a single Hive partition
- C . Sqoop can be used to retrieve rows newer than some previously imported set of rows
- D . Sqoop can only append new rows to a database table when exporting back to a database
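For readers unfamiliar with incremental imports, the idea of fetching only rows newer than a previously imported set can be sketched in plain Python. This is a conceptual illustration only, not Sqoop's implementation: the function `incremental_import`, the `id` column, and the sample rows are all hypothetical. It loosely mirrors what Sqoop's `--incremental append` mode does with `--check-column` and `--last-value`.

```python
def incremental_import(rows, check_column, last_value):
    """Return rows whose check column exceeds last_value, plus the new high-water mark.

    Hypothetical helper: models the bookkeeping behind an incremental import,
    not any real Sqoop API.
    """
    new_rows = [row for row in rows if row[check_column] > last_value]
    # If nothing new arrived, the high-water mark stays where it was.
    new_last_value = max((row[check_column] for row in new_rows), default=last_value)
    return new_rows, new_last_value


# Hypothetical source table with a monotonically increasing "id" column.
source_table = [{"id": i, "name": f"row-{i}"} for i in range(1, 6)]

# First run: rows with id <= 3 were imported earlier.
imported, last_value = incremental_import(source_table, "id", 3)

# Second run with no new source rows: nothing is re-imported.
imported_again, last_value_again = incremental_import(source_table, "id", last_value)
```

Storing the returned high-water mark between runs is what keeps already imported rows from being fetched again.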
Which one of the following statements is TRUE?
- A . Spark SQL does not support HiveQL
- B . Spark SQL does not support ANSI SQL
- C . To use Spark with Hive, HiveQL queries have to be rewritten in Scala
- D . Spark SQL allows relational queries expressed in SQL, HiveQL, or Scala
Which of the following statements regarding Big SQL is TRUE?
- A . Big SQL doesn’t support stored procedures
- B . Big SQL can be deployed on a subset of data nodes in the BigInsights cluster
- C . Big SQL provides a SQL-on-Hadoop environment based on MapReduce
- D . Only tables created or loaded via Big SQL can be accessed via Big SQL
The number of partitions created by dynamic partitioning in Hive can be controlled by which of the following?
- A . hive.exec.max.dynamic.partitions.pernode
- B . hive.exec.max.dynamic.partitions
- C . hive.exec.max.created.files
- D . All of the above
Which of the following Jaql operators groups one or more arrays based on key values and applies an aggregate expression?
- A . join
- B . group
- C . expand
- D . transform
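The behaviour described in the question above (collecting records by key, then applying an aggregate expression to each collection) can be sketched in Python. This is only an analogy, not Jaql syntax: the helper `group_and_aggregate` and the sample data are hypothetical, and the sketch handles a single input array rather than several.

```python
from collections import defaultdict


def group_and_aggregate(records, key, aggregate):
    """Bucket records by records[key], then apply `aggregate` to each bucket.

    Hypothetical helper for illustration; it is not part of Jaql or any library.
    """
    buckets = defaultdict(list)
    for record in records:
        buckets[record[key]].append(record)
    return {k: aggregate(members) for k, members in buckets.items()}


# Hypothetical input array of sales records.
sales = [
    {"dept": "books", "amount": 10},
    {"dept": "games", "amount": 25},
    {"dept": "books", "amount": 5},
]

# Aggregate expression: total amount per department.
totals = group_and_aggregate(
    sales, "dept", lambda members: sum(m["amount"] for m in members)
)
```

Swapping the lambda for `len` would count records per key instead of summing them.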
Which of the following are CRUD operations available in HBase? (Choose two.)
- A . HTable.Put
- B . HTable.Read
- C . HTable.Delete
- D . HTable.Update
- E . HTable.Remove