Hortonworks Apache Hadoop Developer Hadoop 2.0 Certification exam for Pig and Hive Developer Online Training

The questions for Apache Hadoop Developer were last updated on Apr 18, 2025.

Exam Code: Apache Hadoop Developer
Exam Name: Hadoop 2.0 Certification exam for Pig and Hive Developer
Certification Provider: Hortonworks
Latest update: Apr 18, 2025

Question #21

You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt and #data.txt.

How many files will be processed by the FileInputFormat.setInputPaths() method when it is given a Path object representing this directory?

A . Four, all files will be processed
B . Three, the pound sign is an invalid character for HDFS file names
C . Two, file names with a leading period or underscore are ignored
D . None, the directory cannot be named jobdata
E . One, no special characters can prefix the name of an input file
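As background for this question: Hadoop's FileInputFormat applies a default hidden-file filter that skips any path whose name starts with an underscore or a period. The following is a minimal Python sketch of that rule, not Hadoop code; the function names `is_hidden` and `visible_inputs` are hypothetical and exist only for illustration.

```python
def is_hidden(name: str) -> bool:
    # Mirrors the default hidden-file rule: a leading "_" or "." marks a file as hidden.
    return name.startswith("_") or name.startswith(".")


def visible_inputs(names):
    # Keep only the file names that the hidden-file rule would let through.
    return [n for n in names if not is_hidden(n)]


# The four file names from the question's jobdata directory.
files = ["_first.txt", "second.txt", ".third.txt", "#data.txt"]
processed = visible_inputs(files)
print(processed)
```

Under this rule a leading pound sign is not special, so only the underscore and period prefixes affect which files are read.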

Question #22

In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?

A . m × n (i.e., m multiplied by n)
B . n
C . m
D . m + n (i.e., m plus n)
E . m^n (i.e., m to the power of n)
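One way to reason about this question is to enumerate the transfers directly. The sketch below assumes the usual model in which every reducer fetches its own partition from every mapper's output; `shuffle_copies` is a hypothetical helper, and real jobs may skip or retry individual fetches.

```python
def shuffle_copies(m: int, n: int):
    # One (mapper, reducer) pair per fetch: each reducer pulls its partition from each mapper.
    return [(mapper, reducer) for reducer in range(n) for mapper in range(m)]


# Example with 3 mappers and 2 reducers.
copies = shuffle_copies(3, 2)
print(len(copies))
```

Every pair in the list is distinct, so the count grows with the product of the two task counts rather than their sum.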

Question #23

Which Hadoop component is responsible for managing the distributed file system metadata?

A . NameNode
B . Metanode
C . DataNode
D . NameSpaceManager

Question #24

Review the following data and Pig code.

M,38,95111
F,29,95060
F,45,95192
M,62,95102
F,56,95102

A = LOAD 'data' USING PigStorage(',') AS (gender:chararray, age:int, zip:chararray);
B = FOREACH A GENERATE age;

Which one of the following commands would save the results of B to a folder in HDFS named myoutput?

A . STORE A INTO 'myoutput' USING PigStorage(',');
B . DUMP B using PigStorage('myoutput');
C . STORE B INTO 'myoutput';
D . DUMP B INTO 'myoutput';

Question #25

MapReduce v2 (MRv2/YARN) splits which major functions of the JobTracker into separate daemons? Select two.

A . Health status checks (heartbeats)
B . Resource management
C . Job scheduling/monitoring
D . Job coordination between the ResourceManager and NodeManager
E . Launching tasks
F . Managing file system metadata
G . MapReduce metric reporting
H . Managing tasks

Question #26

Assuming the following Hive query executes successfully:

Which one of the following statements describes the result set?

A . A bigram of the top 80 sentences that contain the substring "you are" in the lines column of the inputdata table.
B . An 80-value ngram of sentences that contain the words "you" or "are" in the lines column of the inputdata table.
C . A trigram of the top 80 sentences that contain "you are" followed by a null space in the lines column of the inputdata table.
D . A frequency distribution of the top 80 words that follow the subsequence "you are" in the lines column of the inputdata table.
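The query itself is not reproduced above, but the options refer to context n-grams over tokenized sentences. As a rough illustration of that idea only, the Python sketch below counts the words that immediately follow a fixed context and returns the k most frequent. It is not Hive's implementation; `top_following_words` and the sample sentences are hypothetical.

```python
from collections import Counter


def top_following_words(sentences, context, k):
    # Count each word that directly follows the context words, then keep the k most frequent.
    counts = Counter()
    n = len(context)
    for words in sentences:
        for i in range(len(words) - n):
            if words[i:i + n] == context:
                counts[words[i + n]] += 1
    return counts.most_common(k)


# Hypothetical, already-tokenized sentences standing in for the lines column.
sentences = [
    ["you", "are", "welcome"],
    ["i", "think", "you", "are", "right"],
    ["you", "are", "right", "again"],
]
result = top_following_words(sentences, ["you", "are"], 80)
print(result)
```

The result is a list of (word, count) pairs ordered by frequency, which is the shape of a frequency distribution rather than a list of sentences.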

Question #27

Given the following Pig commands:

Which one of the following statements is true?

A . The $1 variable represents the first column of data in ‘my.log’
B . The $1 variable represents the second column of data in ‘my.log’
C . The severe relation is not valid
D . The grouped relation is not valid

Question #28

What does Pig provide to the overall Hadoop solution?

A . Legacy language integration with the MapReduce framework
B . Simple scripting language for writing MapReduce programs
C . Database table and storage management services
D . C++ interface to MapReduce and data warehouse infrastructure

Question #29

What types of algorithms are difficult to express in MapReduce v1 (MRv1)?

A . Algorithms that require applying the same mathematical function to large numbers of individual binary records.
B . Relational operations on large amounts of structured and semi-structured data.
C . Algorithms that require global, shared state.
D . Large-scale graph algorithms that require one-step link traversal.
E . Text analysis algorithms on large collections of unstructured text (e.g., Web crawls).
