Hortonworks Hortonworks-Certified-Apache-Hadoop-2.0-Developer: Hadoop 2.0 Certification Exam for Pig and Hive Developer

Page:    1 / 22   
Total 108 questions

Review the following data and Pig code.

M,38,95111
F,29,95060
F,45,95192
M,62,95102
F,56,95102

A = LOAD 'data' USING PigStorage(',') as (gender:chararray, age:int, zip:chararray);
B = FOREACH A GENERATE age;
Which one of the following commands would save the results of B to a folder in HDFS named myoutput?

  • A. STORE A INTO 'myoutput' USING PigStorage(',');
  • B. DUMP B using PigStorage('myoutput');
  • C. STORE B INTO 'myoutput';
  • D. DUMP B INTO 'myoutput';


Answer : C
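
In Pig, DUMP prints a relation to the console and takes no INTO clause, while STORE writes a relation to a path; STORE A would write the unprojected relation rather than B. The following is a minimal plain-Java sketch of what the two Pig statements compute, so the expected contents of B are visible. It does not use any Pig API; the class and method names are hypothetical, and the rows are the sample data from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: mimics LOAD ... PigStorage(',')
// followed by FOREACH ... GENERATE age on in-memory rows.
class ProjectAgeSketch {

    // Splits each row on commas and keeps only the second field as an int.
    static List<Integer> projectAge(List<String> rows) {
        List<Integer> ages = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            ages.add(Integer.parseInt(fields[1].trim()));
        }
        return ages;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "M,38,95111", "F,29,95060", "F,45,95192", "M,62,95102", "F,56,95102");
        // One age per line, which is the shape of the single-column relation B.
        for (int age : projectAge(rows)) {
            System.out.println(age);
        }
    }
}
```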

You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each one of these characters, you will emit the character as a key and an IntWritable as the value. As this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?

  • A. Processor and network I/O
  • B. Disk I/O and network I/O
  • C. Processor and RAM
  • D. Processor and disk I/O


Answer : B
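
To see why the intermediate data outgrows the input, here is a rough plain-Java sketch of the map step described in the question. It uses no Hadoop classes; the class and method names are hypothetical. The per-record sizes are assumptions: a 4-byte int for the value, and about 2 bytes for a one-character ASCII key (one length byte plus one character byte). Any extra framing in the intermediate files is ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Hadoop code: estimates map output volume
// when every input character becomes its own (character, 1) record.
class CharEmitSketch {

    static final int KEY_BYTES = 2;   // assumed: length prefix + one ASCII byte
    static final int VALUE_BYTES = 4; // a 32-bit int

    // Mimics the map step: one "character<TAB>1" pair per character of the line.
    static List<String> map(String line) {
        List<String> pairs = new ArrayList<>();
        for (char c : line.toCharArray()) {
            pairs.add(c + "\t1");
        }
        return pairs;
    }

    // Approximate serialized size of the records emitted for one line.
    static long intermediateBytes(String line) {
        return (long) map(line).size() * (KEY_BYTES + VALUE_BYTES);
    }

    public static void main(String[] args) {
        String line = "hello hadoop";
        long inputBytes = line.getBytes(StandardCharsets.US_ASCII).length;
        System.out.println("input bytes: " + inputBytes);
        System.out.println("records emitted: " + map(line).size());
        System.out.println("approx. intermediate bytes: " + intermediateBytes(line));
    }
}
```

Under these assumptions each input byte turns into roughly six bytes of map output, all of which is spilled to local disk and then copied to reducers, which is consistent with disk I/O and network I/O being the expected bottlenecks.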

MapReduce v2 (MRv2/YARN) is designed to address which two issues?

  • A. Single point of failure in the NameNode.
  • B. Resource pressure on the JobTracker.
  • C. HDFS latency.
  • D. Ability to run frameworks other than MapReduce, such as MPI.
  • E. Reduce complexity of the MapReduce APIs.
  • F. Standardize on a single MapReduce API.


Answer : B,D

Reference: Apache Hadoop YARN – Concepts & Applications

What are the TWO main components of the YARN ResourceManager process? Choose 2 answers

  • A. Job Tracker
  • B. Task Tracker
  • C. Scheduler
  • D. Applications Manager


Answer : C,D

Given a directory of files with the following structure: line number, tab character, string:
Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?

  • A. SequenceFileAsTextInputFormat
  • B. SequenceFileInputFormat
  • C. KeyValueTextInputFormat
  • D. BDBInputFormat


Answer : C
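
In Hadoop the key/value text input format is the class KeyValueTextInputFormat. It treats each line as one record and divides it at the first separator (a tab by default) into key and value; a line with no separator becomes the key with an empty value. The plain-Java sketch below imitates that splitting rule on the sample lines. It is an illustration only, with hypothetical class and method names, and does not call Hadoop.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical sketch, not Hadoop code: splits a line into key and value
// at the first tab, the way a key/value text input format would.
class TabSplitSketch {

    static Map.Entry<String, String> split(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            // No separator: the whole line is the key, the value is empty.
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }

    public static void main(String[] args) {
        String[] lines = {
            "1\tabialkjfjkaoasdfjksdlkjhqweroij",
            "2\tkadfjhuwqounahagtnbvaswslmnbfgy",
            "3\tkjfteiomndscxeqalkzhtopedkfsikj"
        };
        for (String line : lines) {
            Map.Entry<String, String> record = split(line);
            System.out.println("key=" + record.getKey() + " value=" + record.getValue());
        }
    }
}
```

With this rule the Mapper would receive the line number as the key and the string as the value, one record per input line.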

Explanation:
http://stackoverflow.com/questions/9721754/how-to-parse-customwritable-from-text-in-hadoop
