FAQ Hadoop -5

Posted by: Sushanth, Sunday, 27 December 2015

41) What are the different relational operations in “Pig Latin”?

They are:

i. foreach

ii. order by

iii. filter

iv. group

v. distinct

vi. join

vii. limit
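
To make the list above concrete, here is a minimal sketch of what each operator does, written as plain Python over in-memory records rather than as Pig Latin. The sample relations (employees, depts) and their fields are hypothetical and exist only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample relations, invented for this illustration.
employees = [
    {"name": "asha", "dept": "eng", "salary": 120},
    {"name": "ravi", "dept": "ops", "salary": 90},
    {"name": "meena", "dept": "eng", "salary": 150},
    {"name": "ravi", "dept": "ops", "salary": 90},  # duplicate row
]
depts = [{"dept": "eng", "floor": 3}, {"dept": "ops", "floor": 1}]

# distinct: drop duplicate rows, keeping first-seen order.
distinct = [dict(t) for t in dict.fromkeys(tuple(r.items()) for r in employees)]

# filter: keep only rows that satisfy a predicate.
filtered = [r for r in distinct if r["salary"] >= 100]

# foreach: project or transform every row.
projected = [(r["name"], r["salary"] * 2) for r in distinct]

# order by: sort rows on a field.
ordered = sorted(distinct, key=itemgetter("salary"), reverse=True)

# group: collect rows that share a key (groupby needs sorted input).
by_dept = sorted(distinct, key=itemgetter("dept"))
grouped = {
    key: [r["name"] for r in rows]
    for key, rows in groupby(by_dept, key=itemgetter("dept"))
}

# join: combine rows from two relations on a common key.
floors = {d["dept"]: d["floor"] for d in depts}
joined = [{**r, "floor": floors[r["dept"]]} for r in distinct if r["dept"] in floors]

# limit: keep at most n rows.
limited = ordered[:2]
```

In Pig itself these operators run as distributed jobs over relations; the sketch only mirrors their meaning on a small list.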

Questions around Hive

42) What is “SerDe” in “Hive”?

The “SerDe” interface allows you to instruct “Hive” about how a record should be processed. A “SerDe” is a combination of a “Serializer” and a “Deserializer”. “Hive” uses “SerDe” (and “FileFormat”) to read and write table rows.
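
The serializer/deserializer pairing can be sketched conceptually as follows. This is a toy Python stand-in, not Hive's Java SerDe interface; the class name DelimitedSerDe, its parameters, and the comma delimiter are assumptions made for illustration.

```python
class DelimitedSerDe:
    """Toy stand-in for a SerDe: converts between a stored line and a typed row."""

    def __init__(self, columns, types, delimiter=","):
        self.columns = columns      # column names, in storage order
        self.types = types          # one Python type per column
        self.delimiter = delimiter  # hypothetical field separator

    def deserialize(self, line):
        # Stored bytes/text -> typed row (used when reading a table).
        fields = line.split(self.delimiter)
        return {c: t(v) for c, t, v in zip(self.columns, self.types, fields)}

    def serialize(self, row):
        # Typed row -> stored text (used when writing a table).
        return self.delimiter.join(str(row[c]) for c in self.columns)


serde = DelimitedSerDe(columns=["id", "name"], types=[int, str])
row = serde.deserialize("7,asha")
line = serde.serialize(row)
```

The point of the pairing is that reading and writing share one description of the record layout, so a row written by serialize can be read back by deserialize.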

43) Can the default “Hive Metastore” be used by multiple users (processes) at the same time?

By default, the “Hive Metastore” is backed by an embedded “Derby database”. Multiple users (processes) cannot access it at the same time. It is mainly used to perform unit tests.

44) What is the default location where “Hive” stores table data?

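By default, “Hive” stores table data in HDFS under /user/hive/warehouse, for example hdfs://namenode/user/hive/warehouse. The location is controlled by the hive.metastore.warehouse.dir property.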

45) What is a “generic UDF” in “Hive”?

It is a UDF written in Java to serve a specific need not covered by the existing functions in “Hive”. It can detect the type of its input arguments programmatically and respond appropriately.
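
The idea of inspecting argument types programmatically can be sketched as below. This is a Python analogy, not Hive's Java API: the class SizeOf and its behavior are hypothetical. The initialize/evaluate split loosely mirrors how a generic UDF checks its argument types once and is then called per row.

```python
class SizeOf:
    """Toy analogue of a generic UDF: choose behavior from the argument type."""

    def initialize(self, arg_types):
        # Validate the call once, before any rows are processed.
        if len(arg_types) != 1:
            raise TypeError("size_of expects exactly one argument")
        (arg_type,) = arg_types
        if arg_type in (str, list, dict):
            self._impl = len
        else:
            raise TypeError(f"size_of does not support {arg_type.__name__}")
        return int  # the declared return type

    def evaluate(self, args):
        # Called per row; a NULL input yields a NULL output.
        value = args[0]
        return None if value is None else self._impl(value)


udf = SizeOf()
return_type = udf.initialize([list])
result = udf.evaluate([[1, 2, 3]])
```

Rejecting unsupported types in initialize means a bad call fails before any row is processed instead of partway through the data.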

Question around Oozie

46) How do you configure an “Oozie” job in Hadoop?

“Oozie” is integrated with the rest of the Hadoop stack and supports several types of Hadoop jobs, such as “Java MapReduce”, “Streaming MapReduce”, “Pig”, “Hive” and “Sqoop”. To understand “Oozie” in detail and learn how to configure an “Oozie” job, check out this Edureka blog: http://www.edureka.co/blog/brief-introduction-to-oozie/

Question around Sqoop

47) Explain “Sqoop” in Hadoop.

“Sqoop” is a tool used to transfer data between an RDBMS and HDFS. Using “Sqoop”, data can be imported from an RDBMS (such as MySQL or Oracle) into HDFS and exported from HDFS files back to an RDBMS.
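
As a rough sketch of the two directions, the snippet below assembles the command lines for an import and an export. The helper functions, the JDBC URL, the table names, the user name, and the HDFS paths are all hypothetical; nothing is executed, the commands are only built and printed.

```python
import shlex


def sqoop_import(jdbc_url, table, target_dir, username, mappers=1):
    """Build a 'sqoop import' command: RDBMS table -> HDFS directory."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",  # prompt for the password instead of passing it inline
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
    ]


def sqoop_export(jdbc_url, table, export_dir, username):
    """Build a 'sqoop export' command: HDFS directory -> RDBMS table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "-P",
        "--table", table,
        "--export-dir", export_dir,
    ]


# Hypothetical connection details, for illustration only.
url = "jdbc:mysql://dbhost/sales"
import_cmd = sqoop_import(url, "orders", "/data/orders", "etl_user", mappers=4)
export_cmd = sqoop_export(url, "orders_summary", "/data/orders_summary", "etl_user")

print(" ".join(shlex.quote(a) for a in import_cmd))
print(" ".join(shlex.quote(a) for a in export_cmd))
```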

Questions around HBase

48) Explain “WAL” and “HLog” in “HBase”.

“WAL” (Write Ahead Log) is similar to the MySQL binary log: it records every change made to the data. “HLog” is the HBase class that implements the “WAL”. The log is a standard Hadoop sequence file that stores “HLogKey” entries. These keys consist of a sequence number as well as the actual data, and they are used to replay data that had not yet been persisted when a server crashed. So, in case of server failure, the “WAL” acts as a lifeline for recovering data that would otherwise be lost.
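
The write-ahead idea can be sketched with a toy key-value store. This is not HBase code: the class TinyStore, the JSON line format, and the file name are assumptions made for illustration. Each change is appended to the log with a sequence number before it touches the in-memory state, and the log is replayed on startup.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy store that logs every write before applying it in memory."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.memstore = {}  # in-memory state, lost on a crash
        self.seq = 0        # last sequence number seen
        self._replay()

    def _replay(self):
        # Rebuild in-memory state from the log after a restart.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path) as f:
            for line in f:
                entry = json.loads(line)
                self.memstore[entry["key"]] = entry["value"]
                self.seq = max(self.seq, entry["seq"])

    def put(self, key, value):
        self.seq += 1
        entry = {"seq": self.seq, "key": key, "value": value}
        # 1) Append to the log and force it to disk first.
        with open(self.wal_path, "a") as f:
            f.write(json.dumps(entry) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # 2) Only then apply the change in memory.
        self.memstore[key] = value


wal_file = os.path.join(tempfile.mkdtemp(), "wal.log")
store = TinyStore(wal_file)
store.put("row1", "a")
store.put("row2", "b")
del store  # simulate a crash: in-memory state is gone

recovered = TinyStore(wal_file)  # replay restores both rows
```

Because the log entry is on disk before the in-memory update, any write that was acknowledged can be reconstructed after a crash.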

49) Mention the differences between “HBase” and “Relational Databases”?

[Image: comparison of HBase and relational databases]

Question around Spark

50) Can you build “Spark” with any particular Hadoop version and “Hive”?

Yes, you can build “Spark” for a specific Hadoop version. Check out this Edureka blog to learn more: http://www.edureka.co/blog/yarn-hive-get-electrified-by-spark/
