Sunday, 27 December 2015

FAQ Hadoop -5

41) What are the different relational operations in “Pig Latin”?
They are:
i. foreach
ii. order by
iii. filter
iv. group
v. distinct
vi. join
vii. limit
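
To get a feel for what each operator does, here is a rough Python analogy on small in-memory lists. This is not Pig Latin and does not run on Hadoop; the `employees` and `departments` relations and their fields are made-up sample data used only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample relations: (name, dept, salary) and (dept, city).
employees = [
    ("ann", "eng", 120),
    ("bob", "eng", 100),
    ("cat", "ops", 90),
    ("dan", "ops", 90),
    ("ann", "eng", 120),  # duplicate row
]
departments = [("eng", "Pune"), ("ops", "Delhi")]

# distinct: drop duplicate tuples (first occurrence wins).
distinct = list(dict.fromkeys(employees))

# filter: keep only the rows that satisfy a condition.
filtered = [row for row in distinct if row[2] >= 100]

# foreach: produce a new row from every input row.
projected = [(name, salary * 2) for name, _, salary in distinct]

# order by: sort on a field (salary, highest first).
ordered = sorted(distinct, key=itemgetter(2), reverse=True)

# limit: keep only the first N rows.
limited = ordered[:2]

# group: collect rows that share a key (dept).
by_dept = sorted(distinct, key=itemgetter(1))
grouped = {dept: list(rows) for dept, rows in groupby(by_dept, key=itemgetter(1))}

# join: combine rows from two relations on a matching key.
joined = [e + d for e in distinct for d in departments if e[1] == d[0]]
```

In Pig, each of these steps would be a single statement applied to a relation, and the work would be distributed across the cluster instead of running in one process.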
Questions around Hive
42) What is “SerDe” in “Hive”?
The “SerDe” interface allows you to instruct “Hive” about how a record should be processed. A “SerDe” is a combination of a “Serializer” and a “Deserializer”. “Hive” uses “SerDe” (and “FileFormat”) to read and write table rows.
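
The serializer/deserializer pairing can be sketched in a few lines of Python. This is only a toy model of the idea, not Hive's actual Java interface; the `Row` type, the `DelimitedSerDe` class, and the tab delimiter are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical two-column table row.
    user_id: int
    name: str


class DelimitedSerDe:
    """Toy SerDe: converts between Row objects and delimited text lines."""

    def __init__(self, delimiter: str = "\t"):
        self.delimiter = delimiter

    def serialize(self, row: Row) -> str:
        # Write path: turn a row object into the stored text form.
        return self.delimiter.join([str(row.user_id), row.name])

    def deserialize(self, line: str) -> Row:
        # Read path: parse the stored text back into typed columns.
        user_id, name = line.split(self.delimiter, 1)
        return Row(int(user_id), name)


serde = DelimitedSerDe()
stored = serde.serialize(Row(7, "ann"))
restored = serde.deserialize(stored)
```

The point of the split is that the table definition stays the same while the storage encoding can be swapped by choosing a different SerDe.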
43) Can the default “Hive Metastore” be used by multiple users (processes) at the same time?
The default “Hive Metastore” is an embedded “Derby” database. Multiple users (processes) cannot access it at the same time, so it is mainly used for unit tests.
44) What is the default location where “Hive” stores table data?
By default, “Hive” stores table data in HDFS under /user/hive/warehouse.
45) What is a “generic UDF” in “Hive”?
It is a UDF written in Java to serve a specific need not covered by the existing functions in “Hive”. It can detect the types of its input arguments programmatically and respond appropriately.
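
The "detect the type, then respond" idea can be illustrated with a small Python function. A real generic UDF is a Java class built on Hive's `GenericUDF`, so treat this only as an analogy; the function name `smart_length` and the set of supported types are assumptions for the example.

```python
def smart_length(value):
    """Return a length for several input types, decided at call time."""
    if value is None:
        return None            # NULL in, NULL out
    if isinstance(value, str):
        return len(value)      # number of characters
    if isinstance(value, (list, tuple)):
        return len(value)      # number of elements
    if isinstance(value, dict):
        return len(value)      # number of key/value pairs
    # Reject anything the function was not written to handle.
    raise TypeError(f"unsupported argument type: {type(value).__name__}")
```

A single function name therefore covers strings, arrays, and maps, which is the convenience a generic UDF offers over a UDF tied to one argument type.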
Question around Oozie
46) How do you configure an “Oozie” job in Hadoop?
“Oozie” is integrated with the rest of the Hadoop stack and supports several types of Hadoop jobs, such as “Java MapReduce”, “Streaming MapReduce”, “Pig”, “Hive” and “Sqoop”. To understand “Oozie” in detail and learn how to configure an “Oozie” job, check out this Edureka blog: http://www.edureka.co/blog/brief-introduction-to-oozie/
Question around Sqoop
47) Explain “Sqoop” in Hadoop.
“Sqoop” is a tool used to transfer data between an RDBMS and Hadoop HDFS. Using “Sqoop”, data can be imported from an RDBMS (such as MySQL or Oracle) into HDFS and exported from HDFS files back to an RDBMS.
Questions around HBase
48) Explain “WAL” and “HLog” in “HBase”.
“WAL” (Write Ahead Log) is similar to the MySQL binary log; it records all the changes that occur in the data. “HLog” is the class that implements the “WAL”. The log is a standard Hadoop sequence file that stores “HLogKey” instances. These keys consist of a sequential number as well as the actual data and are used to replay data that was not yet persisted when a server crashed. So, in case of server failure, the “WAL” works as the lifeline and recovers the lost data.
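
The log-then-replay behaviour can be sketched in Python. This is a simplified single-process model, not HBase code; the `TinyStore` class and all of its attribute names are invented for the illustration.

```python
class TinyStore:
    """Toy key/value store with a write-ahead log (hypothetical model)."""

    def __init__(self):
        self.wal = []          # append-only log of (seq, key, value)
        self.memstore = {}     # in-memory edits, lost on a crash
        self.flushed = {}      # data already persisted
        self.flushed_seq = 0   # highest sequence number persisted
        self.next_seq = 1

    def put(self, key, value):
        seq = self.next_seq
        self.next_seq += 1
        self.wal.append((seq, key, value))  # 1. record the change in the log
        self.memstore[key] = value          # 2. apply it in memory

    def flush(self):
        # Persist the in-memory edits and remember how far the log is covered.
        self.flushed.update(self.memstore)
        self.memstore.clear()
        self.flushed_seq = self.next_seq - 1

    def crash(self):
        # Simulate a server failure: everything in memory is gone.
        self.memstore.clear()

    def recover(self):
        # Replay only the log entries that were never persisted.
        for seq, key, value in self.wal:
            if seq > self.flushed_seq:
                self.memstore[key] = value

    def get(self, key):
        return self.memstore.get(key, self.flushed.get(key))
```

The sequence number is what lets recovery skip entries that were already flushed and replay only the ones written after the last flush.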
49) Mention the differences between “HBase” and “Relational Databases”?
Question around Spark
50) Can you build “Spark” with any particular Hadoop version and “Hive”?
Yes, you can build “Spark” for a specific Hadoop version. Check out this Edureka blog to learn more: http://www.edureka.co/blog/yarn-hive-get-electrified-by-spark/

