>

Thursday, 24 December 2015

Functional Programming: Map/Reduce


  • Functional operations do not modify data structures. They always create new data.
  • Original data still exists in unmodified form.
  • Order of operations does not matter 
  • Example: 
 
Map  :

ÄMAP is the name of a higher-order function that applies a given function element-wise to a list of elements and returns a list of results
ÄOriginated in functional programming but common in many languages
ÄExample:
  square x = x * x
          map square [1,2,3,4,5] will return [1,4,9,16.25]
ÄNotice that I can process “map square” in parallel:
map square [1,2,3] -> [1,4,9]
map square [4,5] -> [16,25]
   



Reduce:

REDUCE (also known fold, accumulate, compress or inject) is a family of higher-order functions that iterate an arbitrary function over a data structure in some order and build up a return value. Typically, a REDUCE deals with two things: a combining function and a list of elements of some data structure. The fold then proceeds to combine elements of the data structure using the function in some systematic way.

Example:

MAX (1, 2, 3, 10,15, 20) -> 20

SUM (1, 7, 10) -> 18


In Hadoop REDUCE function always takes MAP function as an input 



ÄSome Examples of map/reduce:
 
Count of URL Access Frequency: The map function processes logs of web page requests and outputs <URL, 1>. The reduce function adds together all values for the same URL and emits a <URL, total count> pair
Reverse Web-Link Graph: The map function outputs <target, source> pairs for each link to a target URL found in a page named "source". The reduce function concatenates the list of all source URLs associated with a given target URL and emits the pair: <target, list(source)>
Count Words in the Document: The map function outputs each word plus an associated count of occurrences ("1" in this case). The reduce function sums together all the counts returned for a particular word. 
Difference between traditional and Mapreduce Programming:






0 comments:

Post a comment