Recently, I started to participate in the Hama project (a distributed scientific package on Hadoop for massive matrix and graph data), and I have taken the times to develop the bulk synchronization parallel (BSP) library on Hadoop (HAMA-195); I’m getting help from Edword Yoon, a founder of Hama project. The motivation of BSP lib is definitely clear.
The hadoop platforms are installed in cloud computing service providers and many companies as you can see in http://wiki.apache.org/hadoop/PoweredBy. However, most of them may use only MapReduce programs. As you know although MapReduce is very scalability, but it provides only the simple programming model. Many programmers want to use more various programming model without changing the platform (i.e., Hadoop). This BSP lib will be the beginning for their desires. However, like MapReduce, BSP may also be not swiss army knife. When we find appropriate applications, BSP lib on Hadoop will be valued for its scalability and ability.
Sooner, I’ll post articles about the progress of BSP library and Angrapa (the graph package on Hama).