My blog got the new domain name Diveintodata.org

The two week has passed since I opened moved this blog to here. Finally, I got a domain name “diveintodata.org” three days ago, and then I set my blog to be published as new domain. Now, I’m happy ๐Ÿ™‚

Anyway, from now I’m going to write at least one articles within a week. The articles will discuss computer science issues, especially newly emerging database issues.


Paper: Graph Twiddling in a MapReduce World

Today, at the lab seminar I presented the paper โ€œGraph Twiddling in a MapReduce Worldโ€ published in IEEE Computing in Science & Engineering. This paper addresses an investigation into the feasibility of decomposion graph operations into a series of MapReduce processes. In this post, Iโ€™m going to discuss this paper briefly.

As I mentioned above, this paper discusses the feasibility of decompositing graph operations into a series of MapReduce processes. As you know, the MapReduce has been gaining attentions in various applications that cope with large-scale datasets. However, to the best of my knowledge there have been no studies for dealing with graphs on MapReduce. This paper proposes several operations as follows:

  • Augmenting Edges with Degrees
  • Simplifying the Graph
  • Enumerating Triangles
  • Enumerating Rectangles
  • Finding Trusses
  • Barycentric Clustering
  • Finding Components

Some operations are performed in combination with other operations. Actually, some of them are very easy problems if they can traverse graphs. However, as the author said, traversing graphs with MapReduce is very inefficient (i.e., causing many MapReduce iterations) because a mapper reads only a record randomly for each map operation. Anyway, all the operations that the paper proposed avoid traversing graphs. Instead, their common pattern in graph algorithms proposed is as follows:

  1. A map operation: Read and process all the edges (or vertex) or changing some piece of edge (or vertex) information. Then, result in records by vertex as key.
  2. A reduce oprtation: For each record obtained from the previous map operation, read and determine the updated state of vertex or edge; emit this information in partially (or locally) updated records. Then, results in them.
  3. A reduce opration: For each record from the previous reduce operation, combine the updates globally and complete updated information.

Discussion

Even though this paper proposes several graph operations, they are still unnatural owing to too many MapReduce iterations; to the best of my knowledge, each MapReduce job’s initializing cost is very expensive. It is because mapper only can read record sequentially.ย The proposed graph operations based on MapReduce will cause the overhead of both MR iteration and communication. As a result, the feasible primitive graph operations with MapReduce are very limited. In addition, there are evidences to show the MapReduce is not suited to graph operations, but I will state them later.

Therefore, I think that a new programming model for graph (or complexity data) are needed. Ideally, the new programming model for graph must support graph traversing. In addition, data are needed to be preserved in locality in regards with their connectivity although data are distributed across a number of data nodes. Actually, basing these ideas Iโ€™m concreting โ€œHamburg: A New Programming Model for Graph Dataโ€ inspired by a blog post โ€œLarge-scale Graph Computing at Googleโ€

References


What is the Common Tag?

์ตœ๊ทผ Common Tag (http://www.commontag.org)๋ผ๋Š” ์ƒˆ๋กœ์šด ํ‚ค์›Œ๋“œ๊ฐ€ ์‹œ๋งจํ‹ฑ ์›น ์ปค๋ฎค๋‹ˆํ‹ฐ์— ๋“ฑ์žฅํ–ˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์‹ค ํƒœ๊ทธ(Tag)๋Š” ์ด๋ฏธ ๋งŽ์ด ์ต์ˆ™ํ•œ ์‹œ์Šคํ…œ์ž…๋‹ˆ๋‹ค. ๊ทธ๋Ÿฐ๋ฐ Common Tag๊ฐ€ ์ตœ๊ทผ ๋งŽ์ด ์–ธ๊ธ‰๋˜์–ด Common Tag๊ฐ€ ๋ฌด์—‡์ธ์ง€ ๊ธฐ์กด ํƒœ๊ทธ์™€ ์–ด๋–ป๊ฒŒ ๋‹ค๋ฅธ์ง€ ๊ด€๋ จ ๊ธ€๋“ค์„ ์ฝ์–ด๋ณด๊ณ  ๊ฐ„๋‹จํžˆ ์ •๋ฆฌํ•ด ๋ณด์•˜์Šต๋‹ˆ๋‹ค.

๊ณต์‹ ์‚ฌ์ดํŠธ์— ์„ค๋ช…๋˜์–ด ์žˆ๋Š” Common Tag๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Common Tag is an open tagging format developed to make content more connected, discoverable and engaging. Unlike free-text tags, Common Tags are references to unique, well-defined concepts, complete with metadata and their own URLs

Common Tag๋Š” ์ปจํ„ด์ธ ๊ฐ„์˜ ์—ฐ๊ฒฐ์„ฑ, ๊ฒ€์ƒ‰ ๊ฐ€๋Šฅ์„ฑ, ์‘์šฉ ํ”„๋กœ๊ทธ๋žจ์— ์˜ํ•œ ํ™œ์šฉ์„ฑ์„ ํ–ฅ์ƒ ์‹œํ‚ค๊ธฐ ์œ„ํ•ด ๊ฐœ๋ฐœ๋œ ๊ณต๊ฐœ ํƒœ๊ทธ ํ˜•ํƒœ์ด๋‹ค. free-text ๊ธฐ๋ฐ˜์˜ ๊ธฐ์กด ํƒœ๊ทธ์™€ ๋‹ฌ๋ฆฌ ๊ณ ์œ ์„ฑ, ์ž˜ ์ •์˜๋œ ๊ฐœ๋…, ๋ฉ”ํƒ€๋ฐ์ดํ„ฐ๋ฅผ ํ†ตํ•œ ์™„์ „์„ฑ, ์ž์ฒด URL์„ ๊ฐ€์ง„๋‹ค.

The Common Tag๊ฐ„๋‹จํžˆ ๋งํ•˜๋ฉด ๊ธฐ์กด ํƒœ๊ทธ๊ฐ€ free-text๊ธฐ๋ฐ˜์œผ๋กœ ์‚ฌ์šฉ์ž๊ฐ€ ์ž์œ ๋กญ๊ฒŒ ์ž…๋ ฅํ•˜๋Š” ํ…์ŠคํŠธ ํ˜•ํƒœ์˜€๋‹ค๋ฉด Common Tag๋Š” ๋ฏธ๋ฆฌ ์ž˜ ์ •์˜๋œ ๊ฐœ๋…์— URI๋ฅผ ๋ถ€์—ฌํ•˜๊ณ  ์ด URI๋ฅผ ํƒœ๊ทธ๋กœ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ๊ทธ ๋™์•ˆ Web 2.0์˜ ์—ฌ๋Ÿฌ ์ปจํ…์ธ ๋“ค์€ ํƒœ๊ทธ(Tag)๋ผ๋Š” free-text ํ˜•ํƒœ์˜ ํ…์ŠคํŠธ์— ์˜ํ•ด ๋ถ„๋ฅ˜๋˜์–ด์กŒ๊ณ  ๊ฒ€์ƒ‰์— ์ด์šฉ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ๊ธฐ์กด ํƒœ๊ทธ๋Š” ์‚ฌ์‹ค ์ œ ์—ญํ• ์„ ํ•˜์ง€ ๋ชปํ–ˆ์Šต๋‹ˆ๋‹ค. ๋™์Œ์ด์˜์–ด, ๋™์˜์ด์Œ์–ด๋กœ ์ธํ•ด ๋ถ„๋ฅ˜๊ฐ€ ์ •ํ™•ํ•˜๊ฒŒ ๋˜์ง€ ๋ชปํ–ˆ๊ณ  ๋”ฐ๋ผ์„œ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋„ ๊ทธ์ € ๊ทธ๋Ÿฐ ํ€„๋ฆฌํ‹ฐ๋ฅผ ๋ณด์—ฌ์คฌ์Šต๋‹ˆ๋‹ค. Common Tag๋Š” ์ด๋Ÿฌํ•œ ๋‹จ์ ์„ ๋ณด์™„ํ•˜๊ธฐ ์œ„ํ•ด ์ œ์•ˆ๋œ ๊ฒƒ์œผ๋กœ ๋ณด์—ฌ์ง‘๋‹ˆ๋‹ค.

Why use Common Tag?

๊ทธ๋Ÿผ Common Tag๋ฅผ ์‚ฌ์šฉํ•˜๊ฒŒ ๋˜๋ฉด ๋ฌด์—‡์ด ์ข‹์•„์งˆ๊นŒ์š”? ๋‹ค์Œ๊ณผ ๊ฐ™์€ ์ด์œ  ๋•Œ๋ฌธ์— ์ปจํ…์ธ  ์ƒ์‚ฐ์ž ๋ฐ ์†Œ๋น„์ž ๊ทธ๋ฆฌ๊ณ  ๊ด€๋ จ ์–ดํ”Œ๋ฆฌ์ผ€์ด์…˜ ๊ฐœ๋ฐœ์ž๋“ค์˜ ํŽธ์˜๊ฐ€ ํ–ฅ์ƒ๋ฉ๋‹ˆ๋‹ค.

  1. Findability๊ฐ€ ํ–ฅ์ƒ๋ฉ๋‹ˆ๋‹ค. ๋ช…ํ™•ํ•œ ์˜๋ฏธ์™€ ํ†ต์ผ๋œ Common Tag๋ฅผ ํ†ตํ•ด ์›ํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ •ํ™•ํ•˜๊ฒŒ ์ฐพ์„ ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
  2. ๋œป์ด ์ž˜ ์ •์˜๋˜์–ด ์žˆ๊ณ  ์œ ์ผํ•œ ํ‚ค๋ฅผ ๊ฐ€์ง€๋Š” Common Tag๋ฅผ ํ†ตํ•ด ์ •๋ณด๋“ค๊ฐ„์— ์—ฐ๊ฒฐ์„ฑ์ด ํ–ฅ์ƒ ๋ฉ๋‹ˆ๋‹ค. ์ฆ‰ ๊ฐ™์€ Common Tag๋ฅผ ๊ฐ€์ง„ ์ปจํ…์ธ  ๋ผ๋ฆฌ๋Š” Common Tag์— ํ•ด๋‹นํ•˜๋Š” URI๋ฅผ ํ†ตํ•ด ์—ฐ๊ฒฐ์„ฑ์„ ๊ฐ€์ง€๊ฒŒ ๋˜๋Š”๊ฑฐ์ฃ . ๊ธฐ์กด ํƒœ๊ทธ๋Š” ๋™์Œ์ด์˜์–ด ๋ฐ ๋™์˜์ด์Œ์–ด๋กœ ์ธํ•ด ์ž˜๋ชป๋œ ์—ฐ๊ฒฐ์„ ๊ฐ€์ง€๊ฑฐ๋‚˜ ์—ฐ๊ฒฐ๋˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ์•˜์Šต๋‹ˆ๋‹ค.
  3. Common Tag๋Š” ๋‹จ์ˆœํ•œ ์ŠคํŠธ๋ง(string)์ด ์•„๋‹Œ URI์— ์˜ํ•ด ์‹๋ณ„ ๋ฐ ์ฐธ์กฐ๋˜์–ด์ง€๋Š”๋ฐ ํ”„๋กœ๊ทธ๋žจ์ด ์ฒ˜๋ฆฌํ•˜๊ธฐ๊ฐ€ ์ˆ˜์›”ํ•ด ์ง‘๋‹ˆ๋‹ค. ๋˜ํ•œ ๋™์Œ์ด์˜์–ด์˜ ๊ฒฝ์šฐ ํ”„๋กœ๊ทธ๋žจ๋“ค์€ ์‚ฌ์‹ค ๊ตฌ๋ณ„ํ•˜๋Š” ๊ฒƒ์ด ๋ถˆ๊ฐ€๋Šฅํ•œ๋ฐ Common Tag์˜ ๊ฒฝ์šฐ๋Š” URI๋ฅผ ํ†ตํ•ด ์‹๋ณ„๋˜๊ธฐ ๋•Œ๋ฌธ์— ์ด๋Ÿฐ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

How Can We Make use of Common Tag?

์›น ๋ฌธ์„œ(HTML document) ์ž์ฒด๋ฅผ Common Tag๋ฅผ ํ†ตํ•ด ํƒœ๊น…ํ•  ์ˆ˜ ๋„ ์žˆ์œผ๋ฉฐ ๋ฌธ์„œ์˜ ํŠน์ • ์„น์…˜, ํŠน์ • ๋‹จ์–ด ๋ฐ ๋ฏธ๋””์–ด ํŒŒ์ผ์— ํƒœ๊น…ํ•  ์ˆ˜ ๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฌ๋‹ด์ด์ง€๋งŒ ์• ์ดˆ HTML์ด ํ‘œํ˜„์„ ์œ„์ฃผ๋กœ ์„ค๊ณ„๋˜์—ˆ์—ˆ๋Š”๋ฐ ์ด์™€ ๊ฐ™์ด ์‹œ๋งจํ‹ฑ ๋ฐ์ดํ„ฐ๊ฐ€ HTML์•ˆ์— ์‚ฝ์ž…๋  ์ˆ˜ ์žˆ๋Š” ๊ฒƒ์€ RDFa ๋•๋ถ„์ž…๋‹ˆ๋‹ค. RDFa๋Š” HTML์— RDF๋ฅผ embeding ํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•˜๋Š” W3C์˜ Recommendation ์ž…๋‹ˆ๋‹ค. RDFa๋‚˜ RDF์— ๋Œ€ํ•ด์„œ๋Š” ์ถ”ํ›„์— ๋˜ ๋‹ค๋ฃจ๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

ํ•œ๊ฐ€์ง€ ์˜ˆ๋กœ Common Tag๋ฅผ ํ†ตํ•ด HTML์˜ anchor text์— ๋‹ค์Œ๊ณผ ๊ฐ™์ด ํƒœ๊น…ํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค. ์ด์™ธ์˜ ํ™œ์šฉ์— ๋Œ€ํ•ด์„œ๋Š” Common Taggingโ€™s QuickStartGuide ์„ ์ฐธ๊ณ ํ•˜์„ธ์š”.

<div xmlns:ctag="http://commontag.org/ns#" rel="ctag:tagged">
   NASA's <a typeof="ctag:Tag" rel="ctag:means"
               href="http://rdf.freebase.com/ns/en.phoenix_mars_mission"
               property="ctag:label">Phoenix Mars Lander</a> has deployed its robotic arm.
</div>

์œ„์™€ ๊ฐ™์ด ๊ธฐ์กด HTML์— ์™ ๋“ค์–ด๊ฐˆ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฐ ๊ธฐ์ˆ ์ ์ธ ๋ถ€๋ถ„๋งŒ ๋ณด๋ฉด ์›น ๋ฌธ์„œ์— Common Tag๊ฐ€ ์•„์ฃผ ์‰ฝ๊ฒŒ ์ ์šฉ๋  ์ˆ˜ ์žˆ์„ ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ Common Tag๊ฐ€ ํƒœ๊น…๋œ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด์„œ๋Š” ์•„์ฃผ ์œ ์šฉํ•˜๊ฒŒ ์“ฐ์ผ ์ˆ˜ ์žˆ์œผ๋‚˜ ํƒœ๊น…ํ•  ๋•Œ ๊ฒฐ๊ตญ์— ์ž์‹ ์ด ๋‚˜ํƒ€๋‚ด๊ณ  ์‹ถ์€ ๋œป์„ ๊ฐ€์ง„ Common Tag๋ฅผ ๊ณจ๋ผ๋‚ด๋Š” ์ฆ‰ ์‚ฌ๋žŒ์˜ ์†์„ ๊ฑฐ์ณ์•ผ ํ•œ๋‹ค๋Š”๊ฒŒ ๋ถˆํŽธํ•จ์œผ๋กœ ์ž‘์šฉํ•  ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค. ์‚ฌ์‹ค ํ˜„์žฌ์˜ ๊ธฐ์ˆ ์ˆ˜์ค€์œผ๋กœ๋Š” ๋ถˆ๊ฐ€ํ”ผํ•œ ๋ฌธ์ œ์ธ๋ฐ ์›น ์ €์ž‘ํˆด๋“ค์˜ Common Tag ์ž๋™์™„์„ฑ ๊ธฐ๋Šฅ์ด๋ผ๋˜์ง€ ๊ฒ€์ƒ‰๊ธฐ๋Šฅ์œผ๋กœ ์ปค๋ฒ„๋˜์–ด์•ผ ํ•  ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Conclusion

๊ฒฐ๊ณผ์ ์œผ๋กœ๋Š” RDFa์™€ ๋”๋ถˆ์–ด Common Tag๊ฐ€ ๋งŽ์ด ๋ณด๊ธ‰๋˜์–ด ์›น ์ €์ž‘ํˆด ๋ฐ ์›น ์–ดํ”Œ๋ฆฌ์ผ€์ด์…˜๋“ค์ด ์ด๋“ค์„ ์ž˜ ์ง€์›ํ•˜๊ฒŒ ๋˜๋ฉด ์‹œ๋งจํ‹ฑ ๋ฐ์ดํ„ฐ๋Š” ๋”์šฑ ํ’๋ถ€ํ•ด์งˆ ๊ฒƒ์œผ๋กœ ์˜ˆ์ƒ๋˜๊ณ ์š”. ๋˜ํ•œ Common Tag๋ฅผ ํ†ตํ•ด ๋ฌธ์„œ ๋˜๋Š” ๋ฌธ์„œ์˜ ์ผ๋ถ€๋ถ„์— ์ธ๋ฌผ, ์‚ฌ๋ฌผ, ์ง€๋ฆฌ ์ •๋ณด ๋ฐ ์ถ”์ƒ์  ๊ฐœ๋…์„ ์ •ํ™•ํ•˜๊ฒŒ ํƒœ๊น…ํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋˜๊ณ  ์ด๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํฅ๋ฏธ๋กœ์šด ์–ดํ”Œ๋ฆฌ์ผ€์ด์…˜์ด ์Ÿ์•„์งˆ ๊ฒƒ์œผ๋กœ ๊ธฐ๋Œ€๋ฉ๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ธ€ ์ค‘ ํ‹€๋ฆฐ ๋‚ด์šฉ์ด ์žˆ์œผ๋ฉด ์ง€์  ๋ถ€ํƒ ๋“œ๋ฆฝ๋‹ˆ๋‹ค.

See Also: