# A Brief Summary of Independent Set in Graph Theory

**Posted:** April 24, 2010

**Filed under:** Research

**Tags:** coloring problem, dominating set, graph, graph coloring, independent set, maximal independent set, maximum independent set, mis

### Graph Basics

Let *G* = (*V*, *E*) be an undirected graph, where *V* is a set of vertices and *E* is a set of edges. Every edge *e* in *E* consists of two vertices in *V*, and it is said to connect, join, or link those two vertices (its end points).

### Independent Set

An independent set *S* is a subset of *V* in *G* such that no two vertices in *S* are adjacent. I suppose the name means that the vertices in *S* are independent of the edge set of *G*: no edge joins any two of them. Like other vertex sets in graph theory, independent sets come in maximal and maximum variants, as follows:

- The independent set *S* is **maximal** if *S* is not a proper subset of any independent set of *G*.
- The independent set *S* is **maximum** if there is no other independent set that has more vertices than *S*.

That is, a largest maximal independent set is called a maximum independent set. The maximum independent set problem is an NP-hard optimization problem.

Every graph has at least one independent set, since the empty set is trivially independent. The independence number *α*(*G*) of a graph *G* is the cardinality of a maximum independent set.
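To make these definitions concrete, here is a minimal Java sketch that checks independence and maximality and computes *α*(*G*) by brute force. The class name `IndependentSets`, its method names, and the adjacency-matrix representation are my own assumptions for illustration. The brute-force search tries every subset of vertices, so it is only usable on tiny graphs, which is in line with the problem being NP-hard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper (hypothetical names) for independent sets in a small undirected graph. */
class IndependentSets {

    /** Builds a symmetric adjacency matrix for n vertices labelled 0..n-1. */
    static boolean[][] adjacency(int n, int[][] edges) {
        boolean[][] adj = new boolean[n][n];
        for (int[] e : edges) {
            adj[e[0]][e[1]] = true;
            adj[e[1]][e[0]] = true;
        }
        return adj;
    }

    /** True if no two vertices in s are adjacent. */
    static boolean isIndependent(boolean[][] adj, List<Integer> s) {
        for (int i = 0; i < s.size(); i++) {
            for (int j = i + 1; j < s.size(); j++) {
                if (adj[s.get(i)][s.get(j)]) {
                    return false;
                }
            }
        }
        return true;
    }

    /** True if s is independent and no vertex outside s can be added to it. */
    static boolean isMaximal(boolean[][] adj, List<Integer> s) {
        if (!isIndependent(adj, s)) {
            return false;
        }
        for (int v = 0; v < adj.length; v++) {
            if (s.contains(v)) {
                continue;
            }
            List<Integer> bigger = new ArrayList<>(s);
            bigger.add(v);
            if (isIndependent(adj, bigger)) {
                return false;
            }
        }
        return true;
    }

    /** Independence number alpha(G), found by trying every vertex subset (exponential time). */
    static int independenceNumber(boolean[][] adj) {
        int n = adj.length;
        int best = 0;
        for (int mask = 0; mask < (1 << n); mask++) {
            List<Integer> s = new ArrayList<>();
            for (int v = 0; v < n; v++) {
                if ((mask & (1 << v)) != 0) {
                    s.add(v);
                }
            }
            if (s.size() > best && isIndependent(adj, s)) {
                best = s.size();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Star graph: center 0 joined to leaves 1, 2, 3.
        boolean[][] adj = adjacency(4, new int[][] {{0, 1}, {0, 2}, {0, 3}});
        System.out.println(isMaximal(adj, Arrays.asList(0)));    // true: maximal, but not maximum
        System.out.println(isMaximal(adj, Arrays.asList(1, 2))); // false: leaf 3 can still be added
        System.out.println(independenceNumber(adj));             // 3: the three leaves
    }
}
```

The star graph shows why maximal and maximum differ: the center alone is a maximal independent set of size 1, while the three leaves form the maximum independent set.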

### Relations to Dominating Sets

- A dominating set in a graph *G* is a subset *D* of *V* such that every vertex not in *D* is joined to at least one member of *D* by some edge.
  - In other words, a vertex set *D* is a dominating set in *G* if and only if every vertex in *G* is contained in *D* or is adjacent to a vertex in *D*.
- Every maximal independent set *S* of vertices in a simple graph *G* has the property that every vertex of the graph either is contained in *S* or is adjacent to a vertex in *S*.
  - That is, an independent set is a dominating set if and only if it is a maximal independent set.
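The relation above can be checked on a small example. The following is a minimal Java sketch, not a library API: the class `DominatingSetDemo` and its methods are hypothetical names chosen for this illustration. It grows a maximal independent set greedily on a 5-cycle and tests whether the result dominates the graph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative check (hypothetical names) that a maximal independent set is dominating. */
class DominatingSetDemo {

    /** Adjacency matrix of the cycle 0 - 1 - ... - (n-1) - 0, for n >= 3. */
    static boolean[][] cycle(int n) {
        boolean[][] adj = new boolean[n][n];
        for (int v = 0; v < n; v++) {
            int next = (v + 1) % n;
            adj[v][next] = true;
            adj[next][v] = true;
        }
        return adj;
    }

    /** True if every vertex is in d or adjacent to some vertex in d. */
    static boolean isDominating(boolean[][] adj, List<Integer> d) {
        for (int v = 0; v < adj.length; v++) {
            boolean covered = d.contains(v);
            for (int u : d) {
                if (adj[v][u]) {
                    covered = true;
                }
            }
            if (!covered) {
                return false;
            }
        }
        return true;
    }

    /** Grows a maximal independent set by scanning the vertices in index order. */
    static List<Integer> greedyMaximalIndependentSet(boolean[][] adj) {
        List<Integer> s = new ArrayList<>();
        for (int v = 0; v < adj.length; v++) {
            boolean conflict = false;
            for (int u : s) {
                if (adj[v][u]) {
                    conflict = true;
                }
            }
            if (!conflict) {
                s.add(v);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        boolean[][] adj = cycle(5);
        List<Integer> maximal = greedyMaximalIndependentSet(adj);
        System.out.println(maximal + " dominating: " + isDominating(adj, maximal)); // [0, 2] dominating: true
        // {0} is independent but not maximal, and it fails to dominate vertices 2 and 3.
        System.out.println("[0] dominating: " + isDominating(adj, Arrays.asList(0))); // [0] dominating: false
    }
}
```

The greedy scan only guarantees a maximal set, not a maximum one; that is enough here because maximality is what the domination property depends on.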

### Relations to Graph Coloring

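- The independent set problem is related to the graph coloring problem because all vertices in an independent set can share the same color; a proper coloring partitions the vertices into independent sets, one per color.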
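As a small illustration of this link, the sketch below colors a 5-cycle greedily and then groups the vertices by color; each group is an independent set. The class `ColoringDemo` and its methods are hypothetical names of my own, and greedy coloring is just one simple heuristic: it always yields a proper coloring but does not in general use the minimum number of colors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative greedy coloring (hypothetical names); every color class is an independent set. */
class ColoringDemo {

    /** Adjacency matrix of the cycle 0 - 1 - ... - (n-1) - 0, for n >= 3. */
    static boolean[][] cycle(int n) {
        boolean[][] adj = new boolean[n][n];
        for (int v = 0; v < n; v++) {
            int next = (v + 1) % n;
            adj[v][next] = true;
            adj[next][v] = true;
        }
        return adj;
    }

    /** Gives each vertex the smallest color not used by its already-colored neighbors. */
    static int[] greedyColoring(boolean[][] adj) {
        int n = adj.length;
        int[] color = new int[n];
        Arrays.fill(color, -1); // -1 marks "not colored yet"
        for (int v = 0; v < n; v++) {
            boolean[] used = new boolean[n];
            for (int u = 0; u < n; u++) {
                if (adj[v][u] && color[u] >= 0) {
                    used[color[u]] = true;
                }
            }
            int c = 0;
            while (used[c]) {
                c++;
            }
            color[v] = c;
        }
        return color;
    }

    /** Groups the vertices by color. */
    static List<List<Integer>> colorClasses(int[] color) {
        List<List<Integer>> classes = new ArrayList<>();
        for (int v = 0; v < color.length; v++) {
            while (classes.size() <= color[v]) {
                classes.add(new ArrayList<>());
            }
            classes.get(color[v]).add(v);
        }
        return classes;
    }

    public static void main(String[] args) {
        int[] color = greedyColoring(cycle(5));
        System.out.println(Arrays.toString(color)); // [0, 1, 0, 1, 2]
        System.out.println(colorClasses(color));    // [[0, 2], [1, 3], [4]]
    }
}
```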

### References

- Chapter 10, Graph Theory: Modeling, Applications, and Algorithms
- http://en.wikipedia.org/wiki/Independent_set_(graph_theory)
- http://en.wikipedia.org/wiki/Dominating_set

# Data-Intensive Text Processing with MapReduce Draft Available Online

**Posted:** March 11, 2010

**Filed under:** News, Research

**Tags:** data intensive, graph, hadoop, mapreduce, text processing

Data-Intensive Text Processing with MapReduce, Jimmy Lin and Chris Dyer

To my knowledge, there have been no books that deal directly with MapReduce programming and algorithms. This book covers topics ranging from MapReduce algorithm design to EM algorithms for text processing. Although the book is still a draft, it seems well organized and very interesting. In addition, it contains some basic graph algorithms using MapReduce.

# Java Universal Network/Graph Framework

**Posted:** September 15, 2009

**Filed under:** Research

**Tags:** graph, java, jung, visualization tools

Recently, I have been primarily concerned with large-scale graph data processing. Occasionally, visualizing a graph is a good way to observe properties of a graph data set. Today, I’m going to introduce a graph framework called *Java Universal Network/Graph Framework (Jung)*. Jung provides graph data structures, a programming interface that maps naturally onto graph concepts, some fundamental graph algorithms (e.g., minimum spanning tree, depth-first search, breadth-first search, and Dijkstra’s algorithm), and even visualization methods. I’m especially interested in its visualization methods.

The following Java source shows Jung’s programming interface. In more detail, this program makes a graph, adds three vertices to it, and connects them with edges. The source code is taken from the Jung tutorial. As you can see, Jung’s APIs are very easy to use.

```java
// Imports assume the JUNG 2.x package layout.
import java.awt.Dimension;
import javax.swing.JFrame;

import edu.uci.ics.jung.algorithms.layout.KKLayout;
import edu.uci.ics.jung.algorithms.layout.Layout;
import edu.uci.ics.jung.graph.Graph;
import edu.uci.ics.jung.graph.SparseMultigraph;
import edu.uci.ics.jung.graph.util.EdgeType;
import edu.uci.ics.jung.visualization.BasicVisualizationServer;
import edu.uci.ics.jung.visualization.decorators.ToStringLabeller;

public class SimpleGraphView {
    public static void main(String[] args) {
        // Make a graph by a SparseMultigraph instance.
        Graph<Integer, String> g = new SparseMultigraph<Integer, String>();
        g.addVertex((Integer) 1); // Add a vertex with an integer 1
        g.addVertex((Integer) 2);
        g.addVertex((Integer) 3);
        g.addEdge("Edge-A", 1, 3); // Add an edge connecting vertices 1 and 3
        g.addEdge("Edge-B", 2, 3, EdgeType.DIRECTED);
        g.addEdge("Edge-C", 3, 2, EdgeType.DIRECTED);
        g.addEdge("Edge-P", 2, 3); // A parallel edge

        // Make some objects for graph layout and visualization.
        Layout<Integer, String> layout = new KKLayout<Integer, String>(g);
        BasicVisualizationServer<Integer, String> vv =
                new BasicVisualizationServer<Integer, String>(layout);
        vv.setPreferredSize(new Dimension(800, 800));

        // Determines how each vertex and its value are labelled in the diagram.
        ToStringLabeller<Integer> vertexPaint = new ToStringLabeller<Integer>() {
            public String transform(Integer i) {
                return "" + i;
            }
        };
        vv.getRenderContext().setVertexLabelTransformer(vertexPaint);

        JFrame frame = new JFrame("Simple Graph View");
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.getContentPane().add(vv);
        frame.pack();
        frame.setVisible(true);
    }
}
```

Some of Jung’s APIs are based on generic programming, so vertices and edges can easily carry user-defined data. For more detailed information, visit http://jung.sourceforge.net.

Running the source code above opens a window that shows the graph as a diagram.

# Paper: Graph Twiddling in a MapReduce World

**Posted:** July 17, 2009

**Filed under:** Research

**Tags:** graph, graph cluster, map-reduce, scalable computing

Today, at the lab seminar, I presented the paper “Graph Twiddling in a MapReduce World”, published in IEEE Computing in Science & Engineering. The paper investigates the feasibility of decomposing graph operations into a series of MapReduce processes. In this post, I’m going to discuss it briefly.

As I mentioned above, this paper discusses the feasibility of decomposing graph operations into a series of MapReduce processes. As you know, MapReduce has been gaining attention in various applications that cope with large-scale datasets. However, to the best of my knowledge, there have been no prior studies on processing graphs with MapReduce. This paper proposes the following operations:

- Augmenting Edges with Degrees
- Simplifying the Graph
- Enumerating Triangles
- Enumerating Rectangles
- Finding Trusses
- Barycentric Clustering
- Finding Components

Some operations are built by combining others. Several of them would be easy problems if the algorithm could traverse the graph. However, as the author points out, traversing a graph with MapReduce is very inefficient (it takes many MapReduce iterations) because each map operation sees only a single record, in no particular order. All the operations proposed in the paper therefore avoid graph traversal. Instead, they share the following common pattern:

- A map operation: Read and process all the edges (or vertices), possibly changing some piece of edge (or vertex) information. Then emit the resulting records with the vertex as the key.
- A reduce operation: For each record obtained from the previous map operation, read and determine the updated state of the vertex or edge, and emit this information as partially (or locally) updated records.
- A reduce operation: For each record from the previous reduce operation, combine the updates globally to complete the updated information.
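To make this pattern easier to picture, here is a minimal in-memory sketch in plain Java that follows the three steps for the first operation in the list, augmenting edges with degrees. It is my own simulation, not the paper's code and not the Hadoop API: the class `DegreeMapReduceSketch` and its methods are hypothetical, grouping records in a `TreeMap` stands in for the shuffle between phases, and the input is assumed to be a simple graph (no self-loops or duplicate edges) with non-negative integer vertex IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory simulation (hypothetical names) of the map / reduce / reduce pattern. */
class DegreeMapReduceSketch {

    /** Map: emit every edge once under each of its endpoints, keyed by vertex. */
    static Map<Integer, List<int[]>> mapByVertex(int[][] edges) {
        Map<Integer, List<int[]>> byVertex = new TreeMap<>();
        for (int[] e : edges) {
            byVertex.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e);
            byVertex.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e);
        }
        return byVertex;
    }

    /** First reduce: per vertex, count its edges and emit a partial record {vertex, degree} per edge. */
    static Map<String, List<int[]>> reduceDegrees(Map<Integer, List<int[]>> byVertex) {
        Map<String, List<int[]>> byEdge = new TreeMap<>();
        for (Map.Entry<Integer, List<int[]>> entry : byVertex.entrySet()) {
            int vertex = entry.getKey();
            int degree = entry.getValue().size();
            for (int[] e : entry.getValue()) {
                String edgeKey = e[0] + "-" + e[1];
                byEdge.computeIfAbsent(edgeKey, k -> new ArrayList<>())
                      .add(new int[] {vertex, degree});
            }
        }
        return byEdge;
    }

    /** Second reduce: per edge "u-v", merge its two partial records into {deg(u), deg(v)}. */
    static Map<String, int[]> combinePerEdge(Map<String, List<int[]>> byEdge) {
        Map<String, int[]> result = new TreeMap<>();
        for (Map.Entry<String, List<int[]>> entry : byEdge.entrySet()) {
            int u = Integer.parseInt(entry.getKey().split("-")[0]);
            int[] degrees = new int[2];
            for (int[] partial : entry.getValue()) {
                degrees[partial[0] == u ? 0 : 1] = partial[1];
            }
            result.put(entry.getKey(), degrees);
        }
        return result;
    }

    /** Runs the three steps in sequence. */
    static Map<String, int[]> run(int[][] edges) {
        return combinePerEdge(reduceDegrees(mapByVertex(edges)));
    }

    public static void main(String[] args) {
        // A triangle 0-1-2 with an extra edge from 2 to 3.
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}, {2, 3}};
        for (Map.Entry<String, int[]> entry : run(edges).entrySet()) {
            System.out.println(entry.getKey() + " -> " + Arrays.toString(entry.getValue()));
        }
    }
}
```

Note that no step walks from a vertex to its neighbors; each one only regroups records by a key, which is the point of the pattern.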

### Discussion

Although this paper proposes several graph operations, they still feel unnatural because they need too many MapReduce iterations, and, to the best of my knowledge, the startup cost of each MapReduce job is high. The iterations are needed because a mapper can only read records sequentially. The proposed operations therefore incur both iteration overhead and communication overhead. As a result, the set of primitive graph operations that are feasible with MapReduce is very limited. There is also further evidence that MapReduce is not well suited to graph operations, but I will cover it later.

Therefore, I think a new programming model for graph (or other complex) data is needed. Ideally, the new model should support graph traversal. In addition, data should be kept local with respect to their connectivity, even though they are distributed across a number of data nodes. Based on these ideas, I am fleshing out “Hamburg: A New Programming Model for Graph Data”, inspired by the blog post “Large-scale Graph Computing at Google”.

### References

- Jonathan Cohen, “Graph Twiddling in a MapReduce World”, IEEE Computing in Science & Engineering, Volume 11, Issue 4, pp. 29–41, July-Aug. 2009.
- Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”, OSDI’04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December, 2004.
- Google Cluster Computing and MapReduce Lecture 5
- Breadth-first graph search using an iterative map-reduce algorithm
- Hamburg, Hadoop Wiki

# RDF, SPARQL, and TURTLE

**Posted:** October 2, 2008

**Filed under:** Research

**Tags:** graph, graph database, rdf, sparql, triple, turtle, xml

Things I need to quickly organize and summarize once this week's wedding is over:

- Resource Description Framework (RDF)
- SPARQL Query Language for RDF
- Turtle – Terse RDF Triple Language