Community search


Discovering communities in a network, known as community detection or community discovery, is a fundamental problem in network science that has attracted much attention over the past several decades. In recent years, with the surge of research on big data, a related but different problem called community search has attracted great attention from both academia and industry. Community search aims to find the most likely community that contains a given query node; it is a query-dependent variant of the community detection problem. A detailed survey reviewing recent studies on community search is also available.

Main advantages

As pointed out by the first work on community search, published in SIGKDD 2010, many existing community detection/discovery methods address the static community detection problem, in which the graph is partitioned a priori with no reference to query nodes. Community search, in contrast, focuses on the most likely communities containing the query vertex. The main advantages of community search over community detection/discovery are listed below:
High personalization. Community detection/discovery often uses the same global criterion to decide whether a subgraph qualifies as a community. In other words, the criterion is fixed and predetermined. In reality, however, communities for different vertices may have very different characteristics. Community search allows users to specify more personalized query conditions, and these conditions also make the returned communities easier to interpret.
For example, a recent work focuses on attributed graphs, in which nodes are associated with attributes such as keywords, and tries to find communities, called attributed communities, that exhibit both strong structural cohesiveness and keyword cohesiveness. Users can specify a query node together with other query conditions: a value k, the minimum degree for the expected communities; and a set of keywords, which controls the semantics of the expected communities. The returned communities can be easily interpreted through the keywords shared by all community members. More details can be found in that work.
High efficiency. With the rapid growth of social networks in recent years, many real-world graphs have become very large. For example, social networks such as Facebook and Twitter have hundreds of millions to billions of users. Since community detection/discovery often finds all the communities in an entire social network, it can be very costly and time-consuming. In contrast, community search often works on a subgraph, which is much more efficient. Moreover, detecting all the communities in an entire social network is often unnecessary. In real applications such as recommendation and social media marketing, people often focus on the communities they are really interested in, rather than on all the communities.
Some recent studies have shown that, on million-scale graphs, community search often takes less than one second to find a well-defined community, which is generally much faster than many existing community detection/discovery methods. This also implies that community search is more suitable for finding communities in big graphs.
Support for dynamically evolving graphs. Almost all real-life graphs evolve over time. Since community detection methods often use the same global criterion to find communities, they are not sensitive to updates of nodes and edges in the graph. In other words, the detected communities may lose freshness after a short period of time. In contrast, community search handles this easily, since it searches for communities in an online manner, based on a query request.
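The attributed query described under high personalization (a query node, a minimum degree k, and a set of keywords) can be illustrated with a minimal Python sketch. It is a simplification and not the algorithm of the cited work: it assumes that every community member must carry all the query keywords, and the function name attributed_community, the adjacency dictionary adj, and the keyword dictionary keywords are hypothetical names introduced only for this illustration.

```python
def attributed_community(adj, keywords, q, k, query_keywords):
    """Simplified attributed community query (illustrative assumption only).

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    keywords: dict mapping each vertex to its set of keywords.
    Returns the connected set of vertices around q in which every vertex
    carries all query keywords and has at least k neighbours in the set,
    or an empty set if q cannot belong to such a community.
    """
    required = set(query_keywords)
    # Keyword cohesiveness: keep only vertices that carry every query keyword.
    alive = {v for v in adj if required <= keywords.get(v, set())}

    # Structure cohesiveness: repeatedly drop vertices that have fewer
    # than k surviving neighbours, until nothing changes.
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if sum(1 for u in adj[v] if u in alive) < k:
                alive.discard(v)
                changed = True

    if q not in alive:
        return set()

    # Collect the surviving vertices reachable from the query vertex.
    community, stack = {q}, [q]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in alive and u not in community:
                community.add(u)
                stack.append(u)
    return community


# Toy graph: four mutually connected vertices; "c" lacks the keyword "ml".
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
}
keywords = {
    "a": {"db", "ml"},
    "b": {"db", "ml"},
    "c": {"db"},
    "d": {"db", "ml"},
}

print(sorted(attributed_community(adj, keywords, "a", 2, {"ml"})))
```

In this toy graph, the query with keyword "ml" and k = 2 excludes vertex "c", and the remaining members are interpretable as the vertices around "a" that share "ml". Raising k to 3 with the same keyword returns an empty set, because no vertex keeps three qualifying neighbours.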

Metrics for community search

Community search often uses well-defined, fundamental graph metrics to formulate the cohesiveness of communities. Commonly used metrics include k-core, k-truss, and k-edge-connected subgraphs. Among these measures, the k-core metric is the most popular, and it has been used in many recent studies.
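As an illustration of the k-core metric, the following Python sketch answers a basic query: given a query vertex q and an integer k, it returns the connected part of the k-core (the largest subgraph in which every vertex has at least k neighbours) that contains q. It is a sketch under stated assumptions rather than any particular published algorithm: it peels the whole graph instead of exploring only a local region, and the function name k_core_community and the adjacency dictionary graph are hypothetical.

```python
from collections import deque


def k_core_community(graph, q, k):
    """Return the connected k-core component containing q, or an empty set.

    graph: dict mapping each vertex to the set of its neighbours (undirected).
    """
    # Track current degrees separately so the input graph is not modified.
    degree = {v: len(nbrs) for v, nbrs in graph.items()}
    removed = set()

    # Peel: repeatedly remove vertices whose degree has fallen below k.
    queue = deque(v for v, d in degree.items() if d < k)
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        for u in graph[v]:
            if u not in removed:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)

    if q not in graph or q in removed:
        return set()

    # Breadth-first search from q over the vertices that survived peeling.
    community = {q}
    frontier = deque([q])
    while frontier:
        v = frontier.popleft()
        for u in graph[v]:
            if u not in removed and u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Toy graph: a 4-clique {a, b, c, d} with a path d - e - f attached.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}

print(sorted(k_core_community(graph, "a", 3)))  # ['a', 'b', 'c', 'd']
```

With k = 3 the path vertices "e" and "f" are peeled away, so a query for "a" returns the clique, while a query for "f" returns an empty set. With k = 1 the whole graph qualifies.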