ArtiClarity: Learning with Hypergraphs - An Old Article with Novel Ideas

Shayan Fazeli
5 min read · Nov 10, 2019

In this article, we are going to go over the work in [1], which discusses how hypergraphs can better model real-world problems and how a well-known graph partitioning algorithm can be generalized to this broader definition of graphs.

Motivation

In typical graph classification problems, there is an underlying assumption that the relationships carrying the sufficient statistics for our overall inferences can be modeled as pairwise relationships (edges) between entities represented by vertices in a graph. However, real-world relationships more often than not turn out to be more complex than pairwise interactions; the notion of a hypergraph is therefore more inclusive and can model a wider range of problems.

For ordinary graphs, there is the well-known Normalized Cut algorithm, introduced in [2] for image segmentation. In layman's terms, it treats the pixels of an image as entities connected to their adjacent neighbors, then partitions the resulting graph and outputs the clusters. The idea behind graph-based segmentation and clustering is to look for clusters with weak inter-connections and strong intra-connections: clusters should be far enough apart (connected only by sparse, low-weight edges) that cutting the edges between them does not incur a large loss, and each cluster should be densely connected internally so that it is meaningful to call it a cluster.

Again, as the paper points out, it is somewhat unrealistic to assume that pairwise connections suffice to model most problems. For example, in the problem of classifying research papers based on their authors, an author is usually associated with a set of articles; representing this with a hypergraph is therefore more natural and inclusive.

What is a Hypergraph?

Consider the usual definition of a graph, denoted G = (V, E), where V is the set of vertices and E is the set of edges, each edge being a tuple (u, v). In this definition, an edge identifies the two vertices it connects. The only thing that needs to change to arrive at the definition of a hypergraph is the definition of an edge: each hyperedge is now associated with a set of vertices (think of the author-paper problem above; this makes the model feel more natural). As with ordinary graphs, positive weights can be attached to hyperedges to encode prior information and regulate the pathways along which information flows.
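To make the definition concrete, here is a minimal sketch of how the author-paper hypergraph can be stored as a vertex-by-hyperedge incidence matrix, the representation the paper builds on. The paper and author names below are made up purely for illustration:

```python
import numpy as np

# Toy author-paper hypergraph: vertices are papers, and each hyperedge
# groups the papers that share one author.
papers = ["p1", "p2", "p3", "p4"]
author_edges = {
    "alice": {"p1", "p2", "p3"},  # a single hyperedge can join many vertices
    "bob": {"p3", "p4"},
}

# Incidence matrix H: H[v, e] = 1 if vertex v belongs to hyperedge e.
H = np.zeros((len(papers), len(author_edges)), dtype=int)
for j, members in enumerate(author_edges.values()):
    for i, p in enumerate(papers):
        if p in members:
            H[i, j] = 1

print(H)
```

Note that paper "p3" sits in both hyperedges, which an ordinary graph edge could not express without losing the grouping information.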

What is Spectral Clustering?

To better understand the fun concept of spectral clustering, I would refer you to this Medium article [3]. Reading it is strongly recommended, since notions such as the adjacency matrix, degree matrix, and Laplacian are all explained there for ordinary graphs. Those definitions are generalized to hypergraphs in the work discussed here.

Roughly speaking, spectral clustering is an approach that relies on the eigenvalues and eigenvectors of a matrix derived from the graph to find clusters of nodes based on how they are connected.
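As a minimal sketch of the ordinary-graph case (a made-up toy graph, not an example from the paper): two triangles joined by a single weak edge should split into two clusters, and the sign pattern of the Laplacian's second-smallest eigenvector (the Fiedler vector) recovers exactly that split.

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single bridge edge 2--3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],   # node 2 -- node 3 is the weak bridge
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

D = np.diag(A.sum(axis=1))      # degree matrix
L = D - A                       # unnormalized graph Laplacian
vals, vecs = np.linalg.eigh(L)  # eigh: ascending eigenvalues, symmetric input

# The Fiedler vector (second-smallest eigenvalue) separates the two
# clusters by sign.
fiedler = vecs[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)                   # one label per triangle
```

Thresholding one eigenvector at zero only works for two clusters; the k-cluster generalization discussed later replaces it with K-Means on several eigenvectors.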

Abstract

As always, the abstract is where one looks for the main contributions. In this paper, they are:

  • Generalizing the Normalized-Cut algorithm for Hypergraphs
  • Proposing a methodology that adds an empirical loss term to support transductive classification
  • Experimenting on a number of benchmark tasks

Of all the above, the first one is the most important contribution of this work.

Video 1: Abstract

Introduction and Preliminaries

In the introduction, the authors go over the main motivation: the complex relationships found in many real-world applications. They bring up the problem of classifying papers into topics and discuss how modeling such a problem with "simple" graphs (those in which an edge represents a directed or undirected connection between exactly two vertices) is not very natural, explaining its limitations. They also touch on spectral clustering and briefly outline the approach of the paper.

Video 2: Section 1 and Preliminaries

Generalized N-Cut Formulation

Section 3 of the paper presents and discusses the generalized objective for this new normalized cut. The aim is to partition the hypergraph into two clusters with strong intra-cluster connections (dense within each cluster) and weak inter-cluster connections (sparse between clusters).
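The paper's relaxation of this objective leads to a normalized hypergraph Laplacian, Delta = I − Dv^(−1/2) H W De^(−1) H^T Dv^(−1/2), where H is the incidence matrix, W the diagonal hyperedge-weight matrix, and Dv, De the vertex- and hyperedge-degree matrices. A rough numerical sketch, using a tiny made-up incidence matrix rather than data from the paper:

```python
import numpy as np

# Toy incidence matrix: 4 vertices (rows), 2 hyperedges (columns).
H = np.array([
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
], dtype=float)
w = np.array([1.0, 1.0])        # positive hyperedge weights

d = H @ w                       # vertex degrees d(v): total weight of incident edges
delta = H.sum(axis=0)           # hyperedge degrees delta(e): number of vertices in e

# Delta = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
Dv_isqrt = np.diag(1.0 / np.sqrt(d))
Theta = Dv_isqrt @ H @ np.diag(w) @ np.diag(1.0 / delta) @ H.T @ Dv_isqrt
Delta = np.eye(H.shape[0]) - Theta

# Delta is positive semi-definite with smallest eigenvalue 0; the relaxed
# two-way cut is solved by the eigenvector of the second-smallest eigenvalue.
vals = np.linalg.eigvalsh(Delta)
print(np.round(vals, 3))
```

When every hyperedge contains exactly two vertices, this construction reduces (up to scaling) to the usual normalized graph Laplacian, which is why it is a genuine generalization of Normalized Cut.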

Video 3: Sections 3 and 5

Random Walk

Imagine you are in a city with routes to other cities, where the weight of each route is proportional to its safety. Naturally, you tend to avoid the unsafe routes, so if they are the only way to reach a different cluster of cities, chances are you never go there. Random walks, transition probabilities, and the underlying stochastic processes are therefore important to discuss when it comes to graph clustering; Section 4 of the paper covers this.
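The hypergraph random walk in the paper works in two stages: from vertex u, first pick an incident hyperedge e with probability proportional to its weight w(e), then jump to a vertex of e chosen uniformly. A minimal sketch on a made-up incidence matrix:

```python
import numpy as np

# Toy hypergraph: 4 vertices, 2 hyperedges; vertex 2 bridges both edges.
H = np.array([
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
], dtype=float)
w = np.array([1.0, 1.0])

d = H @ w                       # vertex degrees
delta = H.sum(axis=0)           # hyperedge degrees

# Transition matrix P = Dv^{-1} H W De^{-1} H^T:
# p(u -> v) = sum_e  w(e)/d(u) * h(u,e) * h(v,e)/delta(e)
P = np.diag(1.0 / d) @ H @ np.diag(w) @ np.diag(1.0 / delta) @ H.T
print(np.round(P, 3))           # every row sums to 1
```

For instance, vertex 0 belongs only to the first hyperedge (3 vertices), so its row is uniform over that hyperedge's members, giving p(0 → 0) = 1/3. Note that staying in place is allowed, since the uniformly chosen vertex may be u itself.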

Video 4: Section 4

Embedding and Classification

Sections 6 and 7 generalize this new N-cut from two clusters to k clusters, which also feels natural. The main new question is how to assign classes when there are k eigenvectors to consider instead of one. The solution is to apply an unsupervised clustering algorithm on top of the resulting spectral embedding (K-Means is always a nice go-to).
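A minimal sketch of the k-way embedding, on a made-up toy hypergraph with three disjoint hyperedges (one per cluster). Because the clusters are completely separated here, the rows of the embedding built from the k = 3 smallest eigenvectors of the hypergraph Laplacian indicate the clusters exactly; on real data one would feed these rows to K-Means:

```python
import numpy as np

# Three disjoint hyperedges, i.e. three obvious clusters: {0,1}, {2,3}, {4,5}.
H = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)
w = np.ones(3)

d = H @ w
delta = H.sum(axis=0)
Dv_isqrt = np.diag(1.0 / np.sqrt(d))
Delta = np.eye(6) - Dv_isqrt @ H @ np.diag(w) @ np.diag(1.0 / delta) @ H.T @ Dv_isqrt

# Spectral embedding: each vertex becomes a row built from the k smallest
# eigenvectors of Delta (eigh returns eigenvalues in ascending order).
vals, vecs = np.linalg.eigh(Delta)
embedding = vecs[:, :3]

# Vertices in the same cluster get (numerically) identical rows, so any
# standard clustering of the rows, e.g. K-Means, recovers the three groups.
print(np.round(embedding, 3))
```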

Video 5: Section 6

As for transductive inference, the only difference is adding an empirical loss term on the labeled vertices (e.g., a squared or hinge loss) to the objective.
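As a sketch of that setup with a squared empirical loss (the toy hypergraph, labeling, and the value of mu below are illustrative choices, not from the paper): minimizing f^T Delta f + mu * ||f − y||^2 over label scores f has the closed-form solution f = mu * (Delta + mu * I)^(−1) y.

```python
import numpy as np

# Toy hypergraph: two hyperedges {0,1,2} and {2,3,4} sharing vertex 2.
H = np.array([
    [1, 0],
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 1],
], dtype=float)
w = np.ones(2)

d = H @ w
delta = H.sum(axis=0)
Dv_isqrt = np.diag(1.0 / np.sqrt(d))
Delta = np.eye(5) - Dv_isqrt @ H @ np.diag(w) @ np.diag(1.0 / delta) @ H.T @ Dv_isqrt

# Only vertices 0 (+1) and 4 (-1) are labeled; the rest are 0 (unknown).
mu = 1.0
y = np.array([1.0, 0.0, 0.0, 0.0, -1.0])

# Closed-form minimizer of f^T Delta f + mu * ||f - y||^2.
f = mu * np.linalg.solve(Delta + mu * np.eye(5), y)
print(np.round(f, 3))
```

The unlabeled vertices 1 and 3 inherit the sign of the labeled vertex in their hyperedge, while the bridge vertex 2 stays at zero by symmetry.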

Video 6: Section 7

Empirical Validation

As usual, the methodology was implemented and validated on benchmark datasets. The results indicated the effectiveness of the algorithm at the time, with a clear decrease in error rate compared with simple-graph-based models.

References

[1] Zhou, Dengyong, Jiayuan Huang, and Bernhard Schölkopf. “Learning with hypergraphs: Clustering, classification, and embedding.” Advances in neural information processing systems. 2007.

[2] Shi, Jianbo, and Jitendra Malik. “Normalized cuts and image segmentation.” Departmental Papers (CIS) (2000): 107.

[3] William Fleshman. “Spectral Clustering: Foundation and Application.” Medium

[4] Edwards, Michael, and Xianghua Xie. “Graph based convolutional neural network.” arXiv preprint arXiv:1609.08965 (2016).
