Frequent Subgraph Mining by Walking in Order Embedding Space

by · Jul 17, 2020 · 346 views ·

ICML 2020

Identifying frequent subgraphs or network motifs has been crucial in analyzing and predicting properties of real-world networks. However, finding large commonly-occurring motifs remains an open problem due to its NP-hard subroutine of subgraph counting, and the combinatorial growth of the number of possible subgraphs with their size. Here we present Subgraph Pattern Miner (SPMiner), a novel neural approach to finding frequent subgraphs in a large target graph. SPMiner integrates graph neural networks, order embedding space, and an efficient search strategy to identify network subgraph patterns that appear most frequently in a target graph dataset. SPMiner first decomposes the target graph into many overlapping subgraphs and then encodes the subgraphs into order embeddings. SPMiner then uses a monotonic walk in the order embedding space to identify frequent motifs. Compared to existing approaches and possible neural alternatives, SPMiner is more accurate, faster, and more scalable. For 5- and 6-node motifs, we show that SPMiner can identify almost all of the most frequent motifs while being 100x faster than exact enumeration methods. In addition, SPMiner can also reliably identify frequent 10-node motifs, which is well beyond the size limit of exact enumeration approaches. And last, We show that SPMiner can find large 10+ node motifs that appear 10-100x more frequently than those found by current widely-used approximate methods.