Morgan & Claypool | English | Oct 2017 | ISBN-10: 1681730391 | 206 Pages | PDF | 4.60 mb
By Danai Koutra, Christos Faloutsos
Graphs naturally represent information ranging from links between web pages, to communication
in email networks, to connections between neurons in our brains. These graphs often span
billions of nodes and interactions between them. Within this deluge of interconnected data, how
can we find the most important structures and summarize them? How can we efficiently visualize
them? How can we detect anomalies that indicate critical events, such as an attack on a
computer system, disease formation in the human brain, or the fall of a company?
This book presents scalable, principled discovery algorithms that combine globality with
locality to make sense of one or more graphs. In addition to fast algorithmic methodologies, we
also contribute graph-theoretical ideas and models, and real-world applications in two main areas.
Individual Graph Mining: We show how to interpretably summarize a single graph by
identifying its important graph structures. We complement summarization with inference,
which leverages information about few entities (obtained via summarization or other
methods) and the network structure to efficiently and effectively learn information about
the unknown entities.
Collective Graph Mining: We extend the idea of individual-graph summarization to
time-evolving graphs, and show how to scalably discover temporal patterns. Apart from
summarization, we claim that graph similarity is often the underlying problem in a host of
applications where multiple graphs occur (e.g., temporal anomaly detection, discovery of
behavioral patterns), and we present principled, scalable algorithms for aligning networks
and measuring their similarity.
The methods that we present in this book leverage techniques from diverse areas, such
as matrix algebra, graph theory, optimization, information theory, machine learning, finance,
and social science, to solve real-world problems. We present applications of our exploration
algorithms to massive datasets, including a Web graph of 6:6 billion edges, a Twitter graph of
1.8 billion edges, brain graphs with up to 90 million edges, collaboration, peer-to-peer networks,
browser logs, all spanning millions of users and interactions