We have proposed Lasagne, an unsupervised learning algorithm to compute embeddings for the nodes of a graph. The basic idea of Lasagne is to use an Approximate Personalized PageRank algorithm to bias random walks more strongly toward the local neighborhood of each node; thus, the embedding for a given node is more finely tuned to the local graph structure around that node than the embeddings from previous similar methods. Our method performs particularly well for larger graphs that are not well-structured, e.g., that have flat NCPs and/or have many nodes in deep k-cores. Our empirical evaluation has shown that our embeddings achieve superior prediction accuracy over competitors when used for multi-label classification in several different real-world networks. Our empirical results also provide evidence that explains this improvement. While Lasagne is primarily an exploratory tool, if one wants to use it in a more automated manner, then an important question will be how to automate the averaging of the APPR vectors over different values of the locality parameter.

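| Op | Algorithm | facebook | arXiv | BlogCatalog |
|-----|-----------|----------|--------|-------------|
| a) | DeepWalk | 0.7240 | 0.7002 | 0.7921 |
| a) | node2vec | 0.7223 | 0.7259 | 0.8108 |
| a) | GraRep | 0.7495 | 0.7097 | 0.8759 |
| a) | Lasagne | 0.7069 | 0.7195 | 0.8701 |
| b) | DeepWalk | 0.9610 | 0.8632 | 0.7187 |
| b) | node2vec | 0.9644 | 0.8770 | 0.7359 |
| b) | GraRep | 0.9629 | 0.7494 | 0.8846 |
| b) | Lasagne | 0.9628 | 0.8715 | 0.8281 |
| c) | DeepWalk | 0.9606 | 0.8438 | 0.7799 |
| c) | node2vec | 0.9642 | 0.8499 | 0.8044 |
| c) | GraRep | 0.9621 | 0.7980 | 0.8713 |
| c) | Lasagne | 0.9072 | 0.7036 | 0.7017 |
| d) | DeepWalk | 0.9593 | 0.8450 | 0.7844 |
| d) | node2vec | 0.9646 | 0.8523 | 0.8074 |
| d) | GraRep | 0.9635 | 0.7664 | 0.8731 |
| d) | Lasagne | 0.9111 | 0.7053 | 0.7045 |
| jac | DeepWalk | 0.8435 | 0.7357 | 0.5525 |
| jac | node2vec | 0.8509 | 0.7381 | 0.5644 |
| jac | GraRep | 0.8418 | 0.4980 | 0.5567 |
| jac | Lasagne | 0.9256 | 0.7361 | 0.5337 |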

Table 10.4: Results for link prediction. Metric: AUC scores of predictions retrieved by binary classifiers and by the Jaccard similarity measure, respectively. Operators used for edge embedding: a) Average: (f_{i}(u) + f_{i}(v)) / 2, b) Hadamard: f_{i}(u) · f_{i}(v), c) Weighted L1: |f_{i}(u) − f_{i}(v)|, d) Weighted L2: |f_{i}(u) − f_{i}(v)|^{2}, with f_{i}(x) being the i-th component of the embedding of node x [103]; jac: Jaccard similarity measure.
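The edge operators named in the caption of Table 10.4 can be illustrated with a minimal sketch. It assumes that node embeddings are plain lists of floats and that the Jaccard measure is computed on neighbor sets; the function names and the `EDGE_OPERATORS` lookup are hypothetical and not taken from the original evaluation code.

```python
from typing import Callable, Dict, List, Set

Vector = List[float]


def average(fu: Vector, fv: Vector) -> Vector:
    """a) Average: (f_i(u) + f_i(v)) / 2 for every component i."""
    return [(a + b) / 2.0 for a, b in zip(fu, fv)]


def hadamard(fu: Vector, fv: Vector) -> Vector:
    """b) Hadamard: component-wise product f_i(u) * f_i(v)."""
    return [a * b for a, b in zip(fu, fv)]


def weighted_l1(fu: Vector, fv: Vector) -> Vector:
    """c) Weighted L1: |f_i(u) - f_i(v)| for every component i."""
    return [abs(a - b) for a, b in zip(fu, fv)]


def weighted_l2(fu: Vector, fv: Vector) -> Vector:
    """d) Weighted L2: |f_i(u) - f_i(v)|^2 for every component i."""
    return [(a - b) ** 2 for a, b in zip(fu, fv)]


def jaccard(neigh_u: Set[int], neigh_v: Set[int]) -> float:
    """jac: Jaccard similarity, assumed here to be taken over neighbor sets."""
    union = neigh_u | neigh_v
    if not union:
        return 0.0
    return len(neigh_u & neigh_v) / len(union)


# Hypothetical lookup table mapping the table's operator labels to functions.
EDGE_OPERATORS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "a": average,
    "b": hadamard,
    "c": weighted_l1,
    "d": weighted_l2,
}


def edge_features(embedding: Dict[int, Vector], u: int, v: int, op: str) -> Vector:
    """Build the feature vector of a candidate edge (u, v) for a binary classifier."""
    return EDGE_OPERATORS[op](embedding[u], embedding[v])
```

In such a setup, the vectors returned by `edge_features` would be fed to a binary classifier that separates existing from non-existing edges, whereas `jaccard` yields a score directly without any embedding.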

### Structure-Based Node Embedding

Parts of the work presented in this chapter have been published as the article Structural Graph Representations based on Multiscale Local Network Topologies in the Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI), 2019 [46]. A preliminary work-in-progress version has been published as the short paper Towards Learning Structural Node Embeddings using Personalized PageRank in the Proceedings of the LWDA, 2017 [47].

### 11.1 Introduction

The increasing relevance of graph-structured data has been accompanied by an increased interest in algorithms which can leverage underlying graph structure to make accurate predictions about the modeled entities. In many scenarios no additional facts about entities or properties of relationships are known. In such cases, the only source of information for machine learning tasks like node and graph classification is the graph topology. In this chapter, we consider the problem of deriving structural node representations or role representations based solely on the topological structure within the local node neighborhoods.

The intuition behind this task is to capture the different functionalities of network entities within vector representations such that entities that play a similar role within the network end up close together in the embedded space. Note that the notion of a role is generally diverse and might describe influencers in a social network, or a specific group of atoms in molecule networks that are likely to bind to similar atomic substructures. In general, node representations describing the roles of the nodes within a graph are useful for downstream classification or clustering tasks, e.g., they may give valuable insights for real-world tasks like drug design, identification of influencers within social communities, link prediction, etc. To introduce the problem more formally, we are given a set of graphs G = {G_{1}, ..., G_{N}} with G_{i} = (V_{i}, E_{i}) being a graph, V = ⋃_{i=1}^{N} V_{i} denoting the set of vertices and E = ⋃_{i=1}^{N} E_{i} being the set of edges. The goal is to derive vector representations f(v_{j}) ∈ R^{d} reflecting the various roles of nodes v_{j} ∈ V in a graph G_{i} ∈ G. Figure 11.1 illustrates the concept of node roles for airline networks.

Figure 11.1: Airline networks. Left: Scandinavian Airline; Right: Niki Air. Each node corresponds to an airport and edges connect two airports if the airline operates a flight between them. The color coding corresponds to the role descriptors as determined by our approach.

Note that the color coding corresponds to the nodes' roles as identified by our approach. It can clearly be seen that the role of a node (e.g., a hub airport, an airport with a connection to a single hub, or an airport with connections to several hubs) can be extracted from the node's local neighborhood. We can also see that these roles can be identified across different graphs although the local neighborhoods may seem to be different, e.g., in terms of size. However, considering the local topology around the nodes, it can also be seen that nodes with similar colors have similar local topologies. The distinct property of node role representations is that they should be independent of specific neighbors. Therefore, nodes that are similar in the embedded space are not necessarily closely connected and may even reside in different graphs.

In general, these representations could be either continuous vectors, such as structural node embeddings, or discrete role assignments, and should capture structural properties of the nodes within the graph. Here, however, we focus on continuous representations. In particular, if two nodes u, v ∈ V have similar local structural neighborhood patterns with respect to some similarity measure, i.e., S_{N}(u) ≈ S_{N}(v), then the representations of u and v should be similar as well. However, defining an appropriate similarity measure seems difficult since even the notion of local structural neighborhood patterns is hard to grasp. In this chapter, we argue that the spread of probability mass over a node's most relevant local neighbors is a good characteristic of the node's role. Similarly to [91], we leverage the Approximate Personalized PageRank (APPR) to effectively describe multiple locality structures around the vertices and use the probability distribution vectors as a basis to quantify the structural roles of the nodes. An important feature of our novel node representation is that it is very efficient to compute and thus suitable even for large data sets. Furthermore, an important difference to previously published related works, e.g., [82], is that our method operates directly in the vertex domain, though the heat kernel diffusion process resembles that implied by PPR [71]. Additionally, our method is not restricted to k-hop neighborhoods.
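To make the idea of describing a node by the spread of its APPR probability mass more concrete, the following minimal sketch computes an approximate personalized PageRank vector with the standard local push procedure (lazy-walk variant) and turns it into a neighbor-independent feature vector. The descriptor itself is purely an illustrative assumption and not the representation defined in this chapter: it keeps the `k` largest probability values, sorted and without node identities, for a few values of the teleportation parameter `alpha`. The names `appr`, `role_descriptor`, `alphas`, `k`, and `eps` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Sequence

Graph = Dict[int, List[int]]  # undirected graph as adjacency lists


def appr(graph: Graph, seed: int, alpha: float = 0.15, eps: float = 1e-4) -> Dict[int, float]:
    """Approximate personalized PageRank of `seed` via local push operations.

    A node u is pushed while its residual satisfies r[u] >= eps * deg(u).
    """
    p: Dict[int, float] = {}          # approximate PPR mass
    r: Dict[int, float] = {seed: 1.0}  # residual mass still to be distributed
    queue = deque([seed])
    queued = {seed}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg = len(graph[u])
        ru = r.get(u, 0.0)
        if deg == 0 or ru < eps * deg:
            continue
        # Push: keep alpha * r[u], hold half of the rest, spread the other half.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1.0 - alpha) * ru / 2.0
        share = (1.0 - alpha) * ru / (2.0 * deg)
        for v in graph[u]:
            r[v] = r.get(v, 0.0) + share
            if v not in queued and r[v] >= eps * len(graph[v]):
                queue.append(v)
                queued.add(v)
        if u not in queued and r[u] >= eps * deg:
            queue.append(u)
            queued.add(u)
    return p


def role_descriptor(graph: Graph, node: int, alphas: Sequence[float] = (0.1, 0.5),
                    k: int = 4, eps: float = 1e-4) -> List[float]:
    """Illustrative (assumed) descriptor: sorted top-k APPR values per alpha.

    Sorting discards node identities, so the result depends only on how the
    probability mass is spread, not on which neighbors receive it.
    """
    features: List[float] = []
    for alpha in alphas:
        values = sorted(appr(graph, node, alpha, eps).values(), reverse=True)[:k]
        values += [0.0] * (k - len(values))  # pad small neighborhoods with zeros
        features.extend(values)
    return features
```

On a star graph, for example, all leaves obtain the same descriptor while the hub obtains a different one, which mirrors the hub versus single-connection airport roles discussed for Figure 11.1. Different `alpha` values correspond to different degrees of locality, so concatenating them is one simple way to reflect multiple locality structures.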

Our empirical evaluation demonstrates that our simple approach outperforms somewhat more advanced state-of-the-art role-based node representations. With respect to previously published work on the topic of structural node embeddings (see Section 9.1), we summarize the key contributions of the work presented in this chapter as follows:

• A novel structure-based approach to determine role representations for single nodes directly in the vertex domain as opposed to existing diffusion-based approaches which operate in the spectral domain.

• A fast-to-compute approach that retrieves continuous role representations rather than being composed of multiple, computationally rather costly structural features.

• An extensive evaluation of our proposed role representations that shows promising results when comparing our representations to state-of-the-art node embeddings when using their setups.