The prediction of edge signs in social and biological networks is a major goal of graph-based machine learning and has important implications for recommendation systems. Most current edge sign prediction methods rely on information propagation from neighbouring edges, either directly, by assuming that neighbouring edges have similar signs, or through more complex theories based on combinations of the edge signs among neighbours. Such methods require a high network sampling fraction and fail at low sampling levels. Here we show that edges with similar network topology, as defined by a combination of network measures, have similar signs. This surprising correlation between network topology and edge sign can be used for prediction. Indeed, a machine learning algorithm based on this topology can achieve higher accuracy than state-of-the-art methods on standard datasets, with an accuracy of up to 93%, even when only a very small fraction of the edge signs is known. We further show that datasets differ in the importance of the different features, and that a combination of features is always required to obtain a high area under the curve. When the vertices represent people, the sign is mainly affected by the edge target. When the network represents opinions, the sign is mainly affected by the edge source. The proposed method can be applied to directed and undirected, weighted and unweighted networks.
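The idea that edges with similar topology carry similar signs can be illustrated with a minimal sketch. The abstract names neither the specific network measures nor the learning algorithm, so everything here is an assumption made for illustration: the five features (out- and in-degree of the source and of the target, plus the number of common neighbours), the nearest-neighbour vote used in place of the paper's learner, the function names `edge_features` and `predict_sign`, and the toy graph.

```python
from collections import defaultdict
from math import dist


def edge_features(edges):
    """Map each directed edge (u, v) to a small vector of topological measures.

    The feature choice is an illustrative assumption, not the paper's feature set.
    """
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for u, v in edges:
        out_nbrs[u].add(v)
        in_nbrs[v].add(u)

    def nbrs(x):
        # Neighbours of x, ignoring edge direction.
        return out_nbrs[x] | in_nbrs[x]

    feats = {}
    for u, v in edges:
        common = len((nbrs(u) & nbrs(v)) - {u, v})
        feats[(u, v)] = (
            len(out_nbrs[u]), len(in_nbrs[u]),  # source out-/in-degree
            len(out_nbrs[v]), len(in_nbrs[v]),  # target out-/in-degree
            common,                             # shared neighbours
        )
    return feats


def predict_sign(edge, feats, known_signs, k=3):
    """Majority sign among the k labelled edges closest in feature space."""
    ranked = sorted(
        (e for e in known_signs if e != edge),
        key=lambda e: dist(feats[e], feats[edge]),
    )
    vote = sum(known_signs[e] for e in ranked[:k])
    return 1 if vote >= 0 else -1


# Hypothetical toy network: a popular target ("hub") and a critical source ("x").
known_signs = {
    ("a", "hub"): 1, ("b", "hub"): 1, ("c", "hub"): 1, ("d", "hub"): 1,
    ("x", "p"): -1, ("x", "q"): -1, ("x", "r"): -1,
}
unknown = [("e", "hub"), ("x", "s")]

# Topology is computed from all edges; only the signs of `unknown` are hidden.
feats = edge_features(list(known_signs) + unknown)
predictions = {e: predict_sign(e, feats, known_signs) for e in unknown}
print(predictions)
```

In this sketch the features come from the full edge list while only some signs are labelled, which mirrors the low-sampling setting described above: the topology is fully observed even when few signs are known.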
Bibliographical note
Publisher Copyright:
© 2018. Published by Oxford University Press. All rights reserved.
- information propagation
- network attribute vectors
- network topology
- sign prediction