Theory Seminar - Sean Chester: Triad Enumeration at Trillion-Scale using a Single Commodity Machine

A /triangle/, in the context of undirected graphs or social networks, is a set of 3 vertices that are all connected to each other. A /triad/ is the analogue for directed graphs and has 7 unique forms (up to isomorphism).

2019.01.23 | Dorthe Haagen Nielsen

Date Mon 01 Apr
Time 13:30 14:30
Location Nygaard-395 (5335-395). Åbogade 34, 8200 Aarhus N

Abstract

A /triangle/, in the context of undirected graphs or social networks, is a set of 3 vertices that are all connected to each other. A /triad/ is the analogue for directed graphs and has 7 unique forms (up to isomorphism). Triads characterise the extent to which communities form in social networks, because friendships are often propagated to neighbours---such as when you introduce one of your friends to another one---thereby "closing" triangles in the graph. As a result, many of the most commonly studied properties of social networks, such as the clustering coefficient and subgraph centrality, depend on counting triangles or triads. For what was previously only the domain of distributed computing, this research introduces an algorithm that enumerates all triads on the largest publicly available graph dataset (ClueWeb, with 978,408,098 vertices) using compression and a single 6000 kr. machine. With these results, it is no long necessary to access a cluster to study the effects of new algorithms on massive-scale graphs.

This research will be published and presented at the EDBT conference in Lisbon the week before it is presented here. It was conducted in collaboration with researchers at the University of Victoria, Canada, including Yudi Santoso, the paper's student first author. It is recommended to bring to the talk a device with which you can access the Internet. 

Bio

Sean Chester is a former postdoc of the Data-Intensive Systems group (2013-2015), who is now a Horizon2020 Marie-Curie fellow at the Norwegian University of Science and Technology (NTNU) in Trondheim, Norway.  Although his research is mostly in parallel algorithms for data management, his greatest discovery came while preparing this biography: by burning his tongue as he licked overflowing hot chocolate from the hot stove element that had caused it to overflow in the first place, he resolved the age-old philosophical question of whether absolutely anybody is smart enough to earn a PhD.

Host: Ira Assent 

Site Specific, CS frontpage, Lecture / talk, Staff, Students