Increasing research on social networks stresses the urgency for producing effective means of ensuring user privacy. Represented ubiquitously as graphs, social networks have a myriad of recently developed techniques to prevent identity disclosure, but the equally important attribute disclosure attacks have been neglected. To address this gap, we introduce an approach to anonymize social networks that have labelled nodes, α-proximity, which requires that the label distribution in every neighbourhood of the graph be close to that throughout the entire network. We present an effective greedy algorithm to achieve α-proximity and experimentally validate the quality of the solutions it derives.
Source code: http://cs.au.dk/~schester/code/AlphaProximity.tgz
Paper: http://dx.doi.org/10.1109/ASONAM.2011.105
Full licensing details are available with the source code, but if using it for academic purposes, please cite: S. Chester and G. Srivastava. "Social network privacy for attribute disclosure attacks." In Proc. ASONAM 445-449. 2011.
The skyline operator returns records in a dataset that provide optimal trade-offs of multiple dimensions. It is an expensive operator whose query performance can greatly benefit from materialization. However, a skyline can be executed over any subspace of dimensions, and the materialization of all subspace skylines, called the skycube, dramatically multiplies data size. Existing methods for skycube compression sacrifice too much query performance; so, we present a novel hashing- and bitstring-based compressed data structure that supports orders of magnitude faster query performance.
Source code: http://users-cs.au.dk/u071354/public_code/Hashcube.zip
Paper: http://cs.au.dk/~schester/publications/cikm_boegh_2014.pdf
Full licensing details are available with the source code, but if using it for academic purposes, please cite: KS Bøgh et al. "Hashcube: A Data Structure for Space- and Query-Efficient Skycube Compression." In Proc. CIKM 1767-1770. 2014.
The skyline is an important query operator for multi-criteria decision making. It reduces a dataset to only those points that offer optimal trade-offs of dimensions. In general, it is very expensive to compute. Recently, multicore CPU algorithms have been proposed to accelerate the computation of the skyline. However, they do not sufficiently minimize dominance tests and so are not competitive with state-of-the-art sequential algorithms.
In this paper, we introduce a novel multicore skyline algorithm, Hybrid, which processes points in blocks. It maintains a shared, global skyline among all threads, which is used to minimize dominance tests while maintaining high throughput. The algorithm uses an efficiently-updatable data structure over the shared, global skyline, based on point-based partitioning. Also, we release a large benchmark of optimized skyline algorithms, with which we demonstrate on challenging workloads a 100-fold speedup over state-of-the-art multicore algorithms and a 10-fold speedup with 16 cores over state-of-the-art sequential algorithms.
Source code: http://cs.au.dk/~schester/code/SkyBench.zip
Paper: http://cs.au.dk/~schester/publications/chester_icde2015_mcsky.pdf
Full licensing details are available with the source code, but if using it for academic purposes, please cite: S Chester et al. "Scalable Parallelization of Skyline Computation for Multi-core Processors." In Proc. ICDE. 2015.
The skyline operator returns records in a dataset that provide optimal trade-offs of multiple dimensions. State-of-the-art skyline computation involves complex tree traversals, data-ordering, and conditional branching to minimize the number of point-to-point comparisons. Meanwhile, GPGPU computing offers the potential for parallelizing skyline computation across thousands of cores. However, attempts to port skyline algorithms to the GPU have prioritized throughput and failed to outperform sequential algorithms.
In this paper, we introduce a new skyline algorithm, designed for the GPU, that uses a global, static partitioning scheme. With the partitioning, we can permit controlled branching to exploit transitive relationships and avoid most point-to-point comparisons. The result is a non-traditional GPU algorithm, SkyAlign, that prioritizes work-efficiency and respectable throughput, rather than maximal throughput, to achieve orders of magnitude faster performance.
Source code: http://users-cs.au.dk/u071354/public_code/skyAlign.zip
Paper: http://www.vldb.org/pvldb/vol8/p962-boegh.pdf
Full licensing details are available with the source code, but if using it for academic purposes, please cite: KS Bøgh et al. "Work-Efficient Parallel Skyline Computation for the GPU." PVLDB 8 (9): 962-973. 2015.