Research

Graphs and Tensors

How to analyze graphs and/or multi-dimensional data? We develop models, algorithms, and systems for graphs and tensors.

Graphs

How can we find patterns and anomalies in large graphs such as Facebook or Twitter networks? How to classfy and rank nodes in such graphs? Graphs are everywhere in our lives: social networks, the World Wide Web, biological networks, and many more. We work on algorithms, systems, and applications for analyzing large graphs.

Algorithms and Systems:

  • Random walk with restart for ranking and relation inference.
  • Graph machine learning: we develop state-of-the-art node classification and representation learning methods, which outperform recent works including Graph Convolutional Network (GCN) and Graph Attention Network (GAT).
  • Scalable graph mining: we develop large graph processing engines and related algorithms to handle very large graphs whose sizes are more than hundreds of billions of edges.

Applications:

The graph analysis researches have the following applications.

  • Graph over time: temporal trend analysis of real world graphs.
  • Link prediction: predict links that are likely to be made in the future.
  • Community detection: find communities that are groups tightly connected within each other, while loosely connected to other groups.
  • Anomaly detection: find anomalous users and activities in graphs.
  • Visualization and sensemaking: summarize, understand, and sense-make graphs.

Software

  • Pegasus: Peta-Scale Graph Mining System

Tensors

How can we find useful patterns and anomalies in large scale real-world data, such as network intrusion logs with (source-ip, target-ip, portnumber, timestamp), with multiple attributes? Tensors are suitable for modeling these multi-dimensional data, and widely used for the analysis of social networks, web data, network traffic, and in many other settings. However, current tensor decomposition methods do not scale for tensors with millions and billions of rows, columns and `fibers`, that often appear in real datasets.

Algorithms and Systems:

In this project, we design and develop large scale tensor analysis algorithms. Our goal is to design algorithms so that the sparsity of real world tensors are fully exploited to boost the performance. The supported algorithms include PARAFAC decomposition, coupled matrix-tensor decomposition, Tucker decomposition, and nonnegative tensor decompositions.

Applications:

The tensor analysis researches have the following applications.

  • Trend analysis of time evolving graphs
  • Network security (i.e., detect anomalous users or activities)
  • Healthcare data (e.g. fMRI) analysis
  • Knowledge base (e.g., FreeBase, Yago) analysis

Software

  • BigTensor: large scale tensor analysis tool on distributed platforms.

Publication

  • Jamin Yoo, Hyunsik Jeon, and U Kang., "Belief Propagation Network for Hard Inductive Semi-supervised Learning", 28th International Joint Conference on Artificial Intelligence (IJCAI) 2019, Macao, China.
  • Jun-gi Jang, Dongjin Choi, Jinhong Jung, and U Kang., "Zoom-SVD: Fast and Memory Efficient Method for Extracting Key Patterns in an Arbitrary Time Range", ACM International Conference on Information and Knowledge Management (CIKM) 2018, Lingotto, Turin, Italy. [BIBTEX] [HOMEPAGE] [PDF]
  • Ha-Myung Park, Francesco Silvestri, Rasmus Pagh, Chin-wan Chung, Sung-Hyon Myaeng, and U Kang, "Enumerating Trillion Subgraphs On Distributed Systems", ACM Transactions on Knowledge Discovery from Data (TKDD), 2018. [PDF] [BIBTEX] [HOMEPAGE (CODE , DATA)]
  • Sejoon Oh, Namyong Park, Lee Sael, and U Kang., "Scalable Tucker Factorization for Sparse Tensors - Algorithms and Discoveries", 34th IEEE International Conference on Data Engineering (ICDE) 2018, Paris, France. [BIBTEX] [HOMEPAGE] [PDF]
  • Minji Yoon, Jinhong Jung, and U Kang, "TPA: Fast, Scalable, and Accurate Method for Approximate Random Walk with Restart on Billion Scale Graphs", 34th IEEE International Conference on Data Engineering (ICDE) 2018, Paris, France. [BIBTEX] [HOMEPAGE] [PDF]
  • Junghwan Kim, Haekyu Park, Ji-Eun Lee, and U Kang, "SIDE: Representation Learning in Signed Directed Networks", The Web Conference (WWW) 2018, Lyon, France. [BIBTEX] [HOMEPAGE] [PDF]
  • Minji Yoon, Woojeong Jin, and U Kang, "Fast and Accurate Random Walk with Restart on Dynamic Graphs with Guarantees", The Web Conference (WWW) 2018, Lyon, France. [BIBTEX] [HOMEPAGE] [PDF]
  • Ha-Myung Park, Chiwan Park, and U Kang, "PegasusN: A Scalable and Versatile Graph Mining System", Thirty-Second AAAI Conference on Artificial Intelligence (AAAI) 2018, New Orleans, Lousiana, USA. [BIBTEX] [HOMEPAGE (CODE)] [PDF]
  • Haekyu Park, Jinhong Jung, and U Kang, "A Comparative Study of Matrix Factorization and Random Walk with Restart in Recommender Systems", IEEE International Conference on Big Data (BigData) 2017, Boston, MA, USA. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Yongsub Lim, Minsoo Jung, and U Kang, "Memory-efficient and Accurate Sampling for Counting Local Triangles in Graph Streams: From Simple to Multigraphs", ACM Transactions on Knowledge Discovery from Data (TKDD), vo. 12, issue 1, Feburuary 2018. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Namyong Park, Sejoon Oh, and U Kang, "Fast and Scalable Distributed Boolean Tensor Factorization", IEEE International Conference on Data Engineering (ICDE) 2017, San Diego, CA, USA. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Kijung Shin, Lee Sael, and U Kang, "Fully Scalable Methods for Distributed Tensor Factorization", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 29, no. 1, pp. 100-113, Jan. 1 2017. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Jinhong Jung, Namyong Park, Lee Sael, and U Kang, "BePI: Fast and Memory-Efficient Method for Billion-Scale Random Walk with Restart", ACM International Conference on Management of Data (SIGMOD) 2017, Raleigh, North Carolina, USA. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Ha-Myung Park, Namyong Park, Sung-Hyon Myaeng, and U Kang, "Partition Aware Connected Component Computation in Distributed Systems", IEEE International Conference on Data Mining (ICDM) 2016, Barcelona, Spain. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Jinhong Jung, Woojeong Jin, Lee Sael, and U Kang, "Personalized Ranking in Signed Networks using Signed Random Walk with Restart", IEEE International Conference on Data Mining (ICDM) 2016, Barcelona, Spain. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Namyong Park, Byungsoo Jeon, Jungwoo Lee, and U Kang, "BIGtensor: Mining Billion-Scale Tensor Made Easy", ACM International Conference on Information and Knowledge Management (CIKM) 2016, Indianapolis, Indiana, USA. [BIBTEX] [HOMEPAGE (CODE)] [PDF]
  • Ha-Myung Park, Sung-Hyon Myaeng, and U Kang, "PTE: Enumerating Trillion Triangles On Distributed System", ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2016, San Francisco, USA. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Minsoo Jung, Sunmin Lee, Yongsub Lim, U Kang, "FURL: Fixed-memory and Uncertainty Reducing Local Triangle Counting for Graph Streams", arXiv: 1611.06615 [cs.DS], 26 November 2016. [BIBTEX] [PDF]
  • Min-Hee Jang, Christos Faloutsos, Sang-Wook Kim, U Kang, and Jiwoon Ha, "PIN-TRUST: Fast Trust Propagation Exploiting Positive, Implicit, and Negative Information", ACM International Conference on Information and Knowledge Management (CIKM) 2016, Indianapolis, Indiana, USA. [BIBTEX] [PDF]
  • Hugo Gualdron, Robson Cordeiro, Jose Rodrigeus-Jr, Duen Horng (Polo) Chau, Minsuk Kahng, and U Kang., "M-Flash: Fast Billion-scale Graph Computation Using a Bimodal Block Processing Model", European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD) 2016, Riva Del Garda, Italy. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Yongsub Lim, Won-Jo Lee, Ho-Jin Choi, and U Kang, "MTP: Discovering High Quality Partitions in Real World Graphs", World Wide Web Journal [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Inah Jeon, Evangelos E. Papalexakis, Christos Faloutsos, Lee Sael, and U Kang, "Mining Billion-Scale Tensors: Algorithm and Discoveries", VLDB Journal, vol. 25, issue 4, pp. 519-544, August 2016. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • ByungSoo Jeon, Inah Jeon, Sael Lee, U Kang, "SCouT: Scalable Coupled Matrix-Tensor Factorization-Algorithms and Discoveries", 32nd IEEE International Conference on Data Engineering (ICDE) 2016, Helsinki, Finland. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Jinhong Jung, Kijung Shin, Lee Sael, and U Kang, "Random Walk with Restart on Large Graphs Using Block Elimination", ACM Transactions on Database Systems (TODS), vol. 41, issue 2, pp. 12:1-12:43, June 2016. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • ByungSoo Jeon, Inah Jeon, and U Kang, "TeGViz: Distributed Tera-Scale Graph Generation and Visualization", IEEE International Conference on Data Mining (ICDM) 2015, Atlantic City, USA. [BIBTEX] [HOMEPAGE (CODE)] [PDF]
  • Kijung Shin, Jinhong Jung, Lee Sael, and U Kang, "BEAR: Block Elimination Approach for Random Walk with Restart on Large Graphs", ACM International Conference on Management of Data (SIGMOD) 2015, Melbourne, Australia [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF]
  • Inah Jeon, Evangelos E. Papalexakis, U Kang, and Christos Faloutsos, "HaTen2: Billion-scale Tensor Decompositions", 31st IEEE International Conference on Data Engineering (ICDE) 2015, Seoul, Korea. [BIBTEX] [HOMEPAGE (CODE, DATA)] [PDF] [SUPPLEMENTARY DOCUMENT]
  • Danai Koutra, U Kang, Jilles Vreeken, and Christos Faloutsos, "Summarizing and understanding large graphs", Statistical Analysis and Data Mining, doi: 10.1002/sam.11267, 18 May 2015. [BIBTEX] [PDF]
  • Yongsub Lim , Won-Jo Lee , Ho-Jin Choi, and U Kang, "Discovering Large Subsets with High Quality Partitions in Real World Graphs", Second International Conference on Big Data and Smart Computing (BigComp) 2015, Jeju, korea. [BIBTEX] [PDF] [HOMEPAGE (CODE, DATA)]
  • Lee Sael, Inah Jeon, and U Kang, "Scalable Tensor Mining", Big Data Research Journal, Feb. 2015. [BIBTEX] [PDF]
  • Ha-Myung Park, Francesco Silvestri, U Kang, Rasmus Pagh, "MapReduce Triangle Enumeration With Guarantees", 23rd ACM International Conference on Information and Knowledge Management (CIKM) 2014, Shaghai, China [BIBTEX] [HOMEPAGE (CODE, DATASET)] [PDF]
  • Yongsub Lim, U Kang, and Christos Faloutsos, "SlashBurn: Graph Compression and Mining beyond Caveman Communities", IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 26, no. 12, pp. 3077-3089, April 2014. [BIBTEX] [CODE] [PDF]
  • Jungeun Kim, Minsoo Choy, Daehoon Kim, and U Kang, "Link Prediction Based on Generalized Cluster Information", 23rd International World Wide Web Conference (WWW) 2014, Seoul, Korea. [BIBTEX] [PDF]
  • U Kang, Jay-Yoon Lee, Danai Koutra, and Christos Faloutsos, "Net-Ray: Visualizing and Mining Billion-Scale Graphs", Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2014, Tainan, Taiwan. [BIBTEX] [HOMEPAGE (CODE)] [PDF]
  • Danai Koutra, U Kang, Jilles Vreeken, and Christos Faloutsos, "VoG: Summarizing and Understanding Large Graphs", SIAM International Conference on Data Mining (SDM) 2014, Philadelphia, Pennsylvania, USA. [BIBTEX] [PDF]
  • Zhiyuan Lin, Nan Cao, Hanghang Tong, Fei Wang, U Kang, and Duen Horng Chau, "Interactive Multi-resolution Exploration of Million Node Graphs", IEEE VIS 2013, Atlanta, Georgia, USA. [BIBTEX] [PDF]
  • U Kang, Martial Hebert, and Soonyong Park, "Fast and Scalable Approximate Spectral Graph Matching for Correspondence Problems", Information Sciences, 2013. [BIBTEX] [PDF]
  • Evangelos E. Papalexakis, U Kang, Christos Faloutsos, Nicholas D. Sidiropoulosx, and Abhay Harpale, "Large Scale Tensor Decompositions: Algorithmic Developments and Applications", Bulletin of the Technical Committee on Data Engineering, vol. 36, no. 3, September 2013. [BIBTEX] [PDF]
  • U Kang and Christos Faloutsos, "Big graph mining: algorithms and discoveries", ACM SIGKDD Explorations Newsletter Volume 14 Issue 2, December 2012. pp. 29-36. [BIBTEX] [PDF]
  • U Kang, Evangelos Papalexakis, Abhay Harpale, and Christos Faloutsos, "GigaTensor: Scaling Tensor Analysis Up By 100 Times - Algorithms and Discoveries", ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2012, Beijing, China. [BIBTEX] [PDF]
  • U Kang, Hanghang Tong, Jimeng Sun, Ching-Yung Lin, and Christos Faloutsos, "GBASE: An Efficient Analysis Platform for Large Graphs", VLDB Journal, 2012. [BIBTEX] [PDF]
  • U Kang, Hanghang Tong, and Jimeng Sun, "Fast Random Walk Graph Kernel", SIAM International Conference on Data Mining (SDM) 2012, Anaheim, California, USA. (acceptance rate 27 %) [BIBTEX] [PDF]
  • U Kang and Christos Faloutsos, "Beyond 'Caveman Communities': Hubs and Spokes for Graph Compression and Mining", IEEE International Conference on Data Mining (ICDM) 2011, Vancouver, Canada. (acceptance rate 12.2 %) [BIBTEX] [CODE] [PDF]
  • U Kang, Brendan Meeder, and Christos Faloutsos, "Spectral Analysis for Billion-Scale Graphs: Discoveries and Implementation", Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2011, Shenzhen, China. (acceptance rate 9.7 %) [BIBTEX] [PDF]
  • U Kang, Hanghang Tong, Jimeng Sun, Ching-Yung Lin, and Christos Faloutsos, "GBASE: A Scalable and General Graph Management System", ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD) 2011, San Diego, CA, USA. (acceptance rate 17.5 %) [BIBTEX] [PDF]
  • U Kang, Spiros Papadimitriou, Jimeng Sun, and Hanghang Tong, "Centralities in Large Networks: Algorithms and Observations", SIAM International Conference on Data Mining (SDM) 2011, Mesa, Arizona, USA. (acceptance rate 25.1 %) [BIBTEX] [PDF]
  • U Kang, Charalampos E. Tsourakakis, Ana Paula Appel, Christos Faloutsos, and Jure Leskovec, "HADI: Mining Radii of Large Graphs", ACM Transactions on Knowledge Discovery from Data (TKDD), 2011. [BIBTEX] [PDF]
  • U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos, "PEGASUS: Mining Peta-Scale Graphs", Knowledge and Information Systems (KAIS), Springer, 2011. [BIBTEX] [PDF]
  • U Kang, Mary McGlohon, Leman Akoglu, and Christos Faloutsos, "Patterns on the Connected Components of Terabyte-Scale Graphs", IEEE International Conference on Data Mining (ICDM) 2010, Sydney, Australia. (acceptance rate 19.4 %) [BIBTEX] [PDF]
  • U Kang, Charalampos E. Tsourakakis, Ana Paula Appel, Christos Faloutsos, and Jure Leskovec, "Radius Plots for Mining Tera-byte Scale Graphs: Algorithms, Patterns, and Observations", SIAM International Conference on Data Mining (SDM) 2010, Columbus, Ohio, USA. (acceptance rate 23.4 %) [BIBTEX] [PDF]
  • U Kang, Charalampos E. Tsourakakis, and Christos Faloutsos, "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations", IEEE International Conference on Data Mining (ICDM) 2009, Miami, Florida, USA. (acceptance rate 8.9 %) [BIBTEX] [PDF] [PEGASUS HOMEPAGE]
  • Charalampos E. Tsourakakis, U Kang, Gary Miller, and Christos Faloutsos, "DOULION: Counting Triangles in Massive Graphs with a Coin", ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) 2009, Paris, France. (acceptance rate 20 %) [BIBTEX] [PDF]