HaTen2: Hadoop Tensor method for 2 decompositions
Billion-scale Tensor Decompositions
How can we find useful patterns and anomalies in large-scale real-world data with multiple attributes? Tensors are well suited to modeling such multidimensional data and are widely used to analyze social networks, web data, network traffic, and data in many other settings. HaTen2 is a scalable distributed algorithm for decomposing large-scale tensors, running on the MapReduce framework. HaTen2 decomposes tensors 100x larger than existing methods can handle.
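To make the idea of a decomposition concrete, here is a minimal single-machine sketch in Python. It is not HaTen2's MapReduce algorithm. It assumes the simplest PARAFAC-style model, in which each tensor entry is approximated by the product of one value per mode (rank 1), and fits it by alternating least squares over a tensor stored as coordinate entries. The function name `rank1_als` and the toy call-count data are hypothetical, chosen only for illustration.

```python
def rank1_als(entries, shape, n_iter=10):
    """Fit x[i, j, k] ~ a[i] * b[j] * c[k] by alternating least squares.

    entries: dict mapping (i, j, k) -> nonzero value (coordinate format).
    shape:   (I, J, K), the size of each mode.
    """
    factors = [[1.0] * n for n in shape]  # start from all-ones vectors
    for _ in range(n_iter):
        for mode in range(3):
            others = [m for m in range(3) if m != mode]
            # Least-squares denominator: product of the squared norms
            # of the two factors that are held fixed.
            denom = 1.0
            for m in others:
                denom *= sum(v * v for v in factors[m])
            # Numerator: accumulate over the nonzero entries only.
            updated = [0.0] * shape[mode]
            for idx, val in entries.items():
                weight = val
                for m in others:
                    weight *= factors[m][idx[m]]
                updated[idx[mode]] += weight
            factors[mode] = [u / denom if denom else 0.0 for u in updated]
    return factors


# Hypothetical toy data: (sender, receiver, day) -> number of calls.
# The tensor is built to be exactly rank 1; zero entries are not stored.
a_true, b_true, c_true = [1.0, 2.0, 0.0], [3.0, 1.0], [1.0, 4.0]
entries = {
    (i, j, k): a * b * c
    for i, a in enumerate(a_true)
    for j, b in enumerate(b_true)
    for k, c in enumerate(c_true)
    if a * b * c != 0.0
}

a, b, c = rank1_als(entries, shape=(3, 2, 2))
approx = a[1] * b[0] * c[1]  # model's estimate of entry (1, 0, 1)
```

Because the toy tensor is exactly rank 1, `approx` should match `entries[(1, 0, 1)]` up to floating-point rounding. Real data would need a higher rank and a distributed implementation to reach the scales described above.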
Download HaTen2 - v1.0
The binary code of HaTen2 is available here.
Paper
- HaTen2: Billion-scale Tensor Decompositions.
Inah Jeon, Evangelos E. Papalexakis, U Kang, Christos Faloutsos.
31st IEEE International Conference on Data Engineering (ICDE) 2015, Seoul, Korea.

- Mining Billion-Scale Tensors: Algorithms and Discoveries.
Inah Jeon, Evangelos E. Papalexakis, Christos Faloutsos, Lee Sael, U Kang.
The International Journal on Very Large Data Bases (VLDB)
Dataset
| Name | Dimensionality | Nonzero | Source | Description |
|---|---|---|---|---|
| Freebase-music | 23M x 23M x 166 | 99M | Freebase (Web) | Freebase RDF data (Entity, Entity, Relation triples) |
| Freebase-sampled | 38M x 38M x 532 | 139M | Freebase (Web) | Freebase RDF data (Entity, Entity, Relation triples) |
| NELL | 26M x 26M x 48M | 144M | NELL (Web) | Knowledgebase data (Subject, Object, Predicate triples) |
| NELL-2 | 14K x 14K x 28K | 77M | NELL (Web) | Knowledgebase data (Subject, Object, Predicate triples) |
| Phonecall | 30M x 30M x 62 | 184M | | Phone call history (Sender id, Receiver id, Date triples) |
| DARPA1998 | 22K x 22K x 23M | 28M | DARPA (Web) | Network traffic history (Source IP, Destination IP, Time triples) |
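
The Dimensionality and Nonzero columns can be read as properties of a sparse tensor built from triples. The sketch below shows one way such triples might be turned into coordinate entries. It assumes that the first two modes share a single entity index (suggested by their equal sizes in the table) and that repeated triples are summed into a count; both are assumptions, not a description of how the datasets above were prepared. The function name `build_sparse_tensor` and the sample triples are hypothetical.

```python
from collections import Counter


def build_sparse_tensor(triples):
    """Turn (subject, object, predicate) strings into coordinate entries.

    Returns (entries, shape, entity_ids, relation_ids), where entries maps
    (i, j, k) -> number of times that triple occurred.
    """
    entity_ids = {}    # shared index for modes 1 and 2 (assumption)
    relation_ids = {}  # separate index for mode 3
    entries = Counter()
    for subj, obj, pred in triples:
        # setdefault assigns the next free integer id to an unseen label.
        i = entity_ids.setdefault(subj, len(entity_ids))
        j = entity_ids.setdefault(obj, len(entity_ids))
        k = relation_ids.setdefault(pred, len(relation_ids))
        entries[(i, j, k)] += 1
    shape = (len(entity_ids), len(entity_ids), len(relation_ids))
    return entries, shape, entity_ids, relation_ids


# Hypothetical sample triples, including one duplicate.
triples = [
    ("Seoul", "Korea", "capital_of"),
    ("Korea", "Asia", "located_in"),
    ("Seoul", "Asia", "located_in"),
    ("Seoul", "Korea", "capital_of"),
]

entries, shape, entity_ids, relation_ids = build_sparse_tensor(triples)
nonzero = len(entries)  # number of distinct coordinates stored
```

Here `shape` plays the role of the Dimensionality column and `nonzero` the role of the Nonzero column. Only observed coordinates are stored, which is what makes tensors with billions of possible cells tractable.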
People
- Inah Jeon, Future IT R&D Lab, LG Electronics
- Evangelos E. Papalexakis, Department of Computer Science, Carnegie Mellon University
- Christos Faloutsos, Department of Computer Science, Carnegie Mellon University
- Lee Sael, Department of Computer Science, SUNY
- U Kang, Department of Computer Science and Engineering, Seoul National University