BIGtensor-GPU  1.0
BIGtensor-gpu: Fast and Accurate Tensor Analysis System using GPUs

Ver 1.0 (Last Updated: 2021/06/14)

Overview

BIGtensor-gpu provides a fast and accurate tensor mining tool.

BIGtensor-gpu

  • supports large tensors.
  • supports fast tensor factorization.
  • supports accurate tensor factorization.
  • supports versatile tensor algebra.

Versatile tensor algebra

We provide various tensor operations as follows:

  • Tensor generation : tensors filled with ones or random values, tensors built from given factor matrices and a core tensor, and sparse R-MAT and Kronecker tensors.
  • Tensor-tensor operation : binary operations, arithmetic operations, and n-mode product of two tensors.
  • Tensor manipulation : matricization (unfolding a tensor along a certain mode), converting a tensor into a binary tensor, permuting the order of the modes of a tensor, and scaling or collapsing a mode of a tensor.
  • Tensor factorization : PARAFAC, nonnegative PARAFAC, Tucker, nonnegative Tucker, and CMTF.

Requirements

BIGtensor-gpu requires several libraries:

  • OpenMP (2.0 or above): if you use the gcc/g++ compiler, it is installed by default in Linux environments.
  • OpenCL (1.2 or above):
    • To check your device information, run the device-info program with the ./bin/device-info command.
    • Install the proper OpenCL SDK for your GPU environment (e.g., NVIDIA GPU -> CUDA Toolkit).
  • Eigen: already placed in src/Eigen. Visit http://eigen.tuxfamily.org/ for more information.

Usage

  1. Install requirements.
  2. Configure a shared library path with export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/out/.
  3. Write a test program that uses the library.
  4. Build the library and its executables with make.
  5. Run the program with ./demo.sh and check the results.

Quick start

There are two ways to use the BIGtensor library: (1) writing a single executable (demo) or (2) using the C/C++ API.

(1) Make a demo file for a single executable.

  1. Type make tf after installing all requirements and configuring the library path.
  2. The executable demo file, generated from the test code in src/Test_Code/test_tf.cpp, is placed in the bin directory after compilation.
  3. Type ./bin/demo_tf resource/sample_tensor_large.txt result 10 512 1 1 0 to execute the binary.
    • ./bin/demo_tf [TENSOR PATH] [RESULT PATH] [RANK] [LOCAL SIZE] [NUMBER OF GPUS TO BE USED] [FULLY or PARTIALLY OBSERVABLE] [CP or TUCKER]
      • [TENSOR PATH] : the path of the input tensor
      • [RESULT PATH] : the output path for the factorization results
      • [RANK] : the rank of the decomposition
      • [LOCAL SIZE] : GPU work group size
      • [NUMBER OF GPUS TO BE USED] : the number of GPUs, 0 for only CPU
      • [FULLY or PARTIALLY OBSERVABLE] : 0 for fully observable (dense tensor) and 1 for partially observable (sparse tensor).
      • [CP or TUCKER] : 0 for CP decomposition and 1 for Tucker decomposition
    • sample_tensor_large.txt is a 3-order random tensor of size 100x100x100 with 1000 nonzeros, located in the resource directory.
  4. You can check the factorization results (FACTOR0, FACTOR1, FACTOR2, and CORETENSOR) in the result directory.

Please see src/Test_Code and demo.sh for further examples. Note that the results of demo.sh are written to the result directory, and that tensor-tensor operations do not output operation results. The table below describes the results of running demo.sh.

Type: Tensor Factorization
  Function: Decompose tensors (CP, Tucker), optionally with nonnegativity constraints or a coupled matrix.
  Results: FACTOR* and CORETENSOR represent the factor matrices and the core tensor, respectively. For example, CP decomposition of a 3-order tensor returns three factor matrices of shape (tensor_shape[i], rank) and a 1-d array of shape (rank). Note that 1) results from nonnegative tensor factorization contain only nonnegative values, and 2) CMTF returns extra results: a factor matrix CFACTOR and a weight vector CORETENSOR2 for the input coupled matrix.

Type: Tensor Generation
  Function: Generate tensors satisfying certain conditions (e.g., filled with random values).
  Results: Generated tensors are written in a sparse (COO) format. For example, a 3-order tensor filled with random values (random.txt) is written with rows (i, j, k, nnz), where i, j, k are tensor indices and nnz are the random values.

Type: Tensor Manipulation
  Function: Compute essential tensor operations.
  Results: Operated tensors are written in a sparse (COO) format. For example, the matricization of a 3-order tensor (matricization.txt) is written with rows (i, j, nnz), where i, j are matrix indices and nnz are the values.

(2) C/C++ API

To use the API, include Tensor.h and BIGtensor.hpp in your program.

The following is a code example using the API.

#include <Tensor.h>
#include <BIGtensor.hpp>

int main() {
    /* CANDECOMP/PARAFAC (CP) FACTORIZATION */
    BIGtensor bt;
    int rank = 10;
    int fully_or_partially_observable = 1;  /* 1: partially observable (sparse) tensor */
    char *in_tensor_path = "resource/sample_tensor_large.txt";
    char *out_factor_path = "result";
    bt.Parafac(rank, fully_or_partially_observable, in_tensor_path, out_factor_path, ANY);
    return 0;
}
  • [rank] : the rank of the decomposition
  • [fully_or_partially_observable] : 0 for fully observable (dense tensor) and 1 for partially observable (sparse tensor)
  • [in_tensor_path] : the path of the input tensor
  • [out_factor_path] : the output path for the factorization results

Please see src/Test_Code/test_bigtensor.cpp for further examples.