BIGtensor-GPU  1.0
BIGtensor-gpu: Fast and Accurate Tensor Analysis System using GPUs

Ver 1.0 (Last Updated: 2021/06/14)

Overview

BIGtensor-gpu provides a fast and accurate tensor mining tool.

BIGtensor-gpu

  • supports large tensors.
  • supports fast tensor factorization.
  • supports accurate tensor factorization.
  • supports versatile tensor algebra.

Versatile tensor algebra

We provide various tensor operations as follows:

  • Tensor generation : tensors filled with ones or random values, tensors built from given factor matrices and a core tensor, and sparse R-MAT and Kronecker tensors.
  • Tensor-tensor operation : binary operations, arithmetic operations, and n-mode product of two tensors.
  • Tensor manipulation : matricization (unfolding a tensor along a certain mode), converting a tensor into a binary tensor, permuting the order of the modes of a tensor, and scaling or collapsing a mode of a tensor.
  • Tensor factorization : PARAFAC, nonnegative PARAFAC, Tucker, nonnegative Tucker, and CMTF.

Requirements

BIGtensor-gpu requires several libraries:

  • OpenMP (2.0 or above): if you use the gcc/g++ compiler, it is installed by default in Linux environments.
  • OpenCL (1.2 or above):
    • To check your device information, run the device-info program with the ./bin/device-info command.
    • Install the proper OpenCL SDK for your GPU environment (e.g., NVIDIA GPU -> CUDA Toolkit).
  • Eigen: already placed in src/Eigen. Visit http://eigen.tuxfamily.org/ for more information.

Usage

  1. Install requirements.
  2. Configure a shared library path with export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/out/.
  3. Write a test program that uses the library.
  4. Build the library and its executables with make.
  5. Run the program with ./demo.sh and check the results.

Quick start

There are two ways to use the BIGtensor library: (1) writing a single executable (demo) or (2) using the C/C++ API.

(1) Make a demo file for a single executable.

  1. Type make tf after installing all requirements and configuring the library path.
  2. The executable demo file, generated from the test code in src/Test_Code/test_tf.cpp, is placed in the bin directory after compilation.
  3. Type ./bin/demo_tf resource/sample_tensor_large.txt result 10 512 1 1 0 to execute the binary.
    • ./bin/demo_tf [TENSOR PATH] [RESULT PATH] [RANK] [LOCAL SIZE] [NUMBER OF GPUS TO BE USED] [FULLY or PARTIALLY OBSERVABLE] [CP or TUCKER]
      • [TENSOR PATH] : the path of the input tensor
      • [RESULT PATH] : the output path for the factorization results
      • [RANK] : the rank of the decomposition
      • [LOCAL SIZE] : GPU work group size
      • [NUMBER OF GPUS TO BE USED] : the number of GPUs, 0 for only CPU
      • [FULLY or PARTIALLY OBSERVABLE] : 0 for fully observable (dense tensor) and 1 for partially observable (sparse tensor).
      • [CP or TUCKER] : 0 for CP decomposition and 1 for Tucker decomposition
    • sample_tensor_large.txt is a 3-order random tensor of size 100x100x100 with 1000 nonzeros, located in the resource directory.
  4. You can check the factorization results (FACTOR0, FACTOR1, FACTOR2, and CORETENSOR) in the result directory.

Please see src/Test_Code and demo.sh for further examples. Note that the results of demo.sh are written to the result directory, and that tensor-tensor operations do not output operation results. The table below describes the results of running demo.sh.

Type: Tensor Factorization
  Function: Decompose tensors (CP, Tucker), optionally with nonnegativity constraints or a coupled matrix.
  Results: FACTOR* and CORETENSOR represent the factor matrices and the core tensor, respectively. For example, CP decomposition of a 3-order tensor returns three factor matrices of shape (tensor_shape[i], rank) and a 1-d array of shape (rank). Note that 1) results from nonnegative tensor factorization contain only nonnegative values, and 2) CMTF returns extra results: a factor matrix CFACTOR and a weight vector CORETENSOR2 for the input coupled matrix.

Type: Tensor Generation
  Function: Generate tensors satisfying certain conditions (e.g., filled with random values).
  Results: Generated tensors are written in a sparse (COO) format. For example, a 3-order tensor filled with random values (random.txt) is written with rows (i, j, k, nnz), where i, j, k are tensor indices and nnz are the random values.

Type: Tensor Manipulation
  Function: Compute essential tensor operations.
  Results: Operated tensors are written in a sparse (COO) format. For example, the matricization of a 3-order tensor (matricization.txt) is written with rows (i, j, nnz), where i, j are matrix indices and nnz are the values.

(2) C/C++ API

To use the API, include Tensor.h and BIGtensor.hpp in your program.

The following is a code example using the API.

#include <Tensor.h>
#include <BIGtensor.hpp>

int main() {
    /* CANDECOMP/PARAFAC (CP) FACTORIZATION */
    BIGtensor bt;
    int rank = 10;
    int fully_or_partially_observable = 1;  /* 1: partially observable (sparse) tensor */
    char *in_tensor_path = "resource/sample_tensor_large.txt";
    char *out_factor_path = "result";
    bt.Parafac(rank, fully_or_partially_observable, in_tensor_path, out_factor_path, ANY);
    return 0;
}
  • [rank] : the rank of the decomposition
  • [fully_or_partially_observable] : 0 for fully observable (dense tensor) and 1 for partially observable (sparse tensor)
  • [in_tensor_path] : the path of the input tensor
  • [out_factor_path] : the output path for the factorization results

Please see src/Test_Code/test_bigtensor.cpp for further examples.