{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. note::\n",
" The following content provides technical and mathematical background for the `mprod-package`. \n",
" Most users of downstream applications such as `TCAM` would probably like to skip this part\n",
"\n",
"$\\newcommand{\\mat}[1]{\\mathbf{#1}}$\n",
"$\\newcommand{\\matM}{\\mat{M}}$\n",
"$\\newcommand{\\matMt}{\\matM^{\\T}}$\n",
"$\\newcommand{\\matMi}{\\matM^{-1}}$\n",
"$\\newcommand{\\T}{\\mat{T}}$\n",
"$\\newcommand{\\xx}{\\times}$\n",
"$\\newcommand{\\mpn}{m \\xx p \\xx n}$\n",
"$\\newcommand{\\pmn}{p \\xx m \\xx n}$\n",
"$\\newcommand{\\tens}[1]{\\mathcal{#1}}$\n",
"$\\newcommand{\\tA}{\\tens{A}}$\n",
"$\\newcommand{\\tAt}{\\tA^{\\T}}$\n",
"$\\newcommand{\\thA}{\\widehat{\\tA}}$\n",
"$\\newcommand{\\thAt}{\\thA^{\\T}}$\n",
"$\\newcommand{\\tC}{\\tens{C}}$\n",
"$\\newcommand{\\tCt}{\\tC^{\\T}}$\n",
"$\\newcommand{\\thC}{\\widehat{\\tC}}$\n",
"$\\newcommand{\\thCt}{\\thC^{\\T}}$\n",
"$\\newcommand{\\tB}{\\tens{B}}$\n",
"$\\newcommand{\\tBt}{\\tB^{\\T}}$\n",
"$\\newcommand{\\thB}{\\widehat{\\tB}}$\n",
"$\\newcommand{\\thBt}{\\thB^{\\T}}$\n",
"$\\newcommand{\\tsub}[1]{\\xx_{#1}}$\n",
"$\\newcommand{\\tsM}{\\tsub{3}\\matM}$\n",
"$\\newcommand{\\tsMinv}{\\tsub{3}\\matM^{-1}}$\n",
"$\\newcommand{\\mm}{\\star_{\\scriptscriptstyle \\matM } }$\n",
"$\\newcommand{\\RR}{\\mathbb{R}}$\n",
"$\\newcommand{\\tI}{\\tens{I}}$\n",
"$\\newcommand{\\thI}{\\widehat{\\tI}}$\n",
"$\\newcommand{\\tE}{\\tens{E}}$\n",
"$\\newcommand{\\tQ}{\\tens{Q}}$\n",
"$\\newcommand{\\tQt}{\\tQ^{\\T}}$\n",
"$\\newcommand{\\thQ}{\\widehat{\\tQ}}$\n",
"$\\newcommand{\\thQt}{\\thQ^{\\T}}$\n",
"$\\newcommand{\\tV}{\\tens{V}}$\n",
"$\\newcommand{\\tVt}{\\tV^{\\T}}$\n",
"$\\newcommand{\\thV}{\\widehat{\\tV}}$\n",
"$\\newcommand{\\thVt}{\\thV^{\\T}}$\n",
"$\\newcommand{\\tU}{\\tens{U}}$\n",
"$\\newcommand{\\tUt}{\\tU^{\\T}}$\n",
"$\\newcommand{\\thU}{\\widehat{\\tU}}$\n",
"$\\newcommand{\\thUt}{\\thU^{\\T}}$\n",
"$\\newcommand{\\tS}{\\tens{S}}$\n",
"$\\newcommand{\\tSt}{\\tS^{\\T}}$\n",
"$\\newcommand{\\thS}{\\widehat{\\tS}}$\n",
"$\\newcommand{\\thSt}{\\thS^{\\T}}$\n",
"$\\newcommand{\\hsigma}{\\hat{\\sigma}}$\n",
"$\\newcommand{\\rnk}{\\operatorname{rank}}$\n",
"$\\newcommand{\\rrho}{\\boldsymbol{\\rho}}$\n",
"$\\newcommand{\\TNorm}[1]{\\|#1\\|_{2}}$\n",
"$\\newcommand{\\FNorm}[1]{\\|#1\\|_{F}}$\n",
"$\\newcommand{\\NNorm}[1]{\\|#1\\|_{*}}$\n",
"$\\newcommand{\\FNormS}[1]{\\FNorm{#1}^2}$\n",
"$\\newcommand{\\TNormS}[1]{\\TNorm{#1}^2}$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The main functionality of ``mprod-package`` is factorization of tensors, that is, expressing a tensor $\\tA \\in \\RR^{d_1 \\xx ... \\xx d_N}$ as a product of other, \"simpler\" tensors. \n",
"For this aim, one must first obtain some notion of tensor-tensor multiplication.\n",
"The \"M-product\" (denoted by $\\mm$ ), defined in Kilmer et al., refers to a \"family\" of tensor-tensor products, and provides the notion of multiplication which enables the factorization of tensors. \n",
"Here, we briefly walk through the steps of $\\mm$-product's formal construction. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# The M-product\n",
"\n",
"We begin with some definitions.
\n",
"Let $\\matM$ be an $n\\xx n$ unitary matrix ($\\matM \\matMt = \\mat{I}_n = \\matMt \\matM$), and a tensor $\\tA \\in \\RR^{\\mpn}$. \n",
"We define the **domain transform** specified by $\\matM$ as $\\thA := \\tA \\tsM$, where $\\tsM$ denotes the tensor-matrix multiplication of applying $\\matM$ to each of the tensor $n$ dimensional tube fibers ($\\tA_{i,j,:}$).\n",
"\n",
"A practical demonstration using `scipy` and `numpy` libraries:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from scipy.stats import ortho_group # used for sampling random unitary matrices \n",
" # from the Haar distribution\n",
"\n",
"m, p, n = 10, 5, 8\n",
"\n",
"A = np.random.randn(m, p, n) # generate a random tensor\n",
"M = ortho_group.rvs(n) # random sample unitary M\n",
"\n",
"A_hat = np.zeros_like(A)\n",
"for i in range(m):\n",
" for j in range(p):\n",
" A_hat[i,j,:] = M @ A[i,j,:]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. attention::\n",
" The tensor-matrix product implementation is much more efficient than the above for loop\n",
"\n",
"\n",
"\n",
"The **transpose** of a real $\\mpn$ tensor $\\tA$ with respect to $\\matM$, denoted by $\\tA^{\\T}$, is a $\\pmn$ tensor for which \n",
"$$[\\widehat{\\tA^{\\T}}]_{:,:,i} = [\\thA^{\\T}]_{:,:,i} = {[\\thA]_{:,:,i}}^{\\T}$$\n",
"\n",
"Given two tensors $\\tA \\in \\RR^{\\mpn}$ and $\\tB \\in \\RR^{p \\xx r \\xx n}$ , the facewise tensor-tensor product of $\\tA$ and $\\tB$, denoted by $\\tA \\vartriangle \\tB$ , is the $m \\xx r \\xx n$ tensor for which \n",
"$$[\\tA \\vartriangle \\tB]_{:,:,i} = \\tA_{:,:,i} \\tB_{:,:,i}$$ \n",
"\n",
"The $\\mm$ **-product** of $\\tA \\in \\RR^{\\mpn}$ and $\\tB \\in \\RR^{p \\xx r \\xx n}$ is defined by \n",
"$$\\tA \\mm \\tB := (\\thA \\vartriangle \\thB) \\tsMinv \\in \\RR^{m \\xx r \\xx n}$$ \n",
"\n",
"\n",
"The `mprod-package` offers utility functions like `m_prod` implementing $\\mm$ as well as random and spectral analysis based generators of unitary transforms"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"129.30020497750468\n"
]
}
],
"source": [
"from mprod import m_prod\n",
"from mprod import generate_haar, generate_dct\n",
"\n",
"funm_haar, invm_haar = generate_haar(n) # Utility wrapper arround \n",
" # scipy.stats.ortho_group \n",
"funm_dct, invm_dct = generate_dct(n) # Generates dct and idct transforms using scipy's\n",
" # fft module. the default dct type is 2\n",
"\n",
"# generate random tensor B \n",
"r = 15\n",
"B = np.random.randn(p,r,n)\n",
"\n",
"# Multiply A and B with respect to a randomly sampled M\n",
"C_haar = m_prod(A,B,funm_haar, invm_haar)\n",
"\n",
"# Multiply A and B with respect to M = dct\n",
"C_dct = m_prod(A,B,funm_dct, invm_dct)\n",
"\n",
"print(np.linalg.norm(C_haar - C_dct))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As shown above, given two distinct transforms ${\\matM}_1, {\\matM}_2$ , we have that $\\tA \\star_{\\scriptstyle \\matM_1} \\tB$ and $\\tA \\star_{\\scriptstyle \\matM_2} \\tB$ are not equal in general.\n",
"This fact, as we will see, provides high flexibility when applying $\\mm$ based dimensionality reduction schemes.\n",
"\n",
"Two tensors $\\tA, \\tB \\in \\RR^{1 \\xx m \\xx n}$ are called $\\mm$ **-orthogonal slices** if $\\tA^{\\T} \\mm \\tB = \\mathbf{0}$, where $\\mathbf{0} \\in \\RR^{1\\xx 1 \\xx n}$ is the zero tube fiber, while $\\tQ \\in \\RR^{m \\xx m \\xx n}$ is called $\\mm$ **-unitary** if $\\tQ^{\\T} \\mm \\tQ = \\tI = \\tQ \\mm \\tQ^{\\T}$ .\n",
"
\n",
"A tensor $\\tB \\in \\RR^{p \\xx k \\xx n}$ is said to be a pseudo $\\mm$ -unitary tensor (or pseudo $\\mm$-orthogonal) if $\\tB^{\\T} \\mm \\tB$ is f-diagonal (i.e., all frontal slices are diagonal), and all frontal slices of $(\\tB^{\\T} \\mm \\tB) \\tsM$ are diagonal matrices with entries that are either ones or zeros.\n",
"\n",
"\n",
"# TSVDM\n",
"\n",
"Let $\\tA \\in \\RR^{\\mpn}$ be a real tensor, then is possible to write the full **tubal singular value decomposition** of $\\tA$ as \n",
"$$\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T}$$ \n",
"\n",
"where $\\tU, \\tV$ are $(m \\xx m \\xx n)$ and $(p \\xx p \\xx n)$ , $\\mm$-unitary tensors respectively, and $\\tS \\in \\RR^{\\mpn}$ is an **f-diagonal** tensor, that is, a tensor whose frontal slices ( $\\tS_{:,:,i}$ ) are matrices with zeros outside their main diagonal.
\n",
"\n",
"We use the notation $\\hsigma_{j}^{(i)}$ do denote the $j^{th}$ largest singular value on the $i^{th}$ lateral face of $\\thS$: \n",
"$$\\hsigma_{j}^{(i)} := \\thS_{j,j,i}$$\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"U: 10x5x8\n",
"S: 5x8\n",
"V: 5x5x8 \n",
"\n",
"||A - USV'||^2 = 5.159366909775574e-27\n"
]
}
],
"source": [
"from mprod.decompositions import svdm\n",
"from mprod import tensor_mtranspose\n",
"\n",
"U,S,V = svdm(A, funm_haar, invm_haar)\n",
"\n",
"print(\"U:\", \"x\".join(map(str, U.shape)))\n",
"print(\"S:\", \"x\".join(map(str, S.shape)))\n",
"print(\"V:\", \"x\".join(map(str, V.shape)),\"\\n\")\n",
"\n",
"# Note that for practical reasons, S is stored in a lean datastructure\n",
"# To obtain the \"tensorial\" representation of S, we do as follows\n",
"tens_S = np.zeros((p,p,n))\n",
"for i in range(n):\n",
" tens_S[:S.shape[0],:S.shape[0],i] = np.diag(S[:,i])\n",
"\n",
"\n",
"# reconstruct the tensor\n",
"Vt = tensor_mtranspose(V,funm_haar, invm_haar)\n",
"US = m_prod(U, tens_S, funm_haar, invm_haar)\n",
"USVt = m_prod(US, Vt, funm_haar, invm_haar)\n",
"\n",
"print(\"||A - USV'||^2 =\",np.linalg.norm(A - USVt)**2) # practically 0"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tensor ranks and truncations\n",
"\n",
"* The **t-rank** of $\\tA$ is the number of nonezero tubes of $\\tS$: \n",
"$$\n",
"r = | \\left\\{ i = 1, \\dots, n ~;~ \\FNormS{\\tS_{i,i,:}} > 0 \\right\\} |\n",
"$$\n",
"\n",
"$\\tA^{(q)} = \\tU_{:,1:q, :} \\mm \\tS_{1:q,1:q,:} \\mm {\\tV_{:,1:q,:}}^{\\T}$ denotes the t-rank $q$ truncation of $\\tA$ under $\\mm$\n",
" \n",
"* The **multi-rank** of $\\tA$ under $\\mm$, denoted by the vector $\\rrho \\in \\mathbb{N}^{n}$ whose $i^{th}$ entry is \n",
"$$\n",
"\\rrho_i = \\rnk (\\thA_{:,:,i})\n",
"$$\n",
"\n",
"The multi-rank $\\rrho$ truncation of $\\tA$ under $\\mm$ is given by the tensor $\\tA_{\\rrho}$ for which \n",
"$$\n",
"\\widehat{\\tA_{\\rrho}}_{:,:,i} = \\thU_{:,1:\\rrho_i, i} \\thS_{1:\\rrho_i,1:\\rrho_i,i} {\\thV_{:,1:\\rrho_i,i}}^{\\T}\n",
"$$ \n",
"\n",
"* The **implicit rank** under $\\mm$ of a tensor $\\tA$ with multi-rank $\\rrho$ under $\\mm$ is \n",
"$$\n",
"r = \\sum_{i=1}^{n} \\rrho_i\n",
"$$\n",
"\n",
"Note that for t-rank truncation the $\\tU$ and $\\tV$ factors are $\\mm$-orthogonal, while for multi-rank truncation they are only pseudo $\\mm$-orthogonal."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# t-rank 4 trunctation \n",
"q = 4\n",
"tens_S_t_hat = funm_haar(tens_S.copy())\n",
"tens_S_t_hat[q:,q:,:] = 0\n",
"tens_S_t = invm_haar(tens_S_t_hat)\n",
"A4 = m_prod(m_prod(U, tens_S_t, funm_haar, invm_haar), Vt, funm_haar, invm_haar)\n",
"\n",
"\n",
"# multi-rank rho trunctation \n",
"rho = [1,3,2,2,3,1,4,3] # this is the multi-rank vector\n",
"tens_S_rho_hat = funm_haar(tens_S.copy())\n",
"for i in range(n):\n",
" tens_S_rho_hat[rho[i]:,rho[i]:,i] = 0\n",
"\n",
"tens_S_rho = invm_haar(tens_S_rho_hat)\n",
"A_rho = m_prod(m_prod(U, tens_S_rho, funm_haar, invm_haar), Vt, funm_haar, invm_haar)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"Let $\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T} \\in \\RR^{\\mpn}$, \n",
"we will use $j_1,\\dots, j_{np}$ and $i_1,\\dots, i_{np}$ to denote the indexes of the non-zeros of $\\thS$ ordered in decreasing order. That is \n",
"$$\\hsigma_{\\ell} := \\hsigma_{j_{\\ell}}^{(i_{\\ell})}$$\n",
"\n",
"where $\\hsigma_1 \\geq \\hsigma_2 \\geq \\dots \\geq \\hsigma_{np}$ .\n",
"\n",
"For $q = 1 , \\dots , p n$ , the **explicit rank-** $q$ **truncation** under $\\mm$ of a tensor $\\tA$, denoted by $\\tA_q = \\tA_{\\rrho}$ , where $\\tA_{\\rrho}$ is the tensor of multi-rank $\\rrho$ under $\\mm$ such that \n",
"$$\\rrho_i = \\max \\{ j = 1, \\dots ,p ~|~ (j,i) \\in \\{(j_1, j_1), \\dots, (j_q, i_q)\\} \\} .$$ \n",
"\n",
"In words, we keep the $q$ top singular values of any frontal slice of $\\thS$, and zero out the rest. \n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. note::\n",
" We have that $\\tA^{(q)}, \\tA_{\\rrho}$ and $\\tA_{q}$ are the best t-rank $q$, multi-rank $\\rrho$ and explicit-rank $q$ (under $\\mm$) approximations of $\\tA$, respectively.\n",
"\n",
"\n",
"\n",
"\n",
"# The effect of choosing different transforms \n",
"\n",
"To demonstrate how might the choice of $\\matM$ influence the resulting decomposition, we use the real-world time-series dataset obtained from a study on Pediatric Ulcerative Colitis (PUC) by Schirmer et al..\n",
"\n",
"First, we obtain the data table from our analysis GitHub repo, construct a tensor from the data and apply TSVDM with respect to both randomly sampled $\\matM$ and the DCT.\n",
"\n",
"Note that in `generate_haar` function call, we set the `random_state` parameter to an integer (123) just so that the results are reproducible."
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"shape of S, by randomly sampled transform: (87, 4)\n",
"shape of S, by DCT: (4, 4)\n"
]
}
],
"source": [
"import pandas as pd\n",
"from mprod import table2tensor\n",
"\n",
"file_path = \"https://raw.githubusercontent.com/UriaMorP/\" \\\n",
" \"tcam_analysis_notebooks/main/Schirmer2018/Schirmer2018.tsv\"\n",
"\n",
"data_raw = pd.read_csv(file_path, index_col=[0,1], sep=\"\\t\"\n",
" , dtype={'Week':int})\n",
"\n",
"data_tensor, map1, map3 = table2tensor(data_raw)\n",
"\n",
"m,p,n = data_tensor.shape\n",
"\n",
"# Generate transforms according to the \n",
"# relevant dimensions\n",
"funm_haar, invm_haar = generate_haar(n,random_state=123)\n",
"funm_dct, invm_dct = generate_dct(n)\n",
"\n",
"\n",
"# Haar\n",
"Uhaar, Shaar, Vhaar = svdm(data_tensor, funm_haar, invm_haar)\n",
"print(\"shape of S, by randomly sampled transform:\", Shaar.shape)\n",
"# DCT\n",
"Udct, Sdct, Vdct = svdm(data_tensor, funm_dct, invm_dct)\n",
"print(\"shape of S, by DCT:\", Sdct.shape)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case, we have that the t-rank of our data under the DCT domain transform is 4, and 87 under $\\mm$ where $\\matM$ is obtained from randomly sampling the Haar distribution. \n",
"\n",
"Even though it is not generally true that choosing $\\matM$ as DCT (the t-product) results in better compression, the fact that it does so for time-series data makes perfect sense; Since we assume that time-series data are samples of continuous functions, which, are easy to approximate well using very few DCT basis elements."
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. footbibliography::"
]
}
],
"metadata": {
"celltoolbar": "Raw Cell Format",
"kernelspec": {
"display_name": "mprod",
"language": "python",
"name": "mprod"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.8"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}