{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ ".. note::\n", " The following content provides technical and mathematical background for the `mprod-package`. \n", " Most users of downstream applications such as `TCAM` would probably like to skip this part\n", "\n", "$\\newcommand{\\mat}[1]{\\mathbf{#1}}$\n", "$\\newcommand{\\matM}{\\mat{M}}$\n", "$\\newcommand{\\matMt}{\\matM^{\\T}}$\n", "$\\newcommand{\\matMi}{\\matM^{-1}}$\n", "$\\newcommand{\\T}{\\mat{T}}$\n", "$\\newcommand{\\xx}{\\times}$\n", "$\\newcommand{\\mpn}{m \\xx p \\xx n}$\n", "$\\newcommand{\\pmn}{p \\xx m \\xx n}$\n", "$\\newcommand{\\tens}[1]{\\mathcal{#1}}$\n", "$\\newcommand{\\tA}{\\tens{A}}$\n", "$\\newcommand{\\tAt}{\\tA^{\\T}}$\n", "$\\newcommand{\\thA}{\\widehat{\\tA}}$\n", "$\\newcommand{\\thAt}{\\thA^{\\T}}$\n", "$\\newcommand{\\tC}{\\tens{C}}$\n", "$\\newcommand{\\tCt}{\\tC^{\\T}}$\n", "$\\newcommand{\\thC}{\\widehat{\\tC}}$\n", "$\\newcommand{\\thCt}{\\thC^{\\T}}$\n", "$\\newcommand{\\tB}{\\tens{B}}$\n", "$\\newcommand{\\tBt}{\\tB^{\\T}}$\n", "$\\newcommand{\\thB}{\\widehat{\\tB}}$\n", "$\\newcommand{\\thBt}{\\thB^{\\T}}$\n", "$\\newcommand{\\tsub}[1]{\\xx_{#1}}$\n", "$\\newcommand{\\tsM}{\\tsub{3}\\matM}$\n", "$\\newcommand{\\tsMinv}{\\tsub{3}\\matM^{-1}}$\n", "$\\newcommand{\\mm}{\\star_{\\scriptscriptstyle \\matM } }$\n", "$\\newcommand{\\RR}{\\mathbb{R}}$\n", "$\\newcommand{\\tI}{\\tens{I}}$\n", "$\\newcommand{\\thI}{\\widehat{\\tI}}$\n", "$\\newcommand{\\tE}{\\tens{E}}$\n", "$\\newcommand{\\tQ}{\\tens{Q}}$\n", "$\\newcommand{\\tQt}{\\tQ^{\\T}}$\n", "$\\newcommand{\\thQ}{\\widehat{\\tQ}}$\n", "$\\newcommand{\\thQt}{\\thQ^{\\T}}$\n", "$\\newcommand{\\tV}{\\tens{V}}$\n", "$\\newcommand{\\tVt}{\\tV^{\\T}}$\n", "$\\newcommand{\\thV}{\\widehat{\\tV}}$\n", "$\\newcommand{\\thVt}{\\thV^{\\T}}$\n", "$\\newcommand{\\tU}{\\tens{U}}$\n", "$\\newcommand{\\tUt}{\\tU^{\\T}}$\n", "$\\newcommand{\\thU}{\\widehat{\\tU}}$\n", "$\\newcommand{\\thUt}{\\thU^{\\T}}$\n", "$\\newcommand{\\tS}{\\tens{S}}$\n", "$\\newcommand{\\tSt}{\\tS^{\\T}}$\n", "$\\newcommand{\\thS}{\\widehat{\\tS}}$\n", "$\\newcommand{\\thSt}{\\thS^{\\T}}$\n", "$\\newcommand{\\hsigma}{\\hat{\\sigma}}$\n", "$\\newcommand{\\rnk}{\\operatorname{rank}}$\n", "$\\newcommand{\\rrho}{\\boldsymbol{\\rho}}$\n", "$\\newcommand{\\TNorm}[1]{\\|#1\\|_{2}}$\n", "$\\newcommand{\\FNorm}[1]{\\|#1\\|_{F}}$\n", "$\\newcommand{\\NNorm}[1]{\\|#1\\|_{*}}$\n", "$\\newcommand{\\FNormS}[1]{\\FNorm{#1}^2}$\n", "$\\newcommand{\\TNormS}[1]{\\TNorm{#1}^2}$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main functionality of ``mprod-package`` is factorization of tensors, that is, expressing a tensor $\\tA \\in \\RR^{d_1 \\xx ... \\xx d_N}$ as a product of other, \"simpler\" tensors. \n", "For this aim, one must first obtain some notion of tensor-tensor multiplication.\n", "The \"M-product\" (denoted by $\\mm$ ), defined in Kilmer et al., refers to a \"family\" of tensor-tensor products, and provides the notion of multiplication which enables the factorization of tensors. \n", "Here, we briefly walk through the steps of $\\mm$-product's formal construction. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The M-product\n", "\n", "We begin with some definitions.
\n", "Let $\\matM$ be an $n\\xx n$ unitary matrix ($\\matM \\matMt = \\mat{I}_n = \\matMt \\matM$), and a tensor $\\tA \\in \\RR^{\\mpn}$. \n", "We define the **domain transform** specified by $\\matM$ as $\\thA := \\tA \\tsM$, where $\\tsM$ denotes the tensor-matrix multiplication of applying $\\matM$ to each of the tensor $n$ dimensional tube fibers ($\\tA_{i,j,:}$).\n", "\n", "A practical demonstration using `scipy` and `numpy` libraries:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.stats import ortho_group # used for sampling random unitary matrices \n", " # from the Haar distribution\n", "\n", "m, p, n = 10, 5, 8\n", "\n", "A = np.random.randn(m, p, n) # generate a random tensor\n", "M = ortho_group.rvs(n) # random sample unitary M\n", "\n", "A_hat = np.zeros_like(A)\n", "for i in range(m):\n", " for j in range(p):\n", " A_hat[i,j,:] = M @ A[i,j,:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ".. attention::\n", " The tensor-matrix product implementation is much more efficient than the above for loop\n", "\n", "\n", "\n", "The **transpose** of a real $\\mpn$ tensor $\\tA$ with respect to $\\matM$, denoted by $\\tA^{\\T}$, is a $\\pmn$ tensor for which \n", "$$[\\widehat{\\tA^{\\T}}]_{:,:,i} = [\\thA^{\\T}]_{:,:,i} = {[\\thA]_{:,:,i}}^{\\T}$$\n", "\n", "Given two tensors $\\tA \\in \\RR^{\\mpn}$ and $\\tB \\in \\RR^{p \\xx r \\xx n}$ , the facewise tensor-tensor product of $\\tA$ and $\\tB$, denoted by $\\tA \\vartriangle \\tB$ , is the $m \\xx r \\xx n$ tensor for which \n", "$$[\\tA \\vartriangle \\tB]_{:,:,i} = \\tA_{:,:,i} \\tB_{:,:,i}$$ \n", "\n", "The $\\mm$ **-product** of $\\tA \\in \\RR^{\\mpn}$ and $\\tB \\in \\RR^{p \\xx r \\xx n}$ is defined by \n", "$$\\tA \\mm \\tB := (\\thA \\vartriangle \\thB) \\tsMinv \\in \\RR^{m \\xx r \\xx n}$$ \n", "\n", "\n", "The `mprod-package` offers utility functions like `m_prod` implementing $\\mm$ as well as random and spectral analysis based generators of unitary transforms" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "129.30020497750468\n" ] } ], "source": [ "from mprod import m_prod\n", "from mprod import generate_haar, generate_dct\n", "\n", "funm_haar, invm_haar = generate_haar(n) # Utility wrapper arround \n", " # scipy.stats.ortho_group \n", "funm_dct, invm_dct = generate_dct(n) # Generates dct and idct transforms using scipy's\n", " # fft module. the default dct type is 2\n", "\n", "# generate random tensor B \n", "r = 15\n", "B = np.random.randn(p,r,n)\n", "\n", "# Multiply A and B with respect to a randomly sampled M\n", "C_haar = m_prod(A,B,funm_haar, invm_haar)\n", "\n", "# Multiply A and B with respect to M = dct\n", "C_dct = m_prod(A,B,funm_dct, invm_dct)\n", "\n", "print(np.linalg.norm(C_haar - C_dct))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As shown above, given two distinct transforms ${\\matM}_1, {\\matM}_2$ , we have that $\\tA \\star_{\\scriptstyle \\matM_1} \\tB$ and $\\tA \\star_{\\scriptstyle \\matM_2} \\tB$ are not equal in general.\n", "This fact, as we will see, provides high flexibility when applying $\\mm$ based dimensionality reduction schemes.\n", "\n", "Two tensors $\\tA, \\tB \\in \\RR^{1 \\xx m \\xx n}$ are called $\\mm$ **-orthogonal slices** if $\\tA^{\\T} \\mm \\tB = \\mathbf{0}$, where $\\mathbf{0} \\in \\RR^{1\\xx 1 \\xx n}$ is the zero tube fiber, while $\\tQ \\in \\RR^{m \\xx m \\xx n}$ is called $\\mm$ **-unitary** if $\\tQ^{\\T} \\mm \\tQ = \\tI = \\tQ \\mm \\tQ^{\\T}$ .\n", "
\n", "A tensor $\\tB \\in \\RR^{p \\xx k \\xx n}$ is said to be a pseudo $\\mm$ -unitary tensor (or pseudo $\\mm$-orthogonal) if $\\tB^{\\T} \\mm \\tB$ is f-diagonal (i.e., all frontal slices are diagonal), and all frontal slices of $(\\tB^{\\T} \\mm \\tB) \\tsM$ are diagonal matrices with entries that are either ones or zeros.\n", "\n", "\n", "# TSVDM\n", "\n", "Let $\\tA \\in \\RR^{\\mpn}$ be a real tensor, then is possible to write the full **tubal singular value decomposition** of $\\tA$ as \n", "$$\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T}$$ \n", "\n", "where $\\tU, \\tV$ are $(m \\xx m \\xx n)$ and $(p \\xx p \\xx n)$ , $\\mm$-unitary tensors respectively, and $\\tS \\in \\RR^{\\mpn}$ is an **f-diagonal** tensor, that is, a tensor whose frontal slices ( $\\tS_{:,:,i}$ ) are matrices with zeros outside their main diagonal.
\n", "\n", "We use the notation $\\hsigma_{j}^{(i)}$ do denote the $j^{th}$ largest singular value on the $i^{th}$ lateral face of $\\thS$: \n", "$$\\hsigma_{j}^{(i)} := \\thS_{j,j,i}$$\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "U: 10x5x8\n", "S: 5x8\n", "V: 5x5x8 \n", "\n", "||A - USV'||^2 = 5.159366909775574e-27\n" ] } ], "source": [ "from mprod.decompositions import svdm\n", "from mprod import tensor_mtranspose\n", "\n", "U,S,V = svdm(A, funm_haar, invm_haar)\n", "\n", "print(\"U:\", \"x\".join(map(str, U.shape)))\n", "print(\"S:\", \"x\".join(map(str, S.shape)))\n", "print(\"V:\", \"x\".join(map(str, V.shape)),\"\\n\")\n", "\n", "# Note that for practical reasons, S is stored in a lean datastructure\n", "# To obtain the \"tensorial\" representation of S, we do as follows\n", "tens_S = np.zeros((p,p,n))\n", "for i in range(n):\n", " tens_S[:S.shape[0],:S.shape[0],i] = np.diag(S[:,i])\n", "\n", "\n", "# reconstruct the tensor\n", "Vt = tensor_mtranspose(V,funm_haar, invm_haar)\n", "US = m_prod(U, tens_S, funm_haar, invm_haar)\n", "USVt = m_prod(US, Vt, funm_haar, invm_haar)\n", "\n", "print(\"||A - USV'||^2 =\",np.linalg.norm(A - USVt)**2) # practically 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Tensor ranks and truncations\n", "\n", "* The **t-rank** of $\\tA$ is the number of nonezero tubes of $\\tS$: \n", "$$\n", "r = | \\left\\{ i = 1, \\dots, n ~;~ \\FNormS{\\tS_{i,i,:}} > 0 \\right\\} |\n", "$$\n", "\n", "$\\tA^{(q)} = \\tU_{:,1:q, :} \\mm \\tS_{1:q,1:q,:} \\mm {\\tV_{:,1:q,:}}^{\\T}$ denotes the t-rank $q$ truncation of $\\tA$ under $\\mm$\n", " \n", "* The **multi-rank** of $\\tA$ under $\\mm$, denoted by the vector $\\rrho \\in \\mathbb{N}^{n}$ whose $i^{th}$ entry is \n", "$$\n", "\\rrho_i = \\rnk (\\thA_{:,:,i})\n", "$$\n", "\n", "The multi-rank $\\rrho$ truncation of $\\tA$ under $\\mm$ is given by the tensor $\\tA_{\\rrho}$ for which \n", "$$\n", "\\widehat{\\tA_{\\rrho}}_{:,:,i} = \\thU_{:,1:\\rrho_i, i} \\thS_{1:\\rrho_i,1:\\rrho_i,i} {\\thV_{:,1:\\rrho_i,i}}^{\\T}\n", "$$ \n", "\n", "* The **implicit rank** under $\\mm$ of a tensor $\\tA$ with multi-rank $\\rrho$ under $\\mm$ is \n", "$$\n", "r = \\sum_{i=1}^{n} \\rrho_i\n", "$$\n", "\n", "Note that for t-rank truncation the $\\tU$ and $\\tV$ factors are $\\mm$-orthogonal, while for multi-rank truncation they are only pseudo $\\mm$-orthogonal." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# t-rank 4 trunctation \n", "q = 4\n", "tens_S_t_hat = funm_haar(tens_S.copy())\n", "tens_S_t_hat[q:,q:,:] = 0\n", "tens_S_t = invm_haar(tens_S_t_hat)\n", "A4 = m_prod(m_prod(U, tens_S_t, funm_haar, invm_haar), Vt, funm_haar, invm_haar)\n", "\n", "\n", "# multi-rank rho trunctation \n", "rho = [1,3,2,2,3,1,4,3] # this is the multi-rank vector\n", "tens_S_rho_hat = funm_haar(tens_S.copy())\n", "for i in range(n):\n", " tens_S_rho_hat[rho[i]:,rho[i]:,i] = 0\n", "\n", "tens_S_rho = invm_haar(tens_S_rho_hat)\n", "A_rho = m_prod(m_prod(U, tens_S_rho, funm_haar, invm_haar), Vt, funm_haar, invm_haar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Let $\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T} \\in \\RR^{\\mpn}$, \n", "we will use $j_1,\\dots, j_{np}$ and $i_1,\\dots, i_{np}$ to denote the indexes of the non-zeros of $\\thS$ ordered in decreasing order. That is \n", "$$\\hsigma_{\\ell} := \\hsigma_{j_{\\ell}}^{(i_{\\ell})}$$\n", "\n", "where $\\hsigma_1 \\geq \\hsigma_2 \\geq \\dots \\geq \\hsigma_{np}$ .\n", "\n", "For $q = 1 , \\dots , p n$ , the **explicit rank-** $q$ **truncation** under $\\mm$ of a tensor $\\tA$, denoted by $\\tA_q = \\tA_{\\rrho}$ , where $\\tA_{\\rrho}$ is the tensor of multi-rank $\\rrho$ under $\\mm$ such that \n", "$$\\rrho_i = \\max \\{ j = 1, \\dots ,p ~|~ (j,i) \\in \\{(j_1, j_1), \\dots, (j_q, i_q)\\} \\} .$$ \n", "\n", "In words, we keep the $q$ top singular values of any frontal slice of $\\thS$, and zero out the rest. \n", "\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ".. note::\n", " We have that $\\tA^{(q)}, \\tA_{\\rrho}$ and $\\tA_{q}$ are the best t-rank $q$, multi-rank $\\rrho$ and explicit-rank $q$ (under $\\mm$) approximations of $\\tA$, respectively.\n", "\n", "\n", "\n", "\n", "# The effect of choosing different transforms \n", "\n", "To demonstrate how might the choice of $\\matM$ influence the resulting decomposition, we use the real-world time-series dataset obtained from a study on Pediatric Ulcerative Colitis (PUC) by Schirmer et al..\n", "\n", "First, we obtain the data table from our analysis GitHub repo, construct a tensor from the data and apply TSVDM with respect to both randomly sampled $\\matM$ and the DCT.\n", "\n", "Note that in `generate_haar` function call, we set the `random_state` parameter to an integer (123) just so that the results are reproducible." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape of S, by randomly sampled transform: (87, 4)\n", "shape of S, by DCT: (4, 4)\n" ] } ], "source": [ "import pandas as pd\n", "from mprod import table2tensor\n", "\n", "file_path = \"https://raw.githubusercontent.com/UriaMorP/\" \\\n", " \"tcam_analysis_notebooks/main/Schirmer2018/Schirmer2018.tsv\"\n", "\n", "data_raw = pd.read_csv(file_path, index_col=[0,1], sep=\"\\t\"\n", " , dtype={'Week':int})\n", "\n", "data_tensor, map1, map3 = table2tensor(data_raw)\n", "\n", "m,p,n = data_tensor.shape\n", "\n", "# Generate transforms according to the \n", "# relevant dimensions\n", "funm_haar, invm_haar = generate_haar(n,random_state=123)\n", "funm_dct, invm_dct = generate_dct(n)\n", "\n", "\n", "# Haar\n", "Uhaar, Shaar, Vhaar = svdm(data_tensor, funm_haar, invm_haar)\n", "print(\"shape of S, by randomly sampled transform:\", Shaar.shape)\n", "# DCT\n", "Udct, Sdct, Vdct = svdm(data_tensor, funm_dct, invm_dct)\n", "print(\"shape of S, by DCT:\", Sdct.shape)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case, we have that the t-rank of our data under the DCT domain transform is 4, and 87 under $\\mm$ where $\\matM$ is obtained from randomly sampling the Haar distribution. \n", "\n", "Even though it is not generally true that choosing $\\matM$ as DCT (the t-product) results in better compression, the fact that it does so for time-series data makes perfect sense; Since we assume that time-series data are samples of continuous functions, which, are easy to approximate well using very few DCT basis elements." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. footbibliography::" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "mprod", "language": "python", "name": "mprod" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }