{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ".. note::\n",
    "   The following content provides technical and mathematical background for the `mprod-package`. \n",
    "   Most users of downstream applications such as `TCAM` would probably like to skip this part\n",
    "\n",
    "$\\newcommand{\\mat}[1]{\\mathbf{#1}}$\n",
    "$\\newcommand{\\matM}{\\mat{M}}$\n",
    "$\\newcommand{\\matMt}{\\matM^{\\T}}$\n",
    "$\\newcommand{\\matMi}{\\matM^{-1}}$\n",
    "$\\newcommand{\\T}{\\mat{T}}$\n",
    "$\\newcommand{\\xx}{\\times}$\n",
    "$\\newcommand{\\mpn}{m \\xx p \\xx n}$\n",
    "$\\newcommand{\\pmn}{p \\xx m \\xx n}$\n",
    "$\\newcommand{\\tens}[1]{\\mathcal{#1}}$\n",
    "$\\newcommand{\\tA}{\\tens{A}}$\n",
    "$\\newcommand{\\tAt}{\\tA^{\\T}}$\n",
    "$\\newcommand{\\thA}{\\widehat{\\tA}}$\n",
    "$\\newcommand{\\thAt}{\\thA^{\\T}}$\n",
    "$\\newcommand{\\tC}{\\tens{C}}$\n",
    "$\\newcommand{\\tCt}{\\tC^{\\T}}$\n",
    "$\\newcommand{\\thC}{\\widehat{\\tC}}$\n",
    "$\\newcommand{\\thCt}{\\thC^{\\T}}$\n",
    "$\\newcommand{\\tB}{\\tens{B}}$\n",
    "$\\newcommand{\\tBt}{\\tB^{\\T}}$\n",
    "$\\newcommand{\\thB}{\\widehat{\\tB}}$\n",
    "$\\newcommand{\\thBt}{\\thB^{\\T}}$\n",
    "$\\newcommand{\\tsub}[1]{\\xx_{#1}}$\n",
    "$\\newcommand{\\tsM}{\\tsub{3}\\matM}$\n",
    "$\\newcommand{\\tsMinv}{\\tsub{3}\\matM^{-1}}$\n",
    "$\\newcommand{\\mm}{\\star_{\\scriptscriptstyle \\matM } }$\n",
    "$\\newcommand{\\RR}{\\mathbb{R}}$\n",
    "$\\newcommand{\\tI}{\\tens{I}}$\n",
    "$\\newcommand{\\thI}{\\widehat{\\tI}}$\n",
    "$\\newcommand{\\tE}{\\tens{E}}$\n",
    "$\\newcommand{\\tQ}{\\tens{Q}}$\n",
    "$\\newcommand{\\tQt}{\\tQ^{\\T}}$\n",
    "$\\newcommand{\\thQ}{\\widehat{\\tQ}}$\n",
    "$\\newcommand{\\thQt}{\\thQ^{\\T}}$\n",
    "$\\newcommand{\\tV}{\\tens{V}}$\n",
    "$\\newcommand{\\tVt}{\\tV^{\\T}}$\n",
    "$\\newcommand{\\thV}{\\widehat{\\tV}}$\n",
    "$\\newcommand{\\thVt}{\\thV^{\\T}}$\n",
    "$\\newcommand{\\tU}{\\tens{U}}$\n",
    "$\\newcommand{\\tUt}{\\tU^{\\T}}$\n",
    "$\\newcommand{\\thU}{\\widehat{\\tU}}$\n",
    "$\\newcommand{\\thUt}{\\thU^{\\T}}$\n",
    "$\\newcommand{\\tS}{\\tens{S}}$\n",
    "$\\newcommand{\\tSt}{\\tS^{\\T}}$\n",
    "$\\newcommand{\\thS}{\\widehat{\\tS}}$\n",
    "$\\newcommand{\\thSt}{\\thS^{\\T}}$\n",
    "$\\newcommand{\\hsigma}{\\hat{\\sigma}}$\n",
    "$\\newcommand{\\rnk}{\\operatorname{rank}}$\n",
    "$\\newcommand{\\rrho}{\\boldsymbol{\\rho}}$\n",
    "$\\newcommand{\\TNorm}[1]{\\|#1\\|_{2}}$\n",
    "$\\newcommand{\\FNorm}[1]{\\|#1\\|_{F}}$\n",
    "$\\newcommand{\\NNorm}[1]{\\|#1\\|_{*}}$\n",
    "$\\newcommand{\\FNormS}[1]{\\FNorm{#1}^2}$\n",
    "$\\newcommand{\\TNormS}[1]{\\TNorm{#1}^2}$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The main functionality of ``mprod-package`` is factorization of tensors, that is, expressing a tensor $\\tA \\in \\RR^{d_1 \\xx ... \\xx d_N}$ as a product of other, \"simpler\" tensors. \n",
    "For this aim, one must first obtain some notion of tensor-tensor multiplication.\n",
    "The \"M-product\" (denoted by $\\mm$ ), defined in <cite data-footcite=\"Kilmer\">Kilmer et al.</cite>,  refers to a \"family\" of tensor-tensor products, and provides the notion of multiplication which enables the factorization of tensors. \n",
    "Here, we briefly walk through the steps of $\\mm$-product's formal construction. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The M-product\n",
    "\n",
    "We begin with some definitions. <br>\n",
    "Let $\\matM$ be an $n\\xx n$ unitary matrix ($\\matM \\matMt = \\mat{I}_n = \\matMt \\matM$), and a tensor $\\tA \\in \\RR^{\\mpn}$. \n",
    "We define the **domain transform** specified by $\\matM$ as $\\thA := \\tA \\tsM$, where $\\tsM$ denotes the tensor-matrix multiplication of applying $\\matM$ to each of the tensor $n$ dimensional tube fibers ($\\tA_{i,j,:}$).\n",
    "\n",
    "A practical demonstration using `scipy` and `numpy` libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.stats import ortho_group # used for sampling random unitary matrices \n",
    "                                    # from the Haar distribution\n",
    "\n",
    "m, p, n = 10, 5, 8\n",
    "\n",
    "A = np.random.randn(m, p, n) # generate a random tensor\n",
    "M = ortho_group.rvs(n)       # random sample unitary M\n",
    "\n",
    "A_hat = np.zeros_like(A)\n",
    "for i in range(m):\n",
    "    for j in range(p):\n",
    "        A_hat[i,j,:] = M @ A[i,j,:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ".. attention::\n",
    "   The tensor-matrix product implementation is much more efficient than the above for loop\n",
    "\n",
    "\n",
    "\n",
    "The **transpose** of a real $\\mpn$ tensor $\\tA$ with respect to $\\matM$, denoted by $\\tA^{\\T}$, is a $\\pmn$ tensor for which \n",
    "$$[\\widehat{\\tA^{\\T}}]_{:,:,i} = [\\thA^{\\T}]_{:,:,i} = {[\\thA]_{:,:,i}}^{\\T}$$\n",
    "\n",
    "Given two tensors $\\tA \\in \\RR^{\\mpn}$ and  $\\tB \\in \\RR^{p \\xx r \\xx n}$ , the facewise tensor-tensor product of $\\tA$ and $\\tB$, denoted by $\\tA \\vartriangle \\tB$ ,  is the $m \\xx r \\xx n$ tensor for which \n",
    "$$[\\tA \\vartriangle \\tB]_{:,:,i} = \\tA_{:,:,i} \\tB_{:,:,i}$$ \n",
    "\n",
    "The $\\mm$ **-product** of $\\tA \\in \\RR^{\\mpn}$ and  $\\tB \\in \\RR^{p \\xx r \\xx n}$ is defined by \n",
    "$$\\tA \\mm \\tB := (\\thA \\vartriangle \\thB) \\tsMinv \\in \\RR^{m \\xx r \\xx n}$$ \n",
    "\n",
    "\n",
    "The `mprod-package` offers utility functions like `m_prod` implementing $\\mm$ as well as random and spectral analysis based generators of unitary transforms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "129.30020497750468\n"
     ]
    }
   ],
   "source": [
    "from mprod import  m_prod\n",
    "from mprod import  generate_haar, generate_dct\n",
    "\n",
    "funm_haar, invm_haar = generate_haar(n) # Utility wrapper arround \n",
    "                                        #  scipy.stats.ortho_group \n",
    "funm_dct, invm_dct = generate_dct(n)    # Generates dct and idct transforms using scipy's\n",
    "                                        #  fft module. the default dct type is 2\n",
    "\n",
    "# generate random tensor B    \n",
    "r = 15\n",
    "B = np.random.randn(p,r,n)\n",
    "\n",
    "# Multiply A and B with respect to a randomly sampled M\n",
    "C_haar = m_prod(A,B,funm_haar, invm_haar)\n",
    "\n",
    "# Multiply A and B with respect to M = dct\n",
    "C_dct = m_prod(A,B,funm_dct, invm_dct)\n",
    "\n",
    "print(np.linalg.norm(C_haar - C_dct))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As shown above, given two distinct transforms ${\\matM}_1, {\\matM}_2$ , we have that $\\tA \\star_{\\scriptstyle \\matM_1} \\tB$ and $\\tA \\star_{\\scriptstyle \\matM_2} \\tB$ are not equal in general.\n",
    "This fact, as we will see, provides high flexibility when applying $\\mm$ based dimensionality reduction schemes.\n",
    "\n",
    "Two tensors $\\tA, \\tB \\in \\RR^{1 \\xx m \\xx n}$ are called $\\mm$ **-orthogonal slices** if $\\tA^{\\T} \\mm \\tB = \\mathbf{0}$,  where $\\mathbf{0} \\in \\RR^{1\\xx 1 \\xx n}$ is the zero tube fiber, while $\\tQ \\in \\RR^{m \\xx m \\xx n}$ is called $\\mm$ **-unitary** if $\\tQ^{\\T} \\mm \\tQ = \\tI = \\tQ \\mm \\tQ^{\\T}$ .\n",
    "<br>\n",
    "A tensor $\\tB \\in \\RR^{p \\xx k \\xx n}$ is said to be a pseudo $\\mm$ -unitary tensor (or  pseudo $\\mm$-orthogonal) if $\\tB^{\\T} \\mm \\tB$ is f-diagonal (i.e., all frontal slices are diagonal), and all frontal slices of $(\\tB^{\\T} \\mm \\tB) \\tsM$ are diagonal matrices with entries that are either ones or zeros.\n",
    "\n",
    "\n",
    "# TSVDM\n",
    "\n",
    "Let $\\tA \\in \\RR^{\\mpn}$  be a real tensor, then is possible to write the  full **tubal singular value decomposition** of $\\tA$  as \n",
    "$$\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T}$$ \n",
    "\n",
    "where $\\tU, \\tV$ are $(m \\xx m \\xx n)$ and $(p \\xx p \\xx n)$ , $\\mm$-unitary tensors respectively, and $\\tS \\in \\RR^{\\mpn}$ is an **f-diagonal** tensor, that is, a tensor whose frontal slices ( $\\tS_{:,:,i}$ ) are matrices with zeros outside their main diagonal.<br>\n",
    "\n",
    "We use the notation $\\hsigma_{j}^{(i)}$ do denote the $j^{th}$ largest singular value on the $i^{th}$ lateral face of $\\thS$: \n",
    "$$\\hsigma_{j}^{(i)} := \\thS_{j,j,i}$$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "U: 10x5x8\n",
      "S: 5x8\n",
      "V: 5x5x8 \n",
      "\n",
      "||A - USV'||^2 = 5.159366909775574e-27\n"
     ]
    }
   ],
   "source": [
    "from mprod.decompositions import svdm\n",
    "from mprod import tensor_mtranspose\n",
    "\n",
    "U,S,V = svdm(A, funm_haar, invm_haar)\n",
    "\n",
    "print(\"U:\", \"x\".join(map(str, U.shape)))\n",
    "print(\"S:\", \"x\".join(map(str, S.shape)))\n",
    "print(\"V:\", \"x\".join(map(str, V.shape)),\"\\n\")\n",
    "\n",
    "# Note that for practical reasons, S is stored in a lean datastructure\n",
    "# To obtain the \"tensorial\" representation of S, we do as follows\n",
    "tens_S = np.zeros((p,p,n))\n",
    "for i in range(n):\n",
    "    tens_S[:S.shape[0],:S.shape[0],i] = np.diag(S[:,i])\n",
    "\n",
    "\n",
    "# reconstruct the tensor\n",
    "Vt = tensor_mtranspose(V,funm_haar, invm_haar)\n",
    "US = m_prod(U, tens_S, funm_haar, invm_haar)\n",
    "USVt = m_prod(US, Vt, funm_haar, invm_haar)\n",
    "\n",
    "print(\"||A - USV'||^2 =\",np.linalg.norm(A - USVt)**2) # practically 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tensor ranks and truncations\n",
    "\n",
    "* The **t-rank** of $\\tA$ is the number of nonezero tubes of $\\tS$: \n",
    "$$\n",
    "r = | \\left\\{ i = 1, \\dots, n ~;~ \\FNormS{\\tS_{i,i,:}} > 0 \\right\\} |\n",
    "$$\n",
    "\n",
    "$\\tA^{(q)} = \\tU_{:,1:q, :} \\mm \\tS_{1:q,1:q,:} \\mm {\\tV_{:,1:q,:}}^{\\T}$ denotes the t-rank $q$ truncation of $\\tA$ under $\\mm$\n",
    "    \n",
    "* The **multi-rank** of $\\tA$ under $\\mm$, denoted by the vector $\\rrho \\in \\mathbb{N}^{n}$ whose $i^{th}$ entry is \n",
    "$$\n",
    "\\rrho_i = \\rnk (\\thA_{:,:,i})\n",
    "$$\n",
    "\n",
    "The multi-rank $\\rrho$ truncation of $\\tA$ under $\\mm$ is given by the tensor $\\tA_{\\rrho}$ for which \n",
    "$$\n",
    "\\widehat{\\tA_{\\rrho}}_{:,:,i} = \\thU_{:,1:\\rrho_i, i}  \\thS_{1:\\rrho_i,1:\\rrho_i,i}  {\\thV_{:,1:\\rrho_i,i}}^{\\T}\n",
    "$$ \n",
    "\n",
    "* The **implicit rank** under $\\mm$ of a tensor $\\tA$ with multi-rank $\\rrho$ under $\\mm$ is \n",
    "$$\n",
    "r = \\sum_{i=1}^{n} \\rrho_i\n",
    "$$\n",
    "\n",
    "Note that for t-rank truncation the $\\tU$ and $\\tV$ factors are $\\mm$-orthogonal, while for multi-rank truncation they are only pseudo $\\mm$-orthogonal."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# t-rank 4 trunctation \n",
    "q = 4\n",
    "tens_S_t_hat = funm_haar(tens_S.copy())\n",
    "tens_S_t_hat[q:,q:,:] = 0\n",
    "tens_S_t = invm_haar(tens_S_t_hat)\n",
    "A4 = m_prod(m_prod(U, tens_S_t, funm_haar, invm_haar), Vt, funm_haar, invm_haar)\n",
    "\n",
    "\n",
    "# multi-rank rho trunctation \n",
    "rho = [1,3,2,2,3,1,4,3] # this is the multi-rank vector\n",
    "tens_S_rho_hat = funm_haar(tens_S.copy())\n",
    "for i in range(n):\n",
    "    tens_S_rho_hat[rho[i]:,rho[i]:,i] = 0\n",
    "\n",
    "tens_S_rho = invm_haar(tens_S_rho_hat)\n",
    "A_rho = m_prod(m_prod(U, tens_S_rho, funm_haar, invm_haar), Vt, funm_haar, invm_haar)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "Let $\\tA = \\tU \\mm \\tS \\mm \\tV^{\\T} \\in \\RR^{\\mpn}$, \n",
    "we will use $j_1,\\dots, j_{np}$ and $i_1,\\dots, i_{np}$ to denote the indexes of the non-zeros of  $\\thS$ ordered in decreasing order. That is \n",
    "$$\\hsigma_{\\ell} := \\hsigma_{j_{\\ell}}^{(i_{\\ell})}$$\n",
    "\n",
    "where $\\hsigma_1 \\geq \\hsigma_2 \\geq \\dots \\geq \\hsigma_{np}$ .\n",
    "\n",
    "For $q = 1 , \\dots , p n$ , the **explicit rank-** $q$ **truncation** under $\\mm$  of a tensor $\\tA$, denoted by $\\tA_q = \\tA_{\\rrho}$ , where $\\tA_{\\rrho}$ is the tensor of multi-rank $\\rrho$ under $\\mm$ such that \n",
    "$$\\rrho_i = \\max \\{ j = 1, \\dots ,p ~|~ (j,i) \\in \\{(j_1, j_1), \\dots, (j_q, i_q)\\} \\} .$$ \n",
    "\n",
    "In words, we keep the $q$ top singular values of any frontal slice of $\\thS$, and zero out the rest. \n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ".. note::\n",
    "   We have that $\\tA^{(q)}, \\tA_{\\rrho}$ and $\\tA_{q}$ are the best t-rank $q$, multi-rank $\\rrho$ and explicit-rank $q$ (under $\\mm$) approximations of $\\tA$, respectively.\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "# The effect of choosing different transforms \n",
    "\n",
    "To demonstrate how might the choice of $\\matM$ influence the resulting decomposition, we use the real-world time-series dataset obtained from a study on Pediatric Ulcerative Colitis (PUC) by <cite data-footcite=\"Schirmer2018\">Schirmer et al.</cite>.\n",
    "\n",
    "First, we obtain the data table from our analysis GitHub repo, construct a tensor from the data and apply TSVDM with respect to both randomly sampled $\\matM$ and the DCT.\n",
    "\n",
    "Note that in `generate_haar` function call, we set the `random_state` parameter to an integer (123) just so that the results are reproducible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "shape of S, by randomly sampled transform: (87, 4)\n",
      "shape of S, by DCT: (4, 4)\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "from mprod import table2tensor\n",
    "\n",
    "file_path = \"https://raw.githubusercontent.com/UriaMorP/\" \\\n",
    "            \"tcam_analysis_notebooks/main/Schirmer2018/Schirmer2018.tsv\"\n",
    "\n",
    "data_raw = pd.read_csv(file_path, index_col=[0,1], sep=\"\\t\"\n",
    "                       , dtype={'Week':int})\n",
    "\n",
    "data_tensor, map1, map3 =  table2tensor(data_raw)\n",
    "\n",
    "m,p,n = data_tensor.shape\n",
    "\n",
    "# Generate transforms according to the \n",
    "# relevant dimensions\n",
    "funm_haar, invm_haar = generate_haar(n,random_state=123)\n",
    "funm_dct, invm_dct = generate_dct(n)\n",
    "\n",
    "\n",
    "# Haar\n",
    "Uhaar, Shaar, Vhaar = svdm(data_tensor, funm_haar, invm_haar)\n",
    "print(\"shape of S, by randomly sampled transform:\", Shaar.shape)\n",
    "# DCT\n",
    "Udct, Sdct, Vdct = svdm(data_tensor, funm_dct, invm_dct)\n",
    "print(\"shape of S, by DCT:\", Sdct.shape)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, we have that the t-rank of our data under the DCT domain transform is 4, and 87 under $\\mm$ where $\\matM$ is obtained from randomly sampling the Haar distribution. \n",
    "\n",
    "Even though it is not generally true that choosing $\\matM$ as DCT (the t-product) results in better compression, the fact that it does so for time-series data makes perfect sense; Since we assume that time-series data are samples of continuous functions, which, are easy to approximate well using very few DCT basis elements."
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. footbibliography::"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "mprod",
   "language": "python",
   "name": "mprod"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}