mprod.dimensionality_reduction.TCAM

class mprod.dimensionality_reduction.TCAM(fun_m: Optional[Callable[numpy.ndarray, numpy.ndarray]] = None, inv_m: Optional[Callable[numpy.ndarray, numpy.ndarray]] = None, n_components=None)[source]

tSVDM-based tensor component analysis (TCAM). Linear dimensionality reduction that uses the tensor singular value decomposition (tSVDM) of the data to project it onto a lower-dimensional space. The input data is centered, but not scaled, for each feature before the tSVDM is applied (using mprod.MeanDeviationForm). It uses the mprod.decompositions.svdm function as the basis for the TSVDMII algorithm from Kilmer et al. (https://doi.org/10.1073/pnas.2015851118), then offers a CP-like transformation of the data accordingly. See https://arxiv.org/abs/2111.14159 for theoretical results and case studies, and the Tutorials for detailed examples.
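The centering and slice-wise decomposition described above can be illustrated with NumPy alone. This is a minimal sketch, not the library's implementation. It assumes that centering means subtracting each feature's mean over the sample axis, and it uses a random orthogonal matrix M as a stand-in for the invertible mode-3 transform (presumably what fun_m and inv_m supply). The helper mode3_product is hypothetical.

```python
import numpy as np

def mode3_product(A, M):
    """Multiply every tube A[i, j, :] by the matrix M (hypothetical helper)."""
    return np.einsum("ijk,lk->ijl", A, M)

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 20, 4))      # (m_samples, p_features, n_modes)

# Centre each feature over the samples; no scaling is applied.
X_centered = X - X.mean(axis=0, keepdims=True)

# Assumed transform: a random orthogonal matrix taken from a QR factorisation.
M, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# Move to the transform domain and take an SVD of each frontal slice.
X_hat = mode3_product(X_centered, M)
slices = np.moveaxis(X_hat, 2, 0)         # (n_modes, m_samples, p_features)
U_hat, S_hat, Vt_hat = np.linalg.svd(slices, full_matrices=False)

# Rebuild the slices, then map back with the inverse transform (M.T here).
rebuilt = np.moveaxis((U_hat * S_hat[:, None, :]) @ Vt_hat, 0, 2)
X_rebuilt = mode3_product(rebuilt, M.T)

print(np.allclose(X_rebuilt, X_centered))
```

Because M is orthogonal, its transpose undoes the transform, so the rebuilt tensor matches the centered input up to floating-point error.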

Parameters
n_components: int, float, default=None

Number of components to keep. If n_components is not set, all components are kept:

n_components == min(m_samples, p_features) * n_reps - 1

If 0 < n_components < 1, select the number of components such that the amount of explained variance is greater than the fraction specified by n_components. If n_components >= 1 is an integer, the estimated number of components will be:

n_components_ == min(n_components, min(m_samples, p_features) * n_reps - 1)
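The three cases above can be written out as a small function. This is a sketch of the documented rule only: resolve_n_components is a hypothetical helper and not part of mprod. It assumes that n_reps is the size of the third axis of the data, and that the fractional case picks the first count whose cumulative ratio strictly exceeds the threshold.

```python
import numpy as np

def resolve_n_components(n_components, shape, variance_ratios=None):
    """Hypothetical helper mirroring the documented rule for n_components."""
    m_samples, p_features, n_reps = shape
    upper = min(m_samples, p_features) * n_reps - 1
    if n_components is None:
        return upper
    if 0 < n_components < 1:
        # First count whose cumulative ratio strictly exceeds the threshold.
        cumulative = np.cumsum(np.sort(variance_ratios)[::-1])
        k = int(np.searchsorted(cumulative, n_components, side="right")) + 1
        return min(k, upper)
    return min(int(n_components), upper)

shape = (10, 20, 4)
ratios = np.array([0.5, 0.3, 0.15, 0.05])   # made-up ratios for illustration

print(resolve_n_components(None, shape))         # 39
print(resolve_n_components(5, shape))            # 5
print(resolve_n_components(100, shape))          # 39
print(resolve_n_components(0.9, shape, ratios))  # 3
```

For a (10, 20, 4) input the upper bound is min(10, 20) * 4 - 1 = 39, so any larger integer request is capped at 39.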
Attributes
n_components_: int

The estimated number of components. When n_components is set to a number between 0 and 1, this number is estimated from the input data. Otherwise it equals the parameter n_components, or min(m_samples, p_features) * n_reps - 1 if n_components is None.

explained_variance_ratio_: ndarray of shape (n_components_,)

The amount of variance explained by each of the selected components.

mode2_loadings: ndarray (float) of shape (n_components_, n_features)

The weights driving the variation in each of the obtained factors with respect to each feature.
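One way to picture explained_variance_ratio_ is as each component's share of the total squared norm of the centered data. The sketch below rests on assumptions rather than on the library's code: it uses an orthogonal mode-3 transform M (a random stand-in), pools the singular values of all frontal slices, and defines each ratio as a squared singular value divided by the total.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 5, 3))        # (m_samples, p_features, n_modes)
X_centered = X - X.mean(axis=0, keepdims=True)

# Assumed orthogonal mode-3 transform, so squared norms are preserved.
M, _ = np.linalg.qr(rng.standard_normal((3, 3)))
X_hat = np.einsum("ijk,lk->ijl", X_centered, M)

# Singular values of every frontal slice, pooled and sorted in descending order.
S_hat = np.linalg.svd(np.moveaxis(X_hat, 2, 0), compute_uv=False)
energy = np.sort(S_hat.ravel() ** 2)[::-1]

# Assumed definition: each component's share of the total squared norm.
ratios = energy / energy.sum()

print(np.isclose(energy.sum(), np.sum(X_centered ** 2)))
print(np.isclose(ratios.sum(), 1.0))
```

With an orthogonal transform the pooled squared singular values add up to the squared Frobenius norm of the centered tensor, which is why the ratios sum to one.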

Methods

fit:

Compute the TCAM transformation for a given dataset

transform:

Transform a given dataset using a fitted TCAM

fit_transform:

Fit a TCAM to a dataset then return its TCAM transformation

inverse_transform:

Given points in the reduced TCAM space, compute their pre-image in the original feature space.

fit(X, y=None, **fit_params)[source]

Fit the model with X.

Parameters
X: array-like of shape (m_samples, p_features, n_modes)

Training data, where m_samples is the number of samples, p_features is the number of features and n_modes is the number of modes (time points, locations, etc.).

y: Ignored

Ignored.

Returns
self: object

Returns the instance itself.

Examples

>>> from mprod.dimensionality_reduction import TCAM
>>> import numpy as np
>>> X = np.random.randn(10, 20, 4)  # (m_samples, p_features, n_modes)
>>> tca = TCAM()
>>> tca = tca.fit(X)  # fit returns the fitted instance itself
fit_transform(X: numpy.ndarray, y=None, **fit_params)[source]

Fit the model with X and apply the dimensionality reduction on X.

Parameters
X: array-like of shape (m_samples, p_features, n_modes)

Training data, where m_samples is the number of samples, p_features is the number of features and n_modes is the number of modes (time points, locations, etc.).

y: Ignored

Ignored.

Returns
X_new: ndarray of shape (m_samples, n_components_)

Transformed values.

inverse_transform(Y: numpy.ndarray)[source]

Invert TCAM scores back to the original feature space.

Parameters
Y: np.ndarray

2D array with shape (k, n_components_)

Returns
Y_inv: np.ndarray

Third-order tensor that is the inverse transform of Y to the original feature space.
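To make the idea of a pre-image concrete, the sketch below maps score rows back to a third-order tensor using NumPy only. It is an illustration under stated assumptions, not the library's code: the mode-3 transform is a random orthogonal matrix M, components are the (slice, index) pairs ranked by singular value, each score column is a left singular vector scaled by its singular value, and the training mean is added back at the end. The names mode3, fit_scores and preimage are hypothetical.

```python
import numpy as np

def mode3(A, M):
    """Apply the matrix M along the third axis of A (hypothetical helper)."""
    return np.einsum("ijk,lk->ijl", A, M)

def fit_scores(X, M, q):
    """Hypothetical fit: scores for the q strongest (slice, index) pairs."""
    mean = X.mean(axis=0, keepdims=True)
    U, S, Vt = np.linalg.svd(np.moveaxis(mode3(X - mean, M), 2, 0),
                             full_matrices=False)
    order = np.argsort(S.ravel())[::-1][:q]
    slice_idx, comp_idx = np.unravel_index(order, S.shape)
    Y = U[slice_idx, :, comp_idx].T * S[slice_idx, comp_idx]   # (m_samples, q)
    V_kept = Vt[slice_idx, comp_idx, :]                        # (q, p_features)
    return Y, mean, slice_idx, V_kept

def preimage(Y, M, mean, slice_idx, V_kept):
    """Map rows of Y, shape (k, q), back to a (k, p_features, n_modes) tensor."""
    X_hat = np.zeros((Y.shape[0], V_kept.shape[1], M.shape[0]))
    for col, k in enumerate(slice_idx):
        # Each score column contributes a rank-one term to its own slice.
        X_hat[:, :, k] += np.outer(Y[:, col], V_kept[col])
    return mode3(X_hat, M.T) + mean

rng = np.random.default_rng(2)
X = rng.standard_normal((8, 5, 3))
M, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Keeping every pair (3 slices x 5 singular values) makes the round trip exact.
Y_full, mean, slice_idx, V_kept = fit_scores(X, M, q=15)
X_back = preimage(Y_full, M, mean, slice_idx, V_kept)
print(np.allclose(X_back, X))

# Keeping fewer pairs gives an approximation with the same tensor shape.
Y_small, mean, slice_idx, V_kept = fit_scores(X, M, q=4)
print(preimage(Y_small, M, mean, slice_idx, V_kept).shape)
```

The input to preimage can have any number of rows k, which matches the documented (k, n_components_) shape of Y.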

property mode2_loadings

The weights driving the variation in each of the obtained factors with respect to each feature

transform(X)[source]

Apply mode-1 dimensionality reduction to X.

X is projected on the first mode-1 tensor components previously extracted from a training set.

Parameters
X: array-like of shape (m_samples, p_features, n_modes)

Data to transform, where m_samples is the number of samples, p_features is the number of features and n_modes is the number of modes (time points, locations, etc.).

Returns
X_new: array-like of shape (m_samples, n_components_)

Projection of X onto the first principal components, where m_samples is the number of samples and n_components_ is the number of components.
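Projecting data onto previously extracted components can be sketched as follows. This is an illustration under assumptions, not the library's implementation: M is a random orthogonal stand-in for the mode-3 transform, new data is centered with the training mean, and each component is a right singular vector of one transform-domain slice. The helpers fit_components and project are hypothetical.

```python
import numpy as np

def fit_components(X, M, q):
    """Hypothetical fit step: keep the q strongest (slice, index) pairs."""
    mean = X.mean(axis=0, keepdims=True)
    X_hat = np.einsum("ijk,lk->ijl", X - mean, M)
    _, S, Vt = np.linalg.svd(np.moveaxis(X_hat, 2, 0), full_matrices=False)
    order = np.argsort(S.ravel())[::-1][:q]
    slice_idx, comp_idx = np.unravel_index(order, S.shape)
    # One right singular vector per kept pair, shape (q, p_features).
    V_kept = Vt[slice_idx, comp_idx, :]
    return mean, slice_idx, V_kept

def project(X_new, M, mean, slice_idx, V_kept):
    """Centre with the training mean, transform, and project slice by slice."""
    X_hat = np.einsum("ijk,lk->ijl", X_new - mean, M)
    # For each kept pair, take its slice and project onto its singular vector.
    return np.einsum("ipq,qp->iq", X_hat[:, :, slice_idx], V_kept)

rng = np.random.default_rng(3)
X_train = rng.standard_normal((10, 6, 4))
X_new = rng.standard_normal((3, 6, 4))
M, _ = np.linalg.qr(rng.standard_normal((4, 4)))

mean, slice_idx, V_kept = fit_components(X_train, M, q=5)
scores_train = project(X_train, M, mean, slice_idx, V_kept)
scores_new = project(X_new, M, mean, slice_idx, V_kept)

print(scores_train.shape, scores_new.shape)
```

In this sketch the columns of the training scores have norms equal to the kept singular values, so they come out ordered from strongest to weakest component.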