mprod.table2tensor

mprod.table2tensor(table: pandas.core.frame.DataFrame, missing_flag: bool = False) → Tuple[numpy.ma.core.MaskedArray, Mapping, Mapping]

Reshapes an nm x p ((subjects x reps) x features) multi-indexed dataframe into an m x p x n tensor (subjects, features, reps).

Parameters

table: pd.DataFrame: an nm x p table of samples x features
missing_flag: `bool`, default = False: When False (default), the function raises an error if any samples are missing. When True, the function returns a tensor in which the missing entries are masked.

Returns

tensor: ndarray, np.ma.array: 3rd-order tensor, m x p x n (subjects, features, reps)
mode1_mapping: dict: Maps each subject name in the original table to the index of its mode-1 (horizontal) slice in the tensor
mode3_mapping: dict: Maps each rep id in the original table to the index of its mode-3 (frontal) slice in the tensor
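
The reshape described above can be illustrated with plain pandas and NumPy. This is only a minimal sketch of the idea on a complete table, not mprod's actual implementation; the toy data, the index name `SubjectID`, and all variable names are assumptions made for the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: m=2 subjects, n=3 reps, p=2 features.
subjects, reps, features = ["a", "b"], ["t1", "t2", "t3"], ["f1", "f2"]
index = pd.MultiIndex.from_product([subjects, reps], names=["SubjectID", "rep"])
values = np.arange(len(index) * len(features), dtype=float)
table = pd.DataFrame(values.reshape(len(index), len(features)),
                     index=index, columns=features)

# Sort so that rows are grouped by subject, then ordered by rep.
table = table.sort_index()
m, n, p = len(subjects), len(reps), len(features)

# (m*n) x p  ->  m x n x p, then swap the last two axes to get m x p x n.
tensor = table.to_numpy().reshape(m, n, p).transpose(0, 2, 1)

# Label -> slice index, in the sorted order used for the reshape.
mode1_mapping = {s: i for i, s in enumerate(table.index.get_level_values(0).unique())}
mode3_mapping = {r: k for k, r in enumerate(table.index.get_level_values(1).unique())}

print(tensor.shape)  # (subjects, features, reps)
```

With this layout, `tensor[i, j, k]` holds the value of feature `j` for subject `i` at rep `k`.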

Examples

Suppose that table_data is a dataframe with no missing values.

>>> from mprod import table2tensor
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(0)
>>> table_data.iloc[:5,:4]
                    f1        f2        f3        f4
SubjetID rep
a        t1   0.251259  0.744838  -0.45889 -0.208525
         t10   2.39831  0.248772   0.65873   1.36994
         t2  -0.303154 -0.337603 -0.568608   -1.0239
         t3    1.36369  0.978895  0.161972 -0.804368
         t4     1.8548   1.52954   0.78576  0.538041
>>> msk_tensor, mode1_mapping, mode3_mapping = table2tensor(table_data, missing_flag=False)
>>> msk_tensor[:3,:3,:2]
[[[0.25125853442243695 2.398308745102709]
  [0.7448378210349296 0.2487716728987871]
  [-0.4588901621837434 0.6587302072601999]]
 [[-0.5689263433318329 -0.06564253839123065]
  [1.0017636851038796 -0.49265853128383713]
  [0.45266517056628647 -1.4812390563653883]]
 [[0.7690616486878629 0.49302719962677855]
  [0.3186320585255899 1.469576084933633]
  [0.9609169837347897 -0.19564077520234632]]]
>>> mode1_mapping
{'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}
>>> mode3_mapping
{'t1': 0,
 't10': 1,
 't2': 2,
 't3': 3,
 't4': 4,
 't5': 5,
 't6': 6,
 't7': 7,
 't8': 8,
 't9': 9}
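
The two mappings are what connect labels to tensor positions. The sketch below shows one way they might be used. It does not call mprod: the small array and the dictionaries are hypothetical stand-ins for the returned values.

```python
import numpy as np

# Hypothetical stand-ins for the values returned by table2tensor.
tensor = np.arange(12, dtype=float).reshape(2, 2, 3)  # (subjects, features, reps)
mode1_mapping = {"a": 0, "b": 1}
mode3_mapping = {"t1": 0, "t10": 1, "t2": 2}

# Feature vector (length p) of subject "b" at rep "t10".
vec = tensor[mode1_mapping["b"], :, mode3_mapping["t10"]]

# Invert the mappings to translate slice indices back to labels.
idx_to_subject = {i: s for s, i in mode1_mapping.items()}
idx_to_rep = {k: r for r, k in mode3_mapping.items()}

print(vec, idx_to_subject[1], idx_to_rep[1])
```

In the output above, rep labels are ordered as strings, so 't10' comes before 't2'. Looking positions up through the mapping avoids relying on that order.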

When some samples are missing, set missing_flag=True to obtain a tensor with masked entries.

>>> msk_tensor, mode1_mapping, mode3_mapping = table2tensor(table_data.sample(40),
...                                                          missing_flag=True)
>>> msk_tensor[:3,:3,:2]
masked_array(
  data=[[[0.07664420134210018, --],
         [-0.7358062254334045, --],
         [0.5562074188402509, --]],
        [[2.088982483926928, -0.06564253839123065],
         [0.7697757466063808, -0.49265853128383713],
         [0.4147812728859107, -1.4812390563653883]],
        [[-0.004794963866429985, 1.2262908375944879],
         [-0.15033350807209261, -0.3068131758163276],
         [0.6461670563178799, 0.1769508046682527]]],
  mask=[[[False,  True],
         [False,  True],
         [False,  True]],
        [[False, False],
         [False, False],
         [False, False]],
        [[False, False],
         [False, False],
         [False, False]]], fill_value=0.0)
>>> mode1_mapping
{'a': 3, 'b': 1, 'c': 0, 'd': 4, 'e': 2}
>>> mode3_mapping
{'t1': 2,
 't10': 1,
 't2': 3,
 't3': 6,
 't4': 5,
 't5': 7,
 't6': 8,
 't7': 4,
 't8': 0,
 't9': 9}
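
To see where masked entries come from, the following sketch reproduces the effect with plain pandas and NumPy. It is an illustration under assumed toy data, not mprod's implementation: a missing (subject, rep) row is filled with NaN by reindexing to the full grid, and the NaN entries are then masked.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 2 subjects x 2 reps x 2 features.
subjects, reps, features = ["a", "b"], ["t1", "t2"], ["f1", "f2"]
full_index = pd.MultiIndex.from_product([subjects, reps], names=["SubjectID", "rep"])
table = pd.DataFrame(np.arange(8, dtype=float).reshape(4, 2),
                     index=full_index, columns=features)

# Keep rows 0, 2, 3: the sample (a, t2) is now missing.
partial = table.iloc[[0, 2, 3]]

# Reindex to the full (subject, rep) grid; the missing row becomes NaN.
filled = partial.reindex(full_index)

m, n, p = len(subjects), len(reps), len(features)
data = filled.to_numpy().reshape(m, n, p).transpose(0, 2, 1)  # m x p x n

# Mask every NaN entry, giving a numpy masked array.
masked = np.ma.masked_invalid(data)

print(masked.mask[0, :, 1])  # all features of subject "a" at rep "t2" are masked
```

Operations on a masked array, such as `masked.mean(axis=2)`, skip the masked entries rather than propagating NaN.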