conpagnon.computing package

Submodules

conpagnon.computing.compute_connectivity_matrices module

Created on Tue Sep 19 17:02:19 2017

ComPagnon 2.0

@author: Dhaif BEKHA

conpagnon.computing.compute_connectivity_matrices.create_connectivity_mask(time_series_dictionary, groupes)[source]

Create a boolean mask for each subject, accounting for discarded ROIs if they exist.

Parameters
  • time_series_dictionary (dict) –

    The time series dictionary, structured as follows:
    • The first level of keys is the different groups.

    • The second level of keys is the subject IDs.

    • The third level has two keys: ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points), and ‘discarded_rois’, containing an array of the indices of the ROIs whose label is ‘void’. If no void label is detected, the array is empty.

  • groupes (list) – The list of groups in the study.

Returns

output – The same time series dictionary with a field called ‘masked_array’ for each subject. The masked array is a boolean mask accounting for discarded ROIs: True for a discarded ROI, and False elsewhere.

Return type

dict

Notes

If ‘discarded_rois’ is empty, the mask is False at every position in the array. Almost all operations in ConPagnon account for discarded ROIs through masked arrays, using the dedicated NumPy module (numpy.ma).
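The masking convention described in these Notes can be sketched as follows. This is only an illustration, not ConPagnon's code: the helper name build_roi_mask is hypothetical, and the sketch assumes the mask covers the full row and column of each discarded ROI.

```python
import numpy as np


def build_roi_mask(n_regions, discarded_rois):
    """Return a (n_regions, n_regions) boolean mask, True on discarded ROIs.

    Hypothetical helper: assumes a discarded ROI masks its whole row and column.
    """
    mask = np.zeros((n_regions, n_regions), dtype=bool)
    discarded = np.asarray(discarded_rois, dtype=int)
    # With an empty index array, nothing is set and the mask stays False everywhere.
    mask[discarded, :] = True
    mask[:, discarded] = True
    return mask


# Four regions, ROI 1 discarded.
mask = build_roi_mask(4, [1])
matrix = np.arange(16, dtype=float).reshape(4, 4)
# numpy.ma pairs the values with the mask; masked entries are ignored by reductions.
masked_matrix = np.ma.array(matrix, mask=mask)
```

Reductions such as masked_matrix.mean() then skip every entry that involves the discarded ROI.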

conpagnon.computing.compute_connectivity_matrices.extract_sub_connectivity_matrices(subjects_connectivity_matrices, kinds, regions_index, vectorize=False, discard_diagonal=False)[source]

Extract sub-matrices given region indices.

Parameters
  • subjects_connectivity_matrices (dict) – The dictionary containing, for each subject, group and kind, the connectivity matrices of shape (n_features, n_features), along with the boolean mask array indicating the discarded ROIs.

  • kinds (list) – The list of kinds.

  • regions_index (list, or 1D numpy.array of shape (number of indices, )) – The indices of the regions whose connectivity you want to extract. The region indices also correspond to the indices of the ROIs in the 4D reference atlas of the study.

  • vectorize (bool, optional) – If True, the extracted sub-matrices and the corresponding boolean masks are vectorized, keeping the diagonal of the matrix.

  • discard_diagonal (bool, optional) – If True, the diagonal is discarded when the extracted connectivity matrices are vectorized.

Returns

output – A dictionary containing, for each group, kind and subject, the extracted sub-matrix and the corresponding boolean mask.

Return type

dict
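A minimal sketch of the extraction step on a single matrix, assuming the regions are selected on both rows and columns and that vectorization takes the lower triangle. The helper extract_sub_matrix is hypothetical and works on one array rather than on the nested subjects dictionary.

```python
import numpy as np


def extract_sub_matrix(matrix, regions_index, vectorize=False, discard_diagonal=False):
    """Keep only the rows and columns listed in regions_index (hypothetical helper)."""
    regions_index = np.asarray(regions_index, dtype=int)
    # np.ix_ builds the open mesh needed to select a square block.
    sub = matrix[np.ix_(regions_index, regions_index)]
    if not vectorize:
        return sub
    # k=0 keeps the diagonal, k=-1 drops it.
    k = -1 if discard_diagonal else 0
    rows, cols = np.tril_indices(sub.shape[0], k=k)
    return sub[rows, cols]


# Toy symmetric 5x5 "connectivity" matrix.
raw = np.arange(25, dtype=float).reshape(5, 5)
connectivity = (raw + raw.T) / 2

sub = extract_sub_matrix(connectivity, [0, 2, 4])
flat = extract_sub_matrix(connectivity, [0, 2, 4], vectorize=True, discard_diagonal=True)
```

The same indexing can be applied to the boolean mask so that values and mask stay aligned.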

conpagnon.computing.compute_connectivity_matrices.group_mean_connectivity(subjects_connectivity_matrices, kinds, axis=0)[source]
Compute the mean connectivity matrices for each kind, accounting for masked ROIs.

Read the Notes regarding the tangent space!

Parameters
  • subjects_connectivity_matrices (dict) –

    A multi-level dictionary organised as follows:
    • The first level of keys is the different groups in the study.

    • The second level of keys is the subject IDs.

    • The third level holds, for each subject, the matrices of the different kinds, a ‘discarded_rois’ key for the array of discarded ROI indices, and a ‘masked_array’ key containing a boolean array that is True at the discarded ROI indices and False elsewhere.

  • kinds (list) – List of the kinds for which you want the mean connectivity. Choices are ‘correlation’, ‘tangent’, ‘covariances’, ‘precision’, ‘partial correlation’. Of course, each kind must be present in the subjects_connectivity_matrices dictionary.

  • axis (int, optional) – The axis along which the mean is computed, i.e. the subjects axis. Default is 0.

Returns

output

A multi-level dictionary organised as follows:
  • The first level of keys is the different groups in the study.

  • The second level of keys is the mean connectivity matrices for the different kinds. They are arrays of shape (number of regions, number of regions) if vectorize is False, and of shape (n_columns * (n_columns + 1) / 2) otherwise.

Return type

dict

See also

individual_connectivity_matrices()

This function returns an organised dictionary.

Notes

When computing the mean, we account for the ‘discarded_rois’ entries: where the value is True in the masked_array, the corresponding ROIs of that subject are left out of the mean.

The mean computed in the tangent space is an arithmetic mean. This mean matrix lives in the tangent space, which is NOT the same space as correlation or partial correlation matrices, so be careful with the interpretation! Note also that the tangent space is defined at ONE point of the manifold of symmetric matrices. This point is the geometric mean of the POOLED groups if multiple groups are studied.
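The effect of the mask on the group mean can be illustrated with numpy.ma. This is a sketch on toy values, assuming the subjects' matrices and masks have been stacked along axis 0; it is not the function's actual implementation.

```python
import numpy as np

# Three subjects, two regions: shape (n_subjects, n_regions, n_regions).
matrices = np.array([
    [[1.0, 0.2], [0.2, 1.0]],
    [[1.0, 0.6], [0.6, 1.0]],
    [[1.0, 0.9], [0.9, 1.0]],
])

# The third subject has ROI 1 discarded: mask its row and column.
masks = np.zeros(matrices.shape, dtype=bool)
masks[2, 1, :] = True
masks[2, :, 1] = True

stacked = np.ma.array(matrices, mask=masks)

# Masked entries do not contribute: the (0, 1) edge is averaged over two subjects only.
group_mean = stacked.mean(axis=0)

# For comparison, the plain mean uses all three subjects.
plain_mean = matrices.mean(axis=0)
```

Here the masked mean of the (0, 1) edge is (0.2 + 0.6) / 2, while the plain mean also includes the 0.9 of the subject whose ROI was discarded.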

conpagnon.computing.compute_connectivity_matrices.individual_connectivity_matrices(time_series_dictionary, kinds, covariance_estimator, vectorize=False, discarding_diagonal=False, z_fisher_transform=False)[source]

Compute the connectivity matrices for groups of subjects

This function computes connectivity matrices for different metrics.

Parameters
  • time_series_dictionary (dict) –

    A multi-level dictionary organised as follows:
    • The first level of keys is the different groups in the study.

    • The second level of keys is the subject IDs.

    • The third level has two keys: ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points), and ‘discarded_rois’, containing an array of the indices of the ROIs whose label is ‘void’. If no void label is detected, the array should be empty.

  • kinds (list) – List of the metrics for which you want to compute the connectivity matrices. Choices are ‘tangent’, ‘correlation’, ‘partial correlation’, ‘covariance’, ‘precision’.

  • covariance_estimator (estimator object) – All the kinds are derived from covariance matrices. You need to specify the estimator, see Notes.

  • vectorize (bool, optional) – If True, the connectivity matrices are reshaped into 1D arrays holding the vectorized lower part of the matrices. Useful for classification, regression, etc. Default is False.

  • z_fisher_transform (bool, optional) – If True, the Fisher z-transform is applied to all the connectivity matrices. Default is False.

  • discarding_diagonal (bool, optional) – If True, the diagonal is discarded in the vectorization process. Default is False.

Returns

output

A multi-level dictionary organised as follows:
  • The first level of keys is the different groups in the study.

  • The second level of keys is the subject IDs.

  • The third level contains multiple keys: one key per kind, containing the corresponding connectivity matrix; a ‘discarded_rois’ key containing the indices of the discarded ROIs; and a ‘masked_array’ key containing a boolean array of shape (number of regions, number of regions) that is True at the indices listed in ‘discarded_rois’ and False elsewhere. See Notes for further details. The diagonal of each kind matrix is also saved in a dedicated field.

Return type

dict

Notes

Covariance estimators are the estimators implemented in the scikit-learn library. Several estimators can be found in the sklearn.covariance module; popular choices are the Ledoit-Wolf estimator and the OAS estimator.

In the output dictionary, each subject has a boolean masked array. The mask is used when computing the mean connectivity matrices, so that discarded ROIs are accounted for in the derivation. It is also useful for discarding those ROIs in any statistical test you might want to perform. A True value marks a masked ROI; the mask is False elsewhere.

For the tangent kind, the individual matrices must be derived on the POOLED GROUPS, which is what this function does.

The derivation of the connectivity matrices is based on Nilearn functions. Users are encouraged to read the docstring of nilearn.connectome.ConnectivityMeasure.

References

For the use of the tangent kind:

1

Varoquaux et al. “Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling”, MICCAI 2010
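The z_fisher_transform and vectorize options of this function can be pictured on a single subject. In the sketch below, np.corrcoef is a plain stand-in for the Nilearn and scikit-learn based estimation, and leaving the diagonal at zero after the transform is an assumption made for the example (arctanh(1) is infinite), not documented ConPagnon behaviour.

```python
import numpy as np

rng = np.random.default_rng(0)

# One subject: array of shape (number of regions, number of time points).
time_series = rng.standard_normal((4, 80))

# Rows are treated as variables, so the result is (n_regions, n_regions).
correlation = np.corrcoef(time_series)

# Fisher z-transform: z = arctanh(r), applied off the diagonal only.
off_diagonal = ~np.eye(correlation.shape[0], dtype=bool)
z_matrix = np.zeros_like(correlation)
z_matrix[off_diagonal] = np.arctanh(correlation[off_diagonal])

# Vectorized lower part without the diagonal: n * (n - 1) / 2 coefficients.
lower = np.tril_indices(correlation.shape[0], k=-1)
z_vector = z_matrix[lower]
```

With the diagonal kept (k=0), the vector would instead hold n * (n + 1) / 2 coefficients.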

conpagnon.computing.compute_connectivity_matrices.inter_network_subjects_connectivity_matrices(subjects_individual_matrices_dictionnary, groupes, kinds, atlas_file, sheetname, network_column_name, roi_indices_column_name)[source]

Compute, for each subject, the inter-network connectivity matrices.

Parameters
  • subjects_individual_matrices_dictionnary (dict) – The subjects connectivity matrices dictionary, for each group and kind in the study.

  • groupes (list) – The list of groups in the study.

  • kinds (list) – The list of kinds in the study.

  • atlas_file (string) – The full path to an Excel file containing information on the atlas.

  • network_column_name (string) – The name of the column in the Excel file containing the network label of each ROI.

  • roi_indices_column_name (string) – The name of the column in the Excel file containing the index of each ROI in the 4D atlas file.

  • sheetname (string) – The name of the active sheet in the Excel file.

Returns

output – The subjects inter-network connectivity matrices dictionary, for each group and kind. Each matrix should have shape (number of networks, number of networks).

Return type

dict

Notes

The inter-network connectivity is simply defined as the mean of all possible connections between a pair of networks. Possible discarded ROIs in one or both networks are accounted for by computing the mean on a masked array structure, i.e. the inter-network coefficients of interest along with the corresponding values of the masked array.
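The definition above can be sketched for one pair of networks. The helper inter_network_mean and the toy values are hypothetical, and the sketch assumes the boolean mask has the same shape as the connectivity matrix.

```python
import numpy as np


def inter_network_mean(matrix, mask, rois_a, rois_b):
    """Mean of all connections between two networks, skipping masked entries."""
    masked = np.ma.array(matrix, mask=mask)
    # Rows from network A, columns from network B: every possible A-B connection.
    block = masked[np.ix_(rois_a, rois_b)]
    return block.mean()


matrix = np.array([
    [1.0, 0.8, 0.2, 0.4],
    [0.8, 1.0, 0.6, 0.0],
    [0.2, 0.6, 1.0, 0.5],
    [0.4, 0.0, 0.5, 1.0],
])

# ROI 3 is discarded for this subject.
mask = np.zeros(matrix.shape, dtype=bool)
mask[3, :] = True
mask[:, 3] = True

network_a, network_b = [0, 1], [2, 3]
score = inter_network_mean(matrix, mask, network_a, network_b)
unmasked_score = inter_network_mean(matrix, np.zeros(matrix.shape, dtype=bool),
                                    network_a, network_b)
```

With ROI 3 masked, only the connections (0, 2) and (1, 2) enter the mean; without a mask, all four connections do.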

References

This inter-network composite score is used in the following references :

1

M. Brier, “Loss of Intranetwork and Internetwork Resting State Functional Connections with Alzheimer’s Disease Progression” The Journal of Neuroscience, 2012.

2

P. Wang, “Aberrant intra- and inter-network connectivity architectures in Alzheimer’s disease and mild cognitive impairment”, Nature Publishing Group, 2015.

conpagnon.computing.compute_connectivity_matrices.intra_network_functional_connectivity(subjects_individual_matrices_dictionnary, groupes, kinds, atlas_file, network_column_name, sheetname, roi_indices_column_name, color_of_network_column)[source]

Compute, for each subject, the intra-network connectivity of each network in the study.

Parameters
  • subjects_individual_matrices_dictionnary (dict) – The subjects connectivity dictionary containing the connectivity matrices and the corresponding mask arrays for discarded ROIs, for each group.

  • groupes (list) – The list of groups in the study.

  • kinds (list) – The list of kinds in the study.

  • atlas_file (string) – The full path to an Excel file containing information on the atlas.

  • network_column_name (string) – The name of the column in the Excel file containing the network label of each ROI.

  • roi_indices_column_name (string) – The name of the column in the Excel file containing the index of each ROI in the 4D atlas file.

  • sheetname (string) – The name of the active sheet in the Excel file.

  • color_of_network_column (string) – The name of the column containing, for each ROI, the corresponding network color.

Returns

  • output 1 (dict) – A dictionary containing, for each subject and each network: the network connectivity, the vectorized array of the network coefficients without the diagonal, the corresponding vectorized mask array accounting for discarded ROIs, the diagonal of the mask array, the masked array structure of the network, and finally the number of ROIs in the network.

  • output 2 (dict) – A network dictionary containing the information fetched from the atlas Excel file.

  • output 3 (list) – The list of network labels.

  • output 4 (array of shape (number of networks, 3)) – The array containing the colors in normalized RGB space, in the order of the network label list.

Notes

The intra-network connectivity is simply defined as the mean, for each network, of the coefficients belonging to that network. Because connectivity metrics are symmetric, only the vectorized part of the network connectivity matrix is used. Discarded ROIs are accounted for by computing the mean on a NumPy masked array structure, i.e. the vectorized array along with the vectorized boolean mask of the current network.
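As an illustration of this definition, the sketch below averages the lower-triangular coefficients of one network, without the diagonal, on a masked array. The helper name and the toy values are hypothetical.

```python
import numpy as np


def intra_network_mean(matrix, mask, rois):
    """Mean of the within-network coefficients, skipping masked entries."""
    block = np.ix_(rois, rois)
    sub_matrix, sub_mask = matrix[block], mask[block]
    # The matrix is symmetric, so the strict lower triangle holds each edge once.
    lower = np.tril_indices(len(rois), k=-1)
    vectorized = np.ma.array(sub_matrix[lower], mask=sub_mask[lower])
    return vectorized.mean()


matrix = np.array([
    [1.0, 0.8, 0.2, 0.4],
    [0.8, 1.0, 0.6, 0.0],
    [0.2, 0.6, 1.0, 0.5],
    [0.4, 0.0, 0.5, 1.0],
])

# ROI 2 is discarded for this subject.
mask = np.zeros(matrix.shape, dtype=bool)
mask[2, :] = True
mask[:, 2] = True

network = [0, 1, 2]
intra = intra_network_mean(matrix, mask, network)
```

Only the (1, 0) edge survives the mask here, so the intra-network score equals that single coefficient.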

References

This intra-network composite score is used in the following references:

1

M. Brier, “Loss of Intranetwork and Internetwork Resting State Functional Connections with Alzheimer’s Disease Progression” The Journal of Neuroscience, 2012.

2

P. Wang, “Aberrant intra- and inter-network connectivity architectures in Alzheimer’s disease and mild cognitive impairment”, Nature Publishing Group, 2015.

conpagnon.computing.compute_connectivity_matrices.mean_of_flatten_connectivity_matrices(subjects_individual_matrices_dictionary, groupes, kinds)[source]

Return the flat mean connectivity for each subject.

Parameters
  • subjects_individual_matrices_dictionary (dict) – The subjects connectivity dictionary containing the connectivity matrices and the corresponding mask arrays for discarded ROIs, for each group.

  • groupes (list) – The list of groups in the study.

  • kinds (list) – The list of kinds in the study.

Returns

output – A dictionary containing, for each subject in each group, a masked array structure holding the numerical array of the connectivity coefficients of interest along with the boolean mask accounting for discarded ROIs. A second key contains the mean of the extracted connectivity coefficients for each subject, also accounting for discarded ROIs if they exist.

Return type

dict

Notes

The subjects connectivity matrices must not be vectorized; their shape should be (n_features, n_features).

conpagnon.computing.compute_connectivity_matrices.pooled_groups_connectivity(time_series_dictionary, kinds, covariance_estimator, vectorize)[source]

Compute connectivity matrices of pooled groups.

This function simply stacks all the time series of each group into one pooled group and computes the connectivity matrices on the whole group. It is called when computing the tangent kind.

Parameters
  • time_series_dictionary (dict) –

    A multi-level dictionary organised as follows:
    • The first level of keys is the different groups in the study.

    • The second level of keys is the subject IDs.

    • The third level has two keys: ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points), and ‘discarded_rois’, containing an array of the indices of the ROIs whose label is ‘void’. If no void label is detected, the array should be empty.

  • kinds (list) – List of the metrics for which you want to compute the connectivity matrices. Choices are ‘tangent’, ‘correlation’, ‘partial correlation’, ‘covariance’, ‘precision’.

  • covariance_estimator (estimator object) – All the kinds are derived from covariance matrices. You need to specify the estimator, see Notes.

  • vectorize (bool) – If True, the connectivity matrices are reshaped into 1D arrays holding the vectorized lower part of the matrices. Useful for classification, regression, etc. The diagonal is discarded.

Returns

  • output 1 (dict) –

    A multi-level dictionary organised as follows:
    • The first level of keys is the different kinds.

    • The second level is simply an n-dimensional array of connectivity matrices, of shape (number of subjects, number of regions, number of regions) if vectorize is False, and of shape (n_columns * (n_columns + 1) / 2) otherwise.

  • output 2 (list) – The list of subject IDs, in the order of time series computation.

Notes

Covariance estimators are the estimators implemented in the scikit-learn library. Several estimators can be found in the sklearn.covariance module; popular choices are the Ledoit-Wolf estimator and the OAS estimator.
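The pooling step can be sketched as a walk over the nested dictionary. The group names, subject IDs and the fake_subject helper are hypothetical, and np.corrcoef stands in for the Nilearn-based estimation with a scikit-learn covariance estimator.

```python
import numpy as np

rng = np.random.default_rng(0)


def fake_subject(n_regions=4, n_timepoints=60):
    """Hypothetical subject entry following the documented third-level keys."""
    return {
        "time_series": rng.standard_normal((n_regions, n_timepoints)),
        "discarded_rois": np.array([], dtype=int),
    }


time_series_dictionary = {
    "patients": {"P01": fake_subject(), "P02": fake_subject()},
    "controls": {"C01": fake_subject()},
}

# Stack every subject of every group into one pooled list, remembering the order.
pooled_time_series, subject_ids = [], []
for group, subjects in time_series_dictionary.items():
    for subject_id, fields in subjects.items():
        pooled_time_series.append(fields["time_series"])
        subject_ids.append(subject_id)

# One array per kind, of shape (number of subjects, number of regions, number of regions).
pooled_matrices = {
    "correlation": np.array([np.corrcoef(ts) for ts in pooled_time_series]),
}
```

Returning subject_ids alongside the stacked array makes it possible to map each matrix back to its subject and group afterwards.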

conpagnon.computing.compute_connectivity_matrices.pooled_groups_tangent_mean(time_series_dictionary, covariance_estimator)[source]

Compute the geometric mean of the covariance matrices for the tangent kind.

The geometric mean is the point of the manifold of symmetric matrices at which the tangent space is defined. It therefore makes sense for the pooled groups only.

Parameters
  • time_series_dictionary (dict) –

    A multi-level dictionary organised as follows:
    • The first level of keys is the different groups in the study.

    • The second level of keys is the subject IDs.

    • The third level has two keys: ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points), and ‘discarded_rois’, containing an array of the indices of the ROIs whose label is ‘void’. If no void label is detected, the array should be empty.

  • covariance_estimator (estimator object) – All the kinds are derived from covariance matrices. You need to specify the estimator, see Notes.

Returns

output – The geometric mean of the pooled group.

Return type

numpy.array of shape (n_features, n_features)

Notes

Covariance estimators are the estimators implemented in the scikit-learn library. Several estimators can be found in the sklearn.covariance module; popular choices are the Ledoit-Wolf estimator and the OAS estimator. For now, discarded ROIs are not accounted for in the derivation of the geometric mean.
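To give an idea of what a geometric mean of covariance matrices is, here is a generic fixed-point sketch for symmetric positive definite matrices, followed by the projection of a matrix onto the tangent space at that mean. It is a textbook-style illustration written for this page, not the routine used by ConPagnon or Nilearn, and all function names are hypothetical.

```python
import numpy as np


def _apply_to_eigenvalues(matrix, function):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return (eigenvectors * function(eigenvalues)) @ eigenvectors.T


def geometric_mean(covariances, max_iter=50, tol=1e-10):
    """Iterative geometric mean of symmetric positive definite matrices."""
    covariances = np.asarray(covariances, dtype=float)
    mean = covariances.mean(axis=0)  # start from the arithmetic mean
    for _ in range(max_iter):
        sqrt = _apply_to_eigenvalues(mean, np.sqrt)
        inv_sqrt = _apply_to_eigenvalues(mean, lambda v: 1.0 / np.sqrt(v))
        # Average of the matrices seen from the current mean, in log coordinates.
        logs = [_apply_to_eigenvalues(inv_sqrt @ c @ inv_sqrt, np.log)
                for c in covariances]
        step = np.mean(logs, axis=0)
        mean = sqrt @ _apply_to_eigenvalues(step, np.exp) @ sqrt
        if np.linalg.norm(step) < tol:
            break
    return mean


def project_to_tangent(covariance, reference):
    """Matrix logarithm of the covariance whitened by the reference point."""
    inv_sqrt = _apply_to_eigenvalues(reference, lambda v: 1.0 / np.sqrt(v))
    return _apply_to_eigenvalues(inv_sqrt @ covariance @ inv_sqrt, np.log)


covariances = [np.diag([1.0, 4.0]), np.diag([4.0, 1.0])]
reference = geometric_mean(covariances)
tangent_of_reference = project_to_tangent(reference, reference)
```

For these two diagonal matrices the geometric mean is the element-wise geometric mean of the diagonals, and the reference point itself projects to the zero matrix, the origin of the tangent space.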

conpagnon.computing.compute_connectivity_matrices.subjects_mean_connectivity_(subjects_individual_matrices_dictionnary, connectivity_coefficient_position, kinds, groupes)[source]

Compute, for each subject, the mean connectivity over a set of connectivity coefficients of the subject's connectivity matrices.

Parameters
  • subjects_individual_matrices_dictionnary (dict) – The subjects connectivity dictionary containing the connectivity matrices and the corresponding mask arrays for discarded ROIs, for each group.

  • connectivity_coefficient_position (numpy.array of shape (number of rois, row_index, column_index)) – The array containing the positions, in the connectivity matrices, of the ROIs whose connectivity coefficients you want to extract.

  • kinds (list) – The list of kinds.

  • groupes (list) – The list of the two groups in the study.

Returns

output – A dictionary containing, for each subject in each group, a masked array structure holding the numerical array of the connectivity coefficients of interest along with the boolean mask accounting for discarded ROIs. A second key contains the mean of the extracted connectivity coefficients for each subject, also accounting for discarded ROIs if they exist.

Return type

dict

Notes

The subjects connectivity matrices must not be vectorized; their shape should be (n_features, n_features).

conpagnon.computing.compute_connectivity_matrices.tangent_space_projection(reference_group, group_to_project, bootstrap_number, bootstrap_size, output_directory, verif_null=True, correction_method='bonferroni', alpha=0.05, statistic='t')[source]

Project a group of time series onto the space tangent to a reference connectivity matrix. The reference matrix is derived from a reference time series group. Each edge of the projected matrices is then tested against the reference group through a one-sample t-test or a simple z-score. The null distribution of the edges is derived with the bootstrap. Please refer to the reference paper for a deeper understanding of the method.

Parameters
  • reference_group (numpy.array) – The stack of vectorized connectivity matrices, of shape (n_subjects, n_features). This is the reference point at which the tangent space is computed. All tangent connectivity matrices are derived with respect to this reference set of matrices.

  • group_to_project (numpy.array) – The stack of vectorized connectivity matrices, of shape (n_subjects, n_features). Each connectivity matrix is projected into the tangent space, at the reference matrix of the reference group.

  • bootstrap_number (int) – The null distribution of each edge is derived through bootstrapping of the reference set of matrices. Usually around 500.

  • bootstrap_size (int) – The size of the bootstrapped sample, i.e. the number of subjects to pick from the reference set.

  • output_directory (str) – The full path of the directory in which the results are saved.

  • verif_null (bool, optional) – If True, generate a report that picks a set of 20 random ROIs and plots the null distributions in those ROIs. True by default.

  • correction_method (str, optional) – The method used to correct for multiple comparisons. The default is the classical Bonferroni method. You can choose among all the corrections offered by the statsmodels library.

  • alpha (float, optional) – The type I error rate used for the multiple comparison correction. Set to 0.05 by default.

  • statistic (str, optional) – The statistic used, a one-sample t-test by default. The other choice is ‘z’, for a simple z-score.

Returns

output

A dictionary with the following fields:
  • null_distribution: The array containing the estimated null distribution for each edge.

  • p_values_corrected: The p-values, adjusted by the multiple comparison correction.

  • reference_group_tangent_mean: The reference matrix, derived from the mean of the matrices of the reference group. This is the point at which all matrices are projected.

  • reference_group_tangent_matrices: The connectivity matrices of the reference group in the tangent space.

  • group_to_project_tangent_matrices: The connectivity matrices of the projected group in the tangent space.

  • group_to_project_stats: The chosen statistic for each subject and each edge in the projected set of subjects.

Return type

dict

References

1

G. Varoquaux et al. “Detection of brain functional-connectivity difference in post-stroke patients using group-level covariance modeling”, MICCAI 2010
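The bootstrap and correction steps can be pictured with the rough sketch below. It is one possible reading of the description, not the function's actual procedure: the null distribution is built here by pooling resampled reference subjects, the statistic is a simple z-score, the p-values use a normal approximation, and the Bonferroni correction is written by hand instead of calling statsmodels. All data are simulated.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(0)

# Hypothetical tangent-space vectors of shape (n_subjects, n_features).
reference_tangent = rng.standard_normal((30, 6))
projected_tangent = rng.standard_normal((5, 6)) + 1.0

bootstrap_number, bootstrap_size, alpha = 200, 15, 0.05
n_reference, n_features = reference_tangent.shape

# Each bootstrap draw picks bootstrap_size reference subjects with replacement.
samples = []
for _ in range(bootstrap_number):
    picked = rng.integers(0, n_reference, size=bootstrap_size)
    samples.append(reference_tangent[picked])
null_distribution = np.concatenate(samples, axis=0)

# Simple z-score of each projected subject's edge against the null distribution.
null_mean = null_distribution.mean(axis=0)
null_std = null_distribution.std(axis=0, ddof=1)
z_scores = (projected_tangent - null_mean) / null_std

# Two-sided p-value under a normal approximation: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
two_sided_p = np.vectorize(lambda z: erfc(abs(z) / sqrt(2.0)))(z_scores)

# Bonferroni: multiply by the number of edges tested per subject, cap at 1.
p_values_corrected = np.minimum(two_sided_p * n_features, 1.0)
significant = p_values_corrected < alpha
```

A stricter or more powerful correction would replace only the last two lines; the resampling and the statistic are independent of that choice.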

conpagnon.computing.compute_connectivity_matrices.time_series_extraction(root_fmri_data_directory, groupes, subjects_id_data_path, reference_atlas, group_data, repetition_time, low_pass_filtering=None, high_pass_filtering=None, detrend_signal=True, standardize_signal=True, smooth_signal=None, resampling_target='data', memory_level=1, nilearn_cache_directory=None)[source]

Time series extraction for each subject on a common atlas.

This function extracts the time series of each subject according to predefined regions on one common atlas.

Parameters
  • root_fmri_data_directory (str) – The full path of a root directory containing one or more sub-directories where the functional images are stored.

  • groupes (list) – The list of groups of interest, i.e. the names of the sub-directories containing the functional images you want to study.

  • subjects_id_data_path (str) – The full path to the data file containing the subject IDs.

  • group_data (dict) –

    A multi-level dictionary structured as follows:
    • The first level of keys is the different groups to study.

    • The second level of keys is the subject IDs of all subjects in each group.

    • The third level of keys contains multiple fields: ‘functional_file’ contains the full path to the subject fMRI file, ‘atlas_file’ the full path to the subject atlas, ‘label_file’ the full path to the subject atlas labels, and ‘confounds_file’ the full path to the subject confounds file if it exists, or an empty list if not.

  • reference_atlas (str) – The full path to the reference atlas used to extract signals from regions.

  • repetition_time (float) – The repetition time in seconds, i.e. the time between two volumes in an fMRI image.

  • low_pass_filtering (float or None, optional) – The low-pass cut-off frequency, in Hz, for filtering the time series. Default is None.

  • high_pass_filtering (float or None, optional) – The high-pass cut-off frequency, in Hz, for filtering the time series. Default is None.

  • detrend_signal (bool, optional) – Detrend the time series by removing the first-order moment, i.e. removing the mean signal from each time series. Default is True.

  • standardize_signal (bool, optional) – Set the time series to unit variance. Default is True.

  • smooth_signal (float or None, optional) – The full width at half maximum, in millimeters, of a Gaussian spatial smoothing to apply to the time series. Default is None.

  • resampling_target (str) – The reference image to which the source image is resampled. Choices are: {“mask”, “maps”, “data”, None}. Default is ‘data’.

  • memory_level (int, optional) – Caching parameter of the functions. Default is 1.

  • nilearn_cache_directory (str or None) – The full path that will contain the folder used to cache the region extractions. If None, no caching is performed. Default is None.

Returns

output

A dictionary structured as follows:
  • The first level of keys is the different groups.

  • The second level of keys is the subject IDs.

  • The third level has the key ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points).

Return type

dict

See also

data_architecture.fetch_data()

Notes

The time series extraction is based on functions contained in the Nilearn package. Users are encouraged to consult the docstrings of the following functions for the detailed mechanism of signal extraction: nilearn.signal.clean, nilearn.input_data.NiftiMapsMasker.

The subject IDs file, whatever its format, should not contain any header. It should contain one column of IDs, one for each subject.

References

The official Nilearn documentation, hosted on GitHub: [1] http://nilearn.github.io/index.html
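What detrend_signal and standardize_signal are described as doing can be pictured with plain NumPy. This is only a stand-in on simulated data: the actual cleaning is delegated to Nilearn, and the helper center_and_scale is hypothetical.

```python
import numpy as np


def center_and_scale(time_series):
    """Remove the mean of each region's signal and scale it to unit variance.

    Expects an array of shape (number of regions, number of time points).
    """
    centered = time_series - time_series.mean(axis=1, keepdims=True)
    std = centered.std(axis=1, keepdims=True)
    # Guard against constant signals to avoid dividing by zero.
    std[std == 0] = 1.0
    return centered / std


rng = np.random.default_rng(0)
# Simulated signals with an arbitrary offset and scale.
raw_time_series = 5.0 + 3.0 * rng.standard_normal((4, 120))
cleaned = center_and_scale(raw_time_series)
```

After this step every region has zero mean and unit variance along the time axis, which puts all regions on a common scale before connectivity estimation.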

conpagnon.computing.compute_connectivity_matrices.time_series_extraction_with_individual_atlases(root_fmri_data_directory, groupes, subjects_id_data_path, group_data, repetition_time, low_pass_filtering=None, high_pass_filtering=None, detrend_signal=True, standardize_signal=True, smooth_signal=None, resampling_target='data', memory_level=1, nilearn_cache_directory=None)[source]

Time series extraction for each subject with an individual atlas.

This function extracts the time series of each subject according to predefined regions in individual atlases.

Parameters
  • root_fmri_data_directory (str) – The full path of a root directory containing one or more sub-directories where the functional images are stored.

  • groupes (list) – The list of groups of interest, i.e. the names of the sub-directories containing the functional images you want to study.

  • subjects_id_data_path (str) – The full path to the data file containing the subject IDs.

  • group_data (dict) –

    A multi-level dictionary structured as follows:
    • The first level of keys is the different groups to study.

    • The second level of keys is the subject IDs of all subjects in each group.

    • The third level of keys contains multiple fields: ‘functional_file’ contains the full path to the subject fMRI file, ‘atlas_file’ the full path to the subject atlas, ‘label_file’ the full path to the subject atlas labels, and ‘confounds_file’ the full path to the subject confounds file if it exists, or an empty list if not.

  • repetition_time (float) – The repetition time in seconds, i.e. the time between two volumes in an fMRI image.

  • low_pass_filtering (float or None, optional) – The low-pass cut-off frequency, in Hz, for filtering the time series. Default is None.

  • high_pass_filtering (float or None, optional) – The high-pass cut-off frequency, in Hz, for filtering the time series. Default is None.

  • detrend_signal (bool, optional) – Detrend the time series by removing the first-order moment, i.e. removing the mean signal from each time series. Default is True.

  • standardize_signal (bool, optional) – Set the time series to unit variance. Default is True.

  • smooth_signal (float or None, optional) – The full width at half maximum, in millimeters, of a Gaussian spatial smoothing to apply to the time series. Default is None.

  • resampling_target (str) – The reference image to which the source image is resampled. Choices are: {“mask”, “maps”, “data”, None}. Default is ‘data’.

  • memory_level (int, optional) – Caching parameter of the functions. Default is 1.

  • nilearn_cache_directory (str or None) – The full path that will contain the folder used to cache the region extractions. If None, no caching is performed. Default is None.

Returns

output

A dictionary structured as follows:
  • The first level of keys is the different groups.

  • The second level of keys is the subject IDs.

  • The third level has two keys: ‘time_series’, containing the subject time series in an array of shape (number of regions, number of time points), and ‘discarded_rois’, containing an array of the indices of the ROIs whose label is ‘void’. If no void label is detected, the array is empty.

Return type

dict

See also

data_architecture.fetch_data_with_individual_atlases()

Notes

The time series extraction is based on functions contained in the Nilearn package. Users are encouraged to consult the docstrings of the following functions for the detailed mechanism of signal extraction: nilearn.signal.clean, nilearn.input_data.NiftiMapsMasker.

The subject IDs file, whatever its format, should not contain any header. It should contain one column of IDs, one for each subject.

Remember that the discarded ROIs are defined by their labels, which must be declared as ‘void’ in the subject atlas label files.

References

The official Nilearn documentation, hosted on GitHub: [1] http://nilearn.github.io/index.html
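The ‘void’ convention mentioned in the Notes amounts to a simple lookup on the label list. In the sketch below the label names are invented and find_discarded_rois is a hypothetical helper, not a ConPagnon function.

```python
import numpy as np


def find_discarded_rois(labels):
    """Return the indices of the ROIs whose label is 'void'."""
    labels = np.asarray(labels, dtype=str)
    return np.where(labels == "void")[0]


# Hypothetical label list for one subject atlas.
subject_labels = ["L_IFG", "void", "R_IFG", "void"]
discarded_rois = find_discarded_rois(subject_labels)

# A subject without any 'void' label gets an empty index array.
no_discarded = find_discarded_rois(["L_IFG", "R_IFG"])
```

The resulting index array is what the ‘discarded_rois’ key is documented to hold for each subject.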

Module contents

Created on Tue Sep 19 17:10:13 2017

@author: db242421