conpagnon.data_handling package¶

Submodules¶

conpagnon.data_handling.atlas module¶

class conpagnon.data_handling.atlas.Atlas(path, name)[source]¶

Bases: object

A class Atlas for computing useful information when dealing with atlases.

fetch_atlas()[source]¶: Return the complete path to the atlas.

loadAtlas()¶: Load the atlas images and return a 4D numpy array.

GetRegionNumbers()¶: Return the number of regions in the atlas.

GetLabels()[source]¶: Return the list of the labels of the atlas.

GetCenterOfMass()¶: Return the array of center-of-mass coordinates for each atlas region.

UserLabelsColors()¶: Generate an array of user-defined colors for the labels, for display purposes.

GetLabels(labelsFile, colname='labels')[source]¶

Read the labels text file of the atlas

Parameters

labelsFile (str) – The full path to the labels file of the atlas. Supported extensions are: .csv, .txt, .xlsx or .xls. By default, the labels file is expected to have a header naming the labels column ‘labels’.
colname (str, optional) – The name of the column containing the labels. Default is ‘labels’. If the file has no header, pass None.

Returns

output – The list of the labels.

Return type

list

fetch_atlas()[source]¶: Fetch the complete path to the atlas file.

get_center_of_mass(asanarray=False)[source]¶

Compute centers of mass of the different atlas regions.

Parameters: asanarray (bool, optional) – If True, the centers of mass are returned as a numpy.array of shape (number of regions, 3).
Returns: output – The coordinates of the center of mass of each region of the atlas.
Return type: list or numpy.array

get_region_numbers()[source]¶: Return the number of regions of the atlas

load_atlas()[source]¶: Load the atlas image and return the corresponding 4D numpy array.

user_labels_colors(networks, colors)[source]¶

Generates user defined labels colors for each label of the atlas.

Parameters

networks (list) – The list containing the number of regions in each network.
colors (list) – The list of colors, one per network.

Returns

output – The array containing the RGB values of the colors given in colors. To obtain normalized RGB values, divide the values of this array by 255.

Return type

numpy.array, shape(number of regions, 3)

References

Please find all possible colors at [1] https://matplotlib.org/examples/color/named_colors.html
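As a rough illustration of how the networks and colors lists line up, the sketch below repeats each color once per region of its network. It is not ConPagnon's implementation: expand_network_colors and the small NAMED_RGB table are hypothetical stand-ins for the matplotlib named colors.

```python
# Hypothetical subset of named colors, as 0-255 RGB triplets.
NAMED_RGB = {
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "green": (0, 128, 0),
}


def expand_network_colors(networks, colors, normalize=False):
    """Return one RGB triplet per region, network after network."""
    if len(networks) != len(colors):
        raise ValueError("networks and colors must have the same length")
    rows = []
    for n_regions, color in zip(networks, colors):
        rgb = NAMED_RGB[color]
        if normalize:
            # Normalized RGB space: divide each channel by 255.
            rgb = tuple(channel / 255 for channel in rgb)
        rows.extend([rgb] * n_regions)
    return rows


# Two networks: 2 regions in red, then 3 regions in blue.
region_colors = expand_network_colors([2, 3], ["red", "blue"])
```

The result has one row per region, which matches the documented output shape (number of regions, 3).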

conpagnon.data_handling.atlas.closest_colour(requested_colour)[source]¶

conpagnon.data_handling.atlas.fetch_atlas(atlas_folder, atlas_name, colors_labels='auto', network_regions_number='auto', labels='auto', normalize_colors=False)[source]¶

Return important information from an atlas file.

Parameters

atlas_folder (str) – The full path to the directory containing the atlas
atlas_name (str) – The filename of the atlas file.
colors_labels (str or list, optional) – If set to ‘auto’, the ROI labels get random colors. If a list of colors is provided, the ROIs belonging to a network get the corresponding color. The colors must be in the same order as the networks in the atlas file, and the length of the list must match the number of networks in the atlas.
network_regions_number (list, optional) – If set to ‘auto’, random colors are chosen. If a list of the number of regions in each network is provided, each color in the color list is applied to the corresponding number of regions.
labels (list, optional) – The list of the ROI labels. If not provided, the name of each ROI is simply its position in the atlas file.
normalize_colors (bool, optional) – If True, all RGB triplets are divided by the maximum value, 255.

Returns

output 1 (numpy.array) – The coordinates of the center of mass of each ROI in the atlas. An array of shape (n_rois, 3).
output 2 (list) – The name of each ROI in the atlas. A list of length n_rois.
output 3 (numpy.array) – The RGB colors of each ROI. An array of shape (n_rois, 3).
output 4 (int) – The number of ROIs in the atlas.

conpagnon.data_handling.atlas.fetch_atlas_functional_network(atlas_excel_file, sheetname, network_column_name)[source]¶

Return a dictionary containing information for all functional networks. The information for each network is read from the Excel file.

Parameters

atlas_excel_file (str) – The full path to the excel file containing all the information on your atlas.
sheetname (str) – The active sheet name in the atlas excel file.
network_column_name (str) – The name of the column containing the labels of the functional networks.

Returns

output – A dictionary with the network names as keys, and the sub-dataframe and the number of ROIs of each network as values.

Return type

dict

conpagnon.data_handling.atlas.generate_3d_img_network(reference_4datlas, atlas_information_xlsx_file, network_column_name, sheetname, atlas4d_index_keys, atlas3d_label_key, save_network_img_directory)[source]¶: Generate a 3D NIfTI file for each functional network defined in a 4D atlas.

conpagnon.data_handling.atlas.get_colour_name(requested_colour)[source]¶

conpagnon.data_handling.data_architecture module¶

Created on Mon Sep 18 17:32:38 2017

@author: Dhaif BEKHA

ComPagnon version 2.0

conpagnon.data_handling.data_architecture.create_group_dictionnary(subjects_id_data_path, root_fmri_data_directory, groupes)[source]¶

Initialise a dictionary containing groups as keys and subjects IDs as values

Parameters

subjects_id_data_path (str) – The full path to the data file containing the subjects IDs.
root_fmri_data_directory (str) – The full path to a root directory containing one or more sub-directories where the functional images are stored.
groupes (list) – List of the names of the sub-directories containing the fMRI files you want.

Returns

output – A dictionary with groups as keys and, for each group, the subjects IDs as values.

Return type

dict

See also

list_fmri_data(): Fetch functional images in all sub-directories.
read_text_data_file(): read a text data file.

Notes

Whatever the format of the subjects IDs data file, it should not contain a header. It should consist of a single raw column of subject IDs.

conpagnon.data_handling.data_architecture.fetch_data(subjects_id_data_path, root_fmri_data_directory, groupes, individual_confounds_directory=None)[source]¶

Fetch a complete organised structure for a group study on a common atlas for all subjects

Parameters

subjects_id_data_path (str) – The full path to the data file containing the subjects IDs.
root_fmri_data_directory (str) – The full path to a root directory containing one or more sub-directories where the functional images are stored.
groupes (list) – List of the names of the sub-directories containing the fMRI files you want.
individual_confounds_directory (None or str) – Full path to confounds files for all subjects.

Returns

output – A multi-level dictionary containing all the data. The first level holds the group keys, the second level the subject IDs, and the last level all the relevant files for one subject: the fMRI image, and the confound file if required.

Return type

dict

See also

fetch_data_with_individual_atlases(): Fetch a complete organised structure for a group study that requires the use of individual atlases.

Notes

Whatever the format of the subjects IDs data file, it should not contain a header. It should consist of a single raw column of subject IDs.
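The three levels of the returned dictionary (group, subject ID, files) can be pictured with the sketch below. The group names, subject IDs, paths and the key names functional_file and confounds_file are all hypothetical; the keys actually used by fetch_data are not stated on this page.

```python
# Hypothetical group names, subject IDs, paths and key names.
organised_data = {
    "patients": {
        "sub01": {
            "functional_file": "/data/fmri/patients/sub01_bold.nii",
            "confounds_file": "/data/confounds/sub01.csv",
        },
    },
    "controls": {
        "sub02": {
            "functional_file": "/data/fmri/controls/sub02_bold.nii",
            "confounds_file": None,  # no confounds requested for this subject
        },
    },
}

# Walk the three levels: group, subject ID, then files.
functional_files = [
    files["functional_file"]
    for subjects in organised_data.values()
    for files in subjects.values()
]
```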

conpagnon.data_handling.data_architecture.fetch_data_with_individual_atlases(subjects_id_data_path, root_fmri_data_directory, groupes, individual_atlases_directory, individual_atlases_labels_directory, individual_atlas_file_extension, individual_atlas_labels_extension, individual_counfounds_directory=None)[source]¶

Fetch a complete organised structure for a group study that requires the use of individual atlases

Parameters

subjects_id_data_path (str) – The full path to the data file containing the subjects IDs.
root_fmri_data_directory (str) – The full path to a root directory containing one or more sub-directories where the functional images are stored.
groupes (list) – List of the names of the sub-directories containing the fMRI files you want.
individual_atlases_directory (str) – Full path to individual atlases directory for all subjects.
individual_atlases_labels_directory (str) – Full path to individual atlases labels directory for all subjects.
individual_atlas_file_extension (str) – Extension of the individual atlas images for all subjects.
individual_atlas_labels_extension (str) – Extension of the text data files containing the individual atlas labels for all subjects.
individual_counfounds_directory (None or str) – Full path to the confounds files for all subjects.

Returns

output – A multi-level dictionary containing all the data. The first level holds the group keys, the second level the subject IDs, and the last level all the relevant files for one subject: the fMRI image, the subject atlas, the subject atlas labels file, and the confound file if required. A key called ‘discarded_rois’ contains the excluded ROIs; see Notes.

Return type

dict

See also

fetch_data(): Fetch a complete organised structure for a group study on a common atlas for all subjects.

Notes

Whatever the format of the subjects IDs data file, it should not contain a header. It should consist of a single raw column of subject IDs.

A discarded ROI is a subject atlas ROI whose label is ‘void’. It can be an empty ROI or a ROI you do not need for the analysis. Discarded ROIs are excluded from the connectivity analysis, for example when computing t-tests.

conpagnon.data_handling.data_architecture.fetch_fmri_data(root_fmri_data_directory, groupes)[source]¶

Fetch functional images found in a list of sub-directories.

Parameters

root_fmri_data_directory (str) – The full path to a root directory containing one or more sub-directories where the functional images are stored.
groupes (list) – List of the names of the sub-directories containing the fMRI files you want.

Returns

output – A dictionary with the sub-directory names as keys and the full paths to the functional images as values.

Return type

dict

See also

list_fmri_data(): Fetch functional images in all sub-directories.

conpagnon.data_handling.data_architecture.list_fmri_data(root_fmri_data_directory)[source]¶

Fetch all functional images found in sub-directories at a root directory.

Parameters: root_fmri_data_directory (str) – The full path to a root directory containing one or more sub-directories where the functional images are stored.
Returns: output – A dictionary with the sub-directory names as keys and the full paths to the functional images as values.
Return type: dict
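The kind of listing described here can be sketched with the standard library alone. This is not the body of list_fmri_data: the function name list_images_by_subdirectory and the .nii / .nii.gz suffix filter are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def list_images_by_subdirectory(root_directory, suffixes=(".nii", ".nii.gz")):
    """Map each sub-directory name to the sorted image paths it contains."""
    listing = {}
    for subdirectory in sorted(Path(root_directory).iterdir()):
        if not subdirectory.is_dir():
            continue
        listing[subdirectory.name] = sorted(
            str(path)
            for path in subdirectory.iterdir()
            if path.name.endswith(suffixes)
        )
    return listing


# Build a throwaway directory tree to exercise the function.
with tempfile.TemporaryDirectory() as root:
    for group, filename in [("controls", "sub01.nii"), ("patients", "sub02.nii.gz")]:
        (Path(root) / group).mkdir()
        (Path(root) / group / filename).touch()
    (Path(root) / "controls" / "notes.txt").touch()  # ignored: not an image
    listing = list_images_by_subdirectory(root)
```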

conpagnon.data_handling.data_architecture.read_text_data_file(file_path, colname=None, header=None)[source]¶

Read a data file

The data file can be a .csv, .txt, .xlsx or .xls file

Parameters

file_path (str) – Full path to the file to read.
colname (None or str) – The column name to extract.
header (None or int) – Row number to use as the column names, and the start of the data.

Returns

output – The extracted column, as a pandas DataFrame.

Return type

pandas.core.frame.DataFrame

conpagnon.data_handling.data_management module¶

conpagnon.data_handling.data_management.concatenate_dataframes(list_of_dataframes, axis=0)[source]¶: Concatenate a list of pandas DataFrame

conpagnon.data_handling.data_management.csv_from_dictionary(subjects_dictionary, groupes, kinds, field_to_write, header, csv_filename, output_directory, delimiter=',')[source]¶

Write a csv file from a subjects dictionary.

Parameters

subjects_dictionary (dict) – A dictionary with the same structure as a subjects connectivity matrices dictionary
groupes (list) – The list of groups to write
kinds (list) – The list of kinds to write
field_to_write (str) – The field containing the value to write for each subject.
header (list) – The header of the CSV file, as a list of column names
csv_filename (str) – The end of the CSV filename, with the extension
output_directory (str) – The full path to a directory for saving the CSV file.
delimiter (str, optional) – The delimiter between columns. Default is a comma.

conpagnon.data_handling.data_management.csv_from_intra_network_dictionary(subjects_dictionary, groupes, kinds, network_labels_list, field_to_write, output_directory, csv_prefix, delimiter=',')[source]¶: Write csv file from the intra-network connectivity dictionary structure.

conpagnon.data_handling.data_management.dataframe_to_csv(dataframe, path, delimiter=',', index=False)[source]¶: Create and write a CSV file from a DataFrame

conpagnon.data_handling.data_management.dictionary_to_csv(dictionary, output_dir, output_filename)[source]¶: Write dictionary couple (key, value) in a CSV file

conpagnon.data_handling.data_management.flatten(values)[source]¶

Flatten a list of numpy ND-array

Parameters: values (list) – A list of numpy arrays, with the same or different dimensions.
Returns: output – A flat (one-dimensional) array containing all the values, in the same order as the list of arrays.
Return type: numpy.array

conpagnon.data_handling.data_management.group_by_factors(dataframe, list_of_factors, return_type='list_of_dataframe')[source]¶

Group by factors present in a dataframe

Parameters

dataframe (pandas.DataFrame) – A pandas dataframe.
list_of_factors (list) – The list of factors, i.e. the column names in the dataframe, you want to group by.
return_type (str) – The output format; choices are list_of_dataframe or dictionary. With the former, a list of dataframes is returned, with length equal to the number of groups; with the latter, a dictionary with the group names as keys and the corresponding dataframes as values is returned. Default is list_of_dataframe.

Returns

output – A list or dictionary of the dataframes grouped by the given factors.

Return type

list or dict

conpagnon.data_handling.data_management.merge_by_index(dataframe1, dataframe2, left_index=True, right_index=True)[source]¶

Merge two dataframes based on their matching index values

Parameters

dataframe1 (pandas.DataFrame) – A pandas DataFrame
dataframe2 (pandas.DataFrame) – A pandas DataFrame
left_index (bool, optional) – If True, the merge operation is based on the left index
right_index (bool, optional) – If True, the merge operation is based on the right index

Returns

output – The merged dataframe.

Return type

pandas.DataFrame

Notes

If left_index and right_index are both True, the merge is based on the intersection of the two dataframe indexes, i.e. an index missing from one of the dataframes is dropped from the final dataframe.

conpagnon.data_handling.data_management.merge_list_dataframes(list_dataframes)[source]¶: Merge a list of dataframes

conpagnon.data_handling.data_management.read_csv(csv_file, delimiter=',')[source]¶

Read a CSV file and return a pandas.DataFrame

Parameters

csv_file (str) – The full path to the CSV file to read
delimiter (str) – The separator used in the CSV file

conpagnon.data_handling.data_management.read_excel_file(excel_file_path, sheetname, subjects_column_name)[source]¶

Read an Excel document

Parameters

excel_file_path (str) – Full path to the excel document
sheetname (str) – The sheetname to read in the excel document
subjects_column_name (str) – The column name containing the subjects identifiers.

Returns

output – A pandas DataFrame, indexed by subject name.

Return type

pandas.DataFrame

conpagnon.data_handling.data_management.remove_duplicate(seq)[source]¶: Remove duplicate in a sequence of items while keeping the order.
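Order-preserving de-duplication is commonly written with a set of already-seen items, as in the sketch below. The name remove_duplicate_sketch is hypothetical and this is not necessarily how remove_duplicate is implemented; the sketch also assumes the items are hashable.

```python
def remove_duplicate_sketch(seq):
    """Return the items of seq without repeats, keeping first occurrences in order."""
    seen = set()
    unique = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


unique_subjects = remove_duplicate_sketch(["sub02", "sub01", "sub02", "sub03", "sub01"])
```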

conpagnon.data_handling.data_management.shift_index_column(panda_dataframe, columns_to_index)[source]¶

Shift the index column of a pandas DataFrame

Parameters

panda_dataframe (pandas.DataFrame) – A pandas dataframe.
columns_to_index (list) – Column label or list of column labels / arrays

Returns

output – A new pandas DataFrame with the shifted columns as index.

Return type

pandas.DataFrame

conpagnon.data_handling.data_management.unflatten(flat_values, prototype)[source]¶

Unflatten a one-dimensional array of values back into the original list of arrays.

Parameters

flat_values (numpy.ndarray) – The numpy array containing the values.
prototype (list) – The original list of numpy arrays.

Returns

output – A list of arrays with the same structure as prototype.

Return type

list
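The round trip between flatten and unflatten can be sketched as follows. To keep the example dependency-free, plain one-dimensional Python lists stand in for numpy arrays, so it only illustrates the ordering and the role of prototype; flatten_sketch and unflatten_sketch are hypothetical names, and ND arrays are not handled.

```python
def flatten_sketch(values):
    """Concatenate the items of every sequence, in list order."""
    return [item for sequence in values for item in sequence]


def unflatten_sketch(flat_values, prototype):
    """Cut flat_values into pieces whose lengths match those in prototype."""
    pieces, start = [], 0
    for sequence in prototype:
        stop = start + len(sequence)
        pieces.append(list(flat_values[start:stop]))
        start = stop
    if start != len(flat_values):
        raise ValueError("flat_values does not match the prototype size")
    return pieces


original = [[1, 2, 3], [4], [5, 6]]
flat = flatten_sketch(original)
restored = unflatten_sketch(flat, original)
```

Only the lengths of the prototype entries are used here; their values are ignored.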

conpagnon.data_handling.data_management.write_ols_results(ols_fit, design_matrix, response_variable, output_dir, model_name, design_matrix_index_name=None)[source]¶: Write OLS result, along with the design matrix and the variable to explain.

conpagnon.data_handling.dictionary_operations module¶

conpagnon.data_handling.dictionary_operations.groupby_factor_connectivity_matrices(population_data_file, sheetname, subjects_connectivity_matrices_dictionnary, groupes, factors, drop_subjects_list=None, index_col=0)[source]¶: Group the subjects connectivity matrices by attribute. The index_col argument (added 18/09/2019) gives the index of the column to use as the index of the whole dataframe. This function also works with a time series dictionary. TODO: refactor subjects_connectivity_matrices_dictionary to subjects_dictionary.

conpagnon.data_handling.dictionary_operations.merge_dictionary(dict_list, new_key=None)[source]¶

Merge a list of dictionaries

Parameters

new_key (str, optional) – The key of the new merged dictionary. If None, the dictionaries in the list are simply merged together. Default is None
dict_list (list) – A list of the dictionaries to be merged

Returns

output – A dictionary with one key and the merged dictionary as value.

Return type

dict

Notes

Note that all the dictionaries you want to merge must have different keys.
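A minimal sketch of the documented behaviour, under stated assumptions: merge_dictionary_sketch is a hypothetical name, and raising KeyError on a shared key is a choice made here to make the constraint in the Notes explicit; the real function may handle duplicated keys differently.

```python
def merge_dictionary_sketch(dict_list, new_key=None):
    """Merge dictionaries with distinct keys, optionally nesting under new_key."""
    merged = {}
    for dictionary in dict_list:
        overlap = merged.keys() & dictionary.keys()
        if overlap:
            # Assumption: fail loudly rather than silently overwrite.
            raise KeyError(f"duplicated keys: {sorted(overlap)}")
        merged.update(dictionary)
    return merged if new_key is None else {new_key: merged}


patients = {"sub01": {"age": 12}}
controls = {"sub02": {"age": 11}}
merged_groups = merge_dictionary_sketch([patients, controls], new_key="all")
```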

conpagnon.data_handling.dictionary_operations.random_draw_of_connectivity_matrices(subjects_connectivity_dictionary, groupe, n_matrices, subjects_id_list=None, random_state=None, extract_kwargs=None)[source]¶

Randomly pick N connectivity matrices from a subjects connectivity dictionary.

Parameters

subjects_connectivity_dictionary (dict) – The subjects dictionary containing connectivity matrices
groupe (str) – The group from which you want to pick the matrices
n_matrices (int) – The number of connectivity matrices you want to randomly choose
subjects_id_list (list, optional) – The subjects identifiers list in which you want to choose matrices. If None, random matrices are picked in the entire group. Default is None.
random_state (int, optional) – The seed of the pseudo random number generator to use when shuffling the data. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.
extract_kwargs (dict, optional) – A dictionary of arguments passed to the extract_sub_connectivity_matrices function. Default is None

Returns

output 1 (dict) – The connectivity matrices dictionary, with subjects chosen randomly.
output 2 (list) – The list of randomly chosen subjects identifier.
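The idea of a seeded random draw of subjects can be sketched with the standard library. This is a simplified stand-in, not the function above: draw_subjects_sketch is a hypothetical name, it takes the dictionary of a single group rather than the full subjects dictionary, it uses random.Random instead of a numpy RandomState, and it ignores extract_kwargs.

```python
import random


def draw_subjects_sketch(group_dictionary, n_matrices, subjects_id_list=None, random_state=None):
    """Pick n_matrices subjects at random and return their entries and IDs."""
    pool = subjects_id_list if subjects_id_list is not None else group_dictionary
    candidates = sorted(pool)  # fixed order, so a given seed gives a given draw
    rng = random.Random(random_state)
    chosen = rng.sample(candidates, n_matrices)
    return {subject: group_dictionary[subject] for subject in chosen}, chosen


# Hypothetical group: subject ID mapped to a placeholder for its matrices.
group = {f"sub{i:02d}": {"correlation": None} for i in range(1, 6)}
drawn, drawn_ids = draw_subjects_sketch(group, n_matrices=2, random_state=0)
```

Calling the function again with the same random_state returns the same subjects.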

conpagnon.data_handling.dictionary_operations.rebuild_subject_connectivity_matrices(subjects_connectivity_dictionary, groupes, kinds, diagonal_were_kept=False)[source]¶

Given the subjects connectivity dictionary, rebuild the matrices from their vectorized form.

Parameters

subjects_connectivity_dictionary (dict) – The subjects connectivity dictionary
groupes (list) – The list of groups to rebuild the subjects matrices.
kinds (list) – The list of kinds to rebuild.
diagonal_were_kept (bool, optional) – If True, the reconstructed matrix gets the diagonal stored in the kind diagonal field of the dictionary, and the mask gets the one stored in the mask diagonal field. If False, the reconstructed matrix has a zero diagonal, and the mask a True diagonal.

Returns

output – The reconstructed subjects connectivity matrices. All the matrices now have shape (number_of_regions, number_of_regions).

Return type

dict

Notes

If the matrices and corresponding masks in the input dictionary were vectorized with the diagonal kept, the argument diagonal_were_kept must be set to False. A dimension error is raised otherwise.
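Rebuilding a symmetric matrix from a vectorized one can be sketched in plain Python as below. Two assumptions are made: the vector holds the strictly lower triangle in row-major order, and the diagonal, when available, is supplied separately. The ordering actually used by ConPagnon is not stated on this page, and vector_to_symmetric_sketch is a hypothetical name.

```python
def vector_to_symmetric_sketch(vector, diagonal=None):
    """Rebuild a symmetric matrix from its strictly lower triangle."""
    # Find the matrix size n such that n * (n - 1) / 2 == len(vector).
    n = 1
    while n * (n - 1) // 2 < len(vector):
        n += 1
    if n * (n - 1) // 2 != len(vector):
        raise ValueError("vector length is not a valid triangle size")
    matrix = [[0.0] * n for _ in range(n)]
    values = iter(vector)
    for row in range(n):
        for col in range(row):
            # Mirror each value across the diagonal.
            matrix[row][col] = matrix[col][row] = next(values)
    if diagonal is not None:
        for i, value in enumerate(diagonal):
            matrix[i][i] = value
    return matrix


rebuilt = vector_to_symmetric_sketch([0.1, 0.2, 0.3])
```

A vector whose length does not match any triangle size raises an error, which is the kind of dimension mismatch the note above warns about.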

conpagnon.data_handling.dictionary_operations.stack_subjects_connectivity_matrices(subjects_connectivity_dictionary, groupes, kinds)[source]¶

Re-arrange the subjects connectivity dictionary to return a stacked version per group and kind.

Parameters

subjects_connectivity_dictionary –
groupes –
kinds –

Returns

Module contents¶

Created on Mon Sep 18 16:38:48 2017

@author: db242421

The data_handling module provides convenient functions for storing the various pieces of information related to the useful files (fMRI, individual atlases…) in an easy-to-use dictionary structure.