In the domain of functional magnetic resonance imaging (fMRI) data analysis, given two correlation matrices between regions of interest (ROIs) for the same subject, it is important to reveal relatively large differences to ensure accurate interpretation. However, clustering results based only on differences tend to be unsatisfactory and interpreting the features tends to be difficult because the differences likely suffer from noise. Therefore, to overcome these problems, we propose a new approach for dimensional reduction clustering.

Currently, the neural basis of the human cognitive system is studied using noninvasive neuroimaging techniques such as functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and functional near-infrared spectroscopy (fNIRS) [

Here, we focus on situations in which correlation matrices between ROIs are calculated for each subject in two different conditions. In such situations, it is important to reveal subnetworks of ROIs such that the differences between conditions are relatively large. However, it is difficult to interpret the features of distinctive clusters because the range of correlations is bounded, i.e.,

In [

In this paper, we propose a new dimensional reduction clustering approach for the inner product of the difference between two correlation matrices. The problem is to estimate both the clustering structure of differences between correlation matrices and the low-rank correlation matrix, which can describe the clustering structure, given two correlation matrices. Our approach has two advantages. First, the clustering results are superior to those from methods that rely only on the difference. Second, the range of the estimated low-rank matrix is bounded within the range

The remainder of this paper is organized as follows. In the Methods section, the reason for using the inner product of the difference between correlation matrices is discussed, and its advantage over using only the difference is explained. In addition, the proposed model and corresponding objective function are introduced. To estimate the parameters, an algorithm based on the derived majorization function is provided. Afterward, the simulation design of our numerical study and the fMRI data for a mental arithmetic task, for a demonstration of our proposed approach, are described. The results of the simulation and real fMRI data for the mental arithmetic task are discussed. Finally, we offer our concluding remarks regarding the proposed method.

In this section, we explain the proposed method. First, before introducing the optimization problem, we explain our reason for using the inner product of the difference between correlation matrices, instead of using only the difference. The model of the proposed method is then introduced, and the optimization problem of the proposed method is presented. Finally, the simulation design of our numerical study and the fMRI data for a mental arithmetic task, for a demonstration of our proposed approach, are explained.
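To make the distinction between "the difference" and "the inner product of the difference" concrete, the following is a minimal sketch. The paper's exact definition is given in equations that are not reproduced here, so this sketch assumes the difference is the element-wise matrix `D = R1 - R2` and the inner product is `D^T D`. The function names and the three-ROI matrices are hypothetical.

```python
def difference_matrix(r1, r2):
    """Element-wise difference R1 - R2 of two square correlation matrices."""
    p = len(r1)
    return [[r1[i][j] - r2[i][j] for j in range(p)] for i in range(p)]


def inner_product_matrix(d):
    """Return D^T D (assumed form of the inner product).

    Entry (i, j) is the inner product of columns i and j of D.
    """
    p = len(d)
    return [[sum(d[k][i] * d[k][j] for k in range(p)) for j in range(p)]
            for i in range(p)]


# Hypothetical correlation matrices for three ROIs under two conditions.
r_cond1 = [[1.0, 0.8, 0.1],
           [0.8, 1.0, 0.2],
           [0.1, 0.2, 1.0]]
r_cond2 = [[1.0, 0.2, 0.1],
           [0.2, 1.0, 0.2],
           [0.1, 0.2, 1.0]]

d = difference_matrix(r_cond1, r_cond2)
g = inner_product_matrix(d)
```

Under this assumption, entry (i, j) of `g` compares the whole difference profiles of ROIs i and j across all ROIs, rather than a single pairwise difference.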

Here, the model of the proposed method is explained. Let

However, even if

Based on Equations (

From Equation (

To reiterate, the purpose of the proposed method is to estimate a low-rank matrix such that the clustering structure is emphasized. To achieve this purpose,

In this subsection, we show the proposed dimensional reduction clustering approach based on Equation (

Afterward, the optimization problem of estimating a low-rank correlation matrix is described. Given

This subsection provides a detailed description of the MM algorithm for estimating

First, before the majorizing function is derived, the principle of the MM algorithm is explained. For more details on the MM algorithm, see [
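As a general reminder, the MM principle for a minimization problem can be written as follows. This is the standard textbook statement, not the majorizing function derived for this paper's objective. The symbols f (objective), g (majorizing function), and θ (parameters) are generic notation introduced here for illustration only.

```latex
\begin{align}
  g(\theta \mid \theta^{(t)}) &\ge f(\theta) \quad \text{for all } \theta,
  \qquad g(\theta^{(t)} \mid \theta^{(t)}) = f(\theta^{(t)}), \\
  \theta^{(t+1)} &= \arg\min_{\theta} \, g(\theta \mid \theta^{(t)}), \\
  f(\theta^{(t+1)}) &\le g(\theta^{(t+1)} \mid \theta^{(t)})
  \le g(\theta^{(t)} \mid \theta^{(t)}) = f(\theta^{(t)}).
\end{align}
```

The last line shows why each update cannot increase the objective: minimizing the majorizing function, which touches f at the current estimate and lies above it elsewhere, yields a monotonically non-increasing sequence of objective values.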

The objective function Equation (

The third term of Equation (

If

From (

For the algorithm based on Equation (

Finally, to detect the clustering structure of the variables,

In short, the proposed method can estimate

In this subsection, the superiority of the proposed method is shown via the results of a numerical simulation. In particular, the recovery of clustering results is evaluated in this simulation.

First, we describe the simulation design. To evaluate the clustering results, artificial data with a true clustering structure are generated and correlation matrices between variables are calculated from the data. Dimensional reduction clustering approaches are then applied to the difference between the two correlation matrices, and the clustering results are obtained. In this numerical simulation, the true number of clusters is assumed to be known beforehand. Finally, an adjusted Rand index (ARI) [
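For readers unfamiliar with the ARI, the following is a minimal sketch of how it can be computed from a true and an estimated partition, using the standard pair-counting formula. The function name and the example labels are hypothetical and are not taken from the simulation itself.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same objects."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one object: partitions trivially agree

    # Contingency table cells and its row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case, e.g., both partitions are one cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical true and estimated cluster labels for four variables.
ari = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

The index equals 1 when the two partitions coincide up to a relabeling of clusters and is close to 0 when the agreement is no better than chance.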

Afterward, we explain the method for generating the artificial data. Multivariate data representing condition 1 and condition 2 are generated as follows:

In this simulation, four factors are utilized; a summary of the simulation is shown in

For this factor, we evaluate four methods. The purpose of setting this factor is to evaluate the effects of using the inner product and of estimating a bounded low-rank matrix. Here, the proposed method is referred to as method 1. Method 2 follows the same approach as the proposed method, except that it is based only on the difference and not on the inner product. Through a comparison between the proposed method and method 2, we can evaluate the effect of using the inner product. In method 3, the low-rank matrix is estimated from only the difference and calculated via Cholesky decomposition, and no constraint is imposed on the estimated values. Based on a comparison between the proposed method and method 3, the effect of both the inner product and the bounded constraint is evaluated. In method 4, the low-rank matrix is estimated from the inner product; however, no constraint is imposed on the estimated values. Therefore, based on a comparison between the proposed method and method 4, the effect of the bounded constraint is evaluated.

These four methods are then explained from the perspective of calculation. The first method is the proposed approach based on the inner product of

The third method also consists of two steps. Eigenvalue decomposition is applied to

These methods, i.e., method 2, method 3, and method 4, are referred to as tandem approaches [

Both the third and fourth methods provide us with low-rank matrices and clustering results. However, from these results, it is difficult to interpret the degree of the relation because these estimated values are not bounded. On the other hand, both the first and second methods allow us to interpret the results easily because these estimated values are bounded within the range

For the four methods mentioned in the previous subsection, the rank must be determined. In this simulation, ranks are set to

All four methods mentioned in the Factor 1: Methods subsection adopt

At first, the generation of true clustering structures is determined by

For

With the aforementioned

Among the cognitive functions, working memory (WM) is important for engaging in everyday tasks such as conversations or reading books. WM is the system for temporarily storing and processing necessary information [

Thirty-two healthy adults (20 males and 12 females; mean age,

The participants were asked to perform a mental arithmetic task in an fMRI scanner. The experimental design is shown in

All MRI scans were performed on a 1.5T Echelon Vega (Hitachi, Ltd.). Functional images were acquired using a gradient-echo echo-planar imaging sequence (TR = 3000 ms, TE = 40 ms, flip angle =

The first six scans were excluded from the analysis to eliminate the nonequilibrium effects of magnetization. The functional images were preprocessed using SPM12 software (Wellcome Department of Cognitive Neurology, London, UK) [

For the functional connectivity analysis, the functional images preprocessed using SPM12 were further processed using the CONN toolbox [

To calculate functional connectivity during the tasks, the preprocessed images were parcellated into 116 regions, including 90 cerebrum regions and 26 cerebellum regions defined via automated anatomical labeling (AAL); the mean Blood-Oxygen-Level-Dependent (BOLD) time course was then calculated for each region. Subsequently, Pearson correlations among BOLD time courses of the 116 regions were calculated and then Fisher-z transformed. As a result, a
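The correlation and Fisher-z step described above can be sketched as follows. This is a minimal illustration, not the CONN toolbox implementation. The function names are hypothetical, the three short time courses are made-up numbers, and setting the diagonal to zero is an assumption made here because the Fisher-z transform of a correlation of exactly 1 is undefined.

```python
from math import atanh, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length time courses."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fisher_z_connectivity(time_courses):
    """Fisher-z transformed correlation matrix between regions.

    time_courses: one list of BOLD values per region.
    Off-diagonal entries are atanh(r); the diagonal is left at 0.0
    (an assumption of this sketch). Perfectly collinear time courses
    would make atanh raise ValueError.
    """
    p = len(time_courses)
    z = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            r = pearson(time_courses[i], time_courses[j])
            z[i][j] = z[j][i] = atanh(r)
    return z


# Hypothetical mean BOLD time courses for three regions (three scans each).
bold = [[1.0, 2.0, 3.0],
        [1.0, 3.0, 2.0],
        [2.0, 1.0, 3.0]]
z_matrix = fisher_z_connectivity(bold)
```

With 116 AAL regions, `time_courses` would hold 116 series and the result would be a symmetric matrix with one row and one column per region.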

In this section, we describe the results of the numerical simulations and of applying the proposed method to the fMRI data.

In this subsection, we describe the results of the numerical simulations.

The results for

From the overall results, we note two specific observations. First, when the number of clusters is relatively large, performance tends to be relatively low, irrespective of the method; this tendency has been reported in [

In this subsection, we show the results of applying the proposed method to the fMRI data. Concretely, the purpose of this example is to detect clustering structures where the difference between two experimental conditions, i.e., High-WM and Low-WM tasks, is emphasized. In addition, the features of these estimated clusters are interpreted in combination with knowledge on ROIs related to WM, including the task-positive network (TPN), ventral attention network (VAN), salience network (SN), visual network (VN), and default mode network (DMN). The TPN consists of the fronto-parietal network (FPN), dorsal attention network (DAN), and cingulo-opercular network (CON). The FPN and DAN are related to executive function [

We then explain how to construct the difference between correlation matrices. For each subject, using a matrix of difference between the correlation matrices of High-WM and Low-WM, we calculate one mean matrix of these differences. As described in the previous section, the input difference is a
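The construction of the mean difference matrix can be sketched as follows, assuming each subject contributes one High-WM and one Low-WM matrix of the same size. The function name and the two-subject, two-ROI example are hypothetical.

```python
def mean_difference_matrix(high_mats, low_mats):
    """Average over subjects of (High-WM matrix - Low-WM matrix)."""
    if not high_mats or len(high_mats) != len(low_mats):
        raise ValueError("need one High-WM and one Low-WM matrix per subject")
    n_subjects = len(high_mats)
    p = len(high_mats[0])
    mean_diff = [[0.0] * p for _ in range(p)]
    for high, low in zip(high_mats, low_mats):
        for i in range(p):
            for j in range(p):
                # Accumulate each subject's difference, scaled by 1/n.
                mean_diff[i][j] += (high[i][j] - low[i][j]) / n_subjects
    return mean_diff


# Hypothetical matrices for two subjects and two ROIs.
high = [[[1.0, 0.5], [0.5, 1.0]],
        [[1.0, 0.75], [0.75, 1.0]]]
low = [[[1.0, 0.25], [0.25, 1.0]],
       [[1.0, 0.25], [0.25, 1.0]]]
mean_diff = mean_difference_matrix(high, low)
```

The resulting single matrix is what a difference-based clustering method would take as input in place of any one subject's difference.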

In the proposed method, rank and the number of clusters must be selected. For the determination of rank, we set the rank candidates to

The proposed method is then applied to difference matrices of rank

The left side of

The features of the estimated cluster are then interpreted in combination with knowledge on WM.

These values on

Although the proposed method was applied to fMRI data during mental arithmetic tasks as an exploratory analysis, these results suggest that our method can detect community structures consisting of well-known functional networks associated with the human working memory system.

We proposed a new dimensional reduction clustering approach based on differences between correlation matrices. The estimated low-rank correlation represents the difference, and it is easy to interpret the relations because the estimated values are bounded within the range

In addition, we show the results of applying the proposed method to real fMRI data related to WM. Through dimensionality reduction, the clustering structures of the ROIs are emphasized, as shown in the left side of

Finally, we discuss the future direction of this research topic. There are four things to consider. First, although the proposed method certainly provides us with interpretable results from the perspective of WM, the proposed method requires further evaluation through real-data analyses and comparison with other methods. The properties of the clustering results, in particular, should be evaluated through the use of benchmark data. Second, although we show that the proposed method is based on Pearson’s correlation coefficient, other kinds of similarity measures, such as Spearman’s correlation coefficient, should be considered. Third, in the numerical simulations, we evaluate the clustering results using ARI. However, there are several other measures for comparing clustering results [

K.T. constructed the proposed statistical method, conducted numerical simulations, applied the method to real fMRI data, and drafted the initial manuscript. S.H. designed and conducted the experiment, proposed the framework of fMRI data analysis, and reviewed the manuscript. Both authors have read and agreed to the published version of the manuscript.

This work was supported by JSPS KAKENHI grant numbers JP17K12797 and JP19K12145.

Not applicable.

This study was approved by the Research Ethics Committee of Doshisha University (approval code: 1331), Kyoto, Japan. Informed consent was obtained from all subjects before they enrolled in the experiment.

The artificial data used in the numerical simulations were generated from probability distributions. For the method of generation, see the subsection Simulation Study. The datasets used in this study are available from the authors upon reasonable request.

We thank our academic editor and the reviewers for their useful comments.

The authors declare that there are no competing interests.

The following abbreviations are used in this manuscript:

Image of proposed approach.

Experimental design.

Simulation result for

Simulation result for

Change in rate for objective function; vertical and horizontal axes indicate rank and ratio of values for objective function, respectively.

Estimated correlation matrix for

Difference matrix (High-WM–Low-WM) permuted based on estimated clusters and WM.

Boxplot of difference (High-WM–Low-WM) based on estimated clusters and WM.

Summary of simulation design.

Names of Factors | Levels | Descriptions |
---|---|---|
Methods | 4 | Proposed method, Method 2, Method 3, and Method 4 |
Rank | 3 | Rank |
The number of clusters | 3 | |
The difference between true correlation | 2 | |