Measurement is an essential component of scientific research, whether in the natural, social, or health sciences. However, there is little discussion of measurement issues, especially among nursing researchers. Measurement plays an essential role in research in the health sciences, as in other scientific disciplines. As in the natural sciences, measurement is a fundamental part of the discipline and has been approached through the development of appropriate instrumentation [1]. In the field of nursing research, it is necessary to rigorously verify the validity and reliability of measurements and to consider their practical importance. Researchers who develop measurement scales can draw on established reporting guidance when preparing their reports [2]. Three guidelines are widely followed: the Standards for Educational and Psychological Testing [3], the Standards for Reporting of Diagnostic Accuracy (STARD) initiative [4], and the Guidelines for Reporting Reliability and Agreement Studies [5]. Many leading biomedical journals that publish reports of diagnostic tests, such as the Annals of Internal Medicine, The Journal of the American Medical Association, Radiology, The Lancet, BMJ, and Clinical Chemistry and Laboratory Medicine, have adopted STARD, along with journals in psychology such as the Journal of Personality Assessment [2]. The standards for reporting reliability and validity are summarized in Table 1.
Table 1
Reporting Reliability and Validity
Developing and validating a scale is a time-consuming task. After the scale has been qualitatively developed, it goes through a rigorous process of quantitative examination in which its score reliability and validity are assessed. This includes assessment of construct, concurrent, predictive, and discriminant validity. There are numerous techniques for evaluating construct validity, such as exploratory factor analysis (EFA), confirmatory factor analysis (CFA), and structural equation modeling. In this editorial, we will discuss the major issues in using factor analysis, a widely used method for validating scales. EFA and CFA differ greatly in their assumptions, approaches, and applications, and should therefore be understood correctly and applied appropriately.
EFA is used to reduce the number of measured variables, to investigate the structure among the variables, and to increase statistical efficiency. It is used when the relationship between observed variables and factors has not been theoretically established or logically organized. All observed variables are assumed to be influenced by all factors (each factor is related to all observed variables); on this basis, observed variables that correlate highly with one factor (and weakly with the others) are grouped to reduce the number of variables. Consequently, EFA is data-driven: it accepts the results as they are, rather than relying on a theoretical background or literature review [6].
To check the suitability of the data for factor analysis, the Kaiser–Meyer–Olkin (KMO) test and Bartlett’s test of sphericity are performed beforehand. Factor analysis is then performed in three steps: factor extraction, rotation, and cleaning. Determining the number of factors is critical, as the interpretation of the data depends on it. However, the most widely used “Kaiser rule” (dropping all components with eigenvalues under 1.0) is also the most misused. While each significant factor should have an eigenvalue of ≥ 1.0, not all factors with eigenvalues of 1.0 or above are significant. Unfortunately, many research papers make the error of accepting any factor with an eigenvalue of ≥ 1.0 without further consideration [7, 8]. The limitations of the Kaiser rule can be overcome by using parallel analysis and the scree test. Considering these methods together reduces over- or under-extraction when determining the number of factors, as sketched below.
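The following is a minimal sketch of these pre-checks and factor-retention steps in Python, assuming the pandas, numpy, and factor_analyzer packages are available; the DataFrame of item responses is a randomly generated placeholder, and the 100-replication parallel analysis shown is one common implementation rather than a fixed standard.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity,
    calculate_kmo,
)

# Placeholder item responses (rows = respondents, columns = scale items);
# substitute real data in practice.
df = pd.DataFrame(np.random.default_rng(0).normal(size=(300, 10)),
                  columns=[f"item{i}" for i in range(1, 11)])

# 1. Suitability of the data for factor analysis
chi_square, p_value = calculate_bartlett_sphericity(df)  # want p < .05
kmo_per_item, kmo_total = calculate_kmo(df)              # want KMO >= .60
print(f"Bartlett p = {p_value:.4f}, overall KMO = {kmo_total:.2f}")

# 2. Eigenvalues for the Kaiser rule and the scree test
fa = FactorAnalyzer(rotation=None)
fa.fit(df)
eigenvalues, _ = fa.get_eigenvalues()

# 3. Parallel analysis: keep leading factors whose eigenvalues exceed the
#    95th percentile of eigenvalues from random data of the same shape.
rng = np.random.default_rng(42)
random_eigs = [np.linalg.eigvalsh(np.corrcoef(rng.normal(size=df.shape).T))[::-1]
               for _ in range(100)]
threshold = np.percentile(random_eigs, 95, axis=0)

n_factors = 0
for observed, simulated in zip(eigenvalues, threshold):
    if observed > simulated:
        n_factors += 1
    else:
        break
print(f"Factors retained by parallel analysis: {n_factors}")
```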
When reducing many variables to a smaller number of factors, it can be difficult to determine which variables belong to which factors. Factor rotation minimizes the complexity of the factor loadings, which makes interpretation easier and enables a more detailed factor analysis.
An important difference among rotation methods is whether they create factors that are correlated or uncorrelated with each other. Four orthogonal rotation methods (equamax, orthomax, quartimax, and varimax) assume that the factors are uncorrelated. In contrast, oblique rotation methods assume that the factors are correlated [9]. In nursing research, it is theoretically reasonable to assume that factors are correlated. Researchers should choose an orthogonal or oblique rotation method by examining the correlations among the factors, and report the rotation method according to the presence or absence of correlation.
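The brief sketch below illustrates one way to make this choice with the factor_analyzer package, comparing an oblique (promax) against an orthogonal (varimax) rotation; the data are again a randomly generated placeholder, and the |r| > .30 cutoff for treating factors as correlated is a rule of thumb, not a fixed standard.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.DataFrame(np.random.default_rng(1).normal(size=(300, 10)),
                  columns=[f"item{i}" for i in range(1, 11)])

# Fit an oblique (promax) solution first.
fa_oblique = FactorAnalyzer(n_factors=2, rotation="promax")
fa_oblique.fit(df)

# For promax, factor_analyzer exposes the factor correlation matrix as
# `phi_`; weak off-diagonal correlations suggest an orthogonal solution.
phi = fa_oblique.phi_
off_diag = np.abs(phi[np.triu_indices_from(phi, k=1)])
if np.all(off_diag < 0.30):
    fa_final = FactorAnalyzer(n_factors=2, rotation="varimax")
    fa_final.fit(df)
    print("Factors essentially uncorrelated: report the varimax rotation")
else:
    fa_final = fa_oblique
    print("Factors correlated: report the promax rotation")

print(pd.DataFrame(fa_final.loadings_, index=df.columns).round(2))
```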
Finally, the cleaning of variables, the most difficult step of EFA, is applied where convergent or discriminant validity is impaired, that is, when items load on factors other than those identified in previous studies.
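As an illustration of this cleaning step, the sketch below flags items whose primary loading is weak (< .40) or that cross-load (primary and secondary loadings within .20 of each other); the loading matrix and both cutoffs are hypothetical rules of thumb, not fixed standards.

```python
import pandas as pd

# Hypothetical rotated loading matrix from an EFA.
loadings = pd.DataFrame(
    {"Factor1": [0.78, 0.71, 0.45, 0.12, 0.08],
     "Factor2": [0.10, 0.05, 0.41, 0.69, 0.25]},
    index=["item1", "item2", "item3", "item4", "item5"],
)

abs_load = loadings.abs()
primary = abs_load.max(axis=1)                                   # highest loading per item
secondary = abs_load.apply(lambda row: row.nlargest(2).iloc[-1], axis=1)  # second highest

# Flag weakly loading items and cross-loading items for review.
flagged = loadings[(primary < 0.40) | ((primary - secondary) < 0.20)]
print("Items to review or remove:")
print(flagged.round(2))
```

Here item3 would be flagged as a cross-loading item and item5 as a weakly loading item; whether to drop or revise such items remains a substantive judgment, not a purely statistical one.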
Factor analysis for the purpose of validity assessment is directly connected to the conceptual base of the measuring instrument. The purpose of CFA is to test whether the results of the factor analysis are consistent with the specified conceptual base or framework of the instrument. Hence, CFA is conducted when the conceptual base or framework of a measuring instrument clearly specifies the dimensionality of a concept or construct. CFA therefore assumes, on the basis of a strong theoretical background or previous research, that a specific observed variable is affected only by its related factor (latent variable) and not by others [6]. When the indicators measuring specific constructs are supported by reliable theoretical findings, CFA can be used to obtain appropriate model fit indices. However, if there are uncertainties about the factor structure, with controversies and inconsistent research results, EFA may be attempted first [10]. Park et al. [11] adopted CFA, instead of EFA, to validate the structure of factors identified in various studies.
To perform CFA, the model fit should be adequate, and variables with little explanatory power should be removed in the process. There are a number of model fit indices with different criteria. No single index provides a complete account of model fit, and there is no consensus in the literature on the most appropriate index. Therefore, it is recommended that a variety of indices be used to assess model fit [7, 10]. There are small differences in the indices recommended and used by researchers. As Boateng et al. [12] note, Browne and Cudeck [13] recommend RMSEA ≤ .05 as indicative of close fit, .05 ≤ RMSEA ≤ .08 as indicative of fair fit, and values > .10 as indicative of poor fit between the hypothesized model and the observed data. However, Hu and Bentler [14] have suggested that RMSEA ≤ .06 may indicate a good fit. Bentler and Bonett [15] suggest that models with overall fit indices of < .90 are generally inadequate and can be improved substantially. Hu and Bentler [14] recommend TLI ≥ .95; CFI ≥ .95 is often considered an acceptable fit, and SRMR ≤ .08 is a common threshold for acceptable model fit.
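The following hedged sketch fits a CFA and reads its fit indices with the semopy package (assumed installed); the two-factor model description and the simulated data are hypothetical placeholders for a real instrument.

```python
import numpy as np
import pandas as pd
import semopy

# Simulate two correlated item clusters as placeholder data.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
df = pd.DataFrame(
    {f"x{i}": 0.7 * f1 + rng.normal(scale=0.7, size=500) for i in range(1, 4)}
    | {f"y{i}": 0.7 * f2 + rng.normal(scale=0.7, size=500) for i in range(1, 4)}
)

# Hypothetical measurement model in lavaan-style syntax.
model_desc = """
Factor1 =~ x1 + x2 + x3
Factor2 =~ y1 + y2 + y3
"""
model = semopy.Model(model_desc)
model.fit(df)

# semopy.calc_stats returns a one-row table of indices including RMSEA,
# CFI, and TLI; compare them with the cutoffs discussed above (e.g.,
# RMSEA <= .06, CFI and TLI >= .95). SRMR may need to be computed
# separately depending on the semopy version.
stats = semopy.calc_stats(model)
print(stats[["RMSEA", "CFI", "TLI"]].round(3))
```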
Additionally, a CFA should demonstrate construct validity, convergent validity, and discriminant validity to be accurate. Construct validity concerns whether the observed variables constituting a latent variable are created with appropriate concepts and definitions, as examined through CFA; it is established when the standardized coefficient of each observed variable constituting the latent variable is .50 or higher. Convergent validity refers to how closely the new scale is related to other variables and measures of the same construct; in general, it is determined by construct reliability (CR ≥ .70) and the average variance extracted (AVE > .50). Discriminant validity checks for overlap or similarity between the concepts constructed by two or more latent variables; it can be verified by comparing each factor’s AVE with the squared correlation coefficients between the factors, as in Table 2 [12, 16].
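As a minimal numeric sketch of these criteria, CR and AVE can be computed directly from standardized loadings; the loading vector and inter-factor correlation below are hypothetical values, and error variances are taken as 1 − λ² under the standard CFA assumptions.

```python
import numpy as np

# Hypothetical standardized loadings for the items of one factor.
loadings_f1 = np.array([0.72, 0.68, 0.81, 0.75])
errors = 1 - loadings_f1 ** 2  # standardized error variances

# Construct reliability: (sum of loadings)^2 / ((sum)^2 + sum of errors).
cr = loadings_f1.sum() ** 2 / (loadings_f1.sum() ** 2 + errors.sum())

# Average variance extracted: mean of the squared loadings.
ave = np.mean(loadings_f1 ** 2)
print(f"CR = {cr:.2f} (want >= .70), AVE = {ave:.2f} (want > .50)")

# Discriminant validity (Fornell-Larcker criterion): the factor's AVE
# should exceed its squared correlation with every other factor.
r_f1_f2 = 0.45  # hypothetical inter-factor correlation
print(f"Discriminant validity held: {ave > r_f1_f2 ** 2}")
```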
Table 2
Comparison of Validity
Factor analysis is one of the most common multivariate statistical analyses used to assess validity in measurement. The EFA process is used to reduce dimensions by extracting a small number of constructs (factors) from a large number of observed variables, under the assumption that all observed variables are related to all factors (when the theoretical background and previous studies are insufficient). On the other hand, the CFA process is used to identify the relationships between latent variables and observed variables that are pre-determined on the basis of a strong theoretical background and previous research. It is therefore crucial to develop and validate tools that measure constructs in the nursing field appropriately. We hope that this editorial will provide practical momentum for conducting more accurate and systematic factor analyses with high validity and reliability.
CONFLICTS OF INTEREST: Park JH and Kim JI have been the Editors of JKAN since 2020. Except for that, we declare no potential conflict of interest relevant to this article.
FUNDING: This work was supported by Soonchunhyang University.
AUTHOR CONTRIBUTIONS:
Conceptualization or/and Methodology: Park JH & Kim JI.
Data curation or/and Analysis: Park JH.
Funding acquisition: None.
Investigation: Park JH & Kim JI.
Project administration or/and Supervision: Kim JI.
Resources or/and Software: Park JH & Kim JI.
Validation: Park JH & Kim JI.
Visualization: Kim JI.
Writing original draft or/and Review & Editing: Park JH & Kim JI.
DATA AVAILABILITY: Please contact the corresponding author for data availability.