Conners’ Adult ADHD Rating Scale Infrequency Index Validation and Pilot Comparison of Administration Formats

One major obstacle to the accurate diagnosis of ADHD in college students is malingering, although many symptom self-report measures used in the diagnostic process do not contain validity scales to identify feigners. The Infrequency Index (CII) for the Conners’ Adult ADHD Rating Scale–Self-Report: Long Version (CAARS-S: L) was developed in response to this concern, although further validation of this index is needed. Another topic of interest in ADHD malingering research is the increasing use of online assessments. However, little is known about how ADHD is malingered in an online format, particularly on the CAARS-S: L. The current study utilized a coached simulation design to examine the feigning detection accuracy of the CII and provide initial results on the effect of administration format (paper vs. online) on CAARS-S: L profiles. Data from 139 students were analyzed. Students with ADHD and students instructed to feign the disorder produced statistically comparable elevations on seven of eight CAARS-S: L clinical scales. Clinical scale elevations were generally comparable between paper and online forms, although some differences in the clinical and simulated ADHD groups suggest the need for further research. The CII demonstrated modest sensitivity (0.36) and adequate specificity (0.85) at the recommended cut score across administration formats. Specificity reached desirable levels (>= .90) at a raised cut score. These values were similar across administration formats. Results support the use of the CII and online CAARS-S: L form. © 2020 Elizabeth Wallace. Hosting by Science Repository. All rights reserved.


Introduction
Attention-deficit/hyperactivity disorder (ADHD) is a neurodevelopmental condition characterized by persistent symptoms of inattention and/or hyperactivity/impulsivity [1]. Although previously regarded as a disorder confined to childhood, it is now known that ADHD continues into adulthood for some individuals. Current prevalence rate for the disorder in adulthood is estimated at 4.4%, with ADHD affecting approximately 4.5% of college-aged (i.e., ages 18 to 24) adults [2]. Furthermore, of all college students receiving disability services on campuses, approximately 25% have been diagnosed with ADHD, a figure which is expected to increase [3].
Unfortunately for clinicians evaluating adults for ADHD, there are multiple obstacles to accurate diagnosis. One such challenge is malingering, which is defined as faking/exaggerating deficits for external benefit such as financial gain or avoidance of responsibilities [1]. Experts in the area suggest that feigning deficits is more likely to occur in 'high-stakes' psychological evaluations, such as those that could lead to external benefits for the examinee [4]. ADHD evaluations can be considered 'high-stakes' in that diagnosed college students may be eligible to receive academic accommodations, such as additional testing time, access to a private testing room, and/or stimulant medication [5]. Access to controlled stimulant medications, such as Adderall or Ritalin, can be particularly appealing to college students. The effects of such drugs include heightened and prolonged focus, which can be desirable in competitive academic environments. There is also growing evidence that these medications are sought by students for recreational use/abuse, with prevalence rates estimated between 13% and 34% [6]. Stimulant abuse may lead to excessive dopamine and norepinephrine levels in the prefrontal cortex, which is particularly concerning for college students as the prefrontal cortex continues developing in young adulthood [7]. Importantly, stimulant abuse is associated with higher rates of alcohol and drug use and other risky behaviors [8].
Furthermore, symptoms of ADHD are detailed online, making research on the disorder relatively easy for motivated students seeking a diagnosis [9]. Given these potential external gains and the availability of symptom information, feigning is a salient issue in this area. In fact, it has been estimated that as many as 25-48% of college students feign deficits during self-referred ADHD evaluations [10]. This rate is particularly troubling given excess healthcare costs associated with adult ADHD totaling $8.51 billion [11]. Thus, objective assessment for exaggerated and/or feigned ADHD symptoms is vital for a valid diagnosis.
Unfortunately, feigning in adult ADHD evaluations is difficult to identify accurately. First, no consistent pattern of deficits pathognomonic for the disorder has been identified. Accordingly, a standard ADHD assessment battery has not been established [12]. Second, clinicians often rely on self-report measures for information on past and current symptom severity [13]. However, research indicates that symptoms of ADHD are easily feigned by college students on retrospective and current self-report measures, including the Barkley Adult ADHD Rating Scale-IV and ADHD Behavior Checklist [14][15][16][17]. This is particularly problematic because ADHD self-report measures rarely include standard validity scales intended to identify potential feigners. Highlighting this concern are reports such as that by Jachimowicz and Geiselman (2004), which found that 90% of students instructed to feign on a self-report ADHD measure were successful at producing profiles consistent with ADHD impairment [18].
Other validity measures, namely performance validity tests (PVTs) designed for the detection of improbable impairment on cognitive measures, have demonstrated effectiveness in accurately detecting feigned ADHD [19,20]. Although symptom validity tests (SVTs), designed to detect exaggerated symptom reports, have demonstrated less ADHD feigning detection capability compared to PVTs, embedded SVTs offer clinicians a way to check for feigning without lengthening their assessment batteries [21]. However, many self-report measures without embedded SVTs continue to be widely used and easily manipulated. These factors solidify the need for more robust self-report measures with validity indicators in ADHD evaluations. Further, if patterns of feigning can be identified, a gold standard ADHD assessment battery best equipped to identify feigned deficits may be established.
One popular self-report measure used in ADHD evaluations is the Conners' Adult ADHD Rating Scales-Self Report: Long Version (CAARS-S: L) [22]. The CAARS-S: L in its original form includes eight clinical scales and one validity index (the Inconsistency Index). The Inconsistency Index (INC) assesses careless/random responding rather than over-reporting or feigning, although those seeking to dissimulate may employ careless/random responding in an attempt to feign ADHD deficits [12,23,24,25]. The CAARS-S: L's lack of a feigning validity scale has rendered it vulnerable to feigned symptom reports.
As previously mentioned, multiple studies have found few or no statistically significant differences on the CAARS-S: L clinical scales when comparing feigning and diagnosed ADHD groups [5,20,26]. Though the CAARS-S: L manual warns that clinical scale scores greater than 80 could indicate feigning, it also states that such elevations could indicate extreme yet truthful symptomology [22]. Thus, the CAARS-S: L clinical scales and INC scores alone are likely inadequate for differentiating honest from feigned responses. In order to address this concern, the CAARS-S: L Infrequency Index (CII) was created as an embedded SVT to detect potential feigning [27]. The CII is composed of 12 items rarely endorsed by typically developing adults as well as those diagnosed with ADHD. Suhr and colleagues (2011) identified a cut score of > 21 as producing high specificity for ADHD. The index was found to have generally modest sensitivity (approximately 30%) and high specificity (approximately 95%). In further validation work by Cook, Bolinger, and Suhr (2016), the CII demonstrated 52% sensitivity to feigning and 97% specificity for ADHD based on extreme elevations of the three CAARS-S: L clinical scales derived from DSM-IV ADHD criteria [28].
However, subsequent validation using varied criteria for defining noncredible reporting has produced mixed results: Using the Minnesota Multiphasic Personality Inventory-2-Restructured Form validity scales, Word Memory Test, and Digit Span subtest of the Wechsler Adult Intelligence Scale-Fourth Edition to indicate feigning, the CII showed low sensitivity (range 13% to 36%) and acceptable to high specificity (range 87% to 91.8%) [29][30][31]. Similarly, in simulation studies, CII accuracy has been limited: Andresen (2012) did not find a statistically significant difference between feigning and ADHD groups on the CII [32]. Fuermaier and colleagues (2016) found that the CII did not explain a significant amount of variance in predicting feigned ADHD above and beyond the measure's clinical scales [13]. CII sensitivity was moderate (range 32% to 52%), whereas specificity was inadequate (65%). Given the index's initial promise and subsequent mixed findings, further validation of the CII is desirable.
Another salient issue is that the use of online assessments is increasing in popularity [33,34]. Few differences have been found between online and paper formats for measures of depression, panic, traumatic stress, and other clinical constructs [35][36][37]. However, published work on the comparability of administration formats of the CAARS-S: L appears limited to one study utilizing a sample of honestly responding adults without ADHD, a concern given caveats against utilizing online assessments without first undertaking thorough validation efforts [38][39]. Results of the sole investigation in this area by Hirsch and colleagues (2013) indicated similar factor structure across formats, but online respondents relative to paper yielded significantly higher scores on three (Inattention/Memory Problems, Impulsivity/Emotional Lability, and Problems with Self-Concept) of the four factor-derived clinical scales [38]. The remaining four scales (three clinical scales derived from DSM-IV criteria and one ADHD Index) were not examined in this study.
Further, the comparability of feigning on online versus paper assessments has received little research attention. Extant evidence suggests that individuals are able to dissimulate successfully on self-report measures regardless of administration format [39]. However, to the authors' knowledge, feigning on the online CAARS-S: L has not yet been investigated. The current study aims to integrate the various strands mentioned above: the effect of feigning vs. honest instructions on CAARS-S: L clinical scale scores; the accuracy of the proposed CAARS-S: L Infrequency Index (CII); and initial analyses of the comparability of online vs. paper CAARS-S: L forms and use of the CII in the online format. The hypotheses of this study were as follows:
1. CAARS-S: L clinical scale scores produced by clinical and simulated ADHD groups will not differ to a statistically significant degree.
2. The CII will exhibit adequate specificity (≥ .80) at its standard cut score (> 21) in identifying feigned ADHD on both paper and online forms, as online and paper formats are thought to be comparable.
3. In keeping with the findings of Hirsch and colleagues (2013), online respondents will produce significantly higher scores on three of the four factor-derived clinical scales relative to those responding on paper [38]. No a priori hypotheses were made regarding the differences between administration formats for the remaining four clinical scales of the CAARS, as no study has examined these scales to the authors' knowledge.

Note. ASRS = Adult ADHD Self-Report Scale (ASRS-v1.1) Symptom Checklist Part A; CAARS-S:L = Conners' Adult ADHD Rating Scale-Self-Report: Long Version. Some participants were excluded for multiple reasons; thus, the number of participants meeting the above exclusion criteria is greater than the number of excluded participants.

Method

I Participants
The present study was approved by an institutional review board and included 139 undergraduate students at a large university; of these, 27 had ADHD diagnoses and 112 did not. Figures 1 & 2 detail the inclusion and exclusion process for participants. In the ADHD group, one participant (3.70%) met criteria on the structured interview for predominantly inattentive subtype, one (3.70%) for predominantly hyperactive/impulsive subtype, and 25 (92.59%) for combined presentation. Mean age of diagnosis was 13.15 years (SD = 5.45). Approximately 78% of these participants reported current medication use for ADHD. All participants consented to the use of their data. Demographic characteristics are presented by instruction set in Table 1.

II Measures
The Conners' Adult ADHD Rating Scale-Self Report: Long Version is a 66-item test measuring current DSM-IV ADHD symptoms [22]. The CAARS-S: L yields scores on eight clinical scales, including Inattention/Memory Problems, Hyperactivity/Restlessness, and an overall ADHD index. The instrument has been shown to have 82% sensitivity, 87% specificity, and 85% hit rate for ADHD at the recommended cut score [40]. The CII, a validity index created to detect potential feigning on the CAARS-S: L, was also scored, with its authors reporting scores of 21 or greater as indicative of potential feigning [27].

III Procedure
The study utilized a 2x3 coached simulation design. The design included three instruction set groups (nonclinical responding honestly [HON], clinical responding honestly [ADHD], and feigning [FGN]) and two administration format groups (paper and online). Participants with a diagnosis of ADHD who met study inclusion criteria comprised the ADHD group. Nonclinical participants were randomly assigned to the honest (HON) or feigning (FGN) groups, with more participants placed in the FGN group as the HON group served as a manipulation check on the feigning instructions. All participants were randomly assigned to complete the CAARS-S: L either online or on paper. The online measure is available through the Multi-Health Systems Inc. Online Assessment Center. A username- and password-protected account was created for the completed assessments. Each participant in the online group completed the CAARS-S: L using a unique identification number to ensure anonymity. All participants completed the study within a laboratory setting.
Participants completed the informed consent procedure followed by a brief demographics questionnaire and the Adult ADHD Self-Report Scale (ASRS-v1.1) Symptom Checklist Part A [41]. Participants were then given their instructions for completing the CAARS-S: L according to their instruction set group: Those in the HON group were asked to complete the CAARS-S: L honestly. Participants in the ADHD group were also asked to answer the assessment honestly, according to their unmedicated symptom experience. Those in the FGN group were asked to respond to the assessment as if they had ADHD. These participants were warned that the test has scales to detect faking and were encouraged to simulate ADHD without being detected. As a monetary incentive for feigning, those in the FGN group were told that they would win $25 cash if they could take the CAARS-S: L in a way consistent with ADHD but without being detected as faking. In reality, all participants in this group received $25 upon completion of the study, although they were not told of this until post-experimental debriefing. Following review of their instructions, FGN participants were given a packet of ADHD reading materials adapted from Walls and colleagues (2017), which included a description of typical ADHD symptoms available online and a hypothetical scenario explaining the possible benefits of receiving academic accommodations/medication for ADHD [25].
Following their review of the packet, FGN participants completed an instruction check questionnaire. This questionnaire asked participants to summarize their instructions, recall ADHD characteristics, and write down strategies for faking the disorder. All participants then completed the CAARS-S: L either on paper or online on laptop computers, followed by a posttest questionnaire, which asked participants to reproduce their task instructions and to indicate their effort to follow instructions on a 5-point Likert scale. Lastly, all participants were debriefed.
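The assignment scheme described in this section can be illustrated with a minimal sketch. It assumes a hypothetical list of participant IDs with a diagnosis flag; the weight `fgn_weight` is an invented placeholder, since the text states only that more nonclinical participants were placed in the FGN group and does not report the ratio. The function name and seed are likewise illustrative and do not reflect the authors' actual randomization procedure.

```python
import random


def assign_conditions(participants, fgn_weight=0.6, seed=0):
    """Assign each participant an instruction set and an administration format.

    participants: list of (participant_id, has_adhd_diagnosis) tuples.
    fgn_weight:   hypothetical probability that a nonclinical participant
                  is placed in the feigning (FGN) group rather than HON.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    assignments = {}
    for pid, has_adhd in participants:
        if has_adhd:
            # Diagnosed participants are not randomized across instruction sets.
            instruction = "ADHD"
        else:
            instruction = "FGN" if rng.random() < fgn_weight else "HON"
        # Every participant is randomized to paper or online administration.
        administration = rng.choice(["paper", "online"])
        assignments[pid] = (instruction, administration)
    return assignments


# Hypothetical participants, invented for illustration only.
sample = [("P01", True), ("P02", False), ("P03", False), ("P04", True)]
for pid, (instruction, administration) in assign_conditions(sample).items():
    print(pid, instruction, administration)
```

Crossing the three instruction sets with the two formats yields the six cells of the 2x3 design.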

Results
Sample distributions demonstrated nonsignificant skewness and kurtosis. However, the results of Levene's test indicated significant heterogeneity of variances across dependent variable groups. Because of this violation of the assumption of homogeneity of variance, analyses utilized Welch ANOVA omnibus tests and Games-Howell follow-up contrasts [42]. Due to the large number of contrasts performed, alpha was set at .01 to limit the Type I error rate. Cohen's d effect sizes of group differences are provided where appropriate and were interpreted using the following guide: 0.20 (small effect), 0.50 (moderate effect), and 0.80 (large effect) [43].
Table 2 presents CAARS-S: L clinical scale T scores by instruction set. The HON group produced significantly lower scores than the FGN group on all clinical scales. These differences between the FGN and HON groups indicated that the feigning manipulation was successful. Further, ADHD and FGN groups produced statistically similar elevations on all clinical scales with the exception of higher scores for FGN on Impulsivity/Emotional Lability. This similarity of clinical scale elevations suggests FGN participants were largely able to produce CAARS-S: L profiles similar to those with ADHD diagnoses.
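The effect-size guide used in these analyses can be sketched as follows. This is a minimal illustration, assuming the common pooled-standard-deviation form of Cohen's d; the text does not state which variant was computed. The function names, the "negligible" label for values below 0.20, and the scores are all hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def interpret_d(d):
    """Label |d| with the 0.20 / 0.50 / 0.80 benchmarks cited in the text."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.20:
        return "small"
    return "negligible"  # label assumed here; the text gives no name for d < 0.20


# Hypothetical T scores, invented for illustration only.
fgn_scores = [78, 82, 75, 88, 80]
hon_scores = [48, 52, 45, 55, 50]
d = cohens_d(fgn_scores, hon_scores)
print(f"d = {d:.2f} ({interpret_d(d)})")
```

Because the pooled form assumes roughly equal variances, a variant that does not pool may be preferable when variances differ as they did here.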
The CII overall produced modest sensitivity to feigning (0.36) and acceptable specificity for ADHD (0.85) at the recommended cut score of 21 or greater [27]. As specificity of 90% or greater is generally considered desirable to avoid feigning false positives, the CII cut score was incrementally raised in order to reach that value. The CII's specificity improved at cut score 22 (0.89) and reached the desirable level at cut score 23 (0.96). Specificity likewise ranged from acceptable to optimal on both paper and online forms. Operating characteristics for various cut scores and the administration formats are available in Table 3.
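The operating characteristics reported above follow from simple counts at each cut score: sensitivity is the proportion of feigners scoring at or above the cut, and specificity is the proportion of honest ADHD respondents scoring below it. A minimal sketch follows, using the "cut score or greater" convention stated in this paragraph. The function name and the scores are hypothetical, invented for illustration, and do not reproduce the values in Table 3.

```python
def operating_characteristics(feigner_scores, adhd_scores, cut_score):
    """Return (sensitivity, specificity) when a score >= cut_score flags feigning."""
    true_positives = sum(score >= cut_score for score in feigner_scores)
    true_negatives = sum(score < cut_score for score in adhd_scores)
    sensitivity = true_positives / len(feigner_scores)
    specificity = true_negatives / len(adhd_scores)
    return sensitivity, specificity


# Hypothetical CII raw scores, invented for illustration only.
fgn_cii = [25, 18, 30, 12, 21, 9, 27, 15, 22, 14]
adhd_cii = [10, 14, 8, 21, 16, 12, 19, 7, 23, 11]

for cut in (21, 22, 23):
    sens, spec = operating_characteristics(fgn_cii, adhd_cii, cut)
    print(f"cut >= {cut}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the cut score can only hold or lower sensitivity while holding or raising specificity, which is the trade-off described when the cut score is moved from 21 to 23.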
Following from Hirsch et al. (2013), clinical scale score differences between the paper and online administration formats were examined as presented in (Table 4) [37]. Only small differences on clinical scale elevations were observed in the HON group. Within the ADHD group, medium to large effect sizes were observed on the following scales: Inattention/Memory Problems; DSM-IV: Inattentive Symptoms; DSM-IV: ADHD Symptoms Total. On these scales, participants completing the measure on paper produced higher scores than those online. In the FGN group, one medium effect size was observed for the Hyperactivity/Restlessness scale, with participants completing the assessment online producing higher scores than those on paper.

Discussion
The accurate detection of feigned ADHD with commonly used assessment measures, such as the CAARS-S: L, is a salient clinical issue in the college setting. In keeping with previous research findings indicating the vulnerability of the CAARS-S: L to feigning, the scores from the FGN group in this study were statistically comparable to those of the ADHD group on all but one of the clinical scales [12,20]. The FGN group produced a higher Impulsivity/Emotional Lability scale score, likely reflecting the association between greater endorsement of these symptoms and feigned ADHD [44].
In response to the demonstrated vulnerability of the CAARS-S: L to feigning, which was supported by this study, the Infrequency Index (CII) was created as the first fake-bad scale for the measure [27]. In the current study, the CII demonstrated modest sensitivity (0.36) and adequate specificity (0.85) at the recommended cut score of 21, consistent with hypotheses [27]. The CII demonstrated lower specificity than in some previous research, yet the value was higher than in other previous work [13,25,27,28]. CII specificity improved upon raising the cut score to 22 and reached desirable levels when raised to 23.
Regarding the feigning detection accuracy of the CII in paper vs. online forms of the CAARS-S: L, both formats produced modest sensitivity (.34 and .39, respectively) and acceptable specificity (.85 and .86, respectively) at the standard cut score of 21. With raised cut scores, both the paper and online forms achieved optimal specificity. Results support the use of the CII on paper and online, although higher cut scores may be needed to achieve desirable specificity. CAARS-S: L clinical scale scores were also examined for paper and online forms as a comparison to the findings of Hirsch and colleagues (2013) [37]. Unlike this previous study, current results indicated no large clinical scale score differences between the HON paper vs. HON online groups. These results instead support the literature suggesting comparability of paper and online forms of assessments of various clinical constructs [35,36,39].
Extending the findings of Hirsch and colleagues (2013), the ADHD and FGN groups were also examined for administration format comparability [37]. Within the ADHD group, participants completing the online assessment produced significantly lower elevations on the Inattention/Memory Problems, DSM-IV: Inattentive Symptoms, and DSM-IV: ADHD Symptoms Total scales than those completing the paper version. Within the FGN group, participants completing the online assessment produced higher scores on the Hyperactivity/Restlessness scale than those completing the paper version. This latter result could reflect the phenomenon wherein responders endorse more severe or undesirable characteristics in an online format relative to paper, especially when asked to distort responses negatively and report a higher level of impairment than they truly experience, as was the case in the FGN group [45]. These results suggest that completing the CAARS-S: L on paper vs. online may affect the level of symptomatology endorsed in both clinical and feigning groups, indicating caution may be warranted for clinicians seeking to utilize computerized assessments in their practice. However, as this is the first study to extend the findings of Hirsch and colleagues (2013) to ADHD and FGN groups and to report on all eight clinical CAARS-S: L scales, future research is warranted to characterize these format differences further [37].

I Strengths and Limitations
This study included the following efforts to strengthen internal validity: DSM-5 structured interview for ADHD diagnoses; ASRS-v1.1 Part A symptom checklist as a screening measure for HON and FGN groups to ensure minimal presence of ADHD symptoms in these nonclinical groups; instruction check and effort measure; and monetary incentive for the FGN group. Efforts were also made to strengthen external validity, such as using symptom information that is available online in the FGN instruction packet. Limitations included the following: The ADHD group was significantly older with more years of education than the nonclinical groups; all groups were predominantly female, which often occurs with undergraduate psychology subject pools; and the examiner in the testing sessions was not blinded to participant instruction set. Additional limitations included the inherently limited external validity of simulation designs [46]. Thus, quality known-groups design studies (i.e., of individuals presenting for real-life ADHD assessments, those likely honestly responding and those likely feigning deficits as indicated by failure of a validity test) are needed in this area.
There was a high exclusion rate (approximately 36%) in our sample for reasons including endorsement of inadequate effort to complete the CAARS-S: L according to instruction set (17% of all participants) and nonclinical participants endorsing a level of ADHD symptoms suggestive of diagnosis on a brief screening instrument (16% of HON and FGN groups). The authors hypothesize that participants endorsing inadequate effort conflated effort with difficulty in following instructions. More systematic evaluation of compliance with instructions may be warranted in future studies. The high rate of endorsement of ADHD symptoms in the nonclinical groups was not surprising; in fact, using the same brief ADHD screening questionnaire as utilized in the current sample, Matte and colleagues (2015) found that approximately 33% of young adults screened positive for ADHD [41,47]. Such a rate of symptom endorsement in the current sample reflects the pervasiveness of ADHD symptoms in the general population.

Conclusion
This study aimed to provide further validation of the CII as a feigning indicator for the CAARS-S: L and novel cross-validation of this indicator in an online format. Initial results as to the clinical scale comparability of the paper and online forms of the assessment were also provided. Students instructed to feign ADHD were able to produce clinical scale scores similar to those who have been diagnosed with ADHD on paper and online forms. This study provided further validation of the CII, which distinguished dissimulated from diagnosed ADHD with modest sensitivity and adequate specificity at the recommended cut score of > 21. Specificity of the CII improved at raised cut scores of > 22 and 23, suggesting utility for use of higher cut scores in clinical practice in order to limit false positives for feigning. This study is the first to examine the performance of the CII in an online format, with results indicating similar feigning detection ability of the index across administration formats. While some clinical scale elevation differences were found between paper and online CAARS-S: L forms in clinical and feigning groups, this preliminary work suggests that the CII is able to detect feigning regardless of administration format.