Introduction

Aging entails a general decline in cognitive functioning, with memory being one of the most affected functions1. As people get older, they become more vulnerable to everyday forgetfulness2,3,4 and perform worse in free recall and recognition tests5,6. Furthermore, older adults’ memory for names and faces is poorer than that of younger adults7,8,9, especially when retrieving face-name associations10. In fact, recognizing people’s faces and retrieving information about them seems to become particularly demanding with aging, being one of the most commonly reported complaints made by older adults9,11. Critically, Pike et al.12 suggest that deficits in retrieving face-name pairs may help distinguish between mild cognitive impairment (MCI) and healthy aging, which is especially relevant if we take into account that people with MCI have an increased risk of developing Alzheimer’s disease13.

One possible explanation for age-related changes was proposed by Hasher and Zacks14 in their Inhibitory Deficit Theory (IDT), which posits that cognitive failures related to normal aging are due to a deficit in inhibitory mechanisms. These authors argue that age-related deficits in attention, language or memory could be due to a common underlying mechanism: a decline in inhibitory function. According to the IDT, older adults do not have the ability to suppress or inhibit unwanted information from entering working memory. Therefore, older adults’ naming difficulties could be due to an inability to suppress irrelevant competing representations, making it harder to access and choose the desired information15.

Inhibitory mechanisms have been shown to rely on prefrontal brain structures16,17,18, which are strongly affected by aging19,20. Anatomically, the aging brain is known to lose overall volume, with atrophy in the frontal lobe being closely linked with a decline in cognitive function21,22. Moreover, aging entails a decrease in grey and white matter densities; increases in ventricular size, neuronal death and a loss of dendritic density23. These changes are particularly pronounced in prefrontal cortex23 and accompanied by metabolic20 and functional changes19 in these same regions.

Taking into account such anatomical and functional prefrontal age-related decline, it is then perhaps unsurprising that several studies have found older adults to be more susceptible to interference24,25,26 and less capable of preventing irrelevant information from interfering27. Age-related impairments have also been shown in paradigms specifically testing interference detection and its resolution by means of inhibitory mechanisms, in both the motor (Stop Signal and Go/No-go28) and memory (Think/No Think29, but see30) domains.

One paradigm commonly used to investigate inhibitory function in selective memory retrieval is the retrieval practice paradigm31. In this paradigm, participants first study pairs of words belonging to a given category (e.g. FRUIT-Apple; FRUIT-Orange; ANIMAL-Elephant) and are then asked to retrieve half of the words from half of the categories, upon presentation of a cue (e.g. FRUIT–Ap___). The presentation of the category cue (FRUIT) leads to the activation in memory of all previously studied related items, eliciting interference between related competitors (Apple, Orange, Banana…). According to the inhibitory account31, when interference is detected, a control mechanism is triggered to suppress competing memory representations (Orange) and promote the retrieval of the target memory (Apple). Thus, inhibition is required to suppress strong competing responses and allow the expression of weaker but more appropriate ones32. Two different memory effects can be observed in a final test. First, a facilitation effect, where practised items (Apple) are recalled significantly better than control items (studied items that were neither practised nor belong to practised categories, such as Elephant). Second, the recall of unpractised items from practised categories (Orange) is significantly impaired in comparison to control items, an effect named Retrieval Induced Forgetting (RIF31).
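The two effects can be expressed as simple differences in recall proportions on the final test. The following is a minimal sketch for illustration only: the function names and the recall proportions are hypothetical and are not taken from the data reported here.

```python
def facilitation_effect(practised: float, control: float) -> float:
    """Facilitation: recall of practised items minus recall of control items."""
    return practised - control


def rif_effect(unpractised: float, control: float) -> float:
    """RIF: recall of control items minus recall of unpractised items
    from practised categories (positive values indicate forgetting)."""
    return control - unpractised


# Hypothetical recall proportions for one participant (not real data).
recall = {"practised": 0.80, "unpractised": 0.50, "control": 0.60}

facilitation = facilitation_effect(recall["practised"], recall["control"])
rif = rif_effect(recall["unpractised"], recall["control"])
print(f"facilitation = {facilitation:.2f}, RIF = {rif:.2f}")
```

Under this sign convention, a positive RIF value means that unpractised items from practised categories were recalled below the control baseline.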

Although alternative explanations have been proposed33,34 there is overwhelming support for RIF’s inhibitory nature, showing that it occurs due to the inhibition of neural assemblies representing the competitor item35,36.

Importantly, the inhibitory account implicates at least two key mechanisms: i) a mechanism that detects interference, and ii) a mechanism that reduces interference by suppressing competing memories. Although studies have shown that the behavioural RIF effect is gradually impaired in older adults, and modulated by factors such as age itself37,38 or the availability of cognitive resources39, there are not, to our knowledge, any EEG studies specifically investigating the brain mechanisms underlying age-related changes in RIF. Hence, it remains unclear whether older adults struggle with the detection of an early interference signal or with the suppression of irrelevant competitors. Due to its superb temporal resolution, EEG should allow the dissociation between the two neural signatures mediating RIF (interference detection and inhibition), a difficult goal to achieve when relying purely on behavioural methods, and enable us to identify the source of age-related changes in RIF.

Electrophysiological studies evidence that interference in the retrieval practice paradigm can be traced by mid-frontal theta (~4–8 Hz), with increments in mid-frontal theta power shown when comparing a competitive to a non-competitive condition40,41, localized to medial prefrontal brain regions (such as the ACC), and predicting later forgetting41. In order to specifically untangle interference and inhibition signals in this paradigm, Ferreira, Marful, Staudigl, Bajo and Hanslmayr47 presented a category cue (e.g. Actor) and a retrieval specific cue (the face of a specific actor) separated in time. Mid-frontal theta oscillations were attributed specifically to interference detection, by showing an increase in theta power in a competitive (vs. a non-competitive) condition, upon presentation of the category cue, when competing items become active in memory. Theta power decreased in the competitive condition from the presentation of the category cue to the presentation of the retrieval cue, reflecting a reduction in interference or its resolution, which correlated with later forgetting.

This paradigm, in combination with electrophysiology, is therefore ideally suited to reveal the mechanism potentially impaired in older adults: interference detection or its resolution (i.e. inhibition). In two consecutive and independent experiments (using faces and semantic material) we employ a procedure similar to Ferreira et al.47. Specifically, we compare the neural correlates of RIF between two age groups, throughout three subsequent cycles of retrieval practice. In each cycle, a category cue (occupational or semantic category in Experiments 1 and 2, respectively) and an item-specific cue (face in Experiment 1 and word stem in Experiment 2) are presented for all 24 critical items, prompting participants to retrieve them (see Methods section). We assess theta oscillations upon presentation of the category cue to disentangle interference detection (presentation of the category cue on the first cycle, when interference should be at its highest level; Cue 1) and inhibition or interference resolution (difference between presentation of the category cue on the first and third cycles; Cue 1 vs. 3).

If the older adults’ difficulties in naming faces are due to poor interference detection, we would expect theta power upon presentation of the first cue to be lower for older compared to younger participants (Fig. 1A), as supported by studies showing that low forgetters exhibit lower levels of theta than high forgetters41, as well as lower ACC activity17 on the first retrieval practice cycle. In this case, theta power should remain constant across retrieval cycles, since inhibitory mechanisms should not be called into play unless interference is detected42. If, on the other hand, the problem lies in interference resolution, theta power upon presentation of the first retrieval cue should be equivalent between young and older participants, and again, should remain constant (at high levels, in this case) across cycles for the older adults, as they would not be able to engage the necessary mechanisms to solve interference amongst stimuli.

Figure 1

(A) Expected neural results. The blue line represents the expected results in theta power for the young participants, replicating previous studies. The orange lines represent the expected results for the older adults. The solid line represents the expected results if older participants suffer from a deficit in interference detection, whereas the dashed line depicts what would be expected if the older adults’ struggle is in solving interference. (B) Depiction of the experimental paradigm.

In Experiment 2, we aim to rule out alternative explanations for the results found in Experiment 1. Given the nature of the material used in the first experiment, any age-related deficits found could be attributable to the older adults not processing the category cue (which elicits or boosts interference) to the same extent that young adults do. This idea is supported by studies showing impaired context processing in older adults43,44,45. Thus, older participants might simply not process the context (the category cue) due to this impairment. Moreover, the cue is not essential to perform the task correctly. Participants can still recall the name “Banderas” upon face presentation, regardless of whether or not they read the cue “Actors” beforehand.

To rule out this possibility, in Experiment 2 we use semantic material based on pairs of words instead of faces. We present a category cue (FRUIT) followed by the word stem of an exemplar of that category (Ap___ to retrieve Apple). Using semantic material forces participants to focus on the category cue: in order to successfully retrieve the target “Apple” one needs more information than the word stem. This necessary information is given by the category cue (FRUIT). Therefore participants cannot afford to ignore the cue in order to retrieve the correct target.

If any deficits found in Experiment 1 are due to an impairment in context processing, or to the use of different strategies by the older adults, then by forcing them to focus on the category cue we should find results (both at the behavioural and neural level) more in line with those found in the younger adults. If, however, older adults do indeed struggle with detecting or solving competition, then we should be able to replicate the results from Experiment 1.

Results

Experiment 1

Behavioural results

Familiarity ratings at study did not differ overall between groups (Myoung = 4.00; SDyoung = 0.78; Molder = 4.10; SDolder = 0.53; p > 0.05). Dividing the items according to occupational category revealed that older adults were more familiar with writers than the younger adults (Myoung = 3.00; SDyoung = 0.82; Molder = 4.30; SDolder = 0.39; t(14) = −3.84, p = 0.002), whereas the opposite was true for football players (Myoung = 4.70; SDyoung = 0.15; Molder = 3.90; SDolder = 0.26; t(14) = 7.74, p < 0.001). No other categories differed between groups.

Mean recall during the retrieval practice phase did not differ between young and older adults (Myoung = 0.76; SDyoung = 0.16; Molder = 0.67; SDolder = 0.18; p > 0.05).

Two 2 × 2 repeated measures ANOVAs were conducted to assess forgetting and facilitation effects separately on the final memory test. Post-hoc analyses were conducted for each group, using one-tailed paired-samples t-tests. A summary of the descriptive statistics is detailed in Table 1.

Table 1 Summary of the behavioural descriptive statistics from Experiment 1 (face material).

Forgetting. The results of the ANOVA type of item (unpractised vs. control) × group (younger vs. older) showed a significant effect of group [F(1,46) = 10.03, p = 0.003, \({\eta }_{p}^{2}\) = 0.18], where younger participants’ overall proportion of recall was higher than older participants’ (see Table 1). A marginally significant effect of item type [F(1,46) = 3.64, p = 0.06, \({\eta }_{p}^{2}\) = 0.07] was also found, with lower mean recall of unpractised items than of controls. Although the interaction between age group and item type did not reach significance [F(1,46) < 1, n.s.], planned contrasts were conducted nonetheless, given that we had very specific hypotheses of what to expect. These revealed that whereas the difference between unpractised and control items was significant for the younger adults [t(23) = −1.98, p = 0.03], this difference was not significant for older participants.
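For readers who wish to check the effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity \({\eta }_{p}^{2}\) = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the two main effects reported for the forgetting analysis; the function name is ours and is only illustrative.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values as reported for the forgetting ANOVA (df = 1, 46).
group_effect = partial_eta_squared(10.03, 1, 46)
item_effect = partial_eta_squared(3.64, 1, 46)

print(f"group: {group_effect:.2f}")      # 0.18
print(f"item type: {item_effect:.2f}")   # 0.07
```

Both values agree with the effect sizes reported for the group and item type main effects.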

Facilitation. Regarding the facilitation effect, no significant item type (practised vs. control) × age group interaction was found [F(1,46) = 2.66, n.s.]. However, there was a significant main effect of age group [F(1,46) = 7.92, p = 0.007, \({\eta }_{p}^{2}\) = 0.15], according to which younger adults recalled more items overall than older adults did. Moreover, item type also reached statistical significance [F(1,46) = 30.49, p < 0.001, \({\eta }_{p}^{2}\) = 0.39]. Recall of practised items was significantly better than recall of control items. Both young [t(23) = 3.1, p = 0.003] and older [t(23) = 4.6, p < 0.001] participants recalled practised items significantly better than baseline (see Table 1).

Theta power results

Young vs. Older: Cue 1 and Cue 1 vs. Cue 3. Differences in theta power upon presentation of the cue on the first and third cycles were computed for each participant in the young and older group. We first report the analysis for the first cycle (Cue 1; interference index) and then the difference between the first and third cycles (Cue 1 vs. 3; interference resolution).
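The two indices can be illustrated with a short sketch. It assumes that theta power has already been averaged over the time-frequency window of interest for each ROI electrode and that power is expressed as percentage signal change relative to a baseline; the baseline, the three-electrode ROI and all numerical values below are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def percent_signal_change(power: float, baseline: float) -> float:
    """Express power as percentage change relative to a baseline value."""
    return 100.0 * (power - baseline) / baseline


# Hypothetical theta power per ROI electrode for one participant (not real data).
baseline_power = 1.00
cue1_power = [1.30, 1.25, 1.35]  # category cue, first retrieval cycle
cue3_power = [1.10, 1.05, 1.15]  # category cue, third retrieval cycle

# Interference index: ROI-averaged theta power at Cue 1.
cue1_index = percent_signal_change(mean(cue1_power), baseline_power)

# Interference resolution: Cue 1 minus Cue 3 (positive = theta decreased).
cue3_index = percent_signal_change(mean(cue3_power), baseline_power)
resolution_index = cue1_index - cue3_index

print(f"Cue 1: {cue1_index:.1f}%, Cue 1 vs. 3: {resolution_index:.1f}%")
```

A positive Cue 1 vs. 3 value corresponds to a decrease in theta power across retrieval cycles, and a negative value to an increase.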

For the first cue presentation a significant difference in theta power was found between younger and older adults (pcorr = 0.002), such that younger adults showed greater theta power (7–8 Hz) over our pre-defined mid-frontal ROI (see Analyses of oscillatory power), in a time window ranging from 0 to 500 ms (Fig. 2A).

Figure 2

Neural results from Experiment 1. (A) Topography depicting differences in activity between younger and older participants, upon presentation of the first category cue. (B) Interaction analysis: differences between younger (cue1 − cue3) and older adults (cue1 − cue3). On the left, the time-frequency plot shows the significant time-frequency windows used for subsequent analyses and the topographies on the right show the distribution of these effects. All the analyses leading to plots (A) and (B) were conducted over a central ROI comprising 9 mid-frontal electrodes (depicted in black circles), and electrodes that showed significant differences can be seen in red circles. (C) Topographies depicting differences between cue 1 and cue 3 in each age group in the first time window (7–8 Hz, 0 to 500 ms). The young group is represented on the left and the older on the right. (D) The bar graph shows the percentage signal change in theta power (7–8 Hz), from 0 to 500 ms upon presentation of the category cue in each cycle for young (left) and older (right) participants. Note how theta power decreases across retrieval cycles for the younger participants but shows an opposite pattern for the older adults (*p < 0.05; **p < 0.01).

The interaction analysis (Cue 1 minus Cue 3 × age group) yielded a significant effect in two different time-frequency windows, over the mid-frontal ROI. In both windows theta power was higher for younger compared to older adults (Fig. 2B). The first time-frequency window ranged from 7–8 Hz in the first 500 ms upon stimulus onset (pcorr = 0.002); the second significant window ranged from 6–7 Hz at 500 to 1000 ms (pcorr = 0.005). Follow-up comparisons on these two effects are described next.

Young adults: Cue 1 vs. Cue 3. As expected, young adults showed a significant theta power decrease upon cue presentation from the first to the third retrieval practice at the sensor level, both from 7–8 Hz during the first 500 ms (pcorr = 0.005, Fig. 2C) and from 500 to 1000 ms, at a frequency range from 6–7 Hz (pcorr = 0.01). In order to get a clearer picture of how theta power progresses from one cycle to another, we extracted theta power values upon presentation of the category cue, in the two significant time windows and over the 9 electrode ROI, for the three cycles. The results of this analysis indicated that in both time-frequency windows, theta power gradually decreased across cycles. For both time-windows, the difference between first and third cue reached statistical significance (first time-window, depicted in Fig. 2D: t(23) = 2.81, p = 0.005; second time-window: t(23) = 2.47, p = 0.01).

Older adults: Cue 1 vs. Cue 3. Older adults also showed a significant modulation of mid-frontal theta power across retrieval cycles in the first time window, from 0 to 500 ms and at 7–8 Hz (pcorr = 0.02). However, in stark contrast with the younger adults, theta power increased from the first to the third presentation of the category cue (Fig. 2C). No significant difference emerged in the second time window (500–1000 ms; 6–7 Hz; all pcorr > 0.05). We again extracted theta power values upon presentation of the cue for each retrieval practice cycle. As expected from the interaction analysis results, older adults seem to have, overall, lower levels of theta power than the younger adults. Theta power increased numerically from the first to the second and third retrieval cycles, with differences between first and third cue reaching statistical significance in the first time-frequency window [t(19) = −1.89, p = 0.04, Fig. 2D].

Experiment 2

Behavioural results

Descriptive behavioural statistics for Experiment 2 are summarized in Table 2. For the intermediate retrieval practice phase, no differences in mean recall were found for the two age groups (p > 0.05).

Table 2 Summary of the behavioural descriptive statistics from Experiment 2 (semantic material).

Behavioural analyses were conducted in the same way as in Experiment 1, but note that in this experiment practised and unpractised items were contrasted against their own control items, drawn from equally low or high representativeness of their category, respectively (see Methods section).

Forgetting. The ANOVA conducted to assess forgetting (type of item × age group) revealed no significant effect of item type [F(1,45) = 2.51, p > 0.05] or age group [F(1,45) = 1.69, p > 0.05], but did yield a significant item × group interaction [F(1,45) = 4.88, p = 0.03, \({\eta }_{p}^{2}\) = 0.10].

Post-hoc analyses showed a significant difference between unpractised items and their respective controls (see Table 2) for the younger adults [t(23) = −3.19, p = 0.004]. No such difference was found however in the older group.

Facilitation. Concerning the facilitation effect (see Table 2), a significant main effect of item type was obtained [F(1,46) = 61.72, p < 0.001, \({\eta }_{p}^{2}\) = 0.57], according to which practised items were recalled significantly better than their controls. No significant main effect of group [F(1,46) < 1, n.s.] or item × group interaction [F(1,46) = 1.20, n.s.] was obtained. Post-hoc analyses revealed that facilitation effects were present in both groups [young: t(23) = 7.23, p < 0.001; older: t(23) = 4.30, p < 0.001].

Theta power results

Young vs. Older: Cue 1 and Cue 1 vs. Cue 3. Differences in theta power upon presentation of the cue on the first and third cycles were computed for each participant in the young and older group. As in Experiment 1, we first report the analysis for the first cycle (Cue 1; interference detection index) and then the difference between the first and third cycles (Cue 1 vs. 3; interference resolution).

For the first cue presentation a significant difference in theta power was found between younger and older adults (pcorr = 0.009), such that younger adults showed greater theta power (7–8 Hz) compared to older adults over frontal and parietal areas, in a time window ranging from 200 to 500 ms.

The interaction analysis (first cue minus third cue × age group) yielded a significant difference over a time window ranging from 200 to 500 ms, at 7 Hz (pcorr = 0.02; Fig. 3A). Over the mid-frontal ROI, theta power was higher for younger compared to older adults. Planned comparisons on this effect are described next.

Figure 3

Neural results from Experiment 2. (A) Interaction analysis: differences between younger (cue1 − cue3) and older adults (cue1 − cue3). The time-frequency plot on the left shows the significant time-frequency window (over a central ROI comprising 9 mid-frontal electrodes, depicted in black circles) used for subsequent analyses and the topography on the right shows the distribution of this effect. Electrodes that showed significant differences can be seen in the red coloured circles. (B) Percentage signal changes in theta power (7 Hz), from 200 to 500 ms upon presentation of the category cue in each cycle for young (left) and older (right) participants. Note how theta power decreases across retrieval cycles for the younger participants but not for older adults (*p < 0.05).

Young adults: Cue 1 vs. Cue 3. For young adults, mid-frontal theta power decreased at 7 Hz upon cue presentation from the first to the third retrieval practice (pcorr = 0.008) from 200 to 500 ms. Replicating the results from the previous experiment, theta power gradually decreased from the first to the third cycle (Fig. 3B). The differences in theta power between the first and third cue were statistically significant [t(23) = 2.11, p = 0.02].

Older adults: Cue 1 vs. Cue 3. For older participants, no significant differences were found between the first and the third cue (Fig. 3B).

General Discussion

Across two experiments using the retrieval practice paradigm31 with different materials, younger adults exhibited the typical RIF effect (unpractised items recalled below baseline), while this effect was absent in the older adults’ group. We argue that whereas younger adults inhibited competing items to promote the correct recall of targets, older adults were not capable of suppressing these irrelevant memories. Importantly, older adults did benefit equally from repeated retrieval, given that in both experiments there were no differences in the average recall during the retrieval practice and that the facilitation effects were similar in younger and older adults.

To rule out alternative explanations, Experiment 2 used semantic material, forcing participants to focus on and process the category cue. Crucially, behavioural RIF was still absent in the older group. Note that alternative accounts of RIF, such as blocking34 or contextual theories33 cannot fully explain these results.

This leaves open three possibilities: i) the absence of RIF is due to a poor detection of interference (that is, participants do not detect interference and consequently do not trigger the necessary suppression mechanisms); ii) the lack of RIF occurs due to an inhibitory deficit (participants do detect interference but suffer from an inhibitory deficit that does not allow them to overcome this interference) or iii) both of these processes underlie impaired RIF in the older adults.

Since this is a difficult question to address on a purely behavioural level, we focused on mid-frontal theta power as a proxy for interference detection46. Interference should be highest during the first cycle17,41. Indeed, our results show that mid-frontal theta power was higher for younger than older participants, in a time-frequency window ranging from 7–8 Hz and 0 to 500 ms, which is in good agreement with previous results40,41,47. The fact that older adults showed lower levels of theta power than younger participants parallels results showing that low forgetters exhibit less theta power during the first retrieval cycle41 and less ACC activity17 than high forgetters. Moreover, it is well known that prefrontal structures suffer from aging to a great extent, and age-related atrophy in frontal lobes has been closely linked to a decrement in cognitive functioning21,22. Cummins and Finnigan19 found altered frontal/ACC theta power in older adults and Pardo et al.20 showed a decrease of glucose uptake with aging in the ACC, which correlated with a decline in cognitive performance. Thus, the fact that older participants show less theta power upon cue presentation indicates they do not efficiently engage the brain mechanism in charge of detecting and reacting to interference, which is in line with studies showing this population is more susceptible to interference24,25,26. Altered ACC function, along with inefficient connection between fronto-parietal regions, within a neural network relevant to performing a memory task (as shown in Pinal, Zurrón, Díaz and Sauseng48), could potentially underlie the reduced mid-frontal theta power that older participants show across the two studies presented here.

A significant reduction in theta power from the first to the third category cue was found in the younger adults’ group, in both experiments. Theta power decreased gradually from one cycle to the next, arguably reflecting the successful down-regulation of interference, a marker of how successful inhibition was17,41,49. The more effective detection of interference by younger adults allowed them to trigger the necessary inhibitory mechanisms, which suppress competing items, and to therefore resolve interference. This successful suppression of competing items35,36 promotes the correct recall of the sought after target memories in younger adults.

Remarkably, no such theta power reduction was found for the older participants’ group. Theta power was either constant across retrieval cycles (Experiment 2), or even went in the opposite direction, with theta power increasing from the first to third retrieval cycle (Experiment 1). The discrepancy of this result between experiments could potentially be explained by the different nature of the stimuli used. Although the experiments were designed to be as similar as possible, both relying on highly familiar material and associations (which should have led to similar levels of competition), the use of different material should also engage different cognitive processes. In fact, literature has shown that face-name associations are particularly hard as people age9,10,11, whereas semantic memory is often unaltered50. It then remains an open question whether theta increase in Experiment 1 could be the cause or the reflection of an increased difficulty in retrieving face-name associations and whether differences in material could indeed account for this different result in Experiments 1 and 2.

It is noteworthy, however, that in spite of these differences in material we still find parallel behavioural and neural results across experiments. Experiment 1 shows that personal representations, such as faces and names, are prone to interference and inhibition, just like other objects, challenging accounts that these representations enjoy a special status in cognition51,52,53 and giving empirical support to face recognition models that propose faces and names are vulnerable to competition54,55,56,57. The mechanisms underlying our results, however, should be more controlled in nature than those posited by face recognition models56. This resonates with previous authors who suggested that controlled mechanisms could play a role in face recognition58,59, evidencing the key role that interference and inhibition play when attempting to retrieve a target memory, regardless of the nature of the stimuli.

Our results are thus consistent with the Inhibitory Deficit Theory (IDT14), in that they show impairment in an inhibitory task, but advance this theory by identifying a possible reason for this impairment, which might lie in an earlier stage of interference detection. As shown in Anderson et al.42, inhibition is interference dependent. Accordingly, if the older adults did not detect interference, as discussed above, inhibition should not be called into play. This is evidenced not only by the fact that theta power did not decrease across cycles, but also by the absence of a behavioural RIF effect across the two experiments.

Previous research has pointed in a similar direction. For instance, ERP studies in young adults showed that during incongruent trials of a Stroop task (interference inducing trials), a medial frontal negativity (MFN) component occurs between 400 and 500 ms (N450)60,61, with several studies showing medial prefrontal brain regions to be the generator of this MFN62,63,64. Crucially, the MFN generated by older adults has been shown to be attenuated, during different variants of the Stroop task65,66. Similarly, Tays et al.26 found that in a Sternberg-like task, older adults showed a large frontal positivity instead of the MFN, and that this unique pattern of frontal positivity is associated with poorer behavioural performance, rather than with compensatory mechanisms. The fact that a component that is consistently found in interference related trials (such as the MFN) is attenuated in older adults agrees with our results and with the idea that older adults have a harder time detecting interference.

In sum, the present work aimed to understand how age changes the neural dynamics underlying RIF. We sought to disentangle whether cognitive aging affects interference detection or interference resolution mechanisms, especially in the context of face naming, a task that seems to be rendered especially hard as people age9,10,11. In two experiments we show that the age-related inhibitory deficit widely described in the literature might be due to a missing early interference signal, with the older adults not detecting interference and consequently not recruiting the inhibitory mechanisms necessary to overcome it. This is, to our knowledge, the first study using electrophysiological measures to understand age-related neural changes underlying RIF and suggesting a specific source for the absence of RIF in the older adults. Going a step further, these findings contribute to the current understanding of the cognitive dynamics during memory retrieval in aging and how they are reflected in brain oscillations.

Methods

Participants

The sample size for the experiment was estimated by a power analysis based on the effect size from Ferreira et al.47 (\({\eta }_{p}^{2}\) = 0.28), with α = 0.05 and power = 0.95. For the mixed ANOVA, this yielded an estimated total sample of 44 participants. We increased the sample to 48 in the current study (24 in each group) in order to keep a complete counterbalancing of the task materials.
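Power-analysis tools often take Cohen's f rather than \({\eta }_{p}^{2}\) as the effect size input. The sketch below shows the standard conversion, f = sqrt(\({\eta }_{p}^{2}\) / (1 − \({\eta }_{p}^{2}\))), applied to the effect size given above. It only illustrates the conversion; it does not reproduce the sample size estimate, which also depends on design settings not detailed here, and the function name is ours.

```python
import math


def cohens_f(partial_eta_sq: float) -> float:
    """Convert partial eta squared to Cohen's f."""
    if not 0.0 <= partial_eta_sq < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(partial_eta_sq / (1.0 - partial_eta_sq))


# Effect size taken from the text above.
effect_size_f = cohens_f(0.28)
print(f"Cohen's f = {effect_size_f:.2f}")  # 0.62
```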

Experiment 1

Twenty-four students from the University of Granada (17 female; Mage = 24.70; SD = 5.56) and 24 older adults (10 female; Mage = 68.38; SD = 5.18; range 60–79) participated in this study.

Older participants were recruited from an association for retired people. These participants were highly educated (mean years of education = 13.31; SD = 3.25) and going through a process of normal healthy aging (Mini Mental State Exam67 mean score = 28.65/30, SD = 1.37). There were no significant working memory differences between the two age groups, as measured by the digit span test from the Wechsler Adult Intelligence Scale (WAIS III; Myoung = 15.13, SDyoung = 2.82; Molder = 13.58, SDolder = 2.99; p > 0.05).

All participants were Spanish or had been living in Spain for at least 15 years and all reported normal or corrected-to-normal vision. Participants were given all the information about the study and signed an informed consent prior to its start. All subjects received course-credits or a monetary reward (15€) for their participation in the study. The experiment followed the Helsinki Declaration guidelines and was approved by the Ethics Committee of the University of Granada.

For the EEG analysis, four of the older participants were excluded due to excessive movement during the task leading to poor EEG data quality.

Experiment 2

For Experiment 2, a new sample of 24 students from the University of Granada (17 female; Mage = 21.13; SD = 3.45) and 24 older adults (8 female; Mage = 64.74; SD = 3.45; range 60–75) was recruited. Older participants were recruited through an advert published in a local newspaper and on the University of Granada webpage. Inclusion criteria specifically stated that older adults should have a minimum of 12 years of education. Mean years of education for this sample was 15.46 (SD = 2.25). As in Experiment 1, participants completed the Mini Mental State Exam67, scoring 28.1/30 (SD = 0.98). No differences were found between the age groups in working memory capacity, as measured by the digit span test from the WAIS III (Myoung = 15.88; SDyoung = 2.58; Molder = 14.53; SDolder = 2.98; p > 0.05).

All participants were Spanish or had been living in Spain for at least 15 years and were thus native or very fluent speakers. All reported normal or corrected-to-normal vision. Participants were given all the information about the study and signed an informed consent form before it began. Young participants received course credits and older adults were monetarily rewarded (15€) for their participation in the study. The experiment followed the Helsinki Declaration guidelines and was approved by the Ethics Committee of the University of Granada.

Material

Experiment 1

A total of sixty-four pictures were used in this experiment, all of famous people. Faces were divided into eight occupational categories (male: actors, politicians, football players, writers, and TV hosts; female: singers, royalty members, and tabloid stars). These materials had been used in previous experiments47,68 and were originally chosen from a pilot study that served the purpose of evaluating each item’s familiarity. Pictures were selected so that they had the highest familiarity values, provided they did not share the first two letters of the corresponding name. Six additional exemplars were chosen as filler items: 3 radio personalities and 3 bull-fighters. These were used to control for primacy and recency effects and were not taken into account in any of the analyses.

Pictures were presented in colour (5.19 cm × 6.99 cm) against a white background. In order to standardize them, an oval template was applied around each picture (see Young, Ellis, Flude, McWeeny, & Hay69). All faces displayed a neutral to mildly positive expression. Eight counterbalance versions were created so that all faces were seen in all conditions across participants.

Experiment 2

A total of 64 target words plus six fillers were used. The words belonged to eight different categories (animals, fruits, tools, vehicles, insects, trees, clothes and furniture) with eight exemplars each. Filler items belonged to two extra categories (beverages and toys, with three exemplars each).

Within the same category, no items shared the first two letters. Moreover, in order to maximize competition between items, within each category four items were highly representative of their category, while the other four were poor representatives. The poor representatives were used as practised items and their baseline, whereas highly representative words were used as unpractised items and their respective baseline. This manipulation is thought to boost interference, since the more representative of its category an item is, the more it will compete with the to-be-retrieved ones31,59.

Indices of frequency and rank for each item with respect to its category were taken from Marful, Díez and Fernández70, using the NIPE database (Norms and Indices for Experimental Psychology71). Mean frequency, measured as the number of participants (out of the total 284) who produced any given word as an exemplar of its category, was 4.70 (SD = 6.20) for practice items and 208.3 (SD = 52.80) for competitors. Rank scores were on average 8.5 (SD = 2.20) and 4.4 (SD = 1.50) for practice items and competitors, respectively.

The words were presented in the centre of the screen in a black font (Courier New, 18 pts) on a white background. Category cues were always presented in uppercase letters, whereas the specific items and their stems were presented in a capitalized fashion.

Procedure

Experiment 1 consisted of a version of the retrieval practice paradigm (Fig. 1B), comprising a study phase, a retrieval practice phase and a final test.

Study phase

The experiment started with a study phase, where participants were shown the 64 critical faces sequentially. Presentation was randomized except that the first and last three faces were always filler items, to account for primacy and recency effects. After a 1000 ms fixation cross, a face appeared on the screen for 4000 ms with its respective name and profession written below (e.g. Actors–Banderas). The participants’ task consisted of pressing a number from 1 to 5 on the keyboard to rate how familiar they were with the face presented on the screen (1 = not known at all; 5 = very well known). This was done not only to control for possible differences in item familiarity between older and younger participants, but also to keep participants engaged and ensure they attended to and processed the stimuli. Subjects were instructed to pay close attention not only to the faces but also to their names and professions, since they would be asked about them in the next phase.

Retrieval practice phase

During this phase, which occurred right after study, participants were asked to retrieve half of the exemplars from six of the eight categories. Participants first saw a jittered fixation cross (1000–1500 ms) followed by the category cue (e.g. Actors) for 2000 ms, a blank screen (500 ms) and a specific face (2500 ms). Then a red question mark appeared on the screen and participants were instructed to give their response (name the person they had just seen on screen) at that moment. Participants were explicitly asked to refrain from responding until the question mark was presented on screen, to avoid speech artefacts. Faces were presented in a pseudo-random order, so that a whole set would be presented before repeating itself. As in the study phase, the first and last faces were filler items used to control for primacy and recency effects.

Crucially, there were three cycles of retrieval practice, that is, each of the 24 critical faces used during this part of the experiment was repeated three times, in order to allow comparisons between first and third cycles, similarly to what has been done in previous studies17,18,41.
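The ordering constraint of the retrieval practice phase (a whole set of faces is presented before any face repeats, with filler items first and last) can be sketched as follows. This is a minimal illustration, not the presentation script used in the study: the function name, the item labels and the use of a single filler at each end are assumptions.

```python
import random


def build_practice_order(items, filler_start, filler_end, n_cycles=3, seed=None):
    """Return (cycle, item) pairs: each cycle is a fresh shuffle of the full set.

    Fillers are tagged with cycle None so they can be dropped from analyses.
    """
    rng = random.Random(seed)
    order = [(None, filler_start)]
    for cycle in range(1, n_cycles + 1):
        block = list(items)
        rng.shuffle(block)  # every item appears once before the next cycle starts
        order.extend((cycle, item) for item in block)
    order.append((None, filler_end))
    return order


# Hypothetical labels for the 24 practised faces.
faces = [f"face_{i:02d}" for i in range(24)]
practice_order = build_practice_order(faces, "filler_a", "filler_b", seed=0)
print(len(practice_order))  # 3 cycles x 24 faces + 2 fillers
```

Keeping the cycle index alongside each trial makes the later comparison between first and third cycles a simple filter on that index.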

Test phase

A 5-minute distractor task followed retrieval practice (the digit span test from the WAIS III). Thereafter, a final memory test took place, where each studied face was presented again for naming. After a fixation cross (1000 ms), a face appeared on the screen for 3000 ms and participants were asked to retrieve the corresponding name as soon as possible. The order of presentation was pseudo-randomized, such that all unpractised items and half of the control items were presented first, followed by the practised items and the other half of the control items. This was done to prevent possible confounds regarding the forgetting effect, as retrieving practised items first could block access to the unpractised ones (blocking effect34,72).
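The two-block structure of the final test can be sketched as below. The function and argument names are hypothetical, and shuffling items within each block is an assumption, since the text only specifies which item types come first.

```python
import random


def build_test_order(unpractised, control_first, practised, control_second, seed=None):
    """Unpractised items and their controls first, practised items and theirs second."""
    rng = random.Random(seed)
    first_block = list(unpractised) + list(control_first)
    second_block = list(practised) + list(control_second)
    rng.shuffle(first_block)   # assumed: random order within each block
    rng.shuffle(second_block)
    return first_block + second_block


# Toy labels only; the real materials are the studied faces or words.
test_order = build_test_order(
    unpractised=["u1", "u2"], control_first=["c1", "c2"],
    practised=["p1", "p2"], control_second=["c3", "c4"], seed=0,
)
print(test_order)
```

Because practised items never precede unpractised ones, any forgetting of the unpractised items cannot be attributed to output interference from the practised items at test.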

Experiment 2 followed the procedure of Experiment 1 as closely as possible. The three phases of the retrieval practice paradigm were maintained, the only difference being that instead of seeing faces together with their respective name and occupation, participants saw a category cue (e.g. FRUIT) either together with an exemplar of that same category (e.g. Apple) during the study phase, or followed by the word stem (Ap___) in the practice and test phases.

EEG recording

The EEG was recorded from 64 scalp electrodes, mounted on an elastic cap, on an extended 10–20 system. Four additional electrodes were used to control for eye movements: two set above and below the left eye (controlling for vertical movement) and another two set at the outer side of each eye, to control for horizontal movement.

Continuous activity was recorded using Neuroscan Synamps2 amplifiers (El Paso, TX), with a midline electrode (half-way between Cz and CPz) as the online reference. The data were then re-referenced offline to a common average reference. Each channel was amplified with a band pass of 0.01–100 Hz and digitized at a 500 Hz sampling rate. Impedances were kept below 5 kΩ.
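The offline re-referencing step can be illustrated with a minimal sketch. It assumes the data are held as a plain list of channels, each a list of samples; the function name is hypothetical and this is not the routine used in the study.

```python
def common_average_reference(data):
    """Subtract the across-channel mean from every channel at each sample.

    data: list of channels, each a list of samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    average = [sum(ch[s] for ch in data) / n_channels for s in range(n_samples)]
    return [[ch[s] - average[s] for s in range(n_samples)] for ch in data]


# Toy example with three channels and two samples.
rereferenced = common_average_reference([[1.0, 4.0], [2.0, 5.0], [3.0, 9.0]])
print(rereferenced)
```

After this step the channels sum to zero at every time point, so no single electrode site acts as the reference.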

Prior to analysing the data, a high-pass filter (at 1 Hz) was applied and artefacts (such as eye movements and EKG) were removed using independent component analysis (ICA). Remaining artefacts after ICA were manually removed by carefully inspecting the data. The average number of valid trials remaining after artefact removal for each participant in each cycle and experiment can be seen in Supplementary Table 1.

EEG pre-processing

For EEG analyses we used the Fieldtrip toolbox73 on Matlab (The MathWorks, Munich, Germany). The EEG data were cut into segments ranging from 2000 ms before stimulus presentation to 4000 ms after it, around both the category cue and the retrieval-specific cue (i.e. the face in Experiment 1 and the word stem in Experiment 2; first, second, and third cycles in both cases). These large segments were chosen to avoid filter artefacts after wavelet transformation at the beginning and end of each period. Data analysis was restricted to a smaller time window from −500 ms to 2000 ms.
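The segment boundaries translate into sample indices as sketched below, using the 500 Hz sampling rate given in the recording section. The helper names are hypothetical, and the sketch operates on a single channel held as a Python list rather than on Fieldtrip data structures.

```python
SFREQ_HZ = 500  # sampling rate reported in the recording section


def ms_to_samples(ms, sfreq_hz=SFREQ_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sfreq_hz / 1000.0))


def cut_epoch(signal, onset_sample, tmin_ms=-2000, tmax_ms=4000, sfreq_hz=SFREQ_HZ):
    """Cut one segment around a stimulus onset (end point excluded)."""
    start = onset_sample + ms_to_samples(tmin_ms, sfreq_hz)
    stop = onset_sample + ms_to_samples(tmax_ms, sfreq_hz)
    if start < 0 or stop > len(signal):
        raise ValueError("segment extends beyond the recording")
    return signal[start:stop]


def crop_epoch(epoch, epoch_tmin_ms=-2000, tmin_ms=-500, tmax_ms=2000, sfreq_hz=SFREQ_HZ):
    """Restrict a segment to the narrower analysis window."""
    start = ms_to_samples(tmin_ms - epoch_tmin_ms, sfreq_hz)
    stop = ms_to_samples(tmax_ms - epoch_tmin_ms, sfreq_hz)
    return epoch[start:stop]


recording = list(range(10_000))          # stand-in for one continuous channel
epoch = cut_epoch(recording, onset_sample=5_000)
window = crop_epoch(epoch)
print(len(epoch), len(window))
```

The 1500 ms of padding on the early side and 2000 ms on the late side is what keeps the edges of the wavelet transform outside the analysed window.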

Analyses of oscillatory power

For time-frequency analyses, a Morlet wavelet transformation (7 cycles) was applied to the data. Data were filtered in a frequency range from 1 to 30 Hz and exported in bins of 50 ms and 1 Hz. As in previous experiments47, power changes were calculated in relation to a prestimulus baseline (from −500 to 0 ms relative to category cue onset).
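The wavelet step and the baseline normalization can be sketched for a single channel and a single frequency as follows. This is a minimal illustration under stated assumptions, not the Fieldtrip implementation: the wavelet is truncated at 3 standard deviations of its Gaussian envelope, it is normalized by the sum of the envelope, and the baseline correction is taken to be a relative change, (power − baseline) / baseline, which the text does not specify. All function names are hypothetical.

```python
import cmath
import math


def morlet_power(signal, sfreq_hz, freq_hz, n_cycles=7, n_sd=3):
    """Power time course at one frequency via complex Morlet convolution."""
    sigma_t = n_cycles / (2.0 * math.pi * freq_hz)   # temporal SD in seconds
    half = int(round(n_sd * sigma_t * sfreq_hz))     # half-width in samples (assumed cut-off)
    offsets = range(-half, half + 1)
    envelope = [math.exp(-((k / sfreq_hz) ** 2) / (2.0 * sigma_t ** 2)) for k in offsets]
    norm = sum(envelope)
    wavelet = [env * cmath.exp(2j * math.pi * freq_hz * k / sfreq_hz) / norm
               for env, k in zip(envelope, offsets)]
    n = len(signal)
    power = []
    for t in range(n):
        acc = 0j
        for k, w in zip(offsets, wavelet):
            if 0 <= t - k < n:                        # samples outside the segment count as zero
                acc += signal[t - k] * w
        power.append(abs(acc) ** 2)
    return power


def relative_power_change(power, times_ms, baseline_ms=(-500, 0)):
    """Express power as relative change from the mean of the baseline window."""
    base = [p for p, t in zip(power, times_ms) if baseline_ms[0] <= t < baseline_ms[1]]
    mean_base = sum(base) / len(base)
    return [(p - mean_base) / mean_base for p in power]
```

At 6 Hz, a 7-cycle wavelet truncated in this way spans roughly ±560 ms, which shows why segments much longer than the analysis window are needed: estimates closer than that to a segment edge are computed partly from zero padding.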

Given our a priori hypotheses, analyses were restricted to the theta frequency range (4–8 Hz) and to the time window around cue presentation. A region-of-interest (ROI) analysis was applied to a set of 9 fronto-central electrodes (FCz, F1, Fz, F2, FC1, FC2, C1, Cz, C2), based on our previous study47 and on numerous other studies showing that mid-frontal theta oscillations are typically recorded at these locations46. We believe that restricting our analyses to this ROI allows for more specific interpretations of the results, since we have very clear a priori hypotheses about what mid-frontal theta should be reflecting (for unrestricted analyses, however, see Supplementary Fig. S1). Power differences over this ROI were used to define the exact time-frequency windows for subsequent analyses.

Since the aim of this study was to assess differences between young and older adults, our first step was to compute group differences. We first looked at differences in theta power upon presentation of the first category cue, as an index of initial levels of interference detection, and then performed an interaction analysis (cue cycle 1 minus cue cycle 3 × age group). Differences in theta power upon presentation of the cue on the first cycle minus presentation of the cue on the third one were calculated for each participant, at the aforementioned mid-frontal ROI. These differences were then subjected to an independent samples t-test, comparing the two age groups. Note that although age-related anatomical differences, such as skull thickness, could have an effect on absolute power, they should not affect these relative measures.
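The interaction contrast described above reduces to one difference score per participant, compared between groups. The sketch below assumes a classical Student's t with pooled variance (the text does not state whether equal variances were assumed) and returns only the statistic and its degrees of freedom; the function names and the numbers in the example are illustrative, not study data.

```python
import math
from statistics import mean, variance


def cycle_difference(theta_cycle1, theta_cycle3):
    """Per-participant theta power at the first cue minus the third cue."""
    return [c1 - c3 for c1, c3 in zip(theta_cycle1, theta_cycle3)]


def independent_t(group_a, group_b):
    """Student's t for two independent samples (pooled variance) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t_value = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return t_value, df


# Toy ROI-averaged values, one per participant (made up for illustration).
young_diff = cycle_difference([0.30, 0.25, 0.40, 0.35], [0.10, 0.05, 0.15, 0.20])
older_diff = cycle_difference([0.12, 0.08, 0.15, 0.10], [0.10, 0.09, 0.12, 0.11])
t_value, df = independent_t(young_diff, older_diff)
print(f"t({df}) = {t_value:.2f}")
```

Working with within-participant differences is also what makes the measure relative: any constant scaling of a participant's absolute power, such as that introduced by skull thickness, is largely removed once power is baseline-normalized and cycles are subtracted.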

Analyses of oscillatory power upon face presentation were performed in a similar fashion to the category cue analyses, although differences were computed for all electrodes, rather than for a particular ROI. These are not reported since they yielded no significant results.

In order to control for multiple comparisons, Monte Carlo randomization was used (see details on this method in Maris & Oostenveld74). From this procedure, clusters of electrodes that significantly differed from one cycle to the other were obtained (pcorr < 0.05).
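The logic of Monte Carlo randomization can be conveyed with a deliberately simplified sketch: group labels are shuffled many times and the observed group difference is compared against the resulting null distribution. This version works on a single value per participant and uses the difference in means as the test statistic; it does not form clusters over electrodes, time or frequency as the procedure of Maris & Oostenveld74 does. The function name, the number of randomizations and the example values are assumptions.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=None):
    """Two-sided Monte Carlo p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign participants to groups
        permuted = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if permuted >= observed:
            count += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (count + 1) / (n_permutations + 1)


# Toy difference scores (made up for illustration).
p_value = permutation_p_value([10, 11, 12, 13, 14, 15], [0, 1, 2, 3, 4, 5], seed=0)
print(p_value)
```

In the cluster-based variant, the same shuffling is applied, but the statistic recorded on each randomization is the largest summed test statistic over any cluster of adjacent samples, which is what controls the error rate across the many electrodes and time-frequency bins.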

Planned comparisons were then made for each group (young and older) separately, comparing the first cue and face presentations with the third, over the time and frequency windows that were significant in the interaction analysis.