Introduction

The role of post-encoding sleep in memory

Post-encoding sleep plays an important role in consolidating hippocampus-dependent memories (Lewis & Durrant, 2011; O’Neill et al., 2010), especially their contextual elements (Van Der Helm et al., 2011). The reactivation of neural populations that were active during encoding can lead to superior memory retrieval (Lewis & Durrant, 2011; O’Neill et al., 2010). For instance, re-exposure to an encoding cue such as an odor (Rasch et al., 2007) or auditory stimulus (Bendor & Wilson, 2012) during sleep reactivates and helps to consolidate hippocampal representations.

The consolidation of memories during sleep can change their representation both quantitatively and qualitatively (Diekelmann & Born, 2010). Post-encoding sleep can strengthen memory associations to produce robust representations that are resistant to interference (Ellenbogen et al., 2006; Korman et al., 2007) and enhance recall performance (Gais et al., 2002; Plihal & Born, 1997; Tucker et al., 2006). Moreover, post-encoding sleep can induce qualitative changes by reorganizing newly encoded memory representations to form new associations (Diekelmann & Born, 2010). Such qualitative changes enable invariant features to be extracted from complex stimuli (Inostroza & Born, 2013; McClelland et al., 1995; Rasch & Born, 2007), allowing inference and insight into implicit rules (Ellenbogen et al., 2007; Fischer et al., 2006; Wagner et al., 2004). While a considerable number of studies to date have demonstrated the facilitatory effect of sleep-dependent consolidation in human memory (Inostroza & Born, 2013), questions remain about the way that episodic memories are transformed by post-encoding sleep. One way in which consolidation is hypothesized to affect episodic memory is to alter the connections between semantically related pieces of information (Inostroza & Born, 2013). Based on this suggestion, we test the hypothesis that semantic coherence, the conceptual relatedness between statements (Landauer & Dumais, 1997; for further details, see Methods section), will be changed by post-encoding sleep.

Latent semantic analysis (LSA)

In the past few decades, computational advances in characterizing the meaning of words in memory have led to the emergence of a class of vector models of semantic structure, in which a given word is represented as a vector in a multi-dimensional semantic space (Burgess & Lund, 2000; Landauer & Dumais, 1997; Osgood et al., 1957). In this semantic space, words that tend to co-occur across similar contexts are located in similar regions of the space. The Euclidean distance, or the angle between the word vectors, can be calculated to represent the semantic distance or similarity between word pairs or groups of words. One such framework is latent semantic analysis (LSA; Deerwester et al., 1990; Landauer et al., 1998; Landauer & Dumais, 1997), which has been widely and successfully applied to predict memory-related responses (Jones et al., 2006; Landauer et al., 1998; Landauer & Dumais, 1997) and performance on multiple-choice vocabulary tests and domain knowledge tests (Landauer et al., 1998; Landauer & Dumais, 1997).
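The two similarity measures mentioned above can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration only; they are not taken from any LSA space, and real spaces have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical toy vectors (assumption: made up for this example)
toy_space = {
    "shrew": [0.9, 0.1, 0.2],
    "mouse": [0.8, 0.2, 0.1],
    "ocean": [0.1, 0.9, 0.3],
}

sim_close = cosine(toy_space["shrew"], toy_space["mouse"])
sim_far = cosine(toy_space["shrew"], toy_space["ocean"])
dist_close = euclidean(toy_space["shrew"], toy_space["mouse"])
dist_far = euclidean(toy_space["shrew"], toy_space["ocean"])
```

In this toy space, "shrew" and "mouse" point in nearly the same direction, so their cosine is higher and their Euclidean distance smaller than for "shrew" and "ocean".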

One of the methods developed by Landauer and Dumais (1997) was the prediction of discourse “coherence” and text comprehension using LSA. In a coherent text, succeeding sentences use the concepts introduced in the preceding discourse, whereas less coherent texts incorporate sentences that are less related to one another (Landauer & Dumais, 1997). Semantic coherence at the sentence level can predict text-passage comprehensibility in a way that is not possible from the word level alone (Landauer & Dumais, 1997). Despite this, most LSA applications to episodic memory have relied on word-level measures (e.g., Steyvers et al., 2005).

In this study, we applied LSA to free-recall texts produced after wakefulness or sleep to test how post-encoding sleep modifies the semantic coherence of memory recall. We examined sequential semantic coherence (SSC hereafter), the conceptual similarity between successive sentences, and topic semantic coherence (TSC hereafter), the conceptual similarity between all free-recall sentences, to probe the effect of sleep on neighboring and overall free-recalled information.

Materials and methods

Participants

One hundred and fifteen participants (51 males, 64 females; mean age = 20.8 years; age range: 18–24 years), all English speakers without a learning or attention disorder, were recruited from the Pittsburgh community. Participants were recruited and randomly assigned to one of four groups until all groups had at least 25 participants. The groups differed in the delay between the first and final session (12 or 24 h) and the time-of-day of training/testing, giving: 12-h morning/night (“12-h AM”; see Note 1; n = 25), 12-h night/morning (“12-h PM”; n = 29), 24-h morning/morning (“24-h AM”; n = 29), and 24-h night/night (“24-h PM”; n = 32). Therefore, participants in both 24-h groups and the 12-h night/morning (“12-h PM”) group slept between sessions, and participants in the 12-h morning/night (“12-h AM”) group did not. These four conditions allowed us to study the effect of sleep by comparing the performance of participants with and without sleep between sessions, while controlling for the effect of time of day. Four participants were removed from analysis because they did not complete the experimental procedures. Consent was obtained prior to the study, and participants received course credit or payment as compensation for their time. The University of Pittsburgh Institutional Review Board approved all procedures. Data from these participants have been reported elsewhere (Coutanche et al., 2020), but without examinations of semantic coherence.

Design

Encoding

During the initial encoding session, participants were instructed to watch 24 silent naturalistic video clips featuring six rare animals in their natural habitat. Each animal was featured in four videos, each lasting 45 s. The motivation for using naturalistic visual episodes is the established role of temporal and spatial dimensions in the formation of memory, which cannot be easily incorporated into static images or words (Sonkusare et al., 2019). The animals were selected based on their typical (un)familiarity, as rated by an independent norming group through Amazon Mechanical Turk on a scale of 1 (“not familiar”) to 5 (“very familiar”). In particular, six rare animals (mudskipper, weedy sea dragon, aye-aye, shrew, shoebill, frigate) were selected that had a mean familiarity of 1.9, which is less than half the familiarity (M = 4.1) of typically familiar species such as “lion.” Videos were played in a pseudo-randomized order such that one video of every animal was presented before the second video of any animal, and so on for the third and fourth videos. Participants were not informed that they would later be required to recall information from the videos.
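One way to generate a presentation order with this constraint is a blocked shuffle, sketched below. This is an assumption about how such an order could be produced, not the procedure actually used in the study; the function name `blocked_order` and the fixed seed are hypothetical.

```python
import random

ANIMALS = ["mudskipper", "weedy sea dragon", "aye-aye",
           "shrew", "shoebill", "frigate"]
VIDEOS_PER_ANIMAL = 4

def blocked_order(animals, videos_per_animal, rng):
    """Return (animal, presentation_number) pairs in blocks.

    Each block contains every animal exactly once in shuffled order,
    so the k-th video of every animal precedes any (k+1)-th video.
    """
    order = []
    for presentation in range(1, videos_per_animal + 1):
        block = list(animals)
        rng.shuffle(block)  # randomize animal order within the block
        order.extend((animal, presentation) for animal in block)
    return order

# Seeded generator so the sketch is reproducible (seed is arbitrary)
order = blocked_order(ANIMALS, VIDEOS_PER_ANIMAL, random.Random(0))
```

The result is a list of 24 presentations in which each consecutive group of six contains all six animals.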

Immediate and delayed memory recall

After viewing the videos, participants were asked (e.g., “Can you tell me what you saw in the video?”) to verbally free recall as much detail as they could from the videos featuring three of the six animals (one fish, one bird, one mammal). This was followed by more specific cued recall questions (not analyzed here and reported elsewhere; Coutanche et al., 2020). Importantly, these questions were administered to all participants, so that any group differences in free recall were due to the group assignment. Participants typed their free-recall responses into a text file (free-recall text hereafter). An example of a free-recall response from one participant (see Note 2) is as follows: “There were a lot of mice. They were in a tree or some sort of plant. Two of the mice started fighting each other and were attacking each other and rolling around for a while. They chased each other and went inside the hole or tree where there were a lot more mice and they all fought each other. They then left the hole and came out one by one. Some of them fell down while they were leaving. Then the mice lined up and connected themselves to each other and traveled down on the ground.”

After the immediate-recall session, participants left the laboratory and returned either 12 or 24 h later (depending on their condition assignment). Participants were then asked to free recall the videos featuring the remaining three animals. The selection of the three animals for the first (or second) session was counterbalanced across participants (see Fig. 1). Additionally, participants rated their prior familiarity (i.e., before the study) with the six featured animals on a 7-point scale (1 = not at all familiar; 4 = somewhat familiar; 7 = very familiar). These ratings were used to account for individual differences in prior familiarity with the featured animals.

Fig. 1

Experiment design. Participants were first shown 24 videos of six animals during the encoding session, with each animal being featured in four videos. Free recall of videos featuring three of the animals was tested immediately after encoding, and free recall of videos of the remaining three animals was tested after a delay of either 12 h or 24 h that featured sleep or wakefulness. The animals selected for immediate versus delayed recall were counterbalanced across participants.

Measurement of semantic coherence by applying LSA

LSA was applied to quantify the semantic coherence of free-recall texts. We used a semantic space created from the TASA (Touchstone Applied Science Associates, Inc.) corpus, which covers a broad variety of topics (e.g., Arts, Science, and Social Studies) and contains 92,393 different terms. Each sentence in a free-recall text was represented as a vector in this high-dimensional semantic space, calculated as the average of the vectors of the words it contains. “Local coherence” was calculated as the cosine between the vectors of two adjacent sentences using the coherence function in the R package LSAfun (Guenther & Guenther, 2019; Günther et al., 2015; Landauer & Dumais, 1997). Sequential semantic coherence (SSC hereafter) refers to the conceptual similarity between consecutive sentences, and was calculated as the average local coherence across all pairs of adjacent sentences in a text (Guenther & Guenther, 2019; Günther et al., 2015; Landauer & Dumais, 1997). Topic semantic coherence (TSC hereafter) refers to the overall conceptual similarity between all sentences within a free-recall text, and was obtained by averaging the cosine between every pair of sentences within the text (the two models are illustrated in Fig. 2). Cosine values can range from -1 to +1, with higher values indicating greater semantic coherence; the observed values were log-normally distributed.
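The two measures can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the LSAfun code used in the study: the two-dimensional `toy_space` is invented, words missing from the space are simply skipped, and the function names `ssc` and `tsc` are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def sentence_vector(sentence, space):
    """Average the vectors of the words that appear in the space."""
    words = [w.strip(".,;:!?\"'").lower() for w in sentence.split()]
    vectors = [space[w] for w in words if w in space]
    if not vectors:
        return None  # no known words in this sentence
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]

def mean(values):
    return sum(values) / len(values)

def ssc(vectors):
    """Mean cosine between each sentence and the one that follows it."""
    return mean([cosine(a, b) for a, b in zip(vectors, vectors[1:])])

def tsc(vectors):
    """Mean cosine over every pair of sentences in the text."""
    return mean([cosine(a, b) for a, b in combinations(vectors, 2)])

# Hypothetical toy space (assumption: values invented for illustration)
toy_space = {"mice": [1.0, 0.0], "fought": [0.9, 0.1],
             "chased": [0.8, 0.2], "left": [0.1, 0.9], "hole": [0.0, 1.0]}
text = ["The mice fought.", "The mice chased.", "They left the hole."]
vectors = [v for v in (sentence_vector(s, toy_space) for s in text)
           if v is not None]
ssc_score = ssc(vectors)
tsc_score = tsc(vectors)
```

With seven sentences, as in Fig. 2, SSC averages 6 adjacent pairs and TSC averages all 21 pairs; for a two-sentence text the two measures coincide.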

Fig. 2

S1–S7 indicate the order of the sentences within a free-recall text. (A) Model SSC represents conceptual similarity between adjacent sentences within a free-recall text, with arrows indicating semantic coherence in sequential order. (B) Model TSC represents conceptual similarity between all sentences within a free-recall text.

Conciseness and repetition

To examine the basis for any differences in the semantic coherence metrics, we also created metrics reflecting text conciseness and repetition. It was important to ensure that any reduction in semantic coherence did not simply reflect responses becoming more concise and less repetitive after sleep (leading to a sparser semantic space). Therefore, we systematically quantified the number of unique idea units and the number of word repeats. The unique idea units in each free-recall text were coded by three trained independent researchers, where a single “idea unit” reflected the same event. For example, “mice started fighting each other and attacking each other” would be counted as one idea unit (i.e., fight), whereas “they went inside the hole” would be counted as a different idea unit. For each free-recall text, the mean number of unique idea units was calculated across the three coders. The number of repeated words was calculated with a Python script.
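The original script is not reproduced here. A minimal sketch of one plausible definition follows, assuming a "repeat" is any occurrence of a word after its first appearance in the text, with case and punctuation ignored; the function name `count_word_repeats` is hypothetical, and the actual script may have tokenized or filtered words differently.

```python
import re
from collections import Counter

def count_word_repeats(text):
    """Count word occurrences beyond the first appearance of each word.

    Assumption: words are lowercase runs of letters or apostrophes,
    and no stop words are excluded.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sum(n - 1 for n in counts.values())

example = "They chased each other and fought each other."
repeats = count_word_repeats(example)  # "each" and "other" each repeat once
```

Under this definition the example sentence contains two repeats.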

Statistical analysis

A linear mixed-effects regression model was implemented with an interaction term between sleep condition and session (wake or sleep × immediate or delayed recall). The interaction is necessary because the sleep variable includes texts from both sessions (i.e., both before and after sleep/wakefulness). Considering that participants might have different levels of prior familiarity with the featured animals, and given the facilitatory effect of prior familiarity on memory (Bruett et al., 2018; Popov et al., 2019), we included prior familiarity in the model to account for its effect on semantic coherence during memory recall. Additionally, the number of unique idea units, total word repeats, number of words, and number of sentences were included in the model to control for their effects. Random effects of subjects and featured animals were also included in the model to account for any such differences. Lastly, based on prior evidence that time-of-day may affect memory retrieval (Folkard et al., 1977; Tilley & Warren, 1983), we conducted an additional analysis to verify that time-of-day had limited effect on both SSC and TSC of memory recall.
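One way to write the model described above is sketched below. The notation is an assumption made for illustration: y is the coherence score (SSC or TSC) of the text produced by participant i about animal a, the random effects are taken to be random intercepts, and the coding of the predictors is not specified in the text.

```latex
\[
\begin{aligned}
y_{ia} ={}& \beta_0 + \beta_1\,\mathrm{Sleep}_{i} + \beta_2\,\mathrm{Session}_{ia}
          + \beta_3\,(\mathrm{Sleep}_{i} \times \mathrm{Session}_{ia}) \\
         &+ \beta_4\,\mathrm{Familiarity}_{ia} + \beta_5\,\mathrm{IdeaUnits}_{ia}
          + \beta_6\,\mathrm{WordRepeats}_{ia} + \beta_7\,\mathrm{Words}_{ia}
          + \beta_8\,\mathrm{Sentences}_{ia} \\
         &+ u_i + v_a + \varepsilon_{ia},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
       v_a \sim \mathcal{N}(0, \sigma_v^2),\quad
       \varepsilon_{ia} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
\]
```

In this form, the coefficient on the interaction term captures how the change from immediate to delayed recall differs between the sleep and wake conditions, which is the quantity used to test for an effect of sleep.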

Results

Linear mixed-effects models were used to test how sleep influences SSC and TSC during memory recall for naturalistic visual episodes. The mean rating of prior familiarity (on a 1–7 scale) for each featured animal ranged from 2.58 to 4.75 (mean (M) = 3.31, standard deviation (SD) = 2.08). There were significant differences between animals (F(5, 660) = 17.51, p < 0.001). The effect of sleep was tested through the significance of an interaction between sleep condition (sleep or wakefulness) and recall session (immediate or delayed free recall). The numbers of words and sentences in each free-recall text for the four key conditions are summarized in Table 1.

Table 1 Descriptive statistics for number of words and number of sentences for the four conditions. Awake-immediate and Sleep-immediate reflect free recall from participants immediately after encoding. Awake-delayed and Sleep-delayed reflect free recall after the delay (filled with wakefulness or sleep, respectively). Values indicate means, with standard deviations in parentheses.

Sequential semantic coherence (SSC)

SSC scores were significantly lower after post-encoding sleep than after wakefulness (interaction term: β = -0.14, p = .036; sleep immediate recall: M = 0.42, SD = 0.13; sleep delayed recall: M = 0.38, SD = 0.14; wake immediate recall: M = 0.41, SD = 0.14; wake delayed recall: M = 0.40, SD = 0.13; Fig. 3). Prior familiarity and the number of unique idea units were both included in the model, and neither familiarity (β = -0.004, p = .602) nor the number of unique idea units (β = -0.002, p = .708) influenced SSC.

Fig. 3

Mean sequential semantic coherence (SSC) scores for pre- and post-sleep and wakefulness. Error bars indicate standard error of the mean. * p < 0.05 reflects significant SSC reduction after sleep and the significance of the interaction.

Furthermore, we conducted an additional linear regression analysis to verify that there were no time-of-day effects. For immediate recall, we tested whether SSC differed significantly between groups with immediate retrieval in the morning (“12-h AM” and “24-h AM”) and groups with immediate retrieval in the evening (“12-h PM” and “24-h PM”). For delayed recall, we contrasted participants with delayed recall in the morning (“12-h PM” and “24-h AM”; see Note 3) and in the evening (“24-h PM”). The wake group (“12-h AM”) was excluded to keep the sleep state constant. SSC did not significantly differ in immediate recall (β = -0.06, p = .239) or delayed recall (β = -0.15, p = .092). Therefore, it is unlikely that time-of-day significantly affected semantic coherence.

Topic semantic coherence (TSC)

The findings for TSC were consistent with those for SSC. TSC scores were significantly lower after post-encoding sleep than after wakefulness (interaction term: β = -0.13, p = .046; sleep immediate recall: M = 0.41, SD = 0.12; sleep delayed recall: M = 0.36, SD = 0.13; wake immediate recall: M = 0.40, SD = 0.14; wake delayed recall: M = 0.39, SD = 0.13; Fig. 4). Prior familiarity and the number of unique idea units were both included in the above model, and neither familiarity (β = -0.004, p = .641) nor the number of unique idea units (β = -0.006, p = .352) influenced TSC. Moreover, TSC did not differ by time-of-day for either immediate recall (β = -0.04, p = .404) or delayed recall (β = -0.14, p = .180).
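The sleep × session interaction can be read as a difference of differences between the condition means reported for SSC and TSC. The sketch below computes that raw contrast from the reported means only. It is an aid to interpretation and not the model estimate: the reported coefficients come from mixed-effects models with covariates and random effects, and need not be on the same scale as this raw contrast.

```python
# Condition means copied from the Results text, as (immediate, delayed)
reported_means = {
    "SSC": {"sleep": (0.42, 0.38), "wake": (0.41, 0.40)},
    "TSC": {"sleep": (0.41, 0.36), "wake": (0.40, 0.39)},
}

def difference_of_differences(means):
    """(delayed - immediate) for sleep minus the same change for wake."""
    sleep_change = means["sleep"][1] - means["sleep"][0]
    wake_change = means["wake"][1] - means["wake"][0]
    # Round to the two decimals at which the means were reported
    return round(sleep_change - wake_change, 2)

raw_contrast = {measure: difference_of_differences(means)
                for measure, means in reported_means.items()}
```

For both measures the raw contrast is negative: coherence fell by 0.04 (SSC) and 0.05 (TSC) across the sleep delay, compared with 0.01 across the wake delay.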

Fig. 4

Mean topic semantic coherence (TSC) scores for pre- and post-sleep and wakefulness. Error bars indicate standard error of the mean. * p < 0.05 reflects significant TSC reduction after sleep and the significance of the interaction.

Unique idea units and word repeats

We also asked how the number of unique idea units and word repeats changed across recall sessions with sleep or wakefulness. Specifically, to test for a sleep effect on the number of unique idea units recalled, we fitted a linear mixed-effects model of the total unique idea units, with an interaction term between sleep condition (sleep/wakefulness) and session (immediate/delayed), while controlling for number of words, number of sentences, and word repeats. Random effects of subject and animal were also included to account for any such differences. The interaction term (assessing an effect of sleep) was not significant (p = .063), indicating that the number of unique idea units did not differ by the presence of sleep. Similarly, in a linear mixed-effects model of the sleep effect on participants’ word repeats, the interaction term (sleep condition × session) was not significant (p = .541), indicating that the total word repeats did not differ based on sleep.

Discussion

This investigation used LSA to examine how semantic coherence in memory recall is influenced by post-encoding sleep. We found that both SSC and TSC significantly declined after post-encoding sleep, compared to wakefulness. This shift held after controlling for the number of unique idea units and word repeats, indicating that it was not driven by recall conciseness or repetitiveness.

A number of studies have shown a beneficial role of sleep in consolidating hippocampus-dependent memories (Inostroza & Born, 2013). For instance, sleep facilitates the free recall of episodic details (Aly & Moscovitch, 2010; Lahl et al., 2008), possibly through the consolidation of new relational associations (Coutanche et al., 2013; Ellenbogen et al., 2007) between elements (Hunt & Einstein, 1981; Long & Kahana, 2015). Moreover, previous work has shown that sleep processes might change the features of hippocampal representation through a form of re-encoding during consolidation (Inostroza & Born, 2013; Karpicke & Roediger, 2008). Consistent with theories involving the transformation of episodic representations, we found that both SSC and TSC were significantly reduced after post-encoding sleep, compared to wakefulness. An earlier analysis of these data (Coutanche et al., 2020) found that sleep can help protect the free recall of different aspects of episodic memory (temporal, spatial, and attributes) from the decline seen after wakefulness. Our novel analyses suggest that, as well as protecting from decline, sleep can qualitatively change the semantic organization of free-recalled information. This highlights the influence of both quantitative and qualitative shifts that occur through sleep-induced consolidation.

An alternative result from our analyses would have been for semantic coherence to increase after sleep, compared to wakefulness, possibly due to hippocampal replay of the encountered episodes. Our findings are, however, consistent with a role for sleep in pattern separation, which has been found to stabilize after sleep compared to its deterioration after wakefulness (Hanert et al., 2017). Pattern separation is a principal feature of hippocampus-dependent memory reprocessing (McNaughton & Morris, 1987; O’Reilly & McClelland, 1994), and refers to the creation of non-overlapping neural representations for similar episodic inputs by the hippocampal dentate gyrus (Guzowski et al., 2004; Leutgeb et al., 2007; McClelland et al., 1995). The decline we observed in semantic coherence after post-encoding sleep may reflect the reconstructed orthogonal representations that follow pattern separation. In other words, neural pattern reactivation during sleep may counter trace decay through the formation of non-overlapping neural representations (Hanert et al., 2017), likely at the cost of their semantic coherence. These results add to the evidence from previous studies that identified a role for sleep in forming distinct neural representations (Drosopoulos et al., 2005; Ellenbogen et al., 2009; Fenn et al., 2009; Gais et al., 2000).

Cutler et al. (2019) recently reported results addressing the influential role of the hippocampus in a feature generation task using a vector space model (but with word-level metrics rather than sentence-level coherence). Features generated by healthy controls were found to be less similar to the presented target concept, and located farther from each other in semantic space, than were features generated by patients with hippocampal damage. The authors proposed a hippocampus curation hypothesis, in which memory representations may have degraded in the patients due to the absence of a typical hippocampal memory consolidation process. In line with this theory, memory representations in our wake group may have degraded without the reactivation provided by sleep-dependent consolidation. Impoverished memory representations may make remote features less accessible, leading to responses generated by the wake group being more similar and closer in semantic space. Subsequent to Cutler and colleagues’ findings, Solomon and Schapiro (2020) proposed that the hippocampus might contribute a dynamic connective code to semantic search, which allows the search process to quickly access relatively distant features. Our finding that responses produced by participants who slept are less similar and more sparsely located in semantic space supports the notion that hippocampus-dependent consolidation contributes to dynamic memory search, increasing accessibility to features located farther apart in the semantic space. Altogether, our findings argue that hippocampal replay during sleep affects the features of reconstructed neural representations in a way that reduces semantic coherence.

In this study, we selected videos that featured unfamiliar animals, based on the response of an independent norming group. Accounting for participants’ ratings of prior familiarity in our models showed that prior familiarity with the featured animals did not significantly influence the semantic coherence of the recalled episodes. A possible direction of future work would be to investigate the effect of prior familiarity on memory free-recall tasks by including animals from a wide range of familiarity levels. Additionally, semantic coherence is only one aspect of free-recall memory. We briefly touched upon others here, including recall conciseness and repetition. Future studies might more directly investigate how these, and other properties of memory, are shifted by sleep.

Notes

  1. AM or PM indicates time of encoding and immediate recall.

  2. Although this participant referred to the animals as “mice,” they were actually shrews.

  3. “PM” indicates time of encoding and immediate recall.