When ready to respond, the participant pressed the microphone button on the right side of the screen and spoke their response. There was no limit to the time they could take to press the microphone button, but once they pressed this button, they only had 3 s to articulate their response.
A further button press then moved on to the next trial. Participants also completed 27 trials of a 3-alternative forced choice task. Trials were blocked by category. Each trial showed a picture of a speaker with three buttons underneath.
Each button showed a picture drawn from the target set (creatures, plants, or shells). One of the pictures corresponded to the target word and the other two were foils. The speaker lit up as the target word was said and participants were asked to choose the matching picture as quickly as they could after they heard the target. The buttons could not be clicked until the sound had stopped playing, to ensure that participants only made their choice once the pseudoword had been said.
Items were scored 1 for accurate choices and 0 for inaccurate choices, and these scores were averaged by category; chance level for this task would be 0.33 (one target among three options). Participants completed two sessions spaced exactly 1 week apart, each of which was roughly an hour long.
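The recognition scoring described above can be illustrated with a minimal sketch. The trial records and picture labels below are hypothetical and are not taken from the study data; only the category names, the 1/0 scoring rule, and the three-alternative format come from the text.

```python
from collections import defaultdict

N_ALTERNATIVES = 3                    # one target plus two foils per trial
CHANCE_LEVEL = 1 / N_ALTERNATIVES     # expected accuracy when guessing at random


def score_recognition(trials):
    """Return mean accuracy per category.

    trials: iterable of (category, chosen_picture, target_picture) tuples.
    Each trial is scored 1 if the chosen picture is the target, else 0.
    """
    scores = defaultdict(list)
    for category, chosen, target in trials:
        scores[category].append(1 if chosen == target else 0)
    return {cat: sum(vals) / len(vals) for cat, vals in scores.items()}


# Hypothetical trials for illustration only (labels A-G are made up).
trials = [
    ("creatures", "A", "A"),
    ("creatures", "B", "C"),
    ("plants", "D", "D"),
    ("plants", "E", "E"),
    ("shells", "F", "G"),
]
accuracy = score_recognition(trials)
```

Per-category means can then be compared against CHANCE_LEVEL to judge whether participants were responding above chance.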
During the first session, they provided demographic details and then completed the word learning game. Immediately after they had finished the training phase, they completed the first cued recall and recognition tests.
Recognition was always completed after the cued recall test so participants could not use recent exposure to phonological forms to improve their cued recall performance. Participants were then given a short questionnaire to assess if they had used any strategies to complete the game and if they were familiar with any of the words or pictures in the test. If participants had time, they completed this questionnaire by the end of session 1.
At the start of the second session, participants were given the cued recall and recognition tests for a second time (cued recall was completed before recognition). About 30 min after they completed the initial phase of CVLT-II, they completed the late phase and were then paid for their time. We scored all audio-recorded productions during the cued recall phase as accurate (1) or inaccurate (0).
These were then averaged to calculate cued recall accuracy over the different levels of syllable and training condition. A second rater coded all the words produced in the cued recall condition. We also calculated normalised Levenshtein distance (normLD) scores between the presented sequence and the participant's cued recall production for each of these words.
The Levenshtein distance (LD) is the smallest number of edit operations (insertion, substitution, or deletion of a single character) necessary to modify one string to obtain another.
By transcribing these data using the International Phonetic Alphabet, we calculated LD in phonemic units. N is the number of units in the sequence (for further details on normalisation, see [ 50 ]).
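As an illustration of the distance measure defined above, the following sketch computes LD over sequences of phonemic units. The normalisation shown (LD divided by N, the number of units in the presented sequence) is an assumption made for this example; the exact formula used in the study is given in [ 50 ] and may differ. The phoneme sequences are hypothetical.

```python
def levenshtein(source, target):
    """Smallest number of single-unit insertions, deletions, or
    substitutions needed to turn `source` into `target`."""
    previous = list(range(len(target) + 1))
    for i, s_unit in enumerate(source, start=1):
        current = [i]
        for j, t_unit in enumerate(target, start=1):
            cost = 0 if s_unit == t_unit else 1
            current.append(min(
                previous[j] + 1,         # delete a unit from source
                current[j - 1] + 1,      # insert a unit into source
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def norm_ld(presented, produced):
    """ASSUMPTION: normalise by N, the length of the presented sequence."""
    n = len(presented)
    if n == 0:
        raise ValueError("presented sequence must not be empty")
    return levenshtein(presented, produced) / n


# Hypothetical phonemic transcriptions (one list element per phoneme).
presented = ["b", "a", "t", "u"]
produced = ["b", "a", "d", "u"]
ld = levenshtein(presented, produced)
score = norm_ld(presented, produced)
```

Working on lists of phoneme symbols rather than raw character strings keeps multi-character IPA symbols together as single units.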
However, as these might be of future interest, normLD scores are available in the data tables on the Open Science Framework. We report results for the recall and recognition tests immediately after training (Week 0) and 7 days after training (Week 1). We present results for training condition and syllable length separately, as we had no hypotheses about an interaction between these two factors. All other analyses are reported in exploratory results.
We present results from the classical hypothesis testing analyses that we pre-registered and that, if significant, would allow us to reject the null hypothesis, and then report additional Bayesian analyses using JASP 0. Cued Recall: Taken together, our first two hypotheses were that we would observe a time x condition interaction on the accuracy measure of cued recall testing. The data were also examined using Bayesian analyses [ 51 ], which allow us to avoid some statistical issues related to p-values (for example, the setting of an arbitrary criterion to achieve significance [ 52 ]).
The Bayesian approach adopted employs Bayes factors to compare support for the alternative or experimental hypothesis relative to the null hypothesis. We use the default priors implemented in JASP v 0.
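To make the idea of an inclusion Bayes factor concrete, the sketch below uses one common formulation: the posterior odds that an effect is included, divided by the prior odds of inclusion, with both summed across all candidate models. The model set, the uniform priors, and the posterior probabilities are hypothetical numbers chosen for illustration; they are not the study's results, and this is not a reproduction of the JASP implementation.

```python
def inclusion_bayes_factor(models, effect):
    """Change from prior to posterior inclusion odds for `effect`.

    models: list of dicts with keys 'effects' (set of effect names),
            'prior' and 'posterior' (model probabilities, each summing to 1).
    Assumes at least one model includes the effect and one excludes it.
    """
    prior_incl = sum(m["prior"] for m in models if effect in m["effects"])
    post_incl = sum(m["posterior"] for m in models if effect in m["effects"])
    prior_odds = prior_incl / (1 - prior_incl)
    post_odds = post_incl / (1 - post_incl)
    return post_odds / prior_odds


# Hypothetical model space with uniform priors and made-up posteriors.
models = [
    {"effects": set(), "prior": 0.2, "posterior": 0.01},
    {"effects": {"time"}, "prior": 0.2, "posterior": 0.60},
    {"effects": {"condition"}, "prior": 0.2, "posterior": 0.01},
    {"effects": {"time", "condition"}, "prior": 0.2, "posterior": 0.30},
    {"effects": {"time", "condition", "time x condition"},
     "prior": 0.2, "posterior": 0.08},
]
bf_time = inclusion_bayes_factor(models, "time")
bf_condition = inclusion_bayes_factor(models, "condition")
```

In this made-up example the value for time is well above 1 (the data favour models that include it), while the value for condition is below 1 (the data favour models without it).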
This is the Bayes factor averaged across all the models that include the effect of interest, compared to all the models that do not include this effect. This would be considered decisive evidence for the alternative hypothesis [ 55 ]. The Bayesian inclusion factors (BF10) for condition and the interaction between time and condition were 1.
Effect of Training Condition on recall and recognition. The bars show the mean accuracy in each training condition over Week 0 and Week 1. Individual datapoints show the score achieved by each participant by condition. The dotted line on the recognition graphs denotes chance.
We therefore did not run parametric statistics on these measures.
The main effect of time is as noted above. The interaction was driven by a reduced rate of forgetting between Week 0 and Week 1 for the 4-syllable words, relative to the 2-syllable and 3-syllable words. The Bayesian inclusion factors indicated decisive to strong effects for all three factors, Time 7.
Effect of Word Length on recall and recognition. The bars show the average accuracy for each syllable length over Week 0 and Week 1.
Therefore, we did not use parametric analysis of variance to analyse these data.
In our pre-registered analysis, we only planned to conduct follow-up t-tests if the main effects of training condition or the interaction between training condition and time were significant. This was not the case. However, in these exploratory analyses, we sought to quantify specific evidence for hypotheses 1 and 2 using t-tests, and assess in what ways the data looked different from our predictions. Hypothesis 1 posited that training conditions with more listening exposures to stimuli (Reproduce and Restudy) would lead to greater accuracy immediately after training relative to the condition where they received fewer exposures (cued Recall).
To evaluate this hypothesis, separate paired t-tests were used to compare the accuracy scores for cued recall test at week 0 of stimuli studied under the condition of cued Recall with those studied under the Restudy and the Reproduce conditions. The estimated Bayes factor indicated that the data were approximately 3.
Hypothesis 2 stated that accuracy at Week 1 would be highest for the words studied under the Recall condition relative to the other two conditions. We tested this using separate t-tests with cued recall accuracy as a dependent measure.
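The paired comparisons used for hypotheses 1 and 2 can be sketched as follows. This computes only the paired t statistic and its degrees of freedom using the Python standard library; the accuracy values are hypothetical, and a statistics package would normally be used to obtain the p-value.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched samples.

    first, second: per-participant scores in two conditions, same order.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-participant accuracy in two training conditions.
recall_acc = [0.6, 0.5, 0.8, 0.4, 0.7]
restudy_acc = [0.4, 0.5, 0.6, 0.3, 0.5]
t_stat, df = paired_t(recall_acc, restudy_acc)
```

Because each participant contributes a score to both conditions, the test operates on within-participant differences rather than on the two sets of scores independently.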
The estimated Bayes factors indicated that these data were approximately 2. We considered whether we could replicate the classic production effect reported for written words, that is, a benefit of learning words in the Reproduce relative to the Restudy condition. This effect is traditionally observed in tests immediately following the production trials. Using a one-tailed paired t-test, we examined whether the Reproduction condition resulted in higher accuracy for cued Recall test scores than the Restudy condition at Week 0.
Using a directional Bayesian paired t-test (testing if accuracy in the Reproduction condition exceeded accuracy in the Restudy condition, excluding effects in the opposite direction), the Bayes Factor was estimated to be 0.
This constitutes only inconclusive evidence in support of the alternative hypothesis. Using a one-tailed Bayesian paired t-test, the Bayes Factor was estimated to be 0. This is weak evidence in support of the null. We also examined possible relationships between IQ, verbal memory, age and word learning ability as assessed in Week 0. Age and IQ did not account for significant variance in this model.
Overall, recognition accuracy was high immediately after training and on the delayed test that followed a week after training, indicating that participants were able to form associations between the novel picture sets and pseudowords. As participants were at ceiling on recognition, the recognition task was not useful for exploring differences between training conditions, and we discuss the remainder of the results only with respect to cued recall performance assessed by oral production of the target pseudoword in response to the visual referent.
We had predicted that accuracy in the cued Recall condition would be worse than in the Reproduce and Restudy conditions immediately after training, but the pattern of data for cued recall accuracy went opposite to prediction, with an advantage for the cued Recall over the Restudy condition though not over the Reproduce condition.
In addition, contrary to our prediction, there was no boost for the cued Recall condition over time relative to the other two conditions. Rather, we found that participants forgot all words over time regardless of the training condition. Thus active and effortful manipulation of words, both via retrieval and reproduction, relative to passively listening to words, did not confer a long-term learning advantage.
We treated the word-length effect as a positive control, i. In this case the prediction was that shorter words should be easier to learn than longer words. There was strong evidence of this effect in the production task, indicating that we were both adequately powered and measuring relevant aspects of phonological and speech motor learning.
While previous work has shown that the word-length effect is a stable and robust phenomenon [ 44 , 56 ], we also show that this effect persists over a period of 1 week. These findings are unsurprising: the longest words were associated with the lowest production accuracy at both immediate testing and after a delay of 1 week. We did observe a syllable length x time interaction that we did not predict. This appears to suggest that once a longer 4-syllable item is encoded, it is more resistant to forgetting.
Further testing is required to confirm whether this effect is specific to the words we included, or whether this would generalise to other samples. In contrast to word length, we found mixed evidence for testing and production effects. Although there was only inconclusive evidence in support of a time x condition interaction, exploratory testing for our specific hypotheses allows us to shed some light on the pattern of data we observed when testing immediately after training, and a week following training.
We found some support for the testing effect and the production effect immediately after training (these are discussed below). Yet, despite this initial pattern of results, there was no evidence to suggest that testing or production benefits persisted a week after training. First, the testing enhancement did not translate into better retention at the one-week re-test. This means that we did not replicate the classic testing effect, which is associated with not just better performance but reduced forgetting [ 6 , 11 , 16 ].
Second, unlike Ozubko and colleagues [ 34 ], we observed no beneficial effect of production over a longer time-span. The lack of these differences might be accounted for by procedural variations between previous training tasks and the one we employed. For instance, the lack of a sustained benefit of cued Recall over Restudy at Week 1 might be because we tested memory at a different stage of encoding.
In the Karpicke and Roediger [ 6 ] study, participants were allowed to learn until they achieved correct recall of the target-response pairing, and only at this stage were recall or restudy regimes put in place.
In contrast, we used the same number of exposure trials for all conditions, and this may have led to a reduced performance boost for the test condition. The long-term production advantage in the Ozubko et al. However, the fact that we did observe a difference between conditions immediately after training would temper this argument.
Another possible influence on our results at Week 1 is the fact that we tested all words immediately after training. Single instances of testing have been shown in previous studies to lead to an improvement in performance [ 57 , 58 ]. It is possible that we enhanced learning in the Restudy and Reproduce conditions by providing an opportunity to practice retrieval of these words by testing performance on all words immediately after training. The bifurcation hypothesis [ 25 ] predicts that successful retrieval of items at the final test immediately after training would confer a substantial advantage to these items.
Therefore, by assessing performance of words learned in the non-retrieval conditions, we may have inadvertently provided a retrieval opportunity, which provided a learning boost in these conditions. In order to explore this question more carefully, it would be necessary to use a design where only half of the words from all conditions were tested immediately after training, and then all the words were tested a week after training.
If the single instance of retrieval is of benefit, then the untested words in the Reproduce and Restudy conditions would show no enhancement. However, it is worth highlighting that this would suggest that a single instance of cued Recall had the same effect as five instances of cued Recall with interleaved exposure, which is somewhat unlikely. A goal of this study was to assess whether tasks purported to rely on different neurobiological pathways lead to differences in behavioural accuracy, but we did not find any such differences in healthy adults.
We note that the lack of behavioural differences does not argue against the use of different neurobiological pathways to accomplish this learning.
In healthy adults, it is entirely possible that learning via different pathways offers the same learning benefits.
Therefore, the ideal way to address this question would be to use similar tasks with populations where one of the learning pathways is compromised.
Thus, this is an issue to be addressed in future studies, using either patient groups, or using drug manipulations that affect the functioning of dopaminergic systems. We did observe a strong testing effect immediately after training, although we note that the direction of the effect was contrary to our predictions [ 16 ].
However, our result is consistent with the bifurcation hypothesis [ 25 ], which maintains that successfully retrieved items receive a boost that restudied items do not, so that high exposure to items might allow a testing enhancement to be observed at short intervals. Given the nature of our testing materials, this enhancement is unlikely to result from elaborative semantic processing, as both the visual referent and the phonological form were unfamiliar to the participant.
More suitable explanations are offered by retrieval effort hypotheses [ 22 ] discussed earlier. Procedural variations introduced in our study may have influenced the strength of the production effect.
This may have led to better performance by improving attention to all words. Alternatively, participants may have covertly applied reproduction or recall strategies that were required in other blocks. There is some evidence that covert retrieval is as effective as overt retrieval [ 14 ], and that covert reproduction involves the same mechanisms as overt reproduction [ 40 ].
Participants typically do not assume that testing leads to more effective learning [ 6 ], and therefore they would be unlikely to apply this strategy more broadly. Finally, we found that performance on stimuli learnt under the cued Recall and Reproduce conditions was indistinguishable when tested immediately after training.
This result was not what we hypothesised and contrasts with previous findings in foreign vocabulary learning in which practice with recall results in greater learning accuracy [ 8 ]. This suggests that the similarities between the cued Recall and Reproduce conditions (encoding and producing the word form) may have been more important than their differences (retrieval mode, level of cognitive processing).
A factor that further distinguishes our study from previous work on the testing and production effects is the focus on oral production rather than testing recall via written means [ 6 , 9 ]. Despite testing healthy young adults, we found a great deal of individual variation with respect to production performance. We found that some participants were unable to recall accurately any of the words they had just learned, despite receiving at least five auditory exposures.
On the other hand, some participants could accurately recall all of the words. This indicates that participants were able to match words to their referents. Consequently, the individual variation in production must stem from the phonological and motor aspects of novel word learning. This is a non-trivial process even in adulthood, especially when a phonological form has to be learned aurally, and cannot be derived from existing vocabulary. The fact that the verbal recall of a list of words predicted performance on novel word learning suggests that short term memory and chunking processes may aid learning.
So how could such phonological learning be improved? Previous developmental studies have also suggested that learners benefit from the presence of orthography [ 60 ], which may help learners segment phonological chunks in novel words. Future studies that assess whether participants are better able to learn these words when they are presented in their written form, and if recall accuracy differs across the spoken and written form, are warranted. Other studies have also suggested that feedback can enhance the testing effect [ 61 , 62 ], although many of these studies do not test the learning of novel phonology.
In our paradigm, even though participants had a chance to receive further exposures to a target-referent pair, no corrective feedback was given. In summary, it is clear that our primary hypothesis about training conditions conferring specific advantages for oral vocabulary learning was not supported by our data.
In other words, the results from our study suggest that in training expressive vocabulary, reproducing, recalling or restudying a word leads to similar production accuracy over the long term.
There may, of course, be practical reasons to prefer one training method over another: a busy teacher might find that it is far easier to get students to imitate new words rather than designing tests for recall practice. Students might prefer restudying to the anxiety associated with tests. We used one specific training paradigm and it is possible that variations in procedure could lead to one of these conditions inducing better learning.
For example, providing only one set of instructions to a participant and assessing between-subject effects to maintain purity of condition, or providing feedback on performance [ 62 ], might serve to enhance the effects of these conditions. Changing the set size or the phonotactics of the words to be learned might also lead to a different pattern of results. Nevertheless, our findings indicate that we cannot assume that classic training effects will generalise beyond the written paradigms that are typically used.
This is particularly important for translational purposes. A recent study found that memory strategies that were robust in laboratory settings could not be replicated in real-world settings such as classrooms [ 63 ].
The authors argued this could be because of increased noise, the presence of other tasks, and the overall performance difficulty of conditions associated with better learning in the lab. Therefore, pinning down both what does and does not work is equally important to help us assess what training conditions could confer benefit in clinical and educational settings.
In the memory literature, recognition is the process involving detection of familiarity with an event or item, whereas recall involves the retrieval of related details from memory.
Semantic advantage for learning new phonological form representations. J Cogn Neurosci.
Articulating novel words: Children's Oromotor skills predict nonword repetition abilities. J Speech Lang Hear Res.
A model linking immediate serial recall, the Hebb repetition effect and the learning of phonological word forms.
Litt RA, Nation K. The nature and specificity of paired associate learning deficits in children with dyslexia. J Mem Lang.
The declarative system in children with specific language impairment: a comparison of meaningful and meaningless auditory-visual paired associate learning. BMC Psychol.
The critical importance of retrieval for learning.
Carrier M, Pashler H. The influence of retrieval on retention. Mem & Cogn.
Often we start with a piece of information that is easier to recall to narrow down our choices, then we go through the resulting choices one by one and recognize the relevant one. This transforms your task into one of scanning the SERP (search engine results page) and relying on recognition to pick out the desired website from among the other options listed.
In fact, a paper by Eytan Adar, Jaime Teevan, and Susan Dumais showed that this method of retracing the path to a previous page is the preferred method for revisiting content on the web.
Search does require users to generate query terms from scratch — which most people are bad at — but from then on users are able to rely on recognition while using the search results. This is one of the reasons search engines have become such an essential tool for using the web. Search suggestions are a major advance in search usability because they partly transform the query generation task from one of recall to one of recognition.
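A minimal sketch of how search suggestions turn recall into recognition: the user types only a short prefix and then picks from a visible list. The function name, the query list, and the simple prefix-matching rule are all illustrative assumptions; real search engines use far more elaborate ranking.

```python
def suggest(prefix, past_queries, limit=5):
    """Return up to `limit` past queries that start with `prefix`.

    past_queries is assumed to be ordered with the most useful
    (for example, most recent or most frequent) queries first.
    Matching ignores case, and duplicate queries are shown once.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    seen = set()
    matches = []
    for query in past_queries:
        key = query.lower()
        if key.startswith(prefix) and key not in seen:
            seen.add(key)
            matches.append(query)
        if len(matches) == limit:
            break
    return matches


# Hypothetical query history for illustration.
history = ["strikethrough text", "string format", "recipe", "Strikethrough text"]
options = suggest("str", history)
```

The user only has to recall the first few letters; the rest of the query is recognized from the list of options.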
The classic example of recall in an interface is login. When you log in to a site, you have to remember both a username or email and a password. You receive very few cues to help you with that memory retrieval: usually, just the site itself. Some people make it easier for themselves by using the same credentials everywhere on the web. Others create a password that is related to the site e. And many others just keep their passwords somewhere on their computer or on a piece of paper. A menu system is the most classic example of a recognition-based user interface: the computer shows you the available commands, and you recognize the one you want.
Before the advent of graphical user interfaces you would have had to recall the name of this rarely used formatting feature. A difficult and error-prone task. Now, however, you look at the menu of formatting options and easily recognize the term strikethrough as being the one you want. How do you promote recognition? By making information and interface functions visible and easily accessible. You can make both the content and the interface easy to remember; both can benefit from designing for recognition rather than recall.
Providing access to the pages recently visited and searches performed in the near past can help users resume tasks that they left incomplete and that they may have a hard time recalling. Search engines such as Google and Bing often help users retrace their searches by providing past histories. Amazon and many other ecommerce websites show users lists of items that they visited recently.
These lists help users remember to finish a purchase that they may have started a few days ago. Other tools that let users save information in an app or on a website favorites, wish lists, shopping lists, etc. Command-line interfaces are an example of interfaces that are based on recall. If you want to rename a file called myfile in a UNIX system, you would have to type the command mv myfile yourfile.
You would have to recall not only that mv is the command for move, but also the correct order of the arguments. When direct manipulation and WYSIWYG came around, the idea was to replace some of these commands with actions that would be intuitive, so people would not need to recall anything from memory.