Eye Movements in Reading

  1. Why study eye movements?
    1. Eye movements represent an intersection of 4 major cognitive systems: Language processing, attention, vision, and oculomotor control.
      1. Eye movements represent the interface between high-level cognition (language) and the perceptual-motor loop (visual-oculomotor).
      2. We make approximately 250,000 eye movements each day. How do we decide where to look? In most cases, it doesn’t seem to be a conscious process.
    2. Eye movements are recorded (using eye-tracking devices) and these measurements are routinely used to make inferences about language processing during reading.
      1. This methodology rests on the assumption that there is a tight link between eye-movement control and language processing during reading; i.e., language processing determines when and/or where the eyes move.
    3. Two theoretical "camps" regarding the issue of eye-movement control during reading:
      1. Global control – eye movements do not really reflect moment-by-moment cognitive processing demands during reading; instead, they reflect a global "rate" (e.g., move eyes quickly) that is adjusted to accommodate the text difficulty level (e.g., O’Regan, 1984).
      2. Direct control – eye movements reflect moment-by-moment cognitive processing demands during reading (e.g., Just & Carpenter, 1987).
        1. Immediacy Hypothesis – the eyes do not move until all processing (e.g., identification, integration of meaning into discourse representation, etc.) of a word has been completed.
        2. Eye-Mind Hypothesis – only the word that is currently being looked at will be processed.
      3. These two "camps" represent the extremes, or end-points, of a theoretical continuum. The best available evidence suggests a more intermediate position, in which language processing primarily controls when the eyes move, while visual processing and properties of the oculomotor system control where the eyes move (Reichle et al., 1998).
  2. Basic methodology and measurement issues:
    1. Eye-tracking devices interface an eye-tracker (which records when and where the eyes fixate with millisecond accuracy) with a computer and a cathode-ray tube (display).
    2. Subjectively, the eyes (and mind) seem to move smoothly across the page, only stopping (or reversing) when we pause to think or when we encounter some difficulty understanding what we are reading.
      1. This impression is false; instead, the eyes are immobile for brief periods called fixations (which last 150-400 ms) that are separated by very rapid (20-35 ms) ballistic eye movements called saccades (which means "jump" in French). FIGURE 4.1 (sample eye movement corpus)
      2. Approximately 80% of words are fixated.
        1. Words tend to be fixated somewhere between the beginning and middle of the word, at the preferred viewing location.
        2. Because of oculomotor error, the fixation locations tend to be normally distributed.
      3. This pattern of eye movements reflects visual acuity limitations.
        1. High-resolution vision (i.e., the quality of vision needed to identify small, complex shapes, like letters) is limited to the fovea, or center 2° of the retina.
        2. Visual acuity decreases rapidly from the fovea to the parafovea (which extends out to 5° on either side of the fovea) to the periphery of the retina.
      4. Fixations are usually 200-300 ms in duration, but range from 50-500 ms. FIGURE 4.2 (fixation duration and saccade length distributions)
      5. Saccades typically move the eyes forward 7-9 character spaces (or approximately 2°).
        1. Character spaces are the appropriate measure (not visual angle) because saccade length (in characters) is not affected by font size or by the distance between the reader and the text.
          1. This reflects the inherent tradeoff between foveal vs. parafoveal acuity.
          2. As the reading distance increases, more visual information is processed in the fovea, but the resolution of this information also decreases because its retinal image becomes smaller.
        2. 10-15% of saccades consist of regressions, or movements back to earlier parts of the text.
          1. Regressions usually reflect higher-level language processing difficulty.
      6. Useful visual information is only extracted during fixations.
        1. The visual information on the retina during saccades (a smear) is "masked" by the information that is extracted from the subsequent fixation.
        2. Ex: Try to see your eyes move in the mirror.
        3. "Slide show" metaphor – reading is like a slide show in which the eyes view each "slide" (i.e., the information being looked at) for about a quarter of a second and then move to the next slide (i.e., the next location).
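The relation between character spaces and visual angle described above can be checked with a little trigonometry. The following is a minimal sketch, not a calibration procedure: the character width (0.25 cm), the viewing distances, and the helper names `visual_angle_deg` and `chars_per_degree` are all assumptions chosen for illustration and do not come from the notes.

```python
import math

CHAR_WIDTH_CM = 0.25  # assumed width of one character space (illustrative only)

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of size_cm at distance_cm."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

def chars_per_degree(distance_cm: float, char_width_cm: float = CHAR_WIDTH_CM) -> float:
    """Approximate number of character spaces that fit into one degree of visual angle."""
    return 1.0 / visual_angle_deg(char_width_cm, distance_cm)

# The same 8-character saccade at three assumed viewing distances.
for distance in (40, 60, 80):
    angle = visual_angle_deg(8 * CHAR_WIDTH_CM, distance)
    print(f"{distance} cm: 8 characters span {angle:.2f} deg "
          f"({chars_per_degree(distance):.1f} characters per degree)")
```

Under these assumed values, the same 8-character saccade spans a smaller visual angle as the reader moves away from the text, while more characters fit into each degree. This is the tradeoff the notes describe: more letters fall in the fovea, but each letter's retinal image is smaller.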
    3. Eye-movement data is quantified using several basic word-based measurements (Fig. 4.1, eye movement corpus):
      1. First fixation duration – the duration (in ms) of the first fixation on a word during the first pass through the text (i.e., excluding fixations that occur after regressions)
      2. Gaze duration – the sum of all first-pass fixations on a word.
      3. Single fixation duration – first-pass fixation duration on a word that is only fixated once.
      4. Probability of skipping – mean probability (across subjects) that a word is skipped.
      5. Probability of making a single fixation – mean probability (across subjects) that a word is fixated exactly once.
      6. Probability of making a refixation – mean probability (across subjects) that a word is fixated exactly twice. (Few words are fixated more than twice.)
      7. Total viewing time – the sum of all fixations on a word, including those that occur after regressions.
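The word-based measures listed above can be computed from an ordered record of fixations. The sketch below is one possible way to do so for a single reader and a single text. It assumes the fixations have already been assigned to words and are given as hypothetical `(word_index, duration_ms)` pairs; the function name `word_measures` and the sample data are invented for illustration. A fixation is treated as first-pass only if the word has not yet been left and no later word was fixated before it.

```python
from collections import defaultdict

def word_measures(fixations, n_words):
    """Compute per-word reading measures for one reader on one text.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    """
    first_pass = defaultdict(list)  # word -> first-pass fixation durations
    total = defaultdict(int)        # word -> summed duration of all fixations
    finished = set()                # words whose first pass is over
    furthest = -1                   # rightmost word fixated so far
    previous = None

    for word, duration in fixations:
        total[word] += duration
        if previous is not None and word != previous:
            finished.add(previous)  # leaving a word ends its first pass
        # First-pass only if the word has not been left before and
        # no later word was fixated before it.
        if word not in finished and word >= furthest:
            first_pass[word].append(duration)
        furthest = max(furthest, word)
        previous = word

    rows = []
    for word in range(n_words):
        fp = first_pass[word]
        rows.append({
            "first_fixation": fp[0] if fp else None,
            "gaze": sum(fp) if fp else None,
            "single_fixation": fp[0] if len(fp) == 1 else None,
            "skipped": not fp,
            "total_time": total[word],
        })
    return rows

# Invented example: word 1 is refixated, word 2 is skipped and later regressed to.
fixations = [(0, 210), (1, 180), (1, 150), (3, 240), (2, 200), (4, 260)]
for index, row in enumerate(word_measures(fixations, n_words=5)):
    print(index, row)
```

In this invented record, word 2 is skipped during the first pass and only fixated after a regression, so it contributes to total viewing time but has no first fixation or gaze duration. The probability measures would then be obtained by averaging the per-reader outcomes (skipped, fixated once, fixated twice) across subjects.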
    4. As text difficulty increases, mean fixation durations and the number of regressions typically increase, while mean saccade length decreases. TABLE 4.1 (reading times as a function of type of reading material)
      1. The average reading speed for college readers is about 300 words per minute.
      2. Variables that affect fixation durations and skipping probabilities on individual words:
        1. Normative frequency of occurrence (i.e., the familiarity of the word) – common (high-frequency) words are fixated for less time and skipped more often than less common (low-frequency) words.
        2. Predictability – words that are predictable from their surrounding context (e.g., "tire" in "The car ran over a nail and punctured a …") are fixated for less time and skipped more often than less predictable words.
        3. Function words (e.g., "the") are fixated for less time and skipped more often than content words (e.g., nouns, verbs).
        4. Reading for complete comprehension (e.g., studying) requires more time than reading for main ideas (e.g., skimming).
        5. Reading speed slows down at clause boundaries and at the ends of sentences because of "wrap up" (i.e., main ideas need to be integrated into the overall discourse representation).
      3. Speedreading (which teaches people to skim text) is not an effective method of reading.
        1. Although the eyes make many fewer fixations (so that the overall "reading" rate increases), overall comprehension also decreases substantially.
          1. Just et al. (1982) monitored the eye movements of speedreaders (600-700 wpm), normal readers who were instructed to "skim" the text (600-700 wpm), and normal readers instructed to read normally (250 wpm).
          2. The eye movements of speedreaders and "skimmers" looked very similar, and were less "dense" than those of the normal readers.
          3. Speedreaders and "skimmers" performed very poorly on those comprehension questions that required detailed knowledge of the text.
        2. "I took a speed-reading course and read War and Peace in 5 minutes. It’s about Russia." (Woody Allen)
  3. The perceptual span is the size of the effective visual field in a fixation during reading, or the spatial extent to which useful (helpful) visual information can be extracted in a single fixation during reading.
    1. The fact that some words are skipped suggests that they are at least occasionally identified in the parafovea.
    2. How big is the perceptual span?
      1. McConkie & Rayner (1975) addressed this question using a moving-window technique, in which an eye-tracker is used to display a "window" of normal text at locations that are contingent upon where the subject is looking. TABLES 4.2 & 4.3 (examples of moving window paradigm)
        1. Window size had a substantial effect on reading speed.
          1. Reading speed was completely normal with a 31-character window (15 character spaces to either side of fixation).
          2. 7-character windows (windows that, on average, only allow a single word to be viewed on each fixation) reduced the normal reading speed by 60%.
        2. Xs with vs. without blank spaces between words:
          1. The absence of blank spaces between words reduced reading speed when the windows were smaller than 31 character spaces.
          2. This finding indicates that, out to about 15 character spaces from fixation, readers use the spaces between words to guide their eyes into a region of text.
        3. Similar vs. dissimilar letters:
          1. Letter shape differences reduced reading speed when the windows were smaller than 21 character spaces.
          2. This finding indicates that letter shape information is extracted from the page up to 10 character spaces from fixation.
      2. The size of the perceptual span is related to the "density" of different writing systems.
        1. Pollatsek et al. (1981) found that the perceptual span for native Israeli readers reading Hebrew text (which is denser than English text because it often does not include vowels) was smaller than the perceptual span for English readers.
        2. Likewise, Osaka (1987) found that the perceptual span extends 6 characters beyond fixation in Japanese readers reading Japanese (which, again, is much denser than English).
      3. Rayner & Bertera (1979) also addressed this question using a moving-mask paradigm, in which a variable-sized "window" of character spaces around the fixation point is replaced with Xs so as to produce an artificial scotoma (i.e., a blind spot on the retina). TABLE 4.5 (moving mask paradigm)
        1. Masking the fovea (i.e., the central 7 characters around the fixation point) slowed reading to 12 words per minute.
        2. Masking the fovea and part of the parafovea (the central 11-17 characters) made reading virtually impossible.
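The moving-window and moving-mask manipulations described above are mirror images of each other, which a short sketch makes concrete. This is only an illustration of the display logic, not the original experimental software. It assumes the fixation position is already available as a character index (in a real experiment it would come from the eye-tracker on every sample); the function names, parameters, and sample sentence are hypothetical.

```python
def moving_window(text, fixation, left, right, keep_spaces=True):
    """Show normal text only from `left` characters before fixation to `right`
    characters after it; replace everything else with X.

    keep_spaces=True leaves the blank spaces between words visible outside
    the window; keep_spaces=False fills them with X as well.
    """
    out = []
    for i, ch in enumerate(text):
        inside = fixation - left <= i <= fixation + right
        if inside or (keep_spaces and ch == " "):
            out.append(ch)
        else:
            out.append("X")
    return "".join(out)

def moving_mask(text, fixation, half_width):
    """Replace the characters within `half_width` of fixation with X,
    leaving the rest of the line intact (an artificial scotoma)."""
    return "".join(
        "X" if abs(i - fixation) <= half_width else ch
        for i, ch in enumerate(text)
    )

sentence = "the quick brown fox jumps over the lazy dog"
fixation = 12  # index of the "o" in "brown"

print(moving_window(sentence, fixation, left=3, right=3))                     # 7-character window
print(moving_window(sentence, fixation, left=3, right=3, keep_spaces=False))  # spaces filled in
print(moving_window(sentence, fixation, left=4, right=14))                    # asymmetric window
print(moving_mask(sentence, fixation, half_width=3))                          # 7-character mask
```

Separate `left` and `right` extents are used so that the same function covers both symmetric windows (e.g., 15 characters to either side) and asymmetric ones (e.g., 4 to the left and 14 to the right).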
    3. Is the perceptual span symmetrical?
      1. McConkie & Rayner (1976) found that:
        1. Reading speed was normal when the window extended 14 character spaces to either side of fixation.
        2. Reading speed was normal when the window extended 4 spaces to the left of fixation and 14 spaces to the right of fixation.
        3. Reading speed was impaired when the window extended 14 spaces to the left of fixation and 4 spaces to the right of fixation.
      2. Rayner, Well, & Pollatsek (1980) found that the left and right boundaries of the perceptual span are constituted differently:
        1. The left boundary is defined by the beginning (i.e., left edge) of the word that is being fixated.
        2. The right boundary is defined by the number of visible letters.
      3. Pollatsek et al. (1981) found that:
        1. The perceptual span extends further to the left for native Israeli speakers reading Hebrew, which is read from right to left.
        2. The asymmetry of the perceptual span is not "hard-wired," but instead reflects the allocation of attention; the perceptual span for bilingual English-Israeli readers extends further to the right when reading English and further to the left when reading Hebrew.
  4. How is information integrated across saccades?
    1. We don’t notice any discontinuities in what we see when we move our eyes; instead, the visual world appears seamless as we move our eyes from one viewing location to the next.
      1. In the context of reading, the parafoveal information from one fixation is integrated with information from the fovea during the next fixation.
      2. How?
        1. What type of information is integrated?
        2. Is the information visual (e.g., line segments), orthographic (e.g., abstract letter codes), phonological (i.e., sound codes), semantic (i.e., the word’s meaning), or some combination of these?
    2. Evidence against the integration of visual information:
      1. McConkie et al. (1980) slightly shifted entire lines of text that subjects were reading (during saccades).
      2. Subjects rarely noticed these shifts, and reading speed was not affected.
    3. Evidence suggesting that orthographic information is integrated across saccades:
      1. Rayner (1975) used a boundary paradigm to determine what types of information are extracted from the parafovea. FIGURE 4.4 (boundary paradigm)
        1. In this paradigm, a critical word is changed as the reader makes a saccade across an invisible, pre-defined boundary.
        2. The dependent measure is the time spent fixating the base form of the critical word, as a function of its similarity to what was seen in the parafoveal preview.
        3. When the fixation prior to crossing the boundary is far from the critical word (so that there is no parafoveal preview), the nature of the preview does not matter. FIGURE 4.5 (boundary experiment results)
        4. When the fixation prior to crossing the boundary is near the critical word, the critical word is fixated for less time if:
          1. The preview word is identical to the critical word;
          2. The preview word shares letters with (and is similar in shape to) the critical word.
          3. Nonwords having similar shape and sharing letters also reduce the fixation on the critical word, but to a lesser degree.
    4. Evidence for the integration of phonological information: Pollatsek et al. (1992) showed that parafoveal preview of a phonologically similar homophone (e.g., "sun") reduced the fixation duration on a critical target (e.g., "son") as much as an orthographically similar preview word (e.g., "sin").
    5. Evidence against the integration of semantic information: Rayner et al. (1986) showed that parafoveal preview of a semantically related word (e.g., "song") did not reduce the fixation duration on the critical word (e.g., "tune").
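The display-change logic of the boundary paradigm can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure used in the cited studies. Gaze positions are given as hypothetical character indices sampled over time, the carrier sentence is invented, and `boundary_display` is an illustrative name. The preview is shown until the gaze first crosses the invisible boundary, after which the target stays on screen.

```python
def boundary_display(preview_text, target_text, boundary, gaze_positions):
    """Return the text shown at each gaze sample.

    The preview version is displayed until the gaze first moves past
    `boundary` (a character index); from then on the target version is
    displayed, even if the eyes later move back to the left.
    """
    crossed = False
    frames = []
    for position in gaze_positions:
        if position > boundary:
            crossed = True  # the change is made once and is not undone
        frames.append(target_text if crossed else preview_text)
    return frames

# Invented carrier sentence; the preview/target pair follows the homophone example.
template = "The proud father hugged his {} tightly."
boundary = template.index("{}") - 1       # the space just before the critical word
preview = template.format("sun")          # seen in the parafovea
target = template.format("son")           # seen once the boundary is crossed

gaze = [5, 13, 21, 29, 35]                # assumed gaze samples (character indices)
for position, frame in zip(gaze, boundary_display(preview, target, boundary, gaze)):
    print(position, frame)
```

In an actual experiment the swap has to be completed during the saccade that crosses the boundary, so that the reader does not see the change itself. The dependent measure is then the fixation time on the target as a function of which preview was shown.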