Visual Perception

The Five Senses and Beyond: The Encyclopedia of Perception - Jennifer L. Hellier 2017

The visual system is the sensory system responsible for the sense of sight, or vision. Visual perception, by contrast, is the psychological process by which an animal or person interprets a visual image. Specifically, visual perception is how the brain interprets the external environment as conveyed by visible light, and that interpretation can vary from person to person based on previous experience. Because of this distinction, the visual system has its own entry in this encyclopedia. Sight, finally, is the combination of the visual system, visual processing, and visual perception.


For thousands of years, people have realized that the eye and brain are intimately interconnected. The visual pathway from the eyes to the brain was first documented and described by Galen of Pergamon (130–200 CE). Considering the technology of the time, scientists are still amazed at the accuracy of many of Galen’s anatomical drawings and of his physiological understanding of the visual system. He did, however, make a few mistakes that are now better understood with modern scientific tools.

Much of our understanding of the neurophysiology of the visual system is attributed to Canadian neurophysiologist David H. Hubel (1926–2013) and Swedish neurophysiologist Torsten N. Wiesel (1924–). Together, Hubel and Wiesel won the 1981 Nobel Prize in Physiology or Medicine for their contributions to the neuroscience of vision, and they are considered the fathers of visual system research.

Visual Processing

Once an object’s image hits the retina, the neuronal signals for that image are carried by the optic nerve and ultimately reach the visual cortex. There the signal is processed and then passed on to the visual association cortices to be translated into visual perception. The visual cortex lies in the occipital lobe, at the posterior part of the brain. There are two occipital lobes, one in each brain hemisphere, and each receives information from the opposite visual field (a visual field being what the eye sees when it is fixed and looking straight ahead).

The signals are then relayed to the V1 region of the visual cortex, the first stage in the hierarchy of visual processing. The V1 region is also called the primary visual cortex, Brodmann area 17, or the striate cortex. The striate cortex is so named because of the striped appearance of the myelinated fibers within it. The visual cortex is responsible for the initial processing of image information. In humans, it is the largest of all the sensory systems represented in the brain. The neurons located in V1 send their axons to three main brain regions: (1) the extrastriate visual cortices (regions V2, V3, V3a, and V4), for additional processing of the visual signal; (2) the superior colliculus, to modulate eye movements; and (3) the lateral geniculate nucleus (LGN), to provide central control of the sensory input.

Neurons in the extrastriate cortex then project to the medial temporal, inferotemporal, and posterior parietal cortices. A map of the output cells of the retina is represented not only in the LGN but also in the striate and extrastriate cortices. In fact, there are six retinotopic maps in the occipital lobe: one each in V1, V2, V3, V3a, and V4, and one in the middle temporal area that borders the temporal and occipital lobes (called V5, which is important for the perception of movement). In addition, retinal representations are found in the inferotemporal and posterior parietal cortices. The posterior parietal cortex is responsible for integrating somatic and visual sensations. These topographical maps of the retina throughout the visual system show how visual signals are conserved and organized so that the object’s information is preserved in the brain.

In humans, neurons in the striate and extrastriate cortices process the most basic information, such as light intensity, color, orientation, and the lines and edges that make up “bars.” As visual information passes up the hierarchy of the visual cortex, the processing becomes progressively more complex and the representation of the image more complete. When an object’s visual signal reaches the visual association cortices, neurons respond to complete objects seen in the visual field. For example, cells in the visual association cortex will be activated when a specific type of bird, such as an adult bald eagle, is seen. This information then moves into two different pathways of the brain, called the ventral and dorsal streams, which identify “what” the object is and “where” it is in space relative to the person.
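
The sensitivity to bars and orientation described above can be illustrated with a loose computational analogy. The sketch below is not a model of actual cortical circuitry: the 3×3 weight grids, the names VERTICAL_UNIT, HORIZONTAL_UNIT, and response, and the rectification at zero are all assumptions chosen for illustration. It shows only how a unit with an excitatory stripe and inhibitory flanks could respond to a bar of one orientation while staying silent for a bar of another orientation or for uniform light.

```python
# Hypothetical "receptive fields": an excitatory stripe (+2) flanked by
# inhibitory regions (-1). The weights are illustrative, not measured values.
VERTICAL_UNIT = [[-1, 2, -1],
                 [-1, 2, -1],
                 [-1, 2, -1]]
HORIZONTAL_UNIT = [[-1, -1, -1],
                   [2, 2, 2],
                   [-1, -1, -1]]


def response(patch, unit):
    """Weighted sum of a 3x3 light patch under a unit's weights.

    Negative sums are clipped to zero, a simplifying assumption standing in
    for the fact that a firing rate cannot be negative.
    """
    total = sum(
        light * weight
        for patch_row, unit_row in zip(patch, unit)
        for light, weight in zip(patch_row, unit_row)
    )
    return max(0, total)


# Two toy stimuli: a bright vertical bar on a dark field, and uniform light.
vertical_bar = [[0, 1, 0],
                [0, 1, 0],
                [0, 1, 0]]
uniform_light = [[1, 1, 1],
                 [1, 1, 1],
                 [1, 1, 1]]

if __name__ == "__main__":
    print("vertical unit, vertical bar:  ", response(vertical_bar, VERTICAL_UNIT))
    print("horizontal unit, vertical bar:", response(vertical_bar, HORIZONTAL_UNIT))
    print("vertical unit, uniform light: ", response(uniform_light, VERTICAL_UNIT))
```

In this toy setup the vertically tuned unit responds only to the vertical bar, which mirrors in a very simplified way the idea that early visual neurons signal oriented edges rather than overall brightness.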

Visual Perception

Visual perception takes the information from visual processing and attempts to “make sense” of the object. It is also important for the perception of movement, the perception of depth, and figure-ground perception. These three types of visual perception are grounded in Gestalt psychology, which tries to understand how the human eye sees an object first as a whole and then as the sum of its individual parts. Gestalt psychology also looks at how the entire object is anticipated even when the parts are not integrated, as when the “blind spots” of the visual field are filled in to complete the image.

Movement is perceived when neurons in region V5 are activated by the speed and direction of an object. In humans, the vestibular system is also necessary for motion perception, because it allows the speed of the person to be compared against the speed of the object to determine the motion. Some people cannot perceive movement, a rare condition called akinetopsia. Persons with akinetopsia see the world as a series of “still” pictures instead of fluid actions. Research has shown that these individuals have a lesion in V5, which supports V5 as the location of motion perception.

Depth perception is the ability to see the world in three dimensions (3D) and to judge how far an object is from the viewer. It works best with binocular vision and draws on depth cues such as stereopsis, parallax, and convergence of the eyes. Stereopsis is the impression of depth gained by viewing a 3D scene (or external environment) with both eyes. Because the eyes sit at different locations on the head, each sees a slightly different image, and the brain processes this disparity to perceive depth. Parallax determines distance by viewing an object along two different lines of sight and measuring the angle between the two lines. Closer objects have a larger parallax than far objects. Astronomers have long used parallax to estimate the distances to the moon, sun, and stars. Finally, the inward movement of both eyes (looking toward the center) is called convergence. Convergence helps focus the object onto the retina and supports binocular vision and stereopsis.
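
The relationship between parallax angle and distance described above can be made concrete with a little trigonometry. The sketch below is a geometric illustration only, not a description of how the brain computes depth. It assumes the object lies straight ahead, midway between the two viewpoints, so that the two lines of sight form an isosceles triangle with the baseline. The function names and the 0.063 m baseline (a rough stand-in for the spacing between human eyes) are assumptions made for the example.

```python
import math


def parallax_angle(baseline, distance):
    """Angle in radians between two lines of sight to the same object.

    Assumes the object sits on the perpendicular bisector of the baseline,
    i.e. straight ahead and midway between the two viewpoints.
    """
    return 2.0 * math.atan((baseline / 2.0) / distance)


def distance_from_parallax(baseline, angle):
    """Invert parallax_angle: recover distance from baseline and angle."""
    return (baseline / 2.0) / math.tan(angle / 2.0)


# Illustrative baseline in meters (an assumed value, not from the entry).
EYE_SEPARATION = 0.063

if __name__ == "__main__":
    for d in (0.5, 5.0, 50.0):
        angle = parallax_angle(EYE_SEPARATION, d)
        print(f"object at {d} m: parallax of {math.degrees(angle):.3f} degrees")
```

Running the loop shows the angle shrinking as the object moves farther away, which is why parallax is most informative for nearby objects and why measuring very distant objects calls for a much longer baseline.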

Figure-ground perception is the process of distinguishing a “figure” from the “background.” For example, it is how you are reading this entry and seeing each letter and word as its own figure and not as part of the white background. Specifically, figure-ground perception uses the edges (or borders) of the figure, as well as its shape, so that the figure can be perceived in the brain as a single object. One of the most famous examples of figure-ground perception comes from Danish psychologist Edgar Rubin (1886–1951). It is a vase-face drawing that is also considered an optical illusion. The vase is white and sits in the middle of a black background. The black area, however, looks like two faces looking at each other with a white space between them. This drawing, called the Rubin vase, depends on the edges of the vase, which are also the edges of the two faces. If the person focuses on the white side of the border, the brain will see a white vase. But if the person focuses on the black side of the edges, the brain will interpret the image as two faces. The visual system will go back and forth between the two images and the two interpretations.

To determine which region to treat as the “figure” and which as the “ground,” the brain uses other cues based on the size, shape, color, and movement of the object. Generally, the object is smaller than the background; its shape tends to be curved, particularly convex; its color is more distinct and varied than the “monotone” color of the background; and it moves faster than the “static” background.

Jennifer L. Hellier

See also: Blind Spot; Color Blindness; Color Perception; Hubel, David H.; Occipital Lobe; Optic Nerve; Perception; Retina; Sensory Receptors; Tunnel Vision; Visual Fields; Visual System; Wiesel, Torsten N.

Further Reading

Bear, Mark F., Barry W. Connors, & Michael A. Paradiso. (2007). Neuroscience: Exploring the brain (3rd ed.). Baltimore, MD: Lippincott Williams & Wilkins.

Dragoi, Valentin, & Chieyeko Tsuchitani. (1997). Visual processing: Cortical pathways. In Neuroscience Online, an electronic textbook for the neurosciences (Chap. 15). Retrieved from

Kandel, Eric R., James H. Schwartz, Thomas M. Jessell, Steven A. Siegelbaum, & A. J. Hudspeth (Eds.). (2012). Principles of neural science (5th ed.). New York, NY: McGraw-Hill.