Vocal learning, or acquiring vocalizations through imitation rather than instinct, is not unique to humans, but it is extremely rare in the animal kingdom. In fact, only three other distantly related groups of mammals (elephants, bats, and cetaceans, i.e. whales and dolphins) and three distantly related groups of birds (hummingbirds, parrots, and songbirds) are capable of vocal learning.
The group that has received the most attention from neuroscientists with respect to vocal learning is the songbirds. A growing body of research focuses on the brain structures that allow birds to perceive, learn, and generate songs, in the hope that this work may contribute to understanding the neural mechanisms underlying human language acquisition. These structures, which include seven “vocal brain nuclei,” are involved in networks that mediate both the perception and production of sounds, allowing birds to hear themselves, hear others, and control the acoustic structure of their vocal output.
By looking at genes that are up- or down-regulated in response to singing behavior, researchers (notably Dr. Erich Jarvis at Duke University) found that these seven nuclei are strikingly similar in location, connectivity, and behaviorally driven gene expression across all three bird groups. Jarvis speculates that similar brain structures are at play in other vocal learners.
This image shows comparable vocal and auditory brain areas among vocal learning birds and humans. Left hemispheres are shown, as that is the dominant side for human language. From Jarvis, Ann. N.Y. Acad. Sci 1026: 749-777 (2004).
Because human brain lesions and brain imaging studies do not allow for high-resolution analysis, the neuroanatomy of comparable vocal nuclei in humans has not been demonstrated. It is interesting to note, however, that humans, like birds, have brain regions in the cerebrum that control the acoustic structure of their vocal behavior. For example, Broca’s area plays a selective role in speech production; humans with damage to this region have difficulty speaking but show little or no deficit in comprehension. Somewhat surprisingly, the neuroanatomy of similar structures in other vocal learning mammals (elephants, bats, cetaceans) has not been examined.
Given that parrots and hummingbirds diverged from one another about 65 million years ago, and that birds evolved 50-100 million years after mammals, the evolution of vocal learning is perplexing. Not only did this complex behavior evolve in phylogenetically disparate groups, but in every species that has been examined, it is also governed by similar neural structures. How did these striking similarities evolve?
Dr. Jarvis suggests three hypotheses. The first is convergent evolution; that is, similar structures evolved independently in all vocal learners. This implies significant constraints on how these structures can evolve to mediate this complex behavior. (For comparison, the “eye” is believed to have evolved independently eleven times, but each time it has produced dramatically different structures; compare a human eye to that of a spider.)
The second hypothesis is that all vocal learners descended from a common ancestor that was capable of vocal learning, and that the trait was subsequently lost in all non-learners. Given that these animals diverged hundreds of millions of years ago, this would imply that the capacity for vocal learning was lost independently in every other lineage; non-human primates, for instance, would have lost this capacity multiple times before humans evolved with the trait intact. The third hypothesis modifies the first, positing that all animals have rudimentary neural structures for vocal learning, but that these structures have been independently amplified in vocal learners.
All three of these hypotheses are possible, and all are constrained by the rarity of vocal learning in the animal kingdom. Vocal learning has clear benefits: by permitting the modification of sounds, it allows for innovative and flexible communication. This system may be the foundation upon which spoken language in humans was established, and it certainly contributes to reproductive success in songbirds. Further, this flexibility may allow animals to maximize sound propagation in novel environments (e.g., if an animal must adjust from living on an open savannah to living on a heavily forested mountain). If such a useful behavior could evolve independently seven times, why didn’t it evolve more often? Alternatively, if vocal learning was present early on, why would most animals have lost the capacity?
Jarvis’s answer: predation. The ability to make novel, varied sounds, and to maximize their propagation, is an excellent way to advertise one’s presence to potential predators. Thus, for the majority of animals, the benefits of vocal learning are far outweighed by the hazards it brings.
Only seven known groups of distantly related animals are capable of vocal learning, which in itself is fascinating. The fact that seven similar brain structures have evolved in three of these groups only adds to the excitement. Have the brains of mammalian vocal learners evolved similar mechanisms for perceiving and producing sounds? More importantly, have humans? Did this ability to imitate and improvise sounds lead to the evolution of human language?