Although the attributes of musical rhythm are many and varied, the most widely agreed-upon components of rhythmic structure are beat, meter, tempo, and accent. Phrase rhythm, melodic rhythm, rhythm pattern, and rhythm group are some of the names given to the rhythm of the melody and harmony. These parts overlie and/or are entwined with beat, meter, tempo, and accent, making it difficult to separate discussion of physical structure from rhythm as a psychological phenomenon.
The beat is the unit division of musical time and underlies rhythm’s structural components; it generally divides duration into equal segments. Beat is often referred to as pulse, but beats are fundamental to music's metric structure, while pulses are significant in relation to its rhythmic context.
From the acoustic point of view, beats are loudness fluctuations. Beat frequencies occur when two nearly equal frequencies are sounded together. If two tones are about 15 Hz or less apart, interference will result from their similar, though not exactly identical, frequencies. Gradually they will move out of phase until, at 180°, destructive interference results, producing diminished loudness. When they move back into phase, constructive interference will produce increased loudness. Thus, beats are a form of amplitude modulation. As two frequencies are brought closer together, the beats will gradually slow down, and they disappear when the frequencies become identical. Figure 3.1 shows the superposition of two sine waves of 100 Hz and 110 Hz.
Beats recur at a rate equal to the difference between the two frequencies, called the beat frequency. Thus the beat frequency produced by the 100 Hz and 110 Hz sine waves is 10 Hz.
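The relation between the two tones and the resulting loudness fluctuation can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the function names are illustrative assumptions. It relies on the identity sin a + sin b = 2 sin((a + b)/2) cos((a - b)/2), so the slowly varying factor 2|cos(π(f1 - f2)t)| acts as the amplitude envelope of the sum.

```python
import math


def beat_frequency(f1, f2):
    """Rate (Hz) at which loudness fluctuates: the difference of the two tones."""
    return abs(f1 - f2)


def superposition(f1, f2, t):
    """Sum of two unit-amplitude sine waves at time t (seconds)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)


def beat_envelope(f1, f2, t):
    """Amplitude envelope of the sum, from the sum-to-product identity."""
    return 2 * abs(math.cos(math.pi * (f1 - f2) * t))


print(beat_frequency(100, 110))  # 10
```

For the 100 Hz and 110 Hz tones, the envelope is at its maximum at t = 0 s (the tones are in phase) and falls to zero at t = 0.05 s (the tones are 180° out of phase), so one full loudness cycle takes 0.1 s.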
Meter involves a grouping of beats, usually metric beats. In practice, the unit designated by a meter signature as receiving the beat is not always the same as the beat that is felt in response to the music. Thus, the metric beat is that which a meter signature indicates, and the true beat is the beat felt in response to the music. In much music the metric beat coincides with the true beat and is simply referred to as the beat. Meter usually is considered in terms of notation and is commonly indicated by bar lines. In many types of music, the first beat of each measure should receive an accent, thus delineating the meter. It is important to note that music does not always conform mechanically to this pattern, because music is an expressive medium and is not merely mechanical or arithmetic. Meter is usually indicated at the opening of a piece of music by a time signature and is defined in algebraic terms in standard musical notation. Beats are represented most often by either quarter notes (e.g., in 2/4 or 3/4 meter) or eighth notes (e.g., in 4/8 or 6/8 meter).
Tempo refers to the speed at which beats recur. In music notation, tempo is indicated by use of the traditional Italian terms: grave, largo, adagio, lento, andante, moderato, allegro, and presto (from slowest to quickest). More precise tempo indications are given in terms of metronome markings, which indicate the number of times a given note value or unit of time recurs in one minute. The note value indicated may coincide with either the metric beat or the true beat.
Accent is the aspect of rhythm that makes prominent or emphasizes a beat. Creston views accent as the "very life of rhythm," without which meter becomes monotonous, and classifies it in eight types: dynamic, agogic, metric, harmonic, weight, pitch, pattern, and embellished. Kramer maintains that there are just three types of accents: stress, rhythmic, and metric. Metric accents help to define the regular grouping of beats. Rhythmic accents help define rhythmic groups and may serve to define groups at several levels, e.g., a motive, phrase, period, section, or movement.
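Because a metronome marking counts recurrences per minute, it converts directly into the time between successive beats. The following is a minimal sketch, with function names chosen here purely for illustration:

```python
def beat_duration_seconds(bpm):
    """Seconds between successive beats for a metronome marking in beats per minute."""
    if bpm <= 0:
        raise ValueError("metronome marking must be positive")
    return 60.0 / bpm


def metronome_marking(seconds_per_beat):
    """Inverse conversion: beats per minute for a given inter-beat interval."""
    if seconds_per_beat <= 0:
        raise ValueError("interval must be positive")
    return 60.0 / seconds_per_beat


print(beat_duration_seconds(120))  # 0.5
```

A marking of 60 therefore places one beat every second, and doubling the marking halves the interval.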
Whereas beat and meter provide reference points in musical time, tempo refers to the speed at which beats recur and accent provides a means for emphasizing a beat. A listener, however, may group phrase or melodic rhythm patterns at various levels. The mind apparently seeks some organizing principle in the perception of music. When a grouping of sounds is not objectively present, the mind imposes one of its own.
Experiments show that the mind instinctively groups regular and identical sounds into twos and threes, stressing every second or third beat, and thus creates from an otherwise monotonous series a succession of strong and weak beats. Regardless of how one labels or describes rhythm patterns, there is agreement that melodic rhythms overlay and entwine themselves in relation to the beat. Consequently, a simplified psychoacoustic model of rhythm perception could take the perception of beat as its basis.
Rhythm could be defined in terms of perceptual response, for example as the emotion felt on hearing a "dancing," "exciting," or "calm" rhythm. The response also might be behavioral, as in clapping or tapping, or physiological, as in changes in heart rate or muscular movements. There is the familiar idea of rhythm as patterns of accentuated beats. These patterns may vary from moment to moment, and they can be modified to make them more interesting. Musicologists refer to this incessant beating of drums as meter.
There is another conception of rhythm, the rhythm of organic movement. It is generated all day long, e.g., in the rhythm of cascading water and howling wind, or in the rhythm of speech. This kind of rhythm lacks the repetitive, evenly paced accentuations of measured rhythm. In music, it is built up by a succession of irregular sonic shapes that combine in various ways, and it is called phrasing. These two conceptions of rhythm are sometimes referred to as “vocal” for phrasing and “instrumental” for meter. Music could hardly exist without both kinds of rhythm. Meter gives order to time, and without it music takes on the static quality of Gregorian chant. Phrasing, on the other hand, imparts a kind of narrative to music, and without it music becomes repetitious and banal.
Consequently, to analyze them, the human brain requires some way to segment the longest sonic objects that music provides. It cannot wait until the end of a ten-minute composition to figure out what happened. The brain is always looking for clues about where musical objects begin and end. Rhythm exists in music to help the brain in this task, drawing lines around musical figures. A sequence of rhythmic markers tells the brain where the beginning or the end of a musical object is. Without rhythmic markers, the brain would quickly be overwhelmed by a multitude of observations.
Rhythm is often associated with the beating of a clock, suggesting that it is concerned with measuring temporal durations. The brain measures the lengths of individual sounds and the silences that fall between them. It seeks patterns among these durations and then patterns among these patterns.
In observing pitch space, our brains naturally perceive octaves that can be subdivided to form scales. Once a brain has become accustomed to a culture's scale structure, it can use the scale's pitches as a framework for perceiving any composition. But time presents no natural unit of measure, akin to an octave, from which to infer the temporal scale. Without meter, we do not have anything to tell us how long any of the notes actually last. So the brain cannot approach a composition with fixed notions of temporal durations the way it can for pitch distances.
The core of the meter is the pulse, an unceasing clock-beat that rhythmic patterns overlay. Idealized pulses exist as the steady recurrence of contraction and relaxation, tension and release. Psychologically, a pulse constitutes a renewal of perception, a reestablishment of attention. It is a basic property of our nervous systems that they soon cease to perceive phenomena that do not change. Pulses keep unchanging phenomena alive. This process of renewing attention comes so naturally to us that our nervous systems add pulse where none is found. When the brain begins to sense a train of pulses, it continues to anticipate them even when individual pulses disappear into silence or into long-held notes. Certain pulses are made more prominent by accenting them. Typically, every second, third, or fourth note is played louder, causing our brains to automatically form groups of two, three, or four beats, each group starting on an accent. When a meter has more than four beats, grouping becomes harder. A brain perceiving five beats as two followed by three, or three followed by two, would strain to constantly readjust its scope, so the brain tries to grasp the five beats as a whole. But five beats run much longer than the two- and three-beat periods to which we are accustomed, and many listeners cannot manage this. They complain that music written in 5/4 time "has no rhythm."
Polyrhythm poses a perceptual challenge for our brains. It might better be called "polymeter," since it is made by playing more than one meter at a time. It is difficult for the brain to generate two rhythms simultaneously, even when they are related. In polyrhythm any combination is possible, and any number of meters can be combined.
On the other hand, tempo is very important because the mechanics of music perception are very sensitive to the rate at which musical structures are presented to the brain. Every aspect of the perception of music (individual tones, their timbre, their groupings, their harmonic relatedness) depends on the speed of presentation. When music is played quickly, we may miss detail, but when it is played very slowly, the reach of the perceptual present is diminished and we may fail to observe groupings of melody, harmony, and meter. Tempo and rhythm are strongly related: tempo is the rate of the renewals of attention that establish the underlying beat. In addition, the human skills of rhythm recognition are innate, but they differ greatly between a novice and a music conservatory student. The latter, after a long period of practice, is able to play rhythms written in common music notation and to recognize played rhythms, transcribing them into notation.
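To make concrete what the listener has to track, the following is a minimal sketch of two pulse streams that divide the same cycle into different numbers of equal beats (for example, three against two). The grid representation and the function name are assumptions made here for illustration; the combined cycle is laid out on the least common multiple of the two beat counts.

```python
from math import gcd


def polyrhythm_grid(a, b):
    """Return two onset strings for a beats against b beats over one shared cycle.

    'x' marks a beat onset and '.' marks a grid step with no onset.
    """
    steps = a * b // gcd(a, b)  # least common multiple: finest common grid
    line_a = "".join("x" if i % (steps // a) == 0 else "." for i in range(steps))
    line_b = "".join("x" if i % (steps // b) == 0 else "." for i in range(steps))
    return line_a, line_b


upper, lower = polyrhythm_grid(3, 2)
print(upper)  # x.x.x.
print(lower)  # x..x..
```

The two streams coincide only on the first grid step of each cycle, which is one way to see why holding both groupings at once is demanding.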
Due to its nature, the perceptual basis of rhythmic behavior has been more a matter of speculation and theory than research. Traditional psychology of music literature recognized instinctive, physiological, and motor theories as possible explanations for human interaction with musical rhythm. Lundin  proposed a learning theory in the development of rhythmic behaviors. Lundin's account of rhythmic response recognizes the importance of learning, which involves both perception and motor response. Perception of rhythm requires observation of rhythmic stimuli and may or may not involve overt behaviors. It involves both perceptual organization of rhythmic stimuli and discrimination among stimuli. Lundin contends that the ability to organize and discriminate among rhythmic stimuli is dependent on learning. He also viewed rhythm behavior as both a perceptual and behavioral response.
Seashore, as a major proponent of the instinctive theory, held that there are two fundamental factors in the perception of rhythm: an instinctive tendency to group impressions in hearing and a capacity for doing this with precision and stress. This theory reflects the position that rhythmic potential is an inherited trait, not a learned one. However, a number of studies provide data suggesting that training can improve rhythmic potential, which potentially disproves the theory.
Jaques-Dalcroze proposed that the human heart rate is a basis for musical rhythm and tempo. However, evidence to support the heart rate theory is entirely lacking. Mursell criticized the heart rate notion on the basis that there is no psychological mechanism by which the heartbeat gives us our sense of time. Lund reported no significant relationships between college students' preferred tempi for selected popular songs and the rate of any of their objectively measured physiological processes.
Recent research on tempo perception offers little or no support for physiological theories. While the natural rhythms of human physiology, including the menstrual cycle and cyclic changes in body temperature, wakefulness, and biochemistry, may influence a person's receptivity to musical stimuli, they are too lengthy, complex, and variable to explain rhythm responses to relatively short-term musical stimuli.
The motor theory holds that rhythm depends on the action of the voluntary muscles. Schoen noted that nearly all investigations concerning the nature of rhythmic experiences find a motor or muscular factor, thus lending support to motor theory advocates. Mursell and Lundin both recognized motor theory as the most plausible of the traditional theories, but neither accepted it without reservation. Mursell argued that neuromuscular movement does not function in isolation from the human brain. Rather, it functions in conjunction with the brain and central nervous system that control voluntary movements.
Today, much of the research related to rhythmic behavior has focused on perception of various aspects of rhythm: the role of movement in the perception of rhythm, tempo perception, meter perception, perception of rhythm groups, and expressive rhythm in music.
The general principles that govern the perceptual organization of the auditory world correspond well to those described by the Gestalt psychologists. When we listen to rapid sequences of sounds, they may be perceived as a single perceptual stream or they may split into a number of perceptual streams. This process is known as primary auditory stream segregation or fission. Fission is more likely to occur if the elements making up the sequence differ markedly in frequency, amplitude, location, or spectrum. Such elements would normally emanate from more than one sound source. When two elements of a sound are grouped into different streams, it is more difficult to judge their temporal order than when they form part of the same stream.
The principle of similarity is that sounds will be grouped into a single perceptual stream if they are similar in pitch, timbre, loudness, or subjective location. In visual perception, similar objects tend to be grouped together, as shown in Figure 3.2. Rows and columns are equally spaced, but columns of X or 0 are perceived.
The principle of good continuation is that smooth changes in frequency, intensity, location, or spectrum will be perceived as changes in a single source, whereas abrupt changes indicate a change in source. The principle of common fate is that if two components in a sound undergo the same kind of changes at the same time, they will be grouped and perceived as part of a single source. The principle of belongingness is that a given element in a sound can only form part of one stream at a time. The principle of closure is that when parts of a sound are masked or occluded, that sound will be perceived as continuous, provided that there is no direct sensory evidence to indicate that it has been interrupted. We tend to complete incomplete experience, as shown in Figure 3.3. Although the lines are not completely finished, a letter A is perceived.
Usually, we attend primarily to one perceptual stream at a time. That stream stands out from the background formed by other streams. Stream formation places constraints upon attention, but attention may also influence the formation of streams. Stream formation may also depend upon information not directly available in the acoustic waveform.
Beat is the unit division of musical time, and the pace of the fundamental beat is called tempo (Italian for "time"). The expressions “slow tempo” and “quick tempo” suggest the existence of a tempo that is neither slow nor fast. This "moderate" tempo is often assumed to be that of a natural walking pace (76 to 80 paces per minute) or of a heartbeat (72 per minute). The tempo of a piece of music indicated by a composer is, however, neither absolute nor final. In performance, it is likely to vary according to the performer's interpretative ideas or to such considerations as the size and reverberation of the hall, the size of the ensemble, and, to a lesser extent, the sonority of the instruments. A change within such limits does not affect the rhythmic structure of a work. “Time provides a framework for auditory events where the onset and offset of sounds define those events. One temporal quality is whether the sound is roughly continuous (e.g., duct noise), oscillates in intensity (e.g., hand sawing), or is a series of discrete units (e.g., hammering, clapping, walking). Another temporal quality is the rhythm or timing between discrete sounds. Some physical systems are defined by damped rhythms in which successive sounds are progressively closer together in time (bouncing balls).”
Music involves the temporal patterning of stimulus features in addition to the well-known spectral aspects of stimuli. Langner emphasized that music contains periodic fluctuations in amplitude, that is, envelopes of AM (amplitude modulation). Such AM information can be used to bind sounds in various frequency channels, as separated by the cochlea, into a common sound source. Langner further points out that, to make use of this type of information, the central auditory system must perform a periodicity analysis.
Neurons at different levels of the auditory system are able to respond reliably to different rates of AM sounds. The modulation transfer function is the common way of characterizing the response to AM stimuli. It provides an index of the ability of a neuron to respond synchronously to the AM envelopes of pure tones. The rate of AM to which a cell responds maximally is called the “best modulation frequency” or BMF.
Perception in music involves the perceptual organization of patterns in time. Behavioral studies have revealed that listeners organize or group streams of sounds and silence. These studies suggested that grouping is done on the “run and gap” principle, namely, that patterns are organized to begin with the longest run of like elements and end with the longest gap (Garner, 1974).
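As an illustration of these terms, the following minimal sketch builds an amplitude-modulated pure tone and picks the best modulation frequency from a table of responses. The function names are assumptions, and the response values in the usage example are hypothetical numbers invented for illustration, not measured data.

```python
import math


def am_envelope(t, fm, depth=1.0):
    """Envelope of a sinusoidally amplitude-modulated tone at time t (seconds).

    fm is the modulation rate in Hz; depth is the modulation depth (0 to 1).
    """
    return 1.0 + depth * math.cos(2 * math.pi * fm * t)


def am_tone(t, fc, fm, depth=1.0):
    """Pure-tone carrier at fc Hz multiplied by the AM envelope."""
    return am_envelope(t, fm, depth) * math.sin(2 * math.pi * fc * t)


def best_modulation_frequency(responses):
    """Return the modulation rate with the largest response.

    responses maps modulation rate (Hz) to a response measure; a table of
    this kind is a sampled modulation transfer function.
    """
    return max(responses, key=responses.get)


# Hypothetical response table, for illustration only.
example_mtf = {4: 0.2, 16: 0.9, 64: 0.4}
print(best_modulation_frequency(example_mtf))  # 16
```

The envelope here has the same form as the loudness fluctuation produced by acoustic beats, which is why beats can be described as a form of amplitude modulation.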
Perceptual grouping of temporal sequences is based on the stimulus element that elicits the largest response in the auditory system. The longest silent period, which perceptually completes a sequence, allows for the longest time for recovery, which would produce the largest response to the first element of the next pattern presented.
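The run-and-gap principle can be sketched as a search over the rotations of a repeating pattern. This is a simplified illustration under stated assumptions, not Garner's procedure: the pattern is a string of sounds ('x') and silences ('.'), only rotations that begin on a sound and end on a silence are considered, and the two criteria are combined by simply adding the length of the opening run to the length of the closing gap.

```python
def leading_length(pattern, symbol):
    """Number of consecutive copies of symbol at the start of pattern."""
    count = 0
    for ch in pattern:
        if ch != symbol:
            break
        count += 1
    return count


def run_gap_organization(cycle, sound="x", gap="."):
    """Pick the rotation of a repeating cycle that best fits run-and-gap.

    Assumption: the score is the opening run length plus the closing gap length.
    """
    if sound not in cycle or gap not in cycle:
        return cycle  # nothing to organize without both sounds and silences
    best, best_score = None, -1
    for i in range(len(cycle)):
        rotation = cycle[i:] + cycle[:i]
        if rotation[0] != sound or rotation[-1] != gap:
            continue  # must start on a run of sound and end on a gap
        score = leading_length(rotation, sound) + leading_length(rotation[::-1], gap)
        if score > best_score:
            best, best_score = rotation, score
    return best


print(run_gap_organization("x...xxx.x"))  # xxx.xx...
```

In this example the repeating cycle is heard as starting on its three-sound run and closing on its three-step silence, whichever point in the cycle the sequence was actually started from.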
Fraisse drew a primary distinction between the perception of time and the estimation of time. The former is confined to temporal phenomena extending to no more than about 5 seconds or so, whereas the latter relies primarily on the reconstruction of temporal estimates from information stored in memory. The boundary between these two corresponds to the length of the perceptual present, which he defined as “the temporal extent of stimulations that can be perceived at a given time, without the intervention of rehearsal during or after the stimulation” (Fraisse, 1978).
Rhythm perception, therefore, is essentially concerned with phenomena that can be apprehended in this immediate fashion and is also closely tied up with motor functioning. In studies of spontaneous tapping, Fraisse observed that by far the most ubiquitous relationship between successive tapped intervals was a ratio of 1:1. Fraisse regarded this as intimately connected with anatomical and motor properties, most notably the bilateral symmetry of the body, the pendular movements of the limbs in walking and running, and the regular alternation of exhalation and inhalation in breathing. Both arrhythmic and rhythmic tapping represent a break with the underlying tendency for pendular movement, but whereas there is no structure in the former case, the latter exploits a principle of identity or clear differentiation between time intervals. This principle of equality or differentiation creates two distinct categories of duration, long and short. These categories differ not only quantitatively but also qualitatively.