When you look up the pronunciation of a word in a dictionary, what you’ll find is the word’s ‘citation form’. This is how the word is likely to be pronounced when uttered in isolation, for example in answer to the question, ‘What’s this word? I can’t read it’, which someone might ask while reading a handwritten text. The citation form is the guise the word adopts when it’s behaving most self-consciously, when it puts a suit on and combs its hair to have an official photo taken.
But in the rough-and-tumble of their everyday life, words are more likely to appear in more casual gear – T-shirt and jeans, for example – and you might not recognize them as they jog or skateboard past you at high speed.
For a small number of high-frequency grammatical words, dictionaries and pronunciation books give two pronunciations: strong and weak. The strong forms are the ‘official photo’ ones, and the much more frequent weak forms are the ‘rough-and-tumble’ ones.
But the small set of words which are generally recognized as having weak forms is actually only the tip of a very big iceberg. Any word – or, more accurately, any syllable – which isn’t stressed is likely to be eroded in its pronunciation. And the more familiar the word is, and the more casually it’s uttered, the more drastic the erosion is likely to be: consonants are dropped or ‘elided’, vowels are reduced to schwa and then face the further risk of elision. Consider the word actually. In its citation form – i.e. when it’s spoken carefully and clearly – it has four syllables. But when it’s used unthinkingly in the rough-and-tumble of spontaneous speech, it’s subject to varying degrees of erosion, and very often sounds identical to the name Ashley – i.e. only two syllables.
Elision of weak vowels results in such pronunciations as:
for all I know > f’r all … /fr/
save it for later > f’later /fl/
Here, instead of being separated by a schwa, /f/ + /r/ and /f/ + /l/ form consonant clusters (sequences of two or more consonants with no intervening vowels) like those at the beginning of fry and fly.
Such elision also produces so-called ‘illegal’ (yes, that’s the official term!) consonant clusters:
forgot to ask > f’got to ask /fg/
for some people > f’some people /fs/
for this to work > f’this to work /fð/
phonetics > f’netics /fn/
together > t’gether /tg/
it’s the only way > ’s the only way /sð/
’cos if it is > ’c’s if it is /ks/
potato > p’tato /pt/
tomato > t’mato /tm/
banana > b’nana /bn/
(Some of these are ‘legal’ in other positions – e.g. /fs/ at the end of words such as ‘laughs’, and /pt/ at the end of words such as ‘stopped’.)
Phonetic erosion of high-frequency phrases can disguise them beyond recognition. Consider the phrase if you see what I mean. I often hear this in a guise that reminds me of some of the Victorian and Edwardian coins in my modest collection: they suffered so much friction as they circulated from pocket to pocket, purse to purse, hand to hand, that the images, words and figures on them were gradually worn away almost beyond recognition. But they still retained their value among people who had grown up with them. Similarly, native speakers of English are skilled and experienced in drawing accurate conclusions from severely impoverished phonetic information – in recognizing, for example, that something like fySEEwo’MEAN is to be interpreted as if you see what I mean. For learners who have had much less exposure to spoken English, though, the experience of listening to English is akin to wandering through a dense phonetic fog in which only the stressed syllables of key words stand out as recognizable landmarks.
