Learning Pronunciation - Refold Roadmap Library

Pronunciation in the Refold method develops organically through massive input, with targeted practice layered on top in the speaking phases. You don't need to drill pronunciation from day one — in fact, too much early focus on production can cement bad habits before your ears are trained.

The Progression

Phase 1-2 (Input only): Your ears absorb the sounds of the language through hundreds of hours of listening. You're not required to produce anything yet, but your brain is building a mental model of how the language sounds.
Phase 3 (Ear training): You sharpen your ability to distinguish sounds through ear training, intensive listening, and dialect-specific sound study. By the end of Phase 3, you have a clear mental model of how the language should sound.
Phase 3D-4A (First output): Light pronunciation practice begins with chorusing and reading aloud. Your advanced listening skills act as a guide: you can hear when you're not matching the native model.
Phase 4B+ (Refinement): Continued practice through conversation, corrected reading aloud, and speaking analysis. Your pronunciation improves steadily with use.

Why Ears Come First

When you grow up speaking a language, your brain learns to sort all the sounds you hear into a fixed set of categories — like sorting colors into bins. Sounds that are slightly different but fall into the same bin all get treated as identical.

This is efficient for your native language, but it becomes a problem when learning a new one: your brain "snaps" unfamiliar sounds to the closest category it already knows. This is why English speakers pronounce the final vowel in "sombrero" as an English "oh" sound rather than the pure Spanish "o" — they're snapping to the closest vowel their brain recognizes.

Hundreds of hours of listening help your brain start to recognize new categories instead of forcing everything through your native-language filter. If you try to produce sounds before your brain can distinguish them, you'll just produce the closest native-language equivalent, and then practice that mistake over and over.

Understanding How Sounds Work

A little technical knowledge about how sounds are produced can dramatically speed up your pronunciation practice. You don't need to memorize the IPA, but knowing the basics helps you understand what to do with your mouth when a sound isn't coming out right.

Vowels are open, unobstructed sounds that exist on a spectrum — like colors. They're defined by three factors: how open your jaw is, how forward or back your tongue is, and how rounded your lips are. Learning a new vowel means adjusting these dimensions to hit a different point than what your mouth is used to. 

A useful exercise is to change one dimension at a time. For example, say an "ee" sound with rounded lips to produce the vowel written "ü" in German and "u" in French.
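The "one dimension at a time" idea can be sketched in code. This is only an illustrative model, not a Refold tool: the Vowel class, the three-level height scale, and the tiny inventory are simplifying assumptions made for the example, though the IPA symbols and their feature values follow the standard vowel chart.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Vowel:
    """Hypothetical three-dimension description of a vowel."""
    height: str    # how closed the jaw/tongue is: "close", "mid", "open"
    backness: str  # tongue position: "front", "central", "back"
    rounded: bool  # whether the lips are rounded


# A deliberately tiny inventory: the four "close" corner vowels, keyed by features.
INVENTORY = {
    Vowel("close", "front", False): "i",  # "ee" as in English "see"
    Vowel("close", "front", True): "y",   # German "ü", French "u"
    Vowel("close", "back", True): "u",    # "oo" as in English "food"
    Vowel("close", "back", False): "ɯ",   # unrounded "oo"
}


def adjust(vowel: Vowel, **dimensions) -> Vowel:
    """Return a copy of the vowel with only the named dimensions changed."""
    return replace(vowel, **dimensions)


ee = Vowel("close", "front", False)

# Keep jaw and tongue where they are for "ee", change only the lips.
target = adjust(ee, rounded=True)
print(INVENTORY[ee], "->", INVENTORY[target])  # i -> y
```

The point of the sketch is that "ee" and "ü" differ in exactly one field, which is why the exercise works: two of the three settings are already correct before you start.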

Consonants are sounds where airflow is blocked or restricted. They're more concrete than vowels — you can usually tell more clearly if you're producing one correctly. They're defined by where the blockage happens (lips, teeth, roof of mouth, throat), how it happens (full stop, friction, trill), and whether your voice box is buzzing. 

A simple test: put your hand on your throat and compare "s" (no buzz) with "z" (buzz). Knowing whether a sound should be voiced or unvoiced can instantly fix certain errors.
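The same kind of sketch works for consonants. The Consonant class, the six-sound table, and both helper functions below are hypothetical names chosen for illustration; the place, manner, and voicing values are the standard descriptions of these English sounds.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Consonant:
    """Hypothetical three-feature description of a consonant."""
    place: str    # where the airflow is blocked
    manner: str   # how it is blocked
    voiced: bool  # whether the voice box is buzzing


CONSONANTS = {
    "s": Consonant("alveolar", "fricative", False),
    "z": Consonant("alveolar", "fricative", True),
    "f": Consonant("labiodental", "fricative", False),
    "v": Consonant("labiodental", "fricative", True),
    "p": Consonant("bilabial", "stop", False),
    "b": Consonant("bilabial", "stop", True),
}


def voicing_partner(symbol: str) -> Optional[str]:
    """Find the sound with the same place and manner but opposite voicing."""
    source = CONSONANTS[symbol]
    wanted = replace(source, voiced=not source.voiced)
    for other, features in CONSONANTS.items():
        if features == wanted:
            return other
    return None


def differing_features(a: str, b: str) -> List[str]:
    """List which of the three features differ between two sounds."""
    ca, cb = CONSONANTS[a], CONSONANTS[b]
    return [name for name in ("place", "manner", "voiced")
            if getattr(ca, name) != getattr(cb, name)]


print(voicing_partner("s"))          # z
print(differing_features("s", "z"))  # ['voiced']
```

If a sound you produce comes out as its voicing partner, the mouth position is already right and only the buzz needs to change.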

There is a LOT more to sounds and pronunciation, but a little knowledge goes a long way.

A Progression for Each Sound

Learning a new sound is a lot like learning to whistle — you need some knowledge of what your mouth should do, plus consistent daily practice, and eventually it clicks.

The progression goes: learn how the sound is produced, practice it in isolation, try it in words, then work on using it in fluid speech. Being able to produce a sound correctly once doesn't mean you can use it naturally mid-sentence — that takes time.

Tips

Don't stress about pronunciation in the early phases. Ears first, mouth second.
If making sounds out loud helps you learn and stay focused, feel free to do it. You won't damage your accent as long as your main focus stays on listening.
Chorusing is the single most effective pronunciation exercise — especially for prosody, rhythm, and tone.
Native speakers will understand you even with imperfect pronunciation. Clarity matters more than perfection.
Some sounds take lots of practice. Come back daily rather than grinding for hours in one session.
Look up your target language's phonology on Wikipedia and compare sounds at ipachart.com.

Research and Reasoning

The "ears first" approach is grounded in models of L2 speech perception. Best & Tyler (2007) proposed the Perceptual Assimilation Model for L2 learners (PAM-L2), which explains how adults perceive unfamiliar sounds by assimilating them to the closest native-language categories: the perceptual snapping described above. Flege's Speech Learning Model, revised in Flege & Bohn (2021), predicts that L2 sounds similar to native categories are the hardest to learn, precisely because learners keep assimilating them rather than forming new categories. Both models tie accurate production closely to accurate perception.

Research on perceptual training supports this. Sakai & Moorman (2018) reviewed High Variability Phonetic Training (HVPT) studies showing that listeners can learn to distinguish new L2 sound contrasts through intensive exposure to varied speakers, and that these perceptual gains transfer to production — even without explicit pronunciation practice.

The practical guidance on vowels and consonants reflects standard articulatory phonetics as described in Ladefoged & Johnson (2014). A small amount of metalinguistic knowledge helps learners make targeted adjustments rather than relying on trial and error alone.