Sound study means learning about the sounds used in your target language — its phonology. Your target language likely has sounds that don't exist in your native language. Studying them, even briefly, gives your ears a head start in recognizing them during immersion.
You don't need to memorize the International Phonetic Alphabet (IPA), but it's a useful tool. Each IPA symbol stands for one distinct sound, so when two words are transcribed with different symbols, you know they sound different. That helps you notice distinctions you might otherwise miss.
A few sessions of 10 to 20 minutes are enough to get the basics. You don't need to master phonology; you just need enough awareness to notice these sounds when they appear in your immersion. The real learning happens through exposure.
Look for YouTube videos by experienced teachers of your target language. Many have dedicated videos on pronunciation or the sound system. Fluent Forever's pronunciation trainers are also well regarded. The Wikipedia article on your target language's phonology can be surprisingly useful as a reference.
Sound study begins in Phase 1A and is revisited in Phase 3B when choosing a dialect.
When you learn your first language as a child, your brain tunes itself to the sounds that matter in that language — and tunes out the ones that don't. By adulthood, your native language acts as a perceptual filter. Best's (1995) Perceptual Assimilation Model (PAM) describes how this works: when an L2 sound is similar to an L1 sound, learners map it onto the familiar category, making it hard to notice the difference. This is why sound study matters — without some explicit awareness of what's different, your brain will quietly ignore distinctions that are meaningful in the target language.
The good news is that this filtering is not permanent. Flege and Bohn's (2021) revised Speech Learning Model (SLM-r) proposes that the mechanisms for learning new sound categories remain available across the lifespan. Research on High Variability Phonetic Training has shown that even short perceptual training sessions produce gains in L2 sound discrimination that generalize to new speakers and contexts (Sakai & Moorman, 2018). The goal isn't mastery — it's building enough awareness that your brain knows what to listen for during immersion.
The recommendation to focus on hearing before producing reflects a well-established principle in L2 phonology. Flege's (1995) Speech Learning Model posits that accurate production of an L2 sound cannot occur unless the learner has first formed a perceptual representation of it. While more recent work suggests this relationship is somewhat bidirectional (Flege & Bohn, 2021), the broad consensus remains that attuning your ears first gives production a foundation to build on.