CELPIP Speaking Task 3 is not about naming every object in the picture. It is about helping a listener understand the scene.
That is why the best answers do not sound like a random list. They sound like a guided description.
#What the examiner wants from Task 3
This task asks you to describe a scene to someone who cannot see it. A strong answer usually includes:
- the place or setting
- the overall mood or activity
- two or three clear detail groups
- people, objects, and actions
- a short wrap-up
#The easiest scene-description structure to remember
The safest pattern is:
- Say what kind of place it is.
- Say what is happening overall.
- Describe one area.
- Describe another area.
- End with a short summary.
Here is how to apply each step:
- Identify the place first. In prep time, decide whether the image looks like a store, office, park, street, or event. That gives your description a strong starting point.
- Choose one main action. Ask yourself what people seem to be doing overall. Shopping, waiting, arguing, celebrating, or working are useful big-picture verbs.
- Split the image into clusters. A simple method is foreground and background, or left and right. That creates order and stops you from jumping randomly around the picture.
- Describe actions, not only objects. The response becomes stronger when you mention what people are doing and how they seem to feel, not just what items exist in the image.
- Finish with one summary sentence. A quick closing line helps the listener understand the overall scene and makes the answer sound complete.
#A master CELPIP Speaking Task 3 template
This looks like a [place], and the scene seems [busy / calm / lively / serious].
Overall, people are [main action], and the main focus is [focal point].
[On the left / In the foreground], I can see [person or object]. They are [action], and they seem [emotion].
[On the right / In the background], there is [person or object]. It looks like [action or interaction].
So overall, this scene shows [short summary of what is happening].
#What changes depending on the image
#If it is a public place
Focus on:
- movement
- crowd behavior
- visible interactions
#If it is a workplace or school setting
Focus on:
- roles
- tasks
- serious or professional mood
#If it is an event or celebration
Focus on:
- energy
- group activity
- emotions
Do not say “there is” for everything
If every sentence starts the same way, the description sounds repetitive and less organized. Mix overview sentences with action and location phrases.
#A worked example
Here is a sample scene image in the same kind of format Task 3 uses.
Example image for Task 3. The goal is to describe what is happening clearly, not to mention every single object.
Overview: Say it looks like a busy grocery store and that the mood seems active but normal.
Main action: Explain that people are shopping, choosing produce, and paying at the checkout area.
Detail cluster 1: Describe the foreground on the left, where a parent and child are standing with shopping baskets.
Detail cluster 2: Move to the center and right side, where a cashier is working and other shoppers are waiting or talking near the produce and refrigerated sections.
Wrap-up: End by summarizing the image as a normal but busy shopping scene with several people doing different tasks at the same time.
#Timing plan for Task 3
You get 30 seconds to prepare and 60 seconds to speak. A reliable prep split is:
- 10 seconds for place and main action
- 10 seconds for detail cluster one
- 10 seconds for detail cluster two and closing
Thirty seconds is enough if you describe selectively instead of trying to cover everything in the image.
#Final checklist before you finish
- Did I start with the place and overall scene?
- Did I describe actions, not just objects?
- Did I organize the image into clear areas?
- Did I use location phrases like in the foreground, on the right, or in the background?
- Did I finish with a short wrap-up sentence?
#Final takeaway
For Task 3, your job is not to mention everything. Your job is to help the listener see the scene.
If you remember one thing, remember this:
big picture first, detail clusters next, short summary last.