Five Executive Functions in Practice — Issue 2: Working Memory

He has shopped at the same grocery store for twenty-three years.
He knows the layout. Produce on the left. Cereal in aisle four. Frozen foods at the back. He does not need a map because the map has always lived in his head.
Since the stroke, he comes home without half the list. Not because he forgot what he needed. He has the list in his hand. He comes home missing items because he cannot hold the spatial layout of the store in his mind while he is moving through it. He walks past cereal three times. He doubles back. He restarts from the entrance. The trip that used to take twenty minutes now takes an hour and ends in exhaustion and gaps.
👉 This is not a memory problem in the general sense. It is a visual working memory deficit, the specific inability to create and hold a spatial mental image while simultaneously using it to guide behavior.
It is trainable. And a 3x4 grid of images is how you start.
What the activity actually is
The tools:
🗂️ The 3x4 Visual Memory Grid — 12 images arranged in a fixed grid, printed or displayed on a card. The grid content is interchangeable: the Hay to Couch set is the starting point, but any 12 images meaningful to your patient work — grocery items, household objects, a floor plan.
🃏 Clinician question cards — four levels, each targeting a distinct working memory demand
⏱️ A timer
📋 A tracking sheet: accuracy by level, errors with grid visible vs. hidden, response latency, level at which performance breaks down
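If you keep the tracking sheet electronically rather than on paper, the fields listed above can be captured one row per question. The following is a minimal sketch, not a validated scoring tool: the `Trial` record, its field names, and the `summarize` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One clinician question and the patient's response (hypothetical record)."""
    level: int          # protocol level, 1 through 4
    grid_visible: bool  # True if the grid was uncovered when the question was asked
    correct: bool
    latency_s: float    # response latency in seconds, taken from the timer


def summarize(trials):
    """Summarize accuracy by level, errors by grid condition, and mean latency."""
    if not trials:
        raise ValueError("no trials recorded")
    accuracy_by_level = {}
    for level in sorted({t.level for t in trials}):
        at_level = [t for t in trials if t.level == level]
        accuracy_by_level[level] = sum(t.correct for t in at_level) / len(at_level)
    return {
        "accuracy_by_level": accuracy_by_level,
        "errors_grid_visible": sum(1 for t in trials if t.grid_visible and not t.correct),
        "errors_grid_hidden": sum(1 for t in trials if not t.grid_visible and not t.correct),
        "mean_latency_s": sum(t.latency_s for t in trials) / len(trials),
    }
```

Keeping grid-visible and grid-hidden errors in separate counts preserves the comparison the rest of the protocol depends on.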
The setup:
Grid card placed flat on the table in front of the patient. Clinician holds the question card stack. Patient studies the grid — do not rush this. The encoding phase is part of the intervention.
The task, in four levels:
Level 1 — Basic Position. Grid visible. Clinician asks where a specific image is located by row and column. This establishes whether the patient can accurately read and report spatial positions from a visible reference. This is not yet working memory — it is visual scanning and spatial language. It is your baseline.
Level 2 — Spatial Shifts. Grid visible, then covered. Clinician asks what is one space up, left, right, or diagonally from a named image. Patient must mentally navigate within the grid from a known anchor point. The grid is no longer available. The map is now in the mind.
Level 3 — Semantic Retrieval. Grid covered. Clinician asks which image belongs to a category — something used for sleeping, something found on a farm. Patient must search the mental grid by meaning, not just location. This is the level that most closely matches the functional demand of navigating a grocery store by department.
Level 4 — Relative Position. Grid covered. Multi-step spatial questions: what is two spaces down from Hay? What is one up and one right from Tractor? Patient must hold the grid, navigate multiple steps, and report without losing place. This is the highest working memory load in the protocol.
👉 The grid is not the intervention. Removing the grid is the intervention.
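For clinicians who prepare question cards in advance, the spatial questions at Levels 2 and 4 can be answer-keyed mechanically. The sketch below assumes a layout of 3 rows by 4 columns and an invented arrangement: only Hay, Barn, Tractor, Roller skates, and Couch are named in this issue, their positions here are made up, and the remaining cells are placeholders. The `locate` and `answer` helpers are hypothetical.

```python
# Assumed layout: 3 rows x 4 columns. Positions are invented for illustration;
# "Image N" cells are placeholders, not the real card content.
GRID = [
    ["Hay", "Barn", "Image 3", "Image 4"],
    ["Image 5", "Tractor", "Image 7", "Image 8"],
    ["Image 9", "Roller skates", "Image 11", "Couch"],
]

# Row and column offsets for each direction a question can name.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def locate(grid, image):
    """Return the (row, column) of an image, counting from zero."""
    for r, row in enumerate(grid):
        for c, name in enumerate(row):
            if name == image:
                return r, c
    raise ValueError(f"{image!r} is not on the grid")


def answer(grid, anchor, steps):
    """Follow (direction, count) steps from the anchor image.

    Returns the image at the final cell, or None if the final cell is
    off the grid. Only the end position is checked, not the path.
    """
    r, c = locate(grid, anchor)
    for direction, count in steps:
        dr, dc = MOVES[direction]
        r += dr * count
        c += dc * count
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        return grid[r][c]
    return None


print(answer(GRID, "Hay", [("down", 2)]))                  # Image 9
print(answer(GRID, "Tractor", [("up", 1), ("right", 1)]))  # Image 3
```

A `None` result flags a question that has no valid answer on the current grid, which is worth catching before the card reaches the patient.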
Before the first question — lead with science every time
Before the grid goes down, say this:
"What's happening at the grocery store isn't that you're forgetting things. It's that your brain is having trouble holding the map of the store in mind while you're walking through it at the same time. Working memory is what keeps that map active while you're using it. What we're doing right now is training your brain to hold a visual map and work from it — without being able to see it. Every time I take the card away and ask you a question, that's a repetition of exactly the skill the store is asking for."
👉 The patient who understands they are rebuilding a specific cognitive system practices differently than the patient who thinks they are playing a memory card game.
Why removing the grid is the clinical move
With the grid visible, the patient is reading. That is not working memory; it is visual scanning. Accurate performance with the grid present tells you the patient can process spatial information. It tells you nothing about whether they can hold it.
The moment the grid is covered, the task changes entirely. Now the patient must retrieve from a mental image they constructed during encoding. The quality of that mental image (how complete it is, how stable it is under questioning, how navigable it is when the questions require spatial movement) is your working memory data.
👉 A patient who is accurate at Level 1 with the grid visible and fails at Level 2 with it covered has an encoding-to-retrieval gap. The information got in but the mental image is not stable enough to navigate. That is your intervention target.
A patient who succeeds at Level 2 but fails when semantic retrieval is required at Level 3 has a different profile: the spatial map is intact, but integrating meaning into the retrieval process overloads the system. That is a different intervention target.
Document the level at which performance breaks down and whether the breakdown is consistent or variable across questions at that level. Consistent failure at a level is a capacity ceiling. Variable failure is a load-sensitivity finding. They are not the same.
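The breakdown rule described here can be sketched in a few lines. The cutoffs are assumptions made only for illustration, not clinically validated values: `MASTERY` is a hypothetical accuracy below which a level counts as the breakdown point, and `FLOOR` is a hypothetical accuracy at or below which failure is treated as consistent rather than variable.

```python
from collections import defaultdict

MASTERY = 0.8  # hypothetical: accuracy below this marks the breakdown level
FLOOR = 0.2    # hypothetical: accuracy at or below this counts as consistent failure


def breakdown(trials):
    """Find the lowest level where accuracy falls below MASTERY.

    trials is a list of (level, correct) pairs. Returns a tuple of
    (level, accuracy, pattern), or None if every level meets MASTERY.
    """
    by_level = defaultdict(list)
    for level, correct in trials:
        by_level[level].append(bool(correct))
    for level in sorted(by_level):
        results = by_level[level]
        accuracy = sum(results) / len(results)
        if accuracy < MASTERY:
            pattern = "consistent" if accuracy <= FLOOR else "variable"
            return level, accuracy, pattern
    return None
```

With these assumed cutoffs, a patient who answers every Level 1 question correctly and misses every Level 2 question returns a consistent breakdown at Level 2, while a patient who gets two of five Level 2 questions right returns a variable one.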
Why the grid content is a clinical decision
The images on the visual memory activity grid are semantically neutral (e.g., a barn, a tractor, roller skates). They carry no particular emotional weight and no prior spatial associations for most patients.
That is useful for baseline. It is not always the right tool for functional transfer.
👉 When your patient's functional goal is grocery shopping, replace the grid with twelve grocery items organized by department. When the goal is medication management, use the pill organizer layout as the grid. When the goal is home navigation, map their kitchen or bathroom.
The cognitive demand is identical. The transfer is direct. A patient who can hold and navigate a mental grid of grocery items organized by aisle is practicing the exact spatial working memory operation the store requires.
Document the grid content as a controlled variable. When you change the content, note it and expect a performance shift. New content requires new encoding, and the working memory demand resets.
