Comparing Visual Models under Human-Like Foveation Constraints
Motivation
Humans see only a small central region of their visual field in high resolution and rely on saccades to bring the rest into high resolution. This so-called foveated setup strongly shapes how we learn, recognize, and search in visual scenes. If we put models under constraints similar to those of humans, we can ask how different tasks (classification, search, reconstruction) change what the models “pay attention to”, how they gather information over multiple glimpses, and where their behavior diverges from models without such constraints. This lets us study task-driven differences in behavior under limited visual access, and also where human-like constraints help or hurt performance and generalization across tasks.
Project
The core idea is to compare model behavior with and without foveated constraints across different tasks. We previously developed a foveated model for image classification and visual search. A starting task for the project would be to extend this setup to image reconstruction, that is, reconstructing the image in full resolution as well as possible from a few foveated glimpses. This has been done before for a different foveated model (Liu et al., 2024), so the task would be to adapt and apply their reconstruction implementation within our existing framework.
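To make the reconstruction setting concrete, here is a minimal sketch in pure Python. It is not the existing framework and not the method of Liu et al.; the functions `foveate`, `combine_glimpses`, and `mse`, and the parameters `fovea_radius` and `slope`, are hypothetical names chosen for illustration. Foveation is approximated by box-averaging over a window that grows with distance from the fixation, and "reconstruction" is a naive baseline that takes each pixel from the glimpse with the nearest fixation. A real implementation would operate on PyTorch tensors and use a learned reconstruction model.

```python
import math


def foveate(image, fixation, fovea_radius=2.0, slope=0.5):
    """Blur `image` more strongly the farther a pixel is from `fixation`.

    `image` is a list of rows of floats (grayscale); `fixation` is (row, col).
    `fovea_radius` and `slope` are illustrative parameters, not taken from
    any published model.
    """
    height, width = len(image), len(image[0])
    fy, fx = fixation
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ecc = math.hypot(y - fy, x - fx)  # eccentricity in pixels
            # Pooling half-width: 0 near the fixation, growing with eccentricity.
            k = int(max(0.0, ecc - fovea_radius) * slope)
            rows = range(max(0, y - k), min(height, y + k + 1))
            cols = range(max(0, x - k), min(width, x + k + 1))
            total = sum(image[r][c] for r in rows for c in cols)
            out[y][x] = total / (len(rows) * len(cols))
    return out


def combine_glimpses(image, fixations, **kwargs):
    """Naive baseline: each pixel is copied from the glimpse whose fixation is nearest."""
    glimpses = [foveate(image, f, **kwargs) for f in fixations]
    height, width = len(image), len(image[0])
    recon = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            nearest = min(
                range(len(fixations)),
                key=lambda i: math.hypot(y - fixations[i][0], x - fixations[i][1]),
            )
            recon[y][x] = glimpses[nearest][y][x]
    return recon


def mse(a, b):
    """Mean squared error between two images of equal size."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n


if __name__ == "__main__":
    # A 16x16 checkerboard is a worst case for blurring: fine detail everywhere.
    image = [[float((y + x) % 2) for x in range(16)] for y in range(16)]
    one = combine_glimpses(image, [(8, 8)])
    four = combine_glimpses(image, [(4, 4), (4, 12), (12, 4), (12, 12)])
    print("MSE, 1 glimpse: ", mse(image, one))
    print("MSE, 4 glimpses:", mse(image, four))
```

In this toy setting, four spread-out glimpses give a lower reconstruction error than a single central one. How much a learned model improves on such a baseline, and where it chooses to look, is the kind of behavior the project would study.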
The next task would be to set up datasets and evaluation protocols where these behaviors can be meaningfully observed and compared across tasks. On these datasets, the plan is to analyze how behavior changes between foveated and non-foveated models, including comparisons to models that do not use foveation on the same tasks (Engbert et al., 2015). There is room to deviate from this plan, e.g., by trying alternative architectures or training strategies.
Thesis
Possible research questions:
- How does model behavior under foveated constraints vary across tasks like classification, visual search, and reconstruction?
- Where do behaviors differ most clearly between foveated and non-foveated setups for scanpath prediction?
- Under which conditions do foveation-like constraints help or hinder task performance and transfer?
Requirements
- Familiarity with Python (PyTorch, data processing)
- Basic machine learning and deep learning knowledge
You are expected to work independently and to actively contribute to the direction of the project. Your supervisor (Valentin) is also actively involved in the project and is open to discussing research directions and any other related questions. The goal is a good and pleasant learning atmosphere.
Contact
To apply, please email Valentin Hassler, stating your interest in this project.
