Technology for visual Impairments Rehabilitation on Early-life through Social Information Augmentation

Project information

TIRESIA

Grant agreement ID: 896415

DOI

10.3030/896415

Closed project

EC signature date: 1 April 2020

Start date: 1 September 2021

End date: 31 August 2024

Funded under

EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions

Total cost

€ 251 002,56

EU contribution

€ 251 002,56

Coordinated by

FONDAZIONE ISTITUTO ITALIANO DI TECNOLOGIA
Italy

Periodic Reporting for period 2 - TIRESIA (Technology for visual Impairments Rehabilitation on Early-life through Social Information Augmentation)

Reporting period: 2023-09-01 to 2024-08-31

Because vision is an extremely important sensory modality in humans, visual impairments (VI) influence most instrumental and social activities of daily living and affect overall quality of life. In the vast majority of cases, whether the impairment is caused by peripheral or cerebral lesions of the visual system, individuals with VI retain some degree of vision. Although visual disorders are less frequent in children than in adults, it is especially important to assess the presence of a visual impairment promptly in order to plan an early intervention. Indeed, problems in visual functioning affect the child's overall development. Early planning of visual rehabilitation therapy is therefore crucial to improve the child's functional vision and to facilitate development in all areas while brain plasticity is at its peak and key competencies are developing.
The aim of this action was to help connect advances in neuroscience and machine learning in order to study the impact of VI on key functional competencies and to improve treatment strategies. The specific objectives focused on using machine learning techniques both to investigate how visual impairments at an early age affect the development of social skills and to develop novel tools for training those skills in the context of visual rehabilitation. The ultimate goal was to generate knowledge and data useful for developing technologies that enhance the residual skills of visually impaired people and facilitate their social inclusion, independence and equal opportunities in society.
The results made it possible to characterize the social skills and social attention of children and adults in tasks involving emotion inference from facial expressions and classification of social interactions from videos. Specifically, a custom dataset of videos was developed in which two simple agents performed goal-oriented actions and social interactions (help, hinder or no interaction). Differences were analyzed in terms of age (testing adults and groups of children in different age ranges) and condition (testing groups of children with visual impairments, children with autism spectrum disorders and typically developing children), to characterize the effect of development as well as the effect of low-level sensory issues and high-level cognitive issues on visual social attention and social skills.

The work carried out during the action took place at the Infolab, within the Computer Science and Artificial Intelligence Lab (CSAIL) of the Massachusetts Institute of Technology (MIT), and at the Unit for Visually Impaired People (UVIP) of the Italian Institute of Technology (IIT). The following main results were achieved:
• Definition of the characteristics of the groups participating in the studies. Three groups of subjects (both children and adults) were selected: typically developing (TYP), with visual impairments (VI), and with Autism Spectrum Disorders (ASD). The aim was to characterize the influence of sensory perceptual issues (visual impairments) and cognitive issues (neurodevelopmental disorders such as ASD) on the development of social skills, considering visual attention patterns in the context of social interactions.
• Design of a dataset of dynamic (video) visual social stimuli, suitable both for studies with human subjects and for training computer vision models. On the one hand, the dataset can be used to characterize the development of the human ability to understand social behaviors (specifically, to classify interactions between agents, distinguishing among help, hinder and no interaction). On the other hand, it can be used to train computer vision models to classify social interactions.
• Collection of behavioral and eye-tracking datasets from subjects with visual impairments, subjects with autism spectrum disorders and healthy subjects, both children and adults. These datasets helped to: 1) characterize how social skills develop in typically developing individuals and in individuals with sensory or cognitive issues, and 2) build a benchmark for training human-like attentive computational models able to predict the effect of perceptual and cognitive issues on visual social attention. Specifically, the datasets included: 1) an eye-tracking dataset on static images, collected during a preferential looking task (social versus object) from VI children (considering both low visual acuity and nystagmus) and sighted children; 2) behavioral datasets of answers related to the recognition of facial expressions with and without face masks, from sighted and VI children (including children with low visual acuity, nystagmus and saccadic issues); 3) eye-tracking datasets (including behavioral answers) collected from sighted adults, children with VI and children with ASD, who were asked to classify videos of social interactions (help, hinder and no interaction) among simple agents performing goal-oriented actions.
• Development of deep models: 1) to predict social salience of the visual input during classification of social interactions, based on human-gaze data; 2) to augment the visual input based on identified socially salient areas. The first application provided a baseline for developing human-like computational models of visual attention in the context of social interactions. The outputs from this research were intended to be exploitable for applications of machine learning tools to support the assessment and rehabilitation of sensory visual impairments as well as of developmental disorders affecting social behaviors, such as ASD.
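As an illustration of the kind of measure a preferential looking task (social versus object) can yield from eye-tracking data, the following is a minimal sketch, not the project's actual analysis pipeline. The `AOI` class, the `social_preference` function, the rectangular pixel-based areas of interest and the assumption that the two areas do not overlap are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """Axis-aligned rectangular area of interest in pixels (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        # A gaze sample is inside the AOI if it lies within the rectangle bounds.
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def social_preference(gaze_samples, social_aoi, object_aoi):
    """Share of AOI-directed gaze samples that fall on the social stimulus.

    gaze_samples: iterable of (x, y) gaze positions in pixels.
    Assumes the two AOIs do not overlap (an assumption of this sketch).
    Returns None when no sample lands in either AOI.
    """
    samples = list(gaze_samples)
    social = sum(1 for x, y in samples if social_aoi.contains(x, y))
    obj = sum(1 for x, y in samples if object_aoi.contains(x, y))
    total = social + obj
    return social / total if total else None


# Illustrative values only: two samples on the social image, one on the object,
# and one outside both areas (ignored).
example_samples = [(10, 10), (20, 20), (110, 10), (500, 500)]
example_score = social_preference(
    example_samples,
    social_aoi=AOI(0, 0, 50, 50),
    object_aoi=AOI(100, 0, 150, 50),
)
```

A score above 0.5 would indicate more looking at the social stimulus than at the object; per-participant scores of this kind could then be compared across groups such as TYP and VI.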

Results from this action helped shed light on how humans process socially relevant visual information. The first underlying hypothesis was that human visual attention patterns differ significantly when observing socially relevant versus non-relevant visual input, and that this difference could be modeled computationally. The second hypothesis was that both low-level sensory perceptual issues and high-level neurological issues affect the resulting social saliency. Results suggested that visual impairments, similarly to neurodevelopmental disorders such as autism, can significantly affect attention towards socially relevant visual information. The project introduced a model-driven approach to understanding the social abilities of people with VI and ASD. Results included the development of a model that, for the first time, agreed with human subjects when classifying videos of social interactions; this model served as a basis for investigating whether it is possible to build computational models sensitive enough to recognize the impairments of people with VI and ASD. The data collected from human subjects with sensory and cognitive impairments provided an important ground truth that future studies can use to discover which components of diverse model architectures contribute to differences in social perception between these populations. Finally, the experimental protocol developed during the project offered an effective tool to quantify the extent to which different groups have limited social perception, and under what conditions. This outcome contributed to a better understanding of both neurodevelopmental and sensory atypical conditions, and supported the potential of a data-driven approach to planning rehabilitation interventions.

Differences in visual attention between typically developing (TYP) and visually impaired (VI) children.
