In Prep
-
Young Children Identify Knowledgeable Speakers Using Causal Influence.
[BibTeX]
A. Chuey, R. Z. Sparks, S. Wistreich, & H. Gweon. in prep
BibTeX:
@article{chuey2024,
  author = {Chuey, Aaron and Sparks, Robert Z and Wistreich, Suzannah and Gweon, Hyowon},
  title = {Young Children Identify Knowledgeable Speakers Using Causal Influence},
  year = {in prep}
}
Under Review
-
The BabyView dataset: High-resolution egocentric videos of infants’ and young children’s everyday experiences.
[Abstract]
[PrePrint]
[BibTeX]
B. Long, V. Xiang, S. Stojanov, R. Z. Sparks, Z. Yin, G. E. Keene, A. W. M. Tan, S. Y. Feng, C. Zhuang, V. A. Marchman, D. L. K. Yamins, & M. C. Frank. under review
Abstract: Human children far exceed modern machine learning algorithms in their sample efficiency, achieving high performance in key domains with much less data than current models. This “data gap” is a key challenge both for building intelligent artificial systems and for understanding human development. Egocentric video capturing children’s experience – their “training data” – is a key ingredient for comparison of humans and models and for the development of algorithmic innovations to bridge this gap. Yet there are few such datasets available, and extant data are low-resolution, have limited metadata, and, importantly, represent only a small set of children’s experiences. Here, we provide the first release of the largest developmental egocentric video dataset to date – the BabyView dataset – recorded using a high-resolution camera with a large vertical field-of-view and gyroscope/accelerometer data. This 493-hour dataset includes egocentric videos from children spanning 6 months – 5 years of age in both longitudinal, at-home contexts and in a preschool environment. We provide gold-standard annotations for the evaluation of speech transcription, speaker diarization, and human pose estimation, and evaluate models in each of these domains. We train self-supervised language and vision models and evaluate their transfer to out-of-distribution tasks including syntactic structure learning, object recognition, depth estimation, and image segmentation. Although performance in each domain scales with dataset size, overall performance is relatively lower than when models are trained on curated datasets, especially in the visual domain. Our dataset stands as an open challenge for robust, human-like AI systems: how can such systems achieve human levels of success on the same scale and distribution of training data as humans?
BibTeX:
@article{long2024,
  author = {Long, Bria and Xiang, Violet and Stojanov, Stefan and Sparks, Robert Z and Yin, Zi and Keene, Grace E and Tan, Alvin W M and Feng, Steven Y and Zhuang, Chengxu and Marchman, Virginia A and Yamins, Daniel L K and Frank, Michael C},
  title = {The BabyView dataset: High-resolution egocentric videos of infants’ and young children’s everyday experiences},
  preprint = {https://arxiv.org/abs/2406.10447},
  year = {under review}
}
-
A universal of human social cognition: Children from 17 communities process gaze in similar ways.
[Abstract]
[PrePrint]
[BibTeX]
M. Bohn, J. Prein, A. Ayikoru, F. M. Bednarski, A. Dzabatou, M. C. Frank, A. M. E. Henderson, J. Isabella, J. Kalbitz, P. Kanngeisser, D. Keşşafoğlu, B. Köymen, M. V. Manrique-Hernandez, S. Magazi, L. Mújica-Manrique, J. Ohlendorf, D. Olaoba, W. R. Pieters, S. Pope-Caldwell, K. Slocombe, R. Z. Sparks, … D. B. M. Haun. under review
Abstract: Theoretical accounts assume that key features of human social cognition are universal. Here we focus on gaze-following, the bedrock of social interactions and coordinated activities, to test this claim. In this comprehensive cross-cultural study spanning five continents and 17 distinct cultural communities, we examined the development of gaze-following in early childhood. We identified key processing signatures through a computational model that assumes that participants follow an individual’s gaze by estimating a vector emanating from the eye-center through the pupil. Using a single reliable touchscreen-based task, we found these signatures in all communities, suggesting that children worldwide processed gaze in highly similar ways. Absolute differences in performance between groups are accounted for by a cross-culturally consistent relationship between children’s exposure to touchscreens and their performance in the task. These results provide strong evidence for a universal process underlying a foundational socio-cognitive ability in humans that can be reliably inferred even in the presence of cultural variation in overt behavior.
A minimal sketch of this vector-based gaze model follows the BibTeX entry below.
BibTeX:
@article{bohn2024,
  author = {Bohn, Manuel and Prein, Julia and Ayikoru, Agnes and Bednarski, Florian M and Dzabatou, Ardain and Frank, Michael C and Henderson, Annette M E and Isabella, Joan and Kalbitz, Josefine and Kanngeisser, Patricia and Keşşafoğlu, Dilara and Köymen, Bahar and Manrique-Hernandez, Maria V and Magazi, Shirley and Mújica-Manrique, Lizbeth and Ohlendorf, Julia and Olaoba, Damilola and Pieters, Wesley R and Pope-Caldwell, Sarah and Slocombe, Katie and Sparks, Robert Z and Sunderarajan, Jahnavi and Vieira, Wilson and Zhang, Zhen and Zong, Yufei and Stengelin, Roman and Haun, Daniel B M},
  title = {A universal of human social cognition: Children from 17 communities process gaze in similar ways},
  preprint = {https://osf.io/preprints/psyarxiv/z3ahv},
  year = {under review}
}
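The computational model in the abstract above estimates gaze as a vector from the eye center through the pupil. The following minimal Python sketch illustrates that idea; the function names, coordinates, and the cosine-alignment rule for selecting a target are illustrative assumptions, not the authors' implementation.

import numpy as np

# Illustrative vector-based gaze model: the gaze direction is the ray from
# the eye center through the pupil; the inferred referent is the candidate
# target whose direction best aligns with that ray. All values hypothetical.

def gaze_vector(eye_center, pupil):
    """Unit vector from the eye center through the pupil."""
    v = np.asarray(pupil, float) - np.asarray(eye_center, float)
    return v / np.linalg.norm(v)

def inferred_target(eye_center, pupil, targets):
    """Index of the target most aligned with the estimated gaze vector."""
    e = np.asarray(eye_center, float)
    g = gaze_vector(e, pupil)
    def alignment(t):
        d = np.asarray(t, float) - e
        return float(np.dot(g, d / np.linalg.norm(d)))  # cosine similarity
    return max(range(len(targets)), key=lambda i: alignment(targets[i]))

# Example: the pupil is displaced down and to the left of the eye center,
# so the leftmost of three hypothetical on-screen targets is chosen.
targets = [(-2.0, -1.0), (0.0, -1.0), (2.0, -1.0)]
print(inferred_target(eye_center=(0.0, 0.0), pupil=(-0.2, -0.1), targets=targets))  # 0

In the actual task the relevant quantities would be measured in screen coordinates from the animated agent's face; the sketch only shows the core geometric inference.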
-
Learning Variability Network Exchange (LEVANTE): A global framework for measuring children’s learning variability through collaborative data sharing.
[BibTeX]
M. C. Frank, H. Baumgartner, M. Braginsky, G. Kachergis, A. Lightbody, R. Z. Sparks, R. Zhu, K. A. Dodge, & A. Cubillo. under review
BibTeX:
@article{frank2024,
  author = {Frank, Michael C and Baumgartner, Heidi and Braginsky, Mika and Kachergis, George and Lightbody, Amy and Sparks, Robert Z and Zhu, Rebecca and Dodge, Kenneth A and Cubillo, Ana},
  title = {Learning Variability Network Exchange (LEVANTE): A global framework for measuring children’s learning variability through collaborative data sharing},
  year = {under review}
}
-
Measuring variation in gaze following across communities, ages, and individuals — a showcase of the TANGO–CC.
[Abstract]
[PrePrint]
[BibTeX]
J. Prein, F. M. Bednarski, A. Dzabatou, M. C. Frank, A. M. E. Henderson, J. Isabella, J. Kalbitz, P. Kanngeisser, D. Keşşafoğlu, B. Köymen, M. V. Manrique-Hernandez, S. Magazi, L. Mújica-Manrique, J. Ohlendorf, D. Olaoba, W. R. Pieters, S. Pope-Caldwell, U. Sen, K. Slocombe, R. Z. Sparks, R. Stengelin, … M. Bohn. under review
Abstract: Cross-cultural studies are crucial for investigating the cultural variability and universality of cognitive developmental processes. However, cross-cultural assessment tools in cognition across languages and communities are limited. This paper describes a gaze following task designed to measure basic social cognition across individuals, ages, and communities (TANGO–CC). The task was developed and psychometrically assessed in one cultural setting and, with input of local collaborators, adapted for cross-cultural data collection. Minimal language demands and the web-app implementation allow fast and easy contextual adaptations to each community. The TANGO–CC captures individual- and community-level variation and shows good internal consistency in a data set from 2.5- to 11-year-old children from 17 diverse communities. Within-community variation outweighed between-community variation. We provide an open-source website for researchers to customize and use the task (https://ccp-odc.eva.mpg.de/tango-cc). The TANGO–CC can be used to assess basic social cognition in diverse communities and provides a roadmap for researching community-level and individual-level differences across cultures.
BibTeX:
@article{prein2024,
  author = {Prein, Julia and Bednarski, Florian M and Dzabatou, Ardain and Frank, Michael C and Henderson, Annette M E and Isabella, Joan and Kalbitz, Josefine and Kanngeisser, Patricia and Keşşafoğlu, Dilara and Köymen, Bahar and Manrique-Hernandez, Maria V and Magazi, Shirley and Mújica-Manrique, Lizbeth and Ohlendorf, Julia and Olaoba, Damilola and Pieters, Wesley R and Pope-Caldwell, Sarah and Sen, Umay and Slocombe, Katie and Sparks, Robert Z and Stengelin, Roman and Sunderarajan, Jahnavi and Sutherland, Kirsten and Tusiime, Florence and Vieira, Wilson and Zhang, Zhen and Zong, Yufei and Haun, Daniel B M and Bohn, Manuel},
  title = {Measuring variation in gaze following across communities, ages, and individuals — a showcase of the TANGO–CC},
  preprint = {https://osf.io/preprints/psyarxiv/fcq2g},
  year = {under review}
}
2024
-
Characterizing Contextual Variation in Children’s Preschool Language Environment Using Naturalistic Egocentric Videos.
[Abstract]
[Link]
[OSF]
[PDF]
[PrePrint]
[BibTeX]
R. Z. Sparks, B. Long, G. E. Keene, M. J. Perez, A. W. M. Tan, V. A. Marchman, & M. C. Frank. (2024). In Proceedings of the 46th Annual Conference of the Cognitive Science Society.
Abstract: What structures children’s early language environment? Large corpora of child-centered naturalistic recordings provide an important window into this question, but most available data center on young children within the home or in lab contexts, interacting primarily with a single caregiver. Here, we characterize children’s language experience in a very different kind of environment: the preschool classroom. Children ages 3 – 5 years (N = 26) wore a head-mounted camera in their preschool class, yielding a naturalistic, egocentric view of children’s everyday experience across many classroom activity contexts (e.g., sand play, snack time), with >30 hours of video data. Using semi-automatic transcriptions (227,624 words), we find that activity contexts in the preschool classroom vary in both the quality and quantity of the language that children hear and produce. Together, these findings reinforce prior theories emphasizing the contribution of activity contexts in structuring the variability in children’s early learning environments.
BibTeX:
@inproceedings{sparks2024,
  author = {Sparks, Robert Z and Long, Bria and Keene, Grace E and Perez, Malia J and Tan, Alvin W M and Marchman, Virginia A and Frank, Michael C},
  booktitle = {Proceedings of the 46th Annual Conference of the Cognitive Science Society},
  title = {Characterizing Contextual Variation in Children’s Preschool Language Environment Using Naturalistic Egocentric Videos},
  website = {https://escholarship.org/uc/item/94j9m5v1},
  pdf = {https://rbzsparks.github.io/papers/2024__Sparks_CogSci.pdf},
  osf = {https://osf.io/967zv/},
  preprint = {https://osf.io/preprints/psyarxiv/y75zu},
  year = {2024}
}
2023
-
The BabyView Camera: Designing a New Head-mounted Camera to Capture Children’s Early Social and Visual Environments.
[Abstract]
[PDF]
[OSF]
[DOI]
[PrePrint]
[BibTeX]
B. Long, G. Kachergis, V. A. Marchman, S. F. Radwan, R. Z. Sparks, V. Xiang, C. Zhuang, O. Hsu, B. Newman, D. L. K. Yamins, & M. C. Frank. (2023). Behavior Research Methods. doi:10.3758/s13428-023-02206-1
Abstract: Head-mounted cameras have been used in developmental psychology research for more than a decade to provide a rich and comprehensive view of what infants see during their everyday experiences. However, variation between these devices has limited the field’s ability to compare results across studies and across labs. Further, the video data captured by these cameras to date have been relatively low-resolution, limiting how well machine learning algorithms can operate over these rich video data. Here, we provide a well-tested and easily constructed design for a head-mounted camera assembly—the BabyView—developed in collaboration with Daylight Design, LLC, a professional product design firm. The BabyView collects high-resolution video, accelerometer, and gyroscope data from children approximately 6 – 30 months of age via a GoPro camera custom-mounted on a soft child-safety helmet. The BabyView also captures a large, portrait-oriented vertical field-of-view that encompasses both children’s interactions with objects and with their social partners. We detail our protocols for video data management and for handling sensitive data from home environments. We also provide customizable materials for onboarding families with the BabyView. We hope that these materials will encourage the wide adoption of the BabyView, allowing the field to collect high-resolution data that can link children’s everyday environments with their learning outcomes.
BibTeX:
@article{long2023,
  author = {Long, Bria and Kachergis, George and Marchman, Virginia A and Radwan, Samaher F and Sparks, Robert Z and Xiang, Violet and Zhuang, Chengxu and Hsu, Oliver and Newman, Brett and Yamins, Daniel L K and Frank, Michael C},
  journal = {Behavior Research Methods},
  title = {The BabyView Camera: Designing a New Head-mounted Camera to Capture Children’s Early Social and Visual Environments},
  website = {https://link.springer.com/article/10.3758/s13428-023-02206-1},
  preprint = {https://psyarxiv.com/238jk/},
  osf = {https://osf.io/kwvxu/},
  doi = {10.3758/s13428-023-02206-1},
  year = {2023}
}
-
Young children can identify knowledgeable speakers from their causal influence over listeners.
[Abstract]
[Link]
[OSF]
[PDF]
[BibTeX]
A. Chuey, R. Z. Sparks, & H. Gweon. (2023). In Proceedings of the 45th Annual Conference of the Cognitive Science Society.
Abstract: Prior work demonstrates an early-emerging understanding of how speakers can alter listeners’ minds and actions. Yet an abstract understanding of communication entails more than forward inferences about its influence on the listener; it also supports inverse inferences about the speaker based on its causal influence over the listener. Can children reason about the minds of speakers based on their causal influence over listeners? Across three studies, children viewed two communicative exchanges where a listener attempted to activate a toy; we manipulated when speakers communicated (Exp. 1), how listeners’ subsequent actions changed (Exp. 2), and whether speakers spoke or sneezed (Exp. 3). By 5 years of age, children inferred that the speaker who appeared to cause the listener to succeed was more knowledgeable, but only when they produced speech. These results suggest children can reason causally about the sources of communication, identifying knowledgeable speakers based on their influence over a listener’s actions and their outcomes.
BibTeX:
@inproceedings{chuey2023,
  author = {Chuey, Aaron and Sparks, Robert Z and Gweon, Hyowon},
  booktitle = {Proceedings of the 45th Annual Conference of the Cognitive Science Society},
  pages = {230--235},
  title = {Young children can identify knowledgeable speakers from their causal influence over listeners},
  website = {https://escholarship.org/uc/item/9qh630dn},
  pdf = {https://rbzsparks.github.io/papers/2023_Chuey_Sparks_Gweon_CogSci.pdf},
  osf = {https://osf.io/derxp/?view_only=d3ad5730e321405da0e5347dfb35a3f0},
  year = {2023}
}
2022
-
Preschool-Aged Children Can Infer What Speakers Know Based on How They Influence Others.
[Abstract]
[Link]
[DOI]
[PDF]
[BibTeX]
R. Z. Sparks, A. Chuey, & H. Gweon. (2022). Stanford Digital Repository.
Abstract: How do we know what others know? Prior work has examined how children use evidence about isolated agents, like their perceptual access and actions, to infer what they know. However, humans are rarely fully isolated; instead, we are often surrounded by others whom we interact with, influence, and are influenced by. In these contexts, we can use a speaker’s communication and the way it causes a listener to behave to infer what that speaker knows, even if we do not know the specific content of what was communicated. The present studies investigated how preschool-aged children use two pieces of evidence about listeners to reason about what speakers know: changes in the outcomes of a listener’s actions following communication (Study 1) and changes in a listener’s actions themselves following communication (Study 2). In both studies, children observed two scenarios where a listener failed to activate a toy before succeeding. In Study 1, children observed a speaker produce nonsense language towards a listener after the listener failed but before they succeeded in activating a toy, as well as another speaker who spoke to a listener prior to initial failure. In Study 2, children observed a speaker communicate with a listener before a distinct change in action, followed by success, as well as another speaker who communicated with a listener resulting in no distinct change in action, followed by success. When asked which speaker knows how to make the toy work, 5-year-olds chose the speaker who appeared to cause the listener to succeed (Study 1) or change their action (Study 2). These results suggest that preschool-aged children are sensitive to the way speakers influence others via communication and can use evidence of that influence to infer what speakers know. More broadly, these studies highlight children’s ability to reason about the knowledge of one agent (a speaker) based primarily on evidence about another agent (a listener).
BibTeX:
@thesis{sparks2022,
  author = {Sparks, Robert Z and Chuey, Aaron and Gweon, Hyowon},
  title = {Preschool-Aged Children Can Infer What Speakers Know Based on How They Influence Others},
  journal = {Stanford Digital Repository},
  website = {https://purl.stanford.edu/xx316hn9817},
  doi = {10.25740/xx316hn9817},
  pdf = {https://rbzsparks.github.io/papers/2022_honors_thesis.pdf},
  year = {2022}
}