Resource Web: Open Data Repositories for CSD Research


Open Data

About this Resource


  • NIH's Resource for Finding Data Repositories - Hint: narrow your search by selecting "NIDCD" in the Institute or Center column.

  • TalkBank - The goal of TalkBank is to foster fundamental research in the study of human communication with an emphasis on spoken communication. Currently, TalkBank provides repositories in 14 research areas.

Child Language

  • PhonBank - PhonBank is the child phonology component of the TalkBank system.

  • SEED Corpus - The SEED corpus includes recordings of single words and continuous speech samples that provide examples of speakers with and without speech disorders.

  • UltraSuite - UltraSuite is a repository of ultrasound and acoustic data from child speech therapy sessions.

  • PERCEPT - Data for PERCEPT-R and PERCEPT-GFTA were collected during 34 separate cross-sectional and longitudinal studies at Syracuse University, Montclair State University, and New York University between 2006 and 2021.

  • ASDBank - ASDBank includes data on language development from children and adolescents with autism spectrum disorder. This bank is part of the TalkBank repository.

  • HomeBank - HomeBank is a resource for shared multi-hour, real-world recordings of children’s everyday experiences (for example, daylong home recordings using the LENA system), plus tools for analyzing those recordings. It is a component of the TalkBank system.

Adult Language & Aphasia

  • AphasiaBank - AphasiaBank is a shared database of multimedia interactions for the study of communication in aphasia. Access to the data in AphasiaBank is password protected and restricted to members of the AphasiaBank consortium group. This bank is part of the TalkBank repository.


  • FluencyBank - FluencyBank is a shared database for the study of fluency development. Participants include typically developing monolingual and bilingual children, children and adults who stutter (C/AWS) or who clutter (C/AWC), and second language learners. This bank is part of the TalkBank repository.

Bilingualism & Second Language Acquisition

  • BilingBank - BilingBank is a component of the TalkBank repository dedicated to providing corpora for the study of multilingualism.

  • SLABank - SLABank is a component of the TalkBank repository dedicated to providing corpora for the study of second language acquisition.


  • TORGO - The TORGO database of dysarthric articulation consists of aligned acoustics and measured 3D articulatory features from speakers with either cerebral palsy (CP) or amyotrophic lateral sclerosis (ALS), which are two of the most prevalent causes of speech disability (Kent and Rosen, 2004), and from matched controls.

Elaine Kearney and Austin Thompson contributed to this resource. Additional resources were sourced from Benway et al. (2023).

Are we missing something? Let us know!

© 2023 
