A Comprehensive List of ASC Data Resources

The value of big, open data research is increasingly being recognised, especially with the development of notable data-sharing initiatives and accessible resources. These resources are available for professionals interested in conducting research on ASC. To bring awareness to these resources and progress ASC research, ShARL conducted a systematic review identifying and describing sources of multiple data types such as phenotypic, neuroimaging, and genetic data.

The resources are organised in the tables below and can be updated by ShARL and the research community. To propose a resource for inclusion on this list, kindly click on 'Submit New Resource' and complete a short form. We will then upload the details to this page. The resources contain data either from individuals with ASC or data relevant to ASC research (e.g. from individuals with certain genetic profiles or syndromes related to ASC research).

Associated academic publication: Al-jawahiri, R. & Milne, E. (2017). Resources available for autism research in the big data era: a systematic review. PeerJ, 5, e2880.

| Resource | Metadata | Data Type Category | Data Type | Number of Participants with ASC / Description |
| --- | --- | --- | --- | --- |
| National Database for Autism Research (NDAR) | (Hall et al., 2012), (Novikova et al., 2013), (Torgerson et al., 2015), (Payakachat et al., 2016) | Phenotypic, neuroimaging, genetic, omics | Phenotypic, neuroimaging, genetic, omics data | Over 80,203 participants (however, this number includes the control participants of the ASC studies). |
| Simons Foundation Autism Research Initiative (SFARI) | (Fischbach and Lord, 2010), (Simons VIP Consortium, 2012) | Phenotypic, neuroimaging, genetic | Phenotypic data, biospecimens, genetic data, neuroimaging data, participant recruitment (to recruit SSC families for additional studies) | Over 3,000 participants (SSC), over 200 participants (Simons VIP), 50,000 participants (SPARK)¹. |
| Autism Genetic Resource Exchange (AGRE) | (Geschwind et al., 2001), (Lajonchere and AGRE Consortium, 2010) | Phenotypic, genetic, biospecimens | Phenotypic data, genetic data, biospecimens | Over 1,700 families with over 3,300 ASC participants. |
| Interactive Autism Network (IAN) | - | ASC participant recruitment services | Phenotypic data, ASC participant recruitment services | Over 17,000 participants. |
| Autism Spectrum Database-UK (ASD-UK) | (Warnell et al., 2015) | ASC participant recruitment services | Phenotypic data, ASC participant recruitment services | Over 3,000 families. |
| Autism BrainNet | - | BioBank | Postmortem brain and related biospecimens | Over 25 donations (since 2014)¹. |
| Autism Brain Imaging Data Exchange (ABIDE) | (Di Martino et al., 2014) | Neuroimaging | Resting state functional magnetic resonance imaging (R-fMRI), structural MRI, phenotypic data | 539 participants (ABIDE I), 487 participants (ABIDE II). |
| Australian EEG Database (AED)² | (Hunter et al., 2005) | Neuroimaging | EEG data | 50 participants³. |
| BrainMap⁴ | (Laird et al., 2005), (Laird et al., 2011) | Human brain statistical maps | fMRI, PET, and structural coordinate-based results (x, y, z) in Talairach or MNI space | 70 results / articles relevant to ASC functional data (using BrainMapWeb). |
| NeuroVault | (Gorgolewski et al., 2015), (Gorgolewski et al., 2016) | Human brain statistical maps | Unthresholded statistical maps, parcellations, and atlases produced by MRI and PET studies | Five studies with 277, 60, 50, 13, and 218 participants, respectively. |
| USC Multimodal Connectivity Database | (Brown et al., 2012), (Brown et al., 2016) | Brain connectivity matrices | Brain connectivity matrices of fMRI and DTI | 42 (fMRI) participants, 51 (DTI) participants. |
| Dryad | - | General data repository | lncRNA, MRI, metabolite, MEG | Four studies with 2, 34, 12, and 13 participants, respectively. |
| FigShare⁴ | (Singh, 2011), (Enis, 2013) | General data repository | Phenotypic, statistical, genetic data | - |
| NIMH Repository and Genomics Resource (NIMH-RGR) | - | Biospecimens, genetics | Biospecimens (DNA samples and cell lines, Induced Pluripotent Stem Cell (iPSC) and Source Cells), GWAS, genomic sequences | Biospecimens: 4,793 families and 19,359 individuals, of which 17,189 have DNA cell lines. Genome-Wide Association Studies (GWAS) data: 4 studies (1,232 cases, 739 families, 943 families, 935 families). Sequence data (exome): 2,119 cases. |
| Avon Longitudinal Study of Parents and Children (ALSPAC)⁵ | (Golding et al., 2011), (Fraser et al., 2013) | Phenotypic, clinical, biospecimens, genetic | Phenotypic, clinical, biospecimens, genetic (including GWAS, SNPs, VNTRs, in addition to sequence data from UK10K project available via EGA), ALSPAC data linked with data (e.g., routine health and social records) from external sources, bespoke data | 96 participants (as identified via follow-up questionnaires completed by carers when the proband was nine years old). |
| Coriell BioRepositories (including Autism Research Resource) | - | BioBank | Cell cultures, DNA samples, and induced pluripotent stem cells | 158 ASC cases. |
| NIH NeuroBioBank (NBB) | - | BioBank | Postmortem brain and related biospecimens | 64 ASC cases. 22 ASC suspected. |
| Medical Research Council London Neurodegenerative Diseases Brain Bank | - | BioBank | Postmortem brain and spinal cord tissue | 4 ASC cases. |

1 The data is not yet available; according to the SFARI website, it is intended to be made available at a future date.

2 There is no website or portal for the AED resource; however, the data is available via email requests to aed@newcastle.edu.au

3 The approximate number of ASC participants was obtained via email correspondence with aed@newcastle.edu.au

4 Accurate information regarding the approximate number of participants with ASC is not readily available on the website, due to the nature of the search functionality.

5 Data specifically from ASC participants are not necessarily available in all the different data types described in this table; further specific enquiries directed to the ALSPAC team are therefore advised.

Genetics and Omics Data Resources
| Resource | Metadata | Data Type Category | Data Type | Number of Participants with ASC / Description |
| --- | --- | --- | --- | --- |
| MSSNG | - | Genetic / Genomic | Phenotypic, genomic (whole genome sequencing of blood DNA) | 10,000 participants. However, data from only 3,000 probands is currently available. |
| Simons Foundation Autism Research Initiative Gene (SFARI Gene) | (Banerjee-Basu and Packer, 2010), (Abrahams et al., 2013) | Gene Catalogue | Animal Model, Protein Interaction (PIN), Gene Scoring, CNV | An up-to-date, manually annotated reference set of ASC-linked genes. |
| Autism Chromosome Rearrangement Database (ACRD) | (Marshall et al., 2008) | Gene Catalogue | Genomic structural variation data - CNVs | A curated catalogue of structural variation related to ASC extracted from publicly available literature and unpublished data. |
| Autism Knowledgebase (AutismKB) | (Xu et al., 2012) | Gene Catalogue | A collection of genes and variations associated with ASC with annotations | - |
| National Center for Biotechnology Information (NCBI) | (NCBI Resource Coordinators, 2013), (NCBI Resource Coordinators, 2017) | Genetics, omics | A collection of multiple resources - Omics and sequencing data | - |
| European Molecular Biology Laboratory (EMBL-EBI) | (Brooksbank et al., 2014) | Genetics, omics | A collection of multiple resources - Omics and sequencing data | - |
| Universal Protein Resource (UniProt) | (UniProt Consortium, 2015), (UniProt Consortium, 2017) | Protein sequences | Protein sequences and their annotations | Can be found among EMBL-EBI resources. 91 (reviewed) and 346 (unreviewed) protein records associated with ASC. |
| The European Genome-phenome Archive (EGA) | (Lappalainen et al., 2015) | Omics - Functional genomics | Interaction of genotype and phenotype (including data from UK10K project) | Can be found among EMBL-EBI resources. |
| Biological General Repository for Interaction Datasets (BioGRID) | (Stark et al., 2006), (Chatr-Aryamontri et al., 2015) | Omics | Genetic and protein interaction data | Resource that archives and disseminates genetic and protein interaction data. |
| Global Proteome Machine Database (GPM DB) | (Craig et al., 2004) | Omics - Proteomics | Proteomics data from tandem mass spectrometry | Open-source system for analyzing, storing, and validating proteomics information derived from tandem mass spectrometry. |
| PeptideAtlas | (Desiere et al., 2005), (Deutsch et al., 2015) | Omics - Proteomics | Peptide sequences, mapping - proteome information/data | A collection of peptides identified in a large set of tandem mass spectrometry proteomics experiments. |
| DNA DataBank of Japan (DDBJ) | (Tateno et al., 2002), (Kodama et al., 2012) | DNA and RNA sequences | DNA and RNA sequences | Annotated collection of all publicly available nucleotide sequences and their translated amino acid sequences. |
| The Chromosome 7 Annotation Project | (Scherer et al., 2003) | DNA sequences | DNA sequence and annotation of the entire human chromosome 7 | 84 cases. |
| miRBase: the microRNA database | (Kozomara and Griffiths-Jones, 2014) | miRNA sequences | miRNA sequences and annotation | - |
| Sullivan Lab Evidence Project (SLEP) | (Konneker et al., 2008) | Genetics, omics | A collection of genes and variations associated with ASC with annotations | Findings from genome wide linkage (GWL), genome wide association (GWA), and microarray (MA) studies for ASC. |