FAQs for Omics Phenotypes Related to Down Syndrome for the INCLUDE Project

The following FAQs are specific to :

How will X01 projects/cohort be selected?

Investigators whose projects are selected for this opportunity will be notified by NIH INCLUDE Project staff with the estimated number of samples approved for sequencing. Since there is no “award” associated with the X01 mechanism, X01 decisions are not finalized by an NIH Institute or Center (IC) Council. Rather, following initial peer review, recommended applications will receive a second level of review by the NIH staff involved in the INCLUDE Project, and decisions are approved by the NIH INCLUDE Project Steering Committee. The following will be considered in making cohort selections:

  • Scientific and technical merit of the proposed project as determined by scientific peer review
  • Availability of funds
  • Relevance of the proposed project to program priorities
  • Value of incorporating the dataset into the Data Hub to empower research among the Down syndrome community.
  • Compliance with resource sharing policies as appropriate and ability to broadly share and use data from the cohort in line with the goals of the program (i.e., combining and cross-analyzing genomic datasets). INCLUDE Project staff reserve the right to not include cohorts that cannot be broadly shared or cross-analyzed with other INCLUDE datasets.
  • Informative study design and sufficient clinical and phenotypic data.
  • Availability of samples in a timely manner.
  • Sample quality in terms of suitability for whole genome sequencing (as well as exome, transcriptome and other omics if applicable).
  • Please note that DNA from patient-derived cell lines will not be accepted due to the possible introduction of mutations that could confound the identification of disease-causing rare variants.

Approval to access the sequencing capacity is conditional on the submission of a completed Institutional Certification covering all samples to be submitted for sequencing. If the document does not meet the INCLUDE Project's expectation for broad data sharing (i.e., General Research Use), another cohort with broader sharing may be selected instead. For more information, please see our FAQs on Data Sharing.

What information is required as "Other Attachments"?

INCLUDE Project is asking for specific information to be summarized and included as attachments. This is described in the NOFO under Section IV. Application and Submission Information under the subheading SF424(R&R) Other Project Information. Applicants must include:

  • Institutional Certification – Institutional Certifications specify the data use limitations and data use limitation modifiers, as determined by the institution’s IRB or equivalent body based on the informed consent agreed to by the participants.
    • If the IRB or equivalent body has not completed its review and therefore the institution cannot attest to all of the elements of the formal Institutional Certification, a provisional Institutional Certification is acceptable. If a provisional Institutional Certification is submitted, the applicant is asked to describe the anticipated data use limitations and data use limitation modifiers. For institutional and/or provisional certifications, please use the current template.
  • Sample Information, including type (e.g., DNA, RNA), tissue source, fixation method (when appropriate), and description of clinical and phenotypic data that are available to be shared through the INCLUDE Data Hub. Applications that propose submitting rich phenotypic data sets will be looked upon favorably.
  • Optional – Family Structure or Pedigrees
  • The INCLUDE Project has developed a template that applicants can use to summarize the samples, phenotype data, and data use limitations (if needed) for the proposed cohort. You can also use the DCC’s template to create a data dictionary describing the data that will be provided to the INCLUDE DCC for sharing with the broader research community upon release of the dataset. While applicants are required to provide this information, the use of these template forms is optional. Applicants may submit the required information in whatever format meets their individual purposes if it provides, at a minimum, the information requested in the NOFO.

Do the cohorts have to be properly consented before applying for the X01?

Participants in cohorts selected under this NOFO must have given consent to allow sharing of individual-level genome sequence and relevant phenotype data through dbGaP or other NIH-approved repositories. Applicants must document this by submitting, as an attachment, an Institutional Certification (or Provisional Certification with a description of anticipated data use limitations) that covers samples from all sites (see question above).

Cohort samples that have consents allowing for broad data sharing (e.g. for General Research Use with no data use limitation modifiers) will be given highest priority. No funds will be provided for obtaining new consent for existing samples. Consent to re-contact participants for additional phenotyping or collection of additional samples is strongly encouraged. Applicants are required to describe any data use limitations.

What biospecimen information and phenotype data elements are expected?

Certain biospecimen and clinical/phenotype data are expected in order to process and analyze datasets; however, deep phenotyping is preferred. For phenotype data, the following data elements are expected, where available:
sex, race, ethnicity, age at enrollment and/or diagnosis, diagnoses (e.g., type of co-occurring conditions), phenotypes for affected cases and unaffected family members, vital status, age at last known vital status, clinical information, and family medical history (e.g., family history of cancer or birth defects).
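
As an informal aid, the expected data elements listed above can be checked against the column names of a phenotype file before submission. This is only a minimal sketch: the column names below are hypothetical, since this FAQ lists the data elements but does not prescribe a file layout or naming scheme.

```python
# Hypothetical column names for the data elements listed above;
# the FAQ does not define a required naming scheme.
EXPECTED_ELEMENTS = [
    "sex",
    "race",
    "ethnicity",
    "age_at_enrollment",
    "diagnoses",
    "vital_status",
    "age_at_last_vital_status",
    "family_medical_history",
]


def missing_elements(columns):
    """Return the expected elements that are absent from a list of column names."""
    present = {c.strip().lower() for c in columns}
    return [e for e in EXPECTED_ELEMENTS if e not in present]


# Example: a header row from an illustrative phenotype file.
header = ["Sex", "Race", "Ethnicity", "age_at_enrollment", "diagnoses"]
print(missing_elements(header))
```

Elements are expected "where available", so a non-empty result is a prompt to double-check the cohort's data rather than a failure.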

For templates and additional resources related to information required, please visit .

If investigators have already registered a project in dbGaP, and are seeking Whole Genome Sequencing (WGS) through INCLUDE Project for samples from the same cohort, is a new Institutional Certification required?

As long as the Institutional Certification for the registered project complies with NIH Genomic Data Sharing policy and covers all of the participants whose samples will be sequenced through the INCLUDE Project, a new certification is not required with the application. However, the Genomic Program Administrator (GPA) may ask for an Institutional Certification using the most recent NIH template, if needed, prior to registering the study in dbGaP.

Is it important to know the source of the DNA for samples being submitted for WGS through INCLUDE Project?

It is important to know the source of the DNA for samples provided to INCLUDE Project Sequencing Centers. We ask that applicants provide a description of the samples, such as collection site; number of samples included in the study; a detailed inventory of the sources of the DNA (e.g., number of samples from blood, number of samples from saliva); and previous genotyping or sequencing. DNA from fresh/frozen blood or tissue is ideal for sequencing, as DNA from saliva can be contaminated with microbial DNA, which may result in higher costs (and therefore reduce the number of total samples that can be sequenced). Cell lines will not be accepted because they often have significant genomic differences compared to the original germline which could complicate analysis. There are circumstances where studies might include induced pluripotent stem cells (iPSCs), but even then, a normal sample for comparison may be desirable.

What is the role of the INCLUDE Data Coordinating Center (DCC) and how will data submitted to the INCLUDE Data Hub be shared?

  • The INCLUDE Data Coordinating Center (DCC) has launched the INCLUDE Data Hub to facilitate data submission, harmonization, sharing, and interoperability of data generated by INCLUDE projects and other NIH-designated data repositories, as appropriate. To explore the data available in the INCLUDE Data Hub, visit the INCLUDE Portal.
  • The INCLUDE Data Hub’s data sharing model is based on the following set of core principles:
    • Accelerating research through broad data sharing
    • Fostering transparency and collaboration among researchers and other community members
    • Maximizing data availability and searchability through indexing and visualizations in the INCLUDE Data Hub Portal
    • Managing sensitive data according to participant consent and existing governance structures where appropriate
  • The INCLUDE DCC intends to make all INCLUDE data Findable, Accessible, Interoperable and Reusable (FAIR) through the INCLUDE Portal, the primary entry point to the INCLUDE Data Hub. However, access to some datasets may require additional approvals, for example, from the NIH Data Access Committees (via dbGaP) for individual-level genomic datasets or consortia approvals (see below).
  • INCLUDE data will be shared as rapidly as feasible and in line with the Final NIH Policy for Data Management and Sharing and the goals of the NIH INCLUDE Project.
  • For any datasets submitted to the INCLUDE Data Hub that require controlled access (e.g., genomic data), any consent-based limitations on data use will be documented, and access will be managed through dbGaP.

For questions about submitting and sharing data through the INCLUDE Data Hub, contact: info@includedcc.org.

It seems that no funds will be awarded to investigators, but a detailed analytic plan is requested. Are investigators expected to obtain funds to support analysis separately?

There are no direct funds available under the X01 opportunity to support analysis of sequence data or other activities. The request for applicants to provide an analysis plan is intended to increase the likelihood that the samples to be sequenced are of high quality, that the number of specimens is appropriate for the stated aims, and that those submitting X01 applications will be prepared to do the analyses. Those investigators providing the samples are likely to have a significant advantage in conducting analyses because they are familiar with the cohort, and they will be interacting directly with NIH, sequencing centers, and the INCLUDE DCC throughout the process.

Each X01 investigator team has six months of proprietary access to the sequence data before it is released to the public for controlled access via dbGaP. To learn about funding opportunities for supporting data analysis, see Active INCLUDE Funding Opportunities, especially those that support data analysis, curation, and/or sharing of DS-related data.

Is it possible to submit an application with multiple PIs from different Institutions in order to build an adequate sample size or create a larger, more compelling cohort? Alternatively, is it possible to reach an adequate sample size by adding trios or families with a different childhood cancer or structural birth defect?

Efforts to increase sample number by collaborations across institutions are acceptable and encouraged. Strong justification for the proposed sample size is expected in each application. Increasing sample numbers by aggregating across related conditions is acceptable. However, applicants doing this should be prepared to provide a description of the analyses that will be performed across the aggregated cohort, and it may be easier to do this for sets of samples with related phenotypes or suspected underlying pathways. In addition, investigators should state how aggregating samples won’t slow the process of sending samples to the Sequencing Centers.

Should we propose quality metrics for the genome sequencing?

No, this is not necessary. You should note the quality of the samples being proposed for submission.

Do applicants need to describe the capacity to store BAM files?

Applicants are encouraged to make use of the cloud-based workspace provided by the INCLUDE Data Hub. Therefore, local download and processing of data may not be necessary for interacting with INCLUDE datasets. If your group plans to download data to a local server as part of the data management plan, it is important to make clear that your team has the capacity (including equipment, security infrastructure, and physical resources) at your institution to securely accept and store large data files. If your group plans to make use of cloud-based workspaces, please describe a plan for analyzing data in such spaces. For information about the DCC cloud-based workspaces, visit.

Data may be stored/hosted on local cloud-based platforms.

Although the maximum project period is 1 year, could one propose to sequence 70 trios now and then add 50 trios next year after additional collections?

All samples must be extracted, properly consented, and ready to send off to the sequencing center shortly after the review date. Please refer to the NOFO for a more detailed timeframe.

Who is responsible for data deposition?

The sequencing center is responsible for deposition of the sequence data into an NIH-approved data repository (e.g., dbGaP or the INCLUDE Data Hub). The study Principal Investigator will be responsible for directly submitting the clinical/phenotypic data to the INCLUDE DCC.

What amount and concentration of samples will be required and what will be the coverage? Please keep in mind that the following information may be updated by the time the X01 is awarded; please use the latest information provided by the omics centers before shipping your samples.

Whole genome sequencing (WGS) of germline DNA will be done at 30X mean coverage using paired end sequencing. Depending on the sequencing center’s protocol, tumors may be sequenced at 60X or 30X mean coverage using paired end sequencing combined with whole exome sequencing (WES) and RNA sequencing both at 100X also using paired end sequencing. The NIH and sequencing center staff will work with each project to determine the best coverage and approach for sequencing and analysis of tumors and/or affected tissue.
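
To make "30X mean coverage" concrete, mean coverage is commonly approximated as (number of reads × read length) / genome size. The sketch below applies that relation under stated assumptions: a human genome of roughly 3.1 billion bases and 150-base reads. Neither figure comes from this FAQ, and the sequencing center's actual protocol may differ.

```python
# Assumptions for illustration only (not stated in this FAQ):
GENOME_BP = 3_100_000_000  # approximate human genome size in bases
READ_LEN_BP = 150          # a common short-read length


def reads_for_coverage(coverage, genome_bp=GENOME_BP, read_len_bp=READ_LEN_BP):
    """Reads needed so that reads * read_len / genome_bp >= coverage (integer ceiling)."""
    total_bases = coverage * genome_bp
    return -(-total_bases // read_len_bp)  # ceiling division on integers


reads = reads_for_coverage(30)
pairs = -(-reads // 2)  # paired-end sequencing yields two reads per fragment
print(reads, pairs)
```

Under these assumptions, 30X corresponds to about 620 million reads, or about 310 million read pairs, per genome.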

Amount of DNA/RNA and coverage

| Assay | Amount DNA or RNA required/recommended | Concentration | Coverage | Additional info |
|---|---|---|---|---|
| WGS (Short-read)* | ~2 ug DNA | 20-50 ng/ul preferred | 30X | paired end reads |
| WES | 275 ng DNA (minimum); 1 ug recommended | 20 ng/ul (minimum) | 100X, greater than 80% coding exons covered at 20X | paired end reads |
| RNA-Seq | 750 ng total RNA (minimum); 1 ug recommended | 20 ng/ul (minimum) | 100X, greater than 40% coding exons covered at 20X | paired end reads |

Epigenetics (Infinium® MethylationEPIC 850K BeadChip) – 1,000 ng of genomic DNA for bisulfite array profiling

Metabolomics (mass spectrometry) – 200 microliters. EDTA plasma is recommended. Applicants should provide a list of specific metabolites that would be required for their analysis.

Proteomics (Olink) – 200 microliters. EDTA plasma is recommended.

*Long-read sequencing technologies such as those offered by PacBio have specific requirements. Contact the sequencing center or program staff for more information.
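
The stated minimums above can be turned into a quick pre-shipment check. The following is a minimal sketch, not an official tool: it covers only WES and RNA-Seq, because those are the assays for which explicit minimums are listed, and the dictionary keys and function name are illustrative. As noted above, requirements may change, so confirm current values with the omics centers.

```python
# Minimums transcribed from the amounts listed above (WES and RNA-Seq only).
# Keys and field names are illustrative, not an official schema.
MINIMUMS = {
    "WES": {"amount_ng": 275, "conc_ng_per_ul": 20},
    "RNA-Seq": {"amount_ng": 750, "conc_ng_per_ul": 20},
}


def check_sample(assay, amount_ng, conc_ng_per_ul):
    """Return a list of problems; an empty list means the stated minimums are met."""
    req = MINIMUMS[assay]
    problems = []
    if amount_ng < req["amount_ng"]:
        problems.append(f"amount {amount_ng} ng is below the {req['amount_ng']} ng minimum")
    if conc_ng_per_ul < req["conc_ng_per_ul"]:
        problems.append(
            f"concentration {conc_ng_per_ul} ng/ul is below the "
            f"{req['conc_ng_per_ul']} ng/ul minimum"
        )
    return problems


print(check_sample("WES", amount_ng=300, conc_ng_per_ul=25))
print(check_sample("RNA-Seq", amount_ng=500, conc_ng_per_ul=15))
```

Meeting the minimum is not the same as meeting the recommended 1 ug, so samples that pass this check may still benefit from a larger input amount.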

Are applicants expected to describe how results will be returned to study participants or how incidental findings will be reported?

Decisions about returning individual results and incidental findings to study participants lie with the institution and its IRB or equivalent body and are outlined in the consent form agreed to by participants. NIH does not require that INCLUDE X01 applicants describe a plan for return of results. Investigators and participants should keep in mind that the technology used to generate sequence data in this program is designed for research purposes, not for identifying clinical results. Communicating clinically meaningful results to participants requires sequencing and analysis by a CLIA-approved laboratory. Since the INCLUDE Project is focused on research and discovery, CLIA sequencing is not provided.

Can I just propose other omics without WGS data in my application?

In general, we will provide omics measurements only on samples that have or will have WGS data. Please contact NIH staff if you have a special case for consideration.

What are the omics that can currently be requested by X01 Investigators?

They are Whole Genome Sequencing (WGS), Epigenetics (methylation), RNA-seq, Metabolomics, and/or Proteomics. Single-cell omics technology will be introduced in the near future.

Are there any requirements for previous omics being included for an application?

No, but the applicant should note previous technologies that could indicate sample quality, e.g., array-based genotyping or Whole Exome Sequencing to indicate DNA quality.

Is there a minimal requirement for prior omics analysis?

No. However, if the application includes WGS or other omics data generated elsewhere, investigators may need to address whether the existing data can be part of the INCLUDE Data Hub in terms of data sharing, quality, and formats.

Where will the assays be performed?

The INCLUDE Project will leverage the Kids First Sequencing Center at the Broad Institute for WGS, RNA Sequencing and Long Read Sequencing; and will leverage the NHLBI TOPMed Omics Centers for Methylomics, Proteomics and Metabolomics.

More details about the TOPMed Omics Centers will be provided after awards are made. A list of previously awarded centers can be found on the TOPMed website.

Are there examples of the Optional Tables for requirements described in “Other Attachments” (see Section IV.2 of PAR-24-081)?

You may use the Optional Tables to help address the information requested in the “Other Attachments” section of PAR-24-081. The use of these tables is optional; applicants may choose to describe the data use limitations, samples, clinical/phenotypic data, and family structures in another format. The tables serve to help both applicants and reviewers by providing a uniform structure for organizing this information.

This page last reviewed on January 30, 2024