The integrated Human Lung Cell Atlas (HLCA) v1.0
Atlas Description
The integrated Human Lung Cell Atlas (HLCA) represents the first large-scale, integrated single-cell reference atlas of the human lung. It consists of over 2 million cells from the respiratory tract of 486 individuals and includes 49 different datasets. It is split into the fully integrated HLCA core reference and the extended or full HLCA.
HLCA Core
The HLCA core contains data from healthy lung tissue of 107 individuals, with manual cell type annotations based on consensus across six independent experts, as well as demographic, biological, and technical metadata. The datasets in the HLCA core were integrated using scANVI. The HLCA core can serve as a reference onto which new datasets are mapped using scArches, allowing the harmonized cell type labels to be transferred to any new dataset.
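To make the idea of label transfer concrete, here is a minimal toy sketch in plain Python. It does not call scArches and is not the HLCA pipeline. It only illustrates one simple way labels can be carried over once reference and query cells share an embedding: a k-nearest-neighbour majority vote. The function name `transfer_labels`, the 2-D coordinates, and the example labels are all assumptions made up for illustration.

```python
from collections import Counter
import math


def transfer_labels(ref_embedding, ref_labels, query_embedding, k=3):
    """Give each query cell the majority label of its k nearest reference cells."""
    transferred = []
    for q in query_embedding:
        # Euclidean distance from this query cell to every reference cell
        dists = [(math.dist(q, r), label) for r, label in zip(ref_embedding, ref_labels)]
        # Keep the k closest reference cells (sort on distance only)
        nearest = sorted(dists, key=lambda d: d[0])[:k]
        votes = Counter(label for _, label in nearest)
        transferred.append(votes.most_common(1)[0][0])
    return transferred


# Toy shared embedding: two well-separated groups of reference cells (made-up values)
ref_embedding = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
ref_labels = ["AT2", "AT2", "AT2", "Basal", "Basal", "Basal"]

# Two query cells already placed in the same embedding space
query_embedding = [(0.05, 0.1), (5.05, 5.0)]

predicted = transfer_labels(ref_embedding, ref_labels, query_embedding)
print(predicted)
```

In practice the embedding would come from mapping the new dataset onto the HLCA core, and a real workflow would also report how confident each transferred label is.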
HLCA Full
The full HLCA adds 35 further datasets that include donors with various lung diseases. These datasets were mapped onto the core with scArches and carry disease annotations as well as consensus cell type labels transferred from the HLCA core.
Batch Correction
Note that while the HLCA includes an integrated, batch-corrected low-dimensional embedding, the gene counts themselves were not batch-corrected. Both the HLCA core and the full HLCA can be explored below.
Metadata
Detailed information about all metadata in the objects, as well as further HLCA-related information, can be found on the HLCA landing page: https://github.com/LungCellAtlas/HLCA.
Raw Counts
Raw counts are available in the downloaded .h5ad at adata.raw.X, in the downloaded .rds at seurat_object@assays$RNA@counts, and via CELLxGENE Census.
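As a hedged sketch of reading the raw counts from the .h5ad download in Python: the helper names `load_hlca` and `get_raw_counts` and the file name in the commented usage are hypothetical. Only `adata.raw.X` comes from the description above, and `anndata.read_h5ad` is the standard reader in the `anndata` package. A small stand-in object is used so the example runs without the real file.

```python
from types import SimpleNamespace


def load_hlca(path):
    """Read a downloaded .h5ad file (requires the `anndata` package)."""
    import anndata as ad  # imported lazily so the helper below works without it

    return ad.read_h5ad(path)


def get_raw_counts(adata):
    """Return the raw count matrix stored at adata.raw.X."""
    if adata.raw is None:
        raise ValueError("No .raw attribute found; raw counts are not stored in this object")
    return adata.raw.X


# Stand-in that mimics only the attributes used here (values are made up)
toy = SimpleNamespace(X=[[0.0, 1.1]], raw=SimpleNamespace(X=[[0, 3]]))
counts = get_raw_counts(toy)
print(counts)

# With the real download (hypothetical local file name):
# adata = load_hlca("hlca_core.h5ad")
# counts = get_raw_counts(adata)
```

Checking for a missing `.raw` up front gives a clear error instead of an attribute failure when an object has been saved without its raw layer.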
Atlas Name | Tissue | Disease | Cells |
---|---|---|---|
An integrated cell atlas of the human lung in health and disease (full) | 4 tissues | normal, 15 diseases | 2.3M |
An integrated cell atlas of the human lung in health and disease (core) | 3 tissues | normal | 584.9k |