The Hi-C method, which uses high-throughput sequencing to identify DNA ligation events genome-wide, has allowed the large-scale study of chromatin folding. This has identified several principles of eukaryotic genome organisation, such as A/B compartments and topologically associating domains (TADs). Subsequently, Hi-C was applied to single cells and, given that each cell represents only one underlying genome conformation, it became possible to compute complete genome structures, albeit at low resolution, typically with 100 kb segments. This allows us to know roughly where all the different DNA sequences of an entire genome reside in the 3D volume of the nucleus.
However, in order to avoid ambiguity problems from having two copies of each chromosome, genome structures were initially only computed for stem cells that were maintained in a haploid state. Because haploidy limits the usefulness of single-cell Hi-C, which we would like to employ in any tissue type, including for the investigation of allele-specific phenomena, single-cell Hi-C has now been extended to normal, diploid cells by making use of heterozygous sequence polymorphisms to discriminate between homologous chromosomes.
Here I present an iterative method for computing diploid whole-genome structures that is coupled to resolving ambiguous contacts between homologous chromosomes, and which is applicable to cell types derived from hybrid mouse strains. The method works with as few as 200,000 DNA ligation events per genome and generates a particle-on-a-string representation of chromosomes with segment sizes down to at least 100 kb, at a precision similar to that of haploid genome structures.
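
To make the idea of coupling structure calculation to contact disambiguation more concrete, the following is a minimal sketch of one possible resolution step. It is not the procedure used in this work: the function name `resolve_ambiguous_contact`, the `min_ratio` threshold, the data layout, and the nearest-homologue rule itself are all assumptions made for illustration. Given particle coordinates from a current structure estimate, a contact end with unknown homologue is assigned to whichever copy is clearly closest in 3D, and is otherwise left ambiguous for a later iteration.

```python
import itertools
import numpy as np


def resolve_ambiguous_contact(coords, end_a, end_b, min_ratio=2.0):
    """Illustrative heuristic only (hypothetical, not the author's method).

    coords : dict mapping (chromosome, homologue) -> (n_particles, 3) array,
             where homologue is 0 or 1, taken from a current structure estimate.
    end_a, end_b : tuples (chromosome, particle_index, homologue or None);
             None means the homologue could not be determined from polymorphisms.
    min_ratio : assumed threshold; the second-best assignment must be more than
             this many times further apart than the best one.

    Returns (homologue_a, homologue_b), or None if the contact stays ambiguous.
    """
    chr_a, idx_a, hap_a = end_a
    chr_b, idx_b, hap_b = end_b

    # Enumerate the homologue choices that are still open for each end.
    options_a = (0, 1) if hap_a is None else (hap_a,)
    options_b = (0, 1) if hap_b is None else (hap_b,)

    scored = []
    for ha, hb in itertools.product(options_a, options_b):
        pos_a = coords[(chr_a, ha)][idx_a]
        pos_b = coords[(chr_b, hb)][idx_b]
        dist = float(np.linalg.norm(pos_a - pos_b))
        scored.append((dist, (ha, hb)))

    # Nothing to resolve if both ends were already assigned.
    if len(scored) == 1:
        return scored[0][1]

    scored.sort(key=lambda item: item[0])
    best_dist, best_pair = scored[0]
    second_dist, _ = scored[1]

    # Accept the closest assignment only if it is clearly better.
    if second_dist > min_ratio * best_dist:
        return best_pair
    return None
```

In an iterative scheme of this kind, contacts resolved in one round would be added as restraints for the next structure calculation. Contacts for which both homologue combinations are equally plausible, as happens when neither end carries a polymorphism and the two copies are arranged symmetrically, remain unresolved rather than being guessed.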