Genomic selection: methods in crop and animal breeding

Genomic selection: 6 factors to consider when choosing between targeted GBS and microarrays

Genomic selection through genotyping is more accurate than conventional breeding methods and promises to revolutionise crop and animal breeding. Gel-based technologies such as restriction fragment length polymorphism (RFLP) analysis and Sanger sequencing were used during the development of this field, followed by microarrays and PCR-based genotyping.

Next generation sequencing (NGS) is now powering the development of more targeted genotyping by sequencing (tGBS) methods, including capture-based enrichment followed by analysis using NGS. The question is, which genotyping solution is right for the challenges you face? Let’s compare the main contenders, arrays and tGBS, by looking at six key factors that will affect the efficiency of your breeding program.

Can you implement the flexible and scalable marker strategy you need?

The number of markers you need to screen for genomic selection depends on the species and the stage in your breeding cycle. Single nucleotide polymorphism (SNP) discovery involves 10,000–100,000 markers on perhaps as few as 5 samples, whereas the sweet spot for genomic selection is around 1,000–25,000 assays run on approximately 1,000 samples (see Figure 1). Being able to apply different levels of multiplexing using the same technology adds efficiency and consistency to your breeding program.
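To make the difference in scale concrete, the short sketch below multiplies the marker and sample counts quoted above. The function name `data_points` is made up for this illustration, and the calculation assumes one genotype call per marker per sample.

```python
# Illustrative arithmetic only: the counts are the ranges quoted in the text.
def data_points(markers: int, samples: int) -> int:
    """Genotype calls produced when `markers` are screened on `samples`."""
    return markers * samples

# SNP discovery: many markers on very few samples.
discovery = (data_points(10_000, 5), data_points(100_000, 5))

# Genomic selection: mid-plex panels on many samples.
selection = (data_points(1_000, 1_000), data_points(25_000, 1_000))

print(f"SNP discovery:     {discovery[0]:,} to {discovery[1]:,} genotype calls")
print(f"Genomic selection: {selection[0]:,} to {selection[1]:,} genotype calls")
```

Even with far fewer markers per sample, the genomic selection stage generates many more genotype calls overall, which is why cost per sample matters so much at that stage.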

Figure 1. A typical breeding program involves moving from high coverage of a few samples in SNP discovery to medium multiplex levels for genomic selection.

Certainly arrays of different densities can deliver high and medium capacity SNP analysis, but this technology is very rigid, making it difficult to adapt marker density and composition based on the stage in your breeding program. There are, on the other hand, tGBS methods that can be used to screen up to 100,000 markers per sample but also function efficiently in that mid-plex sweet spot of 500 to 25,000 markers. This gives you the flexibility you need for genomic selection, even when you are working with multiple populations that have different genetic backgrounds.

 

Can you be cost-effective?

The effective application of genomic selection means screening a large number of samples quickly and efficiently, which can reduce breeding cycles by years. This speeds up time to market for new varieties, giving you that competitive edge. Achieving this requires both the right technology and cost efficiency. Array technology is lagging behind in terms of flexibility, and the high setup cost can also be daunting. On the other hand, the data output and efficiency of NGS platforms are continually being improved, dramatically reducing the cost of NGS (Figure 2). Already today we can multiplex thousands of samples for tGBS on a single flow cell of even a medium throughput NGS system.

So basing sample selection on NGS analysis will inevitably drive up throughput while reducing costs. Added to that, highly efficient enrichment methods can reduce day-to-day operation costs even further.
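A rough way to check whether a given level of multiplexing is feasible is to estimate the mean read depth per marker. The sketch below is a back-of-the-envelope calculation, not a description of any particular platform: the flow cell output, sample count, panel size and on-target fraction are all hypothetical numbers chosen for illustration, and it assumes each on-target read covers exactly one marker.

```python
def mean_depth_per_marker(flow_cell_reads: float, samples: int,
                          markers: int, on_target_fraction: float) -> float:
    """Average number of on-target reads per marker per sample."""
    reads_per_sample = flow_cell_reads / samples
    return reads_per_sample * on_target_fraction / markers

# All values below are hypothetical, chosen only to illustrate the arithmetic.
depth = mean_depth_per_marker(
    flow_cell_reads=400_000_000,  # assumed reads from one flow cell
    samples=2_000,                # samples multiplexed together
    markers=5_000,                # markers in the tGBS panel
    on_target_fraction=0.8,       # assumed share of reads hitting targets
)
print(f"Mean depth per marker: {depth:.0f}x")
```

Doubling the number of samples or markers halves the depth, so panel size and multiplexing level can be traded against each other for a fixed sequencing output.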

 

Figure 2. The cost of NGS is falling rapidly. Source: nature.com

Can you stay on target?

 

Using NGS for whole genome sequencing will deliver a relatively low cost per data point, but there are strong arguments for ensuring that analysis is limited to the specific genomic regions relevant to your study. For example, in most crop genomes, the exome corresponds to only 1–2% of the entire genome. Specifically targeting the regions of interest through capture and sequencing significantly reduces the cost of sequencing and data analysis (see reference 1).
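The saving from targeting can be sketched with simple arithmetic. The example below assumes that sequencing effort scales with the number of bases covered at a fixed depth and that capture is perfectly efficient, which real enrichment never quite achieves. The genome size and depth are hypothetical; only the 1–2% exome fraction comes from the text.

```python
def bases_to_sequence(genome_size_bp: float, target_fraction: float,
                      depth: float) -> float:
    """Bases needed to cover the targeted fraction of a genome at `depth`."""
    return genome_size_bp * target_fraction * depth

def fold_reduction(target_fraction: float) -> float:
    """How many times less sequencing the target needs than the whole genome."""
    return 1.0 / target_fraction

GENOME_SIZE_BP = 2_500_000_000  # hypothetical crop genome size
DEPTH = 30                      # hypothetical sequencing depth

for fraction in (0.01, 0.02):   # exome share quoted in the text
    targeted = bases_to_sequence(GENOME_SIZE_BP, fraction, DEPTH)
    print(f"target = {fraction:.0%}: {targeted:.2e} bases, "
          f"about {fold_reduction(fraction):.0f}x less than whole genome")
```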

Can you make the most of imputation?

One way to reduce genotyping cost is imputation, which is the statistical inference of unobserved alleles using known haplotypes drawn from database information on progenitors and sequenced parental lines. Imputation lowers costs in breeding programs because fewer markers need to be screened. Accurate and informative imputation can therefore make breeding strategies much more cost effective, but this can only be achieved with high-quality data from previously screened populations.
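The idea can be illustrated with a deliberately naive sketch. It assumes fully inbred parental lines (one known haplotype each), biallelic markers coded 0/1, and no recombination within the region, so a progeny sample is simply matched to the closest parental haplotype. Real imputation tools use probabilistic haplotype models; the function `impute` and the data here are invented for illustration.

```python
from typing import Dict, List, Optional, Sequence

def impute(sample: Sequence[Optional[int]],
           reference: Dict[str, Sequence[int]]) -> List[int]:
    """Fill unobserved alleles (None) from the best-matching reference haplotype."""
    def mismatches(haplotype: Sequence[int]) -> int:
        # Count disagreements at the markers that were actually genotyped.
        return sum(1 for s, h in zip(sample, haplotype)
                   if s is not None and s != h)

    best_name = min(reference, key=lambda name: mismatches(reference[name]))
    best = reference[best_name]
    # Keep observed alleles; copy the reference allele where data is missing.
    return [h if s is None else s for s, h in zip(sample, best)]

# Hypothetical parental haplotypes, genotyped at all six markers.
reference = {
    "parent_A": [0, 0, 1, 1, 0, 1],
    "parent_B": [1, 1, 0, 0, 1, 0],
}
# Progeny genotyped at only three of the six markers.
sample = [0, None, 1, None, None, 1]

imputed = impute(sample, reference)
print(imputed)
```

Here only half the markers are screened in the progeny, and the rest are inferred from the parental data, which is where the cost saving comes from.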

Imputation can be performed both from arrays and sequencing data. The trick is to select an optimized subset of existing markers. In the case of arrays, these design rounds can be very time consuming and prohibitively expensive. Added to that, it may be impossible to replace these markers since they are fixed on an array that may be the result of collaboration between many groups. In contrast, the lower setup costs and flexibility of tGBS make this approach much more attractive when developing imputation panels. With tGBS, any non-informative markers can be quickly and easily exchanged for others that may be more informative in further rounds of screening and imputation.

Does the technology fit into your breeding cycle?

The setup time for an array based on a new set of markers can be considerable, up to six months. In contrast, tGBS approaches can enable a turnaround time of less than 2 weeks, plus 4–6 weeks for the design of a new oligo library, which means you can fit it into a plant breeding cycle and improve selection of the accessions to be transplanted to the field and progressed. The result can be years of savings in development time.
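Putting the quoted lead times on a common scale makes the comparison easier to see. The sketch below uses only the figures given above, converts "up to six months" to weeks, and treats "less than 2 weeks" as an upper bound of 2 weeks.

```python
WEEKS_PER_MONTH = 52 / 12

# Array: "up to six months" of setup for a new marker set.
array_setup_weeks = 6 * WEEKS_PER_MONTH

# tGBS: 4-6 weeks of oligo library design plus under 2 weeks of turnaround.
tgbs_design_weeks = (4, 6)
tgbs_turnaround_weeks = 2  # upper bound for "less than 2 weeks"

tgbs_total = tuple(d + tgbs_turnaround_weeks for d in tgbs_design_weeks)

print(f"Array setup: up to {array_setup_weeks:.0f} weeks")
print(f"tGBS with a new design: up to {tgbs_total[0]} to {tgbs_total[1]} weeks")
```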

Can you discover de novo variants?

Arrays discriminate targeted SNPs and are, by definition, fixed. Sequencing-based methods such as tGBS, on the other hand, enable the discovery of new SNPs and structural variants in the flanking sequences of targeted SNPs. This increases the amount of genetic information you have at your fingertips, increasing the power of genomic selection. For example, in a study of 500 markers based on sequences previously tested on an array, only 491 SNP sequences were originally selected as common to the tGBS library and the array data, whereas tGBS discovered 5,733 de novo SNPs (2).

How to find the sweet spot with tGBS

As we have seen, exploiting genomic selection will help you produce new varieties faster. But it means finding a sequence-based genotyping solution that can meet your needs in terms of flexibility and cost-efficiency, while enabling you to carry out de novo SNP discovery, imputation, and much more. We will look at one way of achieving this in the last article in this series.

Want to learn more? Download the white paper: SeqSNP tGBS as alternative for array genotyping in routine breeding programs.

About the author: Darshna ‘Dusty’ Vyas

Dusty has been with LGC for the last 6 years working as a plant genetics specialist.

Her career began at the James Hutton Institute, formerly the Scottish Crop Research Institute, developing molecular markers for disease resistance in raspberries. From there Dusty moved on to Biogemma UK Ltd for a period of 13 years, where she worked primarily with cereal crops such as wheat, maize and barley. Through her participation in the Artemisia Project, funded by the Bill and Melinda Gates Foundation, at York University, she gained a vast understanding of the requirements by breeders for varietal development using molecular markers in MAS.

Dusty’s goal is to further breeding programs for global agricultural sustainability using high throughput methods such as SeqSNP.

References

  1. Efficient genome-wide genotyping strategies and data integration in crop plants. Torkamaneh D et al. Theor Appl Genet. Mar;131(3):499–511 (2018)
  2. White paper: SeqSNP tGBS as alternative for array genotyping in routine breeding programs.

 

This blog post was originally published on the LGC, Biosearch Technologies blog.

Our hungry planet: new tools in agrigenomics are key to food security

Food insecurity is a major global threat, and traditional methods of plant and animal breeding will not be sufficient to increase production to the level needed to sustain the growing world population. Modern genomics-driven breeding, through analysis based on technologies such as next generation sequencing (NGS) and arrays, is revolutionizing agriculture and making genomic selection a viable approach throughout the industry. In this three-part blog series, find out how technology is changing global food security and what the newest tools bring to the table.

The power of genomic selection

Perhaps the biggest revolution in agriculture in the last decade is the emergence of agrigenomics to enhance traditional breeding programs. Molecular techniques, such as marker assisted selection and genomic selection, have enabled selection of improved varieties without having to rely on assessing visible characteristics. Genomic selection, in particular, addresses the key factors of the breeder’s equation (2) that increase the rate of genetic gain in plant and animal breeding:

  • Reduced breeding cycles – individuals can be progressed faster when selection is based on genotype rather than phenotype alone
  • Greater selection intensity – selecting individuals based on genotype is cheaper than selecting on phenotype, so more individuals can be evaluated (increasing ‘n’)
  • Improved accuracy – the genomic estimated breeding value (GEBV) enables prediction models to select with greater accuracy based on phenotype and historical pedigree data
  • More efficient integration of new genetic material – through the development of training populations, in which intensive phenotyping and genotyping can be carried out
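For reference, the breeder's equation is commonly written in the form below. The symbols are the conventional textbook ones rather than notation taken from this post, and the mapping of each bullet to a term is offered as a reading aid, not as the author's own formulation.

```latex
\Delta G = \frac{i \, r \, \sigma_A}{L}
% \Delta G  : genetic gain per unit of time
% i         : selection intensity (rises when more individuals can be evaluated)
% r         : accuracy of selection (for example, the accuracy of the GEBV)
% \sigma_A  : additive genetic standard deviation in the population
% L         : length of the breeding cycle (generation interval)
```

Read this way, shorter breeding cycles reduce L, evaluating more individuals raises i, better prediction models raise r, and bringing in new genetic material can help maintain the genetic variation represented by σ_A.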

Genomic selection has been instrumental in dairy cattle breeding, where it has essentially replaced progeny testing, enabling greater and faster improvements in terms of genetic gain (see, for example, reference 3). Genomic selection has, however, had a relatively slow uptake in plant breeding. Reasons include its relative complexity compared to traditional methods, the need for expensive investments, the complexity of plant genomes and the need to analyse big data using bioinformatics. The divergence of plant and animal breeding has also hindered the translation of methods between these two fields, but this problem is being addressed and hopefully both animal and plant breeding of the future will gain from common insights into genomic selection (1).

Technological development powers the agrigenomics breakthrough

Genomic selection has been made more practical by a range of methods, including next generation sequencing (NGS) and microarrays for genotyping and single nucleotide polymorphism (SNP) analysis. Massive developments in NGS technology in particular have realized the potential of genotyping by sequencing (GBS) and promise to revolutionize the drive to develop crop varieties with desirable traits such as drought tolerance, disease resistance, and higher yield.

Despite all these advances, there are still gaps to fill in the toolbox of technologies, and finding the optimal solution for genomic selection can be a demanding process. We will be looking into these issues in the next article in this series.

References

  1. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery. J M Hickey, T Chiurugwi, I Mackay, W Powell & Implementing Genomic Selection in CGIAR Breeding Programs Workshop Participants. Nature Genetics volume 49, pages 1297–1303 (2017)
  2. Animal breeding plans, 2nd ed. J L Lush. The Iowa State College Press (1943)
  3. Genomic selection strategies in a small dairy cattle population evaluated for genetic gain and profit. J R Thomasen et al, J. Dairy Sci. 97:458–470. http://dx.doi.org/10.3168/jds.2013-6599 (2014).

 

This blog originally appeared on the LGC, Biosearch Technologies blog.