---
title: "Introduction"
output: rmarkdown::html_vignette
author: Shoji Taniguchi
vignette: >
  %\VignetteIndexEntry{Introduction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## 0. Introduction to gpyramid package

The R package `gpyramid` is designed for gene pyramiding in plant breeding. Gene pyramiding was formulated by Servin et al. (2004). This document describes how to reproduce the calculations of Servin et al. (2004) in R.

## 1. Set up

```{r setup}
library(gpyramid)
library(ape)
library(dplyr)
```

## 2. Prepare data

### 2.1 Gene data

```{r}
# One row per candidate line; "A" is the target allele, "B" the non-target allele
line_df <- data.frame(line = c("x1", "x2", "x3", "x4"),
                      gene1 = c("A", "B", "B", "B"),
                      gene2 = c("B", "A", "B", "B"),
                      gene3 = c("B", "B", "A", "B"),
                      gene4 = c("B", "B", "B", "A"))
line_df
```

### 2.2 Position data

```{r}
# Map position of each gene: chromosome and position in centimorgans (cM)
position_df <- data.frame(Gene = c("g1", "g2", "g3", "g4"),
                          Chr = c("1A", "1A", "1A", "1A"),
                          cM = c(0, 20, 40, 60))
position_df
```

### 2.3 Preprocessing

#### Generate haplotype dataframe from raw data

```{r}
gene_dat <- util_haplo(line_df, target = "A", non_target = "B",
                       hetero = "H", line_cul = "line")

# The returned list holds the two haplotype tables and the line IDs
gene_df1 <- gene_dat[[1]]
gene_df2 <- gene_dat[[2]]
line_id <- gene_dat[[3]]

# Label the columns with the line IDs
colnames(gene_df1) <- line_id
colnames(gene_df2) <- line_id

gene_df1
gene_df2
```

#### Generate recombination probability matrix from raw data

```{r}
# Positions are given in cM
recom_mat <- util_recom_mat(position_df, "cM")
recom_mat
```

## 3. Find parent sets from candidate lines (cultivars)

From the candidate lines, the `findPset` function returns the parent sets for gene pyramiding. In this example, only one parent set is returned.

```{r}
line_comb_lis <- findPset(gene_df1, gene_df2, line_id)
line_comb_lis
```

## 4. Calculate the number of necessary individuals and generations

The `calcCostAll` function calculates the number of necessary individuals and generations, that is, the crossing cost, for all the crossing schemes.
Given the parent sets for gene pyramiding, the `calcCostAll` function simulates all the crossing schemes and calculates the number of necessary individuals and generations as the cost of gene pyramiding. It returns a `gpyramid_all` object, which contains information on all the crossing schemes. The `getFromAll` function then extracts one crossing scheme from the `gpyramid_all` object.

```{r}
rslt <- calcCostAll(line_comb_lis, gene_df1, gene_df2, recom_mat,
                    prob_total = 0.99,
                    last_cross = TRUE,
                    last_selfing = TRUE)

# Cost of every crossing scheme
rslt$cost_all
```

### 4.1 Fig 4a (Servin et al., 2004)

Fig 4a in Servin et al. (2004) corresponds to `cross_id = 15` in the above `gpyramid_all` object.

```{r fig.width=4, fig.height=4}
rslt_one <- getFromAll(rslt, cross_id = 15)
summary(rslt_one)

# Plot the crossing scheme as a tree and label its nodes
plot(rslt_one$topolo)
nodelabels()
```

### 4.2 Fig 4b (Servin et al., 2004)

Fig 4b in Servin et al. (2004) corresponds to `cross_id = 6` in the above `gpyramid_all` object.

```{r fig.width=4, fig.height=4}
rslt_one <- getFromAll(rslt, cross_id = 6)
summary(rslt_one)

# Plot the crossing scheme as a tree and label its nodes
plot(rslt_one$topolo)
nodelabels()
```

### 4.3 Fig 4c (Servin et al., 2004)

Fig 4c in Servin et al. (2004) corresponds to `cross_id = 13` in the above `gpyramid_all` object.

```{r fig.width=4, fig.height=4}
rslt_one <- getFromAll(rslt, cross_id = 13)
summary(rslt_one)

# Plot the crossing scheme as a tree and label its nodes
plot(rslt_one$topolo)
nodelabels()
```
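
As background for the recombination probability matrix built in Section 2.3, the chunk below sketches one common way to turn map distances in cM into pairwise recombination fractions, namely Haldane's map function. This is an assumption made for illustration: this vignette does not state which map function `util_recom_mat` applies, so the values may differ from `recom_mat`. The names `haldane_rf`, `pos`, `dist_mat`, and `rf_mat` are hypothetical and not part of `gpyramid`.

```{r}
# Haldane's map function (assumed here, not confirmed for gpyramid):
# r = 0.5 * (1 - exp(-2 * d)), with the distance d in Morgans (cM / 100)
haldane_rf <- function(d_cM) {
  0.5 * (1 - exp(-2 * d_cM / 100))
}

# Same gene positions as in Section 2.2
pos <- c(g1 = 0, g2 = 20, g3 = 40, g4 = 60)

# Pairwise absolute distances between genes, in cM
dist_mat <- abs(outer(pos, pos, "-"))

# Pairwise recombination fractions under the Haldane assumption
rf_mat <- haldane_rf(dist_mat)
round(rf_mat, 3)
```

Under this assumption, two genes 20 cM apart have a recombination fraction of about 0.165, and the fraction approaches 0.5 as the distance grows.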