Nucleic Acids Research, 2004, Issue 12
Demarcating the gene-rich regions of the wheat genome
     Washington State University, Pullman, WA 99164, USA and 1 Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583-0915, USA

    * To whom correspondence should be addressed at Department of Crop and Soil Sciences, 277 Johnson Hall, PO Box 646420, Washington State University, Pullman, WA 99164, USA. Tel: +509 335 4666; Fax: +509 335 8674; Email: ksgill@wsu.edu

    Present addresses: Devinder Sandhu, G302 Agronomy Hall, Department of Agronomy, Iowa State University, Ames, IA-50011-1010, USA

    Mustafa Erayman, Agricultural College, Department of Crop Sciences, Mustafa Kemal University, 31034 Hatay, Turkey

    Muharrem Dilbirligi, Central Research Institute for Field Crops, Eskisehir yolu, 10 km, Lodumulu/Ankara, Pk: 226, 0642 Ulus/Ankara, Turkey

    ABSTRACT

    By physically mapping 3025 loci, including 252 phenotypically characterized genes and 17 quantitative trait loci (QTLs), relative to 334 deletion breakpoints, we localized the gene-containing fraction to 29% of the wheat genome, present as 18 major and 30 minor gene-rich regions (GRRs). The GRRs varied in both gene number and density. The five largest GRRs, physically spanning <3% of the genome, contained 26% of the wheat genes. The approximate size of the GRRs ranged from 3 to 71 Mb. Recombination occurred mainly in the GRRs. The GRRs differed as much as 128-fold in gene density and 140-fold in recombination rate. Except for a general suppression in 25–40% of the chromosomal region around centromeres, no correlation of recombination was observed with the gene density, size or chromosomal location of GRRs. More than 30% of the wheat genes are in recombination-poor regions and are thus inaccessible to map-based cloning.

    INTRODUCTION

    Bread wheat (Triticum aestivum L.) is an allohexaploid (2n = 6x = 42, AABBDD) containing three homoeologous genomes (1). The wheat genome is about 35 times larger than that of rice (Oryza sativa L.), although both belong to the Poaceae family (2). Estimates for the gene-containing fraction of the wheat genome range from 1–5%, obtained from the available sequence data analyses and genome size comparisons with other plant genomes, to 15% by DNA re-association kinetics experiments (3–5). It is, therefore, imperative to identify and demarcate the gene-containing regions for an efficient and targeted characterization of this important genome. In addition, genetic maps and DNA sequence comparisons along with other phylogenetic studies have suggested that various grass genomes originated from a common ancestor (6,7). This is also evident from the fact that the majority of wheat, barley (Hordeum vulgare L.) and rice protein sequences are 98% similar (8). It is, however, unknown how the monophyletic origin of the grasses resulted in as much as a 35-fold difference in genome size.

    Genes are unevenly distributed on wheat as well as other plant chromosomes (5,9–12). In Arabidopsis thaliana, 45% of the genome accounts for all 25 000 genes (5,13). The remaining 55% is ‘gene-empty’ and is interspersed among genes as blocks ranging in size from a few hundred basepairs to 50 kb. Gene-rich and gene-poor regions were also observed in pea, tomato and palm (14). Uneven distribution of genes on chromosomes seems to be a common feature of other higher eukaryotes also (15–17). In larger genomes such as human, unevenness of gene distribution is more pronounced (18). The 30 000 genes with an average size of 27 kb account for 25% of the human genome. The remaining 75% is composed of retrotransposon-like repetitive sequences interspersed among genes as a result of multiple invasions by different retrotransposons at different times during evolution followed by their inactivation by transposition and/or heterochromatinization (18–20).

    By physically mapping gene markers on an array of chromosome deletion lines, it has been shown that most wheat genes are present in clusters that occur more frequently in distal parts of the chromosomes (9,21,22). The exact location, relative size, gene density and structural organization of the gene-containing regions are, however, not known. Comparisons of genetic distances among C-bands revealed that the distribution of recombination is also uneven along the wheat chromosomes (23,24). Comparisons of wheat physical and genetic linkage maps confirmed the uneven distribution of both genes and recombination, and established a general correlation between the two (9,10,21,25–29). Precise relationships among rate of recombination, distribution of genes, gene density and location on the chromosome have not been established in wheat. Regions around the eukaryotic centromeres and in some cases telomeres have been reported to suppress recombination rates (13,30,31). Furthermore, as shown in barley, maize (Zea mays L.), Arabidopsis and other plants, the distribution of recombination can be highly uneven even within a few kilobases of DNA (32–35). The relationship of similar putative recombination hotspots in wheat to the currently identified highly recombinogenic chromosomal regions has not been established.

    About 461 phenotypic markers and useful genes have been identified in wheat (36) (http://wheat.pw.usda.gov). Although the linkage relationship of about 377 has been established relative to molecular markers, the physical location of only seven of these genes is known (37). Because of uneven distribution of recombination, the physical location of the genes along with a precise kb/cM estimate for the region, are essential for map-based cloning. This is particularly important for wheat where the difference in the recombination among regions is expected to be greater than other crop plants.

    The objectives of this study were to generate a comprehensive map of the wheat genome; identify and precisely localize the gene-containing regions; reveal gene density, distribution of recombination and kb/cM estimate for each of the gene-containing regions; and physically map phenotypically characterized genes.

    MATERIALS AND METHODS

    Plant material

    Various aneuploid stocks along with the wild-type wheat cultivar Chinese Spring (CS) were used for physical mapping. The 21 nullisomic–tetrasomic lines (NT, missing a pair of chromosomes, the deficiency of which is compensated for by an extra pair of homoeologous chromosomes) were used for inter-chromosomal mapping and 14 ditelosomic lines (DT, missing a pair of chromosome arms) were used to reveal arm location of the gene markers. Sub-arm localization of markers was accomplished using 334 deletion lines covering all 21 wheat chromosomes. A list of the deletion lines along with the fraction length (FL) of the retained arm is provided as supplementary information in Supplementary Table 1. These deletion lines were generated using gametocidal genes from Aegilops speltoides (38–40). Giemsa C-banding characterization and low-density restriction fragment length polymorphism (RFLP) mapping suggested that most of these deletion lines resulted from single breaks followed by the loss of the acentric fragments (26,27,40). High-density mapping revealed a few complex and secondary deletions/rearrangements in some of the deletion lines (10). Additional information about the deletion lines is available at http://www.ksu.edu/wgrc/Germplasm/Deletions/.

    Table 1. Physical locations of genes/markers, deletion lines, recombination in GRRs in wheat genome

    DNA analysis and marker selection

    Plant genomic DNA was extracted following a previously described method (41). Gel blot analysis was performed using 15 µg of genomic DNA digested with either EcoRI or HindIII restriction enzymes and size separated on 0.8% agarose gels. All other steps for the gel blot DNA analysis were as previously described (26).

    The published and the Poaceae maps from the ‘graingenes’ database (http://wheat.pw.usda.gov/ggpages/maps.html) were used to select gene markers for physical mapping. The DNA markers were from wheat (FBA, FBB, NOR, PSR, TAG, TAM, UNL, WG), Ae. tauschii (KSU), barley (ABC, ABG, BCD, MWG and cMWG), oat (CDO) and rice (RZ) (10,42,43). Priority was given to cDNA clones, although some PstI genomic clones were also used.

    Deletion mapping

    Chromosomal and arm locations of each fragment band were revealed by the NT and DT mapping. A fragment band was mapped to a chromosomal region bracketed by the breakpoints of the largest deletion possessing the fragment band and the smallest deletion lacking it. Multiple fragments were considered non-allelic (shown by a letter at the end of the probe name) if the corresponding bands mapped to different regions or if more than two DNA fragments larger than the probe size were detected.
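
    The bracketing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every deletion is treated as a simple terminal deletion described by the FL of its retained arm, and the function name, the input format and the example scores are hypothetical, not taken from the study.

```python
def bracket_marker(scores):
    """Return the (proximal, distal) FL interval that brackets a marker.

    `scores` maps the fraction length (FL) of the retained arm in each
    deletion line to True (fragment band present) or False (band absent).
    Assumes simple terminal deletions; the data format is hypothetical.
    """
    present = [fl for fl, has_band in scores.items() if has_band]
    absent = [fl for fl, has_band in scores.items() if not has_band]
    # Largest deletion that still carries the band = smallest FL among "present".
    # Falls back to 1.0 (arm end) when every line lacks the band.
    distal = min(present, default=1.0)
    # Smallest deletion that lacks the band = largest FL among "absent".
    # Falls back to 0.0 (centromere) when every line retains the band.
    proximal = max(absent, default=0.0)
    if proximal >= distal:
        raise ValueError("scores are inconsistent with simple terminal deletions")
    return proximal, distal


# Hypothetical scores for one probe on one chromosome arm.
example = {0.85: True, 0.70: True, 0.52: False, 0.30: False}
print(bracket_marker(example))
```

    With these made-up scores the band is placed between the breakpoints at FL 0.52 and FL 0.70. Complex or secondary deletions, which the text notes exist in a few lines, would trigger the consistency check rather than produce an interval.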

    Demarcating the gene-rich regions

    The deletion mapping information for the three homoeologs of each group was combined to generate a high-resolution consensus physical map. All deletion breakpoints for a group were aligned on a hypothetical chromosome based on FL values and each marker was placed in the shortest possible interval based on its location on the three homoeologs. Due to the occasional differences in size and distribution of gene-poor regions among homoeologs, relative locations of markers were also considered along with the FL values to place deletion breakpoints on the consensus map. To determine physical location of a GRR, the FL value consistent with at least two of the three homoeologs, was used. Deletion mapping data from the three homoeologs along with FL location were used to localize deletion breakpoints within a GRR.

    Recombination analysis

    A comprehensive genetic linkage map for each of the seven wheat homoeologous groups was constructed using 137 genetic linkage maps including 17 each for group 1 and 2, 32 for 3, 25 for 4, 16 for 5, and 15 each for group 6 and 7 chromosomes (http://wheat.pw.usda.gov/ggpages/maps.shtml). First, the markers/genes that were common among several wheat genetic linkage maps for a homoeologous group were designated as anchors. Total genetic length of the consensus map and distances among anchors were estimated by taking average of recombination among anchor markers from various maps. Second, markers present between two anchor markers on multiple maps were integrated on the consensus map. Relative genetic distances of the markers present in the segment between two anchors were averaged over different genetic linkage maps and calibrated according to the genetic distances between the two anchor markers. Third, markers that were present on a few linkage maps were incorporated relative to anchor markers. Finally, the markers and genes present only on one or two maps were placed on the consensus genetic maps via linked markers. Genetic distances on the consensus linkage maps are relative rather than absolute. In cases where map distances between markers were not comparable, the distances were extrapolated based on an average over majority of the maps.
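
    One way to read the averaging and calibration steps is sketched below. It is an interpretation, not the authors' procedure: the marker's relative position between two anchors is averaged over maps and then scaled to the averaged anchor-to-anchor distance. The function name, the tuple layout and the example distances are hypothetical.

```python
from statistics import mean


def place_between_anchors(observations):
    """Place one marker in a consensus interval bounded by two anchors.

    `observations` holds one (left_anchor_cM, marker_cM, right_anchor_cM)
    tuple per linkage map on which all three loci were scored.
    Returns (consensus_interval_cM, marker_offset_cM_from_left_anchor).
    """
    # Consensus anchor-to-anchor distance: average over the contributing maps.
    interval = mean(right - left for left, _, right in observations)
    # Relative position of the marker within the interval on each map.
    fraction = mean((marker - left) / (right - left)
                    for left, marker, right in observations)
    # Calibrate the averaged relative position to the consensus interval.
    return interval, fraction * interval


# Two hypothetical maps that share both anchors and the marker.
maps = [(0.0, 2.0, 10.0), (5.0, 11.0, 25.0)]
print(place_between_anchors(maps))
```

    Because positions are rescaled to the consensus interval, the resulting distances are relative rather than absolute, in line with the caveat stated above.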

    RESULTS

    In order to identify and demarcate the gene-containing regions, the precise physical location of 3025 gene marker loci, including 252 phenotypically characterized genes and 17 quantitative trait loci (QTLs), was revealed. First, 942 cDNA or PstI genomic clones were physically mapped using 334 deletion lines. Each gene marker locus was localized to the smallest possible chromosomal interval by combining mapping information from the three wheat homoeologs on a consensus physical map. This analysis localized the gene-containing regions and identified the flanking deletion breakpoints. Second, a consensus genetic linkage map was constructed by combining information from 137 wheat maps and was compared with the consensus physical map via 428 common markers (underlined bold face markers, Table 1). This comparison revealed the physical location of phenotypically characterized genes, confirmed gene distribution and provided estimates of recombination at a sub-chromosomal level.

    Physical mapping

    The physical mapping results are provided in supplementary figures (A–G) and are summarized in Table 1. With an average of 135, the number of physically mapped probes ranged from 43 for group 4 to 274 for group 5 chromosomes. The average number of deletion lines used per homoeologous group was 48 with a range of 40 to 58 (Table 2). The 942 probes detected 2036 loci of which 617 were for the A, 740 for the B and 679 were for the D genome chromosomes (Table 3). With an average of 2.2, the number of loci per probe ranged from 1.9 for group 6 to 2.4 for groups 1 and 3. Using one restriction enzyme, 45% of the probes detected loci on all three homoeologs, 25% on two and 29% of the probes detected loci on only one of the three homoeologs (Table 3). Among the probes detecting loci on two homoeologs, the number for B and D genome chromosomes was more than double (124 probes) compared to either of the other two combinations (A and D, or A and B). Similarly, among the probes detecting loci on only one of the three homoeologs, the number for the B genome was more than double compared to either of the other two. The total number of loci detected for the B genome was 37% compared with 33% for the D and 30% for the A genome. The number of probes mapping on all three homoeologs ranged from 29% for group 6 to 60% for group 1. Similarly, the number of probes mapping only on one of the three homoeologs ranged from 19% for group 3 to 41% for group 6.

    Table 2. Summary information about the identified GRRs

    Table 3. Summary of the physically mapped wheat loci on homoeologous groups

    Except for a few discrepancies, relative order, FL location and overall distribution of gene markers were similar among homoeologs. It was, therefore possible to generate consensus physical maps by combining mapping information from the homoeologs. Deletion breakpoints of homoeologs were placed on the consensus map and each marker was placed to the shortest possible interval (see Methods).

    Gene distribution on the wheat chromosomes is obvious from Figure 1. Regions of very high marker density were observed on all consensus physical maps and were called gene-rich regions (GRRs). The location and size of each GRR was drawn to scale and the bracketing deletion breakpoints are shown on the ‘left’. Detailed information regarding the bracketing deletions, size, markers and other deletion lines for each GRR is given in Table 1 and is summarized in Table 2. With an average of seven and a range of five to eight per chromosome, 48 GRRs were identified for the wheat genome that contained 94% of the markers (Table 2). The remaining 6% of the markers were present in other regions including centromere of chromosome 1B that contained markers Xbcd1072 and Xpsr161.

    Figure 1. (Group 1 to 7). Distribution of genes and recombination on wheat chromosomes. Sizes of consensus chromosomes, and locations and sizes of GRRs, are drawn to scale based on the average size of the three homoeologous chromosomes for each homoeologous group (54). Names of the GRRs are given on the left side of a consensus chromosome. In the nomenclature of GRRs (e.g. ‘1S0.8’), the first digit represents the wheat homoeologous group, followed by the arm location as either short arm (S) or long arm (L). The final number represents the GRR location as fraction length (FL) of the chromosome (e.g. 0.8 for ‘1S0.8’). The flanking deletion lines for each GRR are shown in blue on the left side of a consensus physical chromosome. Actual physical size (black) and the ratio of physical to genetic distance (red) for a region are given on the right-hand side of a consensus chromosome. Sizes (in Mb) of GRRs are calculated based on the cytological measurements (measurements of the region bracketed by the flanking deletion line breakpoints in comparison to the total chromosome size), and are drawn to scale. Recombination in a chromosomal region is calculated by comparing the deletion-line-based physical map with the consensus genetic linkage map for each chromosome. Flanking markers for each GRR are joined with dotted lines connecting physical and consensus genetic linkage maps. On the consensus genetic linkage maps, the genes for which the exact location in reference to other markers was known are shown in red, and the genes/QTLs for which the location was imprecise are shown in green. Symbols: *, fraction length for satellite; (symbol not reproduced), region not precisely marked by deletion lines; (symbol not reproduced), markers that were not flanked by the flanking deletion lines for the GRR but are probably part of that region.
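
    The GRR naming scheme described in the legend is regular enough to parse mechanically. The following is a minimal sketch; the function name and the returned tuple layout are illustrative assumptions, and the pattern only covers names of the form shown in this article (one group digit, S or L, one-decimal FL).

```python
import re

# Group digit (1-7), arm letter (S or L), fraction length with one decimal.
GRR_NAME = re.compile(r"^([1-7])([SL])(\d\.\d)$")


def parse_grr(name):
    """Split a GRR name such as '1S0.8' into (group, arm, fraction_length)."""
    match = GRR_NAME.match(name)
    if match is None:
        raise ValueError(f"not a GRR name: {name!r}")
    group, arm, fl = match.groups()
    return int(group), "short" if arm == "S" else "long", float(fl)


for grr in ("1S0.8", "2L1.0", "7L0.1"):
    print(grr, parse_grr(grr))
```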

    Comparative analysis

    With the objectives to confirm the gene distribution, physically map additional markers including phenotypically characterized genes, and to reveal precise distribution of recombination on the wheat chromosomes, consensus genetic linkage maps were constructed for each of the homoeologous groups and compared with the consensus physical maps. A total of 1417 markers including 252 phenotypically characterized genes and 17 QTLs were placed on the consensus genetic linkage maps (Figure 1). With an average of 202 per chromosome, the number of markers ranged from 151 for group 4 to 235 for group 2. The average number of phenotypically characterized genes/QTLs was 38 per chromosome group with a range of 28 for group 5 to 43 for group 1.

    The 428 common markers were used to compare the consensus physical and genetic linkage maps. A sector corresponding to each GRR was identified on the genetic linkage map. Additional markers and useful genes present on the consensus linkage map were then localized to the corresponding GRR and the results are summarized in Table 1. About 94% (1336/1417) of the consensus genetic linkage map markers, including 241 of the 252 phenotypically characterized genes and all 17 QTLs, mapped in the GRRs (Figure 1, Table 1). The distribution of gene markers observed from the genetic linkage maps was similar to that from the physical maps. Between the two methods, 3025 marker loci were physically localized on wheat chromosomes, corresponding to 1931 unique gene loci (Figure 1, Table 1). Of these, 94% mapped within the 48 GRRs. The remaining 6% were present in small groups of one to four markers, separated by more than one deletion breakpoint.

    Gene distribution

    The number and location of GRRs differed among wheat chromosomes (Figure 1 and Table 1). Twenty-one GRRs were present on the short arms and contained 35% of the wheat genes. The remaining 27 GRRs were present on the long arms and contained about 59% of the genes (Table 1). The number of GRRs per chromosome varied from five (groups 4 and 6) to eight (groups 1, 5 and 7). Essentially no GRR was observed in the centromeric regions of the chromosomes. Only one small GRR (‘7L0.1’), containing 4% of the arm's genes, was observed in the proximal 20% of any chromosome. In general, GRRs present in the distal regions were relatively higher in gene density. More than 80% of the total marker loci mapped in the distal half of the chromosomes and 58% mapped in the distal 20% (Figure 1 and Table 1).

    Significant differences were observed among GRRs for marker number, marker density and the size. The 48 GRRs encompassed about 1554 Mb that is equivalent to 29% of the wheat genome (Table 1). Gene density and number in 18 GRRs was very high (major GRRs) (underlined in Table 1 and Figure 1). These major GRRs contained nearly 60% of the wheat genes but covered only 11% of the genome. The gene number and density was variable even among these major GRRs. Five of these (‘1S0.8’, ‘2L1.0’, ‘4S0.7’, ‘6S1.0’ and ‘6L0.9’) contained 26% of the wheat genes but spanned only 3% of the genome. On the contrary, four minor GRRs (‘1S0.4’, ‘2L0.3’, ‘5S0.4’ and ‘7L0.1’) contained only 2% of the wheat genes but covered >2% of the genome. Estimated sizes of GRRs varied from 3 Mb for ‘1S0.4’ to 71 Mb for ‘3L0.9’, with an average of 32 Mb. The total region spanned by GRRs ranged from 119 Mb for group 1 to 288 Mb for group 5. Number of genes in a GRR varied from 4% of the arm in ‘7L0.1’ to 82% in ‘6S1.0’. Gene density varied as much as 70-fold among GRRs with a range from a gene per 78 kb in ‘1S0.8’ to 5500 kb in ‘7L0.1’. Gene distribution even within GRRs appeared to be uneven. The ‘1L0.9’ region is present between FLs 0.84 and 0.9. Of the total 22 markers present in the region, 16 mapped between FL 0.84 and 0.85, one between FL 0.85 and 0.89, and five between FL 0.89 and 0.9.

    Physical mapping of phenotypic markers

    The 269 phenotypic markers mapped in this study included 80 monogenically inherited wheat disease resistance genes and 17 QTLs (Figure 1 and Table 1). Some of the other useful genes were Eps (earliness per se), Hd (awnedness), Kr (crossability), Ph1, plant height, male sterility and ear morphology genes (36). Of the 269 phenotypic markers and useful genes, 159 were reliably located on the consensus genetic maps (colored red, Figure 1). The remaining 93 genes were placed on the consensus genetic maps because of their close association (<5 cM) with one or more markers. The 17 QTLs were located to 10 cM intervals on the consensus linkage map as their precise location within the intervals was not known. These 93 genes and 17 QTLs are colored green in Figure 1. With an average of 36, the number of phenotypic markers varied from 29 for group 4 to 44 for group 1. Comparison of the genetic with the physical consensus maps localized 258 of the 269 phenotypic markers and useful genes to 37 GRRs. The highest number (32) of phenotypic markers was observed for the ‘1S0.8’ region. The 11 GRRs that lacked phenotypic markers were present in the proximal regions of the chromosomes (Figure 1 and Table 1). The genes mapping in the non-GRR regions are Per1, Hk1, Lr35, Xksu905 (Wip), Lr25, Lr9, Pm7, Lr30, Xwsu4 (Dor4), tav1933 (Vdoc) and v1.

    Distribution of recombination

    Comparison of the consensus physical and genetic linkage maps via 428 common markers provided a detailed and accurate estimate of recombination at a sub-regional level (Figure 1). In general, a severe suppression of recombination was observed in the centromeric regions of the wheat chromosomes (Figure 1). One-fourth of the wheat genome present around the centromeres accounted for <1% of the total recombination. Nearly perfect linkage was observed among markers present on different arms of a chromosome. For example, markers Xcdo618 (‘1S0.4’) and Xcdo98 (‘1L0.2’) were physically separated by one-fourth of the chromosomal length but were <2 cM apart. Due to the lack of recombination in the proximal regions, a large number of markers clustered in the centromeric regions of the linkage maps (Figure 1).

    The gene-poor regions accounted for only 5% of the recombination as 95% of the recombination was observed in the GRRs (Figure 1 and Table 1). Markers present in two different GRRs separated by a gene-poor region were usually tightly linked on the genetic linkage map. For example, no recombination was observed between markers Xwg789 (‘1S0.6’) and Xcdo618 (‘1S0.4’). Similarly, markers XksuD22 (‘2L0.8’) and Xcdo373 (‘2L1.0’) were inseparable on the genetic linkage map even though these were physically separated by 20% of the arm's length (Figure 1). Maximum recombination observed in a gene-poor region was for the region between ‘4S0.9’ and ‘4S0.7’ that accounted for 12% of the arm's recombination. Recombination occurring in the GRRs varied from 90% for group 4 to 98.5% for groups 6 and 7. The chromosome 3L GRRs accounted for almost 100% of the arm's recombination. Among GRRs, maximum genetic length of 143 cM was observed for ‘3L0.9’ and minimum of 1 cM was for ‘1S0.4’ region (Figure 1 and Table 1). Recombination rate for two GRRs (‘7L0.1’ and ‘7L1.0’) present on the same chromosome arm was 53-fold different. Only 1–3% of the arm's recombination was observed for the GRRs ‘1L0.2’ and ‘7L0.1’ (Figure 1).

    Recombination in the distal chromosomal regions was much higher as compared to the proximal halves. The recombination rates among GRRs present in the distal half of the chromosomes were, however, highly variable. Recombination rates in some proximal GRRs were higher than in distal GRRs. For example, ‘1L0.9’ region is proximal to ‘1L1.0’ but is about 5-fold higher in recombination (Figure 1). There are six other examples where recombination in the proximal GRRs is 38–325% higher compared to the distal GRRs (Figure 1).

    In general, the recombination rate in the smaller GRRs seemed higher than in the GRRs that encompassed larger chromosomal regions. The 10 Mb region ‘6S1.0’ had the highest percentage recombination per chromosome arm (Table 1). Similarly, the ‘1S0.8’ region showed 86% of the arm's recombination and is only 7 Mb in size. On the contrary, a larger GRR, ‘4L0.5’, is 70 Mb but accounted for only 15% of the arm's recombination (Figure 1). This negative correlation between GRR size and recombination rate did not hold for some regions, such as ‘3L0.9’.

    Comparisons of consensus physical and genetic linkage maps showed that the ratio of physical to genetic distance (kb/cM) varied as much as 140-fold among GRRs. The kb/cM estimate ranged from 151 for ‘1S0.8’ to 21687 for ‘7S0.2’ with an average of 3381 (Figure 1 and Table 1). In most cases recombination rate was higher for GRRs with a higher gene density although there were some exceptions. For example, gene density was relatively high in the GRR ‘1S0.4’ (1 gene per 273 kb) but only 3% of the arm's recombination was observed in this region (Table 1). Similarly, ‘5S0.9’ region is relatively low in gene density (1 gene per 2688 kb), but 60% of the arm's recombination occurred in this region.
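
    The kb/cM estimates quoted above can be converted to recombination frequencies (cM/kb) by taking the reciprocal, which also reproduces the fold difference between the extreme GRRs. The short sketch below uses only the three figures given in the text; the function and variable names are illustrative.

```python
def cm_per_kb(kb_per_cm):
    """Convert a physical-to-genetic ratio (kb/cM) to a recombination frequency (cM/kb)."""
    return 1.0 / kb_per_cm


# kb/cM estimates quoted in the text.
LOWEST_KB_PER_CM = 151      # '1S0.8', the most recombinogenic GRR
HIGHEST_KB_PER_CM = 21687   # '7S0.2', the least recombinogenic GRR
AVERAGE_KB_PER_CM = 3381    # average over GRRs

fold_difference = HIGHEST_KB_PER_CM / LOWEST_KB_PER_CM

print(f"fold difference: {fold_difference:.0f}")
print(f"highest rate:    {cm_per_kb(LOWEST_KB_PER_CM):.4f} cM/kb")
print(f"average rate:    {cm_per_kb(AVERAGE_KB_PER_CM):.4f} cM/kb")
```

    The ratio of the two extremes is about 144, in line with the roughly 140-fold variation reported, and the reciprocals give about 0.0066 cM/kb for ‘1S0.8’ and about 0.0003 cM/kb for the average.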

    DISCUSSION

    Gene-containing regions of wheat

    The number of genes in most of the higher plants is expected to be similar although variations due to ploidy changes are expected. The in silico approach based gene number estimate for Arabidopsis is 25 000 and for rice ranges from 32 000 to 50 000 (8,13). Accounting for polyploidy, the corresponding number for wheat may be between 75 000 and 150 000. Using the average gene size of 2.5 kb that is found in rice, only 1.2–2.4% (188–375 Mb) of the wheat genome is therefore expected to contain genes. The predicted number of genes in the sequenced eukaryotes dropped 2- to 3-fold with the availability of better computational and biological tools. The human genome, once thought to have more than 100 000 genes, contains 30 000 genes based on new stringent criteria and improved gene-prediction programs. Similarly, the gene-containing fraction of the wheat genome may even be smaller than the above predictions. Therefore, precise localization and demarcation of the gene-containing regions is particularly important for wheat.
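
    The arithmetic behind the 188–375 Mb estimate can be checked with a short sketch. The gene counts and the 2.5 kb average gene size come from the paragraph above; the total genome size of 16 000 Mb is an assumption introduced here for illustration, since the text does not state it, and the names are hypothetical.

```python
RICE_GENE_KB = 2.5          # average gene size quoted for rice
ASSUMED_GENOME_MB = 16_000  # assumption: hexaploid wheat genome size, not given in the text


def gene_space_mb(n_genes, gene_kb=RICE_GENE_KB):
    """Total length (Mb) occupied by n_genes genes of average size gene_kb."""
    return n_genes * gene_kb / 1000.0


for n_genes in (75_000, 150_000):
    space = gene_space_mb(n_genes)
    share = 100.0 * space / ASSUMED_GENOME_MB
    print(f"{n_genes} genes -> {space:.1f} Mb ({share:.1f}% of the assumed genome)")
```

    Under that assumed genome size the two bounds come out at 187.5 Mb (about 1.2%) and 375 Mb (about 2.3%), close to the 1.2–2.4% range quoted above.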

    Use of deletion lines to reveal the physical location of gene markers and the approach to combine the mapping data of the three homoeologs to generate a consensus map, have been shown previously (9,10,26). Relative gene marker location was conserved among the three homoeologs even at the current density (Supplementary Figures A–G). It was therefore possible to construct consensus maps for the seven homoeologous groups. The consensus maps had an average of 48 deletion breakpoints per chromosome group, with a range from 40 for group 3 to 58 for group 7. Based on the average chromosome size of the homoeologs, deletion breakpoints occurred every 13.3 Mb for group 7 to 20.4 Mb for group 3. The breakpoints were two times more frequent around the GRRs than randomly expected; once every 8 Mb. Of the 334 deletions, 210 (63%) occurred in the GRRs that encompassed 29% of the wheat genome (Table 1).

    The distribution of genes on wheat chromosomes is striking. Wheat chromosomes have regions of high gene density interspersed by vast expanses of unpopulated regions consisting of repeated DNA. Based on the sample of 3025 gene loci, 29% of the wheat genome contained 94% of the genes (Figure 1). Of the 48 GRRs that were identified, 18 were major spanning 11% of the genome but accounted for 60% of the genes. All major GRRs were present in the distal 35% of the chromosomes. The long arms of wheat contained twice as many genes as the short arms. Of the total 94% genes present in GRRs, 59% were present in the 27 GRRs on the long arm and remaining in the 21 GRRs on the short arms. No significant correlation was observed between the chromosome size and the gene fraction or the size of the GRRs. For example, group 3 has the largest chromosomes among the wheat groups but contained only 13% of the wheat genes compared with group 5 chromosomes that contained 20%. Roughly, the gene fraction seemed to correlate with the size of the GRRs. The size of chromosomes encompassing the GRRs is the largest for group 5 and the smallest for group 6. Correspondingly, the gene fraction was the highest for group 5 and the lowest for group 6.

    The actual size of the GRRs may be smaller than that shown in Figure 1 because the accuracy of bracketing the GRRs is dependent upon the number of deletion breakpoints. If randomly occurring, breakpoints of the 334 deletion lines are expected to occur every 16 Mb on the consensus physical map. As mentioned earlier, frequency of these chromosomal breaks was about double in the GRRs. Even with that frequency, GRRs interspersed by <8 Mb of gene-poor DNA will not be resolved.
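
    The breakpoint spacings quoted in this section follow from figures already given in the text, as the sketch below shows. The consensus map size is back-calculated here from the statement that 1554 Mb of GRRs equals 29% of the genome; that derivation and the names used are assumptions for illustration.

```python
GRR_MB = 1554             # total size of the 48 GRRs
GRR_FRACTION = 0.29       # share of the genome covered by GRRs
TOTAL_BREAKPOINTS = 334   # all deletion breakpoints
GRR_BREAKPOINTS = 210     # breakpoints that fall within GRRs


def spacing_mb(region_mb, n_breakpoints):
    """Average distance (Mb) between breakpoints spread evenly over a region."""
    return region_mb / n_breakpoints


# Assumption: consensus map size inferred from the two figures above.
consensus_mb = GRR_MB / GRR_FRACTION

random_spacing = spacing_mb(consensus_mb, TOTAL_BREAKPOINTS)
grr_spacing = spacing_mb(GRR_MB, GRR_BREAKPOINTS)

print(f"expected spacing if random: {random_spacing:.1f} Mb")
print(f"observed spacing in GRRs:   {grr_spacing:.1f} Mb")
```

    The results, about 16 Mb for randomly placed breakpoints and about 7.4 Mb within GRRs, match the 16 Mb and roughly 8 Mb figures used in the discussion of resolution limits.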

    As observed in wheat, distribution of genes is uneven in other grass species as well. In rice, 20% of the genome has not been contiged/sequenced in spite of extensive efforts (8). The unsequenced part of the genome is probably gene-poor. About 21% of the sequenced portion of the rice genome contains 40% of the genes (44). Translocation breakpoint-based physical maps showed that barley chromosomes are also partitioned into gene-rich and gene-poor regions (45). Similarly, in the smaller genome plant Arabidopsis, genes are unevenly distributed on the chromosomes. About 45% of the genome accounts for about 25 000 genes with an average gene size of 2 kb.

    Based on the expected size of the gene-containing fraction, only a small part of the currently demarcated GRRs is expected to contain genes. Assuming that the wheat genes may encompass only 188–375 Mb (1.2–2.4%) of the genome, the gene-containing fraction within the GRRs is predicted to be only 12–24%. The exact distribution of genes in the GRRs is not known, but preliminary results suggest that the GRRs may be further partitioned into mini gene-rich and gene-poor compartments. Within the GRRs, the estimated 75 000 to 150 000 wheat genes are expected to occur every 10–20 kb. Except for ploidy differences, the wheat and the barley genomes are very similar in gene synteny and composition. A 1.1 Mb sequence for selected wheat and barley regions showed that gene density within the GRRs may range from a gene every 4–103 kb with an average of 10–20 kb (46–49). The size of the ‘gene-empty’ blocks interspersed in these regions ranged from 0.8 to 94 kb. The largest contig among these wheat and barley regions is 261 kb around the Mla locus (49). A 130 kb region in the contig contained 23 genes with an average gene density of a gene per 5.6 kb. The size of the interspersing ‘gene-empty’ regions ranged from a few hundred base pairs to 11 kb. The second region of high gene density was 40 kb long and contained 10 genes (a gene every 4 kb). The largest ‘gene-empty’ region was 5.5 kb. Another 60 kb barley contig spanned a 32 kb region around the bronze locus with a gene every 3.2 kb (34). Very similar observations were made in other large genome grasses such as Triticum monococcum, Aegilops tauschii and maize (47,50,51).
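
    The 12–24% and 10–20 kb figures at the start of this paragraph can be reproduced from the 1554 Mb total GRR size and the gene-space and gene-number estimates given earlier. The sketch below is a plain arithmetic check; the function names are illustrative only.

```python
GRR_MB = 1554  # total size of the 48 GRRs


def genic_fraction(gene_space_mb, region_mb=GRR_MB):
    """Share of a region occupied by genes, given the gene space in Mb."""
    return gene_space_mb / region_mb


def gene_spacing_kb(n_genes, region_mb=GRR_MB):
    """Average distance (kb) from one gene to the next if n_genes lie in the region."""
    return region_mb * 1000.0 / n_genes


for space_mb in (188, 375):
    print(f"{space_mb} Mb of genes -> {100 * genic_fraction(space_mb):.0f}% of the GRRs")

for n_genes in (150_000, 75_000):
    print(f"{n_genes} genes -> one gene every {gene_spacing_kb(n_genes):.1f} kb")
```

    The calculation gives about 12% and 24% for the genic share, and about 10.4 kb and 20.7 kb for the spacing, which rounds to the ranges quoted above.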

    Localization of the gene containing regions of wheat is based on studying gene distribution of 3025 gene loci by three different methods (Table 2). Gene distribution by deletion-line-based physical mapping was very similar to that revealed by the comparative analysis between physical and consensus genetic linkage maps. Similarly, distribution of gene markers was comparable to that of phenotypically characterized genes. Therefore, we believe that the demarcation and distribution of genes given in Figure 1 is a good representation of the wheat genome. Since the current analysis was based on the study of 3.9–7.8% of the wheat genes, relative numbers among various GRRs may slightly vary with the analysis of additional genes. Considering the diversity and randomness of the analyzed gene markers, the size, relative gene density and gene proportion are however not expected to change dramatically. A probable exception might be multicopy gene families that were not represented in this study because of the technical difficulties. The current interpretations will not hold true for the multicopy genes if their distribution is dramatically different from that of the single-copy genes.

    The presence of 6% of the genes in the gene-poor regions suggests that additional, smaller GRRs may exist that were not identified during the current analysis. Some genes were observed in the highly heterochromatic, gene-poor regions. Two cDNA clones mapped in the centromeric region of wheat homoeologous group 1 chromosomes, which is highly heterochromatic and gene-poor. Other such examples are gene markers Xbcd1124, Xcdo1340 and Xpsr596 on chromosome 1B, Xpsr107, XunlBF483292, Xabg356 and Xbcd260 on group 2, Xtam12, Xcdo681, Xbcd1278, Xbcd102, XksuI32, XksuF34, Xksu609, Xcdo920 and Xwg222 on group 3, and Xbcd1092, Xbcd734, Xcdo1337 and Xbcd1262 on group 4.

    Distribution of recombination

    Recombination is highly uneven along the wheat chromosomes (Figure 1). Recombination mainly occurred in the 48 GRRs, which accounted for 95% of the recombination. More than 100-fold differences were, however, observed in the rate of recombination among various GRRs. A part of this difference seemed to be due to the suppressive effect of the centromere on recombination. Essentially no recombination was observed in the proximal 30% of the wheat chromosomes in spite of the presence of GRRs. In group 7 chromosomes, for example, the genetic distance between the GRRs ‘7S0.2’ and ‘7L0.8’ is only 7 cM, although the spanned region is >50% of the chromosome and extends over three GRRs. Most of the highly recombinogenic GRRs were localized in the distal parts of the chromosomes, but not all distal GRRs were highly recombinogenic. The genetic distances spanned by the ‘2L0.8’ and ‘6L0.7’ GRRs were <5 cM.

    The distribution of recombination within the GRRs is not known, but the available data suggest that recombination is highly asymmetrical even within the highly recombinogenic GRRs. Analysis of a 1 Mb region of barley homoeologous to one of the most recombinogenic regions of wheat (‘1S0.8’) identified a 240 kb region spanning the Mla locus where no recombination was observed (32). Recombination differences of up to 10-fold were observed among different sectors of the remaining part of the region. Similar observations have also been made in other eukaryotes. The distribution of recombination has been best studied on yeast chromosomes. Measured at 1 kb resolution, the average frequency of recombination was 0.35 cM/kb with a range of 0–2 cM/kb (http://www.yeastgenome.org/). Measured at 150 kb intervals, recombination within a 1 Mb region of rice ranged from 0 to 1.5 cM even though the gene distribution was fairly even (44) (http://rgp.dna.affrc.go.jp/Publicdata.html). The average frequency of recombination in rice is 0.003 cM/kb with a range of 0–0.06 cM/kb (http://rgp.dna.affrc.go.jp/Publicdata.html). Similarly, in wheat at a resolution of 7 Mb, the average recombination frequency was 0.0003 cM/kb with a range of 0–0.007 cM/kb. Non-recombinogenic regions have been observed in yeast as well as in rice, but the highest rate of recombination for a region appears to be 35-fold lower in rice. This may be because the rice estimates were made at 150 kb resolution, whereas the recombination hotspots may be much smaller. In yeast, there is a major hotspot around the PHO8 gene, located at the distal end of chromosome IV, where recombination is 2 cM/kb (http://www.yeastgenome.org/). A 194 bp region around the waxy (wx) locus of rice has a recombination rate of 0.06 cM/kb, which is 20-fold higher than the genome average (52). Similarly, a 377 bp region of the maize genome has a recombination rate of 0.02 cM/kb, which is about 30 times higher than the genome average (35,53).
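
    The fold differences discussed above are ratios of the quoted recombination rates. The Python sketch below recomputes them from the figures in the text. The dictionary layout and the helper name fold_difference are illustrative assumptions; the rates (in cM/kb) are those quoted for yeast, rice and wheat, each measured at a different resolution, so the ratios are only rough comparisons.

```python
# Mean and maximum recombination rates in cM/kb, as quoted in the text.
# Resolutions differ (yeast 1 kb, rice 150 kb, wheat 7 Mb), so the
# comparisons below are approximate.
rates = {
    "yeast": {"mean": 0.35, "max": 2.0},
    "rice": {"mean": 0.003, "max": 0.06},
    "wheat": {"mean": 0.0003, "max": 0.007},
}


def fold_difference(numerator, denominator):
    """Return how many times larger numerator is than denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Peak rate in yeast relative to peak rate in rice.
yeast_vs_rice_peak = fold_difference(rates["yeast"]["max"], rates["rice"]["max"])

# Peak rate relative to the genome average within each species.
peak_vs_mean = {
    species: fold_difference(r["max"], r["mean"]) for species, r in rates.items()
}

print(f"yeast peak / rice peak: {yeast_vs_rice_peak:.1f}-fold")
for species, ratio in peak_vs_mean.items():
    print(f"{species} peak / {species} mean: {ratio:.1f}-fold")
```

    With these inputs the yeast peak is about 33 times the rice peak, close to the roughly 35-fold difference stated above, and the rice peak is 20 times the rice genome average.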

    SUPPLEMENTARY MATERIAL

    ACKNOWLEDGEMENTS

    A contribution of the Agriculture Research Center, Washington State University, Pullman, WA. Journal series No. 0902-03. This research project was supported by the U.S. Department of Agriculture-National Research Initiative (USDA-NRI) and the WSU Vogel Endowment Funds.

    REFERENCES

    Sears,E.R. (1954) The aneuploids of common wheat. Research Bulletin 572 University of Missouri Agricultural Experiment Station, pp. 1–58.

    Arumuganathan,K. and Earle,E.D. (1991) Estimation of nuclear DNA content of plants by flow cytometry. Plant Mol. Biol. Rep., 9, 229–241.

    Flavell,R.B., Bennett,M.D., Smith,J.B. and Smith,D.B. (1974) Genome size and the proportion of repeated nucleotide sequence DNA in plants. Biochem. Genet., 12, 257–269.

    Sandhu,D. and Gill,K.S. (2002) Gene-containing regions of wheat and the other grass genomes. Plant Physiol., 128, 803–811.

    Sidhu,D. and Gill,K.S. (2004) Distribution of genes and recombination in wheat and other eukaryotes. Plant Cell Tissue Organ Cult., in press.

    Clayton,W.D. (1981) Evolution and distribution of grasses. Ann. Missouri Bot. Gard., 69, 5–14.

    Kellogg,E. (1998) Relationships of cereal crops and other grasses. Proc. Natl Acad. Sci. USA, 95, 2005–2010.

    Goff,S.A., Ricke,D., Lan,T.H., Presting,G., Wang,R., Dunn,M., Glazebrook,J., Sessions,A., Oeller,P., Varma,H. et al. (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science, 296, 92–100.

    Gill,K.S., Gill,B.S., Endo,T.R. and Boyko,E. (1996) Identification and high-density mapping of gene-rich regions in chromosome group 5 of wheat. Genetics, 143, 1001–1012.

    Sandhu,D., Champoux,J.A., Bondareva,S.N. and Gill,K.S. (2001) Identification and physical localization of useful genes and markers to a major gene-rich region on wheat group 1S chromosomes. Genetics, 157, 1735–1747.

    Akhunov,E.D., Goodyear,A.W., Geng,S., Qi,L.L., Echalier,B., Gill,B.S., Miftahudin, Gustafson,J.P., Lazo,G., Chao,S. et al. (2003) The organization and rate of evolution of wheat genomes are correlated with recombination rates along chromosome arms. Genome Res., 13, 753–763.

    Barakat,A., Carels,N. and Bernardi,G. (1997) The distribution of genes in the genome of Gramineae. Proc. Natl Acad. Sci. USA, 94, 6857–6861.

    TAGI. (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature, 408, 796–815.

    Barakat,A., Han,D.T., Benslimane,A., Rode,A. and Bernardi,G. (1999) The gene distribution in the genomes of pea, tomato and date palm. FEBS Lett., 463, 139–142.

    Sumner,A.T., Torre,J.D.L. and Stuppia,L. (1993) The distribution of genes on chromosomes: A cytological approach. J. Mol. Evol., 37, 117–122.

    Clay,O. and Bernardi,G. (2001) The isochores in human chromosome 21 and 22. Biochem. Biophys. Res. Commun., 285, 855–856.

    Jabbari,K. and Bernardi,G. (2000) The distribution of genes in drosophila genome. Gene, 18, 1859–1867.

    Lander,E.S., Linton,L.M., Birren,B., Nusbaum,C., Zody,M.C., Baldwin,J., Devon,K., Dewar,K., Doyle,M., FitzHugh,W. et al. (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.

    McPherson,J.D., Marra,M., Hillier,L., Waterston,R.H., Chinwalla,A., Wallis,J., Sekhon,M., Wylie,K., Mardis,E.R., Wilson,R.K. et al. (2001) A physical map of the human genome. Nature, 409, 934–941.

    Venter,J.C., Adams,M.D., Myers,E.W., Li,P.W., Mural,R.J., Sutton,G.G., Smith,H.O., Yandell,M., Evans,C.A., Holt,R.A. et al. (2001) The sequence of the human genome. Science, 291, 1304–1351.

    Gill,K.S., Gill,B.S., Endo,T.R. and Taylor,T. (1996) Identification and high-density mapping of gene-rich regions in chromosome group 1 of wheat. Genetics, 144, 1883–1891.

    Sandhu,D. and Gill,K.S. (2002) Structural and functional organization of the ‘1S0.8 gene-rich region’ in the Triticeae. Plant Mol. Biol., 48, 791–804.

    Dvorak,J. and Chen,K.-C. (1984) Distribution of nonstructural variation between wheat cultivars along chromosome arm 6Bp: evidence from the linkage map and physical map of the arm. Genetics, 106, 325–333.

    Curtis,C.A. and Lukaszewski,A.J. (1991) Genetic linkage between C-bands and storage protein genes in chromosome 1B of tetraploid wheat. Theor. Appl. Genet., 81, 245–252.

    Werner,J.E., Kota,R.S. and Gill,B.S. (1992) Distribution of telomeric repeats and their role in the healing of broken chromosome ends in wheat. Genome, 35, 844–848.

    Gill,K.S., Gill,B.S. and Endo,T.R. (1993) A chromosome region-specific mapping strategy reveals gene-rich telomeric ends in wheat. Chromosoma, 102, 374–381.

    Kota,R.S., Gill,K.S., Gill,B.S. and Endo,T.R. (1993) A cytogenetically based physical map of chromosome 1B in common wheat. Genome, 36, 548–554.

    Faris,J.D., Haen,K.M. and Gill,B.S. (2000) Saturation mapping of a gene-rich recombination hot spot region in wheat. Genetics, 154, 823–835.

    Weng,Y., Tuleen,N.A. and Hart,G.E. (2000) Extended physical maps and a consensus physical map of the homoeologous group-6 chromosomes of wheat (Triticum aestivum L. em Thell.). Theor. Appl. Genet., 100, 519–527.

    Tanksley,S.D., Ganal,M.W., Prince,J.P., de Vicente,M.C., Bonierbale,M.W., Broun,P., Fulton,T.M., Giovannoni,J.J., Grandillo,S., Martin,G.B. et al. (1992) High density molecular linkage maps of the tomato and potato genomes. Genetics, 132, 1141–1160.

    Puechberty,J., Laurent,A.M., Gimenez,S., Billault,A., Laurent,M.E.B., Calenda,A., Marcais,B., Prades,C., Loannou,P., Yurov,Y. et al. (1999) Genetic and physical analyses of the centromeric and pericentromeric regions of human chromosome 5: Recombination across 5cen. Genomics, 56, 274–287.

    Wei,F., Gobelman-Werner,K., Morroll,S.M., Kurth,J., Mao,L., Wing,R.A., Leister,D., Schulze-Lefert,P. and Wise,R.P. (1999) The Mla (powdery mildew) resistance cluster is associated with three NBS-LRR gene families and suppressed recombination within a 240-kb DNA interval on chromosome 5S (1HS) of barley. Genetics, 153, 1929–1948.

    Feuillet,C. and Keller,B. (2002) Comparative genomics in the grass family: molecular characterization of grass genome structure and evolution. Ann. Bot. (Lond), 89, 3–10.

    Fu,H., Zheng,Z. and Dooner,H.K. (2002) Recombination rates between adjacent genic and retrotransposon regions in maize vary by 2 orders of magnitude. Proc. Natl Acad. Sci. USA, 99, 1082–1087.

    Yao,H., Zhou,Q., Li,J., Smith,H., Yandeau,M., Nikolau,B.J. and Schnable,P.S. (2002) Molecular characterization of meiotic recombination across the 140-kb multigenic a1-sh2 interval of maize. Proc. Natl Acad. Sci. USA, 99, 6157–6162.

    Adams,M.D., Celniker,S.E., Holt,R.A., Evans,C.A., Gocayne,J.D., Amanatides,P.G., Scherer,S.E., Li,P.W., Hoskins,R.A., Galle,R.F. et al. (2000) The genome sequence of Drosophila melanogaster. Science, 287, 2185–2195.

    Boyko,E.V., Gill,K.S., Mickelson-Young,L., Nasuda,S., Roupp,W.J., Ziegle,J.N., Singh,S., Hassawi,D.S., Fritz,A.K., Namuth,D. et al. (1999) A high-density genetic linkage map of Aegilops tauschii, the D-genome progenitor of bread wheat. Theor. Appl. Genet., 99, 16–26.

    Endo,T.R. (1982) Gametocidal chromosomes of three Aegilops species in common wheat. Can. J. Genet. Cytol., 24, 201–206.

    Endo,T.R., Mukai,Y., Yamamoto,M. and Gill,B.S. (1991) Physical mapping of a male-fertility gene of common wheat. Jpn. J. Genet., 66, 291–295.

    Endo,T.R. and Gill,B.S. (1996) The deletion stocks of common wheat. J. Hered., 87, 295–307.

    Anderson,J.A., Ogihara,Y., Sorrells,M.E. and Tanksley,S.D. (1992) Development of a chromosomal arm map for wheat based on RFLP markers. Theor. Appl. Genet., 83, 1035–1043.

    Sandhu,D., Sidhu,D. and Gill,K.S. (2002) Identification of expressed sequence markers for a major gene-rich region of wheat chromosome group 1 using RNA fingerprinting-differential display. Crop Sci., 42, 1285–1290.

    Dilbirligi,M., Erayman,M., Sandhu,D., Sidhu,D. and Gill,K.S. (2004) Identification of wheat chromosomal regions containing expressed resistance genes. Genetics, 166, 461–481.

    Wu,J., Maehara,T., Shimokawa,T., Yamamoto,S., Harada,C., Takazaki,Y., Ono,N., Mukai,Y., Koike,K., Yazaki,J. et al. (2002) A comprehensive rice transcript map containing 6591 expressed sequence tag sites. Plant Cell, 14, 525–535.

    Kunzel,G., Korzum,L. and Meister,A. (2000) Cytologically integrated physical restriction fragment length polymorphism maps for the barley genome based on translocation breakpoints. Genetics, 154, 397–412.

    Shirasu,K., Schulman,A.H., Lahaye,T. and Schulze-Lefert,P. (2000) A contiguous 66-kb barley DNA sequence provides evidence for reversible genome expansion. Genome Res., 10, 908–915.

    Wicker,T., Stein,N., Albar,L., Feuillet,C., Schlagenhauf,E. and Keller,B. (2001) Analysis of a contiguous 211 kb sequence in diploid wheat (Triticum monococcum L.) reveals multiple mechanisms of genome evolution. Plant J., 26, 307–316.

    Brueggeman,R., Rostoks,N., Kudrna,D., Kilian,A., Han,F., Chen,J., Druka,A., Steffenson,B. and Kleinhofs,A. (2002) The barley stem rust-resistance gene Rpg1 is a novel disease-resistance gene with homology to receptor kinases. Proc. Natl Acad. Sci. USA, 99, 9328–9333.

    Wei,F., Wing,R.A. and Wise,R.P. (2002) Genome dynamics and evolution of the Mla (powdery mildew) resistance locus in barley. Plant Cell, 14, 1903–1917.

    Feuillet,C. and Keller,B. (1999) High gene density is conserved at syntenic loci of small and large grass genomes. Proc. Natl Acad. Sci. USA, 96, 8265–8270.

    Tikhonov,A.P., SanMiguel,P.J., Nakajima,Y., Gorenstein,N.M., Bennetzen,J.L. and Avramova,Z. (1999) Colinearity and its exceptions in orthologous adh regions of maize and sorghum. Proc. Natl Acad. Sci. USA, 96, 7409–7414.

    Inukai,T., Sako,A., Hirano,H.Y. and Sano,Y. (2000) Analysis of intragenic recombination at wx in rice: Correlation between the molecular and genetic maps within the locus. Genome, 43, 589–596.

    Civardi,L., Xia,Y., Edwards,K.J., Schnable,P.S. and Nikolau,B.J. (1994) The relationship between genetic and physical distances in the cloned a1-sh2 interval of the Zea mays L. genome. Proc. Natl Acad. Sci. USA, 91, 8268–8272.

    Gill,B.S., Friebe,B. and Endo,T.R. (1991) Standard karyotype and nomenclature system for description of chromosome bands and structural aberrations in wheat (Triticum aestivum). Genome, 34, 830–839.

    (Mustafa Erayman, Devinder Sandhu, Deepak)