
Unraveling plant metabolism by EST analysis
John Ohlrogge* and Christoph Benning†
Large-scale single-pass sequencing of cDNAs prepared from
specific plant species or tissues has evolved as an inexpensive
and efficient gene-discovery tool that can be used to identify
novel cDNAs encoding enzymes of specific plant metabolic
pathways. Collections of expressed sequence tags from
metabolically active tissues can provide quantitative estimates
of gene expression levels and thus are being exploited to
unravel plant metabolic and regulatory networks.
Addresses
*Department of Botany and Plant Pathology, Michigan State University,
East Lansing, Michigan 48823, USA; e-mail: ohlrogge@pilot.msu.edu
† Department of Biochemistry, Michigan State University, East Lansing,
Michigan 48823, USA; e-mail: benning@pilot.msu.edu
Current Opinion in Plant Biology 2000, 3:224–228
1369-5266/00/$ — see front matter
© 2000 Elsevier Science Ltd. All rights reserved.
Abbreviations

DXPS  1-deoxy-D-xylulose-5-phosphate synthase
EST   expressed sequence tag

Introduction
Dramatic improvements in DNA-sequencing technology
have paved the way for the use of large-scale single-pass
cDNA sequencing, which has given rise to large
expressed sequence tag (EST) collections, to address
many biological questions. As of December 1999, GenBank
(National Center for Biotechnology Information, URL
http://www.ncbi.nlm.nih.gov/) contained approximately
250,000 plant entries in its dbEST database (see [1]; URL
http://www.ncbi.nlm.nih.gov/) with the greatest numbers of
entries for tomato (51,932), rice (47,449) and Arabidopsis
(45,757) [2–4]. One of the first uses of EST collections was
in identifying genes involved in specific plant metabolic
pathways. Plants synthesize more than 100,000 different
compounds, many with high nutritional value, or with medical or industrial applications. These valuable compounds
are frequently found in exotic species rather than in the crop
or model plants whose genomes have been most widely
researched. Because it is desirable to introduce valuable
genes into a crop plant by genetic engineering, efficient
methods of isolating these genes within such exotic species
are needed. In the past, it was necessary to isolate the target
enzymes from exotic tissues to obtain the molecular information necessary for the cloning of the encoding gene, an
often challenging, time-consuming and sometimes unsuccessful endeavor. More recently, large-scale sequencing of
cDNAs prepared from tissues with specific metabolic activities has provided a cost-effective and rapid alternative route
toward the isolation of several genes (e.g. [5,6•]).
The establishment of high-throughput DNA-sequencing
centers in many locations has greatly reduced both the
cost and time involved in obtaining large EST data sets.
A single sequencer with 96 capillaries can now produce at
least 400 sequences a day, each of 600 base pairs (bp),
costing less than $10 per sequence. A researcher can
therefore obtain 5000 sequences from a cDNA library
within a few weeks and at a cost that is substantially
lower than the personnel costs associated with many
other forms of gene isolation/discovery. Additional benefits of EST approaches are that they require very few
assumptions about the target gene and provide broad
additional data that may be useful in the future. For
example, searching for clones of genes that encode an
enzyme involved in the production of a valuable natural
product may succeed, but it may subsequently become
apparent that other enzymes are needed to accompany it.
The EST data set may already contain the information
needed to quickly obtain such ‘accessory’ clones, thus
eliminating the need for further clone-isolation projects.
Later, the same data set may prove extremely valuable to
researchers working on other pathways who may then be
able to obtain their favorite cDNA by directly accessing
the respective EST cDNA set.

How many sequences are needed for gene discovery? If
the cDNA library is produced from a tissue source in which
the desired products are abundant, then it can be expected that the enzymes needed to produce such a product will
also be relatively abundant (> 0.1% of total protein), as will
be the cDNA clones corresponding to the relevant mRNA.
If the abundance of an enzyme is 0.1% of the total protein

and its mRNA is at a similar level, then, in theory,
sequencing 3000 clones will provide a greater than 95%
chance of discovering this enzyme. In several cases, clones
for novel enzymes have been discovered after less than
1000 clones were sequenced (as described below).
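The 95% figure can be reproduced with a short calculation. The following is a minimal sketch, assuming that each sequenced clone is an independent random draw from the library and that the target cDNA makes up a fixed fraction of it; the function names are illustrative and do not come from any cited work.

```python
import math

def discovery_probability(abundance: float, clones: int) -> float:
    """Chance of sequencing at least one clone of a transcript.

    abundance: fraction of clones in the library that derive from the
               target mRNA (assumed constant, e.g. 0.001 for 0.1%).
    clones:    number of randomly picked clones that are sequenced.
    """
    # P(at least one hit) = 1 - P(no hit in any of the draws)
    return 1.0 - (1.0 - abundance) ** clones

def clones_needed(abundance: float, confidence: float) -> int:
    """Smallest number of clones giving at least the requested chance."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - abundance))

# The case discussed in the text: 0.1% abundance, 3000 clones.
p_3000 = discovery_probability(0.001, 3000)   # slightly above 0.95
n_95 = clones_needed(0.001, 0.95)             # just under 3000 clones
```

Under the same assumptions, a transcript at 1% abundance needs only a few hundred clones, which is consistent with the discoveries from small data sets described below.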
Particularly successful examples of EST-based gene discovery include the isolation of genes encoding enzymes
that are involved in the biosynthesis of unusual fatty acids
[7,8•]. This success stems from the fact that the structures
of fatty-acid-modifying enzymes, such as fatty acid desaturases and related enzymes, follow a general blueprint.
When enzyme function cannot be deduced from sequence
annotation because no molecular background information
is available, the EST strategy has clear limitations.
Nevertheless, even in this case, cDNA sequencing may
eventually lead to success by targeting abundant classes of
ESTs of unknown function derived from metabolically
specialized tissues for further analysis. In this review, we
will discuss several recent examples of the identification of
genes encoding specific plant enzymes using the EST
approach. We will also discuss attempts to exploit large-scale EST data sets to provide transcriptional profiles and
to unravel regulatory metabolic networks.



EST sequencing as a gene discovery tool
Fatty-acid modification

The first application of large-scale sequencing of cDNAs
derived from a plant tissue and having specialized metabolic activity was the isolation of the cDNA encoding the
enzyme responsible for ricinoleic acid biosynthesis [5].
Because of the historic significance and to illustrate the
principles of the approach, we will discuss this example in
detail. Ricinoleic acid (i.e. 12-hydroxyoleic acid) has many
industrial applications. It is highly abundant (90% of total
fatty acids) in castor seeds but absent from most other
plants. Although the activity of castor hydroxylase, the crucial enzyme involved in ricinoleic acid biosynthesis, could
be studied in microsomal fractions isolated from developing
seeds, this enzyme is labile and a successful biochemical
purification seemed intractable. Nevertheless, multiple
lines of biochemical evidence suggested that the reaction
mechanism of this hydroxylase should be similar to that of

membrane-bound desaturases [9]. Equipped with this
knowledge, Van de Loo et al. [5] reasoned that the hydroxylase involved in ricinoleic acid biosynthesis should contain
a pair of histidine-rich motifs, which are characteristic of
membrane-bound desaturases. In addition, the gene
encoding the hydroxylase should be expressed in seeds but
not leaves (which lack ricinoleic acid), and its mRNA abundance in developing seeds should be similar to the known
abundance of mRNAs encoding other fatty acid desaturases. Accordingly, a cDNA library was constructed from
mRNA isolated from developing castor-bean endosperm
[10]. Clones were randomly picked, arrayed and enriched
for seed-specific mRNAs by probing the arrays with first-strand cDNA derived from leaf mRNA. Further
enrichment was achieved by eliminating highly abundant
clones by hybridizing the arrays with storage-protein-encoding probes. In addition, using an immunoblotting
procedure, clones reacting with antiserum against microsomes from developing castor-bean endosperm were
selected for sequencing. Among 468 clones from this
enriched set, two showed similarity to a clone encoding the
membrane-bound desaturase from Arabidopsis. Following
expression of the full-length cDNA in tobacco, ricinoleic
acid was produced, thereby confirming that a cDNA encoding the hydroxylase had been identified [5].
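The idea of screening candidate clones for histidine-rich motifs can be illustrated with a simple pattern search. This is a rough sketch only: the regular expression (two histidines separated by two to four residues) is our own simplified assumption about what such a motif looks like, the toy sequence is invented, and the study described above identified its clones by similarity to a known desaturase rather than by this filter.

```python
import re

# Hypothetical, simplified "histidine box": H, then 2-4 residues, then H.
# This pattern is an assumption for illustration, not a published consensus.
HIS_BOX = re.compile(r"H.{2,4}H")

def find_his_boxes(protein: str):
    """Return (start, motif) for each non-overlapping match in a
    translated (amino-acid) sequence given in one-letter code."""
    return [(m.start(), m.group()) for m in HIS_BOX.finditer(protein)]

def is_candidate(protein: str, min_boxes: int = 2) -> bool:
    """Flag a translated EST that carries at least `min_boxes` motifs."""
    return len(find_his_boxes(protein)) >= min_boxes

# Invented toy sequence containing two histidine-rich stretches.
toy = "MAA" + "HECGHH" + "AAAAAAA" + "HRRHH" + "SAAA"
hits = find_his_boxes(toy)
```

A filter this loose would flag many unrelated proteins, so in practice it could only serve as a first pass before a proper similarity search.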
More recently, a similar EST approach has been successfully
applied to identify fatty-acid conjugases for the first time

[6•]. This new type of fatty-acid modifying enzyme is
involved in the lipid-linked biosynthesis of fatty acids with
conjugated double bonds, such as the 18-carbon fatty acids
α-eleostearic acid and α-parinaric acid. These fatty acids are
the main constituents of tung oil and are ideally suited as
drying agents in paints and varnishes. The developing seeds
of certain plant species, such as Momordica charantia and
Impatiens balsamina, contain high concentrations of these
fatty acids, whereas the leaves of the same species are devoid
of them. The mechanism for the introduction of conjugated
double bonds into fatty acids was unknown, but either desaturase- or lipoxygenase-like reactions had been proposed.


Thus, Cahoon et al. [6•] prepared cDNA libraries from developing seeds of M. charantia and I. balsamina and obtained
5′-sequences from approximately 3000 randomly picked
clones from each library. In this case, no enrichment procedures were used. In the M. charantia and I. balsamina EST
sets, 3 and 12 copies, respectively, of cDNAs encoding proteins with similarity to oleoyl-phosphatidylcholine-type
microsomal desaturases were detected, and the proteins
were designated MomoFadX and ImpFadX, respectively.

Expression of full-length cDNAs in yeast and somatic transformation of soybean embryos with the respective constructs
confirmed the conjugase activity of both MomoFadX and
ImpFadX; this work also demonstrated that their cDNAs can
be used to engineer fatty acids with conjugated double
bonds in crop plants. Furthermore, the cloning of these fatty
acid conjugase encoding cDNAs provides the first step
towards the elucidation of the mechanism that introduces
conjugated double bonds into fatty acids.
Isoprenoid biosynthesis

Although developing seeds are often quite specialized with
regard to their biosynthetic capabilities, metabolic specialization may have reached its pinnacle in the oil glands of
peppermint. The cells of these glands devote almost all of
their biosynthetic capacity toward synthesis of isoprenoid
products such as menthol. A key accomplishment leading to
the successful application of ESTs to this system was the isolation of the oil glands and preparation of sufficient mRNA
for cDNA library construction [11]. Single-pass sequencing
of 150 randomly picked cDNA clones from a peppermint oil
gland cDNA library can hardly be deemed ‘large-scale’, but
it was sufficient to allow the discovery of cDNAs encoding

two novel proteins that are involved in isoprenoid biosynthesis. In fact, 8–10% of the cDNAs in the library appear to
encode enzymes that are involved in isoprenoid biosynthesis
[12], including (E)-β-farnesene synthase [11] and 1-deoxy-D-xylulose-5-phosphate synthase (DXPS) [13••].
The first enzyme identified, (E)-β-farnesene synthase
[11], catalyzes the conversion of farnesyl diphosphate to
(E)-β-farnesene, an acyclic sesquiterpene olefin that is
present in many plants and is a chemical signal in some
plant–insect interactions. The rationale for using the EST
approach was based on three main factors: the presence of
(E)-β-farnesene (3.4% of the sesquiterpenes) in peppermint oil; initial experiments that showed that terpenoid
biosynthetic enzymes such as monoterpene cyclase and
limonene synthase are highly abundant in the peppermint
oil gland; and the availability of molecular information on
other plant sesquiterpene synthases. Expression of the
candidate cDNA in Escherichia coli followed by analysis of
the recombinant protein confirmed the nature of the
encoded novel sesquiterpene synthase.
The second enzyme identified, DXPS [13••], catalyzes the
first reaction of the recently discovered mevalonate-independent pathway for isoprenoid biosynthesis [14]. Among the
150 ESTs from the peppermint oil gland cDNA library were



Physiology and metabolism

two encoding a protein with high sequence similarity to
CLA1 from Arabidopsis [15]. The molecular function of CLA1
is unknown, but the phenotype of the cla1 mutant is consistent with a deficiency in isoprenoid biosynthesis. Expression
of a full-length cDNA corresponding to the two ESTs from
the peppermint oil gland library in E. coli and analysis of the
recombinant protein led to the discovery of DXPS-encoding
cDNAs [13••]. This achievement may represent a crucial step
towards the genetic engineering of isoprenoids in plants.
Cell-wall biosynthesis

Plant cell walls consist of intricately interwoven, highly
complex polymers, and our insights into their biosynthesis
are still sketchy. For example, success in identifying genes
encoding plant cellulose synthase, which is responsible for
the biosynthesis of the world’s most abundant biopolymer,

cellulose, remained elusive until Pear et al. [16] sequenced
250 randomly picked cDNAs from developing cotton
fibers. They were able to identify two cDNAs encoding
proteins with similarity to known bacterial cellulose synthases and to show that the expression of the respective
genes, celA1 and celA2, is strongly induced in developing
cotton fibers at the onset of secondary cell-wall biosynthesis. Although no in vitro assay for cellulose produced by the
recombinant cellulose synthase could be established, the
binding of the substrate UDP–glucose was observed.
Taken together, these results suggest that the two cDNAs
encode cellulose synthases. Most recently, it has been
shown that antibodies against these proteins react with the
rosette terminal complex [17•], a cell-wall-associated structure that is visible by electron microscopy and that is
believed to represent the cell-wall cellulose-synthase complex. This long-awaited breakthrough in cell-wall research
is not the end of the story because a large number of cellulose-synthase-like ESTs are present in different plant
EST data-sets and their function is not yet known [18].
Secondary cell walls are the main structural elements of
wood, and two large-scale sequencing projects have recently
been targeted at identifying genes involved in gymnosperm [19] and angiosperm [20] wood formation. In the
first study [19], 1097 ESTs from four cDNA libraries were
combined into one data set. The four libraries were derived
from bent loblolly pine trees, specifically compression wood
(i.e. the reaction wood formed on the underside of a leaning
tree), normal wood, and two reciprocally subtracted libraries
derived from the former two. Approximately 10% of the
ESTs encoded proteins with similarity to those known to be
involved in complex-carbohydrate biosyntheses or lignin
biosynthesis, or cell-wall-associated proteins, suggesting that
this data set is enriched in ESTs encoding proteins that are
involved in cell-wall biosynthesis. Nevertheless, because the
data set was derived from four different libraries, two of
which were enriched in cDNAs specific to normal or
compression wood, it will be difficult to make meaningful
comparisons with other data sets with regard to transcript
abundance. The second wood project [20] involved two
poplar species and two different libraries: a more general,

cambial-region-specific library from one species that was
expected to yield ESTs involved in xylem and phloem formation, and a more specific cDNA library from the
developing xylem of the other species. A total of
5692 ESTs were obtained, of which 4% were found to be related to wood formation. It was observed that ESTs encoding
proteins involved in xylem formation were twice as common
in the 883 ESTs derived from the developing xylem library
as in the general cambium-specific library. Interestingly, the
gymnosperm [19] and angiosperm [20] data sets had similar
composition with regard to the frequency of ESTs encoding
proteins involved in wood formation and those encoding proteins of unknown function. Nevertheless, in order to
interpret properly the comparison between gymnosperm and
angiosperm data sets, a truly interesting experiment, the data
sets would have to be acquired in similar ways with regard to
library preparation and clone enrichment.

Transcriptional profiling and metabolic networks
In addition to providing an efficient method for gene discovery, EST data sets can also provide information on gene
expression [21•,22••]. This aspect of EST analysis is based
on the rationale that if gene X is highly expressed (leading
to high mRNAX levels), then cDNAs corresponding to
mRNAX will be abundant in a cDNA library made from the
tissue in which gene X is expressed. After random sequencing of a large number of clones from the cDNA library,
simply counting the number of ESTs that correspond to the
mRNA for gene X will provide an estimate of the abundance
of mRNAX in the original population. Furthermore, subject
to the limitations discussed below, the number of EST
sequences corresponding to mRNAX divided by the total
number of ESTs in the data set provides an absolute estimate of mRNAX abundance. Such ‘electronic or digital’
Northerns have both advantages and limitations in comparison to conventional Northern analysis or microarray analysis.
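The counting procedure behind a digital Northern is simple enough to sketch in a few lines. The example below assumes that every EST has already been assigned to a gene (for instance by its best database match, which is our assumption); the gene labels and counts are invented for illustration.

```python
from collections import Counter

def digital_northern(est_gene_hits):
    """Map each gene to (EST count, fraction of all ESTs in the data set).

    est_gene_hits: one gene label per sequenced EST.
    """
    counts = Counter(est_gene_hits)
    total = sum(counts.values())
    return {gene: (n, n / total) for gene, n in counts.items()}

# Invented data: 10 ESTs, three of which correspond to gene X.
hits = ["geneX"] * 3 + ["geneY"] + ["other"] * 6
profile = digital_northern(hits)
count_x, fraction_x = profile["geneX"]   # 3 ESTs out of 10
```

Because the fraction is computed against the whole data set, values for different genes in the same library can be compared directly, subject to the limitations discussed in this section.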
Conventional Northern-blot analysis is almost always performed with only one or a few genes, and the experimental
objective is usually to compare expression of the same gene
under different conditions or in different tissues. Although
obtaining quantitative data on absolute mRNA expression
levels is possible by careful comparison to standards, such
analysis is almost never conducted. In principle, microarray
analysis makes it possible to analyze simultaneously the
expression levels of thousands of genes. Nevertheless, the
technique is similar to Northern-blot analysis with regard to
its quantitative aspects. In most microarray techniques,
because of a number of technical limitations, the absolute
intensity of the fluorescence signal on a spot may not reflect
absolute mRNA levels. Microarray techniques can therefore
provide reliable data when comparing the ratio of expression
of the same mRNA under different conditions, but comparing the relative expression of different genes is problematic.
Similar to microarray analysis, the use of large EST data
sets allows the analysis of many hundreds or even thousands of genes simultaneously. EST data sets, however,


have an added advantage of making it possible to compare
the mRNA expression of different genes quantitatively. A
few limitations to this principle must be kept in mind.
First, in a few cases, mRNA secondary structure may cause
reverse transcriptase to ineffectively produce cDNA and
therefore the copy number of such cDNAs will under-represent the mRNA species from which they are derived.
Second, the statistics of sampling small numbers from a
large population must be considered. Audic and Claverie
[23] addressed this statistical issue and established a rigorous significance test for identifying differentially
expressed genes by comparing relative abundance of
ESTs. Third, the usefulness of EST data sets to estimate
gene expression levels will sometimes be compromised if
the data originate from normalized or subtracted cDNA libraries.
Unfortunately, for the purposes of digital Northerns, the
investigator wanting to use EST sequencing as a tool for
gene discovery will frequently try to normalize or subtract
a cDNA library to enhance the representation of the
sought-after gene. Such procedures may or may not limit
the usefulness of the data set for comparison of gene
expression levels. For example, removing only the abundant storage-protein genes from a seed library should not
alter the relative EST numbers of the vast majority of
enzymes or other less abundant mRNA.
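The sampling issue raised as the second limitation can be made concrete. The sketch below assumes the commonly quoted form of the Audic and Claverie statistic [23]: given x ESTs for a gene among N1 ESTs in one library, the probability of observing y ESTs among N2 in a second library is p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The function names and the example counts are ours and purely illustrative.

```python
import math

def audic_claverie_pmf(y: int, x: int, n1: int, n2: int) -> float:
    """p(y | x): chance of y ESTs in library 2 (n2 ESTs in total) given
    x ESTs for the same gene in library 1 (n1 ESTs in total)."""
    r = n2 / n1
    # Work in log space so that large counts do not overflow factorials.
    log_p = (y * math.log(r)
             + math.lgamma(x + y + 1)
             - math.lgamma(x + 1)
             - math.lgamma(y + 1)
             - (x + y + 1) * math.log(1.0 + r))
    return math.exp(log_p)

def upper_tail(y: int, x: int, n1: int, n2: int) -> float:
    """One-sided probability of seeing y or more ESTs in library 2."""
    return 1.0 - sum(audic_claverie_pmf(k, x, n1, n2) for k in range(y))

# Invented example: 2 ESTs among 1000 in one library, 12 among 1000 in another.
p_up = upper_tail(12, 2, 1000, 1000)
```

A small tail probability suggests that the difference in EST counts is unlikely to be a sampling accident; it says nothing about biases introduced during library construction.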
Almost all of the plant EST data in dbEST are derived
from non-normalized cDNA libraries, making this large
data set useful for digital-Northern analysis. A number of
Arabidopsis cDNA sequences that represent re-sequencing
of clones previously selected as ‘unique’ have, however,
recently been deposited in GenBank. Future use of
dbEST for digital Northerns in Arabidopsis will need to
accommodate this new set of data. Using the digital-Northern analysis approach, Mekhedov et al. [22••] have
recently analyzed the abundance of ESTs corresponding to
62 proteins that are involved in plant lipid biosynthesis.
Their results are available via the internet at
http://www.bpp.msu.edu/lgc/index.html. This analysis
provided the first semi-quantitative comparison of the
mRNA levels of a large number of enzymes in a plant
metabolic pathway. In general, analyses of the data sets
from Arabidopsis and rice revealed similar patterns of EST
abundance, thereby providing support for the validity of
the digital-Northern approach. Furthermore, the abundance of ESTs for specific reactions correlated well with
our understanding of the enzymology and flux characteristics of the pathway. For example, desaturases, which have
low catalytic efficiency, were found to be more abundant
than thioesterases or acyltransferases, which are extremely
efficient enzymes with high turnover rates. In a few cases,
however, the number of ESTs did not match expectations,
and these examples may provide initial clues regarding the
regulation of these enzymes. For instance, ESTs for the
FatB acyl–ACP thioesterase occurred 21 times compared to
seven times for FatA acyl–ACP thioesterase, although flux
through the FatA reaction is several times greater than
through the FatB reaction.


We have also recently examined an EST data set derived
from developing Arabidopsis seeds (JA White et al., unpublished data). More than 10,000 ESTs were partially
sequenced from a library from which cDNAs for storage proteins had been largely subtracted. Approximately 40% of the
ESTs do not correspond to those previously deposited in
dbEST; this new data set has therefore revealed a substantially different set of transcripts from those present in other
tissues. Consistent with the major storage of oil in Arabidopsis
seeds, the abundance of ESTs representing lipid-biosynthetic enzymes was two to five times higher than in the non-seed
Arabidopsis EST data sets. When combined with microarray
data, the EST data have revealed several hundred previously uncharacterized genes whose expression is highly specific
to seeds. Included in this subset are a number of transcription factors, kinases and other proteins that are probably
involved in regulation of seed metabolism. Thus, this
process has identified ‘candidate’ genes that may control the
specific metabolism of oilseeds and that now can be examined further in transgenic plant experiments.

Conclusions
The dramatic reduction in the cost of high-throughput
DNA sequencing now makes this technology affordable for
almost all researchers. In most cases, the ability to generate
thousands of DNA sequences provides investigators not
only with low-cost identification of a target gene, but also
with a wealth of information about gene expression and
metabolism in the plant tissue and species from which the
data were derived. Expansion of large-scale sequencing to a
diverse range of non-crop species will help biologists to
access the vast biodiversity represented by more than
300,000 plant species. When such sequence data are
deposited in the public domain, other researchers with different interests or perspectives may be able to ‘mine’ new
insights from them. Deriving maximum information from
such data sets also requires new bioinformatics tools that are
only now being developed. Software is needed that can
allow comparisons of large sequence data sets to identify
common themes in gene expression, in addition to those
patterns that are species- or tissue-specific. It is likely that
unexpected dividends will accrue as our ability to organize
these new large data sets evolves.

Update
Schmitt et al. [24] have recently described a four-step procedure for systematic mining of whole EST libraries for
differentially expressed genes. Using about four million
human ESTs, they eliminated redundant entries from the
EST libraries before building contigs of maximal length
upon the remaining ESTs. Putative genes were compared
against a database comprising ESTs from 16 different tissues (both normal and tumour affected) to determine
whether or not they are differentially expressed (i.e. a digital-Northern approach was used).
A recently published review by Rezvani et al. [25] outlines
the continued usefulness of the large-scale generation of


ESTs in establishing expression profiles. This review provides a brief summary of recent data implicating genes that
may be involved in apoptosis in the cardiovascular system.
Machine learning techniques that can predict secretory
proteins from protein, genomic and EST sequences have
also been described recently [26].

Acknowledgements
Work in the authors’ laboratories is supported by grants from the National
Science Foundation (DCB90-05290 and MCB 9807943), the Consortium for
Plant Biotechnology Research (CPBR), and the Department of Energy
(DE-FG02-87ER12729 and DE-FG02-98ER20305). We acknowledge the
Michigan Agricultural Experiment Station for its support of this research.

References and recommended reading
Papers of particular interest, published within the annual period of review,
have been highlighted as:

• of special interest
•• of outstanding interest
1. Boguski MS, Lowe TM, Tolstoshev CM: dbEST - database for
‘expressed sequence tags’. Nat Genet 1993, 4:332-333.
2. Cooke R, Raynal M, Laudie M, Grellet F, Delseny M, Morris PC,
Guerrier D, Giraudat J, Quigley F, Clabault G et al.: Further progress
towards a catalogue of all Arabidopsis genes: analysis of a set of
5000 non-redundant ESTs. Plant J 1996, 9:101-124.
3. Newman T, de Bruijn FJ, Green P, Keegstra K, Kende H, McIntosh L,
Ohlrogge J, Raikhel N, Somerville S, Thomashow M: Genes galore: a
summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones. Plant
Physiol 1994, 106:1241-1255.
4. Sasaki T, Song J, Koga Ban Y, Matsui E, Fang F, Higo H, Nagasaki H,
Hori M, Miya M, Murayama-Kayano E et al.: Toward cataloging all
rice genes: large-scale sequencing of randomly chosen rice
cDNAs from a callus cDNA library. Plant J 1994, 6:615-624.
5. Van de Loo FJ, Broun P, Turner S, Somerville C: An oleate
12-hydroxylase from Ricinus communis L. is a fatty acyl desaturase homolog. Proc Natl Acad Sci USA 1995, 92:6743-6747.
6. • Cahoon EB, Carlson TJ, Ripp KG, Schweiger BJ, Cook GA, Hall SE,
Kinney AJ: Biosynthetic origin of conjugated double bonds:
production of fatty acid components of high-value drying oils in
transgenic soybean embryos. Proc Natl Acad Sci USA 1999,
96:12935-12940.
The authors describe the isolation of genes that encode fatty-acid conjugase
using an EST approach. Fatty-acid conjugases, a term coined in this paper,
introduce conjugated double bonds into fatty acids, generating high-value
conjugated products.
7. Shanklin J, Cahoon EB: Desaturation and related modifications of
fatty acids. Annu Rev Plant Physiol Plant Mol Biol 1998,
49:611-641.
8. • Broun P, Shanklin J, Whittle E, Somerville C: Catalytic plasticity of
fatty acid modification enzymes underlying chemical diversity of
plant lipids. Science 1998, 282:1315-1317.
The authors of this paper demonstrate the close structural relationship between
different fatty-acid-modifying enzymes. The principle that different enzymatic
activities can be converted into each other by protein engineering is proved.
9. Arondel V, Lemieux B, Hwang I, Gibson S, Goodman HM,
Somerville CR: Map-based cloning of a gene controlling omega-3
fatty acid desaturation in Arabidopsis. Science 1992, 258:1353-1355.

10. Van de Loo FJ, Turner S, Somerville C: Expressed sequence tags
from developing castor seeds. Plant Physiol 1995, 108:1141-1150.

11. Crock J, Wildung M, Croteau R: Isolation and bacterial expression
of a sesquiterpene synthase cDNA clone from peppermint
(Mentha x piperita, L.) that produces the aphid alarm pheromone
(E)-beta-farnesene. Proc Natl Acad Sci USA 1997,
94:12833-12838.
12. Lange BM, Croteau R: Genetic engineering of essential oil production in mint. Curr Opin Plant Biol 1999, 2:139-144.
13. •• Lange BM, Wildung MR, McCaskill D, Croteau R: A family of
transketolases that directs isoprenoid biosynthesis via a
mevalonate-independent pathway. Proc Natl Acad Sci USA 1998,
95:2100-2104.
This paper describes the isolation of a cDNA that encodes the enzymes
involved in the first reaction of the recently discovered mevalonate-independent
pathway of isoprenoid biosynthesis. The authors demonstrate that the recently
described Arabidopsis cla1 mutant is deficient in isoprenoid biosynthesis.
14. Lichtenthaler H: The 1-deoxy-D-xylulose-5-phosphate pathway of
isoprenoid biosynthesis in plants. Annu Rev Plant Physiol Plant
Mol Biol 1999, 50:47-65.
15. Mandel MA, Feldmann KA, Herrera-Estrella L, Rocha-Sosa M, Leon P:
CLA1, a novel gene required for chloroplast development, is highly conserved in evolution. Plant J 1996, 9:649-658.
16. Pear JR, Kawagoe Y, Schreckengost WE, Delmer DP, Stalker DM:
Higher plants contain homologs of the bacterial celA genes
encoding the catalytic subunit of cellulose synthase. Proc Natl
Acad Sci USA 1996, 93:12637-12642.
17. • Kimura S, Laosinchai W, Itoh T, Cui X, Linder CR, Brown RMJ:
Immunogold labeling of rosette terminal cellulose-synthesizing
complexes in the vascular plant Vigna angularis. Plant Cell 1999,
11:2075-2086.
The authors demonstrate that the terminal rosette complex observed by electron microscopy does indeed contain cellulose synthase, as had been speculated for a long time.
18. Cutler S, Somerville C: Cloning in silico. Curr Biol 1997, 7:R108-R111.
19. Allona I, Quinn M, Shoop E, Swope K, Cyr SS, Carlis J, Riedl J,
Retzel E, Campbell MM, Sederoff R, Whetten RW: Analysis of
xylem formation in pine by cDNA sequencing. Proc Natl Acad Sci
USA 1998, 95:9693-9698.
20. Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A,
Amini B, Bhalerao R, Larsson M, Villarroel R et al.: Gene discovery in
the wood-forming tissues of poplar: analysis of 5,692 expressed
sequence tags. Proc Natl Acad Sci USA 1998, 95:13330-13335.
21. • Ewing RM, Kahla AB, Poirot O, Lopez F, Audic S, Claverie JM: Large-scale
statistical analyses of rice ESTs reveal correlated patterns
of gene expression. Genome Res 1999, 9:950-959.
This is an important technical paper describing the bioinformatic procedures
behind the cluster analysis of large EST data sets. Several tissue-specific EST
data sets from rice were analyzed. A ‘digital’ analysis of the data is provided.
22. •• Mekhedov S, Martinez de Ilarduya O, Ohlrogge J: Towards a
functional catalog of the plant genome: a survey of genes for lipid
biosynthesis. Plant Physiol 2000, in press.
The authors provide an extensive in silico analysis of publicly available plant ESTs
encoding lipid-metabolizing enzymes. The validity of the approach is demonstrated with reference to existing knowledge of the underlying biochemistry.
23. Audic S, Claverie JM: The significance of digital gene expression
profiles. Genome Res 1997, 7:986-995.
24. Schmitt AO, Specht T, Beckmann G, Dahl E, Pilarsky CP,
Hinzmann B, Rosenthal A: Exhaustive mining of EST libraries for
genes differentially expressed in normal and tumour tissues.
Nucleic Acids Res 1999, 27:4251-4260.
25. Rezvani M, Barrans JD, Dai KS, Liew CC: Apoptosis-related genes
expressed in cardiovascular development and disease: an EST
approach. Cardiovasc Res 2000, 45:621-629.
26. Ladunga I: Large-scale predictions of secretory proteins from
mammalian genomic and EST sequences. Curr Opin Biotechnol
2000, 11:13-18.