Screening methods to detect mRNA targets of editing enzymes
Patent information
Patent number | 5866333 |
Filed | 02/29/1996 |
Date of patent | 02/01/1999 |
Claims
What is claimed is:
1. A method for detecting edited candidate mRNA in a tissue, comprising
a) generating cDNA from RNA in the tissue using one or more mooring primers and reverse transcriptase;
b) amplifying the cDNA of step (a) by polymerase chain reaction using one or more mooring primers and one or more 5' arbitrary primers, and
c) separating the products of step (b) by gel electrophoresis;
wherein the mooring primer comprises a nucleic acid having the sequence 5'-TATANTGATCAXA-3' wherein X is 3 to 5 nucleotides.
2. The method of claim 1, further comprising
(d) sequencing one or more products of step (c).
3. The method of claim 1, wherein the tissue is liver.
4. The method of claim 1, wherein the tissue is kidney.
5. The method of claim 1, wherein the tissue is small intestine.
6. The method of claim 1, wherein the arbitrary primer comprises the sequence 5'-GATCATAGCC-3'.
7. The method of claim 1, wherein one or more products of step (c) are cloned.
8. The method of claim 1, wherein DNA encoding a product of step (c) is isolated from a DNA library.
9. A sequence for use as a primer comprising 5'-TATANTGATCAXA-3' wherein X is 3 to 5 nucleotides.
Description
FIELD OF THE INVENTION
The present invention relates to RNA editing polypeptides and methods for screening mRNA for editing.
BACKGROUND OF THE INVENTION
Apolipoprotein B occurs naturally in two forms, apo-B100 and apo-B48, both encoded by the same gene. Apo-B100, a 550 kD protein, is the major protein responsible for cholesterol transport in the blood and plays a crucial role in cholesterol and lipoprotein metabolism. Apo-B100 is an integral component of very low density lipoproteins (VLDL) and intermediate density lipoproteins (IDL), and is the sole component of low density lipoprotein (LDL). Apo-B48, a 264 kD protein, is synthesized in the intestines of humans and rabbits and in the liver and intestines of rats and mice. In mice and rats, hepatic-derived apo-B48 is a component of VLDL; the total VLDL is a mixture of VLDL containing apo-B100 or apo-B48. Intestinally derived apo-B48 is secreted as an integral component of chylomicrons.
Apo-B48 is produced by a biological process in which the apoB primary transcript is posttranscriptionally modified by a type of RNA processing known as RNA editing. The term RNA editing is used to describe the specific modification of mRNA (or the coding region of pre-mRNA) that alters the genetic information encoded in the transcript.
Apolipoprotein B mRNA editing deaminates a specific cytidine (C6666) to create a uridine. This changes the codon at position 2153 from a genomically encoded CAA (glutamine) to an in-frame stop codon (UAA). Apolipoprotein B mRNA editing occurs in the small intestine of all mammals and in the liver of rats, mice, dogs, and horses.
Hepatic apo-B mRNA editing in the rat and mouse, both of which normally modify approximately 65% of the apoB mRNA, is developmentally and hormonally regulated. Editing activity is regulated by growth hormone, thyroxine, cortisol, fasting, and diet. Apolipoprotein B mRNA editing also demonstrates developmental regulation in the human intestine. Human fetal intestine at 11 weeks of gestation predominantly produces apo-B100, whereas at 16 weeks of gestation, both apo-B100 and apo-B48 are secreted in roughly equal proportions. In the adult intestine, only apo-B48 is secreted.
A specific 11 nucleotide "mooring" sequence in apoB mRNA occurring 5 nucleotides downstream from C6666 is critical for editing in vitro. When the mooring sequence is inserted into another location on apoB or non-apoB cDNA, the resulting chimeric RNA is edited in vitro (Driscoll et al. Mol. Cell. Biol. 13: 7288-7294 (1993); Backus et al. Biochim. Biophys. Acta. 1217: 65-73 (1994); Shah et al. J. Biol. Chem. 266: 16301-16304 (1991); Backus et al. Biochim. Biophys. Acta. 1219: 1-14 (1994)).
Several proteins appear to be necessary for apoB mRNA editing in vitro. One of these proteins has been cloned from a rat intestinal library (Teng et al. Science 260: 1816-1819 (1993)). This 27 kD protein, which deaminates cytidine 6666 in apoB mRNA, has been designated APOBEC-1 (apoB mRNA-editing enzyme catalytic polypeptide #1) (Davidson et al. RNA 1:3 (1995)). The major functional domains of this RNA editing polypeptide are highly conserved in the cloned homologues of rat APOBEC-1 from human (Hadjiagapiou et al. Nucleic Acids Res. 22: 1874-1879 (1994); Lau et al. Proc. Natl. Acad. Sci. U.S.A. 91: 8522-8526 (1994)), rabbit (Yamanaka et al. J. Biol. Chem. 269: 21725-21734 (1994)), and mouse (Nakamuta et al. J. Biol. Chem. 270: 13042-13056 (1995)).
Transgenic mice and rabbits expressing APOBEC-1 have been generated (Yamanaka et al. Proc. Natl. Acad. Sci. U.S.A. 92: 8483-8487 (1995)). The transgenic mice and rabbits had liver dysplasia, and many developed hepatocellular carcinomas.
Comparison of gene expression in different tissues or under different conditions can be performed using the technique of differential mRNA display (Liang and Pardee, Science 257: 967-971 (1992)), also termed differential display reverse transcriptase PCR. In this technique, two or more RNA populations (e.g., RNA preparations from different tissues) are made into cDNA using reverse transcriptase and a set of oligonucleotide primers, one being anchored to the polyadenylate tail of a subset of mRNAs by, for example, the two nucleotide sequence CA, the other being short and arbitrary in sequence so that it anneals at different positions relative to the first primer. The resulting cDNA is amplified by PCR, using a 5' primer of arbitrary base sequence, chosen to anneal at positions randomly distributed in distance from the poly(A) tail. The resulting amplified DNA sequences can be separated by gel electrophoresis. An amplified DNA band can be subcloned into a vector, or can be sequenced, for example, by using extended primers for further amplification (Wang and Feuerstein, Biotechniques 18: 448-452 (1995)) or ligation linked PCR (Reeves et al. Biotechniques 18: 18-20 (1995)).
Recent studies have provided some insight into the genetic basis for obesity. A mouse obesity gene, ob, and its human homologue have been cloned and sequenced (Zhang et al. Nature 372: 425-432 (1994)). Mutation in ob results in profound obesity and type II diabetes as part of a syndrome that resembles morbid obesity in humans (Friedman et al. Genomics 11: 1054-1062 (1991)). Halaas et al. (Science 269: 543-546 (1995)), Pelleymounter et al. (Science 269: 540-543 (1995)), and Campfield et al. (Science 269: 546-549 (1995)) described the weight-reducing effects of the plasma protein OB (leptin) encoded by the obese gene. A transmembrane receptor for OB protein (OB-R) has been cloned (Tartaglia et al. Cell 83: 1263 (1995)). Mice with mutations in the diabetes gene db have a phenotype similar to mice having mutations in ob. Chua et al. (Science 271: 994-996 (1996)) have demonstrated, by genetic mapping and genomic analysis, that mouse db, rat fatty (a homologue of db), and the gene encoding OB-R are the same gene.
SUMMARY OF THE INVENTION
The expression of APOBEC-1 in the liver of transgenic mice can cause liver dysplasia and liver tumors, and the expression of human APOBEC-1 in the intestine, stomach, or brain can cause obesity. Promiscuous editing of mRNAs other than apoB is correlated with these phenotypic changes. The instant invention discloses novel techniques for detecting aberrantly edited mRNAs, and hence, genes responsible for non-wild type phenotypes.
One aspect of the invention is a method for detecting mRNA candidates for editing in a tissue, comprising
a) generating cDNA from RNA in the tissue using one or more mooring primers and reverse transcriptase;
b) amplifying the cDNA of step (a) by polymerase chain reaction using one or more mooring primers and one or more 5' arbitrary primers, and
c) separating the products of step (b) by gel electrophoresis;
wherein the mooring primer comprises a sequence complementary to a sequence having at least 80% sequence homology to a known mooring sequence.
Another aspect of the invention is a method for identifying an obesity gene in an animal, comprising
a) generating an obese transgenic animal expressing an exogenous RNA editing polypeptide;
b) generating cDNA from RNA in tissue from obese and control animals using one or more mooring primers and reverse transcriptase;
c) amplifying the cDNA of step (b) by polymerase chain reaction using one or more mooring primers and one or more 5' arbitrary primers;
d) comparing the products of step (c) to identify at least one RNA species differing in occurrence in the obese and control animals;
wherein the mooring primer comprises a sequence complementary to a sequence having at least 80% sequence homology to a known mooring sequence.
Another aspect of the invention is a method for identifying an oncogene in an animal, comprising
a) generating a transgenic animal expressing an exogenous RNA editing polypeptide;
b) identifying animals with a non-wild type phenotype;
c) generating cDNA from RNA from a tissue of the animals of step b) and control animals, wherein the tissue expresses the RNA editing polypeptide, using one or more mooring primers and reverse transcriptase;
d) amplifying the cDNA of step (c) by polymerase chain reaction using one or more mooring primers and one or more 5' arbitrary primers;
e) comparing the products of step (d) to identify at least one RNA species differing in occurrence in the non-wild type and control animals;
wherein the mooring primer comprises a sequence complementary to a sequence having at least 80% sequence homology to a known mooring sequence.
Another aspect of the invention is a method for screening for aberrantly edited mRNA, comprising:
(a) identifying a DNA sequence in a data bank, wherein the DNA sequence contains a mooring-like sequence;
(b) preparing cDNA from mRNA in a tissue of interest with a non-wild type phenotype and from the tissue of interest with a wild type phenotype;
(c) amplifying the cDNAs from step (b) by polymerase chain reaction using a 5' primer specific for the DNA sequence of step (a) and a 3' mooring sequence primer, wherein the mooring sequence primer comprises a sequence complementary to a sequence having at least 80% sequence homology to a known mooring sequence;
(d) comparing the products of step (c) to identify aberrantly edited mRNA in the tissue having a non-wild type phenotype.
Another aspect of the invention is a sequence for use as a primer comprising in the 5' to 3' direction a sequence complementary to a mooring-like sequence; 3 to 5 degenerate nucleotides; and adenosine.
Another aspect of the invention is a composition comprising a nucleic acid encoding NTA1, an obesity-related protein. Other aspects of the invention include a composition comprising purified NTA1.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram depicting the use of mooring-dependent polymerase chain reaction (PCR) to identify edited mRNA. (SEQ ID NO'S 2-5)
FIG. 2 is an autoradiogram of mooring-dependent PCR amplification products from APOBEC-1 transgenic mouse liver having a higher abundance than in control mice.
FIG. 3 is a diagram depicting editing of multiple cytidines in NTA1 mRNA from APOBEC-1 transgenic mouse liver. Arrows indicate edited cytidines. (SEQ ID NO'S 6, 7)
FIG. 4 is an autoradiogram depicting primer extension analysis of PCR amplified APOBEC-1 transgenic mouse liver. "E" denotes edited mRNA.
FIG. 5 comprises FIGS. 5A-C and is a diagram depicting the nucleotide sequence encoding NTA1 and the predicted amino acid sequence of NTA1. (SEQ ID NO: 8)
DESCRIPTION OF THE PREFERRED EMBODIMENTS
General Methods
In some embodiments of the instant invention methods are provided for the identification of mRNAs that are candidates for RNA editing by virtue of having a mooring or mooring-like sequence, for screening candidate mRNAs for aberrant editing, and for identifying genes encoding such candidate mRNAs.
To identify mRNA candidates for RNA editing, total RNA is first isolated from a cell culture or tissue of interest by a variety of methods well known in the art (see, for example, Sambrook et al. Molecular Cloning. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989)). To selectively enrich the cDNA population for mRNAs that are likely to be subjected to editing, cDNA is generated by using "mooring primers" as primers for reverse transcriptase. These "mooring primers" are sequences complementary to known "mooring sequences" or "mooring-like sequences." As used herein, the term "mooring sequence" is intended to refer to a nucleotide sequence correlated with editing of an mRNA. An example of a mooring sequence is the specific 11 nucleotide "mooring" sequence 5'-UGAUCANUAUA[SEQ ID NO:1]-3' in apo-B mRNA occurring 5 nucleotides downstream from C6666. A "mooring-like" sequence is a sequence having greater than 80% homology with a mooring sequence.
Such mooring primers can be about 10 to about 20 nucleotides in length, more preferably about 13 to about 16 nucleotides in length. One or more mooring primers can be used in the generation of cDNA. Typically, to identify mRNAs that are candidates for editing, sets of mooring primers will be synthesized having desired sequences by techniques well known in the art for oligonucleotide synthesis (See, for example, Gait, ed. Oligonucleotide Synthesis. A Practical Approach. IRL Press, N.Y. (1984)).
In some embodiments of the invention, mRNAs that have been aberrantly edited can be identified by designing the mooring primers to include a sequence complementary to a mooring sequence or mooring-like sequence, followed by 3-5 nucleotides, and an adenosine at the 3' end. Preferably, the 3-5 nucleotides are degenerate. The 3'-terminal adenosine allows more efficient amplification of edited mRNA containing thymidine than the unedited mRNA containing cytidine. In other embodiments of the invention, a cytosine, thymidine, or guanosine residue is preferable as the 3' terminal residue, depending on the change introduced into mRNA by the RNA editing polypeptide. In further embodiments of the invention, the mooring primer has the general formula
5'-TATANTGATCAXA-3'
wherein N is any nucleotide and X is 3 to 5 nucleotides. Preferably, X is degenerate.
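The primer design described above (the complement of the mooring sequence, then 3 to 5 degenerate nucleotides, then a 3' adenosine) can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the disclosed method; the function names `reverse_complement_dna`, `mooring_primer`, and `expand_degenerate` are hypothetical, and N is treated as a fully degenerate position.

```python
from itertools import product

# Complement of each RNA/DNA base, written as DNA; N stays degenerate.
COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mooring_primer(mooring_rna: str, degenerate_len: int = 4, terminal: str = "A") -> str:
    """Build a primer: complement of the mooring sequence + X + 3' terminal base.

    X is written as a run of N (3 to 5 degenerate nucleotides).
    """
    if not 3 <= degenerate_len <= 5:
        raise ValueError("X must be 3 to 5 nucleotides")
    return reverse_complement_dna(mooring_rna) + "N" * degenerate_len + terminal


def expand_degenerate(primer: str):
    """Yield every concrete sequence covered by a primer containing N positions."""
    choices = ["ACGT" if base == "N" else base for base in primer]
    for combo in product(*choices):
        yield "".join(combo)


primer = mooring_primer("UGAUCANUAUA", degenerate_len=3)
print(primer)                                  # primer written 5' to 3'
print(sum(1 for _ in expand_degenerate(primer)))  # size of the degenerate pool
```

With the apo-B mooring sequence as input, the first 11 bases of the result correspond to the TATANTGATCA portion of the general formula; the size of the degenerate pool grows fourfold with each additional N.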
Conditions for cDNA generation from RNA using reverse transcriptase are well known in the art. See, for example, Liang et al. Science 257: 967-971 (1992).
To amplify the cDNA generated with mooring primers, the mooring primers are additionally used as 3' primers in PCR amplification. The 5' primers are typically different sets of arbitrary primers chosen such that annealing positions to cDNA are randomly distributed from the mooring sequence. Typically, such 5' primers are designed to anneal to the cDNA at a position less than 2 to 3 kb from the mooring or mooring-like sequence. Preferably, the position is less than about 500 nucleotides from the mooring or mooring-like sequence to allow better resolution on a sequencing gel. The 5' primer is preferably relatively short, from about 8 nucleotides to 20 nucleotides. Examples of arbitrary sequences include 5'-GATCATAGCC[SEQ ID NO: 2]-3'.
Preferably, the PCR is performed with radiolabeled nucleotides to allow detection by autoradiography. Conditions for PCR are well known in the art. See, for example, Liang et al., supra.
The resulting amplified DNA sequences can be separated by gel electrophoresis by techniques well known in the art (see, for example, Liang et al., supra). An amplified DNA band can be subcloned into a vector, or can be sequenced, for example, by using extended primers for further amplification (Wang and Feuerstein, Biotechniques 18: 448-452 (1995)), or ligation linked PCR (Reeves et al. Biotechniques 18: 18-20 (1995)).
Searches of databases, such as Genbank, can also be used in the identification of candidate mRNAs for editing. As shown in the examples below, several editing candidates were obtained by searching a database for sequences similar to the mooring sequence of ApoB. Once identified, as exemplified below, specific primers can be designed to amplify only the sequence of interest by PCR.
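As a rough illustration of such a search, the sketch below slides an 11-nucleotide window along a sequence and reports windows whose identity to the mooring motif is at least 80% (at most two mismatches outside the N position). It is an assumption-laden stand-in for a real database search (the examples below used the FASTA program); the function names and the toy sequence are hypothetical and not taken from the patent.

```python
# DNA form of the mooring sequence 5'-UGAUCANUAUA-3'; N matches any base.
MOORING = "TGATCANTATA"


def identity(window: str, motif: str) -> float:
    """Fraction of motif positions matched by the window (N always matches)."""
    matches = sum(1 for w, m in zip(window, motif) if m == "N" or w == m)
    return matches / len(motif)


def find_mooring_like(seq: str, motif: str = MOORING, min_identity: float = 0.8):
    """Return (start, window, identity) for every window meeting the threshold."""
    seq = seq.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        score = identity(window, motif)
        if score >= min_identity:
            hits.append((start, window, score))
    return hits


# Toy sequence made up for illustration: a cytidine, a short spacer,
# then a mooring sequence, flanked by filler bases.
TOY = "GGGCAATTTGATCAGTATACCC"

for start, window, score in find_mooring_like(TOY):
    upstream = TOY[max(0, start - 6):start]
    print(start, window, f"{score:.2f}", "upstream:", upstream)
```

Printing the bases just upstream of each hit makes it easy to check by eye whether a cytidine sits at a plausible distance from the motif, which is the second criterion applied to the database hits in the examples.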
In some embodiments of the invention, it is desirable to determine whether an mRNA candidate for RNA editing can be edited by an editing polypeptide such as APOBEC-1. This can be accomplished in vitro, for example, by combining recombinant APOBEC-1, a tissue extract containing other components of the editing enzyme complex, and a synthetic substrate RNA having a desired sequence, and analyzing the RNA for editing as described above. For examples of in vitro editing assays, see Backus et al. Biochim. Biophys. Acta. 1217: 65-73 (1994) and Giannoni et al. J. Biol. Chem. 269: 5932-5936 (1994). In further embodiments of the invention, variants of an RNA of interest, created by substitutions in the DNA template, can be subjected to such an in vitro assay to determine what sequences are involved in the editing process.
In a further embodiment of the invention, a transgenic animal can be provided expressing an exogenous RNA editing polypeptide, such as APOBEC-1, as exemplified below. In such animals, in vivo editing of a candidate mRNA by APOBEC-1 in a tissue of interest can be determined as described above, beginning with preparing RNA from the tissue. In some embodiments of the invention, it can be advantageous to provide a transgenic animal having multiple copies of an RNA editing enzyme. Such multiple copies can lead to higher levels of expression of the editing enzyme, thereby resulting in aberrant editing of a candidate mRNA.
Transgenic animals can be generated by techniques well known in the art. See, for example, U.S. Pat. No. 4,736,866.
In additional embodiments of the invention, the generation of a transgenic animal can result in a non-wild type phenotype. For example, the experimental results below demonstrate the generation of obese APOBEC-1 transgenic animals and APOBEC-1 animals with pathological liver changes. In such instances, it is desirable to identify mRNAs that have been edited in those tissues by the RNA editing polypeptide. Aberrant editing of an mRNA by an RNA editing polypeptide in a cell culture, tissue or animal possessing pathological or non-wild type phenotype provides an opportunity to identify genes involved in the generation of those phenotypes. Without being limited to any one theory, such aberrant editing could result from the presence of an editing polypeptide not normally expressed in the tissue, from an excess of such a polypeptide normally found in the tissue, and so on. The identification of such mRNAs thus provides insight into the role of the genes, including oncogenes and obesity modifying genes, encoding them in growth, differentiation, or lipid accumulation, and into the possible role of an RNA editing enzyme or polypeptide in causing obesity or as an oncogene. The term "oncogene" as used herein is intended to include cancer genes, transforming genes, and genetic loci responsible for neoplastic changes in the host cell. "Obesity gene" as used herein is intended to include obesity modifying genes encoding products acting on adipose tissues or on distant sites controlling lipid metabolism and accumulation.
In such transgenic animals, the use of mooring primers for cDNA generation and PCR, along with arbitrary 5' primers for PCR, allows the identification of candidate mRNA for editing. Preferably, the use of mooring primers having 3 to 5 degenerate nucleotides plus adenosine following the mooring or mooring-like complementary sequence described above allows the identification of mRNA likely to have experienced a C to U change. More preferably, an RNA preparation from the non-wild type tissue is compared with tissue having a wild type phenotype to identify those mRNAs which have been edited aberrantly. Control animals providing such a wild type phenotype can include transgenic animals, transgenic animals generated with innocuous or unrelated transgenes, or non-transgenic animals. Sequencing of the PCR products, either directly or after cloning, provides a basis for isolating the gene of interest from a library. The sequencing information also provides opportunities to search databases for sequences related to the PCR product.
Once such an edited mRNA is identified, and the sequence of the corresponding cDNA obtained, the amino acid sequence of the polypeptide product of that cDNA can be predicted. Such polypeptides can be synthesized, obtained by expression techniques well known in the art, or isolated from a tissue of interest. Such polypeptides can be useful as therapeutic reagents in situations wherein the edited version of the polypeptide results in a pathological state. Therapeutic use of the non-edited version, or a fragment of the polypeptide having biological function of the non-edited version, can restore, ameliorate, or at least partly reduce the severity of the pathological state. Additionally, such polypeptides or the edited version (i.e. the product of the edited mRNA) thereof can be used in the preparation of antibodies for therapeutic use in a pathological condition and as research tools for investigating the role of the protein in such processes as growth and development.
The scope of the invention is intended to include nucleic acid which is homologous to NTA1 nucleic acid, including edited versions, and polypeptides, including the products of edited versions of mRNA, which are homologous to NTA1 polypeptide. Generally, the homology will be about 90% and is more typically about 95%.
Experimental Examples
1. APOBEC-1 Transgenic Mice with Transformed Phenotype
In this study, transgenic rabbits and mice expressing rabbit APOBEC-1 in their livers were generated to determine if the hepatic expression of APOBEC-1 would lower LDL concentrations by reducing apo-B100 synthesis by the liver.
To generate the transgenic mice and rabbits, a full-length rabbit APOBEC-1 cDNA was subcloned into a vector (pLiv11) containing a promoter, an intron, and a hepatic control region, all from the human apo-E gene. Other heterologous cDNAs previously cloned into this vector were expressed robustly and predominantly in the liver (Fan et al. Proc. Natl. Acad. Sci. U.S.A. 91: 8724-8728 (1994)). Specifically, a cDNA fragment of rabbit APOBEC-1 was isolated from the plasmid pREPR (Yamanaka et al. J. Biol. Chem. 269: 21725-21734 (1994)) by digestion with EcoRI and subcloned into pLiv11, which had been linearized with Mun I, to produce the vector pLivREPR. A SalI-SpeI fragment of pLivREPR was gel-purified by Qiaex (Qiagen, Chatsworth, Calif.) and microinjected into fertilized oocytes of strain ICR mice (Charles River Breeding Laboratories) and New Zealand White rabbits. Founder pups were identified by Southern blot analysis of genomic DNA from mouse tails or rabbit ears with a 32P-labeled probe of rabbit APOBEC-1 cDNA. The transgene copy number was estimated by using endogenous rabbit APOBEC-1 as a control. Four independent transgenic mouse lines (I-20, I-22, I-28L, and I-28H) were established from the three mouse founders expressing wild-type rabbit APOBEC-1, and their transgene numbers were estimated to be 7, 17, 3, and 10, respectively.
TABLE 1. Pathology of Transgenic Mouse Livers
Mouse line | Mice studied, no. | Gene copy no. | Age at analysis, days | Liver/body weight ratio, % (n) | Normal-minimal dysplasia, no. of mice | Dysplasia with fatty change, no. of mice | Hepatocellular carcinoma, no. of mice |
Transgenic (wild-type) |
I-20 | 21 | 7 | 76-274 | 12.0 (16) | 0 | 20 | 1 |
I-22 | 5 | 17 | 92-200 | 13.3 (1) | 0 | 3 | 2 |
I-28H | 5 | 10 | 29-121 | 23.2 (3) | 0 | 1 | 4 |
I-28L | 4 | 3 | 66-209 | 21.4 (1) | 1 | 2 | 1 |
Transgenic (mutant) | 8 | | 257-401 | 4.8 (4) | 8 | 0 | 0 |
Control | 20 | | 66-140 | 5.5 (19) | 20 | 0 | 0 |
One transgenic rabbit founder had one copy of the transgene, and the other founder had 17 copies.
In the transgenic animals, the transgene APOBEC-1 mRNA was expressed predominantly in the liver. The apoB mRNAs from the livers of the transgenic mice and rabbits were extensively edited, and the transgenic animals had reduced concentrations of apo-B100 and LDL compared to those of the control animals. Specifically, an average of 92% of apo-B mRNA was edited in the livers of all four transgenic mouse lines (n=26). In the transgenic control mouse line, no increase in the amount of edited apoB mRNA (82%, n=4) was observed over that of the non-transgenic mouse line (83%, n=23). Liver extracts from wild-type APOBEC-1 transgenic mice were 15 fold more active in editing a synthetic apoB RNA substrate in vitro than liver extracts from age- and sex-matched control mice. This finding demonstrated that APOBEC-1 was overexpressed, resulting in excess editing activity in the transgenic mouse livers. As expected, β-migrating lipoproteins (LDL) were not detected in the plasma of the transgenic mice by agarose gel electrophoresis. The apparent lack of apo-B100 in the transgenic mouse lipoprotein was confirmed by SDS/PAGE.
Unexpectedly, all of the transgenic mice and a transgenic rabbit displayed liver dysplasia, and many transgenic mice developed hepatocellular carcinomas. The livers of the transgenic mice were at least twice as large as, and in some cases 10 times larger than, those of their non-transgenic littermates. One liver weighed 18 g, which was 40% of the entire weight of the mouse. Histological examination of the livers revealed that hepatocytes were dysplastic, exhibiting variations in nuclear and cellular morphology concomitant with changes in the cytoplasm-to-nucleus ratio. In all of the mouse lines, large multinodular tumors, having the morphological appearance of hepatocellular carcinomas, were observed. One transgenic mouse had severe hepatic hyperplasia, with nodules of trabecular carcinoma, but only minimal lipid deposits at 24 days.
The transgenic rabbit with 17 copies of the transgene appeared normal at birth but grew more slowly than its nontransgenic littermates. After 8 weeks it became weak and immobile and, therefore, was euthanized. The transgenic rabbit's weight was about 50% of that of its littermate (1160 vs. 2250 g). Necropsy revealed an enlarged liver, weighing about 1.5 times as much as that of its nontransgenic littermate (112 vs. 72 g). The transgenic rabbit's liver had visible scarring, fibrosis, and cytoplasmic lipid droplets. In the transgenic rabbit founder, as in the mice, APOBEC-1 mRNA was expressed mainly in the liver. Primer extension analysis showed that 78% of the apo-B mRNA was edited in the transgenic rabbit liver, whereas no significant editing of apo-B mRNA occurred in the control rabbit liver. Although fasted plasma from normal rabbit contains only apo-B100, the lipoproteins isolated from the high copy number transgene rabbit contained greater than 50% apo-B48.
To determine whether the mechanism for dysplasia in the APOBEC-1 transgenic animals is the aberrant editing of other mRNAs, other hepatic mRNAs with mooring sequence motifs similar to that of apo-B mRNA were examined for this type of editing (i.e., cytidine deamination). GenBank was searched for sequences similar to the mooring sequence using the FASTA program (Pearson et al. Proc. Natl. Acad. Sci. U.S.A. 85: 2444-2448 (1988)). Twenty three rodent sequences were found to have the exact mooring sequence. Among these are the mouse fatty acid synthase (FAS) and mouse P1 protein (P1), both of which contain a cytidine residue 4-6 nucleotides upstream from the mooring sequence. Furthermore, greater than 100 sequences were identified with sequence motifs just 1 nucleotide different from the mooring sequence. These include mouse protein-tyrosine kinase TEC and mouse prostaglandin synthase (TIS10), which contain a cytidine residue 5 or 7 nucleotides upstream from the mooring sequences, respectively.
To determine whether these mRNAs were edited in vivo, RNAs isolated from control and transgenic mouse livers were amplified by reverse transcription-PCR. The PCR products were analyzed for editing by primer-extension analysis. None of the four mRNAs (P1, FAS, TIS10, TEC) were edited in the livers of the control mice. Without being limited to any one theory, this result suggests that the mooring sequence is normally not sufficient to support editing. Even in transgenic mice with high hepatic editing levels, three of the transcripts were not edited. However, mouse protein-tyrosine kinase TEC was edited in transgenic mice, providing a second example of C to U editing of an mRNA. This particular change was a silent codon change, and thus could not be the cause of the observed pathological phenotype. Nevertheless, these results establish that other cytidines in mammalian mRNA can be the targets of APOBEC-1 mediated deamination. Specific C to U changes can result in the formation of new termination codons, new initiation codons, and missense mutations, all of which can have potentially severe biological consequences.
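The possible consequences of a single C to U change can be enumerated directly from the standard genetic code. The sketch below is illustrative only; the helper name `classify_c_to_u` and the category labels are hypothetical, and a new AUG is flagged merely as a potential initiation codon, since whether it is used depends on context that this sketch does not model.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ('*' = stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_c_to_u(codon: str):
    """Return (position, edited_codon, consequence) for each C in the codon."""
    results = []
    for i, base in enumerate(codon):
        if base != "C":
            continue
        edited = codon[:i] + "U" + codon[i + 1:]
        before, after = CODON_TABLE[codon], CODON_TABLE[edited]
        if after == "*":
            kind = "new stop codon"
        elif edited == "AUG":
            kind = "new AUG (potential initiation codon)"
        elif before == after:
            kind = "silent"
        else:
            kind = "missense"
        results.append((i, edited, kind))
    return results


# Codons that a single C-to-U change converts into a stop codon.
stop_gains = sorted(
    codon
    for codon in CODON_TABLE
    for _, _, kind in classify_c_to_u(codon)
    if kind == "new stop codon"
)
print(stop_gains)
print(classify_c_to_u("CAA"))  # the apo-B case: glutamine codon to UAA
```

Only three sense codons can become stop codons this way, which is one reason a cytidine near a mooring-like sequence is not automatically a candidate for a truncating edit.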
Although apoB mRNA editing does not normally modify DNA, it is possible that the overexpressed APOBEC-1 might modify small amounts of DNA. However, C6666 in a 282 bp apoB DNA fragment, 6504-6785, was not edited in vitro. Moreover, PCR and primer extension analysis of genomic apoB DNA from transgenic mouse livers overexpressing APOBEC-1 indicated that the genomic C6666 was not edited in vivo. It is thus likely that the predisposition to hepatocellular carcinoma is not due to DNA modifications.
2. APOBEC-1 Transgenic Mice with Obese Phenotype
A second transgenic animal model was generated by using human genomic DNA containing APOBEC-1. These mice were generated using the APOBEC-1 promoter, rather than that of apo-E, which was used in the construct above. Five transgenic mouse lines were generated using the procedure described above that had from two to seven copies of the human APOBEC-1 transgene. These mice expressed human APOBEC-1 in their intestine, stomach, and brain. The expression of APOBEC-1 in these tissues, with the exception of the brain, mimicked the expression of APOBEC-1 in human tissue.
In four of five lines, 47% of the transgenic mice were obese (at least 10% greater body weight than their nontransgenic littermates). The weights of these mice were from 33% to 54% greater than their nontransgenic littermates. Autopsy examination of the transgenic mice revealed large amounts of abdominal fat compared to the nontransgenic littermate controls. For example, one transgenic mouse contained about 9 grams of abdominal fat compared to about 3 grams in the control animal.
An additional independent control has been generated in the form of an "apo-B48-only" mouse by other investigators at the Gladstone Institutes. Gene targeting in embryonic stem cells has been used to generate mice synthesizing only apo-B48. These non-obese mice are homozygous for the C→T mutation at position 6666 of apo-B cDNA that corresponds exactly to the C→U editing change mediated by APOBEC-1. These mice synthesize only apo-B48 and no apo-B100 and develop normally with livers of normal size, morphology, and function. The histology of the livers of six "apo-B48-only" mice (aged 5-9 months) was examined. Their livers were normal or near-normal and did not have the changes present in the livers of transgenic mice expressing APOBEC-1. The plasma activities of the liver enzyme glutamic-oxaloacetic transaminase (GOT), in international units per liter, were 306 ± 127 (n=4), 91 ± 12 (n=4), 53 ± 6 (n=4), and 98 ± 34 (n=4) for transgenic mice, nontransgenic mice, transgenic mice with the mutant APOBEC-1 construct, and "apo-B48-only" mice, respectively. The glutamic-pyruvic transaminase (GPT) activities were 330 ± 106 (n=4), 26 ± 2 (n=4), 22 ± 2 (n=4), and 29 ± 3 (n=4), respectively.
Thus, the evidence indicated that the tumorigenesis or obesity in the transgenic mice expressing APOBEC-1 was caused by the promiscuous editing of other mRNAs involved in cell functions or, in the case of obesity, of an mRNA involved in the endocrine, metabolic, or behavioral control of energy metabolism.
3. Screening Technique for Aberrantly Edited mRNAs
The instant invention discloses a novel approach to search for these aberrantly edited mRNAs. This approach, "mooring-dependent PCR," is designed to amplify edited mRNAs selectively. The sequence specificity of apo-B mRNA editing appears to be provided by an 11-nucleotide (nt) sequence motif, the mooring sequence, which is located 5 nt downstream from the cytidine to be edited. Primers (mooring primers) were designed consisting of a sequence complementary to the mooring sequence, followed by 3-5 degenerate nucleotides, and an adenosine at the 3' end (FIG. 1). Because of this adenosine, the mooring primers allow more efficient amplification of edited mRNA, which contains uridine (thymidine in the cDNA) at the editing site, than of unedited mRNA, which contains cytidine. The mooring primers were used as primers for reverse transcriptase in the generation of cDNA and additionally as primers in PCR amplification. The 5' primers were different sets of arbitrary primers chosen so that their annealing positions on the cDNA were randomly distributed relative to the mooring sequence.
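The primer design described above can be illustrated with a minimal Python sketch. It assumes the mooring sequence of SEQ ID NO:1 and the general primer form 5'-TATANTGATCAXA-3' recited in claim 1; the function names are hypothetical and serve only to show how the reverse complement of the mooring sequence, the degenerate spacer, and the 3' adenosine fit together.

```python
# Illustrative sketch only; the function names are hypothetical and are not
# part of the disclosed method.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement; U is read as T and N stays N."""
    dna = seq.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def mooring_primer(mooring_rna: str, n_degenerate: int) -> str:
    """Build a primer: complement of the mooring sequence, 3-5 N, then 3' A."""
    if not 3 <= n_degenerate <= 5:
        raise ValueError("the text specifies 3 to 5 degenerate nucleotides")
    return reverse_complement(mooring_rna) + "N" * n_degenerate + "A"


def three_prime_matches(primer: str, template_base: str) -> bool:
    """True if the primer's 3'-terminal base pairs with the template base."""
    return COMPLEMENT[primer[-1]] == template_base.upper().replace("U", "T")


MOORING = "UGAUCANUAUA"  # SEQ ID NO:1
primers = [mooring_primer(MOORING, n) for n in (3, 4, 5)]

for p in primers:
    # The 3' adenosine pairs with an edited U but not with an unedited C.
    print(p, three_prime_matches(p, "U"), three_prime_matches(p, "C"))
```

In this sketch the 3'-terminal adenosine is the only position that distinguishes an edited template from an unedited one, which is the basis for the preferential amplification described above.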
Total RNA from control and transgenic mouse livers was isolated and subjected to mooring-dependent PCR. The PCR was performed in the presence of radioactive deoxynucleotides, and the radioactive amplification products were separated on a DNA-sequencing gel. Shown in FIG. 2 are two examples of an amplification product from the transgenic mouse liver having a higher abundance than that of the control mice. These bands were excised from the gel, and the DNAs were reamplified by mooring-dependent PCR and sequenced.
4. Identification of a Novel Target of APOBEC-1 #1 (NTA1)
Initial studies using mooring-dependent PCR have identified six differentially amplified bands. A search of GenBank revealed that one of these six clones was nearly identical to several human expressed sequence tags (ESTs). ESTs are partial cDNA sequences that have not been characterized. Importantly, the mouse sequence had thymidines in place of several cytidines in the human sequence, suggesting that these cytidines may be edited in the APOBEC-1 transgenic mice. To confirm this editing, primers were designed, based on the human sequence, and used to amplify the sequence from control mouse liver RNA, transgenic mouse liver RNA, and mouse genomic DNA. The cDNA generated from the control mouse liver mRNA and the genomic DNA had the same sequence as that of the human. In contrast, the cDNA generated from the APOBEC-1 transgenic mouse liver mRNA had thymidines in the place of several cytidines, as observed in the cDNA from differential display, demonstrating that these cytidines are edited only in the transgenic mice (FIG. 3). Primer extension analysis has further confirmed that each of these cytidines is approximately 20% edited in the transgenic mice (FIG. 4). This mRNA has the laboratory designation of "NTA1," an acronym for novel target of APOBEC-1 #1. The APOBEC-1 transgenic rabbit founder described above that developed severe hepatic dysplasia was determined to also have edited NTA1.
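The comparison described above, in which cDNA from transgenic liver is checked against the genomic or control sequence for thymidines in place of cytidines, can be sketched as follows. This is a minimal illustration that assumes the two sequences are already aligned and of equal length. The function names, the toy sequences, and the signal values are hypothetical and are not data from the experiments.

```python
# Illustrative sketch only; names and example values are hypothetical.
from __future__ import annotations


def c_to_t_sites(reference: str, cdna: str) -> list[int]:
    """Return 1-based positions where the reference has C and the cDNA has T.

    Both sequences are assumed to be pre-aligned and of equal length.
    """
    if len(reference) != len(cdna):
        raise ValueError("sequences must be aligned to the same length")
    ref = reference.upper().replace("U", "T")
    obs = cdna.upper().replace("U", "T")
    return [i + 1 for i, (r, o) in enumerate(zip(ref, obs)) if r == "C" and o == "T"]


def editing_fraction(edited_signal: float, unedited_signal: float) -> float:
    """Fraction edited at one site, e.g. from primer-extension band signals."""
    total = edited_signal + unedited_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return edited_signal / total


# Toy sequences (not NTA1): the control matches the genomic reference,
# while the transgenic cDNA carries T in place of C at two positions.
genomic = "ACCGTCA"
control_cdna = "ACCGTCA"
transgenic_cdna = "ACTGTTA"

print(c_to_t_sites(genomic, control_cdna))
print(c_to_t_sites(genomic, transgenic_cdna))
print(editing_fraction(20.0, 80.0))
```

Restricting the comparison to C-to-T differences mirrors the direction of the editing change. Other mismatches, which could reflect polymorphism or sequencing error, are deliberately ignored in this sketch.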
Five EST clones of NTA1 were obtained from the Washington University-Merck EST project. Sequencing of the longest clone (2.4 kb) revealed a single open reading frame of 1,626 base pairs and a relatively long 3' untranslated region (UTR) of 758 base pairs. This clone is apparently not full-length: Northern blot analysis indicated that the transcript is about 4 kb, and the open reading frame lacks an ATG codon within a Kozak consensus motif for translation initiation.
5' rapid amplification of cDNA ends (RACE)-PCR was performed to obtain a full-length cDNA clone of NTA1. The nucleotide sequence and predicted amino acid sequence of NTA1 are provided in FIGS. 5A-C.
The results indicated that the human and mouse sequences are highly conserved and have 95% identity throughout the entire sequence, including the 3' UTR. The last 200 base pairs at the 3' end are nearly identical in the human, mouse, rabbit, and rat. This conservation suggests that this region of the 3' UTR may have important functions, such as regulation of the stability and/or translation of the NTA1 mRNA. It is in this 3' end region that APOBEC-1 modifies multiple cytidines in the transgenic mouse liver.
Thus, the experimental examples indicate that it is possible to identify mRNAs edited by APOBEC-1 and to determine which one(s) cause cellular transformation and obesity. The mooring-dependent PCR approach disclosed herein identifies other targets of apo-B mRNA editing. By this strategy, mRNAs aberrantly edited in transgenic organs or other tissues can be detected. This screening technique thus allows both the identification of genes correlated with a particular phenotype and insight into the mechanism of gene function and pathogenesis. Additionally, the APOBEC-1 transgenic mouse provides a novel model for the study of obesity and the identification of genes and gene products related to obesity.
All references cited herein are expressly incorporated by reference in their entirety.
__________________________________________________________________________
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(iii) NUMBER OF SEQUENCES: 12
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 11 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
UGAUCANUAUA 11
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 10 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GATCATAGCC 10
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 16 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CNNNNUGAUCANUAUA 16
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 16 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
ANNNNACTAGTNATAT 16
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 16 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: RNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
UNNNNUGAUCANUAUA 16
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 38 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CCTATTATAGTATTGAAATTAAGTCTACTTAATTTATC 38
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 38 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AAGTCATGTTCATGCCCTGATTTTATATACTTGTATCT 38
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 3751 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS:
single (D) TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: (A) NAME/KEY: CDS (B) LOCATION: 277..2994 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: CGGCGGGTACCAGGTGGCGGCTGCAGCAGCTACTCCTCTGAGCTGAGACTTTCAAGCCGG60 CCGCGTCTCCTTCCTCCCCCTTCCCTCCCCCTTTTTTGTTCTCCGTTCCCCTTCCCCCTC120 CCTTCCCCGTCCCCGACGACCGGATCCAGAGGAGGCAGCTGCGGTGGCAGCTGCTGAGTT180 CTCGGTGAAGGCTTTTCATTTCTACCATCCCCTCCCCTCCCCACCCCATCCATTAATATT240 ATTCTTTTGAAGATTCTTCGTTGTCAAGCCGCCAAAGTCGAGAGTGCGATTGCA294 ValGluSerAlaIleAla 15 GAAGGGGGTGCTTCTCGTTTCAGTGCTTCTTCGGGCGGAGGAGGAAGT342 GluGlyGlyAlaSerArgPheSerAlaSerSerGlyGlyGlyGlySer 101520 AGGGGTGCACCTCAGCACTATCCCAAGACTGCTGGCAACAGCGAGTTC390 ArgGlyAlaProGlnHisTyrProLysThrAlaGlyAsnSerGluPhe 253035 CTGGGGAAAACCCCAGGGCAAAACGCTCAGAAATGGATTCCTGCACGA438 LeuGlyLysThrProGlyGlnAsnAlaGlnLysTrpIleProAlaArg 404550 AGCACTAGACGAGATGACAACTCCGCAGCAAACAACTCCGCAAATGAA486 SerThrArgArgAspAspAsnSerAlaAlaAsnAsnSerAlaAsnGlu 55606570 AAAGAACGACATGATGCAATCTTCAGGAAAGTAAGAGGCATACTAAAT534 LysGluArgHisAspAlaIlePheArgLysValArgGlyIleLeuAsn 758085 AAGCTTACTCCTGAAAAGTTTGACAAGCTATGCCTTCAGCTCCTCAAT582 LysLeuThrProGluLysPheAspLysLeuCysLeuGlnLeuLeuAsn 9095100 GTGGGTGTAGAGTCTAAACTCATCCTTAAAGGGGTCATACTGCTGATT630 ValGlyValGluSerLysLeuIleLeuLysGlyValIleLeuLeuIle 105110115 GTGGACAAAGCCCTAGAGGAGCCAAAGTATAGCTCACTGTATGCTCAG678 ValAspLysAlaLeuGluGluProLysTyrSerSerLeuTyrAlaGln 120125130 CTATGTCTGCGATTGGCAGAAGATGCACCAAACTTTGATGGCCCAGCA726 LeuCysLeuArgLeuAlaGluAspAlaProAsnPheAspGlyProAla 135140145150 GCAGAGGGTCAACCAGGACAGAAGCAAAGCACAACATTCAGACGCCTC774 AlaGluGlyGlnProGlyGlnLysGlnSerThrThrPheArgArgLeu 155160165 TTGATTTCCAAATTGCAAGATGAATTTGAAAACCGAACCAGAAATGTT822 LeuIleSerLysLeuGlnAspGluPheGluAsnArgThrArgAsnVal 170175180 GATGTCTATGATAAGCGTGAAAATCCCCTCCTTCCTGAGCACGAGGAA870 AspValTyrAspLysArgGluAsnProLeuLeuProGluHisGluGlu 185190195 CAGAGAGCCATTGCTAAGATCAAGATGTTGGGGAACATCAAATTCATT918 GlnArgAlaIleAlaLysIleLysMetLeuGlyAsnIleLysPheIle 200205210 GGAGAACTTGGCAAGCTTGATCTTATTCATGAATCTATCCTTCATAAG966 
GlyGluLeuGlyLysLeuAspLeuIleHisGluSerIleLeuHisLys 215220225230 TGCATCAAAACACTTTTGGAAAAGAAGAAGAGAGTCCAACTCCAAGAT1014 CysIleLysThrLeuLeuGluLysLysLysArgValGlnLeuGlnAsp 235240245 ATGGGAGAGGATTTGGAGTGCCTCTGTCAGATAATGAGGACAGTGGGA1062 MetGlyGluAspLeuGluCysLeuCysGlnIleMetArgThrValGly 250255260 CCTCGATTAGACCATGAACGAGCCAAGTCCTTAATGGATCAGTACTTT1110 ProArgLeuAspHisGluArgAlaLysSerLeuMetAspGlnTyrPhe 265270275 GCCAGAATGTGTTCCTTAATGTTAAGTAAGGAATTGCCAGCCAGGATT1158 AlaArgMetCysSerLeuMetLeuSerLysGluLeuProAlaArgIle 280285290 CGTTTCCTACTGCAGGATACTGTAGAGTTGCGAGAGCACCATTGGGTT1206 ArgPheLeuLeuGlnAspThrValGluLeuArgGluHisHisTrpVal 295300305310 CCTCGCAAGGCTTTTCTTGACAATGGACCAAAGACGATCAATCAAATC1254 ProArgLysAlaPheLeuAspAsnGlyProLysThrIleAsnGlnIle 315320325 CGTCAAGATGCAGTAAAAGATCTAGGAGTGTTTATTCCTGCTCCTATG1302 ArgGlnAspAlaValLysAspLeuGlyValPheIleProAlaProMet 330335340 GCTCAAGGGAGAAGTGACTTCTTCCTGGAGGGACCGTTCATGCCGCCA1350 AlaGlnGlyArgSerAspPhePheLeuGluGlyProPheMetProPro 345350355 AGGATGAAAATGGATAGGGACCCACTTGGGGGACTTCGTGATATGTTT1398 ArgMetLysMetAspArgAspProLeuGlyGlyLeuArgAspMetPhe 360365370 GGACAAATGCCAGGTAGTGGAATTGGTACTGGTCCAGGAGTTATCCAG1446 GlyGlnMetProGlySerGlyIleGlyThrGlyProGlyValIleGln 375380385390 GATACATTTTCACCCACAATGGGACGTCATCGTTCAAATCAGCTCTTC1494 AspThrPheSerProThrMetGlyArgHisArgSerAsnGlnLeuPhe 395400405 AATGGCCATGGGGGGCACATCATGCCTCCCACGCAATCGCAGTTTGGA1542 AsnGlyHisGlyGlyHisIleMetProProThrGlnSerGlnPheGly 410415420 GAGATGGGGGGCAAGTTTATGAAAAGCCAGGGGCTAAGCCAGCTCTAC1590 GluMetGlyGlyLysPheMetLysSerGlnGlyLeuSerGlnLeuTyr 425430435 CATAACCAGAGTCAGGGACTCTTATCCCAGCTGCAAGGACAGTCGAAG1638 HisAsnGlnSerGlnGlyLeuLeuSerGlnLeuGlnGlyGlnSerLys 440445450 GATATGCCACCTCGGTTTTCTAAGAAAGGACAGCTTAATGCAGATGAG1686 AspMetProProArgPheSerLysLysGlyGlnLeuAsnAlaAspGlu 455460465470 ATTAGTTTGAGGCCTGCTCAGTCGTTTCTAATGAATAAAAATCAGGTG1734 IleSerLeuArgProAlaGlnSerPheLeuMetAsnLysAsnGlnVal 475480485 CCAAAGCTTCAGCCCCAGATAACTATGATTCCTCCCAGTGCACAGCCA1782 ProLysLeuGlnProGlnIleThrMetIleProProSerAlaGlnPro 490495500 
CCACGCACTCAAACACCGCCTCTGGGACAGACACCTCAACTTGGTCTC1830 ProArgThrGlnThrProProLeuGlyGlnThrProGlnLeuGlyLeu 505510515 AAAACTAATCCACCACTTATCCAGGAAAAGCCTGCCAAGACTAGCAAA1878 LysThrAsnProProLeuIleGlnGluLysProAlaLysThrSerLys 520525530 AAGCAACCACCATCAAAGGAAGAACTACTTAAACTGACCGAAGCCGTT1926 LysGlnProProSerLysGluGluLeuLeuLysLeuThrGluAlaVal 535540545550 GTGACTGACTATCTGAACAGTGGAAATGCCAACGAGGCTGTCAGTGGT1974 ValThrAspTyrLeuAsnSerGlyAsnAlaAsnGluAlaValSerGly 555560565 GTGAGAGAAATGAGAGCTCCAAAACACTTTCTTCCTGAGATGCTAAGC2022 ValArgGluMetArgAlaProLysHisPheLeuProGluMetLeuSer 570575580 AAAGTGATCATCCTGTCACTTGATAGAAGCGATGAAGATAAAGAAAAA2070 LysValIleIleLeuSerLeuAspArgSerAspGluAspLysGluLys 585590595 GCAAGCTCTTTAATCAGTTTACTCAAACAGGAAGCGATAGCCACAAGT2118 AlaSerSerLeuIleSerLeuLeuLysGlnGluAlaIleAlaThrSer 600605610 GACAACTTCATGCAGGCTTTCCTGAATGTATTGGAGCAGTGCCCCAAA2166 AspAsnPheMetGlnAlaPheLeuAsnValLeuGluGlnCysProLys 615620625630 CTGGAGGTTGACATCCCCTTGGTGAAATCTTACTTGGCACAGTTTGCA2214 LeuGluValAspIleProLeuValLysSerTyrLeuAlaGlnPheAla 635640645 GCTCGTGCTATAATTTCAGAGTTGGTGAGCATTTCCGAACTAGCTCAA2262 AlaArgAlaIleIleSerGluLeuValSerIleSerGluLeuAlaGln 650655660 CCACTGGAGAGTGGCACCCACTTCCCTCTCTTCTTACTTTGTCTTCAA2310 ProLeuGluSerGlyThrHisPheProLeuPheLeuLeuCysLeuGln 665670675 CAATTAGCTAAATTGCAAGACCGAGAGTGGTTAACCGAACTTTTTCAA2358 GlnLeuAlaLysLeuGlnAspArgGluTrpLeuThrGluLeuPheGln 680685690 CAAAGCAAGGTCAATATGCAGAAAATGCTGCCAGAAATTGATCAGAAT2406 GlnSerLysValAsnMetGlnLysMetLeuProGluIleAspGlnAsn 695700705710 AAGGATCGAATGTTGGAGATTTTGGAAGGAAAGGGACTGAGTTTCTTA2454 LysAspArgMetLeuGluIleLeuGluGlyLysGlyLeuSerPheLeu 715720725 TTCCCACTCCTTAAATTGGAGAAGGAACTATTGAAGCAAATTAAGCTG2502 PheProLeuLeuLysLeuGluLysGluLeuLeuLysGlnIleLysLeu 730735740 GATCCATCCCCTCAAACTATATATAAATGGATTAAAGATAACATCTCT2550 AspProSerProGlnThrIleTyrLysTrpIleLysAspAsnIleSer 745750755 CCCAAACTTCATGTAGATAAAGGATTCGTGAACATCTTAATGACCAGC2598 ProLysLeuHisValAspLysGlyPheValAsnIleLeuMetThrSer 760765770 TTCTTACAGTACATTTCTAGTGAAGTAAGCCCACCCAGCGATGAAACA2646 
PheLeuGlnTyrIleSerSerGluValSerProProSerAspGluThr 775780785790 GATTCTTCCTCTGCTCCTTCCAAAGAGCAGTTAGAGCAGGAAAAACAG2694 AspSerSerSerAlaProSerLysGluGlnLeuGluGlnGluLysGln 795800805 CTGCTGCTCTCTTTTAAGCCAGTGATGCAGAAATTTCTTCATGATCAT2742 LeuLeuLeuSerPheLysProValMetGlnLysPheLeuHisAspHis 810815820 GTGGATCTACAGGTCAGTGCCCTGTATGCTTTTCAGGTGCACTGTTAC2790 ValAspLeuGlnValSerAlaLeuTyrAlaPheGlnValHisCysTyr 825830835 AACAGCAGCTTCCCAAAAGGCATGTTACTTCGATTTTTTGTTCACTTC2838 AsnSerSerPheProLysGlyMetLeuLeuArgPhePheValHisPhe 840845850 TATGACATGGAAATTATTGAAGAGGAAGCTTTCTTAGCTTGGAAGGAA2886 TyrAspMetGluIleIleGluGluGluAlaPheLeuAlaTrpLysGlu 855860865870 GACATAACTCAAGAGTTTCCAGGAAAAGGCAAGGCTTTGTTCCAGGTG2934 AspIleThrGlnGluPheProGlyLysGlyLysAlaLeuPheGlnVal 875880885 AATCAGTGGCTAACCTGGCTAGAAACTGCTGAAGAAGAAGAATCAGAG2982
AsnGlnTrpLeuThrTrpLeuGluThrAlaGluGluGluGluSerGlu 890895900 GAAGAAGCTGACTAAAGAACCAGCCAAAGCCTTAAATTGTGCAAAACATACT3034 GluGluAlaAsp 905 GTTGCTATGATGTAACTGCATTTGACCTAACCACTGCGAAAATTCATTCCGCTGTAACGT3094 TTTTTCACAATATTTAAAGCAGAAGCACGTCAGTAAGGTTTCCTTCTGCATAAGGTTTTT3154 GTAGTGTGATGTCTTAATCATAGTCTACCATCAAATACTTTAGGAGTATCCTTAATGTTT3214 AGATAGAATATTAGCAGCATGCAATAATTACATCCTAAGTTCTCAAGCAGAAGCAGTCTA3274 TTGCAAGGACCTTCTTTGCTGCCAGTTACCATAGGCTGTTTTAAGTTAGAAAACTGAATA3334 GCAACACTGAATACTGTAGAAATGCACTTTGCTCAGTAATACTTGAGTTGTTGCAATATT3394 TGATTATCCATTTGGTTGTTACAGAAAAATTCTTAACTGTAATTGATGGTTGTTGCCGTA3454 ATAGTATATTGCCTGTATTTCTACCTCTAGTAATGGGCTTTATGTGCTAGATTTTAAAAT3514 CCTTGAGCCTGGGCAAGTGCACAAGTCTTTTTAAAAGAAACATGGTTTACTTGCACCAAA3574 CTGATCAGTTTGAGAGATCATTAATGCCCTTGAAGTGGTTTTTGTGGGTGTGAAACAAAT3634 GGTGAGAATTTGAATTGGTCCCTCTTATTATAGTATTGAAATTAAGTCTACTTAATTTAT3694 CAAGTCATGTTCATGCCCTGATTTTATATACTTGTATCTATCAATAAACATTGTGAA3751 (2) INFORMATION FOR SEQ ID NO:9: (i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 906 amino acids (B) TYPE: amino acid (D) TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: ValGluSerAlaIleAlaGluGlyGlyAlaSerArgPheSerAlaSer 151015 SerGlyGlyGlyGlySerArgGlyAlaProGlnHisTyrProLysThr 202530 AlaGlyAsnSerGluPheLeuGlyLysThrProGlyGlnAsnAlaGln 354045 LysTrpIleProAlaArgSerThrArgArgAspAspAsnSerAlaAla 505560 AsnAsnSerAlaAsnGluLysGluArgHisAspAlaIlePheArgLys 65707580 ValArgGlyIleLeuAsnLysLeuThrProGluLysPheAspLysLeu 859095 CysLeuGlnLeuLeuAsnValGlyValGluSerLysLeuIleLeuLys 100105110 GlyValIleLeuLeuIleValAspLysAlaLeuGluGluProLysTyr 115120125 SerSerLeuTyrAlaGlnLeuCysLeuArgLeuAlaGluAspAlaPro 130135140 AsnPheAspGlyProAlaAlaGluGlyGlnProGlyGlnLysGlnSer 145150155160 ThrThrPheArgArgLeuLeuIleSerLysLeuGlnAspGluPheGlu 165170175 AsnArgThrArgAsnValAspValTyrAspLysArgGluAsnProLeu 180185190 LeuProGluHisGluGluGlnArgAlaIleAlaLysIleLysMetLeu 195200205 GlyAsnIleLysPheIleGlyGluLeuGlyLysLeuAspLeuIleHis 210215220 GluSerIleLeuHisLysCysIleLysThrLeuLeuGluLysLysLys 225230235240 
ArgValGlnLeuGlnAspMetGlyGluAspLeuGluCysLeuCysGln 245250255 IleMetArgThrValGlyProArgLeuAspHisGluArgAlaLysSer 260265270 LeuMetAspGlnTyrPheAlaArgMetCysSerLeuMetLeuSerLys 275280285 GluLeuProAlaArgIleArgPheLeuLeuGlnAspThrValGluLeu 290295300 ArgGluHisHisTrpValProArgLysAlaPheLeuAspAsnGlyPro 305310315320 LysThrIleAsnGlnIleArgGlnAspAlaValLysAspLeuGlyVal 325330335 PheIleProAlaProMetAlaGlnGlyArgSerAspPhePheLeuGlu 340345350 GlyProPheMetProProArgMetLysMetAspArgAspProLeuGly 355360365 GlyLeuArgAspMetPheGlyGlnMetProGlySerGlyIleGlyThr 370375380 GlyProGlyValIleGlnAspThrPheSerProThrMetGlyArgHis 385390395400 ArgSerAsnGlnLeuPheAsnGlyHisGlyGlyHisIleMetProPro 405410415 ThrGlnSerGlnPheGlyGluMetGlyGlyLysPheMetLysSerGln 420425430 GlyLeuSerGlnLeuTyrHisAsnGlnSerGlnGlyLeuLeuSerGln 435440445 LeuGlnGlyGlnSerLysAspMetProProArgPheSerLysLysGly 450455460 GlnLeuAsnAlaAspGluIleSerLeuArgProAlaGlnSerPheLeu 465470475480 MetAsnLysAsnGlnValProLysLeuGlnProGlnIleThrMetIle 485490495 ProProSerAlaGlnProProArgThrGlnThrProProLeuGlyGln 500505510 ThrProGlnLeuGlyLeuLysThrAsnProProLeuIleGlnGluLys 515520525 ProAlaLysThrSerLysLysGlnProProSerLysGluGluLeuLeu 530535540 LysLeuThrGluAlaValValThrAspTyrLeuAsnSerGlyAsnAla 545550555560 AsnGluAlaValSerGlyValArgGluMetArgAlaProLysHisPhe 565570575 LeuProGluMetLeuSerLysValIleIleLeuSerLeuAspArgSer 580585590 AspGluAspLysGluLysAlaSerSerLeuIleSerLeuLeuLysGln 595600605 GluAlaIleAlaThrSerAspAsnPheMetGlnAlaPheLeuAsnVal 610615620 LeuGluGlnCysProLysLeuGluValAspIleProLeuValLysSer 625630635640 TyrLeuAlaGlnPheAlaAlaArgAlaIleIleSerGluLeuValSer 645650655 IleSerGluLeuAlaGlnProLeuGluSerGlyThrHisPheProLeu 660665670 PheLeuLeuCysLeuGlnGlnLeuAlaLysLeuGlnAspArgGluTrp 675680685 LeuThrGluLeuPheGlnGlnSerLysValAsnMetGlnLysMetLeu 690695700 ProGluIleAspGlnAsnLysAspArgMetLeuGluIleLeuGluGly 705710715720 LysGlyLeuSerPheLeuPheProLeuLeuLysLeuGluLysGluLeu 725730735 LeuLysGlnIleLysLeuAspProSerProGlnThrIleTyrLysTrp 740745750 IleLysAspAsnIleSerProLysLeuHisValAspLysGlyPheVal 755760765 
AsnIleLeuMetThrSerPheLeuGlnTyrIleSerSerGluValSer
770 775 780
ProProSerAspGluThrAspSerSerSerAlaProSerLysGluGln
785 790 795 800
LeuGluGlnGluLysGlnLeuLeuLeuSerPheLysProValMetGln
805 810 815
LysPheLeuHisAspHisValAspLeuGlnValSerAlaLeuTyrAla
820 825 830
PheGlnValHisCysTyrAsnSerSerPheProLysGlyMetLeuLeu
835 840 845
ArgPhePheValHisPheTyrAspMetGluIleIleGluGluGluAla
850 855 860
PheLeuAlaTrpLysGluAspIleThrGlnGluPheProGlyLysGly
865 870 875 880
LysAlaLeuPheGlnValAsnGlnTrpLeuThrTrpLeuGluThrAla
885 890 895
GluGluGluGluSerGluGluGluAlaAsp
900 905
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 14 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
TATANTGATCNNNA 14
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 15 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
TATANTGATCNNNNA 15
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS: (A) LENGTH: 16 base pairs (B) TYPE: nucleic acid (C) STRANDEDNESS: single (D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TATANTGATCNNNNNA 16
__________________________________________________________________________