Channel: Latest Articles of Bio-Synthesis Inc.

Homocitrulline



Homocitrulline, also known as L-homocitrulline, N(6)-carbamoyl-L-lysine, N(6)-(amino-carbonyl)-L-lysine, or (2S)-2-amino-6-(carbamoyl-amino) hexanoic acid, has the molecular formula C7H15N3O3 and a molecular weight of 189.2123 daltons. Molecular models for homocitrulline are shown below.

 

Figure 1: Structural models for homocitrulline. A: Chemical structure; B: Stick model, energy minimized; C: Space filling model.
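The molecular weight quoted above can be checked directly from the formula. The following is a minimal sketch, not part of the original article: the function name `molecular_weight` is hypothetical, and the atomic masses are an assumption (older IUPAC average values, chosen because they appear to reproduce the quoted figure). The citrulline formula C6H13N3O3 is standard chemistry added for comparison, since homocitrulline is one methylene group longer than citrulline.

```python
import re

# Average atomic masses in g/mol. These specific values are an assumption:
# older IUPAC averages that appear to match the figure quoted in the article.
ATOMIC_MASS = {"C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994, "S": 32.065}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple formula such as 'C7H15N3O3' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


homocitrulline = molecular_weight("C7H15N3O3")
citrulline = molecular_weight("C6H13N3O3")  # formula not given in the article

print(round(homocitrulline, 4))               # average molecular weight
print(round(homocitrulline - citrulline, 4))  # mass of the extra CH2 unit
```

The parser only handles flat formulas without parentheses or charges, which is sufficient for the small molecules discussed here.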


The amino acid homocitrulline is a metabolite of ornithine in human and other mammalian metabolism. It can be detected in larger amounts in the urine of individuals with urea cycle disorders. At present, it is thought that depletion of the ornithine supply causes carbamyl phosphate to accumulate in the urea cycle, which may be responsible for the enhanced synthesis of homocitrulline and homoarginine. Both amino acids can be detected in urine. Amino acid analysis allows the quantitative analysis of these metabolites in biological fluids such as urine or blood.

Homocitrulline is similar in structure to citrulline but one methylene group longer. The metabolite is generated when a lysine residue reacts with cyanate. Cyanate is present in the human body in equilibrium with urea. Under physiological conditions the urea concentration may be too low to allow extensive carbamylation. However, the conversion of lysine residues in proteins to homocitrulline is known to occur in vivo. In renal failure, the urea concentration increases and detectable carbamylation of many proteins can occur. It is believed that most carbamylation takes place during inflammation, when the enzyme myeloperoxidase is released from neutrophils. This enzyme converts thiocyanate to cyanate, and the increased cyanate levels can then carbamylate lysine residues.

Figure 2: Carbamylation of lysine. Myeloperoxidase released from neutrophils converts thiocyanate to cyanate which carbamylates lysine residues to form homocitrulline. Thiocyanate (SCN) is a naturally occurring pseudohalide found in dietary sources. Myeloperoxidase can use SCN as a cosubstrate together with hydrogen peroxide (H2O2) to form cyanate. In patients with kidney dysfunction urea is elevated. Urea is in equilibrium with cyanate and isocyanate. Carbamylation of nucleophilic amino groups, for example lysine residues, can modify protein structures and ultimately cause metabolic dysfunctions.
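The carbamylation reaction can be checked as a simple element balance: adding isocyanic acid (HNCO, the reactive form of cyanate) to lysine should give the homocitrulline formula. The sketch below is illustrative only. The lysine and isocyanic acid formulas and the monoisotopic masses are standard values that do not appear in the article, and the helper name `monoisotopic_mass` is hypothetical.

```python
from collections import Counter

# Elemental compositions. Lysine and isocyanic acid are standard formulas added
# for illustration; homocitrulline is C7H15N3O3 as stated in the article.
LYSINE = Counter(C=6, H=14, N=2, O=2)
ISOCYANIC_ACID = Counter(C=1, H=1, N=1, O=1)
HOMOCITRULLINE = Counter(C=7, H=15, N=3, O=3)

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def monoisotopic_mass(composition: Counter) -> float:
    """Monoisotopic mass of a composition given as element counts."""
    return sum(MONOISOTOPIC[element] * n for element, n in composition.items())


# Lysine plus one HNCO should equal homocitrulline, element by element.
balanced = (LYSINE + ISOCYANIC_ACID) == HOMOCITRULLINE

# Mass added to a lysine residue by carbamylation.
mass_shift = monoisotopic_mass(HOMOCITRULLINE) - monoisotopic_mass(LYSINE)

print(balanced, round(mass_shift, 4))
```

The resulting mass shift is the increment one would look for on a lysine residue when screening mass spectrometry data for carbamylation, under the assumption that no other modification is present on the same residue.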


Homocitrulline has been identified as an antigen specific to rheumatoid arthritis (RA) and as a target of anti-citrullinated protein/peptide antibodies. More recently, it has been shown that homocitrulline-containing proteins are present in the RA joints of rodents, and possibly also of humans, and that they may affect T-cell triggering and possibly autoantibody formation.

In another metabolic disorder, the hyperornithinemia-hyperammonemia-homocitrullinuria (HHH) syndrome, first described in 1969, ornithine levels may be elevated five to ten times above normal. In this syndrome, levels of alanine, orotic acid, and homocitrulline may be elevated as well. In people with hyperammonemia, orotic acid and homocitrulline appear to be chronically elevated after a high-protein diet but may be normal when fasting.

The metabolic disorder lysinuric protein intolerance is caused by the body's inability to digest and use certain amino acids, the building blocks of proteins: lysine, arginine, and ornithine. These amino acids are found in many protein-rich foods. Because the body cannot effectively break them down in this disorder, people typically experience nausea and vomiting after ingesting protein-rich foods. Associated features may include an enlarged liver and spleen, short stature, muscle weakness, impaired immune function, and progressively brittle bones that are prone to fracture. A lung disorder called pulmonary alveolar proteinosis may also develop. In addition, the accumulation of amino acids in the kidneys can cause end-stage renal disease (ESRD), in which the kidneys are no longer able to filter fluids and waste products from the body effectively.




Amino Acid Analysis of Homocitrulline


Homocitrulline, with the molecular formula C7H15N3O3 and a molecular weight of 189.2123 daltons, can be analyzed using amino acid analysis. Molecular models for homocitrulline are shown below.

 

Figure 1: Structural models for homocitrulline. A: Chemical structure; B: Stick model, energy minimized; C: Space filling model.

The amino acid homocitrulline is a metabolite of ornithine in human and other mammalian metabolism. It can be detected in larger amounts in the urine of individuals with urea cycle disorders, as can the related metabolite homoarginine. Amino acid analysis allows the quantitative analysis of these metabolites in biological fluids such as urine, blood, or plasma, and in proteins.

Figure 2: Chromatogram of a homocitrulline standard using AQC chemistry.


Figure 3: Chromatogram of an amino acid standard including homocitrulline using AQC chemistry.

Figure 4: Chromatogram of an amino acid standard including homocitrulline. An aliquot containing 100 picomoles of each amino acid was injected onto the amino acid analyzer column. The zoomed-in region from 12 to 40 minutes of the chromatogram is shown.
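Quantification against a standard such as the 100 picomole injection described above can be sketched as a single-point external calibration. This is only an illustration under stated assumptions: the article does not describe the calibration procedure actually used, the detector response is assumed to be linear through zero, and the peak areas and the function name `quantify_pmol` are hypothetical.

```python
STANDARD_PMOL = 100.0  # amount of each amino acid injected in the standard run


def quantify_pmol(sample_area: float, standard_area: float,
                  standard_pmol: float = STANDARD_PMOL) -> float:
    """Single-point external calibration: amount is proportional to peak area."""
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return sample_area / standard_area * standard_pmol


# Hypothetical homocitrulline peak areas (arbitrary units), for illustration only.
standard_area = 2500.0
sample_area = 1000.0

print(quantify_pmol(sample_area, standard_area))
```

In practice a multi-point calibration curve and an internal standard would usually give more reliable numbers than a single reference injection.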

Several structures of myelin oligodendrocyte glycoprotein or MOG have been solved



MOG, or myelin oligodendrocyte glycoprotein, is a key central nervous system (CNS)-specific autoantigen for primary demyelination in multiple sclerosis. The gene product is a membrane protein expressed on the oligodendrocyte cell surface and on the outermost surface of myelin sheaths in the brain and spinal cord. Because of this localization, it is a primary target antigen in immune-mediated demyelination. The protein is thought to be involved in the completion and maintenance of the myelin sheath and in cell-cell communication. Alternatively spliced transcript variants encoding different isoforms have been identified. MOG is a transmembrane protein that belongs to the immunoglobulin superfamily. It contains an Ig-like domain and has two potential membrane-spanning regions. Even though the disease-inducing role of MOG has been established, its precise function in the CNS is still unknown.

 

Three possible functions for MOG have been suggested in the past:

 

(1) The protein may function as a cellular adhesive molecule.

(2) The protein may function as a regulator of oligodendrocyte microtubule stability.

(3) The protein may function as a mediator of interactions between myelin and the immune system, as well as the complement cascade.

 

In 1996, Amor et al. screened myelin basic protein (MBP) and synthetic MBP peptides for their ability to induce experimental allergic encephalomyelitis in Biozzi ABH (H-2Ag7) mice. Their data suggest the presence of a core epitope at MBP 21-26 (HARHGF). This peptide motif contains elements similar to those of the previously defined encephalitogenic MOG 1-22 and PLP 56-70 peptides. The authors further investigated the fine specificity of these epitopes using frame-shifted peptides, which indicated cores at MOG 9-15 (GYPIRAL) and PLP 62-68 (NVIHAFQ). Based on these pathogenic peptides, a putative H-2Ag7 binding motif was suggested that contains a series of hydrophobic, basic, small, and large hydrophobic residues within a core of 6 to 7 amino acids. The authors suggest that these findings may be relevant to the design of treatment strategies for experimental autoimmune diseases in animals that express this haplotype.
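One way to make the proposed motif concrete is to encode it as a pattern and search the three core sequences quoted above. The sketch below is an illustration, not the authors' method: the residue classes, the allowance of one or two basic residues, and the helper name `find_core` are all assumptions made here to express "hydrophobic, basic, small, large hydrophobic" as a regular expression.

```python
import re

# Hypothetical residue classes (one-letter codes); this grouping is an assumption.
HYDROPHOBIC = "AVILMFYW"
BASIC = "KRH"
SMALL = "AGS"
LARGE_HYDROPHOBIC = "FLIMWY"

# hydrophobic, one or two basic residues, small, large hydrophobic
MOTIF = re.compile(
    f"[{HYDROPHOBIC}][{BASIC}]{{1,2}}[{SMALL}][{LARGE_HYDROPHOBIC}]"
)

# Core sequences as quoted in the text.
CORES = {"MBP 21-26": "HARHGF", "MOG 9-15": "GYPIRAL", "PLP 62-68": "NVIHAFQ"}


def find_core(sequence: str):
    """Return the first stretch matching the illustrative motif, or None."""
    match = MOTIF.search(sequence)
    return match.group(0) if match else None


for name, sequence in CORES.items():
    print(name, sequence, find_core(sequence))
```

With these assumed classes, each of the three cores contains a matching stretch, which is consistent with the idea of a shared motif but does not by itself demonstrate binding to H-2Ag7.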

Figure 1: Crystal structures of the extracellular domain of MOG (MOGIgd) at 1.45-Å resolution and the complex of MOGIgd with the antigen-binding fragment (Fab) of the MOG-specific demyelinating monoclonal antibody 8-18C5 at 3.0-Å resolution.


The demyelination in multiple sclerosis involves an autoantibody response to myelin oligodendrocyte glycoprotein. In 2003, Breithaupt et al. reported the crystal structures of the extracellular domain of MOG (MOGIgd) at 1.45-Å resolution and of the complex of MOGIgd with the antigen-binding fragment (Fab) of the MOG-specific demyelinating monoclonal antibody 8-18C5 at 3.0-Å resolution. The structures showed that MOGIgd adopts an IgV-like fold, with the A'GFCC'C" sheet harboring a cavity similar to the one used by the costimulatory molecule B7-2 to bind its ligand CTLA4. The antibody 8-18C5 binds to three loops located at the membrane-distal side of MOG. The dominant contribution to these interactions is made by MOG residues 101-108, which contain a strained loop that forms the upper edge of the putative ligand binding site. The sequence R101DHSYQEE108 is unique to MOG. However, large parts of the remaining sequence are conserved in MOG homologues that are expressed outside the immune-privileged environment of the CNS and that may therefore induce immunological tolerance.



Figure 2: Crystal structure of myelin oligodendrocyte glycoprotein, a key autoantigen in multiple sclerosis.


To gain new insights into the physiological and immunopathological role of MOG, Clements et al., also in 2003, determined the 1.8-Å crystal structure of the MOG extracellular domain (MOGED). MOGED adopts a classical Ig variable domain fold and was observed to form an antiparallel head-to-tail dimer. A dimeric form of native MOG was also observed, and MOGED was shown to dimerize in solution. These observations are consistent with the view that MOG acts as a homophilic adhesion receptor. The MOG35-55 peptide, a major encephalitogenic determinant recognized by both T cells and demyelinating autoantibodies, is partly occluded within the dimer interface.

MOG peptides can be synthesized using automated solid phase peptide synthesis (SPPS). Such peptides are useful tools for studying protein-protein or protein-peptide interactions and can help elucidate the role of MOG and similar proteins.
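When a synthetic peptide is ordered or characterized, its expected mass is usually computed from the sequence. The sketch below shows that calculation for the MOG 9-15 core (GYPIRAL) quoted earlier. It assumes an unmodified linear peptide with a free amine and free acid terminus, uses standard average residue masses that are not taken from the article, and the function name `average_peptide_mass` is hypothetical.

```python
# Average residue masses in Da (amino acid minus water); standard reference values.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added back for the free N- and C-termini


def average_peptide_mass(sequence: str) -> float:
    """Average mass of an unmodified linear peptide given in one-letter code."""
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER


mog_core = "GYPIRAL"  # MOG 9-15 core as quoted in the text
print(round(average_peptide_mass(mog_core), 2))
```

Terminal modifications such as N-terminal acetylation or C-terminal amidation, which are common in synthetic peptides, would change the result and are not handled here.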

 

References


Amor S, O'Neill JK, Morris MM, Smith RM, Wraith DC, Groome N, Travers PJ, Baker D: Encephalitogenic epitopes of myelin basic protein, proteolipid protein, myelin oligodendrocyte glycoprotein for experimental allergic encephalomyelitis induction in Biozzi ABH (H-2Ag7) mice share an amino acid motif. J Immunol. 1996 Apr 15;156(8):3000-8.


Breithaupt C, Schubart A, Zander H, Skerra A, Huber R, Linington C, Jacob U: Structural insights into the antigenicity of myelin oligodendrocyte glycoprotein. Proc Natl Acad Sci U S A. 2003;100:9446.


Clements CS, Reid HH, Beddoe T, Tynan FE, Perugini MA, Johns TG, Bernard CC, Rossjohn J: The crystal structure of myelin oligodendrocyte glycoprotein, a key autoantigen in multiple sclerosis. Proc Natl Acad Sci U S A. 2003;100:11059.


http://www.genecards.org/cgi-bin/carddisp.pl?gene=MOG&search=8df341b292f81c6f4b6083f744e16c37


http://www.ncbi.nlm.nih.gov/structure/?term=myelin+oligodendrocyte+glycoprotein


https://www.wikigenes.org/e/gene/e/17441.html


Antibody Drug Conjugates - A new way to treat cancer



By Klaus D. Linse

The approval of rituximab, a chimeric monoclonal antibody (mAb) that recognizes the CD20 protein, by the U.S. Food and Drug Administration in 1997 to treat B-cell non-Hodgkin lymphomas resistant to other chemotherapies ushered in an era in which monoclonal antibodies became important components of therapeutic regimens in oncology. Earlier drugs showed off-target effects, such as the targeting of healthy dividing cells, as well as other severe side effects. To circumvent these obstacles, researchers turned to antibody-drug conjugates (ADCs) to selectively deliver toxic compounds to diseased tissue as "magic bullets". The "magic bullet", a concept first proposed in the early 1900s by Paul Ehrlich, a German physician and scientist, is a drug that selectively targets diseased tissue or cells.

The mAb rituximab is used to treat cancers of the white blood cells such as leukemias and lymphomas. Monoclonal antibodies have now become a successful class of therapeutic molecules. This success is the result of great progress in molecular biology and biotechnology, which has contributed to our understanding of the inner workings of the human cell as well as of the key players in the immune system. According to The Antibody Society, 35 monoclonal antibodies (mAbs) had been approved as therapeutic agents and about 7 were pending approval as of July 2014.

Therapeutic mAbs have had a substantial effect on medical care for a wide range of diseases in the past two decades. This is because mAbs bind target antigens with high specificity, marking them for removal by mechanisms such as complement-dependent cytotoxicity or antibody-dependent cell-mediated cytotoxicity. In addition, mAbs can have therapeutic benefits by binding target antigens and inhibiting their function. Unfortunately, as medical scientists have found, antibodies against tumor-specific antigens often lack therapeutic activity.

In recent years it has been found that conjugation to cytotoxic drugs or radionuclides can expand the utility of mAbs. Antibody-drug conjugates (ADCs) with improved potency and effectiveness are now used to target and deliver a toxic payload to selected diseased tissue, and this approach has become a major focus of therapeutic research. In the past decade antibodies have been conjugated to a number of cytotoxic drugs using various linker chemistries. Two such drugs are already marketed in the United States: ado-trastuzumab emtansine (Kadcyla®) and brentuximab vedotin (Adcetris®). Over 30 ADCs are currently undergoing clinical studies, which makes it likely that more conjugates will be approved in the near future.

Figure 1: A model of an ADC where a mAb is conjugated to DM1 illustrating the conjugation approach used for Ado-Trastuzumab Emtansine. Ado-trastuzumab emtansine, formerly called Trastuzumab-DM1 (T-DM1), is a HER2 antibody drug conjugate (ADC). In this ADC the trastuzumab antibody is linked to the cell-killing agent, DM1. T-DM1 combines two strategies— anti-HER2 activity and targeted intracellular delivery of the potent anti-microtubule agent, DM1 (a maytansine derivative)—to produce cell cycle arrest and apoptosis. Ado-trastuzumab emtansine is marketed under the brand name Kadcyla and is indicated for use in HER2-positive, metastatic breast cancer patients who have already used taxane and/or trastuzumab for metastatic disease or had their cancer recur within 6 months of adjuvant treatment.


The ADC brentuximab vedotin, or Adcetris®, combines an anti-CD30 antibody and the drug monomethyl auristatin E (MMAE). It is an anti-neoplastic agent used in the treatment of Hodgkin lymphoma and systemic anaplastic large cell lymphoma and was approved in 2011. In January 2012, the drug label was revised to include a boxed warning of progressive multifocal leukoencephalopathy and death following JC virus infection.
The structure of the cAC10-vcMMAE system used for the production of brentuximab vedotin is illustrated in Figure 2.


Figure 2: Structure of the cAC10-vcMMAE system used for the production of brentuximab vedotin. Francisco et al. (2003) report how this ADC was prepared. A controlled partial reduction of internal cAC10 disulfides with DTT was followed by addition of the maleimide-vc-linker-MMAE. Stable thioether-linked ADCs were formed by the addition of free sulfhydryl groups on the mAbs to the maleimides present on the drugs. The cAC10-vcMMAE and cIgG-vcMMAE used in the report contained approximately 8 drugs/mAb.
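The "drugs/mAb" figure quoted above is an average over a mixture of differently loaded antibody species. The sketch below shows how such an average drug-to-antibody ratio could be computed from a species distribution. It is an illustration only: the distribution is hypothetical, the function name `average_dar` is invented here, and the upper limit of 8 assumes an IgG1-type antibody whose four interchain disulfides are all reduced to give eight cysteine thiols.

```python
# Assumption: IgG1-type antibody, 4 interchain disulfides, 8 thiols when reduced.
MAX_INTERCHAIN_THIOLS = 8


def average_dar(distribution: dict[int, float]) -> float:
    """Weighted mean drug load.

    Keys are the number of drugs per antibody, values are relative abundances
    (they do not need to sum to 1).
    """
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("distribution must have positive total abundance")
    for load in distribution:
        if not 0 <= load <= MAX_INTERCHAIN_THIOLS:
            raise ValueError(f"drug load {load} outside 0..{MAX_INTERCHAIN_THIOLS}")
    return sum(load * abundance for load, abundance in distribution.items()) / total


# Hypothetical mixture: mostly fully loaded antibody with some 6-drug species.
example = {6: 0.1, 8: 0.9}
print(round(average_dar(example), 2))
```

Relative abundances of this kind are typically estimated from chromatographic or mass spectrometric peak areas; how they were obtained in the cited report is not described in this article.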


The development of ADCs in the past decade has evolved from murine antibodies conjugated to standard chemotherapeutic drugs to fully human antibodies conjugated to highly potent cytotoxic drugs. As scientists have learned over the past decade, critical factors for the successful development of ADCs include the selection of the target antigen, the best antibody, the correct linker, and the payload. Typically, ADCs are complex biomolecules composed of an antibody or antibody fragment linked to a biologically active cytotoxic or anti-cancer payload through stable chemical linkers with labile bonds. Other terms used for this type of molecule are "bioconjugate" and "immunoconjugate". The rationale for combining the unique targeting abilities of antibodies with cancer-killing drugs is the potential of the resulting ADCs to treat cancers that are hard to treat with standard chemotherapies.

One area of research important for the development and design of ADCs is that of conjugation chemistry. One recent improvement is the implementation of site-specific conjugation methods. Here, the conjugation only occurs at engineered cysteine residues or unnatural amino acids. The result is a homogeneous ADC production and improved ADC pharmacokinetic (PK) properties. The following figure illustrates critical factors that influence the performance of ADC therapeutics.


Figure 3: Critical factors that influence ADC therapeutics. ADCs are designed and produced by conjugating a cytotoxic drug to a monoclonal antibody via a selected linker. All components of an ADC affect the performance of the molecule. The optimization of all molecular parts is essential for the development of successful ADCs.

In the early years of ADC development many obstacles had to be overcome. Early ADCs showed little or no therapeutic effect. The primary reason may have been poor target selection. In addition, the use of chimeric or murine (mouse) antibodies, which can elicit an immunogenic response, and the use of lower-potency drugs may also have limited potency. Ultimately, researchers learned from their failures and incorporated what they had learned into the next generations of ADCs. Knowledge gained from the development of ADCs has led to a better understanding of the ways in which ADCs function and of their clinical performance.
  

Design of ADCs


Many papers reporting the use of conjugates have been published since the 1970s. A PubMed search using the terms “Antibody and Drug and Conjugates” retrieved a list of over 2,300 papers covering this subject indicating that this has now become an active research field. Antibody-based therapeutics against cancer are now highly successful in clinical trials and are currently recognized for their potential.

Hundreds of mAbs, including bispecific mAbs and multispecific fusion proteins, mAbs conjugated with small-molecule drugs, and mAbs with optimized pharmacokinetics, have already been produced and are in clinical trials. However, many challenges remain, and a deeper understanding of the mechanisms by which ADCs work is needed to overcome problems including resistance to therapy and access to molecular targets, as well as to understand the complexity of biological systems, individual variations, and how these drug conjugates interact with them.

Unlike conventional chemotherapeutic drugs which do not selectively localize to tumors, antibody-drug conjugates can bind specifically to cells that express the targeted antigen. Therefore these molecules are considered to be ideal vehicles for applications that require delivery of drugs. With the help of various linker strategies antibodies can be conjugated to a variety of drugs or “payloads”. 

Ongoing research is investigating new strategies for conjugating proteins such as IgGs to cytotoxic molecules without altering the natural functions of the proteins.

Antibody or protein vehicles


Although whole IgGs have already been shown to have many benefits, for certain applications smaller antibody fragments, such as monomeric single-chain variable fragments (scFv) or dimeric fragments (sometimes called diabodies), or other small proteins may be advantageous. This may allow the delivery time, as indicated by the half-life of the ADC, to be tailored more precisely. For example, when cytokines are delivered to the extracellular space of a tumor, a longer half-life may cause unwanted inflammation.

Linkers

A variety of linkers have now been investigated to connect an antibody, or the protein of choice, to the drug of choice. The most commonly used linkers are based on amide bonds (also called peptide linkers), disulfide bonds, or hydrazones, as illustrated in the next figure. The selection of the linker determines what happens to the ADC once it is inside a cell. Most linkers are selected so that they can be cleaved by specific mechanisms. Depending on their sequence, peptide linkers can be cleaved site-specifically by proteases that recognize the selected peptide sequence motif, or by statistical proteolysis. Disulfide bonds can be broken by the reducing environment of the cytosol, and hydrazones can be cleaved by acid-mediated hydrolysis. In addition, ADCs containing proteins such as cytokines may be produced as fusion proteins.

Payloads

Very potent cytotoxic agents can be used to ensure that tumor killing is achieved with acceptably low doses of the ADC. ADCs containing monomethyl auristatin E (MMAE) or the maytansinoid DM1 kill tumor cells by inhibiting microtubule polymerization. However, alternative payloads may be used as well. Examples are DNA-damaging drugs or cytokines.

The next figure illustrates some building blocks that are used for the design and production of ADCs.


Figure 4: Structures of some building blocks and linkers used in ADC therapeutics. Linkers with an amide bond, a disulfide bond, and a hydrazone bond are illustrated in the upper panel. Structures of drug-SPDP-mAb, drug-SPP-mAb, drug-SSNPP-mAb, and drug-SMCC-mAb are listed in order of stability, from least to greatest. The drug-SMCC-mAb conjugate contains a nonreducible linker.


To summarize, ADCs are sophisticated delivery systems for antitumor cytotoxic drugs. Monoclonal antibodies are used to guide the toxin precursor to the target cancer cell. When the target is reached, the prodrug can be converted chemically or enzymatically to the parent drug and exert its activity to kill the cancer cell. On its journey from the blood vessels to the molecular target in the tumor tissue, the ADC is exposed to different conditions.

  • During circulation: The ADC must behave like a naked antibody when circulating in the plasma. The linkers used must be stable in the bloodstream. The goal is to limit damage to healthy tissue, since decomposition or decay would release the cytotoxin before it is delivered to the target.

  • Antigen binding: The conjugated mAb needs to retain high immunoaffinity, so the attachment of the cytotoxic drug must not disturb the binding specificity.

  • Internalization: For the ADC to work well, a sufficient intracellular concentration of the drug must be achieved. This may be one of the biggest challenges, since antigen targets on cell surfaces are often present in limited numbers. In addition, the internalization process for antigen-antibody complexes is often inefficient.

  • Drug Release: Once inside the cell, the ADC needs to efficiently release the selected cytotoxic drug in its active form inside the tumor cell.

  • Drug Action: The potency of the selected drug must be sufficient to kill the tumor cells, often even at low concentration. To achieve this the use of very potent drugs is necessary. Compounds that are too toxic when tested as a stand-alone chemotherapy appear to be quite suitable candidates to be used as ADC payloads.  


The insights gained during the development of ADCs in the past decade have resulted in a number of new strategies and drugs. This is indicated by the flood of publications in this research field.

 

Selected References


ADC poster:
http://www.nature.com/nbt/extra/adc/index.html

Franco Dosio, Paola Brusa and Luigi Cattel; Immunotoxins and Anticancer Drug Conjugate Assemblies: The Role of the Linkage between Components. Toxins 2011, 3, 848-883; doi:10.3390/toxins3070848

Ducry L, Stump B.; Antibody-drug conjugates: linking cytotoxic payloads to monoclonal antibodies. Bioconjug Chem. 2010 Jan;21(1):5-13. doi: 10.1021/bc9002019.

Joseph A. Francisco, Charles G. Cerveny, Damon L. Meyer, Bruce J. Mixan, Kerry Klussman, Dana F. Chace, Starr X. Rejniak, Kristine A. Gordon, Ron DeBlanc, Brian E. Toki, Che-Leung Law, Svetlana O. Doronina, Clay B. Siegall, Peter D. Senter, and Alan F. Wahl; cAC10-vcMMAE, an anti-CD30–monomethyl auristatin E conjugate with potent and selective antitumor activity. August 15, 2003; Blood: 102 (4).

Sean L. Kitson, Derek J. Quinn, Thomas S. Moody, David Speed, William Watters, David Rozzell; Antibody-Drug Conjugates (ADCs) – Biotherapeutic bullets. Chemistry Today - vol. 31(4) July/August 2013.

Nilvebrant J, Åstrand M, Georgieva-Kotseva M, Björnmalm M, Löfblom J, Hober, S.; (2014) Engineering of Bispecific Affinity Proteins with High Affinity for ERBB2 and Adaptable Binding to Albumin. PLoS ONE 9(8): e103094. doi:10.1371/journal.pone.0103094.

Lewis Phillips GD, Li G, Dugger DL, Crocker LM, Parsons KL, Mai E, Blättler WA, Lambert JM, Chari RV, Lutz RJ, Wong WL, Jacobson FS, Koeppen H, Schwall RH, Kenkare-Mitra SR, Spencer SD, Sliwkowski MX; Targeting HER2-positive breast cancer with trastuzumab-DM1, an antibody-cytotoxic drug conjugate. Cancer Res. 2008 Nov 15;68(22):9280-90. doi: 10.1158/0008-5472.CAN-08-1776.

Panowski S, Bhakta S, Raab H, Polakis P, Junutula JR.; Site-specific antibody drug conjugates for cancer therapy. MAbs. 2014 Jan-Feb;6(1):34-45. doi: 10.4161/mabs.27022.

Sassoon I, Blanc V.; Antibody-drug conjugate (ADC) clinical pipeline: a review. Methods Mol Biol. 2013;1045:1-27. doi: 10.1007/978-1-62703-541-5_1.

Feng Tian, Yingchun Lu, Anthony Manibusan, Aaron Sellers, Hon Tran, Ying Sun, Trung Phuong, Richard Barnett, Brad Hehli, Frank Song, Michael J. DeGuzman, Semsi Ensari, Jason K. Pinkstaff, Lorraine M. Sullivan, Sandra L. Biroc, Ho Cho, Peter G. Schultz, John DiJoseph, Maureen Dougher, Dangshe Ma, Russell Dushin, Mauricio Leal, Lioudmila Tchistiakova, Eric Feyfant, Hans-Peter Gerber, and Puja Sapra; A general approach to site-specific antibody drug conjugates. (2014) PNAS 111, 5, 1766-1771.

Biotin - a water-soluble Vitamin



Biotin is a water-soluble vitamin that functions as a coenzyme for five mammalian carboxylases: pyruvate carboxylase (EC 6.4.1.1), propionyl-CoA carboxylase (EC 6.4.1.3), methylcrotonyl-CoA carboxylase (EC 6.4.1.4), and both isoforms of acetyl-CoA carboxylase (EC 6.4.1.2). Biotin, cis-hexahydro-2-oxo-1H-thieno[3,4-d]-imidazoline-4-valeric acid, is a small molecule with a molecular weight of 244.3 daltons. Synonyms for biotin are D-biotin, Bios II, coenzyme R, vitamin B7, and vitamin H.

Biotin is a colorless crystalline powder with the chemical formula C10H16N2O3S. It is part of the vitamin B2 complex and is essential for all mammals, including humans. Vitamin B2 was found to be a complex of several chemically unrelated heat-stable factors, including niacin, biotin, and pantothenic acid. Biotin is a cofactor for many enzymes in the body and is found in large quantities in liver, egg yolk, milk, and yeast.

 

Figure 1:  Structural models for biotin.

(A) Chemical structure;
(B) Stick model of the energy minimized structure;
(C) Structure of biotin as observed in a crystal of a deglycosylated avidin in  complex with biotin published 1993 by Livnah et al.  [2AVI];
(D) Space filling model of biotin.

 

Biotin was discovered during nutritional experiments that revealed it as a factor in many foods. It was found to cure the scaly dermatitis, hair loss, and neurologic signs induced in rats fed dried egg whites. Deficiency in biotin causes dermatitis and loss of hair. Biotin is a water-soluble coenzyme present in small amounts in every living being, where it occurs mainly bound to proteins or polypeptides. It is abundant in liver, kidney, pancreas, yeast, and milk. In humans biotin is a coenzyme for five carboxylases: propionyl-CoA carboxylase, methylcrotonyl-CoA carboxylase, pyruvate carboxylase, and two forms of acetyl-CoA carboxylase. This makes biotin essential for amino acid catabolism, gluconeogenesis, and fatty acid metabolism.

 

The enzyme biotin holocarboxylase synthetase (EC 6.3.4.10) catalyzes the attachment of biotin to the apocarboxylases through an amide bond to a specific lysine residue, producing holocarboxylases. Biotinylated proteins are normally turned over by proteolytic degradation to biocytin (biotinyl-lysine) and biotinylated oligopeptides, which are subsequently cleaved by biotinidase (EC 3.5.1.12). These reactions recycle biotin. In addition, biotin is found covalently attached to histones and appears to be necessary for gene stability and for the repression of transposable elements and some genes.

 

Normal dietary intake of foods usually supplies enough biotin to prevent a biotin deficiency. However, deficiency can be caused by eating too many raw egg whites over a prolonged period. Egg whites contain large amounts of the protein avidin, which binds biotin strongly and prevents it from being absorbed. Biotin in the body is regulated by dietary intake, the biotin transporters monocarboxylate transporter 1 and sodium-dependent multivitamin transporter, the peptidyl hydrolase biotinidase (BTD), and the protein ligase holocarboxylase synthetase. Inhibiting any of these proteins can cause a biotin deficiency.

Biotin Deficiencies


Nutritional biotin deficiency and inherited enzymatic deficiencies of the biotin-dependent carboxylases cause abnormally increased urinary excretion of characteristic organic acids. 

 

Biotin Deficiency | Observed Metabolites in Urine
Methylcrotonyl-CoA carboxylase deficiency | 3-Methylcrotonylglycine and 3-hydroxyisovalerate (3HIA)
Propionyl-CoA carboxylase deficiency | 3-Hydroxypropionate (3HPA), propionylglycine, and methylcitrate
Pyruvate carboxylase deficiency | Lactate (its observation most likely indicates this deficiency)


Structures of observed metabolites

 
 

 

Figure 2: Structures of metabolites observed in urine for biotin deficiencies. 


Biotin as a coenzyme is a catalyst for carboxylation reactions. One example is the reaction catalyzed by propionyl-coenzyme A (CoA) carboxylase illustrated in figure 3. 

Figure 3: Carboxylation reaction catalyzed by propionyl-CoA carboxylase.


In the reaction catalyzed by propionyl-CoA carboxylase, an acidic hydrogen on carbon is removed as a proton and replaced by the electrophilic acyl carbon of bicarbonate in an acyl exchange reaction. Here, water is the formal leaving group. The energy-yielding reaction for biotin-catalyzed carboxylations is usually the hydrolysis of MgATP to MgADP, which is coupled to the carboxylation reaction. Biotin is covalently attached to the active site of an enzyme by an amide bond between the carboxylate of the coenzyme and a lysine from the protein.

 

Figure 4: Biotin covalently attached to the active site of an enzyme. The biotin moiety is attached by an amide between the carboxylate of the coenzyme and a lysine from the protein.


Figure 5: Biotin-Avidin Complex. The crystal structures of a deglycosylated form of the egg-white glycoprotein avidin and of the avidin-biotin complex were determined to 2.6 and 3.0 Å by Livnah et al. in 1993. The structures revealed the amino acid residues critical for the stabilization of the tetrameric protein assembly as well as for the very tight binding of biotin. Each avidin monomer folds into an eight-stranded antiparallel beta-barrel structure. This structure is quite similar to that of the genetically distinct bacterial analog streptavidin. As in streptavidin, binding of biotin involves a highly stabilized network of polar and hydrophobic interactions. Different views of the tetramer and monomer are illustrated. The red arrow points to the biotin binding pocket.

The valeric acid side chain of biotin contains a carboxylic acid group that can be readily conjugated to various reactive groups. This allows the attachment of biotin to molecules such as proteins, peptides, oligonucleotides and others. Any molecule attached to biotin can be captured for detection, immobilization or affinity purification using conjugates or supports conjugated to avidin or streptavidin proteins. These proteins bind strongly and specifically to the biotin group.

A wide variety of native and recombinant derivatives of avidin and streptavidin proteins are now readily commercially available in modified, labeled and immobilized forms. This "avidin-biotin system" has now been used in many research applications, for example, the detection or purification of target molecules. However, since the avidin-biotin affinity interaction is very strong, it is usually impractical to elute biotinylated targets that have been captured to immobilized avidin or streptavidin. For this reason, modified biotin labeling reagents have been developed. For example, cleavable biotin, iminobiotin and desthiobiotin provide reversible interactions with streptavidin and have now become useful tools for soft-release applications.

To conclude, optimized modern conjugation reactions allow for the design and synthesis of many different versions of biotinylated reagents or conjugates. 


Reference

http://lpi.oregonstate.edu/infocenter/vitamins/biotin/

Lanska DJ; The discovery of niacin, biotin, and pantothenic acid. Ann Nutr Metab. 2012;61(3):246-53. doi: 10.1159/000343115. Epub 2012 Nov 26.


Livnah O, Bayer EA, Wilchek M, Sussman JL; Three-dimensional structures of avidin and the avidin-biotin complex. Proc. Natl. Acad. Sci. U.S.A. (1993) 90 p.5076-5080.


Rebecca Mardach, Janos Zempleni, Barry Wolf, Martin J. Cannon, Michael L. Jennings, Sally Cress, Jane Boylan, Susan Roth, Stephen Cederbaum, Donald M. Mock;
Biotin dependency due to a defect in biotin transport. J Clin Invest. 2002 June 15; 109(12): 1617–1623. doi: 10.1172/JCI13138


Hamid M Said;
Biotin: the forgotten vitamin. Am J Clin Nutr 2002; 75:179–80.


Wolf, B. 2001. Disorders of biotin metabolism. In The Metabolic and Molecular Basis of Inherited Disease. C.R. Scriver, A.L. Beaudet, W.S. Sly, and D. Valle, editors. McGraw-Hill Inc. New York, New York, USA. 3151–3177.

Growth Hormone



Growth hormone, or somatotrophin, is a naturally occurring peptide hormone: a small protein consisting of a single polypeptide chain of 191 amino acids. The three-dimensional structure is stabilized by two disulfide bridges and has four helical structures. The position of the helices and the overall three-dimensional structure of this hormone are important for receptor binding. The hormone shares structural homologies with prolactin and human chorionic somatomammotropin (hCS). hCS is a growth hormone variant synthesized exclusively in the placenta. The hormone circulating in the body is rather heterogeneous.

There is a cluster of five genes from which these polypeptide hormones may be synthesized, but normally only one gene is expressed tissue-specifically. Binding of the tissue-specific transcription factor Pit-1 to the promoter region of the growth hormone gene results in only one form of growth hormone being secreted by the anterior pituitary gland.

Human Growth Hormone (HGH)


Human growth hormone or hGH 


Human growth hormone (hGH) is a naturally occurring peptide or protein hormone secreted by the pituitary gland. Growth hormone (GH or HGH) is also known as somatotropin or somatropin. It is a peptide or protein hormone that stimulates growth, cell reproduction and regeneration in humans. In the body the hormone is rather heterogeneous. The major isoform of the human growth hormone is a protein of 191 amino acids and a molecular weight of 22,124 daltons.




3D models of HGH  {Source PDB: 1HGU}

The three-dimensional structure is stabilized by two disulfide bridges and has four helical structures. The position of the helices and the overall three-dimensional structure of this hormone are important for receptor binding. The hormone shares structural homologies with prolactin and human chorionic somatomammotropin (hCS). hCS is a growth hormone variant synthesized exclusively in the placenta. There is a cluster of five genes from which these polypeptide hormones may be synthesized, but normally only one gene is expressed tissue-specifically. Binding of the tissue-specific transcription factor Pit-1 to the promoter region of the growth hormone gene results in only one form of growth hormone being secreted by the anterior pituitary gland.


Before recombinant technology was available, the only source of hGH was human cadavers, but the contamination that led to Creutzfeldt–Jakob disease made this form of treatment obsolete. During the late 1980s, recombinant hGH (rhGH) was developed through genetic engineering. Recombinant hGH has been used with good results in the treatment of patients with hGH deficiency allowing bone growth and impacting on the patient’s final stature. This form of hGH has a sequence identical to the naturally occurring 22 kDa hormone.


Some athletes and bodybuilders appear to have used rhGH and claim that it increases lean body mass and decreases fat mass. Besides its anabolic properties, hGH also affects carbohydrate and fat metabolism. During sport doping investigations, rhGH has been found in swimmers and also in players taking part in other major sports events. International federations and the International Olympic Committee have had hGH on the list of forbidden compounds since 1989.


Human growth hormone is secreted from somatotropic cells in the anterior pituitary gland in a pulsating fashion. Two hypothalamic peptides, growth hormone releasing hormone, which stimulates hGH secretion, and somatostatin, which inhibits hGH secretion by back regulation, regulate its secretion.


hGH binds to specific receptors present throughout the whole body and exerts its biological effects on target cells. The secretion of hGH is slightly higher in women than in men. The highest levels are observed at puberty. Secretion decreases with age by approximately 14% per decade. In addition, secretion of the hormone varies with normal physiological and pathological conditions: hGH levels are higher during slow wave sleep and are increased by exercise, stress, fever, fasting and increased levels of some amino acids (leucine and arginine). Drugs such as clonidine, L-dopa and γ-hydroxybutyrate increase its secretion, as do androgens and estrogens. hGH binds to specific membrane receptors found in abundance throughout the body. It has both direct and indirect effects on the tissues. Indirect effects are mediated by IGF-1, generated in the liver in response to GH.

References


Chantalat L, Jones ND, Korber F, Navaza J, Pavlovsky AG; The crystal-structure of wild-type growth-hormone at 2.5 angstrom resolution. Protein Pept. Lett. (1995) 2 p.333.


M Saugy, N Robinson, C Saudan, N Baume, L Avois, P Mangin; Human growth hormone doping in sport. Br J Sports Med 2006;40(Suppl I):i35–i39. doi: 10.1136/bjsm.2006.027573.

Amino acids are stereoisomers, possess handedness and are chiral molecules



Molecules that contain the same atoms and functional groups, connected by the same sequence of bonds, but that differ in their three-dimensional (3D) structures are called stereoisomers. The different 3D structures are called configurations and are not interchangeable. Two stereoisomers cannot be superimposed. Thus, even when two molecules contain the same functional groups, organisms can usually distinguish between them if they are stereoisomers.

Figure 1 illustrates the concepts of chirality, enantiomers, handedness, isomers and stereoisomers. The amino acid alanine is used here as an example.     


Figure 1: Enantiomers are illustrated on the left side of the figure. The right side illustrates handedness, which is the tendency to use one hand rather than the other, as well as the property of each of the two hands of not being identical with its mirror image. The property of nonsuperimposability of an object on its mirror image, called chirality, is illustrated here for a pair of hands and for the enantiomers of the amino acid alanine.


Any material that has the ability to rotate the plane of polarized light is said to be optically active. If a pure compound or molecule is optically active, the structure of the molecule is nonsuperimposable on its mirror image. On the other hand, if a molecule is superimposable on its mirror image, the compound does not rotate the plane of polarized light and is optically inactive. The property of nonsuperimposability of an object on its mirror image is called chirality. If a molecule is superimposable on its mirror image, it is optically inactive and called achiral. Apparently the relationship between optical activity and chirality is absolute, and no exceptions are known.

The term chirality as used in chemistry, biochemistry and biology describes the property of asymmetry that molecules can possess. The word chirality is derived from the Greek word for “hand”, χειρ (kheir). Human hands are an example of a chiral object. An object or a system is chiral if it is not identical to its mirror image, that is, it cannot be superposed onto it.

If a molecule is nonsuperimposable on its mirror image, the mirror image must be a different molecule. Superimposability is the same as identity; thus, when the two are superimposable, the image and the mirror image correspond to the same molecule.

Pure compounds that are optically active have two and only two isomers. These are called enantiomers or sometimes enantiomorphs. The two enantiomers differ in structure only in the left and right handedness of their orientation.


Enantiomers have identical physical and chemical properties except in two important properties:

  • They rotate the plane of polarized light in opposite directions but by equal amounts. The enantiomer that rotates the plane counterclockwise, or to the left, is called the levo isomer and is designated (-). The other isomer rotates the plane clockwise, or to the right, and is therefore called the dextro isomer and is designated (+). They are also called optical antipodes.

  • They react at different rates with other chiral compounds. However, the reaction rates may be so close together that a distinction is not always possible, or they may be so far apart that one isomer reacts much faster than the other or not at all. This explains why many compounds are biologically active while their enantiomers are not. However, enantiomers react at the same rate with achiral compounds.

In addition, the amount of rotation α is not constant for a given enantiomer. The amount of rotation depends on the length of the sample vessel, the temperature, the solvent and concentration (in the case of solutions), the pressure (in the case of gases), and the wavelength of light.

The specific rotation [α] is defined by the formula [α] = α / (l · c) for solutions and [α] = α / (l · d) for pure compounds, where α is the observed rotation, l is the length of the cell in decimeters, c is the concentration in grams per milliliter, and d is the density in the same units. The specific rotation is usually reported along with the temperature and wavelength, for example as [α]^25_546 (measured at 25 °C and 546 nm). [α]_D indicates that the rotation was measured with sodium D light at λ = 589 nm. The molecular rotation [M]^t_λ is the specific rotation times the molecular weight divided by 100.
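The definitions of specific and molecular rotation can be written as a short calculation. This is a minimal sketch: the function names are hypothetical, and the measured values (observed rotation, cell length, concentration) are invented purely for illustration and are not data for any real sample.

```python
def specific_rotation(alpha_deg: float, length_dm: float, conc_or_density: float) -> float:
    """[alpha] = alpha / (l * c) for solutions, or alpha / (l * d) for pure compounds.

    length_dm is the cell length in decimeters; conc_or_density is in g/mL.
    """
    return alpha_deg / (length_dm * conc_or_density)


def molecular_rotation(specific_rot: float, molecular_weight: float) -> float:
    """[M] = [alpha] * molecular weight / 100."""
    return specific_rot * molecular_weight / 100.0


# Hypothetical measurement: 0.50 degrees observed in a 1 dm cell at 0.10 g/mL.
alpha_specific = specific_rotation(alpha_deg=0.50, length_dm=1.0, conc_or_density=0.10)

# Molecular weight of alanine (C3H7NO2), about 89.09 g/mol, used as an example.
alpha_molecular = molecular_rotation(alpha_specific, molecular_weight=89.09)

print(f"[alpha] = {alpha_specific:.2f}, [M] = {alpha_molecular:.2f}")
```

Because the result depends on temperature, solvent and wavelength, those conditions would have to be recorded alongside any real value.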

The reporting structure is important since changes in conditions can change not only the amount of rotation but sometimes also the direction of rotation. For example, one of the enantiomers of aspartic acid, when dissolved in water, has [α]D equal to  +4.36° at 20 °C and -1.86° at 90 °C, although the molecular structure is unchanged.

In 1891, Emil Fischer invented the Fischer projection, a method for representing tetrahedral carbons on paper. By this convention, the model is held so that the two bonds in front of the paper are horizontal and those behind the paper are vertical. The usefulness of these projections is limited, but they allow a quick test of whether the molecules in question are chiral. In any case, 3D models are much better for determining the nature of enantiomers or stereoisomers.

The DL system has been widely used in the past but it is not without faults. Therefore it is only used nowadays for certain groups of molecules, such as carbohydrates and amino acids. The DL system has been replaced by the Cahn-Ingold-Prelog system in which the four groups on an asymmetric carbon are ranked according to a set of sequence rules.

Reference

Any school or college book covering biochemistry and molecular biology including handbooks for amino acids and organic molecule may be reviewed.

http://goldbook.iupac.org/C00772.html


IUPAC. Compendium of Chemical Terminology, 2nd ed. (the "Gold Book"). Compiled by A. D. McNaught and A. Wilkinson. Blackwell Scientific Publications, Oxford (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M. Nic, J. Jirat, B. Kosata; updates compiled by A. Jenkins. ISBN 0-9678550-9-8.

March’s Advanced Organic Chemistry; Reactions, Mechanisms, and Structure. 6th editions. M.B. Smith and J. March. 2007. Wiley & Sons. Hoboken, NJ.


Different Attachment Linkers for Oligonucleotide Conjugation

Traditionally, amino and sulfhydryl groups attached to oligonucleotides have been the most common functionalities employed in bio-conjugation [1,2]. These attachment chemistries were adopted from peptide chemistry, which was a great complement to the fast-growing area of specialty oligonucleotide synthesis.

The complexity of modern synthetic biology dictates attachment and labeling methods that do not interfere with parallel biological processes. Conventional methods are very limited by their chemistry, so it is important to have an arsenal of completely orthogonal and chemo-selective methods. That is why dozens of new attachment and labeling methods have been developed during the last decade, increasing the availability and versatility of conjugation tools.

Carboxyl modified oligonucleotides

Oligonucleotide conjugation is predominantly carried out by using a nucleophilic group on an oligonucleotide to react with an electrophilic group on a reporter molecule or a solid support. This is the predominant approach because common oligonucleotide deprotection is performed by base treatment, e.g. with ammonia, primary alkylamines or their combinations, which are inherently nucleophilic. However, there are many situations in which researchers need to introduce an electrophilic group into oligonucleotides and use it for attachment to a nucleophilic moiety.

When oligonucleotides have been used in this type of bio-conjugation, the activated carboxylate has been generated post-synthetically [2]. Free carboxyl-modified oligonucleotides can be activated in situ by EDC under organic or aqueous conditions and subsequently conjugated to an aminated counterpart (Fig. 1). However, generating an oligonucleotide with a free carboxylate from commercially available building blocks requires cleavage of the ester bond with alkaline base [4,5] before base deprotection of the oligonucleotide in concentrated ammonia; otherwise the ester will be converted into the corresponding amide.



Figure 1. Bio-conjugation between carboxyl modified oligonucleotide and alkylamino moiety via EDC reagent.

Using that strategy carboxyl modified oligonucleotides can be easily immobilized on solid support such as micro-array slides and various types of aminoalkylated beads.


Huisgen’s 1,3-Dipolar cycloaddition

The copper (I) catalyzed Huisgen 1,3-dipolar cycloaddition between alkynes and azides, discovered by the Sharpless group in 2002 (“click chemistry”) [9], is a novel and very potent method for incorporating molecules of interest (reporter molecules, lipophilic ligands, etc.) into oligonucleotides. The methods have been limited to the post-synthetic attachment of labels, and the proposed methods have not been commercially viable alternatives to standard synthesis approaches [10-12].

Recently, Prof. Brown’s group discovered that the neutral heteroaromatic “click” backbone, when introduced in place of the natural phosphodiester bond, is accepted by Taq polymerase and can be used for most polymerase-dependent processes [13].

The cytotoxicity of copper often limits the use of copper-dependent “click chemistry” as an attachment chemistry. The recently developed azadibenzocyclooctyne does not require any catalyst and is highly reactive towards aliphatic and aromatic azides [14].


Figure 3. Copper-free 1,3-dipolar cycloaddition.

Diels-Alder attachment method

Another important catalyst-free, chemo-selective attachment method is the Diels-Alder reaction, which has been successfully employed in bio-conjugation [15]. To make this process highly efficient at ambient temperature, the alkyldienyl group should be activated with an electron-donating group (EDG) and the dienophile should have an adjacent electron-withdrawing group (EWG).



Figure 4. Diels-Alder reaction used in bio-conjugation.


Bio-Synthesis offers not only a wide variety of modified oligonucleotides but also their conjugates with peptides, proteins and antibodies [1].


References:

  1. http://www.biosyn.com/Bioconjugation.aspx
  2. E. Jablonski, E. W.Moomaw, R. H.Tullis and J. L.Ruth Nucleic Acid Res., 1986, 14, 6115-6128.
  3. J. D. Kahl and M. M. Greenberg J. Org. Chem., 1999, 64 (2), 507–510
  4. A.V. Kachalova, T. S. Zatsepin, E. A. Romanova, D. A. Strelenko, M. J. Gait, T. S. Oretskaya Nucleosides, Nucleotides Nucleic Acids. 2000, 19, 1693-1707
  5. T. P. Prakash, A. M. Kawasaki, E. A. Lesnik, S. R. Owens, M. Manoharan Org. Lett. 2003, 5, 403-406.
  6. M. A. Podyminogin, E. A. Lukhtanov and M. W. Reed Nucleic Acid Res., 2001, 29, 5090-5098.
  7. S. Raddatz, J. Mueller-Ibeler, J. Kluge, L. Wäß, G. Burdinski, J. R. Havens, T. J. Onofrey, D. Wang, and M Schweitzer Nucleic Acid Res., 2002, 30, 4793-4802.
  8. E. N. Timofeev, A. D. Mirzabekov, S. V. Kochetkova and V. L. Florentiev Nucleic Acid Res., 1996, 24, 3142-3148.
  9. V. V. Rostovtsev, L.G. Green, V. V. Fokin, K.B. Sharpless, Angew. Chem. Int. Ed., 2002, 41, 2596-2599.
  10. A.V. Ustinov, et al, Tetrahedron, 2007, 64, 1467-1473.
  11. Agnew, B. et al., US Patent application 20080050731/A1.
  12. X. Ming, P. Leonard, D. Heindle and F. Seela, Nucleic Acid Symposium Series No. 52, 471-472, 2008.
  13. A. H. El-Sagheer and T. Brown, Accounts of Chemical Research, 2012, 45 (8), 1258-67.
  14. M. F. Debets, S. S. van Berkel, S. Schoffelen, F. P. J. T. Rutjes, J. C. M. van Hest and F. L. van Delft, Chem. Commun., 2010,46, 97-99.
  15. V. Marchán, S. Ortega, D. Pulido, E. Pedroso and A. Grandas, Nucleic Acid Res., 2006, 34, e24

A Collection of Protein Precipitation Methods for Proteomics

A Collection of Precipitation Methods for Proteomics


By Klaus D. Linse


Protein precipitation methods are used to concentrate dilute proteins in solution. The goal is to purify and concentrate contaminated proteins or proteins dissolved in various matrices, buffers or detergents, or obtained from natural sources such as blood, urine or other biofluids. Precipitation works by altering the solvation potential of the solvent. The solubility of the solute is lowered by adding a specific reagent that manipulates the repulsive electrostatic forces between proteins.

The following is a list of precipitation methods collected over several years.

Trichloroacetic Acid (TCA) Precipitation

 
  • Add 1/2 volume of 50% (w/v) TCA (0 °C) to the aqueous protein solution.
      Optional: 0.1% β-mercaptoethanol or 20 mM dithiothreitol.
  • Equilibrate (0 °C) for 10-20 min.
  • Centrifuge (~5000-13,000 x g, cold) to pellet protein.
  • Resuspend in appropriate solution (may require neutralization).
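The volume ratios in these protocols fix the final precipitant concentration, which is easy to check with simple dilution arithmetic. The following is a minimal sketch; the function name and the 200 µl sample volume are hypothetical, and the calculation assumes the volumes are additive.

```python
def final_concentration(stock_pct: float, stock_volumes: float, sample_volumes: float = 1.0) -> float:
    """Final concentration (%) after mixing stock into sample, assuming additive volumes."""
    return stock_pct * stock_volumes / (stock_volumes + sample_volumes)


# 1/2 volume of 50% (w/v) TCA added to 1 volume of protein solution.
tca_final_pct = final_concentration(stock_pct=50.0, stock_volumes=0.5)

# Hypothetical 200 µl sample: volume of 50% TCA stock to add.
sample_ul = 200.0
tca_stock_ul = 0.5 * sample_ul

print(f"Final TCA: {tca_final_pct:.1f}% (w/v); add {tca_stock_ul:.0f} µl of stock")
```

Under these assumptions the mixture ends up at roughly 17% (w/v) TCA.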


Acetone Precipitation

 

  • Add acetone (-20 °C) at a 4:1 ratio to the aqueous protein solution.
      Optional: 0.1% β-mercaptoethanol or 20 mM dithiothreitol.
  • Equilibrate (-20 °C) for 60 min.
  • Centrifuge (~5000-13,000 x g, cold) to pellet protein.
  • Resuspend in appropriate solution.


The precipitate is often difficult to resuspend in aqueous solution. Sonication, detergents, acetonitrile, methanol, salts, or acids (TFA, acetic or formic acid) are often used to assist in the process. Consider what effects these additives have on downstream processes.


TCA-Acetone Precipitation

  • Add 15% TCA in acetone (-20 °C) at a 4:1 ratio to the aqueous protein solution.
      Optional: 0.1% β-mercaptoethanol or 20 mM dithiothreitol.
  • Equilibrate (-20 °C) for 20-60 min.
  • Centrifuge (~5000-13,000 x g, cold) to pellet protein.
  • Resuspend in appropriate solution.


This technique is somewhat superior in overall recovery to TCA or acetone used separately. The precipitate is often difficult to resuspend in aqueous solution. Sonication, detergents, acetonitrile, methanol, salts, or acids (TFA, acetic or formic acid) are often used to assist in the process. Consider what effects these additives have on downstream processes.

Phenol, Ammonium Acetate/Methanol Precipitation

 “It’s a little tricky but I like this the best” 


            Hurkman and Tanaka, 1986, Plant Physiology 81:802-806.

  • Add 1 volume of buffered (pH ~8.5-9.0) phenol (e.g. 0.1 M Tris-HCl pH 8.8, 10 mM EDTA (~3 basic), 0.2% 2-mercaptoethanol/20 mM dithiothreitol, 900 mM sucrose).

  • Mix for 30 min and centrifuge (~5000-13,000 x g, cold) to separate the phases.

  • Re-extract the aqueous phase (top) with 1 volume each of co-equilibrated phenol and aqueous extraction buffers.

  • Mix for 30 min, and then centrifuge (~5000-13,000 x g, cold) to separate the phases.

  • Combine the phenol phases.

  • Precipitate proteins by adding 5 volumes of 0.1 M ammonium acetate in methanol (-20 °C, 1 hr to overnight).

  • Pellet protein by centrifugation (20-30 min, 14,000-20,000 x g, 4 °C).

  • Wash pellet (2x) with 0.1 M ammonium acetate in methanol (resuspend 15 min, -20 °C; 10 min >14,000 x g, 4 °C).

  • Wash pellet (2x) with acetone (resuspend 15 min, -20 °C; 10 min >14,000 x g, 4 °C).

  • Wash pellet with 70% ethanol (resuspend 15 min, -20 °C; 10 min >14,000 x g, 4 °C).

  • Dry pellet under vacuum and store under Ar/N2 (-20 °C).


This technique is superior to the other methods, particularly for plant tissues; however, it is very time consuming. The precipitate is often difficult to resuspend in aqueous solution. Sonication, detergents, acetonitrile, methanol, salts, or acids (TFA, acetic or formic acid) are often used to assist in the process. Consider what effects these additives have on downstream processes.

Methanol Chloroform   (A)

        Wessel and Fluegge, 1984;   Anal. Biochem. 138,141-143.

·        Add 4 volumes methanol.

·        Add 3 volumes chloroform.

·        Add 3-4 volumes H2O

·                optional: (0.1% b-mercaptoethanol or 20 mM dithiothreitol)

·        Vortex.

·        Centrifuge (~9,000 x g 10-60 min) to pellet protein between phases.

·        Discard upper phase (aqueous).

·        Add 3 volumes (original volumes) methanol

·        Centrifuge (~9,000 x g 10-20 min) to pellet protein.

·        Discard supernatant

·        Dry sample (N2/Ar or using vacuum).

·        Resuspend in appropriate solution.

 

In my hands, this technique is superior to the other methods. Minimal amounts of protein (1 µg or less) are distributed over a rather large area and many proteins are, subsequently, insoluble in buffers lacking SDS. Only small quantities of chemicals or solutions are needed. Use the best grade solvents (highest purity available) for this method. 

For amino acid analysis and protein sequencing: The dried pellet can be dissolved either using pure TFA, TFA/water, heptafluoracetone (HPFA)/TFA or heptafluorpropanol (HPFA)/TFA for direct spotting on to the sample support used in the protein sequencer.

 

Methanol Chloroform (B): Modified for Speed

  • In a 1.5 ml micro centrifuge tube add to 100 µl of your sample solution

    400 µl Methanol,
    300 µl Chloroform and
    300 to 400 µl H2O  

  • Vortex the solution intensively and centrifuge for 10 to 20 minutes (sometimes even 30 to 60 minutes) at about 9000 rpm (table centrifuge). Separation into two phases should be visible. The protein/peptide is found between the two phases. Discard the upper phase carefully (or keep it for further investigations or distribution studies).
  • Add another 300 µl Methanol, vortex and centrifuge for 10 - 20 minutes.
  • Discard the liquid and dry the precipitated pellet (with a stream of air or, better, N2 or argon, or by evaporating the remaining liquid by applying a vacuum in a speed-vac centrifuge or equivalent).
  • Dissolve the sample in a buffer suitable for your next separation method and proceed.
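The protocols in this collection quote centrifugation both as relative centrifugal force (x g) and as rotor speed (rpm). The two are related through the rotor radius by the standard approximation RCF = 1.118 × 10^-5 × r (cm) × rpm². The sketch below is illustrative only: the function names are hypothetical and the 8 cm rotor radius is an assumed value, so the actual radius of the rotor in use should be substituted.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard conversion factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ROTOR_RADIUS_CM = 8.0  # assumed radius of a small table centrifuge rotor

rcf_at_9000_rpm = rcf_from_rpm(9000, ROTOR_RADIUS_CM)
rpm_for_9000_g = rpm_from_rcf(9000, ROTOR_RADIUS_CM)

print(f"9000 rpm is about {rcf_at_9000_rpm:.0f} x g")
print(f"9000 x g needs about {rpm_for_9000_g:.0f} rpm")
```

With the assumed radius, 9000 rpm falls somewhat short of 9,000 x g, which is one reason to check the rotor specification before comparing the two variants of the method.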
 
Note 1: This method can also be used for larger sample volumes. To do so, increase the sample volume and the solvent volumes by the same factor so that the volume ratio stays the same.
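The scaling described in Note 1 can be sketched as a small helper that keeps the 100 : 400 : 300 : 300-400 µl ratio of sample, methanol, chloroform and water from the protocol above, plus the 300 µl methanol wash. The function name and the 250 µl example sample are hypothetical.

```python
def methanol_chloroform_volumes(sample_ul: float) -> dict:
    """Scale the solvent volumes of the 100 µl protocol to another sample volume."""
    factor = sample_ul / 100.0  # the protocol is written for a 100 µl sample
    return {
        "methanol_ul": 400.0 * factor,
        "chloroform_ul": 300.0 * factor,
        "water_ul_min": 300.0 * factor,
        "water_ul_max": 400.0 * factor,
        "methanol_wash_ul": 300.0 * factor,
    }


scaled_volumes = methanol_chloroform_volumes(250.0)
for name, volume in scaled_volumes.items():
    print(f"{name}: {volume:.0f}")
```

The total volume after the first three additions is 11 to 12 times the sample volume, so the tube size has to be chosen accordingly.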

Note 2: (This applies to protein sequencing.) The dried pellet can be dissolved either using pure TFA, TFA/water, heptafluoracetone (HPFA)/TFA or heptafluorpropanol (HPFA)/TFA for direct spotting on to the sample support used in the protein sequencer.

Note 3: Minimal amounts of protein (1 µg or less) are distributed over a rather large area and many proteins are, subsequently, insoluble in buffers lacking SDS. Only small quantities of chemicals or solutions are needed. Use the best grade solvents (highest purity available) for this method.

This method was modified after the method reported by:


Wessel and Fluegge, 1984; A method for the quantitative recovery of protein in dilute solution in the presence of detergents and lipids. Anal. Biochem. 138, 141-143.



Note 4: If possible, use only high quality chemicals.

 


Desalting/Lyophilization

  • Desalt sample by dialysis, ultrafiltration or a desalting column.
  • Freeze sample with liquid N2, dry ice-methanol or in a freezer, and lyophilize sample.

This technique is just a dehydration step which does not remove contaminants.  Solvation is easier than with other methods, however, many fine precipitates often remain.  The de-salting steps are often far less effective than initially expected.


Chemicals needed              

 
I prefer to use the following brands based on our protein sequencer and mass spectrometer performance.

Acetonitrile:     EMAX0142-1, Omnisolve, EMD; HPLC, spectrophotometry, GC, Gradient Analysis, (CAS 75-05-8) EMD

Chloroform:     SIGMA (FLUKA) 25668; BioChemika Ultra, for MB, 49:1 >99.5%, 100 ml


Isopropanol:    EMPX1838-1, Omnisolve, EMD; HPLC, spectrophotometry, GC, Gradient Analysis, (CAS 67-63-0) EMD

Methanol:        EMX0475-1 HPLC grade (CAS 67-56-1) EMD

Water:             Ultrapure water. Biosynthesis Inc. 

Click Chemistry - A Review


Click chemistry refers to a modular chemical approach that utilizes the copper (I) catalyzed formation of 1,2,3-triazoles from azides and terminal acetylenes as a powerful linking reaction to produce unique, useful and versatile new biological compounds. This copper (I) catalyzed coupling of azides to terminal acetylenes is the premier reaction in click chemistry and creates 1,4-disubstituted 1,2,3-triazole linkages. These linkages share useful topological and electronic features with the ubiquitous amide connectors but, unlike amides, are not susceptible to cleavage.


Figure 1: The copper catalyzed 1,3-dipolar cycloaddition of an azide to an alkyne to create 1,2,3-triazoles is a Huisgen [3 + 2] cycloaddition reaction.

 

The beauty of click chemistry is that it employs reactions that are high yielding, wide in scope, stereospecific and simple to perform, that create only byproducts removable without chromatography, and that can be conducted in easily removable solvents. Since its introduction by K. B. Sharpless in 2001, click chemistry has enabled modular approaches for the generation of novel pharmacophores via a collection of reliable chemical reactions. Click reactions give stereoselective products in high yields, generate only inoffensive byproducts, are insensitive to oxygen and water, utilize readily available starting materials, and have a thermodynamic driving force of at least 20 kcal mol-1. According to Rostovtsev et al. (2002), by simply stirring in water, organic azides and terminal alkynes are readily and cleanly converted into 1,4-disubstituted 1,2,3-triazoles through a highly efficient and regioselective copper(I)-catalyzed process.
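To give a feeling for what a driving force of 20 kcal/mol means, it can be converted into an equilibrium constant with K = exp(-ΔG/RT). The sketch below rests on an assumption that is not stated in the text, namely that the driving force is read as a standard free energy change of -20 kcal/mol at 25 °C; the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """K = exp(-dG / (R * T)) for a free energy change given in kcal/mol."""
    return math.exp(-delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# Assumed interpretation: driving force of 20 kcal/mol means dG = -20 kcal/mol.
k_click = equilibrium_constant(-20.0)
print(f"K is about {k_click:.1e}")
```

Under this reading the equilibrium lies so far on the product side that the reaction is effectively irreversible, which fits the description of click reactions as running to completion.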

The dipolar structure of azides was first recognized by Linus Pauling, who published a paper in 1933 describing the investigation of the structures of methyl azide, CH3-N3, and carbon suboxide, C3O2, by electron diffraction. The structure deduced for methyl azide is illustrated in figure 2.


Figure 2: The structure for methyl azide as reported by Pauling in 1933 and the structure of the dipolar nature for the resonance hybrids for azides are illustrated.

The copper(I)-catalyzed coupling of azides to terminal acetylenes is a Huisgen 1,3-dipolar cycloaddition reaction. Reactions in which azides add to double bonds to give triazolines belong to a large group of [3 + 2] cycloaddition reactions in which five-membered heterocyclic compounds are prepared by the addition of 1,3-dipolar compounds to double bonds. These compounds, with the sequence a-b-c, contain a sextet of electrons in the outer shell, usually located at a, and an octet with at least one unshared electron pair, located on c. This is a reaction of a 4πe- zwitterionic system with a 2πe- neutral system to form a 5-membered ring. During the reaction the number of σ bonds increases at the expense of the number of π bonds. However, since compounds with six electrons in the outer shell of an atom are usually not stable, the a-b-c system is actually a resonance hybrid, as illustrated in figure 2 for the structure of methyl azide.



Carbon-carbon triple bonds can also undergo 1,3-dipolar additions. The Huisgen [3 + 2] cycloaddition between a terminal alkyne and an azide generates substituted 1,2,3-triazoles. This is the premier reaction utilized in click reactions when copper (I) is used for the catalysis of the reaction. Click chemistry is now used for a variety of applications in various facets of drug discovery. Schemes of chemical 1,3-dipolar addition reactions are illustrated in figure 3.



Figure 3: Reaction schemes of 1,3-dipolar additions. A. The addition of molecules with the sequence a-b-c to double bonds, which yields triazolines when the 1,3-dipole is an azide. The cycloaddition of phenyl azide is used as an example on the right. B. The addition of molecules with the sequence a-b-c to triple bonds. The reaction of phenyl azide is used as an example on the right. This type of reaction was intensively studied by Rolf Huisgen and is known as the Huisgen 1,3-dipolar cycloaddition. C. The copper(I)-catalyzed cycloaddition to a triple bond. This chemistry is known as “click chemistry” and yields exclusively the 1,4-disubstituted 1,2,3-triazole.

 

As pointed out by Barry Sharpless’s group (Kolb et al. 2001), a defining characteristic of click reactions is a high thermodynamic driving force, usually greater than 20 kcal/mol. The reactions proceed quickly to completion and tend to be highly selective for a single product. Carbon-heteroatom bond forming reactions are the most common examples.
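To give a feel for what a driving force of 20 kcal/mol means, the following Python sketch converts a free-energy change into an equilibrium constant using K = exp(-ΔG/RT). It assumes that the quoted driving force can be treated as a standard free-energy change at 25 °C; the function name and the chosen temperature are illustrative and are not taken from Kolb et al.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_constant(delta_g_kcal, temperature_k=298.15):
    """Return K = exp(-dG / RT) for a free-energy change given in kcal/mol."""
    return math.exp(-delta_g_kcal / (R_KCAL * temperature_k))


# A reaction that is downhill by 20 kcal/mol (dG = -20 kcal/mol)
k_eq = equilibrium_constant(-20.0)
print(f"K is roughly {k_eq:.1e}")
```

Under these assumptions K is on the order of 10^14, which is consistent with the statement that such reactions run essentially to completion.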

Click chemistry includes the following classes of chemical transformations:

  • Cycloadditions of unsaturated species. Including 1,3-dipolar cycloaddition reactions and Diels-Alder transformations.
  • Nucleophilic substitution chemistries. For example, ring-opening reactions of strained heterocyclic electrophiles such as epoxides, aziridines, aziridinium ions, and episulfonium ions.
  • Carbonyl chemistry of the “non-aldol” type, such as formation of ureas, thioureas, aromatic heterocycles, oxime ethers, hydrazones, and amides.
  • Additions to carbon-carbon multiple bonds. For example, epoxidation, dihydroxylation, aziridination, sulfenyl halide addition and Michael additions of Nu-H reactants.


References

Franck Amblard, Jong Hyun Cho, and Raymond F. Schinazi; The Cu(I)-catalyzed Huisgen azide-alkyne 1,3-dipolar cycloaddition reaction in nucleoside, nucleotide and oligonucleotide chemistry. Chem Rev. 2009 September; 109(9): 4207–4220. doi:10.1021/cr9001462.

Bohm, T.; Webber, A.; Sauer, J. "Nonstereospecific 1,3-Dipolar Cycloadditions of Azomethine Ylides and Enamines." Tetrahedron 1999, 55, 9535-9558.

Z. P. Demko and K. B. Sharpless, An Expedient Route to the Tetrazole Analogs of α-Amino Acids, Org. Lett., 4, 2525 (2002).

R. Huisgen, R. Grashey, J. Sauer in Chemistry of Alkenes, Interscience, New York, 1964, 806-877.

IUPAC Nomenclature: http://goldbook.iupac.org/

Hartmuth C. Kolb, M. G. Finn, and K. Barry Sharpless; Click Chemistry: Diverse Chemical Function from a Few Good Reactions. Angew. Chem. Int. Ed. 2001, 40, 2004–2021.

Jerry March: Advanced Organic Chemistry: Reactions, Mechanisms, and Structure. 2nd Edition. McGraw-Hill International Book Company.

March’s Advanced Organic Chemistry: Reactions, Mechanisms, and Structure. 6th Edition, M.B. Smith and J. March. Wiley-Interscience.

Sarah A McCarthy, Gemma-Louise Davies & Yurii K Gun'ko; Preparation of multifunctional nanoparticles and their assemblies. Nature Protocols 7, 1677–1693 (2012). doi:10.1038/nprot.2012.082.

Vsevolod V. Rostovtsev, Luke G. Green, Valery V. Fokin and K. Barry Sharpless; A Stepwise Huisgen Cycloaddition Process: Copper(I)-Catalyzed Regioselective “Ligation” of Azides and Terminal Alkynes. Angew. Chem. Int. Ed. 2002, 41(14), 2596–2599. DOI: 10.1002/1521-3773(20020715)41:14<2596::AID-ANIE2596>3.0.CO;2-4.

Publications of the Sharpless Lab: http://www.scripps.edu/sharpless/pubs.html

Contact BSI for more information on click chemistry based conjugation.

 

Gene Expression


Gene expression is the most fundamental process that organisms undergo to perform the cellular functions required for life. Gene expression can also be manipulated in the lab, so that a gene taken from one organism is expressed in another organism of interest. Genes are expressed in a two-step process consisting of transcription and translation.

The first step of gene expression is transcription. A gene is located within the genomic DNA of a cell. The genomic DNA is similar to a blueprint in that it contains all the information needed to build each component of the cell. Most of these components are proteins. Transcription begins with RNA polymerase reading the template strand of the gene of interest and making a complementary single-stranded RNA copy called messenger RNA (mRNA). Once the messenger RNA is made, it proceeds to the next step, called translation.

In the last few years, high-throughput sequencing techniques have enabled rapid progress in mapping the genes of many organisms. As a result, the sequences of genes encoding a wide range of proteins can be found online in gene databanks. In the lab, the gene of interest can be looked up in a computer database, whereas a cell relies on proteins that scan its genomic DNA to locate the gene. While a cell can only transcribe portions of its own genomic DNA into mRNA, scientists are able to synthesize genes from any organism and insert them into a vector for gene expression to take place. All of this is possible only because of advances in gene sequencing and gene synthesis.

The last step of gene expression is translation. Whether the mRNA for the gene of interest was produced by the cell naturally or as a result of laboratory manipulation, translation is carried out by cellular complexes of RNA and protein called ribosomes. Once the mRNA is made, it is tagged with a poly-A tail and exported through the nuclear pores to the ribosomes in the cytoplasm. At the ribosomes, the mRNA is finally translated into protein. Gene expression is thus the process by which a portion of genomic DNA (a gene) is expressed as a protein.
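The two steps can be sketched in a few lines of Python. This is a simplified illustration, not a model of what happens in a cell: it assumes the standard genetic code, ignores promoters, splicing and the poly-A tail, and uses a made-up template strand written 3' to 5' so that the resulting mRNA reads 5' to 3'. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop)
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') complementary to a DNA template strand (3'->5')."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_3_to_5)


def translate(mrna):
    """Translate from the first AUG to the first stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


template = "TACAAAGGTACT"  # made-up template strand, written 3'->5'
mrna = transcribe(template)
print(mrna, translate(mrna))
```

For this toy template, transcription gives the mRNA AUGUUUCCAUGA, and translation reads Met-Phe-Pro before reaching the UGA stop codon.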

In summary, gene expression is a complex process that occurs within the cell to express proteins of interest. Due to advancements in technology, genes from any type of organism can be made synthetically and expressed in a wide variety of hosts. These advancements in gene expression techniques have led, and will continue to lead, to discoveries in therapeutic research.

Oligonucleotide Fluorescent Labeling



By Andrei Laikhter

Optimally labeled nucleic acids are used as molecular probes and are very useful for a variety of nucleic acid based applications such as in antisense technology, biochemistry, biology, chemistry, cell biology, DNA sequencing, forensic science, genetic analysis, medicinal chemistry, molecular diagnostics, neuroscience, pharmacology, RT-PCR based molecular detection, and many others. However, major applications of fluorescent or fluorogenic oligonucleotides appear to be sequencing, forensic and genetic analysis.

Fluorophores, fluorescent chemical compounds that re-emit light upon excitation by light, have long been used for the labeling of many biomolecules. Labeled oligonucleotides can be used as reporter molecules and typically contain covalently linked functional modifications. However, most non-radioactive labels incorporated into nucleotides are not stable during the chemical synthesis of oligonucleotides. Therefore, blocked nucleophilic groups such as alkyl-sulfhydryls or alkyl-amines are incorporated during the oligonucleotide synthesis procedure. These groups can then be used to direct the incorporation of nucleophile-specific labeling reagents. Oligonucleotides labeled in this way have a wide variety of applications. Among them are DNA and RNA probes (1,2), micro-arrays, molecular diagnostic probes, automated sequencing (3), electron microscopy, fluorescence microscopy (4) and hybridization affinity chromatography (5).

Structures and spectral properties of fluorescent dyes

Several groups of chromophores consisting of conjugated unsaturated hydrocarbons and hetero-aromatic molecules have strong fluorescent properties. The most common fluorophores employed in fluorescent assays are derived from fluorescein, rhodamine, coumarin or cyanine type chromophores, whose structures are illustrated in figure 1.


Figure 1. Structures of the most common fluorophores. Where X is a linker, R is an oligonucleotide.

 

Each of these molecules has a characteristic absorbance spectrum and a characteristic emission spectrum. The specific wavelength at which one of these molecules will most efficiently absorb energy is called the absorbance peak and the wavelength at which it will most efficiently emit energy is called the emission peak as illustrated in figure 2.  

Figure 2.  Characteristics of the absorbance and emission spectra of a fluorophore.



The difference between the absorbance peak and the emission peak is known as the Stokes shift. Absorbance and emission peak wavelengths for most of the fluorophores used in molecular applications are shown in Table 1. For a complete list of the fluorescent dyes please visit our website (6).

Table 1. Fluorescent properties of commonly used dyes. Ab: absorbance peak; Em: emission peak; SS: Stokes shift; ε: molar extinction coefficient; -: value not given.

Dye                Ab (nm)   Em (nm)   SS (nm)   ε (M⁻¹cm⁻¹)
Acridine           362       462       100       11,000
Alexa 350          346       442       96        19,000
Alexa 488          495       519       24        71,000
Alexa 594          590       616       26        73,000
Alexa 610          612       628       16        144,000
Alexa 633          632       647       15        159,000
Alexa 700          696       719       23        196,000
AMCA               353       442       89        19,000
ATTO 390           390       479       89        24,000
ATTO 425           436       486       50        45,000
ATTO 465           453       508       55        75,000
ATTO 488           501       523       22        90,000
ATTO 495           495       527       32        80,000
ATTO 590           594       624       30        120,000
ATTO 610           615       634       19        150,000
ATTO 633           629       657       28        130,000
ATTO 647           645       669       24        120,000
ATTO 700           700       719       19        120,000
BODIPY FL          531       545       14        75,000
BODIPY TMR         544       570       26        56,000
BODIPY TR          588       616       28        68,000
Cascade Blue       396       410       14        29,000
Cy2                489       506       17        150,000
Cy3                552       570       18        150,000
Cy3.5              581       596       15        150,000
Cy5                643       667       24        250,000
Cy5.5              675       694       19        250,000
Cy7                743       767       24        250,000
Edans              335       493       158       5,900
Eosin              521       544       23        95,000
Erythrosin         529       553       24        90,000
6-FAM              494       518       24        83,000
6-TET              521       536       15        -
6-HEX              535       556       21        -
JOE                520       548       28        71,000
LightCycler 640    625       640       15        110,000
LightCycler 705    685       705       20        -
Lissamine          558       583       25        88,000
NBD                465       535       70        22,000
Rhodamine 6G       524       550       26        102,000
Rhodamine Green    504       532       28        78,000
Rhodamine Red      560       580       20        129,000
TAMRA              565       580       15        91,000
ROX                585       605       20        82,000
Texas Red          595       615       20        80,000
NED                546       575       29        -
VIC                538       554       16        -
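As a small worked example of the Stokes shift defined above, the following Python sketch recomputes the shift as the emission peak minus the absorbance peak for three dyes whose peak wavelengths are copied from Table 1. The dictionary and function names are illustrative only.

```python
# (absorbance peak, emission peak) in nm, copied from Table 1
DYES = {
    "6-FAM": (494, 518),
    "Cy3": (552, 570),
    "Edans": (335, 493),
}


def stokes_shift_nm(absorbance_nm, emission_nm):
    """Stokes shift expressed as a wavelength difference in nm."""
    return emission_nm - absorbance_nm


for name, (ab, em) in DYES.items():
    print(f"{name}: Stokes shift = {stokes_shift_nm(ab, em)} nm")
```

The same subtraction can be used as a quick consistency check on any row of the table.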

 
The conjugation or addition of electron withdrawing groups (EWG) to a basic fluorophore moiety usually leads to a red shift resulting in a shift of the absorbance and emission peaks to longer wavelengths or lower energies.      
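The link between longer wavelengths and lower energies follows from E = hc/λ. The sketch below, which uses the exact SI values of the physical constants and a hypothetical function name, converts a wavelength in nanometers into a photon energy in electron volts and compares the emission peaks of 6-FAM (518 nm) and Cy5 (667 nm) from Table 1.

```python
PLANCK_J_S = 6.62607015e-34        # Planck constant, J*s
LIGHT_SPEED_M_S = 299792458.0      # speed of light, m/s
ELECTRON_VOLT_J = 1.602176634e-19  # one electron volt in joules


def photon_energy_ev(wavelength_nm):
    """Photon energy E = h*c / wavelength, returned in electron volts."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / ELECTRON_VOLT_J


for dye, emission_nm in (("6-FAM", 518), ("Cy5", 667)):
    print(f"{dye}: {emission_nm} nm corresponds to {photon_energy_ev(emission_nm):.2f} eV")
```

The red-shifted dye emits photons of lower energy, as the paragraph above states.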

Methods of incorporation

The most common and convenient method for the attachment of a fluorescent dye to an oligonucleotide is the phosphoramidite method. This method makes it possible to use commercially available fluorescent phosphoramidites to incorporate one or more fluorophores within the oligonucleotide or at its 5' and/or 3' end. However, if the fluorophore is not stable under the basic conditions needed for the oligonucleotide base deprotection step, it has to be attached post-synthetically, after the base deprotection step is completed. In this situation, the oligonucleotide should contain a functional group that reacts with a reactive moiety on the selected fluorophore to form a stable covalent bond between the fluorophore and the oligonucleotide.

Several chemo-selective methods are available for post-synthetic oligonucleotide labeling. One commonly used chemo-selective labeling method employs amino-modified oligonucleotides together with the corresponding NHS esters or similar amine-reactive synthons, as illustrated in figure 3.

Figure 3.  Oligonucleotide labeling using TAMRA NHS ester.


More recently “click chemistry” was successfully employed to label oligonucleotides with various fluorescent reporter molecules (7-9).  This chemical process is outlined in figure 4.

Figure 4.  Huisgen’s 1,3-dipolar cycloaddition between an alkyne modified oligonucleotide (1) and an azide modified reporter molecule (2).  Where R1 and R2 are a hydrogen atom (H) or an extension of the oligonucleotide chain, X is a linker, and Y is a reporter molecule.

The advantage of this chemistry is that it is completely orthogonal to any other attachment method. This chemistry can also be used in addition to any type of Michael-Addition reaction or chemistry as well as any other active esters that are reactive towards alkylamino modified oligonucleotides.

Polymerase-dependent polynucleotide labeling using fluorescently labeled deoxynucleoside-5’-triphosphates (dNTPs) can be considered the major method for cDNA labeling (10). This method uses a DNA polymerase or terminal deoxynucleotidyl transferase to incorporate fluorescently labeled nucleotides from the corresponding dNTPs into polynucleotides. Oligonucleotides labeled in this way may be used in any type of micro-array application.

Dark quenchers suitable for ultrasensitive probes

In recent years Dabcyl, TAMRA and other acceptor molecules used in qPCR probes have been replaced with one or more of the growing family of dark quencher molecules. For this reason, fluorophore-quencher dual-labeled probes have become a standard in kinetic qPCR assays. The properties of several quencher dyes are provided in table 2.

Table 2.  Characteristic properties of quencher dyes.

Quencher   λmax (nm)   ε (M⁻¹cm⁻¹)
Dabcyl     470         32,000
BHQ2       578         38,000
IB FQ      531         38,000
IQ4        585         59,000


However, even the most efficient quencher dyes show a limited range of quenching that is predetermined by their narrow absorbance spectra. Therefore, each quencher dye requires a fluorophore whose emission falls within a certain spectral range in order to have efficient energy transfer between the two chromophores. The broad absorbance spectrum of our new generation of quencher dyes, for instance IQ4, makes these probes suitable for multiplexing (11). Their highly efficient quenching leads to a higher probe sensitivity. These significantly improved quencher dyes, which also show improved Ct values, now allow for the design of new, highly sensitive linear probes. A comparison of the UV spectral properties of standard mono-labeled oligonucleotides is illustrated in figure 5.


Figure 5.  UV spectra of standard mono-labeled decamer oligonucleotides labeled with the leading quencher dyes.
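The idea that each quencher suits fluorophores emitting in a certain range can be illustrated with a deliberately crude Python heuristic: pick the quencher from Table 2 whose absorbance maximum lies closest to the fluorophore's emission peak. This is only a sketch under that simplifying assumption. Real probe design depends on the overlap of the full spectra, and a broad absorber may cover a wider range than its single λmax value suggests. The function name is hypothetical.

```python
# Absorbance maxima (nm) copied from Table 2
QUENCHERS = {"Dabcyl": 470, "BHQ2": 578, "IB FQ": 531, "IQ4": 585}


def closest_quencher(emission_nm, quenchers=QUENCHERS):
    """Return the quencher whose absorbance maximum is nearest the emission peak."""
    return min(quenchers, key=lambda name: abs(quenchers[name] - emission_nm))


# Emission peaks taken from Table 1
for dye, emission_nm in (("6-FAM", 518), ("Cy3", 570)):
    print(f"{dye} ({emission_nm} nm): {closest_quencher(emission_nm)}")
```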


Biosynthesis, Inc. now offers all types of fluorescently labeled oligonucleotides, including their conjugates with peptides, proteins and various nanoparticles.


References:

  1. Zimmerman, J.; Voss, H.; Schwanger, C.; Stegemann, J.; Erfle, H.; Stucky, K.; Kristensen, T.; Ansorge, W., Nucleic Acids Res., 1990, 18,  1067.

  2. Agrawal, S.; Zamecnik, P. C., Nucleic Acids Res., 1990, 18, 5419.

  3. a) Landgraf, A.; Reckmann, B.; Pingoud, A., Anal. Biochem., 1991, 193, 231. b) Lee, L. G.; Connell, C. R. and Bloch, W. Nucleic Acids Res., 1993, 21, 3761. c) Tyagi, S.; Kramer, F. R., Nature Biotechnology, 1996, 14, 303.

  4. Fisher, T. L.; Terhorst, T.; Cao, X.; Wagner, R. W., Nucleic Acids Res., 1993, 21,  3857.

  5. Urdea, M. S.; Warner, B. D.; Running, J. A.; Stempien, M.; Clyne, J.; Horn, T., Nucleic Acids Res., 1988, 16,  4937.

  6. http://www.biosyn.com/oligonucleotide-modification-services.aspx

  7. A.V. Ustinov, et al, Tetrahedron, 2007, 64, 1467-1473.

  8. Agnew, B. et al., US Patent application 20080050731/A1.

  9. X. Ming, P. Leonard, D. Heindle and F. Seela, Nucleic Acid Symposium Series No. 52, 471-472, 2008.

  10. a) Hessner, M.J., X. Wang, K. Hulse, L. Meyer, Y. Wu, S. Nye, S.W. Guo, and S. Ghosh. Nucleic Acids Res., 2003, 31, e14.  b) C. E. Guerra, BioTechniques, 2006, 41 (1), 53–56.

  11. Laikhter A. et al. US patent 7,956,169.


Western Blot


The western blotting technique was developed in 1979 by Towbin. The technique received its name from the step in which a protein is “blotted” from a gel onto a membrane, and as a spinoff from the name of the Southern blot technique. The western blot has become a routine technique for the detection of target proteins in a mixture. It produces both qualitative and semi-quantitative data about the protein in question.

One of the first steps in western blotting is sample preparation. Tissues or cells are broken up using common mechanical techniques such as a blender, homogenizer, or sonicator. Once the tissue is homogenized, buffers containing detergents and salts are used to lyse the cells and solubilize the proteins. A combination of techniques can be employed to separate different cell compartments.

The second step in the western blot technique is to perform a gel electrophoresis separation of the proteins in the sample. Proteins can be separated by their isoelectric point (pI), molecular weight, electric charge, or a combination of these. The type of separation performed depends heavily on the type of gel and buffer system being used, for example a denaturing SDS polyacrylamide gel (SDS-PAGE) or a native polyacrylamide gel.

The most widely used gel in western blots is the polyacrylamide gel. The gel is run in the presence of a detergent called sodium dodecyl sulfate (SDS), which keeps the polypeptides in their denatured state. In its denatured state, a linear polypeptide can travel through the gel pores and be separated by size. Smaller proteins migrate through the acrylamide gel matrix faster and larger proteins move more slowly. The percentage of acrylamide used in a gel determines its resolution. For large molecular weights a smaller percentage of acrylamide is used, while for small molecular weights a larger percentage is employed. Samples are loaded into wells in the gel, each of which defines a lane. One lane contains a ladder, which is simply a standard mixture of proteins of known molecular weights. Once an electric current is applied, the SDS-coated, negatively charged proteins migrate toward the positive electrode (the anode) at a rate that depends mainly on their mass.
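Because migration in a denaturing gel depends mainly on size, the ladder can be used as a calibration curve. The Python sketch below assumes the commonly used approximation that log10 of the molecular weight falls roughly linearly with relative migration distance (Rf), fits a straight line through the ladder bands by least squares, and reads off the size of an unknown band. The ladder values and function names are hypothetical and serve only as an illustration.

```python
import math


def fit_ladder(ladder):
    """Least-squares line through (Rf, log10(MW)); returns (slope, intercept)."""
    xs = [rf for rf, _ in ladder]
    ys = [math.log10(mw) for _, mw in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw_kda(rf, ladder):
    """Estimate the molecular weight (kDa) of a band from its relative migration."""
    slope, intercept = fit_ladder(ladder)
    return 10 ** (slope * rf + intercept)


# Hypothetical ladder: (relative migration Rf, molecular weight in kDa)
LADDER = [(0.10, 250.0), (0.25, 130.0), (0.40, 70.0), (0.60, 35.0), (0.85, 15.0)]
print(f"Band at Rf 0.50: about {estimate_mw_kda(0.50, LADDER):.0f} kDa")
```

In practice the log-linear relationship only holds over part of the gel, so the estimate is most reliable for bands that lie between ladder bands.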

The third step is known as the blotting step. To make the proteins accessible to antibody detection, they must be transferred from the gel to a membrane made of nitrocellulose or polyvinylidene difluoride. The gel is placed on top of the membrane and then sandwiched between two filter papers. This entire stack is placed into a buffer solution inside an electroblotting box. The buffer and the applied electric current carry the proteins out of the gel and onto the membrane. The proteins bind to the membrane through hydrophobic interactions as well as through their charges.

The fourth step is known as the blocking step. This step is important in order to prevent the antibodies used to detect the protein of interest from binding non-specifically to the membrane itself. Blocking is carried out using BSA or dry milk in TBS with a small percentage of a detergent such as Tween 20 or Triton X-100. When the membrane is placed in this solution, the blocking protein and detergent mix fills in all the spaces on the membrane where no protein was attached during the blotting step. When the antibody is then added, there is no free place on the membrane for it to attach, so the antibody attaches only to the protein it recognizes. This leads to more accurate results by reducing the chance of false positives.

The next step is to detect the protein of interest using a modified antibody. The antibody is usually linked to some kind of reporter molecule, typically an enzyme. When this modified antibody is allowed to react with a specific substrate, the enzyme converts the substrate and produces a color or light signal. Traditionally this takes place in two steps:
  • First the primary antibody is added. Primary antibodies are usually generated in a host species such as a rabbit, horse, or even exotic animals like llamas. The antigen is injected into the animal and the animal’s serum is harvested for its antibody. After the blocking step, a small amount of primary antibody is incubated with the membrane under mild shaking. The primary antibody is allowed to bind for 1-8 hours. Different temperatures are employed to influence both specific and non-specific binding.
  • After the primary antibody has been added, allowed to incubate, and washed from the membrane, the secondary antibody is added. This secondary antibody is directed against a portion of the antibodies of the host species used to generate the primary antibody. The secondary antibody is linked to a reporter enzyme such as alkaline phosphatase or horseradish peroxidase (HRP). HRP is the most common reporter used. With a chemiluminescent substrate, a sheet of photographic film is placed against the membrane and exposed to the light emitted by the enzyme reaction. This forms an image of the antibodies bound to the blot. Another form of detection in western blots is to use fluorophore-linked or radioactively labeled antibodies.
The final step in the western blot technique is to analyze the resulting data. Once the unbound probes are washed away, the western blot can be “developed” to detect the bound proteins of interest. Proteins of interest will not always be visualized as one clean band on a membrane. Sometimes more than one band can be visualized. The size of the protein can be estimated by comparison with the lane containing the ladder. Sometimes the signal of the protein of interest is normalized against a loading control such as actin or tubulin in order to correct for errors in loading. There are a few ways to go about visualizing the detected proteins:
  • Chemiluminescent detection: This method utilizes an enzyme such as HRP which converts a substrate and causes it to luminesce. A CCD camera or photographic film is used to capture an image of the western blot. The image is then analyzed by densitometry, which quantifies the result in terms of OD (optical density). This method is one of the first methods to be used for western blot detection.
  • Fluorophore detection: These immunoassays require fewer steps because there is no need for a substrate to develop the assay; however, special equipment must be used to detect the fluorescent signal. Recently digital imaging has shifted towards the infrared region. Near-IR dyes and quantum dots have increased the use of fluorescent probes in western blot analysis due to their enhanced sensitivity.
  • Radioactive detection: Radioactive labels can also be used in analysis. The membrane is placed against an X-ray film, which is then developed. The film shows the bands of the protein of interest as dark regions. This type of detection is rarely used due to health risks.
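The semi-quantitative use of densitometry readings can be sketched as follows. The example assumes that each lane provides one optical density value for the protein of interest and one for a loading control such as actin or tubulin; the target signal is divided by the control signal and then expressed relative to a chosen reference lane. All readings and names below are hypothetical.

```python
def normalize_to_loading_control(lanes, reference_lane):
    """lanes maps lane name -> (target band OD, loading control OD).

    Returns the control-normalized signal of each lane relative to the
    reference lane, so the reference lane itself comes out as 1.0.
    """
    ratios = {name: target / control for name, (target, control) in lanes.items()}
    reference = ratios[reference_lane]
    return {name: ratio / reference for name, ratio in ratios.items()}


# Hypothetical densitometry readings in arbitrary OD units
LANES = {"untreated": (100.0, 50.0), "treated": (300.0, 75.0)}
for lane, fold_change in normalize_to_loading_control(LANES, "untreated").items():
    print(f"{lane}: {fold_change:.2f}-fold relative to untreated")
```

Normalizing in this way corrects for unequal loading between lanes, but it still assumes that both bands lie within the linear range of the detection method.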
To summarize, the western blot is a powerful technique used by many labs to analyze a protein in a mixture. Western blotting uses a series of steps to separate, blot, and detect individual proteins. While the technique itself is not new, advances in data analysis continue to yield more accurate results.
 

The Ebola Virus Genome and Proteome



The Ebola virus is a single-stranded, negative-sense RNA virus with a small genome. Zaire Ebola virus is responsible for the recent outbreak in West Africa. Ebola viruses belong to the family Filoviridae. Together with Paramyxoviridae, Rhabdoviridae, and Borna disease virus, the Filoviridae belong to the taxonomic order Mononegavirales.

Mononegavirales is the term used for "nonsegmented negative-strand RNA viruses" (NNSV). These are enveloped viruses whose small genomes consist of a single RNA molecule of negative, or anti-mRNA, sense.

Nucleic acids isolated from negative-strand RNA viruses or virus-infected cells cannot initiate an infection cycle when introduced into a host cell. This criterion was used to distinguish “positive”- from “negative”-strand RNA viruses. The viral genome first needs to be transcribed to produce mRNAs, so the purified virion RNA is not infectious. To be infective, the virus must bring its own RNA polymerase into the cell as part of the viral particle, or virion.
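The relationship between the negative-sense genome and the mRNA-sense strand made from it can be illustrated with a few lines of Python. The sketch simply builds the complementary RNA strand, written 5' to 3', from a short made-up fragment; it is not a real viral sequence and does not model the viral polymerase or the individual gene boundaries. The function name is hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(rna_5_to_3):
    """Return the complementary RNA strand, also written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna_5_to_3))


negative_sense = "GGCAUCAU"  # made-up fragment, not a real viral sequence
positive_sense = complementary_strand(negative_sense)
print(f"negative sense: 5'-{negative_sense}-3'")
print(f"positive sense: 5'-{positive_sense}-3'")
```

Only the positive-sense strand (here AUGAUGCC) contains a start codon that a ribosome could read, which is why the genomic RNA must be copied before any protein can be made.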

The use of non-infectious synthetic viral RNA allows for the design of PCR primers or probes as well as peptides and recombinant proteins for molecular diagnostics. Similarly, these molecules may lend themselves to the design and production of vaccines against the virus.

Features of the unsegmented genome of negative-stranded RNA viruses are:

  • Negative sense RNA in the virion
  • Virion-associated RNA polymerase mediates transcription and replication
  • Genome transcribed into 6-10 separate mRNAs from a single promoter
  • Replication occurs by synthesis of a complete positive-sense RNA antigenome
  • The nucleoprotein-RNA complex (nucleocapsid) is the functional template for the synthesis of replicative RNA and mRNA
  • Independently assembled nucleocapsids are enveloped at the cell surface at sites containing virus proteins
  • Replication is mainly cytoplasmic
  • Can occur in invertebrates, vertebrates and plants

Features of the family Filoviridae are:

  • Filamentous forms with branching; sometimes U-shaped, 6-shaped or circular 
  • Uniform diameter of 80 nm and varying lengths up to 14,000 nm. Infectious particle length is 790 nm for Marburg virus and 970 nm for Ebola virus
  • Surface spikes of 10 nm length
  • Helical nucleocapsid; 50 nm diameter, with an axial space of 20 nm diameter and helical periodicity of about 5 nm
  • Filamentous with a linear ~13-19 kb genome of negative-sense single-stranded RNA, molecular weight (Mr) = 4.2 x 10^6
  • At least five (5) proteins; a large (polymerase) protein, a surface glycoprotein, two (2) nucleocapsid-associated proteins, and at least one other protein of unknown function
  • Biology enigmatic; only two antigenically unrelated viruses known; blood borne infection of humans and monkeys

Filoviruses are responsible for newly emerging infections. They are classified as Biosafety Level 4 agents; in comparison, HIV is classified only as Biosafety Level 2+. Filoviruses can infect mice, hamsters, guinea pigs and monkeys. However, it is not known at present where the virus originates in the wild.

Most human epidemics appear to be spread by the blood-borne route: in hospitals often via contaminated needles, and otherwise via close contact with infected persons or their body fluids. Primary infections with Marburg and Ebola viruses are usually 25 to 90% fatal. Death is thought to occur because of visceral organ necrosis, for example of the liver, due to viral infection of tissue parenchymal cells.

Viral RNA is not infectious by itself. Therefore, the use of cloned or synthetic viral RNA can be very useful for the development and production of diagnostic tests or the development of vaccines against filoviruses, for example, the Ebola virus. 

Research aimed at developing a vaccine for Ebola has been under way for a number of years. In 1998, the first successful immunization against Ebola virus infection in an animal model was reported.

“Abstract: Infection by Ebola virus causes rapidly progressive, often fatal, symptoms of fever, hemorrhage and hypotension. Previous attempts to elicit protective immunity for this disease have not met with success. We report here that protection against the lethal effects of Ebola virus can be achieved in an animal model by immunizing with plasmids encoding viral proteins. We analyzed immune responses to the viral nucleoprotein (NP) and the secreted or transmembrane forms of the glycoprotein (sGP or GP) and their ability to protect against infection in a guinea pig infection model analogous to the human disease. Protection was achieved and correlated with antibody titer and antigen-specific T-cell responses to sGP or GP. Immunity to Ebola virus can therefore be developed through genetic vaccination and may facilitate efforts to limit the spread of this disease.” 

{Xu L, Sanchez A, Yang Z, Zaki SR, Nabel EG, Nichol ST, Nabel GJ.; Immunization for Ebola virus infection. Nat Med. 1998 Jan;4(1):37-42.}

The result: a DNA vaccine encoding the glycoprotein (sGP or GP) of the Ebola virus evoked a T-cell based immune response in guinea pigs and protected the animals against infection. Further studies indicated that DNA vaccines are useful for vaccination. The use of DNA immunization together with adenovirus vectors encoding viral proteins protected nonhuman primates, crab-eating or cynomolgus macaques (Macaca fascicularis), from the lethal wild-type Zaire virus.

“Abstract: Outbreaks of haemorrhagic fever caused by the Ebola virus are associated with high mortality rates that are a distinguishing feature of this human pathogen. The highest lethality is associated with the Zaire subtype, one of four strains identified to date. Its rapid progression allows little opportunity to develop natural immunity, and there is currently no effective anti-viral therapy. Therefore, vaccination offers a promising intervention to prevent infection and limit spread. Here we describe a highly effective vaccine strategy for Ebola virus infection in non-human primates. A combination of DNA immunization and boosting with adenoviral vectors that encode viral proteins generated cellular and humoral immunity in cynomolgus macaques. Challenge with a lethal dose of the highly pathogenic, wild-type, 1976 Mayinga strain of Ebola Zaire virus resulted in uniform infection in controls, who progressed to a moribund state and death in less than one week. In contrast, all vaccinated animals were asymptomatic for more than six months, with no detectable virus after the initial challenge. These findings demonstrate that it is possible to develop a preventive vaccine against Ebola virus infection in primates.”

{Sullivan NJ, Sanchez A, Rollin PE, Yang ZY, Nabel GJ.; Development of a preventive vaccine for Ebola virus infection in primates. Nature. 2000 Nov 30;408(6812):605-9.}

Sequence analysis of Ebola viruses from the outbreaks in 1976 and 1995 showed a high degree of genetic conservation for this virus type. One possible explanation is that Ebola viruses have coevolved with their natural host reservoirs and therefore change little in the wild.
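As a rough illustration of how such conservation can be quantified, the following Python sketch computes the percent identity between two sequences that have already been aligned. The function name `percent_identity` and the two short strings are invented for this example; they are toy data, not sequences from the 1976 or 1995 isolates, and published analyses rely on full alignments and dedicated software.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, gap-free columns in which both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy strings for illustration only (hypothetical, not real isolates).
    toy_a = "ACGTACGTACGTACGTACGT"
    toy_b = "ACGTACGTACGAACGTACGT"
    print(f"{percent_identity(toy_a, toy_b):.1f} % identity")
```

A value close to 100 % over a whole genome alignment would correspond to the kind of conservation described above.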

References

Biology of Negative Strand RNA Viruses: The Power of Reverse Genetics; Y. Kawaoka (Ed.). © Springer-Verlag Berlin Heidelberg 2004.

Ebihara H, Takada A, Kobasa D, Jones S, Neumann G, et al. (2006) Molecular determinants of Ebola virus virulence in mice. PLoS Pathog 2(7): e73. DOI: 10.1371/journal.ppat.0020073.

Molecular Basis of Virus Evolution; A. J. Gibbs, C. H. Calisher, and F. Garcia-Arenal (Eds.). © Cambridge University Press 1995.

http://www.ncbi.nlm.nih.gov/pubmed/?term=ebola+virus+review

The Zaire Ebola Virus Genome and Proteome

Graphical display and FASTA file from the NCBI Nucleotide database.

Zaire ebolavirus isolate Ebola virus H.sapiens-tc/COD/1976/Yambuku-Mayinga, complete genome. NCBI Reference Sequence: NC_002549.1. 

Source

http://www.ncbi.nlm.nih.gov/nuccore/10313991?report=graph
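The coordinates in the feature table of the record below can be checked with a few lines of code. The following Python sketch is an illustration only and is not part of the NCBI record: the helper names `cds_length` and `protein_length` are made up here, and the location strings were copied by hand from the CDS entries of NC_002549.1. It sums the segments of each location and checks that every coding region is a whole number of codons. In the GP entry, `join(6039..6923,6923..8068)` counts position 6923 twice, which is how the record represents the additional A residue inserted during transcription.

```python
import re

# CDS locations copied by hand from the feature table of NC_002549.1.
CDS_LOCATIONS = {
    "NP": "470..2689",
    "VP35": "3129..4151",
    "VP40": "4479..5459",
    "GP": "join(6039..6923,6923..8068)",    # edited mRNA, one extra A
    "sGP": "6039..7133",                    # unedited mRNA
    "ssGP": "join(6039..6922,6924..6933)",
    "VP30": "8509..9375",
    "VP24": "10345..11100",
    "L": "11581..18219",
}


def cds_length(location: str) -> int:
    """Sum the lengths of all start..end segments in a location string."""
    segments = re.findall(r"(\d+)\.\.(\d+)", location)
    if not segments:
        raise ValueError(f"no start..end segment found in {location!r}")
    return sum(int(end) - int(start) + 1 for start, end in segments)


def protein_length(location: str) -> int:
    """Number of amino acids encoded, assuming the CDS ends in a stop codon."""
    nucleotides = cds_length(location)
    if nucleotides % 3 != 0:
        raise ValueError(f"{location!r} is not a whole number of codons")
    return nucleotides // 3 - 1  # the final codon is the stop codon


if __name__ == "__main__":
    for name, location in CDS_LOCATIONS.items():
        print(f"{name:5s} {cds_length(location):5d} nt "
              f"{protein_length(location):5d} aa")
```

Under these assumptions the script should report 676 residues for GP, 364 for sGP and 297 for ssGP, which agrees with the lengths of the corresponding `/translation` strings in the record.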

LOCUS       NC_002549              18959 bp    cRNA    linear   VRL 27-AUG-2014
DEFINITION  Zaire ebolavirus isolate Ebola virus
            H.sapiens-tc/COD/1976/Yambuku-Mayinga, complete genome.
ACCESSION   NC_002549
VERSION     NC_002549.1  GI:10313991
DBLINK      BioProject: PRJNA14703
KEYWORDS    RefSeq.
SOURCE      Zaire ebolavirus (ZEBOV)
  ORGANISM  Zaire ebolavirus
            Viruses; ssRNA negative-strand viruses; Mononegavirales;
            Filoviridae; Ebolavirus.
REFERENCE   1  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Volchkova,V.A., Chepurnov,A.A., Blinov,V.M.,
            Dolnik,O., Netesov,S.V. and Feldmann,H.
  TITLE     Characterization of the L gene and 5' trailer region of Ebola virus
  JOURNAL   J. Gen. Virol. 80 (Pt 2), 355-362 (1999)
   PUBMED   10073695
REFERENCE   2  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Volchkova,V.A., Slenczka,W., Klenk,H.D. and
            Feldmann,H.
  TITLE     Release of viral glycoproteins during Ebola virus infection
  JOURNAL   Virology 245 (1), 110-119 (1998)
   PUBMED   9614872
REFERENCE   3  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Feldmann,H., Volchkova,V.A. and Klenk,H.D.
  TITLE     Processing of the Ebola virus glycoprotein by the proprotein
            convertase furin
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 95 (10), 5762-5767 (1998)
   PUBMED   9576958
REFERENCE   4  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Becker,S., Volchkova,V.A., Ternovoj,V.A.,
            Kotov,A.N., Netesov,S.V. and Klenk,H.D.
  TITLE     GP mRNA of Ebola virus is edited by the Ebola virus polymerase and
            by T7 and vaccinia virus polymerases
  JOURNAL   Virology 214 (2), 421-430 (1995)
   PUBMED   8553543
REFERENCE   5  (bases 1 to 18959)
  AUTHORS   Bukreyev,A.A., Volchkov,V.E., Blinov,V.M. and Netesov,S.V.
  TITLE     The VP35 and VP40 proteins of filoviruses. Homology between Marburg
            and Ebola viruses
  JOURNAL   FEBS Lett. 322 (1), 41-46 (1993)
   PUBMED   8482365
REFERENCE   6  (bases 1 to 18959)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (27-SEP-2000) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   7  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-JUN-2000) Institute of Virology, Philipps-University
            Marburg, Robert-Koch-Str. 17, Marburg 35037, Germany
  REMARK    Sequence update by submitter
REFERENCE   8  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-AUG-1998) Institute of Virology, Philipps-University
            Marburg, Robert-Koch-Str. 17, Marburg 35037, Germany
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence is identical to AF086833.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..18959

                     /organism="Zaire ebolavirus"

                     /mol_type="viral cRNA"

                     /isolate="Ebola virus

                     H.sapiens-tc/COD/1976/Yambuku-Mayinga"

                     /db_xref="taxon:186538"

     5'UTR           1..55

                     /note="putative leader region"

                     /citation=[1]

                     /function="regulation or initiation of RNA replication"

     gene            56..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /db_xref="GeneID:911830"

     mRNA            56..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /product="nucleoprotein"

                     /db_xref="GeneID:911830"

     misc_signal     56..67

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /note="putative; transcription start signal"

                     /citation=[1]

     CDS             470..2689

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /function="encapsidation of genomic RNA"

                     /codon_start=1

                     /product="nucleoprotein"

                     /protein_id="NP_066243.1"

                     /db_xref="GI:10314000"

                     /db_xref="GeneID:911830"

                     /translation="MDSRPQKIWMAPSLTESDMDYHKILTAGLSVQQGIVRQRVIPVY

                     QVNNLEEICQLIIQAFEAGVDFQESADSFLLMLCLHHAYQGDYKLFLESGAVKYLEGH

                     GFRFEVKKRDGVKRLEELLPAVSSGKNIKRTLAAMPEEETTEANAGQFLSFASLFLPK

                     LVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKFLLIHQG

                     MHMVAGHDANDAVISNSVAQARFSGLLIVKTVLDHILQKTERGVRLHPLARTAKVKNE

                     VNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLFPQLSAIALGVATAHGSTLAGV

                     NVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKKNEISFQQTN

                     AMVTLRKERLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQ

                     DTTIPDVVVDPDDGSYGEYQSYSENGMNAPDDLVLFDLDEDDEDTKPVPNRSTKGGQQ

                     KNSQKGQHIEGRQTQSRPIQNVPGPHRTIHHASAPLTDNDRRNEPSGSTSPRMLTPIN

                     EEADPLDDADDETSSLPPLESDDEEQDRDGTSNRTPTVAPPAPVYRDHSEKKELPQDE

                     QQDQDHTQEARNQDSDNTQSEHSFEEMYRHILRSQGPFDAVLYYHMMKDEPVVFSTSD

                     GKEYTYPDSLEEEYPPWLTEKEAMNEENRFVTLDGQQFYWPVMNHKNKFMAILQHHQ"

     misc_feature    524..2671

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /note="Ebola nucleoprotein; Region: Ebola_NP; pfam05505"

                     /db_xref="CDD:147601"

     polyA_signal    3015..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

     misc_feature    3027..3031

                     /note="intergenic region"

     gene            3032..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /db_xref="GeneID:911827"

     mRNA            3032..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /product="VP35"

                     /citation=[5]

                     /db_xref="GeneID:911827"

     misc_signal     3032..3043

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /note="putative; transcription start signal"

                     /citation=[5]

     CDS             3129..4151

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /function="polymerase complex protein"

                     /citation=[5]

                     /codon_start=1

                     /product="polymerase complex protein"

                     /protein_id="NP_066244.1"

                     /db_xref="GI:10313992"

                     /db_xref="GeneID:911827"

                     /translation="MTTRTKGRGHTAATTQNDRMPGPELSGWISEQLMTGRIPVSDIF

                     CDIENNPGLCYASQMQQTKPNPKTRNSQTQTDPICNHSFEEVVQTLASLATVVQQQTI

                     ASESLEQRITSLENGLKPVYDMAKTISSLNRVCAEMVAKYDLLVMTTGRATATAAATE

                     AYWAEHGQPPPGPSLYEESAIRGKIESRDETVPQSVREAFNNLNSTTSLTEENFGKPD

                     ISAKDLRNIMYDHLPGFGTAFHQLVQVICKLGKDSNSLDIIHAEFQASLAEGDSPQCA

                     LIQITKRVPIFQDAAPPVIHIRSRGDIPRACQKSLRPVPPSPKIDRGWVCVFQLQDGK

                     TLGLKI"

     misc_feature    3186..4148

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /note="Filoviridae VP35; Region: Filo_VP35; pfam02097"

                     /db_xref="CDD:145320"

     gene            4390..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /db_xref="GeneID:911825"

     mRNA            4390..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /product="VP40"

                     /citation=[5]

                     /db_xref="GeneID:911825"

     misc_signal     4390..4401

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /note="transcription start signal"

                     /citation=[5]

     polyA_signal    4397..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /citation=[5]

     CDS             4479..5459

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /citation=[5]

                     /codon_start=1

                     /product="matrix protein"

                     /protein_id="NP_066245.1"

                     /db_xref="GI:10313993"

                     /db_xref="GeneID:911825"

                     /translation="MRRVILPTAPPEYMEAIYPVRSNSTIARGGNSNTGFLTPESVNG

                     DTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVNVISGPKVLMKQIPIWLPLGVAD

                     QKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQE

                     FVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILL

                     PNKSGKKGNSADLTSPEKIQAIMTSLQDFKIVPIDPTKNIMGIEVPETLVHKLTGKKV

                     TSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASLPAVIEK"

     misc_feature    4479..5363

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /note="Matrix protein VP40; Region: VP40; pfam07447"

                     /db_xref="CDD:116068"

     polyA_signal    5883..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /citation=[5]

     misc_feature    5895..5899

                     /note="intergenic region"

     gene            5900..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /db_xref="GeneID:911829"

     mRNA            5900..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /product="sGP"

                     /note="unedited mRNA"

                     /citation=[4]

                     /db_xref="GeneID:911829"

     misc_signal     5900..5911

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="putative; transcription start signal"

                     /citation=[4]

     CDS             join(6039..6923,6923..8068)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /function="receptor binding and fusion"

                     /artificial_location="low-quality sequence region"

                     /note="virion spike glycoprotein precursor; an addition A

                     residue is inserted during transcription; encodes two

                     disulfide linked subunits GP1 and GP2"

                     /citation=[2]

                     /citation=[3]

                     /citation=[4]

                     /codon_start=1

                     /product="spike glycoprotein"

                     /protein_id="NP_066246.1"

                     /db_xref="GI:10313995"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTVVSNGAKNISGQSPARTSSDPGTNT

                     TTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSPQSLTTKPGPDNSTHNTP

                     VYKLDISEATQVEQHHRRTDNDSTASDTPSATTAAGPPKAENTNTSKSTDFLDPATTT

                     SPQNHSETAGNNNTHHQDTGEESASSGKLGLITNTIAGVAGLITGGRRTRREAIVNAQ

                     PKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYIEGLMHNQDGLICGLRQLANETT

                     QALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKID

                     QIIHDFVDKTLPDQGDNDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF"

     misc_feature    7529..7540

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="encodes the glycoprotein cleavage site, precursor

                     GP is cleaved by subtilisin-like cellular protease furin

                     into subunits GP1 and GP2 that are linked by a disulfide

                     bond"

                     /citation=[3]

     misc_feature    7793..7870

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="immunosuppressive motif; other site"

     misc_feature    7988..8053

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="transmembrane anchor; transmembrane region"

     misc_feature    7706..7924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="heptad repeat 1-heptad repeat 2 region of the

                     transmembrane subunit of Filoviridae viruses, Ebola virus

                     and Marburg virus, and related domains; Region:

                     Ebola-like_HR1-HR2; cd09850"

                     /db_xref="CDD:197367"

     misc_feature    join(6081..6923,6923..7153)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     misc_feature    7706..7732

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1A; other site"

                     /db_xref="CDD:197367"

     misc_feature    7733..7762

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1B; other site"

                     /db_xref="CDD:197367"

     misc_feature    7763..7783

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1C; other site"

                     /db_xref="CDD:197367"

     misc_feature    7784..7831

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1D; other site"

                     /db_xref="CDD:197367"

     misc_feature    7787..7837

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="immunosuppressive region; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7838..7858,7859..7861)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="CX(6,7)C motif; other site"

                     /db_xref="CDD:197367"

     misc_feature    7886..7924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR2; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7784..7786,7793..7795)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Cl binding site [ion binding]; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7706..7714,7718..7723,7727..7732,7736..7744,

                     7748..7756,7760..7765,7769..7777,7781..7807,7811..7819,

                     7823..7828,7844..7849,7856..7858,7865..7876,7880..7882,

                     7889..7894,7901..7903,7910..7915,7922..7924)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="homotrimer interface [polypeptide binding]; other

                     site"

                     /db_xref="CDD:197367"

     misc_feature    order(7706..7714,7718..7726,7730..7735,7739..7747,

                     7754..7768,7772..7783,7787..7792,7796..7804,7808..7813,

                     7817..7819)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1-GP1 interface [polypeptide binding]; other

                     site"

                     /db_xref="CDD:197367"

     CDS             6039..7133

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="sGP, small non-structural, secreted glycoprotein;

                     sGP secreted as a anti-parallel oriented homodimer"

                     /citation=[4]

                     /codon_start=1

                     /product="small secreted glycoprotein"

                     /protein_id="NP_066247.1"

                     /db_xref="GI:10313994"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKTSLEKFAVKSCLSQLYQTEPKTSVVRVRRELLPTQGPTQ

                     QLKTTKSWLQKIPLQWFKCTVKEGKLQCRI"

     misc_feature    6081..7130

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     CDS             join(6039..6922,6924..6933)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /artificial_location="low-quality sequence region"

                     /note="ssGP; second non-structural secreted glycoprotein;

                     secreted in a monomeric form; one A residue is deleted or

                     two additional A residues are inserted at the editing site

                     during transcription of the GP gene"

                     /citation=[4]

                     /codon_start=1

                     /product="second secreted glycoprotein"

                     /protein_id="NP_066248.1"

                     /db_xref="GI:10313996"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKPH"

     misc_feature    join(6081..6922,6924..>6924)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     misc_signal     6918..6924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="additional A residues are inserted or deleted

                     during transcription of the GP gene by the viral

                     polymerase"

                     /citation=[4]

                     /function="RNA editing"

     gene            8288..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /db_xref="GeneID:911826"

     mRNA            8288..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /product="VP30"

                     /db_xref="GeneID:911826"

     misc_signal     8288..8299

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="putative; transcription start signal"

     polyA_signal    8295..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /citation=[4]

     CDS             8509..9375

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="polymerase complex protein"

                     /codon_start=1

                     /product="minor nucleoprotein"

                     /protein_id="NP_066249.1"

                     /db_xref="GI:10313997"

                     /db_xref="GeneID:911826"

                     /translation="MEASYERGRPRAARQHSRDGHDHHVRARSSSRENYRGEYRQSRS

                     ASQVRVPTVFHKKRVEPLTVPPAPKDICPTLKKGFLCDSSFCKKDHQLESLTDRELLL

                     LIARKTCGSVEQQLNITAPKDSRLANPTADDFQQEEGPKITLLTLIKTAEHWARQDIR

                     TIEDSKLRALLTLCAVMTRKFSKSQLSLLCETHLRREGLGQDQAEPVLEVYQRLHSDK

                     GGSFEAALWQQWDRQSLIMFITAFLNIALQLPCESSAVVVSGLRTLVPQSDNEEASTN

                     PGTCSWSDEGTP"

     misc_feature    8932..9321

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="Ebola virus-specific transcription factor VP30;

                     Region: Transcript_VP30; pfam11507"

                     /db_xref="CDD:151944"

     polyA_signal    9730..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="putative"

     misc_feature    9741..9884

                     /note="intergenic region"

     gene            9885..11518

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="putative"

                     /db_xref="GeneID:911828"

     mRNA            9885..11496

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /product="VP24"

                     /db_xref="GeneID:911828"

     misc_signal     9885..9896

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="transcription start signal"

     CDS             10345..11100

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /codon_start=1

                     /product="membrane-associated protein"

                     /protein_id="NP_066250.1"

                     /db_xref="GI:10313998"

                     /db_xref="GeneID:911828"

                     /translation="MAKATGRYNLISPKKDLEKGVVLSDLCNFLVSQTIQGWKVYWAG

                     IEFDVTHKGMALLHRLKTNDFAPAWSMTRNLFPHLFQNPNSTIESPLWALRVILAAGI

                     QDQLIDQSLIEPLAGALGLISDWLLTTNTNHFNMRTQRVKEQLSLKMLSLIRSNILKF

                     INKLDALHVVNYNGLLSSIEIGTQNHTIIITRTNMGFLVELQEPDKSAMNRMKPGPAK

                     FSLLHESTLKAFTQGSSTRMQSLILEFNSSLAI"

     misc_feature    10369..11040

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="Filovirus membrane-associated protein VP24; Region:

                     Filo_VP24; pfam06389"

                     /db_xref="CDD:253701"

     polyA_signal    11485..11496

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="putative"

     misc_feature    11497..11500

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="intergenic region"

     gene            11501..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /db_xref="GeneID:911824"

     mRNA            11501..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /product="polymerase"

                     /citation=[1]

                     /db_xref="GeneID:911824"

     misc_signal     11501..11512

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="transcription start signal"

                     /citation=[1]

     polyA_signal    11508..11518

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

     CDS             11581..18219

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /function="synthesis of viral RNAs; transcriptional RNA

                     editing"

                     /note="polymerase"

                     /citation=[1]

                     /codon_start=1

                     /product="RNA-dependent RNA polymerase"

                     /protein_id="NP_066251.1"

                     /db_xref="GI:10313999"

                     /db_xref="GeneID:911824"

                     /translation="MATQHTQYPDARLSSPIVLDQCDLVTRACGLYSSYSLNPQLRNC

                     KLPKHIYRLKYDVTVTKFLSDVPVATLPIDFIVPVLLKALSGNGFCPVEPRCQQFLDE

                     IIKYTMQDALFLKYYLKNVGAQEDCVDEHFQEKILSSIQGNEFLHQMFFWYDLAILTR

                     RGRLNRGNSRSTWFVHDDLIDILGYGDYVFWKIPISMLPLNTQGIPHAAMDWYQASVF

                     KEAVQGHTHIVSVSTADVLIMCKDLITCRFNTTLISKIAEIEDPVCSDYPNFKIVSML

                     YQSGDYLLSILGSDGYKIIKFLEPLCLAKIQLCSKYTERKGRFLTQMHLAVNHTLEEI

                     TEMRALKPSQAQKIREFHRTLIRLEMTPQQLCELFSIQKHWGHPVLHSETAIQKVKKH

                     ATVLKALRPIVIFETYCVFKYSIAKHYFDSQGSWYSVTSDRNLTPGLNSYIKRNQFPP

                     LPMIKELLWEFYHLDHPPLFSTKIISDLSIFIKDRATAVERTCWDAVFEPNVLGYNPP

                     HKFSTKRVPEQFLEQENFSIENVLSYAQKLEYLLPQYRNFSFSLKEKELNVGRTFGKL

                     PYPTRNVQTLCEALLADGLAKAFPSNMMVVTEREQKESLLHQASWHHTSDDFGEHATV

                     RGSSFVTDLEKYNLAFRYEFTAPFIEYCNRCYGVKNVFNWMHYTIPQCYMHVSDYYNP

                     PHNLTLENRDNPPEGPSSYRGHMGGIEGLQQKLWTSISCAQISLVEIKTGFKLRSAVM

                     GDNQCITVLSVFPLETDADEQEQSAEDNAARVAASLAKVTSACGIFLKPDETFVHSGF

                     IYFGKKQYLNGVQLPQSLKTATRMAPLSDAIFDDLQGTLASIGTAFERSISETRHIFP

                     CRITAAFHTFFSVRILQYHHLGFNKGFDLGQLTLGKPLDFGTISLALAVPQVLGGLSF

                     LNPEKCFYRNLGDPVTSGLFQLKTYLRMIEMDDLFLPLIAKNPGNCTAIDFVLNPSGL

                     NVPGSQDLTSFLRQIVRRTITLSAKNKLINTLFHASADFEDEMVCKWLLSSTPVMSRF

                     AADIFSRTPSGKRLQILGYLEGTRTLLASKIINNNTETPVLDRLRKITLQRWSLWFSY

                     LDHCDNILAEALTQITCTVDLAQILREYSWAHILEGRPLIGATLPCMIEQFKVFWLKP

                     YEQCPQCSNAKQPGGKPFVSVAVKKHIVSAWPNASRISWTIGDGIPYIGSRTEDKIGQ

                     PAIKPKCPSAALREAIELASRLTWVTQGSSNSDLLIKPFLEARVNLSVQEILQMTPSH

                     YSGNIVHRYNDQYSPHSFMANRMSNSATRLIVSTNTLGEFSGGGQSARDSNIIFQNVI

                     NYAVALFDIKFRNTEATDIQYNRAHLHLTKCCTREVPAQYLTYTSTLDLDLTRYRENE

                     LIYDSNPLKGGLNCNISFDNPFFQGKRLNIIEDDLIRLPHLSGWELAKTIMQSIISDS

                     NNSSTDPISSGETRSFTTHFLTYPKIGLLYSFGAFVSYYLGNTILRTKKLTLDNFLYY

                     LTTQIHNLPHRSLRILKPTFKHASVMSRLMSIDPHFSIYIGGAAGDRGLSDAARLFLR

                     TSISSFLTFVKEWIINRGTIVPLWIVYPLEGQNPTPVNNFLYQIVELLVHDSSRQQAF

                     KTTISDHVHPHDNLVYTCKSTASNFFHASLAYWRSRHRNSNRKYLARDSSTGSSTNNS

                     DGHIERSQEQTTRDPHDGTERNLVLQMSHEIKRTTIPQENTHQGPSFQSFLSDSACGT

                     ANPKLNFDRSRHNVKFQDHNSASKREGHQIISHRLVLPFFTLSQGTRQLTSSNESQTQ

                     DEISKYLRQLRSVIDTTVYCRFTGIVSSMHYKLDEVLWEIESFKSAVTLAEGEGAGAL

                     LLIQKYQVKTLFFNTLATESSIESEIVSGMTTPRMLLPVMSKFHNDQIEIILNNSASQ

                     ITDITNPTWFKDQRARLPKQVEVITMDAETTENINRSKLYEAVYKLILHHIDPSVLKA

                     VVLKVFLSDTEGMLWLNDNLAPFFATGYLIKPITSSARSSEWYLCLTNFLSTTRKMPH

                     QNHLSCKQVILTALQLQIQRSPYWLSHLTQYADCELHLSYIRLGFPSLEKVLYHRYNL

                     VDSKRGPLVSITQHLAHLRAEIRELTNDYNQQRQSRTQTYHFIRTAKGRITKLVNDYL

                     KFFLIVQALKHNGTWQAEFKKLPELISVCNRFYHIRDCNCEERFLVQTLYLHRMQDSE

                     VKLIERLTGLLSLFPDGLYRFD"

     misc_feature    11608..14853

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="Mononegavirales RNA dependent RNA polymerase;

                     Region: Mononeg_RNA_pol; pfam00946"

                     /db_xref="CDD:250248"

     misc_feature    15223..18192

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="mRNA capping enzyme, paramyxovirus family; Region:

                     paramyx_RNAcap; TIGR04198"

                     /db_xref="CDD:234496"

     polyA_signal    18272..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /citation=[1]

     3'UTR           18283..18959

                     /note="putative trailer region"

                     /citation=[1]

                     /function="regulation or initiation of RNA replication"

ORIGIN      
        1 cggacacaca aaaagaaaga agaattttta ggatcttttg tgtgcgaata actatgagga
       61 agattaataa ttttcctctc attgaaattt atatcggaat ttaaattgaa attgttactg
      121 taatcacacc tggtttgttt cagagccaca tcacaaagat agagaacaac ctaggtctcc
      181 gaagggagca agggcatcag tgtgctcagt tgaaaatccc ttgtcaacac ctaggtctta
      241 tcacatcaca agttccacct cagactctgc agggtgatcc aacaacctta atagaaacat
      301 tattgttaaa ggacagcatt agttcacagt caaacaagca agattgagaa ttaaccttgg
      361 ttttgaactt gaacacttag gggattgaag attcaacaac cctaaagctt ggggtaaaac
      421 attggaaata gttaaaagac aaattgctcg gaatcacaaa attccgagta tggattctcg
      481 tcctcagaaa atctggatgg cgccgagtct cactgaatct gacatggatt accacaagat
      541 cttgacagca ggtctgtccg ttcaacaggg gattgttcgg caaagagtca tcccagtgta
      601 tcaagtaaac aatcttgaag aaatttgcca acttatcata caggcctttg aagcaggtgt
      661 tgattttcaa gagagtgcgg acagtttcct tctcatgctt tgtcttcatc atgcgtacca
      721 gggagattac aaacttttct tggaaagtgg cgcagtcaag tatttggaag ggcacgggtt
      781 ccgttttgaa gtcaagaagc gtgatggagt gaagcgcctt gaggaattgc tgccagcagt
      841 atctagtgga aaaaacatta agagaacact tgctgccatg ccggaagagg agacaactga
      901 agctaatgcc ggtcagtttc tctcctttgc aagtctattc cttccgaaat tggtagtagg
      961 agaaaaggct tgccttgaga aggttcaaag gcaaattcaa gtacatgcag agcaaggact
     1021 gatacaatat ccaacagctt ggcaatcagt aggacacatg atggtgattt tccgtttgat
     1081 gcgaacaaat tttctgatca aatttctcct aatacaccaa gggatgcaca tggttgccgg
     1141 gcatgatgcc aacgatgctg tgatttcaaa ttcagtggct caagctcgtt tttcaggctt
     1201 attgattgtc aaaacagtac ttgatcatat cctacaaaag acagaacgag gagttcgtct
     1261 ccatcctctt gcaaggaccg ccaaggtaaa aaatgaggtg aactccttta aggctgcact
     1321 cagctccctg gccaagcatg gagagtatgc tcctttcgcc cgacttttga acctttctgg
     1381 agtaaataat cttgagcatg gtcttttccc tcaactatcg gcaattgcac tcggagtcgc
     1441 cacagcacac gggagtaccc tcgcaggagt aaatgttgga gaacagtatc aacaactcag
     1501 agaggctgcc actgaggctg agaagcaact ccaacaatat gcagagtctc gcgaacttga
     1561 ccatcttgga cttgatgatc aggaaaagaa aattcttatg aacttccatc agaaaaagaa
     1621 cgaaatcagc ttccagcaaa caaacgctat ggtaactcta agaaaagagc gcctggccaa
     1681 gctgacagaa gctatcactg ctgcgtcact gcccaaaaca agtggacatt acgatgatga
     1741 tgacgacatt ccctttccag gacccatcaa tgatgacgac aatcctggcc atcaagatga
     1801 tgatccgact gactcacagg atacgaccat tcccgatgtg gtggttgatc ccgatgatgg
     1861 aagctacggc gaataccaga gttactcgga aaacggcatg aatgcaccag atgacttggt
     1921 cctattcgat ctagacgagg acgacgagga cactaagcca gtgcctaata gatcgaccaa
     1981 gggtggacaa cagaagaaca gtcaaaaggg ccagcatata gagggcagac agacacaatc
     2041 caggccaatt caaaatgtcc caggccctca cagaacaatc caccacgcca gtgcgccact
     2101 cacggacaat gacagaagaa atgaaccctc cggctcaacc agccctcgca tgctgacacc
     2161 aattaacgaa gaggcagacc cactggacga tgccgacgac gagacgtcta gccttccgcc
     2221 cttggagtca gatgatgaag agcaggacag ggacggaact tccaaccgca cacccactgt
     2281 cgccccaccg gctcccgtat acagagatca ctctgaaaag aaagaactcc cgcaagacga
     2341 gcaacaagat caggaccaca ctcaagaggc caggaaccag gacagtgaca acacccagtc
     2401 agaacactct tttgaggaga tgtatcgcca cattctaaga tcacaggggc catttgatgc
     2461 tgttttgtat tatcatatga tgaaggatga gcctgtagtt ttcagtacca gtgatggcaa
     2521 agagtacacg tatccagact cccttgaaga ggaatatcca ccatggctca ctgaaaaaga
     2581 ggctatgaat gaagagaata gatttgttac attggatggt caacaatttt attggccggt
     2641 gatgaatcac aagaataaat tcatggcaat cctgcaacat catcagtgaa tgagcatgga
     2701 acaatgggat gattcaaccg acaaatagct aacattaagt agtcaaggaa cgaaaacagg
     2761 aagaattttt gatgtctaag gtgtgaatta ttatcacaat aaaagtgatt cttatttttg
     2821 aatttaaagc tagcttatta ttactagccg tttttcaaag ttcaatttga gtcttaatgc
     2881 aaataggcgt taagccacag ttatagccat aattgtaact caatattcta actagcgatt
     2941 tatctaaatt aaattacatt atgcttttat aacttaccta ctagcctgcc caacatttac
     3001 acgatcgttt tataattaag aaaaaactaa tgatgaagat taaaaccttc atcatcctta
     3061 cgtcaattga attctctagc actcgaagct tattgtcttc aatgtaaaag aaaagctggt
     3121 ctaacaagat gacaactaga acaaagggca ggggccatac tgcggccacg actcaaaacg
     3181 acagaatgcc aggccctgag ctttcgggct ggatctctga gcagctaatg accggaagaa
     3241 ttcctgtaag cgacatcttc tgtgatattg agaacaatcc aggattatgc tacgcatccc
     3301 aaatgcaaca aacgaagcca aacccgaaga cgcgcaacag tcaaacccaa acggacccaa
     3361 tttgcaatca tagttttgag gaggtagtac aaacattggc ttcattggct actgttgtgc
     3421 aacaacaaac catcgcatca gaatcattag aacaacgcat tacgagtctt gagaatggtc
     3481 taaagccagt ttatgatatg gcaaaaacaa tctcctcatt gaacagggtt tgtgctgaga
     3541 tggttgcaaa atatgatctt ctggtgatga caaccggtcg ggcaacagca accgctgcgg
     3601 caactgaggc ttattgggcc gaacatggtc aaccaccacc tggaccatca ctttatgaag
     3661 aaagtgcgat tcggggtaag attgaatcta gagatgagac cgtccctcaa agtgttaggg
     3721 aggcattcaa caatctaaac agtaccactt cactaactga ggaaaatttt gggaaacctg
     3781 acatttcggc aaaggatttg agaaacatta tgtatgatca cttgcctggt tttggaactg
     3841 ctttccacca attagtacaa gtgatttgta aattgggaaa agatagcaac tcattggaca
     3901 tcattcatgc tgagttccag gccagcctgg ctgaaggaga ctctcctcaa tgtgccctaa
     3961 ttcaaattac aaaaagagtt ccaatcttcc aagatgctgc tccacctgtc atccacatcc
     4021 gctctcgagg tgacattccc cgagcttgcc agaaaagctt gcgtccagtc ccaccatcgc
     4081 ccaagattga tcgaggttgg gtatgtgttt ttcagcttca agatggtaaa acacttggac
     4141 tcaaaatttg agccaatctc ccttccctcc gaaagaggcg aataatagca gaggcttcaa
     4201 ctgctgaact atagggtacg ttacattaat gatacacttg tgagtatcag ccctggataa
     4261 tataagtcaa ttaaacgacc aagataaaat tgttcatatc tcgctagcag cttaaaatat
     4321 aaatgtaata ggagctatat ctctgacagt attataatca attgttatta agtaacccaa
     4381 accaaaagtg atgaagatta agaaaaacct acctcggctg agagagtgtt ttttcattaa
     4441 ccttcatctt gtaaacgttg agcaaaattg ttaaaaatat gaggcgggtt atattgccta
     4501 ctgctcctcc tgaatatatg gaggccatat accctgtcag gtcaaattca acaattgcta
     4561 gaggtggcaa cagcaataca ggcttcctga caccggagtc agtcaatggg gacactccat
     4621 cgaatccact caggccaatt gccgatgaca ccatcgacca tgccagccac acaccaggca
     4681 gtgtgtcatc agcattcatc cttgaagcta tggtgaatgt catatcgggc cccaaagtgc
     4741 taatgaagca aattccaatt tggcttcctc taggtgtcgc tgatcaaaag acctacagct
     4801 ttgactcaac tacggccgcc atcatgcttg cttcatacac tatcacccat ttcggcaagg
     4861 caaccaatcc acttgtcaga gtcaatcggc tgggtcctgg aatcccggat catcccctca
     4921 ggctcctgcg aattggaaac caggctttcc tccaggagtt cgttcttccg ccagtccaac
     4981 taccccagta tttcaccttt gatttgacag cactcaaact gatcacccaa ccactgcctg
     5041 ctgcaacatg gaccgatgac actccaacag gatcaaatgg agcgttgcgt ccaggaattt
     5101 catttcatcc aaaacttcgc cccattcttt tacccaacaa aagtgggaag aaggggaaca
     5161 gtgccgatct aacatctccg gagaaaatcc aagcaataat gacttcactc caggacttta
     5221 agatcgttcc aattgatcca accaaaaata tcatgggaat cgaagtgcca gaaactctgg
     5281 tccacaagct gaccggtaag aaggtgactt ctaaaaatgg acaaccaatc atccctgttc
     5341 ttttgccaaa gtacattggg ttggacccgg tggctccagg agacctcacc atggtaatca
     5401 cacaggattg tgacacgtgt cattctcctg caagtcttcc agctgtgatt gagaagtaat
     5461 tgcaataatt gactcagatc cagttttata gaatcttctc agggatagtg ataacatcta
     5521 tttagtaatc cgtccattag aggagacact tttaattgat caatatacta aaggtgcttt
     5581 acaccattgt cttttttctc tcctaaatgt agaacttaac aaaagactca taatatactt
     5641 gtttttaaag gattgattga tgaaagatca taactaataa cattacaaat aatcctacta
     5701 taatcaatac ggtgattcaa atgttaatct ttctcattgc acatactttt tgcccttatc
     5761 ctcaaattgc ctgcatgctt acatctgagg atagccagtg tgacttggat tggaaatgtg
     5821 gagaaaaaat cgggacccat ttctaggttg ttcacaatcc aagtacagac attgcccttc
     5881 taattaagaa aaaatcggcg atgaagatta agccgacagt gagcgtaatc ttcatctctc
     5941 ttagattatt tgttttccag agtaggggtc gtcaggtcct tttcaatcgt gtaaccaaaa
     6001 taaactccac tagaaggata ttgtggggca acaacacaat gggcgttaca ggaatattgc
     6061 agttacctcg tgatcgattc aagaggacat cattctttct ttgggtaatt atccttttcc
     6121 aaagaacatt ttccatccca cttggagtca tccacaatag cacattacag gttagtgatg
     6181 tcgacaaact agtttgtcgt gacaaactgt catccacaaa tcaattgaga tcagttggac
     6241 tgaatctcga agggaatgga gtggcaactg acgtgccatc tgcaactaaa agatggggct
     6301 tcaggtccgg tgtcccacca aaggtggtca attatgaagc tggtgaatgg gctgaaaact
     6361 gctacaatct tgaaatcaaa aaacctgacg ggagtgagtg tctaccagca gcgccagacg
     6421 ggattcgggg cttcccccgg tgccggtatg tgcacaaagt atcaggaacg ggaccgtgtg
     6481 ccggagactt tgccttccat aaagagggtg ctttcttcct gtatgatcga cttgcttcca
     6541 cagttatcta ccgaggaacg actttcgctg aaggtgtcgt tgcatttctg atactgcccc
     6601 aagctaagaa ggacttcttc agctcacacc ccttgagaga gccggtcaat gcaacggagg
     6661 acccgtctag tggctactat tctaccacaa ttagatatca ggctaccggt tttggaacca
     6721 atgagacaga gtacttgttc gaggttgaca atttgaccta cgtccaactt gaatcaagat
     6781 tcacaccaca gtttctgctc cagctgaatg agacaatata tacaagtggg aaaaggagca
     6841 ataccacggg aaaactaatt tggaaggtca accccgaaat tgatacaaca atcggggagt
     6901 gggccttctg ggaaactaaa aaaacctcac tagaaaaatt cgcagtgaag agttgtcttt
     6961 cacagttgta tcaaacggag ccaaaaacat cagtggtcag agtccggcgc gaacttcttc
     7021 cgacccaggg accaacacaa caactgaaga ccacaaaatc atggcttcag aaaattcctc
     7081 tgcaatggtt caagtgcaca gtcaaggaag ggaagctgca gtgtcgcatc taacaaccct
     7141 tgccacaatc tccacgagtc cccaatccct cacaaccaaa ccaggtccgg acaacagcac
     7201 ccataataca cccgtgtata aacttgacat ctctgaggca actcaagttg aacaacatca
     7261 ccgcagaaca gacaacgaca gcacagcctc cgacactccc tctgccacga ccgcagccgg
     7321 acccccaaaa gcagagaaca ccaacacgag caagagcact gacttcctgg accccgccac
     7381 cacaacaagt ccccaaaacc acagcgagac cgctggcaac aacaacactc atcaccaaga
     7441 taccggagaa gagagtgcca gcagcgggaa gctaggctta attaccaata ctattgctgg
     7501 agtcgcagga ctgatcacag gcgggagaag aactcgaaga gaagcaattg tcaatgctca
     7561 acccaaatgc aaccctaatt tacattactg gactactcag gatgaaggtg ctgcaatcgg
     7621 actggcctgg ataccatatt tcgggccagc agccgaggga atttacatag aggggctaat
     7681 gcacaatcaa gatggtttaa tctgtgggtt gagacagctg gccaacgaga cgactcaagc
     7741 tcttcaactg ttcctgagag ccacaactga gctacgcacc ttttcaatcc tcaaccgtaa
     7801 ggcaattgat ttcttgctgc agcgatgggg cggcacatgc cacattctgg gaccggactg
     7861 ctgtatcgaa ccacatgatt ggaccaagaa cataacagac aaaattgatc agattattca
     7921 tgattttgtt gataaaaccc ttccggacca gggggacaat gacaattggt ggacaggatg
     7981 gagacaatgg ataccggcag gtattggagt tacaggcgtt ataattgcag ttatcgcttt
     8041 attctgtata tgcaaatttg tcttttagtt tttcttcaga ttgcttcatg gaaaagctca
     8101 gcctcaaatc aatgaaacca ggatttaatt atatggatta cttgaatcta agattacttg
     8161 acaaatgata atataataca ctggagcttt aaacatagcc aatgtgattc taactccttt
     8221 aaactcacag ttaatcataa acaaggtttg acatcaatct agttatctct ttgagaatga
     8281 taaacttgat gaagattaag aaaaaggtaa tctttcgatt atctttaatc ttcatccttg
     8341 attctacaat catgacagtt gtctttagtg acaagggaaa gaagcctttt tattaagttg
     8401 taataatcag atctgcgaac cggtagagtt tagttgcaac ctaacacaca taaagcattg
     8461 gtcaaaaagt caatagaaat ttaaacagtg agtggagaca acttttaaat ggaagcttca
     8521 tatgagagag gacgcccacg agctgccaga cagcattcaa gggatggaca cgaccaccat
     8581 gttcgagcac gatcatcatc cagagagaat tatcgaggtg agtaccgtca atcaaggagc
     8641 gcctcacaag tgcgcgttcc tactgtattt cataagaaga gagttgaacc attaacagtt
     8701 cctccagcac ctaaagacat atgtccgacc ttgaaaaaag gatttttgtg tgacagtagt
     8761 ttttgcaaaa aagatcacca gttggagagt ttaactgata gggaattact cctactaatc
     8821 gcccgtaaga cttgtggatc agtagaacaa caattaaata taactgcacc caaggactcg
     8881 cgcttagcaa atccaacggc tgatgatttc cagcaagagg aaggtccaaa aattaccttg
     8941 ttgacactga tcaagacggc agaacactgg gcgagacaag acatcagaac catagaggat
     9001 tcaaaattaa gagcattgtt gactctatgt gctgtgatga cgaggaaatt ctcaaaatcc
     9061 cagctgagtc ttttatgtga gacacaccta aggcgcgagg ggcttgggca agatcaggca
     9121 gaacccgttc tcgaagtata tcaacgatta cacagtgata aaggaggcag ttttgaagct
     9181 gcactatggc aacaatggga ccgacaatcc ctaattatgt ttatcactgc attcttgaat
     9241 attgctctcc agttaccgtg tgaaagttct gctgtcgttg tttcagggtt aagaacattg
     9301 gttcctcaat cagataatga ggaagcttca accaacccgg ggacatgctc atggtctgat
     9361 gagggtaccc cttaataagg ctgactaaaa cactatataa ccttctactt gatcacaata
     9421 ctccgtatac ctatcatcat atatttaatc aagacgatat cctttaaaac ttattcagta
     9481 ctataatcac tctcgtttca aattaataag atgtgcatga ttgccctaat atatgaagag
     9541 gtatgataca accctaacag tgatcaaaga aaatcataat ctcgtatcgc tcgtaatata
     9601 acctgccaag catacctctt gcacaaagtg attcttgtac acaaataatg ttttactcta
     9661 caggaggtag caacgatcca tcccatcaaa aaataagtat ttcatgactt actaatgatc
     9721 tcttaaaata ttaagaaaaa ctgacggaac ataaattctt tatgcttcaa gctgtggagg
     9781 aggtgtttgg tattggctat tgttatatta caatcaataa caagcttgta aaaatattgt
     9841 tcttgtttca agaggtagat tgtgaccgga aatgctaaac taatgatgaa gattaatgcg
     9901 gaggtctgat aagaataaac cttattattc agattaggcc ccaagaggca ttcttcatct
     9961 ccttttagca aagtactatt tcagggtagt ccaattagtg gcacgtcttt tagctgtata
    10021 tcagtcgccc ctgagatacg ccacaaaagt gtctctaagc taaattggtc tgtacacatc
    10081 ccatacattg tattaggggc aataatatct aattgaactt agccgtttaa aatttagtgc
    10141 ataaatctgg gctaacacca ccaggtcaac tccattggct gaaaagaagc ttacctacaa
    10201 cgaacatcac tttgagcgcc ctcacaatta aaaaatagga acgtcgttcc aacaatcgag
    10261 cgcaaggttt caaggttgaa ctgagagtgt ctagacaaca aaatattgat actccagaca
    10321 ccaagcaaga cctgagaaaa aaccatggct aaagctacgg gacgatacaa tctaatatcg
    10381 cccaaaaagg acctggagaa aggggttgtc ttaagcgacc tctgtaactt cttagttagc
    10441 caaactattc aggggtggaa ggtttattgg gctggtattg agtttgatgt gactcacaaa
    10501 ggaatggccc tattgcatag actgaaaact aatgactttg cccctgcatg gtcaatgaca
    10561 aggaatctct ttcctcattt atttcaaaat ccgaattcca caattgaatc accgctgtgg
    10621 gcattgagag tcatccttgc agcagggata caggaccagc tgattgacca gtctttgatt
    10681 gaacccttag caggagccct tggtctgatc tctgattggc tgctaacaac caacactaac
    10741 catttcaaca tgcgaacaca acgtgtcaag gaacaattga gcctaaaaat gctgtcgttg
    10801 attcgatcca atattctcaa gtttattaac aaattggatg ctctacatgt cgtgaactac
    10861 aacggattgt tgagcagtat tgaaattgga actcaaaatc atacaatcat cataactcga
    10921 actaacatgg gttttctggt ggagctccaa gaacccgaca aatcggcaat gaaccgcatg
    10981 aagcctgggc cggcgaaatt ttccctcctt catgagtcca cactgaaagc atttacacaa
    11041 ggatcctcga cacgaatgca aagtttgatt cttgaattta atagctctct tgctatctaa
    11101 ctaaggtaga atacttcata ttgagctaac tcatatatgc tgactcaata gttatcttga
    11161 catctctgct ttcataatca gatatataag cataataaat aaatactcat atttcttgat
    11221 aatttgttta accacagata aatcctcact gtaagccagc ttccaagttg acacccttac
    11281 aaaaaccagg actcagaatc cctcaaacaa gagattccaa gacaacatca tagaattgct
    11341 ttattatatg aataagcatt ttatcaccag aaatcctata tactaaatgg ttaattgtaa
    11401 ctgaacccgc aggtcacatg tgttaggttt cacagattct atatattact aactctatac
    11461 tcgtaattaa cattagataa gtagattaag aaaaaagcct gaggaagatt aagaaaaact
    11521 gcttattggg tctttccgtg ttttagatga agcagttgaa attcttcctc ttgatattaa
    11581 atggctacac aacataccca atacccagac gctaggttat catcaccaat tgtattggac
    11641 caatgtgacc tagtcactag agcttgcggg ttatattcat catactccct taatccgcaa
    11701 ctacgcaact gtaaactccc gaaacatatc taccgtttga aatacgatgt aactgttacc
    11761 aagttcttga gtgatgtacc agtggcgaca ttgcccatag atttcatagt cccagttctt
    11821 ctcaaggcac tgtcaggcaa tggattctgt cctgttgagc cgcggtgcca acagttctta
    11881 gatgaaatca ttaagtacac aatgcaagat gctctcttct tgaaatatta tctcaaaaat
    11941 gtgggtgctc aagaagactg tgttgatgaa cactttcaag agaaaatctt atcttcaatt
    12001 cagggcaatg aatttttaca tcaaatgttt ttctggtatg atctggctat tttaactcga
    12061 aggggtagat taaatcgagg aaactctaga tcaacatggt ttgttcatga tgatttaata
    12121 gacatcttag gctatgggga ctatgttttt tggaagatcc caatttcaat gttaccactg
    12181 aacacacaag gaatccccca tgctgctatg gactggtatc aggcatcagt attcaaagaa
    12241 gcggttcaag ggcatacaca cattgtttct gtttctactg ccgacgtctt gataatgtgc
    12301 aaagatttaa ttacatgtcg attcaacaca actctaatct caaaaatagc agagattgag
    12361 gatccagttt gttctgatta tcccaatttt aagattgtgt ctatgcttta ccagagcgga
    12421 gattacttac tctccatatt agggtctgat gggtataaaa ttattaagtt cctcgaacca
    12481 ttgtgcttgg ccaaaattca attatgctca aagtacactg agaggaaggg ccgattctta
    12541 acacaaatgc atttagctgt aaatcacacc ctagaagaaa ttacagaaat gcgtgcacta
    12601 aagccttcac aggctcaaaa gatccgtgaa ttccatagaa cattgataag gctggagatg
    12661 acgccacaac aactttgtga gctattttcc attcaaaaac actgggggca tcctgtgcta
    12721 catagtgaaa cagcaatcca aaaagttaaa aaacatgcta cggtgctaaa agcattacgc
    12781 cctatagtga ttttcgagac atactgtgtt tttaaatata gtattgccaa acattatttt
    12841 gatagtcaag gatcttggta cagtgttact tcagatagga atctaacacc gggtcttaat
    12901 tcttatatca aaagaaatca attccctccg ttgccaatga ttaaagaact actatgggaa
    12961 ttttaccacc ttgaccaccc tccacttttc tcaaccaaaa ttattagtga cttaagtatt
    13021 tttataaaag acagagctac cgcagtagaa aggacatgct gggatgcagt attcgagcct
    13081 aatgttctag gatataatcc acctcacaaa tttagtacta aacgtgtacc ggaacaattt
    13141 ttagagcaag aaaacttttc tattgagaat gttctttcct acgcacaaaa actcgagtat
    13201 ctactaccac aatatcggaa cttttctttc tcattgaaag agaaagagtt gaatgtaggt
    13261 agaaccttcg gaaaattgcc ttatccgact cgcaatgttc aaacactttg tgaagctctg
    13321 ttagctgatg gtcttgctaa agcatttcct agcaatatga tggtagttac ggaacgtgag
    13381 caaaaagaaa gcttattgca tcaagcatca tggcaccaca caagtgatga ttttggtgaa
    13441 catgccacag ttagagggag tagctttgta actgatttag agaaatacaa tcttgcattt
    13501 agatatgagt ttacagcacc ttttatagaa tattgcaacc gttgctatgg tgttaagaat
    13561 gtttttaatt ggatgcatta tacaatccca cagtgttata tgcatgtcag tgattattat
    13621 aatccaccac ataacctcac actggagaat cgagacaacc cccccgaagg gcctagttca
    13681 tacaggggtc atatgggagg gattgaagga ctgcaacaaa aactctggac aagtatttca
    13741 tgtgctcaaa tttctttagt tgaaattaag actggtttta agttacgctc agctgtgatg
    13801 ggtgacaatc agtgcattac tgttttatca gtcttcccct tagagactga cgcagacgag
    13861 caggaacaga gcgccgaaga caatgcagcg agggtggccg ccagcctagc aaaagttaca
    13921 agtgcctgtg gaatcttttt aaaacctgat gaaacatttg tacattcagg ttttatctat
    13981 tttggaaaaa aacaatattt gaatggggtc caattgcctc agtcccttaa aacggctaca
    14041 agaatggcac cattgtctga tgcaattttt gatgatcttc aagggaccct ggctagtata
    14101 ggcactgctt ttgagcgatc catctctgag acacgacata tctttccttg caggataacc
    14161 gcagctttcc atacgttttt ttcggtgaga atcttgcaat atcatcatct cgggttcaat
    14221 aaaggttttg accttggaca gttaacactc ggcaaacctc tggatttcgg aacaatatca
    14281 ttggcactag cggtaccgca ggtgcttgga gggttatcct tcttgaatcc tgagaaatgt
    14341 ttctaccgga atctaggaga tccagttacc tcaggcttat tccagttaaa aacttatctc
    14401 cgaatgattg agatggatga tttattctta cctttaattg cgaagaaccc tgggaactgc
    14461 actgccattg actttgtgct aaatcctagc ggattaaatg tccctgggtc gcaagactta
    14521 acttcatttc tgcgccagat tgtacgcagg accatcaccc taagtgcgaa aaacaaactt
    14581 attaatacct tatttcatgc gtcagctgac ttcgaagacg aaatggtttg taaatggcta
    14641 ttatcatcaa ctcctgttat gagtcgtttt gcggccgata tcttttcacg cacgccgagc
    14701 gggaagcgat tgcaaattct aggatacctg gaaggaacac gcacattatt agcctctaag
    14761 atcatcaaca ataatacaga gacaccggtt ttggacagac tgaggaaaat aacattgcaa
    14821 aggtggagcc tatggtttag ttatcttgat cattgtgata atatcctggc ggaggcttta
    14881 acccaaataa cttgcacagt tgatttagca cagattctga gggaatattc atgggctcat
    14941 attttagagg gaagacctct tattggagcc acactcccat gtatgattga gcaattcaaa
    15001 gtgttttggc tgaaacccta cgaacaatgt ccgcagtgtt caaatgcaaa gcaaccaggt
    15061 gggaaaccat tcgtgtcagt ggcagtcaag aaacatattg ttagtgcatg gccgaacgca
    15121 tcccgaataa gctggactat cggggatgga atcccataca ttggatcaag gacagaagat
    15181 aagataggac aacctgctat taaaccaaaa tgtccttccg cagccttaag agaggccatt
    15241 gaattggcgt cccgtttaac atgggtaact caaggcagtt cgaacagtga cttgctaata
    15301 aaaccatttt tggaagcacg agtaaattta agtgttcaag aaatacttca aatgacccct
    15361 tcacattact caggaaatat tgttcacagg tacaacgatc aatacagtcc tcattctttc
    15421 atggccaatc gtatgagtaa ttcagcaacg cgattgattg tttctacaaa cactttaggt
    15481 gagttttcag gaggtggcca gtctgcacgc gacagcaata ttattttcca gaatgttata
    15541 aattatgcag ttgcactgtt cgatattaaa tttagaaaca ctgaggctac agatatccaa
    15601 tataatcgtg ctcaccttca tctaactaag tgttgcaccc gggaagtacc agctcagtat
    15661 ttaacataca catctacatt ggatttagat ttaacaagat accgagaaaa cgaattgatt
    15721 tatgacagta atcctctaaa aggaggactc aattgcaata tctcattcga taatccattt
    15781 ttccaaggta aacggctgaa cattatagaa gatgatctta ttcgactgcc tcacttatct
    15841 ggatgggagc tagccaagac catcatgcaa tcaattattt cagatagcaa caattcatct
    15901 acagacccaa ttagcagtgg agaaacaaga tcattcacta cccatttctt aacttatccc
    15961 aagataggac ttctgtacag ttttggggcc tttgtaagtt attatcttgg caatacaatt
    16021 cttcggacta agaaattaac acttgacaat tttttatatt acttaactac tcaaattcat
    16081 aatctaccac atcgctcatt gcgaatactt aagccaacat tcaaacatgc aagcgttatg
    16141 tcacggttaa tgagtattga tcctcatttt tctatttaca taggcggtgc tgcaggtgac
    16201 agaggactct cagatgcggc caggttattt ttgagaacgt ccatttcatc ttttcttaca
    16261 tttgtaaaag aatggataat taatcgcgga acaattgtcc ctttatggat agtatatccg
    16321 ctagagggtc aaaacccaac acctgtgaat aattttctct atcagatcgt agaactgctg
    16381 gtgcatgatt catcaagaca acaggctttt aaaactacca taagtgatca tgtacatcct
    16441 cacgacaatc ttgtttacac atgtaagagt acagccagca atttcttcca tgcatcattg
    16501 gcgtactgga ggagcagaca cagaaacagc aaccgaaaat acttggcaag agactcttca
    16561 actggatcaa gcacaaacaa cagtgatggt catattgaga gaagtcaaga acaaaccacc
    16621 agagatccac atgatggcac tgaacggaat ctagtcctac aaatgagcca tgaaataaaa
    16681 agaacgacaa ttccacaaga aaacacgcac cagggtccgt cgttccagtc ctttctaagt
    16741 gactctgctt gtggtacagc aaatccaaaa ctaaatttcg atcgatcgag acacaatgtg
    16801 aaatttcagg atcataactc ggcatccaag agggaaggtc atcaaataat ctcacaccgt
    16861 ctagtcctac ctttctttac attatctcaa gggacacgcc aattaacgtc atccaatgag
    16921 tcacaaaccc aagacgagat atcaaagtac ttacggcaat tgagatccgt cattgatacc
    16981 acagtttatt gtagatttac cggtatagtc tcgtccatgc attacaaact tgatgaggtc
    17041 ctttgggaaa tagagagttt caagtcggct gtgacgctag cagagggaga aggtgctggt
    17101 gccttactat tgattcagaa ataccaagtt aagaccttat ttttcaacac gctagctact
    17161 gagtccagta tagagtcaga aatagtatca ggaatgacta ctcctaggat gcttctacct
    17221 gttatgtcaa aattccataa tgaccaaatt gagattattc ttaacaactc agcaagccaa
    17281 ataacagaca taacaaatcc tacttggttt aaagaccaaa gagcaaggct acctaagcaa
    17341 gtcgaggtta taaccatgga tgcagagaca acagagaata taaacagatc gaaattgtac
    17401 gaagctgtat ataaattgat cttacaccat attgatccta gcgtattgaa agcagtggtc
    17461 cttaaagtct ttctaagtga tactgagggt atgttatggc taaatgataa tttagccccg
    17521 ttttttgcca ctggttattt aattaagcca ataacgtcaa gtgctagatc tagtgagtgg
    17581 tatctttgtc tgacgaactt cttatcaact acacgtaaga tgccacacca aaaccatctc
    17641 agttgtaaac aggtaatact tacggcattg caactgcaaa ttcaacgaag cccatactgg
    17701 ctaagtcatt taactcagta tgctgactgt gagttacatt taagttatat ccgccttggt
    17761 tttccatcat tagagaaagt actataccac aggtataacc tcgtcgattc aaaaagaggt
    17821 ccactagtct ctatcactca gcacttagca catcttagag cagagattcg agaattaact
    17881 aatgattata atcaacagcg acaaagtcgg actcaaacat atcactttat tcgtactgca
    17941 aaaggacgaa tcacaaaact agtcaatgat tatttaaaat tctttcttat tgtgcaagca
    18001 ttaaaacata atgggacatg gcaagctgag tttaagaaat taccagagtt gattagtgtg
    18061 tgcaataggt tctaccatat tagagattgc aattgtgaag aacgtttctt agttcaaacc
    18121 ttatatttac atagaatgca ggattctgaa gttaagctta tcgaaaggct gacagggctt
    18181 ctgagtttat ttccggatgg tctctacagg tttgattgaa ttaccgtgca tagtatcctg
    18241 atacttgcaa aggttggtta ttaacataca gattataaaa aactcataaa ttgctctcat
    18301 acatcatatt gatctaatct caataaacaa ctatttaaat aacgaaagga gtccctatat
    18361 tatatactat atttagcctc tctccctgcg tgataatcaa aaaattcaca atgcagcatg
    18421 tgtgacatat tactgccgca atgaatttaa cgcaacataa taaactctgc actctttata
    18481 attaagcttt aacgaaaggt ctgggctcat attgttattg atataataat gttgtatcaa
    18541 tatcctgtca gatggaatag tgttttggtt gataacacaa cttcttaaaa caaaattgat
    18601 ctttaagatt aagtttttta taattatcat tactttaatt tgtcgtttta aaaacggtga
    18661 tagccttaat ctttgtgtaa aataagagat taggtgtaat aaccttaaca tttttgtcta
    18721 gtaagctact atttcataca gaatgataaa attaaaagaa aaggcaggac tgtaaaatca
    18781 gaaatacctt ctttacaata tagcagacta gataataatc ttcgtgttaa tgataattaa
    18841 gacattgacc acgctcatca gaaggctcgc cagaataaac gttgcaaaaa ggattcctgg
    18901 aaaaatggtc gcacacaaaa atttaaaaat aaatctattt cttctttttt gtgtgtcca
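
Each line of the sequence listing above starts with the 1-based position of its first base, followed by up to six blocks of ten bases. The following is a minimal sketch, not part of the original record, of how such lines could be joined into a single string while checking that the printed offsets are consistent. The function name `parse_origin_lines` and its `first_offset` parameter are hypothetical and chosen only for illustration.

```python
def parse_origin_lines(lines, first_offset=1):
    """Join numbered sequence lines into one string.

    Each line is expected to look like '   61 agattaataa ttttcctctc ...'.
    `first_offset` is the position expected on the first line (an
    illustrative parameter, so that a fragment of a listing can be checked).
    """
    chunks = []
    next_offset = first_offset
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        start = int(fields[0])
        if start != next_offset:
            raise ValueError(f"expected offset {next_offset}, found {start}")
        chunk = "".join(fields[1:]).lower()
        chunks.append(chunk)
        next_offset += len(chunk)
    return "".join(chunks)


# First two lines and last line of the listing above.
first_lines = [
    "        1 cggacacaca aaaagaaaga agaattttta ggatcttttg tgtgcgaata actatgagga",
    "       61 agattaataa ttttcctctc attgaaattt atatcggaat ttaaattgaa attgttactg",
]
last_line = [
    "    18901 aaaaatggtc gcacacaaaa atttaaaaat aaatctattt cttctttttt gtgtgtcca",
]

head = parse_origin_lines(first_lines)
tail = parse_origin_lines(last_line, first_offset=18901)
print(len(head))               # 120
print(18901 + len(tail) - 1)   # 18959
```

The last line starts at position 18901 and holds 59 bases, so the listing ends at base 18959, which agrees with the length given in the LOCUS line of the record.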

 

//Ebola Genome

Zaire ebolavirus isolate Ebola virus H.sapiens-tc/COD/1976/Yambuku-Mayinga, complete genome. NCBI Reference Sequence: NC_002549.1. 

http://www.ncbi.nlm.nih.gov/nuccore/10313991?report=graph
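
The feature table of the record that follows gives each coding region (CDS) as a location such as `470..2689` or `join(6039..6923,6923..8068)`. As a rough sketch, assuming only simple `start..end` spans with no `complement()` operator, the length of a location can be computed by summing its spans. The helper `location_length` is a hypothetical name used for illustration and is not part of any GenBank tool.

```python
import re


def location_length(location):
    """Return the number of bases covered by a simple GenBank location.

    Only 'start..end' spans are handled, optionally wrapped in join(...).
    Positions are 1-based and inclusive, so a span covers end - start + 1 bases.
    """
    spans = re.findall(r"(\d+)\.\.(\d+)", location)
    if not spans:
        raise ValueError(f"no start..end span found in {location!r}")
    return sum(int(end) - int(start) + 1 for start, end in spans)


# CDS locations copied from the feature table of the record.
cds_locations = {
    "NP": "470..2689",
    "VP35": "3129..4151",
    "VP40": "4479..5459",
    "GP": "join(6039..6923,6923..8068)",
}

for gene, location in cds_locations.items():
    bases = location_length(location)
    codons, remainder = divmod(bases, 3)
    print(gene, bases, codons, remainder)
# NP 2220 740 0
# VP35 1023 341 0
# VP40 981 327 0
# GP 2031 677 0
```

Each length is a whole number of codons, including the stop codon. In the GP entry, base 6923 appears in both segments of the join and is therefore counted twice, which is how the record represents the extra A residue that its note says is inserted during transcription.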

LOCUS       NC_002549              18959 bp    cRNA    linear   VRL 27-AUG-2014
DEFINITION  Zaire ebolavirus isolate Ebola virus
            H.sapiens-tc/COD/1976/Yambuku-Mayinga, complete genome.
ACCESSION   NC_002549
VERSION     NC_002549.1  GI:10313991
DBLINK      BioProject: PRJNA14703
KEYWORDS    RefSeq.
SOURCE      Zaire ebolavirus (ZEBOV)
  ORGANISM  Zaire ebolavirus
            Viruses; ssRNA negative-strand viruses; Mononegavirales;
            Filoviridae; Ebolavirus.
REFERENCE   1  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Volchkova,V.A., Chepurnov,A.A., Blinov,V.M.,
            Dolnik,O., Netesov,S.V. and Feldmann,H.
  TITLE     Characterization of the L gene and 5' trailer region of Ebola virus
  JOURNAL   J. Gen. Virol. 80 (Pt 2), 355-362 (1999)
   PUBMED   10073695
REFERENCE   2  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Volchkova,V.A., Slenczka,W., Klenk,H.D. and
            Feldmann,H.
  TITLE     Release of viral glycoproteins during Ebola virus infection
  JOURNAL   Virology 245 (1), 110-119 (1998)
   PUBMED   9614872
REFERENCE   3  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Feldmann,H., Volchkova,V.A. and Klenk,H.D.
  TITLE     Processing of the Ebola virus glycoprotein by the proprotein
            convertase furin
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 95 (10), 5762-5767 (1998)
   PUBMED   9576958
REFERENCE   4  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E., Becker,S., Volchkova,V.A., Ternovoj,V.A.,
            Kotov,A.N., Netesov,S.V. and Klenk,H.D.
  TITLE     GP mRNA of Ebola virus is edited by the Ebola virus polymerase and
            by T7 and vaccinia virus polymerases
  JOURNAL   Virology 214 (2), 421-430 (1995)
   PUBMED   8553543
REFERENCE   5  (bases 1 to 18959)
  AUTHORS   Bukreyev,A.A., Volchkov,V.E., Blinov,V.M. and Netesov,S.V.
  TITLE     The VP35 and VP40 proteins of filoviruses. Homology between Marburg
            and Ebola viruses
  JOURNAL   FEBS Lett. 322 (1), 41-46 (1993)
   PUBMED   8482365
REFERENCE   6  (bases 1 to 18959)
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (27-SEP-2000) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   7  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (02-JUN-2000) Institute of Virology, Philipps-University
            Marburg, Robert-Koch-Str. 17, Marburg 35037, Germany
  REMARK    Sequence update by submitter
REFERENCE   8  (bases 1 to 18959)
  AUTHORS   Volchkov,V.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-AUG-1998) Institute of Virology, Philipps-University
            Marburg, Robert-Koch-Str. 17, Marburg 35037, Germany
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence is identical to AF086833.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..18959

                     /organism="Zaire ebolavirus"

                     /mol_type="viral cRNA"

                     /isolate="Ebola virus

                     H.sapiens-tc/COD/1976/Yambuku-Mayinga"

                     /db_xref="taxon:186538"

     5'UTR           1..55

                     /note="putative leader region"

                     /citation=[1]

                     /function="regulation or initiation of RNA replication"

     gene            56..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /db_xref="GeneID:911830"

     mRNA            56..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /product="nucleoprotein"

                     /db_xref="GeneID:911830"

     misc_signal     56..67

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /note="putative; transcription start signal"

                     /citation=[1]

     CDS             470..2689

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /function="encapsidation of genomic RNA"

                     /codon_start=1

                     /product="nucleoprotein"

                     /protein_id="NP_066243.1"

                     /db_xref="GI:10314000"

                     /db_xref="GeneID:911830"

                     /translation="MDSRPQKIWMAPSLTESDMDYHKILTAGLSVQQGIVRQRVIPVY

                     QVNNLEEICQLIIQAFEAGVDFQESADSFLLMLCLHHAYQGDYKLFLESGAVKYLEGH

                     GFRFEVKKRDGVKRLEELLPAVSSGKNIKRTLAAMPEEETTEANAGQFLSFASLFLPK

                     LVVGEKACLEKVQRQIQVHAEQGLIQYPTAWQSVGHMMVIFRLMRTNFLIKFLLIHQG

                     MHMVAGHDANDAVISNSVAQARFSGLLIVKTVLDHILQKTERGVRLHPLARTAKVKNE

                     VNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLFPQLSAIALGVATAHGSTLAGV

                     NVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKKNEISFQQTN

                     AMVTLRKERLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQ

                     DTTIPDVVVDPDDGSYGEYQSYSENGMNAPDDLVLFDLDEDDEDTKPVPNRSTKGGQQ

                     KNSQKGQHIEGRQTQSRPIQNVPGPHRTIHHASAPLTDNDRRNEPSGSTSPRMLTPIN

                     EEADPLDDADDETSSLPPLESDDEEQDRDGTSNRTPTVAPPAPVYRDHSEKKELPQDE

                     QQDQDHTQEARNQDSDNTQSEHSFEEMYRHILRSQGPFDAVLYYHMMKDEPVVFSTSD

                     GKEYTYPDSLEEEYPPWLTEKEAMNEENRFVTLDGQQFYWPVMNHKNKFMAILQHHQ"

     misc_feature    524..2671

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

                     /note="Ebola nucleoprotein; Region: Ebola_NP; pfam05505"

                     /db_xref="CDD:147601"

     polyA_signal    3015..3026

                     /gene="NP"

                     /locus_tag="ZEBOVgp1"

     misc_feature    3027..3031

                     /note="intergenic region"

     gene            3032..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /db_xref="GeneID:911827"

     mRNA            3032..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /product="VP35"

                     /citation=[5]

                     /db_xref="GeneID:911827"

     misc_signal     3032..3043

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /note="putative; transcription start signal"

                     /citation=[5]

     CDS             3129..4151

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /function="polymerase complex protein"

                     /citation=[5]

                     /codon_start=1

                     /product="polymerase complex protein"

                     /protein_id="NP_066244.1"

                     /db_xref="GI:10313992"

                     /db_xref="GeneID:911827"

                     /translation="MTTRTKGRGHTAATTQNDRMPGPELSGWISEQLMTGRIPVSDIF

                     CDIENNPGLCYASQMQQTKPNPKTRNSQTQTDPICNHSFEEVVQTLASLATVVQQQTI

                     ASESLEQRITSLENGLKPVYDMAKTISSLNRVCAEMVAKYDLLVMTTGRATATAAATE

                     AYWAEHGQPPPGPSLYEESAIRGKIESRDETVPQSVREAFNNLNSTTSLTEENFGKPD

                     ISAKDLRNIMYDHLPGFGTAFHQLVQVICKLGKDSNSLDIIHAEFQASLAEGDSPQCA

                     LIQITKRVPIFQDAAPPVIHIRSRGDIPRACQKSLRPVPPSPKIDRGWVCVFQLQDGK

                     TLGLKI"

     misc_feature    3186..4148

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /note="Filoviridae VP35; Region: Filo_VP35; pfam02097"

                     /db_xref="CDD:145320"

     gene            4390..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /db_xref="GeneID:911825"

     mRNA            4390..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /product="VP40"

                     /citation=[5]

                     /db_xref="GeneID:911825"

     misc_signal     4390..4401

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /note="transcription start signal"

                     /citation=[5]

     polyA_signal    4397..4407

                     /gene="VP35"

                     /locus_tag="ZEBOVgp2"

                     /citation=[5]

     CDS             4479..5459

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /citation=[5]

                     /codon_start=1

                     /product="matrix protein"

                     /protein_id="NP_066245.1"

                     /db_xref="GI:10313993"

                     /db_xref="GeneID:911825"

                     /translation="MRRVILPTAPPEYMEAIYPVRSNSTIARGGNSNTGFLTPESVNG

                     DTPSNPLRPIADDTIDHASHTPGSVSSAFILEAMVNVISGPKVLMKQIPIWLPLGVAD

                     QKTYSFDSTTAAIMLASYTITHFGKATNPLVRVNRLGPGIPDHPLRLLRIGNQAFLQE

                     FVLPPVQLPQYFTFDLTALKLITQPLPAATWTDDTPTGSNGALRPGISFHPKLRPILL

                     PNKSGKKGNSADLTSPEKIQAIMTSLQDFKIVPIDPTKNIMGIEVPETLVHKLTGKKV

                     TSKNGQPIIPVLLPKYIGLDPVAPGDLTMVITQDCDTCHSPASLPAVIEK"

     misc_feature    4479..5363

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /note="Matrix protein VP40; Region: VP40; pfam07447"

                     /db_xref="CDD:116068"

     polyA_signal    5883..5894

                     /gene="VP40"

                     /locus_tag="ZEBOVgp3"

                     /citation=[5]

     misc_feature    5895..5899

                     /note="intergenic region"

     gene            5900..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /db_xref="GeneID:911829"

     mRNA            5900..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /product="sGP"

                     /note="unedited mRNA"

                     /citation=[4]

                     /db_xref="GeneID:911829"

     misc_signal     5900..5911

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="putative; transcription start signal"

                     /citation=[4]

     CDS             join(6039..6923,6923..8068)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /function="receptor binding and fusion"

                     /artificial_location="low-quality sequence region"

                     /note="virion spike glycoprotein precursor; an additional A

                     residue is inserted during transcription; encodes two

                     disulfide linked subunits GP1 and GP2"

                     /citation=[2]

                     /citation=[3]

                     /citation=[4]

                     /codon_start=1

                     /product="spike glycoprotein"

                     /protein_id="NP_066246.1"

                     /db_xref="GI:10313995"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKNLTRKIRSEELSFTVVSNGAKNISGQSPARTSSDPGTNT

                     TTEDHKIMASENSSAMVQVHSQGREAAVSHLTTLATISTSPQSLTTKPGPDNSTHNTP

                     VYKLDISEATQVEQHHRRTDNDSTASDTPSATTAAGPPKAENTNTSKSTDFLDPATTT

                     SPQNHSETAGNNNTHHQDTGEESASSGKLGLITNTIAGVAGLITGGRRTRREAIVNAQ

                     PKCNPNLHYWTTQDEGAAIGLAWIPYFGPAAEGIYIEGLMHNQDGLICGLRQLANETT

                     QALQLFLRATTELRTFSILNRKAIDFLLQRWGGTCHILGPDCCIEPHDWTKNITDKID

                     QIIHDFVDKTLPDQGDNDNWWTGWRQWIPAGIGVTGVIIAVIALFCICKFVF"

     misc_feature    7529..7540

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="encodes the glycoprotein cleavage site, precursor

                     GP is cleaved by subtilisin-like cellular protease furin

                     into subunits GP1 and GP2 that are linked by a disulfide

                     bond"

                     /citation=[3]

     misc_feature    7793..7870

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="immunosuppressive motif; other site"

     misc_feature    7988..8053

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="transmembrane anchor; transmembrane region"

     misc_feature    7706..7924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="heptad repeat 1-heptad repeat 2 region of the

                     transmembrane subunit of Filoviridae viruses, Ebola virus

                     and Marburg virus, and related domains; Region:

                     Ebola-like_HR1-HR2; cd09850"

                     /db_xref="CDD:197367"

     misc_feature    join(6081..6923,6923..7153)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     misc_feature    7706..7732

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1A; other site"

                     /db_xref="CDD:197367"

     misc_feature    7733..7762

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1B; other site"

                     /db_xref="CDD:197367"

     misc_feature    7763..7783

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1C; other site"

                     /db_xref="CDD:197367"

     misc_feature    7784..7831

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1D; other site"

                     /db_xref="CDD:197367"

     misc_feature    7787..7837

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="immunosuppressive region; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7838..7858,7859..7861)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="CX(6,7)C motif; other site"

                     /db_xref="CDD:197367"

     misc_feature    7886..7924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR2; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7784..7786,7793..7795)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Cl binding site [ion binding]; other site"

                     /db_xref="CDD:197367"

     misc_feature    order(7706..7714,7718..7723,7727..7732,7736..7744,

                     7748..7756,7760..7765,7769..7777,7781..7807,7811..7819,

                     7823..7828,7844..7849,7856..7858,7865..7876,7880..7882,

                     7889..7894,7901..7903,7910..7915,7922..7924)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="homotrimer interface [polypeptide binding]; other

                     site"

                     /db_xref="CDD:197367"

     misc_feature    order(7706..7714,7718..7726,7730..7735,7739..7747,

                     7754..7768,7772..7783,7787..7792,7796..7804,7808..7813,

                     7817..7819)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="HR1-GP1 interface [polypeptide binding]; other

                     site"

                     /db_xref="CDD:197367"

     CDS             6039..7133

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="sGP, small non-structural, secreted glycoprotein;

                     sGP secreted as an anti-parallel oriented homodimer"

                     /citation=[4]

                     /codon_start=1

                     /product="small secreted glycoprotein"

                     /protein_id="NP_066247.1"

                     /db_xref="GI:10313994"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKTSLEKFAVKSCLSQLYQTEPKTSVVRVRRELLPTQGPTQ

                     QLKTTKSWLQKIPLQWFKCTVKEGKLQCRI"

     misc_feature    6081..7130

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     CDS             join(6039..6922,6924..6933)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /artificial_location="low-quality sequence region"

                     /note="ssGP; second non-structural secreted glycoprotein;

                     secreted in a monomeric form; one A residue is deleted or

                     two additional A residues are inserted at the editing site

                     during transcription of the GP gene"

                     /citation=[4]

                     /codon_start=1

                     /product="second secreted glycoprotein"

                     /protein_id="NP_066248.1"

                     /db_xref="GI:10313996"

                     /db_xref="GeneID:911829"

                     /translation="MGVTGILQLPRDRFKRTSFFLWVIILFQRTFSIPLGVIHNSTLQ

                     VSDVDKLVCRDKLSSTNQLRSVGLNLEGNGVATDVPSATKRWGFRSGVPPKVVNYEAG

                     EWAENCYNLEIKKPDGSECLPAAPDGIRGFPRCRYVHKVSGTGPCAGDFAFHKEGAFF

                     LYDRLASTVIYRGTTFAEGVVAFLILPQAKKDFFSSHPLREPVNATEDPSSGYYSTTI

                     RYQATGFGTNETEYLFEVDNLTYVQLESRFTPQFLLQLNETIYTSGKRSNTTGKLIWK

                     VNPEIDTTIGEWAFWETKKPH"

     misc_feature    join(6081..6922,6924..>6924)

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="Filovirus glycoprotein; Region: Filo_glycop;

                     pfam01611"

                     /db_xref="CDD:110602"

     misc_signal     6918..6924

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /note="additional A residues are inserted or deleted

                     during transcription of the GP gene by the viral

                     polymerase"

                     /citation=[4]

                     /function="RNA editing"

     gene            8288..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /db_xref="GeneID:911826"

     mRNA            8288..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /product="VP30"

                     /db_xref="GeneID:911826"

     misc_signal     8288..8299

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="putative; transcription start signal"

     polyA_signal    8295..8305

                     /gene="GP"

                     /locus_tag="ZEBOVgp4"

                     /citation=[4]

     CDS             8509..9375

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="polymerase complex protein"

                     /codon_start=1

                     /product="minor nucleoprotein"

                     /protein_id="NP_066249.1"

                     /db_xref="GI:10313997"

                     /db_xref="GeneID:911826"

                     /translation="MEASYERGRPRAARQHSRDGHDHHVRARSSSRENYRGEYRQSRS

                     ASQVRVPTVFHKKRVEPLTVPPAPKDICPTLKKGFLCDSSFCKKDHQLESLTDRELLL

                     LIARKTCGSVEQQLNITAPKDSRLANPTADDFQQEEGPKITLLTLIKTAEHWARQDIR

                     TIEDSKLRALLTLCAVMTRKFSKSQLSLLCETHLRREGLGQDQAEPVLEVYQRLHSDK

                     GGSFEAALWQQWDRQSLIMFITAFLNIALQLPCESSAVVVSGLRTLVPQSDNEEASTN

                     PGTCSWSDEGTP"

     misc_feature    8932..9321

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="Ebola virus-specific transcription factor VP30;

                     Region: Transcript_VP30; pfam11507"

                     /db_xref="CDD:151944"

     polyA_signal    9730..9740

                     /gene="VP30"

                     /locus_tag="ZEBOVgp5"

                     /note="putative"

     misc_feature    9741..9884

                     /note="intergenic region"

     gene            9885..11518

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="putative"

                     /db_xref="GeneID:911828"

     mRNA            9885..11496

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /product="VP24"

                     /db_xref="GeneID:911828"

     misc_signal     9885..9896

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="transcription start signal"

     CDS             10345..11100

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /codon_start=1

                     /product="membrane-associated protein"

                     /protein_id="NP_066250.1"

                     /db_xref="GI:10313998"

                     /db_xref="GeneID:911828"

                     /translation="MAKATGRYNLISPKKDLEKGVVLSDLCNFLVSQTIQGWKVYWAG

                     IEFDVTHKGMALLHRLKTNDFAPAWSMTRNLFPHLFQNPNSTIESPLWALRVILAAGI

                     QDQLIDQSLIEPLAGALGLISDWLLTTNTNHFNMRTQRVKEQLSLKMLSLIRSNILKF

                     INKLDALHVVNYNGLLSSIEIGTQNHTIIITRTNMGFLVELQEPDKSAMNRMKPGPAK

                     FSLLHESTLKAFTQGSSTRMQSLILEFNSSLAI"

     misc_feature    10369..11040

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="Filovirus membrane-associated protein VP24; Region:

                     Filo_VP24; pfam06389"

                     /db_xref="CDD:253701"

     polyA_signal    11485..11496

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="putative"

     misc_feature    11497..11500

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

                     /note="intergenic region"

     gene            11501..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /db_xref="GeneID:911824"

     mRNA            11501..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /product="polymerase"

                     /citation=[1]

                     /db_xref="GeneID:911824"

     misc_signal     11501..11512

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="transcription start signal"

                     /citation=[1]

     polyA_signal    11508..11518

                     /gene="VP24"

                     /locus_tag="ZEBOVgp6"

     CDS             11581..18219

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /function="synthesis of viral RNAs; transcriptional RNA

                     editing"

                     /note="polymerase"

                     /citation=[1]

                     /codon_start=1

                     /product="RNA-dependent RNA polymerase"

                     /protein_id="NP_066251.1"

                     /db_xref="GI:10313999"

                     /db_xref="GeneID:911824"

                     /translation="MATQHTQYPDARLSSPIVLDQCDLVTRACGLYSSYSLNPQLRNC

                     KLPKHIYRLKYDVTVTKFLSDVPVATLPIDFIVPVLLKALSGNGFCPVEPRCQQFLDE

                     IIKYTMQDALFLKYYLKNVGAQEDCVDEHFQEKILSSIQGNEFLHQMFFWYDLAILTR

                     RGRLNRGNSRSTWFVHDDLIDILGYGDYVFWKIPISMLPLNTQGIPHAAMDWYQASVF

                     KEAVQGHTHIVSVSTADVLIMCKDLITCRFNTTLISKIAEIEDPVCSDYPNFKIVSML

                     YQSGDYLLSILGSDGYKIIKFLEPLCLAKIQLCSKYTERKGRFLTQMHLAVNHTLEEI

                     TEMRALKPSQAQKIREFHRTLIRLEMTPQQLCELFSIQKHWGHPVLHSETAIQKVKKH

                     ATVLKALRPIVIFETYCVFKYSIAKHYFDSQGSWYSVTSDRNLTPGLNSYIKRNQFPP

                     LPMIKELLWEFYHLDHPPLFSTKIISDLSIFIKDRATAVERTCWDAVFEPNVLGYNPP

                     HKFSTKRVPEQFLEQENFSIENVLSYAQKLEYLLPQYRNFSFSLKEKELNVGRTFGKL

                     PYPTRNVQTLCEALLADGLAKAFPSNMMVVTEREQKESLLHQASWHHTSDDFGEHATV

                     RGSSFVTDLEKYNLAFRYEFTAPFIEYCNRCYGVKNVFNWMHYTIPQCYMHVSDYYNP

                     PHNLTLENRDNPPEGPSSYRGHMGGIEGLQQKLWTSISCAQISLVEIKTGFKLRSAVM

                     GDNQCITVLSVFPLETDADEQEQSAEDNAARVAASLAKVTSACGIFLKPDETFVHSGF

                     IYFGKKQYLNGVQLPQSLKTATRMAPLSDAIFDDLQGTLASIGTAFERSISETRHIFP

                     CRITAAFHTFFSVRILQYHHLGFNKGFDLGQLTLGKPLDFGTISLALAVPQVLGGLSF

                     LNPEKCFYRNLGDPVTSGLFQLKTYLRMIEMDDLFLPLIAKNPGNCTAIDFVLNPSGL

                     NVPGSQDLTSFLRQIVRRTITLSAKNKLINTLFHASADFEDEMVCKWLLSSTPVMSRF

                     AADIFSRTPSGKRLQILGYLEGTRTLLASKIINNNTETPVLDRLRKITLQRWSLWFSY

                     LDHCDNILAEALTQITCTVDLAQILREYSWAHILEGRPLIGATLPCMIEQFKVFWLKP

                     YEQCPQCSNAKQPGGKPFVSVAVKKHIVSAWPNASRISWTIGDGIPYIGSRTEDKIGQ

                     PAIKPKCPSAALREAIELASRLTWVTQGSSNSDLLIKPFLEARVNLSVQEILQMTPSH

                     YSGNIVHRYNDQYSPHSFMANRMSNSATRLIVSTNTLGEFSGGGQSARDSNIIFQNVI

                     NYAVALFDIKFRNTEATDIQYNRAHLHLTKCCTREVPAQYLTYTSTLDLDLTRYRENE

                     LIYDSNPLKGGLNCNISFDNPFFQGKRLNIIEDDLIRLPHLSGWELAKTIMQSIISDS

                     NNSSTDPISSGETRSFTTHFLTYPKIGLLYSFGAFVSYYLGNTILRTKKLTLDNFLYY

                     LTTQIHNLPHRSLRILKPTFKHASVMSRLMSIDPHFSIYIGGAAGDRGLSDAARLFLR

                     TSISSFLTFVKEWIINRGTIVPLWIVYPLEGQNPTPVNNFLYQIVELLVHDSSRQQAF

                     KTTISDHVHPHDNLVYTCKSTASNFFHASLAYWRSRHRNSNRKYLARDSSTGSSTNNS

                     DGHIERSQEQTTRDPHDGTERNLVLQMSHEIKRTTIPQENTHQGPSFQSFLSDSACGT

                     ANPKLNFDRSRHNVKFQDHNSASKREGHQIISHRLVLPFFTLSQGTRQLTSSNESQTQ

                     DEISKYLRQLRSVIDTTVYCRFTGIVSSMHYKLDEVLWEIESFKSAVTLAEGEGAGAL

                     LLIQKYQVKTLFFNTLATESSIESEIVSGMTTPRMLLPVMSKFHNDQIEIILNNSASQ

                     ITDITNPTWFKDQRARLPKQVEVITMDAETTENINRSKLYEAVYKLILHHIDPSVLKA

                     VVLKVFLSDTEGMLWLNDNLAPFFATGYLIKPITSSARSSEWYLCLTNFLSTTRKMPH

                     QNHLSCKQVILTALQLQIQRSPYWLSHLTQYADCELHLSYIRLGFPSLEKVLYHRYNL

                     VDSKRGPLVSITQHLAHLRAEIRELTNDYNQQRQSRTQTYHFIRTAKGRITKLVNDYL

                     KFFLIVQALKHNGTWQAEFKKLPELISVCNRFYHIRDCNCEERFLVQTLYLHRMQDSE

                     VKLIERLTGLLSLFPDGLYRFD"

     misc_feature    11608..14853

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="Mononegavirales RNA dependent RNA polymerase;

                     Region: Mononeg_RNA_pol; pfam00946"

                     /db_xref="CDD:250248"

     misc_feature    15223..18192

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /note="mRNA capping enzyme, paramyxovirus family; Region:

                     paramyx_RNAcap; TIGR04198"

                     /db_xref="CDD:234496"

     polyA_signal    18272..18282

                     /gene="L"

                     /locus_tag="ZEBOVgp7"

                     /citation=[1]

     3'UTR           18283..18959

                     /note="putative trailer region"

                     /citation=[1]

                     /function="regulation or initiation of RNA replication"

ORIGIN      
        1 cggacacaca aaaagaaaga agaattttta ggatcttttg tgtgcgaata actatgagga
       61 agattaataa ttttcctctc attgaaattt atatcggaat ttaaattgaa attgttactg
      121 taatcacacc tggtttgttt cagagccaca tcacaaagat agagaacaac ctaggtctcc
      181 gaagggagca agggcatcag tgtgctcagt tgaaaatccc ttgtcaacac ctaggtctta
      241 tcacatcaca agttccacct cagactctgc agggtgatcc aacaacctta atagaaacat
      301 tattgttaaa ggacagcatt agttcacagt caaacaagca agattgagaa ttaaccttgg
      361 ttttgaactt gaacacttag gggattgaag attcaacaac cctaaagctt ggggtaaaac
      421 attggaaata gttaaaagac aaattgctcg gaatcacaaa attccgagta tggattctcg
      481 tcctcagaaa atctggatgg cgccgagtct cactgaatct gacatggatt accacaagat
      541 cttgacagca ggtctgtccg ttcaacaggg gattgttcgg caaagagtca tcccagtgta
      601 tcaagtaaac aatcttgaag aaatttgcca acttatcata caggcctttg aagcaggtgt
      661 tgattttcaa gagagtgcgg acagtttcct tctcatgctt tgtcttcatc atgcgtacca
      721 gggagattac aaacttttct tggaaagtgg cgcagtcaag tatttggaag ggcacgggtt
      781 ccgttttgaa gtcaagaagc gtgatggagt gaagcgcctt gaggaattgc tgccagcagt
      841 atctagtgga aaaaacatta agagaacact tgctgccatg ccggaagagg agacaactga
      901 agctaatgcc ggtcagtttc tctcctttgc aagtctattc cttccgaaat tggtagtagg
      961 agaaaaggct tgccttgaga aggttcaaag gcaaattcaa gtacatgcag agcaaggact
     1021 gatacaatat ccaacagctt ggcaatcagt aggacacatg atggtgattt tccgtttgat
     1081 gcgaacaaat tttctgatca aatttctcct aatacaccaa gggatgcaca tggttgccgg
     1141 gcatgatgcc aacgatgctg tgatttcaaa ttcagtggct caagctcgtt tttcaggctt
     1201 attgattgtc aaaacagtac ttgatcatat cctacaaaag acagaacgag gagttcgtct
     1261 ccatcctctt gcaaggaccg ccaaggtaaa aaatgaggtg aactccttta aggctgcact
     1321 cagctccctg gccaagcatg gagagtatgc tcctttcgcc cgacttttga acctttctgg
     1381 agtaaataat cttgagcatg gtcttttccc tcaactatcg gcaattgcac tcggagtcgc
     1441 cacagcacac gggagtaccc tcgcaggagt aaatgttgga gaacagtatc aacaactcag
     1501 agaggctgcc actgaggctg agaagcaact ccaacaatat gcagagtctc gcgaacttga
     1561 ccatcttgga cttgatgatc aggaaaagaa aattcttatg aacttccatc agaaaaagaa
     1621 cgaaatcagc ttccagcaaa caaacgctat ggtaactcta agaaaagagc gcctggccaa
     1681 gctgacagaa gctatcactg ctgcgtcact gcccaaaaca agtggacatt acgatgatga
     1741 tgacgacatt ccctttccag gacccatcaa tgatgacgac aatcctggcc atcaagatga
     1801 tgatccgact gactcacagg atacgaccat tcccgatgtg gtggttgatc ccgatgatgg
     1861 aagctacggc gaataccaga gttactcgga aaacggcatg aatgcaccag atgacttggt
     1921 cctattcgat ctagacgagg acgacgagga cactaagcca gtgcctaata gatcgaccaa
     1981 gggtggacaa cagaagaaca gtcaaaaggg ccagcatata gagggcagac agacacaatc
     2041 caggccaatt caaaatgtcc caggccctca cagaacaatc caccacgcca gtgcgccact
     2101 cacggacaat gacagaagaa atgaaccctc cggctcaacc agccctcgca tgctgacacc
     2161 aattaacgaa gaggcagacc cactggacga tgccgacgac gagacgtcta gccttccgcc
     2221 cttggagtca gatgatgaag agcaggacag ggacggaact tccaaccgca cacccactgt
     2281 cgccccaccg gctcccgtat acagagatca ctctgaaaag aaagaactcc cgcaagacga
     2341 gcaacaagat caggaccaca ctcaagaggc caggaaccag gacagtgaca acacccagtc
     2401 agaacactct tttgaggaga tgtatcgcca cattctaaga tcacaggggc catttgatgc
     2461 tgttttgtat tatcatatga tgaaggatga gcctgtagtt ttcagtacca gtgatggcaa
     2521 agagtacacg tatccagact cccttgaaga ggaatatcca ccatggctca ctgaaaaaga
     2581 ggctatgaat gaagagaata gatttgttac attggatggt caacaatttt attggccggt
     2641 gatgaatcac aagaataaat tcatggcaat cctgcaacat catcagtgaa tgagcatgga
     2701 acaatgggat gattcaaccg acaaatagct aacattaagt agtcaaggaa cgaaaacagg
     2761 aagaattttt gatgtctaag gtgtgaatta ttatcacaat aaaagtgatt cttatttttg
     2821 aatttaaagc tagcttatta ttactagccg tttttcaaag ttcaatttga gtcttaatgc
     2881 aaataggcgt taagccacag ttatagccat aattgtaact caatattcta actagcgatt
     2941 tatctaaatt aaattacatt atgcttttat aacttaccta ctagcctgcc caacatttac
     3001 acgatcgttt tataattaag aaaaaactaa tgatgaagat taaaaccttc atcatcctta
     3061 cgtcaattga attctctagc actcgaagct tattgtcttc aatgtaaaag aaaagctggt
     3121 ctaacaagat gacaactaga acaaagggca ggggccatac tgcggccacg actcaaaacg
     3181 acagaatgcc aggccctgag ctttcgggct ggatctctga gcagctaatg accggaagaa
     3241 ttcctgtaag cgacatcttc tgtgatattg agaacaatcc aggattatgc tacgcatccc
     3301 aaatgcaaca aacgaagcca aacccgaaga cgcgcaacag tcaaacccaa acggacccaa
     3361 tttgcaatca tagttttgag gaggtagtac aaacattggc ttcattggct actgttgtgc
     3421 aacaacaaac catcgcatca gaatcattag aacaacgcat tacgagtctt gagaatggtc
     3481 taaagccagt ttatgatatg gcaaaaacaa tctcctcatt gaacagggtt tgtgctgaga
     3541 tggttgcaaa atatgatctt ctggtgatga caaccggtcg ggcaacagca accgctgcgg
     3601 caactgaggc ttattgggcc gaacatggtc aaccaccacc tggaccatca ctttatgaag
     3661 aaagtgcgat tcggggtaag attgaatcta gagatgagac cgtccctcaa agtgttaggg
     3721 aggcattcaa caatctaaac agtaccactt cactaactga ggaaaatttt gggaaacctg
     3781 acatttcggc aaaggatttg agaaacatta tgtatgatca cttgcctggt tttggaactg
     3841 ctttccacca attagtacaa gtgatttgta aattgggaaa agatagcaac tcattggaca
     3901 tcattcatgc tgagttccag gccagcctgg ctgaaggaga ctctcctcaa tgtgccctaa
     3961 ttcaaattac aaaaagagtt ccaatcttcc aagatgctgc tccacctgtc atccacatcc
     4021 gctctcgagg tgacattccc cgagcttgcc agaaaagctt gcgtccagtc ccaccatcgc
     4081 ccaagattga tcgaggttgg gtatgtgttt ttcagcttca agatggtaaa acacttggac
     4141 tcaaaatttg agccaatctc ccttccctcc gaaagaggcg aataatagca gaggcttcaa
     4201 ctgctgaact atagggtacg ttacattaat gatacacttg tgagtatcag ccctggataa
     4261 tataagtcaa ttaaacgacc aagataaaat tgttcatatc tcgctagcag cttaaaatat
     4321 aaatgtaata ggagctatat ctctgacagt attataatca attgttatta agtaacccaa
     4381 accaaaagtg atgaagatta agaaaaacct acctcggctg agagagtgtt ttttcattaa
     4441 ccttcatctt gtaaacgttg agcaaaattg ttaaaaatat gaggcgggtt atattgccta
     4501 ctgctcctcc tgaatatatg gaggccatat accctgtcag gtcaaattca acaattgcta
     4561 gaggtggcaa cagcaataca ggcttcctga caccggagtc agtcaatggg gacactccat
     4621 cgaatccact caggccaatt gccgatgaca ccatcgacca tgccagccac acaccaggca
     4681 gtgtgtcatc agcattcatc cttgaagcta tggtgaatgt catatcgggc cccaaagtgc
     4741 taatgaagca aattccaatt tggcttcctc taggtgtcgc tgatcaaaag acctacagct
     4801 ttgactcaac tacggccgcc atcatgcttg cttcatacac tatcacccat ttcggcaagg
     4861 caaccaatcc acttgtcaga gtcaatcggc tgggtcctgg aatcccggat catcccctca
     4921 ggctcctgcg aattggaaac caggctttcc tccaggagtt cgttcttccg ccagtccaac
     4981 taccccagta tttcaccttt gatttgacag cactcaaact gatcacccaa ccactgcctg
     5041 ctgcaacatg gaccgatgac actccaacag gatcaaatgg agcgttgcgt ccaggaattt
     5101 catttcatcc aaaacttcgc cccattcttt tacccaacaa aagtgggaag aaggggaaca
     5161 gtgccgatct aacatctccg gagaaaatcc aagcaataat gacttcactc caggacttta
     5221 agatcgttcc aattgatcca accaaaaata tcatgggaat cgaagtgcca gaaactctgg
     5281 tccacaagct gaccggtaag aaggtgactt ctaaaaatgg acaaccaatc atccctgttc
     5341 ttttgccaaa gtacattggg ttggacccgg tggctccagg agacctcacc atggtaatca
     5401 cacaggattg tgacacgtgt cattctcctg caagtcttcc agctgtgatt gagaagtaat
     5461 tgcaataatt gactcagatc cagttttata gaatcttctc agggatagtg ataacatcta
     5521 tttagtaatc cgtccattag aggagacact tttaattgat caatatacta aaggtgcttt
     5581 acaccattgt cttttttctc tcctaaatgt agaacttaac aaaagactca taatatactt
     5641 gtttttaaag gattgattga tgaaagatca taactaataa cattacaaat aatcctacta
     5701 taatcaatac ggtgattcaa atgttaatct ttctcattgc acatactttt tgcccttatc
     5761 ctcaaattgc ctgcatgctt acatctgagg atagccagtg tgacttggat tggaaatgtg
     5821 gagaaaaaat cgggacccat ttctaggttg ttcacaatcc aagtacagac attgcccttc
     5881 taattaagaa aaaatcggcg atgaagatta agccgacagt gagcgtaatc ttcatctctc
     5941 ttagattatt tgttttccag agtaggggtc gtcaggtcct tttcaatcgt gtaaccaaaa
     6001 taaactccac tagaaggata ttgtggggca acaacacaat gggcgttaca ggaatattgc
     6061 agttacctcg tgatcgattc aagaggacat cattctttct ttgggtaatt atccttttcc
     6121 aaagaacatt ttccatccca cttggagtca tccacaatag cacattacag gttagtgatg
     6181 tcgacaaact agtttgtcgt gacaaactgt catccacaaa tcaattgaga tcagttggac
     6241 tgaatctcga agggaatgga gtggcaactg acgtgccatc tgcaactaaa agatggggct
     6301 tcaggtccgg tgtcccacca aaggtggtca attatgaagc tggtgaatgg gctgaaaact
     6361 gctacaatct tgaaatcaaa aaacctgacg ggagtgagtg tctaccagca gcgccagacg
     6421 ggattcgggg cttcccccgg tgccggtatg tgcacaaagt atcaggaacg ggaccgtgtg
     6481 ccggagactt tgccttccat aaagagggtg ctttcttcct gtatgatcga cttgcttcca
     6541 cagttatcta ccgaggaacg actttcgctg aaggtgtcgt tgcatttctg atactgcccc
     6601 aagctaagaa ggacttcttc agctcacacc ccttgagaga gccggtcaat gcaacggagg
     6661 acccgtctag tggctactat tctaccacaa ttagatatca ggctaccggt tttggaacca
     6721 atgagacaga gtacttgttc gaggttgaca atttgaccta cgtccaactt gaatcaagat
     6781 tcacaccaca gtttctgctc cagctgaatg agacaatata tacaagtggg aaaaggagca
     6841 ataccacggg aaaactaatt tggaaggtca accccgaaat tgatacaaca atcggggagt
     6901 gggccttctg ggaaactaaa aaaacctcac tagaaaaatt cgcagtgaag agttgtcttt
     6961 cacagttgta tcaaacggag ccaaaaacat cagtggtcag agtccggcgc gaacttcttc
     7021 cgacccaggg accaacacaa caactgaaga ccacaaaatc atggcttcag aaaattcctc
     7081 tgcaatggtt caagtgcaca gtcaaggaag ggaagctgca gtgtcgcatc taacaaccct
     7141 tgccacaatc tccacgagtc cccaatccct cacaaccaaa ccaggtccgg acaacagcac
     7201 ccataataca cccgtgtata aacttgacat ctctgaggca actcaagttg aacaacatca
     7261 ccgcagaaca gacaacgaca gcacagcctc cgacactccc tctgccacga ccgcagccgg
     7321 acccccaaaa gcagagaaca ccaacacgag caagagcact gacttcctgg accccgccac
     7381 cacaacaagt ccccaaaacc acagcgagac cgctggcaac aacaacactc atcaccaaga
     7441 taccggagaa gagagtgcca gcagcgggaa gctaggctta attaccaata ctattgctgg
     7501 agtcgcagga ctgatcacag gcgggagaag aactcgaaga gaagcaattg tcaatgctca
     7561 acccaaatgc aaccctaatt tacattactg gactactcag gatgaaggtg ctgcaatcgg
     7621 actggcctgg ataccatatt tcgggccagc agccgaggga atttacatag aggggctaat
     7681 gcacaatcaa gatggtttaa tctgtgggtt gagacagctg gccaacgaga cgactcaagc
     7741 tcttcaactg ttcctgagag ccacaactga gctacgcacc ttttcaatcc tcaaccgtaa
     7801 ggcaattgat ttcttgctgc agcgatgggg cggcacatgc cacattctgg gaccggactg
     7861 ctgtatcgaa ccacatgatt ggaccaagaa cataacagac aaaattgatc agattattca
     7921 tgattttgtt gataaaaccc ttccggacca gggggacaat gacaattggt ggacaggatg
     7981 gagacaatgg ataccggcag gtattggagt tacaggcgtt ataattgcag ttatcgcttt
     8041 attctgtata tgcaaatttg tcttttagtt tttcttcaga ttgcttcatg gaaaagctca
     8101 gcctcaaatc aatgaaacca ggatttaatt atatggatta cttgaatcta agattacttg
     8161 acaaatgata atataataca ctggagcttt aaacatagcc aatgtgattc taactccttt
     8221 aaactcacag ttaatcataa acaaggtttg acatcaatct agttatctct ttgagaatga
     8281 taaacttgat gaagattaag aaaaaggtaa tctttcgatt atctttaatc ttcatccttg
     8341 attctacaat catgacagtt gtctttagtg acaagggaaa gaagcctttt tattaagttg
     8401 taataatcag atctgcgaac cggtagagtt tagttgcaac ctaacacaca taaagcattg
     8461 gtcaaaaagt caatagaaat ttaaacagtg agtggagaca acttttaaat ggaagcttca
     8521 tatgagagag gacgcccacg agctgccaga cagcattcaa gggatggaca cgaccaccat
     8581 gttcgagcac gatcatcatc cagagagaat tatcgaggtg agtaccgtca atcaaggagc
     8641 gcctcacaag tgcgcgttcc tactgtattt cataagaaga gagttgaacc attaacagtt
     8701 cctccagcac ctaaagacat atgtccgacc ttgaaaaaag gatttttgtg tgacagtagt
     8761 ttttgcaaaa aagatcacca gttggagagt ttaactgata gggaattact cctactaatc
     8821 gcccgtaaga cttgtggatc agtagaacaa caattaaata taactgcacc caaggactcg
     8881 cgcttagcaa atccaacggc tgatgatttc cagcaagagg aaggtccaaa aattaccttg
     8941 ttgacactga tcaagacggc agaacactgg gcgagacaag acatcagaac catagaggat
     9001 tcaaaattaa gagcattgtt gactctatgt gctgtgatga cgaggaaatt ctcaaaatcc
     9061 cagctgagtc ttttatgtga gacacaccta aggcgcgagg ggcttgggca agatcaggca
     9121 gaacccgttc tcgaagtata tcaacgatta cacagtgata aaggaggcag ttttgaagct
     9181 gcactatggc aacaatggga ccgacaatcc ctaattatgt ttatcactgc attcttgaat
     9241 attgctctcc agttaccgtg tgaaagttct gctgtcgttg tttcagggtt aagaacattg
     9301 gttcctcaat cagataatga ggaagcttca accaacccgg ggacatgctc atggtctgat
     9361 gagggtaccc cttaataagg ctgactaaaa cactatataa ccttctactt gatcacaata
     9421 ctccgtatac ctatcatcat atatttaatc aagacgatat cctttaaaac ttattcagta
     9481 ctataatcac tctcgtttca aattaataag atgtgcatga ttgccctaat atatgaagag
     9541 gtatgataca accctaacag tgatcaaaga aaatcataat ctcgtatcgc tcgtaatata
     9601 acctgccaag catacctctt gcacaaagtg attcttgtac acaaataatg ttttactcta
     9661 caggaggtag caacgatcca tcccatcaaa aaataagtat ttcatgactt actaatgatc
     9721 tcttaaaata ttaagaaaaa ctgacggaac ataaattctt tatgcttcaa gctgtggagg
     9781 aggtgtttgg tattggctat tgttatatta caatcaataa caagcttgta aaaatattgt
     9841 tcttgtttca agaggtagat tgtgaccgga aatgctaaac taatgatgaa gattaatgcg
     9901 gaggtctgat aagaataaac cttattattc agattaggcc ccaagaggca ttcttcatct
     9961 ccttttagca aagtactatt tcagggtagt ccaattagtg gcacgtcttt tagctgtata
    10021 tcagtcgccc ctgagatacg ccacaaaagt gtctctaagc taaattggtc tgtacacatc
    10081 ccatacattg tattaggggc aataatatct aattgaactt agccgtttaa aatttagtgc
    10141 ataaatctgg gctaacacca ccaggtcaac tccattggct gaaaagaagc ttacctacaa
    10201 cgaacatcac tttgagcgcc ctcacaatta aaaaatagga acgtcgttcc aacaatcgag
    10261 cgcaaggttt caaggttgaa ctgagagtgt ctagacaaca aaatattgat actccagaca
    10321 ccaagcaaga cctgagaaaa aaccatggct aaagctacgg gacgatacaa tctaatatcg
    10381 cccaaaaagg acctggagaa aggggttgtc ttaagcgacc tctgtaactt cttagttagc
    10441 caaactattc aggggtggaa ggtttattgg gctggtattg agtttgatgt gactcacaaa
    10501 ggaatggccc tattgcatag actgaaaact aatgactttg cccctgcatg gtcaatgaca
    10561 aggaatctct ttcctcattt atttcaaaat ccgaattcca caattgaatc accgctgtgg
    10621 gcattgagag tcatccttgc agcagggata caggaccagc tgattgacca gtctttgatt
    10681 gaacccttag caggagccct tggtctgatc tctgattggc tgctaacaac caacactaac
    10741 catttcaaca tgcgaacaca acgtgtcaag gaacaattga gcctaaaaat gctgtcgttg
    10801 attcgatcca atattctcaa gtttattaac aaattggatg ctctacatgt cgtgaactac
    10861 aacggattgt tgagcagtat tgaaattgga actcaaaatc atacaatcat cataactcga
    10921 actaacatgg gttttctggt ggagctccaa gaacccgaca aatcggcaat gaaccgcatg
    10981 aagcctgggc cggcgaaatt ttccctcctt catgagtcca cactgaaagc atttacacaa
    11041 ggatcctcga cacgaatgca aagtttgatt cttgaattta atagctctct tgctatctaa
    11101 ctaaggtaga atacttcata ttgagctaac tcatatatgc tgactcaata gttatcttga
    11161 catctctgct ttcataatca gatatataag cataataaat aaatactcat atttcttgat
    11221 aatttgttta accacagata aatcctcact gtaagccagc ttccaagttg acacccttac
    11281 aaaaaccagg actcagaatc cctcaaacaa gagattccaa gacaacatca tagaattgct
    11341 ttattatatg aataagcatt ttatcaccag aaatcctata tactaaatgg ttaattgtaa
    11401 ctgaacccgc aggtcacatg tgttaggttt cacagattct atatattact aactctatac
    11461 tcgtaattaa cattagataa gtagattaag aaaaaagcct gaggaagatt aagaaaaact
    11521 gcttattggg tctttccgtg ttttagatga agcagttgaa attcttcctc ttgatattaa
    11581 atggctacac aacataccca atacccagac gctaggttat catcaccaat tgtattggac
    11641 caatgtgacc tagtcactag agcttgcggg ttatattcat catactccct taatccgcaa
    11701 ctacgcaact gtaaactccc gaaacatatc taccgtttga aatacgatgt aactgttacc
    11761 aagttcttga gtgatgtacc agtggcgaca ttgcccatag atttcatagt cccagttctt
    11821 ctcaaggcac tgtcaggcaa tggattctgt cctgttgagc cgcggtgcca acagttctta
    11881 gatgaaatca ttaagtacac aatgcaagat gctctcttct tgaaatatta tctcaaaaat
    11941 gtgggtgctc aagaagactg tgttgatgaa cactttcaag agaaaatctt atcttcaatt
    12001 cagggcaatg aatttttaca tcaaatgttt ttctggtatg atctggctat tttaactcga
    12061 aggggtagat taaatcgagg aaactctaga tcaacatggt ttgttcatga tgatttaata
    12121 gacatcttag gctatgggga ctatgttttt tggaagatcc caatttcaat gttaccactg
    12181 aacacacaag gaatccccca tgctgctatg gactggtatc aggcatcagt attcaaagaa
    12241 gcggttcaag ggcatacaca cattgtttct gtttctactg ccgacgtctt gataatgtgc
    12301 aaagatttaa ttacatgtcg attcaacaca actctaatct caaaaatagc agagattgag
    12361 gatccagttt gttctgatta tcccaatttt aagattgtgt ctatgcttta ccagagcgga
    12421 gattacttac tctccatatt agggtctgat gggtataaaa ttattaagtt cctcgaacca
    12481 ttgtgcttgg ccaaaattca attatgctca aagtacactg agaggaaggg ccgattctta
    12541 acacaaatgc atttagctgt aaatcacacc ctagaagaaa ttacagaaat gcgtgcacta
    12601 aagccttcac aggctcaaaa gatccgtgaa ttccatagaa cattgataag gctggagatg
    12661 acgccacaac aactttgtga gctattttcc attcaaaaac actgggggca tcctgtgcta
    12721 catagtgaaa cagcaatcca aaaagttaaa aaacatgcta cggtgctaaa agcattacgc
    12781 cctatagtga ttttcgagac atactgtgtt tttaaatata gtattgccaa acattatttt
    12841 gatagtcaag gatcttggta cagtgttact tcagatagga atctaacacc gggtcttaat
    12901 tcttatatca aaagaaatca attccctccg ttgccaatga ttaaagaact actatgggaa
    12961 ttttaccacc ttgaccaccc tccacttttc tcaaccaaaa ttattagtga cttaagtatt
    13021 tttataaaag acagagctac cgcagtagaa aggacatgct gggatgcagt attcgagcct
    13081 aatgttctag gatataatcc acctcacaaa tttagtacta aacgtgtacc ggaacaattt
    13141 ttagagcaag aaaacttttc tattgagaat gttctttcct acgcacaaaa actcgagtat
    13201 ctactaccac aatatcggaa cttttctttc tcattgaaag agaaagagtt gaatgtaggt
    13261 agaaccttcg gaaaattgcc ttatccgact cgcaatgttc aaacactttg tgaagctctg
    13321 ttagctgatg gtcttgctaa agcatttcct agcaatatga tggtagttac ggaacgtgag
    13381 caaaaagaaa gcttattgca tcaagcatca tggcaccaca caagtgatga ttttggtgaa
    13441 catgccacag ttagagggag tagctttgta actgatttag agaaatacaa tcttgcattt
    13501 agatatgagt ttacagcacc ttttatagaa tattgcaacc gttgctatgg tgttaagaat
    13561 gtttttaatt ggatgcatta tacaatccca cagtgttata tgcatgtcag tgattattat
    13621 aatccaccac ataacctcac actggagaat cgagacaacc cccccgaagg gcctagttca
    13681 tacaggggtc atatgggagg gattgaagga ctgcaacaaa aactctggac aagtatttca
    13741 tgtgctcaaa tttctttagt tgaaattaag actggtttta agttacgctc agctgtgatg
    13801 ggtgacaatc agtgcattac tgttttatca gtcttcccct tagagactga cgcagacgag
    13861 caggaacaga gcgccgaaga caatgcagcg agggtggccg ccagcctagc aaaagttaca
    13921 agtgcctgtg gaatcttttt aaaacctgat gaaacatttg tacattcagg ttttatctat
    13981 tttggaaaaa aacaatattt gaatggggtc caattgcctc agtcccttaa aacggctaca
    14041 agaatggcac cattgtctga tgcaattttt gatgatcttc aagggaccct ggctagtata
    14101 ggcactgctt ttgagcgatc catctctgag acacgacata tctttccttg caggataacc
    14161 gcagctttcc atacgttttt ttcggtgaga atcttgcaat atcatcatct cgggttcaat
    14221 aaaggttttg accttggaca gttaacactc ggcaaacctc tggatttcgg aacaatatca
    14281 ttggcactag cggtaccgca ggtgcttgga gggttatcct tcttgaatcc tgagaaatgt
    14341 ttctaccgga atctaggaga tccagttacc tcaggcttat tccagttaaa aacttatctc
    14401 cgaatgattg agatggatga tttattctta cctttaattg cgaagaaccc tgggaactgc
    14461 actgccattg actttgtgct aaatcctagc ggattaaatg tccctgggtc gcaagactta
    14521 acttcatttc tgcgccagat tgtacgcagg accatcaccc taagtgcgaa aaacaaactt
    14581 attaatacct tatttcatgc gtcagctgac ttcgaagacg aaatggtttg taaatggcta
    14641 ttatcatcaa ctcctgttat gagtcgtttt gcggccgata tcttttcacg cacgccgagc
    14701 gggaagcgat tgcaaattct aggatacctg gaaggaacac gcacattatt agcctctaag
    14761 atcatcaaca ataatacaga gacaccggtt ttggacagac tgaggaaaat aacattgcaa
    14821 aggtggagcc tatggtttag ttatcttgat cattgtgata atatcctggc ggaggcttta
    14881 acccaaataa cttgcacagt tgatttagca cagattctga gggaatattc atgggctcat
    14941 attttagagg gaagacctct tattggagcc acactcccat gtatgattga gcaattcaaa
    15001 gtgttttggc tgaaacccta cgaacaatgt ccgcagtgtt caaatgcaaa gcaaccaggt
    15061 gggaaaccat tcgtgtcagt ggcagtcaag aaacatattg ttagtgcatg gccgaacgca
    15121 tcccgaataa gctggactat cggggatgga atcccataca ttggatcaag gacagaagat
    15181 aagataggac aacctgctat taaaccaaaa tgtccttccg cagccttaag agaggccatt
    15241 gaattggcgt cccgtttaac atgggtaact caaggcagtt cgaacagtga cttgctaata
    15301 aaaccatttt tggaagcacg agtaaattta agtgttcaag aaatacttca aatgacccct
    15361 tcacattact caggaaatat tgttcacagg tacaacgatc aatacagtcc tcattctttc
    15421 atggccaatc gtatgagtaa ttcagcaacg cgattgattg tttctacaaa cactttaggt
    15481 gagttttcag gaggtggcca gtctgcacgc gacagcaata ttattttcca gaatgttata
    15541 aattatgcag ttgcactgtt cgatattaaa tttagaaaca ctgaggctac agatatccaa
    15601 tataatcgtg ctcaccttca tctaactaag tgttgcaccc gggaagtacc agctcagtat
    15661 ttaacataca catctacatt ggatttagat ttaacaagat accgagaaaa cgaattgatt
    15721 tatgacagta atcctctaaa aggaggactc aattgcaata tctcattcga taatccattt
    15781 ttccaaggta aacggctgaa cattatagaa gatgatctta ttcgactgcc tcacttatct
    15841 ggatgggagc tagccaagac catcatgcaa tcaattattt cagatagcaa caattcatct
    15901 acagacccaa ttagcagtgg agaaacaaga tcattcacta cccatttctt aacttatccc
    15961 aagataggac ttctgtacag ttttggggcc tttgtaagtt attatcttgg caatacaatt
    16021 cttcggacta agaaattaac acttgacaat tttttatatt acttaactac tcaaattcat
    16081 aatctaccac atcgctcatt gcgaatactt aagccaacat tcaaacatgc aagcgttatg
    16141 tcacggttaa tgagtattga tcctcatttt tctatttaca taggcggtgc tgcaggtgac
    16201 agaggactct cagatgcggc caggttattt ttgagaacgt ccatttcatc ttttcttaca
    16261 tttgtaaaag aatggataat taatcgcgga acaattgtcc ctttatggat agtatatccg
    16321 ctagagggtc aaaacccaac acctgtgaat aattttctct atcagatcgt agaactgctg
    16381 gtgcatgatt catcaagaca acaggctttt aaaactacca taagtgatca tgtacatcct
    16441 cacgacaatc ttgtttacac atgtaagagt acagccagca atttcttcca tgcatcattg
    16501 gcgtactgga ggagcagaca cagaaacagc aaccgaaaat acttggcaag agactcttca
    16561 actggatcaa gcacaaacaa cagtgatggt catattgaga gaagtcaaga acaaaccacc
    16621 agagatccac atgatggcac tgaacggaat ctagtcctac aaatgagcca tgaaataaaa
    16681 agaacgacaa ttccacaaga aaacacgcac cagggtccgt cgttccagtc ctttctaagt
    16741 gactctgctt gtggtacagc aaatccaaaa ctaaatttcg atcgatcgag acacaatgtg
    16801 aaatttcagg atcataactc ggcatccaag agggaaggtc atcaaataat ctcacaccgt
    16861 ctagtcctac ctttctttac attatctcaa gggacacgcc aattaacgtc atccaatgag
    16921 tcacaaaccc aagacgagat atcaaagtac ttacggcaat tgagatccgt cattgatacc
    16981 acagtttatt gtagatttac cggtatagtc tcgtccatgc attacaaact tgatgaggtc
    17041 ctttgggaaa tagagagttt caagtcggct gtgacgctag cagagggaga aggtgctggt
    17101 gccttactat tgattcagaa ataccaagtt aagaccttat ttttcaacac gctagctact
    17161 gagtccagta tagagtcaga aatagtatca ggaatgacta ctcctaggat gcttctacct
    17221 gttatgtcaa aattccataa tgaccaaatt gagattattc ttaacaactc agcaagccaa
    17281 ataacagaca taacaaatcc tacttggttt aaagaccaaa gagcaaggct acctaagcaa
    17341 gtcgaggtta taaccatgga tgcagagaca acagagaata taaacagatc gaaattgtac
    17401 gaagctgtat ataaattgat cttacaccat attgatccta gcgtattgaa agcagtggtc
    17461 cttaaagtct ttctaagtga tactgagggt atgttatggc taaatgataa tttagccccg
    17521 ttttttgcca ctggttattt aattaagcca ataacgtcaa gtgctagatc tagtgagtgg
    17581 tatctttgtc tgacgaactt cttatcaact acacgtaaga tgccacacca aaaccatctc
    17641 agttgtaaac aggtaatact tacggcattg caactgcaaa ttcaacgaag cccatactgg
    17701 ctaagtcatt taactcagta tgctgactgt gagttacatt taagttatat ccgccttggt
    17761 tttccatcat tagagaaagt actataccac aggtataacc tcgtcgattc aaaaagaggt
    17821 ccactagtct ctatcactca gcacttagca catcttagag cagagattcg agaattaact
    17881 aatgattata atcaacagcg acaaagtcgg actcaaacat atcactttat tcgtactgca
    17941 aaaggacgaa tcacaaaact agtcaatgat tatttaaaat tctttcttat tgtgcaagca
    18001 ttaaaacata atgggacatg gcaagctgag tttaagaaat taccagagtt gattagtgtg
    18061 tgcaataggt tctaccatat tagagattgc aattgtgaag aacgtttctt agttcaaacc
    18121 ttatatttac atagaatgca ggattctgaa gttaagctta tcgaaaggct gacagggctt
    18181 ctgagtttat ttccggatgg tctctacagg tttgattgaa ttaccgtgca tagtatcctg
    18241 atacttgcaa aggttggtta ttaacataca gattataaaa aactcataaa ttgctctcat
    18301 acatcatatt gatctaatct caataaacaa ctatttaaat aacgaaagga gtccctatat
    18361 tatatactat atttagcctc tctccctgcg tgataatcaa aaaattcaca atgcagcatg
    18421 tgtgacatat tactgccgca atgaatttaa cgcaacataa taaactctgc actctttata
    18481 attaagcttt aacgaaaggt ctgggctcat attgttattg atataataat gttgtatcaa
    18541 tatcctgtca gatggaatag tgttttggtt gataacacaa cttcttaaaa caaaattgat
    18601 ctttaagatt aagtttttta taattatcat tactttaatt tgtcgtttta aaaacggtga
    18661 tagccttaat ctttgtgtaa aataagagat taggtgtaat aaccttaaca tttttgtcta
    18721 gtaagctact atttcataca gaatgataaa attaaaagaa aaggcaggac tgtaaaatca
    18781 gaaatacctt ctttacaata tagcagacta gataataatc ttcgtgttaa tgataattaa
    18841 gacattgacc acgctcatca gaaggctcgc cagaataaac gttgcaaaaa ggattcctgg
    18901 aaaaatggtc gcacacaaaa atttaaaaat aaatctattt cttctttttt gtgtgtcca
//

Tumor-specific internalizing peptides or tumor-homing peptides


Can cancer cells or their microenvironment be targeted selectively to treat tumors?

Yes, it appears that this is possible.


A number of peptides have been reported to specifically target tumors and tumor-associated microenvironments, such as the tumor vasculature, after systemic delivery. These peptides are known as “tumor-specific internalizing peptides” (TSIPs) or “tumor homing peptides” (THPs).

Tumor-specific internalizing peptides are usually short peptides, 3 to 15 amino acids in length, that specifically recognize and bind to tumor cells or tumor vasculature. Since 1998, a number of these peptides have been identified using in vitro and in vivo phage display technology. Phage display is a molecular biology technique in which proteins or peptides are displayed on the surface of a phage as a fusion with one of the phage coat proteins. Phage display has been used intensively to screen for protein-protein interactions. This screening method has allowed the identification of tumor-specific or tumor homing peptides that target specific tumor cells or tumor vasculature.

According to the International Agency for Research on Cancer, an agency of the World Health Organization, cancer is now the world’s biggest killer. The “World Cancer Report” showed that there were 8.2 million deaths from cancer in 2012 and predicts that cancer cases worldwide will rise by 75 % over the next two decades. By then it is estimated that up to 25 million people may be suffering from cancer worldwide. Unfortunately, despite progress in our understanding of the molecular basis of cancer and improvements in treatment options, the mortality rate is still high. This suggests that new, more selective anticancer drugs would be of great benefit.

Tumor-specific internalizing peptides or tumor homing peptides share common sequence motifs that specifically bind to a surface molecule on tumor cells or tumor vasculature. The best known examples are the short motifs RGD and NGR. The RGD (Arg-Gly-Asp) peptide is known to bind α integrins, and NGR (Asn-Gly-Arg) is known to bind aminopeptidase N, a receptor present on the surface of tumor endothelial cells. Such receptors are also called tumor angiogenic markers. For this reason, tumor-specific internalizing peptides are being used in cancer diagnosis and treatment. So far, many anti-cancer and imaging agents have been targeted to tumor sites in mouse models by conjugating them to tumor-specific peptides. A database called “TumorHoPe” provides comprehensive information about experimentally validated tumor homing peptides and their target cells (http://crdd.osdd.net/raghava/tumorhope/). This is a manually curated database containing 744 entries of experimentally characterized tumor homing peptides that recognize tumor tissues and the tumor-associated microenvironment, including tumor metastasis.


A list of some tumor homing peptide motifs

Motif | Action
NGR (Asn-Gly-Arg) | Binds aminopeptidase N
GSL (Gly-Ser-Leu) | Inhibition of tumor homing
RGD (Arg-Gly-Asp) | Binds selectively to integrins, which are overexpressed on the endothelial cell surface in cancer and facilitate cancer cell migration
TSPLNIHNGQKL (HN-1) | Appears to be specific for head and neck squamous cell carcinoma (HNSCC). Targeted drug delivery into solid tumors.
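As an illustration of how the short motifs in the list above could be located in a candidate peptide, the following minimal sketch scans a sequence for each motif by plain substring search. The function name and the two test peptides `CNGRC` and `ACDCRGDCFCG` are illustrative inputs chosen for this sketch, not data from the article, and a motif hit by itself says nothing about actual tumor homing activity.

```python
# Motifs taken from the motif list above; descriptions are shortened.
MOTIFS = {
    "NGR": "binds aminopeptidase N",
    "GSL": "inhibition of tumor homing",
    "RGD": "binds integrins",
}


def find_motifs(peptide: str) -> list[tuple[str, int]]:
    """Return (motif, 1-based start position) for every motif occurrence."""
    peptide = peptide.upper()
    hits = []
    for motif in MOTIFS:
        start = peptide.find(motif)
        while start != -1:
            hits.append((motif, start + 1))
            # Continue one residue further so overlapping hits are not missed.
            start = peptide.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Illustrative inputs (assumptions), plus the HN-1 sequence from the text.
    for candidate in ("CNGRC", "ACDCRGDCFCG", "TSPLNIHNGQKL"):
        print(candidate, find_motifs(candidate))
```

The HN-1 peptide contains none of the three tripeptide motifs, which is consistent with it being listed as a separate entry.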

 

The specific internalization of peptides that target tumor cells has been evaluated for targeted siRNA delivery into human cancer cells. Un et al. in 2012 investigated the internalization of the HN-1TYR-anti-hRRM2 siRNAR peptide conjugate in human head and neck or breast cancer cells to establish its utility for targeted siRNA delivery. The researchers used a FITC-HN-1TYR-anti-hRRM2 siRNAR construct to image its successful internalization into a human cancer cell line. For the synthesis of this fluorescent siRNA delivery vehicle, a tyrosine and a FITC label were added to the N-terminal end of the peptide. Next, an anti-hRRM2 siRNA was synthesized with fluorine incorporated at its 2’-OH position, to avoid degradation by RNases in vivo, and was conjugated at the 5’-end of the antisense strand using a hexynyl phosphoramidite linker. The selected HN-1 peptide, a 12-mer isolated by screening an M13 phage peptide display library, has the sequence TSPLNIHNGQKL. It can translocate drugs across the cell membrane into the cytosol, its uptake occurs in a tumor-specific manner, and it is capable of penetrating solid tumors. Ribonucleotide reductase (RR), composed of the subunits hRRM1 and hRRM2, catalyzes the conversion of ribonucleotides to their corresponding deoxy forms needed for DNA replication. The researchers chose an anti-hRRM2 siRNA to induce degradation of the hRRM2 mRNA and thereby suppress tumorigenesis.

To conclude, tumor-specific internalizing peptides or tumor homing peptides appear to be future drug candidates for targeted siRNA delivery into human cancer cells and may enable a more selective treatment of tumors with fewer side effects.

References

Kapoor P, Singh H, Gautam A, Chaudhary K, Kumar R, et al. (2012); TumorHoPe: A Database of Tumor Homing Peptides. PLoS ONE 7(4): e35187. doi:10.1371/journal.pone.0035187.

Un F, Zhou B, Yen Y (2012); The Utility of Tumor-specifically Internalizing Peptides for Targeted siRNA Delivery into Human Solid Tumors. Anticancer Research 32: 4685-4690.

 

Control templates or standards for molecular DNA and RNA diagnostics


Control templates for molecular DNA/RNA diagnostics

As the number and scope of pathogens and their genetic variants that cause human disease have continued to increase, there has been a commensurate and rapid increase in the use of nucleic acid based tests for routine clinical diagnosis. Due to the complex nature of nucleic acids, these molecular tests must be fully controlled to accurately ascertain their specificity and sensitivity. However, the success of molecular diagnostics is often impeded by the limited availability of DNA- or RNA-based positive controls with the same or a similar number of mutations as the organism being screened. This is the case, for example, during a pandemic or with a newly emerging disease such as Ebola, where it can be difficult to acquire the necessary positive controls.

DNA or RNA standards allow a researcher to determine whether an assay accurately represents the composition or quantities of a known input, and to derive standard calibration curves. Calibration curves make it possible to relate the read-out counts obtained for the studied samples to accurate analyte amounts or concentrations. Furthermore, the use of control standards allows for direct measurement of error rates, coverage biases, and other variables that can affect downstream analysis, such as the analysis of various isoforms.
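A minimal sketch of how a standard calibration curve might be derived from a dilution series of a control template with known input copy numbers. The copy numbers and Cq values below are hypothetical illustration data, not measurements, and the function names are assumptions made for this example. The sketch fits Cq against log10(copies) by ordinary least squares and uses the common relation efficiency = 10^(-1/slope) - 1.

```python
import math


def fit_standard_curve(copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the slope; 1.0 would mean perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate the input copy number."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical 10-fold dilution series of a positive control template.
STANDARD_COPIES = [1e2, 1e3, 1e4, 1e5, 1e6]
STANDARD_CQ = [33.0, 29.5, 26.0, 22.5, 19.0]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CQ)

if __name__ == "__main__":
    print(f"slope={slope:.3f} intercept={intercept:.2f}")
    print(f"efficiency={amplification_efficiency(slope):.3f}")
    print(f"copies at Cq 24.25: {copies_from_cq(24.25, slope, intercept):.0f}")
```

In practice the fit would be made on replicate measurements, and the quality of the fit would be checked before unknown samples are quantified.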

 
 

Good Laboratory Practice, government agencies, and standards-setting organizations require diagnostic laboratories to follow stringent quality control (QC) guidelines, which include calibrating equipment against control samples and testing patient samples in tandem with consistent references. It is therefore critical that reference samples be used in a manner that provides a comprehensive evaluation of every component in these highly complex procedures and reagent mixtures. The need for these controls and standards became particularly acute with the widespread use of high-complexity, high-volume DNA- or RNA-based real-time testing platforms.

Bio-Synthesis provides molecular assay services, focused on the design and development of nucleic acid-based, positive control templates (PCT) to monitor the molecular diagnostic testing process, including the extraction, amplification, and detection components of test systems used to measure disease producing organisms. We provide thousands of PCTs to genotype high value polymorphisms for various drug metabolism and transporter genes. These PCTs can be manufactured in our cutting edge molecular diagnostic facilities and significantly shorten your path from RESEARCH to RESULT by providing you with the full development process for control templates that may be used as standard references in the simultaneous detection of mutations in any genome. These laboratory-safe, synthetic or semi-synthetic DNA/RNA Positive Controls can be a relatively cost effective, simple and efficient alternative to difficult-to-acquire controls from infectious samples.


Our contract services are confidential, fast, efficient and well-documented, with the objective of supporting the improvement of analysis and control of human infectious diseases by providing high quality evaluation materials to aid in the advancement of nucleic acid technologies.

Advantages


Select Platform: DNA/RNA control templates, length >1000 bp

Technology Friendly: Suitable for Real-Time PCR, qPCR, microarray...

Laboratory-Safe: Non-infectious, laboratory-safe synthetic controls

Accurate and Reliable: Reproducible results - known input copy number

Well-Documented: Well-characterized sequence to assure maximum fidelity

Customized Solutions: Optimized preparation for specific applications

 






BSI's On-demand HPV and HLA controls can be used as positive controls in nucleic acid amplification reactions.

These quantitative controls can also be used to generate standard curves for qPCR assays.

 

Ebola Peptides for Diagnostics and Vaccines



Peptides derived from Ebola virus proteins can be used to study the antigenicity and immunogenicity of Ebola proteins. In addition, these peptide epitopes can be used to develop sensitive and accurate diagnostic tests using polyclonal or monoclonal antibodies. Another potential use for these peptides is the development of unique peptide-based vaccines. In particular, successful and potent vaccines could be developed using antigenic peptides derived from proteins of the Ebola virus or other Ebola virus strains.
 

Figure 1: Ultrastructures and models of the Ebola virus and its genome (Source: Ellis et al. 1978; CDC). Ellis et al. in 1978 showed that electron microscopy can be used to detect and observe the ultrastructure of the Ebola virus in infected human tissue. The Ebola virus was detected in tissue samples from human liver, kidney, spleen and lung.

Infection of a cell by a virus requires fusion between the viral and host membranes. Infection by the Ebola virus (EboV) begins with the uptake of viral particles into cellular endosomes. Experimental data suggest that the viral envelope glycoprotein (GP) catalyzes the fusion between the viral and host cell membranes. The fusion event is thought to involve conformational rearrangements of the transmembrane subunit (GP2) of the envelope spike, ultimately resulting in the formation of a six-helix bundle by the N- and C-terminal heptad repeat (NHR and CHR, respectively) regions of GP2. Membrane fusion is mediated by fusion proteins that protrude from the viral membrane. The key components in contact with the host cell membrane are fusion peptides, which are parts of the fusion proteins. The Ebola GP is responsible for both receptor binding and membrane fusion. It is composed of two subunits, GP1 and GP2, which are connected via a disulfide bond. The Ebola fusion peptide (EFP), G524AAIGLAWIPYFGPAA539, is thought to be in direct contact with the host cell membrane. This peptide is conserved within the virus family. EFP is an internal fusion peptide located 22 residues from the N-terminus of GP2. Experimental data suggest that, in the presence of a membrane, the EFP peptide tends to form helical structures.

Figure 2: Model of the Ebola fusion protein in its fusogenic state as suggested by Jaskierny et al. in 2011. The globular protein GP1 is thought to initiate the binding to the host cell receptor. The GP2 domain contains a helical bundle with the fusion peptide near the N-terminus. Jaskierny et al. studied the monomeric form of the internal fusion peptide from Ebola virus in membrane bilayer and water environments using computer simulations. The wild type Ebola fusion peptide, the W8A mutant form, and an extended construct with flanking residues were examined. The researchers found that the monomeric form of the wild type Ebola fusion peptide adopts a coil-helix-coil structure with a short helix from residue 8 to 11 oriented parallel to the membrane surface.

 

Using circular dichroism (CD) together with infrared (IR) spectroscopy, researchers showed that the EFP peptide has three states: a random coil in solution and either an α-helix or a β-sheet when bound to the membrane. Furthermore, the secondary structure of the membrane-bound peptide depends on Ca2+: in the presence of Ca2+ a β-sheet structure is preferred, while in its absence helical structures are dominant. A nuclear magnetic resonance (NMR) study of EFP showed that the peptide adopts a random coil structure in aqueous buffers and a more defined structure in the presence of sodium dodecyl sulfate (SDS) micelles. Tryptophan fluorescence emission data suggest that W8 enters the hydrophobic core of SDS micelles. Nuclear Overhauser effect (NOE) measurements obtained from 1H NMR suggested the presence of a short 3₁₀ helix from I9 to F12 in the middle of the peptide, while the N- and C-termini appear to be less structured.
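The residue labels used above (W8, I9 to F12) count from the first residue of the fusion peptide, whereas G524 and A539 use glycoprotein numbering. The following minimal sketch converts between the two, assuming only the EFP sequence and the G524 start given in the text; the helper names are hypothetical.

```python
EFP_SEQUENCE = "GAAIGLAWIPYFGPAA"  # Ebola fusion peptide as given in the text
EFP_FIRST_RESIDUE = 524            # GP numbering of the first glycine (G524)


def residue_label(position: int) -> str:
    """Peptide-numbered label, e.g. position 8 gives 'W8'."""
    return f"{EFP_SEQUENCE[position - 1]}{position}"


def gp_number(position: int) -> int:
    """Convert a 1-based peptide position to glycoprotein numbering."""
    return EFP_FIRST_RESIDUE + position - 1


def segment(first: int, last: int) -> str:
    """Residues first..last (1-based, inclusive) of the fusion peptide."""
    return EFP_SEQUENCE[first - 1:last]


if __name__ == "__main__":
    print(residue_label(8), "corresponds to GP residue", gp_number(8))
    print("NMR helix I9 to F12:", segment(9, 12))
    print("Simulated helix, residues 8 to 11:", segment(8, 11))
```

The same mapping confirms that the last residue of the 16-residue peptide is A539.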


Miller et al. in 2011 performed a study using synthetic peptides of the CHR sequence region (C-peptides) to test if these peptides can inhibit the entry of the virus particles. The researchers prepared an EboV C-peptide conjugated to the arginine-rich sequence from HIV-1 Tat, known to accumulate in endosomes, and found that this peptide specifically inhibits viral entry mediated by filovirus GP proteins and infection by authentic filoviruses. The researchers determined that antiviral activity was dependent on both the Tat sequence and the native EboV CHR sequence. Miller et al. argue that targeting C-peptides to endosomal compartments can serve as an approach to localize inhibitors to sites of membrane fusion.
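The peptide constructs of this study, as listed in Table 1, follow a simple carrier-linker-cargo layout. The sketch below assembles the Tat-Ebo construct from its parts and checks that the cargo of the scrambled control has the same amino acid composition as the native CHR sequence. That a scrambled control should preserve composition is an assumption of this example, and the helper names are hypothetical.

```python
from collections import Counter

# Building blocks as listed in Table 1 of this article.
TAT = "YGRKKRRQRRR"                    # arginine-rich sequence from HIV-1 Tat
LINKER = "GSG"
EBOV_CHR = "IEPHDWTKNITDKIDQIIHDFVDK"  # native EboV C-peptide
SCRAMBLED = "HTEHINFQDDTIKIWPDVIKIKDD"  # cargo of the Tat-Scram control


def build_construct(carrier: str, cargo: str, linker: str = LINKER) -> str:
    """Join carrier, linker and cargo in the hyphenated notation of Table 1."""
    return f"{carrier}-{linker}-{cargo}"


def same_composition(a: str, b: str) -> bool:
    """True if both peptides contain the same residues in the same amounts."""
    return Counter(a) == Counter(b)


def basic_residues(peptide: str) -> int:
    """Number of arginine and lysine residues."""
    return peptide.count("R") + peptide.count("K")


if __name__ == "__main__":
    print(build_construct(TAT, EBOV_CHR))
    print("Scrambled cargo keeps composition:", same_composition(EBOV_CHR, SCRAMBLED))
    print("Basic residues in Tat:", basic_residues(TAT), "of", len(TAT))
```

A composition check of this kind only confirms that the control differs in residue order, not that it is inactive.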


To diagnose and control the endemic outbreaks of haemorrhagic fever in humans caused by filoviruses, such as the Ebola and the Marburg virus, rapid, highly sensitive, reliable, and specific assays are required. The identification and characterization of antigenic sites in viral proteins is important for the development of viral antigen detection assays.

Changula et al. in 2013 generated a panel of mouse monoclonal antibodies (mAbs) to the nucleoprotein (NP) of the Zaire Ebola virus. The researchers divided the mAbs into seven groups based on the profiles of their specificity and cross-reactivity to other species in the Ebolavirus genus. The use of synthetic peptides corresponding to the Ebola virus NP sequence made it possible to map the mAb binding sites to seven antigenic regions in the C-terminal half of the NP. The mapped antigenic sites included two highly conserved regions present among all five Ebola virus species currently known. In addition, the scientists were successful in producing species-specific rabbit antisera to synthetic peptides predicted to represent unique filovirus B-cell epitopes. These results provide useful information for the development of Ebola virus antigen detection assays and potentially new vaccines for Ebola virus strains.


Table 1: Ebola virus peptides

Peptide | Sequence | Notes

Fusion Peptide (Jaskierny et al., 2011)
EFP | G524AAIGLAWIPYFGPAA539 | Chain A fusion peptide in SDS micelles at pH 7

C-Peptide Study (Miller et al., 2011)
Tat-Ebo | YGRKKRRQRRR-GSG-IEPHDWTKNITDKIDQIIHDFVDK | Ebola virus chain A fusion peptide
Lys-Ebo | KKKK-GSG-IEPHDWTKNITDKIDQIIHDFVDK | Ebola virus chain A fusion peptide
Tat-only | YGRKKRRQRRR |
Tat-Scram | YGRKKRRQRRR-GSG-HTEHINFQDDTIKIWPDVIKIKDD |
Tat-ASLV | YGRKKRRQRRR-GSG-FNLSDHSESIQKKFQLMKEHVNKIG |

Peptide epitopes of mAbs against EBOV NP (Changula et al., 2013)
ZNP31-1-8, ZNP41-2-4 | YDDDDDIPFP, aa 421–430 | NP protein
ZNP74-7 | YDDDDDIPFPGPINDDDNPG, aa 421–440 | NP protein
ZNP24-4-2 | QTQFRPIQNVPGPHRTIHHA, aa 521–540; TPTVAPPAPVYRDHSEKKEL, aa 601–620 | NP protein
ZNP106-9 | DTTIPDVVVD, aa 451–460a | NP protein
ZNP98-7 | MLTPINEEADPLDDADDETS, aa 561–580 | NP protein
ZNP35-16-3-5 | DDEDTKPVPNRSTKGGQQKN, aa 491–510 | NP protein
ZNP62-7 | YRDHSEKKELPQDEQQDQDH, aa 611–630 | NP protein

 

Ebola virus NucleoProtein (NP) sequence

>gi|158341892|gb|ABW34756.1| nucleoprotein, partial [Zaire ebolavirus]

RQIQVHAEQGLIQYPTAWQSVGHMMVIFRMMRTNFLIKFLLIHQGMHMVAGHDANDAVISNSVAQARFSG

LLIVKTVLDHILQKTERGVRLHPLARTAKVKNEVNSFKAALSSLAKHGEYAPFARLLNLSGVNNLEHGLF

PQLSAIALGVATAHGSTLAGVNVGEQYQQLREAATEAEKQLQQYAESRELDHLGLDDQEKKILMNFHQKK

NEISFQQTNAMVTLRKERLAKLTEAITAASLPKTSGHYDDDDDIPFPGPINDDDNPGHQDDDPTDSQDTT

IPDVVVDPDDGSYGEYQSYSENGMNAPDDLVLFDLDEDDEDTKPVPNRLTKGGQQKNSQKGHHTEGRQTQ

SRPTQNVPGPRRTIHHASAPLTDNDRGNEPSGSTSPRMLTPINEEADPLDDADDETSSLPPLESDDEEQD

RDETSNRTPTVAPPAPVYRDHSEKKELPQDEQQDQDHTQEARNQDSDNTQPEHSFEEMYRHIL


The locations of the Zaire nucleoprotein (ZNP) peptides are highlighted in red and magenta within the amino acid sequence of the Ebola virus nucleoprotein.

Table 2: Observed mutations for the QTQFRPIQNVPGPHRTIHHA, aa 521–540, peptide.
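Differences between an epitope peptide and a given NP sequence can be found by sliding the peptide along the protein and reporting the best ungapped match. The following minimal sketch does this for the QTQFRPIQNVPGPHRTIHHA peptide against an excerpt of the partial NP sequence (ABW34756.1) listed above. It illustrates the comparison only and does not reproduce the content of Table 2. The function names are hypothetical, and the substitution labels assume that the peptide numbering (aa 521–540) applies to the matched window.

```python
PEPTIDE = "QTQFRPIQNVPGPHRTIHHA"  # epitope peptide, aa 521-540
PEPTIDE_START = 521
# Excerpt of the partial NP sequence ABW34756.1 listed above.
NP_EXCERPT = "NSQKGHHTEGRQTQSRPTQNVPGPRRTIHHASAPLTDNDRGNEP"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_ungapped_match(peptide: str, protein: str) -> tuple[int, str]:
    """Return (0-based offset, window) of the closest ungapped match."""
    n = len(peptide)
    offsets = range(len(protein) - n + 1)
    best = min(offsets, key=lambda o: hamming(peptide, protein[o:o + n]))
    return best, protein[best:best + n]


def substitutions(peptide: str, window: str, first_residue: int) -> list[str]:
    """List differences as <peptide residue><position><protein residue>."""
    return [
        f"{p}{first_residue + i}{w}"
        for i, (p, w) in enumerate(zip(peptide, window))
        if p != w
    ]


offset, window = best_ungapped_match(PEPTIDE, NP_EXCERPT)
differences = substitutions(PEPTIDE, window, PEPTIDE_START)

if __name__ == "__main__":
    print("peptide:", PEPTIDE)
    print("protein:", window)
    print("differences:", differences)
```

For these two sequences the best window differs from the peptide at three positions.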


Models of Ebola virus peptides and proteins

Figure 3: NMR structure of the Ebola virus chain A fusion peptide, GAAIGLAWIPYFGPAA.


Figure 4: Crystal structure models of the Ebola virus membrane fusion subunit, GP2 envelope glycoprotein ectodomain.

Table 3: Peptides used for the production of rabbit antisera by Changula et al. 2013.


Virus Protein | Peptide | Amino Acids
EBOV NP | QDHTQEARNQD | 628-638
SUDV NP | QGSESEALPINSKK | 631-644
TAFV NP | NQVSGSENTDNKPH | 630-643
BDBV NP | QSNQTNNEDNVRNN | 628-641
RESTV NP | TSQLNEDPDIGQSK | 630-643
MARV NP | RVVTKKGRTFLYPNDLLQ | 635-652

 

Legend: BDBV = Bundibugyo virus; EBOV = Ebola virus; MARV = Marburg virus; RESTV = Reston virus; SUDV = Sudan virus; TAFV = Taï Forest virus.


The membrane proximal external region (MPER) peptide


Regula et al. in 2013 investigated the role of the membrane proximal external region (MPER) that precedes the transmembrane domain of glycoprotein 2 (GP2) of Ebola virus strains. Earlier research indicated that infection by a filovirus requires membrane fusion between the host and the virus. The fusion process is facilitated by the two subunits of the envelope glycoprotein, the surface subunit GP1 and the transmembrane subunit GP2. The MPER is a tryptophan (Trp, W) rich peptide segment located immediately before the transmembrane domain of GP2. In the human immunodeficiency virus 1 (HIV-1) glycoprotein gp41, the MPER is known to be critical for membrane fusion and has also been identified as a target for several neutralizing antibodies. Regula et al. characterized the properties of GP MPER segment peptides of Ebola virus and Sudan virus in the presence of micelle-forming surfactants and lipids, at pH 7 and pH 4.6. The researchers employed circular dichroism (CD) spectroscopy and tryptophan fluorescence to determine whether GP2 MPER peptides bind to micelles of sodium dodecyl sulfate (SDS) and dodecylphosphocholine (DPC). Nuclear magnetic resonance (NMR) spectroscopy revealed that residues 644 to 651 of the Sudan virus MPER peptide interact directly with DPC, and that this interaction enhances the helical conformation of the peptide. The scientists found that the Sudan virus MPER peptide moderately inhibited cell entry by a GP-pseudotyped vesicular stomatitis virus. However, it did not induce leakage of a fluorescent molecule from large unilamellar vesicles (LUVs) composed of 1-palmitoyl-2-oleoylphosphatidylcholine (POPC), nor did it cause hemolysis. The analysis performed by this research group suggested that the filovirus GP MPER binds and inserts shallowly into lipid membranes.


GP2 MPER Peptides


Table 4: Alignment of GP2 MPER peptides from different viruses.

Virus Strain | GP2 MPER Peptide | Amino Acids
EBOV  |     DKTLPDQGDNDNWWTGWRQW | 632 to 651
BDBV  |     DKPLPDQTDNDNWWTGWRQW | 632 to 651
SUDV  |     DNPLPNQDNDDNWWTGWRQW | 632 to 651
TAFV  |     DNNLPNQNDGSNWWTGWKQW | 632 to 651
RESTV |     DNPLPDHGDDLNNWTGWRQW | 633 to 652
FIV   |     LQKWEDWVGWIGNIPQYLKG | 767 to 786
HIV-1 | LLELDKWASLWNWFDITNWLWYIK | 660 to 683

 

Table 4 shows the amino acid alignment of GP2 MPER regions from members of the five Ebola virus species. Many residues are identical in at least four of the five viruses. For comparison, the MPER segments of FIV and HIV-1 gp41 are included.
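The degree of conservation in Table 4 can be counted column by column. The following minimal sketch does this for the five ebolavirus MPER peptides, treating the 20-residue sequences as an ungapped alignment exactly as they are listed; the function and variable names are hypothetical.

```python
from collections import Counter

# GP2 MPER peptides of the five ebolaviruses, copied from Table 4.
MPER = {
    "EBOV": "DKTLPDQGDNDNWWTGWRQW",
    "BDBV": "DKPLPDQTDNDNWWTGWRQW",
    "SUDV": "DNPLPNQDNDDNWWTGWRQW",
    "TAFV": "DNNLPNQNDGSNWWTGWKQW",
    "RESTV": "DNPLPDHGDDLNNWTGWRQW",
}


def column_conservation(sequences):
    """For each alignment column return (most common residue, its count)."""
    return [Counter(column).most_common(1)[0] for column in zip(*sequences)]


sequences = list(MPER.values())
conservation = column_conservation(sequences)

# 1-based column numbers, counted from the first residue of each peptide.
identical_in_all = [
    i + 1 for i, (_, count) in enumerate(conservation) if count == len(sequences)
]
identical_in_four_plus = [
    i + 1 for i, (_, count) in enumerate(conservation) if count >= 4
]

if __name__ == "__main__":
    print("identical in all five:", identical_in_all)
    print("identical in at least four:", identical_in_four_plus)
```

With the sequences of Table 4, 10 of the 20 columns are identical in all five peptides and 14 are identical in at least four.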

 

Legend: BDBV Bundibugyo virus, EBOV Ebola virus, FIV feline immunodeficiency virus, HIV-1 human immunodeficiency virus 1, RESTV Reston virus, SUDV Sudan virus, TAFV Taï Forest virus.


Alignments of GP2 MPER peptides from various virus strains.


Location of the GP2 MPER peptides within the GP2 protein of the Ebola virus

Figure 5: The location of the MPER peptides is highlighted in yellow in the crystal structure of the Ebola virus membrane fusion subunit, the GP2 envelope glycoprotein ectodomain. The amino acids of the peptide shown in gray were not observed in the crystal, indicating that this part of the peptide may adopt a random coil structure in the crystal.


Regula et al. used EBOV and SUDV MPER peptides for their study because both viruses are the most prevalent and pathogenic among the ebolaviruses. Synthetic peptides corresponding to the MPER region for EBOV and SUDV were used. The N-termini were blocked with an acetyl group and the C-termini contained an amide group.

 

The study revealed three characteristics of the GP2 MPER peptides:



  • As a peptide, the GP2 MPER binds to micelle-forming surfactants in a pH-independent manner with higher affinity for zwitterionic micelles; 
  • A large conformational change to a more predominantly helical state occurs for the tryptophan-rich region of this peptide upon micelle-binding;
  • These peptides have modest viral entry inhibitory activity but do not induce leakage from LUVs.

 

The study observed inhibitory activity for the S-MPER peptide, which suggests that addition of this peptide may interfere with the viral entry process. For the FIV MPER peptide it was observed that a WX2WX2W motif is required for the membrane interaction responsible for its inhibitory activity.

In the ebolavirus peptides listed in Table 4, this motif appears as WTGWRQW (WTGWKQW in TAFV), and its three tryptophan residues are conserved in all five species.
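The WX2WX2W pattern (a tryptophan, any two residues, a tryptophan, any two residues, a tryptophan) can be searched for in the Table 4 sequences with a regular expression. This is an illustrative sketch only; the helper name `find_motif` is hypothetical, and the search simply reports the first literal match of the pattern in each peptide as printed in the table.

```python
import re

# W, any two residues, W, any two residues, W
WX2WX2W = re.compile(r"W.{2}W.{2}W")

# MPER peptides copied from Table 4.
MPER_PEPTIDES = {
    "EBOV": "DKTLPDQGDNDNWWTGWRQW",
    "BDBV": "DKPLPDQTDNDNWWTGWRQW",
    "SUDV": "DNPLPNQDNDDNWWTGWRQW",
    "TAFV": "DNNLPNQNDGSNWWTGWKQW",
    "RESTV": "DNPLPDHGDDLNNWTGWRQW",
    "FIV": "LQKWEDWVGWIGNIPQYLKG",
    "HIV-1": "LLELDKWASLWNWFDITNWLWYIK",
}


def find_motif(sequence):
    """Return the first WX2WX2W match in `sequence`, or None if absent."""
    match = WX2WX2W.search(sequence)
    return match.group(0) if match else None


if __name__ == "__main__":
    for virus, peptide in MPER_PEPTIDES.items():
        print(f"{virus:6s} {find_motif(peptide)}")
```

A result of `None` only means that the exact W-x-x-W-x-x-W spacing does not occur in that peptide as listed; it says nothing about other tryptophan-rich arrangements.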

Results of the study indicated that the MPER peptide segments of EBOV and SUDV bind membrane surfaces which induces a conformational change in the Trp-rich peptide segment. This behavior suggests a role for the EBOV and SUDV MPER in membrane fusion.

 

Reference



http://www.cdc.gov/vhf/ebola/

D. S. Ellis, D. I. H. Simpson, D. P. Francis, J. Knobloch, E. T. W. Bowen, Pacifico Lolik, and Isaiah Mayom Deng; Ultrastructure of Ebola virus particles in human liver. Journal of Clinical Pathology, 1978, 31, 201-208.

Katendi Changula, Reiko Yoshida, Osamu Noyori, Andrea Marzi, Hiroko Miyamoto, Mari Ishijima, Ayaka Yokoyama, Masahiro Kajihara, Heinz Feldmann, Aaron S. Mweene, Ayato Takada; Mapping of conserved and species-specific antibody epitopes on the Ebola virus nucleoprotein. Virus Research 176 (2013) 83–90.

Thomas Hoenen, Allison Groseth, and Heinz Feldmann; Current Ebola vaccines. Expert Opin Biol Ther. 2012 July; 12(7): 859–872. doi:10.1517/14712598.2012.685152.

Adam J. Jaskierny, Afra Panahi, and Michael Feig; Effect of flanking residues on the conformational sampling of the internal fusion peptide from Ebola virus. Proteins. 2011 April; 79(4): 1109–1117. doi:10.1002/prot.22947.

Emily Happy Miller, Joseph S. Harrison, Sheli R. Radoshitzky, Chelsea D. Higgins, Xiaoli Chi, Lian Dong, Jens H. Kuhn, Sina Bavari, Jonathan R. Lai, and Kartik Chandran; Inhibition of Ebola Virus Entry by a C-peptide Targeted to Endosomes. J Biol Chem. May 6, 2011; 286(18): 15854–15861. Published online Mar 16, 2011. doi:10.1074/jbc.M110.207084. PMCID: PMC3091195.

Lauren K. Regula, Richard Harris, Fang Wang, Chelsea D. Higgins, Jayne F. Koellhoffer, Yue Zhao, Kartik Chandran, Jianmin Gao, Mark E. Girvin, and Jonathan R. Lai; Conformational Properties of Peptides Corresponding to the Ebolavirus GP2 Membrane-Proximal External Region in the Presence of Micelle-Forming Surfactants and Lipids. Biochemistry. 2013 May 21; 52(20): . doi:10.1021/bi400040v.



MicroRNAs or miRNAs as cancer biomarkers


miRNAs are a class of endogenous small RNAs approximately 22 nucleotides in size found in plants and animals including humans.

miRNAs are processed by the protein Dicer from hairpin precursor RNAs approximately 70 nucleotides in size. miRNAs have been shown to regulate their target messenger RNAs (mRNAs) by destabilizing the mRNA molecules and by repressing their translation.

Increasingly, it has become apparent that microRNAs take part in the development of cancer. This observation has made miRNAs potential biomarkers for cancer diagnosis and prognosis. Many studies now suggest that the pattern of microRNA expression in a tissue reflects the disease status of that tissue. Therefore, miRNA expression levels may serve as biomarkers with multiple applications in clinical diagnostics. miRNAs can be successfully isolated from biological fluids, which allows for the development of biofluid biopsies or diagnostics. This type of biomarker diagnostics promises minimally invasive assays that save cost and replace complex invasive procedures.

Figure 1: Pre-miRNA nuclear export machinery.


Okada et al. in 2009 solved the structure of the "pre-miRNA nuclear export machinery" formed by pre-miRNA complexed with exportin-5 (Exp-5) and a guanosine triphosphate (GTP)-bound form of the small nuclear guanosine triphosphatase (GTPase) Ran (RanGTP) at a resolution of 2.9 angstroms. The data showed that RNA recognition by Exp-5:RanGTP does not depend on the RNA sequence. This implies that Exp-5:RanGTP can recognize a variety of pre-miRNAs.

Figure 2: The molecules present in the structure and their interactions are shown here.


[Source: http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=78532]


Victor Ambros, Rosalind Lee and Rhonda Feinbaum first discovered miRNAs in 1993 during a study of the gene lin-14 in the development of the nematode Caenorhabditis elegans (C. elegans). However, at the time researchers speculated that these molecules could be a nematode idiosyncrasy. In 2000, it was shown that let-7 represses lin-41, lin-14, lin-28, lin-42 and daf-12 mRNA during the transition between developmental stages in C. elegans. From this time on, miRNAs were recognized as small regulatory RNAs. Furthermore, it became clear that miRNAs are conserved in many species and that the short non-coding RNAs first identified in 1993 are part of a wider phenomenon. For example, Lagos-Quintana et al. in 2001 referred to 21- and 22-nucleotide (nt) RNAs as small temporal RNAs (stRNAs) that function as key regulators of developmental timing. The Tuschl lab showed in 2001 that many 21- and 22-nt expressed RNAs exist in invertebrates and vertebrates and that some of these RNAs, similar to let-7 stRNA, are highly conserved. This discovery led to the conclusion that sequence-specific, posttranscriptional regulatory mechanisms mediated by small RNAs are more general than was previously appreciated. Over 4000 miRNAs have been found so far in all studied eukaryotes. More than 700 miRNAs have already been identified in humans, and more than 800 additional miRNAs are predicted to exist. V. Ambros reported in 2001 that these microRNAs are diverse in sequence and expression patterns. The observation that these molecules are evolutionarily widespread suggests that they may participate in a wide range of genetic regulatory pathways. Figure 3 shows the dramatic increase in publications involving miRNAs and miRNA research.

Figure 3: Increase in miRNA publications in Pubmed.


Animal miRNAs are derived from longer primary transcripts that carry hairpin structures. The processing of these precursor hairpin RNA structures proceeds in a stepwise fashion catalyzed by the RNase III enzymes Drosha and Dicer. Drosha cleaves these RNA molecules near the hairpin base to release the pre-miRNA hairpin. This reaction occurs in the nucleus. Next, the pre-miRNA hairpin is exported into the cytoplasm, where Dicer cleaves it on the loop side of the hairpin. The result is a miRNA:miRNA* duplex. In the next step, one strand of this duplex is preferentially incorporated into a silencing complex.

Recently, an alternative nuclear pathway for miRNA biogenesis was identified in invertebrates. Researchers found that short introns with hairpin potential, termed mirtrons, can be spliced and debranched into pre-miRNA hairpin mimics that bypass Drosha cleavage, a step that is essential for the generation of canonical animal microRNAs. Debranched mirtrons access the canonical miRNA pathway during nuclear export and are then cleaved by Dicer and incorporated into silencing complexes. Mirtrons are therefore miRNAs located in the introns of mRNA-encoding genes. Using computational and experimental strategies, Berezikov et al. established in 2007 that mammals have mirtrons as well. They identified three well-conserved mirtrons expressed in diverse mammals. In addition, 16 primate-specific mirtrons and 46 mirtron candidates, supported by limited cloning evidence, are suspected to be present in primates.

 

Disease dependent differentially expressed miRNAs


 

MicroRNAs (miRNAs) are known to play important roles in the pathology of diseases such as infections and cancer. With the recent development of high-throughput technologies for the global measurement of miRNAs, these molecules have emerged as a new class of cancer biomarkers. Many studies have already explored associations between miRNAs and different cancer features. Often, real-time polymerase chain reaction (RT-PCR) is used to measure the expression of miRNAs in various tissues. Analyzing global miRNA gene expression with complementary DNA microarrays allows for the examination of differentially expressed miRNAs. This type of assay makes it possible to find disease-specific miRNA associations, which hopefully will reveal how miRNAs regulate their target genes. For example, miRNA profiling of different cancer tissues has the potential to determine the lineage and differentiation state of tumors. The following table contains a list of differentially regulated miRNAs in various diseases.


miRNAs

Disease Types

Up/Down-regulated

Reference

let-7a, let-7b, let-7c, let-7d, let-7g, miR-16, miR-23a, miR-23b, miR-26a, miR-92, miR-99a, miR-103, miR-125a, miR-125b, miR-143, miR-145, miR-195, miR-199a, miR-199a, miR-221, miR-222, miR-497

prostate cancer

down-regulated

Porkka KP, Pfeiffer MJ, Waltering KK, Vessella RL, Tammela TL, Visakorpi T: MicroRNA expression profiling in prostate cancer. Cancer Res 2007, Jul 1; 67(13):6130-5.

http://cancerres.aacrjournals.org/content/67/13/6130.abstract

 

miR-202, miR-210, miR-296, miR-320, miR-370, miR-373, miR-498, miR-503

prostate cancer

up-regulated

Porkka KP, Pfeiffer MJ, Waltering KK, Vessella RL, Tammela TL, Visakorpi T: MicroRNA expression profiling in prostate cancer. Cancer Res 2007, Jul 1; 67(13):6130-5.       

miR-16, miR-92a, miR-103, miR-107, miR-197, miR-34b, miR-328, miR-485-3p, miR-486-5p, miR-92b, miR-574-3p, miR-636, miR-640, miR-766, miR-885-5p

prostate cancer

up-regulated

Lodes MJ, Caraballo M, Suciu D, Munro S, Kumar A, Anderson B: Detection of cancer with serum miRNAs on an oligonucleotide microarray. PLoS One 2009, Jul 14; 4(7):e6229.

 

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0006229

 

miR-21, miR-155, miR-221

Pancreatic cancer

up-regulated

Bloomston M, Frankel WL, Petrocca F, Volinia S, Alder H, Hagan JP, Liu CG, Bhatt D, Taccioli C, Croce CM: MicroRNA expression patterns to differentiate pancreatic adenocarcinoma from normal pancreas and chronic pancreatitis JAMA 2007, 297(17):1901-1908. (doi:10.1001/jama.297.17.1901). http://www.ncbi.nlm.nih.gov/pubmed/17473300

 

miR-155, miR-21

colon, lung, breast, stomach, prostate

up-regulated

Iorio MV, Ferracin M, Liu CG, Veronese A, Spizzo R, Sabbioni S, Magri A, Musiani P, Volinia S, Nenci I, Calin GA, Querzzoli P: MicroRNA gene expression deregulation in human breast cancer. Cancer Res 2005, 65:7065–7070. http://www.ncbi.nlm.nih.gov/pubmed/16103053

 

Volinia S, Calin GA, Liu CG, Ambs S, Cimmino A, Petrocca F, Visone R, Iorio M, Roldo C, Ferracin M, Prueitt RL, Yanaihara N, Lanza G, Scarpa A, Vecchione A, Negrini M, Harris CC, Croce CM: A microRNA expression signature of human solid tumors defines cancer gene targets. Proc Natl Acad Sci USA 2006, Feb 14; 103(7):2257-2261. Epub 2006 Feb 3.

http://www.ncbi.nlm.nih.gov/pubmed/16461460

 

miR-142-5p, miR-369-3p, miR-215

lung cancer

up-regulated

Baffa R, Fassan M, Volinia S, O'Hara B, Liu CG, Palazzo JP, Gardiman M, Rugge M, Gomella LG, Croce CM, Rosenberg A: MicroRNA expression profiling of human metastatic cancers identifies cancer gene targets. J Pathol 2009, Jun 1.

http://www.ncbi.nlm.nih.gov/pubmed/19593777.

 

miR-373

lung cancer

down-regulated

Baffa et al. 2009

miR-30d, miR-125b, miR-26a, miR-30a-5p

 

thyroid anaplastic carcinomas

down-regulated

Visone R, Pallante P, Vecchione A, Cirombella R, Ferracin M, Ferraro A, Volinia S, Coluzzi S, Leone V, Borbone E, Liu CG, Petrocca F, Troncone G, Calin GA, Scarpa A, Colato C, Tallini G, Santoro M, Croce CM, Fusco A: Specific microRNAs are down regulated in human thyroid anaplastic carcinomas. Oncogene 2007, Nov 29; 26(54):7590-7595. Epub 2007 Jun 11.

http://www.ncbi.nlm.nih.gov/pubmed/17563749.

 

miR-10b, miR-125b, miR-145

breast cancer

down-regulated

Iorio MV, Ferracin M, Liu CG, Veronese A, Spizzo R, Sabbioni S, Magri A, Musiani P, Volinia S, Nenci I, Calin GA, Querzzoli P: MicroRNA gene expression deregulation in human breast cancer. Cancer Res 2005, 65:7065–7070.

http://www.ncbi.nlm.nih.gov/pubmed/16103053.

 

miR-27a, miR-96, miR-182

 

breast cancer

up-regulated

Guttilla IK, White BA: Coordinate regulation of FOXO1 by miR-27a, miR-96, and miR-182 in breast cancer cells. J Biol Chem. 2009 Aug 28;284(35):23204-16.

miR-21, miR-155

breast cancer

up-regulated

Iorio MV, Ferracin M, Liu CG, Veronese A, Spizzo R, Sabbioni S, Magri A, Musiani P, Volinia S, Nenci I, Calin GA, Querzzoli P: MicroRNA gene expression deregulation in human breast cancer. Cancer Res 2005, 65:7065–7070.

http://www.ncbi.nlm.nih.gov/pubmed/16103053

 

miR-21

breast cancer

up-regulated

Huang GL, Zhang XH, Guo GL, Huang KT, Yang KY, Shen X, You J, Hu XQ: Clinical significance of miR-21 expression in breast cancer: SYBR-Green I-based real-time RT-PCR study of invasive ductal carcinoma. Oncol Rep. 2009, Mar; 21(3):673-9.

http://www.ncbi.nlm.nih.gov/pubmed/19212625       

 

miR-30b, miR-148a

breast cancer

up-regulated

Baffa R, Fassan M, Volinia S, O'Hara B, Liu CG, Palazzo JP, Gardiman M, Rugge M, Gomella LG, Croce CM, Rosenberg A: MicroRNA expression profiling of human metastatic cancers identifies cancer gene targets. J Pathol. 2009, 219(2):214-21.

 

http://www.ncbi.nlm.nih.gov/pubmed/19593777

 

miR-205

breast cancer

down-regulated

Baffa et al. 2009.

miR-142-5p, miR-29b, miR-30b

bladder cancer

up-regulated

Baffa et al. 2009.

miR-145, miR-143, miR-320

bladder cancer

down-regulated

Baffa et al. 2009.

miR-138, miR-125b

colon cancer

up-regulated

Baffa et al. 2009.

miR-17, miR-106a

colon cancer

down-regulated

Baffa et al. 2009.

hsa-miR-205

head and neck cancer

up-regulated

Tran N, McLean T, Zhang X, Zhao CJ, Thomson JM, O’Brien C, Rose B: MicroRNA expression profiles in head and neck cancer cell lines. Biochem Biophys Res Commun 2007, 358:12–17.

http://www.ncbi.nlm.nih.gov/pubmed/17475218.

 

miR-21, miR-221

brain cancer

up-regulated

Ciafre, S.A., Galardi, S., Mangiola, A., Ferracin, M., Liu, C.G., Sabatino, G., Negrini, M., Maira, G., Croce, C.M., Farace, M.G.: Extensive modulation of a set of microRNAs in primary glioblastomas. Biochem Biophys Res Commun 2005, 334:1351–1358.

http://www.sciencedirect.com/science/article/pii/S0006291X0501481

 

miR-9-2, miR-10b, miR-21, miR-25, miR-123, miR-125b-1, miR-125b-2,  miR-130a, miR-221

glioblastomas

up-regulated

Ciafre et al., 2005

miR-10b, miR-21, miR-26a, miR-383, miR-451, miR-486, miR-516-3p, miR-519d

glioblastomas

up-regulated

Godlewski J, Nowicki MO, Bronisz A, Williams S, Otsuki A, Nuovo G, Raychaudhury A, Newton HB, Chiocca EA, Lawler S: Targeting of the Bmi-1 oncogene/stem cell renewal factor by microRNA-128 inhibits glioma proliferation and self-renewal. Cancer Res 2008, 68:9125–9130. doi:10.1158/0008-5472.CAN-08-2629.

http://www.ncbi.nlm.nih.gov/pubmed/19010882

 

miR-7, miR-29b, miR-31, miR-101, miR-107, miR-124, miR-124-2, miR-128-1, miR-129, miR-132, miR-133a , miR-133b, miR-137

glioblastomas

down-regulated

Silber J, Lim DA, Petritsch C, Persson AI, Maunakea AK, Yu M, Vandenberg SR, Ginzinger DG, James CD, Costello JF, Bergers G, Weiss WA, Alvarez-Buylla A, Hodgson JG: miR-124 and miR-137 inhibit proliferation of glioblastoma multiforme cells and induce differentiation of brain tumor stem cells. BMC Med 2008, 6:14. doi:10.1186/1741-7015-6-14.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2443372/

 

miR-138, miR-139, miR-149, miR-153, miR-154, miR-185, miR-187, miR-203, miR-218, miR-323, miR-328, miR-330

glioblastomas

down-regulated

Gal H, Pandi G, Kanner AA, Ram Z, Lithwick-Yanai G, Amariglio N, Rechavi G, Givol D: MIR-451 and Imatinib mesylate inhibit tumor growth of glioblastoma stem cells. Biochem Biophys Res Commun 2008, 376:86–90. doi:10.1016/j.bbrc.2008.08.107.

 

http://www.ncbi.nlm.nih.gov/pubmed/18765229

 

miR-15, miR-16

chronic lymphocytic leukemia (CLL)

down-regulated

Calin GA: Frequent deletions and down-regulation of micro-RNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proc Natl Acad Sci USA 2002, 99: 15524–15529.

http://www.ncbi.nlm.nih.gov/pubmed/12434020       

 

miR-150, miR-155           

CLL

up-regulated

Bartels CL, Tsongalis GJ: MicroRNAs: novel biomarkers for human cancer. Clin Chem 2009, Apr; 55(4):623-31. Epub 2009 Feb 26. http://www.ncbi.nlm.nih.gov/pubmed/19246618

 

miR-143, miR-145

colorectal neoplasia

down-regulated

Michael MZ, O´Connor SM, Van Holst Pellekaan NG, Young GP, James RJ: Reduced accumulation of specific microRNAs in colorectal neoplasia. Mol Cancer Res 2003, 1:882-891.

http://www.ncbi.nlm.nih.gov/pubmed/14573789

 

miR-18, miR-224

hepatocellular carcinoma

up-regulated

Murakami Y, Yasuda T, Saigo K, Urashima T, Toyoda H, Okanoue T, Shimotohno K: Comprehensive analysis of microRNA expression patterns in hepatocellular carcinoma and non-tumorous tissues. Oncogene 2006, 25:2537–2545.

http://www.ncbi.nlm.nih.gov/pubmed/16331254

 

miR-199, miR-195, miR-200, miR-125

hepatocellular carcinoma

down-regulated

Murakami et al., 2006

miR-221, miR-222, miR-146, miR-181

papillary thyroid carcinoma

up-regulated

He H, Jazdzewski K, Li W, Liyanarachchi S, Nagi R, Volinia S, Calin GA: The role of microRNA genes in papillary thyroid carcinoma. Proc Natl Acad Sci USA 2005b, 102:19075-19080.

http://www.ncbi.nlm.nih.gov/pubmed/16365291

Pallante P, Visone R, Ferracin M, Ferraro A, Berlingieri MT, Troncone G, Chiappetta G, Liu CG, Santoro M, Negrini M: Deregulation in human thyroid papillary carcinomas. Endocr Relat Cancer 2006, 13:497–508.

http://www.ncbi.nlm.nih.gov/pubmed/16728577

 

miR-372, miR-373

testicular germ cell tumors

up-regulated

Voorhoeve PM, le Sage C, Schrier M: A genetic screen implicates miRNA-372 and miRNA-373 as oncogenes in testicular germ cell tumors. Cell 2006, 124:1169-1181.

http://www.ncbi.nlm.nih.gov/pubmed/16564011

 

miR-31, miR-96, miR-135b, miR-183

colorectal cancer

up-regulated

Bartels and Tsongalis 2009.

miR-48, miR-135b, miR-133b

colorectal cancer

down-regulated

Bartels and Tsongalis 2009.

let-7b, let-7g, miR-9, miR-21, miR-26a, miR-30a-3p, miR-30a-5p, miR-31, miR-96, miR-124b, miR-132, miR-135a, miR-135b, miR-141, miR-142-3p, miR-142-5p, miR-181a, miR-181b, miR-182, miR-183, miR-194, miR-200a, miR-200b, miR-200c, miR-203, miR-205, miR-215, miR-219, miR-320, miR-338, miR-372

colorectal cancer

up-regulated

Yang L, Belaguli N, Berger DH: MicroRNA and Colorectal Cancer. World J Surg 2009, 33:638–646.

 

let-7a, miR-10a, miR-15b, miR-23a, miR-25, miR-27a, miR-27b, miR-30c, miR-107, miR-124a, miR-125a, miR-125b, miR-127, miR-130a, miR-133a, miR-133b, miR-134, miR-137, miR-143, miR-145, miR-147, miR-154, miR-191, miR-199a, miR-199b, miR-214, miR-296, miR-299, miR-337, miR-339, miR-342, miR-368, miR-370, miR-582

colorectal cancer

down-regulated

Yang L, Belaguli N, Berger DH: MicroRNA and Colorectal Cancer. World J Surg 2009, 33:638–646.

miR-224, miR-18 and pre-miR-P18, miR-221

Hepatocellular cancer

up-regulated

Murakami Y, Yasuda T, Saigo K, Urashima T, Toyoda H, Okanoue T, Shimotohno K: Comprehensive analysis of microRNA expression patterns in hepatocellular carcinoma and non-tumorous tissues. Oncogene 2006, 25:2537–2545.

http://www.ncbi.nlm.nih.gov/pubmed/16331254

Fornari F, Gramantieri L, Ferracin M, Veronese A, Sabbioni S, Calin GA, Grazi GL, Giovannini C, Croce CM, Bolondi L, Negrini M: MiR-221 controls CDKN1C/p57 and CDKN1B/p27 expression in human hepatocellular carcinoma.  Oncogene 2008, Sep 25; 27(43):5651-61 Epub 2008 Jun 2.

http://www.ncbi.nlm.nih.gov/pubmed/18521080

 

miR-199a, miR-199a*, miR-200a, miR-125a, miR-195, miR-125b

Hepatocellular cancer

down-regulated

Murakami et al., 2006.

Li W, Xie L, He X, Li J, Tu K, Wei L, Wu J, Guo Y, Ma X, Zhang P, Pan Z, Hu X, Zhao Y, Xie H, Jiang G, Chen T, Wang J, Zheng S, Cheng J, Wan D, Yang S, Li Y, Gu J: Diagnostic and prognostic implications of microRNAs in human hepatocellular carcinoma. Int J Cancer 2008, Oct 1; 123(7):1616-22.

http://www.ncbi.nlm.nih.gov/pubmed/18649363
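A table of this kind is easier to query once its rows are stored as records. The sketch below is only an illustration of that idea: it copies a small subset of the rows above (it is not the complete table), and the names `ROWS` and `diseases_for` are hypothetical.

```python
# A small subset of rows from the table above:
# (miRNAs, disease type, direction of regulation).
ROWS = [
    (("miR-21", "miR-155", "miR-221"), "pancreatic cancer", "up"),
    (("miR-21", "miR-155"), "breast cancer", "up"),
    (("miR-21", "miR-221"), "brain cancer", "up"),
    (("miR-10b", "miR-125b", "miR-145"), "breast cancer", "down"),
    (("miR-143", "miR-145"), "colorectal neoplasia", "down"),
    (("miR-145", "miR-143", "miR-320"), "bladder cancer", "down"),
]


def diseases_for(mirna, direction, rows=ROWS):
    """List the disease types in which `mirna` is reported with the given
    direction ('up' or 'down'), in table order and without duplicates."""
    found = []
    for mirnas, disease, row_direction in rows:
        if mirna in mirnas and row_direction == direction and disease not in found:
            found.append(disease)
    return found


if __name__ == "__main__":
    print(diseases_for("miR-21", "up"))
    print(diseases_for("miR-145", "down"))
```

Because different studies used different tissues and platforms, a query like this only collects what the cited references report; it does not reconcile conflicting entries for the same miRNA.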

 


Additional reference

Stefanie S Jeffrey; Cancer biomarker profiling with microRNAs. Nature Biotechnology 26, 400-401 (2008). doi:10.1038/nbt0408-400. http://www.nature.com/nbt/journal/v26/n4/full/nbt0408-400.html
