Peptide libraries for heterogeneous nuclear ribonucleoproteins (hnRNPs)

Heterogeneous nuclear ribonucleoprotein (hnRNP) isoforms

Heterogeneous nuclear ribonucleoproteins (hnRNPs) comprise a large and diverse family of RNA-binding proteins. This multifunctional family not only processes heterogeneous nuclear RNAs (hnRNAs) into mature mRNAs but also acts as trans-acting factors in regulating gene expression. Heterogeneous nuclear ribonucleoprotein E1 (hnRNP E1), a member of the family, is an RNA-binding protein that contains three KH repeats. It is encoded by an intronless gene that arose from hnRNP E2 through a retrotransposition event. hnRNP E1 is ubiquitously expressed and regulates major steps of gene expression, including pre-mRNA processing, mRNA stability, and translation. Members of the family exhibit wide-ranging functions in the nucleus and cytoplasm, interact with multiple other proteins, and appear to be involved in post-transcriptional regulation events or pathways in eukaryotic organisms.

The A/B subfamily of ubiquitously expressed hnRNPs

The HNRNPA1 gene belongs to the A/B subfamily of ubiquitously expressed heterogeneous nuclear ribonucleoproteins (hnRNPs). The hnRNPs are RNA-binding proteins that form complexes with heterogeneous nuclear RNA (hnRNA). These proteins are associated with pre-mRNAs in the nucleus and appear to influence pre-mRNA processing and other aspects of mRNA metabolism and transport. While all of the hnRNPs are present in the nucleus, some seem to shuttle between the nucleus and the cytoplasm. The hnRNP proteins have distinct nucleic acid binding properties. The protein encoded by this gene has two repeats of quasi-RRM domains that bind to RNAs. It is one of the most abundant core proteins of hnRNP complexes and is localized to the nucleoplasm. This protein, along with other hnRNP proteins, is exported from the nucleus, probably bound to mRNA, and is immediately re-imported. Its M9 domain acts as both a nuclear localization and a nuclear export signal. The encoded protein is involved in the packaging of pre-mRNA into hnRNP particles and in the transport of poly A+ mRNA from the nucleus to the cytoplasm, and it may modulate splice site selection. It is also thought to have a primary role in the formation of specific myometrial protein species in childbirth. [The myometrium is the middle layer of the uterine wall, mainly consisting of uterine smooth muscle cells.] Multiple alternatively spliced transcript variants have been found for this gene, but only two transcripts are fully described. These variants have multiple alternative transcription initiation sites and multiple polyA sites. [From RefSeq: http://www.ncbi.nlm.nih.gov/refseq/].

Precursor mRNA

Precursor mRNA (pre-mRNA) is an immature single strand of messenger ribonucleic acid (mRNA) that is synthesized in the cell nucleus from a DNA template by transcription. Pre-mRNA makes up the majority of heterogeneous nuclear RNA (hnRNA), which can also include RNA transcripts that do not end up as cytoplasmic mRNA. More details about hnRNP A1 can be found at http://atlasgeneticsoncology.org/Genes/GC_HNRNPA1.html and http://www.uniprot.org/uniprot/P09651.

Heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1) is also called “helix-destabilizing protein”, “single-strand RNA-binding protein”, or “hnRNP core protein A1”. It was identified in the spliceosome C complex and in an mRNP granule complex that appears to be composed of at least the proteins ACTB, ACTN4, DHX9, ERG, HNRNPA1, HNRNPA2B1, HNRNPAB, HNRNPD, HNRNPL, HNRNPR, HNRNPU, HSPA1, HSPA8, IGF2BP1, ILF2, ILF3, NCBP1, NCL, PABPC1, PABPC4, PABPN1, RPLP0, RPS3, RPS3A, RPS4X, RPS8, RPS9, SYNCRIP, TROVE2 and YBX1, together with untranslated mRNAs. The protein interacts with SEPT6, with the hepatitis C virus (HCV) protein NS5B, and with the 5'-UTR and 3'-UTR of HCV RNA. The 5' and 3' untranslated regions (UTRs) are non-coding regions of HCV RNA that contain sequence and structural elements critical for HCV translation and RNA replication.

Spliceosome

The spliceosome is a multimegadalton RNA-protein machine that removes noncoding sequences from nascent pre-mRNAs. Recruitment of the spliceosome to splice sites and subsequent splicing require a series of dynamic interactions among the spliceosome's component U snRNPs and many additional protein factors.

Pre-mRNA splicing by the U2-type spliceosome

(A) Schematic representation of the two-step mechanism of pre-mRNA splicing. Boxes and solid lines represent the exons (E1, E2) and the intron, respectively. The branch site adenosine is indicated by the letter A and the phosphate groups (p) at the 5′ and 3′ splice sites, which are conserved in the splicing products, are also shown. (B) Conserved sequences found at the 5′ and 3′ splice sites and branch site of U2-type pre-mRNA introns in metazoans and budding yeast (S. cerevisiae). Y = pyrimidine and R = purine. The polypyrimidine tract is indicated by (Yn). (C) Canonical cross-intron assembly and disassembly pathway of the U2-dependent spliceosome. For simplicity, the ordered interactions of the snRNPs (indicated by circles), but not those of non-snRNP proteins, are shown. The various spliceosomal complexes are named according to the metazoan nomenclature. Exon and intron sequences are indicated by boxes and lines, respectively. The stages at which the evolutionarily conserved DExH/D-box RNA ATPases/helicases Prp5, Sub2/UAP56, Prp28, Brr2, Prp2, Prp16, Prp22 and Prp43, or the GTPase Snu114, act to facilitate conformational changes are indicated. (D) Model of interactions occurring during exon definition. (From: Will CL and Lührmann R, Cold Spring Harb Perspect Biol 2011;3:a003707).

The DExH/D protein family

The DExH/D protein family is the largest group of enzymes found in eukaryotic RNA metabolism. DExH/D proteins unwind RNA duplexes in an ATP-dependent fashion. In recent years it has become clear that these DExH/D RNA helicases are also involved in the ATP-dependent remodeling of RNA–protein complexes. DExH/D proteins are essential for all aspects of cellular RNA metabolism and processing, for the replication of many viruses and for DNA replication as well. The DExH/D protein family database contains information about these proteins and makes it available over the WWW (http://www.columbia.edu/~ej67/dbhome.htm).

Exon definition

During exon definition, exons are recognized and defined as units during early spliceosome assembly: factors bind to the 3' end of the intron, followed by a search for a downstream 5' splice site. The presence of both a 3' and a 5' splice site in the correct orientation and within 300 nucleotides of one another allows stable exon complexes to form. Concerted recognition of exons may help explain the 300-nucleotide length maximum of vertebrate internal exons, the mechanism by which the splicing machinery ignores cryptic sites within introns, the mechanism by which exon skipping is normally avoided, and the phenotypes of 5' splice site mutations that inhibit splicing of neighboring introns (Robberson et al. 1990).

Cytoplasmic mRNP granules at a glance

(Source: Stacy L. Erickson and Jens Lykke-Andersen; Cytoplasmic mRNP granules at a glance Journal of Cell Science 124, 293-297).

The control of translation and mRNA degradation is important in the regulation of eukaryotic gene expression. Translation and steps in the major pathway of mRNA decay are in competition with each other. mRNAs that are not engaged in translation can aggregate into cytoplasmic mRNP granules referred to as processing bodies (P-bodies) and stress granules, which are related to mRNP particles that control translation in early development and in neurons. Analyses of P-bodies and stress granules suggest a dynamic process, referred to as the mRNA Cycle, in which mRNPs can move between polysomes, P-bodies and stress granules, although the functional roles of mRNP assembly into higher order structures remain poorly understood. (Source: Carolyn J. Decker and Roy Parker; P-Bodies and Stress Granules: Possible Roles in the Control of Translation and mRNA Degradation. Cold Spring Harb Perspect Biol a012286 First published online July 3, 2012).

Structure of the Heterogeneous nuclear ribonucleoprotein A1 isoform [Homo sapiens]

(Source: pdb 2LYV)

Chain A, Solution Structure Of The Two Rrm Domains Of Hnrnp A1 (up1) Using Segmental Isotope Labeling

PubMed Abstract:“Human hnRNP A1 is a multi-functional protein involved in many aspects of nucleic-acid processing such as alternative splicing, micro-RNA biogenesis, nucleo-cytoplasmic mRNA transport and telomere biogenesis and maintenance. The N-terminal region of hnRNP A1, also named unwinding protein 1 (UP1), is composed of two closely related RNA recognition motifs (RRM), and is followed by a C-terminal glycine rich region. Although crystal structures of UP1 revealed inter-domain interactions between RRM1 and RRM2 in both the free and bound form of UP1, these interactions have never been established in solution. Moreover, the relative orientation of hnRNP A1 RRMs is different in the free and bound crystal structures of UP1, raising the question of the biological significance of this domain movement. In the present study, we have used NMR spectroscopy in combination with segmental isotope labeling techniques to carefully analyze the inter-RRM contacts present in solution and subsequently determine the structure of UP1 in solution. Our data unambiguously demonstrate that hnRNP A1 RRMs interact in solution, and surprisingly, the relative orientation of the two RRMs observed in solution is different from the one found in the crystal structure of free UP1 and rather resembles the one observed in the nucleic-acid bound form of the protein. This strongly supports the idea that the two RRMs of hnRNP A1 have a single defined relative orientation which is the conformation previously observed in the bound form and now observed in solution using NMR. It is likely that the conformation in the crystal structure of the free form is a less stable form induced by crystal contacts. Importantly, the relative orientation of the RRMs in proteins containing multiple-RRMs strongly influences the RNA binding topologies that are practically accessible to these proteins. 
Indeed, RRM domains are asymmetric binding platforms contacting single-stranded nucleic acids in a single defined orientation. Therefore, the path of the nucleic acid molecule on the multiple RRM domains is strongly dependent on whether the RRMs are interacting with each other.” (Taken from: http://www.rcsb.org/pdb/explore/explore.do?structureId=2LYV).

The alignment of 3 hnRNP A1 protein sequences is shown below

Protein and Peptide Library Information for hnRNP proteins

Solid-phase peptide synthesis allows the design and creation of libraries comprising collections of synthetic peptides. Synthetic libraries have been used successfully for antibody epitope mapping, determination of antibody specificity, identification of bioactive peptides, development of biological assays, T-cell epitope mapping, vaccine efficacy testing, screening for ligand-binding and antimicrobial peptide activities, peptide-protein interaction studies, drug discovery, LC-MS/MS method development and validation, receptor-ligand studies, cellular assays, the study of constrained peptides and of modified peptides such as modified histone peptides, and screening for MHC class I and class II peptides, among other applications.

Furthermore, peptide libraries provide synthetic, crude or purified peptides that can be customized for screening applications such as epitope mapping, identification of phosphorylation sites or other post-translational modification (PTM) sites, peptide-target interaction studies, and mid- to high-throughput selected reaction monitoring (SRM) and multiple reaction monitoring (MRM) assays in quantitative mass spectrometry (MS) workflows.

Peptide libraries for proteomics

The study of proteomes, sub-proteomes and protein pathways often requires quantitative MS analysis that depends on the identification and validation of SRM and MRM assays. Peptide libraries offer convenience and flexibility for applications involving large numbers of peptides, including quantitative MS approaches, and can greatly reduce the setup time of MS experiments. The SRMAtlas (http://www.mrmatlas.org/), which attempts to map the entire human proteome, can be used as a guide for the development of new types of peptide libraries.

Peptide libraries can be derived from the target protein and, as an option for use in MS workflows, designed with either arginine (R) or lysine (K) as the C-terminal amino acid. Synthetic libraries can cover the most commonly used tryptic proteotypic peptides useful for SRM assay development, or proteolytic peptides of the whole protein. Peptides in screening libraries typically range from 5 to 20 amino acids in length. For peptide libraries used in MS workflows, the standard maximal length is usually 25 amino acids, but it can be increased to 35 amino acids to ensure that the vast majority of potential proteotypic peptides screened for their suitability for a reliable SRM assay are covered.
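The tryptic peptides described above can be enumerated with a short in silico digest. The following Python sketch is only an illustration, not the procedure used to build any particular library. It assumes the common rule that trypsin cleaves C-terminal to K or R unless the next residue is P, and it allows no missed cleavages. The function names tryptic_digest and select_by_length and the lower length bound min_len=4 are hypothetical choices made for this example; max_len=25 follows the standard maximal length mentioned above. The input is the UP1 construct of hnRNP A1 (PDB 2LYV, chain A) used in the examples in this document.

```python
# UP1 construct of hnRNP A1 (PDB 2LYV, chain A), 197 residues,
# written in blocks of 10 residues for readability.
UP1 = (
    "GGSKSESPKE" "PEQLRKLFIG" "GLSFETTDES" "LRSHFEQWGT" "LTDCVVMRDP"
    "NTKRSRGFGF" "VTYATVEEVD" "AAMNARPHKV" "DGRVVEPKRA" "VSREDSQRPG"
    "AHLTVKKIFV" "GGIKEDTEEH" "HLRDYFEQYG" "KIEVIEIMTD" "RGSGKKRGFA"
    "FVTFDDHDSV" "DKIVIQKYHT" "VNGHNCEVRK" "ALSKQEMASA" "SSSQRGR"
)


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal remainder that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


def select_by_length(peptides, min_len=4, max_len=25):
    """Keep peptides whose length lies within [min_len, max_len]."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


if __name__ == "__main__":
    for peptide in select_by_length(tryptic_digest(UP1)):
        print(len(peptide), peptide)
```

Raising max_len to 35 mirrors the extended length limit described above. Every peptide produced this way ends in K or R, which matches the C-terminal option offered for MS workflows.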

Highlights:

Convenient: peptides are provided in individual tubes or in 96-well plates
Application-specific: the C-terminal amino acid of each peptide can be R, K, or any other amino acid
Easy to use: peptides can be delivered lyophilized or suspended in 0.1% trifluoroacetic acid (TFA) in 50% (v/v) acetonitrile/water
Flexible: extensive list of available modifications

Includes:

Fully synthetic peptides
Standard mass spectrometric quality control (QC) analysis
Optional modifications, peptide sizes and levels of QC analysis
Provided either in single tubes or in 96-well plates

Example of a peptide library for the hnRNP protein useful for MS workflows

hnRNP A1 protein (UP1)

pI of Protein: 8.0, Protein MW: 22246, Amino Acid Composition: A10 C2 D13 E18 F10 G17 H8 I8 K16 L8 M4 N4 P6 Q7 R16 S16 T12 V17 W1 Y4

1	GGSKSESPKE	PEQLRKLFIG	GLSFETTDES	LRSHFEQWGT	LTDCVVMRDP	NTKRSRGFGF	VTYATVEEVD	AAMNARPHKV
81	DGRVVEPKRA	VSREDSQRPG	AHLTVKKIFV	GGIKEDTEEH	HLRDYFEQYG	KIEVIEIMTD	RGSGKKRGFA	FVTFDDHDSV
161	DKIVIQKYHT	VNGHNCEVRK	ALSKQEMASA	SSSQRGR
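The amino acid composition listed above can be recomputed directly from the sequence. The short Python sketch below is a minimal illustration; the function name composition is a hypothetical helper, and the output format simply imitates the "A10 C2 ..." notation used in this document.

```python
from collections import Counter

# UP1 construct of hnRNP A1 (PDB 2LYV, chain A), in blocks of 10 residues.
UP1 = (
    "GGSKSESPKE" "PEQLRKLFIG" "GLSFETTDES" "LRSHFEQWGT" "LTDCVVMRDP"
    "NTKRSRGFGF" "VTYATVEEVD" "AAMNARPHKV" "DGRVVEPKRA" "VSREDSQRPG"
    "AHLTVKKIFV" "GGIKEDTEEH" "HLRDYFEQYG" "KIEVIEIMTD" "RGSGKKRGFA"
    "FVTFDDHDSV" "DKIVIQKYHT" "VNGHNCEVRK" "ALSKQEMASA" "SSSQRGR"
)


def composition(sequence):
    """Return residue counts as a string, sorted by one-letter code."""
    counts = Counter(sequence)
    return " ".join(f"{aa}{counts[aa]}" for aa in sorted(counts))


if __name__ == "__main__":
    print("Length:", len(UP1))
    print("Composition:", composition(UP1))
```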

List of Tryptic Peptides

#	m/z (mi)	m/z (av)	Start	End	Sequence
1	418.2660	418.5164	181	184	(K)ALSK(Q)
2	432.2565	432.5028	90	93	(R)AVSR(E)
3	446.2358	446.4862	80	83	(K)VDGR(V)
4	547.2722	547.5889	5	9	(K)SESPK(E)
5	571.3450	571.6984	84	88	(R)VVEPK(R)
6	574.2831	574.6147	49	53	(R)DPNTK(R)
7	600.4079	600.7838	163	167	(K)IVIQK(Y)
8	733.4607	733.9346	108	114	(K)IFVGGIK(E)
9	771.3995	771.8533	10	15	(K)EPEQLR(K)
10	1049.4575	1050.1193	124	131	(R)DYFEQYGK(I)
11	1165.5232	1166.1994	115	123	(K)EDTEEHHLR(D)
12	1181.5215	1182.2617	185	195	(K)QEMASASSSQR(G)
13	1218.6399	1219.4513	132	141	(K)IEVIEIMTDR(G)
14	1428.6437	1429.5663	168	179	(K)YHTVNGHNCEVR(K)
15	1437.7445	1438.5928	94	106	(R)EDSQRPGAHLTVK(K)
16	1699.7598	1700.8128	148	162	(R)GFAFVTFDDHDSVDK(I)
17	1784.9065	1786.0038	17	32	(K)LFIGGLSFETTDESLR(S)
18	1908.8731	1910.1941	33	48	(R)SHFEQWGTLTDCVVMR(D)
19	2510.2133	2511.8362	57	79	(R)GFGFVTYATVEEVDAAMNARPHK(V)
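The monoisotopic values in the "m/z (mi)" column are consistent with singly protonated peptides, [M+H]+. The Python sketch below shows one way to recompute them under that assumption: sum the monoisotopic residue masses, add one water for the free termini, and add one proton per charge. The function name mz_monoisotopic is hypothetical, the residue masses are standard values rounded to five decimals, and no modifications (for example on cysteine or methionine) are considered.

```python
# Standard monoisotopic residue masses (Da), rounded to five decimals.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one proton


def mz_monoisotopic(peptide, charge=1):
    """Monoisotopic m/z of an unmodified peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    for peptide in ("ALSK", "AVSR", "VDGR", "SESPK", "IVIQK"):
        print(peptide, round(mz_monoisotopic(peptide), 4))
```

For SRM assay development the doubly charged precursor is often of interest as well; under the same assumptions it can be obtained with mz_monoisotopic(peptide, charge=2).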

Functional categorization of the proteins for which targeted proteomic assays are available can be found in the MRMAtlas database

Picotti et al. (2008) presented the first database of validated SRM assays for ~1500 yeast proteins. The database was constructed by merging the results of more than 650 SRM-triggered MS2 analyses of S. cerevisiae protein digests, carried out on a triple quadrupole-type mass spectrometer. 1324 proteins are represented by assays for at least one of their proteotypic peptides (PTPs). The database also contains assays for a small number of peptides common to a maximum of two proteins. The peptides were selected because they show an intense signal response in electrospray ionization mass spectrometry. Peptide identifications were validated by collecting a full tandem mass spectrum of each peptide in the QQQ-like mass spectrometer also used for SRM measurements. The database is at present the largest resource of validated SRM/MRM assays of any organism. It currently contains assays for 22% of the yeast proteome.

The database contains assays for yeast proteins involved in all biological processes, as defined by gene ontology (GO) nomenclature. Peptides for proteins spanning all ranges of abundance in yeast are present in the dataset, down to a concentration below 50 molecules/cell.

The MRMAtlas database

The dataset can be found in the MRMAtlas (www.mrmatlas.org or www.srmatlas.org), which is publicly accessible. This database was created as part of the PeptideAtlas project (www.peptideatlas.org) and can be queried via the web interface for peptides, individual proteins, protein sets, or cellular pathways. These data sets can be used for the design of targeted peptide libraries that can be custom synthesized. If desired, all or selected peptides can be labeled with stable isotopes useful for spiking experiments.

Examples of peptide libraries for hnRNP proteins for screening workflows

A: RRM domains of hnRNP A1 (UP1); structure ID: PDB 2LYV_A

Sequence in FASTA format

>gi|433286562|pdb|2LYV|A Chain A, Solution Structure Of The Two Rrm Domains Of Hnrnp A1 (up1) Using Segmental "Isotope Labeling"

GGSKSESPKEPEQLRKLFIGGLSFETTDESLRSHFEQWGTLTDCVVMRDPNTKRSRGFGFVTYATVEEVDAAMNARPHKVD
GRVVEPKRAVSREDSQRPGAHLTVKKIFVGGIKEDTEEHHLRDYFEQYGKIEVIEIMTDRGSGKKRGFAFVTFDDHDSVDK
IVIQKYHTVNGHNCEVRKALSKQEMASASSSQRGR

Peptide library of 15mers with an overlap of 11 amino acids for HnRNP A1 (up1)

#	Peptide	Position	Mw	Chemical Formula	pI	Charge
1	GGSKSESPKEPEQLR	1->15	1628.76	C67 H113 N21 O26	7.1	0
2	SESPKEPEQLRKLFI	5->19	1801.08	C81 H133 N21 O25	7.1	0
3	KEPEQLRKLFIGGLS	9->23	1715.02	C78 H131 N21 O22	10.02	1
4	QLRKLFIGGLSFETT	13->27	1710.01	C79 H128 N20 O22	10.27	1
5	LFIGGLSFETTDESL	17->31	1628.81	C74 H113 N15 O26	2.87	-3
6	GLSFETTDESLRSHF	21->35	1725.85	C75 H112 N20 O27	4.4	-1
7	ETTDESLRSHFEQWG	25->39	1821.89	C78 H112 N22 O29	4.04	-2
8	ESLRSHFEQWGTLTD	29->43	1805.93	C79 H116 N22 O27	4.4	-1
9	SHFEQWGTLTDCVVM	33->47	1752.98	C77 H113 N19 O24 S2	3.98	-1
10	QWGTLTDCVVMRDPN	37->51	1734.96	C73 H115 N21 O24 S2	3.88	-1
11	LTDCVVMRDPNTKRS	41->55	1735.01	C70 H123 N23 O24 S2	8.93	1
12	VVMRDPNTKRSRGFG	45->59	1719.98	C72 H122 N26 O21 S	12.19	3
13	DPNTKRSRGFGFVTY	49->63	1744.93	C78 H117 N23 O23	10.56	2
14	KRSRGFGFVTYATVE	53->67	1717.95	C78 H120 N22 O22	10.56	2
15	GFGFVTYATVEEVDA	57->71	1604.74	C74 H105 N15 O25	2.87	-3
16	VTYATVEEVDAAMNA	61->75	1583.74	C67 H106 N16 O26 S	2.87	-3
17	TVEEVDAAMNARPHK	65->79	1667.86	C69 H114 N22 O24 S	5.34	0
18	VDAAMNARPHKVDGR	69->83	1636.84	C67 H113 N25 O21 S	10.26	2
19	MNARPHKVDGRVVEP	73->87	1704.96	C72 H121 N25 O21 S	10.26	2
20	PHKVDGRVVEPKRAV	77->91	1686.96	C74 H127 N25 O20	10.69	3
21	DGRVVEPKRAVSRED	81->95	1712.88	C70 H121 N25 O25	7.16	0
22	VEPKRAVSREDSQRP	85->99	1753.94	C72 H124 N26 O25	10.24	1
23	RAVSREDSQRPGAHL	89->103	1678.83	C68 H115 N27 O23	10.98	2
24	REDSQRPGAHLTVKK	93->107	1721.93	C72 H124 N26 O23	10.69	3
25	QRPGAHLTVKKIFVG	97->111	1650.97	C76 H127 N23 O18	11.77	4
26	AHLTVKKIFVGGIKE	101->115	1639.98	C77 H130 N20 O19	10.41	3
27	VKKIFVGGIKEDTEE	105->119	1691.93	C76 H126 N18 O25	4.64	-1
28	FVGGIKEDTEEHHLR	109->123	1766.93	C77 H119 N23 O25	5.23	0
29	IKEDTEEHHLRDYFE	113->127	1961.09	C86 H125 N23 O30	4.39	-2
30	TEEHHLRDYFEQYGK	117->131	1952.08	C87 H122 N24 O28	5.23	0
31	HLRDYFEQYGKIEVI	121->135	1910.16	C89 H132 N22 O25	5.34	0
32	YFEQYGKIEVIEIMT	125->139	1863.17	C87 H131 N17 O26 S	3.73	-2
33	YGKIEVIEIMTDRGS	129->143	1710.97	C74 H123 N19 O25 S	4.43	-1
34	EVIEIMTDRGSGKKR	133->147	1718.99	C71 H127 N23 O24 S	10.01	1
35	IMTDRGSGKKRGFAF	137->151	1670.95	C73 H119 N23 O20 S	11.56	3
36	RGSGKKRGFAFVTFD	141->155	1672.9	C76 H117 N23 O20	11.56	3
37	KKRGFAFVTFDDHDS	145->159	1769.93	C80 H116 N22 O24	7.97	1
38	FAFVTFDDHDSVDKI	149->163	1755.9	C81 H114 N18 O26	3.85	-2
39	TFDDHDSVDKIVIQK	153->167	1759.92	C77 H122 N20 O27	4.47	-1
40	HDSVDKIVIQKYHTV	157->171	1782.01	C80 H128 N22 O24	8	2
41	DKIVIQKYHTVNGHN	161->175	1765.96	C78 H124 N24 O23	9.64	3
42	IQKYHTVNGHNCEVR	165->179	1797.99	C76 H120 N26 O23 S	8.82	3
43	HTVNGHNCEVRKALS	169->183	1664.84	C68 H113 N25 O22 S	8.94	3
44	GHNCEVRKALSKQEM	173->187	1729.98	C70 H120 N24 O23 S2	8.89	2
45	EVRKALSKQEMASAS	177->191	1634.87	C67 H119 N21 O24 S	10.02	1
46	ALSKQEMASASSSQR	181->195	1580.74	C62 H109 N21 O25 S	10.27	1
47	QEMASASSSQRGR	185->197	1394.49	C52 H91 N21 O22 S	11.04	1
Peptide Count: 47
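The table above follows from a simple sliding window: 15-mers that overlap by 11 residues start every 4 residues (15 - 11 = 4), and the final window is truncated at the C-terminus. The Python sketch below reproduces the peptide sequences and positions only. The function name overlapping_peptides is hypothetical, and the molecular weight, formula, pI and charge columns are not computed here.

```python
# UP1 construct of hnRNP A1 (PDB 2LYV, chain A), in blocks of 10 residues.
UP1 = (
    "GGSKSESPKE" "PEQLRKLFIG" "GLSFETTDES" "LRSHFEQWGT" "LTDCVVMRDP"
    "NTKRSRGFGF" "VTYATVEEVD" "AAMNARPHKV" "DGRVVEPKRA" "VSREDSQRPG"
    "AHLTVKKIFV" "GGIKEDTEEH" "HLRDYFEQYG" "KIEVIEIMTD" "RGSGKKRGFA"
    "FVTFDDHDSV" "DKIVIQKYHT" "VNGHNCEVRK" "ALSKQEMASA" "SSSQRGR"
)


def overlapping_peptides(sequence, length=15, overlap=11):
    """Return (start, end, peptide) tuples with 1-based inclusive positions."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than length")
    if not sequence:
        return []
    peptides = []
    start = 0
    while True:
        chunk = sequence[start:start + length]
        peptides.append((start + 1, start + len(chunk), chunk))
        if start + length >= len(sequence):
            break  # this window already reaches the C-terminus
        start += step
    return peptides


if __name__ == "__main__":
    library = overlapping_peptides(UP1)
    for number, (first, last, peptide) in enumerate(library, start=1):
        print(f"{number}\t{peptide}\t{first}->{last}")
    print("Peptide Count:", len(library))
```

A larger overlap gives finer resolution in epitope mapping at the cost of more peptides; the same function applied to a 372-residue sequence with the default settings yields 91 windows.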

B: Heterogeneous nuclear ribonucleoprotein A1 isoform a [Homo sapiens]

Sequence in FASTA format

>gi|4504445|ref|NP_002127.1| heterogeneous nuclear ribonucleoprotein A1 isoform a [Homo sapiens]

MSKSESPKEPEQLRKLFIGGLSFETTDESLRSHFEQWGTLTDCVVMRDPNTKRSRGFGFVTYATVEEVDAAMNARPHKVD
GRVVEPKRAVSREDSQRPGAHLTVKKIFVGGIKEDTEEHHLRDYFEQYGKIEVIEIMTDRGSGKKRGFAFVTFDDHDSVD
KIVIQKYHTVNGHNCEVRKALSKQEMASASSSQRGRSGSGNFGGGRGGGFGGNDNFGRGGNFSGRGGFGGSRGGGGYGGS
GDGYNGFGNDGSNFGGGGSYNDFGNYNNQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGYGGSSSSSSYGSGRRF

Peptide library of 15mers with an overlap of 11 amino acids for hnRNP A1 a, human (residues 1-197, numbered as in the UP1 construct of section A)

#	Peptide	Position	Mw	Chemical Formula	pI	Charge
1	GGSKSESPKEPEQLR	1->15	1628.76	C67 H113 N21 O26	7.1	0
2	SESPKEPEQLRKLFI	5->19	1801.08	C81 H133 N21 O25	7.1	0
3	KEPEQLRKLFIGGLS	9->23	1715.02	C78 H131 N21 O22	10.02	1
4	QLRKLFIGGLSFETT	13->27	1710.01	C79 H128 N20 O22	10.27	1
5	LFIGGLSFETTDESL	17->31	1628.81	C74 H113 N15 O26	2.87	-3
6	GLSFETTDESLRSHF	21->35	1725.85	C75 H112 N20 O27	4.4	-1
7	ETTDESLRSHFEQWG	25->39	1821.89	C78 H112 N22 O29	4.04	-2
8	ESLRSHFEQWGTLTD	29->43	1805.93	C79 H116 N22 O27	4.4	-1
9	SHFEQWGTLTDCVVM	33->47	1752.98	C77 H113 N19 O24 S2	3.98	-1
10	QWGTLTDCVVMRDPN	37->51	1734.96	C73 H115 N21 O24 S2	3.88	-1
11	LTDCVVMRDPNTKRS	41->55	1735.01	C70 H123 N23 O24 S2	8.93	1
12	VVMRDPNTKRSRGFG	45->59	1719.98	C72 H122 N26 O21 S	12.19	3
13	DPNTKRSRGFGFVTY	49->63	1744.93	C78 H117 N23 O23	10.56	2
14	KRSRGFGFVTYATVE	53->67	1717.95	C78 H120 N22 O22	10.56	2
15	GFGFVTYATVEEVDA	57->71	1604.74	C74 H105 N15 O25	2.87	-3
16	VTYATVEEVDAAMNA	61->75	1583.74	C67 H106 N16 O26 S	2.87	-3
17	TVEEVDAAMNARPHK	65->79	1667.86	C69 H114 N22 O24 S	5.34	0
18	VDAAMNARPHKVDGR	69->83	1636.84	C67 H113 N25 O21 S	10.26	2
19	MNARPHKVDGRVVEP	73->87	1704.96	C72 H121 N25 O21 S	10.26	2
20	PHKVDGRVVEPKRAV	77->91	1686.96	C74 H127 N25 O20	10.69	3
21	DGRVVEPKRAVSRED	81->95	1712.88	C70 H121 N25 O25	7.16	0
22	VEPKRAVSREDSQRP	85->99	1753.94	C72 H124 N26 O25	10.24	1
23	RAVSREDSQRPGAHL	89->103	1678.83	C68 H115 N27 O23	10.98	2
24	REDSQRPGAHLTVKK	93->107	1721.93	C72 H124 N26 O23	10.69	3
25	QRPGAHLTVKKIFVG	97->111	1650.97	C76 H127 N23 O18	11.77	4
26	AHLTVKKIFVGGIKE	101->115	1639.98	C77 H130 N20 O19	10.41	3
27	VKKIFVGGIKEDTEE	105->119	1691.93	C76 H126 N18 O25	4.64	-1
28	FVGGIKEDTEEHHLR	109->123	1766.93	C77 H119 N23 O25	5.23	0
29	IKEDTEEHHLRDYFE	113->127	1961.09	C86 H125 N23 O30	4.39	-2
30	TEEHHLRDYFEQYGK	117->131	1952.08	C87 H122 N24 O28	5.23	0
31	HLRDYFEQYGKIEVI	121->135	1910.16	C89 H132 N22 O25	5.34	0
32	YFEQYGKIEVIEIMT	125->139	1863.17	C87 H131 N17 O26 S	3.73	-2
33	YGKIEVIEIMTDRGS	129->143	1710.97	C74 H123 N19 O25 S	4.43	-1
34	EVIEIMTDRGSGKKR	133->147	1718.99	C71 H127 N23 O24 S	10.01	1
35	IMTDRGSGKKRGFAF	137->151	1670.95	C73 H119 N23 O20 S	11.56	3
36	RGSGKKRGFAFVTFD	141->155	1672.9	C76 H117 N23 O20	11.56	3
37	KKRGFAFVTFDDHDS	145->159	1769.93	C80 H116 N22 O24	7.97	1
38	FAFVTFDDHDSVDKI	149->163	1755.9	C81 H114 N18 O26	3.85	-2
39	TFDDHDSVDKIVIQK	153->167	1759.92	C77 H122 N20 O27	4.47	-1
40	HDSVDKIVIQKYHTV	157->171	1782.01	C80 H128 N22 O24	8	2
41	DKIVIQKYHTVNGHN	161->175	1765.96	C78 H124 N24 O23	9.64	3
42	IQKYHTVNGHNCEVR	165->179	1797.99	C76 H120 N26 O23 S	8.82	3
43	HTVNGHNCEVRKALS	169->183	1664.84	C68 H113 N25 O22 S	8.94	3
44	GHNCEVRKALSKQEM	173->187	1729.98	C70 H120 N24 O23 S2	8.89	2
45	EVRKALSKQEMASAS	177->191	1634.87	C67 H119 N21 O24 S	10.02	1
46	ALSKQEMASASSSQR	181->195	1580.74	C62 H109 N21 O25 S	10.27	1
47	QEMASASSSQRGR	185->197	1394.49	C52 H91 N21 O22 S	11.04	1
Peptide Count: 47

C: Heterogeneous nuclear ribonucleoprotein A1 isoform b [Homo sapiens] {hnRNP A1 b, human}

Sequence in FASTA format

>gi|14043070|ref|NP_112420.1| heterogeneous nuclear ribonucleoprotein A1 isoform b [Homo sapiens]

MSKSESPKEPEQLRKLFIGGLSFETTDESLRSHFEQWGTLTDCVVMRDPNTKRSRGFGFVTYATVEEVDAAMNARPHKVD
GRVVEPKRAVSREDSQRPGAHLTVKKIFVGGIKEDTEEHHLRDYFEQYGKIEVIEIMTDRGSGKKRGFAFVTFDDHDSVD
KIVIQKYHTVNGHNCEVRKALSKQEMASASSSQRGRSGSGNFGGGRGGGFGGNDNFGRGGNFSGRGGFGGSRGGGGYGGS
GDGYNGFGNDGGYGGGGPGYSGGSRGYGSGGQGYGNQGSGYGGSGSYDSYNNGGGGGFGGGSGSNFGGGGSYNDFGNYNN
QSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGYGGSSSSSSYGSGRRF

Peptide library of 15mers with an overlap of 11 amino acids for hnRNP A1 b, human

#	Peptide	Position	Mw	Chemical Formula	pI	Charge
1	MSKSESPKEPEQLRK	1->15	1774.03	C74 H128 N22 O26 S	9.8	1
2	ESPKEPEQLRKLFIG	5->19	1771.05	C80 H131 N21 O24	7.1	0
3	EPEQLRKLFIGGLSF	9->23	1734.03	C81 H128 N20 O22	7.07	0
4	LRKLFIGGLSFETTD	13->27	1696.97	C78 H125 N19 O23	7.04	0
5	FIGGLSFETTDESLR	17->31	1671.84	C74 H114 N18 O26	3.66	-2
6	LSFETTDESLRSHFE	21->35	1797.92	C78 H116 N20 O29	4.04	-2
7	TTDESLRSHFEQWGT	25->39	1793.88	C77 H112 N22 O28	4.4	-1
8	SLRSHFEQWGTLTDC	29->43	1779.95	C77 H114 N22 O25 S	5.19	0
9	HFEQWGTLTDCVVMR	33->47	1822.09	C80 H120 N22 O23 S2	5.19	0
10	WGTLTDCVVMRDPNT	37->51	1707.94	C72 H114 N20 O24 S2	3.88	-1
11	TDCVVMRDPNTKRSR	41->55	1778.04	C70 H124 N26 O24 S2	10.25	2
12	VMRDPNTKRSRGFGF	45->59	1768.03	C76 H122 N26 O21 S	12.19	3
13	PNTKRSRGFGFVTYA	49->63	1700.92	C77 H117 N23 O21	11.46	3
14	RSRGFGFVTYATVEE	53->67	1718.9	C77 H115 N21 O24	7.03	0
15	FGFVTYATVEEVDAA	57->71	1618.77	C75 H107 N15 O25	2.87	-3
16	TYATVEEVDAAMNAR	61->75	1640.8	C68 H109 N19 O26 S	3.66	-2
17	VEEVDAAMNARPHKV	65->79	1665.88	C70 H116 N22 O23 S	5.34	0
18	DAAMNARPHKVDGRV	69->83	1636.84	C67 H113 N25 O21 S	10.26	2
19	NARPHKVDGRVVEPK	73->87	1701.93	C73 H124 N26 O21	10.69	3
20	HKVDGRVVEPKRAVS	77->91	1676.92	C72 H125 N25 O21	10.69	3
21	GRVVEPKRAVSREDS	81->95	1684.87	C69 H121 N25 O24	10.24	1
22	EPKRAVSREDSQRPG	85->99	1711.86	C69 H118 N26 O25	10.24	1
23	AVSREDSQRPGAHLT	89->103	1623.75	C66 H110 N24 O24	8.02	1
24	EDSQRPGAHLTVKKI	93->107	1678.9	C72 H123 N23 O23	10.02	2
25	RPGAHLTVKKIFVGG	97->111	1579.89	C73 H122 N22 O17	11.77	4
26	HLTVKKIFVGGIKED	101->115	1683.99	C78 H130 N20 O21	9.8	2
27	KKIFVGGIKEDTEEH	105->119	1729.94	C77 H124 N20 O25	5.44	0
28	VGGIKEDTEEHHLRD	109->123	1734.84	C72 H115 N23 O27	4.68	-1
29	KEDTEEHHLRDYFEQ	113->127	1976.06	C85 H122 N24 O31	4.39	-2
30	EEHHLRDYFEQYGKI	117->131	1964.13	C89 H126 N24 O27	5.23	0
31	LRDYFEQYGKIEVIE	121->135	1902.14	C88 H132 N20 O27	4.04	-2
32	FEQYGKIEVIEIMTD	125->139	1815.08	C82 H127 N17 O27 S	3.47	-3
33	GKIEVIEIMTDRGSG	129->143	1604.84	C67 H117 N19 O24 S	4.43	-1
34	VIEIMTDRGSGKKRG	133->147	1646.92	C68 H123 N23 O22 S	10.69	2
35	MTDRGSGKKRGFAFV	137->151	1656.92	C72 H117 N23 O20 S	11.56	3
36	GSGKKRGFAFVTFDD	141->155	1631.8	C74 H110 N20 O22	10.02	1
37	KRGFAFVTFDDHDSV	145->159	1740.89	C79 H113 N21 O24	5.24	0
38	AFVTFDDHDSVDKIV	149->163	1707.85	C77 H114 N18 O26	3.85	-2
39	FDDHDSVDKIVIQKY	153->167	1821.99	C82 H124 N20 O27	4.47	-1
40	DSVDKIVIQKYHTVN	157->171	1758.97	C78 H127 N21 O25	7.9	1
41	KIVIQKYHTVNGHNC	161->175	1754.01	C77 H124 N24 O21 S	9.67	4
42	QKYHTVNGHNCEVRK	165->179	1813	C76 H121 N27 O23 S	9.67	4
43	TVNGHNCEVRKALSK	169->183	1655.87	C68 H118 N24 O22 S	10.02	3
44	HNCEVRKALSKQEMA	173->187	1744.01	C71 H122 N24 O23 S2	8.89	2
45	VRKALSKQEMASASS	177->191	1592.83	C65 H117 N21 O23 S	10.7	2
46	LSKQEMASASSSQRG	181->195	1566.71	C61 H107 N21 O25 S	10.27	1
47	EMASASSSQRGRSGS	185->199	1497.57	C55 H96 N22 O25 S	11.04	1
48	ASSSQRGRSGSGNFG	189->203	1454.47	C56 H91 N23 O23	12.49	2
49	QRGRSGSGNFGGGRG	193->207	1449.49	C56 H92 N26 O20	12.81	3
50	SGSGNFGGGRGGGFG	197->211	1270.26	C52 H75 N19 O19	11.18	1
51	NFGGGRGGGFGGNDN	201->215	1382.34	C56 H79 N21 O21	6.95	0
52	GRGGGFGGNDNFGRG	205->219	1424.43	C58 H85 N23 O20	11.04	1
53	GFGGNDNFGRGGNFS	209->223	1502.5	C64 H87 N21 O22	6.95	0
54	NDNFGRGGNFSGRGG	213->227	1511.51	C61 H90 N24 O22	11.04	1
55	GRGGNFSGRGGFGGS	217->231	1369.4	C56 H84 N22 O19	12.49	2
56	NFSGRGGFGGSRGGG	221->235	1369.4	C56 H84 N22 O19	12.49	2
57	RGGFGGSRGGGGYGG	225->239	1298.32	C53 H79 N21 O18	11.21	2
58	GGSRGGGGYGGSGDG	229->243	1197.12	C45 H68 N18 O21	6.86	0
59	GGGGYGGSGDGYNGF	233->247	1321.26	C56 H72 N16 O22	3.1	-1
60	YGGSGDGYNGFGNDG	237->251	1436.35	C60 H77 N17 O25	2.92	-2
61	GDGYNGFGNDGGYGG	241->255	1406.32	C59 H75 N17 O24	2.92	-2
62	NGFGNDGGYGGGGPG	245->259	1282.22	C53 H71 N17 O21	3.1	-1
63	NDGGYGGGGPGYSGG	249->263	1271.2	C52 H70 N16 O22	3.1	-1
64	YGGGGPGYSGGSRGY	253->267	1391.41	C60 H82 N18 O21	9.43	1
65	GPGYSGGSRGYGSGG	257->271	1315.31	C54 H78 N18 O21	9.63	1
66	SGGSRGYGSGGQGYG	261->275	1346.32	C54 H79 N19 O22	9.63	1
67	RGYGSGGQGYGNQGS	265->279	1444.42	C58 H85 N21 O23	9.63	1
68	SGGQGYGNQGSGYGG	269->283	1345.28	C54 H76 N18 O23	5.96	0
69	GYGNQGSGYGGSGSY	273->287	1410.36	C59 H79 N17 O24	5.96	0
70	QGSGYGGSGSYDSYN	277->291	1498.43	C62 H83 N17 O27	3.1	-1
71	YGGSGSYDSYNNGGG	281->295	1454.37	C60 H79 N17 O26	3.1	-1
72	GSYDSYNNGGGGGFG	285->299	1408.34	C59 H77 N17 O24	3.1	-1
73	SYNNGGGGGFGGGSG	289->303	1244.17	C50 H69 N17 O21	6	0
74	GGGGGFGGGSGSNFG	293->307	1171.12	C48 H66 N16 O19	6.09	0
75	GFGGGSGSNFGGGGS	297->311	1201.15	C49 H68 N16 O20	6.09	0
76	GSGSNFGGGGSYNDF	301->315	1422.37	C60 H79 N17 O24	3.1	-1
77	NFGGGGSYNDFGNYN	305->319	1582.54	C69 H87 N19 O25	3.1	-1
78	GGSYNDFGNYNNQSS	309->323	1623.55	C67 H90 N20 O28	3.1	-1
79	NDFGNYNNQSSNFGP	313->327	1674.64	C71 H95 N21 O27	3.1	-1
80	NYNNQSSNFGPMKGG	317->331	1614.69	C67 H99 N21 O24 S	9.8	1
81	QSSNFGPMKGGNFGG	321->335	1484.59	C63 H93 N19 O21 S	10.28	1
82	FGPMKGGNFGGRSSG	325->339	1455.6	C62 H94 N20 O19 S	11.6	2
83	KGGNFGGRSSGPYGG	329->343	1397.45	C59 H88 N20 O20	10.58	2
84	FGGRSSGPYGGGGQY	333->347	1446.49	C63 H87 N19 O21	9.63	1
85	SSGPYGGGGQYFAKP	337->351	1472.57	C67 H93 N17 O21	9.52	1
86	YGGGGQYFAKPRNQG	341->355	1599.71	C71 H102 N22 O21	10.19	2
87	GQYFAKPRNQGGYGG	345->359	1599.71	C71 H102 N22 O21	10.19	2
88	AKPRNQGGYGGSSSS	349->363	1452.49	C58 H93 N21 O23	10.58	2
89	NQGGYGGSSSSSSYG	353->367	1394.32	C55 H79 N17 O26	5.96	0
90	YGGSSSSSSYGSGRR	357->371	1494.5	C59 H91 N21 O25	10.42	2
91	SSSSSYGSGRRF	361->372	1277.32	C52 H80 N18 O20	11.21	2
Peptide Count: 91

References

Barraud P, Allain FH; Solution structure of the two RNA recognition motifs of hnRNP A1 using segmental isotope labeling: how the relative orientation between RRMs influences the nucleic acid binding topology. J Biomol NMR (2013) 55, p. 119.

Carolyn J. Decker and Roy Parker; P-Bodies and Stress Granules: Possible Roles in the Control of Translation and mRNA Degradation. Cold Spring Harb Perspect Biol a012286. First published online July 3, 2012.

Frank Desiere, Eric W. Deutsch, Alexey I. Nesvizhskii, Parag Mallick, Nichole King, Jimmy K. Eng, Alan Aderem, Rose Boyle, Erich Brunner, Samuel Donohoe, Nelson Fausto, Ernst Hafen, Lee Hood, Michael G. Katze, Kathleen Kennedy, Floyd Kregenow, Hookeun Lee, Biaoyang Lin, Dan Martin, Jeff Ranish, David J. Rawlings, Lawrence E. Samelson, Yuzuru Shiio, Julian Watts, Bernd Wollscheid, Michael E. Wright, Wei Yan, Lihong Yang, Eugene Yi, Hui Zhang and Ruedi Aebersold; Integration of Peptide Sequences Obtained by High-Throughput Mass Spectrometry with the Human Genome. Genome Biology 2004, 6:R9.

Eric W Deutsch, Henry Lam & Ruedi Aebersold EMBO reports 9, 5, 429–434 (2008) PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows

Domon B, Aebersold R. Science. 2006 Apr 14;312(5771):212-7 Mass spectrometry and protein analysis.

Lange V, Malmström JA, Didion J, King NL, Johansson BP, Schäfer J, Rameseder J, Wong CH, Deutsch EW, Brusniak MY, Bühlmann P, Björck L, Domon B, Aebersold R. Mol Cell Proteomics. 2008 Apr 13. Targeted quantitative analysis of Streptococcus pyogenes virulence factors by multiple reaction monitoring.

Paola Picotti, Mathieu Clément-Ziza, Henry Lam, David S. Campbell, Alexander Schmidt, Eric W. Deutsch, Hannes Röst, Zhi Sun, Oliver Rinner, Lukas Reiter, Qin Shen, Jacob J. Michaelson, Andreas Frei, Simon Alberti, Ulrike Kusebauch, Bernd Wollscheid, Robert L. Moritz, Andreas Beyer & Ruedi Aebersold Nature. 2013 Feb 14;494(7436):266-70. doi: 10.1038/nature11835. Epub 2013 Jan 20. A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis

Paola Picotti, Henry Lam, David Campbell, Eric W. Deutsch, Hamid Mirzaei, Jeff Ranish, Bruno Domon and Ruedi Aebersold; A database of mass spectrometric assays for the yeast proteome. Nat Methods. 2008 November; 5(11): 913–914.

B L Robberson, G J Cote, and S M Berget; Exon definition may facilitate splice site selection in RNAs with multiple exons. Mol Cell Biol. 1990 January; 10(1): 84–94. PMCID: PMC360715

Other resources

HNRNPA1: provided by HGNC (http://www.genenames.org/), heterogeneous nuclear ribonucleoprotein A1: provided by HGNC. Primary source: HGNC:5031, See related: Ensembl:ENSG00000135486; HPRD:01242; MIM:164017; Vega:OTTHUMG00000169702; Gene type: protein coding; Organism: Homo sapiens. Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo. Also known as: HNRPA1; HNRPA1L3; hnRNP A1; hnRNP-A1