A minimal ligand binding pocket within a network of correlated mutations identified by multiple sequence and structural analysis of G protein coupled receptors
- Subhodeep Moitra†1,
- Kalyan C Tirupula†2,
- Judith Klein-Seetharaman2 and
- Christopher James Langmead1
© Moitra et al.; licensee BioMed Central Ltd. 2012
Received: 29 February 2012
Accepted: 21 June 2012
Published: 29 June 2012
G protein coupled receptors (GPCRs) are seven-helical transmembrane proteins that function as signal transducers. They bind ligands in their extracellular and transmembrane regions and activate cognate G proteins at their intracellular surface on the other side of the membrane. The relay of allosteric communication between the ligand binding site and the distant G protein binding site is poorly understood. In this study, GREMLIN, a recently developed method that identifies networks of co-evolving residues from multiple sequence alignments, was used to identify residues that may be involved in communicating the activation signal across the membrane. The GREMLIN-predicted long-range interactions between amino acids were analyzed with respect to the seven GPCRs that had been crystallized at the time this study was undertaken.
GREMLIN significantly enriches the edges containing residues that are part of the ligand binding pocket, when compared to a control distribution of edges drawn from a random graph. An analysis of these edges reveals a minimal GPCR binding pocket containing four residues (T1183.33, M2075.42, Y2686.51 and A2927.39). Additionally, of the ten residues predicted to have the most long-range interactions (A1173.32, A2726.55, E1133.28, H2115.46, S186EC2, A2927.39, E1223.37, G902.57, G1143.29 and M2075.42), nine are part of the ligand binding pocket.
We demonstrate the use of GREMLIN to reveal a network of statistically correlated and functionally important residues in class A GPCRs. GREMLIN identified that ligand binding pocket residues are extensively correlated with distal residues. An analysis of the GREMLIN edges across multiple structures suggests that there may be a minimal binding pocket common to the seven known GPCRs. Further, the activation of rhodopsin involves these long-range interactions between extracellular and intracellular domain residues mediated by the retinal domain.
Keywords: GPCR, GREMLIN, Long-range interactions, Ligand binding pocket, Graphical model
Table 1: GPCR summary table
Receptor | PDB IDs [number of structures] | Ligands
Bovine rhodopsin (BR) | 1F88, 1GZM, 1HZX, 1JFP, 1L9H, 1LN6, 1U19, 2G87, 2HPY, 2I35, 2I36, 2I37, 2J4Y, 2PED, 3C9L, 3C9M, 3CAP, 3DQB [18] | RT, ligand free
Squid rhodopsin (SR) | 2Z73, 2ZIY [2] |
Turkey β1 adrenergic receptor (β1AR) | 2VT4, 2Y00, 2Y01, 2Y02, 2Y03, 2Y04 [6] | Cyanopindolol, Dobutamine, Carmoterol, Isoprenaline, Salbutamol
Human β2 adrenergic receptor (β2AR) | 2R4R, 2R4S, 2RH1, 3D4S, 3KJ6, 3NY8, 3NY9, 3NYA, 3P0G, 3PDS [10] | Carazolol, Timolol, ICI 118,551, (molecule from Kolb et al., 2009), Alprenolol, BI-167107, FAUC50
Human A2A adenosine receptor (A2A) | [1] |
Human chemokine receptor (CXCR4) | 3ODU, 3OE0, 3OE6, 3OE8, 3OE9 [5] |
Human dopamine D3 receptor (D3R) | [1] |
In GPCRs, the binding of a ligand in the extracellular (EC) or transmembrane (TM) domain is the signal that is propagated to the intracellular (IC) domain, where different effectors bind, in particular the G protein heterotrimer, G protein coupled receptor kinases (GRKs) and β-arrestin. Thus, receptor activation is an inherently allosteric process in which the ligand binding signal is communicated to a distant site. The activation of rhodopsin and other class A GPCRs is thought to be conserved and involves rearrangements in structural microdomains. Conformational changes of multiple ‘switches’ in tandem activate the receptor. These long-range interactions between distant residues are important for the function of the receptors and are also closely involved in their folding and structural stability [8, 9]. Identifying the residues involved in the propagation of signals within the protein is important for understanding the mechanism of activation. While much information can be extracted directly from crystal structures, allosteric interactions are dynamic and implicit in nature and thus are not directly observable in static crystal structures. Experimental methods for investigating dynamics, such as nuclear magnetic resonance, are presently incapable of resolving allosteric interactions in large membrane proteins such as GPCRs.
Due to the limitations of experimental methods, statistical analysis of GPCR sequences is an alternative in identifying residues that may be involved in allosteric communication. Here, considerable effort has been directed towards identifying networks of co-evolving residues from multiple sequence alignments (MSA), i.e. residues that are statistically correlated in the MSA. Such correlations are thought to be necessary for function, and may provide insights into how signals are propagated between different domains. A number of computational methods have been developed to identify such couplings from MSAs, including Hidden Markov Models (HMMs), Statistical Coupling Analysis (SCA) [11, 12], Explicit Likelihood of Subset Co-variation (ELSC), Graphical Models for Residue Coupling (GMRC), and Generative REgularized ModeLs of proteINs (GREMLIN). Like the GMRC method, GREMLIN learns an undirected probabilistic graphical model known as a Markov Random Field (MRF). Unlike HMMs, which are also graphical models, MRFs are well suited to modelling long-range couplings (i.e., between non-sequential residues). The SCA and ELSC methods return a set of residue couplings (which may include long-range couplings), but unlike MRFs, they do not distinguish between direct (conditionally dependent) and indirect (conditionally independent) correlations. This distinction is crucial in determining whether an observed correlation between two residues can be explained in terms of a network of correlations involving other residues. The key difference between the GMRC and GREMLIN methods is that GREMLIN is statistically consistent and guaranteed to learn an optimal MRF, whereas the GMRC uses heuristics to learn the MRF. We have previously reported detailed comparisons of the GMRC and GREMLIN methods and found that GREMLIN achieved higher accuracy and superior scalability.
Multiple sequence alignments of class A GPCRs have previously been examined by the SCA and GMRC methods. In the SCA study, the authors focused on the critical residue at position 296 corresponding to a lysine (K2967.43), which is the covalent attachment site for retinal (RT) in bovine rhodopsin [6, 15]. Several networks of residues were proposed to mediate the signal flow from the ligand binding pocket to the G protein coupling site. This focus overlooked the important contribution of the EC domain to GPCR structure and dynamics. In contrast to SCA, there were no statistically coupled residues involving K2967.43 in the GMRC study, rendering a comparison of SCA and GMRC results impossible. Only five edges in GMRC were considered statistically significant, limiting the interpretability of the results. At the time of the above studies, the rhodopsin crystal structure was the only GPCR structure available. The larger number of structures now published (Table 1) provides us with an opportunity to investigate the generality of the roles of individual residues for allostery in different GPCRs. Furthermore, we re-examine the communication across the entire membrane, not only from a single RT residue to the IC side, but considering all possible communication points.
Because of the demonstrated advantages of GREMLIN over other methods, we applied GREMLIN to the same GPCR sequence alignment previously investigated by SCA and GMRC studies for comparability [12, 14]. Using GREMLIN we identified statistically significant long-range couplings in class A GPCRs and analyzed the results with respect to all seven GPCRs that had been crystallized at the time of our study. Our findings indicate that the ligand binding residues are significantly enriched in these long-range couplings, mediating not only communication to the IC, but also to the EC side of the membrane. Nine of the 10 residues with the largest number of long-range couplings belong to the ligand binding domain. There are a total of 34 statistically significant long-range couplings involving these 10 residues, which include experimentally determined microdomains and activation switches in GPCRs. Our study describes a comprehensive view of the network of statistical couplings across the membrane in class A GPCRs. The details of this network are consistent with the hypothesis that the ligand-binding pocket mediates allosteric communication. The independent identification of a crucial role of the ligand binding pocket in mediating this communication provides the first sequence-based support for the early notion that all three domains in GPCRs are structurally coupled. Finally, the extent of enrichment of edges in different GPCR structures allowed us to propose a novel minimal binding pocket predicted to represent the common core of ligand contact residues crucial for activation of all class A GPCRs.
Results and discussion
GREMLIN was used to identify a network of correlated mutations in class A GPCRs. We first used bovine rhodopsin as a template to map the edges (correlations) to the structure. We defined the set of residues involved in interaction with the RT ligand based on the structure of rhodopsin, and first analysed the results with respect to these residues. Subsequently, we identified the ligand binding pockets of all GPCRs with known structure to assess the generality of our findings. Finally, we identified minimal binding pockets that capture the most general aspects of ligand binding across all GPCRs we examined.
Mapping of GREMLIN edges to the structure of bovine Rhodopsin
Table 2: Comparison of edge distribution from control set and GREMLIN
Columns: Control set (null distribution), % of edges | GREMLIN (at penalty λ = 38), % of edges | GREMLIN > Null | GREMLIN < Null
The finding that there is significant enrichment in the EC-EC and IC-IC contacts and that there is an under-representation of EC-IC domain contacts is biologically meaningful, because EC-IC interactions would structurally be mediated via the TM domain. Interestingly, there is a lack of significant enrichment of edges within the TM domain and a slight under-representation of EC-TM and TM-IC edges. A lack of TM enrichment is in line with the general view of the TM helices as rigid bodies in the GPCR field [17–19]. Furthermore, an important evolutionary pressure experienced by the amino acids in the TM region is to ensure that hydrophobic residues in the helices face the lipid bilayer. This pressure may override the importance of specific TM-TM contacts. However, it was puzzling that EC-TM and TM-IC contacts are under-represented since we would expect to find long-range couplings between EC and IC domains to be mediated via the intermediate TM domain. We therefore hypothesized that the EC-IC long-range contacts are more specifically mediated through a subset of TM and EC residues, namely those participating in binding RT. Indeed, 20 residues out of 27 in the RT pocket are in TM regions. We therefore analyzed the edges involving RT binding pocket residues in more detail.
Long-range couplings involving the ligand binding pockets
The RT edges were further classified into EC-RT, RT-TM, IC-RT and RT-RT groups and were compared with the respective distributions in the control set. There is significant enrichment in EC-RT, IC-RT and all other groups compared to the control set (Table 2). This finding supports the hypothesis that the EC-IC long-range couplings are mediated via RT. This is in line with our current understanding of rhodopsin activation, as the initial conformational changes triggered on activation of the receptor occur in the ligand binding domain and are ultimately propagated to the IC domain.
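The comparison of a learned edge set against a control distribution can be illustrated with a small Monte Carlo test. This is a sketch under stated assumptions rather than the procedure used in the study: the null model here draws random graphs with the same number of edges uniformly from all residue pairs, and the residue indices, domain labels and function names are hypothetical.

```python
import random
from itertools import combinations


def edge_class(u, v, domain):
    """Return an order-independent label such as 'EC-TM' for an edge."""
    return "-".join(sorted((domain[u], domain[v])))


def class_fraction(edges, domain, target):
    """Fraction of edges whose domain class equals `target`."""
    if not edges:
        return 0.0
    hits = sum(1 for u, v in edges if edge_class(u, v, domain) == target)
    return hits / len(edges)


def enrichment_p_value(edges, domain, target, n_random=2000, seed=0):
    """Empirical one-sided p-value that `target` edges are enriched.

    Null model (assumption): random graphs with the same number of edges,
    drawn uniformly without replacement from all residue pairs.
    """
    rng = random.Random(seed)
    all_pairs = list(combinations(sorted(domain), 2))
    observed = class_fraction(edges, domain, target)
    at_least = 0
    for _ in range(n_random):
        sample = rng.sample(all_pairs, len(edges))
        if class_fraction(sample, domain, target) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (at_least + 1) / (n_random + 1)


# Toy data (hypothetical): 12 residues in three domains, 4 observed edges.
domain = {i: "EC" for i in range(4)}
domain.update({i: "TM" for i in range(4, 8)})
domain.update({i: "IC" for i in range(8, 12)})
edges = [(0, 1), (1, 2), (2, 3), (8, 9)]
observed, p = enrichment_p_value(edges, domain, "EC-EC")
```

In the toy data three of the four edges are EC-EC, so `observed` is 0.75, while only 6 of the 66 possible residue pairs are EC-EC; the empirical p-value is correspondingly small. The same function applies to any other class label, such as EC-RT, if the domain map assigns pocket residues their own label.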
Mapping of GREMLIN edges to the structure of other GPCRs
Common ligand binding pockets defined for GPCRs with structural information
M1, G3, L31, Q36, F37, M44, T93, T94, T97, S98, F103, E113, G114, A117, T118, P171, L172, Y178, I179, P180, T193, P194, H195, E196, E197, N200, F203, V204, M207, Y268, A272, I275, H278, Q279, S281, P285, M288, T289, A292
M86, T94, T97, S98, F103, E113, G114, A117, T118, G121, E122, I179, P180, I189, Y191, F203, V204, M207, F208, H211, W265, Y268, A269, A272, P285, M288, T289, A292, F293, K296
T94, T97, S98, E113, G114, A117, T118, G121, E122, I179, P180, I189, Y191, F203, V204, M207, F208, H211, W265, Y268, A269, A272, M288, T289, A292, F293, K296
E113, G114, A117, T118, G121, E122, L125, Y178, E181, S186, C187, G188, I189, Y191, M207, F208, H211, F212, F261, W265, Y268, A269, A272, A292, F293, A295, K296
M86, G90, E113, G114, A117, T118, G121, E122, L125, Y178, E181, S186, C187, G188, I189, M207, F208, H211, F212, F261, W265, Y268, A269, A292, K296
T94, E113, G114, A117, T118, G121, E122, P180, G188, I189, V204, M207, F208, H211, W265, Y268, A269, A272, F273, M288, A292, K296
T118, P180, E181, F203, M207, W265, Y268, A269, A272, F283, P285, M288, T289, A292
A minimal ligand binding pocket
Defining a minimal GPCR pocket
M1, G3, L31, Q36, F37, M44, M86, G90, T93, T94, T97, S98, F103, E113, G114, A117, T118, G121, E122, L125, P171, L172, Y178, I179, P180, E181, S186, C187, G188, I189, Y191, T193, P194, H195, E196, E197, N200, F203, V204, M207, F208, H211, F212, F261, W265, Y268, A269, A272, F273, I275, H278, Q279, S281, F283, P285, M288, T289, A292, F293, A295, K296
M86, T94, T97, S98, F103, E113, G114, A117, T118, G121, E122, L125, Y178, I179, P180, E181, S186, C187, G188, I189, Y191, F203, V204, M207, F208, H211, F212, F261, W265, Y268, A269, A272, P285, M288, T289, A292, F293, K296
T94, T97, S98, E113, G114, A117, T118, G121, E122, Y178, I179, P180, E181, G188, I189, Y191, F203, V204, M207, F208, H211, W265, Y268, A269, A272, P285, M288, T289, A292, F293, K296
T94, E113, G114, A117, T118, G121, E122, P180, I189, F203, V204, M207, F208, H211, W265, Y268, A269, A272, M288, T289, A292, K296
E113, G114, A117, T118, G121, E122, P180, I189, M207, F208, H211, W265, Y268, A269, A272, M288, A292, K296
E113, G114, A117, T118, M207, W265, Y268, A269, A272, A292
T118, M207, Y268, A292
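The nested residue lists above are consistent with a simple counting rule: a residue is kept at level k if it occurs in at least k of the seven common pockets. That rule is inferred here from the lists themselves, so the sketch below should be read as an illustration and not as the authors' script. The seven rows are copied from the common-pocket lists in the order given, without attaching receptor names; the function name is hypothetical.

```python
from collections import Counter

# Seven common pockets, one per line, in the order listed in the text.
POCKET_ROWS = """
M1 G3 L31 Q36 F37 M44 T93 T94 T97 S98 F103 E113 G114 A117 T118 P171 L172 Y178 I179 P180 T193 P194 H195 E196 E197 N200 F203 V204 M207 Y268 A272 I275 H278 Q279 S281 P285 M288 T289 A292
M86 T94 T97 S98 F103 E113 G114 A117 T118 G121 E122 I179 P180 I189 Y191 F203 V204 M207 F208 H211 W265 Y268 A269 A272 P285 M288 T289 A292 F293 K296
T94 T97 S98 E113 G114 A117 T118 G121 E122 I179 P180 I189 Y191 F203 V204 M207 F208 H211 W265 Y268 A269 A272 M288 T289 A292 F293 K296
E113 G114 A117 T118 G121 E122 L125 Y178 E181 S186 C187 G188 I189 Y191 M207 F208 H211 F212 F261 W265 Y268 A269 A272 A292 F293 A295 K296
M86 G90 E113 G114 A117 T118 G121 E122 L125 Y178 E181 S186 C187 G188 I189 M207 F208 H211 F212 F261 W265 Y268 A269 A292 K296
T94 E113 G114 A117 T118 G121 E122 P180 G188 I189 V204 M207 F208 H211 W265 Y268 A269 A272 F273 M288 A292 K296
T118 P180 E181 F203 M207 W265 Y268 A269 A272 F283 P285 M288 T289 A292
"""
pockets = [line.split() for line in POCKET_ROWS.strip().splitlines()]


def residues_in_at_least(pockets, k):
    """Residues present in at least k of the given pockets."""
    counts = Counter(res for pocket in pockets for res in set(pocket))
    return {res for res, n in counts.items() if n >= k}


# Residues shared by every pocket (k equal to the number of pockets).
minimal = residues_in_at_least(pockets, len(pockets))
# Residues shared by at least six of the seven pockets.
six_of_seven = residues_in_at_least(pockets, 6)
```

With these inputs, `minimal` contains T118, M207, Y268 and A292, matching the four-residue pocket reported in the text, and `six_of_seven` reproduces the ten-residue list that precedes it.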
GREMLIN edges (λ = 38) involving residues from the B7 pocket
G90, T94, P171, E197, T198, H211, A269, A272, F293, M309, C316
G90, S98, G114, G121, E122, P171, E181, C185, D190, E196, A233, A269, I275, H278, G284, M288, T289, A292, F293, C316, K325, N326
A26, Y29, H65, L72, G90, T93, T94, V104, N111, A117, G121, N145, F148, S176, Y178, S186, D190, N199, N200, V204, M207, Q237, T243, A269, A272, I275, F276, Q312
Identification of the most frequently observed residues involved in long-range interactions in rhodopsin
The previous section showed that GREMLIN is able to shed light on the biological and structural properties of the GPCR family. In this section we present a strategy for ranking GREMLIN edges. This strategy can be used for exploratory purposes in order to discover novel couplings and residues that might play a key role in structure and function of the GPCR protein family.
The strategy is based on the following two key insights. The first insight is that the residues that have high degree in the graph of GREMLIN couplings could be considered as hubs that lie on the communication pathways in GPCRs. This is motivated by the graphical model, since a mutation or perturbation in the hub residue could affect a number of other residues. The second insight is based on the persistence of certain couplings even under stringent model complexity constraints. The larger the regularization parameter, λ, the sparser the Markov Random Field (MRF; see Methods). Thus, each edge in the MRF can be assigned a persistence score equal to the largest λ at which the coupling was retained. The persistence score is an indicator of the importance of the couplings and the corresponding residues.
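The two ranking criteria described above, node degree and edge persistence, can be sketched in a few lines. This is a minimal illustration assuming the learned edge sets are available as a mapping from each λ value to a list of residue pairs; the residue names and edge sets below are placeholders, not results from the study.

```python
from collections import Counter


def degree_ranking(edges):
    """Rank residues by number of incident edges (highest first)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Sort by descending degree, then by name for a reproducible order.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


def persistence_scores(edges_by_lambda):
    """Map each edge to the largest lambda at which it is still present."""
    scores = {}
    for lam, edges in edges_by_lambda.items():
        for u, v in edges:
            edge = frozenset((u, v))  # undirected edge
            scores[edge] = max(lam, scores.get(edge, lam))
    return scores


# Placeholder edge sets learned at three penalty values (hypothetical).
edges_by_lambda = {
    38: [("res_a", "res_b"), ("res_a", "res_c"), ("res_c", "res_d")],
    140: [("res_a", "res_b"), ("res_a", "res_c")],
    280: [("res_a", "res_b")],
}
ranking = degree_ranking(edges_by_lambda[38])
scores = persistence_scores(edges_by_lambda)
```

Here `ranking` lists the hub candidates at the lowest penalty, and `scores` assigns each edge the largest penalty it survives, so the edge between `res_a` and `res_b` receives the highest persistence score.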
List of top ranked residues and the most persistent edges
Number of edges (at λ = 38)
Most persistent pair position (edges at penalty λ = 140)
G902.57, E247 IC3, F293 7.40 , K296 7.43
L72IC1, G114 3.29 , S176EC2, Y178EC2
M441.39, L72IC1, W1263.41, Q237IC3, F293 7.40
F912.58, C140 IC2, F148 IC2
K67IC1, Q244IC3, P2917.38
I481.43, G902.57, E196EC3, M207 5.42 , A269 6.52 , F293 7.40 , C316IC (C-terminus)
A117 3.32 , G1203.35, E122 3.37 , M207 5.42 , Q237IC3, A269 6.52 , F293 7.40
S176EC2, A272 6.55 , Y178EC2
G902.57, E122 3.37 , C316IC (C-terminus)
Involvement of long-range interactions in activation of rhodopsin
Edges involving the EC and TM domains
The RT attachment site, K2967.43, to which RT is covalently linked via a Schiff base with the amino group of this lysine, has 15 edges at λ = 38 and the most persistent edge is A1173.32 - K2967.43, the only long-range edge at λ = 280. K2967.43 is also a key determinant for ligand specificity in different GPCRs [6, 15]. The counter-ion for the Schiff base is E1133.28, also a top-ranked GREMLIN edge residue. The imine moiety of the RT Schiff base is surrounded by several amino acids, of which M441.39 and F2937.40 are identified in the edge list. The major event upon light incidence is the isomerization of 11-cis-RT to all-trans-RT, which results in the rotation of the C20 methyl group towards the EC2 loop. This rotation triggers movements of the EC2 loop and rotation of the Schiff base to a more hydrophobic interior. The EC2 loop displacement is one of the molecular switches in rhodopsin activation. Three important residues that are part of this loop, namely S176EC2, Y178EC2 and S186EC2, are identified as top-ranked edge residues here.
Movement of EC2 is coupled to the outward rotation of the EC end of TM5. The shift in the RT β-ionone ring towards M2075.42 on TM5 results in a rearrangement of the hydrogen bonding network between this helix and TM3. Residue H2115.46 interacts with E1223.37 and W1263.41 and these interactions are important for receptor activation to form the Meta II state [26, 27]. Other residues that are important for Meta II stability on TM3 and identified by GREMLIN are E1133.28, G1143.29, A1173.32, G1203.35, E1223.37 and W1263.41.
In addition to the rearrangement of the hydrogen bonding network between TM3 and TM5, RT isomerization in rhodopsin and ligand binding in GPCRs result in two major activation switches, the so-called rotamer toggle switch and the breakage of the ionic lock. The rotamer toggle switch refers to the rotation of W2656.48, a residue which is part of the conserved CWxP motif, causing reorientation of Y2235.58, M2576.40 and Y2686.51 on TM6 [7, 29]. The conserved ionic lock involves the (E/D3.49)R3.50Y3.51 motif, Y2235.58 and E247IC3 at the IC side [30–32]. Note that R1353.50, Y2235.58 and W2656.48 did not appear in our edge lists because highly conserved residues do not vary, and thus cannot co-vary, so GREMLIN does not learn edges to or from such residues (see Methods). For the same reason, residues from the highly conserved NPxxY motif that are involved in the TM6 motions on the IC side are absent from our lists. However, E247IC3 from the ionic lock, which is not highly conserved, is present in our list, forming an edge with A1173.32. Other important residues that are present in our edge list are A2696.52 and A2726.55 on TM6 and A2927.39 on TM7, which contribute to RT binding [5, 20]. In addition, A2696.52 in rhodopsin is usually substituted by F6.52 in other GPCRs and is considered an extension of the conserved aromatic cluster on TM6. F6.52 is thought to act as a ‘ligand sensor’ in concert with the CWxP motif.
Edges involving the IC domain
Persistent edges categorized based on the long-range contacts between different domains
Subset containing RT residues
EC – TM *
EC – RT:
A2726.55 - S176EC2, A2726.55 - Y178EC2, A2927.39 - Y29EC (N-terminus), S186EC2 - P2917.38, E1223.37 - E196EC3, G1143.29 - S176EC2, G1143.29 - Y178EC2
TM – TM 
TM(not RT) – TM(not RT) :
G902.57 - G1203.35
RT – TM :
A1173.32 - G902.57, E1133.28 - M441.39, E1133.28 - W1263.41, H2115.46 - F912.58, E1223.37 - I481.43, E1223.37 - G902.57, E1223.37 - M2075.42, G902.57 - M2075.42, G902.57 - A2696.52, G902.57 - F2937.40
RT – RT :
A1173.32 - F2937.40, A1173.32 - K2967.43, A2726.55 - G1143.29, E1133.28 - F2937.40, E1223.37 - A2696.52, E1223.37 - F2937.40
TM – IC 
TM(not RT) – IC :
G902.57 - Q237IC3
RT – IC :
A1173.32 - E247 IC3, A2726.55 - L72IC1, E1133.28 - L72IC1, H2115.46 - C140 IC2, H2115.46 - F148 IC2, E1223.37 - C316IC (C-terminus), M2075.42 - C316IC (C-terminus)
EC – IC 
RT – IC :
S186EC2 - K67IC1, S186EC2 - Q244IC3
Involvement of long-range interaction residues identified by GREMLIN in ligand binding and function of angiotensin II type I receptor (AT1R)
To validate our findings using a GPCR not used in the present analysis and for which no structure is yet known, we chose the rat angiotensin II type I receptor (AT1R). AT1R is a class A GPCR which plays a vital role in cardiovascular physiology. Unlike rhodopsin, there is no full length structural or extensive biophysical data available for AT1R. However, pharmacological and structure-function properties of this receptor have been well studied by mutagenesis experiments.
Residues in AT1R that are homologous to top-ranking edge-forming residues in rhodopsin
Experimental and computational docking studies suggest that the AT1R agonist angiotensin II (AngII) and the antagonist losartan bind in the site homologous to the RT binding site [44, 45], hinting that many of the residues in the top-ranking edge list may play a role in ligand binding in AT1R. Interestingly, the first step in AngII binding is thought to be the insertion of the C-terminus of the peptide into the receptor, followed by the interaction of N-terminal residues of the peptide with the EC domain and the TM ends on the EC face. AngII binding is thought to extend from the EC face of the protein to the homologous RT binding site buried in the TM domain, similar to the peptide-bound chemokine receptor structure. The carboxylate group on the C-terminus of AngII forms a salt bridge with K1995.42 on TM5 [47–49]. In addition, K1995.42 is involved in insurmountable antagonism with carboxylate-containing ligands. Like K1995.42, Q2576.52 has been shown to be involved in insurmountable antagonism. The C-terminal residue of AngII (F8) makes critical stacking interactions with the minimal binding pocket residue H2566.51, and the aromaticity of F8 and H2566.51 is important for receptor activation [49, 52]. An N1113.35G mutation on TM3 results in constitutive activation of AT1R. This constitutive activation is attributed to steric effects involving Y2927.43 on TM7. N111 is also required for discriminating AT1R-specific ligands. Other residues, such as V179EC2 in the EC loop, are also important for AngII binding. Residues such as Y2927.43 in TM7, and N235IC3 and Y312IC(C-terminus) on the IC face, are critical for G protein coupling and second messenger generation in cells [57–59]. Thus, AT1R residues found to be important in mutagenesis experiments represent the bulk of the edges identified by GREMLIN, validating the applicability of our approach to other GPCRs, including those for which structural information is lacking.
Comparison of results from GREMLIN with SCA and GMRC
Comparison of edges reported in SCA and GMRC studies with GREMLIN
GREMLIN | Residues involved in edges with K296 (at λ = 38) | M44, L72, N73, G90, T93, G114, A117, G121, W175, Y178, C185, D190, S202, H211, A269, P291, A292, F293
SCA | Residues that are statistically coupled to K296 perturbation | I54, T58, N73, N78, F91, T92, T93, E113, A117, G121, E122, I123, L125, V129, E134, Y136, F148, A164, F212, I213, I219, M257, F261, W265, Y268, F293, F294, A295, S298, A299, N302, F313, M317
GMRC | Statistically coupled residues in amine + peptide + rhodopsin model | L57 – A82, F313 – R314, I305 – Y306, N302 – I304, C264 – A299
Note: None of the above residues have any edges in GREMLIN (at λ = 38)
In the SCA study, the residues statistically coupled to K2967.43 were classified further into three classes: (1) Immediate neighbours: F2937.40, L2947.41, A2957.42, A2997.46, F912.58, E1133.28; (2) Linked network: F2616.44, W2656.48, Y2686.51, F2125.47; and (3) Sparse but contiguous network: G1213.36, I1233.38, L1253.40, I2195.54, F2616.44, S2987.45, A2997.46, N3027.49. These categories were formulated by mapping the residues onto the rhodopsin structure. Residues in the immediate neighbour category are in the vicinity of K2967.43 and are mainly involved in helix packing interactions, except for E1133.28. E1133.28 forms a salt bridge with the protonated Schiff base on K2967.43, an important interaction identified by SCA. In the GREMLIN model, E1133.28 and K2967.43 are not connected by an edge, but they do share three common neighbours (M44, L72 and F293) and are thus indirectly correlated. The linked network residues in SCA are parallel to the membrane and form an aromatic cluster around the β-ionone ring of RT in rhodopsin. The residues in the sparse but contiguous network are distant from K2967.43 and form helix packing interactions toward the IC side. The SCA study identified several critical residues, most importantly W2656.48, which is part of the CWxP motif, and N3027.49, which is part of the NPxxY motif. The SCA method performs a perturbation on a particular amino acid only if the corresponding sub-alignment size is beyond a certain cutoff in order to calculate ΔΔGstat values. GREMLIN, on the other hand, makes no such distinction. Hence it is possible that SCA detects edges even if a position is fairly conserved, whereas GREMLIN ignores them. This could be a source of difference between GREMLIN and SCA edge couplings.
Overall, compared to SCA and GMRC, GREMLIN seems to identify couplings that are more extensive (i.e., involving EC, TM, RT and IC) and are part of experimentally determined functional switches and structural microdomains that are critical for activation, as discussed above.
Limitations of the GREMLIN approach
GREMLIN is subject to the same kinds of limitations that all MSA-based analyses face. We briefly discuss these limitations here so that readers can better understand the nature of the results of our study.
GREMLIN is very sensitive to the size and contents of the MSA. A small and/or poorly constructed MSA may result in subpar models. However, GREMLIN does attempt to deal with small MSAs (i.e., those with relatively few sequences) through regularization. As described in the Methods section, GREMLIN selects a value for the regularization parameter, λ, via a permutation of the columns of the MSA. Specifically, it selects a λ value that minimizes the expected number of false positive edges. It does this at the expense of an increase in the number of false negative edges. The value of λ is expected to be roughly inversely proportional to the number of sequences in the MSA. Likewise, the number of edges in the resulting model will be roughly inversely proportional to λ. Small MSAs will therefore inherently produce sparse graphs, from which many true edges are probably “missing” because they lack strong statistical support. Users must therefore consider the size of the MSA when interpreting the set of edges in the graph returned by GREMLIN.
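The permutation step mentioned above can be illustrated as follows. This sketch assumes that each column is shuffled independently, which preserves per-column residue frequencies while destroying couplings between columns, and that λ is then chosen as the smallest candidate value for which no edges are learned from any permuted alignment. The selection rule and the `count_edges` callable are assumptions introduced for illustration; `count_edges` stands in for a structure learner and is not part of any published GREMLIN interface.

```python
import random


def permute_columns(msa, rng):
    """Shuffle each alignment column independently.

    Per-column residue frequencies are preserved, but any coupling between
    columns is destroyed, so edges learned from the result are spurious.
    """
    columns = [list(col) for col in zip(*msa)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]


def select_lambda(msa, lambdas, count_edges, n_perm=10, seed=0):
    """Smallest lambda giving zero edges on every permuted alignment.

    `count_edges(msa, lam)` is a hypothetical callable returning the number
    of edges learned at penalty `lam`. Returns None if no candidate works.
    """
    rng = random.Random(seed)
    permuted = [permute_columns(msa, rng) for _ in range(n_perm)]
    for lam in sorted(lambdas):
        if all(count_edges(p, lam) == 0 for p in permuted):
            return lam
    return None


# Toy alignment (hypothetical sequences).
msa = ["SAKHLGE", "SAKHIGD", "FARWLGE", "FARWIGD"]
shuffled = permute_columns(msa, random.Random(1))
```

Because only the order within each column changes, conservation statistics computed from `shuffled` are identical to those of `msa`; any remaining column-to-column correlation is due to chance.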
In addition to the size of the MSA, the contents of the MSA are also important, especially if the MSA contains functionally heterogeneous sequences (as is the case in our study). In particular, weak signals in the MSA (e.g., due to sampling imbalances between different functional groups) are very likely to be missed. This is especially true for GREMLIN since it is biased towards minimizing false positive edges. Conversely, the GREMLIN algorithm will learn the conservation and correlation statistics for two (or more) divergent subclasses, provided that they are well represented in the MSA. Additionally, some of the edges learned by GREMLIN are due to correlations that distinguish functionally divergent sequences, while others are due to other constraints (e.g., conservation of charge). GREMLIN cannot distinguish between these two kinds of couplings. Naturally, one may attempt to compare the set of edges learned from functionally homogeneous MSAs to those learned from heterogeneous MSAs, but differences in the sizes of the MSAs can make it difficult to compare models, as discussed above. Addressing this limitation is one of our goals as part of on-going research.
As noted by an anonymous reviewer, GREMLIN is not well-suited to learning couplings between one residue and a cluster of functionally redundant residues (e.g., a cluster of Glu residues, any one of which could form a salt bridge with a nearby Lys), unless the MSA contains examples of each possible clustering. Thus, care must be taken if the MSA contains such clusters.
Finally, the results presented in this paper are limited to GPCRs where the binding pocket is at or near the corresponding binding pocket of rhodopsin. Our MSA did not contain a significant number of GPCRs with binding pockets substantially different from that of rhodopsin, such as A2A.
In this study we demonstrated the use of GREMLIN to identify a network of statistically correlated and functionally important residues in class A GPCRs. Based on sequence only, GREMLIN identified that ligand binding pocket residues are extensively correlated with distal residues, compared to those that are not part of the ligand pocket. An analysis of the GREMLIN edges across multiple structures suggests that there is a minimal binding pocket common to the seven known GPCRs. Statistically significant long-range couplings identified here were previously identified experimentally to be critical for activation of rhodopsin. Further, the activation of rhodopsin involves these long-range interactions between EC and IC residues mediated by RT. Compared to the previously applied methods SCA and GMRC, GREMLIN identifies edges that span the entire protein and are functionally important. Based on our findings here with the GPCR family and our earlier studies with several soluble protein families, GREMLIN can be used to identify functionally important residue couplings in both soluble and membrane proteins. Future work can include validating the functional importance of novel residues and couplings identified by GREMLIN using molecular modelling tools such as GOBLIN, or via molecular dynamics simulations and ultimately wet-lab experiments.
Here, Z is the normalization constant, and V and E are the nodes and edges in the MRF, respectively. We note that MRFs are generative and can thus be used to sample new sequences (as in protein design).
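For orientation, a pairwise MRF over the columns of an alignment is conventionally written in the factorized form below. Only Z, V and E are named in the text; the variable names X_s and the potential names φ_s (node potentials) and ψ_st (edge potentials) are notational assumptions made here for illustration.

```latex
P(X_1, \dots, X_n) \;=\; \frac{1}{Z} \prod_{s \in V} \phi_s(X_s) \prod_{(s,t) \in E} \psi_{st}(X_s, X_t)
```

In this form, the absence of an edge (s, t) from E means that no pairwise term couples columns s and t directly, which is how conditional independence is encoded in the graph.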
Figure 5 shows a toy example of the relationship between the input MSA and the MRF that GREMLIN learns. Here, a 7-column MSA is shown. Column 2 is completely conserved, and is therefore statistically independent of the remaining columns. This independence is encoded in the MRF by the absence of an edge to the variable corresponding to the second column. On the other hand, columns 1 and 4 co-vary such that whenever there is an ‘S’ in column 1, there is an ‘H’ in column 4, and whenever there is an ‘F’ in column 1, there is a ‘W’ in column 4. This coupling is represented in the MRF by an edge between the variables corresponding to columns 1 and 4. In this paper, we examine the topology of the learned MRF to gain insights into the network of correlated mutations. Specifically, we are most interested in correlations that are observed between spatially distant residues from different domains of GPCRs.
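The contrast between a conserved column and a pair of co-varying columns can be checked numerically on a small made-up alignment. The sketch below uses pairwise mutual information as a simple stand-in for statistical coupling; it is not the MRF learning procedure used by GREMLIN, and the sequences are hypothetical ones built to mirror the description above.

```python
from collections import Counter
from math import log

# Toy 7-column alignment (hypothetical). Column 2 is fully conserved;
# columns 1 and 4 co-vary (S pairs with H, F pairs with W).
msa = [
    "SAKHLGE",
    "SARHIGE",
    "SAKHIAD",
    "FARWLGD",
    "FAKWLAE",
    "FARWIAD",
]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between 0-based columns i and j."""
    n = len(msa)
    pi = Counter(seq[i] for seq in msa)
    pj = Counter(seq[j] for seq in msa)
    pij = Counter((seq[i], seq[j]) for seq in msa)
    return sum(
        (c / n) * log((c / n) / ((pi[a] / n) * (pj[b] / n)))
        for (a, b), c in pij.items()
    )


mi_1_4 = mutual_information(msa, 0, 3)  # columns 1 and 4 (co-varying)
mi_2_4 = mutual_information(msa, 1, 3)  # conserved column 2 vs column 4
```

For this alignment `mi_1_4` equals ln 2 (about 0.693 nats), the maximum possible for two equally frequent states, while `mi_2_4` is zero because a fully conserved column carries no information about any other column. Unlike an MRF, mutual information does not separate direct from indirect correlations.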
Multiple sequence alignment (MSA) of class A GPCRs
GPCR structure files
As of January 2011, there were a total of 43 structures representing seven different GPCRs deposited in the PDB (Table 1). Only class A GPCRs have been crystallized so far. The GPCRs for which structural information is available are bovine rhodopsin (BR; 18 structures including opsin), squid rhodopsin (SR; 2 structures), turkey β1 adrenergic receptor (β1AR; 6 structures), human β2 adrenergic receptor (β2AR; 10 structures), human A2A adenosine receptor (A2A; 1 structure), human chemokine receptor CXCR4 (5 structures) and human dopamine D3 receptor (D3R; 1 structure).
Residue numbering scheme
The amino acids of the bovine rhodopsin sequence were used as position references (NCBI Reference Sequence: NP_001014890.1). The positions of amino acids are represented by the single-letter amino acid code followed by the sequence number in rhodopsin. To allow easier comparison with other GPCRs, the generic numbering proposed by Ballesteros and Weinstein is given in superscript.
Description of ligand binding pockets in GPCR structures
The residues in the ligand pocket of the different GPCR crystal structures available to date were defined as those which have at least one atom within 5 Å of the respective ligand. Python scripts were written to extract residues within a ligand binding pocket using this cut-off distance from crystal structures.
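The scripts themselves are not reproduced here. A minimal sketch of the 5 Å criterion, assuming standard fixed-column PDB `ATOM`/`HETATM` records, could look as follows. The function names, the use of `<=` at the cut-off, and the choice to identify the ligand by its residue name (for example `RET` for retinal) are assumptions made for illustration, not the authors' implementation.

```python
from math import dist


def parse_atoms(pdb_lines):
    """Yield (record, resname, chain, resseq, xyz) from ATOM/HETATM lines.

    Assumes the standard fixed-column PDB layout.
    """
    for line in pdb_lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield record, line[17:20].strip(), line[21], int(line[22:26]), xyz


def pocket_residues(pdb_lines, ligand_resname, cutoff=5.0):
    """Return sorted (chain, resseq, resname) tuples for protein residues
    with at least one atom within `cutoff` angstroms of any ligand atom."""
    atoms = list(parse_atoms(pdb_lines))
    ligand = [xyz for _, resname, _, _, xyz in atoms if resname == ligand_resname]
    pocket = set()
    for record, resname, chain, resseq, xyz in atoms:
        if record != "ATOM":
            continue  # skip ligand, waters and other hetero atoms
        if any(dist(xyz, lig_xyz) <= cutoff for lig_xyz in ligand):
            pocket.add((chain, resseq, resname))
    return sorted(pocket)


# Hypothetical usage with a locally downloaded structure file:
# with open("1F88.pdb") as handle:
#     print(pocket_residues(handle, "RET"))
```

Restricting the search to `ATOM` records keeps waters and other hetero groups out of the pocket definition; whether the original scripts made the same choice is not stated in the text.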
We mapped the ligand binding pockets of the different GPCRs onto bovine rhodopsin for comparison. Pair-wise sequence/structure based alignments between rhodopsin (PDB ID: 1U19) and other GPCR structures were generated using the ‘salign’ module in the MODELLER software. All ligand binding pockets discussed in this paper are mapped onto the structure of bovine rhodopsin.
In addition to comparing ligand binding pockets directly (e.g. extracting residues within 5 Å of the ligand in PDB ID: 1F88 to identify the RT ligand binding pocket of rhodopsin), we also generated the following combined sets of pocket residues to investigate similarities and differences between ligand binding pockets of different GPCRs (Table 1). For each of the 7 GPCRs, we defined a common ligand binding pocket by combining the ligand binding pockets from all available crystal structures for the respective receptor (Table 3). Thus, for bovine rhodopsin, the common ligand pocket is the combination of the RT binding pockets of 12 different structures. [Note: The rhodopsin PDB entries 1JFP and 1LN6 were excluded because they represent structure models from NMR structures of protein fragments. 2I36, 2I37, 3CAP and 3DQB were also excluded because they are opsin structures and contain no RT.] In analogous fashion, common pockets were created for squid rhodopsin (SR), turkey β1 adrenergic receptor (β1AR), human β2 adrenergic receptor (β2AR), human A2A adenosine receptor (A2A), human chemokine receptor CXCR4 and human dopamine D3 receptor (D3R).
Finally, to generalize across different GPCRs, we derived additional ligand pockets B1, B2, B3, B4, B5, B6 and B7 representing common sets of residues present in at least one, two, three, four, five, six and seven receptor ligand binding pockets, respectively. These combined ligand binding pockets are listed in Table 4.
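The construction of the B1 to B7 pockets amounts to counting, for each rhodopsin-mapped position, the number of receptors whose common pocket contains it. The sketch below illustrates this; the function name `shared_pockets` and the example positions are hypothetical and are not the pockets listed in Table 4.

```python
from collections import Counter


def shared_pockets(pockets):
    """Given {receptor: iterable of rhodopsin-mapped positions}, return
    {"Bk": sorted positions present in at least k receptor pockets}."""
    counts = Counter(
        pos for residues in pockets.values() for pos in set(residues)
    )
    n_receptors = len(pockets)
    return {
        f"B{k}": sorted(p for p, c in counts.items() if c >= k)
        for k in range(1, n_receptors + 1)
    }


# Hypothetical pockets for three receptors (positions are illustrative only).
example = {
    "receptor_1": [113, 117, 118, 207, 268],
    "receptor_2": [117, 118, 207, 268, 292],
    "receptor_3": [118, 207, 265, 268],
}
for name, positions in shared_pockets(example).items():
    print(name, positions)
```

By construction the sets are nested, so B7 is contained in B6, which is contained in B5, and so on down to B1, the union of all pockets.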
Definition of long-range interactions
A long-range interaction is defined as a statistical coupling between two amino acids that are separated by at least 8 amino acids in the sequence (a definition used in CASP ).
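As a small illustration, the definition can be applied as a filter on a list of edges. The sketch assumes that "separated by at least 8 amino acids" means a difference of at least 8 in sequence number; the text does not say whether the count is of positions or of intervening residues, so the threshold is left as a parameter. The example edges are hypothetical.

```python
def is_long_range(i, j, min_separation=8):
    """True if residues i and j are at least `min_separation` apart in sequence."""
    return abs(i - j) >= min_separation


def long_range_edges(edges, min_separation=8):
    """Keep only the edges that satisfy the long-range definition."""
    return [(i, j) for i, j in edges if is_long_range(i, j, min_separation)]


# Hypothetical edges given as pairs of rhodopsin sequence numbers.
edges = [(113, 117), (90, 292), (207, 268), (268, 272)]
print(long_range_edges(edges))
```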
Control dataset and statistical significance tests
The robust edges derived by GREMLIN were checked for statistically over- or under-represented patterns among the observed couplings. These tests were not done to validate the efficacy of GREMLIN in terms of modeling the protein family, but to gain structural and biological insights into the nature of the couplings that the model learns. For this purpose, we compared the edges that GREMLIN returns against a control distribution of edges. The control distribution is created by drawing edges from a random graph. We classified the edges into one of the following categories: EC-EC, EC-IC, EC-TM, EC-RT, IC-IC, IC-RT, IC-TM, TM-TM, RT-TM and RT-RT. Here, RT stands for the ligand binding domain in rhodopsin (PDB ID: 1F88). To define the control distribution, we enumerated all possible edges coupling any two amino acids in rhodopsin (PDB ID: 1U19) and assigned these edges to the previously defined categories. We defined the control probability of a category as the probability of randomly picking an edge in that category from the control dataset. To check for statistical significance, we enumerated the edges returned by GREMLIN in each category and compared the fraction of edges in this category against the control distribution. A p-value was calculated by a one-sided binomial test for statistical significance of GREMLIN categories against categories of the control distribution.
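The procedure can be sketched as follows, assuming each residue has been assigned to exactly one of the domains EC, IC, TM or RT. The function names and the domain assignment in the usage example are hypothetical, and the sketch tests only for over-representation (the upper tail); the under-representation test would use the lower tail instead.

```python
from collections import Counter
from itertools import combinations
from math import comb


def category(domain_a, domain_b):
    """Unordered edge category, e.g. ('TM', 'RT') -> 'RT-TM'."""
    return "-".join(sorted((domain_a, domain_b)))


def control_distribution(domains):
    """Probability of each category among all possible residue pairs.

    `domains` maps residue number -> domain label.
    """
    counts = Counter(
        category(domains[a], domains[b])
        for a, b in combinations(sorted(domains), 2)
    )
    total = sum(counts.values())
    return {cat: c / total for cat, c in counts.items()}


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly.

    Adequate for a few hundred edges; for very large n a statistics
    library would be the safer choice.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enrichment_p_values(edges, domains):
    """One-sided p-value per category that the observed edges are enriched
    relative to the control distribution."""
    control = control_distribution(domains)
    observed = Counter(category(domains[a], domains[b]) for a, b in edges)
    n = len(edges)
    return {
        cat: binomial_upper_tail(observed[cat], n, p)
        for cat, p in control.items()
    }


# Hypothetical six-residue protein and three predicted edges.
toy_domains = {1: "EC", 2: "EC", 3: "RT", 4: "RT", 5: "IC", 6: "IC"}
toy_edges = [(1, 3), (2, 4), (3, 5)]
print(enrichment_p_values(toy_edges, toy_domains))
```

Sorting the two domain labels makes the category independent of the order in which an edge lists its residues, and it happens to reproduce the category names used above (for example EC-RT and RT-TM).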
GREMLIN: Generative REgularized ModeLs of proteINs
GPCR: G protein coupled receptor
SCA: Statistical Coupling Analysis
GMRC: Graphical Models for Residue Coupling
HMM: Hidden Markov Model
MRF: Markov Random Field
MSA: Multiple sequence alignment
PDB: Protein Data Bank.
This study was supported by NSF grants 1144281 and IIS-0905193, and NIH grant R01 LM007994-07.
- Balakrishnan S, Kamisetty H, Carbonell JG, Lee SI, Langmead CJ: Learning generative models for protein fold families. Proteins. 2011, 79: 1061-1078. 10.1002/prot.22934.
- Takeda S, Kadowaki S, Haga T, Takaesu H, Mitaku S: Identification of G protein-coupled receptor genes from the human genome sequence. FEBS letters. 2002, 520: 97-101. 10.1016/S0014-5793(02)02775-8.
- Overington JP, Al-Lazikani B, Hopkins AL: How many drug targets are there?. Nat Rev Drug Discov. 2006, 5: 993-996. 10.1038/nrd2199.
- Fredriksson R, Lagerstrom MC, Lundin LG, Schioth HB: The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol Pharmacol. 2003, 63: 1256-1272.
- Palczewski K, Kumasaka T, Hori T, Behnke CA, Motoshima H, Fox BA, Le Trong I, Teller DC, Okada T, Stenkamp RE, et al: Crystal structure of rhodopsin: A G protein-coupled receptor. Science. 2000, 289: 739-745. 10.1126/science.289.5480.739.
- Ballesteros JA, Shi L, Javitch JA: Structural mimicry in G protein-coupled receptors: implications of the high-resolution structure of rhodopsin for structure-function analysis of rhodopsin-like receptors. Mol Pharmacol. 2001, 60: 1-19.
- Ahuja S, Smith SO: Multiple switches in G protein-coupled receptor activation. Trends Pharmacol Sci. 2009, 30: 494-502. 10.1016/j.tips.2009.06.003.
- Rader AJ, Anderson G, Isin B, Khorana HG, Bahar I, Klein-Seetharaman J: Identification of core amino acids stabilizing rhodopsin. Proceedings of the National Academy of Sciences of the United States of America. 2004, 101: 7246-7251. 10.1073/pnas.0401429101.
- Klein-Seetharaman J: Dual role of interactions between membranous and soluble portions of helical membrane receptors for folding and signaling. Trends Pharmacol Sci. 2005, 26: 183-189. 10.1016/j.tips.2005.02.009.
- Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, et al: The Pfam protein families database. Nucleic Acids Res. 2010, 38: D211-222. 10.1093/nar/gkp985.
- Lockless SW, Ranganathan R: Evolutionarily conserved pathways of energetic connectivity in protein families. Science. 1999, 286: 295-299. 10.1126/science.286.5438.295.
- Suel GM, Lockless SW, Wall MA, Ranganathan R: Evolutionarily conserved networks of residues mediate allosteric communication in proteins. Nature structural biology. 2003, 10: 59-69. 10.1038/nsb881.
- Dekker JP, Fodor A, Aldrich RW, Yellen G: A perturbation-based method for calculating explicit likelihood of evolutionary co-variance in multiple sequence alignments. Bioinformatics (Oxford, England). 2004, 20: 1565-1572. 10.1093/bioinformatics/bth128.
- Thomas J, Ramakrishnan N, Bailey-Kellogg C: Graphical models of residue coupling in protein families. IEEE/ACM Trans Comput Biol Bioinform. 2008, 5: 183-197.
- Gether U: Uncovering molecular mechanisms involved in activation of G protein-coupled receptors. Endocrine reviews. 2000, 21: 90-113. 10.1210/er.21.1.90.
- Hwa J, Garriga P, Liu X, Khorana HG: Structure and function in rhodopsin: packing of the helices in the transmembrane domain and folding to a tertiary structure in the intradiscal domain are coupled. Proceedings of the National Academy of Sciences of the United States of America. 1997, 94: 10571-10576. 10.1073/pnas.94.20.10571.
- Altenbach C, Yang K, Farrens DL, Farahbakhsh ZT, Khorana HG, Hubbell WL: Structural features and light-dependent changes in the cytoplasmic interhelical E-F loop region of rhodopsin: a site-directed spin-labeling study. Biochemistry. 1996, 35: 12470-12478. 10.1021/bi960849l.
- Farrens DL, Altenbach C, Yang K, Hubbell WL, Khorana HG: Requirement of rigid-body motion of transmembrane helices for light activation of rhodopsin. Science. 1996, 274: 768-770. 10.1126/science.274.5288.768.
- Sakmar TP, Menon ST, Marin EP, Awad ES: Rhodopsin: insights from recent structural studies. Annu Rev Biophys Biomol Struct. 2002, 31: 443-484. 10.1146/annurev.biophys.31.082901.134348.
- Okada T, Sugihara M, Bondar AN, Elstner M, Entel P, Buss V: The retinal conformation and its environment in rhodopsin in light of a new 2.2 A crystal structure. J Mol Biol. 2004, 342: 571-583. 10.1016/j.jmb.2004.07.044.
- Ahuja S, Hornak V, Yan EC, Syrett N, Goncalves JA, Hirshfeld A, Ziliox M, Sakmar TP, Sheves M, Reeves PJ, et al: Helix movement is coupled to displacement of the second extracellular loop in rhodopsin activation. Nat Struct Mol Biol. 2009, 16: 168-175. 10.1038/nsmb.1549.
- Yan ECY, Epps J, Lewis JW, Szundi I, Bhagat A, Sakmar TP, Kliger DS: Photointermediates of the Rhodopsin S186A Mutant as a Probe of the Hydrogen-Bond Network in the Chromophore Pocket and the Mechanism of Counterion Switch. The Journal of Physical Chemistry C. 2007, 111: 8843-8848. 10.1021/jp067172o.
- Rao VR, Cohen GB, Oprian DD: Rhodopsin mutation G90D and a molecular mechanism for congenital night blindness. Nature. 1994, 367: 639-642. 10.1038/367639a0.
- Sakmar TP, Franke RR, Khorana HG: Glutamic acid-113 serves as the retinylidene Schiff base counterion in bovine rhodopsin. Proceedings of the National Academy of Sciences of the United States of America. 1989, 86: 8309-8313. 10.1073/pnas.86.21.8309.
- Nakamichi H, Okada T: Crystallographic analysis of primary visual photochemistry. Angew Chem Int Ed Engl. 2006, 45: 4270-4273. 10.1002/anie.200600595.
- Lewis JW, Szundi I, Kazmi MA, Sakmar TP, Kliger DS: Proton movement and photointermediate kinetics in rhodopsin mutants. Biochemistry. 2006, 45: 5430-5439. 10.1021/bi0525775.
- Lin SW, Sakmar TP: Specific tryptophan UV-absorbance changes are probes of the transition of rhodopsin to its active state. Biochemistry. 1996, 35: 11149-11159. 10.1021/bi960858u.
- Shi L, Liapakis G, Xu R, Guarnieri F, Ballesteros JA, Javitch JA: Beta2 adrenergic receptor activation. Modulation of the proline kink in transmembrane 6 by a rotamer toggle switch. J Biol Chem. 2002, 277: 40989-40996. 10.1074/jbc.M206801200.
- Patel AB, Crocker E, Reeves PJ, Getmanova EV, Eilers M, Khorana HG, Smith SO: Changes in interhelical hydrogen bonding upon rhodopsin activation. J Mol Biol. 2005, 347: 803-812. 10.1016/j.jmb.2005.01.069.
- Scheerer P, Park JH, Hildebrand PW, Kim YJ, Krauss N, Choe HW, Hofmann KP, Ernst OP: Crystal structure of opsin in its G-protein-interacting conformation. Nature. 2008, 455: 497-502. 10.1038/nature07330.
- Park JH, Scheerer P, Hofmann KP, Choe HW, Ernst OP: Crystal structure of the ligand-free G-protein-coupled receptor opsin. Nature. 2008, 454: 183-187. 10.1038/nature07063.
- Ballesteros JA, Jensen AD, Liapakis G, Rasmussen SG, Shi L, Gether U, Javitch JA: Activation of the beta 2-adrenergic receptor involves disruption of an ionic lock between the cytoplasmic ends of transmembrane segments 3 and 6. J Biol Chem. 2001, 276: 29171-29177. 10.1074/jbc.M103747200.
- Fritze O, Filipek S, Kuksa V, Palczewski K, Hofmann KP, Ernst OP: Role of the conserved NPxxY(x)5,6 F motif in the rhodopsin ground state and during activation. Proceedings of the National Academy of Sciences of the United States of America. 2003, 100: 2290-2295. 10.1073/pnas.0435715100.
- Weinstein H: Hallucinogen actions on 5-HT receptors reveal distinct mechanisms of activation and signaling by G protein-coupled receptors. AAPS J. 2005, 7: E871-884. 10.1208/aapsj070485.
- Cai K, Klein-Seetharaman J, Farrens D, Zhang C, Altenbach C, Hubbell WL, Khorana HG: Single-cysteine substitution mutants at amino acid positions 306–321 in rhodopsin, the sequence between the cytoplasmic end of helix VII and the palmitoylation sites: sulfhydryl reactivity and transducin activation reveal a tertiary structure. Biochemistry. 1999, 38: 7925-7930. 10.1021/bi9900119.
- Cai K, Klein-Seetharaman J, Hwa J, Hubbell WL, Khorana HG: Structure and function in rhodopsin: effects of disulfide cross-links in the cytoplasmic face of rhodopsin on transducin activation and phosphorylation by rhodopsin kinase. Biochemistry. 1999, 38: 12893-12898. 10.1021/bi9912443.
- Klein-Seetharaman J, Hwa J, Cai K, Altenbach C, Hubbell WL, Khorana HG: Single-cysteine substitution mutants at amino acid positions 55–75, the sequence connecting the cytoplasmic ends of helices I and II in rhodopsin: reactivity of the sulfhydryl groups and their derivatives identifies a tertiary structure that changes upon light-activation. Biochemistry. 1999, 38: 7938-7944. 10.1021/bi990013t.
- Altenbach C, Cai K, Klein-Seetharaman J, Khorana HG, Hubbell WL: Structure and function in rhodopsin: mapping light-dependent changes in distance between residue 65 in helix TM1 and residues in the sequence 306–319 at the cytoplasmic end of helix TM7 and in helix H8. Biochemistry. 2001, 40: 15483-15492. 10.1021/bi011546g.
- Altenbach C, Klein-Seetharaman J, Cai K, Khorana HG, Hubbell WL: Structure and function in rhodopsin: mapping light-dependent changes in distance between residue 316 in helix 8 and residues in the sequence 60–75, covering the cytoplasmic end of helices TM1 and TM2 and their connection loop CL1. Biochemistry. 2001, 40: 15493-15500. 10.1021/bi011545o.
- Altenbach C, Cai K, Khorana HG, Hubbell WL: Structural features and light-dependent changes in the sequence 306–322 extending from helix VII to the palmitoylation sites in rhodopsin: a site-directed spin-labeling study. Biochemistry. 1999, 38: 7931-7937. 10.1021/bi9900121.
- Ridge KD, Zhang C, Khorana HG: Mapping of the amino acids in the cytoplasmic loop connecting helices C and D in rhodopsin. Chemical reactivity in the dark state following single cysteine replacements. Biochemistry. 1995, 34: 8804-8811. 10.1021/bi00027a032.
- Yang K, Farrens DL, Hubbell WL, Khorana HG: Structure and function in rhodopsin. Single cysteine substitution mutants in the cytoplasmic interhelical E-F loop region show position-specific effects in transducin activation. Biochemistry. 1996, 35: 12464-12469.
- Farahbakhsh ZT, Ridge KD, Khorana HG, Hubbell WL: Mapping light-dependent structural changes in the cytoplasmic loop connecting helices C and D in rhodopsin: a site-directed spin labeling study. Biochemistry. 1995, 34: 8812-8819. 10.1021/bi00027a033.
- Oliveira L, Costa-Neto CM, Nakaie CR, Schreier S, Shimuta SI, Paiva AC: The angiotensin II AT1 receptor structure-activity correlations in the light of rhodopsin structure. Physiological reviews. 2007, 87: 565-592. 10.1152/physrev.00040.2005.
- Baleanu-Gogonea C, Karnik S: Model of the whole rat AT1 receptor and the ligand-binding site. Journal of molecular modeling. 2006, 12: 325-337. 10.1007/s00894-005-0049-z.
- Wu B, Chien EY, Mol CD, Fenalti G, Liu W, Katritch V, Abagyan R, Brooun A, Wells P, Bi FC, et al: Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science. 2010, 330: 1066-1071. 10.1126/science.1194396.
- Noda K, Saad Y, Kinoshita A, Boyle TP, Graham RM, Husain A, Karnik SS: Tetrazole and carboxylate groups of angiotensin receptor antagonists bind to the same subsite by different mechanisms. The Journal of biological chemistry. 1995, 270: 2284-2289. 10.1074/jbc.270.5.2284.
- Yamano Y, Ohyama K, Chaki S, Guo DF, Inagami T: Identification of amino acid residues of rat angiotensin II receptor for ligand binding by site directed mutagenesis. Biochemical and biophysical research communications. 1992, 187: 1426-1431. 10.1016/0006-291X(92)90461-S.
- Noda K, Saad Y, Karnik SS: Interaction of Phe8 of angiotensin II with Lys199 and His256 of AT1 receptor in agonist activation. The Journal of biological chemistry. 1995, 270: 28511-28514. 10.1074/jbc.270.48.28511.
- Fierens FL, Vanderheyden PM, Gaborik Z, Minh TL, Backer JP, Hunyady L, Ijzerman A, Vauquelin G: Lys(199) mutation of the human angiotensin type 1 receptor differentially affects the binding of surmountable and insurmountable non-peptide antagonists. Journal of the renin-angiotensin-aldosterone system : JRAAS. 2000, 1: 283-288. 10.3317/jraas.2000.044.
- Takezako T, Gogonea C, Saad Y, Noda K, Karnik SS: "Network leaning" as a mechanism of insurmountable antagonism of the angiotensin II type 1 receptor by non-peptide antagonists. The Journal of biological chemistry. 2004, 279: 15248-15257. 10.1074/jbc.M312728200.
- Miura S, Feng YH, Husain A, Karnik SS: Role of aromaticity of agonist switches of angiotensin II in the activation of the AT1 receptor. The Journal of biological chemistry. 1999, 274: 7103-7110. 10.1074/jbc.274.11.7103.
- Noda K, Feng YH, Liu XP, Saad Y, Husain A, Karnik SS: The active state of the AT1 angiotensin receptor is generated by angiotensin II induction. Biochemistry. 1996, 35: 16435-16442. 10.1021/bi961593m.
- Feng YH, Miura S, Husain A, Karnik SS: Mechanism of constitutive activation of the AT1 receptor: influence of the size of the agonist switch binding residue Asn(111). Biochemistry. 1998, 37: 15791-15798. 10.1021/bi980863t.
- Monnot C, Bihoreau C, Conchon S, Curnow KM, Corvol P, Clauser E: Polar residues in the transmembrane domains of the type 1 angiotensin II receptor are required for binding and coupling. Reconstitution of the binding site by co-expression of two deficient mutants. The Journal of biological chemistry. 1996, 271: 1507-1513.
- Hjorth SA, Schambye HT, Greenlee WJ, Schwartz TW: Identification of peptide binding residues in the extracellular domains of the AT1 receptor. The Journal of biological chemistry. 1994, 269: 30953-30959.
- Zhang M, Zhao X, Chen HC, Catt KJ, Hunyady L: Activation of the AT1 angiotensin receptor is dependent on adjacent apolar residues in the carboxyl terminus of the third cytoplasmic loop. The Journal of biological chemistry. 2000, 275: 15782-15788. 10.1074/jbc.M000198200.
- Sano T, Ohyama K, Yamano Y, Nakagomi Y, Nakazawa S, Kikyo M, Shirai H, Blank JS, Exton JH, Inagami T: A domain for G protein coupling in carboxyl-terminal tail of rat angiotensin II receptor type 1A. The Journal of biological chemistry. 1997, 272: 23631-23636. 10.1074/jbc.272.38.23631.
- Marie J, Maigret B, Joseph MP, Larguier R, Nouet S, Lombard C, Bonnafous JC: Tyr292 in the seventh transmembrane domain of the AT1A angiotensin II receptor is essential for its coupling to phospholipase C. The Journal of biological chemistry. 1994, 269: 20815-20818.
- Kamisetty H, Ramanathan A, Bailey-Kellogg C, Langmead CJ: Accounting for conformational entropy in predicting binding free energies of protein-protein interactions. Proteins. 2011, 79: 444-462. 10.1002/prot.22894.
- Horn F, Weare J, Beukers MW, Horsch S, Bairoch A, Chen W, Edvardsen O, Campagne F, Vriend G: GPCRDB: an information system for G protein-coupled receptors. Nucleic Acids Res. 1998, 26: 275-279. 10.1093/nar/26.1.275.
- Beukers MW, Kristiansen I, IJzerman AP, Edvardsen I: TinyGRAP database: a bioinformatics tool to mine G-protein-coupled receptor mutant data. Trends Pharmacol Sci. 1999, 20: 475-477. 10.1016/S0165-6147(99)01403-0.
- Kamisetty H: Structured Probabilistic Models of Proteins across Spatial and Fitness Landscape. 2011, Carnegie Mellon, Computer Science
- Pruitt KD, Tatusova T, Maglott DR: NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007, 35: D61-65. 10.1093/nar/gkl842.
- Ballesteros JA, Weinstein H: Integrated methods for the construction of three-dimensional models and computational probing of structure-function relations in G protein-coupled receptors. Methods in Neurosciences. 1995, 25: 366-428.
- Eswar N, Eramian D, Webb B, Shen MY, Sali A: Protein structure modeling with MODELLER. Methods Mol Biol. 2008, 426: 145-159. 10.1007/978-1-60327-058-8_8.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.