GPCRmd uncovers the dynamics of the 3D-GPCRome

G-protein-coupled receptors (GPCRs) are involved in numerous physiological processes and are the most frequent targets of approved drugs. The explosion in the number of new three-dimensional (3D) molecular structures of GPCRs (3D-GPCRome) over the last decade has greatly advanced the mechanistic understanding and drug design opportunities for this protein family. Molecular dynamics (MD) simulations have become a widely established technique for exploring the conformational landscape of proteins at an atomic level. However, the analysis and visualization of MD simulations require efficient storage resources and specialized software. Here we present GPCRmd (http://gpcrmd.org/), an online platform that incorporates web-based visualization capabilities as well as a comprehensive and user-friendly analysis toolbox that allows scientists from different disciplines to visualize, analyze and share GPCR MD data. GPCRmd originates from a community-driven effort to create an open, interactive and standardized database of GPCR MD simulations. GPCRmd is a community-driven online platform to visualize, analyze and share G-protein-coupled receptor (GPCR) molecular dynamics data. It currently contains simulation data representing 100% of GPCR classes, 71% of receptor subtypes and 80% of GPCR families.

proved useful to complement experiments and characterize GPCR fluctuations at the atomic level 4 . Likely due to technical and sustainability limitations, only a modest number of online resources cover MD simulations (reviewed in ref. 5 ). Recent large improvements of internet bandwidth, compression of simulation data and storage capacities now enable faster and larger online repositories that host atom trajectories from MD simulations. Moreover, new visualization 6 and online file-sharing 7,8 tools have opened the door to streaming and remotely inspecting MD trajectories online, thereby removing the need for specialized MD software 5 .
Here we present the GPCRmd platform ( Fig. 1), an open access and community-driven research resource for sharing GPCR MD simulations with the aim of mapping the entire 3D-GPCRome. This new resource paves the way for GPCR scientists from very different disciplines to perform comparative studies on universal aspects of GPCR dynamics. We showcase the potential of GPCRmd for exploring key aspects of GPCR dynamics by performing comparative analyses of internal water molecules and sodium ion binding in multiple GPCR MD simulations. The open and intuitive design of the GPCRmd platform will not only foster interdisciplinary research and data reproducibility, but also transparent and easy dissemination of GPCR MD simulations.

MD simulations from all GPCR classes structurally solved to date.
GPCRmd is a community-driven resource that provides direct and interactive visualization of MD trajectories, and that is only contingent on a web browser. As a result, the GPCRmd platform grants easy access for both computational and nonexpert scientists. Moreover, we equipped it with a comprehensive set of tools to easily analyze molecular interactions and protein motions involving conserved, pharmacologically relevant or disease-related residues and structural motifs potentially involved in GPCR function (Fig. 1b,c). In adherence to the findable, accessible, interoperable and reusable principles for scientific data management 9 , GPCRmd provides open access to all of its data and simulations protocols (Fig. 1a). Corresponding data are deposited either by individual contributions or biannual updates from the GPCRmd community.
We initiated the GPCRmd database by creating a comprehensive MD dataset including at least one representative structure from each of the four structurally characterized GPCR classes. To allow for comparison of ligand-induced effects across receptors, this first set comprises 98 PDB identifiers from 38 different receptor subtypes ( Fig. 2) in their apo form or bound to a natural ligand, surrogate agonist or antagonist (see Methods). To generate reproducible data, we carefully designed a common protocol for the collective set-up and simulation of all structures listed in Fig. 2 (see Methods) and made it publicly available at https://github.com/GPCRmd/MD-protocol. Each system was simulated for 500 ns in three replicates (total time 1.5 µs) allowing for structure relaxation as well as sampling of receptor flexibility. At the time of writing, the GPCRmd platform holds 588 GPCR MD simulations from the GPCRmd community plus 28 simulations from individual contributions totaling to an aggregated simulation time of ~400 µs.

GPCRmd viewer: sharing and interactive visualization of GPCRs in motion.
To provide easy sharing and interactive visualization of GPCR MD simulations within the 3D-GPCRome, we created the GPCRmd viewer (Fig. 3). This viewer builds on MDsrv 6 , a recently published tool that allows easy trajectory sharing and makes use of the interactive capabilities of the popular web-based structure viewer NGL 5 .
The GPCRmd viewer provides interactive structural analysis of the simulations through on-click actions (Fig. 3b). To account for the fact that almost 25% of the GPCR functional sites show an average of at least one polymorphism, we mapped all GPCR variants 10 and site-directed mutations 11 from the GPCRdb 2 to each GPCR structure. Activation of the modes 'Show variants' or 'Show mutations' displays, respectively, each variant or mutation as small beads (Fig. 3b). A click on a bead reveals further information on the variant/mutation, including a link to experimental data and the original publication. A separate on-click mode, 'Show distances' , exploits NGL 5 to measure atom pair distances.
The powerful selection capabilities of the viewer (Fig. 3c-e) enable fast inspection of trajectories. Standard selections quickly visualize any molecule type in the simulation, neighboring molecules at a custom distance of each other or specific positions along the protein sequence. It is worth noting that the GPCRmd viewer makes use of GPCRdb generic residue numbering 12 by automatically linking each residue to its respective index position. Predefined knowledge-based selections enable more specific displays such as residues within 2.5 Å of the ligand (Fig. 3c), individual GPCR helices or highly conserved positions and functional motifs (Fig. 3e). In addition, the NGL selection language (see Documentation) enables the use of custom selection keywords to create tailored representations of any atom or part of the trajectory loaded in the GPCRmd viewer (Fig. 3d). Since several of these keywords stand for the chemical nature or secondary structure of proteins, they are particularly helpful for visual analysis of GPCR dynamics.
Furthermore, the GPCRmd viewer provides visualization of X-ray and electron microscopy density maps from the PDB. This allows for atomic-level comparison of the GPCR conformational landscape inferred from experimentally determined structures and observed in MD simulations (Fig. 3f).
GPCRmd toolkit: investigation of GPCR dynamics through interactive analysis. The GPCRmd toolkit provides intuitive analysis of the MD simulations by complementing and directly interacting with the GPCRmd viewer (Fig. 1b, left). The toolkit allows to compute custom distances, root mean square deviation (r.m.s.d.), and averaged water density maps for individual simulations (Fig. 1b, right). In addition, it provides interactive tools to qualitatively and quantitatively compare the noncovalent landscape of contacts for the entire GPCRmd dataset (Fig. 1b, right).
Interaction network tool. To easily identify relevant noncovalent contacts in GPCRmd simulations, the GPCRmd toolkit uses Flareplots 13 , an interactive circular representation of contact networks that can be displayed per frame or summarized for the complete trajectory (Fig. 4, right). The interaction network tool automatically integrates the GPCRmd viewer with the GPCRmd toolkit, making it straightforward to detect, for instance, differences in the hydrogen bonding network dynamics between active and inactive receptor simulations. The current version of the interaction network tool focuses on intra-and inter-helical interactions including nine different types of noncovalent interaction (see Methods).
Interaction frequency tools. The GPCRmd toolkit provides two dedicated tools to study key electrostatic interactions, namely hydrogen bonds and salt bridges. The hydrogen bonds tool identifies GPCR intra-and intermolecular hydrogen bonds formed during the simulation, whereas the salt bridges tool identifies GPCR intramolecular salt bridges. Moreover, these tools allow studying the interplay between the receptor and the membrane by computing protein-lipid interactions. Furthermore, it can identify protein residues involved in ligand binding through the ligand-receptor contacts tool. The tool outputs the interaction strength at each residue by computing its contact frequency (Fig. 1b, right). All three contact tools provide interactive visualization of the results in the GPCRmd viewer.  Alternatively, per-frame atom distances can be measured and displayed in the viewer via on-click actions. While distance measurements can provide relevant information on protein structure (for example, functionally relevant protein motions, bond formation/ breaking and so on), r.m.s.d. calculations are more suited to quantify structural stability and conformational changes. The r.m.s.d. tool measures the structural difference of protein and ligand atoms at any point in the simulation with respect to the initial frame. Therefore, it can be used to monitor simulation integrity or structural deviations throughout the simulation. Both tools generate time course plots (Fig. 1b, right) that can interactively link each data point to its respective frame in the viewer.
Water volume distribution tool. Due to the vital role of internal water molecules in GPCRs 14 , we equipped the GPCRmd toolkit with a water density map tool. This tool can quickly display an averaged water density map of the MD trajectory under study in the GPCRmd viewer (Fig. 1b, right), thus allowing to monitor, for example, the formation of the continuous internal water channel known to be essential for GPCR activation 15 .
Tunnels and channels tool. Just like all proteins, GPCRs hold an intricate system of tunnels and channels that can facilitate the access of water, ions, lipids and ligands by connecting the outside environment to the receptor core 16,17 . Since these pathways may change substantially over time, we provided the GPCRmd toolkit with a tool to analyze and display tunnels and channels. The new widget allows users to select among the list of computed tunnels and channels to immediately display them in the GPCRmd viewer using different visualization schemes.
Functional hotspots discovered through meta-analysis of GPCR simulations. The GPCRmd platform can uniquely compare GPCR simulations within the 3D-GPCRome (Figs. 1c and 2). We developed a module specifically comparing multifold GPCR simulations to uncover universal or distinct mechanisms governing the structural dynamics of these receptors. This module computes the contact frequency of each residue pair for multiple simulations and displays a global comparative analysis via an interactive heatmap plot (Fig. 5a, left). The tool also performs clustering analysis of the contact frequency data to hierarchically classify each receptor and display the resulting tree alongside the heatmap plot (Fig. 5a, left).
To further facilitate the interpretation of large heatmaps, we added interactive analysis and visualization capabilities of selected clusters using Flareplots 13 and NGL 5 .
To demonstrate the use of the meta-analysis tool, and due to their critical role in receptor function 14,15 , we investigated the interaction fingerprint of water molecules in GPCRs. Along with previously described 18 conserved water networks, this analysis revealed other water-mediated interactions that are conserved among different receptor subtypes and reported here. For example, in line with Venkatakrishnan et al. 18 , the β 2 -adrenoceptor (β 2 AR) and OX 2 -receptor (OX 2 R) display a common water network that links TM1 (N1x50) and TM2 (D2x50) (Fig. 5b, highlighted in purple). This water network is extended from TM2 (2x50) to TM7 (7x49) in the OX 2 R cluster. Such a water network extension is not observed in the β 2 AR cluster due to closer proximity of residues 2x50 and 7x49, which enables direct, unmediated, contacts. Another conserved water-mediated feature is a bifurcated polar network linking TM6 (6x47, 6x51) and TM7 (7x37) via helix backbones in the β 2 AR and the OX 2 R clusters (Fig. 5b, highlighted in green). Our study shows that this bifurcated network is less prominent in active structures ( Supplementary Fig. 1). Taking into account that TM6 undergoes large conformational changes on receptor activation, it is tempting to speculate that uncoupling the interactions between individual water molecules in this bifurcated network represents a step during receptor activation.
Likewise, our analysis reveals important differences between both clusters. A water bridge between intracellular loop 1 (ICL1, 12x49) and helix 8 (H8, 8x49) is found to be only present in the β 2 AR (Fig. 5b, highlighted in orange). Further studies (for example, site-directed mutagenesis) could be used in the future to investigate whether this water bridge contributes to the distinct coupling  efficacy and/or specificity shown by the β 2 AR (principal signaling pathway, Gs family 19 ) and OX 2 R (principal signaling pathways; Gs family, Gi/Go family andGq/G11 family 19 ). Finally, our collective analysis reveals a water bridge between TM3 (3x41) and TM4 (4x56) only observed in the β 2 AR (Fig. 5b, highlighted in gray) and likely related to the striking change in receptor stability observed on mutation of residue E3x41 in experiments 20 .
Exploiting the entire GPCRmd dataset: custom analysis of sodium ion interactions across class A GPCRs. We made the entire GPCRmd dataset available for download (see Methods), thus opening the door for the scientific community to perform comparative analyses of multiple simulations across several receptor structures, families, subtypes and classes. To demonstrate the value of such a comprehensive dataset, we studied sodium ion (Na + ) interaction in GPCRs 21 , an almost universal, albeit poorly understood, mechanism of allosteric modulation of these receptors 22 . We analyzed Na + interaction to conserved orthosteric (3x32) and allosteric (2x50) residues in 183 simulations (   different frequencies of Na + interaction with these two residues enable receptors to be clustered in three groups (I, II and III, Fig.  6a,c). Note that our dataset (3 × 0.5 μs) provides valuable insights into sodium interaction sites but it is not sufficient to conclude about binding kinetics.
In line with previous studies using multiple simulations 23 , our analysis shows that Na + binds to D2x50 and/or position 3x32 in most of the receptor subtypes (Fig. 6a). Group I (serotonin, dopamine and nociception receptors) shows high sodium interaction frequencies to positions 3x32 and 2x50, the latter being stabilized by a hydrogen bonding network often composed of D2x50, S3x39, N7x45 and S7x46 (so-called DSNS motif) (Fig. 6b). The high interaction frequency to both positions implicates that at times Na + ions bind simultaneously to position 3x32 and the allosteric site at D2x50. This seems to be a consequence of a higher negative net charge at the extracellular side (Fig. 6c,d), which increases the local concentration of positively charged Na + around the receptor entrance, and likely facilitates the simultaneous entrance of a second ion. Notably, despite a completely conserved DSNS motif, group II (β-adrenergic and muscarinic receptors) shows marginal interaction frequency at D2x50, while still exhibiting a high interaction frequency at 3x32. Visual inspection of the simulation reveals hydrophobic barriers that hamper Na + passage from 3x32 to 2x50 (Fig. 6c), in line with previous MD simulation studies 23,24 . In contrast to group II, we find high interaction frequencies at position 2x50 for group III and none or only marginal contacts with 3x32. Most receptors of this group (for example, adenosine A 1A and A 2A , or the chemokine receptor CXCR4) lack an aspartate in position 3x32 allowing for direct diffusion to position 2x50 (Fig. 6c). Despite having a D3x32 (Fig. 6b), only low binding is observed during the simulated time frame at this position for a small subgroup of receptors including histamine H 1 and opioid μ and δ receptors. More simulation time would be The radar plot shows the prevalence of sodium interactions (0-100%) over the total accumulated simulation time of 1.5 µs (3 × 0.5 µs). b, Sequence alignment of sodium binding sites for the GPCR subtypes included in the simulated dataset. Allosteric binding consists of a multi-step binding process typically initiated with accumulation at the extracellular receptor side followed by receptor penetration through the orthosteric binding site (visiting D3x32, if present) before progressing to the allosteric site D2x50. c, GPCRs can be classified into three groups based on the sodium interaction profile. The interaction profile is driven by the structural features of the sodium entrance channel. d, Extracellular net charge and receptor entrance of a second ion. required to improve the sampling of ion binding to D3x32. Finally, in a particular subset of receptors, Na + binds neither to D2x50 nor to position 3x32 within the studied time frame (group IV, Fig. 6a). In fact, slower Na + binding kinetics has previously been reported 23 and could be the consequence of blocked access to the binding site from the extracellular side (for example, receptors taking up ligands from the lipid bilayer). While our results confirm the essential role of D2x50 for allosteric sodium binding 14,25 in class A GPCRs, they also reveal that the presence or absence of D3x32 in the orthosteric binding site determine distinct Na + binding profiles. This analysis exemplifies the potential of the comprehensive GPCRmd dataset to investigate how GPCR sequence, structure and dynamics can jointly contribute to receptor allosteric modulation.

Discussion
In the last decade, static structures in the 3D-GPCRome have predominantly been described as active, intermediate or inactive states. However, a growing body of research suggests that GPCRs are not two-or three-state systems but exhibit a wide range of conformational states with sometimes subtle yet important differences. While several experimental techniques such as nuclear magnetic resonance (NMR) 26 , double electron-electron resonance (DEER) 27 or single-molecule fluorescence energy transfer (smFRET) 28 have provided relevant insights into the dynamics and flexibility of GPCRs, MD simulations have emerged as the most promising opportunity to study the complexity of GPCR conformational dynamics in atomistic detail 4 . Moreover, MD simulations can resolve mechanistic elements at time scales and conditions that are not always accessible with experimental techniques.
We have demonstrated the use of the GPCRmd platform by performing comparative analyses across multiple receptors of two important aspects of GPCR biology, namely water network and allosteric Na + interaction analysis. We were able to pinpoint relevant structural features that help improve our current understanding of the diversity of GPCR function.
A platform for interdisciplinary investigation of the 3D-GPCRome. The GPCRmd is designed to facilitate interactions and data exchange between GPCR scientists of different disciplines including structural and evolutionary biologists, computational and medicinal chemists and protein engineers (Table 1). Our tool will become a useful asset for experimental laboratories by providing open access to the dynamic context of specific GPCRs, hence directing or assisting functionally relevant experiments such as cross-linking or mutagenesis studies. Similarly, protein engineers and structural biologists will now be able to employ the GPCRmd workbench to quickly identify specific flexible regions that potentially require protein stabilization.
Moreover, the GPCRmd will be of great benefit to medicinal chemists and drug designers. They will be able to quickly use atomic-level information on the stability/strength of specific ligandreceptor interactions, and the binding of water molecules or ions using the ligand-receptor contacts or water volume distribution tools (see Fig. 1 and Table 1). In addition, drug design scientists can use GPCRmd to investigate potential ligand binding and unbinding pathways based on the dynamics of specific structural elements such as loops, hence aiding the design of new or improved compounds. Furthermore, the GPCRmd can provide valuable structural insights into the location of natural variants and its potential impact on drug binding or receptor functionality. Our cross-referenced data allows easy mapping of variants and site-directed mutations onto the receptor structure and investigation of their dynamics during the simulations (Fig. 1b, right, and Fig. 3b). This could guide further investigations to predict drug efficacy or adverse reactions in individuals with a specific variant and in turn support the selection of more efficacious and safer drug treatments.
Beyond wet-laboratory applications, GPCRmd is an important dissemination resource for computational biologists, ranging from students and MD novices to MD experts and bioinformaticians from related fields. Our platform offers a harmonized database to perform future comparative studies across different MD setups, force fields, ligands, lipid compositions or GPCR variants, which offers a substantial advantage over currently available archives or data repositories such as FigShare (https://figshare.com/) or Zenodo (http://zenodo.org).
The GPCRmd consortium: reproducible and sustainable research in GPCR MD simulations. This community-driven effort has laid the foundation of the GPCRmd consortium, an open community of GPCR computational researchers driving the centralization, dissemination, and development of open source and reproducible analysis of massive amounts of GPCR MD data. We believe that GPCRmd will enhance the dissemination of scientific results by offering a platform to make published protocols and simulation data publicly available. This will promote transparency, consistency and reproducibility in the field of GPCR dynamics. On the other hand, community engagement will overcome one of the most important challenges faced by this kind of resource, namely sustainability (Supplementary Note 1). The implementation of the GPCRmd consortium under the umbrella of the active European Research Network on Signal Transduction (ERNEST, https:// ernest-gpcr.eu) 29 will provide support to (1) foster the development of new analysis tools (for example, conformational analyses and dynamic pharmacophore models), and (2) increase the coverage of the 3D-GPCRome with future releases of the GPCRmd platform. While the first GPCRmd dataset from the consortium already maps more than 70% of GPCR subtypes within the 3D-GPCRome, future biannual releases as well as individual contributions from the scientific community will further increase this coverage bridging the gap between solved and simulated structures.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/ s41592-020-0884-y.

Methods
MD simulations. The first GPCRmd includes 98 different GPCR structures bound to their natural ligand (for example, sphingosine-bound S1P 1 R), an agonist (for example, ergotamine-bound 5HT 2B R) or an antagonist (for example, alprenolol-bound β 2 AR). In addition to ligand-bound structures, we included an apo form of each receptor by removing the ligand from its binding pocket. We carefully designed a common protocol for the collective set-up and simulation (Supplementary Note 2) phases of all structures. During the set-up phase, different expert members of the GPCR MD community individually prepared each family of GPCR structures by refining/remodeling PDB structures (for example, missing residues, disulfide bridges, cocrystallization molecules, loop remodeling and so on), placing missing water molecules 31 and sodium ions or assigning relevant protonation states (Supplementary Note 2). Next, each protein was prepared for simulation by embedding it in a lipid bilayer and adding water and ions to the system. Each system was equilibrated following a standard procedure previously outlined and discussed within the GPCR MD community (Supplementary Note 2). Finally, the distributed computing platform GPUGRID 32 was used to simulate three replicas of each system for 500 ns (that is, accumulated 1.5 μs). We made all set-up and simulation protocols openly available at https://github.com/GPCRmd/ MD-protocol.  Fig. 2), molecular entities (molecule object) identified by an InChI 33 generated with forced hydrogen connectivity ( Supplementary Fig. 3), crystalized assembly (model) ( Supplementary Fig. 4), MD simulations (dynamics) objects ( Supplementary Fig. 5) and chemical species (compound) identified by standard InChI ( Supplementary  Fig. 3). Furthermore, we incorporated experimental data from IUPHAR 34 and BindingDB 35 (Supplementary Fig. 6) and linked each main object to bibliographic references. GPCRdb 2 tables were used to add standard nomenclatures to GPCR sequence residue numbers.
Custom analysis. The whole GPCRmd repository is released as open source under the Creative Commons Attribution 4.0 International License hence enabling downloading and custom analysis of the comprehensive dataset. Each trajectory can be downloaded from its respective link at the simulation report page (see Documentation). We exemplified this usage by studying sodium ion binding across a selection of class A GPCRs within the GPCRmd dataset. The frequency of sodium ion binding to the closest oxygen atom of the carboxylic group (2 × O ϵ ) of residues 3x32 and 2x50 were computed using a cutoff distance of 5 Å. Both highly conserved positions are normally aspartate residues. For nonconserved residues we used the following atoms: Gln (N ϵ , O ϵ ), His (N δ ), Arg (N ϵ ), Ala (C γ ), Val (2 × C γ Hydrogen), Ile (2 × C γ Hydrogen), Met (S δ ), Phe (2 × C δ ) and Tyr (2 × C δ ).
GPCRmd viewer. The GPCRmd viewer uses builds on NGL 2.00 (ref. 5 ) and MDsrv 0.3.5 (ref. 6 ) and uses data from the PDB (rcsb.org 36 ), the GPCRdb 2 and the Genome Aggregation Database (gnomAD) 37 . The data for on-click modes, variants and site-directed mutagenesis annotations are taken from the GPCRdb 2,10,11 and include generic GPCR numbers 12 , original and mutated residues, effect of the mutation in ligand binding (fold change), experiment type, ligand used for the experiment and bibliographic reference. Variant data is obtained from the gnomAD 37 , and includes amino acid substitutions (canonical and variant), allele frequencies and link to the gnomAD entry describing the variant. On-click selection capabilities build on NGL 2.0.0 (ref. 5 ) web viewer, which allow the creation of different representation objects using the NGL selection language. GPCRmd selection capabilities also feature the GPCR generic numbering scheme 12 . In this case, GPCRdb numbers are adapted to the NGL selection language through regular expressions. Experimental density maps are loaded from PDB and aligned to the first frame of the simulation displayed using NGL 2.0.0 (ref. 5 ). The transformation matrix applied to the density map to perform the alignment is precomputed using the Python library MDAnalysis v.0.20.1 (ref. 38 ).
GPCRmd toolkit. Interaction networks. Noncovalent residue-residue interactions formed in the simulation are displayed using Flareplots 13 . To precompute interactions during the simulation, we used GetContacts 13 in all interaction types except for hydrogen bonds, where we used the definition of Wernet and Nilsson. We manually integrated Flareplots and NGL to allow for interactivity between the GPCRmd toolkit and the GPCRmd viewer.
Interaction frequencies. Hydrogen bonds are calculated using the 'wernet_nilsson' module of MDtraj 39 . A hydrogen bond is defined using distance and angle cut offs between hydrogen donor (NH or OH) and acceptor (N or O) atoms as follows: where r DA is the distance (Å) between donor and acceptor heavy atoms and δ HDA is the angle (degrees) formed between the hydrogen atoms of donor and acceptor atoms. By default, the analysis does not consider hydrogen bonds between neighboring residues and includes side chains as well as backbone atoms. Ligandreceptor contacts are computed using the compute_contacts module of MDtraj 39 . Salt bridge frequency is computed using the 'compute_distances' module of MDtraj 39 . Salt bridges are defined as any combination between the sets {Arg-NH1, Arg-NH2, Lys-NZ, His-NE2, His-ND1} and {Glu-OE1, Glu-OE2, Asp-OD1, Asp-OD2} with atoms closer than 4 Å. Histidine atoms are only considered if the residue is protonated. The distance between atom pairs through the entire or strided trajectories is computed using the 'compute_distances' module of MDtraj 39 . Atom pairs can be defined either using the 'Show distances' on-click mode and imported to the tool, or NGL selection language instances.
The r.m.s.d.. The r.m.s.d. is computed using the rmsd module of MDtraj 39 . The first frame of the trajectory is used as a reference structure by default. The atoms used for r.m.s.d. computation can be defined using the provided preselection in the GPCRmd toolkit (for example, protein alpha carbons, nonhydrogen protein atoms, ligand and so on). The r.m.s.d. is computed after optimal alignment according to the following equation: r:m:s:d:ðtÞ ¼ where N atoms is the number of atoms for structure comparison, r i (1) is the position of atom i in the reference frame (that is, trajectory frame 1) and r i (t) is the position of atom i at time t of the trajectory.
Water volume distribution. Water occupancy maps are precomputed and stored on the server side using the VolMap tool of VMD 40 . Maps are generated only for oxygen atoms of a water molecule using a cutoff distance of 10 Å to the protein and a resolution of 1 Å. Atoms are treated as spheres using their atomic radii. The resulting isosurface is displayed in the GPCRmd viewer.
Tunnels and channels. Tunnels and channels are precomputed using the CAVER 3.0 software 41 and stored on the server side. We used as starting point coordinates for apo forms and receptor-ligand structures the center of mass of ligand-interacting residues in the respective PDB structure. Computations were carried out using a shell radius 3 Å, shell depth 4 Å and a probe radius of 1.4 Å. Selected results are displayed in the GPCRmd viewer.
Meta-analysis tool. Contacts are computed using GetContacts 13 and results plotted as interactive heatmaps using the Bokeh visualization library (https://docs.bokeh. org/en/latest/). Contact frequencies per system are averaged over simulation replicas. For accurate comparison, residue contact pairs are aligned using the GPCRdb generic numbering scheme 12

Data availability
The MD data have been deposited in the GPCRmd database (http://gpcrmd.org/).

Code availability
Set-up, simulation and analysis protocols are openly available at https://github. com/GPCRmd. Last updated by author(s): Apr 30, 2020 Reporting Summary Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.

n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data All manuscripts must include a data availability statement. This statement should provide the following information, where applicable: -Accession codes, unique identifiers, or web links for publicly available datasets -A list of figures that have associated raw data -A description of any restrictions on data availability The molecular dynamics data have been deposited in the GPCRmd database described in the paper and are freely available (http://gpcrmd.org/).