
Structural basis for hepatitis B virus restriction by a viral receptor homologue

Overall structure of mNTCP

The structure of mNTCP was examined to understand why mNTCP does not function as an HBV receptor, although it shares 96.0% identity with hNTCP. Recombinant mNTCP was expressed in Sf9 cells and purified. mNTCP was complexed with a Fab fragment antibody used in structural studies of apo-hNTCP17, and the complex was observed using cryo-EM. The structure of the mNTCP-TCA-Fab complex was determined at 3.49 Å resolution (Fig. 1A and Supplementary Fig. 1).

Fig. 1: Overall structure and substrate binding of mNTCP.

A The overall structure and cryo-EM density map of the macaque NTCP (mNTCP)/YN69083Fab complex. The 14 nonconserved residues with human NTCP (hNTCP) are highlighted in green. Cyan, mNTCP; orange, Fab heavy chain; blue, Fab light chain. B Structural comparison of taurocholic acid (TCA)-bound mNTCP (cyan) with apo-state hNTCP (green, PDB:7FCI). Root mean square deviation (RMSD) for all Cα atoms. C Vertical slice-through of a hydrophobicity surface representation of mNTCP, showing two TCAs and the tunnel. Black dashed boxes indicate the binding interfaces of the two TCAs. Hydrophobic and hydrophilic areas are shown in orange and cyan, respectively. D Residues interacting with the two TCAs in the tail and head interfaces. Residues are shown as sticks. Carbon, nitrogen, and oxygen atoms are in yellow, blue, and red, respectively. E Schematic diagrams showing front and side views of TCAs bound in mNTCP. The planar angle and distances between TCA-1 and TCA-2 are measured. The atoms of TCA-1 and TCA-2 used to measure distances are labeled in yellow and orange, respectively. Carbon, nitrogen, oxygen, and chlorine atoms are in black, blue, red, and green, respectively. F Schematic diagrams of the interactions between TCA and mNTCP, drawn using Ligplot+58. Carbon, nitrogen, oxygen, and chlorine atoms are in black, blue, red, and green, respectively. Ball and stick model showing residues in mNTCP that form hydrogen bonds with TCA. G Cartoon and surface representation of TCA-bound wild-type mNTCP and the S267F mutant homology model. S267 and R158 are represented using a stick model. The distance between sulfur of TCA-1 and oxygen of hydroxyl group in S267F is measured. The region of steric hindrance with the upper TCA (TCA-1) is highlighted by a black dashed box. Please note that hydrogens are not modelled in the S267F homology model.

NTCP from both humans and macaques is a monomeric transmembrane protein of 349 aa residues. The final refined model of mNTCP consisted of 290 aa residues without breaks, from R21 to P310, arranged to form nine transmembrane helices (TM1–9), of which TM3 and TM8 broke in the middle at a “cross-over” point, as described for apo-hNTCP17,18,19,20 (Fig. 1A). The cross-over region was intimately connected to the sodium ion binding sites essential for coupling substrate transport to the sodium gradient across the cell membrane. Comparison of the mNTCP and hNTCP models gave a root mean square deviation (RMSD) of 0.81 Å for Cα atoms, indicating similar tertiary structures (Fig. 1B). The main differences were found at the N terminus, where the hNTCP model contained seven more residues than the mNTCP model. mNTCP and hNTCP differed by 14 aa residues, mostly of a conservative nature and on the protein surface (Supplementary Fig. 2). One of these amino acid differences affected interactions with the antibody: replacement of K86 in hNTCP by Asn in mNTCP permitted an extra hydrogen bond to form with the Fab light chain through N30.
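As an illustration of the Cα RMSD quoted above, the following is a minimal sketch that assumes the two models have already been superposed and that their Cα atoms are matched one to one. The function name and the toy coordinates are hypothetical and are not taken from the study.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """RMSD (Å) between two equally sized, already superposed lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared displacement of one matched atom pair
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy coordinates (hypothetical): every atom is displaced by 1 Å along x
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
print(ca_rmsd(model_a, model_b))
```

In practice the superposition step (for example, a least-squares fit) would precede this calculation; it is omitted here to keep the sketch short.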

Substrate binding site of mNTCP

The mNTCP model showed the same tunnel found in hNTCP, connecting the extracellular space to the cytoplasm17,18,19,20. The tunnel was between TM1 and TM6 on one side and the rest of the protein on the other. It was partially open to the extracellular milieu and lined predominantly with hydrophilic residues in this region (Fig. 1C). Toward the cytoplasmic side, however, it was lined with aliphatic side chains, resulting in a hydrophobic surface. Within the tunnel, two elongated regions of density were observed, which had shapes consistent with the polycyclic scaffold of bile acid. Considering the presence of TCA at a concentration of 1 μM in the buffer, and reports of glycochenodeoxycholic acid (GCDCA) bound to hNTCP20, two TCA molecules were modeled into the observed density; they lay roughly parallel and made extensive interactions with TM1, TM3, TM6, and TM9. Both TCA molecules were modeled with the steroid moiety pointing toward the cytoplasmic end of the tunnel, and the sulfonic acid group closer to the hydrophilic surface of the tunnel near the extracellular face of the protein (Fig. 1C). Residues in contact with the bound bile acids included: TM1 L31, M34, L35; TM3 N103, L104; TM5 S162; TM6 V202, T203; TM8 N262, Q264; and TM9 L287, M290, I291, L294. L31, Q264, and L294 appeared to assist in packing the two TCA molecules within the protein (Fig. 1D). Their snug fit in the tunnel suggested that they could be held relatively immobile by each other and the surrounding helices (Fig. 1E). The two TCA molecules were related by a rotation of approximately 60° and a translation of ~2 Å (Fig. 1E). N209 exhibited a polar interaction with the sulfonate group of the upper TCA, whereas N262 and Q264 displayed polar interactions with the sulfonate group of the lower TCA (Fig. 1F). We speculate that these residues play an important role in the movement of TCA molecules across the membrane.

Most of the nonconserved residues between mNTCP and hNTCP were distant from the TCA binding site, suggesting no involvement in bile acid binding. V38 in TM1 (Ile in hNTCP) and R158 in TM5 (Gly in hNTCP) were within 4.5 Å of one TCA molecule directly, but neither residue formed a strong interaction. In the mNTCP structure, S267 (located in TM8b) lay within 5.1 Å of the TCA molecule toward the extracellular face, which was closer to where GCDCA bound hNTCP20. The S267F variant impaired bile acid transport and HBV infection in humans21,22, probably because the bulky side chain sterically interferes with TCA and preS1 binding (Fig. 1G). It may also alter the energy balance between forms of the protein open to the extracellular milieu or cytoplasm. Similarities between mNTCP and hNTCP suggest that this mutation would block bile acid transport in macaques.

Structural comparison of hNTCP and mNTCP

The 14 non-conserved residues could be classified into six groups based on location: residues 29 and 38 in TM1; residues 84 and 86 in loop TM2–TM3; residues 140 and 142 in TM4; residues 157, 158, 160, 161, and 165 in TM5; residues 303 and 305 in TM9; and residue 332 at the C-terminus (Fig. 1A and Supplementary Fig. 2). Overlaying the models of TCA-bound mNTCP and preS1-bound hNTCP (PDB: 8HRY) gave an RMSD value of 1.29 Å (Fig. 2A)16. The increased deviation, relative to the apo-hNTCP model, was mainly due to widening of the gap between TM1 and TM5, whose N-terminal ends bend apart to create space for preS1 binding (Fig. 2A, B). P165 of mNTCP was substituted with Leu in hNTCP. This substitution did not cause significant distortion of TM5 between the TCA-bound mNTCP and apo-hNTCP models. However, substitution of Pro with Leu might allow TM5 to bend more easily. The impact of the P165L substitution on preS1 binding was experimentally examined later (see Supplementary Fig. 9 and Discussion). Nevertheless, the Arg side chain at position 158 caused a steric clash with the main chain of S6–V7 and V7–P8 in preS1 (Fig. 2B, dotted box). Although the steric clash caused by R158 of mNTCP could be sufficient for blocking preS1 binding and HBV infection, the rigidity of TM5 might be a differentiating factor between the two NTCP homologues. Despite variable preS1 sequences across HBV strains and hepadnaviruses23, amino acid substitutions in preS1 might not overcome R158-mediated hindrance, because the steric hindrance involves the main chain of preS1 irrespective of the amino acid species. We evaluated the relevance of mNTCP R158 in the loss of preS1 binding in the following experiments (see Fig. 3). Among the other non-conserved residues, those at positions 84 and 86 are located in the loop region connecting TM2 and TM3, which interacts with the C-terminus of the 2–48 aa region of preS1 (aa 45–48) (Fig. 2C). The side chain of K86 in hNTCP formed hydrophobic interactions with N45 and V47 of preS1 (Fig. 2C). Compared with R84 and K86 in hNTCP, the substituted amino acids Gln and Asn at positions 84 and 86, respectively, in mNTCP were smaller and presented smaller hydrophobic surfaces (Fig. 2C, D).

Fig. 2: Structural comparison of mNTCP and hNTCP.

A Structural comparison of TCA-bound mNTCP (cyan) with preS1 (pink)-bound hNTCP (green, PDB:8HRY). The RMSD values for all Cα atoms are indicated. The movements of TM1 and TM5 by preS1 binding are indicated by red arrows. B Close-up view of preS1 binding cavity surrounded by TM1 and TM5. The region of steric clash with preS1 is highlighted by a black dashed box. TCA-bound mNTCP (cyan), preS1 (pink)-bound hNTCP (green), and preS1 are shown as cartoons. C Structural comparison of extracellular surface binding interfaces of preS1 in mNTCP with preS1-bound hNTCP. Hydrophobic and hydrophilic areas are shown in orange and cyan, respectively. Residues comprising loop TM2–TM3 are represented by a stick model. D Close-up view of extracellular surface binding interfaces of preS1. mNTCP and preS1-bound hNTCP are superimposed and represented as cyan and green cartoon models, respectively.

Fig. 3: R158G substitution in mNTCP is responsible but not sufficient for supporting preS1 binding and HBV infection.

A, B PreS1 binding capacity of wild-type hNTCP (hNTCP WT), mNTCP (mNTCP WT), and G158-substituted mNTCP (mNTCP R158G). HepG2 cells overexpressing the indicated NTCPs were incubated with TAMRA-conjugated myristoylated preS1 (2–48)-peptide (preS1-TAMRA) to detect cell surface-bound preS1 (red) and NTCP (green). Pictures of immunofluorescence analysis with 40 nM preS1-TAMRA incubation are shown in A. Scale bar: 100 μm. Quantification of preS1-TAMRA fluorescence intensities is indicated in B. The dashed line indicates the background level. C HBV infection. HepG2 cells overexpressing the indicated NTCPs were inoculated with HBV, washed, and cultured for 12 d. HBV infection was evaluated by monitoring HBsAg in the culture supernatant. D Quantification of fluorescence intensities for preS1-TAMRA bound to NTCP-expressing cells in the preS1 binding assay at different concentrations of exposed preS1-TAMRA (10, 40, 160, and 640 nM). E Bile acid uptake activity. HepG2 cells overexpressing the indicated NTCPs were incubated with [3H]-taurocholic acid (TCA) in the presence (dark gray) or absence (light gray) of sodium to measure intracellular radioactivity. F Cell surface NTCP protein production. HepG2 cells overexpressing the indicated NTCPs were incubated in biotinylation buffer, washed, and pulled down with streptavidin beads. Cell surface NTCP in the pull-down fraction (upper) and actin in total cell lysate as internal control (lower) were detected by immunoblotting using anti-myc and anti-actin antibodies, respectively. The gel data are shown from one representative experiment. Uncropped gels are shown in Source data. G, H PreS1 binding capacity of mNTCP WT and its 19 variants. HepG2 cells overexpressing NTCP WT or mutants at position 158 were incubated with preS1-TAMRA. Fluorescence pictures with 40 nM preS1-TAMRA incubation are shown in G. Scale bar: 100 μm. Quantification of preS1-TAMRA fluorescence intensities when treated with 40 (light gray) and 640 nM (dark gray) of preS1-TAMRA is indicated in H. I Bile acid uptake activity of NTCPs. HepG2 cells overexpressing the indicated NTCPs were incubated with [3H]-TCA in the presence (dark gray) or absence (light gray) of sodium to measure intracellular radioactivity. Bars show the means of three independent experiments (n = 3) and error bars represent standard deviation (SD). The dashed line indicates the background levels of the assay based on the control groups. Statistical significance of p values is indicated as follows: ****p

R158G substitution in mNTCP is responsible but not sufficient for promoting viral infection

First, we examined the role of the amino acid at position 158 in NTCP in supporting preS1 binding, HBV infection, bile acid uptake, and surface NTCP expression, by amino acid substitutions (Fig. 3). Using HepG2 cells transiently overexpressing NTCP or its variants having a mutation at aa 158, we evaluated (1) binding capacity to preS1 by incubation with a TAMRA-conjugated myristoylated-preS1 peptide comprising 2–48 aa (preS1-TAMRA) (see Methods for details) (Fig. 3A, B, and 3D), (2) susceptibility to HBV infection by infection assay (Fig. 3C), (3) bile acid uptake activity by incubation with [3H]-TCA in the presence or absence of sodium (Fig. 3E), and (4) cell surface NTCP expression as a control experiment, by biotinylating cell surface proteins followed by streptavidin-mediated pull down and immunoblotting (Fig. 3F). Although wild-type (WT) mNTCP-expressing cells did not exhibit detectable preS1 binding or HBV infection, the mNTCP R158G variant enabled both preS1 binding and HBV infection; however, these signals were significantly lower than those seen with hNTCP (Fig. 3A–D), consistent with other reports13,15. Lower preS1 binding to mNTCP R158G than to hNTCP was observed at all tested concentrations of the preS1-TAMRA probe (10–640 nM) (Fig. 3D). Cell surface expression levels of NTCP, as well as bile acid uptake activity, were not severely affected by the amino acid substitution (Fig. 3E, F). These results suggest that the R158G substitution in mNTCP enables preS1 binding, but is not sufficient for the full activity seen with hNTCP.

Amino acid at 158 in NTCP requires glycine for viral receptor function but can be variable for bile acid transport

Given the structural insights indicating the significance of the space between preS1 and the side chain at position 158, we evaluated sterically hindered regions and distances between preS1 and mNTCP when R158 in mNTCP was virtually substituted with each of the other 19 amino acids. We generated models for the 19 amino acid mutants of NTCP using residue substitution, applying the most prevalent rotamer from the Dunbrack rotamer library, and calculated the distance to, as well as the overlap with, preS1 for each mutant (Supplementary Data 1), to evaluate whether mNTCP variants with a small side chain at position 158 (e.g., mNTCP R158A) could support preS1 binding. Then the preS1 binding activity of the position 158-substituted mNTCP variants was evaluated (Fig. 3G, H). Of the 20 mNTCPs evaluated, only R158G supported preS1 binding. Neither the 18 other mNTCP variants nor WT mNTCP supported preS1 binding, even at a high concentration of preS1-TAMRA (640 nM) (Fig. 3G, H, and Supplementary Fig. 3A). Cell surface expression of NTCPs was somewhat variable, but all mNTCPs showed higher expression than hNTCP (Supplementary Fig. 3B). These data suggest that only Gly, the smallest amino acid, at position 158 secures enough space for preS1 to enter deeply into the bile acid tunnel for efficient binding to NTCP.
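The overlap measure mentioned above can be sketched as follows. This is only an illustration, assuming that overlap is defined as the sum of the two van der Waals radii minus the interatomic distance; the text does not state the definition or software used, and the function name, the radii table (Bondi values), and the coordinates are hypothetical choices for this example.

```python
import math

# Bondi van der Waals radii in Å (assumed here; the study's values are not stated)
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def vdw_overlap(atom_a, atom_b):
    """Overlap (Å) between two atoms given as (element, (x, y, z)).

    Positive values mean the van der Waals spheres interpenetrate,
    which is read here as a potential steric clash.
    """
    (el_a, xyz_a), (el_b, xyz_b) = atom_a, atom_b
    distance = math.dist(xyz_a, xyz_b)
    return VDW_RADII[el_a] + VDW_RADII[el_b] - distance

# Hypothetical example: a side-chain nitrogen 3.0 Å from a main-chain carbon
side_chain_n = ("N", (0.0, 0.0, 0.0))
main_chain_c = ("C", (3.0, 0.0, 0.0))
print(vdw_overlap(side_chain_n, main_chain_c))
```

A per-mutant screen would evaluate this quantity over all atom pairs between the substituted side chain and preS1 and report the largest value.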

Among the NTCP sequences derived from 44 mammalian species in the public database (Supplementary Data 2), the residue at position 158 was partially conserved: 31 species (70.5%), including humans, have Gly, whereas 9 and 4 species carry Arg and Ser, respectively, with clustering mainly in Old World monkeys, New World monkeys, and prosimians (Supplementary Fig. 4). Evolutionary and structural data indicate that the residue at position 158 is not essential for the physiological function of NTCP. To address this point, we examined whether position 158 was a “toggle switch” (essential for protein function: switching on [full activity] or off [no activity] by amino acid substitutions), “neutral” (dispensable for function: always active irrespective of amino acid type), or a “rheostat” (modifying activity: wide range from full activity to none depending on amino acid type) for the bile acid transport function of NTCP24,25. In total, the 20 mNTCPs (WT and variants at position 158) presented a wide range of bile acid uptake activities (17–142% of hNTCP WT). NTCP with Gly at aa 158 (mNTCP R158G) had the eighth strongest activity among mNTCPs, following those with Ser (R158S) and Arg (WT) at fifth and seventh places, respectively (Fig. 3I). Thus, the dual role of position 158, functioning as a “rheostat” for bile acid uptake but as a “toggle switch” for viral receptor function, suggests an evolutionary path of NTCP across mammalian species (see Discussion).
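The conservation figures quoted above follow from a simple tally. The sketch below uses only the species counts given in the text; the variable names are illustrative.

```python
# Species counts for the residue at position 158, as stated in the text
counts = {"Gly": 31, "Arg": 9, "Ser": 4}

total = sum(counts.values())  # number of mammalian species surveyed
percentages = {res: round(100 * n / total, 1) for res, n in counts.items()}

for res, pct in percentages.items():
    print(f"{res}: {counts[res]}/{total} species ({pct}%)")
```

The Gly fraction reproduces the 70.5% stated in the text; the Arg and Ser percentages are derived here from the counts and are not quoted in the source.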

Enhancement of preS1 binding and HBV infection by Lys at position 86 in NTCP

Because substituting Arg with Gly at position 158 in mNTCP intensified preS1 binding and HBV infection, but was not sufficient for showing full activity compared with hNTCP (Fig. 3), we examined the contribution of additional amino acids for acquiring full activity as a viral receptor. Of the 14 different amino acids between mNTCP and hNTCP (Supplementary Fig. 2), 5 were within 4.5 Å distance from preS1: I29, Q84, N86, and L161, in addition to R158 (Fig. 4A). We substituted these amino acids in mNTCP to their human counterparts (mNTCP 5 mut: Fig. 4A) to examine its capacity to support preS1 binding and HBV infection. For comparison, we also examined the mNTCP mutant possessing humanized residues at 29, 84, 86, and 161 positions (mNTCP 4 mut: Fig. 4A), mNTCP R158G, and WT mNTCP and hNTCP. HepG2 cells overexpressing mNTCP 4 mut were insensitive to preS1 binding; however, compared with mNTCP R158G, mNTCP 5 mut showed a significant increase in preS1-cell binding, reaching levels comparable with hNTCP (Fig. 4B, C). The HBV infection assay showed similar results (Fig. 4D). In these experiments, cell surface expression of NTCP was variable, but all NTCPs expressed higher than mNTCP 5 mut, which showed the highest HBV receptor activity (Supplementary Fig. 5A), suggesting that the loss of activity was not due to low protein expression. Then, to identify the main amino acid(s) responsible for the elevated activity, we constructed mNTCP variants with single substitutions (mNTCP I29V, Q84R, N86K, and L161I) and double mutants (mNTCP I29V/R158G, Q84R/R158G, N86K/R158G, and R158G/L161I). In contrast to mNTCP R158G, no single mutant showed preS1 binding (Fig. 4E, F), supporting the G158 requirement for NTCP receptor function. In addition, only mNTCP N86K/R158G significantly potentiated preS1-cell binding and HBV infection, compared with mNTCP R158G (Fig. 4E–G).

Fig. 4: Lys at position 86 enhances NTCP ability for supporting preS1 binding and HBV infection.

A Mutated sites of mNTCP 4 mut (I29V/Q84R/N86K/L161I) and 5 mut (I29V/Q84R/N86K/R158G/L161I) are shown in green. B–D PreS1 binding (B, C) and HBV infection (D) were examined using HepG2 cells overexpressing the indicated mNTCPs, as well as hNTCP WT. Green and red signals indicate NTCP and preS1-TAMRA, respectively. Scale bar: 100 μm. ns indicates not significant. E–G The preS1 binding capacity of mNTCP WT or its indicated variants (R158G, I29V, Q84R, N86K, and L161I, as well as double mutants I29/R158G, Q84/R158G, N86/R158G, and L161/R158G) and hNTCP WT (E, F) and HBV infection (G) was examined with HepG2 cells overexpressing the indicated NTCPs. Scale bar: 100 μm. Bars show the means of three independent experiments (n = 3) and error bars represent SD. Statistical significance of p values is indicated as follows: ****p F), **p = 0.0023 in (G), and ns indicates not significant. Source data are provided in a Source Data file. hNTCP, human NTCP; mNTCP, macaque NTCP; WT, wild type; MyrB, Myrcludex B; SD, standard deviation.

To support the significance of these amino acids in the NTCP receptor activity, we mutated amino acids at the same positions in the hNTCP backbone with their macaque counterparts. Although the hNTCP G158R mutant exhibited complete loss of activity, the quadruple macaque-typed mutant of the hNTCP backbone (hNTCP 4 mut [V29I/R84Q/K86N/I161L], Supplementary Fig. 5B) retained but significantly reduced preS1 binding and HBV infection ability (70% and 40%, respectively), compared with the hNTCP WT (Supplementary Fig. 5C, D, and G). Among the four sites, a single macaque-typed substitution at position 86 (hNTCP K86N) showed the strongest and most significant reduction in supporting preS1 binding and HBV infection (Supplementary Fig. 5E–G). The hNTCP-based mutation analyses further supported the mNTCP-based humanization results, both of which suggest the critical role of positions 86 and 158 in viral receptor function.

MD simulations of the preS1–NTCP binding mode

Next, we assessed how the N86K substitution in mNTCP potentiates viral receptor function. As shown in Fig. 2, static structures indicated a difference in the interaction with the C-terminus of region 2–48 aa of preS1. To understand the mode of binding and the impact of N86K, we analyzed the dynamic binding poses of preS1 with NTCP using all-atom MD simulations (Fig. 5 and Supplementary Fig. 6). An initial structure of the mNTCP N86K/R158G (cyan)-preS1 (pink) complex embedded in a lipid bilayer (gray) surrounded by water molecules (light blue) is shown in Supplementary Fig. 6A. Three independent 1-μs conventional MD simulations were performed for each of mNTCP (R158G), mNTCP (N86K/R158G), and hNTCP (3-μs × 3 systems), starting from the mNTCP-preS1 and hNTCP-preS1 models, to show Cα-RMSD over time (Supplementary Fig. 6B–D). We plotted the probability distribution of the RMSD over the simulation time (Supplementary Fig. 6E). MD simulations with hNTCP showed a small deviation within ~3 Å Cα-RMSD (Supplementary Fig. 6B and 6E, green), indicating that the initial binding pose was well maintained. By contrast, the preS1 binding pose with mNTCP R158G deviated considerably from the initial pose, by ~5 Å and up to 38 Å in Cα-RMSD (Supplementary Fig. 6C and 6E, cyan), suggesting that the binding between mNTCP R158G and preS1 was less stable. These findings explain the results showing lower preS1 binding to mNTCP R158G than to hNTCP (Fig. 3A, B). However, the preS1 binding pose with mNTCP N86K/R158G deviated only moderately, by ~5 Å and up to 14 Å in Cα-RMSD (Supplementary Fig. 6D and 6E, purple). The simulation results suggest that the N86K substitution increased the stability of preS1-NTCP binding.

Fig. 5: MD simulations showing the stabilization of the preS1 binding pose by the N86K substitution in NTCP.

A Impairment of preS1 fluctuation by the N86K substitution in mNTCP R158G. Root-mean-square-fluctuation (RMSF) of preS1 relative to NTCP is shown for each amino acid residue in preS1 (2–48 aa) during MD simulations. Average RMSF values for mNTCP R158G, mNTCP N86K/R158G, and hNTCP in three independent simulations are shown in cyan, purple, and green, respectively. B Distribution of the preS1 binding pose on the principal component (PC)1–PC2 surface. The binding poses sampled by three MD simulations for hNTCP, mNTCP R158G, and mNTCP N86K/R158G are shown in green, cyan, and purple, respectively. The preS1 binding pose of the hNTCP-preS1 complex solved by cryo-EM is shown as a black square. C PC1 and PC2 vectors for the analysis in (B) are shown colored in orange and blue, respectively. D Histograms of MM/PBSA binding energy during MD simulations for hNTCP (green), mNTCP R158G (cyan), and mNTCP N86K/R158G (purple). E Augmented contact frequency of the preS1 44–48 aa residues to NTCP, by N86K substitution in mNTCP R158G. Frequencies of preS1-NTCP contact over time during the three independent MD simulations are shown for each preS1 amino acid position.

MD simulations showed large preS1 fluctuations, in particular at the C-terminus of the preS1 2–48 aa region (Fig. 5A). In a principal component analysis showing the variety of binding poses in MD simulations (Fig. 5B, C), the binding poses for hNTCP appeared near the cryo-EM structure, whereas those for mNTCP R158G were much more divergent (Fig. 5B, green vs. cyan). However, the binding poses of mNTCP N86K/R158G showed far less variation and frequently appeared near the cryo-EM structure (Fig. 5B, purple), although the variation was still larger than that for hNTCP. We further calculated the MM/PBSA binding energy, which was lower for hNTCP (Lys86) than for mNTCP R158G (Asn86) (Supplementary Fig. 6F). A decomposition analysis for Run1 showed a high frequency of lower binding energy for hNTCP and mNTCP N86K/R158G (Lys86) than for mNTCP R158G (Asn86) (Fig. 5D), and its van der Waals energy, a main component of the hydrophobic energy, also showed lower binding energy with Lys86 than with Asn86 (Supplementary Fig. 6G). We replicated the above analyses by performing five additional simulations and obtained the same trend, supporting the robustness and reliability of the analysis (Supplementary Fig. 6H–M). All these analyses suggest that Lys86 supported a more stable preS1 interaction than Asn86.
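The per-residue fluctuation referred to above (RMSF) can be illustrated with a minimal sketch. It assumes that every trajectory frame has already been aligned on NTCP, so that the remaining motion reflects preS1 relative to the receptor, and that one representative atom per preS1 residue is tracked. The function name and the toy trajectory are hypothetical.

```python
import math

def rmsf_per_residue(trajectory):
    """RMSF (Å) per residue.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per residue, all frames pre-aligned on the same reference.
    """
    n_frames = len(trajectory)
    n_res = len(trajectory[0])
    result = []
    for i in range(n_res):
        # Time-averaged position of residue i
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that average position
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result

# Toy two-frame trajectory (hypothetical): residue 0 is static, residue 1 moves
toy_trajectory = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
]
print(rmsf_per_residue(toy_trajectory))
```

A higher value for a residue indicates that it wanders further from its average position, which is how the C-terminal residues of preS1 would stand out in a plot such as Fig. 5A.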

We further analyzed the pattern of interactions by quantifying the frequency of contacts during three independent MD runs (Supplementary Fig. 6N–P). The stabilization of preS1-mNTCP R158G binding by introducing N86K is attributable to increased interactions between K86 of NTCP and each amino acid residue at 44–48 aa of preS1 (Fig. 5E). Although the preS1 contact was maintained in almost all trajectories for hNTCP (Supplementary Fig. 6N), the contact formation time with preS1 was much shorter for mNTCP R158G and was relatively increased for mNTCP N86K/R158G (Supplementary Fig. 6O, P). As examples of the binding pose, final snapshots of the three MD runs for each NTCP are shown in Supplementary Fig. 6Q–S. mNTCP N86K/R158G and hNTCP, but not mNTCP R158G, formed stable hydrophobic interactions with N45preS1 or K46preS1 through K86 (Supplementary Fig. 6T). In hNTCP, N45preS1 tended to lie closer to K86 than K46preS1 did, through hydrophobic interactions (Supplementary Fig. 6Q and 6T). In mNTCP N86K/R158G, K46preS1 tended to lie closer to K86 than N45preS1 did, through hydrophobic interactions with K86 (Supplementary Fig. 6S and 6T). By contrast, N86 in mNTCP R158G formed stable interactions with N45preS1 or K46preS1 less frequently (Supplementary Fig. 6R and 6T).
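The contact frequency used above can be illustrated as the fraction of trajectory frames in which a residue pair lies within a distance cutoff. The sketch below is an assumption-laden illustration: the cutoff used for the MD contact analysis is not stated in the text, so the 4.5 Å default, the function name, and the distance series are hypothetical.

```python
def contact_frequency(distances, cutoff=4.5):
    """Fraction of frames whose pair distance (Å) is at or below the cutoff.

    distances: one minimum residue-residue distance per trajectory frame.
    cutoff: hypothetical contact threshold in Å (not taken from the study).
    """
    if not distances:
        raise ValueError("need at least one frame")
    in_contact = sum(1 for d in distances if d <= cutoff)
    return in_contact / len(distances)

# Hypothetical distance series between NTCP residue 86 and one preS1 residue
toy_distances = [3.9, 4.2, 6.0, 4.4, 8.1]
print(contact_frequency(toy_distances))
```

Computing this value for each preS1 position would give a per-residue profile comparable in spirit to Fig. 5E.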

Overall, our analyses suggest that the amino acid at position 86 in NTCP forms and maintains hydrophobic interactions with residues 44–48 of preS1, and Lys was likely to be more favorable than Asn to suppress fluctuation and help to stabilize binding of preS1.

Critical role of the tail chain of bile acids in blocking preS1 binding

Based on the structure of the mNTCP-TCA complex solved in this study and that of the hNTCP–preS1 complex we reported recently16, a tripartite mNTCP-TCA-preS1 model was built by superposition (Fig. 6A, B). PreS1 overlapped with both TCA molecules in the hydrophobic tunnel: the side chains at position 17 of the two TCAs (shown by dotted circles in Fig. 6B) clashed sterically with P10–L11 and G12–F13 of preS1, respectively. With this structural insight, we hypothesized that the length of the side chain at position 17 (bile acid tail) is important for inhibiting preS1 binding. Therefore, we employed a series of bile acids or derivatives possessing different lengths of the bile acid tail (from null to C6N1S1) to quantify inhibition of preS1 binding (Fig. 6C, D). The chemical structures of these compounds are shown in Fig. 6C. These include a series of bile acids and estrone-3-sulfate, known as NTCP substrates26. To validate whether these compounds could target NTCP, we examined competitive inhibition of NTCP-mediated [3H]-TCA uptake in a transporter assay and showed that all compounds significantly reduced [3H]-TCA uptake (Supplementary Fig. 7A), confirming that they target NTCP. However, the inhibitory activity against preS1 binding varied between compounds. Compounds with longer tails had higher potencies to reduce preS1 binding (Fig. 6D) without cytotoxicity (Supplementary Fig. 7B). By contrast, estrone-3-sulfate, an NTCP substrate with no tail26, did not show remarkable activity to inhibit preS1 binding (Fig. 6D). These data suggest that a long bile acid tail is associated with high anti-HBV activity of bile acids.

Fig. 6: Higher anti-HBV activity of bile acids with long conjugated-chains at position 17.

A, B Tripartite superposed image of mNTCP (cyan), TCA (yellow), and preS1 (pink). B indicates a steric clash of the side chain at position 17 of 2 TCAs with the counterpart preS1 (magenta). C Chemical structures of bile acids or their derivatives. D Activity of the compounds to inhibit preS1 binding was examined at 10, 20, 40, and 80 μM in the preS1 binding assay using HepG2-hNTCP-C4 cells. The values for 50% inhibitory concentration (IC50) are indicated (μM). Bars indicate the means of three independent experiments (n = 3) and error bars represent SD. The dashed line indicates the 50% of the levels for the DMSO-treated control group. Source data are provided in a Source Data file. MyrB Myrcludex B, SD standard deviation, CA cholic acid, CDCA chenodeoxycholic acid, UDCA ursodeoxycholic acid, DCA deoxycholic acid, LCA lithocholic acid, GDCA glycodeoxycholic acid, GUDCA glycoursodeoxycholic acid, GCDCA glycochenodeoxycholic acid, TDCA taurodeoxycholic acid, TUDCA tauroursodeoxycholic acid.
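One simple way to estimate an IC50 from a four-point dose series such as the one described for panel D is log-linear interpolation between the two concentrations that bracket 50% of the control signal. The sketch below is an illustration only: the fitting procedure used in the study is not described in this section, and the function name and response values are hypothetical. Only the concentrations (10, 20, 40, and 80 μM) come from the caption.

```python
import math

def ic50_log_interp(concs, responses):
    """Interpolated IC50 (same unit as concs), or None if 50% is never crossed.

    concs: ascending concentrations.
    responses: signal as % of the vehicle-treated control at each concentration.
    """
    pairs = list(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(pairs, pairs[1:]):
        if r_lo >= 50.0 >= r_hi and r_lo != r_hi:
            # Fractional position of the 50% crossing between the two points
            frac = (r_lo - 50.0) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None

concs_um = [10, 20, 40, 80]            # concentrations from the caption (μM)
toy_binding = [90.0, 70.0, 30.0, 10.0]  # hypothetical % of control
print(ic50_log_interp(concs_um, toy_binding))
```

When the response never falls to 50% within the tested range, the function returns None, which corresponds to reporting the IC50 as greater than the highest concentration tested.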