Engineering a precise adenine base editor with minimal bystander editing

An adenine base editor with low bystander effects

https://doi.org/10.1038/s41589-022-01163-8.

the introduction of point mutations in wild-type TadA and eTadA or using only an engineered eTadA, several versions of ABEs, such as ABEmax-F148A7 (an F148A mutation introduced to both TadA and eTadA), ABEmax-AW8 (with TadA E59A and eTadA V106W mutations) and SECURE-ABEs9 (with eTadA K20A/R21A or V82G mutations) exhibited minimized off-target edits. To improve the editing efficiency and targeting scope, two new groups of ABE variants, ABE8e10 and ABE8s11, have been developed through molecular evolution of the eTadA monomer. ABE8e is the most efficient and compatible ABE variant, with activity 3- to 11-fold higher than that of ABE7.10, but it also has an expanded editing window10. ABE8e and ABE8s also showed high editing efficiencies in the livers of mice and non-human primates12 or in hemopoietic stem cells from patients with sickle cell anemia13, demonstrating their potential for gene therapeutics. However, with the increase of deamination activity, ABE8e exhibits significant Cas9-independent DNA and RNA off-target editing10,14,15.

Although ABE8 variants are highly efficient, the editing window is also expanded, with significant editing rates on the bystander adenines10,11,16. Moreover, several studies have shown that ABE7.10 exhibits cytosine deamination activity, which enables C-to-T/G/A conversions with a preference for the TCN motif, demonstrating that ABEs also induce undesired bystander cytosine mutations in cell lines and animal embryos17–19. It is critical to eliminate both adenine and cytosine bystander effects and Cas9-independent off-target editing of ABEs, especially for clinical applications. In this study, through structure-based engineering, we generated ABE9, which accurately catalyzed A-to-G conversions within a 1–2-nucleotide editing window without inducing C-to-T conversions in cells and rodent embryos. We also demonstrated that it precisely corrected pathogenic single-nucleotide variants (SNVs), especially in homopolymeric adenosine sites, with very low rates of Cas9-independent RNA and DNA off-target effects.

Results

Structure-based molecular evolution of TadA-8e

ABE8e, whose deaminase component is a multiple-turnover enzyme with high processivity20, edits more positions than previously reported ABEs10. We also confirmed that adenines in positions 3–12 were efficiently edited by ABE8e, suggesting a much wider editing window than ABEmax (Extended Data Fig. 1a). ABE8e also exhibited elevated cytosine bystander editing effects and increased Cas9-independent DNA off-target editing, as measured by a more sensitive orthogonal R-loop assay, which uses SaCas9 nickase instead of dSaCas910,21,22 (Extended Data Fig. 1b,c). These elevated rates of undesired ABE8e editing effects encouraged us to further optimize it for more accurate editing.

To increase its accuracy, we intended to evolve TadA-8e on the basis of its DNA-bound cryo-electron microscopy structure20 (Protein Data Bank accession: 6VPC). The structure suggests that three nucleotides of the substrate, including the editing base (Fig. 1a) and the bases before and after it, are important for recognition by the deaminase. We hypothesized that mutating the residues that interact with either the bases or the backbone of the substrate would change the environment of the binding pocket as well as its accessibility to the substrate. This might eventually reduce non-specific binding and narrow the editing window. Moreover, given the clearly different size and electrophilicity of the purine ring (A) compared to the pyrimidine ring (C), these mutations would change the substrate selectivity of the TadA-8e deaminase. The candidate residues included the E27–V28–P29 loop and F148, which insert into a valley formed by the ‘0’ and ‘+1’ bases; F84, N108, L145 and Y149, which insert into the other valley formed by the ‘0’ and ‘−1’ bases; and P86/H57, which are adjacent to the editing base (Fig. 1a).
Thus, ten residues were individually mutated to remove the large side chain (for example, F84T and F148A), to add a bulky residue (for example, V28F and P29W) or to switch between non-polar and polar residues (for example, L145T, V28N and N108V). Some highly conserved positions adjacent to the pocket (for example, E27–P29, L145 and F148) were also included to maximize the possibility of developing a precise editing tool.

Following the above principles, 21 point mutations were constructed in TadA-8e and their activity was determined on three target sites. Deep-sequencing data from the first two targets, which contain multiple adenines, showed that the majority of the mutations narrowed the editing window with a comparable or slightly decreased A-to-G efficiency compared to ABE8e, while the H57D, H57Q, N108T and N108V mutations dramatically reduced the activity. In contrast to ABEmax7, the introduction of an F148A mutation in TadA-8e did not narrow the editing window (Fig. 1b). On the third target site, previously used for evaluation of cytosine bystander mutations14,17, ABEmax and ABE8e induced substantial cytosine mutations (8.83% and 45.20% on average, respectively), while the V28F, V28N, N108Q, L145C, L145T and L145Q mutations exhibited high A-to-G activity on A4 with greatly reduced cytosine conversions (ranging from 2.43% to 11.47%) (Fig. 1c). To evaluate the reduction of bystander editing and undesired cytosine conversion, the editing efficiency ratios A3/A4 and C6/A4 were calculated. The ABE8e-N108Q construct was selected for further investigation, since it showed high A4 editing efficiency (80.5%), a relatively low A3/A4 ratio and the lowest C6/A4 ratio (Fig. 1c).
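The A3/A4 and C6/A4 ratios used for variant selection can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `bystander_ratios` and the numbers in `example` are hypothetical, and the only thing taken from the text is the definition of each ratio as undesired-position efficiency divided by the desired A4 efficiency.

```python
def bystander_ratios(efficiencies, target="A4", bystanders=("A3", "C6")):
    """Return bystander/target editing ratios for one base editor variant.

    `efficiencies` maps a protospacer position label (e.g. "A4") to its
    editing efficiency in percent. Lower ratios mean cleaner editing.
    """
    target_eff = efficiencies[target]
    if target_eff <= 0:
        raise ValueError("target position shows no editing; ratio undefined")
    return {f"{b}/{target}": efficiencies[b] / target_eff for b in bystanders}


# Hypothetical efficiencies for illustration only (not values from the paper).
example = {"A3": 20.0, "A4": 80.0, "C6": 4.0}
ratios = bystander_ratios(example)
print(ratios)
```

A variant with a high A4 efficiency and small values for both ratios is the kind of candidate the selection criterion favors.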

ABE8e-N108Q reduces bystander adenine and cytosine editing

To further profile the performance of ABE8e-N108Q, 21 endogenous targets were tested in HEK293T cells by high-throughput sequencing (HTS). The first batch of 12 target sites contained multiple adenosines and the other 9 targets contained mixed adenines and cytosines in the editing window. ABE8e was highly efficient (>50%) between positions A3 and A8, and considerable editing was also observed at very lateral positions such as A2 or A13, but ABE8e-N108Q mainly edited A4–A7 with almost no editing from A9 to the protospacer adjacent motif (PAM)-proximal positions (Fig. 2a and Extended Data Fig. 2a). In the remaining nine targets, we found that in addition to the TCN motif, ABE8e also edited cytosines in CCN, GCN and ACN motifs (Fig. 2b). ABE8e induced cytosine base conversions of up to 39.23% (SSH2-sg10), with the highest efficiency on C6 at an average rate of 18.02% (Fig. 2b,c and Extended Data Fig. 2b). By contrast, a significant decrease of cytosine conversions was observed in ABE8e-N108Q-treated cells, with an average editing efficiency of 5.79% on C6, although its cytosine deaminase activity was not fully eliminated (Fig. 2c). On the basis of all 21 target results, ABE8e-N108Q exhibited an A-to-G efficiency comparable to that of ABE8e at the most efficiently edited positions (82.1% versus 82.74% on A5 and 83.62% versus 83.13% on A6), but the major editing window was reduced to A4–A7 (Fig. 2d). Similar to ABE8e, ABE8e-N108Q minimally induced indels on the selected target sites (Fig. 2e). Together, these data show that ABE8e-N108Q is highly efficient with significantly reduced adenine and cytosine bystander mutation effects.

Single editing by further evolution of ABE8e-N108Q

Although ABE8e-N108Q exhibits a smaller editing window and fewer cytosine edits, we pursued a more accurate ABE featuring a single-nucleotide window and complete elimination of cytosine editing activity. We assumed that introducing more mutations in ABE8e-N108Q would further reduce bystander editing. As shown in Fig. 3a, the combination of N108Q with an additional single mutation on residues E27, P29, F84 or L145 exhibited a very stringent editing window, in some cases down to a single adenine at the A5 position. Although all three ABE8e-N108Q/L145 variants exhibited superb performance, we noticed that ABE8e-N108Q/L145T showed the most condensed editing window and high activity at these two sites (Source Data 3a), and we named it ABE9 as additional mutations were introduced into ABE8e. Since some of the mutations on the N108 or L145 residue improved the performance of ABE8e (Fig. 1c), we next performed individual saturation mutagenesis on these two residues to investigate whether other amino acid substitutions outperform ABE9. HTS data showed that ABE9 displayed the highest efficiency, the lowest cytosine bystander editing and a very narrow editing window compared with the other 38 ABE variants (Extended Data Fig. 3a).

Fig. 1 | Structure-based molecular evolution of TadA-8e. a, The schematic diagram of the interplays of TadA-8e (pink) with the single-stranded DNA substrate (light blue sticks) (Protein Data Bank accession: 6VPC). Complementary strand DNA is in orange, non-complementary strand DNA is in light blue, Cas9n is in gray, and sgRNA is in cyan. Amino acids reacting with the substrate DNA are labeled on the enlarged image. The editing base is labeled ‘0’ and the bases before and after it are labeled ‘−1’ and ‘+1’, respectively. b, The A-to-G base editing efficiency of ABE8e or ABE8e variants at two endogenous genomic loci containing multiple adenosines (ABE site 16 and ABE site 17) in HEK293T cells. The heat map represents an average editing percentage derived from three independent experiments with editing efficiency determined by HTS. c, Base editing efficiency of ABE8e or ABE8e variants at an endogenous genomic locus (FANCF site 1) for both adenine and cytosine editing in HEK293T cells. A3/A4 means the ratio of undesired A3 editing to desired A4 editing, and C6/A4 means the ratio of undesired C6 editing to desired A4 editing. Data are mean ± s.d. of n = 3 independent experiments. Statistical source data are available (Source Data Fig. 1).

After evaluation at 12 endogenous sites, we found that ABE9 showed much higher activity than ABE7.10 and slightly compromised activity compared to ABE8e, but it significantly reduced adenine bystander edits and narrowed the editing window to 1–2 nucleotides (Fig. 3b). Importantly, its activity at the adjacent A4 or A7 position was dramatically reduced or even eliminated at 10 of these 12 targets, and single adenine editing was observed at half of the tested sites (Fig. 3b). By dividing the editing rate of the most efficient position by that of the second-highest position, we further confirmed that ABE9 was the most accurate variant and showed up to 8-fold (4.3-fold on average) discrimination of the two most efficient adenine positions compared to ABE8e-N108Q (Fig. 3c). When the editing rate of the highest position was divided by the cumulative efficiency over all edited positions, similar results were obtained, suggesting that ABE9 was the most precise of the tested ABE variants (Extended Data Fig. 3b). Collectively, these results show that we developed a more accurate ABE variant, ABE9, which showed a stringent and steep editing window of 1–2 nucleotides at A5 or A6 (Fig. 3d and Extended Data Fig. 3c). As expected, the indel rates of ABE9 were comparable to or even slightly lower than those of ABE8e and ABE8e-N108Q (Extended Data Fig. 3d).
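The two precision scores described above (highest position divided by the second-highest, and highest position divided by the cumulative efficiency) can be written out as follows. This is a sketch under stated assumptions: the function name `precision_metrics` and the example profiles are hypothetical, and the additional normalization to ABE8e used in Fig. 3c is not reproduced here.

```python
def precision_metrics(a_to_g):
    """Compute two precision scores from per-position A-to-G efficiencies (%).

    Returns (top_vs_second, top_fraction):
      top_vs_second: highest efficiency divided by the second highest.
      top_fraction:  highest efficiency divided by the sum over all positions.
    """
    values = sorted(a_to_g.values(), reverse=True)
    if len(values) < 2 or values[0] <= 0:
        raise ValueError("need at least two positions and some editing")
    top, second = values[0], values[1]
    # A second position with zero editing means perfect single-base editing.
    top_vs_second = float("inf") if second == 0 else top / second
    top_fraction = top / sum(values)
    return top_vs_second, top_fraction


# Hypothetical profiles for illustration only (not values from the paper).
broad = {"A4": 40.0, "A5": 80.0, "A6": 60.0, "A7": 20.0}
narrow = {"A4": 2.0, "A5": 60.0, "A6": 6.0, "A7": 0.0}
broad_scores = precision_metrics(broad)
narrow_scores = precision_metrics(narrow)
print(broad_scores, narrow_scores)
```

An editor with a steep window scores higher on both measures, even if its peak efficiency is somewhat lower than that of a broad-window editor.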

Fig. 2 | Characteristics of ABE8e-N108Q in HEK293T cells. a, The editing efficiency of ABE8e or ABE8e-N108Q was examined at 12 endogenous genomic loci containing multiple As in HEK293T cells. The heat map represents the average editing percentage derived from three independent experiments. b, The editing efficiency of ABE8e or ABE8e-N108Q was examined at nine endogenous genomic loci containing an NCN motif in HEK293T cells. Data are mean ± s.d. of n = 3 independent experiments. c, Average C-to-T/G/A editing efficiency of ABE8e or ABE8e-N108Q at the nine target sites in b. d, Average A-to-G editing efficiency of ABE8e or ABE8e-N108Q at the 21 target sites in a,b. e, Frequency of indel formation by ABE8e or ABE8e-N108Q at the 21 target sites in a,b. Each data point represents the average indel frequency at each target site calculated from three independent experiments. Error bar and P value are derived from these 21 data points. Data are mean ± s.d. P value was determined by a two-tailed Student’s t-test. c,d, Data represent averages from three independent experiments. Statistical source data are available (Source Data Fig. 2).

Next, 11 target sites were employed to evaluate the cytosine bystander mutation rate. As shown in Fig. 3e, ABE9 did not edit Cs in 10 of these 11 targets, whereas ABE8e and ABE8e-N108Q induced considerable edits on all targets tested (an efficiency lower than 1% was considered as no editing). The highest cytosine conversion activity of ABE9 detected was 1.5% on C7 of the TIM3-sg4 target, whereas ABE8e and ABE8e-N108Q catalyzed 29.6% and 3.9% conversions on C5, respectively. According to statistical analysis of the cytosine editing rate of the most efficient position, ABE9 strikingly decreased the cytosine bystander mutation rate by 13.2- to 147.5-fold (56.2-fold on average) and 2.6- to 40.8-fold (10.2-fold on average) in comparison to ABE8e and ABE8e-N108Q, respectively (Fig. 3f). Moreover, we also found that ABE9 was very efficient in HeLa cells and displayed a condensed editing window compared with ABE8e-N108Q, suggesting that ABE9 is suitable for various cell lines (Extended Data Fig. 4).
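The fold-reduction statistic and the 1% no-editing cutoff can be illustrated with the TIM3-sg4 values quoted above. This is a minimal sketch that assumes the quoted percentages are each editor's highest cytosine position at that site; the helper names `is_edited` and `fold_reduction` are hypothetical and not from the paper.

```python
NO_EDIT_THRESHOLD = 1.0  # percent; the text treats efficiencies below 1% as no editing


def is_edited(efficiency_pct, threshold=NO_EDIT_THRESHOLD):
    """True if a cytosine counts as edited under the 1% cutoff."""
    return efficiency_pct >= threshold


def fold_reduction(reference_pct, variant_pct):
    """How many times lower the variant's efficiency is than the reference's."""
    if variant_pct <= 0:
        raise ValueError("variant efficiency must be positive to form a ratio")
    return reference_pct / variant_pct


# Highest cytosine conversion at TIM3-sg4 as quoted in the text (percent).
tim3 = {"ABE8e": 29.6, "ABE8e-N108Q": 3.9, "ABE9": 1.5}
fold_vs_abe8e = fold_reduction(tim3["ABE8e"], tim3["ABE9"])
fold_vs_n108q = fold_reduction(tim3["ABE8e-N108Q"], tim3["ABE9"])
print(round(fold_vs_abe8e, 1), round(fold_vs_n108q, 1))
```

Under that assumption, the ratio against ABE8e-N108Q at this site comes out at about 2.6, which matches the lower end of the reported 2.6- to 40.8-fold range, as expected for the one target where ABE9 retained detectable cytosine editing.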

Off-target analysis of ABEs in mammalian cells

To evaluate Cas9-dependent off-target activity, 44 potential off-target sites from 5 short guide RNA (sgRNA) targets were analyzed, including 17 known off-target sites identified by GUIDE-seq or ChIP-seq3,23 and 27 in silico-predicted off-target sites by Cas-OFFinder24. We found that ABE8e induced mild off-target editing (1.04–12.29%, 4.15% on average) at 11 sites on HEK site 2, HEK site 3 and PD-1-sg4 loci, while ABE9 only edited two sites with comparable on-target activity and background level of indels (Fig. 4a and Extended Data Fig. 5a,b). The Cas9-independent DNA and RNA off-target editing caused by the deaminase were more unpredictable and intractable. Through an enhanced orthogonal R-loop assay22,23, Cas9-independent DNA and RNA off-target effects of ABE9 with infinitesimal indels were greatly reduced compared to ABE8e (Fig. 4b,c and Extended Data Fig. 6a,b). Amazingly, the off-target activity of ABE9 was lowered to near-background levels (mean <0.3%) (Fig. 4b), indicating that it eliminated unpredictable DNA off-target activity. Through whole-genome mRNA profiling analysis, we found that RNA off-target effects of ABE9 were reduced to background level and displayed 726.1- and 117.1-fold reduction compared to ABE8e and ABE8e-N108Q, respectively (Fig. 4c). These results demonstrate that ABE9 is highly specific with infinitesimal rates of unpredictable DNA and RNA off-target activity.

Highly accurate editing by ABE9 in rodent embryos

Accurate base conversion is critical for modeling pathogenic SNVs, but ABEs or CBEs usually induce severe bystander mutations at the target sites in cells and embryos25–27. To test whether ABE9 could generate precise single-nucleotide conversion in embryos, ABE8e or ABE9 mRNA was co-injected with sgRNA targeting the splicing acceptor site of Tyrosinase gene intron 3 into mouse zygotes to model albinism. Once the splicing site was destroyed (A5 position), exon skipping might occur to disrupt tyrosinase coding and lead to an albino phenotype (Fig. 5a). After deep sequencing of genomic DNA from F0 pups, all of the mice injected with ABE8e or ABE9 contained A5 editing (Fig. 5b and Extended Data Fig. 7a), and almost no indels (<0.2% on average) were observed in embryos injected with ABEs (Extended Data Fig. 7b). Notably, ABE9 selectively edited A5 in 88% (14 out of 16) of the pups and the other two pups bore very low (8.13% and 9.75%) simultaneous A8 conversions, but only 5% (1 out of 19) of the pups generated by ABE8e injection bore the desired A5 transition alone (Extended Data Fig. 7a,c). After analysis of total NGS reads from all F0 pups in the same group, ABE9 generated the desired A5 transition in 54.32% of the reads, but only 5.1% of the reads induced by ABE8e carried the desired mutation (Fig. 5c). The albino phenotype in the eyes and fur color of the founders suggested that tyrosinase activity was disrupted by ABE9-induced A5 conversion (Fig. 5d).

We further inspected the efficiency and accuracy of ABE9 in rat embryos by targeting a site with three adenines in the canonical A4–A8 editing window (Fig. 5e). Our previous data showed that only the A6-to-G conversion, which causes the D645G mutation in the Gaa gene identified in patients with early-onset Pompe disease (glycogen storage disease type II), led to an obvious phenotype in rats27. Reanalysis of our published data showed that ABE7.10 induced the desired D645G mutation in only 6 of 28 (21%) pups, with the efficiency ranging from 6.04% to 27.94% (Extended Data Fig. 7d). By contrast, ABE9 induced the desired A6 substitution in all 8 (100%) pups, with the efficiency ranging from 36.08% to 62.41% (Fig. 5f,g and Extended Data Fig. 7e). Consistent with the data obtained in mice, ABE9 induced very limited indels, similar to ABE7.10 in rats (Extended Data Fig. 7f). Across the HTS results of all 28 F0 rats treated with ABE7.10, the proportion of desired reads was only about 2.76% of all cumulative HTS reads, while ABE9 induced an 18.0-fold increase (49.59%) compared to that of ABE7.10-treated rats (Extended Data Fig. 7g), suggesting that ABE9 was more efficient and accurate than ABE7.10. These data demonstrate that ABE9 is very efficient at generating highly accurate base installation in mouse and rat embryos.
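The pup fractions and the fold increase reported for the embryo experiments follow directly from the counts and read percentages given in the text. The short check below only re-derives those figures; the variable names are illustrative.

```python
def percent(part, whole):
    """Share of `part` in `whole`, expressed in percent."""
    return 100.0 * part / whole


# Counts quoted in the text.
abe9_mouse_precise = percent(14, 16)   # pups with A5-only editing, ABE9
abe8e_mouse_precise = percent(1, 19)   # pups with A5-only editing, ABE8e
abe710_rat_desired = percent(6, 28)    # pups with D645G, ABE7.10

# Pooled desired-read percentages quoted in the text for rats.
abe9_reads, abe710_reads = 49.59, 2.76
fold_increase = abe9_reads / abe710_reads

print(f"{abe9_mouse_precise:.1f}% {abe8e_mouse_precise:.1f}% "
      f"{abe710_rat_desired:.1f}% {fold_increase:.1f}-fold")
```

The unrounded values (87.5%, about 5.3% and about 21.4%) round to the 88%, 5% and 21% stated above, and 49.59/2.76 rounds to the reported 18.0-fold.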

Precise correction of pathogenic mutations by ABE9

ABE generates A-to-G conversions and potentially corrects approximately half of known pathogenic SNVs in the ClinVar database, irrespective of bystander mutations28. To investigate the therapeutic potential of ABE9 for treating genetic diseases, 4 pathogenic SNPs with at least 4 consecutive adenines within positions 4–8 were tested, including missense mutations in COL1A2 gene (causing autosomal-dominant osteogenesis imperfecta)29,30, CARD14 gene (causing psoriasis)31, BVES gene (causing muscular dystrophy)32 and KCNA5 gene (causing common cardiac rhythm disorder)33. ABEs were transfected into four stable cell lines containing the pathogenic variants described above. For the COL1A2 locus, ABE8e or ABE8e-N108Q did not generate considerable conversions selectively on A5, while ABE9 induced 34.25% desired single A-to-G conversion which was 342.5- and 21-fold higher than ABE8e and ABE8e-N108Q, respectively. Similarly, for the other three loci, ABE8e and ABE8e-N108Q only generated desired edits with frequencies of up to 2.06% and average 5.3% (0.3–11.6%), respectively, while ABE9 generated precisely corrected alleles in all four targets with an efficiency ranging from 15.53–37.22% (mean 30.19%), suggesting it was very accurate to generate single nucleotide transition (Fig. 6a and Extended Data Fig. 8a–d). These data demonstrate that ABE9 is a precise and efficient editor with the ability to correct genetic variants even in promiscuous homopolymeric sites.

Target library analysis of ABE9

To characterize the performance of ABE9 in an unbiased manner, we adapted the guide RNA–target pair strategy34,35 and synthesized a library of 9,120 oligonucleotides with all possible 6-mers containing at least an adenine and a cytosine across positions 4 to 9 of a protospacer (Methods). The oligonucleotide library was stably integrated into the genome of HEK293T cells via Tol2 transposon, followed by stable transfection of a given base editor (Fig. 6b). We maintained an average 99% coverage of >300× per guide–target pair throughout the culturing process (Supplementary Table 4). Subsequently, the target region was amplified and sequenced at an average depth of 860 per target. The average editing efficiency of ABE8e, ABE8e-N108Q and ABE9 on position 5 was 31.9%, 28.7% and 25.3%, respectively, suggesting the experiment was successful (Extended Data Fig. 9). The editing efficiency of the highest position in each target was set to 100%, and the relative activity of the other positions was determined by comparison with the highest position. Analysis of the editing outcomes from the three base editors showed that ABE8e (9,059 sgRNAs evaluated) had a wide editing window ranging from positions 3–12 with a major window (>50%) from 4–9, while ABE8e-N108Q (9,071 sgRNAs evaluated) narrowed the window to positions 4–7 (Fig. 6c). As expected, ABE9 (8,954 sgRNAs evaluated) presented an extremely narrow editing window of 1–2 nucleotides with the highest efficiency on position 5. Profiling of the motif preferences of ABE9 showed that, similar to ABE8e, it was suitable for a wide range of accurate A-to-G editing without strict motif requirements, suggesting that its accuracy depends on the position within the protospacer but not on sequence context (Fig. 6d). These results, determined with thousands of sgRNAs, suggest that ABE9 is very accurate and preferentially edits adenines in position 5 of the protospacer.
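The relative-activity normalization described above (highest position in each target set to 100%, other positions scaled to it, major window defined as >50%) can be sketched as follows. This is an illustration under assumptions: how the per-target profiles were aggregated is taken here to be a simple mean, the function names are hypothetical, and the two example targets are invented numbers, not library data.

```python
def relative_activity(efficiencies):
    """Scale per-position efficiencies so the highest position equals 100."""
    peak = max(efficiencies.values())
    if peak <= 0:
        return {pos: 0.0 for pos in efficiencies}
    return {pos: 100.0 * eff / peak for pos, eff in efficiencies.items()}


def mean_relative_profile(targets):
    """Average the relative activity at each position over many targets."""
    totals, counts = {}, {}
    for eff in targets:
        for pos, rel in relative_activity(eff).items():
            totals[pos] = totals.get(pos, 0.0) + rel
            counts[pos] = counts.get(pos, 0) + 1
    return {pos: totals[pos] / counts[pos] for pos in sorted(totals)}


def major_window(profile, cutoff=50.0):
    """Positions whose mean relative activity exceeds the cutoff (percent)."""
    return [pos for pos, rel in sorted(profile.items()) if rel > cutoff]


# Hypothetical per-target A-to-G efficiencies (%) keyed by protospacer position.
targets = [
    {4: 10.0, 5: 40.0, 6: 30.0, 7: 4.0},
    {4: 6.0, 5: 30.0, 6: 12.0, 7: 3.0},
]
profile = mean_relative_profile(targets)
window = major_window(profile)
print(profile, window)
```

Normalizing within each target before averaging keeps highly active targets from dominating the profile, so the window reflects positional preference rather than absolute efficiency.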

Fig. 3 | Evolution and characterization of single A-to-G base editor. a, The A-to-G base editing efficiency of ABE8e-N108Q and its combination variants at 2 endogenous genomic loci containing multiple As (ABE site 10 and ABE site 3) in HEK293T cells. b, The A-to-G editing efficiency of ABE7.10, ABE8e, ABE8e-N108Q or ABE9 was examined at 12 endogenous genomic loci containing multiple As in HEK293T cells. c, The normalized precision (ABE8e is used for standardization) is defined as the highest or second-highest A-to-G base editing of ABE8e-N108Q or ABE9 at the 12 target sites in b. Data represent mean ± s.d. from three independent experiments. d, Average A-to-G editing efficiency of ABE7.10, ABE8e, ABE8e-N108Q or ABE9 at the 12 target sites in b. Data represent mean from three independent experiments. e, The C-to-T/G/A editing efficiency of ABE9 was examined at 11 endogenous genomic loci containing multiple Cs in HEK293T cells. f, The normalized ratio (ABE8e is used for standardization) of the highest C-to-T/G/A editing efficiency of ABE8e-N108Q or ABE9 at 11 target sites in e. The numbers aside bars display the fold changes of ABE9 in reducing cytosine conversions compared with ABE8e and ABE8e-N108Q. Data represent mean ± s.d. from three independent experiments. In a, b and e, the heat map represents average editing percentage derived from two or three independent experiments. Statistical source data are available (Source Data Fig. 3).

Fig. 4 | Off-target mutation assessment of ABE9. a, Cas9-dependent DNA on- and off-target analysis of the indicated targets (HEK site 2, HEK site 3 and PD-1-sg4) by ABE8e, ABE8e-N108Q and ABE9 in HEK293T cells. Data are mean ± s.d. of n = 2 independent experiments for HEK site 2-GUIDE-seq-OT1 and 2 treated with ABE8e-N108Q, and n = 3 independent experiments for the other biological samples. b, Cas9-independent DNA off-target analysis of ABE8e, ABE8e-N108Q and ABE9 using the modified orthogonal R-loop assay at each R-loop site with nSaCas9-sgRNA plasmid. Data are mean ± s.d. of n = 3 independent experiments. c, RNA off-target editing activity by ABE8e, ABE8e-N108Q and ABE9 using RNAseq. Jitter plots from RNA-seq experiments in HEK293T cells showing efficiencies of A-to-I conversions (y-axis) with ABE8e, ABE8e-N108Q and ABE9 or a GFP control. Each biological replicate (Rep.) and total numbers of modified bases are listed at the top. Statistical source data are available (Source Data Fig. 4).

Discussion

Highly efficient and precise correction of single-nucleotide pathogenic mutations is required for gene therapy to reach its potential. Using structure-based design and molecular evolution of TadA-8e, we have generated ABE9, which efficiently edits adenines in a 1–2-nucleotide window without cytosine editing activity. To minimize the editing window of base editors, structure-based molecular evolution has been successfully leveraged to obtain new editors, such as BE4max-YE1 and YEE variants, which catalyze conversions within a 1–2-nucleotide window36, eA3A-BE, which preferentially edits in a TCN motif37, and A3G-BEs, which selectively edit the second C in a CC motif38. Although ABEmax-F148A has been shown to reduce the editing window7, very limited effects were observed when the mutation was transferred to TadA-8e (Fig. 1b), indicating that the experience from TadA7.10 cannot be directly transferred to TadA-8e.

More complicated than CBEs, ABEs are capable of deaminating both adenines and cytosines in a similar editing window17. Since the editing window of C-to-T overlaps with that of A-to-G, it is impracticable to eliminate their cytosine deamination activity by reducing the editing window. While we were completing this project, Bae and colleagues reported that introduction of D108Q in ABEmax or N108Q in ABE8e could reduce their cytosine deaminase activity14, which is consistent with our current study, suggesting that residue 108 is critical for the discrimination of substrates such as adenines and cytosines. The previous study also showed that this residue is important for the recognition of single-stranded DNA substrates, as the D108N mutation was pivotal for the generation of eTadA, the unnatural DNA adenine deaminase3. Moreover, the combinational mutation in ABEmax (TadA-E59A + N108W/Q) displayed greatly reduced RNA editing and preferentially catalyzed adenine conversions at protospacer position 5, but the activity was compromised8.
This is consistent with our findings that ABE8e-N108Q exhibited a reduced editing window and reduced RNA off-target effects (Figs 3b,d and 4c).

As for the discrimination between cytosines and adenines, we speculated that the mutation of N108 to a residue with a larger side chain (Q) would push away the backbone of its substrate. This apparently affected the deamination of cytosines more than that of adenines, since the pyrimidine ring of cytosines needs to be shifted further toward the pocket for the catalytic reaction to happen. However, TadA-8e-N108Q still retained considerable cytosine deaminase activity (Fig. 3e,f) until the introduction of a second mutation, L145T, which nearly abolished cytosine conversions and further narrowed the adenine editing window to 1–2 nucleotides without apparently sacrificing on-target adenine conversion efficiency. We found that introduction of various mutations at L145 had effects on reducing the editing window and cytosine bystander mutation similar to those of N108Q, suggesting that L145 is a previously unnoticed residue that is also critical for substrate discrimination. This was further supported by saturation mutagenesis analysis of the L145 residue, as most of the substitutions exhibited compromised cytosine editing efficiency (Extended Data Fig. 3a). As the L145 residue is located relatively distal to the target base, the mutations may adjust the pocket indirectly by influencing the positions of nearby essential residues, such as P29, F84, N108 and Y149. In particular, the L145T + N108Q double mutant performed the best on adenine editing while removing bystander cytosine editing, suggesting that the combination of the two mutations within the pocket somehow favors adenine over cytosine; however, the detailed mechanism still awaits further structural study. F84 is also a critical residue identified in the initial generation of eTadA3.
It is located within the pocket right below the target base ring and, together with Y149 and V28, forms a triangular platform that holds the base ring of the substrate. Additionally, we found that V28 could be a critical position involved in the discrimination of cytosines and adenines: whereas V28F and V28N showed a significant decrease of cytosine conversions, V28G had the opposite effect (Fig. 1c). This suggests that it is possible to develop pure CBEs, C-to-G base editors or dual-base editors capable of simultaneous adenine and cytosine conversions through further engineering of TadA-8e.

Fig. 5 | Examination of precision in rodent embryos with ABE9. a, The splicing acceptor sequence in intron 3 of the mouse Tyr gene. The ‘ag’ sequence of the splice acceptor site is shown in black. The sgRNA target and PAM sequence are both shown in black (PAM in bold). Target A5 is in blue with bystander A8 in red. b, Genotyping of representative F0 generation pups from mouse embryos microinjected with ABE8e or ABE9. The guanines converted from editable As indicate desired editing in A5 (blue) or undesired in A8 (red). c, Single A5-to-G conversion ratio in F0 mice induced by ABE8e (n = 19) or ABE9 (n = 16). d, Phenotype of F0 generated by microinjection of sgRNA and ABEs. The photo on the left was taken when the mouse was 7 days old, while the right one was taken at 14 days old. WT, wild type. e, The target sequence of exon 13 (dark purple) in the rat Gaa gene. The sgRNA target sequence, where target A6 is in blue with bystander A4 and A8 in red, is shown in black (PAM in bold). The triplet codon of D645 is underlined. f, Genotyping of representative F0 generation pups from rat embryos microinjected with ABE9 (desired editing in blue or undesired in red). g, Desired D645 mutation ratios in F0 rats induced by ABE7.10 (n = 28) or ABE9 (n = 8). b,f, The percentage on the right represents the frequency determined by the ratio of indicated mutant alleles to total allele counts. c,g, Data are mean ± s.d. and P values (3.6 × 10^−6 in c, 8.7 × 10^−16 in g) were determined by a two-tailed Student’s t-test. Statistical source data are available (Source Data Fig. 5).

Fig. 6 | Correction of human pathogenic mutations in mammalian cells and target library analysis to unbiasedly characterize ABE9. a, Comparison of correcting pathogenic mutations induced by ABEs in four stable HEK293T cell lines, including COL1A2 c.1136G > A (n = 3), KCNA5 c.1828G> A (n = 3), BVES c.602C > T (n = 3), CARD14 c.424G > A (n = 2 for the tenth or ninth invalid edits induced by ABE8e-N108Q or ABE9, n = 3 for the other samples). Base editing efficiency was determined by HTS. Data are mean ± s.d. Desired A5-to-G percentiles of alleles (green bar) are exhibited, while percentiles of the top ten invalid allele types are presented and percentiles of invalid allele types less than 1% are omitted. The numbers above green bars display the fold changes of ABE9 in desired A5-to-G percentiles compared with ABE8e and ABE8e-N108Q. b, Schematic of target library analysis. c, Analysis of relative editing efficiency of ABE8e, ABE8e-N108Q and ABE9. The heat map represents editing efficiency computed relatively to the highest A-to-G base editing of the protospacer. Positions of the protospacer are shown at the bottom of each heat map, counting the PAM as positions 21–23. d, Motif visualization of ABE8e, ABE8e-N108Q and ABE9 in fifth-adenine-containing cassettes. Statistical source data are available (Source Data Fig. 6).

Developing a base editor with a refined editing window is challenging, especially for editing a specific base within promiscuous homopolymeric sites. Recently, a precise ABE-NG variant has been developed through engineering of TadA-8e39, but its major window is A4–A7, which is much wider than the 1–2-nucleotide editing window of ABE9. Moreover, questions remain about its bystander cytosine editing effects and whether its 4-nucleotide major window could be adapted to SpCas9. Using selected target sites in cells and rodent embryos, we determined that ABE9 was accurate with a very narrow editing window. More importantly, data from a guide RNA–target pair library containing over 9,000 targets showed that ABE9 can be considered an ABE focused on a 1–2-nucleotide editing window with the highest efficiency at A5 (Fig. 6c). To our knowledge, it is potentially the most accurate ABE to date. As SpRY requires almost no PAM sequence, ABE9-SpRY could ideally target any adenine precisely through an appropriate sgRNA, giving a broad targeting scope. Importantly, ABE9 induces almost no off-target effects (either Cas9-dependent or -independent) at both DNA and RNA levels, which is important not only for basic research but also for clinical applications.