Patching ESM imports with oxc
An ESM import is just a string in the source. A lot of build tooling does nothing fancier than rewrite that string to pin a version, route a bare package through a CDN, or swap in a shim. The rewrite is the easy part. The work is doing it safely. You parse the source into a syntax tree, change one node, and print the tree back to code.
The demo below runs that round trip twice on whatever you type. The same transform, rewriting bare import specifiers, runs in oxc compiled to WebAssembly and in Acorn plus astring. You can watch the AST, the rewritten output, and the time each engine took.
| engine | transform |
|---|---|
| oxc (wasm) | (measured live) |
| acorn + astring (js) | (measured live) |
The transform is parse, rewrite imports, then codegen: the whole ESM-patching round trip, with the same rewrite on both sides (the outputs match apart from quote style and spacing). Each number is the per-call cost averaged over a fixed time budget. A single sub-millisecond parse cannot be timed reliably in a browser, which is why the large module example is where the gap is most trustworthy.
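For readers who want to reproduce that kind of measurement, here is a minimal sketch of averaging a per-call cost over a fixed time budget. It is not the demo's own harness: the helper name `timePerCall`, the warm-up count, the batch size, and the stand-in workload are all assumptions made for illustration.

```typescript
// Hypothetical helper: average the cost of one call to `fn` over a time budget.
function timePerCall(fn: () => void, budgetMs = 200): { perCallMs: number; calls: number } {
  // Warm up first so JIT compilation and lazy initialisation are not billed
  // to the measured calls.
  for (let i = 0; i < 5; i++) fn();

  let calls = 0;
  let elapsed = 0;
  const start = performance.now();
  // Run in small batches and read the clock only between batches. Browser
  // clocks are deliberately coarse, so timing one fast call on its own would
  // mostly measure the clock's resolution.
  while (elapsed < budgetMs) {
    for (let i = 0; i < 10; i++) fn();
    calls += 10;
    elapsed = performance.now() - start;
  }
  return { perCallMs: elapsed / calls, calls };
}

// Usage with a stand-in workload (a real run would call the transform instead).
const sample = timePerCall(() => { JSON.parse('{"a":[1,2,3]}'); }, 50);
console.log(`${sample.perCallMs.toFixed(5)} ms per call over ${sample.calls} calls`);
```

The total elapsed time is divided by the number of calls, so the clock's granularity is spread over many iterations instead of landing on a single one.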
What you are looking at
Type an import prefix and the bare specifiers (lit-html, nanoid) pick it up. Relative
paths (./math.js) and node: builtins are left alone. Switch the output tab to AST to
see the tree each parser produces. Both emit ESTree, so
the shapes line up and the same rewrite logic works on both.
The source map tab gets its own section below. The timing row is the point of the comparison. On the TypeScript example both engines strip the types. Vanilla Acorn cannot parse TS, but the acorn-typescript plugin gives it the AST, and stripping types from there is just deleting the type nodes (covered below). oxc does the same in Rust with its built-in transformer, and the wasm build comes out several times faster. You give up editing the pass in JavaScript to get that speed.
How the transform works
The rewrite is small in both engines. Both give you a tree where an import’s specifier
is a string you can reassign. In oxc the nodes live in an arena, so the new string is
allocated there. Clearing the node’s raw makes codegen reprint from the value instead
of the original text.
/// Rewrite bare module specifiers (`"lodash"`) to `<prefix>lodash`, leaving
/// relative (`./x`), absolute (`/x`), and already-URL imports untouched.
pub fn rewrite_imports<'a>(allocator: &'a Allocator, program: &mut Program<'a>, prefix: &str) {
    for stmt in program.body.iter_mut() {
        let source = match stmt {
            Statement::ImportDeclaration(d) => Some(&mut d.source),
            Statement::ExportNamedDeclaration(d) => d.source.as_mut(),
            Statement::ExportAllDeclaration(d) => Some(&mut d.source),
            _ => None,
        };
        if let Some(lit) = source {
            if is_bare(lit.value.as_str()) {
                // Allocate the new string in the arena and clear `raw` so codegen
                // re-quotes from `value` rather than reprinting the original text.
                let rewritten = format!("{prefix}{}", lit.value.as_str());
                lit.value = rewritten.into_in(allocator);
                lit.raw = None;
            }
        }
    }
}

/// A bare specifier is a package name: not relative, absolute, or a URL.
pub fn is_bare(spec: &str) -> bool {
    !(spec.starts_with('.')
        || spec.starts_with('/')
        || spec.contains("://")
        || spec.starts_with("node:"))
}

The Acorn side is the same walk over the same node types (ImportDeclaration,
ExportNamedDeclaration, ExportAllDeclaration), reassigning node.source.value.
Both parsers produce ESTree, so the logic lines up field for field.
/** Mirror the Rust `is_bare`: a package name, not relative/absolute/URL/node:. */
function isBare(spec: string): boolean {
  return !(
    spec.startsWith('.') ||
    spec.startsWith('/') ||
    spec.includes('://') ||
    spec.startsWith('node:')
  );
}

// Walk the top-level module edges and rewrite each bare specifier in place.
function rewriteImports(program: Program, prefix: string): void {
  for (const node of program.body) {
    const isModuleEdge =
      node.type === 'ImportDeclaration' ||
      node.type === 'ExportNamedDeclaration' ||
      node.type === 'ExportAllDeclaration';
    if (!isModuleEdge || !node.source) continue;
    const value = node.source.value;
    if (typeof value === 'string' && isBare(value)) {
      node.source.value = prefix + value;
      delete node.source.raw; // force astring to re-quote from `value`
    }
  }
}

The two outputs match apart from formatting like quote style and spacing. I did have to
change one thing on the oxc side. Its transformer drops unused value imports by default, so
import process from 'node:process' disappeared even though I only asked it to strip
types. Turning on only_remove_type_imports (the verbatimModuleSyntax behavior) keeps
it, and the two engines agree again.
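To see the rewrite without installing a parser, the sketch below runs the same rule over a hand-built, ESTree-shaped module body. The `StringLiteral` and `ModuleEdge` types, the function names, and the `https://cdn.example/` prefix are all made up for this illustration. A real tree from Acorn or oxc carries many more fields.

```typescript
// Minimal ESTree-shaped types for this sketch only.
interface StringLiteral { type: 'Literal'; value: string; raw?: string }
interface ModuleEdge {
  type: 'ImportDeclaration' | 'ExportNamedDeclaration' | 'ExportAllDeclaration';
  source: StringLiteral | null;
}

// Same rule as the article's isBare: not relative, absolute, a URL, or node:.
function isBareSpecifier(spec: string): boolean {
  return !(
    spec.startsWith('.') ||
    spec.startsWith('/') ||
    spec.includes('://') ||
    spec.startsWith('node:')
  );
}

// Prefix every bare specifier in place and drop `raw` so a printer re-quotes it.
function rewriteEdges(body: ModuleEdge[], prefix: string): void {
  for (const node of body) {
    if (!node.source || !isBareSpecifier(node.source.value)) continue;
    node.source.value = prefix + node.source.value;
    delete node.source.raw;
  }
}

const demoBody: ModuleEdge[] = [
  { type: 'ImportDeclaration', source: { type: 'Literal', value: 'lit-html', raw: "'lit-html'" } },
  { type: 'ImportDeclaration', source: { type: 'Literal', value: './math.js', raw: "'./math.js'" } },
  { type: 'ImportDeclaration', source: { type: 'Literal', value: 'node:process', raw: "'node:process'" } },
  { type: 'ExportAllDeclaration', source: { type: 'Literal', value: 'nanoid', raw: "'nanoid'" } },
  { type: 'ExportNamedDeclaration', source: null }, // e.g. `export const x = 1` has no source
];

rewriteEdges(demoBody, 'https://cdn.example/');
console.log(demoBody.map((n) => n.source?.value ?? null));
// values: https://cdn.example/lit-html, ./math.js, node:process, https://cdn.example/nanoid, null
```

Only the two package names change. The relative path and the node: builtin keep both their value and their original raw text.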
Stripping the types
On the TypeScript example oxc strips the types with its built-in transformer. That is
not something only oxc can do. Once you have a TypeScript AST, stripping types is walking
it and deleting the type-only nodes, which is what Babel, swc, and Node’s own
stripTypeScriptTypes all do. Acorn gets the AST from the acorn-typescript plugin, and
the strip is a plain recursive walk.
// Stripping types is just walking the AST and dropping the type-only nodes and
// fields. Removable statements (interfaces, type aliases, `import type`) are
// filtered out of their arrays; type annotations and type parameters are deleted
// off the nodes that carry them; and TS-only expression wrappers (`x as T`,
// `x!`) are replaced by the expression inside. What's left is plain ESTree that
// astring can print. (This covers the *erasable* subset only: enums, namespaces,
// and parameter properties need real emit, which is where a full transformer
// earns its keep.)
const TYPE_ONLY_NODE = new Set([
  'TSInterfaceDeclaration', 'TSTypeAliasDeclaration', 'TSModuleDeclaration',
  'TSDeclareFunction', 'TSImportEqualsDeclaration',
]);
const UNWRAP = new Set([
  'TSAsExpression', 'TSSatisfiesExpression', 'TSNonNullExpression',
  'TSTypeAssertion', 'TSInstantiationExpression',
]);
const TYPE_FIELDS = [
  'typeAnnotation', 'returnType', 'typeParameters', 'typeArguments',
  'accessibility', 'definite', 'readonly', 'declare', 'optional', 'override',
];

function isTypeOnly(n: { type?: string; importKind?: string; exportKind?: string; declaration?: unknown }): boolean {
  if (!n || !n.type) return false;
  if (TYPE_ONLY_NODE.has(n.type)) return true;
  if (n.type === 'ImportDeclaration' && n.importKind === 'type') return true;
  if (n.type === 'ExportNamedDeclaration' && n.exportKind === 'type' && !n.declaration) return true;
  // Inline `type` modifiers: import specifiers carry `importKind`, while the
  // Babel and typescript-estree shapes put it in `exportKind` on export
  // specifiers, so accept either field.
  if (
    (n.type === 'ImportSpecifier' || n.type === 'ExportSpecifier') &&
    (n.importKind === 'type' || n.exportKind === 'type')
  ) return true;
  return false;
}

function stripTypes<T>(node: T): T {
  if (!node || typeof node !== 'object') return node;
  const n = node as Record<string, unknown>;
  if (typeof n.type === 'string' && UNWRAP.has(n.type)) return stripTypes(n.expression) as T;
  for (const f of TYPE_FIELDS) {
    // `optional` is also plain ESTree on optional-chaining nodes (`a?.b`,
    // `f?.()`); deleting it there would print `a.b` and change behavior.
    if (f === 'optional' && (n.type === 'MemberExpression' || n.type === 'CallExpression')) continue;
    delete n[f];
  }
  for (const key of Object.keys(n)) {
    const v = n[key];
    if (Array.isArray(v)) {
      n[key] = v.filter((c) => !isTypeOnly(c)).map((c) => stripTypes(c));
    } else if (v && typeof v === 'object' && typeof (v as { type?: unknown }).type === 'string') {
      n[key] = stripTypes(v);
    }
  }
  return node;
}

So both columns strip types, and oxc is the faster one. The walk has a limit, called out
in that last comment. It only handles the parts of TypeScript that erase cleanly, like
annotations, interfaces, and import type. Enums, namespaces, parameter properties, and
decorators do not erase. They compile down to real JavaScript, and emitting that code is
the job a full transformer does. To cover them the walk would have to grow into one.
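An enum makes the difference concrete. The block below is a hand-written approximation of the runtime object a transformer has to produce for a two-member numeric enum. It is a simplified sketch and not the exact output of any particular compiler, and the `Direction` name is made up for illustration.

```typescript
// TypeScript source (not erasable):
//   enum Direction { Up, Down }
//
// Deleting that statement would break every runtime use of `Direction.Up`.
// A transformer instead has to emit a real object with forward
// (name -> number) and reverse (number -> name) entries, roughly like this:
const Direction: Record<string | number, string | number> = {};
Direction[(Direction['Up'] = 0)] = 'Up';
Direction[(Direction['Down'] = 1)] = 'Down';

console.log(Direction['Up']); // 0
console.log(Direction[1]);    // Down

// Contrast with an erasable construct, where dropping the annotation is enough:
//   const speed: number = 3;   becomes   const speed = 3;
```

Nothing in the delete-only walk can synthesize those assignments, which is the line between type stripping and a full transform.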
Source maps
Once you rewrite an import or strip a type, the output’s line and column numbers no longer match the source. To keep a stack trace pointing at the code I wrote, the transform emits a source map that records where each generated position came from. The source map tab in the demo above draws it. Hover a colored span and the same color lights up where it came from in the original.
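As a rough picture of what "records where each generated position came from" means in practice, here is a small lookup over already-decoded mappings. The `FlatMapping` shape, the `lookupOriginal` name, and the sample data are assumptions for this sketch. Production code would normally lean on a library such as @jridgewell/trace-mapping and use a binary search where this uses a linear scan.

```typescript
// Hypothetical flattened mapping: one entry per mapped segment, all zero-based.
interface FlatMapping { genLine: number; genCol: number; srcLine: number; srcCol: number }

// Resolve a generated position to its source: take the last mapping on the
// same generated line whose column is at or before the requested column.
function lookupOriginal(mappings: FlatMapping[], genLine: number, genCol: number): FlatMapping | null {
  let best: FlatMapping | null = null;
  for (const m of mappings) {
    if (m.genLine !== genLine || m.genCol > genCol) continue;
    if (best === null || m.genCol > best.genCol) best = m;
  }
  return best;
}

// Sample data: two source lines holding a type alias were stripped, so
// generated line 1 came from source line 3.
const sampleMappings: FlatMapping[] = [
  { genLine: 0, genCol: 0, srcLine: 0, srcCol: 0 },
  { genLine: 0, genCol: 17, srcLine: 0, srcCol: 17 },
  { genLine: 1, genCol: 0, srcLine: 3, srcCol: 0 },
];

const hit = lookupOriginal(sampleMappings, 1, 8);
console.log(hit ? `source line ${hit.srcLine}, column ${hit.srcCol}` : 'unmapped');
// source line 3, column 0
```

A stack frame at generated line 1 resolves to source line 3, which is the line the author actually wrote.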
oxc emits the map natively. Its Rust codegen takes a source_map_path. Set it, and the
build returns a SourceMap next to the code that serializes straight to the v3 JSON the
browser reads.
// Setting source_map_path turns codegen's mapping on and names the source.
// The returned value then carries both the generated code and a SourceMap you
// serialize to the v3 JSON the browser reads.
let codegen = Codegen::new().with_options(CodegenOptions {
    source_map_path: Some(path.to_path_buf()),
    ..CodegenOptions::default()
});
let out = codegen.build(&program);
let map = out.map.map(|m| m.to_json_string()).unwrap_or_default();

The JS side has no native map, so it builds one with the @jridgewell source-map
libraries. astring expects a generator object with an addMapping method and a file
property. @jridgewell/gen-mapping exposes standalone functions instead, so a small
shim sits between them.
// astring writes a source map by calling `sourceMap.addMapping({ original,
// generated, source, name })` and reading `sourceMap.file`. @jridgewell/gen-mapping
// uses standalone functions instead of methods, so this thin shim adapts its
// GenMapping to the shape astring expects. (oxc needs no equivalent because it
// emits the map natively from Rust codegen.)
import { GenMapping, addMapping, toEncodedMap } from '@jridgewell/gen-mapping';

interface AstringMapping {
  original: { line: number; column: number } | null;
  generated: { line: number; column: number };
  source: string;
  name?: string;
}

class AstringGenMapping {
  readonly file: string;
  private readonly map: GenMapping;

  constructor(file: string) {
    this.file = file;
    this.map = new GenMapping({ file });
  }

  addMapping(m: AstringMapping): void {
    const { original, generated, source, name } = m;
    if (!original) return; // astring only maps nodes that carry a location
    if (name) {
      addMapping(this.map, { generated, original, source, name });
    } else {
      addMapping(this.map, { generated, original, source });
    }
  }

  toString(): string {
    return JSON.stringify(toEncodedMap(this.map));
  }
}

With the shim in place, the transform parses with locations: true so every node carries
its origin, then hands the shim to astring to fill in while it prints.
// astring takes each mapping's generated position from its own writer and the
// original position from node.loc.start, so parsing with `locations: true` is
// what makes the map possible. The source name comes from the generator's `file`.
import { generate } from 'astring';

export function transformMap(source: string, prefix = '', ts = false): MapResult {
  try {
    const ast = parseSource(source, ts, true) as unknown as Program;
    if (ts) stripTypes(ast);
    if (prefix) rewriteImports(ast, prefix);
    const map = new AstringGenMapping('input.js');
    const code = generate(ast as never, { sourceMap: map as never });
    return { ok: true, code, map: map.toString() };
  } catch (e) {
    return { ok: false, code: (e as Error).message, map: '' };
  }
}

Reading a map back is the reverse, and @jridgewell/sourcemap-codec does that part. The
mappings string is base64-VLQ where every number is a delta from the previous one, and
its decode handles both the decoding and the delta accumulation, handing back absolute
segments. Keeping the ones with an original position is the whole decoder the source
map tab uses to draw its links.
// @jridgewell/sourcemap-codec does the actual work: `decode` VLQ-decodes the
// mappings string AND un-deltas it, returning, per generated line, segments of
// [genColumn, sourceIndex, srcLine, srcColumn, nameIndex] as absolute values.
// All that's left is to flatten the ones that carry an original position.
import { decode } from '@jridgewell/sourcemap-codec';

export function decodeMappings(mappingsField: string): Mapping[] {
  const out: Mapping[] = [];
  decode(mappingsField).forEach((line, genLine) => {
    for (const seg of line) {
      // One-element segments only mark a generated column with no source.
      if (seg.length >= 4) {
        out.push({ genLine, genCol: seg[0], srcLine: seg[2]!, srcCol: seg[3]! });
      }
    }
  });
  return out;
}
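For a sense of what decode is doing underneath, the sketch below decodes a mappings string by hand: base64-VLQ digits first, then the running sums that turn deltas into absolute values. It is an illustration under simplifying assumptions (no validation beyond the alphabet, function names invented here) and not the library's implementation.

```typescript
const BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one segment such as "AACA" into its raw, still relative, numbers.
function decodeVlqSegment(segment: string): number[] {
  const values: number[] = [];
  let value = 0;
  let shift = 0;
  for (let i = 0; i < segment.length; i++) {
    const digit = BASE64.indexOf(segment.charAt(i));
    if (digit < 0) throw new Error(`invalid base64 character: ${segment.charAt(i)}`);
    value |= (digit & 31) << shift; // the low 5 bits of each digit carry data
    if (digit & 32) {
      shift += 5; // continuation bit set: more digits belong to this number
    } else {
      const magnitude = value >>> 1; // the lowest bit of the finished value is the sign
      values.push((value & 1) === 1 ? -magnitude : magnitude);
      value = 0;
      shift = 0;
    }
  }
  return values;
}

// Turn a whole `mappings` string into absolute segments, one array per generated line.
function decodeMappingsByHand(mappings: string): number[][][] {
  // Source index, source line, source column and name index carry across lines.
  let src = 0;
  let srcLine = 0;
  let srcCol = 0;
  let name = 0;
  return mappings.split(';').map((line) => {
    if (line === '') return [];
    let genCol = 0; // only the generated column resets on each new line
    return line.split(',').map((seg) => {
      const d = decodeVlqSegment(seg);
      genCol += d[0] ?? 0;
      if (d.length < 4) return [genCol];
      src += d[1] ?? 0;
      srcLine += d[2] ?? 0;
      srcCol += d[3] ?? 0;
      if (d.length < 5) return [genCol, src, srcLine, srcCol];
      name += d[4] ?? 0;
      return [genCol, src, srcLine, srcCol, name];
    });
  });
}

console.log(JSON.stringify(decodeMappingsByHand('AAAA;AACA,IAAI')));
// [[[0,0,0,0]],[[0,0,1,0],[4,0,1,4]]]
```

The second line's first segment encodes a source-line delta of 1, and the following segment adds 4 to both columns, so the absolute values only appear once the sums are carried forward.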