
Analyzer

yuku-analyzer is full semantic analysis for JavaScript and TypeScript: scopes, symbols, resolved references, closures, and cross-file module linking, computed natively in Zig and queried as plain JavaScript objects.

No single library gives you all of this. Scopes and resolved references mean eslint-scope or @typescript-eslint/scope-manager. Cross-file go-to-definition means the TypeScript compiler or ts-morph. A parser sits underneath both. yuku-analyzer is all of them in one native pass behind one API.

At native speed. Up to ~15× faster per file than eslint-scope, @typescript-eslint/scope-manager, and @babel/traverse, with zero per-query cost after the single native call. Stitch those separate tools together yourself and the gap only widens: each re-walks the AST, you re-parse to resolve across files, and you keep the indexes between them in sync by hand. yuku-analyzer pays all of that once, in Zig.

npm install yuku-analyzer
import { Analyzer } from "yuku-analyzer";
const analyzer = new Analyzer();
analyzer.addFile("config.ts", `export const flags = { debug: true };`);
const app = analyzer.addFile("app.ts", `import { flags } from "./config.ts";`);
const def = app.rootScope.find("flags").definition();
def.module.path; // "config.ts"
def.symbol.name; // "flags"

That is real cross-file resolution, not a string search: it follows import, re-export, and export * chains to the binding that actually defines the name, the same as an editor’s go-to-definition, in plain JavaScript.

Assembling this yourself is not only slower, it is harder to get right. The lightweight tools give you a scope stack but leave the binding rules to you: hoisting, catch clauses, named function expressions, TypeScript declaration merging, value space versus type space. Each tool implements a subset, and each subset has its own bugs.

yuku-analyzer computes none of that in JavaScript. The binder, scope tree, reference resolution, and module records all come from the same well-tested native analyzer that powers the rest of Yuku, so there is one implementation to keep correct, not a JavaScript copy that drifts from it. JavaScript receives a finished model to query, not events to track.

The design rests on one observation: a semantic model is mostly integers. Scopes point to parents, symbols point to scopes, references point to symbols, and everything points to AST nodes. Integers serialize for free.

One native call. addFile parses the source, runs scope construction, binding, and reference resolution in Zig, and serializes the result into a single binary buffer: the AST in Yuku’s flat transfer format, followed by the semantic tables as fixed-stride sections. One FFI crossing per file, total.

Zero-copy decode. On the JavaScript side, the semantic sections are read through typed-array views directly over the transferred buffer. Nothing is parsed, nothing is copied. A symbol’s name, flags, scope, and declaration list are reads at computed offsets.

Lazy objects, eager answers. Scope, Symbol, Reference, Import, and Export are flyweight objects over the tables: tiny, allocated once per row on first access, with getters that read the buffer. Cross-indexes (which references belong to which symbol, which symbols belong to which scope) build lazily on first use and amortize across every later query.
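
The fixed-stride tables and flyweight objects described above can be pictured with a small sketch. The row layout here (three u32 cells per symbol) and the names SymbolRow and SymbolTable are hypothetical, chosen only for illustration; the library's real section layout is not documented in this page.

```typescript
// Hypothetical layout: each symbol row is 3 x u32 = [nameOffset, flags, scopeId].
const SYMBOL_STRIDE = 3;

// A flyweight: holds only the shared view and a row id, reads on demand.
class SymbolRow {
  private readonly cells: Uint32Array;
  readonly id: number;
  constructor(cells: Uint32Array, id: number) {
    this.cells = cells;
    this.id = id;
  }
  get flags(): number { return this.cells[this.id * SYMBOL_STRIDE + 1]; }
  get scopeId(): number { return this.cells[this.id * SYMBOL_STRIDE + 2]; }
}

class SymbolTable {
  private readonly cells: Uint32Array;
  private readonly rows: (SymbolRow | undefined)[];
  constructor(buffer: ArrayBuffer, byteOffset: number, count: number) {
    // A typed-array view over the existing buffer: no bytes are copied.
    this.cells = new Uint32Array(buffer, byteOffset, count * SYMBOL_STRIDE);
    this.rows = new Array(count);
  }
  get(id: number): SymbolRow {
    // Allocate the flyweight once per row, on first access.
    let row = this.rows[id];
    if (row === undefined) {
      row = new SymbolRow(this.cells, id);
      this.rows[id] = row;
    }
    return row;
  }
}

// A stand-in for a transferred buffer: an 8-byte header, then two rows.
const buffer = new ArrayBuffer(8 + 2 * SYMBOL_STRIDE * 4);
new Uint32Array(buffer, 8).set([0, 0b01, 0, 5, 0b10, 1]);
const table = new SymbolTable(buffer, 8, 2);
console.log(table.get(1).flags, table.get(1) === table.get(1)); // 2 true
```

Every getter is an indexed read at a computed offset, which is why a query after the first costs no parsing and no allocation.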

Node identity. AST nodes decode lazily and are memoized by node index. The node you reach by walking module.ast and the node a semantic query hands back are the same JavaScript object. symbol.declarations[0] === someNodeYouWalkedTo is a meaningful comparison, and a WeakMap resolves any node back to its index, which is what makes symbolOf(node) a lookup instead of a search.
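
A minimal sketch of that memoization, assuming a hypothetical NodeStore that decodes nodes from a list of type names. It is not the library's decoder, only an illustration of why identity comparison and reverse lookup both become constant-time.

```typescript
type AstNode = { type: string };

class NodeStore {
  private readonly types: string[];
  private readonly cache: (AstNode | undefined)[] = [];
  private readonly indexByNode = new WeakMap<AstNode, number>();
  constructor(types: string[]) {
    this.types = types;
  }
  // Decode on first request, then always hand back the same object.
  node(index: number): AstNode {
    let decoded = this.cache[index];
    if (decoded === undefined) {
      decoded = { type: this.types[index] };
      this.cache[index] = decoded;
      this.indexByNode.set(decoded, index);
    }
    return decoded;
  }
  // Reverse lookup by object identity; undefined for nodes from elsewhere.
  indexOf(node: AstNode): number | undefined {
    return this.indexByNode.get(node);
  }
}

const store = new NodeStore(["Program", "Identifier"]);
console.log(store.node(1) === store.node(1)); // true
console.log(store.indexOf(store.node(1))); // 1
console.log(store.indexOf({ type: "Identifier" })); // undefined
```

A structurally identical node created by hand is not in the WeakMap, which matches the rule that lookups work on object identity rather than shape.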

The result: native-code analysis speed, JavaScript-object ergonomics, and a wire format that is provably synchronized with the code that reads it.

The Analyzer is the project: a set of modules plus the links between them.

import { Analyzer } from "yuku-analyzer";
const analyzer = new Analyzer();
const module = analyzer.addFile("src/app.tsx", source);
analyzer.removeFile("src/app.tsx"); // true if it existed
analyzer.module("src/app.tsx"); // Module | undefined
analyzer.modules; // ReadonlyMap<string, Module>

addFile accepts the same options as yuku-parser’s parse, with lang and sourceType defaulting from the file extension:

analyzer.addFile("legacy.cjs", source, {
  // lang: "js" inferred from the extension
  // sourceType: "script" inferred from the extension
  preserveParens: true,
  allowReturnOutsideFunction: false,
  attachComments: false,
});

Adding a path that already exists replaces the module and marks the graph for relinking. The call returns a new Module, and any scopes, symbols, or nodes you held from the previous version belong to that earlier parse. A change in analyzer.module(path) identity is the signal to drop a cache keyed on the old one.

Cross-file linking needs to map import specifiers to added files. The default resolver handles relative specifiers with standard extension probing (./util matches util.ts, util/index.ts, and so on). For anything else, supply your own:

const analyzer = new Analyzer({
  resolve(specifier, importerPath) {
    // return the path of an added file, or null for external modules
    return myAliasMap.get(specifier) ?? null;
  },
});

Returning null marks the import as external: import.resolvedModule stays null and definition chains stop there, without diagnostics.
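
Extension probing for relative specifiers can be sketched as follows. The probe list, the function names, and the explicit set of added paths are all assumptions made for this example; the default resolver's actual probe order is not specified here.

```typescript
// Hypothetical probe order, tried left to right.
const PROBES = ["", ".ts", ".tsx", ".js", ".jsx", "/index.ts", "/index.js"];

// Join a relative specifier onto the importer's directory, collapsing "." and "..".
function joinRelative(importerPath: string, specifier: string): string {
  const parts = importerPath.split("/").slice(0, -1);
  for (const segment of specifier.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts.join("/");
}

// Returns the path of an added file, or null when nothing matches.
function resolveRelative(
  specifier: string,
  importerPath: string,
  added: ReadonlySet<string>,
): string | null {
  // Bare specifiers such as "react" are treated as external in this sketch.
  if (!specifier.startsWith("./") && !specifier.startsWith("../")) return null;
  const base = joinRelative(importerPath, specifier);
  for (const suffix of PROBES) {
    if (added.has(base + suffix)) return base + suffix;
  }
  return null;
}

const added = new Set(["src/util.ts", "src/lib/index.ts", "config.ts"]);
console.log(resolveRelative("./util", "src/app.ts", added)); // "src/util.ts"
console.log(resolveRelative("./lib", "src/app.ts", added)); // "src/lib/index.ts"
console.log(resolveRelative("react", "src/app.ts", added)); // null
```

A custom resolve option could handle aliases first and fall back to a function of this shape for relative specifiers.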

addFile returns a Module, the per-file unit of the analysis. Everything on it is local JavaScript: no native calls happen after addFile returns.

module.path; // the path it was added under
module.source; // the original source text
module.ast; // ESTree / TS-ESTree Program, lazily decoded
module.diagnostics; // syntax and semantic errors for this file
module.comments; // every comment in source order
module.lineStarts; // sorted offsets where each line begins
module.locOf(120); // { line, column } for an offset
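
The relationship between lineStarts and locOf can be sketched as a binary search. This is an illustration only: the helper names are hypothetical, and the 1-based line and 0-based column convention is an assumption that may differ from what the library returns.

```typescript
// Offsets where each line begins: 0, plus one past every "\n".
function computeLineStarts(source: string): number[] {
  const starts = [0];
  for (let i = 0; i < source.length; i++) {
    if (source.charCodeAt(i) === 10) starts.push(i + 1);
  }
  return starts;
}

// Binary search for the last line start that is <= offset.
// Assumed convention: 1-based line, 0-based column.
function locOf(
  lineStarts: readonly number[],
  offset: number,
): { line: number; column: number } {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (lineStarts[mid] <= offset) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo + 1, column: offset - lineStarts[lo] };
}

const starts = computeLineStarts("ab\ncd\n\nef");
console.log(starts); // [0, 3, 6, 7]
console.log(locOf(starts, 4)); // { line: 2, column: 1 }
```

Because the starts are sorted, each lookup is logarithmic in the number of lines, with no rescanning of the source text.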

The AST is the same ESTree / TypeScript-ESTree output as yuku-parser, and nodes are plain mutable objects. Edit them, run them through any ESTree tool, print them with yuku-codegen.

The semantic surface:

module.scopes; // Scope[], index is the scope id
module.rootScope; // the scope top-level code runs in
module.symbols; // Symbol[], index is the symbol id
module.references; // Reference[], in source order
module.unresolvedReferences; // references that resolve to no binding
module.imports; // Import[], in source order
module.exports; // Export[], in source order

Ids are stable within a parse: (module.path, symbol.id) is a persistable key for caches and incremental tooling. Re-adding a path reparses it into a new Module and can renumber, so pair the key with module identity and invalidate when analyzer.module(path) changes.
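
One way to pair the key with module identity is a cache that remembers which Module object filled each path. The SymbolCache class and the ModuleLike shape below are hypothetical stand-ins, sketched under the assumption that only path and object identity matter.

```typescript
// Stand-in for the analyzer's Module: only the path is needed here.
type ModuleLike = { path: string };

class SymbolCache<T> {
  private readonly byPath = new Map<string, { owner: ModuleLike; values: Map<number, T> }>();

  get(current: ModuleLike, symbolId: number, compute: () => T): T {
    let entry = this.byPath.get(current.path);
    // A different object under the same path means the file was re-added:
    // symbol ids may have been renumbered, so the old values are dropped.
    if (entry === undefined || entry.owner !== current) {
      entry = { owner: current, values: new Map<number, T>() };
      this.byPath.set(current.path, entry);
    }
    if (!entry.values.has(symbolId)) entry.values.set(symbolId, compute());
    return entry.values.get(symbolId) as T;
  }
}

const cache = new SymbolCache<string>();
const firstParse: ModuleLike = { path: "a.ts" };
const secondParse: ModuleLike = { path: "a.ts" };
console.log(cache.get(firstParse, 0, () => "computed once")); // "computed once"
console.log(cache.get(firstParse, 0, () => "not used")); // "computed once"
console.log(cache.get(secondParse, 0, () => "recomputed")); // "recomputed"
```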

Every lexical environment in the file, as a tree:

const scope = module.scopes[3];
scope.kind; // "global" | "module" | "function" | "block" | "class"
            // | "staticBlock" | "expressionName" | "tsModule"
scope.node; // the AST node that created the scope
scope.parent; // parent Scope, or null at the global scope
scope.strict; // strict mode, propagated per spec
scope.hoistTarget; // the scope where a `var` declared here actually lands
scope.bindings; // symbols declared directly in this scope
scope.find("x"); // direct binding lookup, no chain walk
scope.contains(other); // is `other` this scope or a descendant?
for (const s of scope.ancestors()) { /* this scope up to global */ }

The scope tree is the native binder’s exact output, so the spec subtleties are already right.

A Symbol is one declared binding:

const sym = module.rootScope.find("render");
sym.name; // "render"
sym.scope; // the Scope it is declared in
sym.declarations; // every declarator node, in source order
sym.references; // every resolved use site in this module
sym.id; // stable index into module.symbols

One symbol can have several declarations when the language merges them: TypeScript function overloads, class + interface merging, namespace + enum merging. The analyzer records every declarator, which is exactly what go-to-definition and rename need.

What a symbol is lives in a bitset. It is queried through exactly two methods, has (any of the given flags) and hasAll (all of them), against the exported SymbolFlags constants. There are no parallel boolean getters, so the API stays small and predictable.

import { SymbolFlags } from "yuku-analyzer";
sym.has(SymbolFlags.Function); // is it a function?
sym.has(SymbolFlags.TypeAlias | SymbolFlags.Interface); // either kind?
sym.hasAll(SymbolFlags.Function | SymbolFlags.Exported); // an exported function?

Alongside the single-bit flags, four composites answer the common categorical questions directly:

sym.has(SymbolFlags.Variable); // var / let / const, parameters and catch bindings included
sym.has(SymbolFlags.Import); // any import binding, value or `import type`
sym.has(SymbolFlags.ValueSpace); // visible at runtime
sym.has(SymbolFlags.TypeSpace); // referencable from a TS type position

A class satisfies both ValueSpace and TypeSpace, which is what makes “use a class as a type” work without special cases. The flag values, composites included, are generated from the native binder’s bit layout at build time, so they can never disagree with what the binder wrote.
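
The difference between has and hasAll comes down to two bitwise tests. The sketch below uses made-up bit values and free functions to show the semantics; the real SymbolFlags values are generated from the native binder and are not reproduced here.

```typescript
// Hypothetical bit assignments, for illustration only.
const Flags = {
  Function: 1 << 0,
  Class: 1 << 1,
  Interface: 1 << 2,
  Exported: 1 << 3,
} as const;

// Composites are unions of single-bit flags.
const ValueSpace = Flags.Function | Flags.Class;
const TypeSpace = Flags.Class | Flags.Interface;

// Any of the given flags is set.
function has(bits: number, mask: number): boolean {
  return (bits & mask) !== 0;
}

// All of the given flags are set.
function hasAll(bits: number, mask: number): boolean {
  return (bits & mask) === mask;
}

const exportedClass = Flags.Class | Flags.Exported;
console.log(has(exportedClass, ValueSpace), has(exportedClass, TypeSpace)); // true true
console.log(has(exportedClass, Flags.Function | Flags.Exported)); // true
console.log(hasAll(exportedClass, Flags.Function | Flags.Exported)); // false
```

A class matching both composites falls out of the mask arithmetic, with no special case for classes.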

A Reference is one identifier in use position, already resolved:

const ref = module.references[0];
ref.name; // the identifier text
ref.node; // the Identifier node, identity-shared with the AST
ref.scope; // the scope the use occurs in
ref.symbol; // the resolved Symbol, or null for free names
ref.kind; // "value" for runtime uses, "type" for TS type positions
ref.isWrite; // true when this use (re)assigns the binding

kind lets rename and dead-code tools treat a value and a same-named type independently. isWrite is computed structurally in the native pass.

module.unresolvedReferences is the complement: every name that resolves to no local binding. That list is precisely what a no-undef lint rule or a globals collector wants.

These methods connect AST nodes to the semantic model. All of them work on node object identity, not positions or names:

module.symbolOf(node); // the symbol a node declares or references, or null
module.referenceOf(node); // the Reference for an identifier node, or null
module.scopeOf(node); // the innermost scope whose extent contains the node
module.parentOf(node); // the node that structurally contains it, or null
module.resolve("fetch"); // scope-chain lookup from the root scope
module.resolve("x", someScope); // or from any scope, like the engine would

symbolOf is the workhorse: hand it a declaration identifier and you get the symbol it declares, hand it a reference identifier and you get the symbol it resolves to.

parentOf walks upward from a node you already hold, with no ancestor stack and no full walk. Because nodes are memoized by index, it is the same constant-time lookup as the others. It returns null at the program root and for any node that is not part of this module’s AST.

module.walk is a typed visitor walk with the semantic model in context. Handlers are keyed by node type and receive the exact node type, not a generic node:

module.walk({
  // bare function = enter handler
  CallExpression(node, ctx) {
    if (node.callee.type === "Identifier") {
      const target = ctx.module.symbolOf(node.callee);
      if (target?.has(SymbolFlags.Import)) {
        console.log(`calls imported ${node.callee.name}`);
      }
    }
  },
  // or an enter/leave pair
  FunctionDeclaration: {
    // in ESTree, id is null for an anonymous `export default function () {}`
    enter(node, ctx) { console.log("entering", node.id?.name); },
    leave(node, ctx) { console.log("leaving", node.id?.name); },
  },
  // universal catch-alls
  enter(node, ctx) {},
  leave(node, ctx) {},
});

Per node, the order is: catch-all enter, typed enter, children, typed leave, catch-all leave. Pass a node as the second argument to walk only a subtree: module.walk(visitors, someFunction).
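
The handler order can be reproduced with a tiny stand-alone walker. This is a simplification: typed handlers live under a hypothetical byType field instead of at the top level, nodes carry a generic children array, and there is no context object.

```typescript
type AstNode = { type: string; children?: AstNode[] };
type Handler = (node: AstNode) => void;
type Visitors = {
  enter?: Handler;
  leave?: Handler;
  byType?: Record<string, { enter?: Handler; leave?: Handler }>;
};

// Order per node: catch-all enter, typed enter, children, typed leave, catch-all leave.
function walk(node: AstNode, visitors: Visitors): void {
  const typed = visitors.byType?.[node.type];
  visitors.enter?.(node);
  typed?.enter?.(node);
  for (const child of node.children ?? []) walk(child, visitors);
  typed?.leave?.(node);
  visitors.leave?.(node);
}

const log: string[] = [];
const tree: AstNode = { type: "Program", children: [{ type: "Identifier" }] };
walk(tree, {
  enter: (n) => { log.push(`enter ${n.type}`); },
  leave: (n) => { log.push(`leave ${n.type}`); },
  byType: {
    Identifier: {
      enter: () => { log.push("typed enter Identifier"); },
      leave: () => { log.push("typed leave Identifier"); },
    },
  },
});
console.log(log.join(" > "));
// enter Program > enter Identifier > typed enter Identifier > typed leave Identifier > leave Identifier > leave Program
```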

One context object is reused across the whole walk (do not store it). It carries the position and the semantics:

ctx.node; // the current node
ctx.parent; // its parent, or null at the walk root
ctx.key; // the field on the parent holding this node
ctx.index; // position in an array field, or null
ctx.ancestors(); // a copy of the ancestor chain, root first
ctx.scope; // the innermost Scope at this node
ctx.symbol; // shorthand for module.symbolOf(node)
ctx.reference; // shorthand for module.referenceOf(node)
ctx.module; // the module being walked

ctx.scope is not tracked during the walk. The binder records the scope at every node and ships it as a per-node table, so ctx.scope (like module.scopeOf) is a single read off that table. No scoping rule is evaluated in JavaScript, and the answer is exact even where scopes do not nest with spans, such as decorators.

The walk mutates the AST in place, with precise semantics:

| Operation | Effect |
| --- | --- |
| ctx.skip() | Do not descend into this node’s children. leave still fires. |
| ctx.stop() | End the walk immediately. |
| ctx.replace(node) | Swap the current node. The walk continues into the replacement’s children and leave fires for its new type. |
| ctx.remove() | Splice the node out of an array field, or null a plain field. Children are not walked, leave does not fire. |
| ctx.insertBefore(node) | Insert a sibling before the current node. The inserted node is not visited. |
| ctx.insertAfter(node) | Insert a sibling after the current node. The walk visits it. |

A replacement node created with start: 0, end: 0 inherits the original node’s span, which keeps source maps meaningful through yuku-codegen.

module.walk({
  DebuggerStatement(node, ctx) {
    ctx.remove();
  },
  Identifier(node, ctx) {
    if (ctx.symbol === legacyName) node.name = "modernName";
  },
});

One rule to remember: the semantic tables are a snapshot of the parsed source. Nodes you create have no symbols or references of their own. Analyze, transform, print, and re-analyze the output if you need fresh semantics for the transformed code.

For the simplest queries there is a one-liner:

module.findAll("FunctionDeclaration"); // FunctionDeclaration[]
module.findAll(["ClassDeclaration", "TSInterfaceDeclaration"]);

capturesOf computes the free variables of a function: every binding referenced inside it (nested closures included) that is declared outside it.

const source = `
let count = 0;
const step = 2;
export function tick() {
  count += step;
  return () => count;
}
`;
const module = analyzer.addFile("counter.ts", source);
const [tick] = module.findAll("FunctionDeclaration");
for (const capture of module.capturesOf(tick)) {
  console.log(capture.symbol.name, capture.isWritten);
}
// count true (tick writes to it)
// step false (read only)

Each Capture carries the outer symbol, the capturing references inside the function, and isWritten. Type-only references are excluded, since they do not exist at runtime. Only bindings appear: this, arguments, and unresolved globals carry no symbol and are never reported, while module-scope and imported bindings count like any other outer binding.

Because the computation rides the resolved reference table, it is shadowing-correct and alias-correct by construction. A local count declared inside the function does not produce a false capture, and a reference is attributed to the binding it actually resolves to, not to the nearest matching name.
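
To see why a resolved reference table makes this shadowing-correct, here is a sketch of the computation over hand-built tables. The row shapes (ScopeRow, SymbolRow, RefRow) and the capturesOf signature are hypothetical and much simpler than the library's; the data models the counter.ts example plus one function that shadows count.

```typescript
type ScopeRow = { parent: number | null };
type SymbolRow = { name: string; scope: number };
type RefRow = { symbol: number | null; scope: number; isWrite: boolean; kind: "value" | "type" };

// True when `scope` is `ancestor` or nested somewhere below it.
function isInside(scopes: ScopeRow[], scope: number, ancestor: number): boolean {
  for (let s: number | null = scope; s !== null; s = scopes[s].parent) {
    if (s === ancestor) return true;
  }
  return false;
}

// Maps captured symbol id -> whether any capturing reference writes to it.
function capturesOf(
  scopes: ScopeRow[],
  symbols: SymbolRow[],
  refs: RefRow[],
  fnScope: number,
): Map<number, boolean> {
  const captured = new Map<number, boolean>();
  for (const ref of refs) {
    if (ref.symbol === null || ref.kind === "type") continue; // unresolved or type-only
    if (!isInside(scopes, ref.scope, fnScope)) continue; // use is outside the function
    if (isInside(scopes, symbols[ref.symbol].scope, fnScope)) continue; // a local, not a capture
    captured.set(ref.symbol, (captured.get(ref.symbol) ?? false) || ref.isWrite);
  }
  return captured;
}

// Scopes: 0 module, 1 tick, 2 the arrow inside tick, 3 another function.
const scopes: ScopeRow[] = [{ parent: null }, { parent: 0 }, { parent: 1 }, { parent: 0 }];
// Symbols: 0 count, 1 step, 2 tick, 3 a local `count` declared in scope 3.
const symbols: SymbolRow[] = [
  { name: "count", scope: 0 },
  { name: "step", scope: 0 },
  { name: "tick", scope: 0 },
  { name: "count", scope: 3 },
];
const refs: RefRow[] = [
  { symbol: 0, scope: 1, isWrite: true, kind: "value" }, // count += ...
  { symbol: 1, scope: 1, isWrite: false, kind: "value" }, // ... step
  { symbol: 0, scope: 2, isWrite: false, kind: "value" }, // () => count
  { symbol: 3, scope: 3, isWrite: false, kind: "value" }, // shadowing local
];

const tickCaptures = capturesOf(scopes, symbols, refs, 1);
const shadowCaptures = capturesOf(scopes, symbols, refs, 3);
console.log([...tickCaptures]); // [[0, true], [1, false]]
console.log(shadowCaptures.size); // 0
```

The function never compares names: the second count is a different symbol id, so it cannot be mistaken for the outer binding.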

Each module carries spec-true records of its module surface, computed natively:

for (const imp of module.imports) {
  imp.specifier; // "./lib.ts"
  imp.name; // imported export name, "default" for default imports,
            // null for namespace and side-effect imports
  imp.local; // the local binding Symbol, or null for side effects
  imp.isNamespace; // import * as ns
  imp.isSideEffect; // import "m"
  imp.typeOnly; // import type / import { type x }
  imp.phase; // "source" | "defer" | null (stage 3 phase imports)
  imp.resolvedModule; // the defining Module, or null when external
}
for (const exp of module.exports) {
  exp.name; // exported name, "default" included, null for export *
  exp.local; // backing local Symbol, when there is one
  exp.isStar; // export * from "m"
  exp.specifier; // re-export source, or null for local exports
  exp.fromName; // the name taken from the source module
  exp.isNamespaceReexport; // export * as ns from "m"
  exp.isExportEquals; // TS export = expr (the module's entire value)
  exp.globalName; // TS export as namespace N, else null
  exp.typeOnly; // export type
  exp.resolvedModule; // the source Module for re-exports
}

Following the specification, default is modeled as an export name, not a separate kind, and export * never forwards default. TypeScript’s legacy module forms (export =, export as namespace) are recorded with their own kinds, so ESM tooling never mistakes them for named exports. Tools built on these records inherit the spec behavior instead of approximating it.

These records cover ECMAScript module syntax and TypeScript’s module forms (import / export, import type, export =, export as namespace). CommonJS is ordinary code rather than module syntax, so require, module.exports, and exports.x produce no import or export records and take no part in linking. Everything per file (scopes, symbols, references, captures) is computed for CommonJS sources the same way.

analyzer.link() joins the graph: resolves every specifier through the resolver, populates resolvedModule on imports and re-exports, builds dependencies / dependents, and validates every imported name and named re-export.

Name resolution implements the spec’s ResolveExport: renaming re-export chains are followed per name, default is never satisfied by export *, and a name supplied by multiple export * declarations through different bindings is reported as ambiguous, the same conditions an engine raises at link time.

Calling it is optional. Every cross-file surface links on demand after files change, so reading import.resolvedModule or module.dependencies is always correct. Call link() explicitly when you want to control when the work happens and collect the diagnostics at a known point:

analyzer.link();
for (const d of analyzer.diagnostics) {
  console.log(`${d.module}: ${d.message}`);
  // "main.ts: Module './lib.ts' has no export 'helpr'"
}

definitionOf follows import, re-export, and export * chains to the place a binding is actually defined, however many files away:

// a.ts: export const value = 1;
// b.ts: export { value as renamed } from "./a.ts";
// c.ts: import { renamed } from "./b.ts";
const c = analyzer.module("c.ts");
const sym = c.rootScope.find("renamed");
const def = analyzer.definitionOf(sym);
def.module.path; // "a.ts"
def.symbol.name; // "value"

symbol.definition() is the instance-method shorthand. A result with symbol: null means the definition is a whole module namespace (import * as ns). A null result means the chain leaves the added file set (an external package, by design not an error), cannot be resolved, or is ambiguous.

Chains with cycles terminate safely: a circular request is detected per (module, name) pair, so a chain may legitimately pass through the same module twice under different names.
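
The resolution rules above (renamed re-exports followed per name, default never supplied by export *, ambiguity across stars, cycle detection per (module, name) pair) can be sketched over a toy export graph. The ExportEntry shape and the resolveExport function are hypothetical, and this sketch reports ambiguity as a distinct "ambiguous" value purely for illustration.

```typescript
type ExportEntry =
  | { kind: "local"; name: string }
  | { kind: "reexport"; name: string; from: string; fromName: string }
  | { kind: "star"; from: string };
type Graph = Record<string, ExportEntry[]>;
type Binding = { module: string; name: string };

function resolveExport(
  graph: Graph,
  modulePath: string,
  name: string,
  seen: Set<string> = new Set(),
): Binding | "ambiguous" | null {
  const key = `${modulePath}\u0000${name}`;
  if (seen.has(key)) return null; // circular request for the same (module, name) pair
  seen.add(key);
  const entries = graph[modulePath];
  if (!entries) return null; // leaves the added file set
  for (const e of entries) {
    if (e.kind === "local" && e.name === name) return { module: modulePath, name };
    if (e.kind === "reexport" && e.name === name) {
      return resolveExport(graph, e.from, e.fromName, seen);
    }
  }
  if (name === "default") return null; // export * never forwards default
  let found: Binding | null = null;
  for (const e of entries) {
    if (e.kind !== "star") continue;
    const r = resolveExport(graph, e.from, name, seen);
    if (r === "ambiguous") return r;
    if (r === null) continue;
    if (found === null) found = r;
    else if (found.module !== r.module || found.name !== r.name) return "ambiguous";
  }
  return found;
}

const graph: Graph = {
  "a.ts": [{ kind: "local", name: "value" }, { kind: "local", name: "default" }],
  "b.ts": [{ kind: "reexport", name: "renamed", from: "a.ts", fromName: "value" }],
  "x.ts": [{ kind: "local", name: "v" }],
  "y.ts": [{ kind: "local", name: "v" }],
  "z.ts": [{ kind: "star", from: "x.ts" }, { kind: "star", from: "y.ts" }],
  "s.ts": [{ kind: "star", from: "a.ts" }],
  "p.ts": [{ kind: "star", from: "q.ts" }],
  "q.ts": [{ kind: "star", from: "p.ts" }],
};
console.log(resolveExport(graph, "b.ts", "renamed")); // { module: "a.ts", name: "value" }
console.log(resolveExport(graph, "z.ts", "v")); // "ambiguous"
console.log(resolveExport(graph, "s.ts", "default")); // null
console.log(resolveExport(graph, "p.ts", "missing")); // null (the cycle terminates)
```

Keying the visited set on the pair rather than on the module alone is what lets a chain pass through one module twice under different names.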

The inverse direction: every use of a symbol anywhere in the graph, with imports followed back to the definition:

const uses = analyzer.referencesOf(def.symbol);
for (const { module, reference } of uses) {
  console.log(module.path, reference.name, reference.isWrite);
}

This is find-all-references as a compiler primitive: rename across files, unused-export detection, impact analysis.

// unused exports, whole project
for (const module of analyzer.modules.values()) {
  for (const exp of module.exports) {
    if (exp.local && analyzer.referencesOf(exp.local).length === 0) {
      console.log(`${module.path}: '${exp.name}' is exported but never used`);
    }
  }
}

module.exportedNames() lists everything a module exports with export * chains followed, the spec’s GetExportedNames. Per the spec, ambiguous star names are included (ambiguity is a resolution error, not an enumeration one) and default never arrives through a star:

// a.ts: export const one = 1;
// lib.ts: export const two = 2; export default x; export * from "./a.ts";
analyzer.module("lib.ts").exportedNames(); // ["two", "default", "one"]
analyzer.module("a.ts").exportedNames(); // ["one"]

This is what namespace-member completion and re-export expansion build on.
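
Enumeration with star chains followed can be sketched in a few lines. The graph shape and the exportedNames function below are hypothetical, and the ordering (own names first, then star names) is an assumption of this sketch; it shows default being dropped when it arrives through a star while a module's own default is kept.

```typescript
type ExportEntry = { name: string } | { starFrom: string };
type Graph = Record<string, ExportEntry[]>;

function exportedNames(
  graph: Graph,
  modulePath: string,
  visiting: Set<string> = new Set(),
): string[] {
  if (visiting.has(modulePath)) return []; // star cycle: contribute nothing further
  visiting.add(modulePath);
  const entries = graph[modulePath] ?? [];
  const names: string[] = [];
  // Own exports first, default included.
  for (const e of entries) {
    if ("name" in e) names.push(e.name);
  }
  // Then names arriving through export *, never default, without duplicates.
  for (const e of entries) {
    if (!("starFrom" in e)) continue;
    for (const n of exportedNames(graph, e.starFrom, visiting)) {
      if (n !== "default" && !names.includes(n)) names.push(n);
    }
  }
  return names;
}

const graph: Graph = {
  "base.ts": [{ name: "one" }, { name: "default" }],
  "index.ts": [{ name: "two" }, { starFrom: "base.ts" }],
};
console.log(exportedNames(graph, "base.ts")); // ["one", "default"]
console.log(exportedNames(graph, "index.ts")); // ["two", "one"]
```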

The full bitset, generated from the native binder’s layout:

| Flag | Meaning |
| --- | --- |
| FunctionScopedVariable | var, parameter, or catch variable |
| BlockScopedVariable | let, const, using, await using |
| Function | function declaration or expression |
| Class | class declaration or expression |
| RegularEnum | TS enum |
| ConstEnum | TS const enum |
| ValueModule | TS namespace with runtime content |
| Interface | TS interface |
| TypeAlias | TS type alias |
| TypeParameter | TS <T>, infer T, mapped-type key |
| NamespaceModule | TS namespace of any kind |
| ValueImport | a value import binding (import x / import { x }) |
| TypeImport | import type / import { type x } binding |
| Const | const or using binding |
| Ambient | TS declare |
| Parameter | function or method parameter |
| CatchVariable | catch (e) binding |
| Exported | exported from its module |
| Default | the default export |

Plus four composites (unions of the above), for the common categorical questions:

| Composite | Matches |
| --- | --- |
| Variable | var / let / const, parameters and catch bindings included |
| Import | any import binding, value or import type |
| ValueSpace | visible at runtime (var, function, class, enum, value namespace) |
| TypeSpace | referencable from a type position (class, enum, interface, alias, type param) |

Analysis runs inside the native parser pass, so full semantics add roughly half of the parse time on top of parsing itself. Validated against 55,000+ real-world files.

Concretely, on an Apple M-series machine: parsing plus complete semantic analysis of a typical source file lands well under a millisecond, walking sustains tens of millions of nodes per second, and linking a 2,000-module graph takes about a millisecond.

Everything is fully typed. Visitor handlers receive exact node types, and the semantic surface (Module, Scope, Symbol, Reference, Import, Export, Capture) is exported:

import type { Module, Symbol, Capture } from "yuku-analyzer";