How to Elegantly Hack User Code

2023-02-28

node body ast

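Preface: When building infrastructure, a recurring problem is how to make the code you provide as non-intrusive and transparent to users as possible. Suppose I provide an SDK that collects the duration of HTTP requests made by a Node.js process. The simplest approach is to give users a request method and have them route every call through it, so that I can collect the data inside request. This is often inconvenient, though, and at that point we need to either hack Node.js's HTTP module or submit a PR to Node.js. At the operating-system level there are many technologies for this kind of problem, such as eBPF, uprobe, and kprobe. They do not help at the application layer, however, because they target low-level functions. If I want to know how long a JS function takes, the problem can only be solved at the V8 level or the JS level, and V8 does not seem to offer a good facility for this, so for now we mostly consider pure JS or the Node.js core. This article introduces one way to hack user code at the JS level.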

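In Node.js, the usual way to measure how long JS functions take is a CPU profile, but a profile only covers a fixed window of time. If I want to collect timing data in real time, a CPU profile is awkward: the most direct option is to collect profiles periodically, then parse the profile data ourselves and report it. This article introduces another approach: hacking the JS code itself. Suppose we have the following function.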

function compute() {
    // do something
}

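To measure the execution time of a function like this, the most natural approach is to insert some code at its start and end. But we don't want users to do that by hand; we want something more elegant: parse the source into an AST, then rewrite the AST. Let's see how.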

const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require("acorn-walk");
const fs = require('fs');

// Parse the source code into an AST
const ast = acorn.parse(fs.readFileSync('./test.js', 'utf-8'), {
    ecmaVersion: 'latest',
});

function inject(node) {
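    // Build the nodes to insert at the start and end of the function.
    // Shortcut: the arrow-function source is passed as an Identifier name, which escodegen emits verbatim.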
    const entryNode = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
        b.identifier('(() => { return Date.now(); })'), [],
    ))]);
    const exitNode = b.returnStatement(b.callExpression(
        b.identifier('((start) => {console.log(Date.now() - start);})'), [ 
            b.identifier('start')
        ],
    ));

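    // Only rewrite block bodies; an arrow function with an expression body has no statement list
    if (node.body.type === 'BlockStatement') {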
        node.body.body.unshift(entryNode);
        node.body.body.push(exitNode);
    }
}

// Walk the AST and rewrite every function node
walk.simple(ast, {
    ArrowFunctionExpression: inject,
    FunctionDeclaration: inject,
    FunctionExpression: inject
});

// Generate new code from the modified AST
const newCode = escodegen.generate(ast);

fs.writeFileSync('test.js', newCode)

Running the code above produces the following result.

function compute() {
    const start = (() => { return Date.now(); })();
    return ((start) => {console.log(Date.now() - start);})(start);
}
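
One caveat about the output above: the appended return statement only runs when the function falls off its end, and it replaces the function's return value with undefined. A function that returns early, or returns a value, would not be timed correctly. As a rough sketch of a shape that a rewriter could emit instead, the original body can be wrapped in try/finally. The code below is written by hand for illustration and is not output of the script above; the loop body and the variable name __start are assumptions chosen for the example.

```javascript
// Hand-written illustration of a target shape for the rewritten function.
function compute(n) {
    // Hypothetical name, chosen to reduce the chance of clashing with user variables
    const __start = Date.now();
    try {
        // The original function body goes here unchanged (illustrative body)
        let sum = 0;
        for (let i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    } finally {
        // Runs on a normal return and on a throw, without altering the return value
        console.log(Date.now() - __start);
    }
}

console.log(compute(4));
```

With this shape the caller still receives the original return value, and the duration is logged on every exit path.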

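With this we can collect the duration of every function. But this approach analyzes the source statically, and putting it into practice requires users to run it themselves, which is not very friendly. Building on it, we can use the Debugger domain of the V8 debugging protocol to rewrite code dynamically, which can even rewrite Node.js's internal JS code. First, change the test code.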

function compute() {
    // do something
}

setInterval(compute, 1000)

Now let's look at the rewriting logic.

const { Session } = require('inspector');
const acorn = require('acorn');
const escodegen = require('escodegen');
const b = require('ast-types').builders;
const walk = require("acorn-walk");
const session = new Session();
session.connect();

require('./test_ast');
// Listen for script-parsed events to see every JS script
session.on('Debugger.scriptParsed', (message) => {
    // Only process this file
    if (message.params.url.indexOf('test_ast') === -1) {
        return;
    }
    // Fetch the source code
    session.post('Debugger.getScriptSource', {scriptId: message.params.scriptId}, (err, ret) => {
        if (err) {
            console.error(err);
            return;
        }
        const ast = acorn.parse(ret.scriptSource, {
            ecmaVersion: 'latest',
        });
        function inject(node) {
            const entry = b.variableDeclaration('const', [b.variableDeclarator(b.identifier('start'), b.callExpression(
                b.identifier('(() => { return Date.now(); })'), [],
            ))]);
            const exit = b.returnStatement(b.callExpression(
                b.identifier('((start) => {console.log(Date.now() - start);})'), [ 
                    b.identifier('start')
                ],
            ));

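            // Only rewrite block bodies; an arrow function with an expression body has no statement list
            if (node.body.type === 'BlockStatement') {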
                node.body.body.unshift(entry);
                node.body.body.push(exit);
            }
        }
        walk.simple(ast, {
            ArrowFunctionExpression: inject,
            FunctionDeclaration: inject,
            FunctionExpression: inject
        });
        const newCode = escodegen.generate(ast);
        // After rewriting the AST, generate the new code and replace the script source
        session.post('Debugger.setScriptSource', {
            scriptId: message.params.scriptId,
            scriptSource: newCode,
            dryRun: false
        });
    })
});

session.post('Debugger.enable', () => {});

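Normally the function run by setInterval would print nothing, but we find that it keeps printing 0, which is the duration. It is 0 because the timing here has millisecond resolution, but that is not what matters. We have now hacked the user's code in a way the user does not notice; the only thing they need to do is load the SDK we provide. The hard part of this approach is the code-rewriting logic, and it carries considerable risk. But once that is solved, we can hack user code at will and do whatever we want with it. For legitimate purposes, of course.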
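
The 0 in the output comes from Date.now() having millisecond resolution. If finer-grained numbers are wanted, one option is process.hrtime.bigint(), which returns a nanosecond timestamp as a BigInt. The sketch below shows the idea outside of any AST rewriting; the helper name timeSync and the sample workload are assumptions made up for the example, not part of the code above.

```javascript
// Sketch: measure the elapsed time of a synchronous function in nanoseconds.
// timeSync is a hypothetical helper, not an existing API.
function timeSync(fn) {
    const start = process.hrtime.bigint();
    const result = fn();
    const elapsedNs = process.hrtime.bigint() - start;
    return { result, elapsedNs };
}

// Illustrative workload: sum the integers from 0 to 99999
const { result, elapsedNs } = timeSync(() => {
    let sum = 0;
    for (let i = 0; i < 1e5; i++) {
        sum += i;
    }
    return sum;
});

console.log(`result=${result}, elapsed=${elapsedNs} ns`);
```

The same two hrtime calls could replace the Date.now() calls in the injected entry and exit code, at the cost of dealing with BigInt values when reporting.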