webpack-bundle-analyzer
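webpack-bundle-analyzer is a plugin that analyzes build output and renders it as a treemap, so that developers can optimize performance based on the project's post-build dependency relationships and the actual file sizes.
Why study this plugin? Among the current dependency-analysis tools, including the one webpack itself provides at http://webpack.github.io/analyse/, none is as intuitive, from a visualization standpoint, as the treemap that webpack-bundle-analyzer provides. The other tools include: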
- webpack visualizer
- webpack chart
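All of the tools above generate their charts only from what stats.json exposes.
webpack-bundle-analyzer differs from them in that it uses acorn to analyze the build output itself and derive the module relationships from it. Its core implementation is therefore independent of webpack's own capabilities. But because it analyzes webpack's build output, it has to understand how the bundled JS is assembled. As webpack evolves, the output structure changes as well, so the plugin needs continual compatibility work. The source code and the author's comments show that webpack v5 output is already fairly hard to analyze.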
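Before reading on, you should be familiar with the following:
- The meaning of module, chunk, and assets in the stats data that Webpack provides.
- acorn: a small, fast JavaScript parser written entirely in JavaScript, plus the basics of ASTs (abstract syntax trees).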
The core flow is as follows:
Plugin entry
First, following the Webpack plugin convention, the plugin obtains stats in the callback of the done hook:
class BundleAnalyzerPlugin {
  apply(compiler) {
    // Entry point of the core implementation
    const done = (stats, callback) => {/* ... */ }
    // Support both the new and the old webpack plugin APIs
    if (compiler.hooks) {
      compiler.hooks.done.tapAsync('webpack-bundle-analyzer', done);
    } else {
      compiler.plugin('done', done);
    }
  }
}
The plugin's data comes from stats.toJson(), and the function that generates the chart data is getViewerData(). The sections below walk through how this function is implemented.
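To make the entry point easier to picture, here is a minimal sketch that runs without webpack installed. SketchAnalyzerPlugin and createFakeCompiler are hypothetical names invented for this example; the fake compiler only imitates the shape of hooks.done.tapAsync and is not how webpack implements its hooks.

```javascript
// Hypothetical plugin: it only records asset names from stats.toJson().
class SketchAnalyzerPlugin {
  constructor() {
    this.assetNames = [];
  }

  apply(compiler) {
    const done = (stats, callback) => {
      // stats.toJson() is the same data source the real plugin starts from
      const json = stats.toJson();
      this.assetNames = json.assets.map(asset => asset.name);
      callback();
    };

    if (compiler.hooks) {
      compiler.hooks.done.tapAsync('sketch-analyzer', done);
    } else {
      compiler.plugin('done', done);
    }
  }
}

// Hypothetical stand-in for a webpack compiler (assumption, not webpack code).
function createFakeCompiler() {
  const handlers = [];
  return {
    hooks: {
      done: {
        tapAsync(name, handler) {
          handlers.push(handler);
        }
      }
    },
    // Simulates the end of a build by calling every registered handler
    finish(stats, onFinished) {
      let pending = handlers.length;
      if (pending === 0) {
        onFinished();
        return;
      }
      handlers.forEach(handler => handler(stats, () => {
        pending -= 1;
        if (pending === 0) onFinished();
      }));
    }
  };
}

const plugin = new SketchAnalyzerPlugin();
const fakeCompiler = createFakeCompiler();
plugin.apply(fakeCompiler);

// Made-up stats object exposing only what the sketch reads
const fakeStats = {
  toJson: () => ({ assets: [{ name: 'app.js' }, { name: 'vendor.js' }] })
};

let finished = false;
fakeCompiler.finish(fakeStats, () => {
  finished = true;
});

console.log(plugin.assetNames, finished);
```

The point of the sketch is only the ordering: the hook fires after the build, the handler reads stats.toJson(), and it must call the callback so that the build can finish.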
1. Determine the asset structure
const _ = require('lodash');
const FILENAME_QUERY_REGEXP = /\?.*$/u;
const FILENAME_EXTENSIONS = /\.(js|mjs)$/iu;
// Function parameter: bundleStats
// Sometimes all the information is located in `children` array (e.g. problem in #10)
// assets is empty and children exists: in this case all asset information lives in children
if (_.isEmpty(bundleStats.assets) && !_.isEmpty(bundleStats.children)) {
  const { children } = bundleStats;
  bundleStats = bundleStats.children[0];
  // Sometimes if there are additional child chunks produced add them as child assets,
  // leave the 1st one as that is considered the 'root' asset.
  // The first element of children becomes the root; assets of the remaining
  // children are pushed into bundleStats.assets
  for (let i = 1; i < children.length; i++) {
    children[i].assets.forEach((asset) => {
      asset.isChild = true;
      bundleStats.assets.push(asset);
    });
  }
} else if (!_.isEmpty(bundleStats.children)) {
  // Sometimes if there are additional child chunks produced add them as child assets
  bundleStats.children.forEach((child) => {
    child.assets.forEach((asset) => {
      asset.isChild = true;
      bundleStats.assets.push(asset);
    });
  });
}
// Picking only `*.js or *.mjs` assets from bundle that has non-empty `chunks` array
bundleStats.assets = bundleStats.assets.filter(asset => {
  // Filter out non 'asset' type asset if type is provided (Webpack 5 add a type to indicate asset types)
  if (asset.type && asset.type !== 'asset') {
    return false;
  }
  // Removing query part from filename (yes, somebody uses it for some reason and Webpack supports it)
  // See #22
  asset.name = asset.name.replace(FILENAME_QUERY_REGEXP, '');
  return FILENAME_EXTENSIONS.test(asset.name) && !_.isEmpty(asset.chunks);
});
Take one project as an example. The content in the screenshot comes from stats.toJson(), and the code above ends up extracting the three objects marked in the image:
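The filtering rule can also be tried out in isolation. The sketch below is a lodash-free simplification: pickJsAssets and the sampleAssets array are made up for this example, and unlike the plugin code it copies each asset instead of mutating asset.name in place.

```javascript
const FILENAME_QUERY_REGEXP = /\?.*$/u;
const FILENAME_EXTENSIONS = /\.(js|mjs)$/iu;

// Hypothetical helper: keeps only *.js / *.mjs assets with a non-empty chunks array.
function pickJsAssets(assets) {
  return assets
    // Webpack 5 may attach a `type`; anything other than 'asset' is dropped
    .filter(asset => !asset.type || asset.type === 'asset')
    // Strip the query part of the file name, e.g. "app.js?v=1" -> "app.js"
    .map(asset => ({ ...asset, name: asset.name.replace(FILENAME_QUERY_REGEXP, '') }))
    .filter(asset =>
      FILENAME_EXTENSIONS.test(asset.name) &&
      Array.isArray(asset.chunks) &&
      asset.chunks.length > 0
    );
}

// Made-up stand-in for stats.toJson().assets
const sampleAssets = [
  { name: 'js/app.3f2a.js?v=1', size: 1200, chunks: ['app'] },
  { name: 'css/app.css', size: 300, chunks: ['app'] },
  { name: 'js/chunk-vendors.js', size: 9000, chunks: ['chunk-vendors'] },
  { name: 'js/orphan.js', size: 10, chunks: [] },
  { name: 'index.html', size: 500, chunks: [] }
];

const pickedNames = pickJsAssets(sampleAssets).map(asset => asset.name);
console.log(pickedNames);
```

The CSS and HTML files fail the extension check, and js/orphan.js is dropped because its chunks array is empty, which leaves only the two script assets.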
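2. AST analysis with Acorn
Next, the plugin iterates over an array whose elements are the three objects above.
It first checks whether compiler.outputPath exists. If it does, it parses each JS file with the acorn library and calls the recursive method of acorn-walk to process the resulting AST recursively.
The AST-related source code: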
const fs = require('fs');
const _ = require('lodash');
const acorn = require('acorn');
const walk = require('acorn-walk');
// Takes the path of a bundle file
function parseBundle(bundlePath) {
const content = fs.readFileSync(bundlePath, 'utf8');
// Parse the JS file content with acorn
const ast = acorn.parse(content, {
sourceType: 'script',
// I believe in a bright future of ECMAScript!
// Actually, it's set to `2050` to support the latest ECMAScript version that currently exists.
// Seems like `acorn` supports such weird option value.
ecmaVersion: 2050
});
const walkState = {
locations: null,
expressionStatementDepth: 0
};
// Walk the AST recursively
walk.recursive(
ast,
walkState,
{
// Expression statement node
ExpressionStatement(node, state, c) {
if (state.locations) return;
state.expressionStatementDepth++; // increase the expression statement depth by 1
if (
// Webpack 5 stores modules in the top-level IIFE
state.expressionStatementDepth === 1 &&
ast.body.includes(node) &&
isIIFE(node)
) {
const fn = getIIFECallExpression(node);
if (
// It should not contain neither arguments
fn.arguments.length === 0 &&
// ...nor parameters
fn.callee.params.length === 0
) {
// Modules are stored in the very first variable declaration as hash
const firstVariableDeclaration = fn.callee.body.body.find(node => node.type === 'VariableDeclaration');
if (firstVariableDeclaration) {
for (const declaration of firstVariableDeclaration.declarations) {
if (declaration.init) {
state.locations = getModulesLocations(declaration.init);
if (state.locations) {
break;
}
}
}
}
}
}
if (!state.locations) {
c(node.expression, state);
}
state.expressionStatementDepth--;
},
// Assignment expression node
AssignmentExpression(node, state) {
if (state.locations) return;
// Modules are stored in exports.modules:
// exports.modules = {};
const { left, right } = node;
if (
left &&
left.object && left.object.name === 'exports' &&
left.property && left.property.name === 'modules' &&
isModulesHash(right)
) {
state.locations = getModulesLocations(right);
}
},
// Call expression node
CallExpression(node, state, c) {
if (state.locations) return;
const args = node.arguments;
// Main chunk with webpack loader.
// Modules are stored in first argument:
// (function (...) {...})()
if (
node.callee.type === 'FunctionExpression' &&
!node.callee.id &&
args.length === 1 &&
isSimpleModulesList(args[0])
) {
state.locations = getModulesLocations(args[0]);
return;
}
// Async Webpack < v4 chunk without webpack loader.
// webpackJsonp([], , ...)
// As function name may be changed with `output.jsonpFunction` option we can't rely on it's default name.
if (
node.callee.type === 'Identifier' &&
mayBeAsyncChunkArguments(args) &&
isModulesList(args[1])
) {
state.locations = getModulesLocations(args[1]);
return;
}
// Async Webpack v4 chunk without webpack loader.
// (window.webpackJsonp=window.webpackJsonp||[]).push([[], , ...]);
// As function name may be changed with `output.jsonpFunction` option we can't rely on it's default name.
if (isAsyncChunkPushExpression(node)) {
state.locations = getModulesLocations(args[0].elements[1]);
return;
}
// Webpack v4 WebWorkerChunkTemplatePlugin
// globalObject.chunkCallbackName([],, ...);
// Both globalObject and chunkCallbackName can be changed through the config, so we can't check them.
if (isAsyncWebWorkerChunkExpression(node)) {
state.locations = getModulesLocations(args[1]);
return;
}
// Walking into arguments because some of plugins (e.g. `DedupePlugin`) or some Webpack
// features (e.g. `umd` library output) can wrap modules list into additional IIFE.
args.forEach(arg => c(arg, state));
}
}
);
let modules;
if (walkState.locations) {
modules = _.mapValues(walkState.locations,
loc => content.slice(loc.start, loc.end)
);
} else {
modules = {};
}
return {
modules, // modules extracted from the bundle
src: content, // full bundle content
runtimeSrc: getBundleRuntime(content, walkState.locations) // bundle code without the modules
};
}
/**
* Returns bundle source except modules
*/
function getBundleRuntime(content, modulesLocations) {
const sortedLocations = Object.values(modulesLocations || {})
.sort((a, b) => a.start - b.start);
let result = '';
let lastIndex = 0;
for (const { start, end } of sortedLocations) {
result += content.slice(lastIndex, start);
lastIndex = end;
}
return result + content.slice(lastIndex, content.length);
}
function isIIFE(node) {
return (
node.type === 'ExpressionStatement' &&
(
node.expression.type === 'CallExpression' ||
(node.expression.type === 'UnaryExpression' && node.expression.argument.type === 'CallExpression')
)
);
}
function getIIFECallExpression(node) {
if (node.expression.type === 'UnaryExpression') {
return node.expression.argument;
} else {
return node.expression;
}
}
function isModulesList(node) {
return (
isSimpleModulesList(node) ||
// Modules are contained in expression `Array([minimum ID]).concat([, , ...])`
isOptimizedModulesArray(node)
);
}
function isSimpleModulesList(node) {
return (
// Modules are contained in hash. Keys are module ids.
isModulesHash(node) ||
// Modules are contained in array. Indexes are module ids.
isModulesArray(node)
);
}
function isModulesHash(node) {
return (
node.type === 'ObjectExpression' &&
node.properties
.map(node => node.value)
.every(isModuleWrapper)
);
}
function isModulesArray(node) {
return (
node.type === 'ArrayExpression' &&
node.elements.every(elem =>
// Some of array items may be skipped because there is no module with such id
!elem ||
isModuleWrapper(elem)
)
);
}
function isOptimizedModulesArray(node) {
// Checking whether modules are contained in `Array().concat(...modules)` array:
// https://github.com/webpack/webpack/blob/v1.14.0/lib/Template.js#L91
// The `` + array indexes are module ids
return (
node.type === 'CallExpression' &&
node.callee.type === 'MemberExpression' &&
// Make sure the object called is `Array()`
node.callee.object.type === 'CallExpression' &&
node.callee.object.callee.type === 'Identifier' &&
node.callee.object.callee.name === 'Array' &&
node.callee.object.arguments.length === 1 &&
isNumericId(node.callee.object.arguments[0]) &&
// Make sure the property X called for `Array().X` is `concat`
node.callee.property.type === 'Identifier' &&
node.callee.property.name === 'concat' &&
// Make sure exactly one array is passed in to `concat`
node.arguments.length === 1 &&
isModulesArray(node.arguments[0])
);
}
function isModuleWrapper(node) {
return (
// It's an anonymous function expression that wraps module
((node.type === 'FunctionExpression' || node.type === 'ArrowFunctionExpression') && !node.id) ||
// If `DedupePlugin` is used it can be an ID of duplicated module...
isModuleId(node) ||
// or an array of shape [, ...args]
(node.type === 'ArrayExpression' && node.elements.length > 1 && isModuleId(node.elements[0]))
);
}
// Checks whether the node is a module id
function isModuleId(node) {
return (node.type === 'Literal' && (isNumericId(node) || typeof node.value === 'string'));
}
// Checks whether the node is a numeric id
function isNumericId(node) {
return (node.type === 'Literal' && Number.isInteger(node.value) && node.value >= 0);
}
function isChunkIds(node) {
// Array of numeric or string ids. Chunk IDs are strings when NamedChunksPlugin is used
return (
node.type === 'ArrayExpression' &&
node.elements.every(isModuleId)
);
}
function isAsyncChunkPushExpression(node) {
const {
callee,
arguments: args
} = node;
return (
callee.type === 'MemberExpression' &&
callee.property.name === 'push' &&
callee.object.type === 'AssignmentExpression' &&
args.length === 1 &&
args[0].type === 'ArrayExpression' &&
mayBeAsyncChunkArguments(args[0].elements) &&
isModulesList(args[0].elements[1])
);
}
function mayBeAsyncChunkArguments(args) {
return (
args.length >= 2 &&
isChunkIds(args[0])
);
}
function isAsyncWebWorkerChunkExpression(node) {
const { callee, type, arguments: args } = node;
return (
type === 'CallExpression' &&
callee.type === 'MemberExpression' &&
args.length === 2 &&
isChunkIds(args[0]) &&
isModulesList(args[1])
);
}
// Collects the location of every module
function getModulesLocations(node) {
if (node.type === 'ObjectExpression') {
// Modules hash
const modulesNodes = node.properties;
return modulesNodes.reduce((result, moduleNode) => {
const moduleId = moduleNode.key.name || moduleNode.key.value;
result[moduleId] = getModuleLocation(moduleNode.value);
return result;
}, {});
}
const isOptimizedArray = (node.type === 'CallExpression');
if (node.type === 'ArrayExpression' || isOptimizedArray) {
// Modules array or optimized array
const minId = isOptimizedArray ?
// Get the [minId] value from the Array() call first argument literal value
node.callee.object.arguments[0].value :
// `0` for simple array
0;
const modulesNodes = isOptimizedArray ?
// The modules reside in the `concat()` function call arguments
node.arguments[0].elements :
node.elements;
return modulesNodes.reduce((result, moduleNode, i) => {
if (moduleNode) {
result[i + minId] = getModuleLocation(moduleNode);
}
return result;
}, {});
}
return {};
}
function getModuleLocation(node) {
return {
start: node.start,
end: node.end
};
}
module.exports = parseBundle;
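The following three visitor functions are used to analyze the built JS file:
- ExpressionStatement: expression statement node
- AssignmentExpression: assignment expression node
- CallExpression: function call expression node
Hard to follow? Look at the examples below first.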
Before parsing:
module.exports = test; // test is a function
After parsing:
{
"type": "Program",
"start": 0,
"end": 201,
"body": [
{
"type": "ExpressionStatement",
"start": 179,
"end": 201,
"expression": {
"type": "AssignmentExpression",
"start": 179,
"end": 200,
"operator": "=",
"left": {
"type": "MemberExpression",
"start": 179,
"end": 193,
"object": {
"type": "Identifier",
"start": 179,
"end": 185,
"name": "module"
},
"property": {
"type": "Identifier",
"start": 186,
"end": 193,
"name": "exports"
},
"computed": false,
"optional": false
},
"right": {
"type": "Identifier",
"start": 196,
"end": 200,
"name": "test"
}
}
}
],
"sourceType": "module"
}
Before parsing:
(
function(e){}
)(
0, (function () {})()
)
After parsing:
{
"type": "Program",
"start": 0,
"end": 46,
"body": [
{
"type": "ExpressionStatement",
"start": 0,
"end": 46,
"expression": {
"type": "CallExpression",
"start": 0,
"end": 46,
"callee": {
"type": "FunctionExpression",
"start": 4,
"end": 17,
"id": null,
"expression": false,
"generator": false,
"async": false,
"params": [
{
"type": "Identifier",
"start": 13,
"end": 14,
"name": "e"
}
],
"body": {
"type": "BlockStatement",
"start": 15,
"end": 17,
"body": []
}
},
"arguments": [
{
"type": "Literal",
"start": 23,
"end": 24,
"value": 0,
"raw": "0"
},
{
"type": "CallExpression",
"start": 26,
"end": 44,
"callee": {
"type": "FunctionExpression",
"start": 27,
"end": 41,
"id": null,
"expression": false,
"generator": false,
"async": false,
"params": [],
"body": {
"type": "BlockStatement",
"start": 39,
"end": 41,
"body": []
}
},
"arguments": [],
"optional": false
}
],
"optional": false
}
}
],
"sourceType": "module"
}
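The two examples above each show a piece of code and the AST generated for it. Comparing them with this part of the webpack-bundle-analyzer source, my understanding is that the author enumerated the structures webpack output can take and wrote a matching parsing routine for each one in order to obtain the dependencies.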
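The idea of matching one output shape can be shown on hand-written nodes, without running acorn at all. In the sketch below the node objects are typed by hand in ESTree form, findModulesArgument is a hypothetical helper, and isModuleWrapper is deliberately simplified to anonymous functions only (the plugin's version also accepts module ids and arrays).

```javascript
// Simplified: a module wrapper is an anonymous function or arrow function.
function isModuleWrapper(node) {
  return (
    (node.type === 'FunctionExpression' || node.type === 'ArrowFunctionExpression') &&
    !node.id
  );
}

// A "modules hash" is an object literal whose values are all module wrappers.
function isModulesHash(node) {
  return (
    node.type === 'ObjectExpression' &&
    node.properties.every(property => isModuleWrapper(property.value))
  );
}

// Hypothetical helper: matches the shape (function (...) {...})({ id: function () {} })
// and returns the node that holds the modules, or null when the shape differs.
function findModulesArgument(callNode) {
  const { callee, arguments: args } = callNode;
  if (
    callee.type === 'FunctionExpression' &&
    !callee.id &&
    args.length === 1 &&
    isModulesHash(args[0])
  ) {
    return args[0];
  }
  return null;
}

// Hand-written nodes standing in for parser output (made up for this example)
const anonymousFunction = () => ({
  type: 'FunctionExpression',
  id: null,
  params: [],
  body: { type: 'BlockStatement', body: [] }
});

const bundleCall = {
  type: 'CallExpression',
  callee: anonymousFunction(),
  arguments: [{
    type: 'ObjectExpression',
    properties: [
      { type: 'Property', key: { type: 'Literal', value: '0a3c' }, value: anonymousFunction() },
      { type: 'Property', key: { type: 'Identifier', name: 'b2d1' }, value: anonymousFunction() }
    ]
  }]
};

const plainCall = {
  type: 'CallExpression',
  callee: { type: 'Identifier', name: 'foo' },
  arguments: []
};

const modulesNode = findModulesArgument(bundleCall);
const moduleIds = modulesNode.properties.map(property => property.key.name || property.key.value);
console.log(moduleIds, findModulesArgument(plainCall));
```

Each branch of the real CallExpression visitor is a check of this kind for a different webpack output shape.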
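The purpose of parseBundle is therefore to locate the code block of each dependency. Such a block is a piece of code inside the final build output, that is, inside some JS file. Since the content of a JS file is just a string, this comes down to slicing a string. Every node produced by the AST analysis has start and end properties that give the position of the block in the source file, so slice(start, end) returns the actual code of the corresponding module.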
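A small sketch of the slicing step, under the assumption that the start and end offsets are already known. Here they are found with indexOf through a hypothetical locate helper instead of an AST, and the bundle string is made up. getBundleRuntime is the same logic as in the source listed above.

```javascript
// Made-up bundle content with two modules in a modules hash
const content = [
  '(function(modules){ /* runtime */ })({',
  '"0a3c": function(module, exports) { exports.a = 1; },',
  '"b2d1": function(module, exports) { exports.b = 2; }',
  '});'
].join('\n');

// Hypothetical stand-in for the AST: find each module's offsets by text search
function locate(source, snippet) {
  const start = source.indexOf(snippet);
  return { start, end: start + snippet.length };
}

const moduleA = 'function(module, exports) { exports.a = 1; }';
const moduleB = 'function(module, exports) { exports.b = 2; }';
const locations = {
  '0a3c': locate(content, moduleA),
  'b2d1': locate(content, moduleB)
};

// Module id -> source code, obtained purely by slicing the string
const modules = {};
for (const [id, loc] of Object.entries(locations)) {
  modules[id] = content.slice(loc.start, loc.end);
}

// Everything that is left once the module bodies are cut out
function getBundleRuntime(source, modulesLocations) {
  const sortedLocations = Object.values(modulesLocations || {})
    .sort((a, b) => a.start - b.start);
  let result = '';
  let lastIndex = 0;
  for (const { start, end } of sortedLocations) {
    result += source.slice(lastIndex, start);
    lastIndex = end;
  }
  return result + source.slice(lastIndex, source.length);
}

const runtimeSrc = getBundleRuntime(content, locations);
console.log(modules['0a3c']);
console.log(runtimeSrc);
```

The lengths add up: the runtime source plus the two module bodies is exactly the original content, which is why the sizes shown in the treemap can be attributed to individual modules.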
This step ultimately produces an object with three properties:
{
  modules, // modules extracted from the bundle
  src: content, // full bundle content
  runtimeSrc: getBundleRuntime(content, walkState.locations) // bundle code without the modules
}
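Looking at the output, bundleInfo is logged three times in total, which matches the earlier filtering result. But how are modules that have already been assigned ids, such as 0a3c, associated with the actual files in node_modules? Read on.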
3. Iterate over the assets property and start assembling the objects
// bundleStats is the content of stats.toJson()
const assets = bundleStats.assets.reduce((result, statAsset) => {
// If asset is a childAsset, then calculate appropriate bundle modules by looking through stats.children
const assetBundles = statAsset.isChild ? getChildAssetBundles(bundleStats, statAsset.name) : bundleStats;
const modules = assetBundles ? getBundleModules(assetBundles) : []; // the full set of modules
const asset = result[statAsset.name] = _.pick(statAsset, 'size');
const assetSources = bundlesSources && _.has(bundlesSources, statAsset.name) ?
bundlesSources[statAsset.name] : null;
if (assetSources) {
asset.parsedSize = Buffer.byteLength(assetSources.src);
asset.gzipSize = gzipSize.sync(assetSources.src);
}
// Picking modules from current bundle script
// Filter by the chunks field (an array): statAsset and statModule both have chunks, and the two are compared
const assetModules = modules.filter(statModule => assetHasModule(statAsset, statModule));
// Adding parsed sources
if (parsedModules) {
const unparsedEntryModules = [];
for (const statModule of assetModules) {
if (parsedModules[statModule.id]) {
statModule.parsedSrc = parsedModules[statModule.id]; // reuse the module source extracted by the AST parsing
} else if (isEntryModule(statModule)) { // a module with depth 0 is treated as an entry module
unparsedEntryModules.push(statModule);
}
}
// Webpack 5 changed bundle format and now entry modules are concatenated and located at the end of it.
// Because of this they basically become a concatenated module, for which we can't even precisely determine its
// parsed source as it's located in the same scope as all Webpack runtime helpers.
if (unparsedEntryModules.length && assetSources) {
if (unparsedEntryModules.length === 1) {
// So if there is only one entry we consider its parsed source to be all the bundle code excluding code
// from parsed modules.
unparsedEntryModules[0].parsedSrc = assetSources.runtimeSrc;
} else {
// If there are multiple entry points we move all of them under synthetic concatenated module.
_.pullAll(assetModules, unparsedEntryModules);
assetModules.unshift({
identifier: './entry modules',
name: './entry modules',
modules: unparsedEntryModules,
size: unparsedEntryModules.reduce((totalSize, module) => totalSize + module.size, 0),
parsedSrc: assetSources.runtimeSrc
});
}
}
}
asset.modules = assetModules;
asset.tree = createModulesTree(asset.modules);
return result;
}, {});
function getChildAssetBundles(bundleStats, assetName) {
return (bundleStats.children || []).find((c) =>
_(c.assetsByChunkName)
.values()
.flatten()
.includes(assetName)
);
}
function getBundleModules(bundleStats) {
return _(bundleStats.chunks)
.map('modules') // merge the modules arrays under chunks with bundleStats.modules
.concat(bundleStats.modules)
.compact()
.flatten()
.uniqBy('id')
// Filtering out Webpack's runtime modules as they don't have ids and can't be parsed (introduced in Webpack 5)
.reject(isRuntimeModule)
.value();
}
// Checks whether the asset contains the module
function assetHasModule(statAsset, statModule) {
// Checking if this module is the part of asset chunks
return (statModule.chunks || []).some(moduleChunk =>
statAsset.chunks.includes(moduleChunk)
);
}
function isEntryModule(statModule) {
return statModule.depth === 0;
}
function isRuntimeModule(statModule) {
return statModule.moduleType === 'runtime';
}
function createModulesTree(modules) {
const root = new Folder('.');
modules.forEach(module => root.addModule(module));
root.mergeNestedFolders();
return root;
}
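The join between the stats data and the parsed bundle can be reduced to a few lines. In this sketch every object is made up for illustration: statModules plays the role of the modules from stats.toJson(), and parsedModules plays the role of the id-to-source map returned by parseBundle. The id is what links a numbered module such as 0a3c back to a file name.

```javascript
// Same check as in the source above: does the module share a chunk with the asset?
function assetHasModule(statAsset, statModule) {
  return (statModule.chunks || []).some(moduleChunk =>
    statAsset.chunks.includes(moduleChunk)
  );
}

// Made-up asset and modules in the shape of stats.toJson() entries
const statAsset = { name: 'js/app.js', size: 1200, chunks: ['app'] };
const statModules = [
  { id: '0a3c', name: './node_modules/lodash/lodash.js', size: 500, chunks: ['app'] },
  { id: 'b2d1', name: './src/main.js', size: 100, chunks: ['app'] },
  { id: 'ffff', name: './src/admin.js', size: 80, chunks: ['admin'] }
];

// Made-up result of the AST step: module id -> code sliced out of the bundle
const parsedModules = {
  '0a3c': 'function(e,t){/* lodash */}',
  'b2d1': 'function(e,t){/* main */}'
};

const assetModules = statModules
  // keep only the modules that live in one of the asset's chunks
  .filter(statModule => assetHasModule(statAsset, statModule))
  // attach the parsed source when the bundle contained a module with this id
  .map(statModule => (
    parsedModules[statModule.id]
      ? { ...statModule, parsedSrc: parsedModules[statModule.id] }
      : statModule
  ));

console.log(assetModules.map(m => `${m.id} -> ${m.name}`));
```

The stats side contributes the readable name and the original size, the parsed side contributes the code that actually ended up in the bundle.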
3.1 Implementation of createModulesTree
function createModulesTree(modules) {
const root = new Folder('.');
modules.forEach(module => root.addModule(module));
root.mergeNestedFolders();
return root;
}
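This function iterates over every module. The figure below shows what a module looks like at this point:
(Figure: module structure)
Look at the name field: one value carries the multi prefix and the other does not.
Here is the result of running addModule:
(Figure: running addModule)
getModulePathParts() splits the name field into path parts, and folders are built from those parts to reflect where each file belongs.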
addModule(moduleData) {
  const pathParts = getModulePathParts(moduleData); // build the array of real path parts
  if (!pathParts) {
    return;
  }
  // If name starts with multi, folders is empty and fileName is the whole name, e.g. "multi ./src/main.js"
  const [folders, fileName] = [pathParts.slice(0, -1), _.last(pathParts)];
  let currentFolder = this; // starts at the node the method was called on, i.e. root
  // Walk the folder path parts in order to create every child folder node
  folders.forEach(folderName => {
    // Check whether this folder already has a child with that name
    let childNode = currentFolder.getChild(folderName);
    if (
      // Folder is not created yet
      !childNode ||
      // In some situations (invalid usage of dynamic `require()`) webpack generates a module with empty require
      // context, but it's moduleId points to a directory in filesystem.
      // In this case we replace this `File` node with `Folder`.
      // See `test/stats/with-invalid-dynamic-require.json` as an example.
      !(childNode instanceof Folder)
    ) {
      childNode = currentFolder.addChildFolder(new Folder(folderName));
    }
    currentFolder = childNode; // descend into this folder before handling the next path part
  });
  const ModuleConstructor = moduleData.modules ? ConcatenatedModule : Module; // pick the class to instantiate
  const module = new ModuleConstructor(fileName, moduleData, this); // create the module node
  currentFolder.addChildModule(module); // register the module under the innermost folder
}
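The core of this code is descending through the path one level at a time, creating each directory and its file along the way. I recorded a video of a debugging session, which shows the process more directly. Here is the final result: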
https://www.noxxxx.com/wp-content/uploads/2021/09/folder_demo.mp4
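For readers who cannot play the video, the folder-building loop can be sketched with plain objects. This is a simplification under stated assumptions: toPathParts is a hypothetical stand-in for getModulePathParts() that only splits on "/" and special-cases the multi prefix, and folders are plain objects holding a Map instead of the plugin's Folder and Module classes.

```javascript
// Hypothetical, simplified stand-in for getModulePathParts()
function toPathParts(name) {
  // Names starting with "multi " are kept whole: no folders, only a file name
  if (name.startsWith('multi ')) return [name];
  return name.split('/').filter(part => part !== '' && part !== '.');
}

// A folder is any node that has a `children` map; files only carry a size
function createFolder() {
  return { children: new Map() };
}

function addModule(root, moduleData) {
  const parts = toPathParts(moduleData.name);
  if (parts.length === 0) return;

  const folders = parts.slice(0, -1);
  const fileName = parts[parts.length - 1];

  let current = root;
  for (const folderName of folders) {
    let child = current.children.get(folderName);
    // Create the folder if it is missing or if a file node occupies that name
    if (!child || !child.children) {
      child = createFolder();
      current.children.set(folderName, child);
    }
    current = child; // descend one level
  }
  current.children.set(fileName, { size: moduleData.size });
}

// Made-up modules in the shape used above
const root = createFolder();
[
  { name: 'multi ./src/main.js', size: 40 },
  { name: './src/main.js', size: 100 },
  { name: './src/utils/a.js', size: 30 },
  { name: './node_modules/lodash/lodash.js', size: 500 }
].forEach(moduleData => addModule(root, moduleData));

const topLevelNames = [...root.children.keys()];
console.log(topLevelNames);
```

The multi entry ends up directly under the root because it has no folder parts, while the other modules share the src and node_modules folders created on first use.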
Original article: https://cloud.tencent.com/developer/article/2023233