Loop Invariant Code Motion

2017年08月17日留言

https://go-review.googlesource.com/c/go/+/694016

循环不变代码外提 (Loop-invariant code motion)

https://www.google.com/search?q=Loop+Invariant+Code+Motion

Loop Invariant Code Motion（循环不变代码移动）是一种编译器优化技术。其主要目的是提高程序的执行效率。

基本概念

循环不变代码：在循环体内每次迭代都不会改变其结果的代码。例如，计算一个常数值或获取某个不变的数组元素。

优化过程

识别不变代码：编译器分析循环，找出循环内每次迭代结果不变的语句。
移动代码：将这些不变代码移动到循环外部，只在循环开始时执行一次。

例子

假设有如下代码：

1
2
3

for i in range(n):
    result = compute(x)  # compute(x) 是不变的
    process(result)

优化后，可以改为：

1
2
3

result = compute(x)  # 移到循环外
for i in range(n):
    process(result)

优势

减少了循环体内的计算量，从而提升了程序的运行效率。
降低了不必要的重复计算，节约了时间和资源。

这种优化在处理复杂循环时特别有效，常用于编译器的优化阶段。

下面是一个简单的 Go 语言示例，展示如何使用循环不变代码移动进行优化。

原始代码

package main

import "fmt"

func main() {
    x := 10
    n := 1000000
    sum := 0

    for i := 0; i < n; i++ {
        result := expensiveComputation(x) // 不变的计算
        sum += result
    }

    fmt.Println(sum)
}

func expensiveComputation(x int) int {
    // 假设这是一个耗时的计算
    return x * x
}

优化后的代码

package main

import "fmt"

func main() {
    x := 10
    n := 1000000
    sum := 0

    // 移动不变的计算到循环外部
    result := expensiveComputation(x)

    for i := 0; i < n; i++ {
        sum += result // 只使用计算结果
    }

    fmt.Println(sum)
}

func expensiveComputation(x int) int {
    return x * x
}

解释

在原始代码中，expensiveComputation(x) 在每次循环中被调用，而其结果在每次迭代中都是相同的。通过将这个计算移到循环外部，优化后的代码只进行一次计算，从而提高了性能。

分析器:

当然可以。使用 Go 语言自带的 go/analysis 包，我们可以编写一个自定义的静态分析器 (static analyzer) 来检测这种可以进行“循环不变量代码外提”的模式。

静态分析器是一种在不实际运行代码的情况下，通过分析源代码的结构（例如抽象语法树 AST）来找出潜在问题的工具。go vet 就是一个著名的静态分析器。

下面我将为你展示如何编写一个基本的分析器来检测这个问题。

分析器的核心逻辑

我们的分析器需要完成以下几件事情：

遍历代码，找到所有的 for 循环。
对于每一个 for 循环，确定在循环体内哪些变量被修改了。 这包括循环变量（如 i）、在循环内部被赋值 (=, :=, ++, --) 的变量。
在循环体内寻找函数调用 (CallExpr)。
检查这些函数调用的所有参数。 判断这些参数是否依赖于在循环体内被修改的变量。
如果一个函数调用的所有参数都不依赖于循环内被修改的变量，那么这个调用就是“循环不变量”的，我们应该报告它。

分析器实现

下面是一个完整的、可运行的分析器代码。

licm_analyzer.go

// package licm_analyzer defines an analyzer that detects loop-invariant code motion opportunities.
package main

import (
	"fmt"
	"go/ast"
	"go/types"

	"golang.org/x/tools/go/analysis"
	"golang.org/x/tools/go/analysis/passes/inspect"
	"golang.org/x/tools/go/analysis/singlechecker"
	"golang.org/x/tools/go/ast/inspector"
)

// Analyzer is the entry point for our analysis.
var Analyzer = &analysis.Analyzer{
	Name:     "licm",
	Doc:      "detects potential loop-invariant code motion opportunities",
	Requires: []*analysis.Analyzer{inspect.Analyzer},
	Run:      run,
}

func main() {
	// This allows us to run the analyzer as a standalone tool.
	singlechecker.Main(Analyzer)
}

func run(pass *analysis.Pass) (interface{}, error) {
	// Get the inspector for traversing the AST.
	insp := pass.ResultOf[inspect.Analyzer].(*inspector.Inspector)

	// We are interested in `for` statements.
	nodeFilter := []ast.Node{
		(*ast.ForStmt)(nil),
	}

	insp.Preorder(nodeFilter, func(n ast.Node) {
		forStmt := n.(*ast.ForStmt)

		// 1. Find all variables modified inside the loop.
		// A map to store the objects of variables that are modified.
		modifiedInLoop := make(map[types.Object]bool)

		// Helper to add a variable object to the modified set.
		addModified := func(ident *ast.Ident) {
			if ident == nil {
				return
			}
			// Use types.Info to resolve the identifier to its defining object.
			// This correctly handles variable shadowing.
			if obj := pass.TypesInfo.Defs[ident]; obj != nil {
				modifiedInLoop[obj] = true
			}
		}

		// Add loop variables from the init/post statements.
		// e.g., for i := 0; ...; i++ { ... }
		if initStmt, ok := forStmt.Init.(*ast.AssignStmt); ok {
			for _, lhs := range initStmt.Lhs {
				if ident, ok := lhs.(*ast.Ident); ok {
					addModified(ident)
				}
			}
		}
		if postStmt, ok := forStmt.Post.(*ast.IncDecStmt); ok {
			if ident, ok := postStmt.X.(*ast.Ident); ok {
				addModified(pass.TypesInfo.Uses[ident])
			}
		}

		// Find assignments inside the loop body.
		ast.Inspect(forStmt.Body, func(node ast.Node) bool {
			switch stmt := node.(type) {
			case *ast.AssignStmt: // x = y or x := y
				for _, lhs := range stmt.Lhs {
					if ident, ok := lhs.(*ast.Ident); ok {
						addModified(ident)
					}
				}
			case *ast.IncDecStmt: // x++ or x--
				if ident, ok := stmt.X.(*ast.Ident); ok {
					// For uses, we need to resolve the object
					addModified(pass.TypesInfo.Uses[ident])
				}
			}
			return true
		})

		// 2. Find all function calls inside the loop and check if they are invariant.
		ast.Inspect(forStmt.Body, func(node ast.Node) bool {
			call, ok := node.(*ast.CallExpr)
			if !ok {
				return true
			}

			isInvariant := true
			// 3. Check all arguments of the function call.
			for _, arg := range call.Args {
				// Inspect the argument expression tree.
				ast.Inspect(arg, func(argNode ast.Node) bool {
					// If an identifier within an argument refers to a variable
					// that is modified in the loop, then the call is NOT invariant.
					if ident, ok := argNode.(*ast.Ident); ok {
						if obj := pass.TypesInfo.Uses[ident]; obj != nil {
							if modifiedInLoop[obj] {
								isInvariant = false
								return false // Stop inspecting this argument
							}
						}
					}
					return true // Continue inspecting
				})

				if !isInvariant {
					break // No need to check other arguments
				}
			}

			// 4. Report if the call is invariant.
			if isInvariant {
				// Don't flag built-in functions like `len` or `append` as they are often cheap.
				// A more robust check would be needed for production use.
				if ident, ok := call.Fun.(*ast.Ident); ok {
					if obj := pass.TypesInfo.Uses[ident]; obj != nil {
						if _, ok := obj.(*types.Builtin); ok {
							return true
						}
					}
				}

				// Avoid reporting on log functions or other known side-effect functions.
				// This is a simplified check.
				funcName := types.ExprString(call.Fun)
				if funcName == "fmt.Println" { // Example of a function to ignore
					return true
				}

				pass.Reportf(call.Pos(), "call to %s appears to be loop-invariant", funcName)
			}

			return true
		})

	})
	return nil, nil
}

如何运行分析器

保存分析器代码：将上面的代码保存为 licm_analyzer.go。

创建 go.mod 文件 (如果项目还没有的话):

1 2	go mod init licm.analyzer go get golang.org/x/tools/go/analysis/singlechecker

创建要分析的目标代码：创建一个名为 test.go 的文件，包含我们的测试用例。

test.go

package main

import "fmt"

func expensiveComputation(x int) int {
	// Assume this is a time-consuming calculation
	return x * x
}

func anotherComputation(x, y int) int {
	return x + y
}

func main() {
	x := 10
	n := 1000000
	sum := 0

	// Case 1: This should be detected
	for i := 0; i < n; i++ {
		result := expensiveComputation(x) // Invariant call
		sum += result
		sum++
	}

	// Case 2: This should NOT be detected
	for i := 0; i < n; i++ {
		// 'i' is modified in the loop, so this call is not invariant
		result := anotherComputation(x, i)
		sum += result
	}

	// Case 3: A more complex case, also should be detected
	factor := 2
	for j := 0; j < 100; j++ {
		localSum := 0
		res := anotherComputation(x, factor) // Invariant call
		localSum += res
		fmt.Println(localSum)
	}

	fmt.Println(sum)
}

运行分析：在终端中，运行以下命令。这会编译并运行 licm_analyzer.go，并让它分析当前目录下的所有代码 (./...)。
1
go run ./licm_analyzer.go ./...

预期输出

运行后，你应该会看到如下输出，准确地指出了可以优化的代码位置：

1 2	/path/to/your/project/test.go:21:14: call to expensiveComputation appears to be loop-invariant /path/to/your/project/test.go:34:9: call to anotherComputation appears to be loop-invariant

分析器成功地找出了第一个和第三个循环中的不变量调用，并忽略了第二个循环中依赖于循环变量 i 的调用。

重要限制和说明

函数副作用 (Side Effects)：这个简单的分析器有一个巨大的前提：它假设被调用的函数是纯函数，即函数的输出只依赖于其输入参数，并且没有副作用（如修改全局变量、进行I/O操作等）。我们的分析器无法检测副作用，因此可能会错误地标记一个有副作用的函数（如 fmt.Println("hello")），尽管移动它可能会改变程序的行为。在上面的代码中，我简单地排除了 fmt.Println，但一个生产级的分析器需要更复杂的机制来处理副作用。
指针和别名：分析器没有进行深入的数据流分析，无法处理通过指针或别名修改变量的情况。
复杂度：这是一个教学目的的简化版实现。一个工业级的 LICM 分析器（就像编译器内置的那样）会复杂得多，需要进行控制流分析、数据流分析和别名分析来确保优化的绝对安全。

尽管有这些限制，这个例子完整地展示了利用 go/analysis 框架来构建强大的自定义代码检查工具的基本思想和过程。

原文链接: https://dashen.tech/2017/08/17/Loop-Invariant-Code-Motion/

版权声明: 转载请注明出处.

清澄秋爽

苹果树下的思索者书写是对思维的缓存