-
Notifications
You must be signed in to change notification settings - Fork 16
[wip] Docs/add shared doc of llgo #48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
1351914167
wants to merge
8
commits into
qiniu:main
Choose a base branch
from
1351914167:llgo_share_doc
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
79b9787
add shared doc of llgo
1351914167 1919910
fix problems from AI reviewer
1351914167 ce3cfd8
refine discription
1351914167 d4b938f
refine content in line 70
1351914167 e2acd37
refine content in line 21 & 91
1351914167 a012bbe
refine content in line 37 & 47
1351914167 ba8c01b
add more detailed description
1351914167 9948ab1
add more to shown purpose
1351914167 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,203 @@ | ||
# LLGo 中 Python 编译与运行时集成:从依赖识别到一键交付 | ||
|
||
## 前言 | ||
|
||
LLGo 是一款基于 LLVM 的 Go 编译器,它把 Go 的类型系统和 SSA/IR 构建与 C/C++/Python 生态融合在一起,从“能否编到一起”到“如何舒服地用起来”,中间隔着一整套构建、版本、分发与运行时的工程系统。但目前 LLGo 在 Python 能力中仍存在不足,即对用户 Python 环境的强依赖。为解决这个问题,本文展示了一种用户不可见的 Python 环境构建方案,以“LLGo 中与 Python 相关的编译流程”为主线,串联 C/C++ 与 Python 的关键差异与共同点,并结合 `bundle` 能力说明如何把 Python 一起打包,做到“拿来就跑”。 | ||
|
||
## LLGo 中与 Python 相关的编译流程解析 | ||
|
||
### 顶层入口:把 Python 能力“接进来” | ||
|
||
- 入口函数负责建立 SSA/IR 编译容器,并懒加载运行时与 Python 符号包: | ||
```go | ||
prog.SetPython(func() *types.Package { | ||
return dedup.Check(llssa.PkgPython).Types | ||
}) | ||
``` | ||
这是 LLGo 中已实现的语言编译容器,此处不做赘述。 | ||
|
||
### 构建包:识别依赖、归一化链接、标记是否需要 Python 初始化 | ||
|
||
- 统一遍历待构建的包,按“包类别”决定如何处理: | ||
```go | ||
switch kind, param := cl.PkgKindOf(pkg.Types); kind { | ||
case cl.PkgDeclOnly: | ||
pkg.ExportFile = "" | ||
case cl.PkgLinkIR, cl.PkgLinkExtern, cl.PkgPyModule: | ||
// ... 见下文 | ||
default: | ||
// 常规包 | ||
``` | ||
- 与 Python 直接相关的两类: | ||
- 外链库(link: ...):当参数内出现 `$(pkg-config --libs python3-embed)`,**先准备一套可用的 Python 工具链**,再展开成 `-lpythonX -L...` 等链接参数。 | ||
- Python 模块(py.<module>):若缺失,则我们希望在“独立 Python 环境”内用 pip 安装,从而避免污染系统,实现对用户环境的最小入侵。 | ||
|
||
因此在进行 `pkg-config` 展开之前,我们需要进行 Python环境的构建。 | ||
|
||
关键实现(展开 pkg-config 前的“Python 预构建”四步): | ||
```go | ||
//确保缓存目录存在;若目录为空则下载并解压指定(或默认)Python发行包到缓存目录。 | ||
func EnsureWithFetch(url string) error { | ||
if url == "" { | ||
url = defaultPythonURL() | ||
} | ||
} | ||
|
||
//设置构建所需环境(PATH、PYTHONHOME、PKG_CONFIG_PATH 等),为后续 pkg-config/链接做准备。会在该编译程序的运行时指定python环境 | ||
func EnsureBuildEnv() error { | ||
pyHome := PythonHome() | ||
return applyEnv(pyHome) | ||
} | ||
|
||
//快速校验当前可用的 Python 解释器是否可运行。 | ||
func Verify() error { | ||
cmd := exec.Command(exe, "-c", "import sys; print('OK')") | ||
return cmd.Run() | ||
} | ||
|
||
//(macOS) 把 libpython 的 install_name 改为 @rpath/…,确保链接与运行时能按 rpath 正确定位库。 | ||
func FixLibpythonInstallName(pyHome string) error { | ||
if runtime.GOOS != "darwin" { | ||
return nil | ||
} | ||
} | ||
``` | ||
- **为何需要下载到缓存?** | ||
|
||
为了不对用户的环境做任何侵入性的改变,我们希望编译时所需的运行时应不与用户环境有关,且对用户不可见,故使用 stand alone 的形式将环境构建在用户的 cache中。 | ||
|
||
- **为何需要设置rpath?** | ||
|
||
为保证可用的 Python 构建链,我们选用了 `Python-build-standalone` 作为独立环境供 LLGo 使用,从而不对用户环境做任何修改。 | ||
|
||
在编译时,`EnsureBuildEnv()`保证了程序可以找到我们加载的该 Python 位置,从而展开该 Python 的 `$(pkg-config --libs python3-embed)`。 但 `Python-build-standalone` 在其 `python3-embed`中嵌入的路径为 `/install/...` 前缀,这与 `Python-build-standalone` 的构建有关,与 LLGo 无关。那么编译出的二进制根据该路径去寻找 libpython 时,会找不到库而报错。故我们需要将该内容修改为 `@rpath/...` 以让程序可以找到正确的 libpython 位置。 | ||
|
||
但在此处,并未设置 Rpath 的实际内容,仍为系统默认值,实际设置发生在链接期 | ||
|
||
- 接上文,若检测 `PkgPyModule // Python 模块(LLGoPackage="py.<module>")` , 则使用 pip 下载对应第三方库 | ||
|
||
```go | ||
func PipInstall(spec string) error { | ||
... | ||
return InstallPackages(spec) | ||
} | ||
``` | ||
|
||
- 为何 C/C++ 不需要(额外的)构建环境准备 | ||
|
||
- 这里的分支只做“链接参数解析/归一化”,不负责编译源码;C/C++ 源码的编译早在 cgo 与 LLGoFiles 流程中完成为 .o。 | ||
|
||
- 普通 C/C++ 外部库通常依赖系统/现成目录与已安装的 .a/.so/.dylib,链接器只需 -L/-l 即可,不需要像 Python 那样额外下载解释器、设置 PYTHONHOME、修正 install_name 等。 | ||
|
||
|
||
### 链接阶段:注入“初始化解释器”,并确保运行时能找到库 | ||
|
||
- 汇总所有对象文件与链接参数的同时,聚合“是否需要 Python 初始化”标记(使用到 Python C-API 的包会置 true): | ||
```go | ||
if p.ExportFile != "" && aPkg != nil { | ||
// ... | ||
need1, need2 := isNeedRuntimeOrPyInit(ctx, p) | ||
if !needRuntime { needRuntime = need1 } | ||
if !needPyInit { needPyInit = need2 } | ||
} | ||
``` | ||
- 生成主入口 IR:按需声明并调用 Python 初始化符号(入口早期执行),然后导出为一个 .o 参与最终链接: | ||
```go | ||
if needPyInit { | ||
pyEnvInit = "call void @__llgo_py_init_from_exedir()" | ||
pyEnvInitDecl = "declare void @__llgo_py_init_from_exedir()" | ||
} | ||
``` | ||
- 生成“Python 初始化.o”(C 源即时编译):它会从可执行文件相对位置推导 PYTHONHOME,并完成解释器初始化,入口 IR `__llgo_py_init_from_exedir` 会调用这个 `.o` 文件。 | ||
```go | ||
out := tmp.Name() + ".o" | ||
args := []string{ | ||
"-x", "c", | ||
"-o", out, "-c", tmp.Name(), | ||
} | ||
// 注入 Python 头文件 | ||
inc := filepath.Join(pyenv.PythonHome(), "include", "python3.12") | ||
args = append(args, "-I"+inc) | ||
cmd := ctx.compiler() | ||
if err := cmd.Compile(args...); err != nil { return "", err } | ||
``` | ||
- 注入 rpath:在得到完整 linkArgs 之后再去重追加,既考虑独立 Python 的 lib 路径,也考虑常用的 `@executable_path` 前缀(macOS): | ||
```go | ||
for _, dir := range pyenv.FindPythonRpaths(pyenv.PythonHome()) { | ||
addRpath(&linkArgs, dir) | ||
} | ||
addRpath(&linkArgs, "@executable_path/python/lib") | ||
addRpath(&linkArgs, "@executable_path/lib/python/lib") | ||
``` | ||
- 最终链接(统一交给 clang/交叉链接器),把根据链接参数,将以上对象构建为可执行文件: | ||
```go | ||
buildArgs := []string{"-o", app} | ||
buildArgs = append(buildArgs, linkArgs...) | ||
// 可选:调试符号/交叉编译 LDFLAGS/EXTRAFLAGS | ||
buildArgs = append(buildArgs, ctx.crossCompile.LDFLAGS...) | ||
buildArgs = append(buildArgs, ctx.crossCompile.EXTRAFLAGS...) | ||
buildArgs = append(buildArgs, objFiles...) | ||
return cmd.Link(buildArgs...) | ||
``` | ||
|
||
## 可选打包(Bundle):让用户“开箱即用” | ||
想让用户机器“无需安装/配置 Python”,可以把 Python 打进发布物里: | ||
|
||
- 命令:llgo bundle | ||
- 关键参数: | ||
- -mode dir|exe | ||
- -out 输出路径(仅 exe 模式用) | ||
- -archive zip|tar 与 -archiveOut(dir 产物可选打包归档) | ||
|
||
```go | ||
// llgo bundle | ||
var Cmd = &base.Command{ | ||
UsageLine: "llgo bundle [-mode dir|exe] [-out output] [-archive zip|tar] [-archiveOut file] [packages]", | ||
Short: "Package executable with embedded Python runtime", | ||
} | ||
... | ||
Cmd.Flag.StringVar(&mode, "mode", "dir", "bundle mode: dir|exe") | ||
Cmd.Flag.StringVar(&out, "out", "", "output file for onefile (default: <exe>)") | ||
Cmd.Flag.StringVar(&archive, "archive", "", "archive dist for onedir: zip|rar|tar (default: none)") | ||
Cmd.Flag.StringVar(&archiveOut, "archiveOut", "", "archive output path (default: <exe>.<ext>)") | ||
``` | ||
|
||
- llgo bundle 的对外用法:llgo bundle [-mode dir|exe] [-out] [-archive ...] [-archiveOut] | ||
- dir:在 dist 目录内生成 lib/python/lib 与 lib/python/lib/python3.12 等完整运行时布局;macOS 添加 rpath;可选再归档。 | ||
- exe:生成单一可执行;运行时解压内嵌的 Python 布局到缓存,设置 PYTHONHOME,转而执行内嵌的 app。 | ||
|
||
## C/C++ 与 Python:相同框架,不同要点 | ||
- 相同: | ||
- 统一走“构建包 → 导出 .o → 收集 LinkArgs → 链接”的编译框架; | ||
- 外部库都走 `link: ...` 归一化为 `-L/-l/...`。 | ||
- 不同: | ||
- 运行期需求:C/C++ 无需“启动运行时”;Python 必须初始化解释器(设置 PYTHONHOME + 初始化调用)。 | ||
- 环境准备:C/C++ 通常只要系统已有库即可;Python 需预置独立环境、修改 install_name(macOS)、并在链接期注入 rpath。 | ||
- 额外对象:Python 会生成“初始化桥接 .o”;C/C++ 无需此步。 | ||
|
||
## 一眼看懂的调用顺序 | ||
- Do | ||
- buildAllPkgs(初始包集) | ||
- buildPkg(每包:NewPackageEx → buildCgo/LLGoFiles → exportObject) | ||
- buildAllPkgs(替代包/补丁) | ||
- createGlobals(如有) | ||
- linkMainPkg | ||
- 收集对象/链接参数 → 生成主入口 .o(含 Python 初始化声明/调用) → 如需再生成 Python 初始化桥接 .o → 追加 rpath → 最终链接 | ||
- 运行/测试(按模式) | ||
|
||
## 总结 | ||
|
||
- 识别与分类:通过 link: ... 与 py.<module> 判定 Python 依赖,触发专属流程;C/C++ 仅归一化为 -L/-l 无需额外运行时。 | ||
|
||
- 预构建环境:在展开 $(pkg-config --libs python3-embed) 前完成 EnsureWithFetch、EnsureBuildEnv、Verify、FixLibpythonInstallName,保证可解析、可链接、可运行且不侵入系统。 | ||
|
||
- 链接注入:主入口注入 __llgo_py_init_from_exedir 调用并生成桥接 .o,统一追加 rpath(含独立 Python 与 @executable_path/...),再交由链接器合成可执行文件。 | ||
|
||
- 可选打包:BundleOnedir(目录式)与 BuildOnefileBinary(单文件)让应用“拿来就跑”,无需用户安装/配置 Python。 | ||
|
||
- 本质差异:C/C++ 无需“启动运行时”;Python 需在启动早期设置 PYTHONHOME 并初始化解释器。 | ||
|
||
- 结果与价值:实现“可编译、可链接、可运行、可分发、可复现”,以最小侵入把 Python 能力工程化纳入 Go 应用交付链路。 | ||
|
||
|
||
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused new line