3
3
LuaJIT memory profiler
4
4
======================
5
5
6
- Starting from version :doc: `2.7.1 </release/2.7.1 >`, Tarantool
6
+ Since version :doc: `2.7.1 </release/2.7.1 >`, Tarantool
7
7
has a built‑in module called ``misc.memprof `` that implements a LuaJIT memory
8
- profiler (which we will just call *the profiler * in this section). The profiler provides
9
- a memory allocation report that helps analyze Lua code and find the places
10
- that put the most pressure on the Lua garbage collector (GC).
8
+ profiler (further in this section we call it *the profiler * for short).
9
+ The profiler provides a memory allocation report that helps analyze Lua code
10
+ and find the places that put the most pressure on the Lua garbage collector (GC).
11
+
12
+ Inside this section:
11
13
12
14
.. contents ::
13
15
:local:
@@ -18,7 +20,7 @@ that put the most pressure on the Lua garbage collector (GC).
18
20
Working with the profiler
19
21
-------------------------
20
22
21
- Usage of the profiler involves two steps:
23
+ The profiler usage involves two steps:
22
24
23
25
1. :ref: `Collecting <profiler_usage_get >` a binary profile of allocations,
24
26
reallocations, and deallocations in memory related to Lua
@@ -28,13 +30,13 @@ Usage of the profiler involves two steps:
28
30
29
31
.. _profiler_usage_get :
30
32
31
- Collecting binary profile
32
- ~~~~~~~~~~~~~~~~~~~~~~~~~
33
+ Collecting a binary profile
34
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
33
35
34
36
To collect a binary profile for a particular part of the Lua code,
35
37
you need to place this part between two ``misc.memprof `` functions,
36
38
namely, ``misc.memprof.start() `` and ``misc.memprof.stop() ``, and then execute
37
- the code under Tarantool.
39
+ the code in Tarantool.
38
40
39
41
Below is a chunk of Lua code named ``test.lua `` to illustrate this.
40
42
@@ -43,26 +45,26 @@ Below is a chunk of Lua code named ``test.lua`` to illustrate this.
43
45
.. code-block :: lua
44
46
:linenos:
45
47
46
- -- Prevent allocations on traces.
47
- jit.off()
48
- local str, err = misc.memprof.start("memprof_new.bin")
49
- -- Lua doesn't create a new frame to call string.rep, and all allocations
50
- -- are attributed not to the append() function but to the parent scope.
51
- local function append(str, rep)
52
- return string.rep(str, rep)
53
- end
48
+ -- Prevent allocations on traces.
49
+ jit.off()
50
+ local str, err = misc.memprof.start("memprof_new.bin")
51
+ -- Lua doesn't create a new frame to call string.rep, and all allocations
52
+ -- are attributed not to the append() function but to the parent scope.
53
+ local function append(str, rep)
54
+ return string.rep(str, rep)
55
+ end
54
56
55
- local t = {}
56
- for i = 1, 1e4 do
57
- -- table.insert is the built-in function and all corresponding
58
- -- allocations are reported in the scope of the main chunk.
59
- table.insert(t,
60
- append('q', i)
61
- )
62
- end
63
- local stp , err = misc.memprof.stop()
57
+ local t = {}
58
+ for i = 1, 1e4 do
59
+ -- table.insert is the built-in function and all corresponding
60
+ -- allocations are reported in the scope of the main chunk.
61
+ table.insert(t,
62
+ append('q', i)
63
+ )
64
+ end
65
+ local str , err = misc.memprof.stop()
64
66
65
- The Lua code for starting the profiler -- as in line 3 in the test.lua example above -- is:
67
+ The Lua code for starting the profiler -- as in line 3 in the `` test.lua `` example above -- is:
66
68
67
69
.. code-block :: lua
68
70
@@ -75,27 +77,29 @@ for example if it is not possible to open a file for writing or if the profiler
75
77
``misc.memprof.start() `` returns ``nil `` as the first result,
76
78
an error-message string as the second result,
77
79
and a system-dependent error code number as the third result.
80
+
78
81
If the operation succeeds, ``misc.memprof.start() `` returns ``true ``.
79
82
80
- The Lua code for stopping the profiler -- as in line 18 in the test.lua example above -- is:
83
+ The Lua code for stopping the profiler -- as in line 18 in the `` test.lua `` example above -- is:
81
84
82
85
.. code-block :: lua
83
86
84
- local stp , err = misc.memprof.stop()
87
+ local str , err = misc.memprof.stop()
85
88
86
89
If the operation fails,
87
- for example if there is an error when the file descriptor is being closed
90
+ for example if there is an error when the file descriptor is being closed
88
91
or if there is a failure during reporting,
89
92
``misc.memprof.stop() `` returns ``nil `` as the first result,
90
93
an error-message string as the second result,
91
94
and a system-dependent error code number as the third result.
95
+
92
96
If the operation succeeds, ``misc.memprof.stop() `` returns ``true ``.
93
97
94
98
.. _profiler_usage_generate :
95
99
96
100
To generate the file with memory profile in binary format
97
101
(in the :ref: `test.lua code example above <profiler_usage_example01 >`
98
- the file name is ``memprof_new.bin ``), execute the code under Tarantool:
102
+ the file name is ``memprof_new.bin ``), execute the code in Tarantool:
99
103
100
104
.. code-block :: console
101
105
@@ -115,7 +119,7 @@ a profiling report:
115
119
:ref: `note about jitoff <profiler_usage_internal_jitoff >` for more details.
116
120
117
121
* Lines 6-8: Tail call optimization doesn't create a new call frame, so all
118
- allocations inside the function called via the ``CALLT/CALLMT `` `bytecodes <http ://wiki.luajit.org/Bytecode-2.0#calls-and-vararg-handling >`_
122
+ allocations inside the function called via the ``CALLT/CALLMT `` `bytecodes <https ://github.com/tarantool/tarantool/wiki/LuaJIT-Bytecodes >`_
119
123
are attributed to the function's caller. See also the comments preceding these lines.
120
124
121
125
* Lines 14-16: Usually the information about allocations inside Lua built‑ins
@@ -127,8 +131,8 @@ a profiling report:
127
131
128
132
.. _profiler_usage_parse :
129
133
130
- Parsing binary profile and generating profiling report
131
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
134
+ Parsing a binary profile and generating a profiling report
135
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
132
136
133
137
.. _profiler_usage_parse_command :
134
138
@@ -143,8 +147,11 @@ via Tarantool by using the following command
143
147
144
148
where ``memprof_new.bin `` is the binary profile
145
149
:ref: `generated earlier <profiler_usage_generate >` by ``tarantool test.lua ``.
146
- (Warning: there is a slight behavior change here, the ``tarantool -e ... ``
147
- command was slightly different in Tarantool versions prior to Tarantool 2.8.1.)
150
+
151
+ .. note ::
152
+
153
+ There is a slight behavior change here: the ``tarantool -e ... `` command
154
+ was slightly different in Tarantool versions prior to Tarantool 2.8.1.
148
155
149
156
Tarantool generates a profiling report and displays it on the console before closing
150
157
the session:
@@ -187,7 +194,6 @@ the session:
187
194
On macOS, a report will be different for the same chunk of code because
188
195
Tarantool and LuaJIT are built with the GC64 mode enabled for macOS.
189
196
190
-
191
197
Let's examine the report structure. A report has four sections:
192
198
193
199
* ALLOCATIONS
@@ -203,21 +209,20 @@ An event record has the following format:
203
209
.. code-block :: text
204
210
205
211
@<filename>:<line_number>: <number_of_events> events +<allocated> bytes -<freed> bytes
206
-
207
212
213
+ where:
208
214
209
- * <filename>— a name of the file containing Lua code.
210
- * <line_number>— the line number where the event is detected.
211
- * <number_of_events>— a number of events for this code line.
212
- * +<allocated> bytes— amount of memory allocated during all the events on this line.
213
- * -<freed> bytes— amount of memory freed during all the events on this line.
215
+ * `` <filename> `` -— a name of the file containing Lua code.
216
+ * `` <line_number> `` -— the line number where the event is detected.
217
+ * `` <number_of_events> `` —- a number of events for this code line.
218
+ * `` +<allocated> bytes `` —- amount of memory allocated during all the events on this line.
219
+ * `` -<freed> bytes `` —- amount of memory freed during all the events on this line.
214
220
215
221
The ``Overrides `` label shows what allocation has been overridden.
216
222
217
223
See the :ref: `test.lua chunk above <profiler_usage_example01 >`
218
224
with the explanation in the comments for some examples.
219
225
220
-
221
226
.. _profiler_usage_internal_jitoff :
222
227
223
228
The ``INTERNAL `` label indicates that this event is caused by internal LuaJIT
@@ -256,10 +261,10 @@ structures.
256
261
As for investigating the Lua code with the help of profiling reports,
257
262
it is always code-dependent and there can't be hundred per cent definite
258
263
recommendations in this regard. Nevertheless, you can see some of the things
259
- in the :ref: `Profiling report analysis example <profiler_analysis >` later.
264
+ in the :ref: `Profiling a report analysis example <profiler_analysis >` later.
260
265
261
266
Also, below is the :ref: `FAQ <profiler_faq >` section with the questions that
262
- most probably can arise while using profiler.
267
+ most probably can arise while using the profiler.
263
268
264
269
.. _profiler_faq :
265
270
@@ -317,44 +322,44 @@ console for a running instance.
317
322
.. code-block :: lua
318
323
:linenos:
319
324
320
- local fiber = require "fiber"
321
- local log = require "log"
325
+ local fiber = require "fiber"
326
+ local log = require "log"
322
327
323
- fiber.create(function()
324
- fiber.name("memprof")
328
+ fiber.create(function()
329
+ fiber.name("memprof")
325
330
326
- collectgarbage() -- Collect all objects already dead
327
- log.warn("start of profile")
331
+ collectgarbage() -- Collect all objects already dead
332
+ log.warn("start of profile")
328
333
329
- local st, err = misc.memprof.start(FILENAME)
330
- if not st then
331
- log.error("failed to start profiler: %s", err)
332
- end
334
+ local st, err = misc.memprof.start(FILENAME)
335
+ if not st then
336
+ log.error("failed to start profiler: %s", err)
337
+ end
333
338
334
- fiber.sleep(TIME)
339
+ fiber.sleep(TIME)
335
340
336
- collectgarbage()
337
- st, err = misc.memprof.stop()
341
+ collectgarbage()
342
+ st, err = misc.memprof.stop()
338
343
339
- if not st then
340
- log.error("profiler on stop error: %s", err)
341
- end
344
+ if not st then
345
+ log.error("profiler on stop error: %s", err)
346
+ end
342
347
343
- log.warn("end of profile")
344
- end)
348
+ log.warn("end of profile")
349
+ end)
345
350
346
- where
351
+ where:
347
352
348
- * ``FILENAME ``—the name of the binary file where profiling events are written
349
- * ``TIME ``—duration of profiling, in seconds.
353
+ * ``FILENAME ``—the name of the binary file where profiling events are written
354
+ * ``TIME ``—duration of profiling, in seconds.
350
355
351
356
Also, you can directly call ``misc.memprof.start() `` and ``misc.memprof.stop() ``
352
357
from a console.
353
358
354
359
.. _profiler_analysis :
355
360
356
- Profiling report analysis example
357
- ---------------------------------
361
+ Profiling a report analysis example
362
+ -----------------------------------
358
363
359
364
In the example below, the following Lua code named ``format_concat.lua `` is
360
365
investigated with the help of the memory profiler reports.
@@ -364,39 +369,39 @@ investigated with the help of the memory profiler reports.
364
369
.. code-block :: lua
365
370
:linenos:
366
371
367
- -- Prevent allocations on new traces.
368
- jit.off()
372
+ -- Prevent allocations on new traces.
373
+ jit.off()
369
374
370
- local function concat(a)
371
- local nstr = a.."a"
372
- return nstr
373
- end
375
+ local function concat(a)
376
+ local nstr = a.."a"
377
+ return nstr
378
+ end
374
379
375
- local function format(a)
376
- local nstr = string.format("%sa", a)
377
- return nstr
378
- end
380
+ local function format(a)
381
+ local nstr = string.format("%sa", a)
382
+ return nstr
383
+ end
379
384
380
- collectgarbage()
385
+ collectgarbage()
381
386
382
- local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*).lua")..".bin"
387
+ local binfile = "/tmp/memprof_"..(arg[0]):match("([^/]*).lua")..".bin"
383
388
384
- local st, err = misc.memprof.start(binfile)
385
- assert(st, err)
389
+ local st, err = misc.memprof.start(binfile)
390
+ assert(st, err)
386
391
387
- -- Payload.
388
- for i = 1, 10000 do
389
- local f = format(i)
390
- local c = concat(i)
391
- end
392
- collectgarbage()
392
+ -- Payload.
393
+ for i = 1, 10000 do
394
+ local f = format(i)
395
+ local c = concat(i)
396
+ end
397
+ collectgarbage()
393
398
394
- local st, err = misc.memprof.stop()
395
- assert(st, err)
399
+ local st, err = misc.memprof.stop()
400
+ assert(st, err)
396
401
397
- os.exit()
402
+ os.exit()
398
403
399
- When you run this code :ref: `under Tarantool <profiler_usage_generate >` and
404
+ When you run this code :ref: `in Tarantool <profiler_usage_generate >` and
400
405
then :ref: `parse <profiler_usage_parse_command >` the binary memory profile
401
406
in /tmp/memprof_format_concat.bin,
402
407
you will get the following profiling report:
@@ -423,9 +428,9 @@ you will get the following profiling report:
423
428
424
429
Reasonable questions regarding the report can be:
425
430
426
- * Why are there no allocations related to the ``concat() `` function?
427
- * Why is the number of allocations not a round number?
428
- * Why are there about 20K allocations instead of 10K?
431
+ * Why are there no allocations related to the ``concat() `` function?
432
+ * Why is the number of allocations not a round number?
433
+ * Why are there about 20K allocations instead of 10K?
429
434
430
435
First of all, LuaJIT doesn't create a new string if the string with the same
431
436
payload exists (see details on `lua-users.org/wiki <http://lua-users.org/wiki/ImmutableObjects >`_).
@@ -469,8 +474,8 @@ To profile only the ``concat()`` function, comment out line 23 (which is
469
474
**Q **: But what will change if JIT compilation is enabled?
470
475
471
476
**A **: In the :ref: `code <profiler_usage_example03 >`, comment out line 2 (which is
472
- ``jit.off() ``) and run
473
- the profiler . Now there are only 56 allocations in the report, and all other
477
+ ``jit.off() ``) and run the profiler.
478
+ Now there are only 56 allocations in the report, and all the other
474
479
allocations are JIT-related (see also the related
475
480
`dev issue <https://github.com/tarantool/tarantool/issues/5679 >`_):
476
481
@@ -494,8 +499,6 @@ allocations are JIT-related (see also the related
494
499
@format_concat.lua:0 holds 640 bytes: 4 allocs, 0 frees
495
500
INTERNAL holds 360 bytes: 2 allocs, 1 frees
496
501
497
-
498
-
499
502
This happens because a trace has been compiled after 56 iterations (the default
500
503
value of the ``hotloop `` compiler parameter). Then, the
501
504
JIT-compiler removed the unused variable ``c `` from the trace, and, therefore,
@@ -538,7 +541,7 @@ identifier is not yet compiled via LuaJIT. So, a trace can't be recorded and
538
541
the compiler doesn't perform the corresponding optimizations.
539
542
540
543
If we change the ``format() `` function in lines 9-12 of the
541
- :ref: `Profiling report analysis example <profiler_usage_example03 >`
544
+ :ref: `Profiling a report analysis example <profiler_usage_example03 >`
542
545
in the following way
543
546
544
547
.. code-block :: lua
@@ -594,4 +597,4 @@ for example
594
597
595
598
$ tarantool -e 'require("memprof")(arg)' - --leak-only memprof_new.bin
596
599
597
- When `--leak-only`` is used, only the HEAP SUMMARY section is displayed.
600
+ When `` --leak-only `` is used, only the HEAP SUMMARY section is displayed.
0 commit comments