Windows: os.get_raw_line - buffer overflow #23764

Open
DamianFekete opened this issue Feb 19, 2025 · 8 comments
Labels
Bug This tag is applied to issues which reports bugs. Status: Confirmed This bug has been confirmed to be valid by a contributor.

Comments

@DamianFekete

DamianFekete commented Feb 19, 2025

Root cause described here: #23764 (comment)


Describe the bug

On Windows 10 (no issue on WSL) I get a RUNTIME ERROR: invalid memory access.

Later edit:

  • I don't think this is related to parsing JSON when the Log struct has no fields defined. I removed the fields to make the test case smaller; this is not the original app.
  • If I make any of these changes, the error is not reproducible (but the memory may still be corrupted):
    • Remove the print from the or { } block
    • Read from the standard input but parse another string (with the same value) to json.decode

Reproduction Steps

Create a file yyy:

{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx012345678901234567890123456","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}
{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}
{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200} 

main.v:

module main

import os
import json

struct Log {
}

fn main() {
	mut i := 0
	for {
		i += 1
		mut line := os.get_raw_line()
		if line.len == 0 {
			break
		}
		line2 := line.trim_right('\r\n')
		json.decode(Log, line2) or {
			println('${i}: ${line2}')
			continue
		}
		println("${i}: OK")
	}
}

Run

cat yyy | v run main.v

Expected Behavior

The output of the script should be:
1: OK
2: OK
3: OK

Current Behavior

1: {"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx012345678901234567890123456","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","reque
st_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}
2: OK
Unhandled Exception 0x14aefb0
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:4808: at print_backtrace_skipping_top_frames_tcc: Backtrace
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:4794: by print_backtrace_skipping_top_frames
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:5379: by unhandled_exception_handler
7ffbd4329b4c : by ???
00432b93 : at ???: RUNTIME ERROR: invalid memory access
00432c71 : by ???
00432c9c : by ???
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:5135: by malloc_noscan
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:6686: by string_clone
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:6743: by string_substr
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:6808: by string_trim_chars
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:6850: by string_trim_right
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:7931: by main__main
C:/Temp/AppDataTemp/v_0/main2.01JMFH394GS2TV66BHEXQ9795V.tmp.c:7972: by wmain
00443eb0 : by ???
00444013 : by ???
7ffbd4187374 : by ???

Possible Solution

No response

Additional Information/Context

In addition, the first line logged to the console is missing some of the x characters.

V version

0.4.9 99635cf

Environment details (OS name and version, etc.)

Windows 10

@DamianFekete DamianFekete added the Bug This tag is applied to issues which reports bugs. label Feb 19, 2025

@JalonSolov
Contributor

V should definitely be giving you an error message, but it is definitely a problem in your code. To make it work, all you have to do is actually fill out the Log struct, as in:

struct Log {
        timestamp          string @[json: '@timestamp']
        version            string @[json: '@version']
        message            string
        logger_name        string
        thread_name        string
        level              string
        level_value        int
        uid                string
        request_id         string
        session_log_id     string
        ip                 string
        referer            string
        processing_time_ms int
        method             string
        content_type       string
        query              string
        type               string
        request_uri        string
        status             int
}

then it works as it is supposed to work:

$ cat yyy |v run main.v
1: OK
2: OK
3: OK
$

@JalonSolov JalonSolov added the Status: Confirmed This bug has been confirmed to be valid by a contributor. label Feb 19, 2025
@DamianFekete
Author

Quoting @JalonSolov: "V should definitely be giving you an error message, but it is definitely a problem in your code. To make it work, all you have to do is actually fill out the Log struct, as in:"

  • It does not work (on Windows), even with that addition

  • If I must do that, it would mean that I can't parse JSON strings unless I specify ALL the fields (and I may not know all the fields beforehand).

@JalonSolov
Contributor

If you don't know all the fields ahead of time, try switching from the json module to x.json2.

It has methods for "raw" parsing (without needing a struct), turning the parsed values into a map for easier access, etc.
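As a rough illustration of the "raw" parsing mentioned above, here is a minimal sketch. It assumes the x.json2 API as shipped around V 0.4.9 (`json2.raw_decode`, `Any.as_map`, `Any.str`); the helper name `field_text` and the shortened log line are made up for this example and do not come from the report.

```v
module main

import x.json2

// field_text is a hypothetical helper: it parses one JSON line without
// any struct definition and returns the text of a single top-level key.
fn field_text(line string, key string) !string {
	// raw_decode yields a json2.Any, so unknown fields are not a problem.
	raw := json2.raw_decode(line)!
	fields := raw.as_map()
	value := fields[key] or { return error('missing key: ${key}') }
	return value.str()
}

fn main() {
	// Shortened stand-in for one of the log lines in the report.
	line := '{"@timestamp":"2025-02-19T15:45:49.746Z","level":"INFO","status":200}'
	level := field_text(line, 'level') or {
		eprintln('lookup failed: ${err}')
		return
	}
	println('level: ${level}')
}
```

Note that this only sidesteps the struct question; it does not change how the line is read from standard input.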

@jorgeluismireles

I think the problem is neither json nor the empty Log struct. I ran this in the playground (without reading any file), then pasted it on Windows, and it worked. The problem may be in how Windows reads the input, which I haven't checked yet.
https://play.vlang.io/p/e0a712faa6

import json

const lines = [
	'{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx012345678901234567890123456","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}'
	'{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}'
	'{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","logger_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","thread_name":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","level":"INFO","level_value":20000,"uid":"","request_id":"xxxxxx","session_log_id":"xxxxxxxxxxx","ip":"xxx.xxx.xxx.xxx","referer":"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx","processing_time_ms":0,"method":"GET","content_type":"text/javascript","query":null,"type":"xxxxxxxxxxxxxxxx","request_uri":"/pub/js/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.js","status":200}'
]

struct Log {
}

fn main() {
	for i, line in lines {
		line2 := line.trim_right('\r\n')
		json.decode(Log, line2) or {
			println('${i}: ${line2}')
			continue
		}
		println("${i}: OK")
	}
}

Output:

0: OK
1: OK
2: OK

@jorgeluismireles

I removed line2 := line.trim_right('\r\n') and the runtime error disappeared. I also print the length of each line (line.len):

module main

import os
import json

struct Log {
}

fn main() {
	mut i := 0
	for {
		i += 1
		mut line := os.get_raw_line()
		if line.len == 0 {
			break
		}
		json.decode(Log, line) or {
			println('${i}: ${line.len} ${err}')
			continue
		}
		println("${i}: ${line.len} OK")
	}
}

Then I compiled the program to an exe and used type to pipe the data into it.
The first line still fails to decode, even though it is the same size as the other two.

C:\Users\jmireles\issues>v 23764.v

C:\Users\jmireles\issues>type 23764.txt | 23764.exe
1: 1035 failed to decode JSON string: xx.xxx.xxx","referer":"xxxxxxx
2: 1035 OK
3: 1035 OK

Next I added two line breaks at the start of the txt file, and all three lines with data are read OK:

C:\Users\jmireles\issues>type 23764.txt | 23764.exe
1: 1 failed to decode JSON string:
2: 1 failed to decode JSON string:
3: 1035 OK
4: 1035 OK
5: 1035 OK

@changrui
Contributor

changrui commented Feb 20, 2025

Quoting @DamianFekete: "If I must do that it would mean that I can't parse JSON strings if I don't specify ALL the fields (and I may not know all the fields beforehand)."

  • You can define only the fields you need: struct Log {timestamp string @[json:'@timestamp'] message string status int ...}.
  • Then decode and print: lg := json.decode(Log, line)! println(lg).
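The suggestion above can be sketched as a complete program. The three fields are the ones named in the comment; the helper name `decode_log` and the shortened input line are assumptions made for this example.

```v
module main

import json

// Only the fields of interest are declared. Keys in the input that
// have no matching field are expected to be skipped by json.decode.
struct Log {
	timestamp string @[json: '@timestamp']
	message   string
	status    int
}

// decode_log is a hypothetical wrapper around json.decode.
fn decode_log(line string) !Log {
	return json.decode(Log, line)!
}

fn main() {
	// Shortened stand-in for one of the log lines in the report.
	line := '{"@timestamp":"2025-02-19T15:45:49.746Z","@version":"1","message":"hello","level":"INFO","status":200}'
	lg := decode_log(line) or {
		eprintln('decode failed: ${err}')
		return
	}
	println('${lg.timestamp} ${lg.message} ${lg.status}')
}
```

The input here is a string literal rather than a line from os.get_raw_line, so this shows only the decoding side of the discussion.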

@DamianFekete
Author

DamianFekete commented Feb 20, 2025

The problem is os.get_raw_line.

It allocates 256 "chars" / 512 bytes ( https://github.com/vlang/v/blob/master/vlib/os/os.c.v#L506 ), but it never reallocates or checks whether more memory is needed ( https://github.com/vlang/v/blob/master/vlib/os/os.c.v#L519-L533 ). Any input line longer than that buffer is therefore written past its end.

echo xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxabcdefghijklmnopqrstuvwxyz | v run main.v

main.v:

module main

import os

fn main() {
	assert os.get_raw_line().contains("abcdefghijklmnopqrstuvwxyz")
}
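For contrast with the fixed 512-byte allocation described above, the following is a minimal sketch of the general idea of a growing line buffer. It is not the code from vlib/os and not a proposed patch: `read_line_chunked` is a hypothetical helper that simulates a chunked reader over an in-memory byte slice and assumes a positive `chunk_size`.

```v
module main

// read_line_chunked is a hypothetical helper, not code from vlib/os.
// It walks `src` in fixed-size chunks, the way a reader with a small
// scratch buffer would, and appends each byte to a result that grows.
fn read_line_chunked(src []u8, chunk_size int) []u8 {
	mut line := []u8{cap: chunk_size}
	mut offset := 0
	for offset < src.len {
		mut end := offset + chunk_size
		if end > src.len {
			end = src.len
		}
		for b in src[offset..end] {
			// V arrays reallocate on append, so there is no fixed limit.
			line << b
			if b == `\n` {
				return line
			}
		}
		offset = end
	}
	return line
}

fn main() {
	// A line well over 512 bytes that ends in a recognisable marker.
	input := ('x'.repeat(1009) + 'abcdefghijklmnopqrstuvwxyz\n').bytes()
	line := read_line_chunked(input, 512).bytestr()
	assert line.contains('abcdefghijklmnopqrstuvwxyz')
	println('read ${line.len} bytes')
}
```

The point of the sketch is only that the result buffer is allowed to grow past the chunk size; how the real function should grow its buffer on Windows is left to the maintainers.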

@DamianFekete DamianFekete changed the title RUNTIME ERROR: invalid memory access Windows: os.get_raw_line - buffer overflow Feb 20, 2025