UTF-8 decode error #103

Open
futureweihk opened this issue Sep 21, 2020 · 7 comments
Labels
bug, help wanted, need info (Blocked awaiting information), python3

Comments

@futureweihk

Dear Sir,

We ran an ordinary command, git deps -d 90xxxxx0deeca0a98fbac4368c547bd650c99a95,
and got the error below:

Traceback (most recent call last):
  File "/usr/local/bin/git-deps", line 11, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.6/site-packages/git_deps/cli.py", line 141, in run
    main(sys.argv[1:])
  File "/usr/local/lib/python3.6/site-packages/git_deps/cli.py", line 135, in main
    cli(options, args)
  File "/usr/local/lib/python3.6/site-packages/git_deps/cli.py", line 119, in cli
    detector.find_dependencies(rev)
  File "/usr/local/lib/python3.6/site-packages/git_deps/detector.py", line 122, in find_dependencies
    self.find_dependencies_with_parent(dependent, parent)
  File "/usr/local/lib/python3.6/site-packages/git_deps/detector.py", line 147, in find_dependencies_with_parent
    self.blame_hunk(dependent, parent, path, hunk)
  File "/usr/local/lib/python3.6/site-packages/git_deps/detector.py", line 172, in blame_hunk
    blame = subprocess.check_output(cmd, universal_newlines=True)
  File "/usr/local/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/usr/local/lib/python3.6/subprocess.py", line 425, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/usr/local/lib/python3.6/subprocess.py", line 850, in communicate
    stdout = self.stdout.read()
  File "/usr/local/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbd in position 624: invalid start byte

With debug mode enabled, the last part of the log, around the final "Blaming hunk" entries, is as follows:
Sep 21 11:30:16 DEBUG Blaming hunk -454,8 @ c7cfcd69
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 454 454 4
90877300deeca0a98fbac4368c547bd650c99a95 863359ffe0c44c26890b903dfa637c139f6e60ef
Sep 21 11:30:16 DEBUG New dependency 90877300 -> 863359ff via line 454 (prod standalone.xml add VN,KH DB info)
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 454 454 4
Sep 21 11:30:16 DEBUG !author XXX16374
Sep 21 11:30:16 DEBUG !author-mail XXX16374@xxxxxxx
Sep 21 11:30:16 DEBUG !author-time 1592396697
Sep 21 11:30:16 DEBUG !author-tz +0800
Sep 21 11:30:16 DEBUG !committer XXX16374
Sep 21 11:30:16 DEBUG !committer-mail XXX16374@xxxxxxx
Sep 21 11:30:16 DEBUG !committer-time 1592396697
Sep 21 11:30:16 DEBUG !committer-tz +0800
Sep 21 11:30:16 DEBUG !summary prod standalone.xml add VN,KH DB info
Sep 21 11:30:16 DEBUG !previous 315fe9922d84db614e42a0c18ae48a369adedf83 RESOURCES/TW_SVC/prod/jboss-eap-7.0/standalone.xml
Sep 21 11:30:16 DEBUG !filename RESOURCES/TW_SVC/prod/jboss-eap-7.0/standalone.xml
Sep 21 11:30:16 DEBUG ! oracle.net.encryption_client=REQUIRED,
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 455 455
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 455 455
Sep 21 11:30:16 DEBUG ! oracle.net.encryption_types_client=(AES256,AES192),
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 456 456
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 456 456
Sep 21 11:30:16 DEBUG ! oracle.net.crypto_checksum_client=REQUIRED,
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 457 457
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 457 457
Sep 21 11:30:16 DEBUG ! oracle.net.crypto_checksum_types_client=SHA1
Sep 21 11:30:16 DEBUG !06803a1f010cb0a8b75919e1a6870f7f8e835250 521 458 1
Sep 21 11:30:16 DEBUG New line for 90877300 -> 06803a1f: 06803a1f010cb0a8b75919e1a6870f7f8e835250 521 458 1
Sep 21 11:30:16 DEBUG !author XXX16374
Sep 21 11:30:16 DEBUG !author-mail XXX16374@xxxxxxx
Sep 21 11:30:16 DEBUG !author-time 1584068622
Sep 21 11:30:16 DEBUG !author-tz +0800
Sep 21 11:30:16 DEBUG !committer XXX16374
Sep 21 11:30:16 DEBUG !committer-mail XXX16374@xxxxxxxxx
Sep 21 11:30:16 DEBUG !committer-time 1584068622
Sep 21 11:30:16 DEBUG !committer-tz +0800
Sep 21 11:30:16 DEBUG !summary update RESOURCES
Sep 21 11:30:16 DEBUG !filename RESOURCES/TW_SVC/prod/jboss-eap-7.0/standalone.xml
Sep 21 11:30:16 DEBUG !
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 459 459 2
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 459 459 2
Sep 21 11:30:16 DEBUG ! oracle7_XA
Sep 21 11:30:16 DEBUG !863359ffe0c44c26890b903dfa637c139f6e60ef 460 460
Sep 21 11:30:16 DEBUG New line for 90877300 -> 863359ff: 863359ffe0c44c26890b903dfa637c139f6e60ef 460 460
Sep 21 11:30:16 DEBUG ! ALTER SESSION SET current_schema=XXXXX
Sep 21 11:30:16 DEBUG !06803a1f010cb0a8b75919e1a6870f7f8e835250 523 461 1
Sep 21 11:30:16 DEBUG New line for 90877300 -> 06803a1f: 06803a1f010cb0a8b75919e1a6870f7f8e835250 523 461 1
Sep 21 11:30:16 DEBUG ! TRANSACTION_READ_COMMITTED
Sep 21 11:30:16 DEBUG !
Sep 21 11:30:16 DEBUG |-------- ----- @@ -454,8 +454,8 @@
Sep 21 11:30:16 DEBUG |863359ff 454 oracle.net.encryption_client=REQUIRED,
Sep 21 11:30:16 DEBUG |863359ff 455 - oracle.net.encryption_types_client=(AES256,AES192),
Sep 21 11:30:16 DEBUG |863359ff 456 - oracle.net.crypto_checksum_client=REQUIRED,
Sep 21 11:30:16 DEBUG |863359ff 457 - oracle.net.crypto_checksum_types_client=SHA1
Sep 21 11:30:16 DEBUG | + oracle.net.encryption_types_client=(AES256,AES192),
Sep 21 11:30:16 DEBUG | + oracle.net.crypto_checksum_client=REQUIRED,
Sep 21 11:30:16 DEBUG | + oracle.net.crypto_checksum_types_client=SHA1
Sep 21 11:30:16 DEBUG |06803a1f 458
Sep 21 11:30:16 DEBUG |863359ff 459 - oracle7_XA
Sep 21 11:30:16 DEBUG |863359ff 460 - ALTER SESSION SET current_schema=XXXXX
Sep 21 11:30:16 DEBUG | + oracle8_XA
Sep 21 11:30:16 DEBUG | + ALTER SESSION SET current_schema=XXXXX
Sep 21 11:30:16 DEBUG |06803a1f 461 TRANSACTION_READ_COMMITTED
Sep 21 11:30:16 DEBUG Blaming hunk -470,3 @ c7cfcd69

Does the log mean that the error occurred at line 470? Many thanks for your help.
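
One way to check this: the traceback shows the failure inside blame_hunk, and the final log entry is "Blaming hunk -470,3 @ c7cfcd69", so the blame for the hunk starting at line 470 is plausibly the one whose output could not be decoded. Note that "position 624" is a byte offset into the decoder's input, not a line number in the file. The sketch below assumes the blame output has been captured as raw bytes (for example by calling subprocess.check_output without universal_newlines) and reports which lines are not valid UTF-8. The helper name find_undecodable_lines and the sample bytes are made up for illustration; they are not part of git-deps.

```python
def find_undecodable_lines(data, encoding="utf-8"):
    """Return (line_number, offset_in_line, raw_line) for each line that fails to decode."""
    bad = []
    for number, raw in enumerate(data.splitlines(), start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the byte offset of the first offending byte within this line.
            bad.append((number, exc.start, raw))
    return bad


# Hypothetical stand-in for captured blame output; line 2 contains non-UTF-8 bytes.
sample = (
    b"oracle.net.encryption_client=REQUIRED,\n"
    b"<!-- \xbd\xbb -->\n"
    b"TRANSACTION_READ_COMMITTED\n"
)

bad_lines = find_undecodable_lines(sample)
for number, offset, raw in bad_lines:
    print(number, offset, raw)
```

Running the same helper on the real blame output for that hunk should show exactly which line carries the byte 0xbd.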

@aspiers
Owner

aspiers commented Sep 21, 2020

This is most likely a bug in the approach to UTF-8 decoding. Perhaps the data being decoded is not actually UTF-8. Either way it will be related to the use of Python 3, as I haven't gotten around to doing heavy testing on Python 3 yet. See also #98 and #87 which are both related to Python 3.
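
As a small illustration of the "not actually UTF-8" possibility (a sketch, not taken from git-deps): the byte 0xbd has the bit pattern 10xxxxxx, which UTF-8 reserves for continuation bytes, so it can never begin a character. The sample bytes are invented, and Latin-1 is used only as an example of a single-byte encoding in which the same byte is valid; the actual encoding of the file in this report is unknown.

```python
raw = b"value=\xbd"  # made-up sample containing the byte from the traceback

try:
    raw.decode("utf-8")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# 0xbd is 0b10111101: the leading bits "10" mark a UTF-8 continuation byte,
# which is why the codec reports an "invalid start byte".
is_continuation = (0xBD & 0b11000000) == 0b10000000
print(is_continuation)

# The same bytes decode without error under a single-byte encoding such as Latin-1.
as_latin1 = raw.decode("latin-1")
print(as_latin1)
```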

@aspiers
Owner

aspiers commented Sep 21, 2020

Are you able to share the repository which caused this bug, so we can try to reproduce?

@futureweihk
Author

This is most likely a bug in the approach to UTF-8 decoding. Perhaps the data being decoded is not actually UTF-8. Either way it will be related to the use of Python 3, as I haven't gotten around to doing heavy testing on Python 3 yet. See also #98 and #87 which are both related to Python 3.

Thanks. So do you mean that using Python 2.7 would avoid this error?

@futureweihk
Author

futureweihk commented Sep 22, 2020

How can we change the Python version that git-deps runs under? Do we need to reinstall git-deps, or is there a parameter we can change to achieve this? Thanks.

@aspiers
Owner

aspiers commented Sep 22, 2020

@futureweihk commented on September 22, 2020 4:54 AM:

Thanks. So do you mean that using Python 2.7 would avoid this error?

Possibly - I'd say there is a good chance but I can't guarantee it.

@futureweihk commented on September 22, 2020 4:56 AM:

How can we change the Python version that git-deps runs under? Do we need to reinstall git-deps, or is there a parameter we can change to achieve this? Thanks.

That depends very much on your OS and how you normally install Python. Please just follow standard Python installation documentation, as I do not have time to provide general support for Python. git-deps does not do anything significantly different to other Python programs, so standard procedures work as normal. If you are not too familiar with Python then you could alternatively find a Python consultant to help. It is not hard.

@futureweihk
Author

futureweihk commented Sep 24, 2020

Sir,

We changed line 173 of detector.py from:
blame = subprocess.check_output(cmd, universal_newlines=True)
to:
blame = subprocess.check_output(cmd, encoding="utf-8", errors="replace", universal_newlines=True)
git-deps now seems to run without the UTF-8 error. Do you have any suggestions about this approach?
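
For comparison, here is a minimal, self-contained sketch of what the errors= argument does. It does not call git: a child Python process stands in for the blame command and writes one byte (0xbd) that is not valid UTF-8. With errors="replace" the byte becomes U+FFFD and the original value is lost; errors="surrogateescape" is an alternative that keeps it recoverable. Both encoding and errors are accepted by subprocess.check_output from Python 3.6 onwards, and passing either one already selects text mode, so universal_newlines=True is redundant alongside them. Whether losing the original bytes matters for git-deps is an assumption to verify against how the blame output is used.

```python
import subprocess
import sys

# Stand-in for the blame command: a child process that emits an invalid UTF-8 byte.
cmd = [sys.executable, "-c", "import sys; sys.stdout.buffer.write(b'ok \\xbd\\n')"]

# errors="replace": the bad byte is turned into U+FFFD (the replacement character).
replaced = subprocess.check_output(cmd, encoding="utf-8", errors="replace")

# errors="surrogateescape": the bad byte is kept as a lone surrogate (U+DCBD),
# so the original bytes can be reconstructed later.
escaped = subprocess.check_output(cmd, encoding="utf-8", errors="surrogateescape")
round_trip = escaped.encode("utf-8", errors="surrogateescape")

print(repr(replaced))
print(repr(escaped))
print(round_trip)
```

Note that a string produced with surrogateescape cannot be written to a strict UTF-8 stream without the same error handler, so errors="replace" is the simpler choice when the text is only logged or parsed for commit IDs and line numbers.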

@aspiers
Owner

aspiers commented Apr 4, 2021

Thanks, that's very helpful. I still need this though:

@aspiers commented on September 21, 2020 5:14 PM:

Are you able to share the repository which caused this bug, so we can try to reproduce?

so that I can reproduce and test the fix. Please can you share it?

aspiers added the "need info" (Blocked awaiting information) label on Apr 4, 2021