BUG since v3.5.0: Trying to load a playlist from file using an absolute path fails on Windows because of drive letters (URL-test in m3u8.load not implemented correctly!) #387

Open
e-d-n-a opened this issue Oct 10, 2024 · 2 comments

e-d-n-a commented Oct 10, 2024

I got an error while working with m3u8 v4.0.0 and trying to load a playlist from file using an absolute path on Windows.

Traceback:

Traceback (most recent call last):
  File "[myscript.py]", line 843, in <module>
    asyncio.run(main())
  File "C:\Program Files\Python39\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Program Files\Python39\lib\asyncio\base_events.py", line 647, in run_until_complete
    return future.result()
  File "[myscript.py]", line 806, in main
    await DLer.download_vod(file_pl, base_uri=url_pl_base, target=folder_target)
  File "[myscript.py]", line 478, in download_vod
    pl = m3u8.load(str(playlist), custom_tags_parser=self.__class__._parse_twitch_tags)
  File "C:\Program Files\Python39\lib\site-packages\m3u8\__init__.py", line 94, in load
    content, base_uri = http_client.download(uri, timeout, headers, verify_ssl)
  File "C:\Program Files\Python39\lib\site-packages\m3u8\httpclient.py", line 16, in download
    resource = opener.open(uri, timeout=timeout)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 517, in open
    response = self._open(req, data)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 539, in _open
    return self._call_chain(self.handle_open, 'unknown',
  File "C:\Program Files\Python39\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\Program Files\Python39\lib\urllib\request.py", line 1417, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: d>

... where file_pl was an absolute path to an m3u8 playlist on drive D (see the 'd' in the urlopen error).
(The same code worked when file_pl was a relative path!)

I was dumbfounded to find that this bug was introduced back in v3.5.0. It makes it impossible to load playlists from files via absolute paths on Windows, because the check that m3u8.load uses to distinguish URLs from file paths was changed for the worse.

See the blame of the faulty line in m3u8.load and the commit that introduced the bug with v3.5.0.

I guess no one has loaded a playlist from a file using an absolute path on Windows since then, because only relative paths work there now.

Distinction method in m3u8.load of v3.5.0 [failing]:

m3u8/m3u8/__init__.py

Lines 47 to 51 in 57d2547

if urlsplit(uri).scheme:
    content, base_uri = http_client.download(uri, timeout, headers, verify_ssl)
    return M3U8(content, base_uri=base_uri, custom_tags_parser=custom_tags_parser)
else:
    return _load_from_file(uri, custom_tags_parser)

Distinction method in __init__.py and parser.py of v3.4.0 (= previous release) [working]:

if is_url(uri):

m3u8/m3u8/parser.py

Lines 597 to 598 in b2a1342

def is_url(uri):
    return uri.startswith(URI_PREFIXES)

URI_PREFIXES = ('https://', 'http://', 's3://', 's3a://', 's3n://')
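For reference, here is a minimal self-contained sketch of how the v3.4.0 prefix check classifies a few inputs. The function and the prefix tuple are the ones quoted above; the sample strings are made up for illustration.

```python
# Prefix tuple and check as quoted above from v3.4.0.
URI_PREFIXES = ('https://', 'http://', 's3://', 's3a://', 's3n://')


def is_url(uri):
    # str.startswith accepts a tuple and is True if any prefix matches.
    return uri.startswith(URI_PREFIXES)


# Illustrative inputs (assumed, not taken from the library's tests).
samples = [
    'https://example.com/playlist.m3u8',
    'playlist.m3u8',
    r'D:\videos\playlist.m3u8',
]
results = {s: is_url(s) for s in samples}
for s, flag in results.items():
    print(s, '->', 'URL' if flag else 'file path')
```

The drive-letter path is treated as a file path because it matches none of the listed prefixes. The trade-off is that any scheme missing from the tuple (for example file://) is also treated as a file path.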

History of solutions:

2012-05-02: commit that introduced local file support
2012-05-18: commit that introduced is_url-function
2023-05-10: commit that introduced the bug (removing is_url-function)

urlsplit(uri).scheme == '' is a bad solution:

Test code to verify the issue (on Windows):

from urllib.parse import urlsplit
from pathlib import Path

# with current working directory being on a drive with assigned letter:
path_m3u8_rel = Path('playlist.m3u8')
# assert path_m3u8_rel.is_file()
path_m3u8_abs = path_m3u8_rel.absolute()
(urlsplit(str(path_m3u8_rel)).scheme == '' # True  - WORKS
,urlsplit(str(path_m3u8_abs)).scheme == '' # False - FAILS
,urlsplit(str(path_m3u8_abs)).scheme == path_m3u8_abs.drive[0].lower()) # True - scheme == drive letter!
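The test above needs a Windows machine. As a platform-independent sketch of the same observation, PureWindowsPath can stand in for a real Windows path. The path D:\videos\playlist.m3u8 is a made-up example.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit

# PureWindowsPath applies Windows path rules on any OS and never touches the filesystem.
rel = PureWindowsPath('playlist.m3u8')
abs_ = PureWindowsPath(r'D:\videos\playlist.m3u8')  # hypothetical location

rel_scheme = urlsplit(str(rel)).scheme
abs_scheme = urlsplit(str(abs_)).scheme

print(repr(rel_scheme))  # empty scheme: treated as a file path by m3u8.load
print(repr(abs_scheme))  # non-empty scheme: mistaken for a URL
print(abs_scheme == abs_.drive[0].lower())  # the "scheme" is the drive letter
```

Because urlsplit accepts any run of valid scheme characters before the first colon, the single drive letter qualifies as a scheme.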

Suggested solutions from Stack Overflow:

There are discussions and answers on Stack Overflow regarding this issue, with solutions you could adapt.

Answer from https://stackoverflow.com/questions/7849818/argument-is-url-or-path:

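import sys
from urllib.error import URLError
from urllib.request import urlopen  # Python 3 location of urllib2.urlopen

try:
    f = urlopen(sys.argv[1])
except (ValueError, URLError):  # no scheme, or an unknown one such as a drive letter
    f = open(sys.argv[1])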

or better:

Answer from https://stackoverflow.com/questions/68626097/pythonic-way-to-identify-a-local-file-or-a-url:

from urllib.parse import urlparse
from os.path import exists

def is_local(url):
    url_parsed = urlparse(url)
    if url_parsed.scheme in ('file', ''): # Possibly a local file
        return exists(url_parsed.path)
    return False
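Note that a check based only on the parsed scheme still sees the drive letter of a Windows path as a scheme. One possible way around that, sketched here under assumptions, is to rule out drive-qualified paths before looking at the scheme. The helper name looks_like_url is hypothetical and is not part of m3u8 or of any proposed fix.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit


def looks_like_url(uri):
    """Hypothetical helper: True if uri should be downloaded, False if opened as a file."""
    # A drive prefix such as 'D:' would otherwise be parsed as the scheme 'd',
    # so anything with a Windows drive (or UNC share) counts as a file path.
    if PureWindowsPath(uri).drive:
        return False
    return bool(urlsplit(uri).scheme)


# Illustrative inputs (assumed for this sketch).
checks = {
    'https://example.com/playlist.m3u8': looks_like_url('https://example.com/playlist.m3u8'),
    's3://bucket/playlist.m3u8': looks_like_url('s3://bucket/playlist.m3u8'),
    'playlist.m3u8': looks_like_url('playlist.m3u8'),
    r'D:\videos\playlist.m3u8': looks_like_url(r'D:\videos\playlist.m3u8'),
    'D:/videos/playlist.m3u8': looks_like_url('D:/videos/playlist.m3u8'),
}
for uri, flag in checks.items():
    print(uri, '->', 'URL' if flag else 'file path')
```

This sketch treats every single-letter scheme followed by a colon as a drive letter, which seems an acceptable trade-off since no common URL scheme is one character long. Unlike is_local above, it does not check whether the file exists.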

bbayles commented Oct 11, 2024

I'll make a PR for this!

bbayles commented Oct 11, 2024

Here is my proposed fix.
