Update napi.py script #1

Open
wants to merge 80 commits into
base: master
791a1b5
ported script to python 3, corrected tab/spaces and formatting to mat…
emkor Aug 12, 2018
7b73e21
few more renames and method extractions to increase readability
emkor Aug 12, 2018
cc0f2c4
changed main loop strategy to best-effort, extracted exit codes as co…
emkor Aug 12, 2018
583f3df
added basic unit tests for generating hashes and for unpacking 7zippe…
emkor Aug 12, 2018
c4000e2
added 7zipped subtitles for unit-testing purposes, implemented encodi…
emkor Aug 12, 2018
28c0356
wrapped opening files in with-clause for safety, moved exceptions to top
emkor Aug 12, 2018
d100caf
expanded readme, added gitignore
emkor Aug 12, 2018
6cc3cff
corrected markdown styling in readme, expanded readme
emkor Aug 12, 2018
88ca661
Merge branch 'master' of github.com:emkor/napi.py
emkor Aug 12, 2018
490c593
corrected markdown styling in readme, expanded readme
emkor Aug 12, 2018
b35d77b
Merge branch 'master' of github.com:emkor/napi.py
emkor Aug 12, 2018
b5e6b01
added exception info on failed read from movie file
emkor Jan 20, 2019
b87f949
removed accidentally copied script part
emkor Jan 20, 2019
32be276
split script into separate files
emkor Jan 21, 2019
5bbda9f
refactored implementation, implemented detecting source encoding, imp…
emkor Jan 21, 2019
02ec789
chardet seems to be too erratic; reverted to hardcoded windows-1250 w…
emkor Jan 21, 2019
3e54b93
reorganized package, added makefile and setup files, updated readme
emkor Jan 21, 2019
07786f5
updated gitignore with build caches, changed venv path in makefile
emkor Aug 25, 2019
22320be
bumped patch version, fixed tests, split method for decoding/encoding…
emkor Aug 25, 2019
5a27b7a
introduce logging instead of raw prints, update readme
emkor Aug 25, 2019
99d9ca0
add acceptance test, add base class to use as a lib, extend readme
emkor Aug 25, 2019
f23c5a7
add CI definition (travis)
emkor Aug 25, 2019
799b61c
add installation of 7z exec
emkor Aug 25, 2019
a040d7f
add installation of 7z exec
emkor Aug 25, 2019
f135c50
add scripts for publishing to pypi
emkor Aug 25, 2019
d127f9f
reduce testing to just python 3.7 as travis would try to publish pack…
emkor Aug 25, 2019
1cd9b1b
wrap if branch == master into bash script as it was failing on travis CI
emkor Aug 25, 2019
41ebd24
moving publishing code to script
emkor Aug 25, 2019
5bce86d
implement case with no such subtitles found
emkor Aug 27, 2019
745d934
implement feature of downloading subs for given hash, fix few bugs (m…
emkor Aug 27, 2019
021dfda
fix tests according to download_subs() API change
emkor Aug 27, 2019
48f5888
add flag to force treating downloaded subs as given encoding instead …
emkor Sep 27, 2019
af07dc9
use tox instead of raw pytest to check if tools is working under diff…
emkor Dec 29, 2019
876cc83
fix travis by adding missing python interpreter
emkor Dec 29, 2019
92a8a46
add checking which python is available on CI
emkor Dec 29, 2019
d9445dc
fixed publishing: now it's executed only on python 3.7-based job variant
emkor Dec 29, 2019
c92fdda
fix build status link, remove not-needed dependency in unit test phas…
emkor Dec 29, 2019
67e5469
include envrc for easier development, add instruction for installing …
emkor Nov 22, 2020
40f811a
allow using movie hash format `napiprojekt:SOMEHASH`
emkor Nov 22, 2020
e6bb7aa
include testing against python 3.8 in tox, include clean command in m…
emkor Nov 22, 2020
3bfd079
bump version in relation to updating makefile
emkor Nov 22, 2020
1bf16a4
change default extension to .srt, add acceptance test for CLI tool
emkor Nov 22, 2020
4d35cd0
tag the release, add some minor fixes in readme
emkor Nov 22, 2020
608702a
add setup for python 3.8
emkor Nov 22, 2020
8a58874
include verifying twine installation
emkor Nov 22, 2020
5ac5e6d
remove beta classifier
emkor Nov 22, 2020
1a23551
implement detecting encoding correctly, implement acceptance tests
emkor Nov 23, 2020
6612cb1
remove unused creds.sh sourcing in .envrc
emkor Nov 23, 2020
4971ec6
switch to using raw pytest instead of tox
emkor Nov 23, 2020
9032fae
remove tox.ini, switch to using makefile on CI
emkor Nov 23, 2020
9f679d6
bump minor versions
emkor Nov 23, 2020
c35d33b
rewrite the tests to use pytest approach
emkor Nov 24, 2020
f28a557
support python 3.6 in travis CI
emkor Nov 24, 2020
4396174
update patch revision as the changes were not in functionality
emkor Nov 24, 2020
e879bfd
Remove the usage of 7z (#2)
lipowskm Dec 2, 2020
ff03d43
implement workflow for github actions
emkor Dec 25, 2020
54ece62
fix test failing due to double `napi-py` phrase in project path appea…
emkor Dec 25, 2020
196e659
change the CI badge to github-provided one
emkor Dec 25, 2020
f24738d
remove travis CI - related configuration
emkor Dec 25, 2020
0742efb
bump patch revision
emkor Dec 25, 2020
faacc62
Update main.yml
emkor Dec 25, 2020
f393c39
install twine globally, remove it from dependencies
emkor Dec 25, 2020
86f896c
remove no-longer-required 7z-executable leftovers
emkor Dec 25, 2020
2088e67
fix bug with cross-device error while moving subs from /tmp to mounte…
emkor Dec 25, 2020
59f64fc
correct grammar errors in readme, include python3-dev package as expl…
emkor Dec 25, 2020
e5f3ff3
switch from setup.py to poetry tool for dependency management
emkor May 31, 2021
eb13757
switch from check_call to run in acceptance tests
emkor May 31, 2021
00498f6
register CLI entrypoint in poetry
emkor May 31, 2021
7d59e8e
add black code reformatter as a part of CI job
emkor May 31, 2021
9ae8ddc
bump patch revision to publish the package on pypi
emkor May 31, 2021
e2d959e
switch back to publishing with twine
emkor May 31, 2021
b1691fa
increase logging in twine publish step
emkor May 31, 2021
2a25341
add --skip-existing flag for twine to force uploading new release to …
emkor May 31, 2021
f652eac
publish new version with minor updates
emkor May 31, 2021
86a1aa0
re-format using black, bump dependencies, publish using tokens
emkor Jun 13, 2021
3632fea
Fix encoding issues for utf-16 (#3)
steciuk Jun 15, 2024
b9ca62b
remove support for python 3.6, add support for 3.10
emkor Jun 15, 2024
8ccf878
update dev dependencies
emkor Jun 15, 2024
f1b93ec
update README.md, correct version number
emkor Jun 15, 2024
c2b3c06
Issue 4 test newer python (#5)
emkor Sep 8, 2024
46 changes: 46 additions & 0 deletions .github/workflows/main.yml
@@ -0,0 +1,46 @@
name: CI

on: push

jobs:

  test:
    name: Test with Python ${{ matrix.python-version }}
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        python-version: [ "3.7", "3.8", "3.9", "3.10", "3.11", "3.12" ]
    steps:
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Install poetry
        run: pip install poetry
      - name: Set up virtualenv
        run: make install
      - name: Run tests
        run: make test
      - name: Build distributable package
        run: make build

  release:
    name: Publish the package
    needs: test
    if: ${{ github.ref == 'refs/heads/master' }}
    runs-on: ubuntu-22.04
    steps:
      # The release job defines no matrix, so the step name states the version directly
      - name: Set up Python 3.10
        uses: actions/setup-python@v2
        with:
          python-version: "3.10"
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Install poetry
        run: pip install poetry
      - name: Install from source
        run: make install
      - name: Publish package to pypi
        run: poetry publish --build -u ${{ secrets.PYPI_USER }} -p ${{ secrets.PYPI_PASSWORD }}
11 changes: 11 additions & 0 deletions .gitignore
@@ -0,0 +1,11 @@
*.iml
.idea/

.venv/
.tox/
.mypy_cache/
.pytest_cache/
*.egg-info/
build/
dist/
__pycache__/
674 changes: 674 additions & 0 deletions LICENSE.md

Large diffs are not rendered by default.

31 changes: 31 additions & 0 deletions Makefile
@@ -0,0 +1,31 @@
test: lint unit-test acceptance-test
all: clean test build

POETRY = poetry

clean:
	@echo "---- Doing cleanup ----"
	@rm -rf .mypy_cache .pytest_cache dist

install:
	@echo "---- Installing package ---- "
	@$(POETRY) install

lint:
	@echo "---- Running type check and linter ---- "
	@$(POETRY) run mypy napi
	@$(POETRY) run black napi

unit-test:
	@echo "---- Running unit tests ---- "
	@$(POETRY) run pytest -ra -vv test/unit

acceptance-test:
	@echo "---- Running acceptance tests ---- "
	@$(POETRY) run pytest -ra -vv test/acceptance

build:
	@echo "---- Build distributable ---- "
	@$(POETRY) build

.PHONY: all test build clean install lint unit-test acceptance-test
37 changes: 34 additions & 3 deletions README.md
@@ -1,4 +1,35 @@
napi.py
=======
# napi-py ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/napi-py) ![CI](https://github.com/emkor/napi-py/workflows/CI/badge.svg)
CLI tool for downloading subtitles from napiprojekt.pl, fork of [gabrys/napi.py](https://github.com/gabrys/napi.py)

CLI script to download subtitles from napiprojekt.pl
## prerequisites
- Python 3.7 or newer

## installation
- `pip install napi-py` for user-wide installation

## usage as CLI tool
- `napi-py ~/Downloads/MyMovie.mp4` will download and save subtitles under `~/Downloads/MyMovie.srt`

## usage as lib
```python
import os

from napi import NapiPy

# open() does not expand "~", so expand it before the file is hashed
movie_path = os.path.expanduser("~/Downloads/MyMovie.mp4")

napi = NapiPy()
movie_hash = napi.calc_hash(movie_path)
source_encoding, target_encoding, tmp_file = napi.download_subs(movie_hash)
subs_path = napi.move_subs_to_movie(tmp_file, movie_path)
print(subs_path)
```

## in case of issues
- if there are no subs for your movie, there's still hope:
  - open the movie's web page on `napiprojekt.pl` in your browser, for example: `https://www.napiprojekt.pl/napisy1,1,1-dla-55534-Z%C5%82odziejaszki-(2018)`
  - choose subtitles that might match your movie, right-click the link containing the hash and select "Copy link"; the link looks like this: `napiprojekt:96edd6537d9852a51cbdd5b64fee9194`
  - pass the hash to this tool with the `--hash YOURHASH` flag, e.g. `--hash 96edd6537d9852a51cbdd5b64fee9194` or `--hash napiprojekt:96edd6537d9852a51cbdd5b64fee9194`
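
Both hash spellings accepted by `--hash` can be reduced to the bare hex string before use. The snippet below is a minimal sketch of that normalization, mirroring the prefix handling in `napi/main.py`; the helper name `normalize_hash` is hypothetical and not part of the package.

```python
PREFIX = "napiprojekt:"  # link scheme shown in the README


def normalize_hash(value: str) -> str:
    """Return the bare hash, whether or not the 'napiprojekt:' prefix is present.

    Hypothetical helper for illustration; it is not part of napi-py.
    """
    if value.startswith(PREFIX):
        return value[len(PREFIX):]
    return value


print(normalize_hash("napiprojekt:96edd6537d9852a51cbdd5b64fee9194"))
print(normalize_hash("96edd6537d9852a51cbdd5b64fee9194"))
```

Either call prints the same 32-character hash, which can then be passed on to `NapiPy.download_subs`.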

## development
- `make install` installs poetry virtualenv
- `make test` runs tests
- `make build` creates installable package
5 changes: 5 additions & 0 deletions mypy.ini
@@ -0,0 +1,5 @@
[mypy]
warn_unused_configs = True

[mypy-py7zlib]
ignore_missing_imports = True
1 change: 1 addition & 0 deletions napi/__init__.py
@@ -0,0 +1 @@
from napi.napi import NapiPy
31 changes: 31 additions & 0 deletions napi/api.py
@@ -0,0 +1,31 @@
import os
from urllib import request


def _cipher(z):
    # Derive the 5-character `t` token sent with the hash in the download URL
    idx = [0xE, 0x3, 0x6, 0x8, 0x2]
    mul = [2, 2, 5, 4, 3]
    add = [0, 0xD, 0x10, 0xB, 0x5]

    b = []
    for a, m, i in zip(add, mul, idx):
        t = a + int(z[i], 16)
        v = int(z[t : t + 2], 16)
        b.append(("%x" % (v * m))[-1])

    return "".join(b)


def _build_url(movie_hash):
    return "http://napiprojekt.pl/unit_napisy/dl.php?l=PL&f={}&t={}&v=other&kolejka=false&nick=&pass=&napios={}".format(
        movie_hash, _cipher(movie_hash), os.name
    )


def download_for(movie_hash: str) -> bytes:
    the_url = _build_url(movie_hash)
    return request.urlopen(the_url).read()
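
To make the token derivation in `_cipher` easier to follow, here is a standalone sketch of the same arithmetic with descriptive names. The tables are copied from `napi/api.py`; the function name `token_for` and the sample inputs are assumptions made for illustration, and nothing here contacts the server.

```python
def token_for(movie_hash: str) -> str:
    """Recompute the 5-character token from a 32-digit hex hash (illustrative copy)."""
    idx = [0xE, 0x3, 0x6, 0x8, 0x2]
    mul = [2, 2, 5, 4, 3]
    add = [0, 0xD, 0x10, 0xB, 0x5]

    digits = []
    for a, m, i in zip(add, mul, idx):
        # One hex digit of the hash selects where to read from...
        offset = a + int(movie_hash[i], 16)
        # ...then up to two hex digits at that offset are read as a number...
        value = int(movie_hash[offset:offset + 2], 16)
        # ...and the last hex digit of the scaled value becomes one token character.
        digits.append(("%x" % (value * m))[-1])
    return "".join(digits)


print(token_for("0" * 32))  # all-zero hash gives "00000"
print(token_for("f" * 32))  # all-f hash gives "eebcd"
```

The largest possible offset is `0x10 + 15 = 31`, so for a 32-character hash the slice never starts out of range, although it can be a single digit long.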
63 changes: 63 additions & 0 deletions napi/encoding.py
@@ -0,0 +1,63 @@
import locale
from typing import Optional, Tuple

import chardet

DECODING_ORDER = ["utf-16", "windows-1250", "windows-1251", "windows-1252", "windows-1253", "windows-1254", "utf-8"]
CHECK_NUM_CHARS = 5000
AUTO_DETECT_THRESHOLD = 0.9


def _is_ascii(c: str) -> bool:
    return ord(c) < 128


def _is_polish_diacritic(c: str) -> bool:
    return c in "ąćęłńóśżźĄĆĘŁŃÓŚŻŹ"


def _is_correct_encoding(subs: str) -> bool:
    # Accept a decoding only if Polish diacritics outnumber other non-ASCII characters
    err_symbols, diacritics = 0, 0
    for char in subs[:CHECK_NUM_CHARS]:
        if _is_polish_diacritic(char):
            diacritics += 1
        elif not _is_ascii(char):
            err_symbols += 1

    return err_symbols < diacritics


def _detect_encoding(subs: bytes) -> Tuple[Optional[str], float]:
    result = chardet.detect(subs)
    return result["encoding"], result["confidence"]


def _try_decode(subs: bytes) -> Tuple[str, str]:
    # First trust chardet if it is confident enough, then fall back to the fixed list
    encoding, confidence = _detect_encoding(subs)
    if encoding and confidence > AUTO_DETECT_THRESHOLD:
        try:
            return encoding, subs.decode(encoding)
        except UnicodeDecodeError:
            pass

    last_exc = None
    for enc in DECODING_ORDER:
        try:
            decoded_subs = subs.decode(enc)
            if _is_correct_encoding(decoded_subs):
                return enc, decoded_subs
        except UnicodeDecodeError as e:
            last_exc = e
    raise ValueError("Could not decode using any of {}: {}".format(DECODING_ORDER, last_exc))


def decode_subs(subtitles_binary: bytes, use_enc: Optional[str] = None) -> Tuple[str, str]:
    if use_enc is not None:
        return use_enc, subtitles_binary.decode(use_enc)
    else:
        return _try_decode(subtitles_binary)


def encode_subs(subs: str) -> Tuple[str, bytes]:
    target_encoding = locale.getpreferredencoding()
    return target_encoding, subs.encode(target_encoding)
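
The fallback in `_try_decode` relies on the diacritic-counting heuristic from `_is_correct_encoding`. The following sketch reproduces that heuristic in isolation, without `chardet`, to show how a wrong code page is rejected. The function name `looks_polish` and the sample sentence are assumptions chosen for illustration.

```python
POLISH_DIACRITICS = "ąćęłńóśżźĄĆĘŁŃÓŚŻŹ"
CHECK_NUM_CHARS = 5000  # same sampling limit as in napi/encoding.py


def looks_polish(text: str) -> bool:
    """True when Polish diacritics outnumber all other non-ASCII characters."""
    sample = text[:CHECK_NUM_CHARS]
    diacritics = sum(1 for c in sample if c in POLISH_DIACRITICS)
    other_non_ascii = sum(
        1 for c in sample if ord(c) >= 128 and c not in POLISH_DIACRITICS
    )
    return other_non_ascii < diacritics


# Bytes as a server might return them: Polish text stored as windows-1250
raw = "Zażółć gęślą jaźń".encode("cp1250")

for codec in ("cp1252", "cp1250"):
    print(codec, looks_polish(raw.decode(codec)))
```

Decoding with `cp1252` succeeds without an exception but yields mostly non-Polish symbols, so the heuristic rejects it, while `cp1250` passes. Note that text with no non-ASCII characters at all also fails the check (`0 < 0` is false), so pure-ASCII input appears to depend on the `chardet` path succeeding.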
11 changes: 11 additions & 0 deletions napi/hash.py
@@ -0,0 +1,11 @@
import hashlib

SIZE_10_MBs_IN_BYTES = 10485760


def calc_movie_hash_as_hex(movie_path: str) -> str:
    md5_hash_gen = hashlib.md5()
    with open(movie_path, mode="rb") as movie_file:
        content_of_first_10mbs = movie_file.read(SIZE_10_MBs_IN_BYTES)
        md5_hash_gen.update(content_of_first_10mbs)
    return md5_hash_gen.hexdigest()
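
Because only the first 10 MiB of the file are hashed, two files that share that prefix produce the same movie hash. The sketch below demonstrates the property with a much smaller, configurable limit so it runs quickly; `md5_of_prefix` and `_write_temp` are hypothetical helpers, not functions from the package.

```python
import hashlib
import os
import tempfile


def md5_of_prefix(path: str, limit: int) -> str:
    """MD5 of at most `limit` leading bytes of the file (illustrative helper)."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read(limit)).hexdigest()


def _write_temp(data: bytes) -> str:
    """Write bytes to a new temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(data)
        return f.name


head = b"x" * 1024
path_a = _write_temp(head + b"tail A")
path_b = _write_temp(head + b"a different tail")
try:
    # Same first 1024 bytes, so the prefix hashes match
    same = md5_of_prefix(path_a, 1024) == md5_of_prefix(path_b, 1024)
    # A larger limit takes in the differing tails
    differ = md5_of_prefix(path_a, 2048) != md5_of_prefix(path_b, 2048)
finally:
    os.remove(path_a)
    os.remove(path_b)

print(same, differ)
```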
104 changes: 104 additions & 0 deletions napi/main.py
@@ -0,0 +1,104 @@
import argparse
import logging
import time
import traceback
from os import path
from typing import Optional

from napi import NapiPy
from napi.store_subs import get_target_path_for_subtitle

EXIT_CODE_OK = 0
EXIT_CODE_WRONG_ARGS = 1
EXIT_CODE_NO_SUCH_MOVIE = 2
EXIT_SUBS_NOT_FOUND = 4
EXIT_CODE_FAILED = 5


def setup_logger(level: int = logging.INFO) -> None:
    logging.basicConfig(format="%(asctime)s UTC | %(levelname)s | %(message)s", level=level)
    logging.Formatter.converter = time.gmtime


class NoMatchingSubtitle(Exception):
    pass


def _parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="napi-py", description="CLI for downloading subtitles from napiprojekt.pl")
    parser.add_argument("movie_path", type=str, help="Path to movie file")
    parser.add_argument(
        "--target",
        type=str,
        required=False,
        default=None,
        help="Path to store the subtitles in",
    )
    parser.add_argument(
        "--hash",
        type=str,
        required=False,
        default=None,
        help="Use given hash for this movie",
    )
    parser.add_argument(
        "--from-enc",
        type=str,
        required=False,
        default=None,
        help="Treat downloaded subs as this encoding instead of guessing",
    )
    return parser.parse_args()


def main(
    movie_path: str,
    subtitles_path: Optional[str] = None,
    use_hash: Optional[str] = None,
    from_enc: Optional[str] = None,
) -> None:
    log = logging.getLogger()
    movie_path = path.abspath(movie_path)
    subtitles_path = path.abspath(subtitles_path or get_target_path_for_subtitle(movie_path))
    if path.exists(movie_path):
        if use_hash and use_hash.startswith("napiprojekt:"):
            use_hash = use_hash.partition("napiprojekt:")[-1]
        try:
            napi_client = NapiPy()
            movie_hash = use_hash or napi_client.calc_hash(movie_path)
            log.info("Downloading subs for {} (hash: {})".format(path.basename(movie_path), movie_hash))
            src_enc, tgt_enc, tmp_file = napi_client.download_subs(movie_hash, use_enc=from_enc)
            if src_enc is not None and tmp_file is not None:
                subs_path = (
                    napi_client.move_subs(tmp_file, subtitles_path)
                    if subtitles_path
                    else napi_client.move_subs_to_movie(tmp_file, movie_path)
                )
                log.info("Saved subs ({} -> {}) in {}".format(src_enc, tgt_enc, subs_path))
            else:
                log.error("Napiprojekt.pl does not have subtitles for this movie")
                exit(EXIT_SUBS_NOT_FOUND)
        except Exception as e:
            traceback.print_exc()
            log.error(e)
            exit(EXIT_CODE_FAILED)
    else:
        log.error("No such file: {}".format(movie_path))
        exit(EXIT_CODE_NO_SUCH_MOVIE)


def cli_main():
    setup_logger()
    log = logging.getLogger()
    try:
        args = _parse_args()
        main(
            args.movie_path,
            subtitles_path=args.target,
            use_hash=args.hash,
            from_enc=args.from_enc,
        )
    except Exception as e:
        log.error("Parameters error: {}".format(e))
        exit(EXIT_CODE_WRONG_ARGS)
    exit(EXIT_CODE_OK)
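
A calling script can branch on the exit codes defined at the top of `napi/main.py`. The sketch below maps those codes to messages and runs an arbitrary command; the numeric values are copied from the diff, while `EXIT_MEANINGS`, `describe_exit` and `run_and_describe` are hypothetical names. A stand-in Python one-liner replaces the real `napi-py` command so the example runs without the package installed.

```python
import subprocess
import sys
from typing import List

# Values copied from napi/main.py; the descriptions are paraphrased
EXIT_MEANINGS = {
    0: "subtitles saved",
    1: "wrong arguments",
    2: "movie file does not exist",
    4: "no subtitles found for this movie",
    5: "download or processing failed",
}


def describe_exit(code: int) -> str:
    """Translate an exit code into a short message (illustrative helper)."""
    return EXIT_MEANINGS.get(code, "unknown exit code {}".format(code))


def run_and_describe(cmd: List[str]) -> str:
    """Run a command and describe its exit status."""
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)


# Stand-in for ["napi-py", "/path/to/MyMovie.mp4"]: a process exiting with code 4
print(run_and_describe([sys.executable, "-c", "import sys; sys.exit(4)"]))
```

One caveat worth checking against the real tool: `argparse` itself terminates with status 2 on invalid arguments, and that `SystemExit` is not caught by `except Exception`, so code 2 may mean either a missing movie file or a usage error.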
41 changes: 41 additions & 0 deletions napi/napi.py
@@ -0,0 +1,41 @@
import os
import shutil
import tempfile
from typing import Tuple, Optional

from napi.api import download_for
from napi.encoding import decode_subs, encode_subs
from napi.hash import calc_movie_hash_as_hex
from napi.read_7z import un7zip_api_response
from napi.store_subs import get_target_path_for_subtitle


class NapiPy:
    def __init__(self) -> None:
        pass

    def calc_hash(self, movie: str) -> str:
        return calc_movie_hash_as_hex(movie)

    def download_subs(
        self, movie_hash: str, use_enc: Optional[str] = None
    ) -> Tuple[Optional[str], Optional[str], Optional[str]]:
        subs_bin = un7zip_api_response(download_for(movie_hash))
        if subs_bin:
            src_enc, subs_utf8 = decode_subs(subs_bin, use_enc=use_enc)
            tgt_enc, subs_bin = encode_subs(subs_utf8)
            with tempfile.NamedTemporaryFile(delete=False) as fileTemp:
                fileTemp.write(subs_bin)
                return src_enc, tgt_enc, fileTemp.name
        return None, None, None

    def move_subs_to_movie(self, tmp_subs: str, movie: str) -> str:
        tgt_path = get_target_path_for_subtitle(movie)
        shutil.copy(tmp_subs, tgt_path)
        os.remove(tmp_subs)
        return tgt_path

    def move_subs(self, tmp_subs: str, path: str) -> str:
        shutil.copy(tmp_subs, path)
        os.remove(tmp_subs)
        return path
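
Both `move_subs` methods copy the temporary file and then delete it instead of renaming it. One of the commit messages mentions a cross-device error when moving subtitles out of `/tmp`, and a plain `os.rename` does raise `OSError` when source and destination are on different filesystems. The sketch below isolates the copy-then-remove pattern; `move_by_copy` and the file names are hypothetical, and both paths sit in one temporary directory only to keep the demo self-contained.

```python
import os
import shutil
import tempfile


def move_by_copy(src: str, dst: str) -> str:
    """Move a file by copying its content and deleting the source.

    Works across filesystems, unlike a bare os.rename.
    """
    shutil.copy(src, dst)
    os.remove(src)
    return dst


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "subs.tmp")
    dst = os.path.join(workdir, "MyMovie.srt")
    with open(src, "wb") as f:
        f.write(b"1\n00:00:01,000 --> 00:00:02,000\nHello\n")

    moved = move_by_copy(src, dst)
    src_gone = not os.path.exists(src)
    with open(moved, "rb") as f:
        content = f.read()

print(os.path.basename(moved), src_gone)
```

`shutil.move` is a standard-library alternative that tries a rename first and falls back to copy-and-delete when the rename fails, which might replace the two-step version here.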