GitHub - seatgeek/thefuzz at refs/heads/master

TheFuzz

Fuzzy string matching like a boss. It uses Levenshtein Distance to calculate the differences between sequences in a simple-to-use package.

Requirements

Python 3.8 or higher
rapidfuzz

For testing

pycodestyle
hypothesis
pytest

Installation

Using pip via PyPI

pip install thefuzz

Using pip via GitHub

pip install git+https://github.com/seatgeek/thefuzz.git@<version>#egg=thefuzz

Adding to your requirements.txt file (run pip install -r requirements.txt afterwards)

git+ssh://git@github.com/seatgeek/thefuzz.git@<version>#egg=thefuzz

Manually via Git

git clone https://github.com/seatgeek/thefuzz.git thefuzz
cd thefuzz
python setup.py install

Usage

>>> from thefuzz import fuzz
>>> from thefuzz import process

Simple Ratio

>>> fuzz.ratio("this is a test", "this is a test!")
    97
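
To give a feel for where a score like 97 comes from, here is a minimal pure-Python sketch. It assumes the score is a normalized insertion/deletion similarity, which is how the underlying rapidfuzz library describes its ratio. The names lcs_length and ratio_sketch are hypothetical helpers for illustration and are not part of thefuzz.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ratio_sketch(a: str, b: str) -> int:
    """Illustrative 0-100 similarity that counts insertions and deletions only."""
    total = len(a) + len(b)
    if total == 0:
        return 100  # two empty strings are treated as identical in this sketch
    # Characters outside the longest common subsequence must be inserted or deleted.
    indel_distance = total - 2 * lcs_length(a, b)
    return round(100 * (1 - indel_distance / total))


print(ratio_sketch("this is a test", "this is a test!"))
```

For the pair above, one character out of 29 has to be inserted, so the sketch gives round(100 * 28 / 29).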

Partial Ratio

>>> fuzz.partial_ratio("this is a test", "this is a test!")
    100
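
One way to picture a partial ratio is as the best score of the shorter string against any equally long window of the longer one. The sketch below is a brute-force illustration under that assumption, using difflib from the standard library as a stand-in scorer. simple_ratio and partial_ratio_sketch are hypothetical names, and the library's own implementation may choose windows differently.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """difflib-based 0-100 similarity, used here only as a stand-in scorer."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def partial_ratio_sketch(a: str, b: str) -> int:
    """Best score of the shorter string against each same-length window of the longer."""
    shorter, longer = sorted((a, b), key=len)
    width = len(shorter)
    return max(
        simple_ratio(shorter, longer[start:start + width])
        for start in range(len(longer) - width + 1)
    )


print(partial_ratio_sketch("this is a test", "this is a test!"))
```

Because "this is a test" appears verbatim inside "this is a test!", one window matches exactly and the trailing "!" no longer lowers the score.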

Token Sort Ratio

>>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    91
>>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
    100
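
The jump from 91 to 100 can be understood as comparing the strings after their words have been put in a canonical order. The following sketch assumes a simple preprocessing step (lower-casing and whitespace splitting) and uses difflib as a stand-in scorer. sort_tokens and token_sort_ratio_sketch are hypothetical names, not the library's API.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """difflib-based 0-100 similarity, used here only as a stand-in scorer."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def sort_tokens(text: str) -> str:
    """Lower-case, split on whitespace, and rejoin the words in sorted order."""
    return " ".join(sorted(text.lower().split()))


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Compare the two strings after sorting their tokens, so word order is ignored."""
    return simple_ratio(sort_tokens(a), sort_tokens(b))


print(sort_tokens("fuzzy wuzzy was a bear"))
print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
```

Both inputs normalize to the same string, "a bear fuzzy was wuzzy", which is why swapping two words does not affect the result.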

Token Set Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    84
>>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
    100
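
A token set comparison can be sketched as follows: split both strings into sets of words, separate the shared words from the leftovers, and keep the best of three pairwise comparisons. This is an illustration under the assumption that the score follows that classic set-based recipe. It uses difflib as a stand-in scorer, and token_set_ratio_sketch is a hypothetical name.

```python
from difflib import SequenceMatcher


def simple_ratio(a: str, b: str) -> int:
    """difflib-based 0-100 similarity, used here only as a stand-in scorer."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())


def token_set_ratio_sketch(a: str, b: str) -> int:
    """Score on unique tokens: shared words versus shared words plus leftovers."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())

    common = " ".join(sorted(tokens_a & tokens_b))
    only_a = " ".join(sorted(tokens_a - tokens_b))
    only_b = " ".join(sorted(tokens_b - tokens_a))

    # Shared words followed by the words unique to each side.
    combined_a = f"{common} {only_a}".strip()
    combined_b = f"{common} {only_b}".strip()

    return max(
        simple_ratio(common, combined_a),
        simple_ratio(common, combined_b),
        simple_ratio(combined_a, combined_b),
    )


print(token_set_ratio_sketch("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```

Since sets discard the duplicated "fuzzy", both inputs reduce to the same four words and the sketch reports a perfect match.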

Partial Token Sort Ratio

>>> fuzz.token_sort_ratio("fuzzy was a bear", "wuzzy fuzzy was a bear")
    84
>>> fuzz.partial_token_sort_ratio("fuzzy was a bear", "wuzzy fuzzy was a bear")
    100

Process

>>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
>>> process.extract("new york jets", choices, limit=2)
    [('New York Jets', 100), ('New York Giants', 78)]
>>> process.extractOne("cowboys", choices)
    ('Dallas Cowboys', 90)

You can also pass additional parameters to the extractOne method to make it use a specific scorer. A typical use case is to match file paths:

>>> process.extractOne("System of a down - Hypnotize - Heroin", songs)
    ('/music/library/good/System of a Down/2005 - Hypnotize/01 - Attack.mp3', 86)
>>> process.extractOne("System of a down - Hypnotize - Heroin", songs, scorer=fuzz.token_sort_ratio)
    ("/music/library/good/System of a Down/2005 - Hypnotize/10 - She's Like Heroin.mp3", 61)
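
The Process helpers can be pictured as "score every choice, sort, and keep the best". The sketch below illustrates that idea together with a swappable scorer argument. extract_sketch, extract_one_sketch, and both scorers are hypothetical stand-ins built on difflib, so the scores they produce will generally differ from those of thefuzz.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, Optional, Tuple

Scorer = Callable[[str, str], int]


def simple_ratio(a: str, b: str) -> int:
    """Case-insensitive difflib similarity on a 0-100 scale (stand-in scorer)."""
    matcher = SequenceMatcher(None, a.lower(), b.lower(), autojunk=False)
    return round(100 * matcher.ratio())


def token_sort_scorer(a: str, b: str) -> int:
    """Alternative scorer that ignores word order by sorting tokens first."""
    def normalize(text: str) -> str:
        return " ".join(sorted(text.lower().split()))
    return simple_ratio(normalize(a), normalize(b))


def extract_sketch(query: str, choices: Iterable[str],
                   scorer: Scorer = simple_ratio,
                   limit: int = 5) -> List[Tuple[str, int]]:
    """Score every choice against the query and return the top `limit` pairs."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


def extract_one_sketch(query: str, choices: Iterable[str],
                       scorer: Scorer = simple_ratio) -> Optional[Tuple[str, int]]:
    """Return the single best (choice, score) pair, or None if there are no choices."""
    best = extract_sketch(query, choices, scorer=scorer, limit=1)
    return best[0] if best else None


choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print(extract_sketch("new york jets", choices, limit=2))
print(extract_one_sketch("jets york new", choices, scorer=token_sort_scorer))
```

Passing the scorer as a plain callable keeps the ranking logic independent of how similarity is measured, which is the same design choice that lets extractOne accept scorer=fuzz.token_sort_ratio above.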
